Working on open projects

Working on open projects

Jiten Thakkar via cfe-dev
Hi All,
I was going through the open projects page (https://clang-analyzer.llvm.org/open_projects.html) and wondering whether that page is up to date. I found 'Explicitly model standard library functions with BodyFarm' and 'Enhance CFG to model C++ new more precisely' interesting to work on. I have some experience with the LLVM API and with modeling functions for verification from my master's project. If anyone can tell me whom I should contact about those projects, or how I should get started, that would be very helpful.

Thanks,
Jiten

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

Re: Working on open projects

Artem Dergachev via cfe-dev
Hello,

These are analyzer projects: they improve the symbolic-execution-based
bug finding behind clang's --analyze option, not compilation or code
generation. At the same time, these projects require relatively little
understanding of the analyzer's internals (compared to other projects).

* The body farm project does not require much knowledge about the
analyzer, and mostly requires knowledge of the AST. The idea of the
project is to synthesize ASTs of functions in order to help the analyzer
understand what they do when their bodies are not available in the
current translation unit (which is a problem because clang only
compiles, and therefore analyzes, one translation unit at a time).
Having an AST for an external function automagically allows the analyzer
to "inline" it during analysis; without the AST, the analyzer has to
assume that anything can happen when such a function is called, which
reduces the precision of the analysis.

Body-farmed ASTs are useful for system library functions that are simple
enough. The AST does not necessarily need to do exactly what the
function does, because the analyzer does not model everything exactly.
For example, any atomic operations on integers may be replaced with
regular integer operations because the analyzer would naturally do all
its symbolic calculations atomically. You can see what functions are
already there (very few; I guess we only have a couple of libdispatch
functions that are modeled to immediately call their callback, and
George has recently farmed a body for std::call_once similarly in
https://reviews.llvm.org/D37840, which turned out to be harder than
usual) and follow the example to add the functions you're interested in.
Various compiler builtins (e.g., again, atomics?) might make a nice
addition, and as far as I remember, Devin may have a couple of ideas as
well.
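
To make the "immediately call their callback" idea concrete, here is a minimal source-level sketch of what such a simplified body expresses. It is only an illustration under assumptions: the real body farm builds the AST directly in C++ rather than from source text, and `OnceToken` and `model_call_once` are hypothetical names, not the libdispatch or std::call_once API.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for a library "once" token; this is not the real
// dispatch_once_t or std::once_flag.
struct OnceToken {
  bool done = false;
};

// Simplified model: on first use, mark the token and run the callback right
// away. There is no locking and no threading, because the model only needs
// to describe the effect that is visible along a single analyzed path.
void model_call_once(OnceToken &token, const std::function<void()> &callback) {
  if (!token.done) {
    token.done = true;
    callback();
  }
}

// Small usage helper: invokes the model twice with the same token and
// reports how many times the callback actually ran.
int count_calls_after_two_invocations() {
  OnceToken token;
  int calls = 0;
  model_call_once(token, [&calls] { ++calls; });
  model_call_once(token, [&calls] { ++calls; });
  assert(calls <= 2);
  return calls;
}
```

The model skips locking and threads entirely; it only records the effect that matters for a path-sensitive analysis, namely that the callback runs once, right at the call site.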

There is another mechanism in the analyzer, "evalCall", that allows
analyzer checkers to compute the effects of the function directly,
without consulting any sort of AST. The evalCall mechanism is older and
in many (but not all) cases more powerful, but it is probably overly
powerful and scales poorly with the number of checkers, so body farms
are preferred whenever possible.

Finally, there is an effort to allow the analyzer to import stuff from
other translation units through ASTImporter
(https://reviews.llvm.org/D30691); if successful, as a neat side effect
this may allow us to replace manual AST construction in body farms with
simply feeding raw source code to the analyzer, which might be easier.

* The C++ operator-new project is about constructing clang's CFG more
accurately. Because most of the compilation relies on LLVM's CFG, the
clang CFG is essentially used only by the analyzer and a couple of
analysis-based compiler warnings, but not for compilation, and as such
it is not entirely finished. I haven't looked deeply into this problem
yet, but it seems that by the time the analyzer sees the object
construction element in the CFG, it has not been informed that it needs
to allocate symbolic memory to hold the newly constructed object, which
needs fixing.
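
As a rough illustration of the two steps involved, a new-expression first obtains storage from an allocation function and then runs the constructor on that storage. The sketch below spells this out by hand in standard C++; `Widget` and the helper names are hypothetical, and the cleanup that a real new-expression performs when the constructor throws is left out.

```cpp
#include <cassert>
#include <new>

// Hypothetical example type with a non-trivial constructor.
struct Widget {
  int value;
  explicit Widget(int v) : value(v) {}
};

// Roughly what `new Widget(v)` does, written as two separate steps.
Widget *make_widget_in_two_steps(int v) {
  void *storage = ::operator new(sizeof(Widget)); // step 1: allocate raw memory
  return new (storage) Widget(v);                 // step 2: construct in place
}

// Mirror image of the above: destroy the object, then release the storage.
void destroy_widget(Widget *w) {
  w->~Widget();
  ::operator delete(w);
}

// Allocate, construct, read the value back, and clean up.
int widget_round_trip(int v) {
  Widget *w = make_widget_in_two_steps(v);
  assert(w != nullptr);
  int result = w->value;
  destroy_widget(w);
  return result;
}
```

Calling `widget_round_trip(42)` allocates, constructs, reads the value back, and releases the storage. Step 1 is the allocation that, per the description above, the analyzer apparently is not told about before it reaches the construction element.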

While fixing the CFG is the first step, the ultimate goal of this
project is to enable the "-analyzer-config c++-allocator-inlining=true"
option by default, which means that work would also need to be done on
the analyzer side in order to understand the new CFG items and act
accordingly.

As an example of recent CFG work I can recommend
https://reviews.llvm.org/D15031, which is not related to operator new
but gives an impression of how this area of our code looks.

* As for contacts, this mailing list is the right place to discuss what
you want to do, and our Phabricator (reviews.llvm.org) is the right
place to publish your patches. I've also CC'd the analyzer's code owner,
Anna, and other potentially interested people.

On 9/14/17 2:27 AM, Jiten Thakkar via cfe-dev wrote:

> Hi All,
> I was going through open projects page
> (https://clang-analyzer.llvm.org/open_projects.html) and wondering if
> that page is up to date or not. I found 'Explicitly model standard
> library functions with BodyFarm' and 'Enhance CFG to model C++ new
> more precisely' interesting to work on. I have some experience with
> LLVM API and modeling functions for verification as part of my masters
> project. So if anyone can let me know whom should I contact for those
> projects or how should I get started then it would be very helpful.
>
> Thanks,
> Jiten

Re: Working on open projects

Anna Zaks via cfe-dev
Hi Jiten,

The open projects list is somewhat out of date. However, the main problem is that most of the projects on the list are too difficult, especially for contributors who do not have a lot of experience working on the analyzer.

One more specific suggestion I have that aligns with the Body Farm project is to add modeling for the atomics:

Let us know if you have more questions or would like other starter project suggestions.

Thanks!
Anna.

Re: Working on open projects

Gábor Horváth via cfe-dev
Hi,

On 14 September 2017 at 16:19, Artem Dergachev <[hidden email]> wrote:

> Finally, there is an effort to allow the analyzer to import stuff from
> other translation units through ASTImporter
> (https://reviews.llvm.org/D30691); if successful, as a neat side effect
> this may allow us to replace manual AST construction in body farms with
> simply feeding raw source code to the analyzer, which might be easier.

Note that it is already possible to feed raw source code to the analyzer for use through the body farm (with some limitations). This functionality does not rely on the ASTImporter.

Regards,
Gábor

 


Re: Working on open projects

Jiten Thakkar via cfe-dev
Hi Anna,
I think the atomics modeling project seems interesting. However, looking at the BodyFarm code, I found that atomics already seem to be handled by this function: https://github.com/llvm-mirror/clang/blob/master/lib/Analysis/BodyFarm.cpp#L274 Can you please tell me whether this function needs to be improved and, if so, how?

Thanks,
Jiten

On Thu, Sep 14, 2017 at 12:49 PM, Anna Zaks <[hidden email]> wrote:

> Hi Jiten,
>
> The open projects list is somewhat out of date. However, the main problem is that most of the projects on the list are too difficult, especially, for contributors who do not have a lot of experience working on the analyzer.
>
> One more specific suggestion I have that aligns with the Body Farm project is to add modeling for the atomics:
>
> Let us know if you have more questions or would like other starter project suggestions.
>
> Thanks!
> Anna.

Re: Working on open projects

George via cfe-dev

On Sep 15, 2017, at 4:45 AM, Gábor Horváth <[hidden email]> wrote:

> Hi,
> Note that, it is possible to feed raw source code to the analyzer to be used through body farm (there are some limitations though). This functionality works without relying on the ASTImporter.

Hi Gabor,

From my understanding, that only works for C and not for C++, right?

Thanks,
George



Re: Working on open projects

Anna Zaks via cfe-dev
Hi Jiten,

The existing function models an Apple-specific API, but the std::atomic_compare_exchange_* functions are not modeled yet. The modeling would be similar but not quite the same.
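
For illustration, the semantics of the strong compare-exchange operation can be written with ordinary loads and stores, which is the kind of simplification a farmed body could use since the analyzer does not need real atomicity. This is a sketch under assumptions, not code from BodyFarm.cpp: `model_compare_exchange` is a hypothetical name, and it is specialized to `int` for brevity.

```cpp
#include <cassert>

// Hypothetical non-atomic model of a strong compare-exchange on an int.
// If *object equals *expected, store desired into *object and return true.
// Otherwise copy the observed value into *expected and return false.
bool model_compare_exchange(int *object, int *expected, int desired) {
  assert(object != nullptr && expected != nullptr);
  if (*object == *expected) {
    *object = desired;
    return true;
  }
  *expected = *object;
  return false;
}
```

Unlike a compare-and-swap that takes the old value by value, this version writes the observed value back through `expected` on failure, which may be one of the differences the modeling has to account for.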

Cheers,
Anna
On Sep 15, 2017, at 5:08 PM, Jiten Thakkar <[hidden email]> wrote:

> Hi Anna,
> I think the atomics modeling project seems interesting. But I was looking at the BodyFarm code and I found that atomics is being taken care of by this function: https://github.com/llvm-mirror/clang/blob/master/lib/Analysis/BodyFarm.cpp#L274 Can you please tell me if this function needs to be improved? How?
>
> Thanks,
> Jiten
