Clang Static Analyzer supporting Cross Translation Unit

Clang Static Analyzer supporting Cross Translation Unit

Hans Wennborg via cfe-dev
Hi all,

As far as I know, the Clang Static Analyzer supports interprocedural analysis within a single translation unit very well, but not whole-program interprocedural analysis.

I found some preliminary information about cross translation unit analysis online, such as this post: http://lists.llvm.org/pipermail/cfe-dev/2017-March/053366.html. But it seems to still be experimental work, and no further materials are available.

My work relies heavily on interprocedural analysis. I am struggling to choose between the Clang analyzer and LLVM passes as the framework for writing checkers. LLVM passes support interprocedural analysis very well, but they don't come with as many checkers as the Clang analyzer. Will cross translation unit analysis in the Clang analyzer be solidly supported in the future? Is it a promising project that you might be interested in putting effort into?

I know it might be very expensive to support both path-sensitive and interprocedural analysis, especially for large systems; the analysis may run out of memory. So I am curious whether anyone is working on cross translation unit analysis.

Thank you.

Best,
Ying

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

Re: Clang Static Analyzer supporting Cross Translation Unit

Hans Wennborg via cfe-dev
Hi Yingtong,

The work on integrating CTU into the upstream Clang Static Analyzer is still ongoing. There were some experimental prototypes, and now, as far as I know, Ericsson CodeChecker contains the version of CTU that is closest to production quality.
You should note that it is still experimental and has a number of known bugs and unimplemented functionality; however, we're working on fixing them.

(+ Gabor).


On 15.02.2018 03:31, Yingtong Liu via cfe-dev wrote:
> [...]


-- 
Best regards,
Aleksei Sidorin,
SRR, Samsung Electronics

Re: Clang Static Analyzer supporting Cross Translation Unit

Hans Wennborg via cfe-dev
The current CTU effort erases the boundaries we have between a single
translation unit and the whole program, but it isn't going to be
powerful enough to be described as a "whole-program" analysis, similarly
to how our existing inter-procedural analysis isn't quite "whole
translation unit" analysis.

With our static symbolic execution-based approach, we never attempt to
understand any significant module of the program "as a whole". Instead,
we try to model specific individual functions, and occasionally,
depending on numerous non-obvious circumstances, when we encounter calls
to other functions during such modeling, we allow ourselves to descend
into the callee to explore the consequences of the call in the current
context. This opens up execution paths that traverse multiple functions,
but we always keep in mind that we're still analyzing the program by
focusing on a very small part of the code at a time, conducting multiple
independent analyses even within a single translation unit, and never
assuming that we understand the program as a whole.
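
To make the "descend into the callee" idea concrete, here is a toy
Python model. It is only a sketch and not how the analyzer is
implemented: the op names ("call", "deref", "return_any"), the `budget`
parameter and the two-function program are all invented for
illustration. Each function is analyzed independently as an entry
point, and a call is either explored (if the body is known and the
budget allows) or treated as returning an unknown value.

```python
NULL, NONNULL, UNKNOWN = "null", "nonnull", "unknown"


def explore(name, bodies, depth, budget, warnings):
    """Yield the return value of every path through `name` that completes."""

    def run(ops, env):
        if not ops:
            yield UNKNOWN  # fell off the end of the function
            return
        op, rest = ops[0], ops[1:]
        if op[0] == "return_any":
            # Fork: one path per possible return value.
            for value in op[1]:
                yield value
        elif op[0] == "call":
            _, var, callee = op
            if callee in bodies and depth < budget:
                # Descend into the callee and continue once per callee path.
                for value in explore(callee, bodies, depth + 1, budget, warnings):
                    yield from run(rest, {**env, var: value})
            else:
                # Body not explored: the result is simply unknown.
                yield from run(rest, {**env, var: UNKNOWN})
        elif op[0] == "deref":
            _, var = op
            if env.get(var) == NULL:
                # Report and stop exploring this path.
                warnings.add(f"{name}: null dereference of '{var}'")
            else:
                yield from run(rest, env)

    yield from run(bodies[name], {})


def analyze(bodies, budget):
    """Analyze every function as an independent entry point."""
    warnings = set()
    for entry in bodies:
        for _ in explore(entry, bodies, 0, budget, warnings):
            pass
    return sorted(warnings)


# Hypothetical program: find() may return null, use() dereferences it.
bodies = {
    "find": [("return_any", [NULL, NONNULL])],
    "use": [("call", "p", "find"), ("deref", "p")],
}
```

In this model `analyze(bodies, budget=1)` reports the dereference in
`use`, while `analyze(bodies, budget=0)` reports nothing, because the
call to `find` is never explored.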

CTU allows us, again occasionally and depending on numerous non-obvious
circumstances, to do the same when we encounter calls to functions whose
bodies are defined in a different translation unit, therefore erasing
the boundaries and allowing us to focus on more promising execution
paths. The current effort is focused on that first step for now:
erasing the boundaries. As far as I know, not much effort has gone into
tweaking our heuristics for determining the promising execution paths,
but the existing heuristics work pretty well in the new circumstances,
and a significant improvement of the bugs-per-second metric is observed,
together with a considerable skew from finding deeper bugs within the
current translation unit towards finding shallower bugs that require
understanding of multiple translation units. But still, and probably
even more so, CTU is not whole-program analysis. It's only an effort to
erase the artificial boundaries of the translation unit; our static
symbolic execution approach would never scale enough to understand the
program as a whole. Even if that is possible at all, it would require
far more significant effort and more advanced techniques.
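
As a rough illustration of what "erasing the boundaries" means, the
following Python sketch models two translation units and a lookup that
may or may not consult an index of definitions in other units. All of
it is assumed for illustration: the file names, the tuple encoding of
functions and the helper names are hypothetical and do not correspond
to the real CTU implementation.

```python
# Each translation unit maps a function name to a tiny description:
#   ("returns", [values])       the function may return any of these values
#   ("deref_result_of", callee) the function dereferences callee's result
TUS = {
    "main.c": {"use": ("deref_result_of", "find")},
    "util.c": {"find": ("returns", ["null", "nonnull"])},
}


def build_index(tus):
    """Map every defined function to the translation unit holding its body."""
    return {fn: tu for tu, fns in tus.items() for fn in fns}


def lookup_body(callee, current_tu, tus, index=None):
    """Return the callee's definition if the analysis can see it, else None."""
    if callee in tus[current_tu]:
        return tus[current_tu][callee]
    if index is not None and callee in index:
        # CTU: pull the body in from the unit that defines it.
        return tus[index[callee]][callee]
    return None


def analyze_tu(current_tu, tus, index=None):
    """Report dereferences of possibly-null call results in one unit."""
    warnings = []
    for name, (kind, arg) in tus[current_tu].items():
        if kind != "deref_result_of":
            continue
        body = lookup_body(arg, current_tu, tus, index)
        if body is None:
            continue  # opaque call: result unknown, nothing to report
        if body[0] == "returns" and "null" in body[1]:
            warnings.append(f"{current_tu}:{name}: result of {arg}() may be null")
    return warnings
```

Calling `analyze_tu("main.c", TUS)` finds nothing because `find` is
opaque, whereas `analyze_tu("main.c", TUS, build_index(TUS))` reports
the dereference in `use`. Note that even with the index, each unit is
still analyzed on its own; nothing here reasons about the program as a
whole.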

So the real question here is: what kind of analysis do you want to
perform? Is symbolic execution the right tool for your work? For roughly
half of all problems, symbolic execution is not even the right tool. If,
for instance, you're trying to find a problem that can be identified by
an invariant that holds on all paths (dead code, an expression that
always has the same value, various check-after-use patterns), then the
analyzer wouldn't be of much help, because it never guarantees to
explore all paths through the program; it's only good for finding
specific paths on which a certain invariant is violated
(use-after-failed-check, null dereference, memory leak). Also, symbolic
execution of the whole program's source code doesn't scale, whereas
another analysis method may scale well.
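
The difference between the two kinds of properties can be sketched
with a toy Python model. Everything in it is an assumption made for
illustration: the ten independent branches, the path budget of 100 and
the two predicates are hypothetical. A path is a tuple of booleans, one
per branch, and the explorer visits only the first `budget` paths.

```python
from itertools import islice, product


def explore(num_branches, budget):
    """Visit at most `budget` of the 2**num_branches paths."""
    all_paths = product([False, True], repeat=num_branches)
    return list(islice(all_paths, budget))


def crashes(path):
    # Hypothetical bug: null dereference when the first branch is
    # skipped and the last branch is taken.
    return (not path[0]) and path[-1]


def x_is_zero(path):
    # Hypothetical expression: x is nonzero only if every branch is taken.
    return not all(path)


visited = explore(num_branches=10, budget=100)

# "Some path violates the invariant": one witness is enough, so a
# partial exploration can legitimately report it.
found_crash = any(crashes(p) for p in visited)

# "The property holds on all paths": every visited path agrees, yet
# the claim is false for the program, because one unvisited path
# makes x nonzero.
looks_constant = all(x_is_zero(p) for p in visited)
really_constant = all(x_is_zero(p) for p in explore(10, 2 ** 10))
```

Here `found_crash` and `looks_constant` are both true while
`really_constant` is false: the partial exploration is good evidence
for the crash but would give a wrong answer for "x always has the same
value".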

On 15/02/2018 1:41 AM, Aleksei Sidorin via cfe-dev wrote:

> [...]

Re: Clang Static Analyzer supporting Cross Translation Unit

Hans Wennborg via cfe-dev
Hi Aleksei and Artem,

Thank you so much for the detailed answers. I decided to write my own checkers from scratch using LLVM because I need the checkers to be more specific and they do rely on interprocedural analysis a lot. 

Thanks,
Ying

Best,
Yingtong

--
Ph.D
Computer Science Department
University of California, Irvine

On Thu, Feb 15, 2018 at 10:50 AM, Artem Dergachev <[hidden email]> wrote:
> [...]