Newby SA question

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

Newby SA question

via cfe-dev

Hello,

  I have (unexpectedly) found myself developing a specific SA checker, and during the process I run across an interesting observation.

1) The checker traces zero value passed to a specific function. That is not that important, but I based my implementation on zero arguments passed to memcpy/memset checker. Simple enough.
2) I realized that the path from function arguments to the checked function matters. Here is an example:

void test (unsigned short *, unsigned, unsigned);

void foo(const float aa[], unsigned short bb[], unsigned int cc)
{
  unsigned char dd;
  float ee = 0.f;
  for (dd = 0; dd < cc; dd++)
    ee += aa[dd];

  if (ee == 0.f)
    test(bb, 99, cc);
}


void bar(const float aa[], unsigned short bb[], unsigned int cc, unsigned int xx)
{
  unsigned char dd;
  float ee = 0.f;
  for (dd = 0; dd < cc; dd++)
    ee += aa[dd];

  if (ee == 0.f)
    test(bb, 99, xx);
}

In function foo arg cc passed to test() is SVal (0 U32b) (among other values) so my checker is triggered.
In function bar arg xx is SVal (reg_$2<unsigned int xx>) which does _not_ match zero check.

The reason for that, as far as I can read from the code, is the for() loop. In foo cc is used in (dd<cc) inequality, and value for it is estimated with a range(reg_$0<unsigned int cc> : { [0, 0] }) which at the point my checker run for test() is re-interpreted as known value of zero.

3) From #2 I can make a conclusion that presence of comparison (dd<cc above) alters analysis in a substantial way.

Now the question - is this expected behavior or a bug? If this is expected behavior is there a way to affect it with any options already available? If this is a bug, is there a mechanism to "flush" state after analysis of the for loop is over, and I want to run value estimator at the point of test() evaluation from scratch? Is there a better way to achieve determinism?


...or I simply do not know what I am talking about 😊  

Any feedback is highly appreciated. Thank you.

Sergei

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Newby SA question

via cfe-dev
Well, that's how it has always been and is expected to work, though it
might be not ideal, so it's kinda a little bit of both. I'll just go on
explaining how exactly Static Analyzer understands these code snippets,
and it should become clear.

In foo(), Static Analyzer says: "If the loop is executed 0 times, then
cc is inevitably equal to 0, from which it follows that in test() the
third argument is 0". This reasoning is, well, correct. The only problem
here is in the first "If": the developer might be sure that the loop
always executes at least once, even though there's no indication of it
whatsoever in the code under analysis.

It might, of course, turn out that the developer is wrong. But if he's
right, he doesn't mind writing "assert(cc);" at the beginning of the
function, which would not only suppress the false positive, but also
reduce the undefined consequences of calling test() with 0 to a simple
assertion failure, and, as a bonus, document the contract of the
function for future generations of developers. If such assertion ever
fails, Static Analyzer was right. If it never fails, the developer was
right.

Another way to describe it is that Static Analyzer proposes a unit test
for foo() that consists in calling it with cc equal to 0, because it
triggers an interesting corner-case: this is the only case in which the
loop is executed 0 times.

In bar(), on the other hand, there's no indication that the branch on
which "xx == 0" is of any interest. We would not explore it separately,
pretty much like we won't explore "xx == 2018" separately. We'll just
think of all these possibilities as a single path on which xx (or, more
specifically, the algebraic symbol "reg_$2<unsigned int xx>" that
denotes the unknown value of variable xx at the beginning of the
analysis, i.e. the argument with which the function was called) "can be
anything". That'd be the case until we encounter a line of code like "if
(xx == 2018) { ... }". This line of code indicates that the developer
expects that both branches of if() can potentially occur because
otherwise it would have been dead code that doesn't make any sense and
telling the developer to remove it would be considered a good thing.

Such presumption of no dead code is pretty much one of the central ideas
behind symbolic execution, the static analysis technique used in Static
Analyzer. The Analyzer interprets the program and "forks" the
interpreter every time it encounters an "if" in the program, remembering
that symbol "reg_$2<unsigned int xx>" satisfies certain equations
("constraints") on each branch (eg., it is equal to 2018 on one branch
and not-equal to 2018 on another branch). If the system of all equations
collected on the current execution path so far has at least one
solution, the path is explored further, otherwise it is discarded as
impossible.

Now let's go back to foo(). Is the presence of a for() loop an
indication of the possibility that the loop is executed exactly once?
Not necessarily. If the loop is known to always be executed at least
once, code still makes perfect sense, and every statement in the code is
still exercised. So the assumption that the loop is sometimes executed
exactly 0 times is definitely too eager, there's little difference
between that and the assumption that the loop is sometimes executed
exactly 2018 times.

In fact, there's a certain amount of discussion about this problem (eg.,
https://bugs.llvm.org/show_bug.cgi?id=32911) in which some advocate for
completely skipping paths on which loops are executed 0 times because
bugs found on them are too often not interesting. I personally mildly
tend to believe that this corner-case is often missed by programmers, so
it's actually pretty interesting, but this opinion is not really that
strong. I guess nobody went far enough to gather enough data to support
one of these opinions - and it's veeery hard to gather enough data in
static analysis because a representative selection of codebases is
almost impossible to obtain, they're just too different.

Is this a big problem? I think it's quite clear that it isn't, really.
Because, well, you can always put that assert() and it'd be good in so
many ways.

  Is this an easy problem to fix? Probably not. Apart from the easy hack
that destroys coverage for 0-iteration paths, the actual proper solution
is "loop widening", which is a form of what you described as "flushing"
state after the analysis. There's an experimental flag "-analyzer-config
widen-loops=true" which enables such behavior, and some users are using
it, but i won't recommend using it yet because there are known crashes
due to how it sometimes flushes information that is impossible to flush
(eg., C++ references cannot be reassigned). It's probably not that hard
to fix all such problems and have a working implementation of
widen-loops=true. But i'm also not quite convinced that such primitive
behavior is actually good. If you destroy all information gathered
before entering the loop and within the loop, you may lose a lot of
important bits of information that are completely obvious by looking at
the code. Eg., imagine code in which a variable is always set to 0
before the loop, and the loop definitely doesn't touch it, and then
after the loop we assume that it is equal to 2018. That's a false
positive that would be much more annoying to the user because the
assert() that says "this variable is not touched within the loop" is not
only hard to write in the general case, but also looks ridiculous in the
code. So when doing loop widening, it is important to preserve as much
information as possible, and that is a much more complicated problem
than simply discarding information.

What's specifically hard about loop widening is how would checkers
interact with it. If the internal state trait of the program tracked by
the checker changes within the loop, the core of the Analyzer won't be
able to figure out what should be the final state of the checker's
internal trait. In fact, it won't even be able to access the checker's
private data or make sense out of it. This means that in order to
implement loop widening, we need to force every checker developer to
implement additional checker callback, i.e. "checkLoopWidening()", to
update its private state traits. And one of the biggest strengths of the
Analyzer, with all its naive understanding of control flow, is a very
friendly checker API. Well, we could do better, but even in its current
shape a lot of people find it very pleasant to work with.

One of the areas where our path split behavior shines is "taint
analysis". Which is still experimental, but very promising. For
instance, if you knew that cc in foo() or xx in bar() was in fact read
from the user, the positive would immediately become definitely-true,
even in bar(), and we'll be able to report it. This happens because
"tainted" symbols are guaranteed to potentially take any value within
the current constraints, i.e. symbolic values obtained from the user can
be forged by a malicious hacker to correspond to any concrete value.
Which is why, say, division by zero checker drops the check for
stateNonZero being impossible when the symbol is tainted.

But unless you are dealing with tainted symbols or being really paranoid
about verifying that your program is correct, you shouldn't report the
error unless you are also sure that the non-buggy state is impossible.
That'd be super loud.

So that's where we are. As usual: trade-offs, heuristics, budgets,
suppressions... Can't really get away from it, and in this case it seems
that the current behavior is both very easy to understand and not at all
that problematic.

For more info on our attitude towards suppressions please see
https://clang-analyzer.llvm.org/faq.html#use_assert and the next few
questions of the FAQ and the links in it.

On 10/10/18 12:37 PM, Sergei Larin via cfe-dev wrote:

> Hello,
>
>    I have (unexpectedly) found myself developing a specific SA checker, and during the process I run across an interesting observation.
>
> 1) The checker traces zero value passed to a specific function. That is not that important, but I based my implementation on zero arguments passed to memcpy/memset checker. Simple enough.
> 2) I realized that the path from function arguments to the checked function matters. Here is an example:
>
> void test (unsigned short *, unsigned, unsigned);
>
> void foo(const float aa[], unsigned short bb[], unsigned int cc)
> {
>    unsigned char dd;
>    float ee = 0.f;
>    for (dd = 0; dd < cc; dd++)
>      ee += aa[dd];
>
>    if (ee == 0.f)
>      test(bb, 99, cc);
> }
>
>
> void bar(const float aa[], unsigned short bb[], unsigned int cc, unsigned int xx)
> {
>    unsigned char dd;
>    float ee = 0.f;
>    for (dd = 0; dd < cc; dd++)
>      ee += aa[dd];
>
>    if (ee == 0.f)
>      test(bb, 99, xx);
> }
>
> In function foo arg cc passed to test() is SVal (0 U32b) (among other values) so my checker is triggered.
> In function bar arg xx is SVal (reg_$2<unsigned int xx>) which does _not_ match zero check.
>
> The reason for that, as far as I can read from the code, is the for() loop. In foo cc is used in (dd<cc) inequality, and value for it is estimated with a range(reg_$0<unsigned int cc> : { [0, 0] }) which at the point my checker run for test() is re-interpreted as known value of zero.
>
> 3) From #2 I can make a conclusion that presence of comparison (dd<cc above) alters analysis in a substantial way.
>
> Now the question - is this expected behavior or a bug? If this is expected behavior is there a way to affect it with any options already available? If this is a bug, is there a mechanism to "flush" state after analysis of the for loop is over, and I want to run value estimator at the point of test() evaluation from scratch? Is there a better way to achieve determinism?
>
>
> ...or I simply do not know what I am talking about 😊
>
> Any feedback is highly appreciated. Thank you.
>
> Sergei
>
> --
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.
>
>
> _______________________________________________
> cfe-dev mailing list
> [hidden email]
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [EXTERNAL] Re: Newby SA question

via cfe-dev

Artem,

  Спасибо for the detailed explanation. It helps a lot when you understand the thinking behind a feature. I totally agree with your reasoning and the fact that tradeoff space varies form one code base to another. In my case I have to balance between acceptance of SA in a specific industrial environment and theoretical soundness of the analysis. I guess I will have to  exercise some of the mentioned tradeoffs.

  Thank you once more for the good explanation.

Sergei

-----Original Message-----
From: Artem Dergachev <[hidden email]>
Sent: Wednesday, October 10, 2018 10:54 PM
To: Sergei Larin <[hidden email]>; [hidden email]
Cc: llvm.dev <[hidden email]>
Subject: [EXTERNAL] Re: [cfe-dev] Newby SA question

Well, that's how it has always been and is expected to work, though it might be not ideal, so it's kinda a little bit of both. I'll just go on explaining how exactly Static Analyzer understands these code snippets, and it should become clear.

In foo(), Static Analyzer says: "If the loop is executed 0 times, then cc is inevitably equal to 0, from which it follows that in test() the third argument is 0". This reasoning is, well, correct. The only problem here is in the first "If": the developer might be sure that the loop always executes at least once, even though there's no indication of it whatsoever in the code under analysis.

It might, of course, turn out that the developer is wrong. But if he's right, he doesn't mind writing "assert(cc);" at the beginning of the function, which would not only suppress the false positive, but also reduce the undefined consequences of calling test() with 0 to a simple assertion failure, and, as a bonus, document the contract of the function for future generations of developers. If such assertion ever fails, Static Analyzer was right. If it never fails, the developer was right.

Another way to describe it is that Static Analyzer proposes a unit test for foo() that consists in calling it with cc equal to 0, because it triggers an interesting corner-case: this is the only case in which the loop is executed 0 times.

In bar(), on the other hand, there's no indication that the branch on which "xx == 0" is of any interest. We would not explore it separately, pretty much like we won't explore "xx == 2018" separately. We'll just think of all these possibilities as a single path on which xx (or, more specifically, the algebraic symbol "reg_$2<unsigned int xx>" that denotes the unknown value of variable xx at the beginning of the analysis, i.e. the argument with which the function was called) "can be anything". That'd be the case until we encounter a line of code like "if (xx == 2018) { ... }". This line of code indicates that the developer expects that both branches of if() can potentially occur because otherwise it would have been dead code that doesn't make any sense and telling the developer to remove it would be considered a good thing.

Such presumption of no dead code is pretty much one of the central ideas behind symbolic execution, the static analysis technique used in Static Analyzer. The Analyzer interprets the program and "forks" the interpreter every time it encounters an "if" in the program, remembering that symbol "reg_$2<unsigned int xx>" satisfies certain equations
("constraints") on each branch (eg., it is equal to 2018 on one branch and not-equal to 2018 on another branch). If the system of all equations collected on the current execution path so far has at least one solution, the path is explored further, otherwise it is discarded as impossible.

Now let's go back to foo(). Is the presence of a for() loop an indication of the possibility that the loop is executed exactly once?
Not necessarily. If the loop is known to always be executed at least once, code still makes perfect sense, and every statement in the code is still exercised. So the assumption that the loop is sometimes executed exactly 0 times is definitely too eager, there's little difference between that and the assumption that the loop is sometimes executed exactly 2018 times.

In fact, there's a certain amount of discussion about this problem (eg.,
https://bugs.llvm.org/show_bug.cgi?id=32911) in which some advocate for completely skipping paths on which loops are executed 0 times because bugs found on them are too often not interesting. I personally mildly tend to believe that this corner-case is often missed by programmers, so it's actually pretty interesting, but this opinion is not really that strong. I guess nobody went far enough to gather enough data to support one of these opinions - and it's veeery hard to gather enough data in static analysis because a representative selection of codebases is almost impossible to obtain, they're just too different.

Is this a big problem? I think it's quite clear that it isn't, really.
Because, well, you can always put that assert() and it'd be good in so many ways.

  Is this an easy problem to fix? Probably not. Apart from the easy hack that destroys coverage for 0-iteration paths, the actual proper solution is "loop widening", which is a form of what you described as "flushing"
state after the analysis. There's an experimental flag "-analyzer-config widen-loops=true" which enables such behavior, and some users are using it, but i won't recommend using it yet because there are known crashes due to how it sometimes flushes information that is impossible to flush (eg., C++ references cannot be reassigned). It's probably not that hard to fix all such problems and have a working implementation of widen-loops=true. But i'm also not quite convinced that such primitive behavior is actually good. If you destroy all information gathered before entering the loop and within the loop, you may lose a lot of important bits of information that are completely obvious by looking at the code. Eg., imagine code in which a variable is always set to 0 before the loop, and the loop definitely doesn't touch it, and then after the loop we assume that it is equal to 2018. That's a false positive that would be much more annoying to the user because the
assert() that says "this variable is not touched within the loop" is not only hard to write in the general case, but also looks ridiculous in the code. So when doing loop widening, it is important to preserve as much information as possible, and that is a much more complicated problem than simply discarding information.

What's specifically hard about loop widening is how would checkers interact with it. If the internal state trait of the program tracked by the checker changes within the loop, the core of the Analyzer won't be able to figure out what should be the final state of the checker's internal trait. In fact, it won't even be able to access the checker's private data or make sense out of it. This means that in order to implement loop widening, we need to force every checker developer to implement additional checker callback, i.e. "checkLoopWidening()", to update its private state traits. And one of the biggest strengths of the Analyzer, with all its naive understanding of control flow, is a very friendly checker API. Well, we could do better, but even in its current shape a lot of people find it very pleasant to work with.

One of the areas where our path split behavior shines is "taint analysis". Which is still experimental, but very promising. For instance, if you knew that cc in foo() or xx in bar() was in fact read from the user, the positive would immediately become definitely-true, even in bar(), and we'll be able to report it. This happens because "tainted" symbols are guaranteed to potentially take any value within the current constraints, i.e. symbolic values obtained from the user can be forged by a malicious hacker to correspond to any concrete value.
Which is why, say, division by zero checker drops the check for stateNonZero being impossible when the symbol is tainted.

But unless you are dealing with tainted symbols or being really paranoid about verifying that your program is correct, you shouldn't report the error unless you are also sure that the non-buggy state is impossible.
That'd be super loud.

So that's where we are. As usual: trade-offs, heuristics, budgets, suppressions... Can't really get away from it, and in this case it seems that the current behavior is both very easy to understand and not at all that problematic.

For more info on our attitude towards suppressions please see https://clang-analyzer.llvm.org/faq.html#use_assert and the next few questions of the FAQ and the links in it.

On 10/10/18 12:37 PM, Sergei Larin via cfe-dev wrote:

> Hello,
>
>    I have (unexpectedly) found myself developing a specific SA checker, and during the process I run across an interesting observation.
>
> 1) The checker traces zero value passed to a specific function. That is not that important, but I based my implementation on zero arguments passed to memcpy/memset checker. Simple enough.
> 2) I realized that the path from function arguments to the checked function matters. Here is an example:
>
> void test (unsigned short *, unsigned, unsigned);
>
> void foo(const float aa[], unsigned short bb[], unsigned int cc) {
>    unsigned char dd;
>    float ee = 0.f;
>    for (dd = 0; dd < cc; dd++)
>      ee += aa[dd];
>
>    if (ee == 0.f)
>      test(bb, 99, cc);
> }
>
>
> void bar(const float aa[], unsigned short bb[], unsigned int cc,
> unsigned int xx) {
>    unsigned char dd;
>    float ee = 0.f;
>    for (dd = 0; dd < cc; dd++)
>      ee += aa[dd];
>
>    if (ee == 0.f)
>      test(bb, 99, xx);
> }
>
> In function foo arg cc passed to test() is SVal (0 U32b) (among other values) so my checker is triggered.
> In function bar arg xx is SVal (reg_$2<unsigned int xx>) which does _not_ match zero check.
>
> The reason for that, as far as I can read from the code, is the for() loop. In foo cc is used in (dd<cc) inequality, and value for it is estimated with a range(reg_$0<unsigned int cc> : { [0, 0] }) which at the point my checker run for test() is re-interpreted as known value of zero.
>
> 3) From #2 I can make a conclusion that presence of comparison (dd<cc above) alters analysis in a substantial way.
>
> Now the question - is this expected behavior or a bug? If this is expected behavior is there a way to affect it with any options already available? If this is a bug, is there a mechanism to "flush" state after analysis of the for loop is over, and I want to run value estimator at the point of test() evaluation from scratch? Is there a better way to achieve determinism?
>
>
> ...or I simply do not know what I am talking about 😊
>
> Any feedback is highly appreciated. Thank you.
>
> Sergei
>
> --
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.
>
>
> _______________________________________________
> cfe-dev mailing list
> [hidden email]
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev