[analyzer] moving alpha.IdenticalExpression to core

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

[analyzer] moving alpha.IdenticalExpression to core

Eric Fiselier via cfe-dev

Hello Clang developers!

 

We wanted to see if alpha.IdenticlExpression can be moved to core. It has been in StaticAnalyzer for a while. We have run alpha.IdenticalExpression on 451 projects in debian and evaluated the results.

 

Identical expression found 127 warnings - 117 true positives and 10 of them seem to be false positives. All 10 false positives are however due to code is hidden by preprocessor (macros and ifdefs) so we can’t do anything about them. For example:

 

#ifdef __clang__

#define MACRO(X)

#endif

….

    If (x > 10) {

        MACRO(10);

    } else {

        MACRO(1000);

    }

 

=> 2.c:9:5: warning: true and false branches are identical

 

 

There was a warning that is not “perfect”. Reduced:

 

  if (x==10 || x == 20 || x == 10)

 

2.c:3:24: warning: identical expressions on both sides of logical operator

  if (x==10 || x == 20 || x == 10)

      ~~~~~            ^  ~~~~~~~

 

Maybe it could be clarified sometime.

 

Our conclusion is that this checker is ready for core.

 

Do you have some opinions?

 

Best regards,

Daniel Marjamäki

 

 

 


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [analyzer] moving alpha.IdenticalExpression to core

Eric Fiselier via cfe-dev
Hello,

I had a look at a few of this checker's warnings and i generally liked
it. I think it's worth working on. I've seen worse false positives than
that, but i'd gladly welcome fixes.

This checker is a simple syntactic check that finds nice typos and
copy-paste errors such as "a || a" or "a == a" or "if (a) { b } else { b
}". The checker is working through a mechanism that is very similar to
CloneDetector: it tries to "understand" (a-la CloneDetector's "hash")
the AST of both sides/branches and see if those "hashes" are identical.
This may lead us into trying to re-use CloneDetection framework instead
of re-doing the same thing. (+Raphael)

I've seen false positives because of AST statement kinds that he doesn't
"understand", in particular in C++ he wasn't clever enough to
discriminate between constexprs such as "std::is_same<A, B>::value ||
std::is_same<A, C>::value". It would be very important to ensure the
checker understands most of the stuff in C++ and Objective-C, so i guess
we cannot turn it on by default until those are fixed.

I've got an alternative approach in mind though: instead of traversing
the AST manually, make the checker compare results of
Stmt::printPretty() (as strings) for both sides/branches. This might be
super simple and reliable and enough for this checker; side effects of
expressions, of course, would still need to be considered (as in
Expr::hasSideEffects()). I think it should be the preferred way to go,
unless unexpected problems show up.

The example with macros can be suppressed by adding some sort of
"((void)0);" into one of the branches; i can imagine it being annoying
on some projects, but i guess it's not super bad.

I agree that the "not perfect" error message you've found is worth
improving, which should be easy.


On 10/13/17 3:42 PM, Daniel Marjamäki via cfe-dev wrote:

>
> Hello Clang developers!
>
> We wanted to see if alpha.IdenticlExpression can be moved to core. It
> has been in StaticAnalyzer for a while. We have run
> alpha.IdenticalExpression on 451 projects in debian and evaluated the
> results.
>
> Identical expression found 127 warnings - 117 true positives and 10 of
> them seem to be false positives. All 10 false positives are however
> due to code is hidden by preprocessor (macros and ifdefs) so we can’t
> do anything about them. For example:
>
> #ifdef __clang__
>
> #define MACRO(X)
>
> #endif
>
> ….
>
>     If (x > 10) {
>
>         MACRO(10);
>
>     } else {
>
>         MACRO(1000);
>
>     }
>
> => 2.c:9:5: warning: true and false branches are identical
>
> There was a warning that is not “perfect”. Reduced:
>
>   if (x==10 || x == 20 || x == 10)
>
> 2.c:3:24: warning: identical expressions on both sides of logical operator
>
>   if (x==10 || x == 20 || x == 10)
>
>       ~~~~~ ^  ~~~~~~~
>
> Maybe it could be clarified sometime.
>
> Our conclusion is that this checker is ready for core.
>
> Do you have some opinions?
>
> Best regards,
>
> Daniel Marjamäki
>
>
>
> _______________________________________________
> cfe-dev mailing list
> [hidden email]
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [analyzer] moving alpha.IdenticalExpression to core

Eric Fiselier via cfe-dev
Thanks for the response.

I will not do the necessary fixes right now but will put this in our todo list for now so we can look at this later.

> CloneDetector: it tries to "understand" (a-la CloneDetector's "hash") the AST of both sides/branches and see if those "hashes" are identical.
This may lead us into trying to re-use CloneDetection framework instead of re-doing the same thing. (+Raphael)

As far as I know, the CloneDetector has lots of false positives.

> I've seen false positives because of AST statement kinds that he doesn't "understand", in particular in C++ he wasn't clever enough to discriminate between constexprs such as "std::is_same<A, B>::value || std::is_same<A, C>::value". It would be very important to ensure the checker understands most of the stuff in C++ and Objective-C, so i guess we cannot turn it on by default until those are fixed.

Yes I agree that is very important.

> I've got an alternative approach in mind though: instead of traversing the AST manually, make the checker compare results of
Stmt::printPretty() (as strings) for both sides/branches.

That has advantages. Maybe that would allow that we warn about identical code when macros are used. As far as I know the checker bails out when it see that macros are used so this won't generate a warning:

    If (a==HOURS_PER_DAY || a==HOURS_PER_DAY)

However that is how the Cppcheck checker worked before and we changed it to AST. The Cppcheck stringification is not rock-solid, details (indentation, macronames, etc) are lost. For this checker the indentation should be ignored but other details are important.

> I agree that the "not perfect" error message you've found is worth improving, which should be easy.

Yes.

Best regards,
Daniel Marjamäki

-----Original Message-----
From: Artem Dergachev [mailto:[hidden email]]
Sent: den 13 oktober 2017 16:31
To: Daniel Marjamäki; '[hidden email]'
Cc: Anders Rönnholm; Raphael Isemann
Subject: Re: [cfe-dev] [analyzer] moving alpha.IdenticalExpression to core

Hello,

I had a look at a few of this checker's warnings and i generally liked it. I think it's worth working on. I've seen worse false positives than that, but i'd gladly welcome fixes.

This checker is a simple syntactic check that finds nice typos and copy-paste errors such as "a || a" or "a == a" or "if (a) { b } else { b }". The checker is working through a mechanism that is very similar to
CloneDetector: it tries to "understand" (a-la CloneDetector's "hash") the AST of both sides/branches and see if those "hashes" are identical.
This may lead us into trying to re-use CloneDetection framework instead of re-doing the same thing. (+Raphael)

I've seen false positives because of AST statement kinds that he doesn't "understand", in particular in C++ he wasn't clever enough to discriminate between constexprs such as "std::is_same<A, B>::value || std::is_same<A, C>::value". It would be very important to ensure the checker understands most of the stuff in C++ and Objective-C, so i guess we cannot turn it on by default until those are fixed.

I've got an alternative approach in mind though: instead of traversing the AST manually, make the checker compare results of
Stmt::printPretty() (as strings) for both sides/branches. This might be super simple and reliable and enough for this checker; side effects of expressions, of course, would still need to be considered (as in Expr::hasSideEffects()). I think it should be the preferred way to go, unless unexpected problems show up.

The example with macros can be suppressed by adding some sort of "((void)0);" into one of the branches; i can imagine it being annoying on some projects, but i guess it's not super bad.

I agree that the "not perfect" error message you've found is worth improving, which should be easy.


On 10/13/17 3:42 PM, Daniel Marjamäki via cfe-dev wrote:

>
> Hello Clang developers!
>
> We wanted to see if alpha.IdenticlExpression can be moved to core. It
> has been in StaticAnalyzer for a while. We have run
> alpha.IdenticalExpression on 451 projects in debian and evaluated the
> results.
>
> Identical expression found 127 warnings - 117 true positives and 10 of
> them seem to be false positives. All 10 false positives are however
> due to code is hidden by preprocessor (macros and ifdefs) so we can't
> do anything about them. For example:
>
> #ifdef __clang__
>
> #define MACRO(X)
>
> #endif
>
> ..
>
>     If (x > 10) {
>
>         MACRO(10);
>
>     } else {
>
>         MACRO(1000);
>
>     }
>
> => 2.c:9:5: warning: true and false branches are identical
>
> There was a warning that is not "perfect". Reduced:
>
>   if (x==10 || x == 20 || x == 10)
>
> 2.c:3:24: warning: identical expressions on both sides of logical operator
>
>   if (x==10 || x == 20 || x == 10)
>
>       ~~~~~ ^  ~~~~~~~
>
> Maybe it could be clarified sometime.
>
> Our conclusion is that this checker is ready for core.
>
> Do you have some opinions?
>
> Best regards,
>
> Daniel Marjamäki
>
>
>
> _______________________________________________
> cfe-dev mailing list
> [hidden email]
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [analyzer] moving alpha.IdenticalExpression to core

Eric Fiselier via cfe-dev

On 16/10/17 09:35, Daniel Marjamäki via cfe-dev wrote:
> Thanks for the response.
>
> I will not do the necessary fixes right now but will put this in our todo list for now so we can look at this later.
>
>> CloneDetector: it tries to "understand" (a-la CloneDetector's "hash") the AST of both sides/branches and see if those "hashes" are identical.
> This may lead us into trying to re-use CloneDetection framework instead of re-doing the same thing. (+Raphael)
It'd be great if we can centralize the implementation and reuse as much
as possible the CloneDetection infrastructure. IIRC, Raphael has opened
the API recently for clients as we were investigating whether we can
reuse it in some other parts in clang.
>
> As far as I know, the CloneDetector has lots of false positives.
I'd leave to Raphael to give details about that ;)

>
>> I've seen false positives because of AST statement kinds that he doesn't "understand", in particular in C++ he wasn't clever enough to discriminate between constexprs such as "std::is_same<A, B>::value || std::is_same<A, C>::value". It would be very important to ensure the checker understands most of the stuff in C++ and Objective-C, so i guess we cannot turn it on by default until those are fixed.
> Yes I agree that is very important.
>
>> I've got an alternative approach in mind though: instead of traversing the AST manually, make the checker compare results of
> Stmt::printPretty() (as strings) for both sides/branches.
>
> That has advantages. Maybe that would allow that we warn about identical code when macros are used. As far as I know the checker bails out when it see that macros are used so this won't generate a warning:
>
>      If (a==HOURS_PER_DAY || a==HOURS_PER_DAY)
>
> However that is how the Cppcheck checker worked before and we changed it to AST. The Cppcheck stringification is not rock-solid, details (indentation, macronames, etc) are lost. For this checker the indentation should be ignored but other details are important.
>
>> I agree that the "not perfect" error message you've found is worth improving, which should be easy.
> Yes.
>
> Best regards,
> Daniel Marjamäki
>
> -----Original Message-----
> From: Artem Dergachev [mailto:[hidden email]]
> Sent: den 13 oktober 2017 16:31
> To: Daniel Marjamäki; '[hidden email]'
> Cc: Anders Rönnholm; Raphael Isemann
> Subject: Re: [cfe-dev] [analyzer] moving alpha.IdenticalExpression to core
>
> Hello,
>
> I had a look at a few of this checker's warnings and i generally liked it. I think it's worth working on. I've seen worse false positives than that, but i'd gladly welcome fixes.
>
> This checker is a simple syntactic check that finds nice typos and copy-paste errors such as "a || a" or "a == a" or "if (a) { b } else { b }". The checker is working through a mechanism that is very similar to
> CloneDetector: it tries to "understand" (a-la CloneDetector's "hash") the AST of both sides/branches and see if those "hashes" are identical.
> This may lead us into trying to re-use CloneDetection framework instead of re-doing the same thing. (+Raphael)
>
> I've seen false positives because of AST statement kinds that he doesn't "understand", in particular in C++ he wasn't clever enough to discriminate between constexprs such as "std::is_same<A, B>::value || std::is_same<A, C>::value". It would be very important to ensure the checker understands most of the stuff in C++ and Objective-C, so i guess we cannot turn it on by default until those are fixed.
>
> I've got an alternative approach in mind though: instead of traversing the AST manually, make the checker compare results of
> Stmt::printPretty() (as strings) for both sides/branches. This might be super simple and reliable and enough for this checker; side effects of expressions, of course, would still need to be considered (as in Expr::hasSideEffects()). I think it should be the preferred way to go, unless unexpected problems show up.
>
> The example with macros can be suppressed by adding some sort of "((void)0);" into one of the branches; i can imagine it being annoying on some projects, but i guess it's not super bad.
>
> I agree that the "not perfect" error message you've found is worth improving, which should be easy.
>
>
> On 10/13/17 3:42 PM, Daniel Marjamäki via cfe-dev wrote:
>> Hello Clang developers!
>>
>> We wanted to see if alpha.IdenticlExpression can be moved to core. It
>> has been in StaticAnalyzer for a while. We have run
>> alpha.IdenticalExpression on 451 projects in debian and evaluated the
>> results.
>>
>> Identical expression found 127 warnings - 117 true positives and 10 of
>> them seem to be false positives. All 10 false positives are however
>> due to code is hidden by preprocessor (macros and ifdefs) so we can't
>> do anything about them. For example:
>>
>> #ifdef __clang__
>>
>> #define MACRO(X)
>>
>> #endif
>>
>> ..
>>
>>      If (x > 10) {
>>
>>          MACRO(10);
>>
>>      } else {
>>
>>          MACRO(1000);
>>
>>      }
>>
>> => 2.c:9:5: warning: true and false branches are identical
>>
>> There was a warning that is not "perfect". Reduced:
>>
>>    if (x==10 || x == 20 || x == 10)
>>
>> 2.c:3:24: warning: identical expressions on both sides of logical operator
>>
>>    if (x==10 || x == 20 || x == 10)
>>
>>        ~~~~~ ^  ~~~~~~~
>>
>> Maybe it could be clarified sometime.
>>
>> Our conclusion is that this checker is ready for core.
>>
>> Do you have some opinions?
>>
>> Best regards,
>>
>> Daniel Marjamäki
>>
>>
>>
>> _______________________________________________
>> cfe-dev mailing list
>> [hidden email]
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
> _______________________________________________
> cfe-dev mailing list
> [hidden email]
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev