Re: Static analyzer question on how to tell if a Decl is a particular c++ function, or a particular class

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

Re: Static analyzer question on how to tell if a Decl is a particular c++ function, or a particular class

Anastasia Stulova via cfe-dev
Thanks Artem, Aleksei for those precious information.

Yes, I observed exactly the same thing about MaterializeTemporaryExpr and I actually spent quite sometime trace into the clang's codebase and reached to the exactly same place in the link: createTemporaryRegionIfNeeded. My workaround was inspired by one of the example checkers, that uses a checkPostStmt on MaterializeTemporaryExpr. Thanks for pointing out this.

Aleksei, thanks for pointing out the getQualifiedNameAsString, I didn't know I could use that  . I personally prefers it because getParent()->getName() doesn't contain the namespace, but only the class name. So to make it safe, I need to go all the way up to traverse all parent decls, and getQualifiedNameAsString already does the thing for me.

Thanks for taking your time to answer my question. I appreciate it.

Now I observed some other weird thing, and I wonder if anyone has seen this before and knows any solution.

As a reminder, the checker I am writing is for a particular class T in my company's code base and the checker is to ensure T.ok() == true before calling T.value().

The above is actually a over-simplification to my task. The class T actually has a member, Let's say s_ (in type S) to keep track of the status, so T.ok() actually returns s_.ok(). And type S is used directly by a few other classes and macros which are used to creates T. So to make the check fully working, I also need to model the behaviors of S and a few other classes.

Fortunately, T are template class and therefore all method implementations are visible to current TU. I was thinking, it is a whole lot easier if I can just model the behavior of S, and rely on inline evaluation in the static analyzer. In most cases, it works, except I found the following corner case (suppose T has a conversion constructor from X: T(X&&) and default move constructor T(T&&)):

X f();
T t = f();

This is essentially constructs a temporary T t2{f()} and then doing a move T t{std::move(t2)}. It looks fine if everything is inlined, however, the inline doesn't work because t2 is a temporary. I dived into the code base and what happened is

1. At the end of ExprEngine::getRegionForConstructedObject, it is marked as improperly modeled target region:

2. In ExprEngine::mayInlineCallKind, it returns false if the flag is set:
      // If we did not find the correct this-region, it would be pointless
      // to inline the constructor. Instead we will simply invalidate
      // the fake temporary target.
      if (CallOpts.IsCtorOrDtorWithImproperlyModeledTargetRegion)
        return CIP_DisallowedOnce;

I am not an expert in C++ compiler. I just wonder why it can't allocates a temporary region for temporary variables? Having this special case means I need to model all T's constructors, which is really annoying. Does anyone knows some workarounds? Thanks.


----- Original Message -----
From: Artem Dergachev <[hidden email]>
To: Aleksei Sidorin <[hidden email]>, Li Kan <[hidden email]>, [hidden email]
Subject: Re: [cfe-dev] Static analyzer question on how to tell if a Decl is a particular c++ function, or a particular class
Date: 2018-02-16 02:26


Yeah, the traditional way to see if this is a certain method of a
certain class is to compare the name of the method
(CXXMethodDecl->getName()) and the class
(CXXMethodDecl->getParent()->getName()), as strings, to the desired
strings. getQualifiedNameAsString() works fine, but it's considered slow
for whatever reason (i've never observed it personally). ASTMatchers
would also work, even if a bit of an overkill. Don't forget to check the
method's argument number and types as well. We have an effort to make
this whole common idiom more simple via CallDescription, but it doesn't
work for C++ methods yet.
With C++, you'll need to be aware of the bug we're currently having with
object lifetime extension. If you're identifying the object by its
memory region (which is correct), you'd occasionally notice that the
region for a C++ temporary object may change along the analysis even if
no copy-construction occurs. It has been making our new C++ checkers
(most noticeably, the iterator checker) more difficult to write than
they needed to be. More details in
http://lists.llvm.org/pipermail/cfe-dev/2018-February/056898.html - i'm
hoping to do something about it soon.
On 15/02/2018 1:35 AM, Aleksei Sidorin via cfe-dev wrote:

> Hello Li,
>
> What you're doing is mostly fine. I can only add some potentially
> useful info inline.
>
> 14.02.2018 23:04, Li Kan via cfe-dev пишет:
>> Hi folks,
>>
>> If this is not the correct mailing list for this question, please let
>> me know.
> You're in the right place. Welcome :)
>
>> I am trying to write a static analyzer for code base of my current
>> job. The particular static analyzer I am writing involves a
>> particular class of my company's code base. Let's say it is T. I want
>> to write a checker to ensure T.ok() is called before T.value() is called.
>>
>> Static analyzer is perfect for this type of check as it requires path
>> sensitive checks. When I trying to write the checker, I basically
>> checks for pre-call and post-call. I want to tell if a CallEvent is
>> T.ok() and T.value(). Currently what I am doing is:
>> 1. From CallEvent, I cast to CXXMemberCall, and getOriginExpr(), then
>> call getMethodDecl().
> You can also try to do the following chain:
> dyn_cast_or_null<CXXMethodDecl>(CallEvent.getDecl()) to obtain
> CXXMethodDecl.
>
>> 2. From CXXMethodDecl, I first call getThisType(), then call
>> getAsCXXRecordDecl(), then getQualifiedNameAsString(), and compare
>> with qualified name of T, to make sure it is member of T.
> To obtain parent class declaration from CXXMethodDecl, you can use
> getParent() method. So, the pseudocode will look like
> "MethodDecl->getParent()->getQualifiedNameAsString()".
>
>> 3. From CXXMethodDecl, I call getNameAsString() to get the method
>> name, and compare them with "ok" and "value".
> This looks OK. But it can be useful to know that
> getQualifiedNameAsString() for CXXMethodDecl will also include parent
> name. So, you can just check if MethodDecl->getQualifiedNameAsString()
> is equal to "T::value" or "T::ok".
>
>> It works, but it looks complicated, and involves string comparison,
>> which I assume is slow. Is there a easier way, blessed by clang
>> static analyzer official teams, that tell if a MethodDecl, or
>> CXXRecordDecl, is the function or class I am interested in?
>>
>> The ideal way is, there is one-time call to lookup the MethodDecl for
>> T::ok() and T::value(), and CXXRecordDecl for T. Then ever since
>> then, I just need to compare the pointers.
> This is possible. You can just launch a simple matcher against AST:
> cxxMethodDecl(hasName("::T::value"))
> and get CXXMethodDecl you need and compare with it later.
>
>> Another way is, the first time I found it is T, T::ok() or
>> T::value(), I save the pointer of CXXRecordDecl or MethodDecl, and
>> use the pointers later. Is this the reliable (assume I use the
>> pointer canonical decl will not change) and blessed way to do it? If
>> it is, is my steps above the correct and simplest way to determine it
>> is T::ok or T::value? Is there some better and more reliable way?
> I think it is reliable enough: pointers to canonical declarations
> don't change in AST.
>>
>> Thanks.
>>
>>
>> _______________________________________________
>> cfe-dev mailing list
>> [hidden email]
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>
>
> --
> Best regards,
> Aleksei Sidorin,
> SRR, Samsung Electronics
>
>
> _______________________________________________
> cfe-dev mailing list
> [hidden email]
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Static analyzer question on how to tell if a Decl is a particular c++ function, or a particular class

Anastasia Stulova via cfe-dev
You're essentially in an area that's largely under construction at the
moment, and i'm likely to address your issue in like nearest future, you
can probably follow my progress on phabricator or in cfe-commits, or i
can try to ping you when this case is supported.

Being source-based, the analyzer needs to model every possible language
construct explicitly, and in C++ it means *a ton of stuff*. We could
analyze, like, llvm IR instead, and it'd be much easier to implement
correctly for us, but it'd be much harder to produce user-friendly
warnings, because the user does think in terms of source code. And it'd
be harder to make checkers for the same reason. Which is why "symbolic
execution" we're doing here is to compilation like algebra to
arithmetic: we need to do the same thing - support all language
constructs - but with a lot of "concrete" values replaced with algebraic
unknown values. Luckily, it is also easier because we don't have to
support everything perfectly: we can always say that we don't know
something, behave "conservatively", and the analysis would be slightly
less precise but this is fine if there's not too much of these.

Modeling temporaries is one of such cases - this is one of the difficult
parts of the language that we never did properly until now. One does not
simply construct into a temporary region because in this case we're not
even sure that it is indeed a temporary region in this part of the
analyzer, it cannot be understood by looking at the construct-expression
in the AST, which is something i'm working around now by introducing
"construction contexts". And even if we were sure it's a temporary, we
need to not only construct the object, but also know how to properly
extend its lifetime and destroy it when lifetime ends, otherwise we'd
encounter massive false positives (because it'd not be "we don't know
something", it'd be "we know something but it's wrong"), which should
also be part of the "context" in which construction takes place. So for
now this part simply behaves conservatively.

Your particular case will be relatively tricky when f() is inlined
during analysis. In this case the temporary will be constructed within
the function, but in order to properly work with this temporary we'd
need what the caller function is going to do with it - at least
according to the AST that we currently have this is what we have to do.
More info in
http://lists.llvm.org/pipermail/cfe-dev/2018-February/056898.html - "==
Return by value ==" part. CodeGen does it differently, treating a lot of
AST rvalues as lvalues under the hood, but i don't think we can afford
breaking this invariant for now, because, well, we're more complicated
and fragile.

On 22/02/2018 4:46 PM, [hidden email] wrote:

> Thanks Artem, Aleksei for those precious information.
>
> Yes, I observed exactly the same thing about MaterializeTemporaryExpr
> and I actually spent quite sometime trace into the clang's codebase
> and reached to the exactly same place in the link:
> createTemporaryRegionIfNeeded. My workaround was inspired by one of
> the example checkers, that uses a checkPostStmt on
> MaterializeTemporaryExpr. Thanks for pointing out this.
>
> Aleksei, thanks for pointing out the getQualifiedNameAsString, I
> didn't know I could use that  . I personally prefers it because
> getParent()->getName() doesn't contain the namespace, but only the
> class name. So to make it safe, I need to go all the way up to
> traverse all parent decls, and getQualifiedNameAsString already does
> the thing for me.
>
> Thanks for taking your time to answer my question. I appreciate it.
>
> Now I observed some other weird thing, and I wonder if anyone has seen
> this before and knows any solution.
>
> As a reminder, the checker I am writing is for a particular class T in
> my company's code base and the checker is to ensure T.ok() == true
> before calling T.value().
>
> The above is actually a over-simplification to my task. The class T
> actually has a member, Let's say s_ (in type S) to keep track of the
> status, so T.ok() actually returns s_.ok(). And type S is used
> directly by a few other classes and macros which are used to creates
> T. So to make the check fully working, I also need to model the
> behaviors of S and a few other classes.
>
> Fortunately, T are template class and therefore all method
> implementations are visible to current TU. I was thinking, it is a
> whole lot easier if I can just model the behavior of S, and rely on
> inline evaluation in the static analyzer. In most cases, it works,
> except I found the following corner case (suppose T has a conversion
> constructor from X: T(X&&) and default move constructor T(T&&)):
>
> X f();
> T t = f();
>
> This is essentially constructs a temporary T t2{f()} and then doing a
> move T t{std::move(t2)}. It looks fine if everything is inlined,
> however, the inline doesn't work because t2 is a temporary. I dived
> into the code base and what happened is
>
> 1. At the end of ExprEngine::getRegionForConstructedObject
> <https://clang.llvm.org/doxygen/ExprEngineCXX_8cpp_source.html#%2100183>,
> it is marked as improperly modeled target region:
> CallOpts.IsCtorOrDtorWithImproperlyModeledTargetRegion
> <https://clang.llvm.org/doxygen/structclang_1_1ento_1_1ExprEngine_1_1EvalCallOptions.html#a339aa5a5b9aecfa8a844647b8d211eb0>=
> true;
> 2. In ExprEngine::mayInlineCallKind
> <https://clang.llvm.org/doxygen/ExprEngineCallAndReturn_8cpp_source.html#%2100677>,
> it returns false if the flag is set:
> // If we did not find the correct this-region, it would be pointless
> // to inline the constructor. Instead we will simply invalidate // the
> fake temporary target. if (CallOpts
> <https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm/tools/clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp?l=618&ct=xref_jump_to_def&gsn=CallOpts&rcl=186553626>.IsCtorOrDtorWithImproperlyModeledTargetRegion
> <https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm/tools/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h?l=62&ct=xref_jump_to_def&gsn=IsCtorOrDtorWithImproperlyModeledTargetRegion&rcl=186553626>)
> return CIP_DisallowedOnce
> <https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm/tools/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h?l=623&ct=xref_jump_to_def&gsn=CIP_DisallowedOnce&rcl=186553626>;
>
> I am not an expert in C++ compiler. I just wonder why it can't
> allocates a temporary region for temporary variables? Having this
> special case means I need to model all T's constructors, which is
> really annoying. Does anyone knows some workarounds? Thanks.
>
>
> ----- Original Message -----
> From: Artem Dergachev <[hidden email]>
> To: Aleksei Sidorin <[hidden email]>, Li Kan
> <[hidden email]>, [hidden email]
> Subject: Re: [cfe-dev] Static analyzer question on how to tell if a
> Decl is a particular c++ function, or a particular class
> Date: 2018-02-16 02:26
>
>
> Yeah, the traditional way to see if this is a certain method of a
> certain class is to compare the name of the method
> (CXXMethodDecl->getName()) and the class
> (CXXMethodDecl->getParent()->getName()), as strings, to the desired
> strings. getQualifiedNameAsString() works fine, but it's considered slow
> for whatever reason (i've never observed it personally). ASTMatchers
> would also work, even if a bit of an overkill. Don't forget to check the
> method's argument number and types as well. We have an effort to make
> this whole common idiom more simple via CallDescription, but it doesn't
> work for C++ methods yet.
> With C++, you'll need to be aware of the bug we're currently having with
> object lifetime extension. If you're identifying the object by its
> memory region (which is correct), you'd occasionally notice that the
> region for a C++ temporary object may change along the analysis even if
> no copy-construction occurs. It has been making our new C++ checkers
> (most noticeably, the iterator checker) more difficult to write than
> they needed to be. More details in
> http://lists.llvm.org/pipermail/cfe-dev/2018-February/056898.html - i'm
> hoping to do something about it soon.
> On 15/02/2018 1:35 AM, Aleksei Sidorin via cfe-dev wrote:
> > Hello Li,
> >
> > What you're doing is mostly fine. I can only add some potentially
> > useful info inline.
> >
> > 14.02.2018 23:04, Li Kan via cfe-dev пишет:
> >> Hi folks,
> >>
> >> If this is not the correct mailing list for this question, please let
> >> me know.
> > You're in the right place. Welcome :)
> >
> >> I am trying to write a static analyzer for code base of my current
> >> job. The particular static analyzer I am writing involves a
> >> particular class of my company's code base. Let's say it is T. I want
> >> to write a checker to ensure T.ok() is called before T.value() is
> called.
> >>
> >> Static analyzer is perfect for this type of check as it requires path
> >> sensitive checks. When I trying to write the checker, I basically
> >> checks for pre-call and post-call. I want to tell if a CallEvent is
> >> T.ok() and T.value(). Currently what I am doing is:
> >> 1. From CallEvent, I cast to CXXMemberCall, and getOriginExpr(), then
> >> call getMethodDecl().
> > You can also try to do the following chain:
> > dyn_cast_or_null<CXXMethodDecl>(CallEvent.getDecl()) to obtain
> > CXXMethodDecl.
> >
> >> 2. From CXXMethodDecl, I first call getThisType(), then call
> >> getAsCXXRecordDecl(), then getQualifiedNameAsString(), and compare
> >> with qualified name of T, to make sure it is member of T.
> > To obtain parent class declaration from CXXMethodDecl, you can use
> > getParent() method. So, the pseudocode will look like
> > "MethodDecl->getParent()->getQualifiedNameAsString()".
> >
> >> 3. From CXXMethodDecl, I call getNameAsString() to get the method
> >> name, and compare them with "ok" and "value".
> > This looks OK. But it can be useful to know that
> > getQualifiedNameAsString() for CXXMethodDecl will also include parent
> > name. So, you can just check if MethodDecl->getQualifiedNameAsString()
> > is equal to "T::value" or "T::ok".
> >
> >> It works, but it looks complicated, and involves string comparison,
> >> which I assume is slow. Is there a easier way, blessed by clang
> >> static analyzer official teams, that tell if a MethodDecl, or
> >> CXXRecordDecl, is the function or class I am interested in?
> >>
> >> The ideal way is, there is one-time call to lookup the MethodDecl for
> >> T::ok() and T::value(), and CXXRecordDecl for T. Then ever since
> >> then, I just need to compare the pointers.
> > This is possible. You can just launch a simple matcher against AST:
> > cxxMethodDecl(hasName("::T::value"))
> > and get CXXMethodDecl you need and compare with it later.
> >
> >> Another way is, the first time I found it is T, T::ok() or
> >> T::value(), I save the pointer of CXXRecordDecl or MethodDecl, and
> >> use the pointers later. Is this the reliable (assume I use the
> >> pointer canonical decl will not change) and blessed way to do it? If
> >> it is, is my steps above the correct and simplest way to determine it
> >> is T::ok or T::value? Is there some better and more reliable way?
> > I think it is reliable enough: pointers to canonical declarations
> > don't change in AST.
> >>
> >> Thanks.
> >>
> >>
> >> _______________________________________________
> >> cfe-dev mailing list
> >> [hidden email]
> >> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
> >
> >
> > --
> > Best regards,
> > Aleksei Sidorin,
> > SRR, Samsung Electronics
> >
> >
> > _______________________________________________
> > cfe-dev mailing list
> > [hidden email]
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Static analyzer question on how to tell if a Decl is a particular c++ function, or a particular class

Anastasia Stulova via cfe-dev
In reply to this post by Anastasia Stulova via cfe-dev
Thanks Artem for clarification. It is very clear and now I understand the situation. Sorry to bother again, but now I have yet another question.

I am now trying to make the checker work for pointer escape. Since I track the status of objects using MemRegion, so I am using checkRegionChanges. I've observed something weird.

Suppose I have free functions whose definitions are not seen by current TU: void f(const S*); void g(S*);
1. Suppose I have
S s;
f(&s);
Then in checkRegionChanges, s's region will appear in explicit_regions but not regions.
2. Suppose I have
S s;
g(&s);
Then in checkRegionChanges, t's region will appear in both explicit_regions and regions.

Based on the above observations, if I want to remove s from GDM in case 2 but not 1, then I need to iterate over all entries in GDM and remove the ones that are sub-regions of any one in `region` variable, not `explicit_regions` variable. However, this doesn't work for the following case
3. If I have struct T { S s; D d; }; and U has some non-const member void D::h(); for the following call:
T t;
t.d.h();
Then in checkRegionChanges, t.d is in explicit_regions and t is in regions. So t.s will be removed from GDM if I check 'regions' variable, which is incorrect.

My question is, is case 1 a bug or a feature that s appears in explicit_regions? If it is by design, how what is the blessed way to distinguish the above cases? Thanks.

BTW: the code that causes the behavior is
https://clang.llvm.org/doxygen/RegionStore_8cpp_source.html#l01256
     if (const MemRegion *R = V.getAsRegion()) {
       if (TopLevelRegions)
         TopLevelRegions->push_back(R);
       W.AddToWorkList(R);
       continue;
     }
All regions are added to TopLevelRegions, without checking RegionAndSymbolInvalidationTraits::TK_PreserveContents.

----- Original Message -----
From: Artem Dergachev <[hidden email]>
To: [hidden email], Aleksei Sidorin <[hidden email]>, cfe-dev <[hidden email]>
Subject: Re: [cfe-dev] Static analyzer question on how to tell if a Decl is a particular c++ function, or a particular class
Date: 2018-02-23 09:20


You're essentially in an area that's largely under construction at the
moment, and i'm likely to address your issue in like nearest future, you
can probably follow my progress on phabricator or in cfe-commits, or i
can try to ping you when this case is supported.
Being source-based, the analyzer needs to model every possible language
construct explicitly, and in C++ it means *a ton of stuff*. We could
analyze, like, llvm IR instead, and it'd be much easier to implement
correctly for us, but it'd be much harder to produce user-friendly
warnings, because the user does think in terms of source code. And it'd
be harder to make checkers for the same reason. Which is why "symbolic
execution" we're doing here is to compilation like algebra to
arithmetic: we need to do the same thing - support all language
constructs - but with a lot of "concrete" values replaced with algebraic
unknown values. Luckily, it is also easier because we don't have to
support everything perfectly: we can always say that we don't know
something, behave "conservatively", and the analysis would be slightly
less precise but this is fine if there's not too much of these.
Modeling temporaries is one of such cases - this is one of the difficult
parts of the language that we never did properly until now. One does not
simply construct into a temporary region because in this case we're not
even sure that it is indeed a temporary region in this part of the
analyzer, it cannot be understood by looking at the construct-expression
in the AST, which is something i'm working around now by introducing
"construction contexts". And even if we were sure it's a temporary, we
need to not only construct the object, but also know how to properly
extend its lifetime and destroy it when lifetime ends, otherwise we'd
encounter massive false positives (because it'd not be "we don't know
something", it'd be "we know something but it's wrong"), which should
also be part of the "context" in which construction takes place. So for
now this part simply behaves conservatively.
Your particular case will be relatively tricky when f() is inlined
during analysis. In this case the temporary will be constructed within
the function, but in order to properly work with this temporary we'd
need what the caller function is going to do with it - at least
according to the AST that we currently have this is what we have to do.
More info in
http://lists.llvm.org/pipermail/cfe-dev/2018-February/056898.html - "==
Return by value ==" part. CodeGen does it differently, treating a lot of
AST rvalues as lvalues under the hood, but i don't think we can afford
breaking this invariant for now, because, well, we're more complicated
and fragile.
On 22/02/2018 4:46 PM, [hidden email] wrote:

> Thanks Artem, Aleksei for those precious information.
>
> Yes, I observed exactly the same thing about MaterializeTemporaryExpr
> and I actually spent quite sometime trace into the clang's codebase
> and reached to the exactly same place in the link:
> createTemporaryRegionIfNeeded. My workaround was inspired by one of
> the example checkers, that uses a checkPostStmt on
> MaterializeTemporaryExpr. Thanks for pointing out this.
>
> Aleksei, thanks for pointing out the getQualifiedNameAsString, I
> didn't know I could use that  . I personally prefers it because
> getParent()->getName() doesn't contain the namespace, but only the
> class name. So to make it safe, I need to go all the way up to
> traverse all parent decls, and getQualifiedNameAsString already does
> the thing for me.
>
> Thanks for taking your time to answer my question. I appreciate it.
>
> Now I observed some other weird thing, and I wonder if anyone has seen
> this before and knows any solution.
>
> As a reminder, the checker I am writing is for a particular class T in
> my company's code base and the checker is to ensure T.ok() == true
> before calling T.value().
>
> The above is actually a over-simplification to my task. The class T
> actually has a member, Let's say s_ (in type S) to keep track of the
> status, so T.ok() actually returns s_.ok(). And type S is used
> directly by a few other classes and macros which are used to creates
> T. So to make the check fully working, I also need to model the
> behaviors of S and a few other classes.
>
> Fortunately, T are template class and therefore all method
> implementations are visible to current TU. I was thinking, it is a
> whole lot easier if I can just model the behavior of S, and rely on
> inline evaluation in the static analyzer. In most cases, it works,
> except I found the following corner case (suppose T has a conversion
> constructor from X: T(X&&) and default move constructor T(T&&)):
>
> X f();
> T t = f();
>
> This is essentially constructs a temporary T t2{f()} and then doing a
> move T t{std::move(t2)}. It looks fine if everything is inlined,
> however, the inline doesn't work because t2 is a temporary. I dived
> into the code base and what happened is
>
> 1. At the end of ExprEngine::getRegionForConstructedObject
> <https://clang.llvm.org/doxygen/ExprEngineCXX_8cpp_source.html#%2100183>,
> it is marked as improperly modeled target region:
> CallOpts.IsCtorOrDtorWithImproperlyModeledTargetRegion
> <https://clang.llvm.org/doxygen/structclang_1_1ento_1_1ExprEngine_1_1EvalCallOptions.html#a339aa5a5b9aecfa8a844647b8d211eb0>=
> true;
> 2. In ExprEngine::mayInlineCallKind
> <https://clang.llvm.org/doxygen/ExprEngineCallAndReturn_8cpp_source.html#%2100677>,
> it returns false if the flag is set:
> // If we did not find the correct this-region, it would be pointless
> // to inline the constructor. Instead we will simply invalidate // the
> fake temporary target. if (CallOpts
> <https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm/tools/clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp?l=618&ct=xref_jump_to_def&gsn=CallOpts&rcl=186553626>.IsCtorOrDtorWithImproperlyModeledTargetRegion
> <https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm/tools/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h?l=62&ct=xref_jump_to_def&gsn=IsCtorOrDtorWithImproperlyModeledTargetRegion&rcl=186553626>)
> return CIP_DisallowedOnce
> <https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm/tools/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h?l=623&ct=xref_jump_to_def&gsn=CIP_DisallowedOnce&rcl=186553626>;
>
> I am not an expert in C++ compiler. I just wonder why it can't
> allocates a temporary region for temporary variables? Having this
> special case means I need to model all T's constructors, which is
> really annoying. Does anyone knows some workarounds? Thanks.
>
>
> ----- Original Message -----
> From: Artem Dergachev <[hidden email]>
> To: Aleksei Sidorin <[hidden email]>, Li Kan
> <[hidden email]>, [hidden email]
> Subject: Re: [cfe-dev] Static analyzer question on how to tell if a
> Decl is a particular c++ function, or a particular class
> Date: 2018-02-16 02:26
>
>
> Yeah, the traditional way to see if this is a certain method of a
> certain class is to compare the name of the method
> (CXXMethodDecl->getName()) and the class
> (CXXMethodDecl->getParent()->getName()), as strings, to the desired
> strings. getQualifiedNameAsString() works fine, but it's considered slow
> for whatever reason (i've never observed it personally). ASTMatchers
> would also work, even if a bit of an overkill. Don't forget to check the
> method's argument number and types as well. We have an effort to make
> this whole common idiom more simple via CallDescription, but it doesn't
> work for C++ methods yet.
> With C++, you'll need to be aware of the bug we're currently having with
> object lifetime extension. If you're identifying the object by its
> memory region (which is correct), you'd occasionally notice that the
> region for a C++ temporary object may change along the analysis even if
> no copy-construction occurs. It has been making our new C++ checkers
> (most noticeably, the iterator checker) more difficult to write than
> they needed to be. More details in
> http://lists.llvm.org/pipermail/cfe-dev/2018-February/056898.html - i'm
> hoping to do something about it soon.
> On 15/02/2018 1:35 AM, Aleksei Sidorin via cfe-dev wrote:
> > Hello Li,
> >
> > What you're doing is mostly fine. I can only add some potentially
> > useful info inline.
> >
> > 14.02.2018 23:04, Li Kan via cfe-dev пишет:
> >> Hi folks,
> >>
> >> If this is not the correct mailing list for this question, please let
> >> me know.
> > You're in the right place. Welcome :)
> >
> >> I am trying to write a static analyzer for code base of my current
> >> job. The particular static analyzer I am writing involves a
> >> particular class of my company's code base. Let's say it is T. I want
> >> to write a checker to ensure T.ok() is called before T.value() is
> called.
> >>
> >> Static analyzer is perfect for this type of check as it requires path
> >> sensitive checks. When I trying to write the checker, I basically
> >> checks for pre-call and post-call. I want to tell if a CallEvent is
> >> T.ok() and T.value(). Currently what I am doing is:
> >> 1. From CallEvent, I cast to CXXMemberCall, and getOriginExpr(), then
> >> call getMethodDecl().
> > You can also try to do the following chain:
> > dyn_cast_or_null<CXXMethodDecl>(CallEvent.getDecl()) to obtain
> > CXXMethodDecl.
> >
> >> 2. From CXXMethodDecl, I first call getThisType(), then call
> >> getAsCXXRecordDecl(), then getQualifiedNameAsString(), and compare
> >> with qualified name of T, to make sure it is member of T.
> > To obtain parent class declaration from CXXMethodDecl, you can use
> > getParent() method. So, the pseudocode will look like
> > "MethodDecl->getParent()->getQualifiedNameAsString()".
> >
> >> 3. From CXXMethodDecl, I call getNameAsString() to get the method
> >> name, and compare them with "ok" and "value".
> > This looks OK. But it can be useful to know that
> > getQualifiedNameAsString() for CXXMethodDecl will also include parent
> > name. So, you can just check if MethodDecl->getQualifiedNameAsString()
> > is equal to "T::value" or "T::ok".
> >
> >> It works, but it looks complicated, and involves string comparison,
> >> which I assume is slow. Is there a easier way, blessed by clang
> >> static analyzer official teams, that tell if a MethodDecl, or
> >> CXXRecordDecl, is the function or class I am interested in?
> >>
> >> The ideal way is, there is one-time call to lookup the MethodDecl for
> >> T::ok() and T::value(), and CXXRecordDecl for T. Then ever since
> >> then, I just need to compare the pointers.
> > This is possible. You can just launch a simple matcher against AST:
> > cxxMethodDecl(hasName("::T::value"))
> > and get CXXMethodDecl you need and compare with it later.
> >
> >> Another way is, the first time I found it is T, T::ok() or
> >> T::value(), I save the pointer of CXXRecordDecl or MethodDecl, and
> >> use the pointers later. Is this the reliable (assume I use the
> >> pointer canonical decl will not change) and blessed way to do it? If
> >> it is, is my steps above the correct and simplest way to determine it
> >> is T::ok or T::value? Is there some better and more reliable way?
> > I think it is reliable enough: pointers to canonical declarations
> > don't change in AST.
> >>
> >> Thanks.
> >>
> >>
> >> _______________________________________________
> >> cfe-dev mailing list
> >> [hidden email]
> >> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
> >
> >
> > --
> > Best regards,
> > Aleksei Sidorin,
> > SRR, Samsung Electronics
> >
> >
> > _______________________________________________
> > cfe-dev mailing list
> > [hidden email]
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Static analyzer question on how to tell if a Decl is a particular c++ function, or a particular class

Anastasia Stulova via cfe-dev
In reply to this post by Anastasia Stulova via cfe-dev


On 22/02/2018 5:19 PM, Artem Dergachev wrote:
> You're essentially in an area that's largely under construction at the
> moment, and i'm likely to address your issue in like nearest future,
> you can probably follow my progress on phabricator or in cfe-commits,
> or i can try to ping you when this case is supported.

It's https://reviews.llvm.org/D44124 and https://reviews.llvm.org/D44131 
- after that return values should work. The former is about functions
that can be "inlined" during analysis (i.e. we see what object is being
returned), the latter is about functions that aren't inlined (but it
should make it easier to track them).

>
> Being source-based, the analyzer needs to model every possible
> language construct explicitly, and in C++ it means *a ton of stuff*.
> We could analyze, like, llvm IR instead, and it'd be much easier to
> implement correctly for us, but it'd be much harder to produce
> user-friendly warnings, because the user does think in terms of source
> code. And it'd be harder to make checkers for the same reason. Which
> is why "symbolic execution" we're doing here is to compilation like
> algebra to arithmetic: we need to do the same thing - support all
> language constructs - but with a lot of "concrete" values replaced
> with algebraic unknown values. Luckily, it is also easier because we
> don't have to support everything perfectly: we can always say that we
> don't know something, behave "conservatively", and the analysis would
> be slightly less precise but this is fine if there's not too much of
> these.
>
> Modeling temporaries is one of such cases - this is one of the
> difficult parts of the language that we never did properly until now.
> One does not simply construct into a temporary region because in this
> case we're not even sure that it is indeed a temporary region in this
> part of the analyzer, it cannot be understood by looking at the
> construct-expression in the AST, which is something i'm working around
> now by introducing "construction contexts". And even if we were sure
> it's a temporary, we need to not only construct the object, but also
> know how to properly extend its lifetime and destroy it when lifetime
> ends, otherwise we'd encounter massive false positives (because it'd
> not be "we don't know something", it'd be "we know something but it's
> wrong"), which should also be part of the "context" in which
> construction takes place. So for now this part simply behaves
> conservatively.
>
> Your particular case will be relatively tricky when f() is inlined
> during analysis. In this case the temporary will be constructed within
> the function, but in order to properly work with this temporary we'd
> need what the caller function is going to do with it - at least
> according to the AST that we currently have this is what we have to
> do. More info in
> http://lists.llvm.org/pipermail/cfe-dev/2018-February/056898.html -
> "== Return by value ==" part. CodeGen does it differently, treating a
> lot of AST rvalues as lvalues under the hood, but i don't think we can
> afford breaking this invariant for now, because, well, we're more
> complicated and fragile.
>
> On 22/02/2018 4:46 PM, [hidden email] wrote:
>> Thanks Artem, Aleksei for those precious information.
>>
>> Yes, I observed exactly the same thing about MaterializeTemporaryExpr
>> and I actually spent quite sometime trace into the clang's codebase
>> and reached to the exactly same place in the link:
>> createTemporaryRegionIfNeeded. My workaround was inspired by one of
>> the example checkers, that uses a checkPostStmt on
>> MaterializeTemporaryExpr. Thanks for pointing out this.
>>
>> Aleksei, thanks for pointing out the getQualifiedNameAsString, I
>> didn't know I could use that  . I personally prefers it because
>> getParent()->getName() doesn't contain the namespace, but only the
>> class name. So to make it safe, I need to go all the way up to
>> traverse all parent decls, and getQualifiedNameAsString already does
>> the thing for me.
>>
>> Thanks for taking your time to answer my question. I appreciate it.
>>
>> Now I observed some other weird thing, and I wonder if anyone has
>> seen this before and knows any solution.
>>
>> As a reminder, the checker I am writing is for a particular class T
>> in my company's code base and the checker is to ensure T.ok() == true
>> before calling T.value().
>>
>> The above is actually a over-simplification to my task. The class T
>> actually has a member, Let's say s_ (in type S) to keep track of the
>> status, so T.ok() actually returns s_.ok(). And type S is used
>> directly by a few other classes and macros which are used to creates
>> T. So to make the check fully working, I also need to model the
>> behaviors of S and a few other classes.
>>
>> Fortunately, T are template class and therefore all method
>> implementations are visible to current TU. I was thinking, it is a
>> whole lot easier if I can just model the behavior of S, and rely on
>> inline evaluation in the static analyzer. In most cases, it works,
>> except I found the following corner case (suppose T has a conversion
>> constructor from X: T(X&&) and default move constructor T(T&&)):
>>
>> X f();
>> T t = f();
>>
>> This is essentially constructs a temporary T t2{f()} and then doing a
>> move T t{std::move(t2)}. It looks fine if everything is inlined,
>> however, the inline doesn't work because t2 is a temporary. I dived
>> into the code base and what happened is
>>
>> 1. At the end of ExprEngine::getRegionForConstructedObject
>> <https://clang.llvm.org/doxygen/ExprEngineCXX_8cpp_source.html#%2100183>,
>> it is marked as improperly modeled target region:
>> CallOpts.IsCtorOrDtorWithImproperlyModeledTargetRegion
>> <https://clang.llvm.org/doxygen/structclang_1_1ento_1_1ExprEngine_1_1EvalCallOptions.html#a339aa5a5b9aecfa8a844647b8d211eb0>=
>> true;
>> 2. In ExprEngine::mayInlineCallKind
>> <https://clang.llvm.org/doxygen/ExprEngineCallAndReturn_8cpp_source.html#%2100677>,
>> it returns false if the flag is set:
>> // If we did not find the correct this-region, it would be pointless
>> // to inline the constructor. Instead we will simply invalidate //
>> the fake temporary target. if (CallOpts
>> <https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm/tools/clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp?l=618&ct=xref_jump_to_def&gsn=CallOpts&rcl=186553626>.IsCtorOrDtorWithImproperlyModeledTargetRegion
>> <https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm/tools/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h?l=62&ct=xref_jump_to_def&gsn=IsCtorOrDtorWithImproperlyModeledTargetRegion&rcl=186553626>)
>> return CIP_DisallowedOnce
>> <https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm/tools/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h?l=623&ct=xref_jump_to_def&gsn=CIP_DisallowedOnce&rcl=186553626>;
>>
>> I am not an expert in C++ compiler. I just wonder why it can't
>> allocates a temporary region for temporary variables? Having this
>> special case means I need to model all T's constructors, which is
>> really annoying. Does anyone knows some workarounds? Thanks.
>>
>>
>> ----- Original Message -----
>> From: Artem Dergachev <[hidden email]>
>> To: Aleksei Sidorin <[hidden email]>, Li Kan
>> <[hidden email]>, [hidden email]
>> Subject: Re: [cfe-dev] Static analyzer question on how to tell if a
>> Decl is a particular c++ function, or a particular class
>> Date: 2018-02-16 02:26
>>
>>
>> Yeah, the traditional way to see if this is a certain method of a
>> certain class is to compare the name of the method
>> (CXXMethodDecl->getName()) and the class
>> (CXXMethodDecl->getParent()->getName()), as strings, to the desired
>> strings. getQualifiedNameAsString() works fine, but it's considered slow
>> for whatever reason (i've never observed it personally). ASTMatchers
>> would also work, even if a bit of an overkill. Don't forget to check the
>> method's argument number and types as well. We have an effort to make
>> this whole common idiom more simple via CallDescription, but it doesn't
>> work for C++ methods yet.
>> With C++, you'll need to be aware of the bug we're currently having with
>> object lifetime extension. If you're identifying the object by its
>> memory region (which is correct), you'd occasionally notice that the
>> region for a C++ temporary object may change along the analysis even if
>> no copy-construction occurs. It has been making our new C++ checkers
>> (most noticeably, the iterator checker) more difficult to write than
>> they needed to be. More details in
>> http://lists.llvm.org/pipermail/cfe-dev/2018-February/056898.html - i'm
>> hoping to do something about it soon.
>> On 15/02/2018 1:35 AM, Aleksei Sidorin via cfe-dev wrote:
>> > Hello Li,
>> >
>> > What you're doing is mostly fine. I can only add some potentially
>> > useful info inline.
>> >
>> > 14.02.2018 23:04, Li Kan via cfe-dev пишет:
>> >> Hi folks,
>> >>
>> >> If this is not the correct mailing list for this question, please let
>> >> me know.
>> > You're in the right place. Welcome :)
>> >
>> >> I am trying to write a static analyzer for code base of my current
>> >> job. The particular static analyzer I am writing involves a
>> >> particular class of my company's code base. Let's say it is T. I want
>> >> to write a checker to ensure T.ok() is called before T.value() is
>> called.
>> >>
>> >> Static analyzer is perfect for this type of check as it requires path
>> >> sensitive checks. When I trying to write the checker, I basically
>> >> checks for pre-call and post-call. I want to tell if a CallEvent is
>> >> T.ok() and T.value(). Currently what I am doing is:
>> >> 1. From CallEvent, I cast to CXXMemberCall, and getOriginExpr(), then
>> >> call getMethodDecl().
>> > You can also try to do the following chain:
>> > dyn_cast_or_null<CXXMethodDecl>(CallEvent.getDecl()) to obtain
>> > CXXMethodDecl.
>> >
>> >> 2. From CXXMethodDecl, I first call getThisType(), then call
>> >> getAsCXXRecordDecl(), then getQualifiedNameAsString(), and compare
>> >> with qualified name of T, to make sure it is member of T.
>> > To obtain parent class declaration from CXXMethodDecl, you can use
>> > getParent() method. So, the pseudocode will look like
>> > "MethodDecl->getParent()->getQualifiedNameAsString()".
>> >
>> >> 3. From CXXMethodDecl, I call getNameAsString() to get the method
>> >> name, and compare them with "ok" and "value".
>> > This looks OK. But it can be useful to know that
>> > getQualifiedNameAsString() for CXXMethodDecl will also include parent
>> > name. So, you can just check if MethodDecl->getQualifiedNameAsString()
>> > is equal to "T::value" or "T::ok".
>> >
>> >> It works, but it looks complicated, and involves string comparison,
>> >> which I assume is slow. Is there a easier way, blessed by clang
>> >> static analyzer official teams, that tell if a MethodDecl, or
>> >> CXXRecordDecl, is the function or class I am interested in?
>> >>
>> >> The ideal way is, there is one-time call to lookup the MethodDecl for
>> >> T::ok() and T::value(), and CXXRecordDecl for T. Then ever since
>> >> then, I just need to compare the pointers.
>> > This is possible. You can just launch a simple matcher against AST:
>> > cxxMethodDecl(hasName("::T::value"))
>> > and get CXXMethodDecl you need and compare with it later.
>> >
>> >> Another way is, the first time I found it is T, T::ok() or
>> >> T::value(), I save the pointer of CXXRecordDecl or MethodDecl, and
>> >> use the pointers later. Is this the reliable (assume I use the
>> >> pointer canonical decl will not change) and blessed way to do it? If
>> >> it is, is my steps above the correct and simplest way to determine it
>> >> is T::ok or T::value? Is there some better and more reliable way?
>> > I think it is reliable enough: pointers to canonical declarations
>> > don't change in AST.
>> >>
>> >> Thanks.
>> >>
>> >>
>> >> _______________________________________________
>> >> cfe-dev mailing list
>> >> [hidden email]
>> >> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>> >
>> >
>> > --
>> > Best regards,
>> > Aleksei Sidorin,
>> > SRR, Samsung Electronics
>> >
>> >
>> > _______________________________________________
>> > cfe-dev mailing list
>> > [hidden email]
>> > http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev