Weird, I was super sure I answered that back then. Probably misclicked
something :/ Sorries.
Normally you shouldn't be using explicit_regions at all. They are
released at some point - which is not how most checkers work.
other arguments or globals. In either case the object's state is not
known anymore.
object through a const pointer.
pointer arithmetic: (this - offsetof(T, s) + offsetof(T, d)) and such.
definitely not many). I think you should be able to partially suppress
region in the explicit_regions list (and see what happens). We've been
extra escaping causes more false negatives than false positives.
> Thanks Artem for clarification. It is very clear and now I understand the situation. Sorry to bother again, but now I have yet another question.
>
> I am now trying to make the checker work for pointer escape. Since I track the status of objects using MemRegion, so I am using checkRegionChanges. I've observed something weird.
>
> Suppose I have free functions whose definitions are not seen by current TU: void f(const S*); void g(S*);
> 1. Suppose I have
> S s;
> f(&s);
> Then in checkRegionChanges, s's region will appear in explicit_regions but not regions.
> 2. Suppose I have
> S s;
> g(&s);
> Then in checkRegionChanges, t's region will appear in both explicit_regions and regions.
>
> Based on the above observations, if I want to remove s from GDM in case 2 but not 1, then I need to iterate over all entries in GDM and remove the ones that are sub-regions of any one in `region` variable, not `explicit_regions` variable. However, this doesn't work for the following case
> 3. If I have struct T { S s; D d; }; and U has some non-const member void D::h(); for the following call:
> T t;
> t.d.h();
> Then in checkRegionChanges, t.d is in explicit_regions and t is in regions. So t.s will be removed from GDM if I check 'regions' variable, which is incorrect.
>
> My question is, is case 1 a bug or a feature that s appears in explicit_regions? If it is by design, how what is the blessed way to distinguish the above cases? Thanks.
>
> BTW: the code that causes the behavior is
>
https://clang.llvm.org/doxygen/RegionStore_8cpp_source.html#l01256> if (const MemRegion *R = V.getAsRegion()) {
> if (TopLevelRegions)
> TopLevelRegions->push_back(R);
> W.AddToWorkList(R);
> continue;
> }
> All regions are added to TopLevelRegions, without checking RegionAndSymbolInvalidationTraits::TK_PreserveContents.
>
> ----- Original Message -----
> From: Artem Dergachev <
[hidden email]>
> To:
[hidden email], Aleksei Sidorin <
[hidden email]>, cfe-dev <
[hidden email]>
> Subject: Re: [cfe-dev] Static analyzer question on how to tell if a Decl is a particular c++ function, or a particular class
> Date: 2018-02-23 09:20
>
>
> You're essentially in an area that's largely under construction at the
> moment, and i'm likely to address your issue in like nearest future, you
> can probably follow my progress on phabricator or in cfe-commits, or i
> can try to ping you when this case is supported.
> Being source-based, the analyzer needs to model every possible language
> construct explicitly, and in C++ it means *a ton of stuff*. We could
> analyze, like, llvm IR instead, and it'd be much easier to implement
> correctly for us, but it'd be much harder to produce user-friendly
> warnings, because the user does think in terms of source code. And it'd
> be harder to make checkers for the same reason. Which is why "symbolic
> execution" we're doing here is to compilation like algebra to
> arithmetic: we need to do the same thing - support all language
> constructs - but with a lot of "concrete" values replaced with algebraic
> unknown values. Luckily, it is also easier because we don't have to
> support everything perfectly: we can always say that we don't know
> something, behave "conservatively", and the analysis would be slightly
> less precise but this is fine if there's not too much of these.
> Modeling temporaries is one of such cases - this is one of the difficult
> parts of the language that we never did properly until now. One does not
> simply construct into a temporary region because in this case we're not
> even sure that it is indeed a temporary region in this part of the
> analyzer, it cannot be understood by looking at the construct-expression
> in the AST, which is something i'm working around now by introducing
> "construction contexts". And even if we were sure it's a temporary, we
> need to not only construct the object, but also know how to properly
> extend its lifetime and destroy it when lifetime ends, otherwise we'd
> encounter massive false positives (because it'd not be "we don't know
> something", it'd be "we know something but it's wrong"), which should
> also be part of the "context" in which construction takes place. So for
> now this part simply behaves conservatively.
> Your particular case will be relatively tricky when f() is inlined
> during analysis. In this case the temporary will be constructed within
> the function, but in order to properly work with this temporary we'd
> need what the caller function is going to do with it - at least
> according to the AST that we currently have this is what we have to do.
> More info in
>
http://lists.llvm.org/pipermail/cfe-dev/2018-February/056898.html - "==
> Return by value ==" part. CodeGen does it differently, treating a lot of
> AST rvalues as lvalues under the hood, but i don't think we can afford
> breaking this invariant for now, because, well, we're more complicated
> and fragile.
> On 22/02/2018 4:46 PM,
[hidden email] wrote:
>> Thanks Artem, Aleksei for those precious information.
>>
>> Yes, I observed exactly the same thing about MaterializeTemporaryExpr
>> and I actually spent quite sometime trace into the clang's codebase
>> and reached to the exactly same place in the link:
>> createTemporaryRegionIfNeeded. My workaround was inspired by one of
>> the example checkers, that uses a checkPostStmt on
>> MaterializeTemporaryExpr. Thanks for pointing out this.
>>
>> Aleksei, thanks for pointing out the getQualifiedNameAsString, I
>> didn't know I could use that . I personally prefers it because
>> getParent()->getName() doesn't contain the namespace, but only the
>> class name. So to make it safe, I need to go all the way up to
>> traverse all parent decls, and getQualifiedNameAsString already does
>> the thing for me.
>>
>> Thanks for taking your time to answer my question. I appreciate it.
>>
>> Now I observed some other weird thing, and I wonder if anyone has seen
>> this before and knows any solution.
>>
>> As a reminder, the checker I am writing is for a particular class T in
>> my company's code base and the checker is to ensure T.ok() == true
>> before calling T.value().
>>
>> The above is actually a over-simplification to my task. The class T
>> actually has a member, Let's say s_ (in type S) to keep track of the
>> status, so T.ok() actually returns s_.ok(). And type S is used
>> directly by a few other classes and macros which are used to creates
>> T. So to make the check fully working, I also need to model the
>> behaviors of S and a few other classes.
>>
>> Fortunately, T are template class and therefore all method
>> implementations are visible to current TU. I was thinking, it is a
>> whole lot easier if I can just model the behavior of S, and rely on
>> inline evaluation in the static analyzer. In most cases, it works,
>> except I found the following corner case (suppose T has a conversion
>> constructor from X: T(X&&) and default move constructor T(T&&)):
>>
>> X f();
>> T t = f();
>>
>> This is essentially constructs a temporary T t2{f()} and then doing a
>> move T t{std::move(t2)}. It looks fine if everything is inlined,
>> however, the inline doesn't work because t2 is a temporary. I dived
>> into the code base and what happened is
>>
>> 1. At the end of ExprEngine::getRegionForConstructedObject
>> <
https://clang.llvm.org/doxygen/ExprEngineCXX_8cpp_source.html#%2100183>,
>> it is marked as improperly modeled target region:
>> CallOpts.IsCtorOrDtorWithImproperlyModeledTargetRegion
>> <
https://clang.llvm.org/doxygen/structclang_1_1ento_1_1ExprEngine_1_1EvalCallOptions.html#a339aa5a5b9aecfa8a844647b8d211eb0>=
>> true;
>> 2. In ExprEngine::mayInlineCallKind
>> <
https://clang.llvm.org/doxygen/ExprEngineCallAndReturn_8cpp_source.html#%2100677>,
>> it returns false if the flag is set:
>> // If we did not find the correct this-region, it would be pointless
>> // to inline the constructor. Instead we will simply invalidate // the
>> fake temporary target. if (CallOpts
>> <
https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm/tools/clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp?l=618&ct=xref_jump_to_def&gsn=CallOpts&rcl=186553626>.IsCtorOrDtorWithImproperlyModeledTargetRegion
>> <
https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm/tools/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h?l=62&ct=xref_jump_to_def&gsn=IsCtorOrDtorWithImproperlyModeledTargetRegion&rcl=186553626>)
>> return CIP_DisallowedOnce
>> <
https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm/tools/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h?l=623&ct=xref_jump_to_def&gsn=CIP_DisallowedOnce&rcl=186553626>;
>>
>> I am not an expert in C++ compiler. I just wonder why it can't
>> allocates a temporary region for temporary variables? Having this
>> special case means I need to model all T's constructors, which is
>> really annoying. Does anyone knows some workarounds? Thanks.
>>
>>
>> ----- Original Message -----
>> From: Artem Dergachev <
[hidden email]>
>> To: Aleksei Sidorin <
[hidden email]>, Li Kan
>> <
[hidden email]>,
[hidden email]
>> Subject: Re: [cfe-dev] Static analyzer question on how to tell if a
>> Decl is a particular c++ function, or a particular class
>> Date: 2018-02-16 02:26
>>
>>
>> Yeah, the traditional way to see if this is a certain method of a
>> certain class is to compare the name of the method
>> (CXXMethodDecl->getName()) and the class
>> (CXXMethodDecl->getParent()->getName()), as strings, to the desired
>> strings. getQualifiedNameAsString() works fine, but it's considered slow
>> for whatever reason (i've never observed it personally). ASTMatchers
>> would also work, even if a bit of an overkill. Don't forget to check the
>> method's argument number and types as well. We have an effort to make
>> this whole common idiom more simple via CallDescription, but it doesn't
>> work for C++ methods yet.
>> With C++, you'll need to be aware of the bug we're currently having with
>> object lifetime extension. If you're identifying the object by its
>> memory region (which is correct), you'd occasionally notice that the
>> region for a C++ temporary object may change along the analysis even if
>> no copy-construction occurs. It has been making our new C++ checkers
>> (most noticeably, the iterator checker) more difficult to write than
>> they needed to be. More details in
>>
http://lists.llvm.org/pipermail/cfe-dev/2018-February/056898.html - i'm
>> hoping to do something about it soon.
>> On 15/02/2018 1:35 AM, Aleksei Sidorin via cfe-dev wrote:
>>> Hello Li,
>>>
>>> What you're doing is mostly fine. I can only add some potentially
>>> useful info inline.
>>>
>>> 14.02.2018 23:04, Li Kan via cfe-dev пишет:
>>>> Hi folks,
>>>>
>>>> If this is not the correct mailing list for this question, please let
>>>> me know.
>>> You're in the right place. Welcome :)
>>>
>>>> I am trying to write a static analyzer for code base of my current
>>>> job. The particular static analyzer I am writing involves a
>>>> particular class of my company's code base. Let's say it is T. I want
>>>> to write a checker to ensure T.ok() is called before T.value() is
>> called.
>>>> Static analyzer is perfect for this type of check as it requires path
>>>> sensitive checks. When I trying to write the checker, I basically
>>>> checks for pre-call and post-call. I want to tell if a CallEvent is
>>>> T.ok() and T.value(). Currently what I am doing is:
>>>> 1. From CallEvent, I cast to CXXMemberCall, and getOriginExpr(), then
>>>> call getMethodDecl().
>>> You can also try to do the following chain:
>>> dyn_cast_or_null<CXXMethodDecl>(CallEvent.getDecl()) to obtain
>>> CXXMethodDecl.
>>>
>>>> 2. From CXXMethodDecl, I first call getThisType(), then call
>>>> getAsCXXRecordDecl(), then getQualifiedNameAsString(), and compare
>>>> with qualified name of T, to make sure it is member of T.
>>> To obtain parent class declaration from CXXMethodDecl, you can use
>>> getParent() method. So, the pseudocode will look like
>>> "MethodDecl->getParent()->getQualifiedNameAsString()".
>>>
>>>> 3. From CXXMethodDecl, I call getNameAsString() to get the method
>>>> name, and compare them with "ok" and "value".
>>> This looks OK. But it can be useful to know that
>>> getQualifiedNameAsString() for CXXMethodDecl will also include parent
>>> name. So, you can just check if MethodDecl->getQualifiedNameAsString()
>>> is equal to "T::value" or "T::ok".
>>>
>>>> It works, but it looks complicated, and involves string comparison,
>>>> which I assume is slow. Is there a easier way, blessed by clang
>>>> static analyzer official teams, that tell if a MethodDecl, or
>>>> CXXRecordDecl, is the function or class I am interested in?
>>>>
>>>> The ideal way is, there is one-time call to lookup the MethodDecl for
>>>> T::ok() and T::value(), and CXXRecordDecl for T. Then ever since
>>>> then, I just need to compare the pointers.
>>> This is possible. You can just launch a simple matcher against AST:
>>> cxxMethodDecl(hasName("::T::value"))
>>> and get CXXMethodDecl you need and compare with it later.
>>>
>>>> Another way is, the first time I found it is T, T::ok() or
>>>> T::value(), I save the pointer of CXXRecordDecl or MethodDecl, and
>>>> use the pointers later. Is this the reliable (assume I use the
>>>> pointer canonical decl will not change) and blessed way to do it? If
>>>> it is, is my steps above the correct and simplest way to determine it
>>>> is T::ok or T::value? Is there some better and more reliable way?
>>> I think it is reliable enough: pointers to canonical declarations
>>> don't change in AST.
>>>> Thanks.
>>>>
>>>>
>>>> _______________________________________________
>>>> cfe-dev mailing list
>>>>
[hidden email]
>>>>
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>>> --
>>> Best regards,
>>> Aleksei Sidorin,
>>> SRR, Samsung Electronics
>>>
>>>
>>> _______________________________________________
>>> cfe-dev mailing list
>>>
[hidden email]
>>>
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev