[analyzer] MallocChecker & checker dependencies

classic Classic list List threaded Threaded
12 messages Options
Reply | Threaded
Open this post in threaded view
|

[analyzer] MallocChecker & checker dependencies

Louis Dionne via cfe-dev
Hi there!

While thinking about how it would be possible to implement various smart ptr related checkers I tried to review current state of MallocChecker and came up with that it would be great to have RegionState info available to other checkers. Could you please share your points of view and comments on the following statements / questions:

1. Is there any right way for chaining checkers? How they are expected to communicate between each other (excluding generation of nodes / ProgramStates). I've heard that there are couple of problems caused by inlining functions, constructors / descructors.
2. What do you think about moving RegionState to the Core of CSA or providing optional extended info in MemRegion about the source of such region (new opearator / array new, malloc, alloca, etc). So it would be available to all checkers.
3. Is there any roadmap for CSA and especially for dynamic memory management modeling & related checkers?

Regards, Alexey K

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [analyzer] MallocChecker & checker dependencies

Louis Dionne via cfe-dev
Hi Alexey!

Could you share your ideas of checker design? It is possible that problems you met can be solved in different (maybe even better) way if we know the whole picture.
Regarding your questions, you can find some answers below.

1. a) The first way for checker communication is to share their program state trait. You can see an example of how state trait is exported in GenericTaintChecker or MPIChecker. Generally, you just create a named ProgramStateTrait in the header. You can take a look at TaintManager.h and MPITypes.h and how they are used.
b) To set a dependency from another checker, you can just register it while registering your checker. An example can be found in MallocChecker where register$Checker also calls registerCStringCheckerBasic to register a checker it depends on.
As you pointed, inter-checker communication can become a source of some problems. Most of them are discussed in this conversation: http://clang-developers.42468.n3.nabble.com/analyzer-RFC-Design-idea-separate-modelling-from-checking-td4059122.html

2. I think there is nothing bad in sharing RegionState across checkers in the way shown in 1a.

3. Artem Dergachev has done some excellent work on improvement of operator 'new' processing in CSA engine. Regarding checkers, I can see some on https://clang-analyzer.llvm.org/potential_checkers.html: for example, undefbehavior.AutoptrsOwnSameObj. You can search this list to find more.


15.03.2018 23:54, Alexey Knyshev via cfe-dev пишет:
Hi there!

While thinking about how it would be possible to implement various smart ptr related checkers I tried to review current state of MallocChecker and came up with that it would be great to have RegionState info available to other checkers. Could you please share your points of view and comments on the following statements / questions:

1. Is there any right way for chaining checkers? How they are expected to communicate between each other (excluding generation of nodes / ProgramStates). I've heard that there are couple of problems caused by inlining functions, constructors / descructors.
2. What do you think about moving RegionState to the Core of CSA or providing optional extended info in MemRegion about the source of such region (new opearator / array new, malloc, alloca, etc). So it would be available to all checkers.
3. Is there any roadmap for CSA and especially for dynamic memory management modeling & related checkers?

Regards, Alexey K


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev


-- 
Best regards,
Aleksei Sidorin,
SRR, Samsung Electronics

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [analyzer] MallocChecker & checker dependencies

Louis Dionne via cfe-dev
Dynamic memory management is pretty much done at this point - we have
very good support for malloc()-like stuff and reasonably good support
for operator new() (support for operator new[]() is currently missing
because it requires calling an indefinite number of constructors which
is not yet implemented). There is also a known issue with custom
operator new(...) returning null pointers - in this case we should not
be calling the constructor but for now we evaluate the constructor
conservatively.

Your real problem will be managing smart pointers themselves because
they are C++ objects that have a myriad of methods, they can be copied,
moved, assigned, move-assigned, they destroyed, they lifetime-extended,
you need to keep all of this in mind when writing a checker. This is
slowly getting easier because i'm currently working on that, but until
recently it wasn't working correctly in the core, let alone checkers,
which is why the few experimental C++ checkers that we currently have
required heavy hacks to become possible. But for now you'd rather keep
an eye on these problems.

On 16/03/2018 3:57 AM, Aleksei Sidorin via cfe-dev wrote:

> Hi Alexey!
>
> Could you share your ideas of checker design? It is possible that
> problems you met can be solved in different (maybe even better) way if
> we know the whole picture.
> Regarding your questions, you can find some answers below.
>
> 1. a) The first way for checker communication is to share their
> program state trait. You can see an example of how state trait is
> exported in GenericTaintChecker or MPIChecker. Generally, you just
> create a named ProgramStateTrait in the header. You can take a look at
> TaintManager.h and MPITypes.h and how they are used.
> b) To set a dependency from another checker, you can just register it
> while registering your checker. An example can be found in
> MallocChecker where register$Checker also calls
> registerCStringCheckerBasic to register a checker it depends on.
> As you pointed, inter-checker communication can become a source of
> some problems. Most of them are discussed in this conversation:
> http://clang-developers.42468.n3.nabble.com/analyzer-RFC-Design-idea-separate-modelling-from-checking-td4059122.html
>
> 2. I think there is nothing bad in sharing RegionState across checkers
> in the way shown in 1a.
>
> 3. Artem Dergachev has done some excellent work on improvement of
> operator 'new' processing in CSA engine. Regarding checkers, I can see
> some on https://clang-analyzer.llvm.org/potential_checkers.html: for
> example, undefbehavior.AutoptrsOwnSameObj. You can search this list to
> find more.
>
>
> 15.03.2018 23:54, Alexey Knyshev via cfe-dev пишет:
>> Hi there!
>>
>> While thinking about how it would be possible to implement various
>> smart ptr related checkers I tried to review current state of
>> MallocChecker and came up with that it would be great to have
>> RegionState info available to other checkers. Could you please share
>> your points of view and comments on the following statements / questions:
>>
>> 1. Is there any right way for chaining checkers? How they are
>> expected to communicate between each other (excluding generation of
>> nodes / ProgramStates). I've heard that there are couple of problems
>> caused by inlining functions, constructors / descructors.
>> 2. What do you think about moving RegionState to the Core of CSA or
>> providing optional extended info in MemRegion about the source of
>> such region (new opearator / array new, malloc, alloca, etc). So it
>> would be available to all checkers.
>> 3. Is there any roadmap for CSA and especially for dynamic memory
>> management modeling & related checkers?
>>
>> Regards, Alexey K
>>
>> --
>> linkedin.com/profile
>> <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw>
>>
>> github.com/alexeyknyshev <http://github.com/alexeyknyshev>
>> bitbucket.org/alexeyknyshev <https://bitbucket.org/alexeyknyshev/>
>>
>>
>> _______________________________________________
>> cfe-dev mailing list
>> [hidden email]
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>
>
> --
> Best regards,
> Aleksei Sidorin,
> SRR, Samsung Electronics
>
>
> _______________________________________________
> cfe-dev mailing list
> [hidden email]
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [analyzer] MallocChecker & checker dependencies

Louis Dionne via cfe-dev
Hello guys,

And thanks for your attention

Aleksei,

Could you share your ideas of checker design? It is possible that problems you met can be solved in different (maybe even better) way if we know the whole picture.

As I said, I'm interested in improving current state of dynamic memory modeling. Especially stuff related to smart pointers (e.g. shared, unique) which also mentioned in list of potential checkers as smartptr.SmartPtrInit

You can see an example of how state trait is exported in GenericTaintChecker or MPIChecker. Generally, you just create a named ProgramStateTrait in the header.

Briefly looked at MPITypes.h. Does it mean we should move RegionState to separate file and register it via Traits in the same manner to make it avaliable from other checkers (not other TU as mentioned in ento::mpi::RequestMap)?

Artem,

you need to keep all of this in mind when writing a checker.

Sure, but on the other hand I think it's possible to implement and improve modeling of various API calls' effects step-by-step. Let's say, it case of SmartPtrInit checker mentioned before the bare minimum would be modeling of construction (and destruction to avoid leak report from existing related checkers).

which is why the few experimental C++ checkers that we currently have required heavy hacks to become possible. But for now you'd rather keep an eye on these problems.

Does it mean it's better to wait a bit for Core improvements from main contributors? Does it make sense to make an efforts to implement SmartPtrItnit checker in current state of things?

Thanks in advance and regards,
Alexey K

2018-03-16 21:47 GMT+03:00 Artem Dergachev <[hidden email]>:
Dynamic memory management is pretty much done at this point - we have very good support for malloc()-like stuff and reasonably good support for operator new() (support for operator new[]() is currently missing because it requires calling an indefinite number of constructors which is not yet implemented). There is also a known issue with custom operator new(...) returning null pointers - in this case we should not be calling the constructor but for now we evaluate the constructor conservatively.

Your real problem will be managing smart pointers themselves because they are C++ objects that have a myriad of methods, they can be copied, moved, assigned, move-assigned, they destroyed, they lifetime-extended, you need to keep all of this in mind when writing a checker. This is slowly getting easier because i'm currently working on that, but until recently it wasn't working correctly in the core, let alone checkers, which is why the few experimental C++ checkers that we currently have required heavy hacks to become possible. But for now you'd rather keep an eye on these problems.

On 16/03/2018 3:57 AM, Aleksei Sidorin via cfe-dev wrote:
Hi Alexey!

Could you share your ideas of checker design? It is possible that problems you met can be solved in different (maybe even better) way if we know the whole picture.
Regarding your questions, you can find some answers below.

1. a) The first way for checker communication is to share their program state trait. You can see an example of how state trait is exported in GenericTaintChecker or MPIChecker. Generally, you just create a named ProgramStateTrait in the header. You can take a look at TaintManager.h and MPITypes.h and how they are used.
b) To set a dependency from another checker, you can just register it while registering your checker. An example can be found in MallocChecker where register$Checker also calls registerCStringCheckerBasic to register a checker it depends on.
As you pointed, inter-checker communication can become a source of some problems. Most of them are discussed in this conversation: http://clang-developers.42468.n3.nabble.com/analyzer-RFC-Design-idea-separate-modelling-from-checking-td4059122.html

2. I think there is nothing bad in sharing RegionState across checkers in the way shown in 1a.

3. Artem Dergachev has done some excellent work on improvement of operator 'new' processing in CSA engine. Regarding checkers, I can see some on https://clang-analyzer.llvm.org/potential_checkers.html: for example, undefbehavior.AutoptrsOwnSameObj. You can search this list to find more.


15.03.2018 23:54, Alexey Knyshev via cfe-dev пишет:
Hi there!

While thinking about how it would be possible to implement various smart ptr related checkers I tried to review current state of MallocChecker and came up with that it would be great to have RegionState info available to other checkers. Could you please share your points of view and comments on the following statements / questions:

1. Is there any right way for chaining checkers? How they are expected to communicate between each other (excluding generation of nodes / ProgramStates). I've heard that there are couple of problems caused by inlining functions, constructors / descructors.
2. What do you think about moving RegionState to the Core of CSA or providing optional extended info in MemRegion about the source of such region (new opearator / array new, malloc, alloca, etc). So it would be available to all checkers.
3. Is there any roadmap for CSA and especially for dynamic memory management modeling & related checkers?

Regards, Alexey K

--
linkedin.com/profile <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw>

github.com/alexeyknyshev <http://github.com/alexeyknyshev>
bitbucket.org/alexeyknyshev <https://bitbucket.org/alexeyknyshev/>


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev


--
Best regards,
Aleksei Sidorin,
SRR, Samsung Electronics


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev




--

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [analyzer] MallocChecker & checker dependencies

Louis Dionne via cfe-dev
Sorry, accidentally removed part of message before sending.

Aleksei,

Could you share your ideas of checker design? It is possible that problems you met can be solved in different (maybe even better) way if we know the whole picture.

I'm currently have no clear idea how to implement it in the "right manner". But first of all I would aim to track the origin (malloc, new, new [], etc) of SVal and check if Deleter is the expected corresponding way to deallocate such SVal.

Regards, Alexey K 

2018-03-16 22:22 GMT+03:00 Alexey Knyshev <[hidden email]>:
Hello guys,

And thanks for your attention

Aleksei,

Could you share your ideas of checker design? It is possible that problems you met can be solved in different (maybe even better) way if we know the whole picture.

As I said, I'm interested in improving current state of dynamic memory modeling. Especially stuff related to smart pointers (e.g. shared, unique) which also mentioned in list of potential checkers as smartptr.SmartPtrInit

You can see an example of how state trait is exported in GenericTaintChecker or MPIChecker. Generally, you just create a named ProgramStateTrait in the header.

Briefly looked at MPITypes.h. Does it mean we should move RegionState to separate file and register it via Traits in the same manner to make it avaliable from other checkers (not other TU as mentioned in ento::mpi::RequestMap)?

Artem,

you need to keep all of this in mind when writing a checker.

Sure, but on the other hand I think it's possible to implement and improve modeling of various API calls' effects step-by-step. Let's say, it case of SmartPtrInit checker mentioned before the bare minimum would be modeling of construction (and destruction to avoid leak report from existing related checkers).

which is why the few experimental C++ checkers that we currently have required heavy hacks to become possible. But for now you'd rather keep an eye on these problems.

Does it mean it's better to wait a bit for Core improvements from main contributors? Does it make sense to make an efforts to implement SmartPtrItnit checker in current state of things?

Thanks in advance and regards,
Alexey K

2018-03-16 21:47 GMT+03:00 Artem Dergachev <[hidden email]>:
Dynamic memory management is pretty much done at this point - we have very good support for malloc()-like stuff and reasonably good support for operator new() (support for operator new[]() is currently missing because it requires calling an indefinite number of constructors which is not yet implemented). There is also a known issue with custom operator new(...) returning null pointers - in this case we should not be calling the constructor but for now we evaluate the constructor conservatively.

Your real problem will be managing smart pointers themselves because they are C++ objects that have a myriad of methods, they can be copied, moved, assigned, move-assigned, they destroyed, they lifetime-extended, you need to keep all of this in mind when writing a checker. This is slowly getting easier because i'm currently working on that, but until recently it wasn't working correctly in the core, let alone checkers, which is why the few experimental C++ checkers that we currently have required heavy hacks to become possible. But for now you'd rather keep an eye on these problems.

On 16/03/2018 3:57 AM, Aleksei Sidorin via cfe-dev wrote:
Hi Alexey!

Could you share your ideas of checker design? It is possible that problems you met can be solved in different (maybe even better) way if we know the whole picture.
Regarding your questions, you can find some answers below.

1. a) The first way for checker communication is to share their program state trait. You can see an example of how state trait is exported in GenericTaintChecker or MPIChecker. Generally, you just create a named ProgramStateTrait in the header. You can take a look at TaintManager.h and MPITypes.h and how they are used.
b) To set a dependency from another checker, you can just register it while registering your checker. An example can be found in MallocChecker where register$Checker also calls registerCStringCheckerBasic to register a checker it depends on.
As you pointed, inter-checker communication can become a source of some problems. Most of them are discussed in this conversation: http://clang-developers.42468.n3.nabble.com/analyzer-RFC-Design-idea-separate-modelling-from-checking-td4059122.html

2. I think there is nothing bad in sharing RegionState across checkers in the way shown in 1a.

3. Artem Dergachev has done some excellent work on improvement of operator 'new' processing in CSA engine. Regarding checkers, I can see some on https://clang-analyzer.llvm.org/potential_checkers.html: for example, undefbehavior.AutoptrsOwnSameObj. You can search this list to find more.


15.03.2018 23:54, Alexey Knyshev via cfe-dev пишет:
Hi there!

While thinking about how it would be possible to implement various smart ptr related checkers I tried to review current state of MallocChecker and came up with that it would be great to have RegionState info available to other checkers. Could you please share your points of view and comments on the following statements / questions:

1. Is there any right way for chaining checkers? How they are expected to communicate between each other (excluding generation of nodes / ProgramStates). I've heard that there are couple of problems caused by inlining functions, constructors / descructors.
2. What do you think about moving RegionState to the Core of CSA or providing optional extended info in MemRegion about the source of such region (new opearator / array new, malloc, alloca, etc). So it would be available to all checkers.
3. Is there any roadmap for CSA and especially for dynamic memory management modeling & related checkers?

Regards, Alexey K

--
linkedin.com/profile <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw>

github.com/alexeyknyshev <http://github.com/alexeyknyshev>
bitbucket.org/alexeyknyshev <https://bitbucket.org/alexeyknyshev/>


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev


--
Best regards,
Aleksei Sidorin,
SRR, Samsung Electronics


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev




--



--

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [analyzer] MallocChecker & checker dependencies

Louis Dionne via cfe-dev


On 16/03/2018 12:30 PM, Alexey Knyshev wrote:

> Sorry, accidentally removed part of message before sending.
>
> Aleksei,
>
>     Could you share your ideas of checker design? It is possible that
>     problems you met can be solved in different (maybe even better)
>     way if we know the whole picture.
>
>
> I'm currently have no clear idea how to implement it in the "right
> manner". But first of all I would aim to track the origin (malloc,
> new, new [], etc) of SVal and check if Deleter is the expected
> corresponding way to deallocate such SVal.

This should already be supported by MallocChecker - i.e. it warns when
you allocate memory with, say, new() and release it with free() or
delete[]().

If methods of a smart pointer are "inlined" during analysis, these bugs
will already be caught automatically.

>
> Regards, Alexey K
>
> 2018-03-16 22:22 GMT+03:00 Alexey Knyshev <[hidden email]
> <mailto:[hidden email]>>:
>
>     Hello guys,
>
>     And thanks for your attention
>
>     Aleksei,
>
>         Could you share your ideas of checker design? It is possible
>         that problems you met can be solved in different (maybe even
>         better) way if we know the whole picture.
>
>
>     As I said, I'm interested in improving current state of dynamic
>     memory modeling. Especially stuff related to smart pointers (e.g.
>     shared, unique) which also mentioned in list of potential checkers
>     as smartptr.SmartPtrInit
>     <https://clang-analyzer.llvm.org/potential_checkers.html>*
>
>     *
>
>         **You can see an example of how state trait is exported in
>         GenericTaintChecker or MPIChecker. Generally, you just create
>         a named ProgramStateTrait in the header.
>
>
>     Briefly looked at MPITypes.h. Does it mean we should move
>     RegionState to separate file and register it via Traits in the
>     same manner to make it avaliable from other checkers (not other TU
>     as mentioned in ento::mpi::RequestMap)?
>

GenericTaintChecker is made in a fairly intrusive manner by putting
stuff into the program state class directly, which isn't very sane. I'd
definitely prefer something similar to dynamic type propagation (see
DynamicTypeMap.cpp, DynamicTypeMap.h). You can use the usual
REGISTER_MAP_WITH_PROGRAMSTATE (etc.) macros in the .cpp file as long as
you provide accessor methods declared in the .h file that other checkers
could include.

>
>     Artem,
>
>         you need to keep all of this in mind when writing a checker.
>
>
>     Sure, but on the other hand I think it's possible to implement and
>     improve modeling of various API calls' effects step-by-step. Let's
>     say, it case of SmartPtrInit checker mentioned before the bare
>     minimum would be modeling of construction (and destruction to
>     avoid leak report from existing related checkers).
>
>         which is why the few experimental C++ checkers that we
>         currently have required heavy hacks to become possible. But
>         for now you'd rather keep an eye on these problems.
>
>
>     Does it mean it's better to wait a bit for Core improvements from
>     main contributors? Does it make sense to make an efforts to
>     implement SmartPtrItnit checker in current state of things?
>

You should feel free to start working on the checker in an incremental
manner (everything is better in an incremental manner!), but in some
cases i may prefer improving the checker API or fixing core modeling
bugs instead of writing large portions of checker code to work around
inconvenient APIs or bugs, and this is something i might not be
immediately capable of doing myself (due to being busy with other
things), which may potentially delay your progress.

>     Thanks in advance and regards,
>     Alexey K
>
>     2018-03-16 21:47 GMT+03:00 Artem Dergachev <[hidden email]
>     <mailto:[hidden email]>>:
>
>         Dynamic memory management is pretty much done at this point -
>         we have very good support for malloc()-like stuff and
>         reasonably good support for operator new() (support for
>         operator new[]() is currently missing because it requires
>         calling an indefinite number of constructors which is not yet
>         implemented). There is also a known issue with custom operator
>         new(...) returning null pointers - in this case we should not
>         be calling the constructor but for now we evaluate the
>         constructor conservatively.
>
>         Your real problem will be managing smart pointers themselves
>         because they are C++ objects that have a myriad of methods,
>         they can be copied, moved, assigned, move-assigned, they
>         destroyed, they lifetime-extended, you need to keep all of
>         this in mind when writing a checker. This is slowly getting
>         easier because i'm currently working on that, but until
>         recently it wasn't working correctly in the core, let alone
>         checkers, which is why the few experimental C++ checkers that
>         we currently have required heavy hacks to become possible. But
>         for now you'd rather keep an eye on these problems.
>
>         On 16/03/2018 3:57 AM, Aleksei Sidorin via cfe-dev wrote:
>
>             Hi Alexey!
>
>             Could you share your ideas of checker design? It is
>             possible that problems you met can be solved in different
>             (maybe even better) way if we know the whole picture.
>             Regarding your questions, you can find some answers below.
>
>             1. a) The first way for checker communication is to share
>             their program state trait. You can see an example of how
>             state trait is exported in GenericTaintChecker or
>             MPIChecker. Generally, you just create a named
>             ProgramStateTrait in the header. You can take a look at
>             TaintManager.h and MPITypes.h and how they are used.
>             b) To set a dependency from another checker, you can just
>             register it while registering your checker. An example can
>             be found in MallocChecker where register$Checker also
>             calls registerCStringCheckerBasic to register a checker it
>             depends on.
>             As you pointed, inter-checker communication can become a
>             source of some problems. Most of them are discussed in
>             this conversation:
>             http://clang-developers.42468.n3.nabble.com/analyzer-RFC-Design-idea-separate-modelling-from-checking-td4059122.html
>             <http://clang-developers.42468.n3.nabble.com/analyzer-RFC-Design-idea-separate-modelling-from-checking-td4059122.html>
>
>             2. I think there is nothing bad in sharing RegionState
>             across checkers in the way shown in 1a.
>
>             3. Artem Dergachev has done some excellent work on
>             improvement of operator 'new' processing in CSA engine.
>             Regarding checkers, I can see some on
>             https://clang-analyzer.llvm.org/potential_checkers.html
>             <https://clang-analyzer.llvm.org/potential_checkers.html>:
>             for example, undefbehavior.AutoptrsOwnSameObj. You can
>             search this list to find more.
>
>
>             15.03.2018 23:54, Alexey Knyshev via cfe-dev пишет:
>
>                 Hi there!
>
>                 While thinking about how it would be possible to
>                 implement various smart ptr related checkers I tried
>                 to review current state of MallocChecker and came up
>                 with that it would be great to have RegionState info
>                 available to other checkers. Could you please share
>                 your points of view and comments on the following
>                 statements / questions:
>
>                 1. Is there any right way for chaining checkers? How
>                 they are expected to communicate between each other
>                 (excluding generation of nodes / ProgramStates). I've
>                 heard that there are couple of problems caused by
>                 inlining functions, constructors / descructors.
>                 2. What do you think about moving RegionState to the
>                 Core of CSA or providing optional extended info in
>                 MemRegion about the source of such region (new
>                 opearator / array new, malloc, alloca, etc). So it
>                 would be available to all checkers.
>                 3. Is there any roadmap for CSA and especially for
>                 dynamic memory management modeling & related checkers?
>
>                 Regards, Alexey K
>
>                 --
>                 linkedin.com/profile <http://linkedin.com/profile>
>                 <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw
>                 <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw>>
>
>                 github.com/alexeyknyshev
>                 <http://github.com/alexeyknyshev>
>                 <http://github.com/alexeyknyshev
>                 <http://github.com/alexeyknyshev>>
>                 bitbucket.org/alexeyknyshev
>                 <http://bitbucket.org/alexeyknyshev>
>                 <https://bitbucket.org/alexeyknyshev/
>                 <https://bitbucket.org/alexeyknyshev/>>
>
>
>                 _______________________________________________
>                 cfe-dev mailing list
>                 [hidden email] <mailto:[hidden email]>
>                 http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>                 <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>
>
>
>
>             --
>             Best regards,
>             Aleksei Sidorin,
>             SRR, Samsung Electronics
>
>
>             _______________________________________________
>             cfe-dev mailing list
>             [hidden email] <mailto:[hidden email]>
>             http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>             <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>
>
>
>
>
>
>     --
>     linkedin.com/profile
>     <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw>
>
>     github.com/alexeyknyshev <http://github.com/alexeyknyshev>
>     bitbucket.org/alexeyknyshev <https://bitbucket.org/alexeyknyshev/>
>
>
>
>
> --
> linkedin.com/profile
> <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw>
>
> github.com/alexeyknyshev <http://github.com/alexeyknyshev>
> bitbucket.org/alexeyknyshev <https://bitbucket.org/alexeyknyshev/>

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [analyzer] MallocChecker & checker dependencies

Louis Dionne via cfe-dev
This should already be supported by MallocChecker - i.e. it warns when you allocate memory with, say, new() and release it with free() or delete[]().

If methods of a smart pointer are "inlined" during analysis, these bugs will already be caught automatically.

I took a look at your recent work & status report posted there in mailing lists. And I have a question about how it would work in particular cases, e.g.:
1. Returning smart_ptr by value from function that has it's implementation in analyzed translation unit but it isn't called from any other function in the same TU.

std::shared_ptr<S> createInstance() {
   return std::shared_ptr<S>(new S, free);
}

If I understand correctly, compiler is free to apply copy-elison optimization on shared_ptr which is returned by value. So destructor call won't be modeled properly because it's not inlined during analysis. And bug won't be caught.

2. As you mentioned before, report would be fired only if constructor and destructor have been inlined (which is not guaranteed to be done). As the result such check won't work in general case.

Actually my point is about modeling smart pointers behavior specifically and don't inline any calls to constructor / destructor & methods those modeled by checker. And the question immediately arises: is there way to control inline policy?
And sorry for any possible misunderstanding from my side as I'm not quite familiar with CSA internals.

Regards, Alexey K

2018-03-17 5:21 GMT+03:00 Artem Dergachev <[hidden email]>:


On 16/03/2018 12:30 PM, Alexey Knyshev wrote:
Sorry, accidentally removed part of message before sending.

Aleksei,

    Could you share your ideas of checker design? It is possible that
    problems you met can be solved in different (maybe even better)
    way if we know the whole picture.


I'm currently have no clear idea how to implement it in the "right manner". But first of all I would aim to track the origin (malloc, new, new [], etc) of SVal and check if Deleter is the expected corresponding way to deallocate such SVal.

This should already be supported by MallocChecker - i.e. it warns when you allocate memory with, say, new() and release it with free() or delete[]().

If methods of a smart pointer are "inlined" during analysis, these bugs will already be caught automatically.


Regards, Alexey K

2018-03-16 22:22 GMT+03:00 Alexey Knyshev <[hidden email] <mailto:[hidden email]>>:

    Hello guys,

    And thanks for your attention

    Aleksei,

        Could you share your ideas of checker design? It is possible
        that problems you met can be solved in different (maybe even
        better) way if we know the whole picture.


    As I said, I'm interested in improving current state of dynamic
    memory modeling. Especially stuff related to smart pointers (e.g.
    shared, unique) which also mentioned in list of potential checkers
    as smartptr.SmartPtrInit
    <https://clang-analyzer.llvm.org/potential_checkers.html>*

    *

        **You can see an example of how state trait is exported in
        GenericTaintChecker or MPIChecker. Generally, you just create
        a named ProgramStateTrait in the header.


    Briefly looked at MPITypes.h. Does it mean we should move
    RegionState to separate file and register it via Traits in the
    same manner to make it avaliable from other checkers (not other TU
    as mentioned in ento::mpi::RequestMap)?


GenericTaintChecker is made in a fairly intrusive manner by putting stuff into the program state class directly, which isn't very sane. I'd definitely prefer something similar to dynamic type propagation (see DynamicTypeMap.cpp, DynamicTypeMap.h). You can use the usual REGISTER_MAP_WITH_PROGRAMSTATE (etc.) macros in the .cpp file as long as you provide accessor methods declared in the .h file that other checkers could include.


    Artem,

        you need to keep all of this in mind when writing a checker.


    Sure, but on the other hand I think it's possible to implement and
    improve modeling of various API calls' effects step-by-step. Let's
    say, it case of SmartPtrInit checker mentioned before the bare
    minimum would be modeling of construction (and destruction to
    avoid leak report from existing related checkers).

        which is why the few experimental C++ checkers that we
        currently have required heavy hacks to become possible. But
        for now you'd rather keep an eye on these problems.


    Does it mean it's better to wait a bit for Core improvements from
    main contributors? Does it make sense to make an efforts to
    implement SmartPtrItnit checker in current state of things?


You should feel free to start working on the checker in an incremental manner (everything is better in an incremental manner!), but in some cases i may prefer improving the checker API or fixing core modeling bugs instead of writing large portions of checker code to work around inconvenient APIs or bugs, and this is something i might not be immediately capable of doing myself (due to being busy with other things), which may potentially delay your progress.

    Thanks in advance and regards,
    Alexey K

    2018-03-16 21:47 GMT+03:00 Artem Dergachev <[hidden email]
    <mailto:[hidden email]>>:


        Dynamic memory management is pretty much done at this point -
        we have very good support for malloc()-like stuff and
        reasonably good support for operator new() (support for
        operator new[]() is currently missing because it requires
        calling an indefinite number of constructors which is not yet
        implemented). There is also a known issue with custom operator
        new(...) returning null pointers - in this case we should not
        be calling the constructor but for now we evaluate the
        constructor conservatively.

        Your real problem will be managing smart pointers themselves
        because they are C++ objects that have a myriad of methods,
        they can be copied, moved, assigned, move-assigned, they
        destroyed, they lifetime-extended, you need to keep all of
        this in mind when writing a checker. This is slowly getting
        easier because i'm currently working on that, but until
        recently it wasn't working correctly in the core, let alone
        checkers, which is why the few experimental C++ checkers that
        we currently have required heavy hacks to become possible. But
        for now you'd rather keep an eye on these problems.

        On 16/03/2018 3:57 AM, Aleksei Sidorin via cfe-dev wrote:

            Hi Alexey!

            Could you share your ideas of checker design? It is
            possible that problems you met can be solved in different
            (maybe even better) way if we know the whole picture.
            Regarding your questions, you can find some answers below.

            1. a) The first way for checker communication is to share
            their program state trait. You can see an example of how
            state trait is exported in GenericTaintChecker or
            MPIChecker. Generally, you just create a named
            ProgramStateTrait in the header. You can take a look at
            TaintManager.h and MPITypes.h and how they are used.
            b) To set a dependency from another checker, you can just
            register it while registering your checker. An example can
            be found in MallocChecker where register$Checker also
            calls registerCStringCheckerBasic to register a checker it
            depends on.
            As you pointed, inter-checker communication can become a
            source of some problems. Most of them are discussed in
            this conversation:
            http://clang-developers.42468.n3.nabble.com/analyzer-RFC-Design-idea-separate-modelling-from-checking-td4059122.html
            <http://clang-developers.42468.n3.nabble.com/analyzer-RFC-Design-idea-separate-modelling-from-checking-td4059122.html>

            2. I think there is nothing bad in sharing RegionState
            across checkers in the way shown in 1a.

            3. Artem Dergachev has done some excellent work on
            improvement of operator 'new' processing in CSA engine.
            Regarding checkers, I can see some on
            https://clang-analyzer.llvm.org/potential_checkers.html
            <https://clang-analyzer.llvm.org/potential_checkers.html>:
            for example, undefbehavior.AutoptrsOwnSameObj. You can
            search this list to find more.


            15.03.2018 23:54, Alexey Knyshev via cfe-dev пишет:

                Hi there!

                While thinking about how it would be possible to
                implement various smart ptr related checkers I tried
                to review current state of MallocChecker and came up
                with that it would be great to have RegionState info
                available to other checkers. Could you please share
                your points of view and comments on the following
                statements / questions:

                1. Is there any right way for chaining checkers? How
                they are expected to communicate between each other
                (excluding generation of nodes / ProgramStates). I've
                heard that there are couple of problems caused by
                inlining functions, constructors / descructors.
                2. What do you think about moving RegionState to the
                Core of CSA or providing optional extended info in
                MemRegion about the source of such region (new
                opearator / array new, malloc, alloca, etc). So it
                would be available to all checkers.
                3. Is there any roadmap for CSA and especially for
                dynamic memory management modeling & related checkers?

                Regards, Alexey K

                --                 linkedin.com/profile <http://linkedin.com/profile>
                <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw
                <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw>>

                github.com/alexeyknyshev
                <http://github.com/alexeyknyshev>
                <http://github.com/alexeyknyshev
                <http://github.com/alexeyknyshev>>
                bitbucket.org/alexeyknyshev
                <http://bitbucket.org/alexeyknyshev>
                <https://bitbucket.org/alexeyknyshev/
                <https://bitbucket.org/alexeyknyshev/>>


                _______________________________________________
                cfe-dev mailing list
                [hidden email] <mailto:[hidden email]>
                http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
                <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>



            --             Best regards,
            Aleksei Sidorin,
            SRR, Samsung Electronics


            _______________________________________________
            cfe-dev mailing list
            [hidden email] <mailto:[hidden email]>
            http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
            <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>





    --     linkedin.com/profile
    <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw>

    github.com/alexeyknyshev <http://github.com/alexeyknyshev>
    bitbucket.org/alexeyknyshev <https://bitbucket.org/alexeyknyshev/>




--
linkedin.com/profile <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw>

github.com/alexeyknyshev <http://github.com/alexeyknyshev>
bitbucket.org/alexeyknyshev <https://bitbucket.org/alexeyknyshev/>




--

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [analyzer] MallocChecker & checker dependencies

Louis Dionne via cfe-dev
A bit out of scope question. Has someone ever faced problem with declaring/implementing ento::register${CHECKER_NAME} function (if I recall correctly Aleksei already has). I don't understand what's wrong with it, but I have to wrap it into 'namespace clang { namespace ento { ... } }' explicitly which probably works only because linker finds it somehow. In addition I have no chance to register any other checker as the dependency via calling register${OTHER_CHECKER_NAME} even if I add appropriate includes. Thanks in advance!

Sample clang output:

error: 'void clang::ento::registerSmartPtrInitChecker(clang::ento::CheckerManager&)' should have been declared inside 'clang::ento'

Regards, Alexey K

17 Мар 2018 г. 17:25 пользователь "Alexey Knyshev" <[hidden email]> написал:
This should already be supported by MallocChecker - i.e. it warns when you allocate memory with, say, new() and release it with free() or delete[]().

If methods of a smart pointer are "inlined" during analysis, these bugs will already be caught automatically.

I took a look at your recent work & status report posted there in mailing lists. And I have a question about how it would work in particular cases, e.g.:
1. Returning smart_ptr by value from function that has it's implementation in analyzed translation unit but it isn't called from any other function in the same TU.

std::shared_ptr<S> createInstance() {
   return std::shared_ptr<S>(new S, free);
}

If I understand correctly, compiler is free to apply copy-elison optimization on shared_ptr which is returned by value. So destructor call won't be modeled properly because it's not inlined during analysis. And bug won't be caught.

2. As you mentioned before, report would be fired only if constructor and destructor have been inlined (which is not guaranteed to be done). As the result such check won't work in general case.

Actually my point is about modeling smart pointers behavior specifically and don't inline any calls to constructor / destructor & methods those modeled by checker. And the question immediately arises: is there way to control inline policy?
And sorry for any possible misunderstanding from my side as I'm not quite familiar with CSA internals.

Regards, Alexey K

2018-03-17 5:21 GMT+03:00 Artem Dergachev <[hidden email]>:


On 16/03/2018 12:30 PM, Alexey Knyshev wrote:
Sorry, accidentally removed part of message before sending.

Aleksei,

    Could you share your ideas of checker design? It is possible that
    problems you met can be solved in different (maybe even better)
    way if we know the whole picture.


I'm currently have no clear idea how to implement it in the "right manner". But first of all I would aim to track the origin (malloc, new, new [], etc) of SVal and check if Deleter is the expected corresponding way to deallocate such SVal.

This should already be supported by MallocChecker - i.e. it warns when you allocate memory with, say, new() and release it with free() or delete[]().

If methods of a smart pointer are "inlined" during analysis, these bugs will already be caught automatically.


Regards, Alexey K

2018-03-16 22:22 GMT+03:00 Alexey Knyshev <[hidden email] <mailto:[hidden email]>>:

    Hello guys,

    And thanks for your attention

    Aleksei,

        Could you share your ideas of checker design? It is possible
        that problems you met can be solved in different (maybe even
        better) way if we know the whole picture.


    As I said, I'm interested in improving current state of dynamic
    memory modeling. Especially stuff related to smart pointers (e.g.
    shared, unique) which also mentioned in list of potential checkers
    as smartptr.SmartPtrInit
    <https://clang-analyzer.llvm.org/potential_checkers.html>*

    *

        **You can see an example of how state trait is exported in
        GenericTaintChecker or MPIChecker. Generally, you just create
        a named ProgramStateTrait in the header.


    Briefly looked at MPITypes.h. Does it mean we should move
    RegionState to separate file and register it via Traits in the
    same manner to make it avaliable from other checkers (not other TU
    as mentioned in ento::mpi::RequestMap)?


GenericTaintChecker is made in a fairly intrusive manner by putting stuff into the program state class directly, which isn't very sane. I'd definitely prefer something similar to dynamic type propagation (see DynamicTypeMap.cpp, DynamicTypeMap.h). You can use the usual REGISTER_MAP_WITH_PROGRAMSTATE (etc.) macros in the .cpp file as long as you provide accessor methods declared in the .h file that other checkers could include.


    Artem,

        you need to keep all of this in mind when writing a checker.


    Sure, but on the other hand I think it's possible to implement and
    improve modeling of various API calls' effects step-by-step. Let's
    say, it case of SmartPtrInit checker mentioned before the bare
    minimum would be modeling of construction (and destruction to
    avoid leak report from existing related checkers).

        which is why the few experimental C++ checkers that we
        currently have required heavy hacks to become possible. But
        for now you'd rather keep an eye on these problems.


    Does it mean it's better to wait a bit for Core improvements from
    main contributors? Does it make sense to make an efforts to
    implement SmartPtrItnit checker in current state of things?


You should feel free to start working on the checker in an incremental manner (everything is better in an incremental manner!), but in some cases i may prefer improving the checker API or fixing core modeling bugs instead of writing large portions of checker code to work around inconvenient APIs or bugs, and this is something i might not be immediately capable of doing myself (due to being busy with other things), which may potentially delay your progress.

    Thanks in advance and regards,
    Alexey K

    2018-03-16 21:47 GMT+03:00 Artem Dergachev <[hidden email]
    <mailto:[hidden email]>>:


        Dynamic memory management is pretty much done at this point -
        we have very good support for malloc()-like stuff and
        reasonably good support for operator new() (support for
        operator new[]() is currently missing because it requires
        calling an indefinite number of constructors which is not yet
        implemented). There is also a known issue with custom operator
        new(...) returning null pointers - in this case we should not
        be calling the constructor but for now we evaluate the
        constructor conservatively.

        Your real problem will be managing smart pointers themselves
        because they are C++ objects that have a myriad of methods,
        they can be copied, moved, assigned, move-assigned, they
        destroyed, they lifetime-extended, you need to keep all of
        this in mind when writing a checker. This is slowly getting
        easier because i'm currently working on that, but until
        recently it wasn't working correctly in the core, let alone
        checkers, which is why the few experimental C++ checkers that
        we currently have required heavy hacks to become possible. But
        for now you'd rather keep an eye on these problems.

        On 16/03/2018 3:57 AM, Aleksei Sidorin via cfe-dev wrote:

            Hi Alexey!

            Could you share your ideas of checker design? It is
            possible that problems you met can be solved in different
            (maybe even better) way if we know the whole picture.
            Regarding your questions, you can find some answers below.

            1. a) The first way for checker communication is to share
            their program state trait. You can see an example of how
            state trait is exported in GenericTaintChecker or
            MPIChecker. Generally, you just create a named
            ProgramStateTrait in the header. You can take a look at
            TaintManager.h and MPITypes.h and how they are used.
            b) To set a dependency from another checker, you can just
            register it while registering your checker. An example can
            be found in MallocChecker where register$Checker also
            calls registerCStringCheckerBasic to register a checker it
            depends on.
            As you pointed, inter-checker communication can become a
            source of some problems. Most of them are discussed in
            this conversation:
            http://clang-developers.42468.n3.nabble.com/analyzer-RFC-Design-idea-separate-modelling-from-checking-td4059122.html
            <http://clang-developers.42468.n3.nabble.com/analyzer-RFC-Design-idea-separate-modelling-from-checking-td4059122.html>

            2. I think there is nothing bad in sharing RegionState
            across checkers in the way shown in 1a.

            3. Artem Dergachev has done some excellent work on
            improvement of operator 'new' processing in CSA engine.
            Regarding checkers, I can see some on
            https://clang-analyzer.llvm.org/potential_checkers.html
            <https://clang-analyzer.llvm.org/potential_checkers.html>:
            for example, undefbehavior.AutoptrsOwnSameObj. You can
            search this list to find more.


            15.03.2018 23:54, Alexey Knyshev via cfe-dev пишет:

                Hi there!

                While thinking about how it would be possible to
                implement various smart ptr related checkers I tried
                to review current state of MallocChecker and came up
                with that it would be great to have RegionState info
                available to other checkers. Could you please share
                your points of view and comments on the following
                statements / questions:

                1. Is there any right way for chaining checkers? How
                they are expected to communicate between each other
                (excluding generation of nodes / ProgramStates). I've
                heard that there are couple of problems caused by
                inlining functions, constructors / descructors.
                2. What do you think about moving RegionState to the
                Core of CSA or providing optional extended info in
                MemRegion about the source of such region (new
                opearator / array new, malloc, alloca, etc). So it
                would be available to all checkers.
                3. Is there any roadmap for CSA and especially for
                dynamic memory management modeling & related checkers?

                Regards, Alexey K

                --                 linkedin.com/profile <http://linkedin.com/profile>
                <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw
                <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw>>

                github.com/alexeyknyshev
                <http://github.com/alexeyknyshev>
                <http://github.com/alexeyknyshev
                <http://github.com/alexeyknyshev>>
                bitbucket.org/alexeyknyshev
                <http://bitbucket.org/alexeyknyshev>
                <https://bitbucket.org/alexeyknyshev/
                <https://bitbucket.org/alexeyknyshev/>>


                _______________________________________________
                cfe-dev mailing list
                [hidden email] <mailto:[hidden email]>
                http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
                <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>



            --             Best regards,
            Aleksei Sidorin,
            SRR, Samsung Electronics


            _______________________________________________
            cfe-dev mailing list
            [hidden email] <mailto:[hidden email]>
            http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
            <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>





    --     linkedin.com/profile
    <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw>

    github.com/alexeyknyshev <http://github.com/alexeyknyshev>
    bitbucket.org/alexeyknyshev <https://bitbucket.org/alexeyknyshev/>




--
linkedin.com/profile <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw>

github.com/alexeyknyshev <http://github.com/alexeyknyshev>
bitbucket.org/alexeyknyshev <https://bitbucket.org/alexeyknyshev/>




--

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [analyzer] MallocChecker & checker dependencies

Louis Dionne via cfe-dev
In reply to this post by Louis Dionne via cfe-dev


On 17/03/2018 7:25 AM, Alexey Knyshev wrote:

>
>     This should already be supported by MallocChecker - i.e. it warns
>     when you allocate memory with, say, new() and release it with
>     free() or delete[]().
>
>     If methods of a smart pointer are "inlined" during analysis, these
>     bugs will already be caught automatically.
>
>
> I took a look at your recent work & status report posted there in
> mailing lists. And I have a question about how it would work in
> particular cases, e.g.:
> 1. Returning smart_ptr by value from function that has it's
> implementation in analyzed translation unit but it isn't called from
> any other function in the same TU.
>
> std::shared_ptr<S> createInstance() {
>    return std::shared_ptr<S>(new S, free);
> }
>
> If I understand correctly, compiler is free to apply copy-elison
> optimization on shared_ptr which is returned by value. So destructor
> call won't be modeled properly because it's not inlined during
> analysis. And bug won't be caught.

Indeed, the tempting - but risky - thing to do here is to declare
construction of "std::shared_ptr<S>(new S, free);" to be a bug on its
own. It is risky because it leaves us with a checker that finds
non-bugs; there is a chance that the destructor will never be called
while the pointer is still being owned by the shared_ptr. Such code
would be ridiculous but at the same time we wouldn't like to drive away
users who had to write this to work around an equally ridiculous
proprietary API that they have no way of changing.

In this particular case i'd have tried to take the risk, keeping in mind
that i might end up rewriting the checker back to the conservative
behavior of only warning at the destructor.

Note that copy elision is irrelevant, because in any case we won't
free() the pointer at the return site, because reference count will
never reach zero within the function. That's the whole point of the
shared pointer.

> 2. As you mentioned before, report would be fired only if constructor
> and destructor have been inlined (which is not guaranteed to be done).
> As the result such check won't work in general case.

Yeah, that's pretty much the primary motivation for making an
API-specific checker. It's not super strong in this case because this
would likely give a very slight increase over the existing coverage.

In this case, however, the much more important motivation is to avoid
false positives that occur when we don't know the original reference
count of a shared pointer that was constructed before analysis has
started - and we start "assuming..." that any destructor of a shared
pointer could release memory. This is currently planned to be suppressed
in a fairly gross manner (https://reviews.llvm.org/D44281). For this
reason we believe that having explicit modeling for various smart
pointers is a must.

> Actually my point is about modeling smart pointers behavior
> specifically and don't inline any calls to constructor / destructor &
> methods those modeled by checker. And the question immediately arises:
> is there way to control inline policy?

It is indeed possible to make the checker "take over" function call
modeling by subscribing on the `eval::Call` callback. In this case both
inlining and conservative evaluation by the core are avoided, and the
checker who evaluates the call is responsible for modeling *all* effects
of the call (because two checkers cannot evalCall the same call, even
though they can do the usual checkPreCall/checkPostCall thingy as much
as they want).

There is also the BodyFarm mechanism (which allows you to mock
simplified ASTs for standard functions) but you DO NOT WANT to use it
for C++ because it'd take you forever to construct the correct C++ AST
with templates manually.

> And sorry for any possible misunderstanding from my side as I'm not
> quite familiar with CSA internals.
>
> Regards, Alexey K
>
> 2018-03-17 5:21 GMT+03:00 Artem Dergachev <[hidden email]
> <mailto:[hidden email]>>:
>
>
>
>     On 16/03/2018 12:30 PM, Alexey Knyshev wrote:
>
>         Sorry, accidentally removed part of message before sending.
>
>         Aleksei,
>
>             Could you share your ideas of checker design? It is
>         possible that
>             problems you met can be solved in different (maybe even
>         better)
>             way if we know the whole picture.
>
>
>         I'm currently have no clear idea how to implement it in the
>         "right manner". But first of all I would aim to track the
>         origin (malloc, new, new [], etc) of SVal and check if Deleter
>         is the expected corresponding way to deallocate such SVal.
>
>
>     This should already be supported by MallocChecker - i.e. it warns
>     when you allocate memory with, say, new() and release it with
>     free() or delete[]().
>
>     If methods of a smart pointer are "inlined" during analysis, these
>     bugs will already be caught automatically.
>
>
>         Regards, Alexey K
>
>         2018-03-16 22:22 GMT+03:00 Alexey Knyshev
>         <[hidden email] <mailto:[hidden email]>
>         <mailto:[hidden email]
>         <mailto:[hidden email]>>>:
>
>             Hello guys,
>
>             And thanks for your attention
>
>             Aleksei,
>
>                 Could you share your ideas of checker design? It is
>         possible
>                 that problems you met can be solved in different
>         (maybe even
>                 better) way if we know the whole picture.
>
>
>             As I said, I'm interested in improving current state of
>         dynamic
>             memory modeling. Especially stuff related to smart
>         pointers (e.g.
>             shared, unique) which also mentioned in list of potential
>         checkers
>             as smartptr.SmartPtrInit
>             <https://clang-analyzer.llvm.org/potential_checkers.html
>         <https://clang-analyzer.llvm.org/potential_checkers.html>>*
>
>             *
>
>                 **You can see an example of how state trait is exported in
>                 GenericTaintChecker or MPIChecker. Generally, you just
>         create
>                 a named ProgramStateTrait in the header.
>
>
>             Briefly looked at MPITypes.h. Does it mean we should move
>             RegionState to separate file and register it via Traits in the
>             same manner to make it avaliable from other checkers (not
>         other TU
>             as mentioned in ento::mpi::RequestMap)?
>
>
>     GenericTaintChecker is made in a fairly intrusive manner by
>     putting stuff into the program state class directly, which isn't
>     very sane. I'd definitely prefer something similar to dynamic type
>     propagation (see DynamicTypeMap.cpp, DynamicTypeMap.h). You can
>     use the usual REGISTER_MAP_WITH_PROGRAMSTATE (etc.) macros in the
>     .cpp file as long as you provide accessor methods declared in the
>     .h file that other checkers could include.
>
>
>             Artem,
>
>                 you need to keep all of this in mind when writing a
>         checker.
>
>
>             Sure, but on the other hand I think it's possible to
>         implement and
>             improve modeling of various API calls' effects
>         step-by-step. Let's
>             say, it case of SmartPtrInit checker mentioned before the bare
>             minimum would be modeling of construction (and destruction to
>             avoid leak report from existing related checkers).
>
>                 which is why the few experimental C++ checkers that we
>                 currently have required heavy hacks to become
>         possible. But
>                 for now you'd rather keep an eye on these problems.
>
>
>             Does it mean it's better to wait a bit for Core
>         improvements from
>             main contributors? Does it make sense to make an efforts to
>             implement SmartPtrItnit checker in current state of things?
>
>
>     You should feel free to start working on the checker in an
>     incremental manner (everything is better in an incremental
>     manner!), but in some cases i may prefer improving the checker API
>     or fixing core modeling bugs instead of writing large portions of
>     checker code to work around inconvenient APIs or bugs, and this is
>     something i might not be immediately capable of doing myself (due
>     to being busy with other things), which may potentially delay your
>     progress.
>
>             Thanks in advance and regards,
>             Alexey K
>
>             2018-03-16 21:47 GMT+03:00 Artem Dergachev
>         <[hidden email] <mailto:[hidden email]>
>             <mailto:[hidden email] <mailto:[hidden email]>>>:
>
>
>                 Dynamic memory management is pretty much done at this
>         point -
>                 we have very good support for malloc()-like stuff and
>                 reasonably good support for operator new() (support for
>                 operator new[]() is currently missing because it requires
>                 calling an indefinite number of constructors which is
>         not yet
>                 implemented). There is also a known issue with custom
>         operator
>                 new(...) returning null pointers - in this case we
>         should not
>                 be calling the constructor but for now we evaluate the
>                 constructor conservatively.
>
>                 Your real problem will be managing smart pointers
>         themselves
>                 because they are C++ objects that have a myriad of
>         methods,
>                 they can be copied, moved, assigned, move-assigned, they
>                 destroyed, they lifetime-extended, you need to keep all of
>                 this in mind when writing a checker. This is slowly
>         getting
>                 easier because i'm currently working on that, but until
>                 recently it wasn't working correctly in the core, let
>         alone
>                 checkers, which is why the few experimental C++
>         checkers that
>                 we currently have required heavy hacks to become
>         possible. But
>                 for now you'd rather keep an eye on these problems.
>
>                 On 16/03/2018 3:57 AM, Aleksei Sidorin via cfe-dev wrote:
>
>                     Hi Alexey!
>
>                     Could you share your ideas of checker design? It is
>                     possible that problems you met can be solved in
>         different
>                     (maybe even better) way if we know the whole picture.
>                     Regarding your questions, you can find some
>         answers below.
>
>                     1. a) The first way for checker communication is
>         to share
>                     their program state trait. You can see an example
>         of how
>                     state trait is exported in GenericTaintChecker or
>                     MPIChecker. Generally, you just create a named
>                     ProgramStateTrait in the header. You can take a
>         look at
>                     TaintManager.h and MPITypes.h and how they are used.
>                     b) To set a dependency from another checker, you
>         can just
>                     register it while registering your checker. An
>         example can
>                     be found in MallocChecker where register$Checker also
>                     calls registerCStringCheckerBasic to register a
>         checker it
>                     depends on.
>                     As you pointed, inter-checker communication can
>         become a
>                     source of some problems. Most of them are discussed in
>                     this conversation:
>         http://clang-developers.42468.n3.nabble.com/analyzer-RFC-Design-idea-separate-modelling-from-checking-td4059122.html
>         <http://clang-developers.42468.n3.nabble.com/analyzer-RFC-Design-idea-separate-modelling-from-checking-td4059122.html>
>                    
>         <http://clang-developers.42468.n3.nabble.com/analyzer-RFC-Design-idea-separate-modelling-from-checking-td4059122.html
>         <http://clang-developers.42468.n3.nabble.com/analyzer-RFC-Design-idea-separate-modelling-from-checking-td4059122.html>>
>
>                     2. I think there is nothing bad in sharing RegionState
>                     across checkers in the way shown in 1a.
>
>                     3. Artem Dergachev has done some excellent work on
>                     improvement of operator 'new' processing in CSA
>         engine.
>                     Regarding checkers, I can see some on
>         https://clang-analyzer.llvm.org/potential_checkers.html
>         <https://clang-analyzer.llvm.org/potential_checkers.html>
>                    
>         <https://clang-analyzer.llvm.org/potential_checkers.html
>         <https://clang-analyzer.llvm.org/potential_checkers.html>>:
>                     for example, undefbehavior.AutoptrsOwnSameObj. You can
>                     search this list to find more.
>
>
>                     15.03.2018 23:54, Alexey Knyshev via cfe-dev пишет:
>
>                         Hi there!
>
>                         While thinking about how it would be possible to
>                         implement various smart ptr related checkers I
>         tried
>                         to review current state of MallocChecker and
>         came up
>                         with that it would be great to have
>         RegionState info
>                         available to other checkers. Could you please
>         share
>                         your points of view and comments on the following
>                         statements / questions:
>
>                         1. Is there any right way for chaining
>         checkers? How
>                         they are expected to communicate between each
>         other
>                         (excluding generation of nodes /
>         ProgramStates). I've
>                         heard that there are couple of problems caused by
>                         inlining functions, constructors / descructors.
>                         2. What do you think about moving RegionState
>         to the
>                         Core of CSA or providing optional extended info in
>                         MemRegion about the source of such region (new
>                         opearator / array new, malloc, alloca, etc). So it
>                         would be available to all checkers.
>                         3. Is there any roadmap for CSA and especially for
>                         dynamic memory management modeling & related
>         checkers?
>
>                         Regards, Alexey K
>
>                         -- linkedin.com/profile
>         <http://linkedin.com/profile> <http://linkedin.com/profile>
>                        
>         <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw
>         <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw>
>                        
>         <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw
>         <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw>>>
>
>         github.com/alexeyknyshev <http://github.com/alexeyknyshev>
>                         <http://github.com/alexeyknyshev
>         <http://github.com/alexeyknyshev>>
>                         <http://github.com/alexeyknyshev
>         <http://github.com/alexeyknyshev>
>                         <http://github.com/alexeyknyshev
>         <http://github.com/alexeyknyshev>>>
>         bitbucket.org/alexeyknyshev <http://bitbucket.org/alexeyknyshev>
>                         <http://bitbucket.org/alexeyknyshev
>         <http://bitbucket.org/alexeyknyshev>>
>                         <https://bitbucket.org/alexeyknyshev/
>         <https://bitbucket.org/alexeyknyshev/>
>                         <https://bitbucket.org/alexeyknyshev/
>         <https://bitbucket.org/alexeyknyshev/>>>
>
>
>                         _______________________________________________
>                         cfe-dev mailing list
>         [hidden email] <mailto:[hidden email]>
>         <mailto:[hidden email] <mailto:[hidden email]>>
>         http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>         <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>
>                        
>         <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>         <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>>
>
>
>
>                     --             Best regards,
>                     Aleksei Sidorin,
>                     SRR, Samsung Electronics
>
>
>                     _______________________________________________
>                     cfe-dev mailing list
>         [hidden email] <mailto:[hidden email]>
>         <mailto:[hidden email] <mailto:[hidden email]>>
>         http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>         <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>
>                    
>         <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>         <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>>
>
>
>
>
>
>             -- linkedin.com/profile <http://linkedin.com/profile>
>            
>         <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw
>         <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw>>
>
>         github.com/alexeyknyshev <http://github.com/alexeyknyshev>
>         <http://github.com/alexeyknyshev
>         <http://github.com/alexeyknyshev>>
>         bitbucket.org/alexeyknyshev
>         <http://bitbucket.org/alexeyknyshev>
>         <https://bitbucket.org/alexeyknyshev/
>         <https://bitbucket.org/alexeyknyshev/>>
>
>
>
>
>         --
>         linkedin.com/profile <http://linkedin.com/profile>
>         <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw
>         <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw>>
>
>         github.com/alexeyknyshev <http://github.com/alexeyknyshev>
>         <http://github.com/alexeyknyshev
>         <http://github.com/alexeyknyshev>>
>         bitbucket.org/alexeyknyshev
>         <http://bitbucket.org/alexeyknyshev>
>         <https://bitbucket.org/alexeyknyshev/
>         <https://bitbucket.org/alexeyknyshev/>>
>
>
>
>
>
> --
> linkedin.com/profile
> <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw>
>
> github.com/alexeyknyshev <http://github.com/alexeyknyshev>
> bitbucket.org/alexeyknyshev <https://bitbucket.org/alexeyknyshev/>

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [analyzer] MallocChecker & checker dependencies

Louis Dionne via cfe-dev
Hi all,

During last few days I was working on std::shared_ptr constructor modeling and trying to understand how path sensitive analysis works. After some time I faced unexpected modelling of Call arguments SVals. Here is the portion of code that caused analyzer behavior that I don't understand:

auto del = std::default_delete<S>();
return std::shared_ptr<S>((S*)malloc(sizeof(S)), del);

The case I observed: first argument SVal is modeled properly and I can get RegionState info from program state (provided by MallocChecker). The second one (del symbol) is unknown (DelSVal.dump() shows me that). On the other hand, the following code doesn't cause such problem and prints '&code{free}' as expected:

const auto &del = free;
return std::shared_ptr<S>((S*)malloc(sizeof(S)), del);

So, the question is: what can cause such imprecise modeling of SVal (Deleter - second constructor arg)?

P.S. I've changed shared_ptr constructor declaration in 'test/Analysis/Inputs/system-header-simulator-cxx-std-suppression.h':

shared_ptr(_Tp* __p, void(*deleter)(void*))

to

template<class Deleter>
shared_ptr(_Tp* __p, Deleter d)

and I don't understand why it was declared with function pointer argument instead template. But looks like it isn't the source of my problem at first sight.

Regards, Alexey K

2018-03-20 0:37 GMT+03:00 Artem Dergachev <[hidden email]>:


On 17/03/2018 7:25 AM, Alexey Knyshev wrote:

    This should already be supported by MallocChecker - i.e. it warns
    when you allocate memory with, say, new() and release it with
    free() or delete[]().

    If methods of a smart pointer are "inlined" during analysis, these
    bugs will already be caught automatically.


I took a look at your recent work & status report posted there in mailing lists. And I have a question about how it would work in particular cases, e.g.:
1. Returning smart_ptr by value from function that has it's implementation in analyzed translation unit but it isn't called from any other function in the same TU.

std::shared_ptr<S> createInstance() {
   return std::shared_ptr<S>(new S, free);
}

If I understand correctly, compiler is free to apply copy-elison optimization on shared_ptr which is returned by value. So destructor call won't be modeled properly because it's not inlined during analysis. And bug won't be caught.

Indeed, the tempting - but risky - thing to do here is to declare construction of "std::shared_ptr<S>(new S, free);" to be a bug on its own. It is risky because it leaves us with a checker that finds non-bugs; there is a chance that the destructor will never be called while the pointer is still being owned by the shared_ptr. Such code would be ridiculous but at the same time we wouldn't like to drive away users who had to write this to work around an equally ridiculous proprietary API that they have no way of changing.

In this particular case i'd have tried to take the risk, keeping in mind that i might end up rewriting the checker back to the conservative behavior of only warning at the destructor.

Note that copy elision is irrelevant, because in any case we won't free() the pointer at the return site, because reference count will never reach zero within the function. That's the whole point of the shared pointer.

2. As you mentioned before, report would be fired only if constructor and destructor have been inlined (which is not guaranteed to be done). As the result such check won't work in general case.

Yeah, that's pretty much the primary motivation for making an API-specific checker. It's not super strong in this case because this would likely give a very slight increase over the existing coverage.

In this case, however, the much more important motivation is to avoid false positives that occur when we don't know the original reference count of a shared pointer that was constructed before analysis has started - and we start "assuming..." that any destructor of a shared pointer could release memory. This is currently planned to be suppressed in a fairly gross manner (https://reviews.llvm.org/D44281). For this reason we believe that having explicit modeling for various smart pointers is a must.

Actually my point is about modeling smart pointers behavior specifically and don't inline any calls to constructor / destructor & methods those modeled by checker. And the question immediately arises: is there way to control inline policy?

It is indeed possible to make the checker "take over" function call modeling by subscribing on the `eval::Call` callback. In this case both inlining and conservative evaluation by the core are avoided, and the checker who evaluates the call is responsible for modeling *all* effects of the call (because two checkers cannot evalCall the same call, even though they can do the usual checkPreCall/checkPostCall thingy as much as they want).

There is also the BodyFarm mechanism (which allows you to mock simplified ASTs for standard functions) but you DO NOT WANT to use it for C++ because it'd take you forever to construct the correct C++ AST with templates manually.

And sorry for any possible misunderstanding from my side as I'm not quite familiar with CSA internals.

Regards, Alexey K

2018-03-17 5:21 GMT+03:00 Artem Dergachev <[hidden email] <mailto:[hidden email]>>:



    On 16/03/2018 12:30 PM, Alexey Knyshev wrote:

        Sorry, accidentally removed part of message before sending.

        Aleksei,

            Could you share your ideas of checker design? It is
        possible that
            problems you met can be solved in different (maybe even
        better)
            way if we know the whole picture.


        I'm currently have no clear idea how to implement it in the
        "right manner". But first of all I would aim to track the
        origin (malloc, new, new [], etc) of SVal and check if Deleter
        is the expected corresponding way to deallocate such SVal.


    This should already be supported by MallocChecker - i.e. it warns
    when you allocate memory with, say, new() and release it with
    free() or delete[]().

    If methods of a smart pointer are "inlined" during analysis, these
    bugs will already be caught automatically.


        Regards, Alexey K

        2018-03-16 22:22 GMT+03:00 Alexey Knyshev
        <[hidden email] <mailto:[hidden email]>
        <mailto:[hidden email]

        <mailto:[hidden email]>>>:

            Hello guys,

            And thanks for your attention

            Aleksei,

                Could you share your ideas of checker design? It is
        possible
                that problems you met can be solved in different
        (maybe even
                better) way if we know the whole picture.


            As I said, I'm interested in improving current state of
        dynamic
            memory modeling. Especially stuff related to smart
        pointers (e.g.
            shared, unique) which also mentioned in list of potential
        checkers
            as smartptr.SmartPtrInit
            <https://clang-analyzer.llvm.org/potential_checkers.html
        <https://clang-analyzer.llvm.org/potential_checkers.html>>*

            *

                **You can see an example of how state trait is exported in
                GenericTaintChecker or MPIChecker. Generally, you just
        create
                a named ProgramStateTrait in the header.


            Briefly looked at MPITypes.h. Does it mean we should move
            RegionState to separate file and register it via Traits in the
            same manner to make it avaliable from other checkers (not
        other TU
            as mentioned in ento::mpi::RequestMap)?


    GenericTaintChecker is made in a fairly intrusive manner by
    putting stuff into the program state class directly, which isn't
    very sane. I'd definitely prefer something similar to dynamic type
    propagation (see DynamicTypeMap.cpp, DynamicTypeMap.h). You can
    use the usual REGISTER_MAP_WITH_PROGRAMSTATE (etc.) macros in the
    .cpp file as long as you provide accessor methods declared in the
    .h file that other checkers could include.


            Artem,

                you need to keep all of this in mind when writing a
        checker.


            Sure, but on the other hand I think it's possible to
        implement and
            improve modeling of various API calls' effects
        step-by-step. Let's
            say, it case of SmartPtrInit checker mentioned before the bare
            minimum would be modeling of construction (and destruction to
            avoid leak report from existing related checkers).

                which is why the few experimental C++ checkers that we
                currently have required heavy hacks to become
        possible. But
                for now you'd rather keep an eye on these problems.


            Does it mean it's better to wait a bit for Core
        improvements from
            main contributors? Does it make sense to make an efforts to
            implement SmartPtrItnit checker in current state of things?


    You should feel free to start working on the checker in an
    incremental manner (everything is better in an incremental
    manner!), but in some cases i may prefer improving the checker API
    or fixing core modeling bugs instead of writing large portions of
    checker code to work around inconvenient APIs or bugs, and this is
    something i might not be immediately capable of doing myself (due
    to being busy with other things), which may potentially delay your
    progress.

            Thanks in advance and regards,
            Alexey K

            2018-03-16 21:47 GMT+03:00 Artem Dergachev
        <[hidden email] <mailto:[hidden email]>
            <mailto:[hidden email] <mailto:[hidden email]>>>:



                Dynamic memory management is pretty much done at this
        point -
                we have very good support for malloc()-like stuff and
                reasonably good support for operator new() (support for
                operator new[]() is currently missing because it requires
                calling an indefinite number of constructors which is
        not yet
                implemented). There is also a known issue with custom
        operator
                new(...) returning null pointers - in this case we
        should not
                be calling the constructor but for now we evaluate the
                constructor conservatively.

                Your real problem will be managing smart pointers
        themselves
                because they are C++ objects that have a myriad of
        methods,
                they can be copied, moved, assigned, move-assigned, they
                destroyed, they lifetime-extended, you need to keep all of
                this in mind when writing a checker. This is slowly
        getting
                easier because i'm currently working on that, but until
                recently it wasn't working correctly in the core, let
        alone
                checkers, which is why the few experimental C++
        checkers that
                we currently have required heavy hacks to become
        possible. But
                for now you'd rather keep an eye on these problems.

                On 16/03/2018 3:57 AM, Aleksei Sidorin via cfe-dev wrote:

                    Hi Alexey!

                    Could you share your ideas of checker design? It is
                    possible that problems you met can be solved in
        different
                    (maybe even better) way if we know the whole picture.
                    Regarding your questions, you can find some
        answers below.

                    1. a) The first way for checker communication is
        to share
                    their program state trait. You can see an example
        of how
                    state trait is exported in GenericTaintChecker or
                    MPIChecker. Generally, you just create a named
                    ProgramStateTrait in the header. You can take a
        look at
                    TaintManager.h and MPITypes.h and how they are used.
                    b) To set a dependency from another checker, you
        can just
                    register it while registering your checker. An
        example can
                    be found in MallocChecker where register$Checker also
                    calls registerCStringCheckerBasic to register a
        checker it
                    depends on.
                    As you pointed, inter-checker communication can
        become a
                    source of some problems. Most of them are discussed in
                    this conversation:
        http://clang-developers.42468.n3.nabble.com/analyzer-RFC-Design-idea-separate-modelling-from-checking-td4059122.html
        <http://clang-developers.42468.n3.nabble.com/analyzer-RFC-Design-idea-separate-modelling-from-checking-td4059122.html>
                   
        <http://clang-developers.42468.n3.nabble.com/analyzer-RFC-Design-idea-separate-modelling-from-checking-td4059122.html
        <http://clang-developers.42468.n3.nabble.com/analyzer-RFC-Design-idea-separate-modelling-from-checking-td4059122.html>>

                    2. I think there is nothing bad in sharing RegionState
                    across checkers in the way shown in 1a.

                    3. Artem Dergachev has done some excellent work on
                    improvement of operator 'new' processing in CSA
        engine.
                    Regarding checkers, I can see some on
        https://clang-analyzer.llvm.org/potential_checkers.html
        <https://clang-analyzer.llvm.org/potential_checkers.html>
                   
        <https://clang-analyzer.llvm.org/potential_checkers.html
        <https://clang-analyzer.llvm.org/potential_checkers.html>>:
                    for example, undefbehavior.AutoptrsOwnSameObj. You can
                    search this list to find more.


                    15.03.2018 23:54, Alexey Knyshev via cfe-dev пишет:

                        Hi there!

                        While thinking about how it would be possible to
                        implement various smart ptr related checkers I
        tried
                        to review current state of MallocChecker and
        came up
                        with that it would be great to have
        RegionState info
                        available to other checkers. Could you please
        share
                        your points of view and comments on the following
                        statements / questions:

                        1. Is there any right way for chaining
        checkers? How
                        they are expected to communicate between each
        other
                        (excluding generation of nodes /
        ProgramStates). I've
                        heard that there are couple of problems caused by
                        inlining functions, constructors / descructors.
                        2. What do you think about moving RegionState
        to the
                        Core of CSA or providing optional extended info in
                        MemRegion about the source of such region (new
                        opearator / array new, malloc, alloca, etc). So it
                        would be available to all checkers.
                        3. Is there any roadmap for CSA and especially for
                        dynamic memory management modeling & related
        checkers?

                        Regards, Alexey K

                        -- linkedin.com/profile
        <http://linkedin.com/profile> <http://linkedin.com/profile>
                       
        <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw
        <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw>
                       
        <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw
        <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw>>>

        github.com/alexeyknyshev <http://github.com/alexeyknyshev>
                        <http://github.com/alexeyknyshev
        <http://github.com/alexeyknyshev>>
                        <http://github.com/alexeyknyshev
        <http://github.com/alexeyknyshev>
                        <http://github.com/alexeyknyshev
        <http://github.com/alexeyknyshev>>>
        bitbucket.org/alexeyknyshev <http://bitbucket.org/alexeyknyshev>
                        <http://bitbucket.org/alexeyknyshev
        <http://bitbucket.org/alexeyknyshev>>
                        <https://bitbucket.org/alexeyknyshev/
        <https://bitbucket.org/alexeyknyshev/>
                        <https://bitbucket.org/alexeyknyshev/
        <https://bitbucket.org/alexeyknyshev/>>>


                        _______________________________________________
                        cfe-dev mailing list
        [hidden email] <mailto:[hidden email]>
        <mailto:[hidden email] <mailto:[hidden email]>>
        http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
        <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>
                       
        <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
        <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>>



                    --             Best regards,
                    Aleksei Sidorin,
                    SRR, Samsung Electronics


                    _______________________________________________
                    cfe-dev mailing list
        [hidden email] <mailto:[hidden email]>
        <mailto:[hidden email] <mailto:[hidden email]>>
        http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
        <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>
                   
        <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
        <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>>





            -- linkedin.com/profile <http://linkedin.com/profile>
           
        <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw
        <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw>>

        github.com/alexeyknyshev <http://github.com/alexeyknyshev>
        <http://github.com/alexeyknyshev
        <http://github.com/alexeyknyshev>>
        bitbucket.org/alexeyknyshev
        <http://bitbucket.org/alexeyknyshev>
        <https://bitbucket.org/alexeyknyshev/
        <https://bitbucket.org/alexeyknyshev/>>




        --         linkedin.com/profile <http://linkedin.com/profile>
        <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw
        <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw>>

        github.com/alexeyknyshev <http://github.com/alexeyknyshev>
        <http://github.com/alexeyknyshev
        <http://github.com/alexeyknyshev>>
        bitbucket.org/alexeyknyshev
        <http://bitbucket.org/alexeyknyshev>
        <https://bitbucket.org/alexeyknyshev/
        <https://bitbucket.org/alexeyknyshev/>>





--
linkedin.com/profile <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw>

github.com/alexeyknyshev <http://github.com/alexeyknyshev>
bitbucket.org/alexeyknyshev <https://bitbucket.org/alexeyknyshev/>




--

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [analyzer] MallocChecker & checker dependencies

Louis Dionne via cfe-dev
Hi all,
Any ideas on it?

Regards, Alexey K

2018-03-26 12:01 GMT+03:00 Alexey Knyshev <[hidden email]>:
Hi all,

During last few days I was working on std::shared_ptr constructor modeling and trying to understand how path sensitive analysis works. After some time I faced unexpected modelling of Call arguments SVals. Here is the portion of code that caused analyzer behavior that I don't understand:

auto del = std::default_delete<S>();
return std::shared_ptr<S>((S*)malloc(sizeof(S)), del);

The case I observed: first argument SVal is modeled properly and I can get RegionState info from program state (provided by MallocChecker). The second one (del symbol) is unknown (DelSVal.dump() shows me that). On the other hand, the following code doesn't cause such problem and prints '&code{free}' as expected:

const auto &del = free;
return std::shared_ptr<S>((S*)malloc(sizeof(S)), del);

So, the question is: what can cause such imprecise modeling of SVal (Deleter - second constructor arg)?

P.S. I've changed shared_ptr constructor declaration in 'test/Analysis/Inputs/system-header-simulator-cxx-std-suppression.h':

shared_ptr(_Tp* __p, void(*deleter)(void*))

to

template<class Deleter>
shared_ptr(_Tp* __p, Deleter d)

and I don't understand why it was declared with function pointer argument instead template. But looks like it isn't the source of my problem at first sight.

Regards, Alexey K

2018-03-20 0:37 GMT+03:00 Artem Dergachev <[hidden email]>:


On 17/03/2018 7:25 AM, Alexey Knyshev wrote:

    This should already be supported by MallocChecker - i.e. it warns
    when you allocate memory with, say, new() and release it with
    free() or delete[]().

    If methods of a smart pointer are "inlined" during analysis, these
    bugs will already be caught automatically.


I took a look at your recent work & status report posted there in mailing lists. And I have a question about how it would work in particular cases, e.g.:
1. Returning smart_ptr by value from function that has it's implementation in analyzed translation unit but it isn't called from any other function in the same TU.

std::shared_ptr<S> createInstance() {
   return std::shared_ptr<S>(new S, free);
}

If I understand correctly, compiler is free to apply copy-elison optimization on shared_ptr which is returned by value. So destructor call won't be modeled properly because it's not inlined during analysis. And bug won't be caught.

Indeed, the tempting - but risky - thing to do here is to declare construction of "std::shared_ptr<S>(new S, free);" to be a bug on its own. It is risky because it leaves us with a checker that finds non-bugs; there is a chance that the destructor will never be called while the pointer is still being owned by the shared_ptr. Such code would be ridiculous but at the same time we wouldn't like to drive away users who had to write this to work around an equally ridiculous proprietary API that they have no way of changing.

In this particular case i'd have tried to take the risk, keeping in mind that i might end up rewriting the checker back to the conservative behavior of only warning at the destructor.

Note that copy elision is irrelevant, because in any case we won't free() the pointer at the return site, because reference count will never reach zero within the function. That's the whole point of the shared pointer.

2. As you mentioned before, report would be fired only if constructor and destructor have been inlined (which is not guaranteed to be done). As the result such check won't work in general case.

Yeah, that's pretty much the primary motivation for making an API-specific checker. It's not super strong in this case because this would likely give a very slight increase over the existing coverage.

In this case, however, the much more important motivation is to avoid false positives that occur when we don't know the original reference count of a shared pointer that was constructed before analysis has started - and we start "assuming..." that any destructor of a shared pointer could release memory. This is currently planned to be suppressed in a fairly gross manner (https://reviews.llvm.org/D44281). For this reason we believe that having explicit modeling for various smart pointers is a must.

Actually my point is about modeling smart pointers behavior specifically and don't inline any calls to constructor / destructor & methods those modeled by checker. And the question immediately arises: is there way to control inline policy?

It is indeed possible to make the checker "take over" function call modeling by subscribing on the `eval::Call` callback. In this case both inlining and conservative evaluation by the core are avoided, and the checker who evaluates the call is responsible for modeling *all* effects of the call (because two checkers cannot evalCall the same call, even though they can do the usual checkPreCall/checkPostCall thingy as much as they want).

There is also the BodyFarm mechanism (which allows you to mock simplified ASTs for standard functions) but you DO NOT WANT to use it for C++ because it'd take you forever to construct the correct C++ AST with templates manually.

And sorry for any possible misunderstanding from my side as I'm not quite familiar with CSA internals.

Regards, Alexey K

2018-03-17 5:21 GMT+03:00 Artem Dergachev <[hidden email] <mailto:[hidden email]>>:



    On 16/03/2018 12:30 PM, Alexey Knyshev wrote:

        Sorry, accidentally removed part of message before sending.

        Aleksei,

            Could you share your ideas of checker design? It is
        possible that
            problems you met can be solved in different (maybe even
        better)
            way if we know the whole picture.


        I'm currently have no clear idea how to implement it in the
        "right manner". But first of all I would aim to track the
        origin (malloc, new, new [], etc) of SVal and check if Deleter
        is the expected corresponding way to deallocate such SVal.


    This should already be supported by MallocChecker - i.e. it warns
    when you allocate memory with, say, new() and release it with
    free() or delete[]().

    If methods of a smart pointer are "inlined" during analysis, these
    bugs will already be caught automatically.


        Regards, Alexey K

        2018-03-16 22:22 GMT+03:00 Alexey Knyshev
        <[hidden email] <mailto:[hidden email]>
        <mailto:[hidden email]

        <mailto:[hidden email]>>>:

            Hello guys,

            And thanks for your attention

            Aleksei,

                Could you share your ideas of checker design? It is
        possible
                that problems you met can be solved in different
        (maybe even
                better) way if we know the whole picture.


            As I said, I'm interested in improving current state of
        dynamic
            memory modeling. Especially stuff related to smart
        pointers (e.g.
            shared, unique) which also mentioned in list of potential
        checkers
            as smartptr.SmartPtrInit
            <https://clang-analyzer.llvm.org/potential_checkers.html
        <https://clang-analyzer.llvm.org/potential_checkers.html>>*

            *

                **You can see an example of how state trait is exported in
                GenericTaintChecker or MPIChecker. Generally, you just
        create
                a named ProgramStateTrait in the header.


            Briefly looked at MPITypes.h. Does it mean we should move
            RegionState to separate file and register it via Traits in the
            same manner to make it avaliable from other checkers (not
        other TU
            as mentioned in ento::mpi::RequestMap)?


    GenericTaintChecker is made in a fairly intrusive manner by
    putting stuff into the program state class directly, which isn't
    very sane. I'd definitely prefer something similar to dynamic type
    propagation (see DynamicTypeMap.cpp, DynamicTypeMap.h). You can
    use the usual REGISTER_MAP_WITH_PROGRAMSTATE (etc.) macros in the
    .cpp file as long as you provide accessor methods declared in the
    .h file that other checkers could include.


            Artem,

                you need to keep all of this in mind when writing a
        checker.


            Sure, but on the other hand I think it's possible to
        implement and
            improve modeling of various API calls' effects
        step-by-step. Let's
            say, it case of SmartPtrInit checker mentioned before the bare
            minimum would be modeling of construction (and destruction to
            avoid leak report from existing related checkers).

                which is why the few experimental C++ checkers that we
                currently have required heavy hacks to become
        possible. But
                for now you'd rather keep an eye on these problems.


            Does it mean it's better to wait a bit for Core
        improvements from
            main contributors? Does it make sense to make an efforts to
            implement SmartPtrItnit checker in current state of things?


    You should feel free to start working on the checker in an
    incremental manner (everything is better in an incremental
    manner!), but in some cases i may prefer improving the checker API
    or fixing core modeling bugs instead of writing large portions of
    checker code to work around inconvenient APIs or bugs, and this is
    something i might not be immediately capable of doing myself (due
    to being busy with other things), which may potentially delay your
    progress.

            Thanks in advance and regards,
            Alexey K

            2018-03-16 21:47 GMT+03:00 Artem Dergachev
        <[hidden email] <mailto:[hidden email]>
            <mailto:[hidden email] <mailto:[hidden email]>>>:



                Dynamic memory management is pretty much done at this
        point -
                we have very good support for malloc()-like stuff and
                reasonably good support for operator new() (support for
                operator new[]() is currently missing because it requires
                calling an indefinite number of constructors which is
        not yet
                implemented). There is also a known issue with custom
        operator
                new(...) returning null pointers - in this case we
        should not
                be calling the constructor but for now we evaluate the
                constructor conservatively.

                Your real problem will be managing smart pointers
        themselves
                because they are C++ objects that have a myriad of
        methods,
                they can be copied, moved, assigned, move-assigned, they
                destroyed, they lifetime-extended, you need to keep all of
                this in mind when writing a checker. This is slowly
        getting
                easier because i'm currently working on that, but until
                recently it wasn't working correctly in the core, let
        alone
                checkers, which is why the few experimental C++
        checkers that
                we currently have required heavy hacks to become
        possible. But
                for now you'd rather keep an eye on these problems.

                On 16/03/2018 3:57 AM, Aleksei Sidorin via cfe-dev wrote:

                    Hi Alexey!

                    Could you share your ideas of checker design? It is
                    possible that problems you met can be solved in
        different
                    (maybe even better) way if we know the whole picture.
                    Regarding your questions, you can find some
        answers below.

                    1. a) The first way for checker communication is
        to share
                    their program state trait. You can see an example
        of how
                    state trait is exported in GenericTaintChecker or
                    MPIChecker. Generally, you just create a named
                    ProgramStateTrait in the header. You can take a
        look at
                    TaintManager.h and MPITypes.h and how they are used.
                    b) To set a dependency from another checker, you
        can just
                    register it while registering your checker. An
        example can
                    be found in MallocChecker where register$Checker also
                    calls registerCStringCheckerBasic to register a
        checker it
                    depends on.
                    As you pointed, inter-checker communication can
        become a
                    source of some problems. Most of them are discussed in
                    this conversation:
        http://clang-developers.42468.n3.nabble.com/analyzer-RFC-Design-idea-separate-modelling-from-checking-td4059122.html
        <http://clang-developers.42468.n3.nabble.com/analyzer-RFC-Design-idea-separate-modelling-from-checking-td4059122.html>
                   
        <http://clang-developers.42468.n3.nabble.com/analyzer-RFC-Design-idea-separate-modelling-from-checking-td4059122.html
        <http://clang-developers.42468.n3.nabble.com/analyzer-RFC-Design-idea-separate-modelling-from-checking-td4059122.html>>

                    2. I think there is nothing bad in sharing RegionState
                    across checkers in the way shown in 1a.

                    3. Artem Dergachev has done some excellent work on
                    improvement of operator 'new' processing in CSA
        engine.
                    Regarding checkers, I can see some on
        https://clang-analyzer.llvm.org/potential_checkers.html
        <https://clang-analyzer.llvm.org/potential_checkers.html>
                   
        <https://clang-analyzer.llvm.org/potential_checkers.html
        <https://clang-analyzer.llvm.org/potential_checkers.html>>:
                    for example, undefbehavior.AutoptrsOwnSameObj. You can
                    search this list to find more.


                    15.03.2018 23:54, Alexey Knyshev via cfe-dev пишет:

                        Hi there!

                        While thinking about how it would be possible to
                        implement various smart ptr related checkers I
        tried
                        to review current state of MallocChecker and
        came up
                        with that it would be great to have
        RegionState info
                        available to other checkers. Could you please
        share
                        your points of view and comments on the following
                        statements / questions:

                        1. Is there any right way for chaining
        checkers? How
                        they are expected to communicate between each
        other
                        (excluding generation of nodes /
        ProgramStates). I've
                        heard that there are couple of problems caused by
                        inlining functions, constructors / descructors.
                        2. What do you think about moving RegionState
        to the
                        Core of CSA or providing optional extended info in
                        MemRegion about the source of such region (new
                        opearator / array new, malloc, alloca, etc). So it
                        would be available to all checkers.
                        3. Is there any roadmap for CSA and especially for
                        dynamic memory management modeling & related
        checkers?

                        Regards, Alexey K

                        -- linkedin.com/profile
        <http://linkedin.com/profile> <http://linkedin.com/profile>
                       
        <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw
        <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw>
                       
        <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw
        <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw>>>

        github.com/alexeyknyshev <http://github.com/alexeyknyshev>
                        <http://github.com/alexeyknyshev
        <http://github.com/alexeyknyshev>>
                        <http://github.com/alexeyknyshev
        <http://github.com/alexeyknyshev>
                        <http://github.com/alexeyknyshev
        <http://github.com/alexeyknyshev>>>
        bitbucket.org/alexeyknyshev <http://bitbucket.org/alexeyknyshev>
                        <http://bitbucket.org/alexeyknyshev
        <http://bitbucket.org/alexeyknyshev>>
                        <https://bitbucket.org/alexeyknyshev/
        <https://bitbucket.org/alexeyknyshev/>
                        <https://bitbucket.org/alexeyknyshev/
        <https://bitbucket.org/alexeyknyshev/>>>


                        _______________________________________________
                        cfe-dev mailing list
        [hidden email] <mailto:[hidden email]>
        <mailto:[hidden email] <mailto:[hidden email]>>
        http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
        <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>
                       
        <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
        <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>>



                    --             Best regards,
                    Aleksei Sidorin,
                    SRR, Samsung Electronics


                    _______________________________________________
                    cfe-dev mailing list
        [hidden email] <mailto:[hidden email]>
        <mailto:[hidden email] <mailto:[hidden email]>>
        http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
        <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>
                   
        <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
        <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>>





            -- linkedin.com/profile <http://linkedin.com/profile>
           
        <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw
        <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw>>

        github.com/alexeyknyshev <http://github.com/alexeyknyshev>
        <http://github.com/alexeyknyshev
        <http://github.com/alexeyknyshev>>
        bitbucket.org/alexeyknyshev
        <http://bitbucket.org/alexeyknyshev>
        <https://bitbucket.org/alexeyknyshev/
        <https://bitbucket.org/alexeyknyshev/>>




        --         linkedin.com/profile <http://linkedin.com/profile>
        <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw
        <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw>>

        github.com/alexeyknyshev <http://github.com/alexeyknyshev>
        <http://github.com/alexeyknyshev
        <http://github.com/alexeyknyshev>>
        bitbucket.org/alexeyknyshev
        <http://bitbucket.org/alexeyknyshev>
        <https://bitbucket.org/alexeyknyshev/
        <https://bitbucket.org/alexeyknyshev/>>





--
linkedin.com/profile <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw>

github.com/alexeyknyshev <http://github.com/alexeyknyshev>
bitbucket.org/alexeyknyshev <https://bitbucket.org/alexeyknyshev/>




--



--

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [analyzer] MallocChecker & checker dependencies

Louis Dionne via cfe-dev
In reply to this post by Louis Dionne via cfe-dev


On 3/26/18 2:01 AM, Alexey Knyshev wrote:
Hi all,

During last few days I was working on std::shared_ptr constructor modeling and trying to understand how path sensitive analysis works. After some time I faced unexpected modelling of Call arguments SVals. Here is the portion of code that caused analyzer behavior that I don't understand:

auto del = std::default_delete<S>();
return std::shared_ptr<S>((S*)malloc(sizeof(S)), del);

The case I observed: first argument SVal is modeled properly and I can get RegionState info from program state (provided by MallocChecker). The second one (del symbol) is unknown (DelSVal.dump() shows me that). On the other hand, the following code doesn't cause such problem and prints '&code{free}' as expected:

const auto &del = free;
return std::shared_ptr<S>((S*)malloc(sizeof(S)), del);

So, the question is: what can cause such imprecise modeling of SVal (Deleter - second constructor arg)?

"Well," said Owl, "the customary procedure in such cases is as follows:"

1. View the exploded graph in GraphViz (as described in https://clang-analyzer.llvm.org/checker_dev_manual.html#visualizing).
2. Find the top-most node on it that has incorrect or missing info.
3. Does the node have a checker tag at the bottom? If so, find the respective code that adds transitions in the checker.
4. Otherwise see what statement is being evaluated at this node. The code you're interested in lies somewhere within ExprEngine::Visit() for the respective statement kind.
5. Put a breakpoint in the code you've found and debug step-by-step until you see the problem.

Exploded graph is quite overwhelming at first, but it's not as hard as it looks.

In your case the call argument is an UnknownVal - which means that you know that we couldn't evaluate argument expression. If you're not sure what kind of expression it is, consult the -ast-dump. The (last) node that evaluates that expression would not contain the value for the expression (because UnknownVal is dropped in Environment - but not in Store). If any of its sub-expressions are also incorrectly dropped, you might want to look at them first.

P.S. I've changed shared_ptr constructor declaration in 'test/Analysis/Inputs/system-header-simulator-cxx-std-suppression.h':

You should only use these simulators for regression testing. They're simplified standard library headers that try to be as accurate as possible but don't guarantee that. After all, For your actual development, you should use the real system headers, because that's what the users will care about, and then try to update the simulators to simulate them as closely as you need.


shared_ptr(_Tp* __p, void(*deleter)(void*))

to

template<class Deleter>
shared_ptr(_Tp* __p, Deleter d)

and I don't understand why it was declared with function pointer argument instead template. But looks like it isn't the source of my problem at first sight.

Regards, Alexey K

2018-03-20 0:37 GMT+03:00 Artem Dergachev <[hidden email]>:


On 17/03/2018 7:25 AM, Alexey Knyshev wrote:

    This should already be supported by MallocChecker - i.e. it warns
    when you allocate memory with, say, new() and release it with
    free() or delete[]().

    If methods of a smart pointer are "inlined" during analysis, these
    bugs will already be caught automatically.


I took a look at your recent work & status report posted there in mailing lists. And I have a question about how it would work in particular cases, e.g.:
1. Returning smart_ptr by value from function that has it's implementation in analyzed translation unit but it isn't called from any other function in the same TU.

std::shared_ptr<S> createInstance() {
   return std::shared_ptr<S>(new S, free);
}

If I understand correctly, compiler is free to apply copy-elison optimization on shared_ptr which is returned by value. So destructor call won't be modeled properly because it's not inlined during analysis. And bug won't be caught.

Indeed, the tempting - but risky - thing to do here is to declare construction of "std::shared_ptr<S>(new S, free);" to be a bug on its own. It is risky because it leaves us with a checker that finds non-bugs; there is a chance that the destructor will never be called while the pointer is still being owned by the shared_ptr. Such code would be ridiculous but at the same time we wouldn't like to drive away users who had to write this to work around an equally ridiculous proprietary API that they have no way of changing.

In this particular case i'd have tried to take the risk, keeping in mind that i might end up rewriting the checker back to the conservative behavior of only warning at the destructor.

Note that copy elision is irrelevant, because in any case we won't free() the pointer at the return site, because reference count will never reach zero within the function. That's the whole point of the shared pointer.

2. As you mentioned before, report would be fired only if constructor and destructor have been inlined (which is not guaranteed to be done). As the result such check won't work in general case.

Yeah, that's pretty much the primary motivation for making an API-specific checker. It's not super strong in this case because this would likely give a very slight increase over the existing coverage.

In this case, however, the much more important motivation is to avoid false positives that occur when we don't know the original reference count of a shared pointer that was constructed before analysis has started - and we start "assuming..." that any destructor of a shared pointer could release memory. This is currently planned to be suppressed in a fairly gross manner (https://reviews.llvm.org/D44281). For this reason we believe that having explicit modeling for various smart pointers is a must.

Actually my point is about modeling smart pointers behavior specifically and don't inline any calls to constructor / destructor & methods those modeled by checker. And the question immediately arises: is there way to control inline policy?

It is indeed possible to make the checker "take over" function call modeling by subscribing on the `eval::Call` callback. In this case both inlining and conservative evaluation by the core are avoided, and the checker who evaluates the call is responsible for modeling *all* effects of the call (because two checkers cannot evalCall the same call, even though they can do the usual checkPreCall/checkPostCall thingy as much as they want).

There is also the BodyFarm mechanism (which allows you to mock simplified ASTs for standard functions) but you DO NOT WANT to use it for C++ because it'd take you forever to construct the correct C++ AST with templates manually.

And sorry for any possible misunderstanding from my side as I'm not quite familiar with CSA internals.

Regards, Alexey K

2018-03-17 5:21 GMT+03:00 Artem Dergachev <[hidden email] <mailto:[hidden email]>>:



    On 16/03/2018 12:30 PM, Alexey Knyshev wrote:

        Sorry, accidentally removed part of message before sending.

        Aleksei,

            Could you share your ideas of checker design? It is
        possible that
            problems you met can be solved in different (maybe even
        better)
            way if we know the whole picture.


        I'm currently have no clear idea how to implement it in the
        "right manner". But first of all I would aim to track the
        origin (malloc, new, new [], etc) of SVal and check if Deleter
        is the expected corresponding way to deallocate such SVal.


    This should already be supported by MallocChecker - i.e. it warns
    when you allocate memory with, say, new() and release it with
    free() or delete[]().

    If methods of a smart pointer are "inlined" during analysis, these
    bugs will already be caught automatically.


        Regards, Alexey K

        2018-03-16 22:22 GMT+03:00 Alexey Knyshev
        <[hidden email] <mailto:[hidden email]>
        <mailto:[hidden email]

        <mailto:[hidden email]>>>:

            Hello guys,

            And thanks for your attention

            Aleksei,

                Could you share your ideas of checker design? It is
        possible
                that problems you met can be solved in different
        (maybe even
                better) way if we know the whole picture.


            As I said, I'm interested in improving current state of
        dynamic
            memory modeling. Especially stuff related to smart
        pointers (e.g.
            shared, unique) which also mentioned in list of potential
        checkers
            as smartptr.SmartPtrInit
            <https://clang-analyzer.llvm.org/potential_checkers.html
        <https://clang-analyzer.llvm.org/potential_checkers.html>>*

            *

                **You can see an example of how state trait is exported in
                GenericTaintChecker or MPIChecker. Generally, you just
        create
                a named ProgramStateTrait in the header.


            Briefly looked at MPITypes.h. Does it mean we should move
            RegionState to separate file and register it via Traits in the
            same manner to make it avaliable from other checkers (not
        other TU
            as mentioned in ento::mpi::RequestMap)?


    GenericTaintChecker is made in a fairly intrusive manner by
    putting stuff into the program state class directly, which isn't
    very sane. I'd definitely prefer something similar to dynamic type
    propagation (see DynamicTypeMap.cpp, DynamicTypeMap.h). You can
    use the usual REGISTER_MAP_WITH_PROGRAMSTATE (etc.) macros in the
    .cpp file as long as you provide accessor methods declared in the
    .h file that other checkers could include.


            Artem,

                you need to keep all of this in mind when writing a
        checker.


            Sure, but on the other hand I think it's possible to
        implement and
            improve modeling of various API calls' effects
        step-by-step. Let's
            say, it case of SmartPtrInit checker mentioned before the bare
            minimum would be modeling of construction (and destruction to
            avoid leak report from existing related checkers).

                which is why the few experimental C++ checkers that we
                currently have required heavy hacks to become
        possible. But
                for now you'd rather keep an eye on these problems.


            Does it mean it's better to wait a bit for Core
        improvements from
            main contributors? Does it make sense to make an efforts to
            implement SmartPtrItnit checker in current state of things?


    You should feel free to start working on the checker in an
    incremental manner (everything is better in an incremental
    manner!), but in some cases i may prefer improving the checker API
    or fixing core modeling bugs instead of writing large portions of
    checker code to work around inconvenient APIs or bugs, and this is
    something i might not be immediately capable of doing myself (due
    to being busy with other things), which may potentially delay your
    progress.

            Thanks in advance and regards,
            Alexey K

            2018-03-16 21:47 GMT+03:00 Artem Dergachev
        <[hidden email] <mailto:[hidden email]>
            <mailto:[hidden email] <mailto:[hidden email]>>>:



                Dynamic memory management is pretty much done at this
        point -
                we have very good support for malloc()-like stuff and
                reasonably good support for operator new() (support for
                operator new[]() is currently missing because it requires
                calling an indefinite number of constructors which is
        not yet
                implemented). There is also a known issue with custom
        operator
                new(...) returning null pointers - in this case we
        should not
                be calling the constructor but for now we evaluate the
                constructor conservatively.

                Your real problem will be managing smart pointers
        themselves
                because they are C++ objects that have a myriad of
        methods,
                they can be copied, moved, assigned, move-assigned, they
                destroyed, they lifetime-extended, you need to keep all of
                this in mind when writing a checker. This is slowly
        getting
                easier because i'm currently working on that, but until
                recently it wasn't working correctly in the core, let
        alone
                checkers, which is why the few experimental C++
        checkers that
                we currently have required heavy hacks to become
        possible. But
                for now you'd rather keep an eye on these problems.

                On 16/03/2018 3:57 AM, Aleksei Sidorin via cfe-dev wrote:

                    Hi Alexey!

                    Could you share your ideas of checker design? It is
                    possible that problems you met can be solved in
        different
                    (maybe even better) way if we know the whole picture.
                    Regarding your questions, you can find some
        answers below.

                    1. a) The first way for checker communication is
        to share
                    their program state trait. You can see an example
        of how
                    state trait is exported in GenericTaintChecker or
                    MPIChecker. Generally, you just create a named
                    ProgramStateTrait in the header. You can take a
        look at
                    TaintManager.h and MPITypes.h and how they are used.
                    b) To set a dependency from another checker, you
        can just
                    register it while registering your checker. An
        example can
                    be found in MallocChecker where register$Checker also
                    calls registerCStringCheckerBasic to register a
        checker it
                    depends on.
                    As you pointed, inter-checker communication can
        become a
                    source of some problems. Most of them are discussed in
                    this conversation:
        http://clang-developers.42468.n3.nabble.com/analyzer-RFC-Design-idea-separate-modelling-from-checking-td4059122.html
        <http://clang-developers.42468.n3.nabble.com/analyzer-RFC-Design-idea-separate-modelling-from-checking-td4059122.html>
                   
        <http://clang-developers.42468.n3.nabble.com/analyzer-RFC-Design-idea-separate-modelling-from-checking-td4059122.html
        <http://clang-developers.42468.n3.nabble.com/analyzer-RFC-Design-idea-separate-modelling-from-checking-td4059122.html>>

                    2. I think there is nothing bad in sharing RegionState
                    across checkers in the way shown in 1a.

                    3. Artem Dergachev has done some excellent work on
                    improvement of operator 'new' processing in CSA
        engine.
                    Regarding checkers, I can see some on
        https://clang-analyzer.llvm.org/potential_checkers.html
        <https://clang-analyzer.llvm.org/potential_checkers.html>
                   
        <https://clang-analyzer.llvm.org/potential_checkers.html
        <https://clang-analyzer.llvm.org/potential_checkers.html>>:
                    for example, undefbehavior.AutoptrsOwnSameObj. You can
                    search this list to find more.


                    15.03.2018 23:54, Alexey Knyshev via cfe-dev пишет:

                        Hi there!

                        While thinking about how it would be possible to
                        implement various smart ptr related checkers I
        tried
                        to review current state of MallocChecker and
        came up
                        with that it would be great to have
        RegionState info
                        available to other checkers. Could you please
        share
                        your points of view and comments on the following
                        statements / questions:

                        1. Is there any right way for chaining
        checkers? How
                        they are expected to communicate between each
        other
                        (excluding generation of nodes /
        ProgramStates). I've
                        heard that there are couple of problems caused by
                        inlining functions, constructors / descructors.
                        2. What do you think about moving RegionState
        to the
                        Core of CSA or providing optional extended info in
                        MemRegion about the source of such region (new
                        opearator / array new, malloc, alloca, etc). So it
                        would be available to all checkers.
                        3. Is there any roadmap for CSA and especially for
                        dynamic memory management modeling & related
        checkers?

                        Regards, Alexey K

                        -- linkedin.com/profile
        <http://linkedin.com/profile> <http://linkedin.com/profile>
                       
        <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw
        <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw>
                       
        <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw
        <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw>>>

        github.com/alexeyknyshev <http://github.com/alexeyknyshev>
                        <http://github.com/alexeyknyshev
        <http://github.com/alexeyknyshev>>
                        <http://github.com/alexeyknyshev
        <http://github.com/alexeyknyshev>
                        <http://github.com/alexeyknyshev
        <http://github.com/alexeyknyshev>>>
        bitbucket.org/alexeyknyshev <http://bitbucket.org/alexeyknyshev>
                        <http://bitbucket.org/alexeyknyshev
        <http://bitbucket.org/alexeyknyshev>>
                        <https://bitbucket.org/alexeyknyshev/
        <https://bitbucket.org/alexeyknyshev/>
                        <https://bitbucket.org/alexeyknyshev/
        <https://bitbucket.org/alexeyknyshev/>>>


                        _______________________________________________
                        cfe-dev mailing list
        [hidden email] <mailto:[hidden email]>
        <mailto:[hidden email] <mailto:[hidden email]>>
        http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
        <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>
                       
        <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
        <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>>



                    --             Best regards,
                    Aleksei Sidorin,
                    SRR, Samsung Electronics


                    _______________________________________________
                    cfe-dev mailing list
        [hidden email] <mailto:[hidden email]>
        <mailto:[hidden email] <mailto:[hidden email]>>
        http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
        <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>
                   
        <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
        <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>>





            -- linkedin.com/profile <http://linkedin.com/profile>
           
        <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw
        <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw>>

        github.com/alexeyknyshev <http://github.com/alexeyknyshev>
        <http://github.com/alexeyknyshev
        <http://github.com/alexeyknyshev>>
        bitbucket.org/alexeyknyshev
        <http://bitbucket.org/alexeyknyshev>
        <https://bitbucket.org/alexeyknyshev/
        <https://bitbucket.org/alexeyknyshev/>>




        --         linkedin.com/profile <http://linkedin.com/profile>
        <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw
        <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw>>

        github.com/alexeyknyshev <http://github.com/alexeyknyshev>
        <http://github.com/alexeyknyshev
        <http://github.com/alexeyknyshev>>
        bitbucket.org/alexeyknyshev
        <http://bitbucket.org/alexeyknyshev>
        <https://bitbucket.org/alexeyknyshev/
        <https://bitbucket.org/alexeyknyshev/>>





--
linkedin.com/profile <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw>

github.com/alexeyknyshev <http://github.com/alexeyknyshev>
bitbucket.org/alexeyknyshev <https://bitbucket.org/alexeyknyshev/>




--


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev