[analyzer] Bug post-processing questions

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

[analyzer] Bug post-processing questions

Dennis Luehring via cfe-dev
Hi,

I'd love to ask for some help to understand how bug post-processing could fit into the analyzer's pipeline.

My current rough understanding is that an ExplodedGraph is built for each top-level function (including other functions reachable from it). During analysis, reports (grouped into equivalence classes) are collected in the ExprEngine's BugReporter's EQClassesVector, and flushed after the construction is finished. Flushing includes selecting a 'representative' report from each class to generate a PathDiagnostic from, which is influenced by visitors and such. After some post-processing (pruning, filtering), diagnostics to display are collected into PathDiagnosticConsumer's Diags set, and actually printed in some form in the end. As I see it now, this whole process is repeated for the next untouched top-level function in the CallGraph.

I was thinking about whether the whole ExplodedGraph needs to be re-built if we want to post-process bugs occured in its realm. After looking at the EGs of some code snippets analyzed using the default solver and then Z3, it seems to me that constraints, but also environment and store contents are changing, so one guess would be a yes. But again, completely re-building EGs might not be far in slow-down from analyzing with Z3 from the beginning, depending on what portion of the code is buggy. So one could work with the information already present in the graph (this might need telling the analyzer core not to throw away stuff it does not understand when the false positive refutation option is turned on, if such thing happens?), and re-evaluate those, perhaps around the time we call FindReportInEquivalenceClass().

But I might be on a completely wrong track, so I'd love to hear your thoughts about it :)

Thanks,
Réka


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [analyzer] Bug post-processing questions

Dennis Luehring via cfe-dev
Hi Réka,

I was thinking about whether the whole ExplodedGraph needs to be re-built if we want to post-process bugs occured in its realm.

Generally all of post-processing is done in the `BugReporterVisitors.cpp` file,
with general entry point being `trackNullOrUndefValue` function for many checkers.

The bug reporter visitor gets access to the last node in the exploded graph, but can generally very easily retrace the past,
by going up the chain of predecessors.

After looking at the EGs of some code snippets analyzed using the default solver and then Z3, it seems to me that constraints, but also environment and store contents are changing, so one guess would be a yes.

In general, I think it should be possible to get a good increase in precision just by re-analyzing the existing exploded graph.
From my understanding, many constraints are different because the constraint manager tries to be as efficient as possible,
and destroys all constraints which can not be processed by the current solver.
If a further increase in precision is needed, it should not be very difficult to change that destroying logic to keep those constraints as well.

But again, completely re-building EGs might not be far in slow-down from analyzing with Z3 from the beginning, depending on what portion of the code is buggy.

Yes, that’s why post-processing is done.

So one could work with the information already present in the graph (this might need telling the analyzer core not to throw away stuff it does not understand when the false positive refutation option is turned on, if such thing happens?), and re-evaluate those, perhaps around the time we call FindReportInEquivalenceClass().

As mentioned before, it’s better to do those in your own visitor inside BugReporterVisitors.cpp.
Visitors get the `BugReport` object, and can call `BugReport::markInvalid` if invalidation is needed.

Regards,
George

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [analyzer] Bug post-processing questions

Dennis Luehring via cfe-dev


On 05/03/2018 2:30 PM, George Karpenkov via cfe-dev wrote:

> Hi Réka,
>
>> I was thinking about whether the whole ExplodedGraph needs to be
>> re-built if we want to post-process bugs occured in its realm.
>
> Generally all of post-processing is done in the
> `BugReporterVisitors.cpp` file,
> with general entry point being `trackNullOrUndefValue` function for
> many checkers.
>
> The bug reporter visitor gets access to the last node in the exploded
> graph, but can generally very easily retrace the past,
> by going up the chain of predecessors.
>
>> After looking at the EGs of some code snippets analyzed using the
>> default solver and then Z3, it seems to me that constraints, but also
>> environment and store contents are changing, so one guess would be a yes.
>
> In general, I think it should be possible to get a good increase in
> precision just by re-analyzing the existing exploded graph.
> From my understanding, many constraints are different because the
> constraint manager tries to be as efficient as possible,
> and destroys all constraints which can not be processed by the current
> solver.

I suspect that simply not destroying constraints on symbol that we
cannot handle would already be a slight increase in precision. Because
at least we can take the same path when the same symbol we cannot handle
is encountered twice. Eg.:

if (x^3 + y^3 == z^3) {
   // assume any branch, mark symbol (($x)^3 + ($y)^3 == ($z)^3)
   // as [ 0, 0 ] or [-INT_MAX, -1] U [1, INT_MAX] respectively
}

if (x^3 + y^3 == z^3) {
   // take the same branch that has been taken
   // on the previous if.
}

Which leads me into believing that regardless of the solver, keeping
those around is a good thing. I guess they are only removed for
performance reasons, not because keeping them around was causing false
positives. And i'm in favor of limiting symbol complexity to avoid
performance problems (this way we'd handle all obvious cases, but if the
arithmetic is ridiculously complex - the user would be more likely to
forgive us).

> If a further increase in precision is needed, it should not be very
> difficult to change that destroying logic to keep those constraints as
> well.
>
>> But again, completely re-building EGs might not be far in slow-down
>> from analyzing with Z3 from the beginning, depending on what portion
>> of the code is buggy.
>
> Yes, that’s why post-processing is done.
>
>> So one could work with the information already present in the graph
>> (this might need telling the analyzer core not to throw away stuff it
>> does not understand when the false positive refutation option is
>> turned on, if such thing happens?), and re-evaluate those, perhaps
>> around the time we call FindReportInEquivalenceClass().
>
> As mentioned before, it’s better to do those in your own visitor
> inside BugReporterVisitors.cpp.
> Visitors get the `BugReport` object, and can call
> `BugReport::markInvalid` if invalidation is needed.
>
> Regards,
> George
>
>
> _______________________________________________
> cfe-dev mailing list
> [hidden email]
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev