[analyzer] Adding build bot for static analyzer reference results


[analyzer] Adding build bot for static analyzer reference results

Renato Golin via cfe-dev
Hi all,

We’re planning to add a public Apple build bot for the static analyzer to Green Dragon (http://lab.llvm.org:8080/green/). I’d like to get your feedback on our proposed approach.

The goal of this bot is to catch unexpected analyzer regressions, crashes, and coverage loss by periodically running the analyzer on a suite of open-source benchmarks. The bot will compare the produced path diagnostics to reference results. If these do not match, we will e-mail the committers and a small set of interested people. (Let us know if you want to be notified on every failure.) We’d like to make it easy for the community to respond to analyzer regressions and update the reference results.
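The comparison step can be pictured with a simplified sketch. This is not CmpRuns.py itself, and the way each diagnostic is reduced to a (file, line, type, description) key is our own illustration; it only shows the idea of diffing produced plists against reference results:

```python
import plistlib
from pathlib import Path

def diagnostic_keys(plist_path):
    """Reduce each path diagnostic in an analyzer output plist to a
    hashable (file, line, type, description) summary."""
    with open(plist_path, "rb") as f:
        data = plistlib.load(f)
    files = data.get("files", [])
    keys = set()
    for diag in data.get("diagnostics", []):
        loc = diag.get("location", {})
        keys.add((
            files[loc.get("file", 0)] if files else "",
            loc.get("line"),
            diag.get("type"),
            diag.get("description"),
        ))
    return keys

def compare(reference_dir, produced_dir):
    """Return (added, lost) diagnostics relative to the reference results."""
    ref, new = set(), set()
    for p in Path(reference_dir).glob("*.plist"):
        ref |= diagnostic_keys(p)
    for p in Path(produced_dir).glob("*.plist"):
        new |= diagnostic_keys(p)
    return sorted(new - ref), sorted(ref - new)
```

A non-empty "added" or "lost" list is what would trigger the notification e-mail in the workflow described above.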

We currently have an Apple-internal static analyzer build bot and have found it helpful for catching mistakes that make it past the normal tests. The main downside is that the results need to be updated when new checks are added or the analyzer output changes.

We propose taking a “curl + cache” approach to benchmarks. That is, we won’t store the benchmarks themselves in a repository. Instead, the bots will download them from the projects' websites and cache them locally. If we need to change the benchmarks (to get them to compile with newer versions of clang, for example) we will represent these changes as patch sets which will be applied to the downloaded version. Both the patch sets and the reference results will be checked into the llvm.org/zorg repository so anyone with commit access will be able to update them. The bot will use the CmpRuns.py script (in clang’s utils/analyzer/) to compare the produced path diagnostic plists to the reference results.
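Under stated assumptions (the cache location, helper names, and checksum policy below are all hypothetical, not the bot's actual implementation), the “curl + cache” flow might be sketched as:

```python
import hashlib
import subprocess
import urllib.request
from pathlib import Path

# Hypothetical cache location; the real bot would choose its own.
DEFAULT_CACHE = Path.home() / ".analyzer-bench-cache"

def fetch_benchmark(name, url, sha256, cache=DEFAULT_CACHE):
    """Download a pinned benchmark tarball once, verify it against a
    known checksum, and reuse the cached copy on later runs."""
    cache = Path(cache)
    cache.mkdir(parents=True, exist_ok=True)
    tarball = cache / f"{name}.tar.gz"
    if not tarball.exists():
        urllib.request.urlretrieve(url, tarball)
    digest = hashlib.sha256(tarball.read_bytes()).hexdigest()
    if digest != sha256:
        raise RuntimeError(f"{name}: checksum mismatch; refusing to use download")
    return tarball

def apply_patches(source_dir, patch_dir):
    """Apply the checked-in patch set (the files that would live in zorg)
    to the unpacked benchmark sources, in a stable order."""
    for patch in sorted(Path(patch_dir).glob("*.patch")):
        subprocess.run(
            ["patch", "-p1", "-d", str(source_dir), "-i", str(patch)],
            check=True,
        )
```

Verifying a checksum on every run also addresses the trust concern raised later in this thread: even though the download comes from an external site, a changed or tampered tarball would be rejected rather than silently analyzed.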

We’d very much appreciate feedback on this proposed approach. We’d also like to solicit suggestions for benchmarks, which we hope to grow over time. We think sqlite, postgresql, openssl, and Adium (for Objective-C coverage) are good initial benchmarks — but we’d like to add C++ benchmarks as well (perhaps LLVM?).

Devin Coughlin
Apple Program Analysis Team
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

Re: [analyzer] Adding build bot for static analyzer reference results

Renato Golin via cfe-dev


On Mon, Sep 28, 2015 at 4:25 PM, Devin Coughlin via cfe-dev <[hidden email]> wrote:
Hi all,

We’re planning to add a public Apple build bot for the static analyzer to Green Dragon (http://lab.llvm.org:8080/green/). I’d like to get your feedback on our proposed approach.

The goal of this bot is to catch unexpected analyzer regressions, crashes, and coverage loss by periodically running the analyzer on a suite of open-source benchmarks. The bot will compare the produced path diagnostics to reference results. If these do not match, we will e-mail the committers and a small set of interested people. (Let us know if you want to be notified on every failure.) We’d like to make it easy for the community to respond to analyzer regressions and update the reference results.

We currently have an Apple-internal static analyzer build bot and have found it helpful for catching mistakes that make it past the normal tests. The main downside is that the results need to be updated when new checks are added or the analyzer output changes.

We propose taking a “curl + cache” approach to benchmarks. That is, we won’t store the benchmarks themselves in a repository. Instead, the bots will download them from the projects' websites and cache locally.

If we're going to be downloading things from external sources, those sources could be changing, no? Or will we pin to a specific version - if we're pinning to a specific version, what's the benefit to taking an external dependency like that (untrusted, may be down when we need it, etc), compared to copying the files permanently & checking them in to clang-tests (or clang-tests-external) as I did for GDB?

If we're interested in catching regressions in both the external code and our code (which I'm interested in doing for GDB, but haven't had time), I can see why it'd make sense to track ToT of both projects, but that's a bit of a different goal - and we'd probably want someone to triage those failures before mailing the developers who committed the changes (because many regressions will be due to the external project changing, not the LLVM developer's change causing a regression).
 
If we need to change the benchmarks (to get them to compile with newer versions of clang, for example) we will represent these changes as patch sets which will be applied to the downloaded version. Both these patch sets and the reference results will be checked into the llvm.org/zorg repository so anyone with commit access will be able to update them. The bot will use the CmpRuns.py script (in clang’s utils/analyzer/) to compare the produced path diagnostic plists to the reference results.

We’d very much appreciate feedback on this proposed approach. We’d also like to solicit suggestions for benchmarks, which we hope to grow over time. We think sqlite, postgresql, openssl, and Adium (for Objective-C coverage) are good initial benchmarks — but we’d like to add C++ benchmarks as well (perhaps LLVM?).

Devin Coughlin
Apple Program Analysis Team

Re: [analyzer] Adding build bot for static analyzer reference results

Renato Golin via cfe-dev
On Sep 28, 2015, at 4:33 PM, David Blaikie <[hidden email]> wrote:
On Mon, Sep 28, 2015 at 4:25 PM, Devin Coughlin via cfe-dev <[hidden email]> wrote:

We propose taking a “curl + cache” approach to benchmarks. That is, we won’t store the benchmarks themselves in a repository. Instead, the bots will download them from the projects' websites and cache locally.

If we're going to be downloading things from external sources, those sources could be changing, no? Or will we pin to a specific version?

We will pin it to a specific version, although we may want to periodically (yearly or even less frequently) update to a newer version of the benchmarks to make sure we get good coverage of newer language features and to keep the benchmarks compiling with ToT clang.

if we're pinning to a specific version, what's the benefit to taking an external dependency like that (untrusted, may be down when we need it, etc), compared to copying the files permanently & checking them in to clang-tests (or clang-tests-external) as I did for GDB?

The goal here is to avoid unnecessary co-mingling of benchmarks with different licenses in the llvm.org repositories — but as you point out, it comes with significant downsides. What has your experience with gdb and clang-tests-external been like? Has dealing with differing licensing w/r/t clang been a challenge?

Devin


Re: [analyzer] Adding build bot for static analyzer reference results

Renato Golin via cfe-dev


On Mon, Sep 28, 2015 at 5:02 PM, Devin Coughlin <[hidden email]> wrote:
On Sep 28, 2015, at 4:33 PM, David Blaikie <[hidden email]> wrote:
On Mon, Sep 28, 2015 at 4:25 PM, Devin Coughlin via cfe-dev <[hidden email]> wrote:

We propose taking a “curl + cache” approach to benchmarks. That is, we won’t store the benchmarks themselves in a repository. Instead, the bots will download them from the projects' websites and cache locally.

If we're going to be downloading things from external sources, those sources could be changing, no? Or will we pin to a specific version?

We will pin it to a specific version, although we may want to periodically (yearly or even less frequently) update to a newer version of the benchmarks to make sure we get good coverage of newer language features and to keep the benchmarks compiling with ToT clang.

if we're pinning to a specific version, what's the benefit to taking an external dependency like that (untrusted, may be down when we need it, etc), compared to copying the files permanently & checking them in to clang-tests (or clang-tests-external) as I did for GDB?

The goal here is to avoid unnecessary co-mingling of benchmarks with different licenses in the llvm.org repositories — but as you point out, it comes with significant downsides. What has your experience with gdb and clang-tests-external been like? Has dealing with differing licensing w/r/t clang been a challenge?

I don't believe so - we just put it out in a separate repository from clang-tests, to ensure those who might have reasons for not wanting to view code under such a license could ensure they didn't accidentally run across it.

But as usual, legal/licensing issues should be directed at lawyers, of which I am not one.

- David



Re: [analyzer] Adding build bot for static analyzer reference results

Renato Golin via cfe-dev
In reply to this post by Renato Golin via cfe-dev
Sending emails to people who change the results of the static analyzer seems fine. I'm concerned that catching performance regressions in the analyzer might have some false positives, though. The static analyzer is fairly isolated, so maybe there won't be many false positives, but if it becomes a problem, we should probably just disable this part of the reporting and simply track performance over time.

On Mon, Sep 28, 2015 at 4:25 PM, Devin Coughlin via cfe-dev <[hidden email]> wrote:
Hi all,

We’re planning to add a public Apple build bot for the static analyzer to Green Dragon (http://lab.llvm.org:8080/green/). I’d like to get your feedback on our proposed approach.

The goal of this bot is to catch unexpected analyzer regressions, crashes, and coverage loss by periodically running the analyzer on a suite of open-source benchmarks. The bot will compare the produced path diagnostics to reference results. If these do not match, we will e-mail the committers and a small set of interested people. (Let us know if you want to be notified on every failure.) We’d like to make it easy for the community to respond to analyzer regressions and update the reference results.

We currently have an Apple-internal static analyzer build bot and have found it helpful for catching mistakes that make it past the normal tests. The main downside is that the results need to be updated when new checks are added or the analyzer output changes.

We propose taking a “curl + cache” approach to benchmarks. That is, we won’t store the benchmarks themselves in a repository. Instead, the bots will download them from the projects' websites and cache locally. If we need to change the benchmarks (to get them to compile with newer versions of clang, for example) we will represent these changes as patch sets which will be applied to the downloaded version. Both these patch sets and the reference results will be checked into the llvm.org/zorg repository so anyone with commit access will be able to update them. The bot will use the CmpRuns.py script (in clang’s utils/analyzer/) to compare the produced path diagnostic plists to the reference results.

We’d very much appreciate feedback on this proposed approach. We’d also like to solicit suggestions for benchmarks, which we hope to grow over time. We think sqlite, postgresql, openssl, and Adium (for Objective-C coverage) are good initial benchmarks — but we’d like to add C++ benchmarks as well (perhaps LLVM?).

Devin Coughlin
Apple Program Analysis Team

Re: [analyzer] Adding build bot for static analyzer reference results

Renato Golin via cfe-dev

> On Sep 28, 2015, at 5:07 PM, Reid Kleckner <[hidden email]> wrote:
>
> Sending emails to people who change the results of the static analyzer seems fine. I'm concerned that catching performance regressions in the analyzer might have some false positives, though.

We’re not proposing to report performance regressions with this bot — only regressions in analyzer diagnostics. I think it would be very useful to track performance over time, but that is not something we plan to do initially with this bot.

Devin