More plans on Static Analyzer + Clang-Tidy interoperation.

classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

More plans on Static Analyzer + Clang-Tidy interoperation.

Manas via cfe-dev
After a bit of hiatus i'm reviving this work. The previous discussion is
at http://lists.llvm.org/pipermail/cfe-dev/2019-August/063092.html, also
http://lists.llvm.org/pipermail/cfe-dev/2019-September/063229.html. The
plan is not to turn the two Clang bug-finding tools into a single tool
entirely but rather to make them more interchangeable, so that users who
have existing integration of, say, static analyzer, could also have
clang-tidy integration for as-free-as-possible. In particular,
checks/checkers could be shared, which would resolve the constant
struggle of "where to put the checker?".

One thing i realized since the last discussion is that the more tools we
integrate, the more redundant compilation work do we perform. If we are
to compile the code + run static analyzer + clang-tidy separately over
it, that's 3 rebuilds of the AST from scratch. Whatever solution we
provide to run both tools, I'd much rather keep it at 2 (compilation +
*all* analysis) because static analysis is already expensive in terms of
build-time, no need to make it worse.

One core component of this plan is to teach clang-tidy how to emit
reports in various human-readable and machine-readable formats that the
static analyzer already supports. At this point i'm pretty much ready to
publish a clang::DiagnosticConsumer that'd produce all kinds of static
analyzer-style reports. In particular, such consumer would enable
clang-tidy to be used from inside scan-build instead of clang --analyze;
both clang-tidy checks and static analyzer checkers would be ran from
inside clang-tidy binary but produce html reports consumable by
scan-build; a common index.html report would be generated for all
checkers. I'm very much in favor of teaching scan-build how to run
arbitrary clang tools, not just clang-tidy (i.e., "scan-build
--tool=/path/to/my-clang-tool" or something like that) which would allow
users who don't have CMake compilation databases to take advantage of
clang tools more easily (and we have a few users who are interested in
that).

On the other hand, we've also made up our mind that for ourselves we do
in fact want produce a clang binary with clang-tidy checkers integrated.
Apart from free integration into a number of our CI systems, that'll
allow us to avoid shipping and supporting the clang-tidy command line
interface as a separate feature. That's about 7MB increase in clang
binary size which we're fine with. I plan to make it an off-by-default
cmake flag unless there are strong opinions to do the opposite. The
alternative approach to move ourselves into a separate binary that's
integrated at the Driver level would also technically work but that's
too disruptive to commit to for us at the moment - even just the Driver
work alone would require a lot of testing, let alone making sure that
static analyzer works exactly as before from within the tool (it already
works from inside clang-tidy but we don't really know if it actually
works *the same* in all the potential cornercases).

So, like, we want to support multiple workflows.

Also i'll be making occasional commits to some clang-tidy checks that
we're interested in - mostly to address a few false positives (say, some
checkers aren't aware of some Objective-C constructs), but also
sometimes tweak the warning text a little bit (for example,
bugprone-assert-side-effect is an awesome check but the warning text
"found assert() with side effect" sounds fairly un-compiler-ish - we're
not playing hide-and-seek!). Hope i'm welcome with these changes ^.^


_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: More plans on Static Analyzer + Clang-Tidy interoperation.

Manas via cfe-dev
On Mon, Oct 12, 2020 at 5:26 PM Artem Dergachev via cfe-dev
<[hidden email]> wrote:

>
> After a bit of hiatus i'm reviving this work. The previous discussion is
> at http://lists.llvm.org/pipermail/cfe-dev/2019-August/063092.html, also
> http://lists.llvm.org/pipermail/cfe-dev/2019-September/063229.html. The
> plan is not to turn the two Clang bug-finding tools into a single tool
> entirely but rather to make them more interchangeable, so that users who
> have existing integration of, say, static analyzer, could also have
> clang-tidy integration for as-free-as-possible. In particular,
> checks/checkers could be shared, which would resolve the constant
> struggle of "where to put the checker?".

Thank you for looking into this topic again!

> One thing i realized since the last discussion is that the more tools we
> integrate, the more redundant compilation work do we perform. If we are
> to compile the code + run static analyzer + clang-tidy separately over
> it, that's 3 rebuilds of the AST from scratch. Whatever solution we
> provide to run both tools, I'd much rather keep it at 2 (compilation +
> *all* analysis) because static analysis is already expensive in terms of
> build-time, no need to make it worse.

+1

> One core component of this plan is to teach clang-tidy how to emit
> reports in various human-readable and machine-readable formats that the
> static analyzer already supports. At this point i'm pretty much ready to
> publish a clang::DiagnosticConsumer that'd produce all kinds of static
> analyzer-style reports. In particular, such consumer would enable
> clang-tidy to be used from inside scan-build instead of clang --analyze;
> both clang-tidy checks and static analyzer checkers would be ran from
> inside clang-tidy binary but produce html reports consumable by
> scan-build; a common index.html report would be generated for all
> checkers. I'm very much in favor of teaching scan-build how to run
> arbitrary clang tools, not just clang-tidy (i.e., "scan-build
> --tool=/path/to/my-clang-tool" or something like that) which would allow
> users who don't have CMake compilation databases to take advantage of
> clang tools more easily (and we have a few users who are interested in
> that).

I think this sounds really useful.

> On the other hand, we've also made up our mind that for ourselves we do
> in fact want produce a clang binary with clang-tidy checkers integrated.

I continue to disagree with that stance, FWIW. I want to see clang
able to run clang-tidy checks the same as clang static analyzer checks
because having a single tool to run which produces diagnostic results
is easier than running multiple tools in build systems or CI
pipelines. I realize that the community may not share this position,
but I hope we're willing to revisit the discussion.


> Apart from free integration into a number of our CI systems, that'll
> allow us to avoid shipping and supporting the clang-tidy command line
> interface as a separate feature. That's about 7MB increase in clang
> binary size which we're fine with. I plan to make it an off-by-default
> cmake flag unless there are strong opinions to do the opposite.

I don't have a strong opinion to do the opposite, but I'd like to at
least explore it as an option.

> The
> alternative approach to move ourselves into a separate binary that's
> integrated at the Driver level would also technically work but that's
> too disruptive to commit to for us at the moment - even just the Driver
> work alone would require a lot of testing, let alone making sure that
> static analyzer works exactly as before from within the tool (it already
> works from inside clang-tidy but we don't really know if it actually
> works *the same* in all the potential cornercases).

I think this approach would add more complexity to the question "what
tool do I reach for".

> So, like, we want to support multiple workflows.
>
> Also i'll be making occasional commits to some clang-tidy checks that
> we're interested in - mostly to address a few false positives (say, some
> checkers aren't aware of some Objective-C constructs), but also
> sometimes tweak the warning text a little bit (for example,
> bugprone-assert-side-effect is an awesome check but the warning text
> "found assert() with side effect" sounds fairly un-compiler-ish - we're
> not playing hide-and-seek!). Hope i'm welcome with these changes ^.^

Most certainly! One other thing that I think we should think about is:
now that we're looking to more tightly couple the static analyzer and
clang-tidy, should we unify their diagnostic message styles?
clang-tidy follows the Clang conventions and does not use
grammatically correct diagnostic messages (no capitalization or
punctuation) while the static analyzer has its own convention
(capitalization but no punctuation). Relatedly, do you expect to
expose some of the handy diagnostic utilities from the static analyzer
to clang-tidy checks, such as the ability to report a path that led to
a diagnostic (for the few checks that track that sort of information)?

Thanks!

~Aaron

>
>
> _______________________________________________
> cfe-dev mailing list
> [hidden email]
> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: More plans on Static Analyzer + Clang-Tidy interoperation.

Manas via cfe-dev
In reply to this post by Manas via cfe-dev
This all sounds good to me, as long as it is possible to continue to disable all the analyzers to produce a smaller clang binary. In Chromium, we decided it was worth doing this for a 7% size improvement: https://crrev.com/c/1437255 We ship clang-tidy, and pay the runtime cost of a second compilation for static analysis.

Now we are no longer in the strange situation of, "small" checks are in the analyzer, "big" checks (using ASTMatchers? except the static analyzer uses those now...) are in clang-tidy, everything is the same.

Reid

On Mon, Oct 12, 2020 at 2:26 PM Artem Dergachev via cfe-dev <[hidden email]> wrote:
After a bit of hiatus i'm reviving this work. The previous discussion is
at http://lists.llvm.org/pipermail/cfe-dev/2019-August/063092.html, also
http://lists.llvm.org/pipermail/cfe-dev/2019-September/063229.html. The
plan is not to turn the two Clang bug-finding tools into a single tool
entirely but rather to make them more interchangeable, so that users who
have existing integration of, say, static analyzer, could also have
clang-tidy integration for as-free-as-possible. In particular,
checks/checkers could be shared, which would resolve the constant
struggle of "where to put the checker?".

One thing i realized since the last discussion is that the more tools we
integrate, the more redundant compilation work do we perform. If we are
to compile the code + run static analyzer + clang-tidy separately over
it, that's 3 rebuilds of the AST from scratch. Whatever solution we
provide to run both tools, I'd much rather keep it at 2 (compilation +
*all* analysis) because static analysis is already expensive in terms of
build-time, no need to make it worse.

One core component of this plan is to teach clang-tidy how to emit
reports in various human-readable and machine-readable formats that the
static analyzer already supports. At this point i'm pretty much ready to
publish a clang::DiagnosticConsumer that'd produce all kinds of static
analyzer-style reports. In particular, such consumer would enable
clang-tidy to be used from inside scan-build instead of clang --analyze;
both clang-tidy checks and static analyzer checkers would be ran from
inside clang-tidy binary but produce html reports consumable by
scan-build; a common index.html report would be generated for all
checkers. I'm very much in favor of teaching scan-build how to run
arbitrary clang tools, not just clang-tidy (i.e., "scan-build
--tool=/path/to/my-clang-tool" or something like that) which would allow
users who don't have CMake compilation databases to take advantage of
clang tools more easily (and we have a few users who are interested in
that).

On the other hand, we've also made up our mind that for ourselves we do
in fact want produce a clang binary with clang-tidy checkers integrated.
Apart from free integration into a number of our CI systems, that'll
allow us to avoid shipping and supporting the clang-tidy command line
interface as a separate feature. That's about 7MB increase in clang
binary size which we're fine with. I plan to make it an off-by-default
cmake flag unless there are strong opinions to do the opposite. The
alternative approach to move ourselves into a separate binary that's
integrated at the Driver level would also technically work but that's
too disruptive to commit to for us at the moment - even just the Driver
work alone would require a lot of testing, let alone making sure that
static analyzer works exactly as before from within the tool (it already
works from inside clang-tidy but we don't really know if it actually
works *the same* in all the potential cornercases).

So, like, we want to support multiple workflows.

Also i'll be making occasional commits to some clang-tidy checks that
we're interested in - mostly to address a few false positives (say, some
checkers aren't aware of some Objective-C constructs), but also
sometimes tweak the warning text a little bit (for example,
bugprone-assert-side-effect is an awesome check but the warning text
"found assert() with side effect" sounds fairly un-compiler-ish - we're
not playing hide-and-seek!). Hope i'm welcome with these changes ^.^


_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: More plans on Static Analyzer + Clang-Tidy interoperation.

Manas via cfe-dev
In reply to this post by Manas via cfe-dev
On Tue, 13 Oct 2020 at 10:26, Artem Dergachev via cfe-dev
<[hidden email]> wrote:

>
> After a bit of hiatus i'm reviving this work. The previous discussion is
> at http://lists.llvm.org/pipermail/cfe-dev/2019-August/063092.html, also
> http://lists.llvm.org/pipermail/cfe-dev/2019-September/063229.html. The
> plan is not to turn the two Clang bug-finding tools into a single tool
> entirely but rather to make them more interchangeable, so that users who
> have existing integration of, say, static analyzer, could also have
> clang-tidy integration for as-free-as-possible. In particular,
> checks/checkers could be shared, which would resolve the constant
> struggle of "where to put the checker?".

Please can you keep in mind 'third-party' initiatives too?
eg. https://bugs.llvm.org/show_bug.cgi?id=32739
This bug is related to clazy, i'm not speaking for the project here,
but AFAIK, having checks out of tree means that you need to maintain
your own 'front-end' (tooling front-end, not compiler's one)

I use both clang-tidy and clazy in CI, i do not use the static
analyser as AFAIU, it is run by clang-tidy already.

> One thing i realized since the last discussion is that the more tools we
> integrate, the more redundant compilation work do we perform. If we are
> to compile the code + run static analyzer + clang-tidy separately over
> it, that's 3 rebuilds of the AST from scratch. Whatever solution we
> provide to run both tools, I'd much rather keep it at 2 (compilation +
> *all* analysis) because static analysis is already expensive in terms of
> build-time, no need to make it worse.

Agree, but IMHO there is a use case to separate builds from analysis.
Eg. you're CI pipeline might want a dedicated up-front analysis stage,
which gate any later builds.
The rational being that there is no point wasting resources (build for
a bunch of arch/os), if you don't pass sanity checks.

IMHO  1 federated analysis followed by M compile is still better than
N analysis followed by M compile.

As well, sorry if it doesn't make much sense or it has been discussed
in the past, but what about storing the AST on disk, and used these as
tooling input?

My 2 cents.
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: More plans on Static Analyzer + Clang-Tidy interoperation.

Manas via cfe-dev
In reply to this post by Manas via cfe-dev
Hi Artem,

see some comments inline with DK>.

Regards,
Daniel
-----Original Message-----
From: cfe-dev <[hidden email]> On Behalf Of Artem Dergachev via cfe-dev
Sent: Monday, October 12, 2020 11:26 PM
To: cfe-dev <[hidden email]>; Valeriy Savchenko <[hidden email]>; Ravi <[hidden email]>; Dmitri Gribenko <[hidden email]>
Subject: [cfe-dev] More plans on Static Analyzer + Clang-Tidy interoperation.

After a bit of hiatus i'm reviving this work. The previous discussion is at https://protect2.fireeye.com/v1/url?k=c6a4cdef-9815168f-c6a48d74-86e2237f51fb-9ba675f7f1c1086e&q=1&e=8c04b238-977e-489f-a825-0e5c7cec934a&u=http%3A%2F%2Flists.llvm.org%2Fpipermail%2Fcfe-dev%2F2019-August%2F063092.html, also https://protect2.fireeye.com/v1/url?k=66abd4f1-381a0f91-66ab946a-86e2237f51fb-1172955ed582bc7a&q=1&e=8c04b238-977e-489f-a825-0e5c7cec934a&u=http%3A%2F%2Flists.llvm.org%2Fpipermail%2Fcfe-dev%2F2019-September%2F063229.html. The plan is not to turn the two Clang bug-finding tools into a single tool entirely but rather to make them more interchangeable, so that users who have existing integration of, say, static analyzer, could also have clang-tidy integration for as-free-as-possible. In particular, checks/checkers could be shared, which would resolve the constant struggle of "where to put the checker?".

One thing i realized since the last discussion is that the more tools we integrate, the more redundant compilation work do we perform. If we are to compile the code + run static analyzer + clang-tidy separately over it, that's 3 rebuilds of the AST from scratch. Whatever solution we provide to run both tools, I'd much rather keep it at 2 (compilation +
*all* analysis) because static analysis is already expensive in terms of build-time, no need to make it worse.


DK> Completely agree. Our users run both clang-tidy and clang static analyzer on their code from CodeChecker. The analysis is run for each TU separately by clang-tidy and separately by clang SA. It would be great if we could save the redundant parsing work. So if you add this option to static analyzer then we would run the clang-tidy check through this too (what about fixits? would they be generated?).
In CodeChecker we primarily use the machine readable plist format to generate any further output format to the user: command line output, static web html, or central server storage. We actually convert the output of clang tidy and other static analyzer tools (such as cppcheck, pylint...) too to plist (because it is machine readable).
We are open to contribute this plist_to_html converter tool to (https://github.com/Ericsson/codechecker/tree/master/tools/plist_to_html) the llvm repo if it is interesting to others too.

One core component of this plan is to teach clang-tidy how to emit reports in various human-readable and machine-readable formats that the static analyzer already supports. At this point i'm pretty much ready to publish a clang::DiagnosticConsumer that'd produce all kinds of static analyzer-style reports. In particular, such consumer would enable clang-tidy to be used from inside scan-build instead of clang --analyze; both clang-tidy checks and static analyzer checkers would be ran from inside clang-tidy binary but produce html reports consumable by scan-build; a common index.html report would be generated for all checkers. I'm very much in favor of teaching scan-build how to run arbitrary clang tools, not just clang-tidy (i.e., "scan-build --tool=/path/to/my-clang-tool" or something like that) which would allow users who don't have CMake compilation databases to take advantage of clang tools more easily (and we have a few users who are interested in that).

DK> good idea! Actually if we used the plist_to_html converter or similar (in python), we wouldn’t need to generate HTML code from clang, which gives greater flexibility (more libraries available in python for this etc.). One other thing we added to CodeChecker and which could be of generic interest: gcc to clang compilation database converter. We have many users who use gcc cross-compiled projects and who want to use clang static analyzer and clang-tidy to analyze their code. So in CodeChecker we implemented a conversion tool that transforms a gcc compilation database to clang compatible compilation database. It detects the built-in configuration of the gcc-compiler used for the original compilation (target architecture, built in defines, built in include paths), removes the gcc-only parameters, and creates a clang compatible compilation database with all this information. Then this can be used by any clang tool (tidy, clang-format, clangd etc.) to run analysis on gcc cross-compiled code without errors. This part could also be factored out as common component to LLVM if you guys find it interesting.

On the other hand, we've also made up our mind that for ourselves we do in fact want produce a clang binary with clang-tidy checkers integrated.
Apart from free integration into a number of our CI systems, that'll allow us to avoid shipping and supporting the clang-tidy command line interface as a separate feature. That's about 7MB increase in clang binary size which we're fine with. I plan to make it an off-by-default cmake flag unless there are strong opinions to do the opposite. The alternative approach to move ourselves into a separate binary that's integrated at the Driver level would also technically work but that's too disruptive to commit to for us at the moment - even just the Driver work alone would require a lot of testing, let alone making sure that static analyzer works exactly as before from within the tool (it already works from inside clang-tidy but we don't really know if it actually works *the same* in all the potential cornercases).

So, like, we want to support multiple workflows.

Also i'll be making occasional commits to some clang-tidy checks that we're interested in - mostly to address a few false positives (say, some checkers aren't aware of some Objective-C constructs), but also sometimes tweak the warning text a little bit (for example, bugprone-assert-side-effect is an awesome check but the warning text "found assert() with side effect" sounds fairly un-compiler-ish - we're not playing hide-and-seek!). Hope i'm welcome with these changes ^.^


_______________________________________________
cfe-dev mailing list
[hidden email]
https://protect2.fireeye.com/v1/url?k=5fe0386f-0151e30f-5fe078f4-86e2237f51fb-3106052e60c21d5a&q=1&e=8c04b238-977e-489f-a825-0e5c7cec934a&u=https%3A%2F%2Flists.llvm.org%2Fcgi-bin%2Fmailman%2Flistinfo%2Fcfe-dev
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: More plans on Static Analyzer + Clang-Tidy interoperation.

Manas via cfe-dev
In reply to this post by Manas via cfe-dev


On 10/13/20 5:47 AM, Aaron Ballman wrote:

> On Mon, Oct 12, 2020 at 5:26 PM Artem Dergachev via cfe-dev
> <[hidden email]> wrote:
>> After a bit of hiatus i'm reviving this work. The previous discussion is
>> at http://lists.llvm.org/pipermail/cfe-dev/2019-August/063092.html, also
>> http://lists.llvm.org/pipermail/cfe-dev/2019-September/063229.html. The
>> plan is not to turn the two Clang bug-finding tools into a single tool
>> entirely but rather to make them more interchangeable, so that users who
>> have existing integration of, say, static analyzer, could also have
>> clang-tidy integration for as-free-as-possible. In particular,
>> checks/checkers could be shared, which would resolve the constant
>> struggle of "where to put the checker?".
> Thank you for looking into this topic again!
>
>> One thing i realized since the last discussion is that the more tools we
>> integrate, the more redundant compilation work do we perform. If we are
>> to compile the code + run static analyzer + clang-tidy separately over
>> it, that's 3 rebuilds of the AST from scratch. Whatever solution we
>> provide to run both tools, I'd much rather keep it at 2 (compilation +
>> *all* analysis) because static analysis is already expensive in terms of
>> build-time, no need to make it worse.
> +1
>
>> One core component of this plan is to teach clang-tidy how to emit
>> reports in various human-readable and machine-readable formats that the
>> static analyzer already supports. At this point i'm pretty much ready to
>> publish a clang::DiagnosticConsumer that'd produce all kinds of static
>> analyzer-style reports. In particular, such consumer would enable
>> clang-tidy to be used from inside scan-build instead of clang --analyze;
>> both clang-tidy checks and static analyzer checkers would be ran from
>> inside clang-tidy binary but produce html reports consumable by
>> scan-build; a common index.html report would be generated for all
>> checkers. I'm very much in favor of teaching scan-build how to run
>> arbitrary clang tools, not just clang-tidy (i.e., "scan-build
>> --tool=/path/to/my-clang-tool" or something like that) which would allow
>> users who don't have CMake compilation databases to take advantage of
>> clang tools more easily (and we have a few users who are interested in
>> that).
> I think this sounds really useful.
>
>> On the other hand, we've also made up our mind that for ourselves we do
>> in fact want produce a clang binary with clang-tidy checkers integrated.
> I continue to disagree with that stance, FWIW. I want to see clang
> able to run clang-tidy checks the same as clang static analyzer checks
> because having a single tool to run which produces diagnostic results
> is easier than running multiple tools in build systems or CI
> pipelines. I realize that the community may not share this position,
> but I hope we're willing to revisit the discussion.

Just to double-check, sounds like we two actually *agree* on this(?)
like, we both want clang with clang-tidy checks?

>> Apart from free integration into a number of our CI systems, that'll
>> allow us to avoid shipping and supporting the clang-tidy command line
>> interface as a separate feature. That's about 7MB increase in clang
>> binary size which we're fine with. I plan to make it an off-by-default
>> cmake flag unless there are strong opinions to do the opposite.
> I don't have a strong opinion to do the opposite, but I'd like to at
> least explore it as an option.
>
>> The
>> alternative approach to move ourselves into a separate binary that's
>> integrated at the Driver level would also technically work but that's
>> too disruptive to commit to for us at the moment - even just the Driver
>> work alone would require a lot of testing, let alone making sure that
>> static analyzer works exactly as before from within the tool (it already
>> works from inside clang-tidy but we don't really know if it actually
>> works *the same* in all the potential cornercases).
> I think this approach would add more complexity to the question "what
> tool do I reach for".
>
>> So, like, we want to support multiple workflows.
>>
>> Also i'll be making occasional commits to some clang-tidy checks that
>> we're interested in - mostly to address a few false positives (say, some
>> checkers aren't aware of some Objective-C constructs), but also
>> sometimes tweak the warning text a little bit (for example,
>> bugprone-assert-side-effect is an awesome check but the warning text
>> "found assert() with side effect" sounds fairly un-compiler-ish - we're
>> not playing hide-and-seek!). Hope i'm welcome with these changes ^.^
> Most certainly! One other thing that I think we should think about is:
> now that we're looking to more tightly couple the static analyzer and
> clang-tidy, should we unify their diagnostic message styles?
> clang-tidy follows the Clang conventions and does not use
> grammatically correct diagnostic messages (no capitalization or
> punctuation) while the static analyzer has its own convention
> (capitalization but no punctuation).

This is an important thing to do for any UI that displays both static
analyzer warnings and clang-tidy warnings; different capitalization is
indeed a bummer.

It's not uncommon for UIs/IDEs to re-capitalize all warnings into a
single style. If we suddenly change static analyzer warnings to be
lowercase now, i think most UIs will simply re-capitalize them back, so
that to preserve the existing experience.

I could also teach my intermediate DiagnosticConsumer to do the same.

Or i could teach individual PathDiagnosticConsumers (the ones that are
responsible for static analyzer-style outputs) to do that for themselves
specifically - that actually sounds like a very good plan, probably
*the* way forward.

> Relatedly, do you expect to
> expose some of the handy diagnostic utilities from the static analyzer
> to clang-tidy checks, such as the ability to report a path that led to
> a diagnostic (for the few checks that track that sort of information)?

These facilities will be exposed once https://reviews.llvm.org/D67422 
actually lands.

That said, i don't think path notes specifically will be useful for any
of the existing clang-tidy checks. IIUC the only checker so far that
could potentially make use of these notes is the use-after-move checker
- but it doesn't actually find the path, it only finds two points on the
CFG with a happens-before relation. Static analyzer path notes make
sense because every single point in the path potentially contributes to
the warning. Which implies that behind the warning there's a *huge*
facility that understands potential implications of *every* statement.
When it's just two points that contribute to the path, two notes are enough.

One thing i'm excited about, though, is eventually producing a good bug
visualization for flow-sensitive warnings. As i mentioned a few times,
flow-sensitive warnings should be displayed together rather than
separately in order to be understandable: "This code is dead *because*
this condition is always false *because* this assignment to the variable
that participates in the condition is a dead store". I believe static
analyzer's rich diagnostics system could contribute to that dream.

> Thanks!
>
> ~Aaron
>
>>
>> _______________________________________________
>> cfe-dev mailing list
>> [hidden email]
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: More plans on Static Analyzer + Clang-Tidy interoperation.

Manas via cfe-dev
In reply to this post by Manas via cfe-dev


On 10/14/20 6:28 AM, Dániel Krupp wrote:

> Hi Artem,
>
> see some comments inline with DK>.
>
> Regards,
> Daniel
> -----Original Message-----
> From: cfe-dev <[hidden email]> On Behalf Of Artem Dergachev via cfe-dev
> Sent: Monday, October 12, 2020 11:26 PM
> To: cfe-dev <[hidden email]>; Valeriy Savchenko <[hidden email]>; Ravi <[hidden email]>; Dmitri Gribenko <[hidden email]>
> Subject: [cfe-dev] More plans on Static Analyzer + Clang-Tidy interoperation.
>
>> After a bit of hiatus i'm reviving this work. The previous discussion is at https://protect2.fireeye.com/v1/url?k=c6a4cdef-9815168f-c6a48d74-86e2237f51fb-9ba675f7f1c1086e&q=1&e=8c04b238-977e-489f-a825-0e5c7cec934a&u=http%3A%2F%2Flists.llvm.org%2Fpipermail%2Fcfe-dev%2F2019-August%2F063092.html, also https://protect2.fireeye.com/v1/url?k=66abd4f1-381a0f91-66ab946a-86e2237f51fb-1172955ed582bc7a&q=1&e=8c04b238-977e-489f-a825-0e5c7cec934a&u=http%3A%2F%2Flists.llvm.org%2Fpipermail%2Fcfe-dev%2F2019-September%2F063229.html. The plan is not to turn the two Clang bug-finding tools into a single tool entirely but rather to make them more interchangeable, so that users who have existing integration of, say, static analyzer, could also have clang-tidy integration for as-free-as-possible. In particular, checks/checkers could be shared, which would resolve the constant struggle of "where to put the checker?".
>>
>> One thing i realized since the last discussion is that the more tools we integrate, the more redundant compilation work do we perform. If we are to compile the code + run static analyzer + clang-tidy separately over it, that's 3 rebuilds of the AST from scratch. Whatever solution we provide to run both tools, I'd much rather keep it at 2 (compilation +
>> *all* analysis) because static analysis is already expensive in terms of build-time, no need to make it worse.
>
>
> DK> Completely agree. Our users run both clang-tidy and clang static analyzer on their code from CodeChecker. The analysis is run for each TU separately by clang-tidy and separately by clang SA. It would be great if we could save the redundant parsing work. So if you add this option to static analyzer then we would run the clang-tidy check through this too (what about fixits? would they be generated?).

We have some support for fixits in plist output which i did exactly for
that purpose (https://reviews.llvm.org/D65182).

> In CodeChecker we primarily use the machine readable plist format to generate any further output format to the user: command line output, static web html, or central server storage. We actually convert the output of clang tidy and other static analyzer tools (such as cppcheck, pylint...) too to plist (because it is machine readable).
> We are open to contribute this plist_to_html converter tool to (https://github.com/Ericsson/codechecker/tree/master/tools/plist_to_html) the llvm repo if it is interesting to others too.

I'd be pretty happy to either have this converter somewhere in
utils/analyzer/ (because the problem of converting plists to htmls does
show up occasionally) or maybe even update scan-build to use it while
deprecating HTMLRewriter (as long as the quality of HTML reports doesn't
suffer; i suspect that with macro expansions available in plist reports
that's quite achievable).

>> One core component of this plan is to teach clang-tidy how to emit reports in various human-readable and machine-readable formats that the static analyzer already supports. At this point i'm pretty much ready to publish a clang::DiagnosticConsumer that'd produce all kinds of static analyzer-style reports. In particular, such consumer would enable clang-tidy to be used from inside scan-build instead of clang --analyze; both clang-tidy checks and static analyzer checkers would be ran from inside clang-tidy binary but produce html reports consumable by scan-build; a common index.html report would be generated for all checkers. I'm very much in favor of teaching scan-build how to run arbitrary clang tools, not just clang-tidy (i.e., "scan-build --tool=/path/to/my-clang-tool" or something like that) which would allow users who don't have CMake compilation databases to take advantage of clang tools more easily (and we have a few users who are interested in that).
>
> DK> good idea! Actually if we used the plist_to_html converter or similar (in python), we wouldn’t need to generate HTML code from clang, which gives greater flexibility (more libraries available in python for this etc.). One other thing we added to CodeChecker and which could be of generic interest: gcc to clang compilation database converter. We have many users who use gcc cross-compiled projects and who want to use clang static analyzer and clang-tidy to analyze their code. So in CodeChecker we implemented a conversion tool that transforms a gcc compilation database to clang compatible compilation database. It detects the built-in configuration of the gcc-compiler used for the original compilation (target architecture, built in defines, built in include paths), removes the gcc-only parameters, and creates a clang compatible compilation database with all this information. Then this can be used by any clang tool (tidy, clang-format, clangd etc.) to run analysis on gcc cross-compiled code without errors. This part could also be factored out as common component to LLVM if you guys find it interesting.

This one's probably not for me because i'm not an active user of
compilation databases but it sounds like a useful feature for a lot of
clang tooling.

>> On the other hand, we've also made up our mind that for ourselves we do in fact want produce a clang binary with clang-tidy checkers integrated.
>> Apart from free integration into a number of our CI systems, that'll allow us to avoid shipping and supporting the clang-tidy command line interface as a separate feature. That's about 7MB increase in clang binary size which we're fine with. I plan to make it an off-by-default cmake flag unless there are strong opinions to do the opposite. The alternative approach to move ourselves into a separate binary that's integrated at the Driver level would also technically work but that's too disruptive to commit to for us at the moment - even just the Driver work alone would require a lot of testing, let alone making sure that static analyzer works exactly as before from within the tool (it already works from inside clang-tidy but we don't really know if it actually works *the same* in all the potential cornercases).
>>
>> So, like, we want to support multiple workflows.
>>
>> Also i'll be making occasional commits to some clang-tidy checks that we're interested in - mostly to address a few false positives (say, some checkers aren't aware of some Objective-C constructs), but also sometimes tweak the warning text a little bit (for example, bugprone-assert-side-effect is an awesome check but the warning text "found assert() with side effect" sounds fairly un-compiler-ish - we're not playing hide-and-seek!). Hope i'm welcome with these changes ^.^
>
>
>
> _______________________________________________
> cfe-dev mailing list
> [hidden email]
> https://protect2.fireeye.com/v1/url?k=5fe0386f-0151e30f-5fe078f4-86e2237f51fb-3106052e60c21d5a&q=1&e=8c04b238-977e-489f-a825-0e5c7cec934a&u=https%3A%2F%2Flists.llvm.org%2Fcgi-bin%2Fmailman%2Flistinfo%2Fcfe-dev


_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: More plans on Static Analyzer + Clang-Tidy interoperation.

Manas via cfe-dev
In reply to this post by Manas via cfe-dev
On Wed, Oct 14, 2020 at 4:28 PM Artem Dergachev <[hidden email]> wrote:

> On 10/13/20 5:47 AM, Aaron Ballman wrote:
> > On Mon, Oct 12, 2020 at 5:26 PM Artem Dergachev via cfe-dev
> > <[hidden email]> wrote:
> >> After a bit of hiatus i'm reviving this work. The previous discussion is
> >> at http://lists.llvm.org/pipermail/cfe-dev/2019-August/063092.html, also
> >> http://lists.llvm.org/pipermail/cfe-dev/2019-September/063229.html. The
> >> plan is not to turn the two Clang bug-finding tools into a single tool
> >> entirely but rather to make them more interchangeable, so that users who
> >> have existing integration of, say, static analyzer, could also have
> >> clang-tidy integration for as-free-as-possible. In particular,
> >> checks/checkers could be shared, which would resolve the constant
> >> struggle of "where to put the checker?".
> > Thank you for looking into this topic again!
> >
> >> One thing i realized since the last discussion is that the more tools we
> >> integrate, the more redundant compilation work do we perform. If we are
> >> to compile the code + run static analyzer + clang-tidy separately over
> >> it, that's 3 rebuilds of the AST from scratch. Whatever solution we
> >> provide to run both tools, I'd much rather keep it at 2 (compilation +
> >> *all* analysis) because static analysis is already expensive in terms of
> >> build-time, no need to make it worse.
> > +1
> >
> >> One core component of this plan is to teach clang-tidy how to emit
> >> reports in various human-readable and machine-readable formats that the
> >> static analyzer already supports. At this point i'm pretty much ready to
> >> publish a clang::DiagnosticConsumer that'd produce all kinds of static
> >> analyzer-style reports. In particular, such consumer would enable
> >> clang-tidy to be used from inside scan-build instead of clang --analyze;
> >> both clang-tidy checks and static analyzer checkers would be ran from
> >> inside clang-tidy binary but produce html reports consumable by
> >> scan-build; a common index.html report would be generated for all
> >> checkers. I'm very much in favor of teaching scan-build how to run
> >> arbitrary clang tools, not just clang-tidy (i.e., "scan-build
> >> --tool=/path/to/my-clang-tool" or something like that) which would allow
> >> users who don't have CMake compilation databases to take advantage of
> >> clang tools more easily (and we have a few users who are interested in
> >> that).
> > I think this sounds really useful.
> >
> >> On the other hand, we've also made up our mind that for ourselves we do
> >> in fact want produce a clang binary with clang-tidy checkers integrated.
> > I continue to disagree with that stance, FWIW. I want to see clang
> > able to run clang-tidy checks the same as clang static analyzer checks
> > because having a single tool to run which produces diagnostic results
> > is easier than running multiple tools in build systems or CI
> > pipelines. I realize that the community may not share this position,
> > but I hope we're willing to revisit the discussion.
>
> Just to double-check, sounds like we two actually *agree* on this(?)
> like, we both want clang with clang-tidy checks?

Sorry for the confusion, yes, I think we agree on this point -- we
both want clang with clang-tidy checks. I was disagreeing with the
status quo.

> >> Apart from free integration into a number of our CI systems, that'll
> >> allow us to avoid shipping and supporting the clang-tidy command line
> >> interface as a separate feature. That's about 7MB increase in clang
> >> binary size which we're fine with. I plan to make it an off-by-default
> >> cmake flag unless there are strong opinions to do the opposite.
> > I don't have a strong opinion to do the opposite, but I'd like to at
> > least explore it as an option.
> >
> >> The
> >> alternative approach to move ourselves into a separate binary that's
> >> integrated at the Driver level would also technically work but that's
> >> too disruptive to commit to for us at the moment - even just the Driver
> >> work alone would require a lot of testing, let alone making sure that
> >> static analyzer works exactly as before from within the tool (it already
> >> works from inside clang-tidy but we don't really know if it actually
> >> works *the same* in all the potential cornercases).
> > I think this approach would add more complexity to the question "what
> > tool do I reach for".
> >
> >> So, like, we want to support multiple workflows.
> >>
> >> Also i'll be making occasional commits to some clang-tidy checks that
> >> we're interested in - mostly to address a few false positives (say, some
> >> checkers aren't aware of some Objective-C constructs), but also
> >> sometimes tweak the warning text a little bit (for example,
> >> bugprone-assert-side-effect is an awesome check but the warning text
> >> "found assert() with side effect" sounds fairly un-compiler-ish - we're
> >> not playing hide-and-seek!). Hope i'm welcome with these changes ^.^
> > Most certainly! One other thing that I think we should think about is:
> > now that we're looking to more tightly couple the static analyzer and
> > clang-tidy, should we unify their diagnostic message styles?
> > clang-tidy follows the Clang conventions and does not use
> > grammatically correct diagnostic messages (no capitalization or
> > punctuation) while the static analyzer has its own convention
> > (capitalization but no punctuation).
>
> This is an important thing to do for any UI that displays both static
> analyzer warnings and clang-tidy warnings; different capitalization is
> indeed a bummer.

Agreed.

> It's not uncommon for UIs/IDEs to re-capitalize all warnings into a
> single style. If we suddenly change static analyzer warnings to be
> lowercase now, i think most UIs will simply re-capitalize them back, so
> that to preserve the existing experience.

I know of at least one consumer of diagnostic output that doesn't do
that, FWIW. They just faithfully reproduce whatever the tool produces.

> I could also teach my intermediate DiagnosticConsumer to do the same.
>
> Or i could teach individual PathDiagnosticConsumers (the ones that are
> responsible for static analyzer-style outputs) to do that for themselves
> specifically - that actually sounds like a very good plan, probably
> *the* way forward.

Are you suggesting that Clang diagnostics would stay the same when
issued from Clang, but would be automatically capitalized if issued
through the static analyzer diagnostic consumer?

> > Relatedly, do you expect to
> > expose some of the handy diagnostic utilities from the static analyzer
> > to clang-tidy checks, such as the ability to report a path that led to
> > a diagnostic (for the few checks that track that sort of information)?
>
> These facilities will be exposed once https://reviews.llvm.org/D67422
> actually lands.

Huttah!

> That said, i don't think path notes specifically will be useful for any
> of the existing clang-tidy checks. IIUC the only checker so far that
> could potentially make use of these notes is the use-after-move checker
> - but it doesn't actually find the path, it only finds two points on the
> CFG with a happens-before relation. Static analyzer path notes make
> sense because every single point in the path potentially contributes to
> the warning. Which implies that behind the warning there's a *huge*
> facility that understands potential implications of *every* statement.
> When it's just two points that contribute to the path, two notes are enough.

That's true -- I don't think a lot of existing clang-tidy checks were
written with path functionality in mind.

> One thing i'm excited about, though, is eventually producing a good bug
> visualization for flow-sensitive warnings. As i mentioned a few times,
> flow-sensitive warnings should be displayed together rather than
> separately in order to be understandable: "This code is dead *because*
> this condition is always false *because* this assignment to the variable
> that participates in the condition is a dead store". I believe static
> analyzer's rich diagnostics system could contribute to that dream.

Strongly agreed! Perhaps I used the wrong terminology (or
misunderstood some other way), but this was the use case I was excited
about supporting. Making it more clear which particular set of
statements led to the decision to issue a diagnostic is really helpful
functionality when reasoning about how to resolve certain diagnostics.

Thanks!

~Aaron

>
> > Thanks!
> >
> > ~Aaron
> >
> >>
> >> _______________________________________________
> >> cfe-dev mailing list
> >> [hidden email]
> >> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev