Improvements to AST Matcher tooling and auto usage

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

Improvements to AST Matcher tooling and auto usage

Kristof Beyls via cfe-dev

Hi,

I presented some features at EuroLLVM in April this year relating to
improvements I made to AST Matcher tooling (clang-tidy and clang-query):

  https://steveire.wordpress.com/2019/04/30/the-future-of-ast-matching

My aims were:

* make learning to use AST Matchers easier
* make the clang-query tool itself help the user more
* simplify the AST Matchers required by skipping non-visible nodes in
matchers and AST dumps (such as implicit conversions, but see the talk
for more).

I think the javascript API I demo'd for writing clang-tidy checks could
be a particularly valuable addition to the tools ecosystem.

I didn't manage to get all of the patches upstream because arguments
about using auto in the same way it is used in other parts of LLVM
demotivated me. I haven't found an alternative reviewer to help get the
patches upstream, though my attendance at EuroLLVM was partially an
attempt to find one.

Some people are still interested in the features getting upstream, but
there is still some work for me to do in getting some of the patches
from presentation-quality to commit-quality. So there is still
motivation needed for my part. Beyond that, I have many other features I
wish to contribute to clang tooling, so I don't really want to start
again if I know I can't continue with it.

The rule being applied in the reviews for those patches is approximately
"if auto is used on the left, the type must appear on the right in angle
brackets (such as in a cast)".

I started a thread on this mailing list in November 2018 attempting to
get the coding standard clarified regarding what is allowed and what is
not regarding `auto`. There was no outcome from the thread. However, I
included some crude measurements of the existing use of auto at the time
using git grep:

 
http://llvm.1065342.n5.nabble.com/llvm-dev-RFC-Modernizing-our-use-of-auto-td123947.html#a124564

Here is a plot of the use of `auto` where there are no angle brackets on
the right side of the equals sign, measured each week over the last
three years:

  https://imgur.com/a/kcHaEhk

It was made with this script:

  https://gist.github.com/steveire/c9633d243d5d3b515e3eb26aaa3ad7c8

It is quite conservative as it only counts use of auto in the lib
directories of llvm, clang etc, excluding tests, include/ etc.

It shows that occurances of auto - in ways that my patches are rejected
for - has almost doubled in the last 3 years.

I think it shows that the standard that "patches must use the returned
type on the right side if using auto on the left" is not being followed
in the code. However, that is how the standard is being used by some
reviewers.

If some of the major increases (and the decrease) intrigue you, the
following patches are the responsible ones in the llvm-project repo):

Significant increases:
     6151654c00cfcdb90fd665403948aab64b83494a
     d169d70bbf053f636aed25b482ec63efd094a95b
Reductions:
     ddf3db9b5eddc6c7cc75fd30eea545c22ad70dcd
     a4fa0b880a63dad30d9ad1ff9fe8c2358c32dd77
     e372710d30c66e020576a2af6e1e5e815815a65a

The MLIR team announced that they are following the LLVM coding guidelines:

  http://lists.llvm.org/pipermail/llvm-dev/2019-September/134992.html

but they use auto extensively in ways that some reviewers don't accept:

  https://github.com/tensorflow/mlir/blob/master/lib/Parser/Parser.cpp

So, clearly other interpretations of the coding standard are possible.
If my reviewer would not accept the MLIR code, then someone is
misinterpreting the coding guideline.

I think it would make sense for

* The coding guideline
* The code itself
* The reviews

to agree somewhat on what the coding guideline is regarding auto (and
preferably have it enforced somehow). Currently, the reviews appear to
be out of sync.

If the rule should be "auto must not be used if the type is not on the
right side", I'm sure I can write a clang-tidy check to convert most of
the code. Such a tool would also enforce that new code follows the same
rule.

That would make it possible for me to finish the AST Matcher features
and get them upstream. I don't mind not using auto, but I do wish for
the coding guideline, the code itself and the reviews to agree on what
the guideline is and for everyone (contributors and reviewers) to follow
the same rules. That is not currently the case.

An alternative (which I favor) is to change the coding standard along
the lines of "use of auto is not encouraged, but it is not in itself a
reason to reject the contribution".

I would very much like to finish and upstream the AST Matcher improvements.

Does anyone share the sentiment?

Thanks,

Stephen

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Improvements to AST Matcher tooling and auto usage

Kristof Beyls via cfe-dev
Hello,

Le 20/10/2019 à 00:09, Stephen Kelly via cfe-dev a écrit :
>
> Hi,
>
> I presented some features at EuroLLVM in April this year relating to
> improvements I made to AST Matcher tooling (clang-tidy and clang-query):
>
>  https://steveire.wordpress.com/2019/04/30/the-future-of-ast-matching
>
>
[...]
> Some people are still interested in the features getting upstream, but
> there is still some work for me to do in getting some of the patches
> from presentation-quality to commit-quality. So there is still
> motivation needed for my part.

Excellent news. I attended to your talk and what you are proposing could
have a huge impact on how we use AST matcher. Bravo!

One of my objective at Mozilla is to make to make custom and general
static analysis and linting more self service and for now,
I feel that the current API has a too steep learning curve.

I don't have a strong opinion on auto itself. Just that coding style
should be global to the project and we should not be
shy about updating it when practices are evolving.

Happy to help crossing the finish line if you need!

Cheers,
Sylvestre

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Improvements to AST Matcher tooling and auto usage

Kristof Beyls via cfe-dev
In reply to this post by Kristof Beyls via cfe-dev
Both "significant increases" commits you linked to are to an external library used by polly, as far as I can tell. They're probably not the best examples to support your point.

On Sat, Oct 19, 2019 at 6:14 PM Stephen Kelly via cfe-dev <[hidden email]> wrote:

Hi,

I presented some features at EuroLLVM in April this year relating to
improvements I made to AST Matcher tooling (clang-tidy and clang-query):

  https://steveire.wordpress.com/2019/04/30/the-future-of-ast-matching

My aims were:

* make learning to use AST Matchers easier
* make the clang-query tool itself help the user more
* simplify the AST Matchers required by skipping non-visible nodes in
matchers and AST dumps (such as implicit conversions, but see the talk
for more).

I think the javascript API I demo'd for writing clang-tidy checks could
be a particularly valuable addition to the tools ecosystem.

I didn't manage to get all of the patches upstream because arguments
about using auto in the same way it is used in other parts of LLVM
demotivated me. I haven't found an alternative reviewer to help get the
patches upstream, though my attendance at EuroLLVM was partially an
attempt to find one.

Some people are still interested in the features getting upstream, but
there is still some work for me to do in getting some of the patches
from presentation-quality to commit-quality. So there is still
motivation needed for my part. Beyond that, I have many other features I
wish to contribute to clang tooling, so I don't really want to start
again if I know I can't continue with it.

The rule being applied in the reviews for those patches is approximately
"if auto is used on the left, the type must appear on the right in angle
brackets (such as in a cast)".

I started a thread on this mailing list in November 2018 attempting to
get the coding standard clarified regarding what is allowed and what is
not regarding `auto`. There was no outcome from the thread. However, I
included some crude measurements of the existing use of auto at the time
using git grep:


http://llvm.1065342.n5.nabble.com/llvm-dev-RFC-Modernizing-our-use-of-auto-td123947.html#a124564

Here is a plot of the use of `auto` where there are no angle brackets on
the right side of the equals sign, measured each week over the last
three years:

  https://imgur.com/a/kcHaEhk

It was made with this script:

  https://gist.github.com/steveire/c9633d243d5d3b515e3eb26aaa3ad7c8

It is quite conservative as it only counts use of auto in the lib
directories of llvm, clang etc, excluding tests, include/ etc.

It shows that occurances of auto - in ways that my patches are rejected
for - has almost doubled in the last 3 years.

I think it shows that the standard that "patches must use the returned
type on the right side if using auto on the left" is not being followed
in the code. However, that is how the standard is being used by some
reviewers.

If some of the major increases (and the decrease) intrigue you, the
following patches are the responsible ones in the llvm-project repo):

Significant increases:
     6151654c00cfcdb90fd665403948aab64b83494a
     d169d70bbf053f636aed25b482ec63efd094a95b
Reductions:
     ddf3db9b5eddc6c7cc75fd30eea545c22ad70dcd
     a4fa0b880a63dad30d9ad1ff9fe8c2358c32dd77
     e372710d30c66e020576a2af6e1e5e815815a65a

The MLIR team announced that they are following the LLVM coding guidelines:

  http://lists.llvm.org/pipermail/llvm-dev/2019-September/134992.html

but they use auto extensively in ways that some reviewers don't accept:

  https://github.com/tensorflow/mlir/blob/master/lib/Parser/Parser.cpp

So, clearly other interpretations of the coding standard are possible.
If my reviewer would not accept the MLIR code, then someone is
misinterpreting the coding guideline.

I think it would make sense for

* The coding guideline
* The code itself
* The reviews

to agree somewhat on what the coding guideline is regarding auto (and
preferably have it enforced somehow). Currently, the reviews appear to
be out of sync.

If the rule should be "auto must not be used if the type is not on the
right side", I'm sure I can write a clang-tidy check to convert most of
the code. Such a tool would also enforce that new code follows the same
rule.

That would make it possible for me to finish the AST Matcher features
and get them upstream. I don't mind not using auto, but I do wish for
the coding guideline, the code itself and the reviews to agree on what
the guideline is and for everyone (contributors and reviewers) to follow
the same rules. That is not currently the case.

An alternative (which I favor) is to change the coding standard along
the lines of "use of auto is not encouraged, but it is not in itself a
reason to reject the contribution".

I would very much like to finish and upstream the AST Matcher improvements.

Does anyone share the sentiment?

Thanks,

Stephen

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Improvements to AST Matcher tooling and auto usage

Kristof Beyls via cfe-dev
Indeed, I wasn't pointing them out to support any point at all, but to allow the kind of people interested in noticeable changes in graphs to satiate their curiosity. :)

The graph itself makes a stronger point by itself about the trajectory of the llvm code: steady change over a long time without any sharp increases (except those accounted for by updating third party libraries).

Perhaps that trend should be reversed with tooling assistance as I suggested. It is a trend which is not in sync with all interpretations of the coding guidelines 

Thanks,

Stephen. 

On Wed 23 Oct 2019, 01:23 Nico Weber, <[hidden email]> wrote:
Both "significant increases" commits you linked to are to an external library used by polly, as far as I can tell. They're probably not the best examples to support your point.

On Sat, Oct 19, 2019 at 6:14 PM Stephen Kelly via cfe-dev <[hidden email]> wrote:

Hi,

I presented some features at EuroLLVM in April this year relating to
improvements I made to AST Matcher tooling (clang-tidy and clang-query):

  https://steveire.wordpress.com/2019/04/30/the-future-of-ast-matching

My aims were:

* make learning to use AST Matchers easier
* make the clang-query tool itself help the user more
* simplify the AST Matchers required by skipping non-visible nodes in
matchers and AST dumps (such as implicit conversions, but see the talk
for more).

I think the javascript API I demo'd for writing clang-tidy checks could
be a particularly valuable addition to the tools ecosystem.

I didn't manage to get all of the patches upstream because arguments
about using auto in the same way it is used in other parts of LLVM
demotivated me. I haven't found an alternative reviewer to help get the
patches upstream, though my attendance at EuroLLVM was partially an
attempt to find one.

Some people are still interested in the features getting upstream, but
there is still some work for me to do in getting some of the patches
from presentation-quality to commit-quality. So there is still
motivation needed for my part. Beyond that, I have many other features I
wish to contribute to clang tooling, so I don't really want to start
again if I know I can't continue with it.

The rule being applied in the reviews for those patches is approximately
"if auto is used on the left, the type must appear on the right in angle
brackets (such as in a cast)".

I started a thread on this mailing list in November 2018 attempting to
get the coding standard clarified regarding what is allowed and what is
not regarding `auto`. There was no outcome from the thread. However, I
included some crude measurements of the existing use of auto at the time
using git grep:


http://llvm.1065342.n5.nabble.com/llvm-dev-RFC-Modernizing-our-use-of-auto-td123947.html#a124564

Here is a plot of the use of `auto` where there are no angle brackets on
the right side of the equals sign, measured each week over the last
three years:

  https://imgur.com/a/kcHaEhk

It was made with this script:

  https://gist.github.com/steveire/c9633d243d5d3b515e3eb26aaa3ad7c8

It is quite conservative as it only counts use of auto in the lib
directories of llvm, clang etc, excluding tests, include/ etc.

It shows that occurances of auto - in ways that my patches are rejected
for - has almost doubled in the last 3 years.

I think it shows that the standard that "patches must use the returned
type on the right side if using auto on the left" is not being followed
in the code. However, that is how the standard is being used by some
reviewers.

If some of the major increases (and the decrease) intrigue you, the
following patches are the responsible ones in the llvm-project repo):

Significant increases:
     6151654c00cfcdb90fd665403948aab64b83494a
     d169d70bbf053f636aed25b482ec63efd094a95b
Reductions:
     ddf3db9b5eddc6c7cc75fd30eea545c22ad70dcd
     a4fa0b880a63dad30d9ad1ff9fe8c2358c32dd77
     e372710d30c66e020576a2af6e1e5e815815a65a

The MLIR team announced that they are following the LLVM coding guidelines:

  http://lists.llvm.org/pipermail/llvm-dev/2019-September/134992.html

but they use auto extensively in ways that some reviewers don't accept:

  https://github.com/tensorflow/mlir/blob/master/lib/Parser/Parser.cpp

So, clearly other interpretations of the coding standard are possible.
If my reviewer would not accept the MLIR code, then someone is
misinterpreting the coding guideline.

I think it would make sense for

* The coding guideline
* The code itself
* The reviews

to agree somewhat on what the coding guideline is regarding auto (and
preferably have it enforced somehow). Currently, the reviews appear to
be out of sync.

If the rule should be "auto must not be used if the type is not on the
right side", I'm sure I can write a clang-tidy check to convert most of
the code. Such a tool would also enforce that new code follows the same
rule.

That would make it possible for me to finish the AST Matcher features
and get them upstream. I don't mind not using auto, but I do wish for
the coding guideline, the code itself and the reviews to agree on what
the guideline is and for everyone (contributors and reviewers) to follow
the same rules. That is not currently the case.

An alternative (which I favor) is to change the coding standard along
the lines of "use of auto is not encouraged, but it is not in itself a
reason to reject the contribution".

I would very much like to finish and upstream the AST Matcher improvements.

Does anyone share the sentiment?

Thanks,

Stephen

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Improvements to AST Matcher tooling and auto usage

Kristof Beyls via cfe-dev
On Wed, Oct 23, 2019 at 8:05 PM Stephen Kelly via cfe-dev
<[hidden email]> wrote:

>
> Indeed, I wasn't pointing them out to support any point at all, but to allow the kind of people interested in noticeable changes in graphs to satiate their curiosity. :)
>
> The graph itself makes a stronger point by itself about the trajectory of the llvm code: steady change over a long time without any sharp increases (except those accounted for by updating third party libraries).
>
> Perhaps that trend should be reversed with tooling assistance as I suggested. It is a trend which is not in sync with all interpretations of the coding guidelines
>
> Thanks,
>
> Stephen.
>
> On Wed 23 Oct 2019, 01:23 Nico Weber, <[hidden email]> wrote:
>>
>> Both "significant increases" commits you linked to are to an external library used by polly, as far as I can tell. They're probably not the best examples to support your point.
>>
>> On Sat, Oct 19, 2019 at 6:14 PM Stephen Kelly via cfe-dev <[hidden email]> wrote:
>>>
>>>
>>> Hi,
>>>
>>> I presented some features at EuroLLVM in April this year relating to
>>> improvements I made to AST Matcher tooling (clang-tidy and clang-query):
>>>
>>>   https://steveire.wordpress.com/2019/04/30/the-future-of-ast-matching
>>>
>>> My aims were:
>>>
>>> * make learning to use AST Matchers easier
>>> * make the clang-query tool itself help the user more
>>> * simplify the AST Matchers required by skipping non-visible nodes in
>>> matchers and AST dumps (such as implicit conversions, but see the talk
>>> for more).
>>>
>>> I think the javascript API I demo'd for writing clang-tidy checks could
>>> be a particularly valuable addition to the tools ecosystem.
>>>
>>> I didn't manage to get all of the patches upstream because arguments
>>> about using auto in the same way it is used in other parts of LLVM
>>> demotivated me. I haven't found an alternative reviewer to help get the
>>> patches upstream, though my attendance at EuroLLVM was partially an
>>> attempt to find one.
>>>
>>> Some people are still interested in the features getting upstream, but
>>> there is still some work for me to do in getting some of the patches
>>> from presentation-quality to commit-quality. So there is still
>>> motivation needed for my part. Beyond that, I have many other features I
>>> wish to contribute to clang tooling, so I don't really want to start
>>> again if I know I can't continue with it.
>>>
>>> The rule being applied in the reviews for those patches is approximately
>>> "if auto is used on the left, the type must appear on the right in angle
>>> brackets (such as in a cast)".
>>>
>>> I started a thread on this mailing list in November 2018 attempting to
>>> get the coding standard clarified regarding what is allowed and what is
>>> not regarding `auto`. There was no outcome from the thread. However, I
>>> included some crude measurements of the existing use of auto at the time
>>> using git grep:
>>>
>>>
>>> http://llvm.1065342.n5.nabble.com/llvm-dev-RFC-Modernizing-our-use-of-auto-td123947.html#a124564
>>>
>>> Here is a plot of the use of `auto` where there are no angle brackets on
>>> the right side of the equals sign, measured each week over the last
>>> three years:
>>>
>>>   https://imgur.com/a/kcHaEhk
>>>
>>> It was made with this script:
>>>
>>>   https://gist.github.com/steveire/c9633d243d5d3b515e3eb26aaa3ad7c8
>>>
>>> It is quite conservative as it only counts use of auto in the lib
>>> directories of llvm, clang etc, excluding tests, include/ etc.
>>>
>>> It shows that occurances of auto - in ways that my patches are rejected
>>> for - has almost doubled in the last 3 years.
>>>
>>> I think it shows that the standard that "patches must use the returned
>>> type on the right side if using auto on the left" is not being followed
>>> in the code. However, that is how the standard is being used by some
>>> reviewers.
>>>
>>> If some of the major increases (and the decrease) intrigue you, the
>>> following patches are the responsible ones in the llvm-project repo):
>>>
>>> Significant increases:
>>>      6151654c00cfcdb90fd665403948aab64b83494a
>>>      d169d70bbf053f636aed25b482ec63efd094a95b
>>> Reductions:
>>>      ddf3db9b5eddc6c7cc75fd30eea545c22ad70dcd
>>>      a4fa0b880a63dad30d9ad1ff9fe8c2358c32dd77
>>>      e372710d30c66e020576a2af6e1e5e815815a65a
>>>
>>> The MLIR team announced that they are following the LLVM coding guidelines:
>>>
>>>   http://lists.llvm.org/pipermail/llvm-dev/2019-September/134992.html
>>>
>>> but they use auto extensively in ways that some reviewers don't accept:
>>>
>>>   https://github.com/tensorflow/mlir/blob/master/lib/Parser/Parser.cpp
>>>
>>> So, clearly other interpretations of the coding standard are possible.
>>> If my reviewer would not accept the MLIR code, then someone is
>>> misinterpreting the coding guideline.
>>>
>>> I think it would make sense for
>>>
>>> * The coding guideline
>>> * The code itself
>>> * The reviews
>>>
>>> to agree somewhat on what the coding guideline is regarding auto (and
>>> preferably have it enforced somehow). Currently, the reviews appear to
>>> be out of sync.
>>>
>>> If the rule should be "auto must not be used if the type is not on the
>>> right side", I'm sure I can write a clang-tidy check to convert most of
>>> the code. Such a tool would also enforce that new code follows the same
>>> rule.
Ignoring the question of "more auto",
strong +1 to having such a clang-tidy check regardless.
(it shouldn't be llvm-specific, but it for sure should know about llvm
type system.)

>>> That would make it possible for me to finish the AST Matcher features
>>> and get them upstream. I don't mind not using auto, but I do wish for
>>> the coding guideline, the code itself and the reviews to agree on what
>>> the guideline is and for everyone (contributors and reviewers) to follow
>>> the same rules. That is not currently the case.
>>>
>>> An alternative (which I favor) is to change the coding standard along
>>> the lines of "use of auto is not encouraged, but it is not in itself a
>>> reason to reject the contribution".
>>>
>>> I would very much like to finish and upstream the AST Matcher improvements.
>>>
>>> Does anyone share the sentiment?
>>>
>>> Thanks,
>>>
>>> Stephen
Roman.

>>> _______________________________________________
>>> cfe-dev mailing list
>>> [hidden email]
>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>
> _______________________________________________
> cfe-dev mailing list
> [hidden email]
> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Improvements to AST Matcher tooling and auto usage

Kristof Beyls via cfe-dev

On 23/10/2019 18:16, Roman Lebedev wrote:
>
>>>> If the rule should be "auto must not be used if the type is not on the
>>>> right side", I'm sure I can write a clang-tidy check to convert most of
>>>> the code. Such a tool would also enforce that new code follows the same
>>>> rule.
> Ignoring the question of "more auto",
> strong +1 to having such a clang-tidy check regardless.
> (it shouldn't be llvm-specific, but it for sure should know about llvm
> type system.)


I'm happy to help you make such a tool, but only if it will be applied
to llvm in some regular way (such as failing an automatic test if it
generates a fixit).

Thanks,

Stephen.


_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev