Re: Unknown attributes in clang

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

Re: Unknown attributes in clang

Manas via cfe-dev
(Dropping the old list email address and adding the new one.)

On Sat, Oct 10, 2020 at 5:55 PM Vassil Vassilev <[hidden email]> wrote:

>
> Hi,
>
>    Consider the example:
>
>    struct X {
>      int x [[transient]];
>    };
>
>    Clang rightfully warns about unknown attribute 'transient' and
> ignores it. Unfortunately it does not seem to model the unknown
> attributes in the AST.

Correct, we currently drop all unknown attributes from the AST.

> Would it make sense to add another kind of
> attribute, UnknownAttr, which will model those as a token stream
> sequence? That would be of a great benefit for tools and clang plugins.

I think that could make sense but it can be a bit tricky. Some
attributes, like GNU-style __attributes__ "slide" from one AST node to
a more appropriate one, and that can be decided per-attribute. So one
problem is that we may attach unknown attributes to an unexpected AST
node (like adding it to a type when it really should be on a
declaration). Another problem is that attributes may have particular
parsing needs for their argument list and we may not be able to
represent all the various kinds of attribute arguments. So while it's
possible, it's not trivial.

Did you know that Clang now allows you to add plugin-based attributes?
This allows you to tell the frontend about how to parse the attribute
so that you can retain some information within the AST. The attribute
plugins don't currently let you define your own semantic attribute
(you would generally use an existing one, like AnnotateAttr) and there
are still some rough edges to the feature, but you can see more
information here:
https://clang.llvm.org/docs/ClangPlugins.html#defining-attributes

HTH!

~Aaron
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Unknown attributes in clang

Manas via cfe-dev
Aaron Ballman wrote:

> I think that could make sense but it can be a bit tricky. Some attributes,
> like GNU-style __attributes__ "slide" from one AST node to a more
> appropriate one, and that can be decided per-attribute. So one problem is
> that we may attach unknown attributes to an unexpected AST node (like
> adding it to a type when it really should be on a declaration). Another
> problem is that attributes may have particular parsing needs for their
> argument list and we may not be able to represent all the various kinds of
> attribute arguments. So while it's possible, it's not trivial.

Does the answer change if we restrict ourselves only to C++ [[attributes]]
matching the grammar in dcl.attr.grammar?

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Unknown attributes in clang

Manas via cfe-dev
On Sun, Oct 11, 2020 at 11:47 AM Peter Dimov <[hidden email]> wrote:

>
> Aaron Ballman wrote:
>
> > I think that could make sense but it can be a bit tricky. Some attributes,
> > like GNU-style __attributes__ "slide" from one AST node to a more
> > appropriate one, and that can be decided per-attribute. So one problem is
> > that we may attach unknown attributes to an unexpected AST node (like
> > adding it to a type when it really should be on a declaration). Another
> > problem is that attributes may have particular parsing needs for their
> > argument list and we may not be able to represent all the various kinds of
> > attribute arguments. So while it's possible, it's not trivial.
>
> Does the answer change if we restrict ourselves only to C++ [[attributes]]
> matching the grammar in dcl.attr.grammar?

It would be pretty unfortunate if we only retain some kinds of unknown
attributes and not others.

However, I think it makes the problem easier because at least the
appertainment rules are clear in that case. However, the way we
represent semantic attribute arguments in the AST may be insufficient
for handling generic C++ [[attributes]]. We currently parse the
arguments based on information from Attr.td and then attach them as
members of the semantic attribute object, but in this scenario, we'd
have no idea what to parse or how to represent it in the AST. For
instance, the gsl::suppress attribute supported by Microsoft takes an
argument we can't currently represent, like:
[[gsl::suppress(pro.type.1)]]. For unknown attributes, I think we'd
have to parse token soup and retain the tokens themselves as part of
the AST representation, so a later consumer could do their own parsing
of the arguments (which could get awkward in some cases, I would
guess).

I think my preference is to put effort into the attribute plugin
system so that it's easy for downstream consumers of the Clang AST to
add attributes they know about to the frontend. This would include
things like custom argument parsing, adding novel semantic attributes
to the AST, etc. We're not quite there yet, but I think that approach
will result in a better user experience because the plugins can
provide better diagnostic support than we could do with a generic
solution. That said, I still think there may be value in retaining
some generic best-effort information about unknown attributes in the
AST -- e.g., the syntax used, the spelling of the attribute, a source
range for where the argument list (if any) can be found. We'd have to
be clear that this approach is a best-effort and that users should
prefer using a plugin for "serious" attribute support, but this would
at least retain enough information to be able to pretty print the
source code back out for the user.

~Aaron

>
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Unknown attributes in clang

Manas via cfe-dev
Hi Aaron,

   Thanks for your answers!

On 10/11/20 8:44 PM, Aaron Ballman wrote:

> On Sun, Oct 11, 2020 at 11:47 AM Peter Dimov <[hidden email]> wrote:
>> Aaron Ballman wrote:
>>
>>> I think that could make sense but it can be a bit tricky. Some attributes,
>>> like GNU-style __attributes__ "slide" from one AST node to a more
>>> appropriate one, and that can be decided per-attribute. So one problem is
>>> that we may attach unknown attributes to an unexpected AST node (like
>>> adding it to a type when it really should be on a declaration). Another
>>> problem is that attributes may have particular parsing needs for their
>>> argument list and we may not be able to represent all the various kinds of
>>> attribute arguments. So while it's possible, it's not trivial.
>> Does the answer change if we restrict ourselves only to C++ [[attributes]]
>> matching the grammar in dcl.attr.grammar?
> It would be pretty unfortunate if we only retain some kinds of unknown
> attributes and not others.
>
> However, I think it makes the problem easier because at least the
> appertainment rules are clear in that case. However, the way we
> represent semantic attribute arguments in the AST may be insufficient
> for handling generic C++ [[attributes]]. We currently parse the
> arguments based on information from Attr.td and then attach them as
> members of the semantic attribute object, but in this scenario, we'd
> have no idea what to parse or how to represent it in the AST. For
> instance, the gsl::suppress attribute supported by Microsoft takes an
> argument we can't currently represent, like:
> [[gsl::suppress(pro.type.1)]]. For unknown attributes, I think we'd
> have to parse token soup and retain the tokens themselves as part of
> the AST representation, so a later consumer could do their own parsing
> of the arguments (which could get awkward in some cases, I would
> guess).


   If we want to go the token stream direction would the implementation
be reasonably straight-forwards wrt Attr.td?


>
> I think my preference is to put effort into the attribute plugin
> system so that it's easy for downstream consumers of the Clang AST to
> add attributes they know about to the frontend. This would include
> things like custom argument parsing, adding novel semantic attributes
> to the AST, etc. We're not quite there yet, but I think that approach
> will result in a better user experience because the plugins can
> provide better diagnostic support than we could do with a generic
> solution. That said, I still think there may be value in retaining
> some generic best-effort information about unknown attributes in the
> AST -- e.g., the syntax used, the spelling of the attribute, a source
> range for where the argument list (if any) can be found. We'd have to
> be clear that this approach is a best-effort and that users should
> prefer using a plugin for "serious" attribute support, but this would
> at least retain enough information to be able to pretty print the
> source code back out for the user.


   I did not know that clang plugins are capable of doing that. That
would be very useful for consumers.

Best, Vassil


>
> ~Aaron
>

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Unknown attributes in clang

Manas via cfe-dev
On Tue, Oct 13, 2020 at 2:32 PM Vassil Vassilev <[hidden email]> wrote:

>
> Hi Aaron,
>
>    Thanks for your answers!
>
> On 10/11/20 8:44 PM, Aaron Ballman wrote:
> > On Sun, Oct 11, 2020 at 11:47 AM Peter Dimov <[hidden email]> wrote:
> >> Aaron Ballman wrote:
> >>
> >>> I think that could make sense but it can be a bit tricky. Some attributes,
> >>> like GNU-style __attributes__ "slide" from one AST node to a more
> >>> appropriate one, and that can be decided per-attribute. So one problem is
> >>> that we may attach unknown attributes to an unexpected AST node (like
> >>> adding it to a type when it really should be on a declaration). Another
> >>> problem is that attributes may have particular parsing needs for their
> >>> argument list and we may not be able to represent all the various kinds of
> >>> attribute arguments. So while it's possible, it's not trivial.
> >> Does the answer change if we restrict ourselves only to C++ [[attributes]]
> >> matching the grammar in dcl.attr.grammar?
> > It would be pretty unfortunate if we only retain some kinds of unknown
> > attributes and not others.
> >
> > However, I think it makes the problem easier because at least the
> > appertainment rules are clear in that case. However, the way we
> > represent semantic attribute arguments in the AST may be insufficient
> > for handling generic C++ [[attributes]]. We currently parse the
> > arguments based on information from Attr.td and then attach them as
> > members of the semantic attribute object, but in this scenario, we'd
> > have no idea what to parse or how to represent it in the AST. For
> > instance, the gsl::suppress attribute supported by Microsoft takes an
> > argument we can't currently represent, like:
> > [[gsl::suppress(pro.type.1)]]. For unknown attributes, I think we'd
> > have to parse token soup and retain the tokens themselves as part of
> > the AST representation, so a later consumer could do their own parsing
> > of the arguments (which could get awkward in some cases, I would
> > guess).
>
>
>    If we want to go the token stream direction would the implementation
> be reasonably straight-forwards wrt Attr.td?

It's probably not trivial because the parser currently tries to do
actual parsing of those tokens, but it also shouldn't be too awful
either. For the case of an unknown attribute, you'd have to enter a
different parsing path where you grab tokens from the stream
(balancing them) until getting to the end of the argument list and
then pass the container of those tokens to create the parsed unknown
attribute. That will involve a bit of boilerplate (you'd need a new
factory method for making one of these), but there should be examples
of how to do that for other attributes with custom parsing needs. In
terms of what happens in Attr.td, you'd mostly need to add the
definition for the semantic attribute representing an unknown
attribute.

> >
> > I think my preference is to put effort into the attribute plugin
> > system so that it's easy for downstream consumers of the Clang AST to
> > add attributes they know about to the frontend. This would include
> > things like custom argument parsing, adding novel semantic attributes
> > to the AST, etc. We're not quite there yet, but I think that approach
> > will result in a better user experience because the plugins can
> > provide better diagnostic support than we could do with a generic
> > solution. That said, I still think there may be value in retaining
> > some generic best-effort information about unknown attributes in the
> > AST -- e.g., the syntax used, the spelling of the attribute, a source
> > range for where the argument list (if any) can be found. We'd have to
> > be clear that this approach is a best-effort and that users should
> > prefer using a plugin for "serious" attribute support, but this would
> > at least retain enough information to be able to pretty print the
> > source code back out for the user.
>
>
>    I did not know that clang plugins are capable of doing that. That
> would be very useful for consumers.

Yeah, I'm really excited about the plugin feature -- it's newer, so
it's not as well known, and I'm sure there will be improvements that
need to be made. But overall, I think this is the direction we really
want people to go when doing heavy-lifting using attributes.

~Aaron

>
> Best, Vassil
>
>
> >
> > ~Aaron
> >
>
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev