[AST] Function redeclaration: parameter decl isn't redecl of same parameter of redecl'd function

Hubert Tong via cfe-dev

Hey!

Suppose we have a rather trivial situation with a function prototype, a usage, and then later a definition:

int foo(int x);

void bar() {
  foo(0);
}

int foo(int x) { return x * 2; }

void baz() {
  foo(1);
}

The following AST results:

|-FunctionDecl 0x55ae46421ca8 <a.cpp:1:1, col:14> col:5 used foo 'int (int)'
| `-ParmVarDecl 0x55ae46421bd0 <col:9, col:13> col:13 x 'int'
|-FunctionDecl 0x55ae46421df0 <line:2:1, col:23> col:6 bar 'void ()'
| `-CompoundStmt 0x55ae46421f90 <col:12, col:23>
|   `-CallExpr 0x55ae46421f68 <col:16, col:21> 'int'
|     |-ImplicitCastExpr 0x55ae46421f50 <col:16> 'int (*)(int)' <FunctionToPointerDecay>
|     | `-DeclRefExpr 0x55ae46421ef8 <col:16> 'int (int)' lvalue Function 0x55ae46421ca8 'foo' 'int (int)'
|     `-IntegerLiteral 0x55ae46421ed8 <col:20> 'int' 0
|-FunctionDecl 0x55ae46422058 prev 0x55ae46421ca8 <line:3:1, col:32> col:5 used foo 'int (int)'
| |-ParmVarDecl 0x55ae46421fc0 <col:9, col:13> col:13 used x 'int'
| `-CompoundStmt 0x55ae46422188 <col:16, col:32>
|   `-ReturnStmt 0x55ae46422178 <col:18, col:29>
|     `-BinaryOperator 0x55ae46422158 <col:25, col:29> 'int' '*'
|       |-ImplicitCastExpr 0x55ae46422140 <col:25> 'int' <LValueToRValue>
|       | `-DeclRefExpr 0x55ae46422100 <col:25> 'int' lvalue ParmVar 0x55ae46421fc0 'x' 'int'
|       `-IntegerLiteral 0x55ae46422120 <col:29> 'int' 2
`-FunctionDecl 0x55ae464221c0 <line:4:1, col:21> col:6 baz 'void ()'
  `-CompoundStmt 0x55ae46422328 <col:12, col:21>
    `-CallExpr 0x55ae46422300 <col:14, col:19> 'int'
      |-ImplicitCastExpr 0x55ae464222e8 <col:14> 'int (*)(int)' <FunctionToPointerDecay>
      | `-DeclRefExpr 0x55ae464222c8 <col:14> 'int (int)' lvalue Function 0x55ae46422058 'foo' 'int (int)'
      `-IntegerLiteral 0x55ae464222a8 <col:18> 'int' 1

The definition's FunctionDecl knows that it is a redeclaration of a previous Decl (namely, the prototype). However, the two ParmVarDecls for int x have no such relationship: PVD->getCanonicalDecl() == PVD holds for both instances, with no connection between them, and PVD->redecls() yields only PVD itself.
What is more interesting is that querying the parameter that receives a CallExpr's argument, by iterating over the arguments and calling cast<FunctionDecl>(CE->getCalleeDecl())->getParamDecl(0), yields two separate ParmVarDecl instances: the call before the definition of foo() binds to the prototype (and gives us the prototype’s ParmVarDecl), but the call site after the definition binds to the definition.

Is this intended behaviour?
Why aren’t the two ParmVarDecls linked into a redecl chain, considering they should mean the “same entity”, being the same parameter of a redeclared function?
I obviously mean once we are past overloads, template instantiations, etc. No “magic” should be intervening.
Or is it me who’s not grasping something about the language correctly?

Regards,
W.


_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

Re: [AST] Function redeclaration: parameter decl isn't redecl of same parameter of redecl'd function

Hubert Tong via cfe-dev

On Aug 17, 2020, at 12:01 PM, Whisperity via cfe-dev <[hidden email]> wrote:

> Is this intended behaviour?
> Why aren’t the two ParmVarDecls linked into a redecl chain, considering they should mean the “same entity”, being the same parameter of a redeclared function?
> I obviously mean once we are past overloads, template instantiations, etc. No “magic” should be intervening.
> Or is it me who’s not grasping something about the language correctly?

That redeclarations of functions can assign different names to their parameters is the most illuminating factor, to me.  It suggests ParmVarDecls really are private to their parent FunctionDecl, and should not be linked to anything outside it — even to another redeclaration of that function.

Indeed, ParmVarDecls do not matter in any function declaration except a definition; only the parameter types matter, and they are enclosed in the function type.  In other words, ParmVarDecls seem to be unnecessary, semantically, in every declaration of a function except its definition.  In non-defined functions, their purpose is only to help record the name a user assigned to that slot, purely as syntactic sugar. 

Since there need not be anything in common between the ParmVarDecls of different redeclarations except their type (and even that is stored separately in the FunctionProtoType), I think it’s proper that each ParmVarDecl be considered completely enclosed from the world outside its particular function redeclaration.
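
To make the first point concrete, here is a minimal sketch in plain C++ (no Clang API involved; the names scale, factor and value are made up for illustration). The parameter names differ between the declaration and the definition, and the code is still well-formed, because only the parameter types are part of the function type:

```cpp
#include <type_traits>

// Declaration: the parameter is called `factor` here.
int scale(int factor);

// Definition: the same function, but the parameter is now called `value`.
// Only the parameter type (int) has to agree with the declaration above.
int scale(int value) { return value * 2; }

// A redeclaration with an unnamed parameter declares the same function too.
int scale(int);

// The function type records the parameter types, never their names.
static_assert(std::is_same<decltype(scale), int(int)>::value,
              "parameter names are not part of the function type");
```

All three declarations refer to one function, yet nothing in the language ties `factor` to `value` beyond their position in the parameter list.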

- Dave

Re: [AST] Function redeclaration: parameter decl isn't redecl of same parameter of redecl'd function

Hubert Tong via cfe-dev
Hey!

Sorry for the late reply. I get parts of the explanation. Not sure if
I agree with the reasoning, but I understand it. Let me put my issue
into a bit more concrete context.

I'm working on a method (or "tool") that does a whole lot of heavy
lifting with attributes. There are some iterative things going on and
this tool is invoked multiple times, so using some sort of knowledge
store between invocations is inevitable. Semantically, the code itself
is a perfect place for this.

We have something called "InheritableAttr"s and
"InheritableParamAttr"s, which in the definition file (Attr.td, lines
554 and 585, YMMV) say, respectively:



/// An inheritable attribute is inherited by later redeclarations.

/// An inheritable parameter attribute is inherited by later
/// redeclarations, even when it's written on a parameter.



Now, suppose I write the following, where "fancy" is one such
inheritable attr (and I quote, "even when it's written on a
parameter"), and let's expand the previous example.
(Unfortunately, it also says ">later< redeclarations"...)
For the sake of this argument, imagine "fanciness" to be some sort of
opaque property. It could be a type invariant, it could be a
lifetime bind, it could be some proprietary ABI magic. (Speaking of
lifetime bind, I'm adding Gábor Horváth to the direct To:, maybe you
have an idea on how to deal with this?)



void foobar( [[fancy]] int x);
void blabla(int y);

// Obviously forward declarations might reside in a header file,
// separate from the functions' definitions, etc. etc.

void test1() {
  foobar( ... ); // call expr binds to parameter of forward decl,
                 // I know it is fancy
  blabla( ... ); // call expr binds to parameter of forward decl,
                 // I do not know it is fancy
}

void foobar( /* inherited [[fancy]] */ int x) { ... }
void blabla( [[fancy]] int y) { ... }

void test2() {
  foobar( ... ); // I know it is fancy because of inheritance, even
                 // though it's a distinct node
  blabla( ... ); // call expr binds to parameter of *definition node*,
                 // now I know it is fancy
}



Unfortunately, I need to know at both calls to blabla() that the
parameter is deemed "fancy".
Outside of this case, simply always "annotating" the canonical
declaration seems to be a good way forward, but this breaks for
parameters due to the aforementioned problem.

Because the knowledge of "fanciness" is required at all usage points
(aka call sites where arguments are passed), I need to have this
information in a common place (naturally the header which, thanks to
inheriting the attribute, will affect the definition node too).
This is the formal part.

However, because people are people, the knowledge, or rather the fact
that the declaration is "fancy" (even if "formally" inherited), should
be appropriately visible at the definition node. (It is a big bummer
with C++ that you can define "out of line", but we work with what's
given.) This is more for code comprehension and accountability
purposes.

Unfortunately, even the "formally" required part is broken, because
registering "fanciness" for the target of the call (which is a
different node for the calls in test1() and test2(), and, with no
link through canonicalness, a distinct one!) is not a good way forward.

Perhaps you, or someone from the list, has a hunch about how I should move forward?
I mean... naturally, I have my own idea: simply registering, for
the redecl chain of the same overload, "equivalence classes" of
parameters that are, one way or another (from my perspective),
"related", and then making sure that the knowledge is saved in
sync to "both" (or rather, "all") places.

I am just surprised this (at least AFAIK) has never come up before.
I instinctively feel that it should not be my (tool's)
responsibility to build additional data structures for this problem.
This "relatedness" should be apparent without having to jump through
the hoops of working out the index of the ParmVarDecl in my hand,
finding the parent function, finding its canonical declaration (or
another redeclaration), and selecting the parameter at the same index,
which is a different and, as you explained, sort-of "unrelated" node.
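
A rough sketch of the "equivalence class by index" idea, written against a toy model and not against the real Clang classes: ToyParmDecl, ToyFunctionDecl, the Redecls pointer and isFancyInAnyRedecl are all made-up names for illustration. Only the shape of the walk (take the callee, visit every redeclaration, look at the parameter with the same index) is meant to carry over.

```cpp
#include <cstddef>
#include <vector>

// Toy stand-in for clang::ParmVarDecl (NOT the real class).
struct ToyParmDecl {
  bool Fancy;  // models "this node carries the [[fancy]] attribute"
};

// Toy stand-in for clang::FunctionDecl (NOT the real class).
struct ToyFunctionDecl {
  std::vector<ToyParmDecl> Params;
  // Every declaration of the same function, shared by all of them.
  // This models what iterating redecls() on a function gives you.
  const std::vector<const ToyFunctionDecl *> *Redecls = nullptr;
};

// True if the parameter at position Index is fancy on ANY declaration of
// the function that Callee belongs to, no matter which one the call bound to.
bool isFancyInAnyRedecl(const ToyFunctionDecl &Callee, std::size_t Index) {
  if (!Callee.Redecls)  // no chain recorded: only this declaration exists
    return Index < Callee.Params.size() && Callee.Params[Index].Fancy;
  for (const ToyFunctionDecl *D : *Callee.Redecls)
    if (Index < D->Params.size() && D->Params[Index].Fancy)
      return true;
  return false;
}
```

With this, the blabla() call in test1() (bound to the prototype) and the one in test2() (bound to the definition) would get the same answer, at the cost of the tool doing the bookkeeping itself.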


David Rector <[hidden email]> wrote (on Mon, Aug 17, 2020, 19:58):

> Since there need not be anything in common between the ParmVarDecls of different redeclarations except their type (and even that is stored separately in the FunctionProtoType), I think it’s proper that each ParmVarDecl be considered completely enclosed from the world outside its particular function redeclaration.

Re: [AST] Function redeclaration: parameter decl isn't redecl of same parameter of redecl'd function

Hubert Tong via cfe-dev
What is the problem with always annotating the canonical declarations? I did not really understand it from your description.

On Fri, 11 Sep 2020 at 20:02, Whisperity <[hidden email]> wrote:
> Unfortunately, I need to know in both calls to blabla() that the
> parameter is deemed "fancy".
> Outside of this case, simply always "annotating" the canonical
> declaration seems to be a good way forward, but this breaks for
> parameters due to the aforementioned problem.

Re: [AST] Function redeclaration: parameter decl isn't redecl of same parameter of redecl'd function

Hubert Tong via cfe-dev
A canonical declaration of a ParmVarDecl is *always* itself.
(ParmVarDecl does not override getCanonicalDecl(), thus inheriting the
default "return this;" behaviour from Decl.) This is true even if the
function has a (re)declaration chain and an earlier canonical
declaration.

Given

  void f(int x);
  void f(int x) { ... }

The canonical declaration of the second f() is the first one.
The canonical declaration of the first f() is itself.
The two nodes are reachable from one another through the redecls() range.
However, nothing comparable exists for their parameters.

However, the two parameters are distinct. (See my original email from
back in August complete with an AST.)

There is no other way (AFAIK) than registering which parameter at
which index we are talking about, walking to the canonical function,
and then querying its nth parameter.
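
Roughly, the walk looks like this. It is a sketch on a toy model, not the real Clang classes: ToyParmDecl, ToyFunctionDecl and canonicalParam are made-up names. In actual Clang code the corresponding steps would, as far as I know, go through ParmVarDecl::getFunctionScopeIndex(), the parameter's DeclContext, FunctionDecl::getCanonicalDecl() and FunctionDecl::getParamDecl().

```cpp
#include <cstddef>
#include <vector>

struct ToyFunctionDecl;

// Toy stand-in for clang::ParmVarDecl: it knows its owner and its position.
struct ToyParmDecl {
  const ToyFunctionDecl *Owner;  // the function declaration it belongs to
  std::size_t Index;             // its position in Owner's parameter list
};

// Toy stand-in for clang::FunctionDecl.
struct ToyFunctionDecl {
  const ToyFunctionDecl *Canonical;  // first declaration (itself if first)
  std::vector<ToyParmDecl> Params;
};

// Maps a parameter node to the parameter at the same position on the
// canonical declaration of its owning function.
// Throws std::out_of_range if the canonical declaration has fewer parameters.
const ToyParmDecl &canonicalParam(const ToyParmDecl &P) {
  const ToyFunctionDecl *Canon = P.Owner->Canonical;
  return Canon->Params.at(P.Index);
}
```

Keying any stored knowledge on the result of canonicalParam() would give one stable node per parameter slot, regardless of which redeclaration a call site happened to bind to.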

Gábor Horváth <[hidden email]> wrote (on Fri, Sep 11, 2020, 20:43):

>
> What is the problem with always annotating the canonical declarations? I did not really understand it from your description.
>
> On Fri, 11 Sep 2020 at 20:02, Whisperity <[hidden email]> wrote:
>>
>> Hey!
>>
>> Sorry for the late reply. I get parts of the explanation. Not sure if
>> I agree with the reasoning, but I understand it. Let me put my issue
>> into a bit more concrete context.
>>
>> I'm working on a method (or "tool") that deals a whole lot of heavy
>> lifting with attributes. There are some iterative things going on and
>> this tool invoked multiple times, so using some sort of knowledge
>> store between invocations is inevitable. Semantically, the code itself
>> is a perfect place for this.
>>
>> We have something called "InheritableAttr"s and
>> "InheritableParamAttr"s, which in the definition file (Attr.td, lines
>> 554 and 585, YMMV) say, respectively.
>>
>>
>>
>> /// An inheritable attribute is inherited by later redeclarations.
>>
>> /// An inheritable parameter attribute is inherited by later
>> /// redeclarations, even when it's written on a parameter.
>>
>>
>>
>> Now, suppose, I write the following, where "fancy" is one such
>> inheritable attr, and I quote, "even when it's written on a
>> parameter", and let's expand the previous example.
>> (Unfortunately, it also says ">later< redeclarations"...)
>> For the sake of this argument, imagine "fanciness" to be some sort of
>> an opaque property. It could be a type invariant, it could be a
>> lifetime bind, it could be some proprietary ABI magic. (Speaking of
>> lifetime bind, I'm adding Gábor Horváth to the direct To:, maybe you
>> got an idea on how to deal with this?)
>>
>>
>>
>> void foobar( [[fancy]] int x);
>> void blabla(int y);
>>
>> // Obviously forward declarations might reside in a header file,
>> separate from the functions' definitions, etc. etc.
>>
>> int test1() {
>>   foobar( ... ); // call expr binds to parameter of forward decl, I
>> know it is fancy
>>   blabla( ... ); // call expr binds to parameter of forward decl, I do
>> not know it is fancy
>> }
>>
>> void foobar( /* inherited [[fancy]] */ int x) { ... }
>> void blabla( [[fancy]] int y) { ... }
>>
>> int test2() {
>>   foobar( ... ); // I know it is fancy because inheritence, even
>> though it's a distinct node
>>   blabla( ... ); // call expr binds to parameter of *definition node*,
>> now I know it is fancy
>> }
>>
>>
>>
>> Unfortunately, I need to know in both calls to blabla() that the
>> parameter is deemed "fancy".
>> Outside of this case, simply always "annotating" the canonical
>> declaration seems to be a good way forward, but this breaks for
>> parameters due to the aforementioned problem.
>>
>> Due to the knowledge of "fanciness" required at all usage points (aka
>> call sites where arguments are passed) I need to have this information
>> in a common place (naturally the header which, thanks to inheriting
>> the attribute, will afflict the definition node too).
>> This is the formal part.
>>
>> However, because people are people, the knowledge, or rather the fact
>> that the declaration is "fancy" (even if "formally" inherited) should
>> be appropriately visible at the definition node. (This is a big bummer
>> with C++ that you can define "out of line", but we work with what's
>> given.) This is more so for code comprehension and accountability
>> purposes.
>>
>> Unfortunately, even the "formally" required part is broken, because
>> registering "fanciness" for the target of the call (which are
>> different nodes for the calls in test1() and test2(), and due to no
>> link through canonicalness, distinct!) is not a good way forward.
>>
>> Perhaps you, or someone from the list, has a hunch of how I should move forward?
>> I mean... naturally, I have my own ideas, of simply registering, for
>> the Redecl chain of the same overload, "equivalence classes" of
>> parameters that are, one way or another (from my perspective),
>> "related", and then simply making sure that the knowledge is saved in
>> sync to "both" (or rather, "all") places.
>>
>> I am just surprised this (at least AFAIK) has never come up before.
>> I instinctively "feel like" that it should not be my (tool's)
>> responsibility to build additional data structures for this problem,
>> this "relatedness" should be apparent without having to jump the hoops
>> of getting which indexth the current ParmVarDecl in my hand is,
>> finding the parent function, finding its CanonicalDecl (or other
>> redeclaration), and selecting the same indexth parameter, which is now
>> a different, as as you explained, sort-of "unrelated" node.
>>
>>
>> David Rector <[hidden email]> ezt írta (időpont: 2020. aug.
>> 17., H, 19:58):
>> >
>> >
>> > On Aug 17, 2020, at 12:01 PM, Whisperity via cfe-dev <[hidden email]> wrote:
>> >
>> > Hey!
>> >
>> > Suppose we have a rather trivial situation with a function prototype, a usage, and then later a definition:
>> >
>> > int foo(int x);
>> >
>> > void bar() {
>> >   foo(0);
>> > }
>> >
>> > int foo(int x) { return x * 2; }
>> >
>> > void baz() {
>> >   foo(1);
>> > }
>> >
>> > The following AST results:
>> >
>> > |-FunctionDecl 0x55ae46421ca8 <a.cpp:1:1, col:14> col:5 used foo 'int (int)'
>> > | `-ParmVarDecl 0x55ae46421bd0 <col:9, col:13> col:13 x 'int'
>> > |-FunctionDecl 0x55ae46421df0 <line:2:1, col:23> col:6 bar 'void ()'
>> > | `-CompoundStmt 0x55ae46421f90 <col:12, col:23>
>> > |   `-CallExpr 0x55ae46421f68 <col:16, col:21> 'int'
>> > |     |-ImplicitCastExpr 0x55ae46421f50 <col:16> 'int (*)(int)' <FunctionToPointerDecay>
>> > |     | `-DeclRefExpr 0x55ae46421ef8 <col:16> 'int (int)' lvalue Function 0x55ae46421ca8 'foo' 'int (int)'
>> > |     `-IntegerLiteral 0x55ae46421ed8 <col:20> 'int' 0
>> > |-FunctionDecl 0x55ae46422058 prev 0x55ae46421ca8 <line:3:1, col:32> col:5 used foo 'int (int)'
>> > | |-ParmVarDecl 0x55ae46421fc0 <col:9, col:13> col:13 used x 'int'
>> > | `-CompoundStmt 0x55ae46422188 <col:16, col:32>
>> > |   `-ReturnStmt 0x55ae46422178 <col:18, col:29>
>> > |     `-BinaryOperator 0x55ae46422158 <col:25, col:29> 'int' '*'
>> > |       |-ImplicitCastExpr 0x55ae46422140 <col:25> 'int' <LValueToRValue>
>> > |       | `-DeclRefExpr 0x55ae46422100 <col:25> 'int' lvalue ParmVar 0x55ae46421fc0 'x' 'int'
>> > |       `-IntegerLiteral 0x55ae46422120 <col:29> 'int' 2
>> > `-FunctionDecl 0x55ae464221c0 <line:4:1, col:21> col:6 baz 'void ()'
>> >   `-CompoundStmt 0x55ae46422328 <col:12, col:21>
>> >     `-CallExpr 0x55ae46422300 <col:14, col:19> 'int'
>> >       |-ImplicitCastExpr 0x55ae464222e8 <col:14> 'int (*)(int)' <FunctionToPointerDecay>
>> >       | `-DeclRefExpr 0x55ae464222c8 <col:14> 'int (int)' lvalue Function 0x55ae46422058 'foo' 'int (int)'
>> >       `-IntegerLiteral 0x55ae464222a8 <col:18> 'int' 1
>> >
>> > The FunctionDecl knows that it is a redeclaration of a previous Decl (namely, the prototype). However, the two ParmVarDecls for int x do not have this relationship: PVD->getCanonicalDecl() == PVD holds for both instances, with no connection between them. PVD->redecls() == {PVD}, too.
>> > What is more interesting is that querying the parameter to which the CallExpr passes its argument, by iterating over the arguments and doing cast<FunctionDecl>(CE->getCalleeDecl())->getParamDecl(0), gives us two separate ParmVarDecl instances: the call before the definition of foo() binds to the prototype (and gives us the prototype’s ParmVarDecl), but the call site after the function has been defined binds to the definition.
>> >
>> > Is this an intended behaviour?
>> > Why aren’t the two ParmVarDecls linked into a redecl chain, considering they should mean the “same entity”, as it is the same parameter of a redecl’d function?
>> > I obviously mean once we are past overloads, past template instantiations, etc. No “magic” should be intervening.
>> > Or is it me who’s not grasping something from the language correctly?
>> >
>> > That redeclarations of functions can assign different names to their parameters is the most illuminating factor, to me.  It suggests ParmVarDecls really are private to their parent FunctionDecl, and should not be linked to anything outside it — even to another redeclaration of that function.
>> >
>> > Indeed, ParmVarDecls do not matter in any function declaration except a definition; only the parameter types matter, and they are enclosed in the function type.  In other words, ParmVarDecls seem to be unnecessary, semantically, in every declaration of a function except its definition.  In non-defined functions, their purpose is only to help record the name a user assigned to that slot, purely as syntactic sugar.
>> >
>> > Since there need not be anything in common between ParmVarDecls of different redeclarations except their type — and even that is stored separately in the FunctionProtoType — I think it’s proper that each ParmVarDecl be considered completely enclosed from the world outside its particular function redeclaration.
>> >
>> > - Dave
>> >
>> > Regards,
>> > W.
>> >
>> > _______________________________________________
>> > cfe-dev mailing list
>> > [hidden email]
>> > https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>> >
>> >
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

Re: [AST] Function redeclaration: parameter decl isn't redecl of same parameter of redecl'd function

Hubert Tong via cfe-dev
Sorry for not being clearer. What is the problem with always using the canonical declaration of the *function*? I.e., collecting parameter attributes there. 
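
A minimal sketch of that idea, assuming toy stand-in types rather than the real Clang classes: attributes are stored per (canonical function, parameter index), so it does not matter which redeclaration a call site happens to bind to. `ToyFunctionDecl` and `ParamAttrStore` are hypothetical names made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>

// Toy stand-in for a function declaration node; NOT clang::FunctionDecl.
struct ToyFunctionDecl {
  const ToyFunctionDecl *previous = nullptr; // earlier declaration, if any

  // Walk back to the first declaration, which plays the role of the
  // canonical declaration in this model.
  const ToyFunctionDecl *canonical() const {
    const ToyFunctionDecl *d = this;
    while (d->previous)
      d = d->previous;
    return d;
  }
};

// Hypothetical store: canonical function -> parameter index -> attribute names.
class ParamAttrStore {
  std::map<const ToyFunctionDecl *,
           std::map<std::size_t, std::set<std::string>>> attrs_;

public:
  // Record an attribute, no matter which redeclaration it was written on.
  void add(const ToyFunctionDecl &fn, std::size_t paramIndex,
           const std::string &attr) {
    assert(!attr.empty() && "attribute name must not be empty");
    attrs_[fn.canonical()][paramIndex].insert(attr);
  }

  // Query through any redeclaration; the lookup normalises to the canonical one.
  bool has(const ToyFunctionDecl &fn, std::size_t paramIndex,
           const std::string &attr) const {
    auto fnIt = attrs_.find(fn.canonical());
    if (fnIt == attrs_.end())
      return false;
    auto paramIt = fnIt->second.find(paramIndex);
    return paramIt != fnIt->second.end() && paramIt->second.count(attr) != 0;
  }
};
```

In this model, an attribute written only on the definition is still found when the query goes through the earlier prototype, because both resolve to the same canonical key.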

On Fri, Sep 11, 2020, 11:00 PM Whisperity <[hidden email]> wrote:
A canonical declaration of a ParmVarDecl is *always* itself.
(ParmVarDecl does not override getCanonicalDecl(), and since parameters
are never linked into a redeclaration chain, the inherited implementation
ends up returning the node itself.) This is true even if the
function has a (re)declaration chain and an earlier canonical
declaration.

Given

  void f(int x);
  void f(int x) { ... }

The canonical declaration of the second f() is the first one.
The canonical declaration of the first f() is itself.
The two nodes are reachable from one another through the redecls() range.

However, the two parameters are distinct. (See my original email from
back in August complete with an AST.)

There is no other way (AFAIK) than registering which parameter at
which index we are talking about, walking to the canonical function,
and then querying its nth parameter.
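
The walk described above can be sketched roughly as follows. This is a self-contained model with hypothetical toy types (`ToyParam`, `ToyFunction`, `correspondingCanonicalParam`), not the Clang API. In Clang itself the position should, as far as I know, be available through `ParmVarDecl::getFunctionScopeIndex()`, but treat that as an assumption to verify against your Clang version.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct ToyFunction;

// Toy stand-in for a parameter node; NOT clang::ParmVarDecl.
struct ToyParam {
  std::string name;
  ToyFunction *parent; // the function declaration owning this parameter
  std::size_t index;   // position in the parent's parameter list
};

// Toy stand-in for a function declaration; NOT clang::FunctionDecl.
struct ToyFunction {
  std::string name;
  ToyFunction *canonical; // first declaration in the redeclaration chain
  std::vector<std::unique_ptr<ToyParam>> params;

  // A declaration with no predecessor is its own canonical declaration.
  explicit ToyFunction(std::string n, ToyFunction *prev = nullptr)
      : name(std::move(n)), canonical(prev ? prev->canonical : this) {}

  // Nodes are identified by address, so copying is disabled.
  ToyFunction(const ToyFunction &) = delete;
  ToyFunction &operator=(const ToyFunction &) = delete;

  ToyParam *addParam(std::string paramName) {
    params.push_back(std::unique_ptr<ToyParam>(
        new ToyParam{std::move(paramName), this, params.size()}));
    return params.back().get();
  }
};

// Remember the index, walk to the canonical function, take its nth parameter.
ToyParam *correspondingCanonicalParam(const ToyParam *p) {
  ToyFunction *canon = p->parent->canonical;
  assert(p->index < canon->params.size() && "redeclarations must agree in arity");
  return canon->params[p->index].get();
}
```

With a prototype and a definition of the same function, the two parameter nodes stay distinct, yet both map to the prototype's parameter through this helper. Names may differ between redeclarations; only the index is used.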

Gábor Horváth <[hidden email]> wrote (on Fri, Sep 11, 2020,
20:43):
>
> What is the problem with always annotating the canonical declarations? I did not really understand it from your description.
>
> On Fri, 11 Sep 2020 at 20:02, Whisperity <[hidden email]> wrote:
>>
>> Hey!
>>
>> Sorry for the late reply. I get parts of the explanation. Not sure if
>> I agree with the reasoning, but I understand it. Let me put my issue
>> into a bit more concrete context.
>>
>> I'm working on a method (or "tool") that does a whole lot of heavy
>> lifting with attributes. There are some iterative things going on and
>> this tool is invoked multiple times, so using some sort of knowledge
>> store between invocations is inevitable. Semantically, the code itself
>> is a perfect place for this.
>>
>> We have something called "InheritableAttr"s and
>> "InheritableParamAttr"s, which the definition file (Attr.td, lines
>> 554 and 585, YMMV) describes, respectively, as:
>>
>>
>>
>> /// An inheritable attribute is inherited by later redeclarations.
>>
>> /// An inheritable parameter attribute is inherited by later
>> /// redeclarations, even when it's written on a parameter.
>>
>>
>>
>> Now, suppose, I write the following, where "fancy" is one such
>> inheritable attr, and I quote, "even when it's written on a
>> parameter", and let's expand the previous example.
>> (Unfortunately, it also says ">later< redeclarations"...)
>> For the sake of this argument, imagine "fanciness" to be some sort of
>> an opaque property. It could be a type invariant, it could be a
>> lifetime bind, it could be some proprietary ABI magic. (Speaking of
>> lifetime bind, I'm adding Gábor Horváth to the direct To:, maybe you
>> got an idea on how to deal with this?)
>>
>>
>>
>> void foobar( [[fancy]] int x);
>> void blabla(int y);
>>
>> // Obviously forward declarations might reside in a header file,
>> // separate from the functions' definitions, etc. etc.
>>
>> void test1() {
>>   // call expr binds to parameter of forward decl, I know it is fancy
>>   foobar( ... );
>>   // call expr binds to parameter of forward decl, I do not know it is fancy
>>   blabla( ... );
>> }
>>
>> void foobar( /* inherited [[fancy]] */ int x) { ... }
>> void blabla( [[fancy]] int y) { ... }
>>
>> void test2() {
>>   // I know it is fancy because of inheritance, even though it's a distinct node
>>   foobar( ... );
>>   // call expr binds to parameter of *definition node*, now I know it is fancy
>>   blabla( ... );
>> }
>>
>>
>>
>> Unfortunately, I need to know in both calls to blabla() that the
>> parameter is deemed "fancy".
>> Outside of this case, simply always "annotating" the canonical
>> declaration seems to be a good way forward, but this breaks for
>> parameters due to the aforementioned problem.
>>
>> Because the knowledge of "fanciness" is required at all usage points (aka
>> call sites where arguments are passed), I need to have this information
>> in a common place (naturally the header, which, thanks to the attribute
>> being inherited, will affect the definition node too).
>> This is the formal part.
>>
>> However, because people are people, the knowledge, or rather the fact
>> that the declaration is "fancy" (even if "formally" inherited) should
>> be appropriately visible at the definition node. (This is a big bummer
>> with C++ that you can define "out of line", but we work with what's
>> given.) This is more so for code comprehension and accountability
>> purposes.
>>
>> Unfortunately, even the "formally" required part is broken, because
>> registering "fanciness" for the target of the call (which is a
>> different node for the calls in test1() and test2() and, with no
>> link through canonicalness, a distinct one!) is not a good way forward.
>>
>> Perhaps you, or someone from the list, has a hunch of how I should move forward?
>> I mean... naturally, I have my own ideas, of simply registering, for
>> the Redecl chain of the same overload, "equivalence classes" of
>> parameters that are, one way or another (from my perspective),
>> "related", and then simply making sure that the knowledge is saved in
>> sync to "both" (or rather, "all") places.
>>
>> I am just surprised this (at least AFAIK) has never come up before.
>> I instinctively "feel like" it should not be my (tool's)
>> responsibility to build additional data structures for this problem;
>> this "relatedness" should be apparent without having to jump through
>> the hoops of finding the index of the ParmVarDecl in my hand,
>> finding the parent function, finding its CanonicalDecl (or other
>> redeclaration), and selecting the parameter at the same index, which
>> is now a different and, as you explained, sort-of "unrelated" node.
