AST for expressions in array sizes

Dávid Bolvanský via cfe-dev
Hello,

#define MAX 10

int main() // (int n)
{
    char arr[MAX * 3]; // arr[n + 3]
}

I expected the AST to be more detailed (e.g. to contain a BinaryOperator), but Clang seems to fold the expression:
TranslationUnitDecl
`-FunctionDecl <line:11:1, line:14:1> line:11:5 main 'int ()'
  `-CompoundStmt <line:12:1, line:14:1>
    `-DeclStmt <line:13:3, col:20>
      `-VarDecl <col:3, col:19> col:8 arr 'char [30]'

or, in the raw output:
`-VarDecl <col:3, col:15> col:8 a 'char [n * 3]'

So, with the current AST output, we cannot detect overflow in array sizes, as requested in https://bugs.llvm.org/show_bug.cgi?id=27439, right?

Is it possible to disable that folding, or to enhance VarDecl for arrays?

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

Re: AST for expressions in array sizes

Whisperity via cfe-dev
I think this is because the expression is, in truth, not an expression
as written: part of it comes from the preprocessor. So at parse time,
I think Clang sees "10 + 3" there, not "MACRO + 3". Without going back
to the preprocessor and retrieving its state or tokens, you won't be
able to recover this extra information. (Although I'm not sure how the
compiler's warning notes do this...)

If you don't use a preprocessor macro but rather a real expression,
the AST contains the expression verbatim:

    |-DeclStmt 0x1b31c38 <line:8:3, col:18>
    | `-VarDecl 0x1b31bd8 <col:3, col:17> col:8 used tmp 'char [x + 3]'
    |-BinaryOperator 0x1b31d10 <line:10:3, col:12> 'char' lvalue '='
    | |-ArraySubscriptExpr 0x1b31cb0 <col:3, col:8> 'char' lvalue
    | | |-ImplicitCastExpr 0x1b31c98 <col:3> 'char *' <ArrayToPointerDecay>
    | | | `-DeclRefExpr 0x1b31c50 <col:3> 'char [x + 3]' lvalue Var 0x1b31bd8 'tmp' 'char [x + 3]'
    | | `-IntegerLiteral 0x1b31c78 <col:7> 'int' 0
    | `-ImplicitCastExpr 0x1b31cf8 <col:12> 'char' <IntegralCast>
    |   `-IntegerLiteral 0x1b31cd8 <col:12> 'int' 1

Or for a wackier one:

    |-DeclStmt 0x11bdc28 <line:8:3, col:41>
    | `-VarDecl 0x11bdbc8 <col:3, col:40> col:8 used tmp 'char [x * 2 + 5 - 1 / 2 * x * x + 42]'

This is still not a "BinaryOperator" node, but perhaps the type could
somehow be fetched from the VarDecl and the inner expression recovered
from it. It could be that only the dumper function is "lazy" about
this.

; Whisperity.

Re: AST for expressions in array sizes

Dávid Bolvanský via cfe-dev
Thanks!

Basically, the macro information (e.g. whether the size was written as MAX or 10) is not needed, but the BinaryOperator information would be quite useful.

Re: AST for expressions in array sizes

Yvan Roux via cfe-dev
Your example uses variable-length arrays (VariableArrayType), which
are a separate sub-class of ArrayType; it's a pretty rare feature.
I guess Dávid is more curious about constant-size arrays
(ConstantArrayType), which indeed do not store the size expression, and
should not, because arrays of the same numeric size must also be of the
same type (e.g., for the purpose of template instantiations; VLAs, on
the other hand, are not allowed in standard C++, probably for that very
reason).
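
To make the type-identity argument concrete, here is a minimal C++11 sketch. It is not from the thread: the alias names Spelled and Folded and the Extent helper are made up for illustration. It only shows that two spellings of the same constant size name a single type, which is the reason given above for keeping the size expression out of the type.

```cpp
#include <cstddef>
#include <type_traits>

#define MAX 10

// Hypothetical aliases: the same array type spelled two ways.
using Spelled = char[MAX * 3];  // size written as an expression
using Folded = char[30];        // size written as a literal

// Both aliases name one and the same type.
static_assert(std::is_same<Spelled, Folded>::value,
              "arrays of equal constant size share one type");

// Hypothetical helper: a template keyed on the array type.
// Spelled and Folded select the same specialization.
template <typename T>
struct Extent;

template <typename T, std::size_t N>
struct Extent<T[N]> {
  static constexpr std::size_t value = N;
};
```

Since Extent<Spelled> and Extent<Folded> are the same instantiation, a type node that remembered "MAX * 3" versus "30" would have to distinguish two things the language treats as identical.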

I also don't think this is the preprocessor's doing. I don't think the
preprocessor collapses a[10 + 3] into a[13], because it definitely
doesn't collapse 10 + 3 into 13.
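
One way to check this without looking at Clang internals is to stringify the macro-expanded tokens. The sketch below is an illustration, not something posted in the thread; STR, XSTR, kSizeTokens and same_text are hypothetical names, and the code assumes a C++11 compiler.

```cpp
#define MAX 10

// Two-level stringification: XSTR expands its argument first,
// then STR turns the resulting tokens into a string literal.
#define STR(x) #x
#define XSTR(x) STR(x)

// The tokens that remain for the array size after macro expansion.
constexpr const char* kSizeTokens = XSTR(MAX * 3);

// Hypothetical helper: C++11-compatible constexpr string comparison.
constexpr bool same_text(const char* a, const char* b) {
  return *a == *b && (*a == '\0' || same_text(a + 1, b + 1));
}

// MAX is replaced by 10, but the multiplication is left as written.
static_assert(same_text(kSizeTokens, "10 * 3"),
              "the preprocessor substitutes MAX but does not evaluate");
static_assert(!same_text(kSizeTokens, "30"),
              "no folding happens during preprocessing");
```

So whatever folds the size to 30 happens after preprocessing, when the compiler forms the array type.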

If anywhere, these constant array size expressions should live somewhere
in VarDecls.

Re: AST for expressions in array sizes

David Blaikie via cfe-dev
I chatted with Richard Smith about this, and he pointed out that the extra information for MAX * 3 is stored in the TypeLoc (which can be retrieved from the VarDecl through its TypeSourceInfo), rather than in the Type itself.

For example, in gdb, once I had found the VarDecl (pointer stored in the GDB temporary expression $10), I could retrieve the expression:

p ((clang::ArrayTypeLoc)((VarDecl*)$10)->getTypeSourceInfo()->getTypeLoc()).getSizeExpr()->dump()
BinaryOperator 0xcc8f7c8 'int' '*'
|-IntegerLiteral 0xcc8f788 'int' 10
`-IntegerLiteral 0xcc8f7a8 'int' 3

You can find the macro details by looking at the source locations. I don't know that piece in detail, but it should work the same way here as in the rest of the AST.

Hope that helps!
- Dave

Re: AST for expressions in array sizes

Dávid Bolvanský via cfe-dev
Great! Thanks
