AST Matcher matches InitListExpr child twice

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

AST Matcher matches InitListExpr child twice

Tom Stellard via cfe-dev
Consider the following C++11 (or later) code fragment, which uses an
initialization list that contains a function call:

        int f();
        int i {f()};

The AST for that initialization list consists of an InitListExpr with a
single CallExpr child:

        InitListExpr 0x2df2b80 'int'
        `-CallExpr 0x2df2b10 'int'
          `-ImplicitCastExpr 0x2df2af8 ...
            `-DeclRefExpr 0x2df2aa0 ...

If I use the Clang AST Matcher API to match on InitListExpr, then I get
one match as expected.  However, if I match on CallExpr, then the single
CallExpr in the above AST is matched *twice*.  Anything below that
CallExpr node is also matched twice; for example, I'd get two matches
for the DeclRefExpr if that's what my AST Matcher were looking for.
This is definitely a problem for my code, and would seem to be a bug in
general.  Surely that's not the intended behavior, is it?

Attached below is a small test program, "visit.cc", that matches and
dumps out InitListExpr and CallExpr nodes in the AST.  Running this on
my above example clearly shows that only one InitListExpr is matched,
but a single CallExpr (at a single memory address) is matched *twice*.

Should I file this glitch as a Clang bug report rather than describing
it on this mailing list?  Is there a reasonable workaround until this
bug can be properly fixed?

Thanks,
Ben

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

visit.cc (1K) Download Attachment
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: AST Matcher matches InitListExpr child twice

Tom Stellard via cfe-dev
On 06/10/2017 03:31 PM, Ben Liblit via cfe-dev wrote:

> Consider the following C++11 (or later) code fragment, which uses an
> initialization list that contains a function call:
>
>      int f();
>      int i {f()};
>
> The AST for that initialization list consists of an InitListExpr with a
> single CallExpr child:
>
>      InitListExpr 0x2df2b80 'int'
>      `-CallExpr 0x2df2b10 'int'
>        `-ImplicitCastExpr 0x2df2af8 ...
>          `-DeclRefExpr 0x2df2aa0 ...
>
> If I use the Clang AST Matcher API to match on InitListExpr, then I get
> one match as expected.  However, if I match on CallExpr, then the single
> CallExpr in the above AST is matched *twice*.  Anything below that
> CallExpr node is also matched twice; for example, I'd get two matches
> for the DeclRefExpr if that's what my AST Matcher were looking for. This
> is definitely a problem for my code, and would seem to be a bug in
> general.  Surely that's not the intended behavior, is it?

That looks to be by design, cf. DEF_TRAVERSE_STMT(InitListExpr,...) in
include/clang/AST/RecursiveASTVisitor.h calling into
TraverseSynOrSemInitListExpr twice (for "syntactic" and "semantic"
InitListExpr).

In some of our LibreOffice RecursiveASTVisitor implementations we use

>     bool TraverseInitListExpr(
>         InitListExpr * expr
> #if CLANG_VERSION >= 30800
>         , DataRecursionQueue * queue = nullptr
> #endif
>         )
>     {
>         return WalkUpFromInitListExpr(expr)
>             && TraverseSynOrSemInitListExpr(
>                 expr->isSemanticForm() ? expr : expr->getSemanticForm()
> #if CLANG_VERSION >= 30800
>                 , queue
> #endif
>                 );
>     }

to traverse only once (where CLANG_VERSION is LibreOffice's way of
supporting multiple versions of Clang).
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: AST Matcher matches InitListExpr child twice

Tom Stellard via cfe-dev
Stephan Bergmann <[hidden email]> wrote:
> That looks to be by design, cf. DEF_TRAVERSE_STMT(InitListExpr,...) in
> include/clang/AST/RecursiveASTVisitor.h calling into
> TraverseSynOrSemInitListExpr twice (for "syntactic" and "semantic"
> InitListExpr).

Nutty!  But clearly by design, as you point out.  There seems to be no
way to detect this duplication short of maintaining my own "nodes I have
already seen" set, and no way to prevent double visits short of defining
my own RecursiveASTVisitor.

> In some of our LibreOffice RecursiveASTVisitor implementations we use
> [...] to traverse only once

Thank you for this example code, Stephan.  Unfortunately I see no way to
tell the AST Matcher system to use an alternate visitor of this sort.
Am I missing some API that would let me do this?  Or is the recursive
visit outside of my control if I'm using MatchFinder to drive my tool?

-- Ben
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: AST Matcher matches InitListExpr child twice

Tom Stellard via cfe-dev
On Mon, Jun 12, 2017 at 7:54 PM Ben Liblit via cfe-dev <[hidden email]> wrote:
Stephan Bergmann <[hidden email]> wrote:
> That looks to be by design, cf. DEF_TRAVERSE_STMT(InitListExpr,...) in
> include/clang/AST/RecursiveASTVisitor.h calling into
> TraverseSynOrSemInitListExpr twice (for "syntactic" and "semantic"
> InitListExpr).

Nutty!  But clearly by design, as you point out.  There seems to be no
way to detect this duplication short of maintaining my own "nodes I have
already seen" set, and no way to prevent double visits short of defining
my own RecursiveASTVisitor.

> In some of our LibreOffice RecursiveASTVisitor implementations we use
> [...] to traverse only once

Thank you for this example code, Stephan.  Unfortunately I see no way to
tell the AST Matcher system to use an alternate visitor of this sort.
Am I missing some API that would let me do this?  Or is the recursive
visit outside of my control if I'm using MatchFinder to drive my tool?

Generally, there are many reasons you might visit the same node twice - after all, the AST is not a tree, but a graph :) When you write tools that need to run on multiple TUs, you'll have to do some form of deduplication anyway.

Generally, use the fact that many nodes have pointer identity (you can deduplicate with a simple set on the pointers).
 

-- Ben
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: AST Matcher matches InitListExpr child twice

Tom Stellard via cfe-dev
Manuel Klimek <[hidden email]> wrote:
> Generally, there are many reasons you might visit the same node twice -
> after all, the AST is not a tree, but a graph :)

OK, fair enough.  Thank you for correcting my assumptions about this.

> Generally, use the fact that many nodes have pointer identity (you can
> deduplicate with a simple set on the pointers).

Understood.  Thanks!
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Loading...