Why is RecursiveASTVisitor::TraverseStmt structured differently from other Traverse*s?

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

Why is RecursiveASTVisitor::TraverseStmt structured differently from other Traverse*s?

David Zarzycki via cfe-dev
I have been experimenting with adjusting RecursiveASTVisitor to perform VisitPre* and VisitPost* (via corresponding WalkUpFromPre* and WalkUpFromPost* methods) surrounding the inner component traversals within each Traverse* definition.  This would allow one to easily define visitors that can e.g. keep track of the current traversal state, do pushes/pops around traversals of selected node kinds, etc.

This works great with Decl and Type nodes, as DEF_TRAVERSE_DECL and DEF_TRAVERSE_TYPE are straightforward: if a node D has children D1 and D2, which in turn have children d11 and d12 / d21 and d22 respectively, the descendants of D will be processed in this order:

D1, d11, d12, D2, d21, d22

So, my VisitPre*Decl(D1) and VisitPost*Decl(D1) would surround all the visitations of d11 and d12 and their descendants and none of D2’s visitations, as one might expect.

However, DEF_TRAVERSE_STMT is another matter entirely.  It endows each Traverse* function with a "DataRecursionQueue *" parameter.  No specific traversals definitions seem to use it idiosyncratically though — instead, all that parameter seems to do is permit the root TraverseStmt definition to process descendants of a Stmt S in the following order, if I am not mistaken:

S1, S2, s11, s12, s21, s22

This defeats my intention of fully surrounding the processing of e.g. S1 and its descendants with pre and post visitation.  But I can’t discern why it must be done this way — the reasoning is not documented anywhere.  Anyone know why Stmts/Exprs are processed in this order, but not Decls and Types?

Thanks,

Dave
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Why is RecursiveASTVisitor::TraverseStmt structured differently from other Traverse*s?

David Zarzycki via cfe-dev
On Mon, 6 Jan 2020 at 17:35, David Rector via cfe-dev <[hidden email]> wrote:
I have been experimenting with adjusting RecursiveASTVisitor to perform VisitPre* and VisitPost* (via corresponding WalkUpFromPre* and WalkUpFromPost* methods) surrounding the inner component traversals within each Traverse* definition.  This would allow one to easily define visitors that can e.g. keep track of the current traversal state, do pushes/pops around traversals of selected node kinds, etc.

Can you say more about what you want to do here? The kinds of things I think you're describing can already be done straightforwardly by overriding the Traverse* methods, and we have several visitors in Clang that do that kind of thing, so either I'm not understanding or there may already be a good way to do what you're trying to do. (For example, at the top of lib/Sema/SemaTemplateVariadic.cpp, the CollectUnexpandedParameterPacksVisitor keeps track of and pushes/pops traversal state while it's traversing a LambdaExpr.)
 
This works great with Decl and Type nodes, as DEF_TRAVERSE_DECL and DEF_TRAVERSE_TYPE are straightforward: if a node D has children D1 and D2, which in turn have children d11 and d12 / d21 and d22 respectively, the descendants of D will be processed in this order:

D1, d11, d12, D2, d21, d22

So, my VisitPre*Decl(D1) and VisitPost*Decl(D1) would surround all the visitations of d11 and d12 and their descendants and none of D2’s visitations, as one might expect.

However, DEF_TRAVERSE_STMT is another matter entirely.  It endows each Traverse* function with a "DataRecursionQueue *" parameter.  No specific traversals definitions seem to use it idiosyncratically though — instead, all that parameter seems to do is permit the root TraverseStmt definition to process descendants of a Stmt S in the following order, if I am not mistaken:

S1, S2, s11, s12, s21, s22

This defeats my intention of fully surrounding the processing of e.g. S1 and its descendants with pre and post visitation.  But I can’t discern why it must be done this way — the reasoning is not documented anywhere.  Anyone know why Stmts/Exprs are processed in this order, but not Decls and Types?

The purpose is to perform data recursion: instead of recursing using the stack (and potentially overflowing the stack if a complex, deeply-nested expression is encountered), a worklist is maintained separate from the execution stack and a visitor can execute in mostly constant stack space.

This stack space optimization is automatically turned off when processing Stmt subclasses for which the visitor defines custom behavior (if that custom behavior doesn't support a data recursion queue) -- that's done by the TRAVERSE_STMT_BASE macro. If you want to support more hooks that need to turn off data recursion, you could follow a similar pattern there.
 
Thanks,

Dave
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Why is RecursiveASTVisitor::TraverseStmt structured differently from other Traverse*s?

David Zarzycki via cfe-dev
Good explanation, thanks.  I have no great need for this functionality with Stmt/Exprs right now, mainly Decls and Types.  But I might think about it further later on — would be nice if the nodes could still be queued up in the same order they would be as if processed in a stack.  There’s already some reordering of the Queue going on during TraverseStmt, so perhaps it could be done without significant additional cost.

My purpose, FWIW: a general DependencyVisitor<Derived> : RecursiveASTVisitor<Derived> that a) maintains a stack of full traversal state information and b) re-Traverses any Decls that are used in Types (e.g. the CXXRecordDecl behind a RecordType), and Types that are used in other Types or in Exprs etc.  With such a visitor you could, say, call Visit(SomeFunctionDecl) and traverse all the Decl dependencies of its Types, and those dependencies’ Decl dependencies etc., never losing track of exactly why you’re traversing some entity, should you need that info in your particular visitor.  

It is surely less important, though, to maintain all that info for Stmts/Exprs, than it is for Types/Decls, so okay for now.

As for the utility of VisitPre*Decl and VisitPost*Decl vs. overriding TraverseDecl to call stuff before and after the Base::TraverseDecl: I prefer the former so I can make use of the dynamic type information in both the pre- and post-traversal visitations.  E.g. I can define simple VisitPreNamedDecl and VisitPostNamedDecl and VisitPreCXXRecordDecl and VisitPostCXXRecordDecl, to do pushes and pops with different info; without those I’d have to do all of that in the base TraverseDecl override via isa/dyn_casts etc.  Not a huge difference though, I admit.

Thanks again,

Dave

On Jan 6, 2020, at 9:26 PM, Richard Smith <[hidden email]> wrote:

On Mon, 6 Jan 2020 at 17:35, David Rector via cfe-dev <[hidden email]> wrote:
I have been experimenting with adjusting RecursiveASTVisitor to perform VisitPre* and VisitPost* (via corresponding WalkUpFromPre* and WalkUpFromPost* methods) surrounding the inner component traversals within each Traverse* definition.  This would allow one to easily define visitors that can e.g. keep track of the current traversal state, do pushes/pops around traversals of selected node kinds, etc.

Can you say more about what you want to do here? The kinds of things I think you're describing can already be done straightforwardly by overriding the Traverse* methods, and we have several visitors in Clang that do that kind of thing, so either I'm not understanding or there may already be a good way to do what you're trying to do. (For example, at the top of lib/Sema/SemaTemplateVariadic.cpp, the CollectUnexpandedParameterPacksVisitor keeps track of and pushes/pops traversal state while it's traversing a LambdaExpr.)
 
This works great with Decl and Type nodes, as DEF_TRAVERSE_DECL and DEF_TRAVERSE_TYPE are straightforward: if a node D has children D1 and D2, which in turn have children d11 and d12 / d21 and d22 respectively, the descendants of D will be processed in this order:

D1, d11, d12, D2, d21, d22

So, my VisitPre*Decl(D1) and VisitPost*Decl(D1) would surround all the visitations of d11 and d12 and their descendants and none of D2’s visitations, as one might expect.

However, DEF_TRAVERSE_STMT is another matter entirely.  It endows each Traverse* function with a "DataRecursionQueue *" parameter.  No specific traversals definitions seem to use it idiosyncratically though — instead, all that parameter seems to do is permit the root TraverseStmt definition to process descendants of a Stmt S in the following order, if I am not mistaken:

S1, S2, s11, s12, s21, s22

This defeats my intention of fully surrounding the processing of e.g. S1 and its descendants with pre and post visitation.  But I can’t discern why it must be done this way — the reasoning is not documented anywhere.  Anyone know why Stmts/Exprs are processed in this order, but not Decls and Types?

The purpose is to perform data recursion: instead of recursing using the stack (and potentially overflowing the stack if a complex, deeply-nested expression is encountered), a worklist is maintained separate from the execution stack and a visitor can execute in mostly constant stack space.

This stack space optimization is automatically turned off when processing Stmt subclasses for which the visitor defines custom behavior (if that custom behavior doesn't support a data recursion queue) -- that's done by the TRAVERSE_STMT_BASE macro. If you want to support more hooks that need to turn off data recursion, you could follow a similar pattern there.
 
Thanks,

Dave
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev


_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Why is RecursiveASTVisitor::TraverseStmt structured differently from other Traverse*s?

David Zarzycki via cfe-dev
Disregard this issue — after looking at the code further I now think the way TraverseStmt is already implemented should, indeed, already process the Stmts in the proper nested order — the std::reverse of just the newly-added child nodes, combined with always popping off the back , should achieve the same processing order as recursive calls on the stack a la TraverseDecl, so I must have botched other implementation details somewhere.  Sorry for the bother,

Dave

On Jan 6, 2020, at 10:22 PM, David Rector <[hidden email]> wrote:

Good explanation, thanks.  I have no great need for this functionality with Stmt/Exprs right now, mainly Decls and Types.  But I might think about it further later on — would be nice if the nodes could still be queued up in the same order they would be as if processed in a stack.  There’s already some reordering of the Queue going on during TraverseStmt, so perhaps it could be done without significant additional cost.

My purpose, FWIW: a general DependencyVisitor<Derived> : RecursiveASTVisitor<Derived> that a) maintains a stack of full traversal state information and b) re-Traverses any Decls that are used in Types (e.g. the CXXRecordDecl behind a RecordType), and Types that are used in other Types or in Exprs etc.  With such a visitor you could, say, call Visit(SomeFunctionDecl) and traverse all the Decl dependencies of its Types, and those dependencies’ Decl dependencies etc., never losing track of exactly why you’re traversing some entity, should you need that info in your particular visitor.  

It is surely less important, though, to maintain all that info for Stmts/Exprs, than it is for Types/Decls, so okay for now.

As for the utility of VisitPre*Decl and VisitPost*Decl vs. overriding TraverseDecl to call stuff before and after the Base::TraverseDecl: I prefer the former so I can make use of the dynamic type information in both the pre- and post-traversal visitations.  E.g. I can define simple VisitPreNamedDecl and VisitPostNamedDecl and VisitPreCXXRecordDecl and VisitPostCXXRecordDecl, to do pushes and pops with different info; without those I’d have to do all of that in the base TraverseDecl override via isa/dyn_casts etc.  Not a huge difference though, I admit.

Thanks again,

Dave

On Jan 6, 2020, at 9:26 PM, Richard Smith <[hidden email]> wrote:

On Mon, 6 Jan 2020 at 17:35, David Rector via cfe-dev <[hidden email]> wrote:
I have been experimenting with adjusting RecursiveASTVisitor to perform VisitPre* and VisitPost* (via corresponding WalkUpFromPre* and WalkUpFromPost* methods) surrounding the inner component traversals within each Traverse* definition.  This would allow one to easily define visitors that can e.g. keep track of the current traversal state, do pushes/pops around traversals of selected node kinds, etc.

Can you say more about what you want to do here? The kinds of things I think you're describing can already be done straightforwardly by overriding the Traverse* methods, and we have several visitors in Clang that do that kind of thing, so either I'm not understanding or there may already be a good way to do what you're trying to do. (For example, at the top of lib/Sema/SemaTemplateVariadic.cpp, the CollectUnexpandedParameterPacksVisitor keeps track of and pushes/pops traversal state while it's traversing a LambdaExpr.)
 
This works great with Decl and Type nodes, as DEF_TRAVERSE_DECL and DEF_TRAVERSE_TYPE are straightforward: if a node D has children D1 and D2, which in turn have children d11 and d12 / d21 and d22 respectively, the descendants of D will be processed in this order:

D1, d11, d12, D2, d21, d22

So, my VisitPre*Decl(D1) and VisitPost*Decl(D1) would surround all the visitations of d11 and d12 and their descendants and none of D2’s visitations, as one might expect.

However, DEF_TRAVERSE_STMT is another matter entirely.  It endows each Traverse* function with a "DataRecursionQueue *" parameter.  No specific traversals definitions seem to use it idiosyncratically though — instead, all that parameter seems to do is permit the root TraverseStmt definition to process descendants of a Stmt S in the following order, if I am not mistaken:

S1, S2, s11, s12, s21, s22

This defeats my intention of fully surrounding the processing of e.g. S1 and its descendants with pre and post visitation.  But I can’t discern why it must be done this way — the reasoning is not documented anywhere.  Anyone know why Stmts/Exprs are processed in this order, but not Decls and Types?

The purpose is to perform data recursion: instead of recursing using the stack (and potentially overflowing the stack if a complex, deeply-nested expression is encountered), a worklist is maintained separate from the execution stack and a visitor can execute in mostly constant stack space.

This stack space optimization is automatically turned off when processing Stmt subclasses for which the visitor defines custom behavior (if that custom behavior doesn't support a data recursion queue) -- that's done by the TRAVERSE_STMT_BASE macro. If you want to support more hooks that need to turn off data recursion, you could follow a similar pattern there.
 
Thanks,

Dave
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev


_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev