Questions on implementing a custom preprocessor

Questions on implementing a custom preprocessor

Nat! via cfe-dev
Hello:

I just started looking at libClang recently, so please pardon me if this is a noobish question.

I want to essentially build a custom preprocessing stage and I am not sure where to get started.  I want to be able to generate some code based on some code in my source files, things like:
 1. Insert statements in selected functions
 2. Insert members in a struct

And I want to be able to do this as part of the compilation. In other words, I don't see this as a refactoring: the original source file should remain intact, all compilation error messages should reference line numbers in the original file, and I'm not really concerned with keeping the intermediate/preprocessed version anywhere.  Ideally this could be done in a single step.

I have found the documentation for the C API for libclang, and it looks like it is mainly for reading the AST; you can't actually start a compilation or alter the AST.

I also found the C++ API for the "Driver" code, and that looks more functional, but it isn't mentioned as a recommended API, so I wanted to check whether I am missing something in the C API...

There is also the plugin API, and I found some examples of how to rewrite code there using the "Rewriter" class, but that looks like it's designed for refactoring, not preprocessing.  Specifically, it doesn't output or keep track of line markers, and after you rewrite the code there doesn't seem to be a way to compile the new version of the code.

The best strategy that I have for moving forward is to use the "Driver" C++ API to run a custom plugin and do a preprocessor stage only, then use a "RecursiveASTVisitor" to go through the whole AST and output it into a temporary buffer/file, optionally making inserts/edits based on conditions and manually implementing line markers using "getSourceRange". I would loop that process until there are no more changes, then feed the temporary buffer into the compilation stage...
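To make the "manually implementing line markers" part concrete, here is a minimal sketch in standard C++ with no Clang dependency. It copies a source buffer, inserts one piece of generated text after a chosen line, and then emits a `#line` directive so that diagnostics for the following lines still point at the original file. The function name `insert_with_line_markers` and its parameters are hypothetical, and the sketch assumes the file name contains no characters that would need escaping inside a string literal.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper (not part of Clang): copies `source`, inserts
// `generated` after the 1-based line `after_line`, and emits #line
// directives so the compiler's line count matches the original file.
std::string insert_with_line_markers(const std::string &source,
                                     const std::string &file_name,
                                     unsigned after_line,
                                     const std::string &generated) {
  assert(after_line >= 1 && "lines are numbered from 1");
  std::istringstream in(source);
  std::ostringstream out;

  // Attribute everything that follows to the original file, starting at line 1.
  out << "#line 1 \"" << file_name << "\"\n";

  std::string line;
  unsigned line_no = 0;
  while (std::getline(in, line)) {
    ++line_no;
    out << line << '\n';
    if (line_no == after_line) {
      out << generated << '\n';
      // Resynchronize: the next original line is line_no + 1.
      out << "#line " << (line_no + 1) << " \"" << file_name << "\"\n";
    }
  }
  return out.str();
}
```

If `after_line` is larger than the number of lines in `source`, nothing is inserted. Errors inside the generated text itself are reported at the line number that follows the insertion point, which is one possible choice among several.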

There is also the issue that sometimes my messages to the preprocessor aren't actually valid statements (undeclared identifiers), and it seems that the AST visiting functionality completely ignores any erroneous statements... The best workaround I can think of is to add an option to define them as some kind of internal function/variable that I can filter for...

Any feedback would be appreciated.  Thanks :)



_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Re: Questions on implementing a custom preprocessor

Nat! via cfe-dev
2017-02-09 7:47 GMT+07:00 Dave Butler via cfe-dev <[hidden email]>:
> Hello:
>
> I just started looking at libClang recently, so please pardon me if this is a noobish question.
>
> I want to essentially build a custom preprocessing stage and I am not sure where to get started.  I want to be able to generate some code based on some code in my source files, things like:
>  1. Insert statements in selected functions
>  2. Insert members in a struct
>
> And I want to be able to do this as part of the compilation. In other words, I don't see this as a refactoring: the original source file should remain intact, all compilation error messages should reference line numbers in the original file, and I'm not really concerned with keeping the intermediate/preprocessed version anywhere.  Ideally this could be done in a single step.
>
> I have found the documentation for the C API for libclang, and it looks like it is mainly for reading the AST; you can't actually start a compilation or alter the AST.
>
> I also found the C++ API for the "Driver" code, and that looks more functional, but it isn't mentioned as a recommended API, so I wanted to check whether I am missing something in the C API...
>
> There is also the plugin API, and I found some examples of how to rewrite code there using the "Rewriter" class, but that looks like it's designed for refactoring, not preprocessing.  Specifically, it doesn't output or keep track of line markers, and after you rewrite the code there doesn't seem to be a way to compile the new version of the code.

The plugin API is not confined to refactoring; it allows running arbitrary transformations over the AST, so this approach probably fits your needs.


> The best strategy that I have for moving forward is to use the "Driver" C++ API to run a custom plugin and do a preprocessor stage only, then use a "RecursiveASTVisitor" to go through the whole AST and output it into a temporary buffer/file, optionally making inserts/edits based on conditions and manually implementing line markers using "getSourceRange". I would loop that process until there are no more changes, then feed the temporary buffer into the compilation stage...

 
If you are not concerned about distributing your product, you could patch clang instead. Just put your transformation somewhere in the clang sources; `Sema::ActOnEndOfTranslationUnit` may be a good place. Where you put the call to your code depends on what processing you need after your code is inserted (for instance, whether you need template instantiations). If your transformation produces an AST that is ready for codegen, the call may be placed in ParseAST, after the parser runs but before the calls to the Handle* methods of ASTConsumer. This approach may be simpler to start with.

> There is also the issue that sometimes my messages to the preprocessor aren't actually valid statements (undeclared identifiers), and it seems that the AST visiting functionality completely ignores any erroneous statements... The best workaround I can think of is to add an option to define them as some kind of internal function/variable that I can filter for...

If you want to pass your AST to codegen, you must keep it perfect: codegen must not see invalid statements or declarations. If new names are introduced, they must be properly declared. Put the source code that your transformation must produce into a file and run `clang -cc1 -ast-dump file.cpp` to see what your code must look like in the AST, then try to build a similar tree.
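As an illustration of the kind of file one might feed to `-ast-dump`, here is a hand-written sketch of what the output of the two transformations from the question (an inserted struct member and inserted statements in a function) could look like. All names here (`Point`, `generation`, `g_translate_calls`, `translate`) are hypothetical and chosen only for the example. The point is that every name the inserted code uses is declared before its first use.

```cpp
// file.cpp: hand-written version of the code the transformation should
// produce. All names are hypothetical.

struct Point {
  int x;
  int y;
  int generation;  // inserted member
};

// Inserted declaration: the inserted statements below refer to this
// name, so it has to be declared before the function that uses it.
static int g_translate_calls = 0;

void translate(Point &p, int dx, int dy) {
  ++g_translate_calls;  // inserted statement
  ++p.generation;       // inserted statement (uses the inserted member)
  p.x += dx;            // original body
  p.y += dy;
}
```

Dumping this file and comparing it with the dump of the untransformed source shows which AST nodes the transformation would need to create.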


> Any feedback would be appreciated.  Thanks :)


