source-code representation of an Expr

source-code representation of an Expr

Sam
Is there a relatively painless way to get the exact source-code representation of an Expr?  I've looked into using SourceManager::getCharacterData(E->getExprLoc()), but it's not really what I want.  I am fairly new to the clang API so I realize that I may have missed something obvious.

Thanks for your help,
Sam


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev

Re: source-code representation of an Expr

John McCall
On Dec 29, 2010, at 10:21 AM, Sam wrote:
> Is there a relatively painless way to get the exact source-code representation of an Expr?  I've looked into using SourceManager::getCharacterData(E->getExprLoc()), but it's not really what I want.  I am fairly new to the clang API so I realize that I may have missed something obvious.

Well, starting with the expression's SourceRange instead of a single SourceLocation would be a good start.

John.

Re: source-code representation of an Expr

Sam
Yes, I just don't know what to *do* with it ;-) In other words, I don't see a clear use of the SourceManager API once I have the SourceRange to extract the Expr's source-code.  I don't see any API calls that use SourceRange or beginning and ending SourceLocations for source-code extraction.


Re: source-code representation of an Expr

John McCall
On Dec 29, 2010, at 12:53 PM, Sam wrote:
> Yes, I just don't know what to *do* with it ;-) In other words, I don't see a clear use of the SourceManager API once I have the SourceRange to extract the Expr's source-code.  I don't see any API calls that use SourceRange or beginning and ending SourceLocations for source-code extraction.

We should probably make some API for this.

What you can do for now is something like the following:

  SourceRange range = expr->getSourceRange();
  if (range.getBegin().isMacroID() || range.getEnd().isMacroID()) {
    // handle this case: at least one endpoint comes from a macro instantiation
  } else if (!sourceManager.isFromSameFile(range.getBegin(), range.getEnd())) {
    // handle this case: the two endpoints are not in the same file
  } else {
    // getEnd() points at the start of the last token; move it past that token
    range.setEnd(preprocessor.getLocForEndOfToken(range.getEnd()));
    const char *begin = sourceManager.getCharacterData(range.getBegin());
    const char *end = sourceManager.getCharacterData(range.getEnd());
    llvm::StringRef string(begin, end - begin);
    // now you can do whatever you want
  }

John.

Re: source-code representation of an Expr

Sam
John, 

That works perfectly.  Thank you so much for your help!

Sam


Re: source-code representation of an Expr

Ted Kremenek
We should probably add this to a FAQ of some sort that documents how to do X with Clang.


Re: source-code representation of an Expr

Sam
In reply to this post by John McCall
What is the right way to address the first condition in the code you suggested?  I have been banging my head against my monitor long enough that I now admit defeat.  I've been looking mostly at the APIs for clang::Preprocessor and clang::SourceManager.  I've also read through the Clang Internals document.

The bigger issue is that I'd really just like to understand how to properly traverse the source-code representation of statements I encounter via the AST in an efficient manner.  Ideally this would include the ability to examine the source-code representation of a statement (or one of its subclasses) before or after preprocessing, e.g., before and after macro expansion.

Here's what I know so far:
1. For an expression "Expr * e", I can retrieve its source range via e->getSourceRange().
2. Using the location returned by SourceRange::getBegin(), I can call SourceManager::getCharacterData().
3. From what I can tell, SourceManager::getCharacterData returns source-code post macro-expansion.
4. If my expression doesn't contain any macros, the code that John McCall suggested works fine for my purposes.  However, I am interested in a number of cases where there are macros involved (specifically, I am looking at function arguments, i.e., my "expressions", that contain macros).

Can anyone help?  I feel that Clang is likely capable of providing me with the information that I want as-is, but I just can't figure this one out via API intuition.  If it's not, I'd be glad to submit a patch to the API if anyone can steer me in the right direction.  

Thanks in advance,
Sam


Re: source-code representation of an Expr

John McCall
On Jan 4, 2011, at 1:10 PM, Sam wrote:
> What is the right way to address the first condition in the code you suggested?  I have been banging my head against my monitor long enough that I now admit defeat.  I've been looking mostly at the APIs for clang::Preprocessor and clang::SourceManager.  I've also read through the Clang Internals document.

getCharacterData will give you the data for whatever location you give it.  The problem with macros is that there are multiple locations involved:  there's the spelling location, where the actual token was written/formed, and there's a chain of arbitrarily many instantiation locations, where the name of some macro was written.  Clang preserves this full macro-instantiation stack, and you can walk up from the spelling location (which is what's generally stored in the AST) through the chain of instantiation locations.  SourceManager::getInstantiationLoc() will jump the whole way for you, or you can walk step-by-step, in which case you need to understand a bit more about how Clang's SourceLocation abstraction works.

A SourceLocation is basically just an offset within a FileID.  A FileID is either a specific inclusion of a physical file or it's a macro instantiation buffer;  SourceLocation::isMacroID() tells you which one, although you can also (at much greater expense) ask the SourceManager for a location's FileID, look up the FileID's SourceManager::SLocEntry, and then ask that.  The SLocEntry for a macro location will have a SourceManager::InstantiationInfo that will tell you the range of the expression from which the macro was instantiated, i.e. moving exactly one level up the instantiation stack.

But I can't tell you what your project should do with this information.

John.