Weird behavior while parsing nested (and non) pragmas

Simone Pellegrini
Hi there,
I have a question about some odd behavior I am observing when parsing C
programs that contain nested pragmas.

The bottom line is that I need to extend some of the functions in Sema
in order to associate each pragma with the correct statement. Consider
the following code:

1 int main() {
2    #pragma omp parallel
3    {
4        #pragma omp barrier
5        #pragma omp master
6        ;
7    }
8 }

My overloaded version of ActOnCompoundStmt(SourceLocation lbrac,
SourceLocation rbrac, ...) is called every time a block is consumed.
When I print the locations of the left and right braces, I get
something odd:

1) ActOnCompoundStmt() -> left bracket 5:4, right bracket 7:2 (line:column)
2) ActOnCompoundStmt() -> left bracket 2:22, right bracket 8:1

Call 1 refers to the inner compound statement and call 2 to the body of
main. As you can see, the location of the left brace is wrong in both
calls: the inner '{' is on line 3 and the one opening main is on line 1.
Is this intended behavior, or am I doing something wrong? For
completeness, I should say that in order to parse the pragmas I manually
call ConsumeToken() of the Parser class. Could this be the problem?

cheers, Simone



_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev

Re: Weird behavior while parsing nested (and non) pragmas

Sebastian Redl

On Aug 25, 2010, at 8:47 AM, Simone Pellegrini wrote:

> I have to say,
> to be complete, that in order to parse the pragmas I manually call the
> ConsumeToken() of the Parser class, could be this the problem?

Yes. ConsumeToken returns the SourceLocation of the token just consumed. ParseCompoundStatement asserts that the current token is an opening brace and then calls ParseCompoundStatementBody (PCSB). PCSB doesn't contain an assert; it simply assumes that the first token is the opening brace and consumes it, storing the returned source location, which is then passed to ActOnCompoundStmt as the location of the lbrace.

Now, if you do your own thing there, consume the lbrace, and leave some other token in the stream for PCSB to consume, the source location you get will be that of that other token.
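
To make the mechanism concrete, here is a toy model in C++. It is only a sketch and not Clang's actual code: ToyParser, Loc, Token, and the helper functions are made-up names, and the token positions are chosen to resemble the inner block of the example program. It shows how a body parser that blindly treats the current token as the '{' reports the wrong location once something else has already consumed the real brace.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins for illustration only; these are not Clang classes.
struct Loc { int line; int col; };
struct Token { std::string text; Loc loc; };

class ToyParser {
public:
  explicit ToyParser(std::vector<Token> toks) : toks_(std::move(toks)) {}

  const Token &cur() const { return toks_[pos_]; }

  // Advances past the current token and returns the location of the
  // token that was just consumed.
  Loc consumeToken() {
    Loc l = toks_[pos_].loc;
    if (pos_ + 1 < toks_.size()) ++pos_;  // never step past "<eof>"
    return l;
  }

  // Checks that the current token is '{' before delegating.
  Loc parseCompoundStatement() {
    assert(cur().text == "{" && "not a compound statement");
    return parseCompoundStatementBody();
  }

  // No check here: whatever token is current is taken to be the '{'.
  // Returns the location that would be reported as the left brace.
  Loc parseCompoundStatementBody() {
    Loc lbrace = consumeToken();
    while (cur().text != "}" && cur().text != "<eof>") consumeToken();
    consumeToken();  // the '}'
    return lbrace;
  }

private:
  std::vector<Token> toks_;
  std::size_t pos_ = 0;
};

// Tokens of a block "{ ; }" with '{' at 3:4, ';' at 5:4 and '}' at 7:2.
std::vector<Token> sampleBlock() {
  std::vector<Token> t;
  t.push_back(Token{"{", Loc{3, 4}});
  t.push_back(Token{";", Loc{5, 4}});
  t.push_back(Token{"}", Loc{7, 2}});
  t.push_back(Token{"<eof>", Loc{8, 1}});
  return t;
}

// Normal path: the brace is still the current token.
Loc lbraceNormal() {
  ToyParser p(sampleBlock());
  return p.parseCompoundStatement();
}

// Faulty path: custom code has already consumed '{', so the body
// parser consumes ';' and reports its location as the left brace.
Loc lbraceAfterStrayConsume() {
  ToyParser p(sampleBlock());
  p.consumeToken();
  return p.parseCompoundStatementBody();
}
```

In this model lbraceNormal() yields 3:4, while lbraceAfterStrayConsume() yields 5:4, the position of the ';' that was left in the stream.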

Sebastian

Re: Weird behavior while parsing nested (and non) pragmas

Simone Pellegrini
On 08/25/2010 07:59 PM, Sebastian Redl wrote:

> Yes. ConsumeToken returns the SourceLocation of the token just consumed. Now, ParseCompoundStatement asserts that the current token is an opening brace and then calls ParseCompoundStatementBody. PCSB doesn't contain an assert. It simply assumes that the first token is the opening brace and consumes it, storing the returned source location, which is then passed to ActOnCompoundStmt as the location of the lbrace.

I see.

> Now, if you do your own thing there, consume the lbrace, and leave some other random token in the stream for PCSB to consume, obviously the source location you get would be that of that random thing.

OK, I understand, but how can I insert a token into the token stream?

Thanks for the help, Simone

Re: Weird behavior while parsing nested (and non) pragmas

Sebastian Redl

On Aug 25, 2010, at 12:26 PM, Simone Pellegrini wrote:

> Ok, I understand, but how can I insert a token in the token stream?

You could just manipulate Tok as stored in Parser, but it would be better IMO to just capture the source location yourself and pass it on.
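
As a rough illustration of "capture the source location yourself and pass it on", here is a toy sketch in C++. It is not Clang code: ToyParser, parseBlockBody, and its savedLBrace parameter are hypothetical names invented for this example. The idea is that the code which consumes the '{' keeps the location returned by the consume call and hands it to the body parser, which then has no need to guess.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins for illustration only; these are not Clang classes.
struct Loc { int line; int col; };
struct Token { std::string text; Loc loc; };

class ToyParser {
public:
  explicit ToyParser(std::vector<Token> toks) : toks_(std::move(toks)) {}

  // Advances past the current token and returns its location.
  Loc consumeToken() {
    Loc l = toks_[pos_].loc;
    if (pos_ + 1 < toks_.size()) ++pos_;  // never step past "<eof>"
    return l;
  }

  // Hypothetical body parser. If the caller has already consumed the
  // '{', it passes the saved location in; otherwise the current token
  // is consumed and taken to be the brace.
  Loc parseBlockBody(const Loc *savedLBrace = nullptr) {
    Loc lbrace = savedLBrace ? *savedLBrace : consumeToken();
    while (toks_[pos_].text != "}" && toks_[pos_].text != "<eof>")
      consumeToken();
    consumeToken();  // the '}'
    return lbrace;
  }

private:
  std::vector<Token> toks_;
  std::size_t pos_ = 0;
};

// Custom handling consumes '{' but remembers where it was, then
// passes that location on to the body parser.
Loc parseWithCapturedLocation() {
  std::vector<Token> t;
  t.push_back(Token{"{", Loc{3, 4}});
  t.push_back(Token{";", Loc{5, 4}});
  t.push_back(Token{"}", Loc{7, 2}});
  t.push_back(Token{"<eof>", Loc{8, 1}});
  ToyParser p(t);
  Loc saved = p.consumeToken();
  return p.parseBlockBody(&saved);
}
```

Here parseWithCapturedLocation() reports 3:4 for the left brace even though the brace was consumed before the body parser ran.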

After reading the existing code more carefully, I still don't understand how you got to where you are without tripping over an assertion.

Sebastian

Re: Weird behavior while parsing nested (and non) pragmas

Simone Pellegrini
On 08/25/2010 10:47 PM, Sebastian Redl wrote:

> You could just manipulate Tok as stored in Parser, but it would be better IMO to just capture the source location yourself and pass it on.
>
> After reading the existing code more carefully, I still don't understand how you got to where you are without tripping over an assertion.
>
:) That's comforting.

I found a way to insert a new token into the stream by using
EnterTokenStream from the Preprocessor class.
But there is still a problem, for example when the code is the following:

1 int main() {
2    {
3        #pragma omp barrier
4        #pragma omp master
5        ;
6    }
7 }


If I insert a 'random' token into the stream, the parser gives an error
saying "Expecting an expression". So it looks like the '{' has already
been consumed before the #pragmas are handled. That actually makes
sense, but it doesn't explain why the SourceLocation for the '{' is
wrong. What I do see is that if I insert a token that the parser
accepts, for example a semicolon (an empty statement), the location
reported for the left brace becomes 3:4 (the one I forced for the ';'
token).

This could solve my problem, but I don't like having to insert a ';'
every time I handle pragmas.

For completeness, my pragma handling is based on Clang 2.7, but I
don't expect the current svn version to behave differently.

Any help is appreciated. Cheers, Simone

Re: Weird behavior while parsing nested (and non) pragmas

Simone Pellegrini
In reply to this post by Sebastian Redl
On 08/25/2010 10:47 PM, Sebastian Redl wrote:

> You could just manipulate Tok as stored in Parser, but it would be better IMO to just capture the source location yourself and pass it on.
>
> After reading the existing code more carefully, I still don't understand how you got to where you are without tripping over an assertion.
>
I actually managed to solve the problem by overwriting the
SourceLocation of the left brace passed to the ActOnCompoundStmt()
function. This works for me, but it is just a workaround for strange
behavior that looks to me like a bug in the Clang parser.

cheers, Simone

Re: Weird behavior while parsing nested (and non) pragmas

Douglas Gregor
In reply to this post by Simone Pellegrini

On Aug 26, 2010, at 1:00 AM, Simone Pellegrini wrote:

> I found the way to insert a new token in the stream by using the
> EnterTokenStream from the preprocessor class.

The parser keeps a one-token cache (in Tok), so when you're pushing a new token back into the stream, you need to account for that.
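
A toy model in C++ may help to picture the one-token cache. It is a sketch under stated assumptions and not Clang's implementation: ToyStream, ToyParser, injectNaive, and injectBeforeCurrent are invented names, and tokens are plain strings. The parser has already pulled its current token out of the stream, so a token pushed into the stream ends up behind the cached one unless the cached token is re-queued as well.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <vector>

// Toy token source for illustration only. Tokens pushed back are
// returned before the remaining input.
struct ToyStream {
  std::deque<std::string> pending;

  std::string lex() {
    if (pending.empty()) return "<eof>";
    std::string t = pending.front();
    pending.pop_front();
    return t;
  }

  void pushBack(const std::string &t) { pending.push_front(t); }
};

// Toy parser with a one-token cache: `tok` has already been taken
// out of the stream when the parser looks at it.
struct ToyParser {
  ToyStream &stream;
  std::string tok;

  explicit ToyParser(ToyStream &s) : stream(s), tok(s.lex()) {}

  // Returns the cached token and refills the cache from the stream.
  std::string consume() {
    std::string old = tok;
    tok = stream.lex();
    return old;
  }

  // Ignores the cache: the new token lands behind the cached one.
  void injectNaive(const std::string &t) { stream.pushBack(t); }

  // Accounts for the cache: re-queue the cached token, then make the
  // new token the current one.
  void injectBeforeCurrent(const std::string &t) {
    stream.pushBack(tok);
    tok = t;
  }
};

// Consumes everything up to "<eof>" and returns the order seen.
std::vector<std::string> drain(ToyParser &p) {
  std::vector<std::string> out;
  while (p.tok != "<eof>") out.push_back(p.consume());
  return out;
}

ToyStream makeStream() {
  ToyStream s;
  s.pending.push_back(";");
  s.pending.push_back("}");
  return s;
}

std::vector<std::string> orderNaive() {
  ToyStream s = makeStream();
  ToyParser p(s);  // the cache now holds ";"
  p.injectNaive("{");
  return drain(p);
}

std::vector<std::string> orderCacheAware() {
  ToyStream s = makeStream();
  ToyParser p(s);  // the cache now holds ";"
  p.injectBeforeCurrent("{");
  return drain(p);
}
```

In this model orderNaive() sees the tokens as ; { } while orderCacheAware() sees { ; }, which is the order one would presumably want when re-inserting a brace.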

        - Doug

