Clang macro expansion differs from GCC

classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

Clang macro expansion differs from GCC

David Peixotto
I have found a discrepancy in behavior between the clang preprocessor and gcc. 

$ cat t.c 
#define X 0 .. 1

0 .. 1 // Ok
X      // Bad: space between ..

$ gcc -E t.c 
<snip>

0 .. 1
0 .. 1

$ clang -E t.c 
<snip>

0 .. 1
0 . . 1

The problem is that when ".." is in the macro and the macro gets expanded, it adds a space between the dots. As you can see, the space is not added when the ".." is entered directly in the text. GCC does not insert a space in either case. 

I'm not sure this qualifies as a bug, but the reason I need this to work is that I am trying to compile GHC (a Haskell compiler) with clang. It uses GCC to preprocess c-- files (http://www.cminusminus.org) before parsing them and ".." is a valid operator in c--. Adding the extra space messes up the subsequent parsing phase.

I'd be happy to take a look at fixing this myself if someone could point me in the right direction.

Thanks!
-Dave
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Clang macro expansion differs from GCC

Chris Lattner

On Apr 8, 2010, at 8:34 AM, David Peixotto wrote:

I have found a discrepancy in behavior between the clang preprocessor and gcc. 

$ cat t.c 
#define X 0 .. 1

0 .. 1 // Ok
X      // Bad: space between ..

$ gcc -E t.c 
<snip>

0 .. 1
0 .. 1

$ clang -E t.c 
<snip>

0 .. 1
0 . . 1

The problem is that when ".." is in the macro and the macro gets expanded, it adds a space between the dots. As you can see, the space is not added when the ".." is entered directly in the text. GCC does not insert a space in either case. 

I'm not sure this qualifies as a bug, but the reason I need this to work is that I am trying to compile GHC (a Haskell compiler) with clang. It uses GCC to preprocess c-- files (http://www.cminusminus.org) before parsing them and ".." is a valid operator in c--. Adding the extra space messes up the subsequent parsing phase.

Hi David,

I'm not particularly interested in fixing this: people abusing the C preprocessor to preprocess things that aren't C don't get much sympathy from me.  However, if you are able to produce a patch that works, doesn't break anything else, and doesn't affect performance, I'd be happy to apply it.


I'd be happy to take a look at fixing this myself if someone could point me in the right direction.

lib/Frontend/PrintPreprocessedOutput.cpp is the code that outputs a .i file, the "AvoidConcat" stuff is what you want.  It is what decides to add spaces to prevent three . tokens from looking like a ... for example.

-Chris


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Clang macro expansion differs from GCC

David Peixotto
Hi Chris,

Thanks for the quick response.

On Apr 8, 2010, at 11:48 AM, Chris Lattner wrote:
On Apr 8, 2010, at 8:34 AM, David Peixotto wrote:

I have found a discrepancy in behavior between the clang preprocessor and gcc. 


Hi David,

I'm not particularly interested in fixing this: people abusing the C preprocessor to preprocess things that aren't C don't get much sympathy from me.  However, if you are able to produce a patch that works, doesn't break anything else, and doesn't affect performance, I'd be happy to apply it.

I can fully understand your position on this point. I thought I'd report it since the behavior was different than GCC and did prevent clang from being used as a drop-in replacement.



I'd be happy to take a look at fixing this myself if someone could point me in the right direction.

lib/Frontend/PrintPreprocessedOutput.cpp is the code that outputs a .i file, the "AvoidConcat" stuff is what you want.  It is what decides to add spaces to prevent three . tokens from looking like a ... for example.

Great, thanks for the pointer. I'll take a look and if I can come up with something reasonable, I will send a patch.

-Dave



_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Clang macro expansion differs from GCC

David Peixotto
In reply to this post by Chris Lattner
Hi Chris,

I took a look at modifying the preprocessor output and it was pretty
straightforward. I have attached a patch that makes the preprocessed output
less conservative in adding spaces between periods. It will now add a space to
avoid turning ".. ." into "...", but does not always add a space between two
consecutive dots.

The change was done by keeping track of the last two tokens that were output
and testing against those two tokens to avoid creating "...". Previously only
the last output token was tracked and tested against, which caused any two
consecutive . tokens to have a space between them.

The patch passes all the tests in the tests subdirectory. I modified one test
to reflect the output changing from ". . ." to ".. .". This output happens to
match the expected output in the test comment. I also added a test for the
original case that was causing me problems.

Please let me know if there are more tests I should run or anything else that
needs to happen for the patch to be accepted. Thanks!

-David



On Apr 8, 2010, at 11:48 AM, Chris Lattner wrote:


On Apr 8, 2010, at 8:34 AM, David Peixotto wrote:

I have found a discrepancy in behavior between the clang preprocessor and gcc. 

$ cat t.c 
#define X 0 .. 1

0 .. 1 // Ok
X      // Bad: space between ..

$ gcc -E t.c 
<snip>

0 .. 1
0 .. 1

$ clang -E t.c 
<snip>

0 .. 1
0 . . 1

The problem is that when ".." is in the macro and the macro gets expanded, it adds a space between the dots. As you can see, the space is not added when the ".." is entered directly in the text. GCC does not insert a space in either case. 

I'm not sure this qualifies as a bug, but the reason I need this to work is that I am trying to compile GHC (a Haskell compiler) with clang. It uses GCC to preprocess c-- files (http://www.cminusminus.org) before parsing them and ".." is a valid operator in c--. Adding the extra space messes up the subsequent parsing phase.

Hi David,

I'm not particularly interested in fixing this: people abusing the C preprocessor to preprocess things that aren't C don't get much sympathy from me.  However, if you are able to produce a patch that works, doesn't break anything else, and doesn't affect performance, I'd be happy to apply it.


I'd be happy to take a look at fixing this myself if someone could point me in the right direction.

lib/Frontend/PrintPreprocessedOutput.cpp is the code that outputs a .i file, the "AvoidConcat" stuff is what you want.  It is what decides to add spaces to prevent three . tokens from looking like a ... for example.

-Chris



_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev

PPdots.patch (4K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Clang macro expansion differs from GCC

Chris Lattner

On Apr 11, 2010, at 11:43 AM, David Peixotto wrote:

Hi Chris,

I took a look at modifying the preprocessor output and it was pretty
straightforward. I have attached a patch that makes the preprocessed output
less conservative in adding spaces between periods. It will now add a space to
avoid turning ".. ." into "...", but does not always add a space between two
consecutive dots.

The change was done by keeping track of the last two tokens that were output
and testing against those two tokens to avoid creating "...". Previously only
the last output token was tracked and tested against, which caused any two
consecutive . tokens to have a space between them.

The patch passes all the tests in the tests subdirectory. I modified one test
to reflect the output changing from ". . ." to ".. .". This output happens to
match the expected output in the test comment. I also added a test for the
original case that was causing me problems.

Please let me know if there are more tests I should run or anything else that
needs to happen for the patch to be accepted. Thanks!

This patch looks great, except that it happens to conflict on mainline.  :(  Please send and updated patch and I'll be happy to apply it for you.  One minor thing is to watch & placement in "const Token& PrevPrevTok, "

Thanks for working on this!

-Chris



_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Clang macro expansion differs from GCC

David Peixotto
Hi Chris,

Here is an updated patch that applied cleanly to the trunk. I can see that it is quite a moving target so hopefully it is still valid by the time you get it :) I also fixed the formatting of the & token to be consistent with the rest of the code. Please let me know if you see any more issues. Thanks!

-Dave




On Apr 12, 2010, at 7:12 PM, Chris Lattner wrote:


On Apr 11, 2010, at 11:43 AM, David Peixotto wrote:

Hi Chris,

I took a look at modifying the preprocessor output and it was pretty
straightforward. I have attached a patch that makes the preprocessed output
less conservative in adding spaces between periods. It will now add a space to
avoid turning ".. ." into "...", but does not always add a space between two
consecutive dots.

The change was done by keeping track of the last two tokens that were output
and testing against those two tokens to avoid creating "...". Previously only
the last output token was tracked and tested against, which caused any two
consecutive . tokens to have a space between them.

The patch passes all the tests in the tests subdirectory. I modified one test
to reflect the output changing from ". . ." to ".. .". This output happens to
match the expected output in the test comment. I also added a test for the
original case that was causing me problems.

Please let me know if there are more tests I should run or anything else that
needs to happen for the patch to be accepted. Thanks!

This patch looks great, except that it happens to conflict on mainline.  :(  Please send and updated patch and I'll be happy to apply it for you.  One minor thing is to watch & placement in "const Token& PrevPrevTok, "

Thanks for working on this!

-Chris




_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev

PPdots2.patch (4K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Clang macro expansion differs from GCC

Chris Lattner
Applied here, thanks!

If there is a PR tracking this, please close it.

-Chris

On Apr 12, 2010, at 9:44 PM, David Peixotto wrote:

Hi Chris,

Here is an updated patch that applied cleanly to the trunk. I can see that it is quite a moving target so hopefully it is still valid by the time you get it :) I also fixed the formatting of the & token to be consistent with the rest of the code. Please let me know if you see any more issues. Thanks!

-Dave

<PPdots2.patch>


On Apr 12, 2010, at 7:12 PM, Chris Lattner wrote:


On Apr 11, 2010, at 11:43 AM, David Peixotto wrote:

Hi Chris,

I took a look at modifying the preprocessor output and it was pretty
straightforward. I have attached a patch that makes the preprocessed output
less conservative in adding spaces between periods. It will now add a space to
avoid turning ".. ." into "...", but does not always add a space between two
consecutive dots.

The change was done by keeping track of the last two tokens that were output
and testing against those two tokens to avoid creating "...". Previously only
the last output token was tracked and tested against, which caused any two
consecutive . tokens to have a space between them.

The patch passes all the tests in the tests subdirectory. I modified one test
to reflect the output changing from ". . ." to ".. .". This output happens to
match the expected output in the test comment. I also added a test for the
original case that was causing me problems.

Please let me know if there are more tests I should run or anything else that
needs to happen for the patch to be accepted. Thanks!

This patch looks great, except that it happens to conflict on mainline.  :(  Please send and updated patch and I'll be happy to apply it for you.  One minor thing is to watch & placement in "const Token& PrevPrevTok, "

Thanks for working on this!

-Chris





_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev