Re: [llvm-dev] Clang Preprocessor Speed Up

classic Classic list List threaded Threaded
23 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] Clang Preprocessor Speed Up

Neil Nelson via cfe-dev
First, surely the right place for this discussion is the cfe-dev mailing list?

Second, have you determined that this is a noticeable amount of time when compiling? I have no idea - in my Pascal compiler, parsing the code is ~0.1%, codegen to IR ~1.9% and LLVM 98%. But I'm sure Clang is more complex in many ways, so the proportion is probably a bit different - a measurement of the time spent expanding macros would probably help determine if it's worth doing or not.

--
Mats

On 24 March 2016 at 22:17, Andy via llvm-dev <[hidden email]> wrote:
Hello, folks!

Currently me with one other guy are trying to play with clang. The proposal may seem stupid, excuse me, if it was already discussed, we just want to try to implement something useful which seems absent for now.

Ok, the idea. It seems interesting to try to make lexer a little bit more efficient in terms of macro expanding by applying partial expansion of macros. the idea is that some libraries have rather deeply nested macro definitions, and each time lexer sees it in code, it reexpands definition fully. This seems to be overkill sometimes, for rather often macros are not redefined in code, so expansion can be reused.

Of course, the typical nesting is rather low, but for example BOOST_PP_REPEAT can cause such situations.

So, the question is, what do you think about possible utility of such research and the reasons for you think so?

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev



_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] Clang Preprocessor Speed Up

Neil Nelson via cfe-dev
Hi Mats,
 
Thanks for the reply. Yep, you are right, the time should be measured and I guess I can imagine the typical workflow
  • implement prototype
  • take bunch of big real projects
  • compare preprocessing time for initial and changed clang
  • make conclusion whether the idea is sane or not
About usage - probably, some IDEs can act better for they need iteratively relex source for correct autocomplete.
 
What I'm also curious about is if somebody already did something on this or had thought about it.
If the idea was already thought (which I guess is rather possible), it's interesting, did somebody already prove it's useless? 
 
25.03.2016, 01:59, "mats petersson" <[hidden email]>:
First, surely the right place for this discussion is the cfe-dev mailing list?

Second, have you determined that this is a noticeable amount of time when compiling? I have no idea - in my Pascal compiler, parsing the code is ~0.1%, codegen to IR ~1.9% and LLVM 98%. But I'm sure Clang is more complex in many ways, so the proportion is probably a bit different - a measurement of the time spent expanding macros would probably help determine if it's worth doing or not.

--
Mats

On 24 March 2016 at 22:17, Andy via llvm-dev <[hidden email]> wrote:
Hello, folks!

Currently me with one other guy are trying to play with clang. The proposal may seem stupid, excuse me, if it was already discussed, we just want to try to implement something useful which seems absent for now.

Ok, the idea. It seems interesting to try to make lexer a little bit more efficient in terms of macro expanding by applying partial expansion of macros. the idea is that some libraries have rather deeply nested macro definitions, and each time lexer sees it in code, it reexpands definition fully. This seems to be overkill sometimes, for rather often macros are not redefined in code, so expansion can be reused.

Of course, the typical nesting is rather low, but for example BOOST_PP_REPEAT can cause such situations.

So, the question is, what do you think about possible utility of such research and the reasons for you think so?

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
 
 
-- 
Regards,
Andrei Serebro
tel. +79111758381
 

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] Clang Preprocessor Speed Up

Neil Nelson via cfe-dev
Once measured times for one of the Boost libraries example, preprocessing (-E) was about 20% of total compilation time. This is not typical in general but quite common with Boost libraries as 100s-1000s files may be included with tons of macros and nested macros.



2016-03-25 1:29 GMT+02:00 Андрей Серебро <[hidden email]>:
Hi Mats,
 
Thanks for the reply. Yep, you are right, the time should be measured and I guess I can imagine the typical workflow
  • implement prototype
  • take bunch of big real projects
  • compare preprocessing time for initial and changed clang
  • make conclusion whether the idea is sane or not
About usage - probably, some IDEs can act better for they need iteratively relex source for correct autocomplete.
 
What I'm also curious about is if somebody already did something on this or had thought about it.
If the idea was already thought (which I guess is rather possible), it's interesting, did somebody already prove it's useless? 
 
25.03.2016, 01:59, "mats petersson" <[hidden email]>:
First, surely the right place for this discussion is the cfe-dev mailing list?

Second, have you determined that this is a noticeable amount of time when compiling? I have no idea - in my Pascal compiler, parsing the code is ~0.1%, codegen to IR ~1.9% and LLVM 98%. But I'm sure Clang is more complex in many ways, so the proportion is probably a bit different - a measurement of the time spent expanding macros would probably help determine if it's worth doing or not.

--
Mats

On 24 March 2016 at 22:17, Andy via llvm-dev <[hidden email]> wrote:
Hello, folks!

Currently me with one other guy are trying to play with clang. The proposal may seem stupid, excuse me, if it was already discussed, we just want to try to implement something useful which seems absent for now.

Ok, the idea. It seems interesting to try to make lexer a little bit more efficient in terms of macro expanding by applying partial expansion of macros. the idea is that some libraries have rather deeply nested macro definitions, and each time lexer sees it in code, it reexpands definition fully. This seems to be overkill sometimes, for rather often macros are not redefined in code, so expansion can be reused.

Of course, the typical nesting is rather low, but for example BOOST_PP_REPEAT can cause such situations.

So, the question is, what do you think about possible utility of such research and the reasons for you think so?

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
 
 
-- 
Regards,
Andrei Serebro
tel. +79111758381
 

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev



_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] Clang Preprocessor Speed Up

Neil Nelson via cfe-dev
But even then, how much of the total time is expanding macros, and how much is "reading and finding the actual files" (and writing the output)?

I'm not saying this is not worth doing, I'm just trying to avoid someone spending time on something that doesn't provide benefit - I speak from experience, I've "optimized" code, and then found that it didn't make any improvement at all - I've also done work with gives 3-30x speedups by some simple steps... So, measure, make improvement, measure.

Or, use `perf` on some typical usecase, and figure out where the time goes...

--
Mats

On 25 March 2016 at 08:29, Yaron Keren <[hidden email]> wrote:
Once measured times for one of the Boost libraries example, preprocessing (-E) was about 20% of total compilation time. This is not typical in general but quite common with Boost libraries as 100s-1000s files may be included with tons of macros and nested macros.



2016-03-25 1:29 GMT+02:00 Андрей Серебро <[hidden email]>:
Hi Mats,
 
Thanks for the reply. Yep, you are right, the time should be measured and I guess I can imagine the typical workflow
  • implement prototype
  • take bunch of big real projects
  • compare preprocessing time for initial and changed clang
  • make conclusion whether the idea is sane or not
About usage - probably, some IDEs can act better for they need iteratively relex source for correct autocomplete.
 
What I'm also curious about is if somebody already did something on this or had thought about it.
If the idea was already thought (which I guess is rather possible), it's interesting, did somebody already prove it's useless? 
 
25.03.2016, 01:59, "mats petersson" <[hidden email]>:
First, surely the right place for this discussion is the cfe-dev mailing list?

Second, have you determined that this is a noticeable amount of time when compiling? I have no idea - in my Pascal compiler, parsing the code is ~0.1%, codegen to IR ~1.9% and LLVM 98%. But I'm sure Clang is more complex in many ways, so the proportion is probably a bit different - a measurement of the time spent expanding macros would probably help determine if it's worth doing or not.

--
Mats

On 24 March 2016 at 22:17, Andy via llvm-dev <[hidden email]> wrote:
Hello, folks!

Currently me with one other guy are trying to play with clang. The proposal may seem stupid, excuse me, if it was already discussed, we just want to try to implement something useful which seems absent for now.

Ok, the idea. It seems interesting to try to make lexer a little bit more efficient in terms of macro expanding by applying partial expansion of macros. the idea is that some libraries have rather deeply nested macro definitions, and each time lexer sees it in code, it reexpands definition fully. This seems to be overkill sometimes, for rather often macros are not redefined in code, so expansion can be reused.

Of course, the typical nesting is rather low, but for example BOOST_PP_REPEAT can cause such situations.

So, the question is, what do you think about possible utility of such research and the reasons for you think so?

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
 
 
-- 
Regards,
Andrei Serebro
tel. <a href="tel:%2B79111758381" value="+79111758381" target="_blank">+79111758381
 

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev




_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] Clang Preprocessor Speed Up

Neil Nelson via cfe-dev
So, I have implemented a prototype. For anyone interested, I attach the patch (I used clang release 38 from github as startup code).
 
The prototype consumes 40% less time on preprocessing all boost headers than original clang (780 seconds vs 1330). On real-world source code patched clang seems to be usually faster than original.
 
The big problem with it is that it currently doesn't generate proper source locations for expanded tokens. It doesn't affect the final code, but of course it makes diagnostics hard in case of errors. On the other hand, this situation is similar with inline functions debugging: without special flag, debugging will not be that easy. 
 
I have also measured timings for patched clang and clean clang, but with removed information about expansion locations (for it could happen, that all the profit came from switching off this info). But it turned out, that patched is still 28% faster (1100 seconds vs 780 seconds). 
 
I think, it may be useful probably to have some flag in clang that allows fast preprocessing, for sometimes profit can reach up to x4 times!

I'm pretty sure there are some bugs now I haven't yet recognized, so any feedback is highly appreciated.
 
25.03.2016, 12:05, "mats petersson" [hidden email]:
But even then, how much of the total time is expanding macros, and how much is "reading and finding the actual files" (and writing the output)?

I'm not saying this is not worth doing, I'm just trying to avoid someone spending time on something that doesn't provide benefit - I speak from experience, I've "optimized" code, and then found that it didn't make any improvement at all - I've also done work with gives 3-30x speedups by some simple steps... So, measure, make improvement, measure.

Or, use `perf` on some typical usecase, and figure out where the time goes...

--
Mats

On 25 March 2016 at 08:29, Yaron Keren <[hidden email]> wrote:
Once measured times for one of the Boost libraries example, preprocessing (-E) was about 20% of total compilation time. This is not typical in general but quite common with Boost libraries as 100s-1000s files may be included with tons of macros and nested macros.
 
 

2016-03-25 1:29 GMT+02:00 Андрей Серебро <[hidden email]>:
Hi Mats,
 
Thanks for the reply. Yep, you are right, the time should be measured and I guess I can imagine the typical workflow
  • implement prototype
  • take bunch of big real projects
  • compare preprocessing time for initial and changed clang
  • make conclusion whether the idea is sane or not
About usage - probably, some IDEs can act better for they need iteratively relex source for correct autocomplete.
 
What I'm also curious about is if somebody already did something on this or had thought about it.
If the idea was already thought (which I guess is rather possible), it's interesting, did somebody already prove it's useless? 
 
25.03.2016, 01:59, "mats petersson" <[hidden email]>:
First, surely the right place for this discussion is the cfe-dev mailing list?

Second, have you determined that this is a noticeable amount of time when compiling? I have no idea - in my Pascal compiler, parsing the code is ~0.1%, codegen to IR ~1.9% and LLVM 98%. But I'm sure Clang is more complex in many ways, so the proportion is probably a bit different - a measurement of the time spent expanding macros would probably help determine if it's worth doing or not.

--
Mats

On 24 March 2016 at 22:17, Andy via llvm-dev <[hidden email]> wrote:
Hello, folks!

Currently me with one other guy are trying to play with clang. The proposal may seem stupid, excuse me, if it was already discussed, we just want to try to implement something useful which seems absent for now.

Ok, the idea. It seems interesting to try to make lexer a little bit more efficient in terms of macro expanding by applying partial expansion of macros. the idea is that some libraries have rather deeply nested macro definitions, and each time lexer sees it in code, it reexpands definition fully. This seems to be overkill sometimes, for rather often macros are not redefined in code, so expansion can be reused.

Of course, the typical nesting is rather low, but for example BOOST_PP_REPEAT can cause such situations.

So, the question is, what do you think about possible utility of such research and the reasons for you think so?

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
 
 
-- 
Regards,
Andrei Serebro
tel. <a href="tel:%2B79111758381" target="_blank">+79111758381
 

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
 
 
-- 
Regards,
Andrei Serebro
tel. +79111758381
 

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

macros_patch.diff (48K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] Clang Preprocessor Speed Up

Neil Nelson via cfe-dev
Adding Reid and Nico who have been struggling with lexer / preprocessor compile times recently.

On Mon, May 16, 2016 at 10:46 AM Андрей Серебро <[hidden email]> wrote:
So, I have implemented a prototype. For anyone interested, I attach the patch (I used clang release 38 from github as startup code).
 
The prototype consumes 40% less time on preprocessing all boost headers than original clang (780 seconds vs 1330). On real-world source code patched clang seems to be usually faster than original.
 
The big problem with it is that it currently doesn't generate proper source locations for expanded tokens. It doesn't affect the final code, but of course it makes diagnostics hard in case of errors. On the other hand, this situation is similar with inline functions debugging: without special flag, debugging will not be that easy. 
 
I have also measured timings for patched clang and clean clang, but with removed information about expansion locations (for it could happen, that all the profit came from switching off this info). But it turned out, that patched is still 28% faster (1100 seconds vs 780 seconds). 
 
I think, it may be useful probably to have some flag in clang that allows fast preprocessing, for sometimes profit can reach up to x4 times!

I'm pretty sure there are some bugs now I haven't yet recognized, so any feedback is highly appreciated.
 
25.03.2016, 12:05, "mats petersson" [hidden email]:
But even then, how much of the total time is expanding macros, and how much is "reading and finding the actual files" (and writing the output)?

I'm not saying this is not worth doing, I'm just trying to avoid someone spending time on something that doesn't provide benefit - I speak from experience, I've "optimized" code, and then found that it didn't make any improvement at all - I've also done work with gives 3-30x speedups by some simple steps... So, measure, make improvement, measure.

Or, use `perf` on some typical usecase, and figure out where the time goes...

--
Mats

On 25 March 2016 at 08:29, Yaron Keren <[hidden email][hidden email]> wrote:
Once measured times for one of the Boost libraries example, preprocessing (-E) was about 20% of total compilation time. This is not typical in general but quite common with Boost libraries as 100s-1000s files may be included with tons of macros and nested macros.
 
 

2016-03-25 1:29 GMT+02:00 Андрей Серебро <[hidden email][hidden email]>:
Hi Mats,
 
Thanks for the reply. Yep, you are right, the time should be measured and I guess I can imagine the typical workflow
  • implement prototype
  • take bunch of big real projects
  • compare preprocessing time for initial and changed clang
  • make conclusion whether the idea is sane or not
About usage - probably, some IDEs can act better for they need iteratively relex source for correct autocomplete.
 
What I'm also curious about is if somebody already did something on this or had thought about it.
If the idea was already thought (which I guess is rather possible), it's interesting, did somebody already prove it's useless? 
 
25.03.2016, 01:59, "mats petersson" <[hidden email][hidden email]>:
First, surely the right place for this discussion is the cfe-dev mailing list?

Second, have you determined that this is a noticeable amount of time when compiling? I have no idea - in my Pascal compiler, parsing the code is ~0.1%, codegen to IR ~1.9% and LLVM 98%. But I'm sure Clang is more complex in many ways, so the proportion is probably a bit different - a measurement of the time spent expanding macros would probably help determine if it's worth doing or not.

--
Mats

On 24 March 2016 at 22:17, Andy via llvm-dev <[hidden email][hidden email]> wrote:
Hello, folks!

Currently me with one other guy are trying to play with clang. The proposal may seem stupid, excuse me, if it was already discussed, we just want to try to implement something useful which seems absent for now.

Ok, the idea. It seems interesting to try to make lexer a little bit more efficient in terms of macro expanding by applying partial expansion of macros. the idea is that some libraries have rather deeply nested macro definitions, and each time lexer sees it in code, it reexpands definition fully. This seems to be overkill sometimes, for rather often macros are not redefined in code, so expansion can be reused.

Of course, the typical nesting is rather low, but for example BOOST_PP_REPEAT can cause such situations.

So, the question is, what do you think about possible utility of such research and the reasons for you think so?

_______________________________________________
LLVM Developers mailing list
[hidden email][hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
 
 
-- 
Regards,
Andrei Serebro
tel. <a href="tel:%2B79111758381" target="_blank">+79111758381
 

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
 
 
-- 
Regards,
Andrei Serebro
tel. +79111758381
 
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Clang Preprocessor Speed Up

Neil Nelson via cfe-dev
On Mon, May 16, 2016 at 10:46 AM Андрей Серебро wrote:

> So, I have implemented a prototype. For anyone interested, I attach the
> patch (I used clang release 38 from github as startup code).
>
> The prototype consumes 40% less time on preprocessing all boost headers
> than original clang (780 seconds vs 1330). On real-world source code
> patched clang seems to be usually faster than original.
>
> The big problem with it is that it currently doesn't generate proper
> source locations for expanded tokens. It doesn't affect the final code,
> but of course it makes diagnostics hard in case of errors. On the other
> hand, this situation is similar with inline functions debugging: without
> special flag, debugging will not be that easy.
>
> I have also measured timings for patched clang and clean clang, but with
> removed information about expansion locations (for it could happen, that
> all the profit came from switching off this info). But it turned out,
> that patched is still 28% faster (1100 seconds vs 780 seconds).
>
> I think, it may be useful probably to have some flag in clang that allows
> fast preprocessing, for sometimes profit can reach up to x4 times!
>
> I'm pretty sure there are some bugs now I haven't yet recognized, so any
> feedback is highly appreciated.

Hi,

I came across your post rummaging through the mailing list and I am most
interested in it, not least because with minimal effort and an albeit fairly
pathological macro expansion, I can make clang crash with the message "Ran
out of source locations!" (assert inside SourceLocations.cpp).  So anything
that will help there will be great.

However, I am unable to get your patch to work at all.  It seems to totally
garble the macro expansion, even on simple expansions.  And the test-suite
produces over 200 unexpected fails.  Other problems I see are that it gives
an error on the following define (claiming an unterminated function-like
macro invocation):

#define M1(...) M2(__VA_ARGS__

and I see lots of errors relating to supposed redefinitions of macros, when
they aren't.

Is it possible that you attached an incomplete patch?  Do you have a working
test-case that I can start with?  I would be delighted to test your work
further.

Cheers,
Andy

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Clang Preprocessor Speed Up

Neil Nelson via cfe-dev
Hi,
 
Yep, I have noticed several problems with the patch after I have sent it, it's my bad, really: first was with variadic macos, second there was an erroneous ## processing, and also now I'm experiencing some problems with compiling Boost 1.61. 
Currently I put a gap for the first problem, so that variadics are just not cached and so expanded in a standard way.
The second I guess was fixed, now I'm struggling with the third one.
 

However, I am unable to get your patch to work at all. It seems to totally 
garble the macro expansion, even on simple expansions. And the test-suite 
produces over 200 unexpected fails.

That's interesting. Simple samples should definitely work fine (in case there were no variadics). You can share pieces that cause bad behavior, I will try to understand whats wrong with them (probably, it's about that errors with ##).
Unexpeccted failures I guess are fine since llvm uses Lit to check the test results, and this checking consists of pattern matching the output with the one that is expected. Since I have removed SourceLocation information, the output definitely varies from the expected output that uses them. But I will also look at it in case something not connected with locations also breaks the tests. 


and I see lots of errors relating to supposed redefinitions of macros, when 
they aren't.

Can you please attach some code that can reproduce redefinition problems? I hoped I have found all the cases where it happens, however, I could miss something. 
Thanks a lot for the feedback. 
 
02.06.2016, 10:45, "Andy Gibbs" <[hidden email]>:

On Mon, May 16, 2016 at 10:46 AM Андрей Серебро wrote:

 So, I have implemented a prototype. For anyone interested, I attach the
 patch (I used clang release 38 from github as startup code).

 The prototype consumes 40% less time on preprocessing all boost headers
 than original clang (780 seconds vs 1330). On real-world source code
 patched clang seems to be usually faster than original.

 The big problem with it is that it currently doesn't generate proper
 source locations for expanded tokens. It doesn't affect the final code,
 but of course it makes diagnostics hard in case of errors. On the other
 hand, this situation is similar with inline functions debugging: without
 special flag, debugging will not be that easy.

 I have also measured timings for patched clang and clean clang, but with
 removed information about expansion locations (for it could happen, that
 all the profit came from switching off this info). But it turned out,
 that patched is still 28% faster (1100 seconds vs 780 seconds).

 I think, it may be useful probably to have some flag in clang that allows
 fast preprocessing, for sometimes profit can reach up to x4 times!

 I'm pretty sure there are some bugs now I haven't yet recognized, so any
 feedback is highly appreciated.


Hi,

I came across your post rummaging through the mailing list and I am most
interested in it, not least because with minimal effort and an albeit fairly
pathological macro expansion, I can make clang crash with the message "Ran
out of source locations!" (assert inside SourceLocations.cpp). So anything
that will help there will be great.

However, I am unable to get your patch to work at all. It seems to totally
garble the macro expansion, even on simple expansions. And the test-suite
produces over 200 unexpected fails. Other problems I see are that it gives
an error on the following define (claiming an unterminated function-like
macro invocation):

#define M1(...) M2(__VA_ARGS__

and I see lots of errors relating to supposed redefinitions of macros, when
they aren't.

Is it possible that you attached an incomplete patch? Do you have a working
test-case that I can start with? I would be delighted to test your work
further.

Cheers,
Andy

 
 
-- 
Regards,
Andrei Serebro
tel. +79111758381
 

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Clang Preprocessor Speed Up

Neil Nelson via cfe-dev
Hi,

After your patch, you can run all clang regresion tests with "ninja check-clang".
Passing these would be a good start.

Yaron


2016-06-02 11:59 GMT+03:00 Андрей Серебро <[hidden email]>:
Hi,
 
Yep, I have noticed several problems with the patch after I have sent it, it's my bad, really: first was with variadic macos, second there was an erroneous ## processing, and also now I'm experiencing some problems with compiling Boost 1.61. 
Currently I put a gap for the first problem, so that variadics are just not cached and so expanded in a standard way.
The second I guess was fixed, now I'm struggling with the third one.
 

However, I am unable to get your patch to work at all. It seems to totally 
garble the macro expansion, even on simple expansions. And the test-suite 
produces over 200 unexpected fails.

That's interesting. Simple samples should definitely work fine (in case there were no variadics). You can share pieces that cause bad behavior, I will try to understand whats wrong with them (probably, it's about that errors with ##).
Unexpeccted failures I guess are fine since llvm uses Lit to check the test results, and this checking consists of pattern matching the output with the one that is expected. Since I have removed SourceLocation information, the output definitely varies from the expected output that uses them. But I will also look at it in case something not connected with locations also breaks the tests. 


and I see lots of errors relating to supposed redefinitions of macros, when 
they aren't.

Can you please attach some code that can reproduce redefinition problems? I hoped I have found all the cases where it happens, however, I could miss something. 
Thanks a lot for the feedback. 
 
02.06.2016, 10:45, "Andy Gibbs" <[hidden email]>:

On Mon, May 16, 2016 at 10:46 AM Андрей Серебро wrote:

 So, I have implemented a prototype. For anyone interested, I attach the
 patch (I used clang release 38 from github as startup code).

 The prototype consumes 40% less time on preprocessing all boost headers
 than original clang (780 seconds vs 1330). On real-world source code
 patched clang seems to be usually faster than original.

 The big problem with it is that it currently doesn't generate proper
 source locations for expanded tokens. It doesn't affect the final code,
 but of course it makes diagnostics hard in case of errors. On the other
 hand, this situation is similar with inline functions debugging: without
 special flag, debugging will not be that easy.

 I have also measured timings for patched clang and clean clang, but with
 removed information about expansion locations (for it could happen, that
 all the profit came from switching off this info). But it turned out,
 that patched is still 28% faster (1100 seconds vs 780 seconds).

 I think, it may be useful probably to have some flag in clang that allows
 fast preprocessing, for sometimes profit can reach up to x4 times!

 I'm pretty sure there are some bugs now I haven't yet recognized, so any
 feedback is highly appreciated.


Hi,

I came across your post rummaging through the mailing list and I am most
interested in it, not least because with minimal effort and an albeit fairly
pathological macro expansion, I can make clang crash with the message "Ran
out of source locations!" (assert inside SourceLocations.cpp). So anything
that will help there will be great.

However, I am unable to get your patch to work at all. It seems to totally
garble the macro expansion, even on simple expansions. And the test-suite
produces over 200 unexpected fails. Other problems I see are that it gives
an error on the following define (claiming an unterminated function-like
macro invocation):

#define M1(...) M2(__VA_ARGS__

and I see lots of errors relating to supposed redefinitions of macros, when
they aren't.

Is it possible that you attached an incomplete patch? Do you have a working
test-case that I can start with? I would be delighted to test your work
further.

Cheers,
Andy

 
 
-- 
Regards,
Andrei Serebro
tel. +79111758381
 


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Clang Preprocessor Speed Up

Neil Nelson via cfe-dev
Ok, that would be a miracle.
 
As I understand, unfortunately getting rid of proper source locations causes big problems, yet it's currently hard for me to find some small samples on which everything crashes. I mean, it would be great to reduce the crash code dramatically, for now the examples of misbehavior are huge. 
 
I think a good understanding of interactions between SourceManager, Lexer and Parser is needed now because it's somewhere in between their communication, so to say, which turns to be broken after the patch.
I see a possibility in trying to save proper SourceLocations, yet I'm not sure it will be such an easy feature. 
 
I wonder if I can find some person who deeply understands the concept of SourceLocations so that it's possible to discuss the underlying problems?
Guys, I need your piece of advice :|
 
02.06.2016, 12:31, "Yaron Keren" <[hidden email]>:
Hi,
 
After your patch, you can run all clang regresion tests with "ninja check-clang".
Passing these would be a good start.
 
Yaron
 

2016-06-02 11:59 GMT+03:00 Андрей Серебро <[hidden email]>:
Hi,
 
Yep, I have noticed several problems with the patch after I have sent it, it's my bad, really: first was with variadic macos, second there was an erroneous ## processing, and also now I'm experiencing some problems with compiling Boost 1.61. 
Currently I put a gap for the first problem, so that variadics are just not cached and so expanded in a standard way.
The second I guess was fixed, now I'm struggling with the third one.
 

However, I am unable to get your patch to work at all. It seems to totally 
garble the macro expansion, even on simple expansions. And the test-suite 
produces over 200 unexpected fails.

That's interesting. Simple samples should definitely work fine (in case there were no variadics). You can share pieces that cause bad behavior, I will try to understand whats wrong with them (probably, it's about that errors with ##).
Unexpeccted failures I guess are fine since llvm uses Lit to check the test results, and this checking consists of pattern matching the output with the one that is expected. Since I have removed SourceLocation information, the output definitely varies from the expected output that uses them. But I will also look at it in case something not connected with locations also breaks the tests. 


and I see lots of errors relating to supposed redefinitions of macros, when 
they aren't.

Can you please attach some code that can reproduce redefinition problems? I hoped I have found all the cases where it happens, however, I could miss something. 
Thanks a lot for the feedback. 
 
02.06.2016, 10:45, "Andy Gibbs" <[hidden email]>:

On Mon, May 16, 2016 at 10:46 AM Андрей Серебро wrote:

 So, I have implemented a prototype. For anyone interested, I attach the
 patch (I used clang release 38 from github as startup code).

 The prototype consumes 40% less time on preprocessing all boost headers
 than original clang (780 seconds vs 1330). On real-world source code
 patched clang seems to be usually faster than original.

 The big problem with it is that it currently doesn't generate proper
 source locations for expanded tokens. It doesn't affect the final code,
 but of course it makes diagnostics hard in case of errors. On the other
 hand, this situation is similar with inline functions debugging: without
 special flag, debugging will not be that easy.

 I have also measured timings for patched clang and clean clang, but with
 removed information about expansion locations (for it could happen, that
 all the profit came from switching off this info). But it turned out,
 that patched is still 28% faster (1100 seconds vs 780 seconds).

 I think, it may be useful probably to have some flag in clang that allows
 fast preprocessing, for sometimes profit can reach up to x4 times!

 I'm pretty sure there are some bugs now I haven't yet recognized, so any
 feedback is highly appreciated.


Hi,

I came across your post rummaging through the mailing list and I am most
interested in it, not least because with minimal effort and an albeit fairly
pathological macro expansion, I can make clang crash with the message "Ran
out of source locations!" (assert inside SourceLocations.cpp). So anything
that will help there will be great.

However, I am unable to get your patch to work at all. It seems to totally
garble the macro expansion, even on simple expansions. And the test-suite
produces over 200 unexpected fails. Other problems I see are that it gives
an error on the following define (claiming an unterminated function-like
macro invocation):

#define M1(...) M2(__VA_ARGS__

and I see lots of errors relating to supposed redefinitions of macros, when
they aren't.

Is it possible that you attached an incomplete patch? Do you have a working
test-case that I can start with? I would be delighted to test your work
further.

Cheers,
Andy

 
 
-- 
Regards,
Andrei Serebro
tel. +79111758381
 
 
 
-- 
Regards,
Andrei Serebro
tel. +79111758381
 

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Clang Preprocessor Speed Up

Neil Nelson via cfe-dev
On Friday 03 Jun 2016 14:33:37 Андрей Серебро wrote:

> Ok, that would be a miracle.
>  
> As I understand, unfortunately getting rid of proper source locations causes
> big problems, yet it's currently hard for me to find some small samples on
> which everything crashes. I mean, it would be great to reduce the crash
> code dramatically, for now the examples of misbehavior are huge.  
> I think a good understanding of interactions between SourceManager, Lexer
> and Parser is needed now because it's somewhere in between their
> communication, so to say, which turns to be broken after the patch. I see a
> possibility in trying to save proper SourceLocations, yet I'm not sure it
> will be such an easy feature.
> I wonder if I can find some person who deeply understands the concept of
> SourceLocations so that it's possible to discuss the underlying problems?
> Guys, I need your piece of advice :|

My suggestion, but others should jump in here too, is that you split the patch
up into two separate aims.

The first aim would be the reduction of SourceLocation information.

The second aim would be caching of previous macro expansions (from what I
understand of your patch, I think this is what you are attempting?).

I think this would be a benefit since it may be easier to get the patches
working if you work on the two parts independently.  Plus you indicated that
the first still has a sizeable benefit.  And I think the second will give you
problems (detailed below).

I'm afraid I am not able to help you much on the interactions of the
SourceManager, Lexer and Parser.  There are others on this mailing list who
would be able to help you much better in this regard.

Regarding reduction of SourceLocation information, these are my thoughts.  
Please note that I haven't assessed the practicality of this, and I'm writing
as I think so it may not even be coherent!  Caveats aside, you talked
previously of having this as a compiler switch.  I think this is a good idea
since the SourceLocation information is needed for debugging macro expansions.  
I would suggest implementing this switch as a depth counter so that
SourceLocation information is only once a macro expansion reaches a certain
depth the SourceLocation information will be elided.  In a sense, the
important information of a macro expansion is the SourceLocations of the
original tokens going into the macro and the SourceLocation of the final
scratch buffer.  At this point, I'm wondering whether it is useful to have
SourceLocation information for concatenated tokens... Maybe someone else can
give there thoughts here.

The other thing you can think about here is that of actually releasing a
scratch buffer no longer in use.  I think clang doesn't do this currently
(probably because of keeping all the SourceLocation information), but if you
looked at having a reference count on a scratch buffer then maybe it will be
possible to release it once its last use is complete.  This has the benefit of
also freeing memory... and may also improve the speed of the compiler.  Again,
others should give their thoughts here.  Personally, I would be very
interested to see if this was a possible line of improvement since with
extensive macro expansions, the memory consumption of clang is huge.

In fact, thinking about this, this may be the first place to start -- it is
sort of linked to eliding SourceLocation information, but it has the advantage
that you are freeing the whole scratch buffer since it has been incorporated
into another higher-level scratch buffer, and so is no longer referenced
anywhere and its disappearance shouldn't (hopefully!) cause so many problems
in clang diagnostics, for example, which even so would need to be adapted to
skip these elided macro expansion steps.

Finally, regarding the caching of previous macro expansions, you need to be
very careful because macro definitions can change, yes even inside a macro
expansion too (think pragma pop_macro).  For that matter, you need to think
about how to handle any _Pragma expansion since these can cause side effects.  
This is why I would leave this side of your patch until last.  It will be the
hardest to test and guarantee correctness.

Keep up the good work!  :o)

Cheers,
Andy

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Clang Preprocessor Speed Up

Neil Nelson via cfe-dev
Well, you know, this splitting sounds like a good idea. I guess I'll try to follow your advice, for it's really somehow strange to solve two tasks in a time. 
 
I weighed the advantages and disadvantages and I guess that saving source locations would just vanish any sense from the activity of boosting macro expansion, because we will need to do all the same works as we did without caching at all: to create proper locations we will need to go down from the very top macro downside for every token. So, the idea from my previous post was not that sane - saving proper source locations means just give up the idea at all, it will barely reduce time consumption at all. 
 
You are right about debugging purpose of SL, and I think the idea with limitation on depths, for which info will be present, is solid. Looks like more flexible then complete turning off. 
 
About scratch space - to be honest, I haven't thought about it yet. I didn't even look in this side, so that's good you have pointed it. First of all, I need to understand what is this scratch space all about. =)
 
Such things like pragmas inside macro definitions which spoil caching can be overcome by just forbidding caching in certain situations, yet of course all such situations should be defined carefully. 
 
04.06.2016, 10:43, "Andy Gibbs" <[hidden email]>:

On Friday 03 Jun 2016 14:33:37 Андрей Серебро wrote:

 Ok, that would be a miracle.

 As I understand, unfortunately getting rid of proper source locations causes
 big problems, yet it's currently hard for me to find some small samples on
 which everything crashes. I mean, it would be great to reduce the crash
 code dramatically, for now the examples of misbehavior are huge.
 I think a good understanding of interactions between SourceManager, Lexer
 and Parser is needed now because it's somewhere in between their
 communication, so to say, which turns to be broken after the patch. I see a
 possibility in trying to save proper SourceLocations, yet I'm not sure it
 will be such an easy feature.
 I wonder if I can find some person who deeply understands the concept of
 SourceLocations so that it's possible to discuss the underlying problems?
 Guys, I need your piece of advice :|


My suggestion, but others should jump in here too, is that you split the patch
up into two separate aims.

The first aim would be the reduction of SourceLocation information.

The second aim would be caching of previous macro expansions (from what I
understand of your patch, I think this is what you are attempting?).

I think this would be a benefit since it may be easier to get the patches
working if you work on the two parts independently. Plus you indicated that
the first still has a sizeable benefit. And I think the second will give you
problems (detailed below).

I'm afraid I am not able to help you much on the interactions of the
SourceManager, Lexer and Parser. There are others on this mailing list who
would be able to help you much better in this regard.

Regarding reduction of SourceLocation information, these are my thoughts.
Please note that I haven't assessed the practicality of this, and I'm writing
as I think so it may not even be coherent! Caveats aside, you talked
previously of having this as a compiler switch. I think this is a good idea
since the SourceLocation information is needed for debugging macro expansions.
I would suggest implementing this switch as a depth counter so that
SourceLocation information is only once a macro expansion reaches a certain
depth the SourceLocation information will be elided. In a sense, the
important information of a macro expansion is the SourceLocations of the
original tokens going into the macro and the SourceLocation of the final
scratch buffer. At this point, I'm wondering whether it is useful to have
SourceLocation information for concatenated tokens... Maybe someone else can
give there thoughts here.

The other thing you can think about here is that of actually releasing a
scratch buffer no longer in use. I think clang doesn't do this currently
(probably because of keeping all the SourceLocation information), but if you
looked at having a reference count on a scratch buffer then maybe it will be
possible to release it once its last use is complete. This has the benefit of
also freeing memory... and may also improve the speed of the compiler. Again,
others should give their thoughts here. Personally, I would be very
interested to see if this was a possible line of improvement since with
extensive macro expansions, the memory consumption of clang is huge.

In fact, thinking about this, this may be the first place to start -- it is
sort of linked to eliding SourceLocation information, but it has the advantage
that you are freeing the whole scratch buffer since it has been incorporated
into another higher-level scratch buffer, and so is no longer referenced
anywhere and its disappearance shouldn't (hopefully!) cause so many problems
in clang diagnostics, for example, which even so would need to be adapted to
skip these elided macro expansion steps.

Finally, regarding the caching of previous macro expansions, you need to be
very careful because macro definitions can change, yes even inside a macro
expansion too (think pragma pop_macro). For that matter, you need to think
about how to handle any _Pragma expansion since these can cause side effects.
This is why I would leave this side of your patch until last. It will be the
hardest to test and guarantee correctness.

Keep up the good work! :o)

Cheers,
Andy

 
 
-- 
Regards,
Andrei Serebro
tel. +79111758381
 

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Clang Preprocessor Speed Up

Neil Nelson via cfe-dev
Interesting moment about Andy's comment. I found out that GCC already has (if I understood correctly) teh feature with expansion tracking depth.
 
The option is -ftrack-macro-expansion[=level] and it's description can be found here:
 
As I understand, clang doesn't bother with this yet.
 
05.06.2016, 01:38, "Андрей Серебро" <[hidden email]>:
Well, you know, this splitting sounds like a good idea. I guess I'll try to follow your advice, for it's really somehow strange to solve two tasks in a time. 
 
I weighed the advantages and disadvantages and I guess that saving source locations would just vanish any sense from the activity of boosting macro expansion, because we will need to do all the same works as we did without caching at all: to create proper locations we will need to go down from the very top macro downside for every token. So, the idea from my previous post was not that sane - saving proper source locations means just give up the idea at all, it will barely reduce time consumption at all. 
 
You are right about debugging purpose of SL, and I think the idea with limitation on depths, for which info will be present, is solid. Looks like more flexible then complete turning off. 
 
About scratch space - to be honest, I haven't thought about it yet. I didn't even look in this side, so that's good you have pointed it. First of all, I need to understand what is this scratch space all about. =)
 
Such things like pragmas inside macro definitions which spoil caching can be overcome by just forbidding caching in certain situations, yet of course all such situations should be defined carefully. 
 
04.06.2016, 10:43, "Andy Gibbs" <[hidden email]>:

On Friday 03 Jun 2016 14:33:37 Андрей Серебро wrote:

 Ok, that would be a miracle.

 As I understand, unfortunately getting rid of proper source locations causes
 big problems, yet it's currently hard for me to find some small samples on
 which everything crashes. I mean, it would be great to reduce the crash
 code dramatically, for now the examples of misbehavior are huge.
 I think a good understanding of interactions between SourceManager, Lexer
 and Parser is needed now because it's somewhere in between their
 communication, so to say, which turns to be broken after the patch. I see a
 possibility in trying to save proper SourceLocations, yet I'm not sure it
 will be such an easy feature.
 I wonder if I can find some person who deeply understands the concept of
 SourceLocations so that it's possible to discuss the underlying problems?
 Guys, I need your piece of advice :|


My suggestion, but others should jump in here too, is that you split the patch
up into two separate aims.

The first aim would be the reduction of SourceLocation information.

The second aim would be caching of previous macro expansions (from what I
understand of your patch, I think this is what you are attempting?).

I think this would be a benefit since it may be easier to get the patches
working if you work on the two parts independently. Plus you indicated that
the first still has a sizeable benefit. And I think the second will give you
problems (detailed below).

I'm afraid I am not able to help you much on the interactions of the
SourceManager, Lexer and Parser. There are others on this mailing list who
would be able to help you much better in this regard.

Regarding reduction of SourceLocation information, these are my thoughts.
Please note that I haven't assessed the practicality of this, and I'm writing
as I think so it may not even be coherent! Caveats aside, you talked
previously of having this as a compiler switch. I think this is a good idea
since the SourceLocation information is needed for debugging macro expansions.
I would suggest implementing this switch as a depth counter so that
SourceLocation information is only once a macro expansion reaches a certain
depth the SourceLocation information will be elided. In a sense, the
important information of a macro expansion is the SourceLocations of the
original tokens going into the macro and the SourceLocation of the final
scratch buffer. At this point, I'm wondering whether it is useful to have
SourceLocation information for concatenated tokens... Maybe someone else can
give there thoughts here.

The other thing you can think about here is that of actually releasing a
scratch buffer no longer in use. I think clang doesn't do this currently
(probably because of keeping all the SourceLocation information), but if you
looked at having a reference count on a scratch buffer then maybe it will be
possible to release it once its last use is complete. This has the benefit of
also freeing memory... and may also improve the speed of the compiler. Again,
others should give their thoughts here. Personally, I would be very
interested to see if this was a possible line of improvement since with
extensive macro expansions, the memory consumption of clang is huge.

In fact, thinking about this, this may be the first place to start -- it is
sort of linked to eliding SourceLocation information, but it has the advantage
that you are freeing the whole scratch buffer since it has been incorporated
into another higher-level scratch buffer, and so is no longer referenced
anywhere and its disappearance shouldn't (hopefully!) cause so many problems
in clang diagnostics, for example, which even so would need to be adapted to
skip these elided macro expansion steps.

Finally, regarding the caching of previous macro expansions, you need to be
very careful because macro definitions can change, yes even inside a macro
expansion too (think pragma pop_macro). For that matter, you need to think
about how to handle any _Pragma expansion since these can cause side effects.
This is why I would leave this side of your patch until last. It will be the
hardest to test and guarantee correctness.

Keep up the good work! :o)

Cheers,
Andy

 
 
-- 
Regards,
Andrei Serebro
tel. +79111758381
 
 
 
-- 
Regards,
Andrei Serebro
tel. +79111758381
 

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Clang Preprocessor Speed Up

Neil Nelson via cfe-dev
So, taking your advice I got rid of caching for now and concentrated on turning off source locations. 
 
I have introduced new flag -no-macro-exploc-tracking. Passing it to compiler does exactly what is intended to do - we don't generate expansion locations for macro tokens. To evaluate correct __LINE__ expansion, we just memorize the top most macro that is currently being expanded (it is ok since all macro expansions turn to produce one-liners... but there are some cases with multiline arguments for function-like macros, I didn't really get right now should we do something with it). 
 
I have tested it tough on boost 1.61. Among first 11000 files at least the success of compilation is the same for patched and clean release 38. 
 
Average boosting of preprocessing is 1.2 times. Also cases in which clang ran out of source locations will be less frequent I hope. At least, here is a fine example where compiler doesn't fall with the flag and fall without:
 
clang -c  -ftemplate-depth-32 -O3 -finline-functions -Wno-inline -Wall -pedantic -Wno-variadic-macros -pedantic-errors -std=c++11 -I $BOOST_1_61_HOME -no-macro-exploc-tracking $BOOST_1_61_HOME/libs/vmd/test/test_doc_modifiers_return_type.cpp 
 
If I don't put the flag, the process hangs (at least, on my machine). 
 
I attach the patch, hope for some feedback.
 
P.S. Also here are preprocessing benchmarks for boost. 
 
09.06.2016, 16:19, "Андрей Серебро" <[hidden email]>:
Interesting moment about Andy's comment. I found out that GCC already has (if I understood correctly) teh feature with expansion tracking depth.
 
The option is -ftrack-macro-expansion[=level] and it's description can be found here:
 
As I understand, clang doesn't bother with this yet.
 
05.06.2016, 01:38, "Андрей Серебро" <[hidden email]>:
Well, you know, this splitting sounds like a good idea. I guess I'll try to follow your advice, for it's really somehow strange to solve two tasks in a time. 
 
I weighed the advantages and disadvantages and I guess that saving source locations would just vanish any sense from the activity of boosting macro expansion, because we will need to do all the same works as we did without caching at all: to create proper locations we will need to go down from the very top macro downside for every token. So, the idea from my previous post was not that sane - saving proper source locations means just give up the idea at all, it will barely reduce time consumption at all. 
 
You are right about debugging purpose of SL, and I think the idea with limitation on depths, for which info will be present, is solid. Looks like more flexible then complete turning off. 
 
About scratch space - to be honest, I haven't thought about it yet. I didn't even look in this side, so that's good you have pointed it. First of all, I need to understand what is this scratch space all about. =)
 
Such things like pragmas inside macro definitions which spoil caching can be overcome by just forbidding caching in certain situations, yet of course all such situations should be defined carefully. 
 
04.06.2016, 10:43, "Andy Gibbs" <[hidden email]>:

On Friday 03 Jun 2016 14:33:37 Андрей Серебро wrote:

 Ok, that would be a miracle.

 As I understand, unfortunately getting rid of proper source locations causes
 big problems, yet it's currently hard for me to find some small samples on
 which everything crashes. I mean, it would be great to reduce the crash
 code dramatically, for now the examples of misbehavior are huge.
 I think a good understanding of interactions between SourceManager, Lexer
 and Parser is needed now because it's somewhere in between their
 communication, so to say, which turns to be broken after the patch. I see a
 possibility in trying to save proper SourceLocations, yet I'm not sure it
 will be such an easy feature.
 I wonder if I can find some person who deeply understands the concept of
 SourceLocations so that it's possible to discuss the underlying problems?
 Guys, I need your piece of advice :|


My suggestion, but others should jump in here too, is that you split the patch
up into two separate aims.

The first aim would be the reduction of SourceLocation information.

The second aim would be caching of previous macro expansions (from what I
understand of your patch, I think this is what you are attempting?).

I think this would be a benefit since it may be easier to get the patches
working if you work on the two parts independently. Plus you indicated that
the first still has a sizeable benefit. And I think the second will give you
problems (detailed below).

I'm afraid I am not able to help you much on the interactions of the
SourceManager, Lexer and Parser. There are others on this mailing list who
would be able to help you much better in this regard.

Regarding reduction of SourceLocation information, these are my thoughts.
Please note that I haven't assessed the practicality of this, and I'm writing
as I think so it may not even be coherent! Caveats aside, you talked
previously of having this as a compiler switch. I think this is a good idea
since the SourceLocation information is needed for debugging macro expansions.
I would suggest implementing this switch as a depth counter so that
SourceLocation information is only once a macro expansion reaches a certain
depth the SourceLocation information will be elided. In a sense, the
important information of a macro expansion is the SourceLocations of the
original tokens going into the macro and the SourceLocation of the final
scratch buffer. At this point, I'm wondering whether it is useful to have
SourceLocation information for concatenated tokens... Maybe someone else can
give there thoughts here.

The other thing you can think about here is that of actually releasing a
scratch buffer no longer in use. I think clang doesn't do this currently
(probably because of keeping all the SourceLocation information), but if you
looked at having a reference count on a scratch buffer then maybe it will be
possible to release it once its last use is complete. This has the benefit of
also freeing memory... and may also improve the speed of the compiler. Again,
others should give their thoughts here. Personally, I would be very
interested to see if this was a possible line of improvement since with
extensive macro expansions, the memory consumption of clang is huge.

In fact, thinking about this, this may be the first place to start -- it is
sort of linked to eliding SourceLocation information, but it has the advantage
that you are freeing the whole scratch buffer since it has been incorporated
into another higher-level scratch buffer, and so is no longer referenced
anywhere and its disappearance shouldn't (hopefully!) cause so many problems
in clang diagnostics, for example, which even so would need to be adapted to
skip these elided macro expansion steps.

Finally, regarding the caching of previous macro expansions, you need to be
very careful because macro definitions can change, yes even inside a macro
expansion too (think pragma pop_macro). For that matter, you need to think
about how to handle any _Pragma expansion since these can cause side effects.
This is why I would leave this side of your patch until last. It will be the
hardest to test and guarantee correctness.

Keep up the good work! :o)

Cheers,
Andy

 
 
-- 
Regards,
Andrei Serebro
tel. +79111758381
 
 
 
-- 
Regards,
Andrei Serebro
tel. +79111758381
 
 
 
-- 
Regards,
Andrei Serebro
tel. +79111758381
 

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

patch_slocs.diff (28K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Clang Preprocessor Speed Up

Neil Nelson via cfe-dev
Ouch, forgot to attach stats. Here they are. 
 
15.06.2016, 18:28, "Андрей Серебро" <[hidden email]>:
So, taking your advice I got rid of caching for now and concentrated on turning off source locations. 
 
I have introduced new flag -no-macro-exploc-tracking. Passing it to compiler does exactly what is intended to do - we don't generate expansion locations for macro tokens. To evaluate correct __LINE__ expansion, we just memorize the top most macro that is currently being expanded (it is ok since all macro expansions turn to produce one-liners... but there are some cases with multiline arguments for function-like macros, I didn't really get right now should we do something with it). 
 
I have tested it tough on boost 1.61. Among first 11000 files at least the success of compilation is the same for patched and clean release 38. 
 
Average boosting of preprocessing is 1.2 times. Also cases in which clang ran out of source locations will be less frequent I hope. At least, here is a fine example where compiler doesn't fall with the flag and fall without:
 
clang -c  -ftemplate-depth-32 -O3 -finline-functions -Wno-inline -Wall -pedantic -Wno-variadic-macros -pedantic-errors -std=c++11 -I $BOOST_1_61_HOME -no-macro-exploc-tracking $BOOST_1_61_HOME/libs/vmd/test/test_doc_modifiers_return_type.cpp 
 
If I don't put the flag, the process hangs (at least, on my machine). 
 
I attach the patch, hope for some feedback.
 
P.S. Also here are preprocessing benchmarks for boost. 
 
09.06.2016, 16:19, "Андрей Серебро" <[hidden email]>:
Interesting moment about Andy's comment. I found out that GCC already has (if I understood correctly) teh feature with expansion tracking depth.
 
The option is -ftrack-macro-expansion[=level] and it's description can be found here:
 
As I understand, clang doesn't bother with this yet.
 
05.06.2016, 01:38, "Андрей Серебро" <[hidden email]>:
Well, you know, this splitting sounds like a good idea. I guess I'll try to follow your advice, for it's really somehow strange to solve two tasks in a time. 
 
I weighed the advantages and disadvantages and I guess that saving source locations would just vanish any sense from the activity of boosting macro expansion, because we will need to do all the same works as we did without caching at all: to create proper locations we will need to go down from the very top macro downside for every token. So, the idea from my previous post was not that sane - saving proper source locations means just give up the idea at all, it will barely reduce time consumption at all. 
 
You are right about debugging purpose of SL, and I think the idea with limitation on depths, for which info will be present, is solid. Looks like more flexible then complete turning off. 
 
About scratch space - to be honest, I haven't thought about it yet. I didn't even look in this side, so that's good you have pointed it. First of all, I need to understand what is this scratch space all about. =)
 
Such things like pragmas inside macro definitions which spoil caching can be overcome by just forbidding caching in certain situations, yet of course all such situations should be defined carefully. 
 
04.06.2016, 10:43, "Andy Gibbs" <[hidden email]>:

On Friday 03 Jun 2016 14:33:37 Андрей Серебро wrote:

 Ok, that would be a miracle.

 As I understand, unfortunately getting rid of proper source locations causes
 big problems, yet it's currently hard for me to find some small samples on
 which everything crashes. I mean, it would be great to reduce the crash
 code dramatically, for now the examples of misbehavior are huge.
 I think a good understanding of interactions between SourceManager, Lexer
 and Parser is needed now because it's somewhere in between their
 communication, so to say, which turns to be broken after the patch. I see a
 possibility in trying to save proper SourceLocations, yet I'm not sure it
 will be such an easy feature.
 I wonder if I can find some person who deeply understands the concept of
 SourceLocations so that it's possible to discuss the underlying problems?
 Guys, I need your piece of advice :|


My suggestion, but others should jump in here too, is that you split the patch
up into two separate aims.

The first aim would be the reduction of SourceLocation information.

The second aim would be caching of previous macro expansions (from what I
understand of your patch, I think this is what you are attempting?).

I think this would be a benefit since it may be easier to get the patches
working if you work on the two parts independently. Plus you indicated that
the first still has a sizeable benefit. And I think the second will give you
problems (detailed below).

I'm afraid I am not able to help you much on the interactions of the
SourceManager, Lexer and Parser. There are others on this mailing list who
would be able to help you much better in this regard.

Regarding reduction of SourceLocation information, these are my thoughts.
Please note that I haven't assessed the practicality of this, and I'm writing
as I think so it may not even be coherent! Caveats aside, you talked
previously of having this as a compiler switch. I think this is a good idea
since the SourceLocation information is needed for debugging macro expansions.
I would suggest implementing this switch as a depth counter so that
SourceLocation information is only once a macro expansion reaches a certain
depth the SourceLocation information will be elided. In a sense, the
important information of a macro expansion is the SourceLocations of the
original tokens going into the macro and the SourceLocation of the final
scratch buffer. At this point, I'm wondering whether it is useful to have
SourceLocation information for concatenated tokens... Maybe someone else can
give there thoughts here.

The other thing you can think about here is that of actually releasing a
scratch buffer no longer in use. I think clang doesn't do this currently
(probably because of keeping all the SourceLocation information), but if you
looked at having a reference count on a scratch buffer then maybe it will be
possible to release it once its last use is complete. This has the benefit of
also freeing memory... and may also improve the speed of the compiler. Again,
others should give their thoughts here. Personally, I would be very
interested to see if this was a possible line of improvement since with
extensive macro expansions, the memory consumption of clang is huge.

In fact, thinking about this, this may be the first place to start -- it is
sort of linked to eliding SourceLocation information, but it has the advantage
that you are freeing the whole scratch buffer since it has been incorporated
into another higher-level scratch buffer, and so is no longer referenced
anywhere and its disappearance shouldn't (hopefully!) cause so many problems
in clang diagnostics, for example, which even so would need to be adapted to
skip these elided macro expansion steps.

Finally, regarding the caching of previous macro expansions, you need to be
very careful because macro definitions can change, yes even inside a macro
expansion too (think pragma pop_macro). For that matter, you need to think
about how to handle any _Pragma expansion since these can cause side effects.
This is why I would leave this side of your patch until last. It will be the
hardest to test and guarantee correctness.

Keep up the good work! :o)

Cheers,
Andy

 
 
-- 
Regards,
Andrei Serebro
tel. +79111758381
 
 
 
-- 
Regards,
Andrei Serebro
tel. +79111758381
 
 
 
-- 
Regards,
Andrei Serebro
tel. +79111758381
 
 
 
-- 
Regards,
Andrei Serebro
tel. +79111758381
 

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

boost_times_slocs.ods (2M) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Clang Preprocessor Speed Up

Neil Nelson via cfe-dev
Hi, the speedups look great. Do this patch passes a clean `ninja check-clang`?

On Wed, Jun 15, 2016 at 8:30 AM, Андрей Серебро <[hidden email]> wrote:

> Ouch, forgot to attach stats. Here they are.
>
> 15.06.2016, 18:28, "Андрей Серебро" <[hidden email]>:
>
> So, taking your advice I got rid of caching for now and concentrated on
> turning off source locations.
>
> I have introduced new flag -no-macro-exploc-tracking. Passing it to compiler
> does exactly what is intended to do - we don't generate expansion locations
> for macro tokens. To evaluate correct __LINE__ expansion, we just memorize
> the top most macro that is currently being expanded (it is ok since all
> macro expansions turn to produce one-liners... but there are some cases with
> multiline arguments for function-like macros, I didn't really get right now
> should we do something with it).
>
> I have tested it tough on boost 1.61. Among first 11000 files at least the
> success of compilation is the same for patched and clean release 38.
>
> Average boosting of preprocessing is 1.2 times. Also cases in which clang
> ran out of source locations will be less frequent I hope. At least, here is
> a fine example where compiler doesn't fall with the flag and fall without:
>
> clang -c  -ftemplate-depth-32 -O3 -finline-functions -Wno-inline -Wall
> -pedantic -Wno-variadic-macros -pedantic-errors -std=c++11 -I
> $BOOST_1_61_HOME -no-macro-exploc-tracking
> $BOOST_1_61_HOME/libs/vmd/test/test_doc_modifiers_return_type.cpp
>
> If I don't put the flag, the process hangs (at least, on my machine).
>
> I attach the patch, hope for some feedback.
>
> P.S. Also here are preprocessing benchmarks for boost.
>
> 09.06.2016, 16:19, "Андрей Серебро" <[hidden email]>:
>
> Interesting moment about Andy's comment. I found out that GCC already has
> (if I understood correctly) teh feature with expansion tracking depth.
>
> The option is -ftrack-macro-expansion[=level] and it's description can be
> found here:
> https://gcc.gnu.org/onlinedocs/gcc/Preprocessor-Options.html
>
> As I understand, clang doesn't bother with this yet.
>
> 05.06.2016, 01:38, "Андрей Серебро" <[hidden email]>:
>
> Well, you know, this splitting sounds like a good idea. I guess I'll try to
> follow your advice, for it's really somehow strange to solve two tasks in a
> time.
>
> I weighed the advantages and disadvantages and I guess that saving source
> locations would just vanish any sense from the activity of boosting macro
> expansion, because we will need to do all the same works as we did without
> caching at all: to create proper locations we will need to go down from the
> very top macro downside for every token. So, the idea from my previous post
> was not that sane - saving proper source locations means just give up the
> idea at all, it will barely reduce time consumption at all.
>
> You are right about debugging purpose of SL, and I think the idea with
> limitation on depths, for which info will be present, is solid. Looks like
> more flexible then complete turning off.
>
> About scratch space - to be honest, I haven't thought about it yet. I didn't
> even look in this side, so that's good you have pointed it. First of all, I
> need to understand what is this scratch space all about. =)
>
> Such things like pragmas inside macro definitions which spoil caching can be
> overcome by just forbidding caching in certain situations, yet of course all
> such situations should be defined carefully.
>
> 04.06.2016, 10:43, "Andy Gibbs" <[hidden email]>:
>
> On Friday 03 Jun 2016 14:33:37 Андрей Серебро wrote:
>
>  Ok, that would be a miracle.
>
>  As I understand, unfortunately getting rid of proper source locations
> causes
>  big problems, yet it's currently hard for me to find some small samples on
>  which everything crashes. I mean, it would be great to reduce the crash
>  code dramatically, for now the examples of misbehavior are huge.
>  I think a good understanding of interactions between SourceManager, Lexer
>  and Parser is needed now because it's somewhere in between their
>  communication, so to say, which turns to be broken after the patch. I see a
>  possibility in trying to save proper SourceLocations, yet I'm not sure it
>  will be such an easy feature.
>  I wonder if I can find some person who deeply understands the concept of
>  SourceLocations so that it's possible to discuss the underlying problems?
>  Guys, I need your piece of advice :|
>
>
> My suggestion, but others should jump in here too, is that you split the
> patch
> up into two separate aims.
>
> The first aim would be the reduction of SourceLocation information.
>
> The second aim would be caching of previous macro expansions (from what I
> understand of your patch, I think this is what you are attempting?).
>
> I think this would be a benefit since it may be easier to get the patches
> working if you work on the two parts independently. Plus you indicated that
> the first still has a sizeable benefit. And I think the second will give you
> problems (detailed below).
>
> I'm afraid I am not able to help you much on the interactions of the
> SourceManager, Lexer and Parser. There are others on this mailing list who
> would be able to help you much better in this regard.
>
> Regarding reduction of SourceLocation information, these are my thoughts.
> Please note that I haven't assessed the practicality of this, and I'm
> writing
> as I think so it may not even be coherent! Caveats aside, you talked
> previously of having this as a compiler switch. I think this is a good idea
> since the SourceLocation information is needed for debugging macro
> expansions.
> I would suggest implementing this switch as a depth counter so that
> SourceLocation information is only once a macro expansion reaches a certain
> depth the SourceLocation information will be elided. In a sense, the
> important information of a macro expansion is the SourceLocations of the
> original tokens going into the macro and the SourceLocation of the final
> scratch buffer. At this point, I'm wondering whether it is useful to have
> SourceLocation information for concatenated tokens... Maybe someone else can
> give there thoughts here.
>
> The other thing you can think about here is that of actually releasing a
> scratch buffer no longer in use. I think clang doesn't do this currently
> (probably because of keeping all the SourceLocation information), but if you
> looked at having a reference count on a scratch buffer then maybe it will be
> possible to release it once its last use is complete. This has the benefit
> of
> also freeing memory... and may also improve the speed of the compiler.
> Again,
> others should give their thoughts here. Personally, I would be very
> interested to see if this was a possible line of improvement since with
> extensive macro expansions, the memory consumption of clang is huge.
>
> In fact, thinking about this, this may be the first place to start -- it is
> sort of linked to eliding SourceLocation information, but it has the
> advantage
> that you are freeing the whole scratch buffer since it has been incorporated
> into another higher-level scratch buffer, and so is no longer referenced
> anywhere and its disappearance shouldn't (hopefully!) cause so many problems
> in clang diagnostics, for example, which even so would need to be adapted to
> skip these elided macro expansion steps.
>
> Finally, regarding the caching of previous macro expansions, you need to be
> very careful because macro definitions can change, yes even inside a macro
> expansion too (think pragma pop_macro). For that matter, you need to think
> about how to handle any _Pragma expansion since these can cause side
> effects.
> This is why I would leave this side of your patch until last. It will be the
> hardest to test and guarantee correctness.
>
> Keep up the good work! :o)
>
> Cheers,
> Andy
>
>
>
> --
> Regards,
> Andrei Serebro
> tel. +79111758381
>
>
>
>
> --
> Regards,
> Andrei Serebro
> tel. +79111758381
>
>
>
>
> --
> Regards,
> Andrei Serebro
> tel. +79111758381
>
>
>
>
> --
> Regards,
> Andrei Serebro
> tel. +79111758381
>
>
> _______________________________________________
> cfe-dev mailing list
> [hidden email]
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>



--
Bruno Cardoso Lopes
http://www.brunocardoso.cc
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Clang Preprocessor Speed Up

Neil Nelson via cfe-dev
Yep, of course it passes, since without using the new flag it's pretty much the same release-38. 
I don't know, probably, either some new tests should be generated and clang with new flag should be run on them or we can just use old ones but somehow make clang run with new flag... 
 
15.06.2016, 21:28, "Bruno Cardoso Lopes" <[hidden email]>:

Hi, the speedups look great. Do this patch passes a clean `ninja check-clang`?

On Wed, Jun 15, 2016 at 8:30 AM, Андрей Серебро <[hidden email]> wrote:

 Ouch, forgot to attach stats. Here they are.

 15.06.2016, 18:28, "Андрей Серебро" <[hidden email]>:

 So, taking your advice I got rid of caching for now and concentrated on
 turning off source locations.

 I have introduced new flag -no-macro-exploc-tracking. Passing it to compiler
 does exactly what is intended to do - we don't generate expansion locations
 for macro tokens. To evaluate correct __LINE__ expansion, we just memorize
 the top most macro that is currently being expanded (it is ok since all
 macro expansions turn to produce one-liners... but there are some cases with
 multiline arguments for function-like macros, I didn't really get right now
 should we do something with it).

 I have tested it tough on boost 1.61. Among first 11000 files at least the
 success of compilation is the same for patched and clean release 38.

 Average boosting of preprocessing is 1.2 times. Also cases in which clang
 ran out of source locations will be less frequent I hope. At least, here is
 a fine example where compiler doesn't fall with the flag and fall without:

 clang -c -ftemplate-depth-32 -O3 -finline-functions -Wno-inline -Wall
 -pedantic -Wno-variadic-macros -pedantic-errors -std=c++11 -I
 $BOOST_1_61_HOME -no-macro-exploc-tracking
 $BOOST_1_61_HOME/libs/vmd/test/test_doc_modifiers_return_type.cpp

 If I don't put the flag, the process hangs (at least, on my machine).

 I attach the patch, hope for some feedback.

 P.S. Also here are preprocessing benchmarks for boost.

 09.06.2016, 16:19, "Андрей Серебро" <[hidden email]>:

 Interesting moment about Andy's comment. I found out that GCC already has
 (if I understood correctly) teh feature with expansion tracking depth.

 The option is -ftrack-macro-expansion[=level] and it's description can be
 found here:
 https://gcc.gnu.org/onlinedocs/gcc/Preprocessor-Options.html

 As I understand, clang doesn't bother with this yet.

 05.06.2016, 01:38, "Андрей Серебро" <[hidden email]>:

 Well, you know, this splitting sounds like a good idea. I guess I'll try to
 follow your advice, for it's really somehow strange to solve two tasks in a
 time.

 I weighed the advantages and disadvantages and I guess that saving source
 locations would just vanish any sense from the activity of boosting macro
 expansion, because we will need to do all the same works as we did without
 caching at all: to create proper locations we will need to go down from the
 very top macro downside for every token. So, the idea from my previous post
 was not that sane - saving proper source locations means just give up the
 idea at all, it will barely reduce time consumption at all.

 You are right about debugging purpose of SL, and I think the idea with
 limitation on depths, for which info will be present, is solid. Looks like
 more flexible then complete turning off.

 About scratch space - to be honest, I haven't thought about it yet. I didn't
 even look in this side, so that's good you have pointed it. First of all, I
 need to understand what is this scratch space all about. =)

 Such things like pragmas inside macro definitions which spoil caching can be
 overcome by just forbidding caching in certain situations, yet of course all
 such situations should be defined carefully.

 04.06.2016, 10:43, "Andy Gibbs" <[hidden email]>:

 On Friday 03 Jun 2016 14:33:37 Андрей Серебро wrote:

  Ok, that would be a miracle.

  As I understand, unfortunately getting rid of proper source locations
 causes
  big problems, yet it's currently hard for me to find some small samples on
  which everything crashes. I mean, it would be great to reduce the crash
  code dramatically, for now the examples of misbehavior are huge.
  I think a good understanding of interactions between SourceManager, Lexer
  and Parser is needed now because it's somewhere in between their
  communication, so to say, which turns to be broken after the patch. I see a
  possibility in trying to save proper SourceLocations, yet I'm not sure it
  will be such an easy feature.
  I wonder if I can find some person who deeply understands the concept of
  SourceLocations so that it's possible to discuss the underlying problems?
  Guys, I need your piece of advice :|


 My suggestion, but others should jump in here too, is that you split the
 patch
 up into two separate aims.

 The first aim would be the reduction of SourceLocation information.

 The second aim would be caching of previous macro expansions (from what I
 understand of your patch, I think this is what you are attempting?).

 I think this would be a benefit since it may be easier to get the patches
 working if you work on the two parts independently. Plus you indicated that
 the first still has a sizeable benefit. And I think the second will give you
 problems (detailed below).

 I'm afraid I am not able to help you much on the interactions of the
 SourceManager, Lexer and Parser. There are others on this mailing list who
 would be able to help you much better in this regard.

 Regarding reduction of SourceLocation information, these are my thoughts.
 Please note that I haven't assessed the practicality of this, and I'm
 writing
 as I think so it may not even be coherent! Caveats aside, you talked
 previously of having this as a compiler switch. I think this is a good idea
 since the SourceLocation information is needed for debugging macro
 expansions.
 I would suggest implementing this switch as a depth counter so that
 SourceLocation information is only once a macro expansion reaches a certain
 depth the SourceLocation information will be elided. In a sense, the
 important information of a macro expansion is the SourceLocations of the
 original tokens going into the macro and the SourceLocation of the final
 scratch buffer. At this point, I'm wondering whether it is useful to have
 SourceLocation information for concatenated tokens... Maybe someone else can
 give there thoughts here.

 The other thing you can think about here is that of actually releasing a
 scratch buffer no longer in use. I think clang doesn't do this currently
 (probably because of keeping all the SourceLocation information), but if you
 looked at having a reference count on a scratch buffer then maybe it will be
 possible to release it once its last use is complete. This has the benefit
 of
 also freeing memory... and may also improve the speed of the compiler.
 Again,
 others should give their thoughts here. Personally, I would be very
 interested to see if this was a possible line of improvement since with
 extensive macro expansions, the memory consumption of clang is huge.

 In fact, thinking about this, this may be the first place to start -- it is
 sort of linked to eliding SourceLocation information, but it has the
 advantage
 that you are freeing the whole scratch buffer since it has been incorporated
 into another higher-level scratch buffer, and so is no longer referenced
 anywhere and its disappearance shouldn't (hopefully!) cause so many problems
 in clang diagnostics, for example, which even so would need to be adapted to
 skip these elided macro expansion steps.

 Finally, regarding the caching of previous macro expansions, you need to be
 very careful because macro definitions can change, yes even inside a macro
 expansion too (think pragma pop_macro). For that matter, you need to think
 about how to handle any _Pragma expansion since these can cause side
 effects.
 This is why I would leave this side of your patch until last. It will be the
 hardest to test and guarantee correctness.

 Keep up the good work! :o)

 Cheers,
 Andy



 --
 Regards,
 Andrei Serebro
 tel. +79111758381




 --
 Regards,
 Andrei Serebro
 tel. +79111758381




 --
 Regards,
 Andrei Serebro
 tel. +79111758381




 --
 Regards,
 Andrei Serebro
 tel. +79111758381


 _______________________________________________
 cfe-dev mailing list
 [hidden email]
 http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev




--
Bruno Cardoso Lopes
http://www.brunocardoso.cc
 
 
-- 
Regards,
Andrei Serebro
tel. +79111758381
 

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Clang Preprocessor Speed Up

Neil Nelson via cfe-dev
I didn't have a lot of time to dig into the patch, but an off by default flag to speed things up with a reduced the diagnostic experience doesn't seem that valuable.

I thought there was a macro expansion note backtrace limit. Is there a way that we can leverage that to generate unique expansion source locations when the expansion level is low but cut them off when the expansion level is deep? I'm not a preprocessor expert, so can you elaborate on exactly what you're caching and reusing here?

On Wed, Jun 15, 2016 at 1:17 PM, Андрей Серебро <[hidden email]> wrote:
Yep, of course it passes, since without using the new flag it's pretty much the same release-38. 
I don't know, probably, either some new tests should be generated and clang with new flag should be run on them or we can just use old ones but somehow make clang run with new flag... 
 
15.06.2016, 21:28, "Bruno Cardoso Lopes" <[hidden email]>:

Hi, the speedups look great. Do this patch passes a clean `ninja check-clang`?

On Wed, Jun 15, 2016 at 8:30 AM, Андрей Серебро <[hidden email]> wrote:

 Ouch, forgot to attach stats. Here they are.

 15.06.2016, 18:28, "Андрей Серебро" <[hidden email]>:

 So, taking your advice I got rid of caching for now and concentrated on
 turning off source locations.

 I have introduced new flag -no-macro-exploc-tracking. Passing it to compiler
 does exactly what is intended to do - we don't generate expansion locations
 for macro tokens. To evaluate correct __LINE__ expansion, we just memorize
 the top most macro that is currently being expanded (it is ok since all
 macro expansions turn to produce one-liners... but there are some cases with
 multiline arguments for function-like macros, I didn't really get right now
 should we do something with it).

 I have tested it tough on boost 1.61. Among first 11000 files at least the
 success of compilation is the same for patched and clean release 38.

 Average boosting of preprocessing is 1.2 times. Also cases in which clang
 ran out of source locations will be less frequent I hope. At least, here is
 a fine example where compiler doesn't fall with the flag and fall without:

 clang -c -ftemplate-depth-32 -O3 -finline-functions -Wno-inline -Wall
 -pedantic -Wno-variadic-macros -pedantic-errors -std=c++11 -I
 $BOOST_1_61_HOME -no-macro-exploc-tracking
 $BOOST_1_61_HOME/libs/vmd/test/test_doc_modifiers_return_type.cpp

 If I don't put the flag, the process hangs (at least, on my machine).

 I attach the patch, hope for some feedback.

 P.S. Also here are preprocessing benchmarks for boost.

 09.06.2016, 16:19, "Андрей Серебро" <[hidden email]>:

 Interesting moment about Andy's comment. I found out that GCC already has
 (if I understood correctly) teh feature with expansion tracking depth.

 The option is -ftrack-macro-expansion[=level] and it's description can be
 found here:
 https://gcc.gnu.org/onlinedocs/gcc/Preprocessor-Options.html

 As I understand, clang doesn't bother with this yet.

 05.06.2016, 01:38, "Андрей Серебро" <[hidden email]>:

 Well, you know, this splitting sounds like a good idea. I guess I'll try to
 follow your advice, for it's really somehow strange to solve two tasks in a
 time.

 I weighed the advantages and disadvantages and I guess that saving source
 locations would just vanish any sense from the activity of boosting macro
 expansion, because we will need to do all the same works as we did without
 caching at all: to create proper locations we will need to go down from the
 very top macro downside for every token. So, the idea from my previous post
 was not that sane - saving proper source locations means just give up the
 idea at all, it will barely reduce time consumption at all.

 You are right about debugging purpose of SL, and I think the idea with
 limitation on depths, for which info will be present, is solid. Looks like
 more flexible then complete turning off.

 About scratch space - to be honest, I haven't thought about it yet. I didn't
 even look in this side, so that's good you have pointed it. First of all, I
 need to understand what is this scratch space all about. =)

 Such things like pragmas inside macro definitions which spoil caching can be
 overcome by just forbidding caching in certain situations, yet of course all
 such situations should be defined carefully.

 04.06.2016, 10:43, "Andy Gibbs" <[hidden email]>:

 On Friday 03 Jun 2016 14:33:37 Андрей Серебро wrote:

  Ok, that would be a miracle.

  As I understand, unfortunately getting rid of proper source locations
 causes
  big problems, yet it's currently hard for me to find some small samples on
  which everything crashes. I mean, it would be great to reduce the crash
  code dramatically, for now the examples of misbehavior are huge.
  I think a good understanding of interactions between SourceManager, Lexer
  and Parser is needed now because it's somewhere in between their
  communication, so to say, which turns to be broken after the patch. I see a
  possibility in trying to save proper SourceLocations, yet I'm not sure it
  will be such an easy feature.
  I wonder if I can find some person who deeply understands the concept of
  SourceLocations so that it's possible to discuss the underlying problems?
  Guys, I need your piece of advice :|


 My suggestion, but others should jump in here too, is that you split the
 patch
 up into two separate aims.

 The first aim would be the reduction of SourceLocation information.

 The second aim would be caching of previous macro expansions (from what I
 understand of your patch, I think this is what you are attempting?).

 I think this would be a benefit since it may be easier to get the patches
 working if you work on the two parts independently. Plus you indicated that
 the first still has a sizeable benefit. And I think the second will give you
 problems (detailed below).

 I'm afraid I am not able to help you much on the interactions of the
 SourceManager, Lexer and Parser. There are others on this mailing list who
 would be able to help you much better in this regard.

 Regarding reduction of SourceLocation information, these are my thoughts.
 Please note that I haven't assessed the practicality of this, and I'm
 writing
 as I think so it may not even be coherent! Caveats aside, you talked
 previously of having this as a compiler switch. I think this is a good idea
 since the SourceLocation information is needed for debugging macro
 expansions.
 I would suggest implementing this switch as a depth counter so that
 SourceLocation information is only once a macro expansion reaches a certain
 depth the SourceLocation information will be elided. In a sense, the
 important information of a macro expansion is the SourceLocations of the
 original tokens going into the macro and the SourceLocation of the final
 scratch buffer. At this point, I'm wondering whether it is useful to have
 SourceLocation information for concatenated tokens... Maybe someone else can
 give there thoughts here.

 The other thing you can think about here is that of actually releasing a
 scratch buffer no longer in use. I think clang doesn't do this currently
 (probably because of keeping all the SourceLocation information), but if you
 looked at having a reference count on a scratch buffer then maybe it will be
 possible to release it once its last use is complete. This has the benefit
 of
 also freeing memory... and may also improve the speed of the compiler.
 Again,
 others should give their thoughts here. Personally, I would be very
 interested to see if this was a possible line of improvement since with
 extensive macro expansions, the memory consumption of clang is huge.

 In fact, thinking about this, this may be the first place to start -- it is
 sort of linked to eliding SourceLocation information, but it has the
 advantage
 that you are freeing the whole scratch buffer since it has been incorporated
 into another higher-level scratch buffer, and so is no longer referenced
 anywhere and its disappearance shouldn't (hopefully!) cause so many problems
 in clang diagnostics, for example, which even so would need to be adapted to
 skip these elided macro expansion steps.

 Finally, regarding the caching of previous macro expansions, you need to be
 very careful because macro definitions can change, yes even inside a macro
 expansion too (think pragma pop_macro). For that matter, you need to think
 about how to handle any _Pragma expansion since these can cause side
 effects.
 This is why I would leave this side of your patch until last. It will be the
 hardest to test and guarantee correctness.

 Keep up the good work! :o)

 Cheers,
 Andy



 --
 Regards,
 Andrei Serebro
 tel. <a href="tel:%2B79111758381" value="+79111758381" target="_blank">+79111758381




 --
 Regards,
 Andrei Serebro
 tel. <a href="tel:%2B79111758381" value="+79111758381" target="_blank">+79111758381




 --
 Regards,
 Andrei Serebro
 tel. <a href="tel:%2B79111758381" value="+79111758381" target="_blank">+79111758381




 --
 Regards,
 Andrei Serebro
 tel. <a href="tel:%2B79111758381" value="+79111758381" target="_blank">+79111758381


 _______________________________________________
 cfe-dev mailing list
 [hidden email]
 http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev




--
Bruno Cardoso Lopes
http://www.brunocardoso.cc
 
 
-- 
Regards,
Andrei Serebro
tel. <a href="tel:%2B79111758381" value="+79111758381" target="_blank">+79111758381
 

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev



_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Clang Preprocessor Speed Up

Neil Nelson via cfe-dev
Yep, sure
 
You are right, currently there is just a possibility to turn off source locations tracking. It would be nice I guess to make a depth limit as we discussed earlier, and I think it is possible (at least, currently there is already a depth counter but it's used for a bit different purpose of setting the root macro when expansion is being performed). But this is the simplest part could be done at all.
 
I will try to add depth parameter soon so we will see where it leads us. I understand, that rather easy is to do the following: for each token, that has depth more than D with respect to top-level expanding macro we simply don't evaluate expansion location. I am not sure if you mean another problem, like "for each token keep information about D closest parents", but I guess this problem solution won't improve anything, even though it looks like useful. I will try to understand whether second problem can be solved efficiently, but for now I assume we are talking about the first one.
 
Example for the first problem statement: 
 
#define M1(X) X + M2(X)
#define M2(X) X
M1(1) // expansion of M1; M1 is a top-level expanding Macro
 
Suppose we have passed a depth limit of D=1. So, then the result will contain tokens '1', '+' with correct expansion locations, and the second '1', which comes from expansion of M2, will have no expansion location.  
 
About caching: currently this is postponed for a separate patch. The idea was rather simple, but as Andy Gibbs said, it's uncooked and it's better to split the tasks into two separate parts. So, now no caching is performed, at all. 
 
16.06.2016, 01:34, "Reid Kleckner" <[hidden email]>:
I didn't have a lot of time to dig into the patch, but an off by default flag to speed things up with a reduced the diagnostic experience doesn't seem that valuable.
 
I thought there was a macro expansion note backtrace limit. Is there a way that we can leverage that to generate unique expansion source locations when the expansion level is low but cut them off when the expansion level is deep? I'm not a preprocessor expert, so can you elaborate on exactly what you're caching and reusing here?

On Wed, Jun 15, 2016 at 1:17 PM, Андрей Серебро <[hidden email]> wrote:
Yep, of course it passes, since without using the new flag it's pretty much the same release-38. 
I don't know, probably, either some new tests should be generated and clang with new flag should be run on them or we can just use old ones but somehow make clang run with new flag... 
 
15.06.2016, 21:28, "Bruno Cardoso Lopes" <[hidden email]>:

Hi, the speedups look great. Do this patch passes a clean `ninja check-clang`?

On Wed, Jun 15, 2016 at 8:30 AM, Андрей Серебро <[hidden email]> wrote:

 Ouch, forgot to attach stats. Here they are.

 15.06.2016, 18:28, "Андрей Серебро" <[hidden email]>:

 So, taking your advice I got rid of caching for now and concentrated on
 turning off source locations.

 I have introduced new flag -no-macro-exploc-tracking. Passing it to compiler
 does exactly what is intended to do - we don't generate expansion locations
 for macro tokens. To evaluate correct __LINE__ expansion, we just memorize
 the top most macro that is currently being expanded (it is ok since all
 macro expansions turn to produce one-liners... but there are some cases with
 multiline arguments for function-like macros, I didn't really get right now
 should we do something with it).

 I have tested it tough on boost 1.61. Among first 11000 files at least the
 success of compilation is the same for patched and clean release 38.

 Average boosting of preprocessing is 1.2 times. Also cases in which clang
 ran out of source locations will be less frequent I hope. At least, here is
 a fine example where compiler doesn't fall with the flag and fall without:

 clang -c -ftemplate-depth-32 -O3 -finline-functions -Wno-inline -Wall
 -pedantic -Wno-variadic-macros -pedantic-errors -std=c++11 -I
 $BOOST_1_61_HOME -no-macro-exploc-tracking
 $BOOST_1_61_HOME/libs/vmd/test/test_doc_modifiers_return_type.cpp

 If I don't put the flag, the process hangs (at least, on my machine).

 I attach the patch, hope for some feedback.

 P.S. Also here are preprocessing benchmarks for boost.

 09.06.2016, 16:19, "Андрей Серебро" <[hidden email]>:

 Interesting moment about Andy's comment. I found out that GCC already has
 (if I understood correctly) teh feature with expansion tracking depth.

 The option is -ftrack-macro-expansion[=level] and it's description can be
 found here:
 https://gcc.gnu.org/onlinedocs/gcc/Preprocessor-Options.html

 As I understand, clang doesn't bother with this yet.

 05.06.2016, 01:38, "Андрей Серебро" <[hidden email]>:

 Well, you know, this splitting sounds like a good idea. I guess I'll try to
 follow your advice, for it's really somehow strange to solve two tasks in a
 time.

 I weighed the advantages and disadvantages and I guess that saving source
 locations would just vanish any sense from the activity of boosting macro
 expansion, because we will need to do all the same works as we did without
 caching at all: to create proper locations we will need to go down from the
 very top macro downside for every token. So, the idea from my previous post
 was not that sane - saving proper source locations means just give up the
 idea at all, it will barely reduce time consumption at all.

 You are right about debugging purpose of SL, and I think the idea with
 limitation on depths, for which info will be present, is solid. Looks like
 more flexible then complete turning off.

 About scratch space - to be honest, I haven't thought about it yet. I didn't
 even look in this side, so that's good you have pointed it. First of all, I
 need to understand what is this scratch space all about. =)

 Such things like pragmas inside macro definitions which spoil caching can be
 overcome by just forbidding caching in certain situations, yet of course all
 such situations should be defined carefully.

 04.06.2016, 10:43, "Andy Gibbs" <[hidden email]>:

 On Friday 03 Jun 2016 14:33:37 Андрей Серебро wrote:

  Ok, that would be a miracle.

  As I understand, unfortunately getting rid of proper source locations
 causes
  big problems, yet it's currently hard for me to find some small samples on
  which everything crashes. I mean, it would be great to reduce the crash
  code dramatically, for now the examples of misbehavior are huge.
  I think a good understanding of interactions between SourceManager, Lexer
  and Parser is needed now because it's somewhere in between their
  communication, so to say, which turns to be broken after the patch. I see a
  possibility in trying to save proper SourceLocations, yet I'm not sure it
  will be such an easy feature.
  I wonder if I can find some person who deeply understands the concept of
  SourceLocations so that it's possible to discuss the underlying problems?
  Guys, I need your piece of advice :|


 My suggestion, but others should jump in here too, is that you split the
 patch
 up into two separate aims.

 The first aim would be the reduction of SourceLocation information.

 The second aim would be caching of previous macro expansions (from what I
 understand of your patch, I think this is what you are attempting?).

 I think this would be a benefit since it may be easier to get the patches
 working if you work on the two parts independently. Plus you indicated that
 the first still has a sizeable benefit. And I think the second will give you
 problems (detailed below).

 I'm afraid I am not able to help you much on the interactions of the
 SourceManager, Lexer and Parser. There are others on this mailing list who
 would be able to help you much better in this regard.

 Regarding reduction of SourceLocation information, these are my thoughts.
 Please note that I haven't assessed the practicality of this, and I'm
 writing
 as I think so it may not even be coherent! Caveats aside, you talked
 previously of having this as a compiler switch. I think this is a good idea
 since the SourceLocation information is needed for debugging macro
 expansions.
 I would suggest implementing this switch as a depth counter so that
 SourceLocation information is only once a macro expansion reaches a certain
 depth the SourceLocation information will be elided. In a sense, the
 important information of a macro expansion is the SourceLocations of the
 original tokens going into the macro and the SourceLocation of the final
 scratch buffer. At this point, I'm wondering whether it is useful to have
 SourceLocation information for concatenated tokens... Maybe someone else can
 give there thoughts here.

 The other thing you can think about here is that of actually releasing a
 scratch buffer no longer in use. I think clang doesn't do this currently
 (probably because of keeping all the SourceLocation information), but if you
 looked at having a reference count on a scratch buffer then maybe it will be
 possible to release it once its last use is complete. This has the benefit
 of
 also freeing memory... and may also improve the speed of the compiler.
 Again,
 others should give their thoughts here. Personally, I would be very
 interested to see if this was a possible line of improvement since with
 extensive macro expansions, the memory consumption of clang is huge.

 In fact, thinking about this, this may be the first place to start -- it is
 sort of linked to eliding SourceLocation information, but it has the
 advantage
 that you are freeing the whole scratch buffer since it has been incorporated
 into another higher-level scratch buffer, and so is no longer referenced
 anywhere and its disappearance shouldn't (hopefully!) cause so many problems
 in clang diagnostics, for example, which even so would need to be adapted to
 skip these elided macro expansion steps.

 Finally, regarding the caching of previous macro expansions, you need to be
 very careful because macro definitions can change, yes even inside a macro
 expansion too (think pragma pop_macro). For that matter, you need to think
 about how to handle any _Pragma expansion since these can cause side
 effects.
 This is why I would leave this side of your patch until last. It will be the
 hardest to test and guarantee correctness.

 Keep up the good work! :o)

 Cheers,
 Andy



 --
 Regards,
 Andrei Serebro
 tel. <a href="tel:%2B79111758381" target="_blank">+79111758381




 --
 Regards,
 Andrei Serebro
 tel. <a href="tel:%2B79111758381" target="_blank">+79111758381




 --
 Regards,
 Andrei Serebro
 tel. <a href="tel:%2B79111758381" target="_blank">+79111758381




 --
 Regards,
 Andrei Serebro
 tel. <a href="tel:%2B79111758381" target="_blank">+79111758381


 _______________________________________________
 cfe-dev mailing list
 [hidden email]
 http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev




--
Bruno Cardoso Lopes
http://www.brunocardoso.cc
 
 
-- 
Regards,
Andrei Serebro
tel. <a href="tel:%2B79111758381" target="_blank">+79111758381
 

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
 
 
-- 
Regards,
Andrei Serebro
tel. +79111758381
 

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Clang Preprocessor Speed Up

Neil Nelson via cfe-dev
Good day, everyone!
 
Here is a version which allows to set a certain expansion depth limit. Below this depth, expansion locations are evaluated fairly, and if expansion depth exceeds it, then expansion locations are not evaluated.
New flag is --max-macro-exploc-depth=N, where N is unsigned. If no passing the flag at all, no limitations on depth are used (so in this case preprocessing goes like in old good clang)
 
16.06.2016, 02:16, "Андрей Серебро" <[hidden email]>:
Yep, sure
 
You are right, currently there is just a possibility to turn off source locations tracking. It would be nice I guess to make a depth limit as we discussed earlier, and I think it is possible (at least, currently there is already a depth counter but it's used for a bit different purpose of setting the root macro when expansion is being performed). But this is the simplest part could be done at all.
 
I will try to add depth parameter soon so we will see where it leads us. I understand, that rather easy is to do the following: for each token, that has depth more than D with respect to top-level expanding macro we simply don't evaluate expansion location. I am not sure if you mean another problem, like "for each token keep information about D closest parents", but I guess this problem solution won't improve anything, even though it looks like useful. I will try to understand whether second problem can be solved efficiently, but for now I assume we are talking about the first one.
 
Example for the first problem statement: 
 
#define M1(X) X + M2(X)
#define M2(X) X
M1(1) // expansion of M1; M1 is a top-level expanding Macro
 
Suppose we have passed a depth limit of D=1. So, then the result will contain tokens '1', '+' with correct expansion locations, and the second '1', which comes from expansion of M2, will have no expansion location.  
 
About caching: currently this is postponed for a separate patch. The idea was rather simple, but as Andy Gibbs said, it's uncooked and it's better to split the tasks into two separate parts. So, now no caching is performed, at all. 
 
16.06.2016, 01:34, "Reid Kleckner" <[hidden email]>:
I didn't have a lot of time to dig into the patch, but an off by default flag to speed things up with a reduced the diagnostic experience doesn't seem that valuable.
 
I thought there was a macro expansion note backtrace limit. Is there a way that we can leverage that to generate unique expansion source locations when the expansion level is low but cut them off when the expansion level is deep? I'm not a preprocessor expert, so can you elaborate on exactly what you're caching and reusing here?

On Wed, Jun 15, 2016 at 1:17 PM, Андрей Серебро <[hidden email]> wrote:
Yep, of course it passes, since without using the new flag it's pretty much the same release-38. 
I don't know, probably, either some new tests should be generated and clang with new flag should be run on them or we can just use old ones but somehow make clang run with new flag... 
 
15.06.2016, 21:28, "Bruno Cardoso Lopes" <[hidden email]>:

Hi, the speedups look great. Do this patch passes a clean `ninja check-clang`?

On Wed, Jun 15, 2016 at 8:30 AM, Андрей Серебро <[hidden email]> wrote:

 Ouch, forgot to attach stats. Here they are.

 15.06.2016, 18:28, "Андрей Серебро" <[hidden email]>:

 So, taking your advice I got rid of caching for now and concentrated on
 turning off source locations.

 I have introduced new flag -no-macro-exploc-tracking. Passing it to compiler
 does exactly what is intended to do - we don't generate expansion locations
 for macro tokens. To evaluate correct __LINE__ expansion, we just memorize
 the top most macro that is currently being expanded (it is ok since all
 macro expansions turn to produce one-liners... but there are some cases with
 multiline arguments for function-like macros, I didn't really get right now
 should we do something with it).

 I have tested it tough on boost 1.61. Among first 11000 files at least the
 success of compilation is the same for patched and clean release 38.

 Average boosting of preprocessing is 1.2 times. Also cases in which clang
 ran out of source locations will be less frequent I hope. At least, here is
 a fine example where compiler doesn't fall with the flag and fall without:

 clang -c -ftemplate-depth-32 -O3 -finline-functions -Wno-inline -Wall
 -pedantic -Wno-variadic-macros -pedantic-errors -std=c++11 -I
 $BOOST_1_61_HOME -no-macro-exploc-tracking
 $BOOST_1_61_HOME/libs/vmd/test/test_doc_modifiers_return_type.cpp

 If I don't put the flag, the process hangs (at least, on my machine).

 I attach the patch, hope for some feedback.

 P.S. Also here are preprocessing benchmarks for boost.

 09.06.2016, 16:19, "Андрей Серебро" <[hidden email]>:

 Interesting moment about Andy's comment. I found out that GCC already has
 (if I understood correctly) teh feature with expansion tracking depth.

 The option is -ftrack-macro-expansion[=level] and it's description can be
 found here:
 https://gcc.gnu.org/onlinedocs/gcc/Preprocessor-Options.html

 As I understand, clang doesn't bother with this yet.

 05.06.2016, 01:38, "Андрей Серебро" <[hidden email]>:

 Well, you know, this splitting sounds like a good idea. I guess I'll try to
 follow your advice, for it's really somehow strange to solve two tasks in a
 time.

 I weighed the advantages and disadvantages and I guess that saving source
 locations would just vanish any sense from the activity of boosting macro
 expansion, because we will need to do all the same works as we did without
 caching at all: to create proper locations we will need to go down from the
 very top macro downside for every token. So, the idea from my previous post
 was not that sane - saving proper source locations means just give up the
 idea at all, it will barely reduce time consumption at all.

 You are right about debugging purpose of SL, and I think the idea with
 limitation on depths, for which info will be present, is solid. Looks like
 more flexible then complete turning off.

 About scratch space - to be honest, I haven't thought about it yet. I didn't
 even look in this side, so that's good you have pointed it. First of all, I
 need to understand what is this scratch space all about. =)

 Such things like pragmas inside macro definitions which spoil caching can be
 overcome by just forbidding caching in certain situations, yet of course all
 such situations should be defined carefully.

 04.06.2016, 10:43, "Andy Gibbs" <[hidden email]>:

 On Friday 03 Jun 2016 14:33:37 Андрей Серебро wrote:

  Ok, that would be a miracle.

  As I understand, unfortunately getting rid of proper source locations
 causes
  big problems, yet it's currently hard for me to find some small samples on
  which everything crashes. I mean, it would be great to reduce the crash
  code dramatically, for now the examples of misbehavior are huge.
  I think a good understanding of interactions between SourceManager, Lexer
  and Parser is needed now because it's somewhere in between their
  communication, so to say, which turns to be broken after the patch. I see a
  possibility in trying to save proper SourceLocations, yet I'm not sure it
  will be such an easy feature.
  I wonder if I can find some person who deeply understands the concept of
  SourceLocations so that it's possible to discuss the underlying problems?
  Guys, I need your piece of advice :|


 My suggestion, but others should jump in here too, is that you split the
 patch
 up into two separate aims.

 The first aim would be the reduction of SourceLocation information.

 The second aim would be caching of previous macro expansions (from what I
 understand of your patch, I think this is what you are attempting?).

 I think this would be a benefit since it may be easier to get the patches
 working if you work on the two parts independently. Plus you indicated that
 the first still has a sizeable benefit. And I think the second will give you
 problems (detailed below).

 I'm afraid I am not able to help you much on the interactions of the
 SourceManager, Lexer and Parser. There are others on this mailing list who
 would be able to help you much better in this regard.

 Regarding reduction of SourceLocation information, these are my thoughts.
 Please note that I haven't assessed the practicality of this, and I'm
 writing
 as I think so it may not even be coherent! Caveats aside, you talked
 previously of having this as a compiler switch. I think this is a good idea
 since the SourceLocation information is needed for debugging macro
 expansions.
 I would suggest implementing this switch as a depth counter so that
 SourceLocation information is only once a macro expansion reaches a certain
 depth the SourceLocation information will be elided. In a sense, the
 important information of a macro expansion is the SourceLocations of the
 original tokens going into the macro and the SourceLocation of the final
 scratch buffer. At this point, I'm wondering whether it is useful to have
 SourceLocation information for concatenated tokens... Maybe someone else can
 give there thoughts here.

 The other thing you can think about here is that of actually releasing a
 scratch buffer no longer in use. I think clang doesn't do this currently
 (probably because of keeping all the SourceLocation information), but if you
 looked at having a reference count on a scratch buffer then maybe it will be
 possible to release it once its last use is complete. This has the benefit
 of
 also freeing memory... and may also improve the speed of the compiler.
 Again,
 others should give their thoughts here. Personally, I would be very
 interested to see if this was a possible line of improvement since with
 extensive macro expansions, the memory consumption of clang is huge.

 In fact, thinking about this, this may be the first place to start -- it is
 sort of linked to eliding SourceLocation information, but it has the
 advantage
 that you are freeing the whole scratch buffer since it has been incorporated
 into another higher-level scratch buffer, and so is no longer referenced
 anywhere and its disappearance shouldn't (hopefully!) cause so many problems
 in clang diagnostics, for example, which even so would need to be adapted to
 skip these elided macro expansion steps.

 Finally, regarding the caching of previous macro expansions, you need to be
 very careful because macro definitions can change, yes even inside a macro
 expansion too (think pragma pop_macro). For that matter, you need to think
 about how to handle any _Pragma expansion since these can cause side
 effects.
 This is why I would leave this side of your patch until last. It will be the
 hardest to test and guarantee correctness.

 Keep up the good work! :o)

 Cheers,
 Andy



 --
 Regards,
 Andrei Serebro
 tel. <a href="tel:%2B79111758381" target="_blank">+79111758381




 --
 Regards,
 Andrei Serebro
 tel. <a href="tel:%2B79111758381" target="_blank">+79111758381




 --
 Regards,
 Andrei Serebro
 tel. <a href="tel:%2B79111758381" target="_blank">+79111758381




 --
 Regards,
 Andrei Serebro
 tel. <a href="tel:%2B79111758381" target="_blank">+79111758381


 _______________________________________________
 cfe-dev mailing list
 [hidden email]
 http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev




--
Bruno Cardoso Lopes
http://www.brunocardoso.cc
 
 
-- 
Regards,
Andrei Serebro
tel. <a href="tel:%2B79111758381" target="_blank">+79111758381
 

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
 
 
-- 
Regards,
Andrei Serebro
tel. +79111758381
 
 
 
-- 
Regards,
Andrei Serebro
tel. +79111758381
 

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

patch_slocs_depth.diff (27K) Download Attachment
12