Who wants faster LLVM/Clang builds?

classic Classic list List threaded Threaded
38 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] Who wants faster LLVM/Clang builds?

Liu, Yaxun (Sam) via cfe-dev


On Dec 8, 2017, at 5:01 PM, Mikhail Zolotukhin via llvm-dev <[hidden email]> wrote:

Hi,

I tweaked my scripts to avoid removing includes when it doesn't give any significant benefits, which made the patches significantly smaller. This time the patches should not try to remove includes of header files, which are transitively included from other included header files. The gains mostly remained the same (plus/minus noise), the tables are in the end of the email. I also included size of preprocessed files (measured in 1000 lines of code).

Just my opinion, but it is *good* to remove redundant header file includes, even if it doesn’t speed up the build.

-Chris



I suggest that from here we go as follows: maintainers/interested people take a look at files related to their components and pick the parts of the patches that they consider correct. I'll also start with some files next week if there is no objections to it. Does it sound reasonable?

The most impacted files (the numbers are for Debug build):

LLVM top 10
Filename TimeOld TimeNew Delta SizeOld SizeNew SizeDelta
lib/CodeGen/GlobalISel/GlobalISel.cpp 0.26 0.02 -91.6% 35.0 0.3 -99.0%
lib/MC/MCLabel.cpp 0.20 0.02 -88.0% 25.5 0.0 -99.9%
tools/llvm-readobj/ObjDumper.cpp 0.44 0.10 -76.8% 41.0 11.8 -71.1%
lib/MC/MCWinEH.cpp 0.49 0.15 -70.4% 43.9 21.4 -51.2%
lib/Transforms/Vectorize/Vectorize.cpp 0.73 0.29 -60.7% 52.7 35.5 -32.6%
tools/llvm-diff/DiffLog.cpp 0.59 0.27 -53.8% 50.7 33.7 -33.7%
lib/Target/ARM/MCTargetDesc/ARMMachORelocationInfo.cpp 0.47 0.25 -46.7% 46.7 37.9 -18.9%
lib/DebugInfo/DWARF/DWARFExpression.cpp 0.67 0.38 -43.5% 47.4 34.8 -26.7%
lib/Transforms/Utils/ASanStackFrameLayout.cpp 0.52 0.32 -38.8% 41.7 33.7 -19.2%
tools/llvm-dwp/llvm-dwp.cpp 2.48 1.53 -38.3% 92.5 55.2 -40.3%

Full list:
<llvm.txt>

Clang top 10
Filename TimeOld TimeNew Delta SizeOld SizeNew SizeDelta
tools/libclang/CXString.cpp 2.25 0.30 -86.9% 85.1 31.7 -62.7%
lib/Tooling/CommonOptionsParser.cpp 2.33 0.68 -70.6% 87.1 34.4 -60.5%
lib/AST/StmtViz.cpp 1.28 0.48 -62.5% 62.4 39.0 -37.5%
tools/driver/cc1_main.cpp 3.05 1.29 -57.8% 93.7 58.6 -37.4%
unittests/CodeGen/BufferSourceTest.cpp 4.12 2.62 -36.5% 145.8 103.9 -28.7%
lib/CodeGen/CGLoopInfo.cpp 2.43 1.68 -30.7% 101.6 82.5 -18.8%
unittests/CodeGen/CodeGenExternalTest.cpp 4.50 3.21 -28.6% 155.5 125.1 -19.5%
lib/Driver/ToolChains/Contiki.cpp 0.53 0.38 -28.1% 42.4 38.0 -10.5%
unittests/Tooling/RefactoringActionRulesTest.cpp 3.22 2.34 -27.5% 108.3 90.0 -16.9%
lib/Serialization/GeneratePCH.cpp 2.38 1.78 -25.1% 83.8 71.1 -15.1%

Full list:
<clang.txt>


The updated patches:
<llvm_redundant_headers.patch>
<clang_redundant_headers.patch>

Thanks,
Michael

On Dec 8, 2017, at 9:20 AM, Quentin Colombet via llvm-dev <[hidden email]> wrote:



On Dec 6, 2017, at 1:17 PM, Matthias Braun via llvm-dev <[hidden email]> wrote:

- We do indeed have a lot of unnecessary includes around in llvm (or pretty much any other C++ project for that matter).
- I want faster builds.
- The only way to reliably fight this is indeed automatic tools.
- Having the right amount of includes also has documentation value and ideally let's you understand the structure of your project.
- However relying on transitive includes works contrary to the last "undestanding/documentation" point.
- (And as stated earlier to have things really clean we want `class XXX;` instead of `#include "XXX.h"` wherever possible. And if you are serious about that we also often have to reduce the amount of include code in the headers so we can move the `#include "XXX.h"` from the header to the implementation.

For me personally I think the documentation/understandability we loose when relying on transitive includes weights heavier than my desire to get a faster build…

+1

Q.


- Matthias

On Dec 6, 2017, at 12:28 PM, serge guelton via llvm-dev <[hidden email]> wrote:

On Wed, Dec 06, 2017 at 11:38:54AM -0800, Michael Zolotukhin via llvm-dev wrote:

On Dec 6, 2017, at 9:00 AM, mats petersson via cfe-dev <[hidden email]> wrote:

In my experience, a lot of time is spent on optimizing the code (assuming it's not a "-O0" build).
The numbers were actually for the debug build (-O0 -g), so for Release build they would be different (presumably lower).
Also redundant includes are largely fixed by header guards, and I believe Clang [and gcc as well as MS Compilers, and probably most others too] have an include guards-cache that determines that "we've already included foo.h, and it has include guards around the whole actual content of the file, so we can just skip it”.
By redundant here I meant that we included a file, but we didn’t use any of its content (rather than we included the same file twice).

So I'm slightly dubious as to this being an efficient way of significantly reducing the total compilation time for the overall project - even if there are SOME cases where there is a significant improvement in a single file. The total time for a clean build [in wall-clock-time, not CPU-time] should be measured, making sure that there is enough memory. Doing a run of, say, five complete builds of the same thing [with suitable "clean" between to redo the whole build], take away the worst and the best, and perhaps also "modify one of the more common header files" (llvm/IR/Type.h for example) and build again.
On full builds the benefit is not big (around 1%, but the noise is high), but: 1) if we only take gains more than, say, 5%, we’ll probably never see any, 2) I aim at changes that make the code strictly better (modulo David’s point about disk cache). If any change is questionable from maintenance or whatever other point of view, I’m all for dropping it.

my 2¢

+1 for point 2). Even leaving aside the speed gain, removing unused
includes file just looks like good coding practice to me.

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] Who wants faster LLVM/Clang builds?

Liu, Yaxun (Sam) via cfe-dev
In reply to this post by Liu, Yaxun (Sam) via cfe-dev
Hi Michael,

On Thu, Dec 7, 2017 at 3:16 AM, Michael Zolotukhin
<[hidden email]> wrote:
>
> Nice to IWYU developers here:) I wonder how hard it would be to run IWYU on
> LLVM/Clang (or, if it’s supposed to work, I wonder what I did wrong).

There are known problems with running IWYU over LLVM/Clang -- Zachary
Turner made an attempt a while back to get it up and running. Since
the LLVM tree uses all sorts of modern and moderately complex
patterns, we're struggling to keep up.

> If we also can tweak it a bit to make it choose more human-like (~more
> conservative) decisions, we would be able to just apply what it suggests!

Different humans appear to have different preferences :)

It'd be great to hear more about specifics, maybe you can bring them
up on the IWYU list:
https://groups.google.com/forum/#!forum/include-what-you-use?

Some of IWYU's controversial changes are by design, but some are just
bugs, and if you have ideas for a more principled design, we
appreciate all the input we can get.

Thanks,
- Kim
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] Who wants faster LLVM/Clang builds?

Liu, Yaxun (Sam) via cfe-dev
Hi Kim,

On Dec 10, 2017, at 7:39 AM, Kim Gräsman <[hidden email]> wrote:

Hi Michael,

On Thu, Dec 7, 2017 at 3:16 AM, Michael Zolotukhin
<[hidden email]> wrote:

Nice to IWYU developers here:) I wonder how hard it would be to run IWYU on
LLVM/Clang (or, if it’s supposed to work, I wonder what I did wrong).

There are known problems with running IWYU over LLVM/Clang -- Zachary
Turner made an attempt a while back to get it up and running. Since
the LLVM tree uses all sorts of modern and moderately complex
patterns, we're struggling to keep up.
I see.

If we also can tweak it a bit to make it choose more human-like (~more
conservative) decisions, we would be able to just apply what it suggests!

Different humans appear to have different preferences :)
True, what I meant hear is to make the changes more conservative: e.g. if we can replace
#include "MyClass.h"
with 
class MyClass;
then this change is probably desirable in every way: it documents the code better, it decreases coupling, it improves compile time.

But if for instance we suggest replacing it with
#include "BaseClass1.h"           // previously included from MyClass.h
#include "ExtraStuffForMyClass.h"
then it's less straight-forward, at least from the code-selfdocumentation point of view.

If we could make IWYU to only suggest changes of the first type, then we probably could've just blindly apply all the suggested changes.


It'd be great to hear more about specifics, maybe you can bring them
up on the IWYU list:
https://groups.google.com/forum/#!forum/include-what-you-use?
Will do, thanks.

Michael

Some of IWYU's controversial changes are by design, but some are just
bugs, and if you have ideas for a more principled design, we
appreciate all the input we can get.

Thanks,
- Kim


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] Who wants faster LLVM/Clang builds?

Liu, Yaxun (Sam) via cfe-dev

There are known problems with running IWYU over LLVM/Clang -- Zachary
Turner made an attempt a while back to get it up and running. Since
the LLVM tree uses all sorts of modern and moderately complex
patterns, we're struggling to keep up

 

Re-implementing IWYU based on Clang tooling seems like it would help a lot with that kind of problem.

--paulr

 

From: cfe-dev [mailto:[hidden email]] On Behalf Of Mikhail Zolotukhin via cfe-dev
Sent: Monday, December 11, 2017 12:37 PM
To: Kim Gräsman
Cc: llvm-dev; Clang Dev; Chris Lattner
Subject: Re: [cfe-dev] [llvm-dev] Who wants faster LLVM/Clang builds?

 

Hi Kim,

 

On Dec 10, 2017, at 7:39 AM, Kim Gräsman <[hidden email]> wrote:

 

Hi Michael,

On Thu, Dec 7, 2017 at 3:16 AM, Michael Zolotukhin
<[hidden email]> wrote:


Nice to IWYU developers here:) I wonder how hard it would be to run IWYU on
LLVM/Clang (or, if it’s supposed to work, I wonder what I did wrong).


There are known problems with running IWYU over LLVM/Clang -- Zachary
Turner made an attempt a while back to get it up and running. Since
the LLVM tree uses all sorts of modern and moderately complex
patterns, we're struggling to keep up.

I see.



If we also can tweak it a bit to make it choose more human-like (~more
conservative) decisions, we would be able to just apply what it suggests!


Different humans appear to have different preferences :)

True, what I meant hear is to make the changes more conservative: e.g. if we can replace

#include "MyClass.h"

with 

class MyClass;

then this change is probably desirable in every way: it documents the code better, it decreases coupling, it improves compile time.

 

But if for instance we suggest replacing it with

#include "BaseClass1.h"           // previously included from MyClass.h

#include "ExtraStuffForMyClass.h"

then it's less straight-forward, at least from the code-selfdocumentation point of view.

 

If we could make IWYU to only suggest changes of the first type, then we probably could've just blindly apply all the suggested changes.

 


It'd be great to hear more about specifics, maybe you can bring them
up on the IWYU list:
https://groups.google.com/forum/#!forum/include-what-you-use?

Will do, thanks.

 

Michael


Some of IWYU's controversial changes are by design, but some are just
bugs, and if you have ideas for a more principled design, we
appreciate all the input we can get.

Thanks,
- Kim

 


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] Who wants faster LLVM/Clang builds?

Liu, Yaxun (Sam) via cfe-dev
In reply to this post by Liu, Yaxun (Sam) via cfe-dev
By the way, have you tried -fmodules ?

Takumi

On Wed, Dec 6, 2017 at 1:41 PM Mikhail Zolotukhin via llvm-dev <[hidden email]> wrote:
Hi,

Recently I've done some experiments on the LLVM/Clang code and discovered that many of our source files often include unnecessary header files. I wrote a simple tool that eliminates redundant includes and estimates benefits of doing it, and the results were quite nice: for some files we were able to save 90% of compile time! I think we want to apply some of the cleanups I found, but I'm not sure how to better do it: the total patches are 8k lines of code for LLVM and 3k lines of code for clang (I'll attach them for reference). My suggestion would be that people take a look at the list of changed files and pick the changes for the piece of code they are working on if the changes look sane (the changes do need some checking before committing). Does it sound like a good idea? I'd appreciate any feedback on what can we do here.

The list of files for which removing redundant headers improved compile time (the numbers are compile time in seconds for a Debug build):

LLVM top 10
Filename Old New Delta
lib/CodeGen/GlobalISel/GlobalISel.cpp 0.26 0.02 -91.9%
lib/MC/MCLabel.cpp 0.19 0.02 -88.2%
tools/llvm-readobj/ObjDumper.cpp 0.43 0.10 -76.5%
lib/MC/MCWinEH.cpp 0.51 0.13 -74.3%
lib/Transforms/Vectorize/Vectorize.cpp 0.72 0.29 -59.7%
tools/llvm-diff/DiffLog.cpp 0.58 0.26 -54.6%
lib/Target/ARM/MCTargetDesc/ARMMachORelocationInfo.cpp 0.46 0.26 -44.1%
lib/DebugInfo/DWARF/DWARFExpression.cpp 0.68 0.38 -43.3%
lib/LTO/LTOModule.cpp 2.25 1.33 -41.1%
lib/Target/TargetMachine.cpp 1.76 1.10 -37.8%

Full list:


Clang top 10
Filename Old New Delta
tools/libclang/CXString.cpp 1.70 0.25 -85.2%
lib/Tooling/CommonOptionsParser.cpp 1.69 0.55 -67.3%
lib/AST/StmtViz.cpp 1.02 0.44 -57.4%
tools/driver/cc1_main.cpp 2.26 0.97 -57.1%
unittests/CodeGen/BufferSourceTest.cpp 3.08 1.83 -40.6%
lib/CodeGen/CGLoopInfo.cpp 1.91 1.34 -29.9%
unittests/Tooling/RefactoringActionRulesTest.cpp 2.46 1.79 -27.0%
unittests/CodeGen/CodeGenExternalTest.cpp 3.43 2.52 -26.5%
tools/libclang/CXStoredDiagnostic.cpp 1.67 1.26 -24.8%
tools/clang-func-mapping/ClangFnMapGen.cpp 2.48 1.89 -23.8%

Full list:

The corresponding patches (careful, they are big):

Methodology
My tool took the compile_commands.json from LLVM build and  iterated over files trying to remove redundant headers. To find which header files could be removed it scanned the file for "#include" lines and tried to remove them one by one (checking if the file still compiles after the removal). When there were no more include lines to remove, we verified the change with ninja+ninja check. After it we compared preprocessed file size before and after the change hoping to see that it dropped and then checked the compile time impact.
NB: As a side effect of this approach we removed all include-lines from inactive "ifdef" sections, which means that the patches *will* break other configurations if applied as-is.

Thanks,
Michael
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] Who wants faster LLVM/Clang builds?

Liu, Yaxun (Sam) via cfe-dev

On Dec 11, 2017, at 7:14 PM, NAKAMURA Takumi <[hidden email]> wrote:

By the way, have you tried -fmodules ?
No, I haven’t. Do you have any specific experiment in mind that you wanted to try?

Thanks,
Michael

Takumi

On Wed, Dec 6, 2017 at 1:41 PM Mikhail Zolotukhin via llvm-dev <[hidden email]> wrote:
Hi,

Recently I've done some experiments on the LLVM/Clang code and discovered that many of our source files often include unnecessary header files. I wrote a simple tool that eliminates redundant includes and estimates benefits of doing it, and the results were quite nice: for some files we were able to save 90% of compile time! I think we want to apply some of the cleanups I found, but I'm not sure how to better do it: the total patches are 8k lines of code for LLVM and 3k lines of code for clang (I'll attach them for reference). My suggestion would be that people take a look at the list of changed files and pick the changes for the piece of code they are working on if the changes look sane (the changes do need some checking before committing). Does it sound like a good idea? I'd appreciate any feedback on what can we do here.

The list of files for which removing redundant headers improved compile time (the numbers are compile time in seconds for a Debug build):

LLVM top 10
Filename Old New Delta
lib/CodeGen/GlobalISel/GlobalISel.cpp 0.26 0.02 -91.9%
lib/MC/MCLabel.cpp 0.19 0.02 -88.2%
tools/llvm-readobj/ObjDumper.cpp 0.43 0.10 -76.5%
lib/MC/MCWinEH.cpp 0.51 0.13 -74.3%
lib/Transforms/Vectorize/Vectorize.cpp 0.72 0.29 -59.7%
tools/llvm-diff/DiffLog.cpp 0.58 0.26 -54.6%
lib/Target/ARM/MCTargetDesc/ARMMachORelocationInfo.cpp 0.46 0.26 -44.1%
lib/DebugInfo/DWARF/DWARFExpression.cpp 0.68 0.38 -43.3%
lib/LTO/LTOModule.cpp 2.25 1.33 -41.1%
lib/Target/TargetMachine.cpp 1.76 1.10 -37.8%

Full list:


Clang top 10
Filename Old New Delta
tools/libclang/CXString.cpp 1.70 0.25 -85.2%
lib/Tooling/CommonOptionsParser.cpp 1.69 0.55 -67.3%
lib/AST/StmtViz.cpp 1.02 0.44 -57.4%
tools/driver/cc1_main.cpp 2.26 0.97 -57.1%
unittests/CodeGen/BufferSourceTest.cpp 3.08 1.83 -40.6%
lib/CodeGen/CGLoopInfo.cpp 1.91 1.34 -29.9%
unittests/Tooling/RefactoringActionRulesTest.cpp 2.46 1.79 -27.0%
unittests/CodeGen/CodeGenExternalTest.cpp 3.43 2.52 -26.5%
tools/libclang/CXStoredDiagnostic.cpp 1.67 1.26 -24.8%
tools/clang-func-mapping/ClangFnMapGen.cpp 2.48 1.89 -23.8%

Full list:

The corresponding patches (careful, they are big):

Methodology
My tool took the compile_commands.json from LLVM build and  iterated over files trying to remove redundant headers. To find which header files could be removed it scanned the file for "#include" lines and tried to remove them one by one (checking if the file still compiles after the removal). When there were no more include lines to remove, we verified the change with ninja+ninja check. After it we compared preprocessed file size before and after the change hoping to see that it dropped and then checked the compile time impact.
NB: As a side effect of this approach we removed all include-lines from inactive "ifdef" sections, which means that the patches *will* break other configurations if applied as-is.

Thanks,
Michael
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] Who wants faster LLVM/Clang builds?

Liu, Yaxun (Sam) via cfe-dev
In reply to this post by Liu, Yaxun (Sam) via cfe-dev


On Mon, Dec 11, 2017 at 3:37 PM, Mikhail Zolotukhin via cfe-dev <[hidden email]> wrote:
Hi Kim,

On Dec 10, 2017, at 7:39 AM, Kim Gräsman <[hidden email]> wrote:

Hi Michael,

On Thu, Dec 7, 2017 at 3:16 AM, Michael Zolotukhin
<[hidden email]> wrote:

Nice to IWYU developers here:) I wonder how hard it would be to run IWYU on
LLVM/Clang (or, if it’s supposed to work, I wonder what I did wrong).

There are known problems with running IWYU over LLVM/Clang -- Zachary
Turner made an attempt a while back to get it up and running. Since
the LLVM tree uses all sorts of modern and moderately complex
patterns, we're struggling to keep up.
I see.

If we also can tweak it a bit to make it choose more human-like (~more
conservative) decisions, we would be able to just apply what it suggests!

Different humans appear to have different preferences :)
True, what I meant hear is to make the changes more conservative: e.g. if we can replace
#include "MyClass.h"
with 
class MyClass;
then this change is probably desirable in every way: it documents the code better, it decreases coupling, it improves compile time.

This is not a transform which is clearly "desirable in every way", because it _increases_ coupling between the user of the class and the implementation. The owner of the class can't add optional template arguments, change a class into a typedef, change the namespace, or other such refactorings. It also introduces the possibility of code which changes behavior depending on whether the full or forward decl are available (which, then, may be an ODR-violation).

Effectively the same reasons why the standard forbids users from forward-declaring std:: names apply to doing so to user-defined names.

https://google.github.io/styleguide/cppguide.html#Forward_Declarations lists some of the issues, and a recommendation not to do so.

Of course you do have the upside is that it can improve compile time. Which is certainly of value, and perhaps that's a worthwhile trade-off when the implementation and forward-declare are both within a single project and thus easy to coordinate. But, it's not by any means a _pure_ win.



_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] Who wants faster LLVM/Clang builds?

Liu, Yaxun (Sam) via cfe-dev

On Dec 12, 2017, at 12:57 PM, James Y Knight <[hidden email]> wrote:



On Mon, Dec 11, 2017 at 3:37 PM, Mikhail Zolotukhin via cfe-dev <[hidden email]> wrote:
Hi Kim,

On Dec 10, 2017, at 7:39 AM, Kim Gräsman <[hidden email]> wrote:

Hi Michael,

On Thu, Dec 7, 2017 at 3:16 AM, Michael Zolotukhin
<[hidden email]> wrote:

Nice to IWYU developers here:) I wonder how hard it would be to run IWYU on
LLVM/Clang (or, if it’s supposed to work, I wonder what I did wrong).

There are known problems with running IWYU over LLVM/Clang -- Zachary
Turner made an attempt a while back to get it up and running. Since
the LLVM tree uses all sorts of modern and moderately complex
patterns, we're struggling to keep up.
I see.

If we also can tweak it a bit to make it choose more human-like (~more
conservative) decisions, we would be able to just apply what it suggests!

Different humans appear to have different preferences :)
True, what I meant hear is to make the changes more conservative: e.g. if we can replace
#include "MyClass.h"
with 
class MyClass;
then this change is probably desirable in every way: it documents the code better, it decreases coupling, it improves compile time.

This is not a transform which is clearly "desirable in every way", because it _increases_ coupling between the user of the class and the implementation. The owner of the class can't add optional template arguments, change a class into a typedef, change the namespace, or other such refactorings. It also introduces the possibility of code which changes behavior depending on whether the full or forward decl are available (which, then, may be an ODR-violation).

Effectively the same reasons why the standard forbids users from forward-declaring std:: names apply to doing so to user-defined names.

https://google.github.io/styleguide/cppguide.html#Forward_Declarations lists some of the issues, and a recommendation not to do so.

Of course you do have the upside is that it can improve compile time. Which is certainly of value, and perhaps that's a worthwhile trade-off when the implementation and forward-declare are both within a single project and thus easy to coordinate. But, it's not by any means a _pure_ win.
That's correct. I was speaking about the LLVM codebase though (I should've stated that clearer), and in LLVM I don't remember many occasions of refactorings you mentioned. For LLVM forward declaration is recommended by the style guide:

Thanks,
Michael





_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] Who wants faster LLVM/Clang builds?

Liu, Yaxun (Sam) via cfe-dev
I'm a little late to the party, but one observation that I haven't seen mentioned is that simply removing #includes and testing that the program compiles is not guaranteed to be a correct transformation.  Imagine, for example, that a header file provides an overload of a function that is a better match than one found elsewhere.  It will compile either way, but without the #include, you will call a different function.  Note that I'm not encouraging this kind of pattern, but the point is just to illustrate that this is not necessarily a correct transformation.

Another example is in the use of a macro definitions.  Suppose a header defines certain macros, and other code uses ifdefs to test for the definition of those macros.  If it is defined, one code path is taken, if it is not defined, another code path (which will compile but do the wrong thing) will be taken.  Does the tool handle this?


On Tue, Dec 12, 2017 at 1:38 PM Mikhail Zolotukhin via cfe-dev <[hidden email]> wrote:
On Dec 12, 2017, at 12:57 PM, James Y Knight <[hidden email]> wrote:



On Mon, Dec 11, 2017 at 3:37 PM, Mikhail Zolotukhin via cfe-dev <[hidden email]> wrote:
Hi Kim,

On Dec 10, 2017, at 7:39 AM, Kim Gräsman <[hidden email]> wrote:

Hi Michael,

On Thu, Dec 7, 2017 at 3:16 AM, Michael Zolotukhin
<[hidden email]> wrote:

Nice to IWYU developers here:) I wonder how hard it would be to run IWYU on
LLVM/Clang (or, if it’s supposed to work, I wonder what I did wrong).

There are known problems with running IWYU over LLVM/Clang -- Zachary
Turner made an attempt a while back to get it up and running. Since
the LLVM tree uses all sorts of modern and moderately complex
patterns, we're struggling to keep up.
I see.

If we also can tweak it a bit to make it choose more human-like (~more
conservative) decisions, we would be able to just apply what it suggests!

Different humans appear to have different preferences :)
True, what I meant hear is to make the changes more conservative: e.g. if we can replace
#include "MyClass.h"
with 
class MyClass;
then this change is probably desirable in every way: it documents the code better, it decreases coupling, it improves compile time.

This is not a transform which is clearly "desirable in every way", because it _increases_ coupling between the user of the class and the implementation. The owner of the class can't add optional template arguments, change a class into a typedef, change the namespace, or other such refactorings. It also introduces the possibility of code which changes behavior depending on whether the full or forward decl are available (which, then, may be an ODR-violation).

Effectively the same reasons why the standard forbids users from forward-declaring std:: names apply to doing so to user-defined names.

https://google.github.io/styleguide/cppguide.html#Forward_Declarations lists some of the issues, and a recommendation not to do so.

Of course you do have the upside is that it can improve compile time. Which is certainly of value, and perhaps that's a worthwhile trade-off when the implementation and forward-declare are both within a single project and thus easy to coordinate. But, it's not by any means a _pure_ win.
That's correct. I was speaking about the LLVM codebase though (I should've stated that clearer), and in LLVM I don't remember many occasions of refactorings you mentioned. For LLVM forward declaration is recommended by the style guide:

Thanks,
Michael




_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] Who wants faster LLVM/Clang builds?

Liu, Yaxun (Sam) via cfe-dev
Yes, that’s a valid concern, and that’s why I wanted to review all the changes before committing them (right now I’m going through all of them exactly to catch cases like you mention). It is actually happening sometimes, a good example is include of “config.h” - usually everything compiles even without it, but that’s not what we want.

My plan is to go through all the changes, remove undesired (like the ones you mentioned) and commit remaining changes in chunks (e.g. a folder at a time). Then people can do post-commit review, while the changes will get more exposure for testing. I hope it will go smoothly, but the time will show.

If there are any objections to my plan, I can put the patches for a pre-commit review first, though I consider them pretty safe.

Michael

On Dec 12, 2017, at 5:11 PM, Zachary Turner <[hidden email]> wrote:

I'm a little late to the party, but one observation that I haven't seen mentioned is that simply removing #includes and testing that the program compiles is not guaranteed to be a correct transformation.  Imagine, for example, that a header file provides an overload of a function that is a better match than one found elsewhere.  It will compile either way, but without the #include, you will call a different function.  Note that I'm not encouraging this kind of pattern, but the point is just to illustrate that this is not necessarily a correct transformation.

Another example is in the use of a macro definitions.  Suppose a header defines certain macros, and other code uses ifdefs to test for the definition of those macros.  If it is defined, one code path is taken, if it is not defined, another code path (which will compile but do the wrong thing) will be taken.  Does the tool handle this?


On Tue, Dec 12, 2017 at 1:38 PM Mikhail Zolotukhin via cfe-dev <[hidden email]> wrote:
On Dec 12, 2017, at 12:57 PM, James Y Knight <[hidden email]> wrote:



On Mon, Dec 11, 2017 at 3:37 PM, Mikhail Zolotukhin via cfe-dev <[hidden email]> wrote:
Hi Kim,

On Dec 10, 2017, at 7:39 AM, Kim Gräsman <[hidden email]> wrote:

Hi Michael,

On Thu, Dec 7, 2017 at 3:16 AM, Michael Zolotukhin
<[hidden email]> wrote:

Nice to IWYU developers here:) I wonder how hard it would be to run IWYU on
LLVM/Clang (or, if it’s supposed to work, I wonder what I did wrong).

There are known problems with running IWYU over LLVM/Clang -- Zachary
Turner made an attempt a while back to get it up and running. Since
the LLVM tree uses all sorts of modern and moderately complex
patterns, we're struggling to keep up.
I see.

If we also can tweak it a bit to make it choose more human-like (~more
conservative) decisions, we would be able to just apply what it suggests!

Different humans appear to have different preferences :)
True, what I meant hear is to make the changes more conservative: e.g. if we can replace
#include "MyClass.h"
with 
class MyClass;
then this change is probably desirable in every way: it documents the code better, it decreases coupling, it improves compile time.

This is not a transform which is clearly "desirable in every way", because it _increases_ coupling between the user of the class and the implementation. The owner of the class can't add optional template arguments, change a class into a typedef, change the namespace, or other such refactorings. It also introduces the possibility of code which changes behavior depending on whether the full or forward decl are available (which, then, may be an ODR-violation).

Effectively the same reasons why the standard forbids users from forward-declaring std:: names apply to doing so to user-defined names.

https://google.github.io/styleguide/cppguide.html#Forward_Declarations lists some of the issues, and a recommendation not to do so.

Of course you do have the upside is that it can improve compile time. Which is certainly of value, and perhaps that's a worthwhile trade-off when the implementation and forward-declare are both within a single project and thus easy to coordinate. But, it's not by any means a _pure_ win.
That's correct. I was speaking about the LLVM codebase though (I should've stated that clearer), and in LLVM I don't remember many occasions of refactorings you mentioned. For LLVM forward declaration is recommended by the style guide:

Thanks,
Michael




_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] Who wants faster LLVM/Clang builds?

Liu, Yaxun (Sam) via cfe-dev
Can you just compare CRC32s of the binaries before and after your changes? And then just don’t commit unless they’re the same. You’ll have to do this with debug info disabled, as white space changes will affect debug info.

That said, I’m not lgtm’ing this, just saying that if the community does decide this is a good plan, let’s make sure generated code is identical
On Tue, Dec 12, 2017 at 5:42 PM Michael Zolotukhin <[hidden email]> wrote:
Yes, that’s a valid concern, and that’s why I wanted to review all the changes before committing them (right now I’m going through all of them exactly to catch cases like you mention). It is actually happening sometimes, a good example is include of “config.h” - usually everything compiles even without it, but that’s not what we want.

My plan is to go through all the changes, remove undesired (like the ones you mentioned) and commit remaining changes in chunks (e.g. a folder at a time). Then people can do post-commit review, while the changes will get more exposure for testing. I hope it will go smoothly, but the time will show.

If there are any objections to my plan, I can put the patches for a pre-commit review first, though I consider them pretty safe.

Michael

On Dec 12, 2017, at 5:11 PM, Zachary Turner <[hidden email]> wrote:

I'm a little late to the party, but one observation that I haven't seen mentioned is that simply removing #includes and testing that the program compiles is not guaranteed to be a correct transformation.  Imagine, for example, that a header file provides an overload of a function that is a better match than one found elsewhere.  It will compile either way, but without the #include, you will call a different function.  Note that I'm not encouraging this kind of pattern, but the point is just to illustrate that this is not necessarily a correct transformation.

Another example is in the use of a macro definitions.  Suppose a header defines certain macros, and other code uses ifdefs to test for the definition of those macros.  If it is defined, one code path is taken, if it is not defined, another code path (which will compile but do the wrong thing) will be taken.  Does the tool handle this?


On Tue, Dec 12, 2017 at 1:38 PM Mikhail Zolotukhin via cfe-dev <[hidden email]> wrote:
On Dec 12, 2017, at 12:57 PM, James Y Knight <[hidden email]> wrote:



On Mon, Dec 11, 2017 at 3:37 PM, Mikhail Zolotukhin via cfe-dev <[hidden email]> wrote:
Hi Kim,

On Dec 10, 2017, at 7:39 AM, Kim Gräsman <[hidden email]> wrote:

Hi Michael,

On Thu, Dec 7, 2017 at 3:16 AM, Michael Zolotukhin
<[hidden email]> wrote:

Nice to IWYU developers here:) I wonder how hard it would be to run IWYU on
LLVM/Clang (or, if it’s supposed to work, I wonder what I did wrong).

There are known problems with running IWYU over LLVM/Clang -- Zachary
Turner made an attempt a while back to get it up and running. Since
the LLVM tree uses all sorts of modern and moderately complex
patterns, we're struggling to keep up.
I see.

If we also can tweak it a bit to make it choose more human-like (~more
conservative) decisions, we would be able to just apply what it suggests!

Different humans appear to have different preferences :)
True, what I meant hear is to make the changes more conservative: e.g. if we can replace
#include "MyClass.h"
with 
class MyClass;
then this change is probably desirable in every way: it documents the code better, it decreases coupling, it improves compile time.

This is not a transform which is clearly "desirable in every way", because it _increases_ coupling between the user of the class and the implementation. The owner of the class can't add optional template arguments, change a class into a typedef, change the namespace, or other such refactorings. It also introduces the possibility of code which changes behavior depending on whether the full or forward decl are available (which, then, may be an ODR-violation).

Effectively the same reasons why the standard forbids users from forward-declaring std:: names apply to doing so to user-defined names.

https://google.github.io/styleguide/cppguide.html#Forward_Declarations lists some of the issues, and a recommendation not to do so.

Of course you do have the upside is that it can improve compile time. Which is certainly of value, and perhaps that's a worthwhile trade-off when the implementation and forward-declare are both within a single project and thus easy to coordinate. But, it's not by any means a _pure_ win.
That's correct. I was speaking about the LLVM codebase though (I should've stated that clearer), and in LLVM I don't remember many occasions of refactorings you mentioned. For LLVM forward declaration is recommended by the style guide:

Thanks,
Michael




_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] Who wants faster LLVM/Clang builds?

Liu, Yaxun (Sam) via cfe-dev

On Dec 12, 2017, at 5:47 PM, Zachary Turner <[hidden email]> wrote:

Can you just compare CRC32s of the binaries before and after your changes? And then just don’t commit unless they’re the same. You’ll have to do this with debug info disabled, as white space changes will affect debug info.
That's a good idea, I'll do that!

Thanks,
Michael

That said, I’m not lgtm’ing this, just saying that if the community does decide this is a good plan, let’s make sure generated code is identical
On Tue, Dec 12, 2017 at 5:42 PM Michael Zolotukhin <[hidden email]> wrote:
Yes, that’s a valid concern, and that’s why I wanted to review all the changes before committing them (right now I’m going through all of them exactly to catch cases like you mention). It is actually happening sometimes, a good example is include of “config.h” - usually everything compiles even without it, but that’s not what we want.

My plan is to go through all the changes, remove undesired (like the ones you mentioned) and commit remaining changes in chunks (e.g. a folder at a time). Then people can do post-commit review, while the changes will get more exposure for testing. I hope it will go smoothly, but the time will show.

If there are any objections to my plan, I can put the patches for a pre-commit review first, though I consider them pretty safe.

Michael

On Dec 12, 2017, at 5:11 PM, Zachary Turner <[hidden email]> wrote:

I'm a little late to the party, but one observation that I haven't seen mentioned is that simply removing #includes and testing that the program compiles is not guaranteed to be a correct transformation.  Imagine, for example, that a header file provides an overload of a function that is a better match than one found elsewhere.  It will compile either way, but without the #include, you will call a different function.  Note that I'm not encouraging this kind of pattern, but the point is just to illustrate that this is not necessarily a correct transformation.

Another example is in the use of a macro definitions.  Suppose a header defines certain macros, and other code uses ifdefs to test for the definition of those macros.  If it is defined, one code path is taken, if it is not defined, another code path (which will compile but do the wrong thing) will be taken.  Does the tool handle this?


On Tue, Dec 12, 2017 at 1:38 PM Mikhail Zolotukhin via cfe-dev <[hidden email]> wrote:
On Dec 12, 2017, at 12:57 PM, James Y Knight <[hidden email]> wrote:



On Mon, Dec 11, 2017 at 3:37 PM, Mikhail Zolotukhin via cfe-dev <[hidden email]> wrote:
Hi Kim,

On Dec 10, 2017, at 7:39 AM, Kim Gräsman <[hidden email]> wrote:

Hi Michael,

On Thu, Dec 7, 2017 at 3:16 AM, Michael Zolotukhin
<[hidden email]> wrote:

Nice to IWYU developers here:) I wonder how hard it would be to run IWYU on
LLVM/Clang (or, if it’s supposed to work, I wonder what I did wrong).

There are known problems with running IWYU over LLVM/Clang -- Zachary
Turner made an attempt a while back to get it up and running. Since
the LLVM tree uses all sorts of modern and moderately complex
patterns, we're struggling to keep up.
I see.

If we also can tweak it a bit to make it choose more human-like (~more
conservative) decisions, we would be able to just apply what it suggests!

Different humans appear to have different preferences :)
True, what I meant hear is to make the changes more conservative: e.g. if we can replace
#include "MyClass.h"
with 
class MyClass;
then this change is probably desirable in every way: it documents the code better, it decreases coupling, it improves compile time.

This is not a transform which is clearly "desirable in every way", because it _increases_ coupling between the user of the class and the implementation. The owner of the class can't add optional template arguments, change a class into a typedef, change the namespace, or other such refactorings. It also introduces the possibility of code which changes behavior depending on whether the full or forward decl are available (which, then, may be an ODR-violation).

Effectively the same reasons why the standard forbids users from forward-declaring std:: names apply to doing so to user-defined names.

https://google.github.io/styleguide/cppguide.html#Forward_Declarations lists some of the issues, and a recommendation not to do so.

Of course you do have the upside is that it can improve compile time. Which is certainly of value, and perhaps that's a worthwhile trade-off when the implementation and forward-declare are both within a single project and thus easy to coordinate. But, it's not by any means a _pure_ win.
That's correct. I was speaking about the LLVM codebase though (I should've stated that clearer), and in LLVM I don't remember many occasions of refactorings you mentioned. For LLVM forward declaration is recommended by the style guide:

Thanks,
Michael




_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev



_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] Who wants faster LLVM/Clang builds?

Liu, Yaxun (Sam) via cfe-dev
In reply to this post by Liu, Yaxun (Sam) via cfe-dev

On Dec 12, 2017, at 12:08 PM, Michael Zolotukhin via llvm-dev <[hidden email]> wrote:


On Dec 11, 2017, at 7:14 PM, NAKAMURA Takumi <[hidden email]> wrote:

By the way, have you tried -fmodules ?
No, I haven’t. Do you have any specific experiment in mind that you wanted to try?

When you configure clang with -DLLVM_ENABLE_MODULES=1 (like we do on the stage2 bots on green dragon) I suspect the win to be much smaller, since clang modules will eliminate most of the parsing overhead.

-- adrian


Thanks,
Michael

Takumi

On Wed, Dec 6, 2017 at 1:41 PM Mikhail Zolotukhin via llvm-dev <[hidden email]> wrote:
Hi,

Recently I've done some experiments on the LLVM/Clang code and discovered that many of our source files often include unnecessary header files. I wrote a simple tool that eliminates redundant includes and estimates benefits of doing it, and the results were quite nice: for some files we were able to save 90% of compile time! I think we want to apply some of the cleanups I found, but I'm not sure how to better do it: the total patches are 8k lines of code for LLVM and 3k lines of code for clang (I'll attach them for reference). My suggestion would be that people take a look at the list of changed files and pick the changes for the piece of code they are working on if the changes look sane (the changes do need some checking before committing). Does it sound like a good idea? I'd appreciate any feedback on what can we do here.

The list of files for which removing redundant headers improved compile time (the numbers are compile time in seconds for a Debug build):

LLVM top 10
Filename Old New Delta
lib/CodeGen/GlobalISel/GlobalISel.cpp 0.26 0.02 -91.9%
lib/MC/MCLabel.cpp 0.19 0.02 -88.2%
tools/llvm-readobj/ObjDumper.cpp 0.43 0.10 -76.5%
lib/MC/MCWinEH.cpp 0.51 0.13 -74.3%
lib/Transforms/Vectorize/Vectorize.cpp 0.72 0.29 -59.7%
tools/llvm-diff/DiffLog.cpp 0.58 0.26 -54.6%
lib/Target/ARM/MCTargetDesc/ARMMachORelocationInfo.cpp 0.46 0.26 -44.1%
lib/DebugInfo/DWARF/DWARFExpression.cpp 0.68 0.38 -43.3%
lib/LTO/LTOModule.cpp 2.25 1.33 -41.1%
lib/Target/TargetMachine.cpp 1.76 1.10 -37.8%

Full list:


Clang top 10
Filename Old New Delta
tools/libclang/CXString.cpp 1.70 0.25 -85.2%
lib/Tooling/CommonOptionsParser.cpp 1.69 0.55 -67.3%
lib/AST/StmtViz.cpp 1.02 0.44 -57.4%
tools/driver/cc1_main.cpp 2.26 0.97 -57.1%
unittests/CodeGen/BufferSourceTest.cpp 3.08 1.83 -40.6%
lib/CodeGen/CGLoopInfo.cpp 1.91 1.34 -29.9%
unittests/Tooling/RefactoringActionRulesTest.cpp 2.46 1.79 -27.0%
unittests/CodeGen/CodeGenExternalTest.cpp 3.43 2.52 -26.5%
tools/libclang/CXStoredDiagnostic.cpp 1.67 1.26 -24.8%
tools/clang-func-mapping/ClangFnMapGen.cpp 2.48 1.89 -23.8%

Full list:

The corresponding patches (careful, they are big):

Methodology
My tool took the compile_commands.json from LLVM build and  iterated over files trying to remove redundant headers. To find which header files could be removed it scanned the file for "#include" lines and tried to remove them one by one (checking if the file still compiles after the removal). When there were no more include lines to remove, we verified the change with ninja+ninja check. After it we compared preprocessed file size before and after the change hoping to see that it dropped and then checked the compile time impact.
NB: As a side effect of this approach we removed all include-lines from inactive "ifdef" sections, which means that the patches *will* break other configurations if applied as-is.

Thanks,
Michael
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] Who wants faster LLVM/Clang builds?

Liu, Yaxun (Sam) via cfe-dev
In reply to this post by Liu, Yaxun (Sam) via cfe-dev



On 12/12/2017 01:38 PM, Mikhail Zolotukhin via llvm-dev wrote:

On Dec 12, 2017, at 12:57 PM, James Y Knight <[hidden email]> wrote:



On Mon, Dec 11, 2017 at 3:37 PM, Mikhail Zolotukhin via cfe-dev <[hidden email]> wrote:
Hi Kim,

On Dec 10, 2017, at 7:39 AM, Kim Gräsman <[hidden email]> wrote:

Hi Michael,

On Thu, Dec 7, 2017 at 3:16 AM, Michael Zolotukhin
<[hidden email]> wrote:

Nice to IWYU developers here:) I wonder how hard it would be to run IWYU on
LLVM/Clang (or, if it’s supposed to work, I wonder what I did wrong).

There are known problems with running IWYU over LLVM/Clang -- Zachary
Turner made an attempt a while back to get it up and running. Since
the LLVM tree uses all sorts of modern and moderately complex
patterns, we're struggling to keep up.
I see.

If we also can tweak it a bit to make it choose more human-like (~more
conservative) decisions, we would be able to just apply what it suggests!

Different humans appear to have different preferences :)
True, what I meant hear is to make the changes more conservative: e.g. if we can replace
#include "MyClass.h"
with 
class MyClass;
then this change is probably desirable in every way: it documents the code better, it decreases coupling, it improves compile time.

This is not a transform which is clearly "desirable in every way", because it _increases_ coupling between the user of the class and the implementation. The owner of the class can't add optional template arguments, change a class into a typedef, change the namespace, or other such refactorings. It also introduces the possibility of code which changes behavior depending on whether the full or forward decl are available (which, then, may be an ODR-violation).

Effectively the same reasons why the standard forbids users from forward-declaring std:: names apply to doing so to user-defined names.

https://google.github.io/styleguide/cppguide.html#Forward_Declarations lists some of the issues, and a recommendation not to do so.

Of course you do have the upside is that it can improve compile time. Which is certainly of value, and perhaps that's a worthwhile trade-off when the implementation and forward-declare are both within a single project and thus easy to coordinate. But, it's not by any means a _pure_ win.
That's correct. I was speaking about the LLVM codebase though (I should've stated that clearer), and in LLVM I don't remember many occasions of refactorings you mentioned. For LLVM forward declaration is recommended by the style guide:

As an aside, there is a standard idiom used in some code bases which might be applicable.

For each header "Header.h" which contains a class MyClass, you introduce a header called "Header.FwdDecls.h" which contains forward decls for all of the classes declared in the header.  There are also variants which declare "MyClass.FwdDecl.h".  Both of these schemes have the advantage of only putting the forward decl in one place while reducing the transitive include set greatly.  I've seen environments where these forward decl headers are automatically generated by the build system.

Philip

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] Who wants faster LLVM/Clang builds?

Liu, Yaxun (Sam) via cfe-dev
In reply to this post by Liu, Yaxun (Sam) via cfe-dev


2017-12-09 12:54 GMT-08:00 Chris Lattner via llvm-dev <[hidden email]>:


On Dec 8, 2017, at 5:01 PM, Mikhail Zolotukhin via llvm-dev <[hidden email]> wrote:

Hi,

I tweaked my scripts to avoid removing includes when it doesn't give any significant benefits, which made the patches significantly smaller. This time the patches should not try to remove includes of header files, which are transitively included from other included header files. The gains mostly remained the same (plus/minus noise), the tables are in the end of the email. I also included size of preprocessed files (measured in 1000 lines of code).

Just my opinion, but it is *good* to remove redundant header file includes, even if it doesn’t speed up the build.


Do you consider transitive includes as redundant? Why should we remove these?
One drawback I see with removing these is that if I need `class Foo` but I get it through `bar.h` which includes `foo.h`, it means that removing the include to `foo.h` in `bar.h` would break the client of bar that relied on this transitive includes.
That makes refactoring of a components harder since it may break some of the clients of the components for no other reason than this transitive include.
I suspect this is why IWYU would instead *add* these includes when they're missing.

-- 
Mehdi


 


I suggest that from here we go as follows: maintainers/interested people take a look at files related to their components and pick the parts of the patches that they consider correct. I'll also start with some files next week if there is no objections to it. Does it sound reasonable?

The most impacted files (the numbers are for Debug build):

LLVM top 10
Filename TimeOld TimeNew Delta SizeOld SizeNew SizeDelta
lib/CodeGen/GlobalISel/GlobalISel.cpp 0.26 0.02 -91.6% 35.0 0.3 -99.0%
lib/MC/MCLabel.cpp 0.20 0.02 -88.0% 25.5 0.0 -99.9%
tools/llvm-readobj/ObjDumper.cpp 0.44 0.10 -76.8% 41.0 11.8 -71.1%
lib/MC/MCWinEH.cpp 0.49 0.15 -70.4% 43.9 21.4 -51.2%
lib/Transforms/Vectorize/Vectorize.cpp 0.73 0.29 -60.7% 52.7 35.5 -32.6%
tools/llvm-diff/DiffLog.cpp 0.59 0.27 -53.8% 50.7 33.7 -33.7%
lib/Target/ARM/MCTargetDesc/ARMMachORelocationInfo.cpp 0.47 0.25 -46.7% 46.7 37.9 -18.9%
lib/DebugInfo/DWARF/DWARFExpression.cpp 0.67 0.38 -43.5% 47.4 34.8 -26.7%
lib/Transforms/Utils/ASanStackFrameLayout.cpp 0.52 0.32 -38.8% 41.7 33.7 -19.2%
tools/llvm-dwp/llvm-dwp.cpp 2.48 1.53 -38.3% 92.5 55.2 -40.3%

Full list:
<llvm.txt>

Clang top 10
Filename TimeOld TimeNew Delta SizeOld SizeNew SizeDelta
tools/libclang/CXString.cpp 2.25 0.30 -86.9% 85.1 31.7 -62.7%
lib/Tooling/CommonOptionsParser.cpp 2.33 0.68 -70.6% 87.1 34.4 -60.5%
lib/AST/StmtViz.cpp 1.28 0.48 -62.5% 62.4 39.0 -37.5%
tools/driver/cc1_main.cpp 3.05 1.29 -57.8% 93.7 58.6 -37.4%
unittests/CodeGen/BufferSourceTest.cpp 4.12 2.62 -36.5% 145.8 103.9 -28.7%
lib/CodeGen/CGLoopInfo.cpp 2.43 1.68 -30.7% 101.6 82.5 -18.8%
unittests/CodeGen/CodeGenExternalTest.cpp 4.50 3.21 -28.6% 155.5 125.1 -19.5%
lib/Driver/ToolChains/Contiki.cpp 0.53 0.38 -28.1% 42.4 38.0 -10.5%
unittests/Tooling/RefactoringActionRulesTest.cpp 3.22 2.34 -27.5% 108.3 90.0 -16.9%
lib/Serialization/GeneratePCH.cpp 2.38 1.78 -25.1% 83.8 71.1 -15.1%

Full list:
<clang.txt>


The updated patches:
<llvm_redundant_headers.patch>
<clang_redundant_headers.patch>

Thanks,
Michael

On Dec 8, 2017, at 9:20 AM, Quentin Colombet via llvm-dev <[hidden email]> wrote:



On Dec 6, 2017, at 1:17 PM, Matthias Braun via llvm-dev <[hidden email]> wrote:

- We do indeed have a lot of unnecessary includes around in llvm (or pretty much any other C++ project for that matter).
- I want faster builds.
- The only way to reliably fight this is indeed automatic tools.
- Having the right amount of includes also has documentation value and ideally let's you understand the structure of your project.
- However relying on transitive includes works contrary to the last "undestanding/documentation" point.
- (And as stated earlier to have things really clean we want `class XXX;` instead of `#include "XXX.h"` wherever possible. And if you are serious about that we also often have to reduce the amount of include code in the headers so we can move the `#include "XXX.h"` from the header to the implementation.

For me personally I think the documentation/understandability we loose when relying on transitive includes weights heavier than my desire to get a faster build…

+1

Q.


- Matthias

On Dec 6, 2017, at 12:28 PM, serge guelton via llvm-dev <[hidden email]> wrote:

On Wed, Dec 06, 2017 at 11:38:54AM -0800, Michael Zolotukhin via llvm-dev wrote:

On Dec 6, 2017, at 9:00 AM, mats petersson via cfe-dev <[hidden email]> wrote:

In my experience, a lot of time is spent on optimizing the code (assuming it's not a "-O0" build).
The numbers were actually for the debug build (-O0 -g), so for Release build they would be different (presumably lower).
Also redundant includes are largely fixed by header guards, and I believe Clang [and gcc as well as MS Compilers, and probably most others too] have an include guards-cache that determines that "we've already included foo.h, and it has include guards around the whole actual content of the file, so we can just skip it”.
By redundant here I meant that we included a file, but we didn’t use any of its content (rather than we included the same file twice).

So I'm slightly dubious as to this being an efficient way of significantly reducing the total compilation time for the overall project - even if there are SOME cases where there is a significant improvement in a single file. The total time for a clean build [in wall-clock-time, not CPU-time] should be measured, making sure that there is enough memory. Doing a run of, say, five complete builds of the same thing [with suitable "clean" between to redo the whole build], take away the worst and the best, and perhaps also "modify one of the more common header files" (llvm/IR/Type.h for example) and build again.
On full builds the benefit is not big (around 1%, but the noise is high), but: 1) if we only take gains more than, say, 5%, we’ll probably never see any, 2) I aim at changes that make the code strictly better (modulo David’s point about disk cache). If any change is questionable from maintenance or whatever other point of view, I’m all for dropping it.

my 2¢

+1 for point 2). Even leaving aside the speed gain, removing unused
includes file just looks like good coding practice to me.

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev


_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] Who wants faster LLVM/Clang builds?

Liu, Yaxun (Sam) via cfe-dev


On Fri, Dec 15, 2017 at 10:58 AM, Mehdi AMINI via llvm-dev <[hidden email]> wrote:


2017-12-09 12:54 GMT-08:00 Chris Lattner via llvm-dev <[hidden email]>:


On Dec 8, 2017, at 5:01 PM, Mikhail Zolotukhin via llvm-dev <[hidden email]> wrote:

Hi,

I tweaked my scripts to avoid removing includes when it doesn't give any significant benefits, which made the patches significantly smaller. This time the patches should not try to remove includes of header files, which are transitively included from other included header files. The gains mostly remained the same (plus/minus noise), the tables are in the end of the email. I also included size of preprocessed files (measured in 1000 lines of code).

Just my opinion, but it is *good* to remove redundant header file includes, even if it doesn’t speed up the build.


Do you consider transitive includes as redundant? Why should we remove these?
One drawback I see with removing these is that if I need `class Foo` but I get it through `bar.h` which includes `foo.h`, it means that removing the include to `foo.h` in `bar.h` would break the client of bar that relied on this transitive includes.
That makes refactoring of a components harder since it may break some of the clients of the components for no other reason than this transitive include.
I suspect this is why IWYU would instead *add* these includes when they're missing.

+1, relying on transitive includes is fragile.

In our fork, a random #include of raw_ostream.h snuck into APInt.h at one point. When I went to remove it (perhaps a few months after it was introduced), there were already a handful of files that were relying on it that were broken by my "remove debugging code that snuck in" change.

-- Sean Silva
 

-- 
Mehdi


 


I suggest that from here we go as follows: maintainers/interested people take a look at files related to their components and pick the parts of the patches that they consider correct. I'll also start with some files next week if there is no objections to it. Does it sound reasonable?

The most impacted files (the numbers are for Debug build):

LLVM top 10
Filename TimeOld TimeNew Delta SizeOld SizeNew SizeDelta
lib/CodeGen/GlobalISel/GlobalISel.cpp 0.26 0.02 -91.6% 35.0 0.3 -99.0%
lib/MC/MCLabel.cpp 0.20 0.02 -88.0% 25.5 0.0 -99.9%
tools/llvm-readobj/ObjDumper.cpp 0.44 0.10 -76.8% 41.0 11.8 -71.1%
lib/MC/MCWinEH.cpp 0.49 0.15 -70.4% 43.9 21.4 -51.2%
lib/Transforms/Vectorize/Vectorize.cpp 0.73 0.29 -60.7% 52.7 35.5 -32.6%
tools/llvm-diff/DiffLog.cpp 0.59 0.27 -53.8% 50.7 33.7 -33.7%
lib/Target/ARM/MCTargetDesc/ARMMachORelocationInfo.cpp 0.47 0.25 -46.7% 46.7 37.9 -18.9%
lib/DebugInfo/DWARF/DWARFExpression.cpp 0.67 0.38 -43.5% 47.4 34.8 -26.7%
lib/Transforms/Utils/ASanStackFrameLayout.cpp 0.52 0.32 -38.8% 41.7 33.7 -19.2%
tools/llvm-dwp/llvm-dwp.cpp 2.48 1.53 -38.3% 92.5 55.2 -40.3%

Full list:
<llvm.txt>

Clang top 10
Filename TimeOld TimeNew Delta SizeOld SizeNew SizeDelta
tools/libclang/CXString.cpp 2.25 0.30 -86.9% 85.1 31.7 -62.7%
lib/Tooling/CommonOptionsParser.cpp 2.33 0.68 -70.6% 87.1 34.4 -60.5%
lib/AST/StmtViz.cpp 1.28 0.48 -62.5% 62.4 39.0 -37.5%
tools/driver/cc1_main.cpp 3.05 1.29 -57.8% 93.7 58.6 -37.4%
unittests/CodeGen/BufferSourceTest.cpp 4.12 2.62 -36.5% 145.8 103.9 -28.7%
lib/CodeGen/CGLoopInfo.cpp 2.43 1.68 -30.7% 101.6 82.5 -18.8%
unittests/CodeGen/CodeGenExternalTest.cpp 4.50 3.21 -28.6% 155.5 125.1 -19.5%
lib/Driver/ToolChains/Contiki.cpp 0.53 0.38 -28.1% 42.4 38.0 -10.5%
unittests/Tooling/RefactoringActionRulesTest.cpp 3.22 2.34 -27.5% 108.3 90.0 -16.9%
lib/Serialization/GeneratePCH.cpp 2.38 1.78 -25.1% 83.8 71.1 -15.1%

Full list:
<clang.txt>


The updated patches:
<llvm_redundant_headers.patch>
<clang_redundant_headers.patch>

Thanks,
Michael

On Dec 8, 2017, at 9:20 AM, Quentin Colombet via llvm-dev <[hidden email]> wrote:



On Dec 6, 2017, at 1:17 PM, Matthias Braun via llvm-dev <[hidden email]> wrote:

- We do indeed have a lot of unnecessary includes around in llvm (or pretty much any other C++ project for that matter).
- I want faster builds.
- The only way to reliably fight this is indeed automatic tools.
- Having the right amount of includes also has documentation value and ideally let's you understand the structure of your project.
- However relying on transitive includes works contrary to the last "undestanding/documentation" point.
- (And as stated earlier to have things really clean we want `class XXX;` instead of `#include "XXX.h"` wherever possible. And if you are serious about that we also often have to reduce the amount of include code in the headers so we can move the `#include "XXX.h"` from the header to the implementation.

For me personally I think the documentation/understandability we loose when relying on transitive includes weights heavier than my desire to get a faster build…

+1

Q.


- Matthias

On Dec 6, 2017, at 12:28 PM, serge guelton via llvm-dev <[hidden email]> wrote:

On Wed, Dec 06, 2017 at 11:38:54AM -0800, Michael Zolotukhin via llvm-dev wrote:

On Dec 6, 2017, at 9:00 AM, mats petersson via cfe-dev <[hidden email]> wrote:

In my experience, a lot of time is spent on optimizing the code (assuming it's not a "-O0" build).
The numbers were actually for the debug build (-O0 -g), so for Release build they would be different (presumably lower).
Also redundant includes are largely fixed by header guards, and I believe Clang [and gcc as well as MS Compilers, and probably most others too] have an include guards-cache that determines that "we've already included foo.h, and it has include guards around the whole actual content of the file, so we can just skip it”.
By redundant here I meant that we included a file, but we didn’t use any of its content (rather than we included the same file twice).

So I'm slightly dubious as to this being an efficient way of significantly reducing the total compilation time for the overall project - even if there are SOME cases where there is a significant improvement in a single file. The total time for a clean build [in wall-clock-time, not CPU-time] should be measured, making sure that there is enough memory. Doing a run of, say, five complete builds of the same thing [with suitable "clean" between to redo the whole build], take away the worst and the best, and perhaps also "modify one of the more common header files" (llvm/IR/Type.h for example) and build again.
On full builds the benefit is not big (around 1%, but the noise is high), but: 1) if we only take gains more than, say, 5%, we’ll probably never see any, 2) I aim at changes that make the code strictly better (modulo David’s point about disk cache). If any change is questionable from maintenance or whatever other point of view, I’m all for dropping it.

my 2¢

+1 for point 2). Even leaving aside the speed gain, removing unused
includes file just looks like good coding practice to me.

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev


_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev


_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev



_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] Who wants faster LLVM/Clang builds?

Liu, Yaxun (Sam) via cfe-dev
In reply to this post by Liu, Yaxun (Sam) via cfe-dev


On Dec 15, 2017, at 10:58 AM, Mehdi AMINI <[hidden email]> wrote:



2017-12-09 12:54 GMT-08:00 Chris Lattner via llvm-dev <[hidden email]>:


On Dec 8, 2017, at 5:01 PM, Mikhail Zolotukhin via llvm-dev <[hidden email]> wrote:

Hi,

I tweaked my scripts to avoid removing includes when it doesn't give any significant benefits, which made the patches significantly smaller. This time the patches should not try to remove includes of header files, which are transitively included from other included header files. The gains mostly remained the same (plus/minus noise), the tables are in the end of the email. I also included size of preprocessed files (measured in 1000 lines of code).

Just my opinion, but it is *good* to remove redundant header file includes, even if it doesn’t speed up the build.


Do you consider transitive includes as redundant? Why should we remove these?
Hi Mehdi,

I don’t consider them redundant, and in the patches I’ve recently committed there shouldn’t be such removals - if you spot any, please let me know. All your arguments (as well as mentioned earlier by other people) are completely valid.

Thanks,
Michael
One drawback I see with removing these is that if I need `class Foo` but I get it through `bar.h` which includes `foo.h`, it means that removing the include to `foo.h` in `bar.h` would break the client of bar that relied on this transitive includes.
That makes refactoring of a components harder since it may break some of the clients of the components for no other reason than this transitive include.
I suspect this is why IWYU would instead *add* these includes when they're missing.

-- 
Mehdi


 


I suggest that from here we go as follows: maintainers/interested people take a look at files related to their components and pick the parts of the patches that they consider correct. I'll also start with some files next week if there is no objections to it. Does it sound reasonable?

The most impacted files (the numbers are for Debug build):

LLVM top 10
Filename TimeOld TimeNew Delta SizeOld SizeNew SizeDelta
lib/CodeGen/GlobalISel/GlobalISel.cpp 0.26 0.02 -91.6% 35.0 0.3 -99.0%
lib/MC/MCLabel.cpp 0.20 0.02 -88.0% 25.5 0.0 -99.9%
tools/llvm-readobj/ObjDumper.cpp 0.44 0.10 -76.8% 41.0 11.8 -71.1%
lib/MC/MCWinEH.cpp 0.49 0.15 -70.4% 43.9 21.4 -51.2%
lib/Transforms/Vectorize/Vectorize.cpp 0.73 0.29 -60.7% 52.7 35.5 -32.6%
tools/llvm-diff/DiffLog.cpp 0.59 0.27 -53.8% 50.7 33.7 -33.7%
lib/Target/ARM/MCTargetDesc/ARMMachORelocationInfo.cpp 0.47 0.25 -46.7% 46.7 37.9 -18.9%
lib/DebugInfo/DWARF/DWARFExpression.cpp 0.67 0.38 -43.5% 47.4 34.8 -26.7%
lib/Transforms/Utils/ASanStackFrameLayout.cpp 0.52 0.32 -38.8% 41.7 33.7 -19.2%
tools/llvm-dwp/llvm-dwp.cpp 2.48 1.53 -38.3% 92.5 55.2 -40.3%

Full list:
<llvm.txt>

Clang top 10
Filename TimeOld TimeNew Delta SizeOld SizeNew SizeDelta
tools/libclang/CXString.cpp 2.25 0.30 -86.9% 85.1 31.7 -62.7%
lib/Tooling/CommonOptionsParser.cpp 2.33 0.68 -70.6% 87.1 34.4 -60.5%
lib/AST/StmtViz.cpp 1.28 0.48 -62.5% 62.4 39.0 -37.5%
tools/driver/cc1_main.cpp 3.05 1.29 -57.8% 93.7 58.6 -37.4%
unittests/CodeGen/BufferSourceTest.cpp 4.12 2.62 -36.5% 145.8 103.9 -28.7%
lib/CodeGen/CGLoopInfo.cpp 2.43 1.68 -30.7% 101.6 82.5 -18.8%
unittests/CodeGen/CodeGenExternalTest.cpp 4.50 3.21 -28.6% 155.5 125.1 -19.5%
lib/Driver/ToolChains/Contiki.cpp 0.53 0.38 -28.1% 42.4 38.0 -10.5%
unittests/Tooling/RefactoringActionRulesTest.cpp 3.22 2.34 -27.5% 108.3 90.0 -16.9%
lib/Serialization/GeneratePCH.cpp 2.38 1.78 -25.1% 83.8 71.1 -15.1%

Full list:
<clang.txt>


The updated patches:
<llvm_redundant_headers.patch>
<clang_redundant_headers.patch>

Thanks,
Michael

On Dec 8, 2017, at 9:20 AM, Quentin Colombet via llvm-dev <[hidden email]> wrote:



On Dec 6, 2017, at 1:17 PM, Matthias Braun via llvm-dev <[hidden email]> wrote:

- We do indeed have a lot of unnecessary includes around in llvm (or pretty much any other C++ project for that matter).
- I want faster builds.
- The only way to reliably fight this is indeed automatic tools.
- Having the right amount of includes also has documentation value and ideally let's you understand the structure of your project.
- However relying on transitive includes works contrary to the last "undestanding/documentation" point.
- (And as stated earlier to have things really clean we want `class XXX;` instead of `#include "XXX.h"` wherever possible. And if you are serious about that we also often have to reduce the amount of include code in the headers so we can move the `#include "XXX.h"` from the header to the implementation.

For me personally I think the documentation/understandability we loose when relying on transitive includes weights heavier than my desire to get a faster build…

+1

Q.


- Matthias

On Dec 6, 2017, at 12:28 PM, serge guelton via llvm-dev <[hidden email]> wrote:

On Wed, Dec 06, 2017 at 11:38:54AM -0800, Michael Zolotukhin via llvm-dev wrote:

On Dec 6, 2017, at 9:00 AM, mats petersson via cfe-dev <[hidden email]> wrote:

In my experience, a lot of time is spent on optimizing the code (assuming it's not a "-O0" build).
The numbers were actually for the debug build (-O0 -g), so for Release build they would be different (presumably lower).
Also redundant includes are largely fixed by header guards, and I believe Clang [and gcc as well as MS Compilers, and probably most others too] have an include guards-cache that determines that "we've already included foo.h, and it has include guards around the whole actual content of the file, so we can just skip it”.
By redundant here I meant that we included a file, but we didn’t use any of its content (rather than we included the same file twice).

So I'm slightly dubious as to this being an efficient way of significantly reducing the total compilation time for the overall project - even if there are SOME cases where there is a significant improvement in a single file. The total time for a clean build [in wall-clock-time, not CPU-time] should be measured, making sure that there is enough memory. Doing a run of, say, five complete builds of the same thing [with suitable "clean" between to redo the whole build], take away the worst and the best, and perhaps also "modify one of the more common header files" (llvm/IR/Type.h for example) and build again.
On full builds the benefit is not big (around 1%, but the noise is high), but: 1) if we only take gains more than, say, 5%, we’ll probably never see any, 2) I aim at changes that make the code strictly better (modulo David’s point about disk cache). If any change is questionable from maintenance or whatever other point of view, I’m all for dropping it.

my 2¢

+1 for point 2). Even leaving aside the speed gain, removing unused
includes file just looks like good coding practice to me.

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev


_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] Who wants faster LLVM/Clang builds?

Liu, Yaxun (Sam) via cfe-dev


On Dec 15, 2017, at 12:45 PM, Michael Zolotukhin <[hidden email]> wrote:



On Dec 15, 2017, at 10:58 AM, Mehdi AMINI <[hidden email]> wrote:



2017-12-09 12:54 GMT-08:00 Chris Lattner via llvm-dev <[hidden email]>:


On Dec 8, 2017, at 5:01 PM, Mikhail Zolotukhin via llvm-dev <[hidden email]> wrote:

Hi,

I tweaked my scripts to avoid removing includes when it doesn't give any significant benefits, which made the patches significantly smaller. This time the patches should not try to remove includes of header files, which are transitively included from other included header files. The gains mostly remained the same (plus/minus noise), the tables are in the end of the email. I also included size of preprocessed files (measured in 1000 lines of code).

Just my opinion, but it is *good* to remove redundant header file includes, even if it doesn’t speed up the build.


Do you consider transitive includes as redundant? Why should we remove these?
Hi Mehdi,

I don’t consider them redundant, and in the patches I’ve recently committed there shouldn’t be such removals - if you spot any, please let me know.
For the record, they were committed in r320617-r320636 with a couple of follow-ups in r320645 and r320648. The clang part isn’t committed yet.

Michael
All your arguments (as well as mentioned earlier by other people) are completely valid.

Thanks,
Michael
One drawback I see with removing these is that if I need `class Foo` but I get it through `bar.h` which includes `foo.h`, it means that removing the include to `foo.h` in `bar.h` would break the client of bar that relied on this transitive includes.
That makes refactoring of a components harder since it may break some of the clients of the components for no other reason than this transitive include.
I suspect this is why IWYU would instead *add* these includes when they're missing.

-- 
Mehdi


 


I suggest that from here we go as follows: maintainers/interested people take a look at files related to their components and pick the parts of the patches that they consider correct. I'll also start with some files next week if there is no objections to it. Does it sound reasonable?

The most impacted files (the numbers are for Debug build):

LLVM top 10
Filename TimeOld TimeNew Delta SizeOld SizeNew SizeDelta
lib/CodeGen/GlobalISel/GlobalISel.cpp 0.26 0.02 -91.6% 35.0 0.3 -99.0%
lib/MC/MCLabel.cpp 0.20 0.02 -88.0% 25.5 0.0 -99.9%
tools/llvm-readobj/ObjDumper.cpp 0.44 0.10 -76.8% 41.0 11.8 -71.1%
lib/MC/MCWinEH.cpp 0.49 0.15 -70.4% 43.9 21.4 -51.2%
lib/Transforms/Vectorize/Vectorize.cpp 0.73 0.29 -60.7% 52.7 35.5 -32.6%
tools/llvm-diff/DiffLog.cpp 0.59 0.27 -53.8% 50.7 33.7 -33.7%
lib/Target/ARM/MCTargetDesc/ARMMachORelocationInfo.cpp 0.47 0.25 -46.7% 46.7 37.9 -18.9%
lib/DebugInfo/DWARF/DWARFExpression.cpp 0.67 0.38 -43.5% 47.4 34.8 -26.7%
lib/Transforms/Utils/ASanStackFrameLayout.cpp 0.52 0.32 -38.8% 41.7 33.7 -19.2%
tools/llvm-dwp/llvm-dwp.cpp 2.48 1.53 -38.3% 92.5 55.2 -40.3%

Full list:
<llvm.txt>

Clang top 10
Filename TimeOld TimeNew Delta SizeOld SizeNew SizeDelta
tools/libclang/CXString.cpp 2.25 0.30 -86.9% 85.1 31.7 -62.7%
lib/Tooling/CommonOptionsParser.cpp 2.33 0.68 -70.6% 87.1 34.4 -60.5%
lib/AST/StmtViz.cpp 1.28 0.48 -62.5% 62.4 39.0 -37.5%
tools/driver/cc1_main.cpp 3.05 1.29 -57.8% 93.7 58.6 -37.4%
unittests/CodeGen/BufferSourceTest.cpp 4.12 2.62 -36.5% 145.8 103.9 -28.7%
lib/CodeGen/CGLoopInfo.cpp 2.43 1.68 -30.7% 101.6 82.5 -18.8%
unittests/CodeGen/CodeGenExternalTest.cpp 4.50 3.21 -28.6% 155.5 125.1 -19.5%
lib/Driver/ToolChains/Contiki.cpp 0.53 0.38 -28.1% 42.4 38.0 -10.5%
unittests/Tooling/RefactoringActionRulesTest.cpp 3.22 2.34 -27.5% 108.3 90.0 -16.9%
lib/Serialization/GeneratePCH.cpp 2.38 1.78 -25.1% 83.8 71.1 -15.1%

Full list:
<clang.txt>


The updated patches:
<llvm_redundant_headers.patch>
<clang_redundant_headers.patch>

Thanks,
Michael

On Dec 8, 2017, at 9:20 AM, Quentin Colombet via llvm-dev <[hidden email]> wrote:



On Dec 6, 2017, at 1:17 PM, Matthias Braun via llvm-dev <[hidden email]> wrote:

- We do indeed have a lot of unnecessary includes around in llvm (or pretty much any other C++ project for that matter).
- I want faster builds.
- The only way to reliably fight this is indeed automatic tools.
- Having the right amount of includes also has documentation value and ideally let's you understand the structure of your project.
- However relying on transitive includes works contrary to the last "undestanding/documentation" point.
- (And as stated earlier to have things really clean we want `class XXX;` instead of `#include "XXX.h"` wherever possible. And if you are serious about that we also often have to reduce the amount of include code in the headers so we can move the `#include "XXX.h"` from the header to the implementation.

For me personally I think the documentation/understandability we loose when relying on transitive includes weights heavier than my desire to get a faster build…

+1

Q.


- Matthias

On Dec 6, 2017, at 12:28 PM, serge guelton via llvm-dev <[hidden email]> wrote:

On Wed, Dec 06, 2017 at 11:38:54AM -0800, Michael Zolotukhin via llvm-dev wrote:

On Dec 6, 2017, at 9:00 AM, mats petersson via cfe-dev <[hidden email]> wrote:

In my experience, a lot of time is spent on optimizing the code (assuming it's not a "-O0" build).
The numbers were actually for the debug build (-O0 -g), so for Release build they would be different (presumably lower).
Also redundant includes are largely fixed by header guards, and I believe Clang [and gcc as well as MS Compilers, and probably most others too] have an include guards-cache that determines that "we've already included foo.h, and it has include guards around the whole actual content of the file, so we can just skip it”.
By redundant here I meant that we included a file, but we didn’t use any of its content (rather than we included the same file twice).

So I'm slightly dubious as to this being an efficient way of significantly reducing the total compilation time for the overall project - even if there are SOME cases where there is a significant improvement in a single file. The total time for a clean build [in wall-clock-time, not CPU-time] should be measured, making sure that there is enough memory. Doing a run of, say, five complete builds of the same thing [with suitable "clean" between to redo the whole build], take away the worst and the best, and perhaps also "modify one of the more common header files" (llvm/IR/Type.h for example) and build again.
On full builds the benefit is not big (around 1%, but the noise is high), but: 1) if we only take gains more than, say, 5%, we’ll probably never see any, 2) I aim at changes that make the code strictly better (modulo David’s point about disk cache). If any change is questionable from maintenance or whatever other point of view, I’m all for dropping it.

my 2¢

+1 for point 2). Even leaving aside the speed gain, removing unused
includes file just looks like good coding practice to me.

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev


_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
12