Meaning of LLVM optimization levels

Meaning of LLVM optimization levels

Renato Golin Linaro
Folks,

I'm trying to rationalize about optimization levels and maybe we should come up with a document like this:

http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
Though, I remember a discussion a few months ago where some people recommended we have names, rather than numbers, to dissociate the idea that 3 is better than 2. Regardless, it would be good to have some guidelines on what goes where, so we don't end up in yet another long discussion about where to put the optimization <insert-name-here>.

As far as I can tell, on our side it is:

-O3 : throw everything and hope it sticks
-O2 : optimized build, but should not explode in code size nor consume all resources while compiling
-O1 : optimized debug binaries, don't change the execution order but remove dead code and stuff
-O0 : don't touch it
-Os : optimize, but don't run passes that could blow up code. Try to be a bit more drastic when removing code. When in doubt, prefer small, not fast code.
-Oz : only perform optimizations that reduce code size. Don't even try to run things that could potentially increase code size.

I've been thinking about this, and I think, regarding those criteria, it would make sense to use a try/compare/rollback approach to some passes, at least the most dramatic ones.

For instance, the vectorizer keeps the old loops hanging, and under Os/Oz, it should be possible to rollback the pass if the end result is bigger. Of course, IR size has little to do with final code size, but that's why we have (and rely so much on) heuristics.

AFAIK, for that to work on any pass as they are, we'd have to implement a transactional model on IRBuilder, which is not trivial, but could be done. Does anyone have a strong opinion about this?
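The try/compare/rollback idea can be sketched on a toy stand-in for IR (hypothetical types, not real LLVM classes): run the pass on a copy, compare sizes, and keep the smaller version.

```cpp
#include <cstddef>
#include <vector>

// Toy stand-in for an IR function: just a list of "instructions".
// (Invented types for illustration; a real pass works on llvm::Function.)
struct ToyFunction {
    std::vector<int> insts;
    std::size_t size() const { return insts.size(); }
};

// A transform that can enlarge the code, like vectorization keeping the
// old scalar loop around as a fallback.
ToyFunction vectorizeLike(const ToyFunction &f) {
    ToyFunction out = f;
    out.insts.insert(out.insts.end(), f.insts.begin(), f.insts.end());
    return out;
}

// Try/compare/rollback: run the transform on a copy and keep the result
// only if it did not grow the function (the -Os/-Oz criterion).
template <typename Transform>
ToyFunction applyOrRollback(const ToyFunction &f, Transform t) {
    ToyFunction candidate = t(f);
    return candidate.size() <= f.size() ? candidate : f;
}
```

A real implementation would need something like the transactional model mentioned above, since cloning entire functions before every pass would be costly in time and memory.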

cheers,
--renato

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev

Re: Meaning of LLVM optimization levels

David Tweed-2
Hi, interesting idea.

I'll just note one thing at this point in the discussion: whether it's done by trying to "project what a transformation will do" or by applying transforms with the capability to roll back, this depends on having a good idea of how a given piece of code (at some level) will actually perform on a piece of real hardware. Without that, I suspect other aspects of how to do optimizations won't work effectively anyway.

Cheers,
Dave
________________________________________
From: [hidden email] [[hidden email]] On Behalf Of Renato Golin [[hidden email]]
Sent: Thursday, June 06, 2013 9:40 PM
To: LLVM Dev; Clang Dev
Subject: [cfe-dev] Meaning of LLVM optimization levels


Re: Meaning of LLVM optimization levels

Dallman, John-3
In reply to this post by Renato Golin Linaro

I'm not an LLVM or Clang developer, but I do spend a lot of time teasing software into working with the highest possible optimisation levels where it still works correctly.

These guidelines are pretty good, but there are a few details worth considering.

It needs to be possible to debug code at any optimisation level. It's acceptable for that to be harder at high optimisation levels, but it should be possible. I find myself doing this when I hit optimizer bugs, and want to make coherent bug reports. The reports are much better if I can work out what's wrong in the generated code. I haven't had to report many problems with Clang ... but I haven't turned up the optimisation all the way either.

Related to optimisation levels, it's quite helpful to have a way of controlling optimisation on a function-by-function level. This is very useful when you're trying to work out where in a file with many functions an optimiser problem is happening; it isn't foolproof, but it helps a lot.
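The function-by-function control described here does exist in attribute form in today's compilers; a minimal sketch, assuming a Clang-style compiler with the `optnone` attribute (GCC spells a similar idea `__attribute__((optimize("O0")))`, and availability varies by compiler version).

```cpp
// Exempt one suspect function from optimization while the rest of the
// file is built at a high level. Unknown attributes normally produce a
// warning rather than an error, so this degrades gracefully on compilers
// without support for it.
__attribute__((optnone))
int suspectFunction(int x) {
    // Kept unoptimized so a miscompile can be bisected down to the
    // functions that are still being optimized.
    return x * 2 + 1;
}
```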

 

--
John Dallman

Re: Meaning of LLVM optimization levels

Renato Golin Linaro
On 7 June 2013 13:53, Dallman, John <[hidden email]> wrote:

It needs to be possible to debug code at any optimisation level.


Yes, I agree. But after O1, sequential execution is a big impediment for optimizations, and keeping the debug information valid after so many transformations might pose a big penalty on the passes (time & memory). That was the whole idea of metadata being a second-class citizen.


Related to optimisation levels, it's quite helpful to have a way of controlling optimisation on a function-by-function level. This is very useful when you're trying to work out where in a file with many functions an optimiser problem is happening; it isn't foolproof, but it helps a lot.


There are already people working on that, and discussions on the list about this very topic. I agree that it would be extremely helpful for debugging large programs.

cheers,
--renato


Re: Meaning of LLVM optimization levels

Dallman, John-3

> after O1, sequential execution is a big impediment for optimizations, and
> keeping the debug information valid after so many transformations might
> pose a big penalty on the passes (time & memory). That was the whole
> idea of metadata being a second-class citizen.

I'm afraid I don't know much about how debug information is expressed, so this idea may be nonsense.

Is it possible for the debug information to mark all the instructions that arise from a language statement as coming from that statement, even though the instructions may be widely scattered? That in itself would be quite helpful. Instructions whose effects are used in the logic from more than one statement would have to be included with all those statements.

I felt the lack of something like this severely when digging out dozens of compiler bugs on Microsoft's Itanium compiler, over a decade ago. That processor "naturally" mixed instructions from many source statements, which prefigured this kind of problem.

I'm reasonably happy for debugging at high optimisation levels to be primarily done with a disassembly listing rather than source code, provided I can get some idea of which instructions come from which source statements, and which variables are being accessed. The absence of debug information at that level tends to require going through an entire function figuring out what every instruction does, and how it relates to the source, which is rather time-consuming.

 

thanks,

--
John Dallman

From: Renato Golin [mailto:[hidden email]]
Sent: 07 June 2013 17:39
To: Dallman, John
Cc: LLVM Dev; Clang Dev
Subject: Re: [cfe-dev] Meaning of LLVM optimization levels

 


Re: Meaning of LLVM optimization levels

Dean Sutherland
In reply to this post by Renato Golin Linaro
Of course it's *possible*, in a fundamental sense. It's even pretty easy to get right in a compiler back end (in a conceptual sense). You have to touch a LOT of code, but all the changes are trivial.  We did this at Tartan Labs back in the 90s. Done with only a bit of care, it makes debugging possible at any optimization level.  The idea is to make the debug information reflect what the optimizer and code generator actually did, rather than restricting them to the linear mapping supported by most debuggers.  If anyone cares, I can even give details now that the NDAs have finally expired.

Sadly, you can't express the resulting source line information in the debug directives used by any commonly available debugger (that I am aware of).  So -- at the very most optimistic -- this approach won't get you anything any time soon.

Dean Sutherland


Re: Meaning of LLVM optimization levels

Dallman, John-3

> We did this at Tartan Labs back in the 90s.

Glad to know the idea makes sense.

> Sadly, you can't express the resulting source line information in the debug
> directives used by any commonly available debugger (that I am aware of).
> So -- at the very most optimistic -- this approach won't get you anything
> any time soon.

Oh, well. I wish I was more surprised by this.

--
John Dallman

From: Dean Sutherland [mailto:[hidden email]]
Sent: 07 June 2013 18:15
To: Renato Golin
Cc: Dallman, John; Clang Dev; LLVM Dev
Subject: Re: [cfe-dev] Meaning of LLVM optimization levels

 


Re: Meaning of LLVM optimization levels

Renato Golin Linaro
In reply to this post by Dallman, John-3
On 7 June 2013 17:52, Dallman, John <[hidden email]> wrote:

Is it possible for the debug information to mark all the instructions that arise from a language statement as coming from that statement, even though the instructions may be widely scattered?


I'm not aware that DWARF supports statements, but it does support line and column information, so if the sources are accurate you can get "statements", though not as a compiler would recognize them, just as strings.

Line information is normally passed along optimizations and inlining, and this is why debugging at O2/O3 has the effect of jumping randomly through steps in the debugger. But the interaction of the debugger and the core is extremely complex: breakpoints don't break in the right place, you might break on the wrong thread, or not break at all, stepping doesn't change lines consistently, you might step out of a function without noticing, etc, etc.

Even if LLVM (or any compiler) kept all the debug information intact, it still wouldn't mean much for the debugger if it couldn't make heads or tails of that information as the core would chug along at random instructions.

cheers,
--renato


Re: [LLVMdev] Meaning of LLVM optimization levels

Robinson, Paul-3
On 7 June 2013 17:52, Dallman, John <[hidden email]> wrote:
> Is it possible for the debug information to mark all the instructions that arise
> from a
> language statement as coming from that statement, even though the instructions may
> be widely scattered?

Yes.

> Instructions whose effects
> are used in the logic from more than one statement would have to be included with
> all those statements.

Hmmm, that would be atypical.  You *can* produce legal DWARF to do that, but
it's a little unlikely that any debugger would understand what you meant.
Generally each instruction is associated with a single statement.

From: [hidden email] [mailto:[hidden email]] On Behalf Of Renato Golin
> I'm not aware Dwarf supports statements, but it does support line and column
> information, so if the sources are accurate, you can get "statements" but not as a
> compiler would recognize, just as a string.

DWARF actually doesn't support source *extents*, it assumes the compiler
will map each statement to a single (canonical) source location and associate
each instruction produced for that statement to the same source location.

DWARF "supports statements" in the sense that a compiler can flag an
instruction as an appropriate place to set a breakpoint for a given
statement, i.e. the statement whose canonical source location is
associated with that instruction.
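The model described above can be sketched as a toy line table (invented struct and function names; real DWARF encodes these rows as a compressed state-machine program): each instruction carries one canonical source location, and an `is_stmt` flag marks the recommended breakpoint rows.

```cpp
#include <vector>

// One row of a simplified, DWARF-style line table.
struct LineRow {
    unsigned address;  // instruction address
    unsigned line;     // canonical source line for this instruction
    bool is_stmt;      // recommended breakpoint location for that line?
};

// A debugger-style lookup: plant a breakpoint for a source line at the
// first address flagged is_stmt for that line.
unsigned breakpointFor(const std::vector<LineRow> &table, unsigned line) {
    for (const LineRow &r : table)
        if (r.line == line && r.is_stmt)
            return r.address;
    return 0;  // no recommended location for this line
}
```

Note how instructions from the same statement can be scattered (several rows sharing one line number), yet only one row per line is flagged as the place to stop.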

--paulr




Re: [LLVMdev] Meaning of LLVM optimization levels

Chris Lattner
In reply to this post by Renato Golin Linaro

On Jun 6, 2013, at 1:40 PM, Renato Golin <[hidden email]> wrote:

As far as I can get from our side is:

-O3 : throw everything and hope it sticks
-O2 : optimized build, but should not explode in code size nor consume all resources while compiling
-O1 : optimized debug binaries, don't change the execution order but remove dead code and stuff
-O0 : don't touch it
-Os : optimize, but don't run passes that could blow up code. Try to be a bit more drastic when removing code. When in doubt, prefer small, not fast code.
-Oz : only perform optimizations that reduce code size. Don't even try to run things that could potentially increase code size.

I think that this is a pretty good codification of how things work, but we should separate out the mechanics (e.g. running passes) from the goals (don't blow up code size).  Something like this definitely should be in the Clang user docs.  The LLVM docs should have something similar but less "GCC command line option" centric.

-Chris



