[RFC] FP Contract = fast?

Re: [llvm-dev] [RFC] FP Contract = fast?

Roman Popov via cfe-dev

On Mar 16, 2017, at 2:13 PM, Adam Nemet <[hidden email]> wrote:


On Mar 15, 2017, at 2:51 PM, Adam Nemet <[hidden email]> wrote:


On Mar 15, 2017, at 2:30 PM, Hal Finkel <[hidden email]> wrote:


On 03/15/2017 04:05 PM, Adam Nemet wrote:

On Mar 15, 2017, at 2:00 PM, Hal Finkel <[hidden email]> wrote:


On 03/15/2017 01:47 PM, Adam Nemet wrote:

On Mar 15, 2017, at 11:36 AM, Mehdi Amini <[hidden email]> wrote:


On Mar 15, 2017, at 10:13 AM, Hal Finkel via cfe-dev <[hidden email]> wrote:


On 03/15/2017 12:10 PM, Adam Nemet via llvm-dev wrote:
Relevant to this discussion is http://bugs.llvm.org/show_bug.cgi?id=25721 (-ffp-contract=fast does not work with LTO).  I am working on adding function attributes for fp-contract=fast which should fix this.

Great!


A function attribute would be a strict improvement over today: LLVM can’t do contraction today. But actually I’m not sure it is the right long-term choice: attributes don’t combine well with inlining, for instance. You mentioned FMF earlier; why don’t we have an FMF to allow contraction?

OK, I thought that the prerequisite for that was a fast-math pragma, which I don’t think is something we have (I want to be able to specify contract=fast at a smaller granularity than a module).  But now that I think more about it, we should be able to turn a user function attribute into FMF in the front-end, which is the most flexible.

I agree, a FMF is the way to go and we can then control it with the pragma. We can use the STDC FP_CONTRACT pragma for contraction;

Just to confirm, do you mean to introduce a “fast” option to the pragma, e.g.:

#pragma STDC FP_CONTRACT FAST

That's a good point. If we don't add something like this, then we'd be able to turn the fast mode off with the pragma, but then not be able to turn it back on ;)
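
For context, the standard pragma accepts only ON, OFF, and DEFAULT, which is why a FAST spelling would be an extension. A minimal sketch of the existing form follows, assuming a compiler that honors the pragma; the `__clang__` guard and the function name `mul_add_strict` are illustrative and not from the thread.

```cpp
// Illustrative only: switch contraction off for one function body.
// The pragma has to come first inside the compound statement.
float mul_add_strict(float a, float b, float c) {
#if defined(__clang__)
#pragma STDC FP_CONTRACT OFF
#endif
    // With contraction off, the product is rounded before the add.
    return a * b + c;
}
```

Once a block has been switched off this way, the standard spellings offer no way to ask for the more aggressive mode again, which is the gap discussed above.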

So, yes, except that I'm somewhat hesitant to invade the 'STDC' space with vendor extensions. If we generally introduce a pragma to control FMFs, maybe we should just use that instead? I don't have a clear idea on the syntax, but for example, if we had some pragma that let us do

#pragma clang fast_math

or

#pragma clang fast_math nnan(off) contract(on)

or whatever, then we could use that. What do you think?

That looks great; it nicely matches the internal representation.  Let me take a stab at this.

Thinking more about this, I am back to thinking that a function attribute is the better solution for this than FMF, at least before inlining.

Consider the standard example where we want this to trigger: overloaded addition and multiplication operators for a vector class.

In this case, we want the fadd and the fmul in the inlined functions to have the FMF as well but we don’t necessarily want to mark the overloaded operators with the pragma; we may be only comfortable contracting at this call site.

You don’t have this problem if you mark the containing function FP-contractable.  Effectively what we want is to outline the block within the pragma into a function and tag it with the attribute.

During inlining we can still transform the function attribute into FMF.
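
The vector-class case can be sketched as follows. This is a hypothetical four-lane type; `vect`, its layout, and `mul_add` are assumptions made for illustration and are not code from the thread.

```cpp
// Hypothetical vector type used only to illustrate the call-site problem.
struct vect {
    float v[4];
};

// Elementwise multiply: after inlining, this is where the fmul comes from.
inline vect operator*(const vect& a, const vect& b) {
    vect r{};
    for (int i = 0; i < 4; ++i) r.v[i] = a.v[i] * b.v[i];
    return r;
}

// Elementwise add: after inlining, this is where the fadd comes from.
inline vect operator+(const vect& a, const vect& b) {
    vect r{};
    for (int i = 0; i < 4; ++i) r.v[i] = a.v[i] + b.v[i];
    return r;
}

// The call site where fusion is wanted. The multiply and the add are
// written inside the operators above, not in this function, so any
// per-instruction permission has to come from the operator definitions.
inline vect mul_add(const vect& v2, const vect& v3, const vect& v4) {
    return v2 * v3 + v4;
}
```

The point of the sketch is only where the floating-point operations are written: a flag attached to instructions follows the operator bodies, while a function attribute would follow `mul_add`.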

So I think I am going back to implementing fp-contract=fast as a function attribute as the first step unless there are any objections.

I’m objecting to “as the first step” part only :) 
Either it is the right thing to do, or it is not! (You seem to advocate above that we should do it this way, in the absolute, not as a first step)

On the general direction, I feel that what you’re describing applies equally to any FMF, so it is not clear to me right now why fp-contract is fundamentally different from other FMF, can you clarify that? 

— 
Mehdi



_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

Re: [llvm-dev] [RFC] FP Contract = fast?

Roman Popov via cfe-dev
In reply to this post by Roman Popov via cfe-dev

On Mar 16, 2017, at 3:25 PM, Hal Finkel <[hidden email]> wrote:



Are you saying this works because we don't block inlining when function attributes don't match, or because we update the function attributes to be more conservative when inlining? This is specifically one of the issues we were avoiding by using FMFs. Frankly, the same issue comes up with other fast-math properties, and I don't see why we should handle this differently. I think that I'd prefer you stick with the new flag.

OK, so in the example:

#pragma clang fast_math contract_fast(on)
 vect v1 = v2 * v3 + v4;
#pragma clang fast_math contract_fast(off)

where all the operands are vectors with the typical implementation for the overloaded operators, we wouldn’t fp-contract unless the operator definitions use contract_fast too?

Adam


 -Hal


Adam


Adam


 -Hal


Thanks,
Adam

I also think that having a "fast math" pragma is a good idea (the fact that we can currently only specify fast-math settings on a translation-unit level is somewhat problematic).



Also, IIUC, the function attribute as well as an FMF wouldn’t apply to the “ON” setting but only to the “FAST” mode (there is no way to distinguish source-level statements in LLVM IR).

Right. We still have the existing fmuladd intrinsic method for dealing with the "ON" setting.
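
The distinction between the two settings can be sketched with two source shapes. The function names are illustrative; the comments describe what each mode is allowed to do, not what any particular compiler version emits.

```cpp
// Product and sum in one source expression: this is the shape the "ON"
// setting covers, since the front end can still see that both operations
// belong to the same expression.
float one_expression(float a, float b, float c) {
    return a * b + c;
}

// Product and sum in separate statements: not a candidate under "ON".
// Only the "FAST" mode would allow fusing across the statement boundary.
float two_statements(float a, float b, float c) {
    float t = a * b;
    return t + c;
}
```

Once both functions are lowered to IR they look the same, which is why the statement-level rule has to be handled in the front end.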


Yes.

Adam


— 
Mehdi






Also, now that we have backend optimization remarks, I am planning to report a missed optimization when we can’t fuse FMAs due to “fast” not being on.  This will show up in the opt-viewer.  Then the user can opt in either with the command-line switch or the new function attribute.

That seems useful.

Thanks again,
Hal


Adam

On Mar 15, 2017, at 6:27 AM, Renato Golin via cfe-dev <[hidden email]> wrote:

Folks,

I've been asking people around about the state of FP contract, which
seems to be "on" but it's not really behaving like it, at least not as
I would expect:

float foo(float a, float b, float c) { return a*b+c; }

$ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=on -o -
(...)
fmul s0, s0, s1
fadd s0, s0, s2
(...)

$ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=fast -o -
(...)
fmadd s0, s0, s1, s2
(...)

I'm not sure this works in Fortran either, but defaulting to "on" when
(I believe) the language should allow contraction and not doing it is
not a good default.

I haven't worked out what would be necessary to make it work on a
case-by-case basis (what kinds of fusions does C allow?) to make sure
we don't do all or nothing, but if we don't want to start that
conversation now, then I'd recommend we just turn it all the way to 11
(like GCC) and let people turn it off if they really mean it.

The rationale is that:

* Contracted operations increase precision (fewer rounding steps)
* It performs equal or faster on all architectures I know (true everywhere?)
* Users already expect that (certainly, GCC users do)
* Makes us look good on benchmarks :)
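
The rounding point in the first bullet can be made concrete with a small sketch. The function names are illustrative; the volatile store is an assumption used here to force the intermediate rounding regardless of the compiler's own contraction setting.

```cpp
#include <cmath>

// Unfused: the volatile store forces the product to be rounded to double
// before the add, so there are two rounding steps.
double mul_then_add(double a, double b, double c) {
    volatile double product = a * b;
    return product + c;
}

// Fused: std::fma computes a*b+c with a single rounding at the end.
double fused_mul_add(double a, double b, double c) {
    return std::fma(a, b, c);
}
```

With a = 1 + 2^-30, b = 1 - 2^-30 and c = -1, the exact product is 1 - 2^-60, which rounds to 1.0 in double precision. The unfused version therefore returns 0, while the fused version returns -2^-60. Results differ between the two modes, which is also why the choice of default matters for reproducibility.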

A recent SPEC2k6 comparison Linaro did for AArch64, enabling
-ffp-contract=fast took the edge off GCC in a number of cases and in
some of them made them comparable in performance. So, any reasons not
to?

If we go with it, we need to first finish the job that Sebastian was
doing on the test-suite, then just turn it on by default. A second
stage would be to add tests/benchmarks that explicitly test FP
precision, so that we have some extra guarantee that we're doing the
right thing.

Opinions?

cheers,
--renato



_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

Re: [llvm-dev] [RFC] FP Contract = fast?

Roman Popov via cfe-dev

On Mar 16, 2017, at 4:04 PM, Adam Nemet via cfe-dev <[hidden email]> wrote:



OK, so in the example:

#pragma clang fast_math contract_fast(on)
 vect v1 = v2 * v3 + v4;
#pragma clang fast_math contract_fast(off)

where all the operands are vectors with the typical implementation for the overloaded operators, we wouldn’t fp-contract unless the operator definitions use contract_fast too?

I guess it’s the conservative thing to do since those functions may have non-contractable operations.

Adam



Re: [llvm-dev] [RFC] FP Contract = fast?

Roman Popov via cfe-dev


On 03/16/2017 06:23 PM, Adam Nemet wrote:


I guess it’s the conservative thing to do since those functions may have non-contractable operations.

Exactly; the problem is that this is the desirable behavior in some circumstances and undesirable in others, but I think that doing the conservative thing is the most self-consistent choice (and also allows users the most powerful fine-grained control).
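
The conservative rule agreed on here can be sketched as a toy model. This is not LLVM's API: `Op`, `Inst`, and `can_fuse` are hypothetical names, and requiring the flag on both operations is an assumption drawn from the exchange above.

```cpp
// Toy model of per-instruction contraction flags; every name is hypothetical.
enum class Op { FMul, FAdd };

struct Inst {
    Op op;
    bool contract;  // stands in for the per-instruction fast-math flag
};

// Conservative rule: fuse only when both the multiply and the add that
// consumes it were written in code that allows contraction. The flag
// travels with the instruction through inlining, so the caller's setting
// is never consulted here.
bool can_fuse(const Inst& mul, const Inst& add) {
    return mul.op == Op::FMul && add.op == Op::FAdd &&
           mul.contract && add.contract;
}
```

In the `v2 * v3 + v4` example, both instructions come from the operator definitions, so a pragma around the call site alone leaves both flags unset and nothing is fused.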

Thanks again,
Hal



Re: [llvm-dev] [RFC] FP Contract = fast?

Roman Popov via cfe-dev
I’ve uploaded the first set of patches for this.  They should fix LTO for fused multiply-and-add.  -ffp-contract=fast is now translated into a new FMF called contract. 

  1. llvm: [IR] Add AllowContract to FastMathFlags https://reviews.llvm.org/D31164
  2. llvm: [SDAG] Add AllowContract to SNodeFlags https://reviews.llvm.org/D31165
  3. clang: Encapsulate FPOptions and use it consistently https://reviews.llvm.org/D31166
  4. clang: Use FPContractModeKind universally https://reviews.llvm.org/D31167
  5. clang: Set FMF on -ffp-contract=fast rather than https://reviews.llvm.org/D31168
  6. llvm: [DAGCombiner] Initial support for the fast-math flag contract https://reviews.llvm.org/D31169
DAGCombine support for fused subtract and then the pragma we discussed are coming later.  Please help review what I have so far.

Thanks,
Adam

On Mar 16, 2017, at 5:20 PM, Hal Finkel <[hidden email]> wrote:


On 03/16/2017 06:23 PM, Adam Nemet wrote:

On Mar 16, 2017, at 4:04 PM, Adam Nemet via cfe-dev <[hidden email]> wrote:


On Mar 16, 2017, at 3:25 PM, Hal Finkel <[hidden email]> wrote:


On 03/16/2017 04:13 PM, Adam Nemet wrote:

On Mar 15, 2017, at 2:51 PM, Adam Nemet <[hidden email]> wrote:


On Mar 15, 2017, at 2:30 PM, Hal Finkel <[hidden email]> wrote:


On 03/15/2017 04:05 PM, Adam Nemet wrote:

On Mar 15, 2017, at 2:00 PM, Hal Finkel <[hidden email]> wrote:


On 03/15/2017 01:47 PM, Adam Nemet wrote:

On Mar 15, 2017, at 11:36 AM, Mehdi Amini <[hidden email]> wrote:


On Mar 15, 2017, at 10:13 AM, Hal Finkel via cfe-dev <[hidden email]> wrote:


On 03/15/2017 12:10 PM, Adam Nemet via llvm-dev wrote:
Relevant to this discussion is http://bugs.llvm.org/show_bug.cgi?id=25721 (-ffp-contract=fast does not work with LTO).  I am working on adding function attributes for fp-contract=fast which should fix this.

Great!


A function attribute would be a strict improvement over today: LLVM can’t do contraction today. But actually I’m not sure if it is the long term right choice: attributes don’t combine well with inlining for instance. You mentioned FMF earlier, why don’t we have a FMF to allow contraction?

OK, I thought that the prerequisite for that was a fast-math pragma which I don’t think is something we have (I want to be able to specify contract=fast on smaller granularity than module).  But now that I think more about, we should be able to turn a user function attribute into FMF in the front-end which is the most flexible. 

I agree, a FMF is the way to go and we can then control it with the pragma. We can use the STDC FP_CONTRACT pragma for contraction;

Just to confirm, do you mean to introduce a “fast” option to the pragma, e.g.:

#pragma STDC FP_CONTRACT FAST

That's a good point. If we don't add something like this, then we'd be able to turn the fast mode off with the pragma, but then not be able to turn it back on ;)

So, yes, except that I'm somewhat hesitant to invade the 'STDC' space with vendor extensions. If we generally introduce a pragma to control FMFs, maybe we should just use that instead? I don't have a clear idea on the syntax, but for example, if we had some pragma that let us do

#pragma clang fast_math or #pragma clang fast_math nnan(off) contract(on) or whatever then we could use that. What do you think?

That looks great; it nicely matches the internal representation.  Let me take a stab at this.
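
For reference, a minimal sketch of how the standard pragma mentioned above scopes contraction. The C standard defines only the ON, OFF and DEFAULT arguments; the FAST spelling is just the proposal floated in this thread, so it is not used here. The function names are made up for illustration, and a compiler that does not implement the pragma may simply ignore it (possibly with a warning).

```cpp
// Sketch only: mul_add_strict and mul_add_contractable are hypothetical names.

// Contraction is disallowed in this function body (when the compiler honors
// the pragma): the multiply and the add are rounded separately.
float mul_add_strict(float a, float b, float c) {
#pragma STDC FP_CONTRACT OFF
    return a * b + c;
}

// Contraction is permitted within the expression: the compiler may, but
// need not, emit a single fused multiply-add here.
float mul_add_contractable(float a, float b, float c) {
#pragma STDC FP_CONTRACT ON
    return a * b + c;
}
```

Note that ON only restores the per-expression behavior. As pointed out above, a translation unit built with -ffp-contract=fast has no standard spelling to get back to "fast" after an OFF region, which is the gap the proposed pragma is meant to fill.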

Thinking more about this, I am back to thinking that a function attribute is the better solution for this than FMF, at least before inlining.

Consider the standard example where we want this to trigger: overloaded addition and multiplication operators for a vector class.

In this case, we want the fadd and the fmul in the inlined functions to have the FMF as well but we don’t necessarily want to mark the overloaded operators with the pragma; we may be only comfortable contracting at this call site.

You don’t have this problem if you mark the containing function FP-contractable.  Effectively what we want is to outline the block within the pragma into a function and tag it with the attribute.

During inlining we can still transform the function attribute into FMF.

So I think I am going back to implementing fp-contract=fast as a function attribute as the first step unless there are any objections.

Are you saying this works because we don't block inlining when function attributes don't match or update the function attributes to be more conservative when inlining? This is specifically one of the issues we were avoiding by using FMFs. Frankly, the same issue comes up with other fast-math properties, and I don't see why we should handle this differently. I think that I'd prefer you stick with the new flag.

OK, so in the example:

#pragma clang fast_math contract_fast(on)
 vect v1 = v2 * v3 + v4;
#pragma clang fast_math contract_fast(off)

where all the operands are vectors with the typical implementation of the overloaded operators, we wouldn’t fp-contract unless the operator definitions use contract_fast too?

I guess it’s the conservative thing to do since those functions may have non-contractable operations.

Exactly; the problem is that this is the desirable behavior in some circumstances and undesirable in others, but I think that doing the conservative thing is the most self-consistent choice (and also allows users the most powerful fine-grained control).
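
To make the vector-class case concrete, here is a rough sketch of the kind of code under discussion. The `vect` layout, the lane count and the `mul_add` helper are assumptions for illustration only; the thread does not show the actual class.

```cpp
// Hypothetical 4-lane vector type; only the name `vect` comes from the
// example above.
struct vect {
    float lane[4];
};

// Element-wise multiply. After inlining, these become the fmul
// instructions, and they were written here, not at the call site.
inline vect operator*(const vect &x, const vect &y) {
    vect r{};
    for (int i = 0; i < 4; ++i)
        r.lane[i] = x.lane[i] * y.lane[i];
    return r;
}

// Element-wise add. After inlining, these become the fadd instructions.
inline vect operator+(const vect &x, const vect &y) {
    vect r{};
    for (int i = 0; i < 4; ++i)
        r.lane[i] = x.lane[i] + y.lane[i];
    return r;
}

// Call site: the multiply and the add live in two different functions, so
// no single source expression contains both floating-point operations.
vect mul_add(const vect &v2, const vect &v3, const vect &v4) {
    return v2 * v3 + v4;
}
```

With per-instruction flags, whether the inlined fmul and fadd may be fused depends on the settings in effect where the operators are defined, which is the conservative behavior agreed on above.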

Thanks again,
Hal


Adam


Adam


 -Hal


Adam


Adam


 -Hal


Thanks,
Adam

I also think that having a "fast math" pragma is also a good idea (the fact that we can currently only specify fast-math settings on a translation-unit level is somewhat problematic).



Also, IIUC, the function attribute as well as a FMF wouldn’t apply to the “ON” setting but only to the “FAST” mode (no way to distinguish source level statement in llvm IR).

Right. We still have the existing fmuladd intrinsic method for dealing with the "ON" setting.
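
A small sketch of the source-level distinction behind this, assuming the usual reading that "on" permits contraction only within a single expression while "fast" ignores statement boundaries. The function names are hypothetical.

```cpp
// Multiply and add in one expression: eligible for contraction under "on".
// As I understand it, this is the case the front-end marks with fmuladd.
float same_expression(float a, float b, float c) {
    return a * b + c;
}

// The product is a complete expression of its own in a separate statement.
// Under "on" it should not be fused with the later add; under "fast" the
// optimizer is free to fuse it anyway.
float separate_statements(float a, float b, float c) {
    float product = a * b;
    return product + c;
}
```

Once both functions are lowered to plain fmul and fadd instructions, the IR no longer records which of the two forms the source used, which is why a flag or attribute can only express the "fast" mode.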


Yes.

Adam


— 
Mehdi






Also, now that we have backend optimization remarks, I am planning to report a missed optimization when we can’t fuse FMAs due to “fast” not being on.  This will show up in the opt-viewer.  Then the user can opt in either with the command-line switch or the new function attribute.

That seems useful.

Thanks again,
Hal


Adam

On Mar 15, 2017, at 6:27 AM, Renato Golin via cfe-dev <[hidden email]> wrote:

Folks,

I've been asking around about the state of FP contract, which
seems to be "on" but is not really behaving like it, at least not as
I would expect:

float foo(float a, float b, float c) { return a*b+c; }

$ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=on -o -
(...)
fmul s0, s0, s1
fadd s0, s0, s2
(...)

$ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=fast -o -
(...)
fmadd s0, s0, s1, s2
(...)

I'm not sure this works in Fortran either, but defaulting to "on" when
(I believe) the language should allow contraction and not doing it is
not a good default.

I haven't worked out what would be necessary to make it work on a
case-by-case basis (what kinds of fusions does C allow?) to make sure
we don't do all or nothing, but if we don't want to start that
conversation now, then I'd recommend we just turn it all the way to 11
(like GCC) and let people turn it off if they really mean it.

The rationale is that:

* Contracted operations increase precision (fewer rounding steps)
* It performs equal or faster on all architectures I know (true everywhere?)
* Users already expect that (certainly, GCC users do)
* Makes us look good on benchmarks :)
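
The first bullet can be illustrated with a small self-contained sketch. The inputs are chosen by hand so that the exact product falls halfway between two floats; the function and constant names are made up for this example, and the `volatile` is only there to force the intermediate rounding regardless of the compiler's contraction setting.

```cpp
#include <cmath>

// Two rounding steps: the product is rounded to float, then the sum is.
float unfused(float a, float b, float c) {
    volatile float product = a * b;  // volatile store forces the rounding
    return product + c;
}

// One rounding step: std::fma computes a*b+c exactly and rounds once.
float fused(float a, float b, float c) {
    return std::fma(a, b, c);
}

// a*a = 1 + 2^-11 + 2^-24 exactly, which needs one bit more than a float
// significand holds; rounded to float it becomes 1 + 2^-11.
const float kA = 1.0f + std::ldexp(1.0f, -12);
const float kC = -(1.0f + std::ldexp(1.0f, -11));
```

Here `unfused(kA, kA, kC)` yields 0 because the low bit of the product is lost before the add, while `fused(kA, kA, kC)` yields 2^-24. The fused result is the more accurate one, but the two differ, which is why turning contraction on by default is a semantic change and not only a performance one.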

In a recent SPEC2k6 comparison Linaro did for AArch64, enabling
-ffp-contract=fast took the edge off GCC in a number of cases and in
some of them made them comparable in performance. So, any reasons not
to?

If we go with it, we need to first finish the job that Sebastian was
doing on the test-suite, then just turn it on by default. A second
stage would be to add tests/benchmarks that explicitly test FP
precision, so that we have some extra guarantee that we're doing the
right thing.

Opinions?

cheers,
--renato
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev



_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

Re: [llvm-dev] [RFC] FP Contract = fast?

Roman Popov via cfe-dev

On Mar 20, 2017, at 8:26 PM, Adam Nemet <[hidden email]> wrote:

I’ve uploaded the first set of patches for this.  They should fix LTO for fused multiply-and-add.  -ffp-contract=fast is now translated into a new FMF called contract. 

  1. llvm: [IR] Add AllowContract to FastMathFlags https://reviews.llvm.org/D31164
  2. llvm: [SDAG] Add AllowContract to SNodeFlags https://reviews.llvm.org/D31165
  3. clang: Encapsulate FPOptions and use it consistently https://reviews.llvm.org/D31166
  4. clang: Use FPContractModeKind universally https://reviews.llvm.org/D31167
  5. clang: Set FMF on -ffp-contract=fast rather than https://reviews.llvm.org/D31168
  6. llvm: [DAGCombiner] Initial support for the fast-math flag contract https://reviews.llvm.org/D31169
DAGCombine support for fused subtract and then the pragma we discussed are coming up later.  Please help review what I have so far.

The new pragma is added in https://reviews.llvm.org/D31276.  Please help to review this and the previous patches in the series.
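
As a rough illustration of how such a pragma is used, the sketch below assumes the spelling that Clang documents for its floating-point pragma, `#pragma clang fp contract(fast)`; check the review itself for the exact syntax it introduces. The `dot3` function is a made-up example, and the `__clang__` guard keeps other compilers from seeing a pragma they do not know.

```cpp
// Hypothetical example function: a 3-element dot product written so that
// each multiply and its add sit in different statements or sub-expressions.
float dot3(const float *x, const float *y) {
#if defined(__clang__)
// Assumed spelling; must come first in the compound statement. Allows
// fusing multiplies and adds in this block even across statements.
#pragma clang fp contract(fast)
#endif
    float acc = x[0] * y[0];
    acc = acc + x[1] * y[1];
    acc += x[2] * y[2];
    return acc;
}
```

Because the setting is attached to the operations in this block and not to the whole module, it should survive LTO and inlining in the way the FMF-based design above intends.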

Thanks,
Adam


Thanks,
Adam

On Mar 16, 2017, at 5:20 PM, Hal Finkel <[hidden email]> wrote:


On 03/16/2017 06:23 PM, Adam Nemet wrote:

On Mar 16, 2017, at 4:04 PM, Adam Nemet via cfe-dev <[hidden email]> wrote:


On Mar 16, 2017, at 3:25 PM, Hal Finkel <[hidden email]> wrote:


On 03/16/2017 04:13 PM, Adam Nemet wrote:

On Mar 15, 2017, at 2:51 PM, Adam Nemet <[hidden email]> wrote:


On Mar 15, 2017, at 2:30 PM, Hal Finkel <[hidden email]> wrote:

