[RFC] FP Contract = fast?

classic Classic list List threaded Threaded
26 messages Options
12
Reply | Threaded
Open this post in threaded view
|

[RFC] FP Contract = fast?

Boris Kolpackov via cfe-dev
Folks,

I've been asking around people about the state of FP contract, which
seems to be "on" but it's not really behaving like it, at least not as
I would expect:

int foo(float a, float b, float c) { return a*b+c; }

$ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=on -o -
(...)
fmul s0, s0, s1
fadd s0, s0, s2
(...)

$ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=fast -o -
(...)
fmadd s0, s0, s1, s2
(...)

I'm not sure this works in Fortran either, but defaulting to "on" when
(I believe) the language should allow contraction and not doing it is
not a good default.

i haven't worked out what would be necessary to make it work on a
case-by-case basis (what kinds of fusions does C allow?) to make sure
we don't do all or nothing, but if we don't want to start that
conversation now, then I'd recommend we just turn it all the way to 11
(like GCC) and let people turn it off if they really mean it.

The rationale is that:

* Contracted operations increase precision (less rounding steps)
* It performs equal or faster on all architectures I know (true everywhere?)
* Users already expect that (certainly, GCC users do)
* Makes us look good on benchmarks :)

A recent SPEC2k6 comparison Linaro did for AArch64, enabling
-ffp-contract=fast took the edge of GCC in a number of cases and in
some of them made them comparable in performance. So, any reasons not
to?

If we go with it, we need to first finish the job that Sebastian was
dong on the test-suite, then just turn it on by default. A second
stage would be to add tests/benchmarks that explicitly test FP
precision, so that we have some extra guarantee that we're doing the
right thing.

Opinions?

cheers,
--renato
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] FP Contract = fast?

Boris Kolpackov via cfe-dev
On 03/15/2017 08:27 AM, Renato Golin wrote:

> Folks,
>
> I've been asking around people about the state of FP contract, which
> seems to be "on" but it's not really behaving like it, at least not as
> I would expect:
>
> int foo(float a, float b, float c) { return a*b+c; }
>
> $ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=on -o -
> (...)
> fmul s0, s0, s1
> fadd s0, s0, s2
> (...)

When you reverted  r282259 in 282289, you also reverted the functional
fix to make the command-line option actually work. Right now it is
broken. Regardless of what else we do, we should fix this (we should
probably recommit r282259, with the default flipped, to pick up the
fixes). If you were to change your source file to:

#pragma STDC FP_CONTRACT ON
int foo(float a, float b, float c) { return a*b+c; }

Then running:

$ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=on -o -
(...)
fmadd    s0, s0, s1, s2
(...)

>
> $ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=fast -o -
> (...)
> fmadd s0, s0, s1, s2
> (...)
>
> I'm not sure this works in Fortran either, but defaulting to "on" when
> (I believe) the language should allow contraction and not doing it is
> not a good default.
>
> i haven't worked out what would be necessary to make it work on a
> case-by-case basis (what kinds of fusions does C allow?) to make sure
> we don't do all or nothing, but if we don't want to start that
> conversation now, then I'd recommend we just turn it all the way to 11
> (like GCC) and let people turn it off if they really mean it.
>
> The rationale is that:
>
> * Contracted operations increase precision (less rounding steps)
> * It performs equal or faster on all architectures I know (true everywhere?)
> * Users already expect that (certainly, GCC users do)
> * Makes us look good on benchmarks :)
>
> A recent SPEC2k6 comparison Linaro did for AArch64, enabling
> -ffp-contract=fast took the edge of GCC in a number of cases and in
> some of them made them comparable in performance. So, any reasons not
> to?
>
> If we go with it, we need to first finish the job that Sebastian was
> dong on the test-suite, then just turn it on by default. A second
> stage would be to add tests/benchmarks that explicitly test FP
> precision, so that we have some extra guarantee that we're doing the
> right thing.
>
> Opinions?

I'm certainly in favor of this plan. My users generally find our current
defaults confusing because it differs from all of our other compilers
(GCC and vendor compilers), plus it gives poor performance.

Thanks again,
Hal

>
> cheers,
> --renato

--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] FP Contract = fast?

Boris Kolpackov via cfe-dev


On 03/15/2017 09:58 AM, Hal Finkel wrote:

> On 03/15/2017 08:27 AM, Renato Golin wrote:
>
>> Folks,
>>
>> I've been asking around people about the state of FP contract, which
>> seems to be "on" but it's not really behaving like it, at least not as
>> I would expect:
>>
>> int foo(float a, float b, float c) { return a*b+c; }
>>
>> $ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=on -o -
>> (...)
>> fmul s0, s0, s1
>> fadd s0, s0, s2
>> (...)
>
> When you reverted  r282259 in 282289, you also reverted the functional
> fix to make the command-line option actually work. Right now it is
> broken. Regardless of what else we do, we should fix this (we should
> probably recommit r282259, with the default flipped, to pick up the
> fixes). If you were to change your source file to:
>
> #pragma STDC FP_CONTRACT ON
> int foo(float a, float b, float c) { return a*b+c; }
>
I would like to see https://reviews.llvm.org/rL282259 re-enabled,
and the few miscompares left in the aarch64 run of the test-suite
fixed by adding the FP_CONTRACT pragma in the source code.

The commit log of r282259 states the problem that it fixed:
> Clang has the default FP contraction setting of  “-ffp-contract=on”, which
> doesn't really mean “on” in the conventional  sense of the word, but rather
> really means “according to the per-statement  effective value of the relevant
> pragma”.

Thanks,
Sebastian
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] FP Contract = fast?

Boris Kolpackov via cfe-dev
In reply to this post by Boris Kolpackov via cfe-dev
On 15 March 2017 at 14:58, Hal Finkel <[hidden email]> wrote:
> When you reverted  r282259 in 282289, you also reverted the functional fix
> to make the command-line option actually work. Right now it is broken.
> Regardless of what else we do, we should fix this (we should probably
> recommit r282259, with the default flipped, to pick up the fixes).

We should run check-all and the test-suite with it on/fast before
flipping the switch, and make sure that the behaviour Sebastian
encountered is dealt with before this going live, or we'd be breaking
too many test-suites and reverting and reapplying too often.

But I'm certainly in favour of the plan to make it on/fast by default.


> If you were to change your source file to:
>
> #pragma STDC FP_CONTRACT ON
> int foo(float a, float b, float c) { return a*b+c; }

I wasn't aware you needed the pragma for -ffp-contract=on. I assumed
it would enable on all fp-math that the standard allowed (thus maybe
needing some annotation on the operations).

I thought that the pragma was to avoid using the command line argument...


> I'm certainly in favor of this plan. My users generally find our current
> defaults confusing because it differs from all of our other compilers (GCC
> and vendor compilers), plus it gives poor performance.

Right, I also agree to follow the principle of least surprise, we
should default to "fast". Let's just make sure the infrastructure
isn't going to crumble and do it.

cheers,
--renato
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] FP Contract = fast?

Boris Kolpackov via cfe-dev
In reply to this post by Boris Kolpackov via cfe-dev
On 15 March 2017 at 15:35, Sebastian Pop <[hidden email]> wrote:
> I would like to see https://reviews.llvm.org/rL282259 re-enabled,
> and the few miscompares left in the aarch64 run of the test-suite
> fixed by adding the FP_CONTRACT pragma in the source code.

Great! Do you have a list of the remaining cases?

cheers,
--renato
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] [RFC] FP Contract = fast?

Boris Kolpackov via cfe-dev
In reply to this post by Boris Kolpackov via cfe-dev
We should default to pragma STDC FP_CONTRACT ON, not ‘fast’.

‘on’ is the most aggressive mode that conforms to C11. ‘fast’ should be opt-in (like ‘fast-math’ is).

– Steve

On Mar 15, 2017, at 12:35 PM, Renato Golin via llvm-dev <[hidden email]> wrote:

Right, I also agree to follow the principle of least surprise, we
should default to "fast". Let's just make sure the infrastructure
isn't going to crumble and do it.


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] [RFC] FP Contract = fast?

Boris Kolpackov via cfe-dev


On 03/15/2017 11:39 AM, Stephen Canon wrote:
We should default to pragma STDC FP_CONTRACT ON, not ‘fast’.

‘on’ is the most aggressive mode that conforms to C11. ‘fast’ should be opt-in (like ‘fast-math’ is).

Why? Other compilers default to 'fast', and 'on' does not play well with C++ code (where inlining is really important, and so the statement boundaries that restrict contraction encourage bad coding style).

 -Hal


– Steve

On Mar 15, 2017, at 12:35 PM, Renato Golin via llvm-dev <[hidden email]> wrote:

Right, I also agree to follow the principle of least surprise, we
should default to "fast". Let's just make sure the infrastructure
isn't going to crumble and do it.


-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] FP Contract = fast?

Boris Kolpackov via cfe-dev
In reply to this post by Boris Kolpackov via cfe-dev

On 03/15/2017 11:35 AM, Renato Golin wrote:

> On 15 March 2017 at 14:58, Hal Finkel <[hidden email]> wrote:
>> When you reverted  r282259 in 282289, you also reverted the functional fix
>> to make the command-line option actually work. Right now it is broken.
>> Regardless of what else we do, we should fix this (we should probably
>> recommit r282259, with the default flipped, to pick up the fixes).
> We should run check-all and the test-suite with it on/fast before
> flipping the switch, and make sure that the behaviour Sebastian
> encountered is dealt with before this going live, or we'd be breaking
> too many test-suites and reverting and reapplying too often.
>
> But I'm certainly in favour of the plan to make it on/fast by default.
>
>
>> If you were to change your source file to:
>>
>> #pragma STDC FP_CONTRACT ON
>> int foo(float a, float b, float c) { return a*b+c; }
> I wasn't aware you needed the pragma for -ffp-contract=on. I assumed
> it would enable on all fp-math that the standard allowed (thus maybe
> needing some annotation on the operations).
>
> I thought that the pragma was to avoid using the command line argument...

You shouldn't; this is a bug.

  -Hal

>
>
>> I'm certainly in favor of this plan. My users generally find our current
>> defaults confusing because it differs from all of our other compilers (GCC
>> and vendor compilers), plus it gives poor performance.
> Right, I also agree to follow the principle of least surprise, we
> should default to "fast". Let's just make sure the infrastructure
> isn't going to crumble and do it.
>
> cheers,
> --renato

--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] [RFC] FP Contract = fast?

Boris Kolpackov via cfe-dev
In reply to this post by Boris Kolpackov via cfe-dev
On Mar 15, 2017, at 12:52 PM, Hal Finkel <[hidden email]> wrote:

On 03/15/2017 11:39 AM, Stephen Canon wrote:

We should default to pragma STDC FP_CONTRACT ON, not ‘fast’.

‘on’ is the most aggressive mode that conforms to C11. ‘fast’ should be opt-in (like ‘fast-math’ is).

Why? Other compilers default to 'fast', and 'on' does not play well with C++ code (where inlining is really important, and so the statement boundaries that restrict contraction encourage bad coding style).

I’m fine with defaulting to fast in C++ (it’s allowed by the C++ arithmetic model IIRC).

Personally, I think the bar for not defaulting to the standards-conforming mode should be high. In particular, ‘fast’ doesn’t respect the pragma control, so it is unsafe—if someone copy-pastes code that uses the documented C semantics into a file compiled with ‘fast' set, they will get incorrect behavior, even if they use the pragmas.

“Other compilers are doing it” is not convincing, at all, especially where hard-to-analyze floating-point rounding behavior is concerned. GCC defaulted to preserving extra precision on x87 for decades, and that wasn’t a good idea either. We can do better.

– Steve

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] [RFC] FP Contract = fast?

Boris Kolpackov via cfe-dev


On 03/15/2017 12:01 PM, Stephen Canon wrote:
On Mar 15, 2017, at 12:52 PM, Hal Finkel <[hidden email]> wrote:

On 03/15/2017 11:39 AM, Stephen Canon wrote:

We should default to pragma STDC FP_CONTRACT ON, not ‘fast’.

‘on’ is the most aggressive mode that conforms to C11. ‘fast’ should be opt-in (like ‘fast-math’ is).

Why? Other compilers default to 'fast', and 'on' does not play well with C++ code (where inlining is really important, and so the statement boundaries that restrict contraction encourage bad coding style).

I’m fine with defaulting to fast in C++ (it’s allowed by the C++ arithmetic model IIRC).

Personally, I think the bar for not defaulting to the standards-conforming mode should be high. In particular, ‘fast’ doesn’t respect the pragma control, so it is unsafe—if someone copy-pastes code that uses the documented C semantics into a file compiled with ‘fast' set, they will get incorrect behavior, even if they use the pragmas.

Oh, good point. We can and should fix this now. All relevant IR instructions now support fast-math flags (arithmetic, function calls, etc.). We should have a flag to allow contractions and use it. This will allow the pragmas to work correctly.


“Other compilers are doing it” is not convincing, at all, especially where hard-to-analyze floating-point rounding behavior is concerned. GCC defaulted to preserving extra precision on x87 for decades, and that wasn’t a good idea either. We can do better.

In general, I agree. I think, however, that in this case the other compilers are right. The statement-limited model is something that, in my experience, users find surprising and unhelpful. Nevertheless, I'm fine with having different defaults here for C and C++ (although somewhat sad) is we want to draw the line at having a confirming implementation by default.

 -Hal


– Steve

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] FP Contract = fast?

Boris Kolpackov via cfe-dev
In reply to this post by Boris Kolpackov via cfe-dev
Relevant to this discussion is http://bugs.llvm.org/show_bug.cgi?id=25721 (-ffp-contract=fast does not work with LTO).  I am working on adding function attributes for fp-contract=fast which should fix this.

Also now that we have backend optimization remarks, I am planning to report missed optimization when we can’t fuse FMAs due “fast” not being on.  This will show up in the opt-viewer.  Then the user can opt in either with the command-line switch or the new function attribute.

Adam

On Mar 15, 2017, at 6:27 AM, Renato Golin via cfe-dev <[hidden email]> wrote:

Folks,

I've been asking around people about the state of FP contract, which
seems to be "on" but it's not really behaving like it, at least not as
I would expect:

int foo(float a, float b, float c) { return a*b+c; }

$ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=on -o -
(...)
fmul s0, s0, s1
fadd s0, s0, s2
(...)

$ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=fast -o -
(...)
fmadd s0, s0, s1, s2
(...)

I'm not sure this works in Fortran either, but defaulting to "on" when
(I believe) the language should allow contraction and not doing it is
not a good default.

i haven't worked out what would be necessary to make it work on a
case-by-case basis (what kinds of fusions does C allow?) to make sure
we don't do all or nothing, but if we don't want to start that
conversation now, then I'd recommend we just turn it all the way to 11
(like GCC) and let people turn it off if they really mean it.

The rationale is that:

* Contracted operations increase precision (less rounding steps)
* It performs equal or faster on all architectures I know (true everywhere?)
* Users already expect that (certainly, GCC users do)
* Makes us look good on benchmarks :)

A recent SPEC2k6 comparison Linaro did for AArch64, enabling
-ffp-contract=fast took the edge of GCC in a number of cases and in
some of them made them comparable in performance. So, any reasons not
to?

If we go with it, we need to first finish the job that Sebastian was
dong on the test-suite, then just turn it on by default. A second
stage would be to add tests/benchmarks that explicitly test FP
precision, so that we have some extra guarantee that we're doing the
right thing.

Opinions?

cheers,
--renato
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] [RFC] FP Contract = fast?

Boris Kolpackov via cfe-dev


On 03/15/2017 12:10 PM, Adam Nemet via llvm-dev wrote:
Relevant to this discussion is http://bugs.llvm.org/show_bug.cgi?id=25721 (-ffp-contract=fast does not work with LTO).  I am working on adding function attributes for fp-contract=fast which should fix this.

Great!


Also now that we have backend optimization remarks, I am planning to report missed optimization when we can’t fuse FMAs due “fast” not being on.  This will show up in the opt-viewer.  Then the user can opt in either with the command-line switch or the new function attribute.

That seems useful.

Thanks again,
Hal


Adam

On Mar 15, 2017, at 6:27 AM, Renato Golin via cfe-dev <[hidden email]> wrote:

Folks,

I've been asking around people about the state of FP contract, which
seems to be "on" but it's not really behaving like it, at least not as
I would expect:

int foo(float a, float b, float c) { return a*b+c; }

$ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=on -o -
(...)
fmul s0, s0, s1
fadd s0, s0, s2
(...)

$ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=fast -o -
(...)
fmadd s0, s0, s1, s2
(...)

I'm not sure this works in Fortran either, but defaulting to "on" when
(I believe) the language should allow contraction and not doing it is
not a good default.

i haven't worked out what would be necessary to make it work on a
case-by-case basis (what kinds of fusions does C allow?) to make sure
we don't do all or nothing, but if we don't want to start that
conversation now, then I'd recommend we just turn it all the way to 11
(like GCC) and let people turn it off if they really mean it.

The rationale is that:

* Contracted operations increase precision (less rounding steps)
* It performs equal or faster on all architectures I know (true everywhere?)
* Users already expect that (certainly, GCC users do)
* Makes us look good on benchmarks :)

A recent SPEC2k6 comparison Linaro did for AArch64, enabling
-ffp-contract=fast took the edge of GCC in a number of cases and in
some of them made them comparable in performance. So, any reasons not
to?

If we go with it, we need to first finish the job that Sebastian was
dong on the test-suite, then just turn it on by default. A second
stage would be to add tests/benchmarks that explicitly test FP
precision, so that we have some extra guarantee that we're doing the
right thing.

Opinions?

cheers,
--renato
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev



_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] [RFC] FP Contract = fast?

Boris Kolpackov via cfe-dev

On Mar 15, 2017, at 10:13 AM, Hal Finkel via cfe-dev <[hidden email]> wrote:


On 03/15/2017 12:10 PM, Adam Nemet via llvm-dev wrote:
Relevant to this discussion is http://bugs.llvm.org/show_bug.cgi?id=25721 (-ffp-contract=fast does not work with LTO).  I am working on adding function attributes for fp-contract=fast which should fix this.

Great!


A function attribute would be a strict improvement over today: LLVM can’t do contraction today. But actually I’m not sure if it is the long term right choice: attributes don’t combine well with inlining for instance. You mentioned FMF earlier, why don’t we have a FMF to allow contraction?

Also, IIUC, the function attribute as well as a FMF wouldn’t apply to the “ON” setting but only to the “FAST” mode (no way to distinguish source level statement in llvm IR).

— 
Mehdi






Also now that we have backend optimization remarks, I am planning to report missed optimization when we can’t fuse FMAs due “fast” not being on.  This will show up in the opt-viewer.  Then the user can opt in either with the command-line switch or the new function attribute.

That seems useful.

Thanks again,
Hal


Adam

On Mar 15, 2017, at 6:27 AM, Renato Golin via cfe-dev <[hidden email]> wrote:

Folks,

I've been asking around people about the state of FP contract, which
seems to be "on" but it's not really behaving like it, at least not as
I would expect:

int foo(float a, float b, float c) { return a*b+c; }

$ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=on -o -
(...)
fmul s0, s0, s1
fadd s0, s0, s2
(...)

$ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=fast -o -
(...)
fmadd s0, s0, s1, s2
(...)

I'm not sure this works in Fortran either, but defaulting to "on" when
(I believe) the language should allow contraction and not doing it is
not a good default.

i haven't worked out what would be necessary to make it work on a
case-by-case basis (what kinds of fusions does C allow?) to make sure
we don't do all or nothing, but if we don't want to start that
conversation now, then I'd recommend we just turn it all the way to 11
(like GCC) and let people turn it off if they really mean it.

The rationale is that:

* Contracted operations increase precision (less rounding steps)
* It performs equal or faster on all architectures I know (true everywhere?)
* Users already expect that (certainly, GCC users do)
* Makes us look good on benchmarks :)

A recent SPEC2k6 comparison Linaro did for AArch64, enabling
-ffp-contract=fast took the edge of GCC in a number of cases and in
some of them made them comparable in performance. So, any reasons not
to?

If we go with it, we need to first finish the job that Sebastian was
dong on the test-suite, then just turn it on by default. A second
stage would be to add tests/benchmarks that explicitly test FP
precision, so that we have some extra guarantee that we're doing the
right thing.

Opinions?

cheers,
--renato
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev



_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] [RFC] FP Contract = fast?

Boris Kolpackov via cfe-dev

On Mar 15, 2017, at 11:36 AM, Mehdi Amini <[hidden email]> wrote:


On Mar 15, 2017, at 10:13 AM, Hal Finkel via cfe-dev <[hidden email]> wrote:


On 03/15/2017 12:10 PM, Adam Nemet via llvm-dev wrote:
Relevant to this discussion is http://bugs.llvm.org/show_bug.cgi?id=25721 (-ffp-contract=fast does not work with LTO).  I am working on adding function attributes for fp-contract=fast which should fix this.

Great!


A function attribute would be a strict improvement over today: LLVM can’t do contraction today. But actually I’m not sure if it is the long term right choice: attributes don’t combine well with inlining for instance. You mentioned FMF earlier, why don’t we have a FMF to allow contraction?

OK, I thought that the prerequisite for that was a fast-math pragma which I don’t think is something we have (I want to be able to specify contract=fast on smaller granularity than module).  But now that I think more about, we should be able to turn a user function attribute into FMF in the front-end which is the most flexible. 


Also, IIUC, the function attribute as well as a FMF wouldn’t apply to the “ON” setting but only to the “FAST” mode (no way to distinguish source level statement in llvm IR).

Yes.

Adam


— 
Mehdi






Also now that we have backend optimization remarks, I am planning to report missed optimization when we can’t fuse FMAs due “fast” not being on.  This will show up in the opt-viewer.  Then the user can opt in either with the command-line switch or the new function attribute.

That seems useful.

Thanks again,
Hal


Adam

On Mar 15, 2017, at 6:27 AM, Renato Golin via cfe-dev <[hidden email]> wrote:

Folks,

I've been asking around people about the state of FP contract, which
seems to be "on" but it's not really behaving like it, at least not as
I would expect:

int foo(float a, float b, float c) { return a*b+c; }

$ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=on -o -
(...)
fmul s0, s0, s1
fadd s0, s0, s2
(...)

$ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=fast -o -
(...)
fmadd s0, s0, s1, s2
(...)

I'm not sure this works in Fortran either, but defaulting to "on" when
(I believe) the language should allow contraction and not doing it is
not a good default.

i haven't worked out what would be necessary to make it work on a
case-by-case basis (what kinds of fusions does C allow?) to make sure
we don't do all or nothing, but if we don't want to start that
conversation now, then I'd recommend we just turn it all the way to 11
(like GCC) and let people turn it off if they really mean it.

The rationale is that:

* Contracted operations increase precision (less rounding steps)
* It performs equal or faster on all architectures I know (true everywhere?)
* Users already expect that (certainly, GCC users do)
* Makes us look good on benchmarks :)

A recent SPEC2k6 comparison Linaro did for AArch64, enabling
-ffp-contract=fast took the edge of GCC in a number of cases and in
some of them made them comparable in performance. So, any reasons not
to?

If we go with it, we need to first finish the job that Sebastian was
dong on the test-suite, then just turn it on by default. A second
stage would be to add tests/benchmarks that explicitly test FP
precision, so that we have some extra guarantee that we're doing the
right thing.

Opinions?

cheers,
--renato
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev



_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev



_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] [RFC] FP Contract = fast?

Boris Kolpackov via cfe-dev


On 03/15/2017 01:47 PM, Adam Nemet wrote:

On Mar 15, 2017, at 11:36 AM, Mehdi Amini <[hidden email]> wrote:


On Mar 15, 2017, at 10:13 AM, Hal Finkel via cfe-dev <[hidden email]> wrote:


On 03/15/2017 12:10 PM, Adam Nemet via llvm-dev wrote:
Relevant to this discussion is http://bugs.llvm.org/show_bug.cgi?id=25721 (-ffp-contract=fast does not work with LTO).  I am working on adding function attributes for fp-contract=fast which should fix this.

Great!


A function attribute would be a strict improvement over today: LLVM can’t do contraction today. But actually I’m not sure if it is the long term right choice: attributes don’t combine well with inlining for instance. You mentioned FMF earlier, why don’t we have a FMF to allow contraction?

OK, I thought that the prerequisite for that was a fast-math pragma which I don’t think is something we have (I want to be able to specify contract=fast on smaller granularity than module).  But now that I think more about, we should be able to turn a user function attribute into FMF in the front-end which is the most flexible.

I agree, a FMF is the way to go and we can then control it with the pragma. We can use the STDC FP_CONTRACT pragma for contraction; I also think that having a "fast math" pragma is also a good idea (the fact that we can currently only specify fast-math settings on a translation-unit level is somewhat problematic).



Also, IIUC, the function attribute as well as a FMF wouldn’t apply to the “ON” setting but only to the “FAST” mode (no way to distinguish source level statement in llvm IR).

Right. We still have the existing fmuladd intrinsic method for dealing with the "ON" setting.


Yes.

Adam


— 
Mehdi






Also now that we have backend optimization remarks, I am planning to report missed optimization when we can’t fuse FMAs due “fast” not being on.  This will show up in the opt-viewer.  Then the user can opt in either with the command-line switch or the new function attribute.

That seems useful.

Thanks again,
Hal


Adam

On Mar 15, 2017, at 6:27 AM, Renato Golin via cfe-dev <[hidden email]> wrote:

Folks,

I've been asking around people about the state of FP contract, which
seems to be "on" but it's not really behaving like it, at least not as
I would expect:

int foo(float a, float b, float c) { return a*b+c; }

$ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=on -o -
(...)
fmul s0, s0, s1
fadd s0, s0, s2
(...)

$ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=fast -o -
(...)
fmadd s0, s0, s1, s2
(...)

I'm not sure this works in Fortran either, but defaulting to "on" when
(I believe) the language should allow contraction and not doing it is
not a good default.

i haven't worked out what would be necessary to make it work on a
case-by-case basis (what kinds of fusions does C allow?) to make sure
we don't do all or nothing, but if we don't want to start that
conversation now, then I'd recommend we just turn it all the way to 11
(like GCC) and let people turn it off if they really mean it.

The rationale is that:

* Contracted operations increase precision (less rounding steps)
* It performs equal or faster on all architectures I know (true everywhere?)
* Users already expect that (certainly, GCC users do)
* Makes us look good on benchmarks :)

A recent SPEC2k6 comparison Linaro did for AArch64, enabling
-ffp-contract=fast took the edge of GCC in a number of cases and in
some of them made them comparable in performance. So, any reasons not
to?

If we go with it, we need to first finish the job that Sebastian was
dong on the test-suite, then just turn it on by default. A second
stage would be to add tests/benchmarks that explicitly test FP
precision, so that we have some extra guarantee that we're doing the
right thing.

Opinions?

cheers,
--renato
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev



_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev



-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] [RFC] FP Contract = fast?

Boris Kolpackov via cfe-dev

On Mar 15, 2017, at 2:00 PM, Hal Finkel <[hidden email]> wrote:


On 03/15/2017 01:47 PM, Adam Nemet wrote:

On Mar 15, 2017, at 11:36 AM, Mehdi Amini <[hidden email]> wrote:


On Mar 15, 2017, at 10:13 AM, Hal Finkel via cfe-dev <[hidden email]> wrote:


On 03/15/2017 12:10 PM, Adam Nemet via llvm-dev wrote:
Relevant to this discussion is http://bugs.llvm.org/show_bug.cgi?id=25721 (-ffp-contract=fast does not work with LTO).  I am working on adding function attributes for fp-contract=fast which should fix this.

Great!


A function attribute would be a strict improvement over today: LLVM can’t do contraction today. But actually I’m not sure if it is the long term right choice: attributes don’t combine well with inlining for instance. You mentioned FMF earlier, why don’t we have a FMF to allow contraction?

OK, I thought that the prerequisite for that was a fast-math pragma which I don’t think is something we have (I want to be able to specify contract=fast on smaller granularity than module).  But now that I think more about, we should be able to turn a user function attribute into FMF in the front-end which is the most flexible.

I agree, a FMF is the way to go and we can then control it with the pragma. We can use the STDC FP_CONTRACT pragma for contraction;

Just to confirm, do you mean to introduce a “fast” option to the pragma, e.g.:

#pragma STDC FP_CONTRACT FAST

Thanks,
Adam

I also think that having a "fast math" pragma is also a good idea (the fact that we can currently only specify fast-math settings on a translation-unit level is somewhat problematic).



Also, IIUC, the function attribute as well as a FMF wouldn’t apply to the “ON” setting but only to the “FAST” mode (no way to distinguish source level statement in llvm IR).

Right. We still have the existing fmuladd intrinsic method for dealing with the "ON" setting.


Yes.

Adam


— 
Mehdi






Also now that we have backend optimization remarks, I am planning to report missed optimization when we can’t fuse FMAs due “fast” not being on.  This will show up in the opt-viewer.  Then the user can opt in either with the command-line switch or the new function attribute.

That seems useful.

Thanks again,
Hal


Adam

On Mar 15, 2017, at 6:27 AM, Renato Golin via cfe-dev <[hidden email]> wrote:

Folks,

I've been asking around people about the state of FP contract, which
seems to be "on" but it's not really behaving like it, at least not as
I would expect:

int foo(float a, float b, float c) { return a*b+c; }

$ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=on -o -
(...)
fmul s0, s0, s1
fadd s0, s0, s2
(...)

$ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=fast -o -
(...)
fmadd s0, s0, s1, s2
(...)

I'm not sure this works in Fortran either, but defaulting to "on" when
(I believe) the language should allow contraction and not doing it is
not a good default.

i haven't worked out what would be necessary to make it work on a
case-by-case basis (what kinds of fusions does C allow?) to make sure
we don't do all or nothing, but if we don't want to start that
conversation now, then I'd recommend we just turn it all the way to 11
(like GCC) and let people turn it off if they really mean it.

The rationale is that:

* Contracted operations increase precision (less rounding steps)
* It performs equal or faster on all architectures I know (true everywhere?)
* Users already expect that (certainly, GCC users do)
* Makes us look good on benchmarks :)

A recent SPEC2k6 comparison Linaro did for AArch64, enabling
-ffp-contract=fast took the edge of GCC in a number of cases and in
some of them made them comparable in performance. So, any reasons not
to?

If we go with it, we need to first finish the job that Sebastian was
dong on the test-suite, then just turn it on by default. A second
stage would be to add tests/benchmarks that explicitly test FP
precision, so that we have some extra guarantee that we're doing the
right thing.

Opinions?

cheers,
--renato
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev



_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev



-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] [RFC] FP Contract = fast?

Boris Kolpackov via cfe-dev


On 03/15/2017 04:05 PM, Adam Nemet wrote:

On Mar 15, 2017, at 2:00 PM, Hal Finkel <[hidden email]> wrote:


On 03/15/2017 01:47 PM, Adam Nemet wrote:

On Mar 15, 2017, at 11:36 AM, Mehdi Amini <[hidden email]> wrote:


On Mar 15, 2017, at 10:13 AM, Hal Finkel via cfe-dev <[hidden email]> wrote:


On 03/15/2017 12:10 PM, Adam Nemet via llvm-dev wrote:
Relevant to this discussion is http://bugs.llvm.org/show_bug.cgi?id=25721 (-ffp-contract=fast does not work with LTO).  I am working on adding function attributes for fp-contract=fast which should fix this.

Great!


A function attribute would be a strict improvement over today: LLVM can’t do contraction today. But actually I’m not sure if it is the long term right choice: attributes don’t combine well with inlining for instance. You mentioned FMF earlier, why don’t we have a FMF to allow contraction?

OK, I thought that the prerequisite for that was a fast-math pragma which I don’t think is something we have (I want to be able to specify contract=fast on smaller granularity than module).  But now that I think more about, we should be able to turn a user function attribute into FMF in the front-end which is the most flexible.

I agree, a FMF is the way to go and we can then control it with the pragma. We can use the STDC FP_CONTRACT pragma for contraction;

Just to confirm, do you mean to introduce a “fast” option to the pragma, e.g.:

#pragma STDC FP_CONTRACT FAST

That's a good point. If we don't add something like this, then we'd be able to turn the fast mode off with the pragma, but then not be able to turn it back on ;)

So, yes, except that I'm somewhat hesitant to invade the 'STDC' space with vendor extensions. If we generally introduce a pragma to control FMFs, maybe we should just use that instead? I don't have a clear idea on the syntax, but for example, if we had some pragma that let us do

#pragma clang fast_math or #pragma clang fast_math nnan(off) contract(on) or whatever then we could use that. What do you think?

 -Hal


Thanks,
Adam

I also think that having a "fast math" pragma is also a good idea (the fact that we can currently only specify fast-math settings on a translation-unit level is somewhat problematic).



Also, IIUC, the function attribute as well as a FMF wouldn’t apply to the “ON” setting but only to the “FAST” mode (no way to distinguish source level statement in llvm IR).

Right. We still have the existing fmuladd intrinsic method for dealing with the "ON" setting.


Yes.

Adam


— 
Mehdi






Also now that we have backend optimization remarks, I am planning to report missed optimization when we can’t fuse FMAs due “fast” not being on.  This will show up in the opt-viewer.  Then the user can opt in either with the command-line switch or the new function attribute.

That seems useful.

Thanks again,
Hal


Adam

On Mar 15, 2017, at 6:27 AM, Renato Golin via cfe-dev <[hidden email]> wrote:

Folks,

I've been asking around people about the state of FP contract, which
seems to be "on" but it's not really behaving like it, at least not as
I would expect:

int foo(float a, float b, float c) { return a*b+c; }

$ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=on -o -
(...)
fmul s0, s0, s1
fadd s0, s0, s2
(...)

$ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=fast -o -
(...)
fmadd s0, s0, s1, s2
(...)

I'm not sure this works in Fortran either, but defaulting to "on" when
(I believe) the language should allow contraction and not doing it is
not a good default.

i haven't worked out what would be necessary to make it work on a
case-by-case basis (what kinds of fusions does C allow?) to make sure
we don't do all or nothing, but if we don't want to start that
conversation now, then I'd recommend we just turn it all the way to 11
(like GCC) and let people turn it off if they really mean it.

The rationale is that:

* Contracted operations increase precision (less rounding steps)
* It performs equal or faster on all architectures I know (true everywhere?)
* Users already expect that (certainly, GCC users do)
* Makes us look good on benchmarks :)

A recent SPEC2k6 comparison Linaro did for AArch64, enabling
-ffp-contract=fast took the edge of GCC in a number of cases and in
some of them made them comparable in performance. So, any reasons not
to?

If we go with it, we need to first finish the job that Sebastian was
dong on the test-suite, then just turn it on by default. A second
stage would be to add tests/benchmarks that explicitly test FP
precision, so that we have some extra guarantee that we're doing the
right thing.

Opinions?

cheers,
--renato
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev



_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev



-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory


-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] [RFC] FP Contract = fast?

Boris Kolpackov via cfe-dev

On Mar 15, 2017, at 2:30 PM, Hal Finkel <[hidden email]> wrote:


On 03/15/2017 04:05 PM, Adam Nemet wrote:

On Mar 15, 2017, at 2:00 PM, Hal Finkel <[hidden email]> wrote:


On 03/15/2017 01:47 PM, Adam Nemet wrote:

On Mar 15, 2017, at 11:36 AM, Mehdi Amini <[hidden email]> wrote:


On Mar 15, 2017, at 10:13 AM, Hal Finkel via cfe-dev <[hidden email]> wrote:


On 03/15/2017 12:10 PM, Adam Nemet via llvm-dev wrote:
Relevant to this discussion is http://bugs.llvm.org/show_bug.cgi?id=25721 (-ffp-contract=fast does not work with LTO).  I am working on adding function attributes for fp-contract=fast which should fix this.

Great!


A function attribute would be a strict improvement over today: LLVM can’t do contraction today. But actually I’m not sure if it is the long term right choice: attributes don’t combine well with inlining for instance. You mentioned FMF earlier, why don’t we have a FMF to allow contraction?

OK, I thought that the prerequisite for that was a fast-math pragma which I don’t think is something we have (I want to be able to specify contract=fast on smaller granularity than module).  But now that I think more about, we should be able to turn a user function attribute into FMF in the front-end which is the most flexible.

I agree, a FMF is the way to go and we can then control it with the pragma. We can use the STDC FP_CONTRACT pragma for contraction;

Just to confirm, do you mean to introduce a “fast” option to the pragma, e.g.:

#pragma STDC FP_CONTRACT FAST

That's a good point. If we don't add something like this, then we'd be able to turn the fast mode off with the pragma, but then not be able to turn it back on ;)

So, yes, except that I'm somewhat hesitant to invade the 'STDC' space with vendor extensions. If we generally introduce a pragma to control FMFs, maybe we should just use that instead? I don't have a clear idea on the syntax, but for example, if we had some pragma that let us do

#pragma clang fast_math or #pragma clang fast_math nnan(off) contract(on) or whatever then we could use that. What do you think?

That looks great; it nicely matches the internal representation.  Let me take a stab at this.

Adam


 -Hal


Thanks,
Adam

I also think that having a "fast math" pragma is also a good idea (the fact that we can currently only specify fast-math settings on a translation-unit level is somewhat problematic).



Also, IIUC, the function attribute as well as a FMF wouldn’t apply to the “ON” setting but only to the “FAST” mode (no way to distinguish source level statement in llvm IR).

Right. We still have the existing fmuladd intrinsic method for dealing with the "ON" setting.


Yes.

Adam


— 
Mehdi






Also now that we have backend optimization remarks, I am planning to report missed optimization when we can’t fuse FMAs due “fast” not being on.  This will show up in the opt-viewer.  Then the user can opt in either with the command-line switch or the new function attribute.

That seems useful.

Thanks again,
Hal


Adam

On Mar 15, 2017, at 6:27 AM, Renato Golin via cfe-dev <[hidden email]> wrote:

Folks,

I've been asking around people about the state of FP contract, which
seems to be "on" but it's not really behaving like it, at least not as
I would expect:

int foo(float a, float b, float c) { return a*b+c; }

$ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=on -o -
(...)
fmul s0, s0, s1
fadd s0, s0, s2
(...)

$ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=fast -o -
(...)
fmadd s0, s0, s1, s2
(...)

I'm not sure this works in Fortran either, but defaulting to "on" when
(I believe) the language should allow contraction and not doing it is
not a good default.

i haven't worked out what would be necessary to make it work on a
case-by-case basis (what kinds of fusions does C allow?) to make sure
we don't do all or nothing, but if we don't want to start that
conversation now, then I'd recommend we just turn it all the way to 11
(like GCC) and let people turn it off if they really mean it.

The rationale is that:

* Contracted operations increase precision (less rounding steps)
* It performs equal or faster on all architectures I know (true everywhere?)
* Users already expect that (certainly, GCC users do)
* Makes us look good on benchmarks :)

A recent SPEC2k6 comparison Linaro did for AArch64, enabling
-ffp-contract=fast took the edge of GCC in a number of cases and in
some of them made them comparable in performance. So, any reasons not
to?

If we go with it, we need to first finish the job that Sebastian was
dong on the test-suite, then just turn it on by default. A second
stage would be to add tests/benchmarks that explicitly test FP
precision, so that we have some extra guarantee that we're doing the
right thing.

Opinions?

cheers,
--renato
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev



_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev



-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory


-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] [RFC] FP Contract = fast?

Boris Kolpackov via cfe-dev

On Mar 15, 2017, at 2:51 PM, Adam Nemet <[hidden email]> wrote:


On Mar 15, 2017, at 2:30 PM, Hal Finkel <[hidden email]> wrote:


On 03/15/2017 04:05 PM, Adam Nemet wrote:

On Mar 15, 2017, at 2:00 PM, Hal Finkel <[hidden email]> wrote:


On 03/15/2017 01:47 PM, Adam Nemet wrote:

On Mar 15, 2017, at 11:36 AM, Mehdi Amini <[hidden email]> wrote:


On Mar 15, 2017, at 10:13 AM, Hal Finkel via cfe-dev <[hidden email]> wrote:


On 03/15/2017 12:10 PM, Adam Nemet via llvm-dev wrote:
Relevant to this discussion is http://bugs.llvm.org/show_bug.cgi?id=25721 (-ffp-contract=fast does not work with LTO).  I am working on adding function attributes for fp-contract=fast which should fix this.

Great!


A function attribute would be a strict improvement over today: LLVM can’t do contraction today. But actually I’m not sure if it is the long term right choice: attributes don’t combine well with inlining for instance. You mentioned FMF earlier, why don’t we have a FMF to allow contraction?

OK, I thought that the prerequisite for that was a fast-math pragma which I don’t think is something we have (I want to be able to specify contract=fast on smaller granularity than module).  But now that I think more about, we should be able to turn a user function attribute into FMF in the front-end which is the most flexible. 

I agree, a FMF is the way to go and we can then control it with the pragma. We can use the STDC FP_CONTRACT pragma for contraction;

Just to confirm, do you mean to introduce a “fast” option to the pragma, e.g.:

#pragma STDC FP_CONTRACT FAST

That's a good point. If we don't add something like this, then we'd be able to turn the fast mode off with the pragma, but then not be able to turn it back on ;)

So, yes, except that I'm somewhat hesitant to invade the 'STDC' space with vendor extensions. If we generally introduce a pragma to control FMFs, maybe we should just use that instead? I don't have a clear idea on the syntax, but for example, if we had some pragma that let us do

#pragma clang fast_math or #pragma clang fast_math nnan(off) contract(on) or whatever then we could use that. What do you think?

That looks great; it nicely matches the internal representation.  Let me take a stab at this.

Thinking more about this, I am back to thinking that a function attribute is the better solution for this than FMF, at least before inlining.

Consider the standard example where we want this to trigger: an overloaded addition and multiplier operator for a vector class.

In this case, we want the fadd and the fmul in the inlined functions to have the FMF as well but we don’t necessarily want to mark the overloaded operators with the pragma; we may be only comfortable contracting at this call site.

You don’t have this problem if you mark the containing function FP-contractable.  Effectively what we want is to outline the block within the pragma into a function and tag it with the attribute.

During inlining we can still transform the function attribute into FMF.

So I think I am going back to implementing fp-contract=fast as a function attribute as the first step unless there are any objections.

Adam


Adam


 -Hal


Thanks,
Adam

I also think that having a "fast math" pragma is also a good idea (the fact that we can currently only specify fast-math settings on a translation-unit level is somewhat problematic).



Also, IIUC, the function attribute as well as a FMF wouldn’t apply to the “ON” setting but only to the “FAST” mode (no way to distinguish source level statement in llvm IR).

Right. We still have the existing fmuladd intrinsic method for dealing with the "ON" setting.


Yes.

Adam


— 
Mehdi






Also now that we have backend optimization remarks, I am planning to report missed optimization when we can’t fuse FMAs due “fast” not being on.  This will show up in the opt-viewer.  Then the user can opt in either with the command-line switch or the new function attribute.

That seems useful.

Thanks again,
Hal


Adam

On Mar 15, 2017, at 6:27 AM, Renato Golin via cfe-dev <[hidden email]> wrote:

Folks,

I've been asking around people about the state of FP contract, which
seems to be "on" but it's not really behaving like it, at least not as
I would expect:

int foo(float a, float b, float c) { return a*b+c; }

$ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=on -o -
(...)
fmul s0, s0, s1
fadd s0, s0, s2
(...)

$ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=fast -o -
(...)
fmadd s0, s0, s1, s2
(...)

I'm not sure this works in Fortran either, but defaulting to "on" when
(I believe) the language should allow contraction and not doing it is
not a good default.

i haven't worked out what would be necessary to make it work on a
case-by-case basis (what kinds of fusions does C allow?) to make sure
we don't do all or nothing, but if we don't want to start that
conversation now, then I'd recommend we just turn it all the way to 11
(like GCC) and let people turn it off if they really mean it.

The rationale is that:

* Contracted operations increase precision (less rounding steps)
* It performs equal or faster on all architectures I know (true everywhere?)
* Users already expect that (certainly, GCC users do)
* Makes us look good on benchmarks :)

A recent SPEC2k6 comparison Linaro did for AArch64, enabling
-ffp-contract=fast took the edge of GCC in a number of cases and in
some of them made them comparable in performance. So, any reasons not
to?

If we go with it, we need to first finish the job that Sebastian was
dong on the test-suite, then just turn it on by default. A second
stage would be to add tests/benchmarks that explicitly test FP
precision, so that we have some extra guarantee that we're doing the
right thing.

Opinions?

cheers,
--renato
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev



_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev



-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory


-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] [RFC] FP Contract = fast?

Boris Kolpackov via cfe-dev


On 03/16/2017 04:13 PM, Adam Nemet wrote:

On Mar 15, 2017, at 2:51 PM, Adam Nemet <[hidden email]> wrote:


On Mar 15, 2017, at 2:30 PM, Hal Finkel <[hidden email]> wrote:


On 03/15/2017 04:05 PM, Adam Nemet wrote:

On Mar 15, 2017, at 2:00 PM, Hal Finkel <[hidden email]> wrote:


On 03/15/2017 01:47 PM, Adam Nemet wrote:

On Mar 15, 2017, at 11:36 AM, Mehdi Amini <[hidden email]> wrote:


On Mar 15, 2017, at 10:13 AM, Hal Finkel via cfe-dev <[hidden email]> wrote:


On 03/15/2017 12:10 PM, Adam Nemet via llvm-dev wrote:
Relevant to this discussion is http://bugs.llvm.org/show_bug.cgi?id=25721 (-ffp-contract=fast does not work with LTO).  I am working on adding function attributes for fp-contract=fast which should fix this.

Great!


A function attribute would be a strict improvement over today: LLVM can’t do contraction today. But actually I’m not sure if it is the long term right choice: attributes don’t combine well with inlining for instance. You mentioned FMF earlier, why don’t we have a FMF to allow contraction?

OK, I thought that the prerequisite for that was a fast-math pragma which I don’t think is something we have (I want to be able to specify contract=fast on smaller granularity than module).  But now that I think more about, we should be able to turn a user function attribute into FMF in the front-end which is the most flexible. 

I agree, a FMF is the way to go and we can then control it with the pragma. We can use the STDC FP_CONTRACT pragma for contraction;

Just to confirm, do you mean to introduce a “fast” option to the pragma, e.g.:

#pragma STDC FP_CONTRACT FAST

That's a good point. If we don't add something like this, then we'd be able to turn the fast mode off with the pragma, but then not be able to turn it back on ;)

So, yes, except that I'm somewhat hesitant to invade the 'STDC' space with vendor extensions. If we generally introduce a pragma to control FMFs, maybe we should just use that instead? I don't have a clear idea on the syntax, but for example, if we had some pragma that let us do

#pragma clang fast_math or #pragma clang fast_math nnan(off) contract(on) or whatever then we could use that. What do you think?

That looks great; it nicely matches the internal representation.  Let me take a stab at this.

Thinking more about this, I am back to thinking that a function attribute is the better solution for this than FMF, at least before inlining.

Consider the standard example where we want this to trigger: an overloaded addition and multiplier operator for a vector class.

In this case, we want the fadd and the fmul in the inlined functions to have the FMF as well but we don’t necessarily want to mark the overloaded operators with the pragma; we may be only comfortable contracting at this call site.

You don’t have this problem if you mark the containing function FP-contractable.  Effectively what we want is to outline the block within the pragma into a function and tag it with the attribute.

During inlining we can still transform the function attribute into FMF.

So I think I am going back to implementing fp-contract=fast as a function attribute as the first step unless there are any objections.

Are you saying this works because we don't block inlining when functions attributes don't match or update the function attributes to be more conservative when inlining? This is specifically one of the issues we were avoiding by using FMFs. Frankly, the same issue comes up with other fast-math properties, and I don't see why we should handle this differently. I think that I'd prefer you stick with the new flag.

 -Hal


Adam


Adam


 -Hal


Thanks,
Adam

I also think that having a "fast math" pragma is also a good idea (the fact that we can currently only specify fast-math settings on a translation-unit level is somewhat problematic).



Also, IIUC, the function attribute as well as a FMF wouldn’t apply to the “ON” setting but only to the “FAST” mode (no way to distinguish source level statement in llvm IR).

Right. We still have the existing fmuladd intrinsic method for dealing with the "ON" setting.


Yes.

Adam


— 
Mehdi






Also now that we have backend optimization remarks, I am planning to report missed optimization when we can’t fuse FMAs due “fast” not being on.  This will show up in the opt-viewer.  Then the user can opt in either with the command-line switch or the new function attribute.

That seems useful.

Thanks again,
Hal


Adam

On Mar 15, 2017, at 6:27 AM, Renato Golin via cfe-dev <[hidden email]> wrote:

Folks,

I've been asking around people about the state of FP contract, which
seems to be "on" but it's not really behaving like it, at least not as
I would expect:

int foo(float a, float b, float c) { return a*b+c; }

$ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=on -o -
(...)
fmul s0, s0, s1
fadd s0, s0, s2
(...)

$ clang -target aarch64-linux-gnu -O2 -S fma.c -ffp-contract=fast -o -
(...)
fmadd s0, s0, s1, s2
(...)

I'm not sure this works in Fortran either, but defaulting to "on" when
(I believe) the language should allow contraction and not doing it is
not a good default.

i haven't worked out what would be necessary to make it work on a
case-by-case basis (what kinds of fusions does C allow?) to make sure
we don't do all or nothing, but if we don't want to start that
conversation now, then I'd recommend we just turn it all the way to 11
(like GCC) and let people turn it off if they really mean it.

The rationale is that:

* Contracted operations increase precision (less rounding steps)
* It performs equal or faster on all architectures I know (true everywhere?)
* Users already expect that (certainly, GCC users do)
* Makes us look good on benchmarks :)

A recent SPEC2k6 comparison Linaro did for AArch64, enabling
-ffp-contract=fast took the edge of GCC in a number of cases and in
some of them made them comparable in performance. So, any reasons not
to?

If we go with it, we need to first finish the job that Sebastian was
dong on the test-suite, then just turn it on by default. A second
stage would be to add tests/benchmarks that explicitly test FP
precision, so that we have some extra guarantee that we're doing the
right thing.

Opinions?

cheers,
--renato
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev



_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev



-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory


-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory


-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
12