Why is #pragma STDC FENV_ACCESS not supported?

Re: [llvm-dev] Why is #pragma STDC FENV_ACCESS not supported?

Louis Dionne via cfe-dev
On 9 Jan 2018 22:55, "John McCall via llvm-dev" <[hidden email]> wrote:

On Jan 9, 2018, at 3:50 PM, Kaylor, Andrew <[hidden email]> wrote:

>The standard argument against trying to introduce "scope-like" mechanisms to LLVM IR is inlining;
>unless you're going to prevent functions that use stricter/laxer FP rules from being inlined into
>each other (which sounds disastrous), you're going to need to communicate strictness on an
>instruction-by-instruction basis.  If the backend wants to handle that by using the strictest
>rule that it sees in use anywhere in the function because pattern-matching is otherwise too
>error-prone, ok, that's its right; but the IR really should be per-instruction.
 
I added a function level attribute, strictfp, which is meant to help with this. I don’t believe the inlining handling of the attribute is implemented yet, but what I’m thinking is that we would never inline a function that had the strictfp attribute and if we inlined a non-strictfp function into a strictfp function, we would transform any FP operations into their constrained equivalents at that time. In the short term, we’d probably just block both types of inlining.
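
As a rough illustration, the inlining rules described in the paragraph above can be written as a small decision function. This is only a sketch of the policy as stated in this message, not LLVM's inliner; the enum and function names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical outcome of an inlining query; not an LLVM type. */
typedef enum {
    INLINE_OK,                /* inline as-is */
    INLINE_CONVERT_TO_STRICT, /* inline, rewriting callee FP ops as constrained intrinsics */
    INLINE_BLOCKED            /* do not inline */
} InlineDecision;

/* Longer-term policy sketched in the message:
 *  - a strictfp callee is never inlined;
 *  - a non-strictfp callee inlined into a strictfp caller has its FP
 *    operations converted to their constrained equivalents. */
InlineDecision inline_decision(bool caller_strictfp, bool callee_strictfp) {
    if (callee_strictfp) return INLINE_BLOCKED;
    if (caller_strictfp) return INLINE_CONVERT_TO_STRICT;
    return INLINE_OK;
}

/* Short-term policy: block both kinds of mixed inlining. */
InlineDecision inline_decision_short_term(bool caller_strictfp, bool callee_strictfp) {
    if (caller_strictfp || callee_strictfp) return INLINE_BLOCKED;
    return INLINE_OK;
}
```

Under either policy, two functions that both use default FP semantics inline exactly as they do today, which is the case the thread wants to leave unaffected.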
 
It may sound disastrous, but I think there’s an understanding that using strict FP semantics is going to significantly inhibit optimizations. In the short term, that’s actually the purpose of the constrained intrinsics -- to disable all optimizations until we can teach the optimizer to do things correctly. The plan is that once this is all implemented to produce correct results, we’ll go back and try to re-enable as many optimizations as possible, which may eventually include doing something more intelligent with inlining.
 
With regard to your “instruction level” comments, my intention is that the use of the intrinsics will impose the necessary restrictions at the instruction level. Optimizations (other than inlining) should never need to check the function level attribute. But if we mixed “raw” FP operations and constrained intrinsics within a single function there would be no way to prevent motion of the “raw” operations across the intrinsics.

Is that a problem?  Semantics are guaranteed only for strictfp operations, i.e. ones that use the intrinsics.  Raw operations can get reordered across intrinsics and change semantics, but that seems allowable, right?

No. Raw operations must execute in the default FP environment. It's not ok to reorder a raw fp operation into an FENV_ACCESS region and past a call that enables an FP exception on overflow, for example.

John.


 
The reason I brought up the scope level nature of the pragma was just to suggest that it might be a property that we could take advantage of to handle the transition from IR to MIR. I haven’t come up with a way to bake the strict FP information into the instructions across the ISel boundary, but I think it might be possible to temporarily add it to a block and then have an early machine code pass that used this information in some way once the MIR was all in place. I’m open to the possibility that that was a bad idea.
 
-Andy
 
From: [hidden email] [[hidden email]] 
Sent: Tuesday, January 09, 2018 11:12 AM
To: Kaylor, Andrew <[hidden email]>
Cc: Ulrich Weigand <[hidden email]>; [hidden email]; [hidden email]; [hidden email]; llvm-dev <[hidden email]>; Richard Smith <[hidden email]>; [hidden email]
Subject: Re: [cfe-dev] Why is #pragma STDC FENV_ACCESS not supported?
 
 
On Jan 9, 2018, at 1:53 PM, Kaylor, Andrew via cfe-dev <[hidden email]> wrote:
 
I think we’re going to need to create a new mechanism to communicate strict FP modes to the backend. I think we need to avoid doing anything that will require re-inventing or duplicating all of the pattern matching that goes on in instruction selection (which is the reason we’re currently dropping that information). I’m out of my depth on this transition, but I think maybe we could handle it with some kind of attribute on the MBB.
 
In C/C++, at least, it’s my understanding that the pragmas always apply at the scope-level (as opposed to having the possibility of being instruction-specific), and we’ve previously agreed that our implementation will really need to apply the rules across entire functions in the sense that if any part of a function uses the constrained intrinsics all FP operations in the function will need to use them (though different metadata arguments may be used in different scopes). So I think that opens our options a bit.
 
Regarding constant folding, I think you are correct that it isn’t happening anywhere in the backends at the moment. There is some constant folding done during instruction selection, but the existing mechanism prevents that. My concern is that given LLVM’s development model, if there is nothing in place to prevent constant folding and no consensus that it shouldn’t be allowed then we should probably believe that someone will eventually do it.
 
The standard argument against trying to introduce "scope-like" mechanisms to LLVM IR is inlining; unless you're going to prevent functions that use stricter/laxer FP rules from being inlined into each other (which sounds disastrous), you're going to need to communicate strictness on an instruction-by-instruction basis.  If the backend wants to handle that by using the strictest rule that it sees in use anywhere in the function because pattern-matching is otherwise too error-prone, ok, that's its right; but the IR really should be per-instruction.
 
John.


 
-Andy
 
From: Ulrich Weigand [[hidden email]] 
Sent: Tuesday, January 09, 2018 9:59 AM
To: Kaylor, Andrew <[hidden email]>; [hidden email]
Cc: Hal Finkel <[hidden email]>; Richard Smith <[hidden email]>; [hidden email]; [hidden email]; [hidden email]; [hidden email]; llvm-dev <[hidden email]>
Subject: Re: [cfe-dev] Why is #pragma STDC FENV_ACCESS not supported?
 
Andrew Kaylor wrote:

>In general, the current "strict FP" handling stops at instruction
>selection. At the MachineIR level we don't currently have a mechanism
>to prevent inappropriate optimizations based on floating point
>constraints, or indeed to convey such constraints to the backend.
>Implicit register use modeling may provide some restriction on some
>architectures, but this is definitely lacking for X86 targets. On the
>other hand, I'm not aware of any specific current problems, so in many
>cases we may "get lucky" and have the correct thing happen by chance.
>Obviously that's not a viable long term solution. I have a rough plan
>for adding improved register modeling to the X86 backend, which should
>take care of instruction scheduling issues, but we'd still need a
>mechanism to prevent constant folding optimizations and such.

Given that Kevin intends to target SystemZ, I'll be happy to work on the SystemZ back-end support for this feature. I agree that we should be using implicit control register dependencies, which will at least prevent moving floating-point operations across instructions that e.g. change rounding modes. However, the main property we need to model is that floating-point operations may *trap*. I guess this can be done using UnmodeledSideEffects, but I'm not quite clear on how to make this dependent on whether or not a "strict" operation is requested (without duplicating all the instruction patterns ...).


Once we do use something like UnmodeledSideEffects, I think MachineIR passes should handle everything correctly; in the end, the requirements are not really different from those of other trapping instructions. B.t.w. I don't think anybody does constant folding on floating-point constants at the MachineIR level anyway ... have you seen this anywhere?


Mit freundlichen Gruessen / Best Regards

Ulrich Weigand

-- 
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU/Linux compilers and toolchain
IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Martina Koederitz | Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht Stuttgart, HRB 243294
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev


_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev




Re: [llvm-dev] Why is #pragma STDC FENV_ACCESS not supported?

Louis Dionne via cfe-dev

On Jan 10, 2018, at 2:22 AM, Richard Smith <[hidden email]> wrote:

On 9 Jan 2018 22:55, "John McCall via llvm-dev" <[hidden email]> wrote:

On Jan 9, 2018, at 3:50 PM, Kaylor, Andrew <[hidden email]> wrote:

>The standard argument against trying to introduce "scope-like" mechanisms to LLVM IR is inlining;
>unless you're going to prevent functions that use stricter/laxer FP rules from being inlined into
>each other (which sounds disastrous), you're going to need to communicate strictness on an
>instruction-by-instruction basis.  If the backend wants to handle that by using the strictest
>rule that it sees in use anywhere in the function because pattern-matching is otherwise too
>error-prone, ok, that's its right; but the IR really should be per-instruction.
 
I added a function level attribute, strictfp, which is meant to help with this. I don’t believe the inlining handling of the attribute is implemented yet, but what I’m thinking is that we would never inline a function that had the strictfp attribute and if we inlined a non-strictfp function into a strictfp function, we would transform any FP operations into their constrained equivalents at that time. In the short term, we’d probably just block both types of inlining.
 
It may sound disastrous, but I think there’s an understanding that using strict FP semantics is going to significantly inhibit optimizations. In the short term, that’s actually the purpose of the constrained intrinsics -- to disable all optimizations until we can teach the optimizer to do things correctly. The plan is that once this is all implemented to produce correct results, we’ll go back and try to re-enable as many optimizations as possible, which may eventually include doing something more intelligent with inlining.
 
With regard to your “instruction level” comments, my intention is that the use of the intrinsics will impose the necessary restrictions at the instruction level. Optimizations (other than inlining) should never need to check the function level attribute. But if we mixed “raw” FP operations and constrained intrinsics within a single function there would be no way to prevent motion of the “raw” operations across the intrinsics.

Is that a problem?  Semantics are guaranteed only for strictfp operations, i.e. ones that use the intrinsics.  Raw operations can get reordered across intrinsics and change semantics, but that seems allowable, right?

No. Raw operations must execute in the default FP environment. It's not ok to reorder a raw fp operation into an FENV_ACCESS region and past a call that enables an FP exception on overflow, for example.

Ah, okay.  In that case, yes, I can understand why you want a stricter rule that all the operations in a strictfp function must use the intrinsics.

John.



Re: Why is #pragma STDC FENV_ACCESS not supported?

Louis Dionne via cfe-dev

Hi Andrew,

Sorry for the delay; I only now got some time to look into this a bit more. But I still have a number of questions about how to actually implement this in the back end. Looking at this bottom-up, starting with the behavior of the actual machine instructions, we have (at least on SystemZ) the following things to consider:

A) Rounding mode

Most FP arithmetic instructions use the "current rounding mode" as indicated in the floating-point control register. This is currently assumed to never change. To fix this, we need to avoid scheduling FP arithmetic instructions across instructions that modify the rounding mode. This may also imply avoiding scheduling instructions across function calls, since those may also modify the rounding mode. This can probably be done by modeling the floating-point control register as an LLVM register (or maybe model just the rounding mode bits as its own "register"), have all FP arithmetic instructions in question take this new register as implicit input, and have the register be clobbered by the instructions that change the rounding mode (and also function calls).

B) Floating-point status flags

FP instructions set a flag bit in the floating-point status register whenever an IEEE exception condition is recognized. If these flag bits are later tested by application code, we should ensure their value is unchanged by compiler optimization. Naively modeling the status register is probably overkill here: since every FP instruction would need to be considered to modify (i.e. use and def) that register, this simply has the effect of creating a dependency chain across *all* FP instructions and makes any kind of instruction scheduling impossible. But this isn't really necessary since the flag bits actually simply accumulate. So it would suffice to have special dependencies from each FP instruction separately directly to the next instruction (or routine) that reads the status flags. However, I don't really see any easy way to model this type of dependency in the back-end (in particular on the MI level).
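
The observation that the flag bits simply accumulate can be illustrated with a toy model in C. Each FP instruction is reduced to the set of flags it raises, and the status register is their bitwise OR, so the order of the instructions between a clear and a read does not affect what the read observes. The flag names and bit values below are illustrative and do not correspond to any real ISA.

```c
#include <stddef.h>

/* Toy exception flags; bit positions are made up for this sketch. */
enum { FLAG_INEXACT = 1, FLAG_OVERFLOW = 2, FLAG_DIVZERO = 4 };

/* Run a sequence of "instructions", each modeled only by the flags it raises,
 * and return what a subsequent read of the status register would see. */
unsigned run_sequence(const unsigned *raised, size_t count) {
    unsigned status = 0;          /* status register right after a clear */
    for (size_t i = 0; i < count; ++i)
        status |= raised[i];      /* flags are sticky: they only accumulate */
    return status;
}
```

Because OR is commutative, any permutation of the same instructions gives the same final status. Only the ordering relative to the instructions that read or clear the flags matters, which is the kind of dependency that is hard to express at the MI level.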

C) Floating-point exceptions

If a mask bit in the floating-point status register is set, then all FP instructions will *trap* whenever an IEEE exception condition is recognized. This means that we need to treat those instructions as having unmodeled side effects, so that they cannot be speculatively executed. Also, we cannot schedule FP instructions across instructions that set (those bits in) the FP status register -- but the latter is probably automatically done as long as those latter instructions are described as having unmodeled side effects. Note that this will in effect again create a dependency chain across all FP instructions, so that B) should be implicitly covered as well here.
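
The combined effect of A) and C) on scheduling can be sketched as a toy legality check in C. The struct and its field names are hypothetical; they only loosely echo the implicit register use/def and side-effect notions discussed here and are not LLVM's MachineInstr API.

```c
#include <stdbool.h>

/* Toy machine instruction for this sketch. */
typedef struct {
    bool uses_fpc;         /* implicitly reads the FP control register (rounding mode) */
    bool defs_fpc;         /* writes or clobbers it (mode-setting instruction or call) */
    bool has_side_effects; /* may trap, so it must not be reordered or speculated */
} ToyInstr;

/* Two adjacent instructions may be swapped only if neither writes a register
 * the other reads or writes, and they are not both side-effecting. */
bool may_swap(ToyInstr a, ToyInstr b) {
    if (a.defs_fpc && (b.uses_fpc || b.defs_fpc)) return false;
    if (b.defs_fpc && (a.uses_fpc || a.defs_fpc)) return false;
    if (a.has_side_effects && b.has_side_effects) return false;
    return true;
}
```

With only the register modeling, two FP arithmetic instructions can still be swapped freely. Once both are marked as side-effecting they are serialized, which is the dependency chain across all FP instructions mentioned above.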

Did I miss anything here? I'm assuming that the behavior on FP instructions on Intel (and other architectures) will be roughly similar, given that this behavior is mostly defined by the IEEE standard.


Now the question in my mind is, how does this all map onto the experimental constrained intrinsics? They do have "rounding mode" and "exception behavior" metadata, but I don't really see how that maps onto the behavior of instructions as described above. Also, right now the back-end doesn't even *get* at that data in the first place, since it is just thrown away when lowering the intrinsics to STRICT_... nodes. In fact, I'm also not sure how the front-end is even supposed to be *setting* those metadata flags -- is the compiler supposed to track calls to fesetround and the like, and thereby determine which rounding and exception modes apply to any given block of code? In fact, was the original intention even that the back-end actually implements different behavior based on this level of detail, or was the back-end supposed to support only two modes, the default behavior of today and a fully strict implementation always satisfying all three of A), B), and C) above?

Looking again at a possible implementation in the back-end, I'm now wondering if it wouldn't after all be better to just treat the STRICT_ opcodes like all other DAG nodes. That is, have them be associated with an action (Legal, Expand, or Custom); set the default action to Expand, with a default expander that just replaces them by the "normal" FP nodes; and allow a back-end to set the action to Legal and/or Custom and then just handle them in the back-end as it sees fit. This might indeed require multiple patterns to match them, but it should be possible to generate those via multiclass instantiations so it might not be all that big a deal. The benefit would be that it allows the back-end the greatest freedom how to handle things (e.g. interactions with target-specific control registers).
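
The proposal above can be sketched as a toy action table in C. The opcode, action, and function names are hypothetical stand-ins that mirror the terms used in this message; this is not LLVM's TargetLowering interface.

```c
/* Toy opcodes and legalization actions for this sketch. */
typedef enum { OP_FADD, OP_FMUL, OP_STRICT_FADD, OP_STRICT_FMUL, OP_COUNT } ToyOpcode;
typedef enum { ACT_LEGAL, ACT_EXPAND, ACT_CUSTOM } ToyAction;

typedef struct { ToyAction action[OP_COUNT]; } ToyTarget;

/* Defaults: ordinary FP nodes are Legal, STRICT_ nodes are Expand. */
void toy_target_init(ToyTarget *t) {
    t->action[OP_FADD] = ACT_LEGAL;
    t->action[OP_FMUL] = ACT_LEGAL;
    t->action[OP_STRICT_FADD] = ACT_EXPAND;
    t->action[OP_STRICT_FMUL] = ACT_EXPAND;
}

/* Default expander: replace a STRICT_ node by its normal counterpart. */
static ToyOpcode relax(ToyOpcode op) {
    switch (op) {
    case OP_STRICT_FADD: return OP_FADD;
    case OP_STRICT_FMUL: return OP_FMUL;
    default:             return op;
    }
}

/* Opcode that reaches instruction selection for this target. */
ToyOpcode toy_legalize(const ToyTarget *t, ToyOpcode op) {
    return t->action[op] == ACT_EXPAND ? relax(op) : op;
}
```

A back-end that does nothing keeps today's behavior, since every STRICT_ node is expanded to the normal node. A back-end that marks a STRICT_ opcode Legal or Custom receives it unchanged and has to supply its own patterns for it.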


Mit freundlichen Gruessen / Best Regards

Ulrich Weigand

--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU/Linux compilers and toolchain
IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Martina Koederitz | Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht Stuttgart, HRB 243294



Re: Why is #pragma STDC FENV_ACCESS not supported?

Louis Dionne via cfe-dev

Oh, and another thing: Are you planning to attend the upcoming LLVM developer's meeting in Bristol? I thought it might be a good idea to get all parties interested in this feature together in person, if we're at the same meeting anyway. So I was thinking of submitting a proposal for a BoF session on this topic ...


Mit freundlichen Gruessen / Best Regards

Ulrich Weigand

--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU/Linux compilers and toolchain
IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Martina Koederitz | Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht Stuttgart, HRB 243294



Re: Why is #pragma STDC FENV_ACCESS not supported?

Louis Dionne via cfe-dev
In reply to this post by Louis Dionne via cfe-dev

I think you have described the backend issues very well.

 

You are correct that Intel architecture machines behave roughly as you describe. There are some wrinkles in that status and control bits are kept in the same register and there are two such registers, one for MMX/SSE/AVX instructions and one for X87 instructions. But that is all a matter of detail; conceptually it is just as you have described.

 

It is my understanding that some LLVM backends are already modeling the FP control and status registers. The X86 backend does not. I attempted to add it last year, but I ran into some complications and backed it out. I think I know how to fix those problems now.

 

Everyone I’ve talked to up until now is happy to live with performance degradations when using non-default FP modes. The sticking point is that we’d really like to avoid doing anything that would restrict performance in the default case, which we expect to be used in the vast majority of programs. I’m not sure how much impact restricting FP scheduling in the backend would have. My intuition is that it wouldn’t be particularly significant, but it would certainly be something worth measuring.

 

You’re correct that we currently have no means of communicating the rounding mode and exception behavior to the back end. I’m reluctant to rely on Selection DAG pattern matching for the STRICT nodes because the existing pattern matching has a large number of variations. If we can re-use those patterns, I definitely want to. That’s the reason that the current implementation was written the way it was.

 

To answer your other question, I am not going to be attending the LLVM developers meeting in Bristol. I would, however, be happy to have some sort of virtual meeting to discuss this with anyone who is interested.

 

Thanks,

Andy

 

 

From: Ulrich Weigand [mailto:[hidden email]]
Sent: Friday, February 09, 2018 6:42 AM
To: Kaylor, Andrew <[hidden email]>
Cc: [hidden email]; [hidden email]; [hidden email]; Hal Finkel <[hidden email]>; [hidden email]; llvm-dev <[hidden email]>; Richard Smith <[hidden email]>
Subject: RE: [cfe-dev] Why is #pragma STDC FENV_ACCESS not supported?

 

Hi Andrew,

sorry for the delay, I only now got some time to look into this a bit more. But I still have a number of questions of how to actually implement this in the back end. Looking at this bottom-up, starting with the behavior of the actual machine instructions, we have (at least on SystemZ) the following things to consider:

A) Rounding mode

Most FP arithmetic instructions use the "current rounding mode" as indicated in the floating-point control register. This is currently assumed to never change. To fix this, we need to avoid scheduling FP arithmetic instructions across instructions that modify the rounding mode. This may also imply avoiding scheduling instructions across function calls, since those may also modify the rounding mode. This can probably be done by modeling the floating-point control register as an LLVM register (or maybe model just the rounding mode bits as its own "register"), have all FP arithmetic instructions in question take this new register as implicit input, and have the register be clobbered by the instructions that change the rounding mode (and also function calls).

B) Floating-point status flags

FP instructions set a flag bit in the floating-point status register whenever an IEEE exception condition is recognized. If these flag bits are later tested by application code, we should ensure their value is unchanged by compiler optimization. Naively modeling the status register is probably overkill here: since every FP instruction would need to be considered to modify (i.e. use and def) that register, this simply has the effect of creating a dependency chain across *all* FP instructions and makes any kind of instruction scheduling impossible. But this isn't really necessary since the flag bits actually simply accumulate. So it would suffice to have special dependencies from each FP instruction separately directly to the next instruction (or routine) that reads the status flags. However, I don't really see any easy way to model this type of dependency in the back-end (in particular on the MI level).

C) Floating-point exceptions

If a mask bit in the floating-point status register is set, then all FP instructions will *trap* whenever an IEEE exception condition is recognized. This means that we need to treat those instructions as having unmodelled side effects, so that they cannot be speculatively executed. Also, we cannot schedule FP instructions across instructions that set (those bits in) the FP status register -- but the latter is probably automatically done as long as those latter instructions are described as having unmodeled side effects. Note that this will in effect again create a dependency chain across all FP instructions, so that B) should be implicitly covered as well here.

Did I miss anything here? I'm assuming that the behavior on FP instructions on Intel (and other architectures) will be roughly similar, given that this behavior is mostly defined by the IEEE standard.


Now the question in my mind is, how does this all map onto the experimental constrained intrinsics? They do have "rounding mode" and "exception behavior" metadata, but I don't really see how that maps onto the behavior of instructions as described above. Also, right now the back-end doesn't even *get* at that data in the first place, since it is just thrown away when lowering the intrinsics to STRICT_... nodes. In fact, I'm also not sure how the front-end is even supposed to be *setting* those metadata flags -- is the compiler supposed to track calls to fesetround and the like, and thereby determine which rounding and exception modes apply to any given block of code? In fact, was the original intention even that the back-end actually implements different behavior based on this level of detail, or was the back-end supposed to support only two modes, the default behavior of today and a fully strict implementation always satisfying all three of A), B), and C) above?

Looking again at a possible implementation in the back-end, I'm now wondering if it wouldn't after all be better to just treat the STRICT_ opcodes like all other DAG nodes. That is, have them be associated with an action (Legal, Expand, or Custom); set the default action to Expand, with a default expander that just replaces them by the "normal" FP nodes; and allow a back-end to set the action to Legal and/or Custom and then just handle them in the back-end as it sees fit. This might indeed require multiple patterns to match them, but it should be possible to generate those via multiclass instantiations so it might not be all that big a deal. The benefit would be that it allows the back-end the greatest freedom how to handle things (e.g. interactions with target-specific control registers).
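As a rough illustration of the Legal/Expand/Custom idea, here is a toy model in C. None of these types or functions are LLVM's; the enums, the struct and the helper names are all invented for this sketch, and Custom is simply treated as "the target keeps the node".

```c
/* Toy model only: these enums imitate the idea, they are not LLVM's types. */
enum action { ACT_LEGAL, ACT_EXPAND, ACT_CUSTOM };
enum opcode { OP_FADD, OP_FMUL, OP_STRICT_FADD, OP_STRICT_FMUL, OP_COUNT };

struct target {
    enum action actions[OP_COUNT];
};

/* Defaults: ordinary FP nodes are Legal, STRICT_ nodes are Expand. */
void target_init(struct target *t)
{
    t->actions[OP_FADD] = ACT_LEGAL;
    t->actions[OP_FMUL] = ACT_LEGAL;
    t->actions[OP_STRICT_FADD] = ACT_EXPAND;
    t->actions[OP_STRICT_FMUL] = ACT_EXPAND;
}

/* A back-end opts in per opcode. */
void target_set_action(struct target *t, enum opcode op, enum action a)
{
    t->actions[op] = a;
}

/* Default expander: replace a STRICT_ node by its "normal" counterpart. */
enum opcode relaxed_equivalent(enum opcode op)
{
    switch (op) {
    case OP_STRICT_FADD: return OP_FADD;
    case OP_STRICT_FMUL: return OP_FMUL;
    default:             return op;
    }
}

/* The opcode the back-end finally sees for a given node. */
enum opcode legalize(const struct target *t, enum opcode op)
{
    if (t->actions[op] == ACT_EXPAND)
        return relaxed_equivalent(op);
    return op;  /* Legal or Custom: the target handles the node itself */
}
```

With this shape a back-end that does nothing keeps today's behavior, while a back-end that marks, say, OP_STRICT_FADD as Legal receives the strict node and can attach whatever control-register dependencies it needs.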


Mit freundlichen Gruessen / Best Regards

Ulrich Weigand

--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU/Linux compilers and toolchain
IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Martina Koederitz | Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht Stuttgart, HRB 243294



From: "Kaylor, Andrew" <[hidden email]>
To: Ulrich Weigand <[hidden email]>, "[hidden email]" <[hidden email]>
Cc: Hal Finkel <[hidden email]>, Richard Smith <[hidden email]>, "[hidden email]" <[hidden email]>, "[hidden email]" <[hidden email]>, "[hidden email]" <[hidden email]>, llvm-dev <[hidden email]>
Date: 09.01.2018 19:55
Subject: RE: [cfe-dev] Why is #pragma STDC FENV_ACCESS not supported?





I think we’re going to need to create a new mechanism to communicate strict FP modes to the backend. I think we need to avoid doing anything that will require re-inventing or duplicating all of the pattern matching that goes on in instruction selection (which is the reason we’re currently dropping that information). I’m out of my depth on this transition, but I think maybe we could handle it with some kind of attribute on the MBB.

In C/C++, at least, it’s my understanding that the pragmas always apply at the scope-level (as opposed to having the possibility of being instruction-specific), and we’ve previously agreed that our implementation will really need to apply the rules across entire functions in the sense that if any part of a function uses the constrained intrinsics all FP operations in the function will need to use them (though different metadata arguments may be used in different scopes). So I think that opens our options a bit.

Regarding constant folding, I think you are correct that it isn’t happening anywhere in the backends at the moment. There is some constant folding done during instruction selection, but the existing mechanism prevents that. My concern is that given LLVM’s development model, if there is nothing in place to prevent constant folding and no consensus that it shouldn’t be allowed then we should probably believe that someone will eventually do it.

-Andy

From: Ulrich Weigand [[hidden email]]
Sent: Tuesday, January 09, 2018 9:59 AM
To: Kaylor, Andrew <[hidden email]>; [hidden email]
Cc: Hal Finkel <[hidden email]>; Richard Smith <[hidden email]>; [hidden email]; [hidden email]; [hidden email]; [hidden email]; llvm-dev <[hidden email]>
Subject: Re: [cfe-dev] Why is #pragma STDC FENV_ACCESS not supported?

Andrew Kaylor wrote:

>In general, the current "strict FP" handling stops at instruction
>selection. At the MachineIR level we don't currently have a mechanism
>to prevent inappropriate optimizations based on floating point
>constraints, or indeed to convey such constraints to the backend.
>Implicit register use modeling may provide some restriction on some
>architectures, but this is definitely lacking for X86 targets. On the
>other hand, I'm not aware of any specific current problems, so in many
>cases we may "get lucky" and have the correct thing happen by chance.
>Obviously that's not a viable long term solution. I have a rough plan
>for adding improved register modeling to the X86 backend, which should
>take care of instruction scheduling issues, but we'd still need a
>mechanism to prevent constant folding optimizations and such.


Given that Kevin intends to target SystemZ, I'll be happy to work on the SystemZ back-end support for this feature. I agree that we should be using implicit control register dependencies, which will at least prevent moving floating-point operations across instructions that e.g. change rounding modes. However, the main property we need to model is that floating-point operations may *trap*. I guess this can be done using UnmodeledSideEffects, but I'm not quite clear on how to make this dependent on whether or not a "strict" operation is requested (without duplicating all the instruction patterns ...).


Once we do use something like UnmodeledSideEffects, I think MachineIR passes should handle everything correctly; in the end, the requirements are not really different from those of other trapping instructions. B.t.w. I don't think anybody does constant folding on floating-point constants at the MachineIR level anyway ... have you seen this anywhere?



Mit freundlichen Gruessen / Best Regards

Ulrich Weigand

--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU/Linux compilers and toolchain
IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Martina Koederitz | Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht Stuttgart, HRB 243294

 



Re: Why is #pragma STDC FENV_ACCESS not supported?

Louis Dionne via cfe-dev
In reply to this post by Louis Dionne via cfe-dev
I'm working with Andrew on D43515 right now, and some of these unanswered
questions are directly relevant to that patch. So....

On Fri, Feb 09, 2018 at 03:42:20PM +0100, Ulrich Weigand wrote:
>    C) Floating-point exceptions
>    If a mask bit in the floating-point status register is set, then all FP
>    instructions will *trap* whenever an IEEE exception condition is
>    recognized. This means that we need to treat those instructions as
>    having unmodelled side effects, so that they cannot be speculatively
>    executed. Also, we cannot schedule FP instructions across instructions

Does this mean that the problems with the default expansion of ISD::FP_TO_UINT
would be solved by the backend knowing that it should model traps?

In D43515 the issue of what to do with STRICT_FP_TO_UINT is still unsolved.

>    that set (those bits in) the FP status register -- but the latter is
>    probably automatically done as long as those latter instructions are
>    described as having unmodeled side effects. Note that this will in
>    effect again create a dependency chain across all FP instructions, so
>    that B) should be implicitly covered as well here.
>    Did I miss anything here? I'm assuming that the behavior on FP
>    instructions on Intel (and other architectures) will be roughly
>    similar, given that this behavior is mostly defined by the IEEE
>    standard.
>    Now the question in my mind is, how does this all map onto the
>    experimental constrained intrinsics? They do have "rounding mode" and
>    "exception behavior" metadata, but I don't really see how that maps
>    onto the behavior of instructions as described above. Also, right now
>    the back-end doesn't even *get* at that data in the first place, since
>    it is just thrown away when lowering the intrinsics to STRICT_... nodes.
>    In fact, I'm also not sure how the front-end is even supposed to be
>    *setting* those metadata flags -- is the compiler supposed to track
>    calls to fesetround and the like, and thereby determine which rounding
>    and exception modes apply to any given block of code? In fact, was the
>    original intention even that the back-end actually implements different
>    behavior based on this level of detail, or was the back-end supposed to
>    support only two modes, the default behavior of today and a fully
>    strict implementation always satisfying all three of A), B), and C)
>    above?
>    Looking again at a possible implementation in the back-end, I'm now
>    wondering if it wouldn't after all be better to just treat the STRICT_
>    opcodes like all other DAG nodes. That is, have them be associated with
>    an action (Legal, Expand, or Custom); set the default action to Expand,
>    with a default expander that just replaces them by the "normal" FP
>    nodes; and allow a back-end to set the action to Legal and/or Custom
>    and then just handle them in the back-end as it sees fit. This might
>    indeed require multiple patterns to match them, but it should be
>    possible to generate those via multiclass instantiations so it might
>    not be all that big a deal. The benefit would be that it allows the
>    back-end the greatest freedom how to handle things (e.g. interactions
>    with target-specific control registers).

Was there a consensus on what to do here?

Are we exposing the strict SDAG nodes to the backend or not? Obviously
if we are this isn't going to take a while to implement, but it would
still be useful to know when coding the layers above the backend.

If we're not exposing the strict nodes to the backend, would using the
chain on expansions like ISD::FP_TO_UINT solve speculative execution issues?

--
Kevin P. Neal                                http://www.pobox.com/~kpn/

                    "A pig's gotta fly." - Crimson Pig

Re: Why is #pragma STDC FENV_ACCESS not supported?

Louis Dionne via cfe-dev

"Kevin P. Neal" <[hidden email]> wrote on 06.03.2018 15:01:46:


> On Fri, Feb 09, 2018 at 03:42:20PM +0100, Ulrich Weigand wrote:
> >    C) Floating-point exceptions
> >    If a mask bit in the floating-point status register is set, then all FP
> >    instructions will *trap* whenever an IEEE exception condition is
> >    recognized. This means that we need to treat those instructions as
> >    having unmodelled side effects, so that they cannot be speculatively
> >    executed. Also, we cannot schedule FP instructions across instructions
>
> Does this mean that the problems with the default expansion of ISD::FP_TO_UINT
> would be solved by the backend knowing that it should model traps?

Not really.  The problem with the default FP_TO_UINT expansion is that it
implements FP_TO_UINT in terms of FP_TO_SINT, like so:

    x < INT_MAX+1 ? (signed) x : ((signed) (x - (INT_MAX+1)) + INT_MAX+1)

This is implemented via a "select", so that both paths are computed in
parallel.  If x is too large for the signed range (but within the unsigned
range), the (signed) x conversion will raise an FP_INEXACT exception, which
is not supposed to happen here.

I think the simplest way to fix this for the STRICT_ case is to use an
explicit "if" (two basic blocks and a phi node) instead of the "select",
using STRICT_FP_TO_SINT to implement the signed conversion.

The existing (or to be implemented) constraints on the strict nodes should
prevent instruction selection from converting the if back to a select.
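For illustration only, the branch-based shape looks roughly like this when written out in C for a 32-bit result. The function name and the TWO_POW_31 macro are made up here, and this is a source-level picture of the control flow rather than the actual legalizer code; a C compiler is of course free to if-convert it again unless it honors strict FP semantics.

```c
#include <stdint.h>

/* 2^31 as a double: the first value that no longer fits in int32_t. */
#define TWO_POW_31 2147483648.0

/* Hypothetical helper: double -> uint32_t built only from a signed
   conversion. Assumes 0 <= x < 2^32. */
uint32_t fp_to_uint32_branchy(double x)
{
    if (x < TWO_POW_31) {
        /* In signed range: convert directly. */
        return (uint32_t)(int32_t)x;
    }
    /* Out of signed range: shift down first, convert, then shift back up.
       The direct (signed) x conversion is never executed on this path. */
    return (uint32_t)(int32_t)(x - TWO_POW_31) + 2147483648u;
}
```

The point is that each signed conversion sits in its own block, so only the one whose operand is in range is ever executed, and no conversion of an out-of-range value can set an exception flag behind the program's back.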

> >    Looking again at a possible implementation in the back-end, I'm now
> >    wondering if it wouldn't after all be better to just treat the STRICT_
> >    opcodes like all other DAG nodes. That is, have them be associated with
> >    an action (Legal, Expand, or Custom); set the default action to Expand,
> >    with a default expander that just replaces them by the "normal" FP
> >    nodes; and allow a back-end to set the action to Legal and/or Custom
> >    and then just handle them in the back-end as it sees fit. This might
> >    indeed require multiple patterns to match them, but it should be
> >    possible to generate those via multiclass instantiations so it might
> >    not be all that big a deal. The benefit would be that it allows the
> >    back-end the greatest freedom how to handle things (e.g. interactions
> >    with target-specific control registers).
>
> Was there a consensus on what to do here?

I don't think we have a consensus yet.
 
> Are we exposing the strict SDAG nodes to the backend or not? Obviously
> if we are this isn't going to take a while to implement, but it would
> still be useful to know when coding the layers above the backend.
>
> If we're not exposing the strict nodes to the backend, would using the
> chain on expansions like ISD::FP_TO_UINT solve speculative execution issues?

Well, any specific node like FP_TO_UINT either has a chain or it doesn't
have a chain.  So adding a chain to FP_TO_UINT would require us to add
it unconditionally, even if we didn't start out with strict nodes.

But that would impose scheduling constraints even on the non-strict case,
which I gather we don't really want to see.

So it appears to me that if we need a chain (or control register
dependencies etc.), it would be easier on the back-end anyway to
have a different ISD node, in which case it just might be the easiest
to pass the STRICT_ nodes through to the back-end if it wants ...

I'll try and go ahead with the SystemZ back-end to see how complicated
it would actually be to add those nodes, so that we can make an
informed decision.

Bye,
Ulrich



Re: Why is #pragma STDC FENV_ACCESS not supported?

Louis Dionne via cfe-dev
On Tue, Mar 06, 2018 at 04:07:34PM +0100, Ulrich Weigand wrote:

>    "Kevin P. Neal" <[hidden email]> wrote on 06.03.2018 15:01:46:
>    > On Fri, Feb 09, 2018 at 03:42:20PM +0100, Ulrich Weigand wrote:
>    > >    C) Floating-point exceptions
>    > >    If a mask bit in the floating-point status register is set, then all FP
>    > >    instructions will *trap* whenever an IEEE exception condition is
>    > >    recognized. This means that we need to treat those instructions as
>    > >    having unmodelled side effects, so that they cannot be speculatively
>    > >    executed. Also, we cannot schedule FP instructions across instructions
>    >
>    > Does this mean that the problems with the default expansion of ISD::FP_TO_UINT
>    > would be solved by the backend knowing that it should model traps?
>
>    Not really.  The problem with the default FP_TO_UINT expansion is that it
>    implements FP_TO_UINT in terms of FP_TO_SINT, like so:
>
>        x < INT_MAX+1 ? (signed) x : ((signed) (x - (INT_MAX+1)) + INT_MAX+1)
>
>    This is implemented via a "select", so that both paths are computed in
>    parallel.  If x is too large for the signed range (but within the unsigned
>    range), the (signed) x conversion will raise an FP_INEXACT exception, which
>    is not supposed to happen here.

Yup, I've seen that. It's why we had to target newer z/Arch systems so we
would gain use of the single instruction for this purpose.

But in a debug build the expansion turns into code that looks like it came
from a regular if/else set of blocks. So it isn't always the case that a
"select" turns into speculative execution. Thus my question.

>    I think the simplest way to fix this for the STRICT_ case is to use an
>    explicit "if" (two basic blocks and a phi node) instead of the "select",
>    using STRICT_FP_TO_SINT to implement the signed conversion.
>
>    The existing (or to be implemented) constraints on the strict nodes should
>    prevent instruction selection from converting the if back to a select.

This would need to happen at SelectionDAG construction time, correct?
Because unless I'm misreading SelectAllBasicBlocks() it looks like we can't
add new basic blocks during selection. So it would need to be done before.

Correct?

Doing that expansion at SDAG construction time would still allow us to
query the backend so we don't defeat any optimizations. And use of the
STRICT_FP_TO_SINT would prevent optimization of the early expansion. So
this would probably be as good as the expansion we're doing during
legalization.

Does anyone else like the idea?

--
Kevin P. Neal                                http://www.pobox.com/~kpn/

"Nonbelievers found it difficult to defend their position in \
    the presence of a working computer." -- a DEC Jensen paper

Re: Why is #pragma STDC FENV_ACCESS not supported?

Louis Dionne via cfe-dev
In reply to this post by Louis Dionne via cfe-dev

Ulrich Weigand/Germany/IBM wrote on 06.03.2018 16:07:34:

> So it appears to me that if we need a chain (or control register

> dependencies etc.), it would be easier on the back-end anyway to
> have a different ISD node, in which case it just might be the easiest
> to pass the STRICT_ nodes through to the back-end if it wants ...
>
> I'll try and go ahead with the SystemZ back-end to see how complicated

> it would actually be to add those nodes, so that we can make an
> informed decision.

I've now implemented the above to handle all the currently supported
STRICT_ FP nodes in the SystemZ back-end (at least for pre-z13 machines,
I'm not supporting vector instructions yet):

https://reviews.llvm.org/D45576

I've tested this using a hacked clang front-end that always uses
constrained intrinsics in place of most regular FP operations,
and it at least still passes the LNT test-suite.

I'd appreciate any comments on whether this looks like an acceptable
approach.  In particular, I'd like to hear from Andrew how this
compares with your approach -- I understand you have some other
method in mind to handle this for x86?

Bye,
Ulrich


