RFC: Supported Optimizations attribute

RFC: Supported Optimizations attribute

Oleg Smolsky via cfe-dev
Hi folks,

please check out our RFC: Supported Optimizations attribute


Pasting it here for the record:

RFC: supported_optimizations attribute

Piotr Padlewski - [hidden email], Krzysztof Pszeniczny - [hidden email], December 2018


Introduction

Some classes of optimizations require very precise use of metadata and/or intrinsics, at least to achieve a clean implementation.


Our primary example is devirtualization [RFC: Devirtualization v2], which requires that the LLVM IR be annotated with certain intrinsic calls, viz. launder.invariant.group and strip.invariant.group, in certain places, e.g. when the dynamic type of an object changes.


This currently makes it impossible to merge IR created with this kind of optimization in mind with IR created without it, as doing so can lead to miscompilation when the optimizer relies on the absence of intrinsics and/or metadata to make certain assumptions about the code in question.  For now, in the case of devirtualization, we simply refuse to merge such pieces of IR. This predominantly affects cross-language link-time optimization.

Solution

We propose a fine-grained, function-level solution to this problem, originally suggested by John McCall: mark each function with a list of the optimization requirements it complies with, by means of an attribute for now called supported_optimizations.  The exact naming of this attribute is of course subject to bikeshedding, so we kindly ask the community to suggest a better name if one can be found.


When performing inter-procedural optimizations (mostly inlining), the list of optimizations supported by both functions should be considered to be the intersection of their previous lists of supported optimizations.  This effectively prevents any optimization relying on precise annotations from happening when such precise annotations can no longer be guaranteed.


More precisely, when doing an IPO other than inlining, such a pass must treat the functions in question as if their supported-optimizations lists were set to the intersection of their original ones.  In general, this does not require actually modifying those lists.


In the case of inlining, the list of optimizations supported by the callee is left intact, but the list of optimizations supported by the caller must be set to the said intersection.


invariant.group

The !invariant.group metadata should be equipped with an argument referring to the supported_optimization it is related to. In the existing case, it will be set to "clang.cxx.devirtualization".  If this supported_optimization is not present in the list of optimizations supported by the enclosing function, the !invariant.group metadata has no effect and cannot be relied upon by the optimizer.  Of course, in this case the metadata in question is not useful anymore and can be safely removed.  We kindly thank Richard Smith for suggesting this approach.


As a transitional measure and for backwards compatibility reasons, any !invariant.group metadata with an empty argument (i.e. as before this RFC), shall not be subject to the above restrictions and shall remain applicable even when there is no supported_optimizations list provided for the enclosing function.

Example:

define void @with_devirtualization(i8* %p) supported_optimizations = "clang.cxx.devirtualization" {
  %v1 = load i8, i8* %p, !invariant.group !0
  %p2 = call i8* @llvm.launder.invariant.group.p0i8(i8* %p)
  call void @without_devirtualization(i8* %p, i8* %p2)
  %v2 = load i8, i8* %p2, !invariant.group !0
  ret void
}

define void @without_devirtualization(i8* %p, i8* %p2) {
  %b = icmp eq i8* %p, %p2
  call void @llvm.assume(i1 %b)
  ret void
}

!0 = !{!"clang.cxx.devirtualization"}
declare void @llvm.assume(i1)
declare i8* @llvm.launder.invariant.group.p0i8(i8*)


After inlining:

define void @with_devirtualization(i8* %p) {
  ; !invariant.group is inactive now, so it can be stripped.
  %v1 = load i8, i8* %p, !invariant.group !0
  %p2 = call i8* @llvm.launder.invariant.group.p0i8(i8* %p)
  %b = icmp eq i8* %p, %p2
  call void @llvm.assume(i1 %b)
  %v2 = load i8, i8* %p2, !invariant.group !0
  ret void
}


Possible extensions

Instead of indiscriminately taking the intersection of the supported-optimizations lists, we may imagine that some such optimizations are able to provide a conservative set of annotations for a function lacking them.  By doing so, we may retain at least some of the information available to the optimizer, by preserving the annotations already present in the optimization-compliant function.


For example, devirtualization may conservatively pass every load and function argument through a strip and every store and return value through a launder, which is a way of conservatively making an arbitrary function compliant with the requirements of this particular optimization.
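A rough sketch of what such conservative annotation could look like, using the intrinsics from the example above (the function name and the exact placement of the strips and launders are illustrative only, not part of this proposal):

```llvm
; Illustrative only: an arbitrary function made conservatively compliant
; with devirtualization by stripping pointers before loads and laundering
; them before stores, as described above.
define void @conservatively_compliant(i8* %p) supported_optimizations = "clang.cxx.devirtualization" {
  ; every load goes through a strip...
  %p.s = call i8* @llvm.strip.invariant.group.p0i8(i8* %p)
  %v = load i8, i8* %p.s
  ; ...and every store goes through a launder, so no invariant.group
  ; assumption survives across this function's accesses.
  %p.l = call i8* @llvm.launder.invariant.group.p0i8(i8* %p)
  store i8 %v, i8* %p.l
  ret void
}

declare i8* @llvm.strip.invariant.group.p0i8(i8*)
declare i8* @llvm.launder.invariant.group.p0i8(i8*)
```

This trades optimization power for mergeability: the wrapped function makes no invariant.group guarantees of its own, but it can no longer invalidate anyone else's.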

Bikeshedding

Other possible names:

  • compliant_with

  • brittle_representations (John McCall's original suggestion)


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

Re: RFC: Supported Optimizations attribute

Oleg Smolsky via cfe-dev
Thank you for pushing this forward.

On 2 Dec 2018, at 12:47, Piotr Padlewski wrote:
> We propose a fine-grained, function-level solution to this problem,
> originally suggested by John McCall: mark each function with a list of
> such
> optimization requirements it complies

compiles

> invariant.group

In this proposal, it seems like you're assuming that only the
invariant.group metadata needs to be parameterized with a
supported_optimization.  Should you also decorate all the other structures
used as part of this optimization, e.g. the calls to
llvm.strip.invariant.group and llvm.launder.invariant.group?  This would
also allow these intrinsics to be more explicit about exactly which family
of invariant.group metadata they're supposed to be laundering/stripping,
and correspondingly it would allow LLVM to remove these intrinsics when it
strips invariant.group optimization information.
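One way this decoration could look (purely hypothetical syntax, sketched here only for discussion; neither the RFC nor LLVM defines anything like it) is an operand bundle naming the optimization family on each intrinsic call:

```llvm
; Hypothetical syntax only: the launder call names the good-faith
; optimization family it belongs to, so LLVM could drop the call when it
; strips that family's invariant.group information.
%p2 = call i8* @llvm.launder.invariant.group.p0i8(i8* %p) [ "supported"(metadata !"clang.cxx.devirtualization") ]
```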

> As a transitional measure and for backwards compatibility reasons, any
> !invariant.group metadata with an empty argument (i.e. as before this
> RFC),
> shall not be subject to the above restrictions and shall remain
> applicable
> even when there is no supported_optimizations list provided for the
> enclosing function.

I don't think it's important to support this transitionally.  AFAIK, there
are no existing, non-experimental optimizations using invariant.group
where it's crucial that we continue to perform the optimization when
linking existing .bc files.  We should just insist that invariant.group
has a tag and otherwise strip it during deserialization.

> Possible extensions
>
> Instead of indiscriminately taking the intersection of supported
> optimizations' lists, we may imagine that some of such optimizations
> may be
> able to provide a conservative set of annotations to a function
> lacking
> them.  By doing so, we may retain at least some level of information
> available to the optimizer, by preserving the annotations already
> present
> in the optimization-compliant function.
>
> For example, devirtualization may conservatively pass every load and
> function argument through a strip and every store and return value
> through
> a launder, which is a way of conservatively making an arbitrary
> function
> compliant with the requirements of this particular optimization.

Sure, that's a theoretical future enhancement that we could provide.

John.

Re: [llvm-dev] RFC: Supported Optimizations attribute

Oleg Smolsky via cfe-dev
I think we should have some bounds on how "badly" a
supported_optimizations tag on a function can affect its semantics.
For instance, can a "supported_optimizations" invariant be that "the
CFG is always structured"?  Or (exaggerating to illustrate the point)
"the function has an equal number of loads and stores"?

-- Sanjoy
On Mon, Dec 3, 2018 at 8:52 PM John McCall via llvm-dev
<[hidden email]> wrote:

> [...]

Re: RFC: Supported Optimizations attribute

Oleg Smolsky via cfe-dev
In reply to this post by Oleg Smolsky via cfe-dev
On 3 Dec 2018, at 23:52, John McCall wrote:
> On 2 Dec 2018, at 12:47, Piotr Padlewski wrote:
>> We propose a fine-grained, function-level solution to this problem,
>> originally suggested by John McCall: mark each function with a list
>> of such
>> optimization requirements it complies
>
> compiles

...I have no idea what I was thinking here, "complies" is obviously the
right word.
Apologies.

John.

Re: [llvm-dev] RFC: Supported Optimizations attribute

Oleg Smolsky via cfe-dev
In reply to this post by Oleg Smolsky via cfe-dev

On 4 Dec 2018, at 1:45, Sanjoy Das wrote:

> I think we should have some bounds on how "badly" a
> supported_optimizations tag on a function can affect its semantics.
> For instance, can a "supported_optimizations" invariant be that "the
> CFG is always structured"? Or (exaggerating to illustrate the point)
> "the function has an equal number of loads and stores"?

Piotr's proposal unfortunately doesn't give us a good name for the class
of optimizations that require being listed in supported_optimizations.
In earlier discussions I called them "brittle", but I can understand why
nobody wants to call their optimization that, so let's call them
"good-faith optimizations" instead since they rely on the good faith of
all the participating code.

Every optimization has to know how to maintain the structural rules of
LLVM IR; that's what makes them structural rules. We don't want the set of
structural rules to substantially change because such-and-such good-faith
optimization is in effect because that would require arbitrary transforms
to check the supported_optimizations list before they knew which rules to
follow. Instead, the burden is on the optimization designer to pick IR
constructs that won't be messed up by an arbitrary transform with no special
knowledge of the optimization. The only thing the optimization designer
can rely on is this:

  • other transforms will preserve the apparent semantics of the function and
  • other transforms will maintain the standard structural rules of LLVM IR.

For example, if you need arbitrary side-effects to not be re-ordered past
some point in the function, you should put an instruction there that
inhibits that kind of code motion, probably by making it look like a call
to an unknown function. No transform can reorder arbitrary side-effects
across such a call because doing so might change the apparent semantics of
the program. Similarly, if you need a direct use-def link between two
points in the function, you should make the first point define a token
that'll be used by the second point. No transform will break or abstract
that association because that would violate the basic structural rules of IR.
If there's a structural rule that can't be expressed with the current tools
of IR, we can talk about how to fix that --- although tokens are such a big
hammer that I'd be somewhat surprised to hear that something really can't
be expressed at all (but much less surprised to hear that the token rules
are too strong and we'd like other transforms to have more power to change
the code).
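As a concrete illustration of the token technique (the intrinsic names here are invented for the sketch, not real LLVM intrinsics):

```llvm
; Invented intrinsics, for illustration: the token %tok forms a direct
; use-def link, so no transform may separate, duplicate, or abstract the
; two calls independently without violating IR structural rules.
declare token @llvm.region.begin()
declare void @llvm.region.end(token)

define void @paired_points() {
  %tok = call token @llvm.region.begin()
  ; ... code whose endpoints must be kept in tandem ...
  call void @llvm.region.end(token %tok)
  ret void
}
```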

So the defining property of a good-faith optimization is that:
- there are rules which participating functions are expected to follow on
pain of undefined behavior but which LLVM IR doesn't require every function
to follow, and
- those rules will be preserved by any transform that doesn't move code
between functions and which preserves the apparent function semantics
and maintains the standard structural rules of LLVM IR.

I think that answers your question about what sort of rules we can expect
supported_optimizations to guarantee.

John.



Re: [llvm-dev] RFC: Supported Optimizations attribute

Oleg Smolsky via cfe-dev
On 12/4/18 1:49 AM, John McCall via cfe-dev wrote:

> [...]
>
> So the defining property of a good-faith optimization is that:
> - there are rules which participating functions are expected to follow on
> pain of undefined behavior but which LLVM IR doesn't require every function
> to follow, and
> - those rules will be preserved by any transform that doesn't move code
> between functions and which preserves the apparent function semantics
> and maintains the standard structural rules of LLVM IR.

I like these conditions, in terms of bounding what this means.

We might need to be more explicit about what "moves code" means, as we've run into fuzzy lines before around inter-procedural optimizations (e.g., return-value propagation, using the 'returned' attribute, and variants). Would it work to say "move code with potential side effects"?

Thanks again,

Hal

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory


Re: [llvm-dev] RFC: Supported Optimizations attribute

Oleg Smolsky via cfe-dev
In reply to this post by Oleg Smolsky via cfe-dev
On Mon, Dec 3, 2018 at 11:49 PM John McCall <[hidden email]> wrote:

> Piotr's proposal unfortunately doesn't give us a good name for the class
> of optimizations that require being listed in supported_optimizations.
> In earlier discussions I called them "brittle", but I can understand why
> nobody wants to call their optimization that, so let's call them
> "good-faith optimizations" instead since they rely on the good faith of
> all the participating code.
>
> Every optimization has to know how to maintain the structural rules of
> LLVM IR; that's what makes them structural rules. We don't want the set of
> structural rules to substantially change because such-and-such good-faith
> optimization is in effect because that would require arbitrary transforms
> to check the supported_optimizations list before they knew which rules to
> follow. Instead, the burden is on the optimization designer to pick IR
> constructs that won't be messed up by an arbitrary transform with no special
> knowledge of the optimization. The only thing the optimization designer
> can rely on is this:
>
> other transforms will preserve the apparent semantics of the function and
> other transforms will maintain the standard structural rules of LLVM IR.

Ok.  Just to make sure we're on the same page, if this was all there
is we would not need this attribute right?  All LLVM optimizations do
need to preserve semantics and structural properties anyway?

> So the defining property of a good-faith optimization is that:
> - there are rules which participating functions are expected to follow on
> pain of undefined behavior but which LLVM IR doesn't require every function
> to follow, and
> - those rules will be preserved by any transform that doesn't move code
> between functions and which preserves the apparent function semantics
> and maintains the standard structural rules of LLVM IR.

In other words, certain things are UB in functions tagged with
supported_optimizations that are not UB otherwise?  This breaks code
hoisting transformations right?  I.e.
isSafeToSpeculativelyExecute(Inst) will have to return false if Inst
is in a function with a non-empty supported_optimizations?

-- Sanjoy

Re: [llvm-dev] RFC: Supported Optimizations attribute

Oleg Smolsky via cfe-dev

On 4 Dec 2018, at 13:16, Sanjoy Das wrote:

> On Mon, Dec 3, 2018 at 11:49 PM John McCall [hidden email] wrote:
>> [...]
>
> Ok. Just to make sure we're on the same page, if this was all there
> is we would not need this attribute right? All LLVM optimizations do
> need to preserve semantics and structural properties anyway?

We need this attribute because interprocedural optimizations otherwise
break good-faith optimizations, so yes, my summary here is missing some
qualification (that I included in the next paragraph, but with a slightly
different spin). So let me restate this.

The designer of a good-faith optimization can rely on this:

  • other transforms will preserve the apparent semantics of the function,
  • other transforms will maintain the standard structural rules of LLVM IR, and
  • interprocedural transforms will honor supported_optimizations as mentioned in Piotr's proposal --- and, in particular, will intersect the supported_optimizations list whenever moving code into a function.

Note that IPO is generally permitted to partially inline or outline code,
and so good-faith optimizations that e.g. require two instructions to be moved
in tandem or not at all must use tokens to establish that unbreakable
relationship.

>> So the defining property of a good-faith optimization is that:
>> - there are rules which participating functions are expected to follow on
>> pain of undefined behavior but which LLVM IR doesn't require every function
>> to follow, and
>> - those rules will be preserved by any transform that doesn't move code
>> between functions and which preserves the apparent function semantics
>> and maintains the standard structural rules of LLVM IR.
>
> In other words, certain things are UB in functions tagged with
> supported_optimizations that are not UB otherwise? This breaks code
> hoisting transformations right? I.e.
> isSafeToSpeculativelyExecute(Inst) will have to return false if Inst
> is in a function with a non-empty supported_optimizations?

Good question. I would consider that to be an unacceptable intrusion:
intraprocedural transforms should never have to be aware of
supported_optimizations (unless they're implementing a good-faith
optimization, of course) and interprocedural transforms should only have
to be aware of supported_optimizations in the narrow sense outlined
by Piotr. If something about the optimization's representation in IR
is unsafe to speculate, it should be made impossible to speculate for
standard semantic/structural reasons, like having apparently arbitrary
side-effects.

I think the right way to formalize this is to say that, while the
good-faith optimization may impose additional UB rules on the function,
it must guarantee that transforms that are well-behaved as described
above --- i.e. that preserve standard structure and semantics and which,
if interprocedural, appropriately honor supported_optimizations --- will
never introduce new UB.

John.



Re: [llvm-dev] RFC: Supported Optimizations attribute

Oleg Smolsky via cfe-dev
In reply to this post by Oleg Smolsky via cfe-dev

On 4 Dec 2018, at 11:16, Finkel, Hal J. wrote:

> On 12/4/18 1:49 AM, John McCall via cfe-dev wrote:
>> [...]
>
> I like these conditions, in terms of bounding what this means.
>
> We might need to be more explicit about what "moves code" means, as we've
> run into fuzzy lines before around inter-procedural optimizations (e.g.,
> return-value propagation, using the 'returned' attribute, and variants).
> Would it work to say "move code with potential side effects"?

That's a really good point. Applying this to our current use-case, I'm pretty
sure that both return-value propagation and the returned attribute can break
Piotr's devirtualization optimization, so they need to intersect the caller's
list or clear it, respectively. IIRC it used to be the case that returned
was mostly honored in the backend for liveness optimizations, i.e. long after
IR constructs like supported_optimizations have stopped being relevant, so
being very pessimistic about it is fine; but I don't know if that's still true.

John.


Re: [llvm-dev] RFC: Supported Optimizations attribute

Oleg Smolsky via cfe-dev
In reply to this post by Oleg Smolsky via cfe-dev
On Tue, Dec 4, 2018 at 11:24 AM John McCall <[hidden email]> wrote:

>> In other words, certain things are UB in functions tagged with
>> supported_optimizations that are not UB otherwise? This breaks code
>> hoisting transformations right? I.e.
>> isSafeToSpeculativelyExecute(Inst) will have to return false if Inst
>> is in a function with a non-empty supported_optimizations?
>>
> Good question. I would consider that to be an unacceptable intrusion:
> intraprocedural transforms should never have to be aware of
> supported_optimizations (unless they're implementing a good-faith
> optimization, of course) and interprocedural transforms should only have
> to be aware of supported_optimizations in the narrow sense outlined
> by Piotr. If something about the optimization's representation in IR
> is unsafe to speculate, it should be made impossible to speculate for
> standard semantic/structural reasons, like having apparently arbitrary
> side-effects.
>
> I think the right way to formalize this is to say that, while the
> good-faith optimization may impose additional UB rules on the function,
> it must guarantee that transforms that are well-behaved as described
> above --- i.e. that preserve standard structure and semantics and which,
> if interprocedural, appropriately honor supported_optimizations --- will
> never introduce new UB.

I see.  Would it be correct to say that a good faith optimization `O`
assumes that a certain construct(s), `FOO`, does not occur in the
*trace* of the program within the function tagged with
supported_optimizations={O}?  Also `FOO` has to be side-effecting so
LLVM cannot speculate it into existence (e.g. `FOO` can't be
"getelementptr with a certain property").

-- Sanjoy

Re: [llvm-dev] RFC: Supported Optimizations attribute

Oleg Smolsky via cfe-dev
In reply to this post by Oleg Smolsky via cfe-dev
On Tue, Dec 4, 2018 at 2:24 PM John McCall via cfe-dev <[hidden email]> wrote:

On 4 Dec 2018, at 13:16, Sanjoy Das wrote:

On Mon, Dec 3, 2018 at 11:49 PM John McCall [hidden email] wrote:

Piotr's proposal unfortunately doesn't give us a good name for the class
of optimizations that require being listed in supported_optimizations.
In earlier discussions I called them "brittle", but I can understand why
nobody wants to call their optimization that, so let's call them
"good-faith optimizations" instead since they rely on the good faith of
all the participating code.

Every optimization has to know how to maintain the structural rules of
LLVM IR; that's what makes them structural rules. We don't want the set of
structural rules to substantially change because such-and-such good-faith
optimization is in effect because that would require arbitrary transforms
to check the supported_optimizations list before they knew which rules to
follow. Instead, the burden is on the optimization designer to pick IR
constructs that won't be messed up by an arbitrary transform with no special
knowledge of the optimization. The only thing the optimization designer
can rely on is this:

other transforms will preserve the apparent semantics of the function and
other transforms will maintain the standard structural rules of LLVM IR.

Ok. Just to make sure we're on the same page, if this was all there
is we would not need this attribute right? All LLVM optimizations do
need to preserve semantics and structural properties anyway?

We need this attribute because interprocedural optimizations otherwise
break good-faith optimizations, so yes, my summary here is missing some
qualification (that I included in the next paragraph, but with a slightly
different spin). So let me restate this.

The designer of a good-faith optimization can rely on this:

  • other transforms will preserve the apparent semantics of the function,
  • other transforms will maintain the standard structural rules of LLVM IR, and
  • interprocedural transforms will honor supported_optimizations as mentioned in Piotr's proposal --- and, in particular, will intersect the supported_optimizations list whenever moving code into a function.

[...] I would consider that to be an unacceptable intrusion: intraprocedural transforms should never have to be aware of supported_optimizations (unless they're implementing a good-faith optimization, of course) and interprocedural transforms should only have to be aware of supported_optimizations in the narrow sense outlined by Piotr. If something about the optimization's representation in IR is unsafe to speculate, it should be made impossible to speculate for standard semantic/structural reasons, like having apparently arbitrary side-effects.

Peanut gallery says: I don't fully understand the use-case or the domain, but this description sounds unworkable to me.

AIUI, there are two players here: the "brittle optimization" (which relies on some invariant), and the "transform" (which has the power to break that invariant).

The only two mathematically workable scenarios are:
(A) The brittle optimization's invariant is "impossible to [break] for standard semantic/structural reasons." Therefore no transform ever needs to know anything about it. The result is a "robust" optimization, and no need for the supported-optimizations flagset.
(B) The brittle optimization's invariant is, in fact, brittle. Any transform that doesn't explicitly preserve the invariant can and will break the invariant. Therefore, every transform must have its own whitelist of "invariants I know I don't break." Any flag in the supported-optimizations flagset which is not whitelisted by a given transform must be cleared when that transform is applied to the code. (Because, by definition, a transform that doesn't explicitly preserve the brittle invariant must be assumed to break it.)

my $.02,
–Arthur


Re: [llvm-dev] RFC: Supported Optimizations attribute

Oleg Smolsky via cfe-dev
On Tue, Dec 4, 2018 at 12:04 PM Arthur O'Dwyer
<[hidden email]> wrote:
> Peanut gallery says: I don't fully understand the use-case or the domain, but this description sounds unworkable to me.
>
> AIUI, there are two players here: the "brittle optimization" (which relies on some invariant), and the "transform" (which has the power to break that invariant).
>
> The only two mathematically workable scenarios are:
> (A) The brittle optimization's invariant is "impossible to [break] for standard semantic/structural reasons." Therefore no transform ever needs to know anything about it. The result is a "robust" optimization, and no need for the supported-optimizations flagset.
> (B) The brittle optimization's invariant is, in fact, brittle. Any transform that doesn't explicitly preserve the invariant can and will break the invariant. Therefore, every transform must have its own whitelist of "invariants I know I don't break." Any flag in the supported-optimizations flagset which is not whitelisted by a given transform must be cleared when that transform is applied to the code. (Because, by definition, a transform that doesn't explicitly preserve the brittle invariant must be assumed to break it.)

I think the overarching idea here is that good-faith opts will rely
only on invariants that can be broken only by inter-procedural
optimizations.  So intra-procedural opts should not need any changes.

In a sense this proposal makes function call edges semantically
special; function calls are no longer just another kind of control
flow, but they can act as bridges between LLVM IR with different
semantics.  Collapsing a function call edge is now a semantically
interesting operation.

-- Sanjoy

>
> my $.02,
> –Arthur

Re: [llvm-dev] RFC: Supported Optimizations attribute

Oleg Smolsky via cfe-dev
In reply to this post by Oleg Smolsky via cfe-dev

On 4 Dec 2018, at 14:59, Sanjoy Das wrote:

On Tue, Dec 4, 2018 at 11:24 AM John McCall <[hidden email]> wrote:

In other words, certain things are UB in functions tagged with
supported_optimizations that are not UB otherwise? This breaks code
hoisting transformations right? I.e.
isSafeToSpeculativelyExecute(Inst) will have to return false if Inst
is in a function with a non-empty supported_optimizations?

Good question. I would consider that to be an unacceptable intrusion:
intraprocedural transforms should never have to be aware of
supported_optimizations (unless they're implementing a good-faith
optimization, of course) and interprocedural transforms should only have
to be aware of supported_optimizations in the narrow sense outlined
by Piotr. If something about the optimization's representation in IR
is unsafe to speculate, it should be made impossible to speculate for
standard semantic/structural reasons, like having apparently arbitrary
side-effects.

I think the right way to formalize this is to say that, while the
good-faith optimization may impose additional UB rules on the function,
it must guarantee that transforms that are well-behaved as described
above --- i.e. that preserve standard structure and semantics and which,
if interprocedural, appropriately honor supported_optimizations --- will
never introduce new UB.

I see. Would it be correct to say that a good faith optimization `O`
assumes that a certain construct(s), `FOO`, does not occur in the
*trace* of the program within the function tagged with
supported_optimizations={O}?

I'm not sure that's a useful way of thinking about it. Usually the assumption
will be something more like "Thing 1 does not occur after Event A but before any
occurrence of Event B", and we can clearly mark Events A and B in the IR, but
there might be dozens of different code patterns which all qualify as Thing 1.
This would have to be a good-faith optimization because a non-cooperative
function might not mark Events A and B in the IR, so inlining that into a
cooperative function could lead to miscompilation.

Also `FOO` has to be side-effecting so
LLVM cannot speculate it into existence (e.g. `FOO` can't be
"getelementptr with a certain property").

I don't think "side-effecting" is always the right rule, but yes:

  • If something about the sequence shouldn't be speculated, the optimization
    should arrange it so that transforms following standard rules can't
    speculate it. For example, if you have a language rule that guarantees
    that a certain global variable is guaranteed to be constant past a certain
    point in the program, you can mark that point with an intrinsic, but that
    intrinsic needs to be made non-speculable, and the easiest way to achieve
    that is to ensure that it claims to have arbitrary side-effects so that code
    which doesn't specifically know about this intrinsic won't speculate it.

  • In general, all the intrinsic calls, metadata, and so on that participate
    in the optimization should be marked in some way that's specific to the
    optimization. Formally, this requires that arbitrary transforms that aren't
    aware of the optimization can't just sua sponte add optimization-tagged
    constructs to a function, as opposed to e.g. cloning or moving them from
    somewhere else, but that seems like a pretty safe assumption. :)

John.



Re: [llvm-dev] RFC: Supported Optimizations attribute

Oleg Smolsky via cfe-dev
In reply to this post by Oleg Smolsky via cfe-dev

On 4 Dec 2018, at 15:36, Sanjoy Das wrote:

On Tue, Dec 4, 2018 at 12:04 PM Arthur O'Dwyer
[hidden email] wrote:

Peanut gallery says: I don't fully understand the use-case or the domain, but this description sounds unworkable to me.

AIUI, there are two players here: the "brittle optimization" (which relies on some invariant), and the "transform" (which has the power to break that invariant).

The only two mathematically workable scenarios are:
(A) The brittle optimization's invariant is "impossible to [break] for standard semantic/structural reasons." Therefore no transform ever needs to know anything about it. The result is a "robust" optimization, and no need for the supported-optimizations flagset.
(B) The brittle optimization's invariant is, in fact, brittle. Any transform that doesn't explicitly preserve the invariant can and will break the invariant. Therefore, every transform must have its own whitelist of "invariants I know I don't break." Any flag in the supported-optimizations flagset which is not whitelisted by a given transform must be cleared when that transform is applied to the code. (Because, by definition, a transform that doesn't explicitly preserve the brittle invariant must be assumed to break it.)

I think the overarching idea here is that good-faith opts will rely
only on invariants that can be broken only by inter-procedural
optimizations. So intra-procedural opts should not need any changes.

In a sense this proposal makes function call edges semantically
special; function calls are no longer just another kind of control
flow, but they can act as bridges between LLVM IR with different
semantics. Collapsing a function call edge is now a semantically
interesting operation.

Note that this has always been true in some ways. For example, naively
cloning call sites during inlining can change exception semantics; to
make this work correctly you have to reconcile the EH contexts of the
inner and outer calls, which is not always possible if the personalities
differ. Or to pick an even grosser example, you can't just clone a call
to llvm.returnaddress during inlining.

Note also that IPO between functions with the same set of supported
optimizations also doesn't involve any change in semantics.

John.



Re: [llvm-dev] RFC: Supported Optimizations attribute

Oleg Smolsky via cfe-dev
In reply to this post by Oleg Smolsky via cfe-dev

Skimming along, apologies if I'm repeating something which already got said.

If I understand this correctly, the basic problem we're trying to solve is to use a local hint (the invariant.group) to make a global assumption about other code which might exist elsewhere outside the function.  The attribute proposed can basically be phrased as describing a universe of functions within which our desired global property holds.  There's an ambiguity about what is allowed to be assumed about code outside that universe.

I think it's important to note that we have a precedent of something similar to this in TBAA.  TBAA information coming from different modules has the same base problem.  We solve it by using the "root" of the TBAA tree as a scope descriptor, and essentially making two TBAA nodes from distinct roots incomparable. 

Can someone explain concisely why a similar scheme couldn't be used to solve this problem?

On 12/4/18 11:24 AM, John McCall via llvm-dev wrote:

On 4 Dec 2018, at 13:16, Sanjoy Das wrote:

On Mon, Dec 3, 2018 at 11:49 PM John McCall [hidden email] wrote:

Piotr's proposal unfortunately doesn't give us a good name for the class
of optimizations that require being listed in supported_optimizations.
In earlier discussions I called them "brittle", but I can understand why
nobody wants to call their optimization that, so let's call them
"good-faith optimizations" instead since they rely on the good faith of
all the participating code.

Every optimization has to know how to maintain the structural rules of
LLVM IR; that's what makes them structural rules. We don't want the set of
structural rules to substantially change because such-and-such good-faith
optimization is in effect because that would require arbitrary transforms
to check the supported_optimizations list before they knew which rules to
follow. Instead, the burden is on the optimization designer to pick IR
constructs that won't be messed up by an arbitrary transform with no special
knowledge of the optimization. The only thing the optimization designer
can rely on is this:

other transforms will preserve the apparent semantics of the function and
other transforms will maintain the standard structural rules of LLVM IR.

Ok. Just to make sure we're on the same page, if this was all there
is we would not need this attribute right? All LLVM optimizations do
need to preserve semantics and structural properties anyway?

We need this attribute because interprocedural optimizations otherwise
break good-faith optimizations, so yes, my summary here is missing some
qualification (that I included in the next paragraph, but with a slightly
different spin). So let me restate this.

The designer of a good-faith optimization can rely on this:

  • other transforms will preserve the apparent semantics of the function,
  • other transforms will maintain the standard structural rules of LLVM IR, and
  • interprocedural transforms will honor supported_optimizations as mentioned in Piotr's proposal --- and, in particular, will intersect the supported_optimizations list whenever moving code into a function.

Note that IPO is generally permitted to partially inline or outline code,
and so good-faith optimizations that e.g. require two instructions to be moved
in tandem or not at all must use tokens to establish that unbreakable
relationship.

I think the way you're framing this is dangerous.  We absolutely cannot allow any annotation of this form to *weaken* the semantics of the existing IR.  We can and should impose the criterion that any extension of this variety strictly adds information to the IR which might not have been previously inferred.  We can then design rules for how to preserve our new information for as long as possible, but framing this in terms of disallowed transformations is really a non-starter.

So the defining property of a good-faith optimization is that:
- there are rules which participating functions are expected to follow on
pain of undefined behavior but which LLVM IR doesn't require every function
to follow, and
- those rules will be preserved by any transform that doesn't move code
between functions and which preserves the apparent function semantics
and maintains the standard structural rules of LLVM IR.

In other words, certain things are UB in functions tagged with
supported_optimizations that are not UB otherwise? This breaks code
hoisting transformations right? I.e.
isSafeToSpeculativelyExecute(Inst) will have to return false if Inst
is in a function with a non-empty supported_optimizations?

Good question. I would consider that to be an unacceptable intrusion:
intraprocedural transforms should never have to be aware of
supported_optimizations (unless they're implementing a good-faith
optimization, of course) and interprocedural transforms should only have
to be aware of supported_optimizations in the narrow sense outlined
by Piotr. If something about the optimization's representation in IR
is unsafe to speculate, it should be made impossible to speculate for
standard semantic/structural reasons, like having apparently arbitrary
side-effects.

I think the right way to formalize this is to say that, while the
good-faith optimization may impose additional UB rules on the function,
it must guarantee that transforms that are well-behaved as described
above --- i.e. that preserve standard structure and semantics and which,
if interprocedural, appropriately honor supported_optimizations --- will
never introduce new UB.

John.


_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev


Re: [llvm-dev] RFC: Supported Optimizations attribute

Oleg Smolsky via cfe-dev

On 4 Dec 2018, at 17:50, Philip Reames wrote:

Skimming along, apologies if I'm repeating something which already got said.

If I understand this correctly, the basic problem we're trying to solve is to use a local hint (the invariant.group) to make a global assumption about other code which might exist elsewhere outside the function.  The attribute proposed can basically be phrased as describing a universe of functions within which our desired global property holds.  There's an ambiguity about what is allowed to be assumed about code outside that universe.

I think it's important to note that we have a precedent of something similar to this in TBAA.  TBAA information coming from different modules has the same base problem.  We solve it by using the "root" of the TBAA tree as a scope descriptor, and essentially making two TBAA nodes from distinct roots incomparable.

Can someone explain concisely why a similar scheme couldn't be used to solve this problem?

TBAA is conservative in two ways:
- It allows two accesses to alias if they have TBAA nodes with different roots.
- It allows two accesses to alias if only one of them has a TBAA node.

The second is what doesn't generalize: there are optimizations where you need to
rely on transition points being explicitly identified. Looking at a function
with no identified transition points, you don't know whether it actually doesn't
transition or whether it was compiled without the transitions being explicitly
marked. There's no way to extend the TBAA idea to make that work.

On 12/4/18 11:24 AM, John McCall via llvm-dev wrote:

Note that IPO is generally permitted to partially inline or outline code,
and so good-faith optimizations that e.g. require two instructions to be moved
in tandem or not at all must use tokens to establish that unbreakable
relationship.

I think the way you're framing this is dangerous.  We absolutely cannot allow any annotation of this form to *weaken* the semantics of the existing IR.  We can and should impose the criterion that any extension of this variety strictly adds information to the IR which might not have been previously inferred.  We can then design rules for how to preserve our new information for as long as possible, but framing this in terms of disallowed transformations is really a non-starter.

That's exactly what I was trying to convey here. Authors of good-faith
optimizations need to design their representations so that transformations
that know nothing about their optimizations but merely preserve semantics
and well-formed IR structure will not break their representations. The only
transforms that need to know about the existence of good-faith optimizations
are interprocedural optimizations; furthermore, those optimizations don't
need to know about any good-faith optimizations specifically, they just need
to understand how to correctly update the supported_optimizations list.
That is a very small burden on IPO that enables an interesting class of
language-specific optimizations.

John.



Re: [llvm-dev] RFC: Supported Optimizations attribute

Oleg Smolsky via cfe-dev


śr., 5 gru 2018 o 00:22 John McCall via cfe-dev <[hidden email]> napisał(a):

On 4 Dec 2018, at 17:50, Philip Reames wrote:

Skimming along, apologies if I'm repeating something which already got said.

If I understand this correctly, the basic problem we're trying to solve is to use a local hint (the invariant.group) to make a global assumption about other code which might exist elsewhere outside the function.  The attribute proposed can basically be phrased as describing a universe of functions within which our desired global property holds.  There's an ambiguity about what is allowed to be assumed about code outside that universe.

I think it's important to note that we have a precedent of something similar to this in TBAA.  TBAA information coming from different modules has the same base problem.  We solve it by using the "root" of the TBAA tree as a scope descriptor, and essentially making two TBAA nodes from distinct roots incomparable.

Can someone explain concisely why a similar scheme couldn't be used to solve this problem?

TBAA is conservative in two ways:
- It allows two accesses to alias if they have TBAA nodes with different roots.
- It allows two accesses to alias if only one of them has a TBAA node.

The second is what doesn't generalize: there are optimizations where you need to
rely on transition points being explicitly identified. Looking at a function
with no identified transition points, you don't know whether it actually doesn't
transition or whether it was compiled without the transitions being explicitly
marked. There's no way to extend the TBAA idea to make that work.

The other reason why a similar scheme doesn't work for !invariant.group is that we rely on calls to launder/strip being present for some constructs to preserve
information about the invariance of an object (like in the example from the RFC).

On 12/4/18 11:24 AM, John McCall via llvm-dev wrote:

Note that IPO is generally permitted to partially inline or outline code,
and so good-faith optimizations that e.g. require two instructions to be moved
in tandem or not at all must use tokens to establish that unbreakable
relationship.

I think the way you're framing this is dangerous.  We absolutely cannot allow any annotation of this form to *weaken* the semantics of the existing IR.  We can and should impose the criterion that any extension of this variety strictly adds information to the IR which might not have been previously inferred.  We can then design rules for how to preserve our new information for as long as possible, but framing this in terms of disallowed transformations is really a non-starter.

That's exactly what I was trying to convey here. Authors of good-faith
optimizations need to design their representations so that transformations
that know nothing about their optimizations but merely preserve semantics
and well-formed IR structure will not break their representations. The only
transforms that need to know about the existence of good-faith optimizations
are interprocedural optimizations; furthermore, those optimizations don't
need to know about any good-faith optimizations specifically, they just need
to understand how to correctly update the supported_optimizations list.
That is a very small burden on IPO that enables an interesting class of
language-specific optimizations.

John.



Re: [llvm-dev] RFC: Supported Optimizations attribute

Oleg Smolsky via cfe-dev
An additional bit to the discussion: I think that the "null-pointer-is-valid" attribute should also be handled in a similar manner: if a function marked with
"null-pointer-is-valid" is inlined into another function, then after inlining the caller should also be marked with "null-pointer-is-valid".
I assume this attribute was added to handle kernel or embedded code correctly, and there was probably no use case for LTO between
modules with and without this attribute (yet).
This, unfortunately, can't be solved with our supported_optimizations attribute, as null pointer dereference is considered to be UB by default.
We would need something like "unsupported_optimizations", but I think this shows that this is not the last time a similar problem will arise.


śr., 5 gru 2018 o 09:57 Piotr Padlewski <[hidden email]> napisał(a):


śr., 5 gru 2018 o 00:22 John McCall via cfe-dev <[hidden email]> napisał(a):

On 4 Dec 2018, at 17:50, Philip Reames wrote:

Skimming along, apologies if I'm repeating something which already got said.

If I understand this correctly, the basic problem we're trying to solve is to use a local hint (the invariant.group) to make a global assumption about other code which might exist elsewhere outside the function.  The attribute proposed can basically be phrased as describing a universe of functions within which our desired global property holds.  There's an ambiguity about what is allowed to be assumed about code outside that universe.

I think it's important to note that we have a precedent of something similar to this in TBAA.  TBAA information coming from different modules has the same base problem.  We solve it by using the "root" of the TBAA tree as a scope descriptor, and essentially making two TBAA nodes from distinct roots incomparable.

Can someone explain concisely why a similar scheme couldn't be used to solve this problem?

TBAA is conservative in two ways:
- It allows two accesses to alias if they have TBAA nodes with different roots.
- It allows two accesses to alias if only one of them has a TBAA node.

The second is what doesn't generalize: there are optimizations where you need to
rely on transition points being explicitly identified. Looking at a function
with no identified transition points, you don't know whether it actually doesn't
transition or whether it was compiled without the transitions being explicitly
marked. There's no way to extend the TBAA idea to make that work.

The other reason why a similar scheme doesn't work for !invariant.group is that we rely on calls to launder/strip being present for some constructs to preserve
information about the invariance of an object (like in the example from the RFC).

On 12/4/18 11:24 AM, John McCall via llvm-dev wrote:

Note that IPO is generally permitted to partially inline or outline code,
and so good-faith optimizations that e.g. require two instructions to be moved
in tandem or not at all must use tokens to establish that unbreakable
relationship.

I think the way you're framing this is dangerous.  We absolutely cannot allow any annotation of this form to *weaken* the semantics of the existing IR.  We can and should impose the criterion that any extension of this variety strictly adds information to the IR which might not have been previously inferred.  We can then design rules for how to preserve our new information for as long as possible, but framing this in terms of disallowed transformations is really a non-starter.

That's exactly what I was trying to convey here. Authors of good-faith
optimizations need to design their representations so that transformations
that know nothing about their optimizations but merely preserve semantics
and well-formed IR structure will not break their representations. The only
transforms that need to know about the existence of good-faith optimizations
are interprocedural optimizations; furthermore, those optimizations don't
need to know about any good-faith optimizations specifically, they just need
to understand how to correctly update the supported_optimizations list.
That is a very small burden on IPO that enables an interesting class of
language-specific optimizations.

John.



Re: [llvm-dev] RFC: Supported Optimizations attribute

Oleg Smolsky via cfe-dev
In reply to this post by Oleg Smolsky via cfe-dev


On 12/5/18 12:57 AM, Piotr Padlewski wrote:


śr., 5 gru 2018 o 00:22 John McCall via cfe-dev <[hidden email]> napisał(a):

On 4 Dec 2018, at 17:50, Philip Reames wrote:

Skimming along, apologies if I'm repeating something which already got said.

If I understand this correctly, the basic problem we're trying to solve is to use a local hint (the invariant.group) to make a global assumption about other code which might exist elsewhere outside the function.  The attribute proposed can basically be phrased as describing a universe of functions within which our desired global property holds.  There's an ambiguity about what is allowed to be assumed about code outside that universe.

I think it's important to note that we have a precedent of something similar to this in TBAA.  TBAA information coming from different modules has the same base problem.  We solve it by using the "root" of the TBAA tree as a scope descriptor, and essentially making two TBAA nodes from distinct roots incomparable.

Can someone explain concisely why a similar scheme couldn't be used to solve this problem?

TBAA is conservative in two ways:
- It allows two accesses to alias if they have TBAA nodes with different roots.
- It allows two accesses to alias if only one of them has a TBAA node.

The second is what doesn't generalize: there are optimizations where you need to
rely on transition points being explicitly identified. Looking at a function
with no identified transition points, you don't know whether it actually doesn't
transition or whether it was compiled without the transitions being explicitly
marked. There's no way to extend the TBAA idea to make that work.
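The two conservative TBAA fallbacks described above can be sketched with a toy model. The names `TBAANode` and `may_alias` are illustrative, not LLVM's actual C++ API, and type comparison within a root is simplified to equality rather than a walk of the type tree.

```python
# Toy model of TBAA's two conservative fallbacks, as described above.
# Illustrative only; not LLVM's actual metadata representation.

class TBAANode:
    def __init__(self, name, root):
        self.name = name
        self.root = root  # scope descriptor: nodes from different roots are incomparable

def may_alias(a, b):
    """Return True unless TBAA can prove the two accesses are disjoint."""
    if a is None or b is None:
        return True          # fallback 2: an unannotated access may alias anything
    if a.root is not b.root:
        return True          # fallback 1: nodes with different roots are incomparable
    return a.name == b.name  # same root: compare types (simplified to equality)

cxx_root, other_root = object(), object()
int_cxx   = TBAANode("int",   cxx_root)
float_cxx = TBAANode("float", cxx_root)
int_other = TBAANode("int",   other_root)

assert not may_alias(int_cxx, float_cxx)  # provably disjoint within one root
assert may_alias(int_cxx, int_other)      # different roots: conservatively alias
assert may_alias(int_cxx, None)           # missing annotation: conservatively alias
```

Fallback 2 is exactly what doesn't generalize: "no annotation" degrades safely for TBAA, but for transition points the absence of a marker cannot be distinguished from a genuine absence of transitions.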

The other reason why a similar scheme doesn't work for !invariant.group is that we rely on calls to launder/strip being present for some constructs to preserve information about the invariance of an object (like in the example from the RFC).
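Why the launder calls must be present can be shown with a toy model. This is purely illustrative (it is not LLVM's representation): `launder_invariant_group` returns a pointer with a fresh identity, so an optimizer that cached "the vptr loaded through p is invariant" cannot reuse a stale vptr after the dynamic type changes.

```python
# Toy model of the barrier role of llvm.launder.invariant.group.
# Illustrative only: Python object identity stands in for SSA value identity.

class Ptr:
    """A pointer value; the object's identity stands for the SSA value."""
    def __init__(self, addr):
        self.addr = addr

def launder_invariant_group(p):
    # Same address, new SSA identity: prior invariant-group facts don't carry over.
    return Ptr(p.addr)

memory = {}      # addr -> current vptr stored there
vptr_cache = {}  # pointer identity -> vptr the optimizer may reuse (fold loads)

def load_vptr(p):
    # A devirtualizer may fold repeated !invariant.group loads through the
    # *same* pointer value, so it caches by pointer identity.
    if p not in vptr_cache:
        vptr_cache[p] = memory[p.addr]
    return vptr_cache[p]

p = Ptr(0x1000)
memory[p.addr] = "Base::vtable"
assert load_vptr(p) == "Base::vtable"

# The dynamic type changes (e.g. placement new); the frontend must launder.
memory[p.addr] = "Derived::vtable"
q = launder_invariant_group(p)

assert load_vptr(q) == "Derived::vtable"  # fresh identity forces a reload
assert load_vptr(p) == "Base::vtable"     # folding through the old identity is stale
```

If the launder call were missing, the last load would legitimately be folded to the stale vptr, which is why merging IR that lacks these markers with IR that relies on them can miscompile.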

I'm really not sure I buy this.  You're effectively saying that you have two points which need to share a common root: 1) the "transition point" and 2) the invariant.group marker.  If it were possible to mark the transition point with a metadata node - is it? - the exact same rules as used for TBAA would work just fine.

p.s. It would help me a lot if you'd spell out specific examples of transition points.  I checked the RFC and don't see them.


On 12/4/18 11:24 AM, John McCall via llvm-dev wrote:

Note that IPO is generally permitted to partially inline or outline code,
and so good-faith optimizations that e.g. require two instructions to be moved
in tandem or not at all must use tokens to establish that unbreakable
relationship.

I think the way you're framing this is dangerous.  We absolutely cannot allow any annotation of this form to *weaken* the semantics of the existing IR.  We can and should impose a criterion that any extension of this variety strictly add information to the IR which might not have been previously inferred.  We can then design rules for how to preserve our new information as long as possible, but framing this in terms of disallowed transformations is really a non-starter.

That's exactly what I was trying to convey here. Authors of good-faith
optimizations need to design their representations so that transformations
that know nothing about their optimizations but merely preserve semantics
and well-formed IR structure will not break their representations. The only
transforms that need to know about the existence of good-faith optimizations
are interprocedural optimizations; furthermore, those optimizations don't
need to know about any good-faith optimizations specifically, they just need
to understand how to correctly update the supported_optimizations list.
That is a very small burden on IPO that enables an interesting class of
language-specific optimizations.

John.


Re: [llvm-dev] RFC: Supported Optimizations attribute

Oleg Smolsky via cfe-dev
In reply to this post by Oleg Smolsky via cfe-dev


On 12/4/18 3:21 PM, John McCall wrote:

On 4 Dec 2018, at 17:50, Philip Reames wrote:

Skimming along, apologies if I'm repeating something which already got said.

If I understand this correctly, the basic problem we're trying to solve is to use a local hint (the invariant.group) to make a global assumption about other code which might exist elsewhere outside the function.  The attribute proposed can basically be phrased as describing a universe of functions within which our desired global property holds.  There's an ambiguity about what is allowed to be assumed about code outside that universe.

I think it's important to note that we have a precedent of something similar to this in TBAA.  TBAA information coming from different modules has the same base problem.  We solve it by using the "root" of the TBAA tree as a scope descriptor, and essentially making two TBAA nodes from distinct roots incomparable.

Can someone explain concisely why a similar scheme couldn't be used to solve this problem?

TBAA is conservative in two ways:
- It allows two accesses to alias if they have TBAA nodes with different roots.
- It allows two accesses to alias if only one of them has a TBAA node.

The second is what doesn't generalize: there are optimizations where you need to
rely on transition points being explicitly identified. Looking at a function
with no identified transition points, you don't know whether it actually doesn't
transition or whether it was compiled without the transitions being explicitly
marked. There's no way to extend the TBAA idea to make that work.

On 12/4/18 11:24 AM, John McCall via llvm-dev wrote:

Note that IPO is generally permitted to partially inline or outline code,
and so good-faith optimizations that e.g. require two instructions to be moved
in tandem or not at all must use tokens to establish that unbreakable
relationship.

I think the way you're framing this is dangerous.  We absolutely cannot allow any annotation of this form to *weaken* the semantics of the existing IR.  We can and should impose a criterion that any extension of this variety strictly add information to the IR which might not have been previously inferred.  We can then design rules for how to preserve our new information as long as possible, but framing this in terms of disallowed transformations is really a non-starter.

That's exactly what I was trying to convey here. Authors of good-faith
optimizations need to design their representations so that transformations
that know nothing about their optimizations but merely preserve semantics
and well-formed IR structure will not break their representations. The only
transforms that need to know about the existence of good-faith optimizations
are interprocedural optimizations; furthermore, those optimizations don't
need to know about any good-faith optimizations specifically, they just need
to understand how to correctly update the supported_optimizations list.
That is a very small burden on IPO that enables an interesting class of
language-specific optimizations.

Two responses:

1) My comment was on *framing*, not substance.  I'm not debating the *semantics* you've proposed (here), just the way they're described.  The way they're described here is very likely to lead to problematic misinterpretation.

2) Reading back through your description again, it really sounds like you've reinvented the rules for metadata with an alternate framing.  The only part which is possibly new is the IPO rules you want to apply.  Worth noting is that we already have existing support for metadata on both instructions and functions.  

If we frame all of this as being *metadata*, then your supported_optimization attribute reduces to the need to define an intersect rule for metadata on functions during inlining and IPO.  Note that we already have precedent for conservative-by-default handling at the instruction level, so extending that to the function scope seems natural.

