C++11 and enhacned devirtualization

classic Classic list List threaded Threaded
34 messages Options
12
Reply | Threaded
Open this post in threaded view
|

C++11 and enhacned devirtualization

Hal Finkel
Hi everyone,

C++11 added features that allow for certain parts of the class hierarchy to be closed, specifically the 'final' keyword and the semantics of anonymous namespaces, and I think we take advantage of these to enhance our ability to perform devirtualization. For example, given this situation:

struct Base {
  virtual void foo() = 0;
};

void external();
struct Final final : Base {
  void foo() {
    external();
  }
};

void dispatch(Base *B) {
  B->foo();
}

void opportunity(Final *F) {
  dispatch(F);
}

When we optimize this code, we do the expected thing and inline 'dispatch' into 'opportunity' but we don't devirtualize the call to foo(). The fact that we know what the vtable of F is at that callsite is not exploited. To a lesser extent, we can do similar things for final virtual methods, and derived classes in anonymous namespaces (because Clang could determine whether or not a class (or method) there is effectively final).

One possibility might be to @llvm.assume to say something about what the vtable ptr of F might be/contain should it be needed later when we emit the initial IR for 'opportunity' (and then teach the optimizer to use that information), but I'm not at all sure that's the best solution. Thoughts?

Thanks again,
Hal

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: C++11 and enhacned devirtualization

Rafael Espíndola
There a quite a few open bugs about devirtualization. A quick search finds

pr18972, pr3100, pr13227, pr15961, pr15963, pr16984, pr17863, pr19545, pr6747.

A fairly important restriction to keep in mind is that the itanium abi
requires some vtables to be exported from a given file (the one with
the key function), but the functions in those vtables don't have to be
exported.

That means that to devirtualize we have to produce a copy, which hits
issues with code that wants avoid #including definitions.

See the commit message of r189852 for more details.

Cheers,
Rafael


On 15 July 2015 at 22:11, Hal Finkel <[hidden email]> wrote:

> Hi everyone,
>
> C++11 added features that allow for certain parts of the class hierarchy to be closed, specifically the 'final' keyword and the semantics of anonymous namespaces, and I think we take advantage of these to enhance our ability to perform devirtualization. For example, given this situation:
>
> struct Base {
>   virtual void foo() = 0;
> };
>
> void external();
> struct Final final : Base {
>   void foo() {
>     external();
>   }
> };
>
> void dispatch(Base *B) {
>   B->foo();
> }
>
> void opportunity(Final *F) {
>   dispatch(F);
> }
>
> When we optimize this code, we do the expected thing and inline 'dispatch' into 'opportunity' but we don't devirtualize the call to foo(). The fact that we know what the vtable of F is at that callsite is not exploited. To a lesser extent, we can do similar things for final virtual methods, and derived classes in anonymous namespaces (because Clang could determine whether or not a class (or method) there is effectively final).
>
> One possibility might be to @llvm.assume to say something about what the vtable ptr of F might be/contain should it be needed later when we emit the initial IR for 'opportunity' (and then teach the optimizer to use that information), but I'm not at all sure that's the best solution. Thoughts?
>
> Thanks again,
> Hal
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
> _______________________________________________
> cfe-dev mailing list
> [hidden email]
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: C++11 and enhacned devirtualization

Hal Finkel
Hi Rafael,

Thanks for the list of bug reports. The situations I've highlighted are indeed PR16984 and PR13227. Do you have any thoughts on a design to fix them? PR16984 mentions that it is likely best to put the necessary information into the IR and let the optimizer take care of the constant propagation to get rid of the indirect call. I agree, this sounds appealing. The question then is what form should that information take. I could do this with @llvm.assume, but I'm somewhat afraid of littering the IR with them in response to a common core language property.

 -Hal

----- Original Message -----

> From: "Rafael Espíndola" <[hidden email]>
> To: "Hal Finkel" <[hidden email]>
> Cc: "[hidden email] Developers" <[hidden email]>, "Richard Smith" <[hidden email]>
> Sent: Thursday, July 16, 2015 12:48:05 AM
> Subject: Re: [cfe-dev] C++11 and enhacned devirtualization
>
> There a quite a few open bugs about devirtualization. A quick search
> finds
>
> pr18972, pr3100, pr13227, pr15961, pr15963, pr16984, pr17863,
> pr19545, pr6747.
>
> A fairly important restriction to keep in mind is that the itanium
> abi
> requires some vtables to be exported from a given file (the one with
> the key function), but the functions in those vtables don't have to
> be
> exported.
>
> That means that to devirtualize we have to produce a copy, which hits
> issues with code that wants avoid #including definitions.
>
> See the commit message of r189852 for more details.
>
> Cheers,
> Rafael
>
>
> On 15 July 2015 at 22:11, Hal Finkel <[hidden email]> wrote:
> > Hi everyone,
> >
> > C++11 added features that allow for certain parts of the class
> > hierarchy to be closed, specifically the 'final' keyword and the
> > semantics of anonymous namespaces, and I think we take advantage
> > of these to enhance our ability to perform devirtualization. For
> > example, given this situation:
> >
> > struct Base {
> >   virtual void foo() = 0;
> > };
> >
> > void external();
> > struct Final final : Base {
> >   void foo() {
> >     external();
> >   }
> > };
> >
> > void dispatch(Base *B) {
> >   B->foo();
> > }
> >
> > void opportunity(Final *F) {
> >   dispatch(F);
> > }
> >
> > When we optimize this code, we do the expected thing and inline
> > 'dispatch' into 'opportunity' but we don't devirtualize the call
> > to foo(). The fact that we know what the vtable of F is at that
> > callsite is not exploited. To a lesser extent, we can do similar
> > things for final virtual methods, and derived classes in anonymous
> > namespaces (because Clang could determine whether or not a class
> > (or method) there is effectively final).
> >
> > One possibility might be to @llvm.assume to say something about
> > what the vtable ptr of F might be/contain should it be needed
> > later when we emit the initial IR for 'opportunity' (and then
> > teach the optimizer to use that information), but I'm not at all
> > sure that's the best solution. Thoughts?
> >
> > Thanks again,
> > Hal
> >
> > --
> > Hal Finkel
> > Assistant Computational Scientist
> > Leadership Computing Facility
> > Argonne National Laboratory
> > _______________________________________________
> > cfe-dev mailing list
> > [hidden email]
> > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: C++11 and enhacned devirtualization

Rafael Espíndola
My preference for cases like PR16984 would be to change how clang
finds the vtable. Instead of doing a load from this, have clang
directly use the _ZTV variable when it is known at compile time.


On 15 July 2015 at 23:12, Hal Finkel <[hidden email]> wrote:

> Hi Rafael,
>
> Thanks for the list of bug reports. The situations I've highlighted are indeed PR16984 and PR13227. Do you have any thoughts on a design to fix them? PR16984 mentions that it is likely best to put the necessary information into the IR and let the optimizer take care of the constant propagation to get rid of the indirect call. I agree, this sounds appealing. The question then is what form should that information take. I could do this with @llvm.assume, but I'm somewhat afraid of littering the IR with them in response to a common core language property.
>
>  -Hal
>
> ----- Original Message -----
>> From: "Rafael Espíndola" <[hidden email]>
>> To: "Hal Finkel" <[hidden email]>
>> Cc: "[hidden email] Developers" <[hidden email]>, "Richard Smith" <[hidden email]>
>> Sent: Thursday, July 16, 2015 12:48:05 AM
>> Subject: Re: [cfe-dev] C++11 and enhacned devirtualization
>>
>> There a quite a few open bugs about devirtualization. A quick search
>> finds
>>
>> pr18972, pr3100, pr13227, pr15961, pr15963, pr16984, pr17863,
>> pr19545, pr6747.
>>
>> A fairly important restriction to keep in mind is that the itanium
>> abi
>> requires some vtables to be exported from a given file (the one with
>> the key function), but the functions in those vtables don't have to
>> be
>> exported.
>>
>> That means that to devirtualize we have to produce a copy, which hits
>> issues with code that wants avoid #including definitions.
>>
>> See the commit message of r189852 for more details.
>>
>> Cheers,
>> Rafael
>>
>>
>> On 15 July 2015 at 22:11, Hal Finkel <[hidden email]> wrote:
>> > Hi everyone,
>> >
>> > C++11 added features that allow for certain parts of the class
>> > hierarchy to be closed, specifically the 'final' keyword and the
>> > semantics of anonymous namespaces, and I think we take advantage
>> > of these to enhance our ability to perform devirtualization. For
>> > example, given this situation:
>> >
>> > struct Base {
>> >   virtual void foo() = 0;
>> > };
>> >
>> > void external();
>> > struct Final final : Base {
>> >   void foo() {
>> >     external();
>> >   }
>> > };
>> >
>> > void dispatch(Base *B) {
>> >   B->foo();
>> > }
>> >
>> > void opportunity(Final *F) {
>> >   dispatch(F);
>> > }
>> >
>> > When we optimize this code, we do the expected thing and inline
>> > 'dispatch' into 'opportunity' but we don't devirtualize the call
>> > to foo(). The fact that we know what the vtable of F is at that
>> > callsite is not exploited. To a lesser extent, we can do similar
>> > things for final virtual methods, and derived classes in anonymous
>> > namespaces (because Clang could determine whether or not a class
>> > (or method) there is effectively final).
>> >
>> > One possibility might be to @llvm.assume to say something about
>> > what the vtable ptr of F might be/contain should it be needed
>> > later when we emit the initial IR for 'opportunity' (and then
>> > teach the optimizer to use that information), but I'm not at all
>> > sure that's the best solution. Thoughts?
>> >
>> > Thanks again,
>> > Hal
>> >
>> > --
>> > Hal Finkel
>> > Assistant Computational Scientist
>> > Leadership Computing Facility
>> > Argonne National Laboratory
>> > _______________________________________________
>> > cfe-dev mailing list
>> > [hidden email]
>> > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: C++11 and enhacned devirtualization

David Blaikie
I believe Piotr Prazek (cc'd) is working with Richard Smith on a proposal/plan for a general device for devirtualization (something like a restricted assume for pointer loads, if I understand it correctly - as we have nonnull and other attributes for other special cases of assume)

On Thu, Jul 16, 2015 at 8:26 AM, Rafael Espíndola <[hidden email]> wrote:
My preference for cases like PR16984 would be to change how clang
finds the vtable. Instead of doing a load from this, have clang
directly use the _ZTV variable when it is known at compile time.


On 15 July 2015 at 23:12, Hal Finkel <[hidden email]> wrote:
> Hi Rafael,
>
> Thanks for the list of bug reports. The situations I've highlighted are indeed PR16984 and PR13227. Do you have any thoughts on a design to fix them? PR16984 mentions that it is likely best to put the necessary information into the IR and let the optimizer take care of the constant propagation to get rid of the indirect call. I agree, this sounds appealing. The question then is what form should that information take. I could do this with @llvm.assume, but I'm somewhat afraid of littering the IR with them in response to a common core language property.
>
>  -Hal
>
> ----- Original Message -----
>> From: "Rafael Espíndola" <[hidden email]>
>> To: "Hal Finkel" <[hidden email]>
>> Cc: "[hidden email] Developers" <[hidden email]>, "Richard Smith" <[hidden email]>
>> Sent: Thursday, July 16, 2015 12:48:05 AM
>> Subject: Re: [cfe-dev] C++11 and enhacned devirtualization
>>
>> There a quite a few open bugs about devirtualization. A quick search
>> finds
>>
>> pr18972, pr3100, pr13227, pr15961, pr15963, pr16984, pr17863,
>> pr19545, pr6747.
>>
>> A fairly important restriction to keep in mind is that the itanium
>> abi
>> requires some vtables to be exported from a given file (the one with
>> the key function), but the functions in those vtables don't have to
>> be
>> exported.
>>
>> That means that to devirtualize we have to produce a copy, which hits
>> issues with code that wants avoid #including definitions.
>>
>> See the commit message of r189852 for more details.
>>
>> Cheers,
>> Rafael
>>
>>
>> On 15 July 2015 at 22:11, Hal Finkel <[hidden email]> wrote:
>> > Hi everyone,
>> >
>> > C++11 added features that allow for certain parts of the class
>> > hierarchy to be closed, specifically the 'final' keyword and the
>> > semantics of anonymous namespaces, and I think we take advantage
>> > of these to enhance our ability to perform devirtualization. For
>> > example, given this situation:
>> >
>> > struct Base {
>> >   virtual void foo() = 0;
>> > };
>> >
>> > void external();
>> > struct Final final : Base {
>> >   void foo() {
>> >     external();
>> >   }
>> > };
>> >
>> > void dispatch(Base *B) {
>> >   B->foo();
>> > }
>> >
>> > void opportunity(Final *F) {
>> >   dispatch(F);
>> > }
>> >
>> > When we optimize this code, we do the expected thing and inline
>> > 'dispatch' into 'opportunity' but we don't devirtualize the call
>> > to foo(). The fact that we know what the vtable of F is at that
>> > callsite is not exploited. To a lesser extent, we can do similar
>> > things for final virtual methods, and derived classes in anonymous
>> > namespaces (because Clang could determine whether or not a class
>> > (or method) there is effectively final).
>> >
>> > One possibility might be to @llvm.assume to say something about
>> > what the vtable ptr of F might be/contain should it be needed
>> > later when we emit the initial IR for 'opportunity' (and then
>> > teach the optimizer to use that information), but I'm not at all
>> > sure that's the best solution. Thoughts?
>> >
>> > Thanks again,
>> > Hal
>> >
>> > --
>> > Hal Finkel
>> > Assistant Computational Scientist
>> > Leadership Computing Facility
>> > Argonne National Laboratory
>> > _______________________________________________
>> > cfe-dev mailing list
>> > [hidden email]
>> > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: C++11 and enhacned devirtualization

Hal Finkel
In reply to this post by Rafael Espíndola
----- Original Message -----
> From: "Rafael Espíndola" <[hidden email]>
> To: "Hal Finkel" <[hidden email]>
> Cc: "[hidden email] Developers" <[hidden email]>, "Richard Smith" <[hidden email]>
> Sent: Thursday, July 16, 2015 10:26:13 AM
> Subject: Re: [cfe-dev] C++11 and enhacned devirtualization
>
> My preference for cases like PR16984 would be to change how clang
> finds the vtable. Instead of doing a load from this, have clang
> directly use the _ZTV variable when it is known at compile time.

Agreed.

Correction to my previous message: When I said, "PR16984 mentions..." below, I should have said, "PR13227 mentions...".

 -Hal

>
>
> On 15 July 2015 at 23:12, Hal Finkel <[hidden email]> wrote:
> > Hi Rafael,
> >
> > Thanks for the list of bug reports. The situations I've highlighted
> > are indeed PR16984 and PR13227. Do you have any thoughts on a
> > design to fix them? PR16984 mentions that it is likely best to put
> > the necessary information into the IR and let the optimizer take
> > care of the constant propagation to get rid of the indirect call.
> > I agree, this sounds appealing. The question then is what form
> > should that information take. I could do this with @llvm.assume,
> > but I'm somewhat afraid of littering the IR with them in response
> > to a common core language property.
> >
> >  -Hal
> >
> > ----- Original Message -----
> >> From: "Rafael Espíndola" <[hidden email]>
> >> To: "Hal Finkel" <[hidden email]>
> >> Cc: "[hidden email] Developers" <[hidden email]>,
> >> "Richard Smith" <[hidden email]>
> >> Sent: Thursday, July 16, 2015 12:48:05 AM
> >> Subject: Re: [cfe-dev] C++11 and enhacned devirtualization
> >>
> >> There a quite a few open bugs about devirtualization. A quick
> >> search
> >> finds
> >>
> >> pr18972, pr3100, pr13227, pr15961, pr15963, pr16984, pr17863,
> >> pr19545, pr6747.
> >>
> >> A fairly important restriction to keep in mind is that the itanium
> >> abi
> >> requires some vtables to be exported from a given file (the one
> >> with
> >> the key function), but the functions in those vtables don't have
> >> to
> >> be
> >> exported.
> >>
> >> That means that to devirtualize we have to produce a copy, which
> >> hits
> >> issues with code that wants avoid #including definitions.
> >>
> >> See the commit message of r189852 for more details.
> >>
> >> Cheers,
> >> Rafael
> >>
> >>
> >> On 15 July 2015 at 22:11, Hal Finkel <[hidden email]> wrote:
> >> > Hi everyone,
> >> >
> >> > C++11 added features that allow for certain parts of the class
> >> > hierarchy to be closed, specifically the 'final' keyword and the
> >> > semantics of anonymous namespaces, and I think we take advantage
> >> > of these to enhance our ability to perform devirtualization. For
> >> > example, given this situation:
> >> >
> >> > struct Base {
> >> >   virtual void foo() = 0;
> >> > };
> >> >
> >> > void external();
> >> > struct Final final : Base {
> >> >   void foo() {
> >> >     external();
> >> >   }
> >> > };
> >> >
> >> > void dispatch(Base *B) {
> >> >   B->foo();
> >> > }
> >> >
> >> > void opportunity(Final *F) {
> >> >   dispatch(F);
> >> > }
> >> >
> >> > When we optimize this code, we do the expected thing and inline
> >> > 'dispatch' into 'opportunity' but we don't devirtualize the call
> >> > to foo(). The fact that we know what the vtable of F is at that
> >> > callsite is not exploited. To a lesser extent, we can do similar
> >> > things for final virtual methods, and derived classes in
> >> > anonymous
> >> > namespaces (because Clang could determine whether or not a class
> >> > (or method) there is effectively final).
> >> >
> >> > One possibility might be to @llvm.assume to say something about
> >> > what the vtable ptr of F might be/contain should it be needed
> >> > later when we emit the initial IR for 'opportunity' (and then
> >> > teach the optimizer to use that information), but I'm not at all
> >> > sure that's the best solution. Thoughts?
> >> >
> >> > Thanks again,
> >> > Hal
> >> >
> >> > --
> >> > Hal Finkel
> >> > Assistant Computational Scientist
> >> > Leadership Computing Facility
> >> > Argonne National Laboratory
> >> > _______________________________________________
> >> > cfe-dev mailing list
> >> > [hidden email]
> >> > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
> >>
> >
> > --
> > Hal Finkel
> > Assistant Computational Scientist
> > Leadership Computing Facility
> > Argonne National Laboratory
>

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: C++11 and enhacned devirtualization

Piotr Padlewski
In reply to this post by David Blaikie
The proposal of our solution will be published next week. Brief introduction:

"Basic idea is to mark vtable loads with !invariant , and also put assume() after calling constructor to show how vtable will look like. Function @llvm.invariant.barrier that breaks the invariant will be needed to cope with placement new and other cases where the vptr is legitimately permitted to change."

We also want to make some improvements on generating vtables as avaliable_externally - for example when there are no virtual inline functions

Piotr

On Thu, Jul 16, 2015 at 9:11 AM, David Blaikie <[hidden email]> wrote:
I believe Piotr Prazek (cc'd) is working with Richard Smith on a proposal/plan for a general device for devirtualization (something like a restricted assume for pointer loads, if I understand it correctly - as we have nonnull and other attributes for other special cases of assume)

On Thu, Jul 16, 2015 at 8:26 AM, Rafael Espíndola <[hidden email]> wrote:
My preference for cases like PR16984 would be to change how clang
finds the vtable. Instead of doing a load from this, have clang
directly use the _ZTV variable when it is known at compile time.


On 15 July 2015 at 23:12, Hal Finkel <[hidden email]> wrote:
> Hi Rafael,
>
> Thanks for the list of bug reports. The situations I've highlighted are indeed PR16984 and PR13227. Do you have any thoughts on a design to fix them? PR16984 mentions that it is likely best to put the necessary information into the IR and let the optimizer take care of the constant propagation to get rid of the indirect call. I agree, this sounds appealing. The question then is what form should that information take. I could do this with @llvm.assume, but I'm somewhat afraid of littering the IR with them in response to a common core language property.
>
>  -Hal
>
> ----- Original Message -----
>> From: "Rafael Espíndola" <[hidden email]>
>> To: "Hal Finkel" <[hidden email]>
>> Cc: "[hidden email] Developers" <[hidden email]>, "Richard Smith" <[hidden email]>
>> Sent: Thursday, July 16, 2015 12:48:05 AM
>> Subject: Re: [cfe-dev] C++11 and enhacned devirtualization
>>
>> There a quite a few open bugs about devirtualization. A quick search
>> finds
>>
>> pr18972, pr3100, pr13227, pr15961, pr15963, pr16984, pr17863,
>> pr19545, pr6747.
>>
>> A fairly important restriction to keep in mind is that the itanium
>> abi
>> requires some vtables to be exported from a given file (the one with
>> the key function), but the functions in those vtables don't have to
>> be
>> exported.
>>
>> That means that to devirtualize we have to produce a copy, which hits
>> issues with code that wants avoid #including definitions.
>>
>> See the commit message of r189852 for more details.
>>
>> Cheers,
>> Rafael
>>
>>
>> On 15 July 2015 at 22:11, Hal Finkel <[hidden email]> wrote:
>> > Hi everyone,
>> >
>> > C++11 added features that allow for certain parts of the class
>> > hierarchy to be closed, specifically the 'final' keyword and the
>> > semantics of anonymous namespaces, and I think we take advantage
>> > of these to enhance our ability to perform devirtualization. For
>> > example, given this situation:
>> >
>> > struct Base {
>> >   virtual void foo() = 0;
>> > };
>> >
>> > void external();
>> > struct Final final : Base {
>> >   void foo() {
>> >     external();
>> >   }
>> > };
>> >
>> > void dispatch(Base *B) {
>> >   B->foo();
>> > }
>> >
>> > void opportunity(Final *F) {
>> >   dispatch(F);
>> > }
>> >
>> > When we optimize this code, we do the expected thing and inline
>> > 'dispatch' into 'opportunity' but we don't devirtualize the call
>> > to foo(). The fact that we know what the vtable of F is at that
>> > callsite is not exploited. To a lesser extent, we can do similar
>> > things for final virtual methods, and derived classes in anonymous
>> > namespaces (because Clang could determine whether or not a class
>> > (or method) there is effectively final).
>> >
>> > One possibility might be to @llvm.assume to say something about
>> > what the vtable ptr of F might be/contain should it be needed
>> > later when we emit the initial IR for 'opportunity' (and then
>> > teach the optimizer to use that information), but I'm not at all
>> > sure that's the best solution. Thoughts?
>> >
>> > Thanks again,
>> > Hal
>> >
>> > --
>> > Hal Finkel
>> > Assistant Computational Scientist
>> > Leadership Computing Facility
>> > Argonne National Laboratory
>> > _______________________________________________
>> > cfe-dev mailing list
>> > [hidden email]
>> > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev



_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: C++11 and enhacned devirtualization

Rafael Espíndola
On 16 July 2015 at 10:02, Piotr Padlewski <[hidden email]> wrote:
> The proposal of our solution will be published next week. Brief
> introduction:
>
> "Basic idea is to mark vtable loads with !invariant , and also put assume()
> after calling constructor to show how vtable will look like. Function
> @llvm.invariant.barrier that breaks the invariant will be needed to cope
> with placement new and other cases where the vptr is legitimately permitted
> to change."


Nice, this should also avoid loading the vtable pointer multiple times in

struct C {
  virtual void foo();
};
void g();
void f(C *x) {
  x->foo();
  g();
  x->foo();
}

Cheers,
Rafael
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: C++11 and enhacned devirtualization

Hal Finkel
In reply to this post by Piotr Padlewski
----- Original Message -----

> From: "Piotr Padlewski" <[hidden email]>
> To: "David Blaikie" <[hidden email]>
> Cc: "Rafael Espíndola" <[hidden email]>, "Hal Finkel" <[hidden email]>, "Richard Smith"
> <[hidden email]>, "[hidden email] Developers" <[hidden email]>
> Sent: Thursday, July 16, 2015 12:02:47 PM
> Subject: Re: [cfe-dev] C++11 and enhacned devirtualization
>
> The proposal of our solution will be published next week. Brief
> introduction:
>
> "Basic idea is to mark vtable loads with !invariant , and also put
> assume() after calling constructor to show how vtable will look
> like.

Interesting, thanks. Just putting these after the constructor call would not, however, handle the case I outlined in my original e-mail.

> Function @llvm.invariant.barrier that breaks the invariant
> will be needed to cope with placement new and other cases where the
> vptr is legitimately permitted to change."

We already have @llvm.invariant.start/@llvm.invariant.end, could those be used instead? (No need to address this now -- just address this in the proposal when you make it.)

Thanks,
Hal

>
> We also want to make some improvements on generating vtables as
> avaliable_externally - for example when there are no virtual inline
> functions
>
>
> Piotr
>
>
> On Thu, Jul 16, 2015 at 9:11 AM, David Blaikie < [hidden email] >
> wrote:
>
>
>
> I believe Piotr Prazek (cc'd) is working with Richard Smith on a
> proposal/plan for a general device for devirtualization (something
> like a restricted assume for pointer loads, if I understand it
> correctly - as we have nonnull and other attributes for other
> special cases of assume)
>
>
>
> On Thu, Jul 16, 2015 at 8:26 AM, Rafael Espíndola <
> [hidden email] > wrote:
>
>
> My preference for cases like PR16984 would be to change how clang
> finds the vtable. Instead of doing a load from this, have clang
> directly use the _ZTV variable when it is known at compile time.
>
>
>
>
> On 15 July 2015 at 23:12, Hal Finkel < [hidden email] > wrote:
> > Hi Rafael,
> >
> > Thanks for the list of bug reports. The situations I've highlighted
> > are indeed PR16984 and PR13227. Do you have any thoughts on a
> > design to fix them? PR16984 mentions that it is likely best to put
> > the necessary information into the IR and let the optimizer take
> > care of the constant propagation to get rid of the indirect call.
> > I agree, this sounds appealing. The question then is what form
> > should that information take. I could do this with @llvm.assume,
> > but I'm somewhat afraid of littering the IR with them in response
> > to a common core language property.
> >
> > -Hal
> >
> > ----- Original Message -----
> >> From: "Rafael Espíndola" < [hidden email] >
> >> To: "Hal Finkel" < [hidden email] >
> >> Cc: " [hidden email] Developers" < [hidden email] >,
> >> "Richard Smith" < [hidden email] >
> >> Sent: Thursday, July 16, 2015 12:48:05 AM
> >> Subject: Re: [cfe-dev] C++11 and enhacned devirtualization
> >>
> >> There a quite a few open bugs about devirtualization. A quick
> >> search
> >> finds
> >>
> >> pr18972, pr3100, pr13227, pr15961, pr15963, pr16984, pr17863,
> >> pr19545, pr6747.
> >>
> >> A fairly important restriction to keep in mind is that the itanium
> >> abi
> >> requires some vtables to be exported from a given file (the one
> >> with
> >> the key function), but the functions in those vtables don't have
> >> to
> >> be
> >> exported.
> >>
> >> That means that to devirtualize we have to produce a copy, which
> >> hits
> >> issues with code that wants avoid #including definitions.
> >>
> >> See the commit message of r189852 for more details.
> >>
> >> Cheers,
> >> Rafael
> >>
> >>
> >> On 15 July 2015 at 22:11, Hal Finkel < [hidden email] > wrote:
> >> > Hi everyone,
> >> >
> >> > C++11 added features that allow for certain parts of the class
> >> > hierarchy to be closed, specifically the 'final' keyword and the
> >> > semantics of anonymous namespaces, and I think we take advantage
> >> > of these to enhance our ability to perform devirtualization. For
> >> > example, given this situation:
> >> >
> >> > struct Base {
> >> > virtual void foo() = 0;
> >> > };
> >> >
> >> > void external();
> >> > struct Final final : Base {
> >> > void foo() {
> >> > external();
> >> > }
> >> > };
> >> >
> >> > void dispatch(Base *B) {
> >> > B->foo();
> >> > }
> >> >
> >> > void opportunity(Final *F) {
> >> > dispatch(F);
> >> > }
> >> >
> >> > When we optimize this code, we do the expected thing and inline
> >> > 'dispatch' into 'opportunity' but we don't devirtualize the call
> >> > to foo(). The fact that we know what the vtable of F is at that
> >> > callsite is not exploited. To a lesser extent, we can do similar
> >> > things for final virtual methods, and derived classes in
> >> > anonymous
> >> > namespaces (because Clang could determine whether or not a class
> >> > (or method) there is effectively final).
> >> >
> >> > One possibility might be to @llvm.assume to say something about
> >> > what the vtable ptr of F might be/contain should it be needed
> >> > later when we emit the initial IR for 'opportunity' (and then
> >> > teach the optimizer to use that information), but I'm not at all
> >> > sure that's the best solution. Thoughts?
> >> >
> >> > Thanks again,
> >> > Hal
> >> >
> >> > --
> >> > Hal Finkel
> >> > Assistant Computational Scientist
> >> > Leadership Computing Facility
> >> > Argonne National Laboratory
> >> > _______________________________________________
> >> > cfe-dev mailing list
> >> > [hidden email]
> >> > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
> >>
> >
> > --
> > Hal Finkel
> > Assistant Computational Scientist
> > Leadership Computing Facility
> > Argonne National Laboratory
>
> _______________________________________________
> cfe-dev mailing list
> [hidden email]
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>
>
>

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: C++11 and enhacned devirtualization

Hal Finkel
In reply to this post by Rafael Espíndola
----- Original Message -----

> From: "Rafael Espíndola" <[hidden email]>
> To: "Piotr Padlewski" <[hidden email]>
> Cc: "David Blaikie" <[hidden email]>, "Hal Finkel" <[hidden email]>, "Richard Smith" <[hidden email]>,
> "[hidden email] Developers" <[hidden email]>
> Sent: Thursday, July 16, 2015 12:14:29 PM
> Subject: Re: [cfe-dev] C++11 and enhacned devirtualization
>
> On 16 July 2015 at 10:02, Piotr Padlewski <[hidden email]> wrote:
> > The proposal of our solution will be published next week. Brief
> > introduction:
> >
> > "Basic idea is to mark vtable loads with !invariant , and also put
> > assume()
> > after calling constructor to show how vtable will look like.
> > Function
> > @llvm.invariant.barrier that breaks the invariant will be needed to
> > cope
> > with placement new and other cases where the vptr is legitimately
> > permitted
> > to change."
>
>
> Nice, this should also avoid loading the vtable pointer multiple
> times in
>
> struct C {
>   virtual void foo();
> };
> void g();
> void f(C *x) {
>   x->foo();
>   g();

Not if g() might placement-new something else on top of x. The problem is that if we need to make conservative assumptions about whether or not g() might contain a '@llvm.invariant.barrier', we must assume it might.

 -Hal

>   x->foo();
> }
>
> Cheers,
> Rafael
>

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: C++11 and enhacned devirtualization

Piotr Padlewski
It can, but it will be UB, so we don't have to worry about it.

On Thu, Jul 16, 2015 at 10:16 AM, Hal Finkel <[hidden email]> wrote:
----- Original Message -----
> From: "Rafael Espíndola" <[hidden email]>
> To: "Piotr Padlewski" <[hidden email]>
> Cc: "David Blaikie" <[hidden email]>, "Hal Finkel" <[hidden email]>, "Richard Smith" <[hidden email]>,
> "[hidden email] Developers" <[hidden email]>
> Sent: Thursday, July 16, 2015 12:14:29 PM
> Subject: Re: [cfe-dev] C++11 and enhacned devirtualization
>
> On 16 July 2015 at 10:02, Piotr Padlewski <[hidden email]> wrote:
> > The proposal of our solution will be published next week. Brief
> > introduction:
> >
> > "Basic idea is to mark vtable loads with !invariant , and also put
> > assume()
> > after calling constructor to show how vtable will look like.
> > Function
> > @llvm.invariant.barrier that breaks the invariant will be needed to
> > cope
> > with placement new and other cases where the vptr is legitimately
> > permitted
> > to change."
>
>
> Nice, this should also avoid loading the vtable pointer multiple
> times in
>
> struct C {
>   virtual void foo();
> };
> void g();
> void f(C *x) {
>   x->foo();
>   g();

Not if g() might placement-new something else on top of x. The problem is that if we need to make conservative assumptions about whether or not g() might contain a '@llvm.invariant.barrier', we must assume it might.

 -Hal

>   x->foo();
> }
>
> Cheers,
> Rafael
>

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: C++11 and enhacned devirtualization

Hal Finkel
----- Original Message -----
> From: "Piotr Padlewski" <[hidden email]>
> To: "Hal Finkel" <[hidden email]>
> Cc: "Rafael Espíndola" <[hidden email]>, "David Blaikie" <[hidden email]>, "Richard Smith"
> <[hidden email]>, "[hidden email] Developers" <[hidden email]>
> Sent: Thursday, July 16, 2015 12:19:40 PM
> Subject: Re: [cfe-dev] C++11 and enhacned devirtualization
>
>
> It can, but it will be UB, so we don't have to worry about it.

Why is it UB? Or when it the placement new changing the vtable not UB?

 -Hal

>
>
> On Thu, Jul 16, 2015 at 10:16 AM, Hal Finkel < [hidden email] >
> wrote:
>
>
> ----- Original Message -----
> > From: "Rafael Espíndola" < [hidden email] >
> > To: "Piotr Padlewski" < [hidden email] >
> > Cc: "David Blaikie" < [hidden email] >, "Hal Finkel" <
> > [hidden email] >, "Richard Smith" < [hidden email] >,
> > " [hidden email] Developers" < [hidden email] >
> > Sent: Thursday, July 16, 2015 12:14:29 PM
> > Subject: Re: [cfe-dev] C++11 and enhacned devirtualization
> >
> > On 16 July 2015 at 10:02, Piotr Padlewski < [hidden email] >
> > wrote:
> > > The proposal of our solution will be published next week. Brief
> > > introduction:
> > >
> > > "Basic idea is to mark vtable loads with !invariant , and also
> > > put
> > > assume()
> > > after calling constructor to show how vtable will look like.
> > > Function
> > > @llvm.invariant.barrier that breaks the invariant will be needed
> > > to
> > > cope
> > > with placement new and other cases where the vptr is legitimately
> > > permitted
> > > to change."
> >
> >
> > Nice, this should also avoid loading the vtable pointer multiple
> > times in
> >
> > struct C {
> > virtual void foo();
> > };
> > void g();
> > void f(C *x) {
> > x->foo();
> > g();
>
> Not if g() might placement-new something else on top of x. The
> problem is that if we need to make conservative assumptions about
> whether or not g() might contain a '@llvm.invariant.barrier', we
> must assume it might.
>
> -Hal
>
> > x->foo();
> > }
> >
> > Cheers,
> > Rafael
>
>
> >
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
>
>

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: C++11 and enhacned devirtualization

Piotr Padlewski
Because standard say, that when You put new object into previous object (f.e. by placement new) the lifetime of an object has ended, and using previous pointer to it leads to UB.

see c++11 standard, 3.8 Object lifetime



On Thu, Jul 16, 2015 at 10:26 AM, Hal Finkel <[hidden email]> wrote:
----- Original Message -----
> From: "Piotr Padlewski" <[hidden email]>
> To: "Hal Finkel" <[hidden email]>
> Cc: "Rafael Espíndola" <[hidden email]>, "David Blaikie" <[hidden email]>, "Richard Smith"
> <[hidden email]>, "[hidden email] Developers" <[hidden email]>
> Sent: Thursday, July 16, 2015 12:19:40 PM
> Subject: Re: [cfe-dev] C++11 and enhacned devirtualization
>
>
> It can, but it will be UB, so we don't have to worry about it.

Why is it UB? Or when it the placement new changing the vtable not UB?

 -Hal

>
>
> On Thu, Jul 16, 2015 at 10:16 AM, Hal Finkel < [hidden email] >
> wrote:
>
>
> ----- Original Message -----
> > From: "Rafael Espíndola" < [hidden email] >
> > To: "Piotr Padlewski" < [hidden email] >
> > Cc: "David Blaikie" < [hidden email] >, "Hal Finkel" <
> > [hidden email] >, "Richard Smith" < [hidden email] >,
> > " [hidden email] Developers" < [hidden email] >
> > Sent: Thursday, July 16, 2015 12:14:29 PM
> > Subject: Re: [cfe-dev] C++11 and enhacned devirtualization
> >
> > On 16 July 2015 at 10:02, Piotr Padlewski < [hidden email] >
> > wrote:
> > > The proposal of our solution will be published next week. Brief
> > > introduction:
> > >
> > > "Basic idea is to mark vtable loads with !invariant , and also
> > > put
> > > assume()
> > > after calling constructor to show how vtable will look like.
> > > Function
> > > @llvm.invariant.barrier that breaks the invariant will be needed
> > > to
> > > cope
> > > with placement new and other cases where the vptr is legitimately
> > > permitted
> > > to change."
> >
> >
> > Nice, this should also avoid loading the vtable pointer multiple
> > times in
> >
> > struct C {
> > virtual void foo();
> > };
> > void g();
> > void f(C *x) {
> > x->foo();
> > g();
>
> Not if g() might placement-new something else on top of x. The
> problem is that if we need to make conservative assumptions about
> whether or not g() might contain a '@llvm.invariant.barrier', we
> must assume it might.
>
> -Hal
>
> > x->foo();
> > }
> >
> > Cheers,
> > Rafael
>
>
> >
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
>
>

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: C++11 and enhacned devirtualization

Hal Finkel
----- Original Message -----

> From: "Piotr Padlewski" <[hidden email]>
> To: "Hal Finkel" <[hidden email]>
> Cc: "Rafael Espíndola" <[hidden email]>, "David Blaikie" <[hidden email]>, "Richard Smith"
> <[hidden email]>, "[hidden email] Developers" <[hidden email]>
> Sent: Thursday, July 16, 2015 12:30:14 PM
> Subject: Re: [cfe-dev] C++11 and enhacned devirtualization
>
>
> Because standard say, that when You put new object into previous
> object (f.e. by placement new) the lifetime of an object has ended,
> and using previous pointer to it leads to UB.
>
>
> see c++11 standard, 3.8 Object lifetime
>

Ah, right. Thanks!

 -Hal

>
> On Thu, Jul 16, 2015 at 10:26 AM, Hal Finkel < [hidden email] >
> wrote:
>
>
> ----- Original Message -----
> > From: "Piotr Padlewski" < [hidden email] >
> > To: "Hal Finkel" < [hidden email] >
> > Cc: "Rafael Espíndola" < [hidden email] >, "David
> > Blaikie" < [hidden email] >, "Richard Smith"
> > < [hidden email] >, " [hidden email] Developers" <
> > [hidden email] >
> > Sent: Thursday, July 16, 2015 12:19:40 PM
> > Subject: Re: [cfe-dev] C++11 and enhacned devirtualization
> >
> >
> > It can, but it will be UB, so we don't have to worry about it.
>
> Why is it UB? Or when it the placement new changing the vtable not
> UB?
>
> -Hal
>
> >
> >
> > On Thu, Jul 16, 2015 at 10:16 AM, Hal Finkel < [hidden email] >
> > wrote:
> >
> >
> > ----- Original Message -----
> > > From: "Rafael Espíndola" < [hidden email] >
> > > To: "Piotr Padlewski" < [hidden email] >
> > > Cc: "David Blaikie" < [hidden email] >, "Hal Finkel" <
> > > [hidden email] >, "Richard Smith" < [hidden email] >,
> > > " [hidden email] Developers" < [hidden email] >
> > > Sent: Thursday, July 16, 2015 12:14:29 PM
> > > Subject: Re: [cfe-dev] C++11 and enhacned devirtualization
> > >
> > > On 16 July 2015 at 10:02, Piotr Padlewski < [hidden email] >
>
>
> > > wrote:
> > > > The proposal of our solution will be published next week. Brief
> > > > introduction:
> > > >
> > > > "Basic idea is to mark vtable loads with !invariant , and also
> > > > put
> > > > assume()
> > > > after calling constructor to show how vtable will look like.
> > > > Function
> > > > @llvm.invariant.barrier that breaks the invariant will be
> > > > needed
> > > > to
> > > > cope
> > > > with placement new and other cases where the vptr is
> > > > legitimately
> > > > permitted
> > > > to change."
> > >
> > >
> > > Nice, this should also avoid loading the vtable pointer multiple
> > > times in
> > >
> > > struct C {
> > > virtual void foo();
> > > };
> > > void g();
> > > void f(C *x) {
> > > x->foo();
> > > g();
> >
> > Not if g() might placement-new something else on top of x. The
> > problem is that if we need to make conservative assumptions about
> > whether or not g() might contain a '@llvm.invariant.barrier', we
> > must assume it might.
> >
> > -Hal
> >
> > > x->foo();
> > > }
> > >
> > > Cheers,
> > > Rafael
> >
> >
> > >
> >
> > --
> > Hal Finkel
> > Assistant Computational Scientist
> > Leadership Computing Facility
> > Argonne National Laboratory
> >
> >
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
>
>

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: C++11 and enhacned devirtualization

John McCall
In reply to this post by Hal Finkel
> On Jul 15, 2015, at 10:11 PM, Hal Finkel <[hidden email]> wrote:
>
> Hi everyone,
>
> C++11 added features that allow for certain parts of the class hierarchy to be closed, specifically the 'final' keyword and the semantics of anonymous namespaces, and I think we take advantage of these to enhance our ability to perform devirtualization. For example, given this situation:
>
> struct Base {
>  virtual void foo() = 0;
> };
>
> void external();
> struct Final final : Base {
>  void foo() {
>    external();
>  }
> };
>
> void dispatch(Base *B) {
>  B->foo();
> }
>
> void opportunity(Final *F) {
>  dispatch(F);
> }
>
> When we optimize this code, we do the expected thing and inline 'dispatch' into 'opportunity' but we don't devirtualize the call to foo(). The fact that we know what the vtable of F is at that callsite is not exploited. To a lesser extent, we can do similar things for final virtual methods, and derived classes in anonymous namespaces (because Clang could determine whether or not a class (or method) there is effectively final).
>
> One possibility might be to @llvm.assume to say something about what the vtable ptr of F might be/contain should it be needed later when we emit the initial IR for 'opportunity' (and then teach the optimizer to use that information), but I'm not at all sure that's the best solution. Thoughts?

The problem with any sort of @llvm.assume-encoded information about memory contents is that C++ does actually allow you to replace objects in memory, up to and including stuff like:

{
  MyClass c;

  // Reuse the storage temporarily.  UB to access the object through ‘c’ now.
  c.~MyClass();
  auto c2 = new (&c) MyOtherClass();

  // The storage has to contain a ‘MyClass’ when it goes out of scope.
  c2->~MyOtherClass();
  new (&c) MyClass();
}

The standard frontend devirtualization optimizations are permitted under a couple of different language rules, specifically that:
1. If you access an object through an l-value of a type, it has to dynamically be an object of that type (potentially a subobject).
2. Object replacement as above only “forwards” existing formal references under specific conditions, e.g. the dynamic type has to be the same, ‘const’ members have to have the same value, etc.  Using an unforwarded reference (like the name of the local variable ‘c’ above) doesn’t formally refer to a valid object and thus has undefined behavior.

You can apply those rules much more broadly than the frontend does, of course; but those are the language tools you get.

John.
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: C++11 and enhacned devirtualization

Richard Smith
On Thu, Jul 16, 2015 at 11:29 AM, John McCall <[hidden email]> wrote:
> On Jul 15, 2015, at 10:11 PM, Hal Finkel <[hidden email]> wrote:
>
> Hi everyone,
>
> C++11 added features that allow for certain parts of the class hierarchy to be closed, specifically the 'final' keyword and the semantics of anonymous namespaces, and I think we take advantage of these to enhance our ability to perform devirtualization. For example, given this situation:
>
> struct Base {
>  virtual void foo() = 0;
> };
>
> void external();
> struct Final final : Base {
>  void foo() {
>    external();
>  }
> };
>
> void dispatch(Base *B) {
>  B->foo();
> }
>
> void opportunity(Final *F) {
>  dispatch(F);
> }
>
> When we optimize this code, we do the expected thing and inline 'dispatch' into 'opportunity' but we don't devirtualize the call to foo(). The fact that we know what the vtable of F is at that callsite is not exploited. To a lesser extent, we can do similar things for final virtual methods, and derived classes in anonymous namespaces (because Clang could determine whether or not a class (or method) there is effectively final).
>
> One possibility might be to @llvm.assume to say something about what the vtable ptr of F might be/contain should it be needed later when we emit the initial IR for 'opportunity' (and then teach the optimizer to use that information), but I'm not at all sure that's the best solution. Thoughts?

The problem with any sort of @llvm.assume-encoded information about memory contents is that C++ does actually allow you to replace objects in memory, up to and including stuff like:

{
  MyClass c;

  // Reuse the storage temporarily.  UB to access the object through ‘c’ now.
  c.~MyClass();
  auto c2 = new (&c) MyOtherClass();

  // The storage has to contain a ‘MyClass’ when it goes out of scope.
  c2->~MyOtherClass();
  new (&c) MyClass();
}

The standard frontend devirtualization optimizations are permitted under a couple of different language rules, specifically that:
1. If you access an object through an l-value of a type, it has to dynamically be an object of that type (potentially a subobject).
2. Object replacement as above only “forwards” existing formal references under specific conditions, e.g. the dynamic type has to be the same, ‘const’ members have to have the same value, etc.  Using an unforwarded reference (like the name of the local variable ‘c’ above) doesn’t formally refer to a valid object and thus has undefined behavior.

You can apply those rules much more broadly than the frontend does, of course; but those are the language tools you get.

Right. Our current plan for modelling this is:

1) Change the meaning of the existing !invariant.load metadata (or add another parallel metadata kind) so that it allows load-load forwarding (even if the memory is not known to be unmodified between the loads) if:
  a) both loads have !invariant.load metadata with the same operand, and
  b) the pointer operands are the same SSA value (being must-alias is not sufficient)
2) Add a new intrinsic "i8* @llvm.invariant.barrier(i8*)" that produces a new pointer that is different for the purpose of !invariant.load. (Some other optimizations are permitted to look through the barrier.)

In particular, "new (&c) MyOtherClass()" would be emitted as something like this:

  %1 = call @operator new(size, %c)
  %2 = call @llvm.invariant.barrier(%1)
  call @MyOtherClass::MyOtherClass(%2)
  %vptr = load %2
  %known.vptr = icmp eq %vptr, @MyOtherClass::vptr, !invariant.load !MyBaseClass.vptr
  call @llvm.assume(%known.vptr)

-- Richard

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: C++11 and enhacned devirtualization

Hal Finkel
----- Original Message -----

> From: "Richard Smith" <[hidden email]>
> To: "John McCall" <[hidden email]>
> Cc: "Hal Finkel" <[hidden email]>, "[hidden email] Developers" <[hidden email]>, "chandlerc"
> <[hidden email]>
> Sent: Thursday, July 16, 2015 1:46:05 PM
> Subject: Re: C++11 and enhacned devirtualization
>
>
>
>
> On Thu, Jul 16, 2015 at 11:29 AM, John McCall < [hidden email] >
> wrote:
>
>
> > On Jul 15, 2015, at 10:11 PM, Hal Finkel < [hidden email] > wrote:
> >
> > Hi everyone,
> >
> > C++11 added features that allow for certain parts of the class
> > hierarchy to be closed, specifically the 'final' keyword and the
> > semantics of anonymous namespaces, and I think we take advantage
> > of these to enhance our ability to perform devirtualization. For
> > example, given this situation:
> >
> > struct Base {
> > virtual void foo() = 0;
> > };
> >
> > void external();
> > struct Final final : Base {
> > void foo() {
> > external();
> > }
> > };
> >
> > void dispatch(Base *B) {
> > B->foo();
> > }
> >
> > void opportunity(Final *F) {
> > dispatch(F);
> > }
> >
> > When we optimize this code, we do the expected thing and inline
> > 'dispatch' into 'opportunity' but we don't devirtualize the call
> > to foo(). The fact that we know what the vtable of F is at that
> > callsite is not exploited. To a lesser extent, we can do similar
> > things for final virtual methods, and derived classes in anonymous
> > namespaces (because Clang could determine whether or not a class
> > (or method) there is effectively final).
> >
> > One possibility might be to @llvm.assume to say something about
> > what the vtable ptr of F might be/contain should it be needed
> > later when we emit the initial IR for 'opportunity' (and then
> > teach the optimizer to use that information), but I'm not at all
> > sure that's the best solution. Thoughts?
>
> The problem with any sort of @llvm.assume-encoded information about
> memory contents is that C++ does actually allow you to replace
> objects in memory, up to and including stuff like:
>
> {
> MyClass c;
>
> // Reuse the storage temporarily. UB to access the object through ‘c’
> now.
> c.~MyClass();
> auto c2 = new (&c) MyOtherClass();
>
> // The storage has to contain a ‘MyClass’ when it goes out of scope.
> c2->~MyOtherClass();
> new (&c) MyClass();
> }
>
> The standard frontend devirtualization optimizations are permitted
> under a couple of different language rules, specifically that:
> 1. If you access an object through an l-value of a type, it has to
> dynamically be an object of that type (potentially a subobject).
> 2. Object replacement as above only “forwards” existing formal
> references under specific conditions, e.g. the dynamic type has to
> be the same, ‘const’ members have to have the same value, etc. Using
> an unforwarded reference (like the name of the local variable ‘c’
> above) doesn’t formally refer to a valid object and thus has
> undefined behavior.
>
> You can apply those rules much more broadly than the frontend does,
> of course; but those are the language tools you get.
>
>
> Right. Our current plan for modelling this is:
>
>
> 1) Change the meaning of the existing !invariant.load metadata (or
> add another parallel metadata kind) so that it allows load-load
> forwarding (even if the memory is not known to be unmodified between
> the loads) if:
> a) both loads have !invariant.load metadata with the same operand,
> and
> b) the pointer operands are the same SSA value (being must-alias is
> not sufficient)

Why is being must-alias not sufficient? This seems scary.

 -Hal

> 2) Add a new intrinsic "i8* @llvm.invariant.barrier(i8*)" that
> produces a new pointer that is different for the purpose of
> !invariant.load. (Some other optimizations are permitted to look
> through the barrier.)
>
>
> In particular, "new (&c) MyOtherClass()" would be emitted as
> something like this:
>
>
> %1 = call @operator new(size, %c)
> %2 = call @llvm.invariant.barrier(%1)
> call @MyOtherClass::MyOtherClass(%2)
> %vptr = load %2
> %known.vptr = icmp eq %vptr, @MyOtherClass::vptr, !invariant.load
> !MyBaseClass.vptr
> call @llvm.assume(%known.vptr)
>
>
> -- Richard

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: C++11 and enhacned devirtualization

Nico Weber
In reply to this post by Hal Finkel
On Wed, Jul 15, 2015 at 10:11 PM, Hal Finkel <[hidden email]> wrote:
Hi everyone,

C++11 added features that allow for certain parts of the class hierarchy to be closed, specifically the 'final' keyword and the semantics of anonymous namespaces, and I think we take advantage of these to enhance our ability to perform devirtualization. For example, given this situation:

struct Base {
  virtual void foo() = 0;
};

void external();
struct Final final : Base {
  void foo() {
    external();
  }
};

void dispatch(Base *B) {
  B->foo();
}

void opportunity(Final *F) {
  dispatch(F);
}

When we optimize this code, we do the expected thing and inline 'dispatch' into 'opportunity' but we don't devirtualize the call to foo(). The fact that we know what the vtable of F is at that callsite is not exploited. To a lesser extent, we can do similar things for final virtual methods, and derived classes in anonymous namespaces (because Clang could determine whether or not a class (or method) there is effectively final).

Out of interest, is there data to suggest that this type of optimization will happen often in practice?
 

One possibility might be to @llvm.assume to say something about what the vtable ptr of F might be/contain should it be needed later when we emit the initial IR for 'opportunity' (and then teach the optimizer to use that information), but I'm not at all sure that's the best solution. Thoughts?

Thanks again,
Hal

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: C++11 and enhacned devirtualization

Richard Smith
In reply to this post by Hal Finkel
On Thu, Jul 16, 2015 at 12:24 PM, Hal Finkel <[hidden email]> wrote:
----- Original Message -----
> From: "Richard Smith" <[hidden email]>
> To: "John McCall" <[hidden email]>
> Cc: "Hal Finkel" <[hidden email]>, "[hidden email] Developers" <[hidden email]>, "chandlerc"
> <[hidden email]>
> Sent: Thursday, July 16, 2015 1:46:05 PM
> Subject: Re: C++11 and enhacned devirtualization
>
>
>
>
> On Thu, Jul 16, 2015 at 11:29 AM, John McCall < [hidden email] >
> wrote:
>
>
> > On Jul 15, 2015, at 10:11 PM, Hal Finkel < [hidden email] > wrote:
> >
> > Hi everyone,
> >
> > C++11 added features that allow for certain parts of the class
> > hierarchy to be closed, specifically the 'final' keyword and the
> > semantics of anonymous namespaces, and I think we take advantage
> > of these to enhance our ability to perform devirtualization. For
> > example, given this situation:
> >
> > struct Base {
> > virtual void foo() = 0;
> > };
> >
> > void external();
> > struct Final final : Base {
> > void foo() {
> > external();
> > }
> > };
> >
> > void dispatch(Base *B) {
> > B->foo();
> > }
> >
> > void opportunity(Final *F) {
> > dispatch(F);
> > }
> >
> > When we optimize this code, we do the expected thing and inline
> > 'dispatch' into 'opportunity' but we don't devirtualize the call
> > to foo(). The fact that we know what the vtable of F is at that
> > callsite is not exploited. To a lesser extent, we can do similar
> > things for final virtual methods, and derived classes in anonymous
> > namespaces (because Clang could determine whether or not a class
> > (or method) there is effectively final).
> >
> > One possibility might be to @llvm.assume to say something about
> > what the vtable ptr of F might be/contain should it be needed
> > later when we emit the initial IR for 'opportunity' (and then
> > teach the optimizer to use that information), but I'm not at all
> > sure that's the best solution. Thoughts?
>
> The problem with any sort of @llvm.assume-encoded information about
> memory contents is that C++ does actually allow you to replace
> objects in memory, up to and including stuff like:
>
> {
> MyClass c;
>
> // Reuse the storage temporarily. UB to access the object through ‘c’
> now.
> c.~MyClass();
> auto c2 = new (&c) MyOtherClass();
>
> // The storage has to contain a ‘MyClass’ when it goes out of scope.
> c2->~MyOtherClass();
> new (&c) MyClass();
> }
>
> The standard frontend devirtualization optimizations are permitted
> under a couple of different language rules, specifically that:
> 1. If you access an object through an l-value of a type, it has to
> dynamically be an object of that type (potentially a subobject).
> 2. Object replacement as above only “forwards” existing formal
> references under specific conditions, e.g. the dynamic type has to
> be the same, ‘const’ members have to have the same value, etc. Using
> an unforwarded reference (like the name of the local variable ‘c’
> above) doesn’t formally refer to a valid object and thus has
> undefined behavior.
>
> You can apply those rules much more broadly than the frontend does,
> of course; but those are the language tools you get.
>
>
> Right. Our current plan for modelling this is:
>
>
> 1) Change the meaning of the existing !invariant.load metadata (or
> add another parallel metadata kind) so that it allows load-load
> forwarding (even if the memory is not known to be unmodified between
> the loads) if:
> a) both loads have !invariant.load metadata with the same operand,
> and
> b) the pointer operands are the same SSA value (being must-alias is
> not sufficient)

Why is being must-alias not sufficient? This seems scary.

Must-alias remains sufficient if the value is known to not have changed between (but that's orthogonal to this metadata). The key property is that given:

  %x = load %p, !invariant.load !a
  %q = call @llvm.invariant.barrier(%p)
  call @foo(%q)
  %y = load %q, !invariant.load !a

... we cannot deduce that %x and %y load the same value unless we can see the definition of @foo (the value could have been overwritten by @foo). However, it may still be reasonable to deduce that %p and %q are mustalias (it's probably not particularly important to actually deduce that, though, so we'd probably lose little if alias analysis can't look through the intrinsic).

 -Hal

> 2) Add a new intrinsic "i8* @llvm.invariant.barrier(i8*)" that
> produces a new pointer that is different for the purpose of
> !invariant.load. (Some other optimizations are permitted to look
> through the barrier.)
>
>
> In particular, "new (&c) MyOtherClass()" would be emitted as
> something like this:
>
>
> %1 = call @operator new(size, %c)
> %2 = call @llvm.invariant.barrier(%1)
> call @MyOtherClass::MyOtherClass(%2)
> %vptr = load %2
> %known.vptr = icmp eq %vptr, @MyOtherClass::vptr, !invariant.load
> !MyBaseClass.vptr
> call @llvm.assume(%known.vptr)
>
>
> -- Richard

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: C++11 and enhacned devirtualization

John McCall
In reply to this post by Richard Smith
On Jul 16, 2015, at 11:46 AM, Richard Smith <[hidden email]> wrote:
On Thu, Jul 16, 2015 at 11:29 AM, John McCall <[hidden email]> wrote:
> On Jul 15, 2015, at 10:11 PM, Hal Finkel <[hidden email]> wrote:
>
> Hi everyone,
>
> C++11 added features that allow for certain parts of the class hierarchy to be closed, specifically the 'final' keyword and the semantics of anonymous namespaces, and I think we take advantage of these to enhance our ability to perform devirtualization. For example, given this situation:
>
> struct Base {
>  virtual void foo() = 0;
> };
>
> void external();
> struct Final final : Base {
>  void foo() {
>    external();
>  }
> };
>
> void dispatch(Base *B) {
>  B->foo();
> }
>
> void opportunity(Final *F) {
>  dispatch(F);
> }
>
> When we optimize this code, we do the expected thing and inline 'dispatch' into 'opportunity' but we don't devirtualize the call to foo(). The fact that we know what the vtable of F is at that callsite is not exploited. To a lesser extent, we can do similar things for final virtual methods, and derived classes in anonymous namespaces (because Clang could determine whether or not a class (or method) there is effectively final).
>
> One possibility might be to @llvm.assume to say something about what the vtable ptr of F might be/contain should it be needed later when we emit the initial IR for 'opportunity' (and then teach the optimizer to use that information), but I'm not at all sure that's the best solution. Thoughts?

The problem with any sort of @llvm.assume-encoded information about memory contents is that C++ does actually allow you to replace objects in memory, up to and including stuff like:

{
  MyClass c;

  // Reuse the storage temporarily.  UB to access the object through ‘c’ now.
  c.~MyClass();
  auto c2 = new (&c) MyOtherClass();

  // The storage has to contain a ‘MyClass’ when it goes out of scope.
  c2->~MyOtherClass();
  new (&c) MyClass();
}

The standard frontend devirtualization optimizations are permitted under a couple of different language rules, specifically that:
1. If you access an object through an l-value of a type, it has to dynamically be an object of that type (potentially a subobject).
2. Object replacement as above only “forwards” existing formal references under specific conditions, e.g. the dynamic type has to be the same, ‘const’ members have to have the same value, etc.  Using an unforwarded reference (like the name of the local variable ‘c’ above) doesn’t formally refer to a valid object and thus has undefined behavior.

You can apply those rules much more broadly than the frontend does, of course; but those are the language tools you get.

Right. Our current plan for modelling this is:

1) Change the meaning of the existing !invariant.load metadata (or add another parallel metadata kind) so that it allows load-load forwarding (even if the memory is not known to be unmodified between the loads) if:

invariant.load currently allows the load to be reordered pretty aggressively, so I think you need a new metadata.

  a) both loads have !invariant.load metadata with the same operand, and
  b) the pointer operands are the same SSA value (being must-alias is not sufficient)
2) Add a new intrinsic "i8* @llvm.invariant.barrier(i8*)" that produces a new pointer that is different for the purpose of !invariant.load. (Some other optimizations are permitted to look through the barrier.)

In particular, "new (&c) MyOtherClass()" would be emitted as something like this:

  %1 = call @operator new(size, %c)
  %2 = call @llvm.invariant.barrier(%1)
  call @MyOtherClass::MyOtherClass(%2)
  %vptr = load %2
  %known.vptr = icmp eq %vptr, @MyOtherClass::vptr, !invariant.load !MyBaseClass.vptr
  call @llvm.assume(%known.vptr)

Hmm.  And all v-table loads have this invariant metadata?

I am concerned about mixing files with and without barriers.

John.

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
12