Potential missed optimization - unnecessary reload of vtable ptr inside loop body

Alex Wang via cfe-dev
Hello all!

I posted a question about a potential missed optimization to llvm-dev, but was
directed here since it concerned more C++-specific bits of code. The previous
conversation can be found at [0] and [1].

The code in question is here: https://godbolt.org/g/ec5cP7

My main question here is about assembly lines 24 and 46, where I think the
vtable pointer for the Rect object is being reloaded every iteration of the
loop. nbjoerg on #llvm said that's due to the possibility of placement new being
used somewhere inside the called function, but I'm not entirely sure that
placement new can change what vtable the vtable pointer points to.
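
For readers without the Compiler Explorer link at hand, the pattern under
discussion looks roughly like the sketch below. This is a reconstruction, not
the code behind the link: only the Rect name comes from the post, while Shape,
area, and total_area are hypothetical. In the real case the body of the virtual
function is presumably not visible at the call site, so the compiler has to
assume that each call could do anything to the object.

```cpp
#include <cassert>

struct Shape {
    virtual ~Shape() = default;
    virtual int area() const = 0;
};

struct Rect : Shape {
    int w, h;
    Rect(int w_, int h_) : w(w_), h(h_) {}
    int area() const override { return w * h; }
};

// Each iteration makes a virtual call through `s`. If the compiler cannot
// prove that a call leaves the vtable pointer of `s` unchanged, it may have
// to load that pointer again before the next call.
int total_area(const Shape& s, int n) {
    int sum = 0;
    for (int i = 0; i < n; ++i) {
        sum += s.area();
    }
    return sum;
}
```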

(I'm new to this language lawyering stuff, so please let me know what I mess up)

As far as I understand, paragraph 8 of 6.6.3 [basic.life] in the most recent
draft of the standard says that references or names of an object that has been
"replaced" by placement new are only "redirected" to the new object if the new
object is the same type and no other class derives from that type; otherwise,
the reference/name refers to an object whose lifetime has ended. Thus, any uses
of the "this" pointer after a member function is called are only valid if the
placement new'd object is the same type, and so has the same vtable, which means
the vtable pointer does not have to be reloaded.

The example for point 6.5 of paragraph 6 of 6.6.3 sort of supports this
interpretation, since calling B::mutate() changes the type of *this, which
causes pb to point to an object whose lifetime has ended, and further method
calls through pb result in undefined behavior.
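
A runnable variation on that example, as a sketch: the names B, D1, D2, and
mutate() follow the standard's example, but the return values, the explicit
destructor call, the demo() helper, and the stack buffer are assumptions added
here so that the well-defined path can actually be executed. The only pointer
used for calls after the replacement is the one returned by placement new.

```cpp
#include <cassert>
#include <new>

struct B {
    virtual ~B() = default;
    virtual int f() const { return 0; }
    B* mutate();  // replaces *this with a D2 and returns a pointer to it
};

struct D1 : B { int f() const override { return 1; } };
struct D2 : B { int f() const override { return 2; } };

static_assert(sizeof(D1) == sizeof(D2) && alignof(D1) == alignof(D2),
              "the sketch reuses D1's storage for a D2");

B* B::mutate() {
    this->~B();            // ends the lifetime of the current object
    return new (this) D2;  // reuses the storage for an object of another type
}

int demo() {
    alignas(D1) unsigned char storage[sizeof(D1)];
    B* pb = new (storage) D1;
    int before = pb->f();    // dispatches to D1::f
    B* fresh = pb->mutate();
    // pb->f() at this point would be undefined behavior: the D1 that pb
    // pointed to is gone, and pb is not redirected to the D2.
    int after = fresh->f();  // dispatches to D2::f
    fresh->~B();
    return before * 10 + after;
}
```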

Is this reasoning correct? ICC 18, Clang trunk, Clang 5.0, GCC 7.3, GCC trunk,
and MSVC 19 all perform the reload, so I'm guessing I'm wrong, but I'm not sure
how.

If the vtable pointer reload is required, is there a way to indicate to Clang
that such a reload will not be necessary, even though the compiler can't verify
that (sort of like __restrict)? I tried adding [[gnu::pure]] to the function
declarations and definitions, but the vtable pointer reload remained. Does Clang
take [[gnu::pure]]/[[gnu::const]] into account for code generation/optimization?
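
For reference, the attribute placement being described would look something
like the following sketch. Shape, Rect, area, and total_area are hypothetical
names. [[gnu::pure]] is the standard-syntax spelling of the GNU pure attribute,
which promises that a function has no observable effect other than its return
value. Whether a compiler uses that promise to avoid the vtable pointer reload
is exactly the open question, so the sketch makes no claim about the generated
code.

```cpp
#include <cassert>

struct Shape {
    virtual ~Shape() = default;
    // Promise to the compiler: area() only reads state and has no side effects.
    [[gnu::pure]] virtual int area() const = 0;
};

struct Rect : Shape {
    int w, h;
    Rect(int w_, int h_) : w(w_), h(h_) {}
    [[gnu::pure]] int area() const override { return w * h; }
};

int total_area(const Shape& s, int n) {
    int sum = 0;
    for (int i = 0; i < n; ++i) {
        sum += s.area();
    }
    return sum;
}
```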

Thanks for the help!

Alex

    [0]: http://lists.llvm.org/pipermail/llvm-dev/2018-February/121439.html
    [1]: http://lists.llvm.org/pipermail/llvm-dev/2018-March/121486.html
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Re: Potential missed optimization - unnecessary reload of vtable ptr inside loop body

John McCall via cfe-dev


> On Mar 7, 2018, at 1:41 PM, Alex Wang via cfe-dev <[hidden email]> wrote:
>
> Is this reasoning correct? ICC 18, Clang trunk, Clang 5.0, GCC 7.3, GCC trunk,
> and MSVC 19 all perform the reload, so I'm guessing I'm wrong, but I'm not sure
> how.

Your reasoning is right, but it's proven to be very difficult to write a reliable general
optimization that only triggers on the wrong cases and not when, say, constructing
different base-class subobjects of a class.  Our optimization can be enabled with
-fstrict-vtable-pointers, but it's still experimental.

John.
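
One legitimate way for the vtable pointer of an object to change, which any
such optimization has to account for, is construction itself. The sketch below
uses hypothetical Base and Derived classes to show that a virtual call made
while the base-class constructor runs dispatches to the base version, while the
same call on the finished object dispatches to the derived one. With Clang, the
experimental mode mentioned above is requested by passing
-fstrict-vtable-pointers on the command line.

```cpp
#include <cassert>

struct Base {
    int id_during_construction;
    Base() {
        // While Base's constructor runs, the dynamic type is Base, so this
        // virtual call resolves to Base::id even inside a Derived object.
        id_during_construction = id();
    }
    virtual ~Base() = default;
    virtual int id() const { return 1; }
};

struct Derived : Base {
    int id() const override { return 2; }
};

int id_seen_by_base_ctor() {
    Derived d;
    return d.id_during_construction;
}

int id_after_construction() {
    Derived d;
    const Base& b = d;
    return b.id();  // the object is fully constructed: Derived::id
}
```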

Re: Potential missed optimization - unnecessary reload of vtable ptr inside loop body

Alex Wang via cfe-dev

> On Mar 7, 2018, at 8:18 PM, John McCall <[hidden email]> wrote:
>
> Your reasoning is right, but it's proven to be very difficult to write a reliable general
> optimization that only triggers on the wrong cases and not when, say, constructing

Wrong cases? Guessing you meant right cases?

> different base-class subobjects of a class.  Our optimization can be enabled with
> -fstrict-vtable-pointers, but it's still experimental.

Even when dealing with constructing/destructing objects, I thought that the
object's type/vtable pointer effectively changes only between the base
constructor finishing and the next constructor starting, or vice-versa for
destruction. So even if full devirtualization isn't possible, the vtable pointer
could still be hoisted out of loops. Is that kind of optimization just too
specific to spend time on, compared to the benefits from getting
-fstrict-vtable-pointers implemented correctly?
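
To make "hoisted out of loops" concrete, here is a hand-rolled model of virtual
dispatch, written as a sketch. It is not what Clang emits, and every name in it
(VTable, Obj, rect_area, sum_reloading, sum_hoisted) is hypothetical. The first
loop reads the table pointer from the object on every iteration; the second
reads it once before the loop, which is only valid if no call can change which
table the object points to.

```cpp
#include <cassert>

struct Obj;

// Hand-written stand-in for a compiler-generated vtable.
struct VTable {
    int (*area)(const Obj*);
};

struct Obj {
    const VTable* vptr;  // stand-in for the hidden vtable pointer
    int w;
    int h;
};

int rect_area(const Obj* self) { return self->w * self->h; }

const VTable rect_vtable = { &rect_area };

// Reloads o->vptr on every iteration, like the code in question.
int sum_reloading(const Obj* o, int n) {
    int sum = 0;
    for (int i = 0; i < n; ++i) {
        sum += o->vptr->area(o);
    }
    return sum;
}

// Loads o->vptr once, before the loop.
int sum_hoisted(const Obj* o, int n) {
    const VTable* vt = o->vptr;
    int sum = 0;
    for (int i = 0; i < n; ++i) {
        sum += vt->area(o);
    }
    return sum;
}
```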

Also, if I were interested in enabling this flag in my codebase, are there any
issues beyond the ones listed in this 2015 Nov. email [0]?

Thanks,
Alex

    [0]: http://lists.llvm.org/pipermail/llvm-dev/2015-November/092384.html

Re: Potential missed optimization - unnecessary reload of vtable ptr inside loop body

John McCall via cfe-dev


> On Mar 8, 2018, at 2:59 PM, Alex Wang <[hidden email]> wrote:
>
>
>> On Mar 7, 2018, at 8:18 PM, John McCall <[hidden email]> wrote:
>>
>> Your reasoning is right, but it's proven to be very difficult to write a reliable general
>> optimization that only triggers on the wrong cases and not when, say, constructing
>
> Wrong cases? Guessing you meant right cases?

Yes, sorry.

>> different base-class subobjects of a class.  Our optimization can be enabled with
>> -fstrict-vtable-pointers, but it's still experimental.
>
> Even when dealing with constructing/destructing objects, I thought that the
> object's type/vtable pointer effectively changes only between the base
> constructor finishing and the next constructor starting, or vice-versa for
> destruction. So even if full devirtualization isn't possible, the vtable pointer
> could still be hoisted out of loops.

Yes, absolutely.  I'm just describing the issues that I understand to make the analysis
difficult, not saying that it's impossible.

> Is that kind of optimization just too
> specific to spend time on, compared to the benefits from getting
> -fstrict-vtable-pointers implemented correctly?

I think everybody agrees that this is likely to be an extremely powerful optimization.

> Also, if I were interested in enabling this flag in my codebase, are there any
> issues beyond the ones listed in this 2015 Nov. email [0]?

Hopefully the people working on the optimization can answer that.

John.
