__sync_* builtins

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

__sync_* builtins

David Chisnall via cfe-dev
Hello,

I've run into some trouble with the implementation of the __sync_* builtins (for example, __sync_fetch_and_add). I can see possible solutions -- please would someone help me to choose between them?

First, the expected behaviour. The __sync_* builtins are overloaded for various access sizes, so that I can __sync_fetch_and_add a byte, word etc. If the compiler is able to handle the builtin for the target then it generates some appropriate code. If it is not able do this (for example, for armv6m) then it replaces the builtin with a call to a function named for the builtin and the access size, e.g. __sync_fetch_and_add on a char* becomes a call to __sync_fetch_and_add_1.

Second, the problem. The way that the overloaded-ness of the builtins is implemented is that clang substitutes for the original builtin, a builtin named for builtin & access size -- so builtin __sync_fetch_and_add on a char* becomes builtin __sync_fetch_and_add_1. If llvm cannot handle this for the target then it replaces the builtin with an identically-named function call. So far so good. But, we cannot compile any C implementation of this builtin using clang as clang will object that we are redefining the builtin with the same name as the function call.

A few solutions:

1) I see that, for iOS targets where the builtin is not supported, the function call is prefixed with an extra underscore. This should avoid the naming collision. I haven't found where the magic happens, but I wonder whether this is an iOS-specific workaround for this problem that could be generalized. Of course, it may just be a happy coincidence.

2) Rename the size-specific builtins in include/clang/Basic/Builtins.def so that, for example, __sync_fetch_and_add_1 becomes __internal__sync_fetch_and_add_1. This works for me so far. The only downside I can see is that any end-user code that uses the existing size-specific builtins will break -- but I could argue that the size-specific builtins are not a public interface.

3) Add 'f' to the attributes of the size-specific builtins so that -fno-builtins works on them. But I think we would be lying: the attribute means 'this is a libc/libm function', but we might have a function implementation of the builtins in libc, or somewhere else, or not at all. I don't like this solution.

4) Modify behaviour so that builtins without the 'f' attribute are disabled by -fno-builtins. Some archaeology suggests that the check against the 'f' attribute _might_ be a wart from a time when builtins were disabled by -ffreestanding, but this is really just speculation.

Does anyone have any advice/ideas/further pointers?
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: __sync_* builtins

David Chisnall via cfe-dev
On 22 Feb 2017, at 12:10, Bernard Ogden via cfe-dev <[hidden email]> wrote:

>
> Hello,
>
> I've run into some trouble with the implementation of the __sync_* builtins (for example, __sync_fetch_and_add). I can see possible solutions -- please would someone help me to choose between them?
>
> First, the expected behaviour. The __sync_* builtins are overloaded for various access sizes, so that I can __sync_fetch_and_add a byte, word etc. If the compiler is able to handle the builtin for the target then it generates some appropriate code. If it is not able do this (for example, for armv6m) then it replaces the builtin with a call to a function named for the builtin and the access size, e.g. __sync_fetch_and_add on a char* becomes a call to __sync_fetch_and_add_1.
>
> Second, the problem. The way that the overloaded-ness of the builtins is implemented is that clang substitutes for the original builtin, a builtin named for builtin & access size -- so builtin __sync_fetch_and_add on a char* becomes builtin __sync_fetch_and_add_1. If llvm cannot handle this for the target then it replaces the builtin with an identically-named function call. So far so good. But, we cannot compile any C implementation of this builtin using clang as clang will object that we are redefining the builtin with the same name as the function call.
>
> A few solutions:
>
> 1) I see that, for iOS targets where the builtin is not supported, the function call is prefixed with an extra underscore. This should avoid the naming collision. I haven't found where the magic happens, but I wonder whether this is an iOS-specific workaround for this problem that could be generalized. Of course, it may just be a happy coincidence.
>
> 2) Rename the size-specific builtins in include/clang/Basic/Builtins.def so that, for example, __sync_fetch_and_add_1 becomes __internal__sync_fetch_and_add_1. This works for me so far. The only downside I can see is that any end-user code that uses the existing size-specific builtins will break -- but I could argue that the size-specific builtins are not a public interface.
>
> 3) Add 'f' to the attributes of the size-specific builtins so that -fno-builtins works on them. But I think we would be lying: the attribute means 'this is a libc/libm function', but we might have a function implementation of the builtins in libc, or somewhere else, or not at all. I don't like this solution.
>
> 4) Modify behaviour so that builtins without the 'f' attribute are disabled by -fno-builtins. Some archaeology suggests that the check against the 'f' attribute _might_ be a wart from a time when builtins were disabled by -ffreestanding, but this is really just speculation.
>
> Does anyone have any advice/ideas/further pointers?

There is a similar problem for the __atomic builtins.  The problem is not quite as you’ve stated it: clang will emit an atomicrmw instruction for __sync_fetch_and_add if the target supports it, or fall back to emitting __sync_fetch_and_add_x calls (additionally, I believe the back end can now translate some atomicrmw instructions into these calls).

The problem is that __sync_fetch_and_add_1 really isn’t a builtin, it is a support library function that is used to implement a builtin, yet clang complains that it is a builtin.  Compiling these with -fno-builtin is not ideal because the implementation actually wants to use the builtins for checking if a particular size is lock free and using the builtin that *doesn’t* expand to a libcall for these cases, to ensure atomicity between compilation units that call the runtime function and compilation units that emit atomic instructions directly.

For the __atomic functions, we worked around this bug using #pragma redefine_extname.  This unfortunately leaves us with code that doesn’t compile with gcc (because redefine_extname is a weird Solarisism and gcc only supports it on Solaris).

The correct fix it for clang to understand that library functions used to implement builtins are not actually builtins and are expected to be implemented somewhere.  Ideally we’d have a warning if you implemented these and didn’t explicitly silence the warning that you’re implementing the support library, to prevent name collisions.

David

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: __sync_* builtins

David Chisnall via cfe-dev
On Wed, Feb 22, 2017 at 12:22:09PM +0000, David Chisnall via cfe-dev wrote:
> For the __atomic functions, we worked around this bug using #pragma
> redefine_extname.  This unfortunately leaves us with code that doesn’t
> compile with gcc (because redefine_extname is a weird Solarisism and
> gcc only supports it on Solaris).

Just call it sync_foo on the C level and give it a different asm name.
Works perfectly fine with both GCC and Clang.

Joerg
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: __sync_* builtins

David Chisnall via cfe-dev
In reply to this post by David Chisnall via cfe-dev
David:

It seems I had made a few more assumptions that I realised. Thank-you for explaining.


Joerg:

I assume you're talking about this trick:

int func (int x, int y) asm ("MYFUNC");

It does indeed work for me. Thank-you.
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev