A pattern for portable __builtin_add_overflow()


Oleg Smolsky via cfe-dev
Hi LLVM, clang,

I'm trying to write a portable version of __builtin_add_overflow() in a way that the compiler would
recognize the pattern and use the add_overflow intrinsic / the best possible machine instruction.


With unsigned types this is easy:

int uaddo_native(unsigned a, unsigned b, unsigned* s)
{
    return __builtin_add_overflow(a, b, s);
}

int uaddo_portable(unsigned a, unsigned b, unsigned* s)
{
    *s = a + b;
    return *s < a;
}

We get exactly the same assembly:
uaddo_native: # @uaddo_native
    xor eax, eax
    add edi, esi
    setb al
    mov dword ptr [rdx], edi
    ret
uaddo_portable: # @uaddo_portable
    xor eax, eax
    add edi, esi
    setb al
    mov dword ptr [rdx], edi
    ret
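
The comparison works because unsigned addition wraps modulo 2^N: when a carry occurs, the stored sum is smaller than either operand. A minimal sketch of the same idiom for 64-bit operands follows. The name uaddo64_portable is hypothetical and does not appear in this thread, and whether a given compiler turns it into add/setb should be checked rather than assumed.

```c
#include <stdint.h>

/* Hypothetical 64-bit analogue of uaddo_portable (not from the thread). */
int uaddo64_portable(uint64_t a, uint64_t b, uint64_t *s)
{
    *s = a + b;      /* unsigned addition wraps modulo 2^64 */
    return *s < a;   /* a wrapped sum is smaller than either operand */
}
```

For example, adding 1 to UINT64_MAX stores 0 and reports overflow, while 0 + UINT64_MAX stores UINT64_MAX and reports none.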

But with signed types it is not so easy. I tried two versions, but the generated code is quite far from the optimal assembly.


int saddo_native(int a, int b, int* s)
{
    return __builtin_add_overflow(a, b, s);
}

int saddo_portable(int a, int b, int* s)
{
    *s = (unsigned)a + (unsigned)b;
    return (a > 0) ? *s <= b : *s > b;
}

int saddo_portable2(int a, int b, int* s)
{
    *s = (unsigned)a + (unsigned)b;
    int cond = a > 0;
    int check = *s > b;
    return (cond & !check) | (!cond & check);
}

Assembly:

saddo_native: # @saddo_native
    xor eax, eax
    add edi, esi
    seto al
    mov dword ptr [rdx], edi
    ret
saddo_portable: # @saddo_portable
    lea eax, [rsi + rdi]
    mov dword ptr [rdx], eax
    cmp eax, esi
    setle al
    setg cl
    test edi, edi
    jg .LBB3_2
    mov eax, ecx
.LBB3_2:
    movzx eax, al
    ret
saddo_portable2: # @saddo_portable2
    lea eax, [rsi + rdi]
    mov dword ptr [rdx], eax
    test edi, edi
    setg cl
    cmp eax, esi
    setg al
    xor al, cl
    movzx eax, al
    ret
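
Both portable versions rest on the same observation: with a > 0 the wrapped sum must exceed b unless the addition overflowed, and with a <= 0 it must not exceed b unless it overflowed. The sketch below checks that reasoning on boundary values against an independent pre-check that never performs the overflowing addition. The names saddo_precheck and count_mismatches are hypothetical helpers, not part of the thread, and the conversion of the unsigned sum back to int is assumed to wrap (implementation-defined before C23, modular on mainstream compilers).

```c
#include <limits.h>

/* The two portable versions from the post, with explicit casts added. */
int saddo_portable(int a, int b, int *s)
{
    *s = (int)((unsigned)a + (unsigned)b);  /* wrapping add, assumed modular */
    return (a > 0) ? *s <= b : *s > b;
}

int saddo_portable2(int a, int b, int *s)
{
    *s = (int)((unsigned)a + (unsigned)b);
    int cond = a > 0;
    int check = *s > b;
    return (cond & !check) | (!cond & check);  /* same as cond ^ check */
}

/* Hypothetical reference: decides overflow without adding a and b. */
int saddo_precheck(int a, int b)
{
    return (b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b);
}

/* Hypothetical helper: counts disagreements on boundary values. */
int count_mismatches(void)
{
    static const int v[] = { INT_MIN, INT_MIN + 1, -2, -1, 0,
                             1, 2, INT_MAX - 1, INT_MAX };
    const int n = (int)(sizeof v / sizeof v[0]);
    int bad = 0;
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            int s;
            int expected = saddo_precheck(v[i], v[j]);
            if (saddo_portable(v[i], v[j], &s) != expected)
                ++bad;
            if (saddo_portable2(v[i], v[j], &s) != expected)
                ++bad;
        }
    }
    return bad;
}
```

A check like this only confirms that the functions agree on the overflow flag; it says nothing about which instructions the compiler emits for them.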


Do you know of a trick to make the compiler use the seto instruction?

I also noticed that the transformation for uaddo_portable happens in CodeGen, not in IR.

Bests,
Paweł



_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

Re: [llvm-dev] A pattern for portable __builtin_add_overflow()

Oleg Smolsky via cfe-dev
Going by InstCombiner::foldICmpWithConstant,
https://gcc.godbolt.org/z/To_qmm should work.

-- Sanjoy
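
The code behind the Compiler Explorer link is not reproduced in this thread. Judging from the follow-up about needing wider types, it presumably relies on a widening check, so the following is only a guess at the general shape, not the linked code: add in a type twice as wide and test whether the result still fits. The name saddo_widen is hypothetical.

```c
#include <stdint.h>

/* Hypothetical widening check; an assumption about the general idiom,
   not the code behind the link. */
int saddo_widen(int32_t a, int32_t b, int32_t *s)
{
    int64_t wide = (int64_t)a + (int64_t)b;  /* cannot overflow in 64 bits */
    *s = (int32_t)(uint32_t)wide;            /* keep the low 32 bits (assumed modular) */
    return wide < INT32_MIN || wide > INT32_MAX;
}
```

Whether a particular compiler version folds this into an overflow-checked add has to be verified on the generated code.
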
On Tue, Nov 20, 2018 at 2:46 AM Paweł Bylica via llvm-dev
<[hidden email]> wrote:

>
> Hi LLVM, clang,
>
> I'm trying to write a portable version of __builtin_add_overflow() in a way that the compiler would
> recognize the pattern and use the add_overflow intrinsic / the best possible machine instruction.
>
> Here are docs about these builtins: https://clang.llvm.org/docs/LanguageExtensions.html#checked-arithmetic-builtins.

Re: [llvm-dev] A pattern for portable __builtin_add_overflow()

Oleg Smolsky via cfe-dev
Thanks Sanjoy,

It looks like it works only for integers up to 32 bits (http://llvm.org/doxygen/InstCombineCompares_8cpp_source.html#l01255).
I will also need it for 64-bit integers.

Even if the pattern matching handled the 64-bit case, I would need to use 128-bit types, which is still not portable.

// P.
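
To make the portability concern concrete, here is a minimal sketch of both options for 64-bit operands. The comparison-based check from the original post needs no wider type, while the widening check depends on __int128, a GCC/Clang extension that is guarded here with the __SIZEOF_INT128__ macro. Both function names are hypothetical, and nothing here is claimed about the code either one compiles to.

```c
#include <stdint.h>

/* Comparison-based check, same shape as saddo_portable2, on int64_t.
   The conversion back to int64_t is assumed to wrap. */
int saddo64_compare(int64_t a, int64_t b, int64_t *s)
{
    *s = (int64_t)((uint64_t)a + (uint64_t)b);
    return (a > 0) ^ (*s > b);
}

#ifdef __SIZEOF_INT128__
/* Widening check: only available where the compiler provides __int128. */
int saddo64_widen(int64_t a, int64_t b, int64_t *s)
{
    __int128 wide = (__int128)a + (__int128)b;  /* cannot overflow in 128 bits */
    *s = (int64_t)(uint64_t)wide;               /* keep the low 64 bits */
    return wide < INT64_MIN || wide > INT64_MAX;
}
#endif
```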

On Thu, Nov 22, 2018 at 6:54 PM Sanjoy Das <[hidden email]> wrote:
> Going by InstCombiner::foldICmpWithConstant,
> https://gcc.godbolt.org/z/To_qmm should work.
>
> -- Sanjoy

Re: [llvm-dev] A pattern for portable __builtin_add_overflow()

Oleg Smolsky via cfe-dev
You can grep for uses of the sadd_with_overflow intrinsic in the LLVM
code base to find other patterns that we may recognize.  If there are
no portable ones you might have to add your own pattern.

-- Sanjoy

On Thu, Nov 22, 2018 at 12:08 PM Paweł Bylica <[hidden email]> wrote:

>
> Thanks Sanjoy,
>
> It looks like it works only for integers up to 32 bits (http://llvm.org/doxygen/InstCombineCompares_8cpp_source.html#l01255).
> I will also need it for 64-bit integers.
>
> Even if the pattern matching handled the 64-bit case, I would need to use 128-bit types, which is still not portable.
>
> // P.