Unresolved bug with arm neon instructions

Kristof Beyls via cfe-dev
Hi, bug 39087, opened more than a year ago, is still unresolved.

I have this code:

#include <arm_neon.h>

#define vroti_epi32(x, i)                                         \
        (i > 0 ? vsliq_n_u32(vshrq_n_u32(x, 32 - i), x, i)        \
               : vsriq_n_u32(vshlq_n_u32(x, 32 + i), x, -i))

int main()
{
        uint32x4_t x = vdupq_n_u32(42);
        uint32x4_t value = vroti_epi32(x, 12);

        return 0;
}

With clang 8.0.7 and other versions, compilation fails with an error. With gcc, everything works without issue.
Could you fix this bug?

Thank you

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

Re: Unresolved bug with arm neon instructions

Kristof Beyls via cfe-dev
Hi,

On Sat, 9 Nov 2019 at 19:03, via cfe-dev <[hidden email]> wrote:
> With clang 8.0.7 and other versions, compilation fails with an error. With gcc, everything works without issue.
> Could you fix this bug?

It's an incompatibility, but it's unclear that it's a bug. Clang
intentionally range-checks these intrinsics because they map to
instructions that only have a valid form with the permitted (positive)
shift amounts.

If we relaxed that requirement then -O0 compilation would still be
broken but with a significantly worse error message, which I don't
think many Clang devs would consider acceptable (both because of the
different behaviour depending on -O* and the message) --  I certainly
wouldn't. GCC appears to do some kind of basic optimization that
eliminates the invalid instructions even at -O0, but that's not how
Clang works and is also unlikely to change in the short-medium term.

I have seen someone propose that intrinsics taking immediates should
also accept variables and lower to multiple-instruction equivalents if
necessary, but I wasn't terribly keen on that either personally. It
would add a reciprocal incompatibility with GCC in at least the short
term (possibly forever if they nope out), and I tend to think most
people writing these would prefer to know if their call was
inefficient anyway.

Someone keen *might* be able to get a patch through cfe-commits that
allows the user to demote these errors to a warning as a compromise,
but I'm afraid otherwise the only suggestion is to rewrite code so
that invalid intrinsics are never called statically.
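
One possible way to follow that suggestion, sketched under assumptions: clamp the immediates so that the arm which is not selected still receives in-range constants. The names ROT_L_IMM, ROT_R_IMM, vroti_u32 and check_rotations are hypothetical. The scalar stand-ins in the #else branch exist only so the sketch builds on non-ARM hosts. I believe this satisfies Clang's check because every immediate is then an in-range integer constant expression, but it has not been verified against every Clang release.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#else
/* Assumption: single-lane scalar stand-ins so the sketch also builds on
 * non-ARM hosts. They model one 32-bit lane and do no range checking. */
typedef uint32_t uint32x4_t;
#define vdupq_n_u32(v)          ((uint32_t)(v))
#define vgetq_lane_u32(v, lane) (v)
#define vshlq_n_u32(a, n)       ((a) << (n))
#define vshrq_n_u32(a, n)       ((a) >> (n))
#define vsliq_n_u32(a, b, n)    (((b) << (n)) | ((a) & ((1u << (n)) - 1u)))
#define vsriq_n_u32(a, b, n)    (((b) >> (n)) | ((a) & ~(0xFFFFFFFFu >> (n))))
#endif

/* Hypothetical helpers: the real amount when the arm is live, otherwise a
 * dummy 1, which keeps every immediate below within its permitted range. */
#define ROT_L_IMM(i) ((i) > 0 ? (i) : 1)
#define ROT_R_IMM(i) ((i) < 0 ? -(i) : 1)

/* Rotate each lane left by i (i > 0) or right by -i (i < 0).
 * i must be an integer constant expression with -32 < i < 32. */
#define vroti_u32(x, i)                                                   \
    ((i) > 0 ? vsliq_n_u32(vshrq_n_u32((x), 32 - ROT_L_IMM(i)), (x),      \
                           ROT_L_IMM(i))                                  \
     : (i) < 0 ? vsriq_n_u32(vshlq_n_u32((x), 32 - ROT_R_IMM(i)), (x),    \
                             ROT_R_IMM(i))                                \
               : (x))

/* Checks lane 0 against plain scalar rotations of 42. */
int check_rotations(void)
{
    uint32x4_t x = vdupq_n_u32(42);
    uint32x4_t left  = vroti_u32(x, 12);   /* rotate left by 12  */
    uint32x4_t right = vroti_u32(x, -12);  /* rotate right by 12 */

    assert(vgetq_lane_u32(left, 0)  == ((42u << 12) | (42u >> 20)));
    assert(vgetq_lane_u32(right, 0) == ((42u >> 12) | (42u << 20)));
    return 0;
}
```

The dead arm still appears in the source, but with a shift of 1 (and 31 for the inner shift) it is a valid call, so nothing invalid is ever written statically. The cost is a macro that is harder to read. In C++, a template with the amount as a non-type parameter would be a cleaner alternative.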

Cheers.

Tim.

Re: Unresolved bug with arm neon instructions

Kristof Beyls via cfe-dev
I left a comment on the ticket.