Fwd: [cfe-users] floor is vectorized, but not sin, cos or exp


Richard Smith via cfe-dev

I am forwarding this to cfe-dev because it sounds like a bug, and cfe-users is not read that much.



-------- Forwarded Message --------
Subject: [cfe-users] floor is vectorized, but not sin, cos or exp
Date: Tue, 11 Dec 2018 16:47:16 +0100
From: Klaus Leppkes via cfe-users [hidden email]
Reply-To: Klaus Leppkes [hidden email]
To: [hidden email]


Hi,

According to the documentation (https://releases.llvm.org/7.0.0/docs/Vectorizers.html), calls to floor, sin, and cos should be vectorized.

I can confirm (using the great https://gcc.godbolt.org/ tool) that with the flags "-Ofast -mavx2 -fopenmp -ffast-math" the expected vroundps instruction is emitted for floor (in foo), but unfortunately no vector code is emitted for sin, cos, or exp (e.g. see sin in bar below).

GCC 8.1+ and the Intel compiler (icc 13+) insert calls to vectorized implementations (_ZGVbN4v_sinf and __svml_sinf4, respectively), but clang seems to have nothing like this.

Here is my small test code:

#include <cmath>

void foo(float * __restrict __attribute((aligned(32))) x,
         float * __restrict __attribute((aligned(32))) y) {
    for (int i = 0; i < 4; ++i)
        y[i] = floor(x[i]);
}

void bar(float * __restrict __attribute((aligned(32))) x,
         float * __restrict __attribute((aligned(32))) y) {
    for (int i = 0; i < 4; ++i)
        y[i] = sin(x[i]);
}

I have reproduced this behavior on different machines. Maybe I am doing something wrong here, but it seems that clang has no vectorized implementation for sin, cos, etc. For now I am using h2lib (http://h2lib.org/doc/d1/d89/simd__avx_8h_source.html) as a workaround, but I would expect clang to handle this itself.
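For anyone who wants to reproduce this locally instead of on Compiler Explorer, a minimal sketch along the following lines may help. It restates foo and bar with explicit std:: calls and adds a small correctness check against scalar references. The helper names lanes_match, kernels_agree_with_scalar, and self_check are hypothetical and not part of the original test code, and the input values and tolerance are arbitrary choices. The check only confirms that the results are numerically sane. Whether the loops were actually vectorized still has to be judged from the generated assembly (e.g. by compiling with -S and the flags mentioned above).

```cpp
#include <cassert>
#include <cmath>

// Same kernels as in the test code above, using the std:: float overloads.
// The aligned(32) pointer attribute is omitted here to keep the sketch portable.
void foo(float* __restrict x, float* __restrict y) {
    for (int i = 0; i < 4; ++i)
        y[i] = std::floor(x[i]);
}

void bar(float* __restrict x, float* __restrict y) {
    for (int i = 0; i < 4; ++i)
        y[i] = std::sin(x[i]);
}

// Hypothetical helper: true if all four lanes agree with the reference within tol.
bool lanes_match(const float* got, const float* want, float tol) {
    for (int i = 0; i < 4; ++i)
        if (!(std::fabs(got[i] - want[i]) <= tol))
            return false;
    return true;
}

// Hypothetical helper: runs both kernels on fixed inputs (arbitrary values)
// and compares them with scalar reference results.
bool kernels_agree_with_scalar() {
    alignas(32) float x[4] = {0.25f, 1.5f, -2.75f, 3.0f};
    alignas(32) float y[4] = {};

    foo(x, y);
    const float floor_ref[4] = {0.0f, 1.0f, -3.0f, 3.0f};
    if (!lanes_match(y, floor_ref, 0.0f))
        return false;

    bar(x, y);
    float sin_ref[4];
    for (int i = 0; i < 4; ++i)
        sin_ref[i] = static_cast<float>(std::sin(static_cast<double>(x[i])));
    // Loose tolerance (an assumption), since -ffast-math may change rounding.
    return lanes_match(y, sin_ref, 1e-5f);
}

// Convenience wrapper that aborts if the check fails (assert is active
// unless NDEBUG is defined).
void self_check() {
    assert(kernels_agree_with_scalar());
}
```

Calling self_check() from a main function and building the file with both compilers gives a quick way to compare the generated code for foo and bar side by side.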

Can anybody comment on this please?

Cheers
Klaus

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
