I can confirm (using the great https://gcc.godbolt.org/ tool) that
with the flags "-Ofast -mavx2 -fopenmp -ffast-math" the right
AVX2 opcode (vroundps) is emitted for floor (in foo), but
unfortunately not for sin, cos or exp (e.g. see sin in bar below).
GCC 8.1+ and the Intel compiler icc 13+ insert calls to
vectorized implementations (_ZGVbN4v_sinf and __svml_sinf4,
respectively), but clang seems to have nothing like this.
My small test code:
void foo(float * __restrict __attribute((aligned(32))) x
I have reproduced this behavior on different machines. Maybe I am
doing something wrong here, but it seems like there is no vectorized
implementation for sin, cos etc. I am using h2lib
(http://h2lib.org/doc/d1/d89/simd__avx_8h_source.html) as a
workaround for now, but I would expect clang to do this job.