RFC: Interface user provided vector functions with the vectorizer.

classic Classic list List threaded Threaded
22 messages Options
12
Reply | Threaded
Open this post in threaded view
|

RFC: Interface user provided vector functions with the vectorizer.

Joan Lluch via cfe-dev
Dear all,

I have re-written the proposal for interfacing user provided vector
functions, originally posted in both llvm-dev and cfe-dev mailing
list:

"[RFC] Expose user provided vector function for auto-vectorization."

The proposal looks quite different from the original submission,
therefore I took the liberty to start a new thread.

The original thread generated some good discussion. In particular,
Simon Moll and Johannes Doerfert (CCed) have managed to provide good
arguments for the following claims:

1. The Vector Function ABI name mangling scheme of a target is not
   enough to describe all uses cases of function vectorization that
   the compiler might end up needing to support in the future.
2. `declare variant` needs to be handled properly at IR level, to be
   able to give the compiler the full OpenMP context of the directive.

This proposal addresses those two concerns and other (I believe) minor
concerns that have been raised in the previous thread.

This proposal is provided with examples and a self assessment around
extendibility.

I have CCed all the people that have participated in the discussion so
far, please let me know if you think I have missed anything of what
have been raised.

Kind regards,

Francesco

*** DRAFT OF THE PROPOSAL ***

# SCOPE OF THE RFC : Interface user provided vector functions with the vectorizer.

Because the users care about portability (across compilers, libraries
and systems), I believe we have to base sour solution on a standard
that describes the mapping from the scalar function to the vector
function.

Because OpenMP is standard and widely used, we should base our
solution on the mechanisms that the standard provides, via the
directives `declare simd` and `declare variant`, the latter when used
in with the `simd` trait in the `construct` set.

Please notice that:

1. The scope of the proposal is not implementing full support for
   `pragma omp declare variant`.
2. The scope of the proposal is not enabling the vectorizer to do new
   kind of vectorizations (e.g. RV-like vectorization described by
   Simon).
3. The proposal aims to be extendible wrt 1. and 2.
4. The IR attribute introduced in this proposal is equivalent to the
   one needed for the VecClone pass under development in
   https://reviews.llvm.org/D22792

# CLANG COMPONENTS

A C function attribute, `clang_declare_simd_variant`, to attach to the
scalar version. The attribute provides enough information to the
compiler about the vector shape of the user defined function. The
vector shapes handled by the attribute are those handled by the OpenMP
standard via `declare simd` (and no more than that).

1. The function attribute handling in clang is crafted with the
   requirement that it will be possible to re-use the same components
   for the info generated by `declare variant` when used with a `simd`
   traits in the `construct` set.
2. The attribute allows orthogonality with the vectorization that is
   done via OpenMP: the user vector function is still exposed for
   vectorization when not using `-fopenmp-[simd]` once the `declare
   simd` and `declare variant` directive of OpenMP will be available
   in the front-end.

## C function attribute: `clang_declare_simd_variant`

The definition of this attribute has been crafted to match the
semantics of `declare variant` for a `simd` construct described in
OpenMP 5.0. I have added only the traits of the `device` set, `isa`
and `arch`, which I believe are enough to cover for the use case of
this proposal. If that is not the case, please provide an example,
extending the attribute will be easy even once the current one is
implemented.

```
clang_declare_simd_variant(<variant-func-id>, <simd clauses>{, <context selector clauses>})

<variant-func-id>:= The name of a function variant that is a base language identifier, or,
                    for C++, a template-id.

<simd clauses> := <simdlen>, <mask>{, <optional simd clauses>}

<simdlen> := simdlen(<positive number>) | simdlen("scalable")

<mask>    := inbranch | notinbranch

<optional simd clauses> := <linear clause>
                         | <uniform clause>
                         | <align clause>  | {,<optional simd clauses>}

<linear clause>  := linear_ref(<var>,<step>)
                  | linear_var(<var>, <step>)
                  | linear_uval(<var>, <step>)
                  | linear(<var>, <step>)

<step> := <var> | <non zero number>

<uniform clause> := uniform(<var>)

<align clause>   := align(<var>, <positive number>)

<var> := Name of a parameter in the scalar function declaration/definition

<non zero number> := ... | -2 | -1 | 1 | 2 | ...

<positive number> := 1 | 2 | 3 | ...

<context selector clauses> := {<isa>}{,} {<arch>}

<isa> := isa(target-specific-value)

<arch> := arch(target-specific-value)

```

# LLVM COMPONENTS:

## VectorFunctionShape class

The object `VectorFunctionShape` contains the information about the
kind of vectorization available for an `llvm::Call`.

The object `VectorFunctionShape` must contain the following information:

1. Vectorization Factor (or number or concurrent lanes executed by the
   SIMD version of the function). Encoded by unsigned integer.
2. Whether the vector function is requested for scalable
   vectorization, encoded by a boolean.
3. Information about masking / no masking, encoded by a boolean.
4. Information about the parameters, encoded in a container that
   carries objects of type `ParamaterType`, to describe features like
   `linear` and `uniform`.
5. Function name redirection, if a user has specified to use a custom
   name instead of the Vector Function ABI ones.

Items 1. to 5. represents the information stored in the
`vector-function-abi-variant` attribute (see next section).

The object can be extended in the future to include new vectorization
kinds (for example the RV-like vectorization of the Region
Vectorizer), or to add more context information that might come from
other uses of OpenMP `declare variant`, or to add new Vector Function
ABIs not based on OpenMP. Such information can be retrieved by
attributes that will be added to describe the `Call` instance.

## IR Attribute

We define a `vector-function-abi-variant` attribute that lists the
mangled names produced via the mangling function of the Vector
Function ABI rules.

```
vector-function-abi-variant = "abi_mangled_name_01, abi_mangled_name_02(user_redirection),..."
```

1. Because we use only OpenMP `declare simd` vectorization, and
   because we require a vector Function ABI, we make this explicit
   in the name of the attribute.
2. Because the Vector Function ABIs encode all the information
   needed to know the vectorization shape of the vector function in
   the mangled names, we provide the mangled name via the
   attribute.
3. Function names redirection is specified by enclosing the name of
   the redirection in parenthesis, as in
   `abi_mangled_name_02(user_redirection)`.

## Vector ABI Demangler

The “Vector ABI demangler”, is the component that demangles the data
in the `vector-function-abi-variant` attribute and that provides the
instances of the class `VectorFunctionShape` that can be derived by
the mangled names listed in the attribute.

## Query interface: Search Vector Function System (SVFS)

An interface that can be queried by the LLVM components to understand
whether or not a scalar function can be vectorized, and that retrieves
the vector function to be used if such vector shape is available.

1. This component is going to be unrelated to OpenMP.
2. This component will use internally the demangler defined in the
   previous section, but it will not expose any aspect of the Vector
   Function ABI via its interface.

The interface provides two methods.

```
std::vector<VectorFunctionShape> SVFS::isFunctionVectorizable(llvm::CallInst * Call);

llvm::Function * SVFS::getVectorizedFunction(llvm::CallInst * Call, VectorFunctionShape Info);
```

The first method is used to list all the vector shapes that available
and attached to a scalar function. An empty results means that no
vector versions are available.

The second method retrieves the information needed to build a call to
a vector function with a specific `VectorFunctionShape` info.

# (SELF) ASSESSMENT ON EXTENDIBILITY


1. Extending the C function attribute `clang_declare_simd_variant` to
   new Vector Function ABIs that use OpenMP will be straightforward
   because the attribute is tight to such ABIs and OpenMP.
2. The C attribute `clang_declare_simd_variant` and the `declare
   variant` directive used for the `simd` trait will be sharing the
   internals in clang, so adding the OpenMP functionality for `simd`
   traits will be mostly handling the directive in the OpenMP
   parser. How this should be done is described in
   https://clang.llvm.org/docs/InternalsManual.html#how-to-add-an-attribute
3. The IR attribute `vector-function-abi-variant` is not to be
   extended to represent other kind of vectorization other than those
   handled by `declare simd` and that are handled with a Vector
   Function ABI.
4. The IR attribute `vector-function-abi-variant` is not defined to be
   extended to represent the information of `declare variant` in its
   totality.
5. The IR attribute will not need to change when we will introduce non
   vector function ABI vectorization (RV-like, reductions...) or when
   we will decide to fully support `declare variant`. The information
   it carries will not need to be invalidated, but just extended with
   new attributes that will need to be handled by the
   `VectorFunctionShape` class, in a similar way the
   `llvm::FPMathOperator` does with the `llvm::FastMathFlags`, which
   operates on individual attributes to describe an overall
   functionality.

# Examples

## Example 1

Exposing an Advanced SIMD vector function when targeting Advanced SIMD
in AArch64.

```
double foo_01(double Input) __attribute__(clang_declare_simd_variant(“vector_foo_01", simdlen(2), notinbranch, isa("simd"));

// Advanced SIMD version
float64x2_t vector_foo_01(float64x2_t VectorInput);
```

The resulting IR attribute is:

```
attribute #0 = {vector-abi-variant="_ZGVnN2v_foo_01(vector_foo_01)"}
```

## Example 2

Exposing an Advanced SIMD vector function when targeting Advanced SIMD
in AArch64, but with the wrong signature. The user specifies a masked
version of the function in the clauses of the attribute, the compiler
throws an error suggesting the signature expected for
``vector_foo_02.``

```
double foo_02(double Input) __attribute__(clang_declare_simd_variant(“vector_foo_02", simdlen(2), inbranch, isa("simd"));

// Advanced SIMD version
float64x2_t vector_foo_02(float64x2_t VectorInput);
// (suggested) compiler error ->                      ^ Missing mask parameter of type `uint64x2_t`.
```

## Example 3

Targeting `sincos`-like signatures.

```
void foo_03(double Input, double * Output) __attribute__(clang_declare_simd_variant(“vector_foo_03", simdlen(2), notinbranch, linear(Output, 1), isa("simd"));

// Advanced SIMD version
void vector_foo_03(float64x2_t VectorInput, double * Output);
```

The resulting IR attribute is:

```
attribute #0 = {vector-abi-variant="_ZGVnN2vl8_foo_03(vector_foo_03)"}
```
## Example 4

Scalable vectorization targeting SVE

```
double foo_04(double Input) __attribute__(clang_declare_simd_variant(“vector_foo_04", simdlen("scalable"), notinbranch, isa("sve"));

// SVE version
svfloat64_t vector_foo_04(svfloat64_t VectorInput, svbool_t Mask);
```

The resulting IR attribute is:

```
attribute #0 = {vector-abi-variant="_ZGVsM2v_foo_04(vector_foo_04)"}
```

## Example 5

Fixed length vectorization targeting SVE

```
double foo_05(double Input) __attribute__(clang_declare_simd_variant(“vector_foo_05", simdlen(4), inbranch, isa("sve"));

// Fixed-length SVE version
svfloat64_t vector_foo_05(svfloat64_t VectorInput, svbool_t Mask);
```

The resulting IR attribute is:

```
attribute #0 = {vector-abi-variant="_ZGVsM2v_foo_04(vector_foo_04)"}
```

## Example 06

This is an x86 example, equivalent to the one provided by Andrei
Elovikow in
http://lists.llvm.org/pipermail/llvm-dev/2019-June/132885.html. Godbolt
rendering with ICC at https://godbolt.org/z/Of1NxZ

```
float MyAdd(float* a, int b) __attribute__(clang_declare_simd_variant(“MyAddVec", simdlen(8), notinbranch, arch("core_2nd_gen_avx"))
{
  return *a + b;
}


__m256 MyAddVec(float* v_a, __m128i v_b1, __m128i v_b2);
```

The resulting IR attribute is:

```
attribute #0 = {vector-abi-variant="_ZGVbN8l4v_MyAdd(MyAddVec)"}
```

## Example showing interaction with `declare simd`

```
#pragma omp declare simd linear(a) notinbranch
float foo_06(float *a, int x) __attribute__(clang_declare_simd_variant(“vector_foo_06", simdlen(4), linear(a), notinbranch, arch("armv8.2-a+simd")) {
    return *a + x;
}

// Advanced SIMD version
float32x4_t vector_foo_06(float *a, int32x4_t vx) {
// Custom implementation.
}
```

The resulting IR attribute is made of three symbols:

1. `_ZGVnN2l4v_foo_06` and `_ZGVnN4l4v_foo_06`, which represent the
   ones the compiler builds by auto-vectorizing `foo_06` according to
   the rule defined in the Vector Function ABI specifications for
   AArch64.
2. `_ZGVnN4l4v_foo_06(vector_foo_06)`, which represents the
   user-defined redirection of the 4-lane version of `foo_06` to the
   custom implementation provided by the user when targeting Advanced
   SIMD for version 8.2 of the A64 instruction set.

```
attribute #0 = {vector-function-abi-variant="_ZGVnN2l4v_foo_06,_ZGVnN4l4v_foo_06,_ZGVnN4l4v_foo_06(vector_foo_06)"}
```

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Interface user provided vector functions with the vectorizer.

Joan Lluch via cfe-dev
Hi Francesco,

On 6/11/19 10:55 PM, Francesco Petrogalli wrote:

> Dear all,
>
> I have re-written the proposal for interfacing user provided vector
> functions, originally posted in both llvm-dev and cfe-dev mailing
> list:
>
> "[RFC] Expose user provided vector function for auto-vectorization."
>
> The proposal looks quite different from the original submission,
> therefore I took the liberty to start a new thread.
>
> The original thread generated some good discussion. In particular,
> Simon Moll and Johannes Doerfert (CCed) have managed to provide good
> arguments for the following claims:
>
> 1. The Vector Function ABI name mangling scheme of a target is not
>     enough to describe all uses cases of function vectorization that
>     the compiler might end up needing to support in the future.
I think the new name of the attribute makes this point clear.

> 2. `declare variant` needs to be handled properly at IR level, to be
>     able to give the compiler the full OpenMP context of the directive.
>
> This proposal addresses those two concerns and other (I believe) minor
> concerns that have been raised in the previous thread.
>
> This proposal is provided with examples and a self assessment around
> extendibility.
>
> I have CCed all the people that have participated in the discussion so
> far, please let me know if you think I have missed anything of what
> have been raised.
>
> Kind regards,
>
> Francesco

LGTM. Please add me as a reviewer for this when you post patches.

Thanks!

Simon

>
> *** DRAFT OF THE PROPOSAL ***
>
> # SCOPE OF THE RFC : Interface user provided vector functions with the vectorizer.
>
> Because the users care about portability (across compilers, libraries
> and systems), I believe we have to base sour solution on a standard
> that describes the mapping from the scalar function to the vector
> function.
>
> Because OpenMP is standard and widely used, we should base our
> solution on the mechanisms that the standard provides, via the
> directives `declare simd` and `declare variant`, the latter when used
> in with the `simd` trait in the `construct` set.
>
> Please notice that:
>
> 1. The scope of the proposal is not implementing full support for
>     `pragma omp declare variant`.
> 2. The scope of the proposal is not enabling the vectorizer to do new
>     kind of vectorizations (e.g. RV-like vectorization described by
>     Simon).
> 3. The proposal aims to be extendible wrt 1. and 2.
> 4. The IR attribute introduced in this proposal is equivalent to the
>     one needed for the VecClone pass under development in
>     https://reviews.llvm.org/D22792
>
> # CLANG COMPONENTS
>
> A C function attribute, `clang_declare_simd_variant`, to attach to the
> scalar version. The attribute provides enough information to the
> compiler about the vector shape of the user defined function. The
> vector shapes handled by the attribute are those handled by the OpenMP
> standard via `declare simd` (and no more than that).
>
> 1. The function attribute handling in clang is crafted with the
>     requirement that it will be possible to re-use the same components
>     for the info generated by `declare variant` when used with a `simd`
>     traits in the `construct` set.
> 2. The attribute allows orthogonality with the vectorization that is
>     done via OpenMP: the user vector function is still exposed for
>     vectorization when not using `-fopenmp-[simd]` once the `declare
>     simd` and `declare variant` directive of OpenMP will be available
>     in the front-end.
>
> ## C function attribute: `clang_declare_simd_variant`
>
> The definition of this attribute has been crafted to match the
> semantics of `declare variant` for a `simd` construct described in
> OpenMP 5.0. I have added only the traits of the `device` set, `isa`
> and `arch`, which I believe are enough to cover for the use case of
> this proposal. If that is not the case, please provide an example,
> extending the attribute will be easy even once the current one is
> implemented.
>
> ```
> clang_declare_simd_variant(<variant-func-id>, <simd clauses>{, <context selector clauses>})
>
> <variant-func-id>:= The name of a function variant that is a base language identifier, or,
>                      for C++, a template-id.
>
> <simd clauses> := <simdlen>, <mask>{, <optional simd clauses>}
>
> <simdlen> := simdlen(<positive number>) | simdlen("scalable")
>
> <mask>    := inbranch | notinbranch
>
> <optional simd clauses> := <linear clause>
>                           | <uniform clause>
>                           | <align clause>  | {,<optional simd clauses>}
>
> <linear clause>  := linear_ref(<var>,<step>)
>                    | linear_var(<var>, <step>)
>                    | linear_uval(<var>, <step>)
>                    | linear(<var>, <step>)
>
> <step> := <var> | <non zero number>
>
> <uniform clause> := uniform(<var>)
>
> <align clause>   := align(<var>, <positive number>)
>
> <var> := Name of a parameter in the scalar function declaration/definition
>
> <non zero number> := ... | -2 | -1 | 1 | 2 | ...
>
> <positive number> := 1 | 2 | 3 | ...
>
> <context selector clauses> := {<isa>}{,} {<arch>}
>
> <isa> := isa(target-specific-value)
>
> <arch> := arch(target-specific-value)
>
> ```
>
> # LLVM COMPONENTS:
>
> ## VectorFunctionShape class
>
> The object `VectorFunctionShape` contains the information about the
> kind of vectorization available for an `llvm::Call`.
>
> The object `VectorFunctionShape` must contain the following information:
>
> 1. Vectorization Factor (or number or concurrent lanes executed by the
>     SIMD version of the function). Encoded by unsigned integer.
> 2. Whether the vector function is requested for scalable
>     vectorization, encoded by a boolean.
> 3. Information about masking / no masking, encoded by a boolean.
> 4. Information about the parameters, encoded in a container that
>     carries objects of type `ParamaterType`, to describe features like
>     `linear` and `uniform`.
> 5. Function name redirection, if a user has specified to use a custom
>     name instead of the Vector Function ABI ones.
>
> Items 1. to 5. represents the information stored in the
> `vector-function-abi-variant` attribute (see next section).
>
> The object can be extended in the future to include new vectorization
> kinds (for example the RV-like vectorization of the Region
> Vectorizer), or to add more context information that might come from
> other uses of OpenMP `declare variant`, or to add new Vector Function
> ABIs not based on OpenMP. Such information can be retrieved by
> attributes that will be added to describe the `Call` instance.
>
> ## IR Attribute
>
> We define a `vector-function-abi-variant` attribute that lists the
> mangled names produced via the mangling function of the Vector
> Function ABI rules.
>
> ```
> vector-function-abi-variant = "abi_mangled_name_01, abi_mangled_name_02(user_redirection),..."
> ```
>
> 1. Because we use only OpenMP `declare simd` vectorization, and
>     because we require a vector Function ABI, we make this explicit
>     in the name of the attribute.
> 2. Because the Vector Function ABIs encode all the information
>     needed to know the vectorization shape of the vector function in
>     the mangled names, we provide the mangled name via the
>     attribute.
> 3. Function names redirection is specified by enclosing the name of
>     the redirection in parenthesis, as in
>     `abi_mangled_name_02(user_redirection)`.
>
> ## Vector ABI Demangler
>
> The “Vector ABI demangler”, is the component that demangles the data
> in the `vector-function-abi-variant` attribute and that provides the
> instances of the class `VectorFunctionShape` that can be derived by
> the mangled names listed in the attribute.
>
> ## Query interface: Search Vector Function System (SVFS)
>
> An interface that can be queried by the LLVM components to understand
> whether or not a scalar function can be vectorized, and that retrieves
> the vector function to be used if such vector shape is available.
>
> 1. This component is going to be unrelated to OpenMP.
> 2. This component will use internally the demangler defined in the
>     previous section, but it will not expose any aspect of the Vector
>     Function ABI via its interface.
>
> The interface provides two methods.
>
> ```
> std::vector<VectorFunctionShape> SVFS::isFunctionVectorizable(llvm::CallInst * Call);
>
> llvm::Function * SVFS::getVectorizedFunction(llvm::CallInst * Call, VectorFunctionShape Info);
> ```
>
> The first method is used to list all the vector shapes that available
> and attached to a scalar function. An empty results means that no
> vector versions are available.
>
> The second method retrieves the information needed to build a call to
> a vector function with a specific `VectorFunctionShape` info.
>
> # (SELF) ASSESSMENT ON EXTENDIBILITY
>
>
> 1. Extending the C function attribute `clang_declare_simd_variant` to
>     new Vector Function ABIs that use OpenMP will be straightforward
>     because the attribute is tight to such ABIs and OpenMP.
> 2. The C attribute `clang_declare_simd_variant` and the `declare
>     variant` directive used for the `simd` trait will be sharing the
>     internals in clang, so adding the OpenMP functionality for `simd`
>     traits will be mostly handling the directive in the OpenMP
>     parser. How this should be done is described in
>     https://clang.llvm.org/docs/InternalsManual.html#how-to-add-an-attribute
> 3. The IR attribute `vector-function-abi-variant` is not to be
>     extended to represent other kind of vectorization other than those
>     handled by `declare simd` and that are handled with a Vector
>     Function ABI.
> 4. The IR attribute `vector-function-abi-variant` is not defined to be
>     extended to represent the information of `declare variant` in its
>     totality.
> 5. The IR attribute will not need to change when we will introduce non
>     vector function ABI vectorization (RV-like, reductions...) or when
>     we will decide to fully support `declare variant`. The information
>     it carries will not need to be invalidated, but just extended with
>     new attributes that will need to be handled by the
>     `VectorFunctionShape` class, in a similar way the
>     `llvm::FPMathOperator` does with the `llvm::FastMathFlags`, which
>     operates on individual attributes to describe an overall
>     functionality.
>
> # Examples
>
> ## Example 1
>
> Exposing an Advanced SIMD vector function when targeting Advanced SIMD
> in AArch64.
>
> ```
> double foo_01(double Input) __attribute__(clang_declare_simd_variant(“vector_foo_01", simdlen(2), notinbranch, isa("simd"));
>
> // Advanced SIMD version
> float64x2_t vector_foo_01(float64x2_t VectorInput);
> ```
>
> The resulting IR attribute is:
>
> ```
> attribute #0 = {vector-abi-variant="_ZGVnN2v_foo_01(vector_foo_01)"}
> ```
>
> ## Example 2
>
> Exposing an Advanced SIMD vector function when targeting Advanced SIMD
> in AArch64, but with the wrong signature. The user specifies a masked
> version of the function in the clauses of the attribute, the compiler
> throws an error suggesting the signature expected for
> ``vector_foo_02.``
>
> ```
> double foo_02(double Input) __attribute__(clang_declare_simd_variant(“vector_foo_02", simdlen(2), inbranch, isa("simd"));
>
> // Advanced SIMD version
> float64x2_t vector_foo_02(float64x2_t VectorInput);
> // (suggested) compiler error ->                      ^ Missing mask parameter of type `uint64x2_t`.
> ```
>
> ## Example 3
>
> Targeting `sincos`-like signatures.
>
> ```
> void foo_03(double Input, double * Output) __attribute__(clang_declare_simd_variant(“vector_foo_03", simdlen(2), notinbranch, linear(Output, 1), isa("simd"));
>
> // Advanced SIMD version
> void vector_foo_03(float64x2_t VectorInput, double * Output);
> ```
>
> The resulting IR attribute is:
>
> ```
> attribute #0 = {vector-abi-variant="_ZGVnN2vl8_foo_03(vector_foo_03)"}
> ```
> ## Example 4
>
> Scalable vectorization targeting SVE
>
> ```
> double foo_04(double Input) __attribute__(clang_declare_simd_variant(“vector_foo_04", simdlen("scalable"), notinbranch, isa("sve"));
>
> // SVE version
> svfloat64_t vector_foo_04(svfloat64_t VectorInput, svbool_t Mask);
> ```
>
> The resulting IR attribute is:
>
> ```
> attribute #0 = {vector-abi-variant="_ZGVsM2v_foo_04(vector_foo_04)"}
> ```
>
> ## Example 5
>
> Fixed length vectorization targeting SVE
>
> ```
> double foo_05(double Input) __attribute__(clang_declare_simd_variant(“vector_foo_05", simdlen(4), inbranch, isa("sve"));
>
> // Fixed-length SVE version
> svfloat64_t vector_foo_05(svfloat64_t VectorInput, svbool_t Mask);
> ```
>
> The resulting IR attribute is:
>
> ```
> attribute #0 = {vector-abi-variant="_ZGVsM2v_foo_04(vector_foo_04)"}
> ```
>
> ## Example 06
>
> This is an x86 example, equivalent to the one provided by Andrei
> Elovikow in
> http://lists.llvm.org/pipermail/llvm-dev/2019-June/132885.html. Godbolt
> rendering with ICC at https://godbolt.org/z/Of1NxZ
>
> ```
> float MyAdd(float* a, int b) __attribute__(clang_declare_simd_variant(“MyAddVec", simdlen(8), notinbranch, arch("core_2nd_gen_avx"))
> {
>    return *a + b;
> }
>
>
> __m256 MyAddVec(float* v_a, __m128i v_b1, __m128i v_b2);
> ```
>
> The resulting IR attribute is:
>
> ```
> attribute #0 = {vector-abi-variant="_ZGVbN8l4v_MyAdd(MyAddVec)"}
> ```
>
> ## Example showing interaction with `declare simd`
>
> ```
> #pragma omp declare simd linear(a) notinbranch
> float foo_06(float *a, int x) __attribute__(clang_declare_simd_variant(“vector_foo_06", simdlen(4), linear(a), notinbranch, arch("armv8.2-a+simd")) {
>      return *a + x;
> }
>
> // Advanced SIMD version
> float32x4_t vector_foo_06(float *a, int32x4_t vx) {
> // Custom implementation.
> }
> ```
>
> The resulting IR attribute is made of three symbols:
>
> 1. `_ZGVnN2l4v_foo_06` and `_ZGVnN4l4v_foo_06`, which represent the
>     ones the compiler builds by auto-vectorizing `foo_06` according to
>     the rule defined in the Vector Function ABI specifications for
>     AArch64.
> 2. `_ZGVnN4l4v_foo_06(vector_foo_06)`, which represents the
>     user-defined redirection of the 4-lane version of `foo_06` to the
>     custom implementation provided by the user when targeting Advanced
>     SIMD for version 8.2 of the A64 instruction set.
>
> ```
> attribute #0 = {vector-function-abi-variant="_ZGVnN2l4v_foo_06,_ZGVnN4l4v_foo_06,_ZGVnN4l4v_foo_06(vector_foo_06)"}
> ```
>
--

Simon Moll
Researcher / PhD Student

Compiler Design Lab (Prof. Hack)
Saarland University, Computer Science
Building E1.3, Room 4.31

Tel. +49 (0)681 302-57521 : [hidden email]
Fax. +49 (0)681 302-3065  : http://compilers.cs.uni-saarland.de/people/moll

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Interface user provided vector functions with the vectorizer.

Joan Lluch via cfe-dev
I agree with Simon. This looks good conceptually. I have minor implementation comments but that can wait till the code reviews.

Sorry for the delay and thanks for working on this.


From: Simon Moll <[hidden email]>
Sent: Monday, June 17, 2019 10:02:58 AM
To: Francesco Petrogalli; LLVM Development List; Clang Dev
Cc: Renato Golin; Finkel, Hal J.; Andrea Bocci; Elovikov, Andrei; Alexey Bataev; Doerfert, Johannes; Saito, Hideki; Tian, Xinmin; nd; Roman Lebedev; Philip Reames; Shawn Landden
Subject: Re: RFC: Interface user provided vector functions with the vectorizer.
 
Hi Francesco,

On 6/11/19 10:55 PM, Francesco Petrogalli wrote:
> Dear all,
>
> I have re-written the proposal for interfacing user provided vector
> functions, originally posted in both llvm-dev and cfe-dev mailing
> list:
>
> "[RFC] Expose user provided vector function for auto-vectorization."
>
> The proposal looks quite different from the original submission,
> therefore I took the liberty to start a new thread.
>
> The original thread generated some good discussion. In particular,
> Simon Moll and Johannes Doerfert (CCed) have managed to provide good
> arguments for the following claims:
>
> 1. The Vector Function ABI name mangling scheme of a target is not
>     enough to describe all uses cases of function vectorization that
>     the compiler might end up needing to support in the future.
I think the new name of the attribute makes this point clear.
> 2. `declare variant` needs to be handled properly at IR level, to be
>     able to give the compiler the full OpenMP context of the directive.
>
> This proposal addresses those two concerns and other (I believe) minor
> concerns that have been raised in the previous thread.
>
> This proposal is provided with examples and a self assessment around
> extendibility.
>
> I have CCed all the people that have participated in the discussion so
> far, please let me know if you think I have missed anything of what
> have been raised.
>
> Kind regards,
>
> Francesco

LGTM. Please add me as a reviewer for this when you post patches.

Thanks!

Simon

>
> *** DRAFT OF THE PROPOSAL ***
>
> # SCOPE OF THE RFC : Interface user provided vector functions with the vectorizer.
>
> Because the users care about portability (across compilers, libraries
> and systems), I believe we have to base sour solution on a standard
> that describes the mapping from the scalar function to the vector
> function.
>
> Because OpenMP is standard and widely used, we should base our
> solution on the mechanisms that the standard provides, via the
> directives `declare simd` and `declare variant`, the latter when used
> in with the `simd` trait in the `construct` set.
>
> Please notice that:
>
> 1. The scope of the proposal is not implementing full support for
>     `pragma omp declare variant`.
> 2. The scope of the proposal is not enabling the vectorizer to do new
>     kind of vectorizations (e.g. RV-like vectorization described by
>     Simon).
> 3. The proposal aims to be extendible wrt 1. and 2.
> 4. The IR attribute introduced in this proposal is equivalent to the
>     one needed for the VecClone pass under development in
>     https://reviews.llvm.org/D22792
>
> # CLANG COMPONENTS
>
> A C function attribute, `clang_declare_simd_variant`, to attach to the
> scalar version. The attribute provides enough information to the
> compiler about the vector shape of the user defined function. The
> vector shapes handled by the attribute are those handled by the OpenMP
> standard via `declare simd` (and no more than that).
>
> 1. The function attribute handling in clang is crafted with the
>     requirement that it will be possible to re-use the same components
>     for the info generated by `declare variant` when used with a `simd`
>     traits in the `construct` set.
> 2. The attribute allows orthogonality with the vectorization that is
>     done via OpenMP: the user vector function is still exposed for
>     vectorization when not using `-fopenmp-[simd]` once the `declare
>     simd` and `declare variant` directive of OpenMP will be available
>     in the front-end.
>
> ## C function attribute: `clang_declare_simd_variant`
>
> The definition of this attribute has been crafted to match the
> semantics of `declare variant` for a `simd` construct described in
> OpenMP 5.0. I have added only the traits of the `device` set, `isa`
> and `arch`, which I believe are enough to cover for the use case of
> this proposal. If that is not the case, please provide an example,
> extending the attribute will be easy even once the current one is
> implemented.
>
> ```
> clang_declare_simd_variant(<variant-func-id>, <simd clauses>{, <context selector clauses>})
>
> <variant-func-id>:= The name of a function variant that is a base language identifier, or,
>                      for C++, a template-id.
>
> <simd clauses> := <simdlen>, <mask>{, <optional simd clauses>}
>
> <simdlen> := simdlen(<positive number>) | simdlen("scalable")
>
> <mask>    := inbranch | notinbranch
>
> <optional simd clauses> := <linear clause>
>                           | <uniform clause>
>                           | <align clause>  | {,<optional simd clauses>}
>
> <linear clause>  := linear_ref(<var>,<step>)
>                    | linear_var(<var>, <step>)
>                    | linear_uval(<var>, <step>)
>                    | linear(<var>, <step>)
>
> <step> := <var> | <non zero number>
>
> <uniform clause> := uniform(<var>)
>
> <align clause>   := align(<var>, <positive number>)
>
> <var> := Name of a parameter in the scalar function declaration/definition
>
> <non zero number> := ... | -2 | -1 | 1 | 2 | ...
>
> <positive number> := 1 | 2 | 3 | ...
>
> <context selector clauses> := {<isa>}{,} {<arch>}
>
> <isa> := isa(target-specific-value)
>
> <arch> := arch(target-specific-value)
>
> ```
>
> # LLVM COMPONENTS:
>
> ## VectorFunctionShape class
>
> The object `VectorFunctionShape` contains the information about the
> kind of vectorization available for an `llvm::Call`.
>
> The object `VectorFunctionShape` must contain the following information:
>
> 1. Vectorization Factor (or number or concurrent lanes executed by the
>     SIMD version of the function). Encoded by unsigned integer.
> 2. Whether the vector function is requested for scalable
>     vectorization, encoded by a boolean.
> 3. Information about masking / no masking, encoded by a boolean.
> 4. Information about the parameters, encoded in a container that
>     carries objects of type `ParamaterType`, to describe features like
>     `linear` and `uniform`.
> 5. Function name redirection, if a user has specified to use a custom
>     name instead of the Vector Function ABI ones.
>
> Items 1. to 5. represents the information stored in the
> `vector-function-abi-variant` attribute (see next section).
>
> The object can be extended in the future to include new vectorization
> kinds (for example the RV-like vectorization of the Region
> Vectorizer), or to add more context information that might come from
> other uses of OpenMP `declare variant`, or to add new Vector Function
> ABIs not based on OpenMP. Such information can be retrieved by
> attributes that will be added to describe the `Call` instance.
>
> ## IR Attribute
>
> We define a `vector-function-abi-variant` attribute that lists the
> mangled names produced via the mangling function of the Vector
> Function ABI rules.
>
> ```
> vector-function-abi-variant = "abi_mangled_name_01, abi_mangled_name_02(user_redirection),..."
> ```
>
> 1. Because we use only OpenMP `declare simd` vectorization, and
>     because we require a vector Function ABI, we make this explicit
>     in the name of the attribute.
> 2. Because the Vector Function ABIs encode all the information
>     needed to know the vectorization shape of the vector function in
>     the mangled names, we provide the mangled name via the
>     attribute.
> 3. Function names redirection is specified by enclosing the name of
>     the redirection in parenthesis, as in
>     `abi_mangled_name_02(user_redirection)`.
>
> ## Vector ABI Demangler
>
> The “Vector ABI demangler”, is the component that demangles the data
> in the `vector-function-abi-variant` attribute and that provides the
> instances of the class `VectorFunctionShape` that can be derived by
> the mangled names listed in the attribute.
>
> ## Query interface: Search Vector Function System (SVFS)
>
> An interface that can be queried by the LLVM components to understand
> whether or not a scalar function can be vectorized, and that retrieves
> the vector function to be used if such vector shape is available.
>
> 1. This component is going to be unrelated to OpenMP.
> 2. This component will use internally the demangler defined in the
>     previous section, but it will not expose any aspect of the Vector
>     Function ABI via its interface.
>
> The interface provides two methods.
>
> ```
> std::vector<VectorFunctionShape> SVFS::isFunctionVectorizable(llvm::CallInst * Call);
>
> llvm::Function * SVFS::getVectorizedFunction(llvm::CallInst * Call, VectorFunctionShape Info);
> ```
>
> The first method is used to list all the vector shapes that available
> and attached to a scalar function. An empty results means that no
> vector versions are available.
>
> The second method retrieves the information needed to build a call to
> a vector function with a specific `VectorFunctionShape` info.
>
> # (SELF) ASSESSMENT ON EXTENDIBILITY
>
>
> 1. Extending the C function attribute `clang_declare_simd_variant` to
>     new Vector Function ABIs that use OpenMP will be straightforward
>     because the attribute is tight to such ABIs and OpenMP.
> 2. The C attribute `clang_declare_simd_variant` and the `declare
>     variant` directive used for the `simd` trait will be sharing the
>     internals in clang, so adding the OpenMP functionality for `simd`
>     traits will be mostly handling the directive in the OpenMP
>     parser. How this should be done is described in
>     https://clang.llvm.org/docs/InternalsManual.html#how-to-add-an-attribute
> 3. The IR attribute `vector-function-abi-variant` is not to be
>     extended to represent other kind of vectorization other than those
>     handled by `declare simd` and that are handled with a Vector
>     Function ABI.
> 4. The IR attribute `vector-function-abi-variant` is not defined to be
>     extended to represent the information of `declare variant` in its
>     totality.
> 5. The IR attribute will not need to change when we will introduce non
>     vector function ABI vectorization (RV-like, reductions...) or when
>     we will decide to fully support `declare variant`. The information
>     it carries will not need to be invalidated, but just extended with
>     new attributes that will need to be handled by the
>     `VectorFunctionShape` class, in a similar way the
>     `llvm::FPMathOperator` does with the `llvm::FastMathFlags`, which
>     operates on individual attributes to describe an overall
>     functionality.
>
> # Examples
>
> ## Example 1
>
> Exposing an Advanced SIMD vector function when targeting Advanced SIMD
> in AArch64.
>
> ```
> double foo_01(double Input) __attribute__(clang_declare_simd_variant(“vector_foo_01", simdlen(2), notinbranch, isa("simd"));
>
> // Advanced SIMD version
> float64x2_t vector_foo_01(float64x2_t VectorInput);
> ```
>
> The resulting IR attribute is:
>
> ```
> attribute #0 = {vector-abi-variant="_ZGVnN2v_foo_01(vector_foo_01)"}
> ```
>
> ## Example 2
>
> Exposing an Advanced SIMD vector function when targeting Advanced SIMD
> in AArch64, but with the wrong signature. The user specifies a masked
> version of the function in the clauses of the attribute, the compiler
> throws an error suggesting the signature expected for
> ``vector_foo_02.``
>
> ```
> double foo_02(double Input) __attribute__(clang_declare_simd_variant(“vector_foo_02", simdlen(2), inbranch, isa("simd"));
>
> // Advanced SIMD version
> float64x2_t vector_foo_02(float64x2_t VectorInput);
> // (suggested) compiler error ->                      ^ Missing mask parameter of type `uint64x2_t`.
> ```
>
> ## Example 3
>
> Targeting `sincos`-like signatures.
>
> ```
> void foo_03(double Input, double * Output) __attribute__(clang_declare_simd_variant(“vector_foo_03", simdlen(2), notinbranch, linear(Output, 1), isa("simd"));
>
> // Advanced SIMD version
> void vector_foo_03(float64x2_t VectorInput, double * Output);
> ```
>
> The resulting IR attribute is:
>
> ```
> attribute #0 = {vector-abi-variant="_ZGVnN2vl8_foo_03(vector_foo_03)"}
> ```
> ## Example 4
>
> Scalable vectorization targeting SVE
>
> ```
> double foo_04(double Input) __attribute__(clang_declare_simd_variant(“vector_foo_04", simdlen("scalable"), notinbranch, isa("sve"));
>
> // SVE version
> svfloat64_t vector_foo_04(svfloat64_t VectorInput, svbool_t Mask);
> ```
>
> The resulting IR attribute is:
>
> ```
> attribute #0 = {vector-abi-variant="_ZGVsM2v_foo_04(vector_foo_04)"}
> ```
>
> ## Example 5
>
> Fixed length vectorization targeting SVE
>
> ```
> double foo_05(double Input) __attribute__(clang_declare_simd_variant(“vector_foo_05", simdlen(4), inbranch, isa("sve"));
>
> // Fixed-length SVE version
> svfloat64_t vector_foo_05(svfloat64_t VectorInput, svbool_t Mask);
> ```
>
> The resulting IR attribute is:
>
> ```
> attribute #0 = {vector-abi-variant="_ZGVsM2v_foo_04(vector_foo_04)"}
> ```
>
> ## Example 06
>
> This is an x86 example, equivalent to the one provided by Andrei
> Elovikow in
> http://lists.llvm.org/pipermail/llvm-dev/2019-June/132885.html. Godbolt
> rendering with ICC at https://godbolt.org/z/Of1NxZ
>
> ```
> float MyAdd(float* a, int b) __attribute__(clang_declare_simd_variant(“MyAddVec", simdlen(8), notinbranch, arch("core_2nd_gen_avx"))
> {
>    return *a + b;
> }
>
>
> __m256 MyAddVec(float* v_a, __m128i v_b1, __m128i v_b2);
> ```
>
> The resulting IR attribute is:
>
> ```
> attribute #0 = {vector-abi-variant="_ZGVbN8l4v_MyAdd(MyAddVec)"}
> ```
>
> ## Example showing interaction with `declare simd`
>
> ```
> #pragma omp declare simd linear(a) notinbranch
> float foo_06(float *a, int x) __attribute__(clang_declare_simd_variant(“vector_foo_06", simdlen(4), linear(a), notinbranch, arch("armv8.2-a+simd")) {
>      return *a + x;
> }
>
> // Advanced SIMD version
> float32x4_t vector_foo_06(float *a, int32x4_t vx) {
> // Custom implementation.
> }
> ```
>
> The resulting IR attribute is made of three symbols:
>
> 1. `_ZGVnN2l4v_foo_06` and `_ZGVnN4l4v_foo_06`, which represent the
>     ones the compiler builds by auto-vectorizing `foo_06` according to
>     the rule defined in the Vector Function ABI specifications for
>     AArch64.
> 2. `_ZGVnN4l4v_foo_06(vector_foo_06)`, which represents the
>     user-defined redirection of the 4-lane version of `foo_06` to the
>     custom implementation provided by the user when targeting Advanced
>     SIMD for version 8.2 of the A64 instruction set.
>
> ```
> attribute #0 = {vector-function-abi-variant="_ZGVnN2l4v_foo_06,_ZGVnN4l4v_foo_06,_ZGVnN4l4v_foo_06(vector_foo_06)"}
> ```
>
--

Simon Moll
Researcher / PhD Student

Compiler Design Lab (Prof. Hack)
Saarland University, Computer Science
Building E1.3, Room 4.31

Tel. +49 (0)681 302-57521 : [hidden email]
Fax. +49 (0)681 302-3065  : http://compilers.cs.uni-saarland.de/people/moll


_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Interface user provided vector functions with the vectorizer.

Joan Lluch via cfe-dev
Hi all - I am working with a colleague to provide an initial implementation of this.

We encountered a problem when dealing with generating the vector signatures of functions that use complex data.

In this proposal, we expect the SVFS component in the backed to demangle the name of the function in the attribute to be able to reconstruct the signature of the vector function from the scalar function signature.

In case of Complex data, this doesn’t seem to be possible, because the information of “being a vector of 2 lanes” that is supposed to be carried by the complex scalar is lost in the transformation the data type in a “coerced” type.

Consider these three types and the function `foo`:

// Type 1
typedef _Complex int S;

// Type 2
typedef struct x{
 int a;
 int b;
} S;

// Type 3
typedef uint64_t S;

S foo(S a, S b) {
 return ...;
}

In all cases, the IR type of the parameters in `foo` is i64, therefore is not possible to distinguish what C type generated the signature of `foo`.

I don’t know if this is going to be a problem for other architectures, but this is definitely a problem on AArch64 where we need to be able to generate the correct vector function signature for a specific simdlen(N) attached on `foo`. When simdlen(2), for type 1 the vector type is <4 x i32>, for type 2 is <2 x i64*>, for type 3 is <2 x i64>.

Therefore, I would like to propose a change to the RFC, which would move the responsibility off generating the vector function signature from LLVM to clang.

In particular, (and this I believe has already been mentioned by Johannes), we could use the @llvm.compiler.used intrinsic to mark those declaration that needs to stay in the IR and not optimized away OPT before reaching the vectorizer.

In summary, the change would consist of:

1. Generate symbols declaration/definitions of the vector function with the mangled name in the IR, and mark it with @llvm-compiler.used. This could be done in CGOpenMPRuntime.cpp
2. Use the attribute vector-abs-variant defined in this RFC to map scalar names to vector ABI mangled name, and used the same redirection mechanism for the user provided vector name.
3. Move the “vector function signature generation” from the SVFS in LLVM to the openmp code generator of the clang frontend

The SVFS query system would still work as in the current proposal. The only difference is that the vector function signature would be given by the frontend and not need to be recomputed.

Here is an example of ho the IR would look like with this change:

```
@llvm.compiler.used = appending global [1 x i8*] [i8* bitcast (<2 x i32> (<2 x i32>)* @f to i8*)], section "llvm.metadata"

declare dso_local <2 x i32> @_ZGVnN2v_foo(<2 x i32> returned)

declare i32 @foo(i32) #0

; other function definition, including the one provided by the user `my_vector_foo` if the user provided a definition and not just the declaration

attribute #0 = {vector-function-abi-variant=“_ZGVnN2v_foo(my_vector_foo)"}

```

If the attribute @llvm.compiler.used is not suitable for this (I am not aware of all implication of using it on a global symbol), maybe we could come up with a intrinsics that does what we need (avoid deleting declarations that are not used) and name it @llvm.vector.function.used?

Please let me know what you think, I will submit an updated proposal next week.

Kind regards,

Francesco

> On Jun 17, 2019, at 7:05 AM, Doerfert, Johannes <[hidden email]> wrote:
>
> I agree with Simon. This looks good conceptually. I have minor implementation comments but that can wait till the code reviews.
>
> Sorry for the delay and thanks for working on this.
>
> Get Outlook for Android
>
> From: Simon Moll <[hidden email]>
> Sent: Monday, June 17, 2019 10:02:58 AM
> To: Francesco Petrogalli; LLVM Development List; Clang Dev
> Cc: Renato Golin; Finkel, Hal J.; Andrea Bocci; Elovikov, Andrei; Alexey Bataev; Doerfert, Johannes; Saito, Hideki; Tian, Xinmin; nd; Roman Lebedev; Philip Reames; Shawn Landden
> Subject: Re: RFC: Interface user provided vector functions with the vectorizer.
>  
> Hi Francesco,
>
> On 6/11/19 10:55 PM, Francesco Petrogalli wrote:
> > Dear all,
> >
> > I have re-written the proposal for interfacing user provided vector
> > functions, originally posted in both llvm-dev and cfe-dev mailing
> > list:
> >
> > "[RFC] Expose user provided vector function for auto-vectorization."
> >
> > The proposal looks quite different from the original submission,
> > therefore I took the liberty to start a new thread.
> >
> > The original thread generated some good discussion. In particular,
> > Simon Moll and Johannes Doerfert (CCed) have managed to provide good
> > arguments for the following claims:
> >
> > 1. The Vector Function ABI name mangling scheme of a target is not
> >     enough to describe all uses cases of function vectorization that
> >     the compiler might end up needing to support in the future.
> I think the new name of the attribute makes this point clear.
> > 2. `declare variant` needs to be handled properly at IR level, to be
> >     able to give the compiler the full OpenMP context of the directive.
> >
> > This proposal addresses those two concerns and other (I believe) minor
> > concerns that have been raised in the previous thread.
> >
> > This proposal is provided with examples and a self assessment around
> > extendibility.
> >
> > I have CCed all the people that have participated in the discussion so
> > far, please let me know if you think I have missed anything of what
> > have been raised.
> >
> > Kind regards,
> >
> > Francesco
>
> LGTM. Please add me as a reviewer for this when you post patches.
>
> Thanks!
>
> Simon
>
> >
> > *** DRAFT OF THE PROPOSAL ***
> >
> > # SCOPE OF THE RFC : Interface user provided vector functions with the vectorizer.
> >
> > Because the users care about portability (across compilers, libraries
> > and systems), I believe we have to base sour solution on a standard
> > that describes the mapping from the scalar function to the vector
> > function.
> >
> > Because OpenMP is standard and widely used, we should base our
> > solution on the mechanisms that the standard provides, via the
> > directives `declare simd` and `declare variant`, the latter when used
> > in with the `simd` trait in the `construct` set.
> >
> > Please notice that:
> >
> > 1. The scope of the proposal is not implementing full support for
> >     `pragma omp declare variant`.
> > 2. The scope of the proposal is not enabling the vectorizer to do new
> >     kind of vectorizations (e.g. RV-like vectorization described by
> >     Simon).
> > 3. The proposal aims to be extendible wrt 1. and 2.
> > 4. The IR attribute introduced in this proposal is equivalent to the
> >     one needed for the VecClone pass under development in
> >     https://reviews.llvm.org/D22792
> >
> > # CLANG COMPONENTS
> >
> > A C function attribute, `clang_declare_simd_variant`, to attach to the
> > scalar version. The attribute provides enough information to the
> > compiler about the vector shape of the user defined function. The
> > vector shapes handled by the attribute are those handled by the OpenMP
> > standard via `declare simd` (and no more than that).
> >
> > 1. The function attribute handling in clang is crafted with the
> >     requirement that it will be possible to re-use the same components
> >     for the info generated by `declare variant` when used with a `simd`
> >     traits in the `construct` set.
> > 2. The attribute allows orthogonality with the vectorization that is
> >     done via OpenMP: the user vector function is still exposed for
> >     vectorization when not using `-fopenmp-[simd]` once the `declare
> >     simd` and `declare variant` directive of OpenMP will be available
> >     in the front-end.
> >
> > ## C function attribute: `clang_declare_simd_variant`
> >
> > The definition of this attribute has been crafted to match the
> > semantics of `declare variant` for a `simd` construct described in
> > OpenMP 5.0. I have added only the traits of the `device` set, `isa`
> > and `arch`, which I believe are enough to cover for the use case of
> > this proposal. If that is not the case, please provide an example,
> > extending the attribute will be easy even once the current one is
> > implemented.
> >
> > ```
> > clang_declare_simd_variant(<variant-func-id>, <simd clauses>{, <context selector clauses>})
> >
> > <variant-func-id>:= The name of a function variant that is a base language identifier, or,
> >                      for C++, a template-id.
> >
> > <simd clauses> := <simdlen>, <mask>{, <optional simd clauses>}
> >
> > <simdlen> := simdlen(<positive number>) | simdlen("scalable")
> >
> > <mask>    := inbranch | notinbranch
> >
> > <optional simd clauses> := <linear clause>
> >                           | <uniform clause>
> >                           | <align clause>  | {,<optional simd clauses>}
> >
> > <linear clause>  := linear_ref(<var>,<step>)
> >                    | linear_var(<var>, <step>)
> >                    | linear_uval(<var>, <step>)
> >                    | linear(<var>, <step>)
> >
> > <step> := <var> | <non zero number>
> >
> > <uniform clause> := uniform(<var>)
> >
> > <align clause>   := align(<var>, <positive number>)
> >
> > <var> := Name of a parameter in the scalar function declaration/definition
> >
> > <non zero number> := ... | -2 | -1 | 1 | 2 | ...
> >
> > <positive number> := 1 | 2 | 3 | ...
> >
> > <context selector clauses> := {<isa>}{,} {<arch>}
> >
> > <isa> := isa(target-specific-value)
> >
> > <arch> := arch(target-specific-value)
> >
> > ```
> >
> > # LLVM COMPONENTS:
> >
> > ## VectorFunctionShape class
> >
> > The object `VectorFunctionShape` contains the information about the
> > kind of vectorization available for an `llvm::Call`.
> >
> > The object `VectorFunctionShape` must contain the following information:
> >
> > 1. Vectorization Factor (or number or concurrent lanes executed by the
> >     SIMD version of the function). Encoded by unsigned integer.
> > 2. Whether the vector function is requested for scalable
> >     vectorization, encoded by a boolean.
> > 3. Information about masking / no masking, encoded by a boolean.
> > 4. Information about the parameters, encoded in a container that
> >     carries objects of type `ParamaterType`, to describe features like
> >     `linear` and `uniform`.
> > 5. Function name redirection, if a user has specified to use a custom
> >     name instead of the Vector Function ABI ones.
> >
> > Items 1. to 5. represents the information stored in the
> > `vector-function-abi-variant` attribute (see next section).
> >
> > The object can be extended in the future to include new vectorization
> > kinds (for example the RV-like vectorization of the Region
> > Vectorizer), or to add more context information that might come from
> > other uses of OpenMP `declare variant`, or to add new Vector Function
> > ABIs not based on OpenMP. Such information can be retrieved by
> > attributes that will be added to describe the `Call` instance.
> >
> > ## IR Attribute
> >
> > We define a `vector-function-abi-variant` attribute that lists the
> > mangled names produced via the mangling function of the Vector
> > Function ABI rules.
> >
> > ```
> > vector-function-abi-variant = "abi_mangled_name_01, abi_mangled_name_02(user_redirection),..."
> > ```
> >
> > 1. Because we use only OpenMP `declare simd` vectorization, and
> >     because we require a vector Function ABI, we make this explicit
> >     in the name of the attribute.
> > 2. Because the Vector Function ABIs encode all the information
> >     needed to know the vectorization shape of the vector function in
> >     the mangled names, we provide the mangled name via the
> >     attribute.
> > 3. Function names redirection is specified by enclosing the name of
> >     the redirection in parenthesis, as in
> >     `abi_mangled_name_02(user_redirection)`.
> >
> > ## Vector ABI Demangler
> >
> > The “Vector ABI demangler”, is the component that demangles the data
> > in the `vector-function-abi-variant` attribute and that provides the
> > instances of the class `VectorFunctionShape` that can be derived by
> > the mangled names listed in the attribute.
> >
> > ## Query interface: Search Vector Function System (SVFS)
> >
> > An interface that can be queried by the LLVM components to understand
> > whether or not a scalar function can be vectorized, and that retrieves
> > the vector function to be used if such vector shape is available.
> >
> > 1. This component is going to be unrelated to OpenMP.
> > 2. This component will use internally the demangler defined in the
> >     previous section, but it will not expose any aspect of the Vector
> >     Function ABI via its interface.
> >
> > The interface provides two methods.
> >
> > ```
> > std::vector<VectorFunctionShape> SVFS::isFunctionVectorizable(llvm::CallInst * Call);
> >
> > llvm::Function * SVFS::getVectorizedFunction(llvm::CallInst * Call, VectorFunctionShape Info);
> > ```
> >
> > The first method is used to list all the vector shapes that available
> > and attached to a scalar function. An empty results means that no
> > vector versions are available.
> >
> > The second method retrieves the information needed to build a call to
> > a vector function with a specific `VectorFunctionShape` info.
> >
> > # (SELF) ASSESSMENT ON EXTENDIBILITY
> >
> >
> > 1. Extending the C function attribute `clang_declare_simd_variant` to
> >     new Vector Function ABIs that use OpenMP will be straightforward
> >     because the attribute is tight to such ABIs and OpenMP.
> > 2. The C attribute `clang_declare_simd_variant` and the `declare
> >     variant` directive used for the `simd` trait will be sharing the
> >     internals in clang, so adding the OpenMP functionality for `simd`
> >     traits will be mostly handling the directive in the OpenMP
> >     parser. How this should be done is described in
> >     https://clang.llvm.org/docs/InternalsManual.html#how-to-add-an-attribute
> > 3. The IR attribute `vector-function-abi-variant` is not to be
> >     extended to represent other kind of vectorization other than those
> >     handled by `declare simd` and that are handled with a Vector
> >     Function ABI.
> > 4. The IR attribute `vector-function-abi-variant` is not defined to be
> >     extended to represent the information of `declare variant` in its
> >     totality.
> > 5. The IR attribute will not need to change when we will introduce non
> >     vector function ABI vectorization (RV-like, reductions...) or when
> >     we will decide to fully support `declare variant`. The information
> >     it carries will not need to be invalidated, but just extended with
> >     new attributes that will need to be handled by the
> >     `VectorFunctionShape` class, in a similar way the
> >     `llvm::FPMathOperator` does with the `llvm::FastMathFlags`, which
> >     operates on individual attributes to describe an overall
> >     functionality.
> >
> > # Examples
> >
> > ## Example 1
> >
> > Exposing an Advanced SIMD vector function when targeting Advanced SIMD
> > in AArch64.
> >
> > ```
> > double foo_01(double Input) __attribute__(clang_declare_simd_variant(“vector_foo_01", simdlen(2), notinbranch, isa("simd"));
> >
> > // Advanced SIMD version
> > float64x2_t vector_foo_01(float64x2_t VectorInput);
> > ```
> >
> > The resulting IR attribute is:
> >
> > ```
> > attribute #0 = {vector-abi-variant="_ZGVnN2v_foo_01(vector_foo_01)"}
> > ```
> >
> > ## Example 2
> >
> > Exposing an Advanced SIMD vector function when targeting Advanced SIMD
> > in AArch64, but with the wrong signature. The user specifies a masked
> > version of the function in the clauses of the attribute, the compiler
> > throws an error suggesting the signature expected for
> > ``vector_foo_02.``
> >
> > ```
> > double foo_02(double Input) __attribute__(clang_declare_simd_variant(“vector_foo_02", simdlen(2), inbranch, isa("simd"));
> >
> > // Advanced SIMD version
> > float64x2_t vector_foo_02(float64x2_t VectorInput);
> > // (suggested) compiler error ->                      ^ Missing mask parameter of type `uint64x2_t`.
> > ```
> >
> > ## Example 3
> >
> > Targeting `sincos`-like signatures.
> >
> > ```
> > void foo_03(double Input, double * Output) __attribute__(clang_declare_simd_variant(“vector_foo_03", simdlen(2), notinbranch, linear(Output, 1), isa("simd"));
> >
> > // Advanced SIMD version
> > void vector_foo_03(float64x2_t VectorInput, double * Output);
> > ```
> >
> > The resulting IR attribute is:
> >
> > ```
> > attribute #0 = {vector-abi-variant="_ZGVnN2vl8_foo_03(vector_foo_03)"}
> > ```
> > ## Example 4
> >
> > Scalable vectorization targeting SVE
> >
> > ```
> > double foo_04(double Input) __attribute__(clang_declare_simd_variant(“vector_foo_04", simdlen("scalable"), notinbranch, isa("sve"));
> >
> > // SVE version
> > svfloat64_t vector_foo_04(svfloat64_t VectorInput, svbool_t Mask);
> > ```
> >
> > The resulting IR attribute is:
> >
> > ```
> > attribute #0 = {vector-abi-variant="_ZGVsM2v_foo_04(vector_foo_04)"}
> > ```
> >
> > ## Example 5
> >
> > Fixed length vectorization targeting SVE
> >
> > ```
> > double foo_05(double Input) __attribute__(clang_declare_simd_variant(“vector_foo_05", simdlen(4), inbranch, isa("sve"));
> >
> > // Fixed-length SVE version
> > svfloat64_t vector_foo_05(svfloat64_t VectorInput, svbool_t Mask);
> > ```
> >
> > The resulting IR attribute is:
> >
> > ```
> > attribute #0 = {vector-abi-variant="_ZGVsM2v_foo_04(vector_foo_04)"}
> > ```
> >
> > ## Example 06
> >
> > This is an x86 example, equivalent to the one provided by Andrei
> > Elovikow in
> > http://lists.llvm.org/pipermail/llvm-dev/2019-June/132885.html. Godbolt
> > rendering with ICC at https://godbolt.org/z/Of1NxZ
> >
> > ```
> > float MyAdd(float* a, int b) __attribute__(clang_declare_simd_variant(“MyAddVec", simdlen(8), notinbranch, arch("core_2nd_gen_avx"))
> > {
> >    return *a + b;
> > }
> >
> >
> > __m256 MyAddVec(float* v_a, __m128i v_b1, __m128i v_b2);
> > ```
> >
> > The resulting IR attribute is:
> >
> > ```
> > attribute #0 = {vector-abi-variant="_ZGVbN8l4v_MyAdd(MyAddVec)"}
> > ```
> >
> > ## Example showing interaction with `declare simd`
> >
> > ```
> > #pragma omp declare simd linear(a) notinbranch
> > float foo_06(float *a, int x) __attribute__(clang_declare_simd_variant(“vector_foo_06", simdlen(4), linear(a), notinbranch, arch("armv8.2-a+simd")) {
> >      return *a + x;
> > }
> >
> > // Advanced SIMD version
> > float32x4_t vector_foo_06(float *a, int32x4_t vx) {
> > // Custom implementation.
> > }
> > ```
> >
> > The resulting IR attribute is made of three symbols:
> >
> > 1. `_ZGVnN2l4v_foo_06` and `_ZGVnN4l4v_foo_06`, which represent the
> >     ones the compiler builds by auto-vectorizing `foo_06` according to
> >     the rule defined in the Vector Function ABI specifications for
> >     AArch64.
> > 2. `_ZGVnN4l4v_foo_06(vector_foo_06)`, which represents the
> >     user-defined redirection of the 4-lane version of `foo_06` to the
> >     custom implementation provided by the user when targeting Advanced
> >     SIMD for version 8.2 of the A64 instruction set.
> >
> > ```
> > attribute #0 = {vector-function-abi-variant="_ZGVnN2l4v_foo_06,_ZGVnN4l4v_foo_06,_ZGVnN4l4v_foo_06(vector_foo_06)"}
> > ```
> >
> --
>
> Simon Moll
> Researcher / PhD Student
>
> Compiler Design Lab (Prof. Hack)
> Saarland University, Computer Science
> Building E1.3, Room 4.31
>
> Tel. +49 (0)681 302-57521 : [hidden email]
> Fax. +49 (0)681 302-3065  : http://compilers.cs.uni-saarland.de/people/moll

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Interface user provided vector functions with the vectorizer.

Joan Lluch via cfe-dev

>In all cases, the IR type of the parameters in `foo` is i64, therefore is not possible to distinguish what C type generated the signature of `foo`.

Ouch.

>I don’t know if this is going to be a problem for other architectures

I haven't checked what IA-32/Intel64 should do for type 2, but I fully agree that this needs to be done properly according to the ABI.

>Therefore, I would like to propose a change to the RFC, which would move the responsibility off generating the vector function signature from LLVM to clang.

Makes sense to me.

-----Original Message-----
From: Francesco Petrogalli [mailto:[hidden email]]
Sent: Friday, June 21, 2019 2:04 PM
To: Doerfert, Johannes <[hidden email]>
Cc: Simon Moll <[hidden email]>; LLVM Development List <[hidden email]>; Clang Dev <[hidden email]>; Renato Golin <[hidden email]>; Finkel, Hal J. <[hidden email]>; Andrea Bocci <[hidden email]>; Elovikov, Andrei <[hidden email]>; Alexey Bataev <[hidden email]>; Saito, Hideki <[hidden email]>; Tian, Xinmin <[hidden email]>; nd <[hidden email]>; Roman Lebedev <[hidden email]>; Philip Reames <[hidden email]>; Shawn Landden <[hidden email]>
Subject: Re: RFC: Interface user provided vector functions with the vectorizer.

Hi all - I am working with a colleague to provide an initial implementation of this.

We encountered a problem when dealing with generating the vector signatures of functions that use complex data.

In this proposal, we expect the SVFS component in the backed to demangle the name of the function in the attribute to be able to reconstruct the signature of the vector function from the scalar function signature.

In case of Complex data, this doesn’t seem to be possible, because the information of “being a vector of 2 lanes” that is supposed to be carried by the complex scalar is lost in the transformation the data type in a “coerced” type.

Consider these three types and the function `foo`:

// Type 1
typedef _Complex int S;

// Type 2
typedef struct x{
 int a;
 int b;
} S;

// Type 3
typedef uint64_t S;

S foo(S a, S b) {
 return ...;
}

In all cases, the IR type of the parameters in `foo` is i64, therefore is not possible to distinguish what C type generated the signature of `foo`.

I don’t know if this is going to be a problem for other architectures, but this is definitely a problem on AArch64 where we need to be able to generate the correct vector function signature for a specific simdlen(N) attached on `foo`. When simdlen(2), for type 1 the vector type is <4 x i32>, for type 2 is <2 x i64*>, for type 3 is <2 x i64>.

Therefore, I would like to propose a change to the RFC, which would move the responsibility off generating the vector function signature from LLVM to clang.

In particular, (and this I believe has already been mentioned by Johannes), we could use the @llvm.compiler.used intrinsic to mark those declaration that needs to stay in the IR and not optimized away OPT before reaching the vectorizer.

In summary, the change would consist of:

1. Generate symbols declaration/definitions of the vector function with the mangled name in the IR, and mark it with @llvm-compiler.used. This could be done in CGOpenMPRuntime.cpp 2. Use the attribute vector-abs-variant defined in this RFC to map scalar names to vector ABI mangled name, and used the same redirection mechanism for the user provided vector name.
3. Move the “vector function signature generation” from the SVFS in LLVM to the openmp code generator of the clang frontend

The SVFS query system would still work as in the current proposal. The only difference is that the vector function signature would be given by the frontend and not need to be recomputed.

Here is an example of ho the IR would look like with this change:

```
@llvm.compiler.used = appending global [1 x i8*] [i8* bitcast (<2 x i32> (<2 x i32>)* @f to i8*)], section "llvm.metadata"

declare dso_local <2 x i32> @_ZGVnN2v_foo(<2 x i32> returned)

declare i32 @foo(i32) #0

; other function definition, including the one provided by the user `my_vector_foo` if the user provided a definition and not just the declaration

attribute #0 = {vector-function-abi-variant=“_ZGVnN2v_foo(my_vector_foo)"}

```

If the attribute @llvm.compiler.used is not suitable for this (I am not aware of all implication of using it on a global symbol), maybe we could come up with a intrinsics that does what we need (avoid deleting declarations that are not used) and name it @llvm.vector.function.used?

Please let me know what you think, I will submit an updated proposal next week.

Kind regards,

Francesco

> On Jun 17, 2019, at 7:05 AM, Doerfert, Johannes <[hidden email]> wrote:
>
> I agree with Simon. This looks good conceptually. I have minor implementation comments but that can wait till the code reviews.
>
> Sorry for the delay and thanks for working on this.
>
> Get Outlook for Android
>
> From: Simon Moll <[hidden email]>
> Sent: Monday, June 17, 2019 10:02:58 AM
> To: Francesco Petrogalli; LLVM Development List; Clang Dev
> Cc: Renato Golin; Finkel, Hal J.; Andrea Bocci; Elovikov, Andrei;
> Alexey Bataev; Doerfert, Johannes; Saito, Hideki; Tian, Xinmin; nd;
> Roman Lebedev; Philip Reames; Shawn Landden
> Subject: Re: RFC: Interface user provided vector functions with the vectorizer.
>  
> Hi Francesco,
>
> On 6/11/19 10:55 PM, Francesco Petrogalli wrote:
> > Dear all,
> >
> > I have re-written the proposal for interfacing user provided vector
> > functions, originally posted in both llvm-dev and cfe-dev mailing
> > list:
> >
> > "[RFC] Expose user provided vector function for auto-vectorization."
> >
> > The proposal looks quite different from the original submission,
> > therefore I took the liberty to start a new thread.
> >
> > The original thread generated some good discussion. In particular,
> > Simon Moll and Johannes Doerfert (CCed) have managed to provide good
> > arguments for the following claims:
> >
> > 1. The Vector Function ABI name mangling scheme of a target is not
> >     enough to describe all uses cases of function vectorization that
> >     the compiler might end up needing to support in the future.
> I think the new name of the attribute makes this point clear.
> > 2. `declare variant` needs to be handled properly at IR level, to be
> >     able to give the compiler the full OpenMP context of the directive.
> >
> > This proposal addresses those two concerns and other (I believe)
> > minor concerns that have been raised in the previous thread.
> >
> > This proposal is provided with examples and a self assessment around
> > extendibility.
> >
> > I have CCed all the people that have participated in the discussion
> > so far, please let me know if you think I have missed anything of
> > what have been raised.
> >
> > Kind regards,
> >
> > Francesco
>
> LGTM. Please add me as a reviewer for this when you post patches.
>
> Thanks!
>
> Simon
>
> >
> > *** DRAFT OF THE PROPOSAL ***
> >
> > # SCOPE OF THE RFC : Interface user provided vector functions with the vectorizer.
> >
> > Because the users care about portability (across compilers,
> > libraries and systems), I believe we have to base sour solution on a
> > standard that describes the mapping from the scalar function to the
> > vector function.
> >
> > Because OpenMP is standard and widely used, we should base our
> > solution on the mechanisms that the standard provides, via the
> > directives `declare simd` and `declare variant`, the latter when
> > used in with the `simd` trait in the `construct` set.
> >
> > Please notice that:
> >
> > 1. The scope of the proposal is not implementing full support for
> >     `pragma omp declare variant`.
> > 2. The scope of the proposal is not enabling the vectorizer to do new
> >     kind of vectorizations (e.g. RV-like vectorization described by
> >     Simon).
> > 3. The proposal aims to be extendible wrt 1. and 2.
> > 4. The IR attribute introduced in this proposal is equivalent to the
> >     one needed for the VecClone pass under development in
> >     https://reviews.llvm.org/D22792
> >
> > # CLANG COMPONENTS
> >
> > A C function attribute, `clang_declare_simd_variant`, to attach to
> > the scalar version. The attribute provides enough information to the
> > compiler about the vector shape of the user defined function. The
> > vector shapes handled by the attribute are those handled by the
> > OpenMP standard via `declare simd` (and no more than that).
> >
> > 1. The function attribute handling in clang is crafted with the
> >     requirement that it will be possible to re-use the same components
> >     for the info generated by `declare variant` when used with a `simd`
> >     traits in the `construct` set.
> > 2. The attribute allows orthogonality with the vectorization that is
> >     done via OpenMP: the user vector function is still exposed for
> >     vectorization when not using `-fopenmp-[simd]` once the `declare
> >     simd` and `declare variant` directive of OpenMP will be available
> >     in the front-end.
> >
> > ## C function attribute: `clang_declare_simd_variant`
> >
> > The definition of this attribute has been crafted to match the
> > semantics of `declare variant` for a `simd` construct described in
> > OpenMP 5.0. I have added only the traits of the `device` set, `isa`
> > and `arch`, which I believe are enough to cover for the use case of
> > this proposal. If that is not the case, please provide an example,
> > extending the attribute will be easy even once the current one is
> > implemented.
> >
> > ```
> > clang_declare_simd_variant(<variant-func-id>, <simd clauses>{,
> > <context selector clauses>})
> >
> > <variant-func-id>:= The name of a function variant that is a base language identifier, or,
> >                      for C++, a template-id.
> >
> > <simd clauses> := <simdlen>, <mask>{, <optional simd clauses>}
> >
> > <simdlen> := simdlen(<positive number>) | simdlen("scalable")
> >
> > <mask>    := inbranch | notinbranch
> >
> > <optional simd clauses> := <linear clause>
> >                           | <uniform clause>
> >                           | <align clause>  | {,<optional simd
> > clauses>}
> >
> > <linear clause>  := linear_ref(<var>,<step>)
> >                    | linear_var(<var>, <step>)
> >                    | linear_uval(<var>, <step>)
> >                    | linear(<var>, <step>)
> >
> > <step> := <var> | <non zero number>
> >
> > <uniform clause> := uniform(<var>)
> >
> > <align clause>   := align(<var>, <positive number>)
> >
> > <var> := Name of a parameter in the scalar function
> > declaration/definition
> >
> > <non zero number> := ... | -2 | -1 | 1 | 2 | ...
> >
> > <positive number> := 1 | 2 | 3 | ...
> >
> > <context selector clauses> := {<isa>}{,} {<arch>}
> >
> > <isa> := isa(target-specific-value)
> >
> > <arch> := arch(target-specific-value)
> >
> > ```
> >
> > # LLVM COMPONENTS:
> >
> > ## VectorFunctionShape class
> >
> > The object `VectorFunctionShape` contains the information about the
> > kind of vectorization available for an `llvm::Call`.
> >
> > The object `VectorFunctionShape` must contain the following information:
> >
> > 1. Vectorization Factor (or number or concurrent lanes executed by the
> >     SIMD version of the function). Encoded by unsigned integer.
> > 2. Whether the vector function is requested for scalable
> >     vectorization, encoded by a boolean.
> > 3. Information about masking / no masking, encoded by a boolean.
> > 4. Information about the parameters, encoded in a container that
> >     carries objects of type `ParamaterType`, to describe features like
> >     `linear` and `uniform`.
> > 5. Function name redirection, if a user has specified to use a custom
> >     name instead of the Vector Function ABI ones.
> >
> > Items 1. to 5. represents the information stored in the
> > `vector-function-abi-variant` attribute (see next section).
> >
> > The object can be extended in the future to include new
> > vectorization kinds (for example the RV-like vectorization of the
> > Region Vectorizer), or to add more context information that might
> > come from other uses of OpenMP `declare variant`, or to add new
> > Vector Function ABIs not based on OpenMP. Such information can be
> > retrieved by attributes that will be added to describe the `Call` instance.
> >
> > ## IR Attribute
> >
> > We define a `vector-function-abi-variant` attribute that lists the
> > mangled names produced via the mangling function of the Vector
> > Function ABI rules.
> >
> > ```
> > vector-function-abi-variant = "abi_mangled_name_01, abi_mangled_name_02(user_redirection),..."
> > ```
> >
> > 1. Because we use only OpenMP `declare simd` vectorization, and
> >     because we require a vector Function ABI, we make this explicit
> >     in the name of the attribute.
> > 2. Because the Vector Function ABIs encode all the information
> >     needed to know the vectorization shape of the vector function in
> >     the mangled names, we provide the mangled name via the
> >     attribute.
> > 3. Function names redirection is specified by enclosing the name of
> >     the redirection in parenthesis, as in
> >     `abi_mangled_name_02(user_redirection)`.
> >
> > ## Vector ABI Demangler
> >
> > The “Vector ABI demangler”, is the component that demangles the data
> > in the `vector-function-abi-variant` attribute and that provides the
> > instances of the class `VectorFunctionShape` that can be derived by
> > the mangled names listed in the attribute.
> >
> > ## Query interface: Search Vector Function System (SVFS)
> >
> > An interface that can be queried by the LLVM components to
> > understand whether or not a scalar function can be vectorized, and
> > that retrieves the vector function to be used if such vector shape is available.
> >
> > 1. This component is going to be unrelated to OpenMP.
> > 2. This component will use internally the demangler defined in the
> >     previous section, but it will not expose any aspect of the Vector
> >     Function ABI via its interface.
> >
> > The interface provides two methods.
> >
> > ```
> > std::vector<VectorFunctionShape>
> > SVFS::isFunctionVectorizable(llvm::CallInst * Call);
> >
> > llvm::Function * SVFS::getVectorizedFunction(llvm::CallInst * Call,
> > VectorFunctionShape Info); ```
> >
> > The first method is used to list all the vector shapes that
> > available and attached to a scalar function. An empty results means
> > that no vector versions are available.
> >
> > The second method retrieves the information needed to build a call
> > to a vector function with a specific `VectorFunctionShape` info.
> >
> > # (SELF) ASSESSMENT ON EXTENDIBILITY
> >
> >
> > 1. Extending the C function attribute `clang_declare_simd_variant` to
> >     new Vector Function ABIs that use OpenMP will be straightforward
> >     because the attribute is tight to such ABIs and OpenMP.
> > 2. The C attribute `clang_declare_simd_variant` and the `declare
> >     variant` directive used for the `simd` trait will be sharing the
> >     internals in clang, so adding the OpenMP functionality for `simd`
> >     traits will be mostly handling the directive in the OpenMP
> >     parser. How this should be done is described in
> >    
> > https://clang.llvm.org/docs/InternalsManual.html#how-to-add-an-attri
> > bute 3. The IR attribute `vector-function-abi-variant` is not to be
> >     extended to represent other kind of vectorization other than those
> >     handled by `declare simd` and that are handled with a Vector
> >     Function ABI.
> > 4. The IR attribute `vector-function-abi-variant` is not defined to be
> >     extended to represent the information of `declare variant` in its
> >     totality.
> > 5. The IR attribute will not need to change when we will introduce non
> >     vector function ABI vectorization (RV-like, reductions...) or when
> >     we will decide to fully support `declare variant`. The information
> >     it carries will not need to be invalidated, but just extended with
> >     new attributes that will need to be handled by the
> >     `VectorFunctionShape` class, in a similar way the
> >     `llvm::FPMathOperator` does with the `llvm::FastMathFlags`, which
> >     operates on individual attributes to describe an overall
> >     functionality.
> >
> > # Examples
> >
> > ## Example 1
> >
> > Exposing an Advanced SIMD vector function when targeting Advanced
> > SIMD in AArch64.
> >
> > ```
> > double foo_01(double Input)
> > __attribute__(clang_declare_simd_variant(“vector_foo_01",
> > simdlen(2), notinbranch, isa("simd"));
> >
> > // Advanced SIMD version
> > float64x2_t vector_foo_01(float64x2_t VectorInput); ```
> >
> > The resulting IR attribute is:
> >
> > ```
> > attribute #0 = {vector-abi-variant="_ZGVnN2v_foo_01(vector_foo_01)"}
> > ```
> >
> > ## Example 2
> >
> > Exposing an Advanced SIMD vector function when targeting Advanced
> > SIMD in AArch64, but with the wrong signature. The user specifies a
> > masked version of the function in the clauses of the attribute, the
> > compiler throws an error suggesting the signature expected for
> > ``vector_foo_02.``
> >
> > ```
> > double foo_02(double Input)
> > __attribute__(clang_declare_simd_variant(“vector_foo_02",
> > simdlen(2), inbranch, isa("simd"));
> >
> > // Advanced SIMD version
> > float64x2_t vector_foo_02(float64x2_t VectorInput);
> > // (suggested) compiler error ->                      ^ Missing mask parameter of type `uint64x2_t`.
> > ```
> >
> > ## Example 3
> >
> > Targeting `sincos`-like signatures.
> >
> > ```
> > void foo_03(double Input, double * Output)
> > __attribute__(clang_declare_simd_variant(“vector_foo_03",
> > simdlen(2), notinbranch, linear(Output, 1), isa("simd"));
> >
> > // Advanced SIMD version
> > void vector_foo_03(float64x2_t VectorInput, double * Output); ```
> >
> > The resulting IR attribute is:
> >
> > ```
> > attribute #0 =
> > {vector-abi-variant="_ZGVnN2vl8_foo_03(vector_foo_03)"}
> > ```
> > ## Example 4
> >
> > Scalable vectorization targeting SVE
> >
> > ```
> > double foo_04(double Input)
> > __attribute__(clang_declare_simd_variant(“vector_foo_04",
> > simdlen("scalable"), notinbranch, isa("sve"));
> >
> > // SVE version
> > svfloat64_t vector_foo_04(svfloat64_t VectorInput, svbool_t Mask);
> > ```
> >
> > The resulting IR attribute is:
> >
> > ```
> > attribute #0 = {vector-abi-variant="_ZGVsM2v_foo_04(vector_foo_04)"}
> > ```
> >
> > ## Example 5
> >
> > Fixed length vectorization targeting SVE
> >
> > ```
> > double foo_05(double Input)
> > __attribute__(clang_declare_simd_variant(“vector_foo_05",
> > simdlen(4), inbranch, isa("sve"));
> >
> > // Fixed-length SVE version
> > svfloat64_t vector_foo_05(svfloat64_t VectorInput, svbool_t Mask);
> > ```
> >
> > The resulting IR attribute is:
> >
> > ```
> > attribute #0 = {vector-abi-variant="_ZGVsM2v_foo_04(vector_foo_04)"}
> > ```
> >
> > ## Example 06
> >
> > This is an x86 example, equivalent to the one provided by Andrei
> > Elovikow in
> > http://lists.llvm.org/pipermail/llvm-dev/2019-June/132885.html.
> > Godbolt rendering with ICC at https://godbolt.org/z/Of1NxZ
> >
> > ```
> > float MyAdd(float* a, int b)
> > __attribute__(clang_declare_simd_variant(“MyAddVec", simdlen(8), notinbranch, arch("core_2nd_gen_avx")) {
> >    return *a + b;
> > }
> >
> >
> > __m256 MyAddVec(float* v_a, __m128i v_b1, __m128i v_b2); ```
> >
> > The resulting IR attribute is:
> >
> > ```
> > attribute #0 = {vector-abi-variant="_ZGVbN8l4v_MyAdd(MyAddVec)"}
> > ```
> >
> > ## Example showing interaction with `declare simd`
> >
> > ```
> > #pragma omp declare simd linear(a) notinbranch float foo_06(float
> > *a, int x) __attribute__(clang_declare_simd_variant(“vector_foo_06", simdlen(4), linear(a), notinbranch, arch("armv8.2-a+simd")) {
> >      return *a + x;
> > }
> >
> > // Advanced SIMD version
> > float32x4_t vector_foo_06(float *a, int32x4_t vx) { // Custom
> > implementation.
> > }
> > ```
> >
> > The resulting IR attribute is made of three symbols:
> >
> > 1. `_ZGVnN2l4v_foo_06` and `_ZGVnN4l4v_foo_06`, which represent the
> >     ones the compiler builds by auto-vectorizing `foo_06` according to
> >     the rule defined in the Vector Function ABI specifications for
> >     AArch64.
> > 2. `_ZGVnN4l4v_foo_06(vector_foo_06)`, which represents the
> >     user-defined redirection of the 4-lane version of `foo_06` to the
> >     custom implementation provided by the user when targeting Advanced
> >     SIMD for version 8.2 of the A64 instruction set.
> >
> > ```
> > attribute #0 =
> > {vector-function-abi-variant="_ZGVnN2l4v_foo_06,_ZGVnN4l4v_foo_06,_Z
> > GVnN4l4v_foo_06(vector_foo_06)"}
> > ```
> >
> --
>
> Simon Moll
> Researcher / PhD Student
>
> Compiler Design Lab (Prof. Hack)
> Saarland University, Computer Science
> Building E1.3, Room 4.31
>
> Tel. +49 (0)681 302-57521 : [hidden email] Fax. +49 (0)681
> 302-3065  : http://compilers.cs.uni-saarland.de/people/moll

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Interface user provided vector functions with the vectorizer.

Joan Lluch via cfe-dev
>>>>I don’t know if this is going to be a problem for other architectures

++++++I haven't checked what IA-32/Intel64 should do for type 2, but I fully agree that this needs to be done properly according to the ABI.

Agreed. It looks we have an issue here.  Given this is parameter, could we use  metadata or attribute to preserve the "struct" info, in ICC, we called BE type saved info in the symtab.

Xinmin

-----Original Message-----
From: Saito, Hideki
Sent: Friday, June 21, 2019 4:44 PM
To: Francesco Petrogalli <[hidden email]>; Doerfert, Johannes <[hidden email]>
Cc: Simon Moll <[hidden email]>; LLVM Development List <[hidden email]>; Clang Dev <[hidden email]>; Renato Golin <[hidden email]>; Finkel, Hal J. <[hidden email]>; Andrea Bocci <[hidden email]>; Elovikov, Andrei <[hidden email]>; Alexey Bataev <[hidden email]>; Tian, Xinmin <[hidden email]>; nd <[hidden email]>; Roman Lebedev <[hidden email]>; Philip Reames <[hidden email]>; Shawn Landden <[hidden email]>
Subject: RE: RFC: Interface user provided vector functions with the vectorizer.


>In all cases, the IR type of the parameters in `foo` is i64, therefore is not possible to distinguish what C type generated the signature of `foo`.

Ouch.

>I don’t know if this is going to be a problem for other architectures

I haven't checked what IA-32/Intel64 should do for type 2, but I fully agree that this needs to be done properly according to the ABI.

>Therefore, I would like to propose a change to the RFC, which would move the responsibility off generating the vector function signature from LLVM to clang.

Makes sense to me.

-----Original Message-----
From: Francesco Petrogalli [mailto:[hidden email]]
Sent: Friday, June 21, 2019 2:04 PM
To: Doerfert, Johannes <[hidden email]>
Cc: Simon Moll <[hidden email]>; LLVM Development List <[hidden email]>; Clang Dev <[hidden email]>; Renato Golin <[hidden email]>; Finkel, Hal J. <[hidden email]>; Andrea Bocci <[hidden email]>; Elovikov, Andrei <[hidden email]>; Alexey Bataev <[hidden email]>; Saito, Hideki <[hidden email]>; Tian, Xinmin <[hidden email]>; nd <[hidden email]>; Roman Lebedev <[hidden email]>; Philip Reames <[hidden email]>; Shawn Landden <[hidden email]>
Subject: Re: RFC: Interface user provided vector functions with the vectorizer.

Hi all - I am working with a colleague to provide an initial implementation of this.

We encountered a problem when dealing with generating the vector signatures of functions that use complex data.

In this proposal, we expect the SVFS component in the backed to demangle the name of the function in the attribute to be able to reconstruct the signature of the vector function from the scalar function signature.

In case of Complex data, this doesn’t seem to be possible, because the information of “being a vector of 2 lanes” that is supposed to be carried by the complex scalar is lost in the transformation the data type in a “coerced” type.

Consider these three types and the function `foo`:

// Type 1
typedef _Complex int S;

// Type 2
typedef struct x{
 int a;
 int b;
} S;

// Type 3
typedef uint64_t S;

S foo(S a, S b) {
 return ...;
}

In all cases, the IR type of the parameters in `foo` is i64, therefore is not possible to distinguish what C type generated the signature of `foo`.

I don’t know if this is going to be a problem for other architectures, but this is definitely a problem on AArch64 where we need to be able to generate the correct vector function signature for a specific simdlen(N) attached on `foo`. When simdlen(2), for type 1 the vector type is <4 x i32>, for type 2 is <2 x i64*>, for type 3 is <2 x i64>.

Therefore, I would like to propose a change to the RFC, which would move the responsibility off generating the vector function signature from LLVM to clang.

In particular, (and this I believe has already been mentioned by Johannes), we could use the @llvm.compiler.used intrinsic to mark those declaration that needs to stay in the IR and not optimized away OPT before reaching the vectorizer.

In summary, the change would consist of:

1. Generate symbols declaration/definitions of the vector function with the mangled name in the IR, and mark it with @llvm-compiler.used. This could be done in CGOpenMPRuntime.cpp 2. Use the attribute vector-abs-variant defined in this RFC to map scalar names to vector ABI mangled name, and used the same redirection mechanism for the user provided vector name.
3. Move the “vector function signature generation” from the SVFS in LLVM to the openmp code generator of the clang frontend

The SVFS query system would still work as in the current proposal. The only difference is that the vector function signature would be given by the frontend and not need to be recomputed.

Here is an example of ho the IR would look like with this change:

```
@llvm.compiler.used = appending global [1 x i8*] [i8* bitcast (<2 x i32> (<2 x i32>)* @f to i8*)], section "llvm.metadata"

declare dso_local <2 x i32> @_ZGVnN2v_foo(<2 x i32> returned)

declare i32 @foo(i32) #0

; other function definition, including the one provided by the user `my_vector_foo` if the user provided a definition and not just the declaration

attribute #0 = {vector-function-abi-variant=“_ZGVnN2v_foo(my_vector_foo)"}

```

If the attribute @llvm.compiler.used is not suitable for this (I am not aware of all implication of using it on a global symbol), maybe we could come up with a intrinsics that does what we need (avoid deleting declarations that are not used) and name it @llvm.vector.function.used?

Please let me know what you think, I will submit an updated proposal next week.

Kind regards,

Francesco

> On Jun 17, 2019, at 7:05 AM, Doerfert, Johannes <[hidden email]> wrote:
>
> I agree with Simon. This looks good conceptually. I have minor implementation comments but that can wait till the code reviews.
>
> Sorry for the delay and thanks for working on this.
>
> Get Outlook for Android
>
> From: Simon Moll <[hidden email]>
> Sent: Monday, June 17, 2019 10:02:58 AM
> To: Francesco Petrogalli; LLVM Development List; Clang Dev
> Cc: Renato Golin; Finkel, Hal J.; Andrea Bocci; Elovikov, Andrei;
> Alexey Bataev; Doerfert, Johannes; Saito, Hideki; Tian, Xinmin; nd;
> Roman Lebedev; Philip Reames; Shawn Landden
> Subject: Re: RFC: Interface user provided vector functions with the vectorizer.
>  
> Hi Francesco,
>
> On 6/11/19 10:55 PM, Francesco Petrogalli wrote:
> > Dear all,
> >
> > I have re-written the proposal for interfacing user provided vector
> > functions, originally posted in both llvm-dev and cfe-dev mailing
> > list:
> >
> > "[RFC] Expose user provided vector function for auto-vectorization."
> >
> > The proposal looks quite different from the original submission,
> > therefore I took the liberty to start a new thread.
> >
> > The original thread generated some good discussion. In particular,
> > Simon Moll and Johannes Doerfert (CCed) have managed to provide good
> > arguments for the following claims:
> >
> > 1. The Vector Function ABI name mangling scheme of a target is not
> >     enough to describe all uses cases of function vectorization that
> >     the compiler might end up needing to support in the future.
> I think the new name of the attribute makes this point clear.
> > 2. `declare variant` needs to be handled properly at IR level, to be
> >     able to give the compiler the full OpenMP context of the directive.
> >
> > This proposal addresses those two concerns and other (I believe)
> > minor concerns that have been raised in the previous thread.
> >
> > This proposal is provided with examples and a self assessment around
> > extendibility.
> >
> > I have CCed all the people that have participated in the discussion
> > so far, please let me know if you think I have missed anything of
> > what have been raised.
> >
> > Kind regards,
> >
> > Francesco
>
> LGTM. Please add me as a reviewer for this when you post patches.
>
> Thanks!
>
> Simon
>
> >
> > *** DRAFT OF THE PROPOSAL ***
> >
> > # SCOPE OF THE RFC : Interface user provided vector functions with the vectorizer.
> >
> > Because the users care about portability (across compilers,
> > libraries and systems), I believe we have to base sour solution on a
> > standard that describes the mapping from the scalar function to the
> > vector function.
> >
> > Because OpenMP is standard and widely used, we should base our
> > solution on the mechanisms that the standard provides, via the
> > directives `declare simd` and `declare variant`, the latter when
> > used in with the `simd` trait in the `construct` set.
> >
> > Please notice that:
> >
> > 1. The scope of the proposal is not implementing full support for
> >     `pragma omp declare variant`.
> > 2. The scope of the proposal is not enabling the vectorizer to do new
> >     kind of vectorizations (e.g. RV-like vectorization described by
> >     Simon).
> > 3. The proposal aims to be extendible wrt 1. and 2.
> > 4. The IR attribute introduced in this proposal is equivalent to the
> >     one needed for the VecClone pass under development in
> >     https://reviews.llvm.org/D22792
> >
> > # CLANG COMPONENTS
> >
> > A C function attribute, `clang_declare_simd_variant`, to attach to
> > the scalar version. The attribute provides enough information to the
> > compiler about the vector shape of the user defined function. The
> > vector shapes handled by the attribute are those handled by the
> > OpenMP standard via `declare simd` (and no more than that).
> >
> > 1. The function attribute handling in clang is crafted with the
> >     requirement that it will be possible to re-use the same components
> >     for the info generated by `declare variant` when used with a `simd`
> >     traits in the `construct` set.
> > 2. The attribute allows orthogonality with the vectorization that is
> >     done via OpenMP: the user vector function is still exposed for
> >     vectorization when not using `-fopenmp-[simd]` once the `declare
> >     simd` and `declare variant` directive of OpenMP will be available
> >     in the front-end.
> >
> > ## C function attribute: `clang_declare_simd_variant`
> >
> > The definition of this attribute has been crafted to match the
> > semantics of `declare variant` for a `simd` construct described in
> > OpenMP 5.0. I have added only the traits of the `device` set, `isa`
> > and `arch`, which I believe are enough to cover for the use case of
> > this proposal. If that is not the case, please provide an example,
> > extending the attribute will be easy even once the current one is
> > implemented.
> >
> > ```
> > clang_declare_simd_variant(<variant-func-id>, <simd clauses>{,
> > <context selector clauses>})
> >
> > <variant-func-id>:= The name of a function variant that is a base language identifier, or,
> >                      for C++, a template-id.
> >
> > <simd clauses> := <simdlen>, <mask>{, <optional simd clauses>}
> >
> > <simdlen> := simdlen(<positive number>) | simdlen("scalable")
> >
> > <mask>    := inbranch | notinbranch
> >
> > <optional simd clauses> := <linear clause>
> >                           | <uniform clause>
> >                           | <align clause>  | {,<optional simd
> > clauses>}
> >
> > <linear clause>  := linear_ref(<var>,<step>)
> >                    | linear_var(<var>, <step>)
> >                    | linear_uval(<var>, <step>)
> >                    | linear(<var>, <step>)
> >
> > <step> := <var> | <non zero number>
> >
> > <uniform clause> := uniform(<var>)
> >
> > <align clause>   := align(<var>, <positive number>)
> >
> > <var> := Name of a parameter in the scalar function
> > declaration/definition
> >
> > <non zero number> := ... | -2 | -1 | 1 | 2 | ...
> >
> > <positive number> := 1 | 2 | 3 | ...
> >
> > <context selector clauses> := {<isa>}{,} {<arch>}
> >
> > <isa> := isa(target-specific-value)
> >
> > <arch> := arch(target-specific-value)
> >
> > ```
> >
> > # LLVM COMPONENTS:
> >
> > ## VectorFunctionShape class
> >
> > The object `VectorFunctionShape` contains the information about the
> > kind of vectorization available for an `llvm::Call`.
> >
> > The object `VectorFunctionShape` must contain the following information:
> >
> > 1. Vectorization Factor (or number or concurrent lanes executed by the
> >     SIMD version of the function). Encoded by unsigned integer.
> > 2. Whether the vector function is requested for scalable
> >     vectorization, encoded by a boolean.
> > 3. Information about masking / no masking, encoded by a boolean.
> > 4. Information about the parameters, encoded in a container that
> >     carries objects of type `ParamaterType`, to describe features like
> >     `linear` and `uniform`.
> > 5. Function name redirection, if a user has specified to use a custom
> >     name instead of the Vector Function ABI ones.
> >
> > Items 1. to 5. represents the information stored in the
> > `vector-function-abi-variant` attribute (see next section).
> >
> > The object can be extended in the future to include new
> > vectorization kinds (for example the RV-like vectorization of the
> > Region Vectorizer), or to add more context information that might
> > come from other uses of OpenMP `declare variant`, or to add new
> > Vector Function ABIs not based on OpenMP. Such information can be
> > retrieved by attributes that will be added to describe the `Call` instance.
> >
> > ## IR Attribute
> >
> > We define a `vector-function-abi-variant` attribute that lists the
> > mangled names produced via the mangling function of the Vector
> > Function ABI rules.
> >
> > ```
> > vector-function-abi-variant = "abi_mangled_name_01, abi_mangled_name_02(user_redirection),..."
> > ```
> >
> > 1. Because we use only OpenMP `declare simd` vectorization, and
> >     because we require a vector Function ABI, we make this explicit
> >     in the name of the attribute.
> > 2. Because the Vector Function ABIs encode all the information
> >     needed to know the vectorization shape of the vector function in
> >     the mangled names, we provide the mangled name via the
> >     attribute.
> > 3. Function names redirection is specified by enclosing the name of
> >     the redirection in parenthesis, as in
> >     `abi_mangled_name_02(user_redirection)`.
> >
> > ## Vector ABI Demangler
> >
> > The “Vector ABI demangler”, is the component that demangles the data
> > in the `vector-function-abi-variant` attribute and that provides the
> > instances of the class `VectorFunctionShape` that can be derived by
> > the mangled names listed in the attribute.
> >
> > ## Query interface: Search Vector Function System (SVFS)
> >
> > An interface that can be queried by the LLVM components to
> > understand whether or not a scalar function can be vectorized, and
> > that retrieves the vector function to be used if such vector shape is available.
> >
> > 1. This component is going to be unrelated to OpenMP.
> > 2. This component will use internally the demangler defined in the
> >     previous section, but it will not expose any aspect of the Vector
> >     Function ABI via its interface.
> >
> > The interface provides two methods.
> >
> > ```
> > std::vector<VectorFunctionShape>
> > SVFS::isFunctionVectorizable(llvm::CallInst * Call);
> >
> > llvm::Function * SVFS::getVectorizedFunction(llvm::CallInst * Call,
> > VectorFunctionShape Info); ```
> >
> > The first method is used to list all the vector shapes that
> > available and attached to a scalar function. An empty results means
> > that no vector versions are available.
> >
> > The second method retrieves the information needed to build a call
> > to a vector function with a specific `VectorFunctionShape` info.
> >
> > # (SELF) ASSESSMENT ON EXTENDIBILITY
> >
> >
> > 1. Extending the C function attribute `clang_declare_simd_variant` to
> >     new Vector Function ABIs that use OpenMP will be straightforward
> >     because the attribute is tight to such ABIs and OpenMP.
> > 2. The C attribute `clang_declare_simd_variant` and the `declare
> >     variant` directive used for the `simd` trait will be sharing the
> >     internals in clang, so adding the OpenMP functionality for `simd`
> >     traits will be mostly handling the directive in the OpenMP
> >     parser. How this should be done is described in
> >    
> > https://clang.llvm.org/docs/InternalsManual.html#how-to-add-an-attri
> > bute 3. The IR attribute `vector-function-abi-variant` is not to be
> >     extended to represent other kind of vectorization other than those
> >     handled by `declare simd` and that are handled with a Vector
> >     Function ABI.
> > 4. The IR attribute `vector-function-abi-variant` is not defined to be
> >     extended to represent the information of `declare variant` in its
> >     totality.
> > 5. The IR attribute will not need to change when we will introduce non
> >     vector function ABI vectorization (RV-like, reductions...) or when
> >     we will decide to fully support `declare variant`. The information
> >     it carries will not need to be invalidated, but just extended with
> >     new attributes that will need to be handled by the
> >     `VectorFunctionShape` class, in a similar way the
> >     `llvm::FPMathOperator` does with the `llvm::FastMathFlags`, which
> >     operates on individual attributes to describe an overall
> >     functionality.
> >
> > # Examples
> >
> > ## Example 1
> >
> > Exposing an Advanced SIMD vector function when targeting Advanced
> > SIMD in AArch64.
> >
> > ```
> > double foo_01(double Input)
> > __attribute__(clang_declare_simd_variant(“vector_foo_01",
> > simdlen(2), notinbranch, isa("simd"));
> >
> > // Advanced SIMD version
> > float64x2_t vector_foo_01(float64x2_t VectorInput); ```
> >
> > The resulting IR attribute is:
> >
> > ```
> > attribute #0 = {vector-abi-variant="_ZGVnN2v_foo_01(vector_foo_01)"}
> > ```
> >
> > ## Example 2
> >
> > Exposing an Advanced SIMD vector function when targeting Advanced
> > SIMD in AArch64, but with the wrong signature. The user specifies a
> > masked version of the function in the clauses of the attribute, the
> > compiler throws an error suggesting the signature expected for
> > ``vector_foo_02.``
> >
> > ```
> > double foo_02(double Input)
> > __attribute__(clang_declare_simd_variant(“vector_foo_02",
> > simdlen(2), inbranch, isa("simd"));
> >
> > // Advanced SIMD version
> > float64x2_t vector_foo_02(float64x2_t VectorInput);
> > // (suggested) compiler error ->                      ^ Missing mask parameter of type `uint64x2_t`.
> > ```
> >
> > ## Example 3
> >
> > Targeting `sincos`-like signatures.
> >
> > ```
> > void foo_03(double Input, double * Output)
> > __attribute__(clang_declare_simd_variant(“vector_foo_03",
> > simdlen(2), notinbranch, linear(Output, 1), isa("simd"));
> >
> > // Advanced SIMD version
> > void vector_foo_03(float64x2_t VectorInput, double * Output); ```
> >
> > The resulting IR attribute is:
> >
> > ```
> > attribute #0 =
> > {vector-abi-variant="_ZGVnN2vl8_foo_03(vector_foo_03)"}
> > ```
> > ## Example 4
> >
> > Scalable vectorization targeting SVE
> >
> > ```
> > double foo_04(double Input)
> > __attribute__(clang_declare_simd_variant(“vector_foo_04",
> > simdlen("scalable"), notinbranch, isa("sve"));
> >
> > // SVE version
> > svfloat64_t vector_foo_04(svfloat64_t VectorInput, svbool_t Mask);
> > ```
> >
> > The resulting IR attribute is:
> >
> > ```
> > attribute #0 = {vector-abi-variant="_ZGVsM2v_foo_04(vector_foo_04)"}
> > ```
> >
> > ## Example 5
> >
> > Fixed length vectorization targeting SVE
> >
> > ```
> > double foo_05(double Input)
> > __attribute__(clang_declare_simd_variant(“vector_foo_05",
> > simdlen(4), inbranch, isa("sve"));
> >
> > // Fixed-length SVE version
> > svfloat64_t vector_foo_05(svfloat64_t VectorInput, svbool_t Mask);
> > ```
> >
> > The resulting IR attribute is:
> >
> > ```
> > attribute #0 = {vector-abi-variant="_ZGVsM2v_foo_04(vector_foo_04)"}
> > ```
> >
> > ## Example 06
> >
> > This is an x86 example, equivalent to the one provided by Andrei
> > Elovikow in
> > http://lists.llvm.org/pipermail/llvm-dev/2019-June/132885.html.
> > Godbolt rendering with ICC at https://godbolt.org/z/Of1NxZ
> >
> > ```
> > float MyAdd(float* a, int b)
> > __attribute__(clang_declare_simd_variant(“MyAddVec", simdlen(8), notinbranch, arch("core_2nd_gen_avx")) {
> >    return *a + b;
> > }
> >
> >
> > __m256 MyAddVec(float* v_a, __m128i v_b1, __m128i v_b2); ```
> >
> > The resulting IR attribute is:
> >
> > ```
> > attribute #0 = {vector-abi-variant="_ZGVbN8l4v_MyAdd(MyAddVec)"}
> > ```
> >
> > ## Example showing interaction with `declare simd`
> >
> > ```
> > #pragma omp declare simd linear(a) notinbranch float foo_06(float
> > *a, int x) __attribute__(clang_declare_simd_variant(“vector_foo_06", simdlen(4), linear(a), notinbranch, arch("armv8.2-a+simd")) {
> >      return *a + x;
> > }
> >
> > // Advanced SIMD version
> > float32x4_t vector_foo_06(float *a, int32x4_t vx) { // Custom
> > implementation.
> > }
> > ```
> >
> > The resulting IR attribute is made of three symbols:
> >
> > 1. `_ZGVnN2l4v_foo_06` and `_ZGVnN4l4v_foo_06`, which represent the
> >     ones the compiler builds by auto-vectorizing `foo_06` according to
> >     the rule defined in the Vector Function ABI specifications for
> >     AArch64.
> > 2. `_ZGVnN4l4v_foo_06(vector_foo_06)`, which represents the
> >     user-defined redirection of the 4-lane version of `foo_06` to the
> >     custom implementation provided by the user when targeting Advanced
> >     SIMD for version 8.2 of the A64 instruction set.
> >
> > ```
> > attribute #0 =
> > {vector-function-abi-variant="_ZGVnN2l4v_foo_06,_ZGVnN4l4v_foo_06,_Z
> > GVnN4l4v_foo_06(vector_foo_06)"}
> > ```
> >
> --
>
> Simon Moll
> Researcher / PhD Student
>
> Compiler Design Lab (Prof. Hack)
> Saarland University, Computer Science
> Building E1.3, Room 4.31
>
> Tel. +49 (0)681 302-57521 : [hidden email] Fax. +49 (0)681
> 302-3065  : http://compilers.cs.uni-saarland.de/people/moll

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Interface user provided vector functions with the vectorizer.

Joan Lluch via cfe-dev
@Xinmin, Saito: If Clang/the frontend generates the version there is no problem, or is there? The frontend knows about the original source type and it's ABI specific lowering already.

@Francesco, we should even consider putting the generating capabilities outside of the OpenMP code generation (in the future). That could allow easier reuse by other frontends.


From: Tian, Xinmin <[hidden email]>
Sent: Monday, June 24, 2019 5:28:45 PM
To: Saito, Hideki; Francesco Petrogalli; Doerfert, Johannes
Cc: Simon Moll; LLVM Development List; Clang Dev; Renato Golin; Finkel, Hal J.; Andrea Bocci; Elovikov, Andrei; Alexey Bataev; nd; Roman Lebedev; Philip Reames; Shawn Landden
Subject: RE: RFC: Interface user provided vector functions with the vectorizer.
 
>>>>I don’t know if this is going to be a problem for other architectures

++++++I haven't checked what IA-32/Intel64 should do for type 2, but I fully agree that this needs to be done properly according to the ABI.

Agreed. It looks we have an issue here.  Given this is parameter, could we use  metadata or attribute to preserve the "struct" info, in ICC, we called BE type saved info in the symtab.

Xinmin

-----Original Message-----
From: Saito, Hideki
Sent: Friday, June 21, 2019 4:44 PM
To: Francesco Petrogalli <[hidden email]>; Doerfert, Johannes <[hidden email]>
Cc: Simon Moll <[hidden email]>; LLVM Development List <[hidden email]>; Clang Dev <[hidden email]>; Renato Golin <[hidden email]>; Finkel, Hal J. <[hidden email]>; Andrea Bocci <[hidden email]>; Elovikov, Andrei <[hidden email]>; Alexey Bataev <[hidden email]>; Tian, Xinmin <[hidden email]>; nd <[hidden email]>; Roman Lebedev <[hidden email]>; Philip Reames <[hidden email]>; Shawn Landden <[hidden email]>
Subject: RE: RFC: Interface user provided vector functions with the vectorizer.


>In all cases, the IR type of the parameters in `foo` is i64, therefore is not possible to distinguish what C type generated the signature of `foo`.

Ouch.

>I don’t know if this is going to be a problem for other architectures

I haven't checked what IA-32/Intel64 should do for type 2, but I fully agree that this needs to be done properly according to the ABI.

>Therefore, I would like to propose a change to the RFC, which would move the responsibility off generating the vector function signature from LLVM to clang.

Makes sense to me.

-----Original Message-----
From: Francesco Petrogalli [[hidden email]]
Sent: Friday, June 21, 2019 2:04 PM
To: Doerfert, Johannes <[hidden email]>
Cc: Simon Moll <[hidden email]>; LLVM Development List <[hidden email]>; Clang Dev <[hidden email]>; Renato Golin <[hidden email]>; Finkel, Hal J. <[hidden email]>; Andrea Bocci <[hidden email]>; Elovikov, Andrei <[hidden email]>; Alexey Bataev <[hidden email]>; Saito, Hideki <[hidden email]>; Tian, Xinmin <[hidden email]>; nd <[hidden email]>; Roman Lebedev <[hidden email]>; Philip Reames <[hidden email]>; Shawn Landden <[hidden email]>
Subject: Re: RFC: Interface user provided vector functions with the vectorizer.

Hi all - I am working with a colleague to provide an initial implementation of this.

We encountered a problem when dealing with generating the vector signatures of functions that use complex data.

In this proposal, we expect the SVFS component in the backed to demangle the name of the function in the attribute to be able to reconstruct the signature of the vector function from the scalar function signature.

In case of Complex data, this doesn’t seem to be possible, because the information of “being a vector of 2 lanes” that is supposed to be carried by the complex scalar is lost in the transformation the data type in a “coerced” type.

Consider these three types and the function `foo`:

// Type 1
typedef _Complex int S;

// Type 2
typedef struct x{
 int a;
 int b;
} S;

// Type 3
typedef uint64_t S;

S foo(S a, S b) {
 return ...;
}

In all cases, the IR type of the parameters in `foo` is i64, therefore is not possible to distinguish what C type generated the signature of `foo`.

I don’t know if this is going to be a problem for other architectures, but this is definitely a problem on AArch64 where we need to be able to generate the correct vector function signature for a specific simdlen(N) attached on `foo`. When simdlen(2), for type 1 the vector type is <4 x i32>, for type 2 is <2 x i64*>, for type 3 is <2 x i64>.

Therefore, I would like to propose a change to the RFC, which would move the responsibility off generating the vector function signature from LLVM to clang.

In particular, (and this I believe has already been mentioned by Johannes), we could use the @llvm.compiler.used intrinsic to mark those declaration that needs to stay in the IR and not optimized away OPT before reaching the vectorizer.

In summary, the change would consist of:

1. Generate symbols declaration/definitions of the vector function with the mangled name in the IR, and mark it with @llvm-compiler.used. This could be done in CGOpenMPRuntime.cpp 2. Use the attribute vector-abs-variant defined in this RFC to map scalar names to vector ABI mangled name, and used the same redirection mechanism for the user provided vector name.
3. Move the “vector function signature generation” from the SVFS in LLVM to the openmp code generator of the clang frontend

The SVFS query system would still work as in the current proposal. The only difference is that the vector function signature would be given by the frontend and not need to be recomputed.

Here is an example of ho the IR would look like with this change:

```
@llvm.compiler.used = appending global [1 x i8*] [i8* bitcast (<2 x i32> (<2 x i32>)* @f to i8*)], section "llvm.metadata"

declare dso_local <2 x i32> @_ZGVnN2v_foo(<2 x i32> returned)

declare i32 @foo(i32) #0

; other function definition, including the one provided by the user `my_vector_foo` if the user provided a definition and not just the declaration

attribute #0 = {vector-function-abi-variant=“_ZGVnN2v_foo(my_vector_foo)"}

```

If the attribute @llvm.compiler.used is not suitable for this (I am not aware of all implication of using it on a global symbol), maybe we could come up with a intrinsics that does what we need (avoid deleting declarations that are not used) and name it @llvm.vector.function.used?

Please let me know what you think, I will submit an updated proposal next week.

Kind regards,

Francesco

> On Jun 17, 2019, at 7:05 AM, Doerfert, Johannes <[hidden email]> wrote:
>
> I agree with Simon. This looks good conceptually. I have minor implementation comments but that can wait till the code reviews.
>
> Sorry for the delay and thanks for working on this.
>
> Get Outlook for Android
>
> From: Simon Moll <[hidden email]>
> Sent: Monday, June 17, 2019 10:02:58 AM
> To: Francesco Petrogalli; LLVM Development List; Clang Dev
> Cc: Renato Golin; Finkel, Hal J.; Andrea Bocci; Elovikov, Andrei;
> Alexey Bataev; Doerfert, Johannes; Saito, Hideki; Tian, Xinmin; nd;
> Roman Lebedev; Philip Reames; Shawn Landden
> Subject: Re: RFC: Interface user provided vector functions with the vectorizer.

> Hi Francesco,
>
> On 6/11/19 10:55 PM, Francesco Petrogalli wrote:
> > Dear all,
> >
> > I have re-written the proposal for interfacing user provided vector
> > functions, originally posted in both llvm-dev and cfe-dev mailing
> > list:
> >
> > "[RFC] Expose user provided vector function for auto-vectorization."
> >
> > The proposal looks quite different from the original submission,
> > therefore I took the liberty to start a new thread.
> >
> > The original thread generated some good discussion. In particular,
> > Simon Moll and Johannes Doerfert (CCed) have managed to provide good
> > arguments for the following claims:
> >
> > 1. The Vector Function ABI name mangling scheme of a target is not
> >     enough to describe all uses cases of function vectorization that
> >     the compiler might end up needing to support in the future.
> I think the new name of the attribute makes this point clear.
> > 2. `declare variant` needs to be handled properly at IR level, to be
> >     able to give the compiler the full OpenMP context of the directive.
> >
> > This proposal addresses those two concerns and other (I believe)
> > minor concerns that have been raised in the previous thread.
> >
> > This proposal is provided with examples and a self assessment around
> > extendibility.
> >
> > I have CCed all the people that have participated in the discussion
> > so far, please let me know if you think I have missed anything of
> > what have been raised.
> >
> > Kind regards,
> >
> > Francesco
>
> LGTM. Please add me as a reviewer for this when you post patches.
>
> Thanks!
>
> Simon
>
> >
> > *** DRAFT OF THE PROPOSAL ***
> >
> > # SCOPE OF THE RFC : Interface user provided vector functions with the vectorizer.
> >
> > Because the users care about portability (across compilers,
> > libraries and systems), I believe we have to base sour solution on a
> > standard that describes the mapping from the scalar function to the
> > vector function.
> >
> > Because OpenMP is standard and widely used, we should base our
> > solution on the mechanisms that the standard provides, via the
> > directives `declare simd` and `declare variant`, the latter when
> > used in with the `simd` trait in the `construct` set.
> >
> > Please notice that:
> >
> > 1. The scope of the proposal is not implementing full support for
> >     `pragma omp declare variant`.
> > 2. The scope of the proposal is not enabling the vectorizer to do new
> >     kind of vectorizations (e.g. RV-like vectorization described by
> >     Simon).
> > 3. The proposal aims to be extendible wrt 1. and 2.
> > 4. The IR attribute introduced in this proposal is equivalent to the
> >     one needed for the VecClone pass under development in
> >     https://reviews.llvm.org/D22792
> >
> > # CLANG COMPONENTS
> >
> > A C function attribute, `clang_declare_simd_variant`, to attach to
> > the scalar version. The attribute provides enough information to the
> > compiler about the vector shape of the user defined function. The
> > vector shapes handled by the attribute are those handled by the
> > OpenMP standard via `declare simd` (and no more than that).
> >
> > 1. The function attribute handling in clang is crafted with the
> >     requirement that it will be possible to re-use the same components
> >     for the info generated by `declare variant` when used with a `simd`
> >     traits in the `construct` set.
> > 2. The attribute allows orthogonality with the vectorization that is
> >     done via OpenMP: the user vector function is still exposed for
> >     vectorization when not using `-fopenmp-[simd]` once the `declare
> >     simd` and `declare variant` directive of OpenMP will be available
> >     in the front-end.
> >
> > ## C function attribute: `clang_declare_simd_variant`
> >
> > The definition of this attribute has been crafted to match the
> > semantics of `declare variant` for a `simd` construct described in
> > OpenMP 5.0. I have added only the traits of the `device` set, `isa`
> > and `arch`, which I believe are enough to cover for the use case of
> > this proposal. If that is not the case, please provide an example,
> > extending the attribute will be easy even once the current one is
> > implemented.
> >
> > ```
> > clang_declare_simd_variant(<variant-func-id>, <simd clauses>{,
> > <context selector clauses>})
> >
> > <variant-func-id>:= The name of a function variant that is a base language identifier, or,
> >                      for C++, a template-id.
> >
> > <simd clauses> := <simdlen>, <mask>{, <optional simd clauses>}
> >
> > <simdlen> := simdlen(<positive number>) | simdlen("scalable")
> >
> > <mask>    := inbranch | notinbranch
> >
> > <optional simd clauses> := <linear clause>
> >                           | <uniform clause>
> >                           | <align clause>  | {,<optional simd
> > clauses>}
> >
> > <linear clause>  := linear_ref(<var>,<step>)
> >                    | linear_var(<var>, <step>)
> >                    | linear_uval(<var>, <step>)
> >                    | linear(<var>, <step>)
> >
> > <step> := <var> | <non zero number>
> >
> > <uniform clause> := uniform(<var>)
> >
> > <align clause>   := align(<var>, <positive number>)
> >
> > <var> := Name of a parameter in the scalar function
> > declaration/definition
> >
> > <non zero number> := ... | -2 | -1 | 1 | 2 | ...
> >
> > <positive number> := 1 | 2 | 3 | ...
> >
> > <context selector clauses> := {<isa>}{,} {<arch>}
> >
> > <isa> := isa(target-specific-value)
> >
> > <arch> := arch(target-specific-value)
> >
> > ```
> >
> > # LLVM COMPONENTS:
> >
> > ## VectorFunctionShape class
> >
> > The object `VectorFunctionShape` contains the information about the
> > kind of vectorization available for an `llvm::Call`.
> >
> > The object `VectorFunctionShape` must contain the following information:
> >
> > 1. Vectorization Factor (or number or concurrent lanes executed by the
> >     SIMD version of the function). Encoded by unsigned integer.
> > 2. Whether the vector function is requested for scalable
> >     vectorization, encoded by a boolean.
> > 3. Information about masking / no masking, encoded by a boolean.
> > 4. Information about the parameters, encoded in a container that
> >     carries objects of type `ParamaterType`, to describe features like
> >     `linear` and `uniform`.
> > 5. Function name redirection, if a user has specified to use a custom
> >     name instead of the Vector Function ABI ones.
> >
> > Items 1. to 5. represents the information stored in the
> > `vector-function-abi-variant` attribute (see next section).
> >
> > The object can be extended in the future to include new
> > vectorization kinds (for example the RV-like vectorization of the
> > Region Vectorizer), or to add more context information that might
> > come from other uses of OpenMP `declare variant`, or to add new
> > Vector Function ABIs not based on OpenMP. Such information can be
> > retrieved by attributes that will be added to describe the `Call` instance.
> >
> > ## IR Attribute
> >
> > We define a `vector-function-abi-variant` attribute that lists the
> > mangled names produced via the mangling function of the Vector
> > Function ABI rules.
> >
> > ```
> > vector-function-abi-variant = "abi_mangled_name_01, abi_mangled_name_02(user_redirection),..."
> > ```
> >
> > 1. Because we use only OpenMP `declare simd` vectorization, and
> >     because we require a vector Function ABI, we make this explicit
> >     in the name of the attribute.
> > 2. Because the Vector Function ABIs encode all the information
> >     needed to know the vectorization shape of the vector function in
> >     the mangled names, we provide the mangled name via the
> >     attribute.
> > 3. Function names redirection is specified by enclosing the name of
> >     the redirection in parenthesis, as in
> >     `abi_mangled_name_02(user_redirection)`.
> >
> > ## Vector ABI Demangler
> >
> > The “Vector ABI demangler”, is the component that demangles the data
> > in the `vector-function-abi-variant` attribute and that provides the
> > instances of the class `VectorFunctionShape` that can be derived by
> > the mangled names listed in the attribute.
> >
> > ## Query interface: Search Vector Function System (SVFS)
> >
> > An interface that can be queried by the LLVM components to
> > understand whether or not a scalar function can be vectorized, and
> > that retrieves the vector function to be used if such vector shape is available.
> >
> > 1. This component is going to be unrelated to OpenMP.
> > 2. This component will use internally the demangler defined in the
> >     previous section, but it will not expose any aspect of the Vector
> >     Function ABI via its interface.
> >
> > The interface provides two methods.
> >
> > ```
> > std::vector<VectorFunctionShape>
> > SVFS::isFunctionVectorizable(llvm::CallInst * Call);
> >
> > llvm::Function * SVFS::getVectorizedFunction(llvm::CallInst * Call,
> > VectorFunctionShape Info); ```
> >
> > The first method is used to list all the vector shapes that
> > available and attached to a scalar function. An empty results means
> > that no vector versions are available.
> >
> > The second method retrieves the information needed to build a call
> > to a vector function with a specific `VectorFunctionShape` info.
> >
> > # (SELF) ASSESSMENT ON EXTENDIBILITY
> >
> >
> > 1. Extending the C function attribute `clang_declare_simd_variant` to
> >     new Vector Function ABIs that use OpenMP will be straightforward
> >     because the attribute is tight to such ABIs and OpenMP.
> > 2. The C attribute `clang_declare_simd_variant` and the `declare
> >     variant` directive used for the `simd` trait will be sharing the
> >     internals in clang, so adding the OpenMP functionality for `simd`
> >     traits will be mostly handling the directive in the OpenMP
> >     parser. How this should be done is described in
> >    
> > https://clang.llvm.org/docs/InternalsManual.html#how-to-add-an-attri
> > bute 3. The IR attribute `vector-function-abi-variant` is not to be
> >     extended to represent other kind of vectorization other than those
> >     handled by `declare simd` and that are handled with a Vector
> >     Function ABI.
> > 4. The IR attribute `vector-function-abi-variant` is not defined to be
> >     extended to represent the information of `declare variant` in its
> >     totality.
> > 5. The IR attribute will not need to change when we will introduce non
> >     vector function ABI vectorization (RV-like, reductions...) or when
> >     we will decide to fully support `declare variant`. The information
> >     it carries will not need to be invalidated, but just extended with
> >     new attributes that will need to be handled by the
> >     `VectorFunctionShape` class, in a similar way the
> >     `llvm::FPMathOperator` does with the `llvm::FastMathFlags`, which
> >     operates on individual attributes to describe an overall
> >     functionality.
> >
> > # Examples
> >
> > ## Example 1
> >
> > Exposing an Advanced SIMD vector function when targeting Advanced
> > SIMD in AArch64.
> >
> > ```
> > double foo_01(double Input)
> > __attribute__(clang_declare_simd_variant(“vector_foo_01",
> > simdlen(2), notinbranch, isa("simd"));
> >
> > // Advanced SIMD version
> > float64x2_t vector_foo_01(float64x2_t VectorInput); ```
> >
> > The resulting IR attribute is:
> >
> > ```
> > attribute #0 = {vector-abi-variant="_ZGVnN2v_foo_01(vector_foo_01)"}
> > ```
> >
> > ## Example 2
> >
> > Exposing an Advanced SIMD vector function when targeting Advanced
> > SIMD in AArch64, but with the wrong signature. The user specifies a
> > masked version of the function in the clauses of the attribute, the
> > compiler throws an error suggesting the signature expected for
> > ``vector_foo_02.``
> >
> > ```
> > double foo_02(double Input)
> > __attribute__(clang_declare_simd_variant(“vector_foo_02",
> > simdlen(2), inbranch, isa("simd"));
> >
> > // Advanced SIMD version
> > float64x2_t vector_foo_02(float64x2_t VectorInput);
> > // (suggested) compiler error ->                      ^ Missing mask parameter of type `uint64x2_t`.
> > ```
> >
> > ## Example 3
> >
> > Targeting `sincos`-like signatures.
> >
> > ```
> > void foo_03(double Input, double * Output)
> > __attribute__(clang_declare_simd_variant(“vector_foo_03",
> > simdlen(2), notinbranch, linear(Output, 1), isa("simd"));
> >
> > // Advanced SIMD version
> > void vector_foo_03(float64x2_t VectorInput, double * Output); ```
> >
> > The resulting IR attribute is:
> >
> > ```
> > attribute #0 =
> > {vector-abi-variant="_ZGVnN2vl8_foo_03(vector_foo_03)"}
> > ```
> > ## Example 4
> >
> > Scalable vectorization targeting SVE
> >
> > ```
> > double foo_04(double Input)
> > __attribute__(clang_declare_simd_variant(“vector_foo_04",
> > simdlen("scalable"), notinbranch, isa("sve"));
> >
> > // SVE version
> > svfloat64_t vector_foo_04(svfloat64_t VectorInput, svbool_t Mask);
> > ```
> >
> > The resulting IR attribute is:
> >
> > ```
> > attribute #0 = {vector-abi-variant="_ZGVsM2v_foo_04(vector_foo_04)"}
> > ```
> >
> > ## Example 5
> >
> > Fixed length vectorization targeting SVE
> >
> > ```
> > double foo_05(double Input)
> > __attribute__(clang_declare_simd_variant(“vector_foo_05",
> > simdlen(4), inbranch, isa("sve"));
> >
> > // Fixed-length SVE version
> > svfloat64_t vector_foo_05(svfloat64_t VectorInput, svbool_t Mask);
> > ```
> >
> > The resulting IR attribute is:
> >
> > ```
> > attribute #0 = {vector-abi-variant="_ZGVsM2v_foo_04(vector_foo_04)"}
> > ```
> >
> > ## Example 06
> >
> > This is an x86 example, equivalent to the one provided by Andrei
> > Elovikow in
> > http://lists.llvm.org/pipermail/llvm-dev/2019-June/132885.html.
> > Godbolt rendering with ICC at https://godbolt.org/z/Of1NxZ
> >
> > ```
> > float MyAdd(float* a, int b)
> > __attribute__(clang_declare_simd_variant(“MyAddVec", simdlen(8), notinbranch, arch("core_2nd_gen_avx")) {
> >    return *a + b;
> > }
> >
> >
> > __m256 MyAddVec(float* v_a, __m128i v_b1, __m128i v_b2); ```
> >
> > The resulting IR attribute is:
> >
> > ```
> > attribute #0 = {vector-abi-variant="_ZGVbN8l4v_MyAdd(MyAddVec)"}
> > ```
> >
> > ## Example showing interaction with `declare simd`
> >
> > ```
> > #pragma omp declare simd linear(a) notinbranch float foo_06(float
> > *a, int x) __attribute__(clang_declare_simd_variant(“vector_foo_06", simdlen(4), linear(a), notinbranch, arch("armv8.2-a+simd")) {
> >      return *a + x;
> > }
> >
> > // Advanced SIMD version
> > float32x4_t vector_foo_06(float *a, int32x4_t vx) { // Custom
> > implementation.
> > }
> > ```
> >
> > The resulting IR attribute is made of three symbols:
> >
> > 1. `_ZGVnN2l4v_foo_06` and `_ZGVnN4l4v_foo_06`, which represent the
> >     ones the compiler builds by auto-vectorizing `foo_06` according to
> >     the rule defined in the Vector Function ABI specifications for
> >     AArch64.
> > 2. `_ZGVnN4l4v_foo_06(vector_foo_06)`, which represents the
> >     user-defined redirection of the 4-lane version of `foo_06` to the
> >     custom implementation provided by the user when targeting Advanced
> >     SIMD for version 8.2 of the A64 instruction set.
> >
> > ```
> > attribute #0 =
> > {vector-function-abi-variant="_ZGVnN2l4v_foo_06,_ZGVnN4l4v_foo_06,_Z
> > GVnN4l4v_foo_06(vector_foo_06)"}
> > ```
> >
> --
>
> Simon Moll
> Researcher / PhD Student
>
> Compiler Design Lab (Prof. Hack)
> Saarland University, Computer Science
> Building E1.3, Room 4.31
>
> Tel. +49 (0)681 302-57521 : [hidden email] Fax. +49 (0)681
> 302-3065  : http://compilers.cs.uni-saarland.de/people/moll


_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Interface user provided vector functions with the vectorizer.

Joan Lluch via cfe-dev

To me, it is also an issue related to SIMD signature matching when the vectorizer kicks in. Losing info from FE to BE is not good in general.

 

From: Doerfert, Johannes [mailto:[hidden email]]
Sent: Monday, June 24, 2019 8:49 AM
To: Tian, Xinmin <[hidden email]>; Saito, Hideki <[hidden email]>; Francesco Petrogalli <[hidden email]>
Cc: Simon Moll <[hidden email]>; LLVM Development List <[hidden email]>; Clang Dev <[hidden email]>; Renato Golin <[hidden email]>; Finkel, Hal J. <[hidden email]>; Andrea Bocci <[hidden email]>; Elovikov, Andrei <[hidden email]>; Alexey Bataev <[hidden email]>; nd <[hidden email]>; Roman Lebedev <[hidden email]>; Philip Reames <[hidden email]>; Shawn Landden <[hidden email]>
Subject: Re: RFC: Interface user provided vector functions with the vectorizer.

 

@Xinmin, Saito: If Clang/the frontend generates the version there is no problem, or is there? The frontend knows about the original source type and it's ABI specific lowering already.

@Francesco, we should even consider putting the generating capabilities outside of the OpenMP code generation (in the future). That could allow easier reuse by other frontends.


From: Tian, Xinmin <[hidden email]>
Sent: Monday, June 24, 2019 5:28:45 PM
To: Saito, Hideki; Francesco Petrogalli; Doerfert, Johannes
Cc: Simon Moll; LLVM Development List; Clang Dev; Renato Golin; Finkel, Hal J.; Andrea Bocci; Elovikov, Andrei; Alexey Bataev; nd; Roman Lebedev; Philip Reames; Shawn Landden
Subject: RE: RFC: Interface user provided vector functions with the vectorizer.

 

>>>>I don’t know if this is going to be a problem for other architectures

++++++I haven't checked what IA-32/Intel64 should do for type 2, but I fully agree that this needs to be done properly according to the ABI.

Agreed. It looks we have an issue here.  Given this is parameter, could we use  metadata or attribute to preserve the "struct" info, in ICC, we called BE type saved info in the symtab.

Xinmin

-----Original Message-----
From: Saito, Hideki
Sent: Friday, June 21, 2019 4:44 PM
To: Francesco Petrogalli <[hidden email]>; Doerfert, Johannes <[hidden email]>
Cc: Simon Moll <[hidden email]>; LLVM Development List <[hidden email]>; Clang Dev <[hidden email]>; Renato Golin <[hidden email]>; Finkel, Hal J. <[hidden email]>; Andrea Bocci <[hidden email]>; Elovikov, Andrei <[hidden email]>; Alexey Bataev <[hidden email]>; Tian, Xinmin <[hidden email]>; nd <[hidden email]>; Roman Lebedev <[hidden email]>; Philip Reames <[hidden email]>; Shawn Landden <[hidden email]>
Subject: RE: RFC: Interface user provided vector functions with the vectorizer.


>In all cases, the IR type of the parameters in `foo` is i64, therefore is not possible to distinguish what C type generated the signature of `foo`.

Ouch.

>I don’t know if this is going to be a problem for other architectures

I haven't checked what IA-32/Intel64 should do for type 2, but I fully agree that this needs to be done properly according to the ABI.

>Therefore, I would like to propose a change to the RFC, which would move the responsibility off generating the vector function signature from LLVM to clang.

Makes sense to me.

-----Original Message-----
From: Francesco Petrogalli [[hidden email]]
Sent: Friday, June 21, 2019 2:04 PM
To: Doerfert, Johannes <[hidden email]>
Cc: Simon Moll <[hidden email]>; LLVM Development List <[hidden email]>; Clang Dev <[hidden email]>; Renato Golin <[hidden email]>; Finkel, Hal J. <[hidden email]>; Andrea Bocci <[hidden email]>; Elovikov, Andrei <[hidden email]>; Alexey Bataev <[hidden email]>; Saito, Hideki <[hidden email]>; Tian, Xinmin <[hidden email]>; nd <[hidden email]>; Roman Lebedev <[hidden email]>; Philip Reames <[hidden email]>; Shawn Landden <[hidden email]>
Subject: Re: RFC: Interface user provided vector functions with the vectorizer.

Hi all - I am working with a colleague to provide an initial implementation of this.

We encountered a problem when dealing with generating the vector signatures of functions that use complex data.

In this proposal, we expect the SVFS component in the backed to demangle the name of the function in the attribute to be able to reconstruct the signature of the vector function from the scalar function signature.

In case of Complex data, this doesn’t seem to be possible, because the information of “being a vector of 2 lanes” that is supposed to be carried by the complex scalar is lost in the transformation the data type in a “coerced” type.

Consider these three types and the function `foo`:

// Type 1
typedef _Complex int S;

// Type 2
typedef struct x{
 int a;
 int b;
} S;

// Type 3
typedef uint64_t S;

S foo(S a, S b) {
 return ...;
}

In all cases, the IR type of the parameters in `foo` is i64, therefore is not possible to distinguish what C type generated the signature of `foo`.

I don’t know if this is going to be a problem for other architectures, but this is definitely a problem on AArch64 where we need to be able to generate the correct vector function signature for a specific simdlen(N) attached on `foo`. When simdlen(2), for type 1 the vector type is <4 x i32>, for type 2 is <2 x i64*>, for type 3 is <2 x i64>.

Therefore, I would like to propose a change to the RFC, which would move the responsibility off generating the vector function signature from LLVM to clang.

In particular, (and this I believe has already been mentioned by Johannes), we could use the @llvm.compiler.used intrinsic to mark those declaration that needs to stay in the IR and not optimized away OPT before reaching the vectorizer.

In summary, the change would consist of:

1. Generate symbols declaration/definitions of the vector function with the mangled name in the IR, and mark it with @llvm-compiler.used. This could be done in CGOpenMPRuntime.cpp 2. Use the attribute vector-abs-variant defined in this RFC to map scalar names to vector ABI mangled name, and used the same redirection mechanism for the user provided vector name.
3. Move the “vector function signature generation” from the SVFS in LLVM to the openmp code generator of the clang frontend

The SVFS query system would still work as in the current proposal. The only difference is that the vector function signature would be given by the frontend and not need to be recomputed.

Here is an example of ho the IR would look like with this change:

```
@llvm.compiler.used = appending global [1 x i8*] [i8* bitcast (<2 x i32> (<2 x i32>)* @f to i8*)], section "llvm.metadata"

declare dso_local <2 x i32> @_ZGVnN2v_foo(<2 x i32> returned)

declare i32 @foo(i32) #0

; other function definition, including the one provided by the user `my_vector_foo` if the user provided a definition and not just the declaration

attribute #0 = {vector-function-abi-variant=“_ZGVnN2v_foo(my_vector_foo)"}

```

If the attribute @llvm.compiler.used is not suitable for this (I am not aware of all implication of using it on a global symbol), maybe we could come up with a intrinsics that does what we need (avoid deleting declarations that are not used) and name it @llvm.vector.function.used?

Please let me know what you think, I will submit an updated proposal next week.

Kind regards,

Francesco

> On Jun 17, 2019, at 7:05 AM, Doerfert, Johannes <[hidden email]> wrote:
>
> I agree with Simon. This looks good conceptually. I have minor implementation comments but that can wait till the code reviews.
>
> Sorry for the delay and thanks for working on this.
>
> Get Outlook for Android
>
> From: Simon Moll <[hidden email]>
> Sent: Monday, June 17, 2019 10:02:58 AM
> To: Francesco Petrogalli; LLVM Development List; Clang Dev
> Cc: Renato Golin; Finkel, Hal J.; Andrea Bocci; Elovikov, Andrei;
> Alexey Bataev; Doerfert, Johannes; Saito, Hideki; Tian, Xinmin; nd;
> Roman Lebedev; Philip Reames; Shawn Landden
> Subject: Re: RFC: Interface user provided vector functions with the vectorizer.

> Hi Francesco,
>
> On 6/11/19 10:55 PM, Francesco Petrogalli wrote:
> > Dear all,
> >
> > I have re-written the proposal for interfacing user provided vector
> > functions, originally posted in both llvm-dev and cfe-dev mailing
> > list:
> >
> > "[RFC] Expose user provided vector function for auto-vectorization."
> >
> > The proposal looks quite different from the original submission,
> > therefore I took the liberty to start a new thread.
> >
> > The original thread generated some good discussion. In particular,
> > Simon Moll and Johannes Doerfert (CCed) have managed to provide good
> > arguments for the following claims:
> >
> > 1. The Vector Function ABI name mangling scheme of a target is not
> >     enough to describe all uses cases of function vectorization that
> >     the compiler might end up needing to support in the future.
> I think the new name of the attribute makes this point clear.
> > 2. `declare variant` needs to be handled properly at IR level, to be
> >     able to give the compiler the full OpenMP context of the directive.
> >
> > This proposal addresses those two concerns and other (I believe)
> > minor concerns that have been raised in the previous thread.
> >
> > This proposal is provided with examples and a self assessment around
> > extendibility.
> >
> > I have CCed all the people that have participated in the discussion
> > so far, please let me know if you think I have missed anything of
> > what have been raised.
> >
> > Kind regards,
> >
> > Francesco
>
> LGTM. Please add me as a reviewer for this when you post patches.
>
> Thanks!
>
> Simon
>
> >
> > *** DRAFT OF THE PROPOSAL ***
> >
> > # SCOPE OF THE RFC : Interface user provided vector functions with the vectorizer.
> >
> > Because the users care about portability (across compilers,
> > libraries and systems), I believe we have to base sour solution on a
> > standard that describes the mapping from the scalar function to the
> > vector function.
> >
> > Because OpenMP is standard and widely used, we should base our
> > solution on the mechanisms that the standard provides, via the
> > directives `declare simd` and `declare variant`, the latter when
> > used in with the `simd` trait in the `construct` set.
> >
> > Please notice that:
> >
> > 1. The scope of the proposal is not implementing full support for
> >     `pragma omp declare variant`.
> > 2. The scope of the proposal is not enabling the vectorizer to do new
> >     kind of vectorizations (e.g. RV-like vectorization described by
> >     Simon).
> > 3. The proposal aims to be extendible wrt 1. and 2.
> > 4. The IR attribute introduced in this proposal is equivalent to the
> >     one needed for the VecClone pass under development in
> >     https://reviews.llvm.org/D22792
> >
> > # CLANG COMPONENTS
> >
> > A C function attribute, `clang_declare_simd_variant`, to attach to
> > the scalar version. The attribute provides enough information to the
> > compiler about the vector shape of the user defined function. The
> > vector shapes handled by the attribute are those handled by the
> > OpenMP standard via `declare simd` (and no more than that).
> >
> > 1. The function attribute handling in clang is crafted with the
> >     requirement that it will be possible to re-use the same components
> >     for the info generated by `declare variant` when used with a `simd`
> >     traits in the `construct` set.
> > 2. The attribute allows orthogonality with the vectorization that is
> >     done via OpenMP: the user vector function is still exposed for
> >     vectorization when not using `-fopenmp-[simd]` once the `declare
> >     simd` and `declare variant` directive of OpenMP will be available
> >     in the front-end.
> >
> > ## C function attribute: `clang_declare_simd_variant`
> >
> > The definition of this attribute has been crafted to match the
> > semantics of `declare variant` for a `simd` construct described in
> > OpenMP 5.0. I have added only the traits of the `device` set, `isa`
> > and `arch`, which I believe are enough to cover for the use case of
> > this proposal. If that is not the case, please provide an example,
> > extending the attribute will be easy even once the current one is
> > implemented.
> >
> > ```
> > clang_declare_simd_variant(<variant-func-id>, <simd clauses>{,
> > <context selector clauses>})
> >
> > <variant-func-id>:= The name of a function variant that is a base language identifier, or,
> >                      for C++, a template-id.
> >
> > <simd clauses> := <simdlen>, <mask>{, <optional simd clauses>}
> >
> > <simdlen> := simdlen(<positive number>) | simdlen("scalable")
> >
> > <mask>    := inbranch | notinbranch
> >
> > <optional simd clauses> := <linear clause>
> >                           | <uniform clause>
> >                           | <align clause>  | {,<optional simd
> > clauses>}
> >
> > <linear clause>  := linear_ref(<var>,<step>)
> >                    | linear_var(<var>, <step>)
> >                    | linear_uval(<var>, <step>)
> >                    | linear(<var>, <step>)
> >
> > <step> := <var> | <non zero number>
> >
> > <uniform clause> := uniform(<var>)
> >
> > <align clause>   := align(<var>, <positive number>)
> >
> > <var> := Name of a parameter in the scalar function
> > declaration/definition
> >
> > <non zero number> := ... | -2 | -1 | 1 | 2 | ...
> >
> > <positive number> := 1 | 2 | 3 | ...
> >
> > <context selector clauses> := {<isa>}{,} {<arch>}
> >
> > <isa> := isa(target-specific-value)
> >
> > <arch> := arch(target-specific-value)
> >
> > ```
> >
> > # LLVM COMPONENTS:
> >
> > ## VectorFunctionShape class
> >
> > The object `VectorFunctionShape` contains the information about the
> > kind of vectorization available for an `llvm::Call`.
> >
> > The object `VectorFunctionShape` must contain the following information:
> >
> > 1. Vectorization Factor (or number or concurrent lanes executed by the
> >     SIMD version of the function). Encoded by unsigned integer.
> > 2. Whether the vector function is requested for scalable
> >     vectorization, encoded by a boolean.
> > 3. Information about masking / no masking, encoded by a boolean.
> > 4. Information about the parameters, encoded in a container that
> >     carries objects of type `ParamaterType`, to describe features like
> >     `linear` and `uniform`.
> > 5. Function name redirection, if a user has specified to use a custom
> >     name instead of the Vector Function ABI ones.
> >
> > Items 1. to 5. represents the information stored in the
> > `vector-function-abi-variant` attribute (see next section).
> >
> > The object can be extended in the future to include new
> > vectorization kinds (for example the RV-like vectorization of the
> > Region Vectorizer), or to add more context information that might
> > come from other uses of OpenMP `declare variant`, or to add new
> > Vector Function ABIs not based on OpenMP. Such information can be
> > retrieved by attributes that will be added to describe the `Call` instance.
> >
> > ## IR Attribute
> >
> > We define a `vector-function-abi-variant` attribute that lists the
> > mangled names produced via the mangling function of the Vector
> > Function ABI rules.
> >
> > ```
> > vector-function-abi-variant = "abi_mangled_name_01, abi_mangled_name_02(user_redirection),..."
> > ```
> >
> > 1. Because we use only OpenMP `declare simd` vectorization, and
> >     because we require a vector Function ABI, we make this explicit
> >     in the name of the attribute.
> > 2. Because the Vector Function ABIs encode all the information
> >     needed to know the vectorization shape of the vector function in
> >     the mangled names, we provide the mangled name via the
> >     attribute.
> > 3. Function names redirection is specified by enclosing the name of
> >     the redirection in parenthesis, as in
> >     `abi_mangled_name_02(user_redirection)`.
> >
> > ## Vector ABI Demangler
> >
> > The “Vector ABI demangler”, is the component that demangles the data
> > in the `vector-function-abi-variant` attribute and that provides the
> > instances of the class `VectorFunctionShape` that can be derived by
> > the mangled names listed in the attribute.
> >
> > ## Query interface: Search Vector Function System (SVFS)
> >
> > An interface that can be queried by the LLVM components to
> > understand whether or not a scalar function can be vectorized, and
> > that retrieves the vector function to be used if such vector shape is available.
> >
> > 1. This component is going to be unrelated to OpenMP.
> > 2. This component will use internally the demangler defined in the
> >     previous section, but it will not expose any aspect of the Vector
> >     Function ABI via its interface.
> >
> > The interface provides two methods.
> >
> > ```
> > std::vector<VectorFunctionShape>
> > SVFS::isFunctionVectorizable(llvm::CallInst * Call);
> >
> > llvm::Function * SVFS::getVectorizedFunction(llvm::CallInst * Call,
> > VectorFunctionShape Info); ```
> >
> > The first method is used to list all the vector shapes that
> > available and attached to a scalar function. An empty results means
> > that no vector versions are available.
> >
> > The second method retrieves the information needed to build a call
> > to a vector function with a specific `VectorFunctionShape` info.
> >
> > # (SELF) ASSESSMENT ON EXTENDIBILITY
> >
> >
> > 1. Extending the C function attribute `clang_declare_simd_variant` to
> >     new Vector Function ABIs that use OpenMP will be straightforward
> >     because the attribute is tight to such ABIs and OpenMP.
> > 2. The C attribute `clang_declare_simd_variant` and the `declare
> >     variant` directive used for the `simd` trait will be sharing the
> >     internals in clang, so adding the OpenMP functionality for `simd`
> >     traits will be mostly handling the directive in the OpenMP
> >     parser. How this should be done is described in
> >    
> > https://clang.llvm.org/docs/InternalsManual.html#how-to-add-an-attri
> > bute 3. The IR attribute `vector-function-abi-variant` is not to be
> >     extended to represent other kind of vectorization other than those
> >     handled by `declare simd` and that are handled with a Vector
> >     Function ABI.
> > 4. The IR attribute `vector-function-abi-variant` is not defined to be
> >     extended to represent the information of `declare variant` in its
> >     totality.
> > 5. The IR attribute will not need to change when we will introduce non
> >     vector function ABI vectorization (RV-like, reductions...) or when
> >     we will decide to fully support `declare variant`. The information
> >     it carries will not need to be invalidated, but just extended with
> >     new attributes that will need to be handled by the
> >     `VectorFunctionShape` class, in a similar way the
> >     `llvm::FPMathOperator` does with the `llvm::FastMathFlags`, which
> >     operates on individual attributes to describe an overall
> >     functionality.
> >
> > # Examples
> >
> > ## Example 1
> >
> > Exposing an Advanced SIMD vector function when targeting Advanced
> > SIMD in AArch64.
> >
> > ```
> > double foo_01(double Input)
> > __attribute__(clang_declare_simd_variant(“vector_foo_01",
> > simdlen(2), notinbranch, isa("simd"));
> >
> > // Advanced SIMD version
> > float64x2_t vector_foo_01(float64x2_t VectorInput); ```
> >
> > The resulting IR attribute is:
> >
> > ```
> > attribute #0 = {vector-abi-variant="_ZGVnN2v_foo_01(vector_foo_01)"}
> > ```
> >
> > ## Example 2
> >
> > Exposing an Advanced SIMD vector function when targeting Advanced
> > SIMD in AArch64, but with the wrong signature. The user specifies a
> > masked version of the function in the clauses of the attribute, the
> > compiler throws an error suggesting the signature expected for
> > ``vector_foo_02.``
> >
> > ```
> > double foo_02(double Input)
> > __attribute__(clang_declare_simd_variant(“vector_foo_02",
> > simdlen(2), inbranch, isa("simd"));
> >
> > // Advanced SIMD version
> > float64x2_t vector_foo_02(float64x2_t VectorInput);
> > // (suggested) compiler error ->                      ^ Missing mask parameter of type `uint64x2_t`.
> > ```
> >
> > ## Example 3
> >
> > Targeting `sincos`-like signatures.
> >
> > ```
> > void foo_03(double Input, double * Output)
> > __attribute__(clang_declare_simd_variant(“vector_foo_03",
> > simdlen(2), notinbranch, linear(Output, 1), isa("simd"));
> >
> > // Advanced SIMD version
> > void vector_foo_03(float64x2_t VectorInput, double * Output); ```
> >
> > The resulting IR attribute is:
> >
> > ```
> > attribute #0 =
> > {vector-abi-variant="_ZGVnN2vl8_foo_03(vector_foo_03)"}
> > ```
> > ## Example 4
> >
> > Scalable vectorization targeting SVE
> >
> > ```
> > double foo_04(double Input)
> > __attribute__(clang_declare_simd_variant(“vector_foo_04",
> > simdlen("scalable"), notinbranch, isa("sve"));
> >
> > // SVE version
> > svfloat64_t vector_foo_04(svfloat64_t VectorInput, svbool_t Mask);
> > ```
> >
> > The resulting IR attribute is:
> >
> > ```
> > attribute #0 = {vector-abi-variant="_ZGVsM2v_foo_04(vector_foo_04)"}
> > ```
> >
> > ## Example 5
> >
> > Fixed length vectorization targeting SVE
> >
> > ```
> > double foo_05(double Input)
> > __attribute__(clang_declare_simd_variant(“vector_foo_05",
> > simdlen(4), inbranch, isa("sve"));
> >
> > // Fixed-length SVE version
> > svfloat64_t vector_foo_05(svfloat64_t VectorInput, svbool_t Mask);
> > ```
> >
> > The resulting IR attribute is:
> >
> > ```
> > attribute #0 = {vector-abi-variant="_ZGVsM2v_foo_04(vector_foo_04)"}
> > ```
> >
> > ## Example 06
> >
> > This is an x86 example, equivalent to the one provided by Andrei
> > Elovikow in
> > http://lists.llvm.org/pipermail/llvm-dev/2019-June/132885.html.
> > Godbolt rendering with ICC at https://godbolt.org/z/Of1NxZ
> >
> > ```
> > float MyAdd(float* a, int b)
> > __attribute__(clang_declare_simd_variant(“MyAddVec", simdlen(8), notinbranch, arch("core_2nd_gen_avx")) {
> >    return *a + b;
> > }
> >
> >
> > __m256 MyAddVec(float* v_a, __m128i v_b1, __m128i v_b2); ```
> >
> > The resulting IR attribute is:
> >
> > ```
> > attribute #0 = {vector-abi-variant="_ZGVbN8l4v_MyAdd(MyAddVec)"}
> > ```
> >
> > ## Example showing interaction with `declare simd`
> >
> > ```
> > #pragma omp declare simd linear(a) notinbranch float foo_06(float
> > *a, int x) __attribute__(clang_declare_simd_variant(“vector_foo_06", simdlen(4), linear(a), notinbranch, arch("armv8.2-a+simd")) {
> >      return *a + x;
> > }
> >
> > // Advanced SIMD version
> > float32x4_t vector_foo_06(float *a, int32x4_t vx) { // Custom
> > implementation.
> > }
> > ```
> >
> > The resulting IR attribute is made of three symbols:
> >
> > 1. `_ZGVnN2l4v_foo_06` and `_ZGVnN4l4v_foo_06`, which represent the
> >     ones the compiler builds by auto-vectorizing `foo_06` according to
> >     the rule defined in the Vector Function ABI specifications for
> >     AArch64.
> > 2. `_ZGVnN4l4v_foo_06(vector_foo_06)`, which represents the
> >     user-defined redirection of the 4-lane version of `foo_06` to the
> >     custom implementation provided by the user when targeting Advanced
> >     SIMD for version 8.2 of the A64 instruction set.
> >
> > ```
> > attribute #0 =
> > {vector-function-abi-variant="_ZGVnN2l4v_foo_06,_ZGVnN4l4v_foo_06,_Z
> > GVnN4l4v_foo_06(vector_foo_06)"}
> > ```
> >
> --
>
> Simon Moll
> Researcher / PhD Student
>
> Compiler Design Lab (Prof. Hack)
> Saarland University, Computer Science
> Building E1.3, Room 4.31
>
> Tel. +49 (0)681 302-57521 : [hidden email] Fax. +49 (0)681
> 302-3065  : http://compilers.cs.uni-saarland.de/people/moll


_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Interface user provided vector functions with the vectorizer.

Joan Lluch via cfe-dev
In reply to this post by Joan Lluch via cfe-dev


> On Jun 24, 2019, at 10:48 AM, Doerfert, Johannes <[hidden email]> wrote:
>
> @Francesco, we should even consider putting the generating capabilities outside of the OpenMP code generation (in the future). That could allow easier reuse by other frontends.

This is already decoupled from OpenMP code generation. Here OpenMP is used only to classify the functions via `clang_declare_simd_variant`. The data it generate is used buy the SVFS, but the SVFS itself is independent from OpenMP, and can be extended to be used for other kind of vector functions that are not handled by the OpenMP description.

Other frontends will be able to use the same mechanism, they will just have to generate the same data in the IR.

Francesco
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Interface user provided vector functions with the vectorizer.

Joan Lluch via cfe-dev
In reply to this post by Joan Lluch via cfe-dev
I thought, when you match in the vectorizer you don't care what the the type was iff the FE made sure the versions available have been encoded according to the ABI. Maybe we need an example where this is a problem.




From: Tian, Xinmin <[hidden email]>
Sent: Monday, June 24, 2019 5:53:28 PM
To: Doerfert, Johannes; Saito, Hideki; Francesco Petrogalli
Cc: Simon Moll; LLVM Development List; Clang Dev; Renato Golin; Finkel, Hal J.; Andrea Bocci; Elovikov, Andrei; Alexey Bataev; nd; Roman Lebedev; Philip Reames; Shawn Landden
Subject: RE: RFC: Interface user provided vector functions with the vectorizer.
 

To me, it is also an issue related to SIMD signature matching when the vectorizer kicks in. Losing info from FE to BE is not good in general.

 

From: Doerfert, Johannes [mailto:[hidden email]]
Sent: Monday, June 24, 2019 8:49 AM
To: Tian, Xinmin <[hidden email]>; Saito, Hideki <[hidden email]>; Francesco Petrogalli <[hidden email]>
Cc: Simon Moll <[hidden email]>; LLVM Development List <[hidden email]>; Clang Dev <[hidden email]>; Renato Golin <[hidden email]>; Finkel, Hal J. <[hidden email]>; Andrea Bocci <[hidden email]>; Elovikov, Andrei <[hidden email]>; Alexey Bataev <[hidden email]>; nd <[hidden email]>; Roman Lebedev <[hidden email]>; Philip Reames <[hidden email]>; Shawn Landden <[hidden email]>
Subject: Re: RFC: Interface user provided vector functions with the vectorizer.

 

@Xinmin, Saito: If Clang/the frontend generates the version there is no problem, or is there? The frontend knows about the original source type and it's ABI specific lowering already.

@Francesco, we should even consider putting the generating capabilities outside of the OpenMP code generation (in the future). That could allow easier reuse by other frontends.


From: Tian, Xinmin <[hidden email]>
Sent: Monday, June 24, 2019 5:28:45 PM
To: Saito, Hideki; Francesco Petrogalli; Doerfert, Johannes
Cc: Simon Moll; LLVM Development List; Clang Dev; Renato Golin; Finkel, Hal J.; Andrea Bocci; Elovikov, Andrei; Alexey Bataev; nd; Roman Lebedev; Philip Reames; Shawn Landden
Subject: RE: RFC: Interface user provided vector functions with the vectorizer.

 

>>>>I don’t know if this is going to be a problem for other architectures

++++++I haven't checked what IA-32/Intel64 should do for type 2, but I fully agree that this needs to be done properly according to the ABI.

Agreed. It looks we have an issue here.  Given this is parameter, could we use  metadata or attribute to preserve the "struct" info, in ICC, we called BE type saved info in the symtab.

Xinmin

-----Original Message-----
From: Saito, Hideki
Sent: Friday, June 21, 2019 4:44 PM
To: Francesco Petrogalli <[hidden email]>; Doerfert, Johannes <[hidden email]>
Cc: Simon Moll <[hidden email]>; LLVM Development List <[hidden email]>; Clang Dev <[hidden email]>; Renato Golin <[hidden email]>; Finkel, Hal J. <[hidden email]>; Andrea Bocci <[hidden email]>; Elovikov, Andrei <[hidden email]>; Alexey Bataev <[hidden email]>; Tian, Xinmin <[hidden email]>; nd <[hidden email]>; Roman Lebedev <[hidden email]>; Philip Reames <[hidden email]>; Shawn Landden <[hidden email]>
Subject: RE: RFC: Interface user provided vector functions with the vectorizer.


>In all cases, the IR type of the parameters in `foo` is i64, therefore is not possible to distinguish what C type generated the signature of `foo`.

Ouch.

>I don’t know if this is going to be a problem for other architectures

I haven't checked what IA-32/Intel64 should do for type 2, but I fully agree that this needs to be done properly according to the ABI.

>Therefore, I would like to propose a change to the RFC, which would move the responsibility off generating the vector function signature from LLVM to clang.

Makes sense to me.

-----Original Message-----
From: Francesco Petrogalli [[hidden email]]
Sent: Friday, June 21, 2019 2:04 PM
To: Doerfert, Johannes <[hidden email]>
Cc: Simon Moll <[hidden email]>; LLVM Development List <[hidden email]>; Clang Dev <[hidden email]>; Renato Golin <[hidden email]>; Finkel, Hal J. <[hidden email]>; Andrea Bocci <[hidden email]>; Elovikov, Andrei <[hidden email]>; Alexey Bataev <[hidden email]>; Saito, Hideki <[hidden email]>; Tian, Xinmin <[hidden email]>; nd <[hidden email]>; Roman Lebedev <[hidden email]>; Philip Reames <[hidden email]>; Shawn Landden <[hidden email]>
Subject: Re: RFC: Interface user provided vector functions with the vectorizer.

Hi all - I am working with a colleague to provide an initial implementation of this.

We encountered a problem when dealing with generating the vector signatures of functions that use complex data.

In this proposal, we expect the SVFS component in the backed to demangle the name of the function in the attribute to be able to reconstruct the signature of the vector function from the scalar function signature.

In case of Complex data, this doesn’t seem to be possible, because the information of “being a vector of 2 lanes” that is supposed to be carried by the complex scalar is lost in the transformation the data type in a “coerced” type.

Consider these three types and the function `foo`:

// Type 1
typedef _Complex int S;

// Type 2
typedef struct x{
 int a;
 int b;
} S;

// Type 3
typedef uint64_t S;

S foo(S a, S b) {
 return ...;
}

In all cases, the IR type of the parameters in `foo` is i64, therefore is not possible to distinguish what C type generated the signature of `foo`.

I don’t know if this is going to be a problem for other architectures, but this is definitely a problem on AArch64 where we need to be able to generate the correct vector function signature for a specific simdlen(N) attached on `foo`. When simdlen(2), for type 1 the vector type is <4 x i32>, for type 2 is <2 x i64*>, for type 3 is <2 x i64>.

Therefore, I would like to propose a change to the RFC, which would move the responsibility off generating the vector function signature from LLVM to clang.

In particular, (and this I believe has already been mentioned by Johannes), we could use the @llvm.compiler.used intrinsic to mark those declaration that needs to stay in the IR and not optimized away OPT before reaching the vectorizer.

In summary, the change would consist of:

1. Generate symbols declaration/definitions of the vector function with the mangled name in the IR, and mark it with @llvm-compiler.used. This could be done in CGOpenMPRuntime.cpp 2. Use the attribute vector-abs-variant defined in this RFC to map scalar names to vector ABI mangled name, and used the same redirection mechanism for the user provided vector name.
3. Move the “vector function signature generation” from the SVFS in LLVM to the openmp code generator of the clang frontend

The SVFS query system would still work as in the current proposal. The only difference is that the vector function signature would be given by the frontend and not need to be recomputed.

Here is an example of ho the IR would look like with this change:

```
@llvm.compiler.used = appending global [1 x i8*] [i8* bitcast (<2 x i32> (<2 x i32>)* @f to i8*)], section "llvm.metadata"

declare dso_local <2 x i32> @_ZGVnN2v_foo(<2 x i32> returned)

declare i32 @foo(i32) #0

; other function definition, including the one provided by the user `my_vector_foo` if the user provided a definition and not just the declaration

attribute #0 = {vector-function-abi-variant=“_ZGVnN2v_foo(my_vector_foo)"}

```

If the attribute @llvm.compiler.used is not suitable for this (I am not aware of all implication of using it on a global symbol), maybe we could come up with a intrinsics that does what we need (avoid deleting declarations that are not used) and name it @llvm.vector.function.used?

Please let me know what you think, I will submit an updated proposal next week.

Kind regards,

Francesco

> On Jun 17, 2019, at 7:05 AM, Doerfert, Johannes <[hidden email]> wrote:
>
> I agree with Simon. This looks good conceptually. I have minor implementation comments but that can wait till the code reviews.
>
> Sorry for the delay and thanks for working on this.
>
> Get Outlook for Android
>
> From: Simon Moll <[hidden email]>
> Sent: Monday, June 17, 2019 10:02:58 AM
> To: Francesco Petrogalli; LLVM Development List; Clang Dev
> Cc: Renato Golin; Finkel, Hal J.; Andrea Bocci; Elovikov, Andrei;
> Alexey Bataev; Doerfert, Johannes; Saito, Hideki; Tian, Xinmin; nd;
> Roman Lebedev; Philip Reames; Shawn Landden
> Subject: Re: RFC: Interface user provided vector functions with the vectorizer.

> Hi Francesco,
>
> On 6/11/19 10:55 PM, Francesco Petrogalli wrote:
> > Dear all,
> >
> > I have re-written the proposal for interfacing user provided vector
> > functions, originally posted in both llvm-dev and cfe-dev mailing
> > list:
> >
> > "[RFC] Expose user provided vector function for auto-vectorization."
> >
> > The proposal looks quite different from the original submission,
> > therefore I took the liberty to start a new thread.
> >
> > The original thread generated some good discussion. In particular,
> > Simon Moll and Johannes Doerfert (CCed) have managed to provide good
> > arguments for the following claims:
> >
> > 1. The Vector Function ABI name mangling scheme of a target is not
> >     enough to describe all uses cases of function vectorization that
> >     the compiler might end up needing to support in the future.
> I think the new name of the attribute makes this point clear.
> > 2. `declare variant` needs to be handled properly at IR level, to be
> >     able to give the compiler the full OpenMP context of the directive.
> >
> > This proposal addresses those two concerns and other (I believe)
> > minor concerns that have been raised in the previous thread.
> >
> > This proposal is provided with examples and a self assessment around
> > extendibility.
> >
> > I have CCed all the people that have participated in the discussion
> > so far, please let me know if you think I have missed anything of
> > what have been raised.
> >
> > Kind regards,
> >
> > Francesco
>
> LGTM. Please add me as a reviewer for this when you post patches.
>
> Thanks!
>
> Simon
>
> >
> > *** DRAFT OF THE PROPOSAL ***
> >
> > # SCOPE OF THE RFC : Interface user provided vector functions with the vectorizer.
> >
> > Because the users care about portability (across compilers,
> > libraries and systems), I believe we have to base sour solution on a
> > standard that describes the mapping from the scalar function to the
> > vector function.
> >
> > Because OpenMP is standard and widely used, we should base our
> > solution on the mechanisms that the standard provides, via the
> > directives `declare simd` and `declare variant`, the latter when
> > used in with the `simd` trait in the `construct` set.
> >
> > Please notice that:
> >
> > 1. The scope of the proposal is not implementing full support for
> >     `pragma omp declare variant`.
> > 2. The scope of the proposal is not enabling the vectorizer to do new
> >     kind of vectorizations (e.g. RV-like vectorization described by
> >     Simon).
> > 3. The proposal aims to be extendible wrt 1. and 2.
> > 4. The IR attribute introduced in this proposal is equivalent to the
> >     one needed for the VecClone pass under development in
> >     https://reviews.llvm.org/D22792
> >
> > # CLANG COMPONENTS
> >
> > A C function attribute, `clang_declare_simd_variant`, to attach to
> > the scalar version. The attribute provides enough information to the
> > compiler about the vector shape of the user defined function. The
> > vector shapes handled by the attribute are those handled by the
> > OpenMP standard via `declare simd` (and no more than that).
> >
> > 1. The function attribute handling in clang is crafted with the
> >     requirement that it will be possible to re-use the same components
> >     for the info generated by `declare variant` when used with a `simd`
> >     traits in the `construct` set.
> > 2. The attribute allows orthogonality with the vectorization that is
> >     done via OpenMP: the user vector function is still exposed for
> >     vectorization when not using `-fopenmp-[simd]` once the `declare
> >     simd` and `declare variant` directive of OpenMP will be available
> >     in the front-end.
> >
> > ## C function attribute: `clang_declare_simd_variant`
> >
> > The definition of this attribute has been crafted to match the
> > semantics of `declare variant` for a `simd` construct described in
> > OpenMP 5.0. I have added only the traits of the `device` set, `isa`
> > and `arch`, which I believe are enough to cover for the use case of
> > this proposal. If that is not the case, please provide an example,
> > extending the attribute will be easy even once the current one is
> > implemented.
> >
> > ```
> > clang_declare_simd_variant(<variant-func-id>, <simd clauses>{,
> > <context selector clauses>})
> >
> > <variant-func-id>:= The name of a function variant that is a base language identifier, or,
> >                      for C++, a template-id.
> >
> > <simd clauses> := <simdlen>, <mask>{, <optional simd clauses>}
> >
> > <simdlen> := simdlen(<positive number>) | simdlen("scalable")
> >
> > <mask>    := inbranch | notinbranch
> >
> > <optional simd clauses> := <linear clause>
> >                           | <uniform clause>
> >                           | <align clause>  | {,<optional simd
> > clauses>}
> >
> > <linear clause>  := linear_ref(<var>,<step>)
> >                    | linear_var(<var>, <step>)
> >                    | linear_uval(<var>, <step>)
> >                    | linear(<var>, <step>)
> >
> > <step> := <var> | <non zero number>
> >
> > <uniform clause> := uniform(<var>)
> >
> > <align clause>   := align(<var>, <positive number>)
> >
> > <var> := Name of a parameter in the scalar function
> > declaration/definition
> >
> > <non zero number> := ... | -2 | -1 | 1 | 2 | ...
> >
> > <positive number> := 1 | 2 | 3 | ...
> >
> > <context selector clauses> := {<isa>}{,} {<arch>}
> >
> > <isa> := isa(target-specific-value)
> >
> > <arch> := arch(target-specific-value)
> >
> > ```
> >
> > # LLVM COMPONENTS:
> >
> > ## VectorFunctionShape class
> >
> > The object `VectorFunctionShape` contains the information about the
> > kind of vectorization available for an `llvm::Call`.
> >
> > The object `VectorFunctionShape` must contain the following information:
> >
> > 1. Vectorization Factor (or number or concurrent lanes executed by the
> >     SIMD version of the function). Encoded by unsigned integer.
> > 2. Whether the vector function is requested for scalable
> >     vectorization, encoded by a boolean.
> > 3. Information about masking / no masking, encoded by a boolean.
> > 4. Information about the parameters, encoded in a container that
> >     carries objects of type `ParamaterType`, to describe features like
> >     `linear` and `uniform`.
> > 5. Function name redirection, if a user has specified to use a custom
> >     name instead of the Vector Function ABI ones.
> >
> > Items 1. to 5. represents the information stored in the
> > `vector-function-abi-variant` attribute (see next section).
> >
> > The object can be extended in the future to include new
> > vectorization kinds (for example the RV-like vectorization of the
> > Region Vectorizer), or to add more context information that might
> > come from other uses of OpenMP `declare variant`, or to add new
> > Vector Function ABIs not based on OpenMP. Such information can be
> > retrieved by attributes that will be added to describe the `Call` instance.
> >
> > ## IR Attribute
> >
> > We define a `vector-function-abi-variant` attribute that lists the
> > mangled names produced via the mangling function of the Vector
> > Function ABI rules.
> >
> > ```
> > vector-function-abi-variant = "abi_mangled_name_01, abi_mangled_name_02(user_redirection),..."
> > ```
> >
> > 1. Because we use only OpenMP `declare simd` vectorization, and
> >     because we require a vector Function ABI, we make this explicit
> >     in the name of the attribute.
> > 2. Because the Vector Function ABIs encode all the information
> >     needed to know the vectorization shape of the vector function in
> >     the mangled names, we provide the mangled name via the
> >     attribute.
> > 3. Function names redirection is specified by enclosing the name of
> >     the redirection in parenthesis, as in
> >     `abi_mangled_name_02(user_redirection)`.
> >
> > ## Vector ABI Demangler
> >
> > The “Vector ABI demangler”, is the component that demangles the data
> > in the `vector-function-abi-variant` attribute and that provides the
> > instances of the class `VectorFunctionShape` that can be derived by
> > the mangled names listed in the attribute.
> >
> > ## Query interface: Search Vector Function System (SVFS)
> >
> > An interface that can be queried by the LLVM components to
> > understand whether or not a scalar function can be vectorized, and
> > that retrieves the vector function to be used if such vector shape is available.
> >
> > 1. This component is going to be unrelated to OpenMP.
> > 2. This component will use internally the demangler defined in the
> >     previous section, but it will not expose any aspect of the Vector
> >     Function ABI via its interface.
> >
> > The interface provides two methods.
> >
> > ```
> > std::vector<VectorFunctionShape>
> > SVFS::isFunctionVectorizable(llvm::CallInst * Call);
> >
> > llvm::Function * SVFS::getVectorizedFunction(llvm::CallInst * Call,
> > VectorFunctionShape Info); ```
> >
> > The first method is used to list all the vector shapes that
> > available and attached to a scalar function. An empty results means
> > that no vector versions are available.
> >
> > The second method retrieves the information needed to build a call
> > to a vector function with a specific `VectorFunctionShape` info.
> >
> > # (SELF) ASSESSMENT ON EXTENDIBILITY
> >
> >
> > 1. Extending the C function attribute `clang_declare_simd_variant` to
> >     new Vector Function ABIs that use OpenMP will be straightforward
> >     because the attribute is tight to such ABIs and OpenMP.
> > 2. The C attribute `clang_declare_simd_variant` and the `declare
> >     variant` directive used for the `simd` trait will be sharing the
> >     internals in clang, so adding the OpenMP functionality for `simd`
> >     traits will be mostly handling the directive in the OpenMP
> >     parser. How this should be done is described in
> >    
> > https://clang.llvm.org/docs/InternalsManual.html#how-to-add-an-attri
> > bute 3. The IR attribute `vector-function-abi-variant` is not to be
> >     extended to represent other kind of vectorization other than those
> >     handled by `declare simd` and that are handled with a Vector
> >     Function ABI.
> > 4. The IR attribute `vector-function-abi-variant` is not defined to be
> >     extended to represent the information of `declare variant` in its
> >     totality.
> > 5. The IR attribute will not need to change when we will introduce non
> >     vector function ABI vectorization (RV-like, reductions...) or when
> >     we will decide to fully support `declare variant`. The information
> >     it carries will not need to be invalidated, but just extended with
> >     new attributes that will need to be handled by the
> >     `VectorFunctionShape` class, in a similar way the
> >     `llvm::FPMathOperator` does with the `llvm::FastMathFlags`, which
> >     operates on individual attributes to describe an overall
> >     functionality.
> >
> > # Examples
> >
> > ## Example 1
> >
> > Exposing an Advanced SIMD vector function when targeting Advanced
> > SIMD in AArch64.
> >
> > ```
> > double foo_01(double Input)
> > __attribute__(clang_declare_simd_variant(“vector_foo_01",
> > simdlen(2), notinbranch, isa("simd"));
> >
> > // Advanced SIMD version
> > float64x2_t vector_foo_01(float64x2_t VectorInput); ```
> >
> > The resulting IR attribute is:
> >
> > ```
> > attribute #0 = {vector-abi-variant="_ZGVnN2v_foo_01(vector_foo_01)"}
> > ```
> >
> > ## Example 2
> >
> > Exposing an Advanced SIMD vector function when targeting Advanced
> > SIMD in AArch64, but with the wrong signature. The user specifies a
> > masked version of the function in the clauses of the attribute, the
> > compiler throws an error suggesting the signature expected for
> > ``vector_foo_02.``
> >
> > ```
> > double foo_02(double Input)
> > __attribute__(clang_declare_simd_variant(“vector_foo_02",
> > simdlen(2), inbranch, isa("simd"));
> >
> > // Advanced SIMD version
> > float64x2_t vector_foo_02(float64x2_t VectorInput);
> > // (suggested) compiler error ->                      ^ Missing mask parameter of type `uint64x2_t`.
> > ```
> >
> > ## Example 3
> >
> > Targeting `sincos`-like signatures.
> >
> > ```
> > void foo_03(double Input, double * Output)
> > __attribute__(clang_declare_simd_variant(“vector_foo_03",
> > simdlen(2), notinbranch, linear(Output, 1), isa("simd"));
> >
> > // Advanced SIMD version
> > void vector_foo_03(float64x2_t VectorInput, double * Output); ```
> >
> > The resulting IR attribute is:
> >
> > ```
> > attribute #0 =
> > {vector-abi-variant="_ZGVnN2vl8_foo_03(vector_foo_03)"}
> > ```
> > ## Example 4
> >
> > Scalable vectorization targeting SVE
> >
> > ```
> > double foo_04(double Input)
> > __attribute__(clang_declare_simd_variant(“vector_foo_04",
> > simdlen("scalable"), notinbranch, isa("sve"));
> >
> > // SVE version
> > svfloat64_t vector_foo_04(svfloat64_t VectorInput, svbool_t Mask);
> > ```
> >
> > The resulting IR attribute is:
> >
> > ```
> > attribute #0 = {vector-abi-variant="_ZGVsM2v_foo_04(vector_foo_04)"}
> > ```
> >
> > ## Example 5
> >
> > Fixed length vectorization targeting SVE
> >
> > ```
> > double foo_05(double Input)
> > __attribute__(clang_declare_simd_variant(“vector_foo_05",
> > simdlen(4), inbranch, isa("sve"));
> >
> > // Fixed-length SVE version
> > svfloat64_t vector_foo_05(svfloat64_t VectorInput, svbool_t Mask);
> > ```
> >
> > The resulting IR attribute is:
> >
> > ```
> > attribute #0 = {vector-abi-variant="_ZGVsM2v_foo_04(vector_foo_04)"}
> > ```
> >
> > ## Example 06
> >
> > This is an x86 example, equivalent to the one provided by Andrei
> > Elovikow in
> > http://lists.llvm.org/pipermail/llvm-dev/2019-June/132885.html.
> > Godbolt rendering with ICC at https://godbolt.org/z/Of1NxZ
> >
> > ```
> > float MyAdd(float* a, int b)
> > __attribute__(clang_declare_simd_variant(“MyAddVec", simdlen(8), notinbranch, arch("core_2nd_gen_avx")) {
> >    return *a + b;
> > }
> >
> >
> > __m256 MyAddVec(float* v_a, __m128i v_b1, __m128i v_b2); ```
> >
> > The resulting IR attribute is:
> >
> > ```
> > attribute #0 = {vector-abi-variant="_ZGVbN8l4v_MyAdd(MyAddVec)"}
> > ```
> >
> > ## Example showing interaction with `declare simd`
> >
> > ```
> > #pragma omp declare simd linear(a) notinbranch float foo_06(float
> > *a, int x) __attribute__(clang_declare_simd_variant(“vector_foo_06", simdlen(4), linear(a), notinbranch, arch("armv8.2-a+simd")) {
> >      return *a + x;
> > }
> >
> > // Advanced SIMD version
> > float32x4_t vector_foo_06(float *a, int32x4_t vx) { // Custom
> > implementation.
> > }
> > ```
> >
> > The resulting IR attribute is made of three symbols:
> >
> > 1. `_ZGVnN2l4v_foo_06` and `_ZGVnN4l4v_foo_06`, which represent the
> >     ones the compiler builds by auto-vectorizing `foo_06` according to
> >     the rule defined in the Vector Function ABI specifications for
> >     AArch64.
> > 2. `_ZGVnN4l4v_foo_06(vector_foo_06)`, which represents the
> >     user-defined redirection of the 4-lane version of `foo_06` to the
> >     custom implementation provided by the user when targeting Advanced
> >     SIMD for version 8.2 of the A64 instruction set.
> >
> > ```
> > attribute #0 =
> > {vector-function-abi-variant="_ZGVnN2l4v_foo_06,_ZGVnN4l4v_foo_06,_Z
> > GVnN4l4v_foo_06(vector_foo_06)"}
> > ```
> >
> --
>
> Simon Moll
> Researcher / PhD Student
>
> Compiler Design Lab (Prof. Hack)
> Saarland University, Computer Science
> Building E1.3, Room 4.31
>
> Tel. +49 (0)681 302-57521 : [hidden email] Fax. +49 (0)681
> 302-3065  : http://compilers.cs.uni-saarland.de/people/moll


_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Interface user provided vector functions with the vectorizer.

Joan Lluch via cfe-dev
In reply to this post by Joan Lluch via cfe-dev


> On Jun 24, 2019, at 10:53 AM, Tian, Xinmin <[hidden email]> wrote:
>
> To me, it is also an issue related to SIMD signature matching when the vectorizer kicks in. Losing info from FE to BE is not good in general.
>  

Yes, we cannot loose such information. In particular, the three examples I reported are all generating i64 in the scalar function signature:

// Type 1
typedef _Complex int S;

// Type 2
typedef struct x{
int a;
int b;
} S;

// Type 3
typedef uint64_t S;

S foo(S a, S b) {
return ...;
}


On AArch64, the correspondent vector function signature in the three cases would be (for 2-lane unmasked vectorization):

// Type 1:

<4 x int> vectorized_foo(<4 x int>, <4 x int>)

// Type 2:

%a = type struct {I 32, i32}

<2 x %a* > vectorized_foo(<2 x %a*> , <2 x %a*>)

// Type 3:

<2 x i64> vectorized_foo(<2 x i64>, <2 x i64)

To make sure that the vectorizer knows how to map the scalar function parameters to the vector ones, we have to make sure that the original signature  information is stored somewhere.

I will work on this, and provide examples.

Suggestions are welcome.

Thank you

Francesco
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Interface user provided vector functions with the vectorizer.

Joan Lluch via cfe-dev
Thanks Francesco!

-----Original Message-----
From: Francesco Petrogalli [mailto:[hidden email]]
Sent: Monday, June 24, 2019 9:06 AM
To: Tian, Xinmin <[hidden email]>
Cc: Doerfert, Johannes <[hidden email]>; Saito, Hideki <[hidden email]>; Simon Moll <[hidden email]>; LLVM Development List <[hidden email]>; Clang Dev <[hidden email]>; Renato Golin <[hidden email]>; Finkel, Hal J. <[hidden email]>; Andrea Bocci <[hidden email]>; Elovikov, Andrei <[hidden email]>; Alexey Bataev <[hidden email]>; nd <[hidden email]>; Roman Lebedev <[hidden email]>; Philip Reames <[hidden email]>; Shawn Landden <[hidden email]>
Subject: Re: RFC: Interface user provided vector functions with the vectorizer.



> On Jun 24, 2019, at 10:53 AM, Tian, Xinmin <[hidden email]> wrote:
>
> To me, it is also an issue related to SIMD signature matching when the vectorizer kicks in. Losing info from FE to BE is not good in general.
>  

Yes, we cannot loose such information. In particular, the three examples I reported are all generating i64 in the scalar function signature:

// Type 1
typedef _Complex int S;

// Type 2
typedef struct x{
int a;
int b;
} S;

// Type 3
typedef uint64_t S;

S foo(S a, S b) {
return ...;
}


On AArch64, the correspondent vector function signature in the three cases would be (for 2-lane unmasked vectorization):

// Type 1:

<4 x int> vectorized_foo(<4 x int>, <4 x int>)

// Type 2:

%a = type struct {I 32, i32}

<2 x %a* > vectorized_foo(<2 x %a*> , <2 x %a*>)

// Type 3:

<2 x i64> vectorized_foo(<2 x i64>, <2 x i64)

To make sure that the vectorizer knows how to map the scalar function parameters to the vector ones, we have to make sure that the original signature  information is stored somewhere.

I will work on this, and provide examples.

Suggestions are welcome.

Thank you

Francesco
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Interface user provided vector functions with the vectorizer.

Joan Lluch via cfe-dev
In reply to this post by Joan Lluch via cfe-dev
I mean, the FE will create only one of the 3 vector versions matching the one we want for a given vector length, wouldn't it? The question now is: can we with the scalar and one vector version correctly vectorize the call. If the answer is no, what is the minimal amount of information, in addition to the two version, we would need?


From: Francesco Petrogalli <[hidden email]>
Sent: Monday, June 24, 2019 6:06:12 PM
To: Tian, Xinmin
Cc: Doerfert, Johannes; Saito, Hideki; Simon Moll; LLVM Development List; Clang Dev; Renato Golin; Finkel, Hal J.; Andrea Bocci; Elovikov, Andrei; Alexey Bataev; nd; Roman Lebedev; Philip Reames; Shawn Landden
Subject: Re: RFC: Interface user provided vector functions with the vectorizer.
 


> On Jun 24, 2019, at 10:53 AM, Tian, Xinmin <[hidden email]> wrote:
>
> To me, it is also an issue related to SIMD signature matching when the vectorizer kicks in. Losing info from FE to BE is not good in general.


Yes, we cannot loose such information. In particular, the three examples I reported are all generating i64 in the scalar function signature:

// Type 1
typedef _Complex int S;

// Type 2
typedef struct x{
int a;
int b;
} S;

// Type 3
typedef uint64_t S;

S foo(S a, S b) {
return ...;
}


On AArch64, the correspondent vector function signature in the three cases would be (for 2-lane unmasked vectorization):

// Type 1:

<4 x int> vectorized_foo(<4 x int>, <4 x int>)

// Type 2:

%a = type struct {I 32, i32}

<2 x %a* > vectorized_foo(<2 x %a*> , <2 x %a*>)

// Type 3:

<2 x i64> vectorized_foo(<2 x i64>, <2 x i64)

To make sure that the vectorizer knows how to map the scalar function parameters to the vector ones, we have to make sure that the original signature  information is stored somewhere.

I will work on this, and provide examples.

Suggestions are welcome.

Thank you

Francesco

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] RFC: Interface user provided vector functions with the vectorizer.

Joan Lluch via cfe-dev
In reply to this post by Joan Lluch via cfe-dev
I have an RFC for first-class complex types in LLVM IR pending for some
internal review.  I hope to post it soon.  That should help address this
problem.  Then the vector function signature generation could stay in
LLVM, if I'm understanding the issue correctly.

                     -David

Francesco Petrogalli via llvm-dev <[hidden email]> writes:

> Hi all - I am working with a colleague to provide an initial implementation of this.
>
> We encountered a problem when dealing with generating the vector signatures of functions that use complex data.
>
> In this proposal, we expect the SVFS component in the backed to
> demangle the name of the function in the attribute to be able to
> reconstruct the signature of the vector function from the scalar
> function signature.
>
> In case of Complex data, this doesn’t seem to be possible, because the
> information of “being a vector of 2 lanes” that is supposed to be
> carried by the complex scalar is lost in the transformation the data
> type in a “coerced” type.
>
> Consider these three types and the function `foo`:
>
> // Type 1
> typedef _Complex int S;
>
> // Type 2
> typedef struct x{
>  int a;
>  int b;
> } S;
>
> // Type 3
> typedef uint64_t S;
>
> S foo(S a, S b) {
>  return ...;
> }
>
> In all cases, the IR type of the parameters in `foo` is i64, therefore
> is not possible to distinguish what C type generated the signature of
> `foo`.
>
> I don’t know if this is going to be a problem for other architectures,
> but this is definitely a problem on AArch64 where we need to be able
> to generate the correct vector function signature for a specific
> simdlen(N) attached on `foo`. When simdlen(2), for type 1 the vector
> type is <4 x i32>, for type 2 is <2 x i64*>, for type 3 is <2 x i64>.
>
> Therefore, I would like to propose a change to the RFC, which would
> move the responsibility off generating the vector function signature
> from LLVM to clang.
>
> In particular, (and this I believe has already been mentioned by
> Johannes), we could use the @llvm.compiler.used intrinsic to mark
> those declaration that needs to stay in the IR and not optimized away
> OPT before reaching the vectorizer.
>
> In summary, the change would consist of:
>
> 1. Generate symbols declaration/definitions of the vector function
> with the mangled name in the IR, and mark it with
> @llvm-compiler.used. This could be done in CGOpenMPRuntime.cpp
> 2. Use the attribute vector-abs-variant defined in this RFC to map
> scalar names to vector ABI mangled name, and used the same redirection
> mechanism for the user provided vector name.
> 3. Move the “vector function signature generation” from the SVFS in LLVM to the openmp code generator of the clang frontend
>
> The SVFS query system would still work as in the current proposal. The
> only difference is that the vector function signature would be given
> by the frontend and not need to be recomputed.
>
> Here is an example of ho the IR would look like with this change:
>
> ```
> @llvm.compiler.used = appending global [1 x i8*] [i8* bitcast (<2 x i32> (<2 x i32>)* @f to i8*)], section "llvm.metadata"
>
> declare dso_local <2 x i32> @_ZGVnN2v_foo(<2 x i32> returned)
>
> declare i32 @foo(i32) #0
>
> ; other function definition, including the one provided by the user
> `my_vector_foo` if the user provided a definition and not just the
> declaration
>
> attribute #0 = {vector-function-abi-variant=“_ZGVnN2v_foo(my_vector_foo)"}
>
> ```
>
> If the attribute @llvm.compiler.used is not suitable for this (I am
> not aware of all implication of using it on a global symbol), maybe we
> could come up with a intrinsics that does what we need (avoid deleting
> declarations that are not used) and name it
> @llvm.vector.function.used?
>
> Please let me know what you think, I will submit an updated proposal next week.
>
> Kind regards,
>
> Francesco
>
>> On Jun 17, 2019, at 7:05 AM, Doerfert, Johannes <[hidden email]> wrote:
>>
>> I agree with Simon. This looks good conceptually. I have minor implementation comments but that can wait till the code reviews.
>>
>> Sorry for the delay and thanks for working on this.
>>
>> Get Outlook for Android
>>
>> From: Simon Moll <[hidden email]>
>> Sent: Monday, June 17, 2019 10:02:58 AM
>> To: Francesco Petrogalli; LLVM Development List; Clang Dev
>> Cc: Renato Golin; Finkel, Hal J.; Andrea Bocci; Elovikov, Andrei;
> Alexey Bataev; Doerfert, Johannes; Saito, Hideki; Tian, Xinmin; nd;
> Roman Lebedev; Philip Reames; Shawn Landden
>> Subject: Re: RFC: Interface user provided vector functions with the vectorizer.
>>  
>> Hi Francesco,
>>
>> On 6/11/19 10:55 PM, Francesco Petrogalli wrote:
>> > Dear all,
>> >
>> > I have re-written the proposal for interfacing user provided vector
>> > functions, originally posted in both llvm-dev and cfe-dev mailing
>> > list:
>> >
>> > "[RFC] Expose user provided vector function for auto-vectorization."
>> >
>> > The proposal looks quite different from the original submission,
>> > therefore I took the liberty to start a new thread.
>> >
>> > The original thread generated some good discussion. In particular,
>> > Simon Moll and Johannes Doerfert (CCed) have managed to provide good
>> > arguments for the following claims:
>> >
>> > 1. The Vector Function ABI name mangling scheme of a target is not
>> >     enough to describe all uses cases of function vectorization that
>> >     the compiler might end up needing to support in the future.
>> I think the new name of the attribute makes this point clear.
>> > 2. `declare variant` needs to be handled properly at IR level, to be
>> >     able to give the compiler the full OpenMP context of the directive.
>> >
>> > This proposal addresses those two concerns and other (I believe) minor
>> > concerns that have been raised in the previous thread.
>> >
>> > This proposal is provided with examples and a self assessment around
>> > extendibility.
>> >
>> > I have CCed all the people that have participated in the discussion so
>> > far, please let me know if you think I have missed anything of what
>> > have been raised.
>> >
>> > Kind regards,
>> >
>> > Francesco
>>
>> LGTM. Please add me as a reviewer for this when you post patches.
>>
>> Thanks!
>>
>> Simon
>>
>> >
>> > *** DRAFT OF THE PROPOSAL ***
>> >
>> > # SCOPE OF THE RFC : Interface user provided vector functions with the vectorizer.
>> >
>> > Because the users care about portability (across compilers, libraries
>> > and systems), I believe we have to base sour solution on a standard
>> > that describes the mapping from the scalar function to the vector
>> > function.
>> >
>> > Because OpenMP is standard and widely used, we should base our
>> > solution on the mechanisms that the standard provides, via the
>> > directives `declare simd` and `declare variant`, the latter when used
>> > in with the `simd` trait in the `construct` set.
>> >
>> > Please notice that:
>> >
>> > 1. The scope of the proposal is not implementing full support for
>> >     `pragma omp declare variant`.
>> > 2. The scope of the proposal is not enabling the vectorizer to do new
>> >     kind of vectorizations (e.g. RV-like vectorization described by
>> >     Simon).
>> > 3. The proposal aims to be extendible wrt 1. and 2.
>> > 4. The IR attribute introduced in this proposal is equivalent to the
>> >     one needed for the VecClone pass under development in
>> >     https://reviews.llvm.org/D22792> >
>> > # CLANG COMPONENTS
>> >
>> > A C function attribute, `clang_declare_simd_variant`, to attach to the
>> > scalar version. The attribute provides enough information to the
>> > compiler about the vector shape of the user defined function. The
>> > vector shapes handled by the attribute are those handled by the OpenMP
>> > standard via `declare simd` (and no more than that).
>> >
>> > 1. The function attribute handling in clang is crafted with the
>> >     requirement that it will be possible to re-use the same components
>> >     for the info generated by `declare variant` when used with a `simd`
>> >     traits in the `construct` set.
>> > 2. The attribute allows orthogonality with the vectorization that is
>> >     done via OpenMP: the user vector function is still exposed for
>> >     vectorization when not using `-fopenmp-[simd]` once the `declare
>> >     simd` and `declare variant` directive of OpenMP will be available
>> >     in the front-end.
>> >
>> > ## C function attribute: `clang_declare_simd_variant`
>> >
>> > The definition of this attribute has been crafted to match the
>> > semantics of `declare variant` for a `simd` construct described in
>> > OpenMP 5.0. I have added only the traits of the `device` set, `isa`
>> > and `arch`, which I believe are enough to cover for the use case of
>> > this proposal. If that is not the case, please provide an example,
>> > extending the attribute will be easy even once the current one is
>> > implemented.
>> >
>> > ```
>> > clang_declare_simd_variant(<variant-func-id>, <simd clauses>{, <context selector clauses>})
>> >
>> > <variant-func-id>:= The name of a function variant that is a base language identifier, or,
>> >                      for C++, a template-id.
>> >
>> > <simd clauses> := <simdlen>, <mask>{, <optional simd clauses>}
>> >
>> > <simdlen> := simdlen(<positive number>) | simdlen("scalable")
>> >
>> > <mask>    := inbranch | notinbranch
>> >
>> > <optional simd clauses> := <linear clause>
>> >                           | <uniform clause>
>> >                           | <align clause>  | {,<optional simd clauses>}
>> >
>> > <linear clause>  := linear_ref(<var>,<step>)
>> >                    | linear_var(<var>, <step>)
>> >                    | linear_uval(<var>, <step>)
>> >                    | linear(<var>, <step>)
>> >
>> > <step> := <var> | <non zero number>
>> >
>> > <uniform clause> := uniform(<var>)
>> >
>> > <align clause>   := align(<var>, <positive number>)
>> >
>> > <var> := Name of a parameter in the scalar function declaration/definition
>> >
>> > <non zero number> := ... | -2 | -1 | 1 | 2 | ...
>> >
>> > <positive number> := 1 | 2 | 3 | ...
>> >
>> > <context selector clauses> := {<isa>}{,} {<arch>}
>> >
>> > <isa> := isa(target-specific-value)
>> >
>> > <arch> := arch(target-specific-value)
>> >
>> > ```
>> >
>> > # LLVM COMPONENTS:
>> >
>> > ## VectorFunctionShape class
>> >
>> > The object `VectorFunctionShape` contains the information about the
>> > kind of vectorization available for an `llvm::Call`.
>> >
>> > The object `VectorFunctionShape` must contain the following information:
>> >
>> > 1. Vectorization Factor (or number or concurrent lanes executed by the
>> >     SIMD version of the function). Encoded by unsigned integer.
>> > 2. Whether the vector function is requested for scalable
>> >     vectorization, encoded by a boolean.
>> > 3. Information about masking / no masking, encoded by a boolean.
>> > 4. Information about the parameters, encoded in a container that
>> >     carries objects of type `ParamaterType`, to describe features like
>> >     `linear` and `uniform`.
>> > 5. Function name redirection, if a user has specified to use a custom
>> >     name instead of the Vector Function ABI ones.
>> >
>> > Items 1. to 5. represents the information stored in the
>> > `vector-function-abi-variant` attribute (see next section).
>> >
>> > The object can be extended in the future to include new vectorization
>> > kinds (for example the RV-like vectorization of the Region
>> > Vectorizer), or to add more context information that might come from
>> > other uses of OpenMP `declare variant`, or to add new Vector Function
>> > ABIs not based on OpenMP. Such information can be retrieved by
>> > attributes that will be added to describe the `Call` instance.
>> >
>> > ## IR Attribute
>> >
>> > We define a `vector-function-abi-variant` attribute that lists the
>> > mangled names produced via the mangling function of the Vector
>> > Function ABI rules.
>> >
>> > ```
>> > vector-function-abi-variant = "abi_mangled_name_01, abi_mangled_name_02(user_redirection),..."
>> > ```
>> >
>> > 1. Because we use only OpenMP `declare simd` vectorization, and
>> >     because we require a vector Function ABI, we make this explicit
>> >     in the name of the attribute.
>> > 2. Because the Vector Function ABIs encode all the information
>> >     needed to know the vectorization shape of the vector function in
>> >     the mangled names, we provide the mangled name via the
>> >     attribute.
>> > 3. Function names redirection is specified by enclosing the name of
>> >     the redirection in parenthesis, as in
>> >     `abi_mangled_name_02(user_redirection)`.
>> >
>> > ## Vector ABI Demangler
>> >
>> > The “Vector ABI demangler”, is the component that demangles the data
>> > in the `vector-function-abi-variant` attribute and that provides the
>> > instances of the class `VectorFunctionShape` that can be derived by
>> > the mangled names listed in the attribute.
>> >
>> > ## Query interface: Search Vector Function System (SVFS)
>> >
>> > An interface that can be queried by the LLVM components to understand
>> > whether or not a scalar function can be vectorized, and that retrieves
>> > the vector function to be used if such vector shape is available.
>> >
>> > 1. This component is going to be unrelated to OpenMP.
>> > 2. This component will use internally the demangler defined in the
>> >     previous section, but it will not expose any aspect of the Vector
>> >     Function ABI via its interface.
>> >
>> > The interface provides two methods.
>> >
>> > ```
>> > std::vector<VectorFunctionShape> SVFS::isFunctionVectorizable(llvm::CallInst * Call);
>> >
>> > llvm::Function * SVFS::getVectorizedFunction(llvm::CallInst * Call, VectorFunctionShape Info);
>> > ```
>> >
>> > The first method is used to list all the vector shapes that available
>> > and attached to a scalar function. An empty results means that no
>> > vector versions are available.
>> >
>> > The second method retrieves the information needed to build a call to
>> > a vector function with a specific `VectorFunctionShape` info.
>> >
>> > # (SELF) ASSESSMENT ON EXTENDIBILITY
>> >
>> >
>> > 1. Extending the C function attribute `clang_declare_simd_variant` to
>> >     new Vector Function ABIs that use OpenMP will be straightforward
>> >     because the attribute is tight to such ABIs and OpenMP.
>> > 2. The C attribute `clang_declare_simd_variant` and the `declare
>> >     variant` directive used for the `simd` trait will be sharing the
>> >     internals in clang, so adding the OpenMP functionality for `simd`
>> >     traits will be mostly handling the directive in the OpenMP
>> >     parser. How this should be done is described in
>> >     https://clang.llvm.org/docs/InternalsManual.html#how-to-add-an-attribute> > 3. The IR attribute `vector-function-abi-variant` is not to be
>> >     extended to represent other kind of vectorization other than those
>> >     handled by `declare simd` and that are handled with a Vector
>> >     Function ABI.
>> > 4. The IR attribute `vector-function-abi-variant` is not defined to be
>> >     extended to represent the information of `declare variant` in its
>> >     totality.
>> > 5. The IR attribute will not need to change when we will introduce non
>> >     vector function ABI vectorization (RV-like, reductions...) or when
>> >     we will decide to fully support `declare variant`. The information
>> >     it carries will not need to be invalidated, but just extended with
>> >     new attributes that will need to be handled by the
>> >     `VectorFunctionShape` class, in a similar way the
>> >     `llvm::FPMathOperator` does with the `llvm::FastMathFlags`, which
>> >     operates on individual attributes to describe an overall
>> >     functionality.
>> >
>> > # Examples
>> >
>> > ## Example 1
>> >
>> > Exposing an Advanced SIMD vector function when targeting Advanced SIMD
>> > in AArch64.
>> >
>> > ```
>> > double foo_01(double Input) __attribute__(clang_declare_simd_variant(“vector_foo_01", simdlen(2), notinbranch, isa("simd"));
>> >
>> > // Advanced SIMD version
>> > float64x2_t vector_foo_01(float64x2_t VectorInput);
>> > ```
>> >
>> > The resulting IR attribute is:
>> >
>> > ```
>> > attribute #0 = {vector-abi-variant="_ZGVnN2v_foo_01(vector_foo_01)"}
>> > ```
>> >
>> > ## Example 2
>> >
>> > Exposing an Advanced SIMD vector function when targeting Advanced SIMD
>> > in AArch64, but with the wrong signature. The user specifies a masked
>> > version of the function in the clauses of the attribute, the compiler
>> > throws an error suggesting the signature expected for
>> > ``vector_foo_02.``
>> >
>> > ```
>> > double foo_02(double Input) __attribute__(clang_declare_simd_variant(“vector_foo_02", simdlen(2), inbranch, isa("simd"));
>> >
>> > // Advanced SIMD version
>> > float64x2_t vector_foo_02(float64x2_t VectorInput);
>> > // (suggested) compiler error ->                      ^ Missing mask parameter of type `uint64x2_t`.
>> > ```
>> >
>> > ## Example 3
>> >
>> > Targeting `sincos`-like signatures.
>> >
>> > ```
>> > void foo_03(double Input, double * Output)
> __attribute__(clang_declare_simd_variant(“vector_foo_03", simdlen(2),
> notinbranch, linear(Output, 1), isa("simd"));
>> >
>> > // Advanced SIMD version
>> > void vector_foo_03(float64x2_t VectorInput, double * Output);
>> > ```
>> >
>> > The resulting IR attribute is:
>> >
>> > ```
>> > attribute #0 = {vector-abi-variant="_ZGVnN2vl8_foo_03(vector_foo_03)"}
>> > ```
>> > ## Example 4
>> >
>> > Scalable vectorization targeting SVE
>> >
>> > ```
>> > double foo_04(double Input)
> __attribute__(clang_declare_simd_variant(“vector_foo_04",
> simdlen("scalable"), notinbranch, isa("sve"));
>> >
>> > // SVE version
>> > svfloat64_t vector_foo_04(svfloat64_t VectorInput, svbool_t Mask);
>> > ```
>> >
>> > The resulting IR attribute is:
>> >
>> > ```
>> > attribute #0 = {vector-abi-variant="_ZGVsM2v_foo_04(vector_foo_04)"}
>> > ```
>> >
>> > ## Example 5
>> >
>> > Fixed length vectorization targeting SVE
>> >
>> > ```
>> > double foo_05(double Input) __attribute__(clang_declare_simd_variant(“vector_foo_05", simdlen(4), inbranch, isa("sve"));
>> >
>> > // Fixed-length SVE version
>> > svfloat64_t vector_foo_05(svfloat64_t VectorInput, svbool_t Mask);
>> > ```
>> >
>> > The resulting IR attribute is:
>> >
>> > ```
>> > attribute #0 = {vector-abi-variant="_ZGVsM2v_foo_04(vector_foo_04)"}
>> > ```
>> >
>> > ## Example 06
>> >
>> > This is an x86 example, equivalent to the one provided by Andrei
>> > Elovikow in
>> > http://lists.llvm.org/pipermail/llvm-dev/2019-June/132885.html. Godbolt
>> > rendering with ICC at https://godbolt.org/z/Of1NxZ> >
>> > ```
>> > float MyAdd(float* a, int b)
> __attribute__(clang_declare_simd_variant(“MyAddVec", simdlen(8),
> notinbranch, arch("core_2nd_gen_avx"))
>> > {
>> >    return *a + b;
>> > }
>> >
>> >
>> > __m256 MyAddVec(float* v_a, __m128i v_b1, __m128i v_b2);
>> > ```
>> >
>> > The resulting IR attribute is:
>> >
>> > ```
>> > attribute #0 = {vector-abi-variant="_ZGVbN8l4v_MyAdd(MyAddVec)"}
>> > ```
>> >
>> > ## Example showing interaction with `declare simd`
>> >
>> > ```
>> > #pragma omp declare simd linear(a) notinbranch
>> > float foo_06(float *a, int x)
> __attribute__(clang_declare_simd_variant(“vector_foo_06", simdlen(4),
> linear(a), notinbranch, arch("armv8.2-a+simd")) {
>> >      return *a + x;
>> > }
>> >
>> > // Advanced SIMD version
>> > float32x4_t vector_foo_06(float *a, int32x4_t vx) {
>> > // Custom implementation.
>> > }
>> > ```
>> >
>> > The resulting IR attribute is made of three symbols:
>> >
>> > 1. `_ZGVnN2l4v_foo_06` and `_ZGVnN4l4v_foo_06`, which represent the
>> >     ones the compiler builds by auto-vectorizing `foo_06` according to
>> >     the rule defined in the Vector Function ABI specifications for
>> >     AArch64.
>> > 2. `_ZGVnN4l4v_foo_06(vector_foo_06)`, which represents the
>> >     user-defined redirection of the 4-lane version of `foo_06` to the
>> >     custom implementation provided by the user when targeting Advanced
>> >     SIMD for version 8.2 of the A64 instruction set.
>> >
>> > ```
>> > attribute #0 = {vector-function-abi-variant="_ZGVnN2l4v_foo_06,_ZGVnN4l4v_foo_06,_ZGVnN4l4v_foo_06(vector_foo_06)"}
>> > ```
>> >
>> --
>>
>> Simon Moll
>> Researcher / PhD Student
>>
>> Compiler Design Lab (Prof. Hack)
>> Saarland University, Computer Science
>> Building E1.3, Room 4.31
>>
>> Tel. +49 (0)681 302-57521 : [hidden email]
>> Fax. +49 (0)681 302-3065  : http://compilers.cs.uni-saarland.de/people/moll
>
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Interface user provided vector functions with the vectorizer.

Joan Lluch via cfe-dev
In reply to this post by Joan Lluch via cfe-dev

 

For example, Type 2 case, scalar-foo used call by value while vector-foo used call by ref. The question Johannes is asking is whether we can decipher that after the fact, only by looking at the two function signatures, or need some more info (what kind, what’s minimal)? I think we need to list up cases of interest, and for each vector ABI of interest, we need to work on the requirements and determine whether deciphering after the fact is feasible.

 

I think we can make further progress on trivial cases (where FE doesn’t “change” type) while we continue working out the details on non-trivial cases.

 

Thanks,

Hideki

 

From: Doerfert, Johannes [mailto:[hidden email]]
Sent: Monday, June 24, 2019 9:21 AM
To: Francesco Petrogalli <[hidden email]>; Tian, Xinmin <[hidden email]>
Cc: Saito, Hideki <[hidden email]>; Simon Moll <[hidden email]>; LLVM Development List <[hidden email]>; Clang Dev <[hidden email]>; Renato Golin <[hidden email]>; Finkel, Hal J. <[hidden email]>; Andrea Bocci <[hidden email]>; Elovikov, Andrei <[hidden email]>; Alexey Bataev <[hidden email]>; nd <[hidden email]>; Roman Lebedev <[hidden email]>; Philip Reames <[hidden email]>; Shawn Landden <[hidden email]>
Subject: Re: RFC: Interface user provided vector functions with the vectorizer.

 

I mean, the FE will create only one of the 3 vector versions matching the one we want for a given vector length, wouldn't it? The question now is: can we with the scalar and one vector version correctly vectorize the call. If the answer is no, what is the minimal amount of information, in addition to the two version, we would need?


From: Francesco Petrogalli <[hidden email]>
Sent: Monday, June 24, 2019 6:06:12 PM
To: Tian, Xinmin
Cc: Doerfert, Johannes; Saito, Hideki; Simon Moll; LLVM Development List; Clang Dev; Renato Golin; Finkel, Hal J.; Andrea Bocci; Elovikov, Andrei; Alexey Bataev; nd; Roman Lebedev; Philip Reames; Shawn Landden
Subject: Re: RFC: Interface user provided vector functions with the vectorizer.

 



> On Jun 24, 2019, at 10:53 AM, Tian, Xinmin <[hidden email]> wrote:
>
> To me, it is also an issue related to SIMD signature matching when the vectorizer kicks in. Losing info from FE to BE is not good in general.


Yes, we cannot loose such information. In particular, the three examples I reported are all generating i64 in the scalar function signature:

// Type 1
typedef _Complex int S;

// Type 2
typedef struct x{
int a;
int b;
} S;

// Type 3
typedef uint64_t S;

S foo(S a, S b) {
return ...;
}


On AArch64, the correspondent vector function signature in the three cases would be (for 2-lane unmasked vectorization):

// Type 1:

<4 x int> vectorized_foo(<4 x int>, <4 x int>)

// Type 2:

%a = type struct {I 32, i32}

<2 x %a* > vectorized_foo(<2 x %a*> , <2 x %a*>)

// Type 3:

<2 x i64> vectorized_foo(<2 x i64>, <2 x i64)

To make sure that the vectorizer knows how to map the scalar function parameters to the vector ones, we have to make sure that the original signature  information is stored somewhere.

I will work on this, and provide examples.

Suggestions are welcome.

Thank you

Francesco


_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] RFC: Interface user provided vector functions with the vectorizer.

Joan Lluch via cfe-dev
In reply to this post by Joan Lluch via cfe-dev

That helps complex but not other structures that can be call-by-value.

Hideki

-----Original Message-----
From: David Greene [mailto:[hidden email]]
Sent: Monday, June 24, 2019 9:40 AM
To: Francesco Petrogalli via llvm-dev <[hidden email]>
Cc: Doerfert, Johannes <[hidden email]>; Francesco Petrogalli <[hidden email]>; nd <[hidden email]>; Andrea Bocci <[hidden email]>; Clang Dev <[hidden email]>; Saito, Hideki <[hidden email]>; Alexey Bataev <[hidden email]>
Subject: Re: [llvm-dev] RFC: Interface user provided vector functions with the vectorizer.

I have an RFC for first-class complex types in LLVM IR pending for some internal review.  I hope to post it soon.  That should help address this problem.  Then the vector function signature generation could stay in LLVM, if I'm understanding the issue correctly.

                     -David

Francesco Petrogalli via llvm-dev <[hidden email]> writes:

> Hi all - I am working with a colleague to provide an initial implementation of this.
>
> We encountered a problem when dealing with generating the vector signatures of functions that use complex data.
>
> In this proposal, we expect the SVFS component in the backed to
> demangle the name of the function in the attribute to be able to
> reconstruct the signature of the vector function from the scalar
> function signature.
>
> In case of Complex data, this doesn’t seem to be possible, because the
> information of “being a vector of 2 lanes” that is supposed to be
> carried by the complex scalar is lost in the transformation the data
> type in a “coerced” type.
>
> Consider these three types and the function `foo`:
>
> // Type 1
> typedef _Complex int S;
>
> // Type 2
> typedef struct x{
>  int a;
>  int b;
> } S;
>
> // Type 3
> typedef uint64_t S;
>
> S foo(S a, S b) {
>  return ...;
> }
>
> In all cases, the IR type of the parameters in `foo` is i64, therefore
> is not possible to distinguish what C type generated the signature of
> `foo`.
>
> I don’t know if this is going to be a problem for other architectures,
> but this is definitely a problem on AArch64 where we need to be able
> to generate the correct vector function signature for a specific
> simdlen(N) attached on `foo`. When simdlen(2), for type 1 the vector
> type is <4 x i32>, for type 2 is <2 x i64*>, for type 3 is <2 x i64>.
>
> Therefore, I would like to propose a change to the RFC, which would
> move the responsibility off generating the vector function signature
> from LLVM to clang.
>
> In particular, (and this I believe has already been mentioned by
> Johannes), we could use the @llvm.compiler.used intrinsic to mark
> those declaration that needs to stay in the IR and not optimized away
> OPT before reaching the vectorizer.
>
> In summary, the change would consist of:
>
> 1. Generate symbols declaration/definitions of the vector function
> with the mangled name in the IR, and mark it with @llvm-compiler.used.
> This could be done in CGOpenMPRuntime.cpp 2. Use the attribute
> vector-abs-variant defined in this RFC to map scalar names to vector
> ABI mangled name, and used the same redirection mechanism for the user
> provided vector name.
> 3. Move the “vector function signature generation” from the SVFS in
> LLVM to the openmp code generator of the clang frontend
>
> The SVFS query system would still work as in the current proposal. The
> only difference is that the vector function signature would be given
> by the frontend and not need to be recomputed.
>
> Here is an example of ho the IR would look like with this change:
>
> ```
> @llvm.compiler.used = appending global [1 x i8*] [i8* bitcast (<2 x i32> (<2 x i32>)* @f to i8*)], section "llvm.metadata"
>
> declare dso_local <2 x i32> @_ZGVnN2v_foo(<2 x i32> returned)
>
> declare i32 @foo(i32) #0
>
> ; other function definition, including the one provided by the user
> `my_vector_foo` if the user provided a definition and not just the
> declaration
>
> attribute #0 =
> {vector-function-abi-variant=“_ZGVnN2v_foo(my_vector_foo)"}
>
> ```
>
> If the attribute @llvm.compiler.used is not suitable for this (I am
> not aware of all implication of using it on a global symbol), maybe we
> could come up with a intrinsics that does what we need (avoid deleting
> declarations that are not used) and name it
> @llvm.vector.function.used?
>
> Please let me know what you think, I will submit an updated proposal next week.
>
> Kind regards,
>
> Francesco
>
>> On Jun 17, 2019, at 7:05 AM, Doerfert, Johannes <[hidden email]> wrote:
>>
>> I agree with Simon. This looks good conceptually. I have minor implementation comments but that can wait till the code reviews.
>>
>> Sorry for the delay and thanks for working on this.
>>
>> Get Outlook for Android
>>
>> From: Simon Moll <[hidden email]>
>> Sent: Monday, June 17, 2019 10:02:58 AM
>> To: Francesco Petrogalli; LLVM Development List; Clang Dev
>> Cc: Renato Golin; Finkel, Hal J.; Andrea Bocci; Elovikov, Andrei;
> Alexey Bataev; Doerfert, Johannes; Saito, Hideki; Tian, Xinmin; nd;
> Roman Lebedev; Philip Reames; Shawn Landden
>> Subject: Re: RFC: Interface user provided vector functions with the vectorizer.
>>  
>> Hi Francesco,
>>
>> On 6/11/19 10:55 PM, Francesco Petrogalli wrote:
>> > Dear all,
>> >
>> > I have re-written the proposal for interfacing user provided vector
>> > functions, originally posted in both llvm-dev and cfe-dev mailing
>> > list:
>> >
>> > "[RFC] Expose user provided vector function for auto-vectorization."
>> >
>> > The proposal looks quite different from the original submission,
>> > therefore I took the liberty to start a new thread.
>> >
>> > The original thread generated some good discussion. In particular,
>> > Simon Moll and Johannes Doerfert (CCed) have managed to provide
>> > good arguments for the following claims:
>> >
>> > 1. The Vector Function ABI name mangling scheme of a target is not
>> >     enough to describe all uses cases of function vectorization that
>> >     the compiler might end up needing to support in the future.
>> I think the new name of the attribute makes this point clear.
>> > 2. `declare variant` needs to be handled properly at IR level, to be
>> >     able to give the compiler the full OpenMP context of the directive.
>> >
>> > This proposal addresses those two concerns and other (I believe)
>> > minor concerns that have been raised in the previous thread.
>> >
>> > This proposal is provided with examples and a self assessment
>> > around extendibility.
>> >
>> > I have CCed all the people that have participated in the discussion
>> > so far, please let me know if you think I have missed anything of
>> > what have been raised.
>> >
>> > Kind regards,
>> >
>> > Francesco
>>
>> LGTM. Please add me as a reviewer for this when you post patches.
>>
>> Thanks!
>>
>> Simon
>>
>> >
>> > *** DRAFT OF THE PROPOSAL ***
>> >
>> > # SCOPE OF THE RFC : Interface user provided vector functions with the vectorizer.
>> >
>> > Because the users care about portability (across compilers,
>> > libraries and systems), I believe we have to base sour solution on
>> > a standard that describes the mapping from the scalar function to
>> > the vector function.
>> >
>> > Because OpenMP is standard and widely used, we should base our
>> > solution on the mechanisms that the standard provides, via the
>> > directives `declare simd` and `declare variant`, the latter when
>> > used in with the `simd` trait in the `construct` set.
>> >
>> > Please notice that:
>> >
>> > 1. The scope of the proposal is not implementing full support for
>> >     `pragma omp declare variant`.
>> > 2. The scope of the proposal is not enabling the vectorizer to do new
>> >     kind of vectorizations (e.g. RV-like vectorization described by
>> >     Simon).
>> > 3. The proposal aims to be extendible wrt 1. and 2.
>> > 4. The IR attribute introduced in this proposal is equivalent to the
>> >     one needed for the VecClone pass under development in
>> >     https://reviews.llvm.org/D22792> > # CLANG COMPONENTS
>> >
>> > A C function attribute, `clang_declare_simd_variant`, to attach to
>> > the scalar version. The attribute provides enough information to
>> > the compiler about the vector shape of the user defined function.
>> > The vector shapes handled by the attribute are those handled by the
>> > OpenMP standard via `declare simd` (and no more than that).
>> >
>> > 1. The function attribute handling in clang is crafted with the
>> >     requirement that it will be possible to re-use the same components
>> >     for the info generated by `declare variant` when used with a `simd`
>> >     traits in the `construct` set.
>> > 2. The attribute allows orthogonality with the vectorization that is
>> >     done via OpenMP: the user vector function is still exposed for
>> >     vectorization when not using `-fopenmp-[simd]` once the `declare
>> >     simd` and `declare variant` directive of OpenMP will be available
>> >     in the front-end.
>> >
>> > ## C function attribute: `clang_declare_simd_variant`
>> >
>> > The definition of this attribute has been crafted to match the
>> > semantics of `declare variant` for a `simd` construct described in
>> > OpenMP 5.0. I have added only the traits of the `device` set, `isa`
>> > and `arch`, which I believe are enough to cover for the use case of
>> > this proposal. If that is not the case, please provide an example,
>> > extending the attribute will be easy even once the current one is
>> > implemented.
>> >
>> > ```
>> > clang_declare_simd_variant(<variant-func-id>, <simd clauses>{,
>> > <context selector clauses>})
>> >
>> > <variant-func-id>:= The name of a function variant that is a base language identifier, or,
>> >                      for C++, a template-id.
>> >
>> > <simd clauses> := <simdlen>, <mask>{, <optional simd clauses>}
>> >
>> > <simdlen> := simdlen(<positive number>) | simdlen("scalable")
>> >
>> > <mask>    := inbranch | notinbranch
>> >
>> > <optional simd clauses> := <linear clause>
>> >                           | <uniform clause>
>> >                           | <align clause>  | {,<optional simd
>> > clauses>}
>> >
>> > <linear clause>  := linear_ref(<var>,<step>)
>> >                    | linear_var(<var>, <step>)
>> >                    | linear_uval(<var>, <step>)
>> >                    | linear(<var>, <step>)
>> >
>> > <step> := <var> | <non zero number>
>> >
>> > <uniform clause> := uniform(<var>)
>> >
>> > <align clause>   := align(<var>, <positive number>)
>> >
>> > <var> := Name of a parameter in the scalar function
>> > declaration/definition
>> >
>> > <non zero number> := ... | -2 | -1 | 1 | 2 | ...
>> >
>> > <positive number> := 1 | 2 | 3 | ...
>> >
>> > <context selector clauses> := {<isa>}{,} {<arch>}
>> >
>> > <isa> := isa(target-specific-value)
>> >
>> > <arch> := arch(target-specific-value)
>> >
>> > ```
>> >
>> > # LLVM COMPONENTS:
>> >
>> > ## VectorFunctionShape class
>> >
>> > The object `VectorFunctionShape` contains the information about the
>> > kind of vectorization available for an `llvm::Call`.
>> >
>> > The object `VectorFunctionShape` must contain the following information:
>> >
>> > 1. Vectorization Factor (or number or concurrent lanes executed by the
>> >     SIMD version of the function). Encoded by unsigned integer.
>> > 2. Whether the vector function is requested for scalable
>> >     vectorization, encoded by a boolean.
>> > 3. Information about masking / no masking, encoded by a boolean.
>> > 4. Information about the parameters, encoded in a container that
>> >     carries objects of type `ParamaterType`, to describe features like
>> >     `linear` and `uniform`.
>> > 5. Function name redirection, if a user has specified to use a custom
>> >     name instead of the Vector Function ABI ones.
>> >
>> > Items 1. to 5. represents the information stored in the
>> > `vector-function-abi-variant` attribute (see next section).
>> >
>> > The object can be extended in the future to include new
>> > vectorization kinds (for example the RV-like vectorization of the
>> > Region Vectorizer), or to add more context information that might
>> > come from other uses of OpenMP `declare variant`, or to add new
>> > Vector Function ABIs not based on OpenMP. Such information can be
>> > retrieved by attributes that will be added to describe the `Call` instance.
>> >
>> > ## IR Attribute
>> >
>> > We define a `vector-function-abi-variant` attribute that lists the
>> > mangled names produced via the mangling function of the Vector
>> > Function ABI rules.
>> >
>> > ```
>> > vector-function-abi-variant = "abi_mangled_name_01, abi_mangled_name_02(user_redirection),..."
>> > ```
>> >
>> > 1. Because we use only OpenMP `declare simd` vectorization, and
>> >     because we require a vector Function ABI, we make this explicit
>> >     in the name of the attribute.
>> > 2. Because the Vector Function ABIs encode all the information
>> >     needed to know the vectorization shape of the vector function in
>> >     the mangled names, we provide the mangled name via the
>> >     attribute.
>> > 3. Function names redirection is specified by enclosing the name of
>> >     the redirection in parenthesis, as in
>> >     `abi_mangled_name_02(user_redirection)`.
>> >
>> > ## Vector ABI Demangler
>> >
>> > The “Vector ABI demangler”, is the component that demangles the
>> > data in the `vector-function-abi-variant` attribute and that
>> > provides the instances of the class `VectorFunctionShape` that can
>> > be derived by the mangled names listed in the attribute.
>> >
>> > ## Query interface: Search Vector Function System (SVFS)
>> >
>> > An interface that can be queried by the LLVM components to
>> > understand whether or not a scalar function can be vectorized, and
>> > that retrieves the vector function to be used if such vector shape is available.
>> >
>> > 1. This component is going to be unrelated to OpenMP.
>> > 2. This component will use internally the demangler defined in the
>> >     previous section, but it will not expose any aspect of the Vector
>> >     Function ABI via its interface.
>> >
>> > The interface provides two methods.
>> >
>> > ```
>> > std::vector<VectorFunctionShape>
>> > SVFS::isFunctionVectorizable(llvm::CallInst * Call);
>> >
>> > llvm::Function * SVFS::getVectorizedFunction(llvm::CallInst * Call,
>> > VectorFunctionShape Info); ```
>> >
>> > The first method is used to list all the vector shapes that
>> > available and attached to a scalar function. An empty results means
>> > that no vector versions are available.
>> >
>> > The second method retrieves the information needed to build a call
>> > to a vector function with a specific `VectorFunctionShape` info.
>> >
>> > # (SELF) ASSESSMENT ON EXTENDIBILITY
>> >
>> >
>> > 1. Extending the C function attribute `clang_declare_simd_variant` to
>> >     new Vector Function ABIs that use OpenMP will be straightforward
>> >     because the attribute is tight to such ABIs and OpenMP.
>> > 2. The C attribute `clang_declare_simd_variant` and the `declare
>> >     variant` directive used for the `simd` trait will be sharing the
>> >     internals in clang, so adding the OpenMP functionality for `simd`
>> >     traits will be mostly handling the directive in the OpenMP
>> >     parser. How this should be done is described in
>> >     https://clang.llvm.org/docs/InternalsManual.html#how-to-add-an-attribute> > 3. The IR attribute `vector-function-abi-variant` is not to be
>> >     extended to represent other kind of vectorization other than those
>> >     handled by `declare simd` and that are handled with a Vector
>> >     Function ABI.
>> > 4. The IR attribute `vector-function-abi-variant` is not defined to be
>> >     extended to represent the information of `declare variant` in its
>> >     totality.
>> > 5. The IR attribute will not need to change when we will introduce non
>> >     vector function ABI vectorization (RV-like, reductions...) or when
>> >     we will decide to fully support `declare variant`. The information
>> >     it carries will not need to be invalidated, but just extended with
>> >     new attributes that will need to be handled by the
>> >     `VectorFunctionShape` class, in a similar way the
>> >     `llvm::FPMathOperator` does with the `llvm::FastMathFlags`, which
>> >     operates on individual attributes to describe an overall
>> >     functionality.
>> >
>> > # Examples
>> >
>> > ## Example 1
>> >
>> > Exposing an Advanced SIMD vector function when targeting Advanced
>> > SIMD in AArch64.
>> >
>> > ```
>> > double foo_01(double Input)
>> > __attribute__(clang_declare_simd_variant(“vector_foo_01",
>> > simdlen(2), notinbranch, isa("simd"));
>> >
>> > // Advanced SIMD version
>> > float64x2_t vector_foo_01(float64x2_t VectorInput); ```
>> >
>> > The resulting IR attribute is:
>> >
>> > ```
>> > attribute #0 =
>> > {vector-abi-variant="_ZGVnN2v_foo_01(vector_foo_01)"}
>> > ```
>> >
>> > ## Example 2
>> >
>> > Exposing an Advanced SIMD vector function when targeting Advanced
>> > SIMD in AArch64, but with the wrong signature. The user specifies a
>> > masked version of the function in the clauses of the attribute, the
>> > compiler throws an error suggesting the signature expected for
>> > ``vector_foo_02.``
>> >
>> > ```
>> > double foo_02(double Input)
>> > __attribute__(clang_declare_simd_variant(“vector_foo_02",
>> > simdlen(2), inbranch, isa("simd"));
>> >
>> > // Advanced SIMD version
>> > float64x2_t vector_foo_02(float64x2_t VectorInput);
>> > // (suggested) compiler error ->                      ^ Missing mask parameter of type `uint64x2_t`.
>> > ```
>> >
>> > ## Example 3
>> >
>> > Targeting `sincos`-like signatures.
>> >
>> > ```
>> > void foo_03(double Input, double * Output)
> __attribute__(clang_declare_simd_variant(“vector_foo_03", simdlen(2),
> notinbranch, linear(Output, 1), isa("simd"));
>> >
>> > // Advanced SIMD version
>> > void vector_foo_03(float64x2_t VectorInput, double * Output); ```
>> >
>> > The resulting IR attribute is:
>> >
>> > ```
>> > attribute #0 =
>> > {vector-abi-variant="_ZGVnN2vl8_foo_03(vector_foo_03)"}
>> > ```
>> > ## Example 4
>> >
>> > Scalable vectorization targeting SVE
>> >
>> > ```
>> > double foo_04(double Input)
> __attribute__(clang_declare_simd_variant(“vector_foo_04",
> simdlen("scalable"), notinbranch, isa("sve"));
>> >
>> > // SVE version
>> > svfloat64_t vector_foo_04(svfloat64_t VectorInput, svbool_t Mask);
>> > ```
>> >
>> > The resulting IR attribute is:
>> >
>> > ```
>> > attribute #0 =
>> > {vector-abi-variant="_ZGVsM2v_foo_04(vector_foo_04)"}
>> > ```
>> >
>> > ## Example 5
>> >
>> > Fixed length vectorization targeting SVE
>> >
>> > ```
>> > double foo_05(double Input)
>> > __attribute__(clang_declare_simd_variant(“vector_foo_05",
>> > simdlen(4), inbranch, isa("sve"));
>> >
>> > // Fixed-length SVE version
>> > svfloat64_t vector_foo_05(svfloat64_t VectorInput, svbool_t Mask);
>> > ```
>> >
>> > The resulting IR attribute is:
>> >
>> > ```
>> > attribute #0 =
>> > {vector-abi-variant="_ZGVsM2v_foo_04(vector_foo_04)"}
>> > ```
>> >
>> > ## Example 06
>> >
>> > This is an x86 example, equivalent to the one provided by Andrei
>> > Elovikow in
>> > http://lists.llvm.org/pipermail/llvm-dev/2019-June/132885.html.
>> > Godbolt rendering with ICC at https://godbolt.org/z/Of1NxZ> > ```
>> > float MyAdd(float* a, int b)
> __attribute__(clang_declare_simd_variant(“MyAddVec", simdlen(8),
> notinbranch, arch("core_2nd_gen_avx"))
>> > {
>> >    return *a + b;
>> > }
>> >
>> >
>> > __m256 MyAddVec(float* v_a, __m128i v_b1, __m128i v_b2); ```
>> >
>> > The resulting IR attribute is:
>> >
>> > ```
>> > attribute #0 = {vector-abi-variant="_ZGVbN8l4v_MyAdd(MyAddVec)"}
>> > ```
>> >
>> > ## Example showing interaction with `declare simd`
>> >
>> > ```
>> > #pragma omp declare simd linear(a) notinbranch float foo_06(float
>> > *a, int x)
> __attribute__(clang_declare_simd_variant(“vector_foo_06", simdlen(4),
> linear(a), notinbranch, arch("armv8.2-a+simd")) {
>> >      return *a + x;
>> > }
>> >
>> > // Advanced SIMD version
>> > float32x4_t vector_foo_06(float *a, int32x4_t vx) { // Custom
>> > implementation.
>> > }
>> > ```
>> >
>> > The resulting IR attribute is made of three symbols:
>> >
>> > 1. `_ZGVnN2l4v_foo_06` and `_ZGVnN4l4v_foo_06`, which represent the
>> >     ones the compiler builds by auto-vectorizing `foo_06` according to
>> >     the rule defined in the Vector Function ABI specifications for
>> >     AArch64.
>> > 2. `_ZGVnN4l4v_foo_06(vector_foo_06)`, which represents the
>> >     user-defined redirection of the 4-lane version of `foo_06` to the
>> >     custom implementation provided by the user when targeting Advanced
>> >     SIMD for version 8.2 of the A64 instruction set.
>> >
>> > ```
>> > attribute #0 =
>> > {vector-function-abi-variant="_ZGVnN2l4v_foo_06,_ZGVnN4l4v_foo_06,_
>> > ZGVnN4l4v_foo_06(vector_foo_06)"}
>> > ```
>> >
>> --
>>
>> Simon Moll
>> Researcher / PhD Student
>>
>> Compiler Design Lab (Prof. Hack)
>> Saarland University, Computer Science Building E1.3, Room 4.31
>>
>> Tel. +49 (0)681 302-57521 : [hidden email] Fax. +49 (0)681
>> 302-3065  : http://compilers.cs.uni-saarland.de/people/moll
>
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Interface user provided vector functions with the vectorizer.

Joan Lluch via cfe-dev
In reply to this post by Joan Lluch via cfe-dev


> On Jun 24, 2019, at 12:20 PM, Saito, Hideki <[hidden email]> wrote:
>
>  
> For example, Type 2 case, scalar-foo used call by value while vector-foo used call by ref.


Yes, the call-by-ref is an important feature that can be used with linear modifiers.

> The question Johannes is asking is whether we can decipher that after the fact, only by looking at the two function signatures, or need some more info (what kind, what’s minimal)?

In the draft implementation we have been working on, we came up with a ParamKind enum that holds the following information:

enum class ParamKind
{
    Vector,
    OMP_Linear,
    OMP_LinearRef,
    OMP_LinearVal,
    OMP_LinearUVal,
    OMP_LinearPos,
    OMP_LinearValPos,
    OMP_LinearRefPos,
    OMP_LinearUValPos,
    OMP_Uniform
};

The enum is used to classify the `ParameterType`, a class that is attached to each parameter and describes things like uniformity, linearity (with and without modifiers). The list of parameter types is then stored in the VectorFunctionShape:

struct VectorFunctionShape {
            unsigned VF; // Vectorization factor
            bool IsMasked;
            bool IsScalable;
            ISAKind ISA;
            std::vector<ParamType> Parameters;
};

Here OpenMP is used to classify the parameter types (OMP_*), but nothing prevents the ParamKind and the VectorFunctionShape to be extended to be able to handle other vector paradigms.

I think that we have to handle all 9 different linear and uniform cases in the ParamKind separately, because there is no way to get such information from the vector function shape.

> I think we need to list up cases of interest, and for each vector ABI of interest, we need to work on the requirements and determine whether deciphering after the fact is feasible.
>  


I will provide the cases for AArch64.


> I think we can make further progress on trivial cases (where FE doesn’t “change” type) while we continue working out the details on non-trivial cases.
>  


I think we can agree to proceed these way. The SVFS and the VectorShapeInfo are (I believe) designed with extendibility as a requirements. If we need more metadata to represent some specific cases, we will extend the SVFS internal to handle such metadata.

I will update the RFC by moving the vector function signature generation in the FE.

Thank you everybody for their input, and for your patience. This is proving harder than expected! :)

Francesco

> Thanks,
> Hideki
>  
> From: Doerfert, Johannes [mailto:[hidden email]]
> Sent: Monday, June 24, 2019 9:21 AM
> To: Francesco Petrogalli <[hidden email]>; Tian, Xinmin <[hidden email]>
> Cc: Saito, Hideki <[hidden email]>; Simon Moll <[hidden email]>; LLVM Development List <[hidden email]>; Clang Dev <[hidden email]>; Renato Golin <[hidden email]>; Finkel, Hal J. <[hidden email]>; Andrea Bocci <[hidden email]>; Elovikov, Andrei <[hidden email]>; Alexey Bataev <[hidden email]>; nd <[hidden email]>; Roman Lebedev <[hidden email]>; Philip Reames <[hidden email]>; Shawn Landden <[hidden email]>
> Subject: Re: RFC: Interface user provided vector functions with the vectorizer.
>  
> I mean, the FE will create only one of the 3 vector versions matching the one we want for a given vector length, wouldn't it? The question now is: can we with the scalar and one vector version correctly vectorize the call. If the answer is no, what is the minimal amount of information, in addition to the two version, we would need?
>
> Get Outlook for Android
>  
> From: Francesco Petrogalli <[hidden email]>
> Sent: Monday, June 24, 2019 6:06:12 PM
> To: Tian, Xinmin
> Cc: Doerfert, Johannes; Saito, Hideki; Simon Moll; LLVM Development List; Clang Dev; Renato Golin; Finkel, Hal J.; Andrea Bocci; Elovikov, Andrei; Alexey Bataev; nd; Roman Lebedev; Philip Reames; Shawn Landden
> Subject: Re: RFC: Interface user provided vector functions with the vectorizer.
>  
>
>
> > On Jun 24, 2019, at 10:53 AM, Tian, Xinmin <[hidden email]> wrote:
> >
> > To me, it is also an issue related to SIMD signature matching when the vectorizer kicks in. Losing info from FE to BE is not good in general.
> >  
>
> Yes, we cannot loose such information. In particular, the three examples I reported are all generating i64 in the scalar function signature:
>
> // Type 1
> typedef _Complex int S;
>
> // Type 2
> typedef struct x{
> int a;
> int b;
> } S;
>
> // Type 3
> typedef uint64_t S;
>
> S foo(S a, S b) {
> return ...;
> }
>
>
> On AArch64, the correspondent vector function signature in the three cases would be (for 2-lane unmasked vectorization):
>
> // Type 1:
>
> <4 x int> vectorized_foo(<4 x int>, <4 x int>)
>
> // Type 2:
>
> %a = type struct {I 32, i32}
>
> <2 x %a* > vectorized_foo(<2 x %a*> , <2 x %a*>)
>
> // Type 3:
>
> <2 x i64> vectorized_foo(<2 x i64>, <2 x i64)
>
> To make sure that the vectorizer knows how to map the scalar function parameters to the vector ones, we have to make sure that the original signature  information is stored somewhere.
>
> I will work on this, and provide examples.
>
> Suggestions are welcome.
>
> Thank you
>
> Francesco

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] RFC: Interface user provided vector functions with the vectorizer.

Joan Lluch via cfe-dev
In reply to this post by Joan Lluch via cfe-dev


> On Jun 24, 2019, at 12:22 PM, Saito, Hideki <[hidden email]> wrote:
>
> That helps complex but not other structures that can be call-by-value.


Still, having complex types in IR will be very helpful also for other reasons! :)

I am looking forward to see your proposal David, thank you.

Francesco
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] RFC: Interface user provided vector functions with the vectorizer.

Joan Lluch via cfe-dev

>Still, having complex types in IR will be very helpful also for other reasons! :)

Yes, definitely.

>I am looking forward to see your proposal David, thank you.

+1.

Hideki

-----Original Message-----
From: Francesco Petrogalli [mailto:[hidden email]]
Sent: Monday, June 24, 2019 11:28 AM
To: Saito, Hideki <[hidden email]>
Cc: David Greene <[hidden email]>; Francesco Petrogalli via llvm-dev <[hidden email]>; Doerfert, Johannes <[hidden email]>; nd <[hidden email]>; Andrea Bocci <[hidden email]>; Clang Dev <[hidden email]>; Alexey Bataev <[hidden email]>
Subject: Re: [llvm-dev] RFC: Interface user provided vector functions with the vectorizer.



> On Jun 24, 2019, at 12:22 PM, Saito, Hideki <[hidden email]> wrote:
>
> That helps complex but not other structures that can be call-by-value.


Still, having complex types in IR will be very helpful also for other reasons! :)

I am looking forward to see your proposal David, thank you.

Francesco
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Interface user provided vector functions with the vectorizer.

Joan Lluch via cfe-dev
In reply to this post by Joan Lluch via cfe-dev

>Thank you everybody for their input, and for your patience. This is proving harder than expected! :)

Thank you for doing the hard part of the work.

Hideki

-----Original Message-----
From: Francesco Petrogalli [mailto:[hidden email]]
Sent: Monday, June 24, 2019 11:26 AM
To: Saito, Hideki <[hidden email]>
Cc: Doerfert, Johannes <[hidden email]>; Tian, Xinmin <[hidden email]>; Simon Moll <[hidden email]>; LLVM Development List <[hidden email]>; Clang Dev <[hidden email]>; Renato Golin <[hidden email]>; Finkel, Hal J. <[hidden email]>; Andrea Bocci <[hidden email]>; Elovikov, Andrei <[hidden email]>; Alexey Bataev <[hidden email]>; nd <[hidden email]>; Roman Lebedev <[hidden email]>; Philip Reames <[hidden email]>; Shawn Landden <[hidden email]>
Subject: Re: RFC: Interface user provided vector functions with the vectorizer.



> On Jun 24, 2019, at 12:20 PM, Saito, Hideki <[hidden email]> wrote:
>
>  
> For example, Type 2 case, scalar-foo used call by value while vector-foo used call by ref.


Yes, the call-by-ref is an important feature that can be used with linear modifiers.

> The question Johannes is asking is whether we can decipher that after the fact, only by looking at the two function signatures, or need some more info (what kind, what’s minimal)?

In the draft implementation we have been working on, we came up with a ParamKind enum that holds the following information:

enum class ParamKind
{
    Vector,
    OMP_Linear,
    OMP_LinearRef,
    OMP_LinearVal,
    OMP_LinearUVal,
    OMP_LinearPos,
    OMP_LinearValPos,
    OMP_LinearRefPos,
    OMP_LinearUValPos,
    OMP_Uniform
};

The enum is used to classify the `ParameterType`, a class that is attached to each parameter and describes things like uniformity, linearity (with and without modifiers). The list of parameter types is then stored in the VectorFunctionShape:

struct VectorFunctionShape {
            unsigned VF; // Vectorization factor
            bool IsMasked;
            bool IsScalable;
            ISAKind ISA;
            std::vector<ParamType> Parameters; };

Here OpenMP is used to classify the parameter types (OMP_*), but nothing prevents the ParamKind and the VectorFunctionShape to be extended to be able to handle other vector paradigms.

I think that we have to handle all 9 different linear and uniform cases in the ParamKind separately, because there is no way to get such information from the vector function shape.

> I think we need to list up cases of interest, and for each vector ABI of interest, we need to work on the requirements and determine whether deciphering after the fact is feasible.
>  


I will provide the cases for AArch64.


> I think we can make further progress on trivial cases (where FE doesn’t “change” type) while we continue working out the details on non-trivial cases.
>  


I think we can agree to proceed these way. The SVFS and the VectorShapeInfo are (I believe) designed with extendibility as a requirements. If we need more metadata to represent some specific cases, we will extend the SVFS internal to handle such metadata.

I will update the RFC by moving the vector function signature generation in the FE.

Thank you everybody for their input, and for your patience. This is proving harder than expected! :)

Francesco

> Thanks,
> Hideki
>  
> From: Doerfert, Johannes [mailto:[hidden email]]
> Sent: Monday, June 24, 2019 9:21 AM
> To: Francesco Petrogalli <[hidden email]>; Tian, Xinmin
> <[hidden email]>
> Cc: Saito, Hideki <[hidden email]>; Simon Moll
> <[hidden email]>; LLVM Development List
> <[hidden email]>; Clang Dev <[hidden email]>; Renato
> Golin <[hidden email]>; Finkel, Hal J. <[hidden email]>; Andrea
> Bocci <[hidden email]>; Elovikov, Andrei
> <[hidden email]>; Alexey Bataev <[hidden email]>; nd
> <[hidden email]>; Roman Lebedev <[hidden email]>; Philip Reames
> <[hidden email]>; Shawn Landden <[hidden email]>
> Subject: Re: RFC: Interface user provided vector functions with the vectorizer.
>  
> I mean, the FE will create only one of the 3 vector versions matching the one we want for a given vector length, wouldn't it? The question now is: can we with the scalar and one vector version correctly vectorize the call. If the answer is no, what is the minimal amount of information, in addition to the two version, we would need?
>
> Get Outlook for Android
>  
> From: Francesco Petrogalli <[hidden email]>
> Sent: Monday, June 24, 2019 6:06:12 PM
> To: Tian, Xinmin
> Cc: Doerfert, Johannes; Saito, Hideki; Simon Moll; LLVM Development
> List; Clang Dev; Renato Golin; Finkel, Hal J.; Andrea Bocci; Elovikov,
> Andrei; Alexey Bataev; nd; Roman Lebedev; Philip Reames; Shawn Landden
> Subject: Re: RFC: Interface user provided vector functions with the vectorizer.
>  
>
>
> > On Jun 24, 2019, at 10:53 AM, Tian, Xinmin <[hidden email]> wrote:
> >
> > To me, it is also an issue related to SIMD signature matching when the vectorizer kicks in. Losing info from FE to BE is not good in general.
> >  
>
> Yes, we cannot loose such information. In particular, the three examples I reported are all generating i64 in the scalar function signature:
>
> // Type 1
> typedef _Complex int S;
>
> // Type 2
> typedef struct x{
> int a;
> int b;
> } S;
>
> // Type 3
> typedef uint64_t S;
>
> S foo(S a, S b) {
> return ...;
> }
>
>
> On AArch64, the correspondent vector function signature in the three cases would be (for 2-lane unmasked vectorization):
>
> // Type 1:
>
> <4 x int> vectorized_foo(<4 x int>, <4 x int>)
>
> // Type 2:
>
> %a = type struct {I 32, i32}
>
> <2 x %a* > vectorized_foo(<2 x %a*> , <2 x %a*>)
>
> // Type 3:
>
> <2 x i64> vectorized_foo(<2 x i64>, <2 x i64)
>
> To make sure that the vectorizer knows how to map the scalar function parameters to the vector ones, we have to make sure that the original signature  information is stored somewhere.
>
> I will work on this, and provide examples.
>
> Suggestions are welcome.
>
> Thank you
>
> Francesco

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
12