[RFC] Re-use OpenCL address space attributes for SYCL


[RFC] Re-use OpenCL address space attributes for SYCL

Vassil Vassilev via cfe-dev

Hi,

 

We would like to re-use OpenCL address space attributes for SYCL to target

SPIR-V format and enable efficient memory access on GPUs.

 

```c++

  __attribute__((opencl_global))

  __attribute__((opencl_local))

  __attribute__((opencl_private))

```

 

The first patch, which enables conversion between pointers annotated with an
OpenCL address space attribute and "default" pointers, is being reviewed here:
https://reviews.llvm.org/D80932.
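To make the kind of conversion under review concrete, here is a minimal sketch. The `DEMO_*` macros and the function names are hypothetical and exist only for this illustration. The sketch assumes the attributes are enabled only when `__SYCL_DEVICE_ONLY__` is defined, so a standard host compiler sees plain pointers, and it assumes the device compiler accepts the implicit conversion proposed in D80932.

```cpp
#include <cassert>

// Hypothetical helper macros: expand to the OpenCL address space attributes
// only in SYCL device compilation and to nothing for a standard host compiler.
#if defined(__SYCL_DEVICE_ONLY__)
#define DEMO_GLOBAL __attribute__((opencl_global))
#define DEMO_LOCAL __attribute__((opencl_local))
#define DEMO_PRIVATE __attribute__((opencl_private))
#else
#define DEMO_GLOBAL
#define DEMO_LOCAL
#define DEMO_PRIVATE
#endif

// Pointer to an int in the global address space on device, plain int* on host.
using global_int_ptr = DEMO_GLOBAL int *;

// Converts an annotated pointer to a "default" pointer. On device this relies
// on the conversion discussed in the review (an assumption of this sketch);
// on host both types are identical.
int *to_default(global_int_ptr p) { return p; }

// Reads through the converted "default" pointer.
int load_global(global_int_ptr p) {
  int *q = to_default(p);
  return *q;
}
```

On a host compiler `load_global(&x)` simply dereferences an ordinary pointer, which is the behavior the shared host/device code depends on.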

 

Before moving further with the implementation we would like to discuss two

questions raised in review comments (https://reviews.llvm.org/D80932#2085848).

 

## Using attributes to annotate memory allocations

 

The introduction of the SYCL 1.2.1 specification describes the multiple
compilation flows intended by the design:

 

> SYCL is designed to allow a compilation flow where the source file is passed

> through multiple different compilers, including a standard C++ host compiler

> of the developer’s choice, and where the resulting application combines the

> results of these compilation passes. This is distinct from a single-source

> flow that might use language extensions that preclude the use of a standard

> host compiler. The SYCL standard does not preclude the use of a single

> compiler flow, but is designed to not require it.

> 

> The advantages of this design are two-fold. First, it offers better

> integration with existing tool chains. An application that already builds

> using a chosen compiler can continue to do so when SYCL code is added. Using

> the SYCL tools on a source file within a project will both compile for an

> OpenCL device and let the same source file be compiled using the same host

> compiler that the rest of the project is compiled with. Linking and library

> relationships are unaffected. This design simplifies porting of pre-existing

> applications to SYCL. Second, the design allows the optimal compiler to be

> chosen for each device where different vendors may provide optimized

> tool-chains.

> 

> SYCL is designed to be as close to standard C++ as possible. In practice,

> this means that as long as no dependence is created on SYCL’s integration

> with OpenCL, a standard C++ compiler can compile the SYCL programs and they

> will run correctly on host CPU. Any use of specialized low-level features

> can be masked using the C preprocessor in the same way that

> compiler-specific intrinsics may be hidden to ensure portability between

> different host compilers.

 

Following this approach, SYCL uses C++ templates to represent pointers to
disjoint memory regions on an accelerator, so that the same code compiles with
both a standard C++ toolchain and a SYCL compiler toolchain.

 

For instance:

 

```c++

#if !defined(__SYCL_DEVICE_ONLY__)

// CPU/host implementation
template <typename T, address_space AS> class multi_ptr {
  T *data; // ignore address space parameter on CPU

public:
  T *get_pointer() { return data; }
};

#else

// SYCL device mode is ON and we can use non-standard annotations
// GPU/accelerator implementation
template <typename T, address_space AS> class multi_ptr {
  // GetAnnotatedPointer<T, global>::type == "__attribute__((opencl_global)) T"
  using pointer_t = typename GetAnnotatedPointer<T, AS>::type *;

  pointer_t data;

public:
  pointer_t get_pointer() { return data; }
};

#endif

```
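The `GetAnnotatedPointer` trait is referenced above but not shown. One possible shape for it is sketched below. The `address_space` enumerators and the `DEMO_AS_*` macros are assumptions made for this illustration and are not taken from the patch; the macros expand to nothing on a host compiler so the trait degenerates to the plain type.

```cpp
#include <type_traits>

// Hypothetical enumeration; real SYCL headers define their own address_space.
enum class address_space { global, local, private_ };

// Hypothetical macros: attributes are applied only in SYCL device compilation.
#if defined(__SYCL_DEVICE_ONLY__)
#define DEMO_AS_GLOBAL __attribute__((opencl_global))
#define DEMO_AS_LOCAL __attribute__((opencl_local))
#define DEMO_AS_PRIVATE __attribute__((opencl_private))
#else
#define DEMO_AS_GLOBAL
#define DEMO_AS_LOCAL
#define DEMO_AS_PRIVATE
#endif

// Primary template is left undefined: every address space needs a mapping.
template <typename T, address_space AS> struct GetAnnotatedPointer;

// One specialization per address space selects the matching attribute.
template <typename T> struct GetAnnotatedPointer<T, address_space::global> {
  using type = DEMO_AS_GLOBAL T;
};
template <typename T> struct GetAnnotatedPointer<T, address_space::local> {
  using type = DEMO_AS_LOCAL T;
};
template <typename T> struct GetAnnotatedPointer<T, address_space::private_> {
  using type = DEMO_AS_PRIVATE T;
};

#if !defined(__SYCL_DEVICE_ONLY__)
// On host the annotation vanishes, so the pointer type is an ordinary T*.
static_assert(
    std::is_same<GetAnnotatedPointer<int, address_space::global>::type *,
                 int *>::value,
    "host pointers carry no address space");
#endif
```

Keeping the attribute selection in a trait confines the non-standard syntax to one place, which is what lets the rest of the header stay valid for a standard host compiler.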

 

Users can use the `multi_ptr` class as a regular user-defined type in regular C++ code:

 

```c++

int *UserFunc(multi_ptr<int, global> ptr) {

  /// ...

  return ptr.get_pointer();

}

```

 

Depending on the compiler mode, `multi_ptr` will either annotate its internal
data with an address space attribute or not.

 

## Implementation details

 

OpenCL attributes are handled by the Parser in all modes. OpenCL mode has
specific logic in the Sema and CodeGen components for these attributes.

 

The SYCL compiler re-uses the generic support for these attributes as is and
modifies the Sema and CodeGen libraries. The main difference from OpenCL mode is
that SYCL mode (similar to other single-source GPU programming modes like
OpenMP/CUDA/HIP) keeps the "default" address space for declarations without
address space attribute annotations. This keeps the code shared between the host
and device semantically correct for both compilers: the regular C++ host
compiler and the SYCL compiler.

 

To make all pointers without an explicit address space qualifier point to the
generic address space, we updated the SPIR target address space map, which
currently maps default pointers to the "private" address space. We made this
change specific to SYCL by adding a SYCL environment component to the Triple, to
avoid impact on other modes targeting the SPIR target (e.g. OpenCL). We would be
glad to get feedback from the community on whether this mapping change is
applicable to all the modes, so that additional specialization can be avoided
(e.g.
[AMDGPU](https://github.com/llvm/llvm-project/blob/master/clang/lib/Basic/Targets/AMDGPU.cpp#L329)
maps default to the "generic" address space with a couple of exceptions).
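The map change described above can be pictured with a small table lookup. This is only a sketch: the enum, the table names and `targetAddressSpace` are hypothetical, the list of language address spaces is shortened, and the numeric values are the conventional SPIR address space numbers (private 0, global 1, constant 2, local 3, generic 4). It is not the code from the patch.

```cpp
#include <array>
#include <cassert>

// Hypothetical, shortened list of language-level address spaces.
enum DemoLangAS {
  AS_Default,
  AS_Global,
  AS_Local,
  AS_Constant,
  AS_Private,
  AS_Generic,
  AS_Count
};

using DemoASMap = std::array<unsigned, AS_Count>;

// Existing behavior described in the RFC: Default maps to private (0).
const DemoASMap OpenCLLikeMap = {{0, 1, 3, 2, 0, 4}};

// Proposed SYCL-specific behavior: Default maps to generic (4).
const DemoASMap SYCLLikeMap = {{4, 1, 3, 2, 0, 4}};

// Selects the table based on a flag that stands in for the SYCL environment
// component of the target triple.
unsigned targetAddressSpace(DemoLangAS AS, bool IsSYCLEnvironment) {
  const DemoASMap &Map = IsSYCLEnvironment ? SYCLLikeMap : OpenCLLikeMap;
  return Map[AS];
}
```

Only the `AS_Default` entry differs between the two tables, which is why the open question is whether a separate SYCL-specific table is needed at all.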

 

There are a few cases when CodeGen assigns non-default address space:

 

1. Declarations explicitly annotated with an address space attribute.
2. Variables with static storage duration and string literals are allocated in
   the global address space unless a specific address space is specified.
3. Variables with automatic storage duration are allocated in the private
   address space. This is the current compiler behavior and it does not require
   additional changes.

 

For cases (2) and (3), once a "default" pointer to such a variable is obtained,
it is immediately addrspacecast'ed to generic, because a user does not (and
should not) specify an address space for pointers in source code.
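The following plain C++ snippet marks where cases (2) and (3) would apply. The comments restate the behavior described above for SYCL device compilation and are not output from the compiler; all names are made up for this illustration. On a host compiler the code is ordinary C++.

```cpp
#include <cassert>

// Case (2): static storage duration. Per the description above, the device
// compiler would allocate this variable in the global address space.
int Counter = 40;

// Takes a "default" pointer. In SYCL device mode such a pointer is expected
// to be in the generic address space, so it can refer to either variable.
int readThrough(const int *P) { return *P; }

int demo() {
  // Case (3): automatic storage duration, allocated in the private address
  // space on device.
  int Local = 2;

  // Case (2) again: the string literal would live in the global address space.
  const char *Msg = "ok";

  // Taking the address yields a "default" pointer; on device this is the point
  // where the cast to the generic address space would be emitted.
  return readThrough(&Counter) + readThrough(&Local) + (Msg[0] == 'o' ? 0 : 1);
}
```

The user code never names an address space, which is the property the immediate cast to generic is meant to preserve.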

 

A draft patch containing the complete change-set is available
[here](https://github.com/bader/llvm/pull/18/).

 

Does this approach seem reasonable?

 

Thanks,

Alexey

 

 


_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

Re: [RFC] Re-use OpenCL address space attributes for SYCL

Vassil Vassilev via cfe-dev
Hi Alexey,


Thanks for the clarification.


> SYCL compiler re-use generic support for these attributes as is and modifies
> Sema and CodeGen libraries.

Can you elaborate on your modifications in Sema and CodeGen, please?

> The main difference with OpenCL mode is that SYCL
> mode (similar to other single-source GPU programming modes like
> OpenMP/CUDA/HIP)
> keeps "default" address space for the declaration without address space
> attribute annotations.

Just FYI, in C++ mode Clang implements the default/generic address space as
specified in Embedded C (ISO/IEC TR 18037) s5.1 - 5.3.

"When not specified otherwise, objects are allocated by default in a generic
address space, which corresponds to the single address space of ISO/IEC
9899:1999."

"Objects are allocated in one or more address spaces. A unique generic address
space always exists. Every address space other than the generic one has a unique
name in the form of an identifier. Address spaces other than the generic one are
called named address spaces. An object is always completely allocated into at
least one address space. Unless otherwise specified, objects are allocated in
the generic address space."

It seems to me this is the model you intend to follow? If you use OpenCL address
space attributes outside of OpenCL mode, there is limited logic that you will
inherit. For example, deduction of address spaces wouldn't work, but conversions
or generation to IR should work fine. It generally sounds like a viable
approach. However, OpenCL used Default (no address space) as the private AS for
a very long time and there are still a number of places where this assumption is
inherent in the implementation. This is not entirely strange, as Default is used
by many languages for automatic storage anyway. My worry is that there could be
difficulties in reusing the OpenCL address space model due to this.

Btw can you elaborate on your implementation of constant addr space?

> This keeps the code shared between the host and device
> semantically-correct for both compilers: regular C++ host compiler and SYCL
> compiler.

Sorry, perhaps I am not following this thought, but can you explain how
address spaces make code semantically incorrect?

> To make all pointers without an explicit address space qualifier to be
> pointers
> in generic address space, we updated SPIR target address space map, which
> currently maps default pointers to "private" address space.

The address space map in Clang is not specific to pointer types. How do you
make it work for pointers only?

> We made this change
> specific to SYCL by adding SYCL environment component to the Triple to avoid
> impact on other modes targeting SPIR target (e.g. OpenCL). We would be glad to
> see get a feedback from the community if changing this mapping is applicable
> for all the modes and additional specialization can be avoided (e.g.
> [AMDGPU](https://github.com/llvm/llvm-project/blob/master/clang/lib/Basic/Targets/AMDGPU.cpp#L329)
> maps default to "generic" address space with a couple of exceptions).

Ok, does it mean that you map Default address space to OpenCL generic?
Please note that Default address space is used outside of OpenCL for all
other languages so remapping this unconditionally will have a wider impact.

> There are a few cases when CodeGen assigns non-default address space:
>
> 1. For declaration explicitly annotated with address space attribute

This is generally how CodeGen works, mapping language address spaces to target
address spaces. Is there something different you do here for SYCL?

> 2. Variables with static storage duration and string literals are allocated in
>  global address space unless specific address space it specified.
> 3. Variables with automatic storage durations are allocated in private address
>   space. It's current compiler behavior and it doesn't require additional
>   changes.

We already have this logic for OpenCL in Sema. I am not an expert in CodeGen,
but I believe its primary task is to map language constructs onto the
target-specific IR, i.e. map from AST into IR. However, you are making it deal
with language semantics instead, i.e. adding missing AST logic such as address
space attributes. I believe there are good reasons to have a layered
architecture that separates various concerns. What drives your decision to move
this logic into CodeGen?

> For (2) and (3) cases, once "default" pointer to such variable is obtained, it
> is immediately addrspacecast'ed to generic, because a user does not (and
> should not) specify address space for pointers in source code.

Can you explain why you need this cast? Can the user not specify address spaces
using pointer classes that map into address-space-attributed types, i.e. ending
up with pointers with address spaces originating from the user code?

Cheers,
Anastasia



From: Bader, Alexey <[hidden email]>
Sent: 26 June 2020 13:04
To: cfe-dev ([hidden email]) <[hidden email]>; Anastasia Stulova <[hidden email]>; [hidden email] <[hidden email]>
Subject: [RFC] Re-use OpenCL address space attributes for SYCL
 
