clang-offload-bundler

Simone Atzeni via cfe-dev
Hi all,

I have been playing with the clang driver to see how it compiles an OpenMP program with target offloading (for NVIDIA GPUs).

I noticed that it uses the tool 'clang-offload-bundler', which seems to bundle the host and device object files together.
In particular, if I compile with -c, my foo.o is a bundle of foo-x86_64.o and foo-cuda.o, and at link time the driver unbundles foo.o to get the two object files back.
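As a rough illustration of the unbundling step described above, here is a minimal sketch of the command one might run by hand. It only prints the command rather than running it. The function name unbundle_cmd and the output file names are hypothetical, the device triple openmp-nvptx64-nvidia-cuda is an assumption for an NVIDIA target, and the -inputs/-outputs spelling is the one used by the bundler around the time of this thread; newer releases may spell these options differently.

```shell
# Print (do not run) a clang-offload-bundler command that would split a
# bundled object into one host object and one device object.
# Assumptions: 2018-era option names and an NVIDIA OpenMP device triple.
unbundle_cmd() {
  obj=$1
  base=${obj%.o}   # strip the .o suffix to derive the output names
  targets="host-x86_64-unknown-linux-gnu,openmp-nvptx64-nvidia-cuda"
  printf 'clang-offload-bundler -type=o -targets=%s -inputs=%s -outputs=%s,%s -unbundle\n' \
    "$targets" "$obj" "$base-host.o" "$base-device.o"
}

unbundle_cmd foo.o
```

The order of the entries in -outputs follows the order of the entries in -targets, so the host object comes first here.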

Now, if I put my foo.o in a static library, for example, clang fails because it does not know how to unbundle the object files inside the library.

Other compilers, such as IBM XL, create a specific ELF section to store the device code, so there is always a single object file and no need to bundle/unbundle.
Also, an object file or library compiled with XL cannot be linked with clang, or vice versa.

So, am I doing something wrong, or is this the status of the clang driver?
Is clang-offload-bundler the official way to manage device code?
If my analysis is correct, what's the workaround?

Thanks!
Best,
Simone

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

Re: clang-offload-bundler

Jonas Hahnfeld via cfe-dev
Hi Simone,

The same answers as always:
1. Offloading to GPUs is not yet working with Clang trunk.
2. Most of this has already been discussed on the mailing list or
described in publications. In short, clang-offload-bundler should also
use ELF sections for object files. And AFAIK, IBM XL uses the same
mechanism in most cases.

Jonas

Re: clang-offload-bundler

Simone Atzeni via cfe-dev
Hi Jonas,

The clang-offload-bundler does not look like it uses ELF sections (in either the trunk or the Ykt version); it looks like it just concatenates the object files into one file. I might be wrong, and if not, hopefully this will be changed later.
At the moment, even though offloading to GPUs is not working in Clang, the driver seems to be doing something different from every other compiler.
I CC'd Doru; maybe he can add more about this.

I am asking because we are going to add GPU support to Flang, and of course we want to keep clang and flang compatible.

Thanks.
Simone

Re: clang-offload-bundler

Jonas Hahnfeld via cfe-dev
This is implemented so that no changes to the build system are
necessary. Have a look at the class ObjectFileHandler
(https://github.com/llvm-mirror/clang/blob/f0382ad/tools/clang-offload-bundler/ClangOffloadBundler.cpp#L370).

$ clang -fopenmp -fopenmp-targets=x86_64-unknown-linux-gnu -c target.c
$ objdump -h target.o

target.o:     file format elf64-x86-64

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
  0 .group        00000014  0000000000000000  0000000000000000  00000040  2**2
                  CONTENTS, READONLY, EXCLUDE, GROUP, LINK_ONCE_DISCARD
  1 .text         00000068  0000000000000000  0000000000000000  00000060  2**4
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  2 .text.startup 00000080  0000000000000000  0000000000000000  000000d0  2**4
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  3 .rodata       00000001  0000000000000000  0000000000000000  00000150  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  4 .rodata.str1.16 00000024  0000000000000000  0000000000000000  00000160  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  5 .omp_offloading.entries 00000020  0000000000000000  0000000000000000  00000184  2**0
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
  6 .rodata..omp_offloading.device_images 00000020  0000000000000000  0000000000000000  000001a8  2**3
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
  7 .rodata..omp_offloading.descriptor 00000020  0000000000000000  0000000000000000  000001c8  2**3
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
  8 __CLANG_OFFLOAD_BUNDLE__openmp-x86_64-unknown-linux-gnu 00000588  0000000000000000  0000000000000000  000001f0  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  9 __CLANG_OFFLOAD_BUNDLE__host-x86_64-unknown-linux-gnu 00000001  0000000000000000  0000000000000000  00000778  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 10 .eh_frame     00000098  0000000000000000  0000000000000000  00000780  2**3
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
 11 .init_array.0 00000008  0000000000000000  0000000000000000  00000818  2**3
                  CONTENTS, ALLOC, LOAD, RELOC, DATA
 12 .comment      00000038  0000000000000000  0000000000000000  00000820  2**0
                  CONTENTS, READONLY
 13 .note.GNU-stack 00000000  0000000000000000  0000000000000000  00000858  2**0
                  CONTENTS, READONLY
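
To pick the bundle sections out of a listing like the one above, a small filter along these lines might help. It is only a sketch: the function name list_offload_bundles is hypothetical, and it assumes that the section name is the second field of each objdump -h row and that the text after the __CLANG_OFFLOAD_BUNDLE__ prefix has the form <kind>-<triple>, as in the two sections shown.

```shell
# Read `objdump -h` output on stdin and print one "<kind> <triple>" line
# for every section whose name starts with __CLANG_OFFLOAD_BUNDLE__.
list_offload_bundles() {
  awk '$2 ~ /^__CLANG_OFFLOAD_BUNDLE__/ {
    # Drop the fixed prefix, then split at the first dash.
    name = substr($2, length("__CLANG_OFFLOAD_BUNDLE__") + 1)
    dash = index(name, "-")
    print substr(name, 1, dash - 1), substr(name, dash + 1)
  }'
}

# Typical use (needs an object file produced as in the commands above):
#   objdump -h target.o | list_offload_bundles
```

For the listing above this would report one openmp entry and one host entry, both for x86_64-unknown-linux-gnu. An object with no such sections produces no output.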

Re: clang-offload-bundler

Simone Atzeni via cfe-dev
I understand that, but if I put "target.o" in a static library, say libtarget.a, the clang driver does not call "clang-offload-bundler" on the static library at link time, and I get undefined references because it cannot find the definitions that are inside the library.
So I just wanted to bring up this problem: either I am doing something wrong, or it is a limitation of the driver/bundler.

As I said, I wanted to raise this to make sure we know what to do when we start implementing offload support in Flang.

Thanks Jonas!
Simone
