How does Clang CUDA handle __host__ __device__ template instantiation with __host__ or __device__ only constructs?

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

How does Clang CUDA handle __host__ __device__ template instantiation with __host__ or __device__ only constructs?

David Blaikie via cfe-dev
CUDA has some complications today when we want to write to write generic __host__ __device__ code (e.g. __host__ __device__ template functions or __host__ __device__ member functions of template classes) that can be instantiated with parameters that cause the instantiation to use __host__ or __device__ only constructs.

Consider the following generic function:

template <typename Range, typename F>
__host__ __device__
void for_each(Range r, F f)
{
  for (auto e : r)
    f(e);
}


template <typename T>
__host__ void foo(T);

for_each(data, foo);
// Instantiates a __host__  __device__ template with a __host__ only
// function, so the instantiation is __host__ only


template <typename T>
__device__ void bar(T);

for_each(data, bar);
// Instantiates a __host__  __device__ template with a __device__ only
// function, so the instantiation is __device__


template <typename T>
__host__ __device__ void void foobar(T);

for_each(data, foobar);
// Instantiates a __host__  __device__ template with a __host__ __device__
// function, so the instantiation is __host__ __device__


With NVCC, when a __host__ __device__ template is instantiated with __host__ only or __device__ only entities, it is treated as described above, and an unnecessary warning is emitted.

There is a pragma to suppress this warning:

#pragma nv_exec_check_disable
template <typename Range, typename F>
__host__ __device__
void for_each(Range r, F f)
{
  for (auto e : r)
    f(e);
}

In Thrust, we decorate many __host__ __device__ template functions and __host__ __device__ member functions of template classes with this pragma.

Basically, NVCC does this:
* Check if the instantiation leads to any __host__ only or __device__ evaluations.
* If both __host__ only and __device__ only evaluations are found, it's a hard compilation error.
* If there are __host__ only evaluations, the instantiated function is __host__ only. Warning emitted.
* If there are __device__ only evaluations, the instantiate function is __device__ only. Warning emitted.

For template classes, the same checks are be performed, but only when a member function of said instantiation is actually used. This follows how template class instantiation works in general; for example, you can instantiate a std::vector with a move-only type, and call emplace_back on it, and everything will compile fine. But if you use a std::vector member that requires copyable types, such as push_back, you'd get a compilation error.

What does Clang CUDA do? Does it emit a warning for instantiations of __host__ __device__ templates with __host__ or __device__ only constructs? Is this warning useful? (I have some examples indicating it is, but I want to hear what others think)

------------------------------------------------------
Bryce Adelstein Lelbach aka wash
ISO C++ Committee Member
CppCon and C++Now Program Chair
Thrust Maintainer, HPX Developer
CUDA Convert and Reformed AVX Junkie

Sleep is for the weak
------------------------------------------------------
-----------------------------------------------------------------------------------
This email message is for the sole use of the intended recipient(s) and may contain
confidential information.  Any unauthorized review, use, disclosure or distribution
is prohibited.  If you are not the intended recipient, please contact the sender by
reply email and destroy all copies of the original message.
-----------------------------------------------------------------------------------
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: How does Clang CUDA handle __host__ __device__ template instantiation with __host__ or __device__ only constructs?

David Blaikie via cfe-dev
clang's CUDA support does not care about templates.

Instead, the rule is: You can do anything in a HD function, including calling "wrong-side" functions (e.g. calling a D function when compiling for host).  But if you do something that we cannot codegen (like make a wrong-side call), we enqueue an error into a list.  Then if this function is ever emitted (e.g. an inline function that's called), we emit the errors we've deferred.

We never emit these warnings.  In practice at least when we were developing this we couldn't rationalize nvcc's behavior (and I believe it's changed significantly over time), so we didn't try to match it particularly closely, either in terms of errors or warnings.  Rather, we wanted something that made sense to us and was close enough that we could justify to users the code changes we were asking them to make.

Does that answer your question?

-Justin

On Fri, Oct 26, 2018 at 3:15 PM Bryce Lelbach via cfe-dev <[hidden email]> wrote:
CUDA has some complications today when we want to write to write generic __host__ __device__ code (e.g. __host__ __device__ template functions or __host__ __device__ member functions of template classes) that can be instantiated with parameters that cause the instantiation to use __host__ or __device__ only constructs.

Consider the following generic function:

template <typename Range, typename F>
__host__ __device__
void for_each(Range r, F f)
{
  for (auto e : r)
    f(e);
}


template <typename T>
__host__ void foo(T);

for_each(data, foo);
// Instantiates a __host__  __device__ template with a __host__ only
// function, so the instantiation is __host__ only


template <typename T>
__device__ void bar(T);

for_each(data, bar);
// Instantiates a __host__  __device__ template with a __device__ only
// function, so the instantiation is __device__


template <typename T>
__host__ __device__ void void foobar(T);

for_each(data, foobar);
// Instantiates a __host__  __device__ template with a __host__ __device__
// function, so the instantiation is __host__ __device__


With NVCC, when a __host__ __device__ template is instantiated with __host__ only or __device__ only entities, it is treated as described above, and an unnecessary warning is emitted.

There is a pragma to suppress this warning:

#pragma nv_exec_check_disable
template <typename Range, typename F>
__host__ __device__
void for_each(Range r, F f)
{
  for (auto e : r)
    f(e);
}

In Thrust, we decorate many __host__ __device__ template functions and __host__ __device__ member functions of template classes with this pragma.

Basically, NVCC does this:
* Check if the instantiation leads to any __host__ only or __device__ evaluations.
* If both __host__ only and __device__ only evaluations are found, it's a hard compilation error.
* If there are __host__ only evaluations, the instantiated function is __host__ only. Warning emitted.
* If there are __device__ only evaluations, the instantiate function is __device__ only. Warning emitted.

For template classes, the same checks are be performed, but only when a member function of said instantiation is actually used. This follows how template class instantiation works in general; for example, you can instantiate a std::vector with a move-only type, and call emplace_back on it, and everything will compile fine. But if you use a std::vector member that requires copyable types, such as push_back, you'd get a compilation error.

What does Clang CUDA do? Does it emit a warning for instantiations of __host__ __device__ templates with __host__ or __device__ only constructs? Is this warning useful? (I have some examples indicating it is, but I want to hear what others think)

------------------------------------------------------
Bryce Adelstein Lelbach aka wash
ISO C++ Committee Member
CppCon and C++Now Program Chair
Thrust Maintainer, HPX Developer
CUDA Convert and Reformed AVX Junkie

Sleep is for the weak
------------------------------------------------------
-----------------------------------------------------------------------------------
This email message is for the sole use of the intended recipient(s) and may contain
confidential information.  Any unauthorized review, use, disclosure or distribution
is prohibited.  If you are not the intended recipient, please contact the sender by
reply email and destroy all copies of the original message.
-----------------------------------------------------------------------------------
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: How does Clang CUDA handle __host__ __device__ template instantiation with __host__ or __device__ only constructs?

David Blaikie via cfe-dev

Yes, I think so. Thanks Justin.


I personally don't understand the rationale behind the warnings, but colleagues have told me there is one. I'll see if I can get them to dig up examples.


------------------------------------------------------
Bryce Adelstein Lelbach aka wash
ISO C++ LEWGI Chair
CppCon and C++Now Program Chair
Thrust Maintainer, HPX Developer
CUDA Convert and Reformed AVX Junkie

Ask "Dumb" Questions
------------------------------------------------------

From: Justin Lebar <[hidden email]>
Sent: Friday, October 26, 2018 3:22 PM
To: Bryce Lelbach
Cc: cfe-dev
Subject: Re: [cfe-dev] How does Clang CUDA handle __host__ __device__ template instantiation with __host__ or __device__ only constructs?

clang's CUDA support does not care about templates.

Instead, the rule is: You can do anything in a HD function, including calling "wrong-side" functions (e.g. calling a D function when compiling for host).  But if you do something that we cannot codegen (like make a wrong-side call), we enqueue an error into a list.  Then if this function is ever emitted (e.g. an inline function that's called), we emit the errors we've deferred.

We never emit these warnings.  In practice at least when we were developing this we couldn't rationalize nvcc's behavior (and I believe it's changed significantly over time), so we didn't try to match it particularly closely, either in terms of errors or warnings.  Rather, we wanted something that made sense to us and was close enough that we could justify to users the code changes we were asking them to make.

Does that answer your question?

-Justin

On Fri, Oct 26, 2018 at 3:15 PM Bryce Lelbach via cfe-dev <[hidden email]> wrote:
CUDA has some complications today when we want to write to write generic __host__ __device__ code (e.g. __host__ __device__ template functions or __host__ __device__ member functions of template classes) that can be instantiated with parameters that cause the instantiation to use __host__ or __device__ only constructs.

Consider the following generic function:

template <typename Range, typename F>
__host__ __device__
void for_each(Range r, F f)
{
  for (auto e : r)
    f(e);
}


template <typename T>
__host__ void foo(T);

for_each(data, foo);
// Instantiates a __host__  __device__ template with a __host__ only
// function, so the instantiation is __host__ only


template <typename T>
__device__ void bar(T);

for_each(data, bar);
// Instantiates a __host__  __device__ template with a __device__ only
// function, so the instantiation is __device__


template <typename T>
__host__ __device__ void void foobar(T);

for_each(data, foobar);
// Instantiates a __host__  __device__ template with a __host__ __device__
// function, so the instantiation is __host__ __device__


With NVCC, when a __host__ __device__ template is instantiated with __host__ only or __device__ only entities, it is treated as described above, and an unnecessary warning is emitted.

There is a pragma to suppress this warning:

#pragma nv_exec_check_disable
template <typename Range, typename F>
__host__ __device__
void for_each(Range r, F f)
{
  for (auto e : r)
    f(e);
}

In Thrust, we decorate many __host__ __device__ template functions and __host__ __device__ member functions of template classes with this pragma.

Basically, NVCC does this:
* Check if the instantiation leads to any __host__ only or __device__ evaluations.
* If both __host__ only and __device__ only evaluations are found, it's a hard compilation error.
* If there are __host__ only evaluations, the instantiated function is __host__ only. Warning emitted.
* If there are __device__ only evaluations, the instantiate function is __device__ only. Warning emitted.

For template classes, the same checks are be performed, but only when a member function of said instantiation is actually used. This follows how template class instantiation works in general; for example, you can instantiate a std::vector with a move-only type, and call emplace_back on it, and everything will compile fine. But if you use a std::vector member that requires copyable types, such as push_back, you'd get a compilation error.

What does Clang CUDA do? Does it emit a warning for instantiations of __host__ __device__ templates with __host__ or __device__ only constructs? Is this warning useful? (I have some examples indicating it is, but I want to hear what others think)

------------------------------------------------------
Bryce Adelstein Lelbach aka wash
ISO C++ Committee Member
CppCon and C++Now Program Chair
Thrust Maintainer, HPX Developer
CUDA Convert and Reformed AVX Junkie

Sleep is for the weak
------------------------------------------------------
-----------------------------------------------------------------------------------
This email message is for the sole use of the intended recipient(s) and may contain
confidential information.  Any unauthorized review, use, disclosure or distribution
is prohibited.  If you are not the intended recipient, please contact the sender by
reply email and destroy all copies of the original message.
-----------------------------------------------------------------------------------
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev