Bug in clang-cuda overload set filtering

Hubert Tong via cfe-dev
Hello all, 

I believe I have found a bug in overload resolution for CUDA code. I am currently awaiting permission to post to the bug tracker.

The following code doesn’t compile with newer versions of clang:

template<class T>
__device__ __host__ int foo(T *x) {
    return 1;
}

__device__ int foo(int *x) {
    return 2;
}

__host__ int foo(long *x) {
    return 3;
}

__device__ __host__ int bar() {
    auto long_val = 1L;
    return foo(&long_val);
}

clang++ -O2 -g -x cuda --cuda-gpu-arch=sm_61 -std=c++14 -o main -c main.cpp gives me:
error: reference to __host__ function 'foo' in __host__ __device__ function
    return foo(&long_val);
main.cpp:10:14: note: 'foo' declared here
__host__ int foo(long *x) {

It’s possible for IdentifyCUDAPreference to return CFP_HostDevice for valid overloads, but this code doesn’t erase the wrong-side candidates in that case. Then, because the wrong-side candidate is an exact match once its __host__/__device__ attributes are ignored, clang picks it as the best overload.
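
To illustrate why the wrong-side candidate wins once it is left in the candidate set, here is a minimal host-only C++ sketch of the same three overloads with the CUDA attributes stripped. Stripping the attributes is an assumption made purely for illustration, and checkPlainResolution is a hypothetical helper; the sketch shows ordinary C++ overload ranking only, not clang's CUDA-specific handling.

```cpp
#include <cassert>

// Stands in for the __device__ __host__ template from the example.
template <class T>
int foo(T *) { return 1; }

// Stands in for the __device__ overload; not viable for a long * argument.
int foo(int *) { return 2; }

// Stands in for the __host__ overload; a non-template exact match.
int foo(long *) { return 3; }

int bar() {
    auto long_val = 1L;
    // Both foo<long>(long *) and foo(long *) are exact matches; the
    // non-template function is preferred, so this returns 3.
    return foo(&long_val);
}

// Hypothetical self-check, not part of the original example.
void checkPlainResolution() { assert(bar() == 3); }
```

With the attributes ignored, nothing distinguishes the template from foo(long *) except the non-template tie-breaker, which is why the host-only function ends up selected inside a __host__ __device__ caller.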

If I rewrite those lines as:

 bool ContainsSameSideCandidate =
     llvm::any_of(Candidates, [&](OverloadCandidate *Cand) {
       // Check viable function only.
       if (Cand->Viable && Cand->Function) {
         auto MatchType = S.IdentifyCUDAPreference(Caller, Cand->Function);
         return MatchType == Sema::CFP_HostDevice ||
                MatchType == Sema::CFP_SameSide;
       }
       return false;
     });

With that change, my code compiles again. I can submit a bug report once I am approved, but I figured I would post here in the meantime.
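
The effect of widening the predicate can be modeled with a small standalone C++ sketch. Everything in it (Preference, Candidate, filterCandidates, exampleCandidates, checkFilter) is a hypothetical stand-in rather than clang's real types, and it assumes that wrong-side candidates are erased only when the predicate finds a preferred viable candidate.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-ins for illustration; these are not clang's real types.
enum class Preference { WrongSide, HostDevice, SameSide };

struct Candidate {
    int id;           // which overload this models (return value in the example)
    bool viable;      // whether the argument converts to the parameter type
    Preference pref;  // what the preference query would report for the callee
};

// Drops viable wrong-side candidates, but only if some viable candidate is
// "preferred". countHostDevice = false models the behavior described above;
// true models the rewritten predicate.
std::vector<Candidate> filterCandidates(std::vector<Candidate> cands,
                                        bool countHostDevice) {
    const bool containsPreferred =
        std::any_of(cands.begin(), cands.end(), [&](const Candidate &c) {
            if (!c.viable)
                return false;
            return c.pref == Preference::SameSide ||
                   (countHostDevice && c.pref == Preference::HostDevice);
        });
    if (containsPreferred) {
        cands.erase(std::remove_if(cands.begin(), cands.end(),
                                   [](const Candidate &c) {
                                       return c.viable &&
                                              c.pref == Preference::WrongSide;
                                   }),
                    cands.end());
    }
    return cands;
}

// Candidate set for foo(&long_val) inside bar() during device compilation.
std::vector<Candidate> exampleCandidates() {
    return {
        {1, true, Preference::HostDevice},  // template, __device__ __host__
        {2, false, Preference::SameSide},   // foo(int *): long * does not convert
        {3, true, Preference::WrongSide},   // foo(long *), __host__
    };
}

// Hypothetical self-check comparing the two predicates.
void checkFilter() {
    assert(filterCandidates(exampleCandidates(), false).size() == 3);
    assert(filterCandidates(exampleCandidates(), true).size() == 2);
}
```

In this model the only same-side candidate, foo(int *), is not viable, so the narrower predicate never fires and the __host__ overload survives. Counting the host-device template as preferred removes it before ranking.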


cfe-dev mailing list