[OpenCL] clang can't compile a simple enqueue_kernel with default opt level


[OpenCL] clang can't compile a simple enqueue_kernel with default opt level

Manuel Klimek via cfe-dev

Hi all,

 

I’ve noticed that clang fails to compile a simple OpenCL kernel containing an enqueue_kernel() call when no optimization level is specified:

 

$ cat test.cl
void foo(size_t id, __global int* out) {
  out[id] = id;
}

kernel void enqueue_foo(__global int* out) {
  size_t id = get_global_id(0);

  void (^fooBlock)(void) = ^{ foo(id, out); };

  queue_t queue = get_default_queue();
  ndrange_t ndrange = ndrange_1D(1);
  enqueue_kernel(queue, CLK_ENQUEUE_FLAGS_WAIT_KERNEL, ndrange, fooBlock);
}

 

$ build/bin/clang --version

clang version 7.0.0 (https://git.llvm.org/git/clang.git/ 08712fff7fba84b88e2e57b3c739d53b1aab1ed6)

 

$ clang -cc1 -emit-llvm -x cl -triple spir64-unknown-unknown -finclude-default-header -cl-std=CL2.0 test.cl

 

clang: ./llvm/include/llvm/Support/Casting.h:255: typename llvm::cast_retty<X, Y*>::ret_type llvm::cast(Y*) [with X = clang::BlockExpr; Y = const clang::Expr; typename llvm::cast_retty<X, Y*>::ret_type = const clang::BlockExpr*]: Assertion `isa<X>(Val) && "cast<Ty>() argument of incompatible type!"' failed.
#9 llvm::cast_retty<clang::BlockExpr, clang::Expr const*>::ret_type llvm::cast<clang::BlockExpr, clang::Expr const>(clang::Expr const*) ./llvm/include/llvm/Support/Casting.h:257:0
#10 clang::CodeGen::CGOpenCLRuntime::emitOpenCLEnqueuedBlock(clang::CodeGen::CodeGenFunction&, clang::Expr const*) ./llvm/tools/clang/lib/CodeGen/CGOpenCLRuntime.cpp:125:0
#11 clang::CodeGen::CodeGenFunction::EmitBuiltinExpr(clang::FunctionDecl const*, unsigned int, clang::CallExpr const*, clang::CodeGen::ReturnValueSlot) ./llvm/tools/clang/lib/CodeGen/CGBuiltin.cpp:3017:0
#12 clang::CodeGen::CodeGenFunction::EmitCallExpr(clang::CallExpr const*, clang::CodeGen::ReturnValueSlot) ./llvm/tools/clang/lib/CodeGen/CGExpr.cpp:4218:0
#13 (anonymous namespace)::ScalarExprEmitter::VisitCallExpr(clang::CallExpr const*) ./llvm/tools/clang/lib/CodeGen/CGExprScalar.cpp:517:0
#14 clang::StmtVisitorBase<clang::make_ptr, (anonymous namespace)::ScalarExprEmitter, llvm::Value*>::Visit(clang::Stmt*) ./llvm/build/tools/clang/include/clang/AST/StmtNodes.inc:329:0
#15 (anonymous namespace)::ScalarExprEmitter::Visit(clang::Expr*) ./llvm/tools/clang/lib/CodeGen/CGExprScalar.cpp:358:0

 

Clang compiles OpenCL sources at the ‘-O2’ optimization level by default. However, the tests for enqueue_kernel() in clang are compiled with ‘-O0’, so they do not cover the default case.

So it seems we have a bug here. Am I right?

 

Thanks,

Kristina


--------------------------------------------------------------------
Joint Stock Company Intel A/O
Registered legal address: Krylatsky Hills Business Park,
17 Krylatskaya Str., Bldg 4, Moscow 121614,
Russian Federation

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

Re: [OpenCL] clang can't compile a simple enqueue_kernel with default opt level

Manuel Klimek via cfe-dev

Right. I am taking a look.

 

Thanks.

 

Sam

 

From: Bessonova, Kristina [mailto:[hidden email]]
Sent: Monday, February 05, 2018 12:15 PM
To: [hidden email]
Cc: Sumner, Brian <[hidden email]>; Liu, Yaxun (Sam) <[hidden email]>; [hidden email]
Subject: [OpenCL] clang can't compile a simple enqueue_kernel with default opt level

 


Re: [OpenCL] clang can't compile a simple enqueue_kernel with default opt level

Manuel Klimek via cfe-dev


It seems the assumption that E in emitOpenCLEnqueuedBlock is a BlockExpr does not hold, so the assertion fires inside the cast<BlockExpr>() call. This can be easily fixed by calling the IgnoreImplicit() method.

But then there is another issue: the block ends up being generated twice, once when the block declaration is processed and again on the enqueue_kernel builtin generation path. I think we should avoid generating the block literal from the enqueue_kernel generation. This doesn't really match the original compilation flow.

@Sam, I was wondering whether we could avoid generating the literal inside emitOpenCLEnqueuedBlock and instead pass the name of the block and the number of its parameters to createEnqueuedBlockKernel, or perhaps pass the BlockExpr? The wrapper kernel function is pretty simple in the end and doesn't require the block invoke function itself (it can just rebuild the prototype). What do you think?

Anastasia
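
To illustrate the failure mode described above, here is a minimal toy model in plain C++. It is only a sketch: Expr, Kind, isBlock and ignoreImplicit are hypothetical stand-ins, not Clang's classes, and the single "implicit wrapper" kind is an assumption made for illustration. It shows why a checked cast to the block node fails when the expression is still wrapped, and why peeling the wrapper first makes the check pass.

```cpp
#include <cassert>

// Toy stand-ins for AST nodes. These are NOT Clang's classes.
enum class Kind { Block, ImplicitWrapper };

struct Expr {
  Kind kind;
  const Expr *sub;  // wrapped sub-expression, or nullptr for a leaf
  explicit Expr(Kind k, const Expr *s = nullptr) : kind(k), sub(s) {}
};

// Rough analogue of isa<BlockExpr>(E): true only for the block node itself.
inline bool isBlock(const Expr *e) { return e->kind == Kind::Block; }

// Rough analogue of IgnoreImplicit(): peel off implicit wrapper nodes
// until a non-wrapper node is reached.
inline const Expr *ignoreImplicit(const Expr *e) {
  while (e->kind == Kind::ImplicitWrapper && e->sub)
    e = e->sub;
  return e;
}
```

In this model, a checked cast on the wrapped node would assert because isBlock() is false for the wrapper, while isBlock(ignoreImplicit(e)) is true. Whether the real wrapper in this crash is a cast node or some other implicit node is not stated in the thread.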


From: Liu, Yaxun (Sam) <[hidden email]>
Sent: 05 February 2018 18:22
To: Bessonova, Kristina; [hidden email]
Cc: Sumner, Brian; Anastasia Stulova
Subject: RE: [OpenCL] clang can't compile a simple enqueue_kernel with default opt level
 


Re: [OpenCL] clang can't compile a simple enqueue_kernel with default opt level

Manuel Klimek via cfe-dev

Thanks Anastasia for investigating the issue.

 

The wrapper kernel needs to call the block invoke function, which is created by emitOpenCLEnqueuedBlock.

 

Sam

 

From: Anastasia Stulova [mailto:[hidden email]]
Sent: Thursday, February 08, 2018 4:05 PM
To: Liu, Yaxun (Sam) <[hidden email]>; Bessonova, Kristina <[hidden email]>; [hidden email]
Cc: Sumner, Brian <[hidden email]>; nd <[hidden email]>
Subject: Re: [OpenCL] clang can't compile a simple enqueue_kernel with default opt level

 

 


Re: [OpenCL] clang can't compile a simple enqueue_kernel with default opt level

Manuel Klimek via cfe-dev

> The wrapper kernel needs to call the block invoke function, which is created by emitOpenCLEnqueuedBlock.

Yes, but do we actually need the block definition for emitting the call (i.e. llvm::Value for the block invoke function)?

  auto *V = CGF.EmitBlockLiteral(cast<BlockExpr>(Block), &Invoke);

Could we just recreate the function prototype only, while emitting the kernel body inside createEnqueuedBlockKernel?
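
As a loose analogy for the question above, the following plain C++ sketch shows a caller being written against a prototype only, with the callee's body supplied later. All names here (BlockContext, block_invoke, wrapper_kernel) are hypothetical and chosen to mirror the reproducer. This is not Clang CodeGen code, and it does not settle whether the same approach is workable inside createEnqueuedBlockKernel.

```cpp
#include <cassert>

// Hypothetical captured state of the block in the reproducer (id and out).
struct BlockContext {
  int id;
  int *out;
};

// Prototype only: this is all the wrapper needs in order to call it.
void block_invoke(BlockContext *ctx);

// Toy "wrapper kernel": forwards its context to the invoke function.
void wrapper_kernel(BlockContext *ctx) { block_invoke(ctx); }

// The body is provided separately, after the wrapper has been written.
void block_invoke(BlockContext *ctx) { ctx->out[ctx->id] = ctx->id; }
```

The point of the sketch is only that emitting a call needs the callee's signature, not its body; the body can come from wherever the block is normally emitted.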


From: Liu, Yaxun (Sam) <[hidden email]>
Sent: 09 February 2018 21:31
To: Anastasia Stulova; Bessonova, Kristina; [hidden email]
Cc: Sumner, Brian; nd
Subject: RE: [OpenCL] clang can't compile a simple enqueue_kernel with default opt level
 


Re: [OpenCL] clang can't compile a simple enqueue_kernel with default opt level

Manuel Klimek via cfe-dev

I have a fix for this issue: https://reviews.llvm.org/D43240

 

Thanks.

 

Sam

 

From: Anastasia Stulova [mailto:[hidden email]]
Sent: Monday, February 12, 2018 12:56 PM
To: Liu, Yaxun (Sam) <[hidden email]>; Bessonova, Kristina <[hidden email]>; [hidden email]
Cc: Sumner, Brian <[hidden email]>; nd <[hidden email]>
Subject: Re: [OpenCL] clang can't compile a simple enqueue_kernel with default opt level

 
