Question about "CodeGenFunction::EmitLoadOfScalar" with vector type of 3 elements

David Chisnall via cfe-dev
Hi All,


I have a question about "CodeGenFunction::EmitLoadOfScalar". I am
compiling code that uses 3-element vector types such as int3 or float3.
Clang turns each load of such a vector into a load of a 4-element
vector type, because "CodeGenFunction::EmitLoadOfScalar" contains the
following code:

  // For better performance, handle vector loads differently.
  if (Ty->isVectorType()) {
    const llvm::Type *EltTy = Addr.getElementType();

    const auto *VTy = cast<llvm::VectorType>(EltTy);

    // Handle vectors of size 3 like size 4 for better performance.
    if (VTy->getNumElements() == 3) {

      // Bitcast to vec4 type.
      llvm::VectorType *vec4Ty =
          llvm::VectorType::get(VTy->getElementType(), 4);
      Address Cast =
          Builder.CreateElementBitCast(Addr, vec4Ty, "castToVec4");
      // Now load value.
      llvm::Value *V = Builder.CreateLoad(Cast, Volatile, "loadVec4");

A 4-element vector load can end up as an aligned vector load, which is
usually better. But it is not good for targets or languages, such as
OpenCL, that support 3-element vector types natively. Could we take this
case into account in "CodeGenFunction::EmitLoadOfScalar", for example
with a check like "if (!getLangOpts().OpenCL)" or with a target-specific
property on TargetCodeGenInfo?
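
To make the effect of the excerpt concrete, here is a minimal C++ model of the widened load. It is only a sketch: Vec3, Vec4, loadVec4, extractVec3 and loadVec3 are hypothetical names, not Clang or LLVM APIs, and it assumes that the fourth lane is simply dropped after the load, a step the excerpt does not show.

```cpp
// Hypothetical stand-ins for <3 x float> and <4 x float>; not Clang types.
struct Vec3 { float x, y, z; };
struct Vec4 { float x, y, z, w; };

// Models the "loadVec4" step: all four lanes are read from storage that
// is assumed to be four elements wide.
constexpr Vec4 loadVec4(const float (&mem)[4]) {
  return Vec4{mem[0], mem[1], mem[2], mem[3]};
}

// Assumed follow-up step: keep lanes 0..2 and discard the padding lane.
constexpr Vec3 extractVec3(Vec4 v) {
  return Vec3{v.x, v.y, v.z};
}

// The whole sequence: a vec3 load expressed as a vec4 load plus extract.
constexpr Vec3 loadVec3(const float (&mem)[4]) {
  return extractVec3(loadVec4(mem));
}
```

Whatever sits in the fourth slot of the storage never reaches the result, which is why the widened load is invisible at the source level but visible to anything that inspects the IR.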

If I missed something, please let me know.

Thanks,
JinGu Kang

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

Re: Question about "CodeGenFunction::EmitLoadOfScalar" with vector type of 3 elements

David Chisnall via cfe-dev
Hi JinGu,

I don't think this should be a problem for OpenCL. A 3-component vector is aligned as a 4-component vector (see section 6.1.5, "Alignment of Types", of the OpenCL C kernel language specification v2.0).
AFAIK, almost all existing OpenCL compilers are based on Clang, and there seem to be no problems with handling load/store operations this way.
Could you elaborate on the case where this approach doesn't work?
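
The alignment rule cited here can be illustrated with a small C++ model. This is a sketch under stated assumptions: Float3, Float4 and float3Stride are hypothetical names, float is assumed to be 4 bytes, and real OpenCL vector types are built into the language rather than declared as structs.

```cpp
#include <cstddef>

// Hypothetical models of OpenCL's float4 and float3 (assumes 4-byte float).
struct alignas(16) Float4 { float x, y, z, w; };
struct alignas(16) Float3 { float x, y, z; };

// With 16-byte alignment, the 3-component type is padded to the size of
// the 4-component one: the fourth slot exists in memory but holds no data.
static_assert(sizeof(Float3) == sizeof(Float4), "same size as Float4");
static_assert(alignof(Float3) == alignof(Float4), "same alignment as Float4");

// Byte distance between consecutive elements of a Float3 array.
constexpr std::size_t float3Stride() { return sizeof(Float3); }
```

Because the storage for a 3-component vector is already as wide as a 4-component one, reading or writing the full width stays inside the object.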

Thanks,
Alexey


Re: Question about "CodeGenFunction::EmitLoadOfScalar" with vector type of 3 elements

David Chisnall via cfe-dev
Hi Alexey,

I appreciate your response. My colleague and I are implementing a transformation pass between LLVM IR and another IR, and we want to keep the 3-component vector types in our target IR. As you mentioned, the conversion to a 4-component vector type is not a problem in itself. But I generally expect Clang to generate LLVM IR that is as target-independent as possible, apart from target-specific properties such as the calling convention and the memory layout of variables. Clang could keep the 3-component vector operations, and LLVM's code generator could handle them according to the target. At present we have to undo Clang's vec3 -> vec4 transformation to recreate the original type information, which is unfortunate. Would it be possible to add an option to control this behaviour?

Thanks,

JinGu Kang



Re: Question about "CodeGenFunction::EmitLoadOfScalar" with vector type of 3 elements

David Chisnall via cfe-dev

I think the problem is that the boundary of what counts as target-independent IR is very vague in general. In this specific case, the issue is that the spec is very explicit about treating this as a type with 4-element alignment. However, I agree that this lowering could be done later as well. Conditioning it on a target property sounds reasonable. I think there are other places in Clang where vec3 is treated as vec4 (e.g. ScalarExprEmitter::VisitAsTypeExpr). Those would have to be handled too. Feel free to propose a prototype.
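
The idea of conditioning the widening on a property can be sketched as follows. This is a simplified model, not Clang code: CodeGenFlags, preserveVec3 and memoryLaneCount are hypothetical names invented for illustration, and the real decision would sit inside Clang's code generation rather than in a free function.

```cpp
// Hypothetical switch; it does not correspond to a known Clang option.
struct CodeGenFlags {
  bool preserveVec3; // when true, 3-element vectors are left as they are
};

// Number of lanes used for the memory access of a vector with
// numElements elements. By default a 3-element vector is widened to 4,
// as discussed in the thread; the flag turns that widening off.
constexpr unsigned memoryLaneCount(unsigned numElements, CodeGenFlags flags) {
  return (numElements == 3 && !flags.preserveVec3) ? 4u : numElements;
}
```

For example, memoryLaneCount(3, CodeGenFlags{false}) yields 4, while the same call with CodeGenFlags{true} yields 3. Every place that special-cases vec3 would need to consult the same switch for the IR to stay consistent.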

 

Cheers,

Anastasia

 


Re: Question about "CodeGenFunction::EmitLoadOfScalar" with vector type of 3 elements

David Chisnall via cfe-dev
Hi Anastasia,

I appreciate your response. I think we need to keep the vec3/vec4 handling in "ScalarExprEmitter::VisitAsTypeExpr", because we want to preserve the semantics of the OpenCL source language. If LLVM had an IR intrinsic for __builtin_astype, we could emit it and let LLVM's code generator handle it. I have found one other place that handles vec3: "CodeGenFunction::EmitStoreOfScalar". I have simply added a Clang CodeGen option to preserve vec3. I have attached the diff and a test. If I missed something, please let me know.
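
For the store side, a minimal C++ model might look like the following. It is a sketch only: Vec3, Vec4 and widenForStore are hypothetical names, and it assumes the store path mirrors the load path by padding the value to four lanes before writing, which the thread does not spell out.

```cpp
// Hypothetical stand-ins for <3 x float> and <4 x float>; not Clang types.
struct Vec3 { float x, y, z; };
struct Vec4 { float x, y, z, w; };

// Assumed counterpart of the widened load: the three live lanes are
// copied and the fourth lane is filled with a value nobody reads back.
// paddingLane is passed in explicitly to show that it is arbitrary.
constexpr Vec4 widenForStore(Vec3 v, float paddingLane) {
  return Vec4{v.x, v.y, v.z, paddingLane};
}
```

A pass that wants to recover the original 3-element store has to recognise this padding step, which is the kind of undoing an option to preserve vec3 would make unnecessary.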

Thanks,

JinGu Kang

Attachments: vec3.diff (6K), vec3test.cl (600 bytes)

Re: Question about "CodeGenFunction::EmitLoadOfScalar" with vector type of 3 elements

David Chisnall via cfe-dev

Cool, could you please resend the patch to cfe-commits with an “[OpenCL]” prefix in the subject? Or, if possible, create a review in Phabricator: http://llvm.org/docs/Phabricator.html

 

Thanks!

 

Anastasia

 


Re: Question about "CodeGenFunction::EmitLoadOfScalar" with vector type of 3 elements

David Chisnall via cfe-dev

Hi Anastasia,

I have created a review at https://reviews.llvm.org/D30810. I am new to Phabricator, so if I missed something there, please let me know.

Thanks,

JinGu Kang
