[OpenMP]: wrong threadprivate storage name in task reduction

classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

[OpenMP]: wrong threadprivate storage name in task reduction

Anastasia Stulova via cfe-dev
Hi,

I've noticed that clang does not work properly when using task
reductions with non-constant arrays and/or classes with omp_orig
constructor initializers.

As reduction initialize/combine/finalize functions only receive
references to whatever should be initialized, combined..., clang
generates some additional storages to save the size of the reduction
array and a pointer to the omp_orig. That storages are generated through
__kmpc_thread_private_cached() calls in both reduction functions and
outline task function.

The problem is that the storage used in outline task function does not
match the storage in reduction functions. This happens in programs that
have a task reduction or a taskloop reduction with the in_reduction clause.

To solve this i made an ugly patch that reuses the SourceLocation of
CodeGenFunction::EmitOMPTaskgroupDirective() in
CodeGenFunction::EmitOMPTaskBasedDirective() to match the storages and
make it work. Anyway, i suppose that there is a better way to fix this.

Index: CGStmtOpenMP.cpp
===================================================================
--- CGStmtOpenMP.cpp    (revision 326685)
+++ CGStmtOpenMP.cpp    (working copy)
@@ -2713,6 +2713,8 @@
                                   emitEmptyBoundParameters);
  }

+static SourceLocation reductionLoc;
+
  void CodeGenFunction::EmitOMPTaskBasedDirective(
      const OMPExecutableDirective &S, const OpenMPDirectiveKind
CapturedRegion,
      const RegionCodeGenTy &BodyGen, const TaskGenTy &TaskGen,
@@ -2952,7 +2954,7 @@
          // FIXME: This must removed once the runtime library is fixed.
          // Emit required threadprivate variables for
          // initilizer/combiner/finalizer.
-        CGF.CGM.getOpenMPRuntime().emitTaskReductionFixups(CGF,
S.getLocStart(),
+        CGF.CGM.getOpenMPRuntime().emitTaskReductionFixups(CGF,
reductionLoc,
RedCG, Cnt);
        }
      }
@@ -3180,8 +3182,9 @@
            std::advance(IRHS, 1);
          }
        }
+      reductionLoc = S.getLocStart();
        llvm::Value *ReductionDesc =
-          CGF.CGM.getOpenMPRuntime().emitTaskReductionInit(CGF,
S.getLocStart(),
+          CGF.CGM.getOpenMPRuntime().emitTaskReductionInit(CGF,
reductionLoc,
LHSs, RHSs, Data);
        const auto *VD = cast<VarDecl>(cast<DeclRefExpr>(E)->getDecl());
        CGF.EmitVarDecl(*VD);

Regards,

Raúl


http://bsc.es/disclaimer
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [OpenMP]: wrong threadprivate storage name in task reduction

Anastasia Stulova via cfe-dev

Hi, thanks for the report, I'll take a look.

-------------
Best regards,
Alexey Bataev
06.03.2018 4:25, Raúl Peñacoba via cfe-dev пишет:
Hi,

I've noticed that clang does not work properly when using task reductions with non-constant arrays and/or classes with omp_orig constructor initializers.

As reduction initialize/combine/finalize functions only receive references to whatever should be initialized, combined..., clang generates some additional storages to save the size of the reduction array and a pointer to the omp_orig. That storages are generated through __kmpc_thread_private_cached() calls in both reduction functions and outline task function.

The problem is that the storage used in outline task function does not match the storage in reduction functions. This happens in programs that have a task reduction or a taskloop reduction with the in_reduction clause.

To solve this i made an ugly patch that reuses the SourceLocation of CodeGenFunction::EmitOMPTaskgroupDirective() in CodeGenFunction::EmitOMPTaskBasedDirective() to match the storages and make it work. Anyway, i suppose that there is a better way to fix this.

Index: CGStmtOpenMP.cpp
===================================================================
--- CGStmtOpenMP.cpp    (revision 326685)
+++ CGStmtOpenMP.cpp    (working copy)
@@ -2713,6 +2713,8 @@
                                  emitEmptyBoundParameters);
 }

+static SourceLocation reductionLoc;
+
 void CodeGenFunction::EmitOMPTaskBasedDirective(
     const OMPExecutableDirective &S, const OpenMPDirectiveKind CapturedRegion,
     const RegionCodeGenTy &BodyGen, const TaskGenTy &TaskGen,
@@ -2952,7 +2954,7 @@
         // FIXME: This must removed once the runtime library is fixed.
         // Emit required threadprivate variables for
         // initilizer/combiner/finalizer.
-        CGF.CGM.getOpenMPRuntime().emitTaskReductionFixups(CGF, S.getLocStart(),
+        CGF.CGM.getOpenMPRuntime().emitTaskReductionFixups(CGF, reductionLoc,
RedCG, Cnt);
       }
     }
@@ -3180,8 +3182,9 @@
           std::advance(IRHS, 1);
         }
       }
+      reductionLoc = S.getLocStart();
       llvm::Value *ReductionDesc =
-          CGF.CGM.getOpenMPRuntime().emitTaskReductionInit(CGF, S.getLocStart(),
+          CGF.CGM.getOpenMPRuntime().emitTaskReductionInit(CGF, reductionLoc,
LHSs, RHSs, Data);
       const auto *VD = cast<VarDecl>(cast<DeclRefExpr>(E)->getDecl());
       CGF.EmitVarDecl(*VD);

Regards,

Raúl


http://bsc.es/disclaimer



_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [OpenMP]: wrong threadprivate storage name in task reduction

Anastasia Stulova via cfe-dev

Hi,

I did some little research and I have realized that the patch I attached doesn't fix the problem completely. I fixed the storage of the data. When a using task reduction with vla, or a class that uses omp_orig to initialize the private copies, the size of the vla and omp_orig are stored in a storage using __kmpc_threadprivate_cached().

The problem comes because, since in outline parallel function each reduction has lazy_priv flag, the initialization will be done in __kmpc_task_reduction_get_th_data(), called from outline task function. This is done before storing the vla size or omp_orig, so the initialization would be wrong in these cases.

I think that if the storage is put before the __kmpc_task_reduction_get_th_data() it should work.

Anyway, this implementation of task reductions is not fully functional, as only are allowed if the task is created in the scope of taskgroup.

Sorry for the mail duplicate.

Regards,

Raúl

El 06/03/18 a las 16:21, Alexey Bataev escribió:

Hi, thanks for the report, I'll take a look.

-------------
Best regards,
Alexey Bataev
06.03.2018 4:25, Raúl Peñacoba via cfe-dev пишет:
Hi,

I've noticed that clang does not work properly when using task reductions with non-constant arrays and/or classes with omp_orig constructor initializers.

As reduction initialize/combine/finalize functions only receive references to whatever should be initialized, combined..., clang generates some additional storages to save the size of the reduction array and a pointer to the omp_orig. That storages are generated through __kmpc_thread_private_cached() calls in both reduction functions and outline task function.

The problem is that the storage used in outline task function does not match the storage in reduction functions. This happens in programs that have a task reduction or a taskloop reduction with the in_reduction clause.

To solve this i made an ugly patch that reuses the SourceLocation of CodeGenFunction::EmitOMPTaskgroupDirective() in CodeGenFunction::EmitOMPTaskBasedDirective() to match the storages and make it work. Anyway, i suppose that there is a better way to fix this.

Index: CGStmtOpenMP.cpp
===================================================================
--- CGStmtOpenMP.cpp    (revision 326685)
+++ CGStmtOpenMP.cpp    (working copy)
@@ -2713,6 +2713,8 @@
                                  emitEmptyBoundParameters);
 }

+static SourceLocation reductionLoc;
+
 void CodeGenFunction::EmitOMPTaskBasedDirective(
     const OMPExecutableDirective &S, const OpenMPDirectiveKind CapturedRegion,
     const RegionCodeGenTy &BodyGen, const TaskGenTy &TaskGen,
@@ -2952,7 +2954,7 @@
         // FIXME: This must removed once the runtime library is fixed.
         // Emit required threadprivate variables for
         // initilizer/combiner/finalizer.
-        CGF.CGM.getOpenMPRuntime().emitTaskReductionFixups(CGF, S.getLocStart(),
+        CGF.CGM.getOpenMPRuntime().emitTaskReductionFixups(CGF, reductionLoc,
RedCG, Cnt);
       }
     }
@@ -3180,8 +3182,9 @@
           std::advance(IRHS, 1);
         }
       }
+      reductionLoc = S.getLocStart();
       llvm::Value *ReductionDesc =
-          CGF.CGM.getOpenMPRuntime().emitTaskReductionInit(CGF, S.getLocStart(),
+          CGF.CGM.getOpenMPRuntime().emitTaskReductionInit(CGF, reductionLoc,
LHSs, RHSs, Data);
       const auto *VD = cast<VarDecl>(cast<DeclRefExpr>(E)->getDecl());
       CGF.EmitVarDecl(*VD);

Regards,

Raúl


http://bsc.es/disclaimer





WARNING / LEGAL TEXT: This message is intended only for the use of the individual or entity to which it is addressed and may contain information which is privileged, confidential, proprietary, or exempt from disclosure under applicable law. If you are not the intended recipient or the person responsible for delivering the message to the intended recipient, you are strictly prohibited from disclosing, distributing, copying, or in any way using this message. If you have received this communication in error, please notify the sender and destroy and delete any copies you may have received.

http://www.bsc.es/disclaimer

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [OpenMP]: wrong threadprivate storage name in task reduction

Anastasia Stulova via cfe-dev

Hi,

>>> Anyway, this implementation of task reductions is not fully functional, as only are allowed if the task is created in the scope of taskgroup.

What do you mean?Could you provide some more detais?


-------------
Best regards,
Alexey Bataev
08.03.2018 7:47, Raúl Peñacoba пишет:

Hi,

I did some little research and I have realized that the patch I attached doesn't fix the problem completely. I fixed the storage of the data. When a using task reduction with vla, or a class that uses omp_orig to initialize the private copies, the size of the vla and omp_orig are stored in a storage using __kmpc_threadprivate_cached().

The problem comes because, since in outline parallel function each reduction has lazy_priv flag, the initialization will be done in __kmpc_task_reduction_get_th_data(), called from outline task function. This is done before storing the vla size or omp_orig, so the initialization would be wrong in these cases.

I think that if the storage is put before the __kmpc_task_reduction_get_th_data() it should work.

Anyway, this implementation of task reductions is not fully functional, as only are allowed if the task is created in the scope of taskgroup.

Sorry for the mail duplicate.

Regards,

Raúl

El 06/03/18 a las 16:21, Alexey Bataev escribió:

Hi, thanks for the report, I'll take a look.

-------------
Best regards,
Alexey Bataev
06.03.2018 4:25, Raúl Peñacoba via cfe-dev пишет:
Hi,

I've noticed that clang does not work properly when using task reductions with non-constant arrays and/or classes with omp_orig constructor initializers.

As reduction initialize/combine/finalize functions only receive references to whatever should be initialized, combined..., clang generates some additional storages to save the size of the reduction array and a pointer to the omp_orig. That storages are generated through __kmpc_thread_private_cached() calls in both reduction functions and outline task function.

The problem is that the storage used in outline task function does not match the storage in reduction functions. This happens in programs that have a task reduction or a taskloop reduction with the in_reduction clause.

To solve this i made an ugly patch that reuses the SourceLocation of CodeGenFunction::EmitOMPTaskgroupDirective() in CodeGenFunction::EmitOMPTaskBasedDirective() to match the storages and make it work. Anyway, i suppose that there is a better way to fix this.

Index: CGStmtOpenMP.cpp
===================================================================
--- CGStmtOpenMP.cpp    (revision 326685)
+++ CGStmtOpenMP.cpp    (working copy)
@@ -2713,6 +2713,8 @@
                                  emitEmptyBoundParameters);
 }

+static SourceLocation reductionLoc;
+
 void CodeGenFunction::EmitOMPTaskBasedDirective(
     const OMPExecutableDirective &S, const OpenMPDirectiveKind CapturedRegion,
     const RegionCodeGenTy &BodyGen, const TaskGenTy &TaskGen,
@@ -2952,7 +2954,7 @@
         // FIXME: This must removed once the runtime library is fixed.
         // Emit required threadprivate variables for
         // initilizer/combiner/finalizer.
-        CGF.CGM.getOpenMPRuntime().emitTaskReductionFixups(CGF, S.getLocStart(),
+        CGF.CGM.getOpenMPRuntime().emitTaskReductionFixups(CGF, reductionLoc,
RedCG, Cnt);
       }
     }
@@ -3180,8 +3182,9 @@
           std::advance(IRHS, 1);
         }
       }
+      reductionLoc = S.getLocStart();
       llvm::Value *ReductionDesc =
-          CGF.CGM.getOpenMPRuntime().emitTaskReductionInit(CGF, S.getLocStart(),
+          CGF.CGM.getOpenMPRuntime().emitTaskReductionInit(CGF, reductionLoc,
LHSs, RHSs, Data);
       const auto *VD = cast<VarDecl>(cast<DeclRefExpr>(E)->getDecl());
       CGF.EmitVarDecl(*VD);

Regards,

Raúl


http://bsc.es/disclaimer





WARNING / LEGAL TEXT: This message is intended only for the use of the individual or entity to which it is addressed and may contain information which is privileged, confidential, proprietary, or exempt from disclosure under applicable law. If you are not the intended recipient or the person responsible for delivering the message to the intended recipient, you are strictly prohibited from disclosing, distributing, copying, or in any way using this message. If you have received this communication in error, please notify the sender and destroy and delete any copies you may have received.

http://www.bsc.es/disclaimer


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [OpenMP]: wrong threadprivate storage name in task reduction

Anastasia Stulova via cfe-dev

Hi,

I mean that it seems that this implementation restricts the tasks that participate in the reduction to be defined in the same lexical scope. I have not been able to find this restriction in the OpenMP specification. What the spec says about this topic is:

"A list item that appears in an in_reduction clause of a task-generating construct must appear
in a task_reduction clause of a construct corresponding to a taskgroup region that includes
the participating task in its taskgroup set. The construct corresponding to the innermost region
that meets this condition must specify the same reduction-identifier as the in_reduction
clause."

I tried to compile the attached file. The compile command is:

clang -fopenmp task_red.cc

Regards,
Raúl

El 08/03/18 a las 15:30, Alexey Bataev escribió:

Hi,

>>> Anyway, this implementation of task reductions is not fully functional, as only are allowed if the task is created in the scope of taskgroup.

What do you mean?Could you provide some more detais?


-------------
Best regards,
Alexey Bataev
08.03.2018 7:47, Raúl Peñacoba пишет:

Hi,

I did some little research and I have realized that the patch I attached doesn't fix the problem completely. I fixed the storage of the data. When a using task reduction with vla, or a class that uses omp_orig to initialize the private copies, the size of the vla and omp_orig are stored in a storage using __kmpc_threadprivate_cached().

The problem comes because, since in outline parallel function each reduction has lazy_priv flag, the initialization will be done in __kmpc_task_reduction_get_th_data(), called from outline task function. This is done before storing the vla size or omp_orig, so the initialization would be wrong in these cases.

I think that if the storage is put before the __kmpc_task_reduction_get_th_data() it should work.

Anyway, this implementation of task reductions is not fully functional, as only are allowed if the task is created in the scope of taskgroup.

Sorry for the mail duplicate.

Regards,

Raúl

El 06/03/18 a las 16:21, Alexey Bataev escribió:

Hi, thanks for the report, I'll take a look.

-------------
Best regards,
Alexey Bataev
06.03.2018 4:25, Raúl Peñacoba via cfe-dev пишет:
Hi,

I've noticed that clang does not work properly when using task reductions with non-constant arrays and/or classes with omp_orig constructor initializers.

As reduction initialize/combine/finalize functions only receive references to whatever should be initialized, combined..., clang generates some additional storages to save the size of the reduction array and a pointer to the omp_orig. That storages are generated through __kmpc_thread_private_cached() calls in both reduction functions and outline task function.

The problem is that the storage used in outline task function does not match the storage in reduction functions. This happens in programs that have a task reduction or a taskloop reduction with the in_reduction clause.

To solve this i made an ugly patch that reuses the SourceLocation of CodeGenFunction::EmitOMPTaskgroupDirective() in CodeGenFunction::EmitOMPTaskBasedDirective() to match the storages and make it work. Anyway, i suppose that there is a better way to fix this.

Index: CGStmtOpenMP.cpp
===================================================================
--- CGStmtOpenMP.cpp    (revision 326685)
+++ CGStmtOpenMP.cpp    (working copy)
@@ -2713,6 +2713,8 @@
                                  emitEmptyBoundParameters);
 }

+static SourceLocation reductionLoc;
+
 void CodeGenFunction::EmitOMPTaskBasedDirective(
     const OMPExecutableDirective &S, const OpenMPDirectiveKind CapturedRegion,
     const RegionCodeGenTy &BodyGen, const TaskGenTy &TaskGen,
@@ -2952,7 +2954,7 @@
         // FIXME: This must removed once the runtime library is fixed.
         // Emit required threadprivate variables for
         // initilizer/combiner/finalizer.
-        CGF.CGM.getOpenMPRuntime().emitTaskReductionFixups(CGF, S.getLocStart(),
+        CGF.CGM.getOpenMPRuntime().emitTaskReductionFixups(CGF, reductionLoc,
RedCG, Cnt);
       }
     }
@@ -3180,8 +3182,9 @@
           std::advance(IRHS, 1);
         }
       }
+      reductionLoc = S.getLocStart();
       llvm::Value *ReductionDesc =
-          CGF.CGM.getOpenMPRuntime().emitTaskReductionInit(CGF, S.getLocStart(),
+          CGF.CGM.getOpenMPRuntime().emitTaskReductionInit(CGF, reductionLoc,
LHSs, RHSs, Data);
       const auto *VD = cast<VarDecl>(cast<DeclRefExpr>(E)->getDecl());
       CGF.EmitVarDecl(*VD);

Regards,

Raúl


http://bsc.es/disclaimer





WARNING / LEGAL TEXT: This message is intended only for the use of the individual or entity to which it is addressed and may contain information which is privileged, confidential, proprietary, or exempt from disclosure under applicable law. If you are not the intended recipient or the person responsible for delivering the message to the intended recipient, you are strictly prohibited from disclosing, distributing, copying, or in any way using this message. If you have received this communication in error, please notify the sender and destroy and delete any copies you may have received.

http://www.bsc.es/disclaimer




WARNING / LEGAL TEXT: This message is intended only for the use of the individual or entity to which it is addressed and may contain information which is privileged, confidential, proprietary, or exempt from disclosure under applicable law. If you are not the intended recipient or the person responsible for delivering the message to the intended recipient, you are strictly prohibited from disclosing, distributing, copying, or in any way using this message. If you have received this communication in error, please notify the sender and destroy and delete any copies you may have received.

http://www.bsc.es/disclaimer

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

task_red.cc (433 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [OpenMP]: wrong threadprivate storage name in task reduction

Anastasia Stulova via cfe-dev
Your code is not correct. Standard does not say, that the reduced items must share the same memory, it says that it must use the same variables. In your example the variables are different, though they share the same memory.

-------------
Best regards,
Alexey Bataev

08.03.2018 14:03, Raúl Peñacoba пишет:

>
> Hi,
>
> I mean that it seems that this implementation restricts the tasks that
> participate in the reduction to be defined in the same lexical scope.
> I have not been able to find this restriction in the OpenMP
> specification. What the spec says about this topic is:
>
> "A list item that appears in an in_reduction clause of a
> task-generating construct must appear
> in a task_reduction clause of a construct corresponding to a taskgroup
> region that includes
> the participating task in its taskgroup set. The construct
> corresponding to the innermost region
> that meets this condition must specify the same reduction-identifier
> as the in_reduction
> clause."
>
> I tried to compile the attached file. The compile command is:
>
> clang -fopenmp task_red.cc
>
> Regards,
> Raúl
>
> El 08/03/18 a las 15:30, Alexey Bataev escribió:
>>
>> Hi,
>>
>> >>> Anyway, this implementation of task reductions is not fully
>> functional, as only are allowed if the task is created in the scope
>> of taskgroup.
>>
>> What do you mean?Could you provide some more detais?
>>
>>
>> -------------
>> Best regards,
>> Alexey Bataev
>> 08.03.2018 7:47, Raúl Peñacoba пишет:
>>>
>>> Hi,
>>>
>>> I did some little research and I have realized that the patch I
>>> attached doesn't fix the problem completely. I fixed the storage of
>>> the data. When a using task reduction with vla, or a class that uses
>>> omp_orig to initialize the private copies, the size of the vla and
>>> omp_orig are stored in a storage using __kmpc_threadprivate_cached().
>>>
>>> The problem comes because, since in outline parallel function each
>>> reduction has lazy_priv flag, the initialization will be done in
>>> __kmpc_task_reduction_get_th_data(), called from outline task
>>> function. This is done _before_storing the vla size or omp_orig, so
>>> the initialization would be wrong in these cases.
>>>
>>> I think that if the storage is put before the
>>> __kmpc_task_reduction_get_th_data() it should work.
>>>
>>> Anyway, this implementation of task reductions is not fully
>>> functional, as only are allowed if the task is created in the scope
>>> of taskgroup.
>>>
>>> Sorry for the mail duplicate.
>>>
>>> Regards,
>>>
>>> Raúl
>>>
>>> El 06/03/18 a las 16:21, Alexey Bataev escribió:
>>>>
>>>> Hi, thanks for the report, I'll take a look.
>>>>
>>>> -------------
>>>> Best regards,
>>>> Alexey Bataev
>>>> 06.03.2018 4:25, Raúl Peñacoba via cfe-dev пишет:
>>>>> Hi,
>>>>>
>>>>> I've noticed that clang does not work properly when using task
>>>>> reductions with non-constant arrays and/or classes with omp_orig
>>>>> constructor initializers.
>>>>>
>>>>> As reduction initialize/combine/finalize functions only receive
>>>>> references to whatever should be initialized, combined..., clang
>>>>> generates some additional storages to save the size of the
>>>>> reduction array and a pointer to the omp_orig. That storages are
>>>>> generated through __kmpc_thread_private_cached() calls in both
>>>>> reduction functions and outline task function.
>>>>>
>>>>> The problem is that the storage used in outline task function does
>>>>> not match the storage in reduction functions. This happens in
>>>>> programs that have a task reduction or a taskloop reduction with
>>>>> the in_reduction clause.
>>>>>
>>>>> To solve this i made an ugly patch that reuses the SourceLocation
>>>>> of CodeGenFunction::EmitOMPTaskgroupDirective() in
>>>>> CodeGenFunction::EmitOMPTaskBasedDirective() to match the storages
>>>>> and make it work. Anyway, i suppose that there is a better way to
>>>>> fix this.
>>>>>
>>>>> Index: CGStmtOpenMP.cpp
>>>>> ===================================================================
>>>>> --- CGStmtOpenMP.cpp    (revision 326685)
>>>>> +++ CGStmtOpenMP.cpp    (working copy)
>>>>> @@ -2713,6 +2713,8 @@
>>>>>                                   emitEmptyBoundParameters);
>>>>>  }
>>>>>
>>>>> +static SourceLocation reductionLoc;
>>>>> +
>>>>>  void CodeGenFunction::EmitOMPTaskBasedDirective(
>>>>>      const OMPExecutableDirective &S, const OpenMPDirectiveKind
>>>>> CapturedRegion,
>>>>>      const RegionCodeGenTy &BodyGen, const TaskGenTy &TaskGen,
>>>>> @@ -2952,7 +2954,7 @@
>>>>>          // FIXME: This must removed once the runtime library is
>>>>> fixed.
>>>>>          // Emit required threadprivate variables for
>>>>>          // initilizer/combiner/finalizer.
>>>>> -        CGF.CGM.getOpenMPRuntime().emitTaskReductionFixups(CGF,
>>>>> S.getLocStart(),
>>>>> +        CGF.CGM.getOpenMPRuntime().emitTaskReductionFixups(CGF,
>>>>> reductionLoc,
>>>>> RedCG, Cnt);
>>>>>        }
>>>>>      }
>>>>> @@ -3180,8 +3182,9 @@
>>>>>            std::advance(IRHS, 1);
>>>>>          }
>>>>>        }
>>>>> +      reductionLoc = S.getLocStart();
>>>>>        llvm::Value *ReductionDesc =
>>>>> -          CGF.CGM.getOpenMPRuntime().emitTaskReductionInit(CGF,
>>>>> S.getLocStart(),
>>>>> +          CGF.CGM.getOpenMPRuntime().emitTaskReductionInit(CGF,
>>>>> reductionLoc,
>>>>> LHSs, RHSs, Data);
>>>>>        const auto *VD =
>>>>> cast<VarDecl>(cast<DeclRefExpr>(E)->getDecl());
>>>>>        CGF.EmitVarDecl(*VD);
>>>>>
>>>>> Regards,
>>>>>
>>>>> Raúl
>>>>>
>>>>>
>>>>> http://bsc.es/disclaimer
>>>>>
>>>>
>>>
>>>
>>>
>>> WARNING / LEGAL TEXT: This message is intended only for the use of
>>> the individual or entity to which it is addressed and may contain
>>> information which is privileged, confidential, proprietary, or
>>> exempt from disclosure under applicable law. If you are not the
>>> intended recipient or the person responsible for delivering the
>>> message to the intended recipient, you are strictly prohibited from
>>> disclosing, distributing, copying, or in any way using this message.
>>> If you have received this communication in error, please notify the
>>> sender and destroy and delete any copies you may have received.
>>>
>>> http://www.bsc.es/disclaimer
>>> <https://nam03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.bsc.es%2Fdisclaimer&data=02%7C01%7C%7C2a787fae16d1409fb12808d584f2b8da%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636561100369995188&sdata=Qg460eDN4kZxnL4NIiP06MjaE85rnbyx2M%2FyVR7jif8%3D&reserved=0>
>>>
>>
>
>
>
> WARNING / LEGAL TEXT: This message is intended only for the use of the
> individual or entity to which it is addressed and may contain
> information which is privileged, confidential, proprietary, or exempt
> from disclosure under applicable law. If you are not the intended
> recipient or the person responsible for delivering the message to the
> intended recipient, you are strictly prohibited from disclosing,
> distributing, copying, or in any way using this message. If you have
> received this communication in error, please notify the sender and
> destroy and delete any copies you may have received.
>
> http://www.bsc.es/disclaimer
> <https://eur01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.bsc.es%2Fdisclaimer&data=02%7C01%7C%7Ca45b287f1ee24a78c93808d585273f8e%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636561325966851856&sdata=IYNPTvByADBir%2F950dCUevcJHWEIOo90Xx5raSJ2rS8%3D&reserved=0>
>


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [OpenMP]: wrong threadprivate storage name in task reduction

Anastasia Stulova via cfe-dev
I see. In that case, what about using task reductions with a global
variable like the attached example?

I've seen that you fixed the storage calls. I've tested it but still
doesn't work omp_orig initialization. If I'm not wrong, the reduction
init function use the pointer returned by __kmpc_threadprivate_cached()
directly without dereference.

Regards,
Raúl
El 08/03/18 a las 20:10, Alexey Bataev escribió:

> Your code is not correct. Standard does not say, that the reduced items must share the same memory, it says that it must use the same variables. In your example the variables are different, though they share the same memory.
>
> -------------
> Best regards,
> Alexey Bataev
>
> 08.03.2018 14:03, Raúl Peñacoba пишет:
>> Hi,
>>
>> I mean that it seems that this implementation restricts the tasks that
>> participate in the reduction to be defined in the same lexical scope.
>> I have not been able to find this restriction in the OpenMP
>> specification. What the spec says about this topic is:
>>
>> "A list item that appears in an in_reduction clause of a
>> task-generating construct must appear
>> in a task_reduction clause of a construct corresponding to a taskgroup
>> region that includes
>> the participating task in its taskgroup set. The construct
>> corresponding to the innermost region
>> that meets this condition must specify the same reduction-identifier
>> as the in_reduction
>> clause."
>>
>> I tried to compile the attached file. The compile command is:
>>
>> clang -fopenmp task_red.cc
>>
>> Regards,
>> Raúl
>>
>> El 08/03/18 a las 15:30, Alexey Bataev escribió:
>>> Hi,
>>>
>>>>>> Anyway, this implementation of task reductions is not fully
>>> functional, as only are allowed if the task is created in the scope
>>> of taskgroup.
>>>
>>> What do you mean?Could you provide some more detais?
>>>
>>>
>>> -------------
>>> Best regards,
>>> Alexey Bataev
>>> 08.03.2018 7:47, Raúl Peñacoba пишет:
>>>> Hi,
>>>>
>>>> I did some little research and I have realized that the patch I
>>>> attached doesn't fix the problem completely. I fixed the storage of
>>>> the data. When a using task reduction with vla, or a class that uses
>>>> omp_orig to initialize the private copies, the size of the vla and
>>>> omp_orig are stored in a storage using __kmpc_threadprivate_cached().
>>>>
>>>> The problem comes because, since in outline parallel function each
>>>> reduction has lazy_priv flag, the initialization will be done in
>>>> __kmpc_task_reduction_get_th_data(), called from outline task
>>>> function. This is done _before_storing the vla size or omp_orig, so
>>>> the initialization would be wrong in these cases.
>>>>
>>>> I think that if the storage is put before the
>>>> __kmpc_task_reduction_get_th_data() it should work.
>>>>
>>>> Anyway, this implementation of task reductions is not fully
>>>> functional, as only are allowed if the task is created in the scope
>>>> of taskgroup.
>>>>
>>>> Sorry for the mail duplicate.
>>>>
>>>> Regards,
>>>>
>>>> Raúl
>>>>
>>>> El 06/03/18 a las 16:21, Alexey Bataev escribió:
>>>>> Hi, thanks for the report, I'll take a look.
>>>>>
>>>>> -------------
>>>>> Best regards,
>>>>> Alexey Bataev
>>>>> 06.03.2018 4:25, Raúl Peñacoba via cfe-dev пишет:
>>>>>> Hi,
>>>>>>
>>>>>> I've noticed that clang does not work properly when using task
>>>>>> reductions with non-constant arrays and/or classes with omp_orig
>>>>>> constructor initializers.
>>>>>>
>>>>>> As reduction initialize/combine/finalize functions only receive
>>>>>> references to whatever should be initialized, combined..., clang
>>>>>> generates some additional storages to save the size of the
>>>>>> reduction array and a pointer to the omp_orig. That storages are
>>>>>> generated through __kmpc_thread_private_cached() calls in both
>>>>>> reduction functions and outline task function.
>>>>>>
>>>>>> The problem is that the storage used in outline task function does
>>>>>> not match the storage in reduction functions. This happens in
>>>>>> programs that have a task reduction or a taskloop reduction with
>>>>>> the in_reduction clause.
>>>>>>
>>>>>> To solve this i made an ugly patch that reuses the SourceLocation
>>>>>> of CodeGenFunction::EmitOMPTaskgroupDirective() in
>>>>>> CodeGenFunction::EmitOMPTaskBasedDirective() to match the storages
>>>>>> and make it work. Anyway, i suppose that there is a better way to
>>>>>> fix this.
>>>>>>
>>>>>> Index: CGStmtOpenMP.cpp
>>>>>> ===================================================================
>>>>>> --- CGStmtOpenMP.cpp    (revision 326685)
>>>>>> +++ CGStmtOpenMP.cpp    (working copy)
>>>>>> @@ -2713,6 +2713,8 @@
>>>>>>                                    emitEmptyBoundParameters);
>>>>>>   }
>>>>>>
>>>>>> +static SourceLocation reductionLoc;
>>>>>> +
>>>>>>   void CodeGenFunction::EmitOMPTaskBasedDirective(
>>>>>>       const OMPExecutableDirective &S, const OpenMPDirectiveKind
>>>>>> CapturedRegion,
>>>>>>       const RegionCodeGenTy &BodyGen, const TaskGenTy &TaskGen,
>>>>>> @@ -2952,7 +2954,7 @@
>>>>>>           // FIXME: This must removed once the runtime library is
>>>>>> fixed.
>>>>>>           // Emit required threadprivate variables for
>>>>>>           // initilizer/combiner/finalizer.
>>>>>> -        CGF.CGM.getOpenMPRuntime().emitTaskReductionFixups(CGF,
>>>>>> S.getLocStart(),
>>>>>> +        CGF.CGM.getOpenMPRuntime().emitTaskReductionFixups(CGF,
>>>>>> reductionLoc,
>>>>>> RedCG, Cnt);
>>>>>>         }
>>>>>>       }
>>>>>> @@ -3180,8 +3182,9 @@
>>>>>>             std::advance(IRHS, 1);
>>>>>>           }
>>>>>>         }
>>>>>> +      reductionLoc = S.getLocStart();
>>>>>>         llvm::Value *ReductionDesc =
>>>>>> -          CGF.CGM.getOpenMPRuntime().emitTaskReductionInit(CGF,
>>>>>> S.getLocStart(),
>>>>>> +          CGF.CGM.getOpenMPRuntime().emitTaskReductionInit(CGF,
>>>>>> reductionLoc,
>>>>>> LHSs, RHSs, Data);
>>>>>>         const auto *VD =
>>>>>> cast<VarDecl>(cast<DeclRefExpr>(E)->getDecl());
>>>>>>         CGF.EmitVarDecl(*VD);
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Raúl
>>>>>>
>>>>>>
>>>>>> http://bsc.es/disclaimer
>>>>>>
>>>>
>>>>
>>>> WARNING / LEGAL TEXT: This message is intended only for the use of
>>>> the individual or entity to which it is addressed and may contain
>>>> information which is privileged, confidential, proprietary, or
>>>> exempt from disclosure under applicable law. If you are not the
>>>> intended recipient or the person responsible for delivering the
>>>> message to the intended recipient, you are strictly prohibited from
>>>> disclosing, distributing, copying, or in any way using this message.
>>>> If you have received this communication in error, please notify the
>>>> sender and destroy and delete any copies you may have received.
>>>>
>>>> http://www.bsc.es/disclaimer
>>>> <https://nam03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.bsc.es%2Fdisclaimer&data=02%7C01%7C%7C2a787fae16d1409fb12808d584f2b8da%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636561100369995188&sdata=Qg460eDN4kZxnL4NIiP06MjaE85rnbyx2M%2FyVR7jif8%3D&reserved=0>
>>>>
>>
>>
>> WARNING / LEGAL TEXT: This message is intended only for the use of the
>> individual or entity to which it is addressed and may contain
>> information which is privileged, confidential, proprietary, or exempt
>> from disclosure under applicable law. If you are not the intended
>> recipient or the person responsible for delivering the message to the
>> intended recipient, you are strictly prohibited from disclosing,
>> distributing, copying, or in any way using this message. If you have
>> received this communication in error, please notify the sender and
>> destroy and delete any copies you may have received.
>>
>> http://www.bsc.es/disclaimer
>> <https://eur01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.bsc.es%2Fdisclaimer&data=02%7C01%7C%7Ca45b287f1ee24a78c93808d585273f8e%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636561325966851856&sdata=IYNPTvByADBir%2F950dCUevcJHWEIOo90Xx5raSJ2rS8%3D&reserved=0>
>>
>


http://bsc.es/disclaimer
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

task_red_gvar.cc (326 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [OpenMP]: wrong threadprivate storage name in task reduction

Anastasia Stulova via cfe-dev
1. I don't think that this is correct construct too. Since we don't support such kind of construct for locals, we should not support it for globals, 
since OpenMP does not have any difference between locals and globals. I think it is just the problem of the standard that it is not quite clear about it.
Probably, we can support this construct, but not sure that it is required. I don't want to spend my time implementing the things that won't be used and 
not allowed by the standard.
2. I'll check it. All the problems with it because runtime is not quite correct and does not fully support task reductions. Because of that I had to introduce
all that stuff with the threadprivates.
-------------
Best regards,
Alexey Bataev
09.03.2018 3:57, Raúl Peñacoba пишет:
I see. In that case, what about using task reductions with a global variable like the attached example?

I've seen that you fixed the storage calls. I've tested it but still doesn't work omp_orig initialization. If I'm not wrong, the reduction init function use the pointer returned by __kmpc_threadprivate_cached() directly without dereference.

Regards,
Raúl
El 08/03/18 a las 20:10, Alexey Bataev escribió:
Your code is not correct. Standard does not say, that the reduced items must share the same memory, it says that it must use the same variables. In your example the variables are different, though they share the same memory.

-------------
Best regards,
Alexey Bataev

08.03.2018 14:03, Raúl Peñacoba пишет:
Hi,

I mean that it seems that this implementation restricts the tasks that
participate in the reduction to be defined in the same lexical scope.
I have not been able to find this restriction in the OpenMP
specification. What the spec says about this topic is:

"A list item that appears in an in_reduction clause of a
task-generating construct must appear
in a task_reduction clause of a construct corresponding to a taskgroup
region that includes
the participating task in its taskgroup set. The construct
corresponding to the innermost region
that meets this condition must specify the same reduction-identifier
as the in_reduction
clause."

I tried to compile the attached file. The compile command is:

clang -fopenmp task_red.cc

Regards,
Raúl

El 08/03/18 a las 15:30, Alexey Bataev escribió:
Hi,

Anyway, this implementation of task reductions is not fully
functional, as only are allowed if the task is created in the scope
of taskgroup.

What do you mean?Could you provide some more detais?


-------------
Best regards,
Alexey Bataev
08.03.2018 7:47, Raúl Peñacoba пишет:
Hi,

I did some little research and I have realized that the patch I
attached doesn't fix the problem completely. I fixed the storage of
the data. When a using task reduction with vla, or a class that uses
omp_orig to initialize the private copies, the size of the vla and
omp_orig are stored in a storage using __kmpc_threadprivate_cached().

The problem comes because, since in outline parallel function each
reduction has lazy_priv flag, the initialization will be done in
__kmpc_task_reduction_get_th_data(), called from outline task
function. This is done _before_storing the vla size or omp_orig, so
the initialization would be wrong in these cases.

I think that if the storage is put before the
__kmpc_task_reduction_get_th_data() it should work.

Anyway, this implementation of task reductions is not fully
functional, as only are allowed if the task is created in the scope
of taskgroup.

Sorry for the mail duplicate.

Regards,

Raúl

El 06/03/18 a las 16:21, Alexey Bataev escribió:
Hi, thanks for the report, I'll take a look.

-------------
Best regards,
Alexey Bataev
06.03.2018 4:25, Raúl Peñacoba via cfe-dev пишет:
Hi,

I've noticed that clang does not work properly when using task
reductions with non-constant arrays and/or classes with omp_orig
constructor initializers.

As reduction initialize/combine/finalize functions only receive
references to whatever should be initialized, combined..., clang
generates some additional storages to save the size of the
reduction array and a pointer to the omp_orig. That storages are
generated through __kmpc_thread_private_cached() calls in both
reduction functions and outline task function.

The problem is that the storage used in outline task function does
not match the storage in reduction functions. This happens in
programs that have a task reduction or a taskloop reduction with
the in_reduction clause.

To solve this i made an ugly patch that reuses the SourceLocation
of CodeGenFunction::EmitOMPTaskgroupDirective() in
CodeGenFunction::EmitOMPTaskBasedDirective() to match the storages
and make it work. Anyway, i suppose that there is a better way to
fix this.

Index: CGStmtOpenMP.cpp
===================================================================
--- CGStmtOpenMP.cpp    (revision 326685)
+++ CGStmtOpenMP.cpp    (working copy)
@@ -2713,6 +2713,8 @@
                                   emitEmptyBoundParameters);
  }

+static SourceLocation reductionLoc;
+
  void CodeGenFunction::EmitOMPTaskBasedDirective(
      const OMPExecutableDirective &S, const OpenMPDirectiveKind
CapturedRegion,
      const RegionCodeGenTy &BodyGen, const TaskGenTy &TaskGen,
@@ -2952,7 +2954,7 @@
          // FIXME: This must removed once the runtime library is
fixed.
          // Emit required threadprivate variables for
          // initilizer/combiner/finalizer.
-        CGF.CGM.getOpenMPRuntime().emitTaskReductionFixups(CGF,
S.getLocStart(),
+        CGF.CGM.getOpenMPRuntime().emitTaskReductionFixups(CGF,
reductionLoc,
RedCG, Cnt);
        }
      }
@@ -3180,8 +3182,9 @@
            std::advance(IRHS, 1);
          }
        }
+      reductionLoc = S.getLocStart();
        llvm::Value *ReductionDesc =
-          CGF.CGM.getOpenMPRuntime().emitTaskReductionInit(CGF,
S.getLocStart(),
+          CGF.CGM.getOpenMPRuntime().emitTaskReductionInit(CGF,
reductionLoc,
LHSs, RHSs, Data);
        const auto *VD =
cast<VarDecl>(cast<DeclRefExpr>(E)->getDecl());
        CGF.EmitVarDecl(*VD);

Regards,

Raúl


https://nam02.safelinks.protection.outlook.com/?url=http%3A%2F%2Fbsc.es%2Fdisclaimer&data=02%7C01%7C%7Cbdc4c4de2d04480706e308d5859bcff2%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636561826607210068&sdata=8EdV%2BOxP0hlHRaTsWdbDjPGQZsjKrQVwB4UBpIDscsU%3D&reserved=0



WARNING / LEGAL TEXT: This message is intended only for the use of
the individual or entity to which it is addressed and may contain
information which is privileged, confidential, proprietary, or
exempt from disclosure under applicable law. If you are not the
intended recipient or the person responsible for delivering the
message to the intended recipient, you are strictly prohibited from
disclosing, distributing, copying, or in any way using this message.
If you have received this communication in error, please notify the
sender and destroy and delete any copies you may have received.

https://nam02.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.bsc.es%2Fdisclaimer&data=02%7C01%7C%7Cbdc4c4de2d04480706e308d5859bcff2%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636561826607210068&sdata=VijKwX7k2dgfvTE7yBrXjch1lF7lYB2RNMKCW273btM%3D&reserved=0
<https://nam03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.bsc.es%2Fdisclaimer&data=02%7C01%7C%7C2a787fae16d1409fb12808d584f2b8da%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636561100369995188&sdata=Qg460eDN4kZxnL4NIiP06MjaE85rnbyx2M%2FyVR7jif8%3D&reserved=0>



WARNING / LEGAL TEXT: This message is intended only for the use of the
individual or entity to which it is addressed and may contain
information which is privileged, confidential, proprietary, or exempt
from disclosure under applicable law. If you are not the intended
recipient or the person responsible for delivering the message to the
intended recipient, you are strictly prohibited from disclosing,
distributing, copying, or in any way using this message. If you have
received this communication in error, please notify the sender and
destroy and delete any copies you may have received.

https://nam02.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.bsc.es%2Fdisclaimer&data=02%7C01%7C%7Cbdc4c4de2d04480706e308d5859bcff2%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636561826607210068&sdata=VijKwX7k2dgfvTE7yBrXjch1lF7lYB2RNMKCW273btM%3D&reserved=0
<https://eur01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.bsc.es%2Fdisclaimer&data=02%7C01%7C%7Ca45b287f1ee24a78c93808d585273f8e%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636561325966851856&sdata=IYNPTvByADBir%2F950dCUevcJHWEIOo90Xx5raSJ2rS8%3D&reserved=0>





https://nam02.safelinks.protection.outlook.com/?url=http%3A%2F%2Fbsc.es%2Fdisclaimer&data=02%7C01%7C%7Cbdc4c4de2d04480706e308d5859bcff2%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636561826607210068&sdata=8EdV%2BOxP0hlHRaTsWdbDjPGQZsjKrQVwB4UBpIDscsU%3D&reserved=0


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

signature.asc (849 bytes) Download Attachment