|
|
Hi,
I've noticed that clang does not work properly when using task
reductions with non-constant arrays and/or classes with omp_orig
constructor initializers.
As reduction initialize/combine/finalize functions only receive
references to whatever should be initialized, combined..., clang
generates some additional storages to save the size of the reduction
array and a pointer to the omp_orig. That storages are generated through
__kmpc_thread_private_cached() calls in both reduction functions and
outline task function.
The problem is that the storage used in outline task function does not
match the storage in reduction functions. This happens in programs that
have a task reduction or a taskloop reduction with the in_reduction clause.
To solve this i made an ugly patch that reuses the SourceLocation of
CodeGenFunction::EmitOMPTaskgroupDirective() in
CodeGenFunction::EmitOMPTaskBasedDirective() to match the storages and
make it work. Anyway, i suppose that there is a better way to fix this.
Index: CGStmtOpenMP.cpp
===================================================================
--- CGStmtOpenMP.cpp (revision 326685)
+++ CGStmtOpenMP.cpp (working copy)
@@ -2713,6 +2713,8 @@
emitEmptyBoundParameters);
}
+static SourceLocation reductionLoc;
+
void CodeGenFunction::EmitOMPTaskBasedDirective(
const OMPExecutableDirective &S, const OpenMPDirectiveKind
CapturedRegion,
const RegionCodeGenTy &BodyGen, const TaskGenTy &TaskGen,
@@ -2952,7 +2954,7 @@
// FIXME: This must removed once the runtime library is fixed.
// Emit required threadprivate variables for
// initilizer/combiner/finalizer.
- CGF.CGM.getOpenMPRuntime().emitTaskReductionFixups(CGF,
S.getLocStart(),
+ CGF.CGM.getOpenMPRuntime().emitTaskReductionFixups(CGF,
reductionLoc,
RedCG, Cnt);
}
}
@@ -3180,8 +3182,9 @@
std::advance(IRHS, 1);
}
}
+ reductionLoc = S.getLocStart();
llvm::Value *ReductionDesc =
- CGF.CGM.getOpenMPRuntime().emitTaskReductionInit(CGF,
S.getLocStart(),
+ CGF.CGM.getOpenMPRuntime().emitTaskReductionInit(CGF,
reductionLoc,
LHSs, RHSs, Data);
const auto *VD = cast<VarDecl>(cast<DeclRefExpr>(E)->getDecl());
CGF.EmitVarDecl(*VD);
Regards,
Raúl
http://bsc.es/disclaimer_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
|
|
Hi, thanks for the report, I'll take a look.
-------------
Best regards,
Alexey Bataev
06.03.2018 4:25, Raúl Peñacoba via
cfe-dev пишет:
Hi,
I've noticed that clang does not work properly when using task
reductions with non-constant arrays and/or classes with omp_orig
constructor initializers.
As reduction initialize/combine/finalize functions only receive
references to whatever should be initialized, combined..., clang
generates some additional storages to save the size of the
reduction array and a pointer to the omp_orig. That storages are
generated through __kmpc_thread_private_cached() calls in both
reduction functions and outline task function.
The problem is that the storage used in outline task function does
not match the storage in reduction functions. This happens in
programs that have a task reduction or a taskloop reduction with
the in_reduction clause.
To solve this i made an ugly patch that reuses the SourceLocation
of CodeGenFunction::EmitOMPTaskgroupDirective() in
CodeGenFunction::EmitOMPTaskBasedDirective() to match the storages
and make it work. Anyway, i suppose that there is a better way to
fix this.
Index: CGStmtOpenMP.cpp
===================================================================
--- CGStmtOpenMP.cpp (revision 326685)
+++ CGStmtOpenMP.cpp (working copy)
@@ -2713,6 +2713,8 @@
emitEmptyBoundParameters);
}
+static SourceLocation reductionLoc;
+
void CodeGenFunction::EmitOMPTaskBasedDirective(
const OMPExecutableDirective &S, const
OpenMPDirectiveKind CapturedRegion,
const RegionCodeGenTy &BodyGen, const TaskGenTy
&TaskGen,
@@ -2952,7 +2954,7 @@
// FIXME: This must removed once the runtime library is
fixed.
// Emit required threadprivate variables for
// initilizer/combiner/finalizer.
- CGF.CGM.getOpenMPRuntime().emitTaskReductionFixups(CGF,
S.getLocStart(),
+ CGF.CGM.getOpenMPRuntime().emitTaskReductionFixups(CGF,
reductionLoc,
RedCG, Cnt);
}
}
@@ -3180,8 +3182,9 @@
std::advance(IRHS, 1);
}
}
+ reductionLoc = S.getLocStart();
llvm::Value *ReductionDesc =
- CGF.CGM.getOpenMPRuntime().emitTaskReductionInit(CGF,
S.getLocStart(),
+ CGF.CGM.getOpenMPRuntime().emitTaskReductionInit(CGF,
reductionLoc,
LHSs, RHSs, Data);
const auto *VD =
cast<VarDecl>(cast<DeclRefExpr>(E)->getDecl());
CGF.EmitVarDecl(*VD);
Regards,
Raúl
http://bsc.es/disclaimer
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
|
|
Hi,
I did some little research and I have realized that the patch
I attached doesn't fix the problem completely. I fixed the
storage of the data. When a using task reduction with vla, or a
class that uses omp_orig to initialize the private copies, the
size of the vla and omp_orig are stored in a storage using
__kmpc_threadprivate_cached().
The problem comes because, since in outline parallel function
each reduction has lazy_priv flag, the initialization will be
done in __kmpc_task_reduction_get_th_data(), called from outline
task function. This is done before
storing the vla size or omp_orig, so the initialization would be
wrong in these cases.
I think that if the storage is put before the
__kmpc_task_reduction_get_th_data() it should work.
Anyway, this implementation of task reductions is not fully
functional, as only are allowed if the task is created in the
scope of taskgroup.
Sorry for the mail duplicate.
Regards,
Raúl
El 06/03/18 a las 16:21, Alexey
Bataev escribió:
Hi, thanks for the report, I'll take a look.
-------------
Best regards,
Alexey Bataev
06.03.2018 4:25, Raúl Peñacoba
via cfe-dev пишет:
Hi,
I've noticed that clang does not work properly when
using task reductions with non-constant arrays and/or classes
with omp_orig constructor initializers.
As reduction initialize/combine/finalize functions
only receive references to whatever should be initialized,
combined..., clang generates some additional storages to save
the size of the reduction array and a pointer to the omp_orig.
That storages are generated through
__kmpc_thread_private_cached() calls in both reduction
functions and outline task function.
The problem is that the storage used in outline task
function does not match the storage in reduction functions.
This happens in programs that have a task reduction or a
taskloop reduction with the in_reduction clause.
To solve this i made an ugly patch that reuses the
SourceLocation of CodeGenFunction::EmitOMPTaskgroupDirective()
in CodeGenFunction::EmitOMPTaskBasedDirective() to match the
storages and make it work. Anyway, i suppose that there is a
better way to fix this.
Index: CGStmtOpenMP.cpp
===================================================================
--- CGStmtOpenMP.cpp (revision 326685)
+++ CGStmtOpenMP.cpp (working copy)
@@ -2713,6 +2713,8 @@
emitEmptyBoundParameters);
}
+static SourceLocation reductionLoc;
+
void CodeGenFunction::EmitOMPTaskBasedDirective(
const OMPExecutableDirective &S, const
OpenMPDirectiveKind CapturedRegion,
const RegionCodeGenTy &BodyGen, const
TaskGenTy &TaskGen,
@@ -2952,7 +2954,7 @@
// FIXME: This must removed once the runtime
library is fixed.
// Emit required threadprivate variables for
// initilizer/combiner/finalizer.
-
CGF.CGM.getOpenMPRuntime().emitTaskReductionFixups(CGF,
S.getLocStart(),
+
CGF.CGM.getOpenMPRuntime().emitTaskReductionFixups(CGF,
reductionLoc,
RedCG, Cnt);
}
}
@@ -3180,8 +3182,9 @@
std::advance(IRHS, 1);
}
}
+ reductionLoc = S.getLocStart();
llvm::Value *ReductionDesc =
-
CGF.CGM.getOpenMPRuntime().emitTaskReductionInit(CGF,
S.getLocStart(),
+
CGF.CGM.getOpenMPRuntime().emitTaskReductionInit(CGF,
reductionLoc,
LHSs, RHSs, Data);
const auto *VD =
cast<VarDecl>(cast<DeclRefExpr>(E)->getDecl());
CGF.EmitVarDecl(*VD);
Regards,
Raúl
http://bsc.es/disclaimer
WARNING / LEGAL TEXT: This message is intended only for the use of the
individual or entity to which it is addressed and may contain
information which is privileged, confidential, proprietary, or exempt
from disclosure under applicable law. If you are not the intended
recipient or the person responsible for delivering the message to the
intended recipient, you are strictly prohibited from disclosing,
distributing, copying, or in any way using this message. If you have
received this communication in error, please notify the sender and
destroy and delete any copies you may have received.
http://www.bsc.es/disclaimer
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
|
|
Hi,
>>> Anyway, this implementation of task reductions
is not fully functional, as only are allowed if the task is
created in the scope of taskgroup.
What do you mean?Could you provide some more detais?
-------------
Best regards,
Alexey Bataev
08.03.2018 7:47, Raúl Peñacoba пишет:
Hi,
I did some little research and I have realized that the
patch I attached doesn't fix the problem completely. I fixed
the storage of the data. When a using task reduction with vla,
or a class that uses omp_orig to initialize the private
copies, the size of the vla and omp_orig are stored in a
storage using __kmpc_threadprivate_cached().
The problem comes because, since in outline parallel
function each reduction has lazy_priv flag, the initialization
will be done in __kmpc_task_reduction_get_th_data(), called
from outline task function. This is done before
storing the vla size or omp_orig, so the initialization would
be wrong in these cases.
I think that if the storage is put before the
__kmpc_task_reduction_get_th_data() it should work.
Anyway, this implementation of task reductions is not fully
functional, as only are allowed if the task is created in the
scope of taskgroup.
Sorry for the mail duplicate.
Regards,
Raúl
El 06/03/18 a las 16:21, Alexey
Bataev escribió:
Hi, thanks for the report, I'll take a look.
-------------
Best regards,
Alexey Bataev
06.03.2018 4:25, Raúl Peñacoba
via cfe-dev пишет:
Hi,
I've noticed that clang does not work properly when
using task reductions with non-constant arrays and/or
classes with omp_orig constructor initializers.
As reduction initialize/combine/finalize functions
only receive references to whatever should be initialized,
combined..., clang generates some additional storages to
save the size of the reduction array and a pointer to the
omp_orig. That storages are generated through
__kmpc_thread_private_cached() calls in both reduction
functions and outline task function.
The problem is that the storage used in outline task
function does not match the storage in reduction functions.
This happens in programs that have a task reduction or a
taskloop reduction with the in_reduction clause.
To solve this i made an ugly patch that reuses the
SourceLocation of
CodeGenFunction::EmitOMPTaskgroupDirective() in
CodeGenFunction::EmitOMPTaskBasedDirective() to match the
storages and make it work. Anyway, i suppose that there is a
better way to fix this.
Index: CGStmtOpenMP.cpp
===================================================================
--- CGStmtOpenMP.cpp (revision 326685)
+++ CGStmtOpenMP.cpp (working copy)
@@ -2713,6 +2713,8 @@
emitEmptyBoundParameters);
}
+static SourceLocation reductionLoc;
+
void CodeGenFunction::EmitOMPTaskBasedDirective(
const OMPExecutableDirective &S, const
OpenMPDirectiveKind CapturedRegion,
const RegionCodeGenTy &BodyGen, const
TaskGenTy &TaskGen,
@@ -2952,7 +2954,7 @@
// FIXME: This must removed once the
runtime library is fixed.
// Emit required threadprivate variables
for
// initilizer/combiner/finalizer.
-
CGF.CGM.getOpenMPRuntime().emitTaskReductionFixups(CGF,
S.getLocStart(),
+
CGF.CGM.getOpenMPRuntime().emitTaskReductionFixups(CGF,
reductionLoc,
RedCG, Cnt);
}
}
@@ -3180,8 +3182,9 @@
std::advance(IRHS, 1);
}
}
+ reductionLoc = S.getLocStart();
llvm::Value *ReductionDesc =
-
CGF.CGM.getOpenMPRuntime().emitTaskReductionInit(CGF,
S.getLocStart(),
+
CGF.CGM.getOpenMPRuntime().emitTaskReductionInit(CGF,
reductionLoc,
LHSs, RHSs, Data);
const auto *VD =
cast<VarDecl>(cast<DeclRefExpr>(E)->getDecl());
CGF.EmitVarDecl(*VD);
Regards,
Raúl
http://bsc.es/disclaimer
WARNING / LEGAL TEXT: This message is intended only for the use of
the
individual or entity to which it is addressed and may contain
information which is privileged, confidential, proprietary, or
exempt
from disclosure under applicable law. If you are not the intended
recipient or the person responsible for delivering the message to
the
intended recipient, you are strictly prohibited from disclosing,
distributing, copying, or in any way using this message. If you
have
received this communication in error, please notify the sender and
destroy and delete any copies you may have received.
http://www.bsc.es/disclaimer
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
|
|
Hi,
I mean that it seems that this implementation
restricts the tasks that participate in the reduction to be
defined in the same lexical scope. I have not been able to find
this restriction in the OpenMP specification. What the spec says
about this topic is:
"A list item that appears in an in_reduction clause of a
task-generating construct must appear
in a task_reduction clause of a construct corresponding to
a taskgroup region that includes
the participating task in its taskgroup set. The construct
corresponding to the innermost region
that meets this condition must specify the same
reduction-identifier as the in_reduction
clause."
I tried to compile the attached file. The compile command is:
clang -fopenmp task_red.cc
Regards,
Raúl
El 08/03/18 a las 15:30, Alexey Bataev
escribió:
Hi,
>>> Anyway, this implementation of task
reductions is not fully functional, as only are allowed if the
task is created in the scope of taskgroup.
What do you mean?Could you provide some more detais?
-------------
Best regards,
Alexey Bataev
08.03.2018 7:47, Raúl Peñacoba пишет:
Hi,
I did some little research and I have realized that the
patch I attached doesn't fix the problem completely. I fixed
the storage of the data. When a using task reduction with
vla, or a class that uses omp_orig to initialize the private
copies, the size of the vla and omp_orig are stored in a
storage using __kmpc_threadprivate_cached().
The problem comes because, since in outline parallel
function each reduction has lazy_priv flag, the
initialization will be done in
__kmpc_task_reduction_get_th_data(), called from outline
task function. This is done before
storing the vla size or omp_orig, so the initialization
would be wrong in these cases.
I think that if the storage is put before the
__kmpc_task_reduction_get_th_data() it should work.
Anyway, this implementation of task reductions is not
fully functional, as only are allowed if the task is created
in the scope of taskgroup.
Sorry for the mail duplicate.
Regards,
Raúl
El 06/03/18 a las 16:21, Alexey
Bataev escribió:
Hi, thanks for the report, I'll take a look.
-------------
Best regards,
Alexey Bataev
06.03.2018 4:25, Raúl
Peñacoba via cfe-dev пишет:
Hi,
I've noticed that clang does not work properly
when using task reductions with non-constant arrays and/or
classes with omp_orig constructor initializers.
As reduction initialize/combine/finalize functions
only receive references to whatever should be initialized,
combined..., clang generates some additional storages to
save the size of the reduction array and a pointer to the
omp_orig. That storages are generated through
__kmpc_thread_private_cached() calls in both reduction
functions and outline task function.
The problem is that the storage used in outline
task function does not match the storage in reduction
functions. This happens in programs that have a task
reduction or a taskloop reduction with the in_reduction
clause.
To solve this i made an ugly patch that reuses the
SourceLocation of
CodeGenFunction::EmitOMPTaskgroupDirective() in
CodeGenFunction::EmitOMPTaskBasedDirective() to match the
storages and make it work. Anyway, i suppose that there is
a better way to fix this.
Index: CGStmtOpenMP.cpp
===================================================================
--- CGStmtOpenMP.cpp (revision 326685)
+++ CGStmtOpenMP.cpp (working copy)
@@ -2713,6 +2713,8 @@
emitEmptyBoundParameters);
}
+static SourceLocation reductionLoc;
+
void CodeGenFunction::EmitOMPTaskBasedDirective(
const OMPExecutableDirective &S, const
OpenMPDirectiveKind CapturedRegion,
const RegionCodeGenTy &BodyGen, const
TaskGenTy &TaskGen,
@@ -2952,7 +2954,7 @@
// FIXME: This must removed once the
runtime library is fixed.
// Emit required threadprivate variables
for
// initilizer/combiner/finalizer.
-
CGF.CGM.getOpenMPRuntime().emitTaskReductionFixups(CGF,
S.getLocStart(),
+
CGF.CGM.getOpenMPRuntime().emitTaskReductionFixups(CGF,
reductionLoc,
RedCG, Cnt);
}
}
@@ -3180,8 +3182,9 @@
std::advance(IRHS, 1);
}
}
+ reductionLoc = S.getLocStart();
llvm::Value *ReductionDesc =
-
CGF.CGM.getOpenMPRuntime().emitTaskReductionInit(CGF,
S.getLocStart(),
+
CGF.CGM.getOpenMPRuntime().emitTaskReductionInit(CGF,
reductionLoc,
LHSs, RHSs, Data);
const auto *VD =
cast<VarDecl>(cast<DeclRefExpr>(E)->getDecl());
CGF.EmitVarDecl(*VD);
Regards,
Raúl
http://bsc.es/disclaimer
WARNING / LEGAL TEXT: This message is intended only for the use
of the individual or entity to which it is addressed and may
contain information which is privileged, confidential,
proprietary, or exempt from disclosure under applicable law. If
you are not the intended recipient or the person responsible for
delivering the message to the intended recipient, you are
strictly prohibited from disclosing, distributing, copying, or
in any way using this message. If you have received this
communication in error, please notify the sender and destroy and
delete any copies you may have received.
http://www.bsc.es/disclaimer
WARNING / LEGAL TEXT: This message is intended only for the use of the
individual or entity to which it is addressed and may contain
information which is privileged, confidential, proprietary, or exempt
from disclosure under applicable law. If you are not the intended
recipient or the person responsible for delivering the message to the
intended recipient, you are strictly prohibited from disclosing,
distributing, copying, or in any way using this message. If you have
received this communication in error, please notify the sender and
destroy and delete any copies you may have received.
http://www.bsc.es/disclaimer
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
|
|
Your code is not correct. Standard does not say, that the reduced items must share the same memory, it says that it must use the same variables. In your example the variables are different, though they share the same memory.
-------------
Best regards,
Alexey Bataev
08.03.2018 14:03, Raúl Peñacoba пишет:
>
> Hi,
>
> I mean that it seems that this implementation restricts the tasks that
> participate in the reduction to be defined in the same lexical scope.
> I have not been able to find this restriction in the OpenMP
> specification. What the spec says about this topic is:
>
> "A list item that appears in an in_reduction clause of a
> task-generating construct must appear
> in a task_reduction clause of a construct corresponding to a taskgroup
> region that includes
> the participating task in its taskgroup set. The construct
> corresponding to the innermost region
> that meets this condition must specify the same reduction-identifier
> as the in_reduction
> clause."
>
> I tried to compile the attached file. The compile command is:
>
> clang -fopenmp task_red.cc
>
> Regards,
> Raúl
>
> El 08/03/18 a las 15:30, Alexey Bataev escribió:
>>
>> Hi,
>>
>> >>> Anyway, this implementation of task reductions is not fully
>> functional, as only are allowed if the task is created in the scope
>> of taskgroup.
>>
>> What do you mean?Could you provide some more detais?
>>
>>
>> -------------
>> Best regards,
>> Alexey Bataev
>> 08.03.2018 7:47, Raúl Peñacoba пишет:
>>>
>>> Hi,
>>>
>>> I did some little research and I have realized that the patch I
>>> attached doesn't fix the problem completely. I fixed the storage of
>>> the data. When a using task reduction with vla, or a class that uses
>>> omp_orig to initialize the private copies, the size of the vla and
>>> omp_orig are stored in a storage using __kmpc_threadprivate_cached().
>>>
>>> The problem comes because, since in outline parallel function each
>>> reduction has lazy_priv flag, the initialization will be done in
>>> __kmpc_task_reduction_get_th_data(), called from outline task
>>> function. This is done _before_storing the vla size or omp_orig, so
>>> the initialization would be wrong in these cases.
>>>
>>> I think that if the storage is put before the
>>> __kmpc_task_reduction_get_th_data() it should work.
>>>
>>> Anyway, this implementation of task reductions is not fully
>>> functional, as only are allowed if the task is created in the scope
>>> of taskgroup.
>>>
>>> Sorry for the mail duplicate.
>>>
>>> Regards,
>>>
>>> Raúl
>>>
>>> El 06/03/18 a las 16:21, Alexey Bataev escribió:
>>>>
>>>> Hi, thanks for the report, I'll take a look.
>>>>
>>>> -------------
>>>> Best regards,
>>>> Alexey Bataev
>>>> 06.03.2018 4:25, Raúl Peñacoba via cfe-dev пишет:
>>>>> Hi,
>>>>>
>>>>> I've noticed that clang does not work properly when using task
>>>>> reductions with non-constant arrays and/or classes with omp_orig
>>>>> constructor initializers.
>>>>>
>>>>> As reduction initialize/combine/finalize functions only receive
>>>>> references to whatever should be initialized, combined..., clang
>>>>> generates some additional storages to save the size of the
>>>>> reduction array and a pointer to the omp_orig. That storages are
>>>>> generated through __kmpc_thread_private_cached() calls in both
>>>>> reduction functions and outline task function.
>>>>>
>>>>> The problem is that the storage used in outline task function does
>>>>> not match the storage in reduction functions. This happens in
>>>>> programs that have a task reduction or a taskloop reduction with
>>>>> the in_reduction clause.
>>>>>
>>>>> To solve this i made an ugly patch that reuses the SourceLocation
>>>>> of CodeGenFunction::EmitOMPTaskgroupDirective() in
>>>>> CodeGenFunction::EmitOMPTaskBasedDirective() to match the storages
>>>>> and make it work. Anyway, i suppose that there is a better way to
>>>>> fix this.
>>>>>
>>>>> Index: CGStmtOpenMP.cpp
>>>>> ===================================================================
>>>>> --- CGStmtOpenMP.cpp (revision 326685)
>>>>> +++ CGStmtOpenMP.cpp (working copy)
>>>>> @@ -2713,6 +2713,8 @@
>>>>> emitEmptyBoundParameters);
>>>>> }
>>>>>
>>>>> +static SourceLocation reductionLoc;
>>>>> +
>>>>> void CodeGenFunction::EmitOMPTaskBasedDirective(
>>>>> const OMPExecutableDirective &S, const OpenMPDirectiveKind
>>>>> CapturedRegion,
>>>>> const RegionCodeGenTy &BodyGen, const TaskGenTy &TaskGen,
>>>>> @@ -2952,7 +2954,7 @@
>>>>> // FIXME: This must removed once the runtime library is
>>>>> fixed.
>>>>> // Emit required threadprivate variables for
>>>>> // initilizer/combiner/finalizer.
>>>>> - CGF.CGM.getOpenMPRuntime().emitTaskReductionFixups(CGF,
>>>>> S.getLocStart(),
>>>>> + CGF.CGM.getOpenMPRuntime().emitTaskReductionFixups(CGF,
>>>>> reductionLoc,
>>>>> RedCG, Cnt);
>>>>> }
>>>>> }
>>>>> @@ -3180,8 +3182,9 @@
>>>>> std::advance(IRHS, 1);
>>>>> }
>>>>> }
>>>>> + reductionLoc = S.getLocStart();
>>>>> llvm::Value *ReductionDesc =
>>>>> - CGF.CGM.getOpenMPRuntime().emitTaskReductionInit(CGF,
>>>>> S.getLocStart(),
>>>>> + CGF.CGM.getOpenMPRuntime().emitTaskReductionInit(CGF,
>>>>> reductionLoc,
>>>>> LHSs, RHSs, Data);
>>>>> const auto *VD =
>>>>> cast<VarDecl>(cast<DeclRefExpr>(E)->getDecl());
>>>>> CGF.EmitVarDecl(*VD);
>>>>>
>>>>> Regards,
>>>>>
>>>>> Raúl
>>>>>
>>>>>
>>>>> http://bsc.es/disclaimer>>>>>
>>>>
>>>
>>>
>>>
>>> WARNING / LEGAL TEXT: This message is intended only for the use of
>>> the individual or entity to which it is addressed and may contain
>>> information which is privileged, confidential, proprietary, or
>>> exempt from disclosure under applicable law. If you are not the
>>> intended recipient or the person responsible for delivering the
>>> message to the intended recipient, you are strictly prohibited from
>>> disclosing, distributing, copying, or in any way using this message.
>>> If you have received this communication in error, please notify the
>>> sender and destroy and delete any copies you may have received.
>>>
>>> http://www.bsc.es/disclaimer>>> < https://nam03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.bsc.es%2Fdisclaimer&data=02%7C01%7C%7C2a787fae16d1409fb12808d584f2b8da%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636561100369995188&sdata=Qg460eDN4kZxnL4NIiP06MjaE85rnbyx2M%2FyVR7jif8%3D&reserved=0>
>>>
>>
>
>
>
> WARNING / LEGAL TEXT: This message is intended only for the use of the
> individual or entity to which it is addressed and may contain
> information which is privileged, confidential, proprietary, or exempt
> from disclosure under applicable law. If you are not the intended
> recipient or the person responsible for delivering the message to the
> intended recipient, you are strictly prohibited from disclosing,
> distributing, copying, or in any way using this message. If you have
> received this communication in error, please notify the sender and
> destroy and delete any copies you may have received.
>
> http://www.bsc.es/disclaimer> < https://eur01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.bsc.es%2Fdisclaimer&data=02%7C01%7C%7Ca45b287f1ee24a78c93808d585273f8e%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636561325966851856&sdata=IYNPTvByADBir%2F950dCUevcJHWEIOo90Xx5raSJ2rS8%3D&reserved=0>
>
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
|
|
I see. In that case, what about using task reductions with a global
variable like the attached example?
I've seen that you fixed the storage calls. I've tested it but still
doesn't work omp_orig initialization. If I'm not wrong, the reduction
init function use the pointer returned by __kmpc_threadprivate_cached()
directly without dereference.
Regards,
Raúl
El 08/03/18 a las 20:10, Alexey Bataev escribió:
> Your code is not correct. Standard does not say, that the reduced items must share the same memory, it says that it must use the same variables. In your example the variables are different, though they share the same memory.
>
> -------------
> Best regards,
> Alexey Bataev
>
> 08.03.2018 14:03, Raúl Peñacoba пишет:
>> Hi,
>>
>> I mean that it seems that this implementation restricts the tasks that
>> participate in the reduction to be defined in the same lexical scope.
>> I have not been able to find this restriction in the OpenMP
>> specification. What the spec says about this topic is:
>>
>> "A list item that appears in an in_reduction clause of a
>> task-generating construct must appear
>> in a task_reduction clause of a construct corresponding to a taskgroup
>> region that includes
>> the participating task in its taskgroup set. The construct
>> corresponding to the innermost region
>> that meets this condition must specify the same reduction-identifier
>> as the in_reduction
>> clause."
>>
>> I tried to compile the attached file. The compile command is:
>>
>> clang -fopenmp task_red.cc
>>
>> Regards,
>> Raúl
>>
>> El 08/03/18 a las 15:30, Alexey Bataev escribió:
>>> Hi,
>>>
>>>>>> Anyway, this implementation of task reductions is not fully
>>> functional, as only are allowed if the task is created in the scope
>>> of taskgroup.
>>>
>>> What do you mean?Could you provide some more detais?
>>>
>>>
>>> -------------
>>> Best regards,
>>> Alexey Bataev
>>> 08.03.2018 7:47, Raúl Peñacoba пишет:
>>>> Hi,
>>>>
>>>> I did some little research and I have realized that the patch I
>>>> attached doesn't fix the problem completely. I fixed the storage of
>>>> the data. When a using task reduction with vla, or a class that uses
>>>> omp_orig to initialize the private copies, the size of the vla and
>>>> omp_orig are stored in a storage using __kmpc_threadprivate_cached().
>>>>
>>>> The problem comes because, since in outline parallel function each
>>>> reduction has lazy_priv flag, the initialization will be done in
>>>> __kmpc_task_reduction_get_th_data(), called from outline task
>>>> function. This is done _before_storing the vla size or omp_orig, so
>>>> the initialization would be wrong in these cases.
>>>>
>>>> I think that if the storage is put before the
>>>> __kmpc_task_reduction_get_th_data() it should work.
>>>>
>>>> Anyway, this implementation of task reductions is not fully
>>>> functional, as only are allowed if the task is created in the scope
>>>> of taskgroup.
>>>>
>>>> Sorry for the mail duplicate.
>>>>
>>>> Regards,
>>>>
>>>> Raúl
>>>>
>>>> El 06/03/18 a las 16:21, Alexey Bataev escribió:
>>>>> Hi, thanks for the report, I'll take a look.
>>>>>
>>>>> -------------
>>>>> Best regards,
>>>>> Alexey Bataev
>>>>> 06.03.2018 4:25, Raúl Peñacoba via cfe-dev пишет:
>>>>>> Hi,
>>>>>>
>>>>>> I've noticed that clang does not work properly when using task
>>>>>> reductions with non-constant arrays and/or classes with omp_orig
>>>>>> constructor initializers.
>>>>>>
>>>>>> As reduction initialize/combine/finalize functions only receive
>>>>>> references to whatever should be initialized, combined..., clang
>>>>>> generates some additional storages to save the size of the
>>>>>> reduction array and a pointer to the omp_orig. That storages are
>>>>>> generated through __kmpc_thread_private_cached() calls in both
>>>>>> reduction functions and outline task function.
>>>>>>
>>>>>> The problem is that the storage used in outline task function does
>>>>>> not match the storage in reduction functions. This happens in
>>>>>> programs that have a task reduction or a taskloop reduction with
>>>>>> the in_reduction clause.
>>>>>>
>>>>>> To solve this i made an ugly patch that reuses the SourceLocation
>>>>>> of CodeGenFunction::EmitOMPTaskgroupDirective() in
>>>>>> CodeGenFunction::EmitOMPTaskBasedDirective() to match the storages
>>>>>> and make it work. Anyway, i suppose that there is a better way to
>>>>>> fix this.
>>>>>>
>>>>>> Index: CGStmtOpenMP.cpp
>>>>>> ===================================================================
>>>>>> --- CGStmtOpenMP.cpp (revision 326685)
>>>>>> +++ CGStmtOpenMP.cpp (working copy)
>>>>>> @@ -2713,6 +2713,8 @@
>>>>>> emitEmptyBoundParameters);
>>>>>> }
>>>>>>
>>>>>> +static SourceLocation reductionLoc;
>>>>>> +
>>>>>> void CodeGenFunction::EmitOMPTaskBasedDirective(
>>>>>> const OMPExecutableDirective &S, const OpenMPDirectiveKind
>>>>>> CapturedRegion,
>>>>>> const RegionCodeGenTy &BodyGen, const TaskGenTy &TaskGen,
>>>>>> @@ -2952,7 +2954,7 @@
>>>>>> // FIXME: This must removed once the runtime library is
>>>>>> fixed.
>>>>>> // Emit required threadprivate variables for
>>>>>> // initilizer/combiner/finalizer.
>>>>>> - CGF.CGM.getOpenMPRuntime().emitTaskReductionFixups(CGF,
>>>>>> S.getLocStart(),
>>>>>> + CGF.CGM.getOpenMPRuntime().emitTaskReductionFixups(CGF,
>>>>>> reductionLoc,
>>>>>> RedCG, Cnt);
>>>>>> }
>>>>>> }
>>>>>> @@ -3180,8 +3182,9 @@
>>>>>> std::advance(IRHS, 1);
>>>>>> }
>>>>>> }
>>>>>> + reductionLoc = S.getLocStart();
>>>>>> llvm::Value *ReductionDesc =
>>>>>> - CGF.CGM.getOpenMPRuntime().emitTaskReductionInit(CGF,
>>>>>> S.getLocStart(),
>>>>>> + CGF.CGM.getOpenMPRuntime().emitTaskReductionInit(CGF,
>>>>>> reductionLoc,
>>>>>> LHSs, RHSs, Data);
>>>>>> const auto *VD =
>>>>>> cast<VarDecl>(cast<DeclRefExpr>(E)->getDecl());
>>>>>> CGF.EmitVarDecl(*VD);
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Raúl
>>>>>>
>>>>>>
>>>>>> http://bsc.es/disclaimer>>>>>>
>>>>
>>>>
>>>> WARNING / LEGAL TEXT: This message is intended only for the use of
>>>> the individual or entity to which it is addressed and may contain
>>>> information which is privileged, confidential, proprietary, or
>>>> exempt from disclosure under applicable law. If you are not the
>>>> intended recipient or the person responsible for delivering the
>>>> message to the intended recipient, you are strictly prohibited from
>>>> disclosing, distributing, copying, or in any way using this message.
>>>> If you have received this communication in error, please notify the
>>>> sender and destroy and delete any copies you may have received.
>>>>
>>>> http://www.bsc.es/disclaimer>>>> < https://nam03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.bsc.es%2Fdisclaimer&data=02%7C01%7C%7C2a787fae16d1409fb12808d584f2b8da%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636561100369995188&sdata=Qg460eDN4kZxnL4NIiP06MjaE85rnbyx2M%2FyVR7jif8%3D&reserved=0>
>>>>
>>
>>
>> WARNING / LEGAL TEXT: This message is intended only for the use of the
>> individual or entity to which it is addressed and may contain
>> information which is privileged, confidential, proprietary, or exempt
>> from disclosure under applicable law. If you are not the intended
>> recipient or the person responsible for delivering the message to the
>> intended recipient, you are strictly prohibited from disclosing,
>> distributing, copying, or in any way using this message. If you have
>> received this communication in error, please notify the sender and
>> destroy and delete any copies you may have received.
>>
>> http://www.bsc.es/disclaimer>> < https://eur01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.bsc.es%2Fdisclaimer&data=02%7C01%7C%7Ca45b287f1ee24a78c93808d585273f8e%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636561325966851856&sdata=IYNPTvByADBir%2F950dCUevcJHWEIOo90Xx5raSJ2rS8%3D&reserved=0>
>>
>
http://bsc.es/disclaimer_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
|
|
1. I don't think that this is correct construct too. Since we don't support such kind of construct for locals, we should not support it for globals,
since OpenMP does not have any difference between locals and globals. I think it is just the problem of the standard that it is not quite clear about it.
Probably, we can support this construct, but not sure that it is required. I don't want to spend my time implementing the things that won't be used and
not allowed by the standard.
2. I'll check it. All the problems with it because runtime is not quite correct and does not fully support task reductions. Because of that I had to introduce
all that stuff with the threadprivates.
-------------
Best regards,
Alexey Bataev
09.03.2018 3:57, Raúl Peñacoba пишет:
I see. In
that case, what about using task reductions with a global variable
like the attached example?
I've seen that you fixed the storage calls. I've tested it but
still doesn't work omp_orig initialization. If I'm not wrong, the
reduction init function use the pointer returned by
__kmpc_threadprivate_cached() directly without dereference.
Regards,
Raúl
El 08/03/18 a las 20:10, Alexey Bataev escribió:
Your code is not correct. Standard does
not say, that the reduced items must share the same memory, it
says that it must use the same variables. In your example the
variables are different, though they share the same memory.
-------------
Best regards,
Alexey Bataev
08.03.2018 14:03, Raúl Peñacoba пишет:
Hi,
I mean that it seems that this implementation restricts the
tasks that
participate in the reduction to be defined in the same lexical
scope.
I have not been able to find this restriction in the OpenMP
specification. What the spec says about this topic is:
"A list item that appears in an in_reduction clause of a
task-generating construct must appear
in a task_reduction clause of a construct corresponding to a
taskgroup
region that includes
the participating task in its taskgroup set. The construct
corresponding to the innermost region
that meets this condition must specify the same
reduction-identifier
as the in_reduction
clause."
I tried to compile the attached file. The compile command is:
clang -fopenmp task_red.cc
Regards,
Raúl
El 08/03/18 a las 15:30, Alexey Bataev escribió:
Hi,
Anyway, this implementation of
task reductions is not fully
functional, as only are allowed if the task is created in
the scope
of taskgroup.
What do you mean?Could you provide some more detais?
-------------
Best regards,
Alexey Bataev
08.03.2018 7:47, Raúl Peñacoba пишет:
Hi,
I did some little research and I have realized that the
patch I
attached doesn't fix the problem completely. I fixed the
storage of
the data. When a using task reduction with vla, or a class
that uses
omp_orig to initialize the private copies, the size of the
vla and
omp_orig are stored in a storage using
__kmpc_threadprivate_cached().
The problem comes because, since in outline parallel
function each
reduction has lazy_priv flag, the initialization will be
done in
__kmpc_task_reduction_get_th_data(), called from outline
task
function. This is done _before_storing the vla size or
omp_orig, so
the initialization would be wrong in these cases.
I think that if the storage is put before the
__kmpc_task_reduction_get_th_data() it should work.
Anyway, this implementation of task reductions is not
fully
functional, as only are allowed if the task is created in
the scope
of taskgroup.
Sorry for the mail duplicate.
Regards,
Raúl
El 06/03/18 a las 16:21, Alexey Bataev escribió:
Hi, thanks for the report, I'll
take a look.
-------------
Best regards,
Alexey Bataev
06.03.2018 4:25, Raúl Peñacoba via cfe-dev пишет:
Hi,
I've noticed that clang does not work properly when
using task
reductions with non-constant arrays and/or classes
with omp_orig
constructor initializers.
As reduction initialize/combine/finalize functions
only receive
references to whatever should be initialized,
combined..., clang
generates some additional storages to save the size of
the
reduction array and a pointer to the omp_orig. That
storages are
generated through __kmpc_thread_private_cached() calls
in both
reduction functions and outline task function.
The problem is that the storage used in outline task
function does
not match the storage in reduction functions. This
happens in
programs that have a task reduction or a taskloop
reduction with
the in_reduction clause.
To solve this i made an ugly patch that reuses the
SourceLocation
of CodeGenFunction::EmitOMPTaskgroupDirective() in
CodeGenFunction::EmitOMPTaskBasedDirective() to match
the storages
and make it work. Anyway, i suppose that there is a
better way to
fix this.
Index: CGStmtOpenMP.cpp
===================================================================
--- CGStmtOpenMP.cpp (revision 326685)
+++ CGStmtOpenMP.cpp (working copy)
@@ -2713,6 +2713,8 @@
emitEmptyBoundParameters);
}
+static SourceLocation reductionLoc;
+
void CodeGenFunction::EmitOMPTaskBasedDirective(
const OMPExecutableDirective &S, const
OpenMPDirectiveKind
CapturedRegion,
const RegionCodeGenTy &BodyGen, const
TaskGenTy &TaskGen,
@@ -2952,7 +2954,7 @@
// FIXME: This must removed once the runtime
library is
fixed.
// Emit required threadprivate variables for
// initilizer/combiner/finalizer.
-
CGF.CGM.getOpenMPRuntime().emitTaskReductionFixups(CGF,
S.getLocStart(),
+
CGF.CGM.getOpenMPRuntime().emitTaskReductionFixups(CGF,
reductionLoc,
RedCG, Cnt);
}
}
@@ -3180,8 +3182,9 @@
std::advance(IRHS, 1);
}
}
+ reductionLoc = S.getLocStart();
llvm::Value *ReductionDesc =
-
CGF.CGM.getOpenMPRuntime().emitTaskReductionInit(CGF,
S.getLocStart(),
+
CGF.CGM.getOpenMPRuntime().emitTaskReductionInit(CGF,
reductionLoc,
LHSs, RHSs, Data);
const auto *VD =
cast<VarDecl>(cast<DeclRefExpr>(E)->getDecl());
CGF.EmitVarDecl(*VD);
Regards,
Raúl
https://nam02.safelinks.protection.outlook.com/?url=http%3A%2F%2Fbsc.es%2Fdisclaimer&data=02%7C01%7C%7Cbdc4c4de2d04480706e308d5859bcff2%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636561826607210068&sdata=8EdV%2BOxP0hlHRaTsWdbDjPGQZsjKrQVwB4UBpIDscsU%3D&reserved=0
WARNING / LEGAL TEXT: This message is intended only for
the use of
the individual or entity to which it is addressed and may
contain
information which is privileged, confidential,
proprietary, or
exempt from disclosure under applicable law. If you are
not the
intended recipient or the person responsible for
delivering the
message to the intended recipient, you are strictly
prohibited from
disclosing, distributing, copying, or in any way using
this message.
If you have received this communication in error, please
notify the
sender and destroy and delete any copies you may have
received.
https://nam02.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.bsc.es%2Fdisclaimer&data=02%7C01%7C%7Cbdc4c4de2d04480706e308d5859bcff2%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636561826607210068&sdata=VijKwX7k2dgfvTE7yBrXjch1lF7lYB2RNMKCW273btM%3D&reserved=0
<https://nam03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.bsc.es%2Fdisclaimer&data=02%7C01%7C%7C2a787fae16d1409fb12808d584f2b8da%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636561100369995188&sdata=Qg460eDN4kZxnL4NIiP06MjaE85rnbyx2M%2FyVR7jif8%3D&reserved=0>
WARNING / LEGAL TEXT: This message is intended only for the
use of the
individual or entity to which it is addressed and may contain
information which is privileged, confidential, proprietary, or
exempt
from disclosure under applicable law. If you are not the
intended
recipient or the person responsible for delivering the message
to the
intended recipient, you are strictly prohibited from
disclosing,
distributing, copying, or in any way using this message. If
you have
received this communication in error, please notify the sender
and
destroy and delete any copies you may have received.
https://nam02.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.bsc.es%2Fdisclaimer&data=02%7C01%7C%7Cbdc4c4de2d04480706e308d5859bcff2%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636561826607210068&sdata=VijKwX7k2dgfvTE7yBrXjch1lF7lYB2RNMKCW273btM%3D&reserved=0
<https://eur01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.bsc.es%2Fdisclaimer&data=02%7C01%7C%7Ca45b287f1ee24a78c93808d585273f8e%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636561325966851856&sdata=IYNPTvByADBir%2F950dCUevcJHWEIOo90Xx5raSJ2rS8%3D&reserved=0>
https://nam02.safelinks.protection.outlook.com/?url=http%3A%2F%2Fbsc.es%2Fdisclaimer&data=02%7C01%7C%7Cbdc4c4de2d04480706e308d5859bcff2%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636561826607210068&sdata=8EdV%2BOxP0hlHRaTsWdbDjPGQZsjKrQVwB4UBpIDscsU%3D&reserved=0
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
|
|