|
|
Hi,
At the moment the vectorize_width(X) #pragma is used to provide hints to LLVM
about which vectorisation factor to use. The unsigned argument ‘X’ used to match
the NumElements property in the VectorType class; however, VectorType is now
defined in terms of an ElementCount class.
I’d like to propose an extension to the vectorize_width #pragma that now takes
an optional second parameter of ‘fixed’ or ‘scalable’ that matches up with
ElementCount. When not specified the default value would be ‘fixed’. A few
examples of how this would look are shown below:
// Vectorize the loop with <4 x eltty>
#pragma clang loop vectorize_width(4)
#pragma clang loop vectorize_width(4, fixed)
// Vectorize the loop with <vscale x 4 x eltty>
#pragma clang loop vectorize_width(4, scalable)
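For context, here is a minimal sketch of where such a hint sits in C source. The function name scale_by_two is made up for illustration. Only the existing single-argument form is used as a live pragma; the two-argument forms are the subject of this proposal, so they appear only in comments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example function: writes 2 * src[i] into dst[i]. */
void scale_by_two(float *dst, const float *src, size_t n) {
    assert(dst != NULL && src != NULL);
    /* Existing form: hint a vectorisation factor of 4, i.e. <4 x float>.
     * Under the proposal this would be the same as vectorize_width(4, fixed),
     * while vectorize_width(4, scalable) would hint <vscale x 4 x float>. */
#if defined(__clang__)
#pragma clang loop vectorize_width(4)
#endif
    for (size_t i = 0; i < n; ++i)
        dst[i] = 2.0f * src[i];
}
```

The pragma is only a hint, so the function computes the same result whether or not the loop is vectorised.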
As a further extension I’d also like to permit vectorize_width(fixed|scalable) to
allow users to hint at the type of vector used without specifying the
vectorisation factor. Examples of this would be:
// Vectorize the loop with <N x eltty> for a profitable N
#pragma clang loop vectorize_width(fixed)
// Vectorize the loop with <vscale x N x eltty> for a profitable N
#pragma clang loop vectorize_width(scalable)
Any thoughts you have would be much appreciated!
Kind Regards,
David Sherwood.
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
|
|
Hi David,
Thanks for bringing this up here. We have discussed this already on
https://reviews.llvm.org/D89031 and a bit offline, and it would be good to get some other opinions on this too.
What we achieve with this extension is that we can toggle fixed/scalable vectorisation. The proposal is to add this property to vectorize_width, because it kind of defines the VectorType, which consists of the element count and the scalable/fixed part, which sounds reasonable. However, there are other loop pragmas that (implicitly) enable vectorisation:
#pragma clang loop interleave_count(some-number)
or
#pragma clang loop vectorize_predicate(enable)
for which you may want to toggle fixed|scalable vectorisation. If this is correct, then I think the current proposal/implementation is incomplete and/or inconsistent.
I think your own suggestion was to introduce a vectorization_style(enable|disable) at some point, but my proposal would be to use that instead of adjusting vectorize_width as that would address the incompleteness/inconsistency issue. Besides this, but more subjective, I don't see all the new combinations of vectorize_width() as making things clearer:
vectorize_width(VF)
vectorize_width(VF, fixed|scalable)
vectorize_width(fixed|scalable)
Probably the implementation of adding vectorization_style(enable|disable) is easier and less contentious than adjusting an existing one, so altogether I don't see why the approach of adjusting vectorize_width would be preferred. But I might be wrong or missing something, so I welcome other views on this.
Cheers,
Sjoerd.
|
|
One typo fixed inline
I think your own suggestion was to introduce a vectorization_style(enable|disable) at some point,
I meant vectorization_style(fixed|scalable)
Cheers,
Sjoerd.
|
|
Hi Sjoerd,
As I understand it the interleave count is orthogonal to the vectorization factor and
one does not imply the other. I think the clang documentation gives an example of
this:
#pragma clang loop vectorize_width(2)
#pragma clang loop interleave_count(2)
for(...) {
...
}
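A compilable variant of that documentation-style example might look like the sketch below. The function add_arrays is hypothetical and exists only to give the two hints a real loop to attach to; the vectorisation factor and the interleave count are given as two separate, independent hints.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: element-wise sum of two arrays. */
void add_arrays(int *out, const int *a, const int *b, size_t n) {
    assert(out != NULL && a != NULL && b != NULL);
    /* Two independent hints: vectorisation factor 2 and interleave count 2. */
#if defined(__clang__)
#pragma clang loop vectorize_width(2)
#pragma clang loop interleave_count(2)
#endif
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
}
```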
Also, I believe that each pragma that we set is a hint for one unit of the
loop vectorizer. It is true that vectorize_predicate enables vectorization, but
the vectorizer will always choose what it thinks is the most profitable
vectorization factor, which could be fixed or scalable. If you wanted to hint
to the compiler that we should use scalable vectors with my proposal you’d
simply add an extra pragma, i.e.
#pragma clang loop vectorize_predicate(enable) vectorize_width(scalable)
Kind Regards,
David.
|
|
> If you wanted to hint
> to the compiler that we should use scalable vectors with my proposal you’d
> simply add an extra pragma, i.e.
> #pragma clang loop vectorize_predicate(enable) vectorize_width(scalable)
Ah yes, that might have been the thing that I missed, but that would indeed then be equivalent to:
#pragma clang loop vectorize_predicate(enable) vectorize_style(scalable)
I think that leaves us with 2 options that can express the same things, i.e. change or introduce:
1)
vectorize_width(VF, fixed|scalable)
vectorize_width(fixed|scalable)
vectorize_width(VF)
2)
vectorize_style(fixed|scalable)
And then it's probably more of a style question and not that important if there are no implementation or usability issues overloading vectorize_width.
Cheers,
Sjoerd.
|
|
My feeling is this is not just a question of style but includes an element of design. Where possible we want to express vectorisation factors/element counts as a single unit, hence the proposal to extend vectorize_width as this is the unit of information that it controls.
|
|
Hi David,
Your proposal looks sensible to me. I understand that for reasons of evolution of the pragma, you chose to give it `fixed` semantics if no explicit mark of vectorisation style appears, right?
Is this something in the future we'd want to relax? This way the target could also pick the best vectorization style (borrowing Sjoerd's terminology here).
Perhaps we could define a `vectorize_style(any)` as well. That would be the one used if no explicit `vectorize_style` is specified.
> As a further extension I’d also like to permit vectorize_width(fixed|scalable) to
> allow users to hint at the type of vector used without specifying the
> vectorisation factor. Examples of this would be:
> // Vectorize the loop with <N x eltty> for a profitable N
> #pragma clang loop vectorize_width(fixed)
> // Vectorize the loop with <vscale x N x eltty> for a profitable N
> #pragma clang loop vectorize_width(scalable)
In those cases, I imagine `vectorize_style` could be enough and we avoid having a `vectorize_width` that doesn't actually tell us the width (or the factor of the actual width, for scalables). But this falls in the "aesthetics" category, I think.
Kind regards,
|
|
Is "style" the right terminology? Since it affects semantics, I would
prefer some other terminology.
How about vectorize_scalable(enable|disable)?
Michael
On Wed, 25 Nov 2020 at 07:18, Sjoerd Meijer via cfe-dev
< [hidden email]> wrote:
>
> One typo fixed inline
> I meant vectorization_style(fixed|scalable)
> ________________________________
> From: cfe-dev < [hidden email]> on behalf of David Sherwood via cfe-dev < [hidden email]>
> Sent: 24 November 2020 09:04
> To: [hidden email] < [hidden email]>
> Subject: [cfe-dev] Proposed changes to vectorize_width #pragma
|
|
On Mon, 30 Nov 2020 at 07:34, Roger Ferrer Ibáñez via
cfe-dev < [hidden email]> wrote:
> Your proposal looks sensible to me. I understand that for reasons of evolution of the pragma, you chose to give it `fixed` semantics if no explicit mark of vectorisation style appears, right?
If LoopVectorize is able to generate SVE without a pragma, it should
still be able to do so with a hint that does not force a fixed vector
width. E.g. vectorize_predicate(enable) may implicitly enable
vectorization, but does not (should not?) change the chosen vector width.
An interpretation is that loop hints restrict the choices that
LoopVectorize's profitability heuristic can make. If the choices are
(interleave_count=1,vectorize_width=1) // i.e. don't do anything
(interleave_count=1,vectorize_width=2)
(interleave_count=1,vectorize_width=4)
(interleave_count=2,vectorize_width=1)
(interleave_count=2,vectorize_width=2)
(interleave_count=2,vectorize_width=4)
then vectorize_width(4) only keeps
(interleave_count=1,vectorize_width=4)
(interleave_count=2,vectorize_width=4)
as available options. vectorize(enable), or those that enable
vectorization implicitly, remove the vectorize_width=1 options from
the list.
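This "hints restrict the candidate list" interpretation can be sketched as plain filtering. The code below is a toy model only: struct candidate, keep_width and drop_scalar are names invented for illustration and do not correspond to anything in LoopVectorize.

```c
#include <assert.h>
#include <stddef.h>

/* One option the profitability heuristic could pick (toy model only). */
struct candidate {
    unsigned interleave_count;
    unsigned vectorize_width;
};

/* Models vectorize_width(width): keep only candidates with that width.
 * Copies survivors to out (capacity >= n) and returns how many were kept. */
size_t keep_width(const struct candidate *in, size_t n, unsigned width,
                  struct candidate *out) {
    size_t kept = 0;
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; ++i)
        if (in[i].vectorize_width == width)
            out[kept++] = in[i];
    return kept;
}

/* Models a hint that enables vectorisation: drop the width-1 candidates. */
size_t drop_scalar(const struct candidate *in, size_t n,
                   struct candidate *out) {
    size_t kept = 0;
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; ++i)
        if (in[i].vectorize_width != 1)
            out[kept++] = in[i];
    return kept;
}
```

Applied to the six choices listed above, keep_width with a width of 4 leaves the two width-4 entries, and drop_scalar leaves the four entries whose width is 2 or 4.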
|
|
On Wed, 25 Nov 2020 at 10:15, Sjoerd Meijer via cfe-dev
< [hidden email]> wrote:
> I think that leaves us with 2 options that can express the same things, i.e. change or introduce:
>
> 1)
> vectorize_width(VF, fixed|scalable)
> vectorize_width(fixed|scalable)
> vectorize_width(VF)
>
> 2)
> vectorize_style(fixed|scalable)
Another proposal:
3)
vectorize_width(VF) // For fixed vector width.
vectorize_width_at_least(MinVF) // For SVE; alternatives: vectorize_dynamic, vectorize_scalable.
What are the intended semantics? Does scalable mean "width of MinVF or
more", "any multiple of MinVF", "power-of-2 multiple of MinVF", "any
width of at least MinVF allowed by ARM's SVE"?
Michael
|
|
Hi,
So by adding support for scalable vectorisation widths we are effectively
updating the pragma to mirror the existing VectorType class in LLVM,
which is defined by an ElementCount and an element Type. The
ElementCount is a tuple consisting of a minimum number of elements
and a scalable flag. The meaning of 'scalable' as used in the vectorize_width
pragma is identical to that of ElementCount. Using one of my examples
in the initial proposal, this pragma
#pragma clang loop vectorize_width(4, scalable)
would mean the same in LLVM as a VectorType like this:
<vscale x 4 x eltty>
where eltty depends upon the types used in the loop. The 'vscale' parameter
is defined by the target - it is at least 1 and does not have to be a power of 2.
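To make the (minimum elements, scalable flag) pair concrete, here is a small sketch. The struct element_count and the function runtime_elements are stand-ins invented for illustration; they are not LLVM's actual ElementCount class, and the vscale value passed in is just a test input.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the pair described above (not LLVM's ElementCount). */
struct element_count {
    unsigned min_elements; /* e.g. the 4 in vectorize_width(4, scalable) */
    bool scalable;         /* true for <vscale x N x eltty> */
};

/* Number of elements in the vector for a given runtime vscale.
 * Fixed counts ignore vscale; scalable counts are multiplied by it. */
unsigned runtime_elements(struct element_count ec, unsigned vscale) {
    assert(vscale >= 1); /* vscale is at least 1 on any target */
    return ec.scalable ? ec.min_elements * vscale : ec.min_elements;
}
```

With a vscale of 3, which is allowed because vscale need not be a power of 2, a scalable count of 4 covers 12 elements while a fixed count of 4 still covers 4.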
Kind Regards,
David.
-----Original Message-----
From: Michael Kruse < [hidden email]>
Sent: 30 November 2020 18:01
To: Sjoerd Meijer < [hidden email]>
Cc: David Sherwood < [hidden email]>; [hidden email]; Sander De Smalen < [hidden email]>
Subject: Re: [cfe-dev] Proposed changes to vectorize_width #pragma
|
|
I see the motivation, but there are different requirements for
LLVM-internals and user-facing extensions, which is why e.g. clang
does not implement a #pragma ivdep.
The definition looks fine to me, as long as it is documented without
referring to compiler internals.
Michael
In reply to the message from David Sherwood
< [hidden email]> of Tue, 1 Dec 2020 at 07:46.
>
> Hi,
>
> So by adding support for scalable vectorisation widths we are effectively
> updating the pragma to mirror the existing VectorType class in LLVM,
> which is defined by a ElementCount and an element Type. The
> ElementCount is a tuple consisting of a minimum number of elements
> and a scalable flag. The meaning of 'scalable' as used in the vectorize_width
> pragma as identical to that of ElementCount. Using one of my examples
> in the initial proposal then this pragma
>
> #pragma clang loop vectorize_width(4, scalable)
>
> would mean the same in LLVM as a VectorType like this:
>
> <vscale x 4 x eltty>
>
> where eltty depends upon the types used in the loop. The 'vscale' parameter
> is defined by the target - it is at least 1 and does not have to be a power of 2.
>
> Kind Regards,
> David.
>
> -----Original Message-----
> From: Michael Kruse < [hidden email]>
> Sent: 30 November 2020 18:01
> To: Sjoerd Meijer < [hidden email]>
> Cc: David Sherwood < [hidden email]>; [hidden email]; Sander De Smalen < [hidden email]>
> Subject: Re: [cfe-dev] Proposed changes to vectorize_width #pragma
>
> Am Mi., 25. Nov. 2020 um 10:15 Uhr schrieb Sjoerd Meijer via cfe-dev
> < [hidden email]>:
> > I think that leaves us with 2 options that can express the same things, i.e. change or introduce:
> >
> > 1)
> > vectorize_width(VF, fixed|scalable)
> > vectorize_width(fixed|scalable)
> > vectorize_width(VF)
> >
> > 2)
> > vectorize_style(fixed|scalable)
>
> Another proposal:
>
> 3)
> vectorize_width(VF) // For fixed vector width.
> vectorize_width_at_least(MinVF) // For SVE; alternatives:
> vectorize_dynamic, vectorize_scalable.
>
> What are the intended semantics? Does scalable mean "width of MinVF or
> more", "any multiple of MinVF", "power-of-2 multiple of MinVF", "any
> width of at least MinVF allowed by ARM's SVE"?
>
>
> Michael
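The ElementCount description quoted above can be made concrete with a small model. This is only a sketch: WidthHint, runtime_lanes and hint_permits are hypothetical names invented here, not LLVM's ElementCount API, and the only facts taken from the thread are that the pair is (minimum element count, scalable flag) and that vscale is target-defined, at least 1, and not necessarily a power of 2. Under that reading, "scalable" is closest to the "any multiple of MinVF" interpretation in the question above, although a real target may only offer some of those multiples.

```cpp
#include <cassert>

// Hypothetical stand-in for the (minimum element count, scalable flag) pair
// described in the thread. This is NOT LLVM's ElementCount class.
struct WidthHint {
    unsigned min_elements;  // the number written in vectorize_width(N, ...)
    bool scalable;          // true for "scalable", false for "fixed"
};

// Lanes processed per vector iteration for a given target-defined vscale.
// A fixed hint ignores vscale; a scalable hint multiplies by it.
unsigned runtime_lanes(WidthHint hint, unsigned vscale) {
    assert(vscale >= 1 && "vscale is described as being at least 1");
    return hint.scalable ? hint.min_elements * vscale : hint.min_elements;
}

// True if `lanes` is a width the hint could describe for some vscale >= 1.
// Assumption: every positive multiple is allowed, power of 2 or not.
bool hint_permits(WidthHint hint, unsigned lanes) {
    if (hint.min_elements == 0) return false;  // avoid dividing by zero
    if (!hint.scalable) return lanes == hint.min_elements;
    return lanes >= hint.min_elements && lanes % hint.min_elements == 0;
}
```

For example, a vectorize_width(4, scalable) hint modelled as WidthHint{4, true} gives 12 lanes when vscale is 3, whereas WidthHint{4, false} always gives 4.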
|
|
Hi Roger,
Thanks for the suggestion. With regard to possible use cases of a vectorize_style(any)
pragma, my thoughts are:
1. Any existing tests that currently use vectorize_width(#number) were presumably
written with fixed-width vectorisation in mind, so it makes sense in those cases
for the default to be fixed width. If the user wants to go back and change them to explicitly
use scalable vectorisation they can just write vectorize_width(#number, scalable). We
feel that specifying the numeric part of the vectorisation factor without also considering
whether the factor is fixed-length or scalable is not a realistic use case. I imagine
that best results will be obtained by letting the vectoriser choose the best pair, e.g.
vectorize_width(4, fixed) or vectorize_width(8, scalable).
2. However, if the user wants the compiler to choose the best option (fixed or scalable)
then we already have a route for that with vectorize(enable). Similarly, when compiling
at -O2 or above the compiler will choose the most profitable option.
Kind Regards,
David.
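The two routes described above can be sketched side by side. This is an illustration under stated assumptions, not a definitive usage guide: the function names and the TRY_PROPOSED_SCALABLE_HINT macro are made up for this example, and the (4, scalable) form is the syntax proposed in this thread, so it is kept behind that opt-in macro in case the compiler in use does not accept it. All pragmas are guarded by __clang__ so other compilers simply see plain loops.

```cpp
#include <cassert>
#include <cstddef>

// A bare number keeps its existing meaning: a fixed width of 4 elements.
void scale_fixed(float *dst, const float *src, std::size_t n, float k) {
#if defined(__clang__)
#pragma clang loop vectorize_width(4)
#endif
    for (std::size_t i = 0; i < n; ++i) dst[i] = k * src[i];
}

// No width given: the vectoriser is left to pick the style and factor.
void scale_auto(float *dst, const float *src, std::size_t n, float k) {
#if defined(__clang__)
#pragma clang loop vectorize(enable)
#endif
    for (std::size_t i = 0; i < n; ++i) dst[i] = k * src[i];
}

// Proposed form from this thread. TRY_PROPOSED_SCALABLE_HINT is a
// hypothetical opt-in macro for this sketch, not something clang defines.
void scale_scalable(float *dst, const float *src, std::size_t n, float k) {
#if defined(__clang__) && defined(TRY_PROPOSED_SCALABLE_HINT)
#pragma clang loop vectorize_width(4, scalable)
#endif
    for (std::size_t i = 0; i < n; ++i) dst[i] = k * src[i];
}
```

The pragmas are hints only, so all three functions compute the same result; they differ only in what the vectoriser is asked to try.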
> Your proposal looks sensible to me. I understand that for reasons of evolution of the pragma, you chose to give it `fixed` semantics if no explicit mark of vectorisation style appears, right?
> Is this something in the future we'd want to relax? This way the target could also pick the best vectorization style (borrowing Sjoerd's terminology here).
> Perhaps we could define a `vectorize_style(any)` as well. That would be the one used if no explicit `vectorize_style` is specified.
> > As a further extension I’d also like to permit vectorize_width(fixed|scalable) to
> > allow users to hint at the type of vector used without specifying the
> > vectorisation factor. Examples of this would be:
> > // Vectorize the loop with <N x eltty> for a profitable N
> > #pragma clang loop vectorize_width(fixed)
> > // Vectorize the loop with <vscale x N x eltty> for a profitable N
> > #pragma clang loop vectorize_width(scalable)
> In those cases, I imagine `vectorize_style` could be enough and we avoid having a `vectorize_width` that doesn't actually tell us the width (or the factor of the actual width, for scalables). But this falls in the "aesthetics" category, I think.
|
|
Hi David,
Thanks a lot for the clarification.
Defaulting to fixed vectorization and having a qualifier that restricts to fixed/scalable vectorization seems very reasonable to me in this context. I can see how a `vectorize_style(any)` would be unnecessary.
Kind regards,
On Wed, 9 Dec 2020 at 13:49, David Sherwood < [hidden email]> wrote:
> Hi Roger,
> Thanks for the suggestion. With regards to possible use cases of a vectorize_style(any)
> pragma my thoughts are:
> 1. Any existing tests that currently use vectorize_width(#number) were presumably
> written with fixed width vectorisation in mind. So it makes sense in those cases
> for the default to be fixed width. If the user wants to go back and fix them to explicitly
> use scalable vectorisation they can just add vectorize_width(#number, scalable). We
> feel that specifying the numeric part of the vectorisation factor without also considering
> if the factor is fixed-length or scalable is not a realistic/real world use case. I imagine
> that best results will be obtained by letting the vectoriser choose the best pair, i.e.
> vectorize_width(4, fixed) or vectorize_width(8, scalable).
> 2. However, if the user wants the compiler to choose the best option (fixed or scalable)
> then we already have a route for that with vectorize(enable). Similarly when compiling
> at -O2 or above the compiler will choose the most profitable option.
> Kind Regards,
> David.
> > Your proposal looks sensible to me. I understand that for reasons of evolution of the pragma, you chose to give it `fixed` semantics if no explicit mark of vectorisation style appears, right?
> > Is this something in the future we'd want to relax? This way the target could also pick the best vectorization style (borrowing Sjoerd's terminology here).
> > Perhaps we could define a `vectorize_style(any)` as well. That would be the one used if no explicit `vectorize_style` is specified.
> > > As a further extension I’d also like to permit vectorize_width(fixed|scalable) to
> > > allow users to hint at the type of vector used without specifying the
> > > vectorisation factor. Examples of this would be:
> > > // Vectorize the loop with <N x eltty> for a profitable N
> > > #pragma clang loop vectorize_width(fixed)
> > > // Vectorize the loop with <vscale x N x eltty> for a profitable N
> > > #pragma clang loop vectorize_width(scalable)
> > In those cases, I imagine `vectorize_style` could be enough and we avoid having a `vectorize_width` that doesn't actually tell us the width (or the factor of the actual width, for scalables). But this falls in the "aesthetics" category, I think.
-- Roger Ferrer Ibáñez
|
|