
[OT?] real-world interest of the polly optimiser

René J.V. Bertin via cfe-dev
Hi,

Apologies if this isn't the best place. I've been looking for some information (understandable by the average user) about the real-world benefits of the polly optimiser, but have found only either very broad and vague claims or specialist research papers.

What I'd like to get an idea of is what benefits Polly brings, under what conditions, at what cost, and how to enable it (are any special compiler options needed?).

Also, given I'm installing clang via MacPorts: does clang pick up Polly's presence automatically after I add the libpolly binary (i.e. port:llvm with the +polly install variant) or do I need to rebuild clang too?

Thanks,
René
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

Re: [OT?] real-world interest of the polly optimiser

Michael Kruse via cfe-dev
Hi René,

2017-05-19 10:38 GMT+02:00 René J.V. Bertin via cfe-dev
<[hidden email]>:
> Hi,
>
> Apologies if this isn't the best place.

Polly has its own mailing list here:
https://groups.google.com/forum/#!forum/polly-dev
[hidden email]


> I've been looking for some information (understandable by the average user) about the real-world benefits of the polly optimiser, but have found only either very broad and vague claims or specialist research papers.

As a researcher, I can tell you about the research we are doing. We
currently have a paper under review about optimizing gemm, where we reach
85% of the performance of a vendor-provided BLAS implementation, which is
20x the speed of the same program compiled by clang without Polly.

We know Samsung, Qualcomm and Xilinx are using Polly on a regular basis.

Polly can automatically generate OpenMP and CUDA code. The benefits
depend a lot on what you are using it for, for instance whether your
code consists of for-loops and dense arrays. In other cases you only
get increased compilation time.
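
To make "for-loops and dense arrays" concrete, here is a minimal sketch in C of such a loop nest: a naive matrix multiply with constant loop bounds and simple array subscripts. The size N, the function names and the initialisation are assumptions chosen for illustration; this is not the code from the paper mentioned above.

```c
#define N 256  /* assumed problem size, for illustration only */

double A[N][N], B[N][N], C[N][N];

/* Fill A with i + j and make B the identity, so that C should equal A. */
void init_arrays(void) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            A[i][j] = (double)(i + j);
            B[i][j] = (i == j) ? 1.0 : 0.0;
        }
}

/* Naive C = A * B: constant loop bounds, affine array subscripts. */
void gemm(void) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            C[i][j] = 0.0;
            for (int k = 0; k < N; k++)
                C[i][j] += A[i][k] * B[k][j];
        }
}
```

Calling init_arrays() and then gemm() leaves C equal to A, which gives a quick correctness check when comparing builds.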


> What I'd like to get an idea of is what benefits Polly brings, under what conditions, for what cost and how (= any special compiler options needed?).
>
> Also, given I'm installing clang via MacPorts: does clang pick up Polly's presence automatically after I add the libpolly binary (i.e. port:llvm with the +polly install variant) or do I need to rebuild clang too?

I don't have a Mac, so I don't know how it works there. I can only
explain how to do it from source:

Check out the Polly source into LLVM's tools directory then recompile
opt and clang. Add "-mllvm -polly" to the clang command line to enable
Polly.
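
One way to see what the flag does for your own code is to build the same file twice, with and without it, and time a kernel. The following is only a sketch: the file name bench.c, the stencil kernel and the helper time_kernel are hypothetical, and the use of -O3 is an assumption based on how Polly's documentation pairs it with the flag.

```c
/* Assumed build commands (bench.c is a hypothetical file name):
 *   clang -O3 bench.c -o bench_base
 *   clang -O3 -mllvm -polly bench.c -o bench_polly
 */
#include <time.h>

#define N 512  /* assumed grid size, for illustration only */

double grid[N][N], next[N][N];

/* Hypothetical kernel: one sweep of a 4-point averaging stencil. */
void stencil_sweep(void) {
    for (int i = 1; i < N - 1; i++)
        for (int j = 1; j < N - 1; j++)
            next[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j] +
                                 grid[i][j - 1] + grid[i][j + 1]);
}

/* Returns the CPU seconds spent in `reps` calls of `kernel`.
 * clock() measures CPU time, not wall-clock time, so this is only
 * meaningful for single-threaded runs. */
double time_kernel(void (*kernel)(void), int reps) {
    clock_t start = clock();
    for (int r = 0; r < reps; r++)
        kernel();
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Comparing time_kernel(stencil_sweep, reps) between the two binaries, together with the time each build takes, gives a rough picture of both the benefit and the compile-time cost.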

Since Polly is currently a research project, I would not expect a sudden
improvement in execution time. Performance-critical "real-world" code
is often already optimized manually, simply because general-purpose
compilers do not automatically optimize code aggressively enough. Many
such manual optimizations are incompatible with Polly, e.g. parts
written in assembler.


Michael

Re: [OT?] real-world interest of the polly optimiser

Martin J. O'Riordan via cfe-dev


On Mon, May 22, 2017 at 10:08 PM, Michael Kruse via cfe-dev <[hidden email]> wrote:
> Hi René,
>
> 2017-05-19 10:38 GMT+02:00 René J.V. Bertin via cfe-dev
> <[hidden email]>:
>> Hi,
>>
>> Apologies if this isn't the best place.
>
> Polly has its own mailing list here:
> https://groups.google.com/forum/#!forum/polly-dev
> [hidden email]
>
>> I've been looking for some information (understandable by the average user) about the real-world benefits of the polly optimiser, but have found only either very broad and vague claims or specialist research papers.
>
> As a researcher, I can tell you about the research we are doing. We
> currently have a paper under review about optimizing gemm, where we reach
> 85% of the performance of a vendor-provided BLAS implementation, which is
> 20x the speed of the same program compiled by clang without Polly.

Sorry, but:
1) Please don't provide numbers when comparing against a weak baseline.
2) If you do have a valid performance comparison or claim, please provide enough information to present a complete picture.

Your statement came across as saying either that LLVM's loop optimizer is so bad that Polly is required, or that the benchmark somehow hits a corner case which is a sweet spot for Polly.
------------
Also, I'd kindly ask that if you do have such specific examples of clang doing a rather poor job on performance, you file a bug report and include as much detail as you have time for. It's unlikely that Polly is doing anything that a traditional loop optimizer can't do, or at least attempt.


