Sharing MemoryBuffers between front ends and LLVM

classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Sharing MemoryBuffers between front ends and LLVM

Robinson, Paul via cfe-dev
Hi all,

I'm implementing interleaved source in assembly output. Early reviews raised the concern that the current implementation will be opening files (using a llvm::MemoryBuffer) that are likely to be in the memory of the front end (commonly clang but I think we want this to be front end agnostic).

I'm now exploring ideas to avoid reopening files and let LLVM reuse the files the FE had to open.

I am assuming that the front end will use llvm::MemoryBuffer (e.g.: clang does indirectly through clang::SourceManager).

So for buffers related to named files (including stdin, which does not have name and is handled in a special way) we could have in the LLVM context a MemoryBufferRegistry. The idea is to add new creators of MemoryBuffer (the ones that work on named files and stdin) that can be passed a reference to that llvm::MemoryBufferRegistry. MemoryBuffer objects would register/deregister themselves at creation/destruction. This registry can then be used as a cache of already opened files from which retrieve a reference to the MemoryBuffer itself using the file path. These new interfaces would be opt-in for all users of MemoryBuffer.

Back to my case,  the new AsmPrinterHandler could now use the MemoryBufferRegistry of the LLVM context. If there is none or the memory buffer associated to a file path has been already deregistered (or was never registered e.g. because we are using a .ll file directly), it would open the file as usual, otherwise it would reuse the registered MemoryBuffer.

I see a few downsides of this approach, though.

It overlaps a bit with the existing SourceManager in clang which already does some caching work through the clang::ContentCache class. At first the cache seems hard to abstract away as it uses clang::FileEntry and looks pretty tailored for clang needs.

Also, assuming that the front end is using a MemoryBuffer may be a too strong requirement, in particular for FE's that are mostly unaware of LLVM except for a final LLVM codegen pass. This would mean that the files would be reopened even if they are already in the memory of the FE.

Finally the file path may not even be a good identifier to reuse MemoryBuffer objects.

Thoughts?

Thank you,
Roger

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [llvm-dev] Sharing MemoryBuffers between front ends and LLVM

Robinson, Paul via cfe-dev
On 03/16/2017 12:29 PM, Roger Ferrer Ibanez via llvm-dev wrote:
> Hi all,
>
> I'm implementing interleaved source in assembly output. Early reviews raised the concern that the current implementation will be opening files (using a llvm::MemoryBuffer) that are likely to be in the memory of the front end (commonly clang but I think we want this to be front end agnostic).
>
> I'm now exploring ideas to avoid reopening files and let LLVM reuse the files the FE had to open.

I think this is a good idea.

>
> I am assuming that the front end will use llvm::MemoryBuffer (e.g.: clang does indirectly through clang::SourceManager).
>
> So for buffers related to named files (including stdin, which does not have name and is handled in a special way) we could have in the LLVM context a MemoryBufferRegistry. The idea is to add new creators of MemoryBuffer (the ones that work on named files and stdin) that can be passed a reference to that llvm::MemoryBufferRegistry. MemoryBuffer objects would register/deregister themselves at creation/destruction. This registry can then be used as a cache of already opened files from which retrieve a reference to the MemoryBuffer itself using the file path. These new interfaces would be opt-in for all users of MemoryBuffer.
>
> Back to my case,  the new AsmPrinterHandler could now use the MemoryBufferRegistry of the LLVM context. If there is none or the memory buffer associated to a file path has been already deregistered (or was never registered e.g. because we are using a .ll file directly), it would open the file as usual, otherwise it would reuse the registered MemoryBuffer.
>
> I see a few downsides of this approach, though.
>
> It overlaps a bit with the existing SourceManager in clang which already does some caching work through the clang::ContentCache class. At first the cache seems hard to abstract away as it uses clang::FileEntry and looks pretty tailored for clang needs.

I don't see why this is hard to abstract? As you describe it, it sounds
like we just need to define some callback that can be implemented by the
FE. For example, we could add the following to LLVMContext:

typedef ErrorOr<MemoryBufferRef> (*FileBufferCallbackTy)(StringRef
FileName, void *Context);

void setFileBufferCallback(FileBufferCallbackTy Callback, void *Context
= nullptr);
FileBufferCallbackTy getFileBufferCallback() const;

Using this, we can create an additional abstraction in LLVM that will
open a given file and return a memory buffer, asking the FE first if
there's a callback, or otherwise just opening the file directly.

>
> Also, assuming that the front end is using a MemoryBuffer may be a too strong requirement, in particular for FE's that are mostly unaware of LLVM except for a final LLVM codegen pass. This would mean that the files would be reopened even if they are already in the memory of the FE.

Given that you can construct a MemoryBuffer from a StringRef, and a
StringRef can be constructed around an arbitrary buffer, if we wanted to
create a wrapper around this interface usable by a less-LLVM-aware FE,
that seems possible. I don't know how important this is right now.

>
> Finally the file path may not even be a good identifier to reuse MemoryBuffer objects.

It might not be, but I think that we can make clear that the keys used
are intended to correspond to the file names from the debug info (or
module name).

Thanks again,
Hal

>
> Thoughts?
>
> Thank you,
> Roger
>
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [llvm-dev] Sharing MemoryBuffers between front ends and LLVM

Robinson, Paul via cfe-dev
In reply to this post by Robinson, Paul via cfe-dev

> On Mar 16, 2017, at 10:29 AM, Roger Ferrer Ibanez via llvm-dev <[hidden email]> wrote:
>
> Hi all,
>
> I'm implementing interleaved source in assembly output. Early reviews raised the concern

Is there a patch up for review?
I’m wondering how is the frontend enabling this interleaved output mode?


Mehdi


> that the current implementation will be opening files (using a llvm::MemoryBuffer) that are likely to be in the memory of the front end (commonly clang but I think we want this to be front end agnostic).
>
> I'm now exploring ideas to avoid reopening files and let LLVM reuse the files the FE had to open.
>
> I am assuming that the front end will use llvm::MemoryBuffer (e.g.: clang does indirectly through clang::SourceManager).
>
> So for buffers related to named files (including stdin, which does not have name and is handled in a special way) we could have in the LLVM context a MemoryBufferRegistry. The idea is to add new creators of MemoryBuffer (the ones that work on named files and stdin) that can be passed a reference to that llvm::MemoryBufferRegistry. MemoryBuffer objects would register/deregister themselves at creation/destruction. This registry can then be used as a cache of already opened files from which retrieve a reference to the MemoryBuffer itself using the file path. These new interfaces would be opt-in for all users of MemoryBuffer.
>
> Back to my case,  the new AsmPrinterHandler could now use the MemoryBufferRegistry of the LLVM context. If there is none or the memory buffer associated to a file path has been already deregistered (or was never registered e.g. because we are using a .ll file directly), it would open the file as usual, otherwise it would reuse the registered MemoryBuffer.
>
> I see a few downsides of this approach, though.
>
> It overlaps a bit with the existing SourceManager in clang which already does some caching work through the clang::ContentCache class. At first the cache seems hard to abstract away as it uses clang::FileEntry and looks pretty tailored for clang needs.
>
> Also, assuming that the front end is using a MemoryBuffer may be a too strong requirement, in particular for FE's that are mostly unaware of LLVM except for a final LLVM codegen pass. This would mean that the files would be reopened even if they are already in the memory of the FE.
>
> Finally the file path may not even be a good identifier to reuse MemoryBuffer objects.
>
> Thoughts?
>
> Thank you,
> Roger
>
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [llvm-dev] Sharing MemoryBuffers between front ends and LLVM

Robinson, Paul via cfe-dev

On Mar 16, 2017, at 4:22 PM, Mehdi Amini via llvm-dev <[hidden email]> wrote:


On Mar 16, 2017, at 10:29 AM, Roger Ferrer Ibanez via llvm-dev <[hidden email]> wrote:

Hi all,

I'm implementing interleaved source in assembly output. Early reviews raised the concern

Is there a patch up for review?
I’m wondering how is the frontend enabling this interleaved output mode?

that the current implementation will be opening files (using a llvm::MemoryBuffer) that are likely to be in the memory of the front end (commonly clang but I think we want this to be front end agnostic). 

If we mmap files, is it a problem if a process mmap two times the same file? I was assuming this to be “free”. (I’d like to make sure there is a real issue to solve here).

— 
Mehdi




I'm now exploring ideas to avoid reopening files and let LLVM reuse the files the FE had to open.

I am assuming that the front end will use llvm::MemoryBuffer (e.g.: clang does indirectly through clang::SourceManager).

So for buffers related to named files (including stdin, which does not have name and is handled in a special way) we could have in the LLVM context a MemoryBufferRegistry. The idea is to add new creators of MemoryBuffer (the ones that work on named files and stdin) that can be passed a reference to that llvm::MemoryBufferRegistry. MemoryBuffer objects would register/deregister themselves at creation/destruction. This registry can then be used as a cache of already opened files from which retrieve a reference to the MemoryBuffer itself using the file path. These new interfaces would be opt-in for all users of MemoryBuffer.

Back to my case,  the new AsmPrinterHandler could now use the MemoryBufferRegistry of the LLVM context. If there is none or the memory buffer associated to a file path has been already deregistered (or was never registered e.g. because we are using a .ll file directly), it would open the file as usual, otherwise it would reuse the registered MemoryBuffer.

I see a few downsides of this approach, though.

It overlaps a bit with the existing SourceManager in clang which already does some caching work through the clang::ContentCache class. At first the cache seems hard to abstract away as it uses clang::FileEntry and looks pretty tailored for clang needs.

Also, assuming that the front end is using a MemoryBuffer may be a too strong requirement, in particular for FE's that are mostly unaware of LLVM except for a final LLVM codegen pass. This would mean that the files would be reopened even if they are already in the memory of the FE.

Finally the file path may not even be a good identifier to reuse MemoryBuffer objects.

Thoughts?

Thank you,
Roger

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [llvm-dev] Sharing MemoryBuffers between front ends and LLVM

Robinson, Paul via cfe-dev
In reply to this post by Robinson, Paul via cfe-dev

On 03/16/2017 06:22 PM, Mehdi Amini via llvm-dev wrote:
>> On Mar 16, 2017, at 10:29 AM, Roger Ferrer Ibanez via llvm-dev <[hidden email]> wrote:
>>
>> Hi all,
>>
>> I'm implementing interleaved source in assembly output. Early reviews raised the concern
> Is there a patch up for review?

https://reviews.llvm.org/D30898
https://reviews.llvm.org/D30897

> I’m wondering how is the frontend enabling this interleaved output mode?

I'm not sure this has been really discussed in the review yet (although
there certainly is a proposed mechanism in the patches).

  -Hal

>
> —
> Mehdi
>
>
>> that the current implementation will be opening files (using a llvm::MemoryBuffer) that are likely to be in the memory of the front end (commonly clang but I think we want this to be front end agnostic).
>>
>> I'm now exploring ideas to avoid reopening files and let LLVM reuse the files the FE had to open.
>>
>> I am assuming that the front end will use llvm::MemoryBuffer (e.g.: clang does indirectly through clang::SourceManager).
>>
>> So for buffers related to named files (including stdin, which does not have name and is handled in a special way) we could have in the LLVM context a MemoryBufferRegistry. The idea is to add new creators of MemoryBuffer (the ones that work on named files and stdin) that can be passed a reference to that llvm::MemoryBufferRegistry. MemoryBuffer objects would register/deregister themselves at creation/destruction. This registry can then be used as a cache of already opened files from which retrieve a reference to the MemoryBuffer itself using the file path. These new interfaces would be opt-in for all users of MemoryBuffer.
>>
>> Back to my case,  the new AsmPrinterHandler could now use the MemoryBufferRegistry of the LLVM context. If there is none or the memory buffer associated to a file path has been already deregistered (or was never registered e.g. because we are using a .ll file directly), it would open the file as usual, otherwise it would reuse the registered MemoryBuffer.
>>
>> I see a few downsides of this approach, though.
>>
>> It overlaps a bit with the existing SourceManager in clang which already does some caching work through the clang::ContentCache class. At first the cache seems hard to abstract away as it uses clang::FileEntry and looks pretty tailored for clang needs.
>>
>> Also, assuming that the front end is using a MemoryBuffer may be a too strong requirement, in particular for FE's that are mostly unaware of LLVM except for a final LLVM codegen pass. This would mean that the files would be reopened even if they are already in the memory of the FE.
>>
>> Finally the file path may not even be a good identifier to reuse MemoryBuffer objects.
>>
>> Thoughts?
>>
>> Thank you,
>> Roger
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> [hidden email]
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [llvm-dev] Sharing MemoryBuffers between front ends and LLVM

Robinson, Paul via cfe-dev
In reply to this post by Robinson, Paul via cfe-dev


On 03/16/2017 06:24 PM, Mehdi Amini via cfe-dev wrote:

On Mar 16, 2017, at 4:22 PM, Mehdi Amini via llvm-dev <[hidden email]> wrote:


On Mar 16, 2017, at 10:29 AM, Roger Ferrer Ibanez via llvm-dev <[hidden email]> wrote:

Hi all,

I'm implementing interleaved source in assembly output. Early reviews raised the concern

Is there a patch up for review?
I’m wondering how is the frontend enabling this interleaved output mode?

that the current implementation will be opening files (using a llvm::MemoryBuffer) that are likely to be in the memory of the front end (commonly clang but I think we want this to be front end agnostic). 

If we mmap files, is it a problem if a process mmap two times the same file? I was assuming this to be “free”. (I’d like to make sure there is a real issue to solve here).

I agree, the cost of that should be low. I believe the concern raised on the review revolved around files that don't really exist on the file system, such as code piped in on STDIN (or things otherwise provided through the VFS layer).

 -Hal


— 
Mehdi




I'm now exploring ideas to avoid reopening files and let LLVM reuse the files the FE had to open.

I am assuming that the front end will use llvm::MemoryBuffer (e.g.: clang does indirectly through clang::SourceManager).

So for buffers related to named files (including stdin, which does not have name and is handled in a special way) we could have in the LLVM context a MemoryBufferRegistry. The idea is to add new creators of MemoryBuffer (the ones that work on named files and stdin) that can be passed a reference to that llvm::MemoryBufferRegistry. MemoryBuffer objects would register/deregister themselves at creation/destruction. This registry can then be used as a cache of already opened files from which retrieve a reference to the MemoryBuffer itself using the file path. These new interfaces would be opt-in for all users of MemoryBuffer.

Back to my case,  the new AsmPrinterHandler could now use the MemoryBufferRegistry of the LLVM context. If there is none or the memory buffer associated to a file path has been already deregistered (or was never registered e.g. because we are using a .ll file directly), it would open the file as usual, otherwise it would reuse the registered MemoryBuffer.

I see a few downsides of this approach, though.

It overlaps a bit with the existing SourceManager in clang which already does some caching work through the clang::ContentCache class. At first the cache seems hard to abstract away as it uses clang::FileEntry and looks pretty tailored for clang needs.

Also, assuming that the front end is using a MemoryBuffer may be a too strong requirement, in particular for FE's that are mostly unaware of LLVM except for a final LLVM codegen pass. This would mean that the files would be reopened even if they are already in the memory of the FE.

Finally the file path may not even be a good identifier to reuse MemoryBuffer objects.

Thoughts?

Thank you,
Roger

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev



_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [llvm-dev] Sharing MemoryBuffers between front ends and LLVM

Robinson, Paul via cfe-dev
In reply to this post by Robinson, Paul via cfe-dev

On Mar 16, 2017, at 5:27 PM, Hal Finkel <[hidden email]> wrote:


On 03/16/2017 06:22 PM, Mehdi Amini via llvm-dev wrote:
On Mar 16, 2017, at 10:29 AM, Roger Ferrer Ibanez via llvm-dev <[hidden email]> wrote:

Hi all,

I'm implementing interleaved source in assembly output. Early reviews raised the concern
Is there a patch up for review?

https://reviews.llvm.org/D30898
https://reviews.llvm.org/D30897

I’m wondering how is the frontend enabling this interleaved output mode?

I'm not sure this has been really discussed in the review yet (although there certainly is a proposed mechanism in the patches).

I wonder if the most straightforward way isn’t to leave the memorybuffer initialization up to the client, at the same place where the setting is (currently MCTargetOptions, not sure if it is the right place).

To enable the ASMSource option, client would have to call something like  `MCTargetOptions::setAsmInterleavedSourceCode( Mode, Buffer);

— 
Mehdi


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [llvm-dev] Sharing MemoryBuffers between front ends and LLVM

Robinson, Paul via cfe-dev


On 03/16/2017 07:36 PM, Mehdi Amini wrote:

On Mar 16, 2017, at 5:27 PM, Hal Finkel <[hidden email]> wrote:


On 03/16/2017 06:22 PM, Mehdi Amini via llvm-dev wrote:
On Mar 16, 2017, at 10:29 AM, Roger Ferrer Ibanez via llvm-dev <[hidden email]> wrote:

Hi all,

I'm implementing interleaved source in assembly output. Early reviews raised the concern
Is there a patch up for review?

https://reviews.llvm.org/D30898
https://reviews.llvm.org/D30897

I’m wondering how is the frontend enabling this interleaved output mode?

I'm not sure this has been really discussed in the review yet (although there certainly is a proposed mechanism in the patches).

I wonder if the most straightforward way isn’t to leave the memorybuffer initialization up to the client, at the same place where the setting is (currently MCTargetOptions, not sure if it is the right place).

To enable the ASMSource option, client would have to call something like  `MCTargetOptions::setAsmInterleavedSourceCode( Mode, Buffer);

I think you'd still need a callback, although you could certainly set it this way too. The problem is that you potentially need buffers corresponding to all files included by the main source file, not just the main source file itself. You wouldn't want to open them all ahead of time either because, besides being wasteful, you might run out of file handles).

 -Hal


— 
Mehdi


-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Loading...