[RFC] Open sourcing and contributing TAPI back to the LLVM community

Yvan Roux via cfe-dev
Hi @ll,

Over the past years I have been looking into how to reduce the size of the SDK that ships with Xcode and how to improve build times for the overall OS inside Apple. The result is a tool called TAPI, which is used at Apple for all things related to text-based dynamic library files (.tbd).

What are text-based dynamic library files?
Text-based dynamic library files (TBDs) are a textual representation of the information in a dynamic library / shared library that is required by the static linker - basically a symbol list of the exported symbols.

Apple’s SDKs originally used Mach-O Dynamic Library Stubs. Mach-O Dynamic Library Stubs are dynamic library files, but with all the text and data stripped out. TBD files were introduced to replace Mach-O Dynamic Library Stub files in the SDK to further reduce its overall size.

Over time the TAPI tool has grown and is used now in a variety of ways.

Dynamic Library Stubbing:
As mentioned above, TAPI is used to read the content of a dynamic library / shared library and generate a textual representation that can be used by the static linker. The current implementation reads Mach-O files, but it could be extended to also provide the same functionality for other object file formats.

Framework / Dynamic Library Verification:
The symbols that are exported from a dynamic library should ideally match, or at least contain, all the API that is specified in the associated header files. TAPI performs this verification by parsing the header files with Clang and comparing the findings to the symbols exported from the library.
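
The check described above amounts to a set comparison. Here is a minimal Python sketch of that idea, assuming the header-declared API and the exported symbols have already been extracted as sets of symbol names. The function name `verify_exports` and the sample symbols are hypothetical, and this is only an illustration of the comparison, not how TAPI itself is implemented:

```python
def verify_exports(header_api, exported):
    """Compare API names declared in headers against a library's exports.

    Returns (missing, undeclared):
      missing    - declared in the headers but not exported by the library
      undeclared - exported by the library but absent from the headers
    """
    header_api = set(header_api)
    exported = set(exported)
    missing = header_api - exported
    undeclared = exported - header_api
    return missing, undeclared


# Hypothetical symbol names, for illustration only.
headers = {"_foo_create", "_foo_destroy", "_foo_version"}
library = {"_foo_create", "_foo_destroy", "_foo_internal_helper"}

missing, undeclared = verify_exports(headers, library)
print("missing from library:", sorted(missing))
print("not declared in headers:", sorted(undeclared))
```

An empty `missing` set corresponds to the "at least contain" case; both sets being empty corresponds to an exact match.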

InstallAPI:
InstallAPI is a new build phase that generates the TBD file from header files only. This allows a dependency of the library to build concurrently even before the library has been built itself. This can be used to increase parallelism in the build of larger projects or operating systems.

Misc:
- display and operate on TBD files
- automatically generate API tests from header files
- libtapi, which is used by the linker (ld64) to parse the TBD files


The functionality of the tool is currently limited to Mach-O object files, but that is not a technical limitation. In making the tool open source I hope others will be able to take advantage of it too and extend its functionality to other object file formats.


I initially developed the project as a Clang project, but that was mostly for practical reasons (out-of-tree development, separate repo, etc). For the curious, I pushed the repo to GitHub (https://github.com/ributzka/tapi).

I imagine, for example, that the reading/writing of TBD files is something that would fit better into the LLVM sources, which makes it available to other libraries and tools (e.g. LLVMObject, llvm-nm, lld, ...).

I created a small patch that integrates it with llvm-nm and LLVMObject. This patch is not complete and I will split it up into smaller patches for review. I am providing it as a reference to get the discussion started.

Please let me know what you think and bikeshed away :)

Thanks

Cheers,
Juergen






_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

Attachment: tapi-llvm-nm.patch.tar.bz2 (22K)

Re: [llvm-dev] [RFC] Open sourcing and contributing TAPI back to the LLVM community

Yvan Roux via cfe-dev
On Thu, Sep 7, 2017 at 5:01 PM, Juergen Ributzka via llvm-dev
<[hidden email]> wrote:

> Misc:
> - display and operate on TBD files
> - automatically generate API tests from header files
> - libtapi, which is used by the linker (ld64) to parse the TBD files
>

I'm interested in whether you plan to have this integrated in lld as well.
As far as I understand, this is going to be the de-facto way of
shipping for Mach-O binaries (at least, the ones released by Apple).
Please correct me if I'm wrong.
I tried to self-host lld on El Capitan and it fails because lld
doesn't really know about TBD files.
This, unfortunately, makes the linker not really usable for modern Mac
OS releases.


--
Davide

"There are no solved problems; there are only problems that are more
or less solved" -- Henri Poincare

Re: [llvm-dev] [RFC] Open sourcing and contributing TAPI back to the LLVM community

Yvan Roux via cfe-dev
On Thu, Sep 7, 2017 at 6:52 PM, Davide Italiano <[hidden email]> wrote:

> I'm interested in whether you plan to have this integrated in lld as well.
> As far as I understand, this is going to be the de-facto way of
> shipping for Mach-O binaries (at least, the ones released by Apple).
> Please correct me if I'm wrong.

Yes, this is already the de-facto way of shipping Mach-O files in the SDK. That means self-hosting LLD against the SDK is currently not possible. The system itself is obviously still shipping full Mach-O files in /System, so you should still be able to self-host against those files.

My plan is to integrate support for TBD files into all LLVM tools where it makes sense (including LLD). This is why I wanted to start to put the basic support into LLVM first, so it can be used by other tools and libraries.
 
> I tried to self-host lld on El Capitan and it fails because lld
> doesn't really know about TBD files.
> This, unfortunately, makes the linker not really usable for modern Mac
> OS releases.



Re: [RFC] Open sourcing and contributing TAPI back to the LLVM community

Yvan Roux via cfe-dev
Hi Paul,

My experience has shown the same when it comes to header files, and I am not claiming this is going to work out of the box for all library projects. It usually requires some cleanup first, and that is why the tool comes with a verification mode to make sure the headers are the truth. Also keep in mind that you don't have to parse all the headers, but only the small set that gets installed as part of the library API.

The tool does not read the linker script / export file, because they are not necessarily the truth either and may have wildcards. In my view they are just one way of managing exported symbols. Another way, which I personally prefer, is to build with visibility hidden and annotate only the API with visibility default. That makes the headers the single source of what is API.

Cheers,
Juergen



On Fri, Sep 8, 2017 at 9:29 AM, Robinson, Paul <[hidden email]> wrote:
> InstallAPI:
> InstallAPI is a new build phase that generates the TBD file from header
> files only. This allows a dependency of the library to build concurrently
> even before the library has been built itself. This can be used to
> increase parallelism in the build of larger projects or operating systems.

My experience is that headers don't necessarily form the best source of
truth about the API exported from a library.  If you follow the Windows
model of marking exported APIs explicitly (declspec(dllexport) or something)
then okay, but that's a Windows extension and not common in other systems.
Linker scripts seem to be a more popular method; does the tool read linker
scripts to form the content of a TBD file?
Otherwise I'm not seeing a generic improvement in build parallelism.
--paulr





Re: [RFC] Open sourcing and contributing TAPI back to the LLVM community

Yvan Roux via cfe-dev
I think it makes sense to have support for this input format in the tools.  Since the macOS SDK is slowly switching to this, having the tools work out of the box is a nice feature.  It is rather convenient having a single toolset be sufficient to provide infrastructure for all the targets.

Saleem

--
Saleem Abdulrasool
compnerd (at) compnerd (dot) org


Re: [RFC] Open sourcing and contributing TAPI back to the LLVM community

Yvan Roux via cfe-dev
Hi Juergen,

At a minimum, I think adding the support to libobject etc., so that the various LLVM tools can read or even write files from/for OSX, should be fairly non-controversial. So how about you go ahead and do that first (I'll happily review if you'd like), and then we can go from there to do anything else with TAPI and LLVM?

Sound good?

-eric


Re: [RFC] Open sourcing and contributing TAPI back to the LLVM community

Yvan Roux via cfe-dev
I'm not really clear on the actual benefits of the TBD file, and why Apple migrated to them in the first place. Shouldn't a dynamic library containing only the relevant parts (e.g. the dynamic symbol table) be roughly comparable in size? And, much simpler to support? I assume that's effectively what "Mach-O Dynamic Library Stubs" actually _were_, before the introduction of TBD files, so presumably there were good reasons for switching?

If anyone wants to do something similar for another platform (that is to say, ELF; COFF already has import libraries), I'd suggest that the sensible way to do so would be to generate actual shared object files which contain only the appropriate interface definitions.

Regardless of any of that, given that TBD files _are_ an integral part of the apple platform, supporting them is certainly a necessity in order to have a working apple linker. So, if making LLD work for Apple/MachO is the justification for adding TBD support to LLVM, that seems self-evidently a reasonable thing to do. On the other hand, it looks like the LLD mach-o code is unmaintained and nobody seems to be much interested in it. And having code for reading TBD files in LLVM seems not terribly interesting, unless it is as part of a project to make the LLD MachO linker actually functional and supported.



Re: [RFC] Open sourcing and contributing TAPI back to the LLVM community

Yvan Roux via cfe-dev

> On Apr 9, 2018, at 3:23 PM, James Y Knight via cfe-dev <[hidden email]> wrote:
>
> I'm not really clear on the actual benefits of the TBD file, and why Apple migrated to them in the first place. Shouldn't a dynamic library containing only the relevant parts (e.g. the dynamic symbol table) be roughly comparable in size? And, much simpler to support? I assume that's effectively what "Mach-O Dynamic Library Stubs" actually _were_, before the introduction of TBD files, so presumably there were good reasons for switching?

File size is one reason. A TBD file is typically one third the size of the corresponding stub library for a single architecture. Multiple architectures dramatically increase the TBD advantage: a new architecture in TBD may cost as little as a few bytes if all architectures export the same functions, but each new architecture in a stub library requires duplicating its entire contents.
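
One way to picture the multi-architecture point: if a text format lists each symbol once and tags it with the set of architectures that export it, then adding an architecture with identical exports only lengthens the architecture list. The Python sketch below illustrates that grouping under this assumption. The function name `group_by_arch_set` and the sample data are hypothetical, and the output is a simplified model, not the actual TBD schema:

```python
from collections import defaultdict


def group_by_arch_set(exports_per_arch):
    """Group symbols by the exact set of architectures that export them.

    exports_per_arch maps an architecture name to the set of symbols it exports.
    Returns a dict mapping a sorted tuple of architectures to a sorted symbol list.
    """
    # First pass: for every symbol, collect the architectures that export it.
    archs_of = defaultdict(set)
    for arch, symbols in exports_per_arch.items():
        for sym in symbols:
            archs_of[sym].add(arch)

    # Second pass: symbols sharing the same architecture set form one group.
    groups = defaultdict(list)
    for sym, archs in archs_of.items():
        groups[tuple(sorted(archs))].append(sym)
    return {archs: sorted(syms) for archs, syms in groups.items()}


# Hypothetical exports: both architectures export the same functions.
one_arch = {"x86_64": {"_foo", "_bar"}}
two_archs = {"x86_64": {"_foo", "_bar"}, "arm64": {"_foo", "_bar"}}

print(group_by_arch_set(one_arch))
print(group_by_arch_set(two_archs))
```

In this model the two-architecture case still yields a single group with the same symbol list; only the key grows by one architecture name, whereas a binary stub would carry a second full copy of its contents.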


--
Greg Parker     [hidden email]     Runtime Wrangler



Re: [RFC] Open sourcing and contributing TAPI back to the LLVM community

Yvan Roux via cfe-dev
> Regardless of any of that, given that TBD files _are_ an integral
> part of the apple platform, supporting them is certainly a necessity in
> order to have a working apple linker. So, if making LLD work for
> Apple/MachO is the justification for adding TBD support to LLVM, that
> seems self-evidently a reasonable thing to do. On the other hand, it
> looks like the LLD mach-o code is unmaintained and nobody seems to be
> much interested in it. And having code for reading TBD files in
> LLVM seems not terribly interesting, unless it is as part of a project
> to make the LLD MachO linker actually functional and supported.

Yes. I hope this can be reason enough. Hobbyists could push for LLD
support for Mach-O besides Apple, and if LLD is to displace other
linkers this is a necessary component as you say. Better to upstream now
before the code diverges than more work later? Conversely if nothing
happens, I doubt libtapi would be a greater drag on the codebase than
the MachO LLD code, so whatever cost/benefit analysis exists for keeping
that around could also apply to this.

> On 04/09/2018 06:39 PM, Greg Parker via cfe-dev wrote:
>
>> On Apr 9, 2018, at 3:23 PM, James Y Knight via cfe-dev
>> <[hidden email]> wrote:
>>
>> I'm not really clear on the actual benefits of the TBD file, and why
>> Apple migrated to them in the first place. Shouldn't a dynamic library
>> containing only the relevant parts (e.g. the dynamic symbol table) be
>> roughly comparable in size?...
>
> File size is one reason...

For the record, other small benefits are

  - The inclusion of the path to the actual library, which as far as I
know is not something that can be done with a stub library. This allows
easy absolute or relative (with R(UN)PATH) linking. Comparatively,
passing the right -rpath and -rpath-link is manual and (in my opinion)
harder to understand and cumbersome, and also not a solution for
absolute linking. I work with Nixpkgs of NixOS, where absolute path
linking is frequently an objective as part of a general principle of
avoiding indirection.

  - YAML. The option for line-oriented structure allows for easy diffing
with conventional line-based diffing tools, which is useful for
debugging compatibility issues. (e.g. Why did my new version remove
symbols? Why did my security update change anything at all?). Of course
one can just objdump and diff, but that wouldn't happen automatically
with version control, for example.
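
To illustrate the diffing point, here is a small Python sketch using the standard `difflib` module on two line-oriented symbol lists. The file labels and symbol names are hypothetical, and plain symbol-per-line lists stand in for real TBD contents:

```python
import difflib

# Hypothetical exported-symbol lists from two versions of a library.
old_symbols = ["_foo_create", "_foo_destroy", "_foo_version"]
new_symbols = ["_foo_create", "_foo_destroy"]

# A unified diff, as a line-based tool or version control system would show it.
diff = list(difflib.unified_diff(
    old_symbols, new_symbols,
    fromfile="libfoo.tbd (old)", tofile="libfoo.tbd (new)",
    lineterm="",
))

# Lines starting with a single "-" are removals; "---" is the file header.
removed = [line[1:] for line in diff
           if line.startswith("-") and not line.startswith("---")]

print("\n".join(diff))
print("removed symbols:", removed)
```

Because each symbol sits on its own line, a removed export shows up as a single removed line in the diff.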

John

Re: [llvm-dev] [RFC] Open sourcing and contributing TAPI back to the LLVM community

Yvan Roux via cfe-dev


On Mon, Apr 9, 2018 at 10:11 PM, John Ericson via llvm-dev <[hidden email]> wrote:
>> Regardless of any of that, given that TBD files _are_ an integral part of the apple platform, supporting them is certainly a necessity in order to have a working apple linker. So, if making LLD work for Apple/MachO is the justification for adding TBD support to LLVM, that seems self-evidently a reasonable thing to do. On the other hand, it looks like the LLD mach-o code is unmaintained and nobody seems to be much interested in it. And having code for reading TBD files in LLVM seems not terribly interesting, unless it is as part of a project to make the LLD MachO linker actually functional and supported.
>
> Yes. I hope this can be reason enough. Hobbyists could push for LLD support for Mach-O besides Apple, and if LLD is to displace other linkers this is a necessary component as you say. Better to upstream now before the code diverges than more work later? Conversely if nothing happens, I doubt libtapi would be a greater drag on the codebase than the MachO LLD code, so whatever cost/benefit analysis exists for keeping that around could also apply to this.


Speaking for the Zig project here, our goal is to support cross-compilation for any target, on any target, without requiring installation of any target-specific SDK. So, for example, these use cases:
 * on linux, compile & link a binary targeting macos
 * on windows, compile & link a binary targeting macos

This works today, although it depends on a patch to LLD that fixes the Mach-O linker but is not of high enough quality to upstream.

So we have a vested interest in improving the Mach-O linker, and in fact a Zig community member has fixed at least one bug in Mach-O LLD: reviews.llvm.org/D35387

I don't fully understand how TBD or TAPI works, but I hope that it results in improvements to the Mach-O linker.

 


Re: [llvm-dev] [RFC] Open sourcing and contributing TAPI back to the LLVM community

Yvan Roux via cfe-dev
Benefits of TBD:
1) It's human readable and diffs on TBDs correspond to changes in the ABI. Diffs can be automatically added to review processes to ensure that changes to the ABI are reviewed. The TBDs also document your precise ABI.
2) The size is smaller which means they can be shipped in an SDK instead of binaries to reduce the size of an SDK
3) Stubs are producible from TBDs (or should be) which means stubs for linking can be produced even if we don't directly support them in LLD. This lets you ship the smaller TBD files in place of larger binaries and still link things without direct linker support (assuming you already ship a toolchain with your SDK or expect your users to have this tool)

Since stubs are producible from TBDs I don't really see a downside. I think we need both, I was going to propose a yaml based representation for ELF for the above reasons anyhow.
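To make benefit 1 concrete, here is a minimal sketch of an ABI review check built on plain symbol lists. It does not parse real TBD files; the function name `abi_diff` and the symbol names are made up for illustration, and the only assumption is that each library version can be reduced to a list of exported symbol names.

```python
def abi_diff(old_symbols, new_symbols):
    """Return (added, removed) exported symbols between two library versions."""
    old, new = set(old_symbols), set(new_symbols)
    return sorted(new - old), sorted(old - new)


# Hypothetical export lists for two versions of the same library.
v1 = ["_foo_open", "_foo_read", "_foo_close"]
v2 = ["_foo_open", "_foo_read", "_foo_close", "_foo_seek"]

added, removed = abi_diff(v1, v2)
print("added:", added)
print("removed:", removed)
```

A review hook could flag any change where `removed` is non-empty, since dropping an exported symbol is the kind of ABI change that should not slip through unreviewed.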

On Tue, Apr 10, 2018 at 1:14 PM Andrew Kelley via llvm-dev <[hidden email]> wrote:
Speaking for the Zig project here, our goal is to support cross-compilation for any target, on any target, without requiring installation of any target-specific SDK. So, for example, these use cases:
 * on linux, compile & link a binary targeting macos
 * on windows, compile & link a binary targeting macos

This works today, although it depends on a patch to LLD to fix the MACH-O linker that is not high enough quality to upstream.

So we have a vested interest in improving the MACH-O linker, and in fact a Zig community member has fixed at least one bug in MACH-O LLD: reviews.llvm.org/D35387

I don't fully understand how TBD or TAPI works, but I hope that it results in improvements to the MACH-O linker.

 
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev


Re: [llvm-dev] [RFC] Open sourcing and contributing TAPI back to the LLVM community

Yvan Roux via cfe-dev

Seems like there are a few of us interested in this, then. I'm new around here and don't really know how decisions are made, so what's next? Just open a diff with the entire library?

John

On 04/10/2018 05:33 PM, Jake Ehrlich wrote:
Benefits of TBD:
1) It's human readable and diffs on TBDs correspond to changes in the ABI. Diffs can be automatically added to review processes to ensure that changes to the ABI are reviewed. The TBDs also document your precise ABI.
2) The size is smaller which means they can be shipped in an SDK instead of binaries to reduce the size of an SDK
3) Stubs are producible from TBDs (or should be) which means stubs for linking can be produced even if we don't directly support them in LLD. This lets you ship the smaller TBD files in place of larger binaries and still link things without direct linker support (assuming you already ship a toolchain with your SDK or expect your users to have this tool)

Since stubs are producible from TBDs I don't really see a downside. I think we need both, I was going to propose a yaml based representation for ELF for the above reasons anyhow.


Re: [llvm-dev] [RFC] Open sourcing and contributing TAPI back to the LLVM community

Yvan Roux via cfe-dev
Ideally Jurgen would cut up the code on GitHub, put up an initial diff for a minimal viable tool, and then we would review it and continue to copy code from the GitHub repo into LLVM and review it. I'm also willing to do that if Jurgen doesn't want to at this point, though. I'd like the OK from Jurgen on that, and I'd also like the OK from someone that the license stuff is all good to go (I'm not sure who should check license stuff).

Best,
Jake

On Tue, Apr 10, 2018 at 2:39 PM John Ericson <[hidden email]> wrote:

Seems like there are a few of us interested in this, then. I'm new around here and don't really know how decisions are made, so what's next? Just open a diff with the entire library?

John


Re: [llvm-dev] [RFC] Open sourcing and contributing TAPI back to the LLVM community

Yvan Roux via cfe-dev

That sounds great to me, thanks Jake. I'm not Jurgen either, of course, but I'm happy to assist you if he is unavailable. I'm also not qualified to audit the license, but do note that Apple also formally released some code at https://opensource.apple.com/tarballs/tapi/. If there's anything else I can do to help, let me know.

Cheers,

John


On 04/10/2018 06:13 PM, Jake Ehrlich wrote:
Ideally Jurgen would cut up the code on GitHub, put up an initial diff for a minimal viable tool, and then we would review it and continue to copy code from the GitHub repo into LLVM and review it. I'm also willing to do that if Jurgen doesn't want to at this point, though. I'd like the OK from Jurgen on that, and I'd also like the OK from someone that the license stuff is all good to go (I'm not sure who should check license stuff).

Best,
Jake


Re: [llvm-dev] [RFC] Open sourcing and contributing TAPI back to the LLVM community

Yvan Roux via cfe-dev
Also, I mainly care about getting the ELF part of this working, so it would be nice to have an informal owner of the MachO part.

On Tue, Apr 10, 2018 at 4:15 PM John Ericson <[hidden email]> wrote:

That sounds great to me, thanks Jake. I'm not Jurgen either, of course, but I'm happy to assist you if he is unavailable. I'm also not qualified to audit the license, but do note that Apple also formally released some code at https://opensource.apple.com/tarballs/tapi/. If there's anything else I can do to help, let me know.

Cheers,

John



Re: [llvm-dev] [RFC] Open sourcing and contributing TAPI back to the LLVM community

Yvan Roux via cfe-dev


On Tue, Apr 10, 2018 at 5:33 PM Jake Ehrlich via llvm-dev <[hidden email]> wrote:
Benefits of TBD:
1) It's human readable and diffs on TBDs correspond to changes in the ABI. Diffs can be automatically added to review processes to ensure that changes to the ABI are reviewed. The TBDs also document your precise ABI.
2) The size is smaller which means they can be shipped in an SDK instead of binaries to reduce the size of an SDK

I'm still skeptical that this is significant.

3) Stubs are producible from TBDs (or should be) which means stubs for linking can be produced even if we don't directly support them in LLD. This lets you ship the smaller TBD files in place of larger binaries and still link things without direct linker support (assuming you already ship a toolchain with your SDK or expect your users to have this tool)

Since stubs are producible from TBDs I don't really see a downside. I think we need both, I was going to propose a yaml based representation for ELF for the above reasons anyhow.

Yea, a tool which can produce a .so from a textual description is certainly much less concerning than adding linker support for a new textual description format. If it's an official linker-supported format, it'd be yet another format that potentially needs to be standardized across multiple linkers, and kept compatible for "ever", etc. I just don't think that seems worthwhile for ELF.

OTOH, a standalone tool which can convert from a "full" shared-object to an interface shared-object would be _great_ to have. If that tool also has some auxiliary textual I/O format it supports, I guess that's fine, too. (We do have some existing yaml <-> ELF support, via the "obj2yaml" and "yaml2obj" tools.)

I'd note that reproducing all the things that are required/used from an ELF shared object during linking -- symbol type, binding-type, visibility, version, alignment (!), .gnu.warning messages, various important "SHT_NOTE" sections, and whatever other things I've forgotten about -- will need a format _significantly_ different from what Apple has as their "TBD" format. Apple's format also has a bunch of special cases in it that make it easier to use for their platform, but make it a rather less generic tool. E.g., symbols starting with "_OBJC_CLASS_$" are recorded in the "objc-classes" field with the prefix removed, instead of just being recorded as-is.
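As a rough illustration of that special case, the sketch below splits a list of exported names into a generic "symbols" list and an "objc-classes" list with the prefix stripped. It is not libtapi's code: the prefix string is taken as quoted in the message above, `classify_exports` is a hypothetical helper, and the output dictionary only mimics the two field names mentioned here rather than the full TBD schema.

```python
# Prefix exactly as quoted in the message above; treat it as an assumption.
OBJC_CLASS_PREFIX = "_OBJC_CLASS_$"


def classify_exports(symbols):
    """Split exported names into plain symbols and Objective-C class names."""
    record = {"symbols": [], "objc-classes": []}
    for name in symbols:
        if name.startswith(OBJC_CLASS_PREFIX):
            # Recorded with the prefix removed, not as-is.
            record["objc-classes"].append(name[len(OBJC_CLASS_PREFIX):])
        else:
            record["symbols"].append(name)
    return record


# Hypothetical exports from a small framework.
record = classify_exports(["_my_function", "_OBJC_CLASS_$_MyClass"])
print(record)
```

A format-agnostic tool for ELF would have no reason to carry a rule like this, which is the point being made: the convenience is platform-specific.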

So, I'd also caution that while the project of "import apple's libtapi into LLVM for LLD/MachO" and "Make a scheme to do interface shared-libs for ELF" might seem superficially related, I'd be very surprised if that actually ended up being the case. I would really not expect it to share just about anything at all other than the concept of being a textual description for a library.


Re: [llvm-dev] [RFC] Open sourcing and contributing TAPI back to the LLVM community

Yvan Roux via cfe-dev
I fully agree TBD files will need to be format specific. Producing stubs (from full shared objects or text) is important and should have a tool to do it properly. Right now that's an artisanal process unique to each project. Adding linker support is a whole other issue and not one I'm too concerned with. If someone wants to propose that later, they can. I certainly won't be proposing it.

As an aside, I'm actually quite aware of demand for a textual representation. I work on a team producing an operating system and we care about what symbols are exposed in our system libraries. I also know that lots of people use libabigail for these sorts of reasons. The demand for human ABI review exists. Whether a textual representation is important or not is, I suppose, unclear.

On Wed, Apr 11, 2018, 8:53 AM James Y Knight <[hidden email]> wrote:
On Tue, Apr 10, 2018 at 5:33 PM Jake Ehrlich via llvm-dev <[hidden email]> wrote:
Benefits of TBD:
1) It's human readable and diffs on TBDs correspond to changes in the ABI. Diffs can be automatically added to review processes to ensure that changes to the ABI are reviewed. The TBDs also document your precise ABI.
2) The size is smaller which means they can be shipped in an SDK instead of binaries to reduce the size of an SDK

I'm still skeptical that this is significant.

3) Stubs are producible from TBDs (or should be) which means stubs for linking can be produced even if we don't directly support them in LLD. This lets you ship the smaller TBD files in place of larger binaries and still link things without direct linker support (assuming you already ship a toolchain with your SDK or expect your users to have this tool)

Since stubs are producible from TBDs I don't really see a downside. I think we need both, I was going to propose a yaml based representation for ELF for the above reasons anyhow.

Yea, a tool which can produce a .so from a textual description is certainly much less concerning than adding linker support for a new textual description format. If it's an official linker-supported format, it'd be yet another format that potentially needs to be standardized across multiple linkers, and kept compatible for "ever", etc. I just don't think that seems worthwhile for ELF.

OTOH, a standalone tool which can convert from a "full" shared-object to an interface shared-object would be _great_ to have. If that tool also has some auxiliary textual I/O format it supports, I guess that's fine, too. (We do have some existing yaml <-> ELF support, via the "obj2yaml" and "yaml2obj" tools.)

I'd note that reproducing all the things that are required/used from an ELF shared object during linking -- symbol type, binding-type, visibility, version, alignment (!), .gnu.warning messages, various important "SHT_NOTE" sections, and whatever other things I've forgotten about -- will need a format _significantly_ different from what Apple has as their "TBD" format. Apple's format also has a bunch of special cases in it that make it easier to use for their platform, but make it a rather less generic tool. E.g., symbols starting with "_OBJC_CLASS_$" are recorded in the "objc-classes" field with the prefix removed, instead of just being recorded as-is.

So, I'd also caution that while the project of "import apple's libtapi into LLVM for LLD/MachO" and "Make a scheme to do interface shared-libs for ELF" might seem superficially related, I'd be very surprised if that actually ended up being the case. I would really not expect it to share just about anything at all other than the concept of being a textual description for a library.


Re: [llvm-dev] [RFC] Open sourcing and contributing TAPI back to the LLVM community

Yvan Roux via cfe-dev


On Apr 11, 2018, at 11:52 AM, James Y Knight via cfe-dev <[hidden email]> wrote:



On Tue, Apr 10, 2018 at 5:33 PM Jake Ehrlich via llvm-dev <[hidden email]> wrote:
Benefits of TBD:
1) It's human readable and diffs on TBDs correspond to changes in the ABI. Diffs can be automatically added to review processes to ensure that changes to the ABI are reviewed. The TBDs also document your precise ABI.
2) The size is smaller which means they can be shipped in an SDK instead of binaries to reduce the size of an SDK

I'm still skeptical that this is significant.

For Apple it certainly is, mostly because TBDs can support multiple architectures in a much more efficient way than fat images do.  Apple has a lot of architecture/SDK variants with a lot of redundancy between what libraries export on different platforms.

John.
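A small sketch of why a shared textual listing can beat per-architecture copies: symbols exported identically on several architectures only need to be written once, keyed by the set of architectures that export them. The grouping function and the sample data are hypothetical, and this is not the actual TBD layout, just the idea of de-duplicating across architectures.

```python
from collections import defaultdict


def group_by_archs(exports_per_arch):
    """Group symbols by the tuple of architectures that export them."""
    archs_for_symbol = defaultdict(set)
    for arch, symbols in exports_per_arch.items():
        for sym in symbols:
            archs_for_symbol[sym].add(arch)

    groups = defaultdict(list)
    for sym, archs in archs_for_symbol.items():
        groups[tuple(sorted(archs))].append(sym)
    return {archs: sorted(syms) for archs, syms in groups.items()}


# Hypothetical per-architecture export lists for one library.
exports = {
    "armv7": ["_foo", "_bar"],
    "arm64": ["_foo", "_bar", "_baz"],
}
grouped = group_by_archs(exports)
for archs, syms in grouped.items():
    print(archs, syms)
```

Here `_foo` and `_bar` are listed once for both architectures, whereas a fat image would carry a separate symbol table per slice.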



Re: [RFC] Open sourcing and contributing TAPI back to the LLVM community

Yvan Roux via cfe-dev
Sounds great :)

On Wed, Oct 25, 2017 at 4:28 PM, Eric Christopher <[hidden email]> wrote:
Hi Juergen,

At a minimum I think adding the support to libobject, etc so the various llvm tools can read or even write files from/for OSX should be fairly non-controversial so how about go ahead and do that first (I'll happily review if you'd like) and then we can go from there to do anything else with TAPI and llvm?

Sound good?

-eric

On Thu, Sep 7, 2017 at 5:01 PM Juergen Ributzka via cfe-dev <[hidden email]> wrote:
Hi @ll,

Over the past years I have been looking into how to reduce the size of the SDK that ships with Xcode and how to improve build times for the overall OS inside Apple. The result is a tool called TAPI, which is used at Apple for all things related to text-based dynamic library files (.tbd).

What are text-based dynamic library files?
Text-based dynamic library files (TBDs) are a textual representation of the information in a dynamic library / shared library that is required by the static linker - basically a symbol list of the exported symbols.

Apple’s SDKs originally used Mach-O Dynamic Library Stubs. Mach-O Dynamic Library Stubs are dynamic library files, but with all the text and data stripped out. TBD files were introduced to replace Mach-O Dynamic Library Stub files in the SDK to further reduce its overall size.

Over time the TAPI tool has grown and is used now in a variety of ways.

Dynamic Library Stubbing:
As mentioned above, TAPI reads the content of a dynamic library / shared library and generates a textual representation that can be used by the static linker. The current implementation reads MachO files, but it could be extended to also provide the same functionality for other object file formats.

Framework / Dynamic Library Verification:
The symbols that are exported from a dynamic library should ideally match, or at least contain, all the API that is specified in the associated header files. TAPI performs this verification by parsing the header files with CLANG and comparing the findings to the exported symbols from the library.

InstallAPI:
InstallAPI is a new build phase that generates the TBD file from header files only. This allows a dependency of the library to build concurrently even before the library has been built itself. This can be used to increase parallelism in the build of larger projects or operating systems.

Misc:
- display and operate on TBD files
- automatically generate API tests from header files
- libtapi, which is used by the linker (ld64) to parse the TBD files


The functionality of the tool is currently limited to Mach-O object files, but that is not a technical limitation. In making the tool open source I hope others will be able to take advantage of it too and extend its functionality to other object file formats.


I initially developed the project as a CLANG project, but that was mostly for practical reasons (out-of-tree development, separate repo, etc). For the curious ones I pushed the repo to github (https://github.com/ributzka/tapi).

I imagine, for example, that the reading/writing of TBD files is something that would fit better into the LLVM sources, which makes it available to other libraries and tools (e.g. LLVMObject, llvm-nm, lld, ...).

I created a small patch that integrates it with llvm-nm and LLVMObject. This patch is not complete and I will split it up into smaller patches for review. I am providing it as a reference to get the discussion started.

Please let me know what you think and bikeshed away :)

Thanks

Cheers,
Juergen





Re: [llvm-dev] [RFC] Open sourcing and contributing TAPI back to the LLVM community

Yvan Roux via cfe-dev
In reply to this post by Yvan Roux via cfe-dev
The code on opensource.apple.com is the minimal code needed for libtapi to read TBD files with ld64. The code I pushed to GitHub, on the other hand, includes the full TAPI tool source code. If you check the code you will see that I already use the LLVM license for everything, because the goal has always been to contribute this back to the community.

I attached an initial patch for libtapi and nm support to my original email. I also have some updated patches for that, but somehow got derailed into "let's rewrite libobject" discussions.

On Tue, Apr 10, 2018 at 4:15 PM, John Ericson via llvm-dev <[hidden email]> wrote:

That sounds great to me, thanks Jake. I'm not Juergen either, of course, but I'm happy to assist you if he is unavailable. I'm also not qualified to audit the license, but do note that Apple also formally released some code at https://opensource.apple.com/tarballs/tapi/. If there's anything else I can do to help, let me know.

Cheers,

John


On 04/10/2018 06:13 PM, Jake Ehrlich wrote:
Ideally Juergen would cut up the code on GitHub, put up an initial diff for a minimal viable tool, and then we would review it and continue to copy code from the GitHub repo into LLVM and review it. I'm also willing to do that if Juergen doesn't want to at this point, though. I'd like the OK from Juergen on that, and I'd also like the OK from someone that the license stuff is all good to go (I'm not sure who should check license stuff).

Best,
Jake

On Tue, Apr 10, 2018 at 2:39 PM John Ericson [hidden email] wrote:

Seems like there are a few of us interested in this then. I'm new around here and don't really know how decisions are made, so what's next? Just open a diff with the entire library?

John

On 04/10/2018 05:33 PM, Jake Ehrlich wrote:
Benefits of TBD:
1) It's human readable and diffs on TBDs correspond to changes in the ABI. Diffs can be automatically added to review processes to ensure that changes to the ABI are reviewed. The TBDs also document your precise ABI.
2) The size is smaller, which means they can be shipped in an SDK instead of binaries to reduce the size of the SDK.
3) Stubs are producible from TBDs (or should be) which means stubs for linking can be produced even if we don't directly support them in LLD. This lets you ship the smaller TBD files in place of larger binaries and still link things without direct linker support (assuming you already ship a toolchain with your SDK or expect your users to have this tool)

Since stubs are producible from TBDs I don't really see a downside. I think we need both; I was going to propose a YAML-based representation for ELF for the above reasons anyhow.
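
As a rough illustration of benefit 1, the sketch below diffs two exported-symbol lists and reports what was added or removed. It assumes the symbol lists have already been extracted from the old and new interface files; the function name `abi_diff` and the sample symbols are hypothetical, and no real TBD parsing is attempted.

```python
# Hypothetical sketch: compare the exported symbols of two library versions.
def abi_diff(old_symbols, new_symbols):
    """Return the symbols removed from and added to the exported interface."""
    old, new = set(old_symbols), set(new_symbols)
    return {"removed": sorted(old - new), "added": sorted(new - old)}


old_exports = ["_foo", "_bar", "_baz"]
new_exports = ["_foo", "_baz", "_qux"]
report = abi_diff(old_exports, new_exports)

# Print in a unified-diff-like style for use in a review.
for sym in report["removed"]:
    print("-", sym)
for sym in report["added"]:
    print("+", sym)
```

In a review process a removal would typically be flagged as a potential ABI break for existing clients, while an addition usually only needs to be acknowledged.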

On Tue, Apr 10, 2018 at 1:14 PM Andrew Kelley via llvm-dev <[hidden email]> wrote:
On Mon, Apr 9, 2018 at 10:11 PM, John Ericson via llvm-dev <[hidden email]> wrote:
> Regardless of any of that, given that TBD files _are_ an integral part of the apple platform, supporting them is certainly a necessity in order to have a working apple linker. So, if making LLD work for Apple/MachO is the justification for adding TBD support to LLVM, that seems self-evidently a reasonable thing to do. On the other hand, it looks like the LLD mach-o code is unmaintained and nobody seems to be much interested in it. And having code for reading TBD files in LLVM seems not terribly interesting, unless it is as part of a project to make the LLD MachO linker actually functional and supported.

Yes. I hope this can be reason enough. Hobbyists outside Apple could push for LLD support for Mach-O, and if LLD is to displace other linkers this is a necessary component, as you say. Better to upstream now, before the code diverges, than to do more work later? Conversely, if nothing happens, I doubt libtapi would be a greater drag on the codebase than the MachO LLD code, so whatever cost/benefit analysis exists for keeping that around could also apply to this.


Speaking for the Zig project here, our goal is to support cross-compilation for any target, on any target, without requiring installation of any target-specific SDK. So, for example, these use cases:
 * on linux, compile & link a binary targeting macos
 * on windows, compile & link a binary targeting macos

This works today, although it depends on a patch to LLD's Mach-O linker that is not of high enough quality to upstream.

So we have a vested interest in improving the Mach-O linker, and in fact a Zig community member has fixed at least one bug in Mach-O LLD: reviews.llvm.org/D35387

I don't fully understand how TBD or TAPI works, but I hope that it results in improvements to the Mach-O linker.

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev


