RFC: Clang driver redesign


James Molloy-2
Hi,

The clang driver has been the subject of much dislike for a while now, and
for a while I've wanted to sort it out. I'm chairing a BoF session on it at
the dev meeting later this month, and as a precursor to that I've started
putting together a requirements document.

The initial aim of this document is to work out the specific requirements
for a Clang compiler driver. Then we can check whether the current driver
meets them (it won't!) and design a solution based on actual use cases.

The first draft of this document is attached in HTML form and inline below
in ReST form. I've documented the obvious use cases that I can think of, but
I am certain to have missed some out. I've also started a strawman proposal
solution, with the intention of stimulating discussion by providing
something concrete.

Please comment! The more feedback I get (even "I don't really care" is
useful) the better prepared I can be for the BoF and the better driver Clang
will get as a result.

Cheers,

James

===========================
 Clang driver requirements
===========================

Changelog
=========

2011-Nov-04: JamesM: Updated usecases, s/driver/plugin/g, sent to list.
2011-Oct-25: JamesM: Initial draft.

Introduction
============

The current Clang driver is inadequate, as shown by many mailing list
threads, the latest being [ML1]_. Its failings seem to stem from the
lack of real extensibility without touching lots of the codebase, and
many interlinked special cases.

The current driver, while OK for hosted compilation, is very difficult
to set up for cross-compilation.

This document initially aims merely to document the requirements for
*a compiler driver for a Clang/LLVM-based compiler*, aimed specifically at
but not limited to the Clang codebase. That is, it should be possible to
reuse this driver component for a similar but different tool, much
like the rest of Clang's design.

This document aims at building/documenting a consensus on driver
design. There are many different use cases in the community so this
design is expected to be a living document which will have input from
the entire (interested) community.

Potential use cases
===================

Where a use case gives rise to a requirement, that requirement is
indexed in square brackets ([X]).

Usecase 1: UNIX/Windows distribution maintainer creating a distribution
-----------------------------------------------------------------------

Wants to take Clang ToT or release tag, compile and package. The only
change he needs to make is to tell Clang where to pick up headers and
libraries. This differs between distributions, and with multilib
support it has started differing even more. This is due to differing
directory structures for multilib and the expectation of where to find
headers, libraries and crt*.o's. The ideal would be to not
have to recompile Clang or maintain diffs against the clang tree to
make this change [2].

The library and header locations may depend on the resolved target
triple, along with other possible parameters (command line options,
environment variables, etc.).

Usecase 2: Developer creating a Clang-based derivative tool
-----------------------------------------------------------

Researchers or companies may wish to adapt Clang in more complex ways
than Usecase 1, such as adding new command line flags, adding
compatibility modes or changing the subtools invoked (for example,
invoking an alternate linker which takes different command line arguments).

Wants to adapt the command line parsing in a more complex way -
perhaps by ([3]):

  * Adding new command line flags.
  * Editing the functionality of current command line flags, possibly
    in a complex (non-declarative) way.
  * Altering the way subtools are invoked.

These changes should be easy to maintain - diffs should be able to be
separate from the main Driver and not subject to clobbering by
enthusiastic Driver developers [4].

Usecase 3: Clang developer, developing
--------------------------------------

Wants no functionality to change - things keep working as normal [1].

Usecase 4: Apple/Darwin developer, using fat binaries
-----------------------------------------------------

Requires fat-binary support. This entails multiple "-arch" arguments
being supported. [1]

.. note::

    Describe this some more?

Functional requirements
=======================

The following requirements follow from the use cases above and attempt
to formalise those use cases more precisely.

[1] No functional regressions
  The driver **must** be able to be configured such that it can parse
  command lines that the current Clang driver accepts. The driver
  **must** invoke all subtools in the same manner as the current Clang
  driver, with the possible exception of obtuse, undefined, legacy or
  otherwise incorrect behaviour, permission for which must be obtained
  from the mailing list and documented in a subsection of this
  document for decision tracking.

[2] Adaptability
  The driver's parameters (search paths, header locations etc)
  **must** be able to be changed with minimal intervention.

  These parameters **should** be able to be changed without a
  recompile of Clang or any changes to the source base.

[3] Extensibility
  All parts of the driver that are to interact with outside
  environment (such as interpreting command lines and launching
  subtools) **must** be able to have their behaviour easily modified.

  While there is no requirement for this to be able to be done with no
  source changes, there **could** be scope for allowing dynamically
  loadable modules (in the spirit of ``opt -load``) to change the
  driver's behaviour at invoke-time.

[4] Maintainability of downstream changes
  There must be a well-defined and documented API that can be
  followed by a developer attempting to modify the driver's
  behaviour. This API should make it possible to, at a minimum:

    * Add, remove and modify command line flags with possibly complex
      imperative rules.
    * Hook into and mutate commands to be passed to subtools.
    * Pass "Clang-like" diagnostics to the user.
    * Set up default parameters such as include paths.
    * Separate their modification from the Clang sourcebase, at least
      to the extent that existing Clang source files should not need
      to be modified with anything other than a trivial patch.

Proposed design "A"
===================

The following design is proposed as a strawman to expose possible
flaws in the requirements and to generate more targeted
discussion. That said, I (JamesM) certainly think it's a decent
design, else I wouldn't propose it.

As the requirements change (and they will!) this design should change with
them.

The high level overview is of three parts: a driver "Kernel", one or
more "Driver Plugins" and a "Config file".

.. note::

    Find a better word than "plugin" so it won't cause confusion
    between this and normal Clang plugins?

This solves the requirements in the following ways:

1. As a pure rearchitecting exercise there should be no need for any
   functional differences to take place.
2. The config files allow for easy tweaking of configuration
   without a recompile.
3. The driver framework forms a stable API for adding to and defining
   driver functionality, and also easily allows this to be imported at
   runtime via shared object.
4. The driver framework means that ideally a developer can create
   his/her own driver, plug it in and it not be affected at all by any
   Clang change other than the driver API.

Driver Kernel
-------------

The driver Kernel will be responsible for handling user input and
calling subprocesses. In particular it will parse command lines in a
generic, almost POSIX-compliant manner, launch subprocesses and
display their diagnostics, and emit diagnostics of its own.

It will maintain a list of plugins which are partially ordered (so that the
order they are linked in / loaded is not important), which will each
expose a list of command line options they handle, and how to handle
them (for example, setting a parameter or handing off to a handler
function).

The kernel should maintain a state as a sort of dictionary/hash which
the plugins can access and mutate, as well as add extra entries to.

Once all plugins have mutated the state in response to command line
options, the state is handed to the config file to mutate further, after
which the kernel generates command lines to invoke subprocesses.

These command lines are then sent to the plugins for possible mutation
before being executed.
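
The flow described in this section can be sketched roughly as follows. This is only an illustration in Python, not the proposed C++ implementation: the names ``Plugin``, ``LinkerPlugin``, ``run_kernel``, ``cc1``, ``ld`` and ``my-ld`` are all hypothetical, and the ``-flag=value`` syntax and the "lower priority value first" ordering are assumptions of the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Plugin:
    """Hypothetical plugin: a name, a priority and the flags it handles."""
    name: str
    priority: int
    # Maps a flag to the state key it sets (the "setting a parameter" case).
    options: dict = field(default_factory=dict)

    def mutate_command(self, command):
        # Default hook: leave the subprocess command line unchanged.
        return command


class LinkerPlugin(Plugin):
    def mutate_command(self, command):
        # Example mutation: substitute an alternate (hypothetical) linker.
        return ["my-ld" if arg == "ld" else arg for arg in command]


def run_kernel(argv, plugins, config):
    state = {}
    ordered = sorted(plugins, key=lambda p: p.priority)
    # 1. Plugins claim command line options and mutate the state.
    for arg in argv:
        flag, _, value = arg.partition("=")
        for plugin in ordered:
            if flag in plugin.options:
                state[plugin.options[flag]] = value or True
                break
    # 2. The declarative config mutates the state further.
    state.update(config)
    # 3. The kernel generates subprocess command lines from the state.
    commands = [
        ["cc1", "-triple", state.get("triple", "unknown")],
        ["ld", "-o", state.get("output", "a.out")],
    ]
    # 4. Each command is offered to every plugin before execution.
    for plugin in ordered:
        commands = [plugin.mutate_command(c) for c in commands]
    return state, commands
```

Calling ``run_kernel(["-target=arm-none-eabi"], plugins, {})`` returns the final state and the (possibly mutated) command lines without executing anything, which keeps the sketch testable.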

Driver Plugin
-------------

A driver plugin is a C++ module, either statically or dynamically
linked, which implements a specific API.

The API should consist of at least:
 
  * A function for the Kernel to obtain a list of command line options
    the plugin can handle and how to handle them. This is yet to be
    defined, but should be nearly (may require a few changes)
    compatible with the current output generated by the Clang argument
    parser tablegen backend.
  * A function for the Kernel to obtain the "priority" of the
    plugin. Priority is a way for the developer to define which
    plugins are queried first for command line argument resolution and
    subprocess command mutation.
  * A function that the Kernel will call on every subprocess
    invocation to allow the plugin to mutate that invocation.
  * A function to allow the plugin to emit diagnostics to the user via
    the Kernel, or to abort compilation.
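
As a rough illustration of the four functions listed above, here is a sketch of the interface in Python (the real API would be C++ and is yet to be defined). Every name in it is hypothetical: ``DriverPlugin``, ``DriverAbort``, ``Kernel``, ``ArchPlugin`` and the handler signature ``(state, value)`` are assumptions made for the example.

```python
from abc import ABC, abstractmethod


class DriverAbort(Exception):
    """Raised by a plugin to abort compilation (hypothetical)."""


class Kernel:
    """Toy stand-in for the driver Kernel; it only collects diagnostics."""

    def __init__(self):
        self.diagnostics = []


class DriverPlugin(ABC):
    """Hypothetical plugin interface mirroring the four listed functions."""

    def __init__(self, kernel):
        self.kernel = kernel

    @abstractmethod
    def options(self):
        """Return a mapping of option name -> handler callable."""

    @abstractmethod
    def priority(self):
        """Return an integer; how it is ordered is up to the Kernel."""

    def mutate_invocation(self, command):
        """Called for every subprocess invocation; default is no change."""
        return command

    def diagnose(self, level, message):
        """Report via the Kernel; an 'error' aborts compilation."""
        self.kernel.diagnostics.append((level, message))
        if level == "error":
            raise DriverAbort(message)


class ArchPlugin(DriverPlugin):
    """Example plugin collecting repeated -arch arguments."""

    def options(self):
        return {"-arch": self.handle_arch}

    def priority(self):
        return 10

    def handle_arch(self, state, value):
        if not value:
            self.diagnose("error", "-arch requires a value")
        state.setdefault("archs", []).append(value)
```

Only ``options`` and ``priority`` are abstract here, so a plugin that does not care about subprocess mutation inherits the no-op default.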

Statically linking plugins to the driver will result in the fastest
compilation speed, but because the API is well defined I suggest also
offering the ability to dynamically load plugins at runtime - this may
make development easier for some users (TODO: Does anyone
care about this? Would it help anyone?)

Driver Config files
-------------------

These are designed to allow the *user* to tweak settings at invoke
time, without requiring a recompile. For speed, they allow a
restricted set of operations in comparison to driver plugins: they
are purely declarative, with no imperative constructs, and can modify
or add to the kernel state.

These files could be written in whatever lightweight markup language
we choose, which is not really important at this stage. The important
thing is that it is simple enough to parse speedily with no interpreter
overhead and no extra dependencies.

Suggestions include JSON, YAML, XML or an INI style, similar to Daniel
Dunbar's recent build system changes.
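
To make the "purely declarative" idea concrete, here is a small sketch assuming the INI option is chosen. The section and key names (``target``, ``paths``, ``triple``, ``include``, ``library``), the example paths and the ``section.key`` naming of state entries are all invented for illustration; nothing here is a decided format.

```python
import configparser

# Hypothetical config file contents; all names and paths are made up.
EXAMPLE_CONFIG = """
[target]
triple = arm-none-eabi

[paths]
include = /opt/sdk/include
library = /opt/sdk/lib
"""


def apply_config(state, text):
    """Merge a purely declarative INI document into the kernel state."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    for section in parser.sections():
        for key, value in parser.items(section):
            # Config entries add to, or override, existing state entries.
            state[f"{section}.{key}"] = value
    return state
```

Because the file only assigns values, applying it is a single pass over key/value pairs with no evaluation step.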

References
==========

.. [ML1] http://lists.cs.uiuc.edu/pipermail/cfe-dev/2011-October/018059.html


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev

clang-driver-spec.html (19K) Download Attachment

Re: RFC: Clang driver redesign

Joerg Sonnenberger
On Fri, Nov 04, 2011 at 11:11:37AM -0000, James Molloy wrote:
> The current driver, while OK for hosted compilation is very difficult
> to set up for cross-compilation.

I resent this. I am using it for cross-compilation all the time and it
is trivial to use for me. Let's at least call the child by name: Linux
support is a mess and needs to improve. Pretty much all other platforms
shouldn't be negatively affected by that.

Joerg

Re: RFC: Clang driver redesign

James Molloy-2
Actually I'm not referring explicitly to Linux - the use case that springs
most to mind is our own, which is cross-compiling for baremetal ARM targets.

I have to use -ccc-host-triple arm-freebsd-eabi because otherwise the driver
attempts to autoconfigure from the host GCC (freebsd uses as and ld directly
and happens (!!) to default to looking in the current directory first).


Re: RFC: Clang driver redesign

Joerg Sonnenberger
On Fri, Nov 04, 2011 at 11:45:53AM -0000, James Molloy wrote:
> Actually I'm not referring explicitly to Linux - the use case that springs
> most to mind is our own, which is cross-compiling for baremetal ARM targets.
>
> I have to use -ccc-host-triple arm-freebsd-eabi because otherwise the driver
> attempts to autoconfigure from the host GCC (freebsd uses as and ld directly
> and happens (!!) to default to looking in the current directory first).

Just make sure you have arm-freebsd-eabi-{as,ld} around in PATH and be
done.

Joerg

Re: RFC: Clang driver redesign

David Chisnall
In reply to this post by James Molloy-2
Hi James,

It would be worth linking to this somewhere in your draft, as a lot of the issues have already been discussed:

http://clang.llvm.org/UniversalDriver.html

On 4 Nov 2011, at 11:11, James Molloy wrote:

> The first draft of this document is attached in HTML form and inline below
> in ReST form. I've documented the obvious use cases that I can think of, but
> I am certain to have missed some out. I've also started a strawman proposal
> solution, with the intention of stimulating discussion by providing
> something concrete.

Since we're throwing up use-cases, here are the ones I have where the current driver is not the best:

Driving plugins that produce some non-LLVM output.  I have written one that generates JavaScript from an ObjC AST and there are others that generate different compiler IRs for codegen by something other than LLVM.  Invoking these is pretty painful with the current driver.  It would be great to be able to just provide a config file that would remove the LLVM CodeGen and substitute something else.  Currently I have to do something like -fsyntax-only (which seems nonsense, because I actually do want code generated, just not LLVM code), -load full/path/to/plugin.so and a few more options.  

Shipping a cross-compile toolchain: Most of these include gcc, binutils, headers and libraries.  Including clang with the toolchain wouldn't make sense, because it would be the same clang as the version for the host system, unlike GCC where you need one specifically configured for each target.  Ideally, you'd just provide a config file that would tell clang where to find everything and it would just work.

Driving extra LLVM optimisations.  I have a set for Objective-C that will automatically add themselves to the optimisation toolchain if the library is loaded.  Currently this is done by adding -Xclang -load -Xclang path/to/plugin.so to the command line.  This is far from ideal, I'd like to be able to provide a config file that said something like 'when compiling Objective-C, load this library'.  I imagine other libraries may eventually want to add some library-specific optimisations.  For example, GObject or Qt's signals and slots mechanism may both benefit from running some extra LLVM passes that optimise their specific uses.  

Finally, an idea that's been floated before and may or may not require changes to the driver is that of a compile daemon. This would start once, parse all prefix headers, and then spawn threads to handle the parsing and compiling of each new source file.  It could more easily limit resource usage than a parallel make (for example, C source files typically take under 100MB to compile, C++ ones can take 500MB easily if they're full of templates - being able to limit clangd (clanged?) to using 1GB of RAM would be better than running make -j2 on a quad-core system).  Ideally, it could also be used from libclang so that files could be parsed frequently and sent off for codegen when clangd has spare resources and the file has been unmodified for a while (i.e. not when it's being frequently queried by the IDE for code completion, or autocorrection).  

From a user perspective, the command line should be simple.  Being able to easily add command line arguments that simply map to combinations of others would be a huge win for usability.  As a user, I want to be able to just say something like -target=touchpad and have it load touchpad.conf, which specifies all of the arguments for that specific cross-compile toolchain.  I want it to load my user config and add a -fexperimental-stuff option, which expands to -Xclang -load -Xclang ~/my/buggy/experimental/optimisations.so
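
A minimal sketch of that kind of alias expansion, in Python for brevity. The ``ALIASES`` table and the ``expand_aliases`` function are hypothetical; only the -fexperimental-stuff example and its expansion come from the paragraph above.

```python
# Hypothetical user config: each alias maps to the arguments it stands for.
ALIASES = {
    "-fexperimental-stuff": [
        "-Xclang", "-load",
        "-Xclang", "~/my/buggy/experimental/optimisations.so",
    ],
}


def expand_aliases(argv, aliases):
    """Replace each alias with its expansion; other arguments pass through."""
    expanded = []
    for arg in argv:
        expanded.extend(aliases.get(arg, [arg]))
    return expanded
```

The expansion is a single pass, so an alias cannot refer to another alias in this sketch; whether nested aliases should be allowed is a separate design question.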

A lot of the things I want are possible with the current driver, but they require command lines so complicated that people generally just don't bother with them.

David

-- Sent from my brain

Re: RFC: Clang driver redesign

Jean-Daniel Dupas-2

On 4 Nov 2011, at 13:42, David Chisnall wrote:

> Hi James,
>
> It would be worth linking to this somewhere in your draft, as a lot of the issues have already been discussed:
>
> http://clang.llvm.org/UniversalDriver.html
>
> On 4 Nov 2011, at 11:11, James Molloy wrote:
>
>> The first draft of this document is attached in HTML form and inline below
>> in ReST form. I've documented the obvious use cases that I can think of, but
>> I am certain to have missed some out. I've also started a strawman proposal
>> solution, with the intention of stimulating discussion by providing
>> something concrete.
>
> Since we're throwing up use-cases, here are the ones I have where the current driver is not the best:
>
> Shipping a cross-compile toolchain: Most of these include gcc, binutils, headers and libraries.  Including clang with the toolchain wouldn't make sense, because it would be the same clang as the version for the host system, unlike GCC where you need one specifically configured for each target.  Ideally, you'd just provide a config file that would tell clang where to find everything and it would just work.

I'm with you on that one. I really dream of something like having a "sdk folder" which contains headers, libraries, binutils (if needed), and a configuration file, and just being able to invoke clang, telling it where the folder is, and having it cross compile my code.

The configuration file should have a predefined name and location inside the sdk folder, so you don't have to tell clang where it is, and it should contain the default target triple, the paths (relative to the SDK root) of headers, libraries and tools, and any other information clang needs. That way you just have to drop the sdk folder wherever you want and invoke clang with something like this:

clang --sdk=<path to my sdk>
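
A rough sketch of how such a lookup could work, in Python for brevity. The file name ``sdk.conf``, the ``[sdk]`` section and its keys are all hypothetical, as is the ``load_sdk`` function; the only idea taken from the text is a config file at a fixed place inside the SDK with paths relative to the SDK root.

```python
import configparser
from pathlib import Path

# Hypothetical fixed name and location of the config file inside the SDK.
SDK_CONFIG_NAME = "sdk.conf"


def load_sdk(sdk_root):
    """Read <sdk_root>/sdk.conf and resolve its paths against the SDK root."""
    root = Path(sdk_root)
    parser = configparser.ConfigParser(interpolation=None)
    with open(root / SDK_CONFIG_NAME) as handle:
        parser.read_file(handle)
    sdk = parser["sdk"]
    return {
        "triple": sdk["triple"],
        # Relative paths keep the SDK folder relocatable.
        "headers": root / sdk["headers"],
        "libraries": root / sdk["libraries"],
    }
```

Since every path is joined to the root at load time, moving the folder only changes the argument passed on the command line.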


-- Jean-Daniel






Re: RFC: Clang driver redesign

reed kotler
In reply to this post by David Chisnall
I think that, whatever you decide to do, you should make the new
driver a separate executable.
That way you can develop it and test it as a separate unit.

Or you can change Clang so that the driver is a plugin that can be
freely substituted.



Re: RFC: Clang driver redesign

Sean Hunt-2
In reply to this post by James Molloy-2
On Fri, Nov 4, 2011 at 07:11, James Molloy <[hidden email]> wrote:

> Usecase 3: Clang developer, developing
> --------------------------------------
>
> Wants no functionality to change - things keep working as normal [1]
>
> Usecase 4: Apple/Darwin developer, using fat binaries
> -----------------------------------------------------
>
> Requires fat-binary support. This entails multiple "-arch" arguments
> being supported. [1]
>
> .. note::
>
>    Describe this some more?
>
> Functional requirements
> =======================
>
> The following requirements follow from the use cases above and attempt
> to formalise those use cases more precisely.
>
> [1] No functional regressions
>  The driver **must** be able to be configured such that it can parse
>  command lines that the current Clang driver accepts. The driver
>  **must** invoke all subtools in the same manner as the current Clang
>  driver, with the possible exception of obtuse, undefined, legacy or
>  otherwise incorrect behaviour, permission for which must be obtained
>  from the mailing list and documented in a subsection of this
>  document for decision tracking.

I honestly have to disagree with this one. Much of the
horribleness in the current driver stems from compatibility with GCC. I
believe that we should really have two drivers, one being the 'nice'
driver, and one being the compatibility driver. To be honest, I
consider POSIX specifications for CC rather irritating as well, but
I'm willing to concede POSIX compatibility. Naturally, it should be
easy for these to both be changeable at once so that we don't have
ridiculous levels of maintenance being performed, but I'm of the opinion
that the current model is predicated on enough levels of annoyances
that trying to promote a compatible compiler is not a good approach
(the first example that comes to mind is -Wall).

I'm sure people will disagree with me, though.

> [3] Extensibility
>  All parts of the driver that are to interact with outside
>  environment (such as interpreting command lines and launching
>  subtools) **must** be able to have their behaviour easily modified.
>
>  While there is no requirement for this to be able to be done with no
>  source changes, there **could** be scope for allowing dynamically
>  loadable modules (in the spirit of ``opt -load``) to change the
>  driver's behaviour at invoke-time.

Oh no, spec files. ;)

Sean


Re: RFC: Clang driver redesign

Christopher Jefferson

On 4 Nov 2011, at 14:55, Sean Hunt wrote:

> On Fri, Nov 4, 2011 at 07:11, James Molloy <[hidden email]> wrote:
>> Usecase 3: Clang developer, developing
>> --------------------------------------
>>
>> Wants no functionality to change - things keep working as normal [1]
>>
>> Usecase 4: Apple/Darwin developer, using fat binaries
>> -----------------------------------------------------
>>
>> Requires fat-binary support. This entails multiple "-arch" arguments
>> being supported. [1]
>>
>> .. note::
>>
>>    Describe this some more?
>>
>> Functional requirements
>> =======================
>>
>> The following requirements follow from the use cases above and attempt
>> to formalise those use cases more precisely.
>>
>> [1] No functional regressions
>>  The driver **must** be able to be configured such that it can parse
>>  command lines that the current Clang driver accepts. The driver
>>  **must** invoke all subtools in the same manner as the current Clang
>>  driver, with the possible exception of obtuse, undefined, legacy or
>>  otherwise incorrect behaviour, permission for which must be obtained
>>  from the mailing list and documented in a subsection of this
>>  document for decision tracking.
>
> I honestly have to disagree with this one. Much of the
> horribleness in the current driver stems from compatibility with GCC. I
> believe that we should really have two drivers, one being the 'nice'
> driver, and one being the compatibility driver. To be honest, I
> consider POSIX specifications for CC rather irritating as well, but
> I'm willing to concede POSIX compatibility. Naturally, it should be
> easy for these to both be changeable at once so that we don't have
> ridiculous levels of maintenance being performed, but I'm of the opinion
> that the current model is predicated on enough levels of annoyances
> that trying to promote a compatible compiler is not a good approach
> (the first example that comes to mind is -Wall).

Having recently taught C++, I strongly agree. The two most obvious problems are:

a) Too low a default warning level

b) The 'clang' command's stupid treatment of C++ (inherited from gcc) where it will compile it, but not link in the standard library. I have to teach students that if they see the error message:

Undefined symbols for architecture x86_64:
  "std::ios_base::Init::Init()", referenced from:
      ___cxx_global_var_init in z-pj7RVn.o
  "std::ios_base::Init::~Init()", referenced from:
      ___cxx_global_var_init in z-pj7RVn.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)

It means that they tried compiling C++ with clang instead of clang++. Almost everyone hits it at some point, and it's impossible to diagnose without googling or help.

Chris

Re: RFC: Clang driver redesign

Peter Collingbourne
In reply to this post by David Chisnall
On Fri, Nov 04, 2011 at 12:42:49PM +0000, David Chisnall wrote:
> From a user perspective, the command line should be simple.  Being able to easily add command line arguments that simply map to combinations of others would be a huge win for usability.  As a user, I want to be able to just say something like -target=touchpad and have it load touchpad.conf, which specifies all of the arguments for that specific cross-compile toolchain.

+1.  This would also be very useful for OpenCL C standard libraries
such as libclc which require a specific set of command line options
to be provided to the frontend.  Currently I have a shell script for
testing purposes which acts as a wrapper around the clang driver,
which I would ideally like to replace with some kind of driver
configuration file generated at compile time.

Thanks,
--
Peter

Re: RFC: Clang driver redesign

Matthieu Monrocq
In reply to this post by Sean Hunt-2


Le 4 novembre 2011 15:55, Sean Hunt <[hidden email]> a écrit :
On Fri, Nov 4, 2011 at 07:11, James Molloy <[hidden email]> wrote:
> Usecase 3: Clang developer, developing
> --------------------------------------
>
> Wants no functionality to change - things keep working as normal [1]
>
> Usecase 4: Apple/Darwin developer, using fat binaries
> -----------------------------------------------------
>
> Requires fat-binary support. This entails multiple "-arch" arguments
> being supported. [1]
>
> .. note::
>
>    Describe this some more?
>
> Functional requirements
> =======================
>
> The following requirements follow from the use cases above and attempt
> to formalise those use cases more precisely.
>
> [1] No functional regressions
>  The driver **must** be able to be configured such that it can parse
>  command lines that the current Clang driver accepts. The driver
>  **must** invoke all subtools in the same manner as the current Clang
>  driver, with the possible exception of obtuse, undefined, legacy or
>  otherwise incorrect behaviour, permission for which must be obtained
>  from the mailing list and documented in a subsection of this
>  document for decision tracking.

I honestly have to disagree with this one. A lot of the reasons for
horribleness in the current driver is compatibility with GCC. I
believe that we should really have two drivers, one being the 'nice'
driver, and one being the compatibility driver. To be honest, I
consider POSIX specifications for CC rather irritating as well, but
I'm willing to concede POSIX compatibility. Naturally, it should be
easy for these to both be changeable at once so that we don't have
ridiculous levels of maintenance being performed, but I'm of opinion
that the current model is predicated on enough levels of annoyances
that trying to promote a compatible compiler is not a good approach
(the first example that comes to mind is -Wall).

I'm sure people will disagree with me, though.

I fully agree.

I've tried adding options, in the past, and the segregation of options is quite... baffling. Some `-fXXX` will influence LLVM while others influence Clang !?!

The dual Options.td cc1Options.td is quite nice too...

The GCC compatibility is great for a drop-in replacement, but I certainly see no harm in building a "pure" clang driver, where options are segregated according to Clang usecases.

This would also allow implementing a *sane* option syntax, like for example a hierarchical option parser:

--codegen-stackframe-limit=50

Where "stackframe-limit=50" is dispatched to the "codegen" plugin, which itself dispatches "limit=50kB" to its stackframe object, which sets its "limit" attribute to 50000 (for example).

This makes it easy to:
- group options right where they are used
- avoid name collisions
- have plugins register their own set of options
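
A minimal sketch of that kind of hierarchical dispatch, in Python for brevity. The class names, the `set_option` method, and the rule that a bare number means bytes are all assumptions made for illustration. It also assumes that no single segment name contains a hyphen, since the option is split on `-`.

```python
def parse_size(text):
    """Parse '50kB' as 50000; a bare number is taken as bytes (assumption)."""
    if text.endswith("kB"):
        return int(text[:-2]) * 1000
    return int(text)


class StackFrame:
    def __init__(self):
        self.limit = None

    def set_option(self, path, value):
        # The leaf object owns the final name segment and its value.
        if path != ["limit"]:
            raise KeyError("unknown stackframe option: " + "-".join(path))
        self.limit = parse_size(value)


class Codegen:
    """Hypothetical 'codegen' plugin that forwards to its own sub-objects."""

    def __init__(self):
        self.stackframe = StackFrame()

    def set_option(self, path, value):
        head, rest = path[0], path[1:]
        if head != "stackframe":
            raise KeyError("unknown codegen option: " + head)
        self.stackframe.set_option(rest, value)


def dispatch(plugins, arg):
    """Route '--plugin-sub-name=value' to the plugin named by the first segment."""
    if not arg.startswith("--"):
        raise ValueError("expected a '--' option: " + arg)
    name, _, value = arg[2:].partition("=")
    head, *rest = name.split("-")
    plugins[head].set_option(rest, value)


plugins = {"codegen": Codegen()}
dispatch(plugins, "--codegen-stackframe-limit=50kB")
print(plugins["codegen"].stackframe.limit)  # 50000
```

Because each level only looks at its own segment, a plugin can register new options without the top-level parser knowing about them.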

I have been working on such a parser in my spare time, and I don't mind giving the code away as a basis:
=> automatic handling of Integers, Strings, and Enums (which must provide some specific functions for conversion and listing the available values)
=> automatic handling of Boolean arguments (at the moment, I use =yes or =no, but it would be trivial to parse --no- as meaning =no provided that no plugin tries to grab the "no" namespace)
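
The typed handling described above could look roughly like the following sketch. It is not the parser mentioned in this message: the `schema` mapping, the option names, and the `OptGoal` enum are hypothetical, and the `--no-` form is only accepted when the remaining name is a known Boolean option.

```python
from enum import Enum


class OptGoal(Enum):
    """Hypothetical enum option; the values double as the accepted spellings."""
    NONE = "none"
    SIZE = "size"
    SPEED = "speed"


def convert(kind, text):
    """Convert option text to the declared type (bool, Enum subclass, int, str)."""
    if kind is bool:
        if text not in ("yes", "no"):
            raise ValueError("expected yes or no, got " + repr(text))
        return text == "yes"
    if isinstance(kind, type) and issubclass(kind, Enum):
        try:
            return kind(text)
        except ValueError:
            # Enums can list their own available values for the diagnostic.
            choices = ", ".join(member.value for member in kind)
            raise ValueError("expected one of: " + choices) from None
    return kind(text)


def parse(arg, schema):
    """Parse '--name=value' (or '--no-name' for Booleans) against a schema."""
    name, sep, value = arg[2:].partition("=")
    if not sep and name.startswith("no-") and schema.get(name[3:]) is bool:
        return name[3:], False
    return name, convert(schema[name], value)


schema = {"verbose": bool, "jobs": int, "goal": OptGoal, "output": str}
print(parse("--verbose=yes", schema))
print(parse("--no-verbose", schema))
print(parse("--goal=size", schema))
```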

Ah, and I also have a configuration file parser which sets the options objects before the command line is parsed... (modelled after the config python module)
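
As a sketch of the ordering only (configuration file first, command line second, so the command line wins), assuming an INI-style file read with Python's standard `configparser`, which may or may not be the module meant above. The section and key names are hypothetical.

```python
import configparser


def load_options(config_text, argv):
    """Seed options from a config file, then let command-line arguments override."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)

    options = {}
    # Each [section] becomes the first segment of the hierarchical option name.
    for section in parser.sections():
        for key, value in parser[section].items():
            options[section + "-" + key] = value

    # Command-line values are applied last, so they take precedence.
    for arg in argv:
        if arg.startswith("--") and "=" in arg:
            name, _, value = arg[2:].partition("=")
            options[name] = value
    return options


config_text = """
[codegen]
stackframe-limit = 50kB
"""
print(load_options(config_text, []))
print(load_options(config_text, ["--codegen-stackframe-limit=100kB"]))
```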
 

> [3] Extensibility
>  All parts of the driver that are to interact with outside
>  environment (such as interpreting command lines and launching
>  subtools) **must** be able to have their behaviour easily modified.
>
>  While there is no requirement for this to be able to be done with no
>  source changes, there **could** be scope for allowing dynamically
>  loadable modules (in the spirit of ``opt -load``) to change the
>  driver's behaviour at invoke-time.

Oh no, spec files. ;)

Sean




Re: RFC: Clang driver redesign

Eric Christopher-2

On Nov 4, 2011, at 1:08 PM, Matthieu Monrocq wrote:

> This would also allow implementing a *sane* option syntax, like for example a hierarchical option parser:
>
> --codegen-stackframe-limit=50
>
> Where "stackframe-limit=50" is dispatched to the "codegen" plugin, which itself dispatch "limit=50kB" to its stackframe object, which sets its "limit" attribute to 50000 (for example).

There's one particular set of use cases that this brings up that I'd like to mention: 

"Less knobs for users to use"

I realize it's an example, and a good way of describing a partial way of evaluating options; however, the idea behind this option in particular is something I explicitly don't want to do. I believe we want less knobs, not more. We don't want the users mucking with things like inlining heuristics, the size of the stack, and whether they want to unroll 4 loops or 3. These are the kinds of decisions that the optimizer should be able to handle, and people working on the compiler have their own ways of mucking with these sorts of options.

-eric


Re: RFC: Clang driver redesign

Ruben Van Boxem

On 4 Nov 2011 21:51, "Eric Christopher" <[hidden email]> wrote:
>
>
> On Nov 4, 2011, at 1:08 PM, Matthieu Monrocq wrote:
>
>> This would also allow implementing a *sane* option syntax, like for example a hierarchical option parser:
>>
>> --codegen-stackframe-limit=50
>>
>> Where "stackframe-limit=50" is dispatched to the "codegen" plugin, which itself dispatch "limit=50kB" to its stackframe object, which sets its "limit" attribute to 50000 (for example).
>
>
> There's one particular set of use cases that this brings up that I'd like to mention: 
>
> "Less knobs for users to use"
>
> I realize it's an example, and a good way of describing a partial way of evaluation options, however, the idea behind this option in particular is something I explicitly don't want to do. I believe we want less knobs, not more. We don't want the users mucking with things like inlining heuristics, the size of the stack, and whether they want to unroll 4 loops or 3. These are the kinds of decisions that the optimizer should be able to handle and people working on the compiler have their own ways of mucking with these sorts of options.

Wrong; to successfully build the cellspu tblgen files on Windows x64, one had to increase the stack size for tblgen not to crash. I'm quite certain this use case is far from unique. I do agree with hiding low-level options that casual users shouldn't know about (although they should be well documented), but not having them at all is really shooting yourself in the foot...

Ruben

>
> -eric
>



Re: RFC: Clang driver redesign

Eric Christopher-2

On Nov 4, 2011, at 2:55 PM, Ruben Van Boxem wrote:

On 4 Nov 2011 21:51, "Eric Christopher" <[hidden email]> wrote:
>
>
> On Nov 4, 2011, at 1:08 PM, Matthieu Monrocq wrote:
>
>> This would also allow implementing a *sane* option syntax, like for example a hierarchical option parser:
>>
>> --codegen-stackframe-limit=50
>>
>> Where "stackframe-limit=50" is dispatched to the "codegen" plugin, which itself dispatch "limit=50kB" to its stackframe object, which sets its "limit" attribute to 50000 (for example).
>
>
> There's one particular set of use cases that this brings up that I'd like to mention: 
>
> "Less knobs for users to use"
>
> I realize it's an example, and a good way of describing a partial way of evaluation options, however, the idea behind this option in particular is something I explicitly don't want to do. I believe we want less knobs, not more. We don't want the users mucking with things like inlining heuristics, the size of the stack, and whether they want to unroll 4 loops or 3. These are the kinds of decisions that the optimizer should be able to handle and people working on the compiler have their own ways of mucking with these sorts of options.

Wrong; to successfully build the cellspu tblgen files on windows x64, one had to increase stack size for tblgen not to crash. I'm quite certain this use case is far from unique. I do agree hiding low level options casual users shouldn't know about (although they should be well documented), but not having it is really shooting yourself in the foot..


Then the port is broken. You shouldn't need a compiler option for this.

-eric



Re: RFC: Clang driver redesign

Ruben Van Boxem

On 4 Nov 2011 23:11, "Eric Christopher" <[hidden email]> wrote:
>
>
> On Nov 4, 2011, at 2:55 PM, Ruben Van Boxem wrote:
>
>> On 4 Nov 2011 21:51, "Eric Christopher" <[hidden email]> wrote:
>> >
>> >
>> > On Nov 4, 2011, at 1:08 PM, Matthieu Monrocq wrote:
>> >
>> >> This would also allow implementing a *sane* option syntax, like for example a hierarchical option parser:
>> >>
>> >> --codegen-stackframe-limit=50
>> >>
>> >> Where "stackframe-limit=50" is dispatched to the "codegen" plugin, which itself dispatch "limit=50kB" to its stackframe object, which sets its "limit" attribute to 50000 (for example).
>> >
>> >
>> > There's one particular set of use cases that this brings up that I'd like to mention: 
>> >
>> > "Less knobs for users to use"
>> >
>> > I realize it's an example, and a good way of describing a partial way of evaluation options, however, the idea behind this option in particular is something I explicitly don't want to do. I believe we want less knobs, not more. We don't want the users mucking with things like inlining heuristics, the size of the stack, and whether they want to unroll 4 loops or 3. These are the kinds of decisions that the optimizer should be able to handle and people working on the compiler have their own ways of mucking with these sorts of options.
>>
>> Wrong; to successfully build the cellspu tblgen files on windows x64, one had to increase stack size for tblgen not to crash. I'm quite certain this use case is far from unique. I do agree hiding low level options casual users shouldn't know about (although they should be well documented), but not having it is really shooting yourself in the foot..
>>
>>
> Then the port is broken. You shouldn't need a compiler option for this.

Agreed, but that doesn't change the fact that there will always be deliberate, controlled cases where these types of things are required. Removing that compiler option interface would only limit Clang's power. And there will always be compiler bugs that could be effectively worked around by using these kinds of options.

Ruben

>
> -eric
>



Re: RFC: Clang driver redesign

Eric Christopher-2


> Then the port is broken. You shouldn't need a compiler option for this.

Agreed, but that doesn't take away there will always be wanted and controlled cases where these types of things are required. It would only limit Clang's power by removing that compiler option interface. And there will always be compiler bugs, that could be effectively worked around by using these kinds of options.

Did you file a bug? Did you fix the bad behavior? Or did you just look for an undocumented option that would work around it and have left that in? Even if you did the right thing most users won't. Supporting and maintaining those sorts of behaviors with explicit command line options is exactly why those kinds of options shouldn't exist.

A generic interface for short term fixes (perhaps a reason for the plugins that James mentioned) might be OK, but the general driver should expose as little of the backend as possible.

-eric


Re: RFC: Clang driver redesign

Ruben Van Boxem

On 4 Nov 2011 23:55, "Eric Christopher" <[hidden email]> wrote:
>>
>>
>> > Then the port is broken. You shouldn't need a compiler option for this.
>>
>> Agreed, but that doesn't take away there will always be wanted and controlled cases where these types of things are required. It would only limit Clang's power by removing that compiler option interface. And there will always be compiler bugs, that could be effectively worked around by using these kinds of options.
>
> Did you file a bug? Did you fix the bad behavior? Or did you just look for an undocumented option that would work around it and have left that in? Even if you did the right thing most users won't. Supporting and maintaining those sorts of behaviors with explicit command line options is exactly why those kinds of options shouldn't exist.
>
> A generic interface for short term fixes (perhaps a reason for the plugins that James mentioned) might be OK, but the general driver should expose as little of the backend as possible.

OK, workarounds are a bad reason.

The optimizer will not always make the best decisions, especially in situations where numerical data is processed that is unknown at compile time. Backend inlining options can improve performance if the user knows what to optimize for. I agree there are most probably a lot of mucky options, but sometimes fine-grained control beyond what the backend itself can provide is much wanted, if not necessary.

All I'm trying to say is a (too) dumbed down interface can be harmful (to adoption, usefulness, adaptability, research...) as much as too many obscure options can lead to misuse.

Ruben

PS: the stack size option was for the linker, making it a bit irrelevant in light of the current discussion; I wrongly picked it up from a previous message.

>
> -eric



Re: RFC: Clang driver redesign

Hal Finkel
In reply to this post by Eric Christopher-2
On Fri, 2011-11-04 at 13:48 -0700, Eric Christopher wrote:

>
> On Nov 4, 2011, at 1:08 PM, Matthieu Monrocq wrote:
>
> > This would also allow implementing a *sane* option syntax, like for
> > example a hierarchical option parser:
> >
> > --codegen-stackframe-limit=50
> >
> >
> > Where "stackframe-limit=50" is dispatched to the "codegen" plugin,
> > which itself dispatch "limit=50kB" to its stackframe object, which
> > sets its "limit" attribute to 50000 (for example).
>
> There's one particular set of use cases that this brings up that I'd
> like to mention:
>
>
> "Less knobs for users to use"
>
>
> I realize it's an example, and a good way of describing a partial way
> of evaluation options, however, the idea behind this option in
> particular is something I explicitly don't want to do. I believe we
> want less knobs, not more. We don't want the users mucking with things
> like inlining heuristics, the size of the stack, and whether they want
> to unroll 4 loops or 3. These are the kinds of decisions that the
> optimizer should be able to handle and people working on the compiler
> have their own ways of mucking with these sorts of options.

From a scientific-programming perspective, I think that this is the
wrong way to approach the problem. Although I certainly understand the
desire to decrease the maintenance burden by restricting the number of
public-facing options, tuning things like loop-unrolling limits is
often necessary for squeezing the last bit of performance out of some
scientific code. There is a tendency to think, "but people should not
spend their time doing that"; and for the most part that is true. But
many places now have autotuners that attempt to find optimal compiler
parameters for specific routines on specific input data, and those
autotuners often work with multiple compilers and so the options that
they're tuning must be available using the command-line interface.

I think, however, that it is important to make a distinction between the
options that are designed to be public facing and those that are not.
For example, a public option could look like: -floop-unrolling-limit=200
while a non-public, could-change-at-any-time option could look like:
-finternal:loop-unrolling-limit=200. Options that have meanings that are
(mostly) independent of the underlying implementation, such as how many
instructions can be in an unrolled loop, stack sizes, etc. should be
made public. Other options should not, but should still be made
available. In this way, clang/LLVM will be as friendly as possible to
regular users, performance engineers, and compiler developers.
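
A small sketch of how a driver might tell the two classes apart, using the spellings from the paragraph above. Neither `-floop-unrolling-limit` nor the `-finternal:` prefix is an existing clang option; both are taken from this message as hypothetical syntax, and the `classify` helper is likewise an assumption.

```python
INTERNAL_PREFIX = "-finternal:"


def classify(arg):
    """Return (visibility, name, value) for a '-f' style option.

    Options spelled '-finternal:NAME=VALUE' are treated as non-public and
    free to change; plain '-fNAME=VALUE' options are treated as public.
    """
    # Check the longer prefix first, since it also starts with '-f'.
    if arg.startswith(INTERNAL_PREFIX):
        name, _, value = arg[len(INTERNAL_PREFIX):].partition("=")
        return ("internal", name, value)
    if arg.startswith("-f"):
        name, _, value = arg[2:].partition("=")
        return ("public", name, value)
    raise ValueError("not a -f option: " + arg)


for arg in ["-floop-unrolling-limit=200", "-finternal:loop-unrolling-limit=200"]:
    print(classify(arg))
```

A driver built this way could, for example, warn once whenever an internal option is used, so that autotuners keep working while ordinary users are steered toward the public set.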

 -Hal

>
>
> -eric

--
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory


Re: RFC: Clang driver redesign

Douglas Gregor
In reply to this post by Sean Hunt-2

On Nov 4, 2011, at 7:55 AM, Sean Hunt wrote:

> On Fri, Nov 4, 2011 at 07:11, James Molloy <[hidden email]> wrote:
>> Usecase 3: Clang developer, developing
>> --------------------------------------
>>
>> Wants no functionality to change - things keep working as normal [1]
>>
>> Usecase 4: Apple/Darwin developer, using fat binaries
>> -----------------------------------------------------
>>
>> Requires fat-binary support. This entails multiple "-arch" arguments
>> being supported. [1]
>>
>> .. note::
>>
>>    Describe this some more?
>>
>> Functional requirements
>> =======================
>>
>> The following requirements follow from the use cases above and attempt
>> to formalise those use cases more precisely.
>>
>> [1] No functional regressions
>>  The driver **must** be able to be configured such that it can parse
>>  command lines that the current Clang driver accepts. The driver
>>  **must** invoke all subtools in the same manner as the current Clang
>>  driver, with the possible exception of obtuse, undefined, legacy or
>>  otherwise incorrect behaviour, permission for which must be obtained
>>  from the mailing list and documented in a subsection of this
>>  document for decision tracking.
>
> I honestly have to disagree with this one. A lot of the reasons for
> horribleness in the current driver is compatibility with GCC. I
> believe that we should really have two drivers, one being the 'nice'
> driver, and one being the compatibility driver. To be honest, I
> consider POSIX specifications for CC rather irritating as well, but
> I'm willing to concede POSIX compatibility. Naturally, it should be
> easy for these to both be changeable at once so that we don't have
> ridiculous levels of maintenance being performed, but I'm of opinion
> that the current model is predicated on enough levels of annoyances
> that trying to promote a compatible compiler is not a good approach
> (the first example that comes to mind is -Wall).


GCC compatibility is and has always been crucial to the viability of Clang, *especially* in the driver, which needs to deal with many years of accumulated cruft in makefiles and command lines. Unlike with language compatibility, where we can differ from GCC to better adhere to a language standard, GCC's driver *is* the standard for most *nix systems out there. You won't win the hearts and minds of users if you tell them to change all of their makefiles before they can even try Clang.

By all means, please make it easier to build and distribute cross compilers, but any Clang driver that does not provide GCC compatibility is likely to be a non-starter [*].

        - Doug

[*] The natural exception would be a driver designed for compatibility with a different compiler, e.g., a Clang that accepts Microsoft CL command-line syntax.

Re: RFC: Clang driver redesign

Andrew Trick
On Nov 7, 2011, at 8:02 AM, Douglas Gregor wrote:

>> I honestly have to disagree with this one. A lot of the reasons for
>> horribleness in the current driver is compatibility with GCC. I
>> believe that we should really have two drivers, one being the 'nice'
>> driver, and one being the compatibility driver. To be honest, I
>> consider POSIX specifications for CC rather irritating as well, but
>> I'm willing to concede POSIX compatibility. Naturally, it should be
>> easy for these to both be changeable at once so that we don't have
>> ridiculous levels of maintenance being performed, but I'm of opinion
>> that the current model is predicated on enough levels of annoyances
>> that trying to promote a compatible compiler is not a good approach
>> (the first example that comes to mind is -Wall).
>
>
> GCC compatibility is and has always been crucial to the viability of Clang, *especially* in the driver, which needs to deal with many years of accumulated cruft in makefiles and command lines. Unlike with language compatibility, where we can differ from GCC to better adhere to a language standard, GCC's driver *is* the standard for most *nix systems out there. You won't win the hearts and minds of users if you tell them to change all of their makefiles before they can even try Clang.
>
> By all means, please make it easier to build and distribute cross compilers, but any Clang driver that does not provide GCC compatibility is likely to be a non-starter [*].

This isn't completely true. gcc is the standard for building *open
source packages* across platforms, which has only become important
recently. Your conclusion is correct that tweaking Makefiles cannot be
a requirement for adopting clang. But the assumption that most users
are gcc option gurus is wrong, and clang suffers from that mentality.

The majority of users, build maintainers, need some superficial level
of compatibility. As long as the build doesn't break, they're happy.

The details of invoking subtools, diagnostics, fine control of
optimizations, and target specific flags are important to hackers
who are already dealing with something broken or requiring
extraordinary optimization. Then there are the people on this list,
who simply want to waste less time repeating the horrible
trial-and-error process that it takes to reproduce a problem or enable
an experimental feature. Our experience matters too.

My own experience parallels the community in general. I've used gcc
far more than any other compiler and always appreciated having it when
I need to port a missing library. But I never cared about tweaking any
command line options other than "-g -O0". For any real compiler
development, performance work, or debugging I used the platform
vendor's native compiler which always had a sane, well documented
option set. Very unlike gcc.

My experience adopting a mostly undocumented and seemingly
obfuscated clang driver for development was violently traumatic and
still causes me grief. I'm guessing there are two reasons: (1) the
assumption that all compiler developers must have been gcc/llvm-gcc
developers at some point (2) the evil idea that a driver should be
designed primarily to prevent build engineers from using anything
beyond the minimal option set, which results in burying the rest of
the functionality in an indecipherable web of driver code.

Moving to a driver that compartmentalizes gcc compatibility would be
fantastic. We could finally focus on providing a sensible command-line
interface for the clang community. I'm not necessarily advocating two
drivers so much as a clear and formal segregation between
first-class clang options and gcc compatibility options. This is sort of
the intent behind the unfortunate "-cc1" and "-mllvm" flags. But they
only add considerable confusion from my point of view.

And while I'm in evangelism mode, we need a strict requirement that
every decision within the compiler that can be impacted by the
environment, including target data and library versions, should be
formalized and captured by an option framework. These options
could be either printed and replayed on the command line or potentially
embedded in bitcode. I have no problem disabling these options in
release builds if that's really the way to solve the inertial-QA-team problem.

And how difficult would it be to record those options + compiler version
in the obj file? Really!
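
As a rough illustration of "printed and replayed", here is a sketch that serialises the environment-dependent decisions plus the compiler version and turns the record back into a command line. The JSON layout, the option names, and the `--name=value` spelling are all assumptions; embedding the record in an object file or in bitcode is not shown.

```python
import json
import shlex


def record(compiler_version, decisions):
    """Serialise the compiler version and every captured decision as JSON text."""
    return json.dumps({"version": compiler_version, "options": decisions},
                      sort_keys=True)


def replay_command(record_text, driver="clang"):
    """Rebuild a command line that pins every recorded decision explicitly."""
    data = json.loads(record_text)
    args = ["--%s=%s" % (name, value)
            for name, value in sorted(data["options"].items())]
    # shlex.join quotes any argument that would need it in a shell.
    return shlex.join([driver] + args)


# Hypothetical decisions that would normally be inferred from the environment.
rec = record("clang 3.0", {"target-triple": "armv7-none-eabi",
                           "libc-version": "2.13"})
print(rec)
print(replay_command(rec))
```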

Taking it one step further, we should have a single driver that
supports decomposition of the major compilation stages. For example,
I'm currently unable to use basic codegen diagnostics without
hand-inserting instrumentation because I often can't force the llc
driver to produce the same code as the clang driver.

Believe me, experimental compiler development doesn't need to be this
difficult.

-Andy