Integrating "-distribute" into clang's Driver

Mike Miller
Hi cfe-dev,

I've added an option to -cc1: -distribute. The option takes as input a
single source file, and produces object code, by distributing the
source to slaves. I'm now trying to make '-distribute' a
non-cc1-option as well, so that a user can use -distribute in their
CFLAGS to get projects to build in a distributed manner without much
hassle. I have several questions regarding this:

1. Since I'm skipping the assembler (I'm doing assembly on slaves), but
still going on to the linker, I'm confused about how to integrate the
-distribute option into the Action pipeline in Driver.cpp. What's the
best way to do this? I'd like to be able to smartly handle a user
typing "clang -distribute -E  myFile.c" by not invoking -distribute in
-cc1 if no object code is required.

2. Since I'm skipping the assembler, I need to know where to save the
object code to on disk. Is there an easy way to get clang to pass -cc1
the expected location of the object file, so that the linker will be
able to find the object file?

3. Is there any way (or does clang already) invoke multiple -cc1s in
parallel where possible? If not, would this be easy to add in? When
called with -distribute, clang will just connect via a UNIX socket to
another process, send over the source+args, and receive the diags, and
the object file will be written out to disk by the process at the
other end of the socket, so I'm not worried about thread safety at
all.
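
To make the socket exchange in question 3 a little more concrete, here
is a rough sketch of how one message could be framed on the wire. The
thread does not say what the protocol looks like, so the length-prefixed
framing and the helper names sendFrame and recvFrame are assumptions
made up for illustration, and error handling (EINTR, oversized lengths)
is deliberately minimal.

```cpp
#include <arpa/inet.h>   // htonl, ntohl
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>      // read, write

#include <cstdint>
#include <string>

// Write exactly len bytes, retrying on short writes.
static bool writeAll(int fd, const char *buf, size_t len) {
  while (len > 0) {
    ssize_t n = ::write(fd, buf, len);
    if (n <= 0) return false;
    buf += n;
    len -= static_cast<size_t>(n);
  }
  return true;
}

// Read exactly len bytes, retrying on short reads.
static bool readAll(int fd, char *buf, size_t len) {
  while (len > 0) {
    ssize_t n = ::read(fd, buf, len);
    if (n <= 0) return false;
    buf += n;
    len -= static_cast<size_t>(n);
  }
  return true;
}

// Hypothetical framing: 4-byte big-endian length, then the payload.
bool sendFrame(int fd, const std::string &payload) {
  uint32_t len = htonl(static_cast<uint32_t>(payload.size()));
  return writeAll(fd, reinterpret_cast<const char *>(&len), sizeof(len)) &&
         writeAll(fd, payload.data(), payload.size());
}

// Counterpart of sendFrame: read the length, then that many bytes.
bool recvFrame(int fd, std::string &out) {
  uint32_t len = 0;
  if (!readAll(fd, reinterpret_cast<char *>(&len), sizeof(len)))
    return false;
  out.assign(ntohl(len), '\0');
  return out.empty() || readAll(fd, &out[0], out.size());
}
```

Under this scheme the client would send one frame holding the source
and arguments, then read back one frame of diagnostics, so neither side
has to rely on the connection closing to find the end of a message.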

Thanks,
Mike
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev

Re: Integrating "-distribute" into clang's Driver

Eli Friedman
On Thu, May 13, 2010 at 4:27 PM, Mike Miller <[hidden email]> wrote:

> Hi cfe-dev,
>
> I've added an option to -cc1: -distribute. The option takes as input a
> single source file, and produces object code, by distributing the
> source to slaves. I'm now trying to make '-distribute' a
> non-cc1-option as well, so that a user can use -distribute in their
> CFLAGS to get projects to build in a distributed manner without much
> hassle. I have several questions regarding this:
>
> 1. Since I'm skipping the assembler (I'm doing assembly on slaves), but
> still going on to the linker, I'm confused about how to integrate the
> -distribute option into the Action pipeline in Driver.cpp. What's the
> best way to do this? I'd like to be able to smartly handle a user
> typing "clang -distribute -E  myFile.c" by not invoking -distribute in
> -cc1 if no object code is required.

The -integrated-as option is pretty similar to what you need; try
taking a look at how that is implemented?  As for -E, you can check
explicitly in Clang::ConstructJob in lib/Driver/Tools.cpp.

> 2. Since I'm skipping the assembler, I need to know where to save the
> object code to on disk. Is there an easy way to get clang to pass -cc1
> the expected location of the object file, so that the linker will be
> able to find the object file?

See above.

> 3. Is there any way (or does clang already) invoke multiple -cc1s in
> parallel where possible? If not, would this be easy to add in? When
> called with -distribute, clang will just connect via a UNIX socket to
> another process, send over the source+args, and receive the diags, and
> the object file will be written out to disk by the process at the
> other end of the socket, so I'm not worried about thread safety at
> all.

It probably wouldn't be that difficult to implement; the driver
already invokes separate -cc1 instances when it is passed multiple
files.  That said, the majority of popular build systems don't call
the compiler in this way, so it isn't very high priority.
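
As an illustration of what overlapping independent invocations could
look like, here is a minimal POSIX sketch that starts every command
before waiting on any of them. It is not a description of how clang's
driver runs its jobs: runJobsInParallel is a hypothetical helper, and
each command is treated as an opaque argv vector.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <string>
#include <vector>

// Start every command first, then wait for all of them, so independent
// jobs overlap in time. Returns one exit code per command, in input
// order (-1 if a child could not be started or did not exit normally).
std::vector<int> runJobsInParallel(
    const std::vector<std::vector<std::string>> &commands) {
  std::vector<pid_t> pids;
  for (const auto &cmd : commands) {
    if (cmd.empty()) {
      pids.push_back(-1);  // nothing to run
      continue;
    }
    // Build a null-terminated argv before forking.
    std::vector<char *> argv;
    for (const auto &arg : cmd)
      argv.push_back(const_cast<char *>(arg.c_str()));
    argv.push_back(nullptr);

    pid_t pid = fork();
    if (pid == 0) {
      execvp(argv[0], argv.data());
      _exit(127);  // only reached if exec failed
    }
    pids.push_back(pid);  // -1 here means fork itself failed
  }

  std::vector<int> exitCodes;
  for (pid_t pid : pids) {
    int status = 0;
    if (pid > 0 && waitpid(pid, &status, 0) == pid && WIFEXITED(status))
      exitCodes.push_back(WEXITSTATUS(status));
    else
      exitCodes.push_back(-1);
  }
  return exitCodes;
}
```

Because the results come back in input order, a caller could still
report failures per source file even though the jobs ran concurrently.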

-Eli

Re: Integrating "-distribute" into clang's Driver

Mike Miller
Thanks for the quick reply!

It looks like -integrated-as is handled in SelectToolForJob, where the
default assembler is overridden with the compiler's. This is slightly
different from what I want to do (it replaces a stage in the pipeline
instead of removing one and bridging the gap between the two
bordering stages).

Right now, my plan is to check in Driver::BuildActions whether the
args contain the option OPT_distribute and, if so, leave out the
assembly stage. The -E case I mentioned was just one example of a
larger class of problems (e.g. what if the user passes --emit-llvm
and -distribute?), so I'm concerned about handling these.
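
The plan above can be pictured with a toy model of the phase list.
Nothing here is clang's real Driver API: Phase, DriverArgs, buildPhases
and shouldDistribute are hypothetical names, and the real BuildActions
has to deal with many more cases than these four flags.

```cpp
#include <vector>

enum class Phase { Preprocess, Compile, Assemble, Link };

// Hypothetical stand-in for the parsed driver arguments.
struct DriverArgs {
  bool distribute = false;      // -distribute
  bool preprocessOnly = false;  // -E
  bool emitAssembly = false;    // -S
  bool compileOnly = false;     // -c
};

// Decide which phases to run for one source file.
std::vector<Phase> buildPhases(const DriverArgs &args) {
  std::vector<Phase> phases;
  phases.push_back(Phase::Preprocess);
  if (args.preprocessOnly)
    return phases;
  phases.push_back(Phase::Compile);
  if (args.emitAssembly)
    return phases;
  // With -distribute the compile step is assumed to yield object code
  // directly, so no separate assemble phase is scheduled.
  if (!args.distribute)
    phases.push_back(Phase::Assemble);
  if (!args.compileOnly)
    phases.push_back(Phase::Link);
  return phases;
}

// Distribution only makes sense when object code is actually wanted.
bool shouldDistribute(const DriverArgs &args) {
  return args.distribute && !args.preprocessOnly && !args.emitAssembly;
}
```

In this model "clang -distribute -E myFile.c" stops after preprocessing
and never asks for distribution, while a plain "-distribute" build goes
from the compile phase straight to the link phase.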

I'm still unsure about how to bridge the gap between the -cc1
invocation and the linker though. Any ideas on how I'd do that? Also,
any ideas on how I would pass -distribute down to -cc1?

Thanks!
Mike

Re: Integrating "-distribute" into clang's Driver

Mike Miller
Okay, did a little more digging, and I think I'm making the situation
more complicated than it is :).

If I leave out the assembly stage, clang produces a -cc1 command that
outputs (in my example) to cc-IpYybF.s, and a linker command that
takes cc-IpYybF.s as input. So it looks like getting the filename
into -cc1 won't be an issue (even if the file extension is a little
misleading)!

As for passing the arguments on, in Clang::ConstructJob, I simply
check if Args contains "-distribute", and if it does, I push
"-distribute" onto CmdArgs.

There is one small snag I encountered (and worked around, in an ugly
way). Because clang expects assembly output, it passes "-S" to the
-cc1 invocation, which overrides the "-distribute" option. To work
around this, I check whether "-distribute" and "-S" are both present;
if they are, I drop the "-S".

This is unfortunate, because if a user adds both -distribute and -S to
their args, the behavior will not be to do the action locally, but
instead to write an object file to the assembly location. Any ideas on
elegant workarounds?
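
One possible way to tell the two cases apart, sketched under
assumptions: look at whether the user typed -S, rather than at the -S
the driver adds for its own pipeline. The names ArgList, hasFlag and
buildCc1Args are hypothetical, and the argument handling is far simpler
than what Clang::ConstructJob really does.

```cpp
#include <algorithm>
#include <string>
#include <vector>

using ArgList = std::vector<std::string>;

// True if the exact flag appears among the arguments.
static bool hasFlag(const ArgList &args, const std::string &flag) {
  return std::find(args.begin(), args.end(), flag) != args.end();
}

// Build a simplified -cc1 argument list from what the user typed.
ArgList buildCc1Args(const ArgList &userArgs) {
  ArgList cmdArgs;
  cmdArgs.push_back("-cc1");

  const bool userAskedForAsm = hasFlag(userArgs, "-S");
  const bool distribute = hasFlag(userArgs, "-distribute");

  if (userAskedForAsm) {
    // The user explicitly wants assembly: do it locally, even if
    // -distribute is also present.
    cmdArgs.push_back("-S");
  } else if (distribute) {
    // Assembly was only an intermediate step for the driver, so the
    // distributed mode takes its place.
    cmdArgs.push_back("-distribute");
  } else {
    // Default pipeline in this model: emit assembly for the assembler.
    cmdArgs.push_back("-S");
  }
  return cmdArgs;
}
```

With this split, passing both -distribute and -S yields a local
assembly job, while -distribute on its own is forwarded to -cc1.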

Thanks,
Mike

Re: Integrating "-distribute" into clang's Driver

Eli Friedman
On Thu, May 13, 2010 at 5:52 PM, Mike Miller <[hidden email]> wrote:

> Okay, did a little more digging, and I think I'm making the situation
> more complicated than it is :).
>
> If I leave out the assembly stage, clang produces a -cc1 command that
> outputs (in my example) to cc-IpYybF.s, and a linker command that
> takes cc-IpYybF.s as input. So it looks like getting the filename
> into -cc1 won't be an issue (even if the file extension is a little
> misleading)!
>
> As for passing the arguments on, in Clang::ConstructJob, I simply
> check if Args contains "-distribute", and if it does, I push
> "-distribute" onto CmdArgs.
>
> There is one small snag I encountered (and worked around, in an ugly
> way). Because clang expects assembly output, it passes "-S" to the
> -cc1 invocation, which overrides the "-distribute" option. To work
> around this, I check whether "-distribute" and "-S" are both present;
> if they are, I drop the "-S".
>
> This is unfortunate, because if a user adds both -distribute and -S to
> their args, the behavior will not be to do the action locally, but
> instead to write an object file to the assembly location. Any ideas on
> elegant workarounds?

Here's what I was thinking: your "clang -cc1 -distribute" mode is
roughly equivalent to "clang -cc1 -emit-obj" in the sense that it
takes a C source code file and outputs an object file, right?
Therefore, I think you should be able to make "-distribute" act like
"-integrated-as" in job creation, and just modify Clang::ConstructJob
slightly to pass "-distribute" instead of "-emit-obj" in "-distribute"
mode.  Or maybe make a unique job type for it.  Does that sound like
it could work?
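
The idea above amounts to picking the -cc1 mode flag from the kind of
output wanted, with "-distribute" taking the place of "-emit-obj". A toy
sketch, assuming a hypothetical OutputKind enum and cc1ModeFlag helper
that do not exist in clang:

```cpp
#include <string>

// Hypothetical classification of what the compile job must produce.
enum class OutputKind { Preprocessed, Assembly, Object };

// Pick the -cc1 mode flag for a job. Only the object-code case changes
// when distribution is requested; -E and -S stay local.
std::string cc1ModeFlag(OutputKind kind, bool distribute) {
  switch (kind) {
  case OutputKind::Preprocessed:
    return "-E";
  case OutputKind::Assembly:
    return "-S";
  case OutputKind::Object:
    return distribute ? "-distribute" : "-emit-obj";
  }
  return "";  // unreachable for valid enum values
}
```

Treating the job as an object-producing one from the start means the
driver never adds "-S" on its own, so a "-S" typed by the user keeps its
usual meaning.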

-Eli
