[RFC] Path mappings for reproducible builds and debugging hacks

[RFC] Path mappings for reproducible builds and debugging hacks

Eric Fiselier via cfe-dev
Hello all,
I've been discussing this topic on #llvm with some of the regulars, but
this merits a wider audience. As some of you might know, NetBSD
allows doing a full release build with GCC in a reproducible way,
including variations of the source locations. This is currently not
possible with clang and I want to fix that. There are four identified
primary points where absolute path names leak into the output:

(1) .file
(2) DWARF
(3) __FILE__
(4) __PRETTY_FUNCTION__ for lambdas etc

We have -fdebug-prefix-map [-fdpm from here] for (2), with some
limitations. I created -iremap for GCC for (3) years ago; the patch
is still in review limbo. We don't have anything for (1) and (4) right
now in clangland. I've started to write patches for that, but this is a
bit messy as it tends to duplicate code. This made me want to step back
and review whether we need/want many different switches in the first
place. I couldn't come up with a very good reason, but it has been
mentioned that Facebook is using -fdpm for speakable hacks to get space
into the binaries for patching in real patches. That seems abusive to
me, even though I can somewhat understand the motivation.

My proposal forward is:
(1) Tighten the definition of -fdpm to mean prefix paths, i.e. the next
character must be a path separator:
  -fdebug-prefix-map=/foo=/bar should not change /foobar into /barbar
(2) Introduce a new option -frewrite-path=src=dst and make -fdpm an
alias of it. The translation applies to all four points above.
(3) Introduce a new -gdwarf-path-padding=$n option to prefix all path
names encoded in DWARF with $n path separators.
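
To illustrate the matching rule in (1) and the padding in (3), here is a minimal sketch in Python. It is not clang's implementation: the names rewrite_path and pad_path are hypothetical, "/" is assumed to be the only path separator, src is assumed to have no trailing separator, and "first matching mapping wins" is an assumption rather than something the proposal specifies.

```python
def rewrite_path(path, mappings):
    """Rewrite `path` using the first (src, dst) pair whose src is a true
    path prefix: either the whole path, or followed by a "/" separator."""
    for src, dst in mappings:
        if path == src:
            return dst
        if path.startswith(src + "/"):
            # Keep the remainder, including its leading separator.
            return dst + path[len(src):]
    # No mapping applies; leave the path untouched.
    return path


def pad_path(path, n):
    """Prefix `path` with n path separators, as proposed for (3)."""
    return "/" * n + path
```

With the mapping /foo=/bar, rewrite_path turns /foo/a.c into /bar/a.c but leaves /foobar/a.c unchanged, which is the behaviour asked for in (1).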

The goal going forward is that path references in output files generated
by clang (except dependency files) should be easily adjustable to a
canonical location, independent of where the sources are located.

Joerg
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Re: [RFC] Path mappings for reproducible builds and debugging hacks

Eric Fiselier via cfe-dev
On Thu, Sep 28, 2017 at 9:43 AM, Joerg Sonnenberger via cfe-dev <[hidden email]> wrote:
My proposal forward is:
(1) Tighten the definition of -fdpm to mean prefix paths, i.e. the next
character must be a path separator:
  -fdebug-prefix-map=/foo=/bar should not change /foobar into /barbar
(2) Introduce a new option -frewrite-path=src=dst and make -fdpm an
alias of it. The translation applies to all four points above.

These two SGTM. I also don't see why you'd want different path translations for the different kinds of data. Controlling it all with one flag seems like a very reasonable proposal. Suggestion: "-frewrite-source-path" may be a clearer name for the flag.

I'd also like to point out the (proposed, but not accepted) patch to GCC here to do something similar:
Note that the discussion does continue in some of the later months too. If we decide to go for this, it would be nice to respond to the gcc thread saying what we're planning to do, in case they'd like to be compatible with it.
 
(3) Introduce a new -gdwarf-path-padding=$n option to prefix all path
names encoded in DWARF with $n path separators.

This part sounds pretty horrible. =)

Even after the discussion on #llvm earlier, I still don't really understand why Facebook has a need to mangle dwarf paths in-place, instead of using a debugger source mapping feature.

Re: [RFC] Path mappings for reproducible builds and debugging hacks

Eric Fiselier via cfe-dev
In reply to this post by Eric Fiselier via cfe-dev
I haven't looked deeply into this issue, but I thought -no-canonical-prefixes was the flag for this use case. Can you elaborate on what makes it unsuitable?
