rfc: Adding a mode to -gline-tables-only to include parameter names, namespaces


rfc: Adding a mode to -gline-tables-only to include parameter names, namespaces

Nico Weber
Hi,

the motivation for -gline-tables-only was to make debug info much smaller, but still include enough to get usable stack frames [1]. We recently tried using it in Chromium and discovered that the stack frames aren't all that usable: Function parameters disappear, as do function namespaces.
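To illustrate why the missing namespaces and parameters hurt, here is a small Python sketch of how a crash server might bucket reports by top-frame name. The frame names and the `bucket` helper are invented for illustration; they are not taken from the Chromium pipeline.

```python
from collections import Counter

# Hypothetical symbolized top frames from four crash reports. Each tuple is
# (unqualified name, fully qualified name with parameter types).
frames = [
    ("Init", "net::Socket::Init(int)"),
    ("Init", "gfx::Canvas::Init(const gfx::Size&)"),
    ("Init", "net::Socket::Init(int)"),
    ("Run", "base::MessageLoop::Run()"),
]


def bucket(frames, qualified):
    """Count crashes per top-frame name, as a crash server might."""
    index = 1 if qualified else 0
    return Counter(frame[index] for frame in frames)


# With unqualified names, two unrelated Init functions share one bucket.
short_buckets = bucket(frames, qualified=False)
# With namespaces and parameter types, they stay separate.
full_buckets = bucket(frames, qualified=True)
```

With only the unqualified name, the three "Init" crashes look like one bug; the qualified names split them into two distinct functions.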

Are there any concerns about adding a mode to -gline-tables-only (or a second flag -gline-tables-full, or similar) that includes function parameter info and namespace info but still omits most debug info?

(Background: We rely on dsym files to let our crash server symbolize crash dumps we get from the wild. dsymutil uses debug info, but it apparently crashes if debug info is > 4GB. We hit that recently. So we started using -gline-tables-only, but now all our stacks are close to unusable, even though the motivation for gline-tables-only was to still have usable stacks. We can't easily use symbol names as they get stripped very early in our pipeline, and changing that is apparently involved.)

Thanks,
Nico



_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev

Re: rfc: Adding a mode to -gline-tables-only to include parameter names, namespaces

Reid Kleckner-3
On Wed, Apr 29, 2015 at 12:25 PM, Nico Weber <[hidden email]> wrote:
Hi,

the motivation for -gline-tables-only was to make debug info much smaller, but still include enough to get usable stack frames [1]. We recently tried using it in Chromium and discovered that the stack frames aren't all that usable: Function parameters disappear, as do function namespaces.

Are there any concerns about adding a mode to -gline-tables-only (or a second flag -gline-tables-full, or similar) that includes function parameter info and namespace info but still omits most debug info?

(Background: We rely on dsym files to let our crash server symbolize crash dumps we get from the wild. dsymutil uses debug info, but it apparently crashes if debug info is > 4GB. We hit that recently. So we started using -gline-tables-only, but now all our stacks are close to unusable, even though the motivation for gline-tables-only was to still have usable stacks. We can't easily use symbol names as they get stripped very early in our pipeline, and changing that is apparently involved.)

Is there some way to change the pipeline to use the mangled name stored in .symtab instead of the DW_AT_name in the DWARF? My understanding is that the sanitizers already do this.
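A rough Python sketch of what symbol-table-based symbolization could look like: find the last symbol that starts at or before the crashing address. The addresses and mangled names are made-up examples, and a real symbolizer would also check symbol sizes or function ends, which this sketch does not.

```python
import bisect

# Hypothetical (start address, mangled name) pairs, as a symbol table
# might list them. The mangled names keep namespaces and parameter types.
symbols = sorted([
    (0x1000, "_ZN3net6Socket4InitEi"),
    (0x1040, "_ZN3gfx6Canvas4InitERKNS_4SizeE"),
    (0x10A0, "_ZN4base11MessageLoop3RunEv"),
])
starts = [addr for addr, _ in symbols]


def symbolize(pc):
    """Return the mangled name of the last symbol starting at or before pc.

    Returns None when pc lies before the first symbol. No upper bound is
    checked, so an address past the last function maps to that function.
    """
    i = bisect.bisect_right(starts, pc) - 1
    return symbols[i][1] if i >= 0 else None
```

The mangled name would then be passed through a demangler to recover the qualified name and parameter types.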


Re: rfc: Adding a mode to -gline-tables-only to include parameter names, namespaces

Nico Weber
On Wed, Apr 29, 2015 at 12:35 PM, Reid Kleckner <[hidden email]> wrote:
On Wed, Apr 29, 2015 at 12:25 PM, Nico Weber <[hidden email]> wrote:
Hi,

the motivation for -gline-tables-only was to make debug info much smaller, but still include enough to get usable stack frames [1]. We recently tried using it in Chromium and discovered that the stack frames aren't all that usable: Function parameters disappear, as do function namespaces.

Are there any concerns about adding a mode to -gline-tables-only (or a second flag -gline-tables-full, or similar) that includes function parameter info and namespace info but still omits most debug info?

(Background: We rely on dsym files to let our crash server symbolize crash dumps we get from the wild. dsymutil uses debug info, but it apparently crashes if debug info is > 4GB. We hit that recently. So we started using -gline-tables-only, but now all our stacks are close to unusable, even though the motivation for gline-tables-only was to still have usable stacks. We can't easily use symbol names as they get stripped very early in our pipeline, and changing that is apparently involved.)

Is there some way to change the pipeline to use the mangled name stored in .symtab instead of the DW_AT_name in the DWARF? My understanding is that the sanitizers already do this.

See "We can't easily use symbol names as they get stripped very early in our pipeline." above – from what I understand, one machine does the build, strips, runs dsymutil to get a .dSYM bundle, and then that is shipped to the crash servers.



Re: rfc: Adding a mode to -gline-tables-only to include parameter names, namespaces

Eric Christopher


On Wed, Apr 29, 2015 at 12:41 PM Nico Weber <[hidden email]> wrote:
On Wed, Apr 29, 2015 at 12:35 PM, Reid Kleckner <[hidden email]> wrote:
On Wed, Apr 29, 2015 at 12:25 PM, Nico Weber <[hidden email]> wrote:
Hi,

the motivation for -gline-tables-only was to make debug info much smaller, but still include enough to get usable stack frames [1]. We recently tried using it in Chromium and discovered that the stack frames aren't all that usable: Function parameters disappear, as do function namespaces.

Are there any concerns about adding a mode to -gline-tables-only (or a second flag -gline-tables-full, or similar) that includes function parameter info and namespace info but still omits most debug info?

(Background: We rely on dsym files to let our crash server symbolize crash dumps we get from the wild. dsymutil uses debug info, but it apparently crashes if debug info is > 4GB. We hit that recently. So we started using -gline-tables-only, but now all our stacks are close to unusable, even though the motivation for gline-tables-only was to still have usable stacks. We can't easily use symbol names as they get stripped very early in our pipeline, and changing that is apparently involved.)

Is there some way to change the pipeline to use the mangled name stored in .symtab instead of the DW_AT_name in the DWARF? My understanding is that the sanitizers already do this.

See "We can't easily use symbol names as they get stripped very early in our pipeline." above – from what I understand, one machine does the build, strips, runs dsymutil to get a .dSYM bundle, and then that is shipped to the crash servers.


(adding Greg for dsymutil issues, but I don't think there's much in the way of solving it)
Parameter names are something we could probably add. I'd like to see the impact of it from a size perspective. I mean, next people will want locations of parameters. Then... and we're right back to just full debug info.

-eric


Re: rfc: Adding a mode to -gline-tables-only to include parameter names, namespaces

Nico Weber
On Wed, Apr 29, 2015 at 12:47 PM, Eric Christopher <[hidden email]> wrote:


On Wed, Apr 29, 2015 at 12:41 PM Nico Weber <[hidden email]> wrote:
On Wed, Apr 29, 2015 at 12:35 PM, Reid Kleckner <[hidden email]> wrote:
On Wed, Apr 29, 2015 at 12:25 PM, Nico Weber <[hidden email]> wrote:
Hi,

the motivation for -gline-tables-only was to make debug info much smaller, but still include enough to get usable stack frames [1]. We recently tried using it in Chromium and discovered that the stack frames aren't all that usable: Function parameters disappear, as do function namespaces.

Are there any concerns about adding a mode to -gline-tables-only (or a second flag -gline-tables-full, or similar) that includes function parameter info and namespace info but still omits most debug info?

(Background: We rely on dsym files to let our crash server symbolize crash dumps we get from the wild. dsymutil uses debug info, but it apparently crashes if debug info is > 4GB. We hit that recently. So we started using -gline-tables-only, but now all our stacks are close to unusable, even though the motivation for gline-tables-only was to still have usable stacks. We can't easily use symbol names as they get stripped very early in our pipeline, and changing that is apparently involved.)

Is there some way to change the pipeline to use the mangled name stored in .symtab instead of the DW_AT_name in the DWARF? My understanding is that the sanitizers already do this.

See "We can't easily use symbol names as they get stripped very early in our pipeline." above – from what I understand, one machine does the build, strips, runs dsymutil to get a .dSYM bundle, and then that is shipped to the crash servers.


(adding Greg for dsymutil issues, but I don't think there's much in the way of solving it) 

(the rdar is 20695512)
 
Parameter names are something we could probably add. I'd like to see the impact of it from a size perspective. I mean, next people will want locations of parameters. Then... and we're right back to just full debug info.

-eric



Re: rfc: Adding a mode to -gline-tables-only to include parameter names, namespaces

Adrian Prantl
In reply to this post by Nico Weber

On Apr 29, 2015, at 12:25 PM, Nico Weber <[hidden email]> wrote:

Hi,

the motivation for -gline-tables-only was to make debug info much smaller, but still include enough to get usable stack frames [1]. We recently tried using it in Chromium and discovered that the stack frames aren't all that usable: Function parameters disappear, as do function namespaces.

Are there any concerns about adding a mode to -gline-tables-only (or a second flag -gline-tables-full, or similar) that includes function parameter info and namespace info but still omits most debug info?

I’m not convinced that the resulting debug info will be dramatically smaller than the full debug info. The largest part of the debug info is the type information; if we are going to emit function parameters, that will probably pull in the majority of the types in the program.
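A toy Python sketch of the concern: if describing a parameter means describing its type, everything that type refers to comes along too. The type graph and the `reachable_types` helper are hypothetical, and the sketch assumes parameter types are emitted in full rather than as bare names.

```python
# Hypothetical type graph: each type maps to the types its members refer to.
type_refs = {
    "Request": ["URL", "Headers"],
    "URL": ["String"],
    "Headers": ["Map", "String"],
    "Map": [],
    "String": [],
    "Unrelated": [],
}


def reachable_types(roots, refs):
    """Return every type reachable from the given root types."""
    seen = set()
    stack = list(roots)
    while stack:
        t = stack.pop()
        if t in seen:
            continue
        seen.add(t)
        stack.extend(refs.get(t, []))
    return seen


# A single function taking a Request parameter drags in four more types.
needed = reachable_types(["Request"], type_refs)
```

Here one parameter of type Request requires five of the six types in the graph; only the type that no parameter touches is left out.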


(Background: We rely on dsym files to let our crash server symbolize crash dumps we get from the wild. dsymutil uses debug info, but it apparently crashes if debug info is > 4GB.

The problem here is that neither llvm nor dsymutil understand the 64-bit DWARF format. Note that the llvm-dsymutil that is being developed will be able to do ODR-based type uniquing for C++, which should also provide enough savings to make this go well under the 4GB mark. 
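For context, a minimal Python sketch of how the two DWARF formats are told apart: each unit starts with an "initial length" field, and in the 32-bit format that field (and every section offset) is 4 bytes, which is where the 4GB ceiling comes from. The function name is my own, and the sketch assumes little-endian data.

```python
import struct


def parse_initial_length(data):
    """Return (format, unit_length, header_size) for a little-endian unit."""
    (first,) = struct.unpack_from("<I", data, 0)
    if first == 0xFFFFFFFF:
        # Escape value: the real length follows as a 64-bit integer.
        (length,) = struct.unpack_from("<Q", data, 4)
        return ("DWARF64", length, 12)
    if first >= 0xFFFFFFF0:
        # 0xfffffff0 through 0xfffffffe are reserved by the DWARF standard.
        raise ValueError("reserved initial length value")
    return ("DWARF32", first, 4)
```

A tool that only handles the 32-bit form has no way to express lengths or offsets of 4GB and above.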

We hit that recently. So we started using -gline-tables-only, but now all our stacks are close to unusable, even though the motivation for gline-tables-only was to still have usable stacks. We can't easily use symbol names as they get stripped very early in our pipeline, and changing that is apparently involved.)


-- adrian



Re: rfc: Adding a mode to -gline-tables-only to include parameter names, namespaces

Duncan P. N. Exon Smith

> On 2015-Apr-29, at 13:03, Adrian Prantl <[hidden email]> wrote:
>
>
>> On Apr 29, 2015, at 12:25 PM, Nico Weber <[hidden email]> wrote:
>>
>> Hi,
>>
>> the motivation for -gline-tables-only was to make debug info much smaller, but still include enough to get usable stack frames [1]. We recently tried using it in Chromium and discovered that the stack frames aren't all that usable: Function parameters disappear, as do function namespaces.

This bothers me too, although Adrian has convinced me that
filename and line number are sufficient for triaging (even if they
aren't convenient).

>
>>
>> Are there any concerns about adding a mode to -gline-tables-only (or a second flag -gline-tables-full, or similar) that includes function parameter info and namespace info but still omits most debug info?
>

FWIW, I'd be in favour of having a mode that includes scopes.
They were ripped out in r220408:

> Author: dblaikie
> Date: Wed Oct 22 14:34:33 2014
> New Revision: 220408
>
> URL: http://llvm.org/viewvc/llvm-project?rev=220408&view=rev
> Log:
> DebugInfo: Omit scopes in -gmlt to reduce metadata size (on disk and in memory)
>
> I haven't done any actual impact analysis of this change as it's a
> strict improvement, but I'd be curious to know how much it helps.

I'd appreciate them for triaging clang crashes, which I do a fair bit
of internally.  (We don't use -g because I haven't made it scale with
-flto yet.)

Would including just namespaces/scopes be good enough for your use
case?  (Adrian's point below seems valid -- parameters kind of pull
in everything.)

> I’m not convinced that the resulting debug info will be dramatically smaller than the full debug info. The largest part of the debug info is the type information; if we are going to emit function parameters, that will probably pull in the majority of the types in the program.
>
>>
>> (Background: We rely on dsym files to let our crash server symbolize crash dumps we get from the wild. dsymutil uses debug info, but it apparently crashes if debug info is > 4GB.
>
> The problem here is that neither llvm nor dsymutil understand the 64-bit DWARF format. Note that the llvm-dsymutil that is being developed will be able to do ODR-based type uniquing for C++, which should also provide enough savings to make this go well under the 4GB mark.
>
>> We hit that recently. So we started using -gline-tables-only, but now all our stacks are close to unusable, even though the motivation for gline-tables-only was to still have usable stacks. We can't easily use symbol names as they get stripped very early in our pipeline, and changing that is apparently involved.)
>
>
> -- adrian
>>
>> Thanks,
>> Nico
>>
>>
>> 1: http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20120423/056674.html
>



Re: rfc: Adding a mode to -gline-tables-only to include parameter names, namespaces

Adrian Prantl
In reply to this post by Adrian Prantl

On Apr 29, 2015, at 1:03 PM, Adrian Prantl <[hidden email]> wrote:


On Apr 29, 2015, at 12:25 PM, Nico Weber <[hidden email]> wrote:

Hi,

the motivation for -gline-tables-only was to make debug info much smaller, but still include enough to get usable stack frames [1]. We recently tried using it in Chromium and discovered that the stack frames aren't all that usable: Function parameters disappear, as do function namespaces.

Are there any concerns about adding a mode to -gline-tables-only (or a second flag -gline-tables-full, or similar) that includes function parameter info and namespace info but still omits most debug info?

I’m not convinced that the resulting debug info will be dramatically smaller than the full debug info. The largest part of the debug info is the type information; if we are going to emit function parameters, that will probably pull in the majority of the types in the program.


(Background: We rely on dsym files to let our crash server symbolize crash dumps we get from the wild. dsymutil uses debug info, but it apparently crashes if debug info is > 4GB.

The problem here is that neither llvm nor dsymutil understand the 64-bit DWARF format. Note that the llvm-dsymutil that is being developed will be able to do ODR-based type uniquing for C++, which should also provide enough savings to make this go well under the 4GB mark. 

As a stopgap until all of Fred’s patches are in, you could also try building with -fstandalone-debug. It will make LLDB unhappy but the result will still be much better than -gline-tables-only.

-- adrian

We hit that recently. So we started using -gline-tables-only, but now all our stacks are close to unusable, even though the motivation for gline-tables-only was to still have usable stacks. We can't easily use symbol names as they get stripped very early in our pipeline, and changing that is apparently involved.)


-- adrian


Re: rfc: Adding a mode to -gline-tables-only to include parameter names, namespaces

Nico Weber
On Wed, Apr 29, 2015 at 1:25 PM, Adrian Prantl <[hidden email]> wrote:

On Apr 29, 2015, at 1:03 PM, Adrian Prantl <[hidden email]> wrote:


On Apr 29, 2015, at 12:25 PM, Nico Weber <[hidden email]> wrote:

Hi,

the motivation for -gline-tables-only was to make debug info much smaller, but still include enough to get usable stack frames [1]. We recently tried using it in Chromium and discovered that the stack frames aren't all that usable: Function parameters disappear, as do function namespaces.

Are there any concerns about adding a mode to -gline-tables-only (or a second flag -gline-tables-full, or similar) that includes function parameter info and namespace info but still omits most debug info?

I’m not convinced that the resulting debug info will be dramatically smaller than the full debug info. The largest part of the debug info is the type information; if we are going to emit function parameters, that will probably pull in the majority of the types in the program.


(Background: We rely on dsym files to let our crash server symbolize crash dumps we get from the wild. dsymutil uses debug info, but it apparently crashes if debug info is > 4GB.

The problem here is that neither llvm nor dsymutil understand the 64-bit DWARF format. Note that the llvm-dsymutil that is being developed will be able to do ODR-based type uniquing for C++, which should also provide enough savings to make this go well under the 4GB mark. 

As a stopgap until all of Fred’s patches are in, you could also try building with -fstandalone-debug.

Isn't that on by default on Darwin?
 
It will make LLDB unhappy but the result will still be much better than -gline-tables-only.

-- adrian

We hit that recently. So we started using -gline-tables-only, but now all our stacks are close to unusable, even though the motivation for gline-tables-only was to still have usable stacks. We can't easily use symbol names as they get stripped very early in our pipeline, and changing that is apparently involved.)


-- adrian


Re: rfc: Adding a mode to -gline-tables-only to include parameter names, namespaces

Eric Christopher-3


On Wed, Apr 29, 2015 at 1:30 PM Nico Weber <[hidden email]> wrote:
On Wed, Apr 29, 2015 at 1:25 PM, Adrian Prantl <[hidden email]> wrote:

On Apr 29, 2015, at 1:03 PM, Adrian Prantl <[hidden email]> wrote:


On Apr 29, 2015, at 12:25 PM, Nico Weber <[hidden email]> wrote:

Hi,

the motivation for -gline-tables-only was to make debug info much smaller, but still include enough to get usable stack frames [1]. We recently tried using it in Chromium and discovered that the stack frames aren't all that usable: Function parameters disappear, as do function namespaces.

Are there any concerns about adding a mode to -gline-tables-only (or a second flag -gline-tables-full, or similar) that includes function parameter info and namespace info but still omits most debug info?

I’m not convinced that the resulting debug info will be dramatically smaller than the full debug info. The largest part of the debug info is the type information; if we are going to emit function parameters, that will probably pull in the majority of the types in the program.


(Background: We rely on dsym files to let our crash server symbolize crash dumps we get from the wild. dsymutil uses debug info, but it apparently crashes if debug info is > 4GB.

The problem here is that neither llvm nor dsymutil understand the 64-bit DWARF format. Note that the llvm-dsymutil that is being developed will be able to do ODR-based type uniquing for C++, which should also provide enough savings to make this go well under the 4GB mark. 

As a stopgap until all of Fred’s patches are in, you could also try building with -fstandalone-debug.

Isn't that on by default on Darwin?
 

Other way. It's off by default on darwin.

-eric
 
It will make LLDB unhappy but the result will still be much better than -gline-tables-only.

-- adrian

We hit that recently. So we started using -gline-tables-only, but now all our stacks are close to unusable, even though the motivation for gline-tables-only was to still have usable stacks. We can't easily use symbol names as they get stripped very early in our pipeline, and changing that is apparently involved.)


-- adrian


Re: rfc: Adding a mode to -gline-tables-only to include parameter names, namespaces

Adrian Prantl
In reply to this post by Nico Weber

On Apr 29, 2015, at 1:30 PM, Nico Weber <[hidden email]> wrote:

On Wed, Apr 29, 2015 at 1:25 PM, Adrian Prantl <[hidden email]> wrote:

On Apr 29, 2015, at 1:03 PM, Adrian Prantl <[hidden email]> wrote:


On Apr 29, 2015, at 12:25 PM, Nico Weber <[hidden email]> wrote:

Hi,

the motivation for -gline-tables-only was to make debug info much smaller, but still include enough to get usable stack frames [1]. We recently tried using it in Chromium and discovered that the stack frames aren't all that usable: Function parameters disappear, as do function namespaces.

Are there any concerns about adding a mode to -gline-tables-only (or a second flag -gline-tables-full, or similar) that includes function parameter info and namespace info but still omits most debug info?

I’m not convinced that the resulting debug info will be dramatically smaller than the full debug info. The largest part of the debug info is the type information; if we are going to emit function parameters, that will probably pull in the majority of the types in the program.


(Background: We rely on dsym files to let our crash server symbolize crash dumps we get from the wild. dsymutil uses debug info, but it apparently crashes if debug info is > 4GB.

The problem here is that neither llvm nor dsymutil understand the 64-bit DWARF format. Note that the llvm-dsymutil that is being developed will be able to do ODR-based type uniquing for C++, which should also provide enough savings to make this go well under the 4GB mark. 

As a stopgap until all of Fred’s patches are in, you could also try building with -fstandalone-debug.

Isn't that on by default on Darwin?

It is of course. I meant to say -fno-standalone-debug.

 
It will make LLDB unhappy but the result will still be much better than -gline-tables-only.


-- adrian

We hit that recently. So we started using -gline-tables-only, but now all our stacks are close to unusable, even though the motivation for gline-tables-only was to still have usable stacks. We can't easily use symbol names as they get stripped very early in our pipeline, and changing that is apparently involved.)


-- adrian


Re: rfc: Adding a mode to -gline-tables-only to include parameter names, namespaces

Eric Christopher-3


On Wed, Apr 29, 2015 at 1:33 PM Adrian Prantl <[hidden email]> wrote:
On Apr 29, 2015, at 1:30 PM, Nico Weber <[hidden email]> wrote:

On Wed, Apr 29, 2015 at 1:25 PM, Adrian Prantl <[hidden email]> wrote:

On Apr 29, 2015, at 1:03 PM, Adrian Prantl <[hidden email]> wrote:


On Apr 29, 2015, at 12:25 PM, Nico Weber <[hidden email]> wrote:

Hi,

the motivation for -gline-tables-only was to make debug info much smaller, but still include enough to get usable stack frames [1]. We recently tried using it in Chromium and discovered that the stack frames aren't all that usable: Function parameters disappear, as do function namespaces.

Are there any concerns about adding a mode to -gline-tables-only (or a second flag -gline-tables-full, or similar) that includes function parameter info and namespace info but still omits most debug info?

I’m not convinced that the resulting debug info will be dramatically smaller than the full debug info. The largest part of the debug info is the type information; if we are going to emit function parameters, that will probably pull in the majority of the types in the program.


(Background: We rely on dsym files to let our crash server symbolize crash dumps we get from the wild. dsymutil uses debug info, but it apparently crashes if debug info is > 4GB.

The problem here is that neither llvm nor dsymutil understand the 64-bit DWARF format. Note that the llvm-dsymutil that is being developed will be able to do ODR-based type uniquing for C++, which should also provide enough savings to make this go well under the 4GB mark. 

As a stopgap until all of Fred’s patches are in, you could also try building with -fstandalone-debug.

Isn't that on by default on Darwin?

It is of course. I meant to say -fno-standalone-debug.

... as I did the same thing Adrian did. :)

-eric
 

 
It will make LLDB unhappy but the result will still be much better than -gline-tables-only.


-- adrian

We hit that recently. So we started using -gline-tables-only, but now all our stacks are close to unusable, even though the motivation for gline-tables-only was to still have usable stacks. We can't easily use symbol names as they get stripped very early in our pipeline, and changing that is apparently involved.)


-- adrian


Re: rfc: Adding a mode to -gline-tables-only to include parameter names, namespaces

Greg Clayton
In reply to this post by Eric Christopher
Actually parameters and their types and locations would greatly help out. This would increase debug info size, but it would also keep me from having to detect borked DWARF and having to avoid parsing any of it in LLDB. Right now the -gline-tables-only will quickly make LLDB assert and die if we try to use this debug info because we try to convert a function into a real clang::ASTContext type and clang will quickly assert and kill the debug session when it becomes unhappy with us trying to create functions with no params or return types (think "operator<" with no params... Boom. Crash).

It is too bad that DWARF requires compile units just so we can have line tables. If it wasn't for this, we could just emit a .debug_line section with no .debug_info.

More comments inline below.

> On Apr 29, 2015, at 12:47 PM, Eric Christopher <[hidden email]> wrote:
>
>
>
> On Wed, Apr 29, 2015 at 12:41 PM Nico Weber <[hidden email]> wrote:
> On Wed, Apr 29, 2015 at 12:35 PM, Reid Kleckner <[hidden email]> wrote:
> On Wed, Apr 29, 2015 at 12:25 PM, Nico Weber <[hidden email]> wrote:
> Hi,
>
> the motivation for -gline-tables-only was to make debug info much smaller, but still include enough to get usable stack frames [1]. We recently tried using it in Chromium and discovered that the stack frames aren't all that usable: Function parameters disappear, as do function namespaces.
>
> Are there any concerns about adding a mode to -gline-tables-only (or a second flag -gline-tables-full, or similar) that includes function parameter info and namespace info but still omits most debug info?

I would like this due to reasons mentioned above.
>
> (Background: We rely on dsym files to let our crash server symbolize crash dumps we get from the wild. dsymutil uses debug info, but it apparently crashes if debug info is > 4GB. We hit that recently. So we started using -gline-tables-only, but now all our stacks are close to unusable, even though the motivation for gline-tables-only was to still have usable stacks. We can't easily use symbol names as they get stripped very early in our pipeline, and changing that is apparently involved.)

You can still recover function bounds using the LC_FUNCTION_STARTS load command even if you have a stripped executable. You won't have the function names, but you can still backtrace and you can get accurate function bound info for backtracing.
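As a sketch of how that data could be read: as far as I understand the format, the LC_FUNCTION_STARTS payload is a zero-terminated list of ULEB128-encoded deltas, the first relative to the start of the __TEXT segment. The Python below decodes such a byte string; the helper names are my own and the sample bytes in the usage note are made up.

```python
def read_uleb128(data, pos):
    """Decode one ULEB128 value starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def function_starts(data, text_base):
    """Decode ULEB128 deltas into absolute function start addresses."""
    addresses = []
    address = text_base
    pos = 0
    while pos < len(data):
        delta, pos = read_uleb128(data, pos)
        if delta == 0:
            break  # a zero delta terminates the list
        address += delta
        addresses.append(address)
    return addresses
```

For example, `function_starts(bytes([0x80, 0x20, 0x40, 0x00]), 0x100000000)` yields two function starts, at 0x100001000 and 0x100001040. Each function can then be assumed to end where the next one begins.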

>
> Is there some way to change the pipeline to use the mangled name stored in .symtab instead of the DW_AT_name in the DWARF? My understanding is that the sanitizers already do this.

They probably just get the info from the symbol table.

>
> See "We can't easily use symbol names as they get stripped very early in our pipeline." above – from what I understand, one machine does the build, strips, runs dsymutil to get a .dSYM bundle, and then that is shipped to the crash servers.

It better strip _after_ you run dsymutil to make the dSYM or you will get empty dSYM files if the debug map is stripped. It really depends on what kind of strip you are doing.

>
>
> (adding Greg for dsymutil issues, but I don't think there's much in the way of solving it)
> Parameter names are something we could probably add. I'd like to see the impact of it from a size perspective. I mean, next people will want locations of parameters. Then... and we're right back to just full debug info.

My main issue with -gline-tables-only (which LLDB doesn't support currently due to the partial DWARF that is emitted that causes clang to assert and kill the debugger) is that we now have DWARF that we can't trust to use when creating types because all functions are potentially invalid to create in a clang::ASTContext since everything has been removed. Clang has the notion that certain types of functions must have a certain number of arguments, like C++ operators, and it gets unhappy if we try to convert this DWARF into clang::AST types. We have a radar for clang to somehow mark the debug info as "-gline-tables-only" so we can know that any DW_TAG_subprogram tags we run into need to be converted to clang AST types that return "UnknownAnyTy" and take varargs as the parameters so that they will be callable from the expression parser. We still might run into names that can't be used as function names, especially if the namespaces and class decl contexts are removed, so LLDB might have to just say "I am not parsing any DWARF from anything that had -gline-tables-only enabled..".

Greg Clayton



Re: rfc: Adding a mode to -gline-tables-only to include parameter names, namespaces

Alexey Samsonov-2
In reply to this post by Nico Weber


I agree that if you don't see fully qualified function names and parameter types for non-inlined functions, understanding stack traces is hard... (from my experience, they are especially useful for template functions).
However, -gline-tables-only is kind of designed with the assumption that you *have* the symbol table (I think this was also the case for the early GCC -gmlt proposal), which is not the case in your pipeline.

"Changing that is apparently involved" - what do you mean by this?

I think we need to collect the data: the binary size increase after adding full linkage names and parameter types, for SPEC and for Chromium. If it's significantly larger than current -gline-tables-only, but much smaller than full debug info (which I believe would be the case), then introducing one more flag makes sense to me.
 

--
Alexey Samsonov

Re: rfc: Adding a mode to -gline-tables-only to include parameter names, namespaces

Robinson, Paul-3
In reply to this post by Greg Clayton
> -----Original Message-----
> From: [hidden email] [mailto:[hidden email]] On
> Behalf Of Greg Clayton
> Sent: Wednesday, April 29, 2015 1:55 PM
> To: Eric Christopher
> Cc: Eric Christopher; Alexey Samsonov; David Blaikie; [hidden email]
> Developers
> Subject: Re: [cfe-dev] rfc: Adding a mode to -gline-tables-only to include
> parameter names, namespaces
>
> Actually parameters and their types and locations would greatly help out.
> This would increase debug info size, but it would also keep me from having
> to detect borked DWARF and having to avoid parsing any of it in LLDB.
> Right now the -gline-tables-only will quickly make LLDB assert and die if
> we try to use this debug info because we try to convert a function into a
> real clang::ASTContext type and clang will quickly assert and kill the
> debug session when it becomes unhappy with us trying to create functions
> with no params or return types (think "operator<" with no params... Boom.
> Crash).
>
> It is too bad that DWARF requires compile units just so we can have line
> tables. If it wasn't for this, we could just emit a .debug_line section
> with no .debug_info.

Hmmm why was that, again?  I can think of three possibilities, none of
which should be insurmountable:
1) CU has the base directory and filename of the compilation; .debug_line
can include these directly, if we know it's -gline-tables-only.
2) CU has the use_utf8 flag; just assume UTF-8.
3) CU points to the chunk of .debug_line for this CU; we can probably
work out how to reliably subdivide .debug_line without that assist.

Thanks,
--paulr


Re: rfc: Adding a mode to -gline-tables-only to include parameter names, namespaces

David Blaikie


On Wed, Apr 29, 2015 at 3:23 PM, Robinson, Paul <[hidden email]> wrote:

> Hmmm why was that, again?  I can think of three possibilities, none of
> which should be insurmountable:
> 1) CU has the base directory and filename of the compilation; .debug_line
> can include these directly, if we know it's -gline-tables-only.
> 2) CU has the use_utf8 flag; just assume UTF-8.
> 3) CU points to the chunk of .debug_line for this CU; we can probably
> work out how to reliably subdivide .debug_line without that assist.

I don't think (3) is an issue - the line tables probably have their length in them.

4) pointer size - it's specified by the CU header, but not in the line table header
5) inlining info (this is pretty much the major point/issue with -gmlt) is not included in the line table. Cary's two-level line tables address this, though they're not implemented in LLVM. It's the minimal debug_info describing function inlining that confuses LLDB.
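On point (3), a rough sketch of how a reader could subdivide a .debug_line section using only the length field at the start of each line table. It assumes `data` holds the raw section bytes with tables laid out back to back; the function name and the demo bytes are made up for illustration, and nothing beyond the length and version fields is parsed.

```python
import struct
from typing import List, Tuple


def split_line_tables(data: bytes, little_endian: bool = True) -> List[Tuple[int, int, int]]:
    """Return (offset, total_size, version) for each line table in `data`."""
    end = "<" if little_endian else ">"
    tables = []
    pos = 0
    while pos < len(data):
        (length,) = struct.unpack_from(end + "I", data, pos)
        if length == 0xFFFFFFFF:
            # 64-bit DWARF: escape value followed by an 8-byte length.
            (length,) = struct.unpack_from(end + "Q", data, pos + 4)
            header = 12
        else:
            header = 4
        # The 2-byte version immediately follows the length field.
        (version,) = struct.unpack_from(end + "H", data, pos + header)
        total = header + length  # the length excludes the length field itself
        tables.append((pos, total, version))
        pos += total
    return tables


# Two fake tables: length 6 (version 4 + 4 payload bytes), then length 2 (version 3).
demo = struct.pack("<IH", 6, 4) + b"\x00" * 4 + struct.pack("<IH", 2, 3)
print(split_line_tables(demo))
```

The walk never consults .debug_info, which is the property point (3) relies on.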

- David
 


Re: rfc: Adding a mode to -gline-tables-only to include parameter names, namespaces

Robinson, Paul-3

> 4) pointer size - it's specified by the CU header, but not in the line table header

 

(beats head on table) gotta have that to interpret DW_LNE_set_address correctly, afaict that's the *only* reason .debug_line would need it.  Feh. But basically you always need that, just to get started.
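For what it's worth, an extended opcode in the line program is encoded as a zero byte, a ULEB128 length, the sub-opcode, and then the operands, so a reader could in principle infer the address width from that length rather than from the CU header. The sketch below shows only that inference; the helper names and sample bytes are made up, and it is not a claim about how any particular consumer behaves.

```python
from typing import Tuple

DW_LNE_set_address = 0x02  # extended opcode number from the DWARF spec


def read_uleb128(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode one ULEB128 value; return (value, position after it)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result, pos
        shift += 7


def decode_set_address(buf: bytes, pos: int = 0, little_endian: bool = True) -> Tuple[int, int]:
    """Decode one DW_LNE_set_address; return (address, inferred_address_size)."""
    if buf[pos] != 0x00:
        raise ValueError("not an extended opcode")
    length, pos = read_uleb128(buf, pos + 1)
    if buf[pos] != DW_LNE_set_address:
        raise ValueError("not DW_LNE_set_address")
    addr_size = length - 1  # the length covers the sub-opcode byte plus the address
    raw = buf[pos + 1 : pos + 1 + addr_size]
    return int.from_bytes(raw, "little" if little_endian else "big"), addr_size


# Made-up example: an 8-byte address, so the encoded length is 9.
sample = bytes([0x00, 0x09, 0x02]) + (0x100000F00).to_bytes(8, "little")
address, size = decode_set_address(sample)
print(hex(address), size)
```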

 

> 5) inlining info

 

For conjuring up fake frames in the traceback, presumably?  That sort of thing always felt like the debugger was lying to me, personally.

--paulr

 


Re: rfc: Adding a mode to -gline-tables-only to include parameter names, namespaces

David Blaikie


On Wed, Apr 29, 2015 at 4:08 PM, Robinson, Paul <[hidden email]> wrote:

> > 4) pointer size - it's specified by the CU header, but not in the line table header
>
> (beats head on table) gotta have that to interpret DW_LNE_set_address correctly, afaict that's the *only* reason .debug_line would need it.  Feh. But basically you always need that, just to get started.
>
> > 5) inlining info
>
> For conjuring up fake frames in the traceback, presumably?

Yep

> That sort of thing always felt like the debugger was lying to me, personally.


The alternative is a frame that looks like this:

outer_function() : inner_function_location.cpp:42

which tends to confuse people (& is ambiguous - which call to inner_function from outer_function are we in?).

If I'm looking at an ASan report, or walking through code in a debugger, etc., it's often/usually not important that code has been inlined (just as it's not important that a variable has been put in a register or a stack slot, etc.). I just want to think about the code as though it wasn't optimized, insofar as I can.
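A toy model of the difference, assuming the symbolizer has something like DW_TAG_inlined_subroutine entries with DW_AT_call_file / DW_AT_call_line available. The `Scope` class, the call site outer.cpp:7 and the helper name are invented for illustration; real DWARF consumers are considerably more involved.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Scope:
    function: str
    call_file: Optional[str] = None   # where this inlined copy was called from
    call_line: Optional[int] = None
    parent: Optional["Scope"] = None  # enclosing function; None if not inlined


def frames_for(scope: Optional[Scope], file: str, line: int) -> List[str]:
    """Expand one physical frame into logical frames, innermost first."""
    frames = []
    while scope is not None:
        frames.append(f"{scope.function}() : {file}:{line}")
        # The next frame out is located at the call site of this one.
        file, line = scope.call_file, scope.call_line
        scope = scope.parent
    return frames


outer = Scope("outer_function")
inner = Scope("inner_function", call_file="outer.cpp", call_line=7, parent=outer)
for frame in frames_for(inner, "inner_function_location.cpp", 42):
    print(frame)
```

Without the inlining records, all that can be reported is the single ambiguous frame shown above; with them, the call site is recovered as its own frame.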
 


Re: rfc: Adding a mode to -gline-tables-only to include parameter names, namespaces

Nico Weber
In reply to this post by Adrian Prantl
On Wed, Apr 29, 2015 at 1:33 PM, Adrian Prantl <[hidden email]> wrote:

On Apr 29, 2015, at 1:30 PM, Nico Weber <[hidden email]> wrote:

On Wed, Apr 29, 2015 at 1:25 PM, Adrian Prantl <[hidden email]> wrote:

On Apr 29, 2015, at 1:03 PM, Adrian Prantl <[hidden email]> wrote:



I’m not convinced that the resulting debug info will be dramatically smaller than the full debug info. The largest part of the debug info is the type information; if we are going to emit function parameters, that will probably pull in the majority of the types in the program.


(Background: We rely on dsym files to let our crash server symbolize crash dumps we get from the wild. dsymutil uses debug info, but it apparently crashes if debug info is > 4GB.

The problem here is that neither LLVM nor dsymutil understands the 64-bit DWARF format. Note that the llvm-dsymutil that is being developed will be able to do ODR-based type uniquing for C++, which should also provide enough savings to make this go well under the 4GB mark.
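
To illustrate the 4GB limit mentioned above: in the 32-bit DWARF format a unit's initial length is a 4-byte field and section offsets are 4 bytes wide, so nothing past 4GB can be addressed. The 64-bit format announces itself with the escape value 0xffffffff followed by an 8-byte length. Below is a minimal C++ sketch of that check. The names `read_initial_length`, `UnitLength` and `DwarfFormat` are hypothetical and chosen for illustration; this is not code from LLVM or dsymutil, and it assumes the bytes are already in host byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Which DWARF container format a unit uses, judged by its initial length field.
enum class DwarfFormat { Dwarf32, Dwarf64 };

struct UnitLength {
    DwarfFormat format;
    std::uint64_t length;  // number of bytes in the unit after the length field
};

// Decode the initial length field at the start of a unit header.
// Assumption: `data` holds the field in host byte order.
// Returns false if the buffer is too short or the value is reserved.
bool read_initial_length(const unsigned char* data, std::size_t size, UnitLength& out) {
    assert(data != nullptr);
    if (size < 4) return false;

    std::uint32_t first = 0;
    std::memcpy(&first, data, 4);

    if (first < 0xfffffff0u) {
        // 32-bit DWARF: the 4-byte value is the length itself.
        out.format = DwarfFormat::Dwarf32;
        out.length = first;
        return true;
    }
    if (first != 0xffffffffu) return false;  // 0xfffffff0..0xfffffffe are reserved

    // 64-bit DWARF: the escape value is followed by an 8-byte length.
    if (size < 12) return false;
    std::uint64_t wide = 0;
    std::memcpy(&wide, data + 4, 8);
    out.format = DwarfFormat::Dwarf64;
    out.length = wide;
    return true;
}
```

A consumer that only implements the `Dwarf32` branch has no way to represent a length or offset above 4GB, which is consistent with the failure described in this thread.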

As a stopgap until all of Fred’s patches are in, you could also try building with -fstandalone-debug.

Isn't that on by default on Darwin?

It is of course. I meant to say -fno-standalone-debug.

Looks like this might do the trick for us, thanks! We'll try using this, and if everything's happy with that flag we'll hope that by the time we hit this again, dsymutil is more reliable. (-fno-standalone-debug reduces dSYM size from 4GB to 1.7GB, so this likely won't happen in the next few years.)
 

 
It will make LLDB unhappy, but the result will still be much better than -gline-tables-only.


-- adrian

We hit that recently. So we started using -gline-tables-only, but now all our stacks are close to unusable, even though the motivation for -gline-tables-only was to still have usable stacks. We can't easily use symbol names as they get stripped very early in our pipeline, and changing that is apparently involved.)


-- adrian


Re: rfc: Adding a mode to -gline-tables-only to include parameter names, namespaces

Frédéric Riss
In reply to this post by Adrian Prantl
[reviving an old thread, sorry. Still catching up]
 
On Apr 29, 2015, at 1:03 PM, Adrian Prantl <[hidden email]> wrote:


On Apr 29, 2015, at 12:25 PM, Nico Weber <[hidden email]> wrote:

Hi,

the motivation for -gline-tables-only was to make debug info much smaller, but still include enough to get usable stack frames [1]. We recently tried using it in Chromium and discovered that the stack frames aren't all that usable: Function parameters disappear, as do function namespaces.

Are there any concerns about adding a mode to -gline-tables-only (or a second flag -gline-tables-full, or similar) that includes function parameter info and namespace info but still omits most debug info?

I’m not convinced that the resulting debug info will be dramatically smaller than the full debug info. The largest part of the debug info is the type information; if we are going to emit function parameters, that will probably pull in the majority of the types in the program.

For the sake of backtraces, we could emit only forward declarations of the parameter types. This would allow a debugger to show you if a pointer/reference is null, which is a pretty important piece of information when debugging crashes. In this case I think that the location information would be much bigger than the type info. Hard to even give an estimate of what this represents though…

Fred
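
As a source-level analogy for the point about forward declarations (a sketch only, not a description of what Clang emits): a pointer to an incomplete type can still be compared against null, so a declaration without members is enough to report whether a pointer argument was null. In DWARF terms this corresponds to a type entry marked with DW_AT_declaration. The names `app::Widget` and `app::frame_line` below are hypothetical.

```cpp
#include <cassert>
#include <string>

namespace app {

struct Widget;  // forward declaration only: no members and no size are known

// Build the kind of line a backtrace could still show for a pointer
// parameter whose pointee type is incomplete. The pointer is only
// compared against null and is never dereferenced.
std::string frame_line(const char* function, const Widget* w) {
    assert(function != nullptr);
    return std::string(function) + "(w=" + (w ? "non-null" : "null") + ")";
}

}  // namespace app
```

Calling `app::frame_line("app::Draw", nullptr)` yields `app::Draw(w=null)`. Showing the members of `Widget` would require the full type definition, which is the part of the debug info this proposal would leave out.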


(Background: We rely on dsym files to let our crash server symbolize crash dumps we get from the wild. dsymutil uses debug info, but it apparently crashes if debug info is > 4GB.

The problem here is that neither LLVM nor dsymutil understands the 64-bit DWARF format. Note that the llvm-dsymutil that is being developed will be able to do ODR-based type uniquing for C++, which should also provide enough savings to make this go well under the 4GB mark.

We hit that recently. So we started using -gline-tables-only, but now all our stacks are close to unusable, even though the motivation for -gline-tables-only was to still have usable stacks. We can't easily use symbol names as they get stripped very early in our pipeline, and changing that is apparently involved.)


-- adrian


