Support for selectively enabling profile instrumentation only for certain files/functions

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

Support for selectively enabling profile instrumentation only for certain files/functions

Fangrui Song via cfe-dev
We'd like to implement support for selectively enabling profile instrumentation only for certain files/functions.

The motivation is collecting code coverage information in presubmit testing, where we want to avoid instrumenting the entire build which would introduce a lot of overhead. Rather, we only want to instrument files/functions that were modified by the patch by passing this information to Clang.

Clang already has -fprofile-filter-files= and -fprofile-exclude-files=, but those flags are currently only supported by GCOV. Furthermore, they only work at the granularity of entire files. I'm also not a fan of using flags which increases the size of the command line as you keep adding more files.

The solution I'm considering for source-code coverage is use the special case list (see https://github.com/llvm/llvm-project/blob/master/llvm/include/llvm/Support/SpecialCaseList.h) which is already used by sanitizers and XRay and seems like a great fit for this case. Concretely, we would add a new flag, e.g. -fprofile-flter=path/to/special/case/list. The file would have the following format:

[include]
src:src:/path/to/source/*
fun:MyFooBar
[exclude]
src:src:/path/to/source/file.c


Does this sound reasonable?

Related question is whether this option should apply to both AST and IR based instrumentation and where to perform the filtering. We could do the filtering either when inserting the instrumentation instructions, i.e. in PGOInstrumentation.cpp and CodeGenPGO.cpp, or when lowering these instructions InstrProfiling.cpp (where we would just discard these instructions). The advantage of the latter approach is that we only need to implement the filtering once and it'd support both AST and IR based instrumentation, but it also means that we would unnecessarily insert extra instructions only to drop them later.

Does anyone have any preference or suggestion?

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Support for selectively enabling profile instrumentation only for certain files/functions

Fangrui Song via cfe-dev


On Mar 31, 2020, at 3:31 PM, Petr Hosek via cfe-dev <[hidden email]> wrote:

We'd like to implement support for selectively enabling profile instrumentation only for certain files/functions.

The motivation is collecting code coverage information in presubmit testing, where we want to avoid instrumenting the entire build which would introduce a lot of overhead. Rather, we only want to instrument files/functions that were modified by the patch by passing this information to Clang.

Clang already has -fprofile-filter-files= and -fprofile-exclude-files=, but those flags are currently only supported by GCOV. Furthermore, they only work at the granularity of entire files. I'm also not a fan of using flags which increases the size of the command line as you keep adding more files.

The solution I'm considering for source-code coverage is use the special case list (see https://github.com/llvm/llvm-project/blob/master/llvm/include/llvm/Support/SpecialCaseList.h) which is already used by sanitizers and XRay and seems like a great fit for this case. Concretely, we would add a new flag, e.g. -fprofile-flter=path/to/special/case/list. The file would have the following format:

[include]
src:src:/path/to/source/*
fun:MyFooBar
[exclude]
src:src:/path/to/source/file.c


Does this sound reasonable?

This sounds like a reasonable plan to me.

Related question is whether this option should apply to both AST and IR based instrumentation and where to perform the filtering.

Is there a specific use case for filtering files when applying IR based instrumentation, or would this just be for the sake of completeness?

We could do the filtering either when inserting the instrumentation instructions, i.e. in PGOInstrumentation.cpp and CodeGenPGO.cpp, or when lowering these instructions InstrProfiling.cpp (where we would just discard these instructions). The advantage of the latter approach is that we only need to implement the filtering once and it'd support both AST and IR based instrumentation, but it also means that we would unnecessarily insert extra instructions only to drop them later.

Does anyone have any preference or suggestion?

It is probably cleaner to avoid emitting instrumentation for filtered-out files as early as possible. If the filtering is implemented late, then e.g. with source-based coverage unnecessary coverage mappings may be embedded in the binary. With the recent format change this might not necessarily pose a scaling problem, but it doesn’t seem ideal.

For IR PGO instrumentation, one alternative is to add an attribute to functions that should be instrumented (c.f. the “sanitize_address” attribute used by ASan). Then, the filtering can be done very early. Later, the instrumentation pass would skip functions without the attribute. I’m not sure I recall correctly, but in an old review I think it was suggested that this is a prerequisite for removing the Optional<PGOOptions> from PassBuilder.

best,
vedant


_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev


_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Support for selectively enabling profile instrumentation only for certain files/functions

Fangrui Song via cfe-dev
In reply to this post by Fangrui Song via cfe-dev
The format of the filtering file sounds good to me.

In terms of training runtime performance (with instrumented binary), this feature probably won't help much as we don't want skip hot functions, but skipping cold functions does not make much difference. However there is one case where it is useful. Some users may hit the size limit of the linking step, so skipping known cold functions helps in that case.

Having an attribute as suggested by vedant is one way to handle this -- but I think it is better to negate attribute -- by default all functions will be instrumented, but functions with 'no-instrument' attribute will be skipped.

David


On Tue, Mar 31, 2020 at 3:31 PM Petr Hosek <[hidden email]> wrote:
We'd like to implement support for selectively enabling profile instrumentation only for certain files/functions.

The motivation is collecting code coverage information in presubmit testing, where we want to avoid instrumenting the entire build which would introduce a lot of overhead. Rather, we only want to instrument files/functions that were modified by the patch by passing this information to Clang.

Clang already has -fprofile-filter-files= and -fprofile-exclude-files=, but those flags are currently only supported by GCOV. Furthermore, they only work at the granularity of entire files. I'm also not a fan of using flags which increases the size of the command line as you keep adding more files.

The solution I'm considering for source-code coverage is use the special case list (see https://github.com/llvm/llvm-project/blob/master/llvm/include/llvm/Support/SpecialCaseList.h) which is already used by sanitizers and XRay and seems like a great fit for this case. Concretely, we would add a new flag, e.g. -fprofile-flter=path/to/special/case/list. The file would have the following format:

[include]
src:src:/path/to/source/*
fun:MyFooBar
[exclude]
src:src:/path/to/source/file.c


Does this sound reasonable?

Related question is whether this option should apply to both AST and IR based instrumentation and where to perform the filtering. We could do the filtering either when inserting the instrumentation instructions, i.e. in PGOInstrumentation.cpp and CodeGenPGO.cpp, or when lowering these instructions InstrProfiling.cpp (where we would just discard these instructions). The advantage of the latter approach is that we only need to implement the filtering once and it'd support both AST and IR based instrumentation, but it also means that we would unnecessarily insert extra instructions only to drop them later.

Does anyone have any preference or suggestion?

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev