Avoiding unnecessary calls to Decl::getASTContext

classic Classic list List threaded Threaded
11 messages Options
Reply | Threaded
Open this post in threaded view
|

Avoiding unnecessary calls to Decl::getASTContext

Richard Smith via cfe-dev
Hi all,

I have been looking at Decl::getASTContext and DeclContext::getParentASTContext,
and it turns out that they are not that cheap.

The reason for this is that they must walk back to the translation unit declaration
along what is essentially a linked list. This is amplified by the fact that it does
look like a simple getter and therefore is called extremely often (the most egregious
case by far being DeclContext::lookup), even when most of the times the context is
already easily available.

To put some numbers on this I experimented with removing about half of the iterations
in Decl::getTranslationUnitDecl. This gives a speed up of about 1% when parsing all
of Boost. Extrapolating from this I get that about 1.5%-2% of the time spent
parsing Boost is spent getting the ASTContext, which seems a bit unfortunate.

Now to be clear 1.5%-2% is not that much, but the fix is simply to pass a ref to
the ASTContext more often as a function parameter. The downside is that this would
cause a some amount of churn in the C++ api. To be clear I am *not* suggesting
removing all of the calls to Decl::getASTContext, but only removing the ones causing
the most iterations in Decl::getTranslationUnitDecl (say the top 10 or 20).

What do you think ?

Bruno
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Avoiding unnecessary calls to Decl::getASTContext

Richard Smith via cfe-dev
Seems reasonable and some profile data would help motivate/support/justify which bits are worth changing and which aren't

On Sat, Dec 15, 2018, 8:27 AM Bruno Ricci via cfe-dev <[hidden email] wrote:
Hi all,

I have been looking at Decl::getASTContext and DeclContext::getParentASTContext,
and it turns out that they are not that cheap.

The reason for this is that they must walk back to the translation unit declaration
along what is essentially a linked list. This is amplified by the fact that it does
look like a simple getter and therefore is called extremely often (the most egregious
case by far being DeclContext::lookup), even when most of the times the context is
already easily available.

To put some numbers on this I experimented with removing about half of the iterations
in Decl::getTranslationUnitDecl. This gives a speed up of about 1% when parsing all
of Boost. Extrapolating from this I get that about 1.5%-2% of the time spent
parsing Boost is spent getting the ASTContext, which seems a bit unfortunate.

Now to be clear 1.5%-2% is not that much, but the fix is simply to pass a ref to
the ASTContext more often as a function parameter. The downside is that this would
cause a some amount of churn in the C++ api. To be clear I am *not* suggesting
removing all of the calls to Decl::getASTContext, but only removing the ones causing
the most iterations in Decl::getTranslationUnitDecl (say the top 10 or 20).

What do you think ?

Bruno
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Avoiding unnecessary calls to Decl::getASTContext

Richard Smith via cfe-dev
I have added the profile data to the first patch (https://reviews.llvm.org/D55658)

The top 20 callers of getASTContext and getParentASTContext are:
(with the number of calls to getTranslationUnitDecl, edited for readability)

5399586 : DeclContext::lookup
801684 : VarDecl::getMemberSpecializationInfo
551928 : FunctionDecl::getMinRequiredArguments
401464 : CXXRecordDecl::lookupInBases
338614 : Decl::getAttrs
311651 : DeclContext::makeDeclVisibleInContextImpl
173684 : Decl::getASTMutationListener
164264 : CXXConstructorDecl::isSpecializationCopyingObject
154066 : CXXMethodDecl::size_overridden_methods
141622 : CXXRecordDecl::getVisibleConversionFunctions
120440 : ClassTemplateSpecializationDecl::Profile
116303 : VarDecl::isThisDeclarationADefinition
106208 : CXXRecordDecl::conversion_begin
106208 : CXXRecordDecl::conversion_end
99075 : FunctionDecl::setParams
95910 : RedeclarableTemplateDecl::findSpecializationImpl
57043 : CXXRecordDecl::getDestructor
56668 : VarDecl::getDefinition
52140 : TagDecl::startDefinition
48266 : CXXConstructorDecl::isCopyOrMoveConstructor
37407 : FunctionDecl::getBody

Bruno

On 15/12/2018 16:38, David Blaikie wrote:

> Seems reasonable and some profile data would help motivate/support/justify which bits are worth changing and which aren't
>
> On Sat, Dec 15, 2018, 8:27 AM Bruno Ricci via cfe-dev <[hidden email] <mailto:[hidden email]> wrote:
>
>     Hi all,
>
>     I have been looking at Decl::getASTContext and DeclContext::getParentASTContext,
>     and it turns out that they are not that cheap.
>
>     The reason for this is that they must walk back to the translation unit declaration
>     along what is essentially a linked list. This is amplified by the fact that it does
>     look like a simple getter and therefore is called extremely often (the most egregious
>     case by far being DeclContext::lookup), even when most of the times the context is
>     already easily available.
>
>     To put some numbers on this I experimented with removing about half of the iterations
>     in Decl::getTranslationUnitDecl. This gives a speed up of about 1% when parsing all
>     of Boost. Extrapolating from this I get that about 1.5%-2% of the time spent
>     parsing Boost is spent getting the ASTContext, which seems a bit unfortunate.
>
>     Now to be clear 1.5%-2% is not that much, but the fix is simply to pass a ref to
>     the ASTContext more often as a function parameter. The downside is that this would
>     cause a some amount of churn in the C++ api. To be clear I am *not* suggesting
>     removing all of the calls to Decl::getASTContext, but only removing the ones causing
>     the most iterations in Decl::getTranslationUnitDecl (say the top 10 or 20).
>
>     What do you think ?
>
>     Bruno
>     _______________________________________________
>     cfe-dev mailing list
>     [hidden email] <mailto:[hidden email]>
>     http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Avoiding unnecessary calls to Decl::getASTContext

Richard Smith via cfe-dev
Ah, I was hoping/looking for more of a timing/cpu-counter type profile, to demonstrate the overall impact (% runtime) of these changes, if possible?

On Sat, Dec 15, 2018 at 9:57 AM Bruno Ricci <[hidden email]> wrote:
I have added the profile data to the first patch (https://reviews.llvm.org/D55658)

The top 20 callers of getASTContext and getParentASTContext are:
(with the number of calls to getTranslationUnitDecl, edited for readability)

5399586 : DeclContext::lookup
801684 : VarDecl::getMemberSpecializationInfo
551928 : FunctionDecl::getMinRequiredArguments
401464 : CXXRecordDecl::lookupInBases
338614 : Decl::getAttrs
311651 : DeclContext::makeDeclVisibleInContextImpl
173684 : Decl::getASTMutationListener
164264 : CXXConstructorDecl::isSpecializationCopyingObject
154066 : CXXMethodDecl::size_overridden_methods
141622 : CXXRecordDecl::getVisibleConversionFunctions
120440 : ClassTemplateSpecializationDecl::Profile
116303 : VarDecl::isThisDeclarationADefinition
106208 : CXXRecordDecl::conversion_begin
106208 : CXXRecordDecl::conversion_end
99075 : FunctionDecl::setParams
95910 : RedeclarableTemplateDecl::findSpecializationImpl
57043 : CXXRecordDecl::getDestructor
56668 : VarDecl::getDefinition
52140 : TagDecl::startDefinition
48266 : CXXConstructorDecl::isCopyOrMoveConstructor
37407 : FunctionDecl::getBody

Bruno

On 15/12/2018 16:38, David Blaikie wrote:
> Seems reasonable and some profile data would help motivate/support/justify which bits are worth changing and which aren't
>
> On Sat, Dec 15, 2018, 8:27 AM Bruno Ricci via cfe-dev <[hidden email] <mailto:[hidden email]> wrote:
>
>     Hi all,
>
>     I have been looking at Decl::getASTContext and DeclContext::getParentASTContext,
>     and it turns out that they are not that cheap.
>
>     The reason for this is that they must walk back to the translation unit declaration
>     along what is essentially a linked list. This is amplified by the fact that it does
>     look like a simple getter and therefore is called extremely often (the most egregious
>     case by far being DeclContext::lookup), even when most of the times the context is
>     already easily available.
>
>     To put some numbers on this I experimented with removing about half of the iterations
>     in Decl::getTranslationUnitDecl. This gives a speed up of about 1% when parsing all
>     of Boost. Extrapolating from this I get that about 1.5%-2% of the time spent
>     parsing Boost is spent getting the ASTContext, which seems a bit unfortunate.
>
>     Now to be clear 1.5%-2% is not that much, but the fix is simply to pass a ref to
>     the ASTContext more often as a function parameter. The downside is that this would
>     cause a some amount of churn in the C++ api. To be clear I am *not* suggesting
>     removing all of the calls to Decl::getASTContext, but only removing the ones causing
>     the most iterations in Decl::getTranslationUnitDecl (say the top 10 or 20).
>
>     What do you think ?
>
>     Bruno
>     _______________________________________________
>     cfe-dev mailing list
>     [hidden email] <mailto:[hidden email]>
>     http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Avoiding unnecessary calls to Decl::getASTContext

Richard Smith via cfe-dev
In reply to this post by Richard Smith via cfe-dev
On Sat, 15 Dec 2018, 09:57 Bruno Ricci via cfe-dev <[hidden email] wrote:
I have added the profile data to the first patch (https://reviews.llvm.org/D55658)

The top 20 callers of getASTContext and getParentASTContext are:
(with the number of calls to getTranslationUnitDecl, edited for readability)

5399586 : DeclContext::lookup
801684 : VarDecl::getMemberSpecializationInfo
551928 : FunctionDecl::getMinRequiredArguments
401464 : CXXRecordDecl::lookupInBases
338614 : Decl::getAttrs
311651 : DeclContext::makeDeclVisibleInContextImpl
173684 : Decl::getASTMutationListener
164264 : CXXConstructorDecl::isSpecializationCopyingObject
154066 : CXXMethodDecl::size_overridden_methods
141622 : CXXRecordDecl::getVisibleConversionFunctions
120440 : ClassTemplateSpecializationDecl::Profile
116303 : VarDecl::isThisDeclarationADefinition
106208 : CXXRecordDecl::conversion_begin
106208 : CXXRecordDecl::conversion_end
99075 : FunctionDecl::setParams
95910 : RedeclarableTemplateDecl::findSpecializationImpl
57043 : CXXRecordDecl::getDestructor
56668 : VarDecl::getDefinition
52140 : TagDecl::startDefinition
48266 : CXXConstructorDecl::isCopyOrMoveConstructor
37407 : FunctionDecl::getBody

Given your data that perhaps 1.5-2% of the time building boost is finding the AST context, it definitely seems worthwhile to avoid doing so for lookup. The next three seem reasonable to address too, if you feel so inclined, but I'd not go any further than that -- the long tail seems negligible in comparison.

Bruno

On 15/12/2018 16:38, David Blaikie wrote:
> Seems reasonable and some profile data would help motivate/support/justify which bits are worth changing and which aren't
>
> On Sat, Dec 15, 2018, 8:27 AM Bruno Ricci via cfe-dev <[hidden email] <mailto:[hidden email]> wrote:
>
>     Hi all,
>
>     I have been looking at Decl::getASTContext and DeclContext::getParentASTContext,
>     and it turns out that they are not that cheap.
>
>     The reason for this is that they must walk back to the translation unit declaration
>     along what is essentially a linked list. This is amplified by the fact that it does
>     look like a simple getter and therefore is called extremely often (the most egregious
>     case by far being DeclContext::lookup), even when most of the times the context is
>     already easily available.
>
>     To put some numbers on this I experimented with removing about half of the iterations
>     in Decl::getTranslationUnitDecl. This gives a speed up of about 1% when parsing all
>     of Boost. Extrapolating from this I get that about 1.5%-2% of the time spent
>     parsing Boost is spent getting the ASTContext, which seems a bit unfortunate.
>
>     Now to be clear 1.5%-2% is not that much, but the fix is simply to pass a ref to
>     the ASTContext more often as a function parameter. The downside is that this would
>     cause a some amount of churn in the C++ api. To be clear I am *not* suggesting
>     removing all of the calls to Decl::getASTContext, but only removing the ones causing
>     the most iterations in Decl::getTranslationUnitDecl (say the top 10 or 20).
>
>     What do you think ?
>
>     Bruno
>     _______________________________________________
>     cfe-dev mailing list
>     [hidden email] <mailto:[hidden email]>
>     http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Avoiding unnecessary calls to Decl::getASTContext

Richard Smith via cfe-dev
In reply to this post by Richard Smith via cfe-dev
On 16/12/2018 16:48, David Blaikie wrote:
> Ah, I was hoping/looking for more of a timing/cpu-counter type profile, to demonstrate the overall impact (% runtime) of these changes, if possible?

The estimation I got is that about 1.5%-2% of the time spent doing an
fsyntax-only on all of Boost is spent obtaining the AST context.

This estimation was obtained by removing a fraction of the iterations in
Decl::getTranslationUnitDecl and extrapolating from this. The timings were
obtained by doing 20 iterations of fsyntax-only with perf stat both before
and after the changes.

This should probably be taken with a grain of salt since this is likely
dependent on the hardware and cpu usage, and since it is in general not
easy to reliably measure small timing differences.

Bruno

>
> On Sat, Dec 15, 2018 at 9:57 AM Bruno Ricci <[hidden email] <mailto:[hidden email]>> wrote:
>
>     I have added the profile data to the first patch (https://reviews.llvm.org/D55658)
>
>     The top 20 callers of getASTContext and getParentASTContext are:
>     (with the number of calls to getTranslationUnitDecl, edited for readability)
>
>     5399586 : DeclContext::lookup
>     801684 : VarDecl::getMemberSpecializationInfo
>     551928 : FunctionDecl::getMinRequiredArguments
>     401464 : CXXRecordDecl::lookupInBases
>     338614 : Decl::getAttrs
>     311651 : DeclContext::makeDeclVisibleInContextImpl
>     173684 : Decl::getASTMutationListener
>     164264 : CXXConstructorDecl::isSpecializationCopyingObject
>     154066 : CXXMethodDecl::size_overridden_methods
>     141622 : CXXRecordDecl::getVisibleConversionFunctions
>     120440 : ClassTemplateSpecializationDecl::Profile
>     116303 : VarDecl::isThisDeclarationADefinition
>     106208 : CXXRecordDecl::conversion_begin
>     106208 : CXXRecordDecl::conversion_end
>     99075 : FunctionDecl::setParams
>     95910 : RedeclarableTemplateDecl::findSpecializationImpl
>     57043 : CXXRecordDecl::getDestructor
>     56668 : VarDecl::getDefinition
>     52140 : TagDecl::startDefinition
>     48266 : CXXConstructorDecl::isCopyOrMoveConstructor
>     37407 : FunctionDecl::getBody
>
>     Bruno
>
>     On 15/12/2018 16:38, David Blaikie wrote:
>     > Seems reasonable and some profile data would help motivate/support/justify which bits are worth changing and which aren't
>     >
>     > On Sat, Dec 15, 2018, 8:27 AM Bruno Ricci via cfe-dev <[hidden email] <mailto:[hidden email]> <mailto:[hidden email] <mailto:[hidden email]>> wrote:
>     >
>     >     Hi all,
>     >
>     >     I have been looking at Decl::getASTContext and DeclContext::getParentASTContext,
>     >     and it turns out that they are not that cheap.
>     >
>     >     The reason for this is that they must walk back to the translation unit declaration
>     >     along what is essentially a linked list. This is amplified by the fact that it does
>     >     look like a simple getter and therefore is called extremely often (the most egregious
>     >     case by far being DeclContext::lookup), even when most of the times the context is
>     >     already easily available.
>     >
>     >     To put some numbers on this I experimented with removing about half of the iterations
>     >     in Decl::getTranslationUnitDecl. This gives a speed up of about 1% when parsing all
>     >     of Boost. Extrapolating from this I get that about 1.5%-2% of the time spent
>     >     parsing Boost is spent getting the ASTContext, which seems a bit unfortunate.
>     >
>     >     Now to be clear 1.5%-2% is not that much, but the fix is simply to pass a ref to
>     >     the ASTContext more often as a function parameter. The downside is that this would
>     >     cause a some amount of churn in the C++ api. To be clear I am *not* suggesting
>     >     removing all of the calls to Decl::getASTContext, but only removing the ones causing
>     >     the most iterations in Decl::getTranslationUnitDecl (say the top 10 or 20).
>     >
>     >     What do you think ?
>     >
>     >     Bruno
>     >     _______________________________________________
>     >     cfe-dev mailing list
>     >     [hidden email] <mailto:[hidden email]> <mailto:[hidden email] <mailto:[hidden email]>>
>     >     http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>     >
>




_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Avoiding unnecessary calls to Decl::getASTContext

Richard Smith via cfe-dev
In reply to this post by Richard Smith via cfe-dev
On 16/12/2018 18:55, Richard Smith wrote:

> On Sat, 15 Dec 2018, 09:57 Bruno Ricci via cfe-dev <[hidden email] <mailto:[hidden email]> wrote:
>
>     I have added the profile data to the first patch (https://reviews.llvm.org/D55658)
>
>     The top 20 callers of getASTContext and getParentASTContext are:
>     (with the number of calls to getTranslationUnitDecl, edited for readability)
>
>     5399586 : DeclContext::lookup
>     801684 : VarDecl::getMemberSpecializationInfo
>     551928 : FunctionDecl::getMinRequiredArguments
>     401464 : CXXRecordDecl::lookupInBases
>     338614 : Decl::getAttrs
>     311651 : DeclContext::makeDeclVisibleInContextImpl
>     173684 : Decl::getASTMutationListener
>     164264 : CXXConstructorDecl::isSpecializationCopyingObject
>     154066 : CXXMethodDecl::size_overridden_methods
>     141622 : CXXRecordDecl::getVisibleConversionFunctions
>     120440 : ClassTemplateSpecializationDecl::Profile
>     116303 : VarDecl::isThisDeclarationADefinition
>     106208 : CXXRecordDecl::conversion_begin
>     106208 : CXXRecordDecl::conversion_end
>     99075 : FunctionDecl::setParams
>     95910 : RedeclarableTemplateDecl::findSpecializationImpl
>     57043 : CXXRecordDecl::getDestructor
>     56668 : VarDecl::getDefinition
>     52140 : TagDecl::startDefinition
>     48266 : CXXConstructorDecl::isCopyOrMoveConstructor
>     37407 : FunctionDecl::getBody
>
>
> Given your data that perhaps 1.5-2% of the time building boost is finding the AST context, it definitely seems worthwhile to avoid doing so for lookup. The next three seem reasonable to address too, if you feel so inclined, but I'd not go any further than that -- the long tail seems negligible in comparison.

Just to clarify, the 1.5%-2% figure is an estimation for the time spent obtaining the
AST context when parsing all of Boost without doing any code generation (ie: -fsyntax-only).

I generally agree about not bothering too much with the long tail, with perhaps two exceptions:

1.) Some of these are really easy to fix. For example FunctionDecl::setParams is almost only
    called in Sema and so the context is already easily available.

2.) The numbers shown above are the number of calls to getTranslationUnitDecl. A better metric
    would be the number of iterations needed to reach the translation unit declaration. With
    that in mind it would also make sense to pass the AST context to some of the above where
    the declaration is likely to occur nested deep in the declaration context chain
    (VarDecls for example).

If there are no objections I will submit some patches next week to do this change, hopefully
in small increments. In any case this will require small (but straightforward) changes to at
least clang-tools-extra and lldb. (Am I forgetting any project here ?)

Bruno
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Avoiding unnecessary calls to Decl::getASTContext

Richard Smith via cfe-dev
On Sun, 16 Dec 2018, 14:27 Bruno Ricci via cfe-dev <[hidden email] wrote:
On 16/12/2018 18:55, Richard Smith wrote:
> On Sat, 15 Dec 2018, 09:57 Bruno Ricci via cfe-dev <[hidden email] <mailto:[hidden email]> wrote:
>
>     I have added the profile data to the first patch (https://reviews.llvm.org/D55658)
>
>     The top 20 callers of getASTContext and getParentASTContext are:
>     (with the number of calls to getTranslationUnitDecl, edited for readability)
>
>     5399586 : DeclContext::lookup
>     801684 : VarDecl::getMemberSpecializationInfo
>     551928 : FunctionDecl::getMinRequiredArguments
>     401464 : CXXRecordDecl::lookupInBases
>     338614 : Decl::getAttrs
>     311651 : DeclContext::makeDeclVisibleInContextImpl
>     173684 : Decl::getASTMutationListener
>     164264 : CXXConstructorDecl::isSpecializationCopyingObject
>     154066 : CXXMethodDecl::size_overridden_methods
>     141622 : CXXRecordDecl::getVisibleConversionFunctions
>     120440 : ClassTemplateSpecializationDecl::Profile
>     116303 : VarDecl::isThisDeclarationADefinition
>     106208 : CXXRecordDecl::conversion_begin
>     106208 : CXXRecordDecl::conversion_end
>     99075 : FunctionDecl::setParams
>     95910 : RedeclarableTemplateDecl::findSpecializationImpl
>     57043 : CXXRecordDecl::getDestructor
>     56668 : VarDecl::getDefinition
>     52140 : TagDecl::startDefinition
>     48266 : CXXConstructorDecl::isCopyOrMoveConstructor
>     37407 : FunctionDecl::getBody
>
>
> Given your data that perhaps 1.5-2% of the time building boost is finding the AST context, it definitely seems worthwhile to avoid doing so for lookup. The next three seem reasonable to address too, if you feel so inclined, but I'd not go any further than that -- the long tail seems negligible in comparison.

Just to clarify, the 1.5%-2% figure is an estimation for the time spent obtaining the
AST context when parsing all of Boost without doing any code generation (ie: -fsyntax-only).

I generally agree about not bothering too much with the long tail, with perhaps two exceptions:

1.) Some of these are really easy to fix. For example FunctionDecl::setParams is almost only
    called in Sema and so the context is already easily available.

Sure, non-disruptive cleanups for this seem good where possible. (But for example, getAttrs is called from a large number of places and it'd be really awkward to require an ASTContext, so I'd leave that one.)

2.) The numbers shown above are the number of calls to getTranslationUnitDecl. A better metric
    would be the number of iterations needed to reach the translation unit declaration. With
    that in mind it would also make sense to pass the AST context to some of the above where
    the declaration is likely to occur nested deep in the declaration context chain
    (VarDecls for example).

If there are no objections I will submit some patches next week to do this change, hopefully
in small increments. In any case this will require small (but straightforward) changes to at
least clang-tools-extra and lldb. (Am I forgetting any project here ?)

I think it's just those two. Thank you!

Bruno
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Avoiding unnecessary calls to Decl::getASTContext

Richard Smith via cfe-dev
Hi,

Sorry for being late to the discussion.

Do I understand it right that you’d like to use a local DeclContext* variable and make sure it’s identical to S.getASTContext() result (or similar for Parent)?

What do you think about caching ASTContext and/or ParentASTContext in Sema?

Jan

On Dec 17, 2018, at 9:51 AM, Richard Smith via cfe-dev <[hidden email]> wrote:

On Sun, 16 Dec 2018, 14:27 Bruno Ricci via cfe-dev <[hidden email] wrote:
On 16/12/2018 18:55, Richard Smith wrote:

> On Sat, 15 Dec 2018, 09:57 Bruno Ricci via cfe-dev <[hidden email] <mailto:[hidden email]> wrote:
> 
>     I have added the profile data to the first patch (https://reviews.llvm.org/D55658)
> 
>     The top 20 callers of getASTContext and getParentASTContext are:
>     (with the number of calls to getTranslationUnitDecl, edited for readability)
> 
>     5399586 : DeclContext::lookup
>     801684 : VarDecl::getMemberSpecializationInfo
>     551928 : FunctionDecl::getMinRequiredArguments
>     401464 : CXXRecordDecl::lookupInBases
>     338614 : Decl::getAttrs
>     311651 : DeclContext::makeDeclVisibleInContextImpl
>     173684 : Decl::getASTMutationListener
>     164264 : CXXConstructorDecl::isSpecializationCopyingObject
>     154066 : CXXMethodDecl::size_overridden_methods
>     141622 : CXXRecordDecl::getVisibleConversionFunctions
>     120440 : ClassTemplateSpecializationDecl::Profile
>     116303 : VarDecl::isThisDeclarationADefinition
>     106208 : CXXRecordDecl::conversion_begin
>     106208 : CXXRecordDecl::conversion_end
>     99075 : FunctionDecl::setParams
>     95910 : RedeclarableTemplateDecl::findSpecializationImpl
>     57043 : CXXRecordDecl::getDestructor
>     56668 : VarDecl::getDefinition
>     52140 : TagDecl::startDefinition
>     48266 : CXXConstructorDecl::isCopyOrMoveConstructor
>     37407 : FunctionDecl::getBody
> 
> 
> Given your data that perhaps 1.5-2% of the time building boost is finding the AST context, it definitely seems worthwhile to avoid doing so for lookup. The next three seem reasonable to address too, if you feel so inclined, but I'd not go any further than that -- the long tail seems negligible in comparison.

Just to clarify, the 1.5%-2% figure is an estimation for the time spent obtaining the
AST context when parsing all of Boost without doing any code generation (ie: -fsyntax-only).

I generally agree about not bothering too much with the long tail, with perhaps two exceptions:

1.) Some of these are really easy to fix. For example FunctionDecl::setParams is almost only
    called in Sema and so the context is already easily available.

Sure, non-disruptive cleanups for this seem good where possible. (But for example, getAttrs is called from a large number of places and it'd be really awkward to require an ASTContext, so I'd leave that one.)

2.) The numbers shown above are the number of calls to getTranslationUnitDecl. A better metric
    would be the number of iterations needed to reach the translation unit declaration. With
    that in mind it would also make sense to pass the AST context to some of the above where
    the declaration is likely to occur nested deep in the declaration context chain
    (VarDecls for example).

If there are no objections I will submit some patches next week to do this change, hopefully
in small increments. In any case this will require small (but straightforward) changes to at
least clang-tools-extra and lldb. (Am I forgetting any project here ?)

I think it's just those two. Thank you!

Bruno
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Avoiding unnecessary calls to Decl::getASTContext

Richard Smith via cfe-dev
I don't understand the part about caching the AST context in Sema since Sema already store
a reference to the AST context.

What I would like to propose is the pass a reference to the AST context more often as
a function parameter. For example instead of having "lookup_result lookup(DeclarationName Name) const;"
we would have "lookup_result lookup(ASTContext &Context, DeclarationName Name) const;" (with ASTContext
possibly const-qualified).

This is a little bit annoying since it require updating all the callers and propagating the changes
along the call chains. I would only suggest doing this when the changes are not too invasive.
For example Decl::getAttrs is used in a lot of places and requiring the AST context would probably
not be a good idea. (although possibly getAttrs could have two versions, one with a reference to the
AST context and another one without).

The other alternatives I considered are:

1.) Store a reference to the AST context in each DeclContext. This has in my opinion
    an unacceptable memory cost. In addition when I experimented with this the speedup
    obtained from avoiding the lookup of the AST context are almost entirely negated by
    the increased memory usage (this might of course vary depending on the specific test setup).

2.) Cache the AST context in Decl::getASTContext. I did not experiment with this but it seems to
    me that this will not work when there are multiple AST contexts. Or if it can be made to work
    there is a need to store some data in Decl for the invalidation, but Decl has not a single bit
    available.

Bruno

On 02/01/2019 13:33, Jan Korous wrote:

> Hi,
>
> Sorry for being late to the discussion.
>
> Do I understand it right that you’d like to use a local DeclContext* variable and make sure it’s identical to S.getASTContext() result (or similar for Parent)?
>
> What do you think about caching ASTContext and/or ParentASTContext in Sema?
>
> Jan
>
>> On Dec 17, 2018, at 9:51 AM, Richard Smith via cfe-dev <[hidden email] <mailto:[hidden email]>> wrote:
>>
>> On Sun, 16 Dec 2018, 14:27 Bruno Ricci via cfe-dev <[hidden email] <mailto:[hidden email]> wrote:
>>
>>     On 16/12/2018 18:55, Richard Smith wrote:
>>     > On Sat, 15 Dec 2018, 09:57 Bruno Ricci via cfe-dev <[hidden email] <mailto:[hidden email]> <mailto:[hidden email] <mailto:[hidden email]>> wrote:
>>     > 
>>     >     I have added the profile data to the first patch (https://reviews.llvm.org/D55658)
>>     > 
>>     >     The top 20 callers of getASTContext and getParentASTContext are:
>>     >     (with the number of calls to getTranslationUnitDecl, edited for readability)
>>     > 
>>     >     5399586 : DeclContext::lookup
>>     >     801684 : VarDecl::getMemberSpecializationInfo
>>     >     551928 : FunctionDecl::getMinRequiredArguments
>>     >     401464 : CXXRecordDecl::lookupInBases
>>     >     338614 : Decl::getAttrs
>>     >     311651 : DeclContext::makeDeclVisibleInContextImpl
>>     >     173684 : Decl::getASTMutationListener
>>     >     164264 : CXXConstructorDecl::isSpecializationCopyingObject
>>     >     154066 : CXXMethodDecl::size_overridden_methods
>>     >     141622 : CXXRecordDecl::getVisibleConversionFunctions
>>     >     120440 : ClassTemplateSpecializationDecl::Profile
>>     >     116303 : VarDecl::isThisDeclarationADefinition
>>     >     106208 : CXXRecordDecl::conversion_begin
>>     >     106208 : CXXRecordDecl::conversion_end
>>     >     99075 : FunctionDecl::setParams
>>     >     95910 : RedeclarableTemplateDecl::findSpecializationImpl
>>     >     57043 : CXXRecordDecl::getDestructor
>>     >     56668 : VarDecl::getDefinition
>>     >     52140 : TagDecl::startDefinition
>>     >     48266 : CXXConstructorDecl::isCopyOrMoveConstructor
>>     >     37407 : FunctionDecl::getBody
>>     > 
>>     > 
>>     > Given your data that perhaps 1.5-2% of the time building boost is finding the AST context, it definitely seems worthwhile to avoid doing so for lookup. The next three seem reasonable to address too, if you feel so inclined, but I'd not go any further than that -- the long tail seems negligible in comparison.
>>
>>     Just to clarify, the 1.5%-2% figure is an estimation for the time spent obtaining the
>>     AST context when parsing all of Boost without doing any code generation (ie: -fsyntax-only).
>>
>>     I generally agree about not bothering too much with the long tail, with perhaps two exceptions:
>>
>>     1.) Some of these are really easy to fix. For example FunctionDecl::setParams is almost only
>>         called in Sema and so the context is already easily available.
>>
>>
>> Sure, non-disruptive cleanups for this seem good where possible. (But for example, getAttrs is called from a large number of places and it'd be really awkward to require an ASTContext, so I'd leave that one.)
>>
>>     2.) The numbers shown above are the number of calls to getTranslationUnitDecl. A better metric
>>         would be the number of iterations needed to reach the translation unit declaration. With
>>         that in mind it would also make sense to pass the AST context to some of the above where
>>         the declaration is likely to occur nested deep in the declaration context chain
>>         (VarDecls for example).
>>
>>     If there are no objections I will submit some patches next week to do this change, hopefully
>>     in small increments. In any case this will require small (but straightforward) changes to at
>>     least clang-tools-extra and lldb. (Am I forgetting any project here ?)
>>
>>
>> I think it's just those two. Thank you!
>>
>>     Bruno
>>     _______________________________________________
>>     cfe-dev mailing list
>>     [hidden email] <mailto:[hidden email]>
>>     http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>
>> _______________________________________________
>> cfe-dev mailing list
>> [hidden email] <mailto:[hidden email]>
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Avoiding unnecessary calls to Decl::getASTContext

Richard Smith via cfe-dev
Sorry, my bad. I was just thinking whether there’s any more fool-proof way than manual passing of DeclContext but my thoughts derailed.

> On Jan 2, 2019, at 3:31 PM, Bruno Ricci <[hidden email]> wrote:
>
> I don't understand the part about caching the AST context in Sema since Sema already store
> a reference to the AST context.
>
> What I would like to propose is the pass a reference to the AST context more often as
> a function parameter. For example instead of having "lookup_result lookup(DeclarationName Name) const;"
> we would have "lookup_result lookup(ASTContext &Context, DeclarationName Name) const;" (with ASTContext
> possibly const-qualified).
>
> This is a little bit annoying since it require updating all the callers and propagating the changes
> along the call chains. I would only suggest doing this when the changes are not too invasive.
> For example Decl::getAttrs is used in a lot of places and requiring the AST context would probably
> not be a good idea. (although possibly getAttrs could have two versions, one with a reference to the
> AST context and another one without).
>
> The other alternatives I considered are:
>
> 1.) Store a reference to the AST context in each DeclContext. This has in my opinion
>    an unacceptable memory cost. In addition when I experimented with this the speedup
>    obtained from avoiding the lookup of the AST context are almost entirely negated by
>    the increased memory usage (this might of course vary depending on the specific test setup).
>
> 2.) Cache the AST context in Decl::getASTContext. I did not experiment with this but it seems to
>    me that this will not work when there are multiple AST contexts. Or if it can be made to work
>    there is a need to store some data in Decl for the invalidation, but Decl has not a single bit
>    available.
>
> Bruno
>
> On 02/01/2019 13:33, Jan Korous wrote:
>> Hi,
>>
>> Sorry for being late to the discussion.
>>
>> Do I understand it right that you’d like to use a local DeclContext* variable and make sure it’s identical to S.getASTContext() result (or similar for Parent)?
>>
>> What do you think about caching ASTContext and/or ParentASTContext in Sema?
>>
>> Jan
>>
>>> On Dec 17, 2018, at 9:51 AM, Richard Smith via cfe-dev <[hidden email] <mailto:[hidden email]>> wrote:
>>>
>>> On Sun, 16 Dec 2018, 14:27 Bruno Ricci via cfe-dev <[hidden email] <mailto:[hidden email]> wrote:
>>>
>>>    On 16/12/2018 18:55, Richard Smith wrote:
>>>> On Sat, 15 Dec 2018, 09:57 Bruno Ricci via cfe-dev <[hidden email] <mailto:[hidden email]> <mailto:[hidden email] <mailto:[hidden email]>> wrote:
>>>>  
>>>>      I have added the profile data to the first patch (https://reviews.llvm.org/D55658)
>>>>  
>>>>      The top 20 callers of getASTContext and getParentASTContext are:
>>>>      (with the number of calls to getTranslationUnitDecl, edited for readability)
>>>>  
>>>>      5399586 : DeclContext::lookup
>>>>      801684 : VarDecl::getMemberSpecializationInfo
>>>>      551928 : FunctionDecl::getMinRequiredArguments
>>>>      401464 : CXXRecordDecl::lookupInBases
>>>>      338614 : Decl::getAttrs
>>>>      311651 : DeclContext::makeDeclVisibleInContextImpl
>>>>      173684 : Decl::getASTMutationListener
>>>>      164264 : CXXConstructorDecl::isSpecializationCopyingObject
>>>>      154066 : CXXMethodDecl::size_overridden_methods
>>>>      141622 : CXXRecordDecl::getVisibleConversionFunctions
>>>>      120440 : ClassTemplateSpecializationDecl::Profile
>>>>      116303 : VarDecl::isThisDeclarationADefinition
>>>>      106208 : CXXRecordDecl::conversion_begin
>>>>      106208 : CXXRecordDecl::conversion_end
>>>>      99075 : FunctionDecl::setParams
>>>>      95910 : RedeclarableTemplateDecl::findSpecializationImpl
>>>>      57043 : CXXRecordDecl::getDestructor
>>>>      56668 : VarDecl::getDefinition
>>>>      52140 : TagDecl::startDefinition
>>>>      48266 : CXXConstructorDecl::isCopyOrMoveConstructor
>>>>      37407 : FunctionDecl::getBody
>>>>  
>>>>  
>>>> Given your data that perhaps 1.5-2% of the time building boost is finding the AST context, it definitely seems worthwhile to avoid doing so for lookup. The next three seem reasonable to address too, if you feel so inclined, but I'd not go any further than that -- the long tail seems negligible in comparison.
>>>
>>>    Just to clarify, the 1.5%-2% figure is an estimation for the time spent obtaining the
>>>    AST context when parsing all of Boost without doing any code generation (ie: -fsyntax-only).
>>>
>>>    I generally agree about not bothering too much with the long tail, with perhaps two exceptions:
>>>
>>>    1.) Some of these are really easy to fix. For example FunctionDecl::setParams is almost only
>>>        called in Sema and so the context is already easily available.
>>>
>>>
>>> Sure, non-disruptive cleanups for this seem good where possible. (But for example, getAttrs is called from a large number of places and it'd be really awkward to require an ASTContext, so I'd leave that one.)
>>>
>>>    2.) The numbers shown above are the number of calls to getTranslationUnitDecl. A better metric
>>>        would be the number of iterations needed to reach the translation unit declaration. With
>>>        that in mind it would also make sense to pass the AST context to some of the above where
>>>        the declaration is likely to occur nested deep in the declaration context chain
>>>        (VarDecls for example).
>>>
>>>    If there are no objections I will submit some patches next week to do this change, hopefully
>>>    in small increments. In any case this will require small (but straightforward) changes to at
>>>    least clang-tools-extra and lldb. (Am I forgetting any project here ?)
>>>
>>>
>>> I think it's just those two. Thank you!
>>>
>>>    Bruno
>>>    _______________________________________________
>>>    cfe-dev mailing list
>>>    [hidden email] <mailto:[hidden email]>
>>>    http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>
>>> _______________________________________________
>>> cfe-dev mailing list
>>> [hidden email] <mailto:[hidden email]>
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev