MatchFinder returning many duplicate matches

Tom Stellard via cfe-dev
Hello,

I have a tool that runs over all TUs in a project and matches all function declarations.
These declarations are then stored in a database for later processing.

My problem is that the MatchFinder callback returns many duplicates of the same
function, probably due to the use of templates in the function decl.

I've provided a sample of duplicate USRs for the from_json function in a test project.

from_json c:@N@nlohmann@S@adl_serializer>#{n5C#v@FT@>2#T#Tfrom_json#&&t0.0#&t0.1# #S
from_json c:@N@nlohmann@S@adl_serializer>#{n6C#v@FT@>2#T#Tfrom_json#&&t0.0#&t0.1# #S
from_json c:@N@nlohmann@S@adl_serializer>#*1C#v@FT@>2#T#Tfrom_json#&&t0.0#&t0.1# #S
from_json c:@N@nlohmann@S@adl_serializer>#{n7C#v@FT@>2#T#Tfrom_json#&&t0.0#&t0.1# #S
from_json c:@N@nlohmann@S@adl_serializer>#{n10C#v@FT@>2#T#Tfrom_json#&&t0.0#&t0.1# #S
from_json c:@N@nlohmann@S@adl_serializer>#{n8C#v@FT@>2#T#Tfrom_json#&&t0.0#&t0.1# #S
from_json c:@N@nlohmann@S@adl_serializer>#b#v@FT@>2#T#Tfrom_json#&&t0.0#&t0.1# #S

Based on the USRs, it seems like the MatchFinder considers each template instantiation
a separate declaration, whereas I'd like it to return a single function decl that is templated.

Is there a way to eliminate these duplicates during the matching stage?
I have attached a minimal code example below.

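#include "clang/ASTMatchers/ASTMatchFinder.h"
#include "clang/ASTMatchers/ASTMatchers.h"
#include "clang/Index/USRGeneration.h"
#include "clang/Tooling/CompilationDatabase.h"
#include "clang/Tooling/Tooling.h"
#include "llvm/ADT/SmallString.h"
#include <cstdio>
#include <string>

using namespace clang::ast_matchers;

class FunctionMatcher : public MatchFinder::MatchCallback {
public:
    // Match function declarations that are not definitions, implicit, in
    // namespace std, in system headers, or template instantiations/specializations.
    DeclarationMatcher matcher = functionDecl(unless(anyOf(isDefinition(),
                                                        isImplicit(),
                                                        isInStdNamespace(),
                                                        isExpansionInSystemHeader(),
                                                        isTemplateInstantiation(),
                                                        isExplicitTemplateSpecialization())))
                                    .bind("function");
    void run(const MatchFinder::MatchResult& Result) override {
        const auto* res = Result.Nodes.getNodeAs<clang::FunctionDecl>("function");
        // Ignore invalid matches
        if (res == nullptr) {
            return;
        }

        // Print function name and USR (generateUSRForDecl returns true on failure)
        llvm::SmallString<128> USR;
        if (!clang::index::generateUSRForDecl(res, USR)) {
           printf("%s %s\n", res->getNameAsString().c_str(), USR.c_str());
        }
    }
};

int main() {
    FunctionMatcher FunctionFinder;
    MatchFinder Finder;
    Finder.addMatcher(FunctionFinder.matcher, &FunctionFinder);

    // loadFromDirectory reports failures through a std::string out-parameter.
    std::string err_msg;
    auto cmpdb = clang::tooling::CompilationDatabase::loadFromDirectory("test", err_msg);
    if (!cmpdb) {
        fprintf(stderr, "%s\n", err_msg.c_str());
        return 1;
    }
    clang::tooling::ClangTool t(*cmpdb, cmpdb->getAllFiles());
    return t.run(clang::tooling::newFrontendActionFactory(&Finder).get());
}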

Thanks.


_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

Re: MatchFinder returning many duplicate matches

Tom Stellard via cfe-dev
If I understand correctly, you want to match what's usually referred to as the "primary template": the program element that is the template itself, not any instantiation of it:

    template <int i> void f() {}

The language's wording draws a sharp distinction here: X-templates are not Xs; only their instantiations are.
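To make that distinction concrete, here is a small sketch in plain C++ (no Clang involved). It uses a variant of the f above that returns its template argument, which is my own change for illustration. The template itself has no function type; each instantiation does, and each is a separate entity.

```cpp
#include <type_traits>

// A function template: a recipe for functions, not a function itself.
// (Illustrative variant of f that returns its template argument.)
template <int I>
int f() { return I; }

// Each instantiation names one concrete function of type int().
static_assert(std::is_function<decltype(f<1>)>::value,
              "f<1> is a function");
static_assert(std::is_same<decltype(f<1>), int()>::value,
              "f<1> has type int()");
static_assert(std::is_same<decltype(f<2>), int()>::value,
              "f<2> has the same type but is a different function");
```

This is why a matcher that visits instantiations can report one declaration per set of template arguments for what reads as a single declaration in the source.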

You could try using functionTemplateDecl() for the match. If I understand this correctly, it'll give you the primary templates.

If you also want the explicit specializations, there's the predicate isExplicitTemplateSpecialization(), which you already use as a filter. That predicate applies to the "concrete decl" (as opposed to the template), because explicit specializations are concrete, existing code that needs no further instantiation.

After a glance or two, I have no idea why your current matcher still spews out so many nodes.

;;
;; Whisperity.

On Tue, 19 Nov 2019, 08:56 E via cfe-dev, <[hidden email]> wrote:
[quoted original message snipped]

Re: MatchFinder returning many duplicate matches

Tom Stellard via cfe-dev
Yes, I'm looking to match the primary template instead of its instantiations.
Agreed, it's bizarre that isExplicitTemplateSpecialization() and
isTemplateInstantiation() don't filter these out.

I also thought about filtering out all templates in the function matcher and then
having a separate matcher for only functionTemplateDecl(). It looks like
there isn't a specific matcher to filter out templated functions, but perhaps
I could do this by dyn_casting the match result to a FunctionTemplateDecl
and checking whether the result is valid. Not a clean solution, though.
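Whichever check ends up in the callback, repeats of the same declaration (for example, a header declaration seen again from another TU) can be dropped by remembering the USRs already stored. Below is a minimal sketch, assuming the callback hands over the generated USR as a std::string. SeenUSRs is a hypothetical helper, not a Clang API. It only removes exact repeats, so it would not merge the from_json entries above, whose USRs differ.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical helper (not part of Clang): remembers USRs already recorded.
class SeenUSRs {
public:
    // Returns true the first time a USR is offered, false for exact repeats.
    bool insert(const std::string& usr) { return seen_.insert(usr).second; }

    // Number of distinct USRs recorded so far.
    std::size_t size() const { return seen_.size(); }

private:
    std::unordered_set<std::string> seen_;
};
```

In the callback, the database write would then be guarded by something like `if (seen.insert(USR.c_str())) { ... }`, with `seen` kept as a member of the callback object so that it persists across TUs.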

If someone could take another glance at this I'd appreciate it.

Thanks.

> On Nov 19, 2019, at 22:56, Whisperity <[hidden email]> wrote:
> [quoted reply and original message snipped]