[Clang-Sema] Renaming function and refactoring the formatting function handling code

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

[Clang-Sema] Renaming function and refactoring the formatting function handling code

Deep Majumder via cfe-dev
Hey guys,

I'm working on a proposal to ISO WG14 for C2x to add support for char16_t and char32_t string and character specifiers for the printf and scanf family of functions.

In working on this proposal for WG14, I came across an issue in Clang.

the problem is that C11 defines char16_t and char32_t as typedefs in uchar.h; while C++11 defines them as new builtin types.

The real problem with this difference, is that the printf and scanf string checking code in Sema is built with the implicit assumption that any string type passed to a printf or scanf family function will be a built in type.

for example, isAnyCharacterType() in the class clang::Type, only checks builtin types.

so when C code is compiled, char16_t and char32_t are not seen as valid string types and Clang errors out.

So to fix this mess, I've been implementing a new function called isTypedefCharacterType(ASTContext &AST) in clang:Type which will use getLangOpts() in ASTContext to check the language mode, and if we're compiling in C mode, it will desugar the type all the way down until it finds a typedef for char16_t or char32_t.

But I still have a few questions for the community.

1: How should wchar_t be treated when it's been disabled as a builtin type?

2: isAnyCharacterType() as a name does not reflect what it actually does, it only checks builtin types, I've renamed this to isBuiltinCharacterType() in my fork, there are 17 instances throughout the LLVM codebase of "isAnyCharacterType", is it ok to change this name, and if so is my choice of isBuiltinCharacterType acceptable?

3: Any other comments o concerns?

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: [Clang-Sema] Renaming function and refactoring the formatting function handling code

Deep Majumder via cfe-dev


On Feb 4, 2021, at 9:17 AM, MLJ1991 via cfe-dev <[hidden email]> wrote:

Hey guys,

I'm working on a proposal to ISO WG14 for C2x to add support for char16_t and char32_t string and character specifiers for the printf and scanf family of functions.

In working on this proposal for WG14, I came across an issue in Clang.

the problem is that C11 defines char16_t and char32_t as typedefs in uchar.h; while C++11 defines them as new builtin types.

The real problem with this difference, is that the printf and scanf string checking code in Sema is built with the implicit assumption that any string type passed to a printf or scanf family function will be a built in type.

for example, isAnyCharacterType() in the class clang::Type, only checks builtin types.

so when C code is compiled, char16_t and char32_t are not seen as valid string types and Clang errors out.

So to fix this mess, I've been implementing a new function called isTypedefCharacterType(ASTContext &AST) in clang:Type which will use getLangOpts() in ASTContext to check the language mode, and if we're compiling in C mode, it will desugar the type all the way down until it finds a typedef for char16_t or char32_t.

But I still have a few questions for the community.

1: How should wchar_t be treated when it's been disabled as a builtin type?

2: isAnyCharacterType() as a name does not reflect what it actually does, it only checks builtin types, I've renamed this to isBuiltinCharacterType() in my fork, there are 17 instances throughout the LLVM codebase of "isAnyCharacterType", is it ok to change this name, and if so is my choice of isBuiltinCharacterType acceptable?

IMO you should keep it as a single function, but change the signature to `isAnyCharacterType(LangOptions &LangOpts)` — I would imagine most callers of `isAnyCharacterType` have the same expectations you did about the meaning of that function, and since you have recognized that the answer is language dependent, I imagine you would be doing people a favor/fixing bugs by forcing them to specify the language.


3: Any other comments o concerns?
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev


_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev