Questions/discussions about cast types in clang

classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

Questions/discussions about cast types in clang

Vassil Vassilev via cfe-dev
Hi all,

I'm currently implementing a clang tool to replace C style casts to C++ style casts, i.e. const/static/reinterpret_casts. I'm almost done, but I've ran into a collection of issues/questions that I'd like to raise here to make sure it's not a misunderstanding on my part. My solutions are included in each bullet point:

1. (Question) For dependent types, it is unknown whether static_cast or reinterpret_cast is appropriate for the situation, so I would like to leave that as a C-style cast in the refactor tool. Example below:

template <typename T> void foo() { int* ptr; (T*) ptr; }
If we call foo<int>, it can be a const_cast, but if we call foo<double>, it has to be a reinterpret_cast. In a situation where the user is writing library code, it won't only be relevant to the current translation unit but also for future purposes. Therefore, I propose that we leave dependent type c-style casts as they are.

2. (Issue/bug) In OperationKinds.def, there is an enum detailing casting to union types. This doesn't work in clang(nor GCC). Here's a compiler explorer MVCE: https://godbolt.org/z/rp8DDt. The first statement is not a cast, but rather a CompoundLiteralExpr(InitListExpr(...)), and should not be treated as a cast. Here is the corresponding GCC document: https://gcc.gnu.org/onlinedocs/gcc/Cast-to-Union.html about union casts. If CastKind::CK_ToUnion is ever encountered in my program, I will leave the C-style cast as is.

3. (Question) In general, should this tool for C++ also support OpenCL and Objective C++? Technically, const/static/reinterpret_casts exist in those languages (see https://godbolt.org/z/DEz8Rs for example). In my opinion, I think the casting tool should strictly adhere to C++ as the language of support and nothing more for the first iteration because different coding standards may be held for different languages(and I only have ample experience with C++). As a result, I propose that the first iteration of the casting tool be placed outside of clang-tidy as clang-tidy also supports Objective C (but not OpenCL), as seen here: https://clang.llvm.org/extra/clang-tidy/#id2. In the main driver I will explicitly check whether the program is in C++.

I'd appreciate any questions/concerns, or oversight on my part on the 3 issues above. Thanks!

Best,
Ray Zhang







_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Questions/discussions about cast types in clang

Vassil Vassilev via cfe-dev
On Sat, Jun 20, 2020 at 3:31 PM Ray Zhang via cfe-dev <[hidden email]> wrote:
Hi all,

I'm currently implementing a clang tool to replace C style casts to C++ style casts, i.e. const/static/reinterpret_casts. I'm almost done, but I've ran into a collection of issues/questions that I'd like to raise here to make sure it's not a misunderstanding on my part. My solutions are included in each bullet point:

1. (Question) For dependent types, it is unknown whether static_cast or reinterpret_cast is appropriate for the situation, so I would like to leave that as a C-style cast in the refactor tool. Example below:

template <typename T> void foo() { int* ptr; (T*) ptr; }
If we call foo<int>, it can be a const_cast, but if we call foo<double>, it has to be a reinterpret_cast.

It could even involve both a const_cast and a static_cast; or it could be unrepresentable as a non-C-style cast (e.g. a cast to a private base).
 
In a situation where the user is writing library code, it won't only be relevant to the current translation unit but also for future purposes. Therefore, I propose that we leave dependent type c-style casts as they are.

Sounds reasonable. Although, it would be nice to flag such casts for human inspection somehow; see my very informal proposal here for example.
 

2. (Issue/bug) In OperationKinds.def, there is an enum detailing casting to union types. This doesn't work in clang(nor GCC). Here's a compiler explorer MVCE: https://godbolt.org/z/rp8DDt. The first statement is not a cast, but rather a CompoundLiteralExpr(InitListExpr(...)), and should not be treated as a cast. Here is the corresponding GCC document: https://gcc.gnu.org/onlinedocs/gcc/Cast-to-Union.html about union casts. If CastKind::CK_ToUnion is ever encountered in my program, I will leave the C-style cast as is.

It works in GCC trunk for me... but only in C++20 mode.
I suspect that your analysis is still correct: this feature is documented, but does not work, in GCC pre-C++20, and then in C++20 the changes around parens-initialization of aggregates broke it in such a way that now it appears to work but nobody should trust it to actually work. It might be nice to submit a patch ripping this broken stuff out of Clang and/or out of GCC.

my $.02,
–Arthur

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Questions/discussions about cast types in clang

Vassil Vassilev via cfe-dev
Hi Arthur,

First of all, it's an honor. I remember watching your "Allocator is a handle to a heap" c++now talk a while ago :)

Second, you're right - I forgot about DerivedToBase or BaseToDerived in the above example. Regarding the statement regarding unpresentable as a non-C-style cast, in the standard it breaks down C-style casts as a cascading attempt to use const/static/reinterpret, or combinations of them. Here's an example of private base: https://godbolt.org/z/5STynJ.

Third, I didn't check for C++20 for union casts, but I'm guessing it "working" is a side-effect of the parens-initialization of aggregates allowing narrowing conversions. Here's a link that shows that the example in the C++20 mode you showed is not equivalent to a C-style cast assignment which is just an alias for (union U) {.x = x}: https://godbolt.org/z/7LJCQy. I will leave the union c-style casts as is for now(as clang trunk can't reproduce this even), and in the future hope to contribute a patch to remove this. Thank you for your input!

Best,
Ray





On Sat, Jun 20, 2020 at 1:22 PM Arthur O'Dwyer <[hidden email]> wrote:
On Sat, Jun 20, 2020 at 3:31 PM Ray Zhang via cfe-dev <[hidden email]> wrote:
Hi all,

I'm currently implementing a clang tool to replace C style casts to C++ style casts, i.e. const/static/reinterpret_casts. I'm almost done, but I've ran into a collection of issues/questions that I'd like to raise here to make sure it's not a misunderstanding on my part. My solutions are included in each bullet point:

1. (Question) For dependent types, it is unknown whether static_cast or reinterpret_cast is appropriate for the situation, so I would like to leave that as a C-style cast in the refactor tool. Example below:

template <typename T> void foo() { int* ptr; (T*) ptr; }
If we call foo<int>, it can be a const_cast, but if we call foo<double>, it has to be a reinterpret_cast.

It could even involve both a const_cast and a static_cast; or it could be unrepresentable as a non-C-style cast (e.g. a cast to a private base).
 
In a situation where the user is writing library code, it won't only be relevant to the current translation unit but also for future purposes. Therefore, I propose that we leave dependent type c-style casts as they are.

Sounds reasonable. Although, it would be nice to flag such casts for human inspection somehow; see my very informal proposal here for example.
 

2. (Issue/bug) In OperationKinds.def, there is an enum detailing casting to union types. This doesn't work in clang(nor GCC). Here's a compiler explorer MVCE: https://godbolt.org/z/rp8DDt. The first statement is not a cast, but rather a CompoundLiteralExpr(InitListExpr(...)), and should not be treated as a cast. Here is the corresponding GCC document: https://gcc.gnu.org/onlinedocs/gcc/Cast-to-Union.html about union casts. If CastKind::CK_ToUnion is ever encountered in my program, I will leave the C-style cast as is.

It works in GCC trunk for me... but only in C++20 mode.
I suspect that your analysis is still correct: this feature is documented, but does not work, in GCC pre-C++20, and then in C++20 the changes around parens-initialization of aggregates broke it in such a way that now it appears to work but nobody should trust it to actually work. It might be nice to submit a patch ripping this broken stuff out of Clang and/or out of GCC.

my $.02,
–Arthur

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Questions/discussions about cast types in clang

Vassil Vassilev via cfe-dev
On Sat, Jun 20, 2020 at 5:33 PM Ray Zhang <[hidden email]> wrote:
Hi Arthur,
First of all, it's an honor. I remember watching your "Allocator is a handle to a heap" c++now talk a while ago :)
Hey, thanks! :) Don't take anything I say as gospel when it comes to cfe-dev, though. :)

Second, you're right - I forgot about DerivedToBase or BaseToDerived in the above example. Regarding the statement regarding unpresentable as a non-C-style cast, in the standard it breaks down C-style casts as a cascading attempt to use const/static/reinterpret, or combinations of them. Here's an example of private base: https://godbolt.org/z/5STynJ.

Yes, but notice where cppreference says "static_cast with exceptions"; I was talking about the exceptions.
Here's an example where C-style cast and reinterpret_cast do different things, but any other kind of cast (e.g. static_cast) is ill-formed.
Clang gives a warning suggesting that you replace the reinterpret_cast with a static_cast, but of course if you actually did that, the code wouldn't compile anymore.


Third, I didn't check for C++20 for union casts, but I'm guessing it "working" is a side-effect of the parens-initialization of aggregates allowing narrowing conversions. Here's a link that shows that the example in the C++20 mode you showed is not equivalent to a C-style cast assignment which is just an alias for (union U) {.x = x}: https://godbolt.org/z/7LJCQy.

Ah, yes, you're right. With `(U)y`, GCC is calling the same pseudo-constructor as `U(y)`, which is supposed to just implicitly convert the float to an int and initialize the first member of the union, completely ignoring the fact that `y` is a float
This appears to be actually the correct behavior in C++20, even though it seems pretty crazy. (In other words, C++20's new union syntax doesn't follow the same overload-resolution-ish rules as C++17's std::variant.) Textual support:
http://eel.is/c++draft/expr.cast#4 says that (U)y should behave the same as static_cast<U>(y) if possible, and
http://eel.is/c++draft/expr.static.cast#4 says that static_cast<U>(y) should behave the same as U(y), which (new in C++20) should behave the same as U{.x = y}.

Here's a pared-down example where the two fields of the union have utterly different types, so the silent implicit conversion fails instead of succeeding. https://godbolt.org/z/9wVX6B
Apparently Clang just doesn't implement this C++20 rule yet; once it does, Clang's behavior should end up matching GCC's.

...Oh, and wait a minute! The first sentence of the documentation you linked says, "A cast to a union type is a C extension not available in C++."  So that explains why neither of us could reproduce the documented behavior! :)  Now here's a Godbolt showing the same code compiled as C++ (where it initializes the first field of the union) and C-with-extensions (where it initializes the second field because that field is a float).
And here's Clang, reproducing GCC's behavior in C-with-extensions but failing to implement C++20 yet:

–Arthur

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Questions/discussions about cast types in clang

Vassil Vassilev via cfe-dev
Hi Arthur,

Regarding the static_cast extension - I see how I can reproduce that now. Thank you! For each DerivedToBase cast type, I'll check to see whether the base class is inaccessible, and leave the C-style cast as-is.

:facepalm: at not reading the first sentence of that GCC document. I had the assumption that all C extensions were available in C++ for backwards compatibility, but today I learned. Sorry about the wild goose-chase! The enum CastKind::CK_ToUnion is thus referring to a feature from a C extension not yet available(will it ever be available?) in C++, I'll try to document it in my tool and later in the enum definitions.

Learned a lot from our discussion, thanks so much!

Best,
Ray

On Sat, Jun 20, 2020 at 3:19 PM Arthur O'Dwyer <[hidden email]> wrote:
On Sat, Jun 20, 2020 at 5:33 PM Ray Zhang <[hidden email]> wrote:
Hi Arthur,
First of all, it's an honor. I remember watching your "Allocator is a handle to a heap" c++now talk a while ago :)
Hey, thanks! :) Don't take anything I say as gospel when it comes to cfe-dev, though. :)

Second, you're right - I forgot about DerivedToBase or BaseToDerived in the above example. Regarding the statement regarding unpresentable as a non-C-style cast, in the standard it breaks down C-style casts as a cascading attempt to use const/static/reinterpret, or combinations of them. Here's an example of private base: https://godbolt.org/z/5STynJ.

Yes, but notice where cppreference says "static_cast with exceptions"; I was talking about the exceptions.
Here's an example where C-style cast and reinterpret_cast do different things, but any other kind of cast (e.g. static_cast) is ill-formed.
Clang gives a warning suggesting that you replace the reinterpret_cast with a static_cast, but of course if you actually did that, the code wouldn't compile anymore.


Third, I didn't check for C++20 for union casts, but I'm guessing it "working" is a side-effect of the parens-initialization of aggregates allowing narrowing conversions. Here's a link that shows that the example in the C++20 mode you showed is not equivalent to a C-style cast assignment which is just an alias for (union U) {.x = x}: https://godbolt.org/z/7LJCQy.

Ah, yes, you're right. With `(U)y`, GCC is calling the same pseudo-constructor as `U(y)`, which is supposed to just implicitly convert the float to an int and initialize the first member of the union, completely ignoring the fact that `y` is a float
This appears to be actually the correct behavior in C++20, even though it seems pretty crazy. (In other words, C++20's new union syntax doesn't follow the same overload-resolution-ish rules as C++17's std::variant.) Textual support:
http://eel.is/c++draft/expr.cast#4 says that (U)y should behave the same as static_cast<U>(y) if possible, and
http://eel.is/c++draft/expr.static.cast#4 says that static_cast<U>(y) should behave the same as U(y), which (new in C++20) should behave the same as U{.x = y}.

Here's a pared-down example where the two fields of the union have utterly different types, so the silent implicit conversion fails instead of succeeding. https://godbolt.org/z/9wVX6B
Apparently Clang just doesn't implement this C++20 rule yet; once it does, Clang's behavior should end up matching GCC's.

...Oh, and wait a minute! The first sentence of the documentation you linked says, "A cast to a union type is a C extension not available in C++."  So that explains why neither of us could reproduce the documented behavior! :)  Now here's a Godbolt showing the same code compiled as C++ (where it initializes the first field of the union) and C-with-extensions (where it initializes the second field because that field is a float).
And here's Clang, reproducing GCC's behavior in C-with-extensions but failing to implement C++20 yet:

–Arthur

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Questions/discussions about cast types in clang

Vassil Vassilev via cfe-dev
In reply to this post by Vassil Vassilev via cfe-dev


On 20/06/2020 08:21, Ray Zhang via cfe-dev wrote:
Hi all,

I'm currently implementing a clang tool to replace C style casts to C++ style casts, i.e. const/static/reinterpret_casts. I'm almost done, but I've ran into a collection of issues/questions that I'd like to raise here to make sure it's not a misunderstanding on my part. My solutions are included in each bullet point:

1. (Question) For dependent types, it is unknown whether static_cast or reinterpret_cast is appropriate for the situation, so I would like to leave that as a C-style cast in the refactor tool. Example below:

template <typename T> void foo() { int* ptr; (T*) ptr; }
If we call foo<int>, it can be a const_cast, but if we call foo<double>, it has to be a reinterpret_cast. In a situation where the user is writing library code, it won't only be relevant to the current translation unit but also for future purposes. Therefore, I propose that we leave dependent type c-style casts as they are.


Makes sense. You shouldn't be trying to transform instantiations of templates. Perhaps issuing a warning as Arthur wrote makes sense though.


3. (Question) In general, should this tool for C++ also support OpenCL and Objective C++? Technically, const/static/reinterpret_casts exist in those languages (see https://godbolt.org/z/DEz8Rs for example). In my opinion, I think the casting tool should strictly adhere to C++ as the language of support and nothing more for the first iteration because different coding standards may be held for different languages(and I only have ample experience with C++). As a result, I propose that the first iteration of the casting tool be placed outside of clang-tidy as clang-tidy also supports Objective C (but not OpenCL), as seen here: https://clang.llvm.org/extra/clang-tidy/#id2. In the main driver I will explicitly check whether the program is in C++.


Clang-tidy has plenty of checks which are C++-specific.

You can make your check c++-specific too:


bool isLanguageVersionSupported(const LangOptions &LangOpts) const override {
  return LangOpts.CPlusPlus;
}

clang-tidy is still an appropriate place for your check.

Thanks,

Stephen.



_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Questions/discussions about cast types in clang

Vassil Vassilev via cfe-dev
Hi Stephen,

Thank you for the suggestions! And I see the option, I will aim to contribute it inside of clang-tidy :)

By the way, I have encountered an issue when trying to address the cast type CastKind::CK_BuiltinFnToFnPtr. According to this demo, it is impossible to pass in a builtin function as a function pointer. In the comments here, it says "only available from the callee of a call expression". It seems like I am passing in the function pointer as the callee in my demo, no?

Best,
Ray

On Sun, Jun 21, 2020 at 4:25 AM Stephen Kelly <[hidden email]> wrote:


On 20/06/2020 08:21, Ray Zhang via cfe-dev wrote:
Hi all,

I'm currently implementing a clang tool to replace C style casts to C++ style casts, i.e. const/static/reinterpret_casts. I'm almost done, but I've ran into a collection of issues/questions that I'd like to raise here to make sure it's not a misunderstanding on my part. My solutions are included in each bullet point:

1. (Question) For dependent types, it is unknown whether static_cast or reinterpret_cast is appropriate for the situation, so I would like to leave that as a C-style cast in the refactor tool. Example below:

template <typename T> void foo() { int* ptr; (T*) ptr; }
If we call foo<int>, it can be a const_cast, but if we call foo<double>, it has to be a reinterpret_cast. In a situation where the user is writing library code, it won't only be relevant to the current translation unit but also for future purposes. Therefore, I propose that we leave dependent type c-style casts as they are.


Makes sense. You shouldn't be trying to transform instantiations of templates. Perhaps issuing a warning as Arthur wrote makes sense though.


3. (Question) In general, should this tool for C++ also support OpenCL and Objective C++? Technically, const/static/reinterpret_casts exist in those languages (see https://godbolt.org/z/DEz8Rs for example). In my opinion, I think the casting tool should strictly adhere to C++ as the language of support and nothing more for the first iteration because different coding standards may be held for different languages(and I only have ample experience with C++). As a result, I propose that the first iteration of the casting tool be placed outside of clang-tidy as clang-tidy also supports Objective C (but not OpenCL), as seen here: https://clang.llvm.org/extra/clang-tidy/#id2. In the main driver I will explicitly check whether the program is in C++.


Clang-tidy has plenty of checks which are C++-specific.

You can make your check c++-specific too:


bool isLanguageVersionSupported(const LangOptions &LangOpts) const override {
  return LangOpts.CPlusPlus;
}

clang-tidy is still an appropriate place for your check.

Thanks,

Stephen.



_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev