Inconsistent warnings with -Wundefined-reinterpret-cast

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

Inconsistent warnings with -Wundefined-reinterpret-cast

Deep Majumder via cfe-dev
Hi all!

I stumbled across some C++ code today that uses reinterpret_cast to reshape a 1D array into a 2D array. I wondered if this is actually legal or if it would cause undefined behavior. There is an elaborate post on Stack Overflow [1] that argues in favor of this method, quoting sections of the C++ standard but other people disagree [2].

So I went ahead and tested different variants of such conversions with Clang’s -Wundefined-reinterpret-cast flag:

        https://godbolt.org/z/aE4fEM

The results can be summarized as:
1. Casting from int[4] to int(&)[2][2] or int(&)[][2] causes a warning
2. Casting from int(*)[4] to int(*)[2][2] or int(*)[][2] and _immediately_ dereferencing causes a warning
3. Casting from int(*)[4] to int(*)[2][2] or int(*)[][2] without immediate dereferencing seems fine
4. Casting from int[4] to int(*)[2] seems fine
5. Dereferencing those pointers from 3. and 4. afterwards also seems fine

This seems to be a bit inconsistent. From the results I assume that the cast itself is fine but dereferencing the casted pointer is undefined behavior. However, if those operations are split across multiple source code lines, there is no warning. Is this a bug or a natural limitation of the error checking within Clang?

Cheers,
Michael


[1] https://stackoverflow.com/a/15284276
[2] https://stackoverflow.com/q/44398700
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Inconsistent warnings with -Wundefined-reinterpret-cast

Deep Majumder via cfe-dev
OK, I think I can answer my own question. Here is the current code of the undefined-reinterpret-cast check:

    https://github.com/llvm/llvm-project/blob/b94c215592bdba915455895b2041398dfb2ac44a/clang/lib/Sema/SemaCast.cpp#L1874

In the comment it states:

    // The cases that is checked for is:
    // *reinterpret_cast<T*>(&a)
    // reinterpret_cast<T&>(a)
    // where accessing 'a' as type 'T' will result in undefined behavior.

So casting a pointer and dereferencing it at some later point in time is not covered by the check. It can also miss cases like the following:

    int matrix[2 * 2] = {0, 1, 2, 3};

    // missed because matrix is not referenced:
    int(&m1)[2][2] = *reinterpret_cast<int(*)[2][2]>(matrix);

    // missed because casted pointer is dereferenced via indexing operator:
    int(&m2)[2][2] = reinterpret_cast<int(*)[2][2]>(&matrix)[0];

On the question whether these casts (and dereferencing the resulting pointers) actually cause undefined behavior, I noticed a comment by Davis Herring on https://stackoverflow.com/a/15284276. I read it as: This is undefined behavior because the array referenced by the casted pointer does not actually exist and therefore the argument about array-to-pointer conversion made by the author of that post does not hold.

Cheers,
Michael

> Am 22.02.2021 um 21:01 schrieb Michael Laß via cfe-dev <[hidden email]>:
>
> Hi all!
>
> I stumbled across some C++ code today that uses reinterpret_cast to reshape a 1D array into a 2D array. I wondered if this is actually legal or if it would cause undefined behavior. There is an elaborate post on Stack Overflow [1] that argues in favor of this method, quoting sections of the C++ standard but other people disagree [2].
>
> So I went ahead and tested different variants of such conversions with Clang’s -Wundefined-reinterpret-cast flag:
>
>        https://godbolt.org/z/aE4fEM
>
> The results can be summarized as:
> 1. Casting from int[4] to int(&)[2][2] or int(&)[][2] causes a warning
> 2. Casting from int(*)[4] to int(*)[2][2] or int(*)[][2] and _immediately_ dereferencing causes a warning
> 3. Casting from int(*)[4] to int(*)[2][2] or int(*)[][2] without immediate dereferencing seems fine
> 4. Casting from int[4] to int(*)[2] seems fine
> 5. Dereferencing those pointers from 3. and 4. afterwards also seems fine
>
> This seems to be a bit inconsistent. From the results I assume that the cast itself is fine but dereferencing the casted pointer is undefined behavior. However, if those operations are split across multiple source code lines, there is no warning. Is this a bug or a natural limitation of the error checking within Clang?
>
> Cheers,
> Michael
>
>
> [1] https://stackoverflow.com/a/15284276
> [2] https://stackoverflow.com/q/44398700
> _______________________________________________
> cfe-dev mailing list
> [hidden email]
> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Inconsistent warnings with -Wundefined-reinterpret-cast

Deep Majumder via cfe-dev
On Wed, 24 Feb 2021 at 04:30, Michael Laß via cfe-dev <[hidden email]> wrote:
OK, I think I can answer my own question. Here is the current code of the undefined-reinterpret-cast check:

    https://github.com/llvm/llvm-project/blob/b94c215592bdba915455895b2041398dfb2ac44a/clang/lib/Sema/SemaCast.cpp#L1874

In the comment it states:

    // The cases that is checked for is:
    // *reinterpret_cast<T*>(&a)
    // reinterpret_cast<T&>(a)
    // where accessing 'a' as type 'T' will result in undefined behavior.

So casting a pointer and dereferencing it at some later point in time is not covered by the check.

Yes; unfortunately, Clang warnings tend to only perform a very local analysis, so they don't tend to catch this kind of issue when it's split across multiple expressions. (The static analyzer should be able to spot this kind of problem, but I don't know if it has a checker corresponding to this warning.)
 
It can also miss cases like the following:

    int matrix[2 * 2] = {0, 1, 2, 3};

    // missed because matrix is not referenced:
    int(&m1)[2][2] = *reinterpret_cast<int(*)[2][2]>(matrix);

    // missed because casted pointer is dereferenced via indexing operator:
    int(&m2)[2][2] = reinterpret_cast<int(*)[2][2]>(&matrix)[0];

On the question whether these casts (and dereferencing the resulting pointers) actually cause undefined behavior, I noticed a comment by Davis Herring on https://stackoverflow.com/a/15284276. I read it as: This is undefined behavior because the array referenced by the casted pointer does not actually exist and therefore the argument about array-to-pointer conversion made by the author of that post does not hold.

Yes, Davis's comment matches my understanding.
 
Cheers,
Michael

> Am 22.02.2021 um 21:01 schrieb Michael Laß via cfe-dev <[hidden email]>:
>
> Hi all!
>
> I stumbled across some C++ code today that uses reinterpret_cast to reshape a 1D array into a 2D array. I wondered if this is actually legal or if it would cause undefined behavior. There is an elaborate post on Stack Overflow [1] that argues in favor of this method, quoting sections of the C++ standard but other people disagree [2].
>
> So I went ahead and tested different variants of such conversions with Clang’s -Wundefined-reinterpret-cast flag:
>
>        https://godbolt.org/z/aE4fEM
>
> The results can be summarized as:
> 1. Casting from int[4] to int(&)[2][2] or int(&)[][2] causes a warning
> 2. Casting from int(*)[4] to int(*)[2][2] or int(*)[][2] and _immediately_ dereferencing causes a warning
> 3. Casting from int(*)[4] to int(*)[2][2] or int(*)[][2] without immediate dereferencing seems fine
> 4. Casting from int[4] to int(*)[2] seems fine
> 5. Dereferencing those pointers from 3. and 4. afterwards also seems fine
>
> This seems to be a bit inconsistent. From the results I assume that the cast itself is fine but dereferencing the casted pointer is undefined behavior. However, if those operations are split across multiple source code lines, there is no warning. Is this a bug or a natural limitation of the error checking within Clang?
>
> Cheers,
> Michael
>
>
> [1] https://stackoverflow.com/a/15284276
> [2] https://stackoverflow.com/q/44398700
> _______________________________________________
> cfe-dev mailing list
> [hidden email]
> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev