-ftrivial-auto-var-init=pattern vs =uninitialized and union initialization

classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

-ftrivial-auto-var-init=pattern vs =uninitialized and union initialization

Kristof Beyls via cfe-dev
Hi everyone,

I am trying to enable -ftrivial-auto-var-init=pattern on various code and noticed inconvenient inconsistency with code like [1].

According standard, as I understand, only the first member of the union must be initialized and the tail of the second member can stay uninitialized. If we use -ftrivial-auto-var-init=pattern it fills the tail using our pattern. However by default GCC, clang (without -ftrivial-auto-var-init), msvc all initialize entire union with 0s. Not sure if it's just coincidence or guaranteed feature.

So -ftrivial-auto-var-init=pattern breaks such code. Especially bad if you don't know that U is a union and ={} looks good there.

Should we consider giving up here and using zeroes for union tails even with -ftrivial-auto-var-init=pattern?

1. Example:
union U {
  char small[2];
  char large[100];
};
void f(void*);
void test() {
  union U u = {};
  f(&u);
}

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: -ftrivial-auto-var-init=pattern vs =uninitialized and union initialization

Kristof Beyls via cfe-dev
Hi Vitaly,

Indeed: there are cases where clang happened to honor the standard without intending to (through the magic of memcpy’ing globals), and auto-init=pattern broke this. I got a report about it a while ago and agree it’s an issue, I haven’t had time to fix it. It’s related to what C calls “designated initializers” to initialize structs. The relevant standardese from C17 are in 6.7.9 Initialization (the grammar has “designator” being either “ [ constant-expression ] ” or “ . identifier ”).

The relevant rules are:

“””
Except where explicitly stated otherwise, for the purposes of this subclause unnamed members of objects of structure and union type do not participate in initialization. Unnamed members of structure objects have indeterminate value even after initialization.

If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate. If an object that has static or thread storage duration is not initialized explicitly, then:
— if it has pointer type, it is initialized to a null pointer;
— if it has arithmetic type, it is initialized to (positive or unsigned) zero;
— if it is an aggregate, every member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;
— if it is a union, the first named member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;

The initialization shall occur in initializer list order, each initializer provided for a particular subobject overriding any previously listed initializer for the same subobject; all subobjects that are not initialized explicitly shall be initialized implicitly the same as objects that have static storage duration.

If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate, or fewer characters in a string literal used to initialize an array of known size than there are elements in the array, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration.
“””

I had fixed something in https://reviews.llvm.org/D61280 but it’s not sufficient.


I don’t think fixing this is giving up: it’s doing what the standard mandates.

Are you interested in fixing it?



On Sep 23, 2019, at 2:20 PM, Vitaly Buka <[hidden email]> wrote:

Hi everyone,

I am trying to enable -ftrivial-auto-var-init=pattern on various code and noticed inconvenient inconsistency with code like [1].

According standard, as I understand, only the first member of the union must be initialized and the tail of the second member can stay uninitialized. If we use -ftrivial-auto-var-init=pattern it fills the tail using our pattern. However by default GCC, clang (without -ftrivial-auto-var-init), msvc all initialize entire union with 0s. Not sure if it's just coincidence or guaranteed feature.

So -ftrivial-auto-var-init=pattern breaks such code. Especially bad if you don't know that U is a union and ={} looks good there.

Should we consider giving up here and using zeroes for union tails even with -ftrivial-auto-var-init=pattern?

1. Example:
union U {
  char small[2];
  char large[100];
};
void f(void*);
void test() {
  union U u = {};
  f(&u);
}


_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: -ftrivial-auto-var-init=pattern vs =uninitialized and union initialization

Kristof Beyls via cfe-dev
Thanks.
Somehow I read the same text but interpreted it differently. Now it makes sense, it must be zeroed and trivial-auto-var-init needs improvement.
I'll will create a patch.


On Mon, Sep 23, 2019 at 2:36 PM JF Bastien <[hidden email]> wrote:
Hi Vitaly,

Indeed: there are cases where clang happened to honor the standard without intending to (through the magic of memcpy’ing globals), and auto-init=pattern broke this. I got a report about it a while ago and agree it’s an issue, I haven’t had time to fix it. It’s related to what C calls “designated initializers” to initialize structs. The relevant standardese from C17 are in 6.7.9 Initialization (the grammar has “designator” being either “ [ constant-expression ] ” or “ . identifier ”).

The relevant rules are:

“””
Except where explicitly stated otherwise, for the purposes of this subclause unnamed members of objects of structure and union type do not participate in initialization. Unnamed members of structure objects have indeterminate value even after initialization.

If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate. If an object that has static or thread storage duration is not initialized explicitly, then:
— if it has pointer type, it is initialized to a null pointer;
— if it has arithmetic type, it is initialized to (positive or unsigned) zero;
— if it is an aggregate, every member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;
— if it is a union, the first named member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;

The initialization shall occur in initializer list order, each initializer provided for a particular subobject overriding any previously listed initializer for the same subobject; all subobjects that are not initialized explicitly shall be initialized implicitly the same as objects that have static storage duration.

If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate, or fewer characters in a string literal used to initialize an array of known size than there are elements in the array, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration.
“””

I had fixed something in https://reviews.llvm.org/D61280 but it’s not sufficient.


I don’t think fixing this is giving up: it’s doing what the standard mandates.

Are you interested in fixing it?



On Sep 23, 2019, at 2:20 PM, Vitaly Buka <[hidden email]> wrote:

Hi everyone,

I am trying to enable -ftrivial-auto-var-init=pattern on various code and noticed inconvenient inconsistency with code like [1].

According standard, as I understand, only the first member of the union must be initialized and the tail of the second member can stay uninitialized. If we use -ftrivial-auto-var-init=pattern it fills the tail using our pattern. However by default GCC, clang (without -ftrivial-auto-var-init), msvc all initialize entire union with 0s. Not sure if it's just coincidence or guaranteed feature.

So -ftrivial-auto-var-init=pattern breaks such code. Especially bad if you don't know that U is a union and ={} looks good there.

Should we consider giving up here and using zeroes for union tails even with -ftrivial-auto-var-init=pattern?

1. Example:
union U {
  char small[2];
  char large[100];
};
void f(void*);
void test() {
  union U u = {};
  f(&u);
}


_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: -ftrivial-auto-var-init=pattern vs =uninitialized and union initialization

Kristof Beyls via cfe-dev
For code line this:
void used(void *);
union unmatched { char c[2]; char i[7]; };
void test1() {
  union unmatched U = {1};
  used(&U);
}

tryEmitAbstractForInitializer already emits 
@__const.test1.U = private unnamed_addr constant { [2 x i8], [5 x i8] } { [2 x i8] c"\01\00", [5 x i8] undef }, align 1

Then we replace all undefs with pattern.
So I believe I need to fix tryEmitAbstractForInitializer to emit zeroes. However it breaks a bunch of tests.

Does this make sense?


On Mon, Sep 23, 2019 at 4:10 PM Vitaly Buka <[hidden email]> wrote:
Thanks.
Somehow I read the same text but interpreted it differently. Now it makes sense, it must be zeroed and trivial-auto-var-init needs improvement.
I'll will create a patch.


On Mon, Sep 23, 2019 at 2:36 PM JF Bastien <[hidden email]> wrote:
Hi Vitaly,

Indeed: there are cases where clang happened to honor the standard without intending to (through the magic of memcpy’ing globals), and auto-init=pattern broke this. I got a report about it a while ago and agree it’s an issue, I haven’t had time to fix it. It’s related to what C calls “designated initializers” to initialize structs. The relevant standardese from C17 are in 6.7.9 Initialization (the grammar has “designator” being either “ [ constant-expression ] ” or “ . identifier ”).

The relevant rules are:

“””
Except where explicitly stated otherwise, for the purposes of this subclause unnamed members of objects of structure and union type do not participate in initialization. Unnamed members of structure objects have indeterminate value even after initialization.

If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate. If an object that has static or thread storage duration is not initialized explicitly, then:
— if it has pointer type, it is initialized to a null pointer;
— if it has arithmetic type, it is initialized to (positive or unsigned) zero;
— if it is an aggregate, every member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;
— if it is a union, the first named member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;

The initialization shall occur in initializer list order, each initializer provided for a particular subobject overriding any previously listed initializer for the same subobject; all subobjects that are not initialized explicitly shall be initialized implicitly the same as objects that have static storage duration.

If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate, or fewer characters in a string literal used to initialize an array of known size than there are elements in the array, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration.
“””

I had fixed something in https://reviews.llvm.org/D61280 but it’s not sufficient.


I don’t think fixing this is giving up: it’s doing what the standard mandates.

Are you interested in fixing it?



On Sep 23, 2019, at 2:20 PM, Vitaly Buka <[hidden email]> wrote:

Hi everyone,

I am trying to enable -ftrivial-auto-var-init=pattern on various code and noticed inconvenient inconsistency with code like [1].

According standard, as I understand, only the first member of the union must be initialized and the tail of the second member can stay uninitialized. If we use -ftrivial-auto-var-init=pattern it fills the tail using our pattern. However by default GCC, clang (without -ftrivial-auto-var-init), msvc all initialize entire union with 0s. Not sure if it's just coincidence or guaranteed feature.

So -ftrivial-auto-var-init=pattern breaks such code. Especially bad if you don't know that U is a union and ={} looks good there.

Should we consider giving up here and using zeroes for union tails even with -ftrivial-auto-var-init=pattern?

1. Example:
union U {
  char small[2];
  char large[100];
};
void f(void*);
void test() {
  union U u = {};
  f(&u);
}


_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: -ftrivial-auto-var-init=pattern vs =uninitialized and union initialization

Kristof Beyls via cfe-dev
On Tue, Sep 24, 2019 at 4:06 PM Vitaly Buka <[hidden email]> wrote:
For code line this:
void used(void *);
union unmatched { char c[2]; char i[7]; };
void test1() {
  union unmatched U = {1};
  used(&U);
}

tryEmitAbstractForInitializer already emits 
@__const.test1.U = private unnamed_addr constant { [2 x i8], [5 x i8] } { [2 x i8] c"\01\00", [5 x i8] undef }, align 1

Then we replace all undefs with pattern.
So I believe I need to fix tryEmitAbstractForInitializer to emit zeroes. However it breaks a bunch of tests.

Does this make sense?

Yes. We'll presumably need to track a bit on APValue's struct and union representations to model whether padding is zeroed or undefined. We already track on CXXConstructExpr whether the padding is zeroed, but don't preserve properly everywhere else.

This is http://llvm.org/PR11742 by the way. =)
 
On Mon, Sep 23, 2019 at 4:10 PM Vitaly Buka <[hidden email]> wrote:
Thanks.
Somehow I read the same text but interpreted it differently. Now it makes sense, it must be zeroed and trivial-auto-var-init needs improvement.
I'll will create a patch.


On Mon, Sep 23, 2019 at 2:36 PM JF Bastien <[hidden email]> wrote:
Hi Vitaly,

Indeed: there are cases where clang happened to honor the standard without intending to (through the magic of memcpy’ing globals), and auto-init=pattern broke this. I got a report about it a while ago and agree it’s an issue, I haven’t had time to fix it. It’s related to what C calls “designated initializers” to initialize structs. The relevant standardese from C17 are in 6.7.9 Initialization (the grammar has “designator” being either “ [ constant-expression ] ” or “ . identifier ”).

The relevant rules are:

“””
Except where explicitly stated otherwise, for the purposes of this subclause unnamed members of objects of structure and union type do not participate in initialization. Unnamed members of structure objects have indeterminate value even after initialization.

If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate. If an object that has static or thread storage duration is not initialized explicitly, then:
— if it has pointer type, it is initialized to a null pointer;
— if it has arithmetic type, it is initialized to (positive or unsigned) zero;
— if it is an aggregate, every member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;
— if it is a union, the first named member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;

The initialization shall occur in initializer list order, each initializer provided for a particular subobject overriding any previously listed initializer for the same subobject; all subobjects that are not initialized explicitly shall be initialized implicitly the same as objects that have static storage duration.

If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate, or fewer characters in a string literal used to initialize an array of known size than there are elements in the array, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration.
“””

I had fixed something in https://reviews.llvm.org/D61280 but it’s not sufficient.


I don’t think fixing this is giving up: it’s doing what the standard mandates.

Are you interested in fixing it?



On Sep 23, 2019, at 2:20 PM, Vitaly Buka <[hidden email]> wrote:

Hi everyone,

I am trying to enable -ftrivial-auto-var-init=pattern on various code and noticed inconvenient inconsistency with code like [1].

According standard, as I understand, only the first member of the union must be initialized and the tail of the second member can stay uninitialized. If we use -ftrivial-auto-var-init=pattern it fills the tail using our pattern. However by default GCC, clang (without -ftrivial-auto-var-init), msvc all initialize entire union with 0s. Not sure if it's just coincidence or guaranteed feature.

So -ftrivial-auto-var-init=pattern breaks such code. Especially bad if you don't know that U is a union and ={} looks good there.

Should we consider giving up here and using zeroes for union tails even with -ftrivial-auto-var-init=pattern?

1. Example:
union U {
  char small[2];
  char large[100];
};
void f(void*);
void test() {
  union U u = {};
  f(&u);
}


_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: -ftrivial-auto-var-init=pattern vs =uninitialized and union initialization

Kristof Beyls via cfe-dev
In reply to this post by Kristof Beyls via cfe-dev
On Mon, Sep 23, 2019 at 5:36 PM JF Bastien via cfe-dev <[hidden email]> wrote:
Hi Vitaly,

Indeed: there are cases where clang happened to honor the standard without intending to (through the magic of memcpy’ing globals), and auto-init=pattern broke this. I got a report about it a while ago and agree it’s an issue, I haven’t had time to fix it. It’s related to what C calls “designated initializers” to initialize structs. The relevant standardese from C17 are in 6.7.9 Initialization (the grammar has “designator” being either “ [ constant-expression ] ” or “ . identifier ”).

The relevant rules are:

“””
Except where explicitly stated otherwise, for the purposes of this subclause unnamed members of objects of structure and union type do not participate in initialization. Unnamed members of structure objects have indeterminate value even after initialization.

If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate. If an object that has static or thread storage duration is not initialized explicitly, then:
— if it has pointer type, it is initialized to a null pointer;
— if it has arithmetic type, it is initialized to (positive or unsigned) zero;
— if it is an aggregate, every member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;
— if it is a union, the first named member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;

The initialization shall occur in initializer list order, each initializer provided for a particular subobject overriding any previously listed initializer for the same subobject; all subobjects that are not initialized explicitly shall be initialized implicitly the same as objects that have static storage duration.

If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate, or fewer characters in a string literal used to initialize an array of known size than there are elements in the array, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration.
“””
Unions are not aggregates in the C standard, so the last sentence does not apply.
 

I had fixed something in https://reviews.llvm.org/D61280 but it’s not sufficient.


I don’t think fixing this is giving up: it’s doing what the standard mandates.

Are you interested in fixing it?



On Sep 23, 2019, at 2:20 PM, Vitaly Buka <[hidden email]> wrote:

Hi everyone,

I am trying to enable -ftrivial-auto-var-init=pattern on various code and noticed inconvenient inconsistency with code like [1].

According standard, as I understand, only the first member of the union must be initialized and the tail of the second member can stay uninitialized. If we use -ftrivial-auto-var-init=pattern it fills the tail using our pattern. However by default GCC, clang (without -ftrivial-auto-var-init), msvc all initialize entire union with 0s. Not sure if it's just coincidence or guaranteed feature.

So -ftrivial-auto-var-init=pattern breaks such code. Especially bad if you don't know that U is a union and ={} looks good there.

Should we consider giving up here and using zeroes for union tails even with -ftrivial-auto-var-init=pattern?

1. Example:
union U {
  char small[2];
  char large[100];
};
void f(void*);
void test() {
  union U u = {};
  f(&u);
}

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: -ftrivial-auto-var-init=pattern vs =uninitialized and union initialization

Kristof Beyls via cfe-dev
On Tue, Sep 24, 2019 at 9:09 PM Hubert Tong <[hidden email]> wrote:
On Mon, Sep 23, 2019 at 5:36 PM JF Bastien via cfe-dev <[hidden email]> wrote:
Hi Vitaly,

Indeed: there are cases where clang happened to honor the standard without intending to (through the magic of memcpy’ing globals), and auto-init=pattern broke this. I got a report about it a while ago and agree it’s an issue, I haven’t had time to fix it. It’s related to what C calls “designated initializers” to initialize structs. The relevant standardese from C17 are in 6.7.9 Initialization (the grammar has “designator” being either “ [ constant-expression ] ” or “ . identifier ”).

The relevant rules are:

“””
Except where explicitly stated otherwise, for the purposes of this subclause unnamed members of objects of structure and union type do not participate in initialization. Unnamed members of structure objects have indeterminate value even after initialization.

If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate. If an object that has static or thread storage duration is not initialized explicitly, then:
— if it has pointer type, it is initialized to a null pointer;
— if it has arithmetic type, it is initialized to (positive or unsigned) zero;
— if it is an aggregate, every member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;
— if it is a union, the first named member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;

The initialization shall occur in initializer list order, each initializer provided for a particular subobject overriding any previously listed initializer for the same subobject; all subobjects that are not initialized explicitly shall be initialized implicitly the same as objects that have static storage duration.

If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate, or fewer characters in a string literal used to initialize an array of known size than there are elements in the array, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration.
“””
Unions are not aggregates in the C standard, so the last sentence does not apply.
 
I assume it's not going to hurt if we zero init there? I see a lot of code which assume that large fields will be initialized, and not sure how to use non-zero behaviour.
 

I had fixed something in https://reviews.llvm.org/D61280 but it’s not sufficient.


I don’t think fixing this is giving up: it’s doing what the standard mandates.

Are you interested in fixing it?



On Sep 23, 2019, at 2:20 PM, Vitaly Buka <[hidden email]> wrote:

Hi everyone,

I am trying to enable -ftrivial-auto-var-init=pattern on various code and noticed inconvenient inconsistency with code like [1].

According standard, as I understand, only the first member of the union must be initialized and the tail of the second member can stay uninitialized. If we use -ftrivial-auto-var-init=pattern it fills the tail using our pattern. However by default GCC, clang (without -ftrivial-auto-var-init), msvc all initialize entire union with 0s. Not sure if it's just coincidence or guaranteed feature.

So -ftrivial-auto-var-init=pattern breaks such code. Especially bad if you don't know that U is a union and ={} looks good there.

Should we consider giving up here and using zeroes for union tails even with -ftrivial-auto-var-init=pattern?

1. Example:
union U {
  char small[2];
  char large[100];
};
void f(void*);
void test() {
  union U u = {};
  f(&u);
}

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: -ftrivial-auto-var-init=pattern vs =uninitialized and union initialization

Kristof Beyls via cfe-dev
On Thu, Sep 26, 2019 at 10:28 PM Vitaly Buka <[hidden email]> wrote:
On Tue, Sep 24, 2019 at 9:09 PM Hubert Tong <[hidden email]> wrote:
On Mon, Sep 23, 2019 at 5:36 PM JF Bastien via cfe-dev <[hidden email]> wrote:
Hi Vitaly,

Indeed: there are cases where clang happened to honor the standard without intending to (through the magic of memcpy’ing globals), and auto-init=pattern broke this. I got a report about it a while ago and agree it’s an issue, I haven’t had time to fix it. It’s related to what C calls “designated initializers” to initialize structs. The relevant standardese from C17 are in 6.7.9 Initialization (the grammar has “designator” being either “ [ constant-expression ] ” or “ . identifier ”).

The relevant rules are:

“””
Except where explicitly stated otherwise, for the purposes of this subclause unnamed members of objects of structure and union type do not participate in initialization. Unnamed members of structure objects have indeterminate value even after initialization.

If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate. If an object that has static or thread storage duration is not initialized explicitly, then:
— if it has pointer type, it is initialized to a null pointer;
— if it has arithmetic type, it is initialized to (positive or unsigned) zero;
— if it is an aggregate, every member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;
— if it is a union, the first named member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;

The initialization shall occur in initializer list order, each initializer provided for a particular subobject overriding any previously listed initializer for the same subobject; all subobjects that are not initialized explicitly shall be initialized implicitly the same as objects that have static storage duration.

If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate, or fewer characters in a string literal used to initialize an array of known size than there are elements in the array, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration.
“””
Unions are not aggregates in the C standard, so the last sentence does not apply.
 
I assume it's not going to hurt if we zero init there? I see a lot of code which assume that large fields will be initialized, and not sure how to use non-zero behaviour.
There are still platforms where such code breaks naturally anyway because the zero value for a particular type is not the same as having zeroes for all the bytes of its object representation. Under the status quo, -ftrivial-auto-var-init has some ability to act as a tool for portability. Although, as noted in this thread, the status quo is neither here nor there with actually using the pattern (or using zeroes or something else).
 
 

I had fixed something in https://reviews.llvm.org/D61280 but it’s not sufficient.


I don’t think fixing this is giving up: it’s doing what the standard mandates.

Are you interested in fixing it?



On Sep 23, 2019, at 2:20 PM, Vitaly Buka <[hidden email]> wrote:

Hi everyone,

I am trying to enable -ftrivial-auto-var-init=pattern on various code and noticed inconvenient inconsistency with code like [1].

According standard, as I understand, only the first member of the union must be initialized and the tail of the second member can stay uninitialized. If we use -ftrivial-auto-var-init=pattern it fills the tail using our pattern. However by default GCC, clang (without -ftrivial-auto-var-init), msvc all initialize entire union with 0s. Not sure if it's just coincidence or guaranteed feature.

So -ftrivial-auto-var-init=pattern breaks such code. Especially bad if you don't know that U is a union and ={} looks good there.

Should we consider giving up here and using zeroes for union tails even with -ftrivial-auto-var-init=pattern?

1. Example:
union U {
  char small[2];
  char large[100];
};
void f(void*);
void test() {
  union U u = {};
  f(&u);
}

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: -ftrivial-auto-var-init=pattern vs =uninitialized and union initialization

Kristof Beyls via cfe-dev
So this means that my original diagnosis was correct?
What are we going to do about -ftrivial-auto-var-init
1. Keep as-is and forse folks to replace "U u = {}" with "U u; u.large = {}" or memset
2. Update https://reviews.llvm.org/D68115 to engage only with =pattern




On Thu, Sep 26, 2019 at 8:48 PM Hubert Tong <[hidden email]> wrote:
On Thu, Sep 26, 2019 at 10:28 PM Vitaly Buka <[hidden email]> wrote:
On Tue, Sep 24, 2019 at 9:09 PM Hubert Tong <[hidden email]> wrote:
On Mon, Sep 23, 2019 at 5:36 PM JF Bastien via cfe-dev <[hidden email]> wrote:
Hi Vitaly,

Indeed: there are cases where clang happened to honor the standard without intending to (through the magic of memcpy’ing globals), and auto-init=pattern broke this. I got a report about it a while ago and agree it’s an issue, I haven’t had time to fix it. It’s related to what C calls “designated initializers” to initialize structs. The relevant standardese from C17 are in 6.7.9 Initialization (the grammar has “designator” being either “ [ constant-expression ] ” or “ . identifier ”).

The relevant rules are:

“””
Except where explicitly stated otherwise, for the purposes of this subclause unnamed members of objects of structure and union type do not participate in initialization. Unnamed members of structure objects have indeterminate value even after initialization.

If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate. If an object that has static or thread storage duration is not initialized explicitly, then:
— if it has pointer type, it is initialized to a null pointer;
— if it has arithmetic type, it is initialized to (positive or unsigned) zero;
— if it is an aggregate, every member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;
— if it is a union, the first named member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;

The initialization shall occur in initializer list order, each initializer provided for a particular subobject overriding any previously listed initializer for the same subobject; all subobjects that are not initialized explicitly shall be initialized implicitly the same as objects that have static storage duration.

If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate, or fewer characters in a string literal used to initialize an array of known size than there are elements in the array, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration.
“””
Unions are not aggregates in the C standard, so the last sentence does not apply.
 
I assume it's not going to hurt if we zero init there? I see a lot of code which assume that large fields will be initialized, and not sure how to use non-zero behaviour.
There are still platforms where such code breaks naturally anyway because the zero value for a particular type is not the same as having zeroes for all the bytes of its object representation. Under the status quo, -ftrivial-auto-var-init has some ability to act as a tool for portability. Although, as noted in this thread, the status quo is neither here nor there with actually using the pattern (or using zeroes or something else).
 
 

I had fixed something in https://reviews.llvm.org/D61280 but it’s not sufficient.


I don’t think fixing this is giving up: it’s doing what the standard mandates.

Are you interested in fixing it?



On Sep 23, 2019, at 2:20 PM, Vitaly Buka <[hidden email]> wrote:

Hi everyone,

I am trying to enable -ftrivial-auto-var-init=pattern on various code and noticed inconvenient inconsistency with code like [1].

According standard, as I understand, only the first member of the union must be initialized and the tail of the second member can stay uninitialized. If we use -ftrivial-auto-var-init=pattern it fills the tail using our pattern. However by default GCC, clang (without -ftrivial-auto-var-init), msvc all initialize entire union with 0s. Not sure if it's just coincidence or guaranteed feature.

So -ftrivial-auto-var-init=pattern breaks such code. Especially bad if you don't know that U is a union and ={} looks good there.

Should we consider giving up here and using zeroes for union tails even with -ftrivial-auto-var-init=pattern?

1. Example:
union U {
  char small[2];
  char large[100];
};
void f(void*);
void test() {
  union U u = {};
  f(&u);
}

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev