Struct padding

Hongbin Zheng via cfe-dev
Hi,

I am wondering how I can tell whether a field of a struct is introduced by padding or not.

For example, if I have a struct:

struct foo1 {
    char *p;     /* 8 bytes */
    char c;      /* 1 byte */
    long x;      /* 8 bytes */
};

clang may generate:

struct foo1 {
    char *p;     /* 8 bytes */
    char c;      /* 1 byte */
    char pad[7]; /* 7 bytes */
    long x;      /* 8 bytes */
};

Is there any way that I can tell the "pad" array is generated by padding?

Thanks a lot
Hongbin

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

Re: [llvm-dev] Struct padding

Jonas Devlieghere via cfe-dev
Hi Hongbin,

You can pass `-Wpadded` to clang. For your particular example it will print something along the lines of:

```
warning: padding struct 'foo1' with 7 bytes to align 'x' [-Wpadded]
    long x;
```
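
The figure reported by the warning can also be cross-checked from inside a program with `offsetof`. The following is only a minimal sketch, assuming the `foo1` layout from the question; the helper name `padding_before_x` is made up for illustration, and the 7-byte result holds only on targets where `long` is 8 bytes and 8-byte aligned (for example x86-64 Linux):

```c
#include <assert.h>
#include <stddef.h>  /* offsetof, size_t */

struct foo1 {
    char *p;
    char c;
    long x;
};

/* Hypothetical helper: number of bytes between the end of 'c' and the
 * start of 'x', i.e. the padding the compiler placed in front of 'x'. */
size_t padding_before_x(void)
{
    size_t end_of_c = offsetof(struct foo1, c) + sizeof(char);

    /* Members are laid out in declaration order, so 'x' cannot start
     * before 'c' ends. */
    assert(offsetof(struct foo1, x) >= end_of_c);
    return offsetof(struct foo1, x) - end_of_c;
}
```

Calling `padding_before_x()` from `main` and printing the result should agree with the byte count in the `-Wpadded` message for the same target.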

Jonas

Re: [llvm-dev] Struct padding

Hongbin Zheng via cfe-dev
Hi Jonas,

Thanks a lot.
In an LLVM pass, how can I check the related information? Will clang emit some metadata table?

Thanks
Hongbin

Re: [llvm-dev] Struct padding

mats petersson via cfe-dev
What are you actually trying to achieve? LLVM knows the alignment and size of each component. You could iterate over the element types and identify where there is a difference between the "calculated total size and the current alignment requirement", but LLVM pads structures automatically [unless you specifically ask it not to].

Note that there is no actual field added for padding; it's just the size and alignment itself.
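
The iteration described above can be sketched in plain C. This is a rough illustration under stated assumptions, not how any LLVM pass is written: `field_info`, `report_padding` and `padding_in_foo1` are invented names, and `offsetof`/`sizeof` stand in for the offsets and sizes that an LLVM pass would presumably read from the module's data layout (for instance via `DataLayout::getStructLayout` and `StructLayout::getElementOffset`):

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical description of one struct member. */
struct field_info {
    const char *name;
    size_t offset;   /* where the member starts */
    size_t size;     /* how many bytes it occupies */
};

/* Walk the members in offset order, print every gap, and return the
 * total number of padding bytes (tail padding included). */
size_t report_padding(const struct field_info *f, size_t n, size_t total_size)
{
    size_t cursor = 0;   /* first byte not yet covered by a member */
    size_t padding = 0;

    for (size_t i = 0; i < n; i++) {
        assert(f[i].offset >= cursor);   /* members must be sorted */
        if (f[i].offset > cursor) {
            printf("%zu byte(s) of padding before '%s'\n",
                   f[i].offset - cursor, f[i].name);
            padding += f[i].offset - cursor;
        }
        cursor = f[i].offset + f[i].size;
    }
    assert(total_size >= cursor);
    return padding + (total_size - cursor);   /* add tail padding */
}

struct foo1 { char *p; char c; long x; };

size_t padding_in_foo1(void)
{
    const struct field_info fields[] = {
        { "p", offsetof(struct foo1, p), sizeof(char *) },
        { "c", offsetof(struct foo1, c), sizeof(char)   },
        { "x", offsetof(struct foo1, x), sizeof(long)   },
    };
    return report_padding(fields, 3, sizeof(struct foo1));
}
```

The walk only needs offsets and sizes, which matches the point that the padding is implied by the layout rather than stored as a field.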

--
Mats

Re: [llvm-dev] Struct padding

Hongbin Zheng via cfe-dev
Hi Mats,

When the struct is packed, an explicit byte array is introduced to pad the struct. (I saw this happen in clang 3.9.)

I want to check whether a byte or byte array in an LLVM struct was introduced as explicit padding or not.

I don't need to worry about this problem if the newest clang no longer introduces byte arrays.

Thanks
Hongbin

Re: [llvm-dev] Struct padding

mats petersson via cfe-dev
What do you mean by a byte array being added? At least in my experiments, I don't see that:

struct A
{
    int a;
    char b;
    long c;
};

struct A a;

produces:

; ModuleID = 'pad.c'
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

%struct.A = type { i32, i8, i64 }

@a = common global %struct.A zeroinitializer, align 8

!llvm.ident = !{!0}


Adding a call to printf,

extern int printf(const char *fmt, ...);
void func(struct A* a)
{
    printf("c=%ld", a->c);
}

and outputting assembler, we can see that the offset to "c" in that struct is 8:

func:                                   # @func
    .cfi_startproc
# BB#0:                                 # %entry
    movq    8(%rdi), %rsi
    movl    $.L.str, %edi
    xorl    %eax, %eax
    jmp    printf                  # TAILCALL

So, can you provide an example of this padding? I don't see it. This is clang 3.8, but 3.9 did the same thing (I went back to 3.8 to check whether it was different).

There will be padding in the actual data structure, based on the need for alignment (for better performance, if not required by the hardware), so if we, for example, initialize the data:
struct A a = { 3, 'a', 4711 };
then there will be LLVM-code like this:
@a = global %struct.A { i32 3, i8 97, i64 4711 }, align 8
and in the machine code there will be:
a:
    .long    3                       # 0x3
    .byte    97                      # 0x61
    .zero    3
    .quad    4711                    # 0x1267

That is because three bytes of zeros are needed to fill the gap between the 'a' and the long 4711. But nowhere other than in the machine code is that padding anything more than the "difference between theoretical closest offset and aligned offset".
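
The arithmetic behind the `.zero 3` can be written out as a small sketch. The helper names `align_up` and `padding_needed` are invented here; the numbers assume the x86-64 layout shown above, where `b` ends at offset 5 (a 4-byte `int` plus a 1-byte `char`) and `c` needs 8-byte alignment:

```c
#include <assert.h>
#include <stddef.h>

/* Round 'offset' up to the next multiple of 'align'.
 * 'align' must be a power of two. */
size_t align_up(size_t offset, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/* Padding = aligned offset minus the closest offset the member could
 * theoretically have had. */
size_t padding_needed(size_t offset, size_t align)
{
    return align_up(offset, align) - offset;
}
```

With these helpers, `padding_needed(5, 8)` gives 3, which is the gap between the `.byte` and the `.quad` in the listing.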

--
Mats

Re: [llvm-dev] Struct padding

Hongbin Zheng via cfe-dev
The packed + aligned attributes will automatically introduce an explicit padding byte array:

Sometimes Clang will decide to automatically pack the struct/class in C++. I don't know the details here, but it looks like it is related to inheritance.

Thanks
Hongbin


Re: [llvm-dev] Struct padding

mats petersson via cfe-dev
In that particular example, it's because the WHOLE structure needs to be aligned to 4 bytes, but the contents inside it are packed (because that's what your attributes request: packed, and then aligned to 4).

So, yes, if you use attribute packed or attribute aligned to change the natural alignment WITHIN a structure, then you will (if necessary) get extra elements added to the struct. This is largely because LLVM doesn't have a (good) way to express this in a StructType.

I still don't understand what it is you are trying to do here. This is definitely something clang does, not something LLVM does. Also, I don't think you can tell the difference between a manually padded and an automatically padded struct. If you add a char d[3]; to the struct with the packed, align 4 attributes, it produces the same type. The only difference is that `d` will be zero-initialized, whereas the anonymous padding is `undef` (which allows the compiler to optimise it away at times, I think).
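
The struct from the message being answered is not preserved in this thread, so the following is a hypothetical reconstruction, assuming a 4-byte `int` and a compiler that accepts GCC-style attributes (clang does). It only shows that the two variants end up the same size at the C level; that the IR type is identical is the claim made above, not something this sketch proves:

```c
#include <assert.h>
#include <stddef.h>

/* This sketch assumes a 4-byte int. */
static_assert(sizeof(int) == 4, "sketch assumes a 4-byte int");

/* Packed contents, whole struct aligned to 4: the compiler has to add
 * 3 bytes after 'b' so that the size is a multiple of 4. */
struct __attribute__((packed, aligned(4))) implicit_pad {
    int  a;
    char b;
};

/* Same struct with the padding written out by hand. */
struct __attribute__((packed, aligned(4))) explicit_pad {
    int  a;
    char b;
    char d[3];
};

/* Returns 1 when both variants occupy the same number of bytes. */
int same_size(void)
{
    return sizeof(struct implicit_pad) == sizeof(struct explicit_pad);
}
```

Nothing at this level marks `d` as different from compiler-inserted padding, which is why the two are hard to tell apart afterwards.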

--
Mats

Re: [llvm-dev] Struct padding

David via cfe-dev
On 18 May 2017, at 09:50, mats petersson via cfe-dev <[hidden email]> wrote:
>
> The only difference is that it will zero initialize `d`, where the anonymous padding is `undef` (allows the compiler to optimise it away at times, I think).

This is one of the underspecified corner cases in the C spec (and the subject of some ongoing WG14 discussions).  In particular, for atomic structs to work, struct padding is required to be stable, so undef isn’t quite right (an optimiser is permitted to spot an atomic compare and exchange on a struct containing undef and assume undef != undef, so that it will always fail).  Some architectures (for example, Alpha) make sub-word stores much more expensive, and so field updates on these architectures may modify the following padding (which is the reason for the vagueness in the C spec and why sizeof(T) and sizeof(_Atomic(T)) are not required to be the same: on Alpha you’d likely want _Atomic(char) to be 64 bits).
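
To make the stability concern concrete, here is a small sketch contrasting a field-wise comparison with a byte-wise one (the latter is roughly what a compare and exchange on the whole object does). The helper names are invented for illustration and `struct A` is the one used earlier in the thread. Whether `equal_bytes` reports a match for two objects with equal fields depends on what ended up in the padding, which is exactly the part the C spec leaves open:

```c
#include <assert.h>
#include <string.h>

struct A { int a; char b; long c; };

/* Compare only the named fields; padding bytes never take part. */
int equal_fields(const struct A *x, const struct A *y)
{
    return x->a == y->a && x->b == y->b && x->c == y->c;
}

/* Compare the whole object representation, padding included. */
int equal_bytes(const struct A *x, const struct A *y)
{
    return memcmp(x, y, sizeof *x) == 0;
}

/* Fill every byte (padding too) with 'fill', then store the fields
 * one by one. */
void init_with_fill(struct A *s, int fill)
{
    assert(s != NULL);
    memset(s, fill, sizeof *s);
    s->a = 3;
    s->b = 'a';
    s->c = 4711;
}
```

Initializing one object with a fill of `0x00` and another with `0xFF` gives two structs that `equal_fields` treats as equal, while `equal_bytes` may or may not, depending on the target and compiler.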

It would be nice if LLVM had a way to differentiate between padding and non-padding struct fields (even if it were metadata, because losing the ‘padding’ attribute would impede optimisation but shouldn’t harm correctness).

David
