Redundant byval in C codegen?

Redundant byval in C codegen?

Zhongxing Xu
For this C code,

struct s {
  int a[40];
};

void g(struct s a) {
  a.a[0] = 4;
}

void f() {
  struct s a;
  g(a);
}

clang generates llvm IR:

define void @g(%struct.s* byval %a) nounwind {
entry:
  %tmp = getelementptr inbounds %struct.s* %a, i32 0, i32 0 ; <[40 x i32]*> [#uses=1]
  %arraydecay = getelementptr inbounds [40 x i32]* %tmp, i32 0, i32 0 ; <i32*> [#uses=1]
  %arrayidx = getelementptr inbounds i32* %arraydecay, i64 0 ; <i32*> [#uses=1]
  store i32 4, i32* %arrayidx
  ret void
}

define void @f() nounwind {
entry:
  %a = alloca %struct.s, align 4                  ; <%struct.s*> [#uses=1]
  %agg.tmp = alloca %struct.s                     ; <%struct.s*> [#uses=2]
  %tmp = bitcast %struct.s* %agg.tmp to i8*       ; <i8*> [#uses=1]
  %tmp1 = bitcast %struct.s* %a to i8*            ; <i8*> [#uses=1]
  call void @llvm.memcpy.i64(i8* %tmp, i8* %tmp1, i64 160, i32 4)
  call void @g(%struct.s* byval %agg.tmp)
  ret void
}

Since we have already alloca'ed a temporary struct.s, %agg.tmp, why is there still a 'byval' on g's parameter? The consequence is that when assembly code is generated, we end up allocating three structs on the stack:

f:
.Leh_func_begin2:
pushq %rbp
.Llabel3:
movq %rsp, %rbp
.Llabel4:
subq $496, %rsp


Could somebody explain the rationale behind this behavior? Thanks.
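
To put a number on the three copies: assuming a 4-byte int (which the `i64 160` size in the memcpy implies), one struct s is 160 bytes, so three of them need 480 bytes. That fits in the 496 bytes reserved by `subq $496, %rsp` with 16 bytes left over; the listing does not show what those remaining bytes are used for. A small sketch to check the arithmetic, where the helper names `three_copies` and `fits_in_frame` are made up for illustration:

```c
#include <assert.h>
#include <stddef.h>

struct s {
  int a[40];
};

/* Hypothetical helper: bytes needed for the local `a`, the frontend's
   %agg.tmp, and the copy made for the byval argument. */
static size_t three_copies(void) {
  return 3 * sizeof(struct s);
}

/* Returns 1 when three copies fit in the 496 bytes that f reserves.
   This is the case when int is 4 bytes, as the memcpy size implies. */
static int fits_in_frame(void) {
  assert(sizeof(struct s) == 40 * sizeof(int));
  return three_copies() <= 496;
}
```

Calling `fits_in_frame()` from a test program is enough to confirm the count on a given target.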


_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev

Re: Redundant byval in C codegen?

Rafael Espindola
> Could somebody explain the rationale behind this behavior? Thanks.

I have no idea why we are unable to remove one of the allocas in the
caller, but I think I know why we always keep a copy in the caller and
one in the callee. Take a modified testcase:

----------------------------------------
#include <string.h>
struct s {
  int a[40];
};

void g(struct s a);

void f() {
  struct s a;
  memset(&a, 0, sizeof(struct s));
  g(a);
}
------------------------------

This gets compiled to
----------------------
...
  %agg.tmp = alloca %struct.s, align 8            ; <%struct.s*> [#uses=2]
  %agg.tmp2 = bitcast %struct.s* %agg.tmp to i8*  ; <i8*> [#uses=1]
  call void @llvm.memset.i64(i8* %agg.tmp2, i8 0, i64 160, i32 8)
  call void @g(%struct.s* byval %agg.tmp) nounwind optsize
  ret void
....
---------------------

Ideally we would not have that alloca, since the struct is passed by
value and we now have first-class aggregates. The problem is that a
byval argument is a pointer. This is very good for the callee, since it
correctly represents the fact that the argument is in memory (as
mandated by the ABI). It is not so good for the caller, which is forced
to produce an object in memory that the code generator will then copy.
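
A rough C model of that split, assuming nothing about clang's actual lowering: `g_callee` stands in for the callee, which only ever sees a pointer to memory, and `call_g_byval` stands in for the copy that the code generator performs at the call. All three function names are invented for this sketch.

```c
#include <assert.h>
#include <string.h>

struct s {
  int a[40];
};

/* Callee side: the argument is a pointer to memory the callee owns. */
static int g_callee(struct s *a) {
  a->a[0] = 4;          /* writes to the copy, not the caller's object */
  return a->a[0];
}

/* Call site: models the implicit copy made for a byval argument.
   `src` must point to an object in memory, which is the constraint
   placed on the caller. */
static int call_g_byval(const struct s *src) {
  struct s arg_slot;    /* stands in for the outgoing argument area */
  memcpy(&arg_slot, src, sizeof arg_slot);
  return g_callee(&arg_slot);
}

/* Caller: returns a.a[0] after the call to show it is unchanged. */
static int f_caller(void) {
  struct s a;
  memset(&a, 0, sizeof a);
  int seen_by_callee = call_g_byval(&a);
  assert(seen_by_callee == 4);
  return a.a[0];
}
```

In this model the caller has to materialize `a` in memory only so that `call_g_byval` has something to copy from, which is the cost on the caller side.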

If we were to use first-class aggregates instead, we would have a problem
in the callee, which would no longer have explicit loads.

One of my crazy ideas that I never have time to implement is to change the
call instruction to support passing first-class aggregates to byval
arguments. It already does an implicit copy; this would just add
support for copying things that are not in memory in the caller.

Cheers,
--
Rafael Ávila de Espíndola


Re: Redundant byval in C codegen?

Zhongxing Xu
Does that mean that if we have already created a temporary for a struct
argument, the 'byval' attribute should not be used, so that we avoid an alloca?

2010/1/17 Rafael Espindola <[hidden email]>:

> Ideally we would not have that alloca since the struct is passed by
> value and we now have first class aggregates. The problem is that
> byval is a pointer. This is very good for the callee since it
> correctly represents the fact that the argument is in memory (as
> mandated by the ABI). It is  not so good for the caller that is forced
> to produce an object in memory that will be copied by the code
> generator.


Re: Redundant byval in C codegen?

Rafael Espindola
2010/1/19 Zhongxing Xu <[hidden email]>:
> Does that mean that if we created a temporary for a struct argument,
> 'byval' attribute should not be used to avoid an alloca?

In some cases we do that (small structs). In some cases you don't have
a choice (ABI). In others, not using byval will bloat the IL with
lots of scalar arguments. Not using byval might also worsen the
callee code, since the arguments will be on the stack but nothing
before codegen will know that.

My idea is that

struct s { int a; int b; int c; int d; int e; };
void g(struct s a);
void f() {
  struct s a = {1, 2, 3, 4, 5};
  g(a);
}

could be compiled to something like
-------------------
%struct.s = type { i32, i32, i32, i32, i32 }

define void @f() {
entry:
  call void @g({ i32 1, i32 2, i32 3, i32 4, i32 5})
  ret void
}

declare void @g(%struct.s* byval)
-----------------

We would have these nice properties:
*) It is clear that the caller can store the struct temporary in
any way. In fact, it can optimize it away.
*) It is clear that the callee will get the structure via a memory pointer.

It looks strange at first to have call pass a value to a pointer, but
since it copies the argument, that probably is not a problem.
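
The same idea can be approximated in plain C, purely as an analogy: `g_body` (a made-up name) is the callee that sees a pointer, and the by-value parameter of `g` plays the role of the single implicit copy made by the call. This is not how LLVM would represent it; it only illustrates why handing a value to a pointer-taking callee is safe when the call itself copies.

```c
#include <assert.h>

struct s { int a; int b; int c; int d; int e; };

/* Callee body: works on memory through a pointer, as with byval. */
static int g_body(struct s *p) {
  p->a = 100;           /* modifies only the callee's copy */
  return p->a + p->b + p->c + p->d + p->e;
}

/* The by-value parameter is the one copy; the caller is free to build
   the argument however it likes, including from constants. */
static int g(struct s by_value) {
  return g_body(&by_value);
}

static int f(void) {
  struct s a = {1, 2, 3, 4, 5};
  int sum = g(a);
  assert(a.a == 1);     /* the caller's object is untouched */
  return sum;
}
```

Here the caller never has to prepare a second in-memory temporary by hand, which corresponds to the first property in the list above.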

Cheers,
--
Rafael Ávila de Espíndola
