Automatic variable initialization and copy_from_user()

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

Automatic variable initialization and copy_from_user()

Hans Wennborg via cfe-dev
Hi folks,

Looking at the performance of Linux kernel benchmarks with stack
initialization enabled, I've noticed the following pattern is quite
common in the kernel:

  struct whatever input;
  if (copy_from_user(&input, user_ptr, sizeof(input))
    return -EFAULT;

(For those not familiar with copy_from_user(), it's a function that
copies data from the userspace to the kernel and returns the number of
bytes that weren't copied)

When building with -ftrivial-var-auto-init=pattern, Clang generates
the 0xAAAA initialization despite |input| is either fully initialized
by copy_from_user() or never used.

I'm wondering if we could do better in this case and make the compiler
optimize away the initialization.
As copy_from_user() itself is written in assembly, we couldn't simply
rely on DSE. Instead, we probably need some annotation that prevents
Clang from instrumenting |input|.

Simply using __attribute__((uninitialized)) on |input| doesn't work
well, because a further change to this code may introduce another path
on which |input| isn't initialized.
For the same reason it's incorrect to make Clang apply this attribute
to every destination argument of copy_to_user(), especially given that
the copying may fail.

It would help to mark copy_from_user() as a function that initializes
one of these arguments, but only if it does not fail. We also need to
tell the compiler about the size of that argument, so that the
function prototype will look along the lines of:

unsigned long copy_from_user (void *  to, const void __user *  from,
unsigned long  n)
  __attribute__((param_sizeof(1, 3)) __attribute__((param_init_if_ret0(1)));

, where param_sizeof(1, 3) means that parameter #3 contains the size
of parameter #1, and param_init_if_ret0(1) means that parameter #1 is
initialized iff the function returns 0.

This looks quite clumsy, but maybe can be made more elegant and
general enough to handle other cases.

Does the idea make sense to you? Maybe you've seen other cases in
which a similar optimization can be applied?
To people working on DSE right now: do you think it's possible to
perform such an optimization assuming that we know about the behavior
of copy_from_user()?

Thanks,

--
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Automatic variable initialization and copy_from_user()

Hans Wennborg via cfe-dev
Hi,

> On Feb 11, 2020, at 11:13, Alexander Potapenko <[hidden email]> wrote:
> It would help to mark copy_from_user() as a function that initializes
> one of these arguments, but only if it does not fail. We also need to
> tell the compiler about the size of that argument, so that the
> function prototype will look along the lines of:
>
> unsigned long copy_from_user (void *  to, const void __user *  from,
> unsigned long  n)
>  __attribute__((param_sizeof(1, 3)) __attribute__((param_init_if_ret0(1)));
>
> , where param_sizeof(1, 3) means that parameter #3 contains the size
> of parameter #1, and param_init_if_ret0(1) means that parameter #1 is
> initialized iff the function returns 0.
>
> This looks quite clumsy, but maybe can be made more elegant and
> general enough to handle other cases.
>
> Does the idea make sense to you? Maybe you've seen other cases in
> which a similar optimization can be applied?
> To people working on DSE right now: do you think it's possible to
> perform such an optimization assuming that we know about the behavior
> of copy_from_user()?


Yes, I think MemorySSA backed DSE should be able to handle this (it’s still WIP, but the first patch just landed). The key question is how to convey the fact that a function fully initializes `input` along some paths. The MemorySSA-backed DSE patch series will eliminate the first memset in the example below, if there’s no read access on the error path.

From the DSE perspective, it would be easiest if Clang could inject something like an intrinsic that marks `input` as initialized on the success path (e.g. inserts a call to the not-yet existing @llvm.initialized(%ptr, size)). IMO it seems appropriate to handle this at the Clang level, as it seems quite language/attribute-specific. As a stepping stone, it might be feasible to provide a builtin to the user to specify that an object is initialized along a certain path.


declare void @llvm.memset.p0i8.i64(i8* nocapture, i8, i64, i1) noun-wind
declare i1 @cond() readnone
declare void @use(i8*)

define void @test19(i32* noalias %P) {
entry:
  %obj = alloca [10 x i8]
  %ptr = bitcast [10 x i8]* %obj to i8*
  call void @llvm.memset.p0i8.i64(i8* %ptr, i8 0, i64 10, i1 false)
  %succ = call i1 @cond()
  br i1 %succ, label %copy, label %err

copy:
  call void @llvm.memset.p0i8.i64(i8* %ptr, i8 0, i64 10, i1 false)
  call void @use(i8* %ptr)
  br label %exit

err:
  ;call void @use(i8* %ptr)
  ret void

exit:
  ret void
}
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Automatic variable initialization and copy_from_user()

Hans Wennborg via cfe-dev
On Tue, Feb 11, 2020 at 1:31 PM Florian Hahn <[hidden email]> wrote:

>
> Hi,
>
> > On Feb 11, 2020, at 11:13, Alexander Potapenko <[hidden email]> wrote:
> > It would help to mark copy_from_user() as a function that initializes
> > one of these arguments, but only if it does not fail. We also need to
> > tell the compiler about the size of that argument, so that the
> > function prototype will look along the lines of:
> >
> > unsigned long copy_from_user (void *  to, const void __user *  from,
> > unsigned long  n)
> >  __attribute__((param_sizeof(1, 3)) __attribute__((param_init_if_ret0(1)));
> >
> > , where param_sizeof(1, 3) means that parameter #3 contains the size
> > of parameter #1, and param_init_if_ret0(1) means that parameter #1 is
> > initialized iff the function returns 0.
> >
> > This looks quite clumsy, but maybe can be made more elegant and
> > general enough to handle other cases.
> >
> > Does the idea make sense to you? Maybe you've seen other cases in
> > which a similar optimization can be applied?
> > To people working on DSE right now: do you think it's possible to
> > perform such an optimization assuming that we know about the behavior
> > of copy_from_user()?
>
>
> Yes, I think MemorySSA backed DSE should be able to handle this (it’s still WIP, but the first patch just landed). The key question is how to convey the fact that a function fully initializes `input` along some paths. The MemorySSA-backed DSE patch series will eliminate the first memset in the example below, if there’s no read access on the error path.
>
> From the DSE perspective, it would be easiest if Clang could inject something like an intrinsic that marks `input` as initialized on the success path (e.g. inserts a call to the not-yet existing @llvm.initialized(%ptr, size)). IMO it seems appropriate to handle this at the Clang level, as it seems quite language/attribute-specific. As a stepping stone, it might be feasible to provide a builtin to the user to specify that an object is initialized along a certain path.
>
>
> declare void @llvm.memset.p0i8.i64(i8* nocapture, i8, i64, i1) noun-wind
> declare i1 @cond() readnone
> declare void @use(i8*)
>
> define void @test19(i32* noalias %P) {
> entry:
>   %obj = alloca [10 x i8]
>   %ptr = bitcast [10 x i8]* %obj to i8*
>   call void @llvm.memset.p0i8.i64(i8* %ptr, i8 0, i64 10, i1 false)
>   %succ = call i1 @cond()
>   br i1 %succ, label %copy, label %err
>
> copy:
>   call void @llvm.memset.p0i8.i64(i8* %ptr, i8 0, i64 10, i1 false)
>   call void @use(i8* %ptr)
>   br label %exit
>
> err:
>   ;call void @use(i8* %ptr)
>   ret void
>
> exit:
>   ret void
> }

One alternative for completeness. There is a version of copy_from_user
that always initializes the whole range (if part is not copied from
userspace, it's zeroed). Handling such a version is much easier,
right. However, some versions don't do this (for performance?). Maybe
we could make _all_ copy_from_user's initialize whole range always iff
memory initialization is enabled. Then we could add a simpler
attribute for clang that will convey that copy_from_user initializes
the memory.
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Automatic variable initialization and copy_from_user()

Hans Wennborg via cfe-dev


> On Feb 11, 2020, at 12:38, Dmitry Vyukov <[hidden email]> wrote:
>
> On Tue, Feb 11, 2020 at 1:31 PM Florian Hahn <[hidden email]> wrote:
>>
>> Hi,
>>
>>> On Feb 11, 2020, at 11:13, Alexander Potapenko <[hidden email]> wrote:
>>> It would help to mark copy_from_user() as a function that initializes
>>> one of these arguments, but only if it does not fail. We also need to
>>> tell the compiler about the size of that argument, so that the
>>> function prototype will look along the lines of:
>>>
>>> unsigned long copy_from_user (void *  to, const void __user *  from,
>>> unsigned long  n)
>>> __attribute__((param_sizeof(1, 3)) __attribute__((param_init_if_ret0(1)));
>>>
>>> , where param_sizeof(1, 3) means that parameter #3 contains the size
>>> of parameter #1, and param_init_if_ret0(1) means that parameter #1 is
>>> initialized iff the function returns 0.
>>>
>>> This looks quite clumsy, but maybe can be made more elegant and
>>> general enough to handle other cases.
>>>
>>> Does the idea make sense to you? Maybe you've seen other cases in
>>> which a similar optimization can be applied?
>>> To people working on DSE right now: do you think it's possible to
>>> perform such an optimization assuming that we know about the behavior
>>> of copy_from_user()?
>>
>>
>> Yes, I think MemorySSA backed DSE should be able to handle this (it’s still WIP, but the first patch just landed). The key question is how to convey the fact that a function fully initializes `input` along some paths. The MemorySSA-backed DSE patch series will eliminate the first memset in the example below, if there’s no read access on the error path.
>>
>> From the DSE perspective, it would be easiest if Clang could inject something like an intrinsic that marks `input` as initialized on the success path (e.g. inserts a call to the not-yet existing @llvm.initialized(%ptr, size)). IMO it seems appropriate to handle this at the Clang level, as it seems quite language/attribute-specific. As a stepping stone, it might be feasible to provide a builtin to the user to specify that an object is initialized along a certain path.
>>
>>
>> declare void @llvm.memset.p0i8.i64(i8* nocapture, i8, i64, i1) noun-wind
>> declare i1 @cond() readnone
>> declare void @use(i8*)
>>
>> define void @test19(i32* noalias %P) {
>> entry:
>>  %obj = alloca [10 x i8]
>>  %ptr = bitcast [10 x i8]* %obj to i8*
>>  call void @llvm.memset.p0i8.i64(i8* %ptr, i8 0, i64 10, i1 false)
>>  %succ = call i1 @cond()
>>  br i1 %succ, label %copy, label %err
>>
>> copy:
>>  call void @llvm.memset.p0i8.i64(i8* %ptr, i8 0, i64 10, i1 false)
>>  call void @use(i8* %ptr)
>>  br label %exit
>>
>> err:
>>  ;call void @use(i8* %ptr)
>>  ret void
>>
>> exit:
>>  ret void
>> }
>
> One alternative for completeness. There is a version of copy_from_user
> that always initializes the whole range (if part is not copied from
> userspace, it's zeroed). Handling such a version is much easier,
> right. However, some versions don't do this (for performance?). Maybe
> we could make _all_ copy_from_user's initialize whole range always iff
> memory initialization is enabled. Then we could add a simpler
> attribute for clang that will convey that copy_from_user initializes
> the memory.

That would also work from the DSE side of things. I’m not sure if/how functions that are marked to initialize a pointer are handled, but adding that to DSE should be easy. One more thing to keep in mind: if we just rely on function attributes, that might lead to missed eliminations if copy_from_user gets inlined and we fail to determine that it initialize a parameter from its contents (e.g. due to it using inline-assembly)

Cheers,
Florian
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev