[RFC] Zeroing Caller Saved Regs


Hubert Tong via cfe-dev
[This feature addresses https://bugs.llvm.org/show_bug.cgi?id=37880
and https://github.com/KSPP/linux/issues/84.]

Clang has been ramping up its support of the Linux kernel. We recently
added "asm goto with outputs", a long requested feature. We want to
continue building our relationship with the Linux community.

KSPP is a project to improve security in the Linux kernel, through
both kernel changes and compiler features. One compiler feature they
want is the ability to zero out caller-saved registers on function
return as a defense against stale register contents being used as a
side-channel or speculation path.

The option will be "opt-in" for each target. Targets that don't
support the flag should probably emit a warning or error.

Our proposal for the feature is modeled off of H. J. Lu's
description[1] (copied with some modifications):

```
Add -mzero-caller-saved-regs=[skip|used-gpr|all-gpr|used|all]
command-line option and zero_caller_saved_regs function attributes:

* Don't zero caller-saved registers upon function return (default):

    -mzero-caller-saved-regs=skip
    zero_caller_saved_regs("skip")

* Zero used caller-saved integer registers upon function return:

    -mzero-caller-saved-regs=used-gpr
    zero_caller_saved_regs("used-gpr")

* Zero all integer registers upon function return:

    -mzero-caller-saved-regs=all-gpr
    zero_caller_saved_regs("all-gpr")

* Zero used caller-saved integer and vector registers upon function return:

    -mzero-caller-saved-regs=used
    zero_caller_saved_regs("used")

* Zero all caller-saved integer and vector registers upon function return:

    -mzero-caller-saved-regs=all
    zero_caller_saved_regs("all")
```
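
A minimal Python sketch of how the five modes might select registers, assuming the x86-64 System V calling convention. The register lists, the `registers_to_zero` helper, and the reading of `all-gpr` as "all caller-saved integer registers" are assumptions for illustration and not part of the proposal. A real implementation would also have to leave untouched any register that carries the return value.

```python
# Caller-saved registers under the x86-64 System V ABI (assumed target).
CALLER_SAVED_GPR = {"rax", "rcx", "rdx", "rsi", "rdi", "r8", "r9", "r10", "r11"}
CALLER_SAVED_VEC = {f"xmm{i}" for i in range(16)}

def registers_to_zero(mode, used):
    """Return the set of registers a function would zero before returning.

    `mode` is one of the option values from the proposal; `used` is the set
    of registers the function body actually wrote (hypothetical input).
    """
    used = set(used)
    if mode == "skip":
        return set()
    if mode == "used-gpr":
        return used & CALLER_SAVED_GPR
    if mode == "all-gpr":
        return set(CALLER_SAVED_GPR)
    if mode == "used":
        return used & (CALLER_SAVED_GPR | CALLER_SAVED_VEC)
    if mode == "all":
        return CALLER_SAVED_GPR | CALLER_SAVED_VEC
    raise ValueError(f"unknown mode: {mode}")

# A function that touched two scratch GPRs, one vector register, and callee-saved rbx.
used = {"rdi", "rcx", "xmm0", "rbx"}
for mode in ("skip", "used-gpr", "all-gpr", "used", "all"):
    print(mode, sorted(registers_to_zero(mode, used)))
```

Note that callee-saved `rbx` is never selected in this model: the callee restores it before returning, so it is outside the scope of the option.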

-bw

[1] https://github.com/clearlinux-pkgs/gcc/blob/master/0001-x86-Add-mzero-caller.patch
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

Re: [llvm-dev] [RFC] Zeroing Caller Saved Regs

Hubert Tong via cfe-dev
Hi Bill,

The thing missing from the RFC, though tangentially referenced in a couple of the links, is the threat model for this.

We’ve looked in the past (and had a paper at EuroS&P a few years ago) at the ways in which the compiler can leak secrets that the C abstract machine believes are erased but can be found via a stack memory safety gadget.  Fully addressing this requires zeroing spill slots and may also require spilling temporary registers across calls rather than placing values in callee-save registers (though if *all* code is compiled defensively, then the callee can be trusted to zero spill slots for these).

It appears as if this proposal is to address a weaker security model: leaking secrets via transient execution vulnerabilities.  As such, it is not concerned with secrets spilled to the stack or leaked from the caller to callee in callee-save registers.  

The latter almost makes sense.  These registers are normally spilled unconditionally on entry to a function if they are used.  The presence of shrink wrapping invalidates this assumption though, and there are transient execution paths through a function that uses shrink wrapping that will leak secrets from the caller.  I would therefore expect that this option would disable shrink wrapping.

The former makes less sense to me.  Anything that loads from the stack and does something with the result is a potential gadget for leaking secrets spilled to the stack by a prior call via transient execution side channels and, even in the presence of this defence, it seems likely that there are a large number of useful gadgets remaining.  

I think it would be useful for the discussion to have a clear threat model that this intends to defend against and a rough analysis of the security benefits that this is believed to bring.  

David

> On 7 Aug 2020, at 00:12, Bill Wendling via llvm-dev <[hidden email]> wrote:
>
> [This feature addresses https://bugs.llvm.org/show_bug.cgi?id=37880
> and https://github.com/KSPP/linux/issues/84.]
>
> Clang has been ramping up its support of the Linux kernel. We recently
> added "asm goto with outputs", a long requested feature. We want to
> continue building our relationship with the Linux community.
>
> KSPP is a project to improve security in the Linux kernel, through
> both kernel changes and compiler features. One compiler feature they
> want is the ability to zero out caller-saved registers on function
> return as a defense against stale register contents being used as a
> side-channel or speculation path.
>
> The option will be "opt-in" for each target. Targets that don't
> support the flag should probably emit a warning or error.
>
> Our proposal for the feature is modeled off of H. J. Lu's
> description[1] (copied with some modifications):
>
> ```
> Add -mzero-caller-saved-regs=[skip|used-gpr|all-gpr|used|all]
> command-line option and zero_caller_saved_regs function attributes:
>
> * Don't zero caller-saved registers upon function return (default):
>
>    -mzero-caller-saved-regs=skip
>    zero_caller_saved_regs("skip")
>
> * Zero used caller-saved integer registers upon function return:
>
>    -mzero-caller-saved-regs=used-gpr
>    zero_caller_saved_regs("used-gpr")
>
> * Zero all integer registers upon function return:
>
>    -mzero-caller-saved-regs=all-gpr
>    zero_caller_saved_regs("all-gpr")
>
> * Zero used caller-saved integer and vector registers upon function return:
>
>    -mzero-caller-saved-regs=used
>    zero_caller_saved_regs("used")
>
> * Zero all caller-saved integer and vector registers upon function return:
>
>    -mzero-caller-saved-regs=all
>    zero_caller_saved_regs("all")
> ```
>
> -bw
>
> [1] https://github.com/clearlinux-pkgs/gcc/blob/master/0001-x86-Add-mzero-caller.patch


Re: [llvm-dev] [RFC] Zeroing Caller Saved Regs

Hubert Tong via cfe-dev
On Fri, Aug 7, 2020 at 1:18 AM David Chisnall
<[hidden email]> wrote:
> I think it would be useful for the discussion to have a clear threat model that this intends to defend against and a rough analysis of the security benefits that this is believed to bring.

I view this as being even more about a ROP defense. Dealing with spill
slots is, IMO, a separate issue, more related to the auto-var-init
work (though that would be stack erasure on function exit, rather than
entry, which addresses a different set of issues). I think this thread
from the GCC list has some good details on the ROP defense:

https://gcc.gnu.org/pipermail/gcc-patches/2020-August/551607.html

--
Kees Cook

Re: [llvm-dev] [RFC] Zeroing Caller Saved Regs

Hubert Tong via cfe-dev
Thanks,

On 07/08/2020 23:28, Kees Cook wrote:

> On Fri, Aug 7, 2020 at 1:18 AM David Chisnall
> <[hidden email]> wrote:
>> I think it would be useful for the discussion to have a clear threat model that this intends to defend against and a rough analysis of the security benefits that this is believed to bring.
>
> I view this as being even more about a ROP defense. Dealing with spill
> slots is, IMO, a separate issue, more related to the auto-var-init
> work (though that would be stack erasure on function exit, rather than
> entry, which addresses a different set of issues). I think this thread
> from the GCC list has some good details on the ROP defense:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2020-August/551607.html

This link gives two motivations:

1. Reducing information leak (which I find unconvincing, because there's
a lot more left on the stack than in caller-save registers).
2. Reducing ROP gadgets.

Unfortunately, for claim 2 they cite a paper that is behind a paywall,
so I can't easily see what that's doing and I'll have to guess what the
paper says:

Caller-save registers are intuitively useful in the first gadget in a
ROP sequence, because the current frame will have put values into them
(and so they are most likely to hold attacker-controlled values).  I can
imagine quite easily a paper that shows that you break the first gadget
in a chain with this mitigation.

It's possible that it would also significantly reduce the number of
total gadgets if each ret is preceded by the zeroing sequence,
effectively denying the attacker the ability to use these registers.
Unfortunately, to be able to make arbitrary calls they would just need
one unguarded forward control-flow edge that loaded a function pointer
and its arguments from the stack, and I can't imagine that such a gadget
is absent from most nontrivial codebases.  I'd like to see an analysis
of the gadgets remaining when this mitigation is used.

I don't object to adding a flag that makes the Linux kernel slower, but
if it is being advertised as a security feature then I would like to see
some evidence that it does something other than require automated attack
tools to pick a different set of gadgets to use.

David


Re: [RFC] Zeroing Caller Saved Regs

Hubert Tong via cfe-dev
On Fri, 7 Aug 2020 at 00:12, Bill Wendling via cfe-dev
<[hidden email]> wrote:

>
> [This feature addresses https://bugs.llvm.org/show_bug.cgi?id=37880
> and https://github.com/KSPP/linux/issues/84.]
>
> Clang has been ramping up its support of the Linux kernel. We recently
> added "asm goto with outputs", a long requested feature. We want to
> continue building our relationship with the Linux community.
>
> KSPP is a project to improve security in the Linux kernel, through
> both kernel changes and compiler features. One compiler feature they
> want is the ability to zero out caller-saved registers on function
> return as a defense against stale register contents being used as a
> side-channel or speculation path.
>
> The option will be "opt-in" for each target. Targets that don't
> support the flag should probably emit a warning or error.
>
> Our proposal for the feature is modeled off of H. J. Lu's
> description[1] (copied with some modifications):
>
> ```
> Add -mzero-caller-saved-regs=[skip|used-gpr|all-gpr|used|all]
> command-line option and zero_caller_saved_regs function attributes:
>
> * Don't zero caller-saved registers upon function return (default):
>
>     -mzero-caller-saved-regs=skip
>     zero_caller_saved_regs("skip")
>
> * Zero used caller-saved integer registers upon function return:
>
>     -mzero-caller-saved-regs=used-gpr
>     zero_caller_saved_regs("used-gpr")
>
> * Zero all integer registers upon function return:
>
>     -mzero-caller-saved-regs=all-gpr
>     zero_caller_saved_regs("all-gpr")
>
> * Zero used caller-saved integer and vector registers upon function return:
>
>     -mzero-caller-saved-regs=used
>     zero_caller_saved_regs("used")
>
> * Zero all caller-saved integer and vector registers upon function return:
>
>     -mzero-caller-saved-regs=all
>     zero_caller_saved_regs("all")
> ```
>
> -bw
>
> [1] https://github.com/clearlinux-pkgs/gcc/blob/master/0001-x86-Add-mzero-caller.patch

While the above might have a significant impact on performance, adding
this sort of security control at the C and C++ level would be very
helpful.

e.g.

    int function_containing_secret() {
        int secret = 12345;
        /* do some work with the secret */
        secret = 0;
        return 0;
    }

Now, with compiler optimisations, the last "secret = 0;" will get
optimised out, as the value is never read afterwards.
Having some way to enforce that the "secret = 0" remains and that, at
the assembler level, whichever registers and temporary stack slots it
was stored in are wiped, could be a useful addition.

I don't quite understand how the zero-all-values helps in the ROP
case, because the zero-all-value part can be bypassed by the malicious
user.

Kind Regards

James

Re: [llvm-dev] [RFC] Zeroing Caller Saved Regs

Hubert Tong via cfe-dev
The presence of those 4 different zeroing modes feels to me like this gcc feature was added speculatively -- with the intent of gathering some data on the various tradeoffs and impacts of the different options, rather than with the expectation that it's necessarily going to be actually useful. Has that investigation already been done? If so, which of the modes is useful, and for what?

On Fri, Aug 7, 2020 at 4:18 AM David Chisnall via cfe-dev <[hidden email]> wrote:
Hi Bill,

The thing missing from the RFC, though tangentially referenced in a couple of the links, is the threat model for this.

We’ve looked in the past (and had a paper at EuroS&P a few years ago) at the ways in which the compiler can leak secrets that the C abstract machine believes are erased but can be found via a stack memory safety gadget.  Fully addressing this requires zeroing spill slots and may also require spilling temporary registers across calls rather than placing values in callee-save registers (though if *all* code is compiled defensively, then the callee can be trusted to zero spill slots for these).

It appears as if this proposal is to address a weaker security model: leaking secrets via transient execution vulnerabilities.  As such, it is not concerned with secrets spilled to the stack or leaked from the caller to callee in callee-save registers. 

The latter almost makes sense.  These registers are normally spilled unconditionally on entry to a function if they are used.  The presence of shrink wrapping invalidates this assumption though, and there are transient execution paths through a function that uses shrink wrapping that will leak secrets from the caller.  I would therefore expect that this option would disable shrink wrapping.

The former makes less sense to me.  Anything that loads from the stack and does something with the result is a potential gadget for leaking secrets spilled to the stack by a prior call via transient execution side channels and, even in the presence of this defence, it seems likely that there are a large number of useful gadgets remaining. 

I think it would be useful for the discussion to have a clear threat model that this intends to defend against and a rough analysis of the security benefits that this is believed to bring. 

David

> On 7 Aug 2020, at 00:12, Bill Wendling via llvm-dev <[hidden email]> wrote:
>
> [This feature addresses https://bugs.llvm.org/show_bug.cgi?id=37880
> and https://github.com/KSPP/linux/issues/84.]
>
> Clang has been ramping up its support of the Linux kernel. We recently
> added "asm goto with outputs", a long requested feature. We want to
> continue building our relationship with the Linux community.
>
> KSPP is a project to improve security in the Linux kernel, through
> both kernel changes and compiler features. One compiler feature they
> want is the ability to zero out caller-saved registers on function
> return as a defense against stale register contents being used as a
> side-channel or speculation path.
>
> The option will be "opt-in" for each target. Targets that don't
> support the flag should probably emit a warning or error.
>
> Our proposal for the feature is modeled off of H. J. Lu's
> description[1] (copied with some modifications):
>
> ```
> Add -mzero-caller-saved-regs=[skip|used-gpr|all-gpr|used|all]
> command-line option and zero_caller_saved_regs function attributes:
>
> * Don't zero caller-saved registers upon function return (default):
>
>    -mzero-caller-saved-regs=skip
>    zero_caller_saved_regs("skip")
>
> * Zero used caller-saved integer registers upon function return:
>
>    -mzero-caller-saved-regs=used-gpr
>    zero_caller_saved_regs("used-gpr")
>
> * Zero all integer registers upon function return:
>
>    -mzero-caller-saved-regs=all-gpr
>    zero_caller_saved_regs("all-gpr")
>
> * Zero used caller-saved integer and vector registers upon function return:
>
>    -mzero-caller-saved-regs=used
>    zero_caller_saved_regs("used")
>
> * Zero all caller-saved integer and vector registers upon function return:
>
>    -mzero-caller-saved-regs=all
>    zero_caller_saved_regs("all")
> ```
>
> -bw
>
> [1] https://github.com/clearlinux-pkgs/gcc/blob/master/0001-x86-Add-mzero-caller.patch

Re: [llvm-dev] [RFC] Zeroing Caller Saved Regs

Hubert Tong via cfe-dev
On Mon, Aug 10, 2020 at 3:34 AM David Chisnall
<[hidden email]> wrote:

>
> Thanks,
>
> On 07/08/2020 23:28, Kees Cook wrote:
> > On Fri, Aug 7, 2020 at 1:18 AM David Chisnall
> > <[hidden email]> wrote:
> >> I think it would be useful for the discussion to have a clear threat model that this intends to defend against and a rough analysis of the security benefits that this is believed to bring.
> >
> > I view this as being even more about a ROP defense. Dealing with spill
> > slots is, IMO, a separate issue, more related to the auto-var-init
> > work (though that would be stack erasure on function exit, rather than
> > entry, which addresses a different set of issues). I think this thread
> > from the GCC list has some good details on the ROP defense:
> >
> > https://gcc.gnu.org/pipermail/gcc-patches/2020-August/551607.html
>
> This link gives two motivations:
>
> 1. Reducing information leak (which I find unconvincing, because there's
> a lot more left on the stack than in caller-save registers).
> 2. Reducing ROP gadgets.
>
> Unfortunately, for claim 2 they cite a paper that is behind a paywall,
> so I can't easily see what that's doing and I'll have to guess what the
> paper says:
>
> Caller-save registers are intuitively useful in the first gadget in a
> ROP sequence, because the current frame will have put values into them
> (and so they are most likely to hold attacker-controlled values).  I can
> imagine quite easily a paper that shows that you break the first gadget
> in a chain with this mitigation.
>
> It's possible that it would also significantly reduce the number of
> total gadgets if each ret is preceded by the zeroing sequence,
> effectively denying the ability for the attacker to use these registers.
>   Unfortunately, to be able to make arbitrary calls they would just need
> one unguarded forward control-flow edge that loaded a function pointer
> and its arguments from the stack, and I can't imagine that such a gadget
> is absent from most nontrivial codebases.  I'd like to see an analysis
> of the gadgets remaining when this mitigation is used.
>
> I don't object to adding a flag that makes the Linux kernel slower, but
> if it is being advertised as a security feature then I would like to see
> some evidence that it does something other than require automated attack
> tools to pick a different set of gadgets to use.
>
After reading the paper they link to, I'm rethinking this feature. :-)

From what I can gather from the paper, they use a tool to determine
which scratch (caller saved) registers are used in a function call.
They then use some type of instrumentation to zero out those scratch
registers. This can apparently break the chain. For example, in line
17 below, RDI will be zeroed out, as will RSI in line 19:

1: p = ''
2: p += pack('<Q', 0x0000000000401627) # pop rsi ; ret
3: p += pack('<Q', 0x00000000006ca080) # @ .data
4: p += pack('<Q', 0x00000000004784d6) # pop rax ; pop rdx ; pop rbx ; ret
5: p += '/bin//sh'
6: p += pack('<Q', 0x4141414141414141) # padding
7: p += pack('<Q', 0x4141414141414141) # padding
8: p += pack('<Q', 0x0000000000473f81) # mov qword ptr [rsi], rax ; ret
9: p += pack('<Q', 0x0000000000401627) # pop rsi ; ret
10: p += pack('<Q', 0x00000000006ca088) # @ .data + 8
11: p += pack('<Q', 0x0000000000425e3f) # xor rax, rax ; ret
12: p += pack('<Q', 0x0000000000473f81) # mov qword ptr [rsi], rax ; ret
13: p += pack('<Q', 0x00000000004784d6) # pop rax ; pop rdx ; pop rbx ; ret
14: p += p64(59) # execve syscall number
15: p += pack('<Q', 0x4141414141414141) # padding
16: p += pack('<Q', 0x4141414141414141) # padding
17: p += pack('<Q', 0x0000000000401506) # pop rdi ; ret
18: p += pack('<Q', 0x00000000006ca080) # @ .data
19: p += pack('<Q', 0x0000000000401627) # pop rsi ; ret
20: p += pack('<Q', 0x00000000006ca088) # @ .data + 8
21: p += pack('<Q', 0x0000000000442636) # pop rdx ; ret
22: p += pack('<Q', 0x00000000006ca088) # @ .data + 8
23: p += pack('<Q', 0x0000000000467175) # syscall ; ret

Their instrumentation is impractical though as it increases the
runtime by over 16x.

My guess is that inserting zeroing instructions right before the "ret"
instruction can disable some of the hacks we see with ROP:

   `pop rdi ; ret` becomes `pop rdi ; xor rdi, rdi ; ret`
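
As a rough illustration of that guess, the following Python sketch models a tiny machine with a stack and two registers. The gadget tables and the `run_chain` helper are made up for this example; the addresses are borrowed from the listing above. It only shows why a `pop rdi ; ret` gadget stops being useful for loading attacker data once a zeroing instruction sits between the pop and the ret.

```python
# Toy model of a ROP chain: each gadget is a list of micro-ops followed by an
# implicit `ret`. Helper names and the end marker are hypothetical.

def run_chain(gadgets, stack, entry):
    """Execute gadgets from `entry`, taking values and return addresses from `stack`."""
    regs = {"rdi": None, "rsi": None}
    stack = list(stack)              # stack[0] is the top of the stack
    pc = entry
    while pc in gadgets:
        for op, reg in gadgets[pc]:
            if op == "pop":
                regs[reg] = stack.pop(0)
            elif op == "zero":       # models `xor reg, reg`
                regs[reg] = 0
        # implicit `ret`: the next address comes from the stack
        pc = stack.pop(0) if stack else None
    return regs

PLAIN = {
    0x401506: [("pop", "rdi")],
    0x401627: [("pop", "rsi")],
}
HARDENED = {
    0x401506: [("pop", "rdi"), ("zero", "rdi")],
    0x401627: [("pop", "rsi"), ("zero", "rsi")],
}

# Attacker-controlled stack: value for rdi, next gadget, value for rsi, end marker.
payload = [0x6CA080, 0x401627, 0x6CA088, 0xDEADBEEF]

plain_regs = run_chain(PLAIN, payload, 0x401506)
hardened_regs = run_chain(HARDENED, payload, 0x401506)
print(plain_regs, hardened_regs)
```

The model deliberately ignores gadgets that begin in the middle of an instruction, which a compiler-inserted zeroing sequence cannot guard.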

-bw

Re: [llvm-dev] [RFC] Zeroing Caller Saved Regs

Hubert Tong via cfe-dev
On Wed, Aug 12, 2020 at 02:44:59PM -0700, Bill Wendling wrote:
> My guess is that inserting zeroing instructions right before the "ret"
> instruction can disable some of the hacks we see with ROP:
>
>    `pop rdi ; ret` becomes `pop rdi ; xor rdi, rdi ; ret`

Right; this isn't meant to be a perfect defense. Nothing can be, really.
But it narrows the opportunities available to an attacker (whether it be
ROP, exposures, speculation, etc). The more deterministic the execution
paths, the lower the chance that each given path is both useful (i.e.
does work that helps an attacker) and available (i.e. can be "reached"
through some specific bug) to an attacker.

Given the near-zero cost (in both runtime and code size) of self-xor-ing
registers, it's a "free" change that has a greater-than-zero cost to an
attacker.

--
Kees Cook

Re: [llvm-dev] [RFC] Zeroing Caller Saved Regs

Hubert Tong via cfe-dev
On Wed, Aug 12, 2020 at 2:59 PM Kees Cook <[hidden email]> wrote:

>
> On Wed, Aug 12, 2020 at 02:44:59PM -0700, Bill Wendling wrote:
> > My guess is that inserting zeroing instructions right before the "ret"
> > instruction can disable some of the hacks we see with ROP:
> >
> >    `pop rdi ; ret` becomes `pop rdi ; xor rdi, rdi ; ret`
>
> Right; this isn't meant to be a perfect defense. Nothing can be, really.
> But it narrows the opportunities available to an attacker (whether it be
> ROP, exposures, speculation, etc). The more deterministic the execution
> paths, the lower the chance that each given path is both useful (i.e.
> does work that helps an attacker) and available (i.e. can be "reached"
> through some specific bug) to an attacker.
>
> Given the near-zero cost (in both runtime and code size) of self-xor-ing
> registers, it's a "free" change that has a greater-than-zero cost to an
> attacker.
>
I wanted to clarify that the 16x slowdown was in the authors'
implementation, which used instrumentation to inject code. But yeah,
this could help limit the avenues open to attackers.

-bw

Re: [llvm-dev] [RFC] Zeroing Caller Saved Regs

Hubert Tong via cfe-dev


> On Aug 12, 2020, at 17:44, Bill Wendling via llvm-dev <[hidden email]> wrote:
>
> My guess is that inserting zeroing instructions right before the "ret"
> instruction can disable some of the hacks we see with ROP:
>
>   `pop rdi ; ret` becomes `pop rdi ; xor rdi, rdi ; ret`

Three comments on this.
1. The very first ROP paper [1] used only unintended instruction sequences. That is, none of the return instructions were placed there by the compiler, they appeared completely within other instructions.
2. ROP doesn't require any return instructions [2]. It can be performed using call or jmp instructions.
3. As binaries get larger, the number of available instruction sequences from which one can build gadgets increases dramatically. If the goal is to make one system call like mprotect, you don't need very many at all. If you want to get arbitrary computation using ROP and something like mprotect doesn't exist (e.g., on a Harvard architecture machine), you only need a few tens of kilobytes of code. I did it on the Z80 with 16 kB of code with a hardware interlock that forced instructions to be fetched from ROM [3].
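
To make comment 1 concrete, here is a small Python sketch that searches a byte string for `pop rdi ; ret` at every offset, including offsets inside another instruction. The byte encodings are standard x86-64 (`5f` is `pop rdi`, `c3` is `ret`, `48 31 ff` is `xor rdi, rdi`, `b8` followed by four bytes is `mov eax, imm32`); the `find_gadget_offsets` name and the sample instruction stream are assumptions made up for illustration.

```python
# Look for the two-byte pattern `5f c3` (pop rdi ; ret) at every byte offset,
# including offsets that fall inside another instruction.

POP_RDI_RET = bytes([0x5F, 0xC3])

def find_gadget_offsets(code: bytes, pattern: bytes = POP_RDI_RET):
    """Return every offset at which `pattern` occurs in `code`."""
    offsets = []
    start = code.find(pattern)
    while start != -1:
        offsets.append(start)
        start = code.find(pattern, start + 1)
    return offsets

# Hypothetical hardened function tail followed by an unrelated instruction:
#   5f              pop rdi
#   48 31 ff        xor rdi, rdi     <- zeroing inserted by the compiler
#   c3              ret
#   b8 5f c3 00 00  mov eax, 0xc35f  <- immediate happens to contain 5f c3
code = bytes.fromhex("5f4831ffc3" "b85fc30000")

offsets = find_gadget_offsets(code)
print(offsets)
```

In this sample the intended `pop rdi` at offset 0 no longer forms a two-byte gadget, but the immediate operand of the `mov` still supplies one that the compiler never emitted as such.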

There have been a bunch of defenses that purport to make attacks harder by decreasing the number of useful instruction sequences available to the attacker. They don't have a significant impact on attacks.

That's not to say that this couldn't be useful, but I'm skeptical it would defend against ROP, or even make a ROP attack much more difficult.

1. https://hovav.net/ucsd/dist/geometry.pdf
2. https://checkoway.net/papers/noret_ccs2010/noret_ccs2010.pdf
3. https://checkoway.net/papers/evt2009/evt2009.pdf

--
Stephen Checkoway




Re: [llvm-dev] [RFC] Zeroing Caller Saved Regs

Hubert Tong via cfe-dev
On Wed, Aug 12, 2020 at 8:38 PM Stephen Checkoway <[hidden email]> wrote:

> > On Aug 12, 2020, at 17:44, Bill Wendling via llvm-dev <[hidden email]> wrote:
> >
> > My guess is that inserting zeroing instructions right before the "ret"
> > instruction can disable some of the hacks we see with ROP:
> >
> >   `pop rdi ; ret` becomes `pop rdi ; xor rdi, rdi ; ret`
>
> Three comments on this.
> 1. The very first ROP paper [1] used only unintended instruction sequences. That is, none of the return instructions were placed there by the compiler, they appeared completely within other instructions.
> 2. ROP doesn't require any return instructions [2]. It can be performed using call or jmp instructions.

Sure, but the authors of the paper claim that it's incredibly
difficult to have *only* COP / JOP gadgets. At some point you'll need
to have an ROP gadget:

"Usually, the gadgets of ROP end with a return instruction which we
called conventional ROP attacks. Call Oriented Programming (COP) [8]
and Jump-Oriented Programming (JOP) [9] are the variations of ROP
attacks without returns [10]. The variations use gadgets that end with
indirect call or jump instruction. However, performing ROP attacks
without return instruction in reality is difficult for the reason that
the gadgets of COP and JOP that can form a completed gadget chain are
almost nonexistent. Actually, adversaries prefer to use combinational
gadgets to evade current protection mechanisms."

> 3. As binaries get larger, the number of available instruction sequences from which one can build gadgets increases dramatically. If the goal is to make one system call like mprotect, you don't need very many at all. If you want to get arbitrary computation using ROP and something like mprotect doesn't exist (e.g., on a Harvard architecture machine), you only need a few tens of kilobytes of code. I did it on the Z80 with 16 kB of code with a hardware interlock that forced instructions to be fetched from ROM [3].
>
> There have been a bunch of defenses that purport to make attacks harder by decreasing the number of useful instruction sequences available to the attacker. They don't have a significant impact on attacks.
>
> That's not to say that this couldn't be useful, but I'm skeptical it would defend against ROP, or even make a ROP attack much more difficult.
>
This is why having variable length instructions sucks. :-)

I see your point. I was actually looking at the code we generate with
the pop/xor if you start at different offsets in the code when your
email came in.

-bw

Re: [llvm-dev] [RFC] Zeroing Caller Saved Regs

Hubert Tong via cfe-dev


> On Aug 13, 2020, at 00:01, Bill Wendling <[hidden email]> wrote:
>
> On Wed, Aug 12, 2020 at 8:38 PM Stephen Checkoway <[hidden email]> wrote:
>>> On Aug 12, 2020, at 17:44, Bill Wendling via llvm-dev <[hidden email]> wrote:
>>>
>>> My guess is that inserting zeroing instructions right before the "ret"
>>> instruction can disable some of the hacks we see with ROP:
>>>
>>>  `pop rdi ; ret` becomes `pop rdi ; xor rdi, rdi ; ret`
>>
>> Three comments on this.
>> 1. The very first ROP paper [1] used only unintended instruction sequences. That is, none of the return instructions were placed there by the compiler, they appeared completely within other instructions.
>> 2. ROP doesn't require any return instructions [2]. It can be performed using call or jmp instructions.
>
> Sure, but the authors of the paper claim that it's incredibly
> difficult to have *only* COP / JOP gadgets. At some point you'll need
> to have an ROP gadget:
>
> "Usually, the gadgets of ROP end with a return instruction which we
> called conventional ROP attacks. Call Oriented Programming (COP) [8]
> and Jump-Oriented Programming (JOP) [9] are the variations of ROP
> attacks without returns [10]. The variations use gadgets that end with
> indirect call or jump instruction. However, performing ROP attacks
> without return instruction in reality is difficult for the reason that
> the gadgets of COP and JOP that can form a completed gadget chain are
> almost nonexistent. Actually, adversaries prefer to use combinational
> gadgets to evade current protection mechanisms."

That's not entirely wrong, but also not entirely correct. The key insight that makes return-oriented programming without returns work is that you only need a single "update-load-branch" sequence. If you can get the address of that into a register, then any sequence of instructions ending in an indirect jump through that register is usable. And it turns out there are plenty of those.

So the hard part is finding that one update-load-branch sequence. In my paper, I used pop edx; jmp edx and pointed out it can be done using other instructions like call (which is exactly what I believe the COP paper later did). I didn't find any pop/jmp sequences in libc, but they did exist in large libraries at the time like libxul and libphp5. Of course, my work was all on 32-bit x86 years ago. Modern x86-64 codegen may make finding them easier or harder, I haven't looked.
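
The update-load-branch idea can be sketched with a toy Python model. Everything here is hypothetical (the addresses, the `run_jop` helper, the choice of `ebx` to hold the dispatcher address); it only illustrates how a single `pop edx ; jmp edx` sequence can chain gadgets that end in an indirect jump, with no return instruction involved.

```python
# Toy model of "return-oriented programming without returns".
# Every gadget ends in an indirect jump through a register; no `ret` is modeled.

DISPATCHER = 0x1000   # models `pop edx ; jmp edx` (the update-load-branch sequence)
INC_EAX = 0x2000      # models `inc eax ; jmp ebx`

GADGETS = {
    DISPATCHER: ([("pop", "edx")], "edx"),
    INC_EAX: ([("inc", "eax")], "ebx"),
}

def run_jop(stack, entry=DISPATCHER):
    """Run gadgets until control reaches an address that is not a gadget."""
    regs = {"eax": 0, "ebx": DISPATCHER, "edx": 0}   # ebx preloaded with the dispatcher
    stack = list(stack)                               # stack[0] is the top of the stack
    trace = []
    pc = entry
    while pc in GADGETS:
        trace.append(pc)
        body, branch_reg = GADGETS[pc]
        for op, reg in body:
            if op == "pop":
                regs[reg] = stack.pop(0)
            elif op == "inc":
                regs[reg] += 1
        pc = regs[branch_reg]        # indirect jump, never a return
    return regs, trace

# Attacker-controlled stack: run the functional gadget twice, then stop at 0.
regs, trace = run_jop([INC_EAX, INC_EAX, 0])
print(regs["eax"], [hex(a) for a in trace])
```

Zeroing registers before `ret` would not interrupt this loop in the model, since control never passes through a return.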

>
>> 3. As binaries get larger, the number of available instruction sequences from which one can build gadgets increases dramatically. If the goal is to make one system call like mprotect, you don't need very many at all. If you want to get arbitrary computation using ROP and something like mprotect doesn't exist (e.g., on a Harvard architecture machine), you only need a few tens of kilobytes of code. I did it on the Z80 with 16 kB of code with a hardware interlock that forced instructions to be fetched from ROM [3].
>>
>> There have been a bunch of defenses that purport to make attacks harder by decreasing the number of useful instruction sequences available to the attacker. They don't have a significant impact on attacks.
>>
>> That's not to say that this couldn't be useful, but I'm skeptical it would defend against ROP, or even make a ROP attack much more difficult.
>>
> This is why having variable length instructions sucks. :-)

One of several reasons. :) Alas, ROP works on fixed-length instructions too.

> I see your point. I was actually looking at the code we generate with
> the pop/xor if you start at different offsets in the code when your
> email came in.

There's a simple recursive algorithm using a trie in Shacham's original ROP paper to find all such sequences. I believe ROPgadget implemented it or something similar at some point [1].

1. https://github.com/JonathanSalwan/ROPgadget
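
For illustration, a minimal sketch of that backward search follows. It is not ROPgadget's code and not the paper's exact algorithm: the five-entry `TOY_DECODER` table stands in for a real disassembler, and all of the function names are made up for this example. The idea is to start at every `c3` byte and repeatedly try each instruction that could end at the current position, storing the results in a trie rooted at `ret`.

```python
# Toy decoder: maps a handful of x86-64 encodings to mnemonics.
# A real tool would use a full disassembler; this table is an assumption
# made to keep the sketch self-contained.
TOY_DECODER = {
    b"\xc3": "ret",
    b"\x5f": "pop rdi",
    b"\x41\x5f": "pop r15",
    b"\x31\xff": "xor edi, edi",
    b"\x48\x31\xff": "xor rdi, rdi",
}
MAX_INSN_LEN = 3  # longest encoding in the toy table (real x86 allows 15)


def extend(node, code, end):
    """Add to `node` every instruction that ends exactly at offset `end`."""
    for length in range(1, MAX_INSN_LEN + 1):
        start = end - length
        if start < 0:
            break
        name = TOY_DECODER.get(bytes(code[start:end]))
        if name is None or name == "ret":
            continue  # undecodable, or a ret that would end the gadget early
        child = node.setdefault(name, {})
        extend(child, code, start)  # keep walking backward from this start


def build_gadget_trie(code):
    """Trie rooted at `ret`; each child is an instruction that precedes its parent."""
    root = {}
    for offset, byte in enumerate(code):
        if byte == 0xC3:
            extend(root, code, offset)
    return root


def list_gadgets(node, suffix=("ret",)):
    """Flatten the trie into human-readable instruction sequences."""
    found = []
    for name, child in node.items():
        sequence = (name,) + suffix
        found.append(" ; ".join(sequence))
        found.extend(list_gadgets(child, sequence))
    return found


# Bytes of the intended sequence `pop rdi ; xor rdi, rdi ; ret`.
for gadget in list_gadgets(build_gadget_trie(bytes.fromhex("5f4831ffc3"))):
    print(gadget)
```

On those five bytes the search also reports `xor edi, edi ; ret`, which starts one byte into the intended `xor rdi, rdi`. Likewise, the bytes `41 5f c3` (`pop r15 ; ret`) contain `pop rdi ; ret` at offset 1.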

Best,

Steve

--
Stephen Checkoway




Re: [llvm-dev] [RFC] Zeroing Caller Saved Regs

Hubert Tong via cfe-dev
On 13/08/2020 05:01, Bill Wendling via cfe-dev wrote:
 > Sure, but the authors of the paper claim that it's incredibly
 > difficult to have *only* COP / JOP gadgets. At some point you'll need
 > to have an ROP gadget:
 >
 > "Usually, the gadgets of ROP end with a return instruction which we
 > called conventional ROP attacks. Call Oriented Programming (COP) [8]
 > and Jump-Oriented Programming (JOP) [9] are the variations of ROP
 > attacks without returns [10]. The variations use gadgets that end with
 > indirect call or jump instruction. However, performing ROP attacks
 > without return instruction in reality is difficult for the reason that
 > the gadgets of COP and JOP that can form a completed gadget chain are
 > almost nonexistent. Actually, adversaries prefer to use combinational
 > gadgets to evade current protection mechanisms."

I don't believe that this transform eliminates the set of useful ROP
gadgets.  A stack overwrite that writes over the return address can also
overwrite anything in the spill area, so even if you clear all caller-save
registers, there will be sequences that allow you to modify one of the
values in the stack frame that you return to.  In C++ codebases, it's
fairly common to see an object pointer stored in a callee-save register
and then used for multiple calls, so even without overwriting the return
address you have a nice entry point for a gadget chain using only
forward control-flow integrity violations in the rest of the chain.

Most code reuse attacks these days use weird machine compilers to
analyse a binary (either acquired offline or dumped via a memory
disclosure gadget).  Reducing the set of gadgets but still leaving enough
for a gadget chain does not make it harder for an attacker; it just makes
their compiler work slightly harder.  It's analogous to claiming that
it's harder for a programmer to write C code for architectures with no
FPU.  It makes life slightly harder for the toolchain, but not for its user.

As I said, I don't object to adding this feature as a checkbox-ticking
exercise so that the Linux kernel can opt into the same security theatre
with both gcc and clang; I object only to advertising it as a security
feature.

David
