[RFC] Allocatable Global Register Variables for ARM

Richard Smith via cfe-dev

Hi all,

This is an RFC on support for Global Register Variables in the Arm backend.

Whilst there has been some prior discussion about whether LLVM should (or even needs to) support global register variables,
today both Clang and LLVM offer a good measure of support for them (although it is currently limited to SP).
When most of this support landed, some concern was expressed about the difficulty of extending it to cover allocatable registers.
We have been looking at building on top of the current support to provide the ability to reserve allocatable registers.

Our primary (bare-metal) use-cases for such support are:

    1. Holding pointers for frequently-accessed data.
    2. Placement of secure values (e.g. checksums) in specific registers to prevent them from being written out to main memory.

This proposal aims to provide support for using r4-r11 as global register variables. This involves adding them to the reserved register set
and preventing them from being callee-saved. We have deliberately tried to avoid registers that have a distinct ABI/AAPCS use,
such as call-clobbered registers, LR and PC etc. Naturally the current support for the stack-pointer remains.

r7, r9 and r11 are special cases and are discussed in more detail below.

Clang Changes
----------------------
The main proposed functional change to Clang is the tracking of global register variable declarations via module flags.
Each declaration in a translation unit, such as "register unsigned int foo __asm("r4");", will be mapped by the front-end to a module flag.

e.g.
---
!{i32 1, !"fixed_reg.foo", !"r4"}
---

This is achieved via a modification to CodeGenModule::EmitGlobal. In addition, there are some changes to how
valid global registers are specified (adding an Arm override for isValidGlobalRegister).

Draft Patch: https://reviews.llvm.org/D56003
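
To make the mapping concrete, here is a rough, standalone C sketch of the two steps described above: checking that a register name is one the proposal allows, and rendering the module flag for one declaration. It is not code from the patch. The function names (is_valid_global_reg, format_fixed_reg_flag) and the table are made up for illustration, and SP, which is already supported, is left out for brevity.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical table: the registers this proposal would accept (r4-r11). */
static const char *const allowed_regs[] = {
    "r4", "r5", "r6", "r7", "r8", "r9", "r10", "r11"
};

/* Returns 1 if `reg` names one of the allowed registers, 0 otherwise.
   Loosely models the role of an isValidGlobalRegister override. */
int is_valid_global_reg(const char *reg)
{
    assert(reg != NULL);
    for (size_t i = 0; i < sizeof allowed_regs / sizeof allowed_regs[0]; ++i)
        if (strcmp(reg, allowed_regs[i]) == 0)
            return 1;
    return 0;
}

/* Writes the textual module flag for `register T var __asm("reg");` into
   `out`. Returns the snprintf result, or -1 if the register is rejected.
   The leading "i32 1" mirrors the example flag shown above. */
int format_fixed_reg_flag(char *out, size_t cap,
                          const char *var, const char *reg)
{
    assert(out != NULL && var != NULL);
    if (!is_valid_global_reg(reg))
        return -1;
    return snprintf(out, cap, "!{i32 1, !\"fixed_reg.%s\", !\"%s\"}", var, reg);
}
```

Calling format_fixed_reg_flag(buf, sizeof buf, "foo", "r4") reproduces the flag text from the example, while a call-clobbered register such as "r0" is rejected.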

LLVM Changes
----------------------
In the LLVM backend we can then use the module flags to add any (valid) specified global registers to the
reserved register list (getReservedRegs). In addition, to satisfy the second use-case, we use the flags to remove
any (valid) global registers from the list of callee-saved registers (determineCalleeSaves).
This shouldn't have any negative impact on a register that is already reserved for use.

GCC seems to exhibit similar behaviour:

    "If the register is a call-saved register, call ABI is affected: the register will not be restored in function epilogue sequences after the variable has been assigned."
        See: https://gcc.gnu.org/onlinedocs/gcc/Global-Register-Variables.html#Global-Register-Variables

Draft Patch: https://reviews.llvm.org/D56005
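
As an illustration only, the two backend adjustments can be modelled with plain bitmasks, where bit N stands for rN. This is a simplified sketch and not the LLVM implementation: the helper names are invented here, and the real code works on LLVM's register and BitVector types rather than a 16-bit integer.

```c
#include <assert.h>
#include <stdint.h>

/* r4..r11, the callee-saved core registers the proposal targets. */
#define CALLEE_SAVED_MASK ((uint16_t)0x0FF0u)

/* Returns the mask bit for register rN (N in 0..15). */
uint16_t reg_bit(unsigned n)
{
    assert(n < 16);
    return (uint16_t)(1u << n);
}

/* Models the getReservedRegs change: add the requested global registers
   to the reserved set, silently ignoring anything outside r4-r11. */
uint16_t add_reserved(uint16_t reserved, uint16_t requested)
{
    return (uint16_t)(reserved | (requested & CALLEE_SAVED_MASK));
}

/* Models the determineCalleeSaves change: a register holding a global
   register variable must not be saved and restored around a function. */
uint16_t filter_callee_saves(uint16_t to_save, uint16_t global_regs)
{
    return (uint16_t)(to_save & ~global_regs);
}
```

With r4 requested as a global register, add_reserved() marks it reserved and filter_callee_saves() drops it from the save list, so the epilogue never overwrites the variable's value.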

A Note on r7, r11 and r9
----------------------------------
Whilst generally considered allocatable, these registers can on occasion be reserved for other purposes:
r7 and r11 as frame pointers, and r9 via -ffixed-r9 (or -frwpi).

The attached patches do not currently provide any mitigations against these cases.
Options could range from disallowing these registers entirely, to emitting warnings, to
detecting the conflicting scenarios and raising an error (e.g. use of -ffixed-r9 when r9 is declared as a global register variable).

GCC's behaviour for many registers (most notably the call-clobbered registers, e.g. r0-r3) is to emit a warning rather than an error.

Other Notes
----------------------
- It's worth noting that we may look into extending this to cover AArch64 in the near future.
- Extending this proposal to work with a -ffixed-reg option should be feasible (if desired).
- GCC warns when two global variables refer to the same register - this implementation silently accepts it.

Previous Patches/Discussion
------------------------------------------
- https://reviews.llvm.org/D3261
- https://reviews.llvm.org/D3797
- http://lists.llvm.org/pipermail/llvm-dev/2014-March/071472.html

IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Re: [llvm-dev] [RFC] Allocatable Global Register Variables for ARM

Richard Smith via cfe-dev
On 12/21/2018 8:08 AM, Carey Williams via llvm-dev wrote:
> Hi all,
>
> This is a RFC on support for Global Register Variables in the Arm backend.
>
> Whilst there has been some prior discussion about whether or not LLVM should (or even needs to) support global register variables,
> today there seems to be a good measure of support for this in both Clang+LLVM (although it is currently limited to SP).
> When most of this support landed there was some concern expressed around the difficulty of extending it to cover allocatable registers.
> We have been looking at building atop of this current support to provide that ability to reserve allocatable registers.
>
> Our primary (bare-metal) use-cases for such support are:
>
>     1. Holding pointers for frequently-accessed data.
>     2. Placement of secure values (e.g. checksums) in specific registers to prevent them from being written out to main memory.
>

As a side-note, you might want to check that prologue/epilogue emission won't emit a PUSH/POP that refers to a register reserved this way; we sometimes add an "extra" register to align the stack.



> This proposal aims to provide support for using r4-r11 as global register variables. This involves adding them to the reserved register set
> and preventing them from being callee-saved. We have deliberately tried to avoid registers that have a distinct ABI/AAPCS use,
> such as call-clobbered registers, LR and PC etc. Naturally the current support for the stack-pointer remains.


Restricting this specifically to r4-r11 definitely makes sense; allowing other registers would be hard.


> r7, r11 and r9 at least are special cases, and are mentioned in more detail below.
>
> Clang Changes
> ----------------------
> The main proposed functional change to Clang is the tracking of global register variable declarations via module flags.
> Each declaration in a translation unit such as "register unsigned int foo __asm("r4");", will be mapped by the front-end to a module flag.
>
> e.g.
> ---
> !{i32 1, !"fixed_reg.foo", !"r4"}
> ---
>
> This is achieved via a modification to CodeGenModule::EmitGlobal. In addition, there are some changes relating
> to specifying valid global registers (by adding an Arm override for isValidGlobalRegister),
>
> Draft Patch: https://reviews.llvm.org/D56003


Why did you decide to use global metadata here?  For AArch64, we use a target feature instead, to implement roughly equivalent functionality (the reserve-x18 feature, to implement -ffixed-x18).


Making a global register declaration have side-effects never made sense, IMO; on the surface, it's using variable declaration syntax, but in reality it's actually changing the ABI rules for the whole file.  I would prefer to support -ffixed-r4, and never allow global register declarations to modify the ABI. This subset should be compatible with gcc, as far as I know.


(Compiler flags that affect the ABI are easy to misuse, but clang and gcc have a long tradition of flags which change the ABI, so it's not really worse than what we already do.)



> A Note on r7, r11 and r9
> ----------------------------------
> Whilst generally considered allocatable, these registers can on occasion be reserved for other purposes.
> As frame pointers, in the case of r7, and r11, and via -ffixed-r9 (or -frwpi), in the case of r9.
>
> The attached patches do not currently provide any mitigations against these cases.
> Options could range from disallowing these registers entirely, to throwing warnings or
> trying to catch and error in the correct scenarios (e.g. usage -ffixed-r9 when r9 is declared as a global reg variable).


We probably want to emit an error here, to avoid confusion.


-Eli


-- 
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project

Re: [llvm-dev] [RFC] Allocatable Global Register Variables for ARM

Richard Smith via cfe-dev

Thank you for your reply Eli,

I too was working with Carey on this feature, so please let me reply.

On 12/21/18 8:05 PM, Friedman, Eli via llvm-dev wrote:
> As a side-note, you might want to check that prologue/epilogue emission won't emit a PUSH/POP that refers to a register reserved this way; we sometimes add an "extra" register to align the stack.

Yes, you are right. 
Checking determineCalleeSaves(), we see that it maintains a number of free Register pools, UnspilledCS1GPRs, UnspilledCS2GPRs and AvailableRegs, which are used to find Registers that can be used for the extra callee saves you mentioned.
There probably are more like this. Thanks for pointing that out.
We will investigate and extend our design.

> Why did you decide to use global metadata here? For AArch64, we use a target feature instead, to implement roughly equivalent functionality (the reserve-x18 feature, to implement -ffixed-x18).
> Making a global register declaration have side-effects never made sense, IMO; on the surface, it's using variable declaration syntax, but in reality it's actually changing the ABI rules for the whole file.
> I would prefer to support -ffixed-r4, and never allow global register declarations to modify the ABI. This subset should be compatible with gcc, as far as I know.
>
> (Compiler flags that affect the ABI are easy to misuse, but clang and gcc have a long tradition of flags which change the ABI, so it's not really worse than what we already do.)

We were looking for a solution which works with LTO.
While we investigated a possible -ffixed-reg flag, our understanding was that it would only work as long as it sets module metadata in the IR.
Adding a -ffixed-reg option in the LLVM backend, and adding that option to the -cc1 command-line, would not work because that would not get passed through to the backend when LTO is used. So our belief was that one could later implement the -ffixed-reg flag upon the module metadata added by this patch.

Is this the target feature mechanism you explained? https://llvm.org/docs/WritingAnLLVMBackend.html#subtarget-support
I still have not gone through the specifics of that but do you know if it would work with LTO?

Thanks
Amilendra

Re: [llvm-dev] [RFC] Allocatable Global Register Variables for ARM

Richard Smith via cfe-dev
On 1/4/2019 1:49 AM, Amilendra Kodithuwakku wrote:
>> Why did you decide to use global metadata here?  For AArch64, we use a target feature instead, to implement roughly equivalent functionality (the reserve-x18 feature, to implement -ffixed-x18).
>> Making a global register declaration have side-effects never made sense, IMO; on the surface, it's using variable declaration syntax, but in reality it's actually changing the ABI rules for the whole file.
>> I would prefer to support -ffixed-r4, and never allow global register declarations to modify the ABI. This subset should be compatible with gcc, as far as I know.
>>
>> (Compiler flags that affect the ABI are easy to misuse, but clang and gcc have a long tradition of flags which change the ABI, so it's not really worse than what we already do.)
>
> We were looking for a solution which works with LTO.
> While we investigated a possible -ffixed-reg flag, our understanding was that it would only work as long as it sets module metadata in the IR.
> Adding a -ffixed-reg option in the LLVM backend, and adding that option to the -cc1 command-line, would not work because that would not get passed through to the backend when LTO is used. So our belief was that one could later implement the -ffixed-reg flag upon the module metadata added by this patch.
>
> Is this the target feature mechanism you explained? https://llvm.org/docs/WritingAnLLVMBackend.html#subtarget-support
> I still have not gone through the specifics of that but do you know if it would work with LTO?

You're correct that any solution that works with LTO must encode the information into the object file.  But module metadata isn't the best way to do that here.  The approach I'm suggesting is to use the "target-features" attribute on each function in the file.  See https://reviews.llvm.org/D46552 / https://reviews.llvm.org/D48581 / https://reviews.llvm.org/D46552 / etc.
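
For readers unfamiliar with the attribute, a "target-features" value is a comma-separated list of "+feature" entries attached to each function, which is why it survives into LTO. The standalone C sketch below only shows how a set of -ffixed-rN requests might be turned into such a string. The "reserve-rN" spelling is an assumption modelled on the reserve-x18 name mentioned earlier, and build_target_features is a made-up helper, not an LLVM or Clang API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Builds e.g. "+reserve-r5,+reserve-r6" from a mask where bit N means
   "-ffixed-rN was requested". Feature names are assumed, not confirmed.
   Returns the string length, or -1 if `out` is too small. */
int build_target_features(char *out, size_t cap, uint16_t fixed_regs)
{
    assert(out != NULL && cap > 0);
    size_t len = 0;
    out[0] = '\0';
    for (int r = 0; r < 16; ++r) {
        if (!(fixed_regs & (1u << r)))
            continue;
        /* Separate entries with a comma, but not before the first one. */
        int n = snprintf(out + len, cap - len, "%s+reserve-r%d",
                         len ? "," : "", r);
        if (n < 0 || (size_t)n >= cap - len)
            return -1; /* output would have been truncated */
        len += (size_t)n;
    }
    return (int)len;
}
```

Because the resulting string is stored on every function in the IR, the backend sees the same reservation whether it runs at compile time or at link time under LTO.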


-Eli

Re: [llvm-dev] [RFC] Allocatable Global Register Variables for ARM

Richard Smith via cfe-dev

Hi Eli,

> You're correct that any solution that works with LTO must encode the information into the object file.
> But module metadata isn't the best way to do that here.
> The approach I'm suggesting is to use the "target-features" attribute on each function in the file.  
> See https://reviews.llvm.org/D46552 /  https://reviews.llvm.org/D48581 / https://reviews.llvm.org/D46552 / etc.

We took your advice and redid our work to use target-features instead of module metadata.
With these patches, declaring the register variable in source will not reserve a register;
to reserve register rN, -ffixed-rN must be specified on the command line.
Reserving a register rN with -ffixed-rN also prevents it from being callee-saved.
While we initially expected to support the full callee-saved register set (R4-R11),
the source code shows that R4 is hard-coded as a scratch register under various conditions,
so for now we have narrowed the supported range of registers to R5-R11.

As you pointed out previously, extra checks have been implemented so that reserved registers will not be spilled for stack alignment purposes.
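
A simplified model of that check, written as standalone C rather than taken from the patch: when an extra register has to be pushed purely to keep the stack aligned, any reserved register is skipped. The function name and the "lowest free register in r4-r11" policy are assumptions for illustration; the real selection logic in the ARM frame lowering code is more involved.

```c
#include <assert.h>
#include <stdint.h>

/* Picks a register to push purely for stack alignment. Bit N of each mask
   stands for rN. Returns the lowest register in r4-r11 that is neither
   already being saved nor reserved, or -1 if no such register exists. */
int pick_alignment_reg(uint16_t already_saved, uint16_t reserved)
{
    /* A reserved register should never have entered the save set. */
    assert((already_saved & reserved) == 0);
    for (int r = 4; r <= 11; ++r) {
        uint16_t bit = (uint16_t)(1u << r);
        if (!(already_saved & bit) && !(reserved & bit))
            return r;
    }
    return -1;
}
```

For example, with r4 already saved and r5 reserved via -ffixed-r5, the sketch picks r6; pushing r5 instead would later pop a stale value over the reserved register.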

Effect on existing behavior
----------------------------
1. -ffixed-r9 and -frwpi
The previous implementation of both -ffixed-r9 and -frwpi reserved the register R9, but did not stop it from being callee-saved.
With the new implementation, specifying either flag reserves R9 and also stops it from being callee-saved.


Errors for conflicting options
------------------------------
The following combinations of flags will raise errors.
1. -ffixed-r7 specified without -fomit-frame-pointer or with -fno-omit-frame-pointer
2. -ffixed-r11 specified without -fomit-frame-pointer or with -fno-omit-frame-pointer
3. -ffixed-r9 specified with -frwpi


Draft Patches
-------------
Clang: https://reviews.llvm.org/D56003
LLVM: https://reviews.llvm.org/D56005

Thank you

Amilendra




