Is force_align_arg_pointer function attribute supported at x86?

Is force_align_arg_pointer function attribute supported at x86?

Sumner, Brian via cfe-dev
Hello

I am trying to compile an x86 operating system on Arch Linux. I need
to align the function stack to make SSE work. There is a function
attribute __attribute__((force_align_arg_pointer)) that works great
with gcc 7.x.

But when I compile my sources with the following command line, clang
complains that force_align_arg_pointer is unknown. Looking at the
generated assembly, I see that stack alignment is not enforced.


clang -c start_64.c -o start_64.o -g -ggdb -nostdlib -ffreestanding
-std=c11 -fno-stack-protector -mno-red-zone -fno-common -W -Wall
-Wextra -O3 -I../../include
start_64.c:12:25: warning: unknown attribute 'force_align_arg_pointer'
ignored [-Wunknown-attributes]
NORETURN __attribute__((force_align_arg_pointer)) void start_64(void) {
                        ^
1 warning generated.

Here is my clang:

$ clang --version
clang version 4.0.0 (tags/RELEASE_400/final)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

Is there a special command-line option that enables the
force_align_arg_pointer attribute?
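
For builds that must work with compilers both with and without the attribute, one option is to detect it at preprocessing time. The following is only a sketch: the macro names FORCE_ALIGN_ARG_POINTER and HAVE_FORCE_ALIGN_ARG_POINTER and the function entry_probe() are made up for illustration, and it assumes a compiler that provides __has_attribute (Clang, or GCC 5 and later).

```c
/* __has_attribute is provided by Clang and by GCC 5+; treat it as "no"
 * on compilers that lack it. */
#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

#if __has_attribute(force_align_arg_pointer)
#define FORCE_ALIGN_ARG_POINTER __attribute__((force_align_arg_pointer))
#define HAVE_FORCE_ALIGN_ARG_POINTER 1
#else
/* Attribute unavailable: expand to nothing so no -Wunknown-attributes
 * warning is emitted; the caller must then align the stack itself. */
#define FORCE_ALIGN_ARG_POINTER
#define HAVE_FORCE_ALIGN_ARG_POINTER 0
#endif

/* Hypothetical function standing in for an entry point such as start_64().
 * It reports whether the compiler accepted the attribute. */
FORCE_ALIGN_ARG_POINTER int entry_probe(void) {
    return HAVE_FORCE_ALIGN_ARG_POINTER;
}
```

When HAVE_FORCE_ALIGN_ARG_POINTER is 0, the stack has to be aligned by hand before control reaches the function.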
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

Re: Is force_align_arg_pointer function attribute supported at x86?

Sumner, Brian via cfe-dev
Hi Anatol,

On 21 June 2017 at 13:30, Anatol Pomozov via cfe-dev
<[hidden email]> wrote:
> There is a function
> attribute __attribute__((force_align_arg_pointer)) that works great
> with gcc 7.x.

I think the issue is that this attribute only applies to 32-bit x86
(i.e. compiling with -m32). All x86_64 ABIs require sufficient stack
alignment for SSE instructions so Clang doesn't support it there.

Essential stack realignment probably needs to be done in assembly if
you're writing an OS or equivalent low-level code.

Tim.

Re: Is force_align_arg_pointer function attribute supported at x86?

Sumner, Brian via cfe-dev
Hi

On Wed, Jun 21, 2017 at 1:42 PM, Tim Northover <[hidden email]> wrote:

> Hi Anatol,
>
> On 21 June 2017 at 13:30, Anatol Pomozov via cfe-dev
> <[hidden email]> wrote:
>> There is a function
>> attribute __attribute__((force_align_arg_pointer)) that works great
>> with gcc 7.x.
>
> I think the issue is that this attribute only applies to 32-bit x86
> (i.e. compiling with -m32). All x86_64 ABIs require sufficient stack
> alignment for SSE instructions so Clang doesn't support it there.
>
> Essential stack realignment probably needs to be done in assembly if
> you're writing an OS or equivalent low-level code.

Thanks for the explanation.

In my example above start_64() is the entry point for 64 mode. 32-bit
code makes a long jump and does not follow x86_64 stack alignment
policy. Having __attribute__((force_align_arg_pointer)) in this 64bit
code is a valid use-case IMO. But as you suggested, the other way to do
stack alignment is to do it explicitly in the 32-bit code.

Could you please help me with the 32bit assembly that emulates
__attribute__((force_align_arg_pointer)) in the callee?

I tried to do this in my 32-bit code
    __asm__ volatile("and %0, %%esp"::"irm"(-16));
    __asm__ volatile("jmp %0, %1" ::"i"(CODE_SELECTOR), "p"(start_64));
that compiled into

"and $0xfffffff0,%esp"
jmp $SELECTOR,start_64

but it does not provide the correct stack alignment.

Re: Is force_align_arg_pointer function attribute supported at x86?

Sumner, Brian via cfe-dev
On 21 June 2017 at 14:21, Anatol Pomozov <[hidden email]> wrote:
> In my example above start_64() is the entry point for 64 mode. 32-bit
> code makes a long jump and does not follow x86_64 stack alignment
> policy. Having __attribute__((force_align_arg_pointer)) in this 64bit
> code is a valid use-case IMO.

The jump has to be made with inline assembly, that puts the whole
control transfer well outside the compiler's purview. That said, I
doubt there'd be strenuous objections to supporting it if patches
turned up.

> I tried to do this in my 32-bit code
>     __asm__ volatile("and %0, %%esp"::"irm"(-16));
>     __asm__ volatile("jmp %0, %1" ::"i"(CODE_SELECTOR), "p"(start_64));
> that compiled into
>
> "and $0xfffffff0,%esp"
> jmp $SELECTOR,start_64
>
> but it does not provide the correct stack alignment.

What goes wrong? I'd wire up a debugger and inspect the state. Some
things that might be useful to probe:

  * Is %esp properly aligned before the far jump?
  * Does the jump involve a change in privilege level, or other weird
x86ism that replaces SP entirely?
  * Does some shim between reaching 64-bit mode and start_64 mess with it?
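
If no debugger is at hand, the first check can be approximated from the 64-bit side with a small helper. This is a sketch only: read_stack_pointer() and is_aligned_16() are hypothetical names, the helper relies on GCC/Clang extended inline assembly, and it is implemented only for x86_64.

```c
#include <stdint.h>

/* Hypothetical debugging helper: returns the current value of %rsp.
 * Other targets are not implemented in this sketch and yield 0. */
uint64_t read_stack_pointer(void) {
#if defined(__x86_64__)
    uint64_t sp;
    __asm__ volatile("mov %%rsp, %0" : "=r"(sp));
    return sp;
#else
    return 0;
#endif
}

/* Nonzero if the given address is a multiple of 16. */
int is_aligned_16(uint64_t address) {
    return (address % 16) == 0;
}
```

The value is sampled inside the helper, after its own prologue has run, so it is better compared against a debugger reading than treated as the exact stack pointer at the entry of the calling function.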

The Intel manual also seems to imply that the high bits of %rsp
shouldn't be relied on after a transition to 64-bit mode (from section
3.4.1.1):

"Because the upper 32 bits of 64-bit general-purpose registers are
undefined in 32-bit modes, the upper 32 bits of any general-purpose
register are not preserved when switching from 64-bit mode to a 32-bit
mode (to protected mode or compatibility mode). Software must not
depend on these bits to maintain a value after a 64-bit to 32-bit mode
switch."

I don't know enough about x86 microarchitectures to say what really
happens (maybe it's zeroed and fine) but random data would definitely
cause problems.

Cheers.

Tim.

Re: Is force_align_arg_pointer function attribute supported at x86?

Sumner, Brian via cfe-dev


On 22 June 2017 at 03:12, Tim Northover via cfe-dev <[hidden email]> wrote:
> On 21 June 2017 at 14:21, Anatol Pomozov <[hidden email]> wrote:
>> In my example above start_64() is the entry point for 64 mode. 32-bit
>> code makes a long jump and does not follow x86_64 stack alignment
>> policy. Having __attribute__((force_align_arg_pointer)) in this 64bit
>> code is a valid use-case IMO.
>
> The jump has to be made with inline assembly, that puts the whole
> control transfer well outside the compiler's purview. That said, I
> doubt there'd be strenuous objections to supporting it if patches
> turned up.
>
>> I tried to do this in my 32-bit code
>>     __asm__ volatile("and %0, %%esp"::"irm"(-16));
>>     __asm__ volatile("jmp %0, %1" ::"i"(CODE_SELECTOR), "p"(start_64));
>> that compiled into
>>
>> "and $0xfffffff0,%esp"
>> jmp $SELECTOR,start_64
>>
>> but it does not provide the correct stack alignment.
>
> What goes wrong? I'd wire up a debugger and inspect the state. Some
> things that might be useful to probe:
>
>   * Is %esp properly aligned before the far jump?
>   * Does the jump involve a change in privilege level, or other weird
> x86ism that replaces SP entirely?
>   * Does some shim between reaching 64-bit mode and boot_64 mess with it?
>
> The Intel manual also seems to imply that the high bits of %rsp
> shouldn't be relied on after a transition to 64-bit mode (from section
> 3.4.1.1):
>
> "Because the upper 32 bits of 64-bit general-purpose registers are
> undefined in 32-bit modes, the upper 32 bits of any general-purpose
> register are not preserved when switching from 64-bit mode to a 32-bit
> mode (to protected mode or compatibility mode). Software must not
> depend on these bits to maintain a value after a 64-bit to 32-bit mode
> switch."

I'm reasonably sure that the rule on x86-64 is that the upper 32 bits of a register are always zero when operating in 32-bit mode. That certainly applied to the processors I worked on at AMD when they first came out. A 32-bit operation, no matter what mode the processor is in, will force zeros into the upper 32 bits, and I think this can still be depended on. What CAN'T be depended on is that the upper bits survive a mode switch: in particular, a 16- or 32-bit mode operation may well change something that you didn't expect to change.

I'd be hugely surprised if the upper bits of RSP aren't zero in this case. (I'm assuming this is a "user-mode -> user-mode" jump, and not a jump through a call gate on the selector.)

Obviously, OTHER things may go wrong in the transition (or just after); it is hard to say without knowing more details.
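
The register-width rule can be illustrated with a small C model. It is a sketch of the documented 64-bit-mode behaviour only (a 32-bit result is zero-extended to 64 bits, while 8- and 16-bit results leave the upper bits untouched); the function names are hypothetical, and the model says nothing about what happens to the upper bits across a mode switch.

```c
#include <stdint.h>

/* Model of writing a 32-bit result to a general-purpose register in
 * 64-bit mode: the old upper 32 bits are discarded and replaced by zeros. */
uint64_t write_reg32(uint64_t old_value, uint32_t result) {
    (void)old_value; /* nothing of the old value survives */
    return (uint64_t)result;
}

/* Model of writing a 16-bit result: only the low 16 bits change and the
 * upper 48 bits of the old value are preserved. */
uint64_t write_reg16(uint64_t old_value, uint16_t result) {
    return (old_value & ~UINT64_C(0xFFFF)) | (uint64_t)result;
}
```

Under this model, an instruction such as "and $-16, %esp" executed in 64-bit mode would leave the upper half of %rsp zero, whereas a 16-bit operation on %sp would keep whatever the upper bits held before.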

--
Mats

> I don't know enough about x86 microarchitectures to say what really
> happens (maybe it's zeroed and fine) but random data would definitely
> cause problems.
>
> Cheers.
>
> Tim.

Re: Is force_align_arg_pointer function attribute supported at x86?

Sumner, Brian via cfe-dev
On Wed, Jun 21, 2017 at 02:21:45PM -0700, Anatol Pomozov via cfe-dev wrote:
> I tried to do this in my 32-bit code
>     __asm__ volatile("and %0, %%esp"::"irm"(-16));
>     __asm__ volatile("jmp %0, %1" ::"i"(CODE_SELECTOR), "p"(start_64));
> that compiled into
>
> "and $0xfffffff0,%esp"
> jmp $SELECTOR,start_64
>
> but it does not provide the correct stack alignment.

Depending on the exact position in the calling sequence, you actually
want to subtract 8 after the AND.

Joerg

Re: Is force_align_arg_pointer function attribute supported at x86?

Sumner, Brian via cfe-dev
> Depending on the exact position in the calling sequence, you actually
> want to subtract 8 after the AND.

Oh, well spotted! I'd forgotten x86 pushed the return address as part
of the call.

Tim.

Re: Is force_align_arg_pointer function attribute supported at x86?

Sumner, Brian via cfe-dev
Hi

On Wed, Jun 21, 2017 at 7:12 PM, Tim Northover <[hidden email]> wrote:
> "Because the upper 32 bits of 64-bit general-purpose registers are
> undefined in 32-bit modes, the upper 32 bits of any general-purpose
> register are not preserved when switching from 64-bit mode to a 32-bit
> mode (to protected mode or compatibility mode). Software must not
> depend on these bits to maintain a value after a 64-bit to 32-bit mode
> switch."

This statement describes switching 64->32 bits. In my case the switch
happens the other way 32->64 bits. I expect that the top half of all
registers is initialized with zeros during the switch to 64bits
(citation is needed here).

I checked the ABI spec at
https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf
which describes the stack layout in section 3.2.2. I found its figures and
wording a bit confusing, though, and I did not get a clear understanding of
the alignment requirement for a far jump.

After some experiments I found that the stack pointer should be
aligned to 16 bytes *before* the far call. The following example works
fine in VMware and KVM:

  and $-16, %esp
  call $SELECTOR,start_64

Unfortunately this code crashes QEMU. I believe it is a QEMU bug and I
filed it here: https://bugs.launchpad.net/bugs/1699867

So I replaced the far call with a far jump. A far call pushes %cs and
%eip onto the stack, 4 bytes each, so it uses an additional 8 bytes of
stack. I replaced the code above with

  and $-16, %esp
  sub $8, %esp
  jmp $SELECTOR,start_64

With this change, SSE works fine under QEMU, KVM and VMware.
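
The arithmetic behind the two sequences can be checked with a small C model. This is only a sketch: the helper names are made up for illustration, and it assumes the x86-64 System V convention that %rsp + 8 is a multiple of 16 at function entry (that is, %rsp % 16 == 8), which is what a near call from an aligned stack produces.

```c
#include <stdint.h>

/* Effect of "and $-16, %esp": round the stack pointer down to a
 * multiple of 16. */
uint32_t align_down_16(uint32_t sp) {
    return sp & ~UINT32_C(15);
}

/* Far call variant: align, then the far call pushes %cs and %eip
 * (4 bytes each), lowering the stack pointer by 8. */
uint32_t sp_after_far_call(uint32_t sp) {
    return align_down_16(sp) - 8;
}

/* Far jump variant: align, then "sub $8, %esp" stands in for the 8 bytes
 * that the far call would have pushed. The jump itself pushes nothing. */
uint32_t sp_after_far_jmp(uint32_t sp) {
    return align_down_16(sp) - 8;
}

/* What compiled 64-bit code expects at the first instruction of a function. */
int entry_alignment_ok(uint32_t sp) {
    return (sp % 16) == 8;
}
```

For example, starting from 0x7c03, both variants reach the entry point with the stack pointer at 0x7bf8, while an "and" followed directly by a far jump would arrive at 0x7c00 and fail the check.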

This solves my original problem; thanks, everyone, for the help. But if
you folks decide to enable __attribute__((force_align_arg_pointer)) on
x86_64 to make this situation a bit simpler and reach parity with
GCC, then I fully support you.

Re: Is force_align_arg_pointer function attribute supported at x86?

Sumner, Brian via cfe-dev
Hi Tim

On Wed, Jun 21, 2017 at 7:12 PM, Tim Northover <[hidden email]> wrote:

> On 21 June 2017 at 14:21, Anatol Pomozov <[hidden email]> wrote:
>> In my example above start_64() is the entry point for 64 mode. 32-bit
>> code makes a long jump and does not follow x86_64 stack alignment
>> policy. Having __attribute__((force_align_arg_pointer)) in this 64bit
>> code is a valid use-case IMO.
>
> The jump has to be made with inline assembly, that puts the whole
> control transfer well outside the compiler's purview. That said, I
> doubt there'd be strenuous objections to supporting it if patches
> turned up.
Here is the patch that enables
__attribute__((force_align_arg_pointer)) on x86_64. I tested it with my
64-bit OS and it seems to work as expected.


Attachment: 0001-Enable-force_align_arg_pointer-attribute-at-x86_64.patch (3K)

Re: Is force_align_arg_pointer function attribute supported at x86?

Sumner, Brian via cfe-dev
On 2 August 2017 at 16:23, Anatol Pomozov <[hidden email]> wrote:
> Here is the patch that enables
> __attribute__((force_align_arg_pointer)) at x86_64. Tested with my 64
> bit OS and it seems work as expected.

You probably want to send this to the cfe-commits list since that's
where code review happens. Also, it'll need a test of some kind
(probably in test/CodeGen). There's some more info at
https://llvm.org/docs/DeveloperPolicy.html if you haven't seen that
page yet.

Cheers.

Tim.

Re: Is force_align_arg_pointer function attribute supported at x86?

Sumner, Brian via cfe-dev
Hi Tim

On Wed, Aug 2, 2017 at 9:43 PM, Tim Northover <[hidden email]> wrote:

> On 2 August 2017 at 16:23, Anatol Pomozov <[hidden email]> wrote:
>> Here is the patch that enables
>> __attribute__((force_align_arg_pointer)) at x86_64. Tested with my 64
>> bit OS and it seems work as expected.
>
> You probably want to send this to the cfe-commits list since that's
> where code review happens. Also, it'll need a test of some kind
> (probably in test/CodeGen). There's some more info at
> https://llvm.org/docs/DeveloperPolicy.html if you haven't seen that
> page yet.

I uploaded my patch to Phabricator here https://reviews.llvm.org/D36272