FYI, Intel folks might be looking to add the __iso_volatile_Xxx family for MSVC STL <atomic> soon

classic Classic list List threaded Threaded
33 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Re: LIT Test case and unit testing

Nathan Ridge via cfe-dev

Hi Csaba ,

My system got hanged during Ninja compilation.
Is there any system requirement for LLVM/CLANG/Ninja ?

My configuration is Ubuntu 16.0x ,12GB RAM and 27GB swap memory.

ninja check-llvm-unit ( system got hanged)
[6/12] Linking CXX executable unittests/tools/llvm-cfi-verify/CFIVerifyTests

Please check the above issue and provide the solution.

~/Srinivas/llvm-ninja/llvm-project/build$ pwd
~/Srinivas/llvm-ninja/llvm-project/build

~/Srinivas/llvm-ninja/llvm-project/build$ ls -l
total 22860
drwxrwxr-x  3 tcs tcs     4096 Apr  2 17:32 benchmarks
drwxrwxr-x  2 tcs tcs     4096 Apr 13 05:54 bin
drwxrwxr-x  2 tcs tcs     4096 Mar 29 15:33 build
-rw-rw-r--  1 tcs tcs 19548918 Apr  2 17:32 build.ninja
drwxrwxr-x  3 tcs tcs     4096 Mar 29 15:40 cmake
-rw-rw-r--  1 tcs tcs    75530 Mar 29 15:40 CMakeCache.txt
drwxrwxr-x  4 tcs tcs     4096 Apr  2 17:32 CMakeFiles
-rw-rw-r--  1 tcs tcs     4874 Mar 29 15:40 cmake_install.cmake
-rw-rw-r--  1 tcs tcs  3371071 Apr  2 17:32 compile_commands.json
-rw-r--r--  1 tcs tcs     3684 Mar 29 15:39 CPackConfig.cmake
-rw-r--r--  1 tcs tcs     4156 Mar 29 15:39 CPackSourceConfig.cmake
drwxrwxr-x  3 tcs tcs     4096 Apr  2 17:32 docs
-rw-rw-r--  1 tcs tcs      466 Apr  2 17:32 DummyConfigureOutput
drwxrwxr-x  9 tcs tcs     4096 Apr  2 17:32 examples
drwxrwxr-x  3 tcs tcs     4096 Mar 29 15:39 include
drwxrwxr-x 36 tcs tcs    12288 Apr 13 06:11 lib
drwxrwxr-x  2 tcs tcs     4096 Mar 29 16:18 libexec
-rw-rw-r--  1 tcs tcs    59344 Apr  2 17:32 LLVMBuild.cmake
-rwxrwxr-x  1 tcs tcs     1307 Apr  2 14:50 llvm-lit
-rw-rw-r--  1 tcs tcs     1783 Mar 29 15:39 llvm.spec
drwxrwxr-x  3 tcs tcs     4096 Apr  2 17:32 projects
-rw-rw-r--  1 tcs tcs   231148 Apr  2 17:32 rules.ninja
drwxrwxr-x  3 tcs tcs     4096 Apr  2 17:32 runtimes
drwxrwxr-x  5 tcs tcs     4096 Mar 29 23:22 share
drwxrwxr-x  4 tcs tcs     4096 Apr  2 17:32 test
drwxrwxr-x 72 tcs tcs     4096 Apr  2 17:32 tools
drwxrwxr-x 30 tcs tcs     4096 Apr  2 17:32 unittests
drwxrwxr-x 12 tcs tcs     4096 Mar 29 15:40 utils


Regards,
Srinivas

-----"Csaba Raduly" <[hidden email]> wrote: -----
To: "Srinivas Kakarla1" <[hidden email]>
From: "Csaba Raduly" <[hidden email]>
Date: 04/10/2019 09:41PM
Cc: "cfe-dev" <[hidden email]>
Subject: Re: [cfe-dev] LIT Test case and unit testing

"External email. Open with Caution"
Hi Srinivas,

Which parameters did you give to cmake?
Can you show the output of pwd and ls -l ?

I get the exact same error, because I choose ninja as the cmake generator.

$ ninja check-llvm-unit
$ ninja check-llvm
$ ninja check-all

all work for me.

Csaba

On Wed, Apr 10, 2019 at 12:58 PM Srinivas Kakarla1 via cfe-dev <[hidden email]> wrote:

Hi  ,

I am following the below url .
make check-llvm-unit  - not working
make check-llvm  - not working

make check-llvm
make: *** No rule to make target 'check-llvm'.  Stop.

llvm-project/llvm/test$ make check-all
make: *** No rule to make target 'check-all'.  Stop.
 
Can you please let me know the problem.Is there any configuration or setup is missing.

Regards,
Srinivas

-----Srinivas Kakarla1/HYD/TCS wrote: -----
To: [hidden email]
From: Srinivas Kakarla1/HYD/TCS
Date: 03/20/2019 11:44AM
Subject: LIT Test case and unit testing


Hi ,

I have configure LLVM and LIT on my PC (lit-0.8.0.dev0 and clang -v clang version 3.8.0 )
I wan to know how to run or add specific unit test or test case from LIT ?
Can any one have inputs or idea ?

Regards,
Srinivas



--
You can get very substantial performance improvements
by not doing the right thing. - Scott Meyers, An Effective C++11/14 Sampler
So if you're looking for a completely portable, 100% standards-conformat way
to get the wrong information: this is what you want. - Scott Meyers (C++TDaWYK)


[attachment "ninja check-llvm-unit.odt" removed by Srinivas Kakarla1/HYD/TCS]

=====-----=====-----=====
Notice: The information contained in this e-mail
message and/or attachments to it may contain
confidential or privileged information. If you are
not the intended recipient, any dissemination, use,
review, distribution, printing or copying of the
information contained in this e-mail message
and/or attachments to it are strictly prohibited. If
you have received this communication in error,
please notify us by reply e-mail or telephone and
immediately and permanently delete the message
and any attachments. Thank you


_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: LIT Test case and unit testing

Nathan Ridge via cfe-dev
In reply to this post by Nathan Ridge via cfe-dev

Hi Csaba /LLVM,

My system got hanged during Ninja compilation.
Is there any system requirement for LLVM/CLANG/Ninja ?

My configuration is Ubuntu 16.0x ,12GB RAM and 27GB swap memory.

ninja check-llvm-unit ( system got hanged)
[6/12] Linking CXX executable unittests/tools/llvm-cfi-verify/CFIVerifyTests

Please check the above issue and provide the solution.

~/Srinivas/llvm-ninja/llvm-project/build$ pwd
~/Srinivas/llvm-ninja/llvm-project/build

~/Srinivas/llvm-ninja/llvm-project/build$ ls -l
total 22860
drwxrwxr-x  3 tcs tcs     4096 Apr  2 17:32 benchmarks
drwxrwxr-x  2 tcs tcs     4096 Apr 13 05:54 bin
drwxrwxr-x  2 tcs tcs     4096 Mar 29 15:33 build
-rw-rw-r--  1 tcs tcs 19548918 Apr  2 17:32 build.ninja
drwxrwxr-x  3 tcs tcs     4096 Mar 29 15:40 cmake
-rw-rw-r--  1 tcs tcs    75530 Mar 29 15:40 CMakeCache.txt
drwxrwxr-x  4 tcs tcs     4096 Apr  2 17:32 CMakeFiles
-rw-rw-r--  1 tcs tcs     4874 Mar 29 15:40 cmake_install.cmake
-rw-rw-r--  1 tcs tcs  3371071 Apr  2 17:32 compile_commands.json
-rw-r--r--  1 tcs tcs     3684 Mar 29 15:39 CPackConfig.cmake
-rw-r--r--  1 tcs tcs     4156 Mar 29 15:39 CPackSourceConfig.cmake
drwxrwxr-x  3 tcs tcs     4096 Apr  2 17:32 docs
-rw-rw-r--  1 tcs tcs      466 Apr  2 17:32 DummyConfigureOutput
drwxrwxr-x  9 tcs tcs     4096 Apr  2 17:32 examples
drwxrwxr-x  3 tcs tcs     4096 Mar 29 15:39 include
drwxrwxr-x 36 tcs tcs    12288 Apr 13 06:11 lib
drwxrwxr-x  2 tcs tcs     4096 Mar 29 16:18 libexec
-rw-rw-r--  1 tcs tcs    59344 Apr  2 17:32 LLVMBuild.cmake
-rwxrwxr-x  1 tcs tcs     1307 Apr  2 14:50 llvm-lit
-rw-rw-r--  1 tcs tcs     1783 Mar 29 15:39 llvm.spec
drwxrwxr-x  3 tcs tcs     4096 Apr  2 17:32 projects
-rw-rw-r--  1 tcs tcs   231148 Apr  2 17:32 rules.ninja
drwxrwxr-x  3 tcs tcs     4096 Apr  2 17:32 runtimes
drwxrwxr-x  5 tcs tcs     4096 Mar 29 23:22 share
drwxrwxr-x  4 tcs tcs     4096 Apr  2 17:32 test
drwxrwxr-x 72 tcs tcs     4096 Apr  2 17:32 tools
drwxrwxr-x 30 tcs tcs     4096 Apr  2 17:32 unittests
drwxrwxr-x 12 tcs tcs     4096 Mar 29 15:40 utils


Regards,
Srinivas

-----"Csaba Raduly" <[hidden email]> wrote: -----
To: "Srinivas Kakarla1" <[hidden email]>
From: "Csaba Raduly" <[hidden email]>
Date: 04/10/2019 09:41PM
Cc: "cfe-dev" <[hidden email]>
Subject: Re: [cfe-dev] LIT Test case and unit testing

"External email. Open with Caution"
Hi Srinivas,

Which parameters did you give to cmake?
Can you show the output of pwd and ls -l ?

I get the exact same error, because I choose ninja as the cmake generator.

$ ninja check-llvm-unit
$ ninja check-llvm
$ ninja check-all

all work for me.

Csaba

On Wed, Apr 10, 2019 at 12:58 PM Srinivas Kakarla1 via cfe-dev <[hidden email]> wrote:

Hi  ,

I am following the below url .
make check-llvm-unit  - not working
make check-llvm  - not working

make check-llvm
make: *** No rule to make target 'check-llvm'.  Stop.

llvm-project/llvm/test$ make check-all
make: *** No rule to make target 'check-all'.  Stop.
 
Can you please let me know the problem.Is there any configuration or setup is missing.

Regards,
Srinivas

-----Srinivas Kakarla1/HYD/TCS wrote: -----
To: [hidden email]
From: Srinivas Kakarla1/HYD/TCS
Date: 03/20/2019 11:44AM
Subject: LIT Test case and unit testing


Hi ,

I have configure LLVM and LIT on my PC (lit-0.8.0.dev0 and clang -v clang version 3.8.0 )
I wan to know how to run or add specific unit test or test case from LIT ?
Can any one have inputs or idea ?

Regards,
Srinivas



--
You can get very substantial performance improvements
by not doing the right thing. - Scott Meyers, An Effective C++11/14 Sampler
So if you're looking for a completely portable, 100% standards-conformat way
to get the wrong information: this is what you want. - Scott Meyers (C++TDaWYK)


[attachment "ninja check-llvm-unit.odt" removed by Srinivas Kakarla1/HYD/TCS]

=====-----=====-----=====
Notice: The information contained in this e-mail
message and/or attachments to it may contain
confidential or privileged information. If you are
not the intended recipient, any dissemination, use,
review, distribution, printing or copying of the
information contained in this e-mail message
and/or attachments to it are strictly prohibited. If
you have received this communication in error,
please notify us by reply e-mail or telephone and
immediately and permanently delete the message
and any attachments. Thank you


_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: FYI, Intel folks might be looking to add the __iso_volatile_Xxx family for MSVC STL <atomic> soon

Nathan Ridge via cfe-dev
In reply to this post by Nathan Ridge via cfe-dev
Update: MSVC++ should gain this ability in 2019 Update 4; so it would be nice to enable these as soon as possible. The user in https://developercommunity.visualstudio.com/content/problem/274938/vs2017-1567158p2-stdatomicload-causes-write-access.html has been long suffering 🙂

Billy3

From: Billy O'Neal (VC LIBS) <[hidden email]>
Sent: Thursday, March 28, 2019 03:13 PM
To: Reid Kleckner <[hidden email]>; Friedman, Eli <[hidden email]>; Grang, Mandeep Singh <[hidden email]>
Cc: JF Bastien <[hidden email]>; cfe-dev <[hidden email]>
Subject: Re: [cfe-dev] FYI, Intel folks might be looking to add the __iso_volatile_Xxx family for MSVC STL <atomic> soon
 
>The optimizers may change the order of volatile operations relative to non-volatile operations. This is not Java’s “volatile” and has no cross-thread synchronization behavior.

Right, but if they want to link with MSVC++ generated code they need to at least respect _ReadWriteBarrier.

I can certainly mess with our <atomic> to play more nicely with you folks if you want us calling different builtins instead. <atomic> is entirely library tech for us.

Billy3

From: Reid Kleckner <[hidden email]>
Sent: Thursday, March 28, 2019 03:08 PM
To: Billy O'Neal (VC LIBS); Friedman, Eli; Grang, Mandeep Singh
Cc: JF Bastien; cfe-dev
Subject: Re: [cfe-dev] FYI, Intel folks might be looking to add the __iso_volatile_Xxx family for MSVC STL <atomic> soon
 
On Thu, Mar 28, 2019 at 2:28 PM Billy O'Neal (VC LIBS) <[hidden email]> wrote:
>Are you sure we shouldn't be marking these as atomic instead of volatile? Volatile is usually not suitable for anything except  

I am implementing <atomic> 🙂

My concern is that volatile in LLVM has nothing to do with atomicity, and I think what you really want is LLVM atomic loads and stores, with real ordering markers of monotonic, seq_cst, release, acquire, etc. This is all described here:
In particular: "The optimizers may change the order of volatile operations relative to non-volatile operations. This is not Java’s “volatile” and has no cross-thread synchronization behavior."

So, I'm concerned that there is something subtly incorrect about using __iso_volatile_* to implement atomics. 

After looking at the xatomic.h source, I think I have a better understanding of what is going on. I added some folks who probably have a better understanding of LLVM IR atomics, and maybe they can help guide the discussion to a simpler implementation of Visual C++ <atomic> that plays well with LLVM. We had a similar discussion about over- or under-emitting dmb fences in this code review: https://reviews.llvm.org/D52807.

The Visual C++ headers currently add fences for ARM themselves. The code looks like this: https://reviews.llvm.org/P8137 In small test cases, this generates the appropriate code, a single load followed by a fence.

-----

Completely aside from improving the implementation of <atomic> with clang, I will go ahead and make those __iso_volatile_* intrinsics available everywhere.

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: FYI, Intel folks might be looking to add the __iso_volatile_Xxx family for MSVC STL <atomic> soon

Nathan Ridge via cfe-dev
Apparently I did this in March, and then completely forgot about it: http://reviews.llvm.org/rC357220  

I think I still had concerns about whether implementing <atomic> with _ReadWriteBarrier and plain, non-LLVM-atomic, volatile loads and stores is completely correct. I never figured out how to express those concerns, so I guess I'll just raise it to JF and Eli, since they probably know more about the C++ memory model than I do. Are there any correctness problems with the VC++ library <atomic> implementation strategy?

I think the only downside to using _ReadWriteBarrier, which compiles to an LLVM IR fence, and volatile memory operations, is that they won't optimize as well as LLVM-IR-atomic loads and stores. But as long as it's correct, we can worry about optimization, perhaps using _Atomic, later.

On Thu, Jul 25, 2019 at 12:18 PM Billy O'Neal (VC LIBS) <[hidden email]> wrote:
Update: MSVC++ should gain this ability in 2019 Update 4; so it would be nice to enable these as soon as possible. The user in https://developercommunity.visualstudio.com/content/problem/274938/vs2017-1567158p2-stdatomicload-causes-write-access.html has been long suffering 🙂

Billy3

From: Billy O'Neal (VC LIBS) <[hidden email]>
Sent: Thursday, March 28, 2019 03:13 PM
To: Reid Kleckner <[hidden email]>; Friedman, Eli <[hidden email]>; Grang, Mandeep Singh <[hidden email]>
Cc: JF Bastien <[hidden email]>; cfe-dev <[hidden email]>
Subject: Re: [cfe-dev] FYI, Intel folks might be looking to add the __iso_volatile_Xxx family for MSVC STL <atomic> soon
 
>The optimizers may change the order of volatile operations relative to non-volatile operations. This is not Java’s “volatile” and has no cross-thread synchronization behavior.

Right, but if they want to link with MSVC++ generated code they need to at least respect _ReadWriteBarrier.

I can certainly mess with our <atomic> to play more nicely with you folks if you want us calling different builtins instead. <atomic> is entirely library tech for us.

Billy3

From: Reid Kleckner <[hidden email]>
Sent: Thursday, March 28, 2019 03:08 PM
To: Billy O'Neal (VC LIBS); Friedman, Eli; Grang, Mandeep Singh
Cc: JF Bastien; cfe-dev
Subject: Re: [cfe-dev] FYI, Intel folks might be looking to add the __iso_volatile_Xxx family for MSVC STL <atomic> soon
 
On Thu, Mar 28, 2019 at 2:28 PM Billy O'Neal (VC LIBS) <[hidden email]> wrote:
>Are you sure we shouldn't be marking these as atomic instead of volatile? Volatile is usually not suitable for anything except  

I am implementing <atomic> 🙂

My concern is that volatile in LLVM has nothing to do with atomicity, and I think what you really want is LLVM atomic loads and stores, with real ordering markers of monotonic, seq_cst, release, acquire, etc. This is all described here:
In particular: "The optimizers may change the order of volatile operations relative to non-volatile operations. This is not Java’s “volatile” and has no cross-thread synchronization behavior."

So, I'm concerned that there is something subtly incorrect about using __iso_volatile_* to implement atomics. 

After looking at the xatomic.h source, I think I have a better understanding of what is going on. I added some folks who probably have a better understanding of LLVM IR atomics, and maybe they can help guide the discussion to a simpler implementation of Visual C++ <atomic> that plays well with LLVM. We had a similar discussion about over- or under-emitting dmb fences in this code review: https://reviews.llvm.org/D52807.

The Visual C++ headers currently add fences for ARM themselves. The code looks like this: https://reviews.llvm.org/P8137 In small test cases, this generates the appropriate code, a single load followed by a fence.

-----

Completely aside from improving the implementation of <atomic> with clang, I will go ahead and make those __iso_volatile_* intrinsics available everywhere.

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: FYI, Intel folks might be looking to add the __iso_volatile_Xxx family for MSVC STL <atomic> soon

Nathan Ridge via cfe-dev

> we can worry about optimization, perhaps using _Atomic, later

 

We can’t use _Atomic because it would not be ABI compatible with our std::atomic; in particular, because we put the spinlock for non-lock-free atomics inside the atomic, for instance. And because that isn’t a thing for some of our supported frontends.

 

It is possible that there are different intrinsics we could call for Clang that would make you folks happier, but we don’t know what those are or even if they exist at present.

 

> I think the only downside to using _ReadWriteBarrier, which compiles to an LLVM IR fence, and volatile memory operations

 

Also be aware that on Windows there is a lot of code from before the C++ memory model existed, that treats volatile loads as having acquire semantics, hence our `/volatile:ms` switch, which is the default on the platforms that do this for `free` (x86 and amd64).

 

Billy3

 


From: Reid Kleckner <[hidden email]>
Sent: Tuesday, July 30, 2019 4:10:28 PM
To: Billy O'Neal (VC LIBS) <[hidden email]>
Cc: Friedman, Eli <[hidden email]>; Grang, Mandeep Singh <[hidden email]>; JF Bastien <[hidden email]>; cfe-dev <[hidden email]>
Subject: Re: [cfe-dev] FYI, Intel folks might be looking to add the __iso_volatile_Xxx family for MSVC STL <atomic> soon
 
Apparently I did this in March, and then completely forgot about it: http://reviews.llvm.org/rC357220  

I think I still had concerns about whether implementing <atomic> with _ReadWriteBarrier and plain, non-LLVM-atomic, volatile loads and stores is completely correct. I never figured out how to express those concerns, so I guess I'll just raise it to JF and Eli, since they probably know more about the C++ memory model than I do. Are there any correctness problems with the VC++ library <atomic> implementation strategy?

I think the only downside to using _ReadWriteBarrier, which compiles to an LLVM IR fence, and volatile memory operations, is that they won't optimize as well as LLVM-IR-atomic loads and stores. But as long as it's correct, we can worry about optimization, perhaps using _Atomic, later.

On Thu, Jul 25, 2019 at 12:18 PM Billy O'Neal (VC LIBS) <[hidden email]> wrote:
Update: MSVC++ should gain this ability in 2019 Update 4; so it would be nice to enable these as soon as possible. The user in https://developercommunity.visualstudio.com/content/problem/274938/vs2017-1567158p2-stdatomicload-causes-write-access.html has been long suffering 🙂

Billy3

From: Billy O'Neal (VC LIBS) <[hidden email]>
Sent: Thursday, March 28, 2019 03:13 PM
To: Reid Kleckner <[hidden email]>; Friedman, Eli <[hidden email]>; Grang, Mandeep Singh <[hidden email]>
Cc: JF Bastien <[hidden email]>; cfe-dev <[hidden email]>
Subject: Re: [cfe-dev] FYI, Intel folks might be looking to add the __iso_volatile_Xxx family for MSVC STL <atomic> soon
 
>The optimizers may change the order of volatile operations relative to non-volatile operations. This is not Java’s “volatile” and has no cross-thread synchronization behavior.

Right, but if they want to link with MSVC++ generated code they need to at least respect _ReadWriteBarrier.

I can certainly mess with our <atomic> to play more nicely with you folks if you want us calling different builtins instead. <atomic> is entirely library tech for us.

Billy3

From: Reid Kleckner <[hidden email]>
Sent: Thursday, March 28, 2019 03:08 PM
To: Billy O'Neal (VC LIBS); Friedman, Eli; Grang, Mandeep Singh
Cc: JF Bastien; cfe-dev
Subject: Re: [cfe-dev] FYI, Intel folks might be looking to add the __iso_volatile_Xxx family for MSVC STL <atomic> soon
 
On Thu, Mar 28, 2019 at 2:28 PM Billy O'Neal (VC LIBS) <[hidden email]> wrote:
>Are you sure we shouldn't be marking these as atomic instead of volatile? Volatile is usually not suitable for anything except  

I am implementing <atomic> 🙂

My concern is that volatile in LLVM has nothing to do with atomicity, and I think what you really want is LLVM atomic loads and stores, with real ordering markers of monotonic, seq_cst, release, acquire, etc. This is all described here:
In particular: "The optimizers may change the order of volatile operations relative to non-volatile operations. This is not Java’s “volatile” and has no cross-thread synchronization behavior."

So, I'm concerned that there is something subtly incorrect about using __iso_volatile_* to implement atomics. 

After looking at the xatomic.h source, I think I have a better understanding of what is going on. I added some folks who probably have a better understanding of LLVM IR atomics, and maybe they can help guide the discussion to a simpler implementation of Visual C++ <atomic> that plays well with LLVM. We had a similar discussion about over- or under-emitting dmb fences in this code review: https://reviews.llvm.org/D52807.

The Visual C++ headers currently add fences for ARM themselves. The code looks like this: https://reviews.llvm.org/P8137 In small test cases, this generates the appropriate code, a single load followed by a fence.

-----

Completely aside from improving the implementation of <atomic> with clang, I will go ahead and make those __iso_volatile_* intrinsics available everywhere.

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: FYI, Intel folks might be looking to add the __iso_volatile_Xxx family for MSVC STL <atomic> soon

Nathan Ridge via cfe-dev
On Tue, Jul 30, 2019 at 4:15 PM Billy O'Neal (VC LIBS) <[hidden email]> wrote:

> we can worry about optimization, perhaps using _Atomic, later

 

We can’t use _Atomic because it would not be ABI compatible with our std::atomic; in particular, because we put the spinlock for non-lock-free atomics inside the atomic, for instance. And because that isn’t a thing for some of our supported frontends.

 

It is possible that there are different intrinsics we could call for Clang that would make you folks happier, but we don’t know what those are or even if they exist at present.


I was thinking that perhaps the _Atomic_address_as template would do the necessary casts to use it when necessary without changing the storage type inside the std::atomic object.
  

> I think the only downside to using _ReadWriteBarrier, which compiles to an LLVM IR fence, and volatile memory operations

 

Also be aware that on Windows there is a lot of code from before the C++ memory model existed, that treats volatile loads as having acquire semantics, hence our `/volatile:ms` switch, which is the default on the platforms that do this for `free` (x86 and amd64).


Right. 

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: FYI, Intel folks might be looking to add the __iso_volatile_Xxx family for MSVC STL <atomic> soon

Nathan Ridge via cfe-dev

From: cfe-dev <[hidden email]> On Behalf Of Reid Kleckner via cfe-dev
Sent: Tuesday, July 30, 2019 4:54 PM
To: Billy O'Neal (VC LIBS) <[hidden email]>
Cc: cfe-dev <[hidden email]>
Subject: [EXT] Re: [cfe-dev] FYI, Intel folks might be looking to add the __iso_volatile_Xxx family for MSVC STL <atomic> soon

 

On Tue, Jul 30, 2019 at 4:15 PM Billy O'Neal (VC LIBS) <[hidden email]> wrote:

> we can worry about optimization, perhaps using _Atomic, later

 

We can’t use _Atomic because it would not be ABI compatible with our std::atomic; in particular, because we put the spinlock for non-lock-free atomics inside the atomic, for instance. And because that isn’t a thing for some of our supported frontends.

 

It is possible that there are different intrinsics we could call for Clang that would make you folks happier, but we don’t know what those are or even if they exist at present.

 

I was thinking that perhaps the _Atomic_address_as template would do the necessary casts to use it when necessary without changing the storage type inside the std::atomic object.

 

There are actually multiple APIs to access C++11 atomics in clang, since we invented APIs in parallel with gcc.  The __atomic_* builtins (https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html) don’t require the use of _Atomic types.

 

Using volatile loads and stores with barriers probably prevents the compiler from performing any breaking optimizations, assuming the RMW atomics are protected appropriately, and the user doesn’t call any of the clang atomic builtins directly.  But it’s not a good idea; it won’t optimize well.  For example, on AArch64, it will prevent the compiler from using the dedicated load-acquire and store-release instructions.

 

-Eli


_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: FYI, Intel folks might be looking to add the __iso_volatile_Xxx family for MSVC STL <atomic> soon

Nathan Ridge via cfe-dev


On Jul 30, 2019, at 5:21 PM, Eli Friedman via cfe-dev <[hidden email]> wrote:

From: cfe-dev <[hidden email]> On Behalf Of Reid Kleckner via cfe-dev
Sent: Tuesday, July 30, 2019 4:54 PM
To: Billy O'Neal (VC LIBS) <[hidden email]>
Cc: cfe-dev <[hidden email]>
Subject: [EXT] Re: [cfe-dev] FYI, Intel folks might be looking to add the __iso_volatile_Xxx family for MSVC STL <atomic> soon
 
On Tue, Jul 30, 2019 at 4:15 PM Billy O'Neal (VC LIBS) <[hidden email]> wrote:
> we can worry about optimization, perhaps using _Atomic, later
 
We can’t use _Atomic because it would not be ABI compatible with our std::atomic; in particular, because we put the spinlock for non-lock-free atomics inside the atomic, for instance. And because that isn’t a thing for some of our supported frontends.
 
It is possible that there are different intrinsics we could call for Clang that would make you folks happier, but we don’t know what those are or even if they exist at present.
 
I was thinking that perhaps the _Atomic_address_as template would do the necessary casts to use it when necessary without changing the storage type inside the std::atomic object.
 
There are actually multiple APIs to access C++11 atomics in clang, since we invented APIs in parallel with gcc.  The __atomic_* builtins (https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html) don’t require the use of _Atomic types.
 
Using volatile loads and stores with barriers probably prevents the compiler from performing any breaking optimizations, assuming the RMW atomics are protected appropriately, and the user doesn’t call any of the clang atomic builtins directly.  But it’s not a good idea; it won’t optimize well.  For example, on AArch64, it will prevent the compiler from using the dedicated load-acquire and store-release instructions.

TL;DR: what are these _iso_ things? I read the bug report, I don’t see the details I’d expect.
I think the __atomic_* builtins should work as Eli suggests.


-Eli
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev


_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: FYI, Intel folks might be looking to add the __iso_volatile_Xxx family for MSVC STL <atomic> soon

Nathan Ridge via cfe-dev

> TL;DR: what are these _iso_ things?

 

They are volatile ops that are guaranteed to use 1 instruction to do that load. So for x86 that means using x87, or MMX, or SSE to do the load. They also suppress the ‘volatile is affected by the /volatile:ms vs. /volatile:iso setting’ warnings. <atomic> uses these in conjunction with barrier instrinsics.

 

#if defined(_M_ARM) || defined(_M_ARM64)

#define _Memory_barrier() __dmb(0xB) // inner shared data memory barrier

#define _Compiler_or_memory_barrier() _Memory_barrier()

 

#define _ISO_VOLATILE_STORE8(_Storage, _Value) __iso_volatile_store8(_Atomic_address_as<char>(_Storage), _Value)

#define _ISO_VOLATILE_STORE16(_Storage, _Value) __iso_volatile_store16(_Atomic_address_as<short>(_Storage), _Value)

#define _ISO_VOLATILE_STORE32(_Storage, _Value) __iso_volatile_store32(_Atomic_address_as<int>(_Storage), _Value)

#define _ISO_VOLATILE_STORE64(_Storage, _Value) __iso_volatile_store64(_Atomic_address_as<long long>(_Storage), _Value)

#define _ISO_VOLATILE_LOAD8(_Storage) __iso_volatile_load8(_Atomic_address_as<const char>(_Storage))

#define _ISO_VOLATILE_LOAD16(_Storage) __iso_volatile_load16(_Atomic_address_as<const short>(_Storage))

 

#elif defined(_M_IX86) || defined(_M_X64)

// x86/x64 hardware only emits memory barriers inside _Interlocked intrinsics

#define _Compiler_or_memory_barrier() _Compiler_barrier()

 

#define _ISO_VOLATILE_STORE8(_Storage, _Value) (*_Atomic_address_as<char>(_Storage) = _Value)

#define _ISO_VOLATILE_STORE16(_Storage, _Value) (*_Atomic_address_as<short>(_Storage) = _Value)

#define _ISO_VOLATILE_STORE32(_Storage, _Value) (*_Atomic_address_as<long>(_Storage) = _Value)

#define _ISO_VOLATILE_STORE64(_Storage, _Value) (*_Atomic_address_as<long long>(_Storage) = _Value)

#define _ISO_VOLATILE_LOAD8(_Storage) (*_Atomic_address_as<const char>(_Storage))

#define _ISO_VOLATILE_LOAD16(_Storage) (*_Atomic_address_as<const short>(_Storage))

 

#else // ^^^ x86/x64 / unsupported hardware vvv

#error Unsupported hardware

#endif // hardware

 

 

…. Later…

 

// in _Atomic_storage<1>

    _NODISCARD _Ty load() const noexcept { // load with sequential consistency

        char _As_bytes = _ISO_VOLATILE_LOAD8(_Storage);

        _Compiler_or_memory_barrier();

        return reinterpret_cast<_Ty&>(_As_bytes);

    }

 

    _NODISCARD _Ty load(const memory_order _Order) const noexcept { // load with given memory order

        char _As_bytes = _ISO_VOLATILE_LOAD8(_Storage);

        _Load_barrier(_Order);

        return reinterpret_cast<_Ty&>(_As_bytes);

    }

 

Billy3

 

From: [hidden email]
Sent: Tuesday, July 30, 2019 8:43 PM
To: [hidden email]
Cc: [hidden email]; [hidden email]; [hidden email]
Subject: Re: [cfe-dev] FYI, Intel folks might be looking to add the __iso_volatile_Xxx family for MSVC STL <atomic> soon

 

 



On Jul 30, 2019, at 5:21 PM, Eli Friedman via cfe-dev <[hidden email]> wrote:

 

From: cfe-dev <[hidden email]> On Behalf Of Reid Kleckner via cfe-dev
Sent: Tuesday, July 30, 2019 4:54 PM
To: Billy O'Neal (VC LIBS) <[hidden email]>
Cc: cfe-dev <[hidden email]>
Subject: [EXT] Re: [cfe-dev] FYI, Intel folks might be looking to add the __iso_volatile_Xxx family for MSVC STL <atomic> soon

 

On Tue, Jul 30, 2019 at 4:15 PM Billy O'Neal (VC LIBS) <[hidden email]> wrote:

> we can worry about optimization, perhaps using _Atomic, later

 

We can’t use _Atomic because it would not be ABI compatible with our std::atomic; in particular, because we put the spinlock for non-lock-free atomics inside the atomic, for instance. And because that isn’t a thing for some of our supported frontends.

 

It is possible that there are different intrinsics we could call for Clang that would make you folks happier, but we don’t know what those are or even if they exist at present.

 

I was thinking that perhaps the _Atomic_address_as template would do the necessary casts to use it when necessary without changing the storage type inside the std::atomic object.

 

There are actually multiple APIs to access C++11 atomics in clang, since we invented APIs in parallel with gcc.  The __atomic_* builtins (https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html) don’t require the use of _Atomic types.

 

Using volatile loads and stores with barriers probably prevents the compiler from performing any breaking optimizations, assuming the RMW atomics are protected appropriately, and the user doesn’t call any of the clang atomic builtins directly.  But it’s not a good idea; it won’t optimize well.  For example, on AArch64, it will prevent the compiler from using the dedicated load-acquire and store-release instructions.

 

TL;DR: what are these _iso_ things? I read the bug report, I don’t see the details I’d expect.

I think the __atomic_* builtins should work as Eli suggests.

 



-Eli

_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

 

 


_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: FYI, Intel folks might be looking to add the __iso_volatile_Xxx family for MSVC STL <atomic> soon

Nathan Ridge via cfe-dev
Eh, those aren’t the builtins I’d choose, but you do you… Reid’s implementation seems to match the quirks chosen, so I’m not sure there’s anything else to do here.


On Jul 30, 2019, at 8:54 PM, Billy O'Neal (VC LIBS) <[hidden email]> wrote:

> TL;DR: what are these _iso_ things?
 
They are volatile ops that are guaranteed to use 1 instruction to do that load. So for x86 that means using x87, or MMX, or SSE to do the load. They also suppress the ‘volatile is affected by the /volatile:ms vs. /volatile:iso setting’ warnings. <atomic> uses these in conjunction with barrier instrinsics.
 
#if defined(_M_ARM) || defined(_M_ARM64)
#define _Memory_barrier() __dmb(0xB) // inner shared data memory barrier
#define _Compiler_or_memory_barrier() _Memory_barrier()
 
#define _ISO_VOLATILE_STORE8(_Storage, _Value) __iso_volatile_store8(_Atomic_address_as<char>(_Storage), _Value)
#define _ISO_VOLATILE_STORE16(_Storage, _Value) __iso_volatile_store16(_Atomic_address_as<short>(_Storage), _Value)
#define _ISO_VOLATILE_STORE32(_Storage, _Value) __iso_volatile_store32(_Atomic_address_as<int>(_Storage), _Value)
#define _ISO_VOLATILE_STORE64(_Storage, _Value) __iso_volatile_store64(_Atomic_address_as<long long>(_Storage), _Value)
#define _ISO_VOLATILE_LOAD8(_Storage) __iso_volatile_load8(_Atomic_address_as<const char>(_Storage))
#define _ISO_VOLATILE_LOAD16(_Storage) __iso_volatile_load16(_Atomic_address_as<const short>(_Storage))
 
#elif defined(_M_IX86) || defined(_M_X64)
// x86/x64 hardware only emits memory barriers inside _Interlocked intrinsics
#define _Compiler_or_memory_barrier() _Compiler_barrier()
 
#define _ISO_VOLATILE_STORE8(_Storage, _Value) (*_Atomic_address_as<char>(_Storage) = _Value)
#define _ISO_VOLATILE_STORE16(_Storage, _Value) (*_Atomic_address_as<short>(_Storage) = _Value)
#define _ISO_VOLATILE_STORE32(_Storage, _Value) (*_Atomic_address_as<long>(_Storage) = _Value)
#define _ISO_VOLATILE_STORE64(_Storage, _Value) (*_Atomic_address_as<long long>(_Storage) = _Value)
#define _ISO_VOLATILE_LOAD8(_Storage) (*_Atomic_address_as<const char>(_Storage))
#define _ISO_VOLATILE_LOAD16(_Storage) (*_Atomic_address_as<const short>(_Storage))
 
#else // ^^^ x86/x64 / unsupported hardware vvv
#error Unsupported hardware
#endif // hardware
 
 
…. Later…
 
// in _Atomic_storage<1>
    _NODISCARD _Ty load() const noexcept { // load with sequential consistency
        char _As_bytes = _ISO_VOLATILE_LOAD8(_Storage);
        _Compiler_or_memory_barrier();
        return reinterpret_cast<_Ty&>(_As_bytes);
    }
 
    _NODISCARD _Ty load(const memory_order _Order) const noexcept { // load with given memory order
        char _As_bytes = _ISO_VOLATILE_LOAD8(_Storage);
        _Load_barrier(_Order);
        return reinterpret_cast<_Ty&>(_As_bytes);
    }
 
Billy3
 
From: [hidden email]
Sent: Tuesday, July 30, 2019 8:43 PM
To: [hidden email]
Cc: [hidden email]; [hidden email]; [hidden email]
Subject: Re: [cfe-dev] FYI, Intel folks might be looking to add the __iso_volatile_Xxx family for MSVC STL <atomic> soon
 
 


On Jul 30, 2019, at 5:21 PM, Eli Friedman via cfe-dev <[hidden email]> wrote:
 
From: cfe-dev <[hidden email]> On Behalf Of Reid Kleckner via cfe-dev
Sent: Tuesday, July 30, 2019 4:54 PM
To: Billy O'Neal (VC LIBS) <[hidden email]>
Cc: cfe-dev <[hidden email]>
Subject: [EXT] Re: [cfe-dev] FYI, Intel folks might be looking to add the __iso_volatile_Xxx family for MSVC STL <atomic> soon
 
On Tue, Jul 30, 2019 at 4:15 PM Billy O'Neal (VC LIBS) <[hidden email]> wrote:
> we can worry about optimization, perhaps using _Atomic, later
 
We can’t use _Atomic because it would not be ABI compatible with our std::atomic; in particular, because we put the spinlock for non-lock-free atomics inside the atomic, for instance. And because that isn’t a thing for some of our supported frontends.
 
It is possible that there are different intrinsics we could call for Clang that would make you folks happier, but we don’t know what those are or even if they exist at present.
 
I was thinking that perhaps the _Atomic_address_as template would do the necessary casts to use it when necessary without changing the storage type inside the std::atomic object.
 
There are actually multiple APIs to access C++11 atomics in clang, since we invented APIs in parallel with gcc.  The __atomic_* builtins (https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html) don’t require the use of _Atomic types.
 
Using volatile loads and stores with barriers probably prevents the compiler from performing any breaking optimizations, assuming the RMW atomics are protected appropriately, and the user doesn’t call any of the clang atomic builtins directly.  But it’s not a good idea; it won’t optimize well.  For example, on AArch64, it will prevent the compiler from using the dedicated load-acquire and store-release instructions.
 
TL;DR: what are these _iso_ things? I read the bug report, I don’t see the details I’d expect.
I think the __atomic_* builtins should work as Eli suggests.
 


-Eli
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev


_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: FYI, Intel folks might be looking to add the __iso_volatile_Xxx family for MSVC STL <atomic> soon

Nathan Ridge via cfe-dev

These are the builtins I have, not necessarily the builtins I desire.

 

Billy3

 


From: [hidden email] <[hidden email]> on behalf of JF Bastien <[hidden email]>
Sent: Tuesday, July 30, 2019 8:58:14 PM
To: Billy O'Neal (VC LIBS) <[hidden email]>
Cc: Eli Friedman <[hidden email]>; [hidden email] <[hidden email]>; cfe-dev <[hidden email]>
Subject: Re: [cfe-dev] FYI, Intel folks might be looking to add the __iso_volatile_Xxx family for MSVC STL <atomic> soon
 
Eh, those aren’t the builtins I’d choose, but you do you… Reid’s implementation seems to match the quirks chosen, so I’m not sure there’s anything else to do here.


On Jul 30, 2019, at 8:54 PM, Billy O'Neal (VC LIBS) <[hidden email]> wrote:

> TL;DR: what are these _iso_ things?
 
They are volatile ops that are guaranteed to use 1 instruction to do that load. So for x86 that means using x87, or MMX, or SSE to do the load. They also suppress the ‘volatile is affected by the /volatile:ms vs. /volatile:iso setting’ warnings. <atomic> uses these in conjunction with barrier instrinsics.
 
#if defined(_M_ARM) || defined(_M_ARM64)
#define _Memory_barrier() __dmb(0xB) // inner shared data memory barrier
#define _Compiler_or_memory_barrier() _Memory_barrier()
 
#define _ISO_VOLATILE_STORE8(_Storage, _Value) __iso_volatile_store8(_Atomic_address_as<char>(_Storage), _Value)
#define _ISO_VOLATILE_STORE16(_Storage, _Value) __iso_volatile_store16(_Atomic_address_as<short>(_Storage), _Value)
#define _ISO_VOLATILE_STORE32(_Storage, _Value) __iso_volatile_store32(_Atomic_address_as<int>(_Storage), _Value)
#define _ISO_VOLATILE_STORE64(_Storage, _Value) __iso_volatile_store64(_Atomic_address_as<long long>(_Storage), _Value)
#define _ISO_VOLATILE_LOAD8(_Storage) __iso_volatile_load8(_Atomic_address_as<const char>(_Storage))
#define _ISO_VOLATILE_LOAD16(_Storage) __iso_volatile_load16(_Atomic_address_as<const short>(_Storage))
 
#elif defined(_M_IX86) || defined(_M_X64)
// x86/x64 hardware only emits memory barriers inside _Interlocked intrinsics
#define _Compiler_or_memory_barrier() _Compiler_barrier()
 
#define _ISO_VOLATILE_STORE8(_Storage, _Value) (*_Atomic_address_as<char>(_Storage) = _Value)
#define _ISO_VOLATILE_STORE16(_Storage, _Value) (*_Atomic_address_as<short>(_Storage) = _Value)
#define _ISO_VOLATILE_STORE32(_Storage, _Value) (*_Atomic_address_as<long>(_Storage) = _Value)
#define _ISO_VOLATILE_STORE64(_Storage, _Value) (*_Atomic_address_as<long long>(_Storage) = _Value)
#define _ISO_VOLATILE_LOAD8(_Storage) (*_Atomic_address_as<const char>(_Storage))
#define _ISO_VOLATILE_LOAD16(_Storage) (*_Atomic_address_as<const short>(_Storage))
 
#else // ^^^ x86/x64 / unsupported hardware vvv
#error Unsupported hardware
#endif // hardware
 
 
…. Later…
 
// in _Atomic_storage<1>
    _NODISCARD _Ty load() const noexcept { // load with sequential consistency
        char _As_bytes = _ISO_VOLATILE_LOAD8(_Storage);
        _Compiler_or_memory_barrier();
        return reinterpret_cast<_Ty&>(_As_bytes);
    }
 
    _NODISCARD _Ty load(const memory_order _Order) const noexcept { // load with given memory order
        char _As_bytes = _ISO_VOLATILE_LOAD8(_Storage);
        _Load_barrier(_Order);
        return reinterpret_cast<_Ty&>(_As_bytes);
    }
 
Billy3
 
From: [hidden email]
Sent: Tuesday, July 30, 2019 8:43 PM
To: [hidden email]
Cc: [hidden email]; [hidden email]; [hidden email]
Subject: Re: [cfe-dev] FYI, Intel folks might be looking to add the __iso_volatile_Xxx family for MSVC STL <atomic> soon
 
 


On Jul 30, 2019, at 5:21 PM, Eli Friedman via cfe-dev <[hidden email]> wrote:
 
From: cfe-dev <[hidden email]> On Behalf Of Reid Kleckner via cfe-dev
Sent: Tuesday, July 30, 2019 4:54 PM
To: Billy O'Neal (VC LIBS) <[hidden email]>
Cc: cfe-dev <[hidden email]>
Subject: [EXT] Re: [cfe-dev] FYI, Intel folks might be looking to add the __iso_volatile_Xxx family for MSVC STL <atomic> soon
 
On Tue, Jul 30, 2019 at 4:15 PM Billy O'Neal (VC LIBS) <[hidden email]> wrote:
> we can worry about optimization, perhaps using _Atomic, later
 
We can’t use _Atomic because it would not be ABI compatible with our std::atomic; in particular, because we put the spinlock for non-lock-free atomics inside the atomic, for instance. And because that isn’t a thing for some of our supported frontends.
 
It is possible that there are different intrinsics we could call for Clang that would make you folks happier, but we don’t know what those are or even if they exist at present.
 
I was thinking that perhaps the _Atomic_address_as template would do the necessary casts to use it when necessary without changing the storage type inside the std::atomic object.
 
There are actually multiple APIs to access C++11 atomics in clang, since we invented APIs in parallel with gcc.  The __atomic_* builtins (https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html) don’t require the use of _Atomic types.
 
Using volatile loads and stores with barriers probably prevents the compiler from performing any breaking optimizations, assuming the RMW atomics are protected appropriately, and the user doesn’t call any of the clang atomic builtins directly.  But it’s not a good idea; it won’t optimize well.  For example, on AArch64, it will prevent the compiler from using the dedicated load-acquire and store-release instructions.
 
TL;DR: what are these _iso_ things? I read the bug report, I don’t see the details I’d expect.
I think the __atomic_* builtins should work as Eli suggests.
 


-Eli
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev


_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: FYI, Intel folks might be looking to add the __iso_volatile_Xxx family for MSVC STL <atomic> soon

Nathan Ridge via cfe-dev

On Jul 30, 2019, at 8:59 PM, Billy O'Neal (VC LIBS) <[hidden email]> wrote:

These are the builtins I have, not necessarily the builtins I desire.

Definitely the builtins you deserve 🦹‍♂️

 
Billy3
 

From: [hidden email] <[hidden email]> on behalf of JF Bastien <[hidden email]>
Sent: Tuesday, July 30, 2019 8:58:14 PM
To: Billy O'Neal (VC LIBS) <[hidden email]>
Cc: Eli Friedman <[hidden email]>; [hidden email] <[hidden email]>; cfe-dev <[hidden email]>
Subject: Re: [cfe-dev] FYI, Intel folks might be looking to add the __iso_volatile_Xxx family for MSVC STL <atomic> soon
 
Eh, those aren’t the builtins I’d choose, but you do you… Reid’s implementation seems to match the quirks chosen, so I’m not sure there’s anything else to do here.


On Jul 30, 2019, at 8:54 PM, Billy O'Neal (VC LIBS) <[hidden email]> wrote:

> TL;DR: what are these _iso_ things?
 
They are volatile ops that are guaranteed to use 1 instruction to do that load. So for x86 that means using x87, or MMX, or SSE to do the load. They also suppress the ‘volatile is affected by the /volatile:ms vs. /volatile:iso setting’ warnings. <atomic> uses these in conjunction with barrier instrinsics.
 
#if defined(_M_ARM) || defined(_M_ARM64)
#define _Memory_barrier() __dmb(0xB) // inner shared data memory barrier
#define _Compiler_or_memory_barrier() _Memory_barrier()
 
#define _ISO_VOLATILE_STORE8(_Storage, _Value) __iso_volatile_store8(_Atomic_address_as<char>(_Storage), _Value)
#define _ISO_VOLATILE_STORE16(_Storage, _Value) __iso_volatile_store16(_Atomic_address_as<short>(_Storage), _Value)
#define _ISO_VOLATILE_STORE32(_Storage, _Value) __iso_volatile_store32(_Atomic_address_as<int>(_Storage), _Value)
#define _ISO_VOLATILE_STORE64(_Storage, _Value) __iso_volatile_store64(_Atomic_address_as<long long>(_Storage), _Value)
#define _ISO_VOLATILE_LOAD8(_Storage) __iso_volatile_load8(_Atomic_address_as<const char>(_Storage))
#define _ISO_VOLATILE_LOAD16(_Storage) __iso_volatile_load16(_Atomic_address_as<const short>(_Storage))
 
#elif defined(_M_IX86) || defined(_M_X64)
// x86/x64 hardware only emits memory barriers inside _Interlocked intrinsics
#define _Compiler_or_memory_barrier() _Compiler_barrier()
 
#define _ISO_VOLATILE_STORE8(_Storage, _Value) (*_Atomic_address_as<char>(_Storage) = _Value)
#define _ISO_VOLATILE_STORE16(_Storage, _Value) (*_Atomic_address_as<short>(_Storage) = _Value)
#define _ISO_VOLATILE_STORE32(_Storage, _Value) (*_Atomic_address_as<long>(_Storage) = _Value)
#define _ISO_VOLATILE_STORE64(_Storage, _Value) (*_Atomic_address_as<long long>(_Storage) = _Value)
#define _ISO_VOLATILE_LOAD8(_Storage) (*_Atomic_address_as<const char>(_Storage))
#define _ISO_VOLATILE_LOAD16(_Storage) (*_Atomic_address_as<const short>(_Storage))
 
#else // ^^^ x86/x64 / unsupported hardware vvv
#error Unsupported hardware
#endif // hardware
 
 
…. Later…
 
// in _Atomic_storage<1>
    _NODISCARD _Ty load() const noexcept { // load with sequential consistency
        char _As_bytes = _ISO_VOLATILE_LOAD8(_Storage);
        _Compiler_or_memory_barrier();
        return reinterpret_cast<_Ty&>(_As_bytes);
    }
 
    _NODISCARD _Ty load(const memory_order _Order) const noexcept { // load with given memory order
        char _As_bytes = _ISO_VOLATILE_LOAD8(_Storage);
        _Load_barrier(_Order);
        return reinterpret_cast<_Ty&>(_As_bytes);
    }
 
Billy3
 
From: [hidden email]
Sent: Tuesday, July 30, 2019 8:43 PM
To: [hidden email]
Cc: [hidden email]; [hidden email]; [hidden email]
Subject: Re: [cfe-dev] FYI, Intel folks might be looking to add the __iso_volatile_Xxx family for MSVC STL <atomic> soon
 
 


On Jul 30, 2019, at 5:21 PM, Eli Friedman via cfe-dev <[hidden email]> wrote:
 
From: cfe-dev <[hidden email]> On Behalf Of Reid Kleckner via cfe-dev
Sent: Tuesday, July 30, 2019 4:54 PM
To: Billy O'Neal (VC LIBS) <[hidden email]>
Cc: cfe-dev <[hidden email]>
Subject: [EXT] Re: [cfe-dev] FYI, Intel folks might be looking to add the __iso_volatile_Xxx family for MSVC STL <atomic> soon
 
On Tue, Jul 30, 2019 at 4:15 PM Billy O'Neal (VC LIBS) <[hidden email]> wrote:
> we can worry about optimization, perhaps using _Atomic, later
 
We can’t use _Atomic because it would not be ABI compatible with our std::atomic; in particular, because we put the spinlock for non-lock-free atomics inside the atomic, for instance. And because that isn’t a thing for some of our supported frontends.
 
It is possible that there are different intrinsics we could call for Clang that would make you folks happier, but we don’t know what those are or even if they exist at present.
 
I was thinking that perhaps the _Atomic_address_as template would do the necessary casts to use it when necessary without changing the storage type inside the std::atomic object.
 
There are actually multiple APIs to access C++11 atomics in clang, since we invented APIs in parallel with gcc.  The __atomic_* builtins (https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html) don’t require the use of _Atomic types.
 
Using volatile loads and stores with barriers probably prevents the compiler from performing any breaking optimizations, assuming the RMW atomics are protected appropriately, and the user doesn’t call any of the clang atomic builtins directly.  But it’s not a good idea; it won’t optimize well.  For example, on AArch64, it will prevent the compiler from using the dedicated load-acquire and store-release instructions.
 
TL;DR: what are these _iso_ things? I read the bug report, I don’t see the details I’d expect.
I think the __atomic_* builtins should work as Eli suggests.
 


-Eli
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev


_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: FYI, Intel folks might be looking to add the __iso_volatile_Xxx family for MSVC STL <atomic> soon

Nathan Ridge via cfe-dev

 


From: [hidden email] <[hidden email]> on behalf of JF Bastien <[hidden email]>
Sent: Tuesday, July 30, 2019 9:00:43 PM
To: Billy O'Neal (VC LIBS) <[hidden email]>
Cc: Eli Friedman <[hidden email]>; [hidden email] <[hidden email]>; cfe-dev <[hidden email]>
Subject: Re: [cfe-dev] FYI, Intel folks might be looking to add the __iso_volatile_Xxx family for MSVC STL <atomic> soon
 

On Jul 30, 2019, at 8:59 PM, Billy O'Neal (VC LIBS) <[hidden email]> wrote:

These are the builtins I have, not necessarily the builtins I desire.

Definitely the builtins you deserve 🦹‍♂️

 
Billy3
 

From: [hidden email] <[hidden email]> on behalf of JF Bastien <[hidden email]>
Sent: Tuesday, July 30, 2019 8:58:14 PM
To: Billy O'Neal (VC LIBS) <[hidden email]>
Cc: Eli Friedman <[hidden email]>; [hidden email] <[hidden email]>; cfe-dev <[hidden email]>
Subject: Re: [cfe-dev] FYI, Intel folks might be looking to add the __iso_volatile_Xxx family for MSVC STL <atomic> soon
 
Eh, those aren’t the builtins I’d choose, but you do you… Reid’s implementation seems to match the quirks chosen, so I’m not sure there’s anything else to do here.


On Jul 30, 2019, at 8:54 PM, Billy O'Neal (VC LIBS) <[hidden email]> wrote:

> TL;DR: what are these _iso_ things?
 
They are volatile ops that are guaranteed to use 1 instruction to do that load. So for x86 that means using x87, or MMX, or SSE to do the load. They also suppress the ‘volatile is affected by the /volatile:ms vs. /volatile:iso setting’ warnings. <atomic> uses these in conjunction with barrier instrinsics.
 
#if defined(_M_ARM) || defined(_M_ARM64)
#define _Memory_barrier() __dmb(0xB) // inner shared data memory barrier
#define _Compiler_or_memory_barrier() _Memory_barrier()
 
#define _ISO_VOLATILE_STORE8(_Storage, _Value) __iso_volatile_store8(_Atomic_address_as<char>(_Storage), _Value)
#define _ISO_VOLATILE_STORE16(_Storage, _Value) __iso_volatile_store16(_Atomic_address_as<short>(_Storage), _Value)
#define _ISO_VOLATILE_STORE32(_Storage, _Value) __iso_volatile_store32(_Atomic_address_as<int>(_Storage), _Value)
#define _ISO_VOLATILE_STORE64(_Storage, _Value) __iso_volatile_store64(_Atomic_address_as<long long>(_Storage), _Value)
#define _ISO_VOLATILE_LOAD8(_Storage) __iso_volatile_load8(_Atomic_address_as<const char>(_Storage))
#define _ISO_VOLATILE_LOAD16(_Storage) __iso_volatile_load16(_Atomic_address_as<const short>(_Storage))
 
#elif defined(_M_IX86) || defined(_M_X64)
// x86/x64 hardware only emits memory barriers inside _Interlocked intrinsics
#define _Compiler_or_memory_barrier() _Compiler_barrier()
 
#define _ISO_VOLATILE_STORE8(_Storage, _Value) (*_Atomic_address_as<char>(_Storage) = _Value)
#define _ISO_VOLATILE_STORE16(_Storage, _Value) (*_Atomic_address_as<short>(_Storage) = _Value)
#define _ISO_VOLATILE_STORE32(_Storage, _Value) (*_Atomic_address_as<long>(_Storage) = _Value)
#define _ISO_VOLATILE_STORE64(_Storage, _Value) (*_Atomic_address_as<long long>(_Storage) = _Value)
#define _ISO_VOLATILE_LOAD8(_Storage) (*_Atomic_address_as<const char>(_Storage))
#define _ISO_VOLATILE_LOAD16(_Storage) (*_Atomic_address_as<const short>(_Storage))
 
#else // ^^^ x86/x64 / unsupported hardware vvv
#error Unsupported hardware
#endif // hardware
 
 
…. Later…
 
// in _Atomic_storage<1>
    _NODISCARD _Ty load() const noexcept { // load with sequential consistency
        char _As_bytes = _ISO_VOLATILE_LOAD8(_Storage);
        _Compiler_or_memory_barrier();
        return reinterpret_cast<_Ty&>(_As_bytes);
    }
 
    _NODISCARD _Ty load(const memory_order _Order) const noexcept { // load with given memory order
        char _As_bytes = _ISO_VOLATILE_LOAD8(_Storage);
        _Load_barrier(_Order);
        return reinterpret_cast<_Ty&>(_As_bytes);
    }
 
Billy3
 
From: [hidden email]
Sent: Tuesday, July 30, 2019 8:43 PM
To: [hidden email]
Cc: [hidden email]; [hidden email]; [hidden email]
Subject: Re: [cfe-dev] FYI, Intel folks might be looking to add the __iso_volatile_Xxx family for MSVC STL <atomic> soon
 
 


On Jul 30, 2019, at 5:21 PM, Eli Friedman via cfe-dev <[hidden email]> wrote:
 
From: cfe-dev <[hidden email]> On Behalf Of Reid Kleckner via cfe-dev
Sent: Tuesday, July 30, 2019 4:54 PM
To: Billy O'Neal (VC LIBS) <[hidden email]>
Cc: cfe-dev <[hidden email]>
Subject: [EXT] Re: [cfe-dev] FYI, Intel folks might be looking to add the __iso_volatile_Xxx family for MSVC STL <atomic> soon
 
On Tue, Jul 30, 2019 at 4:15 PM Billy O'Neal (VC LIBS) <[hidden email]> wrote:
> we can worry about optimization, perhaps using _Atomic, later
 
We can’t use _Atomic because it would not be ABI compatible with our std::atomic; in particular, because we put the spinlock for non-lock-free atomics inside the atomic, for instance. And because that isn’t a thing for some of our supported frontends.
 
It is possible that there are different intrinsics we could call for Clang that would make you folks happier, but we don’t know what those are or even if they exist at present.
 
I was thinking that perhaps the _Atomic_address_as template would do the necessary casts to use it when necessary without changing the storage type inside the std::atomic object.
 
There are actually multiple APIs to access C++11 atomics in clang, since we invented APIs in parallel with gcc.  The __atomic_* builtins (https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html) don’t require the use of _Atomic types.
 
Using volatile loads and stores with barriers probably prevents the compiler from performing any breaking optimizations, assuming the RMW atomics are protected appropriately, and the user doesn’t call any of the clang atomic builtins directly.  But it’s not a good idea; it won’t optimize well.  For example, on AArch64, it will prevent the compiler from using the dedicated load-acquire and store-release instructions.
 
TL;DR: what are these _iso_ things? I read the bug report, I don’t see the details I’d expect.
I think the __atomic_* builtins should work as Eli suggests.
 


-Eli
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev


_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
12