Optimization creates infinite loop

classic Classic list List threaded Threaded
11 messages Options
Reply | Threaded
Open this post in threaded view
|

Optimization creates infinite loop

Kristof Beyls via cfe-dev

Hello Clang-Developer,

 

We ran into a – for us big problem – with the Clang compiler. We had the following code:

extern double RT_Time;

void Sample1sec()

{

       const double startTime = RT_Time;

       while(RT_Time < (startTime + 5.0005));

}

 

The value of RT_Time will change after a while and the loop is supposed to wait for it. However the assembly code will lead into an endless loop if the condition is not meet the first time:

"?Sample1sec@@YAXXZ":                   # @"?Sample1sec@@YAXXZ"

.Lfunc_begin0:

                .cv_func_id 0

# %bb.0:

                .cv_file 1 "C:\\Users\\h16020\\Desktop\\Projects\\Squirrel\\Project\\Leviathan\\CM_Driver1\\CM_Driver1.cpp" "4E92768969662F6E513C11595FEA9D63" 1

                .cv_loc  0 1 5 0                 # CM_Driver1.cpp:5:0

                movsd  xmm0, qword ptr [rip + "?RT_Time@@3NA"] # xmm0 = mem[0],zero

.Ltmp0:

                #DEBUG_VALUE: Sample1sec:startTime <- $xmm0

                movsd  xmm1, qword ptr [rip + __real@40140083126e978d] # xmm1 = mem[0],zero

                .cv_loc  0 1 0 0                 # CM_Driver1.cpp:0:0

                addsd   xmm1, xmm0

                ucomisd               xmm1, xmm0

                .cv_loc  0 1 6 0                 # CM_Driver1.cpp:6:0

                jbe         .LBB0_2

.Ltmp1:

                .p2align                4, 0x90

.LBB0_1:                                # =>This Inner Loop Header: Depth=1

                #DEBUG_VALUE: Sample1sec:startTime <- $xmm0

                jmp        .LBB0_1

.Ltmp2:

.LBB0_2:

                #DEBUG_VALUE: Sample1sec:startTime <- $xmm0

                .cv_loc  0 1 7 0                 # CM_Driver1.cpp:7:0

                ret

.Ltmp3:

.Lfunc_end0:

 

Marking RT_Time with volatile fixes the problem. But I wonder, why is doing Clang that optimization? RT_Time was not declared ‘const’ so didn’t Clang had to know the value can change?

Is this a bug or do I understand something wrong?

 

Kind greetings and thank you for the help in advance,

Björn

 

P.S.

My compile flags (when using -###) are:

clang-cl.exe "-cc1" "-triple" "x86_64-pc-windows-msvc19.16.27034" "-S" "-disable-free" "-disable-llvm-verifier" "-discard-value-names" "-main-file-name" "CM_Driver1.cpp" "-mrelocation-model" "pic" "-pic-level" "2" "-mthread-model" "posix" "-relaxed-aliasing" "-fmath-errno" "-masm-verbose" "-mconstructor-aliases" "-munwind-tables" "-target-cpu" "x86-64" "-mllvm" "-x86-asm-syntax=intel" "-D_MT" "-flto-visibility-public-std" "--dependent-lib=libcmt" "--dependent-lib=oldnames" "-fcxx-exceptions" "-fexceptions" "-fexternc-nounwind" "-fms-volatile" "-fdefault-calling-conv=cdecl" "-fdiagnostics-format" "msvc" "-gcodeview" "-debug-info-kind=limited" "-momit-leaf-frame-pointer" "-ffunction-sections" "-coverage-notes-file" "C:\\Users\\h16020\\Desktop\\Projects\\Squirrel\\Project\\Leviathan\\CM_Driver1\\CM_Driver1.gcno" "-resource-dir" "S:\\STSHARE\\EXTERNAL\\LLVM\\lib\\clang\\9.0.0" "-D" "WIN32" "-D" "NTSYSAD_CM" "-D" "_CONSOLE" "-D" "CODEMODULE" "-D" "__asm=RemoveAssembly" "-D" "_UNICODE" "-D" "UNICODE" "-internal-isystem" "S:\\STSHARE\\EXTERNAL\\LLVM\\lib\\clang\\9.0.0\\include" "-internal-isystem" "S:\\STSHARE\\EXTERNAL\\LLVM\\Include" "-internal-isystem" "S:\\STEXEC\\ExecutionEnvironment\\INCLUDE" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Professional\\VC\\Tools\\MSVC\\14.16.27023\\include" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Professional\\VC\\Tools\\MSVC\\14.16.27023\\atlmfc\\include" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Professional\\VC\\Auxiliary\\VS\\include" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.10240.0\\ucrt" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\8.1\\Include\\um" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\8.1\\Include\\shared" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\8.1\\Include\\winrt" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\NETFXSDK\\4.6.1\\Include\\um" "-O2" "-WCL4" "-Wno-error" "-Wno-unused-command-line-argument" "-Wno-microsoft-cast" "-Wno-writable-strings" "-Wno-microsoft-enum-forward-reference" "-Wno-invalid-token-paste" "-fdeprecated-macro" "-fdebug-compilation-dir" "C:\\Users\\h16020\\Desktop\\Projects\\Squirrel\\Project\\Leviathan\\CM_Driver1" "-ferror-limit" "3000" "-fmessage-length" "0" "-fno-use-cxa-atexit" "-fms-extensions" "-fms-compatibility" "-fms-compatibility-version=19.16.27034" "-std=c++14" "-fno-threadsafe-statics" "-fdelayed-template-parsing" "-finline-functions" "-fobjc-runtime=gcc" "-fno-caret-diagnostics" "-fdiagnostics-show-option" "-fno-show-column" "-vectorize-loops" "-vectorize-slp" "-faddrsig" "-o" "x64\\Release\\CM_Driver1.asm" "-x" "c++" "CM_Driver1.cpp"

 

clang-cl.exe "-cc1as" "-triple" "x86_64-pc-windows-msvc19.16.27034" "-filetype" "obj" "-main-file-name" "CM_Driver1.cpp" "-target-cpu" "x86-64" "-dwarf-version=4" "-mrelocation-model" "pic" "-mincremental-linker-compatible" "-o" "x64\\Release\\CM_Driver1.obj" "x64\\Release\\CM_Driver1.asm"

Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB 9816, USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr. Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi Fukushima. Junichi Tajika
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Optimization creates infinite loop

Kristof Beyls via cfe-dev
Hello,

This isn't really surprising. Since RT_Time is not marked volatile,
nor is it accessed through std::atomic functions (or the equivalent
builtins), the compiler has no reason to assume that any external
actor can change the value of it. So the load is moved out of the loop
since doing it repeatedly would, from the compiler's point of view, be
wasteful.

If RT_Time can be changed by hardware or something along those lines,
then volatile is appropriate. If another thread can change it, you
should be using atomic APIs to access it.


On Fri, Nov 1, 2019 at 10:59 AM Gaier, Bjoern via cfe-dev
<[hidden email]> wrote:

>
> Hello Clang-Developer,
>
>
>
> We ran into a – for us big problem – with the Clang compiler. We had the following code:
>
> extern double RT_Time;
>
> void Sample1sec()
>
> {
>
>        const double startTime = RT_Time;
>
>        while(RT_Time < (startTime + 5.0005));
>
> }
>
>
>
> The value of RT_Time will change after a while and the loop is supposed to wait for it. However the assembly code will lead into an endless loop if the condition is not meet the first time:
>
> "?Sample1sec@@YAXXZ":                   # @"?Sample1sec@@YAXXZ"
>
> .Lfunc_begin0:
>
>                 .cv_func_id 0
>
> # %bb.0:
>
>                 .cv_file 1 "C:\\Users\\h16020\\Desktop\\Projects\\Squirrel\\Project\\Leviathan\\CM_Driver1\\CM_Driver1.cpp" "4E92768969662F6E513C11595FEA9D63" 1
>
>                 .cv_loc  0 1 5 0                 # CM_Driver1.cpp:5:0
>
>                 movsd  xmm0, qword ptr [rip + "?RT_Time@@3NA"] # xmm0 = mem[0],zero
>
> .Ltmp0:
>
>                 #DEBUG_VALUE: Sample1sec:startTime <- $xmm0
>
>                 movsd  xmm1, qword ptr [rip + __real@40140083126e978d] # xmm1 = mem[0],zero
>
>                 .cv_loc  0 1 0 0                 # CM_Driver1.cpp:0:0
>
>                 addsd   xmm1, xmm0
>
>                 ucomisd               xmm1, xmm0
>
>                 .cv_loc  0 1 6 0                 # CM_Driver1.cpp:6:0
>
>                 jbe         .LBB0_2
>
> .Ltmp1:
>
>                 .p2align                4, 0x90
>
> .LBB0_1:                                # =>This Inner Loop Header: Depth=1
>
>                 #DEBUG_VALUE: Sample1sec:startTime <- $xmm0
>
>                 jmp        .LBB0_1
>
> .Ltmp2:
>
> .LBB0_2:
>
>                 #DEBUG_VALUE: Sample1sec:startTime <- $xmm0
>
>                 .cv_loc  0 1 7 0                 # CM_Driver1.cpp:7:0
>
>                 ret
>
> .Ltmp3:
>
> .Lfunc_end0:
>
>
>
> Marking RT_Time with volatile fixes the problem. But I wonder, why is doing Clang that optimization? RT_Time was not declared ‘const’ so didn’t Clang had to know the value can change?
>
> Is this a bug or do I understand something wrong?
>
>
>
> Kind greetings and thank you for the help in advance,
>
> Björn
>
>
>
> P.S.
>
> My compile flags (when using -###) are:
>
> clang-cl.exe "-cc1" "-triple" "x86_64-pc-windows-msvc19.16.27034" "-S" "-disable-free" "-disable-llvm-verifier" "-discard-value-names" "-main-file-name" "CM_Driver1.cpp" "-mrelocation-model" "pic" "-pic-level" "2" "-mthread-model" "posix" "-relaxed-aliasing" "-fmath-errno" "-masm-verbose" "-mconstructor-aliases" "-munwind-tables" "-target-cpu" "x86-64" "-mllvm" "-x86-asm-syntax=intel" "-D_MT" "-flto-visibility-public-std" "--dependent-lib=libcmt" "--dependent-lib=oldnames" "-fcxx-exceptions" "-fexceptions" "-fexternc-nounwind" "-fms-volatile" "-fdefault-calling-conv=cdecl" "-fdiagnostics-format" "msvc" "-gcodeview" "-debug-info-kind=limited" "-momit-leaf-frame-pointer" "-ffunction-sections" "-coverage-notes-file" "C:\\Users\\h16020\\Desktop\\Projects\\Squirrel\\Project\\Leviathan\\CM_Driver1\\CM_Driver1.gcno" "-resource-dir" "S:\\STSHARE\\EXTERNAL\\LLVM\\lib\\clang\\9.0.0" "-D" "WIN32" "-D" "NTSYSAD_CM" "-D" "_CONSOLE" "-D" "CODEMODULE" "-D" "__asm=RemoveAssembly" "-D" "_UNICODE" "-D" "UNICODE" "-internal-isystem" "S:\\STSHARE\\EXTERNAL\\LLVM\\lib\\clang\\9.0.0\\include" "-internal-isystem" "S:\\STSHARE\\EXTERNAL\\LLVM\\Include" "-internal-isystem" "S:\\STEXEC\\ExecutionEnvironment\\INCLUDE" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Professional\\VC\\Tools\\MSVC\\14.16.27023\\include" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Professional\\VC\\Tools\\MSVC\\14.16.27023\\atlmfc\\include" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Professional\\VC\\Auxiliary\\VS\\include" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.10240.0\\ucrt" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\8.1\\Include\\um" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\8.1\\Include\\shared" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\8.1\\Include\\winrt" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\NETFXSDK\\4.6.1\\Include\\um" "-O2" "-WCL4" "-Wno-error" "-Wno-unused-command-line-argument" "-Wno-microsoft-cast" "-Wno-writable-strings" "-Wno-microsoft-enum-forward-reference" "-Wno-invalid-token-paste" "-fdeprecated-macro" "-fdebug-compilation-dir" "C:\\Users\\h16020\\Desktop\\Projects\\Squirrel\\Project\\Leviathan\\CM_Driver1" "-ferror-limit" "3000" "-fmessage-length" "0" "-fno-use-cxa-atexit" "-fms-extensions" "-fms-compatibility" "-fms-compatibility-version=19.16.27034" "-std=c++14" "-fno-threadsafe-statics" "-fdelayed-template-parsing" "-finline-functions" "-fobjc-runtime=gcc" "-fno-caret-diagnostics" "-fdiagnostics-show-option" "-fno-show-column" "-vectorize-loops" "-vectorize-slp" "-faddrsig" "-o" "x64\\Release\\CM_Driver1.asm" "-x" "c++" "CM_Driver1.cpp"
>
>
>
> clang-cl.exe "-cc1as" "-triple" "x86_64-pc-windows-msvc19.16.27034" "-filetype" "obj" "-main-file-name" "CM_Driver1.cpp" "-target-cpu" "x86-64" "-dwarf-version=4" "-mrelocation-model" "pic" "-mincremental-linker-compatible" "-o" "x64\\Release\\CM_Driver1.obj" "x64\\Release\\CM_Driver1.asm"
>
> Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB 9816, USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr. Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi Fukushima. Junichi Tajika
> _______________________________________________
> cfe-dev mailing list
> [hidden email]
> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Optimization creates infinite loop

Kristof Beyls via cfe-dev
Hey,

Thanks for the answer! As said, volatile fixed the issue for us - however you said:
"the compiler has no reason to assume that any external actor can change the value of it"
I would understand that sentence, when the variable would be declared 'const' - but it was not. Isn't the const statement supposed to declare such situations, when the value will not be changed?

That sounds like any extern variable that will be changed should be marked 'volatile'.

-----Original Message-----
From: Alex Rønne Petersen <[hidden email]>
Sent: 01 November 2019 11:11
To: Gaier, Bjoern <[hidden email]>
Cc: [hidden email]
Subject: Re: [cfe-dev] Optimization creates infinite loop

Hello,

This isn't really surprising. Since RT_Time is not marked volatile, nor is it accessed through std::atomic functions (or the equivalent builtins), the compiler has no reason to assume that any external actor can change the value of it. So the load is moved out of the loop since doing it repeatedly would, from the compiler's point of view, be wasteful.

If RT_Time can be changed by hardware or something along those lines, then volatile is appropriate. If another thread can change it, you should be using atomic APIs to access it.


On Fri, Nov 1, 2019 at 10:59 AM Gaier, Bjoern via cfe-dev <[hidden email]> wrote:

>
> Hello Clang-Developer,
>
>
>
> We ran into a – for us big problem – with the Clang compiler. We had the following code:
>
> extern double RT_Time;
>
> void Sample1sec()
>
> {
>
>        const double startTime = RT_Time;
>
>        while(RT_Time < (startTime + 5.0005));
>
> }
>
>
>
> The value of RT_Time will change after a while and the loop is supposed to wait for it. However the assembly code will lead into an endless loop if the condition is not meet the first time:
>
> "?Sample1sec@@YAXXZ":                   # @"?Sample1sec@@YAXXZ"
>
> .Lfunc_begin0:
>
>                 .cv_func_id 0
>
> # %bb.0:
>
>                 .cv_file 1
> "C:\\Users\\h16020\\Desktop\\Projects\\Squirrel\\Project\\Leviathan\\C
> M_Driver1\\CM_Driver1.cpp" "4E92768969662F6E513C11595FEA9D63" 1
>
>                 .cv_loc  0 1 5 0                 # CM_Driver1.cpp:5:0
>
>                 movsd  xmm0, qword ptr [rip + "?RT_Time@@3NA"] # xmm0
> = mem[0],zero
>
> .Ltmp0:
>
>                 #DEBUG_VALUE: Sample1sec:startTime <- $xmm0
>
>                 movsd  xmm1, qword ptr [rip + __real@40140083126e978d]
> # xmm1 = mem[0],zero
>
>                 .cv_loc  0 1 0 0                 # CM_Driver1.cpp:0:0
>
>                 addsd   xmm1, xmm0
>
>                 ucomisd               xmm1, xmm0
>
>                 .cv_loc  0 1 6 0                 # CM_Driver1.cpp:6:0
>
>                 jbe         .LBB0_2
>
> .Ltmp1:
>
>                 .p2align                4, 0x90
>
> .LBB0_1:                                # =>This Inner Loop Header: Depth=1
>
>                 #DEBUG_VALUE: Sample1sec:startTime <- $xmm0
>
>                 jmp        .LBB0_1
>
> .Ltmp2:
>
> .LBB0_2:
>
>                 #DEBUG_VALUE: Sample1sec:startTime <- $xmm0
>
>                 .cv_loc  0 1 7 0                 # CM_Driver1.cpp:7:0
>
>                 ret
>
> .Ltmp3:
>
> .Lfunc_end0:
>
>
>
> Marking RT_Time with volatile fixes the problem. But I wonder, why is doing Clang that optimization? RT_Time was not declared ‘const’ so didn’t Clang had to know the value can change?
>
> Is this a bug or do I understand something wrong?
>
>
>
> Kind greetings and thank you for the help in advance,
>
> Björn
>
>
>
> P.S.
>
> My compile flags (when using -###) are:
>
> clang-cl.exe "-cc1" "-triple" "x86_64-pc-windows-msvc19.16.27034" "-S" "-disable-free" "-disable-llvm-verifier" "-discard-value-names" "-main-file-name" "CM_Driver1.cpp" "-mrelocation-model" "pic" "-pic-level" "2" "-mthread-model" "posix" "-relaxed-aliasing" "-fmath-errno" "-masm-verbose" "-mconstructor-aliases" "-munwind-tables" "-target-cpu" "x86-64" "-mllvm" "-x86-asm-syntax=intel" "-D_MT" "-flto-visibility-public-std" "--dependent-lib=libcmt" "--dependent-lib=oldnames" "-fcxx-exceptions" "-fexceptions" "-fexternc-nounwind" "-fms-volatile" "-fdefault-calling-conv=cdecl" "-fdiagnostics-format" "msvc" "-gcodeview" "-debug-info-kind=limited" "-momit-leaf-frame-pointer" "-ffunction-sections" "-coverage-notes-file" "C:\\Users\\h16020\\Desktop\\Projects\\Squirrel\\Project\\Leviathan\\CM_Driver1\\CM_Driver1.gcno" "-resource-dir" "S:\\STSHARE\\EXTERNAL\\LLVM\\lib\\clang\\9.0.0" "-D" "WIN32" "-D" "NTSYSAD_CM" "-D" "_CONSOLE" "-D" "CODEMODULE" "-D" "__asm=RemoveAssembly" "-D" "_UNICODE" "-D" "UNICODE" "-internal-isystem" "S:\\STSHARE\\EXTERNAL\\LLVM\\lib\\clang\\9.0.0\\include" "-internal-isystem" "S:\\STSHARE\\EXTERNAL\\LLVM\\Include" "-internal-isystem" "S:\\STEXEC\\ExecutionEnvironment\\INCLUDE" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Professional\\VC\\Tools\\MSVC\\14.16.27023\\include" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Professional\\VC\\Tools\\MSVC\\14.16.27023\\atlmfc\\include" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Professional\\VC\\Auxiliary\\VS\\include" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.10240.0\\ucrt" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\8.1\\Include\\um" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\8.1\\Include\\shared" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\8.1\\Include\\winrt" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\NETFXSDK\\4.6.1\\Include\\um" "-O2" "-WCL4" "-Wno-error" "-Wno-unused-command-line-argument" "-Wno-microsoft-cast" "-Wno-writable-strings" "-Wno-microsoft-enum-forward-reference" "-Wno-invalid-token-paste" "-fdeprecated-macro" "-fdebug-compilation-dir" "C:\\Users\\h16020\\Desktop\\Projects\\Squirrel\\Project\\Leviathan\\CM_Driver1" "-ferror-limit" "3000" "-fmessage-length" "0" "-fno-use-cxa-atexit" "-fms-extensions" "-fms-compatibility" "-fms-compatibility-version=19.16.27034" "-std=c++14" "-fno-threadsafe-statics" "-fdelayed-template-parsing" "-finline-functions" "-fobjc-runtime=gcc" "-fno-caret-diagnostics" "-fdiagnostics-show-option" "-fno-show-column" "-vectorize-loops" "-vectorize-slp" "-faddrsig" "-o" "x64\\Release\\CM_Driver1.asm" "-x" "c++" "CM_Driver1.cpp"
>
>
>
> clang-cl.exe "-cc1as" "-triple" "x86_64-pc-windows-msvc19.16.27034" "-filetype" "obj" "-main-file-name" "CM_Driver1.cpp" "-target-cpu" "x86-64" "-dwarf-version=4" "-mrelocation-model" "pic" "-mincremental-linker-compatible" "-o" "x64\\Release\\CM_Driver1.obj" "x64\\Release\\CM_Driver1.asm"
>
> Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB 9816,
> USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr.
> Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi
> Fukushima. Junichi Tajika
> _______________________________________________
> cfe-dev mailing list
> [hidden email]
> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB 9816, USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr. Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi Fukushima. Junichi Tajika
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Optimization creates infinite loop

Kristof Beyls via cfe-dev
Hello,

The compiler is making the (correct) assumption that RT_Time cannot be
changed by anything other than the currently executing thread since
you have not told it otherwise, and since the loop body (or rather,
lack thereof) does not modify it.

What exactly *is* changing RT_Time? Since the thread that executes
Sample1sec() is obviously intended to spin until it changes, I can
only think that either some hardware is changing it or another thread
is. In which case, see my earlier reply. The fact that it's extern
isn't really relevant; I'm fairly confident Clang would generate the
same code if you removed the extern keyword.

On Fri, Nov 1, 2019 at 11:18 AM Gaier, Bjoern <[hidden email]> wrote:

>
> Hey,
>
> Thanks for the answer! As said, volatile fixed the issue for us - however you said:
> "the compiler has no reason to assume that any external actor can change the value of it"
> I would understand that sentence, when the variable would be declared 'const' - but it was not. Isn't the const statement supposed to declare such situations, when the value will not be changed?
>
> That sounds like any extern variable that will be changed should be marked 'volatile'.
>
> -----Original Message-----
> From: Alex Rønne Petersen <[hidden email]>
> Sent: 01 November 2019 11:11
> To: Gaier, Bjoern <[hidden email]>
> Cc: [hidden email]
> Subject: Re: [cfe-dev] Optimization creates infinite loop
>
> Hello,
>
> This isn't really surprising. Since RT_Time is not marked volatile, nor is it accessed through std::atomic functions (or the equivalent builtins), the compiler has no reason to assume that any external actor can change the value of it. So the load is moved out of the loop since doing it repeatedly would, from the compiler's point of view, be wasteful.
>
> If RT_Time can be changed by hardware or something along those lines, then volatile is appropriate. If another thread can change it, you should be using atomic APIs to access it.
>
>
> On Fri, Nov 1, 2019 at 10:59 AM Gaier, Bjoern via cfe-dev <[hidden email]> wrote:
> >
> > Hello Clang-Developer,
> >
> >
> >
> > We ran into a – for us big problem – with the Clang compiler. We had the following code:
> >
> > extern double RT_Time;
> >
> > void Sample1sec()
> >
> > {
> >
> >        const double startTime = RT_Time;
> >
> >        while(RT_Time < (startTime + 5.0005));
> >
> > }
> >
> >
> >
> > The value of RT_Time will change after a while and the loop is supposed to wait for it. However the assembly code will lead into an endless loop if the condition is not meet the first time:
> >
> > "?Sample1sec@@YAXXZ":                   # @"?Sample1sec@@YAXXZ"
> >
> > .Lfunc_begin0:
> >
> >                 .cv_func_id 0
> >
> > # %bb.0:
> >
> >                 .cv_file 1
> > "C:\\Users\\h16020\\Desktop\\Projects\\Squirrel\\Project\\Leviathan\\C
> > M_Driver1\\CM_Driver1.cpp" "4E92768969662F6E513C11595FEA9D63" 1
> >
> >                 .cv_loc  0 1 5 0                 # CM_Driver1.cpp:5:0
> >
> >                 movsd  xmm0, qword ptr [rip + "?RT_Time@@3NA"] # xmm0
> > = mem[0],zero
> >
> > .Ltmp0:
> >
> >                 #DEBUG_VALUE: Sample1sec:startTime <- $xmm0
> >
> >                 movsd  xmm1, qword ptr [rip + __real@40140083126e978d]
> > # xmm1 = mem[0],zero
> >
> >                 .cv_loc  0 1 0 0                 # CM_Driver1.cpp:0:0
> >
> >                 addsd   xmm1, xmm0
> >
> >                 ucomisd               xmm1, xmm0
> >
> >                 .cv_loc  0 1 6 0                 # CM_Driver1.cpp:6:0
> >
> >                 jbe         .LBB0_2
> >
> > .Ltmp1:
> >
> >                 .p2align                4, 0x90
> >
> > .LBB0_1:                                # =>This Inner Loop Header: Depth=1
> >
> >                 #DEBUG_VALUE: Sample1sec:startTime <- $xmm0
> >
> >                 jmp        .LBB0_1
> >
> > .Ltmp2:
> >
> > .LBB0_2:
> >
> >                 #DEBUG_VALUE: Sample1sec:startTime <- $xmm0
> >
> >                 .cv_loc  0 1 7 0                 # CM_Driver1.cpp:7:0
> >
> >                 ret
> >
> > .Ltmp3:
> >
> > .Lfunc_end0:
> >
> >
> >
> > Marking RT_Time with volatile fixes the problem. But I wonder, why is doing Clang that optimization? RT_Time was not declared ‘const’ so didn’t Clang had to know the value can change?
> >
> > Is this a bug or do I understand something wrong?
> >
> >
> >
> > Kind greetings and thank you for the help in advance,
> >
> > Björn
> >
> >
> >
> > P.S.
> >
> > My compile flags (when using -###) are:
> >
> > clang-cl.exe "-cc1" "-triple" "x86_64-pc-windows-msvc19.16.27034" "-S" "-disable-free" "-disable-llvm-verifier" "-discard-value-names" "-main-file-name" "CM_Driver1.cpp" "-mrelocation-model" "pic" "-pic-level" "2" "-mthread-model" "posix" "-relaxed-aliasing" "-fmath-errno" "-masm-verbose" "-mconstructor-aliases" "-munwind-tables" "-target-cpu" "x86-64" "-mllvm" "-x86-asm-syntax=intel" "-D_MT" "-flto-visibility-public-std" "--dependent-lib=libcmt" "--dependent-lib=oldnames" "-fcxx-exceptions" "-fexceptions" "-fexternc-nounwind" "-fms-volatile" "-fdefault-calling-conv=cdecl" "-fdiagnostics-format" "msvc" "-gcodeview" "-debug-info-kind=limited" "-momit-leaf-frame-pointer" "-ffunction-sections" "-coverage-notes-file" "C:\\Users\\h16020\\Desktop\\Projects\\Squirrel\\Project\\Leviathan\\CM_Driver1\\CM_Driver1.gcno" "-resource-dir" "S:\\STSHARE\\EXTERNAL\\LLVM\\lib\\clang\\9.0.0" "-D" "WIN32" "-D" "NTSYSAD_CM" "-D" "_CONSOLE" "-D" "CODEMODULE" "-D" "__asm=RemoveAssembly" "-D" "_UNICODE" "-D" "UNICODE" "-internal-isystem" "S:\\STSHARE\\EXTERNAL\\LLVM\\lib\\clang\\9.0.0\\include" "-internal-isystem" "S:\\STSHARE\\EXTERNAL\\LLVM\\Include" "-internal-isystem" "S:\\STEXEC\\ExecutionEnvironment\\INCLUDE" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Professional\\VC\\Tools\\MSVC\\14.16.27023\\include" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Professional\\VC\\Tools\\MSVC\\14.16.27023\\atlmfc\\include" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Professional\\VC\\Auxiliary\\VS\\include" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.10240.0\\ucrt" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\8.1\\Include\\um" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\8.1\\Include\\shared" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\8.1\\Include\\winrt" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\NETFXSDK\\4.6.1\\Include\\um" "-O2" "-WCL4" "-Wno-error" "-Wno-unused-command-line-argument" "-Wno-microsoft-cast" "-Wno-writable-strings" "-Wno-microsoft-enum-forward-reference" "-Wno-invalid-token-paste" "-fdeprecated-macro" "-fdebug-compilation-dir" "C:\\Users\\h16020\\Desktop\\Projects\\Squirrel\\Project\\Leviathan\\CM_Driver1" "-ferror-limit" "3000" "-fmessage-length" "0" "-fno-use-cxa-atexit" "-fms-extensions" "-fms-compatibility" "-fms-compatibility-version=19.16.27034" "-std=c++14" "-fno-threadsafe-statics" "-fdelayed-template-parsing" "-finline-functions" "-fobjc-runtime=gcc" "-fno-caret-diagnostics" "-fdiagnostics-show-option" "-fno-show-column" "-vectorize-loops" "-vectorize-slp" "-faddrsig" "-o" "x64\\Release\\CM_Driver1.asm" "-x" "c++" "CM_Driver1.cpp"
> >
> >
> >
> > clang-cl.exe "-cc1as" "-triple" "x86_64-pc-windows-msvc19.16.27034" "-filetype" "obj" "-main-file-name" "CM_Driver1.cpp" "-target-cpu" "x86-64" "-dwarf-version=4" "-mrelocation-model" "pic" "-mincremental-linker-compatible" "-o" "x64\\Release\\CM_Driver1.obj" "x64\\Release\\CM_Driver1.asm"
> >
> > Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB 9816,
> > USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr.
> > Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi
> > Fukushima. Junichi Tajika
> > _______________________________________________
> > cfe-dev mailing list
> > [hidden email]
> > https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
> Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB 9816, USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr. Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi Fukushima. Junichi Tajika
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Optimization creates infinite loop

Kristof Beyls via cfe-dev
Hey,

You are right with your assumption - RT_Time is coming from a different object file and declared global there. You will also find the second thread there, that will modify the value.

I expected to tell Clang with "extern" that stuff is happening to this value in a different source file, and telling it without having the 'const' specifier, that this value will be changed.

-----Original Message-----
From: Alex Rønne Petersen <[hidden email]>
Sent: 01 November 2019 11:26
To: Gaier, Bjoern <[hidden email]>
Cc: [hidden email]
Subject: Re: [cfe-dev] Optimization creates infinite loop

Hello,

The compiler is making the (correct) assumption that RT_Time cannot be changed by anything other than the currently executing thread since you have not told it otherwise, and since the loop body (or rather, lack thereof) does not modify it.

What exactly *is* changing RT_Time? Since the thread that executes
Sample1sec() is obviously intended to spin until it changes, I can only think that either some hardware is changing it or another thread is. In which case, see my earlier reply. The fact that it's extern isn't really relevant; I'm fairly confident Clang would generate the same code if you removed the extern keyword.

On Fri, Nov 1, 2019 at 11:18 AM Gaier, Bjoern <[hidden email]> wrote:

>
> Hey,
>
> Thanks for the answer! As said, volatile fixed the issue for us - however you said:
> "the compiler has no reason to assume that any external actor can change the value of it"
> I would understand that sentence, when the variable would be declared 'const' - but it was not. Isn't the const statement supposed to declare such situations, when the value will not be changed?
>
> That sounds like any extern variable that will be changed should be marked 'volatile'.
>
> -----Original Message-----
> From: Alex Rønne Petersen <[hidden email]>
> Sent: 01 November 2019 11:11
> To: Gaier, Bjoern <[hidden email]>
> Cc: [hidden email]
> Subject: Re: [cfe-dev] Optimization creates infinite loop
>
> Hello,
>
> This isn't really surprising. Since RT_Time is not marked volatile, nor is it accessed through std::atomic functions (or the equivalent builtins), the compiler has no reason to assume that any external actor can change the value of it. So the load is moved out of the loop since doing it repeatedly would, from the compiler's point of view, be wasteful.
>
> If RT_Time can be changed by hardware or something along those lines, then volatile is appropriate. If another thread can change it, you should be using atomic APIs to access it.
>
>
> On Fri, Nov 1, 2019 at 10:59 AM Gaier, Bjoern via cfe-dev <[hidden email]> wrote:
> >
> > Hello Clang-Developer,
> >
> >
> >
> > We ran into a – for us big problem – with the Clang compiler. We had the following code:
> >
> > extern double RT_Time;
> >
> > void Sample1sec()
> >
> > {
> >
> >        const double startTime = RT_Time;
> >
> >        while(RT_Time < (startTime + 5.0005));
> >
> > }
> >
> >
> >
> > The value of RT_Time will change after a while and the loop is supposed to wait for it. However the assembly code will lead into an endless loop if the condition is not meet the first time:
> >
> > "?Sample1sec@@YAXXZ":                   # @"?Sample1sec@@YAXXZ"
> >
> > .Lfunc_begin0:
> >
> >                 .cv_func_id 0
> >
> > # %bb.0:
> >
> >                 .cv_file 1
> > "C:\\Users\\h16020\\Desktop\\Projects\\Squirrel\\Project\\Leviathan\
> > \C M_Driver1\\CM_Driver1.cpp" "4E92768969662F6E513C11595FEA9D63" 1
> >
> >                 .cv_loc  0 1 5 0                 # CM_Driver1.cpp:5:0
> >
> >                 movsd  xmm0, qword ptr [rip + "?RT_Time@@3NA"] #
> > xmm0 = mem[0],zero
> >
> > .Ltmp0:
> >
> >                 #DEBUG_VALUE: Sample1sec:startTime <- $xmm0
> >
> >                 movsd  xmm1, qword ptr [rip +
> > __real@40140083126e978d] # xmm1 = mem[0],zero
> >
> >                 .cv_loc  0 1 0 0                 # CM_Driver1.cpp:0:0
> >
> >                 addsd   xmm1, xmm0
> >
> >                 ucomisd               xmm1, xmm0
> >
> >                 .cv_loc  0 1 6 0                 # CM_Driver1.cpp:6:0
> >
> >                 jbe         .LBB0_2
> >
> > .Ltmp1:
> >
> >                 .p2align                4, 0x90
> >
> > .LBB0_1:                                # =>This Inner Loop Header: Depth=1
> >
> >                 #DEBUG_VALUE: Sample1sec:startTime <- $xmm0
> >
> >                 jmp        .LBB0_1
> >
> > .Ltmp2:
> >
> > .LBB0_2:
> >
> >                 #DEBUG_VALUE: Sample1sec:startTime <- $xmm0
> >
> >                 .cv_loc  0 1 7 0                 # CM_Driver1.cpp:7:0
> >
> >                 ret
> >
> > .Ltmp3:
> >
> > .Lfunc_end0:
> >
> >
> >
> > Marking RT_Time with volatile fixes the problem. But I wonder, why is doing Clang that optimization? RT_Time was not declared ‘const’ so didn’t Clang had to know the value can change?
> >
> > Is this a bug or do I understand something wrong?
> >
> >
> >
> > Kind greetings and thank you for the help in advance,
> >
> > Björn
> >
> >
> >
> > P.S.
> >
> > My compile flags (when using -###) are:
> >
> > clang-cl.exe "-cc1" "-triple" "x86_64-pc-windows-msvc19.16.27034" "-S" "-disable-free" "-disable-llvm-verifier" "-discard-value-names" "-main-file-name" "CM_Driver1.cpp" "-mrelocation-model" "pic" "-pic-level" "2" "-mthread-model" "posix" "-relaxed-aliasing" "-fmath-errno" "-masm-verbose" "-mconstructor-aliases" "-munwind-tables" "-target-cpu" "x86-64" "-mllvm" "-x86-asm-syntax=intel" "-D_MT" "-flto-visibility-public-std" "--dependent-lib=libcmt" "--dependent-lib=oldnames" "-fcxx-exceptions" "-fexceptions" "-fexternc-nounwind" "-fms-volatile" "-fdefault-calling-conv=cdecl" "-fdiagnostics-format" "msvc" "-gcodeview" "-debug-info-kind=limited" "-momit-leaf-frame-pointer" "-ffunction-sections" "-coverage-notes-file" "C:\\Users\\h16020\\Desktop\\Projects\\Squirrel\\Project\\Leviathan\\CM_Driver1\\CM_Driver1.gcno" "-resource-dir" "S:\\STSHARE\\EXTERNAL\\LLVM\\lib\\clang\\9.0.0" "-D" "WIN32" "-D" "NTSYSAD_CM" "-D" "_CONSOLE" "-D" "CODEMODULE" "-D" "__asm=RemoveAssembly" "-D" "_UNICODE" "-D" "UNICODE" "-internal-isystem" "S:\\STSHARE\\EXTERNAL\\LLVM\\lib\\clang\\9.0.0\\include" "-internal-isystem" "S:\\STSHARE\\EXTERNAL\\LLVM\\Include" "-internal-isystem" "S:\\STEXEC\\ExecutionEnvironment\\INCLUDE" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Professional\\VC\\Tools\\MSVC\\14.16.27023\\include" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Professional\\VC\\Tools\\MSVC\\14.16.27023\\atlmfc\\include" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Professional\\VC\\Auxiliary\\VS\\include" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.10240.0\\ucrt" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\8.1\\Include\\um" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\8.1\\Include\\shared" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\8.1\\Include\\winrt" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\NETFXSDK\\4.6.1\\Include\\um" "-O2" "-WCL4" "-Wno-error" "-Wno-unused-command-line-argument" "-Wno-microsoft-cast" "-Wno-writable-strings" "-Wno-microsoft-enum-forward-reference" "-Wno-invalid-token-paste" "-fdeprecated-macro" "-fdebug-compilation-dir" "C:\\Users\\h16020\\Desktop\\Projects\\Squirrel\\Project\\Leviathan\\CM_Driver1" "-ferror-limit" "3000" "-fmessage-length" "0" "-fno-use-cxa-atexit" "-fms-extensions" "-fms-compatibility" "-fms-compatibility-version=19.16.27034" "-std=c++14" "-fno-threadsafe-statics" "-fdelayed-template-parsing" "-finline-functions" "-fobjc-runtime=gcc" "-fno-caret-diagnostics" "-fdiagnostics-show-option" "-fno-show-column" "-vectorize-loops" "-vectorize-slp" "-faddrsig" "-o" "x64\\Release\\CM_Driver1.asm" "-x" "c++" "CM_Driver1.cpp"
> >
> >
> >
> > clang-cl.exe "-cc1as" "-triple" "x86_64-pc-windows-msvc19.16.27034" "-filetype" "obj" "-main-file-name" "CM_Driver1.cpp" "-target-cpu" "x86-64" "-dwarf-version=4" "-mrelocation-model" "pic" "-mincremental-linker-compatible" "-o" "x64\\Release\\CM_Driver1.obj" "x64\\Release\\CM_Driver1.asm"
> >
> > Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB 9816,
> > USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr.
> > Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi
> > Fukushima. Junichi Tajika
> > _______________________________________________
> > cfe-dev mailing list
> > [hidden email]
> > https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
> Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB 9816,
> USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr.
> Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi
> Fukushima. Junichi Tajika
Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB 9816, USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr. Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi Fukushima. Junichi Tajika
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Optimization creates infinite loop

Kristof Beyls via cfe-dev
Hello,

So in that case you'd want to do something like:

extern double RT_Time;
void Sample1sec()
{
       const double startTime = RT_Time;
       while (__atomic_load_n (&RT_Time, __ATOMIC_RELAXED) <
(startTime + 5.0005));
}

This is assuming you're unable to change the definition side of
RT_Time. If you *are* able to change it, you can (and should) just
make it an std::atomic<double> and use .load()/.store() with
memory_order_relaxed to interact with it. volatile would be the wrong
tool here since you're dealing with threads.


On Fri, Nov 1, 2019 at 11:35 AM Gaier, Bjoern <[hidden email]> wrote:

>
> Hey,
>
> You are right with your assumption - RT_Time is coming from a different object file and declared global there. You will also find the second thread there, that will modify the value.
>
> I expected to tell Clang with "extern" that stuff is happening to this value in a different source file, and telling it without having the 'const' specifier, that this value will be changed.
>
> -----Original Message-----
> From: Alex Rønne Petersen <[hidden email]>
> Sent: 01 November 2019 11:26
> To: Gaier, Bjoern <[hidden email]>
> Cc: [hidden email]
> Subject: Re: [cfe-dev] Optimization creates infinite loop
>
> Hello,
>
> The compiler is making the (correct) assumption that RT_Time cannot be changed by anything other than the currently executing thread since you have not told it otherwise, and since the loop body (or rather, lack thereof) does not modify it.
>
> What exactly *is* changing RT_Time? Since the thread that executes
> Sample1sec() is obviously intended to spin until it changes, I can only think that either some hardware is changing it or another thread is. In which case, see my earlier reply. The fact that it's extern isn't really relevant; I'm fairly confident Clang would generate the same code if you removed the extern keyword.
>
> On Fri, Nov 1, 2019 at 11:18 AM Gaier, Bjoern <[hidden email]> wrote:
> >
> > Hey,
> >
> > Thanks for the answer! As said, volatile fixed the issue for us - however you said:
> > "the compiler has no reason to assume that any external actor can change the value of it"
> > I would understand that sentence, when the variable would be declared 'const' - but it was not. Isn't the const statement supposed to declare such situations, when the value will not be changed?
> >
> > That sounds like any extern variable that will be changed should be marked 'volatile'.
> >
> > -----Original Message-----
> > From: Alex Rønne Petersen <[hidden email]>
> > Sent: 01 November 2019 11:11
> > To: Gaier, Bjoern <[hidden email]>
> > Cc: [hidden email]
> > Subject: Re: [cfe-dev] Optimization creates infinite loop
> >
> > Hello,
> >
> > This isn't really surprising. Since RT_Time is not marked volatile, nor is it accessed through std::atomic functions (or the equivalent builtins), the compiler has no reason to assume that any external actor can change the value of it. So the load is moved out of the loop since doing it repeatedly would, from the compiler's point of view, be wasteful.
> >
> > If RT_Time can be changed by hardware or something along those lines, then volatile is appropriate. If another thread can change it, you should be using atomic APIs to access it.
> >
> >
> > On Fri, Nov 1, 2019 at 10:59 AM Gaier, Bjoern via cfe-dev <[hidden email]> wrote:
> > >
> > > Hello Clang-Developer,
> > >
> > >
> > >
> > > We ran into a – for us big problem – with the Clang compiler. We had the following code:
> > >
> > > extern double RT_Time;
> > >
> > > void Sample1sec()
> > >
> > > {
> > >
> > >        const double startTime = RT_Time;
> > >
> > >        while(RT_Time < (startTime + 5.0005));
> > >
> > > }
> > >
> > >
> > >
> > > The value of RT_Time will change after a while and the loop is supposed to wait for it. However the assembly code will lead into an endless loop if the condition is not meet the first time:
> > >
> > > "?Sample1sec@@YAXXZ":                   # @"?Sample1sec@@YAXXZ"
> > >
> > > .Lfunc_begin0:
> > >
> > >                 .cv_func_id 0
> > >
> > > # %bb.0:
> > >
> > >                 .cv_file 1
> > > "C:\\Users\\h16020\\Desktop\\Projects\\Squirrel\\Project\\Leviathan\
> > > \C M_Driver1\\CM_Driver1.cpp" "4E92768969662F6E513C11595FEA9D63" 1
> > >
> > >                 .cv_loc  0 1 5 0                 # CM_Driver1.cpp:5:0
> > >
> > >                 movsd  xmm0, qword ptr [rip + "?RT_Time@@3NA"] #
> > > xmm0 = mem[0],zero
> > >
> > > .Ltmp0:
> > >
> > >                 #DEBUG_VALUE: Sample1sec:startTime <- $xmm0
> > >
> > >                 movsd  xmm1, qword ptr [rip +
> > > __real@40140083126e978d] # xmm1 = mem[0],zero
> > >
> > >                 .cv_loc  0 1 0 0                 # CM_Driver1.cpp:0:0
> > >
> > >                 addsd   xmm1, xmm0
> > >
> > >                 ucomisd               xmm1, xmm0
> > >
> > >                 .cv_loc  0 1 6 0                 # CM_Driver1.cpp:6:0
> > >
> > >                 jbe         .LBB0_2
> > >
> > > .Ltmp1:
> > >
> > >                 .p2align                4, 0x90
> > >
> > > .LBB0_1:                                # =>This Inner Loop Header: Depth=1
> > >
> > >                 #DEBUG_VALUE: Sample1sec:startTime <- $xmm0
> > >
> > >                 jmp        .LBB0_1
> > >
> > > .Ltmp2:
> > >
> > > .LBB0_2:
> > >
> > >                 #DEBUG_VALUE: Sample1sec:startTime <- $xmm0
> > >
> > >                 .cv_loc  0 1 7 0                 # CM_Driver1.cpp:7:0
> > >
> > >                 ret
> > >
> > > .Ltmp3:
> > >
> > > .Lfunc_end0:
> > >
> > >
> > >
> > > Marking RT_Time with volatile fixes the problem. But I wonder, why is doing Clang that optimization? RT_Time was not declared ‘const’ so didn’t Clang had to know the value can change?
> > >
> > > Is this a bug or do I understand something wrong?
> > >
> > >
> > >
> > > Kind greetings and thank you for the help in advance,
> > >
> > > Björn
> > >
> > >
> > >
> > > P.S.
> > >
> > > My compile flags (when using -###) are:
> > >
> > > clang-cl.exe "-cc1" "-triple" "x86_64-pc-windows-msvc19.16.27034" "-S" "-disable-free" "-disable-llvm-verifier" "-discard-value-names" "-main-file-name" "CM_Driver1.cpp" "-mrelocation-model" "pic" "-pic-level" "2" "-mthread-model" "posix" "-relaxed-aliasing" "-fmath-errno" "-masm-verbose" "-mconstructor-aliases" "-munwind-tables" "-target-cpu" "x86-64" "-mllvm" "-x86-asm-syntax=intel" "-D_MT" "-flto-visibility-public-std" "--dependent-lib=libcmt" "--dependent-lib=oldnames" "-fcxx-exceptions" "-fexceptions" "-fexternc-nounwind" "-fms-volatile" "-fdefault-calling-conv=cdecl" "-fdiagnostics-format" "msvc" "-gcodeview" "-debug-info-kind=limited" "-momit-leaf-frame-pointer" "-ffunction-sections" "-coverage-notes-file" "C:\\Users\\h16020\\Desktop\\Projects\\Squirrel\\Project\\Leviathan\\CM_Driver1\\CM_Driver1.gcno" "-resource-dir" "S:\\STSHARE\\EXTERNAL\\LLVM\\lib\\clang\\9.0.0" "-D" "WIN32" "-D" "NTSYSAD_CM" "-D" "_CONSOLE" "-D" "CODEMODULE" "-D" "__asm=RemoveAssembly" "-D" "_UNICODE" "-D" "UNICODE" "-internal-isystem" "S:\\STSHARE\\EXTERNAL\\LLVM\\lib\\clang\\9.0.0\\include" "-internal-isystem" "S:\\STSHARE\\EXTERNAL\\LLVM\\Include" "-internal-isystem" "S:\\STEXEC\\ExecutionEnvironment\\INCLUDE" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Professional\\VC\\Tools\\MSVC\\14.16.27023\\include" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Professional\\VC\\Tools\\MSVC\\14.16.27023\\atlmfc\\include" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Professional\\VC\\Auxiliary\\VS\\include" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.10240.0\\ucrt" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\8.1\\Include\\um" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\8.1\\Include\\shared" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\8.1\\Include\\winrt" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\NETFXSDK\\4.6.1\\Include\\um" "-O2" "-WCL4" "-Wno-error" "-Wno-unused-command-line-argument" "-Wno-microsoft-cast" "-Wno-writable-strings" "-Wno-microsoft-enum-forward-reference" "-Wno-invalid-token-paste" "-fdeprecated-macro" "-fdebug-compilation-dir" "C:\\Users\\h16020\\Desktop\\Projects\\Squirrel\\Project\\Leviathan\\CM_Driver1" "-ferror-limit" "3000" "-fmessage-length" "0" "-fno-use-cxa-atexit" "-fms-extensions" "-fms-compatibility" "-fms-compatibility-version=19.16.27034" "-std=c++14" "-fno-threadsafe-statics" "-fdelayed-template-parsing" "-finline-functions" "-fobjc-runtime=gcc" "-fno-caret-diagnostics" "-fdiagnostics-show-option" "-fno-show-column" "-vectorize-loops" "-vectorize-slp" "-faddrsig" "-o" "x64\\Release\\CM_Driver1.asm" "-x" "c++" "CM_Driver1.cpp"
> > >
> > >
> > >
> > > clang-cl.exe "-cc1as" "-triple" "x86_64-pc-windows-msvc19.16.27034" "-filetype" "obj" "-main-file-name" "CM_Driver1.cpp" "-target-cpu" "x86-64" "-dwarf-version=4" "-mrelocation-model" "pic" "-mincremental-linker-compatible" "-o" "x64\\Release\\CM_Driver1.obj" "x64\\Release\\CM_Driver1.asm"
> > >
> > > Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB 9816,
> > > USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr.
> > > Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi
> > > Fukushima. Junichi Tajika
> > > _______________________________________________
> > > cfe-dev mailing list
> > > [hidden email]
> > > https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
> > Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB 9816,
> > USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr.
> > Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi
> > Fukushima. Junichi Tajika
> Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB 9816, USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr. Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi Fukushima. Junichi Tajika
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Optimization creates infinite loop

Kristof Beyls via cfe-dev
Hey,

Thanks for the input! Does __atomic_load_n not generated additional overhead? In terms of instruction time and so on

-----Original Message-----
From: Alex Rønne Petersen <[hidden email]>
Sent: 01 November 2019 11:47
To: Gaier, Bjoern <[hidden email]>
Cc: [hidden email]
Subject: Re: [cfe-dev] Optimization creates infinite loop

Hello,

So in that case you'd want to do something like:

extern double RT_Time;
void Sample1sec()
{
       const double startTime = RT_Time;
       while (__atomic_load_n (&RT_Time, __ATOMIC_RELAXED) < (startTime + 5.0005)); }

This is assuming you're unable to change the definition side of RT_Time. If you *are* able to change it, you can (and should) just make it an std::atomic<double> and use .load()/.store() with memory_order_relaxed to interact with it. volatile would be the wrong tool here since you're dealing with threads.


On Fri, Nov 1, 2019 at 11:35 AM Gaier, Bjoern <[hidden email]> wrote:

>
> Hey,
>
> You are right with your assumption - RT_Time is coming from a different object file and declared global there. You will also find the second thread there, that will modify the value.
>
> I expected to tell Clang with "extern" that stuff is happening to this value in a different source file, and telling it without having the 'const' specifier, that this value will be changed.
>
> -----Original Message-----
> From: Alex Rønne Petersen <[hidden email]>
> Sent: 01 November 2019 11:26
> To: Gaier, Bjoern <[hidden email]>
> Cc: [hidden email]
> Subject: Re: [cfe-dev] Optimization creates infinite loop
>
> Hello,
>
> The compiler is making the (correct) assumption that RT_Time cannot be changed by anything other than the currently executing thread since you have not told it otherwise, and since the loop body (or rather, lack thereof) does not modify it.
>
> What exactly *is* changing RT_Time? Since the thread that executes
> Sample1sec() is obviously intended to spin until it changes, I can only think that either some hardware is changing it or another thread is. In which case, see my earlier reply. The fact that it's extern isn't really relevant; I'm fairly confident Clang would generate the same code if you removed the extern keyword.
>
> On Fri, Nov 1, 2019 at 11:18 AM Gaier, Bjoern <[hidden email]> wrote:
> >
> > Hey,
> >
> > Thanks for the answer! As said, volatile fixed the issue for us - however you said:
> > "the compiler has no reason to assume that any external actor can change the value of it"
> > I would understand that sentence, when the variable would be declared 'const' - but it was not. Isn't the const statement supposed to declare such situations, when the value will not be changed?
> >
> > That sounds like any extern variable that will be changed should be marked 'volatile'.
> >
> > -----Original Message-----
> > From: Alex Rønne Petersen <[hidden email]>
> > Sent: 01 November 2019 11:11
> > To: Gaier, Bjoern <[hidden email]>
> > Cc: [hidden email]
> > Subject: Re: [cfe-dev] Optimization creates infinite loop
> >
> > Hello,
> >
> > This isn't really surprising. Since RT_Time is not marked volatile, nor is it accessed through std::atomic functions (or the equivalent builtins), the compiler has no reason to assume that any external actor can change the value of it. So the load is moved out of the loop since doing it repeatedly would, from the compiler's point of view, be wasteful.
> >
> > If RT_Time can be changed by hardware or something along those lines, then volatile is appropriate. If another thread can change it, you should be using atomic APIs to access it.
> >
> >
> > On Fri, Nov 1, 2019 at 10:59 AM Gaier, Bjoern via cfe-dev <[hidden email]> wrote:
> > >
> > > Hello Clang-Developer,
> > >
> > >
> > >
> > > We ran into a – for us big problem – with the Clang compiler. We had the following code:
> > >
> > > extern double RT_Time;
> > >
> > > void Sample1sec()
> > >
> > > {
> > >
> > >        const double startTime = RT_Time;
> > >
> > >        while(RT_Time < (startTime + 5.0005));
> > >
> > > }
> > >
> > >
> > >
> > > The value of RT_Time will change after a while and the loop is supposed to wait for it. However the assembly code will lead into an endless loop if the condition is not meet the first time:
> > >
> > > "?Sample1sec@@YAXXZ":                   # @"?Sample1sec@@YAXXZ"
> > >
> > > .Lfunc_begin0:
> > >
> > >                 .cv_func_id 0
> > >
> > > # %bb.0:
> > >
> > >                 .cv_file 1
> > > "C:\\Users\\h16020\\Desktop\\Projects\\Squirrel\\Project\\Leviatha
> > > n\ \C M_Driver1\\CM_Driver1.cpp"
> > > "4E92768969662F6E513C11595FEA9D63" 1
> > >
> > >                 .cv_loc  0 1 5 0                 # CM_Driver1.cpp:5:0
> > >
> > >                 movsd  xmm0, qword ptr [rip + "?RT_Time@@3NA"] #
> > > xmm0 = mem[0],zero
> > >
> > > .Ltmp0:
> > >
> > >                 #DEBUG_VALUE: Sample1sec:startTime <- $xmm0
> > >
> > >                 movsd  xmm1, qword ptr [rip +
> > > __real@40140083126e978d] # xmm1 = mem[0],zero
> > >
> > >                 .cv_loc  0 1 0 0                 # CM_Driver1.cpp:0:0
> > >
> > >                 addsd   xmm1, xmm0
> > >
> > >                 ucomisd               xmm1, xmm0
> > >
> > >                 .cv_loc  0 1 6 0                 # CM_Driver1.cpp:6:0
> > >
> > >                 jbe         .LBB0_2
> > >
> > > .Ltmp1:
> > >
> > >                 .p2align                4, 0x90
> > >
> > > .LBB0_1:                                # =>This Inner Loop Header: Depth=1
> > >
> > >                 #DEBUG_VALUE: Sample1sec:startTime <- $xmm0
> > >
> > >                 jmp        .LBB0_1
> > >
> > > .Ltmp2:
> > >
> > > .LBB0_2:
> > >
> > >                 #DEBUG_VALUE: Sample1sec:startTime <- $xmm0
> > >
> > >                 .cv_loc  0 1 7 0                 # CM_Driver1.cpp:7:0
> > >
> > >                 ret
> > >
> > > .Ltmp3:
> > >
> > > .Lfunc_end0:
> > >
> > >
> > >
> > > Marking RT_Time with volatile fixes the problem. But I wonder, why is doing Clang that optimization? RT_Time was not declared ‘const’ so didn’t Clang had to know the value can change?
> > >
> > > Is this a bug or do I understand something wrong?
> > >
> > >
> > >
> > > Kind greetings and thank you for the help in advance,
> > >
> > > Björn
> > >
> > >
> > >
> > > P.S.
> > >
> > > My compile flags (when using -###) are:
> > >
> > > clang-cl.exe "-cc1" "-triple" "x86_64-pc-windows-msvc19.16.27034" "-S" "-disable-free" "-disable-llvm-verifier" "-discard-value-names" "-main-file-name" "CM_Driver1.cpp" "-mrelocation-model" "pic" "-pic-level" "2" "-mthread-model" "posix" "-relaxed-aliasing" "-fmath-errno" "-masm-verbose" "-mconstructor-aliases" "-munwind-tables" "-target-cpu" "x86-64" "-mllvm" "-x86-asm-syntax=intel" "-D_MT" "-flto-visibility-public-std" "--dependent-lib=libcmt" "--dependent-lib=oldnames" "-fcxx-exceptions" "-fexceptions" "-fexternc-nounwind" "-fms-volatile" "-fdefault-calling-conv=cdecl" "-fdiagnostics-format" "msvc" "-gcodeview" "-debug-info-kind=limited" "-momit-leaf-frame-pointer" "-ffunction-sections" "-coverage-notes-file" "C:\\Users\\h16020\\Desktop\\Projects\\Squirrel\\Project\\Leviathan\\CM_Driver1\\CM_Driver1.gcno" "-resource-dir" "S:\\STSHARE\\EXTERNAL\\LLVM\\lib\\clang\\9.0.0" "-D" "WIN32" "-D" "NTSYSAD_CM" "-D" "_CONSOLE" "-D" "CODEMODULE" "-D" "__asm=RemoveAssembly" "-D" "_UNICODE" "-D" "UNICODE" "-internal-isystem" "S:\\STSHARE\\EXTERNAL\\LLVM\\lib\\clang\\9.0.0\\include" "-internal-isystem" "S:\\STSHARE\\EXTERNAL\\LLVM\\Include" "-internal-isystem" "S:\\STEXEC\\ExecutionEnvironment\\INCLUDE" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Professional\\VC\\Tools\\MSVC\\14.16.27023\\include" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Professional\\VC\\Tools\\MSVC\\14.16.27023\\atlmfc\\include" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Professional\\VC\\Auxiliary\\VS\\include" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.10240.0\\ucrt" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\8.1\\Include\\um" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\8.1\\Include\\shared" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\8.1\\Include\\winrt" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\NETFXSDK\\4.6.1\\Include\\um" "-O2" "-WCL4" "-Wno-error" "-Wno-unused-command-line-argument" "-Wno-microsoft-cast" "-Wno-writable-strings" "-Wno-microsoft-enum-forward-reference" "-Wno-invalid-token-paste" "-fdeprecated-macro" "-fdebug-compilation-dir" "C:\\Users\\h16020\\Desktop\\Projects\\Squirrel\\Project\\Leviathan\\CM_Driver1" "-ferror-limit" "3000" "-fmessage-length" "0" "-fno-use-cxa-atexit" "-fms-extensions" "-fms-compatibility" "-fms-compatibility-version=19.16.27034" "-std=c++14" "-fno-threadsafe-statics" "-fdelayed-template-parsing" "-finline-functions" "-fobjc-runtime=gcc" "-fno-caret-diagnostics" "-fdiagnostics-show-option" "-fno-show-column" "-vectorize-loops" "-vectorize-slp" "-faddrsig" "-o" "x64\\Release\\CM_Driver1.asm" "-x" "c++" "CM_Driver1.cpp"
> > >
> > >
> > >
> > > clang-cl.exe "-cc1as" "-triple" "x86_64-pc-windows-msvc19.16.27034" "-filetype" "obj" "-main-file-name" "CM_Driver1.cpp" "-target-cpu" "x86-64" "-dwarf-version=4" "-mrelocation-model" "pic" "-mincremental-linker-compatible" "-o" "x64\\Release\\CM_Driver1.obj" "x64\\Release\\CM_Driver1.asm"
> > >
> > > Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB
> > > 9816, USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr.
> > > Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi
> > > Fukushima. Junichi Tajika
> > > _______________________________________________
> > > cfe-dev mailing list
> > > [hidden email]
> > > https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
> > Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB 9816,
> > USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr.
> > Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi
> > Fukushima. Junichi Tajika
> Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB 9816,
> USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr.
> Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi
> Fukushima. Junichi Tajika
Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB 9816, USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr. Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi Fukushima. Junichi Tajika
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Optimization creates infinite loop

Kristof Beyls via cfe-dev
Hello,

On most 64-bit targets, loading a double (a 64-bit quantity) from
memory is atomic already, and relaxed memory order means no memory
barriers, so in practice it's really just informing the compiler that
the variable will be accessed from multiple threads.

Of course, on 32-bit targets, a 64-bit atomic load is not the same
thing as a 64-bit non-atomic load. But if you don't do an atomic load
on such targets and instead use volatile, your program would be likely
to encounter word tearing when reading the variable (putting aside the
fact that the program would be illegal due to undefined behavior
resulting from the data race).

On Fri, Nov 1, 2019 at 1:11 PM Gaier, Bjoern <[hidden email]> wrote:

>
> Hey,
>
> Thanks for the input! Does __atomic_load_n not generated additional overhead? In terms of instruction time and so on
>
> -----Original Message-----
> From: Alex Rønne Petersen <[hidden email]>
> Sent: 01 November 2019 11:47
> To: Gaier, Bjoern <[hidden email]>
> Cc: [hidden email]
> Subject: Re: [cfe-dev] Optimization creates infinite loop
>
> Hello,
>
> So in that case you'd want to do something like:
>
> extern double RT_Time;
> void Sample1sec()
> {
>        const double startTime = RT_Time;
>        while (__atomic_load_n (&RT_Time, __ATOMIC_RELAXED) < (startTime + 5.0005)); }
>
> This is assuming you're unable to change the definition side of RT_Time. If you *are* able to change it, you can (and should) just make it an std::atomic<double> and use .load()/.store() with memory_order_relaxed to interact with it. volatile would be the wrong tool here since you're dealing with threads.
>
>
> On Fri, Nov 1, 2019 at 11:35 AM Gaier, Bjoern <[hidden email]> wrote:
> >
> > Hey,
> >
> > You are right with your assumption - RT_Time is coming from a different object file and declared global there. You will also find the second thread there, that will modify the value.
> >
> > I expected to tell Clang with "extern" that stuff is happening to this value in a different source file, and telling it without having the 'const' specifier, that this value will be changed.
> >
> > -----Original Message-----
> > From: Alex Rønne Petersen <[hidden email]>
> > Sent: 01 November 2019 11:26
> > To: Gaier, Bjoern <[hidden email]>
> > Cc: [hidden email]
> > Subject: Re: [cfe-dev] Optimization creates infinite loop
> >
> > Hello,
> >
> > The compiler is making the (correct) assumption that RT_Time cannot be changed by anything other than the currently executing thread since you have not told it otherwise, and since the loop body (or rather, lack thereof) does not modify it.
> >
> > What exactly *is* changing RT_Time? Since the thread that executes
> > Sample1sec() is obviously intended to spin until it changes, I can only think that either some hardware is changing it or another thread is. In which case, see my earlier reply. The fact that it's extern isn't really relevant; I'm fairly confident Clang would generate the same code if you removed the extern keyword.
> >
> > On Fri, Nov 1, 2019 at 11:18 AM Gaier, Bjoern <[hidden email]> wrote:
> > >
> > > Hey,
> > >
> > > Thanks for the answer! As said, volatile fixed the issue for us - however you said:
> > > "the compiler has no reason to assume that any external actor can change the value of it"
> > > I would understand that sentence, when the variable would be declared 'const' - but it was not. Isn't the const statement supposed to declare such situations, when the value will not be changed?
> > >
> > > That sounds like any extern variable that will be changed should be marked 'volatile'.
> > >
> > > -----Original Message-----
> > > From: Alex Rønne Petersen <[hidden email]>
> > > Sent: 01 November 2019 11:11
> > > To: Gaier, Bjoern <[hidden email]>
> > > Cc: [hidden email]
> > > Subject: Re: [cfe-dev] Optimization creates infinite loop
> > >
> > > Hello,
> > >
> > > This isn't really surprising. Since RT_Time is not marked volatile, nor is it accessed through std::atomic functions (or the equivalent builtins), the compiler has no reason to assume that any external actor can change the value of it. So the load is moved out of the loop since doing it repeatedly would, from the compiler's point of view, be wasteful.
> > >
> > > If RT_Time can be changed by hardware or something along those lines, then volatile is appropriate. If another thread can change it, you should be using atomic APIs to access it.
> > >
> > >
> > > On Fri, Nov 1, 2019 at 10:59 AM Gaier, Bjoern via cfe-dev <[hidden email]> wrote:
> > > >
> > > > Hello Clang-Developer,
> > > >
> > > >
> > > >
> > > > We ran into a – for us big problem – with the Clang compiler. We had the following code:
> > > >
> > > > extern double RT_Time;
> > > >
> > > > void Sample1sec()
> > > >
> > > > {
> > > >
> > > >        const double startTime = RT_Time;
> > > >
> > > >        while(RT_Time < (startTime + 5.0005));
> > > >
> > > > }
> > > >
> > > >
> > > >
> > > > The value of RT_Time will change after a while and the loop is supposed to wait for it. However the assembly code will lead into an endless loop if the condition is not meet the first time:
> > > >
> > > > "?Sample1sec@@YAXXZ":                   # @"?Sample1sec@@YAXXZ"
> > > >
> > > > .Lfunc_begin0:
> > > >
> > > >                 .cv_func_id 0
> > > >
> > > > # %bb.0:
> > > >
> > > >                 .cv_file 1
> > > > "C:\\Users\\h16020\\Desktop\\Projects\\Squirrel\\Project\\Leviatha
> > > > n\ \C M_Driver1\\CM_Driver1.cpp"
> > > > "4E92768969662F6E513C11595FEA9D63" 1
> > > >
> > > >                 .cv_loc  0 1 5 0                 # CM_Driver1.cpp:5:0
> > > >
> > > >                 movsd  xmm0, qword ptr [rip + "?RT_Time@@3NA"] #
> > > > xmm0 = mem[0],zero
> > > >
> > > > .Ltmp0:
> > > >
> > > >                 #DEBUG_VALUE: Sample1sec:startTime <- $xmm0
> > > >
> > > >                 movsd  xmm1, qword ptr [rip +
> > > > __real@40140083126e978d] # xmm1 = mem[0],zero
> > > >
> > > >                 .cv_loc  0 1 0 0                 # CM_Driver1.cpp:0:0
> > > >
> > > >                 addsd   xmm1, xmm0
> > > >
> > > >                 ucomisd               xmm1, xmm0
> > > >
> > > >                 .cv_loc  0 1 6 0                 # CM_Driver1.cpp:6:0
> > > >
> > > >                 jbe         .LBB0_2
> > > >
> > > > .Ltmp1:
> > > >
> > > >                 .p2align                4, 0x90
> > > >
> > > > .LBB0_1:                                # =>This Inner Loop Header: Depth=1
> > > >
> > > >                 #DEBUG_VALUE: Sample1sec:startTime <- $xmm0
> > > >
> > > >                 jmp        .LBB0_1
> > > >
> > > > .Ltmp2:
> > > >
> > > > .LBB0_2:
> > > >
> > > >                 #DEBUG_VALUE: Sample1sec:startTime <- $xmm0
> > > >
> > > >                 .cv_loc  0 1 7 0                 # CM_Driver1.cpp:7:0
> > > >
> > > >                 ret
> > > >
> > > > .Ltmp3:
> > > >
> > > > .Lfunc_end0:
> > > >
> > > >
> > > >
> > > > Marking RT_Time with volatile fixes the problem. But I wonder, why is doing Clang that optimization? RT_Time was not declared ‘const’ so didn’t Clang had to know the value can change?
> > > >
> > > > Is this a bug or do I understand something wrong?
> > > >
> > > >
> > > >
> > > > Kind greetings and thank you for the help in advance,
> > > >
> > > > Björn
> > > >
> > > >
> > > >
> > > > P.S.
> > > >
> > > > My compile flags (when using -###) are:
> > > >
> > > > clang-cl.exe "-cc1" "-triple" "x86_64-pc-windows-msvc19.16.27034" "-S" "-disable-free" "-disable-llvm-verifier" "-discard-value-names" "-main-file-name" "CM_Driver1.cpp" "-mrelocation-model" "pic" "-pic-level" "2" "-mthread-model" "posix" "-relaxed-aliasing" "-fmath-errno" "-masm-verbose" "-mconstructor-aliases" "-munwind-tables" "-target-cpu" "x86-64" "-mllvm" "-x86-asm-syntax=intel" "-D_MT" "-flto-visibility-public-std" "--dependent-lib=libcmt" "--dependent-lib=oldnames" "-fcxx-exceptions" "-fexceptions" "-fexternc-nounwind" "-fms-volatile" "-fdefault-calling-conv=cdecl" "-fdiagnostics-format" "msvc" "-gcodeview" "-debug-info-kind=limited" "-momit-leaf-frame-pointer" "-ffunction-sections" "-coverage-notes-file" "C:\\Users\\h16020\\Desktop\\Projects\\Squirrel\\Project\\Leviathan\\CM_Driver1\\CM_Driver1.gcno" "-resource-dir" "S:\\STSHARE\\EXTERNAL\\LLVM\\lib\\clang\\9.0.0" "-D" "WIN32" "-D" "NTSYSAD_CM" "-D" "_CONSOLE" "-D" "CODEMODULE" "-D" "__asm=RemoveAssembly" "-D" "_UNICODE" "-D" "UNICODE" "-internal-isystem" "S:\\STSHARE\\EXTERNAL\\LLVM\\lib\\clang\\9.0.0\\include" "-internal-isystem" "S:\\STSHARE\\EXTERNAL\\LLVM\\Include" "-internal-isystem" "S:\\STEXEC\\ExecutionEnvironment\\INCLUDE" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Professional\\VC\\Tools\\MSVC\\14.16.27023\\include" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Professional\\VC\\Tools\\MSVC\\14.16.27023\\atlmfc\\include" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Professional\\VC\\Auxiliary\\VS\\include" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.10240.0\\ucrt" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\8.1\\Include\\um" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\8.1\\Include\\shared" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\8.1\\Include\\winrt" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\NETFXSDK\\4.6.1\\Include\\um" "-O2" "-WCL4" "-Wno-error" "-Wno-unused-command-line-argument" "-Wno-microsoft-cast" "-Wno-writable-strings" "-Wno-microsoft-enum-forward-reference" "-Wno-invalid-token-paste" "-fdeprecated-macro" "-fdebug-compilation-dir" "C:\\Users\\h16020\\Desktop\\Projects\\Squirrel\\Project\\Leviathan\\CM_Driver1" "-ferror-limit" "3000" "-fmessage-length" "0" "-fno-use-cxa-atexit" "-fms-extensions" "-fms-compatibility" "-fms-compatibility-version=19.16.27034" "-std=c++14" "-fno-threadsafe-statics" "-fdelayed-template-parsing" "-finline-functions" "-fobjc-runtime=gcc" "-fno-caret-diagnostics" "-fdiagnostics-show-option" "-fno-show-column" "-vectorize-loops" "-vectorize-slp" "-faddrsig" "-o" "x64\\Release\\CM_Driver1.asm" "-x" "c++" "CM_Driver1.cpp"
> > > >
> > > >
> > > >
> > > > clang-cl.exe "-cc1as" "-triple" "x86_64-pc-windows-msvc19.16.27034" "-filetype" "obj" "-main-file-name" "CM_Driver1.cpp" "-target-cpu" "x86-64" "-dwarf-version=4" "-mrelocation-model" "pic" "-mincremental-linker-compatible" "-o" "x64\\Release\\CM_Driver1.obj" "x64\\Release\\CM_Driver1.asm"
> > > >
> > > > Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB
> > > > 9816, USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr.
> > > > Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi
> > > > Fukushima. Junichi Tajika
> > > > _______________________________________________
> > > > cfe-dev mailing list
> > > > [hidden email]
> > > > https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
> > > Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB 9816,
> > > USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr.
> > > Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi
> > > Fukushima. Junichi Tajika
> > Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB 9816,
> > USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr.
> > Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi
> > Fukushima. Junichi Tajika
> Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB 9816, USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr. Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi Fukushima. Junichi Tajika
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Optimization creates infinite loop

Kristof Beyls via cfe-dev
To answer the initial question:

> > > > extern double RT_Time;
> > > >
> > > > void Sample1sec()
> > > >
> > > > {
> > > >
> > > >        const double startTime = RT_Time;
> > > >
> > > >        while(RT_Time < (startTime + 5.0005));
> > > >
> > > > }

The above is 1 of two things, an infinite loop (nothing else modifies the value), or a race condition (something else modifies the value in a different thread).  Race conditions are Undefined Behavior, so the compiler is allowed to assume that it is not the case.

Interestingly, infinite loops are ALSO undefined behavior, so the compiler would be welcome to actually remove the entire loop if it thought it was valuable.

-----Original Message-----
From: cfe-dev <[hidden email]> On Behalf Of Alex Rønne Petersen via cfe-dev
Sent: Friday, November 1, 2019 5:34 AM
To: Gaier, Bjoern <[hidden email]>
Cc: [hidden email]
Subject: Re: [cfe-dev] Optimization creates infinite loop

Hello,

On most 64-bit targets, loading a double (a 64-bit quantity) from memory is atomic already, and relaxed memory order means no memory barriers, so in practice it's really just informing the compiler that the variable will be accessed from multiple threads.

Of course, on 32-bit targets, a 64-bit atomic load is not the same thing as a 64-bit non-atomic load. But if you don't do an atomic load on such targets and instead use volatile, your program would be likely to encounter word tearing when reading the variable (putting aside the fact that the program would be illegal due to undefined behavior resulting from the data race).

On Fri, Nov 1, 2019 at 1:11 PM Gaier, Bjoern <[hidden email]> wrote:

>
> Hey,
>
> Thanks for the input! Does __atomic_load_n not generated additional
> overhead? In terms of instruction time and so on
>
> -----Original Message-----
> From: Alex Rønne Petersen <[hidden email]>
> Sent: 01 November 2019 11:47
> To: Gaier, Bjoern <[hidden email]>
> Cc: [hidden email]
> Subject: Re: [cfe-dev] Optimization creates infinite loop
>
> Hello,
>
> So in that case you'd want to do something like:
>
> extern double RT_Time;
> void Sample1sec()
> {
>        const double startTime = RT_Time;
>        while (__atomic_load_n (&RT_Time, __ATOMIC_RELAXED) <
> (startTime + 5.0005)); }
>
> This is assuming you're unable to change the definition side of RT_Time. If you *are* able to change it, you can (and should) just make it an std::atomic<double> and use .load()/.store() with memory_order_relaxed to interact with it. volatile would be the wrong tool here since you're dealing with threads.
>
>
> On Fri, Nov 1, 2019 at 11:35 AM Gaier, Bjoern <[hidden email]> wrote:
> >
> > Hey,
> >
> > You are right with your assumption - RT_Time is coming from a different object file and declared global there. You will also find the second thread there, that will modify the value.
> >
> > I expected to tell Clang with "extern" that stuff is happening to this value in a different source file, and telling it without having the 'const' specifier, that this value will be changed.
> >
> > -----Original Message-----
> > From: Alex Rønne Petersen <[hidden email]>
> > Sent: 01 November 2019 11:26
> > To: Gaier, Bjoern <[hidden email]>
> > Cc: [hidden email]
> > Subject: Re: [cfe-dev] Optimization creates infinite loop
> >
> > Hello,
> >
> > The compiler is making the (correct) assumption that RT_Time cannot be changed by anything other than the currently executing thread since you have not told it otherwise, and since the loop body (or rather, lack thereof) does not modify it.
> >
> > What exactly *is* changing RT_Time? Since the thread that executes
> > Sample1sec() is obviously intended to spin until it changes, I can only think that either some hardware is changing it or another thread is. In which case, see my earlier reply. The fact that it's extern isn't really relevant; I'm fairly confident Clang would generate the same code if you removed the extern keyword.
> >
> > On Fri, Nov 1, 2019 at 11:18 AM Gaier, Bjoern <[hidden email]> wrote:
> > >
> > > Hey,
> > >
> > > Thanks for the answer! As said, volatile fixed the issue for us - however you said:
> > > "the compiler has no reason to assume that any external actor can change the value of it"
> > > I would understand that sentence, when the variable would be declared 'const' - but it was not. Isn't the const statement supposed to declare such situations, when the value will not be changed?
> > >
> > > That sounds like any extern variable that will be changed should be marked 'volatile'.
> > >
> > > -----Original Message-----
> > > From: Alex Rønne Petersen <[hidden email]>
> > > Sent: 01 November 2019 11:11
> > > To: Gaier, Bjoern <[hidden email]>
> > > Cc: [hidden email]
> > > Subject: Re: [cfe-dev] Optimization creates infinite loop
> > >
> > > Hello,
> > >
> > > This isn't really surprising. Since RT_Time is not marked volatile, nor is it accessed through std::atomic functions (or the equivalent builtins), the compiler has no reason to assume that any external actor can change the value of it. So the load is moved out of the loop since doing it repeatedly would, from the compiler's point of view, be wasteful.
> > >
> > > If RT_Time can be changed by hardware or something along those lines, then volatile is appropriate. If another thread can change it, you should be using atomic APIs to access it.
> > >
> > >
> > > On Fri, Nov 1, 2019 at 10:59 AM Gaier, Bjoern via cfe-dev <[hidden email]> wrote:
> > > >
> > > > Hello Clang-Developer,
> > > >
> > > >
> > > >
> > > > We ran into a – for us big problem – with the Clang compiler. We had the following code:
> > > >
> > > > extern double RT_Time;
> > > >
> > > > void Sample1sec()
> > > >
> > > > {
> > > >
> > > >        const double startTime = RT_Time;
> > > >
> > > >        while(RT_Time < (startTime + 5.0005));
> > > >
> > > > }
> > > >
> > > >
> > > >
> > > > The value of RT_Time will change after a while and the loop is supposed to wait for it. However the assembly code will lead into an endless loop if the condition is not meet the first time:
> > > >
> > > > "?Sample1sec@@YAXXZ":                   # @"?Sample1sec@@YAXXZ"
> > > >
> > > > .Lfunc_begin0:
> > > >
> > > >                 .cv_func_id 0
> > > >
> > > > # %bb.0:
> > > >
> > > >                 .cv_file 1
> > > > "C:\\Users\\h16020\\Desktop\\Projects\\Squirrel\\Project\\Leviat
> > > > ha n\ \C M_Driver1\\CM_Driver1.cpp"
> > > > "4E92768969662F6E513C11595FEA9D63" 1
> > > >
> > > >                 .cv_loc  0 1 5 0                 # CM_Driver1.cpp:5:0
> > > >
> > > >                 movsd  xmm0, qword ptr [rip + "?RT_Time@@3NA"] #
> > > > xmm0 = mem[0],zero
> > > >
> > > > .Ltmp0:
> > > >
> > > >                 #DEBUG_VALUE: Sample1sec:startTime <- $xmm0
> > > >
> > > >                 movsd  xmm1, qword ptr [rip +
> > > > __real@40140083126e978d] # xmm1 = mem[0],zero
> > > >
> > > >                 .cv_loc  0 1 0 0                 # CM_Driver1.cpp:0:0
> > > >
> > > >                 addsd   xmm1, xmm0
> > > >
> > > >                 ucomisd               xmm1, xmm0
> > > >
> > > >                 .cv_loc  0 1 6 0                 # CM_Driver1.cpp:6:0
> > > >
> > > >                 jbe         .LBB0_2
> > > >
> > > > .Ltmp1:
> > > >
> > > >                 .p2align                4, 0x90
> > > >
> > > > .LBB0_1:                                # =>This Inner Loop Header: Depth=1
> > > >
> > > >                 #DEBUG_VALUE: Sample1sec:startTime <- $xmm0
> > > >
> > > >                 jmp        .LBB0_1
> > > >
> > > > .Ltmp2:
> > > >
> > > > .LBB0_2:
> > > >
> > > >                 #DEBUG_VALUE: Sample1sec:startTime <- $xmm0
> > > >
> > > >                 .cv_loc  0 1 7 0                 # CM_Driver1.cpp:7:0
> > > >
> > > >                 ret
> > > >
> > > > .Ltmp3:
> > > >
> > > > .Lfunc_end0:
> > > >
> > > >
> > > >
> > > > Marking RT_Time with volatile fixes the problem. But I wonder, why is doing Clang that optimization? RT_Time was not declared ‘const’ so didn’t Clang had to know the value can change?
> > > >
> > > > Is this a bug or do I understand something wrong?
> > > >
> > > >
> > > >
> > > > Kind greetings and thank you for the help in advance,
> > > >
> > > > Björn
> > > >
> > > >
> > > >
> > > > P.S.
> > > >
> > > > My compile flags (when using -###) are:
> > > >
> > > > clang-cl.exe "-cc1" "-triple" "x86_64-pc-windows-msvc19.16.27034" "-S" "-disable-free" "-disable-llvm-verifier" "-discard-value-names" "-main-file-name" "CM_Driver1.cpp" "-mrelocation-model" "pic" "-pic-level" "2" "-mthread-model" "posix" "-relaxed-aliasing" "-fmath-errno" "-masm-verbose" "-mconstructor-aliases" "-munwind-tables" "-target-cpu" "x86-64" "-mllvm" "-x86-asm-syntax=intel" "-D_MT" "-flto-visibility-public-std" "--dependent-lib=libcmt" "--dependent-lib=oldnames" "-fcxx-exceptions" "-fexceptions" "-fexternc-nounwind" "-fms-volatile" "-fdefault-calling-conv=cdecl" "-fdiagnostics-format" "msvc" "-gcodeview" "-debug-info-kind=limited" "-momit-leaf-frame-pointer" "-ffunction-sections" "-coverage-notes-file" "C:\\Users\\h16020\\Desktop\\Projects\\Squirrel\\Project\\Leviathan\\CM_Driver1\\CM_Driver1.gcno" "-resource-dir" "S:\\STSHARE\\EXTERNAL\\LLVM\\lib\\clang\\9.0.0" "-D" "WIN32" "-D" "NTSYSAD_CM" "-D" "_CONSOLE" "-D" "CODEMODULE" "-D" "__asm=RemoveAssembly" "-D" "_UNICODE" "-D" "UNICODE" "-internal-isystem" "S:\\STSHARE\\EXTERNAL\\LLVM\\lib\\clang\\9.0.0\\include" "-internal-isystem" "S:\\STSHARE\\EXTERNAL\\LLVM\\Include" "-internal-isystem" "S:\\STEXEC\\ExecutionEnvironment\\INCLUDE" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Professional\\VC\\Tools\\MSVC\\14.16.27023\\include" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Professional\\VC\\Tools\\MSVC\\14.16.27023\\atlmfc\\include" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Professional\\VC\\Auxiliary\\VS\\include" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.10240.0\\ucrt" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\8.1\\Include\\um" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\8.1\\Include\\shared" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\8.1\\Include\\winrt" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\NETFXSDK\\4.6.1\\Include\\um" "-O2" "-WCL4" "-Wno-error" "-Wno-unused-command-line-argument" "-Wno-microsoft-cast" "-Wno-writable-strings" "-Wno-microsoft-enum-forward-reference" "-Wno-invalid-token-paste" "-fdeprecated-macro" "-fdebug-compilation-dir" "C:\\Users\\h16020\\Desktop\\Projects\\Squirrel\\Project\\Leviathan\\CM_Driver1" "-ferror-limit" "3000" "-fmessage-length" "0" "-fno-use-cxa-atexit" "-fms-extensions" "-fms-compatibility" "-fms-compatibility-version=19.16.27034" "-std=c++14" "-fno-threadsafe-statics" "-fdelayed-template-parsing" "-finline-functions" "-fobjc-runtime=gcc" "-fno-caret-diagnostics" "-fdiagnostics-show-option" "-fno-show-column" "-vectorize-loops" "-vectorize-slp" "-faddrsig" "-o" "x64\\Release\\CM_Driver1.asm" "-x" "c++" "CM_Driver1.cpp"
> > > >
> > > >
> > > >
> > > > clang-cl.exe "-cc1as" "-triple" "x86_64-pc-windows-msvc19.16.27034" "-filetype" "obj" "-main-file-name" "CM_Driver1.cpp" "-target-cpu" "x86-64" "-dwarf-version=4" "-mrelocation-model" "pic" "-mincremental-linker-compatible" "-o" "x64\\Release\\CM_Driver1.obj" "x64\\Release\\CM_Driver1.asm"
> > > >
> > > > Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB
> > > > 9816, USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr.
> > > > Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano,
> > > > Takeshi Fukushima. Junichi Tajika
> > > > _______________________________________________
> > > > cfe-dev mailing list
> > > > [hidden email]
> > > > https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
> > > Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB
> > > 9816, USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr.
> > > Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi
> > > Fukushima. Junichi Tajika
> > Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB 9816,
> > USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr.
> > Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi
> > Fukushima. Junichi Tajika
> Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB 9816,
> USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr.
> Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi
> Fukushima. Junichi Tajika
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Optimization creates infinite loop

Kristof Beyls via cfe-dev
Thank you Alex! I understood that!

Infinite loops are undefined behaviour? I thought they are a "common" thing in hardware...

-----Original Message-----
From: Keane, Erich <[hidden email]>
Sent: 01 November 2019 14:01
To: Alex Rønne Petersen <[hidden email]>; Gaier, Bjoern <[hidden email]>; Clang Dev <[hidden email]>
Subject: RE: [cfe-dev] Optimization creates infinite loop

To answer the initial question:

> > > > extern double RT_Time;
> > > >
> > > > void Sample1sec()
> > > >
> > > > {
> > > >
> > > >        const double startTime = RT_Time;
> > > >
> > > >        while(RT_Time < (startTime + 5.0005));
> > > >
> > > > }

The above is 1 of two things, an infinite loop (nothing else modifies the value), or a race condition (something else modifies the value in a different thread).  Race conditions are Undefined Behavior, so the compiler is allowed to assume that it is not the case.

Interestingly, infinite loops are ALSO undefined behavior, so the compiler would be welcome to actually remove the entire loop if it thought it was valuable.

-----Original Message-----
From: cfe-dev <[hidden email]> On Behalf Of Alex Rønne Petersen via cfe-dev
Sent: Friday, November 1, 2019 5:34 AM
To: Gaier, Bjoern <[hidden email]>
Cc: [hidden email]
Subject: Re: [cfe-dev] Optimization creates infinite loop

Hello,

On most 64-bit targets, loading a double (a 64-bit quantity) from memory is atomic already, and relaxed memory order means no memory barriers, so in practice it's really just informing the compiler that the variable will be accessed from multiple threads.

Of course, on 32-bit targets, a 64-bit atomic load is not the same thing as a 64-bit non-atomic load. But if you don't do an atomic load on such targets and instead use volatile, your program would be likely to encounter word tearing when reading the variable (putting aside the fact that the program would be illegal due to undefined behavior resulting from the data race).

On Fri, Nov 1, 2019 at 1:11 PM Gaier, Bjoern <[hidden email]> wrote:

>
> Hey,
>
> Thanks for the input! Does __atomic_load_n not generated additional
> overhead? In terms of instruction time and so on
>
> -----Original Message-----
> From: Alex Rønne Petersen <[hidden email]>
> Sent: 01 November 2019 11:47
> To: Gaier, Bjoern <[hidden email]>
> Cc: [hidden email]
> Subject: Re: [cfe-dev] Optimization creates infinite loop
>
> Hello,
>
> So in that case you'd want to do something like:
>
> extern double RT_Time;
> void Sample1sec()
> {
>        const double startTime = RT_Time;
>        while (__atomic_load_n (&RT_Time, __ATOMIC_RELAXED) <
> (startTime + 5.0005)); }
>
> This is assuming you're unable to change the definition side of RT_Time. If you *are* able to change it, you can (and should) just make it an std::atomic<double> and use .load()/.store() with memory_order_relaxed to interact with it. volatile would be the wrong tool here since you're dealing with threads.
>
>
> On Fri, Nov 1, 2019 at 11:35 AM Gaier, Bjoern <[hidden email]> wrote:
> >
> > Hey,
> >
> > You are right with your assumption - RT_Time is coming from a different object file and declared global there. You will also find the second thread there, that will modify the value.
> >
> > I expected to tell Clang with "extern" that stuff is happening to this value in a different source file, and telling it without having the 'const' specifier, that this value will be changed.
> >
> > -----Original Message-----
> > From: Alex Rønne Petersen <[hidden email]>
> > Sent: 01 November 2019 11:26
> > To: Gaier, Bjoern <[hidden email]>
> > Cc: [hidden email]
> > Subject: Re: [cfe-dev] Optimization creates infinite loop
> >
> > Hello,
> >
> > The compiler is making the (correct) assumption that RT_Time cannot be changed by anything other than the currently executing thread since you have not told it otherwise, and since the loop body (or rather, lack thereof) does not modify it.
> >
> > What exactly *is* changing RT_Time? Since the thread that executes
> > Sample1sec() is obviously intended to spin until it changes, I can only think that either some hardware is changing it or another thread is. In which case, see my earlier reply. The fact that it's extern isn't really relevant; I'm fairly confident Clang would generate the same code if you removed the extern keyword.
> >
> > On Fri, Nov 1, 2019 at 11:18 AM Gaier, Bjoern <[hidden email]> wrote:
> > >
> > > Hey,
> > >
> > > Thanks for the answer! As said, volatile fixed the issue for us - however you said:
> > > "the compiler has no reason to assume that any external actor can change the value of it"
> > > I would understand that sentence, when the variable would be declared 'const' - but it was not. Isn't the const statement supposed to declare such situations, when the value will not be changed?
> > >
> > > That sounds like any extern variable that will be changed should be marked 'volatile'.
> > >
> > > -----Original Message-----
> > > From: Alex Rønne Petersen <[hidden email]>
> > > Sent: 01 November 2019 11:11
> > > To: Gaier, Bjoern <[hidden email]>
> > > Cc: [hidden email]
> > > Subject: Re: [cfe-dev] Optimization creates infinite loop
> > >
> > > Hello,
> > >
> > > This isn't really surprising. Since RT_Time is not marked volatile, nor is it accessed through std::atomic functions (or the equivalent builtins), the compiler has no reason to assume that any external actor can change the value of it. So the load is moved out of the loop since doing it repeatedly would, from the compiler's point of view, be wasteful.
> > >
> > > If RT_Time can be changed by hardware or something along those lines, then volatile is appropriate. If another thread can change it, you should be using atomic APIs to access it.
> > >
> > >
> > > On Fri, Nov 1, 2019 at 10:59 AM Gaier, Bjoern via cfe-dev <[hidden email]> wrote:
> > > >
> > > > Hello Clang-Developer,
> > > >
> > > >
> > > >
> > > > We ran into a – for us big problem – with the Clang compiler. We had the following code:
> > > >
> > > > extern double RT_Time;
> > > >
> > > > void Sample1sec()
> > > >
> > > > {
> > > >
> > > >        const double startTime = RT_Time;
> > > >
> > > >        while(RT_Time < (startTime + 5.0005));
> > > >
> > > > }
> > > >
> > > >
> > > >
> > > > The value of RT_Time will change after a while and the loop is supposed to wait for it. However the assembly code will lead into an endless loop if the condition is not meet the first time:
> > > >
> > > > "?Sample1sec@@YAXXZ":                   # @"?Sample1sec@@YAXXZ"
> > > >
> > > > .Lfunc_begin0:
> > > >
> > > >                 .cv_func_id 0
> > > >
> > > > # %bb.0:
> > > >
> > > >                 .cv_file 1
> > > > "C:\\Users\\h16020\\Desktop\\Projects\\Squirrel\\Project\\Leviat
> > > > ha n\ \C M_Driver1\\CM_Driver1.cpp"
> > > > "4E92768969662F6E513C11595FEA9D63" 1
> > > >
> > > >                 .cv_loc  0 1 5 0                 # CM_Driver1.cpp:5:0
> > > >
> > > >                 movsd  xmm0, qword ptr [rip + "?RT_Time@@3NA"] #
> > > > xmm0 = mem[0],zero
> > > >
> > > > .Ltmp0:
> > > >
> > > >                 #DEBUG_VALUE: Sample1sec:startTime <- $xmm0
> > > >
> > > >                 movsd  xmm1, qword ptr [rip +
> > > > __real@40140083126e978d] # xmm1 = mem[0],zero
> > > >
> > > >                 .cv_loc  0 1 0 0                 # CM_Driver1.cpp:0:0
> > > >
> > > >                 addsd   xmm1, xmm0
> > > >
> > > >                 ucomisd               xmm1, xmm0
> > > >
> > > >                 .cv_loc  0 1 6 0                 # CM_Driver1.cpp:6:0
> > > >
> > > >                 jbe         .LBB0_2
> > > >
> > > > .Ltmp1:
> > > >
> > > >                 .p2align                4, 0x90
> > > >
> > > > .LBB0_1:                                # =>This Inner Loop Header: Depth=1
> > > >
> > > >                 #DEBUG_VALUE: Sample1sec:startTime <- $xmm0
> > > >
> > > >                 jmp        .LBB0_1
> > > >
> > > > .Ltmp2:
> > > >
> > > > .LBB0_2:
> > > >
> > > >                 #DEBUG_VALUE: Sample1sec:startTime <- $xmm0
> > > >
> > > >                 .cv_loc  0 1 7 0                 # CM_Driver1.cpp:7:0
> > > >
> > > >                 ret
> > > >
> > > > .Ltmp3:
> > > >
> > > > .Lfunc_end0:
> > > >
> > > >
> > > >
> > > > Marking RT_Time with volatile fixes the problem. But I wonder, why is doing Clang that optimization? RT_Time was not declared ‘const’ so didn’t Clang had to know the value can change?
> > > >
> > > > Is this a bug or do I understand something wrong?
> > > >
> > > >
> > > >
> > > > Kind greetings and thank you for the help in advance,
> > > >
> > > > Björn
> > > >
> > > >
> > > >
> > > > P.S.
> > > >
> > > > My compile flags (when using -###) are:
> > > >
> > > > clang-cl.exe "-cc1" "-triple" "x86_64-pc-windows-msvc19.16.27034" "-S" "-disable-free" "-disable-llvm-verifier" "-discard-value-names" "-main-file-name" "CM_Driver1.cpp" "-mrelocation-model" "pic" "-pic-level" "2" "-mthread-model" "posix" "-relaxed-aliasing" "-fmath-errno" "-masm-verbose" "-mconstructor-aliases" "-munwind-tables" "-target-cpu" "x86-64" "-mllvm" "-x86-asm-syntax=intel" "-D_MT" "-flto-visibility-public-std" "--dependent-lib=libcmt" "--dependent-lib=oldnames" "-fcxx-exceptions" "-fexceptions" "-fexternc-nounwind" "-fms-volatile" "-fdefault-calling-conv=cdecl" "-fdiagnostics-format" "msvc" "-gcodeview" "-debug-info-kind=limited" "-momit-leaf-frame-pointer" "-ffunction-sections" "-coverage-notes-file" "C:\\Users\\h16020\\Desktop\\Projects\\Squirrel\\Project\\Leviathan\\CM_Driver1\\CM_Driver1.gcno" "-resource-dir" "S:\\STSHARE\\EXTERNAL\\LLVM\\lib\\clang\\9.0.0" "-D" "WIN32" "-D" "NTSYSAD_CM" "-D" "_CONSOLE" "-D" "CODEMODULE" "-D" "__asm=RemoveAssembly" "-D" "_UNICODE" "-D" "UNICODE" "-internal-isystem" "S:\\STSHARE\\EXTERNAL\\LLVM\\lib\\clang\\9.0.0\\include" "-internal-isystem" "S:\\STSHARE\\EXTERNAL\\LLVM\\Include" "-internal-isystem" "S:\\STEXEC\\ExecutionEnvironment\\INCLUDE" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Professional\\VC\\Tools\\MSVC\\14.16.27023\\include" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Professional\\VC\\Tools\\MSVC\\14.16.27023\\atlmfc\\include" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Professional\\VC\\Auxiliary\\VS\\include" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.10240.0\\ucrt" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\8.1\\Include\\um" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\8.1\\Include\\shared" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\8.1\\Include\\winrt" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\NETFXSDK\\4.6.1\\Include\\um" "-O2" "-WCL4" "-Wno-error" "-Wno-unused-command-line-argument" "-Wno-microsoft-cast" "-Wno-writable-strings" "-Wno-microsoft-enum-forward-reference" "-Wno-invalid-token-paste" "-fdeprecated-macro" "-fdebug-compilation-dir" "C:\\Users\\h16020\\Desktop\\Projects\\Squirrel\\Project\\Leviathan\\CM_Driver1" "-ferror-limit" "3000" "-fmessage-length" "0" "-fno-use-cxa-atexit" "-fms-extensions" "-fms-compatibility" "-fms-compatibility-version=19.16.27034" "-std=c++14" "-fno-threadsafe-statics" "-fdelayed-template-parsing" "-finline-functions" "-fobjc-runtime=gcc" "-fno-caret-diagnostics" "-fdiagnostics-show-option" "-fno-show-column" "-vectorize-loops" "-vectorize-slp" "-faddrsig" "-o" "x64\\Release\\CM_Driver1.asm" "-x" "c++" "CM_Driver1.cpp"
> > > >
> > > >
> > > >
> > > > clang-cl.exe "-cc1as" "-triple" "x86_64-pc-windows-msvc19.16.27034" "-filetype" "obj" "-main-file-name" "CM_Driver1.cpp" "-target-cpu" "x86-64" "-dwarf-version=4" "-mrelocation-model" "pic" "-mincremental-linker-compatible" "-o" "x64\\Release\\CM_Driver1.obj" "x64\\Release\\CM_Driver1.asm"
> > > >
> > > > Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB
> > > > 9816, USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr.
> > > > Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano,
> > > > Takeshi Fukushima. Junichi Tajika
> > > > _______________________________________________
> > > > cfe-dev mailing list
> > > > [hidden email]
> > > > https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
> > > Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB
> > > 9816, USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr.
> > > Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi
> > > Fukushima. Junichi Tajika
> > Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB 9816,
> > USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr.
> > Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi
> > Fukushima. Junichi Tajika
> Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB 9816,
> USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr.
> Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi
> Fukushima. Junichi Tajika
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB 9816, USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr. Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi Fukushima. Junichi Tajika
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Reply | Threaded
Open this post in threaded view
|

Re: Optimization creates infinite loop

Kristof Beyls via cfe-dev
Yep they are.  See the C paper here that has a great justification for it: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1528.htm

Everything but "while(1)" is UB for infinite loops.  

-----Original Message-----
From: Gaier, Bjoern <[hidden email]>
Sent: Friday, November 1, 2019 6:58 AM
To: Keane, Erich <[hidden email]>; Alex Rønne Petersen <[hidden email]>; Clang Dev <[hidden email]>
Subject: RE: [cfe-dev] Optimization creates infinite loop

Thank you Alex! I understood that!

Infinite loops are undefined behaviour? I thought they are a "common" thing in hardware...

-----Original Message-----
From: Keane, Erich <[hidden email]>
Sent: 01 November 2019 14:01
To: Alex Rønne Petersen <[hidden email]>; Gaier, Bjoern <[hidden email]>; Clang Dev <[hidden email]>
Subject: RE: [cfe-dev] Optimization creates infinite loop

To answer the initial question:

> > > > extern double RT_Time;
> > > >
> > > > void Sample1sec()
> > > >
> > > > {
> > > >
> > > >        const double startTime = RT_Time;
> > > >
> > > >        while(RT_Time < (startTime + 5.0005));
> > > >
> > > > }

The above is 1 of two things, an infinite loop (nothing else modifies the value), or a race condition (something else modifies the value in a different thread).  Race conditions are Undefined Behavior, so the compiler is allowed to assume that it is not the case.

Interestingly, infinite loops are ALSO undefined behavior, so the compiler would be welcome to actually remove the entire loop if it thought it was valuable.

-----Original Message-----
From: cfe-dev <[hidden email]> On Behalf Of Alex Rønne Petersen via cfe-dev
Sent: Friday, November 1, 2019 5:34 AM
To: Gaier, Bjoern <[hidden email]>
Cc: [hidden email]
Subject: Re: [cfe-dev] Optimization creates infinite loop

Hello,

On most 64-bit targets, loading a double (a 64-bit quantity) from memory is atomic already, and relaxed memory order means no memory barriers, so in practice it's really just informing the compiler that the variable will be accessed from multiple threads.

Of course, on 32-bit targets, a 64-bit atomic load is not the same thing as a 64-bit non-atomic load. But if you don't do an atomic load on such targets and instead use volatile, your program would be likely to encounter word tearing when reading the variable (putting aside the fact that the program would be illegal due to undefined behavior resulting from the data race).

On Fri, Nov 1, 2019 at 1:11 PM Gaier, Bjoern <[hidden email]> wrote:

>
> Hey,
>
> Thanks for the input! Does __atomic_load_n not generated additional
> overhead? In terms of instruction time and so on
>
> -----Original Message-----
> From: Alex Rønne Petersen <[hidden email]>
> Sent: 01 November 2019 11:47
> To: Gaier, Bjoern <[hidden email]>
> Cc: [hidden email]
> Subject: Re: [cfe-dev] Optimization creates infinite loop
>
> Hello,
>
> So in that case you'd want to do something like:
>
> extern double RT_Time;
> void Sample1sec()
> {
>        const double startTime = RT_Time;
>        while (__atomic_load_n (&RT_Time, __ATOMIC_RELAXED) <
> (startTime + 5.0005)); }
>
> This is assuming you're unable to change the definition side of RT_Time. If you *are* able to change it, you can (and should) just make it an std::atomic<double> and use .load()/.store() with memory_order_relaxed to interact with it. volatile would be the wrong tool here since you're dealing with threads.
>
>
> On Fri, Nov 1, 2019 at 11:35 AM Gaier, Bjoern <[hidden email]> wrote:
> >
> > Hey,
> >
> > You are right with your assumption - RT_Time is coming from a different object file and declared global there. You will also find the second thread there, that will modify the value.
> >
> > I expected to tell Clang with "extern" that stuff is happening to this value in a different source file, and telling it without having the 'const' specifier, that this value will be changed.
> >
> > -----Original Message-----
> > From: Alex Rønne Petersen <[hidden email]>
> > Sent: 01 November 2019 11:26
> > To: Gaier, Bjoern <[hidden email]>
> > Cc: [hidden email]
> > Subject: Re: [cfe-dev] Optimization creates infinite loop
> >
> > Hello,
> >
> > The compiler is making the (correct) assumption that RT_Time cannot be changed by anything other than the currently executing thread since you have not told it otherwise, and since the loop body (or rather, lack thereof) does not modify it.
> >
> > What exactly *is* changing RT_Time? Since the thread that executes
> > Sample1sec() is obviously intended to spin until it changes, I can only think that either some hardware is changing it or another thread is. In which case, see my earlier reply. The fact that it's extern isn't really relevant; I'm fairly confident Clang would generate the same code if you removed the extern keyword.
> >
> > On Fri, Nov 1, 2019 at 11:18 AM Gaier, Bjoern <[hidden email]> wrote:
> > >
> > > Hey,
> > >
> > > Thanks for the answer! As said, volatile fixed the issue for us - however you said:
> > > "the compiler has no reason to assume that any external actor can change the value of it"
> > > I would understand that sentence, when the variable would be declared 'const' - but it was not. Isn't the const statement supposed to declare such situations, when the value will not be changed?
> > >
> > > That sounds like any extern variable that will be changed should be marked 'volatile'.
> > >
> > > -----Original Message-----
> > > From: Alex Rønne Petersen <[hidden email]>
> > > Sent: 01 November 2019 11:11
> > > To: Gaier, Bjoern <[hidden email]>
> > > Cc: [hidden email]
> > > Subject: Re: [cfe-dev] Optimization creates infinite loop
> > >
> > > Hello,
> > >
> > > This isn't really surprising. Since RT_Time is not marked volatile, nor is it accessed through std::atomic functions (or the equivalent builtins), the compiler has no reason to assume that any external actor can change the value of it. So the load is moved out of the loop since doing it repeatedly would, from the compiler's point of view, be wasteful.
> > >
> > > If RT_Time can be changed by hardware or something along those lines, then volatile is appropriate. If another thread can change it, you should be using atomic APIs to access it.
> > >
> > >
> > > On Fri, Nov 1, 2019 at 10:59 AM Gaier, Bjoern via cfe-dev <[hidden email]> wrote:
> > > >
> > > > Hello Clang-Developer,
> > > >
> > > >
> > > >
> > > > We ran into a – for us big problem – with the Clang compiler. We had the following code:
> > > >
> > > > extern double RT_Time;
> > > >
> > > > void Sample1sec()
> > > >
> > > > {
> > > >
> > > >        const double startTime = RT_Time;
> > > >
> > > >        while(RT_Time < (startTime + 5.0005));
> > > >
> > > > }
> > > >
> > > >
> > > >
> > > > The value of RT_Time will change after a while and the loop is supposed to wait for it. However the assembly code will lead into an endless loop if the condition is not meet the first time:
> > > >
> > > > "?Sample1sec@@YAXXZ":                   # @"?Sample1sec@@YAXXZ"
> > > >
> > > > .Lfunc_begin0:
> > > >
> > > >                 .cv_func_id 0
> > > >
> > > > # %bb.0:
> > > >
> > > >                 .cv_file 1
> > > > "C:\\Users\\h16020\\Desktop\\Projects\\Squirrel\\Project\\Leviat
> > > > ha n\ \C M_Driver1\\CM_Driver1.cpp"
> > > > "4E92768969662F6E513C11595FEA9D63" 1
> > > >
> > > >                 .cv_loc  0 1 5 0                 # CM_Driver1.cpp:5:0
> > > >
> > > >                 movsd  xmm0, qword ptr [rip + "?RT_Time@@3NA"] #
> > > > xmm0 = mem[0],zero
> > > >
> > > > .Ltmp0:
> > > >
> > > >                 #DEBUG_VALUE: Sample1sec:startTime <- $xmm0
> > > >
> > > >                 movsd  xmm1, qword ptr [rip +
> > > > __real@40140083126e978d] # xmm1 = mem[0],zero
> > > >
> > > >                 .cv_loc  0 1 0 0                 # CM_Driver1.cpp:0:0
> > > >
> > > >                 addsd   xmm1, xmm0
> > > >
> > > >                 ucomisd               xmm1, xmm0
> > > >
> > > >                 .cv_loc  0 1 6 0                 # CM_Driver1.cpp:6:0
> > > >
> > > >                 jbe         .LBB0_2
> > > >
> > > > .Ltmp1:
> > > >
> > > >                 .p2align                4, 0x90
> > > >
> > > > .LBB0_1:                                # =>This Inner Loop Header: Depth=1
> > > >
> > > >                 #DEBUG_VALUE: Sample1sec:startTime <- $xmm0
> > > >
> > > >                 jmp        .LBB0_1
> > > >
> > > > .Ltmp2:
> > > >
> > > > .LBB0_2:
> > > >
> > > >                 #DEBUG_VALUE: Sample1sec:startTime <- $xmm0
> > > >
> > > >                 .cv_loc  0 1 7 0                 # CM_Driver1.cpp:7:0
> > > >
> > > >                 ret
> > > >
> > > > .Ltmp3:
> > > >
> > > > .Lfunc_end0:
> > > >
> > > >
> > > >
> > > > Marking RT_Time with volatile fixes the problem. But I wonder, why is doing Clang that optimization? RT_Time was not declared ‘const’ so didn’t Clang had to know the value can change?
> > > >
> > > > Is this a bug or do I understand something wrong?
> > > >
> > > >
> > > >
> > > > Kind greetings and thank you for the help in advance,
> > > >
> > > > Björn
> > > >
> > > >
> > > >
> > > > P.S.
> > > >
> > > > My compile flags (when using -###) are:
> > > >
> > > > clang-cl.exe "-cc1" "-triple" "x86_64-pc-windows-msvc19.16.27034" "-S" "-disable-free" "-disable-llvm-verifier" "-discard-value-names" "-main-file-name" "CM_Driver1.cpp" "-mrelocation-model" "pic" "-pic-level" "2" "-mthread-model" "posix" "-relaxed-aliasing" "-fmath-errno" "-masm-verbose" "-mconstructor-aliases" "-munwind-tables" "-target-cpu" "x86-64" "-mllvm" "-x86-asm-syntax=intel" "-D_MT" "-flto-visibility-public-std" "--dependent-lib=libcmt" "--dependent-lib=oldnames" "-fcxx-exceptions" "-fexceptions" "-fexternc-nounwind" "-fms-volatile" "-fdefault-calling-conv=cdecl" "-fdiagnostics-format" "msvc" "-gcodeview" "-debug-info-kind=limited" "-momit-leaf-frame-pointer" "-ffunction-sections" "-coverage-notes-file" "C:\\Users\\h16020\\Desktop\\Projects\\Squirrel\\Project\\Leviathan\\CM_Driver1\\CM_Driver1.gcno" "-resource-dir" "S:\\STSHARE\\EXTERNAL\\LLVM\\lib\\clang\\9.0.0" "-D" "WIN32" "-D" "NTSYSAD_CM" "-D" "_CONSOLE" "-D" "CODEMODULE" "-D" "__asm=RemoveAssembly" "-D" "_UNICODE" "-D" "UNICODE" "-internal-isystem" "S:\\STSHARE\\EXTERNAL\\LLVM\\lib\\clang\\9.0.0\\include" "-internal-isystem" "S:\\STSHARE\\EXTERNAL\\LLVM\\Include" "-internal-isystem" "S:\\STEXEC\\ExecutionEnvironment\\INCLUDE" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Professional\\VC\\Tools\\MSVC\\14.16.27023\\include" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Professional\\VC\\Tools\\MSVC\\14.16.27023\\atlmfc\\include" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Professional\\VC\\Auxiliary\\VS\\include" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\10\\Include\\10.0.10240.0\\ucrt" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\8.1\\Include\\um" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\8.1\\Include\\shared" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\8.1\\Include\\winrt" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\NETFXSDK\\4.6.1\\Include\\um" "-O2" "-WCL4" "-Wno-error" "-Wno-unused-command-line-argument" "-Wno-microsoft-cast" "-Wno-writable-strings" "-Wno-microsoft-enum-forward-reference" "-Wno-invalid-token-paste" "-fdeprecated-macro" "-fdebug-compilation-dir" "C:\\Users\\h16020\\Desktop\\Projects\\Squirrel\\Project\\Leviathan\\CM_Driver1" "-ferror-limit" "3000" "-fmessage-length" "0" "-fno-use-cxa-atexit" "-fms-extensions" "-fms-compatibility" "-fms-compatibility-version=19.16.27034" "-std=c++14" "-fno-threadsafe-statics" "-fdelayed-template-parsing" "-finline-functions" "-fobjc-runtime=gcc" "-fno-caret-diagnostics" "-fdiagnostics-show-option" "-fno-show-column" "-vectorize-loops" "-vectorize-slp" "-faddrsig" "-o" "x64\\Release\\CM_Driver1.asm" "-x" "c++" "CM_Driver1.cpp"
> > > >
> > > >
> > > >
> > > > clang-cl.exe "-cc1as" "-triple" "x86_64-pc-windows-msvc19.16.27034" "-filetype" "obj" "-main-file-name" "CM_Driver1.cpp" "-target-cpu" "x86-64" "-dwarf-version=4" "-mrelocation-model" "pic" "-mincremental-linker-compatible" "-o" "x64\\Release\\CM_Driver1.obj" "x64\\Release\\CM_Driver1.asm"
> > > >
> > > > Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB
> > > > 9816, USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr.
> > > > Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano,
> > > > Takeshi Fukushima. Junichi Tajika
> > > > _______________________________________________
> > > > cfe-dev mailing list
> > > > [hidden email]
> > > > https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
> > > Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB
> > > 9816, USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr.
> > > Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi
> > > Fukushima. Junichi Tajika
> > Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB 9816,
> > USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr.
> > Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi
> > Fukushima. Junichi Tajika
> Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB 9816,
> USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr.
> Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi
> Fukushima. Junichi Tajika
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
Als GmbH eingetragen im Handelsregister Bad Homburg v.d.H. HRB 9816, USt.ID-Nr. DE 114 165 789 Geschäftsführer: Dr. Hiroshi Nakamura, Dr. Robert Plank, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi Fukushima. Junichi Tajika
_______________________________________________
cfe-dev mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev