Clang 'locks up' when compiling optimized


Xin Wang via cfe-dev
The file isn’t very large, at 181949 bytes, and is a machine generated bit of code.

The strange thing is that if I run the compile by hand by cutting/pasting the line into a shell window, it compiles in seconds. I run the same code generation/compile/execution with g++ (5.1) and it never locks up.

Another thing: the code is compiled from a daemon that places the compiler under execution limits. If it runs for more than 30 seconds or uses more than 500MB of RAM, it should have the appropriate limit applied to it. I fork the daemon and, before calling exec() for the compiler, do the following:

      struct rlimit rlim;
      rlim.rlim_cur = rlim.rlim_max = m_compile_max_seconds; // 30

      if (0 != setrlimit(RLIMIT_CPU, &rlim))
      {
        xthrow(Sys_rlimit, errno, "system", "can't bound compilation time");
      }

      rlim.rlim_cur = rlim.rlim_max = m_compile_max_memory * (1 << 20); // 500 MB

      if (0 != setrlimit(RLIMIT_AS, &rlim))
      {
        xthrow(Sys_rlimit, errno, "system", "can't limit compiler memory usage");
      }
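
For anyone who wants to try the same arrangement outside the daemon, here is a minimal, self-contained sketch of the fork / setrlimit / exec sequence described above. It is not the daemon's code: `run_limited`, `ChildResult`, and `megabytes_to_bytes` are hypothetical names, and errors are reported through exit codes rather than `xthrow`. Note that `RLIMIT_CPU` counts CPU seconds consumed by the process, not wall-clock time.

```cpp
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: widen to rlim_t before shifting, so that large
// megabyte counts cannot overflow a plain int.
rlim_t megabytes_to_bytes(std::uint64_t mb) {
    return static_cast<rlim_t>(mb) << 20;
}

// Hypothetical result type: how the child ended.
struct ChildResult {
    bool exited;  // true if the child exited normally
    int  code;    // exit status if exited, otherwise the terminating signal (or -1)
};

// Run argv[0] with a CPU-time limit and an address-space limit applied
// in the child, between fork() and exec().
ChildResult run_limited(const std::vector<std::string>& argv,
                        rlim_t cpu_seconds, rlim_t as_bytes) {
    assert(!argv.empty());

    // Build the exec argument array before forking, so the child does
    // not need to allocate memory.
    std::vector<char*> args;
    for (const auto& a : argv) args.push_back(const_cast<char*>(a.c_str()));
    args.push_back(nullptr);

    pid_t pid = fork();
    if (pid < 0) return {false, -1};

    if (pid == 0) {
        struct rlimit rlim;
        rlim.rlim_cur = rlim.rlim_max = cpu_seconds;  // CPU seconds, not wall clock
        if (setrlimit(RLIMIT_CPU, &rlim) != 0) _exit(126);

        rlim.rlim_cur = rlim.rlim_max = as_bytes;     // virtual address space, in bytes
        if (setrlimit(RLIMIT_AS, &rlim) != 0) _exit(126);

        execv(args[0], args.data());
        _exit(127);  // only reached if exec failed
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return {false, -1};
    if (WIFEXITED(status)) return {true, WEXITSTATUS(status)};
    return {false, WIFSIGNALED(status) ? WTERMSIG(status) : -1};
}
```

Calling `run_limited({"/bin/sh", "-c", "exit 3"}, 30, megabytes_to_bytes(500))` should report a normal exit with status 3; a child that hits one of the limits would instead show up as terminated by a signal or as a failed allocation inside the child.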

This is the compile that locked up. If anyone believes that looking at the source would make a difference, let me know and I’ll send it along.

  501 59880 46191     4004   0  31 10  2495004   9612 -      SN                  0 ??         0:00.02 /opt/local/libexec/llvm-4.0/bin/clang++ -pipe -c -o /Users/barto/UnixEnvironment/CSI/internal/repo4/internal.0/code/381/1.opt -O3 -Winvalid-pch -march=core2 -fstack-protector-strong -D_BSD_SOURCE -DFOR_SPARQL -D_REENTRANT -D_PTHREADS -DTHREAD -D_GLIBCXX_USE_DEPRECATED=0 -DTURBO_GENCODE=1 -DDO_CASSANDRA=0 -DMEM_LIMIT_LEAK_CHECKING -DFULL_RESERVATIONS -DGCC5 -D_DARWIN_C_SOURCE -DDARWIN -DMAC_OSX=1 -std=gnu++14 -m64 -fPIC -I/Users/barto/UnixEnvironment/CSI/repo4/lib -I/Users/barto/UnixEnvironment/CSI/repo4/lib/cgrsrc /Users/barto/UnixEnvironment/CSI/internal/repo4/internal.0/code/381/1.cpp
  501 59881 59880     4004   0  31 10  2554904  33884 -      SN                  0 ??        21:37.14 /opt/local/libexec/llvm-4.0/bin/clang -cc1 -triple x86_64-apple-macosx10.10.0 -Wdeprecated-objc-isa-usage -Werror=deprecated-objc-isa-usage -emit-obj -disable-free -disable-llvm-verifier -discard-value-names -main-file-name 1.cpp -mrelocation-model pic -pic-level 2 -mthread-model posix -mdisable-fp-elim -masm-verbose -munwind-tables -target-cpu core2 -target-linker-version 274.2 -dwarf-column-info -debugger-tuning=lldb -coverage-notes-file /Users/barto/UnixEnvironment/CSI/internal/repo4/internal.0/code/381/1.gcno -resource-dir /opt/local/libexec/llvm-4.0/bin/../lib/clang/4.0.0 -D _BSD_SOURCE -D FOR_SPARQL -D _REENTRANT -D _PTHREADS -D THREAD -D _GLIBCXX_USE_DEPRECATED=0 -D TURBO_GENCODE=1 -D DO_CASSANDRA=0 -D MEM_LIMIT_LEAK_CHECKING -D FULL_RESERVATIONS -D GCC5 -D _DARWIN_C_SOURCE -D DARWIN -D MAC_OSX=1 -I /Users/barto/UnixEnvironment/CSI/repo4/lib -I /Users/barto/UnixEnvironment/CSI/repo4/lib/cgrsrc -stdlib=libc++ -O3 -Winvalid-pch -std=gnu++14 -fdeprecated-macro -fdebug-compilation-dir /Users/barto/UnixEnvironment/CSI/repo4/bin -ferror-limit 19 -fmessage-length 0 -stack-protector 2 -fblocks -fobjc-runtime=macosx-10.10.0 -fencode-extended-block-signature -fcxx-exceptions -fexceptions -fmax-type-align=16 -fdiagnostics-show-option -vectorize-loops -vectorize-slp -o /Users/barto/UnixEnvironment/CSI/internal/repo4/internal.0/code/381/1.opt -x c++ /Users/barto/UnixEnvironment/CSI/internal/repo4/internal.0/code/381/1.cpp

David


David Barto
[hidden email]

Sometimes, my best code does nothing. Most of the rest of it has bugs.




_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

Re: Clang 'locks up' when compiling optimized

Xin Wang via cfe-dev
Does it actually lock up, or just take a very long time? LLVM does have problems with very large functions, which lead to long times for "instruction selection" (worse in debug builds of the compiler, too).

The same applies to g++: I had something of about 100k lines that took over 15 minutes to compile a while back, and tweaking the options brought it down to about 20 s. Just because one compiler is "good" and the other "bad" doesn't mean that the "bad" one is broken. It all depends on what the code looks like: one compiler may well run through the compilation quickly while the other takes very long. "The devil is in the detail". In my g++ case, it was "dead store elimination" that took a long time, and on a file of several megabytes the difference with DSE enabled was a few kilobytes. From what I can tell [without looking at the code], g++ does DSE in O(n^2) time, by something akin to `for_each(instructions) { for_each(instructions) check_this_instruction(); }`
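
To make the quadratic shape concrete, here is a toy sketch of a pairwise dead-store scan. It is emphatically not how g++ or LLVM implement DSE; `Instr`, `Op`, and `naive_dead_stores` are hypothetical names, and the "program" is just a flat list of loads and stores to integer addresses. The `checks` counter shows how the inner-loop work grows with the instruction count.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy instruction model (an assumption for illustration only).
enum class Op { Store, Load };
struct Instr { Op op; int addr; };

// For every store, scan forward for the next access to the same address.
// The store is dead if that next access is another store.
// `checks` receives the number of inner-loop iterations performed.
std::vector<std::size_t> naive_dead_stores(const std::vector<Instr>& code,
                                           std::size_t& checks) {
    std::vector<std::size_t> dead;
    checks = 0;
    for (std::size_t i = 0; i < code.size(); ++i) {
        if (code[i].op != Op::Store) continue;
        for (std::size_t j = i + 1; j < code.size(); ++j) {
            ++checks;
            if (code[j].addr != code[i].addr) continue;
            if (code[j].op == Op::Store) dead.push_back(i);  // overwritten before any read
            break;  // the first later access to this address decides
        }
    }
    return dead;
}
```

With n stores to n distinct addresses the inner loop never breaks early, so the scan performs n(n-1)/2 checks: 4950 for n = 100, and roughly five billion for n = 100,000. That is the kind of growth that turns seconds into many minutes on one very large function.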

--
Mats

On 27 June 2017 at 14:45, David Barto via cfe-dev <[hidden email]> wrote:
The file isn’t very large, at 181949 bytes, and is a machine generated bit of code.


Re: Clang 'locks up' when compiling optimized

Xin Wang via cfe-dev
On Jun 27, 2017, at 7:00 AM, mats petersson <[hidden email]> wrote:

Does it actually lock up, or just take a very long time? LLVM does have problems with very large functions, which lead to long times for "instruction selection" (worse in debug builds of the compiler, too).


This was left running overnight. It was completely locked and wasn’t making any progress.

Just scraping the compile line from the ps output and pasting it into a shell has the compiler finishing in about 5-8 seconds. So something about running this through my compile daemon is doing something weird.

It doesn’t happen on the same file every time. If I delete the code cache and re-run my system again, it will pick another file to lock up on, or possibly run to completion without locking up. It appears random.

David


Re: Clang 'locks up' when compiling optimized

Xin Wang via cfe-dev
>the code is compiled from a daemon

Does it also lock up without the daemon, or only with it?


On 27.06.2017 at 16:09, David Barto via cfe-dev wrote:

> This was left running overnight. It was completely locked and wasn’t making any progress.

Re: Clang 'locks up' when compiling optimized

Xin Wang via cfe-dev
This is the stack trace from when the compiler locked up.
I attached with 'lldb -p <pid>'.
I did a 'thread backtrace all' and then resumed the process.
I interrupted the program again and did a second 'thread backtrace all'. Both were identical.

David

(lldb) thread backtrace all
* thread #1: tid = 0x13b475b, 0x00007fff90ec65da libsystem_kernel.dylib`syscall_thread_switch + 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  * frame #0: 0x00007fff90ec65da libsystem_kernel.dylib`syscall_thread_switch + 10
    frame #1: 0x00007fff9497682d libsystem_platform.dylib`_OSSpinLockLockSlow + 63
    frame #2: 0x00007fff8ca7271b libsystem_malloc.dylib`szone_malloc_should_clear + 116
    frame #3: 0x00007fff8ca72667 libsystem_malloc.dylib`malloc_zone_malloc + 71
    frame #4: 0x00007fff8ca71187 libsystem_malloc.dylib`malloc + 42
    frame #5: 0x00007fff991fa43e libc++.1.dylib`operator new(unsigned long) + 30
    frame #6: 0x00007fff991fcf05 libc++.1.dylib`std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::__init(char const*, unsigned long) + 59
    frame #7: 0x000000010e6fc7a9 libLLVM.dylib`llvm::sys::findProgramByName(llvm::StringRef, llvm::ArrayRef<llvm::StringRef>) + 670
    frame #8: 0x000000010e6fd22c libLLVM.dylib`printSymbolizedStackTrace(llvm::StringRef, void**, int, llvm::raw_ostream&) + 186
    frame #9: 0x000000010e6fda7b libLLVM.dylib`llvm::sys::PrintStackTrace(llvm::raw_ostream&) + 93
    frame #10: 0x000000010e6fd116 libLLVM.dylib`llvm::sys::RunSignalHandlers() + 83
    frame #11: 0x000000010e6fde4d libLLVM.dylib`SignalHandler(int) + 183
    frame #12: 0x00007fff94977f1a libsystem_platform.dylib`_sigtramp + 26
    frame #13: 0x00007fff8ca757da libsystem_malloc.dylib`szone_free_definite_size + 4827
    frame #14: 0x000000010eb8a45b libLLVM.dylib`std::__1::__tree<std::__1::__value_type<llvm::Value*, llvm::Optional<(anonymous namespace)::BitPart> >, std::__1::__map_value_compare<llvm::Value*, std::__1::__value_type<llvm::Value*, llvm::Optional<(anonymous namespace)::BitPart> >, std::__1::less<llvm::Value*>, true>, std::__1::allocator<std::__1::__value_type<llvm::Value*, llvm::Optional<(anonymous namespace)::BitPart> > > >::destroy(std::__1::__tree_node<std::__1::__value_type<llvm::Value*, llvm::Optional<(anonymous namespace)::BitPart> >, void*>*) + 41
    frame #15: 0x000000010eb8a44f libLLVM.dylib`std::__1::__tree<std::__1::__value_type<llvm::Value*, llvm::Optional<(anonymous namespace)::BitPart> >, std::__1::__map_value_compare<llvm::Value*, std::__1::__value_type<llvm::Value*, llvm::Optional<(anonymous namespace)::BitPart> >, std::__1::less<llvm::Value*>, true>, std::__1::allocator<std::__1::__value_type<llvm::Value*, llvm::Optional<(anonymous namespace)::BitPart> > > >::destroy(std::__1::__tree_node<std::__1::__value_type<llvm::Value*, llvm::Optional<(anonymous namespace)::BitPart> >, void*>*) + 29
    frame #16: 0x000000010eb8a44f libLLVM.dylib`std::__1::__tree<std::__1::__value_type<llvm::Value*, llvm::Optional<(anonymous namespace)::BitPart> >, std::__1::__map_value_compare<llvm::Value*, std::__1::__value_type<llvm::Value*, llvm::Optional<(anonymous namespace)::BitPart> >, std::__1::less<llvm::Value*>, true>, std::__1::allocator<std::__1::__value_type<llvm::Value*, llvm::Optional<(anonymous namespace)::BitPart> > > >::destroy(std::__1::__tree_node<std::__1::__value_type<llvm::Value*, llvm::Optional<(anonymous namespace)::BitPart> >, void*>*) + 29
    frame #17: 0x000000010eb894e0 libLLVM.dylib`llvm::recognizeBSwapOrBitReverseIdiom(llvm::Instruction*, bool, bool, llvm::SmallVectorImpl<llvm::Instruction*>&) + 1224
    frame #18: 0x000000010ec3f969 libLLVM.dylib`llvm::InstCombiner::MatchBSwap(llvm::BinaryOperator&) + 391
    frame #19: 0x000000010ec3fe7c libLLVM.dylib`llvm::InstCombiner::visitOr(llvm::BinaryOperator&) + 636
    frame #20: 0x000000010ec2e3a3 libLLVM.dylib`llvm::InstCombiner::run() + 1261
    frame #21: 0x000000010ec2f05c libLLVM.dylib`combineInstructionsOverFunction(llvm::Function&, llvm::InstCombineWorklist&, llvm::AAResults*, llvm::AssumptionCache&, llvm::TargetLibraryInfo&, llvm::DominatorTree&, bool, llvm::LoopInfo*) + 2431
    frame #22: 0x000000010ec2f2d7 libLLVM.dylib`llvm::InstructionCombiningPass::runOnFunction(llvm::Function&) + 297
    frame #23: 0x000000010e78e1ba libLLVM.dylib`llvm::FPPassManager::runOnFunction(llvm::Function&) + 290
    frame #24: 0x000000010ee59722 libLLVM.dylib`(anonymous namespace)::CGPassManager::runOnModule(llvm::Module&) + 810
    frame #25: 0x000000010e78e6be libLLVM.dylib`llvm::legacy::PassManagerImpl::run(llvm::Module&) + 606
    frame #26: 0x000000010d26c481 clang`clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::DataLayout const&, llvm::Module*, clang::BackendAction, std::__1::unique_ptr<llvm::raw_pwrite_stream, std::__1::default_delete<llvm::raw_pwrite_stream> >) + 10253
    frame #27: 0x000000010d38e53d clang`clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) + 1035
    frame #28: 0x000000010d693f59 clang`clang::ParseAST(clang::Sema&, bool, bool) + 374
    frame #29: 0x000000010d4fa5bd clang`clang::FrontendAction::Execute() + 69
    frame #30: 0x000000010d4c89d0 clang`clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) + 722
    frame #31: 0x000000010d526144 clang`clang::ExecuteCompilerInvocation(clang::CompilerInstance*) + 1976
    frame #32: 0x000000010d205d96 clang`cc1_main(llvm::ArrayRef<char const*>, char const*, void*) + 1371
    frame #33: 0x000000010d20503e clang`main + 8255
    frame #34: 0x00007fff979fb5c9 libdyld.dylib`start + 1
    frame #35: 0x00007fff979fb5c9 libdyld.dylib`start + 1
(lldb) 
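
Reading the frames bottom-up: frame #13 is inside `szone_free_definite_size`, frame #12 is the signal trampoline `_sigtramp`, and the handler above it (`SignalHandler` → `PrintStackTrace` → `findProgramByName`) ends up in `malloc`, spinning in `_OSSpinLockLockSlow`. One possible reading, which nothing in the thread confirms at this point, is that a signal arrived while the allocator held its lock and the handler then tried to allocate. `malloc` is not on the POSIX list of async-signal-safe functions, so a handler that calls it can hang this way. The sketch below is an assumption-laden illustration, not LLVM's handler: `on_cpu_limit` and `exit_code_after_sigxcpu` are hypothetical names, and `SIGXCPU` is chosen only because the daemon sets `RLIMIT_CPU`. It shows a handler restricted to `write()` and `_exit()`, both of which are async-signal-safe.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>
#include <csignal>

// Hypothetical handler: only async-signal-safe calls, no allocation,
// no stdio, no C++ strings.
extern "C" void on_cpu_limit(int) {
    static const char msg[] = "CPU limit reached\n";
    ssize_t ignored = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)ignored;
    _exit(124);  // leave immediately; never returns to the interrupted code
}

// Fork a child that installs the handler and raises SIGXCPU on itself,
// then report the child's exit status (or -1 on error).
int exit_code_after_sigxcpu() {
    pid_t pid = fork();
    if (pid < 0) return -1;

    if (pid == 0) {
        struct sigaction sa {};
        sa.sa_handler = on_cpu_limit;
        sigemptyset(&sa.sa_mask);
        if (sigaction(SIGXCPU, &sa, nullptr) != 0) _exit(125);
        raise(SIGXCPU);
        _exit(0);  // not reached if the handler ran
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Under these assumptions the child exits with status 124 no matter what the interrupted code was doing, because the handler never touches the heap.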


On Jun 27, 2017, at 7:20 AM, Brian Cain <[hidden email]> wrote:



If you attach to clang with a debugger after it stalls, what code is it executing (or, if macOS has something like strace, can you run that)?  Does the `ps` output you've shown indicate that the process is in the 'sleeping' state?

You've identified the environment as a contributor, so I think it makes sense to continue bisecting the differences between the good and bad cases.  IIRC, RLIMIT_AS limits the effective virtual address space of a process and not its resident memory.  On a good/baseline compile, how big does the address space get?  This seems like the most likely part to examine more closely.
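
The distinction between address space and resident memory can be shown with a small experiment. This is a sketch under stated assumptions: `reserve_succeeds_under_as_limit` is a hypothetical name, and the behaviour described is what Linux does, so other systems may enforce `RLIMIT_AS` differently or not at all. The child reserves address space with `mmap` and never touches it, so it adds essentially nothing to resident memory, yet the reservation is still refused once it would exceed the limit.

```cpp
#include <sys/mman.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>
#include <cstddef>

// In a child process, cap the address space at as_limit_bytes and then
// try to reserve reserve_bytes of untouched, inaccessible memory.
// Returns true only if the reservation succeeded.
bool reserve_succeeds_under_as_limit(std::size_t reserve_bytes,
                                     rlim_t as_limit_bytes) {
    assert(reserve_bytes > 0);

    pid_t pid = fork();
    if (pid < 0) return false;

    if (pid == 0) {
        struct rlimit rlim;
        rlim.rlim_cur = rlim.rlim_max = as_limit_bytes;
        if (setrlimit(RLIMIT_AS, &rlim) != 0) _exit(2);

        // PROT_NONE and never touched: no pages become resident.
        void* p = mmap(nullptr, reserve_bytes, PROT_NONE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        _exit(p == MAP_FAILED ? 1 : 0);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return false;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

With a 512 MiB limit, a 1 MiB reservation would be expected to succeed and a 4 GiB one to fail, even though neither makes any page resident. A compiler that maps large regions up front can therefore hit an address-space cap well before its resident memory reaches the same figure.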




Re: Clang 'locks up' when compiling optimized

Xin Wang via cfe-dev


On Wed, Jun 28, 2017 at 9:55 AM, David Barto <[hidden email]> wrote:

(lldb) thread backtrace all
* thread #1: tid = 0x13b475b, 0x00007fff90ec65da libsystem_kernel.dylib`syscall_thread_switch + 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  * frame #0: 0x00007fff90ec65da libsystem_kernel.dylib`syscall_thread_switch + 10
    frame #1: 0x00007fff9497682d libsystem_platform.dylib`_OSSpinLockLockSlow + 63
    frame #2: 0x00007fff8ca7271b libsystem_malloc.dylib`szone_malloc_should_clear + 116
    frame #3: 0x00007fff8ca72667 libsystem_malloc.dylib`malloc_zone_malloc + 71
    frame #4: 0x00007fff8ca71187 libsystem_malloc.dylib`malloc + 42
    frame #5: 0x00007fff991fa43e libc++.1.dylib`operator new(unsigned long) + 30
    frame #6: 0x00007fff991fcf05 libc++.1.dylib`std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::__init(char const*, unsigned long) + 59
    frame #7: 0x000000010e6fc7a9 libLLVM.dylib`llvm::sys::findProgramByName(llvm::StringRef, llvm::ArrayRef<llvm::StringRef>) + 670


This seems to point back to your RLIMIT_AS constraint, or overall system memory availability.  Why constrain the address space?  A much more realistic constraint would be RLIMIT_RSS.

-Brian


Re: Clang 'locks up' when compiling optimized

Xin Wang via cfe-dev
This is part of an in-memory system (no swap space configured), so RSS would match the AS size for this use case, at least from what I have read about RSS and AS on macOS and Linux.

Why did it lock up? Why not throw the exception and exit?

David

On Jun 28, 2017, at 8:07 AM, Brian Cain <[hidden email]> wrote:



This seems to point back to your RLIMIT_AS constraint, or overall system memory availability.  Why constrain the address space?  A much more realistic constraint would be RLIMIT_RSS.

-Brian

David Barto
[hidden email]

Sometimes, my best code does nothing. Most of the rest of it has bugs.





Re: Clang 'locks up' when compiling optimized

Xin Wang via cfe-dev
On Wed, Jun 28, 2017 at 10:10 AM, David Barto <[hidden email]> wrote:
This is part of an in-memory system (no swap space configured), so from what I have read about RSS and AS on MacOS and Linux, RSS would match the AS size for this use case.

Why did it lock up? Why not throw the exception and exit?



Dunno. The OS is driving at this point, and it's not immediately clear to me why that system call wouldn't return either success or failure. But clang is asking for a resource (more memory), and I've seen those requests stall before. My experience with Linux (which may or may not apply here) leads me to believe that the system is resource-constrained and your task is pending while it tries to free up those resources.

RLIMIT_AS and RLIMIT_RSS are distinct on Linux; I guess I am a little surprised to see that they're not on MacOS.
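
A quick way to check this on a given machine is to compare the two constants directly. The sketch below is only an illustration: the helper names `as_and_rss_share_a_resource` and `describe_limit` are made up for this example, and the expectation that MacOS headers alias RLIMIT_RSS to RLIMIT_AS while Linux keeps them separate is an assumption to verify locally, not something established in this thread.

```cpp
#include <sys/resource.h>
#include <string>

// Hypothetical helper: true when the platform headers give RLIMIT_AS and
// RLIMIT_RSS the same resource number, so limiting one limits the other.
bool as_and_rss_share_a_resource() {
  return static_cast<int>(RLIMIT_AS) == static_cast<int>(RLIMIT_RSS);
}

// Hypothetical helper: returns the current soft limit for a resource as text,
// "unlimited" for RLIM_INFINITY, or "unavailable" if getrlimit() fails.
std::string describe_limit(int resource) {
  struct rlimit rl;
  if (getrlimit(resource, &rl) != 0) {
    return "unavailable";
  }
  if (rl.rlim_cur == RLIM_INFINITY) {
    return "unlimited";
  }
  return std::to_string(static_cast<unsigned long long>(rl.rlim_cur));
}
```

Printing `as_and_rss_share_a_resource()` together with `describe_limit(RLIMIT_AS)` and `describe_limit(RLIMIT_RSS)` from inside the daemon's child, just before exec(), would show which limits are actually in force for the compiler.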

In any case, the most likely culprit is your setrlimit. If I were you I would take clang out of the loop entirely and write a test program that does allocations just like the ones clang does (mallocs of various sizes; you could profile clang to get a ballpark histogram). I would be surprised if you don't see the same behavior.
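
Such a test program could look roughly like the sketch below. It forks a child, applies an RLIMIT_AS cap the same way the daemon does, and then calls malloc() with a handful of sizes until an allocation fails. The function name `run_allocation_probe`, the size mix, and the exit codes are all invented for this example; a real probe would use sizes taken from a profile of clang.

```cpp
#include <sys/resource.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical probe. Returns the child's exit code:
//   0  malloc() returned NULL cleanly once the limit was reached
//   2  setrlimit() failed
//   3  allocated 4x the limit without a failure (limit not enforced)
//  -1  fork/waitpid failed or the child was killed by a signal
int run_allocation_probe(std::size_t limit_mb) {
  const std::size_t limit_bytes = limit_mb << 20;
  pid_t pid = fork();
  if (pid < 0) return -1;
  if (pid == 0) {
    struct rlimit rl;
    rl.rlim_cur = rl.rlim_max = static_cast<rlim_t>(limit_bytes);
    if (setrlimit(RLIMIT_AS, &rl) != 0) _exit(2);

    // Arbitrary mix of small and large requests (an assumption, not a
    // measured clang allocation histogram).
    const std::size_t sizes[] = {64, 4096, 1 << 16, 1 << 20, 8 << 20};
    std::size_t total = 0;
    for (std::size_t i = 0; total < 4 * limit_bytes; ++i) {
      const std::size_t n = sizes[i % 5];
      void* p = std::malloc(n);
      if (p == nullptr) _exit(0);  // clean failure: the expected outcome
      std::memset(p, 0xA5, n);     // touch the pages so they are backed
      total += n;                  // blocks are leaked on purpose
    }
    _exit(3);
  }
  int status = 0;
  if (waitpid(pid, &status, 0) < 0) return -1;
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling `run_allocation_probe(500)` mirrors the 500 MB cap from the original post. If the child hangs instead of exiting, the stall can be reproduced without clang in the picture.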

-Brian


Re: Clang 'locks up' when compiling optimized

Xin Wang via cfe-dev
Looks like the bug is that the crash handler is attempting to allocate memory, and the reason it was crashing was that it ran out of memory.

Sounds like a real clang issue to deal with, as the compiler should not assume infinite resources.

David



Re: Clang 'locks up' when compiling optimized

Xin Wang via cfe-dev
Hmm, yes, I didn't notice that in the backtrace but you're right.

I don't think "assume infinite resources" is the bug, though. malloc() is not async-signal-safe on Linux, so it probably isn't on MacOS either. We shouldn't be calling malloc() from the signal handler.
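
For contrast, here is a minimal sketch of a handler that stays within the async-signal-safe set: it only calls write() and _exit(), and its message lives in static storage so nothing is allocated. This is not LLVM's handler and does none of its crash reporting; the names `minimal_crash_handler`, `install_minimal_crash_handler` and `crash_child_exit_code`, and the exit code 70, are made up for the example.

```cpp
#include <signal.h>
#include <sys/wait.h>
#include <unistd.h>

// Message in static storage, so the handler never needs to allocate.
static const char kCrashMsg[] = "fatal signal received, exiting\n";

// Calls only write(2) and _exit(2), both async-signal-safe under POSIX.
static void minimal_crash_handler(int) {
  ssize_t ignored = write(STDERR_FILENO, kCrashMsg, sizeof(kCrashMsg) - 1);
  (void)ignored;
  _exit(70);  // arbitrary exit code chosen for this sketch
}

// Installs the handler for one signal; returns true on success.
bool install_minimal_crash_handler(int signo) {
  struct sigaction sa = {};
  sigemptyset(&sa.sa_mask);
  sa.sa_flags = 0;
  sa.sa_handler = minimal_crash_handler;
  return sigaction(signo, &sa, nullptr) == 0;
}

// Demo: fault in a forked child and report how the child ended.
// Returns the child's exit code, or -1 if it did not exit normally.
int crash_child_exit_code() {
  pid_t pid = fork();
  if (pid < 0) return -1;
  if (pid == 0) {
    if (!install_minimal_crash_handler(SIGSEGV)) _exit(1);
    raise(SIGSEGV);  // the handler runs and calls _exit(70)
    _exit(2);        // only reached if the handler did not run
  }
  int status = 0;
  if (waitpid(pid, &status, 0) < 0) return -1;
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the handler takes no locks and allocates nothing, it cannot end up spinning on the malloc lock the way the backtrace shows, even when the fault was itself caused by running out of memory.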

As a practical matter, maybe this is a feature that you could disable either at build-time or runtime?







--
-Brian


Re: Clang 'locks up' when compiling optimized

Xin Wang via cfe-dev
I’m going to disable (for the MacOS Builds) the memory limit checks for the time being.

Should I file a bug about this, or is there someone with ‘more authority’ who would be more appropriate?

David



Re: Clang 'locks up' when compiling optimized

Xin Wang via cfe-dev
On Wed, Jun 28, 2017 at 10:31 AM, David Barto <[hidden email]> wrote:
I’m going to disable (for the MacOS Builds) the memory limit checks for the time being.

Should I register a bug about this or is there someone with ‘more authority’ who would be more appropriate?



As far as I know, anyone in this community should feel free to open a bug, and if I were you that's what I'd do. Maintainers are free to debate your (and my) claim that this is not how clang should behave. The challenge will be agreeing on how to address the problem, and perhaps finding leverage to get someone to implement the fix if you can't do it yourself.

The rich set of behavior that kicks in when the compiler faults is useful for gathering bug reports from end users. Preserving that functionality while making the signal handler robust and safe will require a smarter design.

As an aside, it would be cool to write an analyzer rule that checks that functions registered as signal handlers make only async-signal-safe calls.

-Brian
