clang with Python 3.1 release31-maint

Tony Arkles
Hi,

I'm at a bit of a loss here for how to continue.  I've built
llvm/clang from svn, and I'm compiling the latest rev from the Python
3.1 release branch.  It compiles fine, but fails its regression suite.
 I've tried recompiling using gcc instead and it seems to work fine.

I'm not sure if this is a clang problem, an llvm problem, a Python
problem, a library problem (Ubuntu 8.04.2), or maybe something else
entirely.

Here's where I'm at so far:

- I've tried compiling with -O0, no change
- I've used GDB to narrow down the segfault to a somewhat complicated
piece of code (in expat, which has apparently been stable for a long
time)
- Valgrind reports a lot of bad memory access (Python, I think, should
generate some as "known behaviour", but when compiled with clang it
produces quite a bit more)

I'm not really sure where to go next to optimize the use of my time...
I've done quite a bit of search for others that have experienced
similar issues, but have so far come up mostly dry...

Any thoughts?  I'll keep poking at it and see if I can make any progress.

Cheers
Tony
_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev

Re: clang with Python 3.1 release31-maint

Douglas Gregor

On Mar 3, 2010, at 6:43 AM, Tony Arkles wrote:

> Hi,
>
> I'm at a bit of a loss here for how to continue.  I've built
> llvm/clang from svn, and I'm compiling the latest rev from the Python
> 3.1 release branch.  It compiles fine, but fails its regression suite.
> I've tried recompiling using gcc instead and it seems to work fine.
>
> I'm not sure if this is a clang problem, an llvm problem, a Python
> problem, a library problem (Ubuntu 8.04.2), or maybe something else
> entirely.
>
> Here's where I'm at so far:
>
> - I've tried compiling with -O0, no change
> - I've used GDB to narrow down the segfault to a somewhat complicated
> piece of code (in expat, which has apparently been stable for a long
> time)
> - Valgrind reports a lot of bad memory access (Python, I think, should
> generate some as "known behaviour", but when compiled with clang it
> produces quite a bit more)
>
> I'm not really sure where to go next to optimize the use of my time...
> I've done quite a bit of search for others that have experienced
> similar issues, but have so far come up mostly dry...

The typical approach to narrowing down these problems is to compile half of the .o files with GCC and half with Clang, link them together, and see if that works. Continue that binary search down until you've found the small set of .o files that, when compiled with Clang, result in a failure. Then you can debug or start splitting up the file itself to narrow it down to a few function(s) to determine whether it's a Clang mis-compile, or undefined behavior in Python, or something entirely different.

        - Doug
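
The binary search described above can be sketched as a small shell function. This is only an illustration under stated assumptions: bisect_files and the test command passed to it are hypothetical names, the test command is assumed to build the listed files with Clang (everything else with GCC) and exit 0 when the test suite passes, and exactly one file is assumed to trigger the failure.

```shell
#!/bin/sh
# bisect_files TEST FILE...
#   TEST  hypothetical command: builds the given files with clang and the
#         rest with gcc, then exits 0 if the test suite passes.
#   Prints the single file that makes the suite fail when built with clang.
#   Assumes one culprit and file names without whitespace.
bisect_files() {
    test_cmd=$1
    shift
    while [ "$#" -gt 1 ]; do
        half=$(( $# / 2 ))
        first=
        second=
        i=0
        # Split the remaining candidates into two halves.
        for f in "$@"; do
            if [ "$i" -lt "$half" ]; then
                first="$first $f"
            else
                second="$second $f"
            fi
            i=$(( i + 1 ))
        done
        # Build only the first half with clang. If the suite still passes,
        # the culprit must be in the second half; otherwise in the first.
        if $test_cmd $first; then
            set -- $second
        else
            set -- $first
        fi
    done
    echo "$1"
}
```

If more than one file is involved, one option is to rerun the search with the first culprit always built by GCC.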

Re: clang with Python 3.1 release31-maint

Tony Arkles
On Wed, Mar 3, 2010 at 9:19 AM, Douglas Gregor <[hidden email]> wrote:
>
> The typical approach to narrowing down these problems is to compile half of the .o files with GCC and half with Clang, link them together, and see if that works. Continue that binary search down until you've found the small set of .o files that, when compiled with Clang, result in a failure. Then you can debug or start splitting up the file itself to narrow it down to a few function(s) to determine whether it's a Clang mis-compile, or undefined behavior in Python, or something entirely different.
>
>        - Doug

Wow, this is an incredibly productive approach :).

I set Python 3.1 aside and focused directly on expat 2.0.1 (this is
where a lot of the failures with Python were).

- If I compile everything with clang, it segfaults on "make check"
(runs their unit test suite)

- If I compile everything with clang except for lib/xmlparse.c, the
test suite runs but has 3 failures.

- If I compile everything with clang except for lib/xmlparse.c and
lib/xmltok.c, the test suite runs fine...

So we've narrowed it down quite a bit!

I have class and some things to deal with this afternoon, but
hopefully I'll have more time to look at this this evening.

Any advice for next steps?  I haven't really looked into the code
generation bits of clang/llvm yet, but I'm guessing that this will
involve comparing assembler output from gcc and clang? (Which I'm
comfortable with, but a bit concerned because xmlparse.c is a 6k line
file...)

Cheers, thanks for the pointers so far,
Tony


Re: clang with Python 3.1 release31-maint

Andrius Morkūnas
In reply to this post by Tony Arkles
On Wed, 03 Mar 2010 16:43:50 +0200, Tony Arkles <[hidden email]> wrote:
> I'm at a bit of a loss here for how to continue.  I've built
> llvm/clang from svn, and I'm compiling the latest rev from the Python
> 3.1 release branch.  It compiles fine, but fails its regression suite.
>  I've tried recompiling using gcc instead and it seems to work fine.

We've seen this too, on FreeBSD. It happens on i386 machines; amd64 ones
don't seem to have this problem. There are more things that are miscompiled
by clang/llvm, but Python is a more interesting case because it is a
dependency for lots of other software.

--
Andrius

Re: clang with Python 3.1 release31-maint

Tony Arkles
2010/3/3 Andrius Morkūnas <[hidden email]>:

> On Wed, 03 Mar 2010 16:43:50 +0200, Tony Arkles <[hidden email]>
> wrote:
>>
>> I'm at a bit of a loss here for how to continue.  I've built
>> llvm/clang from svn, and I'm compiling the latest rev from the Python
>> 3.1 release branch.  It compiles fine, but fails its regression suite.
>>  I've tried recompiling using gcc instead and it seems to work fine.
>
> We've seen this too, on FreeBSD. It happens on i386 machines, amd64 ones
> don't seem to have this problem. There are more things that are miscompiled
> by clang/llvm but python is more interesting case because it is a dependency
> for lots of other software.

Which tests are failing for you?  Mine were:

test_grp
test_minidom
test_multiprocessing
test_plistlib
test_pyexpat
test_sax
test_xml_etree_c
test_xml_etree
test_xmlrpc


Re: clang with Python 3.1 release31-maint

Douglas Gregor
In reply to this post by Tony Arkles

On Mar 3, 2010, at 8:16 AM, Tony Arkles <[hidden email]>  
wrote:

> On Wed, Mar 3, 2010 at 9:19 AM, Douglas Gregor <[hidden email]>  
> wrote:
>>
>> The typical approach to narrowing down these problems is to compile  
>> half of the .o files with GCC and half with Clang, link them  
>> together, and see if that works. Continue that binary search down  
>> until you've found the small set of .o files that, when compiled  
>> with Clang, result in a failure. Then you can debug or start  
>> splitting up the file itself to narrow it down to a few function(s)  
>> to determine whether it's a Clang mis-compile, or undefined  
>> behavior in Python, or something entirely different.
>>
>>        - Doug
>
> Wow, this is an incredibly productive approach :).

O(lg N) :)

> I set Python 3.1 aside and focused directly on expat 2.0.1 (this is
> where a lot of the failures with Python were).
>
> - If I compile everything with clang, it segfaults on "make check"
> (runs their unit test suite)
>
> - If I compile everything with clang except for lib/xmlparse.c, the
> test suite runs but has 3 failures.
>
> - If I compile everything with clang except for lib/xmlparse.c and
> lib/xmltok.c, the test suite runs fine...
>
> So we've narrowed it down quite a bit!
>
> I have class and some things to deal with this afternoon, but
> hopefully I'll have more time to look at this this evening.
>
> Any advice for next steps?  I haven't really looked into the code
> generation bits of clang/llvm yet, but I'm guessing that this will
> involve comparing assembler output from gcc and clang? (Which I'm
> comfortable with, but a bit concerned because xmlparse.c is a 6k line
> file...)

It's slightly more involved, but you can continue your binary search
within a file by splitting it into two .c files, and moving functions
between them.

Eventually, it will get to the point where you are looking at assembly
or LLVM IR, but you want to only do that for a small number of
functions.
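
Once the suspect code is down to a small file, the assembly and LLVM IR mentioned above can be dumped side by side. A minimal sketch, assuming gcc and clang are both on the PATH; dump_codegen and the output file names are made up for illustration, while -S and -emit-llvm are the usual compiler flags for assembly and textual IR output.

```shell
#!/bin/sh
# dump_codegen FILE.c
#   Writes FILE.gcc.s, FILE.clang.s and FILE.clang.ll next to the source so
#   the outputs can be compared with diff or by eye.
#   -O0 keeps the output close to the source; change it to match the build.
dump_codegen() {
    base=${1%.c}
    gcc   -O0 -S "$1" -o "$base.gcc.s" &&
    clang -O0 -S "$1" -o "$base.clang.s" &&
    clang -O0 -S -emit-llvm "$1" -o "$base.clang.ll"
}
```

This is only practical for a handful of functions, which is why the reduction comes first.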

Re: clang with Python 3.1 release31-maint

David Chisnall
In reply to this post by Andrius Morkūnas
On 3 Mar 2010, at 16:19, Andrius Morkūnas wrote:

> We've seen this too, on FreeBSD. It happens on i386 machines, amd64 ones
> don't seem to have this problem. There are more things that are miscompiled
> by clang/llvm but python is more interesting case because it is a dependency
> for lots of other software.


These are almost certainly cases of clang using the wrong calling convention.  The Darwin ABI is well tested, but other platforms are not.  The x86-64 ABI is more or less the same between Darwin and FreeBSD, so you won't see any problems.  The x86 ABI is different everywhere (and includes weirdnesses like Linux passing unions of two smaller-than-a-register integer types by pointer, which was the most recent one that I saw break stuff).  

The most commonly used function prototypes are now well tested because they broke stuff early on, but there are still some corner cases.  They're usually easy to fix if you can produce a small test case, which is typically just a function and call to that function which generates invalid code when one is compiled with clang and the other with the system compiler.  

David
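
A test case of the shape described above (one function, one call, each half built by a different compiler) might look like the following sketch. Everything here is illustrative: the union, the file names and the helper functions are invented, and this is not a known reproducer for the Python or expat failure.

```shell
#!/bin/sh
# write_abi_case DIR
#   Writes a caller and a callee that pass a small union by value.
write_abi_case() {
    mkdir -p "$1" || return 1
    cat > "$1/callee.c" <<'EOF'
union pair { short a; char b; };
int take(union pair p) { return p.a; }
EOF
    cat > "$1/caller.c" <<'EOF'
union pair { short a; char b; };
int take(union pair p);
int main(void)
{
    union pair p;
    p.a = 1234;
    /* Exit 0 only if the callee saw the value the caller passed. */
    return take(p) == 1234 ? 0 : 1;
}
EOF
}

# run_abi_case DIR CALLER_CC CALLEE_CC
#   Builds each half with a different compiler, links, and runs the result.
#   Exits 0 when both compilers agree on how the union is passed.
#   Extra flags can be passed in CFLAGS (for example to force a 32-bit build).
run_abi_case() {
    "$2" $CFLAGS -c "$1/caller.c" -o "$1/caller.o" &&
    "$3" $CFLAGS -c "$1/callee.c" -o "$1/callee.o" &&
    "$2" $CFLAGS "$1/caller.o" "$1/callee.o" -o "$1/abi_case" &&
    "$1/abi_case"
}
```

For example, run_abi_case tmp gcc clang followed by run_abi_case tmp clang gcc checks both directions; a non-zero exit in either one points at a calling convention mismatch for that prototype.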


Re: clang with Python 3.1 release31-maint

Roman Divacky
In reply to this post by Douglas Gregor
> > Any advice for next steps?  I haven't really looked into the code
> > generation bits of clang/llvm yet, but I'm guessing that this will
> > involve comparing assembler output from gcc and clang? (Which I'm
> > comfortable with, but a bit concerned because xmlparse.c is a 6k line
> > file...)
>
> It's slightly more involved, but you can continue your binary search  
> within a file by splitting it into two .c files, and moving functions  
> between them.

what worked for me

cat *.[ch] > foo.c

edit until it compiles

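run delta (delta.tigris.org) with a test.sh that's basically like

# exit 0 ("interesting" to delta) only when the gcc build passes the
# test but the clang build of the same source fails it
gcc foo.c -o a.out && ./a.out test.py

if [ $? -eq 0 ]; then
  clang foo.c -o a.out
  if [ $? -eq 0 ]; then
    ./a.out test.py
    [ $? -ne 0 ] && exit 0
  fi
fi
exit 1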

adjust to your needs but this general pattern should work mostly fine

Re: clang with Python 3.1 release31-maint

Tony Arkles
On Wed, Mar 3, 2010 at 12:14 PM, Roman Divacky <[hidden email]> wrote:

>> It's slightly more involved, but you can continue your binary search
>> within a file by splitting it into two .c files, and moving functions
>> between them.
>
> what worked for me
>
> cat *.[ch] > foo.c
>
> edit until it compiles
>
> run delta (delta.tigris.org) with a test.sh thats basically like
[snip]

Results from doing this might take a few days.  I've tried a few
different approaches and have discovered that expat is probably a good
test case for "edge cases" :).  The two files I've narrowed it down to
are super complicated with piles of macros and #ifdefs everywhere...
Frankly, I'm not that surprised that something busts during the
compile process... (~7kloc between the two of them)

At any rate, multidelta (an alternative to the "cat *.[ch]"
suggestion, since doing that results in thousands of compile errors)
has problems with the source files -- it fails right after doing the
"topformflat" process before even trying to reduce.  And trying the
"manually separate the file into two .c files" process doesn't work so
well either because of all the macro definitions.

I'm going to try to take another stab at this tomorrow though.

Cheers, thanks everyone for the help

Re: clang with Python 3.1 release31-maint

Seo Sanghyeon-3
In reply to this post by David Chisnall
2010/3/4 David Chisnall <[hidden email]>:
> These are almost certainly cases of clang using the wrong calling convention.

If it is a calling convention bug, has anyone tried this?

Automatically test calling conventions of C compilers
http://code.google.com/p/quest-tester/

This may be easier than reducing from complicated code.

--
Seo Sanghyeon