[macstl-dev] Re: Good news (finally) about gcc 4.0 problems
lipovsky at skycomputers.com
Sat Sep 24 21:05:12 WST 2005
On Sat, 24 Sep 2005, Glen Low wrote:
>> Seriously, the good news is that it strongly appears that I was able to
>> solve the problems with FSF's gcc 4.0 generating bad code on Linux PPC. I
>> am very thankful to Andrew Pinski (a gcc maintainer) for his suggestion.
>> Essentially all I needed to do was to modify altivec.h to change all
>> instances of simple "inline" into "INLINE" defined as "inline __attribute__
>> ((always_inline))", a familiar macro.
>> I know, I know: I should've noticed it earlier. Sometimes you miss things
>> that, as you think later on, were staring you in the face (but who would
>> doubt a standard header?).
>> So, to reiterate, the stuff that previously ran so badly (in terms of
>> generating correct results) now runs *well and fast*.
>> I am going to implement similar changes in the altivec.h file that belongs
>> to gcc 3.4. Also, I am going to add some macro logic to the config.h file
>> to conditionally include the appropriate versions of the manually changed
>> altivec.h. We should not use the standard altivec.h header.
>> Andrew Pinski mentioned in a (later) email to me that gcc 4.0.3 will
>> possibly have the fast header. He also claimed in the letter before the
>> above that the future FSF gcc 4.1 should expand the vec_* routines into
>> builtins directly, avoiding going through the definitions in altivec.h
>> (just like what Apple gcc does). We'll see.
>> At some point I still want to submit a bug report about gcc 4.0. As it
>> stands right now, however, the "INLINE" macro is a very successful hack.
> That's good news. Here's a suggestion that appeals to my hacker self but
> offends my architect self:
> Before we #include <altivec.h>, we could #define inline INLINE, and then
> after it #undef inline. Don't know if that would work (does #defining a macro
> which would #include itself cause the compiler conniptions, or does it use
> the reserved word correctly in the 2nd invocation? does #undefing a macro
> that references a reserved word go back to allowing the reserved word?), but
> it might avoid having to muck around with the innards of a standard header.
That is a perfectly valid suggestion and it works fine. The preprocessor
does not expand a macro recursively: the "inline" inside the expansion is
left alone as the reserved word, and after the #undef the plain reserved
word is back. My initial hesitation about doing something like that was
that altivec.h already contains some
"inline __attribute__((always_inline))" constructs (see the altivec.h files
I sent earlier), so I was afraid the compiler would not accept the
duplicated attribute in these expressions. Well, apparently it does (for
both 3.4 and 4.0). Good hack!
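
For the record, here is a minimal sketch of the #define-#undef hack in isolation. It assumes a GCC-compatible compiler, and the three small functions are hypothetical stand-ins for the contents of altivec.h (the real header would be #included at that point instead), so that the sketch builds on any platform. It covers the two cases discussed above: a plain "inline", and an "inline" that already carries the attribute and so ends up with it twice.

```c
/* The "familiar macro" from earlier in the thread. */
#define INLINE inline __attribute__((always_inline))

/* Redirect the reserved word for the duration of the header.
 * Expansion is not recursive: "inline" expands to INLINE, which expands
 * to "inline __attribute__((always_inline))", and the "inline" in that
 * result is not expanded again. */
#define inline INLINE

/* --- stand-in for: #include <altivec.h> (hypothetical functions) --- */

/* Plain "inline": becomes "inline __attribute__((always_inline))". */
static inline int plain_add(int a, int b) { return a + b; }

/* Already forced: the attribute now appears twice after expansion. */
static inline __attribute__((always_inline)) int forced_add(int a, int b)
{
    return a + b;
}

/* --- end of stand-in --- */

/* Give the reserved word back to the rest of the translation unit. */
#undef inline

/* From here on "inline" is the ordinary reserved word again. */
static inline int after_undef(int a) { return a + 1; }
```

Calling plain_add(1, 2), forced_add(3, 4) and after_undef(5) from a test program should behave exactly as without the macro. The only intended difference is that the first two are always inlined.
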
> Given that gcc 3.4 has no problems with inlining, and the non-inlining bug
> would be fixed in the gcc 4.0.x branch, we shouldn't be incorporating the
> modified <altivec.h> in the standard macstl distribution. Otherwise we're
> saddled with having to synchronize our version with theirs.
This is where you slightly misunderstood me. I claimed that gcc 3.4 also
*fails* to inline properly once the code being compiled reaches a certain
complexity, just as 4.0 does. The important difference is that gcc 3.4
still produces code that gives *correct* results, whereas 4.0 generates bad
code (i.e. the output of its calculations is nonsensical).
Again, the above #define-#undef hack will be implemented in vec_altivec.h
to work for all compilers that include the standard altivec.h header, thus
solving our former problems.