[macstl-dev] not all rosey in the gcc-4.0.0 land

Ilya Lipovsky lipovsky at skycomputers.com
Tue Jul 5 07:34:27 WST 2005



> The accumulating loop is found in valarray_altivec.h:154. Some of the things 
> I would try that you may or may not have tried already:
>
> 1.    Throw a spanner into the optimizer works. Usually the optimizer cannot 
> optimize around an output statement or a volatile memory write, so you can 
> try either. E.g. create a global volatile static int, then write to it inside 
> of the loop and various places you think might be overoptimizing. The place 
> which successfully breaks the overoptimization would give you a clue as to 
> what level it's occurring at.
>

With so many templates, the problem is determining the levels.
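
For reference, the volatile-write probe from point 1 could look roughly like 
the sketch below. The names g_probe and sum_with_probe are hypothetical 
stand-ins, not macstl code; the idea is only that the write to a volatile 
object has to be performed on every iteration, so moving the probe from one 
template level to another may show where the overoptimization sets in.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical probe: a global volatile static int, as suggested in point 1.
static volatile int g_probe = 0;

// Hypothetical stand-in for an accumulating loop; this is NOT the macstl code.
template <typename T>
T sum_with_probe(const T* data, std::size_t n, T init)
{
    assert(data != nullptr || n == 0);
    T acc = init;
    for (std::size_t i = 0; i < n; ++i) {
        // Volatile write: the compiler must actually perform it each iteration.
        g_probe = static_cast<int>(i);
        acc += data[i];
    }
    return acc;
}
```

If the result becomes correct with the probe at one level but not at another, 
that narrows down which level is being overoptimized.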

> 2.    If you're getting this error only with sum () and not regular assigns 
> e.g. vr = v1 * v2 or vr = v1 * v2 + v3, then it's a pretty good bet it has 
> something to do with the init parameter in the above. Try changing the 
> parameter declaration there from T init to const T& init, and copying the 
> init to a private init_copy within the function. Try making it volatile etc.
>

Didn't help.
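
To be concrete about what was tried for point 2, here is a minimal sketch of 
the change, assuming an accumulate-style function. The name sum_from and its 
signature are hypothetical, not the actual macstl declaration, and the 
volatile copy assumes T is a plain arithmetic type.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical accumulate-style function; the real macstl signature may differ.
// Before the change the last parameter was "T init".
template <typename T>
T sum_from(const T* data, std::size_t n, const T& init)
{
    assert(data != nullptr || n == 0);
    // Private copy of init; volatile forces a real store and load.
    volatile T init_copy = init;
    T acc = init_copy;
    for (std::size_t i = 0; i < n; ++i)
        acc += data[i];
    return acc;
}
```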

> 3.    The code at line 154 is called from valarray_algorithm.h:60, there's 
> another place to do 1, 2 and other things to see if this is where the 
> overoptimization happens. This is where the valarray is examined so that only 
> the initial sequence is vectorized, while the tail, left-over elements use a 
> scalar loop (called tail).
>

Line 60 of valarray_algorithm.h contains only comments. I would very much 
appreciate it if you could give me the actual chain of invocation. Right now I 
know one thing for sure: the problem does not happen in 
valarray_altivec.h:154. I commented out the whole structure and yet the 
overoptimization still occurs (i.e. it's somewhere earlier in the chain).
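
As I understand the description in point 3, the split between the vectorized 
initial sequence and the scalar tail has roughly the shape below. This is a 
plain C++ sketch with no AltiVec intrinsics; chunked_sum and kLanes are 
hypothetical names, and the width of 4 assumes four floats per 128-bit vector.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical chunk width: four floats fit in one 128-bit vector.
const std::size_t kLanes = 4;

// Sketch of a chunked sum: whole chunks first, then the left-over tail.
float chunked_sum(const float* data, std::size_t n, float init)
{
    assert(data != nullptr || n == 0);
    const std::size_t body = n - n % kLanes;  // elements covered by whole chunks

    // Initial sequence: one partial sum per lane, as a vector unit would keep.
    float lanes[kLanes] = {0.0f, 0.0f, 0.0f, 0.0f};
    for (std::size_t i = 0; i < body; i += kLanes)
        for (std::size_t k = 0; k < kLanes; ++k)
            lanes[k] += data[i + k];

    // Fold the lane sums into the running total.
    float acc = init;
    for (std::size_t k = 0; k < kLanes; ++k)
        acc += lanes[k];

    // Tail: scalar loop over the left-over elements.
    for (std::size_t i = body; i < n; ++i)
        acc += data[i];
    return acc;
}
```

Either loop is a candidate for a probe, which is why I need to know which 
function actually makes this split.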


> FSF gcc 4.0 release is dated 20 April 2005, and I suspect Apple put in a lot 
> of effort over and beyond that to get it working with Altivec code for Tiger 
> and Xcode 2.1 -- more's the pity they seem to be all for switching to Intel. 
> So we may be better off waiting for 4.0.1 if we can't resolve the 
> overoptimization, and leave 3.4.x as the only supported compiler for YDL at 
> all optimization levels -- according to the gcc.gnu.org site, the 4.0 branch 
> has been frozen as of 13 June in preparation for the 4.0.1 release.
>

I am afraid that we'll have to.

-Ilya


