[macstl-dev] not all rosey in the gcc-4.0.0 land

Ilya Lipovsky lipovsky at skycomputers.com
Sun Jul 3 03:10:01 WST 2005



>> 
>> Could you please try that with valarray<stdext::complex <float> > ? Because 
>> this is where my code fails.
>> 
>> -Ilya
>
> Sure thing.
>
>        using namespace stdext;
>        valarray <complex <float> > v1 (complex <float> (1.0f,2.0f), 100);
>        valarray <complex <float> > v2 (complex <float> (3.0f,4.0f), 100);
>        std::cout << (v1 * v2).sum ();
>
> produces
>
> (-500,1000)
> benchmark has exited with status 0.
>
> on my Mac as expected.
>
> Perhaps the error occurs only with particular values of v1, v2, etc.? (BTW,
> from recollection, the complex multiply-then-sum is also optimized to use
> some combination of vectorized fma, so any error would start at
> valarray_altivec.h:154. To test whether that code is involved, insert a
> std::cout << "x" in the static "call" function.) You can do a random search
> of the problem space by looking at exhaustive.cpp and configuring it with
> the right functor template, stdext::accumulator <stdext::plus>.
>

I'd like to reiterate that the expression above works just fine with my
code as well, but only at -O0 and -O1, *not* at -O2 or -O3.
Again, I disassembled the code and stepped through it instruction by
instruction to follow its flow. Even with some C++ code rearrangement inside
complex_fma, the same code is generated at -O2 and -O3: a PPC
decrement-counter-and-branch into itself (<label+offset>: bdnz label+offset),
i.e. an empty loop that keeps decrementing the counter until it reaches zero.
Afterwards it fp-loads the supposedly calculated value and fp-stores it into
my variable, which then contains gibberish.

With -O1 the loop looks and behaves perfectly normally (I'd say, even
beautifully).

I tried rewriting the multiplies_plus functor in different ways, but
all to no avail. I am having trouble even pinpointing what exactly
causes the overoptimization. Still investigating.

-Ilya


