[macstl-dev] CSE and the single programmer
glen.low at pixelglow.com
Wed Jul 20 08:54:37 WST 2005
Apple gcc 4.0 is leaps and bounds better than 3.3 for CSE (common
subexpression elimination). For example, in macstl 0.3.x I can now write
valarray <float> a;
valarray <float> b = a + a + a;
and have one lvx (or lfs) within the inner loop, instead of the 3 lvx
generated when the compiler can't track the identical origins of the
three uses of a.
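To make the one-load-versus-three point concrete, here is a minimal sketch. It uses std::valarray rather than macstl's valarray, and the function names are made up for illustration. The second function is only a hand-written picture of what ideal CSE would leave in the inner loop; it is not what gcc or macstl actually emits.

```cpp
#include <cassert>
#include <cstddef>
#include <valarray>

// Expression form, as in the post: a appears three times in the source,
// so a naive translation loads each element of a three times.
std::valarray<float> triple_expr(const std::valarray<float>& a) {
    return a + a + a;
}

// Hand-written equivalent of the loop after ideal CSE (illustrative only):
// each element of a is loaded once and reused for both additions.
std::valarray<float> triple_one_load(const std::valarray<float>& a) {
    std::valarray<float> b(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float x = a[i];  // the single load per element
        b[i] = x + x + x;
    }
    return b;
}

// Sanity check that the two forms compute the same values.
bool forms_agree(const std::valarray<float>& a) {
    const std::valarray<float> e = triple_expr(a);
    const std::valarray<float> h = triple_one_load(a);
    assert(e.size() == h.size());
    for (std::size_t i = 0; i < e.size(); ++i) {
        if (e[i] != h[i]) return false;
    }
    return true;
}
```

Whether the compiler really collapses the loads can only be confirmed by reading the generated assembly (for example with -S) and counting the lvx or lfs instructions in the inner loop.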
Unfortunately, the CSE is fairly finicky. I've been chasing better
CSE for macstl and discovered that any of the following can make it
worse. The pattern is that longer expressions cause all loads and code
to be generated, while shorter expressions cause only the minimal
loads and code implied by CSE:
* -maltivec, -mcpu=G5 even on scalar code (possibly because
-faltivec without -maltivec affects how temporaries are generated??)
* the number of temporaries generated within the expression, even if
the load/store to memory is eventually optimized out.
* the phase of the moon
Since it looks like the compiler gives up CSE at a certain length of
expression rather than with a definite combination of options/usage,
it feels like there's some sort of "maximum length of CSE'able
expression" flag in gcc.
To all you gcc gurus, is there such a flag?
Cheers, Glen Low
pixelglow software | simply brilliant stuff