[macstl-dev] Re: Question
Glen Low
glen.low at pixelglow.com
Fri Jul 15 17:35:11 WST 2005
Ilya:
On 02/06/2005, at 8:11 AM, Ilya Lipovsky wrote:
> Glen,
>
> Please accept my apology for belated reply. I cannot answer your
> gcc 3.4 related questions, as I am not working with it. Currently,
> however, for optimization's sake add the following lines (in
> addition to Mike's) to config.h:
>
> // enable templated function classes to be expanded into AltiVec
> // code by default
> #define __VEC__
> // maximize inlining
> #define inline inline __attribute__ ((always_inline))
>
> The last directive is essential due to the compiler being stubbornly
> lazy about inlining certain nested template functions (such as the
> complex fused multiply-add when used as optimization in certain
> situations).
You are right about this, although I'd been stubbornly resisting this
for a while.
The main reasons why this will be good:
1. Sometimes the compiler is stubborn about inlining functions,
even at insanely high inlining levels (as you and some others have
said) that cause the compiler to have heart attacks on slow machines.
2. Clients may not want to inline the rest of their code.
3. When using -faltivec without -maltivec on Apple gcc, the
compiler refuses to inline code that contains AltiVec ops, presumably
to allow running on a non-AltiVec machine (given the right runtime
detection). Only the always_inline attribute seems to overcome this.
4. Existing practice. The ppc_intrinsics.h etc. all use
always_inline too.
Thus 0.3.1 or 0.3.2 will have always_inline turned on, but in
this fashion:
A. We'll define a macro called ALWAYS_INLINE (any other
suggestions?) in config.h and define it appropriately for gcc and
Visual C++ (and possibly ICC). Hopefully the attribute will work in
the same syntactic position as inline already does in C++. (I don't
want to redefine the built-in "inline" because it would conflict with
client code at the least, and I feel it's dangerous to redefine
built-ins.)
B. We'll have to tag ALL candidate functions with this, even the
ones within classes and class templates that are implicitly inline,
in order to force them to be inlined. In general this should be OK
since a large part of them are generated via private macros anyway.
C. A clean compile with no explicit inline tuning at the compiler
level should still provide the same performance level as the current
high inlining levels.
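A rough sketch of how points A and B might look. The macro name ALWAYS_INLINE is the one proposed above; the compiler checks (__GNUC__, _MSC_VER), the use of __forceinline for Visual C++, and the madd and scaler names are illustrative assumptions only, not the actual macstl config.h:

```cpp
// Hypothetical config.h fragment: pick the strongest inlining hint the
// compiler offers, falling back to plain inline elsewhere.
#if defined(__GNUC__)
// gcc: the attribute sits alongside the inline keyword
#define ALWAYS_INLINE inline __attribute__((always_inline))
#elif defined(_MSC_VER)
// Visual C++: __forceinline is used in place of inline
#define ALWAYS_INLINE __forceinline
#else
#define ALWAYS_INLINE inline
#endif

// Free function tagged with the macro (hypothetical example).
ALWAYS_INLINE float madd(float a, float b, float c)
{
    return a * b + c;
}

// Member function defined inside the class: implicitly inline already,
// but tagged anyway so the compiler is forced to inline it (point B).
struct scaler
{
    float factor;

    ALWAYS_INLINE float operator()(float x) const
    {
        return madd(x, factor, 0.0f);
    }
};
```

Because the macro expands to something that stands where inline would, client code keeps the ordinary meaning of inline and only the library's own functions are forced.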
Cheers, Glen Low
---
pixelglow software | simply brilliant stuff
www.pixelglow.com
aim: pixglen