[macstl-dev] Re: Question

Glen Low glen.low at pixelglow.com
Fri Jul 15 17:35:11 WST 2005

  • Previous message: [macstl-dev] Opteron benchmarks
  • Next message: [macstl-dev] Re: Question
  • Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]


Ilya:

On 02/06/2005, at 8:11 AM, Ilya Lipovsky wrote:

> Glen,
>
> Please accept my apology for belated reply. I cannot answer your  
> gcc 3.4 related questions, as I am not working with it. Currently,  
> however, for optimization's sake add the following lines (in  
> addition to Mike's) to config.h:
>
>         // enable templated function classes to be expanded into
>         // AltiVec code by default
>         #define __VEC__
>         // maximize inlining
>         #define inline inline __attribute__ ((always_inline))
>
> The last directive is essential due to the compiler being stubbornly  
> lazy about inlining certain nested template functions (such as the  
> complex fused multiply-add when used as optimization in certain  
> situations).

You are right about this, although I'd been stubbornly resisting this  
for a while.

The main reasons why this will be good:

1.    Sometimes the compiler is stubborn about inlining functions,  
even at insanely high inlining levels (as you and some others have  
said) that cause the compiler to have heart attacks on slow machines.
2.    Clients may not want to inline the rest of their code.
3.    When using -faltivec without -maltivec on Apple gcc, the  
compiler refuses to inline code that contains AltiVec ops, presumably  
to allow running on a non-AltiVec machine (given the right runtime  
detection). Only the always_inline attribute seems to overcome this.
4.    Existing practice. The ppc_intrinsics.h etc. all use  
always_inline too.

Thus 0.3.1 or 0.3.2 will have always_inline turned on, in the  
following fashion:

A.    We'll define a macro called ALWAYS_INLINE (any other  
suggestions?) in config.h and define it appropriately for gcc and  
Visual C++ (and possibly ICC). Hopefully the attribute will work in  
the same syntactic position as inline already does in C++. (I don't  
want to redefine the built-in "inline" because it would conflict with  
client code at the least, and I feel it's dangerous to redefine  
built-ins.)
B.    We'll have to tag ALL candidate functions with this, even the  
ones within classes and class templates that are implicitly inline,  
in order to force them to be inlined. In general this should be OK  
since a large part of them are generated via private macros anyway.
C.    A clean compile with no explicit inline tuning at the compiler  
level should still provide the same performance level as the current  
high inlining levels.
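
To make points A and B concrete, here is a rough sketch of how such a macro might look and how it would be applied. This is only an illustration, not the code that will ship: the exact spelling per compiler is my assumption (gcc's always_inline attribute next to inline, Visual C++'s __forceinline), and multiply_add, scale_and_offset and DEFINE_BINARY_FUNCTOR are hypothetical names made up for the example.

```cpp
// Hypothetical sketch of the proposed config.h macro; not macstl's actual code.
#if defined(_MSC_VER)
    // Visual C++ (and compilers emulating it) spell forced inlining __forceinline.
    #define ALWAYS_INLINE __forceinline
#elif defined(__GNUC__)
    // gcc (and compilers emulating it) take an attribute alongside inline.
    #define ALWAYS_INLINE inline __attribute__((always_inline))
#else
    // Unknown compiler: fall back to an ordinary inline hint.
    #define ALWAYS_INLINE inline
#endif

// A free function tagged with the macro in the position "inline" would occupy.
ALWAYS_INLINE float multiply_add(float a, float b, float c)
{
    return a * b + c;
}

// Members defined inside a class template are implicitly inline, but per
// point B they still carry the tag so that inlining is forced.
template <typename T>
struct scale_and_offset
{
    T scale;
    T offset;

    ALWAYS_INLINE T operator()(T x) const
    {
        return x * scale + offset;
    }
};

// Functions generated by a private macro pick up the tag in a single place.
#define DEFINE_BINARY_FUNCTOR(NAME, OP)                  \
    template <typename T>                                \
    struct NAME                                          \
    {                                                    \
        ALWAYS_INLINE T operator()(T lhs, T rhs) const   \
        {                                                \
            return lhs OP rhs;                           \
        }                                                \
    };

DEFINE_BINARY_FUNCTOR(plus_functor, +)
DEFINE_BINARY_FUNCTOR(times_functor, *)
```

Since the macro carries its own inline keyword, client code is left alone and the built-in "inline" is never redefined. On a compiler the macro does not recognise, it simply degrades to plain inline.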


Cheers, Glen Low


---
pixelglow software | simply brilliant stuff
www.pixelglow.com
aim: pixglen
