[macstl-dev] Re: -faltivec without -maltivec copying; gcc's altivec attributes
Glen Low
glen.low at pixelglow.com
Thu Aug 4 18:53:08 WST 2005
On 04/08/2005, at 9:04 AM, Devang Patel wrote:
>
> On Aug 3, 2005, at 5:30 PM, Glen Low wrote:
>
>> Dear All
>>
>> Compiling Altivec code on Apple gcc 4.0 with -faltivec but no
>> -maltivec produces more efficient code due to CSE, but has its own
>> problems. The compiler regularly inserts a non-inlineable call to
>> memcpy or memset when a structure containing a vector is copied
>> (e.g. in C++ through a trivial copy constructor), and then warns
>> you with "vectorised memcpy disabled due to use of -faltivec
>> without -maltivec".
>>
>> To the compiler writers: why can't the memcpy be inlined with
>> non-Altivec instructions, preserving the intent of leaving a
>> non-Altivec codepath? (The equivalent of __builtin_memcpy?)
>
> GCC uses heuristics (based on the number of bytes being copied) to
> decide whether a highly optimized system call is worth its
> overhead. If it is not, it emits inlined non-Altivec instructions
> (or Altivec instructions, if vector instructions are available).
Strange, I still get the call to memcpy when I turn up the inline
limit, e.g. with -finline-limit=50000.
Given that my structure is only the size of a vector and is already
aligned, any system call is going to be slower than an inline copy.
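To make the case concrete, the kind of structure in question looks roughly like the sketch below. It is an illustration only: vec_box and copy_box are hypothetical names, not macstl types, and GCC's generic vector_size extension stands in for an AltiVec vector so that the sketch builds on non-PowerPC targets (it needs a C++11 compiler for alignof and static_assert). It shows that the object is a single aligned 16-byte unit with a trivial copy; it does not reproduce the memcpy call itself.

```cpp
#include <cassert>
#include <type_traits>

// Assumption: a 16-byte generic vector stands in for "vector unsigned int".
typedef unsigned int v4ui __attribute__((vector_size(16)));

// Hypothetical wrapper (not a macstl type) with no user-defined copy
// constructor, so the compiler generates a trivial one.
struct vec_box {
    v4ui data;
};

// The whole object is one vector wide, vector-aligned and trivially copyable.
static_assert(sizeof(vec_box) == 16, "one vector wide");
static_assert(alignof(vec_box) == 16, "vector aligned");
static_assert(std::is_trivially_copyable<vec_box>::value, "trivial copy");

// The copy under discussion: a compiler may lower this either to inline
// loads and stores or to a call to memcpy.
vec_box copy_box(const vec_box& src) {
    vec_box dst = src;  // implicit trivial copy constructor
    return dst;
}
```

Comparing the assembly for copy_box (gcc -S) with and without -maltivec is one way to see which lowering the compiler picked.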
>> The problem for me is that gcc often produces more efficient code
>> when it doesn't have a user-defined copy constructor -- almost all
>> the objects in macstl have no user-defined copy constructor
>> because of this -- presumably because it provides more
>> opportunities to elide the copy. But the few occasions when it
>> can't elide the copy, the compiler pessimizes the code even more
>> by inserting the non-inlined call to memcpy. The only way I can
>> avoid that is to define my own copy constructor but I potentially
>> lose general efficiency.
>>
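The workaround mentioned above (defining the copy operations by hand) might look like the following minimal sketch. The names vec_box_explicit and lane_sum are hypothetical, and GCC's generic vector_size extension again stands in for an AltiVec vector type. The intent is that the member is copied as a single vector value rather than as a block of bytes; whether Apple gcc 4.0 then avoids the memcpy call under -faltivec alone is not something this sketch verifies.

```cpp
#include <cassert>

// Assumption: a 16-byte generic vector stands in for "vector unsigned int".
typedef unsigned int v4ui __attribute__((vector_size(16)));

// Hypothetical wrapper with user-defined copy operations.
struct vec_box_explicit {
    v4ui data;

    explicit vec_box_explicit(v4ui v) : data(v) {}

    // Copy the member as a vector value instead of relying on the
    // compiler-generated trivial (block) copy.
    vec_box_explicit(const vec_box_explicit& other) : data(other.data) {}

    vec_box_explicit& operator=(const vec_box_explicit& other) {
        data = other.data;
        return *this;
    }
};

// Passing an lvalue by value goes through the user-defined copy constructor.
unsigned int lane_sum(vec_box_explicit b) {
    return b.data[0] + b.data[1] + b.data[2] + b.data[3];
}
```

The cost noted in the original message still applies: once the copy constructor is user-defined, the type is no longer trivially copyable, which may reduce the compiler's freedom to elide copies elsewhere.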
>> Further questions:
>>
>> 1. Any flags or options to get this memcpy inlined?
>
> No.
>
>> 2. Any attributes I can tag on a structure so that it will have
>> the memcpy inlined?
>
> No.
>
>> 3. Any macros or any other automatic way to detect if -faltivec
>> was invoked without -maltivec?
>
> When -faltivec is used, GCC sets the builtin macro __APPLE_ALTIVEC__ to 1.
Yes, but the macro is still defined when -maltivec is passed as well, so
it cannot tell the two cases apart.
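One way to investigate is to report which AltiVec-related macros the compiler defines under each flag combination. The sketch below is a probe, not an answer: __APPLE_ALTIVEC__ comes from the reply above, while __ALTIVEC__ and __VEC__ are macros GCC defines when AltiVec code generation is enabled. Whether those two stay undefined under -faltivec alone on Apple gcc 4.0 is an assumption that would need checking. The function name altivec_macro_report is hypothetical.

```cpp
#include <cassert>
#include <string>

// Builds a one-line report of which AltiVec-related macros are defined
// for the current compilation flags (1 = defined, 0 = not defined).
std::string altivec_macro_report() {
    std::string report;

#if defined(__APPLE_ALTIVEC__)
    report += "__APPLE_ALTIVEC__=1 ";
#else
    report += "__APPLE_ALTIVEC__=0 ";
#endif

#if defined(__ALTIVEC__)
    report += "__ALTIVEC__=1 ";
#else
    report += "__ALTIVEC__=0 ";
#endif

#if defined(__VEC__)
    report += "__VEC__=1";
#else
    report += "__VEC__=0";
#endif

    return report;
}
```

Building this once with -faltivec and once with -faltivec -maltivec, then comparing the two reports, would show whether any of these macros separates the two cases.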
>
>> 4. Looking at the output of "gcc -dM -E - -faltivec < /dev/null",
>> what is:
>>
>> __attribute__ ((altivec (vector__)))
>> __attribute__ ((altivec (element__)))
>> __attribute__ ((altivec (bool__)))
>> __attribute__ ((altivec (pixel__)))
>>
>> I didn't find these attributes documented anywhere.
>
> Implementation details :). It is a trick that is used to translate
> context sensitive "vector" keyword.
OK, but what about __attribute__ ((altivec (element__))), which doesn't
correspond to any context-sensitive keyword?
Cheers, Glen Low
---
pixelglow software | simply brilliant stuff
www.pixelglow.com
aim: pixglen