[macstl-dev] Questions on valarray use : non-aligned buffers

Arman Garakani arman at alum.mit.edu
Wed Sep 21 21:39:53 WST 2005

  • Previous message: [macstl-dev] Questions on valarray use : non-aligned buffers
  • Next message: [macstl-dev] Questions on valarray use : non-aligned buffers
  • Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]


On Sep 21, 2005, at 7:16 AM, Glen Low wrote:
Stéphane:

On 20/09/2005, at 11:50 PM, Stéphane Letz wrote:
It seems that valarray functions do not support non-aligned buffers.
Is it a feature that could make sense to add in a future version?

(Apple vecLib code uses scalar code in the case of non-aligned buffers.)

http://developer.apple.com/documentation/Performance/Conceptual/vDSP/ref_chap/chapter_4.1_section_244.html

macstl gets its insane speed partly from not needing to check the  
buffers for alignment. valarray and statarray were defined first and  
since they encapsulate the details of element storage, I could then  
guarantee their alignment. Now that I've defined the refarray class,  
I cannot guarantee the alignment of data beforehand -- in the next  
version I'll likely insert an assert to that effect.

Rather than slow down all macstl code with a runtime check for  
alignment, I'd suggest that you check for alignment yourself and act  
accordingly. E.g. in pseudo-code:

if (a is aligned and b is aligned) then use macstl;
else if (a and b are relatively aligned) then peel off the initial
sequence, and use macstl on the rest;
else use regular arithmetic;

In your case you may find one or two of the above situations never  
arise, so you lose less speed checking for them.
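
The three-way check in the pseudo-code could be sketched in C++ roughly as follows. This is only an illustration: every name here (kVectorBytes, classify, peel_count, add_in_place) is hypothetical and not part of macstl, a 16-byte vector width is assumed, and the aligned path is a plain loop standing in for whatever macstl expression you would actually use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed vector width in bytes (16 for AltiVec and SSE registers).
constexpr std::size_t kVectorBytes = 16;

enum class Alignment { Both, Relative, None };

// Byte offset of p past the previous 16-byte boundary.
inline std::size_t misalignment(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % kVectorBytes;
}

// Sort a pair of buffers into the three cases from the pseudo-code.
inline Alignment classify(const void* a, const void* b) {
    const std::size_t ma = misalignment(a);
    const std::size_t mb = misalignment(b);
    if (ma == 0 && mb == 0) return Alignment::Both;
    if (ma == mb) return Alignment::Relative;  // same offset: peelable
    return Alignment::None;
}

// Leading elements to process with scalar code until p is aligned.
inline std::size_t peel_count(const float* p, std::size_t n) {
    const std::size_t m = misalignment(p);
    assert(m % sizeof(float) == 0);  // assumes p is at least float-aligned
    const std::size_t peel = (m == 0) ? 0 : (kVectorBytes - m) / sizeof(float);
    return peel < n ? peel : n;
}

// Computes a[i] += b[i] for i in [0, n).
inline void add_in_place(float* a, const float* b, std::size_t n) {
    std::size_t i = 0;
    switch (classify(a, b)) {
    case Alignment::Both:
        break;  // nothing to peel
    case Alignment::Relative:
        // Peel off the initial sequence with regular arithmetic.
        for (const std::size_t peel = peel_count(a, n); i < peel; ++i)
            a[i] += b[i];
        break;
    case Alignment::None:
        for (; i < n; ++i) a[i] += b[i];  // regular arithmetic throughout
        return;
    }
    // a + i and b + i are both 16-byte aligned here. This scalar loop is a
    // placeholder for the vectorized (e.g. macstl) call on the remainder.
    for (; i < n; ++i) a[i] += b[i];
}
```

If one of the cases never arises in your data, the corresponding branch can simply be dropped from classify.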

If you need to minimize duplication of arithmetic, you might be able
to put the refarray code in its own module, and compile it both with
and without Altivec (or SSE).


Not so fast,

Byte arrays are still more common in image data than most realize. The
problem with the 3 cases you outlined is that while the first and the
second differ only slightly in speed, the third suffers significantly.
To illustrate, imagine processing a user-specified window on a single-
or multi-plane byte image. Processing the image data inside the window
-- obviously depending on what operation you are performing -- will
vary by a small percentage between the first and second cases, but
quite possibly by an order of magnitude in the third. So from a user
interface point of view, the user will get a radically different "feel"
for the performance depending on where the window is!! Checking for
alignment is not slow: its cost may degrade best-case performance a
bit, but the average and worst cases improve. A good approach in the
third case is to copy into aligned buffer(s). Certainly for algorithm
designers, designing your data size and flow so that the data is
stored properly aligned is critical.
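
As a rough illustration of the copy-into-aligned-buffer idea for a window on a byte image, here is a minimal C++ sketch. The class name AlignedWindow, its interface and the 16-byte alignment are all assumptions made for this example; nothing here is macstl API. Each row of the window is copied so that it starts on a 16-byte boundary, with the row stride padded up to a multiple of 16.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Assumed vector alignment in bytes (AltiVec and SSE both use 16).
constexpr std::size_t kAlign = 16;

// Hypothetical helper: owns a copy of an image window in which every row
// starts on a kAlign-byte boundary.
class AlignedWindow {
public:
    // image: top-left byte of the full image; image_stride: bytes per image row.
    // (x0, y0): window origin in pixels; width, height: window size in pixels.
    AlignedWindow(const unsigned char* image, std::size_t image_stride,
                  std::size_t x0, std::size_t y0,
                  std::size_t width, std::size_t height)
        : width_(width), height_(height),
          stride_((width + kAlign - 1) / kAlign * kAlign),  // pad each row
          storage_(stride_ * height + kAlign - 1)           // room to align
    {
        assert(image != nullptr);
        // Skip forward to the first kAlign-byte boundary inside storage_.
        const auto addr = reinterpret_cast<std::uintptr_t>(storage_.data());
        base_ = storage_.data() + (kAlign - addr % kAlign) % kAlign;
        // Copy the window row by row; source rows may start at any offset.
        for (std::size_t y = 0; y < height_; ++y)
            std::memcpy(row(y), image + (y0 + y) * image_stride + x0, width_);
    }

    // Copying would reallocate storage_ and lose the alignment of base_.
    AlignedWindow(const AlignedWindow&) = delete;
    AlignedWindow& operator=(const AlignedWindow&) = delete;

    unsigned char* row(std::size_t y) { return base_ + y * stride_; }
    std::size_t width() const { return width_; }
    std::size_t height() const { return height_; }
    std::size_t stride() const { return stride_; }

private:
    std::size_t width_, height_, stride_;
    std::vector<unsigned char> storage_;  // over-allocated, zero-filled
    unsigned char* base_;                 // first aligned byte in storage_
};
```

The copy costs one pass over the window, so it pays off mainly when the operation itself is more expensive than a memcpy or the window is processed more than once; results for in-place operations would still have to be copied back.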

my 2 cents






