[macstl-dev] Proposal for mixed complex and real arithmetic

Glen Low glen.low at pixelglow.com
Fri Jul 15 21:27:11 WST 2005



Ilya, All:

On 15/07/2005, at 8:57 PM, Glen Low wrote:

> Hi Ilya, All
>
> You offered about 1.5 months ago to do an extension of complex  
> arithmetic in macstl. If you're still willing and have the time to  
> do it, we'd welcome it as the first major non-Pixelglow extension of  
> macstl and an important test of the extensibility of the  
> fundamental programming.
>
> Brief outline of scope:
>
> The following mixed arithmetic operations should be made possible,  
> where r is a valarray of real numbers (int, float, double etc.) and  
> c is the complex equivalent (std::complex <int>, stdext::complex  
> <float>, stdext::complex <double>). Then,
>
> r + c -> c
> c + r -> c
> r - c -> c
> c - r -> c
> r * c -> c
> c * r -> c
> r / c -> c
> c / r -> c
>
> E.g. valarray <float> + valarray <complex <float> > should yield an  
> expression that can be treated as a valarray <complex <float> >.
>
> Only valarrays of float and valarrays of complex float, as above,  
> will be optimized using Altivec.
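
To make the result types above concrete, here is a minimal, unoptimized sketch of one combination (r + c -> c). It uses std::valarray and std::complex instead of macstl's own types, and the names plus_real_complex and apply_mixed are hypothetical, made up for illustration rather than taken from functional.h.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <valarray>

// Hypothetical mixed-type functor: real + complex -> complex.
// Only the real part is affected by adding a real number.
struct plus_real_complex
{
    typedef std::complex<float> result_type;

    result_type operator()(float r, const std::complex<float>& c) const
    {
        return result_type(r + c.real(), c.imag());
    }
};

// Hypothetical helper: applies a binary functor element by element to
// two valarrays of different element types. The result element type
// is whatever the functor declares as result_type.
template <typename F, typename T1, typename T2>
std::valarray<typename F::result_type>
apply_mixed(const std::valarray<T1>& lhs, const std::valarray<T2>& rhs, F f)
{
    assert(lhs.size() == rhs.size());
    std::valarray<typename F::result_type> out(lhs.size());
    for (std::size_t i = 0; i != lhs.size(); ++i)
        out[i] = f(lhs[i], rhs[i]);
    return out;
}
```

Calling apply_mixed(r, c, plus_real_complex()) with a valarray <float> r and a valarray <complex <float> > c gives back a valarray <complex <float> >, which is the behaviour the table asks for. The other seven combinations would follow the same pattern.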
>
> Things to do:
>
> Based on macstl 0.3 (which has new differently-typed argument  
> functionality to support the above), here are the changes to be done:
>
> A.    [EASY]    functional.h needs to have the arithmetic functors  
> specialized to accept the above combinations. Later we may consider  
> removing this in favor of a generalized functor that works with any  
> 2 different types, but this would mean we'd need moderately  
> complicated typeof simulation to figure out the result type. Once A  
> is done, you should then be able to compile the expressions above  
> and run them, but without optimization.
>
> B.    [MODERATE] valarray_altivec.h needs to have a chunker  
> specialization defined on the right sort of expression, so that it  
> inserts const_chunk_iterator and chunk_begin for the optimization.  
> See valarray_altivec.h:251 etc. for hints. Once you do this, even  
> with the body left empty, you can detect whether the optimization  
> would be called -- e.g. by putting in a simple destructor with a  
> std::cout << "hi I'm here" message and compiling and running the  
> stage A expressions.
>
> C.    [DIFFICULT] We need to define a sensible const_chunk_iterator  
> for the above. Ideally it should be generalizable for all  
> complex/real or real/complex combinations above, and be passed a  
> template template parameter that is the required operation -- see  
> valarray_function.h:529 for a hint. Ideally it should also be  
> random access when its two sub-iterators are random access too, but  
> we may have to check code generation to see if a forward iterator  
> makes more sense. This iterator should yield a vec <complex  
> <float>, 2> and so its complex subiterator is incremented whenever  
> it is incremented, but its real subiterator is incremented every  
> other time. Presumably a high/low indicator will be held in the  
> iterator so that it knows which 2 parts of the real subiterator  
> need to be accessed, and an appropriate lvsl/lvsr/vperm applied.  
> Finally the operator* and operator[] should implement the operation  
> -- you might get away with defining it in terms of the (real, real)  
> function.
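
The stepping rule in C (complex subiterator advances every time, real subiterator every other time, with a high/low flag choosing the half) can be sketched in portable C++ as follows. This is only a scalar model under assumptions: real_chunk and complex_chunk are stand-ins for a __vector float and a vec <complex <float>, 2>, the names mixed_chunk_iterator and times_real_complex are hypothetical, the operation is passed as a plain functor type rather than a template template parameter, and a real Altivec version would select the half with a permute instead of array indexing.

```cpp
#include <complex>
#include <cstddef>

typedef std::complex<float> cfloat;

// Portable stand-ins for the Altivec chunk types (assumptions):
// one real chunk holds 4 floats, one complex chunk holds 2 complex floats.
struct real_chunk    { float  v[4]; };
struct complex_chunk { cfloat v[2]; };

// Forward-only sketch of a mixed real/complex chunk iterator.
template <typename Op>
class mixed_chunk_iterator
{
public:
    mixed_chunk_iterator(const real_chunk* r, const complex_chunk* c, Op op = Op())
        : real_(r), complex_(c), high_(false), op_(op) {}

    // Combine the selected half of the real chunk with the complex chunk.
    complex_chunk operator*() const
    {
        const std::size_t base = high_ ? 2 : 0;
        complex_chunk out;
        for (std::size_t i = 0; i != 2; ++i)
            out.v[i] = op_(real_->v[base + i], complex_->v[i]);
        return out;
    }

    // The complex subiterator moves on every increment; the real
    // subiterator moves only after its high half has been consumed.
    mixed_chunk_iterator& operator++()
    {
        ++complex_;
        if (high_)
            ++real_;
        high_ = !high_;
        return *this;
    }

private:
    const real_chunk*    real_;
    const complex_chunk* complex_;
    bool                 high_;   // false: floats 0-1, true: floats 2-3
    Op                   op_;
};

// Hypothetical operation: real * complex -> complex.
struct times_real_complex
{
    cfloat operator()(float r, const cfloat& c) const
    {
        return cfloat(r * c.real(), r * c.imag());
    }
};
```

Making this random access would mean deriving both the real position and the high/low flag from a single complex chunk index, which is why checking the generated code first seems sensible.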

The other possibility is to engineer a vec <complex <float>, 4> that  
contains 2 __vector floats, and restructure valarray <complex <float> >  
to use this. However, I remember gcc 3.3 having a terrible time  
optimizing structs that contained more than 1 field, as vec  
<complex <float>, 4> would. If you do a test of this structure on 3.4  
and it works acceptably, we can then change valarray <complex <float> >  
to use this.

Quickly declare a simple vecComplexFloat4 struct with 2 __vector  
floats. Declare an operator+ that adds two vecComplexFloat4's. Then  
try this:

vecComplexFloat4 a, b;
vecComplexFloat4 c = (a + a + a) + (b + b + b);

On 3.3, the compiler would usually do a store and then a redundant  
load for each of the temporaries (a + a etc.). Try it on 3.4 and 4.0  
to see if that has changed. If you get positive results on 3.4 and  
4.0, we can adopt that approach instead -- A and B would be  
unchanged, but C would be simpler, at the cost of having to define  
most of the vec <complex <float>, 4> operators.
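
A possible starting point for that test is sketched below. So that it compiles off PowerPC too, it assumes the GCC/Clang vector_size extension as a stand-in for __vector float; on an Altivec build you would swap the typedef for __vector float and the additions for vec_add. The field names lo and hi and the function test_expression are made up for illustration, and the sketch does not decide how the four complex values are laid out across the two vectors.

```cpp
// Stand-in for Altivec's __vector float (assumption): a 16-byte vector
// of 4 floats using the GCC/Clang vector extension.
typedef float vfloat4 __attribute__((vector_size(16)));

// Two-field struct holding 4 complex floats' worth of data.
struct vecComplexFloat4
{
    vfloat4 lo;
    vfloat4 hi;
};

// Field-wise addition; complex addition is element-wise, so this is
// valid whatever the real/imaginary layout turns out to be.
inline vecComplexFloat4 operator+(const vecComplexFloat4& x,
                                  const vecComplexFloat4& y)
{
    vecComplexFloat4 r;
    r.lo = x.lo + y.lo;
    r.hi = x.hi + y.hi;
    return r;
}

// The expression under test: five additions producing several
// temporaries that a good optimizer should keep in registers.
vecComplexFloat4 test_expression(const vecComplexFloat4& a,
                                 const vecComplexFloat4& b)
{
    return (a + a + a) + (b + b + b);
}
```

Compiling test_expression with -O2 -S and reading the assembly should show whether each temporary is stored and reloaded or stays in vector registers.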


Cheers, Glen Low


---
pixelglow software | simply brilliant stuff
www.pixelglow.com
aim: pixglen
