[macstl-dev] Re: Question
Glen Low
glen.low at pixelglow.com
Tue May 17 08:29:40 WST 2005
Ilya,
On 17/05/2005, at 4:51 AM, Ilya Lipovsky wrote:
Hi Glen,
I think your element_cast <> may not be such a bad idea. However, I
am not sure it is the best, either. I think the right idea in this
case is to actually extend the chunking mechanism to accommodate
varying types, so that they can be combined natively within one loop.
Why do I believe this is a better idea? Because, for example, in the
case provided in my previous email element_cast <> will convert a
compact representation into a strided one. This will require 2 extra
vmrghw instructions + 1 load per iteration (to make a <float> into
<complex <float>> ) as opposed to simply loading the 4 floats
natively and multiplying them with the 2 registers that contain 4
complex<float>'s. I am not even counting the wasted vperm's of the
operator* <complex <float> >. We don't need the extra vperm's and
vmaddfp's in a natively implemented operator* <complex <float>,
float> case. It should be possible to implement the operator as
follows:
template <> struct multiplies <macstl::vec <float, 4>, macstl::vec <stdext::complex <float>, 4> >
{
    typedef macstl::vec <float, 4> first_argument_type;
    typedef macstl::vec <stdext::complex <float>, 4> second_argument_type;
    typedef macstl::vec <stdext::complex <float>, 4> result_type;

    result_type operator() (const first_argument_type& lhs, const second_argument_type& rhs) const
    {
        using namespace macstl;
        return ..... ; // this is nontrivial ;-)
    }
};
The problem is that macstl::vec (on Altivec) is defined as a 128-bit
quantity corresponding exactly to one vector register. In practice
stuffing anything more spoils gcc 3.3's ability to enregister
macstl::vec -- we need to ensure that it only ever contains one field
of native type in order to get gcc to keep it in registers only.
Therefore vec <complex <float>, 4> can't work -- a complex float is
64 bits, so four of them would make a 256-bit beast, twice the size
of a vector register.
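
To make the size arithmetic concrete, here is a minimal C++ sketch (not macstl code) that checks the lane counts at compile time. The names kRegisterBytes, lanes_float, lanes_complex and bits_for_four_complex are hypothetical, introduced only for illustration, and the sketch assumes a 4-byte float, as on AltiVec targets.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>

// Hypothetical constant: one AltiVec vector register is 128 bits = 16 bytes.
constexpr std::size_t kRegisterBytes = 16;

// How many elements of each type fit in exactly one register.
constexpr std::size_t lanes_float = kRegisterBytes / sizeof(float);
constexpr std::size_t lanes_complex = kRegisterBytes / sizeof(std::complex<float>);

// Storage needed for four complex floats, in bits.
constexpr std::size_t bits_for_four_complex = 4 * sizeof(std::complex<float>) * 8;

static_assert(sizeof(float) == 4, "sketch assumes a 32-bit float");
static_assert(sizeof(std::complex<float>) == 8, "re/im pair is 64 bits");
static_assert(lanes_float == 4, "one register holds 4 floats");
static_assert(lanes_complex == 2, "one register holds only 2 complex floats");
static_assert(bits_for_four_complex == 256, "4 complex floats need two registers");
```

So a register-sized vec can pair 4 floats with only 2 complex floats, which is why the mismatch has to be resolved above the vec level.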
We have to tackle the multiplication at a higher level, at the
valarray expression level.
Now element_cast <> isn't as bad as you think. Consider that the
valarray expression template engine I wrote can actually reconfigure
expressions at compile time for efficiency e.g.
(a * b) + c
actually recomposes the expression to use something like madd (c, a,
b), i.e. what looks like two separate operations in the expression can
be merged into a single one.
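
As a rough illustration of that kind of compile-time recomposition, here is a minimal scalar sketch of an expression template. It is not the macstl engine: Array, Mul, Add, MulAdd, evaluate and the scalar madd helper are all hypothetical names, and madd merely stands in for a fused instruction such as vmaddfp.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scalar stand-in for a fused multiply-add instruction.
inline float madd(float a, float b, float c) { return a * b + c; }

// Hypothetical array type; the expression nodes below only hold references.
struct Array {
    std::vector<float> data;
    float operator[](std::size_t i) const { return data[i]; }
    std::size_t size() const { return data.size(); }
};

// Node for lhs * rhs, evaluated lazily element by element.
template <typename L, typename R>
struct Mul {
    const L& lhs;
    const R& rhs;
    float operator[](std::size_t i) const { return lhs[i] * rhs[i]; }
    std::size_t size() const { return lhs.size(); }
};

// Node for a plain lhs + rhs.
template <typename L, typename R>
struct Add {
    const L& lhs;
    const R& rhs;
    float operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
    std::size_t size() const { return lhs.size(); }
};

// Fused node: one madd per element instead of a multiply then an add.
template <typename A, typename B, typename C>
struct MulAdd {
    const A& a;
    const B& b;
    const C& c;
    float operator[](std::size_t i) const { return madd(a[i], b[i], c[i]); }
    std::size_t size() const { return a.size(); }
};

inline Mul<Array, Array> operator*(const Array& a, const Array& b) { return {a, b}; }
inline Add<Array, Array> operator+(const Array& a, const Array& b) { return {a, b}; }

// Chosen by overload resolution when the left operand of + is a product,
// so (a * b) + c becomes a MulAdd node at compile time.
template <typename A, typename B>
MulAdd<A, B, Array> operator+(const Mul<A, B>& m, const Array& c) {
    return {m.lhs, m.rhs, c};
}

// Walks any expression node once and stores the result.
template <typename E>
Array evaluate(const E& e) {
    Array out;
    out.data.resize(e.size());
    for (std::size_t i = 0; i < e.size(); ++i) out.data[i] = e[i];
    return out;
}
```

With arrays a, b and c in scope, evaluate ((a * b) + c) has the static type MulAdd <Array, Array, Array> for its argument, so the loop body calls madd directly; no intermediate product array is ever built.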
That means
element_cast <complex> (a) * b
need not actually unpack the float a into a complex then multiply by
b, but invoke some sort of merged operation which multiplies 2
complex by 2 float at a time. (The only limitation I see is that the
iterator would need to step through 2 complex at a time, so the float
vector may have to be loaded twice -- it would take a smart loop
unroller in the compiler to see that double load and optimize to a
single one...)
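
To show what such a merged operation saves, here is a scalar C++ sketch using std::complex rather than macstl types. The function names promote_then_multiply and multiply_chunk are hypothetical; the chunk of 2 assumes interleaved re/im storage, so that 2 complex floats fill one 128-bit register.

```cpp
#include <cassert>
#include <complex>

// Naive route: promote the float to a complex, then do a full complex
// multiply (four real multiplies plus an add and a subtract).
inline std::complex<float> promote_then_multiply(float a, std::complex<float> b) {
    std::complex<float> promoted(a, 0.0f);
    return promoted * b;
}

// Merged route for one chunk: 2 floats times 2 complex values.
// Each result needs only two real multiplies, with no shuffling of
// real and imaginary parts.
inline void multiply_chunk(const float a[2],
                           const std::complex<float> b[2],
                           std::complex<float> out[2]) {
    for (int k = 0; k < 2; ++k) {
        out[k] = std::complex<float>(a[k] * b[k].real(), a[k] * b[k].imag());
    }
}
```

For finite inputs both routes give the same values; the merged one just skips the multiplications by the zero imaginary part that promotion introduces.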
The issue then becomes whether it is convenient for users to use
element_cast. The valarray expression engine works on identical types
and has no notion of type promotion (yet). Some people have said
element_cast is rather clunky and would rather have automatic
promotions as in regular C (i.e. float -> complex float, integer ->
float etc.).
My worry from a syntactic point of view is that these conversions
aren't free, more so with SIMD architectures, so there's a need to
highlight expensive conversions. What do you think?
This conversation is interesting, so I'm going to suggest we continue
it on the mailing list.
http://www.pixelglow.com/lists/listinfo/macstl-dev/
The question, then, is how hard it is to implement such a beast. What
is your opinion?
I don't mind doing some coding as long as my manager(s) approve. I am
just a soldier ;).
-Ilya
OK thanks for the offer. Once we thrash out what you need and what
the others are happy with, we can work something out.
Cheers, Glen Low
---
pixelglow software | simply brilliant stuff
www.pixelglow.com
aim: pixglen