[macstl-dev] [ANN] macstl 0.3.1 -- Extensively re-optimized, runs 450x faster than G4 scalar code

Glen Low glen.low at pixelglow.com
Tue Sep 6 18:09:49 WST 2005



Dear All
macstl is a portable SIMD (single instruction multiple data) toolkit  
that massively accelerates array-based code. It features fast  
transcendental and integer division functions, complex number  
arithmetic and cross-platform programming, all in an easy-to-use  
syntax. After extensive re-optimization, the new 0.3.1 version  
features new Linux x86 and Cygwin support, a contributed complex  
conjugate function, a much-requested refarray class, optimizations  
for SSE2 and lots more. For Apple developers, 0.3.1 now runs even
faster when compiled with -faltivec but without -maltivec, and has
improved macstlizer Altivec-to-SSE conversions for the PowerPC-Intel
transition.

http://www.pixelglow.com/macstl/
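
As a rough illustration of the array-based syntax described above, here is a minimal sketch written against the standard std::valarray only. It assumes macstl's valarray follows the standard interface; no macstl headers or names are used, and the helper scaled_sine_plus is hypothetical. It shows the expression style, not macstl's actual API or its speedups.

```cpp
#include <cassert>
#include <cmath>
#include <valarray>

// Hypothetical helper (not part of macstl): computes a * sin(x) + y
// element-wise, with no explicit loop in user code.
std::valarray<float> scaled_sine_plus(float a,
                                      const std::valarray<float>& x,
                                      const std::valarray<float>& y)
{
    // Binary valarray operators are only defined for equal-sized operands.
    assert(x.size() == y.size());

    // One whole-array expression: a transcendental function, a scalar
    // multiply and an array add. A SIMD-aware valarray implementation is
    // free to evaluate this several elements at a time.
    return a * std::sin(x) + y;
}
```

Calling scaled_sine_plus(2.0f, x, y) on two equal-sized arrays returns a new array of the same size; the caller never writes a per-element loop, which is what leaves the library room to vectorize.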

macstl is rocket fuel for your data processing code -- the Opteron
on Windows x64 cruises in at 9.8x faster than scalar code, and the G4
on Linux blasts forward at 450x faster than scalar code. No, it’s not
a misprint; here, I’ll spell it out -- four-hundred-and-freaking-fifty
times faster than scalar!!

Opteron on Windows x64: http://www.pixelglow.com/lists/archive/macstl-dev/2005-July/000114.html

G4 on Linux: http://www.pixelglow.com/lists/archive/macstl-dev/2005-September/000142.html

macstl requires Mac OS X 10.3 or 10.4, Windows 2000, XP or Server  
2003, Linux PPC or x86, or Cygwin 1.5. The library is open-source and  
free when derived code is reciprocated, otherwise it is $99 for a
Personal License, $499 for a Corporate License and $2499 for a
Redistributable License.

List of New Features

- Fixed class scope vector typedefs, missing PowerPC intrinsics  
header, vector initializer syntax for FSF 3.4 [ILi*].
- Added complex conj function for vec and valarray [ILi*].
- Improved valarray expression performance: v1 [slice].
- Improved valarray code generation: CSE, inlining limits, literal
terms, array term elements, statarray construction, compiling
-faltivec without -maltivec for Apple gcc 4.0.
- Added refarray class [PBa].
- Fixed buffer overflow in integral valarrays for SSE2; added  
optimizations for valarray expressions: v1 >> k and v1 << k for SSE2  
[MSh].
- Fixed accumulate array dispatch, integer constant overflow, literal  
benchmark test for SSE2; fixed chunking iterator pessimization for  
gcc 3.3/4 [ILi, RBe].
- Added makefile for Linux x86 [ILi*].
- Added support for FSF gcc 3.4 on Cygwin 1.5.
- Added differently typed valarray construct and assign from terms,  
valarrays of sized booleans, select with sized booleans [ILi].
- Fixed unix makefile directory.
- Added macstlizer conversions: abs, abss, cmpeq, max, min.
- Improved readme file.
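
For readers unfamiliar with the expression forms named in the list above (v1 [slice], v1 >> k and v1 << k), the following sketch shows what they look like with the standard std::valarray. It assumes macstl accepts the same expressions; the helper names every_nth, shift_left and shift_right are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <valarray>

// Hypothetical helper: the "v1 [slice]" form. Returns every stride-th
// element of v, starting at index 0.
std::valarray<int> every_nth(const std::valarray<int>& v, std::size_t stride)
{
    assert(stride > 0);
    // Number of elements at indices 0, stride, 2*stride, ... below v.size().
    const std::size_t count = (v.size() + stride - 1) / stride;
    return v[std::slice(0, count, stride)];
}

// Hypothetical helper: the "v1 << k" form, shifting every element left by
// k bits (k must be a valid shift count for int, elements non-negative).
std::valarray<int> shift_left(const std::valarray<int>& v, int k)
{
    return v << k;
}

// Hypothetical helper: the "v1 >> k" form, shifting every element right.
std::valarray<int> shift_right(const std::valarray<int>& v, int k)
{
    return v >> k;
}
```

Each helper is a single whole-array expression, which is the shape of code the SSE2 shift and slice optimizations listed above are aimed at.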

Special thanks go to Ilya Lipovsky (SKY Computers) and Rene Bertin
for their immense help, testing and code contributions on the full
Linux port. That's what open source is all about, folks!

Cheers, Glen Low


---
pixelglow software | simply brilliant stuff
www.pixelglow.com
aim: pixglen
