The macstl gcc rematch

The fight you’ve all been waiting for, after that last battle of the libraries. In the red corner, the latest gcc 3.3 libstdc++, courtesy of Apple’s December 2002 gcc Updater, replacing the old gcc 3.1 codebase. And in the blue corner, a newly retuned macstl 0.1.2. The ground remains the same: a dual processor Power Macintosh G4. I have also added a couple of new benchmarks to exercise our participants.

Operations per Clock Tick (larger = faster)

operation                gcc 3.3 libstdc++    macstl 0.1.2, Altivec off    macstl 0.1.2, Altivec on
inline arithmetic                      807                          888                        3355
inline transcendental                   74                           79                        1041
outline transcendental                  90                           96                          39
inline scalarization                  1474                         1481                        4329
inline predication                     186                          595                        3448
inline slice                          2958                         2890                        4065
unchunked apply                        408                          403                         406
unchunked shift                       1587                         1086                        1470
unchunked mask                         251                          154                         163
unchunked indirect                     429                          421                         534

New Benchmarks

The inline predication benchmark tests relational min and max expressions of the form (v1 == v2).min (), which are optimized in macstl 0.1.2. The inline slice benchmark is the same as the old unchunked slice benchmark; it was renamed because slicing is now chunked and inlined. The unchunked shift benchmark has been in the source code since macstl 0.1, but while it crashed the gcc 3.1 libstdc++, it now works fine in 3.3.
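For readers unfamiliar with the idiom, here is a minimal sketch of what a relational min or max expression computes, written against plain std::valarray rather than macstl. The helper names all_equal and any_equal are made up for illustration and are not part of either library. Both helpers assume the two arrays are non-empty and of equal size, since valarray leaves the other cases undefined.

```cpp
#include <valarray>

// The minimum of a bool-valued array is false as soon as one element is
// false, so (v1 == v2).min () answers "are all elements equal?".
inline bool all_equal(const std::valarray<float>& v1,
                      const std::valarray<float>& v2)
{
    // Materialize the comparison so the sketch does not depend on any
    // particular library's expression-template types.
    const std::valarray<bool> eq(v1 == v2);
    return eq.min();
}

// The maximum is true as soon as one element is true, so
// (v1 == v2).max () answers "is any element equal?".
inline bool any_equal(const std::valarray<float>& v1,
                      const std::valarray<float>& v2)
{
    const std::valarray<bool> eq(v1 == v2);
    return eq.max();
}
```

Because the result is a single bool, an implementation is free to compare a whole chunk of elements at once and test the combined outcome, which is the kind of work a vector predicate is suited to.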

The biggest jump in gcc performance is in inline arithmetic: 81% faster than the previous version. However, macstl without Altivec still keeps its lead at 10% faster. And with Altivec, it speeds away, running from 2.9x to 18.5x faster than gcc on all inline tests except for slicing (4.2x on arithmetic, 14.1x on transcendental, 2.9x on scalarization and 18.5x on predication).

New Optimizations

macstl 0.1.2 specifically targets chunked relational min and max expressions, using Altivec predicates to gain 18.5x speed over gcc in the inline predication test. It even enhances unchunked bool-valued min and max: with Altivec off, macstl is still 3.2x faster than gcc in the same test, and turning Altivec on yields a further 5.8x.

The new slicing algorithms, based on Altivec permutes, also come out on top. The improvements over scalar code are not as dramatic, though: just 37% faster than gcc and 41% faster than macstl without Altivec in the inline slice test.
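As a reminder of what slicing means here, the following sketch takes a strided slice of a plain std::valarray. The function every_other is a made-up example and is not taken from the benchmark source, whose exact slice parameters are not given in this post.

```cpp
#include <cstddef>
#include <valarray>

// Copy elements 0, 2, 4, ... of v into a new array using a std::slice
// with start 0, a computed length, and stride 2.
inline std::valarray<float> every_other(const std::valarray<float>& v)
{
    // Round up so that an odd-sized input keeps its last element.
    const std::size_t count = (v.size() + 1) / 2;
    return std::valarray<float>(v[std::slice(0, count, 2)]);
}
```

A scalar implementation walks the source one element at a time with the given stride. A vector implementation can instead load whole chunks and rearrange their elements, which is where permute instructions come in.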

Mon, 29 Sep 2003. © Pixelglow Software.