From pauljbaxter at hotmail.com Tue Feb 1 07:11:47 2005
From: pauljbaxter at hotmail.com (Paul Baxter)
Date: Sat Feb 5 20:43:05 2005
Subject: [macstl-dev] Re: macstl 0.2 is finally here! whew...
References: <310C09B2-739C-11D9-949F-000D9337BC48@pixelglow.com>
Message-ID:

Glen,

I want to congratulate you on your work. It's been evident from your posts on the altivec list that you love this stuff and are very good with it.

Any chance of outlining future plans for MacSTL beyond the yellow, red/green blocks in vec-common-interface? It may help those of us considering the redistributable license. It's quite a commitment (particularly if just wanting to use this on Intel/AMD) without knowing a very broad outline of what is going to be added over the course of the next year. $499 for rights to use in my commercial code was fine as a disposable 'try it out' price; $2499 was a bit of a surprise (though I understand the need to eat as well :-).

I've seen the coloured feature boxes, but they don't make clear whether, for instance, you may devote more effort to bringing the x86 platform intrinsics onto a par with the altivec ones. For instance, I am interested in an implementation of complex numbers on the x86 platform, and despite my realisation that 'Mac'STL may never truly be as well supported on x86 as on the PowerPC, I would be interested to know your take on things.

On a separate note, perhaps you could explain if your code might be used with something like GPL'ed code. Is your open source code amenable to being published within a GPL'd context (given GPL's 'viral' nature) or is it limited to open source that only supports your reciprocal license?

Regards

From glen.low at pixelglow.com Tue Feb 1 08:27:47 2005
From: glen.low at pixelglow.com (Glen Low)
Date: Sat Feb 5 20:43:05 2005
Subject: [macstl-dev] Re: macstl 0.2 is finally here! whew...
In-Reply-To: References: <310C09B2-739C-11D9-949F-000D9337BC48@pixelglow.com> Message-ID: <19542AC2-73E8-11D9-949F-000D9337BC48@pixelglow.com> On 01/02/2005, at 7:11 AM, Paul Baxter wrote: > Glen, > > I want to congratulate you on your work. Its been evident from your > posts on the altivec list that you love this stuff and are very good > with it. > Thanks! > Any chance of outlining future plans for MacSTL beyond the yellow, > red/green blocks in vec-common-interface? It may help those of us > considering the redistributable license. Its quite a commitment > (particularly if just wanting to use this on Intel/AMD) without > knowing a very broad outline of what is going to be added over the > course of the next year. $499 for rights to use in my commercial code > was fine as a disposable 'try it out' price, $2499 was a bit of a > surprise. (though I understand the need to eat as well :-) I think it's reasonable. JoelOnSoftware thinks software should either be less than $1,000 or more than $75,000, so perhaps this is in the middle of the field. Then again, given that macstl is by nature and intent open-source, I doubt I could sell a proprietary version for $75,000 :-). However VAST, an autovectorizer for Altivec sells for $3,000 for embedded systems. Also it's a bit to do with the size of the market. Eric Raymond of CATB / OSI fame thinks 80% of the software written is in-house vs. shrink-wrapped packaging, so it makes sense that the redistributable price is 4x to 5x the in-house price. > I've seen the coloured feature boxes, but it doesn't make clear > whether for instance you may devote more effort to bringing the x86 > platform intrinsics onto a par with the altivec ones. For instance I > am interested in implementation of complex numbers of the x86 platform > and despite my realisation the 'Mac'STL may never truly be as well > supported on x86 as on the PowerPC, I would be interested to know your > take on things. 
I hope to tackle the low-lying fruit first -- multiplies, divides and some trig functions, and then complex numbers, all of which can be easily ported over from the Altivec side. Help is always welcome! Or particular development could be sponsored -- that's how the complex number arithmetic on Altivec came to be, it was sponsored. (SSE3 intrinsics would help in this regard.) > > On a separate note, perhaps you could explain if your code might be > used with something like GPL'ed code. Is your open source code > amenable to being published within a GPL'd context (given GPL's > 'viral' nature) or is it limited to open source that only supports > your reciprocal license? > I had this discussion with the RPL writers (Technical Pursuit) and raised it with license-discuss@opensource.org and also briefly with the FSF. The main opinion seems to be that RPL is incompatible with GPL. On the other hand I'm pretty amenable to making some exceptions a la the MySQL folk, but it's hard for me to figure out how to maintain the "extra" virality of RPL when it's combined with GPL -- e.g. if I explicitly allow RPL to be combined with a particular piece of GPL, can I maintain that the whole must be reciprocated even if it is in-house? Would this be a violation of GPL (no extra restrictions allowed)? Cheers, Glen Low --- pixelglow software | simply brilliant stuff www.pixelglow.com From pauljbaxter at hotmail.com Wed Feb 2 04:42:21 2005 From: pauljbaxter at hotmail.com (Paul Baxter) Date: Sat Feb 5 20:43:05 2005 Subject: [macstl-dev] Re: macstl 0.2 is finally here! whew... References: <310C09B2-739C-11D9-949F-000D9337BC48@pixelglow.com> <19542AC2-73E8-11D9-949F-000D9337BC48@pixelglow.com> Message-ID: Glen, In amongst the 101 other things you've got to do right now, I notice the new mailing list might not be working... Neither my message or your reply appear on the list, and I didn't receive your reply via the mailing list (just direct from you). 
Mind you, the mailman interface only shows January 2005.

On a separate tack: given the new licensing, how should/could a company evaluate MacSTL prior to purchase? E.g. benchmark using their own algorithms against, say, Blitz++/BLAS/uBlas/another for business purposes (including sharing the source code/object code amongst the team who write the tests)?

I'm not trying to get a free lunch here; I am hoping to purchase this if it proves its worth (as well as a commercial license for FFTW, which costs even more). I just wonder whether paying $499 for a corporate version to run some tests at work for evaluation might be counter-productive to the uptake of your library where an organisation cannot reciprocate some of the test code being evaluated. This is exactly the situation I'm currently facing (though it is possible that I've misunderstood your proposed licensing, as I'm no lawyer).

Any plans for a 30 day evaluation period after which object/source etc. must be destroyed? (I think that was the gist of the previous evaluation licensing, wasn't it? My memory's going, so apologies if I'm mis-remembering.)

The other thing I found surprising is that you refer to the redistribution of 'proprietary object code'. Given that your library provides source code and my compiler provides the object code, I thought you would need to prohibit the generation of object code for the purposes of redistribution (etc). It's probably a fine and rather stupid point, but your comments would be appreciated.

PS
The benchmark tests for Windows compile OK but throw unhandled exceptions when run. I haven't had time to chase this yet (10 hrs at work + a kid is limiting my spare 'fun' time right now :) ) but I'm doing this at home on my Athlon XP, which lacks SSE2/3. I'm not expecting any support, but it might be useful as a FAQ. I'll post something when I find out why.
Paul

From glen.low at pixelglow.com Wed Feb 2 07:20:26 2005
From: glen.low at pixelglow.com (Glen Low)
Date: Sat Feb 5 20:43:05 2005
Subject: [macstl-dev] Re: macstl 0.2 is finally here! whew...
In-Reply-To: References: <310C09B2-739C-11D9-949F-000D9337BC48@pixelglow.com> <19542AC2-73E8-11D9-949F-000D9337BC48@pixelglow.com> Message-ID: On 02/02/2005, at 4:42 AM, Paul Baxter wrote: > Glen, > > In amongst the 101 other things you've got to do right now, I notice > the new mailing list might not be working... Neither my message or > your reply appear on the list, and I didn't receive your reply via the > mailing list (just direct from you). Mind you the mailman interface > only shows January 2005. That's somewhat puzzling, my Mailman seems to have sent it out through my ISP SMTP as expected, so I have to chase this up with the ISP. The archives are going to be slightly out of date, since the main website where it resides is my web host's server and I archive and upload manually the mails -- eventually I will script this of course. > On a separate tack: Given the new licensing, how should/could a > company evaluate MacSTL prior to purchase? e.g. benchmark using their > own algorithms against say Blitz++/BLAS/uBlas/another for business > purposes (including sharing the source code/object code amongst the > team who write the tests?) > I'm not trying to get a free lunch here, I am hoping to purchase this > if it proves its worth (as well as a commercial license for FFTW which > costs even more). I just wonder whether paying $499 for a corporate > version to run some tests at work for evaluation might be > counter-productive to the uptake of your library where an organisation > cannot reciprocate some of the test code being evaluated. This is > exactly the situation I'm currently facing (though it is possible that > I've misunderstood your proposed licensing as I'm no lawyer) > I should make it clearer, perhaps in a FAQ. By default without paying the software is under the RPL. 
Clause 1.2 says: "Deploy" means to use, Serve, sublicense or distribute Licensed Software other than for Your internal Research and/or Personal Use, and includes without limitation, any and all internal use or distribution of Licensed Software within Your business or organization other than for Research and/or Personal Use, as well as direct or indirect sublicensing or distribution of Licensed Software by You to any third party in any form or manner. Thus you are allowed to evaluate ("Research") macstl without it being considered a deployment and invoking the reciprocation clause. You have to draw a reasonable line in the sand between internal use e.g. if actual users are using beta or final versions of your software that #includes macstl in-house, vs. an evaluation situation e.g. the team hasn't decided yet to incorporate macstl and is just testing. > Any plans for a 30 day evaluation period after which object/source etc > must be destroyed? (I think that was the jist of the previous > evaluation licensing wasn't it - my memory's going so apologies if I'm > mis-remembering). Hmm, I did have a 30 day eval period before, I have to check with my lawyer about whether that would conflict with RPL. > The other thing I found surprising is that you refer to the > redistribution of 'proprietary object code'. Given that your library > provides source code and my compiler provides the object code, I > thought you would need to prohibit the generation of object code for > the purposes of redistribution (etc). Its probably a fine and rather > stupid point, but your comments appreciated. Unpacking the phrase "redistribution of proprietary object code", it means if you want to distribute code that #includes macstl but don't want reciprocate your own code, then it's proprietary, and requires you to pay the redistribution license fee. 
I can't prohibit the generation of object code, since you have those rights at all times with the RPL. This is one of the finer points of open source rights that I learnt from license-discuss at opensource.org: the OSI-certified licenses do not prohibit combining code, they just restrict what can be done with the combined code.

> PS
> The benchmark tests for windows compile OK but throw unhandled
> exceptions when run. I haven't had time to chase this yet (10 hrs at
> work + a kid is limiting my spare 'fun' time right now :) ) but I'm
> doing this at home on my Athlon XP which is lacking SSE2/3.

Yes, the library is currently tuned for SSE2. You can selectively disable support by not defining the appropriate macro (__MMX__, __SSE__, __SSE2__, __SSE3__) at the VS.NET project level. The valarray library will then transparently scale back and use scalars for ops it cannot find, and the vec library will tell you what ops are undefined.

For now, the trig support requires __SSE2__ because I couldn't get a decent argument reduction happening with the floats of __SSE__. On Altivec the fused multiply-adds have higher precision than an SSE multiply followed by an add, so with SSE2 I had to do the argument reduction in double, then move back to float for the polynomial estimate.

> I'm not expecting any support, but it might be useful as a FAQ. I'll
> post something when I find out why.

OK, much appreciated!

Cheers, Glen Low

---
pixelglow software | simply brilliant stuff
www.pixelglow.com

From glen.low at pixelglow.com Wed Feb 2 08:39:12 2005
From: glen.low at pixelglow.com (Glen Low)
Date: Sat Feb 5 20:43:05 2005
Subject: [macstl-dev] Re: macstl 0.2 is finally here! whew...
In-Reply-To:
References: <310C09B2-739C-11D9-949F-000D9337BC48@pixelglow.com> <19542AC2-73E8-11D9-949F-000D9337BC48@pixelglow.com>
Message-ID:

On 02/02/2005, at 4:42 AM, Paul Baxter wrote:

> Glen,
>
> In amongst the 101 other things you've got to do right now, I notice
> the new mailing list might not be working...
Neither my message or > your reply appear on the list, and I didn't receive your reply via the > mailing list (just direct from you). Mind you the mailman interface > only shows January 2005. > If my limited mailman knowledge serves me correctly, you have "nodupes" option set on, which makes the list not send an email to you if the poster had cc'ed you directly, see http://mm.tkikuchi.net/mailman-admin/node14.html. The option is "Filter out duplicate messages to list members (if possible)" on your admin screen -- by default I had set it to on. I'll make the default to off to get the usual list behavior but I'll leave it up to you if you want it off for your own subscription. Cheers, Glen Low --- pixelglow software | simply brilliant stuff www.pixelglow.com From pauljbaxter at hotmail.com Wed Feb 2 18:47:37 2005 From: pauljbaxter at hotmail.com (Paul Baxter) Date: Sat Feb 5 20:43:05 2005 Subject: [macstl-dev] Re: macstl 0.2 is finally here! whew... References: <310C09B2-739C-11D9-949F-000D9337BC48@pixelglow.com> <19542AC2-73E8-11D9-949F-000D9337BC48@pixelglow.com> Message-ID: >> the new mailing list might not be working... Neither my message or your >> reply appear on the list, and I didn't receive your reply via the mailing >> list (just direct from you). Mind you the mailman interface only shows >> January 2005. > > The archives are going to be slightly out of date, since the main website > where it resides is my web host's server and I archive and upload manually > the mails -- eventually I will script this of course. Having my hotmail account dropping emails is not a surprise of itself; only when combined with the lack of changes to the mailing list archive did I think it might be a list problem. Since you do archving manually at present, no problems. I certainly received your last post correctly. > >> On a separate tack: Given the new licensing, how should/could a company >> evaluate MacSTL prior to purchase? > > I should make it clearer, perhaps in a FAQ. 
> > Thus you are allowed to evaluate ("Research") macstl without it being > considered a deployment and invoking the reciprocation clause. You have to > draw a reasonable line in the sand between internal use e.g. if actual > users are using beta or final versions of your software that #includes > macstl in-house, vs. an evaluation situation e.g. the team hasn't decided > yet to incorporate macstl and is just testing. Excellent news. > >> Any plans for a 30 day evaluation period after which object/source etc >> must be destroyed? (I think that was the jist of the previous evaluation >> licensing wasn't it - my memory's going so apologies if I'm >> mis-remembering). > > Hmm, I did have a 30 day eval period before, I have to check with my > lawyer about whether that would conflict with RPL. Given the 'research' clause above which I mis-interpreted (my company does 'research' as its business output), it looks like you already have the evaluation base covered, thanks. Thanks too for the explanation of the OSI object code philosophy. From pauljbaxter at hotmail.com Wed Feb 2 23:58:42 2005 From: pauljbaxter at hotmail.com (Paul Baxter) Date: Sat Feb 5 20:43:05 2005 Subject: [macstl-dev] Re: Windows build without SSE2/3 (e.g. athlon XP) vectest and benchmark References: <310C09B2-739C-11D9-949F-000D9337BC48@pixelglow.com> <19542AC2-73E8-11D9-949F-000D9337BC48@pixelglow.com> Message-ID: >> The benchmark tests for windows compile OK but throw unhandled >> exceptions when run. I haven't had time to chase this yet (10 hrs at work >> + a kid is limiting my spare 'fun' time right now :) ) but I'm doing this >> at home on my Athlon XP which is lacking SSE2/3. > > Yes, the library is currently tuned for SSE2. 
You can selectively disable
> support by not defining the appropriate macro: __MMX__, __SSE__, __SSE2__,
> __SSE3__ in the VS.NET project level

For both projects:
Adjust the project defines to __MMX__ and __SSE__ only (no __SSE2__) and set the project build arch to SSE (rather than SSE2).

Tests of vectest project:

Compilation problems in vec_mmx.h

From line 3779 there is a section for SSE/MMX that has some problems (definition of for the maximum function. Replacing the 8 with 4 (incl the subsequent inside the function allows compilation and passes unsigned short tests

Same applies to minimum at line 3858 onwards

(test_func() OK but test_accum() is noted as undefined, due (I think) to the lack of
template <> struct accumulator > >
)

Other points:
vec fails at min and max in both test_func and test_acc
Don't know why. QNAN handling? Maybe different handling in scalar and vector code?

Benchmark test:

With __MMX__, __SSE__ and arch set to SSE

Program crashes at entry to main

Changing the denormal activation to

    #ifdef __SSE__
    // _mm_setcsr (_mm_getcsr () | 0x8040); // on Intel, treat denormals as zero for full speed
    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
    #endif

as per recommendations some way down the thread at
http://softwareforums.intel.com/ids/board/message?board.id=16&message.id=183
then allowed the program to execute on my AthlonXP. (I then showed this to be equivalent to changing the bit mask to 0x8000, i.e. just the FTZ bit.)

I also tried 0x40 (DAZ bit 6, undocumented) as recommended in the same thread, but suspect that this further (P4-specific?) assistance may not be available for Athlons, as it repeated the crash.
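
The bit arithmetic in the paragraphs above can be sketched as follows. This is only an illustration under stated assumptions: the helper name denormal_mask and its daz_supported parameter are hypothetical (they are not part of macstl), and deciding whether a particular CPU accepts the DAZ bit is left to the caller.

```cpp
// MXCSR control bits discussed in the message above.
const unsigned int MXCSR_FTZ = 0x8000; // bit 15: flush-to-zero
const unsigned int MXCSR_DAZ = 0x0040; // bit 6: denormals-are-zero

// Hypothetical helper: build the mask to OR into MXCSR.
// daz_supported is an assumption supplied by the caller; this
// sketch does not probe the CPU itself.
inline unsigned int denormal_mask(bool daz_supported)
{
    return daz_supported ? (MXCSR_FTZ | MXCSR_DAZ) : MXCSR_FTZ;
}
```

With daz_supported set to true the helper yields 0x8040, the mask in the commented-out _mm_setcsr line; with false it yields 0x8000, the FTZ-only mask that ran on the Athlon XP. The result would be applied as _mm_setcsr (_mm_getcsr () | denormal_mask (...)).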
Querying _mm_getcsr() after the revised function call on the Athlon gives me a value of 0x98F0, which looks like the FTZ bit (15) is set but not the DAZ bit (6).

It would be interesting to know whether the function call is equivalent to a mask of 0x8040 on P4s and 0x8000 on Athlons automatically. Perhaps you can try it on a P4 and let me know?
Otherwise you could consider the function at
http://ccrma-mail.stanford.edu/pipermail/planetccrma/2005-January/007558.html
This code sets bit 15 (FTZ) and then queries cpuid before deciding on bit 6.

Regards
Paul Baxter

-------------------------
Partial output of vectest.exe (with above mods added to vec_mmx.h) attached at the end of email

> vec defined:
sum defined: sum OK.
max undefined.
min undefined.
operator- defined:
< snipped>
max defined: max OK.
min defined: min OK.
pow undefined.
---------------------------------------
problem> vec defined:
sum defined: sum OK.
max defined: 10000000: -1.#QNAN != 2.752831445473174e-031 == max (2.752831445473174e-031 -3.2907964123623209e-019 -1.3662074820492266e-017 -1.#QNAN).
min defined: 10000000: -1.#QNAN != -4.5765854009174442e-027 == min (30269796477465809000000 -4.5765854009174442e-027 503096195153920 -1.#QNAN).
operator- defined: operator- OK.
log undefined.
max defined: 10000000: -1.#QNAN != 6.7629492218561597e+032 == max (6.7629492218561597e+032, -1.#QNAN).
min defined: 10000000: 1.#QNAN != -2.2030078907976132e-016 == min (-2.2030078907976132e-016, 1.#QNAN).
pow undefined.

From glen.low at pixelglow.com Thu Feb 3 08:07:14 2005
From: glen.low at pixelglow.com (Glen Low)
Date: Sat Feb 5 20:43:05 2005
Subject: [macstl-dev] Re: Windows build without SSE2/3 (e.g.
athlon XP) vectest and benchmark In-Reply-To: References: <310C09B2-739C-11D9-949F-000D9337BC48@pixelglow.com> <19542AC2-73E8-11D9-949F-000D9337BC48@pixelglow.com> Message-ID: <8EE5C6CA-7577-11D9-949F-000D9337BC48@pixelglow.com> Paul: > For both projects: > Adjust the project defines to __MMX__ and __SSE__ only (no __SSE2__) > and set the project build arch to SSE (rather than SSE2). > > > Tests of vectest project: > > Compilation problems in vec_mmx.h > >> From line 3779 there is a section for SSE/MMX that has some problems > (definition of for the maximum function. Replacing > the 8 > with 4 (incl the subsequent inside the function allows > compilation and passes unsigned short tests > > Same applies to minimum at line 3858 onwards > > (test_func() OK but test_accum() is noted as undefined, due ( I think) > to the lack of > template <> struct accumulator 4> > > > ) I'll look into it. > Other points: > vec fails at min and max in both test_func and test_acc > Don't know why?? QNAN handling? maybe different handling in scalar and > vector code? Yes it's NaN handling. C89 and C++98 are silent about NaN handling in max and min code, but C99 says that fmax and fmin are supposed to ignore NaN rather than propagate NaN ("C99-style max"). However both Altivec and MMX/SSE use Java-style max, which propagates the NaN. So for the common interface I've adopted C99-style max, which is fairly easy to get on Altivec and requires a little more thought on SSE, and it would devolve to faster Java-style max when finite math optimization is on (is there a macro or option for this on VC++?). 
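
To make the two conventions in the reply above concrete, here is a minimal scalar sketch in C++. The names c99_style_max and propagating_max are hypothetical and are not macstl functions; the sketch only illustrates the difference between ignoring a NaN operand and propagating it, and says nothing about how macstl implements either on Altivec or SSE.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper: C99-style max. A NaN operand is ignored and
// the other operand is returned, as fmax is specified to do.
inline float c99_style_max(float a, float b)
{
    if (std::isnan(a)) return b;
    if (std::isnan(b)) return a;
    return a < b ? b : a;
}

// Hypothetical helper: propagating max. If either operand is NaN,
// the result is NaN.
inline float propagating_max(float a, float b)
{
    if (std::isnan(a) || std::isnan(b))
        return std::numeric_limits<float>::quiet_NaN();
    return a < b ? b : a;
}
```

A test that compares a scalar reference written in one style against a vector result computed in the other would report mismatches only on inputs containing NaN, which is consistent with the -1.#QNAN lines in the vectest output.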
> > Benchmark test: > > With __MMX__, __SSE__ and arch set to SSE > > Program crashes at entry to main > > Changing the denormal activation to > #ifdef __SSE__ > > // _mm_setcsr (_mm_getcsr () | 0x8040); // on Intel, treat denormals > as zero for full speed > > _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON); > > #endif > > as per recommendations some way down the thread at > http://softwareforums.intel.com/ids/board/message? > board.id=16&message.id=183 > then allowed the program to execute on my AthlonXP (Which I then > showed to be equivalent to changing the bit mask to 0x8000 ie just the > FTZ bit.) > I also tried 0x40 (DAZ bit 6 undocumented) as recommended in the same > thread but suspect that this further (P4-specific?) assistance may not > be available for Athlon's as it repeated the crash. > > Querying _get_csr() after the revised function call on the Athlon > gives me a value of 0x98F0 which looks like the FTZ bit(15) is set but > not the DAZ bit 6. > > It would be interesting to know whether the function call is > equivalent to a mask of 0x8040 on P4's and 0x8000 on Athlons > automatically. Perhaps you can try it on a P4 and let me know? > Otherwise you could consider the function at > http://ccrma-mail.stanford.edu/pipermail/planetccrma/2005-January/ > 007558.html > This code sets bit 15 FTZ and then queries cpuid before deciding on > bit 6. Yes, I'm considering to make it part of the common interface i.e. a single function that sets or clears denormal handling on all processors. Cheers, Glen Low --- pixelglow software | simply brilliant stuff www.pixelglow.com From David-Chilton at utc.edu Sat Feb 5 11:27:00 2005 From: David-Chilton at utc.edu (David Chilton) Date: Sat Feb 5 20:43:05 2005 Subject: [macstl-dev] template with C linkage Message-ID: <7b5427bd5875c2ce69a158e37f2e930d@utc.edu> Glen, I successfully built and executed the benchmark target in the macstl xcode project, and it looks great. 
Now I'm trying to test incorporating macstl into the code for my honors thesis, which relies heavily on valarrays read from FITS files, but I can't seem to get gcc to build it. I continually get "template with C linkage" errors. The same code compiles fine using the standard library valarray. Below is an example:

#include <macstl/valarray.h>
using stdext::valarray;

class TestClass
{
public:
    static size_t testStatic;
    TestClass(size_t n):index(n) {}
    TestClass():index(0){}
    size_t getindex() { return index;}
    valarray<size_t> testFunc(valarray<TestClass>& input){
        valarray<size_t> d(input.size());
        for(size_t i =0; i<input.size(); i++)
            d[i]=input[i];
        return d;
    }
private:
    size_t index;
}
size_t TestClass testStatic = size_t(0);

compile with: gcc -c -o test.o test.cpp

and tons of errors are given

Any suggestions would be appreciated.

From: glen.low at pixelglow.com (Glen Low)
Subject: [macstl-dev] template with C linkage
References: <7b5427bd5875c2ce69a158e37f2e930d@utc.edu>
Message-ID: <4d0034d70e2578f4ba16e82d974fff5c@pixelglow.com>

On 05/02/2005, at 11:27 AM, David Chilton wrote:

> Glen,
>
> I successfully built and executed the benchmark target in the macstl
> xcode project, and it looks great. Now I'm trying to test
> incorporating macstl into the code for my honors thesis, which relies
> heavily on valarray's read from FITS files, but I can't seem to get
> gcc to build it. I continually get "template with C linkage" errors.
> The same code compiles fine using the standard library valarray.
> below is an example:
>
> #include <macstl/valarray.h>
> using stdext::valarray;
>
> class TestClass
> {
> public:
> static size_t testStatic;
> TestClass(size_t n):index(n) {}
> TestClass():index(0){}
> size_t getindex() { return index;}
> valarray<size_t> testFunc(valarray<TestClass>& input){
> valarray<size_t> d(input.size());
> for(size_t i =0; i<input.size(); i++)
> d[i]=input[i];
> return d;
> }
> private:
> size_t index;
> }
> size_t TestClass testStatic = size_t(0);
>
> compile with: gcc -c -o test.o test.cpp
>
> and tons of errors are given
>
> Any suggestions would be appreciated.
>

Thanks for evaluating macstl and joining the list, hope you find both activities useful!
The program above has a few syntax errors, allow me to correct:

#include <macstl/valarray.h>
using stdext::valarray;

class TestClass
{
public:
    static size_t testStatic;
    TestClass(size_t n):index(n) {}
    TestClass():index(0){}
    size_t getindex() { return index;}
    valarray<size_t> testFunc(valarray<TestClass>& input){
        valarray<size_t> d(input.size());
        for(size_t i =0; i<input.size(); i++)
            d[i]=input[i].getindex(); // no conversion to size_t, I assume you left out getindex
        return d;
    }
private:
    size_t index;
};
size_t TestClass::testStatic = size_t(0); // needs the ::

On the macstl side there's one minor bug, valarray_vec.h:38 should read

#include

And the above, if put into main.cpp, should compile cleanly on Apple gcc 3.3.

I'm not sure where the "template with C linkage" errors are coming from; perhaps you are compiling a .c file instead of a .cpp file? Or can you submit an example of the error?

Note that valarray<size_t> is unlikely to be optimized, because size_t is usually a typedef for long, and only ints are Altivec optimized. On 32-bit gcc, long is 32 bits, which is equivalent to an int, but on 64-bit gcc, long is 64 bits, which is no longer the size of an int, so I didn't want to trigger the issue in macstl. (I may do some template metaprogramming to get 32-bit longs optimized and skip 64-bit longs, we'll see.)

Cheers, Glen Low

---
pixelglow software | simply brilliant stuff
www.pixelglow.com

From glen.low at pixelglow.com Sat Feb 5 20:26:39 2005
From: glen.low at pixelglow.com (Glen Low)
Date: Sat Feb 5 20:55:43 2005
Subject: [macstl-dev] Re: Windows build without SSE2/3 (e.g. athlon XP) vectest and benchmark
In-Reply-To:
References: <310C09B2-739C-11D9-949F-000D9337BC48@pixelglow.com> <19542AC2-73E8-11D9-949F-000D9337BC48@pixelglow.com>
Message-ID:

Paul:

On 02/02/2005, at 11:58 PM, Paul Baxter wrote:

> Compilation problems in vec_mmx.h
>
> From line 3779 there is a section for SSE/MMX that has some problems
> (definition of for the maximum function.
Replacing
> the 8
> with 4 (incl the subsequent inside the function allows
> compilation and passes unsigned short tests
>
> Same applies to minimum at line 3858 onwards
>
> (test_func() OK but test_accum() is noted as undefined, due ( I think)
> to the lack of
> template <> struct accumulator 4> >
>
> )

Have fixed both the member and binary min and max for vec in build 3 -- which will be accumulated in release 0.2.1.

Cheers, Glen Low

---
pixelglow software | simply brilliant stuff
www.pixelglow.com

From David-Chilton at utc.edu Sun Feb 6 02:28:56 2005
From: David-Chilton at utc.edu (David Chilton)
Date: Sun Feb 6 02:55:06 2005
Subject: [macstl-dev] template with C linkage
In-Reply-To: <4d0034d70e2578f4ba16e82d974fff5c@pixelglow.com>
References: <7b5427bd5875c2ce69a158e37f2e930d@utc.edu> <4d0034d70e2578f4ba16e82d974fff5c@pixelglow.com>
Message-ID:

Glen,

After much tinkering, it seems the problem comes in with the -Wp,-header-mapfile,... option that is passed by Xcode. I couldn't get benchmark.cpp to compile from the command line, so I started putting in options from the Xcode build results window until it built. Finally, I tried to compile the main.cpp given below. It works with the std valarray straight away, but gives errors when I try to use the macstl valarray. It does work when I add -Wp,-header-mapfile,[PATH_TO_BENCHMARK_BUILD]/benchmark.hmap to the gcc call; without that it gives loads of errors. I can't seem to find any documentation on the hmap file, how to make one work, or why the build doesn't work without it.
Here's the command line that does build properly:

/usr/bin/gcc-3.3 /Users/chiltie/Desktop/main.cpp -o
/Users/chiltie/Desktop/main -lstdc++ -DUSE_MACSTL
-Wp,-header-mapfile,/Users/chiltie/Desktop/macstl/mac/build/
macstl.build/benchmark.build/benchmark.hmap

Here's the one that doesn't:

/usr/bin/gcc-3.3 /Users/chiltie/Desktop/main.cpp -o
/Users/chiltie/Desktop/main -lstdc++ -DUSE_MACSTL

And finally here's my updated main.cpp to give some test output:

#include
#ifdef USE_MACSTL
#include <macstl/valarray.h>
using stdext::valarray;
#else
#include <valarray>
using std::valarray;
#endif

class TestClass
{
public:
    static size_t testStatic;
    TestClass(size_t n):index(n) {}
    TestClass():index(0){}
    size_t getindex() { return index;}
    valarray<size_t> testFunc(valarray<TestClass>& input){
        valarray<size_t> d(input.size());
        for(size_t i =0; i test(10);
    test=size_t(0);
    TestClass testvar = size_t(0);
    size_t i;
    for(i = 0;i test2 = testvar.testFunc(test);
    for(i=0;i

> On 05/02/2005, at 11:27 AM, David Chilton wrote:
>
>> Glen,
>>
>> I successfully built and executed the benchmark target in the macstl
>> xcode project, and it looks great. Now I'm trying to test
>> incorporating macstl into the code for my honors thesis, which relies
>> heavily on valarrays read from FITS files, but I can't seem to get
>> gcc to build it. I continually get "template with C linkage" errors.
>> The same code compiles fine using the standard library valarray.
>> below is an example:
>>
>> #include <macstl/valarray.h>
>> using stdext::valarray;
>>
>> class TestClass
>> {
>> public:
>>     static size_t testStatic;
>>     TestClass(size_t n):index(n) {}
>>     TestClass():index(0){}
>>     size_t getindex() { return index;}
>>     valarray<size_t> testFunc(valarray<TestClass>& input){
>>         valarray<size_t> d(input.size());
>>         for(size_t i =0; i<input.size(); i++)
>>             d[i]=input[i];
>>         return d;
>>     }
>> private:
>>     size_t index;
>> }
>> size_t TestClass testStatic = size_t(0);
>>
>> compile with: gcc -c -o test.o test.cpp
>>
>> and tons of errors are given
>>
>> Any suggestions would be appreciated.
>>
>
> Thanks for evaluating macstl and joining the list, hope you find both
> activities useful!
>
> The program above has a few syntax errors, allow me to correct:
>
> #include <macstl/valarray.h>
> using stdext::valarray;
>
> class TestClass
> {
> public:
>     static size_t testStatic;
>     TestClass(size_t n):index(n) {}
>     TestClass():index(0){}
>     size_t getindex() { return index;}
>     valarray<size_t> testFunc(valarray<TestClass>& input){
>         valarray<size_t> d(input.size());
>         for(size_t i =0; i<input.size(); i++)
>             d[i]=input[i].getindex(); // no
> conversion to size_t, I assume you left out getindex
>         return d;
>     }
> private:
>     size_t index;
> };
> size_t TestClass::testStatic = size_t(0); // needs the ::
>
> On the macstl side there's one minor bug, valarray_vec.h:38 should read
>
> #include
>
> And the above, if put into main.cpp, should compile cleanly on Apple
> gcc 3.3.
>
> I'm not sure where the "template with C linkage" errors are coming
> from, perhaps you are compiling a .c file instead of a .cpp file? Or
> can you submit an example of the error?
>
> Note that valarray<size_t> is unlikely to be optimized, because
> size_t is usually a typedef for long, and only ints are Altivec
> optimized. On 32-bit gcc, long is 32 bits which is equivalent to an
> int, but on 64-bit gcc, long is 64 bits which is no longer the size of
> an int, so I didn't want to trigger the issue in macstl. (I may do
> some template metaprogramming to get 32-bit longs optimized and skip
> 64-bit longs, we'll see.)
>
> Cheers, Glen Low
>
>
> ---
> pixelglow software | simply brilliant stuff
> www.pixelglow.com
>
>
>
> _______________________________________________
> macstl-dev mailing list
> macstl-dev@pixelglow.com
> http://www.pixelglow.com/lists/listinfo/macstl-dev
>

From glen.low at pixelglow.com Sun Feb 6 08:57:47 2005
From: glen.low at pixelglow.com (Glen Low)
Date: Sun Feb 6 09:17:04 2005
Subject: [macstl-dev] template with C linkage
In-Reply-To:
References: <7b5427bd5875c2ce69a158e37f2e930d@utc.edu> <4d0034d70e2578f4ba16e82d974fff5c@pixelglow.com>
Message-ID: <2b256ac39bb711f5513a5c5bb20db525@pixelglow.com>

David:

On 06/02/2005, at 2:28 AM, David Chilton wrote:

> Glen,
> After much tinkering, it seems the problem comes in with the
> -Wp,-header-mapfile,... option that is passed by Xcode. I couldn't
> get benchmark.cpp to compile from the command line so I started
> putting in options from the Xcode build results window until it built.
> Finally, I tried to compile the main.cpp given below. It works
> with the std valarray straight away, but gives errors when I try to
> use macstl/valarray. But it works when I add
> -Wp,-header-mapfile,[PATH_TO_BENCHMARK_BUILD]/benchmark.hmap to the
> gcc call; without that it gives loads of errors. I can't seem to find
> any documentation on the hmap file or how to make one work, or why it
> doesn't work without it.
>
> Here's the command line that does build properly:
> /usr/bin/gcc-3.3 /Users/chiltie/Desktop/main.cpp -o
> /Users/chiltie/Desktop/main -lstdc++ -DUSE_MACSTL
> -Wp,-header-mapfile,/Users/chiltie/Desktop/macstl/mac/build/
> macstl.build/benchmark.build/benchmark.hmap
>
> here's the one that doesn't:
> /usr/bin/gcc-3.3 /Users/chiltie/Desktop/main.cpp -o
> /Users/chiltie/Desktop/main -lstdc++ -DUSE_MACSTL

It's rather perplexing.
I use almost the same command line as yours:

/usr/bin/gcc-3.3 main.cpp -lstdc++ -DUSE_MACSTL -I../macstl -o main.o

where the -I is the path to the base of the outermost macstl directory, and I get no errors.

Googling for "template with C linkage" turns up about 700+ pages, some detailing issues even with system headers on certain combinations of operating systems and machines.

So my trusty C++ detective says :-), it's because a (possibly system) header has an unterminated extern "C" { which then infects all subsequent headers that were #included after. Now if

>
> And Here's a few lines of the error output (it gives over a thousand
> total errors):
> In file included from /usr/include/macstl/valarray.h:86,
>                  from /Users/chiltie/Desktop/main.cpp:3:
> /usr/include/macstl/impl/valarray_vec.h:45: error: template with C
> linkage
> /usr/include/macstl/impl/valarray_vec.h:150: error: template with C
> linkage
> /usr/include/macstl/impl/valarray_vec.h:254: error: template with C
> linkage
>

is the first lot of errors you get back from the compile, then all the #includes before line 86 were OK and the culprit should be #included from somewhere in the vec.h header. I suspect that one of your system headers inadvertently had its last line or so chopped off, thus causing an unterminated extern "C" directive, or perhaps it was some interaction with a macro definition. The #include might not have gone through the exact same route and thus not triggered it.

To narrow it down, try putting a simple template definition at various points in my source along the #include path to see if you can trigger the error sooner: e.g.

template <typename T> struct make_my_day { };

Some configuration questions that might be useful:

1. What's the exact version of gcc? (Try gcc --version)
2. Have you made any changes at all to the /usr/include/gcc headers?
3. Did you install macstl at /usr/include/ or somewhere else?
If the former, try putting it somewhere else and using an -I option like I did to see if the error goes away.

Anyone else try out David's code and got errors?

Cheers, Glen Low


---
pixelglow software | simply brilliant stuff
www.pixelglow.com

From glen.low at pixelglow.com Mon Feb 7 08:05:56 2005
From: glen.low at pixelglow.com (Glen Low)
Date: Mon Feb 7 08:24:28 2005
Subject: [macstl-dev] template with C linkage
In-Reply-To: <2b256ac39bb711f5513a5c5bb20db525@pixelglow.com>
References: <7b5427bd5875c2ce69a158e37f2e930d@utc.edu> <4d0034d70e2578f4ba16e82d974fff5c@pixelglow.com> <2b256ac39bb711f5513a5c5bb20db525@pixelglow.com>
Message-ID:

On 06/02/2005, at 8:57 AM, Glen Low wrote:

> David:
>
> On 06/02/2005, at 2:28 AM, David Chilton wrote:
>
>> Glen,
>> After much tinkering, it seems the problem comes in with the
>> -Wp,-header-mapfile,... option that is passed by Xcode. I couldn't
>> get benchmark.cpp to compile from the command line so I started
>> putting in options from the Xcode build results window until it
>> built. Finally, I tried to compile the main.cpp given below. It
>> works with the std valarray straight away, but gives errors when I
>> try to use macstl/valarray. But it works when I add
>> -Wp,-header-mapfile,[PATH_TO_BENCHMARK_BUILD]/benchmark.hmap to the
>> gcc call; without that it gives loads of errors. I can't seem to
>> find any documentation on the hmap file or how to make one work, or
>> why it doesn't work without it.
>>
>> Here's the command line that does build properly:
>> /usr/bin/gcc-3.3 /Users/chiltie/Desktop/main.cpp -o
>> /Users/chiltie/Desktop/main -lstdc++ -DUSE_MACSTL
>> -Wp,-header-mapfile,/Users/chiltie/Desktop/macstl/mac/build/
>> macstl.build/benchmark.build/benchmark.hmap
>>
>> here's the one that doesn't:
>> /usr/bin/gcc-3.3 /Users/chiltie/Desktop/main.cpp -o
>> /Users/chiltie/Desktop/main -lstdc++ -DUSE_MACSTL
>
> It's rather perplexing.
> I use almost the same command line as yours:
>
> /usr/bin/gcc-3.3 main.cpp -lstdc++ -DUSE_MACSTL -I../macstl -o main.o
>
> where the -I is the path to the base of the outermost macstl
> directory, and I get no errors.
>
> Googling for "template with C linkage" turns up about 700+ pages, some
> detailing issues even with system headers on certain combinations of
> operating systems and machines.
>

Finally figured it out. The bug only hits when you put macstl in /usr/include or symlink to it. Apparently gcc automatically encloses all such headers in an extern "C" block, see http://gcc.gnu.org/onlinedocs/cpp/System-Headers.html. The URL claims this is only in operation "on very old systems", so perhaps it's a bug on OS X. I tried using the #pragma GCC system_header directive but it didn't seem to take, and in any case would be inaccurate in intent for macstl headers -- so I'll change the ReadMe.rtf to highlight this issue.

Thanks for your help!

Cheers, Glen Low


---
pixelglow software | simply brilliant stuff
www.pixelglow.com

From glen.low at pixelglow.com Tue Feb 15 00:14:44 2005
From: glen.low at pixelglow.com (Glen Low)
Date: Tue Feb 15 00:23:11 2005
Subject: [macstl-dev] macstl 0.2.1 beats the autovectorizing Intel ICC 8.1
Message-ID:

Hi All

Thanks for all your patience and help with debugging macstl 0.2 -- especially Paul Baxter and Derek Piasecki who helped with AMD64 issues.

I'm happy to announce the immediate availability of macstl 0.2.1, which features support for Intel ICC 8.1 on Windows and partial support for IBM XLC++ 6.0 on Mac OS X.

http://www.pixelglow.com/macstl/download/

Here's the list of changes:

* Fixed member and binary min and max for vec [PBa].
* Fixed #include error with own projects [DCh].
* Added support for Intel ICC 8.1 [ACu].
* Fixed truncation of signed constants in unsigned parameters [DPi].
* Added partial support for IBM XLC 6.0.
* Fixed header access paths and missing functions malloc, free, vm_allocate, vm_copy, vm_deallocate for Codewarrior.
* Fixed #include error, domain in trigonometric test for VC++. Improved inlining for ICC.

I haven't got the latest benchmarks up yet, but so far macstl 0.2.1 beats the autovectorizing Intel ICC -- some 2x to 16x faster than autovectorized code!!

I've also got the Subversion server set up, so for those of you who pay the license fees, you get access to the latest sources and even commit rights to the branches, so you can play with macstl on our own servers.

Cheers, Glen Low


---
pixelglow software | simply brilliant stuff
www.pixelglow.com

From glen.low at pixelglow.com Tue Feb 15 20:16:18 2005
From: glen.low at pixelglow.com (Glen Low)
Date: Tue Feb 15 20:29:53 2005
Subject: [macstl-dev] Opinions wanted about the future directions of macstl
Message-ID:

Hi All,

Now that macstl 0.2.1 has reached some measure of stability and we now have a good community of you macstl users in this mailing list, it's time I gathered some feedback about our future direction.

Please rank the top 5 or so things you'd like to see in macstl, discuss it, and I'll see what I can do, time, energy and money permitting :-).

Documentation
1. Design of the trig functions.
2. Design of the integer division functions.
3. How to adapt macstl for your favorite SIMD architecture.
4. More examples and samples (suggest?)

Altivec
5. The rest of the transcendentals e.g. acos, sinh.
6. The complex transcendentals.
7. Any other mathematical or other vector-related functions (suggest?)
8. long long support -- doubtful whether Altivec will accelerate this much...
9. Optimizing mask arrays -- using a bool array to select elements from another array

MMX/SSE/SSE2/SSE3
10. Summarizers e.g. min, max, sum.
11. operator*, operator/, operator%
12. float transcendentals.
13. double transcendentals.
14. complex float arithmetic.
15. complex float transcendentals.
16. complex double arithmetic.
17. complex double transcendentals.
18. memory mapping in Windows

General SIMD
19. Support for OpenMP parallelizing.
20. Distributed valarrays, perhaps through MPI?
21. Support for your favorite SIMD architecture, perhaps a GPU?

Other areas in macstl
22. Other Core Foundation objects e.g. maps, sets.
23. Objective-C++ support.
24. Improving the COM implementation.
25. Improving the Mach vector implementation. Perhaps also a Mach std::string?

Cheers, Glen Low


---
pixelglow software | simply brilliant stuff
www.pixelglow.com

From pauljbaxter at hotmail.com Wed Feb 16 02:51:01 2005
From: pauljbaxter at hotmail.com (Paul Baxter)
Date: Wed Feb 16 07:46:52 2005
Subject: [macstl-dev] Opinions wanted about the future directions of macstl
References:
Message-ID:

> Now that macstl 0.2.1 has reached some measure of stability and we now
> have a good community of you macstl users in this mailing list, it's time
> I gathered some feedback about our future direction.
>
> Please rank the top 5 or so things you'd like to see in macstl, discuss
> it, and I'll see what I can do, time, energy and money permitting :-).
>

Comments attached - heavy PC/Linux bias
See *N where N = priority (1 highest)

> Documentation
> 1. Design of the trig functions.
> 2. Design of the integer division functions.
> 3. How to adapt macstl for your favorite SIMD architecture.
*5
> 4. More examples and samples (suggest?)
Added: *6
4b) Architecture optimisation guide - how to get the best out of macstl on different compilers/OS/processors e.g. compile flags, tips
>
> Altivec
> 5. The rest of the transcendentals e.g. acos, sinh.
> 6. The complex transcendentals.
> 7. Any other mathematical or other vector-related functions (suggest?)
> 8. long long support -- doubtful whether Altivec will accelerate this
> much...
> 9. Optimizing mask arrays -- using a bool array to select elements from
> another array
>
> MMX/SSE/SSE2/SSE3
> 10. Summarizers e.g. min, max, sum.
> 11. operator*, operator/, operator%
*3
> 12. float transcendentals.
> 13. double transcendentals.
*1
> 14. complex float arithmetic.
> 15. complex float transcendentals.
*2
> 16. complex double arithmetic.
> 17. complex double transcendentals.
> 18. memory mapping in Windows
Or in Linux ;)
Mem alignment, alternate allocators e.g. boost::aligned_storage

Added: *4
18b) Code optimisation
Biased here but optimisations for Athlon64, P4 (and maybe others), e.g. Athlon XP (all I have at home) performance in some benchmarks is double the P4 (first two functions in benchmark.cpp) and in others horribly slower (e.g. polynomial). Can't help feeling there's a lot more still to be done to improve the processing optimisation on PC architectures.
>
> General SIMD
> 19. Support for OpenMP parallelizing.
> 20. Distributed valarrays, perhaps through MPI?
Wouldn't you use VSIPL++ if you wanted this?
> 21. Support for your favorite SIMD architecture, perhaps a GPU?
Wanna work with a CELL, Glen ;)
>
> Other areas in macstl
> 22. Other Core Foundation objects e.g. maps, sets.
> 23. Objective-C++ support.
> 24. Improving the COM implementation.
> 25. Improving the Mach vector implementation. Perhaps also a Mach
> std::string?

Added suggestions: