October 5th, 2010, 3:12 pm
List of suspects: Intel Compiler's Math Library (ICML), Julien Pommier's SSE math library (SSE_Math), MS intrinsic math functions (MS).
Platform: Visual Studio 2008, WinXP box running on a Core 2 Duo.
Math functions tested: exp, log, sin, cos.
Remarks: ICML is part of Intel's compiler suite. ICML automatically replaces cmath functions with optimized intrinsics. MS automatically replaces the stock cmath functions.

And the verdict is...
Normalized single precision performance (higher is faster): MS (1), SSE_Math (4), ICML (6).
ICML single precision performance is 6 times as fast as MS and 1.5 times as fast as SSE_Math.

But...
ICML must be used in a loop. The loop must be auto-vectorized by the compiler. Not all loops can be auto-vectorized. Using std::vector instead of a plain array will prevent vectorization. Aliased pointers will prevent vectorization. Many other things can prevent vectorization. Intel Compiler's manual provides some info on how to facilitate vectorization.

SSE_Math has full single precision accuracy. SSE_Math can compute sin and cos of the same argument in 1 pass with almost zero overhead. It can be 8 times faster than MS in some circumstances.
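To illustrate the vectorization remarks, here is a minimal sketch of a loop written the way auto-vectorizers tend to like it: plain arrays and pointers marked as non-aliased. The function name exp_array is made up for this example, and __restrict is a compiler extension (accepted by MSVC, GCC, Clang and the Intel compiler), not standard C++. Whether the loop really gets vectorized still depends on the compiler, its version and the optimization flags, so check the vectorization report.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helper, not part of any of the libraries tested above.
// Plain pointers plus __restrict promise the compiler that the input and
// output ranges never overlap, which removes the aliasing obstacle.
void exp_array(const float* __restrict in, float* __restrict out, std::size_t n)
{
    // Document the no-alias contract in debug builds.
    assert(n == 0 || in != out);

    // Simple counted loop, one math call per element, no branches:
    // a shape the auto-vectorizer can map onto a vectorized exp.
    for (std::size_t i = 0; i < n; ++i)
        out[i] = std::exp(in[i]);
}
```

If the data lives in a std::vector, passing &v[0] and v.size() to a function like this hands the compiler plain pointers while keeping the container elsewhere in the code.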
Last edited by
renorm on October 4th, 2010, 10:00 pm, edited 1 time in total.