spv205
Posts: 478
Joined: July 14th, 2002, 3:00 am

### Optimized math functions

Outrun, I agree - I was just trying to understand why you wanted to do Remez on [1/sqrt(pi) exp(-.5 x^2)] (to me that was only useful if it was a scalar Gaussian, and you then took the product). Otherwise it would seem better to calculate the Euclidean distance and then do Remez etc. on exp - in which case AVT's suggestion was useful... but AVT is following this up.

Quote (originally posted by: spv205): exp(-1/2 sum x^2) is slower than product exp(-1/2 x^2) [if you can be bothered!]

Quote: Good idea (to check order)!... I think it's the other way around though. The left one has only 1x exp(), the right one N x exp().

Cuchulainn
Posts: 62583
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam
Contact:

### Optimized math functions

Quote (originally posted by: outrun): just found that boost has the Remez algorithm (boost/math/tools/remez.hpp), so that should make finding min-max approximation coefficients very easy! [boost doc]

It would be nice if you would comment on this algorithm in Boost if you get some time. I did a few experiments with Brent's method in Maths and it worked well for the test cases.
Step over the gap, not into it. Watch the space between platform and train.
http://www.datasimfinancial.com
http://www.datasim.nl

Cuchulainn

### Optimized math functions

Maybe I can have a look at Remez then, if you like, if someone can specify the problem (exp(z) in range [A, B], desired accuracy, etc.).
Cuchulainn

### Optimized math functions

Quote (originally posted by: outrun):

Quote (originally posted by: Cuchulainn): They wrote x = N log(2)/2^K + r, and then exp(x) = 2^(N/2^K) * exp(r). Now:
1. They use Remez to approximate exp(r) by a polynomial of degree 5 (==> solve a matrix system).
2. They typically get r < log(2)/64 ≈ 0.011, which can be done with Pade, so why use Remez?

..we can start with this.. can we get K directly from the last bits (the exponent) of the float? We can then do various order approximations of exp(r) on r in [0 .. 1/2].

Here are some tests for Pade R(m,n) == Rmn at z = 1/2 (since exp(z) is a monotonically increasing function of z).

z = 1/2:
R01: 2
R11: 1.66667
R22: 1.64865
R33: 1.64872
exp(z): 1.64872

Going outside the radius of convergence, to see what happens at z = 1:
R01: blows up, 1/(1 - z)
R11: 3
R22: 2.71429
R33: 2.71831
exp(z): 2.71828

For r = log(2)/64 we get:
R01: 1.01095
R11: 1.01089
R22: 1.01089
R33: 1.01089
exp(z): 1.01089

BTW the polynomial

Quote: exp(z) = (32793 + 31836 z + 21146 z^2)/32768

gives 1.64787 at z = 1/2 and 2.61765 at z = 1. Not surprising, because polynomials are in general not very good at approximation when they go outside their domain. For r = log(2)/64 it gives 1.01136, clearly worse than R(m,n).

I have not done Boost Remez yet.
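The closed-form Pade coefficients for exp are standard, so the table above can be reproduced in a few lines. A minimal sketch (the helper name `pade_exp` is just illustrative, not from any library):

```python
import math

# Pade approximant R(m,n) of exp(z), using the standard closed form:
#   P_m(z) = sum_j (m+n-j)! m! / ((m+n)! j! (m-j)!) z^j
#   Q_n(z) = the same with n in place of m, evaluated at -z
def pade_exp(m, n, z):
    f = math.factorial
    num = sum(f(m + n - j) * f(m) / (f(m + n) * f(j) * f(m - j)) * z**j
              for j in range(m + 1))
    den = sum(f(m + n - j) * f(n) / (f(m + n) * f(j) * f(n - j)) * (-z)**j
              for j in range(n + 1))
    return num / den

for z in (0.5, 1.0, math.log(2) / 64):
    print([round(pade_exp(k, k, z), 5) for k in (1, 2, 3)], round(math.exp(z), 5))
```

For z = 0.5 this prints [1.66667, 1.64865, 1.64872] against exp(0.5) = 1.64872, matching the R11/R22/R33 rows above.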
Last edited by Cuchulainn on October 13th, 2010, 10:00 pm, edited 1 time in total.
Cuchulainn

### Optimized math functions

Quote (originally posted by: outrun):

Quote (originally posted by: Cuchulainn): Do Microsoft/Intel use Pade, or is it something else?

This is how Intel does it: The Computation of Transcendental Functions on the IA-64 Architecture

I had another look at this paper. Apart from h/w and machine dependencies, I really have not been able to discover the rationale (no pun intended) for why the authors use Remez. A glimpse could be remark 4 on page 3 ("do not fear polynomials"); polynomials are not very good, they are not even wrong. Some of the features of Remez that I suspect will affect performance are:

1. It is an iterative scheme.
2. It calculates Chebyshev points cos(x[k]) ==> yet another trig function.
3. The derivatives of the polynomial up to order 2 are needed.
4. LU decomposition.
5. It needs high-precision arithmetic.

First impression is that it is top-heavy and not optimal in this case. The reasons for using it are not compelling IMO.
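Of the items above, the Chebyshev points plus linear solve (items 2 and 4) form the core of each iteration. A minimal single-iteration sketch, assuming a fixed Chebyshev reference set instead of the full exchange loop (`remez_step` is an illustrative name, not the Boost API):

```python
import numpy as np

# One Remez-style step for a degree-n minimax polynomial to f on [a, b]:
# at n+2 reference points x_i, solve p(x_i) + (-1)^i * E = f(x_i)
# for the n+1 coefficients and the levelled error E.
def remez_step(f, a, b, n):
    k = np.arange(n + 2)
    # Chebyshev extrema mapped to [a, b] (the cos(...) evaluations of item 2)
    x = 0.5 * (a + b) + 0.5 * (b - a) * np.cos(np.pi * k / (n + 1))
    A = np.vander(x, n + 1, increasing=True)       # columns 1, x, ..., x^n
    A = np.hstack([A, ((-1.0) ** k)[:, None]])     # alternating-sign error column
    sol = np.linalg.solve(A, f(x))                 # the linear solve of item 4
    return sol[:-1], sol[-1]                       # coefficients, levelled error E

coeffs, E = remez_step(np.exp, 0.0, np.log(2) / 64, 2)
print(coeffs, E)
```

On r < log(2)/64 the levelled error E of a degree-2 polynomial already comes out below 1e-7, consistent with the observation that low orders suffice on such a small interval.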
Last edited by Cuchulainn on October 13th, 2010, 10:00 pm, edited 1 time in total.
Cuchulainn

### Optimized math functions

Quote (originally posted by: outrun):

Quote (originally posted by: Cuchulainn):
z = 1/2
R01: 2
R11: 1.66667
R22: 1.64865
R33: 1.64872
exp(z): 1.64872

..so is it correct that the final approximation equation is something like?

exp(z) ~ (2 + 1.66667*z) / (1 + 1.64864*z + 1.64872*z^2)
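Note that the quoted numbers are the values of R01, R11, R22, R33 at the single point z = 1/2, not the coefficients of one rational function. A minimal sketch of the actual diagonal approximants (standard Pade forms for exp):

```python
# Diagonal Pade approximants of exp(z); the earlier table lists their
# *values* at z = 1/2, not coefficients of a single rational function.
R11 = lambda z: (1 + z/2) / (1 - z/2)
R22 = lambda z: (1 + z/2 + z**2/12) / (1 - z/2 + z**2/12)
R33 = lambda z: (1 + z/2 + z**2/10 + z**3/120) / (1 - z/2 + z**2/10 - z**3/120)

for r in (R11, R22, R33):
    print(round(r(0.5), 5))  # 1.66667, 1.64865, 1.64872
```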
Cuchulainn

### Optimized math functions

Quote (originally posted by: outrun): The steps you describe are pre-processing steps, with the goal of estimating the coefficients of a Chebyshev approximation. The end result is something like this:

exp(z) = (32793 + 31836 z + 21146 z^2)/32768

..nothing but a couple of simple +, * and / float operators [...] edit: That's why I said earlier that you can pick any way to estimate these coefficients: C++, Matlab, Mathematica, a simple google search..

I am not exactly sure what you are saying here. Preprocessing is also time-consuming. The polynomial solution is not very robust.
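outrun's point about runtime cost can be sketched with a Horner evaluation of the quoted polynomial (coefficients exactly as quoted; dividing by 32768 = 2^15 is a shift in fixed point):

```python
import math

# Horner form of the quoted approximation: two multiplies, two adds, one
# divide by a power of two.
def exp_poly(z):
    return (32793 + z * (31836 + z * 21146)) / 32768.0

for z in (0.5, 1.0, math.log(2) / 64):
    print(round(exp_poly(z), 5), round(math.exp(z), 5))
```

This reproduces the figures quoted earlier: 1.64787 at z = 1/2, 2.61765 at z = 1, and 1.01136 at r = log(2)/64.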
Last edited by Cuchulainn on October 13th, 2010, 10:00 pm, edited 1 time in total.
Cuchulainn

### Optimized math functions

Quote (originally posted by: outrun): Great! This is very simple and to the point! And it allows for error analysis. And R33 is optimized for z in [0,1]? I have a deadline tonight, working my *** off, but will contribute soon.. I'll do the Chebyshev polynomial approximation for various numbers of terms. One quick thing... why does Intel choose to optimize the (rescalable) region [0, 1/2] instead of the more common [0,1] or [-1,1]? Has that to do with the storage of the exponent in floating-point notation? If so, we should do that too, otherwise we waste cycles on processing that part as well (the exponent, K).

R33 at z = 1 is accurate to 4/5 decimal places; even for z = 2, R33 = 7.4 while exp(z) = 7.38. I am not sure why [0, 1/2]. The rationale behind the numerical analysis is not clear, as I mentioned. BTW you need the NTL library for Boost Remez. Maybe R22 is OK as well.
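The range reduction under discussion can be sketched end to end; this version takes N with floor() instead of reading the float's exponent bits directly, and its [0, log 2) reduction is coarser than Intel's [0, log(2)/2^K] (`exp_reduced` is an illustrative name):

```python
import math

# Write x = N*log(2) + r with r in [0, log 2), so exp(x) = 2^N * exp(r),
# and approximate exp(r) with the Pade R33 discussed above.
def R33(z):
    return (1 + z/2 + z**2/10 + z**3/120) / (1 - z/2 + z**2/10 - z**3/120)

def exp_reduced(x):
    n = math.floor(x / math.log(2))
    r = x - n * math.log(2)        # r in [0, log 2)
    return math.ldexp(R33(r), n)   # 2^n * R33(r)

for x in (0.3, 2.5, -4.0, 10.0):
    print(exp_reduced(x), math.exp(x))
```

Even with this coarse reduction R33 keeps the relative error around 1e-6 or better; a finer reduction (larger K, hence smaller r) is what lets Intel get away with low-order approximations.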