- Cuchulainn
**Posts:**62901**Joined:****Location:**Amsterdam-
**Contact:**

QuoteOriginally posted by: outrunjust for the record (now that you have quotes it)=> I made a mistake stating that the error is 3% with exp(z) = (32793 +31836 z + 21146 z^2)/32768the formula is correct, and the error is not 3%, but +/-0.00086 on the interval z=[ 0.. 1/2] for exp(z)How were the numbers 327.. etc. arrived at? (step 1 in Remez?)

Step over the gap, not into it. Watch the space between platform and train.

http://www.datasimfinancial.com

http://www.datasim.nl

http://www.datasimfinancial.com

http://www.datasim.nl

OutrunI guess you are not able to be a bit more specific on your overall algorithm/goals? Are there no higher level optimisations? curious

OutrunI don't know much about the EM algorithm workings and your data set, but doesn't the exponential effectively mean that only points in a small radius of the mean (at any one time) have an impact? So can you just ignore points outside some #number of standard deviations and not do the exp calc?

What is verdict on inverse CDF? We have Acklam for doubles and Shaw-Brickman (ICNDfloat2) for singles. Acklam's formula has 5% (two 2.5% branches) branch which uses sqrt and log. Shaw-Brickman has no branches, but uses log every time. Acklam formula is more accurate, but looses too much by switching to singles.One thing that puzzles me about these formulas is why they rely on math function? Math functions themselves are approximated using similar formulas. Why not to approximate the final result directly?Acklam's formula uses math functions only 5% of time, but both sqrt and log are evaluated. We can't use SIMD versions of math function in the branches. Without SIMD Shaw-Brickman it is slower and less accurate than Acklam, at least on CPU.

OutrunI guess you've gone through this all, but can't you just zero those probabilities (so the probability still sums to 1)? In other words for each point store the most likely class (from the previous iteration) then revalue starting with that class and threshold based on some multiple of that level? On another note would you mind clarifyingQuoteanother thing to note is that *I* need exp to evaluate the Gaussian density, so I guess I'm better of to approximate the function f(x) = exp(-1/2 x^2) / sqrt(2 pi) directly using Remezexp(-1/2 sum x^2) is slower than product exp(-1/2 x^2) [if you can be bothered!]

- Cuchulainn
**Posts:**62901**Joined:****Location:**Amsterdam-
**Contact:**

QuoteOriginally posted by: outrunjust found that boost has the Remez algorithm (boost/math/tools/remez.hpp), so that should make finding min-max approximation coefficients very easy!boost docWell, you should know, you are in the Boost Maths consortium See also Mon Oct 11, 10 06:38 PM

Last edited by Cuchulainn on October 11th, 2010, 10:00 pm, edited 1 time in total.

Step over the gap, not into it. Watch the space between platform and train.

http://www.datasimfinancial.com

http://www.datasim.nl

http://www.datasimfinancial.com

http://www.datasim.nl

- Cuchulainn
**Posts:**62901**Joined:****Location:**Amsterdam-
**Contact:**

QuoteOriginally posted by: outrunQuoteOriginally posted by: CuchulainnSee also Mon Oct 11, 10 06:38 PM Ah I didn;t notice that first time around! Nicely spotted!I only wrote a boost math function *once*, the code itself was not much more than "exp(abs(x))", but it ended up being quite a bit of extra work (docs, plots, tables) to get it in! That's why I suggest to keep that for phase 2.Boost Maths is well documented; Mr. Maddock, Bristow and colleagues are doing an admirable job imo.The code part is the easy part (like a dress rehearsal), it's the plot/doc that is vital.

Step over the gap, not into it. Watch the space between platform and train.

http://www.datasimfinancial.com

http://www.datasim.nl

http://www.datasimfinancial.com

http://www.datasim.nl

Do I read your reply from Tue Oct 12, 10 03:39 PM correctly, that N = pdfN and you dosome 'binning' up to ~ 3 or 4 decimal places?Then this would be far more easy the implementing some exp, as it only asks for the*absolute* error: while code tries to minimize relative error at the same time and thishas its troubles towards the tails - while binning would certainly not care not there.Especially in that case you only need some 'exp' on the negative axis with absoluteerror below 4 significant decimals - did I understand it correctly that way?

GZIP: On