Page **2** of **5**

### Numerical accuracy and performance ?

Posted: **December 7th, 2015, 9:02 pm**

by **Traden4Alpha**

QuoteOriginally posted by: CuchulainnQuote It's not an issue of confidence interval or drift but a more fundamental limitation of adding small numbers to large ones with a fixed-length floating point representation. Actually, in single precision floats, the problem starts showing up at much lower sample counts than 2^32 especially for calculations of higher moments.If you want a simple test of this issue, create a loop that adds 1.0 to a 32-bit float and let it loop 2^26 times. What's the result? It should be 2^26 but it won't be. You'll reach a point where x+1.0 == x.Is T4A referring to machine precision(ULP)?Exactly! Moreover, the issue with the RNG is the mapping of uniform RNs of spacing epsilon to Normal RNs of much larger spacing or anomalous compressions of the density of RNs in which some values might occur twice as often as adjacent values. The laws of large numbers mean these nonuniformities in the body of the distribution are probably harmless but the truncation of the tail is not especially in the context of QF in which banks and regulators are most concerned with tail events.

### Numerical accuracy and performance ?

Posted: **December 7th, 2015, 9:13 pm**

by **Traden4Alpha**

What's also interesting about the machine precision issue for MC is that as the number of samples grows, the chance that a sample's value will be ignored because it is less than the machine epsilon for the accumulating variable will grow. At first, only small values will fail to change the sum but as the sum grows, (SumX + X) == SumX will happen more and more often until virtually all values of X fail to change the sum.

### Numerical accuracy and performance ?

Posted: **December 8th, 2015, 7:06 am**

by **ioualalen**

Hi everyone,Thank again for the vivid debate, it is very interesting for me to see that this topic is interesting for you.Quote This reminds me a bit of IBM/Rational '-ify' products like purify, quantify etc. A non-finite number and a memory leak are not a million miles apart?I am not sure that what we do could apply here. From my perspective, memory leak are more of procedural problem, meaning that the programmer is not cautious enough or the garbage collector is not ?smart? enough. Memory is roughly an integer world, where there is almost no computation needed to use it.Quote I find this text below a bit weird: an underflow is not an error.I won?t quote all the response of everyone. I think Traden4Alpha has a very good point about that. Signaling an underflow could be done of course, some systems already does it (in Fortran for example). For and undeflow that could be interesting, however for a numerical drift a flag won?t be able to do much as this way more difficult to detect.Quote It's not an issue of confidence interval or drift but a more fundamental limitation of adding small numbers to large ones with a fixed-length floating point representation. Actually, in single precision floats, the problem starts showing up at much lower sample counts than 2^32 especially for calculations of higher moments.If you want a simple test of this issue, create a loop that adds 1.0 to a 32-bit float and let it loop 2^26 times. What's the result? It should be 2^26 but it won't be. You'll reach a point where x+1.0 == x.Sure, absorption occurs at some point but only if your values are all of the same sign, and you SumX is move toward larger and larger values. Is this always the case in a Monte Carlo simulation?When they have different signs you could have catastrophic cancellation (which is also bad), but it is not the same kind of numerical drift.Quote Take a look at Box Muller method for computing random Normal values and think about what happens when a 32-bit uniform integer is converted into a 32-bit float and then the values are transformed via Box Muller into Normal distribution. IIRC, this method never produces random values outside of about -6 to +6 sigma (i.e., it truncates the tails of the distribution) no matter how many samples you run which means it will underestimate all the even moments of the distribution and cause some anomalies in the mean and skew if the MC involves nonlinear payoffs. Alright, I will try to implement this to see how it behaves. I will inform you of the results.Quote What's also interesting about the machine precision issue for MC is that as the number of samples grows, the chance that a sample's value will be ignored because it is less than the machine epsilon for the accumulating variable will grow. At first, only small values will fail to change the sum but as the sum grows, (SumX + X) == SumX will happen more and more often until virtually all values of X fail to change the sum. Do you think we can build a case like this easily to test it?On a more open debate, do you have ever encounters this kind of issue we discuss here in your jobs?

### Numerical accuracy and performance ?

Posted: **December 8th, 2015, 11:47 am**

by **Cuchulainn**

QuoteWhen they have different signs you could have catastrophic cancellation (which is also bad), but it is not the same kind of numerical drift.What's 'numerical drift'? Can you give an example? //Thinking out loud:One thing is useful to know if a sequence {a_n: n = 0,1,2,..} (aka iterative scheme) is converging or not. We know1. if the sequence is converging then it is a Cauchy sequenceBut can we state the following?2. if the Cauchy sequence is not converging then the sequence is NOT converging? Could this be a test if the scheme is drifting into the abyss? Sooner or late it will blow up.

### Numerical accuracy and performance ?

Posted: **December 8th, 2015, 12:14 pm**

by **ioualalen**

Hi Cuchulainn, of course, a numerical drift is a progressive gap between the value you want to calculate and the value you get using floating-point arithmetic. The case of the Patriot missile is a good example of that. Here, below, in a nutshell how it goes. During the loop the value calculated accumulates more and more rounding-errors, thus causing a drift.float cpt = 0.0f;for (int i = 0 ; i < 10^9 ; i++) cpt = cpt + 0.1f;printf( cpt );Concerning your second point on Cauchy Sequence, I doubt that. It does seems that you have an equivalence between Cauchy sequence and Convergence property. At least the theorem gives you a sufficient condition, but I don't think it is a necessary one. [EDIT]In fact I may be completely wrong as this website explains it.

http://demo.activemath.org/ActiveMath2/ ... uchy_seqIf you have equivalence then you could use your second property. [/EDIT]

### Numerical accuracy and performance ?

Posted: **December 8th, 2015, 12:40 pm**

by **Cuchulainn**

QuoteCauchy sequence and Convergence property. At least the theorem gives you a sufficient condition, but I don't think it is a necessary one. My conjecture is convergence => Cauchy convergence but NOT(Cauchy) => NOT(convergence).True/not true? QuoteSo for real or complex sequences, both concepts Cauchy sequence und convergent sequence are equivalent These are "complete normed vector spaces" == Banach space. So, in a complete normed vector space Cauchy convergence is necessary and sufficient for convergence to a limit.

### Numerical accuracy and performance ?

Posted: **December 8th, 2015, 2:39 pm**

by **Traden4Alpha**

QuoteOriginally posted by: ioualalenQuote It's not an issue of confidence interval or drift but a more fundamental limitation of adding small numbers to large ones with a fixed-length floating point representation. Actually, in single precision floats, the problem starts showing up at much lower sample counts than 2^32 especially for calculations of higher moments.If you want a simple test of this issue, create a loop that adds 1.0 to a 32-bit float and let it loop 2^26 times. What's the result? It should be 2^26 but it won't be. You'll reach a point where x+1.0 == x.Sure, absorption occurs at some point but only if your values are all of the same sign, and you SumX is move toward larger and larger values. Is this always the case in a Monte Carlo simulation?When they have different signs you could have catastrophic cancellation (which is also bad), but it is not the same kind of numerical drift.Absorption may be a bigger issue than it first appears to be because:1. Estimates of the even moments (e.g., variance and kurtosis, etc.) do involve sums of only positive numbers (e.g., the sum of X^2 and sum of X^4 respectively).2. Absorption effects can occur even if the values are not all the same sign but the series is subject to statistical runs of the same sign. The distribution of SumX for sample size, N, will have a standard deviation of O(sqrt(N)) which should be OK for N<O(2^52) for series that have a mean of zero. But the distribution of higher odd moments for estimating skewness (i.e., the sum of X^3) will have a much higher standard deviation and begin absorbing at much lower N.

### Numerical accuracy and performance ?

Posted: **December 8th, 2015, 2:53 pm**

by **Traden4Alpha**

QuoteOriginally posted by: ioualalenHi Cuchulainn, of course, a numerical drift is a progressive gap between the value you want to calculate and the value you get using floating-point arithmetic. The case of the Patriot missile is a good example of that. Here, below, in a nutshell how it goes. During the loop the value calculated accumulates more and more rounding-errors, thus causing a drift.float cpt = 0.0f;for (int i = 0 ; i < 10^9 ; i++) cpt = cpt + 0.1f;printf( cpt );The "true" answer will always lie in some interval defined by propagating (and computing) various ±epsilon values (with roundup/rounddown). Based on this issue that all fixed-bit-size floating point numbers are intervals, we have two goals:1. Obviously, we'd like to minimize the width of the final answer's interval (especially avoiding the case in which the interval expands to infinity, hits a singularity, or wraps around on a periodic system). 2. Avoid the chance that the true answer is at the edge of the interval (e.g, due to drift with a consistent sign as in the case of adding a fixed value of 0.1 for 10^9 times).Some of this problem is exacerbated by a naive "more-is-better" strategy when picking samples sizes, grid densities, time steps, etc. (compounded by the growing use of faster processors, GPUs, and cloud grids). Round-off error means that accuracy follows a bathtub curve -- poor for too sparse a computation and poor for too dense a computation. This leads to a third goal:3. Finding the "just-right" Goldilocks operating parameters that maximize accuracy.

### Numerical accuracy and performance ?

Posted: **December 10th, 2015, 2:10 pm**

by **ioualalen**

Hello again everyone,just a quick update, I had to look at your code Cuchulainn with the quiet NaN. It turns out that the calculation does not produce a NaN but a -oo as log(0.0/65.0) is not NaN. Therefore you ends up dividing by -oo at some point then set everything to zero and returns zero. But I am not sure this is really an error.Quotedouble n(double x){ double A = 1.0 / std::sqrt(2.0 * 3.1415); return A * std::exp(-x*x*0.5);}double N(double x) { // An approximation to the cumulative normal distribution double a1 = 0.4361836; double a2 = -0.1201676; double a3 = 0.9372980; double k = 1.0 / (1.0 + (0.33267 * x)); if (x >= 0.0) { return 1.0 - n(x)* (a1*k + (a2*k*k) + (a3*k*k*k)); } else { return 1.0 - N(-x); }}// // Normal distributions std::cout << "N(4) " << N(4.0) << std::endl; double val2 = std::log(0.0 / 65.0); std::cout << "N(NaN) " << N(val2) << std::endl;However, if you do something like log(-1.0) then you ends up with a "proper" NaN, and then you have infinite recursion. My guess is you could use a small (and kind of weird) test, which is : if (x != x) exit(1) ;//then it's a NaNThis test should be really fast to run so performance should not be to badly affected. But I am not sure that it will go well on any kind of langage or plateform.QuoteAbsorption may be a bigger issue than it first appears to be because:So I tried to do a small Monte Carlo simulation using this piece of code :

http://quant.stackexchange.com/question ... s-in-javaI run like 10^7 simulations, but the result were pretty accurate. Using double precision number I end up with like 13 correct digits (over 16 in total).I have not done every measure for variance and kurtosis yet, but I will try. Do you have some basic implementation of these calculations that I could use ?

### Numerical accuracy and performance ?

Posted: **December 10th, 2015, 3:33 pm**

by **Cuchulainn**

Hi ioualelen,You are are correct. Here js an example with the infinite recursion:(btw C++ 11 has a test for nan which is like x != x indeed).Quote have not done every measure for variance and kurtosis yet, but I will try. Do you have some basic implementation of these calculations that I could use ?maybe Boost Distributions

### Numerical accuracy and performance ?

Posted: **December 11th, 2015, 5:25 am**

by **Cuchulainn**

QuoteOriginally posted by: Traden4AlphaQuoteOriginally posted by: CuchulainnQuote It's not an issue of confidence interval or drift but a more fundamental limitation of adding small numbers to large ones with a fixed-length floating point representation. Actually, in single precision floats, the problem starts showing up at much lower sample counts than 2^32 especially for calculations of higher moments.If you want a simple test of this issue, create a loop that adds 1.0 to a 32-bit float and let it loop 2^26 times. What's the result? It should be 2^26 but it won't be. You'll reach a point where x+1.0 == x.Is T4A referring to machine precision(ULP)?Exactly! Moreover, the issue with the RNG is the mapping of uniform RNs of spacing epsilon to Normal RNs of much larger spacing or anomalous compressions of the density of RNs in which some values might occur twice as often as adjacent values. The laws of large numbers mean these nonuniformities in the body of the distribution are probably harmless but the truncation of the tail is not especially in the context of QF in which banks and regulators are most concerned with tail events.I've been thinking about the consequences is finite precision in this context. If we take the normal y = cdf(x)we see that it is a 1:1 onto (bijective) mapping from x space to y space.This bijectivity breaks down with finite precision data types. Then I reckon the equation y = cdf(x) is N:1 because multiple x's solve this equation for a given y (inverse cdf) and on the other hand y->x is a 1:N mapping because multiple x's will produce the same y values. Of course, we can truncate or transform the x domain but that is probably not the issue.Is this the concern? Quad precision would improve things but again it is a stay of execution.

### Numerical accuracy and performance ?

Posted: **December 11th, 2015, 1:08 pm**

by **Traden4Alpha**

QuoteOriginally posted by: CuchulainnQuoteOriginally posted by: Traden4AlphaQuoteOriginally posted by: CuchulainnQuote It's not an issue of confidence interval or drift but a more fundamental limitation of adding small numbers to large ones with a fixed-length floating point representation. Actually, in single precision floats, the problem starts showing up at much lower sample counts than 2^32 especially for calculations of higher moments.If you want a simple test of this issue, create a loop that adds 1.0 to a 32-bit float and let it loop 2^26 times. What's the result? It should be 2^26 but it won't be. You'll reach a point where x+1.0 == x.Is T4A referring to machine precision(ULP)?Exactly! Moreover, the issue with the RNG is the mapping of uniform RNs of spacing epsilon to Normal RNs of much larger spacing or anomalous compressions of the density of RNs in which some values might occur twice as often as adjacent values. The laws of large numbers mean these nonuniformities in the body of the distribution are probably harmless but the truncation of the tail is not especially in the context of QF in which banks and regulators are most concerned with tail events.I've been thinking about the consequences is finite precision in this context. If we take the normal y = cdf(x)we see that it is a 1:1 onto (bijective) mapping from x space to y space.This bijectivity breaks down with finite precision data types. Then I reckon the equation y = cdf(x) is N:1 because multiple x's solve this equation for a given y (inverse cdf) and on the other hand y->x is a 1:N mapping because multiple x's will produce the same y values. Of course, we can truncate or transform the x domain but that is probably not the issue.Is this the concern? Quad precision would improve things but again it is a stay of execution.Indeed. And the density of values in the set of floating points numbers is extremely non-uniform. The range [0,1] is given about as many values as [1,∞] and there's more double-precision values stuffed in the [0,10^-153] range than the [10^-153,1] range. If they are not constructed carefully, inverse Gaussian CDF computations (e.g., Gaussian RNGs) will model the lower tail (p near 0) differently than the upper tail (p near 1).

### Numerical accuracy and performance ?

Posted: **December 11th, 2015, 8:09 pm**

by **Cuchulainn**

QuoteOriginally posted by: outrunQuoteOriginally posted by: Traden4AlphaQuoteOriginally posted by: CuchulainnQuoteOriginally posted by: Traden4AlphaQuoteOriginally posted by: CuchulainnQuote It's not an issue of confidence interval or drift but a more fundamental limitation of adding small numbers to large ones with a fixed-length floating point representation. Actually, in single precision floats, the problem starts showing up at much lower sample counts than 2^32 especially for calculations of higher moments.If you want a simple test of this issue, create a loop that adds 1.0 to a 32-bit float and let it loop 2^26 times. What's the result? It should be 2^26 but it won't be. You'll reach a point where x+1.0 == x.Is T4A referring to machine precision(ULP)?Exactly! Moreover, the issue with the RNG is the mapping of uniform RNs of spacing epsilon to Normal RNs of much larger spacing or anomalous compressions of the density of RNs in which some values might occur twice as often as adjacent values. The laws of large numbers mean these nonuniformities in the body of the distribution are probably harmless but the truncation of the tail is not especially in the context of QF in which banks and regulators are most concerned with tail events.Indeed, boost math has I've been thinking about the consequences is finite precision in this context. If we take the normal y = cdf(x)we see that it is a 1:1 onto (bijective) mapping from x space to y space.This bijectivity breaks down with finite precision data types. Then I reckon the equation y = cdf(x) is N:1 because multiple x's solve this equation for a given y (inverse cdf) and on the other hand y->x is a 1:N mapping because multiple x's will produce the same y values. Of course, we can truncate or transform the x domain but that is probably not the issue.Is this the concern? Quad precision would improve things but again it is a stay of execution.Indeed. And the density of values in the set of floating points numbers is extremely non-uniform. The range [0,1] is given about as many values as [1,∞] and there's more double-precision values stuffed in the [0,10^-153] range than the [10^-153,1] range. If they are not constructed carefully, inverse Gaussian CDF computations (e.g., Gaussian RNGs) will model the lower tail (p near 0) differently than the upper tail (p near 1).Indeed! E.g. boost math provides complement functions for this exact reason: Complement of the Cumulative Distribution FunctionSo, we need two functions to do the same thing? Is it a workaround? The basic problem is the weakness in the numerical method used (?) And we even haven't discussed n-variate cdf....

### Numerical accuracy and performance ?

Posted: **December 11th, 2015, 9:13 pm**

by **Traden4Alpha**

QuoteOriginally posted by: CuchulainnQuoteOriginally posted by: outrunQuoteOriginally posted by: Traden4AlphaQuoteOriginally posted by: CuchulainnQuoteOriginally posted by: Traden4AlphaQuoteOriginally posted by: CuchulainnQuote It's not an issue of confidence interval or drift but a more fundamental limitation of adding small numbers to large ones with a fixed-length floating point representation. Actually, in single precision floats, the problem starts showing up at much lower sample counts than 2^32 especially for calculations of higher moments.If you want a simple test of this issue, create a loop that adds 1.0 to a 32-bit float and let it loop 2^26 times. What's the result? It should be 2^26 but it won't be. You'll reach a point where x+1.0 == x.Is T4A referring to machine precision(ULP)?Exactly! Moreover, the issue with the RNG is the mapping of uniform RNs of spacing epsilon to Normal RNs of much larger spacing or anomalous compressions of the density of RNs in which some values might occur twice as often as adjacent values. The laws of large numbers mean these nonuniformities in the body of the distribution are probably harmless but the truncation of the tail is not especially in the context of QF in which banks and regulators are most concerned with tail events.Indeed, boost math has I've been thinking about the consequences is finite precision in this context. If we take the normal y = cdf(x)we see that it is a 1:1 onto (bijective) mapping from x space to y space.This bijectivity breaks down with finite precision data types. Then I reckon the equation y = cdf(x) is N:1 because multiple x's solve this equation for a given y (inverse cdf) and on the other hand y->x is a 1:N mapping because multiple x's will produce the same y values. Of course, we can truncate or transform the x domain but that is probably not the issue.Is this the concern? Quad precision would improve things but again it is a stay of execution.Indeed. And the density of values in the set of floating points numbers is extremely non-uniform. The range [0,1] is given about as many values as [1,∞] and there's more double-precision values stuffed in the [0,10^-153] range than the [10^-153,1] range. If they are not constructed carefully, inverse Gaussian CDF computations (e.g., Gaussian RNGs) will model the lower tail (p near 0) differently than the upper tail (p near 1).Indeed! E.g. boost math provides complement functions for this exact reason: Complement of the Cumulative Distribution FunctionSo, we need two functions to do the same thing? Is it a workaround? The basic problem is the weakness in the numerical method used (?) And we even haven't discussed n-variate cdf....They are the same thing conceptually, but not identical numerically. They reflect the simple fact that x != (1-(1-x)) for about half of all floating point numbers.