SERVING THE QUANTITATIVE FINANCE COMMUNITY


Cuchulainn
Topic Author
Posts: 63234
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam
Contact:

### What could go possibly wrong ...

Step over the gap, not into it. Watch the space between platform and train.
http://www.datasimfinancial.com
http://www.datasim.nl

Posts: 23951
Joined: September 20th, 2002, 8:30 pm

### Re: What could go possibly wrong ...

Well, that looks like it has divide-by-zero errors in the subterms all over the 3-D rho space, although maybe the PSD condition saves the day. Also, sqrt(x < 0) seems possible but unlikely.

Cuchulainn
Topic Author
Posts: 63234
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam
Contact:

### Re: What could go possibly wrong ...

The term under the square root is the determinant of the (positive definite?) covariance matrix. If we take rho13 = rho23 = 0 and rho12 >= 1 we get a division by zero or NaN. But what if there is noise in the data?
The term exp(-w/b) should never underflow?

BTW what's PSD?
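
For concreteness, a quick numpy sketch of that rho13 = rho23 = 0 slice (det3 is a throwaway name for the determinant term):

```python
import numpy as np

# Determinant of the 3x3 correlation matrix, written in the pairwise rhos.
def det3(r12, r13, r23):
    return 1.0 - (r12**2 + r13**2 + r23**2) + 2.0 * r12 * r13 * r23

print(np.sqrt(det3(0.9, 0.0, 0.0)))       # fine: sqrt(1 - 0.81)
with np.errstate(invalid="ignore"):
    print(np.sqrt(det3(1.0, 0.0, 0.0)))   # 0.0 -> a later 1/sqrt blows up
    print(np.sqrt(det3(1.01, 0.0, 0.0)))  # negative argument -> nan
```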

BigAndyD
Posts: 73
Joined: July 10th, 2013, 12:32 pm

### Re: What could go possibly wrong ...

PSD = Positive semidefiniteness

Posts: 23951
Joined: September 20th, 2002, 8:30 pm

### Re: What could go possibly wrong ...

> The term under the square root is the determinant of the (positive definite?) covariance matrix. If we take rho13 = rho23 = 0 and rho12 >= 1 we get a division by zero or NaN. But what if there is noise in the data?
> The term exp(-w/b) should never underflow?
>
> BTW what's PSD?
The square root term crashes if the determinant is less than zero. Both the 1/denominator and the exp term blow up if the determinant is zero, although the analytic version of the equation does not, because the exp term goes to zero much faster than the 1/sqrt term goes to infinity.

In theory, if the rho values come from a covariance matrix computed from real-valued data, PSD is guaranteed. In practice, round-off errors in the statistical sums could lead to violations of PSD and a negative determinant. Note that if the original data contains complex values, PSD is not guaranteed and the sqrt() term may crash (or need to be computed as a complex value).

If the rho values come from some other estimation process (e.g., each rho being estimated from market data, disjoint historical data sets, or the intuitions of the user), then all bets are off.
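
A sketch of how code can handle the determinant-zero case, with exp(-w/det)/sqrt(det) standing in for the actual subterms and density_factor a made-up name; testing the determinant first returns the analytic limit instead of inf * 0 = nan:

```python
import math

def density_factor(w, det, eps=1e-300):
    """exp(-w/det) / sqrt(det) with the analytic limit as det -> 0+.

    Evaluated naively at det == 0 this is an inf * 0 indeterminate form
    (or a raised ZeroDivisionError); testing det first returns the
    correct limit: for w > 0 the exp term wins and the limit is 0.
    """
    if det <= eps:
        return 0.0 if w > 0 else math.inf
    return math.exp(-w / det) / math.sqrt(det)

print(density_factor(1.0, 0.0))   # 0.0, not nan
print(density_factor(1.0, 0.25))  # well inside the feasible region
```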

Cuchulainn
Topic Author
Posts: 63234
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam
Contact:

### Re: What could go possibly wrong ...

> PSD = Positive semidefiniteness
Ah, thanks.

Cuchulainn
Topic Author
Posts: 63234
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam
Contact:

### Re: What could go possibly wrong ...

It is possible to randomly generate rhos (call them x, y, z) in [-1,1] so that the term

1 - (x^2 + y^2 + z^2) + 2xyz

becomes < 0.

Now I recall Alan had a 3d diagram of this surface.

So, if the choice of rhos is not 'good' then the normal distribution is not dependable?

Posts: 23951
Joined: September 20th, 2002, 8:30 pm

### Re: What could go possibly wrong ...

Yes, it's easy to generate pathological rhos through calculations that do not respect the analytic constraints on a covariance matrix. For example, if rho12 == 1 and rho23 == 1, then rho13 can only be exactly 1. There are actually some (messy) mutual interval constraints on the arccosines of the rho values.

Aside from round-off errors in the summations and rho calculations, no empirical covariance matrix computed from real-valued data will ever have a determinant less than zero.

It's not so much that the normal distribution is not dependable, but that it's easy to create rho values that fail to correspond to any normal distribution with real-valued parameters. That is, the error is in how rho is computed, more so than in the use of a normal distribution.

I would think that the empirically possible case in which the determinant is zero (leading to divide-by-zero errors in computing both the exp() and 1/sqrt() terms) can be handled in code by appropriate sequencing and testing of the intermediate terms.
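
To make "pathological rho" concrete: each value below is individually in [-1, 1], yet the triple fails PSD (made-up numbers):

```python
import numpy as np

# Each pairwise rho is legal on its own, but the triple is infeasible.
r12, r13, r23 = 0.9, -0.9, 0.9
C = np.array([[1.0, r12, r13],
              [r12, 1.0, r23],
              [r13, r23, 1.0]])

print(np.linalg.det(C))             # negative: sqrt() would produce nan
print(np.linalg.eigvalsh(C).min())  # negative eigenvalue: not PSD
```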

Cuchulainn
Topic Author
Posts: 63234
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam
Contact:

### Re: What could go possibly wrong ...

It's a law of gravity IMO that in the general n x n case the matrix must be PD for Cholesky to work. Just having rhos in [-1,1] is not enough (only for n = 2).

If the matrix has noise or you want to tweak, then PD can break down.
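
Indeed, a Cholesky attempt doubles as the cheapest PD test (is_positive_definite is just an illustrative helper):

```python
import numpy as np

def is_positive_definite(C):
    """Cholesky succeeds iff the symmetric matrix C is positive definite."""
    try:
        np.linalg.cholesky(C)
        return True
    except np.linalg.LinAlgError:
        return False

good = np.array([[1.0, 0.3, 0.2], [0.3, 1.0, 0.4], [0.2, 0.4, 1.0]])
bad  = np.array([[1.0, 0.9, -0.9], [0.9, 1.0, 0.9], [-0.9, 0.9, 1.0]])
print(is_positive_definite(good), is_positive_definite(bad))  # True False
```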

Posts: 23951
Joined: September 20th, 2002, 8:30 pm

### Re: What could go possibly wrong ...

Yes, the constraints on a valid correlation matrix are quite complicated.

The question is whether one should blame the formalism for being too sensitive to noise or blame the user for introducing tweaks that are inconsistent with both mathematical and physical reality?

I'd think that any contract in software would involve terms binding on both the code and the caller of the code. And if the caller of the code violates the contract, then it's a second-order matter of whether the contract on the code side states that the code checks for invalid inputs.

Cuchulainn
Topic Author
Posts: 63234
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam
Contact:

### Re: What could go possibly wrong ...

From a geometric point of view, it is a feasible region in n-space. For n = 3 the boundary is:

1 - (x^2 + y^2 + z^2) + 2xyz = 0

A nice preprocessing step might be to compute the quadratic form $x^T C x$ associated with the candidate matrix C and then minimise it on the unit sphere (possibly as a least squares problem) using Differential Evolution to find a global minimum vector x. The corresponding minimum value is > 0 for a PD matrix.
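
A sketch of that preprocessing idea with scipy's differential_evolution (the matrix C is a made-up example; the objective normalises x to the unit sphere to avoid the trivial minimiser x = 0, so the global minimum equals the smallest eigenvalue of C):

```python
import numpy as np
from scipy.optimize import differential_evolution

# Candidate correlation matrix (made-up values for illustration).
C = np.array([[1.0, 0.3, 0.2],
              [0.3, 1.0, 0.4],
              [0.2, 0.4, 1.0]])

def qform(x):
    """x^T C x restricted to the unit sphere (avoids the trivial x = 0)."""
    n = np.linalg.norm(x)
    if n < 1e-12:
        return np.inf
    x = x / n
    return x @ C @ x

# DE searches the cube; a strictly positive global minimum certifies PD.
res = differential_evolution(qform, bounds=[(-1.0, 1.0)] * 3, seed=0)
print(res.fun, np.linalg.eigvalsh(C).min())  # should agree
```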

Posts: 23951
Joined: September 20th, 2002, 8:30 pm

### Re: What could go possibly wrong ...

Nice!

The other approach is to posit a distribution around the estimated/tweaked values of rho and intersect that distribution with the constraint equation (or intersect it with a second distribution for the determinant). That would lead to a maximum likelihood solution.
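
Short of a full maximum likelihood treatment, a crude repair in the same spirit is to project the tweaked matrix back to the PSD cone by clipping negative eigenvalues and rescaling the diagonal (clip_to_psd is a made-up name; this is a simplification of the nearest-correlation-matrix problem, not the ML solution itself):

```python
import numpy as np

def clip_to_psd(C, eps=1e-8):
    """Clip negative eigenvalues to eps, then rescale so the diagonal
    is exactly 1 again (a crude nearest-correlation-style repair)."""
    w, V = np.linalg.eigh(C)
    C2 = (V * np.clip(w, eps, None)) @ V.T   # V diag(clipped w) V^T
    d = np.sqrt(np.diag(C2))
    return C2 / np.outer(d, d)

bad = np.array([[1.0, 0.9, -0.9],
                [0.9, 1.0, 0.9],
                [-0.9, 0.9, 1.0]])
fixed = clip_to_psd(bad)
print(np.linalg.eigvalsh(fixed).min())  # non-negative up to round-off
print(np.round(fixed, 3))
```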

Cuchulainn
Topic Author
Posts: 63234
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam
Contact:

### Re: What could go possibly wrong ...

> It is possible to randomly generate rhos (call them x, y, z) in [-1,1] so that the term
>
> 1 - (x^2 + y^2 + z^2) + 2xyz
>
> becomes < 0.
>
> Now I recall Alan had a 3d diagram of this surface.
>
> So, if the choice of rhos is not 'good' then the normal distribution is not dependable?
I found that figure showing the feasible region in 3D. In my tests things stayed OK if I stayed in about [-0.5, 0.5]^3. Now we can see why. Is the centre of gravity at (0,0,0)?
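
That observation is easy to Monte Carlo. Note that corners such as (0.5, 0.5, -0.5) make the determinant exactly zero, so [-0.5, 0.5]^3 just touches the boundary of the feasible region:

```python
import numpy as np

# Sample the cube and evaluate the determinant term at every point.
rng = np.random.default_rng(42)
r = rng.uniform(-0.5, 0.5, size=(100_000, 3))
det = 1 - (r**2).sum(axis=1) + 2 * r.prod(axis=1)
print(det.min())  # strictly positive: the whole cube is feasible
```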

Cuchulainn
Topic Author
Posts: 63234
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam
Contact:

### Re: What could go possibly wrong ...

> Nice!
>
> The other approach is to posit a distribution around the estimated/tweaked values of rho and intersect that distribution with the constraint equation (or intersect it with a second distribution for the determinant). That would lead to a maximum likelihood solution.
How would the algorithm for this expand?

Posts: 23951
Joined: September 20th, 2002, 8:30 pm

### Re: What could go possibly wrong ...

Nice figure (although it should extend all the way to four of the vertices of that cube).

Yes, [-0.5, 0.5]^3 is entirely safe in that it corresponds to acos(0.50) = 60° to acos(-0.5) = 120° angles between the original data vectors. But any correlations outside [-0.5,0.5] can induce constraints on the other rhos.

Expanding the algorithm would probably mean either:

1) Using the equation for the determinant of the N x N matrix.

2) Doing an eigendecomposition of the matrix with a sensitivity analysis WRT the rho values, and finding the perturbations that bring all the negative-valued modes to zero.

3) Expressing the constraint in terms of acos(rho) angles (and distributions of those angles) with combinatoric logic for which chains of rhos are mutually incompatible.
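
For the 3x3 case, option 3 is concrete: the acos constraints reduce to the spherical triangle inequalities (function names below are throwaway):

```python
import numpy as np

def det3(x, y, z):
    return 1 - (x**2 + y**2 + z**2) + 2 * x * y * z

def feasible_by_angles(r12, r13, r23):
    # With theta_ij = acos(rho_ij), the 3x3 correlation matrix is PSD
    # iff the angles satisfy these spherical triangle inequalities.
    t12, t13, t23 = np.arccos([r12, r13, r23])
    return abs(t12 - t23) <= t13 <= min(t12 + t23, 2 * np.pi - (t12 + t23))

# Cross-check against the determinant criterion on random triples.
rng = np.random.default_rng(0)
agree = all(feasible_by_angles(*r) == (det3(*r) >= 0)
            for r in rng.uniform(-1, 1, size=(1000, 3)))
print(agree)
```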