Serving the Quantitative Finance Community

• 1
• 2

ppauper
Topic Author
Posts: 70239
Joined: November 15th, 2001, 1:29 pm

### Re: quadruple precision complex erfc code (qerfc) in fortran

Was erfc() good enough or was qerfc() necessary?

BTW how did they come up with those numbers?
I needed quad precision and the code linked by outrun had the constants for the quad case which I could cut and paste into the fortran code
I never did find out where the numbers came from, but there's structure in there,
q1=3^2 q0
q2=5^2 q0
q3=7^2 q0
and so on

outrun
Posts: 4573
Joined: April 29th, 2016, 1:40 pm

### Re: quadruple precision complex erfc code (qerfc) in fortran

It looks like a high precision version of "Rational Chebyshev approximations for the error function" as implemented in Cernlib C300

..or maybe the root source is http://www.kurims.kyoto-u.ac.jp/~ooura/

Posts: 23951
Joined: September 20th, 2002, 8:30 pm

### Re: quadruple precision complex erfc code (qerfc) in fortran

Do you also need to increase the number of terms to get full accuracy in quad precision?

ppauper
Topic Author
Posts: 70239
Joined: November 15th, 2001, 1:29 pm

### Re: quadruple precision complex erfc code (qerfc) in fortran

The test case I'm running has an answer in the literature of 1.50448.
With the double precision error function and a 300x300 matrix, I'm getting 1.50377
and with the quad precision erfc and a 400x400 matrix, I get 1.50406
the remaining error  is because I'm truncating an infinite sum, and when I look at the coefficients at the end it's a fairly flat spectrum

ppauper
Topic Author
Posts: 70239
Joined: November 15th, 2001, 1:29 pm

### Re: quadruple precision complex erfc code (qerfc) in fortran

Do you also need to increase the number of terms to get full accuracy in quad precision?
the number of terms in the error function routine has greatly increased, it's gone from 6 terms to 26 terms so will run a lot slower

Posts: 23951
Joined: September 20th, 2002, 8:30 pm

### Re: quadruple precision complex erfc code (qerfc) in fortran

Thanks!

outrun
Posts: 4573
Joined: April 29th, 2016, 1:40 pm

### Re: quadruple precision complex erfc code (qerfc) in fortran

..that 1.504 only has 3 digits accuracy and hasn't improved that much. The new quad erfc has 32 digits accuracy, so there is a big gap between the two..

Is the linear algebra part of the computation unstable and the main source of error? Are you e.g. solving a system of equations or inverting a matrix?

Cuchulainn
Posts: 64298
Joined: July 16th, 2004, 7:38 am
Location: Drosophila melanogaster
Contact:

### Re: quadruple precision complex erfc code (qerfc) in fortran

Do you also need to increase the number of terms to get full accuracy in quad precision?
the number of terms in the error function routine has greatly increased, it's gone from 6 terms to 26 terms so will run a lot slower
If speed is an issue..

If I understand properly, you are summing a series. In that case OpenMP could be used .. it involves nothing more than a single line to achieve parallel speedup. I know OpenMP in C (my Fortran is a distant memory) and is easy to use.

https://www.dartmouth.edu/~rc/classes/i ... lause.html

Seems OMP is OK in quad

http://forum.openmp.org/forum/viewtopic.php?f=3&t=1741

// Sanity check; Kahan summation maybe

https://en.wikipedia.org/wiki/Kahan_summation_algorithm
"Compatibility means deliberately repeating other people's mistakes."
David Wheeler

http://www.datasimfinancial.com
http://www.datasim.nl

ppauper
Topic Author
Posts: 70239
Joined: November 15th, 2001, 1:29 pm

### Re: quadruple precision complex erfc code (qerfc) in fortran

..that 1.504 only has 3 digits accuracy and hasn't improved that much. The new quad erfc has 32 digits accuracy, so there is a big gap between the two..

Is the linear algebra part of the computation unstable and the main source of error? Are you e.g. solving a system of equations or inverting a matrix?
I'm solving a system $AY=B$ by row reduction rather than inverting $A$
the main source of error is I have an infinite sum which I am truncating
I have $\sum_{n=-\infty}^{\infty}f_{n}(x)y_{n}=b(x)$ where the $f_{n}(x)$ are (known) functions and the $y_{n}$ are (unknown) coefficients. This equation is true for all $x$ in some range
I truncate the series and evaluate at the $2N+1$ gridpoints $x_{m}$
$\sum_{n=-N}^{N}f_{n}(x_{m})y_{n}=b(x_{m})$
and write $A_{mn}=f_{n}(x_{m})$ and $B_{m}=b(x_{m})$ gives $\sum_{n=-N}^{N}A_{mn}y_{n}=B_{m}$

The larger $N$ is, the smaller the error, but there's a very flat spectrum: when I solve the truncated system and plot $y_{n}$ against $n$, the slope is very gradual
Last edited by ppauper on January 31st, 2018, 3:30 pm, edited 1 time in total.

ppauper
Topic Author
Posts: 70239
Joined: November 15th, 2001, 1:29 pm

### Re: quadruple precision complex erfc code (qerfc) in fortran

cuch: don't you need a parallel machine (beowulf cluster) to run MPI (or at least to take advantage of the parallelization)?

Cuchulainn
Posts: 64298
Joined: July 16th, 2004, 7:38 am
Location: Drosophila melanogaster
Contact:

### Re: quadruple precision complex erfc code (qerfc) in fortran

cuch: don't you need a parallel machine (beowulf cluster) to run MPI (or at least to take advantage of the parallelization)?
AFAIR MPI uses a network of computers and works via message passing. We did some Monte Carlo stuff on MPI a while back. You can do MPI on a single machine  But MPI is the world of the Fortran titans in research.
These days, multicore and manycore computers with shared memory are more common and easier to use e.g. using OpenMP using Fortran or C. Even laptops have 4 cores these days.
What is the algorithmic pattern you wish to parallelise? (BTW Phelim Boyle is the inventor of MC and he was doing a PhD in relativity in our maths dept during my undergrad days.)

N.B. the file is really a ps file. So you have to rename back to a .ps file and open in Ghostscript.
Attachments
boyleparallel2_this_is_a_ps_file.pdf
"Compatibility means deliberately repeating other people's mistakes."
David Wheeler

http://www.datasimfinancial.com
http://www.datasim.nl

ppauper
Topic Author
Posts: 70239
Joined: November 15th, 2001, 1:29 pm

### Re: quadruple precision complex erfc code (qerfc) in fortran

I've used a beowulf cluster in the past but don't have access to it now. You had to pay for time.
It's a bunch of computers hooked up together, but the user logs on to the front end  and it only seems like you're using 1 computer.

In the current code,  $\sum_{n=-N}^{N}f_{n}(x_{m})y_{n}=b(x_{m})$

I have 400 of  $b(x_{m})$ and $400^2$ of $f_{n}(x_{m})$ to evaluate which would scream on a parallel machine

then there's the row operations to solve the linear system and again they could be parallelized.

Cuchulainn
Posts: 64298
Joined: July 16th, 2004, 7:38 am
Location: Drosophila melanogaster
Contact:

### Re: quadruple precision complex erfc code (qerfc) in fortran

I've used a beowulf cluster in the past but don't have access to it now. You had to pay for time.
It's a bunch of computers hooked up together, but the user logs on to the front end  and it only seems like you're using 1 computer.

In the current code,  $\sum_{n=-N}^{N}f_{n}(x_{m})y_{n}=b(x_{m})$

I have 400 of  $b(x_{m})$ and $400^2$ of $f_{n}(x_{m})$ to evaluate which would scream on a parallel machine

then there's the row operations to solve the linear system and again they could be parallelized.
This looks like some kind of inverse transform or something..(quantised version of a Fredholm integral equation of the first kind?)
It looks like
1. Compute the matrix A and vector b
2. Solve AY = b
I reckon that step 1 could be parallelised in some way (each row independently of the others). Is that the rationale for using MPI?

OpenMP maybe? You have a double loop and you can parallelise each iteration of the outer loop. Reduction variables might be needed.
Here is an example of matrix multiplication in C++ (Fortran is similar).
void omp_ParallelNestedMatrixMultiply(const NestedMatrix& m1, const NestedMatrix& m2, NestedMatrix& m3)
{ // Matrix multiplication

// Assume 'compatibility' for multiplication of two matrices; OK since
// matrices are square.

double temp;

#pragma omp parallel for
for (long i = 0; i < m3.size(); ++i)
{
for (long j = 0; j < m3.size(); j++)
{
temp = 0.0;
for (long k = 0; k < m1.size(); k++)
{
temp += m1[i][k] * m2[k][j];
}
m3[i][j] = temp;
}
};
}
"Compatibility means deliberately repeating other people's mistakes."
David Wheeler

http://www.datasimfinancial.com
http://www.datasim.nl