Cuchulainn
Posts: 22931
Joined: July 16th, 2004, 7:38 am

Re: Python tricks

September 16th, 2020, 7:36 pm

Hi, 

Sorry if I'm on the wrong board here - new to the forum. 

I'm trying to get to grips with Cython - just doing a basic explicit finite difference function and trying to test the performance gains of various implementations. I know my code is working, and it's an order of magnitude quicker than pure python/numpy, but the numba jit compilation is another 10x faster than my Cython code - is anyone familiar with C/Cython and able to spot the bottleneck in the following please? It's definitely something to do with my V[:,:] array but I don't know how to optimise this further. 

Can obviously just use the numba version for speed but feel like I should be able to at least get to the same with Cython... so wondering what I've missed.

Thanks!!

Numpy/Numba versions (~1.5ms and 5 microseconds, respectively):
import numpy as np
import numba as nb
def FDEur_py(option_type, vol, r, K, T, n_ds):
    ds = 2 * K / n_ds
    dt = 0.9 / vol ** 2 / n_ds ** 2
    s = np.linspace(0, 2 * K, n_ds + 1)  # always n_ds+1 points; arange with a float step can overshoot by one
    n_dt = round(T / dt)
    dt = T / n_dt
    V = np.empty((n_ds+1, n_dt+1))
    
    q = 1 if option_type == 'C' else -1
    
    V[:,0] = np.maximum(q * (s - K),0)
    
    for k in range(1,n_dt+1):
        for i in range(1,n_ds):
            delta = (V[i+1,k-1] - V[i-1,k-1]) / 2/ds
            gamma = (V[i+1,k-1] - 2*V[i,k-1] + V[i-1,k-1]) / ds/ds
            theta = -0.5 * vol ** 2 * s[i] ** 2 * gamma - r * s[i] * delta + r * V[i,k-1]
            V[i,k] = V[i,k-1] - dt * theta
        
        V[0,k] = V[0,k-1] * (1 - r * dt)
        V[n_ds,k] = 2 * V[n_ds-1,k] - V[n_ds-2,k]
    
    return V

FDEur_nb = nb.jit(FDEur_py)

Cython attempt (~50 microseconds):
%%cython
import numpy as np
cimport numpy as np

def FDEur(str option_type, float vol, float r, float K, float T, int n_ds):
    cdef double ds = 2 * K / n_ds
    cdef double dt = 0.9 / vol ** 2 / n_ds ** 2
    cdef int n_dt = round(T / dt)
    cdef double[:] s = np.zeros(n_ds+1)
    cdef double[:,:] V = np.zeros((n_ds+1,n_dt+1))
    cdef int q, k, i
    
    dt = T / n_dt
    q = 1 if option_type == 'C' else -1
    
    for i in range(0,n_ds+1):
        s[i] = i * ds
        V[i,0] = max(q * (s[i] - K),0)
    
    for k in range(1,n_dt+1):
        for i in range(1,n_ds):
            delta = (V[i+1,k-1] - V[i-1,k-1]) / 2/ds
            gamma = (V[i+1,k-1] - 2*V[i,k-1] + V[i-1,k-1]) / ds/ds
            theta = -0.5 * vol ** 2 * s[i] ** 2 * gamma - r * s[i] * delta + r * V[i,k-1]
            V[i,k] = V[i,k-1] - dt * theta
        
        V[0,k] = V[0,k-1] * (1 - r * dt)
        V[n_ds,k] = 2 * V[n_ds-1,k] - V[n_ds-2,k]
    
    return np.array(V)
Hi ZSG,
I sent you a PM (Private Mail), top right corner of screen.
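
For cross-checking any faster variant against the NumPy reference above, here is a minimal sketch that replaces the inner space loop with array slices. It assumes the same explicit scheme and boundary conditions as FDEur_py. The name fd_eur_vectorised and the grid built with np.arange(n_ds + 1) * ds are my own choices, not something from the original post. Separately, and only as a likely suspect rather than a measured result: in the Cython attempt delta, gamma and theta are never declared with cdef, so they are handled as Python objects inside the hot loop, and the float arguments are C single precision rather than double.

```python
import numpy as np

def fd_eur_vectorised(option_type, vol, r, K, T, n_ds):
    """Explicit finite-difference grid; the inner space loop is replaced by slices."""
    ds = 2 * K / n_ds
    dt = 0.9 / vol ** 2 / n_ds ** 2      # same stability-driven step as the original
    n_dt = round(T / dt)
    dt = T / n_dt
    s = np.arange(n_ds + 1) * ds         # assumption: grid built without a float step
    q = 1 if option_type == 'C' else -1

    V = np.empty((n_ds + 1, n_dt + 1))
    V[:, 0] = np.maximum(q * (s - K), 0)  # payoff at expiry
    s_in = s[1:-1]                        # interior grid points

    for k in range(1, n_dt + 1):
        prev = V[:, k - 1]
        delta = (prev[2:] - prev[:-2]) / (2 * ds)
        gamma = (prev[2:] - 2 * prev[1:-1] + prev[:-2]) / ds ** 2
        theta = -0.5 * vol ** 2 * s_in ** 2 * gamma - r * s_in * delta + r * prev[1:-1]
        V[1:-1, k] = prev[1:-1] - dt * theta
        # Boundary conditions copied from the original code
        V[0, k] = prev[0] * (1 - r * dt)
        V[n_ds, k] = 2 * V[n_ds - 1, k] - V[n_ds - 2, k]
    return V

grid = fd_eur_vectorised('C', 0.2, 0.05, 100.0, 1.0, 20)
print(grid.shape)
```

Comparing the output of this against FDEur_py with np.allclose gives a quick regression check before timing anything.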
 
ZeroSumGame
Posts: 3
Joined: January 23rd, 2020, 11:26 am

Re: Python tricks

September 16th, 2020, 8:05 pm


Hi ZSG,
I sent you a PM (Private Mail), top right corner of screen.

Hey - apparently I'm still too new to be able to send PMs! But unfortunately I don't know if I can help, sorry - I'm just working in a Jupyter notebook and learned what I know from chapter 10 of Yves Hilpisch's Python for Finance, then just started trying different problems. I haven't attempted a proper setup of .pyx files or anything yet. 
 
Cuchulainn
Posts: 22931
Joined: July 16th, 2004, 7:38 am

Re: Python tricks

October 1st, 2020, 1:46 pm

What do masked arrays offer when compared to normal arrays?
 
bearish
Posts: 5906
Joined: February 3rd, 2011, 2:19 pm

Re: Python tricks

October 1st, 2020, 2:38 pm

Covid protection?
 
Cuchulainn
Posts: 22931
Joined: July 16th, 2004, 7:38 am

Re: Python tricks

October 1st, 2020, 4:45 pm

Covid protection?
Inside every masked array hides a normal array trying to get out. It's a cover.
 
katastrofa
Posts: 7931
Joined: August 16th, 2007, 5:36 am
Location: Event Horizon

Re: Python tricks

October 1st, 2020, 5:29 pm

They are a convenient way of creating masks, e.g. for missing values or special values indicating missing values (in some surveys negative numbers are used to indicate that the answer wasn't obtained for various reasons).

import numpy as np
import numpy.ma as ma
zorro = np.array([2, 0, 12, 12, 0])
masked_zorro = ma.masked_less_equal(zorro, 0)  # mask every entry <= 0
print('Mean Zorro: {} vs. Mean masked Zorro: {}'.format(zorro.mean(), masked_zorro.mean()))

RTFM answer: https://numpy.org/doc/stable/reference/ ... #rationale
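
To make the survey case concrete, here is a small sketch with made-up answer codes (negative numbers standing in for "no answer"; the data and variable names are purely illustrative). It shows that the original values stay available next to the mask, and that the mask can be filled with a replacement value when a plain array is needed.

```python
import numpy as np
import numpy.ma as ma

# Hypothetical survey answers; negative codes mean "no answer obtained"
answers = np.array([4, -1, 5, -9, 3])
masked = ma.masked_less(answers, 0)

print(masked.mean())     # 4.0, computed over the three valid answers only
print(masked.count())    # 3 unmasked entries
print(masked.mask)       # [False  True False  True False]
print(masked.data)       # the underlying array, sentinel codes included
print(masked.filled(0))  # [4 0 5 0 3], a plain ndarray with masked slots replaced
```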
 
Cuchulainn
Posts: 22931
Joined: July 16th, 2004, 7:38 am

Re: Python tricks

October 1st, 2020, 5:40 pm

They are a convenient way of creating masks, e.g. for missing values or special values indicating missing values (in some surveys negative numbers are used to indicate that the answer wasn't obtained for various reasons).

import numpy as np
import numpy.ma as ma
zorro = np.array([2, 0, 12, 12, 0])
masked_zorro = ma.masked_less_equal(zorro, 0)  # mask every entry <= 0
print('Mean Zorro: {} vs. Mean masked Zorro: {}'.format(zorro.mean(), masked_zorro.mean()))

RTFM answer: https://numpy.org/doc/stable/reference/ ... #rationale
Thanks; I was looking for a second opinion, maybe an application that no one else thought of.
 
katastrofa
Posts: 7931
Joined: August 16th, 2007, 5:36 am
Location: Event Horizon

Re: Python tricks

January 4th, 2021, 10:33 am

{True: 0, 1: False}
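
For anyone puzzling over that literal: as far as I know it collapses to a single entry. True and 1 compare equal and hash the same, so the dict treats them as one key; the first key object (True) is kept and the later value (False) overwrites the earlier one. A quick check:

```python
d = {True: 0, 1: False}

print(d)                                  # {True: False}
print(len(d))                             # 1
print(True == 1, hash(True) == hash(1))   # True True
```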
 
Cuchulainn
Posts: 22931
Joined: July 16th, 2004, 7:38 am

Re: Python tricks

January 4th, 2021, 9:14 pm

{True: 0, 1: False}
Inconceivable.
 
katastrofa
Posts: 7931
Joined: August 16th, 2007, 5:36 am
Location: Event Horizon

Re: Python tricks

January 5th, 2021, 12:49 am

You keep using Python's bool... I don't think it means what you think it means.

l = [0,1,2]
l[True]
l[False]
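
A runnable version of the same point, as a sketch: bool is a subclass of int, so True and False behave as the indices 1 and 0 and take part in arithmetic like any other integer.

```python
l = [0, 1, 2]

print(l[True])                # 1, because True is the integer 1
print(l[False])               # 0, because False is the integer 0
print(isinstance(True, int))  # True
print(True + True)            # 2
```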
 
Cuchulainn
Posts: 22931
Joined: July 16th, 2004, 7:38 am

Re: Python tricks

January 5th, 2021, 1:00 pm

You keep using Python's bool... I don't think it means what you think it means.

l = [0,1,2]
l[True]
l[False]
Doublethink requires using logic against logic or suspending disbelief in the contradiction.

// Python 4.9
 
tags
Topic Author
Posts: 3603
Joined: February 21st, 2010, 12:58 pm

Re: Python tricks

April 23rd, 2021, 12:44 pm

Is there a clean way to compare two separate dataframes in long format (dimension1, dimension2, ..., dimensionN, value)?
I have a pile of csv files, a new one published each month, containing accumulated values, and I need to show what has been going on during each month. Any comments on this are very welcome!

EDIT: maybe I can pass each dimension as its own index level (the MultiIndex feature of pd.DataFrame) while the values are left as pandas columns. It is then easy to take the difference between rows across the different columns.
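
A minimal sketch of that MultiIndex idea, assuming two hypothetical monthly snapshots with made-up dimension names (region, product) and cumulative values. Each snapshot is indexed by its dimensions, the snapshots are put side by side as columns, and a column-wise diff gives the change during each month. Treating a combination that is missing from a snapshot as 0 is an assumption that may not suit every dataset.

```python
import pandas as pd

dims = ["region", "product"]  # hypothetical dimension columns

# Hypothetical cumulative snapshots in long format (in practice, one per csv file)
jan = pd.DataFrame({"region": ["EU", "EU", "US"],
                    "product": ["A", "B", "A"],
                    "value": [10.0, 5.0, 7.0]})
feb = pd.DataFrame({"region": ["EU", "EU", "US", "US"],
                    "product": ["A", "B", "A", "B"],
                    "value": [14.0, 5.0, 9.0, 2.0]})

# One column per month, rows aligned on the union of dimension combinations
wide = pd.concat({"2021-01": jan.set_index(dims)["value"],
                  "2021-02": feb.set_index(dims)["value"]}, axis=1)
wide = wide.fillna(0.0)        # assumption: absent combination means nothing accumulated yet

monthly = wide.diff(axis=1)    # change between consecutive months; first column is NaN
print(monthly)
```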
 
katastrofa
Posts: 7931
Joined: August 16th, 2007, 5:36 am
Location: Event Horizon

Re: Python tricks

April 23rd, 2021, 7:42 pm

You didn't give too many details, but is the diff enough to compare such multidimensional datasets?
Can the components be correlated? Python has lots of nice clustering algorithms, multidimensional scaling, etc. (mostly sklearn) for that. Functional data analysis, baby! :-D

You can simply do diff on a multidimensional df, I think.
 
tags
Topic Author
Posts: 3603
Joined: February 21st, 2010, 12:58 pm

Re: Python tricks

April 23rd, 2021, 8:03 pm

Thanks for the suggestion, kat. Apologies, I wasn't clear enough earlier. My dataset was simple fundamental market data. I went with the multi-indexing solution that popped into my mind (the one I quickly commented on in my previous post). I hadn't visited the sklearn website for ages, I have to say. The library seems to have grown substantially.