Python: socket realtime prices and Pandas dataframes
Posted: April 8th, 2021, 4:48 pm
Hi everyone,
I have a Python script that uses the API of a broker platform I use. It connects (on localhost) to the platform software via a socket, and with a socket.recv() call I receive the requested price-feed information (time, symbol, bid, ask, last, last quantity, volume, etc.).
Now I want to analyze these records, so I append them to one big pandas DataFrame called 'priceAll':
Code: Select all
import pandas as pd
from io import StringIO

....  # socket setup elided

while True:
    msg = mysock.recv(16384)
    msg_stringa = str(msg, 'utf-8')
    read_df = pd.read_csv(StringIO(msg_stringa), sep=';',
                          error_bad_lines=False,  # deprecated since pandas 1.3; newer versions use on_bad_lines='skip'
                          index_col=None, header=None,
                          engine='c', names=range(33),
                          decimal='.')
    # append() already returns a new frame, so no extra .copy() is needed
    priceAll = priceAll.append(read_df, ignore_index=True)
I then read and analyze 'priceAll', making local copies inside other functions, e.g.:
- summing all quantities per symbol and price, so I have the volume at each specific price level
- summing all quantities per second, to get second-based volumes
- etc.
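For reference, the two aggregations above can be sketched like this (the column names `time`, `symbol`, `price`, `quantity` are assumptions, since the actual 8-column schema isn't shown in the post):

```python
import pandas as pd

# Toy frame standing in for priceAll (hypothetical schema)
priceAll = pd.DataFrame({
    'time': pd.to_datetime(['2021-04-08 15:00:00.2',
                            '2021-04-08 15:00:00.7',
                            '2021-04-08 15:00:01.1']),
    'symbol': ['AAPL', 'AAPL', 'MSFT'],
    'price': [130.0, 130.0, 250.5],
    'quantity': [100, 50, 200],
})

# Volume per symbol and price level
vol_by_price = priceAll.groupby(['symbol', 'price'])['quantity'].sum()

# Volume per second (resample on the time column)
vol_by_second = priceAll.resample('1s', on='time')['quantity'].sum()
```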
This whole workflow works quite well (each loop takes roughly 100-200 ms, and I almost never encounter missed values or reading problems), and the DataFrame 'priceAll' grows to roughly 4,000-5,000 rows by 8 columns. Rows older than 10 minutes are automatically dropped with this:
Code: Select all
# keep only rows from the last 600 s (10 minutes)
priceAll = priceAll[(now - priceAll['time']).astype('timedelta64[s]') < 600].copy()
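For context, here is the same 10-minute trim on a toy frame, written with `.dt.total_seconds()` (equivalent to the `astype('timedelta64[s]')` spelling, and portable across pandas versions; the column names are assumptions):

```python
import pandas as pd

now = pd.Timestamp('2021-04-08 15:00:00')
priceAll = pd.DataFrame({
    'time': [now - pd.Timedelta(minutes=15),   # stale row, will be dropped
             now - pd.Timedelta(minutes=5)],   # recent row, will be kept
    'last': [100.0, 101.0],
})

# keep only rows whose timestamp is younger than 600 s (10 minutes)
priceAll = priceAll[(now - priceAll['time']).dt.total_seconds() < 600].copy()
```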
MY QUESTION: Is there another workflow better suited to this purpose using a pandas DataFrame? I am aware that DataFrames are not exactly the best choice for realtime and "append"-heavy tasks, but I cannot find a better solution that is as fast and as simple for handling tabular data (sums, averages, groupby's, etc.).
Thanks in advance!