I have a Python script that uses the API of a broker platform I use. It connects (on localhost) to the platform software via a socket, and with a socket.recv() call I receive all the requested price-feed information (time, symbol, bid, ask, last, last quantity, volume, etc.).
Now I want to analyze these records, so I append them to one big pandas DataFrame called 'priceAll'.
I then read and analyze 'priceAll', making local copies of it inside other functions, e.g.:
Code:
....
....
while True:
    msg = mysock.recv(16384)
    msg_stringa = str(msg, 'utf-8')
    read_df = pd.read_csv(StringIO(msg_stringa), sep=";", error_bad_lines=False,
                          index_col=None, header=None, engine='c',
                          names=range(33), decimal='.')
    # note: was "priceAll.append(priceDF, ...)" -- 'priceDF' doesn't exist,
    # the parsed frame is 'read_df'; the trailing .copy() was redundant
    priceAll = priceAll.append(read_df, ignore_index=True)
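For reference, here is a self-contained sketch of the parse-and-accumulate step with the socket replaced by a fixed sample message. The field layout and column names are assumptions for illustration (the real feed has 33 columns); error_bad_lines was removed in pandas 2.0, so on_bad_lines='skip' is used, and DataFrame.append (also removed in 2.0) is replaced by collecting batches in a list and calling pd.concat:

```python
from io import StringIO
import pandas as pd

# Hypothetical two-record message; the real feed's 33-column layout is
# unknown, so these field names are assumptions for illustration.
msg_stringa = ("10:15:01;EURUSD;1.0850;1.0852;1.0851;2;1200\n"
               "10:15:01;AAPL;189.10;189.12;189.11;5;98000\n")

read_df = pd.read_csv(
    StringIO(msg_stringa),
    sep=";",
    header=None,
    index_col=None,
    names=["time", "symbol", "bid", "ask", "last", "lastqty", "volume"],
    decimal=".",
    engine="c",
    on_bad_lines="skip",  # modern replacement for error_bad_lines=False
)

# Accumulate parsed batches in a list, then concat: repeatedly appending
# to a growing DataFrame recopies all rows each time (quadratic cost),
# while one concat over a list of frames copies them once.
batches = [read_df]
priceAll = pd.concat(batches, ignore_index=True)
```

In the real loop, batches would be appended on every recv() and concatenated once per analysis pass rather than per message.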
- summing all quantities per symbol and price, so I have the volume traded at each specific price
- summing all quantities per second, to get second-based volumes
- etc.
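Both aggregations boil down to plain groupby sums; a sketch on a toy frame (the column names are assumptions, not the real feed's):

```python
import pandas as pd

# Toy stand-in for 'priceAll' (column names assumed for illustration)
priceAll = pd.DataFrame({
    "time": pd.to_datetime(["2024-01-01 10:15:01.100",
                            "2024-01-01 10:15:01.700",
                            "2024-01-01 10:15:02.200"]),
    "symbol": ["EURUSD", "EURUSD", "AAPL"],
    "last": [1.0851, 1.0851, 189.11],
    "lastqty": [2, 3, 5],
})

# Volume per symbol and traded price
vol_per_price = priceAll.groupby(["symbol", "last"])["lastqty"].sum()

# Volume per second: floor timestamps to whole seconds, then sum
vol_per_second = priceAll.groupby(priceAll["time"].dt.floor("s"))["lastqty"].sum()
```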
This whole workflow works quite well (looping in roughly 100-200 ms, and I almost never encounter missing values or read errors), and the 'priceAll' DataFrame grows to approximately 4000-5000 rows by 8 columns. Rows older than 10 minutes are automatically dropped with this:
Code:
priceAll = priceAll[(now - priceAll['time']).astype('timedelta64[s]') < 600].copy()
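The same age filter can be written by comparing directly against a pd.Timedelta, which skips the intermediate astype cast (sketch with synthetic timestamps):

```python
import pandas as pd

now = pd.Timestamp("2024-01-01 10:20:00")
# Two synthetic rows: one 15 minutes old, one 5 minutes old
priceAll = pd.DataFrame({
    "time": [now - pd.Timedelta(minutes=15), now - pd.Timedelta(minutes=5)],
    "last": [1.0, 2.0],
})

# Keep only rows younger than 10 minutes; a direct Timedelta comparison
# behaves the same across pandas versions.
priceAll = priceAll[(now - priceAll["time"]) < pd.Timedelta(seconds=600)].copy()
```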
MY QUESTION: Is there a workflow better suited to this purpose than a pandas DataFrame? I am aware that DataFrames are not exactly the best choice for realtime "append" tasks, but I cannot find a better solution that is as fast and as simple for handling tabular data (sums, averages, groupby's, etc.).
Thanks in advance!