Describe the bug
2.1 million rows of FX intraday market data. Resampling to 10s works. Resampling to 1s crashes.
Original reporter: Tony Roberts from PyXLL.
Possible OOM?
Steps/Code to Reproduce
Download the Dec 2023 CSV from here: https://www.histdata.com/download-free-forex-historical-data/?/ascii/tick-data-quotes/eurusd/2023
(this is free sample data)
pandas data prep code:
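import pandas as pd
# `lib` is an ArcticDB library handle (LMDB-backed in this report); its creation is not shown.
file = "data/DAT_ASCII_EURUSD_T_202312.csv"
df2_raw = pd.read_csv(file, header=None)
# Drop column 3 (not used here) and name the remaining columns.
df2 = df2_raw.drop(columns=3).rename(columns={0: 'timestamp', 1: 'bid', 2: 'ask'})
df2['timestamp'] = pd.to_datetime(df2['timestamp'], format="%Y%m%d %H%M%S%f")
df2['mid'] = 0.5 * (df2['bid'] + df2['ask'])
df2 = df2.set_index('timestamp')
lib.write("EURUSD", df2)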
This should produce a DataFrame with a datetime index and bid, ask, and mid columns, with 2,102,540 rows and no missing data.
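The properties stated above can be checked before writing to the library. The following is a minimal sketch, not part of the original report; `check_prepared_frame` is a hypothetical helper name, and it only tests the properties the report describes (datetime index, the three columns, no missing values, optional row count).

```python
import pandas as pd


def check_prepared_frame(df, expected_rows=None):
    """Raise ValueError if the frame does not match the shape described in the report.

    Hypothetical helper for illustration only.
    """
    if not isinstance(df.index, pd.DatetimeIndex):
        raise ValueError("index is not a DatetimeIndex")
    missing = [c for c in ("bid", "ask", "mid") if c not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    if df.isna().any().any():
        raise ValueError("frame contains missing values")
    if expected_rows is not None and len(df) != expected_rows:
        raise ValueError(f"expected {expected_rows} rows, got {len(df)}")
```

With the frame from the prep code, the call would be `check_prepared_frame(df2, expected_rows=2_102_540)`.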
resample code:
import datetime as dt
import arcticdb as adb

def resampled_tick_data2(lib, symbol, start, end, freq, max_rows=1000):
    # Build OHLC bars of the mid price, with buckets closed on the right.
    qb = adb.QueryBuilder()
    qb = qb.resample(freq, closed='right').agg({
        'high': ('mid', 'max'),
        'low': ('mid', 'min'),
        'open': ('mid', 'first'),
        'close': ('mid', 'last')
    })
    data = lib.read(symbol, date_range=[start, end], query_builder=qb)
    # Drop buckets that contain no ticks.
    df = data.data.dropna()
    if max_rows is not None and len(df) > max_rows:
        raise RuntimeError("Number of rows is greater than max rows")
    return df

df = resampled_tick_data2(lib, "EURUSD", dt.datetime(2023,1,1), dt.datetime(2023,12,31), "1s", max_rows=None)
Expected Results
Either produce correct results or give a clear error message.
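One way to judge whether a result is "correct" is to compare it against a plain pandas resample of the same data. The sketch below is an assumption about how such a cross-check could look, not something from the original report. `pandas_ohlc` is a hypothetical name, and bucket labelling may need aligning with the ArcticDB output before comparing indexes.

```python
import pandas as pd


def pandas_ohlc(df, freq):
    """Reference OHLC bars of the 'mid' column, computed with pandas only.

    Hypothetical helper for illustration; buckets are closed on the right
    to mirror the closed='right' argument used in the report.
    """
    bars = df["mid"].resample(freq, closed="right").agg(["first", "max", "min", "last"])
    bars = bars.rename(
        columns={"first": "open", "max": "high", "min": "low", "last": "close"}
    )
    # Empty buckets come back as NaN rows; drop them as the report's code does.
    return bars.dropna()
```

For a frequency that already works, such as "10s", the values of `pandas_ohlc(df2, "10s")` could be compared column by column with the frame returned by `resampled_tick_data2`.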
OS, Python Version and ArcticDB Version
Python 3.10, arcticdb 4.5.0rc, WSL on Windows 11.
Backend storage used
LMDB
Additional Context
The same example runs OK with mimalloc, both with freq='1s' and with every finer frequency down to '10ms', at which point the output has as many rows as the original data.
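For a sense of scale, the number of candidate buckets at each frequency can be worked out with simple arithmetic. This sketch assumes the data spans the whole of December 2023, which is an upper bound and not a figure from the report. It does not establish the cause of the crash. The number of non-empty buckets can never exceed the 2,102,540 input rows.

```python
import datetime as dt


def bucket_count(start, end, width):
    """Number of fixed-width buckets needed to cover [start, end), rounded up."""
    span = end - start
    return -(-span // width)  # ceiling division on timedeltas


# Assumed span: all of December 2023 (upper bound, not taken from the data).
start = dt.datetime(2023, 12, 1)
end = dt.datetime(2024, 1, 1)

for label, width in [
    ("10s", dt.timedelta(seconds=10)),
    ("1s", dt.timedelta(seconds=1)),
    ("10ms", dt.timedelta(milliseconds=10)),
]:
    print(label, bucket_count(start, end, width))
# 10s 267840
# 1s 2678400
# 10ms 267840000
```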
alexowens90