  File "/opt/venv/lib/python3.10/site-packages/pyarrow/parquet/core.py", line 3003, in read_table
    return dataset.read(columns=columns, use_threads=use_threads,
  File "/opt/venv/lib/python3.10/site-packages/pyarrow/parquet/core.py", line 2631, in read
    table = self._dataset.to_table(
  File "pyarrow/_dataset.pyx", line 556, in pyarrow._dataset.Dataset.to_table
  File "pyarrow/_dataset.pyx", line 3713, in pyarrow._dataset.Scanner.to_table
  File "pyarrow/error.pxi", line 154, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow/error.pxi", line 91, in pyarrow.lib.check_status
Error: IOError: AWS Error NETWORK_CONNECTION during GetObject operation: curlCode: 28, Timeout was reached
Describe the bug, including details regarding any error messages, version, and platform.
Hello,
This is very similar to bug #36007.
The requesting machine is in the same region as the S3 bucket.
joblib is used to parallelize the download, with up to 56 threads.
It is very difficult to reproduce: it happens at least once a day to random users who run the same download code on different Parquet files.
Installed packages:
arrow 1.3.0
pyarrow 14.0.1
How can I debug this further?
Thank you.
Component(s)
Python
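One way to narrow down an intermittent timeout like this is to wrap each read in a small retry helper that logs which path failed, on which attempt, and after how long. The sketch below is only an illustration under assumptions: `read_with_retry`, its parameters, and the logger name are hypothetical, and `read_fn` stands in for whatever call performs the download (for example `pyarrow.parquet.read_table`). It relies on the error surfacing as an `IOError`, as shown in the traceback; in Python 3 that is an alias of `OSError`.

```python
import logging
import random
import time

# Hypothetical logger name; route it wherever the rest of the job logs go.
logger = logging.getLogger("s3_read_debug")


def read_with_retry(read_fn, path, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call read_fn(path), retrying on OSError with exponential backoff and jitter.

    read_fn is a placeholder for the real read call. Each failure is logged
    with the path, the attempt number, and the elapsed time, so that slow or
    failing objects can be compared across users and runs.
    """
    for attempt in range(1, max_attempts + 1):
        started = time.monotonic()
        try:
            return read_fn(path)
        except OSError as exc:
            elapsed = time.monotonic() - started
            logger.warning(
                "attempt %d/%d for %s failed after %.1fs: %s",
                attempt, max_attempts, path, elapsed, exc,
            )
            if attempt == max_attempts:
                # Out of attempts: re-raise so the caller still sees the error.
                raise
            # Backoff doubles each attempt: base_delay, 2 * base_delay, ...
            delay = base_delay * 2 ** (attempt - 1)
            # Jitter keeps many parallel workers from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
```

Inside each joblib worker, the call would then look like `read_with_retry(pq.read_table, uri)`, assuming `pq` is `pyarrow.parquet` and `uri` is the S3 URI of one Parquet file. If the logged elapsed times cluster around a fixed value, that points at a client-side timeout setting rather than at a particular object.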