What happens?
An HTTP request that receives an HTTP 502 response fails without being retried. HTTP 502 errors should be retried because they are usually transient.
This can be a serious problem in DuckLake scenarios that ingest large amounts of data into object storage (e.g. Cloudflare R2). Because the request is not retried, a single 502 can break the whole ingestion pipeline:
_duckdb.HTTPException: HTTP Error: Unable to connect to URL r2://XXX/datalake/tables/srd/daily/ducklake-019e26ad-1eef-728e-88ae-46b1899b3368.parquet: Bad Gateway (HTTP code 502)
To Reproduce
I do not have specific reproduction steps, but I can point to the code that I believe is responsible: HTTPResponse::ShouldRetry()
Adding the following case should fix this issue:
case HTTPStatusCode::BadGateway_502:
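To illustrate the proposed change, here is a minimal self-contained sketch of a retry check that treats 502 as retryable. The `StatusCode` enum and the free function `ShouldRetry` are hypothetical stand-ins that I wrote for this example; they are not DuckDB's actual definitions, and the set of other retryable codes is an assumption.

```cpp
#include <cstdint>

// Hypothetical stand-in for an HTTP status enum; not DuckDB's real type.
enum class StatusCode : uint16_t {
    OK_200 = 200,
    NotFound_404 = 404,
    RequestTimeout_408 = 408,
    TooManyRequests_429 = 429,
    InternalServerError_500 = 500,
    BadGateway_502 = 502,
    ServiceUnavailable_503 = 503,
    GatewayTimeout_504 = 504
};

// Returns true for status codes that usually indicate a transient failure.
// Assumption: the list below is illustrative, not copied from DuckDB.
bool ShouldRetry(StatusCode status) {
    switch (status) {
    case StatusCode::RequestTimeout_408:
    case StatusCode::TooManyRequests_429:
    case StatusCode::InternalServerError_500:
    case StatusCode::BadGateway_502: // the case this issue proposes adding
    case StatusCode::ServiceUnavailable_503:
    case StatusCode::GatewayTimeout_504:
        return true;
    default:
        return false;
    }
}
```

With this shape, a 502 from the object store would take the same retry path as a 503 or 504 instead of surfacing immediately as an exception.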
OS:
macOS Tahoe 26.0.1
DuckDB Version:
1.5.2
DuckLake Version:
8a5851971f
DuckDB Client:
Python
Hardware:
No response
Full Name:
Bill Zissimopoulos
Affiliation:
N/A
What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.
I have tested with a stable release
Did you include all relevant data sets for reproducing the issue?
Not applicable - the reproduction does not require a data set
Did you include all code required to reproduce the issue?
Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?