HTTP 502 responses are failed without retry #1160

@billziss-gh

What happens?

An HTTP request that receives an HTTP 502 response fails immediately instead of being retried. HTTP 502 errors are usually transient, so they should be retried.

This can be a big problem in DuckLake scenarios that ingest large amounts of data into object storage (e.g. Cloudflare R2). Because the request is not retried, a single 502 can break the whole ingestion pipeline.

_duckdb.HTTPException: HTTP Error: Unable to connect to URL r2://XXX/datalake/tables/srd/daily/ducklake-019e26ad-1eef-728e-88ae-46b1899b3368.parquet: Bad Gateway (HTTP code 502)

To Reproduce

I do not have specific reproduction steps, but I can point to the code that I believe is responsible: HTTPResponse::ShouldRetry()

Adding the following case should fix this issue:

case HTTPStatusCode::BadGateway_502:
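
To make the proposal concrete, here is a minimal, self-contained sketch of where such a case could sit in a retry check. It is not DuckDB's actual source: the reduced enum, the free function `ShouldRetry`, and the other status codes treated as retryable are assumptions made for illustration. Only the `BadGateway_502` case is the change proposed here.

```cpp
// Illustrative stand-in, NOT DuckDB's real enum: only a few status codes are modeled.
enum class HTTPStatusCode : int {
    OK_200 = 200,
    NotFound_404 = 404,
    TooManyRequests_429 = 429,
    InternalServerError_500 = 500,
    BadGateway_502 = 502,
    ServiceUnavailable_503 = 503,
    GatewayTimeout_504 = 504,
};

// Hypothetical free function modeled on the role of HTTPResponse::ShouldRetry().
// Returns true when the status code suggests a transient failure worth retrying.
bool ShouldRetry(HTTPStatusCode status) {
    switch (status) {
    // Assumed set of codes already treated as transient (illustrative only).
    case HTTPStatusCode::TooManyRequests_429:
    case HTTPStatusCode::InternalServerError_500:
    case HTTPStatusCode::ServiceUnavailable_503:
    case HTTPStatusCode::GatewayTimeout_504:
    // The case proposed in this issue.
    case HTTPStatusCode::BadGateway_502:
        return true;
    default:
        return false;
    }
}
```

With this case in place, a 502 would go through the same retry path as the other transient codes instead of surfacing directly as an HTTPException.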

OS:

macOS Tahoe 26.0.1

DuckDB Version:

1.5.2

DuckLake Version:

8a5851971f

DuckDB Client:

Python

Hardware:

No response

Full Name:

Bill Zissimopoulos

Affiliation:

N/A

What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

I have tested with a stable release

Did you include all relevant data sets for reproducing the issue?

Not applicable - the reproduction does not require a data set

Did you include all code required to reproduce the issue?

  • Yes, I have

Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?

  • Yes, I have
