Connectorx arrow_stream timestamp conversion issue #3491

@louiewhw

dlt version

1.20.0

Describe the problem

When using ConnectorX with return_type="arrow_stream", timestamp columns are returned as date64[ms] instead of timestamp[us].

Timestamp columns incorrectly converted to Date64 with arrow_stream return_type #866

The cast_date64_columns_to_timestamp function in pyarrow.py uses .view() to reinterpret the bits, but this does not rescale the units, so millisecond values are read as microseconds and every timestamp comes out 1000 times too small.

#3218

This results in dates like 2025-12-14 being stored as 1970-01-21.

Expected behavior

No response

Steps to reproduce

import dlt
from dlt.sources.sql_database import sql_database

# connection_string and pipeline are assumed to be defined elsewhere
source = sql_database(
    connection_string,
    table_names=["my_table"],
    backend="connectorx",
    backend_kwargs={"return_type": "arrow_stream"},  # ← triggers bug
)
source.my_table.apply_hints(incremental=dlt.sources.incremental("updated_at"))
pipeline.run(source)
# Watermark: 1970-01-21 ← (should be 2025-12-14)

Root Cause:
ConnectorX arrow_stream returns date64[ms], but dlt's cast_date64_columns_to_timestamp uses .view(timestamp[us]), which reinterprets the bits without converting units:

new_col = col.view(pyarrow.timestamp("us"))  # treats ms as μs → values 1000x too small

Example:

  • Raw value: 1765747379262 (milliseconds)
  • .view(timestamp[us]) → interprets as 1765747379262 μs → 1970-01-21
  • Correct: 1765747379262 ms = 1765747379262000 μs → 2025-12-14
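
The arithmetic in the example above can be checked with the standard library alone. This is a minimal sketch, not dlt or ConnectorX code: it takes the raw value from the example and adds it to the Unix epoch once as microseconds (the misreading) and once as milliseconds (the intended unit). The names `EPOCH`, `raw`, `misread` and `correct` are illustrative.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Raw value from the example; the source column stores milliseconds.
raw = 1765747379262

# Misreading: the same integer taken as microseconds since the epoch.
misread = EPOCH + timedelta(microseconds=raw)

# Intended reading: the integer taken as milliseconds since the epoch.
correct = EPOCH + timedelta(milliseconds=raw)

print(misread.date().isoformat())
print(correct.date().isoformat())
```

The two offsets from the epoch differ by exactly a factor of 1000, which is why a December 2025 timestamp lands in January 1970.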

Fix:

Probably use .view() to reinterpret the type with the same unit, then .cast() to convert the units?

# date64[ms] → view as timestamp[ms] → cast to timestamp[us]
chunk_ts_us = pyarrow.compute.cast(chunk.view(pyarrow.timestamp("ms")), pyarrow.timestamp("us"))

Operating system

macOS

Runtime environment

Local

Python version

3.12

dlt data source

SQL

dlt destination

Filesystem & buckets

Other deployment details

No response

Additional information

No response

Metadata

Labels

bug (Something isn't working), question (Further information is requested)

Projects

Status

In Progress
