Description
Describe the bug, including details regarding any error messages, version, and platform.
version: 21.0.0
platform: arm64
When reading data that has:
- hive-style partitioning
- integer dataframe columns
the partition columns get inferred as int type.
from pyarrow.parquet import read_table
path = "gs://project/data/run_date=2025-09-17/job_id=abc123/0.pq"
# read_table already returns a pyarrow.Table, so no further .read() call is needed
table = read_table(path)
table
pyarrow.Table
0: uint32
1: uint32
2: uint32
3: uint32
...
99: uint32
run_date: dictionary<values=string, indices=int32, ordered=0>
job_id: dictionary<values=string, indices=int32, ordered=0>
This causes issues, for example when trying to cast "2025-09-17" to int32.
Component(s)
Python