We ran into this error when reading from a managed Databricks Delta Table:
Invalid argument error: all columns in a record batch must have the specified row count
This happens when running the select * from my_table query via datafusion. This occurs in the customer environment, so we don't have a reliable reproduction yet.
FWIW, I found a similar issue that someone ran into when reading from Iceberg, along with the explanation that the number of physical and logical records in a batch may not match: apache/datafusion-comet#973
Is it possible that a similar issue exists with Delta?
Thanks!
Thanks for your reply @ion-elgreco. I will work with the customer on a reproduction. Like I said, we don't have access to the Delta table, so we can't reproduce it on our end. I was hoping that someone familiar with the code base might have some suggestions. Maybe there are some specific things we can ask the customer to check, e.g., could this happen if the table has deletion vectors enabled?
To query the Delta table using datafusion, we use register_table to register the Delta table provider with datafusion and then issue queries as follows:
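use datafusion::execution::context::SQLOptions;
use futures::StreamExt; // provides `next()` on the record batch stream

// Read-only query: reject DDL and DML statements.
let options: SQLOptions = SQLOptions::new()
    .with_allow_ddl(false)
    .with_allow_dml(false);
let df = self
    .datafusion
    .sql_with_options("select * from my_table", options)
    .await?;
let mut stream = df.execute_stream().await?;
while let Some(batch) = stream.next().await {
    let batch = match batch {
        Ok(batch) => batch,
        Err(e) => {
            /* THE ERROR IS REPORTED HERE */
            // Handling omitted in the original snippet; propagating for illustration.
            return Err(e.into());
        }
    };
}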
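For context on the error text itself: as far as I can tell, arrow-rs reports "must have the specified row count" when a record batch is built with an explicitly declared row count and at least one column has a different length, and "must have the same length" when no count was declared. The sketch below is a simplified, standard-library-only model of that kind of check, not the arrow-rs implementation; `check_batch` and its parameters are hypothetical names used only for illustration. It shows how a declared (logical) row count that disagrees with the physical column length, for example if rows were dropped logically but a column was not filtered to match, would produce this message. Whether that is what happens with this table is an open question.

```rust
/// Simplified stand-in for a record-batch consistency check.
/// NOT the arrow-rs implementation; names here are hypothetical.
///
/// `column_lens`   - physical length of each column in the batch
/// `declared_rows` - optional, explicitly declared (logical) row count
fn check_batch(column_lens: &[usize], declared_rows: Option<usize>) -> Result<usize, String> {
    // Prefer the declared count; otherwise fall back to the first column's length.
    let row_count = declared_rows
        .or_else(|| column_lens.first().copied())
        .ok_or_else(|| "must either specify a row count or at least one column".to_string())?;

    // Every column must match the expected row count.
    if column_lens.iter().any(|&len| len != row_count) {
        let msg = match declared_rows {
            Some(_) => "all columns in a record batch must have the specified row count",
            None => "all columns in a record batch must have the same length",
        };
        return Err(format!("Invalid argument error: {}", msg));
    }
    Ok(row_count)
}
```

In this model, `check_batch(&[5, 5], Some(3))` fails with the message from this issue: both columns physically hold 5 rows, but the batch was declared to have 3. If the real cause is similar, the things worth checking with the customer would be any table feature that makes the logical row count differ from what is stored in the data files.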
Environment
Delta-rs version: 0.22.2
Binding: Rust
Environment: