@sundy-li sundy-li (Member) commented Jan 7, 2026

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

Upgrade iceberg-rust from v0.4.0 to v0.8.0 and implement write support for Iceberg tables.

Dependency Upgrades

  • Upgrade iceberg-rust from v0.4.0 to v0.8.0 with breaking API changes
  • Upgrade hive_metastore from v0.1.0 to v0.2.0
  • Add IcebergFileIO wrapper to adapt new FileIO API to OperatorRegistry trait (iceberg-rust removed built-in support)
  • Update catalog builders (HMS, REST, Glue, S3Tables) to use new CatalogBuilder trait with load() method

New Feature: Iceberg Table Write Support

  • Implement IcebergDataFileWriter for writing data blocks to Parquet files
  • Add IcebergCommitSink for committing data via Transaction API
  • Support both partitioned and non-partitioned table writes
  • Handle multi-field partitioning using FanoutWriter
  • Add type conversion from Databend scalars to Iceberg literals (bool, int, long, float, double, string, date, timestamp)
  • Include cache invalidation after successful commits
  • Support null values in partition columns
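
The scalar-to-literal conversion listed above can be pictured with a small sketch. The `Scalar` and `Literal` enums below are hypothetical stand-ins written for illustration; they are not the real Databend or iceberg-rust types, and the actual conversion in this PR may differ. The sketch assumes a null scalar maps to "no literal" (a null partition value) and that any type outside the listed set is rejected.

```rust
// Hypothetical stand-ins: NOT the real Databend `Scalar` or
// iceberg-rust `Literal` types.
#[derive(Debug, Clone, PartialEq)]
enum Scalar {
    Null,
    Boolean(bool),
    Int32(i32),
    Int64(i64),
    Float32(f32),
    Float64(f64),
    String(String),
    Date(i32),       // assumed: days since 1970-01-01
    Timestamp(i64),  // assumed: microseconds since the Unix epoch
    Binary(Vec<u8>), // example of a type this sketch does not convert
}

#[derive(Debug, Clone, PartialEq)]
enum Literal {
    Bool(bool),
    Int(i32),
    Long(i64),
    Float(f32),
    Double(f64),
    String(String),
    Date(i32),
    Timestamp(i64),
}

/// Ok(None) models a null partition value; Err marks an unsupported type.
fn scalar_to_literal(s: &Scalar) -> Result<Option<Literal>, String> {
    let lit = match s {
        Scalar::Null => return Ok(None),
        Scalar::Boolean(v) => Literal::Bool(*v),
        Scalar::Int32(v) => Literal::Int(*v),
        Scalar::Int64(v) => Literal::Long(*v),
        Scalar::Float32(v) => Literal::Float(*v),
        Scalar::Float64(v) => Literal::Double(*v),
        Scalar::String(v) => Literal::String(v.clone()),
        Scalar::Date(v) => Literal::Date(*v),
        Scalar::Timestamp(v) => Literal::Timestamp(*v),
        other => return Err(format!("unsupported scalar: {other:?}")),
    };
    Ok(Some(lit))
}

fn main() {
    let inputs = [
        Scalar::Boolean(true),
        Scalar::Int32(1),
        Scalar::Float32(1.5),
        Scalar::Float64(2.5),
        Scalar::Date(20_000),
        Scalar::Timestamp(1_700_000_000_000_000),
        Scalar::String("cn".to_string()),
        Scalar::Null,
        Scalar::Binary(vec![0, 1]),
    ];
    for s in &inputs {
        println!("{:?} -> {:?}", s, scalar_to_literal(s));
    }
}
```

Returning `Result<Option<_>, _>` keeps "null" and "unsupported" apart, which matters if null partition values are legal but unknown types should fail the write.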

Other Changes

  • Re-enable standalone_iceberg_tpch CI tests that were temporarily disabled for arm64 compatibility
  • Simplify generate_catalog_meta to generate_default_catalog_meta

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Explain why

Added comprehensive SQL logic tests for:

  • Basic table writes (INSERT INTO)
  • Multiple data type support (int, bigint, double, string, date, boolean)
  • Single-field partitioned table writes
  • Two-field partitioned table writes
  • Multiple inserts into existing partitions
  • Null values in partition columns
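
The partitioned-write cases above (single-field, two-field, repeated inserts, null partition values) all depend on routing each row to a writer keyed by its partition values. The sketch below shows that general idea only. It is not how iceberg-rust's FanoutWriter is implemented, and `Row`, `PartitionKey`, and `fan_out` are hypothetical names introduced for illustration. A `None` component models a NULL partition column.

```rust
use std::collections::BTreeMap;

// Hypothetical partition key: one optional value per partition field.
type PartitionKey = Vec<Option<String>>;

// Hypothetical row shape for a table partitioned by (region, day).
#[derive(Debug, Clone, PartialEq)]
struct Row {
    region: Option<String>,
    day: Option<String>,
    amount: i64,
}

/// Group rows by their two-field partition key. Rows whose partition
/// columns are NULL land in their own bucket instead of being dropped.
fn fan_out(rows: Vec<Row>) -> BTreeMap<PartitionKey, Vec<Row>> {
    let mut buckets: BTreeMap<PartitionKey, Vec<Row>> = BTreeMap::new();
    for row in rows {
        let key = vec![row.region.clone(), row.day.clone()];
        buckets.entry(key).or_default().push(row);
    }
    buckets
}

fn main() {
    let rows = vec![
        Row { region: Some("eu".into()), day: Some("2026-01-07".into()), amount: 1 },
        Row { region: Some("eu".into()), day: Some("2026-01-07".into()), amount: 2 },
        Row { region: None, day: Some("2026-01-07".into()), amount: 3 },
    ];
    for (key, bucket) in fan_out(rows) {
        println!("{:?}: {} row(s)", key, bucket.len());
    }
}
```

In a real writer each bucket would correspond to one open data file per partition rather than an in-memory vector.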

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):

- Upgrade iceberg-rust from v0.4.0 to v0.8.0 with breaking API changes
- Add IcebergFileIO wrapper to adapt new FileIO API to OperatorRegistry trait
- Update catalog builders to use new CatalogBuilder trait with load() method
- Re-enable standalone_iceberg_tpch CI tests that were disabled for arm64
- Upgrade hive_metastore from v0.1.0 to v0.2.0
- Simplify generate_catalog_meta to generate_default_catalog_meta
@github-actions github-actions bot added the pr-chore label ("this PR only has small changes that no need to record, like coding styles.") Jan 7, 2026

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 827a035a51

- Handle file:// and memory:// URIs that don't have a host/bucket
- Fix gcs.project-id mapping to project_id (was incorrectly default_storage_class)
- Use Error::other() to fix clippy io_other_error lint
- Basic insert with multiple rows
- Multiple insert statements
- Various data types (int, bigint, float, double, string, date, boolean)
- NULL value handling
- Partitioned table writes
- INSERT SELECT from another iceberg table
- Aggregation queries on inserted data
- Map iceberg 'table not found' errors to ErrorCode::UnknownTable
- This allows DROP TABLE IF EXISTS to work correctly for iceberg tables
- Fix column type annotation in base.test (ITI -> TAT)
- Remove write tests since iceberg tables don't support INSERT yet
- Test CREATE TABLE with various types (int, bigint, double, string, date, boolean)
- Test CREATE TABLE with partition by clause
- Test INSERT statements (expected to fail with error 1002 since writes not yet supported)
- Test DROP TABLE cleanup
- Implement IcebergDataFileWriter for writing data blocks to parquet files
- Add IcebergCommitSink for committing data via Transaction API
- Support both partitioned and non-partitioned table writes
- Handle multi-field partitioning with FanoutWriter
- Add type conversion from Databend scalars to Iceberg literals
- Include cache invalidation after successful commits
- Update tests to verify write functionality works correctly
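
The commit notes above about file:// and memory:// URIs (which have no host or bucket) and about Error::other() can be illustrated with a minimal sketch. `Location` and `parse_location` are hypothetical names, and the rule that every other scheme requires a non-empty bucket is an assumption made for the example; the PR's IcebergFileIO wrapper may handle locations differently. `std::io::Error::other` is the standard-library constructor that the clippy `io_other_error` lint recommends.

```rust
use std::io;

// Hypothetical parsed form of a table location.
#[derive(Debug, PartialEq)]
struct Location {
    scheme: String,
    bucket: Option<String>, // None for schemes without a host/bucket
    path: String,
}

fn parse_location(uri: &str) -> Result<Location, io::Error> {
    let (scheme, rest) = uri
        .split_once("://")
        .ok_or_else(|| io::Error::other(format!("missing scheme in {uri}")))?;

    match scheme {
        // Assumed: these schemes carry only a path, never a bucket.
        "file" | "memory" => Ok(Location {
            scheme: scheme.to_string(),
            bucket: None,
            path: format!("/{}", rest.trim_start_matches('/')),
        }),
        // Assumed: everything else is `scheme://bucket/path`.
        _ => {
            let (bucket, path) = rest.split_once('/').unwrap_or((rest, ""));
            if bucket.is_empty() {
                return Err(io::Error::other(format!("missing bucket in {uri}")));
            }
            Ok(Location {
                scheme: scheme.to_string(),
                bucket: Some(bucket.to_string()),
                path: format!("/{path}"),
            })
        }
    }
}

fn main() {
    for uri in ["file:///tmp/warehouse/t", "memory://warehouse/t", "s3://bucket/a/b", "s3:///a"] {
        println!("{uri} -> {:?}", parse_location(uri));
    }
}
```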
@sundy-li sundy-li changed the title from "chore: bump iceberg-rust to v0.8.0 and re-enable CI tests" to "feat(iceberg): bump iceberg-rust to v0.8.0 and add write support" Jan 7, 2026
@github-actions github-actions bot added the pr-feature label ("this PR introduces a new feature to the codebase") Jan 7, 2026
@sundy-li sundy-li requested review from KKould and youngsofun January 7, 2026 07:14
- Suppress deprecated as_slice/from_slice warnings in hash.rs and jwk.rs
- Add missing rand::Rng import in sized_spsc.rs tests
- Add allow for diverging_sub_expression in raft_state_machine_impl.rs
The alpine variant doesn't support ARM64 architecture.
The polygon ring starting point changed after dependency upgrade.
The polygon is geometrically equivalent - same shape, different vertex order.
- Add astral-sh/setup-uv@v7 to iceberg tpch test action
- Update float scientific notation format (e308 -> e+308) in tests
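
The polygon note above says the ring is geometrically equivalent after the upgrade, with only the starting vertex changed. One way to check that claim in a test is to compare closed rings up to rotation, as in the sketch below. `same_ring` is a hypothetical helper, not something from this PR. It assumes integer coordinates (floating-point vertices would need a tolerance) and treats a reversed vertex order as a different ring.

```rust
type Point = (i64, i64);

/// Returns true when two closed rings (first vertex repeated as the last)
/// visit the same vertices in the same cyclic order, regardless of which
/// vertex each ring starts from.
fn same_ring(a: &[Point], b: &[Point]) -> bool {
    if a.len() != b.len() || a.len() < 2 {
        return false;
    }
    if a.first() != a.last() || b.first() != b.last() {
        return false; // not closed rings
    }
    // Drop the repeated closing vertex, then try every rotation of `a`.
    let (a, b) = (&a[..a.len() - 1], &b[..b.len() - 1]);
    let n = a.len();
    (0..n).any(|shift| (0..n).all(|i| a[(i + shift) % n] == b[i]))
}

fn main() {
    let before = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)];
    let after = [(1, 1), (0, 1), (0, 0), (1, 0), (1, 1)];
    println!("equivalent: {}", same_ring(&before, &after));
}
```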
@bohutang bohutang (Member) commented Jan 9, 2026

@codex review

@chatgpt-codex-connector

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Repo admins can enable using credits for code reviews in their settings.

@sundy-li sundy-li merged commit 05caa2e into databendlabs:main Jan 9, 2026
88 checks passed
@sundy-li sundy-li deleted the bump-iceberg branch January 9, 2026 15:10
Labels

pr-chore: this PR only has small changes that no need to record, like coding styles.
pr-feature: this PR introduces a new feature to the codebase

5 participants