dependabot[bot]

dependabot bot commented on behalf of github on Jul 1, 2025

Bumps delta-spark from 3.3.2 to 4.0.0.

Release notes

Sourced from delta-spark's releases.

Delta Lake 4.0.0

We are excited to announce the final release of Delta Lake 4.0.0! This release includes several exciting new features.

Highlights

  • [Spark] Preview support for catalog-managed tables, a new table feature that transforms Delta Lake into a catalog-oriented lakehouse table format. This feature is still in the RFC stage, and as such, the protocol is still under development and is subject to change.
  • [Spark] Delta Connect is an extension for Spark Connect which enables the usage of Delta over Spark Connect, allowing Delta to be used with the decoupled client-server architecture of Spark Connect.
  • [Spark] Support for the Variant data type to enable semi-structured storage and data processing, for flexibility and performance.
  • [Spark] Support a new DROP FEATURE implementation that allows dropping table features instantly without truncating history.
  • [Kernel] Support for reading and writing version checksum.
  • [Kernel] Support reading log compaction files for better performance during snapshot construction, and support writing log compaction files as a post commit hook.
  • [Kernel] Support for the Clustered Table feature which enables defining and updating the clustering columns on a table.
  • [Kernel] Support for writing to row tracking enabled tables.
  • [Kernel] Support for writing file statistics to the Delta log when they are provided by the engine. This enables data skipping using query filters at read time.

Details by component.

Sunset of Delta Standalone and dependent connectors

Currently, Delta Standalone and its dependent connectors, including Delta Flink and Delta Hive, are no longer under active development. Starting in Delta 4.0 we will not be releasing these projects as part of the 4.x Delta releases. These connectors are in maintenance mode and, going forward, will only receive critical security fixes and high-severity bug patches in the 3.x series. We are committed to a full transition from Delta Standalone to Delta Kernel and a future Kernel-based Flink connector.

Delta Spark

Delta Spark 4.0 is built on Apache Spark™ 4.0. Similar to Apache Spark, we have released Maven artifacts for Scala 2.13.
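Because this bump crosses a Spark major version, it may be worth checking the installed pyspark before merging. The following is a minimal sketch, assuming that "built on Apache Spark 4.0" means delta-spark 4.0.x should be paired with pyspark 4.0.x. The helper names `major_minor`, `spark_matches_delta`, and `installed_version` are hypothetical and not part of delta-spark.

```python
from importlib import metadata
from typing import Optional, Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Return (major, minor) from a plain version string such as '4.0.0'."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def spark_matches_delta(delta_version: str, spark_version: str) -> bool:
    """Assumption: delta-spark 4.0.x pairs with pyspark 4.0.x, per the note above."""
    if major_minor(delta_version) == (4, 0):
        return major_minor(spark_version) == (4, 0)
    # Other pairings are not stated in these release notes, so refuse to guess.
    raise ValueError(f"no known pairing for delta-spark {delta_version}")


def installed_version(package: str) -> Optional[str]:
    """Look up an installed distribution's version, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
```

A CI step could call `installed_version("pyspark")` and pass the result to `spark_matches_delta("4.0.0", ...)`. Pre-release version strings with suffixes in the first two components are not handled by this sketch.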

The key features of this release are:

  • Delta Connect adds Spark Connect support to the Scala and Python APIs of Delta Lake for Apache Spark. Spark Connect is a new project released in Apache Spark 4.0 that adds a decoupled client-server infrastructure, allowing remote connectivity to Spark from anywhere. Delta Connect makes the DeltaTable interfaces compatible with the new Spark Connect protocol. For more information on how to use Delta Connect, see the Delta Connect documentation. Delta Connect is currently in preview.
  • Preview support for catalog-managed tables: Delta Spark now supports reading from and writing to tables that have the catalogOwned-preview feature enabled. This feature allows a catalog to broker all commits to the table it manages, giving the catalog the control and visibility it needs to prevent invalid operations (e.g. commits that violate foreign key constraints), enforce security and access controls, and open the door to future performance optimizations. Currently, write support includes INSERT, MERGE INTO, UPDATE, and DELETE operations.
    • Note: this feature is still in the RFC stage, and as such, the protocol is still under development and is subject to change. The catalogOwned-preview feature should not be enabled for production tables and tables created with this preview feature enabled may not be compatible with future Delta Spark releases.
  • Support for Variant data type: The Variant data type is a new Apache Spark data type. It enables flexible and efficient processing of semi-structured data without a user-specified schema. Variant data does not require a fixed schema on write. Instead, Variant data is queried using the schema-on-read approach. The Variant data type allows flexible ingestion by not requiring a write schema, and enables faster processing with the Spark Variant binary encoding format. This feature was originally released in preview as part of Delta 4.0.0 Preview; as of 4.0.0, it is no longer in preview. Please see the documentation and the example for more details.
  • Preview support for shredded variants: Shredded variants are a storage optimization that allows efficient sub-field extraction at the cost of higher write overhead, showing up to 20x read performance improvement. Shredded Variant data is stored according to the Parquet Variant Shredding specification. See the variantShredding RFC for more details.
    • Note that this feature is in preview and that tables created with this preview feature enabled may not be compatible with future Delta Spark releases.
  • Type Widening now supports a broader set of type changes and is no longer in preview. This feature allows you to change the data type of a column in your Delta table without rewriting the underlying data files. See the type widening documentation for a list of all supported type changes and additional information. Delta 3.3 or above is required to read tables with type widening enabled.
  • Support dropping table features without truncating history: The existing drop feature implementation requires executing the command twice with a 24-hour waiting period in between. It also truncates the history of the Delta table to the last 24 hours. The new DROP FEATURE implementation allows dropping features instantly without truncating history. Dropping a feature introduces a new writer feature to the table, the checkpointProtection feature.
    • Dropping a feature with the new behavior can be achieved as follows:
    ALTER TABLE table_name DROP FEATURE feature_name
    
    • We can still drop a feature with the old behavior as follows:
    ALTER TABLE table_name DROP FEATURE feature_name TRUNCATE HISTORY
    
    • The checkpointProtection feature can be dropped with history truncation.
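
The two DROP FEATURE forms above can be generated from Python before being passed to Spark. This is a minimal sketch: `drop_feature_sql` is a hypothetical helper, not a delta-spark API, and the identifier check is an assumption that deliberately accepts only simple names (feature names containing hyphens, such as preview features, are not handled here).

```python
import re

# Assumption: only plain identifiers are accepted; table names may be dotted.
_TABLE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)*")
_FEATURE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def drop_feature_sql(table_name: str, feature_name: str,
                     truncate_history: bool = False) -> str:
    """Build the ALTER TABLE ... DROP FEATURE statement shown in the notes above."""
    if not _TABLE.fullmatch(table_name):
        raise ValueError(f"unsupported table name: {table_name!r}")
    if not _FEATURE.fullmatch(feature_name):
        raise ValueError(f"unsupported feature name: {feature_name!r}")
    statement = f"ALTER TABLE {table_name} DROP FEATURE {feature_name}"
    if truncate_history:
        # Old behavior: history is truncated as part of the drop.
        statement += " TRUNCATE HISTORY"
    return statement


# Hypothetical usage, assuming an existing SparkSession named `spark`
# and a Delta table named `events`:
#   spark.sql(drop_feature_sql("events", "checkpointProtection",
#                              truncate_history=True))
```

Building the statement separately from executing it keeps the string easy to log or review before it runs against a table.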

... (truncated)

Commits
  • 6d055c5 Setting version to 4.0.0
  • 91e9cbb [Spark][Delta 4.0] Fixes options-based time travel with timestamps (#4707)
  • bc2cc32 [Doc] Update Delta Connect Documentation for Delta 4.0.0 release (#4703)
  • 2de4c04 [4.0 release] Docs changes in preparation for the 4.0 release (#4695)
  • 32e1be0 [Doc] Update Type widening documentation for Delta 4.0 release (#4696)
  • 21207f9 [4.0][Kernel] Block writing data into column mapping enable table as it is no...
  • 22453a8 [4.0][Kernel] Validate row tracking configs are not missing on existing table...
  • fdd7df5 [4.0][Kernel] Support enabling row tracking in Kernel (#4669)
  • 220064c Fix the Issue that Kernel can't read partition column of type ISO8601… (#4666)
  • 2243a23 [Kernel][4.0] Block schema evolution via withSchema for 4.0 (#4656)
  • Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [delta-spark](https://github.com/delta-io/delta) from 3.3.2 to 4.0.0.
- [Release notes](https://github.com/delta-io/delta/releases)
- [Commits](delta-io/delta@v3.3.2...v4.0.0)

---
updated-dependencies:
- dependency-name: delta-spark
  dependency-version: 4.0.0
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>
dependabot bot added the "dependencies" (Pull requests that update a dependency file) and "python" (Pull requests that update python code) labels on Jul 1, 2025