@SajidAlamQB
Contributor

@SajidAlamQB SajidAlamQB commented Sep 30, 2025

Description

Related to: #1170

Fixes the CI e2e test failure for kedro-docker with Spark support. The test "Execute docker build and run using spark Dockerfile" was failing when PySpark attempted to convert DataFrames to pandas using Apache Arrow. The error occurred because Arrow couldn't access sun.misc.Unsafe due to JVM restrictions.

Development notes

  • Updated kedro_docker/template/Dockerfile.spark to add JVM configuration environment variables
  • Added --add-opens flags to allow Apache Arrow access:
      • java.base/java.nio - for buffer operations
      • java.base/sun.nio.ch - for channel implementations
      • java.base/jdk.internal.misc - for access to sun.misc.Unsafe
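
For illustration only, here is a minimal sketch of how the --add-opens flags listed above could be assembled into a single JVM options string. The helper name `build_add_opens`, the `ARROW_PACKAGES` list, and the `ALL-UNNAMED` target are assumptions made for this example; they are not taken from the PR diff, and the actual Dockerfile.spark change may be written differently.

```python
# Hypothetical helper, not part of kedro-docker or of this PR's diff.
# Packages are the three named in the development notes above.
ARROW_PACKAGES = [
    "java.base/java.nio",           # buffer operations
    "java.base/sun.nio.ch",         # channel implementations
    "java.base/jdk.internal.misc",  # Unsafe access, per the PR notes
]


def build_add_opens(packages, target="ALL-UNNAMED"):
    """Return a space-separated string of --add-opens JVM flags.

    `target` is the module the packages are opened to; ALL-UNNAMED
    (all classpath code) is assumed here.
    """
    return " ".join(f"--add-opens={pkg}={target}" for pkg in packages)


jvm_options = build_add_opens(ARROW_PACKAGES)
print(jvm_options)
```

A string like this could then be supplied to the JVM, for example through the `JAVA_TOOL_OPTIONS` environment variable or Spark's `spark.driver.extraJavaOptions` setting. Which mechanism the Dockerfile uses is not stated in this description.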

Checklist

  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Updated the documentation to reflect the code changes
  • Updated jsonschema/kedro-catalog-X.XX.json if necessary
  • Added a description of this change in the relevant RELEASE.md file
  • Added tests to cover my changes
  • Received approvals from at least half of the TSC (required for adding a new, non-experimental dataset)

Signed-off-by: Sajid Alam <[email protected]>
Signed-off-by: Sajid Alam <[email protected]>
Signed-off-by: Sajid Alam <[email protected]>
@SajidAlamQB SajidAlamQB changed the title from "test docker ci" to "fix: Kedro-docker CI E2E Failures" Oct 2, 2025
@SajidAlamQB SajidAlamQB self-assigned this Oct 2, 2025
Signed-off-by: Sajid Alam <[email protected]>
@SajidAlamQB SajidAlamQB marked this pull request as ready for review October 2, 2025 14:41
Member

@deepyaman deepyaman left a comment

Thanks for addressing this long-running issue!

Member

@deepyaman deepyaman left a comment

Sorry, I actually have a question. 😅

Signed-off-by: Sajid Alam <[email protected]>
Signed-off-by: Sajid Alam <[email protected]>
Signed-off-by: Sajid Alam <[email protected]>
Signed-off-by: Sajid Alam <[email protected]>
Signed-off-by: Sajid Alam <[email protected]>
Signed-off-by: Sajid Alam <[email protected]>
Signed-off-by: Sajid Alam <[email protected]>
Signed-off-by: Sajid Alam <[email protected]>
@SajidAlamQB SajidAlamQB marked this pull request as draft October 3, 2025 13:11
@SajidAlamQB
Contributor Author

Can we get this merged for now to unblock things? I don't have capacity to dig into this further at the moment, so I will create a follow-up issue. The current fix does resolve the immediate issue.

@SajidAlamQB SajidAlamQB marked this pull request as ready for review October 9, 2025 11:36
@SajidAlamQB SajidAlamQB requested a review from deepyaman October 9, 2025 11:37
@SajidAlamQB SajidAlamQB dismissed deepyaman’s stale review October 9, 2025 11:38

We will investigate this further in a follow-up issue.

@deepyaman deepyaman changed the title from "fix: Kedro-docker CI E2E Failures" to "ci(docker): set legacy IPC format to fix e2e tests" Oct 9, 2025
@deepyaman deepyaman enabled auto-merge (squash) October 9, 2025 16:32
@deepyaman deepyaman disabled auto-merge October 9, 2025 16:32
@deepyaman deepyaman enabled auto-merge (squash) October 9, 2025 16:33
@deepyaman deepyaman merged commit a500bf5 into main Oct 9, 2025
25 checks passed
@deepyaman deepyaman deleted the docker-fix-spark branch October 9, 2025 16:43
5 participants