@PavithranRick
Contributor

Describe the issue this Pull Request addresses

This PR enhances Hudi’s test infrastructure by adding comprehensive concurrency control test cases for Spark SQL Show procedures. These tests validate that the procedures behave correctly and consistently when executed concurrently with operations such as clustering, compaction, and committing. No user-facing or production code behavior is changed.

Summary and Changelog

Summary:
Introduces a new suite of concurrency-focused tests that ensure Hudi Show procedures remain consistent, thread-safe, and timeline-accurate under concurrent operations.

Changelog:

  • TestConcurrencyControlProcedures:
    • New test class containing three targeted concurrency tests for show procedures.
  • Clustering Concurrency Test:
    • Validates show_clustering behavior during concurrent clustering schedule/execute operations.
  • Compaction Concurrency Test:
    • Ensures show_compaction remains reliable when compaction is scheduled/executed concurrently.
  • Commit Concurrency Test:
    • Verifies show_commits consistency during concurrent insert operations, including timeline progression validation.
  • Lock Provider Integration:
    • Configured InProcessLockProvider using optimistic concurrency control to simulate realistic multi-threaded execution.
  • Progressive Validation:
    • Added early / mid / late operation state checks to confirm stable timeline progression.
  • No production logic changes; test-only improvements.
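The lock provider setup described above can be sketched as a plain map of write options. The two option keys and the provider class name are standard Hudi configuration; the class name `OccWriteOptions` and the `build` helper are hypothetical, and the exact option set used by the tests in this PR is not shown here, so treat this as an illustration rather than the PR's code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not from the PR): collects the write options that
// turn on optimistic concurrency control with an in-process lock provider.
final class OccWriteOptions {

    static Map<String, String> build() {
        Map<String, String> opts = new LinkedHashMap<>();
        // Allow multiple writers and resolve conflicts at commit time.
        opts.put("hoodie.write.concurrency.mode", "optimistic_concurrency_control");
        // JVM-local lock provider, suited to multi-threaded tests in one process.
        opts.put("hoodie.write.lock.provider",
                "org.apache.hudi.client.transaction.lock.InProcessLockProvider");
        return opts;
    }

    public static void main(String[] args) {
        // Print the options in insertion order.
        build().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

In a Spark-based test these entries would typically be passed as writer options or set on the session before the concurrent threads start; how the PR wires them in is an assumption left open here.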

Impact

  • Enhanced Test Coverage: Improves confidence in Hudi’s behavior in multi-user and concurrent workloads.
  • Concurrency Safety Verification: Confirms Show procedures deliver consistent results without affecting ongoing write operations.
  • Lock Provider Validation: Ensures correct interplay between Hudi’s lock provider and Show procedures.
  • Timeline Consistency: Validates proper timeline evolution during concurrent modifications.
  • No user-facing or production changes.
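The progressive (early / mid / late) validation idea can be illustrated with a small, self-contained simulation. Everything in it is hypothetical: `FakeTimeline` stands in for a table timeline, `show()` stands in for a Show procedure call, and no Hudi API is used. The sketch only demonstrates the property being checked, namely that each snapshot read during concurrent writes is a prefix of the next one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical simulation (not from the PR) of timeline-progression checks.
final class ProgressiveTimelineCheck {

    // Stand-in for a table timeline; commits are appended in order.
    static final class FakeTimeline {
        private final List<String> instants = new CopyOnWriteArrayList<>();

        void commit(String instant) { instants.add(instant); }

        // Stand-in for a show procedure: returns a point-in-time copy.
        List<String> show() { return new ArrayList<>(instants); }
    }

    // True when 'earlier' is a prefix of 'later', i.e. nothing was lost or reordered.
    static boolean isPrefix(List<String> earlier, List<String> later) {
        return earlier.size() <= later.size()
                && later.subList(0, earlier.size()).equals(earlier);
    }

    // Runs one writer and one reader concurrently and reports whether every
    // consecutive pair of snapshots was consistent and all commits landed.
    static boolean run(int commits, int reads) {
        FakeTimeline timeline = new FakeTimeline();
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<?> writer = pool.submit(() -> {
                for (int i = 0; i < commits; i++) {
                    timeline.commit(String.format("%03d", i));
                }
            });
            Future<Boolean> reader = pool.submit(() -> {
                List<String> previous = timeline.show();
                for (int i = 0; i < reads; i++) {
                    List<String> current = timeline.show();
                    if (!isPrefix(previous, current)) {
                        return false;
                    }
                    previous = current;
                }
                return true;
            });
            writer.get();
            boolean stable = reader.get();
            return stable && timeline.show().size() == commits;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling `ProgressiveTimelineCheck.run(50, 200)` runs 50 simulated commits against 200 concurrent reads. A real test would replace `FakeTimeline` with actual table writes and procedure calls, and would likely pin the early, mid, and late checkpoints to specific operation states rather than polling freely.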

Risk Level

none
Test-only changes with no runtime impact on production environments.

Documentation Update

none

Contributor's checklist

  • Read through contributor's guide
  • Enough context is provided in the sections above
  • Adequate tests were added if applicable
  • [] CI passed

vamshikrishnakyatham and others added 4 commits September 1, 2025 22:19
…Clustering procedures - applies the same to all the remaining action, file group procedures
…Clustering procedures - applies the same to all the remaining action, file group procedures
…Clustering procedures - applies the same to all the remaining action, file group procedures
@hudi-bot
Collaborator

hudi-bot commented Dec 2, 2025

CI report:

Bot commands

@hudi-bot supports the following commands:
  • @hudi-bot run azure: re-run the last Azure build

Labels

size:L PR with lines of changes in (300, 1000]

4 participants