-
Notifications
You must be signed in to change notification settings - Fork 83
feat: Add range-based locking for parallel file I/O #847
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Implement range-based locking mechanism to enable truly parallel non-overlapping writes to files. This improves performance for multi-GPU workloads by allowing concurrent writes to different file regions while serializing only overlapping operations. Key changes: - Add RangeLockManager class for managing non-overlapping range locks - Add FileHandleWithRangeLock extending FileHandle with range lock support - Add comprehensive C++ and Python tests for range lock functionality - Support move semantics for efficient lock transfer Performance benefits: - Non-overlapping writes can execute in parallel - Reduces contention for multi-GPU file I/O operations - Maintains data integrity by serializing overlapping ranges
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR introduces a range-based locking mechanism for kvikio to enable parallel file I/O operations on non-overlapping file regions. The implementation allows multiple threads or GPUs to write to different sections of a file simultaneously while serializing only overlapping operations.
Key Changes
- Added
RangeLockManagerclass for managing overlapping range detection and locking - Added
FileHandleWithRangeLockextendingFileHandlewith range-aware parallel I/O operations - Comprehensive test coverage in both C++ and Python to validate parallel execution and data integrity
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| cpp/include/kvikio/range_lock.hpp | Core range lock manager with overlap detection and RAII lock semantics |
| cpp/include/kvikio/file_handle_rangelock.hpp | Extended file handle with range-locked read/write operations |
| cpp/tests/test_range_lock.cpp | C++ unit tests for range locking functionality and performance |
| python/kvikio/tests/test_range_lock.py | Python integration tests for concurrent file operations |
| cpp/tests/CMakeLists.txt | Build configuration for range lock tests |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
| return std::async(std::launch::deferred, [future = std::move(future), | ||
| lock = std::move(range_lock)]() mutable { | ||
| auto result = future.get(); | ||
| // Lock will be automatically released when this lambda exits | ||
| return result; | ||
| }); |
Copilot
AI
Sep 30, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using std::launch::deferred prevents the operation from running in parallel. The lambda will only execute when .get() is called, defeating the purpose of parallel I/O. Consider using std::launch::async or a different approach to maintain parallelism while ensuring proper lock cleanup.
| return std::async(std::launch::deferred, [future = std::move(future), | ||
| lock = std::move(range_lock)]() mutable { |
Copilot
AI
Sep 30, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same deferred execution issue as in pwrite_rangelock. This prevents the read operation from running in parallel, contradicting the parallel I/O goals of the feature.
| return std::async(std::launch::deferred, [future = std::move(future), | |
| lock = std::move(range_lock)]() mutable { | |
| return std::async(std::launch::async, [future = std::move(future), | |
| lock = std::move(range_lock)]() mutable { |
|
/ok to test |
@madsbk, there was an error processing your request: See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/1/ |
|
/ok to test 68ff040 |
| bool sync_default_stream = true) { | ||
|
|
||
| // Acquire range lock for this write | ||
| auto range_lock = range_lock_manager_.lock_range(file_offset, file_offset + size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
KvikIO has transitioned from a header-only library to a shared library. So all the implementations should go to corresponding .cpp files.
| * Copyright (c) 2025, NVIDIA CORPORATION. | ||
| * | ||
| * Modified FileHandle with range-based locking support | ||
| */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Preamble should follow this complete format: https://github.com/rapidsai/kvikio/blob/branch-25.12/cpp/include/kvikio/file_handle.hpp
| * | ||
| * This version acquires a lock only for the specific range being written, | ||
| * allowing non-overlapping writes to proceed in parallel. | ||
| */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function parameters and return values should be commented in Doxygen format.
|
|
||
| namespace kvikio { | ||
|
|
||
| class FileHandleWithRangeLock : public FileHandle { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the class needs to be commented in detail too, to explain the purpose of this class and also key implementation details. For example, when multiple write requests contend on a common range, what would the behavior be?
|
@a-hirota, thanks for contributing! When you say locking, do you mean locking via this API? We’re not doing any file-based locking here, right? |
| return std::async(std::launch::deferred, [future = std::move(future), | ||
| lock = std::move(range_lock)]() mutable { | ||
| auto result = future.get(); | ||
| // Lock will be automatically released when this lambda exits | ||
| return result; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For parallel I/O, the step to wait for the chunked tasks is performed in the thread pool (https://github.com/rapidsai/kvikio/blob/branch-25.12/cpp/include/kvikio/parallel_operation.hpp#L175-L184). Would it be possible that this unlock step here follows suit and gets pushed to the thread pool's task queue as well?
|
Thanks for submitting the PR for review. For completely new features, it is usually a good practice to file an issue first that explains the need/motivation/objective, before diving into implementation details. I think it would be of great help if you could elaborate on the use case of this range locking feature. Are you developing a database system based on KvikIO? You mentioned that this PR significantly improves the performance for multi-GPU. Is there any benchmark result to share? |
Description
This PR implements range-based locking mechanism for kvikio to enable truly parallel non-overlapping writes to files. This significantly
improves performance for multi-GPU workloads.
Key Changes
RangeLockManagerclass for managing non-overlapping range locksFileHandleWithRangeLockextendingFileHandlewith range lock supportPerformance Benefits
Usage Example
C++ Usage
Python Usage (Future API)
Testing
Performance Impact
In our tests with 2 GPUs writing to non-overlapping regions: