Conversation

@kingcrimsontianyu
Contributor

@kingcrimsontianyu kingcrimsontianyu commented Nov 20, 2025

This PR introduces two changes in KvikIO C++ API:

  • For the pread (and pwrite, where applicable) methods of FileHandle, RemoteHandle, and MmapHandle, this PR adds the thread pool as a function parameter. By default, the global thread pool is used.
  • After the cleanup in a previous PR (https://github.com/rapidsai/kvikio/pull/851/files/30543a3ccb953b0eb2afb7bb91d36ceda482dd69#r2432486309), the thread_pool_wrapper class merely forwards calls without adding useful functionality. This PR removes it and adds a simple type alias, ThreadPool, for the underlying BS::thread_pool.

This PR is a dependency of #874, which facilitates investigation into a multi-drive scaling problem reported in #850.
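To illustrate the shape of the first change, here is a minimal sketch of a trailing thread pool parameter that defaults to a global pool. ToyPool, global_pool, and toy_pread are hypothetical stand-ins invented for this example; they are not KvikIO's ThreadPool, defaults::thread_pool(), or pread, and the real signatures may differ.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <future>
#include <utility>

// Hypothetical stand-in for a thread pool (NOT the real ThreadPool alias).
struct ToyPool {
  std::atomic<int> submitted{0};  // number of tasks handed to this pool

  // Runs the task asynchronously; a real pool would queue it for a worker.
  template <typename F>
  auto submit(F&& task)
  {
    ++submitted;
    return std::async(std::launch::async, std::forward<F>(task));
  }
};

// Hypothetical stand-in for a process-wide default pool.
ToyPool& global_pool()
{
  static ToyPool pool;
  return pool;
}

// Assumed shape only: the pool is the last parameter and defaults to the
// global pool, so callers that do not care keep their existing call sites.
std::future<std::size_t> toy_pread(std::size_t size,
                                   std::size_t offset     = 0,
                                   ToyPool* thread_pool   = &global_pool())
{
  assert(thread_pool != nullptr);
  // Pretend to read `size` bytes at `offset` and report the byte count.
  return thread_pool->submit([size, offset] {
    (void)offset;
    return size;
  });
}
```

A caller that wants one pool per drive could construct its own ToyPool and pass its address as the last argument, while toy_pread(8) alone falls back to the global pool.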

@copy-pr-bot

copy-pr-bot bot commented Nov 20, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@kingcrimsontianyu kingcrimsontianyu added the labels breaking (Introduces a breaking change), improvement (Improves an existing functionality), and c++ (Affects the C++ API of KvikIO) Nov 20, 2025
@kingcrimsontianyu kingcrimsontianyu changed the title from "Add thread pool as function parameters" to "Add thread pool as function parameters to C++ API" Nov 24, 2025
@kingcrimsontianyu
Contributor Author

/ok to test afb670d

@rapidsai rapidsai deleted a comment from copy-pr-bot bot Nov 24, 2025
@kingcrimsontianyu
Contributor Author

/ok to test 19f8af9

@kingcrimsontianyu kingcrimsontianyu marked this pull request as ready for review November 24, 2025 20:18
@kingcrimsontianyu kingcrimsontianyu requested a review from a team as a code owner November 24, 2025 20:18
Member

@madsbk madsbk left a comment

Looks good, but I think it would be useful to add some tests that mix different thread pools.

  std::size_t offset = 0,
- std::size_t task_size = defaults::task_size());
+ std::size_t task_size = defaults::task_size(),
+ ThreadPool* thread_pool = &defaults::thread_pool());
Member

Consider using std::shared_ptr<ThreadPool> throughout to encourage easier and safer lifetime management for users.

Contributor Author

Thanks. Thinking it over: if we pass the thread pool as a shared pointer, the shared pointer passed in is destroyed when the asynchronous function pread/pwrite returns. So in order to properly extend the pool's lifetime for the async operation and prevent use-after-free, we need to further share its ownership with the I/O tasks, either with each task or with the last aggregate task. The pro is that there is no concern over the thread pool's lifetime at the point where the std::future's result is being waited for. The con is a slight increase in implementation complexity and runtime overhead.

If we pass a raw pointer instead, we claim no ownership responsibility and require users to keep the pool alive throughout the I/O operations. The pro is simplicity, and the con is that we lose the benefits of smart pointers.
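A small sketch of the two ownership options, for illustration only. Resource stands in for the thread pool, and start_owning and start_borrowing are hypothetical names; std::async with a deferred launch is used here just to hold the task until the result is requested, which is not how KvikIO schedules its I/O tasks.

```cpp
#include <cassert>
#include <future>
#include <memory>
#include <utility>

// Hypothetical stand-in for the object whose lifetime is in question.
struct Resource {
  int value = 42;
};

// Shared-pointer option: the task co-owns the resource, so it stays alive
// even after the caller's own shared_ptr has been released.
std::future<int> start_owning(std::shared_ptr<Resource> res)
{
  assert(res != nullptr);
  return std::async(std::launch::deferred,
                    [res = std::move(res)] { return res->value; });
}

// Raw-pointer option: the task only borrows the resource. The caller must
// keep `*res` alive until the future's result has been retrieved.
std::future<int> start_borrowing(Resource* res)
{
  assert(res != nullptr);
  return std::async(std::launch::deferred, [res] { return res->value; });
}
```

With start_owning the caller may move its pointer into the call and forget about it; with start_borrowing the object has to outlive the call to get() on the returned future.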

Contributor Author

There appears to be a very tricky problem.

To extend the lifetime of the thread pool properly during the asynchronous operations, we need to capture std::shared_ptr<ThreadPool> in the last task:
https://github.com/rapidsai/kvikio/blob/main/cpp/include/kvikio/detail/parallel_operation.hpp#L166

auto last_task = [=, thread_pool = thread_pool, tasks = std::move(tasks)]() mutable -> std::size_t {

Suppose the reference count is exactly 1 when the last task is being executed. When it is done, the task goes out of scope precisely at https://github.com/bshoshany/thread-pool/blob/v4.1.0/include/BS_thread_pool.hpp#L938, so the reference count reaches 0 and the pool starts being destroyed. In the destructor, we wait (sleep) (https://github.com/bshoshany/thread-pool/blob/v4.1.0/include/BS_thread_pool.hpp#L336) for the condition tasks_running == 0, which will never hold because --tasks_running takes place at the beginning of the worker's loop (https://github.com/bshoshany/thread-pool/blob/v4.1.0/include/BS_thread_pool.hpp#L915). So tasks_running stays at 1 and we wait forever in the destructor. Strangely, I haven't seen this in my unit test, but I fear that the deadlock may appear in production.
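The root of the scenario above is that an object owned by std::shared_ptr is destroyed on whichever thread drops the last reference. The sketch below demonstrates only that property and deliberately does not reproduce the deadlock. ToyPool is a hypothetical stand-in whose destructor just records the thread it ran on; it is not BS::thread_pool, and a plain std::thread plays the role of the pool's worker.

```cpp
#include <cassert>
#include <memory>
#include <thread>
#include <utility>

// Hypothetical stand-in for a pool: its destructor records where it ran.
struct ToyPool {
  std::thread::id* destroyed_on;
  explicit ToyPool(std::thread::id* out) : destroyed_on(out) {}
  ~ToyPool() { *destroyed_on = std::this_thread::get_id(); }
};

struct Report {
  std::thread::id worker;        // id of the thread that ran the "last task"
  std::thread::id destroyed_on;  // id of the thread that ran ~ToyPool
};

Report release_last_reference_on_worker()
{
  Report report;
  auto pool = std::make_shared<ToyPool>(&report.destroyed_on);
  assert(pool.use_count() == 1);

  // The "last task" holds the only reference, as in the scenario described.
  std::thread worker([&report, pool = std::move(pool)]() mutable {
    report.worker = std::this_thread::get_id();
    pool.reset();  // count hits 0 here, so ~ToyPool runs on this thread
  });
  worker.join();
  return report;
}
```

In this toy the destructor returns immediately. A destructor that waits for its own workers to become idle, as described in the comment above, would instead be waiting on the very thread that is executing it.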

Contributor Author

So I think if we do want to extend the lifetime of the thread pool, we need to add it directly to the returned future's result (the "shared state" in C++ terminology), i.e. instead of std::future<std::size_t> we would probably need std::future<std::pair<std::size_t, std::shared_ptr<ThreadPool>>>.
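A rough sketch of that idea, under stated assumptions: ToyPool is again a hypothetical stand-in that records the thread its destructor ran on, run_and_hand_back is an invented name, and std::async stands in for submitting the last task to a real pool. The task moves its reference into the result, so the last owner ends up being whoever consumes the future.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <memory>
#include <thread>
#include <utility>

// Hypothetical stand-in for a pool: its destructor records where it ran.
struct ToyPool {
  std::thread::id* destroyed_on;
  explicit ToyPool(std::thread::id* out) : destroyed_on(out) {}
  ~ToyPool() { *destroyed_on = std::this_thread::get_id(); }
};

// Result type in the spirit of
// std::pair<std::size_t, std::shared_ptr<ThreadPool>>.
using Result = std::pair<std::size_t, std::shared_ptr<ToyPool>>;

std::future<Result> run_and_hand_back(std::shared_ptr<ToyPool> pool,
                                      std::size_t nbytes)
{
  assert(pool != nullptr);
  // The task gives up its reference by moving it into the returned value,
  // so the worker thread never holds the last reference when it finishes.
  return std::async(std::launch::async,
                    [pool = std::move(pool), nbytes]() mutable {
                      return Result{nbytes, std::move(pool)};
                    });
}
```

Once the caller retrieves the result with get(), the pool is released when that result goes out of scope on the caller's thread rather than on a worker.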

Contributor Author

At this point, I'm inclined to go back to the raw pointer approach, and ask users to shoulder the responsibility of lifetime management for the thread pool. 🤔

Member

sounds good

@kingcrimsontianyu
Contributor Author

/ok to test bbf240b

@kingcrimsontianyu
Contributor Author

/merge

@rapids-bot rapids-bot bot merged commit 2481f31 into rapidsai:main Nov 26, 2025
78 checks passed
2 participants