
par_sort_pairs: Use a growable pool of thread states #164

Merged
vigna merged 2 commits into vigna:main from progval:par_sort_pairs-crossbeam on Feb 7, 2026

Conversation

@progval (Contributor) commented on Feb 3, 2026

It turns out that rayon can return Yield::Idle when there is another task running on the same thread (lower in the call stack), because it cannot yield to that other task (it would need to jump back to that point of the call stack, then back to where we currently are). This means that we need a way to add new states somehow.

Instead of creating throwaway states (which would create many tiny BatchIterators we would have to merge with the others) or giving each thread its own pool of throwaway states (which would create fewer of them, but still many, because the extra ones can't be used by other threads), this introduces a global pool containing all the states.

This does add synchronization overhead, but it should be negligible because try_for_each_init calls this initializer only once per internal sequential iterator; and if there are many small such iterators, then rayon already adds too much overhead anyway.
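To make the design concrete, here is a minimal sketch of such a growable global pool used with try_for_each_init. This is not the PR's actual code: Pool, PoolGuard, and State are hypothetical names, and the real implementation may use crossbeam primitives (as the branch name suggests). The idea is that the init closure pops an idle state or grows the pool with a fresh one, and the state is returned to the pool when the sequential iterator it served is done.

```rust
use std::sync::Mutex;

use rayon::prelude::*;

#[derive(Default)]
struct State {
    // e.g. a buffer of pairs that will eventually become one BatchIterator
    buf: Vec<(u64, u64)>,
}

struct Pool {
    states: Mutex<Vec<State>>,
}

struct PoolGuard<'a> {
    pool: &'a Pool,
    state: Option<State>,
}

impl Drop for PoolGuard<'_> {
    fn drop(&mut self) {
        // Return the state to the pool so later iterators can reuse it.
        if let Some(state) = self.state.take() {
            self.pool.states.lock().unwrap().push(state);
        }
    }
}

impl Pool {
    fn get(&self) -> PoolGuard<'_> {
        // Reuse an idle state if one is available, otherwise grow the pool.
        let state = self.states.lock().unwrap().pop().unwrap_or_default();
        PoolGuard { pool: self, state: Some(state) }
    }
}

fn main() {
    let pool = Pool { states: Mutex::new(Vec::new()) };

    (0u64..1_000_000)
        .into_par_iter()
        .try_for_each_init(
            // Called once per internal sequential iterator, not once per item,
            // so the pool's lock is taken rarely.
            || pool.get(),
            |guard, i| -> Result<(), ()> {
                guard.state.as_mut().unwrap().buf.push((i, i + 1));
                Ok(())
            },
        )
        .unwrap();

    // All states (and hence all accumulated batches) are back in the pool.
    println!("pool size: {}", pool.states.lock().unwrap().len());
}
```

Because the initializer runs once per internal sequential iterator rather than once per item, the lock on the pool is taken rarely, which is why the extra synchronization should be negligible in practice.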

@vigna (Owner) commented on Feb 7, 2026

I know this might seem like a stupid question, but could the problem be in the parallel iterator and not in the sorter?

@progval (Contributor, Author) commented on Feb 7, 2026

It doesn't matter why Rayon calls worker functions recursively; it's well known that it will do that in many cases. It's the same reason we can't hold mutexes across calls to Rayon functions.
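As a minimal, self-contained illustration (not code from this repository) of that recursion: rayon::join runs its first closure inline on the calling worker thread, and may run the second one there as well if it is not stolen, so a nested parallel call executes below its parent task in the same call stack. That is the kind of situation in which a per-thread state lower in the call stack is still in use, and in which rayon's yield mechanism can report Yield::Idle.

```rust
use std::thread;

fn main() {
    rayon::scope(|s| {
        for _ in 0..8 {
            s.spawn(|_| {
                let outer = thread::current().id();
                // A nested parallel call: the first closure runs on the
                // calling worker thread; the second may too if not stolen.
                rayon::join(|| report(outer), || report(outer));
            });
        }
    });
}

fn report(outer: thread::ThreadId) {
    if thread::current().id() == outer {
        // Commonly taken: the nested task runs on the same worker thread
        // as its parent, below it in the call stack.
        println!("nested task ran inline on worker {:?}", outer);
    }
}
```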

@vigna vigna merged commit 0687d11 into vigna:main on Feb 7, 2026
3 of 4 checks passed
