parallel median, n-th variants #1281

Open
qkniep wants to merge 1 commit into rayon-rs:main from qkniep:main

qkniep commented Jan 18, 2026

This is a basic implementation of par_median and other variants for n-th element selection.

The algorithm works as follows: 1) sample a subset of roughly sqrt(len) elements, 2) sort that sample, 3) pick lower and upper bounds close to the k/len quantile within the sample, 4) collect all elements between these bounds from the input array, 5) if the k-th element lies within the bounds or equals one of the bounds, return it. Note that this can fail if the initial sample or bound selection was "bad", i.e. the k-th element falls outside the bounds. In that case we currently fall back to a sequential call to select_nth_unstable().
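
For readers following along, the five steps above can be sketched roughly as below. This is a sequential, std-only illustration and not the code in this PR: the function name `select_nth_by_sampling`, the fixed-stride sampling (standing in for random sampling), and the `margin` used to pick the bounds are all assumptions made for the example.

```rust
/// Hypothetical helper (not part of rayon or this PR): returns the value that
/// would sit at index `k` if `data` were sorted, or `None` if `k` is out of range.
fn select_nth_by_sampling(data: &[u64], k: usize) -> Option<u64> {
    let len = data.len();
    if k >= len {
        return None;
    }

    // 1) Sample roughly sqrt(len) elements. A fixed stride is used here for
    //    simplicity; this is an assumption, a real version may sample randomly.
    let sample_size = ((len as f64).sqrt() as usize).max(1);
    let stride = (len / sample_size).max(1);
    let mut sample: Vec<u64> = data.iter().step_by(stride).copied().collect();

    // 2) Sort the sample.
    sample.sort_unstable();

    // 3) Pick bounds around the k/len quantile of the sample.
    //    The margin of sqrt(sample.len()) is an assumed choice.
    let pos = k * sample.len() / len;
    let margin = ((sample.len() as f64).sqrt() as usize).max(1);
    let lo = sample[pos.saturating_sub(margin)];
    let hi = sample[(pos + margin).min(sample.len() - 1)];

    // 4) One pass over the input: count elements below `lo`, count elements
    //    equal to either bound, and collect those strictly between the bounds.
    let mut below: usize = 0;
    let mut eq_lo: usize = 0;
    let mut eq_hi: usize = 0;
    let mut between: Vec<u64> = Vec::new();
    for &x in data {
        if x < lo {
            below += 1;
        } else if x == lo {
            eq_lo += 1;
        } else if x < hi {
            between.push(x);
        } else if x == hi {
            eq_hi += 1;
        }
    }

    // 5) Return the k-th element if it equals a bound or lies between them.
    if k >= below && k < below + eq_lo {
        return Some(lo);
    }
    let start = below + eq_lo;
    if k >= start && k < start + between.len() {
        let (_, nth, _) = between.select_nth_unstable(k - start);
        return Some(*nth);
    }
    let hi_start = start + between.len();
    if k >= hi_start && k < hi_start + eq_hi {
        return Some(hi);
    }

    // "Bad" sample or bounds: fall back to a sequential selection on a copy.
    let mut copy = data.to_vec();
    let (_, nth, _) = copy.select_nth_unstable(k);
    Some(*nth)
}
```

In this sketch the single pass in step 4 dominates the cost, so it is the natural candidate for parallelization. Counting the elements equal to `lo` and `hi` separately is what lets inputs with many duplicates resolve without hitting the fallback.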

As opposed to what I wrote in #1254, this implementation now seems to perform well even for non-uniform low-cardinality data (due to the counting of elements that equal either of the bounds).

Addresses #1254.
