feat: Allow quantile to compute multiple quantiles at once #25516

agossard · 2025-11-27T04:12:35Z

A first cut here. Interested in any feedback.

Note: for multiple quantiles, we always sort up front (for now). Quickselect is still used when we are looking for only one quantile, we can get a contiguous slice, and the data is not already sorted. We could consider implementing a multi-quantile quickselect if we think it is worth it.
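The trade-off described above can be sketched in plain Rust. This is a minimal illustration, not the Polars implementation: the names `interpolate_sorted`, `quantiles_sorted`, and `quantile_select` are hypothetical, linear interpolation is assumed as the method, nulls and NaN handling are ignored, and every `q` is assumed to lie in [0, 1].

```rust
/// Linear-interpolated quantile of an already sorted, non-empty slice.
fn interpolate_sorted(sorted: &[f64], q: f64) -> f64 {
    let pos = q * (sorted.len() - 1) as f64;
    let lo = pos.floor() as usize;
    let hi = pos.ceil() as usize;
    let frac = pos - lo as f64;
    sorted[lo] + (sorted[hi] - sorted[lo]) * frac
}

/// Multiple quantiles: sort once, then read every quantile off the sorted copy.
fn quantiles_sorted(values: &[f64], qs: &[f64]) -> Vec<Option<f64>> {
    if values.is_empty() {
        return vec![None; qs.len()];
    }
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    qs.iter()
        .map(|&q| Some(interpolate_sorted(&sorted, q)))
        .collect()
}

/// Single quantile: partial selection instead of a full sort.
fn quantile_select(values: &[f64], q: f64) -> Option<f64> {
    if values.is_empty() {
        return None;
    }
    let mut buf = values.to_vec();
    let pos = q * (buf.len() - 1) as f64;
    let lo = pos.floor() as usize;
    let frac = pos - lo as f64;
    // After selection, `lo_val` is the element a full sort would place at
    // index `lo`, and every element of `upper` is greater than or equal to it.
    let (_, lo_val, upper) = buf.select_nth_unstable_by(lo, |a, b| a.total_cmp(b));
    let lo_val = *lo_val;
    if frac == 0.0 {
        return Some(lo_val);
    }
    // The next order statistic is the minimum of the upper partition.
    let hi_val = upper.iter().copied().fold(f64::INFINITY, f64::min);
    Some(lo_val + (hi_val - lo_val) * frac)
}
```

Sorting costs O(n log n) once and then serves any number of quantiles, while the selection path is O(n) on average but would have to be repeated (or generalized) for each additional quantile.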

I don't love having both .quantile and .quantiles in so many places in the internal code, but I wasn't sure how hard it would be to avoid that.

Note: I have not yet refactored .describe() to use this, but I can. I did make qcut use this pathway.


nikaltipar · 2025-11-28T13:09:46Z

Thanks for adding the change. This might need some extra use-cases.

For instance, a test case with an array input containing a few valid quantiles followed by an invalid one.
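One way to picture the requested case is a validation step that rejects the whole request when any quantile falls outside [0, 1], even if the earlier entries are valid. The helper `validate_quantiles` and its error shape are hypothetical; the PR's actual error handling may differ.

```rust
/// Hypothetical validation step: every quantile must lie in [0.0, 1.0].
/// On failure, returns the index and value of the first offending entry.
fn validate_quantiles(qs: &[f64]) -> Result<(), (usize, f64)> {
    for (i, &q) in qs.iter().enumerate() {
        // `contains` is false for NaN, so NaN is rejected as well.
        if !(0.0..=1.0).contains(&q) {
            return Err((i, q));
        }
    }
    Ok(())
}
```

A test along the lines suggested above would then pass something like `[0.1, 0.5, 1.5]` and assert that the call fails rather than returning results for the first two entries.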


orlp · 2025-12-01T14:46:04Z

crates/polars-core/src/series/implementations/time.rs

+                .map(|v: Option<f64>| v.map(|f| f as i64))
+                .collect::<Int64Chunked>()
+                .into_series();
+            // Cast the int64 series to the time type


This should not use a cast, but rather do into_time on the Int64Chunked.

orlp · 2025-12-01T14:46:24Z

crates/polars-core/src/series/implementations/duration.rs

+                .map(|v: Option<f64>| v.map(|f| f as i64))
+                .collect::<Int64Chunked>()
+                .into_series();
+            // Cast the int64 series to the duration type


This should not use a cast, but rather do into_duration on the Int64Chunked.

orlp · 2025-12-01T14:47:29Z

crates/polars-core/src/series/implementations/datetime.rs

+                .map(|v: Option<f64>| v.map(|f| f as i64))
+                .collect::<Int64Chunked>()
+                .into_series();
+            // Cast the int64 series to the datetime type


This should not use a cast, but rather do into_datetime on the Int64Chunked.

orlp · 2025-12-01T14:47:55Z

crates/polars-core/src/series/implementations/date.rs

+                .map(|v: Option<f64>| v.map(|f| (f * (US_IN_DAY as f64)) as i64))
+                .collect::<Int64Chunked>()
+                .into_series();
+            // Cast the int64 series to the datetime type


This should not use a cast, but rather do into_date on the Int64Chunked.

orlp · 2025-12-01T14:49:03Z

crates/polars-core/src/chunked_array/ops/mod.rs

+        for _q in quantiles {
+            out.push(None);
+        }
+        Ok(out)


vec![None; quantiles.len()]
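The suggestion replaces the push loop with the repetition form of `vec!`. A small standalone comparison, assuming `Option<f64>` elements (the function names are illustrative and not from the PR):

```rust
/// The loop from the diff, reproduced in isolation.
fn all_none_loop(quantiles: &[f64]) -> Vec<Option<f64>> {
    let mut out = Vec::new();
    for _q in quantiles {
        out.push(None);
    }
    out
}

/// The suggested form: a single allocation of the right size, same contents.
fn all_none_vec(quantiles: &[f64]) -> Vec<Option<f64>> {
    vec![None; quantiles.len()]
}
```

Both functions return a vector with one `None` per requested quantile; the second states that intent directly.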

orlp · 2025-12-01T14:50:57Z

crates/polars-core/src/chunked_array/ops/aggregate/mod.rs

    }

+    fn quantiles_reduce(&self, quantiles: &[f64], method: QuantileMethod) -> PolarsResult<Scalar> {
+        let v = self.quantiles(quantiles, method)?; // Vec<Option<f64>>


Please don't annotate types in comments like this.

orlp · 2025-12-01T14:51:54Z

crates/polars-core/src/chunked_array/ops/aggregate/mod.rs


+    fn quantiles_reduce(&self, quantiles: &[f64], method: QuantileMethod) -> PolarsResult<Scalar> {
+        let v = self.quantiles(quantiles, method)?; // Vec<Option<f64>>
+        // build a Float64 series from the optional results, preserving nulls


We're not building a Float64 series here.

codecov · 2025-12-02T05:03:10Z

Codecov Report

❌ Patch coverage is 67.68293% with 106 lines in your changes missing coverage. Please review.
✅ Project coverage is 79.81%. Comparing base (45819db) to head (f752ee9).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...polars-core/src/series/implementations/duration.rs | 30.00% | 14 Missing ⚠️ |
| crates/polars-expr/src/expressions/aggregation.rs | 65.85% | 14 Missing ⚠️ |
| ...tes/polars-core/src/series/implementations/date.rs | 38.09% | 13 Missing ⚠️ |
| ...polars-core/src/series/implementations/datetime.rs | 38.09% | 13 Missing ⚠️ |
| .../polars-core/src/series/implementations/decimal.rs | 40.90% | 13 Missing ⚠️ |
| ...tes/polars-core/src/series/implementations/time.rs | 31.57% | 13 Missing ⚠️ |
| ...polars-core/src/chunked_array/ops/aggregate/mod.rs | 70.00% | 12 Missing ⚠️ |
| crates/polars-core/src/chunked_array/ops/mod.rs | 0.00% | 5 Missing ⚠️ |
| ...s-core/src/chunked_array/ops/aggregate/quantile.rs | 96.66% | 3 Missing ⚠️ |
| crates/polars-core/src/series/series_trait.rs | 81.25% | 3 Missing ⚠️ |

... and 1 more

Additional details and impacted files

@@            Coverage Diff             @@
##             main   #25516      +/-   ##
==========================================
+ Coverage   79.61%   79.81%   +0.20%     
==========================================
  Files        1729     1729              
  Lines      239727   239952     +225     
  Branches     3038     3038              
==========================================
+ Hits       190857   191521     +664     
+ Misses      48087    47648     -439     
  Partials      783      783


orlp · 2025-12-04T09:49:41Z

crates/polars-core/src/series/series_trait.rs

+    /// Get the quantile of the Series as a new Series of length 1.
+    /// Default implementation delegates to `quantiles_reduce` with a single element
+    /// and unwraps the resulting `List` scalar to a plain scalar where possible.
+    fn quantile_reduce(&self, quantile: f64, method: QuantileMethod) -> PolarsResult<Scalar> {


Something is fundamentally wrong with the datatypes. You changed everything to go through quantiles_reduce and made the dtype of quantiles_reduce dynamic based on the length of the input.

quantiles_reduce should always return a List, even if the input has length 1. Please undo the changes that make everything go through quantiles_reduce.

OK. I just want to make sure I fully understand the issue and what you would like done. Putting the specific internal function implementations aside for a second, the desired behavior (please confirm) is that we have one user-facing function (quantile) with two overloaded type behaviors:

1. f64 input quantile -> f64 output

2. list of f64 inputs -> list of f64 outputs

Obviously, from a math/process standpoint, (1) is just a special case of (2), but from a types standpoint they are different.

Internally, we need to go through a lot of steps/functions to go all the way in and all the way back out. My original implementation handled those two type cases in (essentially) side-by-side functions all the way down to the bottom and back out again, including a new "quantiles_reduce" function sitting next to "quantile_reduce" (each of which needs an implementation for all the different numerical types). This seemed bad to me, so I collapsed the two reduce functions into a single one that could handle both cases.

Are you saying you don’t want that and we should go back to having two reduce functions, one for f64 -> f64 and one for list -> list? Or are you saying everything can go through the new quantiles_reduce, but make that function always do list -> list and handle the scalar/list type conversion in the caller?

> Are you saying you don’t want that and we should go back to having two reduce functions, one for f64 -> f64 and one for list -> list?

Yes. They are different operations.

> Or are you saying everything can go through the new quantiles_reduce, but make that function always do list -> list and handle the scalar/list type conversion in the caller?

This also would've been fine, but my preference goes to the above. What isn't fine is that the output datatype depends on the length of the input.
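The point about datatypes can be illustrated with a toy model. The `Scalar` enum and both functions below are hypothetical stand-ins, not Polars' real `Scalar` or `SeriesTrait` API, and the quantile uses a simple nearest-index rule for brevity. What matters is that each function's output variant is fixed by the function itself, never by the length of the input.

```rust
/// Toy stand-in for a dynamically typed scalar (not the Polars type).
#[derive(Debug, PartialEq)]
enum Scalar {
    Float(Option<f64>),
    List(Vec<Option<f64>>),
}

/// Nearest-index quantile on a sorted copy of the data (illustration only;
/// re-sorting per call is wasteful but keeps the sketch short).
fn nearest_quantile(values: &[f64], q: f64) -> Option<f64> {
    if values.is_empty() {
        return None;
    }
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let idx = (q * (sorted.len() - 1) as f64).round() as usize;
    Some(sorted[idx])
}

/// f64 in, float scalar out.
fn quantile_reduce(values: &[f64], q: f64) -> Scalar {
    Scalar::Float(nearest_quantile(values, q))
}

/// List in, list scalar out, even when the list has exactly one element.
fn quantiles_reduce(values: &[f64], qs: &[f64]) -> Scalar {
    Scalar::List(qs.iter().map(|&q| nearest_quantile(values, q)).collect())
}
```

With this split, `quantiles_reduce(data, &[0.5])` yields a one-element list rather than a bare float, so callers can rely on the output type without inspecting the input length.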

agossard added 9 commits November 25, 2025 22:14

Start on low level implementations

d949654

Implement multi quantile call first pass

d8bad2f

Changed documentation

b2a7d38

unit test

58b5dc3

Merge remote-tracking branch 'origin/main' into feature/multi_quantiles

852efcf

Implement quantiles_reduce for f16

991cf9a

Fix merge problems

4104d11

Remove one permutation of low level quantile algorithm

0355c5e

lint errors

aedf643

agossard requested review from MarcoGorelli, alexander-beedie, c-peters, orlp, reswqa and ritchie46 as code owners November 27, 2025 04:12

github-actions bot added the enhancement, python, and rust labels Nov 27, 2025

mcrumiller reviewed Nov 27, 2025


agossard added 4 commits November 27, 2025 21:55

Implement quantiles_reduce on various SeriesWrape datatype implementations

24d031e

Update mypy signature

4531fdb

formatting fixes

bfcafd5

Fix test variable name

3ec8c0b

agossard and others added 6 commits November 28, 2025 11:46

start on refactoring

17ba975

Fix series documentation

8348a34

Handle integers (0 or 1) coming in as the quantile

b404423

linter errors

71241f4

Merge branch 'pola-rs:main' into feature/multi_quantiles

5c81cbf

More lint problems

2a0937b

agossard added 3 commits November 28, 2025 13:33

Merge remote-tracking branch 'origin/feature/multi_quantiles' into feature/multi_quantiles

70165b9

Merge branch 'feature/multi_quantiles' into feature/multi_quantile_refactor

1eeea47

A bunch of progress

9eabe1f

orlp requested changes Dec 1, 2025


agossard added 2 commits December 1, 2025 23:23

Clean up and make suggested code changes

72d5d8b

fix lint problems

98a4714

agossard and others added 3 commits December 2, 2025 05:42

Merge branch 'main' into feature/multi_quantiles

000dc2e

streamline get_quantile

1ca1ee2

Merge branch 'main' into feature/multi_quantiles

f752ee9

orlp requested changes Dec 4, 2025
