PERCENTILE_CONT and PERCENTILE_DISC functions #8807
Draft
+616
−2
PERCENTILE_DISC and PERCENTILE_CONT functions
The PERCENTILE_CONT and PERCENTILE_DISC functions are known as inverse distribution functions.
These functions operate on an ordered set. Both functions can be used as aggregate or window functions.
PERCENTILE_DISC
PERCENTILE_DISC is an inverse distribution function that assumes a discrete distribution model.
It takes a percentile value and a sort specification and returns an element from the set.
Nulls are ignored in the calculation.
Syntax for the PERCENTILE_DISC function as an aggregate function.
Syntax for the PERCENTILE_DISC function as a window function.
The first argument <percent> must evaluate to a numeric value between 0 and 1, because it is a percentile value.
This expression must be constant within each aggregate group.
The ORDER BY clause takes a single expression that can be of any type that can be sorted.
The function PERCENTILE_DISC returns a value of the same type as the argument in ORDER BY.
For a given percentile value P, PERCENTILE_DISC sorts the values of the expression in the ORDER BY clause and returns the value with the smallest CUME_DIST value (with respect to the same sort specification) that is greater than or equal to P.
Analytic Example
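The selection rule for PERCENTILE_DISC can also be modelled outside SQL. The Python function below is a minimal sketch of that rule, not the engine's implementation: the name percentile_disc is hypothetical, only ascending order is handled, and returning None for an empty set is an assumption about how an empty group would be treated.

```python
from bisect import bisect_right


def percentile_disc(values, p):
    """Hypothetical model of PERCENTILE_DISC over an ascending sort."""
    if not 0 <= p <= 1:
        raise ValueError("percentile must be between 0 and 1")

    # Nulls (None here) are ignored in the calculation.
    ordered = sorted(v for v in values if v is not None)
    n = len(ordered)
    if n == 0:
        return None  # assumption: an empty set yields no value

    for value in ordered:
        # CUME_DIST of a row: rows ordered at or before it (peers included) / N.
        cume_dist = bisect_right(ordered, value) / n
        if cume_dist >= p:
            # First value whose CUME_DIST reaches P; always a member of the set.
            return value

    return ordered[-1]  # not reached for a valid p, since the last CUME_DIST is 1
```

With the values 10, 20, 30, 40, a percentile of 0.5 returns 20 (its CUME_DIST is exactly 0.5), while 0.51 returns 30. The result is always one of the input values, which is why any sortable type works.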
PERCENTILE_CONT
PERCENTILE_CONT is an inverse distribution function that assumes a continuous distribution model.
It takes a percentile value and a sort specification and returns a value interpolated between elements of the set.
Nulls are ignored in the calculation.
Syntax for the PERCENTILE_CONT function as an aggregate function.
Syntax for the PERCENTILE_CONT function as a window function.
The first argument <percent> must evaluate to a numeric value between 0 and 1, because it is a percentile value.
This expression must be constant within each aggregate group.
The ORDER BY clause takes a single expression, which must be of numeric type to perform interpolation.
The PERCENTILE_CONT function returns a value of type DOUBLE PRECISION or DECFLOAT(34), depending on the type of the argument in the ORDER BY clause. A value of type DECFLOAT(34) is returned if ORDER BY contains an expression of one of the types INT128, NUMERIC(38, x) or DECFLOAT(16 | 34); otherwise, DOUBLE PRECISION is returned.
The result of PERCENTILE_CONT is computed by linear interpolation between values after ordering them.
Using the percentile value (P) and the number of rows (N) in the aggregation group, you can compute the row number you are interested in after ordering the rows with respect to the sort specification.
This row number (RN) is computed according to the formula RN = 1 + P * (N - 1).
The final result of the aggregate function is computed by linear interpolation between the values from rows at row numbers CRN = CEILING(RN) and FRN = FLOOR(RN).
Analytic Example
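The RN, FRN and CRN computation can be traced with a small model outside SQL. The Python function below is a sketch under stated assumptions, not the engine's implementation: the name percentile_cont is hypothetical, only ascending order is handled, Python floats stand in for DOUBLE PRECISION (the DECFLOAT(34) case is not modelled), and the weights (CRN - RN) and (RN - FRN) are the usual linear-interpolation weights, which the description above implies but does not spell out.

```python
import math


def percentile_cont(values, p):
    """Hypothetical model of PERCENTILE_CONT over an ascending sort."""
    if not 0 <= p <= 1:
        raise ValueError("percentile must be between 0 and 1")

    # Nulls (None here) are ignored in the calculation.
    ordered = sorted(v for v in values if v is not None)
    n = len(ordered)
    if n == 0:
        return None  # assumption: an empty set yields no value

    rn = 1 + p * (n - 1)   # 1-based, possibly fractional row number
    frn = math.floor(rn)   # FRN = FLOOR(RN)
    crn = math.ceil(rn)    # CRN = CEILING(RN)

    if frn == crn:
        # RN falls exactly on a row, so no interpolation is needed.
        return float(ordered[frn - 1])

    low = ordered[frn - 1]
    high = ordered[crn - 1]
    # Assumed weighting: each neighbour counts in proportion to its closeness to RN.
    return (crn - rn) * low + (rn - frn) * high
```

For the values 10, 20, 30, 40 and P = 0.5, RN is 2.5, so the result is halfway between rows 2 and 3, that is 25.0. Unlike PERCENTILE_DISC, the result need not be a member of the set.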
An example of using both aggregate functions