Note
Access to this page requires authorization. You can try signing in or changing directories.
Access to this page requires authorization. You can try changing directories.
Applies to:
Databricks SQL
Databricks Runtime
Returns the value of an expression associated with the smallest value of a second expression in a group. With the optional third argument, returns an array of up to limit values corresponding to the smallest values of the ordering expression. Semantics are the same as max_by aggregate function with opposite ordering.
Syntax
min_by(expr, ordExpr) [FILTER ( WHERE cond ) ]
min_by(expr, ordExpr, limit) [FILTER ( WHERE cond ) ]
This function can also be invoked as a window function using the OVER clause.
Arguments
- expr: The expression whose value is to be returned. Can be any data type.
- ordExpr: The expression used to determine the minimum value (the sorting key). Must be an orderable type.
- limit: (Optional) An
INTvalue representing the maximum number of values to return. Must be greater than 0 and less than or equal to 100,000. When provided, the function returns an array of up tolimitvalues. - cond: An optional boolean expression filtering the rows used for aggregation.
Returns
Without limit: the result type matches the type of expr. If multiple rows have the same minimum value of ordExpr, the result is non-deterministic.
With limit: an ARRAY of values whose element type matches the type of expr. The array contains up to limit elements. Values corresponding to NULL orderings are not included. If all ordering values are NULL, the result is NULL. If multiple rows share the same smallest values, the function is non-deterministic.
Note
For certain STRING collations, such as UTF8_LCASE, the result may be non-deterministic as well.
Examples
> SELECT min_by(x, y) FROM VALUES (('a', 10)), (('b', 50)), (('c', 20)) AS tab(x, y);
a
> SELECT min_by(x, y, 2) FROM VALUES (('a', 10)), (('b', 50)), (('c', 20)) AS tab(x, y);
[a, c]
> SELECT min_by(x, y COLLATE UTF8_LCASE) FROM VALUES (('a', 'X')), (('b', 'x')), (('c', 'Z')) AS tab(x, y);
a (or b)