Skip to content

Conversation

@FBruzzesi
Copy link
Member

What type of PR is this? (check all applicable)

  • πŸ’Ύ Refactor
  • ✨ Feature
  • πŸ› Bug Fix
  • πŸ”§ Optimization
  • πŸ“ Documentation
  • βœ… Test
  • 🐳 Other

Related issues

Checklist

  • Code follows style guide (ruff)
  • Tests added
  • Documented the changes

@FBruzzesi FBruzzesi added the enhancement New feature or request label Nov 18, 2025
@FBruzzesi FBruzzesi changed the title feat: Add {Expr, Series}.any_value feat: Add {Expr,Series}.any_value Nov 18, 2025
@FBruzzesi FBruzzesi marked this pull request as ready for review November 18, 2025 22:18
@MarcoGorelli
Copy link
Member

thanks!

some initial thoughts:

  • I think the function just needs to not make any guarantees about order, I don't think we need to intentionally add randomness, just taking first for eager backends should be enough (it's likely more performant anyway, right?)
  • i'm slightly concerned that sql ignores nulls in this function, but we're suggesting to not to. if polars were to add this method, would they ignore nulls or not? perhaps we could have a mandatory ignore_nulls keyword argument, similar to how we have it in any_horizontal?

@FBruzzesi
Copy link
Member Author

Thanks @MarcoGorelli

  • I think the function just needs to not make any guarantees about order, I don't think we need to intentionally add randomness, just taking first for eager backends should be enough (it's likely more performant anyway, right?)

Wouldn't using first actually make a lot of guarantees on order for eager backends?

  • i'm slightly concerned that sql ignores nulls in this function, but we're suggesting to not to. if polars were to add this method, would they ignore nulls or not? perhaps we could have a mandatory ignore_nulls keyword argument, similar to how we have it in any_horizontal?

Sure I will add such option πŸ™πŸΌ

@MarcoGorelli
Copy link
Member

Wouldn't using first actually make a lot of guarantees on order for eager backends?

Sure, but we don't guarantee it - an in, it happens to return the first value, but it's not guaranteed (and the name any_value strongly implies there's nothing guaranteed)

Polars does something similar in joins - in the eager case it may happen to return rows in the same order when it's more efficient to do so, but it's not guaranteed (and it's documented that it's not guaranteed)

# !NOTE: ibis arbitrary returns a random non-null value
# See: https://ibis-project.org/reference/expression-generic.html#ibis.expr.types.generic.Column.arbitrary
expr = cast("ir.Column", expr)
return expr.arbitrary() if ignore_nulls else expr.first(include_null=True)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expr.first(include_null=not ignore_nulls) is not behaving as expected 🫠

Comment on lines +113 to +118
if kwargs.get("ignore_nulls"):
msg = (
"`Expr.any_value(ignore_nulls=True)` is not supported in a `over` "
"context for pandas-like backend."
)
raise NotImplementedError(msg)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an issue if a group has all null values

Comment on lines +198 to +203
if last_node.kwargs.get("ignore_nulls"):
msg = (
"`Expr.any_value(ignore_nulls=True)` is not supported in a `group_by` "
"context for pandas-like backend"
)
raise NotImplementedError(msg)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an issue if a group has all null values

@MarcoGorelli
Copy link
Member

thanks for updating, will try to review next week

@FBruzzesi
Copy link
Member Author

FBruzzesi commented Dec 9, 2025

@MarcoGorelli just to make sure we align before going on a tangent. We briefly touched the possibility of explicitly marking this feature as unstable as it's outside of polars API. Next steps would be:

  • Simple warning in the main namespace docstrings
  • Code/runtime warning in the stable namespaces OR even raise an error in the stable namespaces

Let me know what your thoughts are :)


Update: Ok I pushed a proposal in e8d44f6

Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @FBruzzesi !

should we do the same (raise a warning in stable.v2) for hist?

@FBruzzesi FBruzzesi merged commit f6bdd89 into main Dec 12, 2025
35 of 36 checks passed
@FBruzzesi FBruzzesi deleted the feat/any_value branch December 12, 2025 13:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Enh]: Add Expr.any_value?

3 participants