Confusion about & when mixing cs.by_name and pl.col #19740

Closed
2 tasks done
etiennebacher opened this issue Nov 12, 2024 · 2 comments · Fixed by #19742
Labels: A-selectors (Area: column selectors), bug (Something isn't working), python (Related to Python Polars)

Comments

@etiennebacher
Contributor

Checks

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

import polars as pl
import polars.selectors as cs

df = pl.DataFrame({
    "a": [1],
    "b": [2]
})

df.select(cs.by_name("a") | pl.col("b"))
# shape: (1, 2)
# ┌─────┬─────┐
# │ a   ┆ b   │
# │ --- ┆ --- │
# │ i64 ┆ i64 │
# ╞═════╪═════╡
# │ 1   ┆ 2   │
# └─────┴─────┘

df.select(cs.by_name("a") & pl.col("b"))
# shape: (1, 2)
# ┌─────┬─────┐
# │ a   ┆ b   │
# │ --- ┆ --- │
# │ i64 ┆ i64 │
# ╞═════╪═════╡
# │ 1   ┆ 2   │
# └─────┴─────┘

Log output

No response

Issue description

I don't understand how & behaves when intersecting selectors and expressions.

In the example above, cs.by_name("a") | pl.col("b") returns both columns, which makes sense since each column is named either "a" or "b". My issue is that cs.by_name("a") & pl.col("b") also returns both columns, but IMO it shouldn't, since no column is named both "a" and "b" at the same time. This comes from this if condition:

if is_column(other):
    colname = other.meta.output_name()
    if self._attrs["name"] == "by_name" and (
        params := self._attrs["params"]
    ).get("require_all", True):
        return by_name(*params["*names"], colname)
    other = by_name(colname)
I understand there are some issues with selectors, e.g. #13757. Is this also an issue or am I misunderstanding?

Expected behavior

df.select(cs.by_name("a") & pl.col("b")) should return 0 cols.
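
To make the expected semantics concrete, here is a minimal sketch that models each selector as a plain Python set of column names. This is a toy model and not the Polars implementation: `select_names` and the variable names are hypothetical, introduced only for illustration. It shows that a true intersection of {"a"} and {"b"} is empty, whereas appending the column name to the by_name names (which is what the quoted branch appears to do) behaves like a union.

```python
# Toy model: a "selector" is just the set of column names it would match.
# This is NOT how Polars implements selectors; all names here are hypothetical.


def select_names(schema: list[str], wanted: set[str]) -> list[str]:
    """Return the columns of `schema` that appear in `wanted`, in schema order."""
    return [name for name in schema if name in wanted]


schema = ["a", "b"]
by_name_a = {"a"}  # stands in for cs.by_name("a")
col_b = {"b"}      # stands in for pl.col("b")

# `|` as set union: every column matched by either side.
union_cols = select_names(schema, by_name_a | col_b)

# `&` as set intersection: only columns matched by both sides.
intersection_cols = select_names(schema, by_name_a & col_b)

# Assumed effect of the quoted branch: the column name is appended to the
# by_name names, which is equivalent to a union rather than an intersection.
merged_cols = select_names(schema, {*by_name_a, "b"})

print(union_cols)         # ['a', 'b']
print(intersection_cols)  # []
print(merged_cols)        # ['a', 'b']
```

Under this model, the reported output of & matches `merged_cols`, while the expected output matches `intersection_cols`.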

Installed versions

--------Version info---------
Polars:              1.12.0
Index type:          UInt32
Platform:            Linux-6.8.0-47-generic-x86_64-with-glibc2.39
Python:              3.12.3 (main, Sep 11 2024, 14:17:37) [GCC 13.2.0]
LTS CPU:             False

----Optional dependencies----
adbc_driver_manager  <not installed>
altair               <not installed>
cloudpickle          <not installed>
connectorx           <not installed>
deltalake            <not installed>
fastexcel            <not installed>
fsspec               <not installed>
gevent               <not installed>
great_tables         <not installed>
matplotlib           <not installed>
nest_asyncio         <not installed>
numpy                2.1.2
openpyxl             <not installed>
pandas               <not installed>
pyarrow              <not installed>
pydantic             <not installed>
pyiceberg            <not installed>
sqlalchemy           <not installed>
torch                <not installed>
xlsx2csv             <not installed>
xlsxwriter           <not installed>
@etiennebacher etiennebacher added bug Something isn't working needs triage Awaiting prioritization by a maintainer python Related to Python Polars labels Nov 12, 2024
@orlp
Collaborator

orlp commented Nov 12, 2024

This seems like a bug to me.

@alexander-beedie
Collaborator

> This seems like a bug to me.

Yup, looks like it - I'll go fix this one.

@alexander-beedie alexander-beedie self-assigned this Nov 12, 2024
@alexander-beedie alexander-beedie added A-selectors Area: column selectors and removed needs triage Awaiting prioritization by a maintainer labels Nov 12, 2024