Description
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
This issue is fixed in latest main (3.0.0)
Reproducible Example
import pandas as pd
import numpy as np
df = pd.DataFrame({
    "id": ["A", "B"],
    "arr": [np.array([1.0, 2.0]), np.array([3.0, 4.0])],  # numpy arrays
})
df["sparse"] = pd.arrays.SparseArray([1, 1], fill_value=0)
row = df[df["id"] == "A"].iloc[0]  # ValueError
Issue Description
When a DataFrame has both:
- A column containing numpy arrays (object dtype)
- A SparseArray column
Calling .iloc[0] raises the following exception.
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
Full traceback:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../pandas/core/indexing.py", line 1191, in __getitem__
    return self._getitem_axis(maybe_callable, axis=axis)
  File ".../pandas/core/indexing.py", line 1754, in _getitem_axis
    return self.obj._ixs(key, axis=axis)
  File ".../pandas/core/frame.py", line 3996, in _ixs
    new_mgr = self._mgr.fast_xs(i)
  File ".../pandas/core/internals/managers.py", line 1006, in fast_xs
    result = cls._from_sequence(result, dtype=dtype)
  File ".../pandas/core/arrays/sparse/array.py", line 590, in _from_sequence
    return cls(scalars, dtype=dtype)
  File ".../pandas/core/arrays/sparse/array.py", line 475, in __init__
    sparse_values, sparse_index, fill_value = _make_sparse(
  File ".../pandas/core/arrays/sparse/array.py", line 1888, in _make_sparse
    mask = splib.make_mask_object_ndarray(arr, fill_value)
  File "sparse.pyx", line 729, in pandas._libs.sparse.make_mask_object_ndarray
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
AFAICT this only occurs when there is both a numpy column AND a sparse column.
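My guess at the root cause, based only on the traceback: the row is packed into a sparse array with object subtype, and make_mask_object_ndarray then compares each cell against the fill value. I have not read the Cython source, so treat that as an assumption. The numpy-only sketch below reproduces the same error message with one cell of the "arr" column and the sparse fill value of 0; the variable names are mine, not pandas internals.

```python
import numpy as np

# One cell of the object-dtype "arr" column and the sparse column's fill value.
element = np.array([1.0, 2.0])
fill_value = 0

# Comparing an ndarray with a scalar is elementwise, so the result is a
# 2-element boolean array rather than a single bool.
comparison = element != fill_value

# Assumption: the mask-building code needs a single truth value per cell.
# Coercing a multi-element array to bool is what raises the ValueError.
try:
    bool(comparison)
    message = None
except ValueError as exc:
    message = str(exc)

print(message)
```

If that is what happens, any object column whose cells are multi-element arrays would trigger it as soon as the row is interleaved with a sparse column.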
It seems this is fixed on the main branch (3.0.0), but it looks like there were a lot of changes that could have impacted this. Feel free to close this issue if backporting a fix in 2.X is not viable.
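For anyone stuck on 2.X in the meantime, a possible workaround is to densify the sparse column before pulling out the row, so the row is no longer built as a sparse array. This is only a sketch under the assumption that the sparse column is the trigger; it uses the public Series.sparse.to_dense() accessor and a copy named dense that I introduced for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "id": ["A", "B"],
    "arr": [np.array([1.0, 2.0]), np.array([3.0, 4.0])],  # numpy arrays
})
df["sparse"] = pd.arrays.SparseArray([1, 1], fill_value=0)

# Work on a copy so the original frame keeps its sparse column.
dense = df.copy()
# Convert the sparse column to a regular (dense) column.
dense["sparse"] = dense["sparse"].sparse.to_dense()

# Row access now goes through plain object/int columns.
row = dense[dense["id"] == "A"].iloc[0]
print(row)
```

The cost is the memory of the dense column, so this is probably only reasonable when the sparse column is small or when just a few columns need densifying.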
Expected Behavior
iloc[0] should return the first row, i.e.:
id A
arr [1.0, 2.0]
sparse 1
Name: 0, dtype: object
Installed Versions
INSTALLED VERSIONS
commit : 0691c5c
python : 3.12.6
python-bits : 64
OS : Darwin
OS-release : 25.1.0
Version : Darwin Kernel Version 25.1.0: Mon Oct 20 19:26:04 PDT 2025; root:xnu-12377.41.6~2/RELEASE_ARM64_T8122
machine : arm64
processor : arm
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 2.2.3
numpy : 1.26.4
pytz : 2025.2
dateutil : 2.9.0.post0
pip : None
Cython : None
sphinx : None
IPython : 9.5.0
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : 4.13.5
blosc : None
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : 2024.12.0
html5lib : None
hypothesis : None
gcsfs : None
jinja2 : 3.1.6
lxml.etree : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
psycopg2 : 2.9.10
pymysql : None
pyarrow : 18.1.0
pyreadstat : None
pytest : 9.0.2
python-calamine : None
pyxlsb : None
s3fs : 2024.12.0
scipy : None
sqlalchemy : 2.0.43
tables : None
tabulate : None
xarray : None
xlrd : None
xlsxwriter : None
zstandard : None
tzdata : 2025.2
qtpy : None
pyqt5 : None