Description
Modin version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest released version of Modin.
- I have confirmed this bug exists on the main branch of Modin. (In order to do this you can follow this guide.)
Reproducible Example
import modin.pandas as pd
pd.Series([1]).to_json() # '{"__reduced__":{"0":1}}'
Issue Description
The to_json method dispatches from base.py, and defaults to pandas before calling to_json on the resulting pandas object. There is no specialized implementation for Series, meaning that the internal __reduced__ column label is displayed in the output.
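The shape of the bad output can be reproduced with plain pandas, which may help when reasoning about the cause. The sketch below assumes (based on the output above, not on a reading of the dispatch code) that the Series reaches pandas as a single-column frame whose column is labeled __reduced__; the variable names are illustrative only.

```python
import pandas as pd

# What a pandas Series produces with the default orient.
series_json = pd.Series([1]).to_json()

# Assumption: the Modin Series is handed to pandas as a one-column frame
# labeled "__reduced__", so DataFrame.to_json adds a level for that column.
frame_json = pd.Series([1]).to_frame("__reduced__").to_json()

print(series_json)  # {"0":1}
print(frame_json)   # {"__reduced__":{"0":1}}
```

If that assumption holds, the extra level comes from serializing the data as a DataFrame rather than as a Series, not from to_json itself.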
This error may also be present in other I/O methods, but this is the only one I happened to check.
I'm not sure what the best fix is. As an ad-hoc mechanism we can have the series frontend override to_json, then parse the result and remove the __reduced__ level, or even just string trim the leading/trailing curly brace + __reduced__ key entry without parsing (though this may be brittle depending on indentation/formatting flags). A more robust fix would possibly require passing an additional flag through the I/O dispatch mechanisms so the query compiler is aware of whether the data comes from a Series or DataFrame.
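A minimal sketch of the parse-and-strip variant, using only the standard library. The helper name strip_reduced_level is hypothetical and not part of Modin. It only handles the default orient shown in the example above, and re-serializing with json.dumps would discard pandas formatting options such as indent or double_precision, so it illustrates the idea rather than proposing a patch.

```python
import json

REDUCED_LABEL = "__reduced__"  # label taken from the output in this report


def strip_reduced_level(json_text, label=REDUCED_LABEL):
    """Hypothetical helper: drop a sole top-level `label` key from JSON text.

    Text that does not have exactly that one top-level key is returned
    unchanged.
    """
    parsed = json.loads(json_text)
    if isinstance(parsed, dict) and list(parsed) == [label]:
        # Compact separators mimic the unindented pandas output.
        return json.dumps(parsed[label], separators=(",", ":"))
    return json_text


print(strip_reduced_level('{"__reduced__":{"0":1}}'))  # {"0":1}
```

Other orient values (for example "split" or "records") place the label elsewhere in the output, which is one reason the flag-based fix described above is likely the more robust option.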
Expected Behavior
pandas result of the same code:
>>> pd.Series([1]).modin.to_pandas().to_json()
'{"0":1}'
Installed Versions
INSTALLED VERSIONS
commit : 69f2751
python : 3.10.13.final.0
python-bits : 64
OS : Darwin
OS-release : 24.5.0
Version : Darwin Kernel Version 24.5.0: Tue Apr 22 19:54:25 PDT 2025; root:xnu-11417.121.6~2/RELEASE_ARM64_T6020
machine : arm64
processor : arm
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
Modin dependencies
modin : 0.33.0+15.g69f27510
ray : 2.34.0
dask : 2024.8.1
distributed : 2024.8.1
pandas dependencies
pandas : 2.2.2
numpy : 2.2.6
pytz : 2023.3.post1
dateutil : 2.8.2
setuptools : 68.0.0
pip : 23.3
Cython : None
pytest : 8.3.2
hypothesis : None
sphinx : 5.3.0
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 5.3.0
html5lib : None
pymysql : None
psycopg2 : 2.9.9
jinja2 : 3.1.4
IPython : 8.17.2
pandas_datareader : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : 4.12.2
bottleneck : None
dataframe-api-compat : None
fastparquet : 2024.5.0
fsspec : 2024.6.1
gcsfs : None
matplotlib : 3.9.2
numba : None
numexpr : 2.10.1
odfpy : None
openpyxl : 3.1.5
pandas_gbq : 0.23.1
pyarrow : 17.0.0
pyreadstat : None
python-calamine : None
pyxlsb : None
s3fs : 2024.6.1
scipy : 1.14.1
sqlalchemy : 2.0.32
tables : 3.10.1
tabulate : 0.9.0
xarray : 2024.7.0
xlrd : 2.0.1
zstandard : None
tzdata : 2023.3
qtpy : None
pyqt5 : None