Environment:
Dask version: 2024.4.4
Python version: 3.9.18
Operating System:
Install method (conda, pip, source): pip
christhorn2 changed the title from "Documentation Issue" to "Documentation Issue with train_test_split and blockwise" on Aug 15, 2024
Describe the issue:
The API documentation for dask-ml's train_test_split states that blockwise=False is supported for Dask Arrays:
"For Dask Arrays, set blockwise=False to shuffle data between blocks as well."
https://ml.dask.org/modules/generated/dask_ml.model_selection.train_test_split.html#dask_ml.model_selection.train_test_split
I think this is also the intent of the code, which delegates the job to ShuffleSplit:
dask-ml/dask_ml/model_selection/_split.py
Line 490 in 567cfd7
However, ShuffleSplit does not support blockwise=False:
dask-ml/dask_ml/model_selection/_split.py
Line 194 in 567cfd7
Minimal Complete Verifiable Example:
from dask_ml.model_selection import train_test_split
import dask.array as da
x = da.arange(8, chunks=4)
train_test_split(x, blockwise=False)
....
NotImplementedError: ShuffleSplit with blockwise=False has not been implemented yet.
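Until ShuffleSplit supports blockwise=False, a split that draws rows from all blocks can be approximated by hand. The sketch below is not dask-ml's implementation: `global_shuffle_indices` is a hypothetical helper that uses only NumPy to pick train and test row indices from a single permutation of the whole array, ignoring chunk boundaries.

```python
import numpy as np


def global_shuffle_indices(n_samples, test_size=0.25, seed=0):
    """Return (train_idx, test_idx) taken from one permutation of all rows.

    Hypothetical helper for illustration; not part of dask-ml.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_samples)  # one shuffle over every row, not per block
    n_test = int(np.ceil(test_size * n_samples))
    # Sort each index set so a chunked array is read in order;
    # drop the sort if the row order itself must also be random.
    test_idx = np.sort(perm[:n_test])
    train_idx = np.sort(perm[n_test:])
    return train_idx, test_idx


train_idx, test_idx = global_shuffle_indices(8, test_size=0.25, seed=0)
```

With x = da.arange(8, chunks=4) as in the example above, x[train_idx] and x[test_idx] should then select the two subsets, since Dask arrays accept a one-dimensional integer index. Whether this performs well on large arrays is untested here; indexing across many chunks can be expensive.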