Commit

wrap prepared_ds_path in str() to avoid TypeError in fsspec package (#1548)

* wrap prepared_ds_path in str() to avoid TypeError in fsspec package

`fsspec` calls `if "::" in path` on `prepared_ds_path`, which will throw an error if it is a `PosixPath` object.
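
The failure mode described above can be reproduced with the standard library alone. The following is a minimal sketch, not `fsspec`'s actual code: `contains_protocol_chain` is a hypothetical stand-in for the `"::" in path` check, and the directory name is made up for illustration.

```python
from pathlib import Path


def contains_protocol_chain(path):
    """Hypothetical stand-in for the `"::" in path` membership check."""
    return "::" in path


# Illustrative path only; the real value comes from the axolotl config.
prepared_ds_path = Path("last_run_prepared") / "abc123"

try:
    contains_protocol_chain(prepared_ds_path)
    raised = False
except TypeError:
    # Path objects define neither __contains__ nor __iter__,
    # so the `in` operator cannot be applied to them.
    raised = True

# Converting to str first makes the membership test well-defined.
as_str_result = contains_protocol_chain(str(prepared_ds_path))
```

Wrapping the path in `str()` at the call site, as this commit does, keeps `prepared_ds_path` a `Path` everywhere else in the code while handing the library a plain string.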

* update test too

---------

Co-authored-by: Wing Lian <[email protected]>
FrankRuis and winglian authored Apr 21, 2024
1 parent 7d1d22f commit 7477a53
Showing 2 changed files with 2 additions and 2 deletions.
2 changes: 1 addition & 1 deletion src/axolotl/utils/data/sft.py
@@ -421,7 +421,7 @@ def for_d_in_datasets(dataset_configs):

         if cfg.local_rank == 0:
             LOG.info(f"Saving merged prepared dataset to disk... {prepared_ds_path}")
-            dataset.save_to_disk(prepared_ds_path)
+            dataset.save_to_disk(str(prepared_ds_path))
             if cfg.push_dataset_to_hub:
                 LOG.info(
                     f"Saving merged prepared dataset with push_to_hub... {cfg.push_dataset_to_hub}/{ds_hash}"
2 changes: 1 addition & 1 deletion tests/test_datasets.py
@@ -110,7 +110,7 @@ def test_load_from_save_to_disk(self):
         """Usual use case. Verify datasets saved via `save_to_disk` can be loaded."""
         with tempfile.TemporaryDirectory() as tmp_dir:
             tmp_ds_name = Path(tmp_dir) / "tmp_dataset"
-            self.dataset.save_to_disk(tmp_ds_name)
+            self.dataset.save_to_disk(str(tmp_ds_name))

             prepared_path = Path(tmp_dir) / "prepared"
             cfg = DictDefault(