[BUG] Unable to load Anomaly Detection dataset #2308

Closed
haskarb opened this issue Nov 5, 2024 · 6 comments
Labels
anomaly detection Anomaly detection package bug Something isn't working datasets Datasets and data loaders

Comments

@haskarb
Contributor

haskarb commented Nov 5, 2024

Describe the bug

When I set split="train", I get the error below.

Steps/Code to reproduce the bug

from aeon.datasets import load_anomaly_detection
X_train, y_train = load_anomaly_detection(("SMD", "machine-1-1"), split="train")

Expected results

The dataset should load without errors.

Actual results

packages\aeon\datasets\_tsad_data_loaders.py:150, in load_anomaly_detection(name, split, extract_path, return_metadata)
    148 metadata = df_meta.loc[name]
    149 if split.lower() == "train":
--> 150     if metadata["train_path"] is None or np.isnan(metadata["train_path"]):
    151         raise ValueError(
    152             f"Dataset {name} does not have a training partition. Only "
    153             "`split='test'` is supported."
    154         )
    155     dataset_path = data_folder / metadata["train_path"]

TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
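For context, a minimal sketch of why the check on line 150 raises: np.isnan only accepts numeric input, so it fails as soon as the "train_path" entry is a string. The helper name is_missing_path and the example path are hypothetical and used only for illustration; this is not necessarily the change made in the actual fix. pandas.isna is one way to test for a missing value that also tolerates strings.

```python
import numpy as np
import pandas as pd


def is_missing_path(value):
    """Return True if a metadata path entry is None or NaN.

    Hypothetical helper for illustration; not part of aeon.
    """
    # pd.isna handles None, float NaN, and strings without raising.
    return value is None or pd.isna(value)


# Hypothetical path value; any non-numeric string behaves the same way.
example_path = "some_dataset.train.csv"

# np.isnan rejects string input with a TypeError, as in the traceback.
try:
    np.isnan(example_path)
    raised = False
except TypeError:
    raised = True

print(raised)
print(is_missing_path(example_path), is_missing_path(None), is_missing_path(np.nan))
```

With a check like this, a dataset that has a training partition (a string path) passes through, while a missing entry still leads to the ValueError about split='test'.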

Versions

 name         : aeon
 version      : 0.11.1
 description  : A toolkit for machine learning from time series

dependencies
 - deprecated >=1.2.13
 - numba >=0.55,<0.61.0
 - numpy >=1.21.0,<1.27.0
 - packaging >=20.0
 - pandas >=1.5.3,<2.1.0
 - scikit-learn >=1.0.0,<1.6.0
 - scipy >=1.9.0,<1.13.0
 - typing-extensions >=4.6.0
@haskarb haskarb added the bug Something isn't working label Nov 5, 2024
@SebastianSchmidl SebastianSchmidl added datasets Datasets and data loaders anomaly detection Anomaly detection package labels Nov 5, 2024
@SebastianSchmidl
Member

This should already be fixed by #2100 (currently in main but not yet released). Can you try to reproduce your issue on the current main branch?

@TonyBagnall
Contributor

Hi, I just ran this and can confirm it runs fine from main. Our release will hopefully be ready in a week or so, sorry.

@haskarb
Contributor Author

haskarb commented Nov 6, 2024

Yes, it works from main. Thanks! I will close the issue now.

@haskarb haskarb closed this as completed Nov 6, 2024
@haskarb
Contributor Author

haskarb commented Nov 25, 2024

from aeon.datasets import load_anomaly_detection
X_train, y_train, meta = load_anomaly_detection(("Exalthon", "10_2_1000000_67"), return_metadata=True)

Opening this again; I am facing a similar issue with the Exalthon dataset.

ValueError                                Traceback (most recent call last)
Cell In[8], line 3
      1 from aeon.datasets import load_anomaly_detection
----> 3 X_train, y_train, meta = load_anomaly_detection(("Exalthon", "10_2_1000000_67"), return_metadata=True)

File c:\Users\bhaskar\AppData\Local\pypoetry\Cache\virtualenvs\anomalydetectionmts-bSz-Xj33-py3.11\Lib\site-packages\aeon\datasets\_tsad_data_loaders.py:143, in load_anomaly_detection(name, split, extract_path, return_metadata)
    141 # Check if the dataset is part of the TimeEval archive
    142 if name not in tsad_datasets():
--> 143     return _load_custom(name, split, data_folder, return_metadata)
    145 # Load index
    146 df_meta = _load_indexfile()

File c:\Users\bhaskar\AppData\Local\pypoetry\Cache\virtualenvs\anomalydetectionmts-bSz-Xj33-py3.11\Lib\site-packages\aeon\datasets\_tsad_data_loaders.py:189, in _load_custom(name, split, path, return_metadata)
    180 def _load_custom(
    181     name: tuple[str, str],
    182     split: Literal["train", "test"],
   (...)
    186     tuple[np.ndarray, np.ndarray], tuple[np.ndarray, np.ndarray, dict[str, Any]]
    187 ]:
    188     if not path.is_file() or path.suffix != ".csv":
--> 189         raise ValueError(
    190             "When loading a custom dataset, the extract_path must point to a "
    191             f"TimeEval-formatted CSV file, but {path} is not a CSV-file."
    192         )
    193     X, y = load_from_timeeval_csv_file(path)
    194     if return_metadata:

ValueError: When loading a custom dataset, the extract_path must point to a TimeEval-formatted CSV file, but c:\Users\bhaskar\AppData\Local\pypoetry\Cache\virtualenvs\anomalydetectionmts-bSz-Xj33-py3.11\Lib\site-packages\aeon\datasets\local_data is not a CSV-file.

@haskarb haskarb reopened this Nov 25, 2024
@SebastianSchmidl
Member

Try with "Exathlon" instead of "Exalthon".
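As the traceback shows, a name that is not in tsad_datasets() falls through to the custom-dataset loader, so a misspelled collection name surfaces as a confusing CSV error. A minimal sketch of how one could catch such typos before calling the loader, using only the standard library: the function suggest_collection and the short list of names are hypothetical and for illustration only; in practice the known names could presumably be taken from the tuples returned by tsad_datasets().

```python
import difflib


def suggest_collection(name, known_collections):
    """Return known collection names that closely match a possibly misspelled one.

    Hypothetical helper for illustration; not part of aeon.
    """
    return difflib.get_close_matches(name, known_collections, n=3, cutoff=0.6)


# Illustrative subset of collection names, not the full archive index.
known = ["Exathlon", "SMD", "NAB"]

suggestions = suggest_collection("Exalthon", known)
print(suggestions)
```

If the requested name is not in the known list but a close match exists, printing the suggestion is usually enough to spot the typo.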

@haskarb
Contributor Author

haskarb commented Nov 26, 2024

Thanks!

@haskarb haskarb closed this as completed Nov 26, 2024