Repository to train a sleep staging classifier. We will use:
- An open source sleep staging dataset with ground truth labels (see dataset)
- yasa to extract features
- lightgbm to train a classifier
The dataset was obtained from OSF. It contains mouse EEG/EMG recordings (sampling rate: 512 Hz) and sleep stage labels (epoch length: 2.5 sec).
Training was performed using extracted features from 24h recordings.
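As a quick sanity check on the numbers above, the sketch below derives how many samples fall into one labelled epoch and how many epochs a 24 h recording contains, using only the stated sampling rate and epoch length. The function names are illustrative and are not part of this repository.

```python
SAMPLING_RATE_HZ = 512   # from the dataset description
EPOCH_LENGTH_SEC = 2.5   # from the dataset description


def samples_per_epoch(sf_hz: float, epoch_sec: float) -> int:
    """Number of EEG/EMG samples covered by one sleep stage label."""
    n = sf_hz * epoch_sec
    if n != int(n):
        raise ValueError("epoch length is not a whole number of samples")
    return int(n)


def n_epochs(duration_sec: float, epoch_sec: float) -> int:
    """Number of complete epochs that fit in a recording of the given duration."""
    return int(duration_sec // epoch_sec)


print(samples_per_epoch(SAMPLING_RATE_HZ, EPOCH_LENGTH_SEC))  # 1280
print(n_epochs(24 * 3600, EPOCH_LENGTH_SEC))                   # 34560
```

If the number of labels in a recording does not match the signal length divided by 1280, the signal and labels are probably misaligned and worth inspecting before feature extraction.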
The dataset can be downloaded as a zip archive using this link:
https://files.osf.io/v1/resources/py5eb/providers/osfstorage/?zip=
or downloaded manually from OSF. See Dataset Structure for details of the expected file structure.
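For a scripted download, a minimal sketch using only the Python standard library might look like the following. The local paths (data/osf_dataset.zip, data/) and the helper names are assumptions for illustration; the layout inside the archive is assumed to match the structure described in this README.

```python
import urllib.request
import zipfile
from pathlib import Path

DATASET_ZIP_URL = "https://files.osf.io/v1/resources/py5eb/providers/osfstorage/?zip="


def download_archive(url: str, zip_path: Path) -> Path:
    """Download the archive at `url` to `zip_path` (skipped if the file already exists)."""
    zip_path.parent.mkdir(parents=True, exist_ok=True)
    if not zip_path.exists():
        urllib.request.urlretrieve(url, zip_path)
    return zip_path


def extract_archive(zip_path: Path, dest_dir: Path) -> list[str]:
    """Unpack `zip_path` into `dest_dir` and return the names of its members."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        return archive.namelist()


# Example usage (hypothetical local paths; requires network access):
# zip_path = download_archive(DATASET_ZIP_URL, Path("data/osf_dataset.zip"))
# extract_archive(zip_path, Path("data"))
```

Keeping the download and extraction steps separate means a partially extracted dataset can be re-extracted without fetching the archive again.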
Datasets have the following structure:
.
├── Mouse01
│   ├── Day1_dark_cycle
│   │   ├── EEG.mat
│   │   ├── EMG.mat
│   │   └── labels.mat
│   ├── Day1_light_cycle
│   │   ├── EEG.mat
│   │   ├── EMG.mat
│   │   └── labels.mat
│   ├── Day2_dark_cycle
│   │   ├── EEG.mat
│   │   ├── EMG.mat
│   │   └── labels.mat
│   └── Day2_light_cycle
│       ├── EEG.mat
│       ├── EMG.mat
│       └── labels.mat
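Before running the notebooks, it can help to confirm that a local copy follows this layout. The sketch below walks a dataset root and collects every mouse/session folder that contains all three expected files. The function name and the returned dictionary format are illustrative assumptions, not the interface used by the notebooks.

```python
from pathlib import Path

EXPECTED_FILES = ("EEG.mat", "EMG.mat", "labels.mat")


def find_recordings(root: Path) -> list[dict]:
    """Return one entry per <mouse>/<session> folder that holds all expected files."""
    recordings = []
    for session_dir in sorted(p for p in root.glob("*/*") if p.is_dir()):
        missing = [name for name in EXPECTED_FILES if not (session_dir / name).is_file()]
        if missing:
            # Report incomplete sessions instead of failing, so one bad folder
            # does not hide the rest of the dataset.
            print(f"Skipping {session_dir}: missing {', '.join(missing)}")
            continue
        recordings.append({
            "mouse": session_dir.parent.name,
            "session": session_dir.name,
            "files": {name: session_dir / name for name in EXPECTED_FILES},
        })
    return recordings
```

The .mat files themselves can typically be read with scipy.io.loadmat. The variable names stored inside each file are not documented here, so inspect the keys of the loaded dictionary first.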
01-extract_features.qmd was run to extract features. Note that it uses a local version of SleepStaging() (from staging import SleepStaging) that differs from the implementation in yasa. The local version is included for reproducibility; we plan to contribute it to yasa itself, after which the local copy will no longer be needed.
02-train.qmd was run to train the classifier on features from the 24-hour recordings. The outputs of this notebook are saved into the output/ directory.
Model output will be stored as:
output/
├── classifiers/
│   ├── eeg_full/
│   │   ├── model.joblib
│   │   └── feature_importances.csv
│   ├── eeg_no_kurt/
│   │   ├── model.joblib
│   │   └── feature_importances.csv
│   ├── eeg_no_std/
│   │   ├── model.joblib
│   │   └── feature_importances.csv
│   │
│   │   ...
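Given this layout, the trained variants can be discovered and loaded with a few lines of Python. This is a sketch under two assumptions: the helper names are hypothetical, and the model.joblib files are assumed to have been written with joblib.dump, as the file extension suggests.

```python
from pathlib import Path


def list_classifiers(output_dir: Path) -> dict[str, Path]:
    """Map each classifier variant (e.g. 'eeg_full') to its model.joblib path."""
    classifiers_dir = output_dir / "classifiers"
    return {
        model_path.parent.name: model_path
        for model_path in sorted(classifiers_dir.glob("*/model.joblib"))
    }


def load_classifier(model_path: Path):
    """Load a saved model; assumes it was written with joblib.dump."""
    import joblib  # third-party; imported lazily so listing works without it

    return joblib.load(model_path)


# Example usage (assumes the notebooks have already populated output/):
# for name, path in list_classifiers(Path("output")).items():
#     model = load_classifier(path)
#     print(name, type(model).__name__)
```

Because joblib.load unpickles arbitrary Python objects, only load model files from a source you trust.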
03-evaluate.qmd was run to evaluate the classifiers and compute accuracy and Cohen's kappa. The outputs of this notebook are saved into the output/ directory.
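To make the two metrics concrete, the sketch below computes them from scratch for a pair of label sequences. It is not the notebook's code, which may well rely on scikit-learn's accuracy_score and cohen_kappa_score instead. The stage labels "W", "N" and "R" are made up for the example.

```python
from collections import Counter


def accuracy(y_true, y_pred) -> float:
    """Fraction of epochs where the predicted stage matches the ground truth."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred) -> float:
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e)."""
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: probability that both raters pick the same label
    # if each drew independently from its own label distribution.
    p_expected = sum(true_counts[label] * pred_counts[label] for label in true_counts) / (n * n)
    if p_expected == 1.0:
        return float("nan")  # kappa is undefined when only one label ever occurs
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical labels for six epochs
y_true = ["W", "W", "N", "N", "R", "R"]
y_pred = ["W", "W", "N", "R", "R", "R"]
print(round(accuracy(y_true, y_pred), 3))     # 0.833
print(round(cohen_kappa(y_true, y_pred), 3))  # 0.75
```

Kappa is usually reported alongside accuracy for sleep staging because stage frequencies are imbalanced, so a classifier can reach high accuracy largely by favouring the most common stage.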
This is a preliminary release; please file an issue to report problems or request additional functionality.