This repository contains the full workflow developed for the CMI 2025 Human Motion Classification competition. The goal of this challenge is to accurately identify human motion types (e.g., walking, sitting, running, standing) using data collected from IMU (Inertial Measurement Unit) sensors such as accelerometers, gyroscopes, and orientation sensors.
This project demonstrates end-to-end data processing, feature extraction, model training, and validation pipelines with a strong emphasis on robustness, interpretability, and efficiency.
The dataset consists of multivariate time-series signals captured from IMU sensors. Each sequence corresponds to one labeled human activity.
Example sensor streams:
- `acc_x, acc_y, acc_z` → Accelerometer axes (m/s²)
- `gyro_x, gyro_y, gyro_z` → Gyroscope angular velocity (rad/s)
- `rot_x, rot_y, rot_z, rot_w` → Orientation quaternion
- `time` → Timestamp or frame index
Target variable: `behavior` — the motion category (18 classes)
The data is typically split into train.csv, test.csv, and a metadata file such as subjects.csv or sequence_info.csv.
- Sensor alignment and missing-value imputation
- Outlier clipping and normalization (per sensor group)
- Sequence-level standardization and window segmentation
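A minimal numpy sketch of the standardization and window-segmentation steps above (the window length, stride, and channel count here are illustrative, not the pipeline's actual settings):

```python
import numpy as np

def standardize(x):
    """Per-channel z-score normalization of a (T, C) sequence."""
    mu = x.mean(axis=0, keepdims=True)
    sd = x.std(axis=0, keepdims=True) + 1e-8  # avoid division by zero
    return (x - mu) / sd

def segment_windows(x, win=128, step=64):
    """Slice a (T, C) sequence into overlapping (win, C) windows."""
    starts = range(0, len(x) - win + 1, step)
    return np.stack([x[s:s + win] for s in starts])

rng = np.random.default_rng(42)
seq = rng.normal(size=(500, 6))          # e.g. 3 accel + 3 gyro channels
windows = segment_windows(standardize(seq))
print(windows.shape)                     # (n_windows, 128, 6)
```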
- Statistical features (mean, std, skew, kurtosis, IQR)
- Frequency-domain analysis via FFT and wavelets
- Quaternion-based angular distance and velocity features
- Inter-sensor fusion (e.g., relative acceleration, gravity-compensated motion)
- Aggregation by `sequence_id` or `subject_id`
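The feature groups above can be combined per window; a small sketch mixing statistical and frequency-domain features (the feature choice and layout are illustrative):

```python
import numpy as np

def stat_features(w):
    """Per-channel summary features for one (T, C) window."""
    q75, q25 = np.percentile(w, [75, 25], axis=0)
    feats = {
        "mean": w.mean(axis=0),
        "std": w.std(axis=0),
        "iqr": q75 - q25,
    }
    # Frequency domain: dominant FFT bin per channel (skip the DC component)
    spectrum = np.abs(np.fft.rfft(w, axis=0))
    feats["dom_freq_bin"] = spectrum[1:].argmax(axis=0) + 1
    return np.concatenate(list(feats.values()))

rng = np.random.default_rng(0)
w = rng.normal(size=(128, 6))            # one window, 6 sensor channels
print(stat_features(w).shape)            # 4 features x 6 channels -> (24,)
```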
A combination of traditional and deep learning models was explored:
- Baseline: RandomForest, XGBoost, LightGBM
- Deep Models: LSTM / GRU, 1D CNN, and InceptionTime-style architectures
- Hybrid: Temporal Convolutional Networks (TCN) + Attention pooling
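Attention pooling, the final stage of the hybrid model, collapses a variable-length sequence of hidden states into one fixed-size vector via a softmax weighting over time steps. A minimal numpy sketch (in the real model the scoring vector is learned jointly with the network; here it is random for illustration):

```python
import numpy as np

def attention_pool(h, w):
    """Pool a (T, d) sequence of hidden states into a (d,) summary vector."""
    scores = h @ w                       # (T,) relevance score per time step
    scores -= scores.max()               # numerical stability for softmax
    alpha = np.exp(scores) / np.exp(scores).sum()  # softmax over time
    return alpha @ h                     # weighted average of hidden states

rng = np.random.default_rng(0)
h = rng.normal(size=(50, 8))             # 50 time steps, hidden dim 8
pooled = attention_pool(h, rng.normal(size=8))
print(pooled.shape)                      # (8,)
```

Unlike plain mean pooling, this lets the classifier focus on the time steps most indicative of the motion class.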
Training employed stratified group k-fold cross-validation to prevent data leakage between subjects.
- Primary metric: Macro F1-score
- Validation: 5-fold cross-validation
- Visualization: confusion matrices, per-class accuracy, t-SNE embeddings
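Macro F1 averages the per-class F1 scores with equal weight, so rare motion classes count as much as common ones. A toy scikit-learn example (labels are illustrative):

```python
from sklearn.metrics import f1_score

y_true = [0, 1, 2, 2, 1, 0, 2, 1]
y_pred = [0, 1, 2, 0, 1, 0, 2, 2]

# Mean of per-class F1 scores, unweighted by class frequency
macro_f1 = f1_score(y_true, y_pred, average="macro")
print(round(macro_f1, 3))  # 0.756
```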
| Model | Cross-Validation F1 | Public LB | Notes |
|---|---|---|---|
| XGBoost baseline | 0.82 | 0.80 | fast, interpretable |
| TCN + Attention | 0.91 | 0.90 | best generalization |
| Ensemble (XGB + TCN) | 0.92 | 0.91 | final submission |
- Python 3.10+
- NumPy, Pandas, Polars
- Scikit-learn, XGBoost, LightGBM
- PyTorch, PyTorch Lightning
- Matplotlib, Seaborn, Plotly
Clone the repository and open the notebook:
```bash
git clone https://github.com/massoudsh/FinDetect.git
cd FinDetect/CMI_2025_IMU_Motion_Classification
jupyter notebook Kaggle_CMI_2025.ipynb
```

Or run directly from terminal (if using VSCode / JupyterLab):

```bash
jupyter lab
```

Dependencies can be installed with:

```bash
pip install -r requirements.txt
```

- Add real-time inference pipeline (IMU sensor streaming)
- Deploy trained model as a lightweight edge classifier (ONNX / CoreML)
- Integrate interpretability (SHAP for temporal data)
- Compare transformer-based temporal encoders (e.g., TimesNet, TST)
Massoud Shemirani
FinTech | AI Developer | ML Researcher
📍 GitHub • Kaggle • LinkedIn
© 2025 — Developed as part of the Kaggle CMI 2025 competition.