Clustering and Recognition of Spatiotemporal Features through Interpretable Embedding of Sequence to Sequence Recurrent Neural Networks
This repository contains the code for an RNN-based sequence-to-sequence (Seq2Seq) model for unsupervised action recognition, tested on the CMU Mocap dataset.
We show that a Seq2Seq model can learn a temporal embedding automatically. Given a skeleton-based dataset, we train the network on a regression (motion prediction) task, and at the same time the network learns an embedding space that can be used to separate different types of actions. To demonstrate our method, we first manually concatenate multiple different actions and use the Seq2Seq model to learn the representation. During testing, we perform motion prediction and use the internal states of the recurrent neural network for clustering. We only provide the total number of possible actions occurring in the whole sequence and do not provide any labels during training or testing.
{% include youtubePlayer.html id="X9Oz81o55Gs" %}
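As a rough sketch of the clustering step (the function and variable names here are hypothetical, not the repository's API): once the per-frame internal states are collected during motion prediction, they can be grouped with k-means from scikit-learn, using the known number of actions as the number of clusters.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_hidden_states(hidden_states, num_actions):
    """Cluster per-frame RNN hidden states into action groups.

    hidden_states: array of shape (num_frames, state_dim), e.g. the recurrent
                   cell states collected while running motion prediction.
    num_actions:   the only supervision we provide -- the number of distinct
                   actions expected in the whole sequence.
    Returns one integer cluster label per frame.
    """
    kmeans = KMeans(n_clusters=num_actions, n_init=10, random_state=0)
    return kmeans.fit_predict(np.asarray(hidden_states))

# Toy example: random vectors stand in for real RNN states.
states = np.random.randn(600, 128)
labels = cluster_hidden_states(states, num_actions=3)
print(labels.shape)  # (600,)
```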
We run experiments on two cases: a concatenated sequence (randomly selected from 8 unique actions) and a continuous sequence (subject 86).
For case one, we follow the work of Chen Li and use the CMU Mocap dataset for human motion prediction. The data are preprocessed into the exponential-map format. We provide the data in the data/cmu_mocap/train folder, or you can download them from their GitHub page. The data contain 8 unique actions (walking, running, jumping, basketball, soccer, washwindow, basketball signal, and directing traffic). We manually concatenate multiple actions to demonstrate action recognition. To avoid bias in selecting data, we only specify the number of possible actions (repeated or non-repeated) and then randomly select the sequences and their order; each unique action type can also be repeated. You can also specify the actions yourself; for details, please check the arguments below.
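A minimal sketch of how such a concatenated test sequence could be assembled (the `load_sequence` loader is a hypothetical stand-in for the repository's own data pipeline): pick action names at random, allowing repeats, and stack the corresponding clips along the time axis.

```python
import random
import numpy as np

ACTIONS = ['walking', 'running', 'jumping', 'soccer', 'basketball',
           'directing_traffic', 'washwindow', 'basketball_signal']

def build_concatenated_sequence(load_sequence, total_num_actions, seed=0):
    """Concatenate randomly chosen action clips into one long sequence.

    load_sequence: callable mapping an action name to an array of shape
                   (num_frames, feature_dim) in exponential-map format
                   (hypothetical; stands in for the repo's own loader).
    """
    rng = random.Random(seed)
    chosen = [rng.choice(ACTIONS) for _ in range(total_num_actions)]  # repeats allowed
    clips = [load_sequence(name) for name in chosen]
    return np.concatenate(clips, axis=0), chosen
```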
For case two, we use the continuous sequences (subject 86) of the CMU Mocap dataset. The data are also preprocessed into the exponential-map format, and you can find them in the data/cmu_mocap/subject_86 folder. Each sequence contains multiple actions, which were manually annotated by Jernej Barbic. You can find the details in the segmentation_ground_truth folder.
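Because the cluster labels are unsupervised, one way to compare them against the subject 86 annotations (assuming both are expressed as per-frame labels; the numbers below are toy values, not results) is a permutation-invariant score such as the adjusted Rand index from scikit-learn:

```python
import numpy as np
from sklearn.metrics import adjusted_rand_score

# Toy per-frame labels: ground-truth segment ids derived from the
# segmentation_ground_truth annotations vs. cluster ids from the model.
gt_labels = np.repeat([0, 1, 2], [300, 250, 200])    # 750 annotated frames
pred_labels = np.repeat([2, 0, 1], [290, 265, 195])  # clustering output (toy values)

# The adjusted Rand index is invariant to the arbitrary numbering of clusters,
# which is what we need when comparing unsupervised labels to annotations.
print("ARI:", adjusted_rand_score(gt_labels, pred_labels))
```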
Dependencies:
- Tensorflow 1.13
- Python 3
- scikit-learn 0.21.2
Case one: To train a new model from scratch, run
```
python src/main.py --dataset cmu --actions walking,running,jumping
```
Case two: We provide saved models in the demo folder (download: Google Drive). You can quickly check the results by running
```
python src/subject_86.py --demo True --num_trial 01
```
To train a new model from scratch, run
```
python src/subject_86.py --demo False --num_trial 01
```
Command-line arguments:
- `--dataset`: a string that is either `cmu` or `h3.6m`.
- `--num_actions`: the number of unique actions specified for the tested sequence (2 to 8 for CMU, 2 to 15 for H3.6M).
- `--total_num_actions`: integer, total number of actions, default is `3`.
- `--actions`: a string to define your own actions, default is `random`. Instead of choosing actions randomly, you can specify a comma-separated list of the action names you want to show in the testing video. For the CMU dataset, you can choose any action from `['walking', 'running', 'jumping', 'soccer', 'basketball', 'directing_traffic', 'washwindow', 'basketball_signal']`.
- `--learning_rate`: float, default is `0.001`.
- `--batch_size`: integer, the size of the training batch, default is `16`.
- `--source_seq_len`: integer, the sequence length for the encoder, default is `25`.
- `--target_seq_len`: integer, the sequence length for the decoder, default is `25`.
- `--iterations`: integer, the number of training iterations, default is `1e4`.
- `--train_dir`: directory to save the model.
- `--gen_video`: boolean, whether to generate a video, default is `False`.
- `--video_dir`: directory to save the generated video (only used if the `--gen_video` flag is `True`).
- `--load`: integer, the number of the previous checkpoint you want to load.
Arguments for case two (`src/subject_86.py`):
- `--demo`: boolean, default is `True`. Indicates whether to use the saved models.
- `--num_trial`: string, default is `01`. The sequence number you want to check, ranging from `01` to `14`.
- `--train_dir`: directory to save your new model.
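As an illustrative example (this flag combination and the output paths are hypothetical; adjust the values to your setup), a case-one training run that also writes videos might look like:

```
python src/main.py --dataset cmu --total_num_actions 4 --actions random --iterations 10000 --gen_video True --video_dir ./videos --train_dir ./checkpoints/run_01
```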