This repo contains the code for the paper: Fingerspelling Detection in American Sign Language [arXiv Preprint].
- PyTorch 1.1.0
- Warp-CTC
- youtube-dl
- FFmpeg
Run ./setup.sh to set up the environment. youtube-dl and FFmpeg are only required for data preparation.
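Before starting data preparation, it can help to confirm that the required command-line tools are on PATH. The following is a minimal sketch, not part of this repo: the helper name `missing_tools` is hypothetical, and the binary names `youtube-dl` and `ffmpeg` are assumptions about how the tools are installed on your system.

```shell
# Hypothetical helper (not part of this repo): print each named tool
# that cannot be found on PATH, one per line. Prints nothing if all exist.
missing_tools() (
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
)

# Example usage (binary names are assumptions):
#   missing_tools youtube-dl ffmpeg
```

An empty result means every listed tool was found.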
- Go to the src directory: cd src/
- Download the csv files of ChicagoFSWild/ChicagoFSWild+. Use preproc/pipeline.sh to set up the dataset. For example, to set up ChicagoFSWild in the folder data/fswild, put the csv file ChicagoFSwild.csv in data/fswild and run the following command:
for step in {1..6};do ./preproc/pipeline.sh -d ./data/fswild/ -t ChicagoFSWild -s $step;done
This generates the subfolder data/fswild/loader, which training and evaluation use:
data/fswild/loader
|-- dev.json
|-- test.json
|-- train.json
`-- video
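To confirm that the loader folder has the layout shown above, a small check along these lines may be useful. This is a sketch under assumptions: `check_loader_dir` is a hypothetical helper, not part of this repo, and it only tests that the four entries exist, not that their contents are valid.

```shell
# Hypothetical helper (not part of this repo): check that a loader folder
# contains the four entries shown in the tree above. Returns non-zero and
# reports each missing entry on stderr if anything is absent.
check_loader_dir() (
  dir=$1
  status=0
  for entry in dev.json test.json train.json video; do
    if [ ! -e "$dir/$entry" ]; then
      echo "missing: $dir/$entry" >&2
      status=1
    fi
  done
  exit "$status"
)

# Example usage, from the src directory:
#   check_loader_dir ./data/fswild/loader
```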
More concretely, the script does the following: (1) downloading videos from YouTube, (2) creating csv files for the downloaded videos, (3) resizing, (4) extracting optical flow, (5) generating label files, and (6) splitting videos for data loading. In total, these steps take ~1 minute per 1-minute video clip on a single 12-core CPU, with most of the time spent in steps 1, 3, 4, and 6. Scripts for parallelizing these steps on Slurm can be found in scripts/slurm_fswild.sh (for ChicagoFSWild) and scripts/slurm_fswildplus.sh (for ChicagoFSWild+).
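A plain `for` loop over the six steps keeps going by default even if one step exits with an error, and later steps depend on earlier ones. The following sketch runs the steps in order and stops at the first failure. It is an illustration only: `run_steps` and the `PIPELINE` override variable are hypothetical names, not part of this repo, and the sketch assumes that preproc/pipeline.sh returns a non-zero exit status when a step fails.

```shell
# Hypothetical wrapper (not part of this repo): run steps 1..6 in order and
# stop at the first failing step. PIPELINE may be set to override the script
# path; the default is the path used in the setup command above.
run_steps() (
  data_dir=$1
  dataset=$2
  pipeline=${PIPELINE:-./preproc/pipeline.sh}
  for step in 1 2 3 4 5 6; do
    echo "step $step"
    if ! "$pipeline" -d "$data_dir" -t "$dataset" -s "$step"; then
      echo "step $step failed; stopping" >&2
      exit 1
    fi
  done
)

# Example usage, from the src directory:
#   run_steps ./data/fswild/ ChicagoFSWild
```

Stopping early avoids spending time on optical flow extraction or splitting when the download step has already failed.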
Note that the above script only downloads and processes YouTube videos that are still available online, so the experimental results may differ from those in the original paper.
- Training
./scripts/train.sh --help # show arguments
./scripts/train.sh --data ./data/fswild/loader/ --step 1
See training script for details.
- Evaluation
./scripts/eval.sh --help # show arguments
./scripts/eval.sh --data ./data/fswild/loader/ --stage 1
See the evaluation script for details. Note that computing MSA/AP@Acc requires an off-the-shelf fingerspelling recognizer, which can be downloaded here.
The fingerspelling detector trained on ChicagoFSWild+ ([email protected]: 0.448) can be downloaded here.
- Code for fingerspelling detector
- Code for evaluation
- Code for ASL data preparation from scratch
@inproceedings{shi2021fsdet,
author = {Bowen Shi and Diane Brentari and Greg Shakhnarovich and Karen Livescu},
title = {Fingerspelling Detection in American Sign Language},
booktitle = {CVPR},
year = {2021}
}