We propose a multi-stream keypoint attention network that models a sequence of keypoints produced by a readily available keypoint estimator. To facilitate interaction across the streams, we investigate several techniques, including keypoint fusion strategies, head fusion, and self-distillation. The resulting framework, denoted MSKA-SLR, is extended to a sign language translation (SLT) model through the straightforward addition of an extra translation network. We carry out comprehensive experiments on the well-known benchmarks Phoenix-2014, Phoenix-2014T, and CSL-Daily to demonstrate the efficacy of our method. Notably, we achieve new state-of-the-art performance on the sign language translation task of Phoenix-2014T.
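As a rough illustration of the core operation, the sketch below runs scaled dot-product self-attention over a single keypoint stream in NumPy. The function name, tensor shapes, and random projection matrices are placeholders for exposition only, not the paper's implementation (which uses multiple streams plus fusion and distillation):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def keypoint_self_attention(kp, w_q, w_k, w_v):
    """Scaled dot-product self-attention over one keypoint stream.

    kp: (T, D) array - T frames, D = flattened keypoint coordinates.
    w_q, w_k, w_v: (D, D) query/key/value projection matrices.
    """
    q, k, v = kp @ w_q, kp @ w_k, kp @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])  # (T, T) frame-to-frame affinities
    return softmax(scores, axis=-1) @ v      # (T, D) attended features

# Toy stream (e.g. hand keypoints): 8 frames, 42 coordinates each.
rng = np.random.default_rng(0)
T, D = 8, 42
kp = rng.standard_normal((T, D))
out = keypoint_self_attention(kp, *(rng.standard_normal((D, D)) for _ in range(3)))
print(out.shape)  # (8, 42)
```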
## MSKA-SLR
Dataset | WER (%) | Model | Training |
---|---|---|---|
Phoenix-2014 | 22.1 | ckpt | config |
Phoenix-2014T | 20.5 | ckpt | config |
CSL-Daily | 27.8 | ckpt | config |
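The WER column above is word error rate computed over predicted gloss sequences (lower is better). For reference, a minimal implementation of the metric, using hypothetical gloss sequences rather than the datasets' evaluation scripts:

```python
def wer(reference, hypothesis):
    """Word error rate: word-level edit distance divided by reference length."""
    r, h = reference.split(), hypothesis.split()
    # Single-row dynamic-programming Levenshtein distance.
    d = list(range(len(h) + 1))
    for i, rw in enumerate(r, 1):
        prev, d[0] = d[0], i
        for j, hw in enumerate(h, 1):
            # substitution/match, insertion, deletion
            prev, d[j] = d[j], min(prev + (rw != hw), d[j - 1] + 1, d[j] + 1)
    return d[-1] / len(r)

# Hypothetical gloss sequences, not taken from the datasets:
print(wer("MY HOUSE IS BIG", "MY HOUSE BIG"))  # 0.25 -> 25% WER
```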
## MSKA-SLT
Dataset | ROUGE | BLEU-1 | BLEU-2 | BLEU-3 | BLEU-4 | Model | Training |
---|---|---|---|---|---|---|---|
Phoenix-2014T | 53.54 | 54.79 | 42.42 | 34.49 | 29.03 | ckpt | config |
CSL-Daily | 54.04 | 56.37 | 42.80 | 32.78 | 25.52 | ckpt | config |
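The BLEU-n columns are computed by standard MT toolkits; the sketch below shows only their per-order ingredient, modified (clipped) n-gram precision, on made-up sentences. It is an illustration of the metric, not the evaluation script used for the numbers above:

```python
from collections import Counter

def ngram_precision(reference, hypothesis, n):
    """Modified (clipped) n-gram precision, the per-order ingredient of BLEU."""
    ref, hyp = reference.split(), hypothesis.split()
    ref_counts = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    hyp_counts = Counter(tuple(hyp[i:i + n]) for i in range(len(hyp) - n + 1))
    # Clip each hypothesis n-gram count by its count in the reference.
    clipped = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
    return clipped / max(sum(hyp_counts.values()), 1)

ref = "the cat sat on the mat"
hyp = "the cat on the mat"
print(ngram_precision(ref, hyp, 1))  # 1.0
print(ngram_precision(ref, hyp, 2))  # 0.75
```

BLEU-4 additionally takes the geometric mean of the order-1 to order-4 precisions and applies a brevity penalty for short hypotheses.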
## Installation

```shell
conda create -n mska python==3.10.13
conda activate mska
# Please install PyTorch according to your CUDA version.
pip install -r requirements.txt
```
## Datasets

Download the datasets from their official websites and place them under the corresponding directories in `data/`.
## Pretrained Models
`mbart_de` / `mbart_zh`: pretrained language models used to initialize the translation network for German and Chinese, respectively, with weights from mbart-cc-25.
We provide pretrained models for Phoenix-2014T and CSL-Daily. Download this directory and place it under `pretrained_models/`.
## Keypoints

We provide human keypoints for the three datasets, Phoenix-2014, Phoenix-2014T, and CSL-Daily, pre-extracted with HRNet. Please download them and place them under `data/Phoenix-2014t` (or `data/Phoenix-2014` / `data/CSL-Daily`, respectively).
## Training and Evaluation

```shell
# Train MSKA-SLR (sign-to-gloss)
python train.py --config configs/${dataset}_s2g.yaml --epoch 100

# Evaluate MSKA-SLR from a checkpoint
python train.py --config configs/${dataset}_s2g.yaml --resume pretrained_models/${dataset}_SLR/best.pth --eval

# Train MSKA-SLT (sign-to-text)
python train.py --config configs/${dataset}_s2t.yaml --epoch 40

# Evaluate MSKA-SLT from a checkpoint
python train.py --config configs/${dataset}_s2t.yaml --resume pretrained_models/${dataset}_SLT/best.pth --eval
```
## Citation

```bibtex
@misc{guan2024multistream,
  title={Multi-Stream Keypoint Attention Network for Sign Language Recognition and Translation},
  author={Mo Guan and Yan Wang and Guangkun Ma and Jiarui Liu and Mingzu Sun},
  year={2024},
  eprint={2405.05672},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```