Skip to content

MVA Master School Project - Sign Language Translation from Video to Text

Notifications You must be signed in to change notification settings

liuvince/mva-slt

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

17 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Recvis - Topic F - Sign Language Translation from Video to Text

Vincent LIU [email protected]

Supervised by Gül Varol

My code relies on the following repositories:

Presentation

https://youtu.be/nQfJSzsQ8Q0

Compute DOPE [2] features on Phoenix14T Dataset [5]

Dope.Viz

Early Fusion

Early.Fusion

Figure 1. Scheme of early fusion before projection and post projection (left) and a detailed overview of a single layered Sign LanguageTransformer (right) taken from [1]. The image example is from PHOENIX14T datatset [5].

Late Fusion

Late.Fusion

Equation 1. Equation of Late fusion.

Code

Contains code to perform dope feature extraction, early-fusion and late-fusion.

Features

https://drive.google.com/drive/folders/1Fc80pHwx8GNV4J-uzLENWD4i9r39RqO1

Experiments

Contains the notebooks of my experiments.

References

[1] Necati Cihan Camgoz, Oscar Koller, Simon Hadfield, and Richard Bowden. Sign language transformers: Joint end-to-end sign language recognition and translation, 2020.

[2] Weinzaepfel, Philippe and Bregier, Romain and Combaluzier,Hadrien and Leroy, Vincent and Rogez, Gregory. DOPE: Distillation Of Part Experts for whole-body 3D pose estimation in the wild. In ECCV, 2020.

[3] Necati Cihan Camgoz and Oscar Koller and Simon Hadfield and Richard Bowden. Multi-channel Transformers for Multi-articulatory Sign Language Translation, 2020.

[4] Oscar Koller, Necati Camgoz, Hermann Ney, and Richard Bowden. Weakly supervised learning with multi-stream cnn-lstm-hmms to discover sequential parallelism in sign language videos. IEEE Transactions on Pattern Analysis and Machine Intelligence, PP, 04 2019.

[5] Necati Cihan Camgoz, Simon Hadfield, Oscar Koller, Hermann Ney, and Richard Bowden. Neural sign language translation. InProceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018