# Machine Translation

The main source code will be available in the zero project (this may take some time; 31/05/2019). The NMT architecture used here is implemented in `deepnmt.py`.

Main experimental results are summarized below.

| Model | #Params | BLEU  | Train | Decode |
|-------|---------|-------|-------|--------|
| GNMT  | -       | 24.61 | -     | -      |
| GRU   | 206M    | 26.28 | 2.67  | 45.35  |
| ATR   | 122M    | 25.70 | 1.33  | 34.40  |
| SRU   | 170M    | 25.91 | 1.34  | 42.84  |
| LRN   | 143M    | 26.26 | 0.99  | 36.50  |
| oLRN  | 164M    | 26.73 | 1.15  | 40.19  |

- **Train**: time in seconds per training batch, measured from 0.2k training steps.
- **Decode**: time in milliseconds to decode one sentence, measured on the newstest2014 dataset.
- **BLEU**: case-insensitive tokenized BLEU score on the WMT14 English-German translation task.
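For reference, case-insensitive tokenized BLEU can be sketched as below. This is only an illustrative, self-contained implementation of corpus-level BLEU (modified n-gram precision plus brevity penalty); the scoring script actually used for the numbers above may differ in tokenization and smoothing details.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count all n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(references, hypotheses, max_n=4):
    """Corpus-level case-insensitive tokenized BLEU, on a 0-100 scale.

    references, hypotheses: parallel lists of whitespace-tokenized strings.
    """
    match = [0] * max_n   # clipped n-gram matches per order
    total = [0] * max_n   # hypothesis n-gram counts per order
    hyp_len = ref_len = 0
    for ref, hyp in zip(references, hypotheses):
        r = ref.lower().split()   # case-insensitive: lowercase both sides
        h = hyp.lower().split()
        ref_len += len(r)
        hyp_len += len(h)
        for n in range(1, max_n + 1):
            h_ngrams = ngrams(h, n)
            r_ngrams = ngrams(r, n)
            match[n - 1] += sum(min(c, r_ngrams[g]) for g, c in h_ngrams.items())
            total[n - 1] += max(len(h) - n + 1, 0)
    if min(match) == 0:
        return 0.0
    log_prec = sum(math.log(m / t) for m, t in zip(match, total)) / max_n
    bp = 1.0 if hyp_len > ref_len else math.exp(1.0 - ref_len / hyp_len)
    return 100.0 * bp * math.exp(log_prec)
```

A hypothesis identical to its reference (up to casing) scores 100.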

## oLRN structure

Unlike LRN, oLRN employs an additional output gate, inspired by the LSTM, to control the flow of output information. This additional gate also helps avoid hidden-state explosion when a linear activation is applied.
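The role of such an output gate can be sketched as follows. This is not the authors' oLRN formulation (the exact equations live in the paper and `deepnmt.py`); it is a minimal illustrative gated cell in which all weight names and gate equations are assumptions. The point it demonstrates: because the output gate is a sigmoid, each exposed hidden-state component is the internal state scaled by a factor in (0, 1), so the output never exceeds the internal state in magnitude even with an identity (linear) activation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_step(x_t, h_prev, Wp, Wq, Wo):
    """One step of a hypothetical output-gated recurrent cell.

    Wp, Wq, Wo are illustrative weight matrices, not the oLRN parameters.
    """
    i_t = sigmoid(Wp @ x_t + h_prev)       # input gate
    f_t = 1.0 - i_t                        # tied forget gate
    c_t = f_t * h_prev + i_t * (Wq @ x_t)  # internal state, linear activation
    o_t = sigmoid(Wo @ x_t + h_prev)       # LSTM-style output gate in (0, 1)
    h_t = o_t * c_t                        # gate rescales the exposed state
    return h_t, c_t, o_t

# Usage: run the cell for a few steps on random inputs.
rng = np.random.default_rng(0)
d = 4
Wp, Wq, Wo = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))
h = np.zeros(d)
for _ in range(20):
    x = rng.standard_normal(d)
    h, c, o = gated_step(x, h, Wp, Wq, Wo)
```

Since `o_t` lies strictly between 0 and 1, `|h_t| <= |c_t|` holds elementwise at every step, which is the bounding effect the output gate provides.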

## How to Run?

For training and evaluation, please refer to the zero project.