We employ the BiRNN + CRF architecture of Lample et al. (2016) and experiment on the CoNLL-2003 English NER dataset. The main experimental results are summarized below.
Model | #Params | NER (F1) |
---|---|---|
Lample et al. (2016) | - | 90.94 |
LSTM | 245K | 89.61 |
GRU | 192K | 89.35 |
ATR | 87K | 88.46 |
SRU | 161K | 88.89 |
LRN | 129K | 88.56 |
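In this architecture, the BiRNN produces per-token emission scores and the CRF layer models tag-to-tag transitions; decoding then picks the globally best tag sequence with the Viterbi algorithm. A minimal NumPy sketch of that decoding step (function and variable names are illustrative, not taken from this repository):

```python
import numpy as np

def viterbi_decode(emissions, transitions):
    """Best tag sequence under a linear-chain CRF.

    emissions:   (seq_len, num_tags) per-token scores from the BiRNN.
    transitions: (num_tags, num_tags) score of moving from tag i to tag j.
    """
    seq_len, num_tags = emissions.shape
    score = emissions[0].copy()                # best path score ending in each tag
    backptr = np.zeros((seq_len, num_tags), dtype=int)
    for t in range(1, seq_len):
        # candidate[i, j]: best path ending in tag i, then moving to tag j
        candidate = score[:, None] + transitions + emissions[t][None, :]
        backptr[t] = candidate.argmax(axis=0)
        score = candidate.max(axis=0)
    # follow the back-pointers from the best final tag
    tags = [int(score.argmax())]
    for t in range(seq_len - 1, 0, -1):
        tags.append(int(backptr[t, tags[-1]]))
    return tags[::-1]
```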
Requirements: see `requirements.txt` for the full list.
Download and preprocess the dataset (a reading sketch follows the list):
- download the CoNLL-2003 dataset from anago (in its `data` folder).
- download the GloVe 6B-100d pre-trained word embeddings from http://nlp.stanford.edu/data/glove.6B.zip
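Both resources are plain text: the CoNLL-2003 files hold one token per line (word, POS, chunk and NER-tag columns) with blank lines between sentences, and each GloVe line holds a word followed by 100 floats. A minimal reading sketch under those assumptions (paths and function names are illustrative):

```python
import numpy as np

def read_conll(path):
    """Yield (words, ner_tags) pairs; columns are word, POS, chunk, NER tag."""
    words, tags = [], []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip()
            if not line:                       # a blank line ends a sentence
                if words:
                    yield words, tags
                    words, tags = [], []
            elif not line.startswith("-DOCSTART-"):
                cols = line.split()
                words.append(cols[0])
                tags.append(cols[-1])
    if words:
        yield words, tags

def load_glove(path):
    """Read GloVe vectors into a {word: np.ndarray} dictionary."""
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            vectors[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return vectors

# e.g. sentences = list(read_conll("conll2003/en/ner/train.txt"))
#      glove = load_glove("glove.6B/glove.6B.100d.txt")  # 400K words, 100 dims
```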
No hyperparameters are tuned; we keep all of them at their default values.
Training and evaluation: the running procedure is as follows:
    export CUDA_ROOT=XXX
    export PATH=$CUDA_ROOT/bin:$PATH
    export LD_LIBRARY_PATH=$CUDA_ROOT/lib64:$LD_LIBRARY_PATH
    export CUDA_VISIBLE_DEVICES=0

    export data_dir=path-of/conll2003/en/ner
    export glove_dir=path-of/glove.6B/glove.6B.100d.txt

    RUN_EXP=5
    rnn=lrn

    for i in $(seq 1 $RUN_EXP); do
        exp_dir=exp$i
        mkdir $exp_dir
        cd $exp_dir

        # train and evaluate one run with the chosen RNN cell
        export cell_type=$rnn
        python3 ner_glove.py --cell $rnn >& log.$rnn

        cd ../
    done

    # aggregate the test scores over all runs
    python scripts/get_test_score.py $rnn exp* >& score.$rnn
Results are reported over 5 runs.
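`scripts/get_test_score.py` collects the test scores from the per-run logs. Conceptually, aggregating F1 over the runs amounts to the following standalone sketch (a hypothetical stand-in, not the repository script; the log pattern is assumed):

```python
import re
import statistics
import sys

def collect_f1(log_paths):
    """Take the last F1 value reported in each run's log (pattern is assumed)."""
    scores = []
    for path in log_paths:
        with open(path) as f:
            hits = re.findall(r"f1[:=\s]+([0-9.]+)", f.read(), flags=re.IGNORECASE)
        if hits:
            scores.append(float(hits[-1]))
    return scores

if __name__ == "__main__":
    f1 = collect_f1(sys.argv[1:])              # e.g. exp1/log.lrn exp2/log.lrn ...
    print(f"runs={len(f1)}  mean={statistics.mean(f1):.2f}  "
          f"best={max(f1):.2f}  stdev={statistics.stdev(f1):.2f}")
```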
The source code structure is adapted from anago.