Solutions for the Named Entity Recognition (NER) task, including BERT-series models and a BiLSTM-CRF model.
- cner: datasets/cner
- CLUENER: http://www.cluebenchmark.com/introduce.html
- BERT+Softmax
- BERT+CRF
- BERT+Span (a minimal sketch of this variant follows the list)
- BERT+Span+label_smoothing
- BERT+Span+focal_loss
- BiLSTM+CRF
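As a quick orientation for the BERT+Span variant listed above, here is a minimal sketch of the idea, assuming the Hugging Face `transformers` API; the class and attribute names below are hypothetical, not the repo's actual code. BERT encodes the sentence, and two linear heads predict per-token start and end logits for each entity label.

```python
import torch.nn as nn
from transformers import BertModel

class BertSpanSketch(nn.Module):
    """Hypothetical illustration of the BERT+Span head, not the repo's class."""
    def __init__(self, bert_name: str, num_labels: int):
        super().__init__()
        self.bert = BertModel.from_pretrained(bert_name)
        hidden = self.bert.config.hidden_size
        self.start_head = nn.Linear(hidden, num_labels)  # start-of-entity logits
        self.end_head = nn.Linear(hidden, num_labels)    # end-of-entity logits

    def forward(self, input_ids, attention_mask):
        # (batch, seq_len, hidden) contextual token representations
        hidden_states = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        return self.start_head(hidden_states), self.end_head(hidden_states)
```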
- Requirements
  - python>=3.6
  - torch>=1.1.0
  - cuda>=9.0

Install PyTorch with pip (quote the requirement so the shell does not interpret `>=` as a redirect):

pip install "torch>=1.1.0"
Input format (the BIOS tag scheme is preferred): one character and its label per line, separated by a space. Sentences are separated by an empty line.
美 B-LOC
国 I-LOC
的 O
华 B-PER
莱 I-PER
士 I-PER

我 O
跟 O
他 O
谈 O
笑 O
风 O
生 O
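A minimal reader for this format could look as follows; this is a sketch based on the format described above, not the repo's actual data loader, and `read_bios` is a hypothetical helper name.

```python
def read_bios(path):
    """Read character-per-line BIOS data into (chars, labels) sentence pairs."""
    sentences, chars, labels = [], [], []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:  # an empty line marks a sentence boundary
                if chars:
                    sentences.append((chars, labels))
                    chars, labels = [], []
                continue
            char, label = line.split()  # e.g. "美 B-LOC"
            chars.append(char)
            labels.append(label)
    if chars:  # flush a trailing sentence with no final empty line
        sentences.append((chars, labels))
    return sentences
```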
- Download the pretrained models from [https://pan.baidu.com/s/1NS7-fQALRAqBv2yT4Lcn1g key: umku] and save them to the prev_trained_model/ directory.
Note: expected file structure of the pretrained models:

prev_trained_model
├── albert-base
│   ├── config.json
│   ├── pytorch_model.bin
│   └── vocab.txt
└── bert-base
    ├── config.json
    ├── pytorch_model.bin
    └── vocab.txt
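To sanity-check that the files are in place, one can try loading them, for example with the Hugging Face `transformers` library (an assumption for illustration; the repo's training scripts load these weights themselves):

```python
from transformers import BertModel, BertTokenizer

# Quick file check only; paths mirror the directory tree above.
tokenizer = BertTokenizer.from_pretrained("prev_trained_model/bert-base")
model = BertModel.from_pretrained("prev_trained_model/bert-base")
print(model.config.hidden_size)  # e.g. 768 for a bert-base checkpoint
```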
- Modify the configuration in run_ner_xxx.py or run_ner_xxx.sh, then run:

sh run_ner_xxx.sh
The overall performance of BERT on dev:
| Model | Accuracy (entity) | Recall (entity) | F1 score (entity) | Training settings |
| --- | --- | --- | --- | --- |
| BERT+Softmax | 0.7916 | 0.7962 | 0.7939 | train_max_length=128 eval_max_length=512 epoch=4 lr=3e-5 batch_size=24 |
| BERT+CRF | 0.7877 | 0.8008 | 0.7942 | train_max_length=128 eval_max_length=512 epoch=5 lr=3e-5 batch_size=24 |
| BERT+Span | 0.8132 | 0.8092 | 0.8112 | train_max_length=128 eval_max_length=512 epoch=4 lr=3e-5 batch_size=24 |
| BERT+Span+focal_loss | 0.8121 | 0.8008 | 0.8064 | train_max_length=128 eval_max_length=512 epoch=4 lr=3e-5 batch_size=24 loss_type=focal |
| BERT+Span+label_smoothing | 0.8235 | 0.7946 | 0.8088 | train_max_length=128 eval_max_length=512 epoch=4 lr=3e-5 batch_size=24 loss_type=lsr |
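The focal_loss and label_smoothing rows differ from plain BERT+Span only in the training objective (loss_type). For reference, here is a sketch of the standard focal-loss formulation (Lin et al., 2017) in PyTorch; the repo's implementation may differ in details such as alpha class weighting.

```python
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0):
    """Down-weight easy examples by (1 - p_t)^gamma before averaging."""
    log_probs = F.log_softmax(logits, dim=-1)               # (N, C)
    nll = F.nll_loss(log_probs, targets, reduction="none")  # (N,) = -log p_t
    p_t = log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1).exp()
    return ((1.0 - p_t) ** gamma * nll).mean()
```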
The overall performance of BERT on dev (test results in parentheses):
| Model | Accuracy (entity) | Recall (entity) | F1 score (entity) | Training settings |
| --- | --- | --- | --- | --- |
| BERT+Softmax | 0.9586(0.9566) | 0.9644(0.9613) | 0.9615(0.9590) | train_max_length=128 eval_max_length=512 epoch=4 lr=3e-5 batch_size=24 |
| BERT+CRF | 0.9562(0.9539) | 0.9671(0.9644) | 0.9616(0.9591) | train_max_length=128 eval_max_length=512 epoch=10 lr=3e-5 batch_size=24 |
| BERT+Span | 0.9604(0.9620) | 0.9617(0.9632) | 0.9611(0.9626) | train_max_length=128 eval_max_length=512 epoch=4 lr=3e-5 batch_size=24 |
| BERT+Span+focal_loss | 0.9516(0.9569) | 0.9644(0.9681) | 0.9580(0.9625) | train_max_length=128 eval_max_length=512 epoch=4 lr=3e-5 batch_size=24 loss_type=focal |
| BERT+Span+label_smoothing | 0.9566(0.9568) | 0.9624(0.9656) | 0.9595(0.9612) | train_max_length=128 eval_max_length=512 epoch=4 lr=3e-5 batch_size=24 loss_type=lsr |
The per-entity performance of BERT on test:
| | CONT | ORG | LOC | EDU | NAME | PRO | RACE | TITLE |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| **BERT+Softmax** | | | | | | | | |
| Accuracy | 1.0000 | 0.9446 | 1.0000 | 0.9911 | 1.0000 | 0.8919 | 1.0000 | 0.9545 |
| Recall | 1.0000 | 0.9566 | 1.0000 | 0.9911 | 1.0000 | 1.0000 | 1.0000 | 0.9508 |
| F1 Score | 1.0000 | 0.9506 | 1.0000 | 0.9911 | 1.0000 | 0.9429 | 1.0000 | 0.9526 |
| **BERT+CRF** | | | | | | | | |
| Accuracy | 1.0000 | 0.9446 | 1.0000 | 0.9823 | 1.0000 | 0.9687 | 1.0000 | 0.9591 |
| Recall | 1.0000 | 0.9566 | 1.0000 | 0.9911 | 1.0000 | 0.9697 | 1.0000 | 0.9534 |
| F1 Score | 1.0000 | 0.9506 | 1.0000 | 0.9867 | 1.0000 | 0.9697 | 1.0000 | 0.9552 |
| **BERT+Span** | | | | | | | | |
| Accuracy | 1.0000 | 0.9378 | 1.0000 | 0.9911 | 1.0000 | 0.9429 | 1.0000 | 0.9685 |
| Recall | 1.0000 | 0.9548 | 1.0000 | 0.9911 | 1.0000 | 1.0000 | 1.0000 | 0.9560 |
| F1 Score | 1.0000 | 0.9462 | 1.0000 | 0.9911 | 1.0000 | 0.9706 | 1.0000 | 0.9622 |
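These are entity-level scores, where a predicted entity typically counts as correct only if its span boundaries and type both match the gold annotation exactly (the usual exact-match convention, assumed here for illustration; the repo's metric code is not shown in this section). A minimal computation over span sets:

```python
def entity_prf(gold_spans, pred_spans):
    """Exact-match precision/recall/F1 over (start, end, type) entity spans."""
    gold, pred = set(gold_spans), set(pred_spans)
    tp = len(gold & pred)  # spans predicted with exactly matching boundaries and type
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```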