The Transformer-based CRF model (Bert-CRF for short) is widely used for sequence labeling and has proven to be a strong baseline in most scenarios. Thanks to careful model design and training techniques, our implementation beats most off-the-shelf sequence labeling frameworks.
This example shows how to train a Transformer-based CRF model with AdaSeq.

As an example, let's train a NER model on the `resume` dataset using the preset config:
```shell
python scripts/train.py -c examples/bert_crf/configs/resume.yaml
```
See the training tutorial for more details.
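For orientation, a training config in AdaSeq is a plain YAML file that bundles the dataset, preprocessor, model, and trainer settings. The sketch below is illustrative only: the key names and values are assumptions based on a typical setup, so refer to `examples/bert_crf/configs/resume.yaml` for the authoritative layout.

```yaml
# Illustrative sketch of a Bert-CRF training config (key names are
# assumptions; check examples/bert_crf/configs/resume.yaml for the real ones).
experiment:
  exp_dir: experiments/            # where checkpoints and logs are written
  exp_name: resume

task: named-entity-recognition

dataset:
  name: resume                     # a preset dataset, or paths to your own files

preprocessor:
  type: sequence-labeling-preprocessor
  max_length: 150                  # truncate sequences longer than this
  tag_scheme: BIOES

model:
  type: sequence-labeling-model
  embedder:
    model_name_or_path: damo/nlp_structbert_backbone_base_std  # backbone id; verify on ModelScope
  word_dropout: 0.1
  use_crf: true

train:
  max_epochs: 20
  dataloader:
    batch_size_per_gpu: 16
  optimizer:
    type: AdamW
    lr: 5.0e-5
```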
A few options in the config are worth highlighting:

```yaml
preprocessor:
  tag_scheme: BIOES  # (str, optional) Tag scheme for sequence labeling, illustrated below. One of `BIO`, `BIOES`, `BIES`, `BI`. Defaults to `BIOES`.

model:
  word_dropout: 0.1  # (float, optional) Word-level/token-level dropout probability. Defaults to `0`.
  use_crf: true      # (bool, optional) Whether to use the CRF decoder. Defaults to `true`.
```
See the configuration tutorial for the full list of options.
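To make the `tag_scheme` option concrete, here is how the same made-up sentence is labeled under BIO versus BIOES. Under BIOES, `E-` marks the last token of a multi-token entity and `S-` marks a single-token entity, which gives the CRF explicit boundary signals:

```
Tokens:  John   Smith  visited  Paris
BIO:     B-PER  I-PER  O        B-LOC
BIOES:   B-PER  E-PER  O        S-LOC
```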
- NER
| Language | Dataset | Backbone | AdaSeq Bert-CRF | Best published | Modelcard & Demo |
|----------|---------|----------|-----------------|----------------|------------------|
| Chinese | msra | structbert-base | 96.69 | 96.72 (Li et al., 2020) | ModelScope |
| Chinese | ontonotes 4.0 | structbert-base | 83.04 | 84.47 (Li et al., 2020) | ModelScope |
| Chinese | resume | structbert-base | 96.87 | 96.79 (Xuan et al., 2020) | ModelScope |
| Chinese | weibo | structbert-base | 72.77 | 72.66 (Zhu et al., 2022) | ModelScope |
| English | conll03 | xlm-roberta-large | 93.35 | 94.6 (Wang et al., 2021) | ModelScope |
| English | conllpp | xlm-roberta-large | 94.71 | 95.88 (Zhou et al., 2021) | ModelScope |
| English | wnut16 | xlm-roberta-large | 57.23 | 58.98 (Wang et al., 2021) | - |
| English | wnut17 | xlm-roberta-large | 59.69 | 60.45 (Wang et al., 2021) | ModelScope |
- Word Segmentation
| Language | Dataset | Backbone | AdaSeq Bert-CRF | Best published | Modelcard & Demo |
|----------|---------|----------|-----------------|----------------|------------------|
| Chinese | pku | bastructbert-base | 96.87 | 96.84 (Jiang et al., 2022) | ModelScope |
- Part-of-Speech Tagging
| Language | Dataset | Backbone | AdaSeq Bert-CRF | Best published | Modelcard & Demo |
|----------|---------|----------|-----------------|----------------|------------------|
| Chinese | ctb6 | bastructbert-base | 95.19 | 95.41 (Meng et al., 2020) | ModelScope |