Test result comparison without vs with document-level features #42

Closed
junwei-h opened this issue Jul 28, 2022 · 18 comments

@junwei-h

Sorry to raise another issue. I observed that ACE with document-level features (ACE+doc) performs worse than ACE without them:

ACE+doc (python .\train.py --config .\config\doc_ner_best.yaml --test)
MICRO_AVG: acc 0.8338 - f1-score 0.9094
MACRO_AVG: acc 0.8032 - f1-score 0.8851
LOC tp: 1550 - fp: 135 - fn: 118 - tn: 1550 - precision: 0.9199 - recall: 0.9293 - accuracy: 0.8597 - f1-score: 0.9246
MISC tp: 478 - fp: 95 - fn: 224 - tn: 478 - precision: 0.8342 - recall: 0.6809 - accuracy: 0.5997 - f1-score: 0.7498
ORG tp: 1426 - fp: 99 - fn: 235 - tn: 1426 - precision: 0.9351 - recall: 0.8585 - accuracy: 0.8102 - f1-score: 0.8952
PER tp: 1563 - fp: 40 - fn: 54 - tn: 1563 - precision: 0.9750 - recall: 0.9666 - accuracy: 0.9433 - f1-score: 0.9708

ACE (python .\train.py --config .\config\conll_03_english.yaml --test)
MICRO_AVG: acc 0.8807 - f1-score 0.9366
MACRO_AVG: acc 0.8635 - f1-score 0.9247500000000001
LOC tp: 1580 - fp: 90 - fn: 88 - tn: 1580 - precision: 0.9461 - recall: 0.9472 - accuracy: 0.8987 - f1-score: 0.9466
MISC tp: 606 - fp: 115 - fn: 96 - tn: 606 - precision: 0.8405 - recall: 0.8632 - accuracy: 0.7417 - f1-score: 0.8517
ORG tp: 1561 - fp: 159 - fn: 100 - tn: 1561 - precision: 0.9076 - recall: 0.9398 - accuracy: 0.8577 - f1-score: 0.9234
PER tp: 1575 - fp: 31 - fn: 42 - tn: 1575 - precision: 0.9807 - recall: 0.9740 - accuracy: 0.9557 - f1-score: 0.9773

Did I miss anything? Thanks.

@wangxinyu0922
Member

That's strange. Could you provide your modified config file and the NER data you used for the experiment?

@junwei-h
Author

junwei-h commented Jul 29, 2022

Here is the doc_ner_best.yaml content.

Controller:
  model_structure: null
MFVI:
  hexa_rank: 150
  hexa_std: 1
  iterations: 3
  normalize_weight: true
  quad_rank: 150
  quad_std: 1
  tag_dim: 150
  use_hexalinear: false
  use_quadrilinear: false
  use_second_order: false
  use_third_order: false
  window_size: 1
ReinforcementTrainer:
  assign_doc_id: true
  controller_learning_rate: 0.1
  controller_optimizer: SGD
  distill_mode: false
  optimizer: SGD
  pretrained_file_dict:
    bert-base-cased: resources/bert-base-cased.hdf5
    bert-base-multilingual-cased: resources/bert-base-multilingual-cased.hdf5
    bert-large-cased: resources/bert-large-cased.hdf5
  sentence_level_batch: true
  train_with_doc: true
anneal_factor: 2
ast:
  Corpus: SEMEVAL16-TR:SEMEVAL16-ES:SEMEVAL16-NL:SEMEVAL16-EN:SEMEVAL16-RU
atis:
  Corpus: ATIS-EN:ATIS-TR:ATIS-HI
chunk:
  Corpus: CONLL_03:CONLL_03_GERMAN
embeddings:
  ELMoEmbeddings-0:
    model: original
    # options_file: elmo_2x4096_512_2048cnn_2xhighway_options.json
    # weight_file: elmo_2x4096_512_2048cnn_2xhighway_weights.hdf5
  FastWordEmbeddings-0:
    embeddings: en
    freeze: true
  FlairEmbeddings-0:
    model: en-forward
  FlairEmbeddings-1:
    model: en-backward
  FlairEmbeddings-2:
    model: multi-forward
  FlairEmbeddings-3:
    model: multi-backward
  TransformerWordEmbeddings-0:
    layers: '-1'
    model: xlnet-large-cased
    embedding_name: xlnet-large-cased
    pooling_operation: first
    v2_doc: true
  TransformerWordEmbeddings-1:
    layers: '-1'
    model: xlm-roberta-large
    embedding_name: xlm-roberta-large
    pooling_operation: first
    v2_doc: true
  TransformerWordEmbeddings-2:
    layers: '-1'
    model: roberta-large
    embedding_name: roberta-large
    pooling_operation: first
    v2_doc: true
  TransformerWordEmbeddings-3:
    layers: -1,-2,-3,-4
    model: bert-large-cased
    pooling_operation: first
  TransformerWordEmbeddings-4:
    layers: -1,-2,-3,-4
    model: bert-base-cased
    embedding_name: bert-base-cased
    pooling_operation: first
  TransformerWordEmbeddings-5:
    layers: -1,-2,-3,-4
    model: bert-base-multilingual-cased
    pooling_operation: first
interpolation: 0.5
is_teacher_list: true
model:
  FastSequenceTagger:
    crf_attention: false
    dropout: 0.0
    hidden_size: 800
    sentence_loss: true
    use_crf: true
model_name: xlnet-task-docv2_en-xlmr-task-tuned-docv2
ner:
  Corpus: CONLL_03_ENGLISH
  tag_dictionary: resources/taggers/ner_tags.pkl
target_dir: resources/taggers/
targets: ner
teacher_annealing: false
train:
  controller_momentum: 0.9
  learning_rate: 0.1
  max_episodes: 30
  max_epochs: 150
  max_epochs_without_improvement: 25
  mini_batch_size: 32
  monitor_test: false
  patience: 5
  save_final_model: false
  train_with_dev: false
  true_reshuffle: false
  use_warmup: false
trainer: ReinforcementTrainer

Since I shortened the embedding names, I added
if '/' in name: name = name.split('/')[-1]
before https://github.com/Alibaba-NLP/ACE/blob/main/flair/trainers/reinforcement_trainer.py#L1468
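For context, here is a rough sketch of where that line goes (the AutoTokenizer call is the existing statement at that line; how name is obtained beforehand is my simplification for illustration):

from transformers import AutoTokenizer  # already imported in reinforcement_trainer.py

name = embedding.name  # assumption for illustration: the long embedding name, e.g. '.../ner4/xlnet-large-cased'
if '/' in name:
    name = name.split('/')[-1]  # keep only the last path component, e.g. 'xlnet-large-cased'
embedding.tokenizer = AutoTokenizer.from_pretrained(name, do_lower_case=True)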

I also applied

-        elif self.ext_doc:
+        elif hasattr(self, 'ext_doc') and self.ext_doc:

at https://github.com/Alibaba-NLP/ACE/blob/main/flair/embeddings.py#L2943
and

-            if self.sentence_feat:
+            if hasattr(input_sentences,'sentence_feat') and self.sentence_feat:

at https://github.com/Alibaba-NLP/ACE/blob/main/flair/embeddings.py#L3148

to get the code working.

Thanks.

I used '--test', so I assume it is using the CoNLL-2003 corpus as the NER data.

@wangxinyu0922
Member

wangxinyu0922 commented Jul 29, 2022


To use my pretrained model, the embedding names cannot be modified. The code sorts the embeddings by their names so that the order of the embeddings is always the same (this is quite important in ACE). You may change the embedding names back to those in the original yaml file. The code reads the model parameters according to the model: setting, so you can change the path in model:, but embedding_name: should stay fixed.

By the way, if you want to train your own model, you can choose your own embedding_name: for each embedding, or simply delete the embedding_name: entries, because the code will then derive the embedding name from the model path.
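
For example, a hypothetical entry would look like the following sketch (the model: path may point to your local copy and is free to change; the placeholder for embedding_name: stands for the exact value from the original doc_ner_best.yaml and must not be modified):

TransformerWordEmbeddings-0:
  layers: '-1'
  model: resources/my-local-xlnet-large-cased  # local path, free to change
  embedding_name: <exact name from the original doc_ner_best.yaml>  # must stay unchanged
  pooling_operation: first
  v2_doc: true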

@junwei-h
Author

junwei-h commented Jul 29, 2022

I get the following error when using the default embedding names. That is why I shortened them. Any suggestions? Thank you.

[2022-07-29 10:35:23,903 INFO] loading file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-multilingual-cased-vocab.txt from cache at C:\Users\ebb/.cache\torch\transformers\96435fa287fbf7e469185f1062386e05a075cadbf6838b74da22bf64b080bc32.99bcd55fc66f4f3360bc49ba472b940b8dcf223ea6a345deb969d607ca900729
2022-07-29 10:35:37,507 Testing using best model ...
2022-07-29 10:35:37,519 Setting embedding mask to the best action: tensor([1., 1., 0., 1., 0., 0., 1., 0., 0., 1., 1., 1.])
['/home/yongjiang.jy/.flair/embeddings/en-xlmr-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner5/roberta-large_v2doc', '/home/yongjiang.jy/.flair/embeddings/xlmr-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner3/xlm-roberta-large_v2doc', '/home/yongjiang.jy/.flair/embeddings/xlnet-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner4/xlnet-large-cased_v2doc', 'C:\\Users\\ebb\\.flair\\embeddings\\lm-jw300-backward-v0.1.pt', 'C:\\Users\\ebb\\.flair\\embeddings\\lm-jw300-forward-v0.1.pt', 'C:\\Users\\ebb\\.flair\\embeddings\\news-backward-0.4.1.pt', 'C:\\Users\\ebb\\.flair\\embeddings\\news-forward-0.4.1.pt', 'Word: en', 'bert-base-cased', 'bert-base-multilingual-cased', 'bert-large-cased', 'elmo-original']
Traceback (most recent call last):
  File "C:\Users\ebb\.conda\envs\ace_py37\lib\site-packages\transformers\configuration_utils.py", line 242, in get_config_dict
    raise EnvironmentError
OSError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File ".\train.py", line 163, in <module>
    predict_posterior=args.predict_posterior,
  File "C:\Users\ebb\ACE\flair\trainers\reinforcement_trainer.py", line 1472, in final_test
    embedding.tokenizer = AutoTokenizer.from_pretrained(name, do_lower_case=True)
  File "C:\Users\ebb\.conda\envs\ace_py37\lib\site-packages\transformers\tokenization_auto.py", line 206, in from_pretrained
    config = AutoConfig.from_pretrained(pretrained_model_name_or_path, **kwargs)
  File "C:\Users\ebb\.conda\envs\ace_py37\lib\site-packages\transformers\configuration_auto.py", line 203, in from_pretrained
    config_dict, _ = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "C:\Users\ebb\.conda\envs\ace_py37\lib\site-packages\transformers\configuration_utils.py", line 251, in get_config_dict
    raise EnvironmentError(msg)
OSError: Can't load config for '/home/yongjiang.jy/.flair/embeddings/xlnet-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner4/xlnet-large-cased'. Make sure that:

- '/home/yongjiang.jy/.flair/embeddings/xlnet-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner4/xlnet-large-cased' is a correct model identifier listed on 'https://huggingface.co/models'

- or '/home/yongjiang.jy/.flair/embeddings/xlnet-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner4/xlnet-large-cased' is the correct path to a directory containing a config.json file

EDIT: actually the same error occurs regardless of whether I shorten the embedding names in the yaml file or not. The name seems to be hard-coded somewhere. I have to add if '/' in name: name = name.split('/')[-1] to avoid this error.

@wangxinyu0922
Member

This was caused by my modifications in #41. I have added some additional changes to reinforcement_trainer.py to fix this problem. By the way, I think adding if '/' in name: name = name.split('/')[-1] fixes this particular issue, but it may cause other problems if you would like to train your own model.

@junwei-h
Author

junwei-h commented Aug 1, 2022

Thanks for your comments. I will remove if '/' in name: name = name.split('/')[-1] when I train my own model. The current problem is that, with this line in place, I can run python .\train.py --config .\config\doc_ner_best.yaml --test, but the result is worse than ACE without document-level features. See the last few lines of the output below. What result do you get?

2022-08-01 16:15:23,401 bert-base-multilingual-cased 177853440
2022-08-01 16:15:23,401 first
2022-08-01 16:15:24,517 bert-large-cased 333579264
2022-08-01 16:15:24,517 first
2022-08-01 16:15:25,686 elmo-original 0
2022-08-01 16:15:26,707 Finished Embeddings Assignments
2022-08-01 16:15:33,542 10/108
2022-08-01 16:15:37,408 20/108
2022-08-01 16:15:43,965 30/108
2022-08-01 16:15:52,792 40/108
2022-08-01 16:16:02,594 50/108
2022-08-01 16:16:09,335 60/108
2022-08-01 16:16:17,459 70/108
2022-08-01 16:16:23,476 80/108
2022-08-01 16:16:27,939 90/108
2022-08-01 16:16:32,564 100/108
2022-08-01 16:16:35,964 0.9315 0.8883 0.9094
2022-08-01 16:16:35,964
MICRO_AVG: acc 0.8338 - f1-score 0.9094
MACRO_AVG: acc 0.8032 - f1-score 0.8851
LOC tp: 1550 - fp: 135 - fn: 118 - tn: 1550 - precision: 0.9199 - recall: 0.9293 - accuracy: 0.8597 - f1-score: 0.9246
MISC tp: 478 - fp: 95 - fn: 224 - tn: 478 - precision: 0.8342 - recall: 0.6809 - accuracy: 0.5997 - f1-score: 0.7498
ORG tp: 1426 - fp: 99 - fn: 235 - tn: 1426 - precision: 0.9351 - recall: 0.8585 - accuracy: 0.8102 - f1-score: 0.8952
PER tp: 1563 - fp: 40 - fn: 54 - tn: 1563 - precision: 0.9750 - recall: 0.9666 - accuracy: 0.9433 - f1-score: 0.9708

@wangxinyu0922
Member

The F1 score is unchanged compared with your initial run. Could you show a longer portion of the test output? The code should print the names and the order of the embeddings.

@junwei-h
Author

junwei-h commented Aug 2, 2022

Yes, the result is identical to the original run. I printed the output to the screen; the following is the longest portion I can see now. Let me know if you need earlier logs.

[2022-08-01 12:19:53,749 INFO] loading weights file https://cdn.huggingface.co/bert-base-multilingual-cased-pytorch_model.bin from cache at C:\Users\ebb/.cache\torch\transformers\3d1d2b2daef1e2b3ddc2180ddaae8b7a37d5f279babce0068361f71cd548f615.7131dcb754361639a7d5526985f880879c9bfd144b65a0bf50590bddb7de9059
[2022-08-01 12:19:56,735 INFO] All model checkpoint weights were used when initializing BertModel.

[2022-08-01 12:19:56,735 INFO] All the weights of BertModel were initialized from the model checkpoint at bert-base-multilingual-cased.
If your task is similar to the task the model of the ckeckpoint was trained on, you can already use BertModel for predictions without further training.
2022-08-01 12:19:57,548 Model Size: 2191057164
2022-08-01 12:20:16,437 Loaded predicted embeddings: resources/bert-large-cased.hdf5
2022-08-01 12:20:35,421 Loaded predicted embeddings: resources/bert-base-cased.hdf5
2022-08-01 12:20:53,593 Loaded predicted embeddings: resources/bert-base-multilingual-cased.hdf5
Corpus: 14041 train + 3250 dev + 3453 test sentences
2022-08-01 12:20:53,615 ----------------------------------------------------------------------------------------------------
2022-08-01 12:20:53,625 loading file resources\taggers\xlnet-task-docv2_en-xlmr-task-tuned-docv2\best-model.pt
[2022-08-01 12:20:54,699 INFO] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-config.json from cache at C:\Users\ebb/.cache\torch\transformers\90deb4d9dd705272dc4b3db1364d759d551d72a9f70a91f60e3a1f5e278b985d.9019d8d0ae95e32b896211ae7ae130d7c36bb19ccf35c90a9e51923309458f70
[2022-08-01 12:20:54,699 INFO] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "directionality": "bidi",
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 1024,
  "initializer_range": 0.02,
  "intermediate_size": 4096,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 16,
  "num_hidden_layers": 24,
  "pad_token_id": 0,
  "pooler_fc_size": 768,
  "pooler_num_attention_heads": 12,
  "pooler_num_fc_layers": 3,
  "pooler_size_per_head": 128,
  "pooler_type": "first_token_transform",
  "type_vocab_size": 2,
  "vocab_size": 28996
}

[2022-08-01 12:20:54,868 INFO] loading file https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-vocab.txt from cache at C:\Users\ebb/.cache\torch\transformers\cee054f6aafe5e2cf816d2228704e326446785f940f5451a5b26033516a4ac3d.e13dbb970cb325137104fb2e5f36fe865f27746c6b526f6352861b1980eb80b1
[2022-08-01 12:20:55,347 INFO] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-multilingual-cased-config.json from cache at C:\Users\ebb/.cache\torch\transformers\45629519f3117b89d89fd9c740073d8e4c1f0a70f9842476185100a8afe715d1.65df3cef028a0c91a7b059e4c404a975ebe6843c71267b67019c0e9cfa8a88f0
[2022-08-01 12:20:55,347 INFO] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "directionality": "bidi",
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "pooler_fc_size": 768,
  "pooler_num_attention_heads": 12,
  "pooler_num_fc_layers": 3,
  "pooler_size_per_head": 128,
  "pooler_type": "first_token_transform",
  "type_vocab_size": 2,
  "vocab_size": 119547
}

[2022-08-01 12:20:55,535 INFO] loading file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-multilingual-cased-vocab.txt from cache at C:\Users\ebb/.cache\torch\transformers\96435fa287fbf7e469185f1062386e05a075cadbf6838b74da22bf64b080bc32.99bcd55fc66f4f3360bc49ba472b940b8dcf223ea6a345deb969d607ca900729
2022-08-01 12:21:11,886 Testing using best model ...
2022-08-01 12:21:11,920 Setting embedding mask to the best action: tensor([1., 1., 0., 1., 0., 0., 1., 0., 0., 1., 1., 1.])
['/home/yongjiang.jy/.flair/embeddings/en-xlmr-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner5/roberta-large_v2doc', '/home/yongjiang.jy/.flair/embeddings/xlmr-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner3/xlm-roberta-large_v2doc', '/home/yongjiang.jy/.flair/embeddings/xlnet-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner4/xlnet-large-cased_v2doc', 'C:\\Users\\ebb\\.flair\\embeddings\\lm-jw300-backward-v0.1.pt', 'C:\\Users\\ebb\\.flair\\embeddings\\lm-jw300-forward-v0.1.pt', 'C:\\Users\\ebb\\.flair\\embeddings\\news-backward-0.4.1.pt', 'C:\\Users\\ebb\\.flair\\embeddings\\news-forward-0.4.1.pt', 'Word: en', 'bert-base-cased', 'bert-base-multilingual-cased', 'bert-large-cased', 'elmo-original']
[2022-08-01 12:21:12,238 INFO] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/xlnet-large-cased-config.json from cache at C:\Users\ebb/.cache\torch\transformers\df92a75c0ebbeb195065fe16fafa54ccd72e8362692cca884303a56788bd4bfc.0163e810fe4bdef52282bd9ddcbded8accbaa97a3ea7d89737ee7ce87511c587
[2022-08-01 12:21:12,254 INFO] Model config XLNetConfig {
  "architectures": [
    "XLNetLMHeadModel"
  ],
  "attn_type": "bi",
  "bi_data": false,
  "bos_token_id": 1,
  "clamp_len": -1,
  "d_head": 64,
  "d_inner": 4096,
  "d_model": 1024,
  "dropout": 0.1,
  "end_n_top": 5,
  "eos_token_id": 2,
  "ff_activation": "gelu",
  "initializer_range": 0.02,
  "layer_norm_eps": 1e-12,
  "mem_len": null,
  "model_type": "xlnet",
  "n_head": 16,
  "n_layer": 24,
  "pad_token_id": 5,
  "reuse_len": null,
  "same_length": false,
  "start_n_top": 5,
  "summary_activation": "tanh",
  "summary_last_dropout": 0.1,
  "summary_type": "last",
  "summary_use_proj": true,
  "task_specific_params": {
    "text-generation": {
      "do_sample": true,
      "max_length": 250
    }
  },
  "untie_r": true,
  "vocab_size": 32000
}

[2022-08-01 12:21:12,433 INFO] loading file https://s3.amazonaws.com/models.huggingface.co/bert/xlnet-large-cased-spiece.model from cache at C:\Users\ebb/.cache\torch\transformers\5b125ba222ff82664771f63cd8fac9696c24b403fc1ab720d537fe2ceaaf0576.8b10bd978b5d01c21303cc761fc9ecd464419b3bf921864a355ba807cfbfafa8
[2022-08-01 12:21:12,702 INFO] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/xlm-roberta-large-config.json from cache at C:\Users\ebb/.cache\torch\transformers\5ac6d3984e5ca7c5227e4821c65d341900125db538c5f09a1ead14f380def4a7.aa59609b4f56f82fa7699f0d47997566ccc4cf07e484f3a7bc883bd7c5a34488
[2022-08-01 12:21:12,702 INFO] Model config XLMRobertaConfig {
  "architectures": [
    "XLMRobertaForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": 0,
  "eos_token_id": 2,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 1024,
  "initializer_range": 0.02,
  "intermediate_size": 4096,
  "layer_norm_eps": 1e-05,
  "max_position_embeddings": 514,
  "model_type": "xlm-roberta",
  "num_attention_heads": 16,
  "num_hidden_layers": 24,
  "output_past": true,
  "pad_token_id": 1,
  "type_vocab_size": 1,
  "vocab_size": 250002
}

[2022-08-01 12:21:12,916 INFO] loading file https://s3.amazonaws.com/models.huggingface.co/bert/xlm-roberta-large-sentencepiece.bpe.model from cache at C:\Users\ebb/.cache\torch\transformers\f7e58cf8eef122765ff522a4c7c0805d2fe8871ec58dcb13d0c2764ea3e4a0f3.309f0c29486cffc28e1e40a2ab0ac8f500c203fe080b95f820aa9cb58e5b84ed
[2022-08-01 12:21:13,709 INFO] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-config.json from cache at C:\Users\ebb/.cache\torch\transformers\c22e0b5bbb7c0cb93a87a2ae01263ae715b4c18d692b1740ce72cacaa99ad184.2d28da311092e99a05f9ee17520204614d60b0bfdb32f8a75644df7737b6a748
[2022-08-01 12:21:13,709 INFO] Model config RobertaConfig {
  "architectures": [
    "RobertaForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": 0,
  "eos_token_id": 2,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 1024,
  "initializer_range": 0.02,
  "intermediate_size": 4096,
  "layer_norm_eps": 1e-05,
  "max_position_embeddings": 514,
  "model_type": "roberta",
  "num_attention_heads": 16,
  "num_hidden_layers": 24,
  "pad_token_id": 1,
  "type_vocab_size": 1,
  "vocab_size": 50265
}

[2022-08-01 12:21:14,067 INFO] loading file https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-vocab.json from cache at C:\Users\ebb/.cache\torch\transformers\1ae1f5b6e2b22b25ccc04c000bb79ca847aa226d0761536b011cf7e5868f0655.ef00af9e673c7160b4d41cfda1f48c5f4cba57d5142754525572a846a1ab1b9b
[2022-08-01 12:21:14,067 INFO] loading file https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-merges.txt from cache at C:\Users\ebb/.cache\torch\transformers\f8f83199a6270d582d6245dc100e99c4155de81c9745c6248077018fe01abcfb.70bec105b4158ed9a1747fea67a43f5dee97855c64d62b6ec3742f4cfdb5feda
[2022-08-01 12:21:14,353 INFO] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-config.json from cache at C:\Users\ebb/.cache\torch\transformers\b945b69218e98b3e2c95acf911789741307dec43c698d35fad11c1ae28bda352.9da767be51e1327499df13488672789394e2ca38b877837e52618a67d7002391
[2022-08-01 12:21:14,353 INFO] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "type_vocab_size": 2,
  "vocab_size": 28996
}

[2022-08-01 12:21:14,555 INFO] loading file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-vocab.txt from cache at C:\Users\ebb/.cache\torch\transformers\5e8a2b4893d13790ed4150ca1906be5f7a03d6c4ddf62296c383f6db42814db2.e13dbb970cb325137104fb2e5f36fe865f27746c6b526f6352861b1980eb80b1
2022-08-01 12:21:14,803 /home/yongjiang.jy/.cache/torch/transformers/bert-base-cased 108310272
2022-08-01 12:21:14,803 first
2022-08-01 12:25:13,948 /home/yongjiang.jy/.flair/embeddings/en-xlmr-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner5/roberta-large 355359744
2022-08-01 12:25:13,948 first
2022-08-01 14:19:50,193 /home/yongjiang.jy/.flair/embeddings/lm-jw300-backward-v0.1.pt 43087046
2022-08-01 14:19:50,193 /home/yongjiang.jy/.flair/embeddings/lm-jw300-backward-v0.1.pt is not selected, Skipping
2022-08-01 14:19:50,193 /home/yongjiang.jy/.flair/embeddings/lm-jw300-forward-v0.1.pt 43087046
2022-08-01 14:23:59,382 /home/yongjiang.jy/.flair/embeddings/news-backward-0.4.1.pt 18257500
2022-08-01 14:23:59,382 /home/yongjiang.jy/.flair/embeddings/news-backward-0.4.1.pt is not selected, Skipping
2022-08-01 14:23:59,383 /home/yongjiang.jy/.flair/embeddings/news-forward-0.4.1.pt 18257500
2022-08-01 14:23:59,383 /home/yongjiang.jy/.flair/embeddings/news-forward-0.4.1.pt is not selected, Skipping
2022-08-01 14:23:59,392 /home/yongjiang.jy/.flair/embeddings/xlmr-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner3/xlm-roberta-large 559890432
2022-08-01 14:23:59,393 first
2022-08-01 16:11:26,127 /home/yongjiang.jy/.flair/embeddings/xlnet-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner4/xlnet-large-cased 360268800
2022-08-01 16:11:26,127 first
2022-08-01 16:11:26,127 /home/yongjiang.jy/.flair/embeddings/xlnet-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner4/xlnet-large-cased is not selected, Skipping
2022-08-01 16:11:26,141 bert-base-multilingual-cased 177853440
2022-08-01 16:11:26,141 first
2022-08-01 16:11:27,757 bert-large-cased 333579264
2022-08-01 16:11:27,757 first
2022-08-01 16:11:29,672 elmo-original 0
2022-08-01 16:14:07,954 Finished Embeddings Assignments
2022-08-01 16:14:15,224 10/108
2022-08-01 16:14:19,001 20/108
2022-08-01 16:14:25,966 30/108
2022-08-01 16:14:34,451 40/108
2022-08-01 16:14:44,239 50/108
2022-08-01 16:14:50,761 60/108
2022-08-01 16:14:58,905 70/108
2022-08-01 16:15:04,991 80/108
2022-08-01 16:15:09,529 90/108
2022-08-01 16:15:14,203 100/108
2022-08-01 16:15:17,461 ----------------------------------------------------------------------------------------------------
2022-08-01 16:15:17,461 current corpus: CONLL_03_ENGLISH
2022-08-01 16:15:18,647 /home/yongjiang.jy/.cache/torch/transformers/bert-base-cased 108310272
2022-08-01 16:15:18,647 first
2022-08-01 16:15:20,399 /home/yongjiang.jy/.flair/embeddings/en-xlmr-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner5/roberta-large 355359744
2022-08-01 16:15:20,399 first
2022-08-01 16:15:21,372 /home/yongjiang.jy/.flair/embeddings/lm-jw300-backward-v0.1.pt 43087046
2022-08-01 16:15:21,372 /home/yongjiang.jy/.flair/embeddings/lm-jw300-backward-v0.1.pt is not selected, Skipping
2022-08-01 16:15:21,372 /home/yongjiang.jy/.flair/embeddings/lm-jw300-forward-v0.1.pt 43087046
2022-08-01 16:15:22,428 /home/yongjiang.jy/.flair/embeddings/news-backward-0.4.1.pt 18257500
2022-08-01 16:15:22,428 /home/yongjiang.jy/.flair/embeddings/news-backward-0.4.1.pt is not selected, Skipping
2022-08-01 16:15:22,428 /home/yongjiang.jy/.flair/embeddings/news-forward-0.4.1.pt 18257500
2022-08-01 16:15:22,430 /home/yongjiang.jy/.flair/embeddings/news-forward-0.4.1.pt is not selected, Skipping
2022-08-01 16:15:22,431 /home/yongjiang.jy/.flair/embeddings/xlmr-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner3/xlm-roberta-large 559890432
2022-08-01 16:15:22,431 first
2022-08-01 16:15:23,396 /home/yongjiang.jy/.flair/embeddings/xlnet-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner4/xlnet-large-cased 360268800
2022-08-01 16:15:23,396 first
2022-08-01 16:15:23,396 /home/yongjiang.jy/.flair/embeddings/xlnet-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner4/xlnet-large-cased is not selected, Skipping
2022-08-01 16:15:23,401 bert-base-multilingual-cased 177853440
2022-08-01 16:15:23,401 first
2022-08-01 16:15:24,517 bert-large-cased 333579264
2022-08-01 16:15:24,517 first
2022-08-01 16:15:25,686 elmo-original 0
2022-08-01 16:15:26,707 Finished Embeddings Assignments
2022-08-01 16:15:33,542 10/108
2022-08-01 16:15:37,408 20/108
2022-08-01 16:15:43,965 30/108
2022-08-01 16:15:52,792 40/108
2022-08-01 16:16:02,594 50/108
2022-08-01 16:16:09,335 60/108
2022-08-01 16:16:17,459 70/108
2022-08-01 16:16:23,476 80/108
2022-08-01 16:16:27,939 90/108
2022-08-01 16:16:32,564 100/108
2022-08-01 16:16:35,964 0.9315  0.8883  0.9094
2022-08-01 16:16:35,964 
MICRO_AVG: acc 0.8338 - f1-score 0.9094
MACRO_AVG: acc 0.8032 - f1-score 0.8851
LOC        tp: 1550 - fp: 135 - fn: 118 - tn: 1550 - precision: 0.9199 - recall: 0.9293 - accuracy: 0.8597 - f1-score: 0.9246
MISC       tp: 478 - fp: 95 - fn: 224 - tn: 478 - precision: 0.8342 - recall: 0.6809 - accuracy: 0.5997 - f1-score: 0.7498
ORG        tp: 1426 - fp: 99 - fn: 235 - tn: 1426 - precision: 0.9351 - recall: 0.8585 - accuracy: 0.8102 - f1-score: 0.8952
PER        tp: 1563 - fp: 40 - fn: 54 - tn: 1563 - precision: 0.9750 - recall: 0.9666 - accuracy: 0.9433 - f1-score: 0.9708

@wangxinyu0922
Member

wangxinyu0922 commented Aug 2, 2022

The problem is probably from this list:

['/home/yongjiang.jy/.flair/embeddings/en-xlmr-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner5/roberta-large_v2doc', '/home/yongjiang.jy/.flair/embeddings/xlmr-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner3/xlm-roberta-large_v2doc', '/home/yongjiang.jy/.flair/embeddings/xlnet-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner4/xlnet-large-cased_v2doc', 'C:\\Users\\ebb\\.flair\\embeddings\\lm-jw300-backward-v0.1.pt', 'C:\\Users\\ebb\\.flair\\embeddings\\lm-jw300-forward-v0.1.pt', 'C:\\Users\\ebb\\.flair\\embeddings\\news-backward-0.4.1.pt', 'C:\\Users\\ebb\\.flair\\embeddings\\news-forward-0.4.1.pt', 'Word: en', 'bert-base-cased', 'bert-base-multilingual-cased', 'bert-large-cased', 'elmo-original']

This is the name list of all the embeddings, and it determines the order of the embeddings according to their names. It seems that the flair embeddings use the absolute paths on your system. In your case, you may set the embedding_name of the flair embeddings to something like embedding_name: news-forward-0.4.1.pt.
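
For instance, a rough sketch of those entries in the yaml (assuming en-forward/en-backward correspond to the news-*.pt files and multi-forward/multi-backward to the lm-jw300-*.pt files on your machine):

FlairEmbeddings-0:
  model: en-forward
  embedding_name: news-forward-0.4.1.pt
FlairEmbeddings-1:
  model: en-backward
  embedding_name: news-backward-0.4.1.pt
FlairEmbeddings-2:
  model: multi-forward
  embedding_name: lm-jw300-forward-v0.1.pt
FlairEmbeddings-3:
  model: multi-backward
  embedding_name: lm-jw300-backward-v0.1.pt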

@junwei-h
Author

junwei-h commented Aug 2, 2022

Following your suggestion, I now have this list:

['/home/yongjiang.jy/.flair/embeddings/en-xlmr-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner5/roberta-large_v2doc', 
'/home/yongjiang.jy/.flair/embeddings/xlmr-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner3/xlm-roberta-large_v2doc',
'/home/yongjiang.jy/.flair/embeddings/xlnet-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner4/xlnet-large-cased_v2doc', 
'Word: en', 
'bert-base-cased', 
'bert-base-multilingual-cased', 
'bert-large-cased', 
'elmo-original', 
'lm-jw300-backward-v0.1.pt', 
'lm-jw300-forward-v0.1.pt', 
'news-backward-0.4.1.pt', 
'news-forward-0.4.1.pt']

I got the same F1 score. Is the list in the right order, or should I prepend /home/yongjiang.jy/.flair/embeddings/ to all embedding names? What is the order in your run?

@wangxinyu0922
Member

wangxinyu0922 commented Aug 3, 2022

It seems that I forgot to upload some of the fine-tuned transformer embeddings (and the order of the embeddings is still not correct). I'm collecting them and will upload them later.

@wangxinyu0922
Member

I'm uploading the fine-tuned embeddings for the model. Please download en-xlm-roberta-large.zip, en-roberta-large.zip and en-xlnet-large-cased.zip from OneDrive under fine-tuned models and unzip them. Then change the paths of the embeddings (model:) in config/doc_ner_best.yaml.

@junwei-h
Author

junwei-h commented Aug 4, 2022

Thank you. Is it now in the right order?

2022-08-04 13:21:03,082 Setting embedding mask to the best action: tensor([1., 1., 0., 1., 0., 0., 1., 0., 0., 1., 1., 1.])
['/home/yongjiang.jy/.cache/torch/transformers/bert-base-cased', 
'/home/yongjiang.jy/.flair/embeddings/en-xlmr-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner5/roberta-large_v2doc', 
'/home/yongjiang.jy/.flair/embeddings/lm-jw300-backward-v0.1.pt', 
'/home/yongjiang.jy/.flair/embeddings/lm-jw300-forward-v0.1.pt', 
'/home/yongjiang.jy/.flair/embeddings/news-backward-0.4.1.pt', 
'/home/yongjiang.jy/.flair/embeddings/news-forward-0.4.1.pt', 
'/home/yongjiang.jy/.flair/embeddings/xlmr-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner3/xlm-roberta-large_v2doc', 
'/home/yongjiang.jy/.flair/embeddings/xlnet-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner4/xlnet-large-cased_v2doc', 
'Word: en', 
'bert-base-multilingual-cased', 
'bert-large-cased', 
'elmo-original']

The result is still worse than ACE without document-level features:

2022-08-04 17:08:46,123 0.9335  0.8874  0.9099
2022-08-04 17:08:46,124 
MICRO_AVG: acc 0.8346 - f1-score 0.9099
MACRO_AVG: acc 0.8034 - f1-score 0.884775
LOC        tp: 1553 - fp: 131 - fn: 115 - tn: 1553 - precision: 0.9222 - recall: 0.9311 - accuracy: 0.8633 - f1-score: 0.9266
MISC       tp: 474 - fp: 95 - fn: 228 - tn: 474 - precision: 0.8330 - recall: 0.6752 - accuracy: 0.5947 - f1-score: 0.7458
ORG        tp: 1414 - fp: 94 - fn: 247 - tn: 1414 - precision: 0.9377 - recall: 0.8513 - accuracy: 0.8057 - f1-score: 0.8924
PER        tp: 1571 - fp: 37 - fn: 46 - tn: 1571 - precision: 0.9770 - recall: 0.9716 - accuracy: 0.9498 - f1-score: 0.9743

Why don't you share your order and your result?

@wangxinyu0922
Member

2021-01-13 16:43:05,998 loading file resources/taggers/xlnet-task-docv2_en-xlmr-task-tuned-docv2_en-xlmr-task-docv2_elmo_bert-four-large-pred_bert-four-old-pred_multi-bert-four-pred_word_flair_mflair_150epoch_32batch_0.1lr_800hidden_eng_crf_reinforce_freeze_sentbatch_5patience_nodev_ner4/best-model.pt
2021-01-13 16:43:28,354 Testing using best model ...
2021-01-13 16:43:28,357 Setting embedding mask to the best action: tensor([1., 1., 0., 1., 0., 0., 1., 0., 0., 1., 1., 1.], device='cuda:0')
['/home/yongjiang.jy/.cache/torch/transformers/bert-base-cased', '/home/yongjiang.jy/.flair/embeddings/en-xlmr-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner5/roberta-large', '/home/yongjiang.jy/.flair/embeddings/lm-jw300-backward-v0.1.pt', '/home/yongjiang.jy/.flair/embeddings/lm-jw300-forward-v0.1.pt', '/home/yongjiang.jy/.flair/embeddings/news-backward-0.4.1.pt', '/home/yongjiang.jy/.flair/embeddings/news-forward-0.4.1.pt', '/home/yongjiang.jy/.flair/embeddings/xlmr-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner3/xlm-roberta-large', '/home/yongjiang.jy/.flair/embeddings/xlnet-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner4/xlnet-large-cased', 'Word: en', 'bert-base-multilingual-cased', 'bert-large-cased', 'elmo-original']
2021-01-13 16:43:30,267 /home/yongjiang.jy/.cache/torch/transformers/bert-base-cased 108310272
2021-01-13 16:43:30,267 first
2021-01-13 16:43:32,162 /home/yongjiang.jy/.flair/embeddings/en-xlmr-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner5/roberta-large 355359744
2021-01-13 16:43:32,163 first
2021-01-13 16:43:34,162 /home/yongjiang.jy/.flair/embeddings/lm-jw300-backward-v0.1.pt 43087046
2021-01-13 16:43:34,163 /home/yongjiang.jy/.flair/embeddings/lm-jw300-backward-v0.1.pt is not selected, Skipping
2021-01-13 16:43:34,163 /home/yongjiang.jy/.flair/embeddings/lm-jw300-forward-v0.1.pt 43087046
2021-01-13 16:43:35,884 /home/yongjiang.jy/.flair/embeddings/news-backward-0.4.1.pt 18257500
2021-01-13 16:43:35,884 /home/yongjiang.jy/.flair/embeddings/news-backward-0.4.1.pt is not selected, Skipping
2021-01-13 16:43:35,884 /home/yongjiang.jy/.flair/embeddings/news-forward-0.4.1.pt 18257500
2021-01-13 16:43:35,885 /home/yongjiang.jy/.flair/embeddings/news-forward-0.4.1.pt is not selected, Skipping
2021-01-13 16:43:35,893 /home/yongjiang.jy/.flair/embeddings/xlmr-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner3/xlm-roberta-large 559890432
2021-01-13 16:43:35,893 first
2021-01-13 16:43:39,177 /home/yongjiang.jy/.flair/embeddings/xlnet-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner4/xlnet-large-cased 360268800
2021-01-13 16:43:39,177 first
2021-01-13 16:43:39,177 /home/yongjiang.jy/.flair/embeddings/xlnet-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner4/xlnet-large-cased is not selected, Skipping
2021-01-13 16:43:39,190 bert-base-multilingual-cased 177853440
2021-01-13 16:43:39,191 first
2021-01-13 16:43:48,142 bert-large-cased 333579264
2021-01-13 16:43:48,142 first
2021-01-13 16:44:03,180 elmo-original 0
2021-01-13 16:44:33,009 Finished Embeddings Assignments
2021-01-13 16:44:51,372 10/108
2021-01-13 16:44:59,656 20/108
2021-01-13 16:45:16,240 30/108
2021-01-13 16:45:38,311 40/108
2021-01-13 16:45:57,214 50/108
2021-01-13 16:46:11,119 60/108
2021-01-13 16:46:32,466 70/108
2021-01-13 16:46:45,911 80/108
2021-01-13 16:46:54,667 90/108
2021-01-13 16:47:05,405 100/108
2021-01-13 16:47:11,091 0.9417	0.9497	0.9457
2021-01-13 16:47:11,091 
MICRO_AVG: acc 0.897 - f1-score 0.9457
MACRO_AVG: acc 0.8778 - f1-score 0.932725
LOC        tp: 1584 - fp: 75 - fn: 84 - tn: 1584 - precision: 0.9548 - recall: 0.9496 - accuracy: 0.9088 - f1-score: 0.9522
MISC       tp: 612 - fp: 122 - fn: 90 - tn: 612 - precision: 0.8338 - recall: 0.8718 - accuracy: 0.7427 - f1-score: 0.8524
ORG        tp: 1574 - fp: 113 - fn: 87 - tn: 1574 - precision: 0.9330 - recall: 0.9476 - accuracy: 0.8873 - f1-score: 0.9402
PER        tp: 1594 - fp: 22 - fn: 23 - tn: 1594 - precision: 0.9864 - recall: 0.9858 - accuracy: 0.9725 - f1-score: 0.9861

It's strange. Have you downloaded the embeddings I uploaded?

@junwei-h
Author

junwei-h commented Aug 5, 2022

Yes, the three Transformer embeddings are from OneDrive and saved under resources/.

TransformerWordEmbeddings-0:
  layers: '-1'
  model: resources/en-xlnet-large-cased/xlnet-large-cased # the path to the fine-tuned model
  embedding_name: /home/yongjiang.jy/.flair/embeddings/xlnet-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner4/xlnet-large-cased
  pooling_operation: first
  v2_doc: true
TransformerWordEmbeddings-1:
  layers: '-1'
  model: resources/en-xlm-roberta-large/xlm-roberta-large # the path to the fine-tuned model
  embedding_name: /home/yongjiang.jy/.flair/embeddings/xlmr-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner3/xlm-roberta-large
  pooling_operation: first
  v2_doc: true
TransformerWordEmbeddings-2:
  layers: '-1'
  model: resources/en-roberta-large/roberta-large # the path to the fine-tuned model
  embedding_name: /home/yongjiang.jy/.flair/embeddings/en-xlmr-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner5/roberta-large
  pooling_operation: first
  v2_doc: true

Here is the entire output log. Some models were downloaded from AWS S3 or Hugging Face and saved under .cache\torch\transformers\. Do you see anything wrong?

2022-08-02 09:39:48,397 Reading data from C:\Users\ebb\.flair\datasets\conll_03_english
2022-08-02 09:39:48,397 Train: C:\Users\ebb\.flair\datasets\conll_03_english\train.txt
2022-08-02 09:39:48,397 Dev: C:\Users\ebb\.flair\datasets\conll_03_english\testa.txt
2022-08-02 09:39:48,397 Test: C:\Users\ebb\.flair\datasets\conll_03_english\testb.txt
2022-08-02 09:39:53,510 {b'<unk>': 0, b'O': 1, b'B-PER': 2, b'E-PER': 3, b'S-LOC': 4, b'B-MISC': 5, b'I-MISC': 6, b'E-MISC': 7, b'S-MISC': 8, b'S-PER': 9, b'B-ORG': 10, b'E-ORG': 11, b'S-ORG': 12, b'I-ORG': 13, b'B-LOC': 14, b'E-LOC': 15, b'I-PER': 16, b'I-LOC': 17, b'<START>': 18, b'<STOP>': 19}
2022-08-02 09:39:53,510 Corpus: 14987 train + 3466 dev + 3684 test sentences
C:\Users\ebb\ACE\flair\utils\params.py:104: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  dict_merge.dict_merge(params_dict, yaml.load(f))
[2022-08-02 09:39:55,481 INFO] Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex .
[2022-08-02 09:39:56,550 INFO] instantiating registered subclass relu of <class 'allennlp.nn.activations.Activation'>
[2022-08-02 09:39:56,551 INFO] instantiating registered subclass relu of <class 'allennlp.nn.activations.Activation'>
[2022-08-02 09:39:56,553 INFO] instantiating registered subclass relu of <class 'allennlp.nn.activations.Activation'>
[2022-08-02 09:39:56,554 INFO] instantiating registered subclass relu of <class 'allennlp.nn.activations.Activation'>
[2022-08-02 09:39:56,798 INFO] Initializing ELMo.
[2022-08-02 09:40:06,218 INFO] loading KeyedVectors object from C:\Users\ebb\.flair\embeddings\en-fasttext-news-300d-1M
[2022-08-02 09:40:08,897 INFO] loading vectors from C:\Users\ebb\.flair\embeddings\en-fasttext-news-300d-1M.vectors.npy with mmap=None
[2022-08-02 09:40:09,758 INFO] setting ignored attribute vectors_norm to None
[2022-08-02 09:40:15,288 INFO] KeyedVectors lifecycle event {'fname': 'C:\\Users\\ebb\\.flair\\embeddings\\en-fasttext-news-300d-1M', 'datetime': '2022-08-02T09:40:15.288871', 'gensim': '4.2.0', 'python': '3.7.13 (default, Mar 28 2022, 08:03:21) [MSC v.1916 64 bit (AMD64)]', 'platform': 'Windows-10-10.0.19041-SP0', 'event': 'loaded'}
[2022-08-02 09:40:17,648 INFO] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/xlnet-large-cased-config.json from cache at C:\Users\ebb/.cache\torch\transformers\df92a75c0ebbeb195065fe16fafa54ccd72e8362692cca884303a56788bd4bfc.0163e810fe4bdef52282bd9ddcbded8accbaa97a3ea7d89737ee7ce87511c587
[2022-08-02 09:40:17,648 INFO] Model config XLNetConfig {
  "architectures": [
    "XLNetLMHeadModel"
  ],
  "attn_type": "bi",
  "bi_data": false,
  "bos_token_id": 1,
  "clamp_len": -1,
  "d_head": 64,
  "d_inner": 4096,
  "d_model": 1024,
  "dropout": 0.1,
  "end_n_top": 5,
  "eos_token_id": 2,
  "ff_activation": "gelu",
  "initializer_range": 0.02,
  "layer_norm_eps": 1e-12,
  "mem_len": null,
  "model_type": "xlnet",
  "n_head": 16,
  "n_layer": 24,
  "pad_token_id": 5,
  "reuse_len": null,
  "same_length": false,
  "start_n_top": 5,
  "summary_activation": "tanh",
  "summary_last_dropout": 0.1,
  "summary_type": "last",
  "summary_use_proj": true,
  "task_specific_params": {
    "text-generation": {
      "do_sample": true,
      "max_length": 250
    }
  },
  "untie_r": true,
  "vocab_size": 32000
}

[2022-08-02 09:40:17,866 INFO] loading file https://s3.amazonaws.com/models.huggingface.co/bert/xlnet-large-cased-spiece.model from cache at C:\Users\ebb/.cache\torch\transformers\5b125ba222ff82664771f63cd8fac9696c24b403fc1ab720d537fe2ceaaf0576.8b10bd978b5d01c21303cc761fc9ecd464419b3bf921864a355ba807cfbfafa8
[2022-08-02 09:40:18,131 INFO] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/xlnet-large-cased-config.json from cache at C:\Users\ebb/.cache\torch\transformers\df92a75c0ebbeb195065fe16fafa54ccd72e8362692cca884303a56788bd4bfc.0163e810fe4bdef52282bd9ddcbded8accbaa97a3ea7d89737ee7ce87511c587
[2022-08-02 09:40:18,131 INFO] Model config XLNetConfig {
  "architectures": [
    "XLNetLMHeadModel"
  ],
  "attn_type": "bi",
  "bi_data": false,
  "bos_token_id": 1,
  "clamp_len": -1,
  "d_head": 64,
  "d_inner": 4096,
  "d_model": 1024,
  "dropout": 0.1,
  "end_n_top": 5,
  "eos_token_id": 2,
  "ff_activation": "gelu",
  "initializer_range": 0.02,
  "layer_norm_eps": 1e-12,
  "mem_len": null,
  "model_type": "xlnet",
  "n_head": 16,
  "n_layer": 24,
  "output_hidden_states": true,
  "pad_token_id": 5,
  "reuse_len": null,
  "same_length": false,
  "start_n_top": 5,
  "summary_activation": "tanh",
  "summary_last_dropout": 0.1,
  "summary_type": "last",
  "summary_use_proj": true,
  "task_specific_params": {
    "text-generation": {
      "do_sample": true,
      "max_length": 250
    }
  },
  "untie_r": true,
  "vocab_size": 32000
}

[2022-08-02 09:40:18,467 INFO] loading weights file https://cdn.huggingface.co/xlnet-large-cased-pytorch_model.bin from cache at C:\Users\ebb/.cache\torch\transformers\7fc554c19ef7bc74f1f74603c10156d751d2f99b09e8e38f91ed88e8c9ec6294.db8dc8babedbb75a56c36fca3e02b016e19fd682e79fb1a928e03c2df977cace
[2022-08-02 09:40:23,096 INFO] All model checkpoint weights were used when initializing XLNetModel.

[2022-08-02 09:40:23,096 INFO] All the weights of XLNetModel were initialized from the model checkpoint at xlnet-large-cased.
If your task is similar to the task the model of the ckeckpoint was trained on, you can already use XLNetModel for predictions without further training.
[2022-08-02 09:40:23,387 INFO] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/xlm-roberta-large-config.json from cache at C:\Users\ebb/.cache\torch\transformers\5ac6d3984e5ca7c5227e4821c65d341900125db538c5f09a1ead14f380def4a7.aa59609b4f56f82fa7699f0d47997566ccc4cf07e484f3a7bc883bd7c5a34488
[2022-08-02 09:40:23,387 INFO] Model config XLMRobertaConfig {
  "architectures": [
    "XLMRobertaForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": 0,
  "eos_token_id": 2,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 1024,
  "initializer_range": 0.02,
  "intermediate_size": 4096,
  "layer_norm_eps": 1e-05,
  "max_position_embeddings": 514,
  "model_type": "xlm-roberta",
  "num_attention_heads": 16,
  "num_hidden_layers": 24,
  "output_past": true,
  "pad_token_id": 1,
  "type_vocab_size": 1,
  "vocab_size": 250002
}

[2022-08-02 09:40:23,586 INFO] loading file https://s3.amazonaws.com/models.huggingface.co/bert/xlm-roberta-large-sentencepiece.bpe.model from cache at C:\Users\ebb/.cache\torch\transformers\f7e58cf8eef122765ff522a4c7c0805d2fe8871ec58dcb13d0c2764ea3e4a0f3.309f0c29486cffc28e1e40a2ab0ac8f500c203fe080b95f820aa9cb58e5b84ed
[2022-08-02 09:40:24,246 INFO] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/xlm-roberta-large-config.json from cache at C:\Users\ebb/.cache\torch\transformers\5ac6d3984e5ca7c5227e4821c65d341900125db538c5f09a1ead14f380def4a7.aa59609b4f56f82fa7699f0d47997566ccc4cf07e484f3a7bc883bd7c5a34488
[2022-08-02 09:40:24,246 INFO] Model config XLMRobertaConfig {
  "architectures": [
    "XLMRobertaForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": 0,
  "eos_token_id": 2,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 1024,
  "initializer_range": 0.02,
  "intermediate_size": 4096,
  "layer_norm_eps": 1e-05,
  "max_position_embeddings": 514,
  "model_type": "xlm-roberta",
  "num_attention_heads": 16,
  "num_hidden_layers": 24,
  "output_hidden_states": true,
  "output_past": true,
  "pad_token_id": 1,
  "type_vocab_size": 1,
  "vocab_size": 250002
}

[2022-08-02 09:40:24,476 INFO] loading weights file https://cdn.huggingface.co/xlm-roberta-large-pytorch_model.bin from cache at C:\Users\ebb/.cache\torch\transformers\a89d1c4637c1ea5ecd460c2a7c06a03acc9a961fc8c59aa2dd76d8a7f1e94536.2f41fe28a80f2730715b795242a01fc3dda846a85e7903adb3907dc5c5a498bf
[2022-08-02 09:40:38,765 INFO] All model checkpoint weights were used when initializing XLMRobertaModel.

[2022-08-02 09:40:38,765 INFO] All the weights of XLMRobertaModel were initialized from the model checkpoint at xlm-roberta-large.
If your task is similar to the task the model of the ckeckpoint was trained on, you can already use XLMRobertaModel for predictions without further training.
[2022-08-02 09:40:39,039 INFO] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-config.json from cache at C:\Users\ebb/.cache\torch\transformers\c22e0b5bbb7c0cb93a87a2ae01263ae715b4c18d692b1740ce72cacaa99ad184.2d28da311092e99a05f9ee17520204614d60b0bfdb32f8a75644df7737b6a748
[2022-08-02 09:40:39,047 INFO] Model config RobertaConfig {
  "architectures": [
    "RobertaForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": 0,
  "eos_token_id": 2,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 1024,
  "initializer_range": 0.02,
  "intermediate_size": 4096,
  "layer_norm_eps": 1e-05,
  "max_position_embeddings": 514,
  "model_type": "roberta",
  "num_attention_heads": 16,
  "num_hidden_layers": 24,
  "pad_token_id": 1,
  "type_vocab_size": 1,
  "vocab_size": 50265
}

[2022-08-02 09:40:39,526 INFO] loading file https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-vocab.json from cache at C:\Users\ebb/.cache\torch\transformers\1ae1f5b6e2b22b25ccc04c000bb79ca847aa226d0761536b011cf7e5868f0655.ef00af9e673c7160b4d41cfda1f48c5f4cba57d5142754525572a846a1ab1b9b
[2022-08-02 09:40:39,526 INFO] loading file https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-merges.txt from cache at C:\Users\ebb/.cache\torch\transformers\f8f83199a6270d582d6245dc100e99c4155de81c9745c6248077018fe01abcfb.70bec105b4158ed9a1747fea67a43f5dee97855c64d62b6ec3742f4cfdb5feda
[2022-08-02 09:40:39,858 INFO] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-config.json from cache at C:\Users\ebb/.cache\torch\transformers\c22e0b5bbb7c0cb93a87a2ae01263ae715b4c18d692b1740ce72cacaa99ad184.2d28da311092e99a05f9ee17520204614d60b0bfdb32f8a75644df7737b6a748
[2022-08-02 09:40:39,873 INFO] Model config RobertaConfig {
  "architectures": [
    "RobertaForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": 0,
  "eos_token_id": 2,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 1024,
  "initializer_range": 0.02,
  "intermediate_size": 4096,
  "layer_norm_eps": 1e-05,
  "max_position_embeddings": 514,
  "model_type": "roberta",
  "num_attention_heads": 16,
  "num_hidden_layers": 24,
  "output_hidden_states": true,
  "pad_token_id": 1,
  "type_vocab_size": 1,
  "vocab_size": 50265
}

[2022-08-02 09:40:40,036 INFO] loading weights file https://cdn.huggingface.co/roberta-large-pytorch_model.bin from cache at C:\Users\ebb/.cache\torch\transformers\2339ac1858323405dffff5156947669fed6f63a0c34cfab35bda4f78791893d2.fc7abf72755ecc4a75d0d336a93c1c63358d2334f5998ed326f3b0da380bf536
[2022-08-02 09:40:47,400 INFO] All model checkpoint weights were used when initializing RobertaModel.

[2022-08-02 09:40:47,400 INFO] All the weights of RobertaModel were initialized from the model checkpoint at roberta-large.
If your task is similar to the task the model of the ckeckpoint was trained on, you can already use RobertaModel for predictions without further training.
[2022-08-02 09:40:47,710 INFO] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-config.json from cache at C:\Users\ebb/.cache\torch\transformers\90deb4d9dd705272dc4b3db1364d759d551d72a9f70a91f60e3a1f5e278b985d.9019d8d0ae95e32b896211ae7ae130d7c36bb19ccf35c90a9e51923309458f70
[2022-08-02 09:40:47,710 INFO] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "directionality": "bidi",
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 1024,
  "initializer_range": 0.02,
  "intermediate_size": 4096,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 16,
  "num_hidden_layers": 24,
  "pad_token_id": 0,
  "pooler_fc_size": 768,
  "pooler_num_attention_heads": 12,
  "pooler_num_fc_layers": 3,
  "pooler_size_per_head": 128,
  "pooler_type": "first_token_transform",
  "type_vocab_size": 2,
  "vocab_size": 28996
}

[2022-08-02 09:40:47,915 INFO] loading file https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-vocab.txt from cache at C:\Users\ebb/.cache\torch\transformers\cee054f6aafe5e2cf816d2228704e326446785f940f5451a5b26033516a4ac3d.e13dbb970cb325137104fb2e5f36fe865f27746c6b526f6352861b1980eb80b1
[2022-08-02 09:40:48,149 INFO] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-config.json from cache at C:\Users\ebb/.cache\torch\transformers\90deb4d9dd705272dc4b3db1364d759d551d72a9f70a91f60e3a1f5e278b985d.9019d8d0ae95e32b896211ae7ae130d7c36bb19ccf35c90a9e51923309458f70
[2022-08-02 09:40:48,149 INFO] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "directionality": "bidi",
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 1024,
  "initializer_range": 0.02,
  "intermediate_size": 4096,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 16,
  "num_hidden_layers": 24,
  "output_hidden_states": true,
  "pad_token_id": 0,
  "pooler_fc_size": 768,
  "pooler_num_attention_heads": 12,
  "pooler_num_fc_layers": 3,
  "pooler_size_per_head": 128,
  "pooler_type": "first_token_transform",
  "type_vocab_size": 2,
  "vocab_size": 28996
}

[2022-08-02 09:40:48,365 INFO] loading weights file https://cdn.huggingface.co/bert-large-cased-pytorch_model.bin from cache at C:\Users\ebb/.cache\torch\transformers\5f91c3ab24cfb315cf0be4174a25619f6087eb555acc8ae3a82edfff7f705138.b5f1c2070e0a0c189ca3b08270b0cb5bd0635b7319e74e93bd0dc26689953c27
[2022-08-02 09:40:53,287 INFO] All model checkpoint weights were used when initializing BertModel.

[2022-08-02 09:40:53,287 INFO] All the weights of BertModel were initialized from the model checkpoint at bert-large-cased.
If your task is similar to the task the model of the checkpoint was trained on, you can already use BertModel for predictions without further training.
[2022-08-02 09:40:53,577 INFO] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-config.json from cache at C:\Users\ebb/.cache\torch\transformers\b945b69218e98b3e2c95acf911789741307dec43c698d35fad11c1ae28bda352.9da767be51e1327499df13488672789394e2ca38b877837e52618a67d7002391
[2022-08-02 09:40:53,577 INFO] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "type_vocab_size": 2,
  "vocab_size": 28996
}

[2022-08-02 09:40:53,765 INFO] loading file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-vocab.txt from cache at C:\Users\ebb/.cache\torch\transformers\5e8a2b4893d13790ed4150ca1906be5f7a03d6c4ddf62296c383f6db42814db2.e13dbb970cb325137104fb2e5f36fe865f27746c6b526f6352861b1980eb80b1
[2022-08-02 09:40:54,096 INFO] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-config.json from cache at C:\Users\ebb/.cache\torch\transformers\b945b69218e98b3e2c95acf911789741307dec43c698d35fad11c1ae28bda352.9da767be51e1327499df13488672789394e2ca38b877837e52618a67d7002391
[2022-08-02 09:40:54,096 INFO] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "output_hidden_states": true,
  "pad_token_id": 0,
  "type_vocab_size": 2,
  "vocab_size": 28996
}

[2022-08-02 09:40:54,324 INFO] loading weights file https://cdn.huggingface.co/bert-base-cased-pytorch_model.bin from cache at C:\Users\ebb/.cache\torch\transformers\d8f11f061e407be64c4d5d7867ee61d1465263e24085cfa26abf183fdc830569.3fadbea36527ae472139fe84cddaa65454d7429f12d543d80bfc3ad70de55ac2
[2022-08-02 09:40:55,939 INFO] All model checkpoint weights were used when initializing BertModel.

[2022-08-02 09:40:55,939 INFO] All the weights of BertModel were initialized from the model checkpoint at bert-base-cased.
If your task is similar to the task the model of the checkpoint was trained on, you can already use BertModel for predictions without further training.
[2022-08-02 09:40:56,160 INFO] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-multilingual-cased-config.json from cache at C:\Users\ebb/.cache\torch\transformers\45629519f3117b89d89fd9c740073d8e4c1f0a70f9842476185100a8afe715d1.65df3cef028a0c91a7b059e4c404a975ebe6843c71267b67019c0e9cfa8a88f0
[2022-08-02 09:40:56,160 INFO] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "directionality": "bidi",
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "pooler_fc_size": 768,
  "pooler_num_attention_heads": 12,
  "pooler_num_fc_layers": 3,
  "pooler_size_per_head": 128,
  "pooler_type": "first_token_transform",
  "type_vocab_size": 2,
  "vocab_size": 119547
}

[2022-08-02 09:40:56,339 INFO] loading file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-multilingual-cased-vocab.txt from cache at C:\Users\ebb/.cache\torch\transformers\96435fa287fbf7e469185f1062386e05a075cadbf6838b74da22bf64b080bc32.99bcd55fc66f4f3360bc49ba472b940b8dcf223ea6a345deb969d607ca900729
[2022-08-02 09:40:56,639 INFO] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-multilingual-cased-config.json from cache at C:\Users\ebb/.cache\torch\transformers\45629519f3117b89d89fd9c740073d8e4c1f0a70f9842476185100a8afe715d1.65df3cef028a0c91a7b059e4c404a975ebe6843c71267b67019c0e9cfa8a88f0
[2022-08-02 09:40:56,639 INFO] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "directionality": "bidi",
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "output_hidden_states": true,
  "pad_token_id": 0,
  "pooler_fc_size": 768,
  "pooler_num_attention_heads": 12,
  "pooler_num_fc_layers": 3,
  "pooler_size_per_head": 128,
  "pooler_type": "first_token_transform",
  "type_vocab_size": 2,
  "vocab_size": 119547
}

[2022-08-02 09:40:56,861 INFO] loading weights file https://cdn.huggingface.co/bert-base-multilingual-cased-pytorch_model.bin from cache at C:\Users\ebb/.cache\torch\transformers\3d1d2b2daef1e2b3ddc2180ddaae8b7a37d5f279babce0068361f71cd548f615.7131dcb754361639a7d5526985f880879c9bfd144b65a0bf50590bddb7de9059
[2022-08-02 09:40:59,505 INFO] All model checkpoint weights were used when initializing BertModel.

[2022-08-02 09:40:59,505 INFO] All the weights of BertModel were initialized from the model checkpoint at bert-base-multilingual-cased.
If your task is similar to the task the model of the checkpoint was trained on, you can already use BertModel for predictions without further training.
2022-08-02 09:41:00,242 Model Size: 2191057164
Corpus: 14041 train + 3250 dev + 3453 test sentences
2022-08-02 09:41:00,307 ----------------------------------------------------------------------------------------------------
2022-08-02 09:41:00,313 loading file resources\taggers\xlnet-task-docv2_en-xlmr-task-tuned-docv2\best-model.pt
[2022-08-02 09:41:01,321 INFO] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-config.json from cache at C:\Users\ebb/.cache\torch\transformers\90deb4d9dd705272dc4b3db1364d759d551d72a9f70a91f60e3a1f5e278b985d.9019d8d0ae95e32b896211ae7ae130d7c36bb19ccf35c90a9e51923309458f70
[2022-08-02 09:41:01,321 INFO] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "directionality": "bidi",
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 1024,
  "initializer_range": 0.02,
  "intermediate_size": 4096,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 16,
  "num_hidden_layers": 24,
  "pad_token_id": 0,
  "pooler_fc_size": 768,
  "pooler_num_attention_heads": 12,
  "pooler_num_fc_layers": 3,
  "pooler_size_per_head": 128,
  "pooler_type": "first_token_transform",
  "type_vocab_size": 2,
  "vocab_size": 28996
}

[2022-08-02 09:41:01,562 INFO] loading file https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-vocab.txt from cache at C:\Users\ebb/.cache\torch\transformers\cee054f6aafe5e2cf816d2228704e326446785f940f5451a5b26033516a4ac3d.e13dbb970cb325137104fb2e5f36fe865f27746c6b526f6352861b1980eb80b1
[2022-08-02 09:41:01,967 INFO] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-multilingual-cased-config.json from cache at C:\Users\ebb/.cache\torch\transformers\45629519f3117b89d89fd9c740073d8e4c1f0a70f9842476185100a8afe715d1.65df3cef028a0c91a7b059e4c404a975ebe6843c71267b67019c0e9cfa8a88f0
[2022-08-02 09:41:01,967 INFO] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "directionality": "bidi",
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "pooler_fc_size": 768,
  "pooler_num_attention_heads": 12,
  "pooler_num_fc_layers": 3,
  "pooler_size_per_head": 128,
  "pooler_type": "first_token_transform",
  "type_vocab_size": 2,
  "vocab_size": 119547
}

[2022-08-02 09:41:02,137 INFO] loading file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-multilingual-cased-vocab.txt from cache at C:\Users\ebb/.cache\torch\transformers\96435fa287fbf7e469185f1062386e05a075cadbf6838b74da22bf64b080bc32.99bcd55fc66f4f3360bc49ba472b940b8dcf223ea6a345deb969d607ca900729
2022-08-02 09:41:09,908 Testing using best model ...
2022-08-02 09:41:09,915 Setting embedding mask to the best action: tensor([1., 1., 0., 1., 0., 0., 1., 0., 0., 1., 1., 1.])
[2022-08-02 09:41:10,228 INFO] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/xlnet-large-cased-config.json from cache at C:\Users\ebb/.cache\torch\transformers\df92a75c0ebbeb195065fe16fafa54ccd72e8362692cca884303a56788bd4bfc.0163e810fe4bdef52282bd9ddcbded8accbaa97a3ea7d89737ee7ce87511c587
[2022-08-02 09:41:10,228 INFO] Model config XLNetConfig {
  "architectures": [
    "XLNetLMHeadModel"
  ],
  "attn_type": "bi",
  "bi_data": false,
  "bos_token_id": 1,
  "clamp_len": -1,
  "d_head": 64,
  "d_inner": 4096,
  "d_model": 1024,
  "dropout": 0.1,
  "end_n_top": 5,
  "eos_token_id": 2,
  "ff_activation": "gelu",
  "initializer_range": 0.02,
  "layer_norm_eps": 1e-12,
  "mem_len": null,
  "model_type": "xlnet",
  "n_head": 16,
  "n_layer": 24,
  "pad_token_id": 5,
  "reuse_len": null,
  "same_length": false,
  "start_n_top": 5,
  "summary_activation": "tanh",
  "summary_last_dropout": 0.1,
  "summary_type": "last",
  "summary_use_proj": true,
  "task_specific_params": {
    "text-generation": {
      "do_sample": true,
      "max_length": 250
    }
  },
  "untie_r": true,
  "vocab_size": 32000
}

[2022-08-02 09:41:10,412 INFO] loading file https://s3.amazonaws.com/models.huggingface.co/bert/xlnet-large-cased-spiece.model from cache at C:\Users\ebb/.cache\torch\transformers\5b125ba222ff82664771f63cd8fac9696c24b403fc1ab720d537fe2ceaaf0576.8b10bd978b5d01c21303cc761fc9ecd464419b3bf921864a355ba807cfbfafa8
[2022-08-02 09:41:10,659 INFO] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/xlm-roberta-large-config.json from cache at C:\Users\ebb/.cache\torch\transformers\5ac6d3984e5ca7c5227e4821c65d341900125db538c5f09a1ead14f380def4a7.aa59609b4f56f82fa7699f0d47997566ccc4cf07e484f3a7bc883bd7c5a34488
[2022-08-02 09:41:10,659 INFO] Model config XLMRobertaConfig {
  "architectures": [
    "XLMRobertaForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": 0,
  "eos_token_id": 2,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 1024,
  "initializer_range": 0.02,
  "intermediate_size": 4096,
  "layer_norm_eps": 1e-05,
  "max_position_embeddings": 514,
  "model_type": "xlm-roberta",
  "num_attention_heads": 16,
  "num_hidden_layers": 24,
  "output_past": true,
  "pad_token_id": 1,
  "type_vocab_size": 1,
  "vocab_size": 250002
}

[2022-08-02 09:41:10,866 INFO] loading file https://s3.amazonaws.com/models.huggingface.co/bert/xlm-roberta-large-sentencepiece.bpe.model from cache at C:\Users\ebb/.cache\torch\transformers\f7e58cf8eef122765ff522a4c7c0805d2fe8871ec58dcb13d0c2764ea3e4a0f3.309f0c29486cffc28e1e40a2ab0ac8f500c203fe080b95f820aa9cb58e5b84ed
[2022-08-02 09:41:11,583 INFO] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-config.json from cache at C:\Users\ebb/.cache\torch\transformers\c22e0b5bbb7c0cb93a87a2ae01263ae715b4c18d692b1740ce72cacaa99ad184.2d28da311092e99a05f9ee17520204614d60b0bfdb32f8a75644df7737b6a748
[2022-08-02 09:41:11,583 INFO] Model config RobertaConfig {
  "architectures": [
    "RobertaForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": 0,
  "eos_token_id": 2,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 1024,
  "initializer_range": 0.02,
  "intermediate_size": 4096,
  "layer_norm_eps": 1e-05,
  "max_position_embeddings": 514,
  "model_type": "roberta",
  "num_attention_heads": 16,
  "num_hidden_layers": 24,
  "pad_token_id": 1,
  "type_vocab_size": 1,
  "vocab_size": 50265
}

[2022-08-02 09:41:11,968 INFO] loading file https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-vocab.json from cache at C:\Users\ebb/.cache\torch\transformers\1ae1f5b6e2b22b25ccc04c000bb79ca847aa226d0761536b011cf7e5868f0655.ef00af9e673c7160b4d41cfda1f48c5f4cba57d5142754525572a846a1ab1b9b
[2022-08-02 09:41:11,968 INFO] loading file https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-merges.txt from cache at C:\Users\ebb/.cache\torch\transformers\f8f83199a6270d582d6245dc100e99c4155de81c9745c6248077018fe01abcfb.70bec105b4158ed9a1747fea67a43f5dee97855c64d62b6ec3742f4cfdb5feda
[2022-08-02 09:41:12,237 INFO] loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-config.json from cache at C:\Users\ebb/.cache\torch\transformers\b945b69218e98b3e2c95acf911789741307dec43c698d35fad11c1ae28bda352.9da767be51e1327499df13488672789394e2ca38b877837e52618a67d7002391
[2022-08-02 09:41:12,237 INFO] Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "type_vocab_size": 2,
  "vocab_size": 28996
}

[2022-08-02 09:41:12,438 INFO] loading file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-vocab.txt from cache at C:\Users\ebb/.cache\torch\transformers\5e8a2b4893d13790ed4150ca1906be5f7a03d6c4ddf62296c383f6db42814db2.e13dbb970cb325137104fb2e5f36fe865f27746c6b526f6352861b1980eb80b1
['/home/yongjiang.jy/.cache/torch/transformers/bert-base-cased', '/home/yongjiang.jy/.cache/torch/transformers/bert-base-multilingual-cased', '/home/yongjiang.jy/.cache/torch/transformers/bert-large-cased', '/home/yongjiang.jy/.flair/embeddings/en-xlmr-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner5/roberta-large_v2doc', '/home/yongjiang.jy/.flair/embeddings/lm-jw300-backward-v0.1.pt', '/home/yongjiang.jy/.flair/embeddings/lm-jw300-forward-v0.1.pt', '/home/yongjiang.jy/.flair/embeddings/news-backward-0.4.1.pt', '/home/yongjiang.jy/.flair/embeddings/news-forward-0.4.1.pt', '/home/yongjiang.jy/.flair/embeddings/xlmr-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner3/xlm-roberta-large_v2doc', '/home/yongjiang.jy/.flair/embeddings/xlnet-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner4/xlnet-large-cased_v2doc', 'Word: en', 'elmo-original']
2022-08-02 09:41:12,653 /home/yongjiang.jy/.cache/torch/transformers/bert-base-cased 108310272
2022-08-02 09:41:12,653 first
2022-08-02 09:44:21,981 /home/yongjiang.jy/.flair/embeddings/en-xlmr-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner5/roberta-large 355359744
2022-08-02 09:44:21,981 first
2022-08-02 11:27:15,861 /home/yongjiang.jy/.flair/embeddings/lm-jw300-backward-v0.1.pt 43087046
2022-08-02 11:27:15,872 /home/yongjiang.jy/.flair/embeddings/lm-jw300-backward-v0.1.pt is not selected, Skipping
2022-08-02 11:27:15,872 /home/yongjiang.jy/.flair/embeddings/lm-jw300-forward-v0.1.pt 43087046
2022-08-02 11:31:08,328 /home/yongjiang.jy/.flair/embeddings/news-backward-0.4.1.pt 18257500
2022-08-02 11:31:08,328 /home/yongjiang.jy/.flair/embeddings/news-backward-0.4.1.pt is not selected, Skipping
2022-08-02 11:31:08,330 /home/yongjiang.jy/.flair/embeddings/news-forward-0.4.1.pt 18257500
2022-08-02 11:31:08,330 /home/yongjiang.jy/.flair/embeddings/news-forward-0.4.1.pt is not selected, Skipping
2022-08-02 11:31:08,340 /home/yongjiang.jy/.flair/embeddings/xlmr-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner3/xlm-roberta-large 559890432
2022-08-02 11:31:08,340 first
2022-08-02 13:17:38,173 /home/yongjiang.jy/.flair/embeddings/xlnet-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner4/xlnet-large-cased 360268800
2022-08-02 13:17:38,173 first
2022-08-02 13:17:38,173 /home/yongjiang.jy/.flair/embeddings/xlnet-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner4/xlnet-large-cased is not selected, Skipping
2022-08-02 13:17:38,181 bert-base-multilingual-cased 177853440
2022-08-02 13:17:38,181 first
2022-08-02 13:20:22,800 bert-large-cased 333579264
2022-08-02 13:20:22,800 first
2022-08-02 13:28:35,785 elmo-original 0
2022-08-02 13:30:49,374 Finished Embeddings Assignments
2022-08-02 13:30:56,081 10/108
2022-08-02 13:30:59,509 20/108
2022-08-02 13:31:05,647 30/108
2022-08-02 13:31:13,446 40/108
2022-08-02 13:31:26,921 50/108
2022-08-02 13:31:34,043 60/108
2022-08-02 13:31:43,577 70/108
2022-08-02 13:31:49,945 80/108
2022-08-02 13:31:54,605 90/108
2022-08-02 13:31:59,280 100/108
2022-08-02 13:32:03,319 0.9285	0.8831	0.9052
2022-08-02 13:32:03,319 
MICRO_AVG: acc 0.8269 - f1-score 0.9052
MACRO_AVG: acc 0.7963 - f1-score 0.880675
LOC        tp: 1554 - fp: 151 - fn: 114 - tn: 1554 - precision: 0.9114 - recall: 0.9317 - accuracy: 0.8543 - f1-score: 0.9214
MISC       tp: 473 - fp: 96 - fn: 229 - tn: 473 - precision: 0.8313 - recall: 0.6738 - accuracy: 0.5927 - f1-score: 0.7443
ORG        tp: 1407 - fp: 98 - fn: 254 - tn: 1407 - precision: 0.9349 - recall: 0.8471 - accuracy: 0.7999 - f1-score: 0.8888
PER        tp: 1554 - fp: 39 - fn: 63 - tn: 1554 - precision: 0.9755 - recall: 0.9610 - accuracy: 0.9384 - f1-score: 0.9682
2022-08-02 13:32:03,320 ----------------------------------------------------------------------------------------------------
2022-08-02 13:32:03,320 ----------------------------------------------------------------------------------------------------
2022-08-02 13:32:03,320 current corpus: CONLL_03_ENGLISH
2022-08-02 13:32:04,666 /home/yongjiang.jy/.cache/torch/transformers/bert-base-cased 108310272
2022-08-02 13:32:04,666 first
2022-08-02 13:32:06,368 /home/yongjiang.jy/.flair/embeddings/en-xlmr-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner5/roberta-large 355359744
2022-08-02 13:32:06,368 first
2022-08-02 13:32:07,593 /home/yongjiang.jy/.flair/embeddings/lm-jw300-backward-v0.1.pt 43087046
2022-08-02 13:32:07,594 /home/yongjiang.jy/.flair/embeddings/lm-jw300-backward-v0.1.pt is not selected, Skipping
2022-08-02 13:32:07,594 /home/yongjiang.jy/.flair/embeddings/lm-jw300-forward-v0.1.pt 43087046
2022-08-02 13:32:08,894 /home/yongjiang.jy/.flair/embeddings/news-backward-0.4.1.pt 18257500
2022-08-02 13:32:08,894 /home/yongjiang.jy/.flair/embeddings/news-backward-0.4.1.pt is not selected, Skipping
2022-08-02 13:32:08,894 /home/yongjiang.jy/.flair/embeddings/news-forward-0.4.1.pt 18257500
2022-08-02 13:32:08,895 /home/yongjiang.jy/.flair/embeddings/news-forward-0.4.1.pt is not selected, Skipping
2022-08-02 13:32:08,903 /home/yongjiang.jy/.flair/embeddings/xlmr-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner3/xlm-roberta-large 559890432
2022-08-02 13:32:08,904 first
2022-08-02 13:32:09,819 /home/yongjiang.jy/.flair/embeddings/xlnet-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner4/xlnet-large-cased 360268800
2022-08-02 13:32:09,819 first
2022-08-02 13:32:09,819 /home/yongjiang.jy/.flair/embeddings/xlnet-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner4/xlnet-large-cased is not selected, Skipping
2022-08-02 13:32:09,822 bert-base-multilingual-cased 177853440
2022-08-02 13:32:09,823 first
2022-08-02 13:32:10,863 bert-large-cased 333579264
2022-08-02 13:32:10,863 first
2022-08-02 13:32:11,970 elmo-original 0
2022-08-02 13:32:13,072 Finished Embeddings Assignments
2022-08-02 13:32:20,355 10/108
2022-08-02 13:32:24,289 20/108
2022-08-02 13:32:31,113 30/108
2022-08-02 13:32:39,760 40/108
2022-08-02 13:32:49,477 50/108
2022-08-02 13:32:56,743 60/108
2022-08-02 13:33:05,407 70/108
2022-08-02 13:33:11,596 80/108
2022-08-02 13:33:16,283 90/108
2022-08-02 13:33:21,167 100/108
2022-08-02 13:33:24,807 0.9285	0.8831	0.9052
2022-08-02 13:33:24,807 
MICRO_AVG: acc 0.8269 - f1-score 0.9052
MACRO_AVG: acc 0.7963 - f1-score 0.880675
LOC        tp: 1554 - fp: 151 - fn: 114 - tn: 1554 - precision: 0.9114 - recall: 0.9317 - accuracy: 0.8543 - f1-score: 0.9214
MISC       tp: 473 - fp: 96 - fn: 229 - tn: 473 - precision: 0.8313 - recall: 0.6738 - accuracy: 0.5927 - f1-score: 0.7443
ORG        tp: 1407 - fp: 98 - fn: 254 - tn: 1407 - precision: 0.9349 - recall: 0.8471 - accuracy: 0.7999 - f1-score: 0.8888
PER        tp: 1554 - fp: 39 - fn: 63 - tn: 1554 - precision: 0.9755 - recall: 0.9610 - accuracy: 0.9384 - f1-score: 0.9682
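
For context on the log above: the tensor printed at "Setting embedding mask to the best action" is ACE's learned binary selection over the twelve candidate embeddings listed later in the log, and the "... is not selected, Skipping" lines correspond to its zero entries. Below is a minimal sketch of that gating idea; the candidate names, dimensions, and ordering are purely illustrative and not the repository's actual code.

```python
import torch

# Hypothetical candidate pool; the real pool and its ordering are internal to ACE.
candidates = ["embedding_%d" % i for i in range(12)]
mask = torch.tensor([1., 1., 0., 1., 0., 0., 1., 0., 0., 1., 1., 1.])

# Pretend each embedder produced a (seq_len, dim) tensor for one sentence.
outputs = {name: torch.randn(7, 16) for name in candidates}

selected = [n for n, keep in zip(candidates, mask.tolist()) if keep > 0.5]
skipped  = [n for n, keep in zip(candidates, mask.tolist()) if keep <= 0.5]

# Only the selected embeddings are computed and concatenated as tagger input;
# the others show up in the log as "... is not selected, Skipping".
features = torch.cat([outputs[n] for n in selected], dim=-1)

print(selected)        # the 8 candidates with a 1 in the mask
print(skipped)         # the 4 candidates with a 0
print(features.shape)  # torch.Size([7, 128]) with these toy dimensions
```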

@wangxinyu0922
Member

Let me look into this problem more deeply. It may take a few days.

@wangxinyu0922
Member

Hi, I have fixed this problem; you may test the model again. It was caused by do_lower_case=True being unexpectedly applied when reloading the transformer tokenizers.
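
For anyone hitting the same gap, here is a minimal illustration (not the repository's code) of why a stray do_lower_case=True is so damaging for cased checkpoints such as bert-base-cased: lowercasing at tokenization time discards the capitalization cues the model was trained on and changes the subword sequences it sees.

```python
from transformers import BertTokenizer

sentence = "Germany beat Brazil in Berlin"

cased   = BertTokenizer.from_pretrained("bert-base-cased", do_lower_case=False)
lowered = BertTokenizer.from_pretrained("bert-base-cased", do_lower_case=True)

# The same sentence tokenizes differently once lowercasing is applied, so the
# cased checkpoint receives inputs it was never trained on and loses the
# capitalization signal that marks entity mentions.
print(cased.tokenize(sentence))
print(lowered.tokenize(sentence))
print(cased.tokenize(sentence) == lowered.tokenize(sentence))  # False
```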

2022-08-17 22:10:31,725 Finished Embeddings Assignments
2022-08-17 22:10:34,380 10/108
2022-08-17 22:10:35,861 20/108
2022-08-17 22:10:38,278 30/108
2022-08-17 22:10:41,239 40/108
2022-08-17 22:10:44,537 50/108
2022-08-17 22:10:47,018 60/108
2022-08-17 22:10:49,926 70/108
2022-08-17 22:10:52,097 80/108
2022-08-17 22:10:53,792 90/108
2022-08-17 22:10:55,546 100/108
2022-08-17 22:10:56,821 0.9415  0.9495  0.9455
2022-08-17 22:10:56,821
MICRO_AVG: acc 0.8967 - f1-score 0.9455
MACRO_AVG: acc 0.8776 - f1-score 0.932575
LOC        tp: 1584 - fp: 76 - fn: 84 - tn: 1584 - precision: 0.9542 - recall: 0.9496 - accuracy: 0.9083 - f1-score: 0.9519
MISC       tp: 612 - fp: 122 - fn: 90 - tn: 612 - precision: 0.8338 - recall: 0.8718 - accuracy: 0.7427 - f1-score: 0.8524
ORG        tp: 1573 - fp: 113 - fn: 88 - tn: 1573 - precision: 0.9330 - recall: 0.9470 - accuracy: 0.8867 - f1-score: 0.9399
PER        tp: 1594 - fp: 22 - fn: 23 - tn: 1594 - precision: 0.9864 - recall: 0.9858 - accuracy: 0.9725 - f1-score: 0.9861

@junwei-h
Author

Thank you very much. I have replicated your result:

2022-08-18 17:45:37,725 0.9415	0.9495	0.9455
2022-08-18 17:45:37,725 
MICRO_AVG: acc 0.8967 - f1-score 0.9455
MACRO_AVG: acc 0.8776 - f1-score 0.932575
LOC        tp: 1584 - fp: 76 - fn: 84 - tn: 1584 - precision: 0.9542 - recall: 0.9496 - accuracy: 0.9083 - f1-score: 0.9519
MISC       tp: 612 - fp: 122 - fn: 90 - tn: 612 - precision: 0.8338 - recall: 0.8718 - accuracy: 0.7427 - f1-score: 0.8524
ORG        tp: 1573 - fp: 113 - fn: 88 - tn: 1573 - precision: 0.9330 - recall: 0.9470 - accuracy: 0.8867 - f1-score: 0.9399
PER        tp: 1594 - fp: 22 - fn: 23 - tn: 1594 - precision: 0.9864 - recall: 0.9858 - accuracy: 0.9725 - f1-score: 0.9861
