Test result comparison without vs with document-level features #42
It's strange. Could you provide your modified config file and your NER data for the experiment?
Here is the doc_ner_best.yaml content.
Since I shortened the embedding name, I also made modifications
at https://github.com/Alibaba-NLP/ACE/blob/main/flair/embeddings.py#L2943 and
at https://github.com/Alibaba-NLP/ACE/blob/main/flair/embeddings.py#L3148 to get the code working. Thanks. I used '--test', so I assume it is using the conll2003 corpus as NER data.
To use my pretrained model, the embedding names cannot be modified. The code sorts the embeddings by their names so that the order of the embeddings is always the same (this is quite important in ACE). You may change the embedding names back to those in the original yaml file. The code reads the model parameters according to the embedding names. By the way, if you want to train your own model, you can design your own config.
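As a rough illustration of the mechanism described above (the helper `sort_embeddings` and the sample names are hypothetical, not taken from the ACE codebase), sorting embeddings lexicographically by name gives a deterministic order:

```python
# Hypothetical sketch: deterministic embedding ordering by name.
# Neither the function nor the sample names come from the ACE codebase.
def sort_embeddings(embeddings: dict) -> list:
    """Return embedding objects in a fixed, name-sorted order."""
    return [embeddings[name] for name in sorted(embeddings)]

embeddings = {'elmo-original': 'emb_a', 'bert-base-cased': 'emb_b'}
ordered = sort_embeddings(embeddings)
print(ordered)  # 'bert-base-cased' sorts before 'elmo-original'
```

Because the pretrained model's parameters are matched up against this sorted order, renaming an embedding silently shifts every embedding that sorts after it.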
I have this error using the default embedding names; this is why I shortened them. Any suggestions? Thank you.
EDIT: actually, the same error occurs regardless of whether I shorten the embedding name in the yaml file or not. The name seems to be hard-coded somewhere. Have to add
This is caused by my modifications at #41. I added some additional modifications in the
Thanks for your comments. I will uncomment it.

2022-08-01 16:15:23,401 bert-base-multilingual-cased 177853440
The F1 score does not change compared with your initial run. Could you show a longer output line for the testing? I think the code will print the names and the order of the embeddings.
Yes, the result is identical to the original run. I printed it on screen. The following is the longest output I can see now. Let me know if you need earlier logs.
The problem is probably from this list: ['/home/yongjiang.jy/.flair/embeddings/en-xlmr-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner5/roberta-large_v2doc', '/home/yongjiang.jy/.flair/embeddings/xlmr-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner3/xlm-roberta-large_v2doc', '/home/yongjiang.jy/.flair/embeddings/xlnet-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner4/xlnet-large-cased_v2doc', 'C:\\Users\\ebb\\.flair\\embeddings\\lm-jw300-backward-v0.1.pt', 'C:\\Users\\ebb\\.flair\\embeddings\\lm-jw300-forward-v0.1.pt', 'C:\\Users\\ebb\\.flair\\embeddings\\news-backward-0.4.1.pt', 'C:\\Users\\ebb\\.flair\\embeddings\\news-forward-0.4.1.pt', 'Word: en', 'bert-base-cased', 'bert-base-multilingual-cased', 'bert-large-cased', 'elmo-original'] This is the name list of all the embeddings, and it decides the order of the embeddings according to the embedding names. It seems that the flair embedding names are direct paths in your system. In your case, you may specify the
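To illustrate why path-style names disturb the ordering (a hypothetical example, not code from ACE): Python's lexicographic sort places absolute paths, whose first characters are `/` or an uppercase drive letter, ahead of lowercase model names:

```python
# Hypothetical illustration: lexicographic sorting of embedding names.
# In ASCII, '/' (47) and 'C' (67) come before lowercase letters,
# so filesystem paths sort ahead of plain model names.
path_names = [
    'bert-base-cased',
    'C:\\Users\\ebb\\.flair\\embeddings\\news-forward-0.4.1.pt',
]
short_names = ['bert-base-cased', 'news-forward-0.4.1.pt']

print(sorted(path_names)[0])   # the Windows path comes first
print(sorted(short_names)[0])  # 'bert-base-cased' comes first
```

So if the training machine used one set of paths and the testing machine uses another, the sorted order (and hence the parameter-to-embedding assignment) can differ between the two runs.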
Following your suggestion, I have this list:
I got the same F1 score. Is the list in the right order, or should I prepend
It seems that I forgot to upload some of the fine-tuned transformer embeddings (though the order of embeddings is still not correct now). I'm collecting them and will upload them later.
I'm uploading the fine-tuned embeddings for the model. Please download
Thank you. Is it now in the right order?
The output is worse:
Why don't you share your order and your result?
It's strange. Have you downloaded my uploaded embeddings?
Yes, the 3 Transformer Embeddings are from OneDrive and saved under resources.
Here is the entire output log. Some models were downloaded from s3 on AWS or huggingface and saved under
Let me look into this problem more deeply. It may take a few days.
Hi, I have fixed this problem. You may test the model again. The problem comes from the unexpected
Thank you very much. I have replicated your result.
Sorry to raise another issue. I observed that ACE+doc is worse:

ACE+doc (`python .\train.py --config .\config\doc_ner_best.yaml --test`):

```
MICRO_AVG: acc 0.8338 - f1-score 0.9094
MACRO_AVG: acc 0.8032 - f1-score 0.8851
LOC  tp: 1550 - fp: 135 - fn: 118 - tn: 1550 - precision: 0.9199 - recall: 0.9293 - accuracy: 0.8597 - f1-score: 0.9246
MISC tp: 478 - fp: 95 - fn: 224 - tn: 478 - precision: 0.8342 - recall: 0.6809 - accuracy: 0.5997 - f1-score: 0.7498
ORG  tp: 1426 - fp: 99 - fn: 235 - tn: 1426 - precision: 0.9351 - recall: 0.8585 - accuracy: 0.8102 - f1-score: 0.8952
PER  tp: 1563 - fp: 40 - fn: 54 - tn: 1563 - precision: 0.9750 - recall: 0.9666 - accuracy: 0.9433 - f1-score: 0.9708
```
ACE (`python .\train.py --config .\config\conll_03_english.yaml --test`):

```
MICRO_AVG: acc 0.8807 - f1-score 0.9366
MACRO_AVG: acc 0.8635 - f1-score 0.9248
LOC  tp: 1580 - fp: 90 - fn: 88 - tn: 1580 - precision: 0.9461 - recall: 0.9472 - accuracy: 0.8987 - f1-score: 0.9466
MISC tp: 606 - fp: 115 - fn: 96 - tn: 606 - precision: 0.8405 - recall: 0.8632 - accuracy: 0.7417 - f1-score: 0.8517
ORG  tp: 1561 - fp: 159 - fn: 100 - tn: 1561 - precision: 0.9076 - recall: 0.9398 - accuracy: 0.8577 - f1-score: 0.9234
PER  tp: 1575 - fp: 31 - fn: 42 - tn: 1575 - precision: 0.9807 - recall: 0.9740 - accuracy: 0.9557 - f1-score: 0.9773
```
Did I miss anything? Thanks.
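As a sanity check on the numbers above, the micro- and macro-averaged F1 scores in the ACE+doc block can be recomputed from the per-class counts (standard F1 arithmetic, not code from the ACE repository):

```python
# Recompute micro/macro F1 from the per-class tp/fp/fn in the ACE+doc run.
counts = {
    'LOC':  (1550, 135, 118),
    'MISC': (478,  95,  224),
    'ORG':  (1426, 99,  235),
    'PER':  (1563, 40,  54),
}

def f1(tp, fp, fn):
    # Equivalent to 2*P*R/(P+R) with P = tp/(tp+fp), R = tp/(tp+fn).
    return 2 * tp / (2 * tp + fp + fn)

# Micro average: pool all counts, then compute F1 once.
tp = sum(c[0] for c in counts.values())
fp = sum(c[1] for c in counts.values())
fn = sum(c[2] for c in counts.values())
print(round(f1(tp, fp, fn), 4))  # 0.9094, matching MICRO_AVG

# Macro average: unweighted mean of the per-class F1 scores.
macro = sum(f1(*c) for c in counts.values()) / len(counts)
print(round(macro, 4))  # 0.8851, matching MACRO_AVG
```

The reported averages are internally consistent, so the gap between the two runs reflects the model configurations rather than a scoring error.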