List resources to release and where #232

Closed
4 tasks done
mcollardanuy opened this issue May 9, 2023 · 0 comments

mcollardanuy commented May 9, 2023

To do:

List of resources to release (currently in toponymVM2.0):

To run T-Res with the current configuration

resources/models/blb_lwm-ner-fine.model/
resources/wikidata/wikidata_gazetteer.csv
resources/wikidata/entity2class.txt
resources/wikidata/mentions_to_wikidata.json
resources/wikidata/mentions_to_wikidata_normalized.json
resources/wikidata/wikidata_to_mentions_normalized.json
resources/deezymatch/combined/wkdtalts_w2v_ocr/
resources/deezymatch/models/w2v_ocr/
resources/rel_db/embeddings_database.db
resources/models/disambiguation/deezymatch+3+25_Ashton1860+wmtops

Note: resources/models/blb_lwm-ner-fine.model/ only needs to be released here if it is not uploaded to Hugging Face (see issue).
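
As a rough aid for whoever sets up a released copy, the list above can be turned into a quick existence check. This is only a sketch: the paths are copied verbatim from the list, while the function name `missing_resources` and the `root` argument are hypothetical and not part of T-Res.

```python
from pathlib import Path

# Paths copied verbatim from the list above.
REQUIRED_RESOURCES = [
    "resources/models/blb_lwm-ner-fine.model/",
    "resources/wikidata/wikidata_gazetteer.csv",
    "resources/wikidata/entity2class.txt",
    "resources/wikidata/mentions_to_wikidata.json",
    "resources/wikidata/mentions_to_wikidata_normalized.json",
    "resources/wikidata/wikidata_to_mentions_normalized.json",
    "resources/deezymatch/combined/wkdtalts_w2v_ocr/",
    "resources/deezymatch/models/w2v_ocr/",
    "resources/rel_db/embeddings_database.db",
    "resources/models/disambiguation/deezymatch+3+25_Ashton1860+wmtops",
]


def missing_resources(root, required=REQUIRED_RESOURCES):
    """Return the entries of `required` that do not exist under `root`.

    Hypothetical helper (not part of T-Res): it only checks that each
    path exists, not that its contents are valid.
    """
    root = Path(root)
    return [rel for rel in required if not (root / rel).exists()]
```

Calling `missing_resources(".")` from the repository root returns an empty list when every listed path is in place.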

To train a new fine-grained NER model (with topRes19th)

experiments/outputs/data/lwm/ner_fine_train.json
experiments/outputs/data/lwm/ner_fine_dev.json

To train a new DeezyMatch model (for English OCR)

resources/deezymatch/data/w2v_ocr_pairs.txt

To generate a new DeezyMatch training set (for English OCR)

Nothing: the embeddings are already uploaded here. Store them under resources/models/w2v/ in folders named w2v_18XXs_news, where XX is the decade.
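
To make the folder naming concrete, here is a minimal sketch that builds the expected folder path for a given decade. The helper name `w2v_folder` is hypothetical, and which decades are actually available is not stated in this issue, so the function only enforces the 18XX pattern.

```python
from pathlib import Path

# Base folder named in the issue.
W2V_ROOT = Path("resources/models/w2v")


def w2v_folder(decade_start):
    """Return the folder for the decade starting at `decade_start`.

    Hypothetical helper: follows the w2v_18XXs_news pattern from the
    issue, so only decade starts from 1800 to 1890 are accepted.
    """
    if decade_start % 10 != 0 or not 1800 <= decade_start <= 1890:
        raise ValueError(f"not a 19th-century decade start: {decade_start}")
    return W2V_ROOT / f"w2v_{decade_start}s_news"
```

For example, `w2v_folder(1860)` gives the path resources/models/w2v/w2v_1860s_news.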

To train a new entity disambiguation model (with topRes19th)

experiments/outputs/data/lwm/linking_df_split.tsv

mcollardanuy self-assigned this May 9, 2023
mcollardanuy added the release and documentation (Improvements or additions to documentation) labels May 9, 2023
mcollardanuy changed the title from "List resources to release" to "List resources to release and where" May 19, 2023