๐Ÿ’ Example projects for various NLP tasks with datasets, scripts and results

gfranco008/projects

Example projects

This repo contains example projects for various NLP tasks, implemented with spaCy and other NLP frameworks. The projects include scripts, benchmarks and results, as well as annotated datasets (created with Prodigy).

๐Ÿ’ Projects

| Name | Description |
| --- | --- |
| ner-food-ingredients | Use sense2vec and Prodigy to bootstrap an NER model that detects ingredients in Reddit comments, then calculate how these mentions change over time. Includes an end-to-end video tutorial, raw pre-processed data, 949 annotated examples and pretrained tok2vec weights. |
| ner-fashion-brands | Use sense2vec to bootstrap an NER model that detects fashion brands in Reddit comments. Includes 1735 annotated examples, a data visualizer, training and evaluation scripts for spaCy, and pretrained tok2vec weights. |
| ner-drugs | Use word vectors to bootstrap an NER model that detects drug names in Reddit comments. Includes 1977 annotated examples, a data visualizer, training and evaluation scripts for spaCy, and pretrained tok2vec weights. |
| textcat-docs-issues | Train a binary text classifier with exclusive classes to predict whether a GitHub issue title is about documentation. Includes 1161 annotated examples, a live demo, a downloadable model, and training and evaluation scripts for spaCy. |
| nel-emerson | Use spaCy and Prodigy to train an entity linking model that disambiguates mentions of "Emerson" to unique Wikidata identifiers. |
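
The ner-food-ingredients project tracks how ingredient mentions change over time. As a rough illustration of that counting step only, here is a minimal standard-library sketch. It is not the project's actual script: the function name `mentions_per_month`, the `(timestamp, entities)` record format and the sample data are all assumptions made for this example, standing in for whatever an NER model would produce over dated comments.

```python
from collections import Counter, defaultdict
from datetime import datetime, timezone


def mentions_per_month(records):
    """Count entity mentions per calendar month.

    `records` is an iterable of (unix_timestamp, [entity_text, ...]) pairs.
    This format is hypothetical; adapt it to your own NER output.
    """
    counts = defaultdict(Counter)
    for timestamp, entities in records:
        # Bucket each comment by its UTC year and month, e.g. "2019-01".
        month = datetime.fromtimestamp(timestamp, tz=timezone.utc).strftime("%Y-%m")
        for text in entities:
            # Lowercase so "Garlic" and "garlic" count as the same mention.
            counts[month][text.lower()] += 1
    # Return plain dicts, ordered chronologically by month key.
    return {month: dict(counter) for month, counter in sorted(counts.items())}


# Made-up sample data, for illustration only.
sample = [
    (1546300800, ["Garlic", "butter"]),  # 2019-01-01 UTC
    (1548979200, ["garlic"]),            # 2019-02-01 UTC
    (1549065600, ["kale", "garlic"]),    # 2019-02-02 UTC
]
trend = mentions_per_month(sample)
print(trend)
```

Dividing each count by the total number of comments in that month would give a relative frequency, which is usually easier to compare across months with different comment volumes.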

