This repo contains example projects for various NLP tasks, including scripts, benchmarks, results and datasets created with Prodigy.

| Name | Description |
|---|---|
| ner-food-ingredients | Use sense2vec and Prodigy to bootstrap an NER model to detect ingredients in Reddit comments and to calculate how these mentions change over time. Includes an end-to-end video tutorial, raw pre-processed data, 949 annotated examples and pretrained tok2vec weights. |
| ner-fashion-brands | Use sense2vec to bootstrap an NER model to detect fashion brands in Reddit comments. Includes 1735 annotated examples, a data visualizer, training and evaluation scripts for spaCy and pretrained tok2vec weights. |
| ner-drugs | Use word vectors to bootstrap an NER model to detect drug names in Reddit comments. Includes 1977 annotated examples, a data visualizer, training and evaluation scripts for spaCy and pretrained tok2vec weights. |
| textcat-docs-issues | Train a binary text classifier with exclusive classes to predict whether a GitHub issue title is about documentation. Includes 1161 annotated examples, a live demo, a downloadable model, and training and evaluation scripts for spaCy. |