Named Entity Recognition
Tagging names, concepts or key phrases is a crucial task for Natural Language Understanding pipelines. Prodigy lets you label NER training data or improve an existing model’s accuracy with ease.
Fast and flexible annotation
Prodigy’s web-based annotation app has been carefully designed to be as efficient as possible. The manual interface lets you label entities by highlighting spans of text by hand. Your annotations snap to token boundaries, and you can mark single-word entities by double-clicking.
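The snapping behavior can be illustrated with a short sketch. This is not Prodigy's internal implementation, just a minimal example of the idea, using a naive whitespace tokenizer: a sloppy character-level highlight is expanded outward until it covers whole tokens.

```python
def tokenize(text):
    """Toy whitespace tokenizer returning (start, end) character offsets."""
    spans, start = [], None
    for i, ch in enumerate(text):
        if ch.isspace():
            if start is not None:
                spans.append((start, i))
                start = None
        elif start is None:
            start = i
    if start is not None:
        spans.append((start, len(text)))
    return spans

def snap_to_tokens(text, start, end):
    """Expand a character span so it covers every token it touches."""
    tokens = [t for t in tokenize(text) if t[1] > start and t[0] < end]
    if not tokens:
        return None
    return tokens[0][0], tokens[-1][1]

text = "She visited New York last spring"
# The user sloppily highlights "ew Yor" (characters 13-19);
# the span snaps outward to cover "New York" (characters 12-20).
print(snap_to_tokens(text, 13, 19))  # → (12, 20)
```

In practice the tokenization comes from a real tokenizer rather than splitting on whitespace, but the snapping logic is the same: annotators never have to hit an exact character boundary.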
Once your model is reasonably accurate, the binary interface lets you fly through lots of inputs very quickly, by just saying “yes” or “no” to specific suggestions. It’s a great way to confirm likely predictions, reject unlikely ones, or focus only on possible mistakes.
Label any text, in any language or script
Prodigy lets you use token boundaries for faster and more consistent annotation, but it's also fully flexible: you can annotate from the character up if your task requires it. No matter what language or writing system you're working with, if it's text, Prodigy can help you annotate it.
Bootstrap with powerful patterns
Prodigy is a fully scriptable annotation tool, letting you automate as much as possible with custom rule-based logic. You don’t want to waste time labeling every instance of common entities like "New York" or "the United States" by hand. Instead, give Prodigy rules or a list of examples, review the entities in context and annotate the exceptions. As you annotate, a statistical model can learn to suggest similar entities, generalising beyond your initial patterns.
patterns.jsonl
{"pattern": [{"lower": "new"}, {"lower": "york"}], "label": "CITY"}
{"pattern": [{"lower": "berlin"}], "label": "CITY"}
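To make the pattern format concrete, here is a simplified sketch of how such token patterns match text. Each pattern is a list of token specs, and {"lower": "new"} matches a token whose lowercased form is "new". The real pattern engine (spaCy's Matcher, which Prodigy builds on) supports many more attributes and operators; this toy version only handles the "lower" key and splits on whitespace.

```python
def matches(pattern, tokens, i):
    """Does the pattern match the token sequence starting at index i?"""
    if i + len(pattern) > len(tokens):
        return False
    return all(tok.lower() == spec["lower"]
               for spec, tok in zip(pattern, tokens[i:]))

def find_entities(patterns, text):
    """Return (matched text, label) pairs for every pattern match."""
    tokens = text.split()
    spans = []
    for entry in patterns:
        pat = entry["pattern"]
        for i in range(len(tokens)):
            if matches(pat, tokens, i):
                spans.append((" ".join(tokens[i:i + len(pat)]), entry["label"]))
    return spans

patterns = [
    {"pattern": [{"lower": "new"}, {"lower": "york"}], "label": "CITY"},
    {"pattern": [{"lower": "berlin"}], "label": "CITY"},
]
print(find_entities(patterns, "I flew from Berlin to New York"))
# → [('New York', 'CITY'), ('Berlin', 'CITY')]
```

Because the patterns operate on token attributes rather than raw strings, "new york", "New York" and "NEW YORK" all match the same rule.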
Focus on what the model is most uncertain about
Prodigy lets you put the model in the loop, so that it can actively participate in the training process, using what it already knows to figure out what to ask you next. The model learns as you go, based on the answers you provide. Most annotation tools avoid making any suggestions to the user, to avoid biasing the annotations. Prodigy takes the opposite approach: make suggestions, so the user has to be asked as little as possible.
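The core idea behind this kind of active learning can be sketched in a few lines. The example below is illustrative, not Prodigy's actual scoring code: given made-up model probabilities for candidate annotations, it surfaces the examples the model is least sure about first, since those answers teach the model the most.

```python
def sort_by_uncertainty(scored_examples):
    """Most uncertain first: probability closest to 0.5."""
    return sorted(scored_examples, key=lambda ex: abs(ex[1] - 0.5))

# Hypothetical (text, model probability) pairs for candidate entity labels.
examples = [
    ("Berlin -> CITY", 0.98),   # confident: low priority for annotation
    ("Apple -> ORG", 0.52),     # uncertain: ask the annotator first
    ("ran -> PERSON", 0.03),    # confidently rejected: low priority
]
queue = sort_by_uncertainty(examples)
print(queue[0][0])  # → "Apple -> ORG" is shown to the annotator first
```

As the annotator answers and the model updates, the scores shift, so the queue keeps pointing at whatever the model currently finds hardest.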
Make the most of transfer learning
With modern transfer learning and data augmentation, quality is much more important than quantity. Language model pretraining approaches like BERT or ULMFiT let you import knowledge from millions of examples without creating a single annotation. You don't need an army of annotators to train a good model. You just need good ideas and the right workflow – which Prodigy makes easier than ever.