Large Language Models (LLMs)
Nothing is stopping you from integrating Prodigy with services that can help you annotate. This includes large language models, which allow you to provide a prompt in order to attempt an NLP task. Prodigy integrates with these models via the spacy-llm package and comes preconfigured with some recipes that you can use directly.
Quickstart
You can use the ner.openai.correct, ner.openai.fetch, ner.llm.correct and ner.llm.fetch recipes to pre-highlight text examples with NER annotations. These annotations typically deserve a review, but the large language model in these recipes is able to generate annotations for many entities that are not supported out of the box by pretrained spaCy pipelines.
You can learn more by checking the named entity section on this page.
You can use the textcat.openai.correct, textcat.openai.fetch, textcat.llm.correct and textcat.llm.fetch recipes to attach class predictions to text examples. These annotations typically deserve a review, but the large language model is able to generate labels that don’t require you to train your own model beforehand.
You can learn more by checking the text classification section on this page.
The terms.llm.fetch and terms.openai.fetch recipes can accept a topic in order to generate a terminology list for you.
You can learn more by checking the terminology section on this page.
The ner.llm.fetch, textcat.llm.fetch, ner.openai.fetch and textcat.openai.fetch recipes allow you to download predictions from large language models upfront. These predictions won’t be perfect, but they might allow you to select an interesting subset for manual review in Prodigy.
The most compelling use case for this is when you’re dealing with a rare label. Instead of going through all the examples manually you could instead only check the examples in which the LLM predicts the label of interest.
Be aware that it can be expensive to send many queries to an LLM vendor, but it can be worth the investment if this is a method to help you get started.
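The rare-label workflow described above can be sketched in a few lines of Python. This is only an illustration: it assumes the fetched predictions are stored as JSONL records with a "spans" list whose items carry a "label" key, and the helper names `has_label` and `select_examples` are made up for this example.

```python
import json


def has_label(example, label):
    """Return True if any pre-annotated span in the example carries the label."""
    return any(span.get("label") == label for span in example.get("spans", []))


def select_examples(jsonl_lines, label):
    """Parse JSONL lines and keep only the examples that mention the label."""
    examples = (json.loads(line) for line in jsonl_lines if line.strip())
    return [eg for eg in examples if has_label(eg, label)]


# Two hypothetical fetched records, inlined so the sketch is self-contained.
fetched = [
    '{"text": "Bake the pizza in a hot oven.", "spans": ['
    '{"start": 9, "end": 14, "label": "DISH"}, '
    '{"start": 24, "end": 28, "label": "EQUIPMENT"}]}',
    '{"text": "Add a pinch of salt.", "spans": ['
    '{"start": 15, "end": 19, "label": "INGREDIENT"}]}',
]

subset = select_examples(fetched, "EQUIPMENT")
print([eg["text"] for eg in subset])
```

In practice you would read the lines from the file written by a fetch recipe and write the selected subset to a new file for review.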
There can be good reasons to write custom prompts for large language models. To help discover which prompts perform best, you may consider using the ab.llm.tournament, ab.openai.prompts and ab.openai.tournament recipes to compare prompts.
How do these recipes work?
Large language models, like those offered by OpenAI, can be used for text completion tasks. They allow you to input some text as a prompt, and the model will generate a text completion that tries to match whatever context was given.
This text-in, text-out interface means you can try to engineer a prompt that the large language model can use to perform a specific task that you’re interested in. While this approach is not perfect and is still the subject of ongoing research, it does offer a general method to construct prompts for many typical NLP tasks such as NER or text classification.
As of v1.13, Prodigy integrates with large language models via the spacy-llm library. This project supports multiple large language model backends that all output structured data like a normal spaCy model.
These recipes include:
- ner.llm.correct / ner.llm.fetch: review/download NER annotations performed by spacy-llm
- spans.llm.correct / spans.llm.fetch: review/download spancat annotations by spacy-llm
- textcat.llm.correct / textcat.llm.fetch: review/download text classification annotations by spacy-llm
- terms.llm.fetch: download terms/phrases generated by an LLM
- ab.llm.tournament: prompt engineering via tournament selection
These recipes combine large language models with Prodigy to aid the human in the loop by providing pre-annotated examples. The goal of the available recipes is to help you get started quicker but the recipes themselves, including their prompts, can be fully customized too.
Originally, as of v1.12, Prodigy offered seven recipes that interact with OpenAI directly. These include:
- ner.openai.correct / ner.openai.fetch: review/download NER annotations
- textcat.openai.correct / textcat.openai.fetch: review/download text classification annotations
- terms.openai.fetch: retrieve terminology lists based on a query
- ab.openai.prompts: compare two prompts for OpenAI in a blind taste test
- ab.openai.tournament: compare many prompts for OpenAI in a tournament
All of these recipes handle the prompt generation for OpenAI as well as the response parsing in order to help you annotate.
Benefits of spacy-llm
There are also some other benefits of the spacy-llm approach that are worth highlighting.
- When you use spacy-llm you’re free to switch LLM providers if you’d like to experiment with different vendors.
- Some of the LLM backends that spacy-llm supports can be run on your own hardware, removing the need to send data to a third party.
- The spacy-llm recipes allow you to use a cache that prevents the same costly prompt from being run twice.
- The spacy-llm recipes will allow you to predict more than just text categories and named entities. Spans are currently supported, but other linguistic tasks will likely be added in the future as well.
- The prompts from spacy-llm will likely be more up to date. The spaCy team is directly working on that project, which also means that it can iterate on better prompts and models quickly and independently.
Getting Started with Prodigy and spacy-llm
New: 1.13
You can use spacy-llm to add LLM-powered components to a spaCy pipeline. This allows for an easy integration with Prodigy, because it’s already set up to work nicely with spaCy.
To use an LLM with spaCy you’ll need to start by creating a configuration file
that tells spacy-llm
how to construct a prompt for your task. Here’s one such
example that could be used to find named entities in recipes.
Basic spacy-llm config
[nlp]
lang = "en"
pipeline = ["llm"]
[components]
[components.llm]
factory = "llm"
save_io = true
[components.llm.task]
@llm_tasks = "spacy.NER.v2"
labels = ["DISH", "INGREDIENT", "EQUIPMENT"]
[components.llm.model]
@llm_models = "spacy.GPT-3-5.v1"
config = {"temperature": 0.3}
Let’s go over what this configuration file defines.
- At the start of the file we’re defining an English spaCy pipeline with a single component called "llm". At this point in the configuration file it is not yet known what kind of pipeline component it is; all we know is its name.
- Later in the file, we see that there is a definition for the llm component and that it uses a factory called "llm". This refers to a registered function that spacy-llm provides, which constructs components that can interface with large language models. The file also configures save_io = true, which ensures that the LLM prompt and response are saved. Not every Prodigy recipe needs this, but some do, so it’s good practice to always include it when you’re configuring spacy-llm pipelines for Prodigy.
- Next, we see a definition for a “task”. In spacy-llm a task is a combination of a prompt generator and a response parser. The prompt generator will generate a prompt based on the inputs that you provide, and the response parser makes sure that any response from the large language model is properly turned into structured data for spaCy. In this example we’re configuring a spacy.NER task, which allows us to provide labels. In this case the file is configured to detect DISH, INGREDIENT and EQUIPMENT entities.
- Finally, we also configure a backend, which is where we configure which LLM provider to use. You can choose to go with a paid vendor, like OpenAI, but you can also configure a local model, like Dolly, instead. If you’re going with a vendor, you’ll need to set up your environment variables so that you can identify yourself.
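The last point mentions environment variables. As a rough sketch, a custom script could check that the credentials are present before it tries to assemble a pipeline. The variable name `OPENAI_API_KEY` is the one OpenAI-backed models conventionally read; treat it, and the helper name `missing_env_vars`, as assumptions to adapt to your own vendor.

```python
import os

# Assumption: the OpenAI backend reads its credentials from this variable.
REQUIRED_VARS = ["OPENAI_API_KEY"]


def missing_env_vars(required, env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


missing = missing_env_vars(REQUIRED_VARS)
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("All required environment variables are set.")
```

Passing `env` explicitly makes the helper easy to try out without touching your real environment.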
How it works
Under the hood, spacy-llm
will take your configuration file and use it to
write a prompt for the large language model when it is presented with an example
to annotate. If we assume the following input:
{ "text": "I know of a great pizza recipe with anchovis." }
Then in this particular case, the prompt may look something like this:
NER prompt sent to LLM
From the text below, extract the following entities in the following format:
dish: <comma delimited list of strings>
ingredient: <comma delimited list of strings>
equipment: <comma delimited list of strings>
Text:
"""
I know of a great pizza recipe with anchovis.
"""
After the large language model receives the prompt, it will process it and produce output, which might look like this.
NER response from LLM
dish: pizza
ingredient: anchovis
equipment:
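To make the prompt and response round trip above concrete, here is a minimal sketch of a prompt generator and a response parser for this exact format. It is not the spacy-llm implementation: the function names `build_prompt` and `parse_response` are invented for illustration, and the parser only matches the first occurrence of each phrase in the text.

```python
def build_prompt(text, labels):
    """Build a prompt in the style shown above for the given labels."""
    lines = ["From the text below, extract the following entities in the following format:"]
    lines += [f"{label.lower()}: <comma delimited list of strings>" for label in labels]
    lines += ["Text:", '"""', text, '"""']
    return "\n".join(lines)


def parse_response(text, response, labels):
    """Turn 'label: phrase, phrase' lines into character-offset spans."""
    allowed = {label.lower(): label for label in labels}
    spans = []
    for line in response.splitlines():
        key, sep, value = line.partition(":")
        label = allowed.get(key.strip().lower())
        if not sep or label is None:
            continue  # skip lines that are not 'known label: ...'
        for phrase in (part.strip() for part in value.split(",")):
            start = text.find(phrase) if phrase else -1
            if start >= 0:  # ignore phrases that do not occur in the text
                spans.append({"start": start, "end": start + len(phrase), "label": label})
    return spans


labels = ["DISH", "INGREDIENT", "EQUIPMENT"]
text = "I know of a great pizza recipe with anchovis."
response = "dish: pizza\ningredient: anchovis\nequipment:"

print(build_prompt(text, labels))
spans = parse_response(text, response, labels)
print([(text[s["start"]:s["end"]], s["label"]) for s in spans])
```

Dropping phrases that cannot be found in the text is one simple way to guard against a model that invents entities.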
The goal of spacy-llm
is to handle the prompt generation and parsing on your
behalf while supporting multiple LLM backends. The interface is just like a
normal spaCy pipeline that you’re used to, but by supporting these large
language models directly we may have an opportunity to make data annotation
easier. It can remove the need to find a pre-trained model for the task that
we’re interested in by leveraging an appropriate prompt instead.
Using spacy-llm pipelines directly
Before diving deeper into the spacy-llm recipes, it’s good to observe that you can also use these pipelines programmatically. If you have custom Python recipes, you’ll be able to directly assemble a spaCy pipeline from a config file.
from spacy_llm.util import assemble
from dotenv import load_dotenv
# Make sure the environment variables are loaded
load_dotenv()
# Assemble a spaCy pipeline from the config
nlp = assemble("config.cfg")
# Use this pipeline as you would normally
doc = nlp("I know of a great pizza recipe with anchovis.")
print(doc.ents) # (pizza, anchovis)
You can also use the spaCy assemble command from the terminal to generate a local nlp pipeline that you can load as well.
dotenv run -- spacy assemble config.cfg en_ner_cooking
This will save a folder on disk called en_ner_cooking
that contains a spaCy
pipeline that you can load like any other spaCy pipeline.
import spacy
from dotenv import load_dotenv
# Again, we have to make sure the environment variables are loaded
load_dotenv()
# Load saved LLM pipeline from disk
nlp = spacy.load("en_ner_cooking")
# Again, use this pipeline as you would normally
doc = nlp("I know of a great pizza recipe with anchovis.")
print(doc.ents) # (pizza, anchovis)
This means that, theoretically, you could immediately start re-using the saved
pipeline in your recipes as you would normally. If, for example, you’d like to
use ner.correct
with this model, you can do so by running:
Example
dotenv run -- prodigy ner.correct ner_cooking en_ner_cooking examples.jsonl --component llm
This setup works, but as we’ll see later, the Prodigy integration with
spacy-llm
will make this much easier.
Using spacy-llm recipes for NER
Instead of building a spaCy model and storing it to disk manually, you can also
directly use the *.llm.*
recipes, which use spacy-llm
under the hood. For
NER that means that you can use the ner.llm.correct
recipe to annotate
data with an LLM model in the loop.
Example
dotenv run -- prodigy ner.llm.correct annotated-recipes spacy-llm-config.cfg examples.jsonl
This will start an interface that shows you the LLM predictions together with the prompt and response.
Because this example is annotated correctly, you can simply accept the annotation without having to use your mouse to annotate the entities. This can save a tremendous amount of time, but it should be stressed that the LLM annotations can be wrong. You still want to be in the loop to curate the annotations.
Alternatively, you can also use ner.llm.fetch
to download these
annotations on disk such that you can later review them with the
ner.manual
recipe. This interface won’t give you the prompt information,
but does allow you to call the LLM exactly once even if you reset the server and
you want to show the data to multiple annotators.
Example
dotenv run -- prodigy ner.llm.fetch spacy-llm-config.cfg examples.jsonl ner-annotated.jsonl
100%|████████████████████████████| 50/50 [00:12<00:00, 3.88it/s]
More spacy-llm configurations for NER
Let’s expand the spacy-llm
configuration for named entity recognition.
Basic spacy-llm config
[nlp]
lang = "en"
pipeline = ["llm"]
[components]
[components.llm]
factory = "llm"
save_io = true
[components.llm.task]
@llm_tasks = "spacy.NER.v2"
labels = ["DISH", "INGREDIENT", "EQUIPMENT"]
[components.llm.task.label_definitions]
DISH = "Extract the name of a known dish."
INGREDIENT = "Extract the name of a cooking ingredient, including herbs and spices."
EQUIPMENT = "Extract any mention of cooking equipment. e.g. oven, cooking pot, grill"
[components.llm.model]
@llm_models = "spacy.GPT-3-5.v1"
config = {"temperature": 0.3}
[components.llm.task.examples]
@misc = "spacy.FewShotReader.v1"
path = "ner_examples.yml"
[components.llm.cache]
@llm_misc = "spacy.BatchCache.v1"
path = "local-cached"
batch_size = 3
max_batches_in_mem = 10
There are three main additions to this configuration file.
- It now has label definitions that help describe the annotation task via components.llm.task.label_definitions. This can help you give the large language model extra context and may yield more reliable results.
- It now has few-shot examples via components.llm.task.examples. It is now configured to use a few-shot reader to load in examples from a file on disk. This allows you to add examples to the prompt, possibly examples that the LLM got wrong in the past, in an attempt to steer the LLM to what you want it to do.
- It now has a cache via components.llm.cache. By configuring this, spacy-llm will store batches of documents in the local-cached folder so that you don’t have to incur costs when you rerun the same example.
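The idea behind the cache can be illustrated with a small in-memory sketch. This is not how the spacy.BatchCache.v1 component is implemented (that one stores batches of documents on disk); the class name `PromptCache` and the stand-in function `fake_llm` are hypothetical and only show why identical prompts do not need to be sent twice.

```python
import hashlib


class PromptCache:
    """Remember responses by prompt hash so repeated prompts are not re-sent."""

    def __init__(self, call_llm):
        self._call_llm = call_llm  # function that actually queries the model
        self._store = {}
        self.misses = 0  # number of times the model had to be called

    def get(self, prompt):
        key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._call_llm(prompt)
        return self._store[key]


def fake_llm(prompt):
    """Stand-in for a paid API call, so the sketch runs offline."""
    return "response to: " + prompt


cache = PromptCache(fake_llm)
first = cache.get("Extract entities from: pizza with anchovis")
second = cache.get("Extract entities from: pizza with anchovis")
print(cache.misses)  # the model was only called once
```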
For the best performance, we recommend passing in label definitions as well as few shot examples to the prompt when you write your own configuration files as well.
The aforementioned few-shot examples need to be structured, and the expected structure will depend on the task that you’re running. More details can be found on the spacy-llm docs, but for NER it might look like this:
ner_examples.yml
- text: "You can't get a great chocolate flavor with carob."
  entities:
    INGREDIENT: ['carob']
- text: "You can probably sand-blast it if it's an anodized aluminum pan."
  entities:
    INGREDIENT: []
    EQUIPMENT: ['anodized aluminum pan']
Given such a file on disk, you will now use a different prompt when running the
ner.llm.correct
and ner.llm.fetch
recipes.
The new prompt used under the hood
You are an expert Named Entity Recognition (NER) system. Your task is to accept Text as input and extract named entities for the set of predefined entity labels.
From the Text input provided, extract named entities for each label in the following format:
DISH: <comma delimited list of strings>
INGREDIENT: <comma delimited list of strings>
EQUIPMENT: <comma delimited list of strings>
Below are definitions of each label to help aid you in what kinds of named entities to extract for each label.
Assume these definitions are written by an expert and follow them closely.
DISH: Extract the name of a known dish.
INGREDIENT: Extract the name of a cooking ingredient, including herbs and spices.
EQUIPMENT: Extract any mention of cooking equipment. e.g. oven, cooking pot, grill
Below are some examples (only use these as a guide):
Text:
'''
You can't get a great chocolate flavor with carob.
'''
INGREDIENT: carob
Text:
'''
You can probably sand-blast it if it's an anodized aluminum pan.
'''
INGREDIENT:
EQUIPMENT: anodized aluminum pan
Here is the text that needs labeling:
Text:
'''
In Silicon Valley, a Voice of Caution Guides a High-Flying Uber
'''
You’ll notice that the prompt is now much longer than before. In theory this will give the LLM more context and may also allow it to give a better predictive performance.
More performance with spacy.NER.v3
In general, it’s best to create a spacy-llm pipeline with detailed label descriptions and examples. It’s especially useful to add examples to the prompt that the LLM might otherwise get wrong. This tends to add a lot of context to the prompt, which the LLM can use to respond appropriately. The only downside, in practice, is that the longer prompt may also slow down the pipeline and incur more compute costs on your behalf.
If you’re interested in making the LLM even more reliable then you might want to
consider the spacy.NER.v3
task that uses chain-of-thought reasoning in the prompt to generate better
annotations. It’s based on the PromptNER
paper and it requires you to pass an example to the prompt.
Example
Here’s an example configuration that uses spacy.NER.v3.
config.cfg
[nlp]
lang = "en"
pipeline = ["llm"]
[components]
[components.llm]
factory = "llm"
[components.llm.task]
@llm_tasks = "spacy.NER.v3"
labels = ["DISH", "INGREDIENT", "EQUIPMENT"]
description = "Entities are the names food dishes, ingredients, and any kind of cooking equipment. Adjectives, verbs, adverbs are not entities. Pronouns are not entities."
[components.llm.task.label_definitions]
DISH = "Known food dishes, e.g. Lobster Ravioli, garlic bread"
INGREDIENT = "Individual parts of a food dish, including herbs and spices."
EQUIPMENT = "Any kind of cooking equipment. e.g. oven, cooking pot, grill"
[components.llm.task.examples]
@misc = "spacy.FewShotReader.v1"
path = "few-shot-examples.json"
[components.llm.model]
@llm_models = "spacy.GPT-3-5.v1"
This configuration refers to a few-shot-examples.json
file, which might have
examples like below.
few-shot-examples.json
[
{
"text": "You can't get a great chocolate flavor with carob.",
"spans": [
{
"text": "chocolate",
"is_entity": false,
"label": "==NONE==",
"reason": "is a flavor in this context, not an ingredient"
},
{
"text": "carob",
"is_entity": true,
"label": "INGREDIENT",
"reason": "is an ingredient to add chocolate flavor"
}
]
},
{
"text": "You can probably sand-blast it if it's an anodized aluminum pan",
"spans": [
{
"text": "sand-blast",
"is_entity": false,
"label": "==NONE==",
"reason": "is a cleaning technique, not some kind of equipment"
},
{
"text": "anodized aluminum pan",
"is_entity": true,
"label": "EQUIPMENT",
"reason": "is a piece of cooking equipment, anodized is included since it describes the type of pan"
}
]
}
]
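Because these few-shot files are written by hand, a quick sanity check can catch mistakes before they end up in a prompt. The sketch below is an assumption about what is worth checking (span text occurs in the example text, positive spans use a known label, negative spans use `==NONE==`); the helper `validate_examples` is not part of spacy-llm or Prodigy.

```python
import json

NONE_LABEL = "==NONE=="


def validate_examples(examples, labels):
    """Return a list of readable problems; an empty list means all checks passed."""
    problems = []
    for i, example in enumerate(examples):
        for span in example.get("spans", []):
            if span["text"] not in example["text"]:
                problems.append(f"example {i}: {span['text']!r} does not occur in the text")
            if span["is_entity"] and span["label"] not in labels:
                problems.append(f"example {i}: unknown label {span['label']!r}")
            if not span["is_entity"] and span["label"] != NONE_LABEL:
                problems.append(f"example {i}: negative span should use {NONE_LABEL}")
    return problems


# One example in the same shape as the file above, inlined to keep this runnable.
raw = """[
  {"text": "You can't get a great chocolate flavor with carob.",
   "spans": [
     {"text": "chocolate", "is_entity": false, "label": "==NONE==",
      "reason": "is a flavor in this context, not an ingredient"},
     {"text": "carob", "is_entity": true, "label": "INGREDIENT",
      "reason": "is an ingredient to add chocolate flavor"}
   ]}
]"""

examples = json.loads(raw)
print(validate_examples(examples, {"DISH", "INGREDIENT", "EQUIPMENT"}))
```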
Let’s go over some of the differences of this setup compared to the previous configuration.
- The spacy.NER.v3 configuration file also comes with a task description. This allows you to mention what you are and what you aren’t interested in detecting, which contributes more context to the prompt for the LLM.
- The few-shot-examples.json file is more expressive than before. Each example allows you to pass a reason with every label and you’re also able to give negative examples to indicate when something isn’t an entity.
That last change especially contributes a lot of context for the LLM and it’s also reflected in the generated prompt.
You are an expert Named Entity Recognition (NER) system.
Your task is to accept Text as input and extract named entities.
Entities must have one of the following labels: DISH, EQUIPMENT, INGREDIENT.
If a span is not an entity label it: `==NONE==`.
Entities are the names food dishes,
ingredients, and any kind of cooking equipment.
Adjectives, verbs, adverbs are not entities.
Pronouns are not entities.
Below are definitions of each label to help aid you in what kinds of named entities to extract for each label.
Assume these definitions are written by an expert and follow them closely.
DISH: Known food dishes, e.g. Lobster Ravioli, garlic bread
INGREDIENT: Individual parts of a food dish, including herbs and spices.
EQUIPMENT: Any kind of cooking equipment. e.g. oven, cooking pot, grill
Q: Given the paragraph below, identify a list of entities, and for each entry explain why it is or is not an entity:
Paragraph: You can't get a great chocolate flavor with carob.
Answer:
1. chocolate | False | ==NONE== | is a flavor in this context, not an ingredient
2. carob | True | INGREDIENT | is an ingredient to add chocolate flavor
Paragraph: You can probably sand-blast it if it's an anodized aluminum pan
Answer:
1. sand-blast | False | ==NONE== | is a cleaning technique, not some kind of equipment
2. anodized aluminum pan | True | EQUIPMENT | is a piece of cooking equipment, anodized is included since it describes the type of pan
Paragraph: I know of a great pizza recipe with anchovis.
Answer:
1. pizza | True | DISH | is a known food dish
2. anchovis | True | INGREDIENT | is an ingredient used in the pizza recipe
Notice how the response now contains a table-like format in the list? The task can deal with this because it comes with a parser that can handle this input, but it serves as a nice example of how these LLMs can generate different outputs that might suit a task better.
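To show what parsing such a table-like answer involves, here is a minimal sketch that reads the numbered, pipe-separated rows from the response above. It is an illustration only and not the parser that ships with the task; the name `parse_cot_response` and the regular expression are assumptions, and rows whose entity text itself contains a pipe character are not handled.

```python
import re

# Matches rows such as: "1. pizza | True | DISH | is a known food dish"
ROW = re.compile(r"^\s*\d+\.\s*(.+?)\s*\|\s*(True|False)\s*\|\s*(.+?)\s*\|\s*(.*)$")


def parse_cot_response(response):
    """Parse numbered 'text | is_entity | label | reason' rows into dicts."""
    rows = []
    for line in response.splitlines():
        match = ROW.match(line)
        if match is None:
            continue  # ignore anything that is not a table row
        text, flag, label, reason = match.groups()
        rows.append(
            {"text": text, "is_entity": flag == "True", "label": label, "reason": reason.strip()}
        )
    return rows


response = (
    "1. pizza | True | DISH | is a known food dish\n"
    "2. anchovis | True | INGREDIENT | is an ingredient used in the pizza recipe"
)
rows = parse_cot_response(response)
entities = [(row["text"], row["label"]) for row in rows if row["is_entity"]]
print(entities)
```

Only the rows marked as entities would be turned into annotations; the rest serve as reasoning context.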
In general, the most recent version of a task should be the version to try first. Keep in mind, though, that longer prompts can also be more expensive, so if costs become a burden it can make sense to downgrade to a task that generates shorter prompts.
Using spacy-llm recipes for spans
If you’re interested in annotating spans that can overlap, you can use the
spans.llm.correct
and spans.llm.fetch
recipes. These recipes are
very similar to their ner.llm.correct
and ner.llm.fetch
counterparts, but their configuration allows span overlap and can be used to
train models for span categorisation. In our example, that might mean that you’d
also be interested in detecting an ingredient if it was part of the name of a
dish.
The main difference, compared to the previous configuration file, is that you’d use the spacy.SpanCat task instead of spacy.NER.
Here’s what a revised configuration file might look like.
Basic spacy-llm config for spans
[nlp]
lang = "en"
pipeline = ["llm"]
[components]
[components.llm]
factory = "llm"
save_io = true
[components.llm.task]
@llm_tasks = "spacy.SpanCat.v2"
labels = ["DISH", "INGREDIENT", "EQUIPMENT"]
[components.llm.task.label_definitions]
DISH = "Extract the name of a known dish."
INGREDIENT = "Extract the name of a cooking ingredient, including herbs and spices."
EQUIPMENT = "Extract any mention of cooking equipment. e.g. oven, cooking pot, grill"
[components.llm.model]
@llm_models = "spacy.GPT-3-5.v1"
config = {"temperature": 0.3}
[components.llm.task.examples]
@misc = "spacy.FewShotReader.v1"
path = "span_examples.yaml"
[components.llm.cache]
@llm_misc = "spacy.BatchCache.v1"
path = "local-cached"
batch_size = 3
max_batches_in_mem = 10
This file refers to a span_examples.yaml
file, which might look like this:
span_examples.yaml
- text: 'Mac and Cheese is a popular American pasta variant.'
  entities:
    INGREDIENT: ['Cheese']
    DISH: ['Mac and Cheese']
This configuration will generate a slightly different prompt mainly to make it clear that the spans can overlap.
Example prompt for spans
You are an expert Named Entity Recognition (NER) system. Your task is to accept
Text as input and extract named entities for the set of predefined entity
labels.
The entities you extract for each label can overlap with each other.
From the Text input provided, extract named entities for each label in
the following format:
DISH:
EQUIPMENT:
INGREDIENT:
Below are definitions of each label to help aid you in what kinds of named entities
to extract for each label. Assume these definitions are written by an expert and
follow them closely.
DISH: Extract the name of a known dish.
INGREDIENT: Extract the name of a cooking ingredient, including herbs and spices.
EQUIPMENT: Extract any mention of cooking equipment. e.g. oven, cooking pot, grill.
Below are some examples (only use these as a guide):
Text:
'''
Mac and Cheese is a popular American pasta
variant.
'''
INGREDIENT: Cheese
DISH: Mac and Cheese
Here is the text that needs labeling:
Text:
'''
Spaghetti Bolognaise is a dish.
'''
Example response
INGREDIENT: Spaghetti
DISH: Spaghetti Bolognaise
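The response above yields two spans that share characters, which is what separates span categorisation from NER. A small sketch, with the hypothetical helpers `find_spans` and `overlaps`, shows how the two phrases map to overlapping character offsets. It locates each phrase with a plain substring search, which is a simplification.

```python
def find_spans(text, phrases_by_label):
    """Locate each phrase in the text and return spans, longest first per start."""
    spans = []
    for label, phrases in phrases_by_label.items():
        for phrase in phrases:
            start = text.find(phrase)
            if start >= 0:
                spans.append({"start": start, "end": start + len(phrase), "label": label})
    return sorted(spans, key=lambda span: (span["start"], -span["end"]))


def overlaps(a, b):
    """Two spans overlap when they share at least one character."""
    return a["start"] < b["end"] and b["start"] < a["end"]


text = "Spaghetti Bolognaise is a dish."
spans = find_spans(text, {"INGREDIENT": ["Spaghetti"], "DISH": ["Spaghetti Bolognaise"]})
for span in spans:
    print(span["label"], repr(text[span["start"]:span["end"]]))
print(overlaps(spans[0], spans[1]))
```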
The overlapping nature of these spans is also reflected in the annotation
interface when you use the spans.llm.correct
recipe.
Example
dotenv run -- prodigy spans.llm.correct annotated-recipes config.cfg examples.jsonl
Note how the provided annotations are now nested.
Just like before, you may also choose to fetch these examples upfront using the
spans.llm.fetch
recipe. This is the *.fetch
variant of the original
*.correct
recipe.
Example
dotenv run -- prodigy spans.llm.fetch config.cfg examples.jsonl spancat-annotated.jsonl
100%|████████████████████████████| 50/50 [00:12<00:00, 3.88it/s]
More performance with spacy.SpanCat.v3
Just like with named entities, you can add chain of thought
reasoning for spans too via the
spacy.SpanCat.v3
task. Like before, the setup is largely the same but you’ll be required to add
examples to the config.cfg
file if you plan on using the v3
task.
[paths]
examples = null
[nlp]
lang = "en"
pipeline = ["llm"]
[components]
[components.llm]
factory = "llm"
[components.llm.task]
@llm_tasks = "spacy.SpanCat.v3"
labels = ["DISH", "INGREDIENT", "EQUIPMENT"]
description = "Entities are the names food dishes, ingredients, and any kind of cooking equipment. Adjectives, verbs, adverbs are not entities. Pronouns are not entities."
[components.llm.task.label_definitions]
DISH = "Known food dishes, e.g. Lobster Ravioli, garlic bread"
INGREDIENT = "Individual parts of a food dish, including herbs and spices."
EQUIPMENT = "Any kind of cooking equipment. e.g. oven, cooking pot, grill"
[components.llm.task.examples]
@misc = "spacy.FewShotReader.v1"
path = "llm-examples.json"
[components.llm.model]
@llm_models = "spacy.GPT-3-5.v1"
[
{
"text": "Spaghetti Bolognaise is a great dish.",
"spans": [
{
"text": "Spaghetti",
"is_entity": true,
"label": "INGREDIENT",
"reason": "It is part of the dish name, but it indicates a key ingredient."
},
{
"text": "Spaghetti Bolognaise",
"is_entity": true,
"label": "DISH",
"reason": "It is the name of a popular pasta dish."
}
]
}
]
You are an expert Entity Recognition system.
Your task is to accept Text as input and extract named entities.
The entities you extract can overlap with each other.
Entities must have one of the following labels: DISH, EQUIPMENT, INGREDIENT.
If a span is not an entity label it: `==NONE==`.
Entities are the names food dishes,
ingredients, and any kind of cooking equipment.
Adjectives, verbs, adverbs are not entities.
Pronouns are not entities.
Below are definitions of each label to help aid you in what kinds of named entities to extract for each label.
Assume these definitions are written by an expert and follow them closely.
DISH: Known food dishes, e.g. Lobster Ravioli, garlic bread
INGREDIENT: Individual parts of a food dish, including herbs and spices.
EQUIPMENT: Any kind of cooking equipment. e.g. oven, cooking pot, grill
Q: Given the paragraph below, identify a list of entities, and for each entry explain why it is or is not an entity:
Paragraph: Spaghetti Bolognaise is a great dish.
Answer:
1. Spaghetti | True | INGREDIENT | It is part of the dish name, but it indicates a key ingredient.
2. Spaghetti Bolognaise | True | DISH | It is the name of a popular pasta dish.
Paragraph: Spaghetti Bolognaise is a great dish.
Answer:
1. Spaghetti | True | INGREDIENT | It is part of the dish name, but it indicates a key ingredient.
2. Spaghetti Bolognaise | True | DISH | It is the name of a popular pasta dish.
3. Bolognaise | False | NONE | While it is part of the dish name, it refers to the specific type of sauce used in the dish and not a standalone entity.
4. great dish | False | NONE | These are adjectives describing the dish and not entities.
5. Spaghetti Bolognaise | True | DISH | It is the name of a popular pasta dish.
You might notice that some candidates appear in the response but are listed as False. These are suggestions that the Prodigy interface will not render.
Using spacy-llm recipes for textcat
The textcat.llm.correct
and textcat.llm.fetch
recipes are similar
to their NER counterparts but can perform annotation for text categorisation
tasks. That means that you could take a spacy-llm
configuration file like
below:
spacy-llm-config.cfg for text categorisation
[nlp]
lang = "en"
pipeline = ["llm"]
[components]
[components.llm]
factory = "llm"
save_io = true
[components.llm.task]
@llm_tasks = "spacy.TextCat.v3"
labels = ["RECIPE", "FEEDBACK", "QUESTION"]
exclusive_classes = false
[components.llm.task.label_definitions]
RECIPE = "Cooking instructions for a dish."
FEEDBACK = "Comments that might inform the author."
QUESTION = "A question is being asked to the author."
[components.llm.model]
@llm_models = "spacy.GPT-3-5.v1"
config = {"temperature": 0.3}
[components.llm.cache]
@llm_misc = "spacy.BatchCache.v1"
path = "local-cached"
batch_size = 3
max_batches_in_mem = 10
This configuration file uses the spacy.TextCat.v3 task, which comes with different parameters than its NER counterpart. Specifically, you’ll notice that we’ve set exclusive_classes to false. For text classification we need to specify if the labels are exclusive (meaning they cannot overlap) or if they can be modelled as a set of binary classes. This is also reflected in the prompt that is generated.
Generated textcat prompt
You are an expert Text Classification system. Your task is to accept Text as input
and provide a category for the text based on the predefined labels.
Classify the text below to any of the following labels: RECIPE, FEEDBACK, QUESTION
The task is non-exclusive, so you can provide more than one label as long as
they're comma-delimited. For example: Label1, Label2, Label3.
Do not put any other text in your answer, only one or more of the provided labels with nothing before or after.
If the text cannot be classified into any of the provided labels, answer `==NONE==`.
Below are definitions of each label to help aid you in correctly classifying the text.
Assume these definitions are written by an expert and follow them closely.
RECIPE: Cooking instructions for a dish.
FEEDBACK: Comments that might inform the author.
QUESTION: A question is being asked to the author.
Here is the text that needs classification
Text:
'''
Cream cheese is really good in mashed potatoes.
'''
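The prompt above asks for a comma-delimited answer or `==NONE==`. As a sketch of what turning such an answer into structured data could look like, the function below maps a raw answer to a score per label. The name `parse_textcat_response` and the 0.0/1.0 scores are assumptions for illustration, not the parser used by the task.

```python
def parse_textcat_response(response, labels, exclusive_classes=False):
    """Map a comma-delimited answer to a {label: score} dictionary."""
    cats = {label: 0.0 for label in labels}
    answer = response.strip()
    if answer == "==NONE==":
        return cats  # the model found no fitting label
    predicted = [part.strip() for part in answer.split(",")]
    predicted = [label for label in predicted if label in cats]  # drop unknown labels
    if exclusive_classes:
        predicted = predicted[:1]  # exclusive tasks keep a single label
    for label in predicted:
        cats[label] = 1.0
    return cats


labels = ["RECIPE", "FEEDBACK", "QUESTION"]
print(parse_textcat_response("FEEDBACK, QUESTION", labels))
print(parse_textcat_response("==NONE==", labels))
```

With `exclusive_classes=True` only the first recognised label is kept, which mirrors the difference between exclusive and non-exclusive tasks described above.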
To use this configuration file directly you may use the
textcat.llm.correct
recipe to curate the annotations given by the large
language model.
Example
dotenv run -- prodigy textcat.llm.correct annotated-recipes spacy-llm-config.cfg examples.jsonl
Alternatively, you may also choose to fetch them upfront, so that the annotations can be used later in textcat.manual.
Example
dotenv run -- prodigy textcat.llm.fetch spacy-llm-config.cfg examples.jsonl textcat-annotated.jsonl
100%|████████████████████████████| 50/50 [00:12<00:00, 3.88it/s]
LLMs for terminology lists New: 1.13.2
The terms.llm.fetch
recipe can generate terms and phrases obtained from a
large language model. These terms and phrases can then be curated and turned
into patterns files, which can help with downstream annotation tasks.
To get started, you’ll need to create a configuration file for spacy-llm.
Example spacy-llm config for terms
[nlp]
lang = "en"
pipeline = ["llm"]
[components]
[components.llm]
factory = "llm"
[components.llm.task]
@llm_tasks = "prodigy.Terms.v1"
batch_size = 50
[components.llm.model]
@llm_models = "spacy.GPT-3-5.v1"
config = {"temperature": 0.3}
This configuration file describes the task as well as the backend to use.
From here you can use the terms recipe by describing the topic that you’d like to generate terms for. The example below demonstrates how to generate “skateboard tricks”.
Example
prodigy terms.llm.fetch skateboard-trick-terms config.cfg "skateboard tricks"
100%|████████████████████████████| 50/50 [00:12<00:00, 3.88it/s]
This will generate a list of skateboard tricks that are stored in the
skateboard-trick-terms
dataset.
The examples that have been generated may look something like this:
{"text":"pop shove it","meta":{"topic":"skateboard tricks"}}
{"text":"switch flip","meta":{"topic":"skateboard tricks"}}
{"text":"nose slides","meta":{"topic":"skateboard tricks"}}
{"text":"lazerflip","meta":{"topic":"skateboard tricks"}}
{"text":"lipslide","meta":{"topic":"skateboard tricks"}}
Given such a file you’re able to review the generated terms via the
textcat.manual
recipe.
Example
prodigy textcat.manual skateboard-tricks skateboard-tricks.jsonl --label tricks
From this interface you can manually accept or reject each example. Then, when
you’re done annotating, you can export the annotated text into a patterns file
via the terms.to-patterns
recipe.
Example
prodigy terms.to-patterns skateboard-tricks ./skateboard-patterns.jsonl --label skateboard-trick --spacy-model blank:en
✨ Exported 129 patterns
./skateboard-patterns.jsonl
This will generate a file with patterns, like those shown below.
{"label":"skateboard-trick","pattern":[{"lower":"pop"},{"lower":"shove"},{"lower":"it"}]}
{"label":"skateboard-trick","pattern":[{"lower":"switch"},{"lower":"flip"}]}
{"label":"skateboard-trick","pattern":[{"lower":"nose"},{"lower":"slides"}]}
{"label":"skateboard-trick","pattern":[{"lower":"lazerflip"}]}
{"label":"skateboard-trick","pattern":[{"lower":"lipslide"}]}
From here, the skateboard-patterns.jsonl
file can be used in recipes, like
ner.manual
, to make the annotation task easier.
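To make the mapping from terms to patterns concrete, here is a minimal Python sketch of the conversion. It is not the implementation of terms.to-patterns: it assumes that splitting on whitespace is a good enough stand-in for the blank:en tokenizer, and the helper names term_to_pattern and accepted_terms are hypothetical.

```python
import json


def term_to_pattern(term, label):
    """Build a token pattern with one lowercase token per whitespace-separated word."""
    tokens = term.lower().split()
    return {"label": label, "pattern": [{"lower": token} for token in tokens]}


def accepted_terms(lines):
    """Yield the text of each JSONL record, skipping records that were not accepted.

    Records without an "answer" key are treated as accepted (an assumption
    made for this sketch).
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if record.get("answer", "accept") == "accept":
            yield record["text"]


lines = [
    '{"text":"pop shove it","meta":{"topic":"skateboard tricks"}}',
    '{"text":"lazerflip","meta":{"topic":"skateboard tricks"}}',
]
patterns = [term_to_pattern(term, "skateboard-trick") for term in accepted_terms(lines)]
for pattern in patterns:
    print(json.dumps(pattern))
```

A real tokenizer may split punctuation or hyphens differently, so the recipe remains the safer way to produce the final patterns file.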
Prompt engineering via tournaments New: 1.13.2
Sometimes you’d like to compare and benchmark prompts for a specific task. You could facilitate this with an A/B test, but if you have a large pool of prompts you may prefer to use a tournament to figure out the best performing candidate. This is especially helpful when you’re not just comparing prompts, but also different LLM backends.
This is where the ab.llm.tournament
recipe might help. It uses the Glicko
rating system internally to determine the duels as well as the best performing
prompt/LLM combination.
As an example, let’s assume that we want to write humorous haikus about a given topic. Then you could create two Jinja2 templates that can each accept a topic, yet construct a different prompt.
prompts/prompt1.jinja2
Write a haiku about {{topic}} that rhymes.
prompts/prompt2.jinja2
Write a hilarious haiku about {{topic}} that rhymes.
These prompts both require a topic
to be injected, which you can provide via a
.jsonl
file.
inputs.jsonl
{"topic": "Python"}
{"topic": "star wars"}
{"topic": "maths"}
Next, the recipe will also require spacy-llm
configuration files, but you can
also prepare a folder of these files if you want to compare more than one LLM
backend. As an example, let’s configure one file to use OpenAI and another to
use Cohere.
configs/gpt3-5.cfg
[nlp]
lang = "en"
pipeline = ["llm"]
[components]
[components.llm]
factory = "llm"
[components.llm.task]
@llm_tasks = "prodigy.TextPrompter.v1"
[components.llm.model]
@llm_models = "spacy.GPT-3-5.v1"
config = {"temperature": 0.3}
configs/cohere.cfg
[nlp]
lang = "en"
pipeline = ["llm"]
[components]
[components.llm]
factory = "llm"
[components.llm.task]
@llm_tasks = "prodigy.TextPrompter.v1"
[components.llm.model]
@llm_models = "spacy.Command.v1"
name = "command"
config = {"temperature": 0.1}
Finally, as a nice touch, the recipe can also render some extra information to give the user some context. This is also handled by a Jinja2 template.
display-template.jinja2
Select the best haiku about {{topic}}.
You can now build a tournament to try all the different combinations of prompts
and backends by calling the ab.llm.tournament
recipe as follows:
Example
prodigy ab.llm.tournament haiku-tournament inputs.jsonl ./prompts ./configs display-template.jinja2 --resume
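To get a feeling for how many candidates and duels such a tournament contains, the sketch below enumerates them for the two prompts and two configs from this example. It only illustrates the combinatorics. The recipe itself decides which duels to present, and the bracketed naming is simply copied from the terminal summary.

```python
from itertools import combinations, product

prompts = ["prompt1.jinja2", "prompt2.jinja2"]
configs = ["cohere.cfg", "gpt3-5.cfg"]

# Every prompt is paired with every backend config.
candidates = [f"[{prompt} + {config}]" for prompt, config in product(prompts, configs)]

# Every unordered pair of candidates is a possible duel.
duels = list(combinations(candidates, 2))

print(len(candidates), "candidates,", len(duels), "possible duels")
for left, right in duels:
    print(left, "vs", right)
```

The number of possible duels grows quadratically with the number of candidates, which is why a rating system that picks informative duels is more practical than comparing every pair equally often.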
From here, the recipe will keep generating candidates and will present you with an interface like below.
As you annotate and choose between the candidates you’ll also get a summary printed on the terminal.
Output after annotating a few examples
============== Current winner: [prompt1.jinja2 + gpt3-5.cfg] ==============
comparison                                                     prob   trials
[prompt1.jinja2 + gpt3-5.cfg] > [prompt1.jinja2 + cohere.cfg]  0.50   0
[prompt1.jinja2 + gpt3-5.cfg] > [prompt2.jinja2 + cohere.cfg]  0.50   0
[prompt1.jinja2 + gpt3-5.cfg] > [prompt2.jinja2 + gpt3-5.cfg]  0.71   1
Initially this table will show low trial counts and probabilities close to 0.50. As you annotate more, these numbers will converge and the tournament will pick the winning candidates more often.
Output after annotating more examples, ratings will converge
============== Current winner: [prompt1.jinja2 + gpt3-5.cfg] ==============
comparison                                                     prob   trials
[prompt1.jinja2 + gpt3-5.cfg] > [prompt1.jinja2 + cohere.cfg]  0.55   23
[prompt1.jinja2 + gpt3-5.cfg] > [prompt2.jinja2 + cohere.cfg]  0.82   18
[prompt1.jinja2 + gpt3-5.cfg] > [prompt2.jinja2 + gpt3-5.cfg]  0.91   12
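The probabilities in these tables can be understood through the expected-outcome formula of the Glicko rating system. The sketch below implements that published formula. How Prodigy computes its numbers internally is not shown here, so treat the function name win_probability and the starting values (rating 1500, rating deviation 350, the conventional Glicko defaults) as assumptions.

```python
import math

Q = math.log(10) / 400


def g(rd):
    """Glicko attenuation factor: shrinks towards 0 as the rating deviation grows."""
    return 1 / math.sqrt(1 + 3 * Q**2 * rd**2 / math.pi**2)


def win_probability(rating_a, rd_a, rating_b, rd_b):
    """Expected probability that candidate A beats candidate B."""
    combined_rd = math.sqrt(rd_a**2 + rd_b**2)
    exponent = -g(combined_rd) * (rating_a - rating_b) / 400
    return 1 / (1 + 10**exponent)


# Two unrated candidates: no evidence yet, so the duel is a coin flip.
print(win_probability(1500, 350, 1500, 350))

# The same rating gap counts for more once the ratings are less uncertain.
print(win_probability(1600, 350, 1500, 350))
print(win_probability(1600, 50, 1500, 50))
```

This mirrors the behaviour described above: with zero trials the probability sits at 0.50, and as uncertainty shrinks the same difference in rating produces a more decisive probability.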
Getting Started with OpenAI and Prodigy New: 1.12
If you want to get started with OpenAI and Prodigy you’ll want to set everything up so that you can work swiftly but also securely.
- Account Setup: You will need to set up an account for OpenAI, which you can do here. You can choose to use a free account as you’re testing the service but you can consider paying as you go as well. Their pricing page gives lots of details. Be aware that new accounts have some extra rate-limit restrictions which are described in detail here.
- Keys and a .env file: Once your account is set up it’s time to set up API keys, which you can do here. The Prodigy recipes will assume that your keys are stored in a local .env file in your current working directory. It needs to contain a PRODIGY_OPENAI_KEY, which you’ve just created, and a PRODIGY_OPENAI_ORG, which you can find here. This is what the .env file would look like:
  .env
  PRODIGY_OPENAI_ORG = "org-..."
  PRODIGY_OPENAI_KEY = "sk-..."
  You should make sure that this dotenv file is added to a .gitignore such that it never gets uploaded. If somebody were to gain access to this key they might incur costs on your behalf with it.
- Keep an eye on costs: the OpenAI service will incur costs on every request that you make. In order to prevent a large unexpected bill, we recommend setting a spending cap on the API. This can make sure you never spend more than a predefined amount per month. You can configure this by going to the usage limits section of the account page on OpenAI.
- (Optional) Customise a recipe: The recipes provided by Prodigy are designed to be generally useful, but there can be good reasons to go beyond the zero-shot defaults. The NER and textcat recipes allow you to contribute examples to the prompt which might improve the output of OpenAI. These recipes also allow you to write custom prompts for OpenAI, which can be relevant if you’re interested in generating responses for non-English languages.
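Before starting a recipe it can help to check that the .env file defines both keys. The following is a minimal sketch that parses the simple KEY = "value" format shown above using only the standard library. The helper names parse_env and missing_keys are hypothetical, and the parser does not aim to cover the full dotenv syntax.

```python
REQUIRED_KEYS = ("PRODIGY_OPENAI_ORG", "PRODIGY_OPENAI_KEY")


def parse_env(text):
    """Parse KEY = "value" lines, ignoring blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env_text):
    """Return the required keys that are absent or empty."""
    values = parse_env(env_text)
    return [key for key in REQUIRED_KEYS if not values.get(key)]


example = 'PRODIGY_OPENAI_ORG = "org-..."\nPRODIGY_OPENAI_KEY = "sk-..."\n'
print(missing_keys(example))  # prints []
```

In practice you would pass the contents of your own file, for example the text read from .env in the current working directory, and stop early if the returned list is not empty.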
Usage
This section explains how each OpenAI recipe works in detail by describing the prompt that is sent to OpenAI as well as the response that is returned.
OpenAI for NER New: 1.12
The ner.openai.correct
and ner.openai.fetch
recipes leverage
OpenAI to pre-annotate named entities in text that you provide. As a motivating
example, let’s assume that we have a set of example texts that contain comments
on a food recipe blog. It might have an example that looks like this:
examples.jsonl
{"text": "Sriracha sauce goes really well with hoisin stir fry, but you should add it after you use the wok."}
Given an examples file, you can use ner.openai.correct
to help with
annotating.
Example
prodigy ner.openai.correct recipe-ner examples.jsonl --label "dish,ingredient,equipment"
Internally, this recipe will take the provided labels ("dish"
, "ingredient"
and "equipment"
) together with the provided text in each example to generate a
prompt for OpenAI. Here’s what such a prompt would look like:
NER prompt sent to OpenAI
From the text below, extract the following entities in the following format:
dish: <comma delimited list of strings>
ingredient: <comma delimited list of strings>
equipment: <comma delimited list of strings>
Text:
"""
Sriracha sauce goes really well with hoisin stir fry, but you should add it after you use the wok.
"""
This prompt is sent to OpenAI, which will return with a response. This response is not deterministic, but it might look something like this:
NER response from OpenAI
dish: hoisin stir fry
ingredient: Sriracha sauce
equipment: wok
If you use the ner.openai.correct
recipe then you’ll be able to see this
prompt and response from inside Prodigy.
Because this example is annotated correctly, you can simply accept the annotation without having to use your mouse to annotate the entities. This can save a tremendous amount of time, but it should be stressed that the OpenAI annotations can be wrong. You still want to be in the loop to curate the annotations.
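To illustrate how a response in this format can be turned into highlighted spans, here is a rough Python sketch. It is not the parser that Prodigy uses: the function names are hypothetical, matching is case-insensitive, and only the first occurrence of each phrase is located.

```python
def parse_ner_response(response, labels):
    """Map each known label to the list of phrases the model returned for it."""
    wanted = {label.lower(): label for label in labels}
    found = {label: [] for label in labels}
    for line in response.splitlines():
        name, sep, rest = line.partition(":")
        key = name.strip().lower()
        if not sep or key not in wanted:
            continue  # ignore lines that are not "<known label>: ..."
        phrases = [part.strip() for part in rest.split(",")]
        found[wanted[key]].extend(phrase for phrase in phrases if phrase)
    return found


def to_spans(text, entities):
    """Locate each phrase in the text and return character-offset spans."""
    spans = []
    lowered = text.lower()
    for label, phrases in entities.items():
        for phrase in phrases:
            start = lowered.find(phrase.lower())
            if start == -1:
                continue  # the model returned something that is not in the text
            spans.append({"start": start, "end": start + len(phrase), "label": label})
    return sorted(spans, key=lambda span: span["start"])


text = "Sriracha sauce goes really well with hoisin stir fry, but you should add it after you use the wok."
response = "dish: hoisin stir fry\ningredient: Sriracha sauce\nequipment: wok"
entities = parse_ner_response(response, ["dish", "ingredient", "equipment"])
for span in to_spans(text, entities):
    print(span["label"], "->", text[span["start"]:span["end"]])
```

Phrases that cannot be found in the text are dropped silently in this sketch, which is one of the reasons a human review step remains useful.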
Alternatively, you can also use ner.openai.fetch
to download these
annotations on disk such that you can later review them with the
ner.manual
recipe. This interface won’t give you the prompt information,
but does allow you to call OpenAI once even if you want to show the data to
multiple annotators.
Example
prodigy ner.openai.fetch examples.jsonl ner-annotated.jsonl --label "dish,ingredient,equipment"
100%|████████████████████████████| 50/50 [00:12<00:00, 3.88it/s]
OpenAI for Text Classification New: 1.12
The textcat.openai.correct
and textcat.openai.fetch
recipes
leverage OpenAI to attach classification labels in text that you provide. As a
motivating example, let’s assume that we have a set of example texts that
contain comments on a food recipe blog. It might have an example that looks like
this:
examples.jsonl
{"text": "Cream cheese is really good in mashed potatoes."}
Given an examples file, you can use textcat.openai.correct
to help with
annotating labels.
Example
prodigy textcat.openai.correct recipe-comments-textcat examples.jsonl --label "recipe,feedback,question"
Internally, this recipe will take the provided labels ("recipe"
, "feedback"
and "question"
) together with the provided text in each example to generate a
prompt for OpenAI. Here’s what such a prompt would look like:
Textcat prompt sent to OpenAI
Classify the text below to any of the following labels: recipe, feedback, question
The task is non-exclusive, so you can provide more than one label as long as they are comma-delimited.
For example: Label1, Label2, Label3.
Your answer should only be in the following format:
answer:
reason:
Here is the text that needs classification
Text:
"""
Cream cheese is really good in mashed potatoes.
"""
This prompt is sent to OpenAI, which will return with a response. This response is not deterministic, but it might look something like this:
Textcat response from OpenAI
Answer: Feedback
Reason: The text does not provide instructions on how to make something, nor does it ask a question. Instead, it
provides an opinion on the use of cream cheese in mashed potatoes.
If you use the textcat.openai.correct
recipe then you’ll be able to see
this prompt and response from inside Prodigy.
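As a rough illustration of how such a response could be turned into labels, consider the Python sketch below. It is an assumption-laden stand-in rather than the recipe's own parser: the function name is hypothetical, label matching is case-insensitive, and everything after "reason:" is treated as part of the reason.

```python
def parse_textcat_response(response, labels):
    """Extract the accepted labels and the reason from an answer/reason response."""
    known = {label.lower(): label for label in labels}
    answer, reason = "", ""
    lines = response.splitlines()
    for index, line in enumerate(lines):
        lowered = line.strip().lower()
        if lowered.startswith("answer:"):
            answer = line.split(":", 1)[1].strip()
        elif lowered.startswith("reason:"):
            # The reason may wrap over the remaining lines of the response.
            first = line.split(":", 1)[1].strip()
            rest = [other.strip() for other in lines[index + 1:]]
            reason = " ".join([first] + rest).strip()
            break
    accepted = []
    for part in answer.split(","):
        key = part.strip().lower()
        if key in known:
            accepted.append(known[key])
    return {"accept": accepted, "reason": reason}


response = (
    "Answer: Feedback\n"
    "Reason: The text does not ask a question. Instead, it\n"
    "provides an opinion."
)
print(parse_textcat_response(response, ["recipe", "feedback", "question"]))
```

Answers that do not match any of the provided labels are ignored here, so an unexpected response simply yields an empty list of accepted labels.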
Alternatively, you can also use textcat.openai.fetch
to download these
annotations on disk such that you can later review them with the
textcat.manual
recipe. This interface won’t give you the prompt
information, but does allow you to call OpenAI once even if you want to show
the data to multiple annotators.
Example
prodigy textcat.openai.fetch examples.jsonl textcat-annotated.jsonl --label "recipe,feedback,question"
100%|████████████████████████████| 50/50 [00:12<00:00, 3.88it/s]
OpenAI for Terminology Lists New: 1.12
The terms.openai.fetch
recipe can generate terms and phrases obtained
from OpenAI. These terms can then be curated and turned into patterns files,
which can help with downstream annotation tasks.
To get started, you need to make a query to send to OpenAI. The example below demonstrates how to generate at least 100 examples of “skateboard tricks”.
Example
prodigy terms.openai.fetch "skateboard tricks" skateboard-tricks.jsonl --n 100
Alternatively, you may also want to steer the response from OpenAI by providing
some examples. You can add such seed terms via the --seeds
option.
Example with seeds
prodigy terms.openai.fetch "skateboard tricks" skateboard-tricks.jsonl --n 100 --seeds "ollie,kickflip"
If you would like to generate more examples to add to the generated file, you
can re-run the same command with the --resume
flag. This will re-use the
existing examples as seeds for the prompt to OpenAI.
Example that resumes an existing file
prodigy terms.openai.fetch "skateboard tricks" skateboard-tricks.jsonl --n 50 --resume
After generating the examples, you’ll have a skateboard-tricks.jsonl
file that
has contents that might look like this:
{"text":"pop shove it","meta":{"openai_query":"skateboard tricks"}}
{"text":"switch flip","meta":{"openai_query":"skateboard tricks"}}
{"text":"nose slides","meta":{"openai_query":"skateboard tricks"}}
{"text":"lazerflip","meta":{"openai_query":"skateboard tricks"}}
{"text":"lipslide","meta":{"openai_query":"skateboard tricks"}}
Given such a file you’re able to review the generated terms via the
textcat.manual
recipe.
Example
prodigy textcat.manual skateboard-tricks skateboard-tricks.jsonl --label tricks
From this interface you can manually accept or reject each example. Then, when
you’re done annotating, you can export the annotated text into a patterns file
via the terms.to-patterns
recipe.
Example
prodigy terms.to-patterns skateboard-tricks ./skateboard-patterns.jsonl --label skateboard-trick --spacy-model blank:en
✨ Exported 129 patterns
./skateboard-patterns.jsonl
This will generate a file with patterns, like those shown below.
{"label":"skateboard-trick","pattern":[{"lower":"pop"},{"lower":"shove"},{"lower":"it"}]}
{"label":"skateboard-trick","pattern":[{"lower":"switch"},{"lower":"flip"}]}
{"label":"skateboard-trick","pattern":[{"lower":"nose"},{"lower":"slides"}]}
{"label":"skateboard-trick","pattern":[{"lower":"lazerflip"}]}
{"label":"skateboard-trick","pattern":[{"lower":"lipslide"}]}
From here, the skateboard-patterns.jsonl
file can be used in recipes, like
ner.manual
, to make the annotation task easier.
A/B testing Custom Prompts New: 1.12
The ab.openai.prompts
recipe allows you to quickly compare the quality of
outputs from two OpenAI prompts in a quantifiable and blind way.
As an example, let’s assume that we want to write humorous haikus about a given topic. Then you could create two Jinja2 templates that can each accept a topic, yet construct a different prompt.
prompt1.jinja2
Write a haiku about {{topic}}.
prompt2.jinja2
Write a hilarious haiku about {{topic}}.
Next, you’ll want to have an input file in the appropriate format to feed these
templates. The ab.openai.prompts
recipe assumes data to be in the
following format.
topics.jsonl
{"id": 0, "prompt_args": {"topic": "Python"}}
{"id": 0, "prompt_args": {"topic": "star wars"}}
{"id": 0, "prompt_args": {"topic": "maths"}}
Finally, it helps to also have a template that can add context to the annotation interface for the user. You can define another Jinja2 template for that.
display-template.jinja2
Select the best haiku about {{topic}}.
When you put all of these templates together you can start annotating. The
command below starts the annotation interface and also uses the --repeat 4
option. This will ensure that each topic will be used to generate a prompt at
least 4 times.
Example
prodigy ab.openai.prompts haiku topics.jsonl display-template.jinja2 prompt1.jinja2 prompt2.jinja2 --repeat 4
This will generate an annotation interface like below.
When you’re done annotating, the terminal will also summarise the results for you.
Output after annotating
=============== ✨ Evaluation results ===============
✔ You preferred prompt1.jinja2
prompt1.jinja2   11
prompt2.jinja2   5
Tournaments New: 1.12
Instead of comparing two prompts with each other, you can also create a
tournament to compare any number of prompts via the ab.openai.tournament
recipe. This recipe works just like the ab.openai.prompts
recipe but it
allows you to pass a folder of prompts to create a tournament. The recipe will
then internally use
the Glicko rating system to keep
track of the best performing candidate and will select duels between prompts
accordingly.
Just like in the ab.openai.prompts
recipe you’d need prompts in .jinja2
files, but now they can reside in a folder.
prompt_folder/prompt1.jinja2
Write a haiku about {{topic}}.
prompt_folder/prompt2.jinja2
Write a hilarious haiku about {{topic}}.
prompt_folder/prompt3.jinja2
Write a super funny haiku about {{topic}}.
You will also need to have a .jsonl
file that contains the required prompt
arguments.
topics.jsonl
{"id": 0, "prompt_args": {"topic": "Python"}}
{"id": 0, "prompt_args": {"topic": "star wars"}}
{"id": 0, "prompt_args": {"topic": "maths"}}
Finally, it also helps to have a template that can add context to the annotation interface for the user. You can define another Jinja2 template for that.
display-template.jinja2
Select the best haiku about {{topic}}.
When you put all of these templates together you can start a tournament that will match prompts for comparison. The tournament recipe will use each annotation to update its internal belief of prompt performance to decide which prompts to match in the next round.
Example
prodigy ab.openai.tournament haiku-tournament topics.jsonl display-template.jinja2 prompt_folder
This will generate the same annotation interface as the ab.openai.prompts recipe.
As you annotate, you’ll also receive feedback on the prompt performance in the terminal. Early on in the annotation process this feedback might look like this:
=================== Current winner: prompt3.jinja2 ===================
comparison value count
P(prompt3.jinja2 > prompt1.jinja2) 0.60 7
P(prompt3.jinja2 > prompt2.jinja2) 0.83 3
But as you add more annotations the recipe will update its belief of ratings over all of the available prompts. So after a while, the feedback might look more like this:
=================== Current winner: prompt3.jinja2 ===================
comparison value count
P(prompt3.jinja2 > prompt1.jinja2) 0.95 19
P(prompt3.jinja2 > prompt2.jinja2) 0.97 5
Few Shot Prompts for OpenAI recipes
OpenAI will make mistakes at times. However, you can attempt to steer the large language model in the right direction by providing some examples that it got wrong before. The named entity and text classification recipes provided by Prodigy can take such examples and make sure that they appear in the prompt in the right format.
The sections below explain how to do this and also how the mechanic works in more detail.
NER
As a motivating example, let’s assume that we have a set of example texts that contain comments on a food recipe blog. It might have an example that looks like this:
examples.jsonl
{"text": "Sriracha sauce goes really well with hoisin stir fry, but you should add it after you use the wok."}
Given such an examples file, you can use ner.openai.correct
to help
annotate named entities.
Example
prodigy ner.openai.correct recipe-ner examples.jsonl --label "dish,ingredient,equipment"
This will internally generate a prompt for OpenAI.
Prompt sent to OpenAI
From the text below, extract the following entities in the following format:
dish: <comma delimited list of strings>
ingredient: <comma delimited list of strings>
equipment: <comma delimited list of strings>
Text:
"""
Sriracha sauce goes really well with hoisin stir fry, but you should add it after you use the wok.
"""
This prompt might be sufficient, but if you find examples where OpenAI makes a
lot of mistakes then it might be good to save these so that you may add them to
the prompt. To do this, you can create a ner-examples.yaml
file, which might
have the following format:
ner-examples.yaml
- text: "You can't get a great savory flavor with carob."
  entities:
    dish: []
    ingredient: ['carob']
    equipment: []
- text: "You can probably sand-blast it if it's an anodized aluminum pan."
  entities:
    dish: []
    ingredient: []
    equipment: ['anodized aluminum pan']
Next, you can add these examples to the recipe via:
Example
prodigy ner.openai.correct recipe-ner examples.jsonl --label "dish,ingredient,equipment" --examples-path ./ner-examples.yaml
With these examples added, the prompt will update and contain the examples in the right format.
Prompt sent to OpenAI
From the text below, extract the following entities in the following format:
dish: <comma delimited list of strings>
ingredient: <comma delimited list of strings>
equipment: <comma delimited list of strings>
For example:
Text:
"""
You can't get a great savory flavor with carob.
"""
dish:
ingredient: carob
equipment:
Text:
"""
You can probably sand-blast it if it's an anodized aluminum pan.
"""
dish:
ingredient:
equipment: anodized aluminum pan
Here is the text that needs predictions.
Text:
"""
Sriracha sauce goes really well with hoisin stir fry, but you should add it after you use the wok.
"""
Maximum number of examples
Because larger prompts sent to OpenAI cost more, the
recipes also provide a --max-examples
argument that limits the number of
examples added to each prompt. In that case, examples are randomly selected
on each request.
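The two mechanics described in this section, sampling a limited number of examples and placing them in the prompt, can be sketched in plain Python. This approximates the prompt shown above with string formatting instead of the actual template. The function names are hypothetical, and returning all examples when no limit is given is an assumption.

```python
import random


def select_examples(examples, max_examples=None, rng=random):
    """Return at most max_examples randomly chosen examples."""
    examples = list(examples)
    if max_examples is None or max_examples >= len(examples):
        return examples
    return rng.sample(examples, max_examples)


def build_ner_prompt(text, labels, examples=()):
    """Assemble a few-shot NER prompt that follows the layout shown above."""
    lines = ["From the text below, extract the following entities in the following format:"]
    lines += [f"{label}: <comma delimited list of strings>" for label in labels]
    if examples:
        lines.append("For example:")
        for example in examples:
            lines += ["Text:", '"""', example["text"], '"""']
            for label in labels:
                values = ", ".join(example["entities"].get(label, []))
                lines.append(f"{label}: {values}".rstrip())
        lines.append("Here is the text that needs predictions.")
    lines += ["Text:", '"""', text, '"""']
    return "\n".join(lines)


examples = [
    {
        "text": "You can probably sand-blast it if it's an anodized aluminum pan.",
        "entities": {"dish": [], "ingredient": [], "equipment": ["anodized aluminum pan"]},
    },
]
chosen = select_examples(examples, max_examples=1)
print(build_ner_prompt("Use the wok.", ["dish", "ingredient", "equipment"], chosen))
```

Because the selection happens per request, two prompts for the same text can contain different examples, which is worth keeping in mind when comparing responses.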
Text Classification
As a motivating example, let’s assume that we have a set of example texts that contain comments on a food recipe blog. It might have an example that looks like this:
comments.jsonl
{"text": "Cream cheese is really good in mashed potatoes."}
Given such an examples file, you can use textcat.openai.correct
to help
annotate classification labels.
Example
prodigy textcat.openai.correct recipe-comments-textcat comments.jsonl --label "recipe,feedback,question"
This command will internally generate the following prompt for OpenAI.
Generated OpenAI prompt
Classify the text below to any of the following labels: recipe, feedback, question
The task is non-exclusive, so you can provide more than one label as long as they are comma-delimited.
For example: Label1, Label2, Label3.
Your answer should only be in the following format:
answer:
reason:
Here is the text that needs classification
Text:
"""
Cream cheese is really good in mashed potatoes.
"""
You can add examples to this prompt but you need to be mindful of the type of classification task. Because the current task is a multilabel classification task we need to provide examples in the following format.
textcat-multilabel.yaml
- text: 'Can someone try this recipe?'
  answer: 'question'
  reason: 'It is a question about trying a recipe.'
- text:
    '1 cup of rice then egg and then mix them well. Should I add garlic last?'
  answer: 'question,recipe'
  reason: 'It is a question about the steps in making a fried rice.'
You can pass this file along via the --examples-path
flag.
Example
prodigy textcat.openai.correct recipe-comments-textcat comments.jsonl --label "recipe,feedback,question" --examples-path ./textcat-multilabel.yaml
With these arguments, we will send a larger prompt to OpenAI.
Generated OpenAI prompt with multi-label examples
Classify the text below to any of the following labels: recipe, feedback, question
The task is non-exclusive, so you can provide more than one label as long as they are comma-delimited.
For example: Label1, Label2, Label3.
Your answer should only be in the following format:
answer:
reason:
Below are some examples (only use these as a guide):
Text:
"""
Can someone try this recipe?
"""
answer: question
reason: It is a question about trying a recipe.
Text:
"""
1 cup of rice then egg and then mix them well. Should I add garlic last?
"""
answer: question,recipe
reason: It is a question about the steps in making a fried rice.
Here is the text that needs classification
Text:
"""
Cream cheese is really good in mashed potatoes.
"""
Binary Labels
The previous example mentioned a multi-label example. However, it is also possible that you’re working on a strictly binary classification task. In that situation the examples need to be slightly different because OpenAI needs to strictly “accept” or “reject” an example.
To help explain this, let’s have a look at a default call for a binary task.
Example
prodigy textcat.openai.correct recipe-comments-textcat comments.jsonl --label "recipe"
This will generate the following prompt.
Generated OpenAI prompt
From the text below, determine whether or not it contains a recipe. If it is
a recipe, answer "accept." If it is not a recipe, answer "reject."
Your answer should only be in the following format:
answer:
reason:
Text:
"""
Cream cheese is really good in mashed potatoes.
"""
Like before, you’ll want to create a file that contains the examples that you want to add to the prompt.
textcat-binary.yaml
- text: 'This is a recipe for scrambled egg: 2 eggs, 1 butter, batter them, and then fry in a hot pan for 2 minutes'
  answer: 'accept'
  reason: 'This is a recipe for making a scrambled egg'
- text: 'This is a recipe for fried rice: 1 cup of day old rice, 1 butter, 2 cloves of garlic: put them all in a wok and stir them together.'
  answer: 'accept'
  reason: 'This is a recipe for making a fried rice'
- text: "I tried it and it's not good"
  answer: 'reject'
  reason: "It doesn't talk about a recipe."
You can now pass this file to the recipe:
Example
prodigy textcat.openai.correct recipe-comments-textcat comments.jsonl --label "recipe" --examples-path ./textcat-binary.yaml
The prompt will update and look like this:
Generated OpenAI prompt with binary examples
From the text below, determine whether or not it contains a recipe. If it is
a recipe, answer "accept." If it is not a recipe, answer "reject."
Your answer should only be in the following format:
answer:
reason:
Below are some examples (only use these as a guide):
Text:
"""
This is a recipe for scrambled egg: 2 eggs, 1 butter, batter them, and then fry in a hot pan for 2 minutes
"""
answer: accept
reason: This is a recipe for making a scrambled egg
Text:
"""
This is a recipe for fried rice: 1 cup of day old rice, 1 butter, 2 cloves of garlic: put them all in a wok and stir them together.
"""
answer: accept
reason: This is a recipe for making a fried rice
Text:
"""
I tried it and it's not good
"""
answer: reject
reason: It doesn't talk about a recipe.
Here is the text that needs classification
Text:
"""
Cream cheese is really good in mashed potatoes.
"""
Maximum number of examples
Because larger prompts sent to OpenAI cost more, the
recipes also provide a --max-examples
argument that limits the number of
examples added to each prompt. In that case, examples are randomly selected
on each request.
Custom prompts for OpenAI
The recipes that Prodigy provides have been designed to work for an English use case. However, you might be interested in using these recipes for another language. If that’s the case you could try to design your own prompts to suit your specific use-case.
Prompts for NER
In order to customise the NER prompt, it helps to understand the Jinja template that Prodigy uses internally.
ner.jinja2
From the text below, extract the following entities in the following format:
{# whitespace #}
{%- for label in labels -%}
{{label}}:
{# whitespace #}
{%- endfor -%}
{# whitespace #}
{# whitespace #}
{%- if examples -%}
{# whitespace #}
For example:
{# whitespace #}
{# whitespace #}
{%- for example in examples -%}
Text:
"""
{{ example.text }}
"""
{# whitespace #}
{%- for label, substrings in example.entities.items() -%}
{{ label }}: {{ ", ".join(substrings) }}
{# whitespace #}
{%- endfor -%}
{# whitespace #}
{% endfor -%}
{%- endif -%}
{# whitespace #}
This is the example that needs prediction.
{# whitespace #}
{# whitespace #}
Text:
"""
{{text}}
"""
This template is able to take an input text, labels and examples to turn it into a prompt for OpenAI. You can run the following Python code yourself to get a feeling for it.
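import pathlib

import jinja2

# Read the template source from disk, then compile it.
template_text = pathlib.Path("ner.jinja2").read_text()
template = jinja2.Template(template_text)

# No examples are passed, so the few-shot block of the template is skipped.
prompt = template.render(
    text="Steve Jobs founded Apple in 1976.",
    labels=["name", "organisation"],
)
print(prompt)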
This example would generate the following prompt.
OpenAI Prompt
From the text below, extract the following entities in the following format:
name:
organisation:
This is the example that needs prediction.
Text:
"""
Steve Jobs founded Apple in 1976.
"""
You’re free to take the original prompt and to make changes to it. You can add details relevant to your domain or you can change the language of the prompt. Just make sure that you don’t change the names of the rendered variables and that you don’t change the structure of the prompt. The result from OpenAI will still need to get parsed and if OpenAI sends a response in an unexpected format the recipe won’t be able to render it.
In the case of NER the template assumes that labels
is a list of strings
referring to label names and text
refers to the text of the current annotation
example. The examples
variable refers to the list of examples that are passed to steer the
prompt. The NER examples section gives more details on the expected
format for that.
Once you have a custom template ready, you can refer to it from the command line.
Example with custom prompt
prodigy ner.openai.correct recipe-ner examples.jsonl --label "dish,ingredient,equipment" --prompt-path ./custom-ner.jinja2
Prompts for Textcat
Customising the prompt for text classification works in a similar fashion to the named entity variant but with a different template.
textcat.jinja2
{% if labels|length == 1 %}
{% set label = labels[0] %}
From the text below, determine whether or not it contains a {{ label }}. If it is
a {{ label }}, answer "accept." If it is not a {{ label }}, answer "reject."
{% else %}
Classify the text below to any of the following labels: {{ labels|join(", ") }}
{% if exclusive_classes %}
The task is exclusive, so only choose one label from what I provided.
{% else %}
The task is non-exclusive, so you can provide more than one label as long as
they're comma-delimited. For example: Label1, Label2, Label3.
{% endif %}
{% endif %}
{# whitespace #}
Your answer should only be in the following format:
{# whitespace #}
answer:
reason:
{# whitespace #}
{% if examples %}
Below are some examples (only use these as a guide):
{# whitespace #}
{# whitespace #}
{% for example in examples %}
Text:
"""
{{ example.text }}
"""
{# whitespace #}
answer: {{ example.answer }}
reason: {{ example.reason }}
{% endfor %}
{% endif %}
{# whitespace #}
Here is the text that needs classification
{# whitespace #}
Text:
"""
{{text}}
"""
This template is more elaborate than the named entity one because it distinguishes between binary, multilabel and exclusive classification. Like before, you’re free to make any changes you like as long as you don’t change the names of the rendered variables or the structure of the prompt. The result from OpenAI will still need to get parsed and if OpenAI sends a response in an unexpected format the recipe won’t be able to render it.
Once you’ve created a custom template you can refer to it from the command line.
Example with custom prompt
prodigy textcat.openai.correct recipe-comments-textcat examples.jsonl --label "recipe,feedback,question" --prompt-path ./custom-textcat.jinja2
Prompts for Terms
If you’re interested in generating terms in another language then you may not need to write a separate template. Instead you can use a query that describes the language that you are interested in, like below.
Example
prodigy terms.openai.fetch "Dutch insults" dutch-insults.jsonl --n 100
Alternatively, if you prefer more control, you can also choose to write a custom template by making changes to the following Jinja2 template:
terms.jinja2
Generate me {{n}} examples of {{query}}.
Here are the examples:
{%- if examples -%}
{%- for example in examples -%}
{# whitespace #}
- {{example}}
{%- endfor -%}
{%- endif -%}
{# whitespace #}
-
The query
variable is defined by the “query” argument from the command line
and the n
variable is defined via --n
. The examples in this template refer
to the --seeds
that are passed along.
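To see how these three variables come together, the sketch below rebuilds the prompt with plain string formatting. It approximates what the template renders: the exact whitespace produced by the Jinja2 whitespace control is an assumption, and build_terms_prompt is a hypothetical helper rather than part of Prodigy.

```python
def build_terms_prompt(query, n, seeds=()):
    """Approximate the rendered terms prompt for a query, a count and optional seeds."""
    lines = [f"Generate me {n} examples of {query}.", "Here are the examples:"]
    lines += [f"- {seed}" for seed in seeds]
    # The trailing open bullet invites the model to continue the list.
    lines.append("-")
    return "\n".join(lines)


print(build_terms_prompt("skateboard tricks", 100, ["ollie", "kickflip"]))
```

The seed terms appear as the first bullets of the list, so a custom template that keeps this list-shaped ending should remain compatible with the way responses are read back.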
Once you’ve created a custom template you can refer to it from the command line.
Example
prodigy terms.openai.fetch "Dutch insults" dutch-insults.jsonl --n 100 --prompt-path ./custom-terms.jinja2