Prompt Engineering

As of version 1.12, Prodigy supports recipes and tools for prompt engineering via prompt tournaments. In these tournaments, your prompts compete against each other while you annotate which one gives the best results.

Prompts as an A/B test

When you're engineering prompts, you're going to get noisy results. To determine which prompt is best, you need a quantifiable method to compare them. Prodigy offers tools and annotation interfaces for this task, along with pre-made recipes that integrate with OpenAI.

From OpenAI to Prodigy diagram



Example of A/B prompt workflow


Compare two prompts

The ab.openai.prompts recipe allows you to quickly compare the quality of outputs from two OpenAI prompts in a quantifiable and blind way. Given the two prompts and the input data below, you get an annotation interface with candidates automatically generated by OpenAI.

prompt1.jinja2
Write a haiku about {{topic}}.

prompt2.jinja2
Write a hilarious haiku about {{topic}}.

input.jsonl
{"id": 0, "prompt_args": {"topic": "Python"}}
{"id": 0, "prompt_args": {"topic": "star wars"}}
{"id": 0, "prompt_args": {"topic": "maths"}}

Tournament of prompts

The ab.openai.tournament recipe allows you to quickly create a prompt tournament from any number of OpenAI prompts. The prompts compete against each other, and as you annotate, the prompts that win more comparisons gain a higher ranking.

You can also reuse the built-in tournament classes that Prodigy provides in your own custom recipes.



prodigy ab.openai.tournament haiku-tournament input.jsonl title.jinja2 prompt_folder

====================== Current winner: prompt2.jinja2 ======================

comparison                            value     count
P(prompt2.jinja2 > prompt3.jinja2)    0.667322  5
P(prompt2.jinja2 > prompt1.jinja2)    0.753021  3
P(prompt2.jinja2 > prompt4.jinja2)    0.952123  3

... after more annotations ...

====================== Current winner: prompt2.jinja2 ======================

comparison                            value     count
P(prompt2.jinja2 > prompt3.jinja2)    0.96732   15
P(prompt2.jinja2 > prompt1.jinja2)    0.99302   7
P(prompt2.jinja2 > prompt4.jinja2)    0.99945   4
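
To give a feel for how pairwise annotations can turn into values like P(prompt2.jinja2 > prompt3.jinja2), here is a small sketch that estimates a win probability from head-to-head counts. This is an assumption-laden illustration, not the method Prodigy uses: it swaps in a simple Beta(1, 1) posterior mean per pair, so it will not reproduce the numbers in the output above. The `win_probability` and `summarize` helpers and the annotation data are hypothetical.

```python
from collections import Counter


def win_probability(wins: int, losses: int) -> float:
    """Posterior mean of a Beta(1, 1) prior updated with wins and losses."""
    return (wins + 1) / (wins + losses + 2)


def summarize(annotations) -> dict:
    """Map (a, b) to (estimated P(a > b), number of comparisons).

    `annotations` is an iterable of (winner, loser) prompt-name pairs.
    """
    wins = Counter(annotations)
    names = sorted({name for pair in wins for name in pair})
    table = {}
    for a in names:
        for b in names:
            count = wins[(a, b)] + wins[(b, a)]
            if a != b and count > 0:
                table[(a, b)] = (win_probability(wins[(a, b)], wins[(b, a)]), count)
    return table


# Hypothetical annotations: prompt2 beat prompt3 three times and lost twice.
annotations = (
    [("prompt2.jinja2", "prompt3.jinja2")] * 3
    + [("prompt3.jinja2", "prompt2.jinja2")] * 2
)

table = summarize(annotations)
for (a, b), (value, count) in table.items():
    print(f"P({a} > {b})  {value:.6f}  {count}")
```

With few comparisons the estimate stays close to 0.5, and it moves toward 0 or 1 as consistent annotations accumulate, which mirrors how the values in the output above become more extreme after more annotations.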