
Commit ee4293a

mike0sv and Liraim authored
Add llm template registry example (#1713)
* Add llm template registry example
* Add type ignore.

Co-authored-by: Vyacheslav Morov <[email protected]>
1 parent f96e33c commit ee4293a

File tree

3 files changed: +248 −35 lines

examples/future_examples/prompt_registry.ipynb

Lines changed: 205 additions & 28 deletions. Updated notebook content:
# Using Evidently Prompt Registry with Prompt Templates

In this tutorial, we’ll walk through how to use the **Evidently Prompt Registry** to store, version, and reuse prompts. We’ll also see how to connect it with **LLMJudge** and prompt templates for evaluation.

## Connect to Evidently Cloud

First, import the CloudWorkspace and authenticate with your token.
```python
from evidently.ui.workspace import CloudWorkspace
```

```python
ws = CloudWorkspace("your token")
```

(The previous revision hard-coded an empty token and the dev URL `https://pr-1885.evidently.dev/`.)
### Select Project

You need to provide your `org_id` and the project name where you want to store prompts.

```python
org_id = "your org id"
project = ws.search_project("your project name", org_id=org_id)[0]
```

(The previous revision hard-coded a specific org id and the project name "Prompts Example".)
## Create or Load a Prompt

You can create a new prompt or load an existing one by name.

```python
prompt = ws.prompts.get_or_create_prompt(project.id, "my criteria")
prompt.list_versions()
```

## Add Prompt Versions

Let’s add new versions of the prompt content. This helps you track changes over time.

```python
criteria = "aaaa"
prompt.bump_version(criteria)
```
```python
prompt.list_versions()
```

```python
prompt.get_version().content
```

```python
prompt.bump_version("bbbb")
```

```python
prompt.get_version("latest").content.as_text()
```

```python
prompt.delete_version(prompt.get_version().id)
```

```python
prompt.get_version("latest").content.as_text()
```
## Delete a Prompt Version

You can also remove versions if needed.

```python
prompt.delete()
```

```python
ws.prompts.list_prompts(project.id)
```
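The registry calls used above (`get_or_create_prompt`, `bump_version`, `list_versions`, `get_version`, `delete_version`) follow a plain versioned-store pattern. As a rough in-memory illustration only, with entirely hypothetical names and no connection to Evidently's actual storage or API internals, the lifecycle can be sketched like this:

```python
from dataclasses import dataclass, field
from itertools import count

# Hypothetical in-memory stand-in for the cloud registry; it illustrates the
# version lifecycle only, not Evidently's real implementation.
@dataclass
class ToyVersion:
    id: int
    content: str

@dataclass
class ToyPrompt:
    name: str
    _versions: list = field(default_factory=list)
    _ids: count = field(default_factory=lambda: count(1))

    def bump_version(self, content: str) -> ToyVersion:
        # Each bump appends a new version with a fresh id; old versions stay.
        v = ToyVersion(next(self._ids), content)
        self._versions.append(v)
        return v

    def list_versions(self) -> list:
        return list(self._versions)

    def get_version(self, ref="latest") -> ToyVersion:
        # "latest" resolves to the most recently bumped version.
        if ref == "latest":
            return self._versions[-1]
        return next(v for v in self._versions if v.id == ref)

    def delete_version(self, version_id: int) -> None:
        self._versions = [v for v in self._versions if v.id != version_id]

# Mirrors the notebook flow: bump "aaaa", bump "bbbb", read, delete latest.
prompt = ToyPrompt("my criteria")
prompt.bump_version("aaaa")
prompt.bump_version("bbbb")
print(prompt.get_version("latest").content)   # bbbb
prompt.delete_version(prompt.get_version().id)
print(prompt.get_version("latest").content)   # aaaa
```

After deleting the latest version, "latest" falls back to the previous one, which matches what the notebook's two `as_text()` calls around `delete_version` demonstrate.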
## Define a Judge with Criteria

Now, let’s define a **judge** that evaluates model responses using a template. We’ll use a binary classification (GOOD / BAD) with simple criteria.

```python
from evidently.llm.templates import BinaryClassificationPromptTemplate
from evidently.descriptors import LLMJudge

judge = LLMJudge(provider="openai", model="gpt-4o-mini", template=BinaryClassificationPromptTemplate(
    target_category="GOOD",
    non_target_category="BAD",
    criteria="""Classify the model’s response with the following criteria:
Correctness: Is the response factually accurate?
Clarity: Is the response easy to understand?
Relevance: Does it fully address the question?
Output only one rating: good or bad."""
))
```
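Conceptually, a binary-classification template combines the criteria with the two category labels to produce the grading prompt sent to the LLM. A hypothetical sketch of that idea (this is not `BinaryClassificationPromptTemplate`'s actual rendering logic; all names below are illustrative):

```python
# Hypothetical rendering: how a binary-classification judge template *could*
# assemble its grading prompt from criteria, category labels, and input text.
def render_binary_judge_prompt(criteria: str,
                               target_category: str,
                               non_target_category: str,
                               text: str) -> str:
    return (
        f"{criteria}\n\n"
        f"Text to evaluate:\n{text}\n\n"
        f"Answer with exactly one label: "
        f"{target_category} or {non_target_category}."
    )

prompt_text = render_binary_judge_prompt(
    criteria="Correctness: Is the response factually accurate?",
    target_category="GOOD",
    non_target_category="BAD",
    text="Paris is the capital of France.",
)
print(prompt_text)
```

The point is that the template, not the caller, owns the output-format instruction, so swapping the criteria never breaks the GOOD/BAD parsing contract.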
## Store the Judge Template in the Prompt Registry

Instead of keeping the template inline, let’s store it in the registry.

```python
template_prompt = ws.prompts.get_or_create_prompt(project.id, "my template")
template_prompt.bump_version(judge.feature.template)
```

```python
template_prompt.list_versions()
```
## Reuse the Template

You can now load the template from the registry and create a new judge.

```python
new_judge = LLMJudge(provider="openai",
                     model="gpt-4o-mini",
                     template=template_prompt.get_version().content.template)
new_judge
```
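Storing a template and loading it back is a round trip through a registry version's content. A self-contained sketch of that round-trip pattern, using hypothetical names and plain JSON in place of the registry's content objects:

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical stand-ins: a template config serialized into a stored version's
# content, then reloaded to rebuild an equivalent template. Not Evidently's API.
@dataclass
class ToyTemplate:
    target_category: str
    non_target_category: str
    criteria: str

def store_template(template: ToyTemplate) -> str:
    # What a registry version might persist: the serialized template config.
    return json.dumps(asdict(template))

def load_template(stored: str) -> ToyTemplate:
    return ToyTemplate(**json.loads(stored))

original = ToyTemplate("GOOD", "BAD", "Is the response accurate?")
stored = store_template(original)   # analogous to template_prompt.bump_version(...)
reloaded = load_template(stored)    # analogous to ...get_version().content.template
assert reloaded == original         # the round trip preserves the template
```

Because the reloaded template is equal to the original, any judge built from it evaluates with exactly the stored criteria, which is what makes registry-backed templates safe to share across notebooks.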
## Clean Up

Finally, remove the template prompt if you no longer need it.

```python
template_prompt.delete()
```
The notebook metadata is updated as well: the kernelspec display name changes from "Python 3" to "Python 3 (ipykernel)", and `language_info` moves from IPython 2 / Python 2.7.6 to IPython 3 / Python 3.11.11 (pygments lexer `ipython3`); nbformat remains 4.
