|
5 | 5 | "id": "ef55abc9", |
6 | 6 | "metadata": {}, |
7 | 7 | "source": [ |
8 | | - "[](https://colab.research.google.com/github/openlayer-ai/examples-gallery/blob/main/monitoring/llms/monitoring-llms.ipynb)\n", |
| 8 | + "[](https://colab.research.google.com/github/openlayer-ai/examples-gallery/blob/main/monitoring/llms/general-llm/monitoring-llms.ipynb)\n", |
9 | 9 | "\n", |
10 | 10 | "\n", |
11 | 11 | "# <a id=\"top\">Monitoring LLMs</a>\n", |
|
87 | 87 | "source": [ |
88 | 88 | "import openlayer\n", |
89 | 89 | "\n", |
90 | | - "client = openlayer.OpenlayerClient(\"YOUR_API_KEY_HERE\")" |
| 90 | + "client = openlayer.OpenlayerClient(\"YOUR_OPENLAYER_API_KEY_HERE\")" |
91 | 91 | ] |
92 | 92 | }, |
93 | 93 | { |
|
132 | 132 | "\n", |
133 | 133 | "[Back to top](#top)\n", |
134 | 134 | "\n", |
135 | | - "In production, as the model makes predictions, the data can be published to Openlayer. This is done with the `publish_batch_data` method. \n", |
| 135 | + "In production, as the model makes predictions, the data can be published to Openlayer. This is done with the `stream_data` method. \n", |
136 | 136 | "\n", |
137 | 137 | "The data published to Openlayer can have a column with **inference ids** and another with **timestamps** (UNIX sec format). These are both optional and, if not provided, will receive default values. The inference id is particularly important if you wish to publish ground truths at a later time. " |
138 | 138 | ] |
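
For context, here is a minimal sketch of what a single `stream_data` call could look like with explicit inference ids and timestamps. The `row` values are invented, and the `inferenceIdColumnName`/`timestampColumnName` config keys are assumptions modeled on the `inferenceIdColumnName` field visible elsewhere in this diff and on Openlayer's dataset-config guide:

```python
import time

# Hypothetical production record -- both "inference_id" and "timestamp"
# are optional; Openlayer assigns defaults when they are omitted.
row = {
    "question": "How do I reverse a list in Python?",
    "answer": "Use my_list.reverse() in place, or my_list[::-1] for a copy.",
    "inference_id": "3f2c9a10-0001",  # lets you attach ground truths later
    "timestamp": int(time.time()),    # UNIX seconds
}

config = {
    "inputVariableNames": ["question"],
    "outputColumnName": "answer",
    "inferenceIdColumnName": "inference_id",  # optional; assumed key name
    "timestampColumnName": "timestamp",       # optional; assumed key name
}

# `inference_pipeline` comes from the notebook's earlier project setup.
inference_pipeline.stream_data(stream_data=row, stream_config=config)
```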
|
148 | 148 | ] |
149 | 149 | }, |
150 | 150 | { |
151 | | - "cell_type": "code", |
152 | | - "execution_count": null, |
153 | | - "id": "deec9e95", |
| 151 | + "cell_type": "markdown", |
| 152 | + "id": "1bcf399a", |
154 | 153 | "metadata": {}, |
155 | | - "outputs": [], |
156 | 154 | "source": [ |
157 | | - "batch_1 = production_data.loc[:9]\n", |
158 | | - "batch_2 = production_data.loc[10:18]\n", |
159 | | - "batch_3 = production_data.loc[19:]" |
| 155 | + "### <a id=\"publish-batches\"> Publish to Openlayer </a>\n", |
| 156 | + "\n", |
| 157 | + "Here, we're simulating three calls to `stream_data`. In practice, this is a code snippet that lives in your inference pipeline and that gets called after the model predictions." |
160 | 158 | ] |
161 | 159 | }, |
162 | 160 | { |
163 | 161 | "cell_type": "code", |
164 | 162 | "execution_count": null, |
165 | | - "id": "25b66229", |
| 163 | + "id": "c6f7223f-f96c-4573-9825-71dc186d5c60", |
166 | 164 | "metadata": {}, |
167 | 165 | "outputs": [], |
168 | 166 | "source": [ |
169 | | - "batch_1.head()" |
170 | | - ] |
171 | | - }, |
172 | | - { |
173 | | - "cell_type": "markdown", |
174 | | - "id": "1bcf399a", |
175 | | - "metadata": {}, |
176 | | - "source": [ |
177 | | - "### <a id=\"publish-batches\"> Publish to Openlayer </a>\n", |
178 | | - "\n", |
179 | | - "Here, we're simulating three calls to `publish_batch_data`. In practice, this is a code snippet that lives in your inference pipeline and that gets called after the model predictions." |
| 167 | + "prompt = [\n", |
| 168 | + " {\"role\": \"system\", \"content\": \"You are an expert in Python (programming language).\"},\n", |
| 169 | + " {\"role\": \"user\", \"content\": \"Answer the following user question: {{ question }}\"}\n", |
| 170 | + "]" |
180 | 171 | ] |
181 | 172 | }, |
182 | 173 | { |
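
The `{{ question }}` placeholder in the prompt above is a template variable that Openlayer fills from the `question` input column (matching `inputVariableNames` in the config below). As a rough illustration of the substitution, using plain string replacement rather than Openlayer's actual templating:

```python
question = "What is a list comprehension?"

# Naive rendering of the {{ question }} template variable for one input;
# illustration only -- not Openlayer's actual templating implementation.
rendered = [
    {**message, "content": message["content"].replace("{{ question }}", question)}
    for message in prompt
]

print(rendered[1]["content"])
# Answer the following user question: What is a list comprehension?
```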
|
186 | 177 | "metadata": {}, |
187 | 178 | "outputs": [], |
188 | 179 | "source": [ |
189 | | - "batch_config = {\n", |
| 180 | + "stream_config = {\n", |
| 181 | + " \"prompt\": prompt,\n", |
190 | 182 | " \"inputVariableNames\": [\"question\"],\n", |
191 | 183 | " \"outputColumnName\": \"answer\",\n", |
192 | | - " \"inferenceIdColumnName\": \"inference_id\",\n", |
193 | 184 | "}\n" |
194 | 185 | ] |
195 | 186 | }, |
196 | 187 | { |
197 | | - "cell_type": "code", |
198 | | - "execution_count": null, |
199 | | - "id": "bde01a2b", |
| 188 | + "cell_type": "markdown", |
| 189 | + "id": "e9956786-9117-4e27-8f2b-5dff0f6eab97", |
200 | 190 | "metadata": {}, |
201 | | - "outputs": [], |
202 | 191 | "source": [ |
203 | | - "inference_pipeline.publish_batch_data(\n", |
204 | | - " batch_df=batch_1,\n", |
205 | | - " batch_config=batch_config\n", |
206 | | - ")" |
| 192 | + "You can refer to our documentation guides on [how to write configs for LLM project](https://docs.openlayer.com/how-to-guides/write-dataset-configs/llm-dataset-config) for details on other fields you can use." |
207 | 193 | ] |
208 | 194 | }, |
209 | 195 | { |
210 | 196 | "cell_type": "code", |
211 | 197 | "execution_count": null, |
212 | | - "id": "bfc3dea6", |
| 198 | + "id": "bde01a2b", |
213 | 199 | "metadata": {}, |
214 | 200 | "outputs": [], |
215 | 201 | "source": [ |
216 | | - "inference_pipeline.publish_batch_data(\n", |
217 | | - " batch_df=batch_2,\n", |
218 | | - " batch_config=batch_config\n", |
| 202 | + "inference_pipeline.stream_data(\n", |
| 203 | + " stream_data=dict(production_data.iloc[0, :]),\n", |
| 204 | + " stream_config=stream_config\n", |
219 | 205 | ")" |
220 | 206 | ] |
221 | 207 | }, |
222 | 208 | { |
223 | 209 | "cell_type": "code", |
224 | 210 | "execution_count": null, |
225 | | - "id": "159b4e24", |
| 211 | + "id": "bfc3dea6", |
226 | 212 | "metadata": {}, |
227 | 213 | "outputs": [], |
228 | 214 | "source": [ |
229 | | - "inference_pipeline.publish_batch_data(\n", |
230 | | - " batch_df=batch_3,\n", |
231 | | - " batch_config=batch_config\n", |
| 215 | + "inference_pipeline.stream_data(\n", |
| 216 | + " stream_data=dict(production_data.iloc[1, :]),\n", |
| 217 | + " stream_config=stream_config\n", |
232 | 218 | ")" |
233 | 219 | ] |
234 | 220 | }, |
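
Since `stream_data` publishes one row per call, the per-row cells above generalize to a loop in a real pipeline. A sketch, assuming `production_data` is the pandas DataFrame of production records loaded earlier in the notebook:

```python
# Stream each remaining production row, mirroring how the snippet would
# run inside an inference pipeline after every prediction.
for _, row in production_data.iloc[2:].iterrows():
    inference_pipeline.stream_data(
        stream_data=row.to_dict(),
        stream_config=stream_config,
    )
```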
|
366 | 352 | "name": "python", |
367 | 353 | "nbconvert_exporter": "python", |
368 | 354 | "pygments_lexer": "ipython3", |
369 | | - "version": "3.8.13" |
| 355 | + "version": "3.9.18" |
370 | 356 | } |
371 | 357 | }, |
372 | 358 | "nbformat": 4, |
|