|
5 | 5 | "id": "ef55abc9", |
6 | 6 | "metadata": {}, |
7 | 7 | "source": [ |
8 | | - "[](https://colab.research.google.com/github/openlayer-ai/examples-gallery/blob/main/monitoring/llms/monitoring-llms.ipynb)\n", |
| 8 | + "[](https://colab.research.google.com/github/openlayer-ai/examples-gallery/blob/main/monitoring/llms/general-llm/monitoring-llms.ipynb)\n", |
9 | 9 | "\n", |
10 | 10 | "\n", |
11 | 11 | "# <a id=\"top\">Monitoring LLMs</a>\n", |
|
87 | 87 | "source": [ |
88 | 88 | "import openlayer\n", |
89 | 89 | "\n", |
90 | | - "client = openlayer.OpenlayerClient(\"YOUR_API_KEY_HERE\")" |
| 90 | + "client = openlayer.OpenlayerClient(\"YOUR_OPENLAYER_API_KEY_HERE\")" |
91 | 91 | ] |
92 | 92 | }, |
93 | 93 | { |
|
132 | 132 | "\n", |
133 | 133 | "[Back to top](#top)\n", |
134 | 134 | "\n", |
135 | | - "In production, as the model makes predictions, the data can be published to Openlayer. This is done with the `publish_batch_data` method. \n", |
| 135 | + "In production, as the model makes predictions, the data can be published to Openlayer. This is done with the `stream_data` method. \n", |
136 | 136 | "\n", |
137 | 137 | "The data published to Openlayer can have a column with **inference ids** and another with **timestamps** (UNIX sec format). These are both optional and, if not provided, will receive default values. The inference id is particularly important if you wish to publish ground truths at a later time. " |
138 | 138 | ] |
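
For context, here is a minimal sketch of what a single `stream_data` call could look like with explicit inference ids and timestamps. The `row` values are invented, and the `inferenceIdColumnName`/`timestampColumnName` config keys are assumptions modeled on the `inferenceIdColumnName` field visible elsewhere in this diff and on Openlayer's dataset-config guide:

```python
import time

# Hypothetical production record -- both "inference_id" and "timestamp"
# are optional; Openlayer assigns defaults when they are omitted.
row = {
    "question": "How do I reverse a list in Python?",
    "answer": "Use my_list.reverse() in place, or my_list[::-1] for a copy.",
    "inference_id": "3f2c9a10-0001",  # lets you attach ground truths later
    "timestamp": int(time.time()),    # UNIX seconds
}

config = {
    "inputVariableNames": ["question"],
    "outputColumnName": "answer",
    "inferenceIdColumnName": "inference_id",  # optional; assumed key name
    "timestampColumnName": "timestamp",       # optional; assumed key name
}

# `inference_pipeline` comes from the notebook's earlier project setup.
inference_pipeline.stream_data(stream_data=row, stream_config=config)
```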
|
148 | 148 | ] |
149 | 149 | }, |
150 | 150 | { |
151 | | - "cell_type": "code", |
152 | | - "execution_count": null, |
153 | | - "id": "deec9e95", |
| 151 | + "cell_type": "markdown", |
| 152 | + "id": "1bcf399a", |
154 | 153 | "metadata": {}, |
155 | | - "outputs": [], |
156 | 154 | "source": [ |
157 | | - "batch_1 = production_data.loc[:9]\n", |
158 | | - "batch_2 = production_data.loc[10:18]\n", |
159 | | - "batch_3 = production_data.loc[19:]" |
| 155 | + "### <a id=\"publish-batches\"> Publish to Openlayer </a>\n", |
| 156 | + "\n", |
| 157 | + "Here, we're simulating three calls to `stream_data`. In practice, this is a code snippet that lives in your inference pipeline and that gets called after the model predictions." |
160 | 158 | ] |
161 | 159 | }, |
162 | 160 | { |
163 | 161 | "cell_type": "code", |
164 | 162 | "execution_count": null, |
165 | | - "id": "25b66229", |
| 163 | + "id": "c6f7223f-f96c-4573-9825-71dc186d5c60", |
166 | 164 | "metadata": {}, |
167 | 165 | "outputs": [], |
168 | 166 | "source": [ |
169 | | - "batch_1.head()" |
170 | | - ] |
171 | | - }, |
172 | | - { |
173 | | - "cell_type": "markdown", |
174 | | - "id": "1bcf399a", |
175 | | - "metadata": {}, |
176 | | - "source": [ |
177 | | - "### <a id=\"publish-batches\"> Publish to Openlayer </a>\n", |
178 | | - "\n", |
179 | | - "Here, we're simulating three calls to `publish_batch_data`. In practice, this is a code snippet that lives in your inference pipeline and that gets called after the model predictions." |
| 167 | + "prompt = [\n", |
| 168 | + " {\"role\": \"system\", \"content\": \"You are an expert in Python (programming language).\"},\n", |
| 169 | + " {\"role\": \"user\", \"content\": \"Answer the following user question: {{ question }}\"}\n", |
| 170 | + "]" |
180 | 171 | ] |
181 | 172 | }, |
182 | 173 | { |
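
The `{{ question }}` placeholder in the prompt above is a template variable that Openlayer fills from the `question` input column (matching `inputVariableNames` in the config below). As a rough illustration of the substitution, using plain string replacement rather than Openlayer's actual templating:

```python
question = "What is a list comprehension?"

# Naive rendering of the {{ question }} template variable for one input;
# illustration only -- not Openlayer's actual templating implementation.
rendered = [
    {**message, "content": message["content"].replace("{{ question }}", question)}
    for message in prompt
]

print(rendered[1]["content"])
# Answer the following user question: What is a list comprehension?
```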
|
186 | 177 | "metadata": {}, |
187 | 178 | "outputs": [], |
188 | 179 | "source": [ |
189 | | - "batch_config = {\n", |
| 180 | + "stream_config = {\n", |
| 181 | + " \"prompt\": prompt,\n", |
190 | 182 | " \"inputVariableNames\": [\"question\"],\n", |
191 | 183 | " \"outputColumnName\": \"answer\",\n", |
192 | | - " \"inferenceIdColumnName\": \"inference_id\",\n", |
193 | 184 | "}\n" |
194 | 185 | ] |
195 | 186 | }, |
196 | 187 | { |
197 | | - "cell_type": "code", |
198 | | - "execution_count": null, |
199 | | - "id": "bde01a2b", |
| 188 | + "cell_type": "markdown", |
| 189 | + "id": "e9956786-9117-4e27-8f2b-5dff0f6eab97", |
200 | 190 | "metadata": {}, |
201 | | - "outputs": [], |
202 | 191 | "source": [ |
203 | | - "inference_pipeline.publish_batch_data(\n", |
204 | | - " batch_df=batch_1,\n", |
205 | | - " batch_config=batch_config\n", |
206 | | - ")" |
| 192 | + "You can refer to our documentation guides on [how to write configs for LLM project](https://docs.openlayer.com/how-to-guides/write-dataset-configs/llm-dataset-config) for details on other fields you can use." |
207 | 193 | ] |
208 | 194 | }, |
209 | 195 | { |
210 | 196 | "cell_type": "code", |
211 | 197 | "execution_count": null, |
212 | | - "id": "bfc3dea6", |
| 198 | + "id": "bde01a2b", |
213 | 199 | "metadata": {}, |
214 | 200 | "outputs": [], |
215 | 201 | "source": [ |
216 | | - "inference_pipeline.publish_batch_data(\n", |
217 | | - " batch_df=batch_2,\n", |
218 | | - " batch_config=batch_config\n", |
| 202 | + "inference_pipeline.stream_data(\n", |
| 203 | + " stream_data=dict(production_data.iloc[0, :]),\n", |
| 204 | + " stream_config=stream_config\n", |
219 | 205 | ")" |
220 | 206 | ] |
221 | 207 | }, |
222 | 208 | { |
223 | 209 | "cell_type": "code", |
224 | 210 | "execution_count": null, |
225 | | - "id": "159b4e24", |
| 211 | + "id": "bfc3dea6", |
226 | 212 | "metadata": {}, |
227 | 213 | "outputs": [], |
228 | 214 | "source": [ |
229 | | - "inference_pipeline.publish_batch_data(\n", |
230 | | - " batch_df=batch_3,\n", |
231 | | - " batch_config=batch_config\n", |
| 215 | + "inference_pipeline.stream_data(\n", |
| 216 | + " stream_data=dict(production_data.iloc[1, :]),\n", |
| 217 | + " stream_config=stream_config\n", |
232 | 218 | ")" |
233 | 219 | ] |
234 | 220 | }, |
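
Since `stream_data` publishes one row per call, the per-row cells above generalize to a loop in a real pipeline. A sketch, assuming `production_data` is the pandas DataFrame of production records loaded earlier in the notebook:

```python
# Stream each remaining production row, mirroring how the snippet would
# run inside an inference pipeline after every prediction.
for _, row in production_data.iloc[2:].iterrows():
    inference_pipeline.stream_data(
        stream_data=row.to_dict(),
        stream_config=stream_config,
    )
```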
|
366 | 352 | "name": "python", |
367 | 353 | "nbconvert_exporter": "python", |
368 | 354 | "pygments_lexer": "ipython3", |
369 | | - "version": "3.8.13" |
| 355 | + "version": "3.9.18" |
370 | 356 | } |
371 | 357 | }, |
372 | 358 | "nbformat": 4, |
|