Add Evaluations area with Weave as a supported eval flow #917

Open · wants to merge 7 commits into base: main
Changes from 4 commits
3 changes: 3 additions & 0 deletions docs/guides/evaluations/evaluate-models-tables.md
@@ -0,0 +1,3 @@
---
title: Evaluate models with tables
---
26 changes: 26 additions & 0 deletions docs/guides/evaluations/evaluate-models-weave.md
@@ -0,0 +1,26 @@
---
title: Evaluate models with Weave
---
import { CTAButtons } from '@site/src/components/CTAButtons/CTAButtons.tsx'

## What is Weave?

W&B Weave helps developers who are building and iterating on AI applications create apples-to-apples evaluations that score the behavior of any aspect of their app, and examine and debug failures by inspecting inputs and outputs.

## How do I get started with Weave?

First, create a W&B account at https://wandb.ai, then copy your API key from https://wandb.ai/authorize.

Then follow along in the Colab notebook below, which demonstrates Weave evaluating an LLM (in this case, an OpenAI model).

<CTAButtons colabLink='https://colab.research.google.com/github/wandb/weave/blob/master/docs/intro_notebook.ipynb'/>

After running through the steps, you can browse your dashboard in Weave to see the tracing data generated by the code's LLM calls, along with breakdowns of execution time, cost, and more.

![](https://weave-docs.wandb.ai/assets/images/weave-hero-188bbbbfcac1809f2529c62110d1553a.png)

## How do I use Weave to evaluate models in production?

This [tutorial on how to build an evaluation pipeline with Weave](https://weave-docs.wandb.ai/tutorial-eval/) can help; it demonstrates how to evaluate multiple evolving versions of an application that uses a model. In the tutorial you'll see how `weave.Evaluation` assesses a model's performance on a set of examples using specified scoring functions or `weave.scorer.Scorer` classes, producing dashboards with detailed breakdowns of the model's performance.

![](https://weave-docs.wandb.ai/assets/images/evals-hero-9bb44591b72ac8637e7e14bc73db1ba8.png)
6 changes: 3 additions & 3 deletions docs/guides/intro.md
@@ -11,15 +11,15 @@ Weights & Biases (W&B) is the AI developer platform, with tools for training mod

![](/images/general/architecture.png)

-W&B consists of three major components: [Models](/guides/models.md), [Weave](https://wandb.github.io/weave/), and [Core](/guides/core.md):
+W&B consists of three major components: [Weave](https://wandb.github.io/weave/), [Models](/guides/models.md), and [Core](/guides/core.md):

+**[W&B Weave](https://weave-docs.wandb.ai/)** is a lightweight toolkit for tracking and evaluating LLM applications.

**[W&B Models](/guides/models.md)** is a set of lightweight, interoperable tools for machine learning practitioners training and fine-tuning models.
- [Experiments](/guides/track/intro.md): Machine learning experiment tracking
- [Sweeps](/guides/sweeps/intro.md): Hyperparameter tuning and model optimization
- [Registry](/guides/registry/intro.md): Publish and share your ML models and datasets

-**[W&B Weave](https://wandb.github.io/weave/)** is a lightweight toolkit for tracking and evaluating LLM applications.

**[W&B Core](/guides/core.md)** is a set of powerful building blocks for tracking and visualizing data and models, and communicating results.
- [Artifacts](/guides/artifacts/intro.md): Version assets and track lineage
- [Tables](/guides/tables/intro.md): Visualize and query tabular data
2 changes: 1 addition & 1 deletion docs/quickstart.md
@@ -1,7 +1,7 @@
---
description: W&B Quickstart
displayed_sidebar: default
-title: W&B Quickstart
+title: Models Quickstart
---
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
32 changes: 20 additions & 12 deletions sidebars.js
@@ -79,13 +79,13 @@ export default {
],
default: [
'guides/intro',
-'quickstart',
{
type: 'category',
label: 'W&B Models',
link: {type: 'doc', id: 'guides/models'},
collapsed: false,
items: [
+'quickstart',
{
type: 'category',
label: 'Experiments',
@@ -201,6 +201,25 @@ export default {
'guides/artifacts/project-scoped-automations',
],
},
+{
+  type: 'category',
+  label: 'Evaluations',
+  items: [
+    'guides/evaluations/evaluate-models-weave',
+    'guides/evaluations/evaluate-models-tables',
+  ],
+},
+{
+  type: 'category',
+  label: 'Tables',
+  link: {type: 'doc', id: 'guides/tables/intro'},
+  items: [
+    'guides/tables/tables-walkthrough',
+    'guides/tables/visualize-tables',
+    'guides/tables/tables-gallery',
+    'guides/tables/tables-download',
+  ],
+},
{
type: 'category',
label: 'W&B App UI Reference',
@@ -311,17 +330,6 @@ export default {
// 'guides/artifacts/examples',
],
},
-{
-  type: 'category',
-  label: 'Tables',
-  link: {type: 'doc', id: 'guides/tables/intro'},
-  items: [
-    'guides/tables/tables-walkthrough',
-    'guides/tables/visualize-tables',
-    'guides/tables/tables-gallery',
-    'guides/tables/tables-download',
-  ],
-},
{
type: 'category',
label: 'Reports',