diff --git a/23.3/QuickStart.ipynb b/23.3/QuickStart.ipynb index bfa138b..4a48b6b 100644 --- a/23.3/QuickStart.ipynb +++ b/23.3/QuickStart.ipynb @@ -1 +1,401 @@ -{"cells":[{"attachments":{},"cell_type":"markdown","metadata":{"id":"twIzT22mtqhs"},"source":["# Before you start with this QuickStart Notebook"]},{"attachments":{},"cell_type":"markdown","metadata":{"id":"iUp4X8V5tqhv"},"source":["In this notebook, we will re-use the classical Iris modeling example to demonstrate how you can automatically document in Vectice your assets, such as datasets, models, graphs, and notes, using a few lines of code. \n","\n","### Pre-requisites:\n","Before using this notebook you will need:\n","* An account in Vectice\n","* An API key to connect to Vectice through the APIs\n","* The Phase Id of the project where you want to log your work\n","\n","Refer to Vectice Getting Started Guide for more detailed instructions: https://docs.vectice.com/getting-started/\n","\n","\n","### Other Resources\n","* Vectice Documentation: https://docs.vectice.com/
\n","* Vectice API documentation: https://api-docs.vectice.com/\n"]},{"attachments":{},"cell_type":"markdown","metadata":{"id":"jER4KBN0JbSw"},"source":["
\n","Automated code lineage: The code lineage functionalities are not covered as part of this QuickStart as they require to first setting up a Git repository.\n","
"]},{"attachments":{},"cell_type":"markdown","metadata":{"id":"BWxVwtAdgSbj"},"source":["## Install the latest Vectice Python client library"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"bin0M8rKdfBU"},"outputs":[],"source":["%pip install -q seaborn\n","%pip install -q scikit-learn\n","%pip install -q vectice"]},{"attachments":{},"cell_type":"markdown","metadata":{"id":"1bwpySvqf5Ce"},"source":["## Get started by connecting to Vectice"]},{"attachments":{},"cell_type":"markdown","metadata":{"id":"jx5xlQFJxxQZ"},"source":["**First, we need to authenticate to the Vectice server. Before proceeding further:**\n","\n","- Visit the Vectice app to create and copy an API key (cf. https://docs.vectice.com/getting-started/create-an-api-key)\n","\n","- Paste the API key in the code below"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["import vectice\n","\n","connect = vectice.connect(api_token=\"your-api-key\") #Paste your API key"]},{"attachments":{},"cell_type":"markdown","metadata":{"id":"X5h5ghH2gc-S"},"source":["## Retrieve your To Do Phase Id inside your QuickStart project to specify where to document your work"]},{"attachments":{},"cell_type":"markdown","metadata":{"id":"rgFE8GY_tqh0"},"source":["In the Vectice app, navigate to your personal workspace (this is prefixed with a `.`) and then go to your QuickStart project next go to the **To Do Phase** and copy paste your **Phase Id** into the cell below."]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["phase = connect.phase('PHA-xxxx') #Paste your Phase Id"]},{"attachments":{},"cell_type":"markdown","metadata":{"id":"jRPNU5blklRw"},"source":["### Next, we are going to create an iteration. \n","An iteration allows you to organize your work in repeatable sequence. You can have multiple iterations within a phase. Iterations can be organized into sections.
"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"vE0dEteRepDI"},"outputs":[],"source":["iteration = phase.create_or_get_current_iteration()"]},{"attachments":{},"cell_type":"markdown","metadata":{"id":"FYeCGcgLlsOL"},"source":["## Auto-Document your iteration in Vectice\n","In this section we will prepare a modeling dataset based on the well-known iris dataset. We will then train a linear regression model using scikit-learn. \n","As we are doing this work and creating those assets, we will log them and corresponding artifacts in Vectice with a few lines of code.\n","This enables you to document your work as you go, and never forget the data that was used, the models, the code and other artifacts."]},{"attachments":{},"cell_type":"markdown","metadata":{"id":"wFewo-zTo5bn"},"source":["### Log a note"]},{"attachments":{},"cell_type":"markdown","metadata":{"id":"_sDBscKr2nkX"},"source":["To log information you simply need to assign string variables to the iteration you previously created, and if desired, associate it with a specific section"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"Qnlqgd7Yo7-J"},"outputs":[],"source":["# Log a note inside the section quickstart of the iteration you created above\n","iteration.log(\"My first log into Vectice\", section = \"quickstart\")"]},{"attachments":{},"cell_type":"markdown","metadata":{"id":"TCcxg4Gklxcl"},"source":["### Log a Dataset with a graph as attachment"]},{"attachments":{},"cell_type":"markdown","metadata":{"id":"6VvbrEsKl5bH"},"source":["Use the following code block to create a local dataset and generate a graph:"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"jMC05CI6epHw"},"outputs":[],"source":["import pandas as pd\n","from sklearn import datasets\n","\n","iris = datasets.load_iris()\n","\n","df_iris = pd.DataFrame(data=iris.data, columns=iris.feature_names)\n","df_iris['species'] = iris.target_names[iris.target]\n","#Save your dataframe to local file\n","df_iris.to_csv('cleaned_dataset.csv', index=False)"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"OMaIknlOtqh3","outputId":"a16f5f30-9900-487f-bbc3-a894fd5171b4"},"outputs":[],"source":["import seaborn as sns\n","import matplotlib.pyplot as plt\n","\n","sns.scatterplot(data=df_iris, x='sepal length (cm)',\n"," y='petal width (cm)', hue='species')\n","plt.plot()\n","#Save your graph to local file\n","plt.savefig('Scatter_plot_iris.png')"]},{"attachments":{},"cell_type":"markdown","metadata":{"id":"4zUUM4HynZuV"},"source":["Let's log the dataset we created, including the attachment above.
\n","
\n","The Vectice resource will automatically extract pertinent metadata from the local dataset file and collect statistics (optional) from the pandas dataframe. This information will be documented within the iteration as part of a Dataset version."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"ntJfS6_inYyK"},"outputs":[],"source":["from vectice import Dataset, FileResource\n","\n","clean_dataset = Dataset.clean(name=\"Cleaned Dataset\", resource=FileResource(paths=\"cleaned_dataset.csv\", dataframes=df_iris), attachments='Scatter_plot_iris.png')\n","\n","iteration.log(clean_dataset, section = \"quickstart\")"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["After running the cell above, you will notice an output displaying a link pointing to the iteration in the Vectice app. Click on the link to check what you documented."]},{"attachments":{},"cell_type":"markdown","metadata":{"id":"Q8wbMhVTotUK"},"source":["### Log a model with its associated hyper-parameters"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"N0wsppAltqh5"},"outputs":[],"source":["from sklearn.neighbors import KNeighborsClassifier\n","\n","#instantiate the model (with the default parameter)\n","knn = KNeighborsClassifier()\n","\n","# fit the model with data (occurs in-place)\n","knn.fit(df_iris[iris.feature_names],df_iris[\"species\"])"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"g5FOnLfHnY14"},"outputs":[],"source":["from vectice import Model\n","\n","model = Model(library=\"scikit-learn\", technique=\"KNN\", name=\"My first model\", predictor=knn, properties=knn.get_params(), derived_from=[clean_dataset.latest_version_id])\n","iteration.log(model, section = \"quickstart\")"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["iteration.complete()"]},{"attachments":{},"cell_type":"markdown","metadata":{"id":"ZV-eznItq8gu"},"source":["Similarly to Dataset, check what you documented by clicking on the link above."]},{"attachments":{},"cell_type":"markdown","metadata":{"id":"YnmFUplTpMoW"},"source":["## 🥇 Congrats! You learn how to succesfully use Vectice to auto-document your assets.
\n","### Next we encourage you to follow [part 2](https://docs.vectice.com/getting-started/quickstart-auto-document-your-work#part-2) of the QuickStart guide to continue learning about Vectice."]}],"metadata":{"colab":{"provenance":[]},"kernelspec":{"display_name":"Python 3","name":"python3"},"language_info":{"codemirror_mode":{"name":"ipython","version":3},"file_extension":".py","mimetype":"text/x-python","name":"python","nbconvert_exporter":"python","pygments_lexer":"ipython3","version":"3.9.6"}},"nbformat":4,"nbformat_minor":0} +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "twIzT22mtqhs" + }, + "source": [ + "# Before you start with this Quickstart Notebook" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "iUp4X8V5tqhv" + }, + "source": [ + "You will need:\n", + "* An account in Vectice\n", + "* An API key to connect to Vectice through the APIs\n", + "* The Phase Id of the project where you want to log your work\n", + "\n", + "### Other Resources\n", + "* Refer to the Vectice Documentation for more detailed instructions: https://docs.vectice.com/getting-started/
\n", + "* Vectice API documentation: https://api-docs.vectice.com/\n", + "\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "jER4KBN0JbSw" + }, + "source": [ + "
\n", + "Automated code lineage: The code lineage functionalities are not covered as part of this QuickStart as they require to first setting up a Git repository.\n", + "
\n", + "\n", + "\n", + "\n", + "---\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "7LArs6J5fyIz" + }, + "source": [ + "## What to expect\n", + "\n", + "In this notebook, we will re-use the classical Iris modeling example to demonstrate how you can automatically document in Vectice your assets, such as datasets, models, graphs, and notes, using a few lines of code." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "BWxVwtAdgSbj" + }, + "source": [ + "## Install the latest Vectice Python client library" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "bin0M8rKdfBU" + }, + "outputs": [], + "source": [ + "%pip install -q seaborn\n", + "%pip install -q scikit-learn\n", + "%pip install -q vectice" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "1bwpySvqf5Ce" + }, + "source": [ + "## Get started by connecting to Vectice" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "jx5xlQFJxxQZ" + }, + "source": [ + "**First, we need to authenticate to the Vectice server. Before proceeding further:**\n", + "\n", + "- Click on the key icon in the upper right corner of the Vectice app to create and copy an API key\n", + "\n", + "- Paste the API key in the code below" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "frnhI_W9aOoZ" + }, + "outputs": [], + "source": [ + "import vectice\n", + "\n", + "connect = vectice.connect(api_token=\"your-api-key\") #Paste your API key" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "X5h5ghH2gc-S" + }, + "source": [ + "## Retrieve your Phase ID inside your Quickstart project to specify where to document your work" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "rgFE8GY_tqh0" + }, + "source": [ + "In the Vectice app, go to your QuickStart project, then go to **[Step 2]** and copy paste your **Phase Id** into the cell below.\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "zfbaL3GdaOoa" + }, + "outputs": [], + "source": [ + "phase = connect.phase('PHA-xxxx') #Paste your Phase Id" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "jRPNU5blklRw" + }, + "source": [ + "### Next, we are going to create an iteration.\n", + "An iteration allows you to organize your work in repeatable sequence. You can have multiple iterations within a phase.\n", + "
" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "vE0dEteRepDI" + }, + "outputs": [], + "source": [ + "iteration = phase.create_or_get_current_iteration()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "FYeCGcgLlsOL" + }, + "source": [ + "## Auto-Document your iteration in Vectice\n", + "We will prepare a modeling dataset based on the well-known iris dataset. We will then train a linear regression model using scikit-learn.\n", + "As we are doing this work and creating those assets, we will log them and corresponding artifacts in Vectice with a few lines of code.\n", + "This enables you to document your work as you go, and never forget the data that was used, the models, the code and other artifacts." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "wFewo-zTo5bn" + }, + "source": [ + "### Log a note" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "_sDBscKr2nkX" + }, + "source": [ + "To log information, you simply need to assign string variables to the iteration you created, associate it with a section called \"Your assets\"." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "Qnlqgd7Yo7-J" + }, + "outputs": [], + "source": [ + "# Log a note inside the iteration you created above into a section called \"Your assets\"\n", + "iteration.log(\"My first log into Vectice\", section = \"Your assets\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "A518Gz2lj59b" + }, + "source": [ + "Sections are a way to further organize your iterations. You can dynamically create sections from the API or the Vectice app.\n", + "\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "TCcxg4Gklxcl" + }, + "source": [ + "### Log a Dataset with a graph as attachment" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "6VvbrEsKl5bH" + }, + "source": [ + "Use the following code block to create a local dataset and generate a graph:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "jMC05CI6epHw" + }, + "outputs": [], + "source": [ + "import pandas as pd\n", + "from sklearn import datasets\n", + "\n", + "iris = datasets.load_iris()\n", + "\n", + "df_iris = pd.DataFrame(data=iris.data, columns=iris.feature_names)\n", + "df_iris['species'] = iris.target_names[iris.target]\n", + "#Save your dataframe to local file\n", + "df_iris.to_csv('cleaned_dataset.csv', index=False)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "OMaIknlOtqh3" + }, + "outputs": [], + "source": [ + "import seaborn as sns\n", + "import matplotlib.pyplot as plt\n", + "\n", + "sns.scatterplot(data=df_iris, x='sepal length (cm)',\n", + " y='petal width (cm)', hue='species')\n", + "plt.plot()\n", + "#Save your graph to local file\n", + "plt.savefig('Scatter_plot_iris.png')" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "4zUUM4HynZuV" + }, + "source": [ + "Let's log the dataset we created, including the attachment above.
\n", + "
\n", + "The Vectice resource will automatically extract pertinent metadata from the local dataset file and collect statistics (optional) from the pandas dataframe. This information will be documented within the iteration as part of a Dataset version." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "ntJfS6_inYyK" + }, + "outputs": [], + "source": [ + "from vectice import Dataset, FileResource\n", + "\n", + "clean_dataset = Dataset.clean(name=\"Cleaned Dataset\", resource=FileResource(paths=\"cleaned_dataset.csv\", dataframes=df_iris), attachments='Scatter_plot_iris.png')\n", + "\n", + "iteration.log(clean_dataset, section = \"Your assets\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "V1Z_jApoaOoe" + }, + "source": [ + "After running the cell above, you will notice an output displaying a link pointing to the iteration in the Vectice app. Click on the link to check what you documented." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Q8wbMhVTotUK" + }, + "source": [ + "### Log a model with its associated hyper-parameters" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "N0wsppAltqh5" + }, + "outputs": [], + "source": [ + "from sklearn.neighbors import KNeighborsClassifier\n", + "\n", + "#instantiate the model (with the default parameter)\n", + "knn = KNeighborsClassifier()\n", + "\n", + "# fit the model with data (occurs in-place)\n", + "knn.fit(df_iris[iris.feature_names],df_iris[\"species\"])" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "g5FOnLfHnY14" + }, + "outputs": [], + "source": [ + "from vectice import Model\n", + "\n", + "model = Model(library=\"scikit-learn\", technique=\"KNN\", name=\"My first model\", predictor=knn, properties=knn.get_params(), derived_from=[clean_dataset.latest_version_id])\n", + "iteration.log(model, section = \"Your assets\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "ZV-eznItq8gu" + }, + "source": [ + "Similarly to Dataset, check what you documented by clicking on the link above." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "q1S8YhqxlEMG" + }, + "source": [ + "✅ **You can edit and manipulate your sections and your assets below in your app the way you want to tell your story to your peers.**" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "YnmFUplTpMoW" + }, + "source": [ + "## 🥇 Congrats! You have learned how to succesfully use Vectice to auto-document your assets.
\n", + "#### You can proceed back to the Vectice app to document your work.\n" + ] + } + ], + "metadata": { + "colab": { + "provenance": [] + }, + "kernelspec": { + "display_name": "Python 3", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.6" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +}