Generate notebooks

INRIA · Nov 6, 2023 · 26ad6d2
1 parent 4851394
commit 26ad6d2

Showing 8 changed files with 51 additions and 119 deletions.
diff --git a/notebooks/datasets_adult_census.ipynb b/notebooks/datasets_adult_census.ipynb
@@ -4,7 +4,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "# The Adult census dataset\n",
+    "# The adult census dataset\n",
     "\n",
     "[This dataset](http://www.openml.org/d/1590) is a collection of demographic\n",
     "information for the adult population as of 1994 in the USA. The prediction\n",

diff --git a/notebooks/linear_models_ex_03.ipynb b/notebooks/linear_models_ex_03.ipynb
@@ -131,7 +131,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "For the following questions, you can copy adn paste the following snippet to\n",
+    "For the following questions, you can copy and paste the following snippet to\n",
     "get the feature names from the column transformer here named `preprocessor`.\n",
     "\n",
     "```python\n",

diff --git a/notebooks/linear_models_sol_02.ipynb b/notebooks/linear_models_sol_02.ipynb
@@ -223,9 +223,9 @@
    "outputs": [],
    "source": [
     "# solution\n",
-    "culmen_length_first_sample = 181.0\n",
+    "flipper_length_first_sample = 181.0\n",
     "culmen_depth_first_sample = 18.7\n",
-    "culmen_length_first_sample * culmen_depth_first_sample"
+    "flipper_length_first_sample * culmen_depth_first_sample"
    ]
   },
   {

diff --git a/notebooks/linear_models_sol_03.ipynb b/notebooks/linear_models_sol_03.ipynb
@@ -211,7 +211,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "For the following questions, you can copy adn paste the following snippet to\n",
+    "For the following questions, you can copy and paste the following snippet to\n",
     "get the feature names from the column transformer here named `preprocessor`.\n",
     "\n",
     "```python\n",

diff --git a/notebooks/linear_regression_non_linear_link.ipynb b/notebooks/linear_regression_non_linear_link.ipynb
@@ -2,7 +2,6 @@
  "cells": [
   {
    "cell_type": "markdown",
-   "id": "14eec485",
    "metadata": {},
    "source": [
     "# Non-linear feature engineering for Linear Regression\n",
@@ -25,7 +24,6 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "8f516165",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -44,13 +42,13 @@
   },
   {
    "cell_type": "markdown",
-   "id": "00fd3b4f",
    "metadata": {},
    "source": [
-    "```{tip}\n",
-    "`np.random.RandomState` allows to create a random number generator which can\n",
-    "be later used to get deterministic results.\n",
-    "```\n",
+    "<div class=\"admonition tip alert alert-warning\">\n",
+    "<p class=\"first admonition-title\" style=\"font-weight: bold;\">Tip</p>\n",
+    "<p class=\"last\"><tt class=\"docutils literal\">np.random.RandomState</tt> allows to create a random number generator which can\n",
+    "be later used to get deterministic results.</p>\n",
+    "</div>\n",
     "\n",
     "To ease the plotting, we create a pandas dataframe containing the data and\n",
     "target:"
@@ -59,7 +57,6 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "5459a97b",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -71,7 +68,6 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "8b1b2257",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -84,22 +80,21 @@
   },
   {
    "cell_type": "markdown",
-   "id": "be69fae1",
    "metadata": {},
    "source": [
-    "```{warning}\n",
-    "In scikit-learn, by convention `data` (also called `X` in the scikit-learn\n",
-    "documentation) should be a 2D matrix of shape `(n_samples, n_features)`.\n",
-    "If `data` is a 1D vector, you need to reshape it into a matrix with a\n",
+    "<div class=\"admonition warning alert alert-danger\">\n",
+    "<p class=\"first admonition-title\" style=\"font-weight: bold;\">Warning</p>\n",
+    "<p class=\"last\">In scikit-learn, by convention <tt class=\"docutils literal\">data</tt> (also called <tt class=\"docutils literal\">X</tt> in the scikit-learn\n",
+    "documentation) should be a 2D matrix of shape <tt class=\"docutils literal\">(n_samples, n_features)</tt>.\n",
+    "If <tt class=\"docutils literal\">data</tt> is a 1D vector, you need to reshape it into a matrix with a\n",
     "single column if the vector represents a feature or a single row if the\n",
-    "vector represents a sample.\n",
-    "```"
+    "vector represents a sample.</p>\n",
+    "</div>"
    ]
   },
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "46804be9",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -110,7 +105,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "a4209f00",
    "metadata": {
     "lines_to_next_cell": 2
    },
@@ -122,7 +116,6 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "a1bd392b",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -142,7 +135,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "7bfcbeb8",
    "metadata": {},
    "source": [
     "We now observe the limitations of fitting a linear regression model."
@@ -151,7 +143,6 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "1545fec5",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -165,7 +156,6 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "e8c79631",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -174,7 +164,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "545fc1f3",
    "metadata": {},
    "source": [
     "Here the coefficient and intercept learnt by `LinearRegression` define the\n",
@@ -185,7 +174,6 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "0f95ceef",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -197,7 +185,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "1a34a48c",
    "metadata": {},
    "source": [
     "Notice that the learnt model cannot handle the non-linear relationship between\n",
@@ -217,7 +204,6 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "e01b02d2",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -230,7 +216,6 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "9a27773e",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -239,7 +224,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "4d5070e3",
    "metadata": {},
    "source": [
     "Instead of having a model which can natively deal with non-linearity, we could\n",
@@ -256,7 +240,6 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "28c13246",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -266,7 +249,6 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "69d0ba50",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -276,7 +258,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "7925141e",
    "metadata": {},
    "source": [
     "Instead of manually creating such polynomial features one could directly use\n",
@@ -286,7 +267,6 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "d31ed0f4",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -297,7 +277,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "6a7fe453",
    "metadata": {},
    "source": [
     "In the previous cell we had to set `include_bias=False` as otherwise we would\n",
@@ -312,7 +291,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "269fbe2b",
    "metadata": {},
    "source": [
     "To demonstrate the use of the `PolynomialFeatures` class, we use a\n",
@@ -323,7 +301,6 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "38ba0c5c",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -340,7 +317,6 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "5df7d4a4",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -349,7 +325,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "fe259d20",
    "metadata": {},
    "source": [
     "We can see that even with a linear model, we can overcome the linearity\n",
@@ -379,7 +354,6 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "7d46da9b",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -392,7 +366,6 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "9406b676",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -401,7 +374,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "fd29730e",
    "metadata": {},
    "source": [
     "The predictions of our SVR with a linear kernel are all aligned on a straight\n",
@@ -419,7 +391,6 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "ae1550fa",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -430,7 +401,6 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "c4670a4e",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -439,7 +409,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "732b2b0f",
    "metadata": {},
    "source": [
     "Kernel methods such as SVR are very efficient for small to medium datasets.\n",
@@ -460,7 +429,6 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "e30e6b37",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -476,7 +444,6 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "b46eb0ef",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -486,7 +453,6 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "5403e6b1",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -502,7 +468,6 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "0dcdfe92",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -511,7 +476,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "4b4f0560",
    "metadata": {},
    "source": [
     "`Nystroem` is a nice alternative to `PolynomialFeatures` that makes it\n",
@@ -523,7 +487,6 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "41d6abd8",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -539,7 +502,6 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "be6a232c",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -550,7 +512,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "7860e12d",
    "metadata": {},
    "source": [
     "## Notebook Recap\n",
@@ -579,4 +540,4 @@
  },
  "nbformat": 4,
  "nbformat_minor": 5
-}
+}