Merge branch 'main' into debug-pr-preview
ogrisel authored Oct 26, 2023
2 parents dfc55f2 + d51a62b commit f5bb3f2
Showing 26 changed files with 900 additions and 500 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/deploy-gh-pages.yml
@@ -32,7 +32,7 @@ jobs:
pip install -r requirements-dev.txt
- name: Cache jupyter-cache folder
-uses: actions/cache@v2
+uses: actions/cache@v3
env:
cache-name: jupyter-cache
with:
12 changes: 6 additions & 6 deletions .github/workflows/jupyter-book-pr-preview.yml
@@ -19,19 +19,19 @@ jobs:
sha: ${{ github.event.workflow_run.head_sha }}
context: 'JupyterBook preview'

-- name: Get pull request number
-id: pull-request-number
-run: |
-export PULL_REQUEST_NUMBER=${{github.event.workflow_run.event.number}}
-echo "result=${PULL_REQUEST_NUMBER}" >> $GITHUB_OUTPUT
- uses: dawidd6/action-download-artifact@v2
with:
github_token: ${{secrets.GITHUB_TOKEN}}
workflow: deploy-gh-pages.yml
pr: ${{steps.pull-request-number.outputs.result}}
name: jupyter-book

+- name: Get pull request number
+id: pull-request-number
+run: |
+export PULL_REQUEST_NUMBER=$(cat pull_request_number)
+echo "result=${PULL_REQUEST_NUMBER}" >> $GITHUB_OUTPUT
- uses: actions/setup-node@v3
with:
node-version: '16'
8 changes: 3 additions & 5 deletions jupyter-book/_toc.yml
@@ -90,29 +90,27 @@ parts:
- file: linear_models/linear_models_intuitions_index
sections:
- file: linear_models/linear_models_slides
-- file: linear_models/linear_models_quiz_m4_01
- file: python_scripts/linear_regression_without_sklearn
- file: python_scripts/linear_models_ex_01
- file: python_scripts/linear_models_sol_01
- file: python_scripts/linear_regression_in_sklearn
- file: python_scripts/logistic_regression
-- file: linear_models/linear_models_quiz_m4_02
+- file: linear_models/linear_models_quiz_m4_01
- file: linear_models/linear_models_non_linear_index
sections:
- file: python_scripts/linear_regression_non_linear_link
- file: python_scripts/linear_models_ex_02
- file: python_scripts/linear_models_sol_02
- file: python_scripts/linear_models_feature_engineering_classification.py
- file: python_scripts/logistic_regression_non_linear
-- file: linear_models/linear_models_quiz_m4_03
+- file: linear_models/linear_models_quiz_m4_02
- file: linear_models/linear_models_regularization_index
sections:
- file: linear_models/regularized_linear_models_slides
- file: python_scripts/linear_models_regularization
-- file: linear_models/linear_models_quiz_m4_04
- file: python_scripts/linear_models_ex_03
- file: python_scripts/linear_models_sol_03
-- file: linear_models/linear_models_quiz_m4_05
+- file: linear_models/linear_models_quiz_m4_03
- file: linear_models/linear_models_wrap_up_quiz
- file: linear_models/linear_models_module_take_away
- caption: Decision tree models
67 changes: 66 additions & 1 deletion jupyter-book/linear_models/linear_models_quiz_m4_01.md
@@ -17,10 +17,75 @@ _Select a single answer_

```{admonition} Question
Is it possible to get a perfect fit (zero prediction error on the training set)
-with a linear classifier by itself on a non-linearly separable dataset?
+with a linear classifier **by itself** on a non-linearly separable dataset?
- a) yes
- b) no
_Select a single answer_
```

+++

```{admonition} Question
If we fit a linear regression where `X` is a single column vector, how many
parameters will our model be made of?
- a) 1
- b) 2
- c) 3
_Select a single answer_
```

+++

```{admonition} Question
If we train a scikit-learn `LinearRegression` with `X` being a single column
vector and `y` a vector, `coef_` and `intercept_` will be respectively:
- a) an array of shape (1, 1) and a number
- b) an array of shape (1,) and an array of shape (1,)
- c) an array of shape (1, 1) and an array of shape (1,)
- d) an array of shape (1,) and a number
_Select a single answer_
```
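A minimal sketch to check the two questions above, on made-up synthetic data
(scikit-learn defaults assumed): with a single-column `X`, the model learns one
slope plus one intercept, i.e. two parameters.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.RandomState(0)
X = rng.normal(size=(100, 1))          # single column vector
y = 3.0 * X.ravel() + 2.0 + rng.normal(scale=0.1, size=100)

model = LinearRegression().fit(X, y)
print(model.coef_.shape)   # (1,): one slope
print(model.intercept_)    # a single number: the intercept
```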

+++

```{admonition} Question
The decision boundaries of a logistic regression model:
- a) split classes using only one of the input features
- b) split classes using a combination of the input features
- c) often have curved shapes
_Select a single answer_
```
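For intuition, the fitted boundary is the set of points where
`coef_ @ x + intercept_ = 0`, a linear combination of all input features. A
sketch on synthetic two-feature data:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=100, n_features=2, n_redundant=0,
                           random_state=0)
clf = LogisticRegression().fit(X, y)
# The boundary coef_ @ x + intercept_ = 0 is a straight line in 2D:
print(clf.coef_, clf.intercept_)
```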

+++

```{admonition} Question
For a binary classification task, what is the shape of the array returned by the
`predict_proba` method for 10 input samples?
- a) (10,)
- b) (10, 2)
- c) (2, 10)
_Select a single answer_
```

+++

```{admonition} Question
In logistic regression's `predict_proba` method in scikit-learn, which of the
following statements is true regarding the predicted probabilities?
- a) The sum of probabilities across different classes for a given sample is always equal to 1.0.
- b) The sum of probabilities across all samples for a given class is always equal to 1.0.
- c) The sum of probabilities across all features for a given class is always equal to 1.0.
_Select a single answer_
```
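Both `predict_proba` questions can be checked directly on a synthetic binary
problem with 10 held-out samples:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=60, random_state=0)
clf = LogisticRegression().fit(X[:50], y[:50])

proba = clf.predict_proba(X[50:])   # the 10 remaining samples
print(proba.shape)                  # (10, 2): one column per class
print(proba.sum(axis=1))            # each row sums to 1.0
```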
59 changes: 18 additions & 41 deletions jupyter-book/linear_models/linear_models_quiz_m4_02.md
@@ -1,64 +1,41 @@
# ✅ Quiz M4.02

```{admonition} Question
-If we fit a linear regression where `X` is a single column vector, how many
-parameters our model will be made of?
-- a) 1
-- b) 2
-- c) 3
+Let us consider a pipeline that combines a polynomial feature extraction of
+degree 2 and a linear regression model. Let us assume that the linear regression
+coefficients are all non-zero and that the dataset contains a single feature.
+Is the prediction function of this pipeline a straight line?
_Select a single answer_
```
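A sketch of the pipeline in question, on synthetic data: with a non-zero
degree-2 coefficient, equal steps in `X` produce unequal steps in the
predictions, so the prediction function is a curve, not a straight line.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = 0.5 * X.ravel() ** 2 - X.ravel() + rng.normal(scale=0.2, size=200)

model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(X, y)

X_grid = np.linspace(-3, 3, 5).reshape(-1, 1)   # equally spaced inputs
print(np.diff(model.predict(X_grid)))           # unequal steps: not a line
```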

+++

-```{admonition} Question
-If we train a scikit-learn `LinearRegression` with `X` being a single column
-vector and `y` a vector, `coef_` and `intercept_` will be respectively:
-- a) an array of shape (1, 1) and a number
-- b) an array of shape (1,) and an array of shape (1,)
-- c) an array of shape (1, 1) and an array of shape (1,)
-- d) an array of shape (1,) and a number
-_Select a single answer_
-```
-
-+++

```{admonition} Question
-The decision boundaries of a logistic regression model:
-- a) split classes using only one of the input features
-- b) split classes using a combination of the input features
-- c) often have curved shapes
+- a) yes
+- b) no
_Select a single answer_
```

+++

```{admonition} Question
-For a binary classification task, what is the shape of the array returned by the
-`predict_proba` method for 10 input samples?
+Fitting a linear regression where `X` has `n_features` columns and the target
+is a single continuous vector, what is the respective type/shape of `coef_`
+and `intercept_`?
-- a) (10,)
-- b) (10, 2)
-- c) (2, 10)
+- a) it is not possible to fit a linear regression in dimension higher than 2
+- b) array of shape (`n_features`,) and a float
+- c) array of shape (1, `n_features`) and an array of shape (1,)
_Select a single answer_
```
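The multi-feature case can be checked the same way, on made-up data with 5
features:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.RandomState(0)
X = rng.normal(size=(100, 5))          # n_features = 5
y = X @ rng.normal(size=5) + 1.0

model = LinearRegression().fit(X, y)
print(model.coef_.shape)   # (5,): one weight per feature
print(model.intercept_)    # a single float
```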

+++

```{admonition} Question
-In logistic regression's `predict_proba` method in scikit-learn, which of the
-following statements is true regarding the predicted probabilities?
+Combining (one or more) feature engineering transformers in a single pipeline:
-- a) The sum of probabilities across different classes for a given sample is always equal to 1.0.
-- b) The sum of probabilities across all samples for a given class is always equal to 1.0.
-- c) The sum of probabilities across all features for a given class is always equal to 1.0.
+- a) increases the expressivity of the model
+- b) ensures that models extrapolate accurately regardless of the distribution of the data
+- c) may require tuning additional hyperparameters
+- d) inherently prevents any underfitting
-_Select a single answer_
+_Select all answers that apply_
```
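As an illustration of the added hyperparameters, a hypothetical pipeline
chaining two transformers before a ridge model exposes each transformer's
parameters for tuning (this assumes scikit-learn >= 1.0 for
`SplineTransformer`):

```python
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, SplineTransformer

model = make_pipeline(
    SplineTransformer(n_knots=5),    # n_knots is an extra hyperparameter
    PolynomialFeatures(degree=2),    # so is degree
    Ridge(alpha=1.0),
)
print(sorted(model.get_params()))    # all of them are tunable via grid search
```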
98 changes: 80 additions & 18 deletions jupyter-book/linear_models/linear_models_quiz_m4_03.md
@@ -1,41 +1,103 @@
# ✅ Quiz M4.03

```{admonition} Question
+Which of the following estimators can solve linear regression problems?
-Let us consider a pipeline that combines a polynomial feature extraction of
-degree 2 and a linear regression model. Let us assume that the linear regression
-coefficients are all non-zero and that the dataset contains a single feature.
-Is the prediction function of this pipeline a straight line?
+- a) sklearn.linear_model.LinearRegression
+- b) sklearn.linear_model.LogisticRegression
+- c) sklearn.linear_model.Ridge
-- a) yes
-- b) no
_Select all answers that apply_
```

+++

```{admonition} Question
Regularization allows:
- a) to create a model robust to outliers (samples that differ widely from
other observations)
- b) to reduce overfitting by forcing the weights to stay close to zero
- c) to reduce underfitting by making the problem linearly separable
_Select a single answer_
```
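A rough check on synthetic data: when noisy features outnumber what the data
can support, the ridge penalty trades a little training score for better
generalization.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(0)
X = rng.normal(size=(60, 40))                  # few samples, many features
y = X[:, :3].sum(axis=1) + rng.normal(scale=2.0, size=60)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for model in [LinearRegression(), Ridge(alpha=10.0)]:
    model.fit(X_train, y_train)
    print(model.__class__.__name__,
          model.score(X_train, y_train),   # train R^2: near 1 without penalty
          model.score(X_test, y_test))     # test R^2: better with the penalty
```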

+++

```{admonition} Question
-Fitting a linear regression where `X` has `n_features` columns and the target
-is a single continuous vector, what is the respective type/shape of `coef_`
-and `intercept_`?
+A ridge model is:
-- a) it is not possible to fit a linear regression in dimension higher than 2
-- b) array of shape (`n_features`,) and a float
-- c) array of shape (1, `n_features`) and an array of shape (1,)
+- a) the same as linear regression with penalized weights
+- b) the same as logistic regression with penalized weights
+- c) a linear model
+- d) a non linear model
-_Select a single answer_
+_Select all answers that apply_
```

+++

```{admonition} Question
-Combining (one or more) feature engineering transformers in a single pipeline:
+Assume that a data scientist has prepared a train/test split and plans to use
+the test for the final evaluation of a `Ridge` model. The parameter `alpha` of
+the `Ridge` model:
-- a) increases the expressivity of the model
-- b) ensures that models extrapolate accurately regardless of its distribution
-- c) may require tuning additional hyperparameters
-- d) inherently prevents any underfitting
+- a) is internally tuned when calling `fit` on the train set
+- b) should be tuned by running cross-validation on a **train set**
+- c) should be tuned by running cross-validation on a **test set**
+- d) must be a positive number
_Select all answers that apply_
```
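A sketch of that workflow with `RidgeCV`, on synthetic data (the candidate
`alphas` grid is arbitrary): the search runs cross-validation on the train set
only, and the test set is touched once at the end.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=200, n_features=10, noise=5.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RidgeCV(alphas=np.logspace(-3, 3, 13)).fit(X_train, y_train)
print(model.alpha_)                  # selected by CV on the train set
print(model.score(X_test, y_test))   # final evaluation, done once
```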

+++

```{admonition} Question
Scaling the data before fitting a model:
- a) is often useful for regularized linear models
- b) is always necessary for regularized linear models
- c) may speed up fitting
- d) has no impact on the optimal choice of the value of a regularization parameter
_Select all answers that apply_
```
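The usual pattern, sketched: put the scaler and the regularized model in one
pipeline, so the penalty acts on comparably scaled coefficients.

```python
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

model = make_pipeline(StandardScaler(), Ridge(alpha=1.0))
# model.fit(X_train, y_train)  # then behaves like any single estimator
# (X_train and y_train are hypothetical training data)
```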

+++

```{admonition} Question
The effect of increasing the regularization strength in a ridge model is to:
- a) shrink all weights towards zero
- b) make all weights equal
- c) set a subset of the weights to exactly zero
- d) constrain all the weights to be positive
_Select all answers that apply_
```
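This can be observed directly on synthetic data: as `alpha` grows, every ridge
weight shrinks towards zero, but none is set exactly to zero (that behavior
belongs to Lasso).

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.RandomState(0)
X = rng.normal(size=(50, 10))
y = X @ rng.normal(size=10) + rng.normal(scale=0.5, size=50)

for alpha in [0.01, 1.0, 100.0]:
    coef = Ridge(alpha=alpha).fit(X, y).coef_
    # all weights shrink as alpha grows, yet none hits exactly zero
    print(alpha, np.abs(coef).max(), np.abs(coef).min())
```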

+++

```{admonition} Question
The parameter `C` in a logistic regression is:
- a) similar to the parameter `alpha` in a ridge regressor
- b) similar to `1 / alpha` where `alpha` is the parameter of a ridge regressor
- c) not controlling the regularization
_Select a single answer_
```
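A quick way to see the inverse relationship, on synthetic data: decreasing `C`
strengthens the penalty, just as increasing `alpha` does for a ridge regressor.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, random_state=0)
for C in [0.01, 1.0, 100.0]:
    coef = LogisticRegression(C=C).fit(X, y).coef_
    print(C, np.abs(coef).max())   # weights grow as C grows (weaker penalty)
```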

+++

```{admonition} Question
In logistic regression, increasing the regularization strength (by
decreasing the value of `C`) makes the model:
- a) more likely to overfit to the training data
- b) more confident: the values returned by `predict_proba` are closer to 0 or 1
- c) less complex, potentially underfitting the training data
_Select a single answer_
```
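And the effect on the predicted probabilities, sketched on the same kind of
synthetic data: with a very small `C` the weights are pushed towards zero, so
`predict_proba` drifts towards 0.5 rather than towards 0 or 1.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, random_state=0)
for C in [100.0, 1.0, 0.001]:
    proba = LogisticRegression(C=C).fit(X, y).predict_proba(X)
    print(C, np.abs(proba - 0.5).mean())   # average confidence drops with C
```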