Complete refactoring for regression / classification #66

Merged
merged 5 commits into from
Jan 25, 2024
5 changes: 5 additions & 0 deletions CHANGELOGS.rst
@@ -1,6 +1,11 @@
Change Logs
===========

0.5.0
+++++

* :pr:`66`: add dependency on patsy in requirements-dev.txt for new content

0.4.0
+++++

1,001 changes: 1,001 additions & 0 deletions _data/2017/persons.txt

Large diffs are not rendered by default.

2,538 changes: 2,538 additions & 0 deletions _data/2017/rendezvous.txt

Large diffs are not rendered by default.

8 changes: 5 additions & 3 deletions _doc/articles/2024/2024-03-01-route2024.rst
@@ -1,4 +1,4 @@
_l-feuille-route-2024:
.. _l-feuille-route-2024:

==================================
2024-03-01 : feuille de route 2024
@@ -11,9 +11,11 @@ Session 1 (26/1)
* :ref:`l-ml-rappel`
* programming (Python: :epkg:`numpy`, :epkg:`pandas`, :epkg:`matplotlib`, :epkg:`jupyter`)
* :ref:`Unit tests <nbl-practice-py-base-tests_unitaires>`, Python package
* CPU, CUDA
* data, SQL
* `SQL <https://en.wikipedia.org/wiki/SQL>`_
* `CPU <https://en.wikipedia.org/wiki/Central_processing_unit>`_,
`CUDA <https://en.wikipedia.org/wiki/CUDA>`_
* machine learning, :epkg:`scikit-learn`, :epkg:`pytorch`
* `torch/scikit-learn comparison <https://sdpython.github.io/doc/experimental-experiment/dev/auto_examples/plot_torch_linreg.html>`_
* :ref:`l-regclass`
* evaluation, :ref:`ROC <l-ml-plot-roc>`, :math:`R^2`
* ranking, clustering
4 changes: 2 additions & 2 deletions _doc/articles/index.rst
@@ -9,8 +9,8 @@ Or *blog*.
:caption: 2024
:maxdepth: 1

2024/2024-01-18-wsl
2024/2024-03-01-route2024
2024/2024-01-18-wsl

.. toctree::
:caption: 2023
@@ -25,6 +25,6 @@ Or *blog*.
:caption: 2022
:maxdepth: 1

2022/2022-11-31-route2022
2022/2022-12-07-cartopy
2022/2022-11-31-route2022
2022/2022-01-01-assurance
4 changes: 2 additions & 2 deletions _doc/c_ml/regclass.rst
@@ -566,8 +566,8 @@ a model.
Exercises
+++++++++

* `Tree, overfitting <http://www.xavierdupre.fr/app/ensae_teaching_cs/helpsphinx/notebooks/ml_a_tree_overfitting.html>`_
* `Comparison of two regressions <http://www.xavierdupre.fr/app/actuariat_python/helpsphinx/notebooks/enonce_2017.html#enonce2017rst>`_
* :ref:`Tree, overfitting <nbl-practice-ml-ml_a_tree_overfitting>`
* :ref:`Comparison of two regressions <nbl-practice-exams-enonce_ml_2017>`

Bibliography
++++++++++++
2 changes: 2 additions & 0 deletions _doc/notebook_gallery.rst
@@ -140,6 +140,7 @@ Exam solutions
practice/exams/td_note_2015
practice/exams/td_note_2016
practice/exams/td_note_2017
practice/exams/enonce_ml_2017_correction
practice/exams/td_note_2017_2
practice/exams/td_note_2018_1
practice/exams/td_note_2018_2
@@ -176,3 +177,4 @@ Machine Learning
practice/ml/artificiel_multiclass
practice/ml/ml_features_model
practice/ml/timeseries_ssa
practice/ml/ml_a_tree_overfitting
2 changes: 1 addition & 1 deletion _doc/practice/algo-compose/exercice_morse.ipynb
@@ -57,7 +57,7 @@
"source": [
"### Exercise 1: Translate a Morse text that contains no spaces\n",
"\n",
"This is a classic programming exercise. It is already solved and explained on [Codingame](http://www.synbioz.com/blog/exercice_de_programmation_codingame). But one can start, for example, with a regular expression. Another option is to use a *trie*."
"This is a classic programming exercise. Solutions can be found online. One can start, for example, with a regular expression. Another option is to use a *trie*."
]
},
{
9 changes: 9 additions & 0 deletions _doc/practice/exams.rst
@@ -75,3 +75,12 @@ Short exercises
exams/interro_rapide_20_minutes_2014_12
exams/interro_rapide_20_minutes_2015_09
exams/interro_rapide_20_minutes_2015_11

Graded machine learning sessions
================================

.. toctree::
:maxdepth: 1

exams/enonce_ml_2017
exams/enonce_ml_2017_correction
220 changes: 220 additions & 0 deletions _doc/practice/exams/enonce_ml_2017.ipynb
@@ -0,0 +1,220 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Python / Machine Learning assessment, 2017 - problem statement\n",
"\n",
"The [_data/2017](https://github.com/sdpython/teachpyx/tree/main/_data/2017) directory contains two randomly simulated CSV files to be used to answer the 10 questions below. Each question is worth two points. The work is due on Monday, February 20, as a notebook sent as an email attachment."
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"## 1\n",
"\n",
"Two files are extracted from a doctor's database.\n",
"One file contains information about people, the other\n",
"about the appointments made by these people. What are they?"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2\n",
"\n",
"We want to study the relationship between the average price paid by a person,\n",
"their age and their gender. Compute the average price paid by each person."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3\n",
"\n",
"Join the two tables."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4\n",
"\n",
"Draw two scatter plots: (age, average price) and (gender, average price)."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 5\n",
"\n",
"Compute the coefficients of the regression $prix\\_moyen \\sim age + genre$."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 6\n",
"\n",
"We want to study the price of a consultation as a function of the day of the week.\n",
"Add a column with the day of the week to the table of your choice."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 7\n",
"\n",
"Create a box plot to check this hypothesis."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 8\n",
"\n",
"Add a column to the table of your choice that contains 365 if it is the first appointment, and otherwise the number of days elapsed since the previous appointment. This column is called $delay$. Also add the column $1/delay$."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 9\n",
"\n",
"Compute the coefficients of the regression $prix \\sim age + genre + delay + 1/delay + jour\\_semaine$.\n"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 10\n",
"\n",
"How can this model be compared with the previous one? Implement the computation that lets you answer this question."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.2"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
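
Reviewer note on questions 6 and 8 of the notebook above: the weekday and $delay$ columns can be illustrated with a minimal standard-library sketch. The record layout `(person_id, appointment_date, price)` and the helper name `add_weekday_and_delay` are assumptions made for illustration; the actual column names in `_data/2017/rendezvous.txt` are not visible in this diff. This is not the reference solution from `enonce_ml_2017_correction`.

```python
from datetime import date

# Hypothetical appointment records: (person_id, appointment_date, price).
# The real columns of _data/2017/rendezvous.txt are not shown in the diff.
appointments = [
    ("p1", date(2016, 1, 4), 50.0),
    ("p2", date(2016, 1, 5), 65.0),
    ("p1", date(2016, 1, 18), 50.0),
    ("p1", date(2016, 3, 1), 80.0),
]


def add_weekday_and_delay(rows, first_delay=365):
    """Return one dict per appointment with weekday, delay and 1/delay.

    delay is `first_delay` for a person's first appointment, otherwise
    the number of days since that person's previous appointment.
    """
    last_seen = {}  # person_id -> date of the most recent appointment
    out = []
    # Sort by person, then by date, so each person's history is chronological.
    for person, day, price in sorted(rows, key=lambda r: (r[0], r[1])):
        previous = last_seen.get(person)
        delay = first_delay if previous is None else (day - previous).days
        out.append({
            "person": person,
            "date": day,
            "price": price,
            "weekday": day.weekday(),  # 0 = Monday, 6 = Sunday
            "delay": delay,
            # Two appointments on the same day give delay == 0;
            # 1/delay is left undefined (None) in that case.
            "inv_delay": 1.0 / delay if delay else None,
        })
        last_seen[person] = day
    return out


rows = add_weekday_and_delay(appointments)
for r in rows:
    print(r["person"], r["date"], r["weekday"], r["delay"])
```

With pandas, the same idea would typically be expressed as a sort followed by a per-person difference of dates; the plain-Python version is used here only to keep the sketch dependency-free.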