Skip to content

Commit

Permalink
Rattrapage (#69)
Browse files Browse the repository at this point in the history
* enonce rattrapage

* last

* add exam
  • Loading branch information
xadupre authored Mar 18, 2024
1 parent b750121 commit e2784fa
Show file tree
Hide file tree
Showing 4 changed files with 342 additions and 1 deletion.
2 changes: 2 additions & 0 deletions _doc/notebook_gallery.rst
Original file line number Diff line number Diff line change
Expand Up @@ -153,6 +153,8 @@ Correction d'examens
practice/exams/td_note_2022_rattrapage
practice/exams/td_note_2022_rattrapage2
practice/exams/td_note_2023
practice/exams/td_note_2023-2024
practice/exams/td_note_2023-2024_rattrapage

Machine Learning
================
Expand Down
2 changes: 2 additions & 0 deletions _doc/practice/exams.rst
Original file line number Diff line number Diff line change
Expand Up @@ -36,6 +36,7 @@ Enoncés
* :download:`td_note_2022_rattrapage2 <exams/td_note_2022_rattrapage2.pdf>`
* :download:`td_note_2023 <exams/td_note_2023.pdf>`
* :download:`td_note_2023-2024 <exams/td_note_2023-2024.pdf>`
* :download:`td_note_2023-2024_rattrapage <exams/td_note_2023-2024_rattrapage.pdf>`

Corrections
+++++++++++
Expand All @@ -62,6 +63,7 @@ Corrections
exams/td_note_2022_rattrapage2
exams/td_note_2023
exams/td_note_2023-2024
exams/td_note_2023-2024_rattrapage

Exercices courts
================
Expand Down
337 changes: 337 additions & 0 deletions _doc/practice/exams/td_note_2023-2024_rattrapage.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,337 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 1A - Enoncé 2024\n",
"\n",
"Toutes les questions valent deux points."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Exercice 1 : recherches de motifs\n",
"\n",
"On suppose qu'on a une séquence d'ADN ``\"ttcagttgtg aatgaatgga cgtgccaaat agacgtgccg ccgccgctcg attcgcactt cgtgccaaat\"``. Les caractères `a, t, g, c` sont appelés bases (nucléiques) et le fragment d'ADN est donc une séquence de bases.\n",
"\n",
"On s'intéresse dans ce sujet à la recherche d'un motif, c'est-à-dire d'une sous-séquence de bases, par exemple `ccaa`.\n",
"\n",
"On pourra écrire ``text = \"ttcagttgtg aatgaatgga cgtgccaaat agacgtgccg ccgccgctcg attcgcactt cgtgccaaat\".replace(\" \", \"\")``."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Q1: Ecrire une fonction qui retourne la première position du motif en utilisant une boucle et sans la méthode `index`\n",
"\n",
"La fonction retourne la longueur du texte cherché si le motif n'est pas trouvé.\n",
"\n",
"``def recherche_motif(text, motif):``"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"24\n",
"70\n"
]
}
],
"source": [
"def recherche_motif(text, motif):\n",
" d = len(motif)\n",
" for i in range(len(text) - d + 1):\n",
" if text[i : i + d] == motif:\n",
" return i\n",
" return len(text)\n",
"\n",
"\n",
"text = \"ttcagttgtg aatgaatgga cgtgccaaat agacgtgccg ccgccgctcg attcgcactt cgtgccaaat\".replace(\n",
" \" \", \"\"\n",
")\n",
"motif = \"ccaa\"\n",
"\n",
"print(recherche_motif(text, motif))\n",
"print(recherche_motif(text, \"aaaaaa\"))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Q2 : quel est, dans le pire des cas, le coût de l'algorithme ?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Q3 : On souhaite maintenant écrire une fonction qui retourne toutes les positions d'un motif au sein d'un texte.\n",
"\n",
"``def positions_motif(text, motif):``"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[24, 64]\n",
"[]\n"
]
}
],
"source": [
"def positions_motif(text, motif):\n",
" pos = []\n",
" p = recherche_motif(text, motif)\n",
" while p < len(text) and len(pos) < 5:\n",
" pos.append(p)\n",
" p += len(motif) + recherche_motif(text[p + len(motif) :], motif)\n",
" return pos\n",
"\n",
"\n",
"text = \"ttcagttgtg aatgaatgga cgtgccaaat agacgtgccg ccgccgctcg attcgcactt cgtgccaaat\".replace(\n",
" \" \", \"\"\n",
")\n",
"motif = \"ccaa\"\n",
"\n",
"print(positions_motif(text, motif))\n",
"print(positions_motif(text, \"aaaaaaaaaa\"))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Q4 : quel est, dans le pire des cas, le coût de l'algorithme ?"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Q5 : comment modifier (ou pas) votre fonction pour cet exemple ?\n",
"\n",
"```python\n",
"text = \"cgtgccaaaccaaacc\"\n",
"motif = \"ccaaac\"\n",
"assert positions_motif(text, motif) == [4, 9]\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[4]\n",
"[4, 9]\n"
]
}
],
"source": [
"def positions_motif_chevauchement(text, motif):\n",
" pos = []\n",
" p = recherche_motif(text, motif)\n",
" while p < len(text) and len(pos) < 5:\n",
" pos.append(p)\n",
" p += 1 + recherche_motif(text[p + 1 :], motif)\n",
" return pos\n",
"\n",
"\n",
"text = \"cgtgccaaaccaaacc\"\n",
"motif = \"ccaaac\"\n",
"\n",
"print(positions_motif(text, motif))\n",
"print(positions_motif_chevauchement(text, motif))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Q6 : On veut accélérer le processus. On considère un couple de bases et non plus une seule base.\n",
"\n",
"Ecrire une fonction une fonction qui construit tous les couples de bases à partir des bases a,c,g,t."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"['aa', 'ac', 'ag', 'at', 'ca', 'cc', 'cg', 'ct', 'ga', 'gc', 'gg', 'gt', 'ta', 'tc', 'tg', 'tt']\n"
]
}
],
"source": [
"def couple_base():\n",
" res = []\n",
" for a in \"acgt\":\n",
" for b in \"acgt\":\n",
" res.append(a + b)\n",
" return res\n",
"\n",
"\n",
"print(couple_base())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Q7 : Transformer cette liste en dictionnaire.\n",
"\n",
"Le dictionnaire devra avoir pour clé un couple, pour valeur un nombre entier unique."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'aa': 0, 'ac': 1, 'ag': 2, 'at': 3, 'ca': 4, 'cc': 5, 'cg': 6, 'ct': 7, 'ga': 8, 'gc': 9, 'gg': 10, 'gt': 11, 'ta': 12, 'tc': 13, 'tg': 14, 'tt': 15}\n"
]
}
],
"source": [
"res = couple_base()\n",
"dico = {couple: i for i, couple in enumerate(res)}\n",
"print(dico)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Q8 : Ecrire une fonction qui découpe une séquence de bases en séquence de couples de bases.\n",
"\n",
"Chaque gène sera représenté par le nombre entier défini à la question précédente. On supposera que la séquence est de longueur paire. On fera de même pour le motif."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[15, 4, 11, 14, 14, 0, 14, 0, 14, 8, 6, 14, 5, 0, 3, 2, 1, 11, 9, 6, 5, 9, 6, 7, 6, 3, 13, 9, 1, 15, 6, 14, 5, 0, 3]\n",
"[5, 0]\n"
]
}
],
"source": [
"def decoupe_sequence(text, dico):\n",
" return [dico[text[i : i + 2]] for i in range(0, len(text), 2)]\n",
"\n",
"\n",
"text = \"ttcagttgtg aatgaatgga cgtgccaaat agacgtgccg ccgccgctcg attcgcactt cgtgccaaat\".replace(\n",
" \" \", \"\"\n",
")\n",
"motif = \"ccaa\"\n",
"text_couple = decoupe_sequence(text, dico)\n",
"print(text_couple)\n",
"motif_couple = decoupe_sequence(motif, dico)\n",
"print(motif_couple)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Q9 : appliquer l'une de vos précédente fonction de recherche avec ces deux séquences pour trouver les positions."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[24, 64]"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"def positions_motif_couple(text_couple, motif_couple):\n",
" position = positions_motif(text_couple, motif_couple)\n",
" return [p * 2 for p in position]\n",
"\n",
"\n",
"positions_motif_couple(text_couple, motif_couple)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Q10 : coût de la recherche et résultats équivalents\n",
"\n",
"* Quelle recherche est plus rapide ? Combien de fois plus rapide ?\n",
"* Pourquoi ces deux recherches ne sont pas équivalentes ?\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
2 changes: 1 addition & 1 deletion _unittests/ut_xrun_doc/test_normalize_notebook.py
Original file line number Diff line number Diff line change
Expand Up @@ -44,7 +44,7 @@ def run_test(self, nb_name: str, verbose=0) -> int:
return 1
raise AssertionError(
f"Normalization should be run on {nb_name!r}. "
f"Set NB_NORMALIZE to 1 and rerun this file."
f"Set NB_NORMALIZE=1 and rerun this file."
)
return 1

Expand Down

0 comments on commit e2784fa

Please sign in to comment.