From cfa054187f8b9241542ba4bb338ee292bb071b15 Mon Sep 17 00:00:00 2001
From: 3mmaRand <7593411+3mmaRand@users.noreply.github.com>
Date: Thu, 28 Sep 2023 17:51:53 +0000
Subject: [PATCH] =?UTF-8?q?Deploying=20to=20gh-pages=20from=20@=203mmaRand?= =?UTF-8?q?/BIO00088H-data@4230fd0bf94cbb8019c647fcd2c6d551769d3654=20?= =?UTF-8?q?=F0=9F=9A=80?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 core/week-1/study_after_workshop.html |  9 ++----
 core/week-1/workshop.html             | 46 +++++++++++++++++----------
 core/week-2/workshop.html             |  9 +++++-
 search.json                           | 28 ++++++++++++----
 4 files changed, 61 insertions(+), 31 deletions(-)

diff --git a/core/week-1/study_after_workshop.html b/core/week-1/study_after_workshop.html
index d51533c..a6bd32b 100644
@@ -270,20 +270,17 @@
These are suggestions
Five selfish reasons to work reproducibly. Alternatively, see the very entertaining talk
Five selfish reasons to work reproducibly (Markowetz 2015). Alternatively, see the very entertaining talk
Many high profile cases of work which did not reproduce e.g. Anil Potti unravelled by Baggerly and Coombes (2009)
Will become standard in Science and publishing e.g OECD Global Science Forum Building digital workforce capacity and skills for data-intensive science OECD Global Science Forum (2020)
Will become standard in Science and publishing e.g OECD Global Science Forum Building digital workforce capacity and skills for data-intensive science (OECD Global Science Forum 2020)
use folders to organise your work
you are aiming for structured, systematic and repeatable.
inputs and outputs should be clearly identifiable from structure and/or naming
Example
+Examples
-- liver_transcriptome/
|__data
|__raw/
|__processed/
|__images/
- |__R/
+ |__code/
|__reports/
|__figures/
Guiding principle - names of files and directories should be systematic and readable by humans and machines. Have a convention!
+Guiding principle - Have a convention! Good file names are:
+machine readable
human readable
play nicely with sorting
I suggest
no spaces in names
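A minimal Python sketch of how the naming conventions in this section (no spaces, snake_case, left-padded numbers, ISO 8601 dates) might be applied when building file names. The helper names `snake_case`, `dated_name` and `numbered_name` are hypothetical and not part of the workshop materials.

```python
import re
from datetime import date


def snake_case(text: str) -> str:
    """Lower-case the text and join its alphanumeric runs with underscores."""
    words = re.findall(r"[A-Za-z0-9]+", text)
    return "_".join(word.lower() for word in words)


def dated_name(day: date, description: str, ext: str) -> str:
    """Build a name with the ISO 8601 date first, e.g. 2022-03-21_donor_1.csv."""
    return f"{day.isoformat()}_{snake_case(description)}.{ext}"


def numbered_name(index: int, description: str, ext: str, width: int = 2) -> str:
    """Build a name with a left-padded number first, e.g. 01_report.qmd."""
    return f"{index:0{width}d}_{snake_case(description)}.{ext}"


# Hypothetical inputs: spaces and capitals are removed by snake_case().
names = [
    dated_name(date(2022, 5, 14), "Donor 1", "csv"),
    dated_name(date(2022, 3, 21), "Donor 2", "csv"),
    numbered_name(2, "Supplementary", "qmd"),
    numbered_name(1, "Report", "qmd"),
]

# Plain alphabetical sorting gives numeric and chronological order.
for name in sorted(names):
    print(name)
```

Because the number or date comes first and is fixed-width, an ordinary alphabetical sort in a file browser or script puts the files in the intended order.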
READMEs are a form of documentation which have been widely used for a long time. They contain all the information about the other files in a directory. They can be extensive but need not be. Concise is good. Bullet points are good
Give a project description, brief
Give a project title and description, brief
start date, last updated date and contact information
Outline the folder structure
Give software requirements: programs and versions used or required. There are packages that give session information in R (Wickham et al. 2021) and Python (Ostblom 2019)
R:
-```
-#| eval: false
-sessioninfo::session_info()
-```
+sessioninfo::session_info()
Python:
-```
-#| eval: false
-import session_info
-session_info.show()
-
-```
+import session_info
session_info.show()
Instructions to run the code, build reports, and reproduce the figures etc.
Where to find the data and outputs
Any other information that is needed to understand and recreate the work
Ideally, a summary of changes with the date
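As a rough illustration of the README items listed above, the following Python sketch assembles a short README body using only the standard library. It is not the `session_info` approach shown earlier: it swaps in `importlib.metadata` to look up package versions. The function names, the placeholder description, the contact address and the package list are all assumptions made for the example.

```python
import platform
from datetime import date
from importlib import metadata


def software_lines(packages):
    """List the Python version and the installed version of each package."""
    lines = [f"- Python {platform.python_version()}"]
    for name in packages:
        try:
            lines.append(f"- {name} {metadata.version(name)}")
        except metadata.PackageNotFoundError:
            lines.append(f"- {name} (not installed)")
    return lines


def readme_text(title, description, contact, started, packages):
    """Assemble a short README body: title, description, dates, contact, software."""
    parts = [
        f"# {title}",
        "",
        description,
        "",
        f"- Started: {started.isoformat()}",
        f"- Last updated: {date.today().isoformat()}",
        f"- Contact: {contact}",
        "",
        "## Software requirements",
        *software_lines(packages),
    ]
    return "\n".join(parts) + "\n"


# Hypothetical values, for illustration only.
print(readme_text(
    title="liver_transcriptome",
    description="Placeholder description of the project.",
    contact="name@example.org",
    started=date(2022, 3, 21),
    packages=["pandas"],
))
```

The folder outline, run instructions and change summary would still need to be written by hand; only the dates and version list lend themselves to being generated.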
-- liver_transcriptome/
|__data
@@ -510,6 +512,9 @@ Code comments
GitHub Copilot demo
+
+Quarto demo
+
Useful exercises
@@ -520,11 +525,15 @@ Useful exercises
🎬 Update R
🎬 Update RStudio. You will need the prerelease Desert Sunflower for GitHub Copilot integration
Install package building tools
-🎬 Install Rtools (windows) or Xcode (mac)
+🎬 Windows: install Rtools
+🎬 Mac: install Xcode from the Mac App Store
Update packages:
🎬 devtools, tidyverse, BiocManager, readxl
Install Quarto
+Install Zotero
+🎬 Install Zotero
+
You’re finished!
@@ -550,6 +559,9 @@ Independent study following the workshop
Baggerly, Keith A, and Kevin R Coombes. 2009. “DERIVING CHEMOSENSITIVITY FROM CELL LINES: FORENSIC BIOINFORMATICS AND REPRODUCIBLE RESEARCH IN HIGH-THROUGHPUT BIOLOGY.” Ann. Appl. Stat. 3 (4): 1309–34. https://doi.org/10.2307/27801549.
+
+Markowetz, Florian. 2015. “Five Selfish Reasons to Work Reproducibly.” Genome Biol. 16 (December): 274. https://doi.org/10.1186/s13059-015-0850-7.
+
National Academies of Sciences, Engineering, Medicine, Policy, Global Affairs, Engineering, Medicine Committee on Science, Public Policy, Board on Research Data, et al. 2019. Understanding Reproducibility and Replicability. National Academies Press (US). https://www.ncbi.nlm.nih.gov/books/NBK547546/.
diff --git a/core/week-2/workshop.html b/core/week-2/workshop.html
index da9ab37..c8cfcee 100644
--- a/core/week-2/workshop.html
+++ b/core/week-2/workshop.html
@@ -283,9 +283,16 @@ Workshop
Session overview
In this workshop you will
File formats
-Data files. - Sequences data - Image data - Structure data
+Data files. - Sequences data - Image data - Structure data PDB/mmCIF www.pdb.org
Similarities and differences
🎬
+what is markdown
+Google Colab
+snippets
+python
+differences between r and python
+rstudio terminal
+basic bash
You’re finished!
🥳 Well Done! 🎉
Independent study following the workshop
diff --git a/search.json b/search.json
index e2263f7..2e850f2 100644
--- a/search.json
+++ b/search.json
@@ -6,12 +6,26 @@
"section": "",
"text": "About this site"
},
+ {
+ "objectID": "core/week-1/study_after_workshop.html",
+ "href": "core/week-1/study_after_workshop.html",
+ "title": "Independent Study to consolidate this week",
+ "section": "",
+ "text": "These are suggestions"
+ },
+ {
+ "objectID": "core/week-1/study_after_workshop.html#bio00088h-group-research-project-students",
+ "href": "core/week-1/study_after_workshop.html#bio00088h-group-research-project-students",
+ "title": "Independent Study to consolidate this week",
+ "section": "BIO00088H Group Research Project students",
+ "text": "BIO00088H Group Research Project students\n\nRevise previous Data Analysis materials. You can find the version you took on the VLE site for 17C or 08C. However, my latest versions (in development) are here: Data Analysis in R. The Becoming a Bioscientist (BABS) modules replace the Laboratory and Professional Skills modules. BABS1 and BABS1 are stage one, and I’ve tried to improve them over 17C and 08C. The site is also searchable (icon top right)"
+ },
{
"objectID": "core/week-1/study_after_workshop.html#msc-bioinformatics-students-doing-bio00070m",
"href": "core/week-1/study_after_workshop.html#msc-bioinformatics-students-doing-bio00070m",
"title": "Independent Study to consolidate this week",
"section": "MSc Bioinformatics students doing BIO00070M",
- "text": "MSc Bioinformatics students doing BIO00070M"
+ "text": "MSc Bioinformatics students doing BIO00070M\n\nMake sure you carry out the preparatory work for week 2 of 52M"
},
{
"objectID": "core/week-1/workshop.html",
@@ -39,7 +53,7 @@
"href": "core/week-1/workshop.html#why-does-it-matter",
"title": "Workshop",
"section": "Why does it matter?",
- "text": "Why does it matter?\n\n\n\nfutureself, CC-BY-NC, by Julen Colomb\n\n\n\nFive selfish reasons to work reproducibly. Alternatively, see the very entertaining talk\nMany high profile cases of work which did not reproduce e.g. Anil Potti unravelled by Baggerly and Coombes (2009)\nWill become standard in Science and publishing e.g OECD Global Science Forum Building digital workforce capacity and skills for data-intensive science OECD Global Science Forum (2020)"
+ "text": "Why does it matter?\n\n\n\nfutureself, CC-BY-NC, by Julen Colomb\n\n\n\nFive selfish reasons to work reproducibly (Markowetz 2015). Alternatively, see the very entertaining talk\nMany high profile cases of work which did not reproduce e.g. Anil Potti unravelled by Baggerly and Coombes (2009)\nWill become standard in Science and publishing e.g OECD Global Science Forum Building digital workforce capacity and skills for data-intensive science (OECD Global Science Forum 2020)"
},
{
"objectID": "core/week-1/workshop.html#how-to-achieve-reproducibility",
@@ -60,21 +74,21 @@
"href": "core/week-1/workshop.html#project-oriented-workflow",
"title": "Workshop",
"section": "Project-oriented workflow",
- "text": "Project-oriented workflow\n\nuse folders to organise your work\nyou are aiming for structured, systematic and repeatable.\n\nExample\n-- liver_transcriptome/\n |__data\n |__raw/\n |__processed/\n |__images/\n |__R/\n |__reports/\n |__figures/"
+ "text": "Project-oriented workflow\n\nuse folders to organise your work\nyou are aiming for structured, systematic and repeatable.\ninputs and outputs should be clearly identifiable from structure and/or naming\n\nExamples\n-- liver_transcriptome/\n |__data\n |__raw/\n |__processed/\n |__images/\n |__code/\n |__reports/\n |__figures/"
},
{
"objectID": "core/week-1/workshop.html#naming-things",
"href": "core/week-1/workshop.html#naming-things",
"title": "Workshop",
"section": "Naming things",
- "text": "Naming things\n\n\n\ndocuments, CC-BY-NC, https://xkcd.com/1459/\n\n\nGuiding principle - names of files and directories should be systematic and readable by humans and machines. Have a convention!\nI suggest\n\nno spaces in names\nuse snake_case or kebab-case rather than CamelCase or dot.case\nuse all lower case except very occasionally where convention is otherwise, e.g., README, LICENSE\nordering: use left-padded numbers e.g., 01, 02….99 or 001, 002….999\ndates ISO 8601 format: 2020-10-16\nwrite down your conventions\n\n-- liver_transcriptome/\n |__data\n |__raw/\n |__2022-03-21_donor_1.csv\n |__2022-03-21_donor_2.csv\n |__2022-03-21_donor_3.csv\n |__2022-05-14_donor_1.csv\n |__2022-05-14_donor_2.csv\n |__2022-05-14_donor_3.csv\n |__processed/\n |__images/\n |__code/\n |__functions/\n |__summarise.R\n |__normalise.R\n |__theme_volcano.R\n |__01_data_processing.py\n |__02_exploratory.R\n |__03_modelling.R\n |__04_figures.R\n |__reports/\n |__01_report.qmd\n |__02_supplementary.qmd\n |__figures/\n |__01_volcano_donor_1_vs_donor_2.eps\n |__02_volcano_donor_1_vs_donor_3.eps"
+ "text": "Naming things\n\n\n\ndocuments, CC-BY-NC, https://xkcd.com/1459/\n\n\nGuiding principle - Have a convention! Good file names are:\n\nmachine readable\nhuman readable\nplay nicely with sorting\n\nI suggest\n\nno spaces in names\nuse snake_case or kebab-case rather than CamelCase or dot.case\nuse all lower case except very occasionally where convention is otherwise, e.g., README, LICENSE\nordering: use left-padded numbers e.g., 01, 02….99 or 001, 002….999\ndates ISO 8601 format: 2020-10-16\nwrite down your conventions\n\n-- liver_transcriptome/\n |__data\n |__raw/\n |__2022-03-21_donor_1.csv\n |__2022-03-21_donor_2.csv\n |__2022-03-21_donor_3.csv\n |__2022-05-14_donor_1.csv\n |__2022-05-14_donor_2.csv\n |__2022-05-14_donor_3.csv\n |__processed/\n |__images/\n |__code/\n |__functions/\n |__summarise.R\n |__normalise.R\n |__theme_volcano.R\n |__01_data_processing.py\n |__02_exploratory.R\n |__03_modelling.R\n |__04_figures.R\n |__reports/\n |__01_report.qmd\n |__02_supplementary.qmd\n |__figures/\n |__01_volcano_donor_1_vs_donor_2.eps\n |__02_volcano_donor_1_vs_donor_3.eps"
},
{
"objectID": "core/week-1/workshop.html#readme-files",
"href": "core/week-1/workshop.html#readme-files",
"title": "Workshop",
"section": "Readme files",
- "text": "Readme files\nREADMEs are a form of documentation which have been widely used for a long time. They contain all the information about the other files in a directory. They can be extensive but need not be. Concise is good. Bullet points are good\n\nGive a project description, brief\nOutline the folder structure\nGive software requirements: programs and versions used or required. There are packages that give session information in R Wickham et al. (2021) and Python Ostblom, Joel (2019)\n\nR:\n```\n#| eval: false\nsessioninfo::session_info()\n```\nPython:\n```\n#| eval: false\nimport session_info\nsession_info.show()\n\n```\n\nInstructions run the code, build reports, and reproduce the figures etc\nWhere to find the data, outputs\nAny other information that needed to understand and recreate the work\n\n-- liver_transcriptome/\n |__data\n |__raw/\n |__2022-03-21_donor_1.csv\n |__2022-03-21_donor_2.csv\n |__2022-03-21_donor_3.csv\n |__2022-05-14_donor_1.csv\n |__2022-05-14_donor_2.csv\n |__2022-05-14_donor_3.csv\n |__processed/\n |__images/\n |__code/\n |__functions/\n |__summarise.R\n |__normalise.R\n |__theme_volcano.R\n |__01_data_processing.py\n |__02_exploratory.R\n |__03_modelling.R\n |__04_figures.R\n |__README.md\n |__reports/\n |__01_report.qmd\n |__02_supplementary.qmd\n |__figures/\n |__01_volcano_donor_1_vs_donor_2.eps\n |__02_volcano_donor_1_vs_donor_3.eps"
+ "text": "Readme files\nREADMEs are a form of documentation which have been widely used for a long time. They contain all the information about the other files in a directory. They can be extensive but need not be. Concise is good. Bullet points are good\n\nGive a project title and description, brief\nstart date, last updated date and contact information\nOutline the folder structure\nGive software requirements: programs and versions used or required. There are packages that give session information in R Wickham et al. (2021) and Python Ostblom, Joel (2019)\n\nR:\nsessioninfo::session_info()\nPython:\nimport session_info\nsession_info.show()\n\nInstructions run the code, build reports, and reproduce the figures etc\nWhere to find the data, outputs\nAny other information that needed to understand and recreate the work\nIdeally, a summary of changes with the date\n\n-- liver_transcriptome/\n |__data\n |__raw/\n |__2022-03-21_donor_1.csv\n |__2022-03-21_donor_2.csv\n |__2022-03-21_donor_3.csv\n |__2022-05-14_donor_1.csv\n |__2022-05-14_donor_2.csv\n |__2022-05-14_donor_3.csv\n |__processed/\n |__images/\n |__code/\n |__functions/\n |__summarise.R\n |__normalise.R\n |__theme_volcano.R\n |__01_data_processing.py\n |__02_exploratory.R\n |__03_modelling.R\n |__04_figures.R\n |__README.md\n |__reports/\n |__01_report.qmd\n |__02_supplementary.qmd\n |__figures/\n |__01_volcano_donor_1_vs_donor_2.eps\n |__02_volcano_donor_1_vs_donor_3.eps"
},
{
"objectID": "core/week-1/workshop.html#code-comments",
@@ -95,7 +109,7 @@
"href": "core/week-2/workshop.html",
"title": "Workshop",
"section": "",
- "text": "In this workshop you will\n\nData files. - Sequences data - Image data - Structure data\nSimilarities and differences\n🎬\nYou’re finished!"
+ "text": "In this workshop you will\n\nData files. - Sequences data - Image data - Structure data PDB/mmCIF www.pdb.org\nSimilarities and differences\n🎬\nwhat is markdown\nGoogle Colab\nsnippets\npython\ndifferences between r and python\nrstudio terminal\nbasic bash\nYou’re finished!"
},
{
"objectID": "core/week-2/workshop.html#session-overview",
@@ -109,7 +123,7 @@
"href": "core/week-2/workshop.html#file-formats",
"title": "Workshop",
"section": "",
- "text": "Data files. - Sequences data - Image data - Structure data\nSimilarities and differences\n🎬\nYou’re finished!"
+ "text": "Data files. - Sequences data - Image data - Structure data PDB/mmCIF www.pdb.org\nSimilarities and differences\n🎬\nwhat is markdown\nGoogle Colab\nsnippets\npython\ndifferences between r and python\nrstudio terminal\nbasic bash\nYou’re finished!"
},
{
"objectID": "core/week-6/study_after_workshop.html",