More information: <https://docs.conda.io/en/latest/>
334
+
```{solution} Some things to note
335
+
- Everything is listed as you installed it; with or without specified versions
336
+
- Using this environment file a few days/weeks later will likely not result in the same environment
337
+
- This can be a good starting point for a reproducible environment as you may add your current version numbers to it (check for example with `conda list | grep "packagename"`)
338
+
```
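The note above about adding your current version numbers to the environment file can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function `pin_versions`, the package list, and the version numbers are hypothetical, and in practice the installed versions would come from `conda list`.

```python
def pin_versions(requested, installed):
    """Return dependency specs with exact versions filled in where missing.

    `requested` is a list of specs as they might appear in an environment
    file, `installed` maps package names to installed versions (hypothetical
    data here; in practice taken from `conda list`).
    """
    pinned = []
    for spec in requested:
        # Specs that already carry a version constraint are kept as they are.
        if any(op in spec for op in ("=", "<", ">")):
            pinned.append(spec)
        # Unpinned packages get the currently installed version appended.
        elif spec in installed:
            pinned.append(f"{spec}={installed[spec]}")
        # Packages we know nothing about are left unchanged.
        else:
            pinned.append(spec)
    return pinned


requested = ["python=3.11", "numpy", "pandas"]
installed = {"numpy": "1.26.4", "pandas": "2.2.1"}  # hypothetical versions
print(pin_versions(requested, installed))
```

The resulting list could then be copied into the `dependencies` section of the environment file, so that recreating the environment later asks for the same versions.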
339
+
340
+
In daily use, you may not always create the full environment from an environment.yml file. Instead, you may create a base environment and then add new packages with `conda install packagename` as you go. Those packages will also be listed in the environment files created with either of the approaches above.
320
341
321
-
See also: <https://github.com/mamba-org/mamba>
342
+
More information: <https://docs.conda.io/en/latest/> and <https://github.com/mamba-org/mamba>
322
343
````
323
344
324
345
````{group-tab} Python virtualenv
@@ -355,5 +376,5 @@ information?
355
376
356
377
```{keypoints}
357
378
- Recording dependencies with versions can make it easier for the next person to execute your code
358
-
- There are many tools to record dependencies
379
+
- There are many tools to record dependencies and separate environments
content/intro.md
Lines changed: 18 additions & 11 deletions
@@ -16,27 +16,34 @@
16
16
17
17
## This workshop is all about reproducibility - from a computational perspective
18
18
19
+
This section connects the steps above to the CodeRefinery workshop lessons.
20
+
19
21
**"Here is my code"**
20
22
21
-
->**Version control with git** with focus on collaboration
22
-
->**Social coding**: What can you do to get credit for your code and to allow reuse
23
-
->**Documentation**: How to let others or future you know about your thoughts and how to use your code
24
-
->**Jupyter Notebooks**: A tool to write and share executable notebooks and data visualization
25
-
->**Automated testing**: Preventing yourself and others from breaking your functioning code
26
-
->**Modular code development**: Making reusing parts of your code easier
23
+
- **Version control with git** with focus on collaboration
24
+
- **Social coding**: What can you do to get credit for your code and to allow reuse
25
+
- **Documentation**: How to let others or future you know about your thoughts and how to use your code
26
+
- **Jupyter Notebooks**: A tool to write and share executable notebooks and data visualization
27
+
- **Automated testing**: Preventing yourself and others from breaking your functioning code
28
+
- **Modular code development**: Making reusing parts of your code easier
27
29
28
-
**"Here are my tools"**
30
+
**"Here are my tools"**
29
31
30
-
-> This lesson on general **Reproducibility**: Preparing code to be usable by you and others in the future
32
+
This lesson on general **Reproducibility**: Preparing code to be usable by you and others in the future
31
33
32
34
This includes organizing your projects on your own computer and recording your computational steps, dependencies and computing environment.
33
35
34
-
We will also mention a few tools and platforms for sharing data (**"Here is my data"**) and research outputs(**"Here are my results"**), but they are not the focus of this workshop.
36
+
We will also mention a few tools and platforms for sharing data (**"Here is my data"**) and research outputs (**"Here are my results"**) in the **social coding** lesson, but they are not the focus of this workshop.
35
37
36
38
## Small steps towards reproducible research
37
39
38
40
If this is all new to you, it may feel quite overwhelming.
39
-
Our recommendation: Focus on "good enough" instead of perfect: To start, pick one topic that seems reasonable to implement for your current project. Something that helps YOU right now. Some things you may have to implement due to requirements from your funders or the journal where you want to publish your research. Use their requirements as a checklist and find tools that feel comfortable for you.
40
-
A great way to see what are the really important things to implement, meet with a colleague, exchange codes and try to run each others code. Every question your colleague has to ask from you about your code gives a hint on where you may need to improve your documentation.
41
+
42
+
**Our recommendation:** Don't worry! Focus on "good enough" instead of perfect.
43
+
44
+
To start, pick one topic that seems reasonable to implement for your current project. Something that helps YOU right now. This may be something you have to implement because of requirements from your funders or the journal where you want to publish your research. Use their requirements as a checklist and find tools that feel comfortable for you.
45
+
46
+
A great way to find out which things are really important to implement is to meet with a colleague, exchange code, and try to run each other's code. Every question your colleague has to ask you about your code hints at where you may need to improve.
47
+
41
48
Keeping a "log book" while working on your own code also serves as a great basis for making your code more reproducible. Can you use any of the tools and techniques learned in this workshop to share parts of your log book with others to help them run your code?
-[Reproducible research template](https://github.com/the-turing-way/reproducible-project-template) by the Turing Way
71
+
72
+
More tools and templates in [Heidi Seibold's blog](https://heidiseibold.ck.page/posts/setting-up-a-fair-and-reproducible-project).
73
+
59
74
60
75
---
61
76
62
-
## Discussion on reproducibility
77
+
## Excursion: Reproducible publications
78
+
79
+
### Discussion on collaborative writing of academic papers
63
80
64
81
````{discussion} Discuss in the collaborative document:
65
82
66
-
**How do you collaborate on writing academic papers?**
67
83
```
68
-
- Are you using version control for academic papers?
84
+
- How do you collaborate on writing academic papers?
69
85
- ...
70
86
- ...
71
87
- (share your experience)
@@ -75,46 +91,35 @@ project_name/
75
91
- ...
76
92
- (share your experience)
77
93
```
78
-
> Please write or discuss your ideas before opening solution!
79
-
80
-
```{solution} Take away messages
81
-
- Consider using version control for manuscripts as well. It may help you when keeping track of edits + if you sync it online then you don't have to worry about losing your work.
82
94
83
-
- Collaboration can be done efficiently by
84
-
- real time collaboration tools like HackMD/HedgeDoc where conflicts are resolved on the fly
85
-
- version control where conflicts are detected and shown – and solved manually
86
-
```
87
95
````
88
96
89
-
## Some tools and templates
97
+
-> Consider using **version control for manuscripts** as well. It may help you keep track of edits, and if you sync it online, you don't have to worry about losing your work.
-[Reproducible research template](https://github.com/the-turing-way/reproducible-project-template) by the Turing Way
99
+
Version control does not have to mean git. It can also mean using the "track changes" features in tools like Word, Google Docs, or Overleaf (find links below).
94
100
95
-
More tools and templates in [Heidi Seibolds blog](https://heidiseibold.ck.page/posts/setting-up-a-fair-and-reproducible-project).
101
+
### Tools for collaborative writing and version control of manuscripts
96
102
97
-
## Reproducible publications
98
-
99
-
- Git can be used to collaborate on manuscripts written in, e.g., LaTeX and other text-based formats but other tools exist, some with git integration:
100
-
-[Overleaf](https://www.overleaf.com) or [Typst](https://typst.app/): online, collaborative LaTeX editor
101
-
-[Authorea](https://www.authorea.com): collaborative platform for preprints
102
-
-[HackMD](https://hackmd.io/) or [HedgeDoc](https://hedgedoc.org/): online collaborative Markdown editors
103
-
-[Manuscripts.io](https://www.manuscripts.io/): a collaborative authoring tool that support scientific content and reproducibility.
104
-
- Google Docs can be a good alternative
105
-
106
-
- Many tools exist to assist in making scholarly output reproducible:
107
-
-[rrtools](https://github.com/benmarwick/rrtools): instructions, templates, and functions for writing a reproducible article or report with R.
108
-
-[Jupyter Notebooks](https://jupyter.org): web-based computational environment for creating code and text based notebooks that can be used as, see also our [Jupyter lesson](https://coderefinery.github.io/jupyter/) later in this workshop.
109
-
supplementary material for articles.
110
-
-[Binder](https://mybinder.org): makes a repository with Jupyter notebooks available in an executable environment (discussed later in the [Jupyter lesson](https://coderefinery.github.io/jupyter/)).
111
-
-["Research compendia"](http://inundata.org/talks/rstd19/#/): a set of good practices for
112
-
reproducible data analysis in R, but much is transferable to other languages.
113
-
114
-
```{seealso}
115
-
Do you want to practice your reproducibility skills and get inspired by working with other people's code/data? Join a [ReproHack event](https://www.reprohack.org/event/)!
116
-
```
103
+
Git **can** be used to collaborate on manuscripts written in, e.g., LaTeX and other text-based formats. However, it might not always be the most convenient option. Other tools exist to make the process more enjoyable:
104
+
105
+
You can **collaboratively gather notes** using self-hosted or public instances of tools like [HedgeDoc](https://hedgedoc.org/) and [Etherpad](https://etherpad.org) or use online options like [HackMD](https://hackmd.io/), [Google Docs](https://docs.google.com) or the Microsoft online tools for easy and efficient collaboration.
106
+
107
+
To format your notes into a manuscript, you can use Word-like online editors or tools like [Overleaf](https://www.overleaf.com) (LaTeX) or [Typst](https://typst.app/) (its own markup language). Most of the tools in this section also offer git integration.
108
+
109
+
[Manubot](https://github.com/manubot/rootstock) offers another way to turn your written word into a fully rendered manuscript using GitHub.
110
+
111
+
### Executable manuscripts
112
+
113
+
You may also want to consider writing an executable manuscript using tools like [Jupyter Notebooks](https://jupyter.org) hosted on [Binder](https://mybinder.org), [Quarto](https://quarto.org/), [Authorea](https://www.authorea.com) or [Observable](https://observablehq.com/), to name a few.
114
+
115
+
### Resources on research compendia
116
+
117
+
- [About research compendia at the Turing Way](https://book.the-turing-way.org/reproducible-research/compendia)
118
+
- ["Research compendia"](http://inundata.org/talks/rstd19/#/): a set of good practices for reproducible data analysis in R, but much is transferable to other languages.
119
+
- [rrtools](https://github.com/benmarwick/rrtools): instructions, templates, and functions for writing a reproducible article or report with R.
120
+
- ...
117
121
118
122
```{keypoints}
119
123
- An organized project directory structure helps with reproducibility.
124
+
- Also think about version control for writing your academic manuscripts.
content/where-to-go.md
Lines changed: 4 additions & 0 deletions
@@ -47,6 +47,10 @@ However, you will not always need all of them. As with so many things, it again
47
47
- [Reproducible research policies and software/data management in scientific computing journals: a survey, discussion, and perspectives](https://doi.org/10.3389/fcomp.2024.1491823)
48
48
- ...
49
49
50
+
```{seealso}
51
+
Do you want to practice your reproducibility skills and get inspired by working with other people's code/data? Join a [ReproHack event](https://www.reprohack.org/event/)!
52
+
```
53
+
50
54
```{keypoints}
51
55
- Not everything in this lesson might be useful right now, but it is good to know that these things exist if you ever end up in a situation that requires such a solution.
52
56
- Caring about reproducibility makes work easier for the next person working on the project - and that might be you in a few years!
content/workflow-management.md
Lines changed: 7 additions & 3 deletions
@@ -1,11 +1,14 @@
1
1
# Recording computational steps
2
2
3
+
```{objectives}
4
+
- Understand why and when a workflow management tool can be useful
5
+
```
6
+
3
7
```{questions}
4
8
- You have some steps that need to be run to do your work. How do you
5
9
actually run them? Does it rely on your own memory and work, or is it
6
10
reproducible? **How do you communicate the steps** for future you and others?
7
11
- How can we create a reproducible workflow?
8
-
- When to use scientific workflow management systems.
9
12
```
10
13
11
14
```{instructor-note}
@@ -78,7 +81,7 @@ steps in precisely this order, as we would run them manually, one after another.
78
81
79
82
## Workflow tools
80
83
81
-
Sometimes it may be helpful to go from imperative to declarative style. Rather than saying "do this and then that" we describe dependencies but we let the tool figure out the series of steps to produce results.
84
+
Sometimes it may be helpful to go from imperative to declarative style. Rather than saying "do this and then that" we describe dependencies between steps, but we let the tool figure out the order of steps to produce results.
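The declarative idea can be illustrated with a minimal Python sketch. It is not how Snakemake works internally; it only shows the principle of declaring dependencies and letting a tool derive the order, here using `graphlib.TopologicalSorter` from the standard library (Python 3.9 or later). The step names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each step maps to the set of steps it depends on.
# Note that the steps are deliberately not listed in execution order.
dependencies = {
    "count_words": {"download_data"},
    "plot_counts": {"count_words"},
    "write_report": {"plot_counts", "count_words"},
    "download_data": set(),
}


def execution_order(deps):
    """Return one valid order in which all steps can be run."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

We only state what each step needs, and the order of execution follows from that. A workflow tool builds on the same principle and additionally checks which outputs already exist and are up to date.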
82
85
83
86
### Example workflow tool: [Snakemake](https://snakemake.readthedocs.io/en/stable/index.html)
84
87
@@ -205,6 +208,7 @@ which can be installed by `conda install graphviz`.
205
208
```console
206
209
$ snakemake -j 1 --dag | dot -Tpng > dag.png
207
210
```
211
+
208
212
Rules that have yet to be completed are indicated with solid outlines, while already completed rules are indicated with dashed outlines.
209
213
210
214
```{figure} img/snakemake_dag.png
@@ -238,5 +242,5 @@ Tools like Snakemake help us with **reproducibility** by supporting us with **au
238
242
239
243
```{keypoints}
240
244
- Computational steps can be recorded in many ways
241
-
- Workflow tools can help, if there are many steps to be executed
245
+
- Workflow tools can help if there are many steps to be executed and/or many datasets to be processed