Commit 44a6b25

naming simplified ex1
1 parent e5a2d8d commit 44a6b25

9 files changed, +90 -25 lines changed

img/rg-ex-1a.pdf

5.27 KB
Binary file not shown.

img/rg-ex-1b.pdf

6.87 KB
Binary file not shown.

toy_example/README.md

+32 -25
@@ -1,6 +1,6 @@
-# Snakemake toy example
+# Snakemake toy example
 
-This is a simple toy example that can be used to start learning the basics of snakemake. This folder contains some data files with simple contents. For example, file 123.txt contains three lines of data, with values 1, 2, and 3. We'll use these simple data along with core command-line tools, included by default in most unix environments, to illustrate the basics of snakemake.
+This is a simple toy example that can be used to start learning the basics of snakemake. This folder contains some extremely simple input files. For example, sampleA.txt contains three lines of data, with values 1, 2, and 3, respectively. We'll use these simple data along with core command-line tools, included by default in most unix environments, to illustrate the basics of snakemake.
 
 Note: If you haven't installed snakemake yet, I'd suggest using [conda](https://docs.conda.io/en/latest/miniconda.html) to do so.
 
@@ -13,14 +13,14 @@ Get started by expanding Example 1a
 
 <details><summary>Expand - Ex. 1a</summary>
 
-Create a file named toy.snakefile with the following contents:
+ex-1a.smk has the following contents:
 
     rule all:
         input:
-            "output/123_rsorted.txt",
-            "output/345_rsorted.txt",
-            "output/567_rsorted.txt"
-
+            "output/sampleA_rsorted.txt",
+            "output/sampleB_rsorted.txt",
+            "output/sampleC_rsorted.txt"
+
     rule rsort:
         input:
             "{basename}.txt"
@@ -29,6 +29,10 @@ Create a file named toy.snakefile with the following contents:
         shell:
             "sort -r {input} > {output}"
 
+<details><summary>Ex. 1a rulegraph</summary>
+![Ex. 1a rulegraph](../img/rg-ex-1a.pdf)
+</details>
+
 Topics covered:
 * Targets & dependencies
 * Writing rules
@@ -42,13 +46,13 @@ Writing rules - Generally will have 'input', 'output', and 'shell' blocks (more
 
 The rule 'all' is placed at the top of the file (the first rule, anyway), and this is always executed by default. It's being used to define the targets for the workflow.
 
-Now perform a dry-run:
+To perform a dry-run:
 
-    snakemake --snakefile toy.snakefile --dry-run
+    snakemake --snakefile ex-1a.smk --dry-run
 
 Notice that snakemake keeps track of the wildcards during the evaluation of each rule
 * experiment by changing the targets so they don't match the input files
-
+
 ## You have reached the end of example 1a ✅
 
 </details>
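
The wildcard tracking mentioned in Ex. 1a can be pictured with a small pure-Python sketch. This is not Snakemake's implementation: `match_wildcards` and `fill_pattern` are hypothetical helpers written for illustration, and the sketch assumes each `{name}` placeholder matches any non-empty string. It shows how a requested target is matched against the `output` pattern of `rsort`, and how the captured value is then substituted into the `input` pattern.

```python
import re


def match_wildcards(pattern, target):
    """Return the wildcard values if `target` fits `pattern`, else None.

    Hypothetical helper for illustration only; assumes each {name}
    placeholder appears once and matches any non-empty string.
    """
    regex = ""
    pos = 0
    for placeholder in re.finditer(r"\{(\w+)\}", pattern):
        # Literal text between placeholders must match exactly.
        regex += re.escape(pattern[pos:placeholder.start()])
        # Each placeholder becomes a named capture group.
        regex += f"(?P<{placeholder.group(1)}>.+)"
        pos = placeholder.end()
    regex += re.escape(pattern[pos:])
    found = re.fullmatch(regex, target)
    return found.groupdict() if found else None


def fill_pattern(pattern, wildcards):
    """Substitute wildcard values into a pattern such as '{basename}.txt'."""
    return pattern.format(**wildcards)


# Patterns copied from rule rsort in ex-1a.smk.
OUTPUT_PATTERN = "output/{basename}_rsorted.txt"
INPUT_PATTERN = "{basename}.txt"

wildcards = match_wildcards(OUTPUT_PATTERN, "output/sampleA_rsorted.txt")
print(wildcards)                               # {'basename': 'sampleA'}
print(fill_pattern(INPUT_PATTERN, wildcards))  # sampleA.txt

# A target that fits no output pattern yields no wildcards at all.
print(match_wildcards(OUTPUT_PATTERN, "output/sampleA_sorted.txt"))  # None
```

Changing a target in `rule all` so that it fits the output pattern but names a missing input (for example `output/sampleZ_rsorted.txt`) is one way to try the experiment suggested above.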
@@ -59,7 +63,7 @@ Now we'll make the workflow a bit more interesting. We'll add more rules, use a
 <details><summary>Expand - Ex. 1b</summary>
 
 
-toy.snakefile contents:
+ex-1b.smk contents:
 
     rule all:
         input:
@@ -87,7 +91,7 @@ toy.snakefile contents:
 
     rule randsort:
         input:
-            "{base}.txt"
+            "output/{base}_appended.txt"
         output:
             "output/{base}_randsorted.txt"
         shell:
@@ -100,34 +104,38 @@ toy.snakefile contents:
             "output/{base}_fsorted.txt"
         shell:
             "sleep 2 ; sort -n {input} > {output}"
-
-toy_config.yml contents:
+
+config-ex-1b.yml contents:
 
     basenames:
-    - '123'
-    - '345'
-    - '567'
+    - 'sampleA'
+    - 'sampleB'
+    - 'sampleC'
     append_val: 42
 
+<details><summary>Ex. 1b rulegraph</summary>
+![Ex. 1b rulegraph](../img/rg-ex-1b.pdf)
+</details>
+
 The config
 * How is the config used with this snakefile?
-
+
 The expand statement
 * The various uses of curly braces can be confusing at first (at least for me)
 * `expand` is distinct from `wildcards`
 * can be thought of as "expand this string (arg 1) into an array of strings, filling in all combinations of values (args 2+ as key-value pairs)
 
 Running the new snakefile/configfile
 
-    snakemake --snakefile toy.snakefile --configfile toy_config.yml
-
+    snakemake --snakefile ex-1b.smk --configfile config-ex-1b.yml
+
 Did snakemake run the workflow, and successfully create the desired targets?
 * View the directory of results
 
 More about the core tenets of snakemake (also gnu make, make-like things)
-* Try running the workflow to completion, then running it again. What happens?
+* Try running the workflow to completion, then running it again. What happens?
 * Delete `output`, then try again. Isn't this cool?
-* Try deleting an intermediate file, then running the pipeline again. How is this beneficial? How can it be problematic?
+* Try modifying an intermediate file, then running the pipeline again. How is this beneficial? How can it be problematic?
 
 The workflow & DAG
 * Directed Acyclic Graph - how snakemake 'knows' how to produce the desired targets
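
The description of `expand` in the hunk above ("expand this string into an array of strings, filling in all combinations of values") can be approximated in plain Python. The sketch below is an illustration only, not Snakemake's code: `expand_sketch` is a hypothetical stand-in, and `config` is written out as a dict mirroring config-ex-1b.yml rather than loaded from the file.

```python
from itertools import product


def expand_sketch(pattern, **wildcards):
    """Fill `pattern` with every combination of the given values.

    Hypothetical stand-in for illustration; a single value is treated
    as a one-element list.
    """
    keys = list(wildcards)
    value_lists = [
        value if isinstance(value, (list, tuple)) else [value]
        for value in wildcards.values()
    ]
    return [
        pattern.format(**dict(zip(keys, combo)))
        for combo in product(*value_lists)
    ]


# Mirrors config-ex-1b.yml (assumed here as a plain dict).
config = {"basenames": ["sampleA", "sampleB", "sampleC"], "append_val": 42}

targets = expand_sketch("output/{bname}_fsorted.txt", bname=config["basenames"])
print(targets)
# ['output/sampleA_fsorted.txt', 'output/sampleB_fsorted.txt', 'output/sampleC_fsorted.txt']

# With two keys, every combination of values is produced.
combos = expand_sketch(
    "output/{bname}_{kind}.txt",
    bname=["sampleA", "sampleB"],
    kind=["fsorted", "randsorted"],
)
print(len(combos))  # 4
```

Note the difference from wildcards: here the braces are filled in immediately from a known list, whereas a wildcard such as `{base}` in a rule is only resolved when a concrete target is matched against the rule's output pattern.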
@@ -136,11 +144,10 @@ The workflow & DAG
 Viewing the DAG (or rulegraph)
 
     #DAG (file-level granularity)
-    snakemake --snakefile toy.snakefile --configfile toy_config.yml --dag | dot -T pdf > toy_dag.pdf
+    snakemake --snakefile ex-1b.smk --configfile config-ex-1b.yml --dag | dot -T pdf > dag-ex-1b.pdf
     #Rulegraph (rule-level granularity)
-    snakemake --snakefile toy.snakefile --configfile toy_config.yml --rulegraph | dot -T pdf > toy_rulegraph.pdf
+    snakemake --snakefile ex-1b.smk --configfile config-ex-1b.yml --rulegraph | dot -T pdf > rg-ex-1b.pdf
 
 ## You have reached the end of example 1b ✅
 
 </details>
-
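
To make the rulegraph idea concrete, the rule-level dependencies of ex-1b.smk can be written down by hand and ordered with Python's standard `graphlib` module (Python 3.9+). This is a sketch for intuition, not how Snakemake builds its DAG: the `RULE_DEPS` mapping is an assumption transcribed from the input/output patterns of each rule in this commit.

```python
from graphlib import TopologicalSorter

# rule -> rules whose outputs it consumes (transcribed by hand from ex-1b.smk).
RULE_DEPS = {
    "all": {"fsort", "randsort"},
    "fsort": {"append_value"},
    "randsort": {"append_value"},
    "append_value": {"rsort"},
    "rsort": set(),
}

# static_order() yields each rule only after all of its dependencies.
order = list(TopologicalSorter(RULE_DEPS).static_order())
print(order)
```

Any valid order starts with `rsort`, then `append_value`, and ends with `all`; `fsort` and `randsort` do not depend on each other, so they may appear in either order.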

toy_example/config-ex-1b.smk

+5
@@ -0,0 +1,5 @@
+basenames:
+- sampleA
+- sampleB
+- sampleC
+append_val: 42

toy_example/ex-1a.smk

+13
@@ -0,0 +1,13 @@
+rule all:
+    input:
+        "output/sampleA_rsorted.txt",
+        "output/sampleB_rsorted.txt",
+        "output/sampleC_rsorted.txt"
+
+rule rsort:
+    input:
+        "{basename}.txt"
+    output:
+        "output/{basename}_rsorted.txt"
+    shell:
+        "sort -r {input} > {output}"

toy_example/ex-1b.smk

+40
@@ -0,0 +1,40 @@
+rule all:
+    input:
+        expand("output/{bname}_fsorted.txt", bname=config['basenames']),
+        expand("output/{bname}_randsorted.txt", bname=config['basenames'])
+
+rule rsort:
+    input:
+        "{base}.txt"
+    output:
+        "output/{base}_rsorted.txt"
+    shell:
+        "sort -r {input} > {output}"
+
+rule append_value:
+    input:
+        "output/{base}_rsorted.txt"
+    output:
+        "output/{base}_appended.txt"
+    params:
+        append_val = config['append_val']
+    shell:
+        "cat {input} > {output} ; "
+        "echo {params.append_val} >> {output}"
+
+rule randsort:
+    input:
+        "output/{base}_appended.txt"
+    output:
+        "output/{base}_randsorted.txt"
+    shell:
+        "sort -R {input} > {output}"
+
+rule fsort:
+    input:
+        "output/{base}_appended.txt"
+    output:
+        "output/{base}_fsorted.txt"
+    shell:
+        "sleep 2 ; sort -n {input} > {output}"
+
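
The README exercise of rerunning the workflow after modifying an intermediate file can be reasoned about with a make-style timestamp check. The sketch below is a simplification under stated assumptions: `needs_rerun` is a hypothetical helper, the file names stand in for one `append_value` to `fsort` step, and Snakemake itself may take more into account than modification times.

```python
import os
import tempfile


def needs_rerun(output_path, input_paths):
    """Make-style check: rebuild if the output is missing or older than any input.

    Hypothetical helper for illustration only.
    """
    if not os.path.exists(output_path):
        return True
    output_mtime = os.path.getmtime(output_path)
    return any(os.path.getmtime(path) > output_mtime for path in input_paths)


with tempfile.TemporaryDirectory() as tmp:
    appended = os.path.join(tmp, "sampleA_appended.txt")
    fsorted = os.path.join(tmp, "sampleA_fsorted.txt")

    with open(appended, "w") as handle:
        handle.write("3\n2\n1\n42\n")

    # The output does not exist yet, so the rule has to run.
    missing = needs_rerun(fsorted, [appended])

    with open(fsorted, "w") as handle:
        handle.write("1\n2\n3\n42\n")

    # Set explicit timestamps so the example does not depend on clock resolution.
    os.utime(appended, (1000, 1000))
    os.utime(fsorted, (2000, 2000))
    up_to_date = needs_rerun(fsorted, [appended])

    # "Modifying" the intermediate file makes it newer than the output.
    os.utime(appended, (3000, 3000))
    stale = needs_rerun(fsorted, [appended])

print(missing, up_to_date, stale)  # True False True
```

Under this model a second run right after a complete run does nothing, while touching `output/sampleA_appended.txt` would cause only the rules downstream of it (`fsort` and `randsort`) to run again.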
File renamed without changes.
File renamed without changes.
File renamed without changes.
