Merge pull request #138 from comses-education/fix_naming_sorting_challenge_formatting

fix challenge formatting
alee committed Sep 23, 2023
2 parents e0ef83b + 830a3dc commit 7db847e
Showing 2 changed files with 20 additions and 18 deletions.
2 changes: 1 addition & 1 deletion episodes/04-collaboration.md
@@ -42,7 +42,7 @@ Discussion
- What goes wrong with collaboration?
- How can you prepare to collaborate?

-::::::::::::::: solution
+::::::::::::::::: spoiler

## Suggestions

36 changes: 19 additions & 17 deletions episodes/05-project_organization.md
@@ -249,8 +249,6 @@ For your information, to encode experimental details the following conventions w
- measurement date
- other details are timepoint and raw or normalized data

-::::::::::::::: solution
-
```
2020-07-14_s12_phyB_on_SD_t04.raw.csv
2020-07-14_s1_phyA_on_LD_t05.raw.csv
@@ -272,8 +270,6 @@ SD_phya_ons_t04_2020-07-12.norm.csv
ld_phyA_ons_t04_2020-08-12.norm.csv
```

-:::::::::::::::::::::::::
-
- What are the problems with having the date first?
- How do different date formats behave once sorted?
- Can you tell the importance of a leading 0 (zeros)?
@@ -282,21 +278,27 @@ ld_phyA_ons_t04_2020-08-12.norm.csv
- Do you see benefits of keeping consistent lengths of the naming conventions?
- Do you see what happens when you mix conventions?

-> ## Solution
->
-> - Using dates up front makes it difficult to quickly find data for
-> particular conditions or genotypes. It also masks the "logical" order of samples
-> or timepoints.
-> - Named months break the "expected" sorting, same as dates without leading 0
-> - Without leading zeros, 's12' appear before s1 and s2
-> - the first (and second) part of the name are easiest to spot
-> - the last file is also from LD conditions, but appears after SD, same with 'phya' genotypes
-> - the last 3 file names are easiest to read as all parts appear on top of each other
-> due to the same 3 letter-length codes ons and off
-> - The lack of consistency makes it very difficult to get data from related samples/conditions.
+::::::::::::::::::::::::: solution
+
+
+
+## Solution
+
+- Using dates up front makes it difficult to quickly find data for
+particular conditions or genotypes. It also masks the "logical" order of samples
+or timepoints.
+- Named months break the "expected" sorting, same as dates without leading 0
+- Without leading zeros, 's12' appear before s1 and s2
+- the first (and second) part of the name are easiest to spot
+- the last file is also from LD conditions, but appears after SD, same with 'phya' genotypes
+- the last 3 file names are easiest to read as all parts appear on top of each other
+due to the same 3 letter-length codes ons and off
+- The lack of consistency makes it very difficult to get data from related samples/conditions.
+
+::::::::::::::::::::::::::::::::::::::::::::::::::

:::::::::::::::::::::::::::

::::::::::::::::::::::::::::::::::::::::: callout

## Some helpful organisation tools
@@ -318,7 +320,7 @@ ld_phyA_ons_t04_2020-08-12.norm.csv
## Attribution

This episode was adapted from and includes material from Wilson et al.
-[Good Enough Practices for Scientific Computing](https://github.com/swcarpentry/good-enough-practices-in-scientific-computing).
+[Good Enough Practices for Scientific Computing](https://doi.org/10.1371/journal.pcbi.1005510).

Some content was adapted from [FAIR in Biological Practice episode on files and organisation](https://carpentries-incubator.github.io/fair-bio-practice/09-files-organization/index.html). That material gives a slightly different and also useful perspective.
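
The sorting effects listed in the Solution block of `episodes/05-project_organization.md` can be checked with a short sketch. It is an illustration only and is not part of the commit or the lesson file. The file names and date strings are shortened, hypothetical examples in the style of the challenge, and plain lexicographic sorting with Python's built-in `sorted` is assumed to stand in for how a file browser or `ls` would order them.

```python
# Hypothetical, shortened file names in the style of the challenge
# (assumptions for illustration, not taken from the lesson file).
unpadded = ["s1_phyA_t05.csv", "s2_phyB_t04.csv", "s12_phyB_t04.csv"]
padded = ["s01_phyA_t05.csv", "s02_phyB_t04.csv", "s12_phyB_t04.csv"]

# Plain string sorting compares character by character. After "s1" the
# next characters are "2" and "_", and "2" sorts before "_", so s12
# ends up ahead of s1 and s2. (A different separator could change this.)
unpadded_sorted = sorted(unpadded)
padded_sorted = sorted(padded)

# Dates written as YYYY-MM-DD sort chronologically as strings;
# named months and missing leading zeros do not.
iso_dates = sorted(["2020-08-12", "2020-07-14", "2020-07-12"])
named_months = sorted(["2020-Jul-12", "2020-Aug-12"])
no_leading_zero = sorted(["2020-7-12", "2020-07-14"])

print(unpadded_sorted)   # s12 first, then s1, s2
print(padded_sorted)     # s01, s02, s12
print(iso_dates)         # chronological order
print(named_months)      # Aug sorts before Jul
print(no_leading_zero)   # 14 July sorts before 12 July
```

Zero-padding the sample number and writing dates as `YYYY-MM-DD` are the two changes that make alphabetical order agree with the intended order in this sketch.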
