From eaa269ed5f393fad36a8e5c9d0c84efaf81bbc18 Mon Sep 17 00:00:00 2001
From: Lex Nederbragt
Date: Wed, 26 Apr 2023 14:01:25 +0200
Subject: [PATCH] Fixes several small spelling mistakes

---
 _episodes/02-data_management.md      | 10 +++++-----
 _episodes/03-software.md             |  2 +-
 _episodes/04-collaboration.md        |  4 ++--
 _episodes/05-project_organization.md |  6 +++---
 _episodes/06-track_changes.md        | 12 ++++++------
 _episodes/07-manuscripts.md          | 10 +++++-----
 _episodes/08-what_next.md            |  4 ++--
 7 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/_episodes/02-data_management.md b/_episodes/02-data_management.md
index a56aa56f..e3bf9018 100644
--- a/_episodes/02-data_management.md
+++ b/_episodes/02-data_management.md
@@ -14,7 +14,7 @@ objectives:
 - "Identify problems with data management practices"
 - "Understand what raw data is"
 - "Understand what backing up data means and why it is important to back up in more than one location"
-- "Be able to decide on appropiate file names and identifiers"
+- "Be able to decide on appropriate file names and identifiers"
 - "Be able to create analysis ready datasets"
 - "Understand the importance of documenting your process"
 - "Understand what a DOI is and its usefulness"
@@ -70,8 +70,8 @@ Our recommendations have two main themes. One is to work towards ready-to-analyz
 > > * In-house cloud service: this is a good way to back up your data (usually). You have local support. It is probably compliant with funders and data security guidelines.
 > > * USB pen drive: definitely not! Pen-drives are prone to dying (and your data with it). It also raises data security issues and they can be easily lost.
 > > * External hard-drive: see above.
-> > * My laptop: it is good as a temporal storage solution for your active data. However, you should back it up appropiately.
-> > * My workstation's hard-disk: it is good as a temporal storage solution for your active data. However, you should back it up appropiately.
+> > * My laptop: it is good as a temporal storage solution for your active data. However, you should back it up appropriately.
+> > * My workstation's hard-disk: it is good as a temporal storage solution for your active data. However, you should back it up appropriately.
 > > * Network drive: this is a good way to back up your data (usually). You have local support. It is probably compliant with funders and data security guidelines.
 > {: .solution}
 {: .challenge}
@@ -264,7 +264,7 @@ and write a good README file for the humans

 ## Data management plans

-Many UK universities and funders require researchers to complete a data management plan (DMP). A DMP is a document which outlines information about your research data and how it will be processed. Many funders provide basic templates for writing a DMP, along with guidelines on what information should be included but the main compoments of a DMP are:
+Many UK universities and funders require researchers to complete a data management plan (DMP). A DMP is a document which outlines information about your research data and how it will be processed. Many funders provide basic templates for writing a DMP, along with guidelines on what information should be included but the main components of a DMP are:
 * Information about your data
 * Information about your metadata and data formats
 * Information on how data can be accessed, shared and re-used
@@ -285,7 +285,7 @@ Many UK universities and funders require researchers to complete a data manageme

 Writing your first data management plan can be a daunting task but your future self will thank you in the end.
 It's best to speak to other members of your lab about any existing lab group or grant data management plans.
-If you lab group doesn't have a data management plan, it may be helpful to work on it together to identify any major considerations.
+If your lab group doesn't have a data management plan, it may be helpful to work on it together to identify any major considerations.
 More resources on data management plans are available at [DMP online](https://dmponline.dcc.ac.uk).

diff --git a/_episodes/03-software.md b/_episodes/03-software.md
index df37cf1f..33fa3b73 100644
--- a/_episodes/03-software.md
+++ b/_episodes/03-software.md
@@ -245,7 +245,7 @@ Also look for well-maintained libraries that already do what you're
 trying to do. All programming languages have libraries that you can
 import and use in your code. This is code that people have already
 written and made available for distribution that have a particular
-function. For instances there are libraries for statistics,
+function. For instance, there are libraries for statistics,
 modeling, mapping and many more. Many languages catalog the
 libraries in a centralized source, for instance R has CRAN, Python has
 PyPI, and so on. So
diff --git a/_episodes/04-collaboration.md b/_episodes/04-collaboration.md
index 37311b49..8e948f73 100644
--- a/_episodes/04-collaboration.md
+++ b/_episodes/04-collaboration.md
@@ -147,7 +147,7 @@ newcomers.

 Make explicit decisions about (and publicize where appropriate) how members of the
-project will communicate with each other and with externals users /
+project will communicate with each other and with external users /
 collaborators. This includes the location and technology for email
 lists, chat channels, voice / video conferencing, documentation, and
 meeting notes, as well as which of these channels will be public or
@@ -157,7 +157,7 @@ private.
 ## Working with sensitive data

 It is important to identify whether your project will work with sensitive data - by which we might mean:
- * Research data including personal data or identifiers (this might include names and addresses, or potentially identifyable genetic data or health information, or confidential information)
+ * Research data including personal data or identifiers (this might include names and addresses, or potentially identifiable genetic data or health information, or confidential information)
  * Commercially sensitive data or information (this might include intellectual property, or data generated or used within a restrictive commercial research funding agreement)
  * Data which may cause harm or adverse affects if released or made public (for example data relating to rare or endangered species which could cause poaching or fuel illegal trading)

diff --git a/_episodes/05-project_organization.md b/_episodes/05-project_organization.md
index 514bb509..c3b2dad5 100644
--- a/_episodes/05-project_organization.md
+++ b/_episodes/05-project_organization.md
@@ -57,7 +57,7 @@ The below recommendations on how you can structure data, code, analysis
 outputs and other files, are drawn primarily from
 [[noble2009](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1000424),
 [gentzkow2014](https://web.stanford.edu/~gentzkow/research/CodeAndData.pdf)].
-The important concepts are that is useful to organize the project in
+The important concepts are that it is useful to organize the project in
 modules by the types of files and that consistent planning and good
 names help you effectively find and use things later.

@@ -105,7 +105,7 @@ cleaning or statistical analyses.
 These files can be thought of as the "scientific guts" of the project.

 The second type of file in `src` is controller or driver scripts
-that that contains all the analysis steps for the entire project
+that contains all the analysis steps for the entire project
 from start to finish, with particular parameters and data input/output
 commands. A controller script for a simple project, for example, may
 read a raw data table, import and apply several cleanup
@@ -166,7 +166,7 @@ Projects that do not have any will not require `bin`.

 For example, use names such as `bird_count_table.csv`, `manuscript.md`, or
-`sightings_analysis.py`. Do *not* using sequential numbers (e.g.,
+`sightings_analysis.py`. Do *not* use sequential numbers (e.g.,
 `result1.csv`, `result2.csv`) or a location in a final manuscript (e.g.,
 `fig_3_a.png`), since those numbers will almost certainly change as the
 project evolves.
diff --git a/_episodes/06-track_changes.md b/_episodes/06-track_changes.md
index 646055ab..98dacdd5 100644
--- a/_episodes/06-track_changes.md
+++ b/_episodes/06-track_changes.md
@@ -90,7 +90,7 @@ regular basis. Do not allow individual investigator's versions of the
 project repository to drift apart, as the effort required to merge
 differences goes up faster than the size of the difference. This is
 particularly important for the manual versioning procedure
-describe below, which does not provide any assistance for merging
+described below, which does not provide any assistance for merging
 simultaneous, possibly conflicting, changes.


@@ -132,12 +132,12 @@ moment a laptop is stolen or its hard drive fails.
 >
 > * Reverted to the previous version of the abstract text as the manuscript reached word limits
 >
-> * Cleaned the strain inventory: Recent freezer cleaning and ordering indicated a lot of problem with the strains data. The missing physical samples were removed from the table, the duplicated ids are marked for checking with PCR. The antibiotic resistence were moved from phenotype description to its own column.
+> * Cleaned the strain inventory: Recent freezer cleaning and ordering indicated a lot of problem with the strains data. The missing physical samples were removed from the table, the duplicated ids are marked for checking with PCR. The antibiotic resistance were moved from phenotype description to its own column.
 >
 > * New regulation heatmap: As suggested by Will I used the normalization and variance stabilization procedure from Hafemeister et al prior to clustering and heatmap generation
 >
-> The largest the project (measured either in: collaborators, file numbers, or workflow complexity) the more detailed the change description should be.
-> While your personal project can get away with one liner descrptions, the largest projects should always contain inforamtion about motivation behind the change and
+> The larger the project (measured either in: collaborators, file numbers, or workflow complexity) the more detailed the change description should be.
+> While your personal project can get away with one liner descriptions, the largest projects should always contain information about motivation behind the change and
 > what are the consequences.
 >
 {: .callout}
@@ -264,12 +264,12 @@ and thereby require less self-discipline for more reliable results.

 > ## Changelog in action
 >
-> Have a look at one of the example github repositories and how they track changes*:
+> Have a look at one of the example github repositories and how they track changes:
 > * [data from E.R. Ballou et al. 2020](https://github.com/ewallace/pseudonuclease_evolution_2020/commits/master)
 > * [data from I. Boehm et al. 2020](https://github.com/BioRDM/nmj-pig/commits/main)
 >
 > Give examples of:
-> * what makes them good changelog
+> * what makes their changelogs good
 > * what could be improved
 >
 > Think what would be the most difficult feature to replicate with manual version control?
diff --git a/_episodes/07-manuscripts.md b/_episodes/07-manuscripts.md
index ebd47f50..7a7c4196 100644
--- a/_episodes/07-manuscripts.md
+++ b/_episodes/07-manuscripts.md
@@ -29,7 +29,7 @@ is essential, just like other collaborations.
 > ## Discussion (3 mins)
 >
 > Whether or not you have written a scientific manuscript before,
-> you probably have experience of group work or writing
+> you probably have experience of group work or writing.
 > Discuss on the collaborative document:
 >
 > * What tools have you used before for group writing?
@@ -46,7 +46,7 @@ is essential, just like other collaborations.

 We suggest having a meeting (or online thread) of all authors at the
 beginning of the writing process. Ask everyone how they would prefer to
-write a manuscript. The agree a decision and process, and put the outcome
+write a manuscript. Then agree on a decision and process, and put the outcome
 in writing. If co-authors are learning new tools, ask someone familiar
 with those tools to support them!

@@ -111,7 +111,7 @@ Our first alternative will already be familiar to many researchers:

 1. ***Write manuscripts using online tools with rich formatting, change tracking, and reference
- management (6a)***, such as Google Docs or MS OneDrive.
+ management***, such as Google Docs or MS OneDrive.
 With the document online, everyone's changes are in one place, and hence don't need to be merged manually.

@@ -165,7 +165,7 @@ e.g. through [Rmarkdown](https://rmarkdown.rstudio.com/).
 |----------------------------------------------|----------------------|----------------------|----------------------------------|
 | Previous user experience/comfort | High | Medium | Low |
 | Visible tracking of changes | Low | Variable | High |
-| Institutional support | Low | High* | Low |
+| Institutional support | Low | High | Low |
 | Ease of merging changes and suggestions | Low | Medium | High |
 | Distributed control | Low | High | High |
 | Ease of formatting changes for re-submission | Low | Low | High |
@@ -173,7 +173,7 @@ e.g. through [Rmarkdown](https://rmarkdown.rstudio.com/).
 While we feel that text-based version control is a superior method,
 the barriers to entry may be too high for many users.
 The single master online approach is a good compromise.
-If your instution has invested in an environment (Google Docs / MS Office),
+If your institution has invested in an environment (Google Docs / MS Office),
 users can stay within their familiar desktop GUI applications while still
 taking advantage of automatic file versioning and shared editing.

diff --git a/_episodes/08-what_next.md b/_episodes/08-what_next.md
index 99ff8085..f788ed21 100644
--- a/_episodes/08-what_next.md
+++ b/_episodes/08-what_next.md
@@ -50,7 +50,7 @@ Thinking through your work from a collaborator's point of view is helpful, and u

 Learning good practices is a long-term process that never stops.
 We [left out many good practices](/_extras/what-we-left-out.md) that, although useful,
-have more niche application.
+have more niche applications.
 We recommend the paper [Best Practices in Scientific Computing](https://doi.org/10.1371/journal.pbio.1001745), especially for those gaining more experience with coding.
 There are [many other useful papers and resources that we have selected](/_extras/resources).
@@ -77,7 +77,7 @@ Progress in computational good practices comes from different places in the scie
 - PIs and lab heads can require that lab members share code and data, and make it easy for them to do so.
 - Lab members can organise "data curation days" and training sessions to share good practices.
 - Self-organised groups led by students and postdocs can share ideas and train each other.
-- Global organisations like [The Carpentries](https://carpentries.org) can co-ordinate training and suport training materials.
+- Global organisations like [The Carpentries](https://carpentries.org) can co-ordinate training and support training materials.
 - Professional societies can help to organise training.

 It takes time to learn good practices, and time to train others.