carpentries-lab · ewallace · May 9, 2023 · Apr 26, 2023
diff --git a/_episodes/02-data_management.md b/_episodes/02-data_management.md
@@ -14,7 +14,7 @@ objectives:
 - "Identify problems with data management practices"  
 - "Understand what raw data is"  
 - "Understand what backing up data means and why it is important to back up in more than one location"  
-- "Be able to decide on appropiate file names and identifiers"  
+- "Be able to decide on appropriate file names and identifiers"  
 - "Be able to create analysis ready datasets"  
 - "Understand the importance of documenting your process"  
 - "Understand what a DOI is and its usefulness"  
@@ -70,8 +70,8 @@ Our recommendations have two main themes. One is to work towards ready-to-analyz
 > > * In-house cloud service: this is a good way to back up your data (usually). You have local support. It is probably compliant with funders and data security guidelines.  
 > > * USB pen drive: definitely not! Pen-drives are prone to dying (and your data with it). It also raises data security issues and they can be easily lost.  
 > > * External hard-drive: see above.  
-> > * My laptop: it is good as a temporal storage solution for your active data. However, you should back it up appropiately.  
-> > * My workstation's hard-disk: it is good as a temporal storage solution for your active data. However, you should back it up appropiately.  
+> > * My laptop: it is good as a temporary storage solution for your active data. However, you should back it up appropriately.  
+> > * My workstation's hard-disk: it is good as a temporary storage solution for your active data. However, you should back it up appropriately.  
 > > * Network drive: this is a good way to back up your data (usually). You have local support. It is probably compliant with funders and data security guidelines.  
 > {: .solution}  
 {: .challenge}  
@@ -264,7 +264,7 @@ and write a good README file for the humans
 
 ## Data management plans
 
-Many UK universities and funders require researchers to complete a data management plan (DMP). A DMP is a document which outlines information about your research data and how it will be processed. Many funders provide basic templates for writing a DMP, along with guidelines on what information should be included but the main compoments of a DMP are:
+Many UK universities and funders require researchers to complete a data management plan (DMP). A DMP is a document which outlines information about your research data and how it will be processed. Many funders provide basic templates for writing a DMP, along with guidelines on what information should be included, but the main components of a DMP are:
 * Information about your data
 * Information about your metadata and data formats
 * Information on how data can be accessed, shared and re-used
@@ -285,7 +285,7 @@ Many UK universities and funders require researchers to complete a data manageme
 
 Writing your first data management plan can be a daunting task but your future self will thank you in the end. 
 It's best to speak to other members of your lab about any existing lab group or grant data management plans. 
-If you lab group doesn't have a data management plan, it may be helpful to work on it together to identify any major considerations.
+If your lab group doesn't have a data management plan, it may be helpful to work on it together to identify any major considerations.
 
 More resources on data management plans are available at [DMP online](https://dmponline.dcc.ac.uk).
 

diff --git a/_episodes/03-software.md b/_episodes/03-software.md
@@ -245,7 +245,7 @@ Also look for well-maintained libraries that already do what you're
 trying to do. All programming languages have libraries that you can
 import and use in your code. This is code that people have already
 written and made available for distribution that have a particular
-function. For instances there are libraries for statistics,
+function. For instance, there are libraries for statistics,
 modeling, mapping and many more. Many languages catalog the
 libraries in a centralized source, for instance R has
 CRAN, Python has PyPI, and so on. So

diff --git a/_episodes/04-collaboration.md b/_episodes/04-collaboration.md
@@ -147,7 +147,7 @@ newcomers.
 
 Make explicit
 decisions about (and publicize where appropriate) how members of the
-project will communicate with each other and with externals users /
+project will communicate with each other and with external users /
 collaborators. This includes the location and technology for email
 lists, chat channels, voice / video conferencing, documentation, and
 meeting notes, as well as which of these channels will be public or
@@ -157,7 +157,7 @@ private.
 ## Working with sensitive data
 
 It is important to identify whether your project will work with sensitive data - by which we might mean:
-  * Research data including personal data or identifiers (this might include names and addresses, or potentially identifyable genetic data or health information, or confidential information)
+  * Research data including personal data or identifiers (this might include names and addresses, or potentially identifiable genetic data or health information, or confidential information)
   * Commercially sensitive data or information (this might include intellectual property, or data generated or used within a restrictive commercial research funding agreement)
   * Data which may cause harm or adverse affects if released or made public (for example data relating to rare or endangered species which could cause poaching or fuel illegal trading)
 

diff --git a/_episodes/05-project_organization.md b/_episodes/05-project_organization.md
@@ -57,7 +57,7 @@ The below recommendations on how you can structure data,
 code, analysis outputs and other files, are drawn primarily 
 from [[noble2009](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1000424), [gentzkow2014](https://web.stanford.edu/~gentzkow/research/CodeAndData.pdf)].
 
-The important concepts are that is useful to organize the project in
+The important concepts are that it is useful to organize the project in
 modules by the types of files and that consistent planning and good
 names help you effectively find and use things later.
 
@@ -105,7 +105,7 @@ cleaning or statistical analyses. These files can be thought of as
 the "scientific guts" of the project.
 
 The second type of file in `src` is controller or driver scripts
-that that contains all the analysis steps for the entire project
+that contain all the analysis steps for the entire project
 from start to finish, with particular parameters and data
 input/output commands. A controller script for a simple project, for
 example, may read a raw data table, import and apply several cleanup
@@ -166,7 +166,7 @@ Projects that do not have any will not require `bin`.
 
 For example, use names
 such as `bird_count_table.csv`, `manuscript.md`, or
-`sightings_analysis.py`. Do *not* using sequential numbers (e.g.,
+`sightings_analysis.py`. Do *not* use sequential numbers (e.g.,
 `result1.csv`, `result2.csv`) or a location in a final manuscript
 (e.g., `fig_3_a.png`), since those numbers will almost certainly
 change as the project evolves.

diff --git a/_episodes/06-track_changes.md b/_episodes/06-track_changes.md
@@ -90,7 +90,7 @@ regular basis. Do not allow individual investigator's versions of
 the project repository to drift apart, as the effort required to
 merge differences goes up faster than the size of the difference.
 This is particularly important for the manual versioning procedure
-describe below, which does not provide any assistance for merging
+described below, which does not provide any assistance for merging
 simultaneous, possibly conflicting, changes.
 
 
@@ -132,12 +132,12 @@ moment a laptop is stolen or its hard drive fails.
 >
 > * Reverted to the previous version of the abstract text as the manuscript reached word limits
 >
-> * Cleaned the strain inventory: Recent freezer cleaning and ordering indicated a lot of problem with the strains data. The missing physical samples were removed from the table, the duplicated ids are marked for checking with PCR. The antibiotic resistence were moved from phenotype description to its own column.
+> * Cleaned the strain inventory: Recent freezer cleaning and ordering indicated a lot of problems with the strain data. The missing physical samples were removed from the table, the duplicated ids are marked for checking with PCR. The antibiotic resistance was moved from phenotype description to its own column.
 >
 > * New regulation heatmap: As suggested by Will I used the normalization and variance stabilization procedure from Hafemeister et al prior to clustering and heatmap generation
 >
-> The largest the project (measured either in: collaborators, file numbers, or workflow complexity) the more detailed the change description should be.
-> While your personal project can get away with one liner descrptions, the largest projects should always contain inforamtion about motivation behind the change and
+> The larger the project (measured either in: collaborators, file numbers, or workflow complexity) the more detailed the change description should be.
+> While your personal project can get away with one-line descriptions, the largest projects should always contain information about the motivation behind the change and
 > what are the consequences.
 >
 {: .callout}
@@ -264,12 +264,12 @@ and thereby require less self-discipline for more reliable results.
 
 > ## Changelog in action
 >
-> Have a look at one of the example github repositories and how they track changes*:
+> Have a look at one of the example GitHub repositories and how they track changes:
 > * [data from E.R. Ballou et al. 2020](https://github.com/ewallace/pseudonuclease_evolution_2020/commits/master)
 > * [data from I. Boehm et al. 2020](https://github.com/BioRDM/nmj-pig/commits/main)
 >
 > Give examples of:
-> * what makes them good changelog
+> * what makes their changelogs good
 > * what could be improved
 >
 > Think what would be the most difficult feature to replicate with manual version control?

diff --git a/_episodes/07-manuscripts.md b/_episodes/07-manuscripts.md
@@ -29,7 +29,7 @@ is essential, just like other collaborations.
 > ## Discussion (3 mins)
 >
 > Whether or not you have written a scientific manuscript before, 
-> you probably have experience of group work or writing
+> you probably have experience of group work or writing.
 > Discuss on the collaborative document:
 >
 > * What tools have you used before for group writing?
@@ -46,7 +46,7 @@ is essential, just like other collaborations.
 
 We suggest having a meeting (or online thread) of all authors at the
 beginning of the writing process. Ask everyone how they would prefer to
-write a manuscript. The agree a decision and process, and put the outcome
+write a manuscript. Then agree on a decision and process, and put the outcome
 in writing. If co-authors are learning new tools, ask someone
 familiar with those tools to support them!
 
@@ -111,7 +111,7 @@ Our first alternative will already be familiar to many researchers:
 
 1.  ***Write manuscripts using online tools with rich
     formatting, change tracking, and reference
-    management (6a)***, such as Google Docs or MS OneDrive.
+    management***, such as Google Docs or MS OneDrive.
     With the document online, everyone's changes are in one place, and
     hence don't need to be merged manually.
 
@@ -165,15 +165,15 @@ e.g. through [Rmarkdown](https://rmarkdown.rstudio.com/).
 |----------------------------------------------|----------------------|----------------------|----------------------------------|
 | Previous user experience/comfort             | High                 | Medium               | Low                              |
 | Visible tracking of changes                  | Low                  | Variable             | High                             |
-| Institutional support                        | Low                  | High*                | Low                              |
+| Institutional support                        | Low                  | High                 | Low                              |
 | Ease of merging changes and suggestions      | Low                  | Medium               | High                             |
 | Distributed control                          | Low                  | High                 | High                             |
 | Ease of formatting changes for re-submission | Low                  | Low                  | High                             |
 
 While we feel that text-based version control is a superior method,
 the barriers to entry may be too high for many users.
 The single master online approach is a good compromise.
-If your instution has invested in an environment (Google Docs / MS Office),
+If your institution has invested in an environment (Google Docs / MS Office),
 users can stay within their familiar desktop GUI applications while still
 taking advantage of automatic file versioning and shared editing.
 

diff --git a/_episodes/08-what_next.md b/_episodes/08-what_next.md
@@ -50,7 +50,7 @@ Thinking through your work from a collaborator's point of view is helpful, and u
 
 Learning good practices is a long-term process that never stops.
 We [left out many good practices](/_extras/what-we-left-out.md) that, although useful,
-have more niche application.
+have more niche applications.
 We recommend the paper [Best Practices in Scientific Computing](https://doi.org/10.1371/journal.pbio.1001745),
 especially for those gaining more experience with coding.
 There are [many other useful papers and resources that we have selected](/_extras/resources).
@@ -77,7 +77,7 @@ Progress in computational good practices comes from different places in the scie
 - PIs and lab heads can require that lab members share code and data, and make it easy for them to do so.
 - Lab members can organise "data curation days" and training sessions to share good practices.
 - Self-organised groups led by students and postdocs can share ideas and train each other.
-- Global organisations like [The Carpentries](https://carpentries.org) can co-ordinate training and suport training materials.
+- Global organisations like [The Carpentries](https://carpentries.org) can co-ordinate training and support training materials.
 - Professional societies can help to organise training.
 
 It takes time to learn good practices, and time to train others.