---
title: "Homework 12 - Reproducible Reporting"
output:
  html_document:
    toc: false
    number_sections: false
params:
  number: 12
  submit: "https://u.pcloud.com/#page=puplink&code=4VYkZFrkB2PiCTam5NRIFErHAE4tQIrXV"
  purpose: |
    The purposes of this assignment are:
    >
    > - Be able to manage computational projects for reproducibility, reuse, and collaboration.
    > - Use R tools and conventions to document code and analyses and produce reproducible reports.
    > - Be able to publish, share materials, and collaborate through the web.
---
```{r child = file.path("child", "setup.Rmd")}
```
```{r child = file.path("child", "hw.Rmd")}
```
### 1) Staying organized [5%]
Download and use [this template](templates/hw12.zip) for your assignment. Inside the "hw12" folder, open and edit the YAML at the top of the "hw12.Rmd" file.
> This week's main task is simple: **Create an HTML document of HW 11**.
For this last assignment, all you need to do is convert what you did in HW 11 into a .Rmd file that compiles to an HTML document. So you're going to repeat the same steps as in HW 11, except this time your result will be a much nicer, well-formatted HTML page. **If you want, you may choose to work with a different dataset from HW 11, but you may also just use the same dataset and charts from your HW 11 submission.**
> ### **Using good style**
>
> For this assignment, you must use good style to receive full credit. Follow the best practices described in [this style guide](http://adv-r.had.co.nz/Style.html).
### 2) Choose and load some data [5%]
For this assignment, you will need to choose a dataset and create **three** summary visualizations. To keep things manageable, choose one of the datasets below from the following packages. Note that to load any of these data frames, all you need to do is install and load the package.
**dplyr**:
- `storms`
- `starwars`
**ggplot2**:
- `diamonds`
- `economics`
- `midwest`
- `mpg`
- `msleep`
- `txhousing`
**dslabs**:
- `gapminder`
- `movielens`
- `murders`
- `stars`
### 3) Inspect your data [10%]
Once you've chosen a dataset, open your `hw12.Rmd` file and begin exploring the data (be sure to load the package that contains the dataset at the top of your file). Write some code in code chunks to preview and summarize the data frame using some of the methods we've used in class. You should be able to quickly get an understanding of what variables are included and their nature. Consider the following questions in your exploration (you don't have to write out answers to these questions - just write code to help you answer them by previewing the data in different ways):
- What is the total size of the data frame?
- What type of data is each variable (numeric, character, logical, date)?
- Do any variables have missing values? Why might that be?
- What are the "boundaries" of each period of observation:
    - For numeric variables, what are the min and max values?
    - For character variables, what are the unique values in the variable?
    - For date variables, what time period do the observations in these data frames span?
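
As a rough illustration of the questions above, here is a minimal sketch using base R functions. It assumes base R's built-in `airquality` data frame as a stand-in (it is not one of the datasets listed for this assignment), and the names `air`, `na_counts`, and `obs_date` are made up for the example. Swap in your own data frame and column names.

```r
# Stand-in data (an assumption): base R's airquality. Replace with your dataset.
air <- airquality

# Total size: number of rows and columns
dim(air)

# Type of each variable
str(air)

# Number of missing values in each variable
na_counts <- colSums(is.na(air))
na_counts

# Min, max, and quartiles for the numeric variables
summary(air)

# Distinct values of a variable (the same call works on character columns)
unique(air$Month)

# Time span: airquality stores month and day separately, so build a Date
# vector first. The year 1973 comes from the dataset's help page (?airquality).
obs_date <- as.Date(paste(1973, air$Month, air$Day, sep = "-"))
range(obs_date)
```

If your dataset already has a date column, you can call `range()` on it directly.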
### 4) Make charts [50%]
Now that you have a basic understanding of the dataset, make some charts to explore the variables in the data and their potential relationships. You may use base R plotting functions or the **ggplot2** package to make your figures, but you must make at least two different types of figures, including:
1. A scatterplot involving at least two variables.
2. A bar chart involving at least one variable.
You can choose to plot whichever variables you wish, but you must be able to interpret the results of your chart.
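
The two required chart types could look something like the sketch below, which uses base R plotting functions. It again assumes the built-in `airquality` data frame as a stand-in rather than one of the listed datasets, and the variable choices and the names `air` and `mean_ozone` are only for illustration.

```r
# Stand-in data (an assumption): base R's airquality. Replace with your dataset.
air <- airquality

# Scatterplot of two numeric variables; rows with missing values are skipped
plot(
  x = air$Temp,
  y = air$Ozone,
  xlab = "Temperature (degrees F)",
  ylab = "Ozone (ppb)",
  main = "Ozone concentration vs. temperature"
)

# Bar chart: mean ozone per month, ignoring missing values
mean_ozone <- tapply(air$Ozone, air$Month, mean, na.rm = TRUE)
barplot(
  mean_ozone,
  xlab = "Month",
  ylab = "Mean ozone (ppb)",
  main = "Mean ozone concentration by month"
)
```

The same charts can be made with **ggplot2** if you prefer; either approach is acceptable for this assignment.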
### 5) Interpret your charts [15%]
Below the code for each of your charts, write a description and interpretation of your chart in a comment. Make sure you address at least the following questions:
1. Describe what variables you are plotting and why.
2. Describe the primary relationship / trend / information you hope the reader will gain from your visualization.
### 6) Compile your RMarkdown document [5%]
When you're done, click the "Knit" button to compile your .Rmd file into an HTML page. Make sure you've used code chunk settings to make the outputs pretty (e.g. nice figure dimensions, no warning messages, etc.).
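
One common way to apply chunk settings to the whole document is to set defaults once in a chunk near the top of the file. The values below are only a sketch with assumed figure dimensions; the template may already set some of these, and any option can still be overridden in an individual chunk header.

```r
# Default options for every code chunk that follows (values are assumptions)
knitr::opts_chunk$set(
  warning    = FALSE,    # hide warnings in the knitted page
  message    = FALSE,    # hide package startup and other messages
  fig.width  = 7,        # figure width in inches
  fig.height = 5,        # figure height in inches
  fig.align  = "center"  # center figures on the page
)
```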
### 7) Read and reflect [SOLO, 10%]
Read and reflect on next week's readings on [Monte Carlo methods](r13-monte-carlo-methods.html). Afterwards, in a comment (`#`) in your R file, write a short reflection on what you've learned and any questions or points of confusion you have about what we've covered thus far. This can just be a few sentences related to this assignment, next week's readings, things going on in the world that remind you of something from class, etc. If there's anything that jumped out at you, write it down.