Commit c7e0175

Improve looping over files lecture material
* Condense presentation
* Show shortcuts for empty vectors
* Store filename then read it to simplify setup for storing in data frames
* Calculate number of files once
* Add a realistic calculation within the file itself
1 parent b0ca28b commit c7e0175

1 file changed: +12 -17 lines changed

materials/for-loops-R.md

Lines changed: 12 additions & 17 deletions
@@ -207,14 +207,16 @@ data_files = list.files(pattern = "locations-")
 * First create an empty vector to store those counts

 ```r
-results <- vector(mode = "integer", length = length(data_files))
+n_files = length(data_files)
+results <- integer(n_files)
 ```

 * Then write our loop

 ```r
-for (i in 1:length(data_files){
-  data <- read.csv(data_files[i])
+for (i in 1:n_files){
+  filename <- data_files[i]
+  data <- read.csv(filename)
   count <- nrow(data)
   results[i] <- count
 }
@@ -228,38 +230,31 @@ for (i in 1:length(data_files){
 * We often want to calculate multiple pieces of information in a loop making it useful to store results in things other than vectors
 * We can store them in a data frame instead by creating an empty data frame and storing the results in the `i`th row of the appropriate column
 * Associate the file name with the count
+* Also store the minimum latitude
 * Start by creating an empty data frame
 * Use the `data.frame` function
 * Provide one argument for each column
 * "Column Name" = "an empty vector of the correct type"

 ```r
-results <- data.frame(file_name = vector(mode = "character", length = length(data_files)))
-                      count = vector(mode = "integer", length = length(data_files)))
+results <- data.frame(file_name = character(n_files),
+                      count = integer(n_files),
+                      min_lat = numeric(n_files))
 ```

 * Now let's modify our loop from last time
 * Instead of storing `count` in `results[i]` we need to first specify the `count` column using the `$`: `results$count[i]`
 * We also want to store the filename, which is `data_files[i]`

 ```r
-for (i in 1:length(data_files){
-  data <- read.csv(data_files[i])
-  count <- nrow(data)
-  results$file_name[i] <- data_files[i]
-  results$count[i] <- count
-}
-```
-
-* We could also rewrite this a little to make it easier to understand by getting the file name at the begging
-
-```r
-for (i in 1:length(data_files){
+for (i in 1:n_files){
   filename <- data_files[i]
   data <- read.csv(filename)
   count <- nrow(data)
+  min_lat = min(data$lat)
   results$file_name[i] <- filename
   results$count[i] <- count
+  results$min_lat[i] <- min_lat
 }
 ```

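For reference, the loop as it stands after this commit can be run end to end with the minimal sketch below. It is an illustration under stated assumptions, not the lecture's actual setup. The diff does not show the real `locations-*.csv` files, so the file names, the `long` column, and all values are hypothetical and are written to a temporary directory first. Three things differ from the committed code: `seq_len(n_files)` replaces `1:n_files`, `stringsAsFactors = FALSE` is added, and `basename()` is used when storing the file name.

```r
# Hypothetical sample data: two small CSV files in a temporary directory.
# File names, the `long` column, and all values are made up for illustration.
data_dir <- tempfile("loop-demo-")
dir.create(data_dir)
write.csv(data.frame(lat = c(29.6, 30.1, 28.9), long = c(-82.3, -81.7, -82.0)),
          file.path(data_dir, "locations-2016-01-01.csv"), row.names = FALSE)
write.csv(data.frame(lat = c(35.2, 34.8), long = c(-80.1, -79.9)),
          file.path(data_dir, "locations-2016-01-02.csv"), row.names = FALSE)

# full.names = TRUE returns paths that read.csv() can open from any
# working directory; the pattern is matched against file names only.
data_files <- list.files(data_dir, pattern = "locations-", full.names = TRUE)
n_files <- length(data_files)

# The shortcut used in the commit creates the same empty vector as the long form.
stopifnot(identical(integer(n_files),
                    vector(mode = "integer", length = n_files)))

# One pre-sized column per result. stringsAsFactors = FALSE only matters
# on R versions before 4.0, where character columns became factors.
results <- data.frame(file_name = character(n_files),
                      count = integer(n_files),
                      min_lat = numeric(n_files),
                      stringsAsFactors = FALSE)

# seq_len() is swapped in for 1:n_files (not part of the commit):
# with zero files, 1:0 would iterate over c(1, 0), while seq_len(0) is empty.
for (i in seq_len(n_files)) {
  filename <- data_files[i]
  data <- read.csv(filename)
  count <- nrow(data)
  min_lat <- min(data$lat)
  results$file_name[i] <- basename(filename)  # drop the temp directory path
  results$count[i] <- count
  results$min_lat[i] <- min_lat
}

print(results)
```

Storing `filename` once at the top of the loop body, as the commit does, means the same value feeds both `read.csv()` and the `file_name` column, so the two cannot drift apart.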