-
Notifications
You must be signed in to change notification settings - Fork 2
/
functions-packages.qmd
352 lines (235 loc) · 14 KB
/
functions-packages.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
# Functions & Packages {#sec-functions-packages}
```{r}
#| echo: false
#| message: false
source("_common.R")
```
> ### Learning Objectives {.unnumbered}
>
> * Know some common functions in R.
> * Know how R handles function arguments and named arguments.
> * Know how to install, load, and use functions from external R packages.
> * Practice programming with functions using the TurtleGraphics package.
>
> ### Suggested Readings {.unnumbered}
>
> * Chapter 3 of Danielle Navarro's book ["Learning Statistics With R"]({{< var navarro_book_c3 >}})
## Functions
You can do a lot with the basic operators like `+`, `-`, and `*`, but to do more advanced calculations you're going to need to start using functions.^[Technically speaking, operators are functions in R: the addition operator `+` is a convenient way of calling the addition function `'+'()`. Thus `10+20` is equivalent to the function call `'+'(20, 30)`. Not surprisingly, no-one ever uses this version.]
> [Watch this 1-minute video for a quick summary of **functions**](https://vimeo.com/220490105)
R has a lot of very useful built-in functions. For example, if I wanted to take the square root of 225, I could use R's built-in square root function `sqrt()`:
```{r}
sqrt(225)
```
Here the letters `sqrt` are short for "square root," and the value inside the `()` is the "argument" to the function. In the example above, the value `225` is the "argument".
Keep in mind that not all functions have (or require) arguments:
```{r}
date() # Returns the current date and time
```
(the date above is the date this page was last built)
### Multiple arguments
Some functions have more than one argument. For example, the `round()` function can be used to round some value to the nearest integer or to a specified decimal place:
```{r}
round(3.14165) # Rounds to the nearest integer
round(3.14165, 2) # Rounds to the 2nd decimal place
```
Not all arguments are mandatory. With the `round()` function, the decimal place is an _optional_ input - if nothing is provided, the function will round to the nearest integer by default.
### Argument names
In the case of `round()`, it's not too hard to remember which argument comes first and which one comes second. But it starts to get very difficult once you start using complicated functions that have lots of arguments. Fortunately, most R functions use **argument names** to make your life a little easier. For the `round()` function, for example, the number that needs to be rounded is specified using the `x` argument, and the number of decimal points that you want it rounded to is specified using the `digits` argument, like this:
```{r}
round(x = 3.1415, digits = 2)
```
### Default values
Notice that the first time I called the `round()` function I didn't actually specify the `digits` argument at all, and yet R somehow knew that this meant it should round to the nearest whole number. How did that happen? The answer is that the `digits` argument has a **default value** of `0`, meaning that if you decide not to specify a value for `digits` then R will act as if you had typed `digits = 0`.
This is quite handy: most of the time when you want to round a number you want to round it to the nearest *whole* number, and it would be pretty annoying to have to specify the `digits` argument every single time. On the other hand, sometimes you actually do want to round to something other than the nearest whole number, and it would be even more annoying if R didn't allow this! Thus, by having `digits = 0` as the default value, we get the best of both worlds.
### Function help
Not sure what a function does, how many arguments it has, or what the argument names are? Ask R for help by typing `?` and then the function name, and R will return some documentation about it. For example, type `?round()` into the console and R will return information about how to use the `round()` function.
### Combining functions
In the same way that R allows us to put multiple operations together into a longer command (like `1 + 2 * 4` for instance), it also lets us put functions together and even combine functions with operators if we so desire. For example, the following is a perfectly legitimate command:
```{r}
round(sqrt(7), digits = 2)
```
When R executes this command, starts out by calculating the value of `sqrt(7)`, which produces an intermediate value of `2.645751`. The command then simplifies to `round(2.645751, digits = 2)`, which rounds the value to `2.65`.
## Frequently used functions
### Math functions
R has LOTS of functions. Many of the basic math functions are somewhat self-explanatory, but it can be hard to remember the specific function name. Below is a reference table of some frequently used math functions.
Function | Description | Example input | Example output
---------- | ------------------|------------------|---------------
`round(x, digits=0)` | Round `x` to the `digits` decimal place | `round(3.1415, digits=2)` | ``r round(3.1415, digits=2)``
`floor(x)` | Round `x` **down** the nearest integer | `floor(3.1415)` | ``r floor(3.1415)``
`ceiling(x)` | Round `x` **up** the nearest integer | `ceiling(3.1415)` | ``r ceiling(3.1415)``
`abs()` | Absolute value | `abs(-42)` | ``r abs(-42)``
`min()` | Minimum value | `min(1, 2, 3)` | ``r min(1, 2, 3)``
`max()` | Maximum value | `max(1, 2, 3)` | ``r max(1, 2, 3)``
`sqrt()` | Square root | `sqrt(64)` | ``r sqrt(64)``
`exp()` | Exponential | `exp(0)` | ``r exp(0)``
`log()` | Natural log | `log(1)` | ``r log(1)``
`factorial()` | Factorial | `factorial(5)` | ``r factorial(5)``
### Functions for manipulating data types
You will often need to check the data type of objects and convert them to other types. To handle this, use these patterns:
- Check the type of `x`: `is.______()`
- Convert the type of `x`: `as.______()`
In each of these patterns, replace "`______`" with:
- `character`
- `logical`
- `numeric` / `double` / `integer`
#### Converting data types
You can convert an object from one type to another using `as.______()`, replacing "`______`" with a data type:
Convert **numeric** types:
```{r}
as.numeric("3.1415")
as.double("3.1415")
as.integer("3.1415")
```
Convert **non-numeric** types:
```{r}
as.character(3.1415)
as.logical(3.1415)
```
A few notes to keep in mind:
1) When converting from a **numeric** to a **logical**, `as.logical()` will always return `TRUE` for any numeric value other than `0`, for which it returns `FALSE`.
```{r}
as.logical(7)
as.logical(0)
```
The reverse is also true
```{r}
as.numeric(TRUE)
as.numeric(FALSE)
```
2) Not everything can be converted. For example, if you try to coerce a character that contains letters into a number, R will return `NA`, because it doesn't know what number to choose:
```{r}
as.numeric('foo')
```
3) The `as.integer()` function behaves the same as `floor()`:
```{r}
as.integer(3.14)
as.integer(3.99)
```
#### Checking data types
Similar to the `as.______()` format, you can check if an object is a specific data type using `is.______()`, replacing "`______`" with a data type.
Checking **numeric** types:
```{r}
is.numeric(3.1415)
is.double(3.1415)
is.integer(3.1415)
```
Checking **non-numeric** types:
```{r}
is.character(3.1415)
is.logical(3.1415)
```
One thing you'll notice is that `is.integer()` often gives you a surprising result. For example, why did `is.integer(7)` return `FALSE`?. Well, this is because numbers are _doubles_ by default in R, so even though `7` _looks_ like an integer, R thinks it's a double.
The safer way to check if a number is an integer in _value_ is to compare it against itself converted into an integer:
```{r}
7 == as.integer(7)
```
## More functions with **packages**
When you start R, it only loads the "Base R" functions (e.g. `sqrt()`, `round()`, etc.), but there are thousands and thousands of additional functions stored in external **packages**.
> [Watch this 1-minute video for a quick summary of **packages**](https://vimeo.com/220490447)
### Installing packages
To install a package, use the `install.packages()` function. Make sure you put the package name in quotes:
```{r eval=FALSE}
install.packages("packagename") # This works
install.packages(packagename) # This doesn't work
```
Just like most software, you only need to _install_ a package once.
### Using packages
After installing a package, you can't immediately use the functions that the package contains. This is because when you start up R only the "base" functions are loaded. If you want R to also load the functions inside a package, you have to _load_ that package, which you do with the `library()` function. In contrast to the `install.packages()` function, you don't need quotes around the package name to load it:
```{r eval=FALSE}
library("packagename") # This works
library(packagename) # This also works
```
Here's a helpful image to keep the two ideas of _installing_ vs _loading_ separate:
![](images/package_lightbulb.png){ width=800 }
### Example: **wikifacts**
As an example, try installing the [Wikifacts](https://github.com/keithmcnulty/wikifacts) package, by Keith McNulty:
```{r eval=FALSE}
install.packages("wikifacts") # Remember - you only have to do this once!
```
Now that you have the package installed on your computer, try loading it using `library(wikifacts)`, then trying using some of it's functions:
```{r}
library(wikifacts) # Load the library
```
```{r}
wiki_randomfact()
wiki_didyouknow()
```
In case you're wondering, the only thing this package does is generate messages containing random facts from [Wikipedia](https://www.wikipedia.org/).
### Using only _some_ package functions
Sometimes you may only want to use a single function from a library without having to load the whole thing. To do so, use this recipe:
> packagename::functionname()
Here I use the name of the _package_ followed by `::` to tell R that I'm looking for a function that is in that package. For example, if I didn't want to load the whole **wikifacts** library but still wanted to use the `wiki_randomfact()` function, I could do this:
```{r}
wikifacts::wiki_randomfact()
```
Where this is particularly handy is when two packages have a function with the same name. If you load both library, R might not know which function to use. In those cases, it's best to also provide the **package** name. For example, let's say there was a package called **apples** and another called **bananas**, and each had a function named `fruitName()`. If I wanted to use each of them in my code, I would need to specify the package names like this:
```{r, eval=FALSE}
apples::fruitName()
bananas::fruitName()
```
## Turtle Graphics
[Turtle graphics](https://en.wikipedia.org/wiki/Turtle_graphics) is a classic teaching tool in computer science, originally invented in the 1960s and re-implemented over and over again in different programming languages.
In R, there is a similar package called [**TurtleGraphics**](http://www.gagolewski.com/software/TurtleGraphics/?section=software&subsection=TurtleGraphics). To get started, install the package (remember, you only need to do this once on your computer):
```{r, eval=FALSE}
install.packages('TurtleGraphics')
```
Once installed, load the package (remember, you have to load this every time you restart R to use the package!):
```{r}
library(TurtleGraphics)
```
### Getting to know your turtle
Here's the idea. You have a turtle, and she lives in a nice warm terrarium. The terrarium is 100 x 100 units in size, where the lower-left corner is at the `(x, y)` position of `(0, 0)`. When you call `turtle_init()`, the turtle is initially positioned in the center of the terrarium at `(50, 50)`:
```{r}
#| eval: false
turtle_init()
```
![](images/turtle_init.png){ width=456 }
You can move the turtle using a variety of movement functions (see `?turtle_move()`), and she will leave a trail where ever she goes. For example, you can move her 10 units forward from her starting position:
```{r}
#| eval: false
turtle_init()
turtle_forward(distance = 10)
```
![](images/turtle_forward.png){ width=456 }
You can also make the turtle jump to a new position (without drawing a line) by using the `turtle_setpos(x, y)`, where `(x, y)` is a coordinate within the 100 x 100 terrarium:
```{r}
#| eval: false
turtle_init()
turtle_setpos(x=10, y=10)
```
![](images/turtle_setpos.png){ width=456 }
### Turtle loops
Simple enough, right? But what if I want my turtle to draw a more complicated shape? Let's say I want her to draw a hexagon. There are six sides to the hexagon, so the most natural way to write code for this is to write a `for` loop that loops over the sides (if this doesn't make sense yet, go read ahead to the chapter on [iteration](iteration.qmd)!). At each iteration within the loop, I'll have the turtle walk forwards, and then turn 60 degrees to the left. Here's what happens:
<!-- ```{r, fig.show='animate', interval=0.05, cache=TRUE, message=FALSE} -->
<!-- ffmpeg is a pain, so manually sticking video in -->
```{r}
#| eval: false
turtle_init()
for (side in 1:6) {
turtle_forward(distance = 10)
turtle_left(angle = 60)
}
```
<video width="456" controls="" loop="">
<source src="images/turtle_hexagon.webm">
</video>
Cool! As you draw more complex shapes, you can speed up the process by wrapping your turtle commands inside the `turtle_do({})` function. This will skip the animations of the turtle moving and will jump straight to the final position. For example, here's the hexagon again without animations:
```{r}
#| eval: false
turtle_init()
turtle_do({
for (side in 1:6) {
turtle_forward(distance = 10)
turtle_left(angle = 60)
}
})
```
![](images/turtle_hexagon.png){ width=456 }
## Page sources {.unnumbered}
Some content on this page has been modified from other courses, including:
- Danielle Navarro's book ["Learning Statistics With R"]({{< var navarro_book_c3>}})
- Danielle Navarro's website ["R for Psychological Science"]({{< var navarro_psy_book>}})
- Jenny Bryan's [STAT 545 Course]({{< var stat545>}})
- [RStudio primers](https://rstudio.cloud/learn/primers/1.2)
- Xiao Ping Song's [Intro2R crash course]({{< var song_book >}})