Commit 1abf1d6

various small translate-sql vignette corrections (#1550)
1 parent d51bc0d commit 1abf1d6

1 file changed

+16
-16
lines changed

vignettes/translation-function.Rmd

Lines changed: 16 additions & 16 deletions
@@ -30,15 +30,15 @@ con <- simulate_dbi()
 translate_sql((x + y) / 2, con = con)
 ```
 
-`translate_sql()` takes an optional `con` parameter. If not supplied, this causes dplyr to generate (approximately) SQL-92 compliant SQL. If supplied, dplyr uses `sql_translation()` to look up a custom environment which makes it possible for different databases to generate slightly different SQL: see `vignette("new-backend")` for more details. You can use the various simulate helpers to see the translations used by different backends:
+`translate_sql()` takes an optional `con` parameter. If not supplied, this causes dbplyr to generate (approximately) SQL-92 compliant SQL. If supplied, dbplyr uses `sql_translation()` to look up a custom environment which makes it possible for different databases to generate slightly different SQL: see `vignette("new-backend")` for more details. You can use the various simulate helpers to see the translations used by different backends:
 
 ```{r}
 translate_sql(x ^ 2L, con = con)
 translate_sql(x ^ 2L, con = simulate_sqlite())
 translate_sql(x ^ 2L, con = simulate_access())
 ```
 
-Perfect translation is not possible because databases don't have all the functions that R does. The goal of dplyr is to provide a semantic rather than a literal translation: what you mean, rather than precisely what is done. In fact, even for functions that exist both in databases and R, you shouldn't expect results to be identical; database programmers have different priorities than R core programmers. For example, in R in order to get a higher level of numerical accuracy, `mean()` loops through the data twice. R's `mean()` also provides a `trim` option for computing trimmed means; this is something that databases do not provide.
+Perfect translation is not possible because databases don't have all the functions that R does. The goal of dbplyr is to provide a semantic rather than a literal translation: what you mean, rather than precisely what is done. In fact, even for functions that exist both in databases and in R, you shouldn't expect results to be identical; database programmers have different priorities than R core programmers. For example, in R in order to get a higher level of numerical accuracy, `mean()` loops through the data twice. R's `mean()` also provides a `trim` option for computing trimmed means; this is something that databases do not provide.
 
 If you're interested in how `translate_sql()` is implemented, the basic techniques that underlie the implementation of `translate_sql()` are described in ["Advanced R"](https://adv-r.hadley.nz/translation.html).
 
@@ -63,7 +63,7 @@ The following examples work through some of the basic differences between R and
 ```
 
 * R and SQL have different defaults for integers and reals.
-In R, 1 is a real, and 1L is an integer. In SQL, 1 is an integer, and 1.0 is a real
+In R, 1 is a real, and 1L is an integer. In SQL, 1 is an integer, and 1.0 is a real.
 
 ```{r}
 translate_sql(1, con = con)
@@ -104,7 +104,7 @@ dbplyr no longer translates `%/%` because there's no robust cross-database trans
 
 ### Aggregation
 
-All database provide translation for the basic aggregations: `mean()`, `sum()`, `min()`, `max()`, `sd()`, `var()`. Databases automatically drop NULLs (their equivalent of missing values) whereas in R you have to ask nicely. The aggregation functions warn you about this important difference:
+All databases provide translation for the basic aggregations: `mean()`, `sum()`, `min()`, `max()`, `sd()`, `var()`. Databases automatically drop NULLs (their equivalent of missing values) whereas in R you have to ask nicely. The aggregation functions warn you about this important difference:
 
 ```{r}
 translate_sql(mean(x), con = con)
@@ -119,7 +119,7 @@ translate_sql(mean(x, na.rm = TRUE), window = FALSE, con = con)
 
 ### Conditional evaluation
 
-`if` and `switch()` are translate to `CASE WHEN`:
+`if` and `switch()` are translated to `CASE WHEN`:
 
 ```{r}
 translate_sql(if (x > 5) "big" else "small", con = con)
@@ -135,7 +135,7 @@ translate_sql(switch(x, a = 1L, b = 2L, 3L), con = con)
 
 ## Unknown functions
 
-Any function that dplyr doesn't know how to convert is left as is. This means that database functions that are not covered by dplyr can often be used directly via `translate_sql()`.
+Any function that dbplyr doesn't know how to convert is left as is. This means that database functions that are not covered by dbplyr can often be used directly via `translate_sql()`.
 
 ### Prefix functions
 
@@ -145,15 +145,15 @@ Any function that dbplyr doesn't know about will be left as is:
 translate_sql(foofify(x, y), con = con)
 ```
 
-Because SQL functions are general case insensitive, I recommend using upper case when you're using SQL functions in R code. That makes it easier to spot that you're doing something unusual:
+Because SQL functions are generally case insensitive, I recommend using upper case when you're using SQL functions in R code. That makes it easier to spot that you're doing something unusual:
 
 ```{r}
 translate_sql(FOOFIFY(x, y), con = con)
 ```
 
 ### Infix functions
 
-As well as prefix functions (where the name of the function comes before the arguments), dbplyr also translates infix functions. That allows you to use expressions like `LIKE` which does a limited form of pattern matching:
+As well as prefix functions (where the name of the function comes before the arguments), dbplyr also translates infix functions. That allows you to use expressions like `LIKE`, which does a limited form of pattern matching:
 
 ```{r}
 translate_sql(x %LIKE% "%foo%", con = con)
@@ -190,7 +190,7 @@ mf %>%
 
 ### Error for unknown translations
 
-If needed, you can also force dbplyr to error if it doesn't know how to translate a function with the `dplyr.strict_sql` option:
+If needed, you can also use the `dplyr.strict_sql` option to force dbplyr to error if it doesn't know how to translate a function:
 
 ```{r}
 #| error = TRUE
@@ -245,16 +245,16 @@ Things get a little trickier with window functions, because SQL's window functio
 knitr::include_graphics("windows.png", dpi = 300)
 ```
 
-Of the many possible specifications, there are only three that commonly
+Of the many possible specifications, only three are commonly
 used. They select between aggregation variants:
 
-* Recycled: `BETWEEN UNBOUND PRECEEDING AND UNBOUND FOLLOWING`
+* Recycled: `BETWEEN UNBOUND PRECEDING AND UNBOUND FOLLOWING`
 
-* Cumulative: `BETWEEN UNBOUND PRECEEDING AND CURRENT ROW`
+* Cumulative: `BETWEEN UNBOUND PRECEDING AND CURRENT ROW`
 
-* Rolling: `BETWEEN 2 PRECEEDING AND 2 FOLLOWING`
+* Rolling: `BETWEEN 2 PRECEDING AND 2 FOLLOWING`
 
-dplyr generates the frame clause based on whether your using a recycled
+dbplyr generates the frame clause based on whether you're using a recycled
 aggregate or a cumulative aggregate.
 
 To see how individual window functions are translated to SQL, we can again use `translate_sql()`:
@@ -266,14 +266,14 @@ translate_sql(ntile(G, 2), con = con)
 translate_sql(lag(G), con = con)
 ```
 
-If the tbl has been grouped or arranged previously in the pipeline, then dplyr will use that information to set the "partition by" and "order by" clauses. For interactive exploration, you can achieve the same effect by setting the `vars_group` and `vars_order` arguments to `translate_sql()`
+If the tbl has been grouped or arranged previously in the pipeline, then dplyr will use that information to set the "partition by" and "order by" clauses. For interactive exploration, you can achieve the same effect by setting the `vars_group` and `vars_order` arguments to `translate_sql()`:
 
 ```{r}
 translate_sql(cummean(G), vars_order = "year", con = con)
 translate_sql(rank(), vars_group = "ID", con = con)
 ```
 
-There are some challenges when translating window functions between R and SQL, because dplyr tries to keep the window functions as similar as possible to both the existing R analogues and to the SQL functions. This means that there are three ways to control the order clause depending on which window function you're using:
+There are some challenges when translating window functions between R and SQL, because dbplyr tries to keep the window functions as similar as possible to both the existing R analogues and to the SQL functions. This means that there are three ways to control the order clause depending on which window function you're using:
 
 * For ranking functions, the ordering variable is the first argument: `rank(x)`,
 `ntile(y, 2)`. If omitted or `NULL`, will use the default ordering associated
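
The window-function hunks above say that dbplyr picks the frame clause depending on whether the aggregate is recycled or cumulative. The following is a minimal sketch for checking that locally, not part of the commit. It assumes an installed dbplyr in which `translate_sql()` accepts `con`, `vars_group`, and `vars_order` as used in the vignette; the column names `G`, `ID`, and `year` are borrowed from the vignette's own examples. Note that SQL itself spells the frame keyword `UNBOUNDED`, so that is the spelling to look for in any generated frame clause.

```r
library(dbplyr)

# Simulated connection, as used throughout the vignette
con <- simulate_dbi()

# Recycled aggregate: one value per partition, repeated on every row
recycled <- translate_sql(
  mean(G, na.rm = TRUE),
  vars_group = "ID",
  con = con
)

# Cumulative aggregate: needs an ordering so that
# "up to the current row" is well defined
cumulative <- translate_sql(
  cumsum(G),
  vars_group = "ID",
  vars_order = "year",
  con = con
)

print(recycled)
print(cumulative)
```

Printing both results side by side shows which of the two emits an explicit frame; the recycled case may rely on the database's default frame rather than spelling one out.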
