% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/bind-cache.R
\name{bindCache}
\alias{bindCache}
\title{Add caching with reactivity to an object}
\usage{
bindCache(x, ..., cache = "app")
}
\arguments{
\item{x}{The object to add caching to.}
\item{...}{One or more expressions to use in the caching key.}
\item{cache}{The scope of the cache, or a cache object. This can be \code{"app"}
(the default), \code{"session"}, or a cache object like a
\code{\link[cachem:cache_disk]{cachem::cache_disk()}}. See the Cache scope section for more information.}
}
\description{
\code{bindCache()} adds caching to \code{\link[=reactive]{reactive()}} expressions and \verb{render*} functions
(like \code{\link[=renderText]{renderText()}}, \code{\link[=renderTable]{renderTable()}}, ...).
Ordinary \code{\link[=reactive]{reactive()}} expressions automatically cache their \emph{most recent}
value, which helps to avoid redundant computation in downstream reactives.
\code{bindCache()} will cache all previous values (as long as they fit in the
cache) and they can be shared across user sessions. This allows
\code{bindCache()} to dramatically improve performance when used correctly.
}
\details{
\code{bindCache()} requires one or more expressions that are used to generate a
\strong{cache key}, which is used to determine if a computation has occurred
before and hence can be retrieved from the cache. If you're familiar with the
concept of memoizing pure functions (e.g., the \pkg{memoise} package), you
can think of the cache key as the input(s) to a pure function. As such, one
should take care to make sure the use of \code{bindCache()} is \emph{pure} in the same
sense, namely:
\enumerate{
\item For a given key, the return value is always the same.
\item Evaluation has no side-effects.
}
In the example here, the \code{bindCache()} key consists of \code{input$x} and
\code{input$y} combined, and the value is \code{input$x * input$y}. In this simple
example, for any given key, there is only one possible returned value.
\if{html}{\out{<div class="sourceCode">}}\preformatted{r <- reactive(\{ input$x * input$y \}) \%>\%
  bindCache(input$x, input$y)
}\if{html}{\out{</div>}}
The largest performance improvements occur when the cache key is fast to
compute and the reactive expression is slow to compute. To see if the value
should be computed, a cached reactive evaluates the key, and then serializes
and hashes the result. If the resulting hashed key is in the cache, then the
cached reactive simply retrieves the previously calculated value and returns
it; if not, then the value is computed and the result is stored in the cache
before being returned.
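
As a rough sketch of that lookup outside of Shiny (an illustration of the idea,
not Shiny's actual implementation), the same steps can be written with a
\code{\link[cachem:cache_mem]{cachem::cache_mem()}} and \code{rlang::hash()} as a stand-in hash function.
The names \code{cached_compute}, \code{slow_product}, and \code{n_computed} are hypothetical:

\if{html}{\out{<div class="sourceCode">}}\preformatted{library(cachem)

cache <- cache_mem()
n_computed <- 0   # Counts how often the slow computation actually runs

# Hypothetical helper: look up the evaluated key, computing only on a miss
cached_compute <- function(key_value, compute) \{
  key <- rlang::hash(key_value)   # Serialize and hash the evaluated key
  value <- cache$get(key)
  if (is.key_missing(value)) \{
    value <- compute()            # Cache miss: compute and store the value
    cache$set(key, value)
  \}
  value
\}

slow_product <- function(x, y) \{
  n_computed <<- n_computed + 1
  x * y
\}

first <- cached_compute(list(3, 4), function() slow_product(3, 4))
second <- cached_compute(list(3, 4), function() slow_product(3, 4))
n_computed
}\if{html}{\out{</div>}}

The second call finds the hashed key in the cache, so \code{slow_product()} runs
only once.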
To compute the cache key, \code{bindCache()} hashes the contents of \code{...}, so it's
best to avoid including large objects in a cache key since that can result in
slow hashing. It's also best to avoid reference objects like environments and
R6 objects, since the serialization of these objects may not capture relevant
changes.
If you want to use a large object as part of a cache key, it may make sense
to do some sort of reduction on the data that still captures information
about whether a value can be retrieved from the cache. For example, if you
have a large data set with timestamps, it might make sense to extract the
most recent timestamp and return that. Then, instead of hashing the entire
data object, the cached reactive only needs to hash the timestamp.
\if{html}{\out{<div class="sourceCode">}}\preformatted{r <- reactive(\{ compute(bigdata()) \}) \%>\%
  bindCache(\{ extract_most_recent_time(bigdata()) \})
}\if{html}{\out{</div>}}
For computations that are very slow, it often makes sense to pair
\code{\link[=bindCache]{bindCache()}} with \code{\link[=bindEvent]{bindEvent()}} so that no computation is performed until
the user explicitly requests it (for more, see the Details section of
\code{\link[=bindEvent]{bindEvent()}}).
}
\section{Cache keys and reactivity}{
Because the \strong{value} expression (from the original \code{\link[=reactive]{reactive()}}) is
cached, it is not necessarily re-executed when someone retrieves a value,
and therefore it can't be used to decide what objects to take reactive
dependencies on. Instead, the \strong{key} is used to figure out which objects
to take reactive dependencies on. In short, the key expression is reactive,
and the value expression is no longer reactive.
Here's an example of what not to do: if the key is \code{input$x} and the value
expression is from \code{reactive({input$x + input$y})}, then the resulting
cached reactive will only take a reactive dependency on \code{input$x} -- it
won't recompute \code{{input$x + input$y}} when just \code{input$y} changes.
Moreover, the cache won't use \code{input$y} as part of the key, and so it could
return incorrect values in the future when it retrieves values from the
cache. (See the examples below for an example of this.)
A better cache key would be something like \verb{input$x, input$y}. This does
two things: it ensures that a reactive dependency is taken on both
\code{input$x} and \code{input$y}, and it also makes sure that both values are
represented in the cache key.
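
The two cases can be put side by side in a minimal sketch. The names \code{server},
\code{r_bad}, and \code{r_good} are illustrative only:

\if{html}{\out{<div class="sourceCode">}}\preformatted{library(shiny)
library(magrittr)

server <- function(input, output, session) \{
  # Problematic: the value uses input$y, but the key does not
  r_bad <- reactive(\{ input$x + input$y \}) \%>\%
    bindCache(input$x)

  # Better: every reactive input used by the value is also in the key
  r_good <- reactive(\{ input$x + input$y \}) \%>\%
    bindCache(input$x, input$y)
\}
}\if{html}{\out{</div>}}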
In general, \code{key} should use the same reactive inputs as \code{value}, but the
computation should be simpler. If there are other (non-reactive) values
that are consumed, such as external data sources, they should be used in
the \code{key} as well. Note that if the \code{key} is large, it can make sense to do
some sort of reduction on it so that the serialization and hashing of the
cache key is not too expensive.
Remember that the key is \emph{reactive}, so it is not re-executed every single
time that someone accesses the cached reactive. It is only re-executed if
it has been invalidated by one of the reactives it depends on. For
example, suppose we have this cached reactive:
\if{html}{\out{<div class="sourceCode">}}\preformatted{r <- reactive(\{ input$x * input$y \}) \%>\%
  bindCache(input$x, input$y)
}\if{html}{\out{</div>}}
In this case, the key expression is essentially \code{reactive(list(input$x, input$y))} (there's a bit more to it, but that's a good enough
approximation). The first time \code{r()} is called, it executes the key, then
fails to find it in the cache, so it executes the value expression, \code{{ input$x * input$y }}. If \code{r()} is called again, then it does not need to
re-execute the key expression, because it has not been invalidated via a
change to \code{input$x} or \code{input$y}; it simply returns the previous value.
However, if \code{input$x} or \code{input$y} changes, then the reactive expression will
be invalidated, and the next time that someone calls \code{r()}, the key
expression will need to be re-executed.
Note that if the cached reactive is passed to \code{\link[=bindEvent]{bindEvent()}}, then the key
expression will no longer be reactive; instead, the event expression will be
reactive.
}
\section{Cache scope}{
By default, when \code{bindCache()} is used, it is scoped to the running
application. That means that it shares a cache with all user sessions
connected to the application (within the R process). This is done with the
\code{cache} parameter's default value, \code{"app"}.
With an app-level cache scope, one user can benefit from the work done for
another user's session. In most cases, this is the best way to get
performance improvements from caching. However, in some cases, this could
leak information between sessions. For example, if the cache key does not
fully encompass the inputs used by the value, then data could leak between
the sessions. Or if a user sees that a cached reactive returns its value
very quickly, they may be able to infer that someone else has already used
it with the same values.
It is also possible to scope the cache to the session, with
\code{cache="session"}. This removes the risk of information leaking between
sessions, but then one session cannot benefit from computations performed in
another session.
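
For example, a minimal sketch of a server function whose cached reactive is
scoped to the session (the reactive itself mirrors the earlier example, and the
name \code{server} is illustrative):

\if{html}{\out{<div class="sourceCode">}}\preformatted{library(shiny)
library(magrittr)

server <- function(input, output, session) \{
  # Cached values are kept per session rather than shared across the app
  r <- reactive(\{ input$x * input$y \}) \%>\%
    bindCache(input$x, input$y, cache = "session")
\}
}\if{html}{\out{</div>}}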
It is possible to pass in caching objects directly to
\code{bindCache()}. This can be useful if, for example, you want to use a
particular type of cache with specific cached reactives, or if you want to
use a \code{\link[cachem:cache_disk]{cachem::cache_disk()}} that is shared across multiple processes and
persists beyond the current R session.
To use different settings for an application-scoped cache, you can call
\code{\link[=shinyOptions]{shinyOptions()}} at the top of your app.R, server.R, or
global.R. For example, this will create a cache with 500 MB of space
instead of the default 200 MB:
\if{html}{\out{<div class="sourceCode">}}\preformatted{shinyOptions(cache = cachem::cache_mem(max_size = 500e6))
}\if{html}{\out{</div>}}
To use different settings for a session-scoped cache, you can set
\code{session$cache} at the top of your server function. By default, it will
create a 200 MB memory cache for each session, but you can replace it with
something different. To use the session-scoped cache, you must also call
\code{bindCache()} with \code{cache="session"}. This will create a 100 MB cache for
the session:
\if{html}{\out{<div class="sourceCode">}}\preformatted{function(input, output, session) \{
  session$cache <- cachem::cache_mem(max_size = 100e6)
  ...
\}
}\if{html}{\out{</div>}}
If you want to use a cache that is shared across multiple R processes, you
can use a \code{\link[cachem:cache_disk]{cachem::cache_disk()}}. You can create an application-level shared
cache by putting this at the top of your app.R, server.R, or global.R:
\if{html}{\out{<div class="sourceCode">}}\preformatted{shinyOptions(cache = cachem::cache_disk(file.path(dirname(tempdir()), "myapp-cache")))
}\if{html}{\out{</div>}}
This will create a subdirectory in your system temp directory named
\code{myapp-cache} (replace \code{myapp-cache} with a unique name of
your choosing). On most platforms, this directory will be removed when
your system reboots. This cache will persist across multiple starts and
stops of the R process, as long as you do not reboot.
To have the cache persist even across multiple reboots, you can create the
cache in a location outside of the temp directory. For example, it could
be a subdirectory of the application:
\if{html}{\out{<div class="sourceCode">}}\preformatted{shinyOptions(cache = cachem::cache_disk("./myapp-cache"))
}\if{html}{\out{</div>}}
In this case, resetting the cache will have to be done manually, by deleting
the directory.
You can also scope a cache to just one item, or selected items. To do that,
create a \code{\link[cachem:cache_mem]{cachem::cache_mem()}} or \code{\link[cachem:cache_disk]{cachem::cache_disk()}}, and pass it
as the \code{cache} argument of \code{bindCache()}.
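
A minimal sketch of this pattern follows. The name \code{small_cache} and the 20 MB
size are arbitrary choices for illustration. Because the cache object is
created outside the server function, it is shared by all sessions in the R
process; creating it inside the server function would instead give each
session its own cache object.

\if{html}{\out{<div class="sourceCode">}}\preformatted{library(shiny)
library(magrittr)

# A dedicated 20 MB memory cache, used only by the reactive below
small_cache <- cachem::cache_mem(max_size = 20e6)

server <- function(input, output, session) \{
  r <- reactive(\{ input$x * input$y \}) \%>\%
    bindCache(input$x, input$y, cache = small_cache)
\}
}\if{html}{\out{</div>}}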
}
\section{Computing cache keys}{
The actual cache key that is used internally takes the value from evaluating
the key expression(s) (from the \code{...} arguments) and combines it with the
(unevaluated) value expression.
This means that if there are two cached reactives which have the same
result from evaluating the key, but different value expressions, then they
will not need to worry about collisions.
However, if two cached reactives have identical key and value
expressions, they will share the cached values. This is useful when using
\code{cache="app"}: there may be multiple user sessions which create separate
cached reactive objects (because they are created from the same code in the
server function, but the server function is executed once for each user
session), and those cached reactive objects across sessions can share
values in the cache.
}
\section{Async with cached reactives}{
With a cached reactive expression, the key and/or value expression can be
\emph{asynchronous}. In other words, they can be promises --- not regular R
promises, but rather objects provided by the
\href{https://rstudio.github.io/promises/}{\pkg{promises}} package, which
are similar to promises in JavaScript. (See \code{\link[promises:promise]{promises::promise()}} for more
information.) You can also use \code{\link[future:future]{future::future()}} objects to run code in a
separate process or even on a remote machine.
If the value returns a promise, then anything that consumes the cached
reactive must expect it to return a promise.
Similarly, if the key is a promise (in other words, if it is asynchronous),
then the entire cached reactive must be asynchronous, since the key must be
computed asynchronously before it knows whether to compute the value or the
value is retrieved from the cache. Anything that consumes the cached
reactive must therefore expect it to return a promise.
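
As a hedged sketch of an asynchronous value expression, assuming the
\pkg{promises} and \pkg{future} packages are installed and that a
\code{multisession} plan suits your deployment (the name \code{server} and the output id
\code{txt} are illustrative):

\if{html}{\out{<div class="sourceCode">}}\preformatted{library(shiny)
library(promises)
library(future)
plan(multisession)

server <- function(input, output, session) \{
  r <- bindCache(
    reactive(\{
      # Read the inputs here, because the future runs in a separate process
      x <- input$x
      y <- input$y
      future_promise(\{
        Sys.sleep(2)   # Pretend this is expensive
        x * y
      \})
    \}),
    input$x, input$y
  )

  # r() returns a promise, so the consumer chains onto it with then()
  output$txt <- renderText(\{
    then(r(), function(value) paste("x * y:", value))
  \})
\}
}\if{html}{\out{</div>}}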
}
\section{Developing render functions for caching}{
If you've implemented your own \verb{render*()} function, it may just work with
\code{bindCache()}, but it is possible that you will need to make some
modifications. These modifications involve helping \code{bindCache()} avoid
cache collisions, dealing with internal state that may be set by the
\code{render} function, and modifying the data as it goes in and comes out of
the cache.
You may need to provide a \code{cacheHint} to \code{\link[=createRenderFunction]{createRenderFunction()}} (or
\code{htmlwidgets::shinyRenderWidget()}, if you've authored an htmlwidget) in
order for \code{bindCache()} to correctly compute a cache key.
The potential problem is a cache collision. Consider the following:
\if{html}{\out{<div class="sourceCode">}}\preformatted{output$x1 <- renderText(\{ input$x \}) \%>\% bindCache(input$x)
output$x2 <- renderText(\{ input$x * 2 \}) \%>\% bindCache(input$x)
}\if{html}{\out{</div>}}
Both \code{output$x1} and \code{output$x2} use \code{input$x} as part of their cache key,
but if it were the only thing used in the cache key, then the two outputs
would have a cache collision, and they would have the same output. To avoid
this, a \emph{cache hint} is automatically added when \code{\link[=renderText]{renderText()}} calls
\code{\link[=createRenderFunction]{createRenderFunction()}}. The cache hint is used as part of the actual
cache key, in addition to the one passed to \code{bindCache()} by the user. The
cache hint can be viewed by calling the internal Shiny function
\code{extractCacheHint()}:
\if{html}{\out{<div class="sourceCode">}}\preformatted{r <- renderText(\{ input$x \})
shiny:::extractCacheHint(r)
}\if{html}{\out{</div>}}
This returns a nested list containing an item, \verb{$origUserFunc$body}, which
in this case is the expression which was passed to \code{renderText()}:
\code{{ input$x }}. This (quoted) expression is mixed into the actual cache
key, and it is how \code{output$x1} does not have collisions with \code{output$x2}.
For most developers of render functions, nothing extra needs to be done;
the automatic inference of the cache hint is sufficient. Again, you can
check it by calling \code{shiny:::extractCacheHint()}, and by testing the
render function for cache collisions in a real application.
In some cases, however, the automatic cache hint inference is not
sufficient, and it is necessary to provide a cache hint. This is true
for \code{renderPrint()}. Unlike \code{renderText()}, it wraps the user-provided
expression in another function before passing it to \code{\link[=createRenderFunction]{createRenderFunction()}}.
Because the user code is wrapped in
another function, \code{createRenderFunction()} is not able to automatically
extract the user-provided code and use it in the cache key. Instead, when
\code{renderPrint()} calls \code{createRenderFunction()}, it explicitly passes along a
\code{cacheHint}, which includes a label and the original user expression.
In general, if you need to provide a \code{cacheHint}, it is best practice to
provide a \code{label} id, the user's \code{expr}, as well as any other arguments
that may influence the final value.
\pkg{htmlwidgets} will try to automatically infer a cache hint;
again, you can inspect the cache hint with \code{shiny:::extractCacheHint()} and
also test it in an application. If you do need to explicitly provide a
cache hint, pass it to \code{shinyRenderWidget}. For example:
\if{html}{\out{<div class="sourceCode">}}\preformatted{renderMyWidget <- function(expr) \{
  q <- rlang::enquo0(expr)
  htmlwidgets::shinyRenderWidget(
    q,
    myWidgetOutput,
    quoted = TRUE,
    cacheHint = list(label = "myWidget", userQuo = q)
  )
\}
}\if{html}{\out{</div>}}
If your \code{render} function sets any internal state, you may find it useful
in your call to \code{\link[=createRenderFunction]{createRenderFunction()}} to use
the \code{cacheWriteHook} and/or \code{cacheReadHook} parameters. These hooks are
functions that run just before the object is stored in the cache, and just
after the object is retrieved from the cache. They can modify the data
that is stored and retrieved; this can be useful if extra information needs
to be stored in the cache. They can also be used to modify the state of the
application; for example, a hook can call \code{\link[=createWebDependency]{createWebDependency()}} to make
JS/CSS resources available if the cached object is loaded in a different R
process. (See the source of \code{htmlwidgets::shinyRenderWidget} for an example
of this.)
}
\section{Uncacheable objects}{
Some render functions cannot be cached, typically because they have side
effects or modify some external state, and they must re-execute each time
in order to work properly.
Developers of such code should call \code{\link[=createRenderFunction]{createRenderFunction()}} (or
\code{\link[=markRenderFunction]{markRenderFunction()}}) with \code{cacheHint = FALSE}.
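
A minimal sketch of such a render function, using the long-standing
\code{exprToFunction()} pattern (the name \code{renderUncached} is hypothetical):

\if{html}{\out{<div class="sourceCode">}}\preformatted{library(shiny)

# Hypothetical render function whose result should never be cached
renderUncached <- function(expr, env = parent.frame(), quoted = FALSE) \{
  func <- exprToFunction(expr, env, quoted)
  createRenderFunction(func, cacheHint = FALSE)
\}

rf <- renderUncached(\{ Sys.time() \})
}\if{html}{\out{</div>}}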
}
\section{Caching with \code{renderPlot()}}{
When \code{bindCache()} is used with \code{renderPlot()}, the \code{height} and \code{width}
passed to the original \code{renderPlot()} are ignored. They are superseded by the
\code{sizePolicy} argument passed to \code{bindCache()}. The default is:
\if{html}{\out{<div class="sourceCode">}}\preformatted{sizePolicy = sizeGrowthRatio(width = 400, height = 400, growthRate = 1.2)
}\if{html}{\out{</div>}}
\code{sizePolicy} must be a function that takes a two-element numeric vector as
input, representing the width and height of the \verb{<img>} element in the
browser window, and it must return a two-element numeric vector, representing
the pixel dimensions of the plot to generate. The purpose is to round the
actual pixel dimensions from the browser to some other dimensions, so that
this will not generate and cache images of every possible pixel dimension.
See \code{\link[=sizeGrowthRatio]{sizeGrowthRatio()}} for more information on the default sizing policy.
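
As an illustration of that contract, here is a sketch of a custom policy that
rounds each dimension up to the next multiple of 100 pixels. The name
\code{round_up_100} and the input id \code{n} are hypothetical, and this is not the
default policy:

\if{html}{\out{<div class="sourceCode">}}\preformatted{library(shiny)
library(magrittr)

# Hypothetical policy: takes c(width, height), returns the rounded dimensions
round_up_100 <- function(dims) \{
  ceiling(dims / 100) * 100
\}

rounded <- round_up_100(c(437, 281))

server <- function(input, output, session) \{
  output$plot <- renderPlot(\{
    plot(seq_len(input$n))
  \}) \%>\%
    bindCache(input$n, sizePolicy = round_up_100)
\}
}\if{html}{\out{</div>}}

With this policy, a 437 x 281 pixel \verb{<img>} element is served by a plot
rendered at 500 x 300 pixels, so nearby browser sizes share one cached image.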
}
\examples{
\dontrun{
rc <- bindCache(
  x = reactive({
    Sys.sleep(2)   # Pretend this is expensive
    input$x * 100
  }),
  input$x
)
# Can make it prettier with the \%>\% operator
library(magrittr)
rc <- reactive({
  Sys.sleep(2)
  input$x * 100
}) \%>\%
  bindCache(input$x)
}
## Only run app examples in interactive R sessions
if (interactive()) {
  # Basic example
  shinyApp(
    ui = fluidPage(
      sliderInput("x", "x", 1, 10, 5),
      sliderInput("y", "y", 1, 10, 5),
      div("x * y: "),
      verbatimTextOutput("txt")
    ),
    server = function(input, output) {
      r <- reactive({
        # The value expression is an _expensive_ computation
        message("Doing expensive computation...")
        Sys.sleep(2)
        input$x * input$y
      }) \%>\%
        bindCache(input$x, input$y)
      output$txt <- renderText(r())
    }
  )
  # Caching renderText
  shinyApp(
    ui = fluidPage(
      sliderInput("x", "x", 1, 10, 5),
      sliderInput("y", "y", 1, 10, 5),
      div("x * y: "),
      verbatimTextOutput("txt")
    ),
    server = function(input, output) {
      output$txt <- renderText({
        message("Doing expensive computation...")
        Sys.sleep(2)
        input$x * input$y
      }) \%>\%
        bindCache(input$x, input$y)
    }
  )
  # Demo of using events and caching with an actionButton
  shinyApp(
    ui = fluidPage(
      sliderInput("x", "x", 1, 10, 5),
      sliderInput("y", "y", 1, 10, 5),
      actionButton("go", "Go"),
      div("x * y: "),
      verbatimTextOutput("txt")
    ),
    server = function(input, output) {
      r <- reactive({
        message("Doing expensive computation...")
        Sys.sleep(2)
        input$x * input$y
      }) \%>\%
        bindCache(input$x, input$y) \%>\%
        bindEvent(input$go)
      # The cached, eventified reactive takes a reactive dependency on
      # input$go, but doesn't use it for the cache key. It uses input$x and
      # input$y for the cache key, but doesn't take a reactive dependency on
      # them, because the reactive dependency is superseded by bindEvent().
      output$txt <- renderText(r())
    }
  )
}
}
\seealso{
\code{\link[=bindEvent]{bindEvent()}}, \code{\link[=renderCachedPlot]{renderCachedPlot()}} for caching plots.
}