@@ -22,17 +22,19 @@ which wraps the [`AWS-LC` cryptographic library][9].
Traditionally, `/dev/random` has been considered a source of “true” randomness,
with the downside that reads block when the pool of entropy gets depleted. On
- the other hand, `/dev/urandom` doesn’t block, but provides lower quality
- results. It turns out the distinction in output quality is actually very hard to
- make. According to [this article][2], for kernel versions prior to 4.8, both
- devices draw their output from the same pool, with the exception that
- `/dev/random` will block when the system estimates the entropy count has
- decreased below a certain threshold. The `/dev/urandom` output is considered
- secure for virtually all purposes, with the caveat that using it before the
- system gathers sufficient entropy for initialization may indeed produce low
- quality random numbers. The `getrandom` syscall helps with this situation; it
- uses the `/dev/urandom` source by default, but will block until it gets properly
- initialized (the behavior can be altered via configuration flags).
+ the other hand, `/dev/urandom` doesn’t block, which led people to believe that
+ it provides lower quality results.
+
+ It turns out the distinction in output quality is actually very hard to make.
+ According to [this article][2], for kernel versions prior to 4.8, both devices
+ draw their output from the same pool, with the exception that `/dev/random` will
+ block when the system estimates the entropy count has decreased below a certain
+ threshold. The `/dev/urandom` output is considered secure for virtually all
+ purposes, with the caveat that using it before the system gathers sufficient
+ entropy for initialization may indeed produce low quality random numbers. The
+ `getrandom` syscall helps with this situation; it uses the `/dev/urandom` source
+ by default, but will block until it gets properly initialized (the behavior can
+ be altered via configuration flags).
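
The effect of the `getrandom` flags can be illustrated with a minimal C sketch.
It assumes glibc 2.25 or newer (for `<sys/random.h>`), and the helper name
`fill_random_nonblocking` is hypothetical. Passing `GRND_NONBLOCK` makes the
call fail with `EAGAIN` instead of blocking while the CSPRNG is not yet
initialized, so the caller can decide how to react:

```c
#include <errno.h>
#include <stddef.h>
#include <sys/random.h>
#include <sys/types.h>

/*
 * Fill buf with len random bytes without ever blocking.
 * Returns 0 on success, 1 if the kernel CSPRNG is not yet initialized
 * (EAGAIN), and -1 on any other error (errno is left set).
 * The function name is hypothetical, chosen for this sketch.
 */
int fill_random_nonblocking(unsigned char *buf, size_t len)
{
    size_t done = 0;

    while (done < len) {
        ssize_t n = getrandom(buf + done, len - done, GRND_NONBLOCK);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted by a signal, retry */
            if (errno == EAGAIN)
                return 1;   /* CSPRNG not initialized yet */
            return -1;
        }
        done += (size_t)n;  /* short reads are possible, keep going */
    }
    return 0;
}
```

Calling `getrandom` with flags set to `0` instead gives the default behavior
described above: the call blocks until the CSPRNG has been initialized.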

Newer kernels (4.8+) have switched to an implementation where `/dev/random`
output comes from a pool called the blocking pool, the output of `/dev/urandom`
@@ -41,6 +43,8 @@ and there’s also an input pool which gathers entropy from various sources
available on the system, and is used to feed into or seed the other two
components. A very detailed description is available [here][3].

+ ### Linux kernels from 4.8 until 5.17 (included)
+
The details of this newer implementation are used to make the recommendations
present in the document. There are in-kernel interfaces used to obtain random
numbers as well, but they are similar to using `/dev/urandom` (or `getrandom`
@@ -99,6 +103,42 @@ not increase the current entropy estimation. There is also an `ioctl` interface
which, given the appropriate privileges, can be used to add data to the input
entropy pool while also increasing the count, or completely empty all pools.

+ ### Linux kernels from 5.18 onwards
+
+ Since version 5.18, Linux has support for the
+ [Virtual Machine Generation Identifier](https://learn.microsoft.com/en-us/windows/win32/hyperv_v2/virtual-machine-generation-identifier).
+ The purpose of VMGenID is to notify the guest about time shift events, such as
+ resuming from a snapshot. The device exposes a 16-byte cryptographically random
+ identifier in guest memory. Firecracker implements VMGenID. When resuming a
+ microVM from a snapshot, Firecracker writes a new identifier and injects a
+ notification to the guest. Linux
+ [uses this value](https://elixir.bootlin.com/linux/v5.18.19/source/drivers/virt/vmgenid.c#L77)
+ [as new randomness for its CSPRNG](https://elixir.bootlin.com/linux/v5.18.19/source/drivers/char/random.c#L908).
+ Quoting the random.c implementation of the kernel:
+
+ ```
+ /*
+  * Handle a new unique VM ID, which is unique, not secret, so we
+  * don't credit it, but we do immediately force a reseed after so
+  * that it's used by the crng posthaste.
+  */
+ ```
+
+ As a result, values returned by `getrandom()` and `/dev/(u)random` are distinct
+ in all VMs started from the same snapshot, **after** the kernel handles the
+ VMGenID notification. This leaves a race window between resuming vCPUs and Linux
+ CSPRNG getting successfully re-seeded. In Linux 6.8, we
+ [extended VMGenID](https://lore.kernel.org/lkml/[email protected]/)
+ to emit a uevent to user space when it handles the notification. User space can
+ poll this uevent to know when it is safe to use `getrandom()` et al., avoiding
+ the race condition.
+
+ Please note that Firecracker always enables VMGenID. In kernels earlier
+ than 5.18, where there is no VMGenID driver, the device will not have any effect
+ in the guest.
+
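
Waiting for the VMGenID uevent from user space could look roughly like the C
sketch below. It listens on the kernel uevent netlink group and scans each
message for a given entry. The function names are hypothetical, and the entry
`NEW_VMGENID=1` used in the usage note is an assumption about what the upstream
driver emits; please verify it against the `vmgenid.c` of the guest kernel in
use.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/netlink.h>

/*
 * Return 1 if the uevent payload (a sequence of NUL-terminated strings)
 * contains an entry exactly equal to `entry`, 0 otherwise.
 */
int uevent_has_entry(const char *buf, size_t len, const char *entry)
{
    size_t want = strlen(entry);
    size_t pos = 0;

    while (pos < len) {
        size_t n = strnlen(buf + pos, len - pos);
        if (n == want && memcmp(buf + pos, entry, n) == 0)
            return 1;
        pos += n + 1;   /* skip the string and its terminating NUL */
    }
    return 0;
}

/*
 * Block until a kernel uevent carrying `entry` arrives.
 * Returns 0 once it has been seen, -1 on socket errors.
 */
int wait_for_uevent(const char *entry)
{
    struct sockaddr_nl addr;
    char buf[4096];
    int fd = socket(AF_NETLINK, SOCK_DGRAM | SOCK_CLOEXEC,
                    NETLINK_KOBJECT_UEVENT);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.nl_family = AF_NETLINK;
    addr.nl_groups = 1;   /* multicast group 1: events sent by the kernel */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    for (;;) {
        ssize_t n = recv(fd, buf, sizeof(buf), 0);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted by a signal, retry */
            close(fd);
            return -1;
        }
        if (uevent_has_entry(buf, (size_t)n, entry)) {
            close(fd);
            return 0;
        }
    }
}
```

Under the assumption above, a caller would use `wait_for_uevent("NEW_VMGENID=1")`.
A listener that only binds its socket after the microVM has been resumed may
miss the event, so a real implementation would likely need to set up the socket
before the snapshot is taken and wait on it after resume.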
+ ### User space considerations
+
Init systems (such as `systemd` used by AL2 and other distros) might save a
random seed file after boot. For `systemd`, the path is
`/var/lib/systemd/random-seed`. Just to be on the safe side, any such file
@@ -121,8 +161,8 @@ alter the read result via bind mounting another file on top of
and should be sufficient for most cases.
- Use `virtio-rng`. When present, the guest kernel uses the device as an
additional source of entropy.
- - To be as safe as possible, the direct approach is to do the following (before
- customer code is resumed in the clone):
+ - On kernels before 5.18, to be as safe as possible, the direct approach is to
+ do the following (before customer code is resumed in the clone):
1. Open one of the special devices files (either `/dev/random` or
`/dev/urandom`). Take note that `RNDCLEARPOOL` no longer
[has any effect][7] on the entropy pool.
@@ -133,6 +173,13 @@ alter the read result via bind mounting another file on top of
1. Issue a `RNDRESEEDCRNG` ioctl call ([4.14][5], [5.10][6], (requires
`CAP_SYS_ADMIN`)) that specifically causes the `CSPRNG` to be reseeded from
the input pool.
+ - On kernels from 5.18 onwards, the CSPRNG will be automatically
+ reseeded when the guest kernel handles the VMGenID notification. To completely
+ avoid the race condition, users should follow the same steps as with kernels
+ \< 5.18.
+ - On kernels starting from 6.8, users can poll for the VMGenID uevent that the
+ driver sends when the CSPRNG is reseeded after handling the VMGenID
+ notification.

**Annex 1 contains the source code of a C program which implements the previous
three steps.** As soon as the guest kernel version switches to 4.19 (or higher),