Tavis Ormandy (taviso)
This document is a work in progress, documenting a behaviour under investigation.
The XGETBV instruction reads the contents of an internal control register. It is not a privileged instruction and is usually available to userspace. The contents are also exposed via the xstate_bv header in the XSAVE structure.
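As a minimal sketch of what reading this register from userspace might look like, the following uses GCC/Clang inline assembly and the compiler's `<cpuid.h>` helpers. The function names `xgetbv1_supported` and `xgetbv` are illustrative and are not taken from the testcase. The CPUID checks are included because XGETBV faults if the operating system has not enabled it, or if the CPU does not accept ECX=1.

```c
#include <cpuid.h>
#include <stdint.h>

/* Returns 1 if XGETBV with ECX=1 can be executed safely, 0 otherwise. */
int xgetbv1_supported(void)
{
    unsigned int eax, ebx, ecx, edx;

    /* CPUID.1:ECX.OSXSAVE[bit 27]: the OS has enabled XSAVE and XGETBV. */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx) || !(ecx & (1u << 27)))
        return 0;

    /* CPUID.(EAX=0DH,ECX=1):EAX[bit 2]: XGETBV accepts ECX=1. */
    if (!__get_cpuid_count(0x0d, 1, &eax, &ebx, &ecx, &edx))
        return 0;

    return (eax >> 2) & 1;
}

/*
 * Executes XGETBV with the given index and returns EDX:EAX.
 * Index 0 reads XCR0; index 1 reads XCR0 AND XINUSE.
 */
uint64_t xgetbv(uint32_t index)
{
    uint32_t lo, hi;

    __asm__ volatile("xgetbv" : "=a"(lo), "=d"(hi) : "c"(index));
    return ((uint64_t)hi << 32) | lo;
}
```

Because index 1 returns XCR0 masked with XINUSE, any bit set in `xgetbv(1)` should also be set in `xgetbv(0)`.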
The primary use of XGETBV is determining the XINUSE flags, which allows kernels and userthread implementations to determine what CPU state needs to be saved or restored on context switch. However, it has been observed that these flags appear to be non-deterministic on various Intel CPUs.
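To make the flags concrete, here is a small sketch that decodes the three lowest state-component bits, which have the same meaning in XCR0, XINUSE and xstate_bv. The enum and function names are hypothetical. Assuming the testcase prints the bitmap directly, the value 0000000002 that appears in the output in this document corresponds to bit 1, the SSE (XMM register) component.

```c
#include <stdint.h>

/* State-component bits shared by XCR0, XINUSE and xstate_bv. */
enum {
    XSTATE_X87 = 1u << 0, /* x87 FPU state */
    XSTATE_SSE = 1u << 1, /* XMM registers */
    XSTATE_AVX = 1u << 2, /* upper halves of the YMM registers */
};

/*
 * Returns nonzero if the given component is marked in use, meaning a
 * context switch would have to save and restore it.
 */
int xstate_in_use(uint64_t xinuse, uint64_t component)
{
    return (xinuse & component) != 0;
}
```

A scheduler-like caller could test each component in turn and skip the save for any component that is not in use.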
It is not clear what the consequences of this are, or whether it is security relevant.
We are not the first researchers to observe this non-determinism; the RR project have also noticed this behaviour. 1
We have found a reliable way to reproduce this issue. If we use an AVX instruction like VSQRTSS followed by VZEROALL to set and unset the XINUSE flag, we can observe fluctuations in the flags for no apparent reason.
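The source of xgetbv.c is not reproduced here, so the following is only a sketch of how such a loop might be written, not the actual testcase. It assumes x86-64 and GCC/Clang inline assembly, and the names `features_usable`, `toggle_vector_state` and `tests_until_change` are hypothetical. Each test executes VSQRTSS and then VZEROALL, rereads the flags, and stops as soon as they differ from the first reading.

```c
#include <cpuid.h>
#include <stdint.h>

/* Executes XGETBV; index 0 reads XCR0, index 1 reads XCR0 AND XINUSE. */
uint64_t xgetbv(uint32_t index)
{
    uint32_t lo, hi;

    __asm__ volatile("xgetbv" : "=a"(lo), "=d"(hi) : "c"(index));
    return ((uint64_t)hi << 32) | lo;
}

/* Returns 1 if AVX and XGETBV with ECX=1 are both usable. */
int features_usable(void)
{
    unsigned int eax, ebx, ecx, edx;
    const unsigned int need = (1u << 27) | (1u << 28); /* OSXSAVE, AVX */

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx) || (ecx & need) != need)
        return 0;
    if (!__get_cpuid_count(0x0d, 1, &eax, &ebx, &ecx, &edx) ||
        !(eax & (1u << 2)))
        return 0;
    /* XCR0 bits 1 and 2: the OS manages SSE and AVX state. */
    return (xgetbv(0) & 0x6) == 0x6;
}

/* Touches vector state with VSQRTSS, then resets it with VZEROALL. */
void toggle_vector_state(void)
{
    __asm__ volatile(
        "vsqrtss %%xmm0, %%xmm0, %%xmm0\n\t"
        "vzeroall"
        :
        :
        : "xmm0", "xmm1", "xmm2", "xmm3", "xmm4", "xmm5", "xmm6", "xmm7",
          "xmm8", "xmm9", "xmm10", "xmm11", "xmm12", "xmm13", "xmm14",
          "xmm15");
}

/*
 * Runs up to max_tests tests. Returns the number of tests executed when
 * the flags first differed from the initial reading, 0 if they never
 * differed, or -1 if the required CPU features are missing.
 */
long long tests_until_change(long long max_tests)
{
    if (!features_usable())
        return -1;

    toggle_vector_state();
    uint64_t first = xgetbv(1);

    for (long long n = 1; n <= max_tests; n++) {
        toggle_vector_state();
        if (xgetbv(1) != first)
            return n;
    }
    return 0;
}
```

Since every test ends with VZEROALL, a deterministic implementation would be expected to return 0 regardless of how large `max_tests` is.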
To reproduce this, compile the testcase with -mavx:
$ cc -mavx xgetbv.c -o xgetbv
If you run the testcase on an affected machine, you should see non-deterministic results:
$ ./xgetbv
first execution, our flags: 0000000000
After 172775235 tests, our XINUSE was 0000000002 vs 0000000000
$ ./xgetbv
first execution, our flags: 0000000000
After 5620219 tests, our XINUSE was 0000000002 vs 0000000000
$ ./xgetbv
first execution, our flags: 0000000000
After 700881 tests, our XINUSE was 0000000002 vs 0000000000
$ ./xgetbv
first execution, our flags: 0000000000
After 169544692 tests, our XINUSE was 0000000002 vs 0000000000
$ ./xgetbv
first execution, our flags: 0000000000
After 113335157 tests, our XINUSE was 0000000002 vs 0000000000
If you also artificially induce context switching with another process, the average number of tests required drops:
$ cc -mavx hammer.c -o hammer
$ ./hammer &
[1] 2775472
$ ./xgetbv
first execution, our flags: 0000000000
After 722148 tests, our XINUSE was 0000000002 vs 0000000000
$ ./xgetbv
first execution, our flags: 0000000000
After 705312 tests, our XINUSE was 0000000002 vs 0000000000
$ ./xgetbv
first execution, our flags: 0000000000
After 473381 tests, our XINUSE was 0000000002 vs 0000000000
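The source of hammer.c is also not shown, so what follows is a purely hypothetical stand-in rather than the real program. It assumes that the only property that matters is whether the busy loop touches AVX state, and it selects the code path with the `__AVX__` macro that GCC and Clang define when building with -mavx. The function name `hammer` and its parameter are illustrative.

```c
#include <stddef.h>
#ifdef __AVX__
#include <immintrin.h>
#endif

/*
 * Hypothetical busy loop: multiplies an accumulator by a constant
 * `rounds` times and returns the result. Built with -mavx it uses
 * 256-bit YMM arithmetic; built without, it uses scalar doubles.
 */
double hammer(size_t rounds)
{
#ifdef __AVX__
    __m256d acc = _mm256_set1_pd(1.0);
    const __m256d step = _mm256_set1_pd(1.0000001);
    double out[4];

    for (size_t i = 0; i < rounds; i++)
        acc = _mm256_mul_pd(acc, step);
    _mm256_storeu_pd(out, acc);
    return out[0];
#else
    double acc = 1.0;

    for (size_t i = 0; i < rounds; i++)
        acc *= 1.0000001;
    return acc;
#endif
}
```

A driver would call this in an endless loop from `main` while the testcase runs in another process.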
If the other process does not use AVX, then the average number of tests required does not reduce. This implies there may be some way of determining what other processes scheduled on the same core are doing.
$ cc hammer.c -o hammer-noavx
# Note: remember to stop any existing hammer process
$ ./xgetbv
first execution, our flags: 0000000000
After 9348279 tests, our XINUSE was 0000000002 vs 0000000000
Note that the number of tests required is an order of magnitude higher than when the AVX hammer process is running, and this difference appears to be reliable.
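The comparison can be checked directly from the runs shown above. This sketch averages the test counts for each condition; the array and function names are illustrative, and the figures are copied from the output in this document.

```c
#include <stddef.h>

/* Test counts copied from the runs shown in this document. */
const unsigned long long idle_runs[] = {
    172775235ULL, 5620219ULL, 700881ULL, 169544692ULL, 113335157ULL,
};
const unsigned long long avx_hammer_runs[] = {
    722148ULL, 705312ULL, 473381ULL,
};
const unsigned long long noavx_hammer_runs[] = {
    9348279ULL,
};

/* Returns the arithmetic mean of n test counts, or 0.0 if n is zero. */
double mean_tests(const unsigned long long *counts, size_t n)
{
    double sum = 0.0;

    for (size_t i = 0; i < n; i++)
        sum += (double)counts[i];
    return n ? sum / (double)n : 0.0;
}
```

With these figures the means come to roughly 92.4 million tests with no hammer, 0.63 million with the AVX hammer, and 9.3 million with the non-AVX hammer, a ratio of about 15 between the last two. The non-AVX figure is a single run and the idle runs vary widely, so these averages are indicative only.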
It's not clear what the implications of this are. It may be possible to influence other processes, or determine what other processes are doing.
We have only observed this on Intel CPUs; no AMD processors appear to exhibit this behaviour.
We have confirmed this on the following CPUs:
cpu family : 6
model : 85
model name : Intel(R) Xeon(R) CPU @ 2.00GHz
stepping : 3
microcode : 0x1
cpu family : 6
model : 85
model name : Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz
stepping : 4
microcode : 0x2006e05
cpu family : 6
model : 79
model name : Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz
stepping : 1
microcode : 0xb000040
cpu family : 6
model : 140
model name : 11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz
stepping : 1
microcode : 0xa6
Further research is required to determine if this behaviour has any security consequences.