-
Notifications
You must be signed in to change notification settings - Fork 17
Fpu isolation #31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
cxdong
wants to merge
15
commits into
intel-staging:pvVMCS-POC-v6.12
Choose a base branch
from
cxdong:fpu
base: pvVMCS-POC-v6.12
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Fpu isolation #31
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
The vcpu_after_set_cpuid PV interface will add a new cpuid entries to the pkvm hypervisor if it is allowed. In this case, the previous cpuid entries memory physical addresses will be returned back to the host. And such memory is freed with the size of the new cpuid entries, which is incorrect. To resolve this, the pkvm hypervisor returns such memory physical addresses via the pkvm memcache format so that the size will also be returned to the host. Fixes: 241210f ("pkvm: x86: Add vcpu_after_set_cpuid PV interface") Signed-off-by: Chuanxiao Dong <[email protected]>
To support the pkvm hypervisor doing the FPU isolation for the pVM, add the fpu_kernel_cfg and fpu_user_cfg symbols to the pkvm hypervisor which will be used by the pkvm hypervisor to initialize the guest fpstate. The fpu configs are initialized according to the corresponding host symbols before the deprivilege the host. Signed-off-by: Chuanxiao Dong <[email protected]>
Setup the xstate offset and size in the pkvm hypervisor. These could be used by the pkvm hypervisor to locate the FPU register offset in the fpstate regs memory. Signed-off-by: Chuanxiao Dong <[email protected]>
Add a percpu fpu to facilitate the FPU isolation. The percpu fpu is implemented by leveraging the pcpu_hot->current_task->thread.fpu and initialized with the initial fpstate, which is used to restore the FPU to the initial state to isolation the FPU from the host for the pVM. The permission is set to the maximum features to allow the guest to use all the supported features. Signed-off-by: Chuanxiao Dong <[email protected]>
The pkvm hypervisor needs to save/restore the FPU registers to/from the memory for the guest to support the FPU isolation. The fpstate memory is allocated by the host and send to the pkvm hypervisor via the vcpu_create PV interfaces, which will be donated to pkvm hypervisor for protection. It will be undoated when destroy the VM. For the pVM, as the fpstate memory will be used by the pkvm hypervisor to save/restore the pVM's FPU registers for the isolation purpose, its size is counted according to the guest fpu allocated by the host. For the npVM, this fpstate memory doesn't have to be used in the same as no FPU isolation required. Instead, the fpstate memory owned by the host can be used to save/restore the FPU registers for the npVM. So the size for this fpstate memory is just size of the structure fpstate except for the regs, which is large enough for the pkvm hypervisor to emulate the IA32_XFD msr. Signed-off-by: Chuanxiao Dong <[email protected]>
Initialize the guest fpu and mark it as not in use. The fpstate regs memory is initialized according to the supported fpu features. The xfeatures is set to the permitted default xfeatures. Signed-off-by: Chuanxiao Dong <[email protected]>
With expoing the FPU dynamica feature via the cpuid, the fpstate allocated when creating the vcpu may not be sufficient for the guest. As the pVM's FPU state is managed by the pkvm hypervisor while the npVM's FPU state is managed by the host, re-allocating the fpstate is only necessary for the pVM, and should be done before adding the new cpuid entries to the pkvm hypervisor. This is only permitted before the pVM has started running. The old fpstate physical addresses are returned back to the host via the pkvm memcache. Signed-off-by: Chuanxiao Dong <[email protected]>
Enable the guest xfd features if the vcpuid expose the feature to the guest. The fpstate will be reallocated for the pkvm hypervisor before the cpuid entries are ready. So enabling the xfd feature for the guest is just to update the xfeatures and xfd fields accordingly if the fpstate size is sufficient. Signed-off-by: Chuanxiao Dong <[email protected]>
Add the fpu_swap_kvm_fpstate() for the pkvm hypervisor to swap the FPU registers to guarantee the FPU isolation. The implementation is ported from the linux kernel source code with minor modifications that, only saving/restoring the FPU registers for the pVM, and wipe the FPU registers by restoring with the initial FPU statue before returning to the host. For the npVM, no FPU registers swapping but only updates the percpu fpu to indicate the current active fpstate. Signed-off-by: Chuanxiao Dong <[email protected]>
To support the FPU isolation for the pVM, mark the pVM's fpstate as confidential and do the FPU registers swapping around the vcpu run loop. Signed-off-by: Chuanxiao Dong <[email protected]>
Separate the vcpu run loop core to a dedicated function. No function change. Signed-off-by: Chuanxiao Dong <[email protected]>
This reverts commit f64d30c Signed-off-by: Chuanxiao Dong <[email protected]>
With the FPU swapping support in the pkvm hypervisor, the XFD and XFD_ERR MSR can be handled inside the pkvm hypervisor. Add code to do the XFD/XFD_ERR MSR switching, and handle the #NM exception inside the pkvm hypervisor. Signed-off-by: Chuanxiao Dong <[email protected]>
The host_pkru should be saved when load a vcpu so that the pkvm hypervisor can do the xsave state switching for running a guest. Fixes: 3729703 ("pkvm: vmx: Implement vcpu_run") Signed-off-by: Chuanxiao Dong <[email protected]>
Doing vcpu reset requires to clear the BNDREGS/BNDCSR components in the fpstate. Port the code fpstate_clear_xstate_component() from the linux kernel to locate the position in the fpstate and do the memory clearing. Signed-off-by: Chuanxiao Dong <[email protected]>
vineethrp
pushed a commit
to vineethrp/pKVM-IA
that referenced
this pull request
Oct 1, 2025
…er dereference commit 1bb3363da862e0464ec050eea2fb5472a36ad86b upstream. A malicious HID device with quirk APPLE_MAGIC_BACKLIGHT can trigger a NULL pointer dereference whilst the power feature-report is toggled and sent to the device in apple_magic_backlight_report_set(). The power feature-report is expected to have two data fields, but if the descriptor declares one field then accessing field[1] and dereferencing it in apple_magic_backlight_report_set() becomes invalid since field[1] will be NULL. An example of a minimal descriptor which can cause the crash is something like the following where the report with ID 3 (power report) only references a single 1-byte field. When hid core parses the descriptor it will encounter the final feature tag, allocate a hid_report (all members of field[] will be zeroed out), create field structure and populate it, increasing the maxfield to 1. The subsequent field[1] access and dereference causes the crash. Usage Page (Vendor Defined 0xFF00) Usage (0x0F) Collection (Application) Report ID (1) Usage (0x01) Logical Minimum (0) Logical Maximum (255) Report Size (8) Report Count (1) Feature (Data,Var,Abs) Usage (0x02) Logical Maximum (32767) Report Size (16) Report Count (1) Feature (Data,Var,Abs) Report ID (3) Usage (0x03) Logical Minimum (0) Logical Maximum (1) Report Size (8) Report Count (1) Feature (Data,Var,Abs) End Collection Here we see the KASAN splat when the kernel dereferences the NULL pointer and crashes: [ 15.164723] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000006: 0000 [#1] SMP KASAN NOPTI [ 15.165691] KASAN: null-ptr-deref in range [0x0000000000000030-0x0000000000000037] [ 15.165691] CPU: 0 UID: 0 PID: 10 Comm: kworker/0:1 Not tainted 6.15.0 intel-staging#31 PREEMPT(voluntary) [ 15.165691] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 [ 15.165691] RIP: 0010:apple_magic_backlight_report_set+0xbf/0x210 [ 15.165691] Call Trace: [ 15.165691] <TASK> [ 15.165691] apple_probe+0x571/0xa20 [ 15.165691] hid_device_probe+0x2e2/0x6f0 [ 15.165691] really_probe+0x1ca/0x5c0 [ 15.165691] __driver_probe_device+0x24f/0x310 [ 15.165691] driver_probe_device+0x4a/0xd0 [ 15.165691] __device_attach_driver+0x169/0x220 [ 15.165691] bus_for_each_drv+0x118/0x1b0 [ 15.165691] __device_attach+0x1d5/0x380 [ 15.165691] device_initial_probe+0x12/0x20 [ 15.165691] bus_probe_device+0x13d/0x180 [ 15.165691] device_add+0xd87/0x1510 [...] To fix this issue we should validate the number of fields that the backlight and power reports have and if they do not have the required number of fields then bail. Fixes: 394ba61 ("HID: apple: Add support for magic keyboard backlight on T2 Macs") Cc: [email protected] Signed-off-by: Qasim Ijaz <[email protected]> Reviewed-by: Orlando Chamberlain <[email protected]> Tested-by: Aditya Garg <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Benjamin Tissoires <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This serial is to add FPU isolation for the protected VM. The pkvm hypervisor will restore pVM's FPU registers before entering the guest, and save pVM's FPU registers after exiting the guest. Before returning to the host, the FPU registers will be restored to the initial state. The host reponses to save/restore its own FPU registers.
For the non-protected VM, the host does the FPU switching between the npVM and the host.