Can't issue Smart Reboot due to PCI Passthrough #8232

Open
cloudrootab opened this issue Jan 8, 2025 · 1 comment


@cloudrootab
Contributor

Are you using XOA or XO from the sources?

XO from the sources

Which release channel?

None

Provide your commit number

fd9c9

Describe the bug

Raising a bug report, as per Olivier's request in this thread: https://xcp-ng.org/forum/topic/10232/feature-request-disable-smart-reboot-if-guests-have-pci-passthru

I have two GPUs attached through PCI passthrough. With these attached, Smart Reboot cannot complete: the operation is aborted and the VM suspensions are reverted. I then get the error message shown below.

GPU0: AMD RX6400 (attached through GUI)
GPU1: Nvidia GTX1060 (attached through CLI as per compute-docs)

Error message

host.restart
{
  "force": false,
  "suspendResidentVms": true,
  "bypassBlockedSuspend": false,
  "bypassCurrentVmCheck": false,
  "id": "3c08401c-446b-445d-9ea2-24393beae8e4"
}
{
  "errors": [
    {
      "code": "VM_HAS_PCI_ATTACHED",
      "params": [
        "OpaqueRef:ad56d720-83ff-4a57-dc26-51841905fcb9"
      ],
      "task": {
        "uuid": "fbb8b196-88e9-bc18-0d1f-06143897380e",
        "name_label": "Async.VM.suspend",
        "name_description": "",
        "allowed_operations": [],
        "current_operations": {},
        "created": "20250108T16:55:30Z",
        "finished": "20250108T16:55:30Z",
        "status": "failure",
        "resident_on": "OpaqueRef:8b1095b3-beee-1d7f-93f2-bf5a88896299",
        "progress": 1,
        "type": "<none/>",
        "result": "",
        "error_info": [
          "VM_HAS_PCI_ATTACHED",
          "OpaqueRef:ad56d720-83ff-4a57-dc26-51841905fcb9"
        ],
        "other_config": {},
        "subtask_of": "OpaqueRef:NULL",
        "subtasks": [],
        "backtrace": "(((process xapi)(filename ocaml/xapi/xapi_vm_lifecycle.ml)(line 742))((process xapi)(filename ocaml/xapi/xapi_vm_helpers.ml)(line 1653))((process xapi)(filename ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml)(line 39))((process xapi)(filename ocaml/xapi/helpers.ml)(line 1662))((process xapi)(filename ocaml/xapi/xapi_vm_helpers.ml)(line 1652))((process xapi)(filename ocaml/xapi/message_forwarding.ml)(line 2241))((process xapi)(filename ocaml/xapi/rbac.ml)(line 191))((process xapi)(filename ocaml/xapi/rbac.ml)(line 200))((process xapi)(filename ocaml/xapi/server_helpers.ml)(line 75)))"
      }
    },
    {
      "code": "VM_HAS_PCI_ATTACHED",
      "params": [
        "OpaqueRef:ef20a143-4ca5-b293-c111-a6905c49394c"
      ],
      "task": {
        "uuid": "d75716e1-ec5f-40af-0e05-d496c0379e96",
        "name_label": "Async.VM.suspend",
        "name_description": "",
        "allowed_operations": [],
        "current_operations": {},
        "created": "20250108T16:55:30Z",
        "finished": "20250108T16:55:30Z",
        "status": "failure",
        "resident_on": "OpaqueRef:8b1095b3-beee-1d7f-93f2-bf5a88896299",
        "progress": 1,
        "type": "<none/>",
        "result": "",
        "error_info": [
          "VM_HAS_PCI_ATTACHED",
          "OpaqueRef:ef20a143-4ca5-b293-c111-a6905c49394c"
        ],
        "other_config": {},
        "subtask_of": "OpaqueRef:NULL",
        "subtasks": [],
        "backtrace": "(((process xapi)(filename ocaml/xapi/xapi_vm_lifecycle.ml)(line 742))((process xapi)(filename ocaml/xapi/xapi_vm_helpers.ml)(line 1653))((process xapi)(filename ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml)(line 39))((process xapi)(filename ocaml/xapi/helpers.ml)(line 1662))((process xapi)(filename ocaml/xapi/xapi_vm_helpers.ml)(line 1652))((process xapi)(filename ocaml/xapi/message_forwarding.ml)(line 2241))((process xapi)(filename ocaml/xapi/rbac.ml)(line 191))((process xapi)(filename ocaml/xapi/rbac.ml)(line 200))((process xapi)(filename ocaml/xapi/server_helpers.ml)(line 75)))"
      }
    }
  ],
  "message": "",
  "name": "Error",
  "stack": "Error: 
    at next (/opt/xen-orchestra/@vates/async-each/index.js:83:24)
    at onFulfilled (/opt/xen-orchestra/@vates/async-each/index.js:56:7)
    at onFulfilledWrapper (/opt/xen-orchestra/@vates/async-each/index.js:58:41)"
}
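
For triage, the payload above can be reduced to the list of VMs that blocked the suspend. This is a minimal TypeScript sketch, not Xen Orchestra code: `SuspendError` and `blockedVmRefs` are hypothetical names, the shape is inferred only from the fields visible in the error above (`errors[].code`, `errors[].params`), and it assumes the first entry in `params` identifies the VM.

```typescript
// Hypothetical shape, inferred only from the error payload shown above.
interface SuspendError {
  code: string;
  params: string[];
}

// Collect the refs of VMs whose suspend failed with the given error code.
// Assumption: the first param of each error is the ref of the affected VM.
function blockedVmRefs(
  errors: SuspendError[],
  code: string = "VM_HAS_PCI_ATTACHED"
): string[] {
  return errors
    .filter((e) => e.code === code)
    .map((e) => e.params[0])
    .filter((ref): ref is string => ref !== undefined);
}

// The two entries from the report, trimmed to the fields used here.
const refs = blockedVmRefs([
  { code: "VM_HAS_PCI_ATTACHED", params: ["OpaqueRef:ad56d720-83ff-4a57-dc26-51841905fcb9"] },
  { code: "VM_HAS_PCI_ATTACHED", params: ["OpaqueRef:ef20a143-4ca5-b293-c111-a6905c49394c"] },
]);
console.log(refs);
```

In this report both entries carry the same code, so both refs are returned, one per GPU-backed VM.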

To reproduce

Have GPUs attached through PCI passthrough (I have two).
Issue a Smart Reboot from Host/Advanced.

Expected behavior

If a Smart Reboot cannot be performed because PCI devices are attached, the Smart Reboot button should be disabled altogether, preferably with a note that makes it clear why the button is disabled.
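
The requested behavior could be sketched roughly as follows. This is only an illustration in TypeScript under stated assumptions, not how Xen Orchestra implements or would implement it: `VmSummary`, its `hasPciAttached` flag, and `smartRebootState` are hypothetical names that do not come from XO or XAPI.

```typescript
// Hypothetical summary of a VM resident on the host.
// `hasPciAttached` is an assumed flag, not a real XO or XAPI field.
interface VmSummary {
  name: string;
  hasPciAttached: boolean;
}

// What the UI would need: whether to enable the button, and why not if disabled.
interface SmartRebootState {
  enabled: boolean;
  reason?: string;
}

// Disable Smart Reboot when any resident VM has a PCI device attached,
// since such a VM cannot be suspended (see VM_HAS_PCI_ATTACHED above).
function smartRebootState(residentVms: VmSummary[]): SmartRebootState {
  const blocking = residentVms.filter((vm) => vm.hasPciAttached);
  if (blocking.length === 0) {
    return { enabled: true };
  }
  const names = blocking.map((vm) => vm.name).join(", ");
  return {
    enabled: false,
    reason: `Smart Reboot is unavailable: PCI devices are attached to ${names}`,
  };
}

const state = smartRebootState([
  { name: "vm-amd", hasPciAttached: true },
  { name: "vm-plain", hasPciAttached: false },
]);
console.log(state.enabled, state.reason);
```

Returning the reason alongside the flag lets the same check drive both the disabled button and the explanatory note.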

Screenshots

No response

Node

20.18.1

Hypervisor

XCP-NG 8.3

Additional context

No response

@olivierlambert
Member

Thanks for the report!
