feat: add macOS support via Virtualization.framework (vz) #82
Move Linux-specific resource detection (CPU, memory, disk, network) and device management (discovery, mdev, vfio) into _linux.go files. Add stub _darwin.go files that return empty/unsupported results for macOS.

This is a pure refactoring with no functional changes on Linux. Prepares the codebase for macOS support where these Linux-specific features (cgroups, sysfs, VFIO) are not available.

Co-authored-by: Cursor <[email protected]>
Move Linux bridge/TAP networking into bridge_linux.go. Add bridge_darwin.go stub since macOS vz uses built-in NAT networking. Extract shared IP allocation logic to ip.go.

Move VMM binary detection (cloud-hypervisor, qemu paths) into binaries_linux.go. Add binaries_darwin.go that returns empty paths since vz is in-process.

No functional changes on Linux.

Co-authored-by: Cursor <[email protected]>
Move ingress binary embedding into platform-specific files. Update build tags on architecture-specific files to also include OS constraint.

Replace checkKVMAccess() with platform-agnostic checkHypervisorAccess():
- Linux: checks /dev/kvm access (existing behavior)
- macOS: verifies ARM64 arch and Virtualization.framework availability

No functional changes on Linux.

Co-authored-by: Cursor <[email protected]>
…ions

Add GetVsockDialer() to instance manager interface. This abstraction handles the difference between:
- Linux: socket-based vsock (AF_VSOCK or Unix socket proxy)
- macOS vz: in-process vsock via VirtualMachine object

Update API handlers (exec, cp, instances) and build manager to use GetVsockDialer() instead of directly creating vsock connections. Add DialVsock() method to VsockDialer interface for explicit dialing.

Co-authored-by: Cursor <[email protected]>
Implement Hypervisor and VMStarter interfaces using github.com/Code-Hex/vz/v3 library for Apple's Virtualization.framework.

Key differences from Linux hypervisors:
- In-process: VMs run within hypeman process (no separate PID)
- NAT networking: Uses vz built-in NAT (192.168.64.0/24)
- Direct vsock: Connects via VirtualMachine object, not socket files
- Snapshot support: Available on macOS 14+ ARM64

Registers vz starter on macOS via init() in hypervisor_darwin.go. Linux hypervisor_linux.go is a no-op placeholder.

Co-authored-by: Cursor <[email protected]>
Guest init changes:
- Add hvc0 serial console support (vz uses hvc0, not ttyS0)
- Prioritize /dev/hvc0 for console output in logger and mount

Binary embedding:
- Add darwin-specific embed files for cross-compiled linux/arm64 binaries
- Guest init and agent binaries are embedded when building on macOS

OCI image handling:
- Add vmPlatform() to return linux/arm64 for VM images regardless of host
- Fixes image pull on macOS which would otherwise request darwin/arm64

Instance lifecycle:
- Track active hypervisors for vz (needed for in-process VM references)
- Handle vz-specific cleanup in delete (no PID to kill)
- Support vz in instance queries

Co-authored-by: Cursor <[email protected]>
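The vmPlatform() idea above can be illustrated with a small sketch: the guest OS is pinned to linux while the architecture follows the host, so a darwin/arm64 host requests linux/arm64 images. The Platform type and the hostArch parameter are hypothetical; the PR's actual signature is not shown in this conversation.

```go
package main

import (
	"fmt"
	"runtime"
)

// Platform is a hypothetical stand-in for an OCI platform selector.
type Platform struct {
	OS           string
	Architecture string
}

// String renders the platform in the usual "os/arch" form.
func (p Platform) String() string { return p.OS + "/" + p.Architecture }

// vmPlatform always selects a Linux guest image, whatever the host OS is,
// and keeps the host CPU architecture.
func vmPlatform(hostArch string) Platform {
	return Platform{OS: "linux", Architecture: hostArch}
}

func main() {
	fmt.Println("host: ", runtime.GOOS+"/"+runtime.GOARCH)
	fmt.Println("guest:", vmPlatform(runtime.GOARCH))
}
```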
Build system:
- Add macOS targets to Makefile (build-darwin, run, sign)
- Add .air.darwin.toml for live reload on macOS
- Add vz.entitlements for Virtualization.framework code signing
- Add .env.darwin.example with macOS-specific configuration

Documentation:
- Update DEVELOPMENT.md with macOS setup instructions
- Update README.md to mention macOS support
- Update lib/hypervisor/README.md with vz implementation details
- Update lib/instances/README.md for multi-hypervisor support
- Update lib/network/README.md with platform comparison

Co-authored-by: Cursor <[email protected]>
Co-authored-by: Cursor <[email protected]>
    })

    return nil
}
Multiple networks overwritten, only last one kept
High Severity
When multiple networks are configured, configureNetwork loops through each network and calls addNATNetwork, but each call to addNATNetwork invokes SetNetworkDevicesVirtualMachineConfiguration with a single-element array, overwriting any previously configured networks. Only the last network in the list will actually be attached to the VM. The fix would be to build the full array of network devices first and call SetNetworkDevicesVirtualMachineConfiguration once with all devices, similar to how configureStorage correctly accumulates disks into storageDevices before making a single call.
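The suggested fix can be sketched as follows. To keep the example self-contained, NetworkDevice, VMConfig, and NetworkConfig are hypothetical stand-ins for the vz types, and SetNetworkDevices mimics a setter that replaces the whole list, as SetNetworkDevicesVirtualMachineConfiguration is described to do in the comment above.

```go
package main

import "fmt"

// NetworkDevice stands in for *vz.VirtioNetworkDeviceConfiguration (hypothetical).
type NetworkDevice struct{ MAC string }

// VMConfig stands in for *vz.VirtualMachineConfiguration (hypothetical).
type VMConfig struct{ devices []*NetworkDevice }

// SetNetworkDevices replaces the entire device list, like the vz setter.
func (c *VMConfig) SetNetworkDevices(devs []*NetworkDevice) { c.devices = devs }

// NetworkConfig is a hypothetical per-interface request.
type NetworkConfig struct{ MAC string }

// newNATDevice builds one device without touching the VM configuration.
func newNATDevice(n NetworkConfig) (*NetworkDevice, error) {
	if n.MAC == "" {
		return nil, fmt.Errorf("network config has no MAC address")
	}
	return &NetworkDevice{MAC: n.MAC}, nil
}

// configureNetwork accumulates every device first and calls the setter once,
// so earlier interfaces are not overwritten by later ones.
func configureNetwork(cfg *VMConfig, networks []NetworkConfig) error {
	devices := make([]*NetworkDevice, 0, len(networks))
	for _, n := range networks {
		dev, err := newNATDevice(n)
		if err != nil {
			return fmt.Errorf("configure network: %w", err)
		}
		devices = append(devices, dev)
	}
	cfg.SetNetworkDevices(devices)
	return nil
}

func main() {
	cfg := &VMConfig{}
	err := configureNetwork(cfg, []NetworkConfig{
		{MAC: "02:00:00:00:00:01"},
		{MAC: "02:00:00:00:00:02"},
	})
	fmt.Println("devices attached:", len(cfg.devices), "error:", err)
}
```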
Changes the vz hypervisor from in-process to subprocess model, allowing VMs to survive hypeman restarts. Mirrors the cloud-hypervisor architecture.

Key changes:
- Add cmd/vz-shim binary that hosts vz VMs in a subprocess
- Shim exposes HTTP API on Unix socket for VM control (matching CH pattern)
- Shim exposes vsock proxy on separate Unix socket using CH protocol
- Update vz starter to spawn shim subprocess instead of in-process VM
- Add vz.Client implementing Hypervisor interface via HTTP to shim
- Update VsockDialer to use Unix socket proxy instead of in-process VM
- Add hypervisor.ClientFactory for uniform hypervisor client creation
- Remove activeHypervisors tracking (no longer needed)
- Simplify vsock_darwin.go (vz now uses same socket pattern as other hypervisors)
- Update Makefile to build and sign vz-shim binary

Co-authored-by: Cursor <[email protected]>
vmConfig.SetNetworkDevicesVirtualMachineConfiguration([]*vz.VirtioNetworkDeviceConfiguration{
    networkConfig,
})
Multiple network interfaces ignored due to overwrite
High Severity
When configuring multiple network interfaces, configureNetwork loops through each NetworkConfig and calls addNATNetwork, but addNATNetwork calls SetNetworkDevicesVirtualMachineConfiguration with a single-element slice. This replaces the entire network device list on each iteration rather than appending. Only the last network interface will be attached to the VM; all previous networks are silently discarded.
    state = hypervisor.StateShutdown
default:
    state = hypervisor.StateRunning
}
VM error states incorrectly reported as running
Medium Severity
The GetVMInfo state mapping falls back to StateRunning for unrecognized states in the default case. The vz shim server can return states like "Error", "Starting", "Pausing", "Resuming", "Stopping", and "Unknown" which the client doesn't handle, causing them all to be reported as "Running". Most critically, a VM in "Error" state would be incorrectly reported as running, potentially causing operations to be attempted on a failed VM or preventing proper error detection.
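One way to address this is an explicit mapping whose default is never "running". The sketch below is an assumption-heavy illustration: the State type mirrors the idea of hypervisor.State, the StateTransitioning, StateFailed, and StateUnknown values are hypothetical, and the shim strings "Running", "Paused", and "Stopped" are guessed counterparts to the states named in the comment.

```go
package main

import "fmt"

// State is a local stand-in for the hypervisor package's state type.
type State string

const (
	StateRunning       State = "Running"
	StatePaused        State = "Paused"
	StateShutdown      State = "Shutdown"
	StateTransitioning State = "Transitioning" // hypothetical
	StateFailed        State = "Failed"        // hypothetical
	StateUnknown       State = "Unknown"       // hypothetical
)

// mapShimState converts a shim state string into a State.
// Unrecognized input maps to StateUnknown instead of StateRunning.
func mapShimState(s string) State {
	switch s {
	case "Running":
		return StateRunning
	case "Paused":
		return StatePaused
	case "Stopped":
		return StateShutdown
	case "Starting", "Pausing", "Resuming", "Stopping":
		return StateTransitioning
	case "Error":
		return StateFailed
	default:
		return StateUnknown
	}
}

func main() {
	for _, s := range []string{"Running", "Error", "Stopping", "bogus"} {
		fmt.Printf("%-8s -> %s\n", s, mapShimState(s))
	}
}
```

Callers can then refuse operations on anything other than StateRunning or StatePaused rather than acting on a failed VM.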
…ation

Add snapshot save/restore infrastructure to vz-shim:
- Snapshot endpoint in shim server (vm.snapshot)
- RestoreVM implementation in starter (loads config from metadata.json)
- Snapshot method in client (adapts directory path to file path)

Document Virtualization.framework limitation:
- Linux guest VMs cannot be reliably saved/restored
- Only macOS guests support this functionality
- This is an undocumented Apple limitation confirmed by Tart and UTM projects
- References: Tart #1177, #796; UTM #6654

The infrastructure is in place for potential future macOS guest support while correctly disabling snapshot capability for Linux guests.

Also improves MAC address handling and error logging in vm.go.

Co-authored-by: Cursor <[email protected]>
Cursor Bugbot has reviewed your changes and found 1 potential issue.
    _ = parts
    return filepath.Base(filepath.Dir(filepath.Dir(path))) == name ||
        filepath.Base(filepath.Dir(path)) == name
}
Image matching fails for registry-qualified names
Medium Severity
The containsImageName function extracts everything before : as the image name, then compares it against path components using filepath.Base. For fully qualified image names like docker.io/library/alpine:3.20, this extracts docker.io/library/alpine as the name, but compares against alpine from the path. These won't match, causing RestoreVM to fail when the instance metadata contains a registry-qualified image reference.
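A possible fix is to reduce the reference to its final repository component before comparing. The sketch below assumes that the last path segment of the repository is what appears in the on-disk layout; imageBaseName and pathContainsImage are hypothetical names, and the example path is invented for illustration.

```go
package main

import (
	"fmt"
	"path"
	"path/filepath"
	"strings"
)

// imageBaseName returns the last repository component of an image reference,
// dropping any registry host, namespace, tag, or digest.
func imageBaseName(ref string) string {
	// Drop a digest suffix such as "@sha256:...".
	if i := strings.Index(ref, "@"); i >= 0 {
		ref = ref[:i]
	}
	// A tag colon can only appear after the last slash; an earlier colon
	// belongs to a registry port such as "localhost:5000".
	lastSlash := strings.LastIndex(ref, "/")
	if i := strings.LastIndex(ref, ":"); i > lastSlash {
		ref = ref[:i]
	}
	return path.Base(ref)
}

// pathContainsImage mirrors the comparison in the reviewed snippet: it checks
// the parent and grandparent directory names of p against the image name.
func pathContainsImage(p, ref string) bool {
	name := imageBaseName(ref)
	return filepath.Base(filepath.Dir(filepath.Dir(p))) == name ||
		filepath.Base(filepath.Dir(p)) == name
}

func main() {
	ref := "docker.io/library/alpine:3.20"
	p := "/data/images/alpine/3.20/rootfs.ext4" // hypothetical layout
	fmt.Println(imageBaseName(ref), pathContainsImage(p, ref))
}
```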


Summary
Adds macOS support to hypeman using Apple's Virtualization.framework via the Code-Hex/vz library.
Key changes:
- _linux.go files with stub _darwin.go counterparts
- lib/hypervisor/vz/ package implementing Hypervisor and VMStarter interfaces
- cmd/vz-shim/ binary that hosts VMs in a separate process, allowing VMs to survive hypeman restarts (mirrors cloud-hypervisor architecture)
- Vsock proxy handshake (CONNECT {port}\n → OK {port}\n) over Unix socket
- hypervisor.NewClient() for uniform hypervisor client creation across all VMM types

Architecture (vz-shim subprocess model):
Platform comparison:
Snapshot limitation
Virtualization.framework does not support save/restore for Linux guest VMs - only macOS guests work. This is an undocumented Apple limitation confirmed by other projects:
"You can only suspend macOS VMs"

The snapshot infrastructure is implemented in vz-shim for potential future macOS guest support, but the capability is correctly disabled for Linux guests.
vz-shim API:
The shim exposes a Cloud Hypervisor-compatible HTTP API on Unix socket:
- GET /api/v1/vm.info - VM state and configuration
- PUT /api/v1/vm.pause - Pause VM
- PUT /api/v1/vm.resume - Resume VM
- PUT /api/v1/vm.shutdown - Graceful shutdown
- PUT /api/v1/vm.power-button - ACPI power button
- PUT /api/v1/vm.snapshot - Save VM state (infrastructure for macOS guests)
- GET /api/v1/vmm.ping - Health check
- PUT /api/v1/vmm.shutdown - Terminate shim

Vsock proxy uses same text-based handshake as Cloud Hypervisor:
- CONNECT {port}\n
- OK {port}\n

CI Considerations
Current CI uses self-hosted Linux runners with KVM. For macOS:
- GitHub-hosted runners (macos-14) support Apple Silicon but may lack virtualization entitlements
- Use macos-14 to verify compilation, defer VM tests to self-hosted

Example addition to test.yml:
Test plan
- hypeman resources - verified resource detection
- hypeman pull - verified linux/arm64 image pull
- hypeman run - verified VM creation and boot
- hypeman exec - verified command execution in VM
- hypeman ps - verified instance listing
- hypeman build - verified Dockerfile→VM image build
- hypeman ingress - verified external access to VM services
- hypeman rm - verified instance cleanup