Conversation
looks good on staging:
hiroTamada left a comment:
Solid implementation of GPU-aware load balancing. The TTL-based caching with double-checked locking is well done, and the VRAM-based selection heuristic is a reasonable approach. One minor nit about the config wiring, but nothing blocking.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
```go
parentGPU := vfToParent[mdev.VFAddress]
if parentGPU == "" {
	continue
}
```
VFs without parent GPU have VRAM usage ignored
Medium Severity
In calculateGPUVRAMUsage, mdevs on VFs with empty ParentGPU are skipped (if parentGPU == "" { continue }), so their VRAM is never counted. However, in selectLeastLoadedVF, these same VFs ARE included in allGPUs and freeVFsByGPU for selection. This means VFs without a physfn symlink are grouped under an empty-string "GPU" that always appears to have 0 VRAM usage, making them preferentially selected even when they already have active mdevs. This could cause load imbalance.
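One way to address this is to apply the same guard on the selection side, so parentless VFs are never grouped under the empty-string key. A sketch, where `groupFreeVFsByGPU` and its parameters are assumptions based on the names in the report rather than the actual code:

```go
package main

import "fmt"

// groupFreeVFsByGPU buckets free VFs by their parent GPU address, skipping
// VFs whose parent is unknown (no physfn symlink). This mirrors the guard in
// calculateGPUVRAMUsage, so parentless VFs can no longer be grouped under ""
// and look like a permanently idle GPU.
func groupFreeVFsByGPU(freeVFs []string, vfToParent map[string]string) map[string][]string {
	freeVFsByGPU := make(map[string][]string)
	for _, vf := range freeVFs {
		parentGPU := vfToParent[vf]
		if parentGPU == "" {
			continue // same condition as the VRAM-accounting loop
		}
		freeVFsByGPU[parentGPU] = append(freeVFsByGPU[parentGPU], vf)
	}
	return freeVFsByGPU
}

func main() {
	// VF .5 has no physfn entry, so it must be excluded from selection.
	vfToParent := map[string]string{"0000:01:00.4": "0000:01:00.0"}
	grouped := groupFreeVFsByGPU([]string{"0000:01:00.4", "0000:01:00.5"}, vfToParent)
	fmt.Println(len(grouped), len(grouped["0000:01:00.0"])) // prints 1 1
}
```

An alternative fix is to keep parentless VFs selectable but count their VRAM under a sentinel key in both functions; either way, the two code paths must agree on how an empty `parentGPU` is treated.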
Note
Introduces VRAM-aware vGPU allocation with TTL-cached profile metadata and a configurable cache TTL.
- `devices/mdev.go`: adds TTL-based profile caching (`SetGPUProfileCacheTTL`, `getCachedProfiles`) and parses framebuffer sizes for profiles
- Adds `calculateGPUVRAMUsage` and `selectLeastLoadedVF`; `CreateMdev` now picks a VF from the least-loaded GPU with `available_instances` remaining
- Adds `GPU_PROFILE_CACHE_TTL` to config and wires it in `main.go` via `devices.SetGPUProfileCacheTTL`

Written by Cursor Bugbot for commit d7e7aaa.