Skip to content

Conversation

@SamlRx
Copy link

@SamlRx SamlRx commented May 11, 2025

This PR addresses the issue described in #170.

Previously, the code iterated over /sys/class/hwmon and selected the first AMD GPU it encountered, which could mistakenly identify a discrete GPU (dGPU) instead of the integrated GPU (iGPU).

This patch improves detection by resolving the full path of each hwmon device and extracting its associated PCI address. We then use lspci to check whether the device matches known AMD dGPU patterns. Only if the device is not a known dGPU do we consider it a valid iGPU candidate.

This should ensure more reliable and deterministic selection of the iGPU on systems with both AMD integrated and discrete GPUs.

I have also moved the find_intel_igpu function to utils.py for consistency.

@SamlRx
Copy link
Author

SamlRx commented May 11, 2025

Should I bump version as part of this PR?

@antheas
Copy link
Contributor

antheas commented May 11, 2025

No. Can you simplify the PR? The moves make it hard to see what changed

@SamlRx
Copy link
Author

SamlRx commented May 11, 2025

Yeah sure

@antheas
Copy link
Contributor

antheas commented May 11, 2025

Checking for local CPUs was meant to act as a dGPU check, why didn't it work?

@SamlRx
Copy link
Author

SamlRx commented May 11, 2025

Checking for local CPUs was meant to act as a dGPU check, why didn't it work?

I can double check in details tomorrow. But as far as I remember I had the file local_cpulist in both directories.

I fixed the PR it should be more readable.

@antheas
Copy link
Contributor

antheas commented May 11, 2025

Since you have a dGPU, you can see the differences between them. Is there something simpler you can check for that does not require quirking per generation?

@SamlRx
Copy link
Author

SamlRx commented May 12, 2025

I can double check in details tomorrow. But as far as I remember I had the file local_cpulist in both directories.

Yeah I confirm that I have a local_cpulist in both directories,.

Is there something simpler you can check for that does not require quirking per generation?

As far as I can see nvtop is able to detect if a GPU is integrated or not. Are you aware of a lib similar to libdrm or dlsym for Python?

@SamlRx
Copy link
Author

SamlRx commented May 12, 2025

I have been digging around and I think I can use vulkaninfo to do so, that would mean the script for AMD GPUs will admit vulkaninfo is installed:

========
GPU0:
	apiVersion         = 1.4.305
	driverVersion      = 25.0.5
	vendorID           = 0x1002
	deviceID           = 0x731f
	deviceType         = PHYSICAL_DEVICE_TYPE_DISCRETE_GPU
	deviceName         = AMD Radeon RX 5700 XT (RADV NAVI10)
	driverID           = DRIVER_ID_MESA_RADV
	driverName         = radv
	driverInfo         = Mesa 25.0.5
	conformanceVersion = 1.4.0.0
	deviceUUID         = 00000000-0800-0000-0000-000000000000
	driverUUID         = 414d442d-4d45-5341-2d44-525600000000
GPU1:
	apiVersion         = 1.4.305
	driverVersion      = 25.0.5
	vendorID           = 0x1002
	deviceID           = 0x15bf
	deviceType         = PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU
	deviceName         = AMD Radeon Graphics (RADV PHOENIX)
	driverID           = DRIVER_ID_MESA_RADV
	driverName         = radv
	driverInfo         = Mesa 25.0.5
	conformanceVersion = 1.4.0.0
	deviceUUID         = 00000000-6400-0000-0000-000000000000
	driverUUID         = 414d442d-4d45-5341-2d44-525600000000

@SamlRx
Copy link
Author

SamlRx commented May 12, 2025

I have modified the PR with the usage of vulkaninfo

@SamlRx
Copy link
Author

SamlRx commented May 13, 2025

If you are not confortable merging this, I can still work on it. Could you please share if something is bothering you? I can also add unit tests if this is something that could help.

@antheas
Copy link
Contributor

antheas commented May 13, 2025

I have been busy with work.

The current problem I have with this as 8 see it is that the functions where you do the string manipulation with ":" and the fact you hardcoded CPU generations seem fragile

If you can instead switch to a static check similar to local CPUs, I will merge this. You can look in the source code of e.g. nvtop for this

@SamlRx
Copy link
Author

SamlRx commented May 13, 2025

Unfortunately, translating the nvtop part using Cython was out my reach. However, I believe it's possible to achieve something similar using this library: https://github.com/mark9064/pyamdgpuinfo, by extending it to include the AMDGPU_VRAM_TYPE_INTEGRATED information.

As a result, I ended up reverting most of my earlier changes and discovered that mem_info_vram_vendor only appears on discrete GPUs. Which, on reflection, makes sense. This approach preserves the core logic while offering a more reliable way to distinguish between dGPUs and iGPUs in the long run.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants