Nvidia support #391

Closed · 5 tasks done
ilya-zlobintsev opened this issue Oct 25, 2024 · 20 comments

@ilya-zlobintsev
Owner

ilya-zlobintsev commented Oct 25, 2024

With #388 being merged, LACT now has basic support for Nvidia GPUs through NVML (the Nvidia Management Library). This issue tracks the feature support for Nvidia; a rough illustrative sketch of what some of these features look like at the NVML level follows the checklist below.

  • Information reporting
    • The UI now only shows fields that have data present, as it doesn't make sense to report vendor-specific info such as CUDA cores or compute units when the value will always be empty on that GPU
  • Real-time stats reporting (clockspeed, power usage, power states, fan speed, throttling)
  • Power limit configuration
  • Custom fan curves
    • Largely uses the same logic as pre-RDNA3 AMD
  • Clockspeed configuration

Not possible to implement currently:

  • Voltage configuration - doesn't appear to be supported in NVML
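
To put the checklist in concrete terms: LACT itself is written in Rust and uses NVML bindings, but the stats-reporting and fan-curve items boil down to plain NVML calls, roughly as in the sketch below. The curve points, device index 0 and fan index 0 are made-up values for illustration, and the manual fan-speed call typically requires root and a reasonably recent driver; this is not LACT's actual code.

```c
// Minimal sketch of NVML-based stats reporting plus one fan-curve step.
// Link with -lnvidia-ml. Error handling is reduced to a single check.
#include <stdio.h>
#include <nvml.h>

// Hypothetical fan curve: temperature (C) -> fan speed (%).
static const unsigned int curve_temp[]  = {40, 60, 80};
static const unsigned int curve_speed[] = {30, 55, 100};

static unsigned int curve_lookup(unsigned int temp) {
    // Piecewise-linear interpolation between the points above.
    if (temp <= curve_temp[0]) return curve_speed[0];
    for (int i = 1; i < 3; i++) {
        if (temp <= curve_temp[i]) {
            unsigned int t0 = curve_temp[i - 1], t1 = curve_temp[i];
            unsigned int s0 = curve_speed[i - 1], s1 = curve_speed[i];
            return s0 + (temp - t0) * (s1 - s0) / (t1 - t0);
        }
    }
    return curve_speed[2];
}

int main(void) {
    if (nvmlInit_v2() != NVML_SUCCESS) return 1;

    nvmlDevice_t dev;
    nvmlDeviceGetHandleByIndex_v2(0, &dev);

    unsigned int gpu_clock, mem_clock, power_mw, temp, fan;
    nvmlDeviceGetClockInfo(dev, NVML_CLOCK_GRAPHICS, &gpu_clock);
    nvmlDeviceGetClockInfo(dev, NVML_CLOCK_MEM, &mem_clock);
    nvmlDeviceGetPowerUsage(dev, &power_mw);                    // milliwatts
    nvmlDeviceGetTemperature(dev, NVML_TEMPERATURE_GPU, &temp); // degrees C
    nvmlDeviceGetFanSpeed_v2(dev, 0, &fan);                     // percent

    printf("gpu %u MHz, vram %u MHz, %u mW, %u C, fan %u%%\n",
           gpu_clock, mem_clock, power_mw, temp, fan);

    // One fan-curve iteration: map the temperature to a target speed and apply it.
    nvmlDeviceSetFanSpeed_v2(dev, 0, curve_lookup(temp));

    nvmlShutdown();
    return 0;
}
```
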
@ilya-zlobintsev
Owner Author

All the main functionality has been implemented; now it just needs a bit more testing across multiple GPU generations.

@Dekamir

Dekamir commented Nov 15, 2024

Will we ever be able to toggle individual PowerMizer power/performance levels (or edit them) on NVIDIA?

@AbdulrahmanObaido

Can we get support for setting the fan speed on Intel Arc on kernel 6.12?

@ilya-zlobintsev
Owner Author

@AbdulrahmanObaido 6.12 only added support for reading the fan speed on Arc; it's not possible to set it (or do any other configuration on Intel GPUs).
See #401 for more info

@Jimmytalksalot

What is the config YAML for fan control?

@stanislav-kozyrev

@ilya-zlobintsev Thanks for Nvidia GPU support. By the way, the memory clock offset value in the UI and config file is multiplied by 2 compared to GreenWithEnvy and MSI Afterburner. The default memory clock is determined correctly (e.g., 11501 MHz for 23 Gbps). To get 26 Gbps on an RTX 4080 Super, the offset in those programs is 1500 (Afterburner), whereas LACT needs 3000. Please check the attached debug snapshot.
info.json
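
For context on the factor of two (a back-of-the-envelope reading of the numbers above, not a statement about how either tool defines its offset): the clock NVML reports is half the marketed transfer rate, so an offset expressed against that clock is also half of the "Gbps difference".

```c
/* Rough arithmetic, using only the figures from the comment above. */
unsigned int base_clock_mhz   = 11501;                    /* NVML-reported clock at 23 Gbps */
unsigned int target_gbps      = 26;
unsigned int target_clock_mhz = target_gbps * 1000 / 2;   /* 13000 MHz                      */
int offset_mhz = (int)(target_clock_mhz - base_clock_mhz);/* ~= +1500, the Afterburner value */
```
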

@stanislav-kozyrev

@ilya-zlobintsev Also forgot to mention that sometimes Nvidia drivers don't create libnvidia-ml.so (especially beta drivers or ones from the CUDA repo), but there is still libnvidia-ml.so.1. I had to create the missing symlink to make lactd detect Nvidia. Maybe it makes sense to fall back to libnvidia-ml.so.1 if libnvidia-ml.so is not found?
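
A minimal sketch of the fallback being suggested, assuming the library is loaded at runtime with dlopen (LACT's real loading path may differ):

```c
// Try the unversioned NVML name first, then fall back to the versioned one,
// which driver packages ship even when the plain .so symlink is missing.
#include <dlfcn.h>
#include <stddef.h>

void *load_nvml(void) {
    void *handle = dlopen("libnvidia-ml.so", RTLD_NOW);
    if (handle == NULL)
        handle = dlopen("libnvidia-ml.so.1", RTLD_NOW);
    return handle; // NULL if neither name could be resolved
}
```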

@ilya-zlobintsev
Owner Author

@stanislav-kozyrev It was changed to libnvidia-ml.so.1 since the last stable release; see #414 for more info.

@stanislav-kozyrev

@ilya-zlobintsev Thanks for the update. Tried out release 0.7.0 and both issues (library name and memory offset) are resolved.

@HorstBaerbel

Thanks for the Nvidia support! RTX 3060 user here. The GPU and VRAM clocks do not seem to adjust correctly: LACT seems to do something to the clocks, but the actual values are a bit off. Also, I'd expect the power states to update to reflect the new values:

Image

LACT-v0.7.0-snapshot-20250116-163055.tar.gz

@stanislav-kozyrev

@HorstBaerbel Could you please try to reproduce the issue with either GreenWithEnvy (available as a Flatpak) or the official NVIDIA X Server Settings app? Both apps allow adjusting core and memory offsets, but don't forget to apply defaults with LACT first. The power limit can be changed with nvidia-smi. Also, try to get 100% load with a GPU-heavy benchmark like Unigine Superposition.

The core clock on modern (since the last decade or so) GPUs is more like a cap, i.e. an upper limit that the vendor's boost technology aims to achieve under ideal conditions, e.g. a 15 C hot spot temperature, unlimited power and/or voltage, etc. For instance, out of the box my RTX 4080S core clock is set to 3105 MHz, but in reality it stays around 2655-2750 MHz depending on the load. That's why undervolting modern chips is more beneficial than overclocking -- basically it's all about removing obstacles to let the boost algorithm stretch its legs.

@HorstBaerbel

HorstBaerbel commented Jan 16, 2025

@stanislav-kozyrev I'm on Wayland here, so LACT is basically my only hope. GreenWithEnvy and the official NVIDIA X Server Settings app don't support Wayland.
About the clocks: I understand that it is an upper limit, but when I increase the limits the clocks do increase, yet the relation seems totally random. This is underclocking with the power limit set to max, running FurMark:

Image

Not even the reduced-from-default clocks are reached, while the VRAM clock is too high. I'm not saying this is LACT's fault; it may very well be a driver thing.

@BlueGoliath

> Thanks for the Nvidia support! RTX 3060 user here. The GPU and VRAM clocks do not seem to adjust correctly: LACT seems to do something to the clocks, but the actual values are a bit off. Also, I'd expect the power states to update to reflect the new values.

NVML bug.

@BlueGoliath

> > The GPU and VRAM clocks do not seem to adjust correctly: LACT seems to do something to the clocks, but the actual values are a bit off.
>
> NVML bug.

FYI, this has been fixed in 570. It looks like LACT doesn't update those values, as an app restart is required for the new values to be shown. Everything overclocking-related works fine now.

@Aspect250

Why not implement core clock offset and memory clock offset like in these projects:
https://github.com/WickedLukas/nvidia-tuner
https://github.com/Dreaming-Codes/nvidia_oc
They mention that they use NVML. This would provide the ability to overclock and undervolt on Nvidia cards.

@ilya-zlobintsev
Owner Author

> Why not implement core clock offset and memory clock offset like in these projects https://github.com/WickedLukas/nvidia-tuner https://github.com/Dreaming-Codes/nvidia_oc as they mention that they use NVML. This would provide the ability to overclock and undervolt on Nvidia cards.

The current min/max clock functionality does use offsets under the hood. I plan on showing them as offsets in the UI too, but they already work using the same NVML options.

@Aspect250

Aspect250 commented Feb 2, 2025

> The current min/max clock functionality does use offsets under the hood. I plan on showing them as offsets in the UI too, but they already work using the same NVML options.

I'm using this CachyOS guide as my basis right now. So, according to the guide above, you're using nvmlDeviceSetGpcClkVfOffset for the current max GPU clock slider, and not nvmlDeviceSetGpuLockedClocks or the equivalent of it. Have I got that right?

@ilya-zlobintsev
Owner Author

> I'm using this CachyOS guide as my basis right now. So, according to the guide above, you're using nvmlDeviceSetGpcClkVfOffset for the current max GPU clock slider, and not nvmlDeviceSetGpuLockedClocks or the equivalent of it. Have I got that right?

Yes, that's correct - both the max GPU and VRAM clock settings currently use nvmlDeviceSetGpcClkVfOffset/nvmlDeviceSetMemClkVfOffset, and the max clock values shown in the GUI are calculated as the clock speed of the max P-state plus the offset. It was done this way because it allowed reusing the existing GUI and config options, which were originally made for AMD. nvmlDeviceSetGpuLockedClocks is not used anywhere.

Here is the relevant code: https://github.com/ilya-zlobintsev/LACT/blob/master/lact-daemon/src/server/gpu_controller/nvidia.rs#L567
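
To make the offset math above concrete, here is a rough standalone sketch of that approach. It is not the code linked above, and the "default max clock" query is only an approximation of the max P-state clock described in the reply:

```c
// Apply a GPU clock cap expressed the way the GUI shows it:
// target max clock = default max clock + offset, so offset = target - default.
// Link with -lnvidia-ml; error handling is mostly omitted for brevity.
#include <nvml.h>

nvmlReturn_t set_max_gpu_clock(nvmlDevice_t dev, unsigned int target_mhz) {
    unsigned int default_max_mhz;
    nvmlReturn_t ret = nvmlDeviceGetMaxClockInfo(dev, NVML_CLOCK_GRAPHICS, &default_max_mhz);
    if (ret != NVML_SUCCESS)
        return ret;

    int offset_mhz = (int)target_mhz - (int)default_max_mhz; // may be negative (underclock)
    // Deprecated in newer NVML docs in favour of nvmlDeviceSetClockOffsets,
    // but this is the call discussed in this thread.
    return nvmlDeviceSetGpcClkVfOffset(dev, offset_mhz);
}
```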

However, I do plan on replacing the nvmlDeviceSet*ClkVfOffset functions (which are marked as deprecated in the NVML docs) with nvmlDeviceSetClockOffsets, which allows configuring clock offsets for each P-state separately. It was a bit buggy when I tested it originally, but it appears to be improved with the 570 beta driver.

@Aspect250

Thanks for clarifying. So much easier to undervolt/overclock right now.

Cheers for adding NVIDIA support.

@HorstBaerbel

Only here to report that this works much better now with 570.86.16. Thanks!
