Enable SPIR-V optimizations for shader compiler (looking for testing). #82444
Conversation
reduz left a comment:
seems fine, just nitpick.
As per @reduz's suggestion, you can now toggle this with a project setting instead. The launch option is gone.
Validation warnings

First off, there are validation warnings in the Mobile renderer, and Godot will crash if run with validation layers (the crash happens inside the layer). The warnings are:

As we discussed with @DarioSamo, the crashes and the important validation errors can be fixed by letting Godot reflect the SPIR-V before optimization instead of afterwards. Not a big deal. The performance warnings can't be fixed without a severe refactor of how shaders work; unfortunately, Godot currently just crosses its fingers and hopes the driver will optimize the unconsumed vertex outputs away.

Performance comparison on low-end Android

OK, I admit I was hoping for a miracle (e.g. a 50% improvement or more), but it does make a difference. I tested on Android, specifically a Redmi 4X with an Adreno 504. I chose this device because it is one of the weakest devices we should reasonably be able to support (e.g. everything on minimum, resolution downscaled, etc.), since its drivers are decent enough. Note: Godot currently can't hope to target this device; it runs too slow. All tests were made with the Mobile renderer, not the Forward+ one. I also used two scenes that I knew in advance are pixel-shader bound. They're extremely simple:
Note that these are very synthetic and not indicative of real world performance.
To summarize (in MSPF, milliseconds per frame):
Takeaways:
Although I was hoping for a lot more, it is something. And it is through the accumulation of performance improvements that we eventually reach decent performance, particularly on the lower end. So if Dario finds the cost of integrating this PR acceptable (i.e. fixing the validation errors and the runtime cost of compiling with the optimizer), I'm in favour of including the SPIR-V optimizer.
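Since the comparison above is reported in MSPF, a small helper for converting to FPS and computing the relative improvement may be handy when reproducing these numbers. This is a throwaway sketch, not engine code; the function names are made up:

```python
# Illustrative helpers for interpreting MSPF (milliseconds per frame)
# benchmark numbers; not part of Godot.

def mspf_to_fps(mspf: float) -> float:
    """Convert milliseconds-per-frame to frames-per-second."""
    return 1000.0 / mspf

def improvement_pct(mspf_before: float, mspf_after: float) -> float:
    """Relative improvement when frame time drops from before to after."""
    return 100.0 * (mspf_before - mspf_after) / mspf_before
```

For example, dropping from 40 MSPF to 38 MSPF is a 5% improvement (25 FPS to roughly 26.3 FPS).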
Port SPIRV-Tools to SCons. Enable optimizations on glslang when it's built in. Add project setting to enable optimizations by the shader compiler (disabled by default).
This is an experimental PR with an implementation that won't resemble the final approach we'd take if shader optimizations are to be enabled, but at least it gets the SPIRV-Tools port to SCons out of the way, which was the biggest undertaking required to test this out.
I was made aware that one of the reasons the optimizer couldn't be used was that it wouldn't preserve the resource bindings properly, therefore making validation fail. However, upon digging into the glslang source, it's pretty evident it just doesn't expose the options the optimizer has to preserve this information. With that change made, it's perfectly possible to use Godot with the shader optimizer, although some issues may remain to be found.
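The binding-preservation behaviour can also be exercised with the standalone spirv-opt tool that ships with SPIRV-Tools, which is a convenient way to inspect what the optimizer does to a shader outside the engine. This is a hedged CLI sketch, not the code path this PR takes (the filenames are placeholders):

```shell
# Run SPIRV-Tools' performance passes on a compiled shader while keeping
# descriptor bindings alive, so reflection data still matches afterwards.
spirv-opt -O --preserve-bindings shader.unopt.spv -o shader.opt.spv

# Optionally confirm the result still validates for Vulkan.
spirv-val --target-env vulkan1.1 shader.opt.spv
```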
Important

Enabling "rendering/shader_compiler/shader_compilation/optimize" and restarting the editor is required for shader optimizations to take effect. It should be fairly noticeable by the fact that shaders will take longer to build during editor startup or when opening scenes.
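For reference, toggling the setting outside the editor UI amounts to editing project.godot directly. This sketch assumes the usual mapping where the first segment of the setting path becomes the section name:

```ini
; project.godot (sketch): enable shader optimization, then restart the editor.
[rendering]

shader_compiler/shader_compilation/optimize=true
```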
What we want to verify
There are ongoing discussions about how much the optimizations done by the shader compiler are worth. Drivers can do a lot of heavy lifting when it comes to optimization, so it's hard to find consistent data on how much this helps across the board, as it can vary wildly depending on the hardware vendor and the platform being targeted.
The idea behind the PR is to have an easily accessible option to test out how much of this holds true depending on where Godot is deployed. By using the project setting, it should be fairly easy to verify whether this brings any noticeable improvement to a particular platform.
What I've been able to verify so far
For context, I'm doing my testing on Windows 11 with an RTX 3090 Ti. Of all the platforms where I think this would be beneficial, this is the least likely one to show any difference: NVIDIA is fairly competitive when it comes to its Windows drivers, and this is high-end hardware.
As far as Godot's caching is concerned, there are significant, verifiable differences in the size of the SPIR-V shaders (expected) and the PSO cache (less expected) stored in the user data directories. At least on NVIDIA, this seems to hint that the initially generated PSO does not achieve optimization as good as what the SPIR-V optimizer produces. However, there's no telling whether this PSO is actually used at later points or replaced by a more optimized one in the background.
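Reproducing the cache-size comparison above is as simple as summing the files under the relevant user data directories with the optimizer off and on. A minimal sketch in plain Python; the helper names are made up, and the actual cache paths depend on your project:

```python
# Compare on-disk cache sizes between two runs; not part of Godot.
import os

def dir_size(path: str) -> int:
    """Total size in bytes of every file under `path`, recursively."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total

def size_reduction_pct(unoptimized: int, optimized: int) -> float:
    """Percent shrink going from the unoptimized to the optimized cache."""
    return 100.0 * (unoptimized - optimized) / unoptimized
```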
So this is clearly not nothing, but it doesn't necessarily translate to performance. This is where I've had a dodgy experience so far in getting results that replicate consistently. Whenever I've noticed a performance uplift, it's usually been around 1-2%, only for it to go away the more I jumped between both versions. I suspect the driver is doing some heavy lifting of its own, delegating the optimization and swapping out the actual pipeline for a better one as soon as it can.
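A 1-2% uplift is easily swallowed by run-to-run noise, so when testing this PR it's worth comparing any improvement against the baseline's variance rather than trusting a single run. A minimal sketch in plain Python, nothing Godot-specific; the one-standard-deviation threshold is just a crude rule of thumb I'm assuming here:

```python
# Decide whether a measured frame-time improvement exceeds baseline noise.
import statistics

def frame_time_stats(frame_times_ms):
    """Mean and sample standard deviation of a list of frame times (ms)."""
    return statistics.fmean(frame_times_ms), statistics.stdev(frame_times_ms)

def uplift_beats_noise(baseline_ms, optimized_ms):
    """Only trust an improvement larger than one standard deviation
    of the baseline run's frame times."""
    base_mean, base_stdev = frame_time_stats(baseline_ms)
    opt_mean, _ = frame_time_stats(optimized_ms)
    return (base_mean - opt_mean) > base_stdev
```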
What we should verify
I suspect we might find more significant differences if we target testing on the platforms that might not have their drivers as polished as NVIDIA and AMD on desktop.
- Android: There's a huge amount of hardware variety here where we could find this is worth it, given the nature of how Vulkan drivers work on this platform.
- Intel: Very popular across low-end laptops and not necessarily the most up-to-date when it comes to drivers. Arc discrete GPUs might fare better than iGPUs.
If you'd still like to test on NVIDIA and AMD, the information can still be useful for finding out whether this PR is worth it, or, in an odd case, whether it causes any actual regressions.
Reasons we might not want this
Compilation takes longer, that is undeniable. However, considering the optimizer works in multiple steps, one of which still has the original SPIR-V in its unoptimized form, I think we can realistically mitigate this by using the unoptimized version as soon as possible and delegating the optimization to the background. Done properly, this would result in no noticeable difference whatsoever compared to the shader stutters already possible today.
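The deferred-optimization idea above can be sketched as a slot that is immediately usable with the unoptimized blob and atomically swaps in the optimized one once a background worker finishes. This is a toy Python illustration of the scheme, not engine code (Godot would do this in C++ on its own worker threads), and all names here are hypothetical:

```python
# Toy sketch: serve the unoptimized SPIR-V right away, optimize in the
# background, and swap the blob in under a lock once it's ready.
import threading

class ShaderSlot:
    def __init__(self, unoptimized_spirv, optimize_fn):
        self._lock = threading.Lock()
        self._spirv = unoptimized_spirv          # usable immediately
        self._optimize_fn = optimize_fn
        self._worker = threading.Thread(target=self._optimize, daemon=True)
        self._worker.start()

    def _optimize(self):
        optimized = self._optimize_fn(self._spirv)   # potentially slow
        with self._lock:
            self._spirv = optimized                  # swap in the result

    def spirv(self):
        """Return whichever blob is current; never blocks on optimization."""
        with self._lock:
            return self._spirv

    def wait(self):
        """For testing: block until the background optimization finishes."""
        self._worker.join()
```

The point of the design is that callers never wait: they get the unoptimized blob until the swap happens, so shader stutter is no worse than today.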
The code size for SPIRV-Tools's optimizer is massive. While I can try to whittle it down as much as I can, I think it might end up at nearly 200K lines at minimum. While we can easily opt out of building it into the engine with a build option, it's still a massive addition to the codebase. On the positive side, there was pretty much no patching required to get it to work: only extracting the required files as necessary. That said, I do think this could prove more crucial to the engine if we happen to find significant performance differences that make it worth it, unlike the dependency on OIDN, which ended up performing much worse than it should have and was also around 115K lines.