Cyberpunk 2077 FG works but with significant timing issues #245

Open
SimpleHeuristics opened this issue Jan 23, 2025 · 31 comments

@SimpleHeuristics

Cyberpunk 2077 got updated to V2.21 today. It contains the new DLSS4 dlls, so Transformer based DLSS Super Resolution, Ray Reconstruction, and Frame Generation.

Initially the game would not launch; it only made it to the CD Projekt RED logo before freezing. I suspected the new DLLs, so I tested them one by one, swapping each for the most recent CNN-based version.

Both the Transformer based super resolution and ray reconstruction DLLs work. It's only the Transformer based FG dll (nvngx_dlssg.dll) that doesn't.

I also tested this in other games whose frame generation previously worked well, such as Starfield, Horizon Zero Dawn Remastered, and God of War Ragnarok. All exhibit the same issue: the super resolution DLL works but the frame generation one does not.

I'm guessing this has to do with the fact that the current implementation in dxvk-nvapi expects the FG DLL to utilize optical flow rather than some transformer model.

Anyway, it's all very new, but unfortunate given that frame gen only recently started working under Linux.

Software information

  • Affected Applications: Cyberpunk 2077 V2.21, but affects any game that supports frame generation that tries to use the new dll.
  • DXVK-NVAPI, DXVK or VKD3D-Proton settings (environment variables and/or configuration files): Defaults from Proton Experimental Bleeding Edge

System information

  • GPU: RTX 4090 FE
  • Driver: 565.77
  • Launcher (e.g Proton or Bottles): Proton Under Steam (Embedded GameScope)
  • Proton/Wine version: Proton Experimental Bleeding Edge (Jan 23, 2025)
  • DXVK-NVAPI version: As above in bleeding edge proton
  • DXVK version: As above in bleeding edge proton
  • VKD3D-Proton version: As above in bleeding edge proton

Log files

Please attach DXVK-NVAPI log files as a text file :
Both NVAPI and Proton Logs Attached

nvapi64.log
steam-1091500.log

@liam-middlebrook
Contributor

The crash is in vkd3d-proton coming from a call to ID3D12Device::CreateUnorderedAccessView() which happens after a call is made to NvAPI_D3D12_CaptureUAVInfo(). Looking at the vkd3d-proton source, and the rest of the call frames coming from nvngx_dlssg.dll it appears that there's a feature-gap in vkd3d-proton's captureuavinfo implementation where it doesn't support buffer resources.

liam-middlebrook added a commit to liam-middlebrook/vkd3d-proton that referenced this issue Jan 23, 2025
Fixes a crash reported in jp7677/dxvk-nvapi#245 where capturing UAVInfo for a
buffer-resource would result in a segfault.

Signed-off-by: Liam Middlebrook <[email protected]>
@SimpleHeuristics
Author

> The crash is in vkd3d-proton coming from a call to ID3D12Device::CreateUnorderedAccessView() which happens after a call is made to NvAPI_D3D12_CaptureUAVInfo(). Looking at the vkd3d-proton source, and the rest of the call frames coming from nvngx_dlssg.dll it appears that there's a feature-gap in vkd3d-proton's captureuavinfo implementation where it doesn't support buffer resources.

So it's not an issue with the nvapi implementation? Impressed that it can be fixed so quickly despite Nvidia switching to an entirely different method of generating frames than optical flow.

HansKristian-Work pushed a commit to HansKristian-Work/vkd3d-proton that referenced this issue Jan 24, 2025
Fixes a crash reported in jp7677/dxvk-nvapi#245 where capturing UAVInfo for a
buffer-resource would result in a segfault.

Signed-off-by: Liam Middlebrook <[email protected]>
@jp7677
Owner

jp7677 commented Jan 24, 2025

The related VKD3D-Proton PR has been merged, OK to close this? Also thanks for the well written issue report in the first place!

On a side note, I saw those in the provided logs:

info:nvapi64:<-NvAPI_D3D12_CreateCubinComputeShaderWithName: Invalid argument
info:nvapi64:<-NvAPI_D3D12_CreateCubinComputeShaderEx: Invalid argument
info:nvapi64:NvAPI_QueryInterface (NvAPI_D3D12_CreateCubinComputeShaderExV2): Not implemented method

This seems to be a deliberate misuse of those functions, I guess to check which functions are available or similar. The argument for the device is null for both calls and also for NvAPI_D3D12_CreateCubinComputeShaderExV2 when added, so returning NVAPI_INVALID_ARGUMENT seems correct here. Interestingly, it continues to probe for NvAPI_D3D12_GetCudaMergedTextureSamplerObject after NvAPI_D3D12_CreateCubinComputeShaderExV2.
In summary, this looks like expected behavior and so far nothing wrong on our side here.
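
The probing pattern described above can be modelled with a short sketch. This is an illustration only: the `Status` enum, the `IMPLEMENTED` table, and the `probe`/`is_available` helpers are hypothetical stand-ins, not dxvk-nvapi code, and the status values are arbitrary labels rather than the real NVAPI codes.

```python
from enum import Enum
from typing import Callable, Optional


class Status(Enum):
    # Illustrative names only; the values are labels, not real NVAPI codes.
    OK = "ok"
    INVALID_ARGUMENT = "invalid argument"
    NOT_IMPLEMENTED = "not implemented"


def create_cubin_compute_shader_ex(device: Optional[object]) -> Status:
    # An implemented entry point rejects a null device, as seen in the log.
    return Status.INVALID_ARGUMENT if device is None else Status.OK


# Hypothetical table of the entry points this implementation provides.
IMPLEMENTED: dict[str, Callable[[Optional[object]], Status]] = {
    "NvAPI_D3D12_CreateCubinComputeShaderEx": create_cubin_compute_shader_ex,
}


def probe(name: str) -> Status:
    """Call an entry point with a null device, as the caller appears to do."""
    entry = IMPLEMENTED.get(name)
    if entry is None:
        return Status.NOT_IMPLEMENTED
    return entry(None)


def is_available(name: str) -> bool:
    # "Invalid argument" still proves the function exists; only
    # "not implemented" would push the caller onto a fallback path.
    return probe(name) is Status.INVALID_ARGUMENT
```

Under this model, NvAPI_D3D12_CreateCubinComputeShaderEx counts as available even though the probe call itself fails, while NvAPI_D3D12_CreateCubinComputeShaderExV2 is reported as missing.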

@SveSop
Contributor

SveSop commented Jan 24, 2025

@jp7677 Yes, it seems to check IF these are available by calling them with "bogus data" and getting NVAPI_INVALID_ARGUMENT in return, but it WILL use them later, as I can see on Windows:

<- NvAPI_D3D12_CreateCubinComputeShaderWithName: (0000000000000000, 0000000000000000, 0, 0, 0, 0, 0000000000000000, 0000000000000000)
nvapi_QueryInterface: (0x3151211b) - <<RELAY: NvAPI_D3D12_CreateCubinComputeShaderEx>>
<- NvAPI_D3D12_CreateCubinComputeShaderEx: (0000000000000000, 0000000000000000, 0, 0, 0, 0, 0, 0000000000000000, 0000000000000000)
nvapi_QueryInterface: (0x299f5fdc) - <<RELAY: NvAPI_D3D12_CreateCubinComputeShaderExV2>>
<- NvAPI_D3D12_CreateCubinComputeShaderExV2: (0000000000000000)
nvapi_QueryInterface: (0x329fe6e0) - <<RELAY: NvAPI_D3D12_GetCudaMergedTextureSamplerObject>>
<- NvAPI_D3D12_GetCudaMergedTextureSamplerObject: (0000000000000000)
nvapi_QueryInterface: (0x0ddac234) - <<RELAY: NvAPI_D3D12_GetCudaIndependentDescriptorObject>>
<- NvAPI_D3D12_GetCudaIndependentDescriptorObject: (0000000000000000)
nvapi_QueryInterface: (0x70c07832) - <<RELAY: NvAPI_D3D12_IsFatbinPTXSupported>>
<- NvAPI_D3D12_IsFatbinPTXSupported: (0000016D63A48630, 0000016D6BB740F8)
--
<- NvAPI_D3D12_CreateCubinComputeShaderExV2: (0000006CD9BFE250)
<- NvAPI_D3D12_CreateCubinComputeShaderExV2: (0000006CD9BFE140)
<- NvAPI_D3D12_CreateCubinComputeShaderExV2: (0000006CD9BFE140)
<- NvAPI_D3D12_CreateCubinComputeShaderExV2: (0000006CD9BFE140)
<- NvAPI_D3D12_CreateCubinComputeShaderExV2: (0000006CD9BFE140)
--
<- NvAPI_D3D12_GetCudaIndependentDescriptorObject: (0000006CD975AF48)
<- NvAPI_D3D12_GetCudaMergedTextureSamplerObject: (0000006CD975AD70)
<- NvAPI_D3D12_GetCudaIndependentDescriptorObject: (0000006CD975ACD8)
<- NvAPI_D3D12_GetCudaMergedTextureSamplerObject: (0000006CD975AFF0)
<- NvAPI_D3D12_GetCudaIndependentDescriptorObject: (0000006CD975AF58)
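
A trace like this can be summarised with a small script that separates the null-argument probes from the later real calls. This is only a sketch: the two line formats are inferred from the excerpt above, and the `summarize` helper and its "all-zero first argument means probe" rule are assumptions for illustration.

```python
import re

# Matches lines such as:
# nvapi_QueryInterface: (0x3151211b) - <<RELAY: NvAPI_D3D12_CreateCubinComputeShaderEx>>
QUERY_RE = re.compile(r"nvapi_QueryInterface: \((0x[0-9a-fA-F]+)\) - <<RELAY: (\w+)>>")
# Matches lines such as:
# <- NvAPI_D3D12_CreateCubinComputeShaderExV2: (0000006CD9BFE250)
CALL_RE = re.compile(r"<- (\w+): \(([^)]*)\)")


def summarize(log_text):
    """Return (interface id -> name, name -> number of non-probe calls)."""
    ids = {}
    real_calls = {}
    for line in log_text.splitlines():
        line = line.strip()
        match = QUERY_RE.match(line)
        if match:
            ids[match.group(1)] = match.group(2)
            continue
        match = CALL_RE.match(line)
        if match:
            first_arg = match.group(2).split(",")[0].strip()
            # An all-zero (or empty) first argument is treated as a probe.
            if first_arg.strip("0"):
                name = match.group(1)
                real_calls[name] = real_calls.get(name, 0) + 1
    return ids, real_calls
```

Feeding it the excerpt above would list NvAPI_D3D12_CreateCubinComputeShaderExV2 and the two Cuda object getters among the entry points that are called with real arguments.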

I was actually experimenting with this after the 2.21 update, and from my testing it does seem that the three calls to

NvAPI_D3D12_CreateCubinComputeShaderWithName
NvAPI_D3D12_GetCudaMergedTextureSamplerObject
NvAPI_D3D12_GetCudaIndependentDescriptorObject

as well as
NvAPI_D3D12_CreateCubinComputeShaderExV2

were "needed" for it to go on and perform this ExV2 call. (This particular one could be easily implemented, because it is more or less the same as NvAPI_D3D12_CreateCubinComputeShaderEx but with a struct instead of parameters, even though it has some sort of version check.)

These two are somewhat more iffy though:
NvAPI_D3D12_GetCudaMergedTextureSamplerObject and NvAPI_D3D12_GetCudaIndependentDescriptorObject are actually used, so I am not sure what happens if they are just stubbed and return an error by default.

Then again.. After this VKD3D fix, it might not be needed at all perhaps? 😄

@jp7677
Owner

jp7677 commented Jan 24, 2025

I think we are good here (for now) and what we are seeing is a fallback path at work.

@Saancreed
Collaborator

> These two are somewhat more iffy though:
> NvAPI_D3D12_GetCudaMergedTextureSamplerObject and NvAPI_D3D12_GetCudaIndependentDescriptorObject are actually used, so I am not sure what happens if they are just stubbed and return an error by default.

Actually, they look like something that vkd3d-proton could implement using vkGetImageViewHandle64NVX.

@philipl

philipl commented Jan 25, 2025

I'm not sure whether to comment here or on the vkd3d-proton issue, but although it doesn't crash anymore, I don't think the frame gen is actually working correctly.

Although the feature turns on, and my frame rate appears to increase, I do not see smooth motion (and I did with the 2.2 version of Cyberpunk). Instead, it seems to have the same smoothness as with frame gen off and I see motion that looks, for want of a better description, like a very basic blend of the two frames with no actual motion interpolation going on.

Perhaps this is the consequence of the use of the fallback path.

@SimpleHeuristics
Author

> I'm not sure whether to comment here or on the vkd3d-proton issue, but although it doesn't crash anymore, I don't think the frame gen is actually working correctly.
>
> Although the feature turns on, and my frame rate appears to increase, I do not see smooth motion (and I did with the 2.2 version of Cyberpunk). Instead, it seems to have the same smoothness as with frame gen off and I see motion that looks, for want of a better description, like a very basic blend of the two frames with no actual motion interpolation going on.
>
> Perhaps this is the consequence of the use of the fallback path.

This might be a cyberpunk specific issue. I am seeing what appears to be inverse ghosting. As if the generated frame is being inserted after the next real frame.

However using the new frame Gen DLL on other games results in essentially perfect results.
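
To make the suspected ordering problem concrete, here is a toy model of it. It is purely illustrative and says nothing about how DLSS FG schedules frames internally: frames are represented only by the simulation time they depict, and both function names are made up for this sketch.

```python
def presentation_order(real_frames, generated_late=False):
    """Interleave real frames with one generated frame per pair.

    Each frame is the simulation time it depicts; a generated frame
    depicts the midpoint between two consecutive real frames.
    """
    shown = [real_frames[0]]
    for prev, nxt in zip(real_frames, real_frames[1:]):
        mid = (prev + nxt) / 2
        if generated_late:
            # Suspected fault: the in-between frame is shown after the newer real frame.
            shown += [nxt, mid]
        else:
            shown += [mid, nxt]
    return shown


def steps_backwards(shown):
    """Count how often the displayed time jumps backwards."""
    return sum(1 for a, b in zip(shown, shown[1:]) if b < a)
```

With real frames at times 0, 2, 4, the correct order advances steadily, while the late-insertion order steps backwards after every real frame, which would look like the "inverse ghosting" described here even though the frame counter goes up.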

@philipl

philipl commented Jan 25, 2025

> This might be a cyberpunk specific issue. I am seeing what appears to be inverse ghosting. As if the generated frame is being inserted after the next real frame.

Actually, that might be what I'm seeing too. It would explain why it doesn't look smooth.

> However using the new frame Gen DLL on other games results in essentially perfect results.

In other games, are you using some mechanism to override from CNN to transformer? (although in cyberpunk, switching back to CNN doesn't fix the weirdness)

@SimpleHeuristics
Author

> > This might be a cyberpunk specific issue. I am seeing what appears to be inverse ghosting. As if the generated frame is being inserted after the next real frame.
>
> Actually, that might be what I'm seeing too. It would explain why it doesn't look smooth.
>
> > However using the new frame Gen DLL on other games results in essentially perfect results.
>
> In other games, are you using some mechanism to override from CNN to transformer? (although in cyberpunk, switching back to CNN doesn't fix the weirdness)

I am just copying the new DLLs into the other game folders, the same way we overrode or updated these in the past. The CNN/Transformer toggle is only in Cyberpunk as of now, since it was updated to support this specifically. It can be toggled via the driver too, but that might be Windows-only for now, via Nvidia Inspector or the Nvidia app. We'll have to see what the 570 drivers do for us on Linux.

The only DLL that gave issues, on Cyberpunk and all other games (regardless of whether the model can be toggled), was the frame gen one, not the super res one.

@Saancreed
Collaborator

Saancreed commented Jan 25, 2025

I added a very basic, incomplete and probably not entirely correct implementation of the mentioned Cuda entry points in https://github.com/jp7677/dxvk-nvapi/tree/image-view-handle-64 + https://github.com/Saancreed/vkd3d-proton/tree/image-view-handle-64.

But be aware, this requires that your Vulkan driver supports revision 3 of VK_NVX_image_view_handle (so, anything before 570 is probably excluded). Try it at your own risk 🙂

EDIT: Forgot to mention but you also need winevulkan 1.3.302 or newer which no official Proton version currently supports.
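
A quick way to check the driver-side requirement is to look at the extension revision that `vulkaninfo` reports. The sketch below assumes the usual `NAME : extension revision N` line format of vulkaninfo's text output; the helper names are made up for this example, and it does not check the winevulkan version at all.

```python
import re


def extension_revision(vulkaninfo_text, name):
    """Return the highest revision reported for an extension, or None if absent."""
    pattern = re.compile(
        rf"^\s*{re.escape(name)}\s*:\s*extension revision\s+(\d+)",
        re.MULTILINE,
    )
    revisions = [int(m.group(1)) for m in pattern.finditer(vulkaninfo_text)]
    return max(revisions) if revisions else None


def supports_image_view_handle_64(vulkaninfo_text):
    # Revision 3 of VK_NVX_image_view_handle is the requirement stated above.
    revision = extension_revision(vulkaninfo_text, "VK_NVX_image_view_handle")
    return revision is not None and revision >= 3
```

The text can come from running `vulkaninfo` on the host and passing its standard output to `supports_image_view_handle_64`.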

@SveSop
Contributor

SveSop commented Jan 25, 2025

> EDIT: Forgot to mention but you also need winevulkan 1.3.302 or newer which no official Proton version currently supports.

Sadly, a straight drop-in replacement from wine-9.22 did not seem to fix that. But I suppose it's bound to be updated in the not-so-distant future now that wine-10.0 is out 👍

@Saancreed
Collaborator

ivyl was already poked about this, so Proton bleeding-edge will likely have updated winevulkan around Monday.

@Saancreed
Collaborator

Saancreed commented Jan 27, 2025

https://github.com/ValveSoftware/Proton/releases/tag/experimental-bleeding-edge-9.0-156274-20250127-p412b48-weab957-d5b2128-va9a96a now has updated winevulkan, this should be enough to support vkGetImageViewHandle64NVX and by extension, NvAPI_D3D12_GetCudaMergedTextureSamplerObject and NvAPI_D3D12_GetCudaIndependentDescriptorObject with my branches linked above.

Coincidentally, this should also fix the black screen seen when attempting to use DLSS v310 in Vulkan applications on R570 drivers.

@SveSop
Contributor

SveSop commented Jan 27, 2025

@Saancreed 👍
Does seem to work, although i cannot say what it does in the end. Sadly it did not fix the "ghosting effect" in CP2077 when panning/moving.

As commented above, it does really look like the FG frame is inserted out of order in a way.

Going through the "unknown" nvapi calls, the only ones that could possibly matter are those "DRS" settings, in case this particular game uses some sort of "tweak". Maybe this ties in with what was mentioned above, where the same DLSS libs did not cause the same issue in another game? Maybe some CNN/Transformer "setting" is used..

I'll see if I can return "not found" for those DRS calls on Windows and whether that changes anything 😄

EDIT: It did not. So not related to DRS.

@jp7677 changed the title from "Cyberpunk 2077 (and any other DLSS FG Game) Fails to launch with new Tensor based FG dll from DLSS4 (Cyberpunk Patch 2.21). Optical Flow based FG DLL works but with significant timing issues" to "Cyberpunk 2077 FG works but with significant timing issues" on Feb 20, 2025
@Linux-Fan

The issue of heavy image stuttering in Cyberpunk 2077 2.21 utilising DLSS4 still persists on my CachyOS fully updated system. Any prospects for a solution?

@SveSop
Contributor

SveSop commented Mar 5, 2025

I am not 100% sure this is due to nvapi - or nvapi alone, but it could maybe aggravate some issues with DLSS and the new transformer model.. perhaps.

https://youtu.be/3nfEkuqNX4k?t=835

The linked timestamp shows them having issues recording what appears to be "frames rendered out of sync", which is very much like what we experience in CP2077 imo.
So.. Why would the "ton of issues" show up on Windows using Nvidia's own recording software? Possibly DLSS Transformer Model issues.. or framegen issues?

PS. Whole video is quite interesting, so worth a look.

@philipl

philipl commented Mar 5, 2025

I tried using DLSS swapper to downgrade to the last DLSS 3 dll and it still had the rendering problem. Could it be a game issue - have they changed their code around how they hook in frame gen, and this is the code that is problematic? IIRC, the initial game crashing problem with FG happened regardless of DLSS dll version, so that points to game code as well.

@Linux-Fan

And what about other games that are reported to have similar issues?

@philipl

philipl commented Mar 7, 2025

> And what about other games that are reported to have similar issues?

🤷 I forced all the DLLs back to 3.7.10 and I still see the out-of-order frames. Maybe Nvidia made some recommendations on how to hook FG in, or included some other Nvidia-specific optimisations in their DLSS 4 guidance, which would mean this could show up in other games as well.

@SveSop
Contributor

SveSop commented Mar 9, 2025

So.. found out something interesting today when i was testing and fiddling with CP2077..

If I START the game with DLSS set to "Ultra Performance", then go ingame and THEN set something like Quality or Balanced, it works just fine. If I "exit to menu" and then go back ingame, it's borked... If I do NOT set DLSS to "Ultra Performance" BEFORE starting the game, it's borked.

This can also be forced with NVAPI settings:

"DXVK_NVAPI_DRS_NGX_DLSS_RR_MODE": "ultra_performance",
"DXVK_NVAPI_DRS_NGX_DLSS_SR_MODE": "ultra_performance",

But then the game disregards the ingame settings and is stuck at Ultra Performance. I do feel the Ultra Performance setting is "ok'ish", but it still has some issues, like NPCs disappearing when they walk past lampposts and so on, so I don't really think it's all that great.

Tested this with both GE custom proton, and Proton Bleeding Edge, and it seems to work as long as you remember to set the DLSS setting to "Ultra Performance" before quitting. A bit tedious, but hey.
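
For reference, a minimal sketch of how the two overrides above could look as a complete user_settings.py. The file location (next to the `proton` script, typically created by copying user_settings.sample.py) is an assumption on my part; only the two keys and their values come from the comment above.

```python
# user_settings.py (assumed to live next to the `proton` script)
user_settings = {
    # Pin DLSS Ray Reconstruction and Super Resolution to Ultra Performance.
    # As noted above, the game then ignores the in-game DLSS quality setting.
    "DXVK_NVAPI_DRS_NGX_DLSS_RR_MODE": "ultra_performance",
    "DXVK_NVAPI_DRS_NGX_DLSS_SR_MODE": "ultra_performance",
}
```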

Also tested this with an updated nvngx_dlssg.dll 310.2.1 instead of the 310.2.0 that came with Patch 2.21, but it made no difference to this behavior.
I'll see if the logs differ between starting the game with "Ultra Performance" and starting it with "Balanced". It would be more logical if it ONLY worked while "Ultra Performance" was set and then borked as soon as anything else was selected, but I can clearly see a difference in FPS, and other visible changes, when I switch between Ultra Performance and Quality, so the setting HAS changed without bringing back the frame-sync issues.

@philipl

philipl commented Mar 10, 2025

Wow. I have a similar but not identical experience. I need to test more combinations but for now I have this:

For me (on a 4060ti for this test) I had to set my saved DLSS level to Performance and not turn Path Tracing on. I tested with RT off and with RT on but with all RT sub-settings off (not sure whether that is any different from RT off), and in those cases FG would work smoothly. If I turned on PT, it would go wrong. I believe, but need to do more testing, that Ultra Performance stayed broken, but maybe there was some other RT related setting active. I will update when I can test more combinations.

Update:

  • Keeping RT off appears to be the key thing. My belief that "RT on with all sub-settings off" didn't b0rk things was not correct. RT has to be off.
  • Once RT is off, all DLSS levels appeared to work fine.
  • It also seems that as long as you turn RT off before loading/continuing, it will be fine. It seemed to work just as well as starting the game with RT off.
  • Once you're in game and FG is working, RT can be safely turned on.

FWIW, I'm using dlss swapper to force 310.2.1, and running 570.86.16 drivers. Currently running latest Proton Experimental so not using any custom builds of vkd3d or dxvk-nvapi.

@Linux-Fan

I cannot confirm that setting Ultra Performance changes anything - I'm still getting very heavy image stuttering. Cyberpunk 2077 with FrameGen is unplayable on Linux at the moment.

@philipl

philipl commented Mar 16, 2025

I've now been able to test on a second system (4090) and the behaviour matches my previous experience. As long as RT/PT is disabled when you continue/load, FG will work correctly.

@shelterx

Is this a driver bug or vkd3d/dxvk-nvapi issue? Anyone know?

@chonty

chonty commented Mar 16, 2025

@philipl worked for me. Re-enabling RT breaks it again and it remains broken even after disabling.

@SveSop
Contributor

SveSop commented Mar 16, 2025

> Is this a driver bug or vkd3d/dxvk-nvapi issue? Anyone know?

Hard to tell, since everyone seems to have a somewhat different experience. As I said, I run EVERYTHING on max (RT Ultra, Path Tracing on+++), with the game set to "DLSS Ultra Performance" when it starts, and as long as I do not exit the game (or exit to menu), everything runs just fine. If I forget to set DLSS back to Ultra Performance when quitting, I have to load the game, set Ultra Performance, then exit again, and THEN I can start up and go ingame.

Since this does not seem to work for everyone, I dunno what's up, as I seem to be able to do this very consistently.

I have created a user_settings.py file setting this:

user_settings = {
    "PROTON_LOG": "0",
    "WINEDEBUG": "-all",
    "DXVK_NVAPI_LOG_LEVEL": "none",
    "DXVK_LOG_LEVEL": "none",
    "VKD3D_CONFIG": "dxr11",
    "VKD3D_DEBUG": "none",
    "VKD3D_SHADER_DEBUG": "none",
#    "PROTON_ENABLE_NGX_UPDATER": "1",
    "DXVK_NVAPI_DRS_NGX_DLSS_RR_OVERRIDE": "on",
    "DXVK_NVAPI_DRS_NGX_DLSS_SR_OVERRIDE": "on",
    "DXVK_NVAPI_DRS_NGX_DLSS_FG_OVERRIDE": "on",
    "DXVK_NVAPI_DRS_NGX_DLSS_RR_OVERRIDE_RENDER_PRESET_SELECTION": "render_preset_latest",
    "DXVK_NVAPI_DRS_NGX_DLSS_SR_OVERRIDE_RENDER_PRESET_SELECTION": "render_preset_latest"
}

I can't get the game to start if I enable the NGX_UPDATER; it just seems to hang forever when starting.

VKD3D_CONFIG: "dxr11" is probably not needed, as this is the default anyway afaik...
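
For anyone who prefers Steam launch options over a user_settings.py file, a dict like the one above can be rendered as `KEY=value ... %command%`. This is just a convenience sketch; the `to_launch_options` helper is made up for this example and is not part of Proton.

```python
import shlex


def to_launch_options(settings):
    """Render a settings dict as a Steam launch options string."""
    # shlex.quote leaves simple values untouched and quotes anything unsafe.
    parts = [f"{key}={shlex.quote(str(value))}" for key, value in settings.items()]
    return " ".join(parts + ["%command%"])
```

For example, passing `{"DXVK_NVAPI_DRS_NGX_DLSS_FG_OVERRIDE": "on"}` produces `DXVK_NVAPI_DRS_NGX_DLSS_FG_OVERRIDE=on %command%`, which can be pasted into the game's launch options.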

@Linux-Fan

The strangest part is that on some occasions (like once every 20-30 runs) Frame Generation works as it should with everything enabled, including RT/PT. Then, when you reload a save or go back to the menu and start playing again, everything goes back to the same erroneous behaviour.

@SveSop
Contributor

SveSop commented Mar 16, 2025

> The strangest part is that on some occasions (like once every 20-30 runs) Frame Generation works as it should with everything enabled, including RT/PT. Then, when you reload a save or go back to the menu and start playing again, everything goes back to the same erroneous behaviour.

Yeah, seen that too.

Using quick-travel or loading a savegame will bork the "fix" I described above (using Ultra Performance) for me. So I have to quit the game and load again almost every time I use quick-travel. So.. Meh.. Sux.

It could possibly be some nvapi function needed for some sync thing... Maybe NvAPI_D3D12_SetNvShaderExtnSlotSpace and NvAPI_D3D12_SetNvShaderExtnSlotSpaceLocalThread. The problem is that currently (I think) vkd3d can't really use Nvidia "extensions" the way these functions are supposed to enable. I think MAYBE these have to do with "improving performance", especially in path tracing scenarios, using the "Shader Execution Reordering API" (Cyberpunk DOES use this).
That could be why things seem to improve when NOT using PT with this game. Messing with this call on Windows causes the game to crash, so there is nothing easy to compare with there. I was hoping I could reproduce this in some sense on Windows, but so far no dice.

The problem is that many games have some sort of expectation when they detect NVIDIA hardware, and often do NOT provide any fallback mechanism for NVAPI_NO_IMPLEMENTATION or similar errors. Even if the call returns an error, game devs may choose to call it without checking the return code, possibly for performance reasons, since it always works with whatever driver is required. My testing on Windows certainly indicated this, as the game would "choose" to crash rather than use a different path with worse performance...

Why this became a lot more prominent with the 2.21 patch is anyone's guess, but it could very well be that new DLSS 4.0 features make more use of this special API, especially in this game.

https://developer.nvidia.com/blog/improve-shader-performance-and-in-game-frame-rates-with-shader-execution-reordering/

Even though this blog is from 2022, that does not mean ALL games utilize this, and from a quick browse through it, I think making full use of this requires more than just replacing a couple of DLLs (the DLSS DLLs).

@SveSop
Contributor

SveSop commented Mar 16, 2025

I wonder if NV_ray_tracing_invocation_reorder could be used for something like this.. 🤔

@shelterx

@SveSop
Mmh... Just enabling DLSS (no FG) can cause issues on its own; sometimes you have to restart the game, otherwise it can cause all sorts of weirdness.


9 participants