xformers for potential speedup, or torch 2.01 arguments #17

311-code · 2024-01-26T05:34:28Z

I read that using xformers (pip install xformers) could result in a large speedup in the Marigold and Depth-Anything realtime conversion. The issue is that I can't find any xformers wheel compatible with torch 2.0.1+cu117 (CUDA 11.7), and I'm not sure whether the Unity project requires that version of CUDA to work.

It seems xformers version 22 may be compatible with torch 2.0.1 and CUDA 11.8.

If that doesn't work because it's too old: I read that torch 2.0.1 can give a speedup comparable to xformers by adding the --opt-sdp-attention or --opt-sdp-no-mem-attention arguments. These seem to be flags specific to automatic1111, though, so I am wondering if the same sort of thing could be done here?

I still can't get the Depth-Anything model going to test this, though. Somehow threedeejay did, but he says it runs at 2 frames per second.

parkchamchi · 2024-01-26T11:22:07Z

Hmm, I wonder if dany is using, or can use, the half-precision optimization?

A higher version of torch would work with the scripts. One thing to consider is whether the current Unity OnnxRuntime DLLs (v1.13.1) would work with other CUDA/cuDNN versions. The ORT docs (#) list ORT v1.13.1 as requiring CUDA v11.6 and cuDNN v8.5.0.96. But given that I use it with CUDA v11.7 and cuDNN v8.2.4, and that the docs say:

Note: Because of CUDA Minor Version Compatibility, Onnx Runtime built with CUDA 11.4 should be compatible with any CUDA 11.x version. Please reference Nvidia CUDA Minor Version Compatibility.

I think it is safe to say you can upgrade your CUDA version.
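As a rough illustration of the rule quoted above, here is a minimal sketch that treats two CUDA versions as interchangeable when they share a major version. The function name `same_cuda_major` is hypothetical, and the check is a deliberate simplification: it ignores driver requirements and says nothing about cuDNN, which has its own compatibility rules.

```python
def same_cuda_major(built_with: str, installed: str) -> bool:
    """Return True when both CUDA versions share the same major version.

    Simplified reading of CUDA Minor Version Compatibility: a library
    built against CUDA 11.x is expected to run on another CUDA 11.y.
    Driver and cuDNN requirements are NOT checked here.
    """
    built_major = built_with.split(".")[0]
    installed_major = installed.split(".")[0]
    return built_major == installed_major


# ORT v1.13.1 is documented against CUDA 11.6; the thread uses 11.7 and asks about 11.8.
for candidate in ("11.7", "11.8", "12.1"):
    print(candidate, same_cuda_major("11.6", candidate))
```

Under this simplified rule, 11.7 and 11.8 pass and 12.1 does not, which matches the reasoning in the comment above.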

--opt-sdp-attention or --opt-sdp-no-mem-attention

I do not know how those args would work.
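For context, those flags belong to the automatic1111 web UI, but as far as I understand they switch attention over to PyTorch's built-in `torch.nn.functional.scaled_dot_product_attention`, which ships with torch 2.0 and later and is not specific to Stable Diffusion. Below is a minimal sketch, assuming torch >= 2.0 is installed, that compares the fused call with a hand-written attention on random tensors. The tensor shapes are arbitrary, and whether the depth models in this project would actually benefit is untested here.

```python
import math

import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Arbitrary example shapes: (batch, heads, tokens, head_dim).
q = torch.randn(1, 4, 16, 32)
k = torch.randn(1, 4, 16, 32)
v = torch.randn(1, 4, 16, 32)

# Fused attention provided by torch 2.0+; PyTorch picks the backend kernel.
fused = F.scaled_dot_product_attention(q, k, v)

# Hand-written reference: softmax(Q K^T / sqrt(d)) V.
scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
reference = torch.softmax(scores, dim=-1) @ v

print(fused.shape)
print(torch.allclose(fused, reference, atol=1e-5))
```

A model only gains from this if its attention layers call the fused function (or are patched to do so), so it is not a drop-in command-line switch outside the web UI.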

311-code · 2024-01-27T02:20:11Z

I will look into this and try it out and report back.

ricardofeynman · 2024-01-30T18:33:20Z

--opt-sdp-attention or --opt-sdp-no-mem-attention

AFAIK these args are relevant only to Stable Diffusion, to speed up image generation. At least I've never encountered them in any other context.

Migrated to Sentis From Barracuda v3.0. Now supports MiDaS v3+ and Depth-Anything models without ORT.

Where's that little mind-exploding emoji when you need it? This, good sir, is some very exciting news. Must test soon.

311-code · 2024-02-05T02:02:45Z

Threedeejay sent me this, and someone got a little further. I wonder if they just need to run pip install https://download.pytorch.org/whl/cu118/xformers-0.0.23.post1%2Bcu118-cp311-cp311-win_amd64.whl to get a compatible version?

I'm currently trying to get the Unity 2022.3.18f1 project going. A lot of features from Meta, including hand tracking, are finally being fully exposed to OpenXR for PCVR with the v62 update. (I have it early by opting into the public test channels on PC and in the mobile app.)

parkchamchi · 2024-02-05T07:33:21Z

That should work if you have Python 3.11 and CUDA 11.8, I assume.
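The Python and CUDA requirements can be read straight from the wheel filename in that URL. Here is a small sketch using only the standard library; the helper names `parse_wheel_tags` and `matches_interpreter` are hypothetical, and the parser assumes the usual `name-version-python-abi-platform.whl` layout, with the CUDA build carried in the local version label after the `+`.

```python
import sys
from urllib.parse import unquote

WHEEL_URL = (
    "https://download.pytorch.org/whl/cu118/"
    "xformers-0.0.23.post1%2Bcu118-cp311-cp311-win_amd64.whl"
)


def parse_wheel_tags(url: str) -> dict:
    """Split a wheel URL into its name, version, and compatibility tags."""
    filename = unquote(url.rsplit("/", 1)[-1])  # turns %2B back into '+'
    stem = filename[: -len(".whl")]
    parts = stem.split("-")
    name, version = parts[0], parts[1]
    python_tag, abi_tag, platform_tag = parts[-3:]
    # The local version label (after '+') holds the CUDA build, e.g. cu118.
    public, _, local = version.partition("+")
    return {
        "name": name,
        "version": public,
        "local": local,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def matches_interpreter(info: dict, version_info=sys.version_info) -> bool:
    """Check whether the wheel's CPython tag matches the given interpreter."""
    return info["python"] == f"cp{version_info[0]}{version_info[1]}"


info = parse_wheel_tags(WHEEL_URL)
print(info)
print("matches this interpreter:", matches_interpreter(info))
```

For this wheel the tags come out as cp311, cu118, and win_amd64, so it targets CPython 3.11 on 64-bit Windows with a CUDA 11.8 build.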
