
Cache IPAdapter instances to avoid expensive KV extraction on every generation #335

Closed
wants to merge 4 commits

Conversation

rewbs commented Feb 20, 2024

Hi all, I'm not deeply familiar with the code I've touched here, so I'm very open to feedback on this PR.

Description

Currently, apply_ipadapter reconstructs an IPAdapter instance on every invocation, even when reusing the same model. This is an expensive operation, primarily due to the call to To_KV(), and to a lesser degree due to the calls to init_proj*().

This PR caches the IPAdapter instance, keyed off the IPAdapter model filename, when running with --always-high-vram.
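For illustration, the caching approach could be sketched roughly as below. This is a hypothetical simplification, not the actual patch: `build_adapter` stands in for the expensive IPAdapter() construction (To_KV() and init_proj*()), and `cache_enabled` stands in for the --always-high-vram check; the helper and cache names are invented for this sketch.

```python
# Module-level cache: IPAdapter model filename -> constructed adapter.
_ip_adapter_cache = {}

def get_ip_adapter(model_filename, build_adapter, cache_enabled=True):
    """Return a cached adapter for this filename, building it on first use.

    build_adapter: zero-argument callable performing the expensive
    construction (stand-in for IPAdapter(), To_KV(), init_proj*()).
    cache_enabled: stand-in for the --always-high-vram condition; when
    False, the adapter is rebuilt on every call, as before this PR.
    """
    if not cache_enabled:
        return build_adapter()
    if model_filename not in _ip_adapter_cache:
        _ip_adapter_cache[model_filename] = build_adapter()
    return _ip_adapter_cache[model_filename]
```

On a repeated generation with the same model file, the second call returns the cached instance instead of re-running the key/value extraction.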

Why?

This has a significant performance impact on Deforum, which I'm currently porting to Forge as per #96.

Without this change, Forge is slower than A1111 on a simple Deforum run with IPAdapter enabled, despite having a higher it/s. With this change, it is substantially faster than A1111.

The attached 120 frame Deforum settings file runs as follows on my 3090 / i5-4590 @ 3.30GHz:

  • A1111: 3 min. 32.1 sec.
  • Forge without this change: 4 min. 28.5 sec.
  • Forge with this change: 1 min. 47.5 sec

deforum_settings.txt

Outside of Deforum, this also benefits runs with batch size > 1, or any repeated generations using IPAdapter (it shaves a few seconds off the initialisation time before you see the it/s gauge).

Screenshots/videos:

n/a


BadisG commented Feb 21, 2024

I think it's working. Now when I use IPAdapters (in this example I use InstantID, so I get 2 IPAdapters) and generate images over and over with the same model, generation starts without much delay. Here are my logs:

2024-02-22 00:14:15,746 - ControlNet - INFO - Using preprocessor: InsightFace (InstantID)
2024-02-22 00:14:15,746 - ControlNet - INFO - preprocessor resolution = 1024
2024-02-22 00:14:15,915 - ControlNet - INFO - Current ControlNet IPAdapterPatcher: D:\stable-diffusion-webui-forge\models\ControlNet\ip-adapter_instant_id_sdxl.bin
2024-02-22 00:14:15,915 - ControlNet - INFO - ControlNet Input Mode: InputMode.SIMPLE
2024-02-22 00:14:15,919 - ControlNet - INFO - Using preprocessor: instant_id_face_keypoints
2024-02-22 00:14:15,920 - ControlNet - INFO - preprocessor resolution = 1024
D:\stable-diffusion-webui-forge\venv\lib\site-packages\insightface\utils\transform.py:68: FutureWarning: `rcond` parameter will change to the default of machine precision times ``max(M, N)`` where M and N are the input matrix dimensions.
To use the future default and silence this warning we advise to pass `rcond=None`, to keep using the old, explicitly pass `rcond=-1`.
  P = np.linalg.lstsq(X_homo, Y)[0].T # Affine matrix. 3 x 4
Automatic Memory Management: 0 Modules in 0.00 seconds.
2024-02-22 00:14:16,621 - ControlNet - INFO - Current ControlNet ControlNetPatcher: D:\stable-diffusion-webui-forge\models\ControlNet\control_instant_id_sdxl.safetensors
2024-02-22 00:14:18,030 - ControlNet - INFO - IPAdapter: Using cached layers for ip-adapter_instant_id_sdxl.bin.
2024-02-22 00:14:18,056 - ControlNet - INFO - ControlNet Method InsightFace (InstantID) patched.
2024-02-22 00:14:18,160 - ControlNet - INFO - ControlNet Method instant_id_face_keypoints patched.
To load target model SDXL
To load target model ControlNet
Begin to load 2 models
unload clone 3
unload clone 2
Moving model(s) has taken 0.14 seconds
100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [00:11<00:00,  1.78it/s]
Total progress: 100%|██████████████████████████████████████████████████████████████████| 20/20 [00:11<00:00,  1.75it/s]
Total progress: 100%|██████████████████████████████████████████████████████████████████| 20/20 [00:11<00:00,  1.85it/s]

I still see those "unload clone" messages though; I don't know whether that was supposed to be fixed by your PR or whether it's normal.

Do you also intend to add a feature that unloads the previous checkpoint when you switch models? That's the biggest weakness of --always-gpu, and it could be fixed with your own flag.
#266 (comment)

rewbs (Author) commented Feb 23, 2024

Thanks for trying this out!

> I still see those "unload clone" messages though; I don't know whether that was supposed to be fixed by your PR or whether it's normal.

Those are unrelated and are not expected to be changed by this PR.

> Do you also intend to add a feature that unloads the previous checkpoint when you switch models? That's the biggest weakness of --always-gpu, and it could be fixed with your own flag.

I'm not sure I fully understand: this change isn't about checkpoints, it's about caching some of the data derived from the IPAdapter models. Perhaps unloading previous checkpoints is a separate concern we can tackle in a different PR.

lllyasviel (Owner) commented:

will take a look soon

Panchovix pushed a commit to Panchovix/stable-diffusion-webui-reForge that referenced this pull request Jul 13, 2024
lllyasviel (Owner) commented:
Hi, we are going to close PRs before Forge's recent major revision. If we missed some important PRs, please consider reopening them (if they are not already on our todo list).

lllyasviel closed this Aug 1, 2024