
Cache IPAdapter instances to avoid expensive KV extraction on every generation #335

Closed
wants to merge 4 commits

Conversation

rewbs commented Feb 20, 2024

Hi all, I'm not deeply familiar with the code I've touched here, so I'm very open to feedback on this PR.

Description

Currently, apply_ipadapter reconstructs an IPAdapter instance on every invocation, even when reusing the same model. This is an expensive operation, primarily due to the call to To_KV(), and to a lesser degree due to the calls to init_proj*().

This PR caches the IPAdapter instance, keyed off the IPAdapter model filename, when running with --always-high-vram.
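For illustration, the caching approach could be sketched roughly as below. This is a hypothetical simplification, not the actual patch: `build_adapter` stands in for the expensive IPAdapter() construction (To_KV() and init_proj*()), and `cache_enabled` stands in for the --always-high-vram check; the helper and cache names are invented for this sketch.

```python
# Module-level cache: IPAdapter model filename -> constructed adapter.
_ip_adapter_cache = {}

def get_ip_adapter(model_filename, build_adapter, cache_enabled=True):
    """Return a cached adapter for this filename, building it on first use.

    build_adapter: zero-argument callable performing the expensive
    construction (stand-in for IPAdapter(), To_KV(), init_proj*()).
    cache_enabled: stand-in for the --always-high-vram condition; when
    False, the adapter is rebuilt on every call, as before this PR.
    """
    if not cache_enabled:
        return build_adapter()
    if model_filename not in _ip_adapter_cache:
        _ip_adapter_cache[model_filename] = build_adapter()
    return _ip_adapter_cache[model_filename]
```

On a repeated generation with the same model file, the second call returns the cached instance instead of re-running the key/value extraction.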

Why?

This has a significant performance impact on Deforum, which I'm currently porting to Forge as per #96.

Without this change, Forge is slower than A1111 on a simple Deforum run with IPAdapter enabled, despite having a higher it/s. With this change, it is substantially faster than A1111.

The attached 120 frame Deforum settings file runs as follows on my 3090 / i5-4590 @ 3.30GHz:

  • A1111: 3 min. 32.1 sec.
  • Forge without this change: 4 min. 28.5 sec.
  • Forge with this change: 1 min. 47.5 sec

deforum_settings.txt

Outside of Deforum, this also benefits runs with batch size > 1, or any repeated generations using IPAdapter (it shaves a few seconds off the initialisation time before you see the it/s gauge).

Screenshots/videos:

n/a


BadisG commented Feb 21, 2024

I think it's working. Now when I use IPAdapters (in this example I use InstantID, so I get 2 IPAdapters) and generate images over and over with the same model, generation starts without much delay. Here are my logs:

2024-02-22 00:14:15,746 - ControlNet - INFO - Using preprocessor: InsightFace (InstantID)
2024-02-22 00:14:15,746 - ControlNet - INFO - preprocessor resolution = 1024
2024-02-22 00:14:15,915 - ControlNet - INFO - Current ControlNet IPAdapterPatcher: D:\stable-diffusion-webui-forge\models\ControlNet\ip-adapter_instant_id_sdxl.bin
2024-02-22 00:14:15,915 - ControlNet - INFO - ControlNet Input Mode: InputMode.SIMPLE
2024-02-22 00:14:15,919 - ControlNet - INFO - Using preprocessor: instant_id_face_keypoints
2024-02-22 00:14:15,920 - ControlNet - INFO - preprocessor resolution = 1024
D:\stable-diffusion-webui-forge\venv\lib\site-packages\insightface\utils\transform.py:68: FutureWarning: `rcond` parameter will change to the default of machine precision times ``max(M, N)`` where M and N are the input matrix dimensions.
To use the future default and silence this warning we advise to pass `rcond=None`, to keep using the old, explicitly pass `rcond=-1`.
  P = np.linalg.lstsq(X_homo, Y)[0].T # Affine matrix. 3 x 4
Automatic Memory Management: 0 Modules in 0.00 seconds.
2024-02-22 00:14:16,621 - ControlNet - INFO - Current ControlNet ControlNetPatcher: D:\stable-diffusion-webui-forge\models\ControlNet\control_instant_id_sdxl.safetensors
2024-02-22 00:14:18,030 - ControlNet - INFO - IPAdapter: Using cached layers for ip-adapter_instant_id_sdxl.bin.
2024-02-22 00:14:18,056 - ControlNet - INFO - ControlNet Method InsightFace (InstantID) patched.
2024-02-22 00:14:18,160 - ControlNet - INFO - ControlNet Method instant_id_face_keypoints patched.
To load target model SDXL
To load target model ControlNet
Begin to load 2 models
unload clone 3
unload clone 2
Moving model(s) has taken 0.14 seconds
100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [00:11<00:00,  1.78it/s]
Total progress: 100%|██████████████████████████████████████████████████████████████████| 20/20 [00:11<00:00,  1.75it/s]
Total progress: 100%|██████████████████████████████████████████████████████████████████| 20/20 [00:11<00:00,  1.85it/s]

I still see those "unload clone" messages though; I don't know whether that was supposed to be fixed by your PR or whether it's normal.

Do you also intend to add a feature that unloads the previous checkpoint when you switch models? That's the biggest weakness of --always-gpu, and it could be fixed with your own flag.
#266 (comment)

rewbs (Author) commented Feb 23, 2024

Thanks for trying this out!

> I still see those "unload clone" messages though; I don't know whether that was supposed to be fixed by your PR or whether it's normal.

Those are unrelated and are not expected to be changed by this PR.

> Do you also intend to add a feature that unloads the previous checkpoint when you switch models? That's the biggest weakness of --always-gpu, and it could be fixed with your own flag.

I'm not sure I fully understand: this change isn't about checkpoints, it's about caching some of the data derived from the IPAdapter models. Perhaps unloading previous checkpoints is a separate concern we can tackle in a different PR.

lllyasviel (Owner) commented:

will take a look soon

Panchovix pushed a commit to Panchovix/stable-diffusion-webui-reForge that referenced this pull request Jul 13, 2024
lllyasviel (Owner) commented:
Hi, we are going to close PRs before Forge's recent major revision. If we missed some important PRs, please consider reopening them (if they are not already on our todo list).

lllyasviel closed this Aug 1, 2024