Free lunch for Krita AI Diffusion - Implement FreeU #208

Closed
wants to merge 3 commits into from

@davniko davniko commented Dec 9, 2023

I've implemented FreeU functionality in the plugin. FreeU can be enabled or disabled with a checkbox in the configuration window. If enabled, the input values are used to apply the FreeU_V2 node in the workflow builder. If disabled, the input fields for the values are hidden.
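
For readers who have not seen how such a node is wired in, here is a minimal sketch of the idea, assuming a ComfyUI API-format graph (a dict of node id to `class_type` and `inputs`, with links written as `[node_id, output_index]`). The helper name `apply_free_u` and the id scheme are hypothetical and are not taken from the plugin's workflow builder.

```python
from typing import Any

Graph = dict[str, dict[str, Any]]


def apply_free_u(graph: Graph, model: list, enabled: bool,
                 b1: float, b2: float, s1: float, s2: float) -> list:
    """Return the model link to use downstream, adding a FreeU_V2 node if enabled."""
    if not enabled:
        # Checkbox off: the graph is left untouched and the original model is used.
        return model
    node_id = str(len(graph) + 1)  # hypothetical id scheme for this sketch
    graph[node_id] = {
        "class_type": "FreeU_V2",
        "inputs": {"model": model, "b1": b1, "b2": b2, "s1": s1, "s2": s2},
    }
    # Downstream nodes (sampler etc.) should consume output 0 of the new node.
    return [node_id, 0]


graph: Graph = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "model.safetensors"}},  # placeholder file name
}
patched_model = apply_free_u(graph, ["1", 0], True, b1=1.1, b2=1.2, s1=0.6, s2=0.4)
```

Because the function returns the link to use downstream, the rest of the graph does not need to know whether FreeU was applied.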

Screenshot 2023-12-09 201112

FreeU disabled:
Screenshot 2023-12-09 201533

FreeU enabled:
Screenshot 2023-12-09 201605

@Acly
Owner

Acly commented Dec 10, 2023

Thank you for the PR. I admit I haven't used FreeU much so far, but as it is I don't believe it fits into the plugin well. That doesn't mean it can't be useful, my main issue is the way it's exposed.

As a general rule of thumb, it should be possible to explain parameters exposed in the UI to people who have no technical knowledge about SD by giving an intuition how they affect the image.

If FreeU always improves quality, it should be active by default with a "good" parametrization. I'm skeptical though: I haven't seen an independent evaluation yet where it's clearly superior. Alternatively, I would expect at most one slider which, e.g., lets you trade "image coherence" against "high frequency detail". (I'm not sure this description is accurate.)

I realize finding a good mapping and steering the internal parameters can be a lot of work, especially since evaluating SD results is tedious. But it could be part of the value the tool provides.

@davniko
Author

davniko commented Dec 10, 2023

Fair comments. I do believe FreeU visibly improves output quality; this can be compared on X/Y plots in image generation, and has been done by many people already. Although "image quality" is somewhat subjective, it does tend to fix some mistakes and add some detail in generated images at the very least. The harder part is determining the "best" parameter values for it. I've set the defaults to values suggested by Nasir Khalid on this site after extensive testing: https://wandb.ai/nasirk24/UNET-FreeU-SDXL/reports/FreeU-SDXL-Optimal-Parameters--Vmlldzo1NDg4NTUw?accessToken=6745kr9rjd6e9yjevkr9bpd2lm6dpn6j00428gz5l60jrhl3gj4gubrz4aepupda . These are also the values I tend to use for image generation.

You are correct that the tool doesn't fit in well as exposed, though. Perhaps it could be just a simple checkbox which is turned on by default, without showing the parameter options. That would be more user-friendly.

This may be a problem for more advanced users however, since people have their own preferences for FreeU parameters and may want to tweak them to those preferences.

I'm also not sure it would make sense to use a slider here since FreeU is either on or off, not in a range of values applied like LoRA strength or CFG for example. The single slider would also be controlling 4 parameters in this case, so I'm not sure how it would map to the parameter values as you've said.
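
One possible mapping, purely as a sketch and not something tested on images: treat the slider as a strength `t` in [0, 1] and blend all four parameters linearly from the neutral setting (all 1.0, which should leave the model unchanged) to a fixed target. The target values below are the ones from the `free_u_params` suggestion in this review; the names `FreeUParams` and `params_from_slider` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FreeUParams:
    b1: float
    b2: float
    s1: float
    s2: float


# Assumption: scale factors of 1.0 mean "no change" for both b and s.
NEUTRAL = FreeUParams(1.0, 1.0, 1.0, 1.0)
# Illustrative target, taken from the settings suggestion in this thread.
TARGET = FreeUParams(b1=1.1, b2=1.2, s1=0.6, s2=0.4)


def params_from_slider(t: float, target: FreeUParams = TARGET) -> FreeUParams:
    """Blend all four parameters from neutral (t=0) to target (t=1)."""
    t = min(max(t, 0.0), 1.0)  # clamp to the slider range

    def lerp(a: float, b: float) -> float:
        return a * (1.0 - t) + b * t

    return FreeUParams(
        b1=lerp(NEUTRAL.b1, target.b1),
        b2=lerp(NEUTRAL.b2, target.b2),
        s1=lerp(NEUTRAL.s1, target.s1),
        s2=lerp(NEUTRAL.s2, target.s2),
    )


half_strength = params_from_slider(0.5)
```

Whether a straight line between those two points gives a perceptually even "coherence vs. detail" trade-off is exactly the part that would need evaluation.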

How would you feel about introducing something like an "Advanced" checkbox into the plugin, which would let seasoned users see more advanced tweaking options (like the FreeU parameters in this case)? This would perhaps allow us to have our cake and eat it too: non-advanced users are exposed only to simple and intuitive settings, while advanced users have the additional tweaking options available to them. This also relates to other tools that I have implemented, or am in the process of implementing, in the plugin locally for myself (as I find them very useful), but am unsure how and if they would fit into the current plugin: stuff like Kohya HiRes fix / DeepShrink (which really helps generating images in the non-trained aspect ratios and resolutions), additional upscaling options, noise injection (gets additional details into the image), custom comfy workflows, etc.

Of course if that is not how you envision the plugin, I will keep the modifications locally for myself. But I do think the plugin could benefit from enabling at least some additional advanced options to experienced users, even if the additional tools I've mentioned are not desirable for this plugin.

@Acly
Owner

Acly commented Dec 11, 2023

I do believe FreeU visibly improves output quality; this can be compared on X/Y plots in image generation, and has been done by many people already

I've seen some of those, but none that actually compare across a wide range and collect meaningful statistics. The images in Nasir Khalid's comparison are quite indicative: it's clearly a trade-off. It fixes some compositional artifacts and improves coherence, but images tend to look more plastic/fake/oversaturated/smooth due to loss of structural detail. Which one you prefer is subjective, as you say; personally I mostly prefer the original.

I'm not against advanced settings, but to me the question about which settings to support/expose and how to present them are largely independent. In other words, niche or complex options are OK, but they still ought to be explainable in terms of their effect on the image. The FreeU parameters are what I'd call internal parameters, the whole SD process has hundreds of those, all of which could be tweaked to some effect. But at some point we have to consolidate them, workflows are growing in complexity and will continue to do so. I think it's important we find good defaults and move on, rather than dragging an ever growing list of tweakability.

IMO this applies to "seasoned" users too! At some point I stopped worrying about whether an image might have come out better if I had used Euler instead of DPM or whatever: it's too time consuming, and there are other options with much larger impact. Abstracting/simplifying one part of the pipeline creates room to explore others.
To be sure, we're not going to appease everybody who comes from Comfy/Auto and has their personal optimal config which they derived from a "1girl portrait" X/Y plot at some point.

Back on topic, I think FreeU as a checkbox option with a suitable description is okay. Hardcode the parameters for now, see if people actually request tweakability. I saw you tied it to the Style, doesn't it make more sense to have it as a "global" option in Diffusion settings?

Kohya HiRes fix / DeepShrink (which really helps generating images in the non-trained aspect ratios and resolutions)

Heard about this, haven't tested it yet. But it's very interesting, since the plugin already deals with this using 2-pass workflows. If DeepShrink is an improvement it would definitely be nice to have. What I'm looking for is an evaluation - let's say you find out DeepShrink is just as good as 2-pass HiRes fix in 9/10 cases, but faster, for such and such resolution range. Then we can implement it as the default for the situations where it works well, and everybody profits from the improvement - without additional complexity. Defaults don't have to be perfect, they can improve over time.

So please do continue to share any improvements you make. But also consider how the most people could gain from them. I know it's a lot more work though...

@davniko
Author

davniko commented Dec 14, 2023

You are correct. There is some trade-off with oversaturation when using FreeU. From what I understand, parameters b1 and b2 primarily influence the UNet's backbone features, impacting overall image composition, while s1 and s2 affect the details more. Correctly balancing these parameters can lead to better results. The resulting oversaturation is also somewhat dependent on the checkpoint used. I suppose the advantage of generating images in a painting software is that you can also fix the oversaturation directly there using Krita's tools. But I added this to the description as a warning to users.

I put the checkbox under Styles because it seemed most intuitive there next to the LoRA and checkpoint selection, where I would also have the FreeU node in a ComfyUI workflow. I can move it if you don't think this is the appropriate window for it.
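
To make the b/s split a bit more concrete, here is a toy 1D sketch of the two operations as I understand them from the FreeU paper: `b` amplifies part of the backbone features, and `s` rescales the low-frequency band of the skip-connection features. It is a simplification under stated assumptions (1D signals, a naive DFT, hypothetical function names) and is not the code of the FreeU_V2 node.

```python
import cmath


def scale_backbone(features: list[float], b: float) -> list[float]:
    """Amplify the first half of the backbone channels by b."""
    half = len(features) // 2
    return [v * b if i < half else v for i, v in enumerate(features)]


def scale_skip_low_freq(signal: list[float], s: float, threshold: int = 1) -> list[float]:
    """Scale the lowest-frequency DFT bins of a 1D skip signal by s."""
    n = len(signal)
    # Forward DFT (naive O(n^2), fine for a toy example).
    spectrum = [
        sum(signal[k] * cmath.exp(-2j * cmath.pi * f * k / n) for k in range(n))
        for f in range(n)
    ]
    # Bins f and n - f carry the same frequency; scale those below the threshold.
    scaled = [c * s if min(f, n - f) < threshold else c
              for f, c in enumerate(spectrum)]
    # Inverse DFT; real input plus a symmetric mask gives a real result.
    return [
        (sum(scaled[f] * cmath.exp(2j * cmath.pi * f * k / n) for f in range(n)) / n).real
        for k in range(n)
    ]


flat = scale_skip_low_freq([1.0, 1.0, 1.0, 1.0], s=0.5)       # only low-frequency content
detail = scale_skip_low_freq([1.0, -1.0, 1.0, -1.0], s=0.5)   # only high-frequency content
```

With `s < 1` the flat (low-frequency) signal is attenuated while the alternating (high-frequency) one passes through unchanged, which matches the intuition that s1 and s2 shift the balance toward fine detail coming from the skip connections.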

Regarding Kohya HiRes Fix, its main advantage lies in preventing and mitigating image corruption when generating in non-standard ratios and resolutions, which you might want to do often in painting software: you don't want just square images, or only the resolutions the models were trained on. It lets you keep the versatility of canvas sizes and ratios that painting software offers.

Your current workflow, as I understand, already involves latent or image upscaling to match the chosen canvas size if that is required. So if a user has too big of a canvas, you generate the image in a smaller resolution and then upscale it to the desired size. It gives pretty good results and is decently fast even for larger resolutions.

However, the method has some limitations. The main one is that it doesn't perform well when the aspect ratio is off, for example 16:9 or 20:9, which are often used for computer and phone screens. In that case, the output will still be corrupted, with figures stretched and deformed to fit the aspect ratio. There are also several intermediate steps involved in generating images with your method: first you calculate whether an upscale is needed, then you generate the image at a smaller size, and then you apply either latent upscaling or image upscaling.
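
As a rough sketch of the first of those steps (my reading of the general 2-pass idea, not the plugin's actual logic), the first-pass size could be chosen by shrinking the canvas to about the model's native pixel count while keeping the aspect ratio. The function name, the 1024 native size, and the rounding to multiples of 8 are assumptions for illustration.

```python
import math


def first_pass_size(canvas_w: int, canvas_h: int,
                    native: int = 1024, multiple: int = 8) -> tuple[int, int, bool]:
    """Return (width, height, needs_upscale) for the first generation pass."""
    native_pixels = native * native
    if canvas_w * canvas_h <= native_pixels:
        # Canvas is small enough: generate directly, no second pass.
        return canvas_w, canvas_h, False
    # Shrink uniformly so the pixel count roughly matches the native resolution.
    scale = math.sqrt(native_pixels / (canvas_w * canvas_h))
    w = max(multiple, round(canvas_w * scale / multiple) * multiple)
    h = max(multiple, round(canvas_h * scale / multiple) * multiple)
    return w, h, True


square = first_pass_size(2048, 2048)
widescreen = first_pass_size(3840, 2160)  # 16:9 canvas
```

Note that the 16:9 canvas still yields a 16:9 first pass, so scaling alone does not address the aspect-ratio problem described above.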

Due to this, I think Deep Shrink might be a more efficient and effective way of handling this. It performs everything within a single node, without the need to manually scale the image before or after generation, and is applied only to the checkpoint model.

But of course, the Kohya method is not bulletproof and may not be suitable for all scenarios. It is slower when generating large images of size 2048 and above, since it generates the image directly at that size; your current method of generating a smaller image and then using image upscaling to get to the desired size seems to be faster than going directly to a large resolution. At large resolutions too far away from the trained sizes, the Kohya method also stops working well, at least with the default settings.

In any case, I am still testing this out and will open a PR for review once I'm done and give some comparisons, so you can check and see if you're interested in having this method included in the main plugin.

@Acly
Owner

Acly commented Dec 15, 2023

I put the checkbox under Styles because it seemed most intuitive there

It fits thematically into either tab I'd say, but there is an important difference:

  • If it's part of style, it has to be enabled and configured for each style individually
  • If it's in the diffusion tab you enable/disable it globally, regardless of style/checkpoint used

Both could make sense, just making sure you considered this.

"images may suffer from oversaturation.",
)

b1 = Setting(
Owner

If we want to keep them in the settings file, can these be grouped? Maybe put a dict here

free_u_params = Setting("FreeU Parameters", dict(b1=1.1, b2=1.2, s1=0.6, s2=0.4))

Make a copy of the defaults in Style.__init__ in that case.
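
A minimal sketch of why the copy matters, using a hypothetical stand-in for `Setting` (the plugin's real class is not shown in this diff excerpt and may look different): the default dict is a single shared object, so without a copy, editing one style's parameters would silently change the default for every other style.

```python
from copy import deepcopy
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Setting:
    """Hypothetical stand-in: a display name plus a default value."""
    name: str
    default: Any


class StyleSettings:
    free_u_params = Setting("FreeU Parameters", dict(b1=1.1, b2=1.2, s1=0.6, s2=0.4))


class Style:
    def __init__(self) -> None:
        # Copy the dict so each style owns its parameters;
        # assigning the default directly would share one dict across all styles.
        self.free_u_params = deepcopy(StyleSettings.free_u_params.default)


a, b = Style(), Style()
a.free_u_params["b1"] = 1.5  # only style `a` changes
```

After the assignment, `b.free_u_params["b1"]` and the default in `StyleSettings` both still hold 1.1.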

@guzuligo

[Self-Attention Guidance] would go well with this.
image

@richkel

richkel commented Mar 22, 2024

Any progress on this? SAG and FreeU would be a great addition to this already amazing tool.

@davniko davniko closed this by deleting the head repository Jun 5, 2024

4 participants