Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposed cleanups/refactoring #206

Closed
bakkot opened this issue Aug 30, 2022 · 6 comments
Closed

proposed cleanups/refactoring #206

bakkot opened this issue Aug 30, 2022 · 6 comments

Comments

@bakkot
Copy link
Contributor

bakkot commented Aug 30, 2022

I've been messing around with the code for a while now, and there's a few pain points I'd like to address. Before I do a bunch of work on this I want to get confirmation that these changes would be merged:

  • remove batch_size, per Is anyone actually using --batch_size? #204
  • refactor pngwriter to not take an "upscaled" parameter, since that's not really related to what it does
  • reify the parameters (at least sampler, seed, steps, cfg_scale, prompt) into an object
    • and make that object responsible for serialization/deserialization to both prompts and png metadata
    • ideally also changing the format of metadata stored in images; png metadata can be key-value pairs, so we don't need to stick it all into a single string

These need to be done in order - it will be a lot easier to refactor pngwriter after removing batch_size, for example.

I'd really like to work on getting interpolation working, but for that code to be of acceptable quality I really need to make the above changes first. (I actually started doing it and then kept getting distracted by all of the above stuff.)

@lstein if I send a PR with all of these changes, would you be up for merging them? I can split it into seperate PRs if necessary, but if each one needs to get through review in order it's going to take a lot longer.

@tildebyte
Copy link
Contributor

Nice ideas all.

While you're in there (so to speak), it'd be great to have something in the standard PNG "Description" metadata field (possibly at least part of the prompt, but yeah prompts can get pretty long). Given that it's a standard field, it would allow other software (e.g. automated gallery constructors) to have something to generate captions from.

@bakkot
Copy link
Contributor Author

bakkot commented Aug 30, 2022

Good idea. Seems like the prompt is an obvious thing to put there (possibly with something like "generated by Stable Diffusion").

@tildebyte
Copy link
Contributor

TIL; basujindal/stable-diffusion#65 suggests that PNG metadata is EXIF-compliant. If so, then that actually expands the use cases - there are LOTS of scenarios involving automation and EXIF data.

@tildebyte
Copy link
Contributor

Why are these things never straightforward... https://stackoverflow.com/a/62203081/5074550 indicates that

  • PNG does NOT support EXIF
  • "Many pieces of software encode EXIF in a PNG by convention in an iTXt (or compressed zTXt) chunk notated "Raw profile type APP1""
    and
  • "PNG files have included XMP documents (including translated EXIF metadata) in iTXt chunks for years"

To my eye, the most portable/compatible scheme would be storing XMP docs in text chunks. At least then some software (like presumably Lightroom) would be able to read them.

@carbocation
Copy link

carbocation commented Aug 31, 2022

My fork of the old compviz repo writes Exif data to the PNG output. I do this because, as a mac user, I can view the exif data within Finder.

Example image*:
image

And since this Exif data is written to the device model field, it can be displayed in Finder:
image

I'm mentioning this here in case the code to do this is useful for this repo.

  • = Note that github doesn't seem to preserve Exif metadata, so if you download this literal image, you won't see the metadata. Hence the subsequent screenshot as a demo.

@magnusviri
Copy link
Contributor

Since you guys are talking about refactoring and such, I thought I'd mention this here. Macs initially couldn't get consistent images even with the same seed. Some people figured it out (in this issues) but I am pretty sure a side effect of the "fix" is to change all cuda user seed results (I haven't tested it). Because of this I can't imagine that people will want to make this change.

However, there's a 1.5 checkpoint model that's currently being beta tested on the Stable-Diffusion Discord. If I understand correctly, when this is released all of the seed results from the 1.4 model will change. It seems like this is the best time to make a change like this.

@lstein lstein closed this as completed Sep 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants