Conversation


@StableLlama commented Mar 22, 2025

This PR adds functionality to crop an image and to set areas to include or exclude when the image is exported. The new features are:

  • Export images ready to be used for training
  • Mark images manually or with the help of YOLO models to create masks for
    inclusion and exclusion of image parts for masked training. Filter based on
    markings and also their (partial) visibility
  • Crop images with advanced hints that respect relevant aspect ratios and
    bucket sizes of the training scripts (see the sketch after this list)
  • Rate images with 0 to 5 stars and be able to use that for filtering
  • Show the tag count for the current filter setting
  • Filter for image size
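
The crop hints work roughly like the following sketch (names, bucket list and target area are made up for illustration, not the PR's actual code): snap the crop's aspect ratio to the nearest training bucket and warn when the crop drops below the target pixel count (the "green line").

```python
# Hypothetical sketch of the crop hints -- names and bucket list are
# illustrative, not the PR's actual implementation.
BUCKET_RATIOS = [1.0, 4 / 3, 3 / 2, 16 / 9, 3 / 4, 2 / 3, 9 / 16]

def snap_to_bucket(width: int, height: int) -> float:
    """Return the bucket aspect ratio closest to the crop's own ratio."""
    ratio = width / height
    return min(BUCKET_RATIOS, key=lambda bucket: abs(bucket - ratio))

def has_enough_pixels(width: int, height: int,
                      target_area: int = 1024 * 1024) -> bool:
    """True when the crop still covers the assumed training resolution
    (i.e. it stays 'over the green line')."""
    return width * height >= target_area

crop_w, crop_h = 1200, 820
print(snap_to_bucket(crop_w, crop_h))     # 1.5, i.e. the 3:2 bucket
print(has_enough_pixels(crop_w, crop_h))  # False, the crop is slightly too small
```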

I know it is big, so I tried to split off the first two steps; it builds on #332 and #335 (for #332 I guess #337 would be beneficial as well, although I haven't tested it).

The changes and new features are best described in the updated README: https://github.com/StableLlama/taggui/tree/crop_editor?tab=readme-ov-file#cropping-and-masking-with-markings

[Screenshot cropping.jpg: simulating a clothing LoRA dataset preparation, with the new features active.]

At this time this PR is ~~100%~~ ~~90%~~ 80% ready and I'd be happy about a first review and probably some external testing. To make it 100% ready, and to change it from a draft to a real pull request, the following is still missing:

  • Implementation of feature detection models to automatically create hints, includes and excludes
  • Testing with real world data

Additional internal changes in this PR:

  • simplify settings infrastructure

Still missing:
- color space handling
- JPEG XL (different PR)
- crop editor (different PR)
Code refactor to mostly respect a line width of 80 characters.
The advantages are smaller file sizes when lossy compression is acceptable
(quality < 100), or lossless compression with quality = 100. The alpha
channel is also supported and kept in the images.

Note 1: this only adds support for the export function, and is not general
JPEG XL support for taggui. There is the fork https://github.com/yggdrasil75/taggui
that does exactly this.

Note 2: You might need `pip install pillow-jxl-plugin` beforehand to
be able to export into JPEG XL.
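
For reference, exporting through Pillow with the plugin installed looks roughly like this (a sketch only; the exact `save()` keywords depend on the pillow-jxl-plugin version, and quality = 100 is treated as lossless as described above):

```python
# Rough sketch of a JPEG XL export via Pillow; keyword support may vary
# between pillow-jxl-plugin versions.
import pillow_jxl  # noqa: F401 -- importing registers the JXL codec with Pillow
from PIL import Image

image = Image.open('input.png')          # an alpha channel, if present, is kept
image.save('output.jxl', quality=90)     # lossy export (quality < 100)
image.save('lossless.jxl', quality=100)  # treated as lossless, as described above
```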
Also refactor the code a bit so that the target size calculation can be
used more easily in other (future) modules as well.
Also change the Settings infrastructure to send a Signal
when a setting is changed.
Also rename CustomRectItem to MarkingItem
Also refactor a bit to use QSize instead of a simple tuple
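
As a rough illustration of the Settings change mentioned above (class and organization/application names are placeholders, not the PR's actual code), a QSettings wrapper can emit a Qt Signal whenever a value changes so that other widgets react without polling:

```python
# Hypothetical sketch of settings that notify listeners on change.
from PySide6.QtCore import QObject, QSettings, Signal

class Settings(QObject):
    # Emitted with the key and the new value whenever a setting changes.
    changed = Signal(str, object)

    def __init__(self):
        super().__init__()
        # Organization/application names are placeholders.
        self._settings = QSettings('taggui', 'taggui')

    def value(self, key: str, default=None):
        return self._settings.value(key, default)

    def set_value(self, key: str, value):
        if self._settings.value(key) != value:
            self._settings.setValue(key, value)
            self.changed.emit(key, value)

# Example: react to any settings change.
# settings = Settings()
# settings.changed.connect(lambda key, value: print(key, '->', value))
```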
@StableLlama marked this pull request as ready for review on April 2, 2025, 20:05
@StableLlama
Author

This PR has significantly improved my workflow of preparing images for training.

As an example, this is how I used it for my test:

  • Collected images (I stopped with a bit more than 10k images) and saved them with a coarse structure in different directories
  • Then I used classical applications (file browser, image viewer) to condense the images to the relevant ones (the collection was for a generic concept; right now I wanted to train a specific one), which left me with 330 images
  • Using taggui I went through that list and gave manual tags to organize them better: some normal tags that will be used later for captioning and some starting with a hash for easy filtering, like #lowres. The new width:> and height:> filters helped in quickly identifying the high-resolution images (see the filter sketch after this list)
  • Using the new marking tool I let it find faces and hands, the faces directly as an exclude, the hands as a hint
  • I gave the interesting images star ratings (5* = must, 4* = should, 3* = can, 2* = if necessary, 1* = don't take it, 0* = unrated)
  • Filtering for stars:>=3 left me with 39 images. Looking at the statistics of the tag distribution, I could fine-tune where I needed more images and where I could leave some out to make the set more balanced (i.e. adjust the ratings of the 2* and 3* images)
  • Then I placed a well-aligned crop on each image that had a good enough rating. Holding Ctrl to snap to a common aspect ratio, and having the hint about how to make a crop that contains enough pixels (that is, staying over the green line), I could very easily and quickly find a good position for the crop. I think this is a game changer!
    At the same time I checked whether the excludes were properly aligned, taking the affected latents into account (i.e. the red overlay)
  • Then, going into the export dialog, I had a look at the statistics and could see that nearly all images used one of only three aspect ratios, and those even had a rather equal distribution, except for two images. Double-clicking on that line gave me a filter with exactly those two images, so I could easily change their crops to one of the common aspect ratios to get well-filled buckets for training
  • A last check with crops:hand verified that no hand was cut, as this can give bad anatomy after training
  • Then I could export the images in a format that cuts out the excludes (in this case the face) so that those parts are ignored during masked training
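
To make the filter terms used above concrete, here is an illustrative sketch of how expressions like width:>2000, stars:>=3 or crops:hand could be evaluated against per-image metadata; the field names and syntax handling are assumptions, not TagGUI's actual parser:

```python
# Illustrative only -- not TagGUI's actual filter implementation.
import operator
import re

OPS = {'>=': operator.ge, '<=': operator.le, '>': operator.gt,
       '<': operator.lt, '=': operator.eq}

def matches(image: dict, term: str) -> bool:
    """Evaluate one filter term, e.g. 'width:>2000' or 'crops:hand', against
    an image record such as {'width': 3000, 'stars': 4, 'crops': ['hand']}."""
    key, _, condition = term.partition(':')
    op_match = re.match(r'(>=|<=|>|<|=)?(.*)', condition)
    op, value = op_match.group(1), op_match.group(2)
    if op:  # numeric comparison such as width:>2000 or stars:>=3
        return OPS[op](float(image.get(key, 0)), float(value))
    return value in image.get(key, [])  # membership test, e.g. crops:hand

image = {'width': 3000, 'height': 2000, 'stars': 4, 'crops': ['hand']}
print(matches(image, 'width:>2000'))  # True
print(matches(image, 'stars:>=3'))    # True
print(matches(image, 'crops:hand'))   # True
```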

Conclusion:
Creating this PR was much more work than I thought it would be, and so it took much longer than expected - but the result justifies it. Creating training datasets will now be so much quicker that I'm sure I'll easily recoup the invested time.

@StableLlama
Author

@jhc13 this PR has now been open for a month. Have you had a chance to take a look at it?

I'm currently using it in a rather big captioning project and it's helping me a lot. So I'd be happy to see it merged soon, as I think others will benefit from it as well.

@jhc13
Owner

jhc13 commented May 6, 2025

Hi @StableLlama, I apologize for not responding to you sooner. I have been busy with other things and did not have time to work on anything related to TagGUI.

I appreciate all the time and effort you put into this and your other PRs, but I have decided not to merge this for now, for the following reasons:

  1. As I discussed in my comment on your initial PR, I consider most of these features out of scope. I have always intended for TagGUI to be a program that focuses on the text side of image datasets, rather than being a comprehensive dataset management tool.
  2. I am still very busy, so I don't have the time to properly review the code. I do plan to look at other PRs, like #336 (Add support to drag selected images to external applications), when I have time.

You may disagree with my first point, and I do understand your perspective, but it just does not align with my personal vision for the project. You are welcome to add your features to your own fork of the program. If others find them useful too, then that's great.

Thanks again for your contributions!

@StableLlama
Author

As I've written, I see these additions not as a dataset management tool (for that, far too much is missing) but as a workflow tool to prepare images for training. One aspect is tagging/captioning (that's what taggui is very good at), but another is creating masks for masked training and cropping the images. Especially for cropping, I know of no other tool that can do it as efficiently as this PR does.

But as you have suggested, I'm considering creating a fork that contains all the PRs here that I think are very beneficial for efficiently preparing images.

@Spikhalskiy

@StableLlama this is fantastic work. Please consider making a fork with a broader vision if you have the time and passion to maintain it. It will outgrow the original project if you continue to bring new features, making it a Swiss Army knife for dataset preparation.

@StableLlama
Author

@Spikhalskiy as I couldn't wait any longer, I've now created a fork. I'll call it "taggui workflow", as it supports the image preparation workflow. Next I'll pull in a few other PRs from here that are necessary for me.

As written above: I'd be more than happy to not need a fork and to have it all united here.
