04. Scoring

Image Scoring

The quality of the generated images is evaluated using one or more scoring models. The final score guides the optimization process.

Configuration (`config.yaml`)

All scoring-related settings are configured in your config.yaml file.

scorer_method: A list of scorer identifiers to use (e.g., [cityaes, clip, manual]).
scorer_average_type: How scores from different scorers are averaged for the same image (arithmetic, geometric, or quadratic).
scorer_weight: (Optional) A dictionary that assigns custom weights to scorers (e.g., {cityaes: 1.2, clip: 0.8}). Scorers not listed use the default weight of 1.0.
scorer_default_device: The default device (cpu or cuda) for running scorer models.
scorer_device: (Optional) A dictionary to override the device for specific scorers (e.g., {imagereward: cuda}).
scorer_alt_location: (Optional) A dictionary to specify custom paths for specific scorer models if they are not in the main scorer_model_dir.
scorer_filters: (Optional) A dictionary to exclude specific scorers from running on certain payloads by name.
scorer_print_individual: If True, prints each individual scorer's score to the console.
hpsv3_uncertainty_penalty: A multiplier for how much the hpsv3 scorer's uncertainty (sigma) penalizes its final score. Default is 0.5.
forensicnoise_detection_method: The detection method for the forensicnoise scorer. Can be "structural" (default) or "colored".
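
To make the interaction between scorer_average_type and scorer_weight more concrete, here is a minimal sketch of the three averaging modes as weighted means. The function name combine_scores and the exact formulas are assumptions for illustration; the actual implementation in sd_optim may differ (for example, in how it handles zero or negative scores).

```python
import math


def combine_scores(scores, weights=None, average_type="arithmetic"):
    """Combine per-scorer scores for one image into a single value.

    Hypothetical helper: `scores` maps scorer name -> score and
    `weights` maps scorer name -> weight.
    """
    weights = weights or {}
    # Scorers without an explicit weight fall back to 1.0.
    pairs = [(score, weights.get(name, 1.0)) for name, score in scores.items()]
    total_weight = sum(w for _, w in pairs)

    if average_type == "arithmetic":
        return sum(s * w for s, w in pairs) / total_weight
    if average_type == "geometric":
        # Weighted geometric mean; only defined for strictly positive scores.
        return math.exp(sum(w * math.log(s) for s, w in pairs) / total_weight)
    if average_type == "quadratic":
        # Weighted root mean square; emphasizes high scores.
        return math.sqrt(sum(w * s * s for s, w in pairs) / total_weight)
    raise ValueError(f"unknown average type: {average_type}")


if __name__ == "__main__":
    example = {"cityaes": 8.0, "clip": 2.0}
    for kind in ("arithmetic", "geometric", "quadratic"):
        print(kind, combine_scores(example, {"cityaes": 1.2, "clip": 0.8}, kind))
```

Under these assumptions, the geometric mean is pulled down hardest by a single low score, while the quadratic mean favors the highest scorer, which is worth keeping in mind when mixing scorers with different ranges.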

Scorer Categories & Descriptions

Prompt-Image Alignment (PIA): Measure how well the image matches the text prompt.
- clip: Uses OpenAI's CLIP model (ViT-L/14). General purpose, widely used.
- blip: Uses Salesforce's BLIP model. Often good at capturing finer details described in the prompt.
Aesthetic Quality: Measure the general visual appeal or predicted human preference, often independent of the prompt.
- laion: Based on the LAION Aesthetic dataset predictor. A common baseline for general aesthetics.
- chad: Originally trained by Discord users on preferred generations. Can be opinionated but often aligns with popular styles.
Hybrid (PIA + Aesthetic): Aim to capture both prompt alignment and visual appeal.
- imagereward: Trained by THUDM to predict human preferences based on prompt-image pairs.
- hpsv21: Human Preference Score v2.1.
- hpsv3: Human Preference Score v3. Returns both a score (mu) and an uncertainty value (sigma).
- pick: Based on the Pick-a-Pic dataset and model, trained on human choices between images generated from the same prompt.
Anime/Illustration Focused: Specifically trained or tuned for anime, manga, or illustrated styles.
- cityaes: CityAesthetics model (Anime variant v1.8). Often very effective for anime styles and good at identifying generation artifacts. (recommended)
- aestheticv25: Based on the improved LAION predictor v2.5. (from Euge)
- shadowv2: Aesthetic predictor from the "shadow" model series.
- cafe: Aesthetic predictor from the "cafe" model series.
- wdaes: Aesthetic predictor from the Waifu Diffusion project.
Anatomy & Composition:
- luminaflex, lumidinov2l, lumidinov2g: A family of scorers trained to detect anatomical flaws and composition issues.
Technical & Artifact Analysis:
- simplequality: A fast, model-free scorer that measures basic image quality metrics like brightness and contrast.
- gammanoise: Measures the level of gamma noise in an image.
- forensicnoise: Analyzes structural noise patterns to detect AI-generated artifacts. Requires background removal (rembg).
- backgroundblackness: Measures the percentage of pure black in the background of an image. Requires background removal (rembg).
Special & Utility:
- manual: Enables interactive scoring via the console. The user is prompted to enter a score (0-10) for each generated image.
- noai: Attempts to classify whether an image is AI-generated or real (experimental; results may vary).
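
As a rough illustration of the idea behind backgroundblackness described above, the sketch below computes the fraction of background pixels that are exactly pure black. It is not the project's implementation: the function name, the flat pixel list, and the boolean foreground mask (which the real scorer would presumably derive from rembg) are all assumptions.

```python
def black_background_fraction(pixels, foreground_mask):
    """Return the fraction of background pixels that are pure black.

    Hypothetical helper:
    pixels          -- list of (r, g, b) tuples, one per pixel
    foreground_mask -- list of bools, True where the pixel is foreground
    """
    if len(pixels) != len(foreground_mask):
        raise ValueError("pixels and foreground_mask must have the same length")

    # Keep only the pixels that the mask marks as background.
    background = [p for p, is_fg in zip(pixels, foreground_mask) if not is_fg]
    if not background:
        # Assumption: an image with no background scores 0.0.
        return 0.0

    pure_black = sum(1 for p in background if p == (0, 0, 0))
    return pure_black / len(background)


if __name__ == "__main__":
    pixels = [(0, 0, 0), (0, 0, 0), (10, 0, 0), (255, 255, 255)]
    mask = [False, False, False, True]
    print(black_background_fraction(pixels, mask))
```

In this toy input, two of the three background pixels are exactly black; the near-black pixel (10, 0, 0) does not count, which is the strictness the scorer description implies.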

Special Scorer Behavior

manual Mode:
- When an image is shown for manual scoring, you can type OVERRIDE_SCORE in the console. This will interrupt the current iteration and prompt you to enter a final average score for the entire iteration, bypassing all other scorers.
hpsv3 Scorer:
- This scorer returns both a score (mu) and an uncertainty value (sigma). The final score is calculated as mu - (k * sigma), where k is the value of hpsv3_uncertainty_penalty.
- You can control the uncertainty penalty with the hpsv3_uncertainty_penalty setting in config.yaml (default 0.5). A higher value means a stronger penalty for uncertainty.
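
The hpsv3 formula above can be sketched in a few lines. The function name hpsv3_final_score is hypothetical; only the formula mu - (k * sigma) and the default penalty of 0.5 come from this page.

```python
def hpsv3_final_score(mu, sigma, uncertainty_penalty=0.5):
    """Penalize the raw hpsv3 score (mu) by its uncertainty (sigma).

    `uncertainty_penalty` plays the role of k and corresponds to the
    hpsv3_uncertainty_penalty setting in config.yaml.
    """
    return mu - uncertainty_penalty * sigma


if __name__ == "__main__":
    # Same raw score, different uncertainty: the less certain result scores lower.
    print(hpsv3_final_score(8.0, 1.0))  # 7.5
    print(hpsv3_final_score(8.0, 2.0))  # 7.0
    # Setting the penalty to 0 ignores sigma entirely.
    print(hpsv3_final_score(8.0, 2.0, uncertainty_penalty=0.0))  # 8.0
```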
forensicnoise Scorer:
- This scorer has a configurable detection method. Set the forensicnoise_detection_method key in config.yaml to "structural" (default) or "colored" to change its analysis mode.

Note

Some scorers, particularly the technical ones like simplequality, have internal parameters (e.g., sharpness thresholds, weights) that are not currently exposed in config.yaml. To adjust these, you would need to modify the _load_all_models method in sd_optim/scorer.py to pass them during instantiation.

Scorer Setup Recommendations

General Purpose: A good starting point is a mix of aesthetic and prompt-alignment scorers, such as [cityaes, hpsv3, clip].
Anime Focus: Prioritize anime-specific scorers. [cityaes, shadowv2, aestheticv25] is a strong combination.
Colors: Consider adding backgroundblackness to penalize models that don't generate pure black (#000000) backgrounds.
Artifacts: [simplequality, gammanoise, forensicnoise] can help penalize outputs that contain noise or other generation artifacts.

General Tips:

Choose scorers relevant to your goal. If merging anime models, prioritize anime-focused scorers. If aiming for photorealism matching a complex prompt, prioritize PIA scorers.
Start with fewer scorers (1-3) to understand their individual impact before combining many.
Adjust weights in scorer_weight to emphasize scorers that best reflect your desired outcome.
Use manual scoring for a few iterations initially to get a feel for the model's output and provide direct feedback, even if you switch to automatic later.
