06. Merge Methods

imi edited this page Aug 16, 2025 · 1 revision

Merge Methods

This page provides details on some of the custom merge methods available in sd-optim, which can be set via the merge_method key in config.yaml. These methods are built on the sd-mecha library.

Note: This page does not cover every method. Many experimental methods make it into the repo, so functionality is not guaranteed. For built-in methods, refer to the sd-mecha documentation.

multi_domain_alignment

Merges tensors A and B using multi-domain alignment with an anchor tensor C for guidance. This method combines spatial and frequency domain information to create more robust merges.

Key Steps

  1. Frequency-Selective Alignment: Aligns the frequency distributions of A and B, guided by the anchor C, to preserve global features.
  2. Cross-Attention (Optional): Calculates feature importance weights using cross-attention between A, B, and C to emphasize consistently important features.
  3. Dissimilarity Calculation: Measures the spatial dissimilarity between A and B, guided by the anchor C.
  4. Adaptive Interpolation: Merges A and B using slerp (spherical linear interpolation), with alpha values adaptively adjusted based on feature importance and dissimilarity.
  5. Anchor Adjustment: Fine-tunes the merged tensor towards the anchor C to enhance consistency and preserve anchor characteristics.
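
As a rough illustration of steps 4 and 5 only, the sketch below applies slerp between two flat vectors and then pulls the result toward the anchor. It is not the sd-optim implementation: the function names `slerp` and `merge_sketch` are hypothetical, the anchor adjustment is assumed to be a plain linear pull of strength beta, and the adaptive adjustment of alpha (steps 1 to 3) is left out, so alpha is used as given.

```python
import math


def slerp(a, b, t, eps=1e-8):
    """Spherical linear interpolation between two flat vectors (lists of floats)."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a < eps or norm_b < eps:
        # A zero vector has no direction, so fall back to linear interpolation.
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    cos_omega = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    cos_omega = max(-1.0, min(1.0, cos_omega))  # guard against rounding error
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if sin_omega < eps:
        # (Anti)parallel vectors: slerp weights are ill-defined, use lerp instead.
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    weight_a = math.sin((1 - t) * omega) / sin_omega
    weight_b = math.sin(t * omega) / sin_omega
    return [weight_a * x + weight_b * y for x, y in zip(a, b)]


def merge_sketch(a, b, c, alpha, beta):
    """Slerp a and b by alpha, then move the result toward anchor c by beta.

    Assumption: the anchor adjustment is modelled as a linear pull; the real
    method may adjust the merged tensor differently.
    """
    merged = slerp(a, b, alpha)
    return [(1 - beta) * m + beta * z for m, z in zip(merged, c)]
```

With beta set to 0 the anchor has no effect and the result is the plain slerp of A and B; with beta set to 1 the result collapses onto the anchor, which is why small beta values are the more natural starting point in this simplified model.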

Parameters

  • a (Tensor): The first tensor to merge.
  • b (Tensor): The second tensor to merge.
  • c (Tensor): The anchor tensor, used as a reference.
  • alpha (float): The base interpolation factor for slerp (0 to 1).
  • beta (float): The strength of the anchor adjustment (0 to 1).
  • kernel_size (int): Size of the Gaussian kernel for smoothing (must be odd).
  • centroid_margin_factor (float): Controls the width of the transition zone between frequency bands.
  • frequency_weight (float): Weight given to the frequency-domain contribution (0 to 1).
  • use_cross_attention (bool): Whether to use cross-attention for feature importance.
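
The constraints listed above can be checked before a merge is started. The following is a minimal sketch, assuming a hypothetical `MultiDomainAlignmentParams` container that is not part of sd-optim or sd-mecha; it enforces only the ranges stated on this page (0 to 1 for alpha, beta, and frequency_weight, and an odd kernel_size) and leaves centroid_margin_factor unchecked because no range is documented for it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MultiDomainAlignmentParams:
    """Hypothetical holder for the scalar parameters of multi_domain_alignment."""

    alpha: float
    beta: float
    kernel_size: int
    centroid_margin_factor: float
    frequency_weight: float
    use_cross_attention: bool

    def __post_init__(self):
        # alpha, beta and frequency_weight are documented as lying in [0, 1].
        for name in ("alpha", "beta", "frequency_weight"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be between 0 and 1, got {value}")
        # The Gaussian kernel size is documented as odd.
        if self.kernel_size < 1 or self.kernel_size % 2 == 0:
            raise ValueError(
                f"kernel_size must be a positive odd integer, got {self.kernel_size}"
            )
```

Validating up front like this gives a clear error message for an even kernel_size or an out-of-range weight instead of a failure somewhere inside the merge.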

Recommendations

  • For stable, conservative merges:
    • kernel_size: 3
    • centroid_margin_factor: 0.1
    • frequency_weight: 0.3
  • For detail preservation:
    • kernel_size: 3
    • centroid_margin_factor: 0.08
    • frequency_weight: 0.4
  • For smoother blending:
    • kernel_size: 5
    • centroid_margin_factor: 0.15
    • frequency_weight: 0.25
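
The three recommendations above can be kept as named presets so they are easy to switch between. This is only a sketch: the preset names (`conservative`, `detail`, `smooth`) and the `preset_kwargs` helper are made up for illustration, and how the resulting values are passed into config.yaml or sd-mecha is not shown here.

```python
# Values copied from the recommendations on this page; the keys are
# illustrative names, not identifiers recognised by sd-optim.
PRESETS = {
    "conservative": {"kernel_size": 3, "centroid_margin_factor": 0.1, "frequency_weight": 0.3},
    "detail": {"kernel_size": 3, "centroid_margin_factor": 0.08, "frequency_weight": 0.4},
    "smooth": {"kernel_size": 5, "centroid_margin_factor": 0.15, "frequency_weight": 0.25},
}


def preset_kwargs(name, **overrides):
    """Return a copy of the named preset with any overrides applied on top."""
    if name not in PRESETS:
        raise KeyError(f"unknown preset {name!r}; choose from {sorted(PRESETS)}")
    params = dict(PRESETS[name])  # copy so the shared preset is never mutated
    params.update(overrides)
    return params
```

For example, `preset_kwargs("smooth", frequency_weight=0.3)` starts from the smoother-blending values and changes only the frequency weight.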

More to be added