
08. Advanced Usage

imi edited this page Aug 16, 2025 · 1 revision

This page covers advanced features for users who are comfortable with the basic workflow of sd-optim.

Recipe Optimization

For complex merges, you can optimize parameters directly within an existing .mecha recipe file instead of using a single merge method.

Configuration (config.yaml):

  1. Set optimization_mode: recipe.
  2. Configure the recipe_optimization block:
    • recipe_path: The full path to your .mecha file.
    • target_nodes: The reference to the node(s) you want to optimize (e.g., '&8' or ['&8', '&12']).
    • target_params: A list of the parameter names within that node to optimize (e.g., [alpha, beta]).

The optimizer will then vary the specified parameters within the target node(s) of your recipe to find the highest-scoring values.
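Putting the two steps together, a config.yaml fragment for recipe optimization might look like this (the path and node references are illustrative):

```yaml
# config.yaml (illustrative values)
optimization_mode: recipe

recipe_optimization:
  recipe_path: /path/to/my_merge.mecha   # full path to the .mecha recipe
  target_nodes: ['&8', '&12']            # node reference(s) inside the recipe
  target_params: [alpha, beta]           # parameters of those nodes to optimize
```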

Extensibility

Because it builds on sd-mecha, sd-optim is designed to be extensible. You can add your own block definitions, conversion scripts, and even new merge methods.

Custom Block Definitions

You can define custom groupings of model keys (layers) to target them specifically during optimization.

  1. Create a Block YAML file: Create a .yaml file following the sd-mecha ModelConfig format. Give it a unique identifier. For a complete example, see sd_optim/model_configs/sdxl-optim_blocks.yaml.
  2. Place the file: Put this file in the directory specified by configs_dir in your config.yaml.
  3. Use in Guide: In your optimization_guide.yaml, set the custom_block_config_id to your new identifier and use target_type: block in your strategies to reference the block names.
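As a sketch, the pieces the three steps touch might look like this (the identifier and key layout here are hypothetical; see the bundled sdxl-optim_blocks.yaml for the real format):

```yaml
# config.yaml
configs_dir: ./sd_optim/model_configs     # where your block YAML lives

# optimization_guide.yaml
custom_block_config_id: sdxl-optim_blocks # identifier from your block YAML
strategies:
  - target_type: block                    # optimize per block, not per key
    # block names defined in your custom YAML go here
```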

Custom Conversion Scripts

Conversion scripts are a special type of merge method that translates between different ModelConfig formats. They are essential for using custom block definitions.

  1. Create a Python script: Create a .py file containing your conversion function. The function must be decorated with @sd_mecha.merge_method(..., is_conversion=True). For a complete example, see sd_optim/model_configs/convert_sdxl_optim_blocks.py.
  2. Place the file: Put this script in the directory specified by conversion_dir in your config.yaml. The script will be loaded automatically.
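Stripped of the sd-mecha decorator machinery, the heart of a conversion script is a key mapping between two ModelConfig namespaces. A minimal standalone illustration of that idea (the prefixes and block names below are hypothetical, not sd-mecha's real SDXL keys):

```python
# Standalone sketch of the key-translation idea behind a conversion script.
# A real conversion function would be decorated with
# @sd_mecha.merge_method(..., is_conversion=True) and operate on sd-mecha types.

BLOCK_PREFIXES = {
    "model.diffusion_model.input_blocks.": "IN",
    "model.diffusion_model.middle_block.": "MID",
    "model.diffusion_model.output_blocks.": "OUT",
}

def key_to_block(key: str) -> str:
    """Map a raw state-dict key to a coarse block name (hypothetical scheme)."""
    for prefix, block in BLOCK_PREFIXES.items():
        if key.startswith(prefix):
            return block
    return "OTHER"
```

With this mapping, every key sharing a prefix resolves to one block name, which is what lets an optimizer treat a whole block as a single tunable unit.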

Custom Merge Methods

You can add your own merge algorithms to the optimizer.

  1. Edit merge_methods.py: Open the sd_optim/merge_methods.py file.
  2. Add your function: Add your new merge method as a static method inside the MergeMethods class. It must be decorated with @merge_method and have the correct Parameter and Return type hints from sd-mecha.
  3. Use in Config: Set the merge_method in config.yaml to the identifier you gave your function in the decorator.
# Example of a custom merge method in sd_optim/merge_methods.py
# (Tensor is torch's tensor type; merge_method, Parameter, and Return come from sd-mecha)
class MergeMethods:

    @merge_method
    def pop_lora(
            a: Parameter(Tensor, "weight"),
            b: Parameter(Tensor, "weight"),
            *,
            alpha: Parameter(Tensor) = 0.5,
            rank_ratio: Parameter(float) = 0.25,
            # ... other parameters
    ) -> Return(Tensor, "weight"):
        """
        Pivoted Orthogonal Projection
        Merges tensors 'a' and 'b' using pivoted QR, projection, and a low-rank approximation.
        """
        # ... implementation ...
        return merged_tensor

Reproducibility: Merge Artifacts

To ensure you can perfectly reproduce a specific merge from an optimization run, sd-optim can generate a self-contained Python script for each iteration.

Note: This feature relies on complex code (perhaps needlessly so) and may still contain undiscovered errors.

  • Enable: Set save_merge_artifacts: True in your config.yaml.
  • Output: For each iteration, a _run.py script will be saved in the logs/RUN_FOLDER/merge_artifacts/ directory.
  • Usage: This script contains the exact recipe, custom code, and settings used for that merge. You can run it with python FILENAME_run.py to generate the exact same model file again, independent of sd-optim.

Optuna Study Management

For long-running experiments, you can resume or fork previous Optuna studies. Studies are stored as .db files in the storage_dir (default: optuna_db/).

  • Resuming a Study (resume_from_study):
    • To continue a previous run, set resume_from_study in config.yaml to the name of the study you want to resume (e.g., "run_20250612_191003_tpe").
    • The optimizer will load the existing .db file and continue adding trials.
    • Important: You cannot change the scorer_method when resuming a study directly.
  • Forking a Study (fork_study: True):
    • Use this if you want to start a new experiment based on a previous one (e.g., with different scorers).
    • Set resume_from_study to the parent study name and set fork_study: True.
    • A new study and .db file will be created. All successful trials from the parent study will be enqueued at the start of the new run, giving the optimizer a warm start before it begins generating new trials.
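For example, to fork a new experiment from a previous run (the study name is illustrative):

```yaml
# config.yaml (illustrative study name)
storage_dir: optuna_db/                     # where the .db files live
resume_from_study: run_20250612_191003_tpe  # parent study to load
fork_study: True                            # enqueue its trials into a fresh study
```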
