Problem
The current stage types (prune, train, lora, quant, eval, etc.) are hardcoded for LLM forging. But forge-alloy is meant to be a universal pipeline contract for ANY model domain — vision, audio, diffusion, robotics, whatever.
Current (LLM-only)
| Stage | Domain |
| --- | --- |
| prune | LLM |
| train | LLM |
| lora | LLM |
| compact | LLM |
| quant | LLM (GGUF, MLX) |
| eval | LLM (HumanEval, MMLU) |
| expert-prune | LLM (MoE) |
| context-extend | LLM (RoPE) |
| modality | LLM (add encoder) |
The Beyond
Forge-alloy should support domain-specific stage types as extensions of a base stage system. The schema should define WHAT a stage is (input/transform/output with typed config) and let domains register their own.
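To make "input/transform/output with typed config" concrete, here is a minimal sketch of what a domain-agnostic stage description might look like. Everything in it is an assumption made for illustration: `StageSpec`, its field names, and the `augment` config keys are hypothetical and do not come from forge-alloy's current schema.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass
class StageSpec:
    """Hypothetical domain-agnostic description of one pipeline stage."""

    domain: str                        # e.g. "llm", "vision"
    name: str                          # e.g. "prune", "augment"
    input_type: str                    # kind of artifact the stage consumes
    output_type: str                   # kind of artifact the stage produces
    config_types: Mapping[str, type]   # typed config: key -> expected type

    def validate_config(self, config: Mapping[str, Any]) -> None:
        """Raise if config has unknown keys, missing keys, or wrong types."""
        unknown = set(config) - set(self.config_types)
        if unknown:
            raise ValueError(f"unknown config keys: {sorted(unknown)}")
        for key, expected in self.config_types.items():
            if key not in config:
                raise ValueError(f"missing config key: {key}")
            if not isinstance(config[key], expected):
                raise TypeError(f"{key} must be {expected.__name__}")


# Illustrative vision-domain stage; the config keys are made up.
augment = StageSpec(
    domain="vision",
    name="augment",
    input_type="dataset",
    output_type="dataset",
    config_types={"crop": bool, "rotate_degrees": float},
)
augment.validate_config({"crop": True, "rotate_degrees": 15.0})
```

The point of the sketch is that the base system only needs to know the shape of a stage. What `augment` or `prune` actually does stays inside the domain that registers it.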
Camera/Vision domain stages
augment — data augmentation (crop, rotate, color jitter)
backbone-swap — replace feature extractor (ResNet → EfficientNet)
detection-head — add/modify object detection head (YOLO, DETR)
calibrate — camera calibration, lens correction
eval — mAP, IoU, FPS benchmarks (not HumanEval)
Diffusion domain stages
unet-prune — prune U-Net attention
scheduler-swap — DDPM → DPM++ → Euler
vae-tune — fine-tune VAE decoder
eval — FID, CLIP score, aesthetic score
Audio domain stages
codec-swap — replace audio codec (Encodec → DAC)
speaker-adapt — voice cloning LoRA
eval — WER, MOS, speaker similarity
Robotics domain stages
sim-to-real — domain adaptation from simulator
policy-distill — compress large policy to deployable size
eval — task success rate, latency, safety bounds
Design
The stage type system should be a registry, not a hardcoded enum:
{
  "domains": {
    "llm": { "stages": ["prune", "train", "lora", "quant", "eval", ...] },
    "vision": { "stages": ["augment", "backbone-swap", "detection-head", ...] },
    "diffusion": { "stages": ["unet-prune", "scheduler-swap", "vae-tune", ...] }
  }
}
Each domain registers its stage types with their schemas. The alloy executor discovers domain-specific executors at runtime.
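One way the registry idea could look in code is sketched below. It is not forge-alloy's implementation: `StageRegistry`, `register`, `resolve`, and the two example executors are hypothetical names chosen for illustration. Keying on `(domain, stage)` lets every domain define its own `eval` without colliding with the others.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical executor signature: takes a stage config, returns a result.
Executor = Callable[[dict], dict]


class StageRegistry:
    """Maps (domain, stage) pairs to executors instead of a fixed enum."""

    def __init__(self) -> None:
        self._executors: Dict[Tuple[str, str], Executor] = {}

    def register(self, domain: str, stage: str) -> Callable[[Executor], Executor]:
        """Decorator that registers an executor for one domain-specific stage."""
        def decorator(fn: Executor) -> Executor:
            key = (domain, stage)
            if key in self._executors:
                raise ValueError(f"{domain}/{stage} is already registered")
            self._executors[key] = fn
            return fn
        return decorator

    def stages(self, domain: str) -> List[str]:
        """List the stage names registered for a domain."""
        return sorted(s for d, s in self._executors if d == domain)

    def resolve(self, domain: str, stage: str) -> Executor:
        """Look up the executor at runtime; fail loudly if none is registered."""
        try:
            return self._executors[(domain, stage)]
        except KeyError:
            raise KeyError(f"no executor registered for {domain}/{stage}") from None


registry = StageRegistry()


@registry.register("llm", "eval")
def llm_eval(config: dict) -> dict:
    # Placeholder body: a real executor would run the benchmarks.
    return {"domain": "llm", "benchmarks": config.get("benchmarks", [])}


@registry.register("vision", "eval")
def vision_eval(config: dict) -> dict:
    # Placeholder body: a real executor would compute the metrics.
    return {"domain": "vision", "metrics": config.get("metrics", [])}


result = registry.resolve("vision", "eval")({"metrics": ["mAP", "IoU"]})
```

In this sketch, registration happens when the module defining the executors is imported. A real executor might instead discover domain packages through a plugin mechanism such as Python entry points, but that choice is outside what the issue specifies.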
For Now
The README should state that the current stages belong to the LLM domain and that forge-alloy is designed to be extended to other domains. Don't promise what doesn't exist yet, but plant the flag.
Priority
Low for implementation. High for README/docs positioning.