Feat/video models by CalamitousFelicitousness · Pull Request #4970 · vladmandic/sdnext

CalamitousFelicitousness · 2026-06-30T22:03:39Z

Description

Adds first-last-frame (FLF2V) video for Wan 2.2 I2V and LTX, and allows the Wan 2.2 A14B mixture-of-experts boundary to be tuned at runtime (in both the video tab and the base-model image path) rather than only at model load.

Notes

Boundary: The slider is applied at generation time in set_pipeline_args, the single hook both image and video generation pass through before invoking the pipeline, so changing it takes effect on the next generation with no model reload. The default is now -1, which uses the value from the model's config. The block is a no-op unless the model has both experts resident and a boundary_ratio in its config (it mirrors the SDXL register_to_config already in that function).
Wan FLF: the I2V path forwards last_image when the pipeline accepts it and does not use expand_timesteps (the 5B masks to the first frame); otherwise it warns. It uses the same base I2V-A14B weights, matching ComfyUI's FLF workflow.
LTX FLF: the last-frame image is conditioned at the final frame index (index=-1 for 2.x, num_frames-1 for 0.9) instead of frame 0. The Last image input shows only for Condition pipelines. I'll probably collapse it into I2V; thinking back, it wasn't really worth splitting.

Environment and Testing

Linux (WSL2), Python 3.13, CUDA, RTX 3090

…ideo paths

Wan 2.2 A14B ships a per-model boundary_ratio (0.9 I2V, 0.875 T2V) that selects the high- or low-noise expert per step. The video and base-model image loaders both load the shipped value; the slider override is applied at generation time in set_pipeline_args, the one point both paths pass through before invoking the pipeline. The denoising loop reads config.boundary_ratio on each call, so tuning takes effect with no reload for video and base-model images alike. The slider defaults to -1, meaning use the model's value; values from 0 to 1 set the boundary explicitly. Single-expert stages remain a load-time choice because they drop a transformer to free VRAM.
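A rough sketch of the gating described above, not the code from this PR. FakePipe is a stand-in so the example runs without loading a model, and apply_boundary and pick_expert are hypothetical names. The attribute names transformer and transformer_2, the 1000-step training schedule, and the "timestep at or above the boundary uses the high-noise expert" comparison are assumptions about how the pipeline behaves.

```python
from types import SimpleNamespace


class FakePipe:
    """Stand-in pipeline for illustration only (not a real diffusers class)."""

    def __init__(self, boundary_ratio, two_experts=True):
        self.config = SimpleNamespace(boundary_ratio=boundary_ratio)
        self.transformer = object()                           # assumed: high-noise expert
        self.transformer_2 = object() if two_experts else None  # assumed: low-noise expert

    def register_to_config(self, **kwargs):
        # Mimics updating config values in place.
        for key, value in kwargs.items():
            setattr(self.config, key, value)


def apply_boundary(pipe, slider):
    """Apply the slider override; return True only when the config was changed."""
    if slider is None or not 0 <= slider <= 1:
        return False  # -1 (the default) keeps the value shipped with the model
    if getattr(pipe, "transformer", None) is None or getattr(pipe, "transformer_2", None) is None:
        return False  # single-expert stage: nothing to switch between
    if getattr(pipe.config, "boundary_ratio", None) is None:
        return False  # model has no boundary to tune
    pipe.register_to_config(boundary_ratio=float(slider))
    return True


def pick_expert(pipe, timestep, num_train_timesteps=1000):
    """Assumed selection rule: high-noise expert at or above the boundary timestep."""
    boundary_timestep = pipe.config.boundary_ratio * num_train_timesteps
    return "high_noise" if timestep >= boundary_timestep else "low_noise"
```

Because the selection reads the config on every call, a changed slider value shifts which steps go to each expert on the next generation without touching the loaded weights.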

The I2V path forwards a last-frame image to the pipeline when one is supplied and the loaded pipeline accepts it, turning the run into first-last-frame interpolation. supports_last_frame() gates on the pipeline taking a last_image argument and not running expand_timesteps, which conditions on the first frame only, so a model that cannot use a last frame logs a warning instead of silently ignoring it.
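The check could look roughly like the following. This is a sketch under assumptions, not the PR's implementation: the real supports_last_frame() signature is not shown here, build_i2v_kwargs and the three stand-in pipeline classes are hypothetical, and expand_timesteps is assumed to live on pipe.config.

```python
import inspect
import logging
from types import SimpleNamespace

log = logging.getLogger("flf-sketch")


def supports_last_frame(pipe) -> bool:
    """True when the pipeline call accepts last_image and does not expand timesteps."""
    accepts = "last_image" in inspect.signature(pipe.__call__).parameters
    config = getattr(pipe, "config", None)
    expands = bool(getattr(config, "expand_timesteps", False))
    return accepts and not expands


def build_i2v_kwargs(pipe, first_image, last_image=None):
    """Forward last_image only when it will actually be used; warn otherwise."""
    kwargs = {"image": first_image}
    if last_image is not None:
        if supports_last_frame(pipe):
            kwargs["last_image"] = last_image
        else:
            log.warning("last frame supplied but the loaded model cannot use it")
    return kwargs


# Hypothetical stand-ins for three kinds of pipeline.
class FlfPipe:
    config = SimpleNamespace(expand_timesteps=False)

    def __call__(self, image=None, last_image=None, prompt=None):
        return None


class FirstFrameOnlyPipe:
    config = SimpleNamespace(expand_timesteps=True)

    def __call__(self, image=None, last_image=None, prompt=None):
        return None


class NoLastImagePipe:
    config = SimpleNamespace()

    def __call__(self, image=None, prompt=None):
        return None
```

Inspecting the call signature rather than checking the class name keeps the gate working for any pipeline that later gains a last_image parameter.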

The LTX tab already collected a last-frame image but anchored every condition at index 0, so it never acted as a last frame. Build a separate condition for it at the final frame: index -1 for the 2.x family (latent index, negatives wrap) and num_frames-1 for 0.9 (pixel index). The Last image input now shows only for Condition models, the pipelines that accept multi-frame conditioning.
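The index choice can be illustrated with a small sketch. The function names, the family strings "2.x" and "0.9", and the dict layout are all hypothetical; the real code builds the pipeline's own condition objects, which are not reproduced here.

```python
def last_frame_index(family: str, num_frames: int) -> int:
    """Pick the index that anchors a condition at the final frame."""
    if family == "2.x":
        return -1              # latent index; negative values wrap to the end
    if family == "0.9":
        return num_frames - 1  # pixel-frame index of the last frame
    raise ValueError(f"unknown LTX family: {family}")


def build_conditions(first_image, last_image, family: str, num_frames: int):
    """Build one condition per supplied image instead of anchoring both at 0."""
    conditions = []
    if first_image is not None:
        conditions.append({"image": first_image, "index": 0})
    if last_image is not None:
        conditions.append(
            {"image": last_image, "index": last_frame_index(family, num_frames)}
        )
    return conditions
```

For example, with 97 frames the 0.9 family places the last image at index 96, while the 2.x family always uses -1 regardless of the frame count.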

CalamitousFelicitousness added 3 commits June 30, 2026 23:00

vladmandic merged commit b3cf481 into dev Jul 1, 2026
2 checks passed

vladmandic deleted the feat/video-models branch July 1, 2026 09:12

vladmandic merged 3 commits into dev from feat/video-models
