Feat/video models #4970
Merged
…ideo paths Wan 2.2 A14B ships a per-model boundary_ratio (0.9 I2V, 0.875 T2V) that selects the high- or low-noise expert per step. The video and base-model image loaders both load the shipped value; the slider override is applied at generation time in set_pipeline_args, the one point both paths pass through before invoking the pipeline. The denoising loop reads config.boundary_ratio each call, so tuning takes effect with no reload for video and base-model images alike. The slider defaults to -1, meaning use the model's value; 0 to 1 set the boundary explicitly. Single-expert stages stay load-time because they drop a transformer to free VRAM.
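The expert selection described above can be sketched as follows. This is a minimal illustration, not the code from this PR: the helper names `effective_boundary_ratio` and `pick_expert` are hypothetical, and the 1000-step training range and the "timestep at or above the boundary goes to the high-noise expert" comparison are assumptions about how the pipeline uses `boundary_ratio`.

```python
def effective_boundary_ratio(shipped_ratio: float, slider_value: float) -> float:
    """Resolve the ratio to use for this generation.

    A negative slider value (the -1 default) means "use the model's shipped value";
    anything else is treated as an explicit boundary, capped at 1.0.
    """
    if slider_value < 0:
        return shipped_ratio
    return min(float(slider_value), 1.0)


def pick_expert(boundary_ratio: float, timestep: float, num_train_timesteps: int = 1000) -> str:
    """Choose the expert for one denoising step (assumed comparison direction)."""
    boundary_timestep = boundary_ratio * num_train_timesteps
    return "high_noise" if timestep >= boundary_timestep else "low_noise"


if __name__ == "__main__":
    ratio = effective_boundary_ratio(shipped_ratio=0.875, slider_value=-1)
    for t in (950, 900, 800, 100):
        print(t, pick_expert(ratio, t))
```

Under these assumptions, the shipped T2V value of 0.875 puts the boundary at timestep 875, so a step at 900 goes to the high-noise expert and a step at 800 to the low-noise one. Because the ratio is resolved per call, changing the slider only changes which branch later steps take.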
The I2V path forwards a last-frame image to the pipeline when one is supplied and the loaded pipeline accepts it, turning the run into first-last-frame interpolation. supports_last_frame() gates on the pipeline taking a last_image argument and not running expand_timesteps, which conditions on the first frame only, so a model that cannot use a last frame logs a warning instead of silently ignoring it.
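A gate like the one described could look roughly like this. It is a sketch under assumptions: the real supports_last_frame() may be implemented differently, the `expand_timesteps` flag is assumed to live on `pipe.config`, and the `FakeFLFPipeline` and `Fake5BPipeline` classes are stand-ins invented for the demonstration.

```python
import inspect
from types import SimpleNamespace


def supports_last_frame(pipe) -> bool:
    """True if the pipeline's __call__ takes last_image and it is not an expand_timesteps model."""
    params = inspect.signature(pipe.__call__).parameters
    accepts_last_image = "last_image" in params
    # Assumption: the flag is exposed on pipe.config; a missing flag is treated as False.
    expand = bool(getattr(getattr(pipe, "config", None), "expand_timesteps", False))
    return accepts_last_image and not expand


class FakeFLFPipeline:
    """Stand-in for a pipeline that can use a last frame."""
    config = SimpleNamespace(expand_timesteps=False)

    def __call__(self, image=None, last_image=None, prompt=""):
        return None


class Fake5BPipeline:
    """Stand-in for a pipeline that conditions on the first frame only."""
    config = SimpleNamespace(expand_timesteps=True)

    def __call__(self, image=None, last_image=None, prompt=""):
        return None


if __name__ == "__main__":
    for pipe in (FakeFLFPipeline(), Fake5BPipeline()):
        if not supports_last_frame(pipe):
            print(f"warning: {type(pipe).__name__} cannot use a last frame; ignoring it")
```

Checking the call signature rather than the class name keeps the gate independent of which concrete pipeline class is loaded.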
The LTX tab already collected a last-frame image but anchored every condition at index 0, so it never acted as a last frame. Build a separate condition for it at the final frame: index -1 for the 2.x family (latent index, negatives wrap) and num_frames-1 for 0.9 (pixel index). The Last image input now shows only for Condition models, the pipelines that accept multi-frame conditioning.
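The index choice for the last-frame condition can be illustrated with a small helper. The function name `last_frame_index`, the family strings, and the dict-shaped conditions are hypothetical and only mirror the rule stated above; they are not the pipeline's actual condition objects.

```python
def last_frame_index(family: str, num_frames: int) -> int:
    """Frame index at which to anchor the last-frame condition.

    "2.x": latent index, where -1 wraps to the final latent frame.
    "0.9": pixel index, so the final frame is num_frames - 1.
    """
    if family == "2.x":
        return -1
    if family == "0.9":
        return num_frames - 1
    raise ValueError(f"unknown LTX family: {family!r}")


def build_conditions(first_image, last_image, family: str, num_frames: int) -> list[dict]:
    """Build one condition per supplied image (plain dicts used for illustration)."""
    conditions = []
    if first_image is not None:
        conditions.append({"image": first_image, "frame_index": 0})
    if last_image is not None:
        conditions.append({"image": last_image, "frame_index": last_frame_index(family, num_frames)})
    return conditions


if __name__ == "__main__":
    print(build_conditions("first.png", "last.png", family="0.9", num_frames=97))
    print(build_conditions("first.png", "last.png", family="2.x", num_frames=97))
```

The point of the separate condition is that the last image gets its own anchor instead of sharing index 0 with the first image.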
Description
Adds first-last-frame (FLF2V) video for Wan 2.2 I2V and LTX, and allows the Wan 2.2 A14B mixture-of-experts boundary to be tuned at runtime (in both the video tab and the base-model image path) rather than only at model load.
Notes
The boundary override is applied in set_pipeline_args, the single hook both image and video generation pass through before invoking the pipeline, so changing it takes effect on the next generation with no model reload. The slider defaults to -1, which uses the value in the model's config. The block is a no-op unless the model has both experts resident and a boundary_ratio in its config (mirrors the SDXL register_to_config already in that function).
The Wan I2V path forwards last_image when the pipeline accepts it and does not use expand_timesteps (the 5B masks to the first frame); otherwise it warns. It uses the same base I2V-A14B weights, matching ComfyUI's FLF workflow.
The LTX path anchors the last-frame condition at the final frame (index=-1 for 2.x, num_frames-1 for 0.9) instead of frame 0. The Last image input shows only for Condition pipelines. I'll probably collapse it into I2V; thinking back, it wasn't really worth splitting.
Environment and Testing