feat: Add deterministic inference pipeline for reproducible video generation #28
Open
q5sys wants to merge 1 commit into Lightricks:main from
Conversation
Author
@michaellightricks I see you gave this a thumbs up. Anything I can do to improve it or otherwise help in getting this across the line? I'm happy to help however I can.
Problem:
Running LTX-2 inference with identical inputs (same seed, prompt, settings) produced different video outputs between runs. This violated the expectation that a fixed seed should yield reproducible results.
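The reported behavior can be checked mechanically by hashing the outputs of two runs with identical settings. A minimal stdlib-only sketch (the file names are illustrative, taken from the reference files below):

```python
import hashlib

def file_digest(path: str) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large video files don't load into RAM at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

# Two runs are bitwise-reproducible iff the digests match, e.g.:
# file_digest("hotdog-gpu0-1.mp4") == file_digest("hotdog-gpu0-2.mp4")
```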
See the links below for example outputs demonstrating this.
Root Cause:
Best I can tell, PyTorch and CUDA prioritize performance over bitwise determinism by default: scaled_dot_product_attention uses non-deterministic fused kernels, and cuBLAS may use non-deterministic algorithms.
Solution:
Added a TI2VidTwoStagesPipelineDeterministic class with an enable_deterministic_mode() method that enforces reproducibility by disabling the non-deterministic kernel and algorithm selection described above.
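The diff itself isn't reproduced here, but the settings such a method typically applies follow directly from the root cause above. A hedged sketch (the method name matches the PR; the exact body is an assumption, not the PR's actual code):

```python
import os
import random

import numpy as np
import torch

def enable_deterministic_mode(seed: int = 42) -> None:
    """Sketch of a deterministic-mode switch (assumed implementation).

    Forces PyTorch/CUDA to prefer reproducible algorithms over the
    faster non-deterministic defaults noted in the root cause.
    """
    # cuBLAS selects deterministic algorithms only when this env var is
    # set, and it must be set before the first cuBLAS call.
    os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"

    # Seed every RNG that can influence sampling.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # no-op on CPU-only machines

    # Make PyTorch use deterministic kernels (and raise on ops that
    # have none), covering the fused scaled_dot_product_attention paths.
    torch.use_deterministic_algorithms(True)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
```

Calling this once before model construction is the usual pattern; the modest slowdown reported below is the expected cost of the deterministic kernels.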
Performance Impact (RTX 5090 benchmark):
Overall, ~1-2 seconds of additional time for a ~27 second generation. The original TI2VidTwoStagesPipeline is preserved for users who prioritize speed.
I thought about making alterations to the original pipeline and enabling a --deterministic or --reproducible flag which would then enable this capability, but I figured it was safer and cleaner to just create a new pipeline, since I'm not aware of what additional development is on your internal roadmap, and I wanted this to be as clean and simple as possible for you to merge into your existing development work.
Reference Files for PR:
(See below for the scripts that were run and the output videos.)
With the Upstream Code:
Script: https://github.com/q5sys/hotdog-determinism/blob/main/run-ltx-hotdog.sh
Shell Log: https://github.com/q5sys/hotdog-determinism/blob/main/ltx-hotdog-shell-output.txt
Output 1: https://github.com/q5sys/hotdog-determinism/blob/main/hotdog-gpu0-1.mp4
Output 2: https://github.com/q5sys/hotdog-determinism/blob/main/hotdog-gpu0-2.mp4
With the Deterministic Pipeline:
Script: https://github.com/q5sys/hotdog-determinism/blob/main/run-ltx-hotdog-deterministic.sh
Shell Log: https://github.com/q5sys/hotdog-determinism/blob/main/ltx-hotdog-deterministic-shell-output.txt
Output 1: https://github.com/q5sys/hotdog-determinism/blob/main/hotdog-gpu0-deterministic-1.mp4
Output 2: https://github.com/q5sys/hotdog-determinism/blob/main/hotdog-gpu0-deterministic-2.mp4