There is a delay between mouth shape and audio #55
Comments
Hi, this looks like a mismatch in either the video frame rate or the audio sampling rate. The input audio should have a sampling rate of 16,000 Hz before it is processed by DeepSpeech, wav2vec, or HuBERT (recommended for your cross-lingual application), and the generated video is 25 fps. I suspect this is the cause. If not, first check whether the generated video and audio have the same length. By the way, I noticed that the video you provided is 30 fps rather than the original 25 fps. I recommend using ffmpeg to combine the video and audio, which ensures that no misalignment is introduced during this step.
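The exact ffmpeg command from this reply is not preserved in the thread, so the following is only a minimal sketch of the two steps it describes: resampling the audio to 16 kHz and combining the generated video with that audio. The file names, the mono downmix (`-ac 1`), the AAC audio codec, and the helper names `resample_cmd`, `mux_cmd`, and `expected_frames` are assumptions for illustration, not the project's own tooling.

```python
import shlex

TARGET_SAMPLE_RATE = 16000  # Hz, the rate suggested in the reply above
TARGET_FPS = 25             # frame rate of the generated video


def resample_cmd(src_audio, dst_wav):
    """Build ffmpeg arguments that convert an audio file to 16 kHz WAV.

    The mono downmix (-ac 1) is an assumption, not stated in the thread.
    """
    return ["ffmpeg", "-y", "-i", src_audio,
            "-ar", str(TARGET_SAMPLE_RATE), "-ac", "1", dst_wav]


def mux_cmd(video, audio, out):
    """Build ffmpeg arguments that attach an audio track to a video.

    The video stream is copied unchanged; -shortest stops at the end of
    the shorter of the two inputs.
    """
    return ["ffmpeg", "-y", "-i", video, "-i", audio,
            "-map", "0:v:0", "-map", "1:a:0",
            "-c:v", "copy", "-c:a", "aac", "-shortest", out]


def expected_frames(audio_seconds, fps=TARGET_FPS):
    """Number of frames a video needs to match the audio length at `fps`."""
    return round(audio_seconds * fps)


if __name__ == "__main__":
    # Hypothetical file names; print the commands instead of running them.
    print(shlex.join(resample_cmd("input.mp3", "input_16k.wav")))
    print(shlex.join(mux_cmd("generated.mp4", "input_16k.wav", "result.mp4")))
    print(expected_frames(12.0))
```

Comparing `expected_frames(audio_duration)` with the actual frame count of the generated video is one way to do the length check suggested above. The commands can be run with `subprocess.run(cmd, check=True)` if ffmpeg is installed.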
Thank you for your prompt response. I will try again. Thank you again for such an excellent job!
Excellent job! According to your explanation, I have achieved good results!
Haha, thanks for your feedback :)
Thank you for publicly sharing such outstanding work!
When running inference on Chinese audio, I ran into a problem: the speed of the mouth movements does not match the speed of the audio.
In the first few seconds the mouth shape is synchronized, but over time the delay between the mouth shape and the audio grows until the two are clearly out of sync.
I can drag the audio forward or backward by a few seconds in video editing software to resynchronize the mouth movements, but I would really prefer not to fix this through video editing.
How can I solve this problem? I would greatly appreciate a reply!
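A delay that starts at zero and grows steadily is what a frame-rate mismatch would produce. The arithmetic below is a rough sketch under the assumption that frames generated for 25 fps end up being played back at 30 fps; the function name `sync_offset` and the default rates are illustrative only, and the real cause in a given case may differ.

```python
def sync_offset(audio_time, generated_fps=25.0, playback_fps=30.0):
    """Seconds by which the mouth movement leads the audio at `audio_time`.

    The frame generated for audio time t has index t * generated_fps.
    Played back at playback_fps, that frame appears at index / playback_fps.
    """
    frame_index = audio_time * generated_fps
    shown_at = frame_index / playback_fps
    return audio_time - shown_at


if __name__ == "__main__":
    # The offset grows linearly with time instead of staying constant.
    for t in (0, 6, 30, 60):
        print(t, sync_offset(t))
```

Because the offset grows in proportion to elapsed time, shifting the audio by a fixed amount in an editor can only align one moment of the clip, which matches the symptom described above.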
macron2story.mp4