[debug] fix badcase, add fade on speech output #379

Merged: 2 commits, Sep 19, 2024

boji123 (Contributor) commented Sep 11, 2024

fix

I'm Boji.

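CosyVoice streaming inference has a bad case that makes the audio sound tremulous or discontinuous. Debugging turned up two problems:
1. (fixed) The HiFi-GAN speech output has artifacts at the start of each chunk. This PR fixes that.
Fix: smoothing the speech output (the same approach as the fade in/out method) significantly improves audio quality.

2. (ongoing) The flow's token context is not handled correctly, which causes the mel spectrogram to jump between contexts. Work in progress.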
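
For readers unfamiliar with the fade in/out idea in item 1, here is a minimal NumPy sketch of smoothing a chunk boundary. It is not the code from this PR: the function name `crossfade_chunks`, the raised-cosine ramp, and the assumption that each new chunk regenerates the last few samples of the previous one (so the two can be blended) are all illustrative choices.

```python
import numpy as np


def crossfade_chunks(prev_tail, next_chunk):
    """Blend the held-back tail of the previous chunk into the head of the next.

    Hypothetical helper: assumes next_chunk starts with samples that cover
    the same time span as prev_tail.
    """
    prev_tail = np.asarray(prev_tail, dtype=np.float64)
    out = np.array(next_chunk, dtype=np.float64)  # copy, do not modify input
    n = len(prev_tail)
    if n == 0:
        return out
    if len(out) < n:
        raise ValueError("next_chunk must be at least as long as prev_tail")

    # Raised-cosine ramps; fade_in + fade_out == 1 at every sample, so a
    # signal that is identical in both chunks passes through unchanged.
    t = (np.arange(n) + 0.5) / n
    fade_in = 0.5 - 0.5 * np.cos(np.pi * t)
    fade_out = 1.0 - fade_in

    # Only the overlapping head is blended; the rest of the chunk is untouched.
    out[:n] = fade_in * out[:n] + fade_out * prev_tail
    return out
```

In a streaming loop under these assumptions, the caller would emit each blended chunk except its last `n` samples and keep those as `prev_tail` for the next call. The overlap length trades latency against how gradual the transition sounds.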

boji123 (Contributor, Author) commented Sep 19, 2024

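2. The flow's token context is not handled correctly, which causes the mel spectrogram to jump between contexts.
The diagnosis of this problem is tracked in the following issue:
#406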

@aluminumbox aluminumbox changed the base branch from main to dev/lyuxiang.lx September 19, 2024 09:30
@aluminumbox aluminumbox merged commit cd26f11 into FunAudioLLM:dev/lyuxiang.lx Sep 19, 2024
2 checks passed