[debug] fix badcase, add fade on speech output #379

boji123 · 2024-09-11T03:00:26Z

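This is Baiji.

CosyVoice streaming inference has a bad case that makes the audio sound tremulous or discontinuous. Debugging traced it to two problems:
1. (fixed) The HiFi-GAN speech output has artifacts at the start of each chunk. This PR fixes that.
Fix: smoothing the speech output (using the same fade in/out method) significantly improves audio quality.

2. (ongoing) The flow's token context is not handled properly, which causes context jumps in the mel spectrogram. Work in progress.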
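
As a rough illustration of the smoothing idea in item 1, here is a minimal pure-Python sketch, not the code from this PR. The function name `smooth_chunk`, the raised-cosine weights, and the assumption that consecutive chunks overlap by `overlap` samples (so the tail of one chunk covers the same audio as the head of the next) are all hypothetical. A real implementation would most likely operate on tensors rather than lists.

```python
import math


def smooth_chunk(chunk, prev_tail, overlap):
    """Cross-fade the head of `chunk` with the tail held back from the previous chunk.

    Hypothetical helper for illustration only.
    chunk:     list of float samples for the current chunk
    prev_tail: last `overlap` samples held back from the previous chunk, or None
    overlap:   number of samples shared by consecutive chunks (assumed > 0)
    Returns (samples_to_emit, tail_to_hold_back).
    """
    if overlap <= 0 or len(chunk) < overlap:
        raise ValueError("overlap must be positive and no longer than the chunk")

    out = list(chunk)
    if prev_tail is not None:
        for i in range(overlap):
            # Raised-cosine weight rising from ~0 to ~1 across the overlap.
            w = 0.5 - 0.5 * math.cos(math.pi * (i + 0.5) / overlap)
            # The new chunk fades in while the previous tail fades out;
            # the two weights always sum to 1.
            out[i] = w * chunk[i] + (1.0 - w) * prev_tail[i]

    # Emit everything except the last `overlap` samples, which are held back
    # so they can be blended with the head of the next chunk.
    return out[:-overlap], out[-overlap:]
```

In a streaming loop, the returned tail would be passed back in as `prev_tail` for the next chunk, and the final tail would be emitted unchanged once the stream ends. Because the weights sum to 1, regions where both chunks agree pass through unchanged, and only a mismatch at the chunk boundary is smoothed.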

boji123 · 2024-09-19T08:12:47Z

2. The flow's token context is not handled properly, which causes context jumps in the mel spectrogram.
This problem is tracked in the following issue:
#406

liubaiji added 2 commits September 11, 2024 10:36

[refator] modify fade_in_out func to a commom form

df653f1

[feature] fix badcase, add fade on speech output

9e0b99e

aluminumbox changed the base branch from main to dev/lyuxiang.lx September 19, 2024 09:30

aluminumbox merged commit cd26f11 into FunAudioLLM:dev/lyuxiang.lx Sep 19, 2024
2 checks passed

This was referenced Sep 20, 2024

[debug] support flow cache, for sharper tts_mel output #410

Closed

[debug] support flow cache, for sharper tts_mel output #412

Closed

boji123 mentioned this pull request Sep 30, 2024

[debug] support flow cache, for sharper tts_mel output (handle prompt bug) #455

Merged
