Add C++ runtime for speech enhancement GTCRN models #1977

Merged: 6 commits, Mar 10, 2025

csukuangfj
Collaborator

See also
https://github.com/Xiaobin-Rong/gtcrn

CC @yuyun2000 @Xiaobin-Rong

Usage

Build sherpa-onnx from source

cd /path/to/sherpa-onnx

mkdir build
cd build
cmake ..
make

Download models and test files

cd /path/to/sherpa-onnx
curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/gtcrn_simple.onnx
curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/speech_with_noise.wav
ls -lh gtcrn_simple.onnx
-rw-r--r--  1 fangjun  staff   523K Mar 10 12:55 gtcrn_simple.onnx
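
Since `curl -SL` without `-f` still saves the response body when the server returns an error page, it may be worth confirming that the downloaded test file really is a WAV file before running the denoiser. The following is a minimal sketch, assuming bash and `dd`; the `is_wav` helper is hypothetical and not part of sherpa-onnx. It only checks the RIFF/WAVE magic bytes, not the sample rate or encoding.

```shell
#!/usr/bin/env bash
# is_wav FILE: succeed if FILE starts with a RIFF header whose form type is WAVE.
# Hypothetical helper for illustration only; not part of sherpa-onnx.
is_wav() {
  local f="$1"
  local riff wave
  [ -f "$f" ] || return 1
  # Bytes 1-4 of a WAV file are "RIFF"; bytes 9-12 are "WAVE".
  riff=$(dd if="$f" bs=1 count=4 2>/dev/null)
  wave=$(dd if="$f" bs=1 skip=8 count=4 2>/dev/null)
  [ "$riff" = "RIFF" ] && [ "$wave" = "WAVE" ]
}

# Usage (after the downloads above):
#   is_wav ./speech_with_noise.wav || echo "speech_with_noise.wav is not a WAV file" >&2
```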

Run it

cd /path/to/sherpa-onnx

./build/bin/sherpa-onnx-offline-denoiser \
  --debug=1 \
  --speech-denoiser-gtcrn-model=./gtcrn_simple.onnx \
  --input-wav=./speech_with_noise.wav \
  --output-wav=./enhanced_speech_16k.wav

[Attachments: speech_with_noise.mov, enhanced_16k.mov, and two screenshots taken 2025-03-10]

Test 2

wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav
./build/bin/sherpa-onnx-offline-denoiser \
  --debug=1 \
  --speech-denoiser-gtcrn-model=./gtcrn_simple.onnx \
  --input-wav=./inp_16k.wav \
  --output-wav=./enhanced_16k-2.wav

[Attachments: inp_16k.mov, enhanced_16k-2.mov, and two screenshots taken 2025-03-10]
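
The two runs above differ only in their input and output paths, so they could be wrapped in a loop to process a whole directory. The following is a rough sketch, assuming bash; `denoise_dir`, the `DENOISER` and `MODEL` environment variables, and the `enhanced_` output prefix are hypothetical names chosen for illustration. It uses only the command-line flags shown in this PR.

```shell
#!/usr/bin/env bash
# denoise_dir IN_DIR OUT_DIR: run the denoiser on every *.wav file in IN_DIR.
# Hypothetical wrapper for illustration; not part of sherpa-onnx.
# DENOISER and MODEL default to the paths used in this PR and can be
# overridden through the environment.
denoise_dir() {
  local in_dir="$1" out_dir="$2"
  local denoiser="${DENOISER:-./build/bin/sherpa-onnx-offline-denoiser}"
  local model="${MODEL:-./gtcrn_simple.onnx}"
  local wav name
  mkdir -p "$out_dir" || return 1
  for wav in "$in_dir"/*.wav; do
    [ -e "$wav" ] || continue   # the glob matched nothing
    name=$(basename "$wav" .wav)
    "$denoiser" \
      --speech-denoiser-gtcrn-model="$model" \
      --input-wav="$wav" \
      --output-wav="$out_dir/enhanced_${name}.wav" || return 1
  done
}

# Usage (from /path/to/sherpa-onnx):
#   denoise_dir ./noisy_wavs ./enhanced_wavs
```

The function stops at the first file for which the denoiser fails, so a partial output directory indicates where the problem occurred.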

@csukuangfj merged commit 488a6e6 into k2-fsa:master on Mar 10, 2025
170 of 214 checks passed
@csukuangfj deleted the cpp-gtcrn branch on March 10, 2025 10:11

@yuyun2000

It is really impressive.

@altunenes

Is there a method you recommend for overcoming parallel/overlap speech scenarios? segmentation-3.0 is quite inadequate for parallel speech and causes problems especially for identifying the speakers (speaker recognition).

@csukuangfj
Collaborator Author

> Is there a method you recommend for overcoming parallel/overlap speech scenarios? segmentation-3.0 is quite inadequate for parallel speech and causes problems especially for identifying the speakers (speaker recognition).

I suggest that you ask this question in https://github.com/pyannote/pyannote-audio
@altunenes

@altunenes

thanks
