Would the C++ example still work after the new silero_vad.onnx release? #472

Closed
ghost opened this issue Jun 28, 2024 · 5 comments

@ghost

ghost commented Jun 28, 2024

I couldn't make it work; maybe I made some mistakes that I haven't noticed.

@ghost ghost added the "help wanted" label Jun 28, 2024
@ghost ghost assigned snakers4 Jun 28, 2024
@yujinqiu

yujinqiu commented Jun 28, 2024

I have the same issue.
It looks like the network structure changed?

[image: 5.0 version]

[image: 4.0 version]

@csukuangfj

No, the current examples in https://github.com/snakers4/silero-vad/tree/master/examples
won't work with silero vad v5 as of today (2024.06.29).

I suggest that you have a look at
k2-fsa/sherpa-onnx#1064

It supports both silero vad version 4 and 5.

It provides APIs for 10 different programming languages, e.g.,

  • C++
  • C
  • Java
  • JavaScript
  • Kotlin
  • Swift
  • Go
  • WebAssembly
  • Dart
  • C#

It also supports running silero VAD with Android, iOS, Flutter, NodeJS, etc.

@filtercodes
Contributor

When I run inference with the old model, it works fine like this:

output, h, c = session.run(['output', 'hn', 'cn'], {input_name: input_tensor, sr_name: np.array([sample_rate], dtype=np.int64), h_name: h, c_name: c})

With the new model, I would assume it works this way:

output, s_n = session.run(['output', 'stateN'], {input_name: input_tensor, sr_name: np.array([sample_rate], dtype=np.float32), state_n: stateN})

But I get an error -> input: state Got: 1 Expected: 3 Please fix either the inputs/outputs or the model.

I do send 3 inputs (input_name, sr_name and state_n), and I hard-coded the output names from the model.

I also tried reshaping with stateN = s_n.reshape((2, 1, -1)), but the result is the same.

What am I missing here?
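The error above names the input `state` and reports a rank of 1 where 3 is expected, which suggests that a 1-D array reached that input. One way to check is to compare the rank of each fed array against what the model declares, matching by input name rather than by position. The following is only a rough sketch: `find_rank_mismatches` is a hypothetical helper, and the input names and shapes in the stand-in session are assumptions about the new model, not values read from it. With a real model, an `onnxruntime.InferenceSession` would take the place of the stand-in, since it also provides `get_inputs()`.

```python
from types import SimpleNamespace

import numpy as np


def find_rank_mismatches(session, feed):
    """Return {input_name: (got_rank, expected_rank)} for each fed array
    whose rank differs from the rank the model declares for that input."""
    mismatches = {}
    for node in session.get_inputs():
        if node.name not in feed:
            continue
        got = np.ndim(feed[node.name])
        expected = len(node.shape)
        if got != expected:
            mismatches[node.name] = (got, expected)
    return mismatches


# Stand-in for onnxruntime.InferenceSession so the sketch runs without a
# model file. Names and shapes are ASSUMED for the new model.
fake_session = SimpleNamespace(get_inputs=lambda: [
    SimpleNamespace(name="input", shape=[None, None]),
    SimpleNamespace(name="state", shape=[2, None, 128]),
    SimpleNamespace(name="sr", shape=[]),
])

# A feed in which a 1-D sample-rate array ends up under the "state" key.
bad_feed = {
    "input": np.zeros((1, 1024), dtype=np.float32),
    "state": np.array([16000], dtype=np.float32),
}
print(find_rank_mismatches(fake_session, bad_feed))  # {'state': (1, 3)}
```

If the name variables (input_name, sr_name, ...) were taken from the session by index, printing them next to this result would show whether one of them resolves to `state` in the new model.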

@csukuangfj

What are the shapes of input_tensor and stateN? @filtercodes

@filtercodes
Contributor

Thanks for the reply,

I created input_tensor from an audio buffer that was previously converted to float32 using int2float() from the cpp example.

input_tensor = np.expand_dims(audio_float32, axis=0)

It's an audio buffer of 1024 samples.

and

stateN = np.zeros((2, 1, 128), dtype=np.float32)
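For reference, the shapes described above can be put together into a feed dictionary keyed by explicit names. This is a minimal sketch under stated assumptions: `make_v5_feed` is a hypothetical helper, the keys `input`, `state` and `sr` are assumed input names, and passing the sample rate as a 0-d int64 array (the dtype used in the old-model call) is also an assumption. The `session.run` call is left as a comment because it needs the model file.

```python
import numpy as np


def make_v5_feed(audio_float32, state, sample_rate):
    """Build a feed dict keyed by (assumed) input names for the new model."""
    # Add a batch dimension: (n_samples,) -> (1, n_samples)
    chunk = np.expand_dims(np.asarray(audio_float32, dtype=np.float32), axis=0)
    return {
        "input": chunk,                                  # assumed name
        "state": np.asarray(state, dtype=np.float32),    # assumed name
        "sr": np.array(sample_rate, dtype=np.int64),     # assumed name, 0-d int64
    }


audio_float32 = np.zeros(1024, dtype=np.float32)   # buffer size from the comment above
stateN = np.zeros((2, 1, 128), dtype=np.float32)   # initial state from the comment above
feed = make_v5_feed(audio_float32, stateN, 16000)

for name, value in feed.items():
    print(name, value.shape, value.dtype)

# With a real session (not run here):
# output, stateN = session.run(["output", "stateN"], feed)
```

Keying the dictionary by literal names makes it easier to see which array goes to which input than relying on variables such as sr_name or state_n.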

@ghost ghost closed this as completed Jun 29, 2024