Could not find example for Speaker diarization? #1145
Labels: samples; type: feature request
Hi folks,
I am having a hard time getting data for multiple speakers, and there is no example for it. The official Google docs have no example either, as you can see here: https://cloud.google.com/speech-to-text/docs/multiple-voices.
Output:
Speaker %u Word: %s (start: %s, end: %s)
Speaker 0 this (start: "0s", end: "0.5s")
Speaker 0 is (start: "0.5s", end: "1.5s")
Speaker 0 an (start: "1.5s", end: "2.5s")
Speaker 0 entire (start: "2s", end: "3.5s")
Speaker 0 audio (start: "3.5s", end: "4.5s")
Speaker 0 sentence (start: "4.5s", end: "5.5s")
Speaker 0 that (start: "5.5s", end: "6.5s")
Speaker 0 google (start: "6.5s", end: "7.5s")
Speaker 0 give (start: "7.5s", end: "8.5s")
Speaker 0 me (start: "8.5s", end: "9.5s")
Speaker 0 in (start: "9.5s", end: "10.5s")
Speaker 0 its (start: "10.5s", end: "11.5s")
Speaker 0 response (start: "11.5s", end: "12.5s")
Speaker 1 this (start: "0s", end: "0.5s")
Speaker 1 is (start: "0.5s", end: "1.5s")
Speaker 1 an (start: "1.5s", end: "2.5s")
Speaker 1 entire (start: "2s", end: "3.5s")
Speaker 1 audio (start: "3.5s", end: "4.5s")
Speaker 1 sentence (start: "4.5s", end: "5.5s")
Speaker 1 that (start: "5.5s", end: "6.5s")
Speaker 1 google (start: "6.5s", end: "7.5s")
Speaker 1 give (start: "7.5s", end: "8.5s")
Speaker 1 me (start: "8.5s", end: "9.5s")
Speaker 1 in (start: "9.5s", end: "10.5s")
Speaker 1 its (start: "10.5s", end: "11.5s")
Speaker 1 response (start: "11.5s", end: "12.5s")
Speaker 3 this (start: "0s", end: "0.5s")
Speaker 3 is (start: "0.5s", end: "1.5s")
Speaker 3 an (start: "1.5s", end: "2.5s")
Speaker 3 entire (start: "2s", end: "3.5s")
Speaker 3 audio (start: "3.5s", end: "4.5s")
Speaker 3 sentence (start: "4.5s", end: "5.5s")
Speaker 3 that (start: "5.5s", end: "6.5s")
Speaker 3 google (start: "6.5s", end: "7.5s")
Speaker 3 give (start: "7.5s", end: "8.5s")
Speaker 3 me (start: "8.5s", end: "9.5s")
Speaker 3 in (start: "9.5s", end: "10.5s")
Speaker 3 its (start: "10.5s", end: "11.5s")
Speaker 3 response (start: "11.5s", end: "12.5s")
For the sake of simplicity I cut off part of the response.
First problem: as you can see, the speakerTag value looks wrong. The audio I am sending in the request has 5 speakers, but the response gives me 0 and 1 and then jumps to 3. I don't know why Google is not responding with speakerTag values 0, 1, 2, 3, and 4.
Second problem: Google responds with the entire audio text attributed to a single speaker, and then repeats the same text for the next speaker, as you can see in my output. I can't figure out whether this is a problem with my code or something else. I hope the problem is clear.
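For reference, here is a minimal sketch of the kind of per-speaker grouping I am trying to get. It is written in Python purely for illustration and does not call the Speech-to-Text client at all: the response is mocked as plain dicts, and the field names (`alternatives`, `words`, `speaker_tag`, `start`, `end`) as well as the helper `words_by_speaker` are my own assumptions about the response shape, not a confirmed API. It also assumes, as I understand the documentation, that only the last result carries the full word list with speaker tags, so the earlier results are ignored.

```python
from collections import defaultdict


def words_by_speaker(results):
    """Group (word, start, end) tuples by speaker tag.

    Assumption: only the LAST result holds the complete, speaker-tagged
    word list, so earlier results are skipped on purpose.
    """
    if not results:
        return {}
    words = results[-1]["alternatives"][0]["words"]
    grouped = defaultdict(list)
    for w in words:
        grouped[w["speaker_tag"]].append((w["word"], w["start"], w["end"]))
    return dict(grouped)


# Hypothetical, hand-written response data (not real API output).
sample_results = [
    # Earlier result: words present, but no meaningful speaker tag yet.
    {"alternatives": [{"words": [
        {"word": "hello", "start": "0s", "end": "0.5s", "speaker_tag": 0},
    ]}]},
    # Last result: all words so far, each with a speaker tag.
    {"alternatives": [{"words": [
        {"word": "hello", "start": "0s", "end": "0.5s", "speaker_tag": 1},
        {"word": "hi", "start": "0.5s", "end": "1s", "speaker_tag": 2},
        {"word": "there", "start": "1s", "end": "1.5s", "speaker_tag": 2},
    ]}]},
]

for tag, words in sorted(words_by_speaker(sample_results).items()):
    for word, start, end in words:
        print(f"Speaker {tag} Word: {word} (start: {start}, end: {end})")
```

If that assumption about the last result is right, looping over every result and every word (which is what my code does) would print the same words more than once, which might be related to the repetition in my output. I have not confirmed this.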