Add changes for audio speech and audio transcriptions (#388)
* Add changes for audio speech and audio transcriptions
* Remove testing word stuff
* Black formatting
---------
Co-authored-by: Zain Hasan <[email protected]>
             timestamp_granularities: The timestamp granularities to populate for this
                 transcription. response_format must be set verbose_json to use timestamp
                 granularities. Either or both of these options are supported: word, or segment.
-
+            diarize: Whether to enable speaker diarization. When enabled, you will get the speaker id for each word in the transcription.
+                In the response, in the words array, you will get the speaker id for each word.
+                In addition, we also return the speaker_segments array which contains the speaker id for each speaker segment along with the start and end time of the segment along with all the words in the segment.
+                You can use the speaker_id to group the words by speaker.
+                You can use the speaker_segments to get the start and end time of each speaker segment.
             Returns:
                 The transcribed text in the requested format.
             """
@@ -103,6 +108,9 @@ def create(
                 else timestamp_granularities
             )

+        if diarize:
+            params_data["diarize"] = diarize
+
         # Add any additional kwargs
         # Convert boolean values to lowercase strings for proper form encoding
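The hunk for `create` forwards `diarize` only when it is truthy, and the trailing comment notes that boolean values are converted to lowercase strings before form encoding. A minimal sketch of that pattern follows. The helper name `build_form_params` and the placeholder model name are assumptions made for illustration; the library's actual conversion code is not shown in this diff.

```python
from typing import Any, Dict


def build_form_params(model: str, diarize: bool = False, **kwargs: Any) -> Dict[str, Any]:
    """Hypothetical helper mirroring the parameter handling shown in the diff."""
    params_data: Dict[str, Any] = {"model": model}

    # Only send the flag when diarization was requested, as in the diff.
    if diarize:
        params_data["diarize"] = diarize

    # Add any additional kwargs.
    params_data.update(kwargs)

    # Convert boolean values to lowercase strings ("true"/"false") so they
    # survive multipart form encoding; other values pass through unchanged.
    return {
        key: (str(value).lower() if isinstance(value, bool) else value)
        for key, value in params_data.items()
    }


if __name__ == "__main__":
    # "example-model" is a placeholder, not a real model identifier.
    print(build_form_params("example-model", diarize=True))
    print(build_form_params("example-model"))
```

With `diarize=False` (the assumed default) the key is omitted entirely rather than sent as `"false"`, which matches the `if diarize:` guard in the diff.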
             timestamp_granularities: The timestamp granularities to populate for this
                 transcription. response_format must be set verbose_json to use timestamp
                 granularities. Either or both of these options are supported: word, or segment.
-
+            diarize: Whether to enable speaker diarization. When enabled, you will get the speaker id for each word in the transcription.
+                In the response, in the words array, you will get the speaker id for each word.
+                In addition, we also return the speaker_segments array which contains the speaker id for each speaker segment along with the start and end time of the segment along with all the words in the segment.
+                You can use the speaker_id to group the words by speaker.
+                You can use the speaker_segments to get the start and end time of each speaker segment.
             Returns:
                 The transcribed text in the requested format.
             """
@@ -239,6 +253,9 @@ async def create(
                 )
             )

+        if diarize:
+            params_data["diarize"] = diarize
+
         # Add any additional kwargs
         # Convert boolean values to lowercase strings for proper form encoding
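The `diarize` docstring says each entry in the `words` array carries a speaker id, and that `speaker_segments` gives the start and end time of each speaker segment. The sketch below shows one way a caller might consume such a response. Only `words`, `speaker_segments`, and `speaker_id` are named in the docstring; the payload shape, the keys `word`, `start`, and `end`, the speaker labels, and both helper functions are assumptions for illustration.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple

# Hypothetical verbose_json payload; the real response may be shaped differently.
sample_response: Dict[str, Any] = {
    "words": [
        {"word": "Hello", "start": 0.0, "end": 0.4, "speaker_id": "SPEAKER_00"},
        {"word": "there", "start": 0.4, "end": 0.8, "speaker_id": "SPEAKER_00"},
        {"word": "Hi", "start": 1.0, "end": 1.2, "speaker_id": "SPEAKER_01"},
    ],
    "speaker_segments": [
        {"speaker_id": "SPEAKER_00", "start": 0.0, "end": 0.8},
        {"speaker_id": "SPEAKER_01", "start": 1.0, "end": 1.2},
    ],
}


def group_words_by_speaker(words: List[Dict[str, Any]]) -> Dict[str, List[str]]:
    """Collect the words spoken by each speaker, preserving word order."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for entry in words:
        grouped[entry["speaker_id"]].append(entry["word"])
    return dict(grouped)


def segment_spans(segments: List[Dict[str, Any]]) -> List[Tuple[str, float, float]]:
    """Return (speaker_id, start, end) for each speaker segment."""
    return [(seg["speaker_id"], seg["start"], seg["end"]) for seg in segments]


if __name__ == "__main__":
    print(group_words_by_speaker(sample_response["words"]))
    print(segment_spans(sample_response["speaker_segments"]))
```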