Add support for live transcriptions in calls #15696
Conversation
Integration test failures should be unrelated. Note that there was no version bump, as I assumed it would be better to keep the beta version number matching the actual beta release. Because of that, when testing this pull request the migration needs to be applied manually!
Are we always going to have the transcription shown on screen, or is some expiration planned? Checked the faking example; I still feel the avatars are a bit redundant, we should have stuck to display names only, but otherwise it looks fine.
As discussed, we will do the migration.
But overall, it is awesome 🔥
I hope you meant default language, not default locale? I mean, a default locale would also make sense if it helps with writing numbers, dates and such, but using the
You can do
nickvergessen left a comment:
Okay from the PHP side
Antreesy left a comment:
Frontend-wise OK
this.pendingScrollToBottomLineByLine = undefined
this.scrollToBottomLineByLine()
}, 2000)
Felt too much during the faking tests, but maybe it will be noticeable in live ones
Originally I used 1 second, but it felt too quick 🤷 Maybe it could be proportional to the length of the last line or something like that; I intended to implement that, but in the end I went for a simpler approach (strange for me :-P)
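The proportional-delay idea mentioned above was not implemented in this PR; a minimal sketch of what it could look like, with made-up function and parameter names, would be to derive the delay from the character count of the last line and clamp it between the 1 second that felt too quick and some upper bound:

```javascript
// Hypothetical sketch, not the PR's code: scale the scroll delay with the
// length of the last transcript line instead of using a fixed 2000 ms.
function scrollDelayForLine(line, { msPerCharacter = 50, minMs = 1000, maxMs = 4000 } = {}) {
	const proportional = line.length * msPerCharacter
	// Clamp so that very short lines still pause and very long lines do not stall.
	return Math.min(maxMs, Math.max(minMs, proportional))
}
```

With these assumed defaults an empty line would wait the 1 second minimum and a 40-character line would wait 2 seconds, matching the fixed delay the PR ended up using.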
Do you mean removing the transcript after a few seconds without anybody speaking? I would just leave it on screen, but I do not have strong feelings about it.
I tried to, but unfortunately it was taking me way longer than expected so... follow up I guess :-)
I used them for consistency with the chat, and I also think they are good when someone speaks for a long time, as in that case the name would be hidden by the text while the avatar still provides context.
Tiny note: the icon should be https://pictogrammers.com/library/mdi/icon/subtitles-outline/
👍
So increase from the current two seconds to three seconds when a new line is shown and before scrolling to a next one if more lines appear? Or what do you mean?
That is unexpected 🤔 But let's blame it on the simulation unless you find reproducible steps :-)
❤️
Oops, definitely, typo fixed. Edit: now really fixed. It helps to press the button to update the text after checking the preview...
Done.
Done.
Live transcription is an optional feature that is only available if the external app "live_transcription" is available.
Signed-off-by: Daniel Calviño Sánchez <[email protected]>
The endpoint just forwards the request to the external app "live_transcription", but using a standard Talk endpoint makes it possible to abstract that from the clients.
Signed-off-by: Daniel Calviño Sánchez <[email protected]>
Signed-off-by: Daniel Calviño Sánchez <[email protected]>
The transcripts of each participant are shown in their own block with the avatar and name of the participant, and whenever a transcript for a different participant arrives a new block is shown. The transcript area shows four lines of text, which may include the participant name; the name is hidden once four or more lines of text for the same participant have been added. When a new line is added the whole text is immediately scrolled to show the new line. Using a separate span for each transcript chunk is not strictly needed, but it will be used in following commits.
Signed-off-by: Daniel Calviño Sánchez <[email protected]>
Rather than directly showing the next line when a new transcript arrives, the transcript now scrolls smoothly to the new line. The scrolling continues after a small delay if there are more lines, until the last one is reached.
Signed-off-by: Daniel Calviño Sánchez <[email protected]>
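The scrolling behaviour this commit describes (advance one line, then keep advancing after a delay until the last line is visible) can be sketched as follows; the class and method names are assumptions for illustration, not the PR's actual code, and the DOM scrolling itself is elided:

```javascript
// Hedged sketch of the described line-by-line scrolling. A scheduler is
// injected so the logic can be driven without real timers.
class LineByLineScroller {
	constructor(delayMs = 2000, schedule = (fn, ms) => setTimeout(fn, ms)) {
		this.delayMs = delayMs
		this.schedule = schedule
		this.visibleLine = 0
		this.totalLines = 0
		this.pending = false
	}

	// Called when a new transcript line arrives.
	addLine() {
		this.totalLines++
		if (!this.pending) {
			this.scrollToNextLine()
		}
	}

	scrollToNextLine() {
		if (this.visibleLine >= this.totalLines) {
			// Last line reached; stop until a new line arrives.
			this.pending = false
			return
		}
		this.visibleLine++ // a real implementation would smooth-scroll here
		this.pending = true
		this.schedule(() => this.scrollToNextLine(), this.delayMs)
	}
}
```

The first line is shown immediately; lines that pile up while a scroll is pending are then revealed one per delay interval rather than jumping straight to the bottom.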
Once a transcript block is no longer visible it is no longer needed, so it is now removed. Note that it would still be necessary to remove no-longer-visible lines inside the same transcript block, for example for long speeches, but that is something for the future.
Signed-off-by: Daniel Calviño Sánchez <[email protected]>
If set, the language of the room is now used when starting live transcriptions. Signed-off-by: Daniel Calviño Sánchez <[email protected]>
Signed-off-by: Daniel Calviño Sánchez <[email protected]>
Signed-off-by: Daniel Calviño Sánchez <[email protected]>
Signed-off-by: Daniel Calviño Sánchez <[email protected]>
This may influence how the browser renders the transcript chunks, for example, due to specific rules for capitalization. Signed-off-by: Daniel Calviño Sánchez <[email protected]>
Different languages use different separators (for example, Chinese does not use any between characters). This detail is included in the language metadata provided by the "live_transcription" app, so the separator added between transcript chunks now respects the language; in case of a language switch a space is always added. The language metadata is explicitly loaded before the live transcription is enabled, if it was not available yet, to ensure that it will be available when the transcript is shown.
Signed-off-by: Daniel Calviño Sánchez <[email protected]>
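As a rough illustration of the separator rule described above, assuming a metadata shape like the example in the "How to test" section (`{ separator: ' ', rtl: false }`) and a made-up function name:

```javascript
// Sketch only: join transcript chunks using the separator from each
// language's metadata, falling back to a space, and always using a space
// when the language switches between chunks.
function joinTranscriptChunks(chunks, languages) {
	let text = ''
	let previousLanguage = null
	for (const { languageId, chunk } of chunks) {
		if (text !== '') {
			const separator = languageId === previousLanguage
				? (languages[languageId]?.metadata?.separator ?? ' ')
				: ' ' // language switch: always a space
			text += separator
		}
		text += chunk
		previousLanguage = languageId
	}
	return text
}
```

With an empty separator configured for Chinese, consecutive Chinese chunks are joined with nothing between them, while an English-to-Chinese switch still gets a space.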
The text direction of a language is included in the metadata provided by the "live_transcription" app, so the transcripts are now shown in the right direction (the characters themselves were already shown in the right direction thanks to Unicode bidi support, but they could be aligned to the wrong side depending on the main text direction of the UI). Note that the whole transcript block is affected, so the name and avatar of the author are also affected by the text direction. Because of that, a new block is now added when the text direction changes.
Signed-off-by: Daniel Calviño Sánchez <[email protected]>
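The two decisions this commit describes, which `dir` value a block gets and when a direction change forces a new block, could be sketched like this (helper names and the `rtl` metadata flag shape are assumptions based on the example metadata in the description, not the PR's actual code):

```javascript
// Sketch: map the language metadata's rtl flag to a dir attribute value.
function blockDirection(languageId, languages) {
	return languages[languageId]?.metadata?.rtl ? 'rtl' : 'ltr'
}

// Sketch: a new transcript block is needed when the direction changes,
// since the direction applies to the whole block including name and avatar.
function needsNewBlock(previousLanguageId, languageId, languages) {
	return blockDirection(previousLanguageId, languages)
		!== blockDirection(languageId, languages)
}
```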
Description
TODO
How to test
- `cert_pem`, `cert_key` and `rsa_private_key` are commented in the Janus configuration (as an ECDSA certificate is required for DTLS by the live_transcription app; this will be documented later)
- `return true;` in spreed/lib/Service/LiveTranscriptionService.php (line 35 in 52c8792)
- `return;` in spreed/lib/Service/LiveTranscriptionService.php (line 82 in 52c8792)
- spreed/lib/Service/LiveTranscriptionService.php (line 108 in 52c8792)
- `return [ 'en' => [ 'name' => 'English', 'metadata' => [ 'separator' => ' ', 'rtl' => false ]]];` in spreed/lib/Service/LiveTranscriptionService.php (line 129 in 52c8792)
- spreed/lib/Service/LiveTranscriptionService.php (line 158 in 52c8792)
🖌️ UI Checklist
🖼️ Screenshots / Screencasts
🚧 Tasks (or follow ups)
- `transcript` signaling message if it does not come from an internal client

🏁 Checklist
🛠️ API Checklist
🚧 Tasks (or follow ups)
- `liveTranscriptionLanguageId` property to the list of properties that trigger a signaling message to update the room, and update the value in the clients

🏁 Checklist
- `docs/` has been updated or is not required