fix: support browser-native TTS in classroom playback by harmony2ww · Pull Request #62 · THU-MAIC/OpenMAIC

harmony2ww · 2026-03-17T16:41:06Z

Summary

This PR adds browser-native TTS fallback support to the classroom playback flow so that lectures can still be spoken when no pre-generated audio file is available.

Related Issues

Related to browser-native TTS playback behavior in classroom mode.

Changes

add browser-native TTS fallback in lib/utils/audio-player.ts
pass text, voice, and speed from playback/action engines into the audio player
support pause, resume, stop, and active-state checks for browser speech synthesis
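
The fallback decision described in the list above could be sketched roughly as follows. This is an illustration only, not the code in lib/utils/audio-player.ts: the names `SpeechFallbackPayload`, `PlaybackSource`, and `resolvePlaybackSource` are hypothetical, `hasCachedAudio` stands in for the result of the cached-audio lookup, and clamping the speed to the Web Speech API rate range is an assumption rather than something this PR states.

```typescript
// Hypothetical payload shape; the field names follow the PR description.
interface SpeechFallbackPayload {
  text: string;
  voice?: string;
  speed?: number;
}

// Hypothetical result type describing which playback path to take.
type PlaybackSource =
  | { kind: 'cached-audio' }
  | { kind: 'browser-speech'; text: string; voice: string | undefined; rate: number }
  | { kind: 'none' };

// Decide how a speech action should be played.
// `hasCachedAudio` stands in for the cached-audio lookup result and
// `speechAvailable` for a check such as `'speechSynthesis' in window`.
function resolvePlaybackSource(
  hasCachedAudio: boolean,
  speechAvailable: boolean,
  fallback?: SpeechFallbackPayload,
): PlaybackSource {
  // Pre-generated audio always wins when it exists.
  if (hasCachedAudio) return { kind: 'cached-audio' };

  const text = fallback?.text.trim() ?? '';
  if (!speechAvailable || text.length === 0) return { kind: 'none' };

  // SpeechSynthesisUtterance.rate accepts values from 0.1 to 10 (assumed clamp).
  const rate = Math.min(10, Math.max(0.1, fallback?.speed ?? 1));
  return { kind: 'browser-speech', text, voice: fallback?.voice, rate };
}
```

Keeping the decision in a pure function like this makes it testable without a browser; the caller would then build the actual utterance only for the `browser-speech` case.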

Type of Change

Bug fix (non-breaking change that fixes an issue)
New feature (non-breaking change that adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)
Documentation update
Refactoring (no functional changes)
CI/CD or build changes

Verification

Steps to reproduce / test

Select browser-native-tts as the TTS provider
Generate or open a classroom with speech actions
Start playback and verify speech is spoken through browser speech synthesis

What you personally verified

verified browser-native TTS can be used during classroom playback when no cached audio is available
verified playback controls still work with browser speech synthesis
ran project checks locally

Evidence

CI passes (pnpm check && pnpm lint && npx tsc --noEmit)
Manually tested locally
Screenshots / recordings attached (if UI changes)

Checklist

My code follows the project's coding style
I have performed a self-review of my code
I have added/updated documentation as needed
My changes do not introduce new warnings

Copilot

Pull request overview

Adds a browser-native (Web Speech API) TTS fallback path for classroom playback when pre-generated audio isn’t available, and threads speech metadata (text/voice/speed) through the playback/action engines to support pause/resume/stop and active-state checks.

Changes:

Extend AudioPlayer.play() to accept a fallback payload and synthesize speech via window.speechSynthesis when IndexedDB audio is missing.
Pass text, voice, and speed into AudioPlayer.play() from both PlaybackEngine and ActionEngine.
Add speech-synthesis aware implementations of pause/resume/stop/playing-state checks in AudioPlayer.
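
As a rough illustration of what speech-synthesis aware controls can look like, the sketch below wraps two small interfaces that mirror the relevant members of `HTMLAudioElement` and `window.speechSynthesis`. The class name `PlaybackControlsSketch` and the `AudioLike` / `SynthLike` interfaces are hypothetical and are not taken from this PR. One detail worth noting is that the Web Speech API keeps `speaking` set to true while synthesis is paused, so an active-state check has to consult `paused` as well.

```typescript
// Minimal subset of window.speechSynthesis used by this sketch.
interface SynthLike {
  speaking: boolean;
  paused: boolean;
  pause(): void;
  resume(): void;
  cancel(): void;
}

// Minimal subset of HTMLAudioElement used by this sketch.
interface AudioLike {
  paused: boolean;
  pause(): void;
  play(): Promise<void>;
}

// Hypothetical wrapper; not the AudioPlayer class from the PR.
class PlaybackControlsSketch {
  private audio: AudioLike | null;
  private synth: SynthLike | null;

  constructor(audio: AudioLike | null, synth: SynthLike | null) {
    this.audio = audio;
    this.synth = synth;
  }

  pause(): void {
    if (this.audio && !this.audio.paused) this.audio.pause();
    if (this.synth && this.synth.speaking && !this.synth.paused) this.synth.pause();
  }

  resume(): void {
    // Paused speech takes priority; otherwise resume the audio element.
    if (this.synth && this.synth.paused) {
      this.synth.resume();
      return;
    }
    if (this.audio && this.audio.paused) {
      this.audio.play().catch(() => {
        // Autoplay restrictions can reject play(); ignored in this sketch.
      });
    }
  }

  stop(): void {
    if (this.audio) this.audio.pause();
    if (this.synth) this.synth.cancel();
  }

  isPlaying(): boolean {
    const audioPlaying = this.audio !== null && !this.audio.paused;
    // `speaking` stays true while paused, so both flags are checked.
    const speechPlaying =
      this.synth !== null && this.synth.speaking && !this.synth.paused;
    return audioPlaying || speechPlaying;
  }
}
```

In a browser, `new PlaybackControlsSketch(audioElement, window.speechSynthesis)` would satisfy both interfaces structurally.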

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File	Description
lib/utils/audio-player.ts	Implements Web Speech API fallback and adds speech-synthesis aware pause/resume/active checks.
lib/playback/engine.ts	Passes speech metadata into audio playback to enable browser-native fallback.
lib/action/engine.ts	Passes speech metadata into audio playback for synchronous speech execution.


    if (this.audio && !this.audio.paused) {
      this.audio.pause();
    }
+    if (this.utterance && typeof window !== 'undefined' && window.speechSynthesis.speaking) {


+    const voices = window.speechSynthesis.getVoices();
+    const desiredVoice = fallback.voice || ttsVoice;
+    const matchedVoice = voices.find(
+      (voice) =>
+        voice.voiceURI === desiredVoice ||
+        voice.name === desiredVoice ||
+        voice.lang === desiredVoice,
+    );
+    if (matchedVoice) {
+      utterance.voice = matchedVoice;
+      utterance.lang = matchedVoice.lang || utterance.lang;
+    }
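
The hunk above accepts a match on `voiceURI`, `name`, or `lang` in a single pass, so the first voice in the list that satisfies any of the three conditions wins. A possible variation, sketched below under stated assumptions, tries the identifiers in priority order and falls back to a language prefix. The `VoiceLike` interface and `pickVoice` helper are hypothetical names, not part of this PR. Separately, `speechSynthesis.getVoices()` can return an empty array until the browser fires `voiceschanged`, so a caller may need to wait for that event before matching.

```typescript
// Structural subset of SpeechSynthesisVoice, declared locally so the
// helper can be exercised outside a browser.
interface VoiceLike {
  voiceURI: string;
  name: string;
  lang: string;
}

// Hypothetical helper: prefer an exact voiceURI match, then an exact name
// match, then an exact language tag, then a language prefix ("zh" -> "zh-CN").
function pickVoice<V extends VoiceLike>(
  voices: readonly V[],
  desired: string | undefined,
): V | undefined {
  if (!desired) return undefined;
  const wanted = desired.toLowerCase();
  return (
    voices.find((v) => v.voiceURI === desired) ??
    voices.find((v) => v.name === desired) ??
    voices.find((v) => v.lang.toLowerCase() === wanted) ??
    voices.find((v) => v.lang.toLowerCase().startsWith(wanted + '-'))
  );
}
```

Returning `undefined` when nothing matches leaves the utterance on the browser's default voice, which is the same behavior as the `if (matchedVoice)` guard in the hunk.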


wyuc · 2026-03-20T08:40:22Z

This was already addressed by #28 (merged Mar 18), which added browser-native TTS playback with text chunking, async voice loading, and language auto-detection. We also shipped #153 on top of that to fix a garbled audio bug and add provider switching in the toolbar.

Your approach in AudioPlayer was clean; we just got there through a different path. The PlaybackEngine ended up being a better place for this, since it can manage chunk state and pause/resume.

Thanks for the contribution! Your #63 is still open and we'll look at that separately.
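
For readers unfamiliar with the text chunking mentioned in the comment above, one common approach is to split the text at sentence boundaries and pack sentences into chunks below a length limit, then speak the chunks one after another. The sketch below is a generic illustration only; it is not the implementation from #28, and the helper name `splitIntoSpeechChunks` and the default limit of 200 characters are assumptions.

```typescript
// Hypothetical helper: split text into sentence-aligned chunks.
// A single sentence longer than `maxLength` is kept whole rather than cut
// mid-sentence, and punctuation inside numbers (e.g. "3.14") is treated
// as a sentence boundary, which a production splitter would handle better.
function splitIntoSpeechChunks(text: string, maxLength = 200): string[] {
  // Each match is a run of non-terminator characters followed by any
  // Latin or CJK sentence terminators and trailing whitespace.
  const sentences = text.match(/[^.!?。！？]+[.!?。！？]*\s*/g) ?? [];
  const chunks: string[] = [];
  let current = '';

  for (const sentence of sentences) {
    // Start a new chunk when adding this sentence would exceed the limit.
    if (current.length > 0 && current.length + sentence.length > maxLength) {
      chunks.push(current.trim());
      current = '';
    }
    current += sentence;
  }

  if (current.trim().length > 0) chunks.push(current.trim());
  return chunks;
}
```

Each chunk would then be handed to its own utterance, with the index of the current chunk kept as state so that pause, resume, and stop can act on a known position.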

Copilot AI review requested due to automatic review settings March 17, 2026 16:41

Copilot started reviewing on behalf of harmony2ww March 17, 2026 16:41

Copilot AI reviewed Mar 17, 2026


harmony2ww force-pushed the fix/browser-native-tts-playback branch from 72b1d3b to 645cca8 on March 17, 2026 16:54

fix: support browser-native TTS in classroom playback

075278a

harmony2ww force-pushed the fix/browser-native-tts-playback branch from 645cca8 to 075278a on March 17, 2026 17:05

wyuc closed this Mar 20, 2026

harmony2ww wants to merge 1 commit into THU-MAIC:main from harmony2ww:fix/browser-native-tts-playback


3 participants
