fix: support browser-native TTS in classroom playback #62

Closed
harmony2ww wants to merge 1 commit into THU-MAIC:main from harmony2ww:fix/browser-native-tts-playback

Conversation

harmony2ww commented Mar 17, 2026

Summary

This PR adds browser-native TTS fallback support to the classroom playback flow so that lectures can still be spoken when no pre-generated audio file is available.

Related Issues

Related to browser-native TTS playback behavior in classroom mode.

Changes

  • add browser-native TTS fallback in lib/utils/audio-player.ts
  • pass text, voice, and speed from playback/action engines into the audio player
  • support pause, resume, stop, and active-state checks for browser speech synthesis
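The fallback decision described in the list above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `SpeechFallback`, `PlaybackSource`, and `choosePlaybackSource` are hypothetical names, and the cache lookup and speech-support check are passed in as plain booleans so the logic can run outside a browser.

```typescript
// Hypothetical payload carrying the text, voice, and speed mentioned in the PR.
interface SpeechFallback {
  text: string;
  voice?: string;
  speed?: number;
}

// Hypothetical result type: which playback path would be taken.
type PlaybackSource =
  | { kind: 'cached'; audioId: string }
  | { kind: 'browser-tts'; text: string; voice: string | undefined; rate: number }
  | { kind: 'none' };

// Prefer pre-generated audio; fall back to browser speech only when there is
// text to speak and speech synthesis is available.
function choosePlaybackSource(
  audioId: string,
  hasCachedAudio: boolean,
  fallback?: SpeechFallback,
  speechSupported: boolean = true,
): PlaybackSource {
  if (hasCachedAudio) {
    return { kind: 'cached', audioId };
  }
  const text = fallback?.text.trim() ?? '';
  if (!speechSupported || text.length === 0) {
    return { kind: 'none' };
  }
  // Assumption: a missing speed maps to the default utterance rate of 1.
  return { kind: 'browser-tts', text, voice: fallback?.voice, rate: fallback?.speed ?? 1 };
}
```

In a browser, the `speechSupported` argument could be computed as `typeof window !== 'undefined' && 'speechSynthesis' in window`.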

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update
  • Refactoring (no functional changes)
  • CI/CD or build changes

Verification

Steps to reproduce / test

  1. Select browser-native-tts as the TTS provider
  2. Generate or open a classroom with speech actions
  3. Start playback and verify speech is spoken through browser speech synthesis

What you personally verified

  • verified browser-native TTS can be used during classroom playback when no cached audio is available
  • verified playback controls still work with browser speech synthesis
  • ran project checks locally

Evidence

  • CI passes (pnpm check && pnpm lint && npx tsc --noEmit)
  • Manually tested locally
  • Screenshots / recordings attached (if UI changes)

Checklist

  • My code follows the project's coding style
  • I have performed a self-review of my code
  • I have added/updated documentation as needed
  • My changes do not introduce new warnings

Copilot AI review requested due to automatic review settings March 17, 2026 16:41
Copilot AI left a comment

Pull request overview

Adds a browser-native (Web Speech API) TTS fallback path for classroom playback when pre-generated audio isn’t available, and threads speech metadata (text/voice/speed) through the playback/action engines to support pause/resume/stop and active-state checks.

Changes:

  • Extend AudioPlayer.play() to accept a fallback payload and synthesize speech via window.speechSynthesis when IndexedDB audio is missing.
  • Pass text, voice, and speed into AudioPlayer.play() from both PlaybackEngine and ActionEngine.
  • Add speech-synthesis aware implementations of pause/resume/stop/playing-state checks in AudioPlayer.
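As a rough sketch of what "speech-synthesis aware" controls could look like, the class below checks both an audio element and a speech synthesizer. `PlaybackControls`, `AudioLike`, and `SynthLike` are hypothetical names introduced for illustration; the two interfaces mirror only the `HTMLAudioElement` and `SpeechSynthesis` members the sketch uses, so it can be exercised with fakes. The PR's actual `AudioPlayer` methods are not shown in this thread and may differ.

```typescript
// Minimal structural types so the sketch runs outside a browser.
interface AudioLike {
  paused: boolean;
  pause(): void;
  play(): unknown;
}

interface SynthLike {
  speaking: boolean;
  paused: boolean;
  pause(): void;
  resume(): void;
  cancel(): void;
}

class PlaybackControls {
  private audio: AudioLike | null;
  private synth: SynthLike | null;

  constructor(audio: AudioLike | null, synth: SynthLike | null) {
    this.audio = audio;
    this.synth = synth;
  }

  // Active if either the audio element or the synthesizer is producing sound.
  isPlaying(): boolean {
    const audioActive = this.audio !== null && !this.audio.paused;
    const speechActive = this.synth !== null && this.synth.speaking && !this.synth.paused;
    return audioActive || speechActive;
  }

  pause(): void {
    if (this.audio && !this.audio.paused) this.audio.pause();
    if (this.synth && this.synth.speaking && !this.synth.paused) this.synth.pause();
  }

  // Assumption: paused speech takes priority over a paused audio element.
  resume(): void {
    if (this.synth && this.synth.paused) {
      this.synth.resume();
    } else if (this.audio && this.audio.paused) {
      void this.audio.play();
    }
  }

  stop(): void {
    if (this.audio) this.audio.pause();
    if (this.synth) this.synth.cancel();
  }
}
```

In a browser this could be constructed as `new PlaybackControls(audioElement, window.speechSynthesis)`, since both objects expose the members listed in the interfaces.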

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| lib/utils/audio-player.ts | Implements Web Speech API fallback and adds speech-synthesis aware pause/resume/active checks. |
| lib/playback/engine.ts | Passes speech metadata into audio playback to enable browser-native fallback. |
| lib/action/engine.ts | Passes speech metadata into audio playback for synchronous speech execution. |

Comment thread lib/utils/audio-player.ts Outdated
if (this.audio && !this.audio.paused) {
  this.audio.pause();
}
if (this.utterance && typeof window !== 'undefined' && window.speechSynthesis.speaking) {
Comment thread lib/utils/audio-player.ts
Comment on lines +97 to +108
const voices = window.speechSynthesis.getVoices();
const desiredVoice = fallback.voice || ttsVoice;
const matchedVoice = voices.find(
  (voice) =>
    voice.voiceURI === desiredVoice ||
    voice.name === desiredVoice ||
    voice.lang === desiredVoice,
);
if (matchedVoice) {
  utterance.voice = matchedVoice;
  utterance.lang = matchedVoice.lang || utterance.lang;
}
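To make the matching rule in the snippet above easier to reason about, it can be pulled out into a pure function. This is a sketch only: `findVoice` and `VoiceLike` are hypothetical names, and `VoiceLike` mirrors just the three `SpeechSynthesisVoice` fields the snippet compares.

```typescript
// Structural stand-in for the SpeechSynthesisVoice fields used in the match.
interface VoiceLike {
  voiceURI: string;
  name: string;
  lang: string;
}

// Returns the first voice whose URI, name, or language tag equals `desired`.
// As with the snippet's Array.prototype.find call, list order decides which
// voice wins when several match (for example, several voices share a lang).
function findVoice<V extends VoiceLike>(voices: readonly V[], desired: string): V | undefined {
  return voices.find(
    (voice) =>
      voice.voiceURI === desired ||
      voice.name === desired ||
      voice.lang === desired,
  );
}
```

Note that `getVoices()` can return an empty array until the browser has loaded its voice list, in which case a function like this returns `undefined` and the utterance keeps its default voice.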
harmony2ww force-pushed the fix/browser-native-tts-playback branch from 72b1d3b to 645cca8 on March 17, 2026 16:54
harmony2ww force-pushed the fix/browser-native-tts-playback branch from 645cca8 to 075278a on March 17, 2026 17:05
wyuc (Contributor) commented Mar 20, 2026

This was already addressed by #28 (merged Mar 18), which added browser-native TTS playback with text chunking, async voice loading, and language auto-detection. We also shipped #153 on top of that to fix a garbled audio bug and add provider switching in the toolbar.

Your approach in AudioPlayer was clean; we just got there through a different path. The PlaybackEngine ended up being a better place for this since it can manage chunk state and pause/resume.

Thanks for the contribution! Your #63 is still open and we'll look at that separately.
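
For readers unfamiliar with the text chunking mentioned above, here is a minimal sketch of the general idea: split the lecture text into sentence-sized pieces so each can be spoken as its own utterance. The code from #28 is not shown in this thread, so `chunkText`, its `maxLen` default, and the punctuation set are all assumptions made for illustration.

```typescript
// Groups sentences into chunks of at most `maxLen` characters.
function chunkText(text: string, maxLen: number = 200): string[] {
  // Match runs of text followed by their sentence-ending punctuation
  // (Latin and CJK), keeping the punctuation with the sentence.
  const sentences = text.match(/[^.!?。！？]+[.!?。！？]*/g) ?? [];
  const chunks: string[] = [];
  let current = '';
  for (const raw of sentences) {
    const sentence = raw.trim();
    if (sentence.length === 0) continue;
    const candidate = current ? `${current} ${sentence}` : sentence;
    if (candidate.length <= maxLen) {
      current = candidate;
    } else {
      if (current) chunks.push(current);
      // A single sentence longer than maxLen is kept whole in this sketch.
      current = sentence;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```

Keeping an index into the returned array is one way a playback engine could track which chunk to resume from after a pause.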

wyuc closed this on Mar 20, 2026

3 participants