fix: use browser speechSynthesis for playback when browser-native-tts is selected #28
Code Review
Clean, focused change with correct edge case handling. Overall LGTM. Two items to consider as follow-up improvements:
1. Chrome long utterance cutoff: Chrome has a known bug where
2.
Thanks for the review!!!! I'll address the Chrome 15s cutoff and Firefox pause/resume issues in a follow-up PR by implementing an utterance queue with text chunking. This will elegantly handle both issues while keeping this PR focused on the basic fallback.
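A rough sketch of the text-chunking half of that idea, for illustration only. The function name chunkText and the 200-character limit are assumptions, not taken from this PR; the real limit would need to be tuned so each utterance finishes before Chrome's cutoff.

```typescript
// Hypothetical helper (not from this PR): split long text into short chunks
// so that each chunk can be spoken as its own utterance.
// maxLen is an assumed tuning knob, measured in UTF-16 code units.
function chunkText(text: string, maxLen = 200): string[] {
  // Break after sentence-ending punctuation (Latin and CJK), keeping the punctuation.
  const sentences = text.match(/[^.!?。!?]+[.!?。!?]*\s*/g) ?? [];
  const chunks: string[] = [];

  // Store a chunk only if something is left after trimming whitespace.
  const push = (s: string): void => {
    const trimmed = s.trim();
    if (trimmed.length > 0) chunks.push(trimmed);
  };

  let current = "";
  for (const sentence of sentences) {
    // Start a new chunk when adding this sentence would exceed the limit.
    if (current.length + sentence.length > maxLen) {
      push(current);
      current = "";
    }
    current += sentence;
    // A single sentence longer than maxLen is hard-split.
    while (current.length > maxLen) {
      push(current.slice(0, maxLen));
      current = current.slice(maxLen);
    }
  }
  push(current);
  return chunks;
}
```

In a queue-based design, each chunk would become one utterance, with the next chunk spoken from the previous utterance's onend handler.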
… is selected

Previously, selecting browser-native-tts as the TTS provider would produce sound in the settings test but remain silent during classroom playback. This happened because:

1. The scene generator correctly skipped pre-generation for browser TTS (it runs client-side, not via API)
2. The playback engine fell back to a silent reading timer when no pre-generated audio was found, instead of calling speechSynthesis

This commit adds Web Speech API integration directly in the PlaybackEngine:

- New playBrowserTTS() method speaks text via speechSynthesis
- Properly wires onend/onerror to advance to the next action
- pause()/resume() now handle speechSynthesis.pause()/resume()
- stop() and handleUserInterrupt() cancel browser TTS

Fixes THU-MAIC#25, fixes THU-MAIC#12, fixes THU-MAIC#5
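The fallback order described in this commit can be sketched roughly as follows. This is not the code from engine.ts: the names choosePlaybackMode, speakThenAdvance, SynthLike, and UtteranceLike are hypothetical, and the small interfaces stand in for window.speechSynthesis and SpeechSynthesisUtterance so the sketch runs outside a browser.

```typescript
type PlaybackMode = "pre-generated-audio" | "browser-tts" | "reading-timer";

// Decide how a speech action is played (hypothetical helper).
function choosePlaybackMode(opts: {
  audioPlayed: boolean; // whether pre-generated audio was found and played
  ttsProvider: string; // selected provider id, e.g. "browser-native-tts"
  speechAvailable: boolean; // whether the Web Speech API exists in this environment
}): PlaybackMode {
  if (opts.audioPlayed) return "pre-generated-audio";
  if (opts.ttsProvider === "browser-native-tts" && opts.speechAvailable) {
    return "browser-tts";
  }
  return "reading-timer"; // silent timer that only estimates reading time
}

// Minimal stand-ins for the browser objects (assumptions for this sketch).
interface UtteranceLike {
  text: string;
  onend: (() => void) | null;
  onerror: (() => void) | null;
}
interface SynthLike {
  speak(utterance: UtteranceLike): void;
  cancel(): void;
}

// Speak text and advance exactly once, whether speech ends or errors.
function speakThenAdvance(synth: SynthLike, text: string, advance: () => void): UtteranceLike {
  let done = false;
  const finish = (): void => {
    if (!done) {
      done = true;
      advance();
    }
  };
  // In a browser this would be: new SpeechSynthesisUtterance(text)
  const utterance: UtteranceLike = { text, onend: finish, onerror: finish };
  synth.speak(utterance);
  return utterance;
}
```

Routing both onend and onerror through one guarded callback keeps playback from stalling on a speech error and from advancing twice if both events fire.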
49b470f to
496b5d9
- When ttsVoice is "default" (set by Browser Native TTS which has no voice picker), the voiceURI lookup silently fails and no lang is set, causing Chinese text to be spoken with an English voice.
- Extract the 0.3 CJK detection threshold as a named constant CJK_LANG_THRESHOLD with JSDoc explaining the rationale.
- Fall through to language auto-detection when voice lookup fails, regardless of the reason (missing voice, "default" sentinel, etc.).
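A minimal sketch of the fall-through described above. Only the constant name CJK_LANG_THRESHOLD and its value 0.3 come from the commit message; the function names, the choice to count non-whitespace characters, the >= comparison, and the "zh-CN" / "en-US" tags are assumptions.

```typescript
/**
 * Assumed meaning: minimum share of CJK characters (among non-whitespace
 * characters) for the text to be treated as Chinese.
 */
const CJK_LANG_THRESHOLD = 0.3;

interface VoiceLike {
  voiceURI: string;
  lang: string;
}

// Guess a language tag from the text itself (hypothetical helper).
function inferLang(text: string): string {
  const chars = text.split("").filter((c) => c.trim() !== "");
  if (chars.length === 0) return "en-US";
  // CJK Unified Ideographs plus Extension A.
  const cjkCount = chars.filter((c) => /[\u3400-\u4dbf\u4e00-\u9fff]/.test(c)).length;
  return cjkCount / chars.length >= CJK_LANG_THRESHOLD ? "zh-CN" : "en-US";
}

// Use the chosen voice's lang when the lookup succeeds; otherwise fall
// through to auto-detection, whatever the reason for the miss
// (missing voice, "default" sentinel, etc.).
function resolveLang(voices: VoiceLike[], ttsVoice: string, text: string): string {
  const voice = voices.find((v) => v.voiceURI === ttsVoice);
  return voice ? voice.lang : inferLang(text);
}
```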
Update: Both follow-up items have been addressed in the latest commits on this branch.
1. Chrome long utterance cutoff → ✅ Fixed
Added
2. Firefox
For both Firefox and Chrome, the browser's native TTS works correctly inside the classroom. However, the TTS preview under the input box on the homepage cannot play the browser's native TTS. Additionally, on Firefox, previewing the browser's native TTS in the Settings panel shows a success status, but no sound is actually heard.
Thanks for your review! Both issues fixed in latest push (
The browser-native and API-based TTS preview code was duplicated across tts-config-popover, media-popover, and tts-settings. Extract it into a reusable useTTSPreview hook that handles refs, cancellation, audio lifecycle, and staleness checks in one place.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
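The staleness check mentioned in this commit can be illustrated without React. The sketch below is an assumption about the general technique (a request counter that invalidates older previews), not the hook's actual code; createPreviewGuard and its method names are hypothetical.

```typescript
// Hypothetical staleness guard: every preview gets an id, and only the most
// recent id that has not been stopped counts as current.
function createPreviewGuard() {
  let currentId = 0;
  let active = false;
  return {
    // Start a new preview; any earlier preview becomes stale.
    begin(): number {
      currentId += 1;
      active = true;
      return currentId;
    },
    // True only for the latest preview that has not been stopped or finished.
    isCurrent(id: number): boolean {
      return active && id === currentId;
    },
    // Mark a preview as finished; ignored if that preview is already stale.
    finish(id: number): void {
      if (id === currentId) active = false;
    },
    // Cancel whatever is running and invalidate its id.
    stop(): void {
      currentId += 1;
      active = false;
    },
    previewing(): boolean {
      return active;
    },
  };
}
```

An async preview would call begin() before awaiting audio, then check isCurrent(id) after each await and bail out if another preview started or stop() was called in the meantime.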
49a94e6 to
1a582f3
cosarah
left a comment
Review Summary
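Overall the changes are of good quality; approve with minor suggestions.
Change analysis
Core feature: browser-native TTS preview ✅
- lib/audio/browser-tts-preview.ts is well encapsulated; cancellation, timeout, and race-condition handling are all in place
- ensureVoicesLoaded correctly handles the known issue of Chrome loading voices asynchronously
- resolveBrowserVoice supports matching by voiceURI, name, or lang, plus automatic CJK language detection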
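To illustrate the multi-key matching mentioned for resolveBrowserVoice, here is a small sketch under stated assumptions. It is not the function from this PR: matchVoice, the lookup order, and the primary-subtag fallback are guesses at one reasonable design. The voiceURI, name, and lang fields mirror the properties of SpeechSynthesisVoice.

```typescript
interface VoiceLike {
  voiceURI: string;
  name: string;
  lang: string;
}

// Hypothetical lookup: try exact voiceURI, then exact name, then exact lang
// tag, then the primary language subtag (e.g. "zh" for "zh-TW").
function matchVoice(voices: VoiceLike[], wanted: string): VoiceLike | undefined {
  const key = wanted.toLowerCase();
  const primary = key.split("-")[0];
  return (
    voices.find((v) => v.voiceURI === wanted) ??
    voices.find((v) => v.name === wanted) ??
    voices.find((v) => v.lang.toLowerCase() === key) ??
    voices.find((v) => v.lang.toLowerCase().split("-")[0] === primary)
  );
}
```

When nothing matches (for example, the "default" sentinel), the function returns undefined and the caller can fall back to language auto-detection.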
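useTTSPreview hook extraction ✅
- The duplicated preview logic in three components (tts-config-popover, media-popover, tts-settings) is consolidated into the hook
- Net reduction of about 130 lines of duplicated code
- The hook interface is clear: { previewing, startPreview, stopPreview }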
setASRProvider language reset ✅
- Switching the ASR provider automatically resets incompatible language codes (BCP-47 vs ISO 639-1)
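A sketch of what such a reset could look like, purely as an assumption: the provider ids, their language lists, and the rule of resetting to the provider's first listed language are all hypothetical and not taken from lib/store/settings.ts.

```typescript
// Hypothetical provider table: one provider uses BCP-47 tags, the other
// ISO 639-1 codes. Real ids and lists live in the project's settings store.
const SUPPORTED_ASR_LANGS: Record<string, string[]> = {
  "bcp47-style-provider": ["en-US", "zh-CN"],
  "iso639-style-provider": ["en", "zh"],
};

// Keep the current language if the new provider accepts it; otherwise reset
// to that provider's first listed language (an assumed default).
function languageAfterProviderSwitch(provider: string, currentLang: string): string {
  const supported = SUPPORTED_ASR_LANGS[provider] ?? [];
  if (supported.indexOf(currentLang) !== -1) return currentLang;
  return supported.length > 0 ? supported[0] : currentLang;
}
```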
Settings input autofill isolation ✅
- Six settings components consistently add attributes such as name and autoComplete="new-password" to stop the browser from autofilling across fields
pptxgenjs comment typo fix ✅
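Minor suggestions (non-blocking)
- The 'use client' directive at the top of browser-tts-preview.ts can be removed; the file contains only utility functions and no React components
- inferPreviewLang currently detects only Chinese (CJK basic plane + Extension A); supporting Japanese or Korean in the future may require widening the matched range
- The Chinese translation of browserTTSNoVoices, "当前浏览器没有可用的 TTS voice" ("the current browser has no available TTS voice"), would read better as "当前浏览器没有可用的语音", which keeps the string consistently in Chinese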
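For the second suggestion, one possible direction is sketched below. This is an illustration only, not a proposal for the exact implementation: inferScriptLang is a hypothetical name, the returned tags are assumptions, and presence-based detection is a simplification compared with a ratio threshold.

```typescript
// Hypothetical extension: distinguish Japanese and Korean from Chinese by
// script. Kana is checked first because Japanese text mixes kana with Han.
function inferScriptLang(text: string): string | null {
  // Hiragana (U+3040-U+309F) and Katakana (U+30A0-U+30FF)
  if (/[\u3040-\u309f\u30a0-\u30ff]/.test(text)) return "ja-JP";
  // Hangul Jamo (U+1100-U+11FF) and Hangul Syllables (U+AC00-U+D7AF)
  if (/[\u1100-\u11ff\uac00-\ud7af]/.test(text)) return "ko-KR";
  // Han only: CJK Extension A (U+3400-U+4DBF) and the basic block (U+4E00-U+9FFF)
  if (/[\u3400-\u4dbf\u4e00-\u9fff]/.test(text)) return "zh-CN";
  return null; // no CJK script found; the caller keeps its own default
}
```

One caveat: the Katakana block also contains punctuation such as the middle dot (U+30FB), which occasionally appears in Chinese text, so a production version might exclude it or combine this check with a ratio threshold.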
* main:
  feat: whiteboard history and auto-save (THU-MAIC#40)
  fix: use browser speechSynthesis for playback when browser-native-tts is selected (THU-MAIC#28)
  chore: fix some minor issues in the comments (THU-MAIC#71)
  fix: reset ASR language when changing provider (THU-MAIC#67)
  fix: isolate settings API key autofill fields (THU-MAIC#48)

# Conflicts:
#	components/audio/tts-config-popover.tsx
#	components/generation/media-popover.tsx
#	components/settings/tts-settings.tsx
#	lib/store/settings.ts
… is selected (THU-MAIC#28)

* fix: use browser speechSynthesis for playback when browser-native-tts is selected

  Previously, selecting browser-native-tts as the TTS provider would produce sound in the settings test but remain silent during classroom playback. This happened because:

  1. The scene generator correctly skipped pre-generation for browser TTS (it runs client-side, not via API)
  2. The playback engine fell back to a silent reading timer when no pre-generated audio was found, instead of calling speechSynthesis

  This commit adds Web Speech API integration directly in the PlaybackEngine:

  - New playBrowserTTS() method speaks text via speechSynthesis
  - Properly wires onend/onerror to advance to the next action
  - pause()/resume() now handle speechSynthesis.pause()/resume()
  - stop() and handleUserInterrupt() cancel browser TTS

  Fixes THU-MAIC#25, fixes THU-MAIC#12, fixes THU-MAIC#5

* fix: handle "default" ttsVoice, extract CJK_LANG_THRESHOLD constant

  - When ttsVoice is "default" (set by Browser Native TTS which has no voice picker), the voiceURI lookup silently fails and no lang is set, causing Chinese text to be spoken with an English voice.
  - Extract the 0.3 CJK detection threshold as a named constant CJK_LANG_THRESHOLD with JSDoc explaining the rationale.
  - Fall through to language auto-detection when voice lookup fails, regardless of the reason (missing voice, "default" sentinel, etc.).

* fix: support browser-native tts previews

* refactor: extract shared TTS preview logic into useTTSPreview hook

  The browser-native and API-based TTS preview code was duplicated across tts-config-popover, media-popover, and tts-settings. Extract it into a reusable useTTSPreview hook that handles refs, cancellation, audio lifecycle, and staleness checks in one place.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: yangshen <1322568757@qq.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
Fix browser-native TTS producing no sound during classroom playback, while the settings test plays sound correctly.
Fixes #25, fixes #12, fixes #5
Root Cause
When browser-native-tts is selected as the TTS provider:
1. The scene generator (use-scene-generator.ts:214,450) correctly skips pre-generating audio: browser TTS runs client-side via the Web Speech API, not via a server API
2. The playback engine (engine.ts:436-444) calls audioPlayer.play(), which finds no pre-generated audio in IndexedDB, returns false, and falls back to scheduleReadingTimer(), a silent timer that estimates reading time but never calls speechSynthesis
Fix
Add Web Speech API integration directly in PlaybackEngine (lib/playback/engine.ts):
- playBrowserTTS(): speaks text via window.speechSynthesis, respecting the user's voice, speed, volume, and mute settings
- cancelBrowserTTS(): cancels active browser TTS
- pause(): calls speechSynthesis.pause() when browser TTS is active
- resume(): calls speechSynthesis.resume() when browser TTS is paused
- stop() / handleUserInterrupt(): call speechSynthesis.cancel() to stop browser TTS
The fix is self-contained in one file. When audioPlayer.play() returns false (no pre-generated audio), the engine now checks whether browser-native-tts is the selected provider and calls speechSynthesis.speak() instead of falling back to the silent reading timer.
Changes
- lib/playback/engine.ts
Testing