Surface EOS detection and add text normalization #4

Merged: TroyHernandez merged 2 commits into main from fix-eos-and-uppercase on Jun 11, 2026
@TroyHernandez

Addresses two reliability bugs reported by @chris-english.

Summary

  • All four `t3_inference` variants now report whether EOS was emitted via an `eos_found` attribute on returned tokens
  • `generate()` surfaces `eos_found`, `n_tokens`, and `audio_sec` in its return list and warns when EOS was missed
  • `tts_to_file()` returns a list (`path`, `eos_found`, `n_tokens`, `audio_sec`) so callers iterating over many texts can collect a failure report
  • New `normalize_tts_text()` lowercases mid-sentence capitalized words, all-caps emphasis words, and internal-caps oddities while preserving sentence-initial caps and the pronoun `I`. Applied by default in `generate()` (opt out with `normalize_text = FALSE`)
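
As a rough illustration of the "failure report" use case, the sketch below loops over several texts and keeps the rows where EOS was missed. It does not call the package: `fake_tts_to_file()` is a hypothetical stand-in that only mimics the documented return shape (`path`, `eos_found`, `n_tokens`, `audio_sec`), and its token cap and tokens-per-second figure are arbitrary assumptions.

```r
# Hypothetical stand-in for tts_to_file(); the real function needs a loaded
# model. Only the return shape (path, eos_found, n_tokens, audio_sec) is
# taken from the PR description; everything else is made up.
fake_tts_to_file <- function(text, path, max_tokens = 10L) {
  n_tokens <- min(nchar(text), max_tokens)  # pretend: one token per character
  list(
    path      = path,
    eos_found = nchar(text) < max_tokens,   # reaching the cap means EOS was missed
    n_tokens  = n_tokens,
    audio_sec = n_tokens / 25               # arbitrary rate, for illustration only
  )
}

texts <- c("Hi there", "This sentence is far too long for the toy cap")

# One result list per input text
results <- lapply(seq_along(texts), function(i) {
  fake_tts_to_file(texts[i], sprintf("clip_%02d.wav", i))
})

# Flatten the per-text lists into one data frame and keep the failures
report <- do.call(rbind, lapply(results, as.data.frame))
report$text <- texts
failures <- report[!report$eos_found, ]
print(failures[, c("path", "n_tokens", "audio_sec")])
```

With the real `tts_to_file()`, the same pattern should apply as long as each call returns the four fields listed above.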

Bugs addressed

Test plan

Two related TTS reliability fixes reported by chris-english:

EOS detection (issue #1):
- All four t3_inference variants (R, traced, cpp, turbo) now track
  whether the model emitted an end-of-speech token vs hit the token cap,
  and attach the result as an 'eos_found' attribute on returned tensors
- generate() reads the attribute and includes eos_found, n_tokens, and
  audio_sec in its return list
- generate() warns when EOS was not found (output is likely garbage)
- tts_to_file() returns a list with path, eos_found, n_tokens, audio_sec
  so callers iterating over many texts can collect a failure report
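
The attribute-based reporting described above could look roughly like the following toy decoding loop. This is a sketch under stated assumptions, not the package's code: `decode_with_eos_flag()`, `check_eos()`, the sampler argument, the EOS id, and the token cap are all hypothetical, and a plain integer vector stands in for the returned tensor.

```r
# Toy decoding loop. `next_token` is a hypothetical sampler; eos_id and
# max_tokens are made-up values, not the ones used by t3_inference.
decode_with_eos_flag <- function(next_token, eos_id = 0L, max_tokens = 5L) {
  tokens <- integer(0)
  eos_found <- FALSE
  for (step in seq_len(max_tokens)) {
    tok <- next_token(step)
    if (tok == eos_id) {
      eos_found <- TRUE  # model ended on its own
      break
    }
    tokens <- c(tokens, tok)
  }
  # Attach the outcome to the returned tokens so callers can inspect it
  attr(tokens, "eos_found") <- eos_found
  tokens
}

# Caller side: read the attribute and warn when the cap was hit instead
check_eos <- function(tokens) {
  eos_found <- isTRUE(attr(tokens, "eos_found"))
  if (!eos_found) warning("EOS not found: generation stopped at the token cap")
  eos_found
}

stops_early <- decode_with_eos_flag(function(step) if (step == 3L) 0L else 7L)
runs_out    <- decode_with_eos_flag(function(step) 7L)

attr(stops_early, "eos_found")  # TRUE: EOS emitted at step 3
attr(runs_out, "eos_found")     # FALSE: all 5 steps used without EOS
```

Using `isTRUE()` on the caller side means a token vector with no attribute at all is treated as "EOS not found" rather than raising an error.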

Text normalization (issue #2):
- New normalize_tts_text() lowercases mid-sentence capitalized words,
  all-caps emphasis words, and internal-caps oddities. Sentence-initial
  caps and the pronoun "I" are preserved.
- generate() applies normalization by default (normalize_text = TRUE).
  The chatterbox model interprets internal capitals as emphasis cues
  and often produces silent audio for them, so this addresses a real
  failure mode reported on inputs like "Yes, Rarely or never Almost
  never..."
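
To make the rules concrete, here is a minimal sketch of the kind of normalization described above. It is not the package's `normalize_tts_text()`: `normalize_sketch()` is a hypothetical helper that splits on single spaces, treats `.`, `!`, and `?` as sentence ends, and lowercases every other capital, including proper nouns, which the real function may handle differently.

```r
# Hypothetical sketch of the described rules; not the package implementation.
normalize_sketch <- function(text) {
  words <- strsplit(text, " ", fixed = TRUE)[[1]]
  sentence_start <- TRUE
  for (i in seq_along(words)) {
    w <- words[i]
    if (!nzchar(w)) next
    core <- gsub("[^[:alpha:]']", "", w)  # word without surrounding punctuation
    # Leave the pronoun "I" and contractions such as "I'm" untouched
    if (core != "I" && !startsWith(core, "I'")) {
      if (sentence_start) {
        # Keep the first character as written, lowercase the rest
        words[i] <- paste0(substr(w, 1, 1), tolower(substring(w, 2)))
      } else {
        words[i] <- tolower(w)
      }
    }
    # The next word starts a sentence if this one ends with . ! or ?
    sentence_start <- grepl("[.!?]$", w)
  }
  paste(words, collapse = " ")
}

normalize_sketch("Yes, Rarely or never Almost never")
# "Yes, rarely or never almost never"
normalize_sketch("I think THIS is WeIrD. Then I stop.")
# "I think this is weird. Then I stop."
```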

8 new tinytest cases for normalize_tts_text covering the reported
failure modes.

R CMD check: 0 errors, 0 warnings, 1 harmless NOTE.

TroyHernandez merged commit 758e847 into main on Jun 11, 2026
4 checks passed
TroyHernandez deleted the fix-eos-and-uppercase branch on June 11, 2026 15:12