
Conversation

@lenzo-ka (Contributor) commented Oct 9, 2025

Summary

This PR increases FSG phonetic context support from 128 to 256 phones and simplifies the bitvector implementation for better maintainability. The motivation is that models with large phone inventories -- Chinese in particular -- otherwise fail unless the library is recompiled.

Changes

  • Increase FSG_PNODE_CTXT_BVSZ from 4 to 8, supporting up to 256 phones (up from 128)
  • Replace the compile-time conditional macros with a single generic implementation (see the sketch below)
  • Improve the error message to report the actual vs. maximum phone count
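
For reference, the capacity is just the number of 32-bit words times 32, so BVSZ=8 gives 8 × 32 = 256 phones. Below is a minimal sketch of the new shape, modeled on the FSG search internals; the `check_phone_capacity` helper and its use of `fprintf` are illustrative, not the exact diff (the real code reports through sphinxbase's error macros):

```c
#include <stdint.h>
#include <stdio.h>

typedef uint32_t uint32;   /* provided by sphinxbase's prim_type.h in-tree */

/* 8 x 32-bit words => up to 256 left-context phones (was 4 => 128). */
#define FSG_PNODE_CTXT_BVSZ 8

typedef struct {
    uint32 bv[FSG_PNODE_CTXT_BVSZ];
} fsg_pnode_ctxt_t;

/* Generic subtract: clear from src every bit set in sub, returning
 * nonzero if any bits remain. Replaces the per-size macro variants
 * and works unchanged for any FSG_PNODE_CTXT_BVSZ. */
static uint32
fsg_pnode_ctxt_sub(fsg_pnode_ctxt_t *src, const fsg_pnode_ctxt_t *sub)
{
    uint32 res = 0;
    int i;

    for (i = 0; i < FSG_PNODE_CTXT_BVSZ; i++)
        res |= (src->bv[i] = ~(sub->bv[i]) & src->bv[i]);
    return res;
}

/* Illustrative load-time check showing the improved error message. */
static int
check_phone_capacity(int n_ciphone)
{
    const int max_phones = FSG_PNODE_CTXT_BVSZ * 32;

    if (n_ciphone > max_phones) {
        fprintf(stderr, "Model has %d CI phones; FSG search supports "
                "at most %d\n", n_ciphone, max_phones);
        return -1;
    }
    return 0;
}
```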

Rationale

The previous implementation required recompilation to support larger phonesets. This change:

  1. Supports larger phonesets: 256 phones covers all known phonetic inventories without custom builds
  2. Simplifies maintenance: Removes 20+ lines of conditional macro code in favor of a generic function (the removed pattern is sketched after this list)
  3. Negligible performance impact: Bitvector operations occur during word transitions, not in the acoustic scoring hot path (~0.002% overhead)
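
For contrast, the removed pattern looked roughly like this (reconstructed; the unrolled variants for the remaining sizes are elided). Each supported size needed its own hand-unrolled macro, so growing the bitvector meant adding another branch and rebuilding:

```c
/* Old style: one compile-time variant per FSG_PNODE_CTXT_BVSZ value. */
#if (FSG_PNODE_CTXT_BVSZ == 1)
#define FSG_PNODE_CTXT_SUB(src, sub) \
    ((src)->bv[0] = ~((sub)->bv[0]) & (src)->bv[0])
#elif (FSG_PNODE_CTXT_BVSZ == 2)
#define FSG_PNODE_CTXT_SUB(src, sub) \
    (((src)->bv[0] = ~((sub)->bv[0]) & (src)->bv[0]) | \
     ((src)->bv[1] = ~((sub)->bv[1]) & (src)->bv[1]))
/* ... further unrolled variants for BVSZ == 3 and 4 ... */
#else
#define FSG_PNODE_CTXT_SUB(src, sub) fsg_pnode_ctxt_sub_generic((src), (sub))
#endif
```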

Memory Impact

  • Per fsg_pnode_t: +16 bytes (an ~11% increase, from ~144 to ~160 bytes)
  • Typical grammars: +160KB to 1.6MB for 10K-100K nodes (see the arithmetic below)
  • Acceptable tradeoff for runtime flexibility
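
The per-node figure follows directly from the bitvector growth: (8 − 4) words × 4 bytes/word = 16 bytes per fsg_pnode_t, so 10K nodes × 16 B ≈ 160KB and 100K nodes × 16 B ≈ 1.6MB, matching the range above.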

Testing

  • Clean compilation with no errors
  • Existing unit tests pass
  • Backward compatible with existing models and grammars

Commit message: Increase FSG_PNODE_CTXT_BVSZ from 4 to 8, supporting up to 256 phones instead of 128. This covers all known phonesets without recompilation.

Replace compile-time macro variants with a generic implementation to simplify code maintenance. The performance impact is negligible, as these operations occur during word transitions, not acoustic scoring.

Memory impact: ~16 bytes per fsg_pnode (11% increase), typically
160KB-1.6MB additional for medium to large grammars.
@lenzo-ka lenzo-ka requested a review from dhdaines October 9, 2025 21:05
@lenzo-ka (Author) commented Oct 9, 2025

I looked at doing this dynamically at model load time, based on the number of phones in the model, but the savings are only about 160-240KB total.

@lenzo-ka (Author) commented
This still runs fine on existing models with <128 phones

@dhdaines (Contributor) commented
Yes, I agree that this is an okay change. For complex grammars there are much worse sources of memory inefficiency in the FSG search code than this!

@dhdaines merged commit b86eb46 into main on Oct 23, 2025
21 checks passed
