Add LODR support to online and offline recognizers #2026

Merged
csukuangfj merged 30 commits into k2-fsa:master from vsd-vector:lodr_support
Jul 9, 2025

Conversation

@vsd-vector
Contributor

@vsd-vector vsd-vector commented Mar 19, 2025

This PR adds LODR support from Icefall to the offline and online recognizers, for both LM shallow fusion and LM rescoring.
(see https://k2-fsa.github.io/icefall/decoding-with-langugage-models/LODR.html)

Usage example:

# offline LM rescore
sherpa-onnx-offline \
  --tokens=tokens.txt \
  --encoder=encoder.onnx \
  --decoder=decoder.onnx \
  --joiner=joiner.onnx \
  --decoding-method=modified_beam_search \
  --lm=lm.onnx \
  --lodr-fst=2gram.fst \
  --lodr-scale=-0.5 \
  test.wav

# online LM rescore
sherpa-onnx \
  --tokens=tokens.txt \
  --encoder=encoder.onnx \
  --decoder=decoder.onnx \
  --joiner=joiner.onnx \
  --decoding-method=modified_beam_search \
  --lm=lm.onnx \
  --lodr-fst=2gram.fst \
  --lodr-scale=-0.5 \
  --lm-shallow-fusion=false \
  test.wav

# online LM shallow fusion
sherpa-onnx \
  --tokens=tokens.txt \
  --encoder=encoder.onnx \
  --decoder=decoder.onnx \
  --joiner=joiner.onnx \
  --decoding-method=modified_beam_search \
  --lm=lm.onnx \
  --lodr-fst=2gram.fst \
  --lodr-scale=-0.5 \
  --lodr-backoff-id=500 \
  --lm-shallow-fusion=true \
  test.wav

where:

  • 2gram.fst is the LODR n-gram model in binary FST format, e.g., created by Icefall using arpa2fst and then compiled to binary with fstcompile
  • lodr-backoff-id is the ID of the backoff symbol in the LODR FST (typically len(vocabulary); you can also use -1 for auto-detection)
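To make the scoring concrete, here is a small self-contained Python sketch of the density-ratio idea behind LODR: the neural LM score is fused with a positive scale while a low-order n-gram estimate is subtracted via the negative --lodr-scale. The toy bigram table, token ids, and scale values below are invented for illustration only and do not correspond to sherpa-onnx internals.

```python
import math

# Toy bigram model with backoff, standing in for the compiled 2gram.fst.
# All token ids, probabilities, and scales are invented for illustration;
# a real LODR FST is built from an ARPA file via arpa2fst.
BIGRAM_LOGP = {(1, 2): math.log(0.6), (2, 3): math.log(0.5)}
UNIGRAM_LOGP = {1: math.log(0.3), 2: math.log(0.4), 3: math.log(0.3)}
BACKOFF_LOGP = {1: math.log(0.8), 2: math.log(0.7)}  # backoff penalties


def ngram_score(tokens):
    """Log-probability of a token sequence under the toy bigram model."""
    total, prev = 0.0, None
    for tok in tokens:
        if prev is not None and (prev, tok) in BIGRAM_LOGP:
            total += BIGRAM_LOGP[(prev, tok)]
        else:
            if prev is not None:  # back off, paying the backoff penalty
                total += BACKOFF_LOGP.get(prev, 0.0)
            total += UNIGRAM_LOGP[tok]
        prev = tok
    return total


def combined_score(am, lm, ngram, lm_scale=0.5, lodr_scale=-0.5):
    """Density-ratio combination: the low-order n-gram estimate is
    subtracted (negative lodr_scale) from the neural-LM-fused score."""
    return am + lm_scale * lm + lodr_scale * ngram


print(round(ngram_score([1, 2, 3]), 4))  # -2.4079
```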

Summary by CodeRabbit

  • New Features

    • Added support for LODR (Low-Order Density Ratio) rescoring with bi-gram FST models in offline and online transducer decoding.
    • Introduced new command-line arguments and Python API parameters to specify LODR FST paths, backoff IDs, and scaling factors.
    • Enhanced example scripts and documentation with usage examples for RNN language models combined with LODR rescoring.
    • Integrated LODR rescoring into language model scoring for both offline and online recognition workflows.
  • Bug Fixes

    • None.
  • Documentation

    • Updated usage instructions and example commands to include new LODR-related options and parameters.
  • Chores

    • Expanded test scripts to cover decoding scenarios involving LODR rescoring and external language models.

@csukuangfj
Collaborator

Can you show how it improves the decoding result and also how it affects the RTF?

@vsd-vector
Contributor Author

You can check the LODR paper for details.

In our experiments with private data we saw relative improvements of 3-7%.

Some performance numbers as reported by sherpa-onnx (non-optimized debug build on CPU):

LM rescore, no LODR:
Number of threads: 2, Elapsed seconds: 2.6e+03, Audio duration (s): 5e+03, Real time factor (RTF) = 2.6e+03/5e+03 = 0.52
LODR:
Number of threads: 2, Elapsed seconds: 2.8e+03, Audio duration (s): 5e+03, Real time factor (RTF) = 2.8e+03/5e+03 = 0.56

LM shallow fusion, no LODR:
Number of threads: 2, Elapsed seconds: 6.8e+03, Audio duration (s): 5e+03, Real time factor (RTF) = 6.8e+03/5e+03 = 1.4
LODR:
Number of threads: 2, Elapsed seconds: 6.8e+03, Audio duration (s): 5e+03, Real time factor (RTF) = 6.8e+03/5e+03 = 1.4
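As a sanity check on the figures above, the RTF reported by sherpa-onnx is simply elapsed processing time divided by audio duration; a minimal, illustrative Python helper:

```python
def real_time_factor(elapsed_s, audio_s):
    """RTF = processing time / audio duration; values below 1.0 mean
    faster than real time. Mirrors the formula in the logs above."""
    return elapsed_s / audio_s


# Rounded figures quoted from the debug-build measurements above.
print(real_time_factor(2.6e3, 5e3))  # 0.52
print(real_time_factor(6.8e3, 5e3))  # 1.36
```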

@csukuangfj
Collaborator

Some performance numbers as reported by sherpa-onnx (non-optimized debug build on CPU)

Can you test with a release build?

@vsd-vector
Contributor Author

Can you test with a release build?
on the same ~1.5h audio.

rescore:
Number of threads: 2, Elapsed seconds: 2.3e+03, Audio duration (s): 5e+03, Real time factor (RTF) = 2.3e+03/5e+03 = 0.45

rescore+LODR:
Number of threads: 2, Elapsed seconds: 2.3e+03, Audio duration (s): 5e+03, Real time factor (RTF) = 2.3e+03/5e+03 = 0.47

SF:
Number of threads: 2, Elapsed seconds: 6.2e+03, Audio duration (s): 5e+03, Real time factor (RTF) = 6.2e+03/5e+03 = 1.2

SF+LODR:
Number of threads: 2, Elapsed seconds: 6.3e+03, Audio duration (s): 5e+03, Real time factor (RTF) = 6.3e+03/5e+03 = 1.3

@vsd-vector
Contributor Author

@csukuangfj Just wanted to kindly check in to see if there's anything else you'd like me to update on this PR.

By the way, I appreciate your time and all the work you do on the project. Is there any plan to have more maintainers/reviewers?

@csukuangfj
Collaborator

Can you test with a release build?
on the same ~1.5h audio.

rescore: Number of threads: 2, Elapsed seconds: 2.3e+03, Audio duration (s): 5e+03, Real time factor (RTF) = 2.3e+03/5e+03 = 0.45

rescore+LODR: Number of threads: 2, Elapsed seconds: 2.3e+03, Audio duration (s): 5e+03, Real time factor (RTF) = 2.3e+03/5e+03 = 0.47

SF: Number of threads: 2, Elapsed seconds: 6.2e+03, Audio duration (s): 5e+03, Real time factor (RTF) = 6.2e+03/5e+03 = 1.2

SF+LODR: Number of threads: 2, Elapsed seconds: 6.3e+03, Audio duration (s): 5e+03, Real time factor (RTF) = 6.3e+03/5e+03 = 1.3

Thank you for sharing the test results.


Is there any plan to have more maintainers/reviewers?

Yes, sherpa-onnx is an open-source project. Contributions of any form, e.g., pull requests and code reviews, are always welcome.

vsd-vector and others added 3 commits April 8, 2025 15:29
@vsd-vector
Contributor Author

Hi again!

The requested changes have been integrated into the PR.

@vsd-vector vsd-vector requested a review from csukuangfj April 8, 2025 14:50
@vsd-vector
Contributor Author

@csukuangfj

Contributor Author

@vsd-vector vsd-vector left a comment


Is backoff_id_ always 0?

I think it is the id of #0, right?

Yes, usually it's the id of #0, so one of the last tokens in the vocabulary. So 0 is probably not a good default.

I have two ideas:

  1. I could set the default value to -1 and later deduce it automatically from tokens.txt
  2. or make backoff_id a required parameter if lodr_fst is set

@vsd-vector
Contributor Author

@csukuangfj backoff_id is now -1 by default and inferred from the FST itself.
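One plausible way such auto-detection can work, sketched in Python with a toy arc representation (this illustrates the idea only; it is not the actual sherpa-onnx implementation): n-gram FSTs built by arpa2fst conventionally carry an epsilon (id 0) output label on backoff arcs, so the backoff symbol id can be read off the input side of such an arc.

```python
# Toy arcs as (src_state, ilabel, olabel, dst_state) tuples. Hypothetical
# heuristic: find an arc with a non-epsilon input label but an epsilon
# output label and treat its input label as the backoff symbol.
def infer_backoff_id(arcs):
    for _src, ilabel, olabel, _dst in arcs:
        if olabel == 0 and ilabel != 0:
            return ilabel
    return -1  # not found; the user must pass --lodr-backoff-id explicitly


toy_arcs = [
    (0, 5, 5, 1),    # ordinary token arc
    (1, 500, 0, 0),  # backoff arc: #0-style input label, epsilon output
]
print(infer_backoff_id(toy_arcs))  # 500
```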

@vsd-vector vsd-vector requested a review from csukuangfj April 29, 2025 12:59
@csukuangfj
Collaborator

Can you add a CI test for it? It would be great if a Python example test and an example using the pre-built binary were available, so that users can learn how to use the new feature through examples.

@vsd-vector
Contributor Author

Can you add a CI test for it? It would be great if a Python example test and an example using the pre-built binary were available, so that users can learn how to use the new feature through examples.

Yes, I think I can add something like this. I will need to download models and the LODR FST during the CI test; I can probably use some public models, but what about the FST?

Also, what audio should I use in the test, and where is the best place to host it?

@csukuangfj
Collaborator

Can you upload the files to Hugging Face and download them in CI?

@csukuangfj
Collaborator

By the way, if you don't want to make your model and FST public, can you use the test model and FST files from icefall?

@vsd-vector
Contributor Author

@csukuangfj I added some CI tests using Zipformer2 EN models, covering both the CLI and Python.

@csukuangfj
Collaborator

@csukuangfj I added some CI tests using Zipformer2 EN models, covering both the CLI and Python.

Thanks! Will review it this week.

@vsd-vector
Contributor Author

@csukuangfj is there anything you'd like me to update on this PR?

Collaborator

@csukuangfj csukuangfj left a comment


Thanks! Left some minor comments. Otherwise, it looks good to me.

Comment thread sherpa-onnx/python/csrc/online-lm-config.cc Outdated
Comment thread sherpa-onnx/csrc/online-lm-config.h Outdated
Comment thread sherpa-onnx/csrc/online-lm-config.h Outdated
Comment thread sherpa-onnx/csrc/online-lm-config.h Outdated
Comment thread sherpa-onnx/csrc/online-lm-config.h
Comment thread sherpa-onnx/csrc/online-lm-config.cc Outdated
Comment thread sherpa-onnx/csrc/lodr-fst.cc Outdated
@csukuangfj csukuangfj requested a review from Copilot July 8, 2025 14:23

Copilot AI left a comment


Pull Request Overview

This PR integrates LODR (Low-Order Density Ratio) support from Icefall into both online and offline recognizers, enabling LODR for LM shallow fusion and LM rescoring.

  • Extended OnlineLMConfig and OfflineLMConfig to include lodr_fst, lodr_scale, and lodr_backoff_id.
  • Implemented LodrFst and LodrStateCost classes and wired them into RNN LM scoring in both online and offline code paths.
  • Updated Python bindings, CLI entry points, examples, and CI test scripts to accept and exercise the new LODR options.

Reviewed Changes

Copilot reviewed 21 out of 21 changed files in this pull request and generated no comments.

Summary per file:

  • python/sherpa_onnx/online_recognizer.py: Added lodr_fst and lodr_scale parameters to the factory method
  • python/sherpa_onnx/offline_recognizer.py: Same additions for the offline recognizer factory
  • python/csrc/online-lm-config.cc: Extended the pybind init signature and read/write fields
  • python/csrc/offline-lm-config.cc: Extended the pybind init signature and read/write fields
  • csrc/online-lm-config.h/.cc: Added LODR members, Register, Validate, ToString
  • csrc/offline-lm-config.h/.cc: Same for the offline LM config
  • csrc/lodr-fst.h / csrc/lodr-fst.cc: New LODR FST implementation
  • csrc/online-rnn-lm.cc / csrc/offline-rnn-lm.cc: Integrated LODR into RNN LM scoring
  • csrc/offline-lm.h/.cc: Integrated LODR into the generic offline LM
  • python-api-examples/online-decode-files.py: Added LODR options to the demo script
  • python-api-examples/offline-decode-files.py: Same for the offline example
  • .github/scripts/*.sh: Download and test the LODR FST in CI

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>
@coderabbitai

coderabbitai Bot commented Jul 8, 2025

Walkthrough

This change introduces LODR (Low-Order Density Ratio) support across both offline and online speech recognition pipelines. It adds new configuration options, command-line arguments, and an implementation of LODR FST-based rescoring in the C++ and Python APIs. Test scripts and example usage are updated to validate and demonstrate the new functionality, and supporting classes for FST-based rescoring are implemented.

Changes

  • .github/scripts/test-*.sh: Updated test scripts to download/prepare LODR FST and RNN-LM models and run new tests with LODR and LM integration.
  • python-api-examples/offline-decode-files.py, python-api-examples/online-decode-files.py: Added command-line arguments for LODR FST and LODR scale; passed these to recognizer constructors; updated usage docs.
  • sherpa-onnx/csrc/lodr-fst.h, sherpa-onnx/csrc/lodr-fst.cc: Introduced new classes for LODR FST and state-cost management, enabling FST-based rescoring.
  • sherpa-onnx/csrc/CMakeLists.txt: Added lodr-fst.cc to the build.
  • sherpa-onnx/csrc/hypothesis.h: Added a lodr_state member to the Hypothesis struct for LODR state tracking.
  • sherpa-onnx/csrc/offline-lm-config.*, sherpa-onnx/csrc/online-lm-config.*: Added LODR FST path, scale, and backoff ID to LM config structs with registration and validation.
  • sherpa-onnx/csrc/offline-lm.h, sherpa-onnx/csrc/offline-lm.cc: Integrated LODR FST scoring into offline LM scoring logic; added config-based LODR FST instantiation.
  • sherpa-onnx/csrc/offline-rnn-lm.cc: Updated constructors to call the base class with the full config (including LODR options).
  • sherpa-onnx/csrc/online-rnn-lm.cc: Integrated LODR FST scoring into online RNN-LM scoring logic, supporting both shallow fusion and rescoring.
  • sherpa-onnx/python/csrc/offline-lm-config.cc, sherpa-onnx/python/csrc/online-lm-config.cc: Exposed the new LODR FST and scale (and backoff ID for online) to the Python bindings and constructors.
  • sherpa-onnx/python/sherpa_onnx/offline_recognizer.py, sherpa-onnx/python/sherpa_onnx/online_recognizer.py: Added LODR FST and scale parameters to recognizer constructors and passed them to config objects.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant PythonScript
    participant Recognizer
    participant LM (RNN/NN)
    participant LODR FST

    User->>PythonScript: Run decode with --lm, --lodr-fst, --lodr-scale
    PythonScript->>Recognizer: Construct with LM and LODR config
    Recognizer->>LM (RNN/NN): Score hypothesis
    Recognizer->>LODR FST: Rescore hypothesis with FST and scale
    LODR FST-->>Recognizer: Return LODR score
    LM (RNN/NN)-->>Recognizer: Return LM score
    Recognizer-->>PythonScript: Final rescored hypothesis
    PythonScript-->>User: Output results


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5dc574a and c761a7d.

📒 Files selected for processing (1)
  • sherpa-onnx/csrc/lodr-fst.cc (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • sherpa-onnx/csrc/lodr-fst.cc

Co-authored-by: Fangjun Kuang <csukuangfj@gmail.com>

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 8

♻️ Duplicate comments (3)
sherpa-onnx/csrc/online-lm-config.h (2)

21-23: Use consistent integer type for lodr_backoff_id.

The member lodr_backoff_id uses int while the codebase convention is to use int32_t for consistency with other similar members in the struct.

-  int lodr_backoff_id = -1;
+  int32_t lodr_backoff_id = -1;

29-32: Update constructor parameter type for consistency.

The constructor parameter should use int32_t to match the member variable type.

-                 int lodr_backoff_id)
+                 int32_t lodr_backoff_id)
sherpa-onnx/csrc/lodr-fst.h (1)

52-52: Add comment documenting fst_ ownership.

Please add a comment clarifying whether fst_ is owned by this class, similar to the documentation provided for fst_ in the LodrStateCost class.

🧹 Nitpick comments (3)
sherpa-onnx/python/sherpa_onnx/offline_recognizer.py (1)

72-73: Add documentation for the new LODR parameters.

The new lodr_fst and lodr_scale parameters are not documented in the method's docstring.

Add documentation for these parameters in the docstring around line 138:

          rule_fars:
            If not empty, it specifies fst archives for inverse text normalization.
            If there are multiple archives, they are separated by a comma.
+         lodr_fst:
+           Path to the LODR (Low-Order Density Ratio) n-gram FST file
+           in binary format. If empty, LODR is disabled.
+         lodr_scale:
+           Scale factor for LODR rescoring. Only used when lodr_fst is provided.
sherpa-onnx/python/sherpa_onnx/online_recognizer.py (1)

92-93: Add documentation for the new LODR parameters.

The new lodr_fst and lodr_scale parameters are not documented in the method's docstring.

Add documentation for these parameters in the docstring around line 220:

          trt_dump_subgraphs: bool = False,
            "Dump optimized subgraphs for debugging." TensorRT EP
+         lodr_fst:
+           Path to the LODR (Low-Order Density Ratio) n-gram FST file
+           in binary format. If empty, LODR is disabled.
+         lodr_scale:
+           Scale factor for LODR rescoring. Only used when lodr_fst is provided.
sherpa-onnx/csrc/lodr-fst.cc (1)

119-124: Optimize memory allocation in the loop.

Creating a new unique_ptr in each iteration is inefficient. Consider modifying the existing object in-place instead.

-  for (size_t i = offset; i < hyp->ys.size(); ++i) {
-    auto next_lodr_state = std::make_unique<LodrStateCost>(
-      hyp->lodr_state->ForwardOneStep(hyp->ys[i]));
-
-    hyp->lodr_state = std::move(next_lodr_state);
-  }
+  for (size_t i = offset; i < hyp->ys.size(); ++i) {
+    *hyp->lodr_state = hyp->lodr_state->ForwardOneStep(hyp->ys[i]);
+  }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 831aff1 and 3dbd6d9.

📒 Files selected for processing (21)
  • .github/scripts/test-offline-transducer.sh (1 hunks)
  • .github/scripts/test-online-transducer.sh (1 hunks)
  • .github/scripts/test-python.sh (1 hunks)
  • python-api-examples/offline-decode-files.py (3 hunks)
  • python-api-examples/online-decode-files.py (3 hunks)
  • sherpa-onnx/csrc/CMakeLists.txt (1 hunks)
  • sherpa-onnx/csrc/hypothesis.h (2 hunks)
  • sherpa-onnx/csrc/lodr-fst.cc (1 hunks)
  • sherpa-onnx/csrc/lodr-fst.h (1 hunks)
  • sherpa-onnx/csrc/offline-lm-config.cc (3 hunks)
  • sherpa-onnx/csrc/offline-lm-config.h (1 hunks)
  • sherpa-onnx/csrc/offline-lm.cc (2 hunks)
  • sherpa-onnx/csrc/offline-lm.h (2 hunks)
  • sherpa-onnx/csrc/offline-rnn-lm.cc (1 hunks)
  • sherpa-onnx/csrc/online-lm-config.cc (3 hunks)
  • sherpa-onnx/csrc/online-lm-config.h (1 hunks)
  • sherpa-onnx/csrc/online-rnn-lm.cc (5 hunks)
  • sherpa-onnx/python/csrc/offline-lm-config.cc (1 hunks)
  • sherpa-onnx/python/csrc/online-lm-config.cc (1 hunks)
  • sherpa-onnx/python/sherpa_onnx/offline_recognizer.py (2 hunks)
  • sherpa-onnx/python/sherpa_onnx/online_recognizer.py (2 hunks)
🪛 Shellcheck (0.10.0)

.github/scripts/test-online-transducer.sh

  • [warning] SC2206 (lines 196-198): Quote to prevent word splitting/globbing, or split robustly with mapfile or read -a.
  • [error] SC2068 (lines 201, 215): Double quote array expansions to avoid re-splitting elements.
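For readers unfamiliar with these ShellCheck codes, a minimal bash illustration of the word-splitting hazard (SC2206) and its quoted fix; the variable names here are made up:

```shell
#!/usr/bin/env bash
# SC2206: unquoted expansion of a string into an array word-splits and globs.
item="two words"

arr=( $item )        # splits on whitespace: 2 elements
echo "${#arr[@]}"    # prints 2

arr2=( "$item" )     # quoted: kept as a single element
echo "${#arr2[@]}"   # prints 1

# SC2068 fix: always quote array expansions when iterating.
for w in "${arr[@]}"; do
  printf '<%s>\n' "$w"
done
```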

🔇 Additional comments (39)
sherpa-onnx/csrc/CMakeLists.txt (1)

28-28: LGTM! Clean addition of LODR FST source file.

The new lodr-fst.cc source file is correctly added to the build system in alphabetical order.

sherpa-onnx/python/sherpa_onnx/online_recognizer.py (1)

303-304: Ensure consistent default scale values.

The default lodr_scale=0.1 should be consistent with the C++ implementation and the offline recognizer.

This is the same potential consistency issue as in the offline recognizer - please verify the default values match across all implementations.

sherpa-onnx/csrc/offline-lm.h (1)

13-13: LGTM! Correct include for LODR FST functionality.

The include for lodr-fst.h is properly added to support the new LODR functionality.

.github/scripts/test-python.sh (1)

565-597: LGTM! Well-structured LODR test integration.

The new test section follows the established pattern in the file and properly exercises the LODR functionality. The use of Git LFS for downloading large model files is appropriate, and the cleanup is thorough.

sherpa-onnx/csrc/offline-lm-config.h (2)

22-24: LGTM! LODR configuration members properly added.

The new LODR members (lodr_fst and lodr_scale) are correctly defined with appropriate default values and follow the existing code patterns.


28-36: LGTM! Constructor properly updated for LODR parameters.

The constructor signature and initialization list are correctly updated to include the new LODR parameters. The initialization order matches the member definition order.

sherpa-onnx/csrc/offline-rnn-lm.cc (2)

85-86: LGTM! Proper base class initialization.

The addition of OfflineLM(config) to the member initializer list ensures the base class is properly initialized with the configuration that now includes LODR parameters.


88-90: LGTM! Template constructor properly updated.

The template constructor also correctly calls the base class constructor with the configuration parameter.

sherpa-onnx/csrc/online-lm-config.h (1)

34-40: LGTM! Constructor initialization properly structured.

The constructor initialization list correctly initializes all members in the proper order matching the member definition order.

sherpa-onnx/csrc/hypothesis.h (3)

15-15: LGTM! Appropriate header inclusion.

The <memory> header is correctly added to support the new std::shared_ptr member.


19-19: LGTM! Necessary header inclusion for LODR support.

The inclusion of lodr-fst.h is required for the LodrStateCost type used in the new member.


66-67: LGTM! LODR state member properly added.

The new lodr_state member is correctly defined as a std::shared_ptr<LodrStateCost> and properly default-initialized without explicit nullptr assignment, following the established pattern mentioned in past reviews.

sherpa-onnx/python/csrc/offline-lm-config.cc (2)

16-20: LGTM - Constructor signature correctly extended for LODR support.

The new lodr_fst and lodr_scale parameters are properly added to the constructor with appropriate default values.


25-26: LGTM - LODR parameters properly exposed as read/write attributes.

The new attributes are correctly exposed to Python with appropriate access patterns.

sherpa-onnx/csrc/offline-lm.cc (2)

20-20: LGTM - Proper header inclusion for LODR functionality.

The lodr-fst.h header is correctly included to enable LODR FST operations.


78-89: LGTM - LODR integration follows established patterns.

The implementation correctly:

  • Scales the LODR score by the LM scale, replicating Icefall's behavior
  • Guards the LODR path with a conditional check so decoding cannot crash when LODR is disabled
  • Calls ComputeScore with parameters matching the pattern in online-rnn-lm.cc
sherpa-onnx/csrc/offline-lm-config.cc (3)

21-22: LGTM - LODR options properly registered.

The new command-line options are correctly registered with appropriate descriptions.


31-34: LGTM - File existence validation addresses previous feedback.

The validation correctly checks that the LODR FST file exists when provided, addressing the past review comment requesting this validation.


44-46: LGTM - ToString() method updated consistently.

The string representation properly includes the new LODR parameters in a consistent format.

sherpa-onnx/python/csrc/online-lm-config.cc (2)

16-22: LGTM - Constructor signature correctly extended for LODR support.

The new parameters are properly added to the constructor with appropriate defaults.


28-30: LGTM - LODR parameters properly exposed as read/write attributes.

The new attributes are correctly exposed to Python with appropriate access patterns.

.github/scripts/test-offline-transducer.sh (3)

284-298: LGTM - Proper model downloading with Git LFS.

The implementation correctly:

  • Uses Git LFS to handle large model files efficiently
  • Selectively pulls only needed files to save bandwidth
  • Downloads both RNN LM and bigram FST models from appropriate repositories

302-314: LGTM - Comprehensive LODR testing with proper parameters.

The test execution correctly:

  • Uses modified_beam_search decoding method appropriate for LM rescoring
  • Includes all necessary LODR parameters (--lm, --lodr-fst, --lodr-scale)
  • Tests with multiple audio files to ensure robustness
  • Uses a negative scale value (-0.5), which is typical for LODR rescoring

316-316: LGTM - Proper cleanup of downloaded resources.

The cleanup correctly removes all downloaded repositories to prevent CI storage issues.

sherpa-onnx/csrc/online-lm-config.cc (3)

23-27: LGTM: LODR configuration options properly registered.

The new LODR configuration options are correctly integrated into the existing configuration system with appropriate parameter names and descriptions.


35-39: LGTM: File existence validation for LODR FST.

The validation logic properly checks that the LODR FST file exists when one is specified, following the same pattern as the existing LM model validation.


49-52: LGTM: ToString() method updated correctly.

The new LODR fields are properly formatted in the string representation. The use of double quotes for the FST path (string) and no quotes for numeric values is consistent with existing code style.

.github/scripts/test-online-transducer.sh (1)

177-192: LGTM: Good test coverage for LODR functionality.

The addition of LODR test coverage is valuable, testing both RNN LM and bigram FST integration. The use of Git LFS for selective model file download is appropriate.

python-api-examples/online-decode-files.py (3)

24-39: LGTM: Clear documentation example for LODR usage.

The new usage example effectively demonstrates how to use LODR with RNN LM rescoring, providing users with a concrete example of the command-line parameters.


205-219: LGTM: LODR arguments properly defined.

The new command-line arguments are well-documented with appropriate help text and default values. The constraint that the LODR FST is only used when an LM is given is clearly stated.


355-356: LGTM: LODR parameters correctly passed to recognizer.

The new LODR parameters are properly integrated into the transducer recognizer creation, following the established pattern for other optional parameters.

sherpa-onnx/csrc/online-rnn-lm.cc (5)

15-15: LGTM: Appropriate header inclusion.

The inclusion of lodr-fst.h header is necessary for the LODR functionality integration.


39-58: LGTM: Well-structured LODR integration in shallow fusion.

The LODR state initialization and score calculation in shallow fusion is well-implemented:

  • Proper conditional checks for LODR availability
  • Correct state management with unique_ptr
  • Appropriate score scaling and application
  • Clear separation of concerns

108-112: LGTM: Consistent LODR integration in rescoring.

The LODR score application in the rescoring method correctly:

  • Uses conditional checks for LODR availability
  • Applies proper scaling (LODR scale * LM scale)
  • Maintains consistency with the Icefall implementation

180-184: LGTM: Proper LODR FST initialization.

The LODR FST is correctly initialized only when the configuration specifies a non-empty FST path, using appropriate constructor parameters.


234-234: LGTM: Clean member variable addition.

The LODR FST member variable is appropriately declared as a unique_ptr, following modern C++ practices.

python-api-examples/offline-decode-files.py (3)

38-56: LGTM: Comprehensive LODR usage example.

The new documentation example clearly demonstrates how to use LODR with RNN LM rescoring in offline decoding, providing users with practical guidance.


292-322: LGTM: Consistent LODR argument definitions.

The LODR command-line arguments are properly defined with appropriate help text and default values, maintaining consistency with the online version.


419-422: LGTM: Proper LODR parameter integration.

The LODR parameters are correctly passed to the offline transducer recognizer, following the established pattern for optional parameters.


@csukuangfj csukuangfj left a comment


Thank you for your contribution!

@csukuangfj csukuangfj merged commit f096034 into k2-fsa:master Jul 9, 2025
12 of 229 checks passed