Conversation

yazanmashal03

Pull Request Template

Checklist

  • Confirmed that the cargo run-checks command has been executed.
  • Made sure the book is up to date with changes in this PR.

Related Issues/PRs

This PR addresses Issue #2649 and implements two new metrics for sequence evaluation in NLP and related fields: the character error rate (CER) and the word error rate (WER). They are the same error metric applied at the character and word levels, respectively.

Changes

The cer.rs file implements the Character Error Rate metric. CER measures the percentage of characters in the predicted sequence that are incorrect (insertions, deletions, or substitutions) relative to the reference sequence; the edit distance is computed with the Levenshtein algorithm using dynamic programming.
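As a sketch of the approach just described (a two-row Levenshtein DP, then edits over reference length as a percentage) — this is illustrative and assumes character-level tokens, not the PR's actual code:

```rust
/// Levenshtein edit distance via dynamic programming, keeping only
/// two rows of the DP table instead of the full matrix.
fn edit_distance(reference: &[char], prediction: &[char]) -> usize {
    let n = prediction.len();
    // prev[j] = distance between the empty reference prefix and prediction[..j]
    let mut prev: Vec<usize> = (0..=n).collect();
    let mut curr = vec![0usize; n + 1];
    for (i, &rc) in reference.iter().enumerate() {
        curr[0] = i + 1;
        for (j, &pc) in prediction.iter().enumerate() {
            let cost = if rc == pc { 0 } else { 1 };
            curr[j + 1] = (prev[j + 1] + 1) // deletion
                .min(curr[j] + 1)           // insertion
                .min(prev[j] + cost);       // substitution
        }
        std::mem::swap(&mut prev, &mut curr);
    }
    prev[n]
}

/// CER = edit distance / reference length, expressed as a percentage.
fn cer(reference: &str, prediction: &str) -> f64 {
    let r: Vec<char> = reference.chars().collect();
    let p: Vec<char> = prediction.chars().collect();
    100.0 * edit_distance(&r, &p) as f64 / r.len() as f64
}

fn main() {
    // "kitten" -> "sitting": 3 edits over 6 reference characters = 50.00%
    println!("{:.2}", cer("kitten", "sitting"));
}
```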

The wer.rs file implements the Word Error Rate metric. WER is the same idea as CER but operates at the word level, measuring the percentage of words that are incorrect in the predicted sequence.
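A sketch of WER under the same assumptions: the Levenshtein distance is computed over word tokens instead of characters (splitting on whitespace here is an assumption for illustration; the real metric operates on token id sequences):

```rust
/// Levenshtein edit distance, generic over any comparable token type,
/// using the two-row dynamic-programming optimization.
fn edit_distance<T: PartialEq>(reference: &[T], prediction: &[T]) -> usize {
    let n = prediction.len();
    let mut prev: Vec<usize> = (0..=n).collect();
    let mut curr = vec![0usize; n + 1];
    for (i, r) in reference.iter().enumerate() {
        curr[0] = i + 1;
        for (j, p) in prediction.iter().enumerate() {
            let cost = if r == p { 0 } else { 1 };
            curr[j + 1] = (prev[j + 1] + 1) // deletion
                .min(curr[j] + 1)           // insertion
                .min(prev[j] + cost);       // substitution
        }
        std::mem::swap(&mut prev, &mut curr);
    }
    prev[n]
}

/// WER = word-level edit distance / number of reference words, in percent.
fn wer(reference: &str, prediction: &str) -> f64 {
    let r: Vec<&str> = reference.split_whitespace().collect();
    let p: Vec<&str> = prediction.split_whitespace().collect();
    100.0 * edit_distance(&r, &p) as f64 / r.len() as f64
}

fn main() {
    // One substituted word out of four reference words = 25.00%
    println!("{:.2}", wer("the cat sat down", "the cat sits down"));
}
```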

Testing

Testing was done with unit tests, where each function (for example test_wer_without_padding, test_wer_with_padding) is a unit test for the WerMetric implementation; the same applies to CER.
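The with/without-padding test names suggest pad tokens are stripped before scoring. A hypothetical sketch of that piece and its unit tests — the PAD id, helper name, and test names here are invented for illustration, not taken from the PR:

```rust
// Hypothetical padding id; real code would take this from the tokenizer.
const PAD: i32 = -1;

/// Remove padding tokens from a target sequence before computing WER/CER.
fn strip_padding(seq: &[i32]) -> Vec<i32> {
    seq.iter().copied().filter(|&t| t != PAD).collect()
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_strip_padding_removes_pad_tokens() {
        assert_eq!(strip_padding(&[3, 7, PAD, PAD]), vec![3, 7]);
    }

    #[test]
    fn test_strip_padding_without_padding_is_identity() {
        assert_eq!(strip_padding(&[3, 7]), vec![3, 7]);
    }
}
```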

codecov bot commented Jul 28, 2025

Codecov Report

❌ Patch coverage is 99.09091% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.50%. Comparing base (38874eb) to head (d7a8bc0).
⚠️ Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
crates/burn-train/src/metric/cer.rs 99.09% 1 Missing ⚠️
crates/burn-train/src/metric/wer.rs 99.09% 1 Missing ⚠️

❌ Your project check has failed because the head coverage (63.50%) is below the target coverage (80.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3418      +/-   ##
==========================================
+ Coverage   63.43%   63.50%   +0.07%     
==========================================
  Files         981      983       +2     
  Lines      109705   109925     +220     
==========================================
+ Hits        69589    69807     +218     
- Misses      40116    40118       +2     

☔ View full report in Codecov by Sentry.

@laggui (Member) left a comment

Thanks for contributing these additional metrics!

I only have a couple of comments regarding the implementation.

/edit: ignore the failed test on CUDA CI, totally unrelated.

Comment on lines +118 to +122
self.state.update(
value,
batch_size,
FormatOptions::new(self.name()).unit("%").precision(2),
)
@laggui (Member):

I think the state might need to keep track of the errors and total characters (or words for WER)? Otherwise aggregation might be incorrect 🤔 this would require a new state type though

@yazanmashal03 (Author):

Hmm, that sounds correct. However, the value here already relates the errors to the total characters, since value = total_edit_distance / total_characters * 100, so why would we need to keep the total characters separately?

@laggui (Member):

For the current batch that's accurate, but when aggregated for an epoch it might be incorrect since this is a numeric state (not all batches have the same composition). Probably out of scope for this PR so no worries 👍
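The caveat can be seen numerically: averaging per-batch rates is not the same as pooling the raw error and total counts when batches differ in composition. The values below are invented purely for illustration:

```rust
/// Plain mean of per-batch error rates, as a numeric state would aggregate.
/// Each batch is (errors, total characters).
fn mean_of_rates(batches: &[(usize, usize)]) -> f64 {
    batches
        .iter()
        .map(|&(e, t)| 100.0 * e as f64 / t as f64)
        .sum::<f64>()
        / batches.len() as f64
}

/// Rate from pooled counts, as a dedicated error/total state would compute.
fn pooled_rate(batches: &[(usize, usize)]) -> f64 {
    let (e, t) = batches
        .iter()
        .fold((0, 0), |(e, t), &(be, bt)| (e + be, t + bt));
    100.0 * e as f64 / t as f64
}

fn main() {
    // Batch 1: 1 error over 10 chars (10%); batch 2: 30 errors over 100 (30%).
    let batches = [(1, 10), (30, 100)];
    // Mean of rates: (10% + 30%) / 2 = 20.00%
    // Pooled counts: 31 / 110 ≈ 28.18%
    println!("{:.2} vs {:.2}", mean_of_rates(&batches), pooled_rate(&batches));
}
```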

@laggui (Member) left a comment

Sorry for the late follow-up! Didn't see this was updated. Please explicitly re-request a review so I know when changes have been applied 🙂

LGTM, just minor formatting issues to fix.


/// deletions, or substitutions) required to change one sequence into the other. This
/// implementation is optimized for space, using only two rows of the dynamic programming table.
///
pub fn edit_distance(reference: &[i32], prediction: &[i32]) -> usize {
@laggui (Member):

We can mark it as pub(crate) only
