Skip to content

Conversation

@DajanaV
Copy link
Collaborator

@DajanaV DajanaV commented Nov 17, 2025

Mirrored from ggml-org/llama.cpp#17331

I was running some llama.cpp examples on a system with a German locale (de_DE) and noticed something odd - when llama-cli printed out the model metadata, all the float values had commas as decimal separators (like "0,000000") instead of periods. But when I ran llama-perplexity on the same model, it used periods normally.

After some digging, I found the issue was in the gguf_data_to_str() function in llama-impl.cpp. It was using std::to_string() to format floats, which respects the system's LC_NUMERIC locale setting. So depending on which tool you used and what locale it was running with, you'd get different formatting.

I've changed it to use std::ostringstream with std::locale::classic() instead, which always formats floats with a period as the decimal separator, regardless of the system locale. This should make the output consistent across all tools and locales.

Fixes #10613

The documentation stated that non-terminal symbols must be 'dashed lowercase
words' like 'move' or 'check-mate', but the actual parser supports much more
flexibility. Rule names can include:
- Uppercase letters (e.g., dataType, UPPER-CASE)
- Numbers (e.g., rule123)
- Both dashes and underscores

This was discovered when c.gbnf used 'dataType' with an uppercase letter,
which works correctly despite the documentation saying otherwise. Updated
the docs to accurately describe the parser's capabilities rather than
restrict them.

Fixes #7720

Signed-off-by: Samaresh Kumar Singh <[email protected]>
Previously, std::to_string was used to format float and double values,
which respects the LC_NUMERIC locale setting. This caused inconsistent
output where some tools would print '0,000000' (comma) while others
printed '0.000000' (period), depending on the system locale.

Now using std::ostringstream with std::locale::classic() to ensure
float values are always formatted with a period as the decimal separator,
regardless of the system locale settings.

Signed-off-by: Samaresh Kumar Singh <[email protected]>
@loci-agentic-ai
Copy link

Access the complete analysis in the LOCI Dashboard

Performance Analysis Summary: PR #242 - Locale-Independent Float Formatting

Overview

PR #242 implements locale-independent float formatting in GGUF metadata processing by replacing std::to_string() with std::ostringstream using std::locale::classic(). This change ensures consistent decimal separator formatting across different system locales but introduces performance overhead in the __iter_equals_val function.

Key Findings

Performance Impact:

  • Highest Change: __iter_equals_val function shows 232% Response Time increase (78 ns → 258 ns) and 309% Throughput increase (58 ns → 239 ns)
  • Absolute Impact: +180 ns Response Time, +181 ns Throughput - negligible in practical terms
  • Core Function Impact: No changes detected in critical inference functions (llama_decode, llama_encode, llama_tokenize)
  • Tokens per Second: No impact expected as the affected function is not in the inference pipeline

Power Consumption Analysis:

  • build.bin.libllama.so: +3.32% increase (1462 nJ → 1510 nJ)
  • build.bin.llama-cvector-generator: +3.31% increase (4213 nJ → 4352 nJ)
  • build.bin.llama-run: +1.89% increase (1647 nJ → 1678 nJ)
  • build.bin.llama-tts: -5.07% decrease (3318 nJ → 3150 nJ)

Technical Analysis:

  • Flame Graph: Shows 93% self-execution time in __iter_equals_val, indicating compiler optimization regression rather than algorithmic issues
  • CFG Comparison: Reveals basic block splitting in current version creating unnecessary branch overhead - entry logic split into two blocks versus single consolidated block in base version
  • Root Cause: Compiler optimization differences causing inefficient basic block organization

Code Review Findings:
The changes are technically sound, addressing legitimate internationalization requirements. The performance degradation stems from standard library template instantiation overhead, not the core logic changes. The affected function operates in metadata processing paths, not inference-critical operations.

Actionable Recommendations:

  • Review compiler optimization settings to ensure basic block merging is enabled
  • Verify identical compiler versions and flags between builds to prevent optimization regressions

The changes successfully resolve locale-dependent formatting issues with minimal performance impact on non-critical code paths.

@DajanaV DajanaV force-pushed the main branch 3 times, most recently from f333350 to 9c4623f Compare November 18, 2025 09:10
@loci-dev loci-dev force-pushed the main branch 23 times, most recently from 331588e to d2e6325 Compare November 24, 2025 11:08
@loci-dev loci-dev force-pushed the main branch 15 times, most recently from 4600128 to 865bc12 Compare November 26, 2025 11:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants