Fix protobuf parser regex to handle scientific notation by willpartcl · Pull Request #87 · TILOS-AI-Institute/MacroPlacement

willpartcl · 2026-01-11T23:35:09Z

Problem:
The regex pattern on line 241 could not parse floating-point values with scientific notation (e.g., 1.42109e-16), causing failures when loading 15 out of 17 IBM (ICCAD04) benchmarks.

Root Cause:
The pattern \-*\w+\.\*\/{0,1}\w*[\w+\/{0,1}\w*]* splits on word boundaries, breaking scientific notation into separate tokens:

- Input: "f: 1.42109e-16"
- Parsed as: ['f', '1.42109e', '-16']
- Result: float('1.42109e') fails with ValueError
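To make the failure concrete, here is a minimal sketch of what happens when the split tokens reach float(). The token list is copied from the description above; `to_float` is a hypothetical helper used only for illustration and is not code from plc_client_os.

```python
# Tokens as reported above for the old pattern.
old_tokens = ['f', '1.42109e', '-16']

def to_float(token):
    """Hypothetical helper: convert a token, returning None if float() rejects it."""
    try:
        return float(token)
    except ValueError:
        return None

print(to_float(old_tokens[1]))   # None: a mantissa with a dangling 'e' is not a valid float
print(to_float(old_tokens[2]))   # -16.0: the exponent would be read as a separate value
print(to_float('1.42109e-16'))   # 1.42109e-16: the unsplit token converts cleanly
```

Note that the second token converts without error, so the only visible symptom is the ValueError on the truncated mantissa.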

Solution:
Updated the regex to match scientific notation explicitly first, then fall back to the other patterns:

- Pattern: r'[-+]?\d+\.?\d*[eE][-+]?\d+|[-]?\w+\.?[\w/]*'
- Now parses: ['f', '1.42109e-16']
- Correctly handles positive and negative exponents
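The behavior described above can be checked with a short sketch. The pattern is the one from this description; the names `TOKEN_RE` and `tokenize` are assumptions made for illustration, and the surrounding parser code in plc_client_os may apply the pattern differently.

```python
import re

# New pattern from this PR: the scientific-notation alternative comes first,
# so it wins before the generic token alternative can stop at the exponent sign.
TOKEN_RE = re.compile(r'[-+]?\d+\.?\d*[eE][-+]?\d+|[-]?\w+\.?[\w/]*')

def tokenize(line):
    """Hypothetical helper: return every token the pattern finds in one line."""
    return TOKEN_RE.findall(line)

for line in ("f: 1.42109e-16", "f: 5.68434e+10"):
    key, value = tokenize(line)
    print(key, float(value))
```

Because regex alternation is tried left to right, putting the more specific alternative first is what keeps the exponent attached to its mantissa.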

Testing:
Verified all 17 IBM benchmarks now parse successfully:

- ibm01-ibm18 (excluding ibm05) all load without errors
- Regex still correctly handles:
  - Regular floats (0.4, -0.4)
  - Integers (123)
  - Strings (TOP, BOTTOM)
  - Paths (foo/bar)
  - Scientific notation (1.42109e-16, 5.68434e+10)
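The token-level cases listed above can be expressed as a small table-driven check. This is only a sketch: `CASES` and `failing_cases` are hypothetical names, and it covers the listed token types only, not the loading of the IBM benchmark files themselves.

```python
import re

# Pattern from this PR.
TOKEN_RE = re.compile(r'[-+]?\d+\.?\d*[eE][-+]?\d+|[-]?\w+\.?[\w/]*')

# Each input should come back as exactly one unsplit token.
CASES = {
    "0.4": ["0.4"],
    "-0.4": ["-0.4"],
    "123": ["123"],
    "TOP": ["TOP"],
    "BOTTOM": ["BOTTOM"],
    "foo/bar": ["foo/bar"],
    "1.42109e-16": ["1.42109e-16"],
    "5.68434e+10": ["5.68434e+10"],
}

def failing_cases(cases):
    """Hypothetical helper: list the inputs whose tokens differ from the expectation."""
    return [text for text, expected in cases.items()
            if TOKEN_RE.findall(text) != expected]

print(failing_cases(CASES))  # [] when every case tokenizes as expected
```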

Impact:
This fix enables plc_client_os to parse the full ICCAD04 benchmark suite without requiring Circuit Training's proprietary parser.

Signed-off-by: willpartcl <will@partcl.com>

willpartcl force-pushed the fix-scientific-notation-parsing branch from f5bb952 to 45a721d on January 11, 2026 23:36

willpartcl wants to merge 1 commit into TILOS-AI-Institute:main from partcleda:fix-scientific-notation-parsing
