
LFX Phase 2: Parameter taxonomy and LLM prompt architecture#1766

Open
ishaan-arora-1 wants to merge 4 commits into riscv:main from ishaan-arora-1:lfx-phase2-prompt-design

Conversation

@ishaan-arora-1
Contributor

Summary

Design and implement the formal parameter classification taxonomy and LLM prompt architecture for extracting architectural parameters from the RISC-V specification. Builds on Phase 1 (#1765).

  • Formal taxonomy (taxonomy.md): 8 parameter classes with precise definitions, disambiguation rules, and a decision tree
  • System prompt (system_prompt.txt): ~940-token prompt defining role, task, condensed taxonomy, critical rules, and a strict JSON output schema
  • Few-shot examples (examples.json): 6 positive + 4 negative examples from real spec text covering all normative classification classes and key false-positive patterns
  • Prompt assembler (run_prompt.py): CLI tool with 3 modes — assemble, chunk, and estimate — for building context-aware prompts across different LLM models
  • Validation suite (validate_prompt.py): 175-check automated verification covering taxonomy completeness, example accuracy, schema consistency, assembly correctness, and chunking integrity

Key Design Decisions

  • Single-pass extraction + classification: preserves context for classification and avoids two-pass token cost
  • Mandatory reasoning field in output: reduces hallucinations and aids human review
  • skipped_non_parameters in output: forces the LLM to demonstrate understanding of classification boundaries
  • Section-boundary-aware chunking: prevents splitting mid-paragraph; configurable overlap for context continuity
  • Three-layer prompt (system + examples + chunk): each layer has a clear token budget; examples/param names can be toggled off for small-context models
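The three-layer design amounts to concatenating the layers with the optional ones toggleable. A minimal sketch (function and argument names are illustrative, not the actual run_prompt.py implementation):

```python
def assemble_prompt(system_prompt: str,
                    examples: str,
                    param_names: str,
                    spec_chunk: str,
                    include_examples: bool = True,
                    include_param_names: bool = True) -> str:
    """Concatenate prompt layers in order; the examples and param-name
    layers can be dropped to fit small-context models."""
    layers = [system_prompt]
    if include_examples:
        layers.append(examples)
    if include_param_names:
        layers.append(param_names)
    layers.append(spec_chunk)
    return "\n\n".join(layers)
```

The spec chunk always comes last so that the model reads the task definition and examples before the text it must classify.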

Parameter Classes

Counts are from Phase 1 classifications:

  • NORM_DIRECT (102): direct implementation choice, not CSR-controlled
  • NORM_CSR_RW (55): controls whether a CSR field is RO/RW
  • NORM_CSR_WARL (26): legal values of a WARL CSR field
  • SW_RULE (2): deterministic with correct software
  • NON_ISA: platform-level, outside ISA scope
  • NON_NORM: inside NOTE/TIP/WARNING blocks
  • DOC_RULE: documentation requirements
  • UNKNOWN: needs further analysis
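To illustrate the strict JSON output the system prompt demands, here is a hypothetical example. Only the field names reasoning and skipped_non_parameters are taken from this PR description; the remaining field names and values are assumed placeholders, not the actual schema in system_prompt.txt:

```python
import json

# Hypothetical illustration of the strict JSON output shape.
# Only "reasoning" and "skipped_non_parameters" are named in the PR;
# every other field name and value here is an assumed placeholder.
example_output = {
    "parameters": [
        {
            "name": "MXLEN",                  # hypothetical UDB param name
            "classification": "NORM_DIRECT",  # one of the 8 taxonomy classes
            "reasoning": "Direct implementation choice, not CSR-controlled.",
        }
    ],
    "skipped_non_parameters": [
        {
            "text": "NOTE: Implementations may choose ...",
            "classification": "NON_NORM",
            "reasoning": "Inside a NOTE block, so non-normative.",
        }
    ],
}

print(json.dumps(example_output, indent=2))
```

Requiring a reasoning string per entry, and an explicit list of skipped candidates, gives reviewers a direct audit trail for each decision.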

Token Budget

System prompt:           940 tokens
Few-shot examples:     1,691 tokens
UDB param names:       1,401 tokens
System overhead:         200 tokens
Reserved for output:   4,096 tokens
────────────────────────────────────
Fixed overhead:        4,232 tokens

Available for spec chunk:
  gpt-4o/gpt-4-turbo:   ~119K tokens
  claude-3.5-sonnet:     ~191K tokens
  gemini-1.5-pro:        ~991K tokens
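The arithmetic behind these budgets is a simple subtraction. The token counts below come from the table above; the context-window sizes are approximate published figures for each model, not values defined by this PR:

```python
# Fixed prompt overhead, in tokens (from the budget table above).
SYSTEM_PROMPT = 940
FEW_SHOT_EXAMPLES = 1691
UDB_PARAM_NAMES = 1401
SYSTEM_OVERHEAD = 200
OUTPUT_RESERVE = 4096  # tokens reserved for the model's JSON response

fixed_overhead = (SYSTEM_PROMPT + FEW_SHOT_EXAMPLES
                  + UDB_PARAM_NAMES + SYSTEM_OVERHEAD)  # 4,232

def available_for_chunk(context_window: int) -> int:
    """Tokens left for the spec chunk after prompt layers and output."""
    return context_window - fixed_overhead - OUTPUT_RESERVE

# Approximate context windows (publicly documented figures).
for model, window in {"gpt-4o": 128_000,
                      "claude-3.5-sonnet": 200_000,
                      "gemini-1.5-pro": 1_000_000}.items():
    print(f"{model}: ~{available_for_chunk(window) // 1000}K tokens")
```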

How to Run

# Estimate token budgets
python3 param_extraction/scripts/run_prompt.py estimate

# Chunk a spec file
python3 param_extraction/scripts/run_prompt.py chunk \
    ext/riscv-isa-manual/src/machine.adoc --max-tokens 40000

# Assemble a prompt for a specific chunk
python3 param_extraction/scripts/run_prompt.py assemble \
    ext/riscv-isa-manual/src/machine.adoc \
    --start-line 1209 --end-line 1270 --output-json

# Run validation suite
cd param_extraction/scripts && python3 validate_prompt.py

Test Plan

  • validate_prompt.py passes 175/175 checks (0 failures)
  • All 8 parameter classes defined consistently across taxonomy, system prompt, and examples
  • All 6 value types defined consistently across taxonomy, system prompt, and examples
  • Decision tree ordering in taxonomy matches system prompt ordering
  • All example UDB parameter names verified in ground_truth.json
  • All example classifications match Phase 1 classifications
  • All example line numbers verified against actual spec files
  • Example output schema fields match system prompt schema
  • All 74 spec files chunk successfully with no gaps
  • Chunking handles edge cases: empty files, no headers, very small chunk limits
  • Context overflow correctly raises ValueError for small-context models
  • Examples and param names are correctly omitted when disabled
  • No unused imports, no debug artifacts
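The cross-file consistency checks above can be sketched as follows. This is a hypothetical simplification; the real validate_prompt.py parses the actual deliverable files rather than taking pre-extracted sets:

```python
# The 8 classes every deliverable must agree on (from the taxonomy above).
EXPECTED_CLASSES = {
    "NORM_DIRECT", "NORM_CSR_RW", "NORM_CSR_WARL", "SW_RULE",
    "NON_ISA", "NON_NORM", "DOC_RULE", "UNKNOWN",
}

def check_class_consistency(classes_by_file: dict[str, set[str]]) -> list[str]:
    """Return one failure message per file whose class set deviates."""
    failures = []
    for filename, found in classes_by_file.items():
        missing = EXPECTED_CLASSES - found
        extra = found - EXPECTED_CLASSES
        if missing or extra:
            failures.append(
                f"{filename}: missing={sorted(missing)} extra={sorted(extra)}")
    return failures
```

The same pattern (expected set vs. set found in each file) extends naturally to the 6 value types and the output-schema field names.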

Closes #1748

…tion

Add scripts and data for cataloging all 185 UDB architectural parameters
with schema analysis, CSR cross-references, heuristic classifications,
and candidate spec text locations. This forms the foundation for
LLM-based parameter extraction from the RISC-V specification.

Scripts:
- export_udb_params.py: extracts parameters from YAML, derives value
  types, cross-references CSR IDL, classifies each parameter
- map_params_to_spec.py: searches 74 spec .adoc files for text related
  to each parameter using multi-strategy keyword matching
- generate_report.py: produces CSV catalog, text report, and param
  name list

Key results:
- 185 parameters cataloged (102 NORM_DIRECT, 55 NORM_CSR_RW,
  26 NORM_CSR_WARL, 2 SW_RULE)
- 81% high-confidence classifications
- 98% of parameters mapped to spec text candidates

Closes riscv#1747
Design and implement the formal parameter classification taxonomy and
prompt architecture for LLM-based extraction from RISC-V specifications.

Deliverables:
- taxonomy.md: formal definitions for 8 parameter classes (NORM_DIRECT,
  NORM_CSR_WARL, NORM_CSR_RW, SW_RULE, NON_ISA, NON_NORM, DOC_RULE,
  UNKNOWN) with disambiguation rules and a decision tree
- system_prompt.txt: ~940 token system prompt defining role, task,
  taxonomy, critical rules, and JSON output schema
- examples.json: 6 positive + 4 negative few-shot examples from real
  spec text covering all normative classes and key false-positive
  patterns (NOTE blocks, CSR behavior, fixed requirements, permission
  vs optionality "may")
- run_prompt.py: prompt assembler with 3 CLI modes (assemble, chunk,
  estimate) supporting context window management across models
- validate_prompt.py: 175-check validation suite for all deliverables

Key design decisions:
- Single-pass extraction + classification to preserve context
- Mandatory reasoning field in LLM output to reduce hallucinations
- Section-boundary-aware chunking with configurable overlap
- Three-layer prompt: system + examples + param names + spec chunk

Closes riscv#1748
@codecov

codecov Bot commented Mar 26, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 71.96%. Comparing base (de41e7b) to head (ab1a22b).
⚠️ Report is 35 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1766   +/-   ##
=======================================
  Coverage   71.96%   71.96%           
=======================================
  Files          54       54           
  Lines       27976    27976           
  Branches     6183     6183           
=======================================
  Hits        20132    20132           
  Misses       7844     7844           
Flag    Coverage    Δ
idlc    75.90%      <ø> (ø)
udb     65.84%      <ø> (ø)

Flags with carried forward coverage won't be shown.

☔ View full report in Codecov by Sentry.

- Add REUSE annotation for param_extraction/** in REUSE.toml
- Fix ruff lint errors: remove unused variables, prefix unused loop
  vars with underscore, remove extraneous f-string prefixes, remove
  unused imports, sort import blocks
- Apply ruff formatting to all Python scripts
- Make Python scripts executable to satisfy EXE001 shebang check
- Fix prettier formatting for ground_truth.json and spec_mappings.json
- Strip trailing whitespace from parameters_catalog.csv
- Add missing end-of-file newline to phase1_report.txt

Development

Successfully merging this pull request may close these issues.

LFX - Phase 2: Design Parameter Taxonomy & LLM Prompts
