Fix qmd writer for Str-leading apostrophe across inline boundaries (#205) #210
Merged
…bd-nsb9)

Follow-up to #201 / bd-8lcm. escape_markdown decides whether to emit \' from purely intra-Str prev_char/next_char. When ' sits at index 0 of a Str, prev_char is None, the escape is dropped, and the reader's smart-quote classifier (which keys off the surrounding byte stream, not Str boundaries) emits Q-2-7 on the regenerated qmd.

Triage records the trigger boundary (previous emitted byte is non-alphanumeric, e.g. closing backtick of a Code span, *, ), or block start), generalizes the bug beyond the Code-specific shape in the issue body, identifies a parseable repro ( \`x\`\'s end ) that exercises the existing qmd-json-qmd roundtrip harness, and hands off a TDD-first fix scope to bd-nsb9. Includes minimal repros (parseable qmd + JSON AST).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…d-nsb9, issue #205)

Follow-up to #201. `escape_markdown` previously decided whether to emit `\'` from purely intra-`Str` context: when `'` sat at index 0 of a `Str`, `prev_char` was `None` and the escape was dropped, even when the preceding inline (e.g. a `Code` span's closing backtick) left a non-alphanumeric byte in the output stream. The reader's smart-quote classifier rejected the regenerated qmd with Q-2-7.

`QmdWriterContext` now carries `prev_emitted_alnum`. `write_inline` maintains it: before dispatch, every non-`Str`/non-`Custom` inline clears it (their openers are non-alphanumeric and reset the byte-stream context for any nested content); after dispatch, it is refreshed from the inline's closing byte (last char of a `Str`'s text for `Str`, preserved for `Custom`, `false` otherwise). `write_block` clears it at every block boundary. `write_str` forwards it to `escape_markdown` as the `start_prev_is_alnum` hint, which is consulted when in-`Str` `prev_char` is `None`.

While here, generalize the escape rule from "prev alnum AND next non-alnum" to "NOT (prev alnum AND next alnum)": the reader treats a bare `'` as an apostrophe only between two alphanumeric characters, and the previous narrower form left adjacent cases (e.g. `'.foo`, lone `'`) un-escaped and lossy.

Adds three roundtrip fixtures locking in the fix:

- apostrophe_after_code_inline.qmd: `x`\'s end
- apostrophe_after_emph.qmd: *hi*\'s end
- apostrophe_at_block_start.qmd: \'sup end

Triage: claude-notes/issue-reports/205/triage.md
Beads: bd-nsb9

End-to-end verified: each fixture parses, round-trips through the qmd writer, and re-parses to the same AST. The `qmd-json-qmd` suite (test_qmd_roundtrip_consistency) passes; cargo xtask verify --skip-hub-build --skip-hub-tests passes; cargo nextest run --workspace passes (8942/8942).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
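The generalized rule in the commit message above can be illustrated with a minimal sketch. This is not the actual `escape_markdown` from the `pampa` writer, whose signature and other escaping branches are not shown here. The function name `escape_apostrophes` is hypothetical, only the apostrophe case is handled, and two details are assumptions: "alphanumeric" is taken to mean Rust's `char::is_alphanumeric`, and a missing next character is treated as non-alphanumeric.

```rust
/// Hypothetical, simplified stand-in for the apostrophe branch of
/// `escape_markdown`. `start_prev_is_alnum` plays the role of the hint
/// described above: it is only consulted when the apostrophe sits at
/// index 0 and there is no in-`Str` previous character.
pub fn escape_apostrophes(text: &str, start_prev_is_alnum: bool) -> String {
    let chars: Vec<char> = text.chars().collect();
    let mut out = String::with_capacity(text.len() + 2);
    for (i, &c) in chars.iter().enumerate() {
        if c == '\'' {
            // No in-`Str` predecessor at index 0: fall back to the hint.
            let prev_alnum = if i == 0 {
                start_prev_is_alnum
            } else {
                chars[i - 1].is_alphanumeric()
            };
            // Assumption: end of text counts as a non-alphanumeric neighbour.
            let next_alnum = chars.get(i + 1).map_or(false, |n| n.is_alphanumeric());
            // Generalized rule: escape unless BOTH neighbours are alphanumeric.
            if !(prev_alnum && next_alnum) {
                out.push('\\');
            }
        }
        out.push(c);
    }
    out
}
```

Under these assumptions, `escape_apostrophes("'s", false)` yields `\'s`, while `escape_apostrophes("don't", false)` returns the text unchanged.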
Closes #205. Follow-up to #201.
Summary
- `escape_markdown` was per-`Str` and decided whether to emit `\'` from the in-`Str` `prev_char`/`next_char` alone. When `'` sat at index 0 of a `Str`, `prev_char` was `None` and the escape was dropped, even when the preceding inline (e.g. a `Code` span's closing backtick, an `Emph` closer, or just block-start) left a non-alphanumeric byte in the output stream. The reader's smart-quote classifier then rejected the regenerated qmd with Q-2-7.
- `QmdWriterContext` now carries `prev_emitted_alnum`. `write_inline` maintains it: before dispatch, every non-`Str`/non-`Custom` inline clears it (their openers are non-alphanumeric and reset the byte-stream context for any nested content); after dispatch, it is refreshed from the inline's closing byte. `write_block` clears it at every block boundary. `write_str` forwards it to `escape_markdown` as the `start_prev_is_alnum` hint, which is consulted when in-`Str` `prev_char` is `None`.
- The escape rule is generalized from "prev alnum AND next non-alnum" to "NOT (prev alnum AND next alnum)": the reader treats a bare `'` as an apostrophe only between two alphanumeric characters, and the previous narrower form left adjacent cases (`'.foo`, lone `'`) un-escaped and lossy.

Triage record
`claude-notes/issue-reports/205/triage.md` (committed in 7228ea7): repros, localization, root-cause analysis, fix scope, resolved open questions.

Test plan
- Add `tests/roundtrip_tests/qmd-json-qmd/apostrophe_after_code_inline.qmd` (`` `x`\'s end ``) and confirm `test_qmd_roundtrip_consistency` fails with Q-2-7 on the regenerated qmd (done before any code change).
- Add `apostrophe_after_emph.qmd` (`*hi*\'s end`) and `apostrophe_at_block_start.qmd` (`\'sup end`).
- `test_qmd_roundtrip_consistency` passes.
- `cargo nextest run --workspace`: 8942 / 8942 passing.
- `cargo xtask verify --skip-hub-build --skip-hub-tests`: all steps green. (Rust-only change: `pampa` writer; no `quarto-core` / `quarto-pandoc-types` touchpoints, so the WASM leg is unaffected.)
- Run `./target/debug/pampa -f json -t qmd` on JSON ASTs for each generalization (`Code` / `Emph` / `Image` preceding, block-start, nested-inside-`Emph`) and confirm the `'` is escaped only when needed; the round-trip `pampa -f json -t qmd | pampa` parses cleanly.
- `don't` (alnum both sides) is still emitted unescaped, no regression.

🤖 Generated with Claude Code
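The three fixtures and the `don't` regression check in the test plan can be modeled with a toy writer. The sketch below is an illustration only, not the `pampa` implementation: the `Inline` enum, `Ctx`, and `write_paragraph` are hypothetical miniatures, `Custom` and most other inline kinds are omitted, and leaving the flag untouched after an empty `Str` is an assumption. It shows how a `prev_emitted_alnum` flag carried across inline boundaries feeds the apostrophe decision at index 0 of a `Str`.

```rust
/// Hypothetical miniature inline AST; the real writer has many more variants.
pub enum Inline {
    Str(String),
    Space,
    Code(String),
    Emph(Vec<Inline>),
}

/// Mirrors the role described for `QmdWriterContext::prev_emitted_alnum`.
pub struct Ctx {
    prev_emitted_alnum: bool,
}

fn write_str(text: &str, ctx: &Ctx, out: &mut String) {
    let chars: Vec<char> = text.chars().collect();
    for (i, &c) in chars.iter().enumerate() {
        if c == '\'' {
            // At index 0 the cross-inline flag stands in for the missing prev_char.
            let prev = if i == 0 {
                ctx.prev_emitted_alnum
            } else {
                chars[i - 1].is_alphanumeric()
            };
            let next = chars.get(i + 1).map_or(false, |n| n.is_alphanumeric());
            if !(prev && next) {
                out.push('\\');
            }
        }
        out.push(c);
    }
}

fn write_inline(inline: &Inline, ctx: &mut Ctx, out: &mut String) {
    // Before dispatch: every non-`Str` inline opens with a non-alphanumeric byte.
    if !matches!(inline, Inline::Str(_)) {
        ctx.prev_emitted_alnum = false;
    }
    match inline {
        Inline::Str(text) => {
            write_str(text, ctx, out);
            // After dispatch: refresh from the last char of the text.
            // Assumption: an empty `Str` leaves the flag as it was.
            if let Some(last) = text.chars().last() {
                ctx.prev_emitted_alnum = last.is_alphanumeric();
            }
        }
        Inline::Space => {
            out.push(' ');
            ctx.prev_emitted_alnum = false;
        }
        Inline::Code(code) => {
            out.push('`');
            out.push_str(code);
            out.push('`');
            ctx.prev_emitted_alnum = false;
        }
        Inline::Emph(children) => {
            out.push('*');
            for child in children {
                write_inline(child, ctx, out);
            }
            out.push('*');
            ctx.prev_emitted_alnum = false;
        }
    }
}

/// Block boundary: the flag starts cleared for every paragraph.
pub fn write_paragraph(inlines: &[Inline]) -> String {
    let mut ctx = Ctx { prev_emitted_alnum: false };
    let mut out = String::new();
    for inline in inlines {
        write_inline(inline, &mut ctx, &mut out);
    }
    out
}
```

In this model a `Code("x")` followed by `Str("'s")` is written as `` `x`\'s ``, whereas `Str("don")` followed by `Str("'t")` stays `don't`, because the flag records that the previous emitted character was alphanumeric.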