
Fix qmd writer for Str-leading apostrophe across inline boundaries (#205) #210

Merged
cscheid merged 2 commits into main from bugfix/issue-205
May 15, 2026

@cscheid cscheid commented May 15, 2026

Closes #205. Follow-up to #201.

Summary

  • The qmd writer's escape_markdown was per-Str and decided whether to emit \' from the in-Str prev_char/next_char alone. When ' sat at index 0 of a Str, prev_char was None and the escape was dropped — even when the preceding inline (e.g. a Code span's closing backtick, an Emph closer, or just block-start) left a non-alphanumeric byte in the output stream. The reader's smart-quote classifier then rejected the regenerated qmd with Q-2-7.
  • QmdWriterContext now carries prev_emitted_alnum. write_inline maintains it: before dispatch, every non-Str/non-Custom inline clears it (their openers are non-alphanumeric and reset the byte-stream context for any nested content); after dispatch, it is refreshed from the inline's closing byte. write_block clears it at every block boundary. write_str forwards it to escape_markdown as the start_prev_is_alnum hint, which is consulted when in-Str prev_char is None.
  • Generalizes the escape rule from "prev alnum AND next non-alnum" to "NOT (prev alnum AND next alnum)" — the reader treats a bare ' as an apostrophe only between two alphanumeric characters, and the previous narrower form left adjacent cases ('.foo, lone ') un-escaped and lossy.
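
The escape rule described above can be sketched as a small standalone function. This is a simplified illustration, not the actual pampa code: `escape_apostrophes` is a hypothetical name, it handles only the apostrophe case, it assumes "alphanumeric" means `char::is_alphanumeric`, and it assumes an apostrophe at the end of a Str has a non-alphanumeric successor.

```rust
/// Hypothetical, simplified stand-in for the apostrophe part of the
/// writer's `escape_markdown`. `start_prev_is_alnum` plays the role of
/// the hint described in the summary: it is consulted only when the
/// apostrophe sits at index 0 of the Str and has no in-Str predecessor.
fn escape_apostrophes(text: &str, start_prev_is_alnum: bool) -> String {
    let chars: Vec<char> = text.chars().collect();
    let mut out = String::with_capacity(text.len() + 4);
    for (i, &c) in chars.iter().enumerate() {
        if c == '\'' {
            // Previous character: in-Str neighbour if present, else the hint.
            let prev_alnum = if i == 0 {
                start_prev_is_alnum
            } else {
                chars[i - 1].is_alphanumeric()
            };
            // Assumption: end of the Str counts as a non-alphanumeric successor.
            let next_alnum = chars.get(i + 1).map_or(false, |n| n.is_alphanumeric());
            // Generalized rule: escape unless BOTH neighbours are alphanumeric.
            if !(prev_alnum && next_alnum) {
                out.push('\\');
            }
        }
        out.push(c);
    }
    out
}
```

Under these assumptions, `escape_apostrophes("don't", false)` leaves the text unchanged, while `escape_apostrophes("'s", false)` produces `\'s` and `escape_apostrophes("'s", true)` leaves `'s` as is, which is the distinction the `start_prev_is_alnum` hint exists to make.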

Triage record

claude-notes/issue-reports/205/triage.md (committed in 7228ea7) — repros, localization, root-cause analysis, fix scope, resolved open questions.

Test plan

  • Add failing fixture tests/roundtrip_tests/qmd-json-qmd/apostrophe_after_code_inline.qmd (`x`\'s end) and confirm test_qmd_roundtrip_consistency fails with Q-2-7 on the regenerated qmd — done before any code change.
  • Add generalization fixtures: apostrophe_after_emph.qmd (*hi*\'s end) and apostrophe_at_block_start.qmd (\'sup end).
  • Implement the fix; confirm test_qmd_roundtrip_consistency passes.
  • cargo nextest run --workspace: 8942 / 8942 passing.
  • cargo xtask verify --skip-hub-build --skip-hub-tests — all steps green. (Rust-only change: pampa writer; no quarto-core / quarto-pandoc-types touchpoints, so the WASM leg is unaffected.)
  • End-to-end sanity checks via ./target/debug/pampa -f json -t qmd on JSON ASTs for each generalization (Code/Emph/Image preceding, block-start, nested-inside-Emph) confirm that the ' is escaped only when needed, and the round trip pampa -f json -t qmd | pampa parses cleanly. The word don't (alphanumeric on both sides) is still emitted unescaped, so there is no regression.
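
The cases exercised by this test plan can be modelled with a toy writer. The sketch below is not pampa's implementation: the `Inline` enum, `Ctx`, `escape_str`, and `write_paragraph` are hypothetical, reduced stand-ins for the real AST, `QmdWriterContext`, `escape_markdown`, and `write_block`, and only the apostrophe rule is implemented.

```rust
/// Toy inline AST (hypothetical; far smaller than the real one).
enum Inline {
    Str(String),
    Space,
    Code(String),
    Emph(Vec<Inline>),
}

/// Stand-in for the writer context: only the flag discussed in this PR.
struct Ctx {
    prev_emitted_alnum: bool,
}

/// Apostrophe-only escape: emit `\'` unless both neighbours are alphanumeric.
fn escape_str(text: &str, start_prev_is_alnum: bool) -> String {
    let chars: Vec<char> = text.chars().collect();
    let mut out = String::new();
    for (i, &c) in chars.iter().enumerate() {
        if c == '\'' {
            let prev = if i == 0 { start_prev_is_alnum } else { chars[i - 1].is_alphanumeric() };
            let next = chars.get(i + 1).map_or(false, |n| n.is_alphanumeric());
            if !(prev && next) {
                out.push('\\');
            }
        }
        out.push(c);
    }
    out
}

fn write_inline(inline: &Inline, ctx: &mut Ctx, out: &mut String) {
    // Before dispatch: every non-Str inline opens with a non-alphanumeric byte.
    if !matches!(inline, Inline::Str(_)) {
        ctx.prev_emitted_alnum = false;
    }
    match inline {
        Inline::Str(text) => {
            out.push_str(&escape_str(text, ctx.prev_emitted_alnum));
            // After dispatch: refresh from the last emitted char.
            // Choice made for this sketch: an empty Str leaves the flag untouched.
            if let Some(last) = text.chars().last() {
                ctx.prev_emitted_alnum = last.is_alphanumeric();
            }
        }
        Inline::Space => out.push(' '),
        Inline::Code(code) => {
            // Simplification: code containing backticks is not handled.
            out.push('`');
            out.push_str(code);
            out.push('`');
        }
        Inline::Emph(children) => {
            out.push('*');
            for child in children {
                write_inline(child, ctx, out);
            }
            out.push('*');
        }
    }
    // After dispatch: non-Str inlines close with a non-alphanumeric byte.
    if !matches!(inline, Inline::Str(_)) {
        ctx.prev_emitted_alnum = false;
    }
}

/// Stand-in for a block writer: the flag is cleared at the block boundary.
fn write_paragraph(inlines: &[Inline]) -> String {
    let mut ctx = Ctx { prev_emitted_alnum: false };
    let mut out = String::new();
    for inline in inlines {
        write_inline(inline, &mut ctx, &mut out);
    }
    out
}
```

In this model, a `Code("x")` followed by `Str("'s")` is written as `` `x`\'s ``, matching the apostrophe_after_code_inline.qmd fixture, while two adjacent Strs `don` and `'t` are written as `don't` with no escape because the flag carries the alphanumeric `n` across the Str boundary.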

🤖 Generated with Claude Code

cscheid and others added 2 commits May 15, 2026 13:55
…bd-nsb9)

Follow-up to #201 / bd-8lcm. escape_markdown decides whether to emit
\' from purely intra-Str prev_char/next_char. When ' sits at index 0
of a Str, prev_char is None, the escape is dropped, and the reader's
smart-quote classifier — which keys off the surrounding byte stream,
not Str boundaries — emits Q-2-7 on the regenerated qmd.

Triage records the trigger boundary (previous emitted byte is
non-alphanumeric, e.g. closing backtick of a Code span, *, ), or
block start), generalizes the bug beyond the Code-specific shape in
the issue body, identifies a parseable repro ( `x`\'s end ) that
exercises the existing qmd-json-qmd roundtrip harness, and hands off
a TDD-first fix scope to bd-nsb9.

Includes minimal repros (parseable qmd + JSON AST).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…d-nsb9, issue #205)

Follow-up to #201. `escape_markdown` previously decided whether to
emit `\'` from purely intra-`Str` context — when `'` sat at index 0
of a `Str`, `prev_char` was `None` and the escape was dropped, even
when the preceding inline (e.g. a `Code` span's closing backtick)
left a non-alphanumeric byte in the output stream. The reader's
smart-quote classifier rejected the regenerated qmd with Q-2-7.

`QmdWriterContext` now carries `prev_emitted_alnum`. `write_inline`
maintains it: before dispatch, every non-`Str`/non-`Custom` inline
clears it (their openers are non-alphanumeric and reset the
byte-stream context for any nested content); after dispatch, it is
refreshed from the inline's closing byte (last char of a `Str`'s
text for `Str`, preserved for `Custom`, `false` otherwise).
`write_block` clears it at every block boundary. `write_str`
forwards it to `escape_markdown` as the `start_prev_is_alnum` hint,
which is consulted when in-`Str` `prev_char` is `None`.

While here, generalize the escape rule from "prev alnum AND next
non-alnum" to "NOT (prev alnum AND next alnum)" — the reader treats
a bare `'` as an apostrophe only between two alphanumeric
characters, and the previous narrower form left adjacent cases
(e.g. `'.foo`, lone `'`) un-escaped and lossy.

Adds three roundtrip fixtures locking in the fix:
  apostrophe_after_code_inline.qmd  — `x`\'s end
  apostrophe_after_emph.qmd         — *hi*\'s end
  apostrophe_at_block_start.qmd     — \'sup end

Triage: claude-notes/issue-reports/205/triage.md
Beads:  bd-nsb9

End-to-end verified: each fixture parses, round-trips through the
qmd writer, and re-parses to the same AST. The `qmd-json-qmd` suite
(test_qmd_roundtrip_consistency) passes; cargo xtask verify
--skip-hub-build --skip-hub-tests passes; cargo nextest run
--workspace passes (8942/8942).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@cscheid cscheid merged commit 322383f into main May 15, 2026
4 checks passed
@cscheid cscheid deleted the bugfix/issue-205 branch May 15, 2026 21:35

Development

Successfully merging this pull request may close these issues.

Writer does not escape apostrophe at the start of a Str whose preceding inline is a Code span
