|
| 1 | +# Math-mode (MathJax / KaTeX / …) implementation — handoff from bd-4eyf |
| 2 | + |
| 3 | +**Status:** Not started. Notes for the next session. |
| 4 | +**Predecessor work:** bd-4eyf (Bootstrap JS injection) — see |
| 5 | +`claude-notes/plans/2026-05-04-bootstrap-js-injection.md`. |
| 6 | + |
| 7 | +This document captures findings from the bd-4eyf session that are |
| 8 | +directly load-bearing for math-mode work. Read this first; it will |
| 9 | +let you skip the learning loop bd-4eyf went through. |
| 10 | + |
| 11 | +## TL;DR — what changes vs. the Bootstrap JS approach |
| 12 | + |
| 13 | +Quarto 1 delegates math-mode injection entirely to Pandoc by setting |
| 14 | +`html-math-method: mathjax` (or `katex`, `webtex`, …) in pandoc |
| 15 | +options. q2 cannot reuse that mechanism because **q2's HTML pipeline |
| 16 | +does not invoke Pandoc** — we render HTML ourselves. So q2 needs its |
| 17 | +own injection path. |
| 18 | + |
| 19 | +The bd-4eyf "predicate → register `js:*` artifact" pattern *almost* |
| 20 | +fits, but math has two complications Bootstrap didn't have: |
| 21 | + |
| 22 | +1. **An inline configuration block** is required (e.g. `<script>window.MathJax = { ... }</script>`) before the loader script. The artifact pipeline only emits external `<script src="…">` tags — it has no path for inline content. |
| 23 | +2. **A trigger that depends on document content**, not just metadata. "Does this document contain math?" requires an AST walk; bd-4eyf's predicate (`is_minimal_html` + `theme_config.suppress_bootstrap`) is metadata-only and runs in microseconds. |
| 24 | + |
| 25 | +These two together are why bd-4eyf deferred the generic `JsFeature` |
| 26 | +abstraction — math is the case where it might actually pay for itself. |
| 27 | + |
| 28 | +## Architectural pieces you'll touch |
| 29 | + |
| 30 | +Read these in roughly this order before designing: |
| 31 | + |
| 32 | +- **`crates/quarto-core/src/stage/stages/bootstrap_js.rs`** — the prototype to copy. The "predicate → `Project`-scoped `js:` artifact" pattern is documented in its module-level doc comment. |
| 33 | +- **`crates/quarto-core/src/stage/stages/apply_template.rs:166-167, 313`** — `collect_artifact_urls` is what turns `js:*` artifacts into `<script src="…">` tags. **It only handles external scripts** (artifacts with a `path`); inline content has no slot here. This is the design pinch-point for math. |
| 34 | +- **`crates/quarto-core/src/stage/stages/include_resolve.rs`** + the `rendered.includes.{header, before-body, after-body}` contract — *this* is how raw HTML (including inline `<script>` blocks) currently reaches the rendered template. The MathJax config script may need to ride this rail rather than the artifact rail. |
| 35 | +- **`crates/quarto-core/src/pipeline.rs`** — `build_html_pipeline_stages_with_options()` (native) and `build_wasm_html_pipeline()` (hub-client). The `#[cfg(not(target_arch = "wasm32"))]` gate pattern bd-4eyf established for omitting native-only stages from WASM is the cleanest precedent. Decide upfront whether math should ship to hub-client (probably yes — math display does not have the iframe-reinit-stateful-component problem Bootstrap does). |
| 36 | +- **`crates/quarto-core/src/format.rs:278`** — `is_minimal_html` predicate. **Important gotcha** documented in `bootstrap_js.rs`: this reads root-level `theme:` only; format-nested `format.html.theme: none` is *not* flattened to root by `MetadataMergeStage`. Use `quarto_sass::ThemeConfig::from_config_value(&doc.ast.meta).suppress_bootstrap` for the canonical "Bootstrap is in use" check, or *combine* both predicates as bd-4eyf did. Math probably wants its own predicate, but if it ever depends on the theme decision, use the same combined approach. |
| 37 | +- **`resources/scss/README.md`** + `resources/js/README.md` — the vendoring conventions. If you vendor MathJax, mirror the layout under `resources/js/mathjax/` and document the source URL + version + bump policy. |
| 38 | + |
| 39 | +## The trigger question (this is the hard one) |
| 40 | + |
| 41 | +Bootstrap JS triggers on metadata: "is a Bootstrap-backed theme |
| 42 | +active?" — checkable in O(1) on the format/document meta. |
| 43 | + |
| 44 | +Math triggers on **content**: "does this document contain at least one |
| 45 | +`Math` element in the AST?" — requires a walk. Options: |
| 46 | + |
| 47 | +1. **Walk in a dedicated stage** that runs late enough to see the final AST (after engines / sugaring / transforms — those can introduce math via crossref equations). This is the pure approach but adds an O(N) AST walk to every render. |
| 48 | +2. **Piggyback on an existing walker.** `RenderHtmlBodyStage` already traverses every node to emit HTML. Set a flag on `StageContext` when it sees a `Math` element. Then a tiny stage between `RenderHtmlBodyStage` and `ApplyTemplateStage` reads the flag and registers artifacts. Cheaper. Tighter coupling — the renderer becomes responsible for a side-channel signal. |
| 49 | +3. **Use the document profile.** `DocumentProfileStage` already snapshots the AST at the checkpoint; `EquationLabelTransform` already counts equation labels (`crates/quarto-core/src/transforms/`). If `profile.has_math` (new field) gets set during profiling, math injection becomes a metadata-style predicate again. Cleanest if the profile contract can be extended cheaply. |
| 50 | + |
| 51 | +Recommendation: option 3 if `DocumentProfile` already sees post-sugar |
| 52 | +crossref equations; otherwise option 2. Option 1 is the most |
| 53 | +expensive and probably unnecessary. |
| 54 | + |
| 55 | +**Don't forget:** `EquationLabelTransform` introduces `Math` blocks |
| 56 | +*from* `$$ … $$ {#eq-…}` syntax during sugaring. The trigger walk must |
| 57 | +run *after* this transform, or it will miss labelled equations. |
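The content trigger can be sketched as below. This is a minimal illustration on a toy AST: the `Node` enum and `contains_math` are hypothetical stand-ins, not q2's real AST types or API. It only shows the shape of the check, which stops at the first `Math` node and does not treat dollar signs inside code as math.

```rust
/// Hypothetical, simplified AST; q2's real node types will differ.
#[derive(Debug)]
pub enum Node {
    Text(String),
    Code(String),
    Math(String),
    /// Any node with children (paragraph, callout, div, ...).
    Container(Vec<Node>),
}

/// Returns true as soon as any `Math` node is found.
/// Depth-first; `Iterator::any` short-circuits on the first hit.
pub fn contains_math(nodes: &[Node]) -> bool {
    nodes.iter().any(|node| match node {
        Node::Math(_) => true,
        Node::Container(children) => contains_math(children),
        // Code content is opaque: `$x$` inside code is not math.
        Node::Text(_) | Node::Code(_) => false,
    })
}
```

Whichever option is chosen, the same predicate applies; the options differ only in where it runs and whether the result is cached on the profile or the stage context.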
| 58 | + |
| 59 | +## The inline config-script question |
| 60 | + |
| 61 | +MathJax wants something like: |
| 62 | + |
| 63 | +```html |
| 64 | +<script> |
| 65 | +window.MathJax = { |
| 66 | + tex: { inlineMath: [['$', '$'], ['\\(', '\\)']] }, |
| 67 | + // … |
| 68 | +}; |
| 69 | +</script> |
| 70 | +<script src="…/mathjax.js" defer></script> |
| 71 | +``` |
| 72 | + |
| 73 | +The artifact pipeline can emit the external `<script>` (`js:mathjax` |
| 74 | +artifact, same shape as `js:bootstrap`) but **not the inline config |
| 75 | +block**. Three plausible paths: |
| 76 | + |
| 77 | +1. **Use `rendered.includes.header`** for the inline block — same rail `IncludeResolveStage` uses for raw HTML. The math stage would push a string into `meta.rendered.includes.header` and the artifact handles only the external script. Two halves, same destination (`<head>`), but they're decoupled in the code. |
| 78 | +2. **Extend the artifact API** with an "inline" variant — `Artifact::inline_script(content)` that emits a `<script>…</script>` directly rather than `<script src="…">`. Touches the `collect_artifact_urls` contract. Probably the most invasive option. |
| 79 | +3. **Bake the config into the loader.** Vendor a small wrapper script (`mathjax-init.js`) that sets `window.MathJax = {...}` and then injects the real loader, e.g. by appending a `<script src="mathjax.js">` element (preferable to `document.write`, which browsers discourage). Single artifact, no template-side change, but harder to make the config user-controllable. |
| 80 | + |
| 81 | +Recommendation: option 1 (`rendered.includes.header` for inline, |
| 82 | +`js:mathjax` artifact for external). It reuses two existing rails |
| 83 | +without inventing a third. The "two halves" criticism is worth |
| 84 | +~5 lines of doc, not a refactor. |
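A rough sketch of what option 1 could look like on the inline half. `HeaderIncludes`, `mathjax_config_script`, and `push_math_config` are hypothetical names invented for illustration; the real `rendered.includes.header` slot lives in document metadata and will have a different type. The config body only repeats the `inlineMath` delimiters from the HTML example above.

```rust
/// Hypothetical stand-in for the `rendered.includes.header` slot.
#[derive(Default, Debug)]
pub struct HeaderIncludes {
    pub fragments: Vec<String>,
}

/// Builds the inline MathJax config block as a raw HTML string.
pub fn mathjax_config_script() -> String {
    // Raw string literal so the JS escapes (`\\(`) are emitted unchanged.
    let config = r#"window.MathJax = {
  tex: { inlineMath: [['$', '$'], ['\\(', '\\)']] }
};"#;
    format!("<script>\n{config}\n</script>")
}

/// Pushes the inline block into the header includes.
/// Idempotent: calling it twice does not duplicate the block.
pub fn push_math_config(includes: &mut HeaderIncludes) {
    let script = mathjax_config_script();
    if !includes.fragments.contains(&script) {
        includes.fragments.push(script);
    }
}
```

The external loader would still be registered as a `js:mathjax` artifact elsewhere. Whether the header includes are guaranteed to be emitted before the artifact script tags is an assumption that needs checking against the template.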
| 85 | + |
| 86 | +## Vendor vs CDN vs both? |
| 87 | + |
| 88 | +Bootstrap's JS bundle was tiny (80 KB), so vendoring was an easy call. MathJax |
| 89 | +is *much* bigger: |
| 90 | + |
| 91 | +- Full MathJax 3 distribution: ~70 MB unpacked (includes every font / output mode / extension). |
| 92 | +- Common components-only loader: ~1 MB. |
| 93 | +- Smallest bootstrap-loader: ~150 KB. |
| 94 | + |
| 95 | +Quarto 1 vendors the components-only build. The size delta vs Bootstrap |
| 96 | +is real — vendoring 1 MB into the CLI binary affects download size and |
| 97 | +fresh-clone build time. Decision worth recording up front; both |
| 98 | +extremes have precedent. |
| 99 | + |
| 100 | +If you go CDN-default with vendor-fallback, that's one more design |
| 101 | +question you didn't have to answer for Bootstrap. An |
| 102 | +`is_minimal_html`-style metadata knob (e.g. `mathjax.source: cdn|local`) |
| 103 | +would be the user-facing surface. |
| 104 | + |
| 105 | +## Hub-client / WASM |
| 106 | + |
| 107 | +Unlike Bootstrap (deliberately omitted from WASM because the |
| 108 | +iframe-per-render preview blows away stateful Bootstrap components), |
| 109 | +math display is **stateless** — typeset on load, done. Hub-client |
| 110 | +*should* render math. Don't blindly copy bd-4eyf's |
| 111 | +`#[cfg(not(target_arch = "wasm32"))]` gate; for math, both pipelines |
| 112 | +get the stage. (Confirm by trying a dollar-sign-equation in |
| 113 | +hub-client preview before committing.) |
| 114 | + |
| 115 | +If the size of the vendored MathJax bundle is what's blocking the |
| 116 | +WASM bundle, the CDN-default path solves that for free. |
| 117 | + |
| 118 | +## Configuration knobs (scope decision) |
| 119 | + |
| 120 | +Quarto 1 supports `html-math-method: mathjax | katex | webtex | gladtex | mathml | plain`. |
| 121 | +The six methods differ in what they ship: |
| 122 | + |
| 123 | +- `mathjax`, `katex`: client-side typesetting; each ships a JS runtime. |
| 124 | +- `webtex`: `<img>` tags that point at an external image-rendering service, no JS. |
| 125 | +- `gladtex`: `<eq>` tags for post-processing by GladTeX into images, no JS. |
| 126 | +- `mathml`, `plain`: no JS; `mathml` emits MathML for the browser to render, `plain` is Pandoc's Unicode/HTML fallback. |
| 127 | + |
| 128 | +q2 does not need to ship all of these on day one. Pick a default |
| 129 | +(probably `mathjax`) and an explicit user-controllable knob; defer the |
| 130 | +others. **Decide before you start writing code**: the predicate matrix |
| 131 | +balloons fast if you support all six. |
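One way to keep the predicate matrix small is to parse the knob into an enum once and ask it a single question. The sketch below is illustrative only: `HtmlMathMethod` and `ships_js_runtime` are hypothetical names, and the only facts taken from above are the six `html-math-method` values and that `mathjax` and `katex` are the two that need a JS runtime.

```rust
use std::str::FromStr;

/// Hypothetical enum mirroring the six `html-math-method` values.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HtmlMathMethod {
    MathJax,
    KaTeX,
    WebTeX,
    GladTeX,
    MathML,
    Plain,
}

impl FromStr for HtmlMathMethod {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "mathjax" => Ok(Self::MathJax),
            "katex" => Ok(Self::KaTeX),
            "webtex" => Ok(Self::WebTeX),
            "gladtex" => Ok(Self::GladTeX),
            "mathml" => Ok(Self::MathML),
            "plain" => Ok(Self::Plain),
            other => Err(format!("unknown html-math-method: {other}")),
        }
    }
}

impl HtmlMathMethod {
    /// Only the client-side typesetters need a JS artifact registered.
    pub fn ships_js_runtime(self) -> bool {
        matches!(self, Self::MathJax | Self::KaTeX)
    }
}
```

A mathjax-only v1 could still parse all six values and report the unsupported ones as a clear error rather than silently ignoring them.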
| 132 | + |
| 133 | +## Test strategy |
| 134 | + |
| 135 | +The bd-4eyf test pattern to copy: |
| 136 | + |
| 137 | +- **Unit tests** for the trigger predicate (math present / absent / nested in callout / inside code block / etc.) in the new stage's `#[cfg(test)] mod tests`. |
| 138 | +- **Integration tests** in a new `crates/quarto-core/tests/math_mode_pipeline.rs` driving `render_to_file` + `ProjectPipeline`, asserting: |
| 139 | + - Math-bearing render emits the script tag(s) and the on-disk file(s). |
| 140 | + - Math-free render emits neither. |
| 141 | + - Multi-page website ships one shared copy. |
| 142 | + - Nested-page relative URL. |
| 143 | +- **Live browser smoke** via chrome-devtools-mcp — actually load a rendered page and assert MathJax typeset something visible (e.g. that a `$x^2$` body produced a `<mjx-container>` in the DOM). The bd-4eyf session proved this works and is the most decisive evidence the runtime is wired up. Don't skip it. |
| 144 | + |
| 145 | +## Snapshot baseline |
| 146 | + |
| 147 | +`crates/quarto-core/tests/fixtures/phase5-single-doc-baseline/expected_hashes.txt` was re-captured for bd-4eyf because the new |
| 148 | +`<script>` tag changed `doc.html`'s SHA. The baseline `doc.qmd` is |
| 149 | +math-free, so math-mode work should *not* change `doc.html` again — |
| 150 | +the math stage must skip on math-free input. If the hash shifts, the |
| 151 | +predicate is over-triggering. (Useful canary.) |
| 152 | + |
| 153 | +## bd-telo dependency |
| 154 | + |
| 155 | +`bd-telo` (filed during bd-4eyf): q2 today only reads `navbar:` from |
| 156 | +the top level of `_quarto.yml`, not from nested `website.navbar:`. If |
| 157 | +user-facing math-mode docs include website examples in the natural |
| 158 | +`website.navbar:` shape, those examples will not match what actually |
| 159 | +works. Either land bd-telo first, or be careful about the YAML shape |
| 160 | +in user-facing docs/examples for math. |
| 161 | + |
| 162 | +## Things that *don't* need re-deriving |
| 163 | + |
| 164 | +bd-4eyf already settled these — don't re-design them: |
| 165 | + |
| 166 | +- **Artifact-based external scripts** are the right shape for the JS payload (uses `ApplyTemplateStage`'s existing `js:` collector, gets the right URL via `ResourceResolverContext`, lands in the project lib dir for websites with the `quarto/` namespace, lands per-page for single-doc). |
| 167 | +- **Project scope** is the right scope for math assets (shared across pages in a website, mirrors the theme CSS layout). |
| 168 | +- **Vendoring layout under `resources/js/<feature>/`** with an `include_bytes!` is the established pattern; just add a section to `resources/js/README.md`. |
| 169 | +- **TDD with a noop stub for red phase** is the local norm; CLAUDE.md mandates it. The failure messages bd-4eyf used (positive cases fail, skip cases pass-via-false-positive) are documented in the bd-4eyf plan. |
| 170 | +- **Script ordering** is alphabetic-by-key and there's no SAT solver coming. If math needs to load *after* Bootstrap (it doesn't), pick a key that sorts after `bootstrap`. |
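The ordering point in the last bullet can be illustrated with a small sketch. It assumes, purely for illustration, that artifacts are held in a map keyed by names like `js:bootstrap`; `script_tags` is a hypothetical helper, not the real `collect_artifact_urls`, and the URLs are made up.

```rust
use std::collections::BTreeMap;

/// Emits one `<script src>` tag per `js:` artifact.
/// `BTreeMap` iterates in sorted key order, so output is alphabetical by key.
pub fn script_tags(artifacts: &BTreeMap<String, String>) -> Vec<String> {
    artifacts
        .iter()
        .filter(|(key, _)| key.starts_with("js:"))
        .map(|(_, url)| format!(r#"<script src="{url}"></script>"#))
        .collect()
}
```

With keys `js:bootstrap` and `js:mathjax`, the Bootstrap tag comes first regardless of insertion order, which is why choosing the key is the only ordering lever available.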
| 171 | + |
| 172 | +## Open question for the next session's first message |
| 173 | + |
| 174 | +Before designing, get an answer on: |
| 175 | + |
| 176 | +> Do we want math-mode to support multiple engines (mathjax + katex |
| 177 | +> at minimum), or is mathjax-only an acceptable v1? |
| 178 | + |
| 179 | +This single decision shapes the whole stage — single-feature stage vs. |
| 180 | +parameterized-by-engine stage vs. one-stage-per-engine. Don't start |
| 181 | +without it. |