Description
PLAN: ipynb Export with CommonMark Markdown for mystmd
Tracking issue: QuantEcon/meta#292
Branch: `myst-to-ipynb` on `QuantEcon/mystmd` (fork of `jupyter-book/mystmd`)
Date: 2026-02-25
Goal
Add myst build --ipynb support to mystmd that produces notebooks with plain
CommonMark markdown cells — compatible with vanilla Jupyter Notebook, JupyterLab
(without jupyterlab-myst), and Google Colab.
PR jupyter-book#1882 already provides the infrastructure but delegates markdown cell content
to myst-to-md, which outputs MyST directive syntax (:::{note}, :::{figure},
etc.). We need a commonmark serialization mode so the output notebooks are
portable.
Phase 1 — Setup & Assess
- Fork `jupyter-book/mystmd` → `QuantEcon/mystmd`
- Clone fork, checkout `myst-to-ipynb` branch
- Build locally (`npm install` + `npm run build`: 36 packages, 21s)
- Run `myst build --ipynb` on a sample project (synthetic test cases)
- Test against `lecture-python-programming.myst` content (`functions.md`)
- Catalog what renders correctly vs. what breaks
- Document findings below
Phase 1 Findings (2026-02-25)
Build & export works
- `myst build --ipynb` successfully produces `.ipynb` files from MyST source
- Code cells are correctly extracted from `{code-cell}` blocks
- Cell splitting at code boundaries works properly
- 48 cells generated from `functions.md` (26 code, 22 markdown), which is the correct ratio
Issues confirmed: MyST syntax in markdown cells
The following MyST-specific syntax appears in output notebook markdown cells and
will not render in vanilla Jupyter or Colab:
| Issue | MyST syntax in output | Needed CommonMark | Severity |
|---|---|---|---|
| Inline math roles | `` {math}`E = mc^2` `` | `$E = mc^2$` | HIGH (pervasive) |
| Math blocks | `` ```{math}\n...\n``` `` | `$$\n...\n$$` | HIGH (pervasive) |
| Admonitions | `:::{note} Note\n...\n:::` | `> **Note**\n>\n> ...` | HIGH (common) |
| Figures | `:::{figure} path\n:name: ...\n:::` | `![alt](path)` | HIGH (common) |
| Tabs | `::::{tab-set}\n:::{tab-item}...\n::::` | Content preserved, wrapper stripped | MEDIUM |
| Code blocks (non-executable) | `` ```{code-block}\n...\n``` `` | Plain fenced code | MEDIUM |
| `+++` markers | Every markdown cell starts with `+++\n` | Should be stripped | HIGH (every cell) |
| Exercise/Solution | Unsupported → empty output (silently dropped) | `**Exercise N**` / configurable | MEDIUM |
| Proof/Theorem | Unsupported → empty output (silently dropped) | `**Theorem** (Title)\n...` | MEDIUM |
| Raw blocks | Unsupported → silently dropped | Drop or preserve as HTML | LOW |
What works correctly (no changes needed)
- Headings (`#`, `##`, etc.)
- Bold, italic, inline code
- External links `[text](url)`
- Blockquotes (`>`)
- Bullet and numbered lists
- Definition lists
- Cross-references → rendered as `[Theorem 1](#label)` links (good!)
- Code cell source preservation: exact match
Metadata issues
- `metadata.language_info.name` hardcoded to `"python"` (bug #2)
- No `metadata.kernelspec` at all; frontmatter ignored (bug #1)
- No cell-level metadata (tags like `hide-input` not passed through)
- Log message says "Exported MD" (bug #3)
Architecture assessment for CommonMark mode
After reading the myst-to-md source:
- `myst-to-md` has explicit handlers for every node type: `directiveHandlers`, `roleHandlers`, `referenceHandlers`, `miscHandlers` in separate files
- Admonition handler calls `writeFlowDirective(name)` → always emits `:::{name}`
- Math handler calls `writeStaticDirective('math')` → always emits `` ```{math} ``
- Inline math handler calls `writeStaticRole('math')` → always emits `` {math}`...` ``
- The `block` handler in `misc.ts` emits a `+++` prefix for every block node
Recommended approach: Option B (AST pre-transform in myst-to-ipynb)
Rationale:
- `myst-to-md` is designed specifically to produce roundtrippable MyST, so changing it risks breaking the MD export path
- The ipynb exporter already has the AST available before calling `writeMd`
- A pre-transform can walk the AST and replace directive nodes with their CommonMark-equivalent AST nodes (e.g., `admonition` → `blockquote` with bold title, `math` directive → plain text `$$` block)
- After the transform, `writeMd` will naturally produce CommonMark because the AST no longer contains MyST-specific nodes
- This keeps `myst-to-md` unchanged and isolates all CommonMark logic in `myst-to-ipynb`
The transform would live in myst-to-ipynb/src/commonmark.ts and be applied
conditionally when the export config specifies markdown: commonmark.
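To make the pre-transform idea concrete, here is a minimal sketch of the `admonition` → `blockquote` rewrite on a simplified tree. The `AstNode` shape and the helper names `admonitionToBlockquote` and `replaceAdmonitions` are assumptions made for illustration; they are not taken from `commonmark.ts`, and the real mystmd node types carry more fields.

```typescript
// Simplified mdast-like node; an assumption, not the real mystmd type.
interface AstNode {
  type: string;
  value?: string;
  kind?: string;
  children?: AstNode[];
}

const capitalize = (s: string): string => s.charAt(0).toUpperCase() + s.slice(1);

// Hypothetical helper: turn one admonition into a blockquote whose first
// paragraph is the bold title, followed by the original body nodes.
function admonitionToBlockquote(node: AstNode): AstNode {
  const children = node.children ?? [];
  const titleNode = children.find((c) => c.type === 'admonitionTitle');
  const body = children.filter((c) => c.type !== 'admonitionTitle');
  // Fall back to the capitalized kind (e.g. "Note") when no title is present.
  const titleText: AstNode[] = titleNode?.children ?? [
    { type: 'text', value: capitalize(node.kind ?? 'note') },
  ];
  const title: AstNode = {
    type: 'paragraph',
    children: [{ type: 'strong', children: titleText }],
  };
  return { type: 'blockquote', children: [title, ...body] };
}

// Hypothetical walker: returns a new tree with every admonition replaced.
function replaceAdmonitions(node: AstNode): AstNode {
  if (!node.children) return { ...node };
  const next: AstNode = { ...node, children: node.children.map(replaceAdmonitions) };
  return next.type === 'admonition' ? admonitionToBlockquote(next) : next;
}

const admonitionExample: AstNode = {
  type: 'root',
  children: [
    {
      type: 'admonition',
      kind: 'note',
      children: [
        { type: 'admonitionTitle', children: [{ type: 'text', value: 'Note' }] },
        { type: 'paragraph', children: [{ type: 'text', value: 'Body text' }] },
      ],
    },
  ],
};

const admonitionResult = replaceAdmonitions(admonitionExample);
```

Because the result contains only `blockquote`, `paragraph`, `strong` and `text` nodes, a serializer that knows nothing about admonitions would emit something like `> **Note**` followed by the quoted body.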
Phase 2 — Bug Fixes (from meta#292 review)
| # | Bug | Location | Status |
|---|---|---|---|
| 1 | `frontmatter` parameter accepted but never used; should populate `metadata.kernelspec` and `metadata.language_info` | `myst-to-ipynb/src/index.ts` | ✅ Fixed |
| 2 | Language hardcoded to `'python'`; should derive from frontmatter kernelspec | `myst-to-ipynb/src/index.ts` | ✅ Fixed |
| 3 | Log message says "Exported MD" instead of "Exported IPYNB" | `myst-cli/src/build/ipynb/index.ts` | ✅ Fixed |
| 4 | Redundant `+++` markers leak into markdown cells (stated TODO) | `myst-to-ipynb/src/index.ts` | ✅ Fixed |
| 5 | `package.json` homepage URL points to `myst-to-md` not `myst-to-ipynb` | `myst-to-ipynb/package.json` | ✅ Fixed |
All tests pass (vitest run — 3/3). Verified on real functions.md lecture content.
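As an illustration of bugs 1 and 2, the sketch below derives notebook metadata from a frontmatter `kernelspec`. The type names, the `notebookMetadata` function, and the fallback heuristic (use `kernelspec.language` if present, otherwise strip a trailing version suffix from the kernel name, otherwise default to `python`) are assumptions for this example; the actual fix in `myst-to-ipynb/src/index.ts` may resolve the language differently.

```typescript
// Assumed, simplified shapes; not the real mystmd or nbformat typings.
interface KernelSpec {
  name: string;
  display_name: string;
  language?: string;
}

interface Frontmatter {
  kernelspec?: KernelSpec;
}

interface NotebookMetadata {
  kernelspec?: KernelSpec;
  language_info: { name: string };
}

// Hypothetical helper: build notebook-level metadata from page frontmatter.
function notebookMetadata(frontmatter: Frontmatter = {}): NotebookMetadata {
  const { kernelspec } = frontmatter;
  // Heuristic (assumption): "python3" -> "python", "julia-1.9" -> "julia".
  const fromName = kernelspec?.name.replace(/[-\d.]+$/, '');
  const language = kernelspec?.language ?? (fromName || 'python');
  return {
    ...(kernelspec ? { kernelspec } : {}),
    language_info: { name: language },
  };
}

const defaultMeta = notebookMetadata();
const juliaMeta = notebookMetadata({
  kernelspec: { name: 'julia-1.9', display_name: 'Julia 1.9' },
});
const pythonMeta = notebookMetadata({
  kernelspec: { name: 'python3', display_name: 'Python 3' },
});
```

The name-based heuristic does not cover kernels whose name differs from the language (for example the R kernel `ir`), so an explicit `language` field is the safer input in this sketch.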
Phase 3 — CommonMark Serialization Mode ✅ COMPLETE
Committed as cb808aec on myst-to-ipynb branch.
What was implemented
Added commonmark.ts — an AST pre-transform (~465 lines) that converts MyST-specific
nodes to CommonMark equivalents before writeMd serialization.
Configuration:
    # In page frontmatter or project exports:
    exports:
      - format: ipynb
        markdown: commonmark  # default: 'myst' (existing behavior)

Directive → CommonMark mappings (all working)
| MyST Node | CommonMark Output | Verified |
|---|---|---|
| `math` block | `$$..$$` (via `html` node; no LaTeX escaping) | ✅ |
| `inlineMath` role | `$...$` (via `html` node) | ✅ |
| `admonition` | `> **Title**` blockquote | ✅ |
| `exercise` | `**Exercise N**` + content | ✅ |
| `solution` | `**Solution**` + content (or dropped via option) | ✅ |
| `proof`/`theorem`/`lemma` | `**Theorem N (Title)**` + content | ✅ |
| `tabSet` | Bold tab titles + tab content | ✅ |
| `container` (figure) | `![alt](path)` + italic caption | ✅ |
| `container` (table) | Bold caption + GFM table | ✅ |
| `card` | Bold title + content | ✅ |
| `grid` | Unwrapped to child cards | ✅ |
| `details` | Blockquote with bold summary | ✅ |
| `aside` | Blockquote | ✅ |
| `mystDirective` | Unwrapped children or code block | ✅ |
| `mystRole` | Unwrapped children or plain text | ✅ |
| `code` blocks | Stripped MyST options (lang preserved) | ✅ |
| `mystTarget` | Dropped (no CommonMark equivalent) | ✅ |
| `comment` | Dropped (`%` syntax not valid in CommonMark) | ✅ |
| Node `identifier`/`label` | Stripped to prevent `(id)=` prefixes | ✅ |
Key design decisions
- html-type AST nodes for math: Used `{ type: 'html', value: '$$..$$' }` instead of `{ type: 'text' }` to prevent `mdast-util-to-markdown` from escaping LaTeX special characters (`_`, `\`, etc.)
- Bottom-up tree walk: Transforms process children first, so nested directives (e.g., exercise containing math) are handled correctly
- Deep clone: The original AST is cloned before the CommonMark transform to avoid mutating cached data
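The three decisions above can be sketched together on a simplified tree. This is not the code in `commonmark.ts`: the `AstNode` shape, the function names `mathToHtml` and `toCommonMarkMath`, and the JSON round-trip used as a deep clone are all assumptions chosen to keep the example self-contained.

```typescript
// Simplified node shape (assumption); real mystmd nodes have more fields.
interface AstNode {
  type: string;
  value?: string;
  children?: AstNode[];
}

// Bottom-up walk: children are rewritten before their parent is inspected,
// so math nested inside another directive is already converted when the
// parent is handled. Mutates the node it is given.
function mathToHtml(node: AstNode): AstNode {
  if (node.children) {
    node.children = node.children.map(mathToHtml);
  }
  switch (node.type) {
    case 'math':
      // An html node is emitted verbatim, so "_" and "\" are not escaped.
      return { type: 'html', value: '$$\n' + (node.value ?? '') + '\n$$' };
    case 'inlineMath':
      return { type: 'html', value: '$' + (node.value ?? '') + '$' };
    default:
      return node;
  }
}

// Deep-clone first so the caller's (possibly cached) tree is left untouched.
// A JSON round-trip is used here as a stand-in for whatever clone the real
// code uses; it is sufficient for plain-data trees like this one.
function toCommonMarkMath(tree: AstNode): AstNode {
  const copy = JSON.parse(JSON.stringify(tree)) as AstNode;
  return mathToHtml(copy);
}

const mathExample: AstNode = {
  type: 'root',
  children: [
    {
      type: 'exercise',
      children: [
        {
          type: 'paragraph',
          children: [
            { type: 'text', value: 'Compute ' },
            { type: 'inlineMath', value: 'x_1 + \\alpha' },
          ],
        },
        { type: 'math', value: 'E = mc^2' },
      ],
    },
  ],
};

const mathResult = toCommonMarkMath(mathExample);
```

After the call, `mathResult` holds `html` nodes with the values `$x_1 + \alpha$` and a `$$` block, while `mathExample` still contains its original `inlineMath` and `math` nodes.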
Files changed
- `packages/myst-to-ipynb/src/commonmark.ts`: NEW (465 lines)
- `packages/myst-to-ipynb/src/index.ts`: Added `IpynbOptions`, transform wiring
- `packages/myst-cli/src/build/ipynb/index.ts`: Passes `markdown` option from export config
Phase 4 — Tests & Validation ✅ COMPLETE
Unit tests committed as c1cca05f. Real-world validation fixes committed as 2d70076d.
Both pushed to QuantEcon/mystmd.
Test suite: 35 passing tests across 3 YAML files
| File | Tests | Coverage |
|---|---|---|
| `basic.yml` | 13 | Core features: styles, headings, thematic break, blockquotes, lists, HTML, fenced code, code cells, mixed cells, block marker stripping, links, images, line breaks |
| `frontmatter.yml` | 4 | Kernelspec metadata: default Python, Julia kernel, Python3 kernel, R kernel |
| `commonmark.yml` | 18 | CommonMark mode: inline math (`$`), math blocks (`$$`), math with/without labels, underscores not escaped, admonitions → blockquote, admonitions preserved in myst mode, exercises with enumerator, theorems with title, tabSets → bold titles, solutions dropped/kept, frontmatter + CommonMark combined, heading/paragraph identifier stripping, `mystTarget` drop, `comment` drop, code block attribute stripping |
Test infrastructure improvements
- Rewrote `run.spec.ts` to support `frontmatter` and `options` fields in YAML test cases
- Test runner auto-discovers all `.yml` files in the tests directory
- `IpynbOptions` (including `commonmark.dropSolutions`) fully testable via YAML
Real-world validation: functions.md from lecture-python-programming.myst
Tested by exporting a real QuantEcon lecture file (48 cells: 26 code, 22 markdown)
using the local dev build of myst build --ipynb in /tmp/test-ipynb-export/.
Issues found and fixed (commit 2d70076d):
| Issue | Root Cause | Fix |
|---|---|---|
| `(pos_args)=`, `(recursive_functions)=` etc. in output | `myst-to-md`'s `labelWrapper` adds an `(identifier)=\n` prefix to headings/paragraphs/blockquotes/lists with `identifier`/`label` properties | Strip `identifier`/`label` from all children after `transformNode` in `transformToCommonMark` |
| `(index-vivo0ovzzj)=` auto-generated labels | Same root cause: `{index}` directives produce auto-generated identifiers | Same fix |
| `+++` markers mid-cell | `stripBlockMarkers` regex only matched at start of string | Changed regex to `/^\+\+\+[^\n]*\n/gm` (global multiline) |
| `` ```{code-block} python\n:class: no-execute `` | `code` nodes with extra MyST attributes rendered as directives | Added `code` case to `transformNode` → `transformCodeBlock()` strips extra attributes |
| Empty cells from dropped nodes | `mystTarget` / `comment` / dropped `solution` nodes leave empty markdown cells | Added `.filter()` to remove empty markdown cells after transformation |
| `mystTarget` nodes | Not handled in CommonMark mode | Added `case 'mystTarget': return null` |
| `%` comment syntax | Not handled in CommonMark mode | Added `case 'comment': return null` |
Result: 48 cells, 0 MyST syntax leaks (verified by automated audit script).
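Two of the fixes in the table are small enough to sketch directly. The regex below is the one quoted in the table; the function bodies, the `NotebookCell` shape, and the name `dropEmptyMarkdownCells` are assumptions for illustration rather than the committed code.

```typescript
// Assumed, simplified cell shape; nbformat allows more fields and cell types.
interface NotebookCell {
  cell_type: 'markdown' | 'code';
  source: string;
}

// Remove every line that starts with "+++" (including any trailing JSON
// metadata on that line). The "g" and "m" flags make "^" match at the start
// of each line, not only at the start of the string.
function stripBlockMarkers(md: string): string {
  return md.replace(/^\+\+\+[^\n]*\n/gm, '');
}

// Hypothetical helper: drop markdown cells left empty after nodes such as
// mystTarget or comment were removed. Code cells are always kept.
function dropEmptyMarkdownCells(cells: NotebookCell[]): NotebookCell[] {
  return cells.filter((c) => c.cell_type !== 'markdown' || c.source.trim() !== '');
}

const stripped = stripBlockMarkers('+++\nFirst\n\n+++ {"tags": []}\nSecond\n');

const filtered = dropEmptyMarkdownCells([
  { cell_type: 'markdown', source: 'Intro' },
  { cell_type: 'markdown', source: '  \n' },
  { cell_type: 'code', source: '' },
]);
```

Here `stripped` is `'First\n\nSecond\n'` and `filtered` keeps the first markdown cell and the code cell. Note that this regex requires a newline after the marker, so a `+++` on the very last line with no trailing newline would not be removed by it.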
Remaining validation (manual)
- Validate output notebooks open correctly in:
  - Jupyter Notebook (classic)
  - JupyterLab (no `jupyterlab-myst`)
  - Google Colab
- Test with real QuantEcon lecture content (`lecture-python-programming.myst`): ✅ `functions.md` clean
- Test additional lecture files (more diverse MyST features)
- Verify cell metadata passthrough (`hide-input`, `remove-cell`, etc.)
Phase 5 — Submit Upstream
- Push commits to `QuantEcon/mystmd` `myst-to-ipynb` branch
- Open PR against `jupyter-book/mystmd` `myst-to-ipynb` branch (or push directly if given access)
- Coordinate review with @agoose77 and @rowanc1
- Update QuantEcon/meta#292
Parallel Work
- Update QuantEcon theme download button to serve built ipynb files
- Wire CI in `lecture-python-programming.myst` to use custom mystmd build (until PR merges upstream)
- Track QuantEcon/quantecon-theme-src#26 (BinderHub launch support)
Key Files in This Branch
    packages/myst-to-ipynb/          # New package: AST → ipynb conversion
      src/index.ts                   # Main export logic + IpynbOptions + empty cell filter
      src/commonmark.ts              # CommonMark AST pre-transform (Phase 3 + Phase 4 fixes)
      tests/run.spec.ts              # Test runner: loads all .yml, supports options
      tests/basic.yml                # 13 basic feature tests (Phase 4)
      tests/frontmatter.yml          # 4 kernelspec/metadata tests (Phase 4)
      tests/commonmark.yml           # 18 CommonMark-mode tests (Phase 4)
      package.json
    packages/myst-to-md/             # Existing: AST → Markdown string (unchanged)
      src/index.ts
    packages/myst-cli/
      src/build/ipynb/index.ts       # CLI wiring for `myst build --ipynb`
    packages/myst-frontmatter/
      src/exports/validators.ts      # Export format validators (ipynb added)
Commit History
| Commit | Description |
|---|---|
| `79c7be0b` | Phase 2: Bug fixes (kernelspec from frontmatter, `language_info`, log message, `+++` stripping, homepage URL) |
| `f67f1188` | Phase 3: CommonMark serialization mode (`commonmark.ts` (465 lines), `IpynbOptions`, CLI wiring) |
| `c1cca05f` | Phase 4: Expand test suite to 30 cases across 3 YAML files |
| `2d70076d` | Phase 4: Real-world validation fixes (identifier/label stripping, `mystTarget`/`comment` drop, empty cell filter, code block attribute stripping, global `+++` regex) |
References
- mystmd plugin docs
- sphinx-tojupyter — reference implementation for directive → CommonMark mappings
- jupyterlab-myst — JupyterLab extension for rendering MyST (what we want to NOT require)
- nbformat spec — Jupyter notebook format