Add Markdown parser support (.md, .markdown) #145
Description
Problem
Ix does not currently parse .md or .markdown files. Markdown files are treated as
unrecognized and excluded from analysis, meaning documentation, API references, guides,
and developer-facing content are invisible to the system.
Proposed solution
Add a Markdown parser that can identify and extract:
- Heading entities at all levels (h1–h6), including ATX (#) and setext (===, ---) syntax
- Heading hierarchy as CONTAINS relationships (file → h1 → h2 → h3)
- Section chunks scoped to each heading
- YAML frontmatter blocks
- VitePress/Docusaurus anchor syntax (e.g. ## Heading {#anchor-id}) cleaned from entity names
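To make the extraction list above concrete, here is a minimal TypeScript sketch of heading extraction. It is not Ix's parser: the Heading shape, the extractHeadings name, and the parent-index encoding of CONTAINS are all assumptions for illustration. It also simplifies CommonMark considerably (single-line setext text only, rough fence tracking, and frontmatter is skipped rather than emitted as an entity).

```typescript
// Hypothetical shape; Ix's real entity model is not described in this issue.
interface Heading {
  level: number;         // 1-6
  name: string;          // heading text with a trailing {#anchor-id} removed
  line: number;          // 1-based line number of the heading text
  parent: number | null; // index of the containing heading; null means the file
}

const ATX = /^ {0,3}(#{1,6})(?:[ \t]+(.*?))?(?:[ \t]+#+)?[ \t]*$/;
const SETEXT = /^ {0,3}(=+|-+)[ \t]*$/;
const FENCE = /^ {0,3}(`{3,}|~{3,})/;
const ANCHOR = /\s*\{#[^}]+\}\s*$/;

function extractHeadings(source: string): Heading[] {
  const lines = source.split(/\r?\n/);
  const headings: Heading[] = [];
  const stack: number[] = []; // indices of the currently open ancestor headings
  let fence: string | null = null;
  let i = 0;

  // Skip a leading YAML frontmatter block so its closing --- is not
  // mistaken for a setext underline.
  if (lines[0]?.trim() === "---") {
    const end = lines.findIndex((l, idx) => idx > 0 && l.trim() === "---");
    if (end > 0) i = end + 1;
  }

  const push = (level: number, rawText: string, line: number): void => {
    const name = rawText.replace(ANCHOR, "").trim();
    // Close every open heading at the same or a deeper level.
    while (stack.length > 0 && headings[stack[stack.length - 1]].level >= level) {
      stack.pop();
    }
    const parent = stack.length > 0 ? stack[stack.length - 1] : null;
    headings.push({ level, name, line, parent });
    stack.push(headings.length - 1);
  };

  for (; i < lines.length; i++) {
    const line = lines[i];

    // Ignore anything inside fenced code blocks (simplified fence matching).
    const f = FENCE.exec(line);
    if (f) {
      if (fence === null) fence = f[1][0];
      else if (f[1][0] === fence) fence = null;
      continue;
    }
    if (fence !== null) continue;

    const atx = ATX.exec(line);
    if (atx) {
      push(atx[1].length, atx[2] ?? "", i + 1);
      continue;
    }

    // Setext: a non-blank line followed by a run of = (h1) or - (h2).
    const next = lines[i + 1];
    const under = next !== undefined ? SETEXT.exec(next) : null;
    if (under && line.trim() !== "") {
      push(under[1][0] === "=" ? 1 : 2, line, i + 1);
      i++; // consume the underline
    }
  }
  return headings;
}
```

Each heading's parent is the nearest preceding heading with a smaller level, so a document that jumps from h1 to h3 attaches the h3 directly to the h1. Whether Ix should do the same or insert placeholder levels is a design choice the issue leaves open.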
Support the following file patterns: *.md, *.markdown
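As a rough illustration of the pattern matching, a file check could look like the sketch below. The isMarkdownFile helper and the case-insensitive comparison are assumptions; the issue does not say how Ix registers parsers or whether extension matching ignores case.

```typescript
// Hypothetical helper: the issue names the patterns but not how Ix matches them.
const MARKDOWN_EXTENSIONS = [".md", ".markdown"];

function isMarkdownFile(filePath: string): boolean {
  // Assumption: README.MD should be treated the same as README.md.
  const lower = filePath.toLowerCase();
  return MARKDOWN_EXTENSIONS.some((ext) => lower.endsWith(ext));
}
```

Note that a suffix check like this deliberately excludes .mdx files, which the issue does not mention.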
Why it matters
Markdown is the standard format for documentation in software projects — READMEs, API
references, changelogs, guides, and architecture docs are almost universally written in
it. Without parsing Markdown, Ix cannot reason about a project's documentation layer,
answer questions grounded in written specs, or surface relationships between docs and
the code they describe. This is especially relevant for documentation-heavy repos like
Vue, React, and other OSS projects where .md files are a primary artifact.
Merge requirements
Before this can be merged to main, the following must pass:
- Tested on 3 repos
- No regressions on existing parsers (ix map on a known TS/JS repo, verify counts unchanged)
- Unit + smoke tests pass (npm test in ix-cli)
- Heading hierarchy chains correctly: file → h1 → h2 → h3
- Both ATX and setext heading syntax ingested
- ix contains returns children for a known markdown file and heading
- Frontmatter entities present where expected
- Section chunks created alongside heading entities
- ix text returns results with language: markdown
- Both .md and .markdown extensions ingested