Roadmap: Reduce Rust AsciiDoc parsers to 1 hybrid parser #336

# Problem

There are at least three actively developed Rust-based AsciiDoc parsers. Each has a preferred use case and its own tradeoffs. As a community, it is in our interest to reduce duplicated effort and to provide a single hybrid parser that follows the specification while keeping the existing performance optimizations.

# Rust AsciiDoc Parsers Compared

As of early 2026, the Rust ecosystem for AsciiDoc is transitioning from experimental projects to spec-compliant implementations. Markdown has an established Rust parser in `pulldown-cmark`; AsciiDoc's grammar is significantly more complex, which makes high-fidelity Rust parsers harder to build.

Primary Rust-Based AsciiDoc Parsers
-----------------------------------

| Feature | `asciidoc-parser` | `acdc` (nlopes) | `asciidocr` |
| --- | --- | --- | --- |
| **Development Status** | Active (v0.14+) | Active / Research | Active (v0.2.0) |
| **Grammar Style** | Spec-driven / Manual | PEG-based | Scanner/Parser split |
| **Goal** | Compliance with Eclipse Spec | Speed & Correctness | TCK Compliance |
| **Completeness** | High (Inlines, Lists, Attributes) | Moderate (Core elements) | Moderate |
| **CLI Included** | Yes | Yes (`acdc-cli`) | Yes (`asciidocr`) |
| **Performance** | High (Zero-copy focus) | High (PEG-optimized) | Standard |

* * *

### 1\. [`asciidoc-parser`](https://github.com/asciidoc-rs/asciidoc-parser) (scouten)

This is currently the most robust effort toward a production-ready, pure-Rust AsciiDoc processor.

*   **Design Philosophy**: It employs "spec-driven development," mapping code coverage directly against the Eclipse Foundation's AsciiDoc Language Specification.
*   **Strengths**: Strong support for inline substitutions (bold, italic, macros), document attributes, and complex list structures. It includes a built-in HTML5 backend.
*   **Limitations**: Does not support UTF-16 (requires UTF-8), ignores `compat-mode`, and does not support the `book` doctype yet.

### 2\. [`acdc`](https://github.com/nlopes/acdc) / `acdc-parser` (nlopes)

A high-performance parser designed with a focus on formal correctness using a **Parsing Expression Grammar (PEG)**.

*   **Design Philosophy**: Uses the `peg` crate for grammar definition. It utilizes a two-pass inline processing system: first identifying boundaries, then parsing content.
*   **Strengths**: Extremely fast and "fail-fast" by design. Includes an experimental Language Server Protocol (LSP) for editor support.
*   **Limitations**: Known gaps in table spanning (row/column spans) and specific nested inline markup (e.g., bold inside links).
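
The two-pass idea described above (find boundaries first, parse content second) can be illustrated with a minimal sketch. This is not `acdc`'s actual code: the `Span` and `Node` types and both function names are hypothetical, and only unconstrained strong markup (`**...**`) is handled.

```rust
use std::ops::Range;

/// Pass 1 output: byte ranges into the input, no text copied yet.
#[derive(Debug, PartialEq)]
pub enum Span {
    Plain(Range<usize>),
    Strong(Range<usize>),
}

/// Pass 2 output: nodes borrowing from the input.
#[derive(Debug, PartialEq)]
pub enum Node<'a> {
    Text(&'a str),
    Strong(&'a str),
}

/// Pass 1: locate the boundaries of unconstrained strong spans (`**...**`).
pub fn find_boundaries(input: &str) -> Vec<Span> {
    let mut spans = Vec::new();
    let mut plain_start = 0;
    let mut search_from = 0;
    while let Some(rel) = input[search_from..].find("**") {
        let open = search_from + rel;
        let content_start = open + 2;
        match input[content_start..].find("**") {
            // A non-empty, terminated span: record preceding plain text too.
            Some(len) if len > 0 => {
                if open > plain_start {
                    spans.push(Span::Plain(plain_start..open));
                }
                spans.push(Span::Strong(content_start..content_start + len));
                plain_start = content_start + len + 2;
                search_from = plain_start;
            }
            // Empty span (`****`): leave it as plain text and keep scanning.
            Some(_) => search_from = content_start,
            // Unterminated opener: the rest of the input is plain text.
            None => break,
        }
    }
    if plain_start < input.len() {
        spans.push(Span::Plain(plain_start..input.len()));
    }
    spans
}

/// Pass 2: turn the recorded boundaries into nodes.
pub fn parse_spans<'a>(input: &'a str, spans: &[Span]) -> Vec<Node<'a>> {
    spans
        .iter()
        .map(|span| match span {
            Span::Plain(r) => Node::Text(&input[r.start..r.end]),
            Span::Strong(r) => Node::Strong(&input[r.start..r.end]),
        })
        .collect()
}
```

Separating the passes keeps boundary detection cheap and lets the second pass recurse into each span independently, which is where nested markup would be handled in a fuller implementation.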

### 3\. [`asciidocr`](https://github.com/delfanbaum/asciidocr)

A newer implementation focused on passing the official **Technology Compatibility Kit (TCK)**.

*   **Design Philosophy**: Implements a standard scanner and parser architecture with a focus on creating a compatible Abstract Syntax Tree (AST).
*   **Strengths**: Provides clear library access to the scanner and AST, making it useful for developers building custom tooling or converters.

* * *

Technical Comparison of Parsing Strategies
------------------------------------------

The complexity of AsciiDoc requires different handling of "Inlines" versus "Blocks."

*   **Block Parsing**: All three libraries handle block-level elements (headings, paragraphs, delimited blocks) relatively well using line-by-line scanning.
*   **Inline Substitution**: `asciidoc-parser` is the most mature here. It handles AsciiDoc's intricate "constrained vs. unconstrained" formatting rules more reliably than the PEG-based approach, which can struggle with the lookahead and lookbehind that AsciiDoc's context-dependent inline syntax requires.
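
To make the "constrained" rule concrete, here is a simplified sketch of the boundary check for constrained bold (`*text*`). It is an illustration only, not code from any of the three projects: the function names are hypothetical, and the rule is reduced to "the opening mark is not preceded by a word character and is followed by non-whitespace; the closing mark is preceded by non-whitespace and is not followed by a word character."

```rust
fn is_word_char(c: char) -> bool {
    c.is_alphanumeric() || c == '_'
}

/// Returns the text enclosed by the first constrained bold span, if any.
/// Doubled marks (`**`) are skipped here; they belong to the unconstrained rule.
pub fn find_constrained_bold(input: &str) -> Option<&str> {
    let chars: Vec<(usize, char)> = input.char_indices().collect();
    for (open_idx, &(open_byte, c)) in chars.iter().enumerate() {
        if c != '*' {
            continue;
        }
        // Lookbehind: start of input, or a character that is neither a word char nor `*`.
        let prev_ok = open_idx == 0 || {
            let prev = chars[open_idx - 1].1;
            !is_word_char(prev) && prev != '*'
        };
        // Lookahead: the mark must directly touch the content.
        let next_ok = chars
            .get(open_idx + 1)
            .map_or(false, |&(_, n)| !n.is_whitespace() && n != '*');
        if !(prev_ok && next_ok) {
            continue;
        }
        for close_idx in (open_idx + 2)..chars.len() {
            let (close_byte, cc) = chars[close_idx];
            if cc != '*' {
                continue;
            }
            let before_ok = !chars[close_idx - 1].1.is_whitespace();
            let after_ok = chars
                .get(close_idx + 1)
                .map_or(true, |&(_, n)| !is_word_char(n) && n != '*');
            if before_ok && after_ok {
                return Some(&input[open_byte + 1..close_byte]);
            }
        }
    }
    None
}
```

With these checks, `a *bold* word` yields `bold`, while `2*3*4` and `a * b * c` yield nothing. The need to inspect the characters on both sides of every mark is what makes this awkward to express in a pure grammar.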

Recommendation for 2026
-----------------------

*   **For production/static sites**: Use `asciidoc-parser`. It has the highest feature parity with the Ruby reference implementation (`Asciidoctor`) and the best documentation coverage.
*   **For IDE tooling/LSP**: Look at `acdc`. Its PEG grammar is better suited for the incremental parsing needed in text editors.
*   **For custom backends**: `asciidocr` provides the most accessible AST structures if you need to transform AsciiDoc into a proprietary format.


# Goal: Consolidate projects

Consolidate [asciidoc-parser](https://github.com/asciidoc-rs/asciidoc-parser), [acdc](https://github.com/nlopes/acdc) (nlopes), and [asciidocr](https://github.com/delfanbaum/asciidocr) by aligning them with the **Eclipse Foundation’s AsciiDoc Language Specification**. The primary obstacle to a single solution is the divergence in parsing architecture (PEG vs. Manual Recursive Descent).

Consolidation Roadmap
---------------------

### 1\. Unified Intermediate Representation (AST)

Establish a shared Abstract Syntax Tree (AST) crate. Currently, each project defines its own `Node` or `Block` enums.

*   **Action**: Extract the AST definitions from `asciidocr` or `asciidoc-parser` into a standalone `asciidoc-ast` crate.
*   **Goal**: Enable different parsing front-ends to target the same data structure, allowing back-ends (HTML, PDF, DocBook) to be shared.
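
As a rough sketch of what a shared `asciidoc-ast` crate could expose, the types below show a small AST plus a back-end trait that any front-end could target. Every name here (`Inline`, `Block`, `Backend`, `Html`) is an assumption made for illustration; none of them is taken from the existing projects, and a real AST would need many more node kinds.

```rust
/// Hypothetical inline nodes.
#[derive(Debug, Clone, PartialEq)]
pub enum Inline {
    Text(String),
    Strong(Vec<Inline>),
    Emphasis(Vec<Inline>),
}

/// Hypothetical block nodes.
#[derive(Debug, Clone, PartialEq)]
pub enum Block {
    Section { level: u8, title: Vec<Inline>, blocks: Vec<Block> },
    Paragraph(Vec<Inline>),
    Listing { lines: Vec<String> },
}

/// Any back-end (HTML, PDF, DocBook) would implement this against the shared AST.
pub trait Backend {
    fn render(&self, blocks: &[Block]) -> String;
}

/// A deliberately tiny HTML back-end.
pub struct Html;

fn escape(text: &str) -> String {
    text.replace('&', "&amp;").replace('<', "&lt;").replace('>', "&gt;")
}

fn render_inlines(inlines: &[Inline]) -> String {
    inlines
        .iter()
        .map(|inline| match inline {
            Inline::Text(t) => escape(t),
            Inline::Strong(children) => format!("<strong>{}</strong>", render_inlines(children)),
            Inline::Emphasis(children) => format!("<em>{}</em>", render_inlines(children)),
        })
        .collect()
}

impl Backend for Html {
    fn render(&self, blocks: &[Block]) -> String {
        blocks
            .iter()
            .map(|block| match block {
                Block::Section { level, title, blocks } => {
                    // Section level 1 maps to <h2>; <h1> is left for the document title.
                    let h = level.saturating_add(1);
                    format!("<h{0}>{1}</h{0}>{2}", h, render_inlines(title), self.render(blocks))
                }
                Block::Paragraph(inlines) => format!("<p>{}</p>", render_inlines(inlines)),
                Block::Listing { lines } => format!("<pre>{}</pre>", escape(&lines.join("\n"))),
            })
            .collect()
    }
}
```

Because the back-end depends only on the AST types, a PEG front-end and a hand-written front-end could both feed the same renderer.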

### 2\. TCK-Driven Validation

The **AsciiDoc Technology Compatibility Kit (TCK)** serves as the "source of truth."

*   **Action**: Create a unified test runner that pulls the TCK JSON/YAML test suite.
*   **Goal**: Move development from "feature-chasing" to "compliance-filling." A project that passes 100% of the TCK is the de facto winner; merging projects becomes a matter of adopting the logic that passes specific TCK chapters.
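
A unified runner could look roughly like the sketch below. It makes no claim about the TCK's real file layout or output format: `TckCase`, its fields, and the idea of comparing a serialized result as a string are placeholders, and loading the cases from disk is left out. Each parser is plugged in as a closure, so all three projects could be scored by the same code.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory form of one test case.
pub struct TckCase {
    pub chapter: String,
    pub name: String,
    pub input: String,
    pub expected: String,
}

/// Pass/fail counts for one chapter.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct Tally {
    pub passed: usize,
    pub failed: usize,
}

/// Runs every case through `parse` and groups the results by chapter.
pub fn run_cases<F>(cases: &[TckCase], parse: F) -> BTreeMap<String, Tally>
where
    F: Fn(&str) -> String,
{
    let mut report: BTreeMap<String, Tally> = BTreeMap::new();
    for case in cases {
        let tally = report.entry(case.chapter.clone()).or_default();
        if parse(&case.input) == case.expected {
            tally.passed += 1;
        } else {
            tally.failed += 1;
        }
    }
    report
}
```

Running this once per parser gives a per-chapter table, which is the data needed to decide whose logic to adopt for each part of the specification.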

### 3\. Hybrid Parsing Architecture

AsciiDoc's grammar is context-sensitive and non-regular, making pure PEG (used by `acdc`) difficult for complex inlines, while manual parsers (used by `asciidoc-parser`) are harder to maintain.

*   **Action**: Adopt a "Lexical Functional" split. Use a fast scanner for block boundaries and a specialized Pratt parser or state machine for inline substitutions.
*   **Goal**: Combine the performance of `acdc` with the correctness of `asciidoc-parser`.
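
The block half of such a split might look like the following sketch: a line-oriented scanner that borrows from the input and leaves paragraph text untouched for a separate inline stage. The `RawBlock` type and `scan_blocks` function are hypothetical, and the rules are heavily simplified (only `----` listing delimiters of exactly four hyphens, `=`-prefixed headings, and paragraphs).

```rust
/// Raw blocks that borrow their text from the input (no copying).
#[derive(Debug, PartialEq)]
pub enum RawBlock<'a> {
    Heading { level: usize, title: &'a str },
    Listing(Vec<&'a str>),
    Paragraph(Vec<&'a str>),
}

/// Splits the input into raw blocks; inline markup is not interpreted here.
pub fn scan_blocks(input: &str) -> Vec<RawBlock<'_>> {
    let mut blocks = Vec::new();
    let mut lines = input.lines().peekable();
    while let Some(line) = lines.next() {
        if line.trim().is_empty() {
            continue;
        }
        // Delimited listing block: everything up to the closing delimiter is verbatim.
        if line == "----" {
            let mut body = Vec::new();
            for inner in lines.by_ref() {
                if inner == "----" {
                    break;
                }
                body.push(inner);
            }
            blocks.push(RawBlock::Listing(body));
            continue;
        }
        // Heading: one to six `=` marks followed by a space.
        let marks = line.chars().take_while(|&c| c == '=').count();
        if (1..=6).contains(&marks) && line[marks..].starts_with(' ') {
            blocks.push(RawBlock::Heading { level: marks - 1, title: line[marks..].trim() });
            continue;
        }
        // Paragraph: consecutive non-blank lines up to a blank line or a delimiter.
        let mut para = vec![line];
        while let Some(next) = lines.next_if(|l| !l.trim().is_empty() && *l != "----") {
            para.push(next);
        }
        blocks.push(RawBlock::Paragraph(para));
    }
    blocks
}
```

Only `Heading` titles and `Paragraph` lines would be handed to the inline engine; `Listing` bodies skip it entirely, which is one reason the two stages are easier to keep separate.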

* * *

Integration Strategy
--------------------

| Phase | Task | Primary Contributor |
| --- | --- | --- |
| **Phase A** | Common Spec-compliant AST crate | `asciidocr` architecture |
| **Phase B** | High-performance Block Scanner | `acdc` logic |
| **Phase C** | Inline Substitution Engine | `asciidoc-parser` logic |
| **Phase D** | Standard Library / Prelude | Shared |

Contribution Path
-----------------

The most efficient path to a "single solution" is contributing to **`asciidoc-parser`** (scouten), as it currently holds the closest alignment with the Eclipse specification.

1.  **Audit Gaps**: Run the TCK against all three.
2.  **Port Logic**: Identify specific features (e.g., Table Footnotes) present in one project but missing in the others.
3.  **Deprecate**: Once a single crate surpasses the others in TCK compliance and performance, provide migration paths for the CLI tools of the smaller projects.

