Add intermediate representation #103

Open
brandonchinn178 wants to merge 2 commits into jgm:master from brandonchinn178:intermediate-repr


brandonchinn178 commented Feb 4, 2023

Adds a new, completely optional code path (not even exported from import Commonmark) for people who want to inspect or transform the Markdown content before rendering it as Html (). I'm not sure it's worth merging, but it was at least an interesting exercise. I'm putting up this PR mostly for demonstration purposes.

Tested in GHCi:

ghci> import Commonmark.Nodes

ghci> commonmark "foo.md" "# hello\n```haskell\nx = 1\n```" :: Either ParseError (Nodes (Html ()))
Right (Nodes {unNodes = [Node {nodeType = NodeHeading 1 (Nodes {unNodes = [Node {nodeType = NodeStr "hello", nodeAttributes = [], nodeRange = Just foo.md@1:3-1:8}]}), nodeAttributes = [], nodeRange = Just foo.md@1:1-2:1},Node {nodeType = NodeCodeBlock "haskell" "x = 1\n", nodeAttributes = [], nodeRange = Just foo.md@2:1-5:1},Node {nodeType = NodeParagraph (Nodes {unNodes = []}), nodeAttributes = [], nodeRange = Just foo.md@4:4-5:1}]})

ghci> fromNodes <$> commonmark "foo.md" "# hello\n```haskell\nx = 1\n```" :: Either ParseError (Html ())
Right <h1>hello</h1>
<pre><code class="language-haskell">x = 1
</code></pre>
<p></p>

-- > | Just NodeRawBlock{} <- getNodeType node = True
-- > | otherwise = False
-- > in Nodes . filter (not . isRawHTML) . unNodes
newtype Nodes a = Nodes { unNodes :: [Node a] }
Author

These types have an annoying type parameter because, in order to implement fromNodes :: Nodes -> a, you would need to tell the compiler that a can render every NodeType that might be in the Nodes. But since NodeType is extensible, a Nodes value could end up containing some new FancyNode that Html () (for example) does not know how to render.

A clearer type would be to say

data Nodes nodeTypes = ...

where nodeTypes is a type-level set of node types that are represented in a Nodes value, and fromNodes could have a constraint that says "a can render every one of these node types". But this isn't very feasible right now.

The current implementation effectively bundles the Dict for IsBlock (Html ()) (Html ()) when parsing SomeNodeType, so each Node carries around how to "finish" rendering.
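
To illustrate the dictionary-bundling idea in isolation, here is a minimal sketch. It is not the code from this PR: Renders, Heading, CodeBlock, SomeNode and the toy Html newtype are hypothetical stand-ins, and Nodes/fromNodes are simplified versions that only share the names. The point is that an existential constructor can hide the concrete node type while packing the instance for a fixed output type a, which is why a has to show up as a type parameter.

```haskell
{-# LANGUAGE ExistentialQuantification #-}
{-# LANGUAGE MultiParamTypeClasses #-}

-- Hypothetical class: "output format a knows how to render node type t".
class Renders t a where
  render :: t -> a

-- Two illustrative node types; a third party could define more.
data Heading = Heading Int String
data CodeBlock = CodeBlock String String

-- Toy output format standing in for Html ().
newtype Html = Html String deriving (Eq, Show)

instance Semigroup Html where
  Html a <> Html b = Html (a ++ b)

instance Monoid Html where
  mempty = Html ""

instance Renders Heading Html where
  render (Heading n t) =
    Html (concat ["<h", show n, ">", t, "</h", show n, ">"])

instance Renders CodeBlock Html where
  render (CodeBlock lang src) =
    Html (concat ["<pre><code class=\"language-", lang, "\">", src, "</code></pre>"])

-- The existential hides t but packs the Renders t a dictionary with the
-- value, so the output type a must appear as a parameter of SomeNode.
data SomeNode a = forall t. Renders t a => SomeNode t

-- Simplified stand-in for the PR's Nodes type.
newtype Nodes a = Nodes { unNodes :: [SomeNode a] }

-- Pattern matching on SomeNode brings the packed dictionary back into
-- scope, so each node can "finish" rendering itself.
fromNodes :: Monoid a => Nodes a -> a
fromNodes = foldMap (\(SomeNode t) -> render t) . unNodes
```

With these definitions, fromNodes (Nodes [SomeNode (Heading 1 "hello")]) :: Html type-checks only because an instance Renders Heading Html exists at the point where the node is constructed. That is the trade-off described above: the choice of output type is fixed at parse time rather than at render time.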


henrytill commented Jul 29, 2025

I did a similar thing here and released it as a small adjunct library on Hackage.

I functorized the sum types representing inlines and blocks and then used Cofree to provide the type structure necessary for annotating these trees with source ranges and attributes. This also had the advantage of allowing these types to use the Cofree API and the wider Comonad ecosystem.
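
For readers unfamiliar with the pattern, the following is a rough sketch of what "functorized sum type plus Cofree" can look like. It is not taken from the library mentioned above: BlockF, Ann and the helper functions are hypothetical, the block type is heavily reduced, and Cofree is redefined locally (mirroring the shape of the type in the free package) so that the example has no dependencies.

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- Local copy of the Cofree shape, defined here only to keep the sketch
-- self-contained.
data Cofree f a = a :< f (Cofree f a)

instance Functor f => Functor (Cofree f) where
  fmap g (a :< rest) = g a :< fmap (fmap g) rest

-- Hypothetical, heavily reduced block functor: r marks the recursive
-- positions instead of the type referring to itself.
data BlockF r
  = Heading Int String
  | CodeBlock String String
  | BlockQuote [r]
  deriving (Functor)

-- Hypothetical annotation: a source range as (start line, end line).
data Ann = Ann { startLine :: Int, endLine :: Int }
  deriving (Eq, Show)

-- Every node of the tree is paired with an annotation.
type Block = Cofree BlockF Ann

-- Read the annotation at the root (the comonadic extract for Cofree).
annotation :: Cofree f a -> a
annotation (a :< _) = a

-- Collect every annotation in the tree, root first.
annotations :: Block -> [Ann]
annotations (a :< b) = a : case b of
  BlockQuote kids -> concatMap annotations kids
  _ -> []

-- A small annotated tree: a block quote holding a heading and a code block.
example :: Block
example =
  Ann 1 3 :< BlockQuote
    [ Ann 1 1 :< Heading 1 "hello"
    , Ann 2 3 :< CodeBlock "haskell" "x = 1\n"
    ]
```

Because the annotation lives in the Cofree layer rather than in BlockF, rewriting every annotation is a plain fmap over the tree, and the node shapes stay untouched.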
