Parse-print roundtrip loses backslash escapes #68

@melisgl

Parsing "[\\\\][x]" and printing the resulting tree yields "[\\][x]", in which the sole remaining backslash escapes the ] character. This is clearly wrong.

The following code demonstrates this:

(with-output-to-string (out)
  (3bmd::print-doc-to-stream-using-format
   (3bmd-grammar:parse-doc "[\\\\][x]")
   out :markdown))
=> "[\\][x]"

(3bmd-grammar:parse-doc "[\\][x]")
=> ((:PLAIN "[" "]" (:REFERENCE-LINK :LABEL ("x") :TAIL NIL)))

I cannot see an easy fix for this in 3bmd. In PAX, there is an okay workaround:
melisgl/mgl-pax@61b5ce4#diff-ea7e1b03d2cd2827267c023cbc1dc662c87c95ad254fd8d331619d45e2c7ca45R292-R313
This uses the fact that after parsing, PAX joins consecutive non-blank strings in the parse tree ("x" "y" -> "xy"). Then, the fix is to double backslashes that end strings in the parse tree.
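
To make the idea concrete, here is a minimal sketch of that kind of tree rewrite. It is not the code from the linked PAX commit: the function names are hypothetical, the tree is assumed to consist of proper lists, and it joins all adjacent strings rather than only non-blank ones.

```lisp
;;; Illustrative sketch only. All names here are hypothetical and this
;;; is not the actual PAX implementation.

(defun double-trailing-backslash (string)
  "Return STRING with one more backslash appended if it ends with one."
  (let ((n (length string)))
    (if (and (plusp n) (char= (char string (1- n)) #\\))
        (concatenate 'string string "\\")
        string)))

(defun join-adjacent-strings (items)
  "Concatenate runs of adjacent strings in the proper list ITEMS."
  (let ((result ()))
    (dolist (item items (nreverse result))
      (if (and (stringp item) (stringp (first result)))
          ;; Extend the previous string instead of adding a new element.
          (setf (first result) (concatenate 'string (first result) item))
          (push item result)))))

(defun fix-trailing-backslashes (tree)
  "Join adjacent strings in TREE, then double backslashes that end a string."
  (cond ((stringp tree) (double-trailing-backslash tree))
        ((consp tree)
         (mapcar #'fix-trailing-backslashes (join-adjacent-strings tree)))
        (t tree)))

;; A backslash in the middle of the label is left alone:
(fix-trailing-backslashes
 '(:reference-link :label ("a" "\\" "b") :definition ("x")))
;; => (:REFERENCE-LINK :LABEL ("a\\b") :DEFINITION ("x"))

;; A backslash at the end of the label is doubled:
(fix-trailing-backslashes
 '(:reference-link :label ("a" "\\") :definition ("x")))
;; => (:REFERENCE-LINK :LABEL ("a\\\\") :DEFINITION ("x"))
```

Joining first is what makes the end-of-string test meaningful: once "a" "\\" "b" has become a single string, only a backslash that would sit directly before the closing ] remains at the end of a string.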

I considered modifying the 3bmd printer, but it cannot make "local" decisions as it would need to consider the context of "\\":

(3bmd-grammar:parse-doc "[a\\\\b][x]")
=> ((:PLAIN (:REFERENCE-LINK :LABEL ("a" "\\" "b") :DEFINITION ("x"))))
=> NIL
=> T
(3bmd-grammar:parse-doc "[a\\\\][x]")
=> ((:PLAIN (:REFERENCE-LINK :LABEL ("a" "\\") :DEFINITION ("x"))))
=> NIL
=> T
