Add math-block notation#190

Closed
heavywatal wants to merge 8 commits into burodepeper:master from heavywatal:math-block

Conversation

@heavywatal

As discussed in #189, this PR adds the following notations for math blocks:

\[
  a = \sum \frac 1 i
\]

\begin{equation}
  a = \sum \frac 1 i
\end{equation}

\begin{equation*}
  a = \sum \frac 1 i
\end{equation*}

@burodepeper
Owner

Cool!

As I feared, the CI tests failed. Are you familiar with that? I'd like to have a couple new ones added for these new math notations as well.

@heavywatal
Author

Now \\[ ... \\] and \begin{equation} ... \end{equation} are properly recognized. (I noticed that one more backslash is needed in Markdown so that it is translated into \[ ... \] in HTML.)
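
For reference, the extra backslash follows from Markdown's backslash-escape rule: an escaped punctuation character is emitted without its backslash. Below is a minimal Python sketch of that single rule. The function name `unescape_markdown` is hypothetical, and a real Markdown renderer does far more than this.

```python
import re

# ASCII punctuation characters that a backslash may escape in Markdown.
_ESCAPABLE = re.compile(r'\\([!-/:-@\[-`{-~])')


def unescape_markdown(text):
    """Drop the backslash in front of each escaped punctuation character."""
    return _ESCAPABLE.sub(lambda m: m.group(1), text)


# Two backslashes in the Markdown source leave one backslash for MathJax.
print(unescape_markdown(r'\\[ a = b \\]'))
# A single backslash is consumed by the escape, so the delimiter is lost.
print(unescape_markdown(r'\[ a = b \]'))
```

Under this rule `\\[` in the source reaches the HTML as `\[`, while a bare `\[` is reduced to `[`.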

I think \[ in HTML blocks should be matched as well because wrapping equations in <div> is a common work-around to prevent underscores from being interpreted by Markdown:

<div>\[
a = \sum_{i=1} \frac 1 i
\]</div>

Where should I add the pattern for this? Maybe in the language-html package, or here?

@burodepeper
Owner

Thanks for your efforts so far! I'm a little busy at the moment, so I haven't had time to review your PR in depth yet. From what I could see, you haven't added any spec that tests for situations that it shouldn't match. Could you add a couple of those? Think about situations where a similar notation might be used, or the good notation with a typo, etc. This will help cover any regressions in the future.

The pattern you mentioned should go into language-html. Although I'm not sure why it is needed. Rather, it seems that it shouldn't be needed. Maybe the order in which the patterns are matched might also solve this issue. If it does, be sure to leave a comment describing so.
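
The kind of negative test requested above could look roughly like the following Python sketch. It does not use the package's actual spec framework, and `BEGIN` is a hypothetical stand-in for the grammar's begin pattern (two literal backslashes followed by `[` at the start of a line); the point is only to pair each positive case with near-misses.

```python
import re

# Hypothetical stand-in for the grammar's begin pattern; an assumption,
# not the pattern used in this PR.
BEGIN = re.compile(r'^\\\\\[')

SHOULD_MATCH = [r'\\[', r'\\[ a = b']
SHOULD_NOT_MATCH = [
    r'\[',            # only one backslash
    '[link](url)',    # ordinary Markdown link
    r'\\(',           # inline-math delimiter, not a block
    r'text \\[',      # delimiter not at the start of the line
]


def opens_math_block(line):
    """Return True if the line would start a math block under BEGIN."""
    return BEGIN.match(line) is not None


for line in SHOULD_MATCH + SHOULD_NOT_MATCH:
    print(repr(line), opens_math_block(line))
```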

@heavywatal
Author

Please take your time. I am happy with using my branch on Atom for the time being.

I guess the matching order is not the only problem here because the required number of backslashes is different: \\[ in Markdown, \[ in HTML. But I agree that this should be in language-html.

{
name: 'block.math.markup.md'
begin: '(\\${2})'
end: '(\\${2})(?:.*)'
Contributor

(?:.*) is incorrect. It should be just

end: '(\\${2})'

Pandoc, at least, allows Markdown to be on the same line as the closing $$. I.e.:

# testing

$$
a = \sum \frac 1 i
$$ *markdown on same line*

is converted into:
image
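
The difference between the two end patterns can be checked outside Atom. This is a rough Python sketch, assuming Python's `re` behaves like the grammar's regex engine for these simple patterns; the helper name `consumed` is hypothetical.

```python
import re

OLD_END = re.compile(r'(\${2})(?:.*)')  # pattern under review
NEW_END = re.compile(r'(\${2})')        # suggested replacement

closing_line = '$$ *markdown on same line*'


def consumed(pattern, line):
    """Return the part of the line swallowed by the end pattern."""
    match = pattern.match(line)
    return line[:match.end()]


print(repr(consumed(OLD_END, closing_line)))  # takes the whole line
print(repr(consumed(NEW_END, closing_line)))  # takes only the closing $$
```

With the trailing `(?:.*)` the end match absorbs the rest of the line, so the Markdown after the closing `$$` never gets tokenized as Markdown.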

Author

Thank you for the suggestion, and sorry for my late reaction. I have just fixed it accordingly.

1: name: 'punctuation.md'
endCaptures:
1: name: 'punctuation.md'
}
Contributor

This is the correct number of backslashes, four \ to represent a single literal \, and then two \ to escape the [.
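
The two layers of escaping can be illustrated in Python, assuming its non-raw string literals treat `\\` the same way the grammar file's quoted strings do. The variable names are hypothetical.

```python
import re

# Six backslashes typed in the source, written without a raw-string
# prefix so that the string-level escaping is visible.
pattern_source = '\\\\\\['

# After string unescaping, three backslashes and a bracket remain.
print(len(pattern_source))

# The regex engine then reads "\\" as one literal backslash and "\["
# as one literal bracket, so the pattern matches the two characters \[
pattern = re.compile(pattern_source)
print(pattern.fullmatch('\\[') is not None)
print(pattern.fullmatch('[') is not None)
```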

If it's not showing up correctly, it's the fault of language-latex and their use of $base. See #226, area/language-latex#185.

On my branch of language-latex with the changes mentioned in area/language-latex#185, I see:

image

@kylebarron
Contributor

I think the code looks fine but it depends on area/language-latex#185. Otherwise the latex highlighting will be off.

@heavywatal
Author

I may be a bit confused. Given that we need to write \\[ in Markdown to generate \[ in HTML, I am happy with what I get from this PR as shown in the following figure:
screen shot 2018-07-23 at 22 51 49
Are you trying to make \[ work in Markdown with PR 185?

@kylebarron
Contributor

Given that we need to write \\[ in Markdown to generate \[ in HTML

What flavor of Markdown are you using?

By default Pandoc supports only $...$ for inline math and $$...$$ for display math.

There are non-default Pandoc extensions for both \[ ... \] and \\[...\\].

Extension: tex_math_single_backslash

Causes anything between \( and \) to be interpreted as inline
TeX math, and anything between \[ and \] to be interpreted
as display TeX math. Note: a drawback of this extension is that
it precludes escaping ( and [.

Extension: tex_math_double_backslash

Causes anything between \\( and \\) to be interpreted as inline
TeX math, and anything between \\[ and \\] to be interpreted
as display TeX math.

I'm surprised to see that there's actually no definition for math in the Commonmark spec.

So which of the following should we color as math environments?

  • $...$
  • $$...$$
  • \(...\)
  • \[...\]
  • \\(...\\)
  • \\[...\\]

@heavywatal
Author

Oh, I didn't know that pandoc has such options. How does pandoc process equations? I am using Hugo/Blackfriday for HTML generation, and MathJax for equation rendering. Hence it is not surprising to me that CommonMark does not have math specs. It is MathJax's job. All Markdown has to do is generate proper HTML code to be interpreted by JavaScript. From that point of view, tex_math_single_backslash is inconsistent with CommonMark.

According to short-math-guide.pdf from AMS-Math (see below), \begin{equation} ... \end{equation} and \begin{equation*} ... \end{equation*} should also be interpreted as equations. Of course MathJax supports it. But I am not sure if it should start with single- or double-backslash in this case because either works well with Hugo.

In summary, the following patterns in Markdown should be colored IMO:

$...$
$$...$$
\\(...\\)
\\[...\\]
\begin{equation} ... \end{equation}
\begin{equation*} ... \end{equation*}

Using $$ ... $$ is not recommended in short-math-guide.pdf, and I don't use that. But some people might want it...?

screen shot 2018-07-24 at 00 17 46

@kylebarron
Contributor

How does pandoc process equations?

If converting from markdown to HTML, it converts to the appropriate HTML and uses either MathJax or KaTeX. If converting to PDF/LaTeX, it converts to the appropriate LaTeX commands and then optionally compiles the LaTeX to PDF.

I'm in agreement for

$...$
$$...$$
\\(...\\)
\\[...\\]

Supporting $$...$$ is a must because that's the standard Pandoc way to write display math.

I'm also in agreement for the begin-end clauses, but I'm curious why you suggest those. Does Blackfriday implement those at all? I support those because Pandoc supports Raw LaTeX input. However if we're going to support \begin{equation} ... \end{equation}, we need to also support other environments from AMSMath. For example, the align environment is used to vertically align several equations in a row, but an error is produced if you try to nest align within equation and the practice is not recommended.

For example, this raises an error with pandoc test.md -o test.pdf -s:

# testing

\begin{equation}
\begin{align}
a = \sum \frac 1 i
\end{align}
\end{equation}

So if we want to support \begin{equation} ... \end{equation}, as I believe we should, then we should support something like

{
  begin: '\\begin\\{((?:align|equation|multline|split|gather|alignat|aligned|gathered|eqnarray|array|tabular)\\*?)\\}'
  # \\1 back-references the environment name (with any trailing *) captured by begin
  end: '\\end\\{\\1\\}'
}

As an aside, my attitude about Commonmark and Math is this:

I strongly support inclusion of Pandoc-style inline math notation, with at least the status of “you don’t have to do anything with this, but you must still parse it correctly”. For instance, in

Let $y = m * x + b$ where $b$ is US$50,000

the asterisk in m*x should not be interpreted as an emphasis marker, and conversely, the dollar sign in US$50,000 should not be interpreted as a math shift.
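
The example above can be approximated with a regular expression. This sketch follows the rule in the Pandoc manual: the opening `$` must have a non-space character immediately to its right, and the closing `$` must have a non-space character immediately to its left and must not be followed by a digit. It is a rough approximation in Python, not Pandoc's parser, and the names are hypothetical.

```python
import re

# Opening $ followed by non-space; closing $ preceded by non-space
# and not followed by a digit.
INLINE_MATH = re.compile(r'\$(?=\S)(.+?)(?<=\S)\$(?!\d)')


def inline_math_spans(text):
    """Return the contents of each span treated as inline math."""
    return INLINE_MATH.findall(text)


sentence = 'Let $y = m * x + b$ where $b$ is US$50,000'
print(inline_math_spans(sentence))
```

The asterisks stay inside the first math span, and the dollar sign in US$50,000 opens nothing because no valid closing `$` follows it.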

@heavywatal
Author

Supporting $$...$$ is a must because that's the standard Pandoc way to write display math.

Got it.

Does Blackfriday implement those at all?

No. Blackfriday knows nothing about math equations, just like CommonMark. It just generates HTML code like this:

<p>\begin{equation}
\sum \frac i a
\end{equation}</p>

which is later processed by MathJax, KaTeX, or any JavaScript libraries. I totally agree with you and your quote: “you don’t have to do anything with this, but you must still parse it correctly”.

So if we want to support \begin{equation} ... \end{equation}, as I believe we should, then we should support something like

Agree. But I am not sure which environments should be included. For example, in the list you suggested, eqnarray is obsolete AFAIK. Maybe it can be a separate issue/PR. Shall we start small from just equation for now?

@kylebarron
Contributor

I didn't take the time to see which of those were still used.

It's probably easier to take a few minutes and include a good list in this PR.

@heavywatal
Author

OK, I have just added simple ones: align|equation|multline|split|gather. It is consistent with language-latex.

I excluded alignat|aligned|gathered because they seem to be complicated and require one more argument like \begin{alignat}{2}.
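
As a sanity check on that list, here is a minimal Python sketch of begin/end matching with a back-reference, so that `\end{...}` must name the same environment (including any `*`) as `\begin{...}`. It assumes Python's `re` as a stand-in for the grammar's regex engine, and the names `MATH_ENV` and `find_math_environments` are hypothetical.

```python
import re

ENVIRONMENTS = ('align', 'equation', 'multline', 'split', 'gather')

# Group 1 is the environment name, group 2 the optional star (possibly
# empty), group 3 the body; \1\2 forces the same name at the end.
MATH_ENV = re.compile(
    r'\\begin\{(' + '|'.join(ENVIRONMENTS) + r')(\*?)\}'
    r'(.*?)'
    r'\\end\{\1\2\}',
    re.DOTALL,
)


def find_math_environments(text):
    """Return (environment, body) pairs for each matched math block."""
    return [
        (m.group(1) + m.group(2), m.group(3).strip())
        for m in MATH_ENV.finditer(text)
    ]


sample = '\\begin{equation*}\n  a = \\sum \\frac 1 i\n\\end{equation*}'
print(find_math_environments(sample))
```

Environments that take an extra argument, such as `\begin{alignat}{2}`, are not matched by this sketch, in line with the exclusion above.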

@kylebarron
Contributor

Alternatively, similar to how we include all rules from HTML, we could include all rules from LaTeX after the Markdown rules. But I'm guessing that would have more side effects.

@heavywatal
Author

I like the idea, and it seems to be a powerful and beautiful solution, but it is far beyond my ability and resources.

@kylebarron
Contributor

No, I mean that instead of writing them all ourselves, as long as the user has language-latex installed, we can just write

{include: 'text.tex.latex'}

And all of the latex highlighting rules get applied after the Markdown ones (depending on where that line is placed)

@heavywatal
Author

Yes, I think I see your point. What I mean is that working through the side effects of including the whole language-latex grammar would be much harder than maintaining the math-block grammar ourselves. LaTeX is not just about math equations, but here we should limit the scope to math blocks.

@heavywatal heavywatal closed this by deleting the head repository Sep 27, 2023