-
Notifications
You must be signed in to change notification settings - Fork 16
fix(stripComments): dont escape special markdown chars *, _, # #1267
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: next
Are you sure you want to change the base?
fix(stripComments): dont escape special markdown chars *, _, # #1267
Conversation
Prevents `*`, `_` and `#` chars from getting escaped in the md syntax to avoid something like `**bold **` becoming `\*\*bold \*\*`. Not entirely certain this is enough to solve the problem of MD content with trailing spaces between the closing `**` from rendering, but added some tests to verify that our strip comments transformer should at least leave these alone. Is there any harm in never escaping these?
lib/stripComments.ts
Outdated
| handlers: { | ||
| text(node, _, state, info) { | ||
| // Don't escape special markdown characters like #, *, or _. | ||
| if (/[#*_]/.test(node.value)) return node.value; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tbh, i'm not entirely certain if this is the correct way to be solving this but am not sure how to think of all the cases in which this may regress something.
but just wanted to throw up something as a starting point to iterate on.
i didn't get to see this make any noticeable impact on the readme app after linking, but i was having trouble getting anything in my MD repo to show up.
need to look into it a bit more and pair on it with someone, b/c i couldn't figure out why i wasn't seeing any console logs come thru.
|
i added some better tests to help point out the hard problem here 2ab0f3d split them up into two tests:
the biggest problem here is that any invalid markdown that doesn't fit the CommonMark spec gets converted into a for example, here's a simple md input along with the mdast see how the first two lines with invalid MD get converted into a single text node instead of being split up. also notice how the backslashed if we then try to evaluate these text nodes to either return its raw value vs a "safe" (i.e. escaped) value, how can we without being able to discern whether the the same problem exists for leading/trailing bold/italic markers. turns into this mdast set of text nodes: how do you take a single text node like it feels like we might be going about this the wrong way. i think we need to write a remark plugin, much like the not sure if we have anything like this already, but this feels like the right approach instead of the processing we're doing in our |
🧰 Changes
Prevents
*,_and#chars from getting escaped in the md syntax toavoid something like
**bold **becoming\*\*bold \*\*.Cases:
**word**should be word**word **should be word\*word\*should be *word*_word_should be word_word _should be word\_word\_should be _word_Unfortunately, whitelisting the
*or_chars introduces another issue where we sometimes do want to escape them, like when following a backslash, e.g.\*or\#.So we need a way to match
*word*but not\*word\*.✅ What's working
❌ What's not?
When a literal asterisk is desired in the MD with
\*here\*, the escaped asterisk is getting removed and being rendered as*here*bold.🧬 QA & Testing