fix: render HTML <table> blocks and clean up sidebar snippet (#452)#476
fix: render HTML <table> blocks and clean up sidebar snippet (#452)#476rit3sh-x wants to merge 34 commits intorefactoringhq:mainfrom
Conversation
Convert raw HTML `<table>` blocks to GFM markdown tables in the preprocessing pipeline before BlockNote parses, so notes authored with embedded HTML tables (Obsidian / VSCode-compatible) render in the rich editor. - Add `preProcessHtmlTables` (DOMParser-based, fence-aware). - Wire it into `preProcessEditorMarkdown` ahead of mermaid / images / wikilinks / math. - 22 unit tests cover (header synthesis, pipe escape, malformed-table fallback, fenced-code respect). - BlockNote round-trip test asserts `headerRows: 1` after parse. - Playwright smoke opens the issue refactoringhq#452 minimal repro and asserts the table renders. Refs refactoringhq#452
Pass `tables: { headers: true }` to `useCreateBlockNote` so users
can toggle row/column headers via right-click. Override the
default Table slash item so /table already inserts a 3x2 table
with `headerRows: 1`, removing the "go to raw, fill the empty
header values manually" detour.
Refs refactoringhq#452
Sidebar snippets used to leak raw `| col | --- |` pipe syntax (and `<table>...<tr>...` HTML) when a note contained a table. Replace every table block with a single marker built from the header row, capped at 80 chars and separated from surrounding prose by a newline. The same walker now lives in one shared module so the on-disk parse, the vite dev middleware, and the on-save TypeScript path all stay in sync, with a Rust mirror in `src-tauri/src/vault/parsing.rs`. `NoteItem` adds `white-space: pre-line` so the two-line clamp honors the marker line break. Tests cover GFM pipe + HTML tables, `"---:` alignment separators, 80-char truncation, CRLF input, and a regression test for the on-save path.
`cargo test` failed to compile on `x86_64-pc-windows-msvc` because two test modules pulled in `std::os::unix::process::ExitStatusExt` unconditionally. Gate the Unix imports on `cfg(unix)` and add the matching Windows variant (`ExitStatus::from_raw(code as u32)`) so the test fixtures compile on both targets.
Not up to standards ⛔🔴 Issues
|
| Category | Results |
|---|---|
| BestPractice | 1 medium |
| ErrorProne | 1 high |
| Security | 58 high |
| Complexity | 3 medium |
🟢 Metrics 193 complexity · 3 duplication
Metric Results Complexity 193 Duplication 3
AI Reviewer: first review requested successfully. AI can make mistakes. Always validate suggestions.
TIP This summary will be updated as you push new changes.
There was a problem hiding this comment.
Pull Request Overview
The PR introduces HTML table rendering and improved sidebar snippets, but the implementation is currently not up to standards due to high complexity and potential reliability issues. Specifically, src-tauri/src/vault/parsing.rs and src/components/Editor.tsx show massive jumps in complexity without corresponding test coverage. The manual UTF-8 and HTML tag parsing in the Rust backend is brittle and likely to fail on multi-line tags or multi-byte characters. While the TypeScript walker logic was consolidated, the core snippet logic remains duplicated across both languages, increasing the maintenance burden for future updates to the table marker system.
About this PR
- The complex logic for parsing and formatting table snippets ('📊' marker) is duplicated across the Rust backend (src-tauri) and the TypeScript frontend (src/lib). This creates two points of failure for snippet consistency and doubles the effort required for any future formatting changes.
Test suggestions
- Found: HTML to GFM pipe table conversion logic, including escaping of pipe characters and conversion of newlines to
. - Found: BlockNote editor parsing of preprocessed tables to verify they are identified as native table blocks with headerRows: 1.
- Found: Slash menu override verifying that clicking the Table item inserts a block with the correct default rows and header settings.
- Found: Snippet extraction logic (both Rust and TS) correctly identifying and formatting the '📊' table marker.
- Found: Playwright smoke test verifying that a note with an embedded HTML table renders as an interactive table in the UI.
- Missing: Unit tests for edge cases like nested tags, multi-line headers, and varying line endings in the Rust snippet extraction logic.
- The TypeScript snippet logic uses Regex for HTML processing, whereas the editor uses DOMParser. This discrepancy in parsing strategies may lead to subtle inconsistencies where a note's preview in the sidebar does not match how it is rendered when opened in the editor.
Prompt proposal for missing tests
Consider implementing these tests if applicable:
1. Missing: Unit tests for edge cases like nested tags, multi-line headers, and varying line endings in the Rust snippet extraction logic.
Low confidence findings
TIP Improve review quality by adding custom instructions
TIP How was this review? Give us feedback
| /// Walk the body line-by-line. Drop heading/code-fence/rule/empty lines. Replace any | ||
| /// table block (HTML or GFM pipe) with a `📊 col1 · col2 · …` marker built from the | ||
| /// header row (≤ 80 chars). Preserve surrounding prose verbatim. | ||
| fn collapse_table_blocks_in_body(body: &str) -> String { |
There was a problem hiding this comment.
🔴 HIGH RISK
The snippet extraction logic in collapse_table_blocks_in_body has significantly increased the cyclomatic complexity of this module. Given the manual string-slicing and regex-based parsing approach, this logic is highly prone to edge-case bugs. Ensure this implementation is backed by unit tests covering nested tags, multi-line headers, and varying line endings.
| pipeTableHeaderEmitted: boolean | ||
| } | ||
|
|
||
| function isPipeTableRow(line: string): boolean { |
There was a problem hiding this comment.
🟡 MEDIUM RISK
The logic for tracking nested table states (HTML and GFM pipe syntax) within the snippet walker is becoming quite dense. To improve maintainability, consider refactoring the state management for each table type into separate handler functions or a strategy pattern to isolate buffer management and marker formatting.
| fn strip_html_tags(s: &str) -> String { | ||
| let mut result = String::with_capacity(s.len()); | ||
| let bytes = s.as_bytes(); | ||
| let mut i = 0; | ||
| while i < bytes.len() { | ||
| if bytes[i] == b'<' { | ||
| if let Some(tag_len) = match_html_tag_len(bytes, i) { | ||
| i += tag_len; | ||
| continue; | ||
| } | ||
| } | ||
| let ch_len = utf8_char_len(bytes[i]); | ||
| result.push_str(&s[i..i + ch_len]); | ||
| i += ch_len; | ||
| } | ||
| collapse_whitespace(&result) | ||
| } |
There was a problem hiding this comment.
🟡 MEDIUM RISK
Suggestion: The manual byte-based UTF-8 iteration and custom utf8_char_len logic are brittle and unidiomatic. Consider using s.chars() or s.char_indices() to safely handle multi-byte characters and prevent potential panics during string slicing.
|
@LucaRonin Codacy Static Code Analysis what should I do about this? This was the method I thought to test the functionality. |
Summary
Closes #452.
This PR addresses three related issues that surfaced while implementing the
fix for embedded HTML
<table>rendering:<table>blocks (Obsidian / VSCode style) now render as nativeBlockNote tables when the rich editor opens them. The conversion
happens in
preProcessEditorMarkdownbefore BlockNote parses, sothe on-disk content is unchanged in raw mode and only re-serializes
to GFM if the user actually edits the table in rich mode (matching
the maintainer's suggestion in the issue thread).
/tableinsert. Previously,/tablecreated a table with
headerRows: 0, so the visible "header" cellswere not editable until the user toggled them via raw mode.
tables.headersis now enabled, and the default slash item isoverridden to insert with
headerRows: 1directly.up in the note-list snippet as
?? col1 ? col2 ? ?(header preview,capped at 80 chars), separated from surrounding prose by a newline.
The snippet
<div>useswhite-space: pre-lineso the two-lineclamp shows the table on its own line.
Why this approach
touch the on-disk file. Round-tripping (raw -> rich -> raw) is lossy
by design, matching the maintainer's plan in the issue thread.
dev middleware, and an on-save TS path). The PR consolidates the
walker into a single shared module (
src/lib/snippetMarkdownBody.ts)so future changes only need to update Rust + that module.
Test plan
npx tsc --noEmitpnpm lintpnpm test- 3000+ frontend tests pass; new tests:src/utils/htmlTableMarkdown.test.ts(22), the BlockNote round-tripin
src/components/editorSchema.htmlTable.test.ts(3), the slashoverride in
src/components/tolariaEditorFormattingConfig.test.ts(2),
src/lib/snippetMarkdownBody.test.ts(12), and 3 added casesin
src/utils/wikilinks.test.ts.cargo test --manifest-path src-tauri/Cargo.toml --lib vault::parsingnpx playwright test tests/smoke/html-table-rendering.spec.tspasses with the issue Bug: HTML tables embedded in Markdown are not rendered #452 minimal repro.
AGENTS.md1c - I do not have a macOSdevice to run the focus + screenshot scripts. Happy to coordinate
with a maintainer or with whoever has access.
CODESCENE_PAT. The pre-push hook should run on the maintainer sidebefore merge.
Notes for the reviewer
src/utils/wikilinks.ts::extractSnippetis now a thin wrapper aroundthe shared walker + the existing
stripMarkdownChars. The previousduplicated table-unaware logic was the source of the
"edit-time regression" where the sidebar snippet would revert to raw
pipes after every save.
src-tauri/src/commands/system.rsandsrc-tauri/src/git/clone.rsis unrelated to issue Bug: HTML tables embedded in Markdown are not rendered #452 in spiritbut was required to run
cargo teston a Windows dev box. Happy tosplit it into a separate PR if preferred.