Skip to content

fix: render HTML <table> blocks and clean up sidebar snippet (#452)#476

Open
rit3sh-x wants to merge 34 commits intorefactoringhq:mainfrom
rit3sh-x:fix/html-table-rendering
Open

fix: render HTML <table> blocks and clean up sidebar snippet (#452)#476
rit3sh-x wants to merge 34 commits intorefactoringhq:mainfrom
rit3sh-x:fix/html-table-rendering

Conversation

@rit3sh-x
Copy link
Copy Markdown

@rit3sh-x rit3sh-x commented May 1, 2026

Summary

Closes #452.

This PR addresses three related issues that surfaced while implementing the
fix for embedded HTML <table> rendering:

  1. Render HTML tables (the original bug). Notes containing raw
    <table> blocks (Obsidian / VSCode style) now render as native
    BlockNote tables when the rich editor opens them. The conversion
    happens in preProcessEditorMarkdown before BlockNote parses, so
    the on-disk content is unchanged in raw mode and only re-serializes
    to GFM if the user actually edits the table in rich mode (matching
    the maintainer's suggestion in the issue thread).
  2. Editable header row on /table insert. Previously, /table
    created a table with headerRows: 0, so the visible "header" cells
    were not editable until the user toggled them via raw mode.
    tables.headers is now enabled, and the default slash item is
    overridden to insert with headerRows: 1 directly.
  3. Sidebar snippet readability. Tables (HTML or GFM pipe) now show
    up in the note-list snippet as ?? col1 ? col2 ? ? (header preview,
    capped at 80 chars), separated from surrounding prose by a newline.
    The snippet <div> uses white-space: pre-line so the two-line
    clamp shows the table on its own line.

Why this approach

  • The HTML->GFM preprocessor stays in the editor swap path - it does not
    touch the on-disk file. Round-tripping (raw -> rich -> raw) is lossy
    by design, matching the maintainer's plan in the issue thread.
  • The snippet logic ended up living in three places (Rust runtime, vite
    dev middleware, and an on-save TS path). The PR consolidates the
    walker into a single shared module (src/lib/snippetMarkdownBody.ts)
    so future changes only need to update Rust + that module.

Test plan

  • npx tsc --noEmit
  • pnpm lint
  • pnpm test - 3000+ frontend tests pass; new tests:
    src/utils/htmlTableMarkdown.test.ts (22), the BlockNote round-trip
    in src/components/editorSchema.htmlTable.test.ts (3), the slash
    override in src/components/tolariaEditorFormattingConfig.test.ts
    (2), src/lib/snippetMarkdownBody.test.ts (12), and 3 added cases
    in src/utils/wikilinks.test.ts.
  • cargo test --manifest-path src-tauri/Cargo.toml --lib vault::parsing
    • 89 pass, including 4 new HTML / GFM table snippet tests.
  • npx playwright test tests/smoke/html-table-rendering.spec.ts
    passes with the issue Bug: HTML tables embedded in Markdown are not rendered #452 minimal repro.
  • Native macOS QA per AGENTS.md 1c - I do not have a macOS
    device to run the focus + screenshot scripts. Happy to coordinate
    with a maintainer or with whoever has access.
  • CodeScene gate - I do not have access to the project's
    CODESCENE_PAT. The pre-push hook should run on the maintainer side
    before merge.

Notes for the reviewer

  • src/utils/wikilinks.ts::extractSnippet is now a thin wrapper around
    the shared walker + the existing stripMarkdownChars. The previous
    duplicated table-unaware logic was the source of the
    "edit-time regression" where the sidebar snippet would revert to raw
    pipes after every save.
  • The Windows compile fix in src-tauri/src/commands/system.rs and
    src-tauri/src/git/clone.rs is unrelated to issue Bug: HTML tables embedded in Markdown are not rendered #452 in spirit
    but was required to run cargo test on a Windows dev box. Happy to
    split it into a separate PR if preferred.

rit3sh-x and others added 12 commits May 1, 2026 17:17
Convert raw HTML `<table>` blocks to GFM markdown tables in the
preprocessing pipeline before BlockNote parses, so notes authored
with embedded HTML tables (Obsidian / VSCode-compatible) render
in the rich editor.

- Add `preProcessHtmlTables` (DOMParser-based, fence-aware).
- Wire it into `preProcessEditorMarkdown` ahead of mermaid / images / wikilinks / math.
- 22 unit tests cover (header synthesis, pipe escape, malformed-table fallback, fenced-code respect).
- BlockNote round-trip test asserts `headerRows: 1` after parse.
- Playwright smoke opens the issue refactoringhq#452 minimal repro and asserts
  the table renders.

Refs refactoringhq#452
Pass `tables: { headers: true }` to `useCreateBlockNote` so users
can toggle row/column headers via right-click. Override the
default Table slash item so /table already inserts a 3x2 table
with `headerRows: 1`, removing the "go to raw, fill the empty
header values manually" detour.

Refs refactoringhq#452
Sidebar snippets used to leak raw `| col | --- |` pipe syntax
(and `<table>...<tr>...` HTML) when a note contained a table.
Replace every table block with a single marker built from the
header row, capped at 80 chars and separated from surrounding
prose by a newline.

The same walker now lives in one shared module so the on-disk
parse, the vite dev middleware, and the on-save TypeScript path
all stay in sync, with a Rust mirror in
`src-tauri/src/vault/parsing.rs`. `NoteItem` adds
`white-space: pre-line` so the two-line clamp honors the marker
line break.

Tests cover GFM pipe + HTML tables, `"---:` alignment separators,
80-char truncation, CRLF input, and a regression test for the
on-save path.
`cargo test` failed to compile on `x86_64-pc-windows-msvc` because
two test modules pulled in `std::os::unix::process::ExitStatusExt`
unconditionally. Gate the Unix imports on `cfg(unix)` and add the
matching Windows variant (`ExitStatus::from_raw(code as u32)`) so
the test fixtures compile on both targets.
@codacy-production
Copy link
Copy Markdown

codacy-production Bot commented May 3, 2026

Not up to standards ⛔

🔴 Issues 59 high · 4 medium

Alerts:
⚠ 63 issues (≤ 0 issues of at least minor severity)

Results:
63 new issues

Category Results
BestPractice 1 medium
ErrorProne 1 high
Security 58 high
Complexity 3 medium

View in Codacy

🟢 Metrics 193 complexity · 3 duplication

Metric Results
Complexity 193
Duplication 3

View in Codacy

AI Reviewer: first review requested successfully. AI can make mistakes. Always validate suggestions.

Run reviewer

TIP This summary will be updated as you push new changes.

Copy link
Copy Markdown

@codacy-production codacy-production Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

The PR introduces HTML table rendering and improved sidebar snippets, but the implementation is currently not up to standards due to high complexity and potential reliability issues. Specifically, src-tauri/src/vault/parsing.rs and src/components/Editor.tsx show massive jumps in complexity without corresponding test coverage. The manual UTF-8 and HTML tag parsing in the Rust backend is brittle and likely to fail on multi-line tags or multi-byte characters. While the TypeScript walker logic was consolidated, the core snippet logic remains duplicated across both languages, increasing the maintenance burden for future updates to the table marker system.

About this PR

  • The complex logic for parsing and formatting table snippets ('📊' marker) is duplicated across the Rust backend (src-tauri) and the TypeScript frontend (src/lib). This creates two points of failure for snippet consistency and doubles the effort required for any future formatting changes.

Test suggestions

  • Found: HTML to GFM pipe table conversion logic, including escaping of pipe characters and conversion of newlines to
    .
  • Found: BlockNote editor parsing of preprocessed tables to verify they are identified as native table blocks with headerRows: 1.
  • Found: Slash menu override verifying that clicking the Table item inserts a block with the correct default rows and header settings.
  • Found: Snippet extraction logic (both Rust and TS) correctly identifying and formatting the '📊' table marker.
  • Found: Playwright smoke test verifying that a note with an embedded HTML table renders as an interactive table in the UI.
  • Missing: Unit tests for edge cases like nested tags, multi-line headers, and varying line endings in the Rust snippet extraction logic.
  • Prompt proposal for missing tests
    Consider implementing these tests if applicable:
    1. Missing: Unit tests for edge cases like nested tags, multi-line headers, and varying line endings in the Rust snippet extraction logic.
    
    Low confidence findings
    • The TypeScript snippet logic uses Regex for HTML processing, whereas the editor uses DOMParser. This discrepancy in parsing strategies may lead to subtle inconsistencies where a note's preview in the sidebar does not match how it is rendered when opened in the editor.

    TIP Improve review quality by adding custom instructions
    TIP How was this review? Give us feedback

/// Walk the body line-by-line. Drop heading/code-fence/rule/empty lines. Replace any
/// table block (HTML or GFM pipe) with a `📊 col1 · col2 · …` marker built from the
/// header row (≤ 80 chars). Preserve surrounding prose verbatim.
fn collapse_table_blocks_in_body(body: &str) -> String {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 HIGH RISK

The snippet extraction logic in collapse_table_blocks_in_body has significantly increased the cyclomatic complexity of this module. Given the manual string-slicing and regex-based parsing approach, this logic is highly prone to edge-case bugs. Ensure this implementation is backed by unit tests covering nested tags, multi-line headers, and varying line endings.

See Complexity in Codacy

pipeTableHeaderEmitted: boolean
}

function isPipeTableRow(line: string): boolean {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 MEDIUM RISK

The logic for tracking nested table states (HTML and GFM pipe syntax) within the snippet walker is becoming quite dense. To improve maintainability, consider refactoring the state management for each table type into separate handler functions or a strategy pattern to isolate buffer management and marker formatting.

See Issue in Codacy

Comment thread src-tauri/src/vault/parsing.rs
Comment on lines +493 to +509
fn strip_html_tags(s: &str) -> String {
let mut result = String::with_capacity(s.len());
let bytes = s.as_bytes();
let mut i = 0;
while i < bytes.len() {
if bytes[i] == b'<' {
if let Some(tag_len) = match_html_tag_len(bytes, i) {
i += tag_len;
continue;
}
}
let ch_len = utf8_char_len(bytes[i]);
result.push_str(&s[i..i + ch_len]);
i += ch_len;
}
collapse_whitespace(&result)
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 MEDIUM RISK

Suggestion: The manual byte-based UTF-8 iteration and custom utf8_char_len logic are brittle and unidiomatic. Consider using s.chars() or s.char_indices() to safely handle multi-byte characters and prevent potential panics during string slicing.

@rit3sh-x
Copy link
Copy Markdown
Author

rit3sh-x commented May 3, 2026

@LucaRonin Codacy Static Code Analysis what should I do about this? This was the method I thought to test the functionality.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bug: HTML tables embedded in Markdown are not rendered

1 participant