feat: Add HTML representation#2236

Open
katosh wants to merge 190 commits into scverse:main from settylab:html_rep

@katosh
Contributor

@katosh katosh commented Nov 29, 2025

Rich HTML representation for AnnData

Summary

Implements rich HTML representation (_repr_html_) for AnnData objects in Jupyter notebooks. Builds on previous draft PRs (#784, #694, #521, #346) with a complete, production-ready implementation.

Live Demo | Reviewer's Guide (technical details, design decisions, extensibility examples)

Screenshot

[Image: screenshot2]

Features

Interactive Display

  • Foldable sections with auto-collapse for large datasets
  • Search/filter with regex and case-sensitive toggles
  • Copy-to-clipboard for field names
  • Nested AnnData expansion with configurable depth
  • .raw section showing unprocessed data (Report n_vars of .raw in __repr__ #349)

Visual Indicators

  • Category colors from uns palettes (e.g., cell_type_colors)
  • Type badges for views, backed mode, sparse matrices, Dask arrays
  • Serialization warnings for data that won't write to H5AD/Zarr
  • Value previews for simple uns values
  • README support via modal (renders markdown from uns["README"])
  • Memory info in footer

Serialization Warnings

Proactively warns about data that won't serialize:

| Level | Issue | Related |
|---|---|---|
| 🔴 Error | datetime64/timedelta64 in obs/var | #455, #2238 |
| 🔴 Error | Non-string keys | #321 |
| 🔴 Error | Object columns with dicts/lists/custom objects | #1923, #567, #636 |
| 🔴 Error | Non-serializable types in uns | |
| 🟡 Warning | Keys with / (deprecated) | #1447, #2099 |
| 🟡 Warning | String→categorical auto-conversion | #534, #926 |
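
A minimal sketch of the two key-level checks in the table (non-string keys as an error, `/` in keys as a deprecation warning). `check_key` is a hypothetical helper written for illustration; it is not the PR's `_validate_key_and_collect_warnings()`, and the message strings are assumptions.

```python
from __future__ import annotations


def check_key(key: object) -> tuple[str, str] | None:
    """Return (level, message) for a problematic key, or None if the key is fine.

    Hypothetical helper: only covers the two key-related rows of the table.
    """
    if not isinstance(key, str):
        # Non-string keys cannot be written to H5AD/Zarr.
        return ("error", f"Non-string key of type {type(key).__name__}")
    if "/" in key:
        # Slashes still work but are deprecated.
        return ("warning", "Key contains '/' (deprecated)")
    return None
```

The value-level checks (datetime64 columns, object columns, arbitrary `uns` types) need pandas/NumPy inspection and are left out of this sketch.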

Compatibility

  • Dark mode auto-detection (Jupyter Lab/VS Code, Furo/sphinx-book-theme)
  • No-JS fallback with graceful degradation
  • JupyterLab safe - CSS scoped to .anndata-repr prevents style conflicts
  • Lazy-loading safe - configurable partial loading for read_lazy() (categories, colors)
  • Zero dependencies added

Extensibility

Three extension mechanisms for ecosystem packages (MuData, SpatialData, TreeData):

  1. TypeFormatter - Custom visualization for value types
  2. SectionFormatter - Add new sections (e.g., obst/vart, mod)
  3. Building blocks - CSS/JS/helpers for packages needing full control

See the Reviewer's Guide for examples and API documentation.

Testing

  • 601 unit tests organized by responsibility (core, sections, formatters, UI, warnings, registry, lazy, robustness, Jupyter compatibility)
  • 108 escaping/robustness tests: escaping checks at every user-data insertion point, plus broken objects, size bombs, and threading
  • HTMLValidator for structured HTML assertions (section-aware, no external dependencies)
  • 26 visual test scenarios: python tests/visual_inspect_repr_html.py

Related

Acknowledgments

Thanks to @selmanozleyen (#784), @gtca (#694), @VolkerH (#521), @ivirshup (#346, #675), and @Zethson (#675) for prior work and discussions.


Technical Notes and Edits

Lazy Loading

Constants are in _repr_constants.py (outside _repr/) to prevent loading ~6K lines on import anndata. The full module loads only when _repr_html_() is called.

Config Changes

pyproject.toml: Added vart to codespell ignore list (TreeData section name).


Edit (Dec 27, 2024)

To simplify review and reduce the diff, I've merged settylab/anndata#3 into this PR. That PR was originally created as a follow-up to explore additional features based on the discussion with @Zethson about SpatialData/MuData extensibility.

What changed:

  • Exported building blocks - CSS, JavaScript, and rendering helpers for external packages to build custom reprs while reusing anndata's styling
  • .raw section - Expandable row showing unprocessed data (Report n_vars of .raw in __repr__ #349)
  • Enhanced serialization warnings - Extended to cover datetime64, non-string keys, slashes in keys, and all sections
  • Regex search - Case-sensitive and regex toggles for filtering
  • Robust error handling - Failed sections show visible error indicators instead of being silently hidden

Edit (Jan 4, 2025)

Moved detailed implementation documentation (architecture, design decisions, extensibility examples, configuration reference) to the Reviewer's Guide to keep this PR description focused on features.

Code refactoring:

  • Split html.py into focused modules for maintainability
  • UI components extracted to components.py (badges, buttons, icons)
  • Section renderers moved to sections.py (obs/var, mapping, uns, raw)
  • Shared rendering primitives extracted to core.py (avoids circular imports)
  • Preview utilities moved to utils.py
  • FormatterContext consolidates all 6 rendering settings (read once at entry, propagated via context)
  • Result: html.py reduced from ~2100 to ~740 lines, clean import hierarchy

New features:

  • "Lazy" badge for read_lazy() AnnData objects (experimental) - indicates when obs/var are xarray-backed
  • Visual test for lazy AnnData (9b) - demonstrates lazy loading with (lazy) indicator on columns

Bug fixes:

  • Consistent meta column styling - all meta column text now uses adata-text-muted class for uniform appearance
  • Bytes index decoding - properly decode bytes values in index previews

Related issue discovered:

  • read_lazy() returns index values as byte-representation strings (e.g., "b'cell_0'" instead of "cell_0") - see ISSUE_READ_LAZY_INDEX.md

Edit (Jan 6, 2025)

Smart partial loading for read_lazy() AnnData:

Previously, lazy AnnData showed no category previews to avoid disk I/O. Now we do minimal, configurable loading to get richer visualization cheaply: only the first N category labels and their colors are read from storage (not the full column data). New setting repr_html_max_lazy_categories (default: 100, set to 0 for metadata-only mode).

Visual tests reorganized: 8 (Dask), 8b (lazy categories), 8c (metadata-only), 9 (backed).


Edit (Jan 6, 2025 - continued)

FormattedOutput API and architecture:

Clean separation between formatters and renderers - formatters inspect data and produce complete FormattedOutput, renderers only receive FormattedOutput (never the original data).

The FormattedOutput dataclass fields were renamed to be self-documenting:

| Old Field | New Field | Purpose |
|---|---|---|
| meta_content | preview (text) or preview_html (HTML) | Preview column content |
| html_content + is_expandable=True | expanded_html | Collapsible content below row |
| html_content + is_expandable=False | preview_html | Inline preview in preview column |
| is_expandable | Removed | Use expanded_html is not None |
| (new) | type_html | Custom HTML for type column (replaces type_name visually) |

Naming convention: the *_html suffix indicates raw HTML (the caller is responsible for escaping); plain-text fields are auto-escaped.
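
The convention can be sketched as follows. `Output` is a hypothetical two-field stand-in for `FormattedOutput`, and the rule that the `*_html` field wins when both are set is an assumption made for this example, not something taken from the PR.

```python
from __future__ import annotations

import html
from dataclasses import dataclass


@dataclass
class Output:
    """Hypothetical stand-in for FormattedOutput with only two fields."""

    preview: str | None = None       # plain text: the renderer escapes it
    preview_html: str | None = None  # raw HTML: the producer must escape user data


def render_preview(out: Output) -> str:
    # Assumption: the raw-HTML field takes precedence when both are set.
    if out.preview_html is not None:
        return out.preview_html
    if out.preview is not None:
        return html.escape(out.preview)
    return ""
```

The suffix tells a formatter author at a glance which fields are safe to fill with untrusted strings.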

UI/UX improvements:

  • Zebra striping for section rows (alternating background colors)
  • Expand buttons now use / arrows instead of / for consistency
  • No borders between entries within sections (cleaner look)
  • Fixed button alignment - Expand and wrap buttons now align properly
  • Category list styling - explicit muted color ensures consistent appearance in nested contexts

Edit (Jan 7, 2025)

Test architecture overhaul:

Tests reorganized from a single file into 10 focused modules for maintainability and parallel execution:

| File | Focus |
|---|---|
| test_repr_core.py | HTML validation, settings, badges |
| test_repr_sections.py | Section rendering (obs, var, uns, etc.) |
| test_repr_formatters.py | Type-specific formatters |
| test_repr_ui.py | Folding, colors, search, clipboard |
| test_repr_warnings.py | Serialization warnings |
| test_repr_registry.py | Plugin registry |
| test_repr_lazy.py | Lazy AnnData support |
| test_html_validator.py | HTMLValidator tests + Jupyter compatibility |

HTMLValidator class (conftest.py) provides structured HTML assertions:

v = validate_html(html)
v.assert_section_exists("obs")
v.assert_section_contains_entry("obs", "batch")
v.assert_section_initially_collapsed("obs")  # or _not_initially_collapsed

Key features: regex-based (no dependencies), section-aware matching, exact attribute matching to avoid "obs" matching "obsm".
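
A minimal sketch of the exact-attribute idea, assuming sections are marked with a `data-section="..."` attribute (the attribute name is an assumption, and `section_exists` is a hypothetical function, not part of HTMLValidator). Anchoring the pattern on the closing quote keeps "obs" from matching "obsm".

```python
import re


def section_exists(html_text: str, name: str) -> bool:
    """True if some tag carries data-section="<name>" exactly.

    The closing quote in the pattern enforces an exact match, so
    searching for "obs" does not hit data-section="obsm".
    """
    pattern = rf'data-section="{re.escape(name)}"'
    return re.search(pattern, html_text) is not None
```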

Optional strict validation when dependencies available:

  • validate_html5() - W3C HTML5 + ARIA (requires vnu)
  • validate_js() - JavaScript syntax (requires esprima)

Jupyter Notebook/Lab compatibility tests (13 new tests in TestJupyterNotebookCompatibility):

Validates CSS scoping, JavaScript isolation, unique IDs across multiple cells, and Jupyter dark mode support.

Bug fix: readme-modal-title ID is now unique per container to prevent ID collisions when multiple AnnData objects are displayed in the same notebook.


Edit (Jan 8, 2025)

Maintainability improvements:

| Fix | Description |
|---|---|
| Entry rendering | Consolidated _render_entry_row and render_formatted_entry to eliminate duplication |
| Debug logging | Added get_formatter_for() and list_formatters() methods to FormatterRegistry |
| Import hierarchy | Documented module dependency tree at top of __init__.py |
| Static assets | Moved CSS (~1060 lines), JS (~380 lines), markdown parser (~150 lines) to static/ directory |
| FormattedOutput docs | Enhanced field documentation with precedence rules and CSS class reference |
| HTMLValidator | Moved to separate tests/repr/html_validator.py module (conftest.py: 960→270 lines) |
| Magic strings | Extracted CSS classes and section names to _repr_constants.py |
| TypeCellConfig | Added dataclass to simplify render_entry_type_cell() signature |
| Lazy module | Consolidated lazy loading utilities to new lazy.py module |
| CSS colors | Moved 148 CSS color names to static/css_colors.txt for easy updates |

File structure changes:

src/anndata/_repr/
├── static/                  # NEW: Static assets directory
│   ├── __init__.py
│   ├── repr.css             # CSS template (~1060 lines)
│   ├── repr.js              # JavaScript (~380 lines)
│   ├── markdown-parser.js   # Markdown parser (~150 lines)
│   └── css_colors.txt       # CSS named colors (148 colors)
├── lazy.py                  # NEW: Lazy loading utilities
└── ...

API simplifications:

  • render_entry_type_cell() now accepts TypeCellConfig dataclass instead of 10 individual parameters
  • Lazy utilities consolidated: is_lazy_adata(), is_lazy_column(), get_lazy_categories(), get_lazy_categorical_info()
  • Static assets loaded via importlib.resources.files() (Python 3.9+)

Edit (Jan 9, 2025)

Robustness & escaping coverage testing:

Added 108 tests in test_repr_robustness.py across 14 test classes:

  • Escaping coverage (12 tests): verifies html.escape() is called at every user-data insertion point using a <b>MARKER</b> probe
  • Unicode edge cases (emoji, CJK, RTL override, zero-width chars)
  • Broken objects (crashing __repr__, __len__, __sizeof__, properties)
  • Size handling (huge strings, many categories, deep nesting)
  • Color array robustness (too many/few, invalid formats, empty)
  • Thread safety (concurrent repr generation)

Escaping tests trust html.escape() (stdlib) and only verify it's called at every insertion point, rather than exercising the escaping mechanism itself with attack vectors.
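
The probe approach can be sketched like this. `render_row` is a hypothetical one-cell renderer standing in for a real insertion point; only `html.escape()` is real.

```python
import html

MARKER = "<b>MARKER</b>"
ESCAPED = html.escape(MARKER)


def render_row(key: str) -> str:
    """Hypothetical insertion point: a user-supplied key placed into a cell."""
    return f'<tr><td class="name">{html.escape(key)}</td></tr>'


def assert_escaped(rendered: str) -> None:
    """The raw probe must be absent and its escaped form present."""
    assert MARKER not in rendered
    assert ESCAPED in rendered


assert_escaped(render_row(MARKER))
```

One such check per insertion point shows that escaping is applied there, without re-testing `html.escape()` itself.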

Test cleanup:

Removed redundant and overly-specific tests to focus on meaningful coverage. Tests now verify behavior that matters (e.g., XSS escaped, errors visible, truncation applied) rather than testing identical code paths multiple times.

Visual inspection: Consolidated to 26 scenarios, with a single comprehensive "Evil AnnData" test combining all adversarial patterns.

Fixes:

  • Added repr_html_max_readme_size to _settings.pyi type stubs
  • Fixed strict warnings compatibility (pytest.warns for expected warnings)
  • Section error truncation now shows "..." indicator when message exceeds limit

Updated stats:

| Metric | Value |
|---|---|
| Total tests | 601 |
| Robustness tests | 108 (14 test classes) |
| Visual scenarios | 26 |
| Settings | 11 |

Edit (Jan 16, 2025)

Error handling consolidation:

Refactored error handling to use a single error field in FormattedOutput instead of separate is_hard_error parameters scattered across the codebase.

Key changes:

| Component | Change |
|---|---|
| FormattedOutput | Added error: str \| None field with documented precedence over preview/preview_html |
| FallbackFormatter | Made bulletproof - wraps every attribute access in try/except, checks serializability and includes reason in warnings |
| FormatterRegistry.format_value() | Accumulates failed formatters instead of stopping at first failure |
| render_formatted_entry() | Removed is_hard_error param, now detects via output.error |
| _validate_key_and_collect_warnings() | Returns (key_warnings, is_key_not_serializable) - key issues mark as not serializable, preserving preview |

Error vs Warning separation:

  • output.error: Hard rendering failure - row highlighted red, error message replaces preview
  • output.is_serializable=False: Serialization warning - red background, but preview preserved
  • Tooltip format: "Not serializable to H5AD/Zarr: {reason}" uses ":" to connect to reason, ";" separates independent warnings

New behavior when formatters fail:

  1. Registry tries all matching formatters in priority order
  2. Failed formatters are accumulated (full message for warnings, type-only for HTML)
  3. If a later formatter succeeds: warnings emitted about earlier failures
  4. If all fail: accumulated errors passed to fallback formatter

This prevents long error messages from appearing in HTML while preserving full details in warnings for debugging. Serialization issues (like non-string keys, lambdas, custom objects) preserve the value preview while showing the reason in the tooltip.
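
A minimal sketch of the accumulate-then-fall-back behavior described above. This is not `FormatterRegistry.format_value()`: the `(name, callable)` pairs, the pre-sorted priority order, and the fallback text are all assumptions made for illustration, and matching via `can_format()` is omitted.

```python
from __future__ import annotations

import warnings
from typing import Callable

Formatter = Callable[[object], str]


def format_value(obj: object, formatters: list[tuple[str, Formatter]]) -> str:
    """Try formatters in the given (priority) order; fall back if all fail."""
    failures: list[tuple[str, Exception]] = []
    for name, fmt in formatters:
        try:
            result = fmt(obj)
        except Exception as exc:
            # Keep going instead of stopping at the first failure.
            failures.append((name, exc))
            continue
        # A later formatter succeeded: report earlier failures in full,
        # but only as warnings, never in the returned markup.
        for failed_name, failed_exc in failures:
            warnings.warn(f"{failed_name} failed: {failed_exc!r}", stacklevel=2)
        return result
    # Every formatter failed: only exception type names reach the output.
    kinds = ", ".join(type(exc).__name__ for _, exc in failures)
    return f"unformattable ({kinds})"
```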

Updated stats:

| Metric | Value |
|---|---|
| Total tests | 601 |
| Robustness tests | 108 (14 test classes) |
| Source lines | ~6,500 Python + ~2,130 static assets |
| Test lines | ~10,450 (13 files) |

Edit (Jan 26, 2025)

Review response changes (addressing @flying-sheep's review):

Typing: Any → object

Replaced all ~95 uses of Any across 7 files. Formatter method signatures now use obj: object since AnnData's uns accepts genuinely arbitrary objects and formatters handle AnnData-like objects (e.g., MuData) via duck typing. dict[str, Any] with known structure replaced with precise union types.

CSS: Native nesting + dark mode + variable dedup

  • Full conversion of repr.css to native CSS nesting (&). Selector repetitions of .anndata-repr reduced from 173 to 13. File length unchanged (~1164 lines) because the feature surface is genuinely large (~68 component blocks, 14 dtype colors, copy button, README styling, state variants), not because of repetition.
  • Added Sphinx theme dark mode selectors ([data-theme="dark"] for Furo/sphinx-book-theme) alongside existing Jupyter/VS Code detection.
  • Dark mode variables (~35 declarations) deduplicated: defined once in Python and substituted into both the @media (prefers-color-scheme: dark) block and theme-selector block.
  • Limitation: BEM modifiers (&--variant) produce invalid CSS at nesting depth 2+ (browser treats & as :is(parent child), so &--view becomes :is(.anndata-repr .anndata-badge)--view). 7 modifier rules flattened to sibling selectors.

Security tests simplified

Replaced ~34 attack-vector-heavy tests with 12 focused escaping-coverage tests. Each test puts a <b>MARKER</b> probe at one user-data insertion point and verifies it appears escaped. Removed TestCSSAttacks, TestEncodingAttacks; trimmed TestBadColorArrays, TestEvilReadme; consolidated TestUltimateEvilAnnData to 1 test. Total: 108 tests (14 classes), down from 123 (16 classes).

Other:

  • FormatterContext.column_name renamed to FormatterContext.key
  • Key validation moved into FormatterRegistry.format_value()
  • HTML validator tests updated for native CSS nesting (vnu doesn't support nesting syntax yet, so CSS parse errors are filtered)

Future-Proofing: Related PRs and Issues

This PR includes explicit handling and/or code references to track compatibility with several in-progress or future changes. The following PRs/issues may trigger updates to the _repr module:

Already Handled

| PR/Issue | Description | Status in _repr | Code Locations |
|---|---|---|---|
| #1927 | Removes scipy sparse inheritance | SparseMatrixFormatter uses duck typing fallback | formatters.py:242,260,307 |
| #2063 | Array-API compatibility | ArrayAPIFormatter via duck typing | formatters.py:771,1135 |
| #2071 | Array-API backends (JAX, Cubed) | ✅ Covered by ArrayAPIFormatter | (same as #2063) |

May Require Updates When Merged

| PR/Issue | Description | Current Handling | Code Locations |
|---|---|---|---|
| #2288 | LazyCategoricalDtype API | Accesses private CategoricalArray internals | lazy.py (all functions) |
| #1923 | List data types in obs | Marked not serializable | formatters.py:159 |

Recommended Post-Merge Actions

  1. When feat: add LazyCategoricalDtype for lazy categorical columns #2288 merges:

    • Refactor CategoricalFormatter and lazy.py to use the new LazyCategoricalDtype API
    • Replace duck typing: get_lazy_categorical_info() extracts category count by manually navigating obj.variable._data.array — replace with dtype.n_categories and dtype.head_categories(n)
    • Can use isinstance(dtype, LazyCategoricalDtype) for cleaner detection
  2. When Add support for lists in obs #1923 is resolved:

    • Update _check_series_serializability() in formatters.py to recognize list-of-strings as serializable
  3. When feat: allow gpu io in sparse_dataset by removing scipy inheritance #1927 merges:

    • Verify SparseMatrixFormatter still works with new sparse array classes
    • Consider removing duck typing fallback: If anndata provides a canonical is_sparse() utility or the new classes have a stable API, the duck typing in can_format() (checking for nnz, tocsr, tocsc) could be simplified to direct type checks
  4. When feat: array-api compatibility #2063/feat: support array-api #2071 stabilize:

    • Keep duck typing: The ArrayAPIFormatter duck typing (shape/dtype/ndim) follows the Array API standard and is the correct approach
    • Consider: If anndata adds a utility like is_array_api_compatible(), could use that instead of manual attribute checks
    • Optional: Add "cubed": "Cubed" to known_backends dict in ArrayAPIFormatter for prettier display labels
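
The duck-typing checks mentioned in items 3 and 4 can be sketched as below. The function names and the `FakeSparse` class are hypothetical; only the attribute names (`nnz`, `tocsr`, `tocsc`, `shape`, `dtype`, `ndim`) come from the text above.

```python
def looks_sparse(obj: object) -> bool:
    """Duck-typed sparse check: attribute presence, no isinstance on scipy types."""
    return all(hasattr(obj, attr) for attr in ("nnz", "tocsr", "tocsc"))


def looks_array_like(obj: object) -> bool:
    """Duck-typed array check using the three attributes named above."""
    return all(hasattr(obj, attr) for attr in ("shape", "dtype", "ndim"))


class FakeSparse:
    """Hypothetical object that quacks like a sparse matrix."""

    nnz = 3

    def tocsr(self) -> "FakeSparse":
        return self

    def tocsc(self) -> "FakeSparse":
        return self
```

Because neither check imports scipy, a class that drops scipy inheritance (as in #1927) would still pass `looks_sparse` as long as it keeps these attributes.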

Internal API Usage Inventory

Current patterns accessing internal/private APIs that may be replaceable:

| Location | Current Pattern | Replacement Opportunity |
|---|---|---|
| lazy.py:_get_categorical_array() | Navigates xarray internals: col.variable._data.array | Post-#2288: Check isinstance(dtype, LazyCategoricalDtype) |
| lazy.py:get_lazy_category_count() | Accesses private CategoricalArray._categories["values"].shape[0] | Post-#2288: Use dtype.n_categories |
| lazy.py:get_lazy_categorical_info() | Accesses private ._categories, ._ordered | Post-#2288: Use dtype.n_categories, dtype.ordered |
| lazy.py:get_lazy_categories() | Uses read_elem_partial() on private ._categories | Post-#2288: Use dtype.head_categories(n) |
| lazy.py:is_lazy_adata() | String check: obs.__class__.__name__ == "Dataset2D" | Consider proper type import if stable |
| SparseMatrixFormatter.can_format() | Duck typing: checks nnz, tocsr, tocsc | Post-#1927: Use anndata's sparse utilities if provided |
| ArrayAPIFormatter.can_format() | Duck typing: checks shape, dtype, ndim | Keep — follows Array API standard |
| BackedSparseDatasetFormatter.can_format() | Checks module name + format attr | Verify post-#1927 |

Replace ~34 redundant attack-vector-heavy security tests with ~12 focused
escaping-coverage tests that verify html.escape() is called at every
user-data insertion point.

- Replace TestXSSPrevention (12 tests) with TestEscapingCoverage (12 tests)
  using a single <b>MARKER</b> probe at each insertion point
- Collapse TestUltimateEvilAnnData from 7 tests into 1 combined test
- Remove 6 CSS injection tests from TestBadColorArrays (covered by
  test_css_colors_sanitized in TestEscapingCoverage)
- Remove XSS tests from TestEvilReadme (covered by test_readme_content_escaped)
- Remove TestCSSAttacks and TestEncodingAttacks (redundant)
- Remove unused `re` import

Test count: ~130 → 108 (all passing, no coverage loss)

Replace ~95 uses of `Any` across 7 files with proper types:
- `obj: Any` on formatter methods → `obj: object`
- `dict[str, Any]` with known structure → precise union types
  (e.g., `dict[str, bool | str | None]`, `dict[str, str]`)
- Return types like `list[dict[str, Any]]` →
  `list[dict[str, str | int | tuple[str, ...] | None]]`
- Remove unused `from typing import Any` imports

Major CSS overhaul for the HTML repr:

- Convert to native CSS nesting with `&` for BEM modifiers/elements,
  reducing repetition and improving readability (Chrome 120+,
  Firefox 117+, Safari 17.2+)
- Add `[data-theme="dark"]` and `html[data-theme="dark"]` selectors
  for Furo and sphinx-book-theme dark mode support
- Deduplicate dark mode CSS variables via Python string substitution
  (`/* __DARK_MODE_VARS__ */` placeholder in css.py) — needed because
  @media queries can't combine with regular selector lists in CSS
- Add `_get_top_level_selectors()` helper to parse CSS at brace depth 0,
  since native nesting means nested selectors inherit scope from parents
- Update CSS scoping tests to only check top-level selectors
- Update bare element selector tests to use depth-aware parsing
- Filter out vnu "CSS: Parse Error" from strict HTML5 validation since
  vnu's CSS parser doesn't support native CSS nesting syntax
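
A minimal sketch of depth-aware selector extraction as described in the commit message above. `top_level_selectors` is a hypothetical function, not the PR's `_get_top_level_selectors()`; it assumes the CSS contains no comments, no strings containing braces, and no brace-less at-rules such as `@import`.

```python
def top_level_selectors(css: str) -> list[str]:
    """Collect the selector text that precedes each '{' opened at brace depth 0."""
    selectors: list[str] = []
    depth = 0
    buf: list[str] = []
    for ch in css:
        if ch == "{":
            if depth == 0:
                selector = "".join(buf).strip()
                if selector:
                    selectors.append(selector)
            depth += 1
            buf = []
        elif ch == "}":
            depth -= 1
            buf = []
        elif depth == 0:
            # Only text outside any block can belong to a top-level selector.
            buf.append(ch)
    return selectors
```

With native nesting, a scoping test only needs to confirm that each top-level selector starts with the scoping class, since nested rules inherit it.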

@katosh
Contributor Author

katosh commented Jan 26, 2026

Thanks for the detailed review @flying-sheep, really appreciate you taking the time. Here's where things stand on each point. Happy to discuss any of these further or adjust course if you see it differently.

1. README Rendering: No dependencies added

Fully agreed, and already the case. The markdown rendering is a small inline JS parser (180 lines, zero imports). Server-side content is HTML-escaped with html.escape() before embedding in a data-readme attribute. No new dependencies were added.

2. Jinja Templating: Not adopted

I gave this serious thought. Jinja's benefit for HTML generation is auto-escaping: {{ user_data }} is escaped by default, so you can't accidentally inject raw content. This is the only safety feature it adds over f-strings. Both use the same escaping underneath (markupsafe.escape / html.escape); the difference is whether escaping is opt-out (Jinja: mark trusted content with |safe) or opt-in (f-strings: call escape_html() on user data).

The opt-out model is genuinely safer when templates insert plain data into HTML. The problem is that this architecture doesn't work that way.

Why auto-escaping doesn't help here

The repr builds HTML through ~5 layers of composition where components return pre-rendered HTML strings. At each composition boundary (entry row into section, section into page, badge into header, etc.), the inner HTML must be marked |safe or Jinja would double-escape it. I counted roughly 25-30 such insertion points:

  • _repr_html_() calls for nested AnnData and pandas DataFrames (recursive raw HTML)
  • FormattedOutput fields (preview_html, expanded_html, type_html) inserted into entry rows
  • Sub-component assembly (render_copy_button(), render_badge(), etc. composed into cells)
  • Section builders joining entry rows, header assembling badges

With ~25-30 |safe bypasses, auto-escaping is largely inert. The actual user-data insertion points (where escaping matters) are a smaller set (~12), and those need explicit verification regardless of approach.
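
The double-escaping problem can be shown with the standard library alone. `render_badge` here is a hypothetical two-line component, not the PR's function; the second composition mimics what an auto-escaping template would do to pre-rendered HTML that is not marked as safe.

```python
import html


def render_badge(label: str) -> str:
    """Hypothetical component: escapes its user data, returns HTML."""
    return f'<span class="badge">{html.escape(label)}</span>'


badge = render_badge("a<b")

# Composition that trusts the inner HTML (the equivalent of |safe).
composed_raw = f"<td>{badge}</td>"

# Composition that escapes again, as auto-escaping would without |safe:
# the tag itself is now escaped and the user data is escaped twice.
composed_escaped = f"<td>{html.escape(badge)}</td>"
```

In `composed_escaped` the span is displayed as literal text and `&lt;` has become `&amp;lt;`, which is why every composition boundary would need an explicit bypass.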

Could we restructure to avoid |safe?

In principle yes: formatters would return structured data instead of HTML, and a single top-level Jinja template would render everything. But this changes the extension model for ecosystem packages. Here's how a package (e.g., SpatialData adding obst) works today:

@register_formatter
class ObstSectionFormatter(SectionFormatter):
    section_name = "obst"
    after_section = "obsm"

    def get_entries(self, obj, context):
        entries = []
        for key, tree in obj.obst.items():
            entries.append(FormattedEntry(
                key=key,
                output=FormattedOutput(
                    type_name=f"SpatialTree ({tree.n_nodes} nodes)",
                    css_class="anndata-dtype--tree",
                    preview_html=f'<span class="spatial-depth">depth={tree.depth}</span>',
                    expanded_html=tree._repr_html_(),  # rich expandable preview
                ),
            ))
        return entries

To eliminate |safe, formatters could no longer return HTML. But what does expanded_html become when a package wants to show its own rich interactive visualization? The options:

  1. Return only plain text/data. No custom styling, no interactive previews, no package-specific HTML.

  2. Ship a Jinja template. Requires a template discovery/loading mechanism, version compatibility with anndata's base templates, and the package author learning Jinja's inheritance model. Significantly more complex than returning a dataclass.

  3. Return HTML and mark it |safe. Back to square one.

What else we'd lose

Pandas DataFrame preview: DataFrameFormatter calls df._repr_html_() for pandas' native table output. There's no way to template someone else's _repr_html_(), so this always needs |safe.

Implementation simplicity: HTML generation is currently co-located with the Python logic that produces the data. Each formatter is self-contained. With Jinja, we'd need separate .html template files for sections, entries, badges, header, footer, copy button, search box, README modal, error displays, etc., each kept in sync with the Python data structures that feed it.

Debugging: With f-strings, you can print() any intermediate HTML. With Jinja, errors point to template line numbers, not Python source.

Testing is the same either way

The PR originally included attack-vector-style tests (script injection, event handlers, etc.) to make the escaping coverage explicit and easy to audit. Based on your feedback that those adversarial tests belong in the escaping library's own test suite, we simplified: the TestEscapingCoverage class now uses a single <b>MARKER</b> probe at each of the 12 user-data insertion points and verifies it appears as &lt;b&gt;MARKER&lt;/b&gt;. This tests that escaping is applied at every insertion point, without re-testing the escaping mechanism itself.

This testing burden is the same with or without Jinja: we'd still need to verify that no |safe sits on a user-data path. Jinja also wouldn't cover CSS injection (sanitize_css_color()), which is a separate concern.

I'm genuinely open here. If you think Jinja is the right call despite these tradeoffs, happy to look into it more concretely.

3. CSS: Adopted all suggestions - validation caveat

Implemented everything. The native nesting conversion was a real improvement in readability, though we did run into some limitations worth noting:

What's done

  • Furo/sphinx-book-theme dark mode: Added [data-theme="dark"] and html[data-theme="dark"] selectors alongside the existing Jupyter/VS Code detection. Removed @media (prefers-color-scheme: dark) because it reflects the OS preference, which can contradict the app theme (e.g., OS dark + Furo light toggle). This follows xarray's approach of relying on attribute selectors that match the actual app state. With only one dark mode block remaining, the dark mode variables now live directly in repr.css, replacing the Python-side string substitution that was needed to keep two blocks in sync.

  • Native CSS nesting: Full conversion of repr.css to use native nesting with &. No build tools needed, just plain CSS supported in Chrome 120+, Firefox 117+, Safari 17.2+.

  • Inline styles moved to CSS: All inline style= attributes have been moved to proper CSS classes (.anndata-header__filepath, .anndata-spacer, .anndata-categories__dot, .anndata-entry__custom, .anndata-entry--error, .anndata-badge--error). Removed redundant STYLE_SECTION_CONTENT and STYLE_SECTION_TABLE constants that duplicated what the CSS already defined. Also removed 6 unused selectors from the CSS (~55 lines). Only three categories of inline styles remain: display:none for progressive enhancement (JS sets different display values per element), dynamic background:{color} from user data (color swatches), and computed CSS variable values (--anndata-name-col-width).

  • CSS audit: Full inventory of every selector with usage locations in CSS_CLASS_REFERENCE.md.

Nesting limitations encountered

The & substitution has a gotcha with BEM modifiers at nesting depth 2+. At depth 1, &--view inside .anndata-badge { } correctly becomes .anndata-badge--view. But when .anndata-badge is itself nested inside .anndata-repr, & becomes :is(.anndata-repr .anndata-badge), and appending --view produces :is(.anndata-repr .anndata-badge)--view, which is invalid CSS that browsers silently ignore.

This broke 7 modifier rules: badge variants (view/backed/lazy/extension), the search hit counter's active state, and color swatch invalid states. The fix was to flatten those to sibling selectors rather than nested &-- patterns. I added comments in the CSS explaining this for future maintainers.

Why the file isn't shorter

Nesting reduced .anndata-repr selector repetitions from 173 to 13, and we removed all unused selectors and dead code. The file is ~1,100 lines. We did a full audit — every selector is accounted for with its usage location in the CSS class reference. The bulk comes from the number of distinct components:

  • ~40 CSS custom properties (colors, typography, layout — light and dark mode)
  • README modal content styling: headings, lists, tables, code blocks, blockquotes (~110 lines) — these need independent styles because the modal renders arbitrary user markdown
  • Copy button drawn entirely in CSS with ::before/::after pseudo-elements and a copied-state checkmark transition (~70 lines)
  • Embedded DataFrame tables (pandas _repr_html_() output needs restyling inside the nested content container)
  • 16 dtype color classes (one per data type)

Sass's remaining advantage over native nesting would be @each loops for the dtype colors and badge variants (~55 lines), which doesn't justify a build dependency.

vnu validation caveat

The Nu Html Checker doesn't support native CSS nesting syntax yet, so the strict HTML5 validation tests now filter out CSS parse errors. The CSS is valid per the CSS Nesting spec and works in the browser versions listed above (Chrome 120+, Firefox 117+, Safari 17.2+).

4. Typing: Adopted, proposing to revert object → Any on formatter interface

Replaced dict[str, Any] with precise union types where the structure is known (e.g., dict[str, bool | str | None]), added missing return type annotations, and removed unused from typing import Any imports.

For the formatter interface (can_format(obj), format(obj)), we tried object but it doesn't work with the dispatch pattern. The separation exists because dispatch goes beyond isinstance(): formatters check attributes, modules, and section context (e.g., SparseMatrixFormatter duck-types scipy-like objects, CategoricalFormatter handles pandas categoricals, categorical Series, and lazy xarray categoricals through a single formatter). This is a chain-of-responsibility with priority ordering, which singledispatch can't express.

The tradeoff is that can_format() does the narrowing but format() uses the result, and type checkers can't carry narrowing across method boundaries. mypy and pyright report 56 and 92 errors respectively, all "cannot access attribute X for class object" in format() methods. Adding cast() to every format() would fix it but is just boilerplate. singledispatch has the same fundamental issue: its typeshed stub types the base function as Callable[..., _T], leaving the dispatched argument untyped. I'd propose reverting the object changes on the formatter interface (9452c48). What do you think?
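
To make the narrowing problem concrete, here is a minimal sketch with a hypothetical `SparseFormatter` and `SparseLike` protocol (neither is the PR's class). The check in `can_format()` is invisible to the type checker inside `format()`, so `obj: object` needs a `cast()` before any attribute access.

```python
from typing import Protocol, cast


class SparseLike(Protocol):
    """Hypothetical protocol describing the one attribute used below."""

    nnz: int


class SparseFormatter:
    def can_format(self, obj: object) -> bool:
        # Runtime narrowing happens here...
        return hasattr(obj, "nnz")

    def format(self, obj: object) -> str:
        # ...but does not carry over: `obj.nnz` would be flagged because
        # `object` has no attribute `nnz`. cast() silences the checker
        # without any runtime effect.
        sparse = cast(SparseLike, obj)
        return f"sparse ({sparse.nnz} stored)"
```

With `obj: Any` the `cast()` line disappears, which is the boilerplate the proposal above would avoid.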

5. <details> Tags: Partially adopted (sections only)

Done. All sections use <details>/<summary> with the open attribute for initial state. This removed ~60 lines of JS and the .anndata-section--collapsed class.
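
A minimal sketch of the section markup this produces. `render_section_sketch` and its signature are hypothetical, not the PR's `render_section()`; the `anndata-section` class and the bare `<summary>` follow the description in this comment.

```python
import html


def render_section_sketch(name: str, body_html: str, *, expanded: bool) -> str:
    """Render a section as <details>; `open` sets the initial state.

    body_html is assumed to be already-rendered, already-escaped HTML.
    """
    open_attr = " open" if expanded else ""
    return (
        f'<details class="anndata-section"{open_attr}>'
        f"<summary>{html.escape(name)}</summary>"
        f"{body_html}"
        "</details>"
    )
```

The browser tracks the open/closed state itself, so no toggle handler or collapsed-state class is needed.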

Entry-level expandables (nested AnnData, DataFrames) stay JS-based because <details> can't wrap two sibling <tr> elements inside a table (the browser's foster parenting algorithm ejects it).

On cutting CSS classes via structural selectors: I went ahead and applied this for the section header. The CSS now uses .anndata-section > summary instead of a .anndata-section__header class, and the <summary> tags in Python are bare. This worked cleanly because <summary> is a unique semantic tag with one per <details>.

For the children inside <summary> (name, count, help link) I kept the classes. These are all generic <span> elements, so distinguishing them structurally would mean span:first-child / span:nth-child(2) which is more fragile and less readable than .anndata-section__name. The class name documents purpose in a way positional selectors don't.

Worth noting: this does create a mixed approach where the header itself uses a structural selector but its children use classes. There's an argument for consistency in either direction. I made this an isolated commit so it's easy to revert if you'd prefer all-classes or want to discuss further.


Thanks again for the thorough review. Let me know if any of the above needs further discussion or a different direction.

katosh added 12 commits January 26, 2026 16:38
Section-level collapse now uses native HTML <details>/<summary>
elements instead of JS-driven div toggling. This removes ~60 lines
of JS (toggleSection, data-should-collapse setup, section header
click/keyboard handlers) and the .anndata-section--collapsed CSS
class in favor of the browser's built-in open/close state.

- render_section() and render_empty_section(): <div> → <details>,
  `open` attr for expanded sections, no data-should-collapse
- _render_section_header(): <div> → <summary>, fold icon removed
- sections.py: unknown/error sections also converted to <details>
- CSS: custom ::before triangle on <summary> with rotation for
  closed state; removed .anndata-section__fold and max-height
  transition styles
- JS search filter: section.classList → section.open property
- render_fold_icon(): returns "" (kept for API compat)
- Validators: assert collapsed/expanded via <details open> attr

Entry-level expandables (nested AnnData, DataFrames) stay JS-based
since <details> cannot wrap adjacent <tr> elements in a table.
…elector

Use `.anndata-section > summary` instead of `.anndata-section__header`
in CSS, and drop the class from <summary> tags in Python. The semantic
tag makes a dedicated class redundant for section headers.

Also unify unknown/error sections to use `anndata-section` class
(was `anndata-sec`) so they share the same summary styling.
… selectors

The media query reflects the OS preference, which can contradict the app
theme (e.g., OS dark + Furo light toggle). Attribute selectors like
[data-theme="dark"] match the actual app state. This follows xarray's
approach (pydata/xarray#6500).
With only one dark mode block remaining (the @media query was removed in
the previous commit), the placeholder substitution machinery is no longer
needed. Move variables directly into repr.css.
Move all inline styles to CSS rules for consistency:
- .anndata-header__filepath, .anndata-spacer, .anndata-categories__dot,
  .anndata-entry__custom, .anndata-entry--error, .anndata-badge--error
Remove redundant STYLE_SECTION_CONTENT/STYLE_SECTION_TABLE constants
(already defined in .anndata-section__content/.anndata-section__table).
Remove unused CSS: .anndata-search, .anndata-entry__preview--expanded,
.anndata-x, .anndata-x__info, .anndata-text--warning, [data-tooltip].
Add missing anndata-section__table class to unknown sections table.

Only STYLE_HIDDEN and dynamic background/CSS-variable values remain inline.
… color

The .anndata-text--warning rule was incorrectly removed during cleanup
but is still applied in registry.py. The .anndata-dtype--ndarray class
was defined in constants and used by formatters but never had a CSS rule,
falling through unstyled. It now shares the --array color variable.
Keep both copy_on_write_X setting from main and HTML repr settings
from html_rep branch.
Two-tier detection: tier 1 uses the canonical has_xp() protocol check
from anndata.compat (catches JAX, numpy >=2.0); tier 2 falls back to
duck-typing (shape/dtype/ndim) for arrays that don't yet implement the
full protocol (PyTorch, TensorFlow). Also uses __array_namespace__()
for backend label resolution and updates the stale PR reference scverse#2063 → scverse#2071.
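The two-tier detection in this commit can be sketched as follows. This is an assumption-laden illustration: the PR uses `has_xp()` from `anndata.compat` for tier 1, whereas this sketch substitutes a bare `__array_namespace__` check as a stand-in, and the function names are hypothetical.

```python
from typing import Optional


def is_array_api(obj: object) -> bool:
    """Tier 1 stand-in: the object advertises the Array API standard."""
    return callable(getattr(obj, "__array_namespace__", None))


def looks_like_array(obj: object) -> bool:
    """Tier 2: duck-typing fallback on shape/dtype/ndim."""
    return all(hasattr(obj, attr) for attr in ("shape", "dtype", "ndim"))


def detect_array_tier(obj: object) -> Optional[str]:
    """Return which tier matched, or None if the object is not array-like."""
    if is_array_api(obj):
        return "array-api"
    if looks_like_array(obj):
        return "duck-typed"
    return None
```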
… arrays

Device info (cuda:0, tpu:0, GPU:0, etc.) is now shown inline in the type
column instead of being hidden in tooltips. Adds visual inspection test 26.
…ce-based coloring

CuPy ≥12 implements the full Array API protocol, so the dedicated formatter
was redundant. ArrayAPIFormatter now handles CuPy arrays (with GPU:{device.id}
for clean labels) and colors all array-api arrays by device type: GPU green,
TPU teal, CPU/other amber — uniformly across backends.

Also removes unused CSS_DTYPE_ARRAY constant and its CSS selector.
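A rough sketch of the device-based coloring rule, assuming device labels of the form shown above (`cuda:0`, `tpu:0`, `GPU:0`). The function name and the CSS class names are hypothetical.

```python
def device_color_class(device: str) -> str:
    """Map a device label such as 'cuda:0' or 'tpu:0' to a CSS class."""
    kind = device.split(":", 1)[0].lower()
    if kind in {"cuda", "gpu"}:
        return "anndata-dtype--gpu"  # hypothetical class, green
    if kind == "tpu":
        return "anndata-dtype--tpu"  # hypothetical class, teal
    return "anndata-dtype--cpu"  # hypothetical class, amber; also covers unknown devices
```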
@flying-sheep
Member

flying-sheep commented Feb 10, 2026

Hi, thanks for all the work! Quite a bit smaller now, but +22k still fills me with dread. (I know a lot is tests)

> Fully agreed, and already the case. The markdown rendering is a small inline JS parser (180 lines, zero imports).

Hmm, I’ll take a look, but I’m not sure I want to risk it existing.

> With ~25-30 |safe bypasses, auto-escaping is largely inert

I think you missed that you can instead wrap safe markup in markupsafe.Markup, which addresses your concerns here.

> Removed @media (prefers-color-scheme: dark) because it reflects the OS preference, which can contradict the app theme (e.g., OS dark + Furo light toggle).

That makes no sense to me. Using a theme setting if we can detect one and defaulting to the OS setting if we can’t is the best we can do, so why not do that?

> &--view

Huh, didn’t know that’s possible, I recommended nesting mainly for descendant selectors (parent child1 and parent > child2 can become parent { child1 {}; &>child2 {} } which is cleaner), and you do that now, nice!

I’m pretty happy with the state of the CSS now, but just FYI, there’s options:

  • Instead of BEM we could use something like data attributes, which would be nestable. .anndata-dtype--category {} … could become .anndata-dtype { &[data-dtype="category"] {}; … } or so.
  • There’s also custom elements, which are worth a look

But as said, just some pointers, no need to put a lot of work into that, the CSS looks fine!

> Sass's remaining advantage over native nesting would be @each loops for the dtype colors and badge variants (~55 lines),

We could use a cached_property or @cached function calling jinja to do this, but as said, it’s fine as it is.

> vnu validation caveat

Link for future reference: w3c/css-validator#431

> For the formatter interface (can_format(obj), format(obj)), we tried object but it doesn't work with the dispatch pattern. The separation exists because dispatch goes beyond isinstance(): formatters check attributes, modules, and section context (e.g., SparseMatrixFormatter duck-types scipy-like objects, CategoricalFormatter handles pandas categoricals, categorical Series, and lazy xarray categoricals through a single formatter). This is a chain-of-responsibility with priority ordering, which singledispatch can't express.

First, singledispatch can do anything since you can register runtime checkable protocols or ABCs, but more importantly, I’d rather have a few typing errors light up in some branches than just ignore typing altogether using Any.

> The tradeoff is that can_format() does the narrowing but format() uses the result,

Just make it return a TypeGuard[...] instead of a bool and it does what you want!

> the browser's foster parenting algorithm ejects it

what’s that, do you have a link?

…move markdown parser

- Add @media (prefers-color-scheme: dark) as OS-level fallback (Tier 1),
  explicit light selectors (Tier 2) override when app is in light mode,
  existing dark selectors (Tier 3) unchanged. CSS variables defined once
  in css.py with placeholder substitution to avoid duplication.
- Make TypeFormatter generic via PEP 695 (TypeFormatter[T]). can_format()
  returns TypeGuard[T], format() receives narrowed type without manual
  casts. Duck-typed formatters use TypeFormatter[object] with type: ignore.
- Remove markdown-parser.js (6.7KB) and markdown.py. README content shown
  as plain text via <pre> + textContent (XSS-safe). Remove ~110 lines of
  markdown-specific CSS.
- Add w3c/css-validator#431 reference where CSS nesting validation is skipped.
- Update visual_inspect_repr_html.py descriptions for plain-text README.
@katosh
Contributor Author

katosh commented Feb 10, 2026

@flying-sheep Thanks for the thorough review! Here's what we've addressed and where we landed on the discussion points.

Changes made

Dark mode CSS. Added @media (prefers-color-scheme: dark) as OS-level fallback (Tier 1), with explicit light theme selectors (Tier 2) overriding it when the app is in light mode. Existing dark selectors (Tier 3) unchanged. Dark/light variable blocks are defined once in css.py and substituted into placeholders to avoid duplication. This was originally omitted intentionally: on pages without theme-switching attributes (plain HTML exports, nbviewer, non-Jupyter contexts), only the OS media query fires, so the anndata repr goes dark while the rest of the page stays light.
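The "define once, substitute into placeholders" idea can be sketched like this. The variable names, color values, placeholder markers, and the `[data-theme="light"]` selector are assumptions for illustration; only `[data-theme="dark"]` and the `.anndata-repr` scope are taken from this thread.

```python
# Hypothetical variable blocks; each is written once and reused.
DARK_VARS = "--anndata-bg: #1e1e1e; --anndata-fg: #e0e0e0;"
LIGHT_VARS = "--anndata-bg: #ffffff; --anndata-fg: #1a1a1a;"

# Hypothetical template with comment-style placeholders.
CSS_TEMPLATE = """\
/* Tier 1: OS preference as the fallback */
@media (prefers-color-scheme: dark) {
  .anndata-repr { /*DARK_VARS*/ }
}
/* Tier 2: an explicit light theme overrides the OS preference */
[data-theme="light"] .anndata-repr { /*LIGHT_VARS*/ }
/* Tier 3: an explicit dark theme */
[data-theme="dark"] .anndata-repr { /*DARK_VARS*/ }
"""


def build_css(template: str = CSS_TEMPLATE) -> str:
    """Substitute the variable blocks into every placeholder."""
    return template.replace("/*DARK_VARS*/", DARK_VARS).replace(
        "/*LIGHT_VARS*/", LIGHT_VARS
    )
```

In this sketch the tier 2 and tier 3 selectors are more specific than the bare `.anndata-repr` rule inside the media query, so an explicit theme attribute wins over the OS preference.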

TypeGuard for can_format(). Made TypeFormatter generic via PEP 695 (class TypeFormatter[T](ABC)). can_format() now returns TypeGuard[T], so format() receives the narrowed type without manual casts. Duck-typed formatters use TypeFormatter[object] with # type: ignore comments explaining the duck-typing contract.

README as plain text. Removed the JavaScript markdown parser (markdown-parser.js, 6.7KB). README content is now displayed as plain text via <pre> with textContent (XSS-safe, no parsing needed). Removed ~110 lines of markdown-specific CSS.

<details> for expandable rows. We use sibling <tr> elements with JS toggle because <details> inside <table> is ejected by the browser's foster parenting algorithm. Placing it inside a <td> works syntactically but can't colspan for full-width expansion. Minimal demo showing both failure modes.

CSS validator link. Added reference to w3c/css-validator#431 where we skip CSS nesting validation errors.

On Jinja / markupsafe

Thanks for the Markup correction, that's cleaner than |safe filters. I want to walk through the options honestly because each has a real tradeoff.

Markup without Jinja doesn't work well here. We use f-strings throughout, and f-strings bypass Markup's auto-escaping. Switching to Markup.format() would help, but Markup(f"<td>{x}</td>") and Markup("<td>{}</td>").format(x) look nearly identical while having opposite safety properties. There's no linter rule to catch this, so any contributor reaching for the idiomatic pattern silently reintroduces the bug Markup is supposed to prevent.

Markup with Jinja is structurally sound: template files can't contain f-strings, so the escaping pipeline is enforced by the language boundary. But it circles back to the composition problem: ~25-30 insertion points pass Markup objects through unescaped (formatter HTML, nested _repr_html_(), component assembly), so auto-escaping fires on a small minority of insertions. The actual user-data insertion points (~12) still need explicit verification. We'd also add Jinja as a runtime dependency, split HTML generation across template files and Python, and require ecosystem formatters to either ship templates or return Markup strings (which is the current *_html pattern with extra steps).

Current approach. Explicit escape_html() at every user-data insertion point, validated by TestEscapingCoverage. No new dependencies, no split between template and Python logic, ecosystem formatters just return a dataclass.
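The explicit-escaping approach with a probe-based check can be sketched as follows. This is an illustration only: `escape_html()` here is a stand-in built on the standard library's `html.escape`, and `render_entry`, `PROBE`, and `assert_escaped` are hypothetical names, not the PR's `TestEscapingCoverage`.

```python
from html import escape

# A probe containing every character that must never reach the output raw.
PROBE = '<probe attr="x">&'


def escape_html(value: object) -> str:
    """Stand-in for the PR's escape_html(); simply html.escape on str(value)."""
    return escape(str(value), quote=True)


def render_entry(name: str, preview: str) -> str:
    # Both arguments are user data, so both are escaped at the insertion point.
    return f"<tr><td>{escape_html(name)}</td><td>{escape_html(preview)}</td></tr>"


def assert_escaped(html: str) -> None:
    """Fail if the raw probe survived into the rendered HTML."""
    assert PROBE not in html, "unescaped user data reached the HTML"
    assert escape(PROBE, quote=True) in html
```

Feeding the same probe through each insertion point, one test per point, checks coverage without depending on any particular attack string.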

I think the current approach is the right fit for this architecture, but happy to discuss further.

@katosh
Contributor Author

katosh commented Feb 11, 2026

On the PR size (~22K lines)

Happy to discuss what could be simplified or split. Here's an honest breakdown of where the lines went.

Summary

| Category | Lines | % |
|---|---|---|
| Tests (566 test methods across 10 files) | 9,075 | 41% |
| Visual inspection harness (26+ scenarios) | 3,365 | 15% |
| Test infrastructure (validator + conftest) | 1,108 | 5% |
| Source code (12 Python modules) | 6,400 | 29% |
| Static assets (CSS, JS, color list) | 1,756 | 8% |
| Settings + anndata.py integration | 173 | 1% |

Tests and test tooling account for 61% of the PR. The implementation itself is ~8,150 lines.

Source code breakdown

formatters.py (1,172 lines) — 20 type-specific formatters covering ndarray, masked arrays, sparse matrices, backed sparse, DataFrames, Series, categoricals, lazy columns, dask, awkward, array-API/CuPy, nested AnnData, None, bool, int, float, str, dict, color lists, and generic list/tuple. Each formatter is ~50 lines average, with the larger ones (categorical, array-API) handling color swatches, device info, and dtype CSS classes. This is the primary extension point for ecosystem packages.

registry.py (1,044 lines) — The plugin system. Bulk comes from: FormatterRegistry with priority dispatch, error accumulation, and debug helpers (226 lines), FallbackFormatter that defensively wraps every attribute access for arbitrary objects (205 lines), TypeFormatter/SectionFormatter ABCs with docstring examples for ecosystem authors (211 lines combined), FormattedOutput dataclass with field documentation (98 lines), and extract_uns_type_hint for tagged data in uns (91 lines). The registry is designed for packages like MuData, SpatialData, and TreeData to register custom sections and formatters without modifying anndata.

utils.py (790 lines) — Shared helpers: serialization checking via the IO registry, value preview generation (dicts, lists, strings with truncation), color detection and CSS sanitization (whitelist-based, blocks injection), HTML escaping, memory formatting, key validation. The color sanitization alone is ~60 lines because it validates against CSS named colors, hex, rgb(), and hsl() while blocking url(), expression(), and semicolons.
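A whitelist-based color check along these lines might look like the sketch below. It is not the PR's implementation: the signature of `sanitize_css_color()` is assumed, the named-color set is a tiny stand-in for css_colors.txt, and the patterns are deliberately simplified.

```python
import re
from typing import Optional

# Tiny stand-in for the full list of CSS named colors.
NAMED_COLORS = frozenset({"red", "green", "blue", "teal", "rebeccapurple"})

_HEX = re.compile(r"#(?:[0-9a-fA-F]{3,4}|[0-9a-fA-F]{6}|[0-9a-fA-F]{8})")
# Only digits, dots, percent signs, commas, slashes, and whitespace inside the parentheses.
_FUNC = re.compile(r"(?:rgb|hsl)a?\(\s*[0-9.%,\s/]+\)")


def sanitize_css_color(value: str) -> Optional[str]:
    """Return the color if it matches the whitelist, else None."""
    color = value.strip()
    if color.lower() in NAMED_COLORS:
        return color.lower()
    if _HEX.fullmatch(color) or _FUNC.fullmatch(color.lower()):
        return color
    # Anything else (url(), expression(), semicolons, ...) is rejected.
    return None
```

Because the check accepts only known shapes rather than rejecting known attacks, inputs such as `url(...)` or a value containing `;` fail without needing their own rules.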

html.py (637 lines) — The entry point. Orchestrates header (shape, badges, README icon, search), section rendering loop, footer (version, memory), and wraps everything with scoped CSS/JS. Handles settings capture, container ID generation, and the overall HTML structure.

components.py (618 lines) — Reusable UI components: section headers with fold/expand, entry rows with name/type/preview columns, badges, warning icons, copy buttons, search box. These are the building blocks that ecosystem packages can use directly.

sections.py (563 lines) — Section renderers for obs/var DataFrames (with column width calculation), mapping sections (obsm, varm, obsp, varp, layers), uns (recursive dict traversal with depth limit), and raw.
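The depth-limited recursive traversal of uns can be sketched as below. `walk_uns` and its row format are hypothetical; the sketch only shows how a depth limit bounds the recursion, including for self-referencing dicts.

```python
from typing import Any, Iterator, Mapping, Tuple


def walk_uns(
    mapping: Mapping[str, Any], *, max_depth: int = 3, _depth: int = 0
) -> Iterator[Tuple[int, str, str]]:
    """Yield (depth, key, summary) rows, descending into nested dicts up to max_depth."""
    for key, value in mapping.items():
        if isinstance(value, Mapping):
            yield _depth, key, f"dict ({len(value)} items)"
            # Stop descending once the next level would exceed the limit.
            if _depth + 1 < max_depth:
                yield from walk_uns(value, max_depth=max_depth, _depth=_depth + 1)
        else:
            yield _depth, key, type(value).__name__
```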

__init__.py (468 lines) — Public API with __all__ (49 exports) and module-level architecture documentation. The exports are intentionally broad for ecosystem extensibility.

core.py (401 lines) — Shared rendering primitives: format_number (with comma grouping), table rendering for DataFrame expansion, and entry rendering coordination between formatters and HTML output.

lazy.py (346 lines) — Lazy AnnData support. Detects lazy mode, reads partial categories from disk without triggering full materialization, determines column dtypes from storage metadata. Wrapped in try/except with graceful fallback.

css.py (97 lines) — CSS loader with dark/light variable placeholder substitution (define color blocks once, substitute into both @media and theme-selector rules).

javascript.py (49 lines) — JS loader.

Static assets

repr.css (1,050 lines) — Scoped CSS with native nesting. Covers: layout grid, section headers, entry rows, type column with dtype-specific colors (12 dtype classes), dark mode (three-tier: OS media query, explicit light override, dark theme selectors for Jupyter/Sphinx), README modal, search box, fold/expand animations, badges, warning/error styling, color swatches, copy buttons, scrollable containers. All scoped under .anndata-repr to avoid Jupyter conflicts.

repr.js (509 lines) — Fold/expand toggle, search with regex support and toggle buttons, copy-to-clipboard, README modal with keyboard accessibility, wrap-mode toggle for long type strings, ResizeObserver for responsive layout.

css_colors.txt (197 lines) — CSS named colors for sanitize_css_color() validation. This is a static lookup table, not generated code.

Tests

Average test is 16 lines. Tests are split by concern:

| File | Tests | Lines | Focus |
|---|---|---|---|
| test_repr_core.py | 95 | 1,238 | HTML structure, settings, badges, README |
| test_repr_sections.py | 91 | 1,237 | Section rendering for all anndata slots |
| test_repr_robustness.py | 72 | 1,493 | XSS escaping, broken objects, edge cases |
| test_repr_formatters.py | 59 | 1,066 | All 20 type formatters |
| test_repr_utils.py | 54 | 510 | Utility functions |
| test_repr_lazy.py | 44 | 826 | Lazy AnnData with mocked storage |
| test_html_validator.py | 43 | 732 | Validator self-tests |
| test_repr_registry.py | 39 | 920 | Plugin registry, priority, error handling |
| test_repr_warnings.py | 36 | 568 | Serialization warnings |
| test_repr_ui.py | 33 | 485 | Folding, colors, search, clipboard |

test_repr_robustness.py (1,493 lines) is the largest because it covers 72 edge cases: escaping at every user-data insertion point (probe-based, not attack-vector-based), unicode handling, crashing objects, circular references, size limits, concurrent access, and error accumulation. These are intentionally thorough because _repr_html_() runs on arbitrary user data.

Test infrastructure

html_validator.py (836 lines) — Regex-based HTML validator with structured assertions (assert_section_exists, assert_section_contains_entry, assert_shape_displayed, etc.). Built without external dependencies to keep the test requirements minimal. Using BeautifulSoup would reduce this but add a test dependency.
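A regex-based structured assertion in this style might look like the following sketch. The helper name `assert_section_collapsed` and the `data-section` attribute are assumptions for illustration; the actual validator and markup may identify sections differently.

```python
import re


def assert_section_collapsed(html: str, name: str, *, collapsed: bool) -> None:
    """Check the <details> tag of a named section for the `open` attribute."""
    # Assumes each section carries a data-section="<name>" attribute (hypothetical).
    match = re.search(
        rf'<details\b[^>]*\bdata-section="{re.escape(name)}"[^>]*>', html
    )
    assert match is not None, f"section {name!r} not found"
    # `open` must appear as its own attribute, not as part of another value.
    is_open = re.search(r"\sopen(?=[\s=>])", match.group(0)) is not None
    assert is_open != collapsed, f"section {name!r}: expected collapsed={collapsed}"
```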

conftest.py (272 lines) — Shared fixtures: AnnData factories for various configurations, the validate_html fixture, optional strict validators (W3C HTML5, JS syntax) that skip gracefully when tools aren't installed.

Visual inspection harness

visual_inspect_repr_html.py (3,365 lines) — Generates an HTML page with 26+ scenarios for manual review. Not a pytest test. Includes: basic/empty/view AnnData, lazy mode, backed mode, deep nesting, many categories, custom sections (TreeData/MuData/SpatialData mocks), README modal, adversarial data, ecosystem extensibility demos. The HTML template itself is ~2,200 lines (inline CSS for the test page layout, accordion sections, checklists). This could live in a separate repo or as a notebook, but having it adjacent to the code makes it easy to regenerate during development.

What could be reduced?

Genuinely open to suggestions. Some candidates:

  1. Visual inspector (3,365 lines) — Could be moved out of the PR and maintained separately. It's a development tool, not a runtime or test dependency.

  2. html_validator.py (836 lines) — Could switch to BeautifulSoup, cutting this roughly in half. Trade-off is adding a test dependency.

  3. Registry docstrings/examples (~300 lines across registry.py) — The extension API documentation is verbose. Could be moved to Sphinx docs instead of inline. But inline examples are what ecosystem authors will actually find when they subclass TypeFormatter.

  4. test_repr_robustness.py (1,493 lines) — Some of the edge-case tests could be considered excessive for a _repr_html_() method. The escaping coverage tests (one probe per insertion point) are the most important; the unicode/crashing/concurrent tests could be trimmed.

  5. css_colors.txt (197 lines) — Could be replaced with a runtime query to matplotlib's color list, but that would add a soft dependency on matplotlib at repr time.

None of these would change the order of magnitude. The feature has genuine breadth: 20 type formatters, a plugin registry, 11 configurable settings, dark mode, lazy mode support, serialization warnings, and search. For comparison, pandas' _repr_html_ is ~2K lines and xarray's is ~1.5K lines, but neither has interactivity, extensibility, or this level of type-specific formatting.

The test-to-code ratio of 1.7:1 reflects a deliberate choice: _repr_html_() processes arbitrary user data and produces HTML that runs in notebooks, so thorough testing seemed appropriate. Happy to trim where the coverage isn't pulling its weight.

The expanded raw subsection now displays index previews matching the
main AnnData header, with graceful "not available" fallback when
indices are absent or inaccessible.
Upstream added `size: int` to `SupportsArrayApi`, causing `has_xp()`
to reject the mock and `coerce_array` to raise.
@flying-sheep flying-sheep added this to the 0.13.0 milestone Feb 19, 2026
Development

Successfully merging this pull request may close these issues.

HTML Repr