Architecture

CodeMap is structured as a layered application with clean separation between concerns.

Layer Diagram

┌─────────────────────────────┐
│          CLI Layer           │  Typer commands, argument parsing
│   codemap/cli/              │  Thin — delegates to Application
├─────────────────────────────┤
│      Application Layer       │  Use-case orchestration
│   codemap/application/      │  scan, analyze, graph, report
├─────────────────────────────┤
│        Domain Layer          │  Core graph model, metrics, protocols
│   codemap/domain/           │  Zero external dependencies
├──────────────┬──────────────┤
│ Infrastructure│  Rendering   │  Git, filesystem,   │ JSON, HTML,
│ codemap/     │  codemap/    │  extractors          │ terminal
│ infrastructure│  rendering/ │                      │
└──────────────┴──────────────┘

Domain Layer (`codemap/domain/`)

The domain layer is the core — it has no external dependencies (only Python stdlib).

Key types

Type	Purpose
`Node`	A file, directory, or module in the graph
`Edge`	A directed dependency between two nodes
`NodeGroup`	A logical cluster (directory, package)
`CodeGraph`	The central aggregate — nodes, edges, groups
`NodeMetrics`	Fan-in, fan-out, centrality, churn per node
`OwnershipInfo`	Contributor data attached to a node
`ContributorInfo`	Individual contributor snapshot

Protocols

Protocol	Purpose
`DependencyExtractor`	Extracts edges from a source file
`OwnershipProvider`	Provides contributor data per file
`GraphRenderer`	Renders a CodeGraph to an output format

All protocols use typing.Protocol with runtime_checkable for structural subtyping — implementors don't need to inherit from anything.

Infrastructure Layer (`codemap/infrastructure/`)

Handles all external I/O and language-specific logic.

FileScanner

Walks the repository tree with configurable include/exclude rules and ignored directory lists. Produces ScanResult containing ScannedFile entries.

GitAnalyzer

Shells out to git log to extract per-file contributor and churn data. Implements OwnershipProvider. Degrades gracefully when git is unavailable.

Batch mode (default): Uses prefetch() to load ownership and churn for all files in two git log calls total, regardless of repository size. This replaces the previous 2×N per-file subprocess approach and is critical for performance on large repositories (e.g. Angular: ~6200 files).

Extractors

Each extractor implements DependencyExtractor:

Extractor	Strategy
`PythonExtractor`	AST parsing of `import` / `from … import` statements
`JavaScriptExtractor`	Regex-based parsing of ES `import` and CommonJS `require`

Application Layer (`codemap/application/`)

Orchestrates domain and infrastructure into use-cases:

Module	Use-case
`scanner.py`	Scan a repository → `ScanResult`
`analyzer.py`	Scan + extract deps + git + metrics → `CodeGraph`
`grapher.py`	Render a `CodeGraph` to HTML or JSON
`reporter.py`	Generate a `CodeMapReport` with hotspots and ownership

Rendering Layer (`codemap/rendering/`)

Renderer	Output
`JsonRenderer`	Machine-readable `codemap.json`
`HtmlGraphRenderer`	Interactive D3.js visualization with multiple layouts, view modes, risk overlay, and exploration controls (`codemap.html`)
`PdfReportRenderer`	Multi-page PDF report with tables and metrics (`codemap.pdf`)
`terminal_renderer`	Rich console output for `codemap report`

The HTML template is separated into _html_template.py for maintainability. The renderer pre-computes risk scores and topological depth in Python and passes enriched data to the D3.js client-side visualization.

HTML Visualization Architecture

The HTML output is structured to be extensible:

Layout engines are separate JS functions (computeHierarchy, computeRadial, computeCluster, computeFlow) — adding a new layout means adding one function and one toolbar button. The Manual layout mode stops the simulation and pins nodes via fx/fy; a manualTick() function updates the DOM during drag. Save/restore uses localStorage with per-page keys.
Color modes are handled by a single nodeColor(d) function dispatching on colorMode state.
View modes (All / Neighborhood / Impact) are applied via the applyView() function which sets node dimming state.
Display modes (Overview / Readable / Focus / Presentation / Spacious) control label visibility and force spacing via modeSpacing() and labelCollisionR().
Focused Node Exploration is a first-class mode: activateFocusedNode() / applyFocusedNodeView() isolate a selected node's subgraph with sub-modes (local, deps, reverse, impact, flow).
Sidebar tabs are independent panels — new tabs can be added without affecting existing ones.
Minimap provides navigation for large, spacious graphs via buildMinimap().
Node Notes (nodeNotes Map) allow annotations on any node, with in-graph indicators and a modal editor (hidden by default, shown only on user action). Notes are stored in-memory per session.
Author accordion (buildAuthorsTab()) renders expandable per-contributor panels with commit counts, file lists, risk metrics, and clickable file navigation. Selecting an author dims unrelated nodes on the graph.
I18N uses a T(key) function backed by per-language dictionaries; all ~100 UI strings go through this function. Polish and English are fully supported.

CLI Layer (`codemap/cli/`)

Thin Typer commands that parse arguments and delegate to the application layer. No business logic lives here.

Progress Reporting (`cli/progress.py`)

The AnalysisProgress module provides Rich-based progress feedback for long-running operations:

Spinners for indeterminate work (e.g. scanning, git analysis)
Stage labels (► Extracting dependencies…) for each pipeline phase
Success/warning indicators (✓ / !) for completed stages
Progress bars for countable work (available but optional)

All CLI commands wrap operations with AnalysisProgress and handle Ctrl+C gracefully with a clean Cancelled. message and exit code 130.

Large-Graph Performance Architecture

The HTML visualization uses a progressive rendering strategy for large graphs (>200 nodes):

Collapsed Cluster View

When DATA.meta.nodeCount > 200, the visualization automatically:

Collapses groups — Directory groups with >5 files become single cluster nodes
Remaps edges — Edges between collapsed nodes are merged into cluster-level edges
Supports click-to-expand — Clicking a cluster node expands its member files
Rebuilds the simulation — The rebuildGraph() function reinitializes D3 data joins, forces, and event handlers

Adaptive Simulation Parameters

The simParams() function scales force simulation parameters based on node count:

Node Count	Link Distance	Charge	Distance Max	Collision Iterations
≤200	280	-800	1200	4
200–500	320	-1000	1600	3
>500	400	-1200	2000	2

Throttled Tick

On large graphs, the simulation tick handler runs expensive DOM operations (hulls, hotspot rings, entry markers, note indicators) every 3rd tick instead of every tick, reducing CPU overhead.

Performance Mode Toggle

The toolbar includes a Fast / Quality toggle (visible only for large graphs):

Fast — collapsed clusters, stricter zoom-gated labels, throttled ticks
Quality — all clusters expanded, full label visibility

CLI `--fast` Flag

The --fast flag on analyze and graph commands skips git analysis (ownership, churn) entirely, reducing processing time for large repositories to structure-only analysis.

Dependency Flow

CLI → Application → Domain
                  ↘ Infrastructure → Domain
                  ↘ Rendering → Domain

The domain layer is depended on by all other layers but depends on nothing. This ensures the core model can be tested in isolation and evolved independently.

Extension Points

New language analyzer → implement DependencyExtractor, register in analyzer.py
New output format → implement GraphRenderer, register in grapher.py
New ownership strategy → implement OwnershipProvider
New metrics → extend compute_all_metrics() in domain/metrics.py
CI validation rules → consume CodeGraph or CodeMapReport
New performance strategy → extend simParams() in the HTML template or add analysis depth levels in analyzer.py

Theme System

The HTML visualization supports three theme modes:

Mode	Behavior
System (default)	Follows the OS `prefers-color-scheme` media query
Dark	GitHub-inspired dark palette (`:root` variables)
Light	GitHub-inspired light palette (`[data-theme="light"]` override)

Theme selection is persisted in localStorage (codemap-theme key). The applyTheme() function sets or removes the data-theme attribute on <html>. All UI elements use CSS custom properties (--bg0–--bg3, --text, --accent, etc.) ensuring consistent theming across sidebar, toolbar, graph, and overlays.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Architecture

Layer Diagram

Domain Layer (`codemap/domain/`)

Key types

Protocols

Infrastructure Layer (`codemap/infrastructure/`)

FileScanner

GitAnalyzer

Extractors

Application Layer (`codemap/application/`)

Rendering Layer (`codemap/rendering/`)

HTML Visualization Architecture

CLI Layer (`codemap/cli/`)

Progress Reporting (`cli/progress.py`)

Large-Graph Performance Architecture

Collapsed Cluster View

Adaptive Simulation Parameters

Throttled Tick

Performance Mode Toggle

CLI `--fast` Flag

Dependency Flow

Extension Points

Theme System

FilesExpand file tree

architecture.md

Latest commit

History

architecture.md

File metadata and controls

Architecture

Layer Diagram

Domain Layer (codemap/domain/)

Key types

Protocols

Infrastructure Layer (codemap/infrastructure/)

FileScanner

GitAnalyzer

Extractors

Application Layer (codemap/application/)

Rendering Layer (codemap/rendering/)

HTML Visualization Architecture

CLI Layer (codemap/cli/)

Progress Reporting (cli/progress.py)

Large-Graph Performance Architecture

Collapsed Cluster View

Adaptive Simulation Parameters

Throttled Tick

Performance Mode Toggle

CLI --fast Flag

Dependency Flow

Extension Points

Theme System

Domain Layer (`codemap/domain/`)

Infrastructure Layer (`codemap/infrastructure/`)

Application Layer (`codemap/application/`)

Rendering Layer (`codemap/rendering/`)

CLI Layer (`codemap/cli/`)

Progress Reporting (`cli/progress.py`)

CLI `--fast` Flag