Kitty graphics protocol support#24

Open
dsturnbull wants to merge 2 commits into ayosec:graphics from dsturnbull:alacritty-kitty

@dsturnbull

This PR adds kitty graphics protocol support to the graphics branch, building on top of the existing sixel infrastructure. The key insight driving the implementation is that GraphicData is protocol-agnostic — once kitty's APC payload is decoded into a pixel buffer, the entire GPU pipeline (texture upload, cell attachment, shader rendering) works identically to sixel. No renderer changes were needed.

Method

I analysed Kovid's spec, WezTerm, Ghostty, and alacritty-graphics to see what was missing. It turned out not to be a lot: mostly encoding/decoding machinery, which I generated (see docs).

Architecture

                       ┌──── DCS 'q' ──► sixel::Parser ──┐
 PTY ──► vte-graphics ─┤                                  ├──► GraphicData
                       └──── APC 'G' ──► kitty::Parser ──┘         │
                                 │                                 ▼
                                 │ (responses)              insert_graphic()
                                 ▼                                 │
                           PTY write-back                          ▼
                           ESC _Gi=N;OK                    Grid cells get
                                                           GraphicCell refs
                                                                   │
                                                                   ▼
           display::draw() ──► take_queues() ──► GPU upload ──► shader

The new code lives entirely in alacritty_terminal. The rendering side (alacritty/src/display/) is untouched: kitty images flow through the same insert_graphic() → GraphicCell → texture upload → fragment shader path that sixel already uses.

What's implemented

APC routing: vte-graphics is patched (dsturnbull/vte@kitty) to dispatch apc_start/apc_put/apc_end, following the same pattern as the existing DCS hooks. The VTE state machine already recognised the APC_STRING state; it just wasn't wired to callbacks.
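For illustration, the callback wiring could look roughly like the sketch below. Only the three method names (apc_start, apc_put, apc_end) come from the patch described above. Their signatures, the `ApcSink` trait, the `Collector` type, and the simplified `feed` scanner are assumptions made for this example and are not the vte-graphics API.

```rust
/// Hypothetical callback trait; only the method names come from the PR.
trait ApcSink {
    fn apc_start(&mut self);
    fn apc_put(&mut self, byte: u8);
    fn apc_end(&mut self);
}

/// Collects APC strings and keeps those that start with 'G' (kitty graphics).
#[derive(Default)]
struct Collector {
    buf: Vec<u8>,
    commands: Vec<Vec<u8>>,
}

impl ApcSink for Collector {
    fn apc_start(&mut self) {
        self.buf.clear();
    }

    fn apc_put(&mut self, byte: u8) {
        self.buf.push(byte);
    }

    fn apc_end(&mut self) {
        // Route only APC 'G' sequences; drop everything else.
        if self.buf.first() == Some(&b'G') {
            self.commands.push(self.buf[1..].to_vec());
        }
        self.buf.clear();
    }
}

/// Simplified scanner: recognises only `ESC _` ... `ESC \` framing.
/// A real VTE state machine handles many more states than this.
fn feed(sink: &mut impl ApcSink, bytes: &[u8]) {
    let mut i = 0;
    let mut in_apc = false;
    while i < bytes.len() {
        let esc_pair = bytes[i] == 0x1b && i + 1 < bytes.len();
        if !in_apc && esc_pair && bytes[i + 1] == b'_' {
            sink.apc_start();
            in_apc = true;
            i += 2;
        } else if in_apc && esc_pair && bytes[i + 1] == b'\\' {
            sink.apc_end();
            in_apc = false;
            i += 2;
        } else {
            if in_apc {
                sink.apc_put(bytes[i]);
            }
            i += 1;
        }
    }
}
```

Feeding `b"\x1b_Gi=1;AAAA\x1b\\"` through `feed` leaves one entry, `i=1;AAAA`, in `Collector::commands`; APC strings with any other leading byte are discarded.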

Key-value parser (kitty_parser.rs, ~800 LOC) — Streaming parser for the ESC _G<key>=<value>[,...];<payload>ESC \ format. Handles all documented protocol fields including the overloaded animation keys (c=/r=/z= mean different things in frame context vs display context).
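As a rough illustration of the wire format, the sketch below splits the body of one command (the bytes between `ESC _G` and `ESC \`) into control keys and the still-encoded payload. It is a non-streaming simplification: `parse_command` is a hypothetical name, values are kept as strings, and the context-dependent meaning of c=/r=/z= is not modelled.

```rust
use std::collections::HashMap;

/// Split `<key>=<value>[,...];<payload>` into a key map and the payload.
/// Hypothetical helper for illustration; not the PR's kitty_parser.rs.
fn parse_command(body: &str) -> Result<(HashMap<char, String>, &str), String> {
    // Everything after the first ';' is payload; the payload may be absent.
    let (control, payload) = match body.split_once(';') {
        Some((c, p)) => (c, p),
        None => (body, ""),
    };

    let mut keys = HashMap::new();
    for pair in control.split(',').filter(|p| !p.is_empty()) {
        let (k, v) = pair
            .split_once('=')
            .ok_or_else(|| format!("missing '=' in {:?}", pair))?;
        // Protocol keys are single characters such as a, f, i, q.
        let mut chars = k.chars();
        match (chars.next(), chars.next()) {
            (Some(key), None) => {
                keys.insert(key, v.to_string());
            }
            _ => return Err(format!("key must be a single character: {:?}", k)),
        }
    }
    Ok((keys, payload))
}
```

For example, `parse_command("a=T,f=100,i=7;iVBORw0KGgo=")` yields the keys a, f and i plus the payload `iVBORw0KGgo=`, while a multi-character key such as `ab=1` is rejected.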

Transmission mediums (decode.rs, ~1100 LOC):

  • t=d Direct — inline base64, with lenient padding handling for clients like chafa that independently encode each chunk
  • t=f File — base64-encoded file path with offset/size support
  • t=t Temp file — relative to $TMPDIR, auto-deleted after read, path traversal validation rejects ../ and absolute paths
  • t=s Shared memory — POSIX shm_open + mmap, shm_unlink after read (unix-gated)
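The path traversal check for the temp file medium could be sketched as follows. `resolve_temp_path` is a hypothetical helper, not code from decode.rs, and it only inspects path components; it does not resolve symlinks or verify that the file exists.

```rust
use std::path::{Component, Path, PathBuf};

/// Resolve a client-supplied temp-file name against `tmp_dir`, rejecting
/// absolute paths and any `..` component. Hypothetical helper.
fn resolve_temp_path(tmp_dir: &Path, requested: &str) -> Option<PathBuf> {
    let rel = Path::new(requested);
    if requested.is_empty() || rel.is_absolute() {
        return None;
    }
    // Allow only plain names and `.`; `..`, root and prefix components fail.
    let only_normal = rel
        .components()
        .all(|c| matches!(c, Component::Normal(_) | Component::CurDir));
    if only_normal {
        Some(tmp_dir.join(rel))
    } else {
        None
    }
}
```

With `/tmp` as the base directory, `img.rgba` resolves to `/tmp/img.rgba`, while `../etc/passwd`, `a/../../b` and `/etc/passwd` are all rejected.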

Decode pipeline: base64 → zlib (optional, o=z) → format dispatch (RGB f=24, RGBA f=32, PNG f=100) → GraphicData. Chunked transfers (m=0/m=1) decode each chunk's base64 on arrival rather than concatenating base64 strings, which handles the chafa per-chunk-padding case correctly.
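To show why decoding per chunk matters, here is a minimal sketch with a hand-rolled, padding-tolerant base64 decoder so that it stays dependency-free. `decode_chunk` and `decode_chunks` are hypothetical names; the actual code may rely on a base64 crate instead.

```rust
/// Decode one base64 chunk, accepting both padded and unpadded input.
/// Returns None on an invalid character or an impossible length.
fn decode_chunk(chunk: &str) -> Option<Vec<u8>> {
    let data = chunk.trim_end_matches('=').as_bytes();
    if data.len() % 4 == 1 {
        return None; // a single leftover sextet cannot encode a byte
    }
    let mut out = Vec::with_capacity(data.len() * 3 / 4);
    let (mut acc, mut bits) = (0u32, 0u32);
    for &b in data {
        let v = match b {
            b'A'..=b'Z' => b - b'A',
            b'a'..=b'z' => b - b'a' + 26,
            b'0'..=b'9' => b - b'0' + 52,
            b'+' => 62,
            b'/' => 63,
            _ => return None, // includes '=' in the middle of a chunk
        };
        acc = (acc << 6) | u32::from(v);
        bits += 6;
        if bits >= 8 {
            bits -= 8;
            out.push((acc >> bits) as u8);
            acc &= (1u32 << bits) - 1; // keep only the undecoded low bits
        }
    }
    Some(out)
}

/// Decode each chunk on arrival and append the bytes, rather than joining
/// the base64 text of all chunks before decoding.
fn decode_chunks(chunks: &[&str]) -> Option<Vec<u8>> {
    let mut image = Vec::new();
    for chunk in chunks {
        image.extend(decode_chunk(chunk)?);
    }
    Some(image)
}
```

Two independently padded chunks such as `aGk=` and `aGk=` decode cleanly one at a time, whereas their concatenation `aGk=aGk=` has padding in the middle and is rejected by this decoder.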

Image storage (state.rs, ~1060 LOC) — HashMap<u32, KittyImage> with 320 MiB quota and LRU eviction (matching kitty/ghostty/wezterm). Image number (I=) → image ID (i=) mapping. All 11 delete target pairs:

| Target | Pairs | Description |
|---|---|---|
| All | d=a/A | Clear all images |
| By ID | d=i/I | Delete by image ID |
| By number | d=n/N | Delete by image number |
| Cursor | d=c/C | Delete at cursor position |
| Column | d=q/Q | Delete intersecting column |
| Row | d=r/R | Delete intersecting row |
| Cell | d=x/X | Delete at specific cell |
| Cell+Z | d=y/Y | Delete at cell with z-index match |
| Z-index | d=z/Z | Delete by z-index |
| Placement | d=p/P | Delete by placement ID |
| Anim frame | d=f/F | Delete animation frames |

The IncludingScrollback (uppercase) variants clean up orphaned images from storage.
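The quota and LRU eviction behaviour of the image storage can be sketched as below. `ImageStore` and `StoredImage` are hypothetical stand-ins for the types in state.rs, and the linear scan for the oldest entry is chosen for brevity rather than speed.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for KittyImage with only the fields needed here.
struct StoredImage {
    pixels: Vec<u8>,
    last_used: u64,
}

struct ImageStore {
    images: HashMap<u32, StoredImage>,
    used_bytes: usize,
    quota_bytes: usize,
    clock: u64, // logical timestamp, bumped on every insert or touch
}

impl ImageStore {
    fn new(quota_bytes: usize) -> Self {
        Self { images: HashMap::new(), used_bytes: 0, quota_bytes, clock: 0 }
    }

    /// Insert an image, evicting least-recently-used entries until it fits.
    /// Returns false if the image alone exceeds the quota.
    fn insert(&mut self, id: u32, pixels: Vec<u8>) -> bool {
        if pixels.len() > self.quota_bytes {
            return false;
        }
        self.remove(id); // re-sending an existing ID replaces the old data
        while self.used_bytes + pixels.len() > self.quota_bytes {
            let oldest = self
                .images
                .iter()
                .min_by_key(|(_, img)| img.last_used)
                .map(|(id, _)| *id);
            match oldest {
                Some(victim) => self.remove(victim),
                None => break,
            }
        }
        self.clock += 1;
        self.used_bytes += pixels.len();
        self.images.insert(id, StoredImage { pixels, last_used: self.clock });
        true
    }

    /// Mark an image as recently used, for example when it is placed.
    fn touch(&mut self, id: u32) {
        self.clock += 1;
        if let Some(img) = self.images.get_mut(&id) {
            img.last_used = self.clock;
        }
    }

    fn remove(&mut self, id: u32) {
        if let Some(img) = self.images.remove(&id) {
            self.used_bytes -= img.pixels.len();
        }
    }
}
```

A store matching the quota mentioned above would be created with `ImageStore::new(320 * 1024 * 1024)`.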

Placement (placement.rs, ~600 LOC) — Source rectangle cropping (x=/y=/w=/h=), cell-based scaling (c=/r= via bilinear resize), sub-cell pixel offsets (X=/Y= via transparent padding), cursor movement control (C=1). Pipeline: crop → scale → offset → insert_graphic.
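The cell-based scaling rules (both c= and r= set gives exact dimensions, only one set scales the other axis proportionally) can be sketched as a pure size calculation. `target_size` and its parameters are hypothetical, and the caller is assumed to map a value of 0 to `None`.

```rust
/// Scale `value` by num/den in 64-bit arithmetic, never returning 0.
fn scale(value: u32, num: u32, den: u32) -> u32 {
    let scaled = u64::from(value) * u64::from(num) / u64::from(den.max(1));
    scaled.clamp(1, u64::from(u32::MAX)) as u32
}

/// Target pixel size for a placement given optional c=/r= values.
/// `cell_w` and `cell_h` are the cell dimensions in pixels.
fn target_size(
    src_w: u32,
    src_h: u32,
    cols: Option<u32>,
    rows: Option<u32>,
    cell_w: u32,
    cell_h: u32,
) -> (u32, u32) {
    match (cols, rows) {
        // Both set: exact target dimensions, aspect ratio may change.
        (Some(c), Some(r)) => (c.saturating_mul(cell_w), r.saturating_mul(cell_h)),
        // Only columns: the width is fixed, the height keeps the aspect ratio.
        (Some(c), None) => {
            let w = c.saturating_mul(cell_w);
            (w, scale(src_h, w, src_w))
        }
        // Only rows: the height is fixed, the width keeps the aspect ratio.
        (None, Some(r)) => {
            let h = r.saturating_mul(cell_h);
            (scale(src_w, h, src_h), h)
        }
        // Neither set: no scaling.
        (None, None) => (src_w, src_h),
    }
}
```

With 10×20 pixel cells, a 200×100 image placed with c=10 alone becomes 100×50, and with c=4,r=2 it becomes 40×40.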

Animation (animation.rs, ~1120 LOC) — Frame storage with lazy single→multi-frame promotion (following the WezTerm pattern). a=f loads frames with base-frame copy and positional blitting. a=c composites between frames (alpha blending and overwrite modes). a=a controls playback (start/stop/loading, loop count). Note: the animation tick loop (advancing frames on a timer and re-uploading textures) is not yet wired into the display event loop — frames are stored and composed but not auto-advanced.
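The two compositing modes can be illustrated with a small positional blit over straight-alpha RGBA buffers. This is a sketch under assumptions: `blend_over` and this `blit` signature are hypothetical, and floating-point blending is used for clarity rather than performance.

```rust
/// Composite one straight-alpha RGBA pixel over another (source-over).
fn blend_over(src: [u8; 4], dst: [u8; 4]) -> [u8; 4] {
    let sa = f32::from(src[3]) / 255.0;
    let da = f32::from(dst[3]) / 255.0;
    let out_a = sa + da * (1.0 - sa);
    if out_a <= 0.0 {
        return [0; 4];
    }
    let mut out = [0u8; 4];
    for i in 0..3 {
        let c = (f32::from(src[i]) * sa + f32::from(dst[i]) * da * (1.0 - sa)) / out_a;
        out[i] = c.round().clamp(0.0, 255.0) as u8;
    }
    out[3] = (out_a * 255.0).round().clamp(0.0, 255.0) as u8;
    out
}

/// Blit `src` (src_w pixels wide) onto `dst` (dst_w pixels wide) at (x, y),
/// clipping at the destination edges. `overwrite` replaces pixels outright.
fn blit(
    dst: &mut [u8],
    dst_w: usize,
    src: &[u8],
    src_w: usize,
    x: usize,
    y: usize,
    overwrite: bool,
) {
    if dst_w == 0 || src_w == 0 {
        return;
    }
    let dst_h = dst.len() / (dst_w * 4);
    let src_h = src.len() / (src_w * 4);
    for row in 0..src_h {
        for col in 0..src_w {
            let (dx, dy) = (x + col, y + row);
            if dx >= dst_w || dy >= dst_h {
                continue; // clipped
            }
            let s = (row * src_w + col) * 4;
            let d = (dy * dst_w + dx) * 4;
            let sp = [src[s], src[s + 1], src[s + 2], src[s + 3]];
            let dp = [dst[d], dst[d + 1], dst[d + 2], dst[d + 3]];
            let out = if overwrite { sp } else { blend_over(sp, dp) };
            dst[d..d + 4].copy_from_slice(&out);
        }
    }
}
```

In blend mode an opaque source pixel replaces the destination and a fully transparent one leaves it unchanged; in overwrite mode the source pixel is copied as-is, alpha included.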

Protocol responses (response.rs) — ESC _Gi=<id>;OK ESC \ / ESC _Gi=<id>;EINVAL:<msg> ESC \ via Event::PtyWrite. Respects q=1 (suppress OK) and q=2 (suppress all). No response when i=0 and I=0 (matching kitty's finish_command_response behaviour).
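The response rules above can be summarised in a small sketch. `build_response` is a hypothetical function, not the code in response.rs; including `I=` in the reply when an image number was used is an assumption based on my reading of the kitty spec.

```rust
/// Build the reply for a graphics command, or None when no reply is sent.
/// `error` is None on success, or the full error text such as "EINVAL:<msg>".
fn build_response(id: u32, number: u32, quiet: u8, error: Option<&str>) -> Option<String> {
    // Commands without an image ID or image number never get a reply.
    if id == 0 && number == 0 {
        return None;
    }
    let message = match (error, quiet) {
        (_, q) if q >= 2 => return None, // q=2 suppresses everything
        (None, 1) => return None,        // q=1 suppresses OK only
        (None, _) => "OK".to_string(),
        (Some(msg), _) => msg.to_string(),
    };
    let mut keys = Vec::new();
    if id != 0 {
        keys.push(format!("i={}", id));
    }
    if number != 0 {
        keys.push(format!("I={}", number));
    }
    // Framing: ESC _G <keys> ; <message> ESC \
    Some(format!("\x1b_G{};{}\x1b\\", keys.join(","), message))
}
```

For example, `build_response(7, 0, 1, None)` returns `None`, while the same call with an error still produces a reply because q=1 only suppresses OK.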

Testing

184 #[test] functions across the modules:

  • 13 end-to-end integration tests that feed raw APC bytes through the full VTE → Parser → Term → dispatch pipeline and verify state
  • Unit tests across all streams (decode, state, placement, animation, response, parser)
  • scripts/test_kitty_graphics.py — 25 manual protocol-level tests with visual ✓/✗ feedback (query, PNG, chunked, file/tempfile/shm mediums, delete modes, scaling, pixel offsets, animation frames)
  • scripts/demo_kitty_graphics.sh — real-world validation with timg, chafa, and mpv

What works end-to-end

  • timg -p kitty image.png ✓
  • timg -p kitty animation.gif ✓
  • chafa -f kitty image.png ✓
  • kitty +kitten icat image.png ✓ (chunked PNG transfer)
  • Direct RGBA/RGB transmission ✓
  • File and shared memory transmission ✓
  • Placement, cropping, scaling, pixel offsets ✓
  • Delete by all target types ✓

New dependencies

| Crate | Purpose |
|---|---|
| flate2 1.1 | zlib decompression (o=z) |
| png 0.18 | PNG decoding (f=100); also fixes a breaking API change in the window icon loader |
| image 0.25 (minimal, png+rayon features) | Bilinear scaling for c=/r= cell-based resize |
| tempfile 3 (dev) | Integration tests for temp file medium |

Segfaults with mpv video playback — looking for input

mpv --vo=kitty and timg -V (video mode) send images at 30+ fps, effectively stress-testing the graphics pipeline. Under this load, both tools reliably trigger a segfault in mpv. n.b. this also happens in ghostty!

David Turnbull added 2 commits March 3, 2026 19:58
Implements the kitty graphics protocol for inline image display:

Phase 0 — APC Routing:
- Vendored vte-graphics with APC state machine support
- Added apc_start/apc_put/apc_end to Perform and Handler traits

Phase 1 — Static Image Display:
- Key-value parser (kitty_parser.rs) for all protocol fields
- Chunked transfer accumulation (m=0/m=1)
- Decode pipeline: base64 → zlib → PNG/RGB/RGBA → GraphicData
- Image storage with 320 MiB quota and eviction
- Placement via existing insert_graphic() pipeline
- Protocol responses (OK/EINVAL) with quiet mode support
- Delete support (d=a/i/n, others stubbed)
- Source rectangle cropping, cursor movement control (C=1)

Modularised into kitty/{mod,state,decode,placement,response}.rs
for parallel development of Phase 2 features.

Validated end-to-end: timg -p kitty renders images correctly.
43 tests (unit + integration), 0 failures.
Clippy clean on alacritty_terminal and vte-graphics.
…es, scaling, animation

Phase 2 implements four parallel work streams on top of the Phase 0+1
baseline (static image display via direct base64 transmission).

Stream A — File & Shared Memory Transmission (decode.rs)
  - Medium::File (t=f): read image data from base64-encoded file path
    with offset (O=) and size (S=) support
  - Medium::TempFile (t=t): relative to temp dir, auto-deleted after read,
    path traversal validation rejects ../ and absolute paths
  - Medium::SharedMemory (t=s): POSIX shm_open + mmap, unix-gated,
    shm_unlink after read
  - Per-chunk base64 decoding: handles clients like chafa that
    independently encode each chunk (vs spec's split-one-string model)
  - Lenient dimension check: truncates trailing bytes when slightly over
    expected size (matches kitty's buf_used >= data_sz behavior)

Stream B — Full Delete Modes (state.rs)
  - KittyPlacement tracking: col, row, width, height, z_index per placement
  - All 11 delete target pairs implemented: cursor (d=c/C),
    placement ID (d=p/P), column (d=q/Q), row (d=r/R), cell (d=x/X),
    cell+z (d=y/Y), z-index (d=z/Z), animation frames (d=f/F)
  - IncludingScrollback variants clean up orphaned images from storage
  - Cursor position passed through from term grid

Stream C — Scaling & Pixel Offsets (placement.rs)
  - Cell-based scaling (c=/r=): bilinear resize via image crate
    - Both set: exact target dimensions
    - Only columns: proportional height
    - Only rows: proportional width
  - Sub-cell pixel offsets (X=/Y=): transparent padding approach,
    RGB→RGBA promotion when padding needed
  - Pipeline: crop → scale → offset → insert_graphic

Stream D — Animation Frame Storage (animation.rs, new module)
  - AnimationState/AnimationFrame data structures
  - Lazy single→multi-frame promotion (WezTerm pattern)
  - blit() with alpha compositing (source-over) and overwrite modes
  - a=f handler: load_animation_frame with base frame copy, positional
    blitting, frame editing by index
  - a=c handler: compose_frames between frames in same animation
  - a=a handler: control_animation (start/stop/loading, loop count)
  - Wired into mod.rs dispatch (replaces stubs)

Protocol Compatibility
  - Response suppression when id=0 (matches kitty's finish_command_response)
  - DecodePaddingMode::Indifferent for base64 (accepts padded/unpadded)

Tests: 347 total (was 43), 0 failures, clippy clean
  - 13 end-to-end integration tests (raw APC → VTE → Term → state verification)
  - Unit tests across all streams (decode, state, placement, animation)
  - Manual test script: 25 protocol-level tests with ✓/✗ feedback
  - Real-world demo script: timg, chafa, mpv visual validation
    with terminal mode reset after each tool invocation
@wkr111

wkr111 commented Mar 4, 2026

🎉I can't believe you did something so amazing!!!🎉🥳
