web3dev1337 · Mar 5, 2026
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -20,7 +20,7 @@ HYTOPIA is a multiplayer voxel game engine monorepo. The **server** (TypeScript/
 - **Physics**: Rapier3D (`@dimforge/rapier3d-simd-compat`) at 60 Hz, default gravity `y = -32`
 - **Networking**: WebTransport (QUIC) preferred, WebSocket fallback. Packets serialized with msgpackr, large payloads gzip-compressed
 - **Protocol**: `protocol/` defines all packet schemas (AJV-validated). Published as `@hytopia.com/server-protocol`
-- **Rendering**: Three.js `WebGLRenderer` + `MeshBasicMaterial` (no dynamic lights). Post-processing: SMAA, bloom, outline. Chunk meshes built in Web Worker via greedy meshing + AO
+- **Rendering**: Three.js `WebGLRenderer` + `MeshBasicMaterial` (no dynamic lights). Post-processing: SMAA, bloom, outline. Chunk meshes built in a Web Worker with face culling + AO (no greedy quad merging on `master`)
 - **Persistence**: `@hytopia.com/save-states` for player/global KV data
 - **Singleton pattern**: Most server systems use `ClassName.instance`; client systems owned by `Game` singleton
 

diff --git a/CODEBASE_DOCUMENTATION.md b/CODEBASE_DOCUMENTATION.md
@@ -185,7 +185,7 @@ blocks/BlockTextureAtlasManager.ts - Texture atlas generation
 blocks/utils.ts - Block utilities
 chunks/Chunk.ts - Client chunk state
 chunks/ChunkManager.ts - Chunk lifecycle (load/unload by distance)
-chunks/ChunkMeshManager.ts - Greedy meshing + AO for voxel geometry
+chunks/ChunkMeshManager.ts - Batch meshes from worker output (per-face meshing with face culling + AO on `master`)
 chunks/ChunkRegistry.ts - Chunk lookup
 chunks/ChunkConstants.ts - Chunk size constants
 chunks/ChunkStats.ts - Chunk performance stats
@@ -423,4 +423,4 @@ zombies-fps/ - Zombie FPS
 - **Dual transport** — WebTransport (QUIC) preferred, WebSocket fallback. Reliable stream + unreliable datagrams
 - **msgpackr serialization** — All packets serialized with msgpackr, large payloads gzip-compressed
 - **60 Hz physics / 30 Hz network** — Server physics ticks at 60 Hz, network sync flushes every 2 ticks
-- **Web Worker meshing** — Client offloads greedy meshing + AO to a dedicated Web Worker
+- **Web Worker meshing** — Client offloads chunk meshing + AO to a dedicated Web Worker (per-face meshing with face culling; no greedy quad merging on `master`)
diff --git a/ai-memory/docs/perf-external-notes-2026-03-05/FINDINGS.md b/ai-memory/docs/perf-external-notes-2026-03-05/FINDINGS.md
@@ -0,0 +1,80 @@
+# External Notes vs. HYTOPIA Source (Verification + PR Cross-Check)
+
+Base reference for verification in this branch: `origin/master` at `24a295d` (2026-03-05).
+
+## What Was Imported
+
+Unmodified external notes live in `ai-memory/docs/perf-external-notes-2026-03-05/raw/`.
+
+## Quick Take
+
+The external docs mix:
+
+- **Accurate observations about the current client** (notably: face culling exists; greedy meshing does not; geometry churn is high; packet decompression is synchronous).
+- **Roadmap/architecture assumptions that do not match `master`** (procedural streaming, time-budgeted collider queues, LOD/occlusion/face-limit systems, several referenced constants/functions).
+
+So: use them as *idea input*, but treat many “current state” statements as unverified unless they point to code that exists on `master`.
+
+## Claim Verification (Against `master`)
+
+### Client meshing/rendering
+
+- ✅ **Face culling exists**: `client/src/workers/ChunkWorker.ts` culls faces when neighbor blocks are solid/opaque.
+- ❌ **Greedy meshing is not implemented**: `client/src/workers/ChunkWorker.ts` emits per-face quads (4 vertices per visible face) with no quad merging pass.
+- ❌ **Vertex pooling is not present**: `client/src/chunks/ChunkMeshManager.ts` recreates a new `BufferGeometry` for each batch update and disposes the old geometry.
+- ❌ **LOD / cave occlusion / “face limit safety caps” described in notes are not found** via repo search on `client/src/` (`lod`, `occlusion`, face-count thresholds, BFS visibility, etc.).
+
+### Client networking
+
+- ✅ **Synchronous gzip decompression on the main thread**: `client/src/network/NetworkManager.ts` calls `gunzipSync` (fflate) before msgpack decode.
+
+### Server networking (entity/chunk sync)
+
+- ✅ **Entity pos/rot are a dominant sync path (and split to unreliable when pos/rot-only)**: `server/src/networking/NetworkSynchronizer.ts`.
+- ❌ **No entity quantization/delta fields exist today**: `protocol/schemas/Entity.ts` has only `p` (Vector) and `r` (Quaternion). `server/src/networking/Serializer.ts` serializes full float arrays.
+- ❌ **No chunk pacing/segmentation is implemented**: `server/src/networking/NetworkSynchronizer.ts` batches *all queued chunk syncs* into a single packet each sync.
+
+### Server colliders / chunk streaming
+
+Several external docs reference a *procedural streaming* pipeline (chunks-per-tick, queued collider chunk processing, async region I/O). Those specific codepaths/constants (e.g. `CHUNKS_PER_TICK`, `processPendingColliderChunks`, `COLLIDER_MAX_CHUNK_DISTANCE`, `server/src/worlds/maps/*`) are **not present on `master`**.
+
+## Notable Errors / Corrections in the Notes
+
+- **Quantized position range math is wrong as written**:
+  - If you encode `pq = round(x * 256)` into **int16**, the representable world range is about **±128 blocks**, not ±32768 blocks.
+  - To keep **1/256 block precision** over large worlds, you need larger integers (e.g. int32), smaller quantization, or chunk-relative encoding.
+
+## How This Relates to Your Performance PRs
+
+PRs authored by you that touch performance (as of 2026-03-05):
+
+- #2 (OPEN) `analysis/codebase-audit`: https://github.com/web3dev1337/hytopia-source/pull/2
+- #3 (OPEN) `docs/iphone-pro-performance-analysis`: https://github.com/web3dev1337/hytopia-source/pull/3
+- #4 (OPEN) `fix/fps-cap-medium-low`: https://github.com/web3dev1337/hytopia-source/pull/4
+- #5 (OPEN) `fix/cap-mobile-dpr`: https://github.com/web3dev1337/hytopia-source/pull/5
+- #6 (OPEN) `feature/map-compression`: https://github.com/web3dev1337/hytopia-source/pull/6
+- #7 (OPEN) `review/mirror-upstream-pr-9`: https://github.com/web3dev1337/hytopia-source/pull/7
+- #8 (OPEN) `review/mirror-upstream-pr-10` (stacked on #7): https://github.com/web3dev1337/hytopia-source/pull/8
+- #9 (OPEN) `review/mirror-upstream-pr-11`: https://github.com/web3dev1337/hytopia-source/pull/9
+- #10 (CLOSED) `fix/cap-mobile-devicepixelratio` (superseded): https://github.com/web3dev1337/hytopia-source/pull/10
+
+Where they overlap with the external notes:
+
+- **High-DPI / mobile GPU load**:
+  - #4 adds a 60 FPS cap for MEDIUM/LOW (matches the “uncapped 120 Hz” problem described in #3).
+  - #5 caps mobile pixel ratio (matches the “3x DPR” issue described in #3).
+  - #9 introduces a **pixel-budget-based** effective pixel ratio and reduces outline overhead (complementary to #3).
+- **Outline pass overhead**:
+  - #9 removes per-mesh define mutation in `SelectiveOutlinePass` by prebuilding shader variants (reduces CPU/shader churn). It does **not** reduce the outline shader’s sampling cost.
+- **View-distance mesh visibility**:
+  - `master` currently iterates all batch meshes each frame. #9 adds cached visibility sets and updates visibility only when the camera crosses a “cell” boundary or settings change.
+- **Map size / load time**:
+  - #6 (compressed maps) addresses the external “JSON map size” concern; the external “binary streaming maps” discussion is broader than #6’s scope.
+
+## What’s Still Missing (Relative to the External Notes + Your PRs)
+
+- **Greedy meshing / quad merging** in `client/src/workers/ChunkWorker.ts`.
+- **Entity sync quantization / deltas / distance-based rates** (protocol + serializer + client deserializer work).
+- **Chunk packet pacing/segmentation** to avoid bursty chunk arrays at join / fast movement.
+- **Off-main-thread decompression/decoding** for network payloads (or reduced reliance on synchronous `gunzipSync` calls).
+
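Reviewer note on the quantization correction in `FINDINGS.md`: the arithmetic can be checked with a small sketch. The helper names below (`quantizedRange`, `quantize`, `quantizeChunkRelative`) and the chunk size of 16 are assumptions for illustration only; nothing like them exists in `protocol/` or `Serializer.ts` on `master`.

```typescript
// Hypothetical helpers for checking fixed-point position ranges.
// None of these exist in the HYTOPIA codebase; they only illustrate the math.

// World-space range (in blocks) representable by a signed integer of `bits` width
// when positions are stored as round(x * stepsPerBlock).
function quantizedRange(bits: number, stepsPerBlock: number): { min: number; max: number } {
  const minInt = -Math.pow(2, bits - 1);
  const maxInt = Math.pow(2, bits - 1) - 1;
  return { min: minInt / stepsPerBlock, max: maxInt / stepsPerBlock };
}

// Quantize a coordinate, throwing if it does not fit the signed integer width.
function quantize(x: number, stepsPerBlock: number, bits: number): number {
  const q = Math.round(x * stepsPerBlock);
  const minInt = -Math.pow(2, bits - 1);
  const maxInt = Math.pow(2, bits - 1) - 1;
  if (q < minInt || q > maxInt) {
    throw new RangeError(`${x} does not fit in int${bits} at 1/${stepsPerBlock} precision`);
  }
  return q;
}

// Chunk-relative alternative: an integer chunk index plus a small local offset.
// With an assumed chunk size of 16 and 256 steps per block, the offset stays
// within [0, 4096], which fits comfortably in an int16.
function quantizeChunkRelative(
  x: number,
  chunkSize: number = 16,
  stepsPerBlock: number = 256,
): { chunk: number; offset: number } {
  const chunk = Math.floor(x / chunkSize);
  const offset = Math.round((x - chunk * chunkSize) * stepsPerBlock);
  return { chunk, offset };
}

// Inverse of quantizeChunkRelative.
function dequantizeChunkRelative(
  chunk: number,
  offset: number,
  chunkSize: number = 16,
  stepsPerBlock: number = 256,
): number {
  return chunk * chunkSize + offset / stepsPerBlock;
}
```

`quantizedRange(16, 256)` gives a minimum of -128 blocks and a maximum just under 128, which is the ±128 figure stated in the correction, while `quantizedRange(32, 256)` extends to roughly ±8.4 million blocks. The chunk-relative form keeps the per-axis offset small at the cost of sending a chunk index alongside it.
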
diff --git a/ai-memory/docs/perf-external-notes-2026-03-05/README.md b/ai-memory/docs/perf-external-notes-2026-03-05/README.md
@@ -0,0 +1,6 @@
+# External Performance Notes (Imported)
+
+These documents were copied from the Windows mount (`/mnt/c/Users/AB/Downloads`) on **2026-03-05** and are treated as *unverified external notes*.
+
+- Canonical copies live in `raw/`.
+- Some downloads existed as duplicate filenames with ` (1)` suffixes; those duplicates are preserved under `raw/duplicates/` for traceability.
diff --git a/...emory/docs/perf-external-notes-2026-03-05/raw/COLLIDER_ARCHITECTURE_RESEARCH.md b/...emory/docs/perf-external-notes-2026-03-05/raw/COLLIDER_ARCHITECTURE_RESEARCH.md
@@ -0,0 +1,159 @@
+# Collider Architecture Research
+
+**Purpose:** Guide the refactor of Hytopia’s block collider system from O(world) to O(nearby chunks).  
+**Audience:** Engineers implementing Phase 1 (Collider Locality) and Phase 2 (Incremental Voxel Updates).
+
+---
+
+## 1. Current Architecture
+
+### 1.1 Block Type → Collider
+
+- One collider per **block type** (dirt, stone, etc.), not per block.
+- Voxel collider: Rapier voxel grid; each cell = block present/absent.
+- Trimesh collider: Used for non-cube blocks; rebuilt when any block of that type changes.
+
+### 1.2 Critical Path
+
+```
+setBlock / addChunkBlocks
+  → _addBlockTypePlacement
+  → _getBlockTypePlacements()   // iterates ALL chunks of this block type
+  → _combineVoxelStates(collider)  // merges placements into voxel grid
+  → collider.addToSimulation / setVoxel
+```
+
+**Problem:** `_getBlockTypePlacements` and `_combineVoxelStates` touch every chunk that contains the block type, so the cost of each collider update grows with world size (O(world)) instead of with the area around players.
+
+---
+
+## 2. Target Architecture: Spatial Locality
+
+### 2.1 Principle
+
+- Colliders should only include blocks from chunks **within N chunks of any player** (e.g. N=4).
+- When a chunk unloads (player moves away), remove its blocks from colliders.
+- When a chunk loads, add its blocks to colliders only if it’s within the active radius.
+
+### 2.2 Data Structure Change
+
+**Current:** `_blockTypePlacements` is global (or implicitly spans all chunks).
+
+**Target:** Maintain a **spatial index**:
+
+```ts
+// Chunk key (bigint) → for each block type in that chunk: Set of global coordinates
+private _chunkBlockPlacements: Map<bigint, Map<number, Set<string>>> = new Map();
+
+// Active chunk keys: chunks within COLLIDER_RADIUS of any player
+private _activeColliderChunkKeys: Set<bigint> = new Set();
+```
+
+- On chunk load: add chunk key to index; add block placements.
+- On chunk unload: remove chunk key; remove blocks from colliders.
+- `_getBlockTypePlacements` for collider: only return placements from `_activeColliderChunkKeys`.
+- `_combineVoxelStates`: only iterate over placements from active chunks.
+
+### 2.3 Update Flow
+
+```
+Player moves
+  → Update _activeColliderChunkKeys (chunks within radius)
+  → For chunks that left radius: remove from colliders
+  → For chunks that entered radius: add to colliders
+  → _combineVoxelStates only over active placements
+```
+
+---
+
+## 3. Incremental Voxel Updates
+
+### 3.1 Current
+
+- Adding a chunk: all 4096 blocks added at once to the voxel collider.
+- Heavy: `setVoxel` 4096 times + propagation.
+
+### 3.2 Target
+
+- Add blocks in **batches** (e.g. 256–512 per tick).
+- Time-budget: stop when budget exceeded; resume next tick.
+- Rapier voxel API: check if it supports incremental `setVoxel` without full rebuild.
+
+### 3.3 Implementation Sketch
+
+```ts
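+// Sketch only. `ChunkLike`, `VoxelColliderLike`, and the collider lookup are assumed
+// shapes for illustration; they are not verified against `master`.
+interface ChunkLike {
+  blockCountForType(blockTypeId: number): number;
+  // Assumed to map the i-th block of a type to its global coordinate.
+  getGlobalCoordinateFromIndex(index: number): { x: number; y: number; z: number };
+}
+interface VoxelColliderLike {
+  setVoxel(coord: { x: number; y: number; z: number }, filled: boolean): void;
+}
+
+class PendingVoxelQueue {
+  private _pendingVoxelAdds: Array<{ chunk: ChunkLike; blockTypeId: number; nextIndex: number }> = [];
+
+  constructor(private _getCollider: (blockTypeId: number) => VoxelColliderLike) {}
+
+  // Drains queued voxel adds in batches of up to 256 until the time budget is spent.
+  processPendingVoxelAdds(timeBudgetMs: number): void {
+    const start = performance.now();
+    while (this._pendingVoxelAdds.length > 0 && performance.now() - start < timeBudgetMs) {
+      const next = this._pendingVoxelAdds[0];
+      const total = next.chunk.blockCountForType(next.blockTypeId);
+      const count = Math.min(256, total - next.nextIndex);
+      const collider = this._getCollider(next.blockTypeId);
+      for (let i = 0; i < count; i++) {
+        const globalCoord = next.chunk.getGlobalCoordinateFromIndex(next.nextIndex + i);
+        collider.setVoxel(globalCoord, true);
+      }
+      next.nextIndex += count;
+      if (next.nextIndex >= total) {
+        this._pendingVoxelAdds.shift(); // this chunk/type pair is fully added
+      }
+    }
+  }
+}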
+```
+
+---
+
+## 4. Trimesh Optimization
+
+### 4.1 Current
+
+- Trimesh collider rebuilt whenever any block of that type is added/removed.
+- Rebuild = collect all placements, generate mesh, replace collider.
+
+### 4.2 Options
+
+1. **Spatial locality:** Only include trimesh blocks from active chunks. Reduces vertex count for large worlds.
+2. **Deferred rebuild:** Queue rebuild; execute in next tick within time budget.
+3. **Per-chunk trimesh:** If block type is sparse, consider per-chunk trimesh instances instead of one giant trimesh. (Larger change.)
+
+**Recommendation:** Start with (1) and (2). (3) is Phase 6.
+
+---
+
+## 5. Collider Unload
+
+When a chunk unloads:
+
+1. Remove its block placements from the spatial index.
+2. For each block type in that chunk:
+   - Voxel: `setVoxel(coord, false)` for each placement.
+   - Trimesh: trigger rebuild (only over active chunks).
+3. Remove chunk from `_activeColliderChunkKeys`.
+
+---
+
+## 6. Rapier Voxel API Notes
+
+- Check `rapier3d` docs for `ColliderDesc.heightfield` vs `ColliderDesc.voxel`.
+- Voxel colliders: typically a 3D grid; `setVoxel` may or may not support incremental updates.
+- If full rebuild required per update: minimize rebuild frequency (batch changes) and scope (active chunks only).
+
+---
+
+## 7. Success Criteria
+
+| Metric | Before | After |
+|--------|--------|-------|
+| Chunks scanned per collider update | O(world) | O(active) ~100–300 |
+| Time per `_combineVoxelStates` | 5–50 ms | <2 ms |
+| Collider add spikes | Full chunk at once | Batched, time-budgeted |
+
+---
+
+## References
+
+- `ChunkLattice.ts` – `_addChunkBlocksToColliders`, `_combineVoxelStates`, `_getBlockTypePlacements`
+- Rapier3D voxel API
+- Minecraft: per-section collision, spatial culling
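
Reviewer note on section 2.3 of `COLLIDER_ARCHITECTURE_RESEARCH.md`: the "entered radius / left radius" step is a set difference between the previous and next active chunk sets. The sketch below is a minimal illustration under stated assumptions: it uses string keys instead of the `bigint` keys in the notes, treats the radius as a cube (Chebyshev distance) around each player's chunk, and every name in it (`chunkKey`, `chunksWithinRadius`, `diffActiveSets`) is hypothetical rather than taken from `ChunkLattice.ts`.

```typescript
// Hypothetical sketch of the active-chunk update from section 2.3.
// Names and key format are assumptions, not HYTOPIA APIs.

type ChunkCoord = { x: number; y: number; z: number };

// String key for a chunk coordinate (the notes propose bigint keys instead).
function chunkKey(c: ChunkCoord): string {
  return `${c.x},${c.y},${c.z}`;
}

// All chunk keys within `radius` chunks (cube-shaped) of any player's chunk.
function chunksWithinRadius(playerChunks: ChunkCoord[], radius: number): Set<string> {
  const active = new Set<string>();
  for (const p of playerChunks) {
    for (let dx = -radius; dx <= radius; dx++) {
      for (let dy = -radius; dy <= radius; dy++) {
        for (let dz = -radius; dz <= radius; dz++) {
          active.add(chunkKey({ x: p.x + dx, y: p.y + dy, z: p.z + dz }));
        }
      }
    }
  }
  return active;
}

// Chunks that entered the active set (add to colliders) and chunks that
// left it (remove from colliders).
function diffActiveSets(
  prev: Set<string>,
  next: Set<string>,
): { entered: string[]; left: string[] } {
  const entered = Array.from(next).filter((k) => !prev.has(k));
  const left = Array.from(prev).filter((k) => !next.has(k));
  return { entered, left };
}
```

With a single player and radius 1, the active set holds 27 chunks; moving the player one chunk along x yields 9 entered and 9 left chunks, so only those 18 chunks would need collider work rather than every chunk containing the block type.
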