feat(arcane-infra): Rapier-backed cluster physics — RapierClusterSim + RapierClusterSimulation#123

Merged
martinjms merged 5 commits into chore/disable-claude-code-ci from feat/rapier-cluster
May 3, 2026

@martinjms
Contributor

Quick Summary

  • What changed: First Rapier-backed cluster physics integration lands as a ClusterSimulation wrapper. Drop-in for cluster_runner::run_cluster_loop; same wire format, same networking, same primitives.
  • Why: Foundation for the heterogeneous-node-tier vision (#33) and the cluster-physics-backends epic (#8). Closes the "second backend" line item in #8's acceptance criteria.
  • Behind a Cargo feature rapier-cluster. Vanilla cargo build -p arcane-infra pulls zero rapier3d into the dep tree.
  • Pure entity-driven physics. Per-entity collider shapes (sphere / capsule / cuboid). Contact start/stop events surfaced to user code with one-tick delay.
  • Architecture documentation included: new canonical entity-model.md plus updates to four-bucket-state-model.md and physics-backends-and-unreal.md codifying the unified-entity model.

Change Type

  • feature (with companion docs)

Impact

  • User/developer impact: Rust game developers can now compose a cluster node that runs authoritative physics via Rapier. They implement RapierClusterSimulation (or use the V1 path via ClusterSimulation); the wrapper handles spawn/despawn, body sync, contact events, and the physics step.
  • Risk level: Low. Behind a feature flag — vanilla path is byte-identical to before. 38 unit tests covering every documented contract; binary verified end-to-end against running Redis (1000 stable ticks at ~0.07ms tick time, WS connection accepted, no errors). Clippy silent both modes. Doctest compiles.

Verification

  • Build passes (cargo build -p arcane-infra and cargo build -p arcane-infra --features rapier-cluster --bins)
  • Tests pass — vanilla 65, --features rapier-cluster 104 (68 lib + 35 integration + 1 doctest)
  • Clippy silent both modes (cargo clippy -p arcane-infra and --features rapier-cluster --all-targets)
  • Vanilla cargo tree shows zero rapier3d references
  • Binary smoke-tested against running Redis; ~1000 ticks, stable tick time, WS connection accepted
  • Formatter clean

What's in (4 commits)

| Commit | Summary |
|---|---|
| 131b439 | Minimum integration: RapierClusterSim wraps ClusterSimulation, inserts PhysicsPipeline::step between user on_tick and ClusterServer::tick. Feature flag + binary. 18 tests. |
| 664671c | Sibling trait RapierClusterSimulation + extended RapierClusterTickContext with contact_events. RapierColliderShape enum (Ball / Capsule / Cuboid). 6 contact-event tests. |
| 6a9c4fe | Review-driven polish (Vec3↔Vector helpers, HashSet for removals, env-parser dedup, Mutex-poison propagation, #[non_exhaustive] on 4 public types) + 14 contract-pinning tests. |
| f91522d | Architecture docs: new canonical entity-model.md; durable-state-per-entity invariant in four-bucket doc; body-kind clarification in physics-backends doc. |

Architecture summary

RapierClusterSim IS-A ClusterSimulation that HAS-A user ClusterSimulation (or RapierClusterSimulation). Each tick:

  1. User logic runs (game intent, action handling, velocity mutation).
  2. Wrapper acquires its Mutex<RapierState> lock.
  3. Despawns bodies for pending_removals and entities that vanished from the entity map.
  4. Spawns first-sight entities at entity.position with shape from collider_for.
  5. Syncs entity.velocity → body.linvel for existing bodies.
  6. step_with_accumulator: fixed 1/60 s Rapier substeps until accumulator drains.
  7. sync_outputs: writes body.translation → entity.position and body.linvel → entity.velocity for replication.
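
Step 6 is the classic fixed-timestep accumulator. The following is a minimal sketch of that pattern only, not the code in this PR: `Accumulator`, `advance`, and `FIXED_DT` are hypothetical names, and a closure stands in for the Rapier pipeline step.

```rust
/// Fixed physics substep, matching the 1/60 s figure in step 6.
const FIXED_DT: f64 = 1.0 / 60.0;

/// Hypothetical stand-in for the wrapper's accumulator state.
struct Accumulator {
    /// Unsimulated time carried over from earlier cluster ticks, in seconds.
    pending: f64,
}

impl Accumulator {
    fn new() -> Self {
        Self { pending: 0.0 }
    }

    /// Adds `dt` seconds of cluster time, runs `step` once per whole fixed
    /// substep that fits, and returns how many substeps ran.
    fn advance(&mut self, dt: f64, mut step: impl FnMut(f64)) -> u32 {
        self.pending += dt;
        let mut substeps = 0;
        while self.pending >= FIXED_DT {
            step(FIXED_DT);
            self.pending -= FIXED_DT;
            substeps += 1;
        }
        substeps
    }
}

fn main() {
    let mut acc = Accumulator::new();
    let mut position = 0.0_f64;
    let velocity = 1.0_f64;
    // A 40 ms cluster tick fits two whole substeps; the remainder carries over.
    let n = acc.advance(0.04, |h| position += velocity * h);
    println!("substeps={n} position={position:.4} carried={:.4}", acc.pending);
}
```

With this shape, a cluster tick shorter than one substep runs no physics at all until enough time has accumulated, and a long tick runs several substeps back to back. Tick lengths that land exactly on a multiple of the substep are sensitive to floating-point rounding, so the example deliberately uses values away from those boundaries.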

Contract (codified in module docs and tests):

  • entity.velocity is intent-in.
  • entity.position is output-only after first-sight spawn.
  • Despawn driven by pending_removals and entity-map disappearance.
  • Default sphere collider; per-entity shapes via collider_for.
  • Contact events have a one-tick delay by design — user logic runs first to set intent, physics produces output for next tick.
  • Despawn-during-contact does not surface a Stopped event to the partner (documented and tested).
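
The one-tick delay on contact events can be pictured as a simple double-buffer. This is a hedged sketch under stated assumptions: `ContactBuffer` and its `tick` method are hypothetical, entity ids are plain integers rather than Uuids, and the physics step is reduced to "a list of events it produced".

```rust
/// Stand-in for the PR's `ContactEvent`; ids are integers here, not Uuids.
#[derive(Debug, Clone, PartialEq)]
struct ContactEvent {
    entity_a: u32,
    entity_b: u32,
    started: bool,
}

/// Hypothetical buffer holding events between one tick's step and the next
/// tick's user logic.
#[derive(Default)]
struct ContactBuffer {
    pending: Vec<ContactEvent>,
}

impl ContactBuffer {
    /// Models one tick. Returns the events user logic would see this tick
    /// (those buffered by the previous step), then stores the events that
    /// this tick's physics step produced for the next tick.
    fn tick(&mut self, produced_by_step: Vec<ContactEvent>) -> Vec<ContactEvent> {
        let visible_to_user = std::mem::take(&mut self.pending);
        self.pending = produced_by_step;
        visible_to_user
    }
}

fn main() {
    let mut buffer = ContactBuffer::default();
    let overlap = ContactEvent { entity_a: 1, entity_b: 2, started: true };

    // Tick 1: the step detects the overlap, but user logic already ran.
    let seen_tick_1 = buffer.tick(vec![overlap]);
    // Tick 2: user logic now observes the event from tick 1.
    let seen_tick_2 = buffer.tick(Vec::new());

    println!("tick 1 saw {} event(s)", seen_tick_1.len());
    for e in &seen_tick_2 {
        println!("tick 2 saw {} vs {} started={}", e.entity_a, e.entity_b, e.started);
    }
}
```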

Public API surface (re-exported from arcane_infra):
RapierClusterSim, RapierConfig, RapierClusterSimulation, RapierClusterTickContext, RapierColliderShape, ContactEvent. The value types are all #[non_exhaustive], so adding fields or variants later is not a SemVer-breaking change.
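
To illustrate how a per-entity shape hook with a default and first-sight resolution might fit together, here is a self-contained sketch. It does not use the real arcane_infra or Rapier types: `ColliderShape`, `ShapeSource`, `Spawner`, the `collider_for` signature, and the variant payloads are all assumptions made for the example.

```rust
use std::cell::Cell;
use std::collections::HashMap;

/// Stand-in for `RapierColliderShape`; variant payloads are assumed.
/// `#[non_exhaustive]` only affects downstream crates, which must keep a
/// wildcard arm when matching.
#[non_exhaustive]
#[derive(Debug, Clone, Copy, PartialEq)]
enum ColliderShape {
    Ball(f32),
    Capsule { half_height: f32, radius: f32 },
    Cuboid { half_extents: [f32; 3] },
}

struct Config {
    default_body_radius: f32,
}

trait ShapeSource {
    /// Default: every entity is a ball of the configured radius.
    fn collider_for(&self, _entity_id: u64, config: &Config) -> ColliderShape {
        ColliderShape::Ball(config.default_body_radius)
    }
}

struct AllDefault;
impl ShapeSource for AllDefault {}

/// Even ids are crates (cuboids), odd ids are characters (capsules).
struct CratesAndCharacters {
    calls: Cell<u32>,
}

impl ShapeSource for CratesAndCharacters {
    fn collider_for(&self, entity_id: u64, _config: &Config) -> ColliderShape {
        self.calls.set(self.calls.get() + 1);
        if entity_id % 2 == 0 {
            ColliderShape::Cuboid { half_extents: [0.5, 0.5, 0.5] }
        } else {
            ColliderShape::Capsule { half_height: 0.9, radius: 0.3 }
        }
    }
}

struct Spawner<S: ShapeSource> {
    source: S,
    config: Config,
    spawned: HashMap<u64, ColliderShape>,
}

impl<S: ShapeSource> Spawner<S> {
    fn new(source: S, config: Config) -> Self {
        Self { source, config, spawned: HashMap::new() }
    }

    /// Resolves the shape on first sight only; later calls reuse the cache.
    fn ensure_spawned(&mut self, entity_id: u64) -> ColliderShape {
        let Self { source, config, spawned } = self;
        *spawned
            .entry(entity_id)
            .or_insert_with(|| source.collider_for(entity_id, config))
    }

    fn despawn(&mut self, entity_id: u64) {
        self.spawned.remove(&entity_id);
    }
}

fn main() {
    let cfg = Config { default_body_radius: 0.5 };
    println!("default: {:?}", AllDefault.collider_for(7, &cfg));

    let source = CratesAndCharacters { calls: Cell::new(0) };
    let mut spawner = Spawner::new(source, cfg);
    println!("{:?}", spawner.ensure_spawned(2));
    println!("{:?}", spawner.ensure_spawned(2)); // cached, hook not re-run
    spawner.despawn(2);
    println!("{:?}", spawner.ensure_spawned(2)); // fresh decision on respawn
    println!("collider_for calls: {}", spawner.source.calls.get());
}
```

Caching the decision at spawn keeps the hook off the per-tick hot path; the cost is that changing a shape requires a despawn and respawn, which matches the behavior described in this PR.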

What's NOT in this PR (follow-up issues filed)

The current shape is entity-only physics — it works for arena-style demos and entity-vs-entity gameplay, but real-game-shape capabilities are tracked separately:

  • #120 — spawn-time hooks: per-entity body kind (Dynamic / Kinematic / Fixed), material (friction / restitution / density), collision groups, sensor mode. Not blocked.
  • #121 — in-tick imperative ops: apply impulse / force / torque, set translation, raycasts, intersection queries, joints between same-cluster entities. Not blocked.
  • #122 — gap inventory tracker (living document; full table of every Rapier capability with status).
  • #119 — Terrain epic: automatic per-cluster collision loading driven by entity positions. Map geometry is not something developers insert by hand; the Arcane runtime does it. Required before raycasts can hit walls.

After #120 and #121 (both unblocked), every Rapier capability that doesn't depend on terrain or cross-cluster physics is reachable from user code. The "no extra cluster-imposed limitations vs. local Rapier" goal is the bar.

End-to-end smoke test against running Redis

Beyond unit tests: arcane-rapier-cluster binary started against a running Redis instance (arcane-bench-redis Docker container on :6379):

arcane-cluster started cluster_id=11111111-... neighbors=0 tick_rate=20Hz
cluster stats HTTP listening on http://0.0.0.0:18081/stats
cluster WebSocket listening on ws://0.0.0.0:18080 (binary arcane-wire frames only)
ws accept #1 from 127.0.0.1:34502

Ran ~1000 ticks at stable tick time (tick_ms=0.07–0.08, well under the 50ms 20Hz budget). WS connection accepted cleanly. Stats endpoint returned valid JSON. Zero parse failures, zero broadcast lag, zero send errors.

This is the integration confidence that no unit test could provide — the wrapper composes correctly with cluster_runner::run_cluster_loop, the WebSocket server, the neighbor-subscriber path, the stats HTTP endpoint, and Redis pub/sub end-to-end.

Reference

  • Closes #117 — Rapier cluster physics: minimum integration
  • Closes #118 — Rapier cluster physics: contact events + per-entity collider shapes
  • Refs #8 — Cluster physics backends (parent epic; #8's "second backend" acceptance criterion is now satisfied)
  • Refs #33 — Engine-specific node types (strategic context for the heterogeneous-node-tier vision)
  • Refs #119 — Terrain epic (parallel non-blocking dependency for shipping real games)
  • Refs #120 — Spawn-time hooks (next slice; not blocked)
  • Refs #121 — In-tick imperative ops (next slice; not blocked)
  • Refs #122 — Rapier gap inventory tracker

Note on PR base

This PR is currently based on chore/disable-claude-code-ci (which is #92) so reviewers see only the 4 Rapier commits, not the chore commit underneath. Once #92 merges, this PR's base will auto-retarget to main.

martinjms and others added 5 commits May 3, 2026 11:37
…e physics (v1)

Introduces a feature-gated Rapier integration as a `ClusterSimulation` wrapper.
Drop-in for `run_cluster_loop`: same wire format, same networking, same
replication primitives — only the per-tick physics step is new.

Architecture:
- `RapierClusterSim` IS-A `ClusterSimulation` that HAS-A user `ClusterSimulation`.
  User logic runs first (intent / game actions), then Rapier integrates pose.
- `entity.velocity` is intent-in; `entity.position` is output-only after first-
  sight spawn (user position writes are silently overwritten by Rapier output).
- Despawn driven by `pending_removals` and entity-map disappearance; sync_inputs
  / sync_outputs filter pending-removal ids to avoid re-spawning bodies the
  user just asked to remove.
- Fixed 1/60 Rapier substeps with accumulator over the variable cluster tick.
- v1 default: uniform 0.5-radius sphere collider per entity; per-entity shapes
  via `user_data` schema deferred.
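
The despawn and first-sight rules above amount to a small set reconciliation each tick. A minimal sketch, assuming integer entity ids and a hypothetical `reconcile` helper (the real wrapper works on Uuids and Rapier body handles):

```rust
use std::collections::{HashMap, HashSet};

type EntityId = u64;

/// Hypothetical helper: given the ids that currently have a physics body,
/// the replicated entity map, and the user's pending removals, returns
/// `(to_despawn, to_spawn)`, each sorted for stable output.
fn reconcile(
    bodies: &HashSet<EntityId>,
    entities: &HashMap<EntityId, [f64; 3]>,
    pending_removals: &[EntityId],
) -> (Vec<EntityId>, Vec<EntityId>) {
    // Hash the removals once so each membership check is O(1).
    let removals: HashSet<EntityId> = pending_removals.iter().copied().collect();

    // Despawn: explicitly removed, or vanished from the entity map.
    let mut to_despawn: Vec<EntityId> = bodies
        .iter()
        .copied()
        .filter(|id| removals.contains(id) || !entities.contains_key(id))
        .collect();

    // Spawn: first-sight entities, skipping any the user just asked to remove.
    let mut to_spawn: Vec<EntityId> = entities
        .keys()
        .copied()
        .filter(|id| !bodies.contains(id) && !removals.contains(id))
        .collect();

    to_despawn.sort_unstable();
    to_spawn.sort_unstable();
    (to_despawn, to_spawn)
}

fn main() {
    let bodies: HashSet<EntityId> = [1, 2, 3].into_iter().collect();
    let entities: HashMap<EntityId, [f64; 3]> =
        [2, 3, 4, 5].into_iter().map(|id| (id, [0.0; 3])).collect();
    let (despawn, spawn) = reconcile(&bodies, &entities, &[3, 5]);
    println!("despawn={despawn:?} spawn={spawn:?}");
}
```

Entity 5 in the example is new but already marked for removal, so it is never given a body; that is the re-spawn case the pending-removal filter is meant to avoid.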

Feature gating:
- `rapier3d = "0.32"` declared `optional = true`; `rapier-cluster` feature
  pulls it in alongside `cluster-ws`.
- `arcane_rapier_cluster` binary requires `rapier-cluster`.
- Vanilla `cargo build -p arcane-infra` produces zero Rapier in the dep tree.

Tests:
- 18 unit tests covering lifecycle (spawn/despawn/respawn), multi-entity
  independence (incl. 500-entity scale), dynamics (velocity passthrough,
  gravity vs analytic kinematic, velocity-change-mid-sim), user-sim
  composition (correct context propagation, pending_removals from user code,
  buff-pattern velocity modulation), and determinism / despawn-respawn
  round-trip (hand-off scenario).
- Vanilla tests unchanged: 65 pass. Feature-on: 83 pass. Clippy silent both
  modes; doctest in module docs compiles.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…apierClusterSim

Adds the V2 surface so Rapier-backed nodes are usable for real games — without
this, integration is "uniform spheres with invisible collisions" (fine for tech
demo, useless for typical action gameplay).

Public additions:
- `RapierClusterSimulation` trait — sibling of `ClusterSimulation`, lives in
  `arcane-infra::rapier_cluster` so it can use Rapier types freely. Receives a
  `RapierClusterTickContext` instead of `ClusterTickContext`.
- `RapierClusterTickContext<'a>` — same fields as `ClusterTickContext` plus
  `contact_events: &[ContactEvent]` from the previous tick's physics step.
  One-tick delay by design: user logic runs first to set intent, physics produces
  output for next tick.
- `ContactEvent { entity_a, entity_b, started }` — collisions mapped from
  Rapier collider handles back to entity Uuids via a new reverse map.
- `RapierColliderShape::{Ball | Capsule | Cuboid}` — declared per entity via
  `RapierClusterSimulation::collider_for`. Default impl returns
  `Ball(config.default_body_radius)`. Resolved at first-sight spawn only;
  later returns are ignored (despawn-and-respawn to change shape).
- `RapierClusterSim::with_rapier_sim(rapier_sim, config)` constructor for the
  new trait. V1 `new` and `with_default_config` constructors preserved.

Internal refactor:
- `RapierClusterSim` now holds a private `Backend { None, Cluster, Rapier }` enum.
  `on_tick` dispatches per variant; the Rapier branch builds the extended ctx.
- `RapierState` gains `collider_to_entity: HashMap<ColliderHandle, Uuid>` for
  event mapping and `pending_contact_events: Vec<ContactEvent>` populated by a
  custom `EventHandler` impl (`CollisionRecorder`) installed during the step.
  Spawn loop and shape resolution moved out of `RapierState` into the wrapper
  so the active backend can drive `collider_for`.
- Every spawned collider sets `ActiveEvents::COLLISION_EVENTS`.

Tests:
- 6 new V2 tests: contact event surfaces for overlapping spheres; distant
  capsules produce no contacts; collider_for honored at first-sight spawn
  (verified via direct ColliderSet shape inspection); shape change after
  first-sight is ignored AND collider_for is called exactly once per entity;
  one-tick-delay semantics for contact events; no duplicate Started for a
  persistent overlap.
- All 18 V1 tests pass unchanged (same trait, same constructors, same wire).

Verification: 54 unit tests + 35 integration + 1 doctest pass under
`--features rapier-cluster`. Vanilla 65 tests pass; vanilla `cargo tree`
shows zero `rapier3d` references. Clippy silent both modes.

Closes #118.
Refs #8, #117.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ct tests

Synthesizes findings from the simplify skill (3 parallel review agents:
reuse / quality / efficiency) and a security-review pass, plus an
architectural pass on SemVer stability. Security review found zero
HIGH/MEDIUM vulnerabilities. Adds 14 tests so every module-doc claim is
backed by a test that would fail if the claim broke.

Code polish:
- New `to_rapier(Vec3) -> Vector` and `from_rapier(Vector) -> Vec3` helpers
  replacing five sites of triplet `.x as f32, .y as f32, .z as f32` casts.
- `RapierState::set_linvel` now takes `Vec3` instead of three `f64`s.
- Deleted unused `impl Default for RapierColliderShape` (closed a drift door
  vs `RapierConfig::default_body_radius`).
- `CollisionRecorder` propagates Mutex poison via `expect()` instead of
  silently dropping events on poison — surfaces panics that happen mid-step.
- Per-tick `pending_removals` lookup now uses a `HashSet` (was O(N×M) linear
  scan over a slice when entities × removals was non-trivial).
- Extracted `ClusterEnv::from_env()` helper into `cluster_runner`; both
  `arcane_cluster` and `arcane_rapier_cluster` binaries use it (was a
  verbatim duplicate of env-parsing across the two binaries).
- Stripped `v1`/`v2` release-stage labels from doc comments per project
  policy; trimmed several comments that restated the next line of code.
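
For readers unfamiliar with the conversion helpers mentioned above, a sketch of their likely shape follows. The struct definitions are stand-ins: it is an assumption that the cluster-side `Vec3` stores `f64` components (inferred from the `as f32` casts) and that the Rapier-side vector stores `f32`; the real types come from the project and from Rapier.

```rust
/// Stand-in for the cluster-side vector (assumed f64 components).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Vec3 {
    x: f64,
    y: f64,
    z: f64,
}

/// Stand-in for the Rapier-side vector (assumed f32 components).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Vector {
    x: f32,
    y: f32,
    z: f32,
}

/// Narrows to f32 in one place instead of at every call site.
fn to_rapier(v: Vec3) -> Vector {
    Vector { x: v.x as f32, y: v.y as f32, z: v.z as f32 }
}

/// Widens back to f64; this direction is lossless.
fn from_rapier(v: Vector) -> Vec3 {
    Vec3 { x: f64::from(v.x), y: f64::from(v.y), z: f64::from(v.z) }
}

fn main() {
    let velocity = Vec3 { x: 1.5, y: -2.0, z: 0.25 };
    let round_trip = from_rapier(to_rapier(velocity));
    println!("{velocity:?} -> {round_trip:?}");
}
```

Values that are not exactly representable in `f32` (0.1, for example) come back slightly different after a round trip, which is worth keeping in mind when comparing replicated positions against expected values in tests.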

SemVer stability:
- `#[non_exhaustive]` on `RapierColliderShape`, `ContactEvent`,
  `RapierClusterTickContext`, `RapierConfig`. Future additions
  (e.g. `Cylinder` shape, `impulse_magnitude` on events, query handles in
  the context) won't be breaking changes.

New tests (every one corresponds to a documented contract that was
previously unverified):

T1 stopped_event_surfaces_when_bodies_separate
T2 despawn_during_contact_does_not_surface_stopped_event (pins the
   no-Stopped-on-despawn behavior; partners detect via the entity map)
T3 default_path_collider_is_a_ball_with_config_radius (V1 default shape
   directly inspected; previously only Cuboid was)
T4 capsule_collider_is_honored_at_first_sight (capsule shape inspected)
T5 multi_substep_in_one_cluster_tick (dt=0.1 → 6 substeps, position ~0.1)
T6 slow_dt_accumulates_until_substep_fires (dt=0.005, fires after ~3-4
   ticks, position converges to dt_total*v)
T7 contact_resolution_applies_impulse_to_partner (B gets pushed when A
   collides with it — Rapier responds, doesn't just detect)
T8 collider_for_invoked_freshly_on_respawn (despawn-respawn-same-uuid
   triggers a fresh shape decision)
T9 rapier_ctx_propagates_game_actions_tick_and_dt (V2 parallel of the V1
   context-propagation test)
T10 rapier_user_can_request_removal_via_pending_removals (V2 parallel of
    the V1 removal test)
T11 mixed_shape_ball_vs_cuboid_produces_contact (cross-shape collision
    now exercised; all prior contact tests paired same-shape)
T12 nondefault_gravity_honored_on_arbitrary_axis (gravity isn't hardcoded
    to -Y somewhere)
T13 contact_events_do_not_carry_across_handoff (cluster B's first tick
    sees ctx.contact_events == &[], not cluster A's events)
T14 capsule_axis_is_y (segment endpoints at (0, ±half_height, 0))

Verification: 68 lib tests (was 54) + 35 integration + 1 doctest pass under
`--features rapier-cluster`. Vanilla 65 unchanged. Clippy silent both modes.
Vanilla dep tree still has zero `rapier3d`.

Refs #117, #118, #8.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… invariant; clarify body kinds in physics backends doc

New canonical doc `docs/architecture/entity-model.md` codifies the unified-
entity model decided in the 2026-05-03 architecture session:

- Arcane has one persistent-world concept: the Entity. Players, NPCs,
  projectiles, dropped items, structures, player-built walls — all are
  entities differentiated by per-entity hooks (body kind, collider, material,
  collision groups, sensor mode), not by separate types at the platform
  level. Matches modern engine practice (Unreal AActor, Unity GameObject,
  Bevy ECS, Godot Node).
- Two-axis classification (animate × moving/stationary) is described with
  industry-standard term cross-references.
- Physics body kinds (Dynamic / KinematicPositionBased / KinematicVelocityBased
  / Fixed) are documented with their per-tick cost and migration semantics.
- Affinity-bound vs spatial-bound distinction is called out as a clustering
  concern (not a physics concern); needs follow-up work in the clustering
  model so Fixed entities don't migrate by PGP affinity.
- Terrain is explicitly NOT entities — it's content loaded by the Arcane
  runtime based on entity positions. Cross-link to terrain epic #119.

Updates to `four-bucket-state-model.md`:
- Adds the "every entity has bucket-4 durable state" universal invariant up
  front. This is what makes recovery / migration work and what unifies
  ephemeral game objects with structural ones in a single concept.

Updates to `physics-backends-and-unreal.md` §6 (entity ↔ body mapping):
- Adds body-kind row (per-entity hook, default Dynamic).
- Adds explicit terrain-is-not-entities row with cross-link to #119.
- Notes Rapier's sleep mechanism + Fixed-body solver-skip preserve the
  "no entities → no simulation" invariant without needing a separate
  Structure concept.

Refs #117, #118, #119, #120, #121, #122, #8, #33.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…in lines

CI's `cargo fmt --check` flagged six sites in rapier_cluster.rs (V2 tests with
long-form `entities.insert(id, mk_entry(...))` calls and one chained
with_collider closure) plus one site in cluster_runner.rs (long-line
Uuid::parse_str with map_err). Pure formatting; no functional changes.
All 38 rapier_cluster tests pass; clippy silent both modes.

Refs #117, #118, #123.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@martinjms martinjms merged commit df83c6f into chore/disable-claude-code-ci May 3, 2026
1 check passed
martinjms added a commit that referenced this pull request May 3, 2026
…ic + terrain MapProvider framing

Updates entity-model.md with the architectural decisions from the 2026-05-03
sessions on cross-engine support and terrain handling.

§7 Terrain — rewritten:
- Static / voxel / procedural terrain shapes all supported through one
  per-engine MapProvider interface.
- Game owns storage (object storage / SpacetimeDB voxel chunks / on-disk /
  procedural / hybrid) and authoring tool (engine editor / voxel editor /
  generator). Arcane owns the loading interface.
- Voxel terrain content lives in SpacetimeDB; static mesh content in
  object storage; map manifest small in SpacetimeDB.

§8 (new) Conceptual contract vs. per-engine API:
- User-facing APIs are engine-native per plugin (UE C++, Unity C#, Godot,
  Rapier Rust). Wire format, manager / replication protocols, durable state
  schema invariant, and conceptual vocabulary are shared. Physics-property
  enums (BodyKind, ColliderShape, Material) are NOT promoted to a shared
  arcane-core; each plugin uses engine-native equivalents.
- Reverses an earlier proposal to unify physics value types — that would
  produce four parallel re-implementations of the same enum across language
  boundaries with no benefit.

§9 (new) Engine plugin pattern:
- Engine-named base classes (AArcaneUnrealEntity / ArcaneUnityEntity /
  ArcaneGodotEntity) extending engine-native types.
- Per-engine cluster runtime, MapProvider, in-tick imperative ops.
- Wire-format byte-compatibility across all engines via shared protocol.

§10 (new) Cross-engine entity migration:
- Entities can migrate between cluster tiers running different engines.
- Devs write per-engine game logic for each tier they support.
- Migration is at cluster-process boundaries; durable state in SpacetimeDB
  is the lingua franca.
- Cross-engine consistency for game rules (damage formulas, drop tables)
  lives in SpacetimeDB reducers called from every engine plugin.
- No in-process engine switching; "the function that runs physics for this
  engine" is the entire cluster binary written in that engine's language.

Refs #117, #118, #119, #122, #123, #124.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
martinjms added a commit that referenced this pull request May 3, 2026
Codifies the architectural decisions made during the Rapier integration
work (#117, #118, PR #123) for posterity. Closes the open ADR item in
#8's acceptance criteria for the Rapier track.

Decisions documented:

- Composition over inheritance — RapierClusterSim IS-A ClusterSimulation
  that wraps a user ClusterSimulation (or RapierClusterSimulation). No
  new PhysicsBackend trait introduced.
- In-process Rust library — no sidecar process, no FFI. Cargo feature
  rapier-cluster gates the optional rapier3d dependency. Vanilla builds
  pull zero rapier3d into the dep tree.
- Single Mutex<RapierState> wrapper; user code never sees RigidBodySet
  directly. Entity-keyed in-tick ops only — no off-spine bodies.
- Per-entity hooks called once at first-sight spawn; collider shape /
  material / body kind / collision groups / sensor are spawn-time
  decisions, not per-tick.
- Velocity in / position out contract. User mutations to entity.position
  during on_tick are silently overwritten by Rapier's post-step output.
- Contact events surface with one-tick delay (intent before output).
  Despawn-during-contact does NOT surface Stopped to the partner —
  partners detect via the entity map.
- All public types are `#[non_exhaustive]` from day one.

Alternatives considered + rejected:
- Separate crate `arcane-physics-rapier` with new PhysicsBackend trait
  (rejected — Cargo feature flag achieves dependency isolation with
  less ceremony).
- Sidecar process running Rapier (rejected — IPC overhead destroys
  per-tick budget for an in-process library).
- Direct &mut RigidBodySet exposure to user code (rejected — off-spine
  bodies and cross-cluster joints would silently break replication
  invariants).
- Engine-neutral physics types in arcane-core shared across backends
  (rejected — language barriers force per-plugin re-implementations
  anyway; documented in entity-model.md §8).

Updates physics-backends-and-unreal.md §7 to point at the ADR rather
than the earlier "separate crate per backend" framing, which the Rapier
work refined.

Updates docs/architecture/adr/README.md with an index of ADRs, marking
ADR-001 as Accepted and ADR-002 (Unreal Cluster Node) as Pending per
#124.

Refs #8, #117, #118, #119, #122, #123, #124.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>