feat(mesh): add MeshSession singleton - ref-counted Zenoh session per process #103

Closed
cagataycali wants to merge 1 commit into strands-labs:main from cagataycali:feat/mesh-session-singleton

@cagataycali
Member

Summary

Add strands_robots/mesh_session.py — a process-level singleton that manages a single zenoh.Session shared by all Mesh instances in the same process.

This is PR 1 of 6 for the Zenoh mesh feature (#95).

What this PR does

strands_robots/mesh_session.py (219 LOC)

A MeshSession class that provides:

| Feature | Detail |
| --- | --- |
| Ref-counted | Multiple Mesh instances share one session; actual close at refcount=0 |
| Fork-aware | Detects PID change and re-initialises (stale session after os.fork) |
| Lazy import | import zenoh only on first MeshSession.open(); zero cost at package import |
| Env config | ZENOH_CONNECT and ZENOH_LISTEN override endpoints |
| Programmatic config | config_overrides dict for CI/testing |
| atexit hook | Best-effort cleanup at interpreter shutdown |
| Fail-fast | RuntimeError if zenoh.open() fails; no silent fallback |
| Graceful degradation | Returns None if eclipse-zenoh not installed |
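
The ref-counting and fork-detection rows can be illustrated with a minimal sketch. This is not the code in mesh_session.py: the class name `RefCountedSession` and the `factory` argument are hypothetical stand-ins so the example runs without zenoh, and dropping (rather than closing) a session inherited across a fork is an assumption about what "re-initialises" means.

```python
import os
import threading
from typing import Any, Callable, Optional


class RefCountedSession:
    """Hypothetical stand-in for MeshSession; illustrates the pattern only."""

    _lock = threading.Lock()
    _session: Optional[Any] = None
    _refcount = 0
    _owner_pid: Optional[int] = None

    @classmethod
    def open(cls, factory: Callable[[], Any]) -> Any:
        """Return the shared session, creating it on first use."""
        with cls._lock:
            pid = os.getpid()
            if cls._session is not None and cls._owner_pid != pid:
                # The session was created in another process (e.g. before
                # os.fork). Assumption: drop it without closing and start over.
                cls._session = None
                cls._refcount = 0
            if cls._session is None:
                cls._session = factory()
                cls._owner_pid = pid
            cls._refcount += 1
            return cls._session

    @classmethod
    def close(cls) -> None:
        """Release one reference; close the session when none remain."""
        with cls._lock:
            if cls._refcount == 0:
                return  # nothing open: no-op
            cls._refcount -= 1
            if cls._refcount == 0:
                session, cls._session = cls._session, None
                session.close()
```

Keeping the state on the class and guarding it with one lock means every caller in the process sees the same counter, which is what makes the "actual close at refcount=0" guarantee possible.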

Usage (future PRs will use this)

from strands_robots.mesh_session import MeshSession

session = MeshSession.open()    # refcount = 1
session2 = MeshSession.open()   # refcount = 2, same session
MeshSession.close()             # refcount = 1
MeshSession.close()             # refcount = 0, session.close() called
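
Because every open() has to be paired with a close(), callers may prefer a context manager. The helper below is a hypothetical sketch, not part of this PR: `shared_session` is an invented name, it takes the session class as a parameter so it can be exercised with a stub, and it assumes that open() takes no reference when it returns None (zenoh not installed).

```python
from contextlib import contextmanager


@contextmanager
def shared_session(session_cls):
    """Pair session_cls.open() with session_cls.close() (hypothetical helper)."""
    session = session_cls.open()
    if session is None:
        # Assumption: no reference was taken when zenoh is missing,
        # so there is nothing to release.
        yield None
        return
    try:
        yield session
    finally:
        # Release the reference even if the body raised.
        session_cls.close()
```

With the real class this would read `with shared_session(MeshSession) as session: ...`, and the body should handle `session is None`.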

pyproject.toml changes

  • Add [mesh] optional extra: eclipse-zenoh>=0.11.0,<1.0.0
  • Include [mesh] in [all] extras
  • Add zenoh.* to mypy ignore_missing_imports

Tests

15 tests in tests/test_mesh_session.py — all mock zenoh, no network required:

| Category | Tests | What's verified |
| --- | --- | --- |
| Lifecycle | 5 | open/close/refcount/reuse/noop |
| Import failure | 1 | Returns None when zenoh absent |
| Env config | 3 | ZENOH_CONNECT, ZENOH_LISTEN, multi-endpoint |
| Config overrides | 1 | Programmatic JSON5 injection |
| Fork detection | 1 | PID change → re-init |
| Open failure | 1 | RuntimeError propagation |
| Thread safety | 1 | 10 concurrent open + 10 concurrent close |
| atexit | 2 | Cleanup + noop |
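
One way to exercise the "Import failure" row without touching the network is to patch sys.modules. The sketch below is an assumption about how such a test could look, not the PR's test code; `try_import_zenoh` is a hypothetical helper that mirrors the lazy-import behaviour described above.

```python
import sys
from unittest import mock


def try_import_zenoh():
    """Hypothetical lazy import: return the zenoh module, or None if absent."""
    try:
        import zenoh
    except ImportError:
        return None
    return zenoh


def demo():
    # A None entry in sys.modules makes `import zenoh` raise ImportError,
    # which simulates eclipse-zenoh not being installed.
    with mock.patch.dict(sys.modules, {"zenoh": None}):
        absent = try_import_zenoh()

    # A MagicMock entry simulates an installed package with no real I/O.
    fake = mock.MagicMock(name="zenoh")
    with mock.patch.dict(sys.modules, {"zenoh": fake}):
        present = try_import_zenoh()

    return absent, present, fake
```

`mock.patch.dict` restores sys.modules on exit, so the patch does not leak into other tests.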

Results: 15 passed, 0.4s. Full suite: 243 passed, 6 skipped, 0 failures.

Checklist

  • Tests pass (pytest tests/test_mesh_session.py — 15/15)
  • No regression (pytest tests/ — 243/243)
  • Lint clean (ruff check + ruff format)
  • Mypy clean
  • Lazy import — no import zenoh at package level
  • Fail-fast on unrecoverable errors
  • Upper-bounded dependency (<1.0.0)
  • 3 files changed, 522 insertions

Closes #95 (partially — PR 1 of 6)


🤖 AI agent response. Strands Agents. Feedback welcome!

… process

Add strands_robots/mesh_session.py — a process-level singleton that
manages a single zenoh.Session shared by all Mesh instances in the
same process.

Key design decisions:
- Ref-counted: multiple Mesh instances share one session, closed at 0
- Fork-aware: detects PID change and re-initialises (stale session)
- Lazy import: 'import zenoh' only on first MeshSession.open()
- Env config: ZENOH_CONNECT and ZENOH_LISTEN override endpoints
- Programmatic config: config_overrides dict for CI/testing
- atexit hook: best-effort cleanup at interpreter shutdown
- Fail-fast: RuntimeError if zenoh.open() fails (no silent fallback)
- Returns None if eclipse-zenoh not installed (graceful degradation)
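
As an illustration of the env-config bullet, the sketch below turns ZENOH_CONNECT and ZENOH_LISTEN into JSON5 strings keyed by config path. The function name `endpoint_overrides` is hypothetical, and both the comma-separated format and the `connect/endpoints` / `listen/endpoints` keys are assumptions; the commit does not state how the variables are parsed.

```python
import json
import os
from typing import Dict, Mapping, Optional

# Assumed mapping from environment variable to Zenoh config key.
_ENV_TO_KEY = (
    ("ZENOH_CONNECT", "connect/endpoints"),
    ("ZENOH_LISTEN", "listen/endpoints"),
)


def endpoint_overrides(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Build {config key: JSON5 list} from the environment (hypothetical)."""
    env = os.environ if env is None else env
    overrides: Dict[str, str] = {}
    for var, key in _ENV_TO_KEY:
        raw = env.get(var, "")
        # Assumption: multiple endpoints are separated by commas.
        endpoints = [item.strip() for item in raw.split(",") if item.strip()]
        if endpoints:
            overrides[key] = json.dumps(endpoints)
    return overrides
```

The resulting strings are in the form that a JSON5-based config setter would accept, and a config_overrides dict could be merged on top of them for CI runs.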

pyproject.toml:
- Add [mesh] optional extra: eclipse-zenoh>=0.11.0,<1.0.0
- Include [mesh] in [all] extras
- Add zenoh.* to mypy ignore_missing_imports
- Add mesh_session mypy override (warn_return_any=false)

Tests: 15 new tests (0.4s), all mock zenoh — no network required:
- Lifecycle: open/close/refcount/reuse
- Import failure: graceful None when zenoh absent
- Env config: ZENOH_CONNECT, ZENOH_LISTEN, multi-endpoint
- Config overrides: programmatic JSON5 injection
- Fork detection: PID change → re-init
- Open failure: RuntimeError propagation
- Thread safety: 10 concurrent open + 10 concurrent close
- atexit cleanup: session closed, noop when no session
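
The thread-safety check (10 concurrent opens, then 10 concurrent closes) could be structured roughly as follows. This is a sketch under assumptions, not the PR's test: `Counter` is a hypothetical stand-in that only tracks the reference count, and `run_concurrently` is an invented helper that uses a barrier to release all threads at once.

```python
import threading


class Counter:
    """Lock-guarded reference counter (hypothetical stand-in for the session)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.refcount = 0
        self.opens = 0   # times the underlying session would be created
        self.closes = 0  # times the underlying session would be closed

    def open(self):
        with self._lock:
            if self.refcount == 0:
                self.opens += 1
            self.refcount += 1

    def close(self):
        with self._lock:
            if self.refcount == 0:
                return
            self.refcount -= 1
            if self.refcount == 0:
                self.closes += 1


def run_concurrently(fn, n=10):
    """Call fn from n threads that all start at the same moment."""
    barrier = threading.Barrier(n)

    def worker():
        barrier.wait()  # hold every thread until all n are ready
        fn()

    threads = [threading.Thread(target=worker) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

After `run_concurrently(counter.open)` the session should have been created exactly once, and after `run_concurrently(counter.close)` it should have been closed exactly once.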

Part 1 of 6 for Zenoh mesh (strands-labs#95).
@cagataycali
Member Author

Closing as a duplicate of #101. Three mesh-session PRs were opened at the same time (#100, #101, #103). Consolidating on #101, which has the most complete scope (4 files, 801 LOC incl. PeerInfo registry). If reviewers prefer pieces from this PR, we will cherry-pick them before merging #101.

Canonical PR: #101
Tracking issue: #98


Linked issue: feat(mesh): Zenoh peer-to-peer mesh for Robot() - fleet coordination