Skip to content

Conversation

@grahamking
Copy link
Contributor

@grahamking grahamking commented Nov 5, 2025

Running any of the components with --store-kv file will use the file system instead of etcd for discovery. If there is a shared folder, or for the single-node case, this allows running frontend + backend without etcd.

The default folder is /tmp/dynamo_store_kv, configurable via DYN_FILE_KV env var.

This PR touched a lot of things in relatively light ways, including some nice simplifications where we can now always take a DistributedRuntime instead of that or a Runtime.

The main file is lib/runtime/src/storage/key_value_store/file.rs.

Follow up PRs (this is big enough) will add:

  • Cleanup old keys similar to etcd lease revokation.
  • Remove the is_static stuff in favor of this file based KeyValueStore.

Example

Worker:

python -m dynamo.vllm --connector none --max-model-len 4096 --model Qwen/Qwen3-0.6B --store-kv file

and / or

python -m dynamo.sglang --model-path nvidia/NVIDIA-Nemotron-Nano-9B-v2 --page-size 16 --trust-remote-code --mem-fraction-static 0.92 --store-kv file

Frontend:

python -m dynamo.frontend --store-kv file

Look in $TMPDIR/dynamo_store_kv (typically /tmp) for the live keys.

Summary by CodeRabbit

  • New Features

    • Added --store-kv CLI option to select key-value store backend (etcd, memory, or file).
    • Introduced filesystem-based key-value store with file watching capabilities.
    • Added in-memory key-value store option for deployment flexibility.
  • Chores

    • Added inotify dependency for file system event monitoring.

@grahamking grahamking requested review from a team as code owners November 5, 2025 23:28
@github-actions github-actions bot added the feat label Nov 5, 2025
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 5, 2025

Walkthrough

This change introduces configurable KV store backends (Etcd, Memory, File) throughout the Dynamo runtime system. A --store-kv CLI option is added across multiple components, the DistributedRuntime constructor is updated to accept a store backend parameter, worker functions are refactored to instantiate runtime locally instead of receiving it as a decorator parameter, and a new filesystem-backed KV store with inotify support is implemented.

Changes

Cohort / File(s) Summary
CLI Configuration - Store Backend Option
components/src/dynamo/frontend/main.py, components/src/dynamo/mocker/args.py, components/src/dynamo/mocker/main.py, components/src/dynamo/sglang/args.py, components/src/dynamo/trtllm/utils/trtllm_utils.py, components/src/dynamo/vllm/args.py, launch/dynamo-run/src/flags.rs
Adds new --store-kv command-line option across frontend, mocker, sglang, trtllm, and vllm services with default from DYN_STORE_KV env var or "etcd" fallback; updates Config and Flags data structures to expose store_kv: String field.
Worker Function Refactoring
components/src/dynamo/sglang/main.py, components/src/dynamo/trtllm/main.py, components/src/dynamo/vllm/main.py
Removes @dynamo_worker(static=False) decorator and runtime parameter from worker functions; moves DistributedRuntime instantiation from decorator-based injection to local scope within worker, now constructed as DistributedRuntime(loop, config.store_kv, False).
Runtime Initialization and Engine Dispatch
launch/dynamo-run/src/lib.rs, lib/bindings/python/rust/lib.rs, lib/bindings/python/rust/llm/entrypoint.rs
Updates DistributedRuntime::new to accept store_kv: String parameter; refactors engine dispatch to pass DistributedRuntime directly instead of Either<Runtime, DistributedRuntime>; removes Either wrapper from runtime passing.
Input Entrypoint Consolidation
lib/llm/src/entrypoint/input.rs, lib/llm/src/entrypoint/input/batch.rs, lib/llm/src/entrypoint/input/common.rs, lib/llm/src/entrypoint/input/grpc.rs, lib/llm/src/entrypoint/input/http.rs, lib/llm/src/entrypoint/input/text.rs
Consolidates runtime parameter from Either<Runtime, DistributedRuntime> to direct DistributedRuntime parameter across all entrypoint handlers (batch, grpc, http, text); removes conditional runtime extraction logic and simplifies all invocations.
Storage Backend Abstraction
lib/runtime/src/distributed.rs, lib/runtime/src/storage/key_value_store.rs
Introduces KeyValueStoreSelect enum (Etcd, Memory, File) with parsing/formatting support; updates DistributedConfig to use store_backend: KeyValueStoreSelect instead of direct etcd_config; extends KeyValueStore trait with shutdown() method; modifies KeyValueStoreManager struct to tuple-like form with new file() constructor.
Filesystem KV Store Implementation
lib/runtime/src/storage/key_value_store/file.rs
Adds new FileStore implementation with Directory bucket type; uses inotify for file watching; provides insert, get, delete, watch, and entries operations with lifecycle cleanup and owned file tracking.
Storage Backend Updates
lib/runtime/src/storage/key_value_store/etcd.rs, lib/runtime/src/storage/key_value_store/mem.rs, lib/runtime/src/storage/key_value_store/nats.rs
Adds shutdown() method to KeyValueStore trait implementations; updates watch event handling to emit WatchEvent::Delete(Key::from_raw(key)) instead of KeyValue variant for delete events.
Component Discovery Refactoring
lib/runtime/src/component/client.rs, lib/runtime/src/component/endpoint.rs
Removes direct etcd client usage from dynamic instance source lookup; switches endpoint discovery from etcd registration to store-based bucket insertion; updates watcher to use generic store API instead of etcd-specific paths.
Runtime Audit and Integration
lib/llm/src/audit/sink.rs, lib/llm/tests/audit_nats_integration.rs
Changes spawn_workers_from_env signature from Option<&DistributedRuntime> to &DistributedRuntime (non-optional); updates test call sites to pass direct reference instead of Some(...) wrapper.
Discovery and Watch Event Refactoring
lib/llm/src/discovery/watcher.rs
Updates Delete event handling in model watcher to use WatchEvent::Delete(key) directly instead of WatchEvent::Delete(kv), extracting deleted key via key.as_ref().
Dependencies
lib/runtime/Cargo.toml
Adds new dependency inotify = { version = "0.11" } for filesystem-based watch event support.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Areas requiring extra attention:

  • lib/runtime/src/storage/key_value_store/file.rs: New filesystem-backed implementation with inotify integration; verify error handling, path sanitization, file lifecycle cleanup, and concurrent access patterns via Arc<Mutex>.
  • lib/runtime/src/distributed.rs: KV store backend selection logic and config flow changes; ensure all three backends (Etcd, Memory, File) initialize correctly and shutdown is called consistently.
  • lib/llm/src/entrypoint/input/*.rs: Six entrypoint files with runtime consolidation; verify that all code paths correctly derive cancellation tokens and handle shutdown semantics with the new DistributedRuntime pattern.
  • lib/runtime/src/component/client.rs and endpoint.rs: Discovery refactoring from etcd-specific to generic store API; ensure watcher event filtering and model registration work correctly across different store backends.
  • Worker function refactoring (sglang/main.py, trtllm/main.py, vllm/main.py): Verify that signal handlers and graceful shutdown paths correctly use locally instantiated runtime; check for race conditions in initialization sequencing.

Poem

🐰 A rabbit hops through backends galore,
Etcd, Memory, now File at the door!
Workers spring forth with their own little stores,
FileWatch twitches—inotify soars!
Runtime refactored, entrypoints aligned,
Storage abstracted, all backends combined! ✨

Pre-merge checks

✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely describes the main feature: adding filesystem-backed key-value store support.
Description check ✅ Passed The PR description clearly explains the feature, use cases, configuration options, examples, and notes planned follow-up work.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
lib/runtime/src/storage/key_value_store.rs (1)

282-287: Handle missing file buckets the same way as other backends.

When the directory for a bucket doesn’t exist yet, this implementation bubbles up StoreError::MissingBucket. Callers expect get_bucket to yield Ok(None) in that situation (see KeyValueStoreManager::load, which treats None as “no data yet”). With the file backend this becomes a hard error, so flows that probe for a bucket before anything is published—like card/model lookups during startup—now fail when run with --store-kv file.

Please align the file backend with the memory/etcd implementations by returning Ok(None) (after leaving the directory untouched) when the bucket directory is absent. That keeps the higher-level logic working identically across backends.

-        if !p.exists() {
-            return Err(StoreError::MissingBucket(bucket_name.to_string()));
-        }
+        if !p.exists() {
+            return Ok(None);
+        }
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1ab2fe1 and 4a27f39.

⛔ Files ignored due to path filters (3)
  • Cargo.lock is excluded by !**/*.lock
  • lib/bindings/python/Cargo.lock is excluded by !**/*.lock
  • lib/runtime/examples/Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (31)
  • components/src/dynamo/frontend/main.py (2 hunks)
  • components/src/dynamo/mocker/args.py (1 hunks)
  • components/src/dynamo/mocker/main.py (1 hunks)
  • components/src/dynamo/sglang/args.py (3 hunks)
  • components/src/dynamo/sglang/main.py (2 hunks)
  • components/src/dynamo/trtllm/main.py (2 hunks)
  • components/src/dynamo/trtllm/utils/trtllm_utils.py (4 hunks)
  • components/src/dynamo/vllm/args.py (2 hunks)
  • components/src/dynamo/vllm/main.py (2 hunks)
  • launch/dynamo-run/src/flags.rs (1 hunks)
  • launch/dynamo-run/src/lib.rs (4 hunks)
  • lib/bindings/python/rust/lib.rs (4 hunks)
  • lib/bindings/python/rust/llm/entrypoint.rs (1 hunks)
  • lib/llm/src/audit/sink.rs (1 hunks)
  • lib/llm/src/discovery/watcher.rs (1 hunks)
  • lib/llm/src/entrypoint/input.rs (1 hunks)
  • lib/llm/src/entrypoint/input/batch.rs (3 hunks)
  • lib/llm/src/entrypoint/input/common.rs (2 hunks)
  • lib/llm/src/entrypoint/input/grpc.rs (3 hunks)
  • lib/llm/src/entrypoint/input/http.rs (4 hunks)
  • lib/llm/src/entrypoint/input/text.rs (4 hunks)
  • lib/llm/tests/audit_nats_integration.rs (2 hunks)
  • lib/runtime/Cargo.toml (1 hunks)
  • lib/runtime/src/component/client.rs (5 hunks)
  • lib/runtime/src/component/endpoint.rs (2 hunks)
  • lib/runtime/src/distributed.rs (4 hunks)
  • lib/runtime/src/storage/key_value_store.rs (10 hunks)
  • lib/runtime/src/storage/key_value_store/etcd.rs (2 hunks)
  • lib/runtime/src/storage/key_value_store/file.rs (1 hunks)
  • lib/runtime/src/storage/key_value_store/mem.rs (3 hunks)
  • lib/runtime/src/storage/key_value_store/nats.rs (2 hunks)
🧰 Additional context used
🧠 Learnings (18)
📚 Learning: 2025-06-13T22:07:24.843Z
Learnt from: kthui
Repo: ai-dynamo/dynamo PR: 1424
File: lib/runtime/src/pipeline/network/egress/push_router.rs:204-209
Timestamp: 2025-06-13T22:07:24.843Z
Learning: The codebase uses async-nats version 0.40, not the older nats crate. Error handling should use async_nats::error::Error variants, not nats::Error variants.

Applied to files:

  • lib/runtime/src/storage/key_value_store/nats.rs
  • lib/llm/src/audit/sink.rs
📚 Learning: 2025-08-15T23:51:04.958Z
Learnt from: PeaBrane
Repo: ai-dynamo/dynamo PR: 2465
File: lib/runtime/src/utils/typed_prefix_watcher.rs:94-101
Timestamp: 2025-08-15T23:51:04.958Z
Learning: In the dynamo codebase's etcd client implementation, `PrefixWatcher` uses `#[derive(Dissolve)]` to generate a `dissolve()` method. The pattern `let (_, _watcher, mut events_rx) = prefix_watcher.dissolve();` is the standard and intended usage throughout the codebase. The `mpsc::Receiver<WatchEvent>` maintains the etcd watch stream independently, so the `Watcher` handle can be safely dropped. This pattern is used consistently in critical infrastructure modules like component/client.rs, utils/leader_worker_barrier.rs, and entrypoint/input/http.rs.

Applied to files:

  • lib/runtime/src/storage/key_value_store/nats.rs
  • lib/llm/src/discovery/watcher.rs
  • lib/runtime/src/component/endpoint.rs
  • lib/runtime/src/component/client.rs
  • lib/runtime/src/storage/key_value_store/etcd.rs
  • lib/runtime/src/storage/key_value_store.rs
  • lib/runtime/src/distributed.rs
📚 Learning: 2025-09-18T21:41:02.263Z
Learnt from: oandreeva-nv
Repo: ai-dynamo/dynamo PR: 2989
File: lib/llm/src/block_manager/distributed/transfer.rs:59-60
Timestamp: 2025-09-18T21:41:02.263Z
Learning: ConnectorTransferBatcher in distributed/transfer.rs duplicates existing batching logic and uses try_join_all(), which spawns unlimited concurrent transfers and bypasses the existing concurrency control systems. The proper solution is to integrate distributed transfers with the existing TransferBatcher + LocalTransferManager pipeline rather than implementing separate batching logic.

Applied to files:

  • lib/llm/src/entrypoint/input/batch.rs
📚 Learning: 2025-08-18T20:51:51.324Z
Learnt from: PeaBrane
Repo: ai-dynamo/dynamo PR: 2465
File: lib/runtime/src/pipeline/network/egress/push_router.rs:0-0
Timestamp: 2025-08-18T20:51:51.324Z
Learning: The runtime crate cannot depend on the llm crate due to architectural dependency constraints, preventing imports from lib/llm into lib/runtime.

Applied to files:

  • lib/llm/src/entrypoint/input/batch.rs
  • lib/llm/src/entrypoint/input/http.rs
  • lib/bindings/python/rust/llm/entrypoint.rs
  • lib/llm/src/entrypoint/input/grpc.rs
  • lib/llm/src/entrypoint/input/common.rs
📚 Learning: 2025-07-14T21:25:56.930Z
Learnt from: ryanolson
Repo: ai-dynamo/dynamo PR: 1919
File: lib/runtime/src/engine.rs:168-168
Timestamp: 2025-07-14T21:25:56.930Z
Learning: The AsyncEngineContextProvider trait in lib/runtime/src/engine.rs was intentionally changed from `Send + Sync + Debug` to `Send + Debug` because the Sync bound was overly constraining. The trait should only require Send + Debug as designed.

Applied to files:

  • lib/llm/src/entrypoint/input/batch.rs
  • lib/llm/tests/audit_nats_integration.rs
  • lib/llm/src/audit/sink.rs
  • lib/llm/src/entrypoint/input/http.rs
  • lib/llm/src/entrypoint/input/grpc.rs
  • lib/llm/src/entrypoint/input/common.rs
📚 Learning: 2025-09-21T01:40:52.456Z
Learnt from: PeaBrane
Repo: ai-dynamo/dynamo PR: 3155
File: components/backends/vllm/src/dynamo/vllm/main.py:228-233
Timestamp: 2025-09-21T01:40:52.456Z
Learning: In the dynamo codebase, error handling for distributed runtime client initialization (like runtime.namespace().component().endpoint().client()) is handled at the Rust level in the distributed runtime bindings, so Python-level try/catch blocks are not needed and would be redundant.

Applied to files:

  • lib/llm/src/entrypoint/input/batch.rs
  • lib/llm/src/entrypoint/input/common.rs
  • lib/bindings/python/rust/lib.rs
📚 Learning: 2025-08-15T23:51:04.958Z
Learnt from: PeaBrane
Repo: ai-dynamo/dynamo PR: 2465
File: lib/runtime/src/utils/typed_prefix_watcher.rs:94-101
Timestamp: 2025-08-15T23:51:04.958Z
Learning: In the dynamo codebase's etcd client implementation, when using `kv_get_and_watch_prefix()` and `dissolve()`, the returned `events_rx` receiver maintains the etcd watch stream independently. The watcher handle can be safely dropped (using `_watcher`) without terminating the stream, as the receiver keeps the connection alive internally. This is a consistent pattern used throughout the codebase in multiple critical modules.

Applied to files:

  • lib/llm/src/discovery/watcher.rs
  • lib/runtime/src/component/client.rs
📚 Learning: 2025-08-15T23:51:04.958Z
Learnt from: PeaBrane
Repo: ai-dynamo/dynamo PR: 2465
File: lib/runtime/src/utils/typed_prefix_watcher.rs:94-101
Timestamp: 2025-08-15T23:51:04.958Z
Learning: In the dynamo codebase's etcd client implementation, when using `kv_get_and_watch_prefix()` and `dissolve()`, the returned `events_rx` receiver maintains the etcd watch stream independently. The watcher handle can be safely dropped (using `_watcher`) without terminating the stream, as the receiver keeps the connection alive internally.

Applied to files:

  • lib/llm/src/discovery/watcher.rs
  • lib/runtime/src/component/client.rs
📚 Learning: 2025-11-05T08:41:06.483Z
Learnt from: ryanolson
Repo: ai-dynamo/dynamo PR: 4070
File: lib/discovery/src/systems/etcd/peer.rs:151-188
Timestamp: 2025-11-05T08:41:06.483Z
Learning: In lib/discovery/src/systems/etcd/peer.rs, the register_instance method intentionally captures the lease_id before entering the OperationExecutor closure. If the lease is revoked or fails, the operation should hard-fail rather than retry with a new lease, because the system does not track which entries were registered under which lease. Retrying with a fresh lease would create inconsistent state.

Applied to files:

  • lib/runtime/src/component/endpoint.rs
  • lib/runtime/src/storage/key_value_store/etcd.rs
📚 Learning: 2025-07-01T13:55:03.940Z
Learnt from: nnshah1
Repo: ai-dynamo/dynamo PR: 1444
File: tests/fault_tolerance/utils/metrics.py:30-32
Timestamp: 2025-07-01T13:55:03.940Z
Learning: The `dynamo_worker()` decorator in the dynamo codebase returns a wrapper that automatically injects the `runtime` parameter before calling the wrapped function. This means callers only need to provide the non-runtime parameters, while the decorator handles injecting the runtime argument automatically. For example, a function with signature `async def get_metrics(runtime, log_dir)` decorated with `dynamo_worker()` can be called as `get_metrics(log_dir)` because the decorator wrapper injects the runtime parameter.

Applied to files:

  • components/src/dynamo/vllm/main.py
  • components/src/dynamo/trtllm/main.py
  • components/src/dynamo/sglang/main.py
📚 Learning: 2025-07-16T12:41:12.543Z
Learnt from: grahamking
Repo: ai-dynamo/dynamo PR: 1962
File: lib/runtime/src/component/client.rs:270-273
Timestamp: 2025-07-16T12:41:12.543Z
Learning: In lib/runtime/src/component/client.rs, the current mutex usage in get_or_create_dynamic_instance_source is temporary while evaluating whether the mutex can be dropped entirely. The code currently has a race condition between try_lock and lock().await, but this is acknowledged as an interim state during the performance optimization process.

Applied to files:

  • lib/runtime/src/component/client.rs
📚 Learning: 2025-09-17T01:00:50.937Z
Learnt from: PeaBrane
Repo: ai-dynamo/dynamo PR: 3077
File: lib/llm/src/kv_router/subscriber.rs:334-336
Timestamp: 2025-09-17T01:00:50.937Z
Learning: PeaBrane identified that reordering tokio::select! arms in the indexer (moving dump_rx.recv() to be after event_rx.recv()) creates a natural barrier that ensures RouterEvents are always processed before dump requests, solving the ack-before-commit race condition. This leverages the existing biased directive and requires minimal code changes, aligning with their preference for contained solutions.

Applied to files:

  • lib/runtime/src/component/client.rs
📚 Learning: 2025-09-17T01:00:50.937Z
Learnt from: PeaBrane
Repo: ai-dynamo/dynamo PR: 3077
File: lib/llm/src/kv_router/subscriber.rs:334-336
Timestamp: 2025-09-17T01:00:50.937Z
Learning: PeaBrane suggested using tokio::select! arm ordering with the existing biased directive in the indexer to create a natural barrier for dump requests, ensuring KV events are drained before snapshotting. This approach leverages existing architecture (biased select) to solve race conditions with minimal code changes, which aligns with their preference for contained solutions.

Applied to files:

  • lib/runtime/src/component/client.rs
📚 Learning: 2025-09-11T03:24:47.820Z
Learnt from: kthui
Repo: ai-dynamo/dynamo PR: 3004
File: lib/runtime/src/pipeline/network/ingress/push_handler.rs:271-277
Timestamp: 2025-09-11T03:24:47.820Z
Learning: In lib/runtime/src/pipeline/network/ingress/push_handler.rs, the maintainer prefers to keep the existing error comparison logic using format!("{:?}", err) == STREAM_ERR_MSG unchanged until proper error types are implemented, even though it has technical debt. Avoid suggesting changes to working legacy code that will be refactored later.

Applied to files:

  • lib/runtime/src/component/client.rs
📚 Learning: 2025-06-05T01:02:15.318Z
Learnt from: PeaBrane
Repo: ai-dynamo/dynamo PR: 1392
File: lib/llm/src/kv_router/scoring.rs:35-46
Timestamp: 2025-06-05T01:02:15.318Z
Learning: In lib/llm/src/kv_router/scoring.rs, PeaBrane prefers panic-based early failure over Result-based error handling for the worker_id() method to catch invalid data early during development.

Applied to files:

  • lib/runtime/src/component/client.rs
📚 Learning: 2025-08-22T19:55:41.608Z
Learnt from: nachiketb-nvidia
Repo: ai-dynamo/dynamo PR: 2656
File: lib/llm/src/protocols/openai/chat_completions/delta.rs:320-327
Timestamp: 2025-08-22T19:55:41.608Z
Learning: There are two separate DeltaGenerator classes in the codebase: one for chat completions (lib/llm/src/protocols/openai/chat_completions/delta.rs with object "chat.completion.chunk") and one for text completions (lib/llm/src/protocols/openai/completions/delta.rs with object "text_completion"). They have different create_choice method signatures and serve different OpenAI API endpoints. The reasoning parsing functionality is only relevant to the chat completions DeltaGenerator.

Applied to files:

  • lib/llm/src/entrypoint/input/text.rs
📚 Learning: 2025-05-29T06:20:12.901Z
Learnt from: ryanolson
Repo: ai-dynamo/dynamo PR: 1093
File: lib/llm/src/block_manager/block/registry.rs:98-122
Timestamp: 2025-05-29T06:20:12.901Z
Learning: In lib/llm/src/block_manager/block/registry.rs, the background task spawned for handling unregister notifications uses detached concurrency by design. The JoinHandle is intentionally not stored as this represents a reasonable architectural tradeoff for a long-running cleanup task.

Applied to files:

  • lib/llm/src/entrypoint/input/http.rs
📚 Learning: 2025-09-02T16:46:54.015Z
Learnt from: GuanLuo
Repo: ai-dynamo/dynamo PR: 2714
File: lib/llm/src/discovery/model_entry.rs:38-42
Timestamp: 2025-09-02T16:46:54.015Z
Learning: In lib/llm/src/discovery/model_entry.rs, GuanLuo prefers not to add serde defaults for model_type and model_input fields to keep the specification explicit and avoid user errors, relying on atomic deployment strategy to avoid backward compatibility issues.

Applied to files:

  • lib/bindings/python/rust/llm/entrypoint.rs
  • launch/dynamo-run/src/lib.rs
🧬 Code graph analysis (27)
lib/runtime/src/storage/key_value_store/mem.rs (4)
lib/bindings/python/rust/lib.rs (2)
  • new (431-483)
  • shutdown (639-641)
lib/runtime/src/distributed.rs (2)
  • new (51-212)
  • shutdown (237-240)
lib/runtime/src/storage/key_value_store.rs (9)
  • new (40-42)
  • new (81-83)
  • new (269-271)
  • new (492-505)
  • shutdown (123-123)
  • shutdown (235-243)
  • shutdown (387-389)
  • from_raw (45-47)
  • key (85-87)
lib/bindings/python/rust/llm/kv.rs (2)
  • shutdown (156-158)
  • shutdown (891-894)
lib/runtime/src/storage/key_value_store/nats.rs (4)
lib/bindings/python/rust/lib.rs (1)
  • shutdown (639-641)
lib/runtime/src/distributed.rs (2)
  • shutdown (237-240)
  • new (51-212)
lib/runtime/src/storage/key_value_store.rs (8)
  • shutdown (123-123)
  • shutdown (235-243)
  • shutdown (387-389)
  • new (40-42)
  • new (81-83)
  • new (269-271)
  • new (492-505)
  • from_raw (45-47)
lib/runtime/src/storage/key_value_store/file.rs (1)
  • shutdown (101-107)
lib/llm/src/entrypoint/input/batch.rs (2)
lib/bindings/python/src/dynamo/_core.pyi (2)
  • DistributedRuntime (35-65)
  • CancellationToken (67-78)
lib/llm/src/entrypoint/input/common.rs (1)
  • prepare_engine (56-183)
lib/llm/src/discovery/watcher.rs (1)
lib/runtime/src/storage/key_value_store.rs (1)
  • key (85-87)
components/src/dynamo/vllm/args.py (4)
lib/runtime/src/storage/key_value_store.rs (2)
  • default (250-252)
  • get (405-405)
lib/runtime/src/storage/key_value_store/mem.rs (2)
  • default (31-33)
  • get (149-155)
lib/runtime/src/storage/key_value_store/etcd.rs (1)
  • get (85-99)
lib/runtime/src/storage/key_value_store/file.rs (1)
  • get (167-179)
lib/runtime/src/component/endpoint.rs (2)
lib/runtime/src/distributed.rs (1)
  • store (289-291)
lib/runtime/src/storage/key_value_store.rs (2)
  • key (85-87)
  • from_raw (45-47)
components/src/dynamo/vllm/main.py (4)
components/src/dynamo/mocker/main.py (1)
  • worker (25-55)
components/src/dynamo/sglang/main.py (1)
  • worker (36-65)
components/src/dynamo/trtllm/main.py (1)
  • worker (105-121)
examples/multimodal/components/worker.py (2)
  • worker (404-423)
  • parse_args (45-98)
lib/llm/tests/audit_nats_integration.rs (1)
lib/llm/src/audit/sink.rs (1)
  • spawn_workers_from_env (92-113)
lib/llm/src/audit/sink.rs (1)
lib/runtime/src/distributed.rs (1)
  • nats_client (268-270)
lib/runtime/src/component/client.rs (3)
lib/runtime/src/storage/key_value_store.rs (2)
  • etcd (261-263)
  • key (85-87)
lib/runtime/src/distributed.rs (2)
  • store (289-291)
  • new (51-212)
lib/runtime/src/transports/etcd.rs (2)
  • new (63-106)
  • new (489-532)
components/src/dynamo/trtllm/main.py (4)
components/src/dynamo/sglang/main.py (1)
  • worker (36-65)
components/src/dynamo/vllm/main.py (1)
  • worker (74-119)
examples/multimodal/components/worker.py (1)
  • worker (404-423)
components/src/dynamo/trtllm/utils/trtllm_utils.py (1)
  • cmd_line_args (116-364)
lib/llm/src/entrypoint/input.rs (7)
lib/llm/src/audit/sink.rs (2)
  • spawn_workers_from_env (92-113)
  • new (39-46)
launch/dynamo-run/src/lib.rs (1)
  • run (19-116)
lib/llm/src/entrypoint/input/batch.rs (1)
  • run (53-194)
lib/llm/src/entrypoint/input/grpc.rs (1)
  • run (23-165)
lib/llm/src/entrypoint/input/http.rs (1)
  • run (25-266)
lib/llm/src/entrypoint/input/text.rs (1)
  • run (20-37)
lib/llm/src/entrypoint/input/endpoint.rs (1)
  • run (27-132)
components/src/dynamo/trtllm/utils/trtllm_utils.py (5)
lib/runtime/src/storage/key_value_store.rs (2)
  • default (250-252)
  • get (405-405)
lib/runtime/src/storage/key_value_store/mem.rs (2)
  • default (31-33)
  • get (149-155)
lib/runtime/src/storage/key_value_store/etcd.rs (1)
  • get (85-99)
lib/runtime/src/storage/key_value_store/file.rs (1)
  • get (167-179)
lib/runtime/src/storage/key_value_store/nats.rs (1)
  • get (137-142)
components/src/dynamo/mocker/args.py (4)
lib/runtime/src/storage/key_value_store.rs (2)
  • default (250-252)
  • get (405-405)
lib/runtime/src/storage/key_value_store/mem.rs (2)
  • default (31-33)
  • get (149-155)
lib/runtime/src/storage/key_value_store/etcd.rs (1)
  • get (85-99)
lib/runtime/src/storage/key_value_store/file.rs (1)
  • get (167-179)
lib/llm/src/entrypoint/input/text.rs (2)
lib/bindings/python/src/dynamo/_core.pyi (1)
  • DistributedRuntime (35-65)
lib/llm/src/entrypoint/input/common.rs (1)
  • prepare_engine (56-183)
lib/llm/src/entrypoint/input/http.rs (7)
lib/bindings/python/src/dynamo/_core.pyi (4)
  • DistributedRuntime (35-65)
  • RouterMode (984-986)
  • run (962-969)
  • EngineConfig (1020-1022)
launch/dynamo-run/src/lib.rs (1)
  • run (19-116)
lib/llm/src/entrypoint/input/batch.rs (1)
  • run (53-194)
lib/llm/src/entrypoint/input/grpc.rs (1)
  • run (23-165)
lib/llm/src/entrypoint/input/text.rs (1)
  • run (20-37)
lib/llm/src/http/service/service_v2.rs (1)
  • run (199-276)
lib/llm/src/entrypoint/input/endpoint.rs (1)
  • run (27-132)
components/src/dynamo/sglang/main.py (5)
lib/bindings/python/src/dynamo/_core.pyi (1)
  • DistributedRuntime (35-65)
components/src/dynamo/trtllm/main.py (1)
  • worker (105-121)
components/src/dynamo/vllm/main.py (1)
  • worker (74-119)
components/src/dynamo/sglang/args.py (1)
  • parse_args (214-373)
components/src/dynamo/common/config_dump/config_dumper.py (1)
  • dump_config (88-121)
lib/bindings/python/rust/llm/entrypoint.rs (1)
lib/bindings/python/tests/test_kv_bindings.py (1)
  • distributed_runtime (31-39)
components/src/dynamo/mocker/main.py (1)
lib/bindings/python/src/dynamo/_core.pyi (1)
  • DistributedRuntime (35-65)
lib/runtime/src/storage/key_value_store/file.rs (1)
lib/runtime/src/storage/key_value_store.rs (19)
  • new (40-42)
  • new (81-83)
  • new (269-271)
  • new (492-505)
  • get_or_create_bucket (112-117)
  • get_or_create_bucket (184-197)
  • get_or_create_bucket (273-280)
  • get_bucket (119-119)
  • get_bucket (199-223)
  • get_bucket (282-287)
  • insert (397-402)
  • key (85-87)
  • value (93-95)
  • get (405-405)
  • delete (408-408)
  • watch (314-363)
  • watch (413-415)
  • from_raw (45-47)
  • entries (417-417)
lib/runtime/src/storage/key_value_store/etcd.rs (6)
lib/bindings/python/rust/lib.rs (1)
  • shutdown (639-641)
lib/runtime/src/distributed.rs (2)
  • shutdown (237-240)
  • new (51-212)
lib/runtime/src/storage/key_value_store.rs (9)
  • shutdown (123-123)
  • shutdown (235-243)
  • shutdown (387-389)
  • new (40-42)
  • new (81-83)
  • new (269-271)
  • new (492-505)
  • key (85-87)
  • from_raw (45-47)
lib/runtime/src/storage/key_value_store/file.rs (1)
  • shutdown (101-107)
lib/runtime/src/storage/key_value_store/mem.rs (3)
  • shutdown (111-111)
  • new (52-56)
  • new (60-70)
lib/runtime/src/storage/key_value_store/nats.rs (2)
  • shutdown (56-59)
  • new (63-65)
lib/runtime/src/storage/key_value_store.rs (5)
lib/runtime/src/distributed.rs (3)
  • fmt (45-47)
  • shutdown (237-240)
  • new (51-212)
lib/runtime/src/storage/key_value_store/file.rs (2)
  • fmt (142-144)
  • shutdown (101-107)
lib/runtime/src/transports/etcd.rs (4)
  • fmt (45-47)
  • default (437-466)
  • new (63-106)
  • new (489-532)
lib/runtime/src/storage/key_value_store/etcd.rs (2)
  • shutdown (58-60)
  • new (24-26)
lib/runtime/src/storage/key_value_store/mem.rs (4)
  • shutdown (111-111)
  • default (31-33)
  • new (52-56)
  • new (60-70)
lib/llm/src/entrypoint/input/grpc.rs (4)
lib/bindings/python/src/dynamo/_core.pyi (1)
  • DistributedRuntime (35-65)
lib/llm/src/entrypoint/input/batch.rs (1)
  • run (53-194)
lib/llm/src/entrypoint/input/http.rs (1)
  • run (25-266)
lib/llm/src/entrypoint/input/text.rs (1)
  • run (20-37)
lib/llm/src/entrypoint/input/common.rs (3)
lib/runtime/src/distributed.rs (1)
  • store (289-291)
lib/bindings/python/rust/lib.rs (2)
  • new (431-483)
  • new (1121-1125)
lib/llm/src/discovery/watcher.rs (1)
  • new (68-84)
lib/bindings/python/rust/lib.rs (3)
lib/bindings/python/rust/llm/entrypoint.rs (1)
  • to_pyerr (312-317)
lib/runtime/src/storage/key_value_store.rs (1)
  • default (250-252)
lib/bindings/python/tests/conftest.py (1)
  • runtime (401-435)
launch/dynamo-run/src/lib.rs (2)
lib/llm/src/entrypoint/input.rs (2)
  • default (95-101)
  • run_input (108-147)
lib/runtime/src/transports/nats.rs (2)
  • default (348-353)
  • default (378-399)
lib/runtime/src/distributed.rs (2)
lib/runtime/src/transports/etcd.rs (4)
  • etcd_client (110-112)
  • new (63-106)
  • new (489-532)
  • default (437-466)
lib/runtime/src/storage/key_value_store.rs (4)
  • memory (257-259)
  • etcd (261-263)
  • file (265-267)
  • default (250-252)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (16)
  • GitHub Check: vllm (amd64)
  • GitHub Check: trtllm (arm64)
  • GitHub Check: sglang (amd64)
  • GitHub Check: sglang (arm64)
  • GitHub Check: vllm (arm64)
  • GitHub Check: trtllm (amd64)
  • GitHub Check: operator (amd64)
  • GitHub Check: operator (arm64)
  • GitHub Check: Build and Test - dynamo
  • GitHub Check: tests (launch/dynamo-run)
  • GitHub Check: tests (.)
  • GitHub Check: tests (lib/bindings/python)
  • GitHub Check: clippy (.)
  • GitHub Check: tests (lib/runtime/examples)
  • GitHub Check: clippy (launch/dynamo-run)
  • GitHub Check: clippy (lib/bindings/python)
🔇 Additional comments (30)
lib/runtime/Cargo.toml (1)

66-66: Dependency verification complete—no issues found.

The inotify crate v0.11.0 is the latest published version with 65 million all-time downloads and was updated about 1 year ago. No security advisories exist for this crate in the RustSec database. The dependency is appropriately placed and specified.

components/src/dynamo/mocker/args.py (1)

207-212: LGTM!

The CLI option is well-structured with appropriate defaults and comprehensive help text describing the supported backends and their configuration.

components/src/dynamo/vllm/args.py (2)

53-53: LGTM!

The store_kv field is correctly added to the Config class and will be properly propagated through the runtime initialization.


192-197: LGTM!

The CLI option follows the same pattern as other components with consistent defaults and help text.

components/src/dynamo/trtllm/utils/trtllm_utils.py (2)

61-61: LGTM!

The store_kv field is properly initialized and will be populated from CLI arguments.


283-288: LGTM!

The CLI option is consistent with other components and provides clear documentation.

launch/dynamo-run/src/lib.rs (4)

9-10: LGTM!

The imports support the new KV store selection mechanism and NATS configuration.


80-86: LGTM!

The distributed runtime configuration properly integrates the KV store selection from flags, with appropriate defaults for NATS config and static worker mode.


104-113: LGTM!

The refactoring to always use DistributedRuntime simplifies the code by removing the previous Either<Runtime, DistributedRuntime> pattern.


120-124: LGTM!

The updated signature makes the distributed runtime requirement explicit and simplifies the function interface.

components/src/dynamo/sglang/args.py (3)

96-101: LGTM!

The dictionary-based argument definition follows SGLang's existing pattern and provides consistent configuration with other components.


111-111: LGTM!

The store_kv field is properly added to the dataclass and will be correctly propagated.


339-339: LGTM!

The field is correctly populated from the parsed arguments.

lib/llm/src/discovery/watcher.rs (1)

186-191: LGTM!

The update to use Key directly in the Delete event aligns with the KV-store-agnostic refactoring and simplifies the interface by removing the need to carry full KeyValue payloads.

lib/bindings/python/rust/llm/entrypoint.rs (1)

301-308: LGTM!

The change to pass distributed_runtime.inner.clone() directly is consistent with the refactoring that eliminates the Either<Runtime, DistributedRuntime> pattern in favor of always using DistributedRuntime.

launch/dynamo-run/src/flags.rs (1)

130-131: LGTM!

The field is properly added with an appropriate default value that matches the convention used across other components.

lib/llm/tests/audit_nats_integration.rs (2)

170-170: LGTM!

The change correctly aligns with the updated spawn_workers_from_env signature that now requires a direct reference instead of an optional reference.


227-227: LGTM!

Consistent with the signature change at line 170.

lib/runtime/src/storage/key_value_store/etcd.rs (3)

58-60: LGTM - Appropriate no-op shutdown implementation.

The comment correctly indicates that etcd automatically revokes the lease on disconnect, making explicit cleanup unnecessary. This aligns with etcd's lease-based lifecycle management.


140-142: LGTM - Optimization to construct KeyValue only when needed.

Moving the construction inside the Put branch avoids unnecessary allocation for Delete events.


145-145: LGTM - Standardized Delete event emission.

Using Key::from_raw(key) directly is cleaner and consistent with the updated semantics across all KeyValueStore backends.

lib/runtime/src/storage/key_value_store/nats.rs (2)

56-59: Incomplete shutdown implementation tracked as follow-up work.

The TODO indicates that owned keys should be deleted on shutdown. The PR description mentions "cleanup of old keys analogous to etcd lease revocation" as planned follow-up work, so this is a known limitation.

Note: Based on PR objectives, this will be addressed in a future PR.


169-175: LGTM - Standardized watch event emission.

The changes align with other backends:

  • KeyValue constructed inline for Put operations
  • Delete and Purge events emit Key::from_raw(entry.key) directly
lib/runtime/src/storage/key_value_store/mem.rs (3)

60-60: LGTM - Appropriate visibility restriction for encapsulation.

Changing from pub to pub(super) ensures that MemoryStore instances are created through the parent module's KeyValueStoreManager rather than being directly instantiated, which provides better control over store lifecycle management.


110-111: LGTM - No-op shutdown for in-memory store.

An empty shutdown implementation is appropriate since there are no external resources to clean up for an in-memory store.


210-210: LGTM - Consistent Delete event emission.

Aligns with the standardized approach across all KeyValueStore backends.

components/src/dynamo/mocker/main.py (1)

75-75: Original concern is resolved—args.store_kv is properly defined.

The verification confirms that --store-kv is correctly defined in components/src/dynamo/mocker/args.py with appropriate defaults (falling back to environment variable DYN_STORE_KV or defaulting to "etcd") and comprehensive help text describing the supported backends (etcd, mem, file). The reference to args.store_kv in main.py line 75 uses the correct naming convention (argparse converts the hyphenated argument to underscore format). No issues require attention.

lib/llm/src/audit/sink.rs (1)

92-94: All call sites correctly updated to pass direct references.

Verification confirms the signature simplification from Option<&DistributedRuntime> to &DistributedRuntime is properly applied across all three call sites:

  • lib/llm/tests/audit_nats_integration.rs:170, 227: Pass &drt from owned test values
  • lib/llm/src/entrypoint/input.rs:120: Passes &drt from owned function parameter

The refactoring is complete and consistent throughout the codebase.

lib/runtime/src/component/endpoint.rs (1)

206-222: Code change is verified as correct.

The unique_path() function returns a properly formatted string "{ns}/{cp}/{ep}/{lease_id:x}" composed of validated identifiers (namespace name, component name, endpoint name) and a hex-formatted lease ID. The use of Key::from_raw() is appropriate here because the string is constructed from legitimate identity sources and is already safe for the KV store abstraction. This usage pattern is consistent throughout the codebase (nats, etcd, file, mem implementations all use from_raw() similarly for pre-validated strings).

lib/llm/src/entrypoint/input.rs (1)

124-144: Fix move semantics in run_input dispatch

Moving drt and engine_config into every match arm triggers E0382 (“use of moved value”) because the first arm consumes the values, so later arms can no longer take ownership. Wrap engine_config in an Option and clone drt per branch so ownership is consumed at most once. Suggested rewrite:

-    match in_opt {
-        Input::Http => {
-            http::run(drt, engine_config).await?;
-        }
+    let mut engine_config = Some(engine_config);
+    match in_opt {
+        Input::Http => {
+            let engine_config = engine_config.take().expect("engine config");
+            http::run(drt.clone(), engine_config).await?;
+        }
         Input::Grpc => {
-            grpc::run(drt, engine_config).await?;
+            let engine_config = engine_config.take().expect("engine config");
+            grpc::run(drt.clone(), engine_config).await?;
         }
         Input::Text => {
-            text::run(drt, None, engine_config).await?;
+            let engine_config = engine_config.take().expect("engine config");
+            text::run(drt.clone(), None, engine_config).await?;
         }
         Input::Stdin => {
             let mut prompt = String::new();
             std::io::stdin().read_to_string(&mut prompt).unwrap();
-            text::run(drt, Some(prompt), engine_config).await?;
+            let engine_config = engine_config.take().expect("engine config");
+            text::run(drt.clone(), Some(prompt), engine_config).await?;
         }
         Input::Batch(path) => {
-            batch::run(drt, path, engine_config).await?;
+            let engine_config = engine_config.take().expect("engine config");
+            batch::run(drt.clone(), path, engine_config).await?;
         }
         Input::Endpoint(path) => {
-            endpoint::run(drt, path, engine_config).await?;
+            let engine_config = engine_config.take().expect("engine config");
+            endpoint::run(drt, path, engine_config).await?;
         }
     }

This compiles cleanly and preserves the single-use ownership semantics required here.

⛔ Skipped due to learnings
Learnt from: ryanolson
Repo: ai-dynamo/dynamo PR: 1919
File: lib/runtime/src/engine.rs:168-168
Timestamp: 2025-07-14T21:25:56.930Z
Learning: The AsyncEngineContextProvider trait in lib/runtime/src/engine.rs was intentionally changed from `Send + Sync + Debug` to `Send + Debug` because the Sync bound was overly constraining. The trait should only require Send + Debug as designed.

Copy link
Contributor

@biswapanda biswapanda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall lgtm.
I've left few comments/notes

grahamking and others added 7 commits November 7, 2025 15:13
Running any of the components with `--store-kv file` will use the file
system instead of etcd for discovery. If there is a shared folder, or
for the single-node case, this allows running frontend + backend without
 etcd.
Signed-off-by: Graham King <grahamk@nvidia.com>
Signed-off-by: Graham King <grahamk@nvidia.com>
This test has a race condition, re-run it.

Signed-off-by: Graham King <grahamk@nvidia.com>
Allows me to land this so others can rebase, then I can fix in next PR.

Signed-off-by: Graham King <grahamk@nvidia.com>
Improve log output, fix static mode

Signed-off-by: Graham King <grahamk@nvidia.com>
Signed-off-by: Graham King <grahamk@nvidia.com>
@grahamking grahamking merged commit 794c0a4 into main Nov 7, 2025
32 of 33 checks passed
@grahamking grahamking deleted the gk-kv-file branch November 7, 2025 21:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants