-
Notifications
You must be signed in to change notification settings - Fork 676
feat(keyvalue): Filesystem backed KeyValueStore #4138
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughThis change introduces configurable KV store backends (Etcd, Memory, File) throughout the Dynamo runtime system. A Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Areas requiring extra attention:
Poem
Pre-merge checks✅ Passed checks (2 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
lib/runtime/src/storage/key_value_store.rs (1)
282-287: Handle missing file buckets the same way as other backends.When the directory for a bucket doesn’t exist yet, this implementation bubbles up
StoreError::MissingBucket. Callers expectget_bucketto yieldOk(None)in that situation (seeKeyValueStoreManager::load, which treatsNoneas “no data yet”). With the file backend this becomes a hard error, so flows that probe for a bucket before anything is published—like card/model lookups during startup—now fail when run with--store-kv file.Please align the file backend with the memory/etcd implementations by returning
Ok(None)(after leaving the directory untouched) when the bucket directory is absent. That keeps the higher-level logic working identically across backends.- if !p.exists() { - return Err(StoreError::MissingBucket(bucket_name.to_string())); - } + if !p.exists() { + return Ok(None); + }
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (3)
Cargo.lockis excluded by!**/*.locklib/bindings/python/Cargo.lockis excluded by!**/*.locklib/runtime/examples/Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (31)
components/src/dynamo/frontend/main.py(2 hunks)components/src/dynamo/mocker/args.py(1 hunks)components/src/dynamo/mocker/main.py(1 hunks)components/src/dynamo/sglang/args.py(3 hunks)components/src/dynamo/sglang/main.py(2 hunks)components/src/dynamo/trtllm/main.py(2 hunks)components/src/dynamo/trtllm/utils/trtllm_utils.py(4 hunks)components/src/dynamo/vllm/args.py(2 hunks)components/src/dynamo/vllm/main.py(2 hunks)launch/dynamo-run/src/flags.rs(1 hunks)launch/dynamo-run/src/lib.rs(4 hunks)lib/bindings/python/rust/lib.rs(4 hunks)lib/bindings/python/rust/llm/entrypoint.rs(1 hunks)lib/llm/src/audit/sink.rs(1 hunks)lib/llm/src/discovery/watcher.rs(1 hunks)lib/llm/src/entrypoint/input.rs(1 hunks)lib/llm/src/entrypoint/input/batch.rs(3 hunks)lib/llm/src/entrypoint/input/common.rs(2 hunks)lib/llm/src/entrypoint/input/grpc.rs(3 hunks)lib/llm/src/entrypoint/input/http.rs(4 hunks)lib/llm/src/entrypoint/input/text.rs(4 hunks)lib/llm/tests/audit_nats_integration.rs(2 hunks)lib/runtime/Cargo.toml(1 hunks)lib/runtime/src/component/client.rs(5 hunks)lib/runtime/src/component/endpoint.rs(2 hunks)lib/runtime/src/distributed.rs(4 hunks)lib/runtime/src/storage/key_value_store.rs(10 hunks)lib/runtime/src/storage/key_value_store/etcd.rs(2 hunks)lib/runtime/src/storage/key_value_store/file.rs(1 hunks)lib/runtime/src/storage/key_value_store/mem.rs(3 hunks)lib/runtime/src/storage/key_value_store/nats.rs(2 hunks)
🧰 Additional context used
🧠 Learnings (18)
📚 Learning: 2025-06-13T22:07:24.843Z
Learnt from: kthui
Repo: ai-dynamo/dynamo PR: 1424
File: lib/runtime/src/pipeline/network/egress/push_router.rs:204-209
Timestamp: 2025-06-13T22:07:24.843Z
Learning: The codebase uses async-nats version 0.40, not the older nats crate. Error handling should use async_nats::error::Error variants, not nats::Error variants.
Applied to files:
lib/runtime/src/storage/key_value_store/nats.rslib/llm/src/audit/sink.rs
📚 Learning: 2025-08-15T23:51:04.958Z
Learnt from: PeaBrane
Repo: ai-dynamo/dynamo PR: 2465
File: lib/runtime/src/utils/typed_prefix_watcher.rs:94-101
Timestamp: 2025-08-15T23:51:04.958Z
Learning: In the dynamo codebase's etcd client implementation, `PrefixWatcher` uses `#[derive(Dissolve)]` to generate a `dissolve()` method. The pattern `let (_, _watcher, mut events_rx) = prefix_watcher.dissolve();` is the standard and intended usage throughout the codebase. The `mpsc::Receiver<WatchEvent>` maintains the etcd watch stream independently, so the `Watcher` handle can be safely dropped. This pattern is used consistently in critical infrastructure modules like component/client.rs, utils/leader_worker_barrier.rs, and entrypoint/input/http.rs.
Applied to files:
lib/runtime/src/storage/key_value_store/nats.rslib/llm/src/discovery/watcher.rslib/runtime/src/component/endpoint.rslib/runtime/src/component/client.rslib/runtime/src/storage/key_value_store/etcd.rslib/runtime/src/storage/key_value_store.rslib/runtime/src/distributed.rs
📚 Learning: 2025-09-18T21:41:02.263Z
Learnt from: oandreeva-nv
Repo: ai-dynamo/dynamo PR: 2989
File: lib/llm/src/block_manager/distributed/transfer.rs:59-60
Timestamp: 2025-09-18T21:41:02.263Z
Learning: ConnectorTransferBatcher in distributed/transfer.rs duplicates existing batching logic and uses try_join_all(), which spawns unlimited concurrent transfers and bypasses the existing concurrency control systems. The proper solution is to integrate distributed transfers with the existing TransferBatcher + LocalTransferManager pipeline rather than implementing separate batching logic.
Applied to files:
lib/llm/src/entrypoint/input/batch.rs
📚 Learning: 2025-08-18T20:51:51.324Z
Learnt from: PeaBrane
Repo: ai-dynamo/dynamo PR: 2465
File: lib/runtime/src/pipeline/network/egress/push_router.rs:0-0
Timestamp: 2025-08-18T20:51:51.324Z
Learning: The runtime crate cannot depend on the llm crate due to architectural dependency constraints, preventing imports from lib/llm into lib/runtime.
Applied to files:
lib/llm/src/entrypoint/input/batch.rslib/llm/src/entrypoint/input/http.rslib/bindings/python/rust/llm/entrypoint.rslib/llm/src/entrypoint/input/grpc.rslib/llm/src/entrypoint/input/common.rs
📚 Learning: 2025-07-14T21:25:56.930Z
Learnt from: ryanolson
Repo: ai-dynamo/dynamo PR: 1919
File: lib/runtime/src/engine.rs:168-168
Timestamp: 2025-07-14T21:25:56.930Z
Learning: The AsyncEngineContextProvider trait in lib/runtime/src/engine.rs was intentionally changed from `Send + Sync + Debug` to `Send + Debug` because the Sync bound was overly constraining. The trait should only require Send + Debug as designed.
Applied to files:
lib/llm/src/entrypoint/input/batch.rslib/llm/tests/audit_nats_integration.rslib/llm/src/audit/sink.rslib/llm/src/entrypoint/input/http.rslib/llm/src/entrypoint/input/grpc.rslib/llm/src/entrypoint/input/common.rs
📚 Learning: 2025-09-21T01:40:52.456Z
Learnt from: PeaBrane
Repo: ai-dynamo/dynamo PR: 3155
File: components/backends/vllm/src/dynamo/vllm/main.py:228-233
Timestamp: 2025-09-21T01:40:52.456Z
Learning: In the dynamo codebase, error handling for distributed runtime client initialization (like runtime.namespace().component().endpoint().client()) is handled at the Rust level in the distributed runtime bindings, so Python-level try/catch blocks are not needed and would be redundant.
Applied to files:
lib/llm/src/entrypoint/input/batch.rslib/llm/src/entrypoint/input/common.rslib/bindings/python/rust/lib.rs
📚 Learning: 2025-08-15T23:51:04.958Z
Learnt from: PeaBrane
Repo: ai-dynamo/dynamo PR: 2465
File: lib/runtime/src/utils/typed_prefix_watcher.rs:94-101
Timestamp: 2025-08-15T23:51:04.958Z
Learning: In the dynamo codebase's etcd client implementation, when using `kv_get_and_watch_prefix()` and `dissolve()`, the returned `events_rx` receiver maintains the etcd watch stream independently. The watcher handle can be safely dropped (using `_watcher`) without terminating the stream, as the receiver keeps the connection alive internally. This is a consistent pattern used throughout the codebase in multiple critical modules.
Applied to files:
lib/llm/src/discovery/watcher.rslib/runtime/src/component/client.rs
📚 Learning: 2025-08-15T23:51:04.958Z
Learnt from: PeaBrane
Repo: ai-dynamo/dynamo PR: 2465
File: lib/runtime/src/utils/typed_prefix_watcher.rs:94-101
Timestamp: 2025-08-15T23:51:04.958Z
Learning: In the dynamo codebase's etcd client implementation, when using `kv_get_and_watch_prefix()` and `dissolve()`, the returned `events_rx` receiver maintains the etcd watch stream independently. The watcher handle can be safely dropped (using `_watcher`) without terminating the stream, as the receiver keeps the connection alive internally.
Applied to files:
lib/llm/src/discovery/watcher.rslib/runtime/src/component/client.rs
📚 Learning: 2025-11-05T08:41:06.483Z
Learnt from: ryanolson
Repo: ai-dynamo/dynamo PR: 4070
File: lib/discovery/src/systems/etcd/peer.rs:151-188
Timestamp: 2025-11-05T08:41:06.483Z
Learning: In lib/discovery/src/systems/etcd/peer.rs, the register_instance method intentionally captures the lease_id before entering the OperationExecutor closure. If the lease is revoked or fails, the operation should hard-fail rather than retry with a new lease, because the system does not track which entries were registered under which lease. Retrying with a fresh lease would create inconsistent state.
Applied to files:
lib/runtime/src/component/endpoint.rslib/runtime/src/storage/key_value_store/etcd.rs
📚 Learning: 2025-07-01T13:55:03.940Z
Learnt from: nnshah1
Repo: ai-dynamo/dynamo PR: 1444
File: tests/fault_tolerance/utils/metrics.py:30-32
Timestamp: 2025-07-01T13:55:03.940Z
Learning: The `dynamo_worker()` decorator in the dynamo codebase returns a wrapper that automatically injects the `runtime` parameter before calling the wrapped function. This means callers only need to provide the non-runtime parameters, while the decorator handles injecting the runtime argument automatically. For example, a function with signature `async def get_metrics(runtime, log_dir)` decorated with `dynamo_worker()` can be called as `get_metrics(log_dir)` because the decorator wrapper injects the runtime parameter.
Applied to files:
components/src/dynamo/vllm/main.pycomponents/src/dynamo/trtllm/main.pycomponents/src/dynamo/sglang/main.py
📚 Learning: 2025-07-16T12:41:12.543Z
Learnt from: grahamking
Repo: ai-dynamo/dynamo PR: 1962
File: lib/runtime/src/component/client.rs:270-273
Timestamp: 2025-07-16T12:41:12.543Z
Learning: In lib/runtime/src/component/client.rs, the current mutex usage in get_or_create_dynamic_instance_source is temporary while evaluating whether the mutex can be dropped entirely. The code currently has a race condition between try_lock and lock().await, but this is acknowledged as an interim state during the performance optimization process.
Applied to files:
lib/runtime/src/component/client.rs
📚 Learning: 2025-09-17T01:00:50.937Z
Learnt from: PeaBrane
Repo: ai-dynamo/dynamo PR: 3077
File: lib/llm/src/kv_router/subscriber.rs:334-336
Timestamp: 2025-09-17T01:00:50.937Z
Learning: PeaBrane identified that reordering tokio::select! arms in the indexer (moving dump_rx.recv() to be after event_rx.recv()) creates a natural barrier that ensures RouterEvents are always processed before dump requests, solving the ack-before-commit race condition. This leverages the existing biased directive and requires minimal code changes, aligning with their preference for contained solutions.
Applied to files:
lib/runtime/src/component/client.rs
📚 Learning: 2025-09-17T01:00:50.937Z
Learnt from: PeaBrane
Repo: ai-dynamo/dynamo PR: 3077
File: lib/llm/src/kv_router/subscriber.rs:334-336
Timestamp: 2025-09-17T01:00:50.937Z
Learning: PeaBrane suggested using tokio::select! arm ordering with the existing biased directive in the indexer to create a natural barrier for dump requests, ensuring KV events are drained before snapshotting. This approach leverages existing architecture (biased select) to solve race conditions with minimal code changes, which aligns with their preference for contained solutions.
Applied to files:
lib/runtime/src/component/client.rs
📚 Learning: 2025-09-11T03:24:47.820Z
Learnt from: kthui
Repo: ai-dynamo/dynamo PR: 3004
File: lib/runtime/src/pipeline/network/ingress/push_handler.rs:271-277
Timestamp: 2025-09-11T03:24:47.820Z
Learning: In lib/runtime/src/pipeline/network/ingress/push_handler.rs, the maintainer prefers to keep the existing error comparison logic using format!("{:?}", err) == STREAM_ERR_MSG unchanged until proper error types are implemented, even though it has technical debt. Avoid suggesting changes to working legacy code that will be refactored later.
Applied to files:
lib/runtime/src/component/client.rs
📚 Learning: 2025-06-05T01:02:15.318Z
Learnt from: PeaBrane
Repo: ai-dynamo/dynamo PR: 1392
File: lib/llm/src/kv_router/scoring.rs:35-46
Timestamp: 2025-06-05T01:02:15.318Z
Learning: In lib/llm/src/kv_router/scoring.rs, PeaBrane prefers panic-based early failure over Result-based error handling for the worker_id() method to catch invalid data early during development.
Applied to files:
lib/runtime/src/component/client.rs
📚 Learning: 2025-08-22T19:55:41.608Z
Learnt from: nachiketb-nvidia
Repo: ai-dynamo/dynamo PR: 2656
File: lib/llm/src/protocols/openai/chat_completions/delta.rs:320-327
Timestamp: 2025-08-22T19:55:41.608Z
Learning: There are two separate DeltaGenerator classes in the codebase: one for chat completions (lib/llm/src/protocols/openai/chat_completions/delta.rs with object "chat.completion.chunk") and one for text completions (lib/llm/src/protocols/openai/completions/delta.rs with object "text_completion"). They have different create_choice method signatures and serve different OpenAI API endpoints. The reasoning parsing functionality is only relevant to the chat completions DeltaGenerator.
Applied to files:
lib/llm/src/entrypoint/input/text.rs
📚 Learning: 2025-05-29T06:20:12.901Z
Learnt from: ryanolson
Repo: ai-dynamo/dynamo PR: 1093
File: lib/llm/src/block_manager/block/registry.rs:98-122
Timestamp: 2025-05-29T06:20:12.901Z
Learning: In lib/llm/src/block_manager/block/registry.rs, the background task spawned for handling unregister notifications uses detached concurrency by design. The JoinHandle is intentionally not stored as this represents a reasonable architectural tradeoff for a long-running cleanup task.
Applied to files:
lib/llm/src/entrypoint/input/http.rs
📚 Learning: 2025-09-02T16:46:54.015Z
Learnt from: GuanLuo
Repo: ai-dynamo/dynamo PR: 2714
File: lib/llm/src/discovery/model_entry.rs:38-42
Timestamp: 2025-09-02T16:46:54.015Z
Learning: In lib/llm/src/discovery/model_entry.rs, GuanLuo prefers not to add serde defaults for model_type and model_input fields to keep the specification explicit and avoid user errors, relying on atomic deployment strategy to avoid backward compatibility issues.
Applied to files:
lib/bindings/python/rust/llm/entrypoint.rslaunch/dynamo-run/src/lib.rs
🧬 Code graph analysis (27)
lib/runtime/src/storage/key_value_store/mem.rs (4)
lib/bindings/python/rust/lib.rs (2)
new(431-483)shutdown(639-641)lib/runtime/src/distributed.rs (2)
new(51-212)shutdown(237-240)lib/runtime/src/storage/key_value_store.rs (9)
new(40-42)new(81-83)new(269-271)new(492-505)shutdown(123-123)shutdown(235-243)shutdown(387-389)from_raw(45-47)key(85-87)lib/bindings/python/rust/llm/kv.rs (2)
shutdown(156-158)shutdown(891-894)
lib/runtime/src/storage/key_value_store/nats.rs (4)
lib/bindings/python/rust/lib.rs (1)
shutdown(639-641)lib/runtime/src/distributed.rs (2)
shutdown(237-240)new(51-212)lib/runtime/src/storage/key_value_store.rs (8)
shutdown(123-123)shutdown(235-243)shutdown(387-389)new(40-42)new(81-83)new(269-271)new(492-505)from_raw(45-47)lib/runtime/src/storage/key_value_store/file.rs (1)
shutdown(101-107)
lib/llm/src/entrypoint/input/batch.rs (2)
lib/bindings/python/src/dynamo/_core.pyi (2)
DistributedRuntime(35-65)CancellationToken(67-78)lib/llm/src/entrypoint/input/common.rs (1)
prepare_engine(56-183)
lib/llm/src/discovery/watcher.rs (1)
lib/runtime/src/storage/key_value_store.rs (1)
key(85-87)
components/src/dynamo/vllm/args.py (4)
lib/runtime/src/storage/key_value_store.rs (2)
default(250-252)get(405-405)lib/runtime/src/storage/key_value_store/mem.rs (2)
default(31-33)get(149-155)lib/runtime/src/storage/key_value_store/etcd.rs (1)
get(85-99)lib/runtime/src/storage/key_value_store/file.rs (1)
get(167-179)
lib/runtime/src/component/endpoint.rs (2)
lib/runtime/src/distributed.rs (1)
store(289-291)lib/runtime/src/storage/key_value_store.rs (2)
key(85-87)from_raw(45-47)
components/src/dynamo/vllm/main.py (4)
components/src/dynamo/mocker/main.py (1)
worker(25-55)components/src/dynamo/sglang/main.py (1)
worker(36-65)components/src/dynamo/trtllm/main.py (1)
worker(105-121)examples/multimodal/components/worker.py (2)
worker(404-423)parse_args(45-98)
lib/llm/tests/audit_nats_integration.rs (1)
lib/llm/src/audit/sink.rs (1)
spawn_workers_from_env(92-113)
lib/llm/src/audit/sink.rs (1)
lib/runtime/src/distributed.rs (1)
nats_client(268-270)
lib/runtime/src/component/client.rs (3)
lib/runtime/src/storage/key_value_store.rs (2)
etcd(261-263)key(85-87)lib/runtime/src/distributed.rs (2)
store(289-291)new(51-212)lib/runtime/src/transports/etcd.rs (2)
new(63-106)new(489-532)
components/src/dynamo/trtllm/main.py (4)
components/src/dynamo/sglang/main.py (1)
worker(36-65)components/src/dynamo/vllm/main.py (1)
worker(74-119)examples/multimodal/components/worker.py (1)
worker(404-423)components/src/dynamo/trtllm/utils/trtllm_utils.py (1)
cmd_line_args(116-364)
lib/llm/src/entrypoint/input.rs (7)
lib/llm/src/audit/sink.rs (2)
spawn_workers_from_env(92-113)new(39-46)launch/dynamo-run/src/lib.rs (1)
run(19-116)lib/llm/src/entrypoint/input/batch.rs (1)
run(53-194)lib/llm/src/entrypoint/input/grpc.rs (1)
run(23-165)lib/llm/src/entrypoint/input/http.rs (1)
run(25-266)lib/llm/src/entrypoint/input/text.rs (1)
run(20-37)lib/llm/src/entrypoint/input/endpoint.rs (1)
run(27-132)
components/src/dynamo/trtllm/utils/trtllm_utils.py (5)
lib/runtime/src/storage/key_value_store.rs (2)
default(250-252)get(405-405)lib/runtime/src/storage/key_value_store/mem.rs (2)
default(31-33)get(149-155)lib/runtime/src/storage/key_value_store/etcd.rs (1)
get(85-99)lib/runtime/src/storage/key_value_store/file.rs (1)
get(167-179)lib/runtime/src/storage/key_value_store/nats.rs (1)
get(137-142)
components/src/dynamo/mocker/args.py (4)
lib/runtime/src/storage/key_value_store.rs (2)
default(250-252)get(405-405)lib/runtime/src/storage/key_value_store/mem.rs (2)
default(31-33)get(149-155)lib/runtime/src/storage/key_value_store/etcd.rs (1)
get(85-99)lib/runtime/src/storage/key_value_store/file.rs (1)
get(167-179)
lib/llm/src/entrypoint/input/text.rs (2)
lib/bindings/python/src/dynamo/_core.pyi (1)
DistributedRuntime(35-65)lib/llm/src/entrypoint/input/common.rs (1)
prepare_engine(56-183)
lib/llm/src/entrypoint/input/http.rs (7)
lib/bindings/python/src/dynamo/_core.pyi (4)
DistributedRuntime(35-65)RouterMode(984-986)run(962-969)EngineConfig(1020-1022)launch/dynamo-run/src/lib.rs (1)
run(19-116)lib/llm/src/entrypoint/input/batch.rs (1)
run(53-194)lib/llm/src/entrypoint/input/grpc.rs (1)
run(23-165)lib/llm/src/entrypoint/input/text.rs (1)
run(20-37)lib/llm/src/http/service/service_v2.rs (1)
run(199-276)lib/llm/src/entrypoint/input/endpoint.rs (1)
run(27-132)
components/src/dynamo/sglang/main.py (5)
lib/bindings/python/src/dynamo/_core.pyi (1)
DistributedRuntime(35-65)components/src/dynamo/trtllm/main.py (1)
worker(105-121)components/src/dynamo/vllm/main.py (1)
worker(74-119)components/src/dynamo/sglang/args.py (1)
parse_args(214-373)components/src/dynamo/common/config_dump/config_dumper.py (1)
dump_config(88-121)
lib/bindings/python/rust/llm/entrypoint.rs (1)
lib/bindings/python/tests/test_kv_bindings.py (1)
distributed_runtime(31-39)
components/src/dynamo/mocker/main.py (1)
lib/bindings/python/src/dynamo/_core.pyi (1)
DistributedRuntime(35-65)
lib/runtime/src/storage/key_value_store/file.rs (1)
lib/runtime/src/storage/key_value_store.rs (19)
new(40-42)new(81-83)new(269-271)new(492-505)get_or_create_bucket(112-117)get_or_create_bucket(184-197)get_or_create_bucket(273-280)get_bucket(119-119)get_bucket(199-223)get_bucket(282-287)insert(397-402)key(85-87)value(93-95)get(405-405)delete(408-408)watch(314-363)watch(413-415)from_raw(45-47)entries(417-417)
lib/runtime/src/storage/key_value_store/etcd.rs (6)
lib/bindings/python/rust/lib.rs (1)
shutdown(639-641)lib/runtime/src/distributed.rs (2)
shutdown(237-240)new(51-212)lib/runtime/src/storage/key_value_store.rs (9)
shutdown(123-123)shutdown(235-243)shutdown(387-389)new(40-42)new(81-83)new(269-271)new(492-505)key(85-87)from_raw(45-47)lib/runtime/src/storage/key_value_store/file.rs (1)
shutdown(101-107)lib/runtime/src/storage/key_value_store/mem.rs (3)
shutdown(111-111)new(52-56)new(60-70)lib/runtime/src/storage/key_value_store/nats.rs (2)
shutdown(56-59)new(63-65)
lib/runtime/src/storage/key_value_store.rs (5)
lib/runtime/src/distributed.rs (3)
fmt(45-47)shutdown(237-240)new(51-212)lib/runtime/src/storage/key_value_store/file.rs (2)
fmt(142-144)shutdown(101-107)lib/runtime/src/transports/etcd.rs (4)
fmt(45-47)default(437-466)new(63-106)new(489-532)lib/runtime/src/storage/key_value_store/etcd.rs (2)
shutdown(58-60)new(24-26)lib/runtime/src/storage/key_value_store/mem.rs (4)
shutdown(111-111)default(31-33)new(52-56)new(60-70)
lib/llm/src/entrypoint/input/grpc.rs (4)
lib/bindings/python/src/dynamo/_core.pyi (1)
DistributedRuntime(35-65)lib/llm/src/entrypoint/input/batch.rs (1)
run(53-194)lib/llm/src/entrypoint/input/http.rs (1)
run(25-266)lib/llm/src/entrypoint/input/text.rs (1)
run(20-37)
lib/llm/src/entrypoint/input/common.rs (3)
lib/runtime/src/distributed.rs (1)
store(289-291)lib/bindings/python/rust/lib.rs (2)
new(431-483)new(1121-1125)lib/llm/src/discovery/watcher.rs (1)
new(68-84)
lib/bindings/python/rust/lib.rs (3)
lib/bindings/python/rust/llm/entrypoint.rs (1)
to_pyerr(312-317)lib/runtime/src/storage/key_value_store.rs (1)
default(250-252)lib/bindings/python/tests/conftest.py (1)
runtime(401-435)
launch/dynamo-run/src/lib.rs (2)
lib/llm/src/entrypoint/input.rs (2)
default(95-101)run_input(108-147)lib/runtime/src/transports/nats.rs (2)
default(348-353)default(378-399)
lib/runtime/src/distributed.rs (2)
lib/runtime/src/transports/etcd.rs (4)
etcd_client(110-112)new(63-106)new(489-532)default(437-466)lib/runtime/src/storage/key_value_store.rs (4)
memory(257-259)etcd(261-263)file(265-267)default(250-252)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (16)
- GitHub Check: vllm (amd64)
- GitHub Check: trtllm (arm64)
- GitHub Check: sglang (amd64)
- GitHub Check: sglang (arm64)
- GitHub Check: vllm (arm64)
- GitHub Check: trtllm (amd64)
- GitHub Check: operator (amd64)
- GitHub Check: operator (arm64)
- GitHub Check: Build and Test - dynamo
- GitHub Check: tests (launch/dynamo-run)
- GitHub Check: tests (.)
- GitHub Check: tests (lib/bindings/python)
- GitHub Check: clippy (.)
- GitHub Check: tests (lib/runtime/examples)
- GitHub Check: clippy (launch/dynamo-run)
- GitHub Check: clippy (lib/bindings/python)
🔇 Additional comments (30)
lib/runtime/Cargo.toml (1)
66-66: Dependency verification complete—no issues found.The inotify crate v0.11.0 is the latest published version with 65 million all-time downloads and was updated about 1 year ago. No security advisories exist for this crate in the RustSec database. The dependency is appropriately placed and specified.
components/src/dynamo/mocker/args.py (1)
207-212: LGTM!The CLI option is well-structured with appropriate defaults and comprehensive help text describing the supported backends and their configuration.
components/src/dynamo/vllm/args.py (2)
53-53: LGTM!The
store_kvfield is correctly added to the Config class and will be properly propagated through the runtime initialization.
192-197: LGTM!The CLI option follows the same pattern as other components with consistent defaults and help text.
components/src/dynamo/trtllm/utils/trtllm_utils.py (2)
61-61: LGTM!The
store_kvfield is properly initialized and will be populated from CLI arguments.
283-288: LGTM!The CLI option is consistent with other components and provides clear documentation.
launch/dynamo-run/src/lib.rs (4)
9-10: LGTM!The imports support the new KV store selection mechanism and NATS configuration.
80-86: LGTM!The distributed runtime configuration properly integrates the KV store selection from flags, with appropriate defaults for NATS config and static worker mode.
104-113: LGTM!The refactoring to always use
DistributedRuntimesimplifies the code by removing the previousEither<Runtime, DistributedRuntime>pattern.
120-124: LGTM!The updated signature makes the distributed runtime requirement explicit and simplifies the function interface.
components/src/dynamo/sglang/args.py (3)
96-101: LGTM!The dictionary-based argument definition follows SGLang's existing pattern and provides consistent configuration with other components.
111-111: LGTM!The
store_kvfield is properly added to the dataclass and will be correctly propagated.
339-339: LGTM!The field is correctly populated from the parsed arguments.
lib/llm/src/discovery/watcher.rs (1)
186-191: LGTM!The update to use
Keydirectly in theDeleteevent aligns with the KV-store-agnostic refactoring and simplifies the interface by removing the need to carry fullKeyValuepayloads.lib/bindings/python/rust/llm/entrypoint.rs (1)
301-308: LGTM!The change to pass
distributed_runtime.inner.clone()directly is consistent with the refactoring that eliminates theEither<Runtime, DistributedRuntime>pattern in favor of always usingDistributedRuntime.launch/dynamo-run/src/flags.rs (1)
130-131: LGTM!The field is properly added with an appropriate default value that matches the convention used across other components.
lib/llm/tests/audit_nats_integration.rs (2)
170-170: LGTM!The change correctly aligns with the updated
spawn_workers_from_envsignature that now requires a direct reference instead of an optional reference.
227-227: LGTM!Consistent with the signature change at line 170.
lib/runtime/src/storage/key_value_store/etcd.rs (3)
58-60: LGTM - Appropriate no-op shutdown implementation.The comment correctly indicates that etcd automatically revokes the lease on disconnect, making explicit cleanup unnecessary. This aligns with etcd's lease-based lifecycle management.
140-142: LGTM - Optimization to construct KeyValue only when needed.Moving the construction inside the Put branch avoids unnecessary allocation for Delete events.
145-145: LGTM - Standardized Delete event emission.Using
Key::from_raw(key)directly is cleaner and consistent with the updated semantics across all KeyValueStore backends.lib/runtime/src/storage/key_value_store/nats.rs (2)
56-59: Incomplete shutdown implementation tracked as follow-up work.The TODO indicates that owned keys should be deleted on shutdown. The PR description mentions "cleanup of old keys analogous to etcd lease revocation" as planned follow-up work, so this is a known limitation.
Note: Based on PR objectives, this will be addressed in a future PR.
169-175: LGTM - Standardized watch event emission.The changes align with other backends:
- KeyValue constructed inline for Put operations
- Delete and Purge events emit
Key::from_raw(entry.key)directlylib/runtime/src/storage/key_value_store/mem.rs (3)
60-60: LGTM - Appropriate visibility restriction for encapsulation.Changing from
pubtopub(super)ensures thatMemoryStoreinstances are created through the parent module'sKeyValueStoreManagerrather than being directly instantiated, which provides better control over store lifecycle management.
110-111: LGTM - No-op shutdown for in-memory store.An empty shutdown implementation is appropriate since there are no external resources to clean up for an in-memory store.
210-210: LGTM - Consistent Delete event emission.Aligns with the standardized approach across all KeyValueStore backends.
components/src/dynamo/mocker/main.py (1)
75-75: Original concern is resolved—args.store_kvis properly defined.The verification confirms that
--store-kvis correctly defined incomponents/src/dynamo/mocker/args.pywith appropriate defaults (falling back to environment variableDYN_STORE_KVor defaulting to"etcd") and comprehensive help text describing the supported backends (etcd, mem, file). The reference toargs.store_kvin main.py line 75 uses the correct naming convention (argparse converts the hyphenated argument to underscore format). No issues require attention.lib/llm/src/audit/sink.rs (1)
92-94: All call sites correctly updated to pass direct references.Verification confirms the signature simplification from
Option<&DistributedRuntime>to&DistributedRuntimeis properly applied across all three call sites:
lib/llm/tests/audit_nats_integration.rs:170, 227: Pass&drtfrom owned test valueslib/llm/src/entrypoint/input.rs:120: Passes&drtfrom owned function parameterThe refactoring is complete and consistent throughout the codebase.
lib/runtime/src/component/endpoint.rs (1)
206-222: Code change is verified as correct.The
unique_path()function returns a properly formatted string"{ns}/{cp}/{ep}/{lease_id:x}"composed of validated identifiers (namespace name, component name, endpoint name) and a hex-formatted lease ID. The use ofKey::from_raw()is appropriate here because the string is constructed from legitimate identity sources and is already safe for the KV store abstraction. This usage pattern is consistent throughout the codebase (nats, etcd, file, mem implementations all usefrom_raw()similarly for pre-validated strings).lib/llm/src/entrypoint/input.rs (1)
124-144: Fix move semantics in run_input dispatchMoving
drtandengine_configinto every match arm triggersE0382(“use of moved value”) because the first arm consumes the values, so later arms can no longer take ownership. Wrapengine_configin anOptionand clonedrtper branch so ownership is consumed at most once. Suggested rewrite:- match in_opt { - Input::Http => { - http::run(drt, engine_config).await?; - } + let mut engine_config = Some(engine_config); + match in_opt { + Input::Http => { + let engine_config = engine_config.take().expect("engine config"); + http::run(drt.clone(), engine_config).await?; + } Input::Grpc => { - grpc::run(drt, engine_config).await?; + let engine_config = engine_config.take().expect("engine config"); + grpc::run(drt.clone(), engine_config).await?; } Input::Text => { - text::run(drt, None, engine_config).await?; + let engine_config = engine_config.take().expect("engine config"); + text::run(drt.clone(), None, engine_config).await?; } Input::Stdin => { let mut prompt = String::new(); std::io::stdin().read_to_string(&mut prompt).unwrap(); - text::run(drt, Some(prompt), engine_config).await?; + let engine_config = engine_config.take().expect("engine config"); + text::run(drt.clone(), Some(prompt), engine_config).await?; } Input::Batch(path) => { - batch::run(drt, path, engine_config).await?; + let engine_config = engine_config.take().expect("engine config"); + batch::run(drt.clone(), path, engine_config).await?; } Input::Endpoint(path) => { - endpoint::run(drt, path, engine_config).await?; + let engine_config = engine_config.take().expect("engine config"); + endpoint::run(drt, path, engine_config).await?; } }This compiles cleanly and preserves the single-use ownership semantics required here.
⛔ Skipped due to learnings
Learnt from: ryanolson Repo: ai-dynamo/dynamo PR: 1919 File: lib/runtime/src/engine.rs:168-168 Timestamp: 2025-07-14T21:25:56.930Z Learning: The AsyncEngineContextProvider trait in lib/runtime/src/engine.rs was intentionally changed from `Send + Sync + Debug` to `Send + Debug` because the Sync bound was overly constraining. The trait should only require Send + Debug as designed.
4a27f39 to
dcac2d9
Compare
biswapanda
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
overall lgtm.
I've left few comments/notes
5898671 to
bb36be1
Compare
917fe09 to
1a70309
Compare
Running any of the components with `--store-kv file` will use the file system instead of etcd for discovery. If there is a shared folder, or for the single-node case, this allows running frontend + backend without etcd.
Signed-off-by: Graham King <grahamk@nvidia.com>
Signed-off-by: Graham King <grahamk@nvidia.com>
This test has a race condition, re-run it. Signed-off-by: Graham King <grahamk@nvidia.com>
Allows me to land this so others can rebase, then I can fix in next PR. Signed-off-by: Graham King <grahamk@nvidia.com>
Improve log output, fix static mode Signed-off-by: Graham King <grahamk@nvidia.com>
Signed-off-by: Graham King <grahamk@nvidia.com>
d6a2324 to
2bb7578
Compare
Running any of the components with
--store-kv filewill use the file system instead of etcd for discovery. If there is a shared folder, or for the single-node case, this allows running frontend + backend without etcd.The default folder is
/tmp/dynamo_store_kv, configurable viaDYN_FILE_KVenv var.This PR touched a lot of things in relatively light ways, including some nice simplifications where we can now always take a
DistributedRuntimeinstead of that or aRuntime.The main file is
lib/runtime/src/storage/key_value_store/file.rs.Follow up PRs (this is big enough) will add:
is_staticstuff in favor of this file based KeyValueStore.Example
Worker:
and / or
Frontend:
Look in
$TMPDIR/dynamo_store_kv(typically/tmp) for the live keys.Summary by CodeRabbit
New Features
--store-kvCLI option to select key-value store backend (etcd, memory, or file).Chores