From e2edace6038b58d65011a6fbb46c8955a1490655 Mon Sep 17 00:00:00 2001
From: Biswa Panda
Date: Wed, 20 Aug 2025 09:52:13 -0700
Subject: [PATCH 1/3] --wip--

---
 .../me-dynamo-integration.md                  | 129 ++++++++++++++++++
 1 file changed, 129 insertions(+)
 create mode 100644 000M-model-express-dynamo-integration/me-dynamo-integration.md

diff --git a/000M-model-express-dynamo-integration/me-dynamo-integration.md b/000M-model-express-dynamo-integration/me-dynamo-integration.md
new file mode 100644
index 00000000..88f28aaf
--- /dev/null
+++ b/000M-model-express-dynamo-integration/me-dynamo-integration.md
@@ -0,0 +1,129 @@
+# Model Express ~ Dynamo Integration Proposal
+**Status**: Draft
+**Authors**: Biswa Panda
+**Category**: Architecture
+**Sponsor**: Neelay, Ganesh
+**Required Reviewers**: Nick, Ganesh, Kavin, Alec, Graham
+**Review Date**: 2025-08-20
+
+# Summary
+# Motivation
+## Goals
+### Non Goals
+## Requirements
+
+### REQ 1: Model Express can be deployed as a per namespace service as part of Dynamo Operator
+### REQ 2:
+
+# Proposal
+
+## Operator Configuration:
+The following Model Express configuration will be added to the Dynamo Operator configuration.
+This will enable Model Express to be deployed as a standalone service as part of the Dynamo Operator.
+It will also automatically inject an environment variable (`MODEL_EXPRESS_URL`, or a better name) into each component.
+
+```yaml
+# top level attribute in DynamoOperator CRD
+spec:
+  services:
+    componentType: ModelExpress
+    dynamoNamespace: / "global"
+    ModelExpress:
+      pvc:
+        name: model-express-pvc
+        size: 10Gi
+        storageClass: standard
+        create: true
+
+```
+
+## Model Express sharing between Dynamo namespaces
+
+Model Express can be deployed as a shared service across all Dynamo namespaces by specifying it in a top-level attribute of the DynamoGraphDeployment CRD.
+
+```yaml
+# top level attribute in DynamoOperator CRD
+spec:
+  services:
+    componentType: ModelExpress
+    dynamoNamespace: "global" # shared service across all dynamo namespaces
+    ModelExpress:
+      pvc:
+        name: model-express-pvc
+        size: 10Gi
+        storageClass: standard
+        create: true
+```
+
+For multi-tenant deployments in a single Kubernetes namespace, Model Express can be deployed on a per-`DYN_NAMESPACE` basis.
+
+```yaml
+# top level attribute in DynamoOperator CRD
+spec:
+  services:
+    componentType: ModelExpress
+    dynamoNamespace: "MY_DYN_NAMESPACE" # per namespace model express service
+    ModelExpress1:
+      pvc:
+        name: model-express-pvc
+        size: 10Gi
+        storageClass: standard
+        create: true
+```
+
+## Alternate Solutions
+
+### top level attribute in DynamoOperator CRD
+Shared Model Express service across all Dynamo namespaces:
+```yaml
+# top level attribute in DynamoOperator CRD
+ModelExpress:
+  dynamoNamespace: / "global"
+  pvc:
+    name: model-express-pvc
+    size: 10Gi
+    storageClass: standard
+    create: true
+  image: ...
+```
+
+### Create a separate Helm chart for Model Express and compose with Helmfile
+
+
+## Model Express Integration with Dynamo components:
+
+When Model Express is enabled, the environment variable `MODEL_EXPRESS_URL` will be injected into each component.
+
+This also acts as a feature flag to enable or disable Model Express for Dynamo components.
+
+A common model initialization flow can be used across backend components (to facilitate sharing of Model Express between components). This flow will use the Model Express library to fetch models from Model Express.
+
+- The `MODEL_EXPRESS_PATH` environment variable will be set if we are using a common PVC path for Model Express in earlier phases.
+
+### Model Express disabled:
+When `MODEL_EXPRESS_URL` is not set, Model Express will be disabled for the component.
+1. The model initialization flow will skip fetching the model from Model Express.
+2. Backend components will use the Hugging Face Hub libraries directly. (This is the current behavior.)
+
+### Model Express enabled:
+When `MODEL_EXPRESS_URL` is set, Model Express will be enabled for the component.
+
+1. The model initialization flow will use the Model Express library to fetch the model from Model Express.
+2. The backend-specific initialization flow [for example: vllm]() will start after the model is fetched from Model Express.
+
+Backend UX and command line will remain unchanged in the deployment spec. Internally, we'll use the Model Express library to fetch the model by changing the command-line arguments for the backend.
+For example:
+
+Deployment spec:
+```
+python3 -m dynamo.vllm --model Qwen/Qwen3-0.6B
+```
+
+Internally, we'll change the vLLM args to use the Model Express library to fetch the model.
+
+1. Phase 1: using a shared PVC:
+The `MODEL_EXPRESS_URL` env variable will be set to the Model Express service URL.
+The `MODEL_EXPRESS_PATH` env variable will be set to the PVC path.
+
+a. Call the Model Express library to fetch the model to `MODEL_EXPRESS_PATH`.
+b. Set the appropriate HF env vars: [HF_HUB_OFFLINE](https://huggingface.co/docs/huggingface_hub/en/package_reference/environment_variables#hfhuboffline) and [HF_HOME], mapped to `MODEL_EXPRESS_PATH`.
\ No newline at end of file

From 2d738d8c45ba5cff559ea9d6e2a43ee59cabc32e Mon Sep 17 00:00:00 2001
From: Biswa Panda
Date: Wed, 20 Aug 2025 09:59:20 -0700
Subject: [PATCH 2/3] --wip--

---
 .../me-dynamo-integration.md                  | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/000M-model-express-dynamo-integration/me-dynamo-integration.md b/000M-model-express-dynamo-integration/me-dynamo-integration.md
index 88f28aaf..b58020ac 100644
--- a/000M-model-express-dynamo-integration/me-dynamo-integration.md
+++ b/000M-model-express-dynamo-integration/me-dynamo-integration.md
@@ -12,8 +12,8 @@
 ### Non Goals
 ## Requirements
 
-### REQ 1: Model Express can be deployed as a per namespace service as part of Dynamo Operator
-### REQ 2:
+### REQ 1: Model Express can be deployed as a first-class citizen of Dynamo Operator
+### REQ 2: Each component can be configured to use the Model Express service as a shared service or a per-namespace service.
 
 # Proposal
 

From 11022dbafc9ea0cf80b446f7008f19b53c837828 Mon Sep 17 00:00:00 2001
From: Biswa Panda
Date: Wed, 20 Aug 2025 10:00:22 -0700
Subject: [PATCH 3/3] --wip--

---
 .../me-dynamo-integration.md                  | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/000M-model-express-dynamo-integration/me-dynamo-integration.md b/000M-model-express-dynamo-integration/me-dynamo-integration.md
index b58020ac..b641eb84 100644
--- a/000M-model-express-dynamo-integration/me-dynamo-integration.md
+++ b/000M-model-express-dynamo-integration/me-dynamo-integration.md
@@ -1,9 +1,15 @@
 # Model Express ~ Dynamo Integration Proposal
+
 **Status**: Draft
+
 **Authors**: Biswa Panda
+
 **Category**: Architecture
+
 **Sponsor**: Neelay, Ganesh
+
 **Required Reviewers**: Nick, Ganesh, Kavin, Alec, Graham
+
 **Review Date**: 2025-08-20
 
 # Summary
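
Review note on PATCH 1/3: the enabled/disabled behaviour keyed on `MODEL_EXPRESS_URL` could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper name `prepare_model_env` and the `fetch(url, model, dest)` callback are hypothetical, since the proposal does not define the Model Express library API. Only the environment variable names (`MODEL_EXPRESS_URL`, `MODEL_EXPRESS_PATH`, `HF_HOME`, `HF_HUB_OFFLINE`) come from the proposal text.

```python
from typing import Callable, Dict, Mapping

# Hypothetical stand-in for the Model Express client call:
# fetch(service_url, model_name, destination_dir) -> None
FetchFn = Callable[[str, str, str], None]


def prepare_model_env(model: str, env: Mapping[str, str], fetch: FetchFn) -> Dict[str, str]:
    """Return the HF environment overrides to apply before launching a backend.

    An empty dict means Model Express is disabled and the backend should keep
    using the Hugging Face Hub libraries directly (the current behavior).
    """
    url = env.get("MODEL_EXPRESS_URL")
    if not url:
        # Feature flag off: skip the Model Express fetch entirely.
        return {}

    path = env.get("MODEL_EXPRESS_PATH")
    if not path:
        # Phase 1 assumes a shared PVC path is provided alongside the URL.
        raise ValueError("MODEL_EXPRESS_PATH must be set when MODEL_EXPRESS_URL is set")

    # Step a: fetch the model into the shared PVC path.
    fetch(url, model, path)

    # Step b: point the HF libraries at the PVC and prevent hub downloads.
    return {"HF_HOME": path, "HF_HUB_OFFLINE": "1"}
```

A launcher could merge the returned overrides into the process environment before starting `python3 -m dynamo.vllm --model Qwen/Qwen3-0.6B`, which would leave the deployment spec itself unchanged.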