Commit a38319e

build: OPS-810: add dynamo frontend image w/EPP support (#4150)
Signed-off-by: Tushar Sharma <tusharma@nvidia.com>
1 parent f50c386 commit a38319e

2 files changed, +124 -0 lines changed

container/Dockerfile.frontend

Lines changed: 73 additions & 0 deletions
@@ -0,0 +1,73 @@
+# syntax=docker/dockerfile:1.10.0
+# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+
+ARG DYNAMO_BASE_IMAGE="dynamo:latest-none"
+ARG EPP_IMAGE="us-central1-docker.pkg.dev/k8s-staging-images/gateway-api-inference-extension/epp:v0.5.1-dirty"
+ARG PYTHON_VERSION=3.12
+
+FROM ${DYNAMO_BASE_IMAGE} AS dynamo_base
+
+FROM ${EPP_IMAGE} AS epp
+
+FROM nvcr.io/nvidia/base/ubuntu:noble-20250619 AS frontend
+
+ARG PYTHON_VERSION
+RUN apt-get update -y \
+    && apt-get install -y --no-install-recommends \
+        # required for EPP
+        ca-certificates \
+        libstdc++6 \
+    && apt-get clean \
+    && rm -rf /var/lib/apt/lists/*
+
+
+# Create dynamo user with group 0 for OpenShift compatibility
+RUN userdel -r ubuntu > /dev/null 2>&1 || true \
+    && useradd -m -s /bin/bash -g 0 dynamo \
+    && [ `id -u dynamo` -eq 1000 ] \
+    && mkdir -p /home/dynamo/.cache /opt/dynamo /workspace \
+    && chown -R dynamo: /opt/dynamo /home/dynamo/.cache /workspace \
+    && chmod -R g+w /opt/dynamo /home/dynamo/.cache /workspace
+
+# Set HOME so ModelExpress can find the cache directory
+ENV HOME=/home/dynamo
+
+# Switch to dynamo user
+USER dynamo
+ENV DYNAMO_HOME=/opt/dynamo
+
+WORKDIR /
+COPY --chown=dynamo: --from=epp /epp /epp
+
+COPY --chown=dynamo: container/launch_message.txt /opt/dynamo/.launch_screen
+# Copy tests, benchmarks, deploy and components with correct ownership
+COPY --chown=dynamo: tests /workspace/tests
+COPY --chown=dynamo: examples /workspace/examples
+COPY --chown=dynamo: benchmarks /workspace/benchmarks
+COPY --chown=dynamo: deploy /workspace/deploy
+COPY --chown=dynamo: components/ /workspace/components/
+COPY --chown=dynamo: recipes/ /workspace/recipes/
+# Copy attribution files with correct ownership
+COPY --chown=dynamo: ATTRIBUTION* LICENSE /workspace/
+
+ENV VIRTUAL_ENV=/opt/dynamo/venv
+ENV PATH="/opt/dynamo/venv/bin:$PATH"
+# Copy virtual environment directly from dynamo_base (dev image)
+# This includes all installed packages: dynamo, nixl, requirements.txt, requirements.test.txt
+# Copy uv to system /bin
+COPY --from=dynamo_base /bin/uv /bin/uvx /bin/
+RUN uv python install $PYTHON_VERSION
+COPY --chown=dynamo: --from=dynamo_base /opt/dynamo/venv/ /opt/dynamo/venv/
+
+# Setup environment for all users
+USER root
+RUN chmod 755 /opt/dynamo/.launch_screen && \
+    echo 'source /opt/dynamo/venv/bin/activate' >> /etc/bash.bashrc && \
+    echo 'cat /opt/dynamo/.launch_screen' >> /etc/bash.bashrc
+
+USER dynamo
+
+ENTRYPOINT ["/epp"]
+CMD ["/bin/bash"]
+
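
Note on the last two instructions of `container/Dockerfile.frontend`: when both `ENTRYPOINT` and `CMD` use the exec (JSON array) form, Docker treats `CMD` as the default argument list for `ENTRYPOINT`, and any arguments given at run time replace `CMD`. The sketch below is only a small model of that rule, not Docker's implementation; the function name `effective_command` and the flag `--some-flag` are hypothetical and chosen for illustration.

```shell
#!/usr/bin/env bash
# Illustrative model of Docker's exec-form ENTRYPOINT/CMD rule.
# Usage: effective_command ENTRYPOINT CMD [run-time args...]
# Run-time args, when present, replace CMD; otherwise CMD supplies the defaults.
effective_command() {
    entrypoint="$1"
    cmd="$2"
    shift 2
    if [ "$#" -gt 0 ]; then
        printf '%s %s\n' "$entrypoint" "$*"
    else
        printf '%s %s\n' "$entrypoint" "$cmd"
    fi
}

# No run-time arguments: CMD is appended to ENTRYPOINT.
effective_command /epp /bin/bash

# Run-time arguments (hypothetical flag) replace CMD.
effective_command /epp /bin/bash --some-flag
```

Under this rule, getting an interactive shell in the image means overriding the entrypoint, for example with `docker run --rm -it --entrypoint /bin/bash dynamo:latest-none-frontend`. In a Kubernetes pod spec, `command` replaces `ENTRYPOINT` and `args` replaces `CMD`.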

container/README.md

Lines changed: 51 additions & 0 deletions
@@ -15,6 +15,8 @@ The NVIDIA Dynamo project uses containerized development and deployment to maint
 - `Dockerfile.trtllm` - For TensorRT-LLM inference backend
 - `Dockerfile.sglang` - For SGLang inference backend
 - `Dockerfile` - Base/standalone configuration
+- `Dockerfile.frontend` - For Kubernetes Gateway API Inference Extension integration with EPP
+- `Dockerfile.epp` - For building the Endpoint Picker (EPP) image
 
 ### Why Containerization?
 
@@ -192,6 +194,55 @@ The `build.sh --dev-image` option takes a dev image and then builds a local-dev
 ./build.sh --dev-image dynamo:latest-vllm --framework vllm --dry-run
 ```
 
+### Building the Frontend Image
+
+The frontend image is a specialized container that includes the Dynamo components (NATS, etcd, dynamo, NIXL, etc.) along with the Endpoint Picker (EPP) for Kubernetes Gateway API Inference Extension integration. This image is primarily used for inference gateway deployments.
+
+**Step 1: Build the Custom Dynamo EPP Image**
+
+Follow the instructions in [`deploy/inference-gateway/README.md`](../deploy/inference-gateway/README.md) under the "Build the custom EPP image" section. This process:
+- Clones the Gateway API Inference Extension repository
+- Applies Dynamo-specific patches for custom routing
+- Builds the Dynamo router as a static library
+- Creates a custom EPP image with integrated Dynamo routing capabilities
+
+**Step 2: Build the Dynamo Base Image**
+
+The base image contains the core Dynamo runtime components, NATS server, etcd, and Python dependencies:
+```bash
+# Build the base dev image (framework=none for frontend-only deployment)
+./build.sh --framework none --target dev
+```
+
+**Step 3: Build the Frontend Image**
+
+Now build the frontend image that combines the Dynamo base with the EPP:
+
+```bash
+# Build the frontend image using the pre-built EPP from Step 1
+docker buildx build --load --platform linux/amd64 \
+  --build-arg DYNAMO_BASE_IMAGE=dynamo:latest-none-dev \
+  --build-arg EPP_IMAGE={EPP_IMAGE_TAG} \
+  --build-arg PYTHON_VERSION=3.12 \
+  -f container/Dockerfile.frontend \
+  -t dynamo:latest-none-frontend \
+  .
+```
+#### Frontend Image Contents
+
+The frontend image includes:
+- **EPP (Endpoint Picker)**: Handles request routing and load balancing for the inference gateway
+- **Dynamo Runtime**: Core platform components and routing logic
+- **NIXL**: NVIDIA Inference Xfer Library for high-performance network communication
+- **Benchmarking Tools**: Performance testing utilities (aiperf, aiconfigurator, etc.)
+- **Python Environment**: Virtual environment with all required dependencies
+- **NATS Server**: Message broker for Dynamo's distributed communication
+- **etcd**: Distributed key-value store for configuration and coordination
+
+#### Deployment
+
+The frontend image is designed for Kubernetes deployment with the Gateway API Inference Extension. See [`deploy/inference-gateway/README.md`](../deploy/inference-gateway/README.md) for complete deployment instructions using Helm charts.
+
 ### run.sh - Container Runtime Manager
 
 The `run.sh` script launches Docker containers with the appropriate configuration for development and inference workloads.
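
The Step 3 command added to `container/README.md` in this hunk takes its `EPP_IMAGE` build argument from a placeholder that must be filled with the tag produced in Step 1. A minimal sketch of a wrapper that refuses to proceed without that tag and only prints the command (a dry run) might look like the following. The function name `frontend_build_cmd` and the example tag are hypothetical and not part of this commit; the flags are copied from the README text.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints (does not execute) the Step 3 build command.
# $1: EPP image tag from Step 1 (required; no default is assumed)
# $2: Dynamo base image (optional; defaults to the tag used in the README)
frontend_build_cmd() {
    epp_image="$1"
    base_image="${2:-dynamo:latest-none-dev}"
    if [ -z "$epp_image" ]; then
        echo "error: pass the EPP image tag built in Step 1" >&2
        return 1
    fi
    printf 'docker buildx build --load --platform linux/amd64 --build-arg DYNAMO_BASE_IMAGE=%s --build-arg EPP_IMAGE=%s --build-arg PYTHON_VERSION=3.12 -f container/Dockerfile.frontend -t dynamo:latest-none-frontend .\n' \
        "$base_image" "$epp_image"
}

# Example with a made-up tag; substitute the real tag from Step 1.
frontend_build_cmd "example.registry/epp:dev"
```

Printing the command first keeps the sketch safe to run anywhere; once the output looks right, it can be executed from the repository root, since the Dockerfile's `COPY` instructions expect the repository as the build context.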
