73 changes: 73 additions & 0 deletions container/Dockerfile.frontend
@@ -0,0 +1,73 @@
# syntax=docker/dockerfile:1.10.0
# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0

ARG DYNAMO_BASE_IMAGE="dynamo:latest-none"
ARG EPP_IMAGE="us-central1-docker.pkg.dev/k8s-staging-images/gateway-api-inference-extension/epp:v0.5.1-dirty"
ARG PYTHON_VERSION=3.12

FROM ${DYNAMO_BASE_IMAGE} AS dynamo_base

FROM ${EPP_IMAGE} AS epp

FROM nvcr.io/nvidia/base/ubuntu:noble-20250619 AS frontend

ARG PYTHON_VERSION
RUN apt-get update -y \
    && apt-get install -y --no-install-recommends \
        # required for EPP
        ca-certificates \
        libstdc++6 \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*


# Create dynamo user with group 0 for OpenShift compatibility
RUN userdel -r ubuntu > /dev/null 2>&1 || true \
    && useradd -m -s /bin/bash -g 0 dynamo \
    && [ "$(id -u dynamo)" -eq 1000 ] \
    && mkdir -p /home/dynamo/.cache /opt/dynamo /workspace \
    && chown -R dynamo: /opt/dynamo /home/dynamo/.cache /workspace \
    && chmod -R g+w /opt/dynamo /home/dynamo/.cache /workspace

# Set HOME so ModelExpress can find the cache directory
ENV HOME=/home/dynamo

# Switch to dynamo user
USER dynamo
ENV DYNAMO_HOME=/opt/dynamo

WORKDIR /
COPY --chown=dynamo: --from=epp /epp /epp

COPY --chown=dynamo: container/launch_message.txt /opt/dynamo/.launch_screen
# Copy tests, examples, benchmarks, deploy, components, and recipes with correct ownership
COPY --chown=dynamo: tests /workspace/tests
COPY --chown=dynamo: examples /workspace/examples
COPY --chown=dynamo: benchmarks /workspace/benchmarks
COPY --chown=dynamo: deploy /workspace/deploy
COPY --chown=dynamo: components/ /workspace/components/
COPY --chown=dynamo: recipes/ /workspace/recipes/
# Copy attribution and license files with correct ownership
COPY --chown=dynamo: ATTRIBUTION* LICENSE /workspace/

ENV VIRTUAL_ENV=/opt/dynamo/venv
ENV PATH="/opt/dynamo/venv/bin:$PATH"
# Copy uv to system /bin and install the requested Python version
COPY --from=dynamo_base /bin/uv /bin/uvx /bin/
RUN uv python install $PYTHON_VERSION
# Copy virtual environment directly from dynamo_base (dev image)
# This includes all installed packages: dynamo, nixl, requirements.txt, requirements.test.txt
COPY --chown=dynamo: --from=dynamo_base /opt/dynamo/venv/ /opt/dynamo/venv/

# Setup environment for all users
USER root
RUN chmod 755 /opt/dynamo/.launch_screen && \
    echo 'source /opt/dynamo/venv/bin/activate' >> /etc/bash.bashrc && \
    echo 'cat /opt/dynamo/.launch_screen' >> /etc/bash.bashrc

USER dynamo

ENTRYPOINT ["/epp"]
CMD ["/bin/bash"]

51 changes: 51 additions & 0 deletions container/README.md
@@ -15,6 +15,8 @@ The NVIDIA Dynamo project uses containerized development and deployment to maint
- `Dockerfile.trtllm` - For TensorRT-LLM inference backend
- `Dockerfile.sglang` - For SGLang inference backend
- `Dockerfile` - Base/standalone configuration
- `Dockerfile.frontend` - For Kubernetes Gateway API Inference Extension integration with EPP
- `Dockerfile.epp` - For building the Endpoint Picker (EPP) image

### Why Containerization?

@@ -192,6 +194,55 @@ The `build.sh --dev-image` option takes a dev image and then builds a local-dev
./build.sh --dev-image dynamo:latest-vllm --framework vllm --dry-run
```

### Building the Frontend Image

The frontend image is a specialized container that bundles the Dynamo components (NATS, etcd, Dynamo, NIXL, etc.) with the Endpoint Picker (EPP) for Kubernetes Gateway API Inference Extension integration. It is primarily used for inference gateway deployments.

**Step 1: Build the Custom Dynamo EPP Image**

Follow the instructions in the "Build the custom EPP image" section of [`deploy/inference-gateway/README.md`](../deploy/inference-gateway/README.md). This process:
- Clones the Gateway API Inference Extension repository
- Applies Dynamo-specific patches for custom routing
- Builds the Dynamo router as a static library
- Creates a custom EPP image with integrated Dynamo routing capabilities

**Step 2: Build the Dynamo Base Image**

The base image contains the core Dynamo runtime components, NATS server, etcd, and Python dependencies:

```bash
# Build the base dev image (framework=none for frontend-only deployment)
./build.sh --framework none --target dev
```

**Step 3: Build the Frontend Image**

Now build the frontend image that combines the Dynamo base with the EPP:

```bash
# Build the frontend image using the pre-built EPP image from Step 1
# (replace {EPP_IMAGE_TAG} with the tag of that image)
docker buildx build --load --platform linux/amd64 \
--build-arg DYNAMO_BASE_IMAGE=dynamo:latest-none-dev \
--build-arg EPP_IMAGE={EPP_IMAGE_TAG} \
--build-arg PYTHON_VERSION=3.12 \
-f container/Dockerfile.frontend \
-t dynamo:latest-none-frontend \
.
```
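The build command above fails only partway through if the EPP image tag is missing or misspelled. As a minimal sketch, the wrapper below checks for the tag up front and can print the command instead of running it. The function name `build_frontend_image` and the `DRY_RUN` and `FRONTEND_TAG` variables are hypothetical and not part of the repository; the defaults mirror the values used in this README.

```bash
# Hypothetical helper, not part of the repository: wraps the documented
# `docker buildx build` invocation and refuses to run without an EPP image tag.
build_frontend_image() {
    # First argument: tag of the EPP image built in Step 1 (required).
    epp_image="${1:-}"
    # Optional overrides; defaults mirror the values used in this README.
    base_image="${DYNAMO_BASE_IMAGE:-dynamo:latest-none-dev}"
    python_version="${PYTHON_VERSION:-3.12}"
    output_tag="${FRONTEND_TAG:-dynamo:latest-none-frontend}"

    if [ -z "$epp_image" ]; then
        echo "usage: build_frontend_image <epp-image-tag>" >&2
        return 1
    fi

    # Assemble the command as positional parameters so quoting is preserved.
    set -- docker buildx build --load --platform linux/amd64 \
        --build-arg "DYNAMO_BASE_IMAGE=${base_image}" \
        --build-arg "EPP_IMAGE=${epp_image}" \
        --build-arg "PYTHON_VERSION=${python_version}" \
        -f container/Dockerfile.frontend \
        -t "${output_tag}" \
        .

    # With DRY_RUN=1, print the command instead of executing it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
        return 0
    fi

    "$@"
}
```

Run it from the repository root, since the build context is `.`. Calling `DRY_RUN=1 build_frontend_image <epp-image-tag>` prints the assembled command for review before a real build.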

#### Frontend Image Contents

The frontend image includes:
- **EPP (Endpoint Picker)**: Handles request routing and load balancing for inference gateway
- **Dynamo Runtime**: Core platform components and routing logic
- **NIXL**: NVIDIA Inference Xfer Library for high-performance data transfer
- **Benchmarking Tools**: Performance testing utilities (aiperf, aiconfigurator, etc.)
- **Python Environment**: Virtual environment with all required dependencies
- **NATS Server**: Message broker for Dynamo's distributed communication
- **etcd**: Distributed key-value store for configuration and coordination
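
Some of the contents listed above can be spot-checked after a build. `Dockerfile.frontend` sets `ENTRYPOINT ["/epp"]`, so a plain `docker run <image> <command>` passes the command to EPP as arguments instead of executing it, and the entrypoint has to be overridden. The sketch below is a hypothetical helper (`check_frontend_image` and `DRY_RUN` are not part of the repository). It checks only the paths that the Dockerfile itself sets up, and assumes the copied virtual environment resolves `python` on `PATH`.

```bash
# Hypothetical smoke check, not part of the repository: overrides the /epp
# entrypoint and verifies a few paths that Dockerfile.frontend sets up.
check_frontend_image() {
    # Image tag to inspect; defaults to the tag used in the build step.
    image="${1:-dynamo:latest-none-frontend}"

    # Commands executed inside the container:
    #   /epp   is copied from the EPP image
    #   uv     is copied to /bin from the Dynamo base image
    #   python is expected from /opt/dynamo/venv/bin, which is on PATH
    checks='test -x /epp && uv --version && python --version'

    set -- docker run --rm --entrypoint /bin/bash "$image" -c "$checks"

    # With DRY_RUN=1, print the command instead of executing it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
        return 0
    fi

    "$@"
}
```

A non-zero exit status from `check_frontend_image` means one of the checks failed inside the container. The sketch does not verify NATS, etcd, or the benchmarking tools.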

#### Deployment

The frontend image is designed for Kubernetes deployment with the Gateway API Inference Extension. See [`deploy/inference-gateway/README.md`](../deploy/inference-gateway/README.md) for complete deployment instructions using Helm charts.

### run.sh - Container Runtime Manager

The `run.sh` script launches Docker containers with the appropriate configuration for development and inference workloads.