From 1840cbb9e5be9c44c1aa077a74064615931fa416 Mon Sep 17 00:00:00 2001 From: Mo King Date: Thu, 6 Nov 2025 15:59:09 -0500 Subject: [PATCH 1/7] Add custom template tutorial --- docs.json | 1 + pods/templates/create-custom-template.mdx | 359 ++++++++++++++++++++++ 2 files changed, 360 insertions(+) create mode 100644 pods/templates/create-custom-template.mdx diff --git a/docs.json b/docs.json index c24f8737..71ba4e93 100644 --- a/docs.json +++ b/docs.json @@ -132,6 +132,7 @@ "pages": [ "pods/templates/overview", "pods/templates/manage-templates", + "pods/templates/create-custom-template", "pods/templates/environment-variables", "pods/templates/secrets" ] diff --git a/pods/templates/create-custom-template.mdx b/pods/templates/create-custom-template.mdx new file mode 100644 index 00000000..e861e182 --- /dev/null +++ b/pods/templates/create-custom-template.mdx @@ -0,0 +1,359 @@ +--- +title: "Create a custom Pod template" +sidebarTitle: "Create a custom template" +description: "Learn how to extend official Runpod templates to create your own Pod templates." +--- + +This tutorial shows how to create custom Pod templates by extending official Runpod base images with additional Python dependencies and pre-baked ML models. You'll learn the complete workflow from Dockerfile creation to deployment and testing. + +Custom templates allow you to package your specific dependencies, models, and configurations into reusable Docker images that can be deployed as Pods. This approach saves time during Pod initialization and ensures consistent environments across deployments. + +## What you'll learn + +In this tutorial, you'll learn how to: + +- Create a Dockerfile that extends a Runpod base image. +- Add Python dependencies to an existing base image. +- Pre-package ML models into your custom template. +- Build and push Docker images with the correct platform settings. +- Deploy and test your custom template as a Pod. + +## Requirements + +Before you begin, you'll need: + +- A [Runpod account](/get-started/manage-accounts). +- [Docker](https://www.docker.com/products/docker-desktop/) installed on your local machine. +- A [Docker Hub](https://hub.docker.com/) account for hosting your custom images. +- At least $5 in Runpod credits for testing. +- Basic familiarity with Docker and command-line operations. + +## Step 1: Create a custom Dockerfile + +First, you'll create a Dockerfile that extends a Runpod base image with additional dependencies: + +1. Create a new directory for your custom template: + +```bash +mkdir my-custom-template +cd my-custom-template +``` + +2. Create a new Dockerfile: + +```bash +touch Dockerfile +``` + +3. Open the Dockerfile in your preferred text editor and add the following content: + +```dockerfile +# Use the specified base image +FROM runpod/pytorch:2.8.0-py3.11-cuda12.8.1-cudnn-devel-ubuntu22.04 + +# Install additional Python dependencies +RUN pip install --no-cache-dir \ + transformers \ + accelerate + +# Don't specify a CMD here to use the base image's CMD +``` + +This extends the official Runpod PyTorch 2.8.0 base image, and installs two additional Python packages. This means that these packages will be automatically installed every time the Pod starts, so you won't need to run `pip install` again after Pod restarts. + + +When building custom templates, always start with a Runpod base image that matches your CUDA requirements. The base image includes essential components like the `/start.sh` script that handles Pod initialization. 
+ + +To maintain access to packaged services (like JupyterLab and SSH over TCP), we avoid specifying a `CMD` or `ENTRYPOINT` in the Dockerfile. Runpod base images include a carefully configured startup script (`/start.sh`) that handles Pod initialization, SSH setup, and service startup. Overriding this can break Pod functionality. + +## Step 2: Add system dependencies + +If your application requires system-level packages, add them before the Python dependencies: + +```dockerfile +# Use the specified base image +FROM runpod/pytorch:2.8.0-py3.11-cuda12.8.1-cudnn-devel-ubuntu22.04 + +# Update package list and install system dependencies +RUN apt-get update && apt-get install -y \ + git \ + wget \ + curl \ + && rm -rf /var/lib/apt/lists/* + +# Install additional Python dependencies +RUN pip install --no-cache-dir \ + transformers \ + accelerate \ + datasets \ + torch-audio + +# Don't specify a CMD here to use the base image's CMD +``` + + +Always clean up package lists with `rm -rf /var/lib/apt/lists/*` after installing system packages to reduce image size. + + +## Step 3: Pre-bake ML models + +To reduce Pod setup overhead, you can pre-download models during the Pod initialization process. Here are two approaches: + +### Method 1: Simple model download script + +Create a Python script that downloads your model: + +1. Create a file named `download_model.py` in the same directory as your Dockerfile: + +```python +from transformers import AutoTokenizer, AutoModelForCausalLM +import torch + +# Download and cache the model +model_name = "microsoft/DialoGPT-medium" +print(f"Downloading {model_name}...") + +tokenizer = AutoTokenizer.from_pretrained(model_name) +model = AutoModelForCausalLM.from_pretrained(model_name) + +print("Model downloaded and cached successfully!") +``` + +2. Update your Dockerfile to include and run this script: + +```dockerfile +# Use the specified base image +FROM runpod/pytorch:2.8.0-py3.11-cuda12.8.1-cudnn-devel-ubuntu22.04 + +# Install additional Python dependencies +RUN pip install --no-cache-dir \ + transformers \ + accelerate + +# Copy and run model download script +COPY download_model.py /tmp/download_model.py +RUN python /tmp/download_model.py && rm /tmp/download_model.py + +# Don't specify a CMD here to use the base image's CMD +``` + +### Method 2: Using the Hugging Face CLI + +For more control over model downloads, use the Hugging Face CLI: + +```dockerfile +# Use the specified base image +FROM runpod/pytorch:2.8.0-py3.11-cuda12.8.1-cudnn-devel-ubuntu22.04 + +# Install additional Python dependencies +RUN pip install --no-cache-dir \ + transformers \ + accelerate \ + huggingface_hub + +# Pre-download specific model files +RUN python -c "from huggingface_hub import snapshot_download; snapshot_download('microsoft/DialoGPT-medium', cache_dir='/root/.cache/huggingface')" + +# Don't specify a CMD here to use the base image's CMD +``` + + +Pre-baking large models will significantly increase your Docker image size and build time. Consider whether the faster Pod startup time justifies the larger image size for your use case. + + +## Step 4: Build and push your Docker image + +Now you're ready to build your custom image and push it to Docker Hub: + +1. Build your Docker image with the correct platform specification: + +```bash +docker build --platform=linux/amd64 -t my-custom-template:latest . +``` + + +The `--platform=linux/amd64` flag is crucial for Runpod compatibility. Runpod's infrastructure requires AMD64 architecture images. + + +2. 
Tag your image for Docker Hub (replace `YOUR_USERNAME` with your Docker Hub username): + +```bash +docker tag my-custom-template:latest YOUR_USERNAME/my-custom-template:latest +``` + +3. Push the image to Docker Hub: + +```bash +docker push YOUR_USERNAME/my-custom-template:latest +``` + + +If you haven't logged into Docker Hub from your command line, run `docker login` first and enter your Docker Hub credentials. + + +## Step 5: Create a Pod template in Runpod + +Next, create a Pod template using your custom Docker image: + +1. Navigate to the [Templates page](https://console.runpod.io/user/templates) in the Runpod console. +2. Click **New Template**. +3. Configure your template with these settings: + - **Name**: Give your template a descriptive name (e.g., "My Custom PyTorch Template"). + - **Container Image**: Enter your Docker Hub image name (e.g., `YOUR_USERNAME/my-custom-template:latest`). + - **Container Disk**: Set to at least 20 GB to accommodate your custom dependencies. + - **Volume Disk**: Set according to your storage needs (e.g., 20 GB). + - **Volume Mount Path**: Keep the default `/workspace`. + - **Expose HTTP Ports**: Add `8888` for JupyterLab access. + - **Expose TCP Ports**: Add `22` if you need SSH access. +4. Click **Save Template**. + +## Step 6: Deploy and test your custom template + +Now you're ready to deploy a Pod using your custom template to verify everything works correctly: + +1. Go to the [Pods page](https://console.runpod.io/pods) in the Runpod console. +2. Click **Deploy**. +3. Choose an appropriate GPU (make sure it meets the CUDA version requirements of your base image). +4. Click **Change Template** and select your custom template under **Your Pod Templates**. +5. Fill out the rest of the settings as desired, then click **Deploy On Demand**. +6. Wait for your Pod to initialize (this may take 5-10 minutes for the first deployment). + +## Step 7: Verify your custom template + +Once your Pod is running, verify that your customizations work correctly: + +1. Find your Pod on the [Pods page](https://console.runpod.io/pods) and click on it to open the connection menu. Click Jupyter Lab under HTTP Services to open JupyterLab. +2. Create a new Python notebook and test your pre-installed dependencies: + +```python +# Test that your custom packages are installed +import transformers +import accelerate +print(f"Transformers version: {transformers.__version__}") +print(f"Accelerate version: {accelerate.__version__}") + +# If you pre-baked a model, test loading it +from transformers import AutoTokenizer, AutoModelForCausalLM + +model_name = "microsoft/DialoGPT-medium" +tokenizer = AutoTokenizer.from_pretrained(model_name) +model = AutoModelForCausalLM.from_pretrained(model_name) + +print("Model loaded successfully!") +``` + +3. Run the cell to confirm everything is working as expected. + +## Advanced customization options + +### Setting environment variables + +You can set default environment variables in your template configuration: + +1. In the template creation form, scroll to **Environment Variables**. +2. Add key-value pairs for any environment variables your application needs: + - Key: `HUGGINGFACE_HUB_CACHE` + - Value: `/workspace/hf_cache` + +### Adding startup scripts + +To run custom initialization code when your Pod starts, create a startup script: + +1. Create a `startup.sh` file in your project directory: + +```bash +#!/bin/bash +echo "Running custom startup script..." +mkdir -p /workspace/models +echo "Custom startup complete!" +``` + +2. 
Add it to your Dockerfile: + +```dockerfile +# Copy startup script +COPY startup.sh /usr/local/bin/startup.sh +RUN chmod +x /usr/local/bin/startup.sh + +# Modify the start script to run our custom startup +RUN echo '/usr/local/bin/startup.sh' >> /start.sh +``` + +### Using multi-stage builds + +For complex applications, use multi-stage builds to reduce final image size: + +```dockerfile +# Build stage +FROM runpod/pytorch:2.8.0-py3.11-cuda12.8.1-cudnn-devel-ubuntu22.04 as builder + +# Install build dependencies +RUN apt-get update && apt-get install -y build-essential +RUN pip install --no-cache-dir some-package-that-needs-compilation + +# Final stage +FROM runpod/pytorch:2.8.0-py3.11-cuda12.8.1-cudnn-devel-ubuntu22.04 + +# Copy only the necessary files from builder +COPY --from=builder /usr/local/lib/python3.11/site-packages /usr/local/lib/python3.11/site-packages + +# Install runtime dependencies +RUN pip install --no-cache-dir transformers accelerate +``` + +## Troubleshooting + +Here are solutions to common issues when creating custom templates: + +### Build failures + +- **Platform mismatch**: Always use `--platform=linux/amd64` when building. +- **Base image not found**: Verify the base image tag exists on Docker Hub. +- **Package installation fails**: Check that package names are correct and available for the Python version in your base image. + +### Pod deployment issues + +- **Pod fails to start**: Check the Pod logs in the Runpod console for error messages. +- **Services not accessible**: Ensure you've exposed the correct ports in your template configuration. +- **CUDA version mismatch**: Make sure your base image CUDA version is compatible with your chosen GPU. + +### Performance issues + +- **Slow startup**: Consider pre-baking more dependencies or using a smaller base image. +- **Out of memory**: Increase container disk size or choose a GPU with more VRAM. +- **Model loading errors**: Verify that pre-baked models are in the expected cache directories. + +## Best practices + +Follow these best practices when creating custom templates: + +### Image optimization + +- Use `.dockerignore` to exclude unnecessary files from your build context. +- Combine RUN commands to reduce image layers. +- Clean up package caches and temporary files. +- Use specific version tags for dependencies to ensure reproducibility. + +### Security considerations + +- Don't include sensitive information like API keys in your Docker image. +- Use [Runpod Secrets](/pods/templates/secrets) for sensitive configuration. +- Regularly update base images to get security patches. + +### Version management + +- Tag your images with version numbers (e.g., `v1.0.0`) instead of just `latest`. +- Keep a changelog of what changes between versions. +- Test new versions thoroughly before updating production templates. + +## Next steps + +Now that you have a working custom template, consider these next steps: + +- **Automate builds**: Set up GitHub Actions or similar CI/CD to automatically build and push new versions of your template. +- **Share with team**: If you're using a team account, share your template with team members. +- **Create variations**: Build specialized versions of your template for different use cases (development vs. production). +- **Monitor usage**: Track how your custom templates perform in production and optimize accordingly. 
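As a starting point for the build automation mentioned above, the core of any CI job is the same build, tag, and push sequence from Step 4. Here is a minimal sketch of such a script (the `VERSION` variable and script name are illustrative; adjust the image name and versioning scheme for your project):

```bash
#!/bin/bash
# build-and-push.sh -- rebuild and publish a new version of your template
set -euo pipefail

VERSION="v1.0.0"  # bump this for each release
IMAGE="YOUR_USERNAME/my-custom-template"

# Build for Runpod's AMD64 infrastructure
docker build --platform=linux/amd64 -t "$IMAGE:$VERSION" .

# Move the latest tag forward as well
docker tag "$IMAGE:$VERSION" "$IMAGE:latest"

docker push "$IMAGE:$VERSION"
docker push "$IMAGE:latest"
```

Running the same script from CI (for example, a GitHub Actions job) keeps version tags reproducible instead of relying on `latest` alone.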
+ +For more advanced template management, see the [Template Management API documentation](/api-reference/templates/POST/templates) to programmatically create and update templates. \ No newline at end of file From d95744e19801b87aa58f17a0a053ae9ee5c6cf9f Mon Sep 17 00:00:00 2001 From: Mo King Date: Thu, 6 Nov 2025 16:22:18 -0500 Subject: [PATCH 2/7] Update description --- pods/templates/create-custom-template.mdx | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/pods/templates/create-custom-template.mdx b/pods/templates/create-custom-template.mdx index e861e182..6c86892b 100644 --- a/pods/templates/create-custom-template.mdx +++ b/pods/templates/create-custom-template.mdx @@ -1,10 +1,10 @@ --- title: "Create a custom Pod template" sidebarTitle: "Create a custom template" -description: "Learn how to extend official Runpod templates to create your own Pod templates." +description: "A step-by-step guide to extending official Runpod's official templates." --- -This tutorial shows how to create custom Pod templates by extending official Runpod base images with additional Python dependencies and pre-baked ML models. You'll learn the complete workflow from Dockerfile creation to deployment and testing. +This tutorial shows how to create custom Pod templates by extending Runpod's official base images with additional Python dependencies and pre-baked ML models. You'll learn the complete workflow from Dockerfile creation to deployment and testing. Custom templates allow you to package your specific dependencies, models, and configurations into reusable Docker images that can be deployed as Pods. This approach saves time during Pod initialization and ensures consistent environments across deployments. From f1c8e58b9989ae49a25fb266a414fe7d4376220c Mon Sep 17 00:00:00 2001 From: Mo King Date: Mon, 24 Nov 2025 12:19:28 -0500 Subject: [PATCH 3/7] Second draft of custom Pod template tutorial --- docs.json | 7 +- instant-clusters/slurm-clusters.mdx | 1 - pods/overview.mdx | 2 + pods/templates/create-custom-template.mdx | 602 ++++++++++++------- pods/templates/manage-templates.mdx | 4 + serverless/load-balancing/build-a-worker.mdx | 1 - serverless/load-balancing/overview.mdx | 1 - serverless/load-balancing/vllm-worker.mdx | 1 - 8 files changed, 391 insertions(+), 228 deletions(-) diff --git a/docs.json b/docs.json index 5ff1a967..71675129 100644 --- a/docs.json +++ b/docs.json @@ -9,7 +9,12 @@ }, "background": {}, "styling": { - "codeblocks": "system" + "codeblocks": { + "theme": { + "dark": "github-dark", + "light": "github-light" + } + } }, "icons": { "library": "fontawesome" diff --git a/instant-clusters/slurm-clusters.mdx b/instant-clusters/slurm-clusters.mdx index 9ef40a65..d61b4ffa 100644 --- a/instant-clusters/slurm-clusters.mdx +++ b/instant-clusters/slurm-clusters.mdx @@ -2,7 +2,6 @@ title: Slurm Clusters sidebarTitle: Slurm Clusters description: Deploy Slurm Clusters on Runpod with zero configuration -tag: "NEW" --- Runpod Slurm Clusters provide a managed high-performance computing and scheduling solution that enables you to rapidly create and manage Slurm Clusters with minimal setup. diff --git a/pods/overview.mdx b/pods/overview.mdx index 050f5977..57726fd1 100644 --- a/pods/overview.mdx +++ b/pods/overview.mdx @@ -32,6 +32,8 @@ Each Pod consists of these core components: Templates eliminate the need to manually set up environments, saving time and reducing configuration errors. 
For example, instead of installing PyTorch, configuring JupyterLab, and setting up all dependencies yourself, you can select an official Runpod PyTorch template and have everything ready to go instantly. +To learn how to create your own custom templates, see [Build a custom Pod template](/pods/templates/create-custom-template). + ## Storage Pods offer three types of storage to match different use cases: diff --git a/pods/templates/create-custom-template.mdx b/pods/templates/create-custom-template.mdx index 6c86892b..71052e0b 100644 --- a/pods/templates/create-custom-template.mdx +++ b/pods/templates/create-custom-template.mdx @@ -1,359 +1,515 @@ --- -title: "Create a custom Pod template" -sidebarTitle: "Create a custom template" -description: "A step-by-step guide to extending official Runpod's official templates." +title: "Build a custom Pod template" +sidebarTitle: "Build a custom template" +description: "A step-by-step guide to extending Runpod's official templates." +tag: "NEW" --- -This tutorial shows how to create custom Pod templates by extending Runpod's official base images with additional Python dependencies and pre-baked ML models. You'll learn the complete workflow from Dockerfile creation to deployment and testing. + +You can find the complete code for this tutorial, including automated build options with GitHub Actions, in the [runpod-workers/pod-template](https://github.com/runpod-workers/pod-template) repository. + + +This tutorial shows how to build a custom Pod template from scratch. You'll extend an official Runpod template, add your own dependencies, configure how your container starts, and pre-load machine learning models. This approach saves time during Pod initialization and ensures consistent environments across deployments. -Custom templates allow you to package your specific dependencies, models, and configurations into reusable Docker images that can be deployed as Pods. This approach saves time during Pod initialization and ensures consistent environments across deployments. +By creating custom templates, you can package everything your project needs into a reusable Docker image. Once built, you can deploy your workload in seconds instead of reinstalling dependencies every time you start a new Pod. You can also share your template with members of your team and the wider Runpod community. ## What you'll learn In this tutorial, you'll learn how to: +- Set up a project to build a custom Pod template. - Create a Dockerfile that extends a Runpod base image. -- Add Python dependencies to an existing base image. -- Pre-package ML models into your custom template. -- Build and push Docker images with the correct platform settings. -- Deploy and test your custom template as a Pod. +- Configure container startup options (JupyterLab/SSH, custom applications, or application-only mode). +- Add Python dependencies and system packages. +- Pre-load machine learning models from Hugging Face or local files. +- Build, test, and push your template to Docker Hub. +- Deploy Pods using your custom template. ## Requirements Before you begin, you'll need: - A [Runpod account](/get-started/manage-accounts). -- [Docker](https://www.docker.com/products/docker-desktop/) installed on your local machine. -- A [Docker Hub](https://hub.docker.com/) account for hosting your custom images. -- At least $5 in Runpod credits for testing. -- Basic familiarity with Docker and command-line operations. +- Docker installed on your local machine or a remote server. 
+- A Docker Hub account (or access to another container registry). +- Basic familiarity with Docker and Python. -## Step 1: Create a custom Dockerfile +## Step 1: Set up your project structure -First, you'll create a Dockerfile that extends a Runpod base image with additional dependencies: +First, create a directory for your custom template and the necessary files. -1. Create a new directory for your custom template: + + +Create a new directory for your template project: ```bash -mkdir my-custom-template -cd my-custom-template +mkdir my-custom-pod-template +cd my-custom-pod-template ``` + -2. Create a new Dockerfile: + +Create the following files in your project directory: ```bash -touch Dockerfile +touch Dockerfile requirements.txt main.py ``` -3. Open the Dockerfile in your preferred text editor and add the following content: +Your project structure should now look like this: -```dockerfile -# Use the specified base image -FROM runpod/pytorch:2.8.0-py3.11-cuda12.8.1-cudnn-devel-ubuntu22.04 +``` +my-custom-pod-template/ +├── Dockerfile +├── requirements.txt +└── main.py +``` + + -# Install additional Python dependencies -RUN pip install --no-cache-dir \ - transformers \ - accelerate +## Step 2: Choose a base image and create your Dockerfile -# Don't specify a CMD here to use the base image's CMD -``` +Runpod offers base images with PyTorch, CUDA, and common dependencies pre-installed. You'll extend one of these images to build your custom template. -This extends the official Runpod PyTorch 2.8.0 base image, and installs two additional Python packages. This means that these packages will be automatically installed every time the Pod starts, so you won't need to run `pip install` again after Pod restarts. + + +Runpod offers several base images. You can explore available base images on [Docker Hub](https://hub.docker.com/u/runpod). - -When building custom templates, always start with a Runpod base image that matches your CUDA requirements. The base image includes essential components like the `/start.sh` script that handles Pod initialization. - +For this tutorial, we'll use the PyTorch image, `runpod/pytorch:1.0.2-cu1281-torch280-ubuntu2404` which includes PyTorch 2.8.0, CUDA 12.8.1, and Ubuntu 24.04. + -To maintain access to packaged services (like JupyterLab and SSH over TCP), we avoid specifying a `CMD` or `ENTRYPOINT` in the Dockerfile. Runpod base images include a carefully configured startup script (`/start.sh`) that handles Pod initialization, SSH setup, and service startup. Overriding this can break Pod functionality. 
+ +Open `Dockerfile` and add the following content: -## Step 2: Add system dependencies +```dockerfile Dockerfile +# Use Runpod PyTorch base image +FROM runpod/pytorch:1.0.2-cu1281-torch280-ubuntu2404 -If your application requires system-level packages, add them before the Python dependencies: +# Set environment variables +# This ensures Python output is immediately visible in logs +ENV PYTHONUNBUFFERED=1 -```dockerfile -# Use the specified base image -FROM runpod/pytorch:2.8.0-py3.11-cuda12.8.1-cudnn-devel-ubuntu22.04 +# Set the working directory +WORKDIR /app -# Update package list and install system dependencies -RUN apt-get update && apt-get install -y \ - git \ - wget \ - curl \ +# Install system dependencies if needed +RUN apt-get update --yes && \ + DEBIAN_FRONTEND=noninteractive apt-get install --yes --no-install-recommends \ + wget \ + curl \ && rm -rf /var/lib/apt/lists/* -# Install additional Python dependencies -RUN pip install --no-cache-dir \ - transformers \ - accelerate \ - datasets \ - torch-audio +# Copy requirements file +COPY requirements.txt /app/ + +# Install Python dependencies +RUN pip install --no-cache-dir --upgrade pip && \ + pip install --no-cache-dir -r requirements.txt -# Don't specify a CMD here to use the base image's CMD +# Copy application files +COPY . /app ``` - -Always clean up package lists with `rm -rf /var/lib/apt/lists/*` after installing system packages to reduce image size. - +This basic Dockerfile: +- Extends the Runpod PyTorch base image. +- Installs system packages (`wget`, `curl`). +- Installs Python dependencies from `requirements.txt`. +- Copies your application code to `/app`. + + -## Step 3: Pre-bake ML models +## Step 3: Add Python dependencies -To reduce Pod setup overhead, you can pre-download models during the Pod initialization process. Here are two approaches: +Now define the Python packages your application needs. -### Method 1: Simple model download script + + +Open `requirements.txt` and add your Python dependencies: -Create a Python script that downloads your model: +```txt requirements.txt +# Python dependencies +# Add your packages here +numpy>=1.24.0 +requests>=2.31.0 +transformers>=4.40.0 +``` -1. Create a file named `download_model.py` in the same directory as your Dockerfile: +These packages will be installed when you build your Docker image. Add any additional libraries your application requires. + + -```python -from transformers import AutoTokenizer, AutoModelForCausalLM -import torch +## Step 4: Configure container startup behavior -# Download and cache the model -model_name = "microsoft/DialoGPT-medium" -print(f"Downloading {model_name}...") +Runpod base images come with built-in services like Jupyter and SSH. You can choose how your container starts: whether to keep all the base image services running, run your application alongside those services, or run only your application. -tokenizer = AutoTokenizer.from_pretrained(model_name) -model = AutoModelForCausalLM.from_pretrained(model_name) +There are three ways to configure how your container starts: -print("Model downloaded and cached successfully!") -``` +**Option 1: Keep all base image services (default)** -2. Update your Dockerfile to include and run this script: +The base image automatically starts Jupyter and SSH based on your template settings. This is the default behavior and is ideal for interactive development and remote access. 
-```dockerfile -# Use the specified base image -FROM runpod/pytorch:2.8.0-py3.11-cuda12.8.1-cudnn-devel-ubuntu22.04 +**Option 2: Run your application after services start** -# Install additional Python dependencies -RUN pip install --no-cache-dir \ - transformers \ - accelerate +This option starts Jupyter/SSH in the background, then runs your application. You'll use a startup script for this. -# Copy and run model download script -COPY download_model.py /tmp/download_model.py -RUN python /tmp/download_model.py && rm /tmp/download_model.py +**Option 3: Application only (no Jupyter or SSH)** -# Don't specify a CMD here to use the base image's CMD -``` +This runs only your application with minimal overhead, which is ideal for production deployments where you don't need interactive access. -### Method 2: Using the Hugging Face CLI +### Option 1: Keep base image services (no changes needed) +If you want the default behavior with Jupyter and SSH services, you don't need to modify the Dockerfile. The base image's `/start.sh` script handles everything automatically. -For more control over model downloads, use the Hugging Face CLI: +This is already configured in the Dockerfile from Step 2. -```dockerfile -# Use the specified base image -FROM runpod/pytorch:2.8.0-py3.11-cuda12.8.1-cudnn-devel-ubuntu22.04 +### Option 2: Configure custom application with services +If you want to run your application alongside Jupyter/SSH services, add these lines to the end of your Dockerfile: -# Install additional Python dependencies -RUN pip install --no-cache-dir \ - transformers \ - accelerate \ - huggingface_hub +```dockerfile Dockerfile +# Run application after services start +COPY run.sh /app/run.sh +RUN chmod +x /app/run.sh +CMD ["/app/run.sh"] +``` -# Pre-download specific model files -RUN python -c "from huggingface_hub import snapshot_download; snapshot_download('microsoft/DialoGPT-medium', cache_dir='/root/.cache/huggingface')" +Create a new file named `run.sh` in the same directory as your `Dockerfile`: -# Don't specify a CMD here to use the base image's CMD +```bash +touch run.sh ``` - -Pre-baking large models will significantly increase your Docker image size and build time. Consider whether the faster Pod startup time justifies the larger image size for your use case. - +Then add the following content to it: -## Step 4: Build and push your Docker image +```bash run.sh +#!/bin/bash +# Start base image services (Jupyter/SSH) in background +/start.sh & -Now you're ready to build your custom image and push it to Docker Hub: +# Wait for services to start +sleep 2 -1. Build your Docker image with the correct platform specification: +# Run your application +python /app/main.py -```bash -docker build --platform=linux/amd64 -t my-custom-template:latest . +# Wait for background processes +wait ``` - -The `--platform=linux/amd64` flag is crucial for Runpod compatibility. Runpod's infrastructure requires AMD64 architecture images. - +This script starts the base services in the background, then runs your application. -2. Tag your image for Docker Hub (replace `YOUR_USERNAME` with your Docker Hub username): +### Option 3: Configure application-only mode +For production deployments where you don't need Jupyter or SSH, add these lines to the end of your Dockerfile: -```bash -docker tag my-custom-template:latest YOUR_USERNAME/my-custom-template:latest +```dockerfile Dockerfile +# Clear entrypoint and run application only +ENTRYPOINT [] +CMD ["python", "/app/main.py"] ``` -3. 
Push the image to Docker Hub: +This overrides the base image entrypoint and runs only your Python application. -```bash -docker push YOUR_USERNAME/my-custom-template:latest +--- + +For this tutorial, we'll use Option 1 (keeping the base image services) for ease of testing and development. + +## Step 5: Create your application + +Next we'll create the Python application that will run in your Pod. Open `main.py` and add your application code. + +Here's an example app that loads a machine learning model and performs inference on sample texts. (You can also replace this with your own application logic.) + +```python main.py +""" +Example Pod template application with sentiment analysis. +""" + +import sys +import torch +import time +import signal +from transformers import pipeline + +def main(): + print("Hello from your custom Runpod template!") + print(f"Python version: {sys.version.split()[0]}") + print(f"PyTorch version: {torch.__version__}") + print(f"CUDA available: {torch.cuda.is_available()}") + + if torch.cuda.is_available(): + print(f"CUDA version: {torch.version.cuda}") + print(f"GPU device: {torch.cuda.get_device_name(0)}") + + # Initialize model + print("\nLoading sentiment analysis model...") + device = 0 if torch.cuda.is_available() else -1 + + classifier = pipeline( + "sentiment-analysis", + model="distilbert-base-uncased-finetuned-sst-2-english", + device=device + ) + + print("Model loaded successfully!") + + # Example inference + test_texts = [ + "This is a wonderful experience!", + "I really don't like this at all.", + "The weather is nice today.", + ] + + print("\n--- Running sentiment analysis ---") + for text in test_texts: + result = classifier(text) + print(f"Text: {text}") + print(f"Result: {result[0]['label']} (confidence: {result[0]['score']:.4f})\n") + + print("Container is running. Press Ctrl+C to stop.") + + # Keep container running + def signal_handler(sig, frame): + print("\nShutting down...") + sys.exit(0) + + signal.signal(signal.SIGINT, signal_handler) + signal.signal(signal.SIGTERM, signal_handler) + + try: + while True: + time.sleep(60) + except KeyboardInterrupt: + signal_handler(None, None) + +if __name__ == "__main__": + main() ``` - -If you haven't logged into Docker Hub from your command line, run `docker login` first and enter your Docker Hub credentials. - +## Step 6: Pre-load a model into your template -## Step 5: Create a Pod template in Runpod +Pre-loading models into your Docker image means that you won't need to re-download a model every time you start up a new Pod, enabling you to create easily reusable and shareable environments for ML inference. -Next, create a Pod template using your custom Docker image: +There are two ways to pre-load models: + +- **Option 1: Automatic download from Hugging Face (recommended)**: This is the simplest approach. During the Docker build, Python downloads and caches the model using the transformers library. + +- **Option 2: Manual download with wget**: This gives you explicit control and works with custom or hosted models. -1. Navigate to the [Templates page](https://console.runpod.io/user/templates) in the Runpod console. -2. Click **New Template**. -3. Configure your template with these settings: - - **Name**: Give your template a descriptive name (e.g., "My Custom PyTorch Template"). - - **Container Image**: Enter your Docker Hub image name (e.g., `YOUR_USERNAME/my-custom-template:latest`). - - **Container Disk**: Set to at least 20 GB to accommodate your custom dependencies. 
- - **Volume Disk**: Set according to your storage needs (e.g., 20 GB). - - **Volume Mount Path**: Keep the default `/workspace`. - - **Expose HTTP Ports**: Add `8888` for JupyterLab access. - - **Expose TCP Ports**: Add `22` if you need SSH access. -4. Click **Save Template**. +For this tutorial, we'll use Option 1 (automatic download from Hugging Face) for ease of setup and testing, but you can use Option 2 if you need more control. -## Step 6: Deploy and test your custom template +### Option 1: Pre-load models from Hugging Face +Add these lines to your Dockerfile before the `COPY . /app` line: -Now you're ready to deploy a Pod using your custom template to verify everything works correctly: +```dockerfile Dockerfile +# Set Hugging Face cache directory +ENV HF_HOME=/app/models +ENV HF_HUB_ENABLE_HF_TRANSFER=0 -1. Go to the [Pods page](https://console.runpod.io/pods) in the Runpod console. -2. Click **Deploy**. -3. Choose an appropriate GPU (make sure it meets the CUDA version requirements of your base image). -4. Click **Change Template** and select your custom template under **Your Pod Templates**. -5. Fill out the rest of the settings as desired, then click **Deploy On Demand**. -6. Wait for your Pod to initialize (this may take 5-10 minutes for the first deployment). +# Pre-download model during build +RUN python -c "from transformers import pipeline; pipeline('sentiment-analysis', model='distilbert-base-uncased-finetuned-sst-2-english')" +``` + +During the build, Python will download the model and cache it in `/app/models`. When you deploy Pods with this template, the model loads instantly from the cache. + +To use the cached model in your application, add `local_files_only=True` when loading: -## Step 7: Verify your custom template +```python +classifier = pipeline( + "sentiment-analysis", + model="distilbert-base-uncased-finetuned-sst-2-english", + device=device, + model_kwargs={"local_files_only": True} +) +``` -Once your Pod is running, verify that your customizations work correctly: +### Option 2: Pre-load models with wget +For more control or to use models from custom sources, you can manually download model files during the build. -1. Find your Pod on the [Pods page](https://console.runpod.io/pods) and click on it to open the connection menu. Click Jupyter Lab under HTTP Services to open JupyterLab. -2. Create a new Python notebook and test your pre-installed dependencies: +Add these lines to your Dockerfile before the `COPY . 
/app` line: + +```dockerfile Dockerfile +# Create model directory and download files +RUN mkdir -p /app/models/distilbert-model && \ + cd /app/models/distilbert-model && \ + wget -q https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english/resolve/main/config.json && \ + wget -q https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english/resolve/main/model.safetensors && \ + wget -q https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english/resolve/main/tokenizer_config.json && \ + wget -q https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english/resolve/main/vocab.txt +``` + +Then load the model from the local directory in your application: ```python -# Test that your custom packages are installed -import transformers -import accelerate -print(f"Transformers version: {transformers.__version__}") -print(f"Accelerate version: {accelerate.__version__}") +classifier = pipeline( + 'sentiment-analysis', + model='/app/models/distilbert-model', + device=device +) +``` -# If you pre-baked a model, test loading it -from transformers import AutoTokenizer, AutoModelForCausalLM +## Step 7: Build and test your Docker image -model_name = "microsoft/DialoGPT-medium" -tokenizer = AutoTokenizer.from_pretrained(model_name) -model = AutoModelForCausalLM.from_pretrained(model_name) +Now that your template is configured, you can build and test your Docker image locally to make sure it works correctly: -print("Model loaded successfully!") + + +Run the Docker build command from your project directory: + +```bash +docker build --platform linux/amd64 -t my-custom-template:latest . ``` -3. Run the cell to confirm everything is working as expected. +The `--platform linux/amd64` flag ensures compatibility with Runpod's infrastructure, and is required if you're building on a Mac or ARM system. -## Advanced customization options +The build process will: +- Download the base image. +- Install system dependencies. +- Install Python packages. +- Download and cache models (if configured). +- Copy your application files. -### Setting environment variables +This may take 5-15 minutes depending on your dependencies and model sizes. + -You can set default environment variables in your template configuration: + +Check that your image was created successfully: -1. In the template creation form, scroll to **Environment Variables**. -2. Add key-value pairs for any environment variables your application needs: - - Key: `HUGGINGFACE_HUB_CACHE` - - Value: `/workspace/hf_cache` +```bash +docker images | grep my-custom-template +``` -### Adding startup scripts +You should see your image listed with the `latest` tag, similar to this: -To run custom initialization code when your Pod starts, create a startup script: +```bash +my-custom-template latest 54c3d1f97912 10 seconds ago 10.9GB +``` + -1. Create a `startup.sh` file in your project directory: + +To test the container locally, run the following command: ```bash -#!/bin/bash -echo "Running custom startup script..." -mkdir -p /workspace/models -echo "Custom startup complete!" +docker run --rm -it --platform linux/amd64 my-custom-template:latest /bin/bash ``` -2. Add it to your Dockerfile: +This starts the container and connects you to a shell inside it, exactly like the Runpod web terminal but running locally on your machine. + +You can use this shell to test your application and verify that your dependencies are installed correctly. (Press `Ctrl+D` when you want to return to your local terminal.) 
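For example, a quick sanity check you can run from the shell before trying the full application (assuming the packages from this tutorial's `requirements.txt` and the model pre-loading from Step 6):

```bash
# Confirm the key Python packages are importable and print their versions
python -c "import torch, transformers, numpy; print(torch.__version__, transformers.__version__, numpy.__version__)"

# Confirm the pre-loaded model files made it into the image
ls /app/models
```

If either command fails, fix the Dockerfile and rebuild before pushing the image.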
-```dockerfile -# Copy startup script -COPY startup.sh /usr/local/bin/startup.sh -RUN chmod +x /usr/local/bin/startup.sh +When you connect to the container shell, you'll be taken directly to the `/app` directory, which contains your application code (`main.py`) and `requirements.txt`. Your models can be found in `/app/models`. -# Modify the start script to run our custom startup -RUN echo '/usr/local/bin/startup.sh' >> /start.sh + + + +Try running the sample application (or any custom code you added): + +```bash +python main.py ``` -### Using multi-stage builds +You should see output from the application in your terminal, including the model loading and inference results. + +Press `Ctrl+C` to stop the application and `Ctrl+D` when you're ready to exit the container. + + -For complex applications, use multi-stage builds to reduce final image size: +## Step 8: Push to Docker Hub -```dockerfile -# Build stage -FROM runpod/pytorch:2.8.0-py3.11-cuda12.8.1-cudnn-devel-ubuntu22.04 as builder +To use your template with Runpod, push to Docker Hub (or another container registry). -# Install build dependencies -RUN apt-get update && apt-get install -y build-essential -RUN pip install --no-cache-dir some-package-that-needs-compilation + + +Tag your image with your Docker Hub username: -# Final stage -FROM runpod/pytorch:2.8.0-py3.11-cuda12.8.1-cudnn-devel-ubuntu22.04 +```bash +docker tag my-custom-template:latest YOUR_DOCKER_USERNAME/my-custom-template:latest +``` -# Copy only the necessary files from builder -COPY --from=builder /usr/local/lib/python3.11/site-packages /usr/local/lib/python3.11/site-packages +Replace `YOUR_DOCKER_USERNAME` with your actual Docker Hub username. + -# Install runtime dependencies -RUN pip install --no-cache-dir transformers accelerate + +Authenticate with Docker Hub: + +```bash +docker login ``` -## Troubleshooting +If you aren't already logged in to Docker Hub, you'll be prompted to enter your Docker Hub username and password. + -Here are solutions to common issues when creating custom templates: + +Push your image to Docker Hub: -### Build failures +```bash +docker push YOUR_DOCKER_USERNAME/my-custom-template:latest +``` + +This uploads your image to Docker Hub, making it accessible to Runpod. Large images may take several minutes to upload. + + + +## Step 9: Create a Pod template in the Runpod console -- **Platform mismatch**: Always use `--platform=linux/amd64` when building. -- **Base image not found**: Verify the base image tag exists on Docker Hub. -- **Package installation fails**: Check that package names are correct and available for the Python version in your base image. +Next, create a Pod template using your custom Docker image: -### Pod deployment issues + + +Navigate to the [Templates page](https://console.runpod.io/user/templates) in the Runpod console and click **New Template**. + -- **Pod fails to start**: Check the Pod logs in the Runpod console for error messages. -- **Services not accessible**: Ensure you've exposed the correct ports in your template configuration. -- **CUDA version mismatch**: Make sure your base image CUDA version is compatible with your chosen GPU. + +Configure your template with these settings: + - **Name**: Give your template a descriptive name (e.g., "my-custom-template"). + - **Container Image**: Enter the Docker Hub image name and tag: `YOUR_DOCKER_USERNAME/my-custom-template:latest`. + - **Container Disk**: Set to at least 15 GB. 
+ - **HTTP Ports**: Expand the section, click **Add port**, then enter **JupyterLab** as the port label and **8888** as the port number. + - **TCP Ports**: Expand the section, click **Add port**, then enter **SSH** as the port label and **22** as the port number. -### Performance issues +Leave all other settings on their defaults and click **Save Template**. + + -- **Slow startup**: Consider pre-baking more dependencies or using a smaller base image. -- **Out of memory**: Increase container disk size or choose a GPU with more VRAM. -- **Model loading errors**: Verify that pre-baked models are in the expected cache directories. +## Step 10: Deploy and test your template -## Best practices +Now you can deploy and test your template on a Pod: -Follow these best practices when creating custom templates: + + +Go to the [Pods page](https://console.runpod.io/pods) in the Runpod console and click **Deploy**. + -### Image optimization + +Configure your Pod with these settings: -- Use `.dockerignore` to exclude unnecessary files from your build context. -- Combine RUN commands to reduce image layers. -- Clean up package caches and temporary files. -- Use specific version tags for dependencies to ensure reproducibility. + - **GPU**: Select any GPU type with at least 16 GB of VRAM. + - **Pod Template**: Click **Change Template**. You should see your custom template ("my-custom-template") in the list. Click it to select it. -### Security considerations +Leave all other settings on their defaults and click **Deploy On-Demand**. -- Don't include sensitive information like API keys in your Docker image. -- Use [Runpod Secrets](/pods/templates/secrets) for sensitive configuration. -- Regularly update base images to get security patches. +Your Pod will start with all your pre-installed dependencies and models. The first deployment may take a few minutes as Runpod downloads your image. + -### Version management + +Once your Pod is running, click on your Pod to open the connection options panel. -- Tag your images with version numbers (e.g., `v1.0.0`) instead of just `latest`. -- Keep a changelog of what changes between versions. -- Test new versions thoroughly before updating production templates. +Try one of the following connection options: +- **Web Terminal**: Click **Enable Web Terminal** and then **Open Web Terminal** to access it. +- **JupyterLab**: It may take a few minutes for JupyterLab to start. Once it's labeled as **Ready**, click the **JupyterLab** link to access it. +- **SSH**: Copy the SSH command and run it in your local terminal to access it. (See [Connect to a Pod with SSH](/pods/configuration/use-ssh) for details on how to use SSH.) + + ## Next steps -Now that you have a working custom template, consider these next steps: +Congratulations! You've built a custom Pod template and deployed it to Runpod. + +You can use this as a jumping off point to build your own custom templates with your own applications, dependencies, and models. + +For example, you can try: + +- Adding more dependencies and models to your template. +- Creating different template versions for different use cases. +- Automating builds using GitHub Actions or other CI/CD tools. +- Using [Runpod secrets](/pods/templates/secrets) to manage sensitive information. -- **Automate builds**: Set up GitHub Actions or similar CI/CD to automatically build and push new versions of your template. -- **Share with team**: If you're using a team account, share your template with team members. 
-- **Create variations**: Build specialized versions of your template for different use cases (development vs. production). -- **Monitor usage**: Track how your custom templates perform in production and optimize accordingly. +For more information on working with templates, see the [Manage Pod templates](/pods/templates/manage-templates) guide. -For more advanced template management, see the [Template Management API documentation](/api-reference/templates/POST/templates) to programmatically create and update templates. \ No newline at end of file +For more advanced template management, you can use the [Runpod REST API](/api-reference/templates/POST/templates) to programmatically create and update templates. \ No newline at end of file diff --git a/pods/templates/manage-templates.mdx b/pods/templates/manage-templates.mdx index da54a5b6..de6e2459 100644 --- a/pods/templates/manage-templates.mdx +++ b/pods/templates/manage-templates.mdx @@ -48,6 +48,10 @@ Most Docker images have built in start commands, so you can usually leave this b ## Creating templates + +To learn how to create your own custom templates, see [Build a custom Pod template](/pods/templates/create-custom-template). + + diff --git a/serverless/load-balancing/build-a-worker.mdx b/serverless/load-balancing/build-a-worker.mdx index 4ac68d7e..52af1447 100644 --- a/serverless/load-balancing/build-a-worker.mdx +++ b/serverless/load-balancing/build-a-worker.mdx @@ -2,7 +2,6 @@ title: "Build a load balancing worker" sidebarTitle: "Build a load balancing worker" description: "Learn how to implement and deploy a load balancing worker with FastAPI." -tag: "NEW" --- This tutorial shows how to build a load balancing worker using FastAPI and deploy it as a Serverless endpoint on Runpod. diff --git a/serverless/load-balancing/overview.mdx b/serverless/load-balancing/overview.mdx index 2b08f1cd..746b819c 100644 --- a/serverless/load-balancing/overview.mdx +++ b/serverless/load-balancing/overview.mdx @@ -2,7 +2,6 @@ title: "Overview" sidebarTitle: "Overview" description: "Deploy custom direct-access REST APIs with load balancing Serverless endpoints." -tag: "NEW" --- Load balancing endpoints offer a completely new paradigm for Serverless endpoint creation, enabling direct access to worker HTTP servers without an intermediary queueing system. diff --git a/serverless/load-balancing/vllm-worker.mdx b/serverless/load-balancing/vllm-worker.mdx index 404b891d..d393e5c2 100644 --- a/serverless/load-balancing/vllm-worker.mdx +++ b/serverless/load-balancing/vllm-worker.mdx @@ -2,7 +2,6 @@ title: "Build a load balancing vLLM endpoint" sidebarTitle: "Build a vLLM load balancer" description: "Learn how to deploy a custom vLLM server to a load balancing Serverless endpoint." -tag: "NEW" --- This tutorial shows how to build a vLLM application using FastAPI and deploy it as a load balancing Serverless endpoint on Runpod. 
From 66d3ec048102827a632e8f5d1572c7c4b1459b0d Mon Sep 17 00:00:00 2001 From: Mo King Date: Mon, 24 Nov 2025 12:21:20 -0500 Subject: [PATCH 4/7] Update wording --- pods/templates/create-custom-template.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/pods/templates/create-custom-template.mdx b/pods/templates/create-custom-template.mdx index 71052e0b..4c1e23be 100644 --- a/pods/templates/create-custom-template.mdx +++ b/pods/templates/create-custom-template.mdx @@ -9,7 +9,7 @@ tag: "NEW" You can find the complete code for this tutorial, including automated build options with GitHub Actions, in the [runpod-workers/pod-template](https://github.com/runpod-workers/pod-template) repository. -This tutorial shows how to build a custom Pod template from scratch. You'll extend an official Runpod template, add your own dependencies, configure how your container starts, and pre-load machine learning models. This approach saves time during Pod initialization and ensures consistent environments across deployments. +This tutorial shows how to build a custom Pod template from the ground up. You'll extend an official Runpod template, add your own dependencies, configure how your container starts, and pre-load machine learning models. This approach saves time during Pod initialization and ensures consistent environments across deployments. By creating custom templates, you can package everything your project needs into a reusable Docker image. Once built, you can deploy your workload in seconds instead of reinstalling dependencies every time you start a new Pod. You can also share your template with members of your team and the wider Runpod community. From 1e528629b8ba95a767df88cb4660d7c4d57f8404 Mon Sep 17 00:00:00 2001 From: Mo King Date: Mon, 24 Nov 2025 12:49:34 -0500 Subject: [PATCH 5/7] Update --- pods/templates/create-custom-template.mdx | 134 +++++++++++----------- 1 file changed, 68 insertions(+), 66 deletions(-) diff --git a/pods/templates/create-custom-template.mdx b/pods/templates/create-custom-template.mdx index 4c1e23be..b2113a74 100644 --- a/pods/templates/create-custom-template.mdx +++ b/pods/templates/create-custom-template.mdx @@ -155,12 +155,13 @@ This option starts Jupyter/SSH in the background, then runs your application. Yo This runs only your application with minimal overhead, which is ideal for production deployments where you don't need interactive access. -### Option 1: Keep base image services (no changes needed) +### Option 1: Keep all base image services (no changes needed) If you want the default behavior with Jupyter and SSH services, you don't need to modify the Dockerfile. The base image's `/start.sh` script handles everything automatically. This is already configured in the Dockerfile from Step 2. -### Option 2: Configure custom application with services +### Option 2: Automatically run the application after services start + If you want to run your application alongside Jupyter/SSH services, add these lines to the end of your Dockerfile: ```dockerfile Dockerfile @@ -208,9 +209,54 @@ This overrides the base image entrypoint and runs only your Python application. --- -For this tutorial, we'll use Option 1 (keeping the base image services) for ease of testing and development. +For this tutorial, we'll use option 1 (default behavior for the base image services) so we can test out the various connection options. 
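If you're ever unsure which startup behavior a built image will actually use, you can inspect the entrypoint and command baked into it. A quick diagnostic (shown with this tutorial's image name; substitute your own tag):

```bash
# Print the ENTRYPOINT and CMD recorded in the image metadata
docker inspect --format 'ENTRYPOINT: {{.Config.Entrypoint}} | CMD: {{.Config.Cmd}}' my-custom-template:latest
```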
+ +## Step 5: Pre-load a model into your template + +Pre-loading models into your Docker image means that you won't need to re-download a model every time you start up a new Pod, enabling you to create easily reusable and shareable environments for ML inference. + +There are two ways to pre-load models: + +- **Option 1: Automatic download from Hugging Face (recommended)**: This is the simplest approach. During the Docker build, Python downloads and caches the model using the transformers library. + +- **Option 2: Manual download with wget**: This gives you explicit control and works with custom or hosted models. + +For this tutorial, we'll use Option 1 (automatic download from Hugging Face) for ease of setup and testing, but you can use Option 2 if you need more control. + +### Option 1: Pre-load models from Hugging Face +Add these lines to your Dockerfile before the `COPY . /app` line: + +```dockerfile Dockerfile +# Set Hugging Face cache directory +ENV HF_HOME=/app/models +ENV HF_HUB_ENABLE_HF_TRANSFER=0 + +# Pre-download model during build +RUN python -c "from transformers import pipeline; pipeline('sentiment-analysis', model='distilbert-base-uncased-finetuned-sst-2-english')" +``` + +During the build, Python will download the model and cache it in `/app/models`. When you deploy Pods with this template, the model loads instantly from the cache. + +### Option 2: Pre-load models with wget +For more control or to use models from custom sources, you can manually download model files during the build. + +Add these lines to your Dockerfile before the `COPY . /app` line: + +```dockerfile Dockerfile +# Create model directory and download files +RUN mkdir -p /app/models/distilbert-model && \ + cd /app/models/distilbert-model && \ + wget -q https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english/resolve/main/config.json && \ + wget -q https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english/resolve/main/model.safetensors && \ + wget -q https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english/resolve/main/tokenizer_config.json && \ + wget -q https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english/resolve/main/vocab.txt +``` + +--- + +For this tutorial, we'll use option 1 (automatic download from Hugging Face). -## Step 5: Create your application +## Step 6: Create your application Next we'll create the Python application that will run in your Pod. Open `main.py` and add your application code. 
@@ -241,11 +287,25 @@ def main(): print("\nLoading sentiment analysis model...") device = 0 if torch.cuda.is_available() else -1 + # MODEL LOADING OPTIONS: + + # OPTION 1: From Hugging Face Hub cache (default) + # Bakes the model into the container image using transformers pipeline + # Behavior: Loads model from the cache, requires local_files_only=True classifier = pipeline( "sentiment-analysis", model="distilbert-base-uncased-finetuned-sst-2-english", - device=device + device=device, + model_kwargs={"local_files_only": True}, ) + + # OPTION 2: From a local directory + # Download the model files using wget, loads them from the local directory + # Behavior: Loads directly from /app/models/distilbert-model + # To use: Uncomment the pipeline object below, comment OPTION 1 above + # classifier = pipeline('sentiment-analysis', + # model='/app/models/distilbert-model', + # device=device) print("Model loaded successfully!") @@ -282,67 +342,9 @@ if __name__ == "__main__": main() ``` -## Step 6: Pre-load a model into your template - -Pre-loading models into your Docker image means that you won't need to re-download a model every time you start up a new Pod, enabling you to create easily reusable and shareable environments for ML inference. - -There are two ways to pre-load models: - -- **Option 1: Automatic download from Hugging Face (recommended)**: This is the simplest approach. During the Docker build, Python downloads and caches the model using the transformers library. - -- **Option 2: Manual download with wget**: This gives you explicit control and works with custom or hosted models. - -For this tutorial, we'll use Option 1 (automatic download from Hugging Face) for ease of setup and testing, but you can use Option 2 if you need more control. - -### Option 1: Pre-load models from Hugging Face -Add these lines to your Dockerfile before the `COPY . /app` line: - -```dockerfile Dockerfile -# Set Hugging Face cache directory -ENV HF_HOME=/app/models -ENV HF_HUB_ENABLE_HF_TRANSFER=0 - -# Pre-download model during build -RUN python -c "from transformers import pipeline; pipeline('sentiment-analysis', model='distilbert-base-uncased-finetuned-sst-2-english')" -``` - -During the build, Python will download the model and cache it in `/app/models`. When you deploy Pods with this template, the model loads instantly from the cache. - -To use the cached model in your application, add `local_files_only=True` when loading: - -```python -classifier = pipeline( - "sentiment-analysis", - model="distilbert-base-uncased-finetuned-sst-2-english", - device=device, - model_kwargs={"local_files_only": True} -) -``` - -### Option 2: Pre-load models with wget -For more control or to use models from custom sources, you can manually download model files during the build. - -Add these lines to your Dockerfile before the `COPY . 
/app` line: - -```dockerfile Dockerfile -# Create model directory and download files -RUN mkdir -p /app/models/distilbert-model && \ - cd /app/models/distilbert-model && \ - wget -q https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english/resolve/main/config.json && \ - wget -q https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english/resolve/main/model.safetensors && \ - wget -q https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english/resolve/main/tokenizer_config.json && \ - wget -q https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english/resolve/main/vocab.txt -``` - -Then load the model from the local directory in your application: - -```python -classifier = pipeline( - 'sentiment-analysis', - model='/app/models/distilbert-model', - device=device -) -``` + +If you're pre-load a model with `wget` (option 2 from step 5), you need to uncomment the `classifier = pipeline()` object in `main.py` and comment out the `classifier = pipeline()` object for option 1. + ## Step 7: Build and test your Docker image From 07d091495344e0273a8ec4ff6a81f6f3d29c360b Mon Sep 17 00:00:00 2001 From: Mo King Date: Mon, 24 Nov 2025 13:20:13 -0500 Subject: [PATCH 6/7] Update --- pods/templates/create-custom-template.mdx | 28 +++++++++++++++++------ 1 file changed, 21 insertions(+), 7 deletions(-) diff --git a/pods/templates/create-custom-template.mdx b/pods/templates/create-custom-template.mdx index b2113a74..870e0682 100644 --- a/pods/templates/create-custom-template.mdx +++ b/pods/templates/create-custom-template.mdx @@ -17,13 +17,13 @@ By creating custom templates, you can package everything your project needs into In this tutorial, you'll learn how to: -- Set up a project to build a custom Pod template. - Create a Dockerfile that extends a Runpod base image. -- Configure container startup options (JupyterLab/SSH, custom applications, or application-only mode). +- Configure container startup options (JupyterLab/SSH, application + services, or application only). - Add Python dependencies and system packages. -- Pre-load machine learning models from Hugging Face or local files. -- Build, test, and push your template to Docker Hub. -- Deploy Pods using your custom template. +- Pre-load machine learning models from Hugging Face, local files, or custom sources. +- Build and test your image, then push it to Docker Hub. +- Create a custom Pod template in the Runpod console +- Deploy a Pod using your custom template. ## Requirements @@ -481,7 +481,7 @@ Go to the [Pods page](https://console.runpod.io/pods) in the Runpod console and Configure your Pod with these settings: - - **GPU**: Select any GPU type with at least 16 GB of VRAM. + - **GPU**: The Distilbert model used in this tutorial is very small, so you can **select any available GPU**. If you're using a different model, you'll need to [select a GPU](/pods/choose-a-pod) that matches its requirements. - **Pod Template**: Click **Change Template**. You should see your custom template ("my-custom-template") in the list. Click it to select it. Leave all other settings on their defaults and click **Deploy On-Demand**. @@ -492,11 +492,25 @@ Your Pod will start with all your pre-installed dependencies and models. The fir Once your Pod is running, click on your Pod to open the connection options panel. -Try one of the following connection options: +Try one or more connection options: + - **Web Terminal**: Click **Enable Web Terminal** and then **Open Web Terminal** to access it. 
- **JupyterLab**: It may take a few minutes for JupyterLab to start. Once it's labeled as **Ready**, click the **JupyterLab** link to access it. - **SSH**: Copy the SSH command and run it in your local terminal to access it. (See [Connect to a Pod with SSH](/pods/configuration/use-ssh) for details on how to use SSH.) + + +After you've connected, try running the sample application (or any custom code you added): + +```bash +python main.py +``` + +You should see output from the application in your terminal, including the model loading and inference results. + + +To avoid incurring unnecessary charges, make sure to stop and then terminate your Pod when you're finished. (See [Manage Pods](/pods/manage-pods) for detailed instructions.) + ## Next steps From 683217d1edcacae981eacd0d92c9e1cef57c3faa Mon Sep 17 00:00:00 2001 From: Mo King Date: Tue, 25 Nov 2025 14:45:28 -0500 Subject: [PATCH 7/7] Update --- pods/templates/create-custom-template.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/pods/templates/create-custom-template.mdx b/pods/templates/create-custom-template.mdx index 870e0682..4276f31e 100644 --- a/pods/templates/create-custom-template.mdx +++ b/pods/templates/create-custom-template.mdx @@ -343,7 +343,7 @@ if __name__ == "__main__": ``` -If you're pre-load a model with `wget` (option 2 from step 5), you need to uncomment the `classifier = pipeline()` object in `main.py` and comment out the `classifier = pipeline()` object for option 1. +If you're pre-loading a model with `wget` (option 2 from step 5), make sure to uncomment the `classifier = pipeline()` object in `main.py` and comment out the `classifier = pipeline()` object for option 1. ## Step 7: Build and test your Docker image