Commit 1840cbb

Add custom template tutorial

1 parent f31898c commit 1840cbb

2 files changed: +360 −0 lines

docs.json (1 addition, 0 deletions):

```diff
 "pages": [
   "pods/templates/overview",
   "pods/templates/manage-templates",
+  "pods/templates/create-custom-template",
   "pods/templates/environment-variables",
   "pods/templates/secrets"
 ]
```

New file (359 additions, 0 deletions):
---
title: "Create a custom Pod template"
sidebarTitle: "Create a custom template"
description: "Learn how to extend official Runpod templates to create your own Pod templates."
---

This tutorial shows how to create custom Pod templates by extending official Runpod base images with additional Python dependencies and pre-baked ML models. You'll learn the complete workflow, from Dockerfile creation to deployment and testing.

Custom templates allow you to package your specific dependencies, models, and configurations into reusable Docker images that can be deployed as Pods. This approach saves time during Pod initialization and ensures consistent environments across deployments.

## What you'll learn

In this tutorial, you'll learn how to:

- Create a Dockerfile that extends a Runpod base image.
- Add Python dependencies to an existing base image.
- Pre-package ML models into your custom template.
- Build and push Docker images with the correct platform settings.
- Deploy and test your custom template as a Pod.

## Requirements

Before you begin, you'll need:

- A [Runpod account](/get-started/manage-accounts).
- [Docker](https://www.docker.com/products/docker-desktop/) installed on your local machine.
- A [Docker Hub](https://hub.docker.com/) account for hosting your custom images.
- At least $5 in Runpod credits for testing.
- Basic familiarity with Docker and command-line operations.

## Step 1: Create a custom Dockerfile

First, create a Dockerfile that extends a Runpod base image with additional dependencies:

1. Create a new directory for your custom template:

   ```bash
   mkdir my-custom-template
   cd my-custom-template
   ```

2. Create a new Dockerfile:

   ```bash
   touch Dockerfile
   ```

3. Open the Dockerfile in your preferred text editor and add the following content:

   ```dockerfile
   # Use the specified base image
   FROM runpod/pytorch:2.8.0-py3.11-cuda12.8.1-cudnn-devel-ubuntu22.04

   # Install additional Python dependencies
   RUN pip install --no-cache-dir \
       transformers \
       accelerate

   # Don't specify a CMD here to use the base image's CMD
   ```

This extends the official Runpod PyTorch 2.8.0 base image and installs two additional Python packages. Because these packages are baked into the image, they're available as soon as the Pod starts, so you won't need to run `pip install` again after Pod restarts.

<Note>
When building custom templates, always start with a Runpod base image that matches your CUDA requirements. The base image includes essential components like the `/start.sh` script that handles Pod initialization.
</Note>

To maintain access to packaged services (like JupyterLab and SSH over TCP), avoid specifying a `CMD` or `ENTRYPOINT` in the Dockerfile. Runpod base images include a carefully configured startup script (`/start.sh`) that handles Pod initialization, SSH setup, and service startup; overriding it can break Pod functionality.

## Step 2: Add system dependencies

If your application requires system-level packages, install them before the Python dependencies:

```dockerfile
# Use the specified base image
FROM runpod/pytorch:2.8.0-py3.11-cuda12.8.1-cudnn-devel-ubuntu22.04

# Update package lists and install system dependencies
RUN apt-get update && apt-get install -y \
    git \
    wget \
    curl \
    && rm -rf /var/lib/apt/lists/*

# Install additional Python dependencies
RUN pip install --no-cache-dir \
    transformers \
    accelerate \
    datasets \
    torchaudio

# Don't specify a CMD here to use the base image's CMD
```

<Tip>
Always clean up package lists with `rm -rf /var/lib/apt/lists/*` after installing system packages to reduce image size.
</Tip>

## Step 3: Pre-bake ML models

To reduce Pod setup time, you can pre-download models while the Docker image is being built, so they're already cached when the Pod starts. Here are two approaches:

### Method 1: Simple model download script

Create a Python script that downloads your model:

1. Create a file named `download_model.py` in the same directory as your Dockerfile:

   ```python
   from transformers import AutoTokenizer, AutoModelForCausalLM

   # Download and cache the model
   model_name = "microsoft/DialoGPT-medium"
   print(f"Downloading {model_name}...")

   tokenizer = AutoTokenizer.from_pretrained(model_name)
   model = AutoModelForCausalLM.from_pretrained(model_name)

   print("Model downloaded and cached successfully!")
   ```

2. Update your Dockerfile to include and run this script:

   ```dockerfile
   # Use the specified base image
   FROM runpod/pytorch:2.8.0-py3.11-cuda12.8.1-cudnn-devel-ubuntu22.04

   # Install additional Python dependencies
   RUN pip install --no-cache-dir \
       transformers \
       accelerate

   # Copy and run the model download script
   COPY download_model.py /tmp/download_model.py
   RUN python /tmp/download_model.py && rm /tmp/download_model.py

   # Don't specify a CMD here to use the base image's CMD
   ```

### Method 2: Using the Hugging Face Hub library

For more control over model downloads, use the `huggingface_hub` library:

```dockerfile
# Use the specified base image
FROM runpod/pytorch:2.8.0-py3.11-cuda12.8.1-cudnn-devel-ubuntu22.04

# Install additional Python dependencies
RUN pip install --no-cache-dir \
    transformers \
    accelerate \
    huggingface_hub

# Pre-download specific model files
RUN python -c "from huggingface_hub import snapshot_download; snapshot_download('microsoft/DialoGPT-medium', cache_dir='/root/.cache/huggingface')"

# Don't specify a CMD here to use the base image's CMD
```

<Warning>
Pre-baking large models will significantly increase your Docker image size and build time. Consider whether the faster Pod startup time justifies the larger image size for your use case.
</Warning>
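Before committing to pre-baking, you can estimate how much a cached model will add to the image by measuring a local download. This is a minimal sketch, not part of the Runpod tooling; the `cache_size_mb` helper and the cache path are illustrative:

```python
from pathlib import Path

def cache_size_mb(cache_dir: str) -> float:
    """Sum the sizes of all files under cache_dir, in megabytes."""
    root = Path(cache_dir)
    if not root.exists():
        return 0.0
    return sum(f.stat().st_size for f in root.rglob("*") if f.is_file()) / 1e6

# Example: estimate how much the Hugging Face cache would add to the image
print(f"Cache size: {cache_size_mb('/root/.cache/huggingface'):.1f} MB")
```

Run it against the cache directory that `download_model.py` populated on your local machine to see the approximate size cost before you bake the model in.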

## Step 4: Build and push your Docker image

Now you're ready to build your custom image and push it to Docker Hub:

1. Build your Docker image with the correct platform specification:

   ```bash
   docker build --platform=linux/amd64 -t my-custom-template:latest .
   ```

   <Note>
   The `--platform=linux/amd64` flag is crucial for Runpod compatibility. Runpod's infrastructure requires AMD64 architecture images.
   </Note>

2. Tag your image for Docker Hub (replace `YOUR_USERNAME` with your Docker Hub username):

   ```bash
   docker tag my-custom-template:latest YOUR_USERNAME/my-custom-template:latest
   ```

3. Push the image to Docker Hub:

   ```bash
   docker push YOUR_USERNAME/my-custom-template:latest
   ```

<Tip>
If you haven't logged in to Docker Hub from the command line, run `docker login` first and enter your Docker Hub credentials.
</Tip>

## Step 5: Create a Pod template in Runpod

Next, create a Pod template using your custom Docker image:

1. Navigate to the [Templates page](https://console.runpod.io/user/templates) in the Runpod console.
2. Click **New Template**.
3. Configure your template with these settings:
   - **Name**: Give your template a descriptive name (e.g., "My Custom PyTorch Template").
   - **Container Image**: Enter your Docker Hub image name (e.g., `YOUR_USERNAME/my-custom-template:latest`).
   - **Container Disk**: Set to at least 20 GB to accommodate your custom dependencies.
   - **Volume Disk**: Set according to your storage needs (e.g., 20 GB).
   - **Volume Mount Path**: Keep the default `/workspace`.
   - **Expose HTTP Ports**: Add `8888` for JupyterLab access.
   - **Expose TCP Ports**: Add `22` if you need SSH access.
4. Click **Save Template**.

## Step 6: Deploy and test your custom template

Now deploy a Pod using your custom template to verify that everything works correctly:

1. Go to the [Pods page](https://console.runpod.io/pods) in the Runpod console.
2. Click **Deploy**.
3. Choose an appropriate GPU (make sure it meets the CUDA version requirements of your base image).
4. Click **Change Template** and select your custom template under **Your Pod Templates**.
5. Fill out the rest of the settings as desired, then click **Deploy On Demand**.
6. Wait for your Pod to initialize (this may take 5-10 minutes for the first deployment).

## Step 7: Verify your custom template

Once your Pod is running, verify that your customizations work correctly:

1. Find your Pod on the [Pods page](https://console.runpod.io/pods) and click it to open the connection menu, then click **Jupyter Lab** under **HTTP Services** to open JupyterLab.
2. Create a new Python notebook and test your pre-installed dependencies:

   ```python
   # Test that your custom packages are installed
   import transformers
   import accelerate

   print(f"Transformers version: {transformers.__version__}")
   print(f"Accelerate version: {accelerate.__version__}")

   # If you pre-baked a model, test loading it
   from transformers import AutoTokenizer, AutoModelForCausalLM

   model_name = "microsoft/DialoGPT-medium"
   tokenizer = AutoTokenizer.from_pretrained(model_name)
   model = AutoModelForCausalLM.from_pretrained(model_name)

   print("Model loaded successfully!")
   ```

3. Run the cell to confirm everything is working as expected.

## Advanced customization options

### Setting environment variables

You can set default environment variables in your template configuration:

1. In the template creation form, scroll to **Environment Variables**.
2. Add key-value pairs for any environment variables your application needs:
   - Key: `HUGGINGFACE_HUB_CACHE`
   - Value: `/workspace/hf_cache`
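Code running inside the Pod can then read this variable to decide where model files are cached. A minimal sketch, using the example key and value above (the fallback path is the Hugging Face default, not something Runpod sets):

```python
import os

# Use the template-provided cache location, falling back to the default
cache_dir = os.environ.get(
    "HUGGINGFACE_HUB_CACHE",
    os.path.expanduser("~/.cache/huggingface/hub"),
)
print(f"Models will be cached in: {cache_dir}")
```

Pointing the cache at `/workspace` keeps downloaded models on the volume disk, so they survive Pod restarts.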

### Adding startup scripts

To run custom initialization code when your Pod starts, create a startup script:

1. Create a `startup.sh` file in your project directory:

   ```bash
   #!/bin/bash
   echo "Running custom startup script..."
   mkdir -p /workspace/models
   echo "Custom startup complete!"
   ```

2. Add it to your Dockerfile:

   ```dockerfile
   # Copy the startup script
   COPY startup.sh /usr/local/bin/startup.sh
   RUN chmod +x /usr/local/bin/startup.sh

   # Modify the start script to run our custom startup
   RUN echo '/usr/local/bin/startup.sh' >> /start.sh
   ```

### Using multi-stage builds

For complex applications, use multi-stage builds to reduce the final image size:

```dockerfile
# Build stage
FROM runpod/pytorch:2.8.0-py3.11-cuda12.8.1-cudnn-devel-ubuntu22.04 AS builder

# Install build dependencies
RUN apt-get update && apt-get install -y build-essential
RUN pip install --no-cache-dir some-package-that-needs-compilation

# Final stage
FROM runpod/pytorch:2.8.0-py3.11-cuda12.8.1-cudnn-devel-ubuntu22.04

# Copy only the necessary files from the builder
COPY --from=builder /usr/local/lib/python3.11/site-packages /usr/local/lib/python3.11/site-packages

# Install runtime dependencies
RUN pip install --no-cache-dir transformers accelerate
```

## Troubleshooting

Here are solutions to common issues when creating custom templates:

### Build failures

- **Platform mismatch**: Always use `--platform=linux/amd64` when building.
- **Base image not found**: Verify that the base image tag exists on Docker Hub.
- **Package installation fails**: Check that package names are correct and available for the Python version in your base image.

### Pod deployment issues

- **Pod fails to start**: Check the Pod logs in the Runpod console for error messages.
- **Services not accessible**: Ensure you've exposed the correct ports in your template configuration.
- **CUDA version mismatch**: Make sure your base image's CUDA version is compatible with your chosen GPU.

### Performance issues

- **Slow startup**: Consider pre-baking more dependencies or using a smaller base image.
- **Out of memory**: Increase container disk size or choose a GPU with more VRAM.
- **Model loading errors**: Verify that pre-baked models are in the expected cache directories.
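To check the last point from inside a running Pod, you can list which models are present in the Hugging Face cache. This is a sketch that assumes the default `/root/.cache/huggingface/hub` layout, where each model lives in a `models--<org>--<name>` directory; the `list_cached_models` helper is illustrative:

```python
from pathlib import Path

def list_cached_models(cache_dir: str = "/root/.cache/huggingface/hub") -> list[str]:
    """Return repo IDs of models found in a Hugging Face cache directory."""
    root = Path(cache_dir)
    if not root.exists():
        return []
    # Cache folders are named like "models--microsoft--DialoGPT-medium"
    return [
        p.name.removeprefix("models--").replace("--", "/")
        for p in root.iterdir()
        if p.is_dir() and p.name.startswith("models--")
    ]

print(list_cached_models())
```

If a model you pre-baked doesn't appear here, it was likely downloaded to a different `cache_dir` than the one your code reads at runtime.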

## Best practices

Follow these best practices when creating custom templates:

### Image optimization

- Use a `.dockerignore` file to exclude unnecessary files from your build context.
- Combine `RUN` commands to reduce image layers.
- Clean up package caches and temporary files.
- Use specific version tags for dependencies to ensure reproducibility.
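As an illustration of the first point, a minimal `.dockerignore` for a project like this might look like the following (the entries are examples; adjust them to your own build context):

```
# Version control and editor files
.git/
.vscode/

# Local Python artifacts
__pycache__/
*.pyc
.venv/

# Local files that shouldn't ship in the image
README.md
*.log
```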

### Security considerations

- Don't include sensitive information like API keys in your Docker image.
- Use [Runpod Secrets](/pods/templates/secrets) for sensitive configuration.
- Regularly update base images to get security patches.

### Version management

- Tag your images with version numbers (e.g., `v1.0.0`) instead of just `latest`.
- Keep a changelog of what changes between versions.
- Test new versions thoroughly before updating production templates.

## Next steps

Now that you have a working custom template, consider these next steps:

- **Automate builds**: Set up GitHub Actions or similar CI/CD to automatically build and push new versions of your template.
- **Share with your team**: If you're using a team account, share your template with team members.
- **Create variations**: Build specialized versions of your template for different use cases (development vs. production).
- **Monitor usage**: Track how your custom templates perform in production and optimize accordingly.

For more advanced template management, see the [Template Management API documentation](/api-reference/templates/POST/templates) to programmatically create and update templates.
