
Commit 1391c58

Add storage location and file path details to model caching documentation
1 parent 0e6a6e2 commit 1391c58

File tree

1 file changed: +38 -0 lines changed

serverless/endpoints/model-caching.mdx

Lines changed: 38 additions & 0 deletions
@@ -59,6 +59,44 @@ flowchart TD
```
</div>

## Where models are stored

Cached models are stored on the worker container's local disk, separate from any attached network volumes. Runpod automatically manages this internal storage to optimize loading speed.

The cache persists across requests on the same worker, so once a worker initializes, you'll see consistent performance. Since the models live on local disk rather than network volumes, they won't appear on your attached network volumes.

## Accessing cached models
Cached models are stored at `/runpod-volume/huggingface-cache/hub/`. The directory structure follows Hugging Face cache conventions: the forward slash (`/`) in the model name is replaced with a double dash (`--`), and the directory name is prefixed with `models--`.

The path structure follows this pattern:

```
/runpod-volume/huggingface-cache/hub/models--{organization}--{model-name}/
```
For example, `meta-llama/Llama-3.2-1B-Instruct` would be stored at:

```
/runpod-volume/huggingface-cache/hub/models--meta-llama--Llama-3.2-1B-Instruct/
```
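To see what's actually cached on a worker, you can list this directory from a terminal or from your worker's startup code. A minimal sketch (the directory is only populated on workers that have a cached model enabled):

```bash
# List the cached model directories on this worker
ls /runpod-volume/huggingface-cache/hub/
# With the example model above cached, the listing includes:
# models--meta-llama--Llama-3.2-1B-Instruct
```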
## Using cached models in applications

You can access cached models in your application in two ways:

**Direct configuration**: Configure your application to load models directly from `/runpod-volume/huggingface-cache/hub/`. Many frameworks and tools let you specify a custom cache directory for Hugging Face models.
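
For example, if your handler loads models through the Hugging Face libraries, you can point their hub cache at this directory before the model loads. A minimal sketch, assuming a `transformers`- or `diffusers`-based worker (the `HF_HUB_CACHE` variable is standard Hugging Face tooling, not Runpod-specific):

```bash
# Point the Hugging Face hub cache at the worker's cached-model directory
# so from_pretrained() resolves models from the cache instead of downloading them.
export HF_HUB_CACHE=/runpod-volume/huggingface-cache/hub
```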
**Symbolic links**: Create symbolic links from your application's expected model directory to the cache location. This is particularly useful for applications like ComfyUI that expect models in specific directories.

For example, create a symbolic link like this:

```bash
ln -s /runpod-volume/huggingface-cache/hub/models--meta-llama--Llama-3.2-1B-Instruct/ /workspace/models/llama-3.2
```
This lets your application access cached models without modifying its configuration.
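
If you want to confirm the link resolves correctly before your application starts, a quick check works (using the example paths above):

```bash
# Show the symlink and its target; the target should be the cached model directory
ls -l /workspace/models/llama-3.2
```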
## Enabling cached models

Follow these steps to select and add a cached model to your Serverless endpoint:
