serverless/endpoints/model-caching.mdx (38 additions, 0 deletions)
@@ -59,6 +59,44 @@ flowchart TD
```
</div>

## Where models are stored

Cached models are stored on the worker container's local disk, separate from any attached network volumes. Runpod automatically manages this internal storage to optimize loading speed.

The cache persists across requests on the same worker, so once a worker has initialized, you'll see consistent performance. Because the models live on the worker's local disk rather than on a network volume, they won't appear in your attached network volumes.

## Accessing cached models

Cached models are stored at `/runpod-volume/huggingface-cache/hub/`. The directory structure follows Hugging Face cache conventions, where forward slashes (`/`) in the model name are replaced with double dashes (`--`).
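
As an illustration of that naming convention, here is a minimal sketch of how a model ID could map to a folder in the cache. The model ID is a placeholder, and the `models--` prefix is assumed from the standard Hugging Face hub cache layout:

```python
# Illustration only: map a Hugging Face model ID to the folder name used in
# the cache, following the convention described above. The "models--" prefix
# is assumed from the standard Hugging Face hub cache layout.
CACHE_ROOT = "/runpod-volume/huggingface-cache/hub"

def cached_model_dir(model_id: str) -> str:
    return f"{CACHE_ROOT}/models--{model_id.replace('/', '--')}"

print(cached_model_dir("stabilityai/stable-diffusion-xl-base-1.0"))
# -> /runpod-volume/huggingface-cache/hub/models--stabilityai--stable-diffusion-xl-base-1.0
```
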
You can access cached models in your application in two ways:

**Direct configuration**: Configure your application to load models directly from `/runpod-volume/huggingface-cache/hub/`. Many frameworks and tools let you specify a custom cache directory for Hugging Face models.
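
For example, a `transformers`-based application could point its loader at the cache directory. This is a minimal sketch; the model ID is a placeholder, and `transformers` is just one library that accepts a custom cache directory:

```python
from transformers import AutoModel, AutoTokenizer

CACHE_DIR = "/runpod-volume/huggingface-cache/hub"
MODEL_ID = "your-org/your-model"  # placeholder: use the model you configured for caching

# cache_dir points the loader at the worker's model cache, and
# local_files_only avoids a network download when the files are already cached.
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, cache_dir=CACHE_DIR, local_files_only=True)
model = AutoModel.from_pretrained(MODEL_ID, cache_dir=CACHE_DIR, local_files_only=True)
```

Many Hugging Face libraries also respect the `HF_HUB_CACHE` environment variable, which you can set to the same path instead of passing `cache_dir` on every call.
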
**Symbolic links**: Create symbolic links from your application's expected model directory to the cache location. This is particularly useful for applications like ComfyUI that expect models in specific directories.
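
A hypothetical sketch of that approach is shown below. Both paths are illustrative (the cached model folder and the ComfyUI checkpoint directory depend on your endpoint and image layout):

```python
import os

# Illustrative paths: a cached model folder and the directory where the
# application expects to find it. Adjust both to your setup.
cached_model = "/runpod-volume/huggingface-cache/hub/models--stabilityai--stable-diffusion-xl-base-1.0"
app_model_dir = "/comfyui/models/checkpoints/sdxl-base"

os.makedirs(os.path.dirname(app_model_dir), exist_ok=True)

# Link the application's expected path to the cached copy instead of
# duplicating the model files on disk.
if not os.path.exists(app_model_dir):
    os.symlink(cached_model, app_model_dir)
```
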