serverless/endpoints/model-caching.mdx
Lines changed: 10 additions & 20 deletions
@@ -61,41 +61,31 @@ flowchart TD
## Where models are stored
-Cached models are stored on the worker container's local disk, separate from any attached network volumes. Runpod automatically manages this internal storage to optimize loading speed.
+Cached models are stored in a Runpod-managed Docker volume and mounted at `/runpod-volume/huggingface-cache/hub/`. This creates a "blended view" where you can see both your network volume contents and cached models under the same `/runpod-volume/` path.
-The cache persists across requests on the same worker, so once a worker initializes, you'll see consistent performance. Since the models live on local disk rather than network volumes, they won't appear on your attached network volumes.
+The model cache loads significantly faster than network volumes, reducing cold start times. The cache is automatically managed and persists across requests on the same worker. You'll see cached models overlaid onto your network volume mount point.
-## Accessing cached models
+## Accessing cached models in your application
-Cached models are stored at `/runpod-volume/huggingface-cache/hub/`. The directory structure follows Hugging Face cache conventions, where forward slashes (`/`) in the model name are replaced with double dashes (`--`).
+Runpod caches models at `/runpod-volume/huggingface-cache/hub/` following Hugging Face cache conventions. The directory structure replaces forward slashes (`/`) from the original model name with double dashes (`--`), and includes a version hash subdirectory.
You can access cached models in your application two ways:
+The version hash in the path currently prevents direct integration with some applications (like ComfyUI worker) that expect to predict paths based solely on model name. We're working on removing the version hash requirement.
-**Direct configuration**: Configure your application to load models directly from `/runpod-volume/huggingface-cache/hub/`. Many frameworks and tools let you specify a custom cache directory for Hugging Face models.
-
-**Symbolic links**: Create symbolic links from your application's expected model directory to the cache location. This is particularly useful for applications like ComfyUI that expect models in specific directories.
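
Because the cached path includes a version hash that cannot be predicted from the model name alone, an application may need to discover it at runtime. The following is a minimal sketch of that lookup, not Runpod's own implementation. It assumes the standard Hugging Face hub cache layout (`models--<org>--<name>/snapshots/<hash>/`), which the changed text only describes in general terms; the helper names `cache_dir_name` and `find_snapshot` are hypothetical.

```python
from pathlib import Path
from typing import Optional

# Cache root as stated in the documentation change above.
CACHE_ROOT = Path("/runpod-volume/huggingface-cache/hub")


def cache_dir_name(model_id: str) -> str:
    """Map a model name like 'org/name' to its cache folder name.

    Assumption: the standard Hugging Face layout prefixes the folder
    with 'models--' and replaces '/' with '--'.
    """
    return "models--" + model_id.replace("/", "--")


def find_snapshot(model_id: str, cache_root: Path = CACHE_ROOT) -> Optional[Path]:
    """Return a version-hash directory for model_id, or None if not cached.

    Assumption: version-hash directories live under 'snapshots/'. If more
    than one exists, the most recently modified one is returned.
    """
    snapshots = cache_root / cache_dir_name(model_id) / "snapshots"
    if not snapshots.is_dir():
        return None
    candidates = [p for p in snapshots.iterdir() if p.is_dir()]
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)
```

The returned path can be passed to an application as its model directory, or used as the target of a symbolic link from the directory the application expects. Returning `None` rather than raising lets a handler fall back to downloading the model when the cache is empty.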