Open
Labels
area/security, enhancement, kind/feature, priority/high
Description
Problem
LLMKube currently requires internet access to download models from HuggingFace URLs. This blocks adoption in:
- Air-gapped environments - Government, healthcare, finance, defense
- Private networks - Corporate environments with restricted egress
- Edge deployments - Remote locations with limited connectivity
- Compliance scenarios - Data sovereignty requirements
Proposed Solution
Support multiple model source types beyond HTTP URLs:
1. Local File Path
apiVersion: inference.llmkube.dev/v1alpha1
kind: Model
metadata:
  name: my-model
spec:
  source:
    type: local
    path: /mnt/models/llama-3.1-8b.gguf
2. PVC Reference
spec:
  source:
    type: pvc
    claimName: model-storage
    path: models/llama-3.1-8b.gguf
3. S3-Compatible Storage (MinIO, etc.)
spec:
  source:
    type: s3
    bucket: llm-models
    key: llama-3.1-8b.gguf
    endpoint: http://minio.internal:9000
    secretRef:
      name: s3-credentials
4. Private HTTP Server
spec:
  source:
    type: http
    url: http://model-server.internal/models/llama-3.1-8b.gguf
    # Optional auth
    secretRef:
      name: http-credentials
5. OCI Registry (Harbor, etc.)
spec:
  source:
    type: oci
    image: harbor.internal/llm-models/llama-3.1-8b:v1
CLI Changes
# Deploy from local path
llmkube deploy my-model --source /mnt/models/llama.gguf --gpu
# Deploy from S3
llmkube deploy my-model --source s3://bucket/model.gguf --gpu
# Catalog support for local sources
llmkube deploy llama-3.1-8b --gpu --source-override /mnt/models/llama.gguf
# Pre-populate cache from local file
llmkube cache import /mnt/models/llama.gguf --as llama-3.1-8b
Implementation Phases
Phase 1: Local Path Support
- Support file:// and absolute paths in source field
- Mount hostPath or PVC in inference pods
- Update controller to skip download for local sources
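As a rough illustration of the Phase 1 logic, the sketch below classifies a source string and decides whether a download step is needed. It is written in Python for brevity and is not LLMKube's actual implementation: `classify_source` and `needs_download` are hypothetical names, and treating only file:// URIs and absolute paths as local is an assumption based on the bullets above.

```python
from urllib.parse import urlparse


def classify_source(source: str) -> dict:
    """Map a source string to a source spec dict (hypothetical helper)."""
    # Absolute filesystem paths are treated as local sources.
    if source.startswith("/"):
        return {"type": "local", "path": source}

    parsed = urlparse(source)
    if parsed.scheme == "file":
        # file:///mnt/models/x.gguf -> /mnt/models/x.gguf
        return {"type": "local", "path": parsed.path}
    if parsed.scheme == "s3":
        # s3://bucket/key -> bucket name plus object key
        return {
            "type": "s3",
            "bucket": parsed.netloc,
            "key": parsed.path.lstrip("/"),
        }
    if parsed.scheme in ("http", "https"):
        return {"type": "http", "url": source}
    raise ValueError(f"unsupported model source: {source!r}")


def needs_download(source: str) -> bool:
    """Local sources are assumed to be mounted, so nothing is downloaded."""
    return classify_source(source)["type"] != "local"
```

With this shape, `--source /mnt/models/llama.gguf` and `--source s3://bucket/model.gguf` from the CLI examples would resolve to the `local` and `s3` source types respectively, and only the latter would trigger a download.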
Phase 2: PVC and S3 Support
- Add source type field to Model CRD
- Implement S3 download with credentials
- Support PVC references
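One way to picture the new source type field is as a per-type validation of spec.source. The sketch below is illustrative only: `REQUIRED_FIELDS` and `validate_source` are hypothetical names, the field lists are taken from the example manifests in this issue, and treating `endpoint` and `secretRef` as optional is an assumption (the issue only marks the HTTP `secretRef` as optional).

```python
from __future__ import annotations

# Required fields per source type, as they appear in the example manifests.
# endpoint and secretRef are assumed optional and therefore not listed.
REQUIRED_FIELDS = {
    "local": ("path",),
    "pvc": ("claimName", "path"),
    "s3": ("bucket", "key"),
    "http": ("url",),
    "oci": ("image",),
}


def validate_source(source: dict) -> list[str]:
    """Return a list of validation errors; an empty list means valid."""
    source_type = source.get("type")
    if source_type not in REQUIRED_FIELDS:
        return [f"unknown source type: {source_type!r}"]
    # Report every missing or empty required field for this type.
    return [
        f"source type {source_type!r} requires field {field!r}"
        for field in REQUIRED_FIELDS[source_type]
        if not source.get(field)
    ]
```

In a real controller this kind of check would more likely live in the CRD schema or an admission webhook; the function form is only meant to make the per-type requirements explicit.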
Phase 3: OCI and Private Registry
- Pull models from OCI registries
- Support private registry authentication
- Model versioning via tags
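To make tag-based versioning concrete, the sketch below splits an image reference like the one in the OCI example into registry, repository, and tag. `split_image_ref` is a hypothetical helper, not an LLMKube API. It assumes the reference always starts with a registry host, falls back to `latest` when no tag is given (a convention borrowed from container tooling, not stated in this issue), and does not handle digest references.

```python
from __future__ import annotations


def split_image_ref(image: str) -> tuple[str, str, str]:
    """Split 'registry/repository[:tag]' into (registry, repository, tag)."""
    # The registry host (and any port) is split off first, so a port colon
    # is never mistaken for the tag separator.
    registry, _, remainder = image.partition("/")
    if not remainder:
        raise ValueError(f"expected registry/repository[:tag], got {image!r}")

    repository, sep, tag = remainder.rpartition(":")
    if not sep:
        # No tag present: assume 'latest' (an assumption, see above).
        repository, tag = remainder, "latest"
    return registry, repository, tag
```

For `harbor.internal/llm-models/llama-3.1-8b:v1`, the tag `v1` would then identify the model version to pull.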
Benefits
- Air-gapped deployments - No internet required
- Faster deployments - Local models skip the download step
- Security - Models stay within network boundary
- Compliance - Meet data residency requirements
- Cost - No HuggingFace bandwidth costs
Related
- Roadmap Q2 2026: "Air-gapped Mode - Private model registries, offline operation"
- Issue #52: Persistent model storage - avoid re-downloading models
- Issue #22: Private model registry with signing and verification
Success Criteria
- Deploy model from local file path
- Deploy model from PVC
- Deploy model from S3/MinIO
- CLI supports local source paths
- Documentation for air-gapped setup
- Example manifests for each source type