Air-gapped deployment: Support local/private model sources #53

## Problem

LLMKube currently requires internet access to download models from HuggingFace URLs. This blocks adoption in:

- **Air-gapped environments** - Government, healthcare, finance, defense
- **Private networks** - Corporate environments with restricted egress
- **Edge deployments** - Remote locations with limited connectivity
- **Compliance scenarios** - Data sovereignty requirements

## Proposed Solution

Support multiple model source types beyond public HTTP URLs:

### 1. Local File Path
```yaml
apiVersion: inference.llmkube.dev/v1alpha1
kind: Model
metadata:
  name: my-model
spec:
  source:
    type: local
    path: /mnt/models/llama-3.1-8b.gguf
```

### 2. PVC Reference
```yaml
spec:
  source:
    type: pvc
    claimName: model-storage
    path: models/llama-3.1-8b.gguf
```
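Because `path` in a PVC source is relative to the claim, the controller would need to join it with the mount point and make sure it cannot escape the volume. A minimal sketch of that check follows; the helper name `ResolvePVCPath` and the `/models` mount point are assumptions for illustration, not existing LLMKube code.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

// ResolvePVCPath joins the mount point of a claim with the relative `path`
// from the Model spec and rejects values that would leave the mount.
// Hypothetical helper: the name and behavior are assumptions of this sketch.
func ResolvePVCPath(mountPoint, relPath string) (string, error) {
	if relPath == "" {
		return "", errors.New("path must not be empty")
	}
	if filepath.IsAbs(relPath) {
		return "", fmt.Errorf("path %q must be relative to the claim root", relPath)
	}
	// Join cleans the result, so any ".." segments are resolved here.
	full := filepath.Join(mountPoint, relPath)
	rel, err := filepath.Rel(mountPoint, full)
	if err != nil || rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", fmt.Errorf("path %q escapes the claim root", relPath)
	}
	return full, nil
}

func main() {
	// "/models" is an assumed mount point inside the inference pod.
	p, err := ResolvePVCPath("/models", "models/llama-3.1-8b.gguf")
	fmt.Println(p, err)

	_, err = ResolvePVCPath("/models", "../etc/passwd")
	fmt.Println(err)
}
```

Rejecting absolute and `..` paths up front keeps a Model manifest from pointing the inference pod at files outside the referenced claim.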

### 3. S3-Compatible Storage (MinIO, etc.)
```yaml
spec:
  source:
    type: s3
    bucket: llm-models
    key: llama-3.1-8b.gguf
    endpoint: http://minio.internal:9000
    secretRef:
      name: s3-credentials
```

### 4. Private HTTP Server
```yaml
spec:
  source:
    type: http
    url: http://model-server.internal/models/llama-3.1-8b.gguf
    # Optional auth
    secretRef:
      name: http-credentials
```

### 5. OCI Registry (Harbor, etc.)
```yaml
spec:
  source:
    type: oci
    image: harbor.internal/llm-models/llama-3.1-8b:v1
```
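Each source type above needs a different set of fields, so the CRD would need per-type validation. The following is a rough sketch of what that could look like; the `ModelSource` Go type, its field names, and the choice to treat `endpoint` and `secretRef` as optional are assumptions drawn from the manifests, not the project's actual API types.

```go
package main

import "fmt"

// ModelSource mirrors the fields used in the example manifests.
// Illustrative only: this is not LLMKube's real CRD type.
type ModelSource struct {
	Type      string // "local", "pvc", "s3", "http", or "oci"
	Path      string
	ClaimName string
	Bucket    string
	Key       string
	Endpoint  string // assumed optional (needed for MinIO-style endpoints)
	URL       string
	Image     string
	SecretRef string // assumed optional: name of a Secret with credentials
}

type field struct{ name, value string }

// Validate checks that the fields required by the chosen type are set.
func (s ModelSource) Validate() error {
	required := map[string][]field{
		"local": {{"path", s.Path}},
		"pvc":   {{"claimName", s.ClaimName}, {"path", s.Path}},
		"s3":    {{"bucket", s.Bucket}, {"key", s.Key}},
		"http":  {{"url", s.URL}},
		"oci":   {{"image", s.Image}},
	}
	fields, ok := required[s.Type]
	if !ok {
		return fmt.Errorf("unknown source type %q", s.Type)
	}
	for _, f := range fields {
		if f.value == "" {
			return fmt.Errorf("source type %q requires field %q", s.Type, f.name)
		}
	}
	return nil
}

func main() {
	ok := ModelSource{Type: "s3", Bucket: "llm-models", Key: "llama-3.1-8b.gguf"}
	fmt.Println(ok.Validate())

	bad := ModelSource{Type: "pvc", Path: "models/llama-3.1-8b.gguf"}
	fmt.Println(bad.Validate())
}
```

In a real operator the same rules would more likely live in CRD schema validation or an admission webhook; the function form is only used here to keep the example self-contained.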

## CLI Changes

```bash
# Deploy from local path
llmkube deploy my-model --source /mnt/models/llama.gguf --gpu

# Deploy from S3
llmkube deploy my-model --source s3://bucket/model.gguf --gpu

# Deploy a catalog model, overriding its source with a local file
llmkube deploy llama-3.1-8b --gpu --source-override /mnt/models/llama.gguf

# Pre-populate cache from local file
llmkube cache import /mnt/models/llama.gguf --as llama-3.1-8b
```
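The `--source` flag above takes a single string, so the CLI would have to infer the source type from it. One possible approach is sketched below, assuming that bare absolute paths and `file://` URIs map to `local`, `s3://bucket/key` maps to `s3`, and `http(s)://` maps to `http`. The `oci://` prefix, the `ParsedSource` type, and the `ParseSource` function are hypothetical and do not appear in the proposal.

```go
package main

import (
	"fmt"
	"net/url"
	"path/filepath"
	"strings"
)

// ParsedSource is a hypothetical result type; only the fields relevant to
// the detected Type are filled in.
type ParsedSource struct {
	Type   string // "local", "s3", "http", or "oci"
	Path   string
	Bucket string
	Key    string
	URL    string
	Image  string
}

// ParseSource classifies a --source value by its form or scheme.
func ParseSource(raw string) (ParsedSource, error) {
	// A bare absolute path such as /mnt/models/llama.gguf is a local source.
	if filepath.IsAbs(raw) {
		return ParsedSource{Type: "local", Path: filepath.Clean(raw)}, nil
	}
	// Assumption of this sketch: OCI images are written as oci://<image>.
	if strings.HasPrefix(raw, "oci://") {
		image := strings.TrimPrefix(raw, "oci://")
		if image == "" {
			return ParsedSource{}, fmt.Errorf("oci source %q has no image", raw)
		}
		return ParsedSource{Type: "oci", Image: image}, nil
	}
	u, err := url.Parse(raw)
	if err != nil {
		return ParsedSource{}, fmt.Errorf("invalid source %q: %w", raw, err)
	}
	switch u.Scheme {
	case "file":
		if u.Host != "" || !filepath.IsAbs(u.Path) {
			return ParsedSource{}, fmt.Errorf("file source %q must look like file:///abs/path", raw)
		}
		return ParsedSource{Type: "local", Path: filepath.Clean(u.Path)}, nil
	case "s3":
		key := strings.TrimPrefix(u.Path, "/")
		if u.Host == "" || key == "" {
			return ParsedSource{}, fmt.Errorf("s3 source %q must look like s3://bucket/key", raw)
		}
		return ParsedSource{Type: "s3", Bucket: u.Host, Key: key}, nil
	case "http", "https":
		if u.Host == "" {
			return ParsedSource{}, fmt.Errorf("http source %q has no host", raw)
		}
		return ParsedSource{Type: "http", URL: raw}, nil
	default:
		return ParsedSource{}, fmt.Errorf("unsupported source %q", raw)
	}
}

func main() {
	for _, s := range []string{
		"/mnt/models/llama.gguf",
		"s3://bucket/model.gguf",
		"models/llama.gguf", // relative path: rejected in this sketch
	} {
		parsed, err := ParseSource(s)
		fmt.Printf("%q -> %+v, err=%v\n", s, parsed, err)
	}
}
```

Relative paths are rejected on purpose here, since it would be ambiguous whether they refer to the machine running the CLI or to the node running the pod.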

## Implementation Phases

### Phase 1: Local Path Support
- Support `file://` and absolute paths in source field
- Mount hostPath or PVC in inference pods
- Update controller to skip download for local sources

### Phase 2: PVC and S3 Support
- Add source type field to Model CRD
- Implement S3 download with credentials
- Support PVC references

### Phase 3: OCI and Private Registry
- Pull models from OCI registries
- Support private registry authentication
- Model versioning via tags
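For versioning via tags, the controller would need to separate the repository from the tag in an `image` value such as `harbor.internal/llm-models/llama-3.1-8b:v1`. A simplified sketch follows; `SplitImageRef` is a hypothetical helper, and requiring an explicit tag (instead of defaulting to `latest`) and rejecting digest references are assumptions of this example, not decisions from the proposal.

```go
package main

import (
	"fmt"
	"strings"
)

// SplitImageRef splits "registry/repo/name:tag" into repository and tag.
// Hypothetical helper; a real implementation would likely rely on an
// existing OCI reference parser rather than string handling.
func SplitImageRef(ref string) (repo, tag string, err error) {
	if strings.Contains(ref, "@") {
		return "", "", fmt.Errorf("digest references are not handled in this sketch: %q", ref)
	}
	lastSlash := strings.LastIndex(ref, "/")
	lastColon := strings.LastIndex(ref, ":")
	// A colon before the last slash belongs to a registry port, not a tag.
	if lastColon <= lastSlash {
		return "", "", fmt.Errorf("image %q has no tag; this sketch requires one", ref)
	}
	repo, tag = ref[:lastColon], ref[lastColon+1:]
	if repo == "" || tag == "" {
		return "", "", fmt.Errorf("image %q is malformed", ref)
	}
	return repo, tag, nil
}

func main() {
	repo, tag, err := SplitImageRef("harbor.internal/llm-models/llama-3.1-8b:v1")
	fmt.Println(repo, tag, err)

	// The colon here is a registry port, so no tag is found.
	_, _, err = SplitImageRef("harbor.internal:5000/llm-models/llama-3.1-8b")
	fmt.Println(err)
}
```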

## Benefits

- **Air-gapped deployments** - No internet required
- **Faster deployments** - Local models skip the download step
- **Security** - Models stay within network boundary
- **Compliance** - Meet data residency requirements
- **Cost** - No internet bandwidth spent on repeated model downloads

## Related

- Roadmap Q2 2026: "Air-gapped Mode - Private model registries, offline operation"
- Issue #52: Persistent model storage
- Issue #22: Private model registry with signing

## Success Criteria

- [ ] Deploy model from local file path
- [ ] Deploy model from PVC
- [ ] Deploy model from S3/MinIO
- [ ] CLI supports local source paths
- [ ] Documentation for air-gapped setup
- [ ] Example manifests for each source type
