9 changes: 9 additions & 0 deletions .gitignore
@@ -476,3 +476,12 @@ $RECYCLE.BIN/
# Windows shortcuts
*.lnk
*.db

.env
docs

# Local development overrides
docker-compose.override.yml

# Shadow mode metrics (generated at runtime)
shadow-mode-metrics.json
92 changes: 92 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,92 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Build Commands

### Docker (Full Stack)
```bash
# Build both containers
docker-compose build

# Run the full stack
docker-compose up
```

### Orchestrator (.NET 8)
```bash
# Build
dotnet build orchestrator/ModelScanner.sln

# Run (development mode with in-memory job storage)
dotnet run --project orchestrator/ModelScanner

# Run tests
dotnet test orchestrator/ModelScanner.sln

# Run a specific test
dotnet test orchestrator/ModelScanner.sln --filter "FullyQualifiedName~HashTaskTests"
```

### Model Scanner Container
```bash
docker build -t civitai-model-scanner ./model-scanner/
docker run -it --rm civitai-model-scanner 'https://example.com/model.bin'
```

## Architecture

This is a distributed AI model scanning system with two main components:

### 1. Model Scanner Container (Python)
Located in `model-scanner/`. A Docker container running:
- **picklescan**: Detects dangerous pickle imports in PyTorch models
- **clamscan**: ClamAV malware detection
- Python ML libraries (PyTorch CPU, safetensors) for model processing

### 2. Orchestrator Service (.NET 8)
Located in `orchestrator/ModelScanner/`. An ASP.NET Core web API that:
- Receives scan requests via HTTP endpoints
- Queues jobs using Hangfire (SQLite persistent or in-memory storage)
- Executes the scanner container via Docker API
- Calculates file hashes (SHA256, Blake3, CRC32, AutoV1/V2/V3)
- Converts models between formats (CKPT ↔ SafeTensors)
- Uploads processed files to S3/R2 cloud storage
- Reports results via webhook callbacks
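
For illustration, a minimal Python sketch of a subset of the hash task using only the standard library (Blake3 needs the third-party `blake3` package and is omitted; AutoV2 is assumed here to be the first 10 hex characters of the SHA256 — treat that as an assumption, not a spec):

```python
import hashlib
import zlib

def compute_hashes(data: bytes) -> dict:
    """Illustrative subset of the hash task: SHA256 and CRC32 from the stdlib."""
    sha256 = hashlib.sha256(data).hexdigest()
    return {
        "SHA256": sha256,
        "CRC32": f"{zlib.crc32(data) & 0xFFFFFFFF:08X}",
        # Assumption: AutoV2 is the first 10 hex chars of the SHA256
        "AutoV2": sha256[:10],
    }
```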

### Processing Pipeline
Jobs are enqueued via `POST /enqueue` with configurable task flags:
- `Import` (1): Upload to cloud storage
- `Convert` (2): Format conversion
- `Scan` (4): Malware/pickle scanning
- `Hash` (8): Calculate cryptographic hashes
- `ParseMetadata` (16): Extract safetensors metadata
- `Default`: Import | Hash | Scan | ParseMetadata
- `All`: All tasks including Convert

Key flow: `FileProcessor.cs` downloads the model, runs requested tasks via `IJobTask` implementations, and POSTs results to the callback URL.
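
The flag values combine as a plain bitmask; a quick Python sketch (the `JobTasks` name is illustrative, the values are from the list above):

```python
from enum import IntFlag

class JobTasks(IntFlag):
    Import = 1
    Convert = 2
    Scan = 4
    Hash = 8
    ParseMetadata = 16

DEFAULT = JobTasks.Import | JobTasks.Hash | JobTasks.Scan | JobTasks.ParseMetadata
ALL = DEFAULT | JobTasks.Convert

# Decoding a request: check membership with the & operator
def wants_scan(tasks: int) -> bool:
    return bool(JobTasks(tasks) & JobTasks.Scan)
```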

### Job Queues (Priority Order)
- `default`: Normal priority
- `low-prio`: Lower priority processing
- `x-low-prio`: Lowest priority (conversions)
- `cleanup`: Storage cleanup
- `delete-objects`: Deletion jobs

## Key Files

- `orchestrator/ModelScanner/Program.cs`: API endpoints and DI setup
- `orchestrator/ModelScanner/FileProcessor.cs`: Main job processing logic
- `orchestrator/ModelScanner/Tasks/`: Individual task implementations (HashTask, ScanTask, ImportTask, ConvertTask, ParseMetadataTask)
- `orchestrator/ModelScanner/CloudStorageService.cs`: S3/R2 integration
- `orchestrator/ModelScanner/DockerService.cs`: Scanner container execution
- `model-scanner/scripts/`: Python conversion scripts (ckpt_to_safetensors.py, safetensors_to_ckpt.py)

## Configuration

Settings in `appsettings.json`:
- `ValidTokens`: API authentication tokens
- `CloudStorageOptions`: S3/R2 credentials and bucket names
- `LocalStorageOptions`: Temp folder path
- `ConnectionStrings:JobStorage`: SQLite path for Hangfire (omit for in-memory)
- `Concurrency`: Worker thread count (defaults to CPU count)
252 changes: 252 additions & 0 deletions README.md
@@ -0,0 +1,252 @@
# Civitai Model Scanner

A distributed AI model scanning system that detects malware and malicious code in machine learning model files.

## Architecture

The system consists of several components:

```
                                        +------------------+
                                        |  Cloud Storage   |
                                        |     (S3/R2)      |
                                        +--------^---------+
                                                 |
+----------------+     +-------------------+     |     +------------------+
|   HTTP POST    |     |   Orchestrator    |-----+-----|   Callback URL   |
|   /enqueue     +---->|     (.NET 8)      |           |    (Webhook)     |
+----------------+     |                   |           +------------------+
                       |  - Hangfire       |
                       |  - Job Queue      |
                       +--------+----------+
                                |
              +-----------------+------------------+
              |                 |                  |
    +---------v------+ +-------v--------+ +-----v------+
    | Legacy Scanner | | Unified Scanner| |   ClamAV   |
    |  (picklescan)  | |  (TensorTrap)  | |  Updater   |
    +----------------+ +----------------+ +------------+
```

### Components

1. **Orchestrator Service** (.NET 8 / ASP.NET Core)
- Receives scan requests via HTTP API
- Queues jobs using Hangfire (SQLite or in-memory)
- Downloads model files, executes scanners via Docker
- Reports results via webhook callbacks

2. **Legacy Scanner** (Python/Docker)
- Picklescan: Detects dangerous pickle imports in PyTorch models
- ClamAV: Malware signature scanning

3. **Unified Scanner** (Python/Docker)
- TensorTrap: ML security scanner supporting 13+ formats
- ClamAV: Integrated malware scanning
- Detects 11+ CVEs and security vulnerabilities

4. **ClamAV Updater** (Sidecar)
- Automatically updates virus definitions every 2 hours
- Shares definitions with scanner containers via Docker volume

## Quick Start

### Prerequisites

- Docker Desktop
- .NET 8 SDK (for local development)

### Running with Docker Compose

```bash
# Build all images
docker-compose build

# Start the stack
docker-compose up -d

# Check status
docker-compose ps
```

### API Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/enqueue` | POST | Submit a scan job |
| `/cleanup` | POST | Trigger temp storage cleanup |
| `/delete` | POST | Delete an object from storage |
| `/metrics/shadow` | GET | Get shadow mode metrics summary |
| `/metrics/shadow/full` | GET | Get full shadow mode metrics |
| `/metrics/shadow/reset` | POST | Reset shadow mode metrics |

### Submitting a Scan Job

```bash
curl -X POST "http://localhost/enqueue?token=YOUR_TOKEN&fileUrl=https://example.com/model.safetensors&callbackUrl=https://your-callback.com/result"
```

**Parameters:**
- `fileUrl` (required): URL of the model file to scan
- `callbackUrl` (required): Webhook URL for scan results
- `tasks` (optional): Bitmask of tasks to run (default: 28)
- Import = 1
- Convert = 2
- Scan = 4
- Hash = 8
- ParseMetadata = 16
- `lowPrio` / `extraLowPrio` (optional): Queue priority flags
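
As a sketch, the same request can be assembled in Python (parameter names from the list above; the helper itself is hypothetical):

```python
from urllib.parse import urlencode

def build_enqueue_url(base_url, token, file_url, callback_url, tasks=28, low_prio=False):
    """Assemble the /enqueue URL; POST it with any HTTP client."""
    params = {
        "token": token,
        "fileUrl": file_url,
        "callbackUrl": callback_url,
        "tasks": tasks,
    }
    if low_prio:
        params["lowPrio"] = "true"
    return f"{base_url}/enqueue?{urlencode(params)}"
```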

## Configuration

### Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `ValidTokens__0`, `__1`, etc. | API authentication tokens | - |
| `ScannerOptions__UseUnifiedScanner` | Use TensorTrap instead of picklescan | `false` |
| `ScannerOptions__ShadowMode` | Run both scanners for comparison | `false` |
| `CloudStorageOptions__*` | S3/R2 credentials | - |
| `ConnectionStrings__JobStorage` | SQLite path for Hangfire | (in-memory) |
| `Concurrency` | Worker thread count | CPU count |
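
The double-underscore names follow ASP.NET Core's convention of mapping `__` to nested configuration keys; a rough Python illustration of that mapping (the function is hypothetical):

```python
def env_to_config(env: dict) -> dict:
    """Expand ASP.NET Core style VAR__SUB__0 names into a nested dict."""
    config = {}
    for name, value in env.items():
        node = config
        *parents, leaf = name.split("__")
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return config
```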

### Scanner Modes

1. **Legacy Mode** (default): Uses picklescan + ClamAV
2. **Unified Mode**: Uses TensorTrap + ClamAV
3. **Shadow Mode**: Runs both scanners, compares results, uses legacy for response

Shadow mode is useful for validating the new scanner before full migration.
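
The comparison itself can be pictured as a per-scan classification, roughly (names are illustrative, not the service's actual code):

```python
def classify_shadow_result(legacy_safe: bool, unified_safe: bool) -> str:
    """Bucket one shadow-mode scan pair for the metrics counters."""
    if legacy_safe == unified_safe:
        return "match"
    if legacy_safe and not unified_safe:
        return "unified_found_more_threats"
    return "legacy_found_more_threats"
```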

## Development

### Building the Orchestrator

```bash
cd orchestrator
dotnet build ModelScanner.sln
dotnet run --project ModelScanner
```

### Running Tests

```bash
dotnet test orchestrator/ModelScanner.sln
```

### End-to-End Testing

The `e2e/` directory contains a test script that runs a full scan workflow:

```bash
# Start the stack with test configuration
docker-compose -f docker-compose.yml -f docker-compose.test.yml up -d

# Run the e2e test
cd e2e
python e2e_test.py /path/to/model.ckpt --timeout 300

# Example with options
python e2e_test.py ./model.safetensors \
  --orchestrator-url http://localhost:80 \
  --token test-token \
  --tasks 28 \
  --timeout 300 \
  --json
```

**E2E Test Options:**
- `--orchestrator-url`: Orchestrator API URL (default: http://localhost:8080)
- `--token`: API token (default: test-token)
- `--tasks`: Task flags bitmask (default: 28 = Scan|Hash|ParseMetadata)
- `--timeout`: Timeout in seconds (default: 300)
- `--json`: Output raw JSON results

## Callback Response Format

```json
{
"url": "https://example.com/model.safetensors",
"fileExists": 1,
"picklescanExitCode": 0,
"picklescanOutput": "...",
"picklescanGlobalImports": ["torch", "collections"],
"picklescanDangerousImports": [],
"tensorTrapScanned": true,
"tensorTrapMaxSeverity": "info",
"tensorTrapIsSafe": true,
"tensorTrapFindings": [...],
"clamscanExitCode": 0,
"clamscanOutput": "OK",
"hashes": {
"SHA256": "...",
"Blake3": "...",
"CRC32": "...",
"AutoV1": "...",
"AutoV2": "...",
"AutoV3": "..."
}
}
```
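
A webhook receiver might read that payload conservatively, along these lines (a hypothetical consumer, not part of the service):

```python
def looks_safe(result: dict) -> bool:
    """Treat a scan as safe only when every scanner that ran agrees."""
    if result.get("clamscanExitCode") != 0:
        return False
    if result.get("picklescanDangerousImports"):
        return False
    if result.get("tensorTrapScanned") and not result.get("tensorTrapIsSafe"):
        return False
    return True
```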

## Shadow Mode Metrics

When running in shadow mode, metrics are collected comparing legacy and unified scanner results:

```bash
curl "http://localhost/metrics/shadow?token=YOUR_TOKEN"
```

```json
{
"totalScans": 1000,
"matches": 985,
"discrepancies": 15,
"agreementRate": 98.5,
"unifiedFoundMoreThreats": 12,
"legacyFoundMoreThreats": 3,
"bothSafe": 970,
"bothDangerous": 15,
"errors": 0,
"recommendation": "Unified scanner is finding MORE threats - safe to migrate"
}
```
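
The headline numbers are straightforward aggregates; a sketch of how they relate (field names from the JSON above, helper hypothetical):

```python
def summarize(outcomes: list) -> dict:
    """Aggregate per-scan shadow outcomes into the summary counters."""
    total = len(outcomes)
    matches = sum(1 for o in outcomes if o == "match")
    return {
        "totalScans": total,
        "matches": matches,
        "discrepancies": total - matches,
        "agreementRate": round(100.0 * matches / total, 1) if total else 0.0,
    }
```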

## Supported File Formats

### TensorTrap (Unified Scanner)
- PyTorch: `.pt`, `.pth`, `.bin`, `.ckpt`
- Pickle: `.pkl`, `.pickle`
- NumPy: `.npy`, `.npz`
- Safetensors: `.safetensors`
- ONNX: `.onnx`
- GGUF: `.gguf`

### Legacy Scanner (Picklescan)
- PyTorch pickle files
- Does NOT scan `.safetensors` (considered safe by design)
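
That split can be expressed as simple extension routing (illustrative sets based on the lists above, not the orchestrator's actual logic):

```python
from pathlib import Path

# Pickle-based formats the legacy scanner inspects
PICKLE_BASED = {".pt", ".pth", ".bin", ".ckpt", ".pkl", ".pickle"}

def legacy_scanner_applies(path: str) -> bool:
    """picklescan only inspects pickle-based files; .safetensors is skipped."""
    return Path(path).suffix.lower() in PICKLE_BASED
```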

## Security Considerations

- API endpoints require authentication via `token` query parameter
- Scanner containers run with memory limits (2GB default)
- ClamAV definitions are updated automatically
- Model files are deleted after scanning

## Troubleshooting

### Scanner Timeout
Large model files (>1GB) may take several minutes to scan. Adjust timeouts as needed.

### ClamAV Definitions Not Found
Ensure the clamav-updater container is running and has completed initial download:
```bash
docker logs model-scanner-clamav-updater-1
```

### Connection Refused on Callback
Ensure your callback URL is accessible from the Docker network. Use `host.docker.internal` for local development.

## License

[Your license here]
19 changes: 19 additions & 0 deletions clamav-updater/Dockerfile
@@ -0,0 +1,19 @@
FROM debian:bookworm-slim

RUN apt-get update && \
    apt-get install -y --no-install-recommends clamav clamav-freshclam ca-certificates && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

# Create directory for definitions
RUN mkdir -p /var/lib/clamav && \
    chown clamav:clamav /var/lib/clamav

COPY freshclam.conf /etc/clamav/freshclam.conf
COPY update-loop.sh /usr/local/bin/update-loop.sh
RUN chmod +x /usr/local/bin/update-loop.sh

# Run as clamav user for security
USER clamav

ENTRYPOINT ["/usr/local/bin/update-loop.sh"]