Enterprise-Grade Distributed Face Recognition/Tracking Platform
⚠️ DEVELOPMENT IN PROGRESS - This project is under active development. Coming soon as a Kubernetes Operator and Helm Chart for production deployments.
- Overview
- Why FreeFace Scales to Billions
- Key Features
- Quick Start
- Project Structure
- Architecture
- API Documentation
- Development Setup
- Deployment
- Configuration
- Monitoring & Operations
- Contributing
- License
FreeFace is a high-performance, cloud-native face recognition/tracking platform engineered for enterprise-scale deployments. Built with Rust for blazing-fast performance and designed for horizontal scalability, it delivers real-time face detection, feature extraction, and similarity matching with sub-millisecond latency.
A distributed face recognition system capable of:
- Processing millions of face recognition operations
- Sub-100ms API response times for async operations
- Handling massive concurrent requests
- Auto-scaling based on workload
- Real-time face clustering and association
FreeFace is engineered from the ground up to handle billions of face records with millions of concurrent operations. Here's why our technology choices enable massive scale:
ScyllaDB: Built for Massive Data
Advantages:
- Written in C++: No garbage collection pauses, consistent performance
- Linear Scalability: Add nodes = proportional throughput increase
- Sub-millisecond P99: Ultra-low latency even at massive scale
- Shard-per-core Architecture: Maximum CPU utilization
- Auto-tuning: Self-optimizing for different workloads
Large-scale capacity:
- Single node: 1M+ ops/sec
- Cluster scaling: 10M+ ops/sec with 10 nodes
- Storage: Petabyte-scale with automatic compression
- Face records: Billions with efficient time-series partitioning
- Global distribution: Multi-datacenter replication
Milvus: Purpose-Built Vector Database
Advantages:
- Billion-Scale Performance: Handles 10B+ vectors with sub-10ms query times
- GPU Acceleration: Native CUDA support for 100x computational speedup
- Advanced Indexing: IVF, HNSW, DiskANN algorithms optimize for different scales
- Elastic Scaling: Add compute/storage nodes independently
- Memory Optimization: Intelligent vector compression and caching
Vector Operations:
- Similarity Search: ANN algorithms with 99%+ recall at massive scale
- Dynamic Updates: Real-time insertions without index rebuilds
- Multi-Field Filtering: Combine vector similarity with metadata filters
┌────────────────────────────────────────────────────────────┐
│ Why Microservices Architecture? │
├────────────────────────────────────────────────────────────┤
│ │
│ 1. REST API Server (api-server) │
│ • Stateless: Infinite horizontal scaling │
│ • Load balanced: Even distribution │
│ • Rate limited: Protect downstream services │
│ • Scales: 2-100+ pods based on load │
│ │
│ 2. gRPC Face Extractor (extractor) │
│ • CPU/GPU optimized: Dedicated ML processing │
│ • Connection pooling: Resource efficiency │
│ • Circuit breaker: Fault isolation │
│ • Scales: Based on CPU/GPU availability │
│ │
│ 3. Kafka Async Processor (face_async_enroll) │
│ • Worker pools: 50+ concurrent workers per instance │
│ • Priority queues: Critical operations first │
│ • Batch processing: 1000+ faces/batch │
│ • Scales: Kafka partitions × consumer groups │
│ │
│ 4. Event-Driven Clustering (face_cluster) │
│ • Stream processing: Real-time clustering │
│ • State management: Distributed state via Kafka │
│ • Incremental updates: No full recomputation │
│ • Scales: With Kafka topic partitions │
└────────────────────────────────────────────────────────────┘
Kafka: High-Throughput Event Streaming
Advantages:
- Ultra-High Throughput: 1M+ messages/sec per broker with linear scaling
- Persistent Storage: Durable, replicated logs retain data for years
- Guaranteed Ordering: Per-partition message ordering for consistent processing
- Event Sourcing: Complete event history replay for analytics and recovery
- Zero Data Loss: Multi-replica persistence with configurable durability
Large-Scale Features:
- Massive Parallelism: 1000s of partitions per topic, independent scaling
- Consumer Groups: Multiple parallel processors per topic
- Backpressure Handling: Producer buffering prevents system overload
Choice: DragonflyDB over Redis
Reasons:
- 25x throughput vs. single-threaded Redis
- Memory efficiency: 30% less RAM usage
- Compatibility: Redis protocol support
- Multi-core: Vertical scaling on modern CPUs
Choice: MinIO S3-compatible storage
Reasons:
- Distributed: Erasure coding for reliability
- Performance: 180 GB/s per cluster
- Cost effective: Use commodity hardware
- S3 API: Industry standard compatibility
| Component | Technology | Scale Capability | Why It Matters |
|---|---|---|---|
| Metadata DB | ScyllaDB | 10M+ ops/sec, Petabytes | No bottleneck on face records |
| Vector DB | Milvus | 10B+ vectors, <10ms search | Fast similarity at scale |
| Cache | DragonflyDB | 4M+ ops/sec | Instant hot data access |
| Queue | Kafka | 1M+ msg/sec | Decouple & parallelize |
| Storage | MinIO | Exabytes, 180GB/s | Unlimited image storage |
| Language | Rust | Zero-cost abstractions | Maximum performance |
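The "fast similarity at scale" claim in the table reduces to one primitive: comparing embedding vectors. Below is an illustrative std-only Rust sketch of cosine similarity and brute-force top-k search — the baseline that Milvus's IVF/HNSW indexes approximate. The function and type names are hypothetical, not FreeFace or Milvus APIs:

```rust
/// Cosine similarity between two embedding vectors.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Brute-force top-k search over a gallery of (face_id, embedding) pairs --
/// the exact baseline that ANN indexes trade a little recall to accelerate.
fn top_k(query: &[f32], gallery: &[(u64, Vec<f32>)], k: usize) -> Vec<(u64, f32)> {
    let mut scored: Vec<(u64, f32)> = gallery
        .iter()
        .map(|(id, v)| (*id, cosine_similarity(query, v)))
        .collect();
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    scored.truncate(k);
    scored
}

fn main() {
    let gallery = vec![
        (1u64, vec![1.0, 0.0, 0.0]),
        (2u64, vec![0.0, 1.0, 0.0]),
        (3u64, vec![0.9, 0.1, 0.0]),
    ];
    // Best match is id 1 (identical direction), then id 3 (nearly identical).
    let matches = top_k(&[1.0, 0.0, 0.0], &gallery, 2);
    println!("{:?}", matches);
}
```

At billions of vectors this O(n) scan is exactly what becomes infeasible, which is why the index choice (IVF vs. HNSW vs. DiskANN) matters at different scales.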
FreeFace supports multiple state-of-the-art detection and recognition models, optimized for different deployment scenarios:
| Model | Performance | Use Case | Hardware |
|---|---|---|---|
| MobileNet v0.25 | Ultra-fast, lightweight | Mobile, edge devices, real-time | CPU optimized |
| ResNet50 | High accuracy, robust | Server deployment, batch processing | CPU/GPU hybrid |
| Model | Accuracy | Speed | Deployment |
|---|---|---|---|
| ArcFace (R50) | State-of-the-art | Medium | CPU/GPU versions |
| FaceNet (VGGFace2) | High accuracy | Fast | CPU optimized |
| MobileNet | Good accuracy | Ultra-fast | CPU/GPU versions |
models/
├── detection/
│ ├── mobilenet-cpu/ # Lightweight detection
│ │ └── mobilenet0.25_Final.pth
│ └── resnet50-cpu/ # Robust detection
│ └── Resnet50_Final.pth
└── recognition/
├── arcface-cpu/ # High accuracy CPU
│ ├── R50-0000.params
│ └── R50-symbol.json
├── arcface-gpu/ # GPU acceleration
│ ├── R50-0000.params
│ └── R50-symbol.json
├── facenet-cpu/ # Fast CPU inference
│ └── facenet_vggface2.pth
├── mobilenet-cpu/ # Ultra-fast CPU
│ ├── mnet10-0000.params
│ └── mnet10-symbol.json
└── mobilenet-gpu/ # GPU optimized
├── mnet10-0000.params
└── mnet10-symbol.json
- High-Throughput: MobileNet detection + MobileNet recognition
- High-Accuracy: ResNet50 detection + ArcFace recognition
- Balanced: MobileNet detection + FaceNet recognition
- GPU Acceleration: Use GPU variants when CUDA available
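The recommended combinations above can be expressed as a small selection helper. This is an illustrative sketch only — the `Profile` enum and `select_models` function are hypothetical, not part of FreeFace — but the directory names follow the models/ tree shown earlier (detection models ship CPU-only there, and FaceNet has no GPU variant):

```rust
#[derive(Clone, Copy)]
enum Profile {
    HighThroughput, // MobileNet detection + MobileNet recognition
    HighAccuracy,   // ResNet50 detection + ArcFace recognition
    Balanced,       // MobileNet detection + FaceNet recognition
}

/// Map a deployment profile to (detection, recognition) model directories.
fn select_models(profile: Profile, cuda_available: bool) -> (String, String) {
    let (det, rec) = match profile {
        Profile::HighThroughput => ("mobilenet", "mobilenet"),
        Profile::HighAccuracy => ("resnet50", "arcface"),
        Profile::Balanced => ("mobilenet", "facenet"),
    };
    // Only ArcFace and MobileNet recognition have -gpu variants in the tree.
    let rec_suffix = if cuda_available && rec != "facenet" { "gpu" } else { "cpu" };
    (
        format!("models/detection/{det}-cpu"),
        format!("models/recognition/{rec}-{rec_suffix}"),
    )
}

fn main() {
    let (det, rec) = select_models(Profile::HighAccuracy, true);
    println!("detection: {det}, recognition: {rec}");
}
```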
- ✅ Multi-Model Support - Dynamic model selection based on accuracy/speed requirements
- ✅ Real-time Face Detection - Advanced ML models (FaceNet, ArcFace, MobileNet)
- ✅ Vector Similarity Search - Fast matching using Milvus vector database
- ✅ Async Processing - Immediate API responses with background processing
- ✅ Smart Auto-Clustering - Automatic grouping of similar faces
- ✅ Geospatial Analytics - GPS-based face tracking and location analysis
- ✅ Multi-Face Processing - Handle group photos and crowds
- 🚀 High Performance - Massive concurrent processing with parallel workers
- 📊 Priority Queuing - Critical, High, Normal, Low priority processing
- 🛡️ Resilient Architecture - Circuit breakers, retry logic, graceful degradation
- 📈 Auto-scaling - Kubernetes HPA/VPA based scaling
- 🔍 Enterprise Observability - Comprehensive monitoring, tracing, and logging stack
- 📊 Advanced Metrics - 40+ Prometheus metrics with atomic counters
- 🔍 Distributed Tracing - Jaeger integration with request correlation
- 🔐 Enterprise Security - JWT auth, encrypted storage, RBAC
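One of the resilience patterns listed above, the circuit breaker, can be sketched in a few lines of std-only Rust: after a run of consecutive failures the breaker "opens" and short-circuits calls to the failing downstream (e.g. the extractor), then allows a probe after a cooldown. This is an illustrative sketch with hypothetical names, not FreeFace's actual implementation:

```rust
use std::time::{Duration, Instant};

/// Minimal circuit breaker: opens after `threshold` consecutive failures,
/// and permits a half-open probe once `cooldown` has elapsed.
struct CircuitBreaker {
    threshold: u32,
    cooldown: Duration,
    failures: u32,
    opened_at: Option<Instant>,
}

impl CircuitBreaker {
    fn new(threshold: u32, cooldown: Duration) -> Self {
        Self { threshold, cooldown, failures: 0, opened_at: None }
    }

    /// May we attempt the downstream call right now?
    fn allow(&self) -> bool {
        match self.opened_at {
            None => true,                                // closed: normal traffic
            Some(t) => t.elapsed() >= self.cooldown,     // open: only a probe
        }
    }

    fn record(&mut self, success: bool) {
        if success {
            self.failures = 0;
            self.opened_at = None;                       // close the circuit
        } else {
            self.failures += 1;
            if self.failures >= self.threshold {
                self.opened_at = Some(Instant::now());   // trip open
            }
        }
    }
}

fn main() {
    let mut cb = CircuitBreaker::new(3, Duration::from_secs(30));
    for _ in 0..3 { cb.record(false); }
    assert!(!cb.allow()); // open: calls are short-circuited
    cb.record(true);
    assert!(cb.allow());  // closed again after a success
    println!("circuit breaker behaves as expected");
}
```

The point of the pattern is fault isolation: a struggling extractor fails fast at the API layer instead of tying up worker threads on doomed calls.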
⚠️ HARDWARE REQUIREMENTS
Docker Desktop: 14+ CPUs, 21984 MB (~22 GB) RAM, 220 GB disk space
Why: Infrastructure services (ScyllaDB, Milvus, Kafka, ELK stack) are resource-intensive
Best for: Daily development with full debugging capabilities
# macOS (recommended)
brew install minikube kubectl skaffold foreman
# Rust toolchain (for local apps)
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
cargo install cargo-watch
git clone https://github.com/your-org/freeface.git
cd freeface
cp .env.sample .env
✅ Verification:
ls -la .env  # Should exist
./dev-k8s.sh  # Starts Minikube + deploys infrastructure
✅ Verification:
# Wait for all pods to be Ready (may take 3-5 minutes)
kubectl get pods -A
# Should show all pods in Running/Ready state
# Test that key service ports are reachable (ScyllaDB, Milvus, and Kafka speak
# binary protocols, so use a TCP check rather than an HTTP request)
nc -z localhost 9042 && echo "ScyllaDB: ✅"
nc -z localhost 19530 && echo "Milvus: ✅"
nc -z localhost 9093 && echo "Kafka: ✅"
# Initialize database schemas (after infrastructure is running)
cargo run --bin migrate -- up --database scylla
cargo run --bin migrate -- up --database milvus
# Or: cargo run --bin migrate -- up --database all
✅ Verification:
cargo run --bin migrate -- status --database all
# Should show: "All migrations applied"
# Option A: All services together
foreman start
# Option B: Individual services for debugging
cargo run --bin api-server # Terminal 1: REST API
cargo run --bin extractor # Terminal 2: gRPC extractor
cargo run --bin face-cluster # Terminal 3: Clustering service
✅ Verification:
# API Health Check
curl http://localhost:8080/api/v1/health
# Should return: {"status": "healthy"}
# gRPC Service Check
grpcurl -plaintext localhost:50051 list
# Should list available gRPC services
| Benefit | Description |
|---|---|
| ⚡ Instant Rebuilds | Cargo rebuilds in 1-5s vs Docker's 30-60s |
| 🛠️ Native Debugging | Full IDE support, breakpoints, variable inspection |
| 🚫 No ARM64 Issues | Eliminates Docker cross-compilation problems |
| 🌐 Real Infrastructure | Uses actual ScyllaDB, Milvus, Kafka in K8s |
| 📊 Full Observability | Complete monitoring stack with real metrics |
Main Applications (Run on Host):
- FreeFace API: http://localhost:8080 (API Docs)
- FreeFace Extractor: grpc://localhost:50051
- Face Clustering: grpc://localhost:50053
- Admin Panel: http://localhost:7998
Infrastructure Services:
- ScyllaDB: localhost:9042
- ScyllaDB Console: http://localhost:10000
- Milvus Vector DB: localhost:19530
- Milvus Admin UI (Attu): http://localhost:3002
- DragonflyDB Cache: redis://localhost:6379
- Redis Commander: http://localhost:8083
- MinIO API: http://localhost:9000 (freeface-admin/freeface-secret-2024)
- MinIO Console: http://localhost:9090 (freeface-admin/freeface-secret-2024)
Message Queues & Streaming:
- Kafka: localhost:9093
- Kafka UI: http://localhost:8082
- RabbitMQ Web: http://localhost:15672 (freeface/freeface123)
- RabbitMQ AMQP: amqp://localhost:5672
- Zookeeper: localhost:2181
Monitoring & Observability:
- Prometheus: http://localhost:9091 (40+ metrics including HTTP, face operations, database)
- Grafana: http://localhost:3001 (admin/admin) - Pre-built dashboards for system health & KPIs
- Jaeger Tracing: http://localhost:16686 - Distributed request tracing with correlation IDs
Logging (ELK Stack):
- Elasticsearch: http://localhost:9200 - Centralized log storage
- Kibana: http://localhost:5601 - Log analysis dashboards
- Logstash: http://localhost:9600 - Real-time log processing
UI Services:
- etcd UI: http://localhost:8084
FreeFace/
├── 📁 kubernetes/ # Kubernetes deployment manifests
│ ├── freeface-api/ # API service deployment
│ │ ├── Dockerfile.api-server # Production build
│ │ ├── Dockerfile.api-server.dev # Dev with cargo-watch
│ │ ├── deployment.yaml # Production deployment
│ │ └── deployment-dev.yaml # Hot-reload deployment
│ ├── freeface-extractor/ # Face extraction service
│ ├── face_enroll_async/ # Async processor deployment
│ ├── freeface-face-cluster/ # Clustering service
│ ├── scylladb/ # ScyllaDB cluster setup
│ ├── milvus/ # Milvus vector database
│ ├── kafka/ # Event streaming setup
│ └── monitoring/ # Prometheus, Grafana, Jaeger
│
├── 📁 src/ # Rust source code
│ ├── bin/ # Binary entry points
│ │ ├── api-server.rs # REST API server
│ │ ├── extractor.rs # gRPC extraction service
│ │ ├── face-async-enroll.rs # Async processor
│ │ ├── face_cluster.rs # Clustering service
│ │ └── migrate.rs # Database migration tool
│ ├── api/ # REST API implementation
│ ├── face_async_enroll/ # Async processing logic
│ ├── face_cluster/ # Clustering algorithms
│ ├── extractors/ # ML model integration
│ ├── storage/ # Database clients
│ ├── services/ # Business logic
│ └── grpc/ # gRPC definitions
│
├── 📁 db/ # Database schemas
│ └── migration/
│ ├── scylla/ # ScyllaDB migrations (CQL)
│ └── milvus/ # Milvus collections (JSON)
│
├── 📁 models/ # ML models
│ ├── onnx/ # ONNX format models
│ ├── pytorch/ # PyTorch models
│ └── keras/ # Keras/TensorFlow models
│
├── 📁 proto/ # Protocol Buffers
│ └── extractor.proto # gRPC service definitions
│
├── 📄 .env # Environment configuration (K8s DNS names)
├── 📄 dev-k8s.sh # Development script (hot-reload default, --prod for production)
├── 📄 skaffold.yaml # Production Skaffold configuration
├── 📄 skaffold-dev.yaml # Hot-reload Skaffold config (DEFAULT)
├── 📄 Cargo.toml # Rust dependencies
└── 📄 README.md # This file
- api-server: REST API for face operations
- extractor: gRPC service for face detection/extraction
- face-async-enroll: High-throughput async processor
- face_cluster: Event-driven clustering service
- migrate: Database schema management
- ScyllaDB: Metadata storage (faces, persons, images) - scylladb/scylla-cluster.yaml
- Milvus: Vector database for face embeddings - milvus/simple-standalone.yaml
- Kafka: Event streaming for async processing - kafka/deployment.yaml
- MinIO: S3-compatible image storage - minio/minio-deployment.yaml
- DragonflyDB: High-performance caching - dragonfly/deployment.yaml
- Prometheus: Metrics collection and alerting - prometheus/deployment.yaml
- Grafana: Monitoring dashboards and visualization - grafana/deployment.yaml
- Jaeger: Distributed tracing and request correlation - jaeger/deployment.yaml
- ELK Stack: Centralized logging (Elasticsearch, Logstash, Kibana) - elk-stack/deployment.yaml
- Admin Panel: Web-based system administration - admin/deployment-all.yaml
- UI Services: Management UIs for databases and queues - ui-services/deployment.yaml
- Supporting Services: Additional infrastructure components - supporting-services/
┌──────────────────────── FreeFace Platform Architecture ────────────────────────────┐
│ │
│ ┌─── APPLICATION LAYER ───────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────┐ │ │
│ │ │ REST API │ │ gRPC Extractor │ │ Face Clustering │ │ Async │ │ │
│ │ │ (Scalable) │ │ (Scalable) │ │ (Service) │ │ Processor │ │ │
│ │ │ │ │ │ │ │ │ (High-Scale)│ │ │
│ │ │ Face Recognition│ │Face Detection & │ │ Similarity │ │Kafka Consumer│ │ │
│ │ │ Endpoints │ │ Embedding │ │ & Clustering │ │& Worker Pool│ │ │
│ │ │ │ │ │ │ │ │ │ │ │
│ │ │ Port: 8080 │ │ Port: 50051 │ │ Event-Driven │ │ 50 Workers │ │ │
│ │ │ Axum Framework │ │ Tonic/gRPC │ │ Kafka Consumer │ │ Rust Tokio │ │ │
│ │ └─────────────────┘ └─────────────────┘ └─────────────────┘ └─────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌─── UTILITY SERVICES ────────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────┐ │ │
│ │ │ Admin Panel │ │ MinIO S3 │ │ Migrator │ │ UI Services │ │ │
│ │ │ (CRUD/Explorer) │ │ (Storage) │ │ (Service) │ │ (Management)│ │ │
│ │ │ │ │ │ │ │ │ │ │ │
│ │ │Data Management │ │ Image Storage │ │Database Schema │ │ Kafka UI │ │ │
│ │ │ & Analytics │ │ & Encryption │ │ Management │ │ Redis Cmdr │ │ │
│ │ │ │ │ S3-Compatible │ │ CQL & Milvus │ │ etcd UI │ │ │
│ │ │ Port: 7998 │ │ Port: 9000 │ │ Migration Tool │ │Various Ports│ │ │
│ │ └─────────────────┘ └─────────────────┘ └─────────────────┘ └─────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌─── DATA LAYER ──────────────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────┐ │ │
│ │ │ ScyllaDB │ │ Milvus │ │ DragonflyDB │ │ etcd │ │ │
│ │ │ (Metadata) │ │ (Vector DB) │ │ (Cache) │ │(Coordination│ │ │
│ │ │ │ │ │ │ │ │ │ │ │
│ │ │ • Face Records │ │• Face Embeddings│ │• Query Cache │ │• Milvus Meta│ │ │
│ │ │ • Person Data │ │• 128D/512D Vecs │ │• Session Store │ │• Service │ │ │
│ │ │ • Image Meta │ │• Similarity │ │• Hot Data │ │ Discovery │ │ │
│ │ │ • Clustering │ │ Search │ │• Rate Limits │ │• Config │ │ │
│ │ │ │ │• GPU Accelerated│ │• 25x Redis Speed│ │ Store │ │ │
│ │ │ Port: 9042 │ │ Port: 19530 │ │ Port: 6379 │ │ Port: 2379 │ │ │
│ │ └─────────────────┘ └─────────────────┘ └─────────────────┘ └─────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌─── MESSAGING & EVENT STREAMING ─────────────────────────────────────────────────┐ │
│ │ │ │
│ │ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────┐ │ │
│ │ │ Kafka │ │ RabbitMQ │ │ Zookeeper │ │ Kafka UI │ │ │
│ │ │ (Streaming) │ │ (Message Queue) │ │ (Coordination) │ │(Management) │ │ │
│ │ │ │ │ │ │ │ │ │ │ │
│ │ │• Event Sourcing │ │• Reliable Msg │ │• Kafka Cluster │ │• Topic Mgmt │ │ │
│ │ │• Real-time │ │ Delivery │ │ Management │ │• Consumer │ │ │
│ │ │ Processing │ │• Dead Letters │ │• Leader Election│ │ Monitoring │ │ │
│ │ │• 1M+ msg/sec │ │• Priority Queue │ │• Config Sync │ │• Lag Track │ │ │
│ │ │ │ │ │ │ │ │ │ │ │
│ │ │ Port: 9093 │ │ Port: 5672 │ │ Port: 2181 │ │ Port: 8082 │ │ │
│ │ │Topics: events, │ │ AMQP Protocol │ │ │ │ │ │ │
│ │ │ logs, clusters │ │ │ │ │ │ │ │ │
│ │ └─────────────────┘ └─────────────────┘ └─────────────────┘ └─────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌─── MONITORING & OBSERVABILITY ──────────────────────────────────────────────────┐ │
│ │ │ │
│ │ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────┐ │ │
│ │ │ Prometheus │──▶│ Grafana │ │ Jaeger │ │ Admin UIs │ │ │
│ │ │ (Metrics) │ │ (Dashboards) │ │ (Tracing) │ │(Data Explorer│ │ │
│ │ │ │ │ │ │ │ │ │ │ │
│ │ │• 40+ Metrics │ │• System Health │ │• Request Flow │ │• ScyllaDB │ │ │
│ │ │• Performance │ │• Business KPIs │ │• Latency Track │ │ Manager │ │ │
│ │ │• Resource Usage │ │• Real-time │ │• Span Analysis │ │• Milvus Attu│ │ │
│ │ │• Thread-safe │ │ Alerts │ │• Service Map │ │• Redis Cmdr │ │ │
│ │ │ Counters │ │• Custom Panels │ │• Error Tracking │ │ │ │ │
│ │ │ │ │ │ │ │ │ │ │ │
│ │ │ Port: 9091 │ │ Port: 3001 │ │ Port: 16686 │ │Various Ports│ │ │
│ │ └─────────────────┘ └─────────────────┘ └─────────────────┘ └─────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌─── LOGGING & SEARCH STACK ──────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────┐ │ │
│ │ │ Elasticsearch │──▶│ Kibana │ │ Logstash │ │Auto Scaling │ │ │
│ │ │ (Log Store) │ │ (Log UI) │ │ (Processing) │ │ (HPA/VPA) │ │ │
│ │ │ │ │ │ │ │ │ │ │ │
│ │ │• Centralized │ │• Log Analysis │ │• Log Parsing │ │• Resource │ │ │
│ │ │ Repository │ │• Dashboards │ │• Transformation │ │ Based │ │ │
│ │ │• Full-text │ │• Query Builder │ │• Multiple Input │ │• K8s Native │ │ │
│ │ │ Search │ │• Visualizations │ │ Sources │ │• CPU/Memory │ │ │
│ │ │• Async Logging │ │• Saved Searches │ │• Filter Pipeline│ │ Triggers │ │ │
│ │ │ │ │ │ │• Enrichment │ │• Pod Scaling│ │ │
│ │ │ │ │ │ │ │ │ │ │ │
│ │ │ Port: 9200 │ │ Port: 5601 │ │ Port: 9600 │ │ 2-100+ pods │ │ │
│ │ └─────────────────┘ └─────────────────┘ └─────────────────┘ └─────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────────────┘
🔄 Data Flow: Client → API Layer → Messaging → Data Layer → Storage
📊 Monitoring: All services → Prometheus (40+ metrics) → Grafana Dashboards
📝 Logging: Async → Elasticsearch → Kibana (structured search)
🔍 Tracing: Request correlation IDs → Jaeger → Distributed trace analysis
⚡ Events: Face operations → Kafka → Async processing & clustering
💾 Storage: Metadata (ScyllaDB) + Vectors (Milvus) + Images (MinIO) + Cache (DragonflyDB)
🎯 Observability: Thread-safe metrics + Non-blocking logs + Request correlation
🚀 Scaling: HPA/VPA auto-scaling based on CPU/Memory/Custom metrics
| Operation | Flow | Response Time |
|---|---|---|
| Sync Enrollment | Client → API → Extractor → Databases | ~500ms |
| Async Enrollment | Client → API → Kafka → Background | < 100ms |
| Face Search | Client → API → Cache/Milvus → Results | ~50ms |
| Recognition | Client → API → Vector Search → Match | ~30ms |
- Tech: Rust + Axum framework
- Purpose: HTTP REST API for face recognition
- Features: Sync/async enrollment, search, recognition
- Scaling: Auto-scales 2-20 replicas
- Tech: Rust + Tonic + ONNX Runtime
- Purpose: Face detection and feature extraction
- Models: FaceNet, ArcFace, MobileNet
- Scaling: CPU/GPU resource-based scaling
- Tech: Rust + Kafka + Worker Pools
- Purpose: High-throughput background processing
- Capacity: Massive concurrent processing capacity
- Features: Priority queuing, webhook callbacks
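The priority queuing described here can be illustrated with Rust's std `BinaryHeap`: deriving `Ord` on a priority enum (variants declared lowest to highest) makes Critical jobs drain first. An illustrative sketch with hypothetical types, not the actual Kafka consumer:

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

// Ord is derived from declaration order: Low < Normal < High < Critical.
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord, Clone, Copy)]
enum Priority { Low, Normal, High, Critical }

#[derive(Debug, Eq, PartialEq)]
struct EnrollJob { priority: Priority, face_id: u64 }

// BinaryHeap is a max-heap, so ordering jobs by priority (with face_id as a
// deterministic tiebreaker) pops Critical work first.
impl Ord for EnrollJob {
    fn cmp(&self, other: &Self) -> Ordering {
        self.priority
            .cmp(&other.priority)
            .then(self.face_id.cmp(&other.face_id))
    }
}
impl PartialOrd for EnrollJob {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> { Some(self.cmp(other)) }
}

fn main() {
    let mut queue = BinaryHeap::new();
    queue.push(EnrollJob { priority: Priority::Normal, face_id: 1 });
    queue.push(EnrollJob { priority: Priority::Critical, face_id: 2 });
    queue.push(EnrollJob { priority: Priority::Low, face_id: 3 });
    while let Some(job) = queue.pop() {
        println!("processing {:?}", job); // Critical, then Normal, then Low
    }
}
```

In the real service a pool of Tokio workers would pull from such a queue concurrently; the heap only captures the ordering guarantee.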
- Tech: Rust + Kafka consumer
- Purpose: Real-time face grouping and association
- Features: Auto-clustering, similarity analysis
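The incremental aspect of clustering — folding each new embedding into a running cluster centroid instead of recomputing over all members — can be sketched with the online-mean update c' = c + (x − c)/n. Illustrative only; `Cluster` is a hypothetical type, not the service's actual data model:

```rust
/// Running centroid of a face cluster. New embeddings fold in incrementally,
/// so arrival of a Kafka event never triggers a full recomputation.
struct Cluster {
    centroid: Vec<f32>,
    count: usize,
}

impl Cluster {
    fn new(first_embedding: Vec<f32>) -> Self {
        Self { centroid: first_embedding, count: 1 }
    }

    /// Online mean update: c' = c + (x - c) / n.
    fn add(&mut self, embedding: &[f32]) {
        self.count += 1;
        let n = self.count as f32;
        for (c, x) in self.centroid.iter_mut().zip(embedding) {
            *c += (x - *c) / n;
        }
    }
}

fn main() {
    let mut cluster = Cluster::new(vec![0.0, 0.0]);
    cluster.add(&[2.0, 4.0]);
    println!("centroid after 2 faces: {:?}", cluster.centroid); // [1.0, 2.0]
}
```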
📖 Interactive API Documentation: http://localhost:8080/docs (Swagger UI)
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/v1/faces/enroll | Synchronous face enrollment |
| POST | /api/v1/faces/enroll_async | Async enrollment (< 100ms) |
| GET | /api/v1/faces/enroll/status/{id} | Check async enrollment status |
| POST | /api/v1/faces/search | Search similar faces |
| POST | /api/v1/faces/recognize | Recognize face and return best match |
| POST | /api/v1/faces/detect | Detect faces without enrollment |
| GET | /api/v1/faces/{id} | Get face record details |
| DELETE | /api/v1/faces/{id} | Delete face record |
| GET | /api/v1/persons | List persons with pagination |
| GET | /api/v1/persons/by_external_ids | List persons by external IDs (comma-separated) |
| GET | /api/v1/persons/by_external_id/{external_id} | Get person by external ID |
| POST | /api/v1/persons | Create new person |
| GET | /api/v1/persons/{id} | Get person details |
| PUT | /api/v1/persons/{id} | Update person information |
| DELETE | /api/v1/persons/{id} | Delete person and all faces |
| GET | /api/v1/persons/{person_id}/timeline | Get person timeline (chronological appearances) |
| GET | /api/v1/persons/{person_id}/appearances | Get person appearances with geo-filtering |
| GET | /api/v1/images/{image_id} | Serve image file |
| GET | /api/v1/images/{image_id}/download | Download image file |
| GET | /api/v1/health | Basic health check |
| GET | /api/v1/health/ready | Readiness probe (K8s) |
| GET | /api/v1/health/live | Liveness probe (K8s) |
| GET | /api/v1/metrics | Service metrics (JSON) |
| GET | /api/v1/metrics/prometheus | Prometheus metrics |
| POST | /api/v1/auth/login | User authentication |
# Request with priority and options
curl -X POST http://localhost:8080/api/v1/faces/enroll_async \
-H "Content-Type: application/json" \
-d '{
"person_id": "employee_123",
"image_data": "/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBgcGBQg...",
"priority": "high",
"processing_options": {
"auto_associate": true,
"enable_multi_face": true,
"quality_threshold": 0.8
},
"webhook_url": "https://your-app.com/webhook/face-enrollment"
}'
# Immediate response (< 100ms)
{
"success": true,
"data": {
"face_id": "47f651a6-086d-4928-8c2d",
"status": "processing",
"estimated_completion": "< 30 seconds"
}
}
# List persons by multiple external IDs with pagination
curl -X GET "http://localhost:8080/api/v1/persons/by_external_ids?external_ids=ext_001,ext_002,ext_003&limit=50" \
-H "Accept: application/json"
# Response
{
"success": true,
"message": "Found 2 persons for 3 external_ids",
"data": {
"persons": [
{
"person_id": "person_12345",
"external_id": "ext_001",
"name": "John Doe",
"face_count": 3,
"created_at": "2025-01-15T10:30:00Z",
"updated_at": "2025-01-15T10:30:00Z",
"metadata": {"department": "Engineering"}
},
{
"person_id": "person_67890",
"external_id": "ext_002",
"name": "Jane Smith",
"face_count": 5,
"created_at": "2025-01-15T11:15:00Z",
"updated_at": "2025-01-15T11:15:00Z",
"metadata": {"department": "Marketing"}
}
],
"next_page_token": "",
"has_more": false,
"external_ids": ["ext_001", "ext_002", "ext_003"]
}
}
# Get a single person by external ID
curl -X GET "http://localhost:8080/api/v1/persons/by_external_id/ext_001" \
-H "Accept: application/json"
# Response
{
"success": true,
"message": "Person found for external_id: ext_001",
"data": {
"person_id": "person_12345",
"external_id": "ext_001",
"name": "John Doe",
"face_count": 3,
"created_at": "2025-01-15T10:30:00Z",
"updated_at": "2025-01-15T10:30:00Z",
"metadata": {"department": "Engineering"}
}
}
┌────────┐ ┌─────────┐ ┌───────┐ ┌────────┐ ┌───────────┐ ┌───────────┐
│ Client │ │REST API │ │ Kafka │ │ Worker │ │ Extractor │ │ Databases │
└───┬────┘ └────┬────┘ └───┬───┘ └───┬────┘ └─────┬─────┘ └─────┬─────┘
│ │ │ │ │ │
│ POST /faces/ │ │ │ │ │
│ enroll_async │ │ │ │ │
├─────────────►│ │ │ │ │
│ │ Publish │ │ │ │
│ │ enrollment │ │ │ │
│ │ event │ │ │ │
│ ├─────────────►│ │ │ │
│ 200 OK │ │ │ │ │
│ (< 100ms) │ │ │ │ │
│◄─────────────┤ │ │ │ │
│ │ │ Consume │ │ │
│ │ │ event │ │ │
│ │ ├───────────►│ │ │
│ │ │ │ gRPC extract_ │ │
│ │ │ │ face │ │
│ │ │ ├──────────────►│ │
│ │ │ │ Face │ │
│ │ │ │ embeddings │ │
│ │ │ │◄──────────────┤ │
│ │ │ │ Store face + │ │
│ │ │ │ vectors │ │
│ │ │ ├───────────────────────────────►│
│ │ Webhook │ │ │ │
│ │ callback │ │ │ │
│ │◄─────────────────────────┤ │ │
│ Webhook │ │ │ │ │
│ notification │ │ │ │ │
│◄─────────────┤ │ │ │ │
┌────────┐ ┌─────────┐ ┌───────┐ ┌───────────┐ ┌────────┐ ┌──────────┐
│ Client │ │REST API │ │ Cache │ │ Extractor │ │ Milvus │ │ ScyllaDB │
└───┬────┘ └────┬────┘ └───┬───┘ └─────┬─────┘ └───┬────┘ └────┬─────┘
│ │ │ │ │ │
│ POST /faces│ │ │ │ │
│ /recognize │ │ │ │ │
├───────────►│ │ │ │ │
│ │ Check cache│ │ │ │
│ ├───────────►│ │ │ │
│ │ │ │ │ │
│ ┌──┴──┐ │ │ │ │
│ │Cache│ │ │ │ │
│ │Hit? │ │ │ │ │
│ └──┬──┘ │ │ │ │
│ │ │ │ │ │
│ ┌──────┴─────────┐ │ │ │ │
│ HIT MISS │ │ │
│ │ │ │ │ │ │
│ │ Cached result │ │ │ │ │
│ │◄───────────────┤ │ │ │ │
│ │ │ │ Extract │ │ │
│ │ │ │ features │ │ │
│ │ │ ├───────────►│ │ │
│ │ │ │ Face │ │ │
│ │ │ │ embeddings │ │ │
│ │ │ │◄───────────┤ │ │
│ │ │ │ Vector │ │ │
│ │ │ │ similarity │ │ │
│ │ │ │ search │ │ │
│ │ │ ├────────────────────────►│ │
│ │ │ │ Top matches│ │ │
│ │ │ │◄────────────────────────┤ │
│ │ │ │ Get person │ │ │
│ │ │ │ details │ │ │
│ │ │ ├─────────────────────────────────────►│
│ │ │ │ Person │ │ │
│ │ │ │ metadata │ │ │
│ │ │ │◄─────────────────────────────────────┤
│ │ │ │ Store │ │ │
│ │ │ │ result │ │ │
│ │ │ ├───────────►│ │ │
│ └────────────────────┘ │ │ │
│ Recognition result (~30ms) │ │ │
│◄───────────┤ │ │ │
Swagger UI: http://localhost:8080/docs - Interactive API explorer with live testing capabilities
FreeFace includes a comprehensive observability platform with metrics, logging, and tracing:
# View all metrics
curl http://localhost:8080/api/v1/metrics/prometheus
# Key metric categories:
# • HTTP Operations: requests_total, response_time_ms, client_errors, server_errors
# • Face Operations: enroll_requests, enroll_success, search_requests, recognition_requests
# • Database Operations: scylla_operations_total, milvus_operations_total, db_queries_success
# • System Metrics: system_errors_total, system_uptime_seconds, memory_usage_bytes
# Jaeger tracing with request correlation
# Features:
# • HTTP request spans with timing
# • Database operation tracing
# • Cross-service request correlation
# • Request ID tracking throughout the system
# • Performance bottleneck identification
# All logs include rich context:
# • Request IDs for correlation
# • Service names and operations
# • Timing information (start/end/duration)
# • Success/failure status
# • Database table and operation details
# • Error messages and stack traces
# Service health (via Skaffold port-forward)
curl http://localhost:8080/health
curl http://localhost:8080/health/ready
curl http://localhost:8080/metrics
All monitoring services are automatically port-forwarded when using Skaffold development mode:
- Prometheus: http://localhost:9091 - Metrics collection and alerting
- Grafana: http://localhost:3001 (admin/admin) - Visualization dashboards
- Jaeger: http://localhost:16686 - Distributed request tracing
- Kibana: http://localhost:5601 - Log analysis and search
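The "atomic counters" behind the Prometheus metrics reduce to a simple pattern: lock-free `AtomicU64` counters rendered in Prometheus's text exposition format. An illustrative std-only sketch — the metric name is taken from the categories listed above, but this is not FreeFace's actual metrics code:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Thread-safe request counter: any handler thread can bump it without locks.
static HTTP_REQUESTS_TOTAL: AtomicU64 = AtomicU64::new(0);

fn record_request() {
    HTTP_REQUESTS_TOTAL.fetch_add(1, Ordering::Relaxed);
}

/// Render the counter in Prometheus text exposition format, as served
/// by an endpoint like /api/v1/metrics/prometheus.
fn render_prometheus() -> String {
    format!(
        "# TYPE http_requests_total counter\nhttp_requests_total {}\n",
        HTTP_REQUESTS_TOTAL.load(Ordering::Relaxed)
    )
}

fn main() {
    record_request();
    record_request();
    print!("{}", render_prometheus());
}
```

`Ordering::Relaxed` is sufficient here because the counter carries no synchronization responsibility; it only needs to be eventually consistent for the scraper.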
All migration commands now run directly with cargo (no Docker/K8s needed):
# Apply all pending migrations
cargo run --bin migrate -- up --database all
# Apply migrations for specific database
cargo run --bin migrate -- up --database scylla
cargo run --bin migrate -- up --database milvus
# Check migration status
cargo run --bin migrate -- status --database all
# Create new migration template
cargo run --bin migrate -- create "add_user_sessions" --database scylla
cargo run --bin migrate -- create "face_indexes" --database milvus
# Rollback migrations (not implemented yet)
cargo run --bin migrate -- down --database scylla --steps 1
# DESTRUCTIVE: Erase entire database
cargo run --bin migrate -- erase all --confirm
# DESTRUCTIVE: Truncate data while keeping schema
cargo run --bin migrate -- truncate scylla --tables all --confirm
cargo run --bin migrate -- truncate scylla --tables "persons,faces,images" --confirm
cargo run --bin migrate -- truncate scylla --tables "faces,user_auth" --confirm
cargo run --bin migrate -- truncate milvus --collections all --confirm
cargo run --bin migrate -- truncate milvus --collections "face_embeddings" --confirm
Migration Features:
- ✅ Dual Database Support: ScyllaDB (.cql files) and Milvus (.json collections)
- ✅ Checksum Validation: Prevents accidental re-runs of modified migrations
- ✅ Migration Tracking: Complete history with execution times and status
- ✅ Error Recovery: Failed migrations are logged for debugging
- ✅ Interactive Setup: Dynamic keyspace configuration for production deployments
- ✅ Safety Controls: Destructive operations require explicit confirmation
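The checksum-validation feature above has a simple shape: hash each migration file's contents when it is applied, store the hash, and refuse to re-run a file whose hash has changed. The sketch below is illustrative only — a real tool would use a stable cryptographic hash rather than Rust's `DefaultHasher`, which is deterministic within a run but not guaranteed stable across compiler versions:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Checksum of a migration file's contents at the time it was applied.
fn checksum(contents: &str) -> u64 {
    let mut h = DefaultHasher::new();
    contents.hash(&mut h);
    h.finish()
}

/// Refuse to proceed if the file on disk no longer matches the checksum
/// recorded when the migration was first applied.
fn validate(applied_checksum: u64, current_contents: &str) -> Result<(), String> {
    if checksum(current_contents) == applied_checksum {
        Ok(())
    } else {
        Err("migration file was modified after it was applied".to_string())
    }
}

fn main() {
    let applied = checksum("CREATE TABLE faces (id uuid PRIMARY KEY);");
    assert!(validate(applied, "CREATE TABLE faces (id uuid PRIMARY KEY);").is_ok());
    assert!(validate(applied, "DROP TABLE faces;").is_err());
    println!("checksum validation behaves as expected");
}
```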
Note: Infrastructure services (ScyllaDB, Milvus) must be running in K8s with proper port forwarding.
Copyright (c) 2025 Moussa Ndour
This project is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
- ✅ Free to use, modify, and share for personal, educational, and research purposes
- ✅ Open source contributions welcome
- ✅ Must provide attribution
For commercial use, you need explicit written permission from the author.
Contact for commercial licensing:
- 📧 Email: moussandour1@gmail.com
- 🐙 GitHub: github.com/touskar
- 💬 WhatsApp: +221772457199
See LICENSE file for full details.