This repository was archived by the owner on Mar 14, 2026. It is now read-only.
[Feature Request] Implement Multi-Provider LLM Microservices Architecture with Docker Orchestration #36
Multi-Provider LLM Microservices Architecture
Labels: enhancement, architecture, docker, microservices, litellm
📋 Summary
Currently, the AIMO-Models project uses a single LiteLLM proxy service that aggregates multiple LLM providers through OpenRouter. We propose implementing a microservices architecture where each LLM provider runs in its own isolated Docker container, managed through a unified API gateway.
🎯 Motivation
Current Limitations
- Single Point of Failure: All providers depend on one LiteLLM instance
- Resource Contention: All models share the same container resources
- Difficult Scaling: Cannot independently scale specific providers
- Maintenance Complexity: Updates affect all providers simultaneously
- Limited Isolation: Provider failures can impact other services
Proposed Benefits
- Fault Isolation: Each provider runs independently
- Independent Scaling: Scale providers based on demand
- Easier Maintenance: Update/restart individual services
- Better Resource Management: Allocate resources per provider
- Enhanced Monitoring: Per-provider metrics and logging
- Dynamic Provider Management: Add/remove providers without downtime
🏗️ Proposed Architecture
┌─────────────────────────────────────────────────────────────┐
│ API Gateway / Load Balancer │
│ (Main AIMO Service) │
└─────────────────┬───────────────┬───────────────┬───────────┘
│ │ │
┌────────▼──────┐ ┌─────▼─────┐ ┌──────▼──────┐
│ OpenRouter │ │ Phala │ │ ChutesAI │
│ Service │ │ Service │ │ Service │
│ (Port 4001) │ │(Port 4002)│ │ (Port 4003) │
└───────────────┘ └───────────┘ └─────────────┘
│ │ │
┌────────▼──────┐ ┌─────▼─────┐ ┌──────▼──────┐
│ Nebula Block │ │ Provider │ │ Provider │
│ Service │ │ Service N │ │ Service N+1 │
│ (Port 4004) │ │(Port 400N)│ │(Port 400N+1)│
└───────────────┘ └───────────┘ └─────────────┘
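The port layout in the diagram (gateway on 4000, providers counting up from 4001) can be captured as a small mapping; a sketch in Python, where the provider ordering is our assumption based on the diagram:

```python
# Provider services in the order they claim ports 4001, 4002, ...
PROVIDERS = ["openrouter", "phala", "chutesai", "nebula"]
GATEWAY_PORT = 4000

def provider_port(name: str) -> int:
    """Return the host port for a provider, per the 4001+ convention above."""
    return GATEWAY_PORT + 1 + PROVIDERS.index(name)
```

Keeping this convention in one place avoids the gateway, scripts, and monitoring drifting apart on port numbers.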
📁 Proposed File Structure
infra/
├── litellm/
│ ├── common/
│ │ ├── docker-compose.base.yml # Base configuration
│ │ └── shared-network.yml # Network definitions
│ ├── providers/
│ │ ├── openrouter/
│ │ │ ├── docker-compose.yml
│ │ │ ├── config.yaml
│ │ │ └── .env.openrouter
│ │ ├── phala/
│ │ │ ├── docker-compose.yml
│ │ │ ├── config.yaml
│ │ │ └── .env.phala
│ │ ├── chutesai/
│ │ │ ├── docker-compose.yml
│ │ │ ├── config.yaml
│ │ │ └── .env.chutesai
│ │ └── nebula/
│ │ ├── docker-compose.yml
│ │ ├── config.yaml
│ │ └── .env.nebula
│ ├── gateway/
│ │ ├── docker-compose.yml
│ │ ├── nginx.conf # Load balancer config
│ │ └── .env.gateway
│ ├── orchestration/
│ │ ├── docker-compose.all.yml # Full orchestration
│ │ └── .env.all # Global environment
│ ├── monitoring/
│ │ ├── docker-compose.yml # Prometheus & Grafana
│ │ └── prometheus.yml
│ └── scripts/
│ ├── manage-services.sh # Service management
│ ├── health-check.sh # Health monitoring
│ └── deploy.sh # Deployment automation
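The per-provider `config.yaml` files above would presumably follow the standard LiteLLM proxy config format; a minimal sketch for the OpenRouter service (the model alias and model id are illustrative, not settled names):

```yaml
# providers/openrouter/config.yaml -- illustrative sketch only
model_list:
  - model_name: prod-default                  # alias exposed to clients (assumed)
    litellm_params:
      model: openrouter/openai/gpt-4o         # example provider-prefixed model id
      api_key: os.environ/OPENROUTER_API_KEY  # supplied via .env.openrouter

general_settings:
  master_key: os.environ/LITELLM_MASTER_KEY
```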
🔧 Implementation Details
1. Individual Provider Services
Each provider will have its own Docker service configuration:
Example: OpenRouter Service (providers/openrouter/docker-compose.yml)
```yaml
version: "3.8"

services:
  openrouter-llm:
    image: ghcr.io/berriai/litellm:main-latest
    container_name: aimo-openrouter-llm
    ports:
      - "4001:4000"
    volumes:
      - ./config.yaml:/app/config.yaml
    env_file:
      - .env.openrouter
    command: ["--config", "/app/config.yaml", "--port", "4000"]
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:4000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
    networks:
      - aimo-llm-network
    deploy:
      resources:
        limits:
          memory: 1G
          cpus: "0.5"

networks:
  aimo-llm-network:
    external: true
```

2. API Gateway Configuration
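At its core, the gateway maps a requested model name to a provider backend. The dispatch rule used in the Nginx configuration that follows can be sketched in Python (prefix spellings are taken from that config; the helper name is ours):

```python
# Model-name prefixes mapped to upstream backends; OpenRouter is the default.
ROUTES = {
    "phala-": "phala_backend",
    "chutes-": "chutesai_backend",
    "nebula-": "nebula_backend",
}

def pick_backend(model: str) -> str:
    """Return the upstream for a requested model, defaulting to OpenRouter."""
    for prefix, backend in ROUTES.items():
        if model.startswith(prefix):
            return backend
    return "openrouter_backend"
```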
Nginx Load Balancer (gateway/nginx.conf)
```nginx
upstream openrouter_backend {
    server aimo-openrouter-llm:4000 weight=3 max_fails=2 fail_timeout=30s;
}

upstream phala_backend {
    server aimo-phala-llm:4000 weight=2 max_fails=2 fail_timeout=30s;
}

upstream chutesai_backend {
    server aimo-chutesai-llm:4000 weight=2 max_fails=2 fail_timeout=30s;
}

upstream nebula_backend {
    server aimo-nebula-llm:4000 weight=1 max_fails=2 fail_timeout=30s;
}

# Gateway server (also serves the /health endpoint)
server {
    listen 4000;

    # Provider-specific routing
    location /providers/openrouter/ {
        proxy_pass http://openrouter_backend/;
        include proxy_params;
    }

    location /providers/phala/ {
        proxy_pass http://phala_backend/;
        include proxy_params;
    }

    location /providers/chutesai/ {
        proxy_pass http://chutesai_backend/;
        include proxy_params;
    }

    location /providers/nebula/ {
        proxy_pass http://nebula_backend/;
        include proxy_params;
    }

    # Model-based routing. Caveat: $request_body is usually still empty when
    # "if" conditions are evaluated, so in practice this will likely need a
    # custom header (e.g. a map on $http_x_model) or njs/Lua body inspection.
    location /v1/chat/completions {
        set $backend openrouter_backend;
        if ($request_body ~ "phala-") {
            set $backend phala_backend;
        }
        if ($request_body ~ "chutes-") {
            set $backend chutesai_backend;
        }
        if ($request_body ~ "nebula-") {
            set $backend nebula_backend;
        }
        proxy_pass http://$backend;
        include proxy_params;
    }

    # Health check aggregation
    location /health {
        access_log off;
        return 200 '{"status":"healthy","timestamp":"$time_iso8601"}';
        add_header Content-Type application/json;
    }
}
```

3. Service Management Scripts
Service Management (scripts/manage-services.sh)
```bash
#!/bin/bash

ACTION=$1
SERVICE=$2
PROVIDERS=("openrouter" "phala" "chutesai" "nebula")
BASE_DIR="$(dirname "$0")/.."

log() {
    echo "[$(date +'%Y-%m-%d %H:%M:%S')] $1"
}

create_network() {
    if ! docker network ls | grep -q aimo-llm-network; then
        log "Creating shared network..."
        docker network create aimo-llm-network
    fi
}

start_service() {
    local service=$1
    if [ "$service" == "all" ]; then
        create_network
        log "Starting all services..."
        cd "$BASE_DIR/orchestration"
        docker-compose -f docker-compose.all.yml up -d
    elif [ "$service" == "gateway" ]; then
        log "Starting gateway service..."
        cd "$BASE_DIR/gateway"
        docker-compose up -d
    elif [[ " ${PROVIDERS[*]} " =~ " $service " ]]; then
        create_network
        log "Starting $service service..."
        cd "$BASE_DIR/providers/$service"
        docker-compose up -d
    else
        log "Unknown service: $service"
        exit 1
    fi
}

stop_service() {
    local service=$1
    if [ "$service" == "all" ]; then
        log "Stopping all services..."
        cd "$BASE_DIR/orchestration"
        docker-compose -f docker-compose.all.yml down
    elif [ "$service" == "gateway" ]; then
        log "Stopping gateway service..."
        cd "$BASE_DIR/gateway"
        docker-compose down
    elif [[ " ${PROVIDERS[*]} " =~ " $service " ]]; then
        log "Stopping $service service..."
        cd "$BASE_DIR/providers/$service"
        docker-compose down
    else
        log "Unknown service: $service"
        exit 1
    fi
}

show_status() {
    log "Service Status:"
    # "name=" filters match substrings, so "aimo-" catches every aimo-*-llm container
    docker ps --filter "name=aimo-" --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"
    echo ""
    log "Health Status:"
    for i in "${!PROVIDERS[@]}"; do
        provider="${PROVIDERS[$i]}"
        port=$((4001 + i))
        if curl -sf "http://localhost:$port/health" >/dev/null 2>&1; then
            log "✅ $provider (port $port): healthy"
        else
            log "❌ $provider (port $port): unhealthy"
        fi
    done
}

case $ACTION in
    start)
        start_service "${SERVICE:-all}"
        ;;
    stop)
        stop_service "${SERVICE:-all}"
        ;;
    restart)
        if [ -z "$SERVICE" ]; then
            log "Restarting all services..."
            stop_service "all"
            sleep 5
            start_service "all"
        else
            log "Restarting $SERVICE service..."
            stop_service "$SERVICE"
            sleep 2
            start_service "$SERVICE"
        fi
        ;;
    status)
        show_status
        ;;
    logs)
        if [ -z "$SERVICE" ]; then
            cd "$BASE_DIR/orchestration"
            docker-compose -f docker-compose.all.yml logs -f
        elif [ "$SERVICE" == "gateway" ]; then
            cd "$BASE_DIR/gateway"
            docker-compose logs -f
        elif [[ " ${PROVIDERS[*]} " =~ " $SERVICE " ]]; then
            cd "$BASE_DIR/providers/$SERVICE"
            docker-compose logs -f
        else
            log "Unknown service: $SERVICE"
            exit 1
        fi
        ;;
    *)
        echo "Usage: $0 {start|stop|restart|status|logs} [service_name]"
        echo "Services: all, gateway, ${PROVIDERS[*]}"
        echo ""
        echo "Examples:"
        echo "  $0 start             # Start all services"
        echo "  $0 start openrouter  # Start OpenRouter service only"
        echo "  $0 stop phala        # Stop Phala service only"
        echo "  $0 restart all       # Restart all services"
        echo "  $0 status            # Show service status"
        echo "  $0 logs chutesai     # Show ChutesAI logs"
        exit 1
        ;;
esac
```

4. Monitoring and Health Checks
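When per-provider checks are eventually surfaced through the gateway's `/health` endpoint, the aggregation step is a pure function of the individual results; a sketch in Python (function name and JSON shape are our assumptions):

```python
import json

def aggregate_health(results: dict) -> str:
    """Collapse per-provider health booleans into one JSON status blob."""
    healthy = all(results.values())
    return json.dumps({
        "status": "healthy" if healthy else "degraded",
        "providers": {name: ("up" if ok else "down") for name, ok in results.items()},
    })
```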
Health Check Script (scripts/health-check.sh)
```bash
#!/bin/bash

PROVIDERS=("openrouter" "phala" "chutesai" "nebula")
GATEWAY_PORT=4000
BASE_PORT=4000

check_service_health() {
    local service=$1
    local port=$2
    local url="http://localhost:$port/health"
    if curl -sf "$url" >/dev/null 2>&1; then
        echo "✅ $service (port $port): healthy"
        return 0
    else
        echo "❌ $service (port $port): unhealthy"
        return 1
    fi
}

main() {
    echo "🏥 AIMO LLM Services Health Check"
    echo "================================="
    local unhealthy_count=0

    # Check gateway
    if ! check_service_health "gateway" $GATEWAY_PORT; then
        ((unhealthy_count++))
    fi

    # Check providers (ports 4001 and up)
    for i in "${!PROVIDERS[@]}"; do
        local provider="${PROVIDERS[$i]}"
        local port=$((BASE_PORT + i + 1))
        if ! check_service_health "$provider" $port; then
            ((unhealthy_count++))
        fi
    done

    echo ""
    if [ $unhealthy_count -eq 0 ]; then
        echo "🎉 All services are healthy!"
        exit 0
    else
        echo "⚠️ $unhealthy_count service(s) are unhealthy"
        exit 1
    fi
}

main "$@"
```

5. Integration with Main Application
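Against the gateway, the main application only needs a base URL and an OpenAI-style payload. A sketch of how such a request could be assembled (the helper name and defaults are ours, not existing AIMO code):

```python
import json
import os

def build_chat_request(model: str, prompt: str):
    """Build the gateway URL and JSON body for an OpenAI-style chat call."""
    base = os.environ.get("LLM_BASE_URL", "http://localhost:4000")
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return f"{base}/v1/chat/completions", body
```

Because LiteLLM exposes an OpenAI-compatible API, any OpenAI SDK pointed at `LLM_BASE_URL` should work the same way.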
Update the main AIMO application configuration to use the new gateway:
Environment Variables (.env)
```bash
# LiteLLM Gateway Configuration
LLM_BASE_URL=http://localhost:4000   # Points to the gateway
LLM_API_KEY=sk-litellm-master-key
LLM_MODEL_DEFAULT=prod-default       # Routes through OpenRouter by default

# Provider-specific configurations (optional)
OPENROUTER_ENDPOINT=http://localhost:4001
PHALA_ENDPOINT=http://localhost:4002
CHUTESAI_ENDPOINT=http://localhost:4003
NEBULA_ENDPOINT=http://localhost:4004
```

🚀 Implementation Plan
Phase 1: Foundation Setup
- Create base file structure
- Implement shared network configuration
- Create service management scripts
- Set up monitoring infrastructure
Phase 2: Provider Separation
- Extract OpenRouter to separate service
- Add Phala Network integration
- Add ChutesAI integration
- Add Nebula Block integration
Phase 3: Gateway Implementation
- Implement Nginx-based API gateway
- Add intelligent routing logic
- Implement health check aggregation
- Add load balancing strategies
Phase 4: Orchestration
- Create full orchestration scripts
- Implement deployment automation
- Add rolling update capabilities
- Set up monitoring dashboards
Phase 5: Testing & Optimization
- Performance testing
- Load balancing optimization
- Fault tolerance testing
- Documentation updates