# 🚨 Lighthouse

> **Scale anything, anywhere, intelligently**

## 🌟 Why Lighthouse?

Most autoscaling solutions are platform-specific and inflexible. Lighthouse breaks free from these constraints by providing a generic, type-safe framework that works with any infrastructure while delivering enterprise-grade intelligence.

```mermaid
graph TB
    A[Your Application] --> B[Lighthouse Engine]
    B --> C[Policy Engine]
    B --> D[Metrics Storage]
    B --> E[Predictive AI]
    B --> F[Observer System]

    C --> G[Your Infrastructure]
    D --> G
    E --> G
    F --> H[Monitoring & Alerts]

    G --> I[Kubernetes]
    G --> J[AWS Auto Scaling]
    G --> K[Docker Swarm]
    G --> L[Custom APIs]

    style B fill:#e1f5fe
    style G fill:#f3e5f5
    style H fill:#fff3e0
```

## ✨ Key Features

| Feature | Description |
|---------|-------------|
| 🎯 **Universal Compatibility** | Works with Kubernetes, AWS, Docker, bare metal, or any custom infrastructure |
| 🧠 **Intelligent Policies** | Composite policies, time-based rules, and weighted decision making |
| 📊 **Historical Analytics** | SQLite-based metrics storage with trend analysis and statistical insights |
| 🔮 **Predictive Scaling** | ML-powered forecasting with multiple algorithms (Linear, MA, Exponential) |
| ⚡ **Real-time Processing** | Streaming metrics with configurable evaluation intervals |
| 🛡️ **Type Safety** | Compile-time guarantees for scaling logic and configuration |
| 📈 **Production Ready** | Built-in observability, error handling, and performance optimization |
| 🔧 **Live Configuration** | Update policies and settings without restarts |

## 🚀 Quick Start

### Installation

Add Lighthouse to your `Cargo.toml`, picking one of the three `lighthouse` lines below:

```toml
[dependencies]
# Basic autoscaling
lighthouse = "0.1"

# Full-featured (recommended)
lighthouse = { version = "0.1", features = ["full"] }

# À la carte
lighthouse = {
    version = "0.1",
    features = ["metrics-persistence", "predictive-scaling", "time-utils"]
}

# Always needed
tokio = { version = "1.0", features = ["full"] }
async-trait = "0.1"
```

### Basic Example

```rust
use lighthouse::{
    LighthouseEngine, LighthouseConfig, LighthouseCallbacks, ResourceConfig,
    MetricsProvider, ScalingExecutor, ResourceMetrics, ScaleAction,
    ScaleDirection, CallbackContext, LighthouseResult, policies, utils,
};
use std::sync::Arc;
use async_trait::async_trait;

// 1. Implement metrics collection for your infrastructure
struct MyMetricsProvider;

#[async_trait]
impl MetricsProvider for MyMetricsProvider {
    async fn get_metrics(
        &self,
        resource_id: &str,
        _context: &CallbackContext,
    ) -> LighthouseResult<Option<ResourceMetrics>> {
        // Fetch metrics from your monitoring system (Prometheus, CloudWatch, etc.)
        let cpu = get_cpu_usage(resource_id).await?;
        let memory = get_memory_usage(resource_id).await?;
        
        Ok(Some(ResourceMetrics {
            resource_id: resource_id.to_string(),
            resource_type: "web-service".to_string(),
            timestamp: utils::current_timestamp(),
            metrics: [
                ("cpu_percent".to_string(), cpu),
                ("memory_percent".to_string(), memory),
            ].into(),
        }))
    }
}

// 2. Implement scaling actions for your infrastructure
struct MyScalingExecutor;

#[async_trait]
impl ScalingExecutor for MyScalingExecutor {
    async fn execute_scale_action(
        &self,
        action: &ScaleAction,
        _context: &CallbackContext,
    ) -> LighthouseResult<bool> {
        match action.direction {
            ScaleDirection::Up => {
                // Scale up your infrastructure (Kubernetes, AWS ASG, etc.)
                scale_up_infrastructure(&action.resource_id, action.scale_factor).await?;
                println!("✅ Scaled up {}", action.resource_id);
            },
            ScaleDirection::Down => {
                scale_down_infrastructure(&action.resource_id, action.scale_factor).await?;
                println!("📉 Scaled down {}", action.resource_id);
            },
            ScaleDirection::Maintain => {
                println!("➡️ Maintaining {}", action.resource_id);
            }
        }
        Ok(true)
    }
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // 3. Configure intelligent scaling policies
    let config = LighthouseConfig::builder()
        .evaluation_interval(30) // Check every 30 seconds
        .add_resource_config("web-frontend", ResourceConfig {
            resource_type: "service".to_string(),
            policies: vec![
                // Multi-metric policy with smart thresholds
                policies::multi_metric_policy(
                    "intelligent-scaling",
                    (75.0, 25.0),  // CPU: scale up at 75%, down at 25%
                    (80.0, 30.0),  // Memory: scale up at 80%, down at 30%
                    1.5,           // Scale by 50% each time
                    300            // 5-minute cooldown between actions
                )
            ],
            default_policy: Some("intelligent-scaling".to_string()),
            settings: [
                ("min_replicas".to_string(), "2".to_string()),
                ("max_replicas".to_string(), "50".to_string()),
            ].into(),
        })
        .enable_logging(true)
        .build();

    // 4. Create the engine and start scaling
    let callbacks = LighthouseCallbacks::new(
        Arc::new(MyMetricsProvider),
        Arc::new(MyScalingExecutor)
    );
    
    let engine = LighthouseEngine::new(config, callbacks);
    let handle = Arc::new(engine.handle());
    
    // Start the engine
    tokio::spawn(async move { engine.start().await });
    
    // 5. Your application runs normally - Lighthouse handles scaling automatically!
    println!("🚨 Lighthouse is now managing your infrastructure scaling!");
    
    // Engine will continuously:
    // - Fetch metrics from your systems
    // - Evaluate scaling policies  
    // - Execute scaling decisions
    // - Learn from historical patterns (with persistence enabled)
    
    Ok(())
}

// Helper functions (implement based on your infrastructure)
async fn get_cpu_usage(resource_id: &str) -> LighthouseResult<f64> { todo!() }
async fn get_memory_usage(resource_id: &str) -> LighthouseResult<f64> { todo!() }
async fn scale_up_infrastructure(resource_id: &str, factor: Option<f64>) -> LighthouseResult<()> { todo!() }
async fn scale_down_infrastructure(resource_id: &str, factor: Option<f64>) -> LighthouseResult<()> { todo!() }
```

## 🏗️ Architecture

Lighthouse uses a callback-driven architecture that separates concerns and maximizes flexibility:

```mermaid
flowchart TD
    subgraph "Your Application"
        A[Metrics Provider] --> B[Lighthouse Engine]
        C[Scaling Executor] --> B
        D[Scaling Observer] --> B
    end

    subgraph "Lighthouse Core"
        B --> E[Policy Engine]
        B --> F[Metrics Storage]
        B --> G[Predictive AI]
        B --> H[Event System]

        E --> I[Basic Policies]
        E --> J[Composite Policies]
        E --> K[Time-Based Policies]

        F --> L[SQLite Backend]
        F --> M[Trend Analysis]

        G --> N[Linear Regression]
        G --> O[Moving Average]
        G --> P[Exponential Smoothing]
    end

    subgraph "Your Infrastructure"
        Q[Kubernetes Deployments]
        R[AWS Auto Scaling Groups]
        S[Docker Containers]
        T[Custom APIs]
    end

    B --> Q
    B --> R
    B --> S
    B --> T

    style B fill:#e3f2fd
    style E fill:#f1f8e9
    style F fill:#fce4ec
    style G fill:#fff8e1
```

### Core Components

1. **🔌 Your Callbacks** - Connect Lighthouse to your infrastructure
   - `MetricsProvider`: Fetch metrics from your monitoring systems
   - `ScalingExecutor`: Execute scaling actions on your infrastructure
   - `ScalingObserver`: Receive notifications about scaling events
2. **🧠 Policy Engine** - Intelligent decision making
   - Basic policies (CPU, memory, request rate)
   - Composite policies (AND/OR/weighted logic)
   - Time-based policies (business hours vs. after-hours)
3. **📊 Metrics Storage** (Optional) - Historical intelligence
   - SQLite backend with configurable retention
   - Trend analysis and pattern detection
   - Statistical summaries and percentiles
4. **🔮 Predictive AI** (Optional) - Proactive scaling
   - Multiple forecasting algorithms
   - Anomaly detection and alerting
   - Confidence-based recommendations

## 🎯 Examples

### Example 1: Basic CPU Scaling

Perfect for getting started:

```rust
use lighthouse::policies;

let policy = policies::cpu_scaling_policy(
    80.0,  // Scale up when CPU > 80%
    20.0,  // Scale down when CPU < 20%
    1.5,   // Scale by 50%
    300    // 5-minute cooldown
);
```
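
To wire this policy into an engine, attach it to a resource just as the Quick Start does. A minimal sketch reusing the `ResourceConfig` shape from above (the `"api-backend"` name is illustrative, and `default_policy` is left as `None` since `cpu_scaling_policy` takes no name in this snippet):

```rust
use lighthouse::{LighthouseConfig, ResourceConfig, policies};

let config = LighthouseConfig::builder()
    .add_resource_config("api-backend", ResourceConfig {
        resource_type: "service".to_string(),
        policies: vec![policies::cpu_scaling_policy(80.0, 20.0, 1.5, 300)],
        default_policy: None, // policy name not specified in this snippet
        settings: Default::default(),
    })
    .build();
```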

### Example 2: Business Hours Scaling

Different scaling behavior for day vs. night:

```rust
use lighthouse::policies;

// Conservative during business hours
let day_policy = policies::cpu_scaling_policy(85.0, 30.0, 1.3, 300);

// Aggressive after hours (fewer users, can be more responsive)
let night_policy = policies::cpu_scaling_policy(60.0, 15.0, 2.0, 180);

let time_aware_policy = policies::business_hours_scaling_policy(
    "intelligent-scheduling",
    day_policy,
    night_policy,
    Some("UTC")
);
```

### Example 3: Advanced Composite Policies

Sophisticated multi-metric decision making:

```rust
use lighthouse::policies;

// Create individual policies
let cpu_policy = policies::cpu_scaling_policy(75.0, 25.0, 1.5, 300);
let memory_policy = policies::memory_scaling_policy(80.0, 30.0, 1.4, 300);
let request_policy = policies::request_rate_scaling_policy(1000.0, 200.0, 1.6, 180);

// Combine with weighted voting: CPU and memory more important than requests
let smart_policy = policies::composite_weighted_policy(
    "intelligent-scaling",
    vec![cpu_policy, memory_policy, request_policy],
    vec![0.4, 0.4, 0.2]  // 40% CPU, 40% memory, 20% requests
);
```
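
To see how the weights interact, consider one disagreeing round of votes. The exact voting semantics are internal to Lighthouse, so this is a generic weighted-vote sketch rather than the crate's implementation:

```rust
// Encode votes as +1 = scale up, 0 = maintain, -1 = scale down.
// Suppose CPU votes up, memory votes maintain, and request rate votes down:
let votes = [1.0_f64, 0.0, -1.0];
let weights = [0.4, 0.4, 0.2];

// Weighted score: 0.4*1 + 0.4*0 + 0.2*(-1) = 0.2
let score: f64 = votes.iter().zip(&weights).map(|(v, w)| v * w).sum();
assert!((score - 0.2).abs() < 1e-9);

// A small positive score like 0.2 would typically fall below a decisive
// threshold (say 0.5) and result in "maintain" rather than a scale-up.
```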

### Example 4: Persistence & Analytics

Store metrics and analyze trends:

```rust
use lighthouse::persistence::{MetricsStoreConfig, RetentionPolicy};

let store_config = MetricsStoreConfig::builder()
    .database_path("./lighthouse_metrics.db")
    .retention_policy(
        RetentionPolicy::new()
            .raw_data_days(7)            // Raw data for 1 week
            .hourly_aggregates_days(30)  // Hourly summaries for 1 month
            .daily_aggregates_days(365)  // Daily summaries for 1 year
    )
    .build();

// `config` and `callbacks` are built as in the Quick Start;
// `handle` comes from `engine.handle()`.
let engine = LighthouseEngine::new_with_persistence(
    config, callbacks, store_config
).await?;

// Query historical trends
let stats = handle.get_metrics_statistics(
    "web-frontend".to_string(),
    "cpu_percent".to_string(),
    24  // Last 24 hours
).await?;

println!("Average CPU: {:.1}%", stats.mean);
println!("Trend: {:?} (confidence: {:.0}%)",
    stats.trend.direction, stats.trend.confidence * 100.0);
```
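
A small follow-up sketch, using only the `stats` fields from the example above, showing how one might react to a high-confidence trend (the 0.8 threshold is an arbitrary illustration):

```rust
// `stats` comes from get_metrics_statistics() above.
if stats.trend.confidence > 0.8 {
    eprintln!(
        "High-confidence {:?} trend on cpu_percent - consider reviewing thresholds",
        stats.trend.direction
    );
}
```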

### Example 5: Predictive Scaling

Proactive scaling based on forecasts:

```rust
use lighthouse::predictive::{PredictiveScaler, PredictiveConfig};

let predictive_config = PredictiveConfig::builder()
    .enable_linear_regression(true)
    .enable_moving_average(true)
    .enable_exponential_smoothing(true)
    .forecast_horizon_minutes(60)  // Predict 1 hour ahead
    .confidence_threshold(0.75)    // 75% confidence required
    .build();

// `metrics_store` comes from the persistence setup in Example 4.
let scaler = PredictiveScaler::new(predictive_config, metrics_store);

// Get proactive recommendations
let recommendations = scaler.get_scaling_recommendations(
    "web-frontend",
    vec!["cpu_percent".to_string(), "memory_percent".to_string()]
).await?;

for rec in recommendations {
    println!("🔮 Predicted {} load spike in {} minutes",
        rec.metric_name, rec.trigger_in_minutes);
    println!("   Recommended: Scale by {}x",
        rec.recommended_action.scale_factor.unwrap_or(1.0));
}
```

## 🔧 Real-World Integrations

### Kubernetes Integration

```rust
use k8s_openapi::api::apps::v1::Deployment;
use k8s_openapi::api::autoscaling::v1::{Scale, ScaleSpec};
use kube::api::{Api, Patch, PatchParams};
use kube::Client;
use lighthouse::{CallbackContext, LighthouseResult, ScaleAction, ScaleDirection, ScalingExecutor};

struct KubernetesScaler {
    client: Client,
    namespace: String,
}

#[async_trait::async_trait]
impl ScalingExecutor for KubernetesScaler {
    async fn execute_scale_action(
        &self,
        action: &ScaleAction,
        _context: &CallbackContext,
    ) -> LighthouseResult<bool> {
        let deployments: Api<Deployment> = Api::namespaced(self.client.clone(), &self.namespace);

        match action.direction {
            ScaleDirection::Up => {
                // Read the current replica count, then apply the scale factor.
                // (Assumes kube errors convert into LighthouseError; in production,
                // also clamp to the min_replicas/max_replicas settings.)
                let current = deployments.get_scale(&action.resource_id).await?;
                let current_replicas = current.spec.and_then(|s| s.replicas).unwrap_or(1);
                let new_replicas =
                    (current_replicas as f64 * action.scale_factor.unwrap_or(1.5)).ceil() as i32;

                let scale = Scale {
                    spec: Some(ScaleSpec { replicas: Some(new_replicas) }),
                    ..Default::default()
                };
                deployments
                    .patch_scale(&action.resource_id, &PatchParams::default(), &Patch::Merge(&scale))
                    .await?;
            },
            _ => { /* ... handle Down and Maintain */ }
        }

        Ok(true)
    }
}
```

### AWS Auto Scaling Integration

```rust
use aws_sdk_autoscaling::Client as ASGClient;
use lighthouse::{CallbackContext, LighthouseResult, ScaleAction, ScaleDirection, ScalingExecutor};

struct AWSAutoScaler {
    client: ASGClient,
}

#[async_trait::async_trait]
impl ScalingExecutor for AWSAutoScaler {
    async fn execute_scale_action(
        &self,
        action: &ScaleAction,
        _context: &CallbackContext,
    ) -> LighthouseResult<bool> {
        match action.direction {
            ScaleDirection::Up => {
                // `compute_new_capacity` is a hypothetical helper: look up the
                // group's current desired capacity (describe_auto_scaling_groups)
                // and multiply by action.scale_factor.
                let new_capacity = compute_new_capacity(&self.client, action).await?;

                self.client
                    .set_desired_capacity()
                    .auto_scaling_group_name(&action.resource_id)
                    .desired_capacity(new_capacity)
                    .send()
                    .await?;
            },
            _ => { /* ... handle Down and Maintain */ }
        }

        Ok(true)
    }
}
```

### Prometheus Metrics Integration

```rust
use async_trait::async_trait;
use lighthouse::{CallbackContext, LighthouseResult, MetricsProvider, ResourceMetrics};

struct PrometheusProvider {
    client: reqwest::Client,
    base_url: String,
}

#[async_trait]
impl MetricsProvider for PrometheusProvider {
    async fn get_metrics(
        &self,
        resource_id: &str,
        _context: &CallbackContext,
    ) -> LighthouseResult<Option<ResourceMetrics>> {
        let cpu_query = format!(
            "avg(rate(container_cpu_usage_seconds_total{{container_label_app='{}'}}[5m])) * 100",
            resource_id
        );

        // (Assumes a From<reqwest::Error> conversion into LighthouseError.)
        let response = self.client
            .get(format!("{}/api/v1/query", self.base_url))
            .query(&[("query", &cpu_query)])
            .send()
            .await?;

        // Parse the Prometheus response and build a ResourceMetrics value
        // (see the parsing sketch below).
        todo!()
    }
}
```
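
Prometheus's instant-query endpoint returns JSON shaped like `{"status":"success","data":{"result":[{"metric":{...},"value":[<ts>, "<value>"]}]}}`. A minimal parsing sketch with `serde_json` (the dependency and its wiring into `get_metrics` above are assumptions; errors are collapsed into `Option`):

```rust
use serde_json::Value;

/// Extract the scalar from a Prometheus instant-query vector response.
/// Returns None if the result set is empty or the shape is unexpected.
fn parse_instant_query(body: &str) -> Option<f64> {
    let json: Value = serde_json::from_str(body).ok()?;
    // data.result[0].value is [ <unix timestamp>, "<value as a string>" ]
    json.get("data")?
        .get("result")?
        .get(0)?
        .get("value")?
        .get(1)?
        .as_str()?
        .parse()
        .ok()
}
```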

## 📊 Monitoring & Observability

Lighthouse provides comprehensive observability out of the box:

```mermaid
flowchart LR
    A[Lighthouse Engine] --> B[Scaling Events]
    B --> C[ScalingObserver]

    C --> D[Structured Logs]
    C --> E[Metrics Export]
    C --> F[Custom Alerts]

    D --> G[Application Logs]
    E --> H[Prometheus/Grafana]
    F --> I[PagerDuty/Slack]

    style A fill:#e3f2fd
    style C fill:#f1f8e9
```

### Custom Observer Implementation

```rust
use lighthouse::{CallbackContext, LighthouseResult, ScaleAction, ScalingObserver};
use tracing::{error, info, warn};

// Placeholder clients - substitute your own metrics and alerting types here.
struct ProductionObserver {
    metrics_client: MetricsClient,
    alert_manager: AlertManager,
}

#[async_trait::async_trait]
impl ScalingObserver for ProductionObserver {
    async fn on_scaling_decision(
        &self,
        action: &ScaleAction,
        context: &CallbackContext,
    ) -> LighthouseResult<()> {
        // Structured logging
        info!(
            resource_id = %action.resource_id,
            direction = ?action.direction,
            confidence = %action.confidence,
            reason = %action.reason,
            "Scaling decision made"
        );
        
        // Export metrics
        self.metrics_client.increment_counter(
            "lighthouse_scaling_decisions_total",
            &[
                ("resource", &action.resource_id),
                ("direction", &format!("{:?}", action.direction))
            ]
        );
        
        Ok(())
    }
    
    async fn on_scaling_executed(
        &self,
        action: &ScaleAction,
        success: bool,
        _context: &CallbackContext,
    ) -> LighthouseResult<()> {
        if success {
            info!("✅ Successfully scaled {}", action.resource_id);
        } else {
            warn!("⚠️ Scaling was skipped for {}", action.resource_id);
            
            // Send alert when scaling did not go through
            self.alert_manager.send_alert(&format!(
                "Lighthouse skipped scaling {}", action.resource_id
            )).await?;
        }
        
        Ok(())
    }
    
    async fn on_scaling_error(
        &self,
        action: &ScaleAction,
        error: &lighthouse::LighthouseError,
        _context: &CallbackContext,
    ) -> LighthouseResult<()> {
        error!(
            resource_id = %action.resource_id,
            error = %error,
            "Critical scaling error"
        );
        
        // Critical alert
        self.alert_manager.send_critical_alert(&format!(
            "Lighthouse critical error for {}: {}", action.resource_id, error
        )).await?;
        
        Ok(())
    }
}
```

## 🎮 Running Examples

The repository includes comprehensive examples for different use cases:

```bash
# Basic functionality
cargo run --example basic_usage

# Historical data storage and analysis
cargo run --example metrics_persistence --features "metrics-persistence time-utils"

# ML-based proactive scaling
cargo run --example predictive_scaling --features "predictive-scaling time-utils"

# Advanced policy composition
cargo run --example advanced_policies --features "time-utils"

# All features enabled
cargo run --example [example_name] --features "full"
```

### Available Examples

| Example | Description | Features Required |
|---------|-------------|-------------------|
| `basic_usage.rs` | Core concepts and simple setup | None |
| `metrics_persistence.rs` | Historical data storage with trend analysis | `metrics-persistence`, `time-utils` |
| `predictive_scaling.rs` | ML forecasting with multiple algorithms | `predictive-scaling`, `time-utils` |
| `advanced_policies.rs` | Complex policy composition and scheduling | `time-utils` |

## 🚀 Production Deployment

### Configuration Best Practices

```rust
use lighthouse::{LighthouseConfig, ResourceConfig};

fn create_production_config() -> LighthouseConfig {
    LighthouseConfig::builder()
        // Balance responsiveness vs. overhead
        .evaluation_interval(30)

        .add_resource_config("web-tier", ResourceConfig {
            resource_type: "kubernetes-deployment".to_string(),
            policies: vec![
                // Conservative thresholds for production
                lighthouse::policies::multi_metric_policy(
                    "production-scaling",
                    (75.0, 25.0),  // CPU thresholds
                    (80.0, 30.0),  // Memory thresholds
                    1.3,           // Modest scaling factor
                    300            // 5-minute cooldown
                )
            ],
            default_policy: Some("production-scaling".to_string()),
            settings: [
                ("min_replicas".to_string(), "3".to_string()),  // HA minimum
                ("max_replicas".to_string(), "50".to_string()),
                ("environment".to_string(), "production".to_string()),
            ].into(),
        })

        // Global settings for tracking and debugging
        .global_setting("cluster", "production-us-west")
        .global_setting("team", "platform")
        .global_setting("cost_center", "engineering")

        .enable_logging(true)
        .build()
}
```


### Performance Characteristics

| Metric | Value | Notes |
|--------|-------|-------|
| **Memory Overhead** | < 50MB | Base engine with 100 resources |
| **CPU Usage** | < 1% | During normal operation |
| **Evaluation Latency** | < 100ms | Per resource per evaluation |
| **Storage Growth** | ~1MB/day | With persistence (1000 metrics/day) |
| **Scalability** | 10,000+ resources | Tested configuration |

---

## 🤝 Contributing

We welcome contributions! Lighthouse thrives on community input and diverse use cases.

### Ways to Contribute

- 🐛 **Report Bugs** - Found an issue? Let us know!
- ✨ **Feature Requests** - Have an idea? We'd love to hear it!
- 📝 **Documentation** - Help improve our docs and examples
- 🔧 **Code Contributions** - Bug fixes, new features, optimizations
- 🏗️ **Platform Integrations** - Add support for new infrastructure platforms
- 📊 **Monitoring Integrations** - Connect with more observability tools

### Development Setup

```bash
# Clone the repository
git clone https://github.com/tristanpoland/lighthouse.git
cd lighthouse

# Install dependencies
cargo build

# Run tests  
cargo test --features "full"

# Run examples
cargo run --example basic_usage
```

### Before Contributing

1. Check existing issues and PRs
2. Read our Contributing Guide
3. Ensure tests pass and code follows our style guidelines
4. Add examples for new features when possible

## 🔍 Feature Matrix

| Feature | Basic | + Persistence | + Predictive | Full |
|---------|-------|---------------|--------------|------|
| Core Scaling | ✅ | ✅ | ✅ | ✅ |
| Policy Composition | ✅ | ✅ | ✅ | ✅ |
| Time-Based Policies | 🟡¹ | 🟡¹ | 🟡¹ | ✅ |
| Historical Storage | ❌ | ✅ | ✅ | ✅ |
| Trend Analysis | ❌ | ✅ | ✅ | ✅ |
| Predictive Scaling | ❌ | ❌ | ✅ | ✅ |
| Anomaly Detection | ❌ | ❌ | ✅ | ✅ |
| Statistical Reports | ❌ | ✅ | ✅ | ✅ |
| Seasonal Patterns | ❌ | ❌ | ❌ | ✅ |

¹ Basic time-based policies require the `time-utils` feature

### Feature Flags

```toml
[dependencies]
lighthouse = { version = "0.1", features = [...] }
```

- `metrics-persistence` - SQLite storage, historical analysis, trend detection
- `predictive-scaling` - ML forecasting, proactive recommendations, anomaly alerts
- `time-utils` - Time-based policies, scheduling, seasonal analysis
- `prometheus-metrics` - Prometheus integration and metrics export
- `config-toml` / `config-yaml` - Configuration file format support
- `full` - All features enabled (recommended for new projects)

## 📄 License

Lighthouse is dual-licensed under MIT OR Apache-2.0, the same as Rust itself.

This means you can use Lighthouse in any project, commercial or open source, with maximum flexibility.


## 🚀 Get Started Now

```bash
# Add to your project
cargo add lighthouse --features full

# Or start with an example
git clone https://github.com/tristanpoland/lighthouse.git
cd lighthouse
cargo run --example basic_usage
```

**Ready to scale intelligently?** Dive into the examples and see Lighthouse in action!


---

🚀 Get Started · 🐛 Report Issue · 💬 Discussions

Built with ❤️ in Rust for the modern infrastructure era.
