Implement dynamic model routing based on provider health monitoring

## Summary
Implement dynamic, per-model failover for model routing: when a specific model starts failing on its primary provider, automatically promote the fallback provider for that model, while keeping Tinfoil as the preferred primary whenever it is healthy.

## Current Behavior
- Tinfoil is hardcoded as primary for all models it supports
- Failover to Continuum happens only per request, after a timeout of 90+ seconds
- Temporary hardcoded swap for `llama3-3-70b` to use Continuum as primary (see src/proxy_config.rs:342-366)

## Desired Behavior
1. **Default Priority**: Tinfoil should remain primary for all models it supports when healthy
2. **Health Detection**: Monitor model-specific health per provider
3. **Automatic Failover**: When a specific model on Tinfoil fails, automatically promote Continuum as primary for that model only
4. **Recovery**: Periodically check if Tinfoil has recovered and restore it as primary
5. **Granular Control**: Track health per model on each provider, not per provider as a whole (e.g., `llama3-3-70b` might be down on Tinfoil while `deepseek-r1-0528` works fine)

## Proposed Implementation

### 1. Health Tracking System
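```rust
use chrono::{DateTime, Utc};

/// Health record for one model on one provider.
struct ModelHealth {
    provider: String,
    model_id: String,
    /// Failures since the last success; compared against the failure threshold.
    consecutive_failures: u32,
    last_failure: Option<DateTime<Utc>>,
    last_success: Option<DateTime<Utc>>,
    /// Cached verdict used when choosing the primary provider.
    is_healthy: bool,
}
```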

### 2. Dynamic Route Adjustment
- Track failures in `src/web/openai.rs` when handling requests
- Update model routes in ProxyRouter based on health status
- Threshold-based switching (e.g., 3 consecutive failures = unhealthy)
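
The threshold logic could look roughly like the sketch below. It is only an illustration: `HealthCounter` and its methods are hypothetical names, not existing code, and the thresholds are passed in so they can come from configuration. Each method returns `true` only on the call that flips the state, which is the point where a route update and a log line would be triggered.

```rust
/// Hypothetical per-(provider, model) counter; not taken from the codebase.
#[derive(Debug, Default)]
pub struct HealthCounter {
    consecutive_failures: u32,
    consecutive_successes: u32,
    unhealthy: bool,
}

impl HealthCounter {
    /// Records a failed request. Returns true if this call marked the model unhealthy.
    pub fn record_failure(&mut self, failure_threshold: u32) -> bool {
        // Any failure interrupts a recovery streak.
        self.consecutive_successes = 0;
        self.consecutive_failures = self.consecutive_failures.saturating_add(1);
        if !self.unhealthy && self.consecutive_failures >= failure_threshold {
            self.unhealthy = true;
            return true;
        }
        false
    }

    /// Records a successful request. Returns true if this call marked the model healthy again.
    pub fn record_success(&mut self, recovery_threshold: u32) -> bool {
        self.consecutive_failures = 0;
        if !self.unhealthy {
            return false;
        }
        self.consecutive_successes = self.consecutive_successes.saturating_add(1);
        if self.consecutive_successes >= recovery_threshold {
            self.unhealthy = false;
            self.consecutive_successes = 0;
            return true;
        }
        false
    }

    pub fn is_healthy(&self) -> bool {
        !self.unhealthy
    }
}
```

Requiring several consecutive successes before recovery avoids flapping between providers when a model is only intermittently available.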

### 3. Health Check Strategies
**Option A: Passive Monitoring**
- Track actual request failures/successes
- No additional traffic, but slower to detect recovery
- Could implement exponential backoff for retry attempts

**Option B: Active Health Checks**
- Periodic lightweight test requests to each model
- Faster recovery detection
- Additional traffic/cost considerations

**Option C: Hybrid Approach**
- Passive monitoring for failure detection
- Active health checks only for models marked unhealthy
- Balance between responsiveness and efficiency
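
For the hybrid option, active probes against an unhealthy model could be spaced with exponential backoff so a long outage does not generate steady probe traffic. A minimal sketch, assuming a hypothetical `probe_delay` helper and caller-supplied base and maximum delays:

```rust
use std::time::Duration;

/// Delay before recovery probe number `attempt` (0-based).
/// Doubles from `base` on each attempt and is capped at `max`.
pub fn probe_delay(base: Duration, max: Duration, attempt: u32) -> Duration {
    // checked_pow guards against overflow for large attempt counts.
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).unwrap_or(max).min(max)
}
```

With a 30-second base and a 300-second cap, the delays would be 30, 60, 120, 240, then 300 seconds for every later attempt.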

### 4. Integration Points
- Modify `ProxyRouter::refresh_cache()` to consider health status
- Add health tracking to request handlers in `src/web/openai.rs`
- Store health state in memory with optional persistence
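
One possible shape for the in-memory state is a shared registry that the request handlers write to and the router reads from. The sketch below is an assumption, not the existing design: `HealthRegistry` and `route_order` are hypothetical, provider names are plain strings, and it uses `std::sync::RwLock` (an async-aware lock may fit better if the lock would be held across `.await` points).

```rust
use std::collections::HashSet;
use std::sync::{Arc, RwLock};

/// Hypothetical shared registry of unhealthy (provider, model_id) pairs.
/// Pairs that are absent are treated as healthy.
#[derive(Clone, Default)]
pub struct HealthRegistry {
    unhealthy: Arc<RwLock<HashSet<(String, String)>>>,
}

impl HealthRegistry {
    pub fn set_healthy(&self, provider: &str, model_id: &str, healthy: bool) {
        let key = (provider.to_string(), model_id.to_string());
        // unwrap panics only if another thread panicked while holding the lock.
        let mut set = self.unhealthy.write().unwrap();
        if healthy {
            set.remove(&key);
        } else {
            set.insert(key);
        }
    }

    pub fn is_healthy(&self, provider: &str, model_id: &str) -> bool {
        let key = (provider.to_string(), model_id.to_string());
        !self.unhealthy.read().unwrap().contains(&key)
    }

    /// Returns (primary, fallback) for one model. The preferred provider stays
    /// primary unless it is unhealthy for this model and the alternate is healthy.
    pub fn route_order<'a>(
        &self,
        model_id: &str,
        preferred: &'a str,
        alternate: &'a str,
    ) -> (&'a str, &'a str) {
        if self.is_healthy(preferred, model_id) || !self.is_healthy(alternate, model_id) {
            (preferred, alternate)
        } else {
            (alternate, preferred)
        }
    }
}
```

A cache refresh could call `route_order(model, "tinfoil", "continuum")` for each model, so a failure on `llama3-3-70b` would not change routing for any other model.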

### 5. Configuration
```toml
[model_health]
failure_threshold = 3  # Consecutive failures before marking unhealthy
recovery_check_interval = 300  # Seconds between health checks for failed models
recovery_threshold = 2  # Consecutive successes before marking healthy
```

## Implementation Notes
- Handle connection errors (502s, timeouts) and model-specific errors as separate cases
- Consider implementing circuit breaker pattern for more sophisticated failure handling
- Log all provider switches for debugging/monitoring
- Ensure thread-safe access to health state
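
To make the distinction between error classes concrete, a classifier along the following lines could sit in front of the health counters. The mapping is an assumption for illustration: only 502 and timeouts are named in this issue, so treating 503/504 as connection-level, other 5xx as model-specific, and everything else as not counting against health is a guess that should be checked against real provider responses. `FailureKind` and `classify` are hypothetical names.

```rust
/// Hypothetical classification of a provider response for health tracking.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FailureKind {
    /// Provider unreachable: timeout, no response, or gateway-level error.
    Connection,
    /// Provider reachable, but the request to this model failed server-side.
    ModelSpecific,
    /// Success or a client-side error; should not count against model health.
    NotAFailure,
}

/// `status` is the HTTP status if a response arrived; `timed_out` is set by the caller.
pub fn classify(status: Option<u16>, timed_out: bool) -> FailureKind {
    if timed_out {
        return FailureKind::Connection;
    }
    match status {
        None => FailureKind::Connection,
        Some(502..=504) => FailureKind::Connection, // assumed gateway-level codes
        Some(500..=599) => FailureKind::ModelSpecific, // assumed model-level codes
        Some(_) => FailureKind::NotAFailure,
    }
}
```

Keeping client errors out of the failure count prevents a burst of malformed requests from demoting a healthy provider.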

## Acceptance Criteria
- [ ] Models automatically fail over to Continuum when Tinfoil has issues
- [ ] Failed models automatically recover when Tinfoil is healthy again
- [ ] No manual intervention required for provider switching
- [ ] Health status visible in logs/metrics
- [ ] Configurable thresholds and intervals
- [ ] No performance degradation for healthy models

## Related Code
- `src/proxy_config.rs`: Current routing logic
- `src/web/openai.rs`: Request handling and fallback logic
- Current temporary fix at src/proxy_config.rs:342-366

## Priority
High - This directly impacts user experience when Tinfoil has intermittent issues
