Claim: When your best execution path degrades, Kalibr routes around it automatically. Hardcoded systems keep failing until a human intervenes.
This is execution path routing, not model routing.
Each path is a complete strategy:
| Path ID | Model | Tool |
|---|---|---|
| gpt4o-serper | gpt-4o | Serper |
| gpt4o-tavily | gpt-4o | Tavily |
| gpt4o-mini-tavily | gpt-4o-mini | Tavily |
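
To make the "complete strategy" idea concrete, here is a minimal sketch of the three paths as plain data. The `ExecutionPath` class and the `paths_using` helper are illustrative names, not part of Kalibr's API; only the path IDs, models, and tools come from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExecutionPath:
    """One complete strategy: a model paired with a search tool (hypothetical type)."""
    path_id: str
    model: str
    tool: str


# The three paths listed in the table.
PATHS = [
    ExecutionPath("gpt4o-serper", "gpt-4o", "Serper"),
    ExecutionPath("gpt4o-tavily", "gpt-4o", "Tavily"),
    ExecutionPath("gpt4o-mini-tavily", "gpt-4o-mini", "Tavily"),
]


def paths_using(tool: str) -> list[str]:
    """Return the IDs of every path that depends on the given tool."""
    return [p.path_id for p in PATHS if p.tool == tool]
```

Because a path bundles model and tool together, a Serper outage affects only `gpt4o-serper`, while both Tavily paths remain candidates.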
The benchmark task is a 5-step research agent:
- Plan → Generate search queries (LLM)
- Search → Call Serper or Tavily API
- Extract → Pull facts with sources (LLM)
- Synthesize → Write cited answer (LLM)
- Validate → Verify citations
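
The five steps can be sketched as a small pipeline. This is an assumption-laden outline, not the code in `resilience_benchmark.py`: the step functions are passed in as callables, the `fake_*` stubs stand in for real LLM and search calls, and the `[source]` citation format is invented for the example.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass
class Fact:
    text: str
    source: str


def validate(answer: str, facts: list[Fact]) -> bool:
    """Validate: every [source] cited in the answer must come from an extracted fact."""
    cited = re.findall(r"\[(.+?)\]", answer)
    known = {f.source for f in facts}
    return bool(cited) and all(c in known for c in cited)


def run_research_agent(
    question: str,
    plan: Callable[[str], list[str]],
    search: Callable[[str], list[dict]],
    extract: Callable[[list[dict]], list[Fact]],
    synthesize: Callable[[str, list[Fact]], str],
) -> tuple[str, bool]:
    queries = plan(question)                             # Plan (LLM)
    hits = [hit for q in queries for hit in search(q)]   # Search (Serper or Tavily)
    facts = extract(hits)                                # Extract (LLM)
    answer = synthesize(question, facts)                 # Synthesize (LLM)
    return answer, validate(answer, facts)               # Validate


# Hypothetical stubs so the sketch runs without API keys.
def fake_plan(question: str) -> list[str]:
    return [question]


def fake_search(query: str) -> list[dict]:
    return [{"url": "https://example.com/a", "snippet": f"About {query}"}]


def fake_extract(hits: list[dict]) -> list[Fact]:
    return [Fact(h["snippet"], h["url"]) for h in hits]


def fake_synthesize(question: str, facts: list[Fact]) -> str:
    return " ".join(f"{f.text} [{f.source}]" for f in facts)
```

Calling `run_research_agent("kalibr", fake_plan, fake_search, fake_extract, fake_synthesize)` returns the answer text and a boolean indicating whether the citations checked out. Only the Search step depends on the tool, so a failing search API makes the whole task fail even when the LLM steps are healthy.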
| Phase | Tasks | Description |
|---|---|---|
| Learning | 15 | Normal operation |
| Degraded | 25 | Serper fails 70% of the time |
| Recovery | 10 | Measure adaptation |
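
The phase schedule can be sketched as a per-task failure rate for Serper. The task counts and the 70% figure come from the table; keeping Serper degraded during the Recovery phase is an assumption inferred from the hardcoded results, and the function names are illustrative.

```python
import random

# (phase name, number of tasks, Serper failure rate).
# Assumption: Serper stays degraded during "recovery".
PHASES = [
    ("learning", 15, 0.0),
    ("degraded", 25, 0.7),
    ("recovery", 10, 0.7),
]


def build_schedule() -> list[tuple[str, float]]:
    """Expand the phases into one (phase, failure_rate) entry per task."""
    return [(name, rate) for name, count, rate in PHASES for _ in range(count)]


def simulated_serper_call(failure_rate: float, rng: random.Random) -> bool:
    """Return True if the simulated Serper call succeeds."""
    return rng.random() >= failure_rate
```

Passing a seeded `random.Random` instance keeps a run reproducible, which helps when comparing the hardcoded and routed configurations on the same failure sequence.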
| Phase | Hardcoded | Kalibr | Delta |
|---|---|---|---|
| Learning | 100% | 100% | +0% |
| Degraded | ~25% | ~90% | +65% |
| Recovery | ~25% | ~100% | +75% |
Kalibr routes to healthy Tavily paths. Hardcoded keeps failing.
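
The contrast can be illustrated with a toy router that tracks recent outcomes per path and picks the one with the best recent success rate. This is not Kalibr's routing algorithm, which this document does not describe; `WindowedRouter` and its window size are hypothetical, chosen only to show the idea of routing on observed outcomes.

```python
from collections import deque


class WindowedRouter:
    """Toy router: choose the path with the best success rate over a sliding window."""

    def __init__(self, path_ids: list[str], window: int = 5) -> None:
        # One bounded history of True/False outcomes per path.
        self.history = {p: deque(maxlen=window) for p in path_ids}

    def success_rate(self, path_id: str) -> float:
        outcomes = self.history[path_id]
        # Untried paths are treated optimistically so they can be selected.
        return sum(outcomes) / len(outcomes) if outcomes else 1.0

    def choose(self) -> str:
        # Ties go to the path listed first.
        return max(self.history, key=self.success_rate)

    def record(self, path_id: str, ok: bool) -> None:
        self.history[path_id].append(ok)
```

A hardcoded system is equivalent to always returning `gpt4o-serper` from `choose()`. With the router, a few recorded Serper failures are enough for `choose()` to return a Tavily path instead. A production router would also need some exploration to notice when a degraded path recovers; this sketch omits that.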
```bash
pip install -r requirements.txt

export KALIBR_API_KEY=your-key
export KALIBR_TENANT_ID=your-tenant
export OPENAI_API_KEY=your-key
export SERPER_API_KEY=your-key
export TAVILY_API_KEY=your-key

python resilience_benchmark.py
python resilience_benchmark.py --quick  # faster
python resilience_benchmark.py --full   # more tasks
```

License: MIT