Skip to content

Bahtya/kestrel-agent

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

233 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Kestrel Agent Logo

Kestrel Agent

A fast, streaming-first AI agent framework built in Rust

CI Rust License Crates

A fast, streaming-first AI agent framework built in Rust — connect any platform to any LLM with built-in memory, skills, and self-evolution.


Features

  • Multi-platform channels — Telegram, Discord, WebSocket, OpenAI-compatible HTTP API
  • Streaming responses — SSE streaming for real-time token delivery
  • Tool system — shell, web, filesystem, cron, search, message, spawn
  • Agent loop — context management, memory, hooks, and context compaction
  • Sub-agent spawning — parallel agent tasks via tokio JoinSet
  • Cron scheduling — tick-based scheduler with JSON state persistence
  • Health checks — registry-based checks with auto-restart and exponential backoff
  • Skill system — TOML manifests, hot-reload, SkillCompiler, runtime skill injection
  • Tiered memoryMemoryStore trait with HotStore (L1 in-memory) and WarmStore (L2 LanceDB vectors)
  • Learning & evolutionLearningEvent bus, event processors, prompt assembly from observations
  • Unified TraceID — cross-channel trace IDs (kst_{channel}_{id}) for end-to-end request tracking
  • Provider resilience — automatic retry with exponential backoff on 429s
  • SSRF protection — network allowlist/denylist, URL validation, sandboxed exec
  • Native daemon mode — double-fork daemonization, PID file with flock, signal handling (SIGTERM/SIGINT/SIGHUP), graceful shutdown with log flushing, log rotation (daily)

Architecture

                          ┌──────────────────────────────┐
                          │         CLI (clap)           │
                          │  agent · gateway · serve ·   │
                          │  daemon · heartbeat · setup · │
                          │  status                      │
                          └──────────────┬───────────────┘
                                         │
                 ┌───────────────────────┼───────────────────────┐
                 │                       │                       │
         ┌───────▼──────┐    ┌──────────▼──────────┐   ┌───────▼──────┐
         │   Telegram   │    │      Gateway        │   │  API Server  │
         │  (polling)   │    │  (ChannelManager)   │   │   (Axum)     │
         └───────┬──────┘    └──────────┬──────────┘   └───────┬──────┘
         ┌───────┴──────┐              │                       │
         │   Discord    │              │                       │
         │ (WebSocket)  │              │                       │
         └───────┬──────┘              │                       │
         ┌───────┴──────┐              │                       │
         │   WebSocket  │              │                       │
         │   (server)   │              │                       │
         └───────┬──────┘              │                       │
                 │                     │                       │
                 └─────────┬───────────┘───────────────────────┘
                           │
                  InboundMessage │ Bus (tokio broadcast)
                           │
                  ┌────────▼────────┐
                  │    Agent Loop    │
                  │  ┌────────────┐ │
                  │  │  Context   │ │
                  │  │  Memory    │ │
                  │  │  Skills    │ │
                  │  │  Hooks     │ │
                  │  └─────┬──────┘ │
                  └────────┼────────┘
                           │
              ┌────────────┼────────────┐
              │            │            │
      ┌───────▼──────┐ ┌──▼───────┐ ┌──▼────────────┐
      │  Providers   │ │  Tools   │ │  Sub-agents   │
      │              │ │          │ │               │
      │  · OpenAI    │ │  · shell │ │  · parallel   │
      │  · Anthropic │ │  · web   │ │    spawning   │
      │  · DeepSeek  │ │  · fs    │ │  · isolated   │
      │  · Groq      │ │  · cron  │ │    contexts   │
      │  · Ollama    │ │  · search│ │               │
      └──────────────┘ │  · spawn │ └───────────────┘
                       └──────────┘
                           │
                  OutboundMessage │ Bus
                           │
                  ┌────────▼────────┐
                  │   Channel →     │
                  │   User Response │
                  └─────────────────┘

  ── Evolution Layer ────────────────────────────────────
  LearningEvent → EventBus → Processors → (SkillCreate / MemoryUpdate / PromptAdjust)

  ── Foundation Layer ───────────────────────────────────
  kestrel-core · kestrel-config · kestrel-bus
  kestrel-session · kestrel-security · kestrel-providers
  kestrel-cron · kestrel-heartbeat · kestrel-daemon
  kestrel-memory · kestrel-skill · kestrel-learning

Quick Start

Prerequisites

Rust 1.75+ and protobuf-compiler (required by LanceDB).

# Fedora / RHEL
sudo dnf install protobuf-compiler gcc

# Ubuntu / Debian
sudo apt install protobuf-compiler build-essential

Optional — mold linker for faster release linking:

# Fedora
sudo dnf install mold
# Then add to .cargo/config.toml:
# [target.x86_64-unknown-linux-gnu]
# linker = "clang"
# rustflags = ["-C", "link-arg=-fuse-ld=mold"]

The project ships a .cargo/config.toml with profile optimizations (thin LTO, dependency pre-optimization in dev mode, symbol stripping in release).

Build

cargo build --release

Configure

kestrel setup
# Edit ~/.kestrel/config.yaml with your API keys

Run

# Interactive agent (one-shot)
kestrel agent "Summarize the latest commits"

# Start gateway (Telegram + Discord + WebSocket)
kestrel gateway

# Start API server
kestrel serve --port 8080

# Periodic health checking
kestrel heartbeat

# Show system status
kestrel status

# Start as daemon (background, double-fork, PID file + flock)
kestrel daemon start

# Check status (auto-cleans stale PID files from crashed instances)
kestrel daemon status

# Stop gracefully (SIGTERM, configurable grace period)
kestrel daemon stop

# Restart (stop + re-exec)
kestrel daemon restart

Environment variable KESTREL_HOME overrides the default config directory (~/.kestrel).

Configuration

# ~/.kestrel/config.yaml

providers:
  openai:
    api_key: ${OPENAI_API_KEY}
    model: gpt-4o
    base_url: https://api.openai.com/v1   # optional: point to any OpenAI-compatible API
  anthropic:
    api_key: ${ANTHROPIC_API_KEY}
    model: claude-sonnet-4-6
  openrouter:
    api_key: ${OPENROUTER_API_KEY}
    model: anthropic/claude-sonnet-4-6
  ollama:
    base_url: http://localhost:11434/v1
    model: llama3
  deepseek:
    api_key: ${DEEPSEEK_API_KEY}
    model: deepseek-chat
  groq:
    api_key: ${GROQ_API_KEY}
    model: llama-3.3-70b-versatile
  gemini:
    api_key: ${GEMINI_API_KEY}
    model: gemini-2.0-flash
  # no_proxy: true              # skip proxy for domestic APIs (e.g. ZAI, Qwen)

channels:
  telegram:
    token: ${TELEGRAM_BOT_TOKEN}
    allowed_users: ["123456789"]         # optional: restrict to user IDs
    admin_users: ["123456789"]           # optional: admin user IDs
    enabled: true
    streaming: false                     # telegram doesn't support token-by-token
    proxy: ""                            # optional: http/socks5 proxy URL
  discord:
    token: ${DISCORD_BOT_TOKEN}
    allowed_guilds: ["111222333"]        # optional: restrict to guild IDs
    enabled: true
  websocket:
    enabled: true
    listen_addr: "127.0.0.1:8090"
    auth:
      required: true
      token: "my-secret"
    max_clients: 100
    max_message_size: 1048576

agent:
  model: gpt-4o
  temperature: 0.7
  max_tokens: 4096
  max_iterations: 50                    # tool loop limit
  streaming: true
  tool_timeout: 120                     # seconds per tool execution
  system_prompt: "You are a helpful AI assistant."  # optional override
  workspace: /tmp/workspace             # optional: default working directory

dream:                                  # memory consolidation
  enabled: true
  interval_secs: 7200                   # consolidate every 2h
  model: gpt-4o-mini                    # optional: use cheaper model

heartbeat:
  enabled: false
  interval_secs: 1800

cron:
  enabled: false
  state_file: ~/.kestrel/cron_state.json
  tick_secs: 60

security:
  block_private_ips: true               # block RFC1918 by default
  ssrf_whitelist: []                    # allowed IP ranges for outbound
  blocked_networks: []                  # additional blocked ranges

api:
  host: 0.0.0.0
  port: 8080
  allowed_origins: ["*"]                # CORS origins
  max_body_size: 10485760               # 10 MB

daemon:
  pid_file: ~/.kestrel/kestrel.pid
  log_dir: ~/.kestrel/logs
  working_directory: /
  grace_period_secs: 30

# optional: custom system prompt additions
custom_instructions: "Always respond in English."

# optional: agent identity
name: "Kestrel"

# optional: MCP tool servers
# mcp_servers:
#   filesystem:
#     transport: stdio
#     command: "mcp-filesystem"
#     args: ["--root", "/data"]

# optional: custom provider endpoints
# custom_providers:
#   - name: my_provider
#     base_url: https://my-api.com/v1
#     api_key: key123
#     model_patterns: ["my-model"]

notifications:
  online_notify: true
  # notify_chat_id: "-1001234567890"    # which chat receives the ping
  online_message: "Kestrel v{version} online — {channel} connected"

Environment variables in values (${VAR}) are expanded at load time.

CLI Commands

Command Description
agent Interactive agent — send a message and get a response
gateway Start the gateway — connect to Telegram, Discord, etc.
serve OpenAI-compatible HTTP API server (Axum)
heartbeat Periodic health checking with auto-restart
health Show health check status
cron list List all cron jobs
cron status Show status of a specific cron job
config validate Validate the config.yaml schema
config migrate Migrate Python kestrel config to kestrel format
setup Interactive configuration wizard
status Show current configuration and system status
daemon start/stop/restart/status Native Unix daemon: double-fork, PID file (flock), SIGTERM/SIGINT/SIGHUP, log rotation

Crates

Crate Description
kestrel-core Error types, constants, core types (MessageType, Platform)
kestrel-config YAML config loading, schema validation, path resolution
kestrel-bus Tokio broadcast-based async message bus
kestrel-session SQLite-backed session and conversation store
kestrel-security Network allowlist/denylist, command approval, SSRF protection
kestrel-providers LLM provider trait — OpenAI-compatible and Anthropic SSE streaming
kestrel-tools Tool registry + builtins (shell, web, fs, search, cron, spawn, message)
kestrel-agent Agent loop, context builder, memory, skills, hooks, sub-agents
kestrel-cron Tick-based cron scheduler with JSON state persistence
kestrel-heartbeat Health check registry, periodic task monitoring, auto-restart
kestrel-channels Platform adapters — Telegram, Discord, WebSocket — via ChannelManager
kestrel-api OpenAI-compatible HTTP API server (Axum)
kestrel-daemon Unix daemon: double-fork, PID file (flock), signal handling, file logging
kestrel-memory MemoryStore trait, HotStore (L1 in-memory), WarmStore/LanceDB (L2 vectors)
kestrel-skill Skill trait, TOML manifests, SkillRegistry, SkillCompiler
kestrel-learning LearningEvent bus, event processors, prompt assembly

Stats

Metric Value
Rust source files 151
Lines of Rust code ~105,200
Crates 16
Minimum Rust version 1.75

Build Performance

Tested on v0.1.1 (AMD Ryzen 7 6800U, 8GB RAM, Fedora 45, rustc 1.94.1):

Metric Value
Clean build (release) 7m 52s
Crates compiled 487
Binary size (stripped) 17M
Linker mold 2.41.0 via clang
LTO thin
Parallel jobs 4 (RAM-limited)

API

kestrel exposes an OpenAI-compatible HTTP API. Start with kestrel serve:

kestrel serve --port 8080

List models

curl http://localhost:8080/v1/models

Chat completion (non-streaming)

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is 2+2?"}
    ],
    "temperature": 0.7,
    "max_tokens": 256
  }'

Chat completion (SSE streaming)

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": true
  }'

Health and readiness

curl http://localhost:8080/health    # detailed health snapshot
curl http://localhost:8080/ready     # 200=ready, 503=not ready (Kubernetes probe)

Endpoints

Endpoint Method Auth Description
/v1/chat/completions POST Bearer token (if configured) OpenAI-compatible chat (streaming + non-streaming)
/v1/models GET No List available models
/health GET No Health check with component details
/ready GET No Readiness probe (200/503)

Troubleshooting

Build error: protoc not found

LanceDB requires the protobuf compiler.

# Fedora / RHEL
sudo dnf install protobuf-compiler

# Ubuntu / Debian
sudo apt install protobuf-compiler
Build error: linker 'mold' not found

Remove or update the linker override in .cargo/config.toml. The shipped config does not require mold — this only happens if you added the optional mold config without installing it.

kestrel serve fails with "address already in use"

Another process is using port 8080. Either stop it or override the port:

kestrel serve --port 9090
# or set in config:
# api:
#   port: 9090
Clippy warnings on cargo clippy --workspace

Run cargo clippy --workspace --fix to auto-fix trivial issues. For persistent warnings, ensure you're on Rust 1.75+ and all dependencies are up to date (cargo update).

Development

# Build everything
cargo build --workspace

# Run all tests
cargo test --workspace

# Lint (must pass with 0 warnings)
cargo clippy --workspace -- -D warnings

# Format check
cargo fmt --all --check

# Quick compile check
cargo check

Design Principles

  • Thin harness, fat skills — Harness handles the loop, files, context, and safety. Complexity lives in skill files.
  • Latent vs deterministic — Judgment goes to the model; parsing and validation stay in code. Never mix the two.
  • Context engineering — JIT loading, compaction, and structured notes to stay within the context window.
  • Fewer, better tools — Consolidated operations with token-efficient returns and poka-yoke defaults.
  • LanceDB over SQLite FTS5 — Semantic vector search for memory and session recall.
  • TOML over YAML — Rust-native parsing for skill manifests and configuration.

Contributing

  1. Fork the repository and create a feature branch.
  2. Ensure cargo test --workspace and cargo clippy --workspace pass with zero warnings.
  3. Add tests for any new functionality — assertions must be deterministic (no LLM output in test expectations).
  4. Add /// doc comments on all pub functions.
  5. Open a pull request against main.

CI runs format checks, clippy, build, tests, and a security audit on every push.

License

This project is licensed under the MIT License.

About

Kestrel Agent — A fast, streaming-first Rust AI agent framework with built-in memory, skills, and self-evolution

Resources

Stars

Watchers

Forks

Packages

 
 
 

Contributors

Languages