Rapid AI Experimentation Platform - Cast scripting spells to explore AI concepts, extract proven patterns to production-ready Rust
Version 0.14.1 - Web Interface & Mission Control
Quick Links: Documentation Hub | Quick Start | What This Is | Experiment → Production | Release Notes | Examples | Contributing
Note: rs-llmspell builds upon concepts from numerous open-source projects and owes special acknowledgment to go-llms, which was instrumental in rapidly prototyping early ideas. This Rust implementation supersedes go-llms, leveraging Rust's native compilation and zero-cost abstractions for experimental velocity with production-ready foundations.
rs-llmspell is an experimental platform for rapid AI concept exploration.
The Experiment-Extract Workflow:
- Explore: Script AI concepts in Lua/JS - iterate in minutes
- Validate: Test ideas with production-grade performance
- Extract: Move proven patterns to Rust when ready
- Scale: Production deployment with minimal refactoring
Built with production-quality engineering (architecture, performance, testing, observability) to make the transition from experiment to production as painless as possible. We use Rust not because we're production-ready, but because proven patterns deserve solid foundations for extraction.
Current Status: v0.14.1 complete. "Mission Control" Web Interface (Phase 14) now available with unified single-binary deployment. Includes embedded React frontend, Monaco script editor, real-time console, and visual memory/session exploration. Performance validated with <100ms API latency and <2ms overhead. Project retains v0.13.1's production foundations: 10 storage backends, RLS multi-tenancy, and 21 preset profiles with 5540+ passing tests.
rs-llmspell prioritizes rapid experimentation while building production-ready foundations.
The Philosophy:
- Script Velocity: Lua/JS for minute-level iteration on AI ideas
- Concept Exploration: Play with LLMs, transformers, diffusion, memory, learning
- Validation at Scale: Production-quality performance for thorough testing
- Painless Extraction: Clear path from validated experiments to Rust production code
Although experimental, rs-llmspell is built with production-grade engineering:
- Performance: <2ms memory overhead, 8.47x HNSW speedup, <100ms context assembly
- Architecture: Modular (21 crates), trait-based, SOLID principles, clear boundaries
- Scalability: Designed for growth (async-first, resource limits, multi-tenancy ready)
- Testing: >90% coverage (5540 tests passing), zero warnings policy
- Documentation: >95% API docs (50+ guides across user/dev/technical)
- Observability: Full tracing with <2% overhead, structured logging
Result: When your experiment succeeds, transitioning to production is engineering work, not research work.
What it is:
- ✅ Experimental AI concept playground
- ✅ Script-first rapid iteration
- ✅ Production-quality engineering
- ✅ Clear extraction path to Rust
- ✅ Learning platform for AI patterns

What it is not:
- ❌ Production-ready out of the box
- ❌ Enterprise deployment platform
- ❌ Guaranteed stable APIs (pre-1.0)
- ❌ Support contracts or SLAs
Latest experimental infrastructure for rapid memory pattern exploration
- 3-Tier Memory: Episodic (conversation), Semantic (knowledge graph), Procedural (patterns)
- Hot-Swappable Backends: InMemory (dev), HNSW (production), SurrealDB (graph)
- Context Assembly: 4 strategies (episodic, semantic, hybrid, RAG) with parallel retrieval
- Performance: <2ms memory overhead (50x faster than target), 8.47x HNSW speedup
- CLI + Lua API: Memory global (17th) and Context global (18th) for script access (sketched below)
- 149 Tests: 100% pass rate, zero warnings, comprehensive validation
- See: Memory Configuration Guide
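A minimal Lua sketch of script-side memory access. The method names below (`Memory.add`, `Memory.search`, `Context.assemble`) and parameter shapes are illustrative assumptions, not confirmed API; see the Memory Configuration Guide for the actual bindings.

```lua
-- Hypothetical usage of the Memory (17th) and Context (18th) globals.
-- Method names and fields are assumptions for illustration only.
Memory.add({
    tier = "episodic",  -- one of: episodic | semantic | procedural
    content = "User prefers Rust examples over pseudocode"
})

local hits = Memory.search({ query = "user preferences", limit = 5 })
for _, hit in ipairs(hits) do
    print(hit.content)
end

-- Assemble context with one of the four strategies
-- (episodic, semantic, hybrid, RAG) before handing it to an agent.
local ctx = Context.assemble({ strategy = "hybrid", query = "user preferences" })
```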
10 experimental workflows for rapid AI concept exploration (v0.12.0)
- Research: research-assistant (4-phase workflow), knowledge-management (RAG integration)
- Development: code-generator (3 agents), code-review (multi-aspect analysis)
- Content: content-generation (quality-driven), document-processor (PDF/OCR)
- Productivity: interactive-chat (session-based), file-classification (scan-classify-act)
- Workflow: workflow-orchestrator (custom composition), data-analysis (CSV/Excel/JSON)
- CLI Access: `template list|info|exec|search|schema`
- Lua API: Template global (16th) with 6 methods (sketched below)
- Performance: <2ms overhead (50x faster than target)
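The Template global exposes the same workflows to scripts. A sketch, assuming `Template.list` and `Template.execute` are among its 6 methods (the names are illustrative; the parameters mirror the `template exec research-assistant` example in the Quick Start):

```lua
-- Hypothetical Template global usage; method names are assumptions.
for _, t in ipairs(Template.list()) do
    print(t.name)  -- enumerate the 10 builtin templates
end

local result = Template.execute("research-assistant", {
    topic = "Rust async patterns",
    max_sources = 10,
    memory_enabled = true
})
print(result.response)
```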
Experiment offline with 100+ models (v0.11.0-v0.11.2)
- Dual Backend: Ollama (REST API, 100+ models) + Candle (embedded GGUF inference)
- Zero API Keys: No cloud accounts needed for experimentation
- Model Management: `llmspell model list|pull|info|status`
- Platform-Aware GPU: Metal (macOS) + CUDA (Linux) with CPU fallback
- Performance: 40 tok/s throughput, 150ms first token, <5GB memory
- 10 Builtin Profiles: ready-to-use configs for LLaMA, T5, Qwen2, Phi-2, and more
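From scripts, an agent binds to a local backend through the model string. A sketch using the Ollama form shown in the Quick Start; the `@candle` suffix for the embedded GGUF backend is an assumption by analogy:

```lua
-- "local/<model>@ollama" routes through the Ollama REST backend (confirmed
-- in the Quick Start below); an "@candle" suffix for embedded inference
-- is assumed here, not confirmed.
local agent = Agent.create({ model = "local/llama3.1:8b@ollama" })
print(agent:execute({ prompt = "Summarize HNSW in one sentence" }).response)
```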
Feature flags for rapid development cycles (v0.10.0)
- Minimal: 19MB (core only, fast compile)
- Common: 25MB (+templates, PDF)
- Full: 35MB (all experimental tools)
- 87% Compile Speedup: Bridge-only builds 38s → 5s
Vector search and hybrid retrieval for concept validation (v0.8.0)
- HNSW Vector Storage: <8ms @ 100K vectors, <35ms @ 1M vectors
- Hybrid Search: Vector + keyword + BM25 reranking
- Multi-Tenant: StateScope isolation with 3% overhead
- RAGPipelineBuilder: Fluent API for custom pipelines
- Embedding Providers: OpenAI, Cohere, HuggingFace, local models
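The RAGPipelineBuilder is a Rust-side fluent API; for script-level experimentation, here is a hypothetical sketch of RAG-style ingest and search from Lua (the `RAG` global and its method names are assumptions, not confirmed bindings):

```lua
-- Illustrative only: RAG.ingest / RAG.search are assumed names.
RAG.ingest({ text = "HNSW trades memory for sub-10ms approximate search." })

local hits = RAG.search({ query = "vector search latency", limit = 3 })
for _, h in ipairs(hits) do
    print(h.score, h.text)
end
```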
Browser-based AI workflow development and monitoring
- Script Editor: Write and execute scripts with syntax highlighting and auto-completion
- Session Management: Visual session browser with history and artifacts
- Template Library: Browse and launch templates with interactive parameter forms
- Memory Browser: Explore episodic memory and knowledge graph visualization
- Agent Monitor: Real-time agent lifecycle and workflow execution tracking
- Tool Catalog: Interactive tool execution with parameter forms
- Configuration UI: Edit configuration, manage profiles, and restart server
- WebSocket Streaming: Real-time event updates for script execution and system changes
- OpenAPI Documentation: Interactive Swagger UI at `/swagger-ui/`
- Single Binary: Frontend assets embedded, no separate web server needed
- Quick Start: `llmspell web start` → http://localhost:3000
- See: Web Interface Guide
Modular tools for rapid prototyping
- Core: File ops, web search, calculator, HTTP client
- Common (`--features common`): Templates (Tera/Handlebars), PDF processing
- Full (`--features full`): Excel, CSV, archives, email, database
- Direct CLI: `llmspell tool list|info|invoke|search|test` (script-side sketch below)
- Sandboxed: Secure execution with automatic feature detection
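Tools should be reachable from scripts as well as the CLI. A sketch, assuming a `Tool.invoke` binding whose parameters mirror `llmspell tool invoke` (the name and parameter shape are assumptions):

```lua
-- Hypothetical script-side tool invocation mirroring the CLI.
local out = Tool.invoke("calculator", { input = "2^10 + 24" })
print(out.text)
```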
When validating at scale or extracting to production (v0.10.0)
- Unix Daemon: Double-fork daemonization (1.8s startup)
- Signal Handling: SIGTERM/SIGINT → graceful shutdown
- systemd/launchd: Service deployment when concepts are proven
- Fleet Management: Multi-kernel orchestration for load testing
- Log Rotation: Automatic rotation (78ms, size/age policies)
- PID Management: Lifecycle tracking (6ms validation)
Coordinate 2-20+ agents for complex workflows
- Sequential, parallel, conditional execution patterns
- Real-time state sharing between agents
- Automatic error recovery and retry logic
- Session-aware context management
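Sequential coordination can be expressed directly with the Agent API shown in the Quick Start; parallel and conditional patterns layer on top. A minimal two-agent sketch:

```lua
-- Two agents composed sequentially: one drafts, one critiques.
local researcher = Agent.create({ model = "openai/gpt-4o-mini" })
local reviewer   = Agent.create({ model = "local/llama3.1:8b@ollama" })

local draft  = researcher:execute({ prompt = "List 3 Rust async pitfalls" }).response
local review = reviewer:execute({ prompt = "Critique this list:\n" .. draft }).response
print(review)
```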
Experiment safely with isolated tool execution
- 3-level security model (Safe/Restricted/Privileged)
- Mandatory sandboxing for all tool executions
- Policy-based access control
- Resource boundaries (CPU, memory, I/O)
```bash
# Clone repository
git clone https://github.com/lexlapax/rs-llmspell
cd rs-llmspell

# Choose your build:
cargo build --release                     # Minimal: 19MB (core, fast compile)
cargo build --release --features common   # Common: 25MB (+templates, PDF)
cargo build --release --features full     # Full: 35MB (all tools)

# Set API key (or use local LLMs)
export OPENAI_API_KEY="sk-..."            # Optional for cloud models

# Simple agent interaction
./target/release/llmspell exec '
local agent = Agent.create({model = "openai/gpt-4o-mini"})
print(agent:execute({prompt = "Explain Rust ownership in 2 sentences"}).response)
'

# Or use local LLM (zero cost)
./target/release/llmspell exec '
local agent = Agent.create({model = "local/llama3.1:8b@ollama"})
print(agent:execute({prompt = "What is async Rust?"}).response)
'

# Research workflow with memory
./target/release/llmspell template exec research-assistant \
  --param topic="Rust async patterns" \
  --param max_sources=10 \
  --param memory_enabled=true

# Code generation experiment
./target/release/llmspell template exec code-generator \
  --param description="Binary search tree in Rust" \
  --param language="rust" \
  --param model="ollama/llama3.2:3b"

# Memory exploration
./target/release/llmspell exec examples/script-users/getting-started/05-memory-rag-advanced.lua
```

- 60+ Examples - Learning by doing
- User Guide - Comprehensive experimentation guide
- Template Guides - 10 workflow templates explained
- Local LLM Setup - Zero-cost exploration
- Developer Guide - Build your own experimental components
Phase 13 Memory System (Experimental Infrastructure):
| Metric | Target | Achieved | Status |
|---|---|---|---|
| Memory add | <10ms | 0.248ms | ✅ 40x faster |
| Context assembly | <100ms | ~8ms | ✅ 12x faster |
| HNSW speedup | >5x | 8.47x | ✅ 70% better |
| Memory overhead | <100ms | <2ms | ✅ 50x faster |
Phase 12 Template System (Experimental Workflows):
| Metric | Target | Achieved | Status |
|---|---|---|---|
| Template list | <10ms | 0.5ms | ✅ 20x faster |
| Execute overhead | <100ms | <2ms | ✅ 50x faster |
| Parameter validation | <5ms | 0.1ms | ✅ 50x faster |
Phase 10-11 Infrastructure (Service & Local LLM):
| Metric | Target | Achieved | Status |
|---|---|---|---|
| Daemon startup | <2s | 1.8s | ✅ 10% faster |
| Tool init | <10ms | 7ms | ✅ 30% faster |
| Vector search @ 100K | <10ms | 8ms | ✅ 20% faster |
| Candle throughput | 30 tok/s | 40 tok/s | ✅ 33% faster |
Completed (14/14 major phases):
- ✅ Phase 0-6: Foundation (traits, tools, hooks, state, sessions)
- ✅ Phase 7: Infrastructure consolidation (536+ files refactored)
- ✅ Phase 8: RAG system (HNSW vectors, multi-tenant)
- ✅ Phase 9: REPL & debugging (interactive development)
- ✅ Phase 10: Service integration (daemon, tool CLI, fleet)
- ✅ Phase 11: Local LLM (Ollama + Candle dual backend)
- ✅ Phase 11a: Bridge consolidation (87% compile speedup)
- ✅ Phase 11b: LLM cleanup (unified profiles, T5 support)
- ✅ Phase 12: Workflow templates (10 experimental templates)
- ✅ Phase 13: Adaptive memory (3-tier, hot-swap backends, context engineering)
- ✅ Phase 14: Web Interface (Unified Mission Control UI, single binary)
Upcoming Experimental Features (Phases 15+):
- Phase 15: Model Context Protocol (external tool integration)
- Phase 16: Advanced orchestration patterns
- Phase 17-18: Distributed execution, cloud platform
Note: All phases build experimental infrastructure with production-quality engineering. When concepts are proven, extraction to production is straightforward.
For comprehensive guides, see Documentation Hub
- Getting Started - 5-minute experimental setup
- Core Concepts - Understand the architecture
- Configuration - LLM providers, memory, storage, security
- Templates - 10 workflow templates
- Storage Setup - PostgreSQL deployment guide
- Lua API - 18 globals, 200+ methods
- Developer Guide - Build experimental components
- Technical Docs - Architecture & design decisions
```bash
# Quality checks before committing
./scripts/quality/quality-check-minimal.sh   # Fast: format, clippy
./scripts/quality/quality-check-fast.sh      # 1 min: + unit tests

# Run experimental workflows
./scripts/utilities/llmspell-easy.sh         # Interactive launcher
./scripts/testing/test-by-tag.sh memory      # Test memory system
```

See Scripts Overview for all automation tools.
New Contributors: Start with README-DEVEL.md for complete development environment setup.
Building experimental AI components? See Developer Guide for:
- Rapid iteration patterns
- Production-quality code for future extraction
- Testing with llmspell-testing helpers
- 60+ examples to learn from
Read CONTRIBUTING.md for guidelines and workflow.
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Examples: 60+ working examples
Apache License, Version 2.0. See LICENSE-APACHE for details.
v0.14.1 - Web Interface & Mission Control
Unified web interface for AI agent development and monitoring. Single-binary "Mission Control" with embedded React frontend.
Key Achievements:
- Unified Web Interface: Dashboard, Editor, Sessions, Memory, Agents
- Single Binary: Embedded React frontend assets (no separate server)
- Real-Time: WebSocket streaming for console and events
- Visualization: Interactive memory graph and session timeline
- Editor: Monaco-based script editor with syntax highlighting
- Zero Config: `llmspell web start` works out of the box with defaults
See Release Notes for complete details.
Full Documentation: See docs/ for comprehensive user guides, technical architecture, and developer resources.