AI-native durable workflows with built-in cost controls
The only durable workflow engine built for AI. Workflows survive crashes, budgets control AI spend, and costs are tracked per token. Ships as a 7.5 MB binary that runs anywhere.
Single Binary Deployment | AI Cost Tracking Built-in | Runs Anywhere
Kruxia Flow is a durable execution engine — in the same category as Temporal and Inngest, not batch schedulers like Airflow. We're built for AI from the ground up.
- AI startups — Ship AI agents to production with built-in cost tracking and budget control. Survive crashes, stop runaway spend.
- Small businesses — Define workflows with no code, deploy one binary and one database. No cluster, no DevOps team. Production reliability for tens of dollars a month.
- Data teams — Combine batch pipelines and AI agents in one platform. Python SDK with pandas and DuckDB, without a 4GB footprint or a $1K/month vendor lock-in.
LLM costs spiral out of control, and existing tools can't help:
- Invisible AI spend: No workflow engine tracks LLM costs natively. Teams with cost observability report 30-50% savings, but you're left stitching together external tools with no budget control to stop a runaway agent.
- Temporal's operational tax: 7+ components to self-host, and teams report 8 engineering-months per year on maintenance. Zero LLM awareness.
- LangGraph isn't a workflow engine: Python-only, no native scheduling, and requires the proprietary LangSmith platform for production at ~$1,000 per million executions.
Kruxia Flow combines durable execution with AI-native features:
| Feature | Kruxia Flow | Temporal | Inngest | LangGraph |
|---|---|---|---|---|
| Durable execution | Yes | Yes | Yes | Partial |
| LLM cost tracking | Yes | — | Partial | via LangSmith |
| Budget control | Yes | — | — | — |
| Model fallback | Yes | — | Yes | Partial |
| Token streaming | Yes | — | — | Yes |
| Self-host complexity | 1 binary + PG | 7+ components | 1 binary + PG | Proprietary |
| Throughput | 93 wf/s | 66 wf/s | Not tested | N/A |
| Binary size | 7.5 MB | ~200 MB | Not tested | N/A |
| Docker image | 63 MB | ~500 MB | Not tested | N/A |
| Peak memory | 328 MB | ~425 MB | Not tested | N/A |
| Open source | AGPL-3.0 + MIT | MIT | SSPL | MIT |
git clone https://github.com/kruxia/kruxiaflow.git
cd kruxiaflow
./docker up --examples
# Wait for "listening on 0.0.0.0:8080" then verify in another terminal:
./docker exec kruxiaflow /kruxiaflow health
That's it. Kruxia Flow is running with PostgreSQL and Redis, ready to execute workflows.
Kruxia Flow always runs with OAuth2 security, so you'll need client authentication to run workflows. The simplest approach for local development is to exchange the generated client credentials for an access token:
# Read the generated client secret from .env
CLIENT_SECRET=$(grep KRUXIAFLOW_CLIENT_SECRET .env | cut -d= -f2)
TOKEN=$(curl -s -X POST http://localhost:8080/api/v1/oauth/token \
-d "grant_type=client_credentials" \
-d "client_id=kruxiaflow-docker-client" \
-d "client_secret=$CLIENT_SECRET" | jq -r '.access_token')
Deploy the weather report example and run it:
# Deploy the workflow definition
curl -s -X POST http://localhost:8080/api/v1/workflow_definitions \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: text/yaml" \
--data-binary @examples/01-weather-report.yaml | jq .
# Submit a workflow instance
WORKFLOW_ID=$(curl -s -X POST http://localhost:8080/api/v1/workflows \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"definition_name": "weather_report",
"input": {"webhook_url": "https://httpbin.org/post"}
}' | jq .workflow_id | tr -d '"'); echo $WORKFLOW_ID
This fetches a weather forecast from the National Weather Service API and POSTs
the result to a webhook. Copy the workflow_id from the response to check status:
curl -s http://localhost:8080/api/v1/workflows/$WORKFLOW_ID \
-H "Authorization: Bearer $TOKEN" | jq .
If it succeeded, you've got mail! Check the weather report at http://localhost:8025/
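The status check above can be wrapped in a small polling loop that waits for a terminal state. A minimal sketch in Python — the terminal status names here are assumptions, and `get_status` is injected as a callable (e.g. something that fetches `GET /api/v1/workflows/{id}` and returns the JSON body) so the loop itself stays self-contained:

```python
import time

def wait_for_workflow(get_status, timeout=60.0, interval=1.0):
    """Poll a workflow until it reaches a terminal state.

    get_status: callable returning the workflow JSON as a dict.
    The terminal status values below are illustrative assumptions.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status().get("status")
        if status in ("SUCCEEDED", "FAILED", "CANCELLED"):
            return status
        time.sleep(interval)
    raise TimeoutError("workflow did not finish in time")

# Demo with a stubbed status sequence instead of a live server:
responses = iter([{"status": "RUNNING"}, {"status": "RUNNING"}, {"status": "SUCCEEDED"}])
print(wait_for_workflow(lambda: next(responses), interval=0.01))
```

In practice you'd pass a closure over the curl-equivalent HTTP call with the bearer token.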
For AI workflows, set your provider API key in your shell environment and restart:
# Set your Anthropic API key (add to ~/.bashrc or ~/.zshrc to persist)
export ANTHROPIC_API_KEY=your-key-here
# Restart the server to pick up the new key
./docker down && ./docker up -d
Then deploy and run the content moderation example:
# Deploy the moderation workflow
curl -s -X POST http://localhost:8080/api/v1/workflow_definitions \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: text/yaml" \
--data-binary @examples/04-moderate-content.yaml | jq .
# Submit a moderation request
WORKFLOW_ID=$(curl -s -X POST http://localhost:8080/api/v1/workflows \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"definition_name": "moderate_content",
"input": {
"user_content": "Check out this amazing project!",
"content_id": "test-001"
}
}' | jq .workflow_id | tr -d '"'); echo $WORKFLOW_ID
Check the workflow result and cost tracking with the workflow_id from the response:
curl -s http://localhost:8080/api/v1/workflows/$WORKFLOW_ID \
-H "Authorization: Bearer $TOKEN" | jq .
# View cost summary for the workflow
curl -s http://localhost:8080/api/v1/workflows/$WORKFLOW_ID/cost \
-H "Authorization: Bearer $TOKEN" | jq .
# View cost breakdown for the workflow activities
curl -s http://localhost:8080/api/v1/workflows/$WORKFLOW_ID/cost/history \
-H "Authorization: Bearer $TOKEN" | jq .
See examples/ for 15+ example workflows covering parallel execution, model fallback,
caching, loops, scheduling, and RAG patterns. API docs at
http://localhost:8080/api/v1/docs.
The built-in (std) llm_prompt and embedding activities help you control costs. Before an activity runs, its cost is estimated from published per-model pricing (stored in config/llm_models.yaml); if a budget is provided, activities whose estimated cost would exceed it are not run. When an LLM activity does run, the actual costs and token counts are recorded, so cost metrics can be analyzed and workflows optimized.
activities:
- key: analyze
activity_name: llm_prompt
parameters:
model: anthropic/claude-sonnet-4-5-20250929
prompt: "Analyze this document..."
max_tokens: 500
settings:
budget:
limit_usd: 0.50
action: abort
Real-time costs are visible per workflow and per activity.
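The budget check can be pictured as a simple pre-flight gate: estimate the worst-case cost, compare it to the limit, and only then call the model. A hedged sketch of that logic — the price table and numbers here are illustrative stand-ins, not the contents of config/llm_models.yaml:

```python
# Illustrative per-million-token prices; real values live in config/llm_models.yaml.
PRICES_PER_MTOK = {
    "anthropic/claude-sonnet-4-5-20250929": {"input": 3.00, "output": 15.00},
}

def estimate_cost_usd(model, input_tokens, max_output_tokens):
    """Worst-case estimate: assume the response spends all of max_tokens."""
    p = PRICES_PER_MTOK[model]
    return (input_tokens * p["input"] + max_output_tokens * p["output"]) / 1_000_000

def within_budget(model, input_tokens, max_output_tokens, limit_usd):
    """Gate: run the activity only if the estimate fits the budget."""
    return estimate_cost_usd(model, input_tokens, max_output_tokens) <= limit_usd

# A 1,000-token prompt with max_tokens=500 against the 0.50 USD budget above:
print(within_budget("anthropic/claude-sonnet-4-5-20250929", 1_000, 500, 0.50))
```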
Automatically fall back to cheaper models when budget is constrained:
activities:
- key: generate
activity_name: llm_prompt
parameters:
model:
- anthropic/claude-sonnet-4-5-20250929 # Try first
- openai/gpt-4o-mini # If budget constrained
- anthropic/claude-haiku-4-20250415 # Last resort
prompt: "Generate a summary..."
max_tokens: 500
settings:
budget:
limit_usd: 0.10
action: abort
Save on LLM costs by caching repeated queries:
activities:
- key: answer
activity_name: llm_prompt
parameters:
model: anthropic/claude-haiku-4-20250415
prompt: "{{INPUT.question}}"
max_tokens: 200
settings:
cache:
enabled: true
ttl_seconds: 3600
key:
- llm_prompt
- "{{parameters.model}}"
- "{{parameters.prompt}}"
Identical queries hit cache instead of the LLM. (NOTE: Semantic caching is planned.)
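The cache key above is assembled from the listed parts, so two calls with the same model and prompt resolve to the same entry. A minimal sketch of deterministic key construction — the hashing scheme here is an assumption for illustration, not necessarily what the engine does internally:

```python
import hashlib

def cache_key(parts):
    """Join the configured key parts and hash them into a stable cache key."""
    joined = "\x1f".join(parts)  # unit separator avoids accidental collisions
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()

a = cache_key(["llm_prompt", "anthropic/claude-haiku-4-20250415", "What is Kruxia Flow?"])
b = cache_key(["llm_prompt", "anthropic/claude-haiku-4-20250415", "What is Kruxia Flow?"])
c = cache_key(["llm_prompt", "anthropic/claude-haiku-4-20250415", "Something else"])
print(a == b, a == c)  # identical inputs share a key; different prompts do not
```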
Workflows survive crashes and restart from where they left off.
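Conceptually, durability works by persisting each completed activity's result and replaying that record on restart instead of re-running the work. A toy sketch of the idea — an illustration of the pattern, not Kruxia Flow's internals:

```python
def run_workflow(steps, completed):
    """Execute steps in order, skipping any already in the `completed` log.

    `completed` maps step key -> recorded result, standing in for the durable
    store; on restart, recorded steps are replayed instead of re-executed.
    """
    results = {}
    for key, fn in steps:
        if key in completed:
            results[key] = completed[key]          # replay from the log
        else:
            results[key] = completed[key] = fn()   # run once, then persist
    return results

log = {}  # pretend this is durable storage (e.g. PostgreSQL)
steps = [("fetch", lambda: "forecast"), ("notify", lambda: "sent")]
run_workflow(steps, log)         # first run populates the log
print(run_workflow(steps, log))  # a "restart" replays without re-running
```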
Native support for all major providers:
- Anthropic: Claude 4.5 Sonnet, Claude 4.5 Haiku
- OpenAI: GPT-5.1, GPT-4o, GPT-4o-mini, GPT-3.5 Turbo
- Google: Gemini Pro, Gemini Flash
- Ollama: Self-hosted open models
Kruxia Flow includes 15+ production-ready example workflows in YAML and Python:
| # | Example | Concepts Demonstrated |
|---|---|---|
| 1 | Weather Report | Sequential workflow, HTTP requests, templates |
| 2 | User Validation | Conditional branching, PostgreSQL queries |
| 3 | Document Processing | Parallel execution, fan-out/fan-in, file storage |
| 4 | Content Moderation | LLM with cost tracking, retry with backoff |
| 5 | Research Assistant | Multi-model fallback, budget-aware selection |
| 6 | FAQ Bot / RAG | Semantic caching, vector search, embeddings |
| 7 | Agentic Research | Iterative loops, agent patterns |
| 8 | Scheduled Tasks | Delays, rate limiting, scheduled execution |
| 9 | Token Streaming | Real-time LLM streaming via WebSocket |
| 10 | Order Processing | HTTP, database transactions, email notifications |
| 11 | GitHub Health Check | Python SDK, HTTP API integration |
| 12 | Sales ETL Pipeline | Python SDK, pandas, DuckDB SQL on DataFrames |
| 13 | Customer Churn Prediction | Python SDK, parallel ML training, LLM explanations |
| 14 | Document Intelligence | Python SDK, AI-powered document analysis |
| 15 | Content Moderation System | Python SDK, multi-stage moderation pipeline |
Kruxia Flow is a single Rust binary with PostgreSQL as the only required dependency. It runs anywhere: cloud VMs, on-premises servers, or edge devices like a Raspberry Pi Zero.
┌─────────────────────────────────────────────────────────────────┐
│ Kruxia Flow (7.5MB binary) │
├─────────────────────────────────────────────────────────────────┤
│ API Server │ Orchestrator │ Worker Pool │ Cost Tracker │
└──────────────┴────────────────┴───────────────┴─────────────────┘
│
▼
┌───────────────────┐
│ PostgreSQL │
│ (events, state, │
│ costs, files) │
└───────────────────┘
- Event-driven: Publish-subscribe architecture with exactly-once semantics
- PostgreSQL-only: No Kafka, Cassandra, or Elasticsearch required
- Pluggable: Include Redis for activity results caching
- Planned: Swap in Kafka for events, S3 for storage when you need scale [POST-MVP]
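A common way to get exactly-once semantics on top of at-least-once delivery is consumer-side deduplication by event ID. A toy sketch of that idea — an illustration of the pattern, not Kruxia Flow's actual implementation (in PostgreSQL the `seen` set would typically be a unique index):

```python
class Consumer:
    """Process each event ID at most once, even if delivery repeats."""

    def __init__(self):
        self.seen = set()   # stands in for a unique constraint in PostgreSQL
        self.processed = []

    def handle(self, event):
        if event["id"] in self.seen:
            return False    # duplicate delivery: skip
        self.seen.add(event["id"])
        self.processed.append(event["payload"])
        return True

c = Consumer()
for e in [{"id": 1, "payload": "start"},
          {"id": 1, "payload": "start"},   # redelivered duplicate
          {"id": 2, "payload": "done"}]:
    c.handle(e)
print(c.processed)  # each event processed once despite the duplicate
```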
Benchmarked against industry-standard workflow engines in January 2026, Kruxia Flow compares favorably:
| Metric | Kruxia Flow | Temporal | Airflow |
|---|---|---|---|
| Throughput (wf/sec) | 93 | 66 | 8 |
| P99 Latency | 0.9–1.5s | 0.5–2.7s | 6–22s |
| Peak Memory | 328MB | 425MB | 7.2GB |
| Binary Size | 7.5MB | ~200MB | ~500MB+ |
| Docker Image | 63MB | ~500MB | ~1GB+ |
Benchmark methodology: Identical echo workflows (sequential, parallel, high-concurrency), Docker Compose environment, same hardware. See benchmarks/ for reproducible tests.
- Architecture - System design and component overview
- MVP Requirements - Product requirements and roadmap
- Implementation Plans - Detailed technical implementation specifications
- Post-MVP Roadmap, Features - Future features and integrations
- Docker and Docker Compose
- (Optional) Rust 1.90+ for local development
# Start development environment (hot reload)
./docker up --develop
# View logs
./docker logs -f
# Stop services
./docker down
# Set up test database and run tests
./scripts/test.sh
# With coverage
./scripts/test.sh --coverage
# Install Rust and sqlx-cli
cargo install sqlx-cli --no-default-features --features postgres
# Start PostgreSQL manually
docker run -d --name pg -e POSTGRES_PASSWORD=dev -p 5432:5432 postgres:17
# Run migrations
export DATABASE_URL='postgres://postgres:dev@localhost:5432/kruxiaflow'
sqlx database create
sqlx migrate run
# Build and run
cargo build --release
./target/release/kruxiaflow serve
- Durable workflow execution
- 15+ example workflows
- LLM cost tracking and budgets
- Multi-provider LLM support
- Token streaming
- Human-in-the-loop workflows
- Python SDK (install from GitHub)
- Semantic caching
- Web dashboard for cost visualization
- Airflow migration guide
- Kubernetes Helm chart
- TypeScript SDK
- RBAC and multi-tenancy
- Kafka protocol event backend
- S3-compatible workflow storage backend
See Post-MVP Roadmap for details.
- Discord: Join the Kruxia community
- Bluesky: @kruxia.com
- GitHub Issues: Report bugs and request features
- Code of Conduct: Community guidelines
- Security: Report vulnerabilities
Contributions are welcome! Please read our Contributing Guidelines before submitting PRs.
# Fork and clone
git clone https://github.com/YOUR_USERNAME/kruxiaflow.git
# Create a branch
git checkout -b feature/your-feature
# Make changes and test
./scripts/test.sh --coverage
# Submit a PR
See CONTRIBUTING.md for detailed development setup and guidelines.
AGPL-3.0 License - See LICENSE for details.
Kruxia Flow - AI-native durable workflows that run everywhere, with built-in LLM cost controls and streaming.