From ecfdd10f42291a02811c11b4e92f28c4a9b35d33 Mon Sep 17 00:00:00 2001 From: Claude Date: Thu, 26 Feb 2026 14:14:58 +0000 Subject: [PATCH 01/10] Add OpenFang project research document Research three GitHub projects sharing the OpenFang name: - RightNow-AI/openfang: Rust-based Agent OS (most significant) - anmaped/openfang: Camera firmware for Ingenic T20 (dormant) - danshorstein/OpenFang: Python AI assistant fork https://claude.ai/code/session_015KgxqLUhevxop1jhiZY2Y4 --- docs/research-openfang.md | 210 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 210 insertions(+) create mode 100644 docs/research-openfang.md diff --git a/docs/research-openfang.md b/docs/research-openfang.md new file mode 100644 index 000000000..f45c6f3c5 --- /dev/null +++ b/docs/research-openfang.md @@ -0,0 +1,210 @@ +# OpenFang Project Research + +**Date**: 2026-02-26 +**Scope**: GitHub projects using the "OpenFang" name + +--- + +## Summary + +There are three distinct projects on GitHub that share the "OpenFang" name: + +| Project | Domain | Language | License | Stars | Status | +|---------|--------|----------|---------|-------|--------| +| [RightNow-AI/openfang](https://github.com/RightNow-AI/openfang) | Agent Operating System | Rust | MIT / Apache 2.0 | ~979 | Active (v0.1.0, Feb 2026) | +| [anmaped/openfang](https://github.com/anmaped/openfang) | Camera firmware (Ingenic T20) | PHP/Shell | GPL-3.0 | ~188 | Dormant (last release 2018) | +| [danshorstein/OpenFang](https://github.com/danshorstein/OpenFang) | Python AI assistant | Python | Unknown | Low | Fork of OpenClaw | + +--- + +## 1. RightNow-AI/openfang — Agent Operating System (Primary) + +**Website**: [openfang.sh](https://www.openfang.sh/) +**Repo**: [github.com/RightNow-AI/openfang](https://github.com/RightNow-AI/openfang) +**Built by**: Jaber (RightNow AI) + +### What It Is + +OpenFang is a **production-grade Agent Operating System** built from scratch in Rust. It is not a chatbot framework or a Python wrapper around an LLM — it is a full operating system for autonomous agents that run 24/7, building knowledge graphs, monitoring targets, generating leads, and managing social media. + +The entire system compiles to a **single ~32 MB binary**. + +### Key Numbers + +- **137,728** lines of Rust code across **14 crates** +- **1,767+** passing tests, **0** clippy warnings +- **30** pre-built agents across 4 performance tiers +- **40** channel adapters (Telegram, Discord, Slack, WhatsApp, Signal, Matrix, etc.) +- **38** built-in tools + MCP integration +- **27** LLM providers supporting **123+** models +- **16** security systems +- **v0.1.0** — first public release (February 2026) + +### Performance Benchmarks + +| Metric | OpenFang | OpenClaw | CrewAI | AutoGen | +|--------|----------|----------|--------|---------| +| Cold Start | 180ms | 5.98s | 3s | — | +| Idle Memory | 40MB | 394MB | 250MB | — | +| Install Size | 32MB | 500MB | — | 200MB | + +### The 7 "Hands" (Autonomous Agents) + +1. **Clip** — Video processing: downloads YouTube content, creates vertical shorts with captions +2. **Lead** — Daily prospect discovery with ICP matching and qualification scoring +3. **Collector** — OSINT intelligence with continuous monitoring and change detection +4. **Predictor** — Superforecasting engine with calibrated reasoning and accuracy tracking +5. **Researcher** — Cross-references sources using CRAAP criteria with APA citations +6. **Twitter** — Account management across 7 content formats with approval gates +7. **Browser** — Web automation with mandatory purchase approval safeguards + +### 14 Core Rust Crates + +| Crate | Purpose | +|-------|---------| +| `openfang-kernel` | Orchestration, workflows, RBAC | +| `openfang-runtime` | Agent loop, 53 tools, WASM sandbox | +| `openfang-api` | 140+ REST/WS/SSE endpoints | +| `openfang-channels` | 40 messaging adapters | +| `openfang-memory` | SQLite persistence, vector embeddings | +| `openfang-skills` | 60 bundled skills | +| `openfang-hands` | Lifecycle management for autonomous agents | +| `openfang-extensions` | 25 MCP templates | +| `openfang-wire` | P2P protocol | +| `openfang-cli` | Daemon management | +| `openfang-desktop` | Tauri 2.0 native app | +| `openfang-migrate` | OpenClaw/LangChain migration | + +### 16 Security Systems + +1. WASM dual-metered sandbox (fuel + epoch interruption) +2. Merkle hash-chain audit trails +3. Information flow taint tracking +4. Ed25519 signed agent manifests +5. SSRF protection +6. Secret zeroization +7. OFP mutual authentication (HMAC-SHA256) +8. Capability gates (role-based access) +9. Security headers (CSP, HSTS, X-Frame-Options) +10. Health endpoint redaction +11. Subprocess sandbox with environment isolation +12. Prompt injection scanner +13. Loop guard with circuit breaker +14. Session repair (7-phase validation) +15. Path traversal prevention +16. GCRA rate limiter + +### Protocol Support + +- **MCP** (Model Context Protocol) +- **A2A** (Agent-to-Agent) +- **OFP** (OpenFang Protocol — proprietary P2P with HMAC-SHA256 mutual auth) + +### LLM Providers (27) + +Anthropic, OpenAI, Google Gemini, Groq, DeepSeek, Mistral, xAI, Ollama, AWS Bedrock, and 18+ others — supporting 123+ models total. + +### Installation + +```bash +# macOS/Linux +curl -fsSL https://openfang.sh/install | sh +openfang init +openfang start +# Dashboard: http://localhost:4200 + +# Windows +irm https://openfang.sh/install.ps1 | iex +``` + +### Key Differentiators + +- Single binary deployment — no Python, no Node, no Docker required +- OpenAI-compatible API — drop-in replacement capability +- Migration engine — imports from OpenClaw, LangChain, AutoGPT +- Dashboard-first — web UI at localhost:4200 +- Desktop app — native Tauri 2.0 application with system tray + +--- + +## 2. anmaped/openfang — Camera Firmware + +**Repo**: [github.com/anmaped/openfang](https://github.com/anmaped/openfang) + +### What It Is + +An open-source bootloader, kernel, and toolchain for IP cameras using **Ingenic T10 and T20 SoCs**. This was one of the early community firmware projects for cheap Chinese IP cameras. + +### Supported Devices + +| SoC | RAM | Cameras | +|-----|-----|---------| +| Ingenic T20L | 64MB DDR | Xiaomi Mijia 2018, Xiaomi Xiaofang 1S | +| Ingenic T20N | 64MB DDR + SIMD128 | DIGOO DG W30 | +| Ingenic T20X | 128MB DDR | Wyze Cam V2, Xiaomi Dafang, Wyze Cam Pan | + +### Technical Details + +- Kernel version 3.10.14 +- U-Boot bootloader v2013.07 +- Buildroot-based toolchain +- Docker support for compilation +- GPL-3.0 license + +### Status + +- **Last release**: RC5 (November 2018) — **dormant** +- 188 stars, 43 forks, 10 contributors +- Largely superseded by [OpenMiko](https://github.com/openmiko/openmiko), [OpenIPC](https://github.com/OpenIPC), and [Thingino](https://thingino.com/) + +--- + +## 3. danshorstein/OpenFang — Python AI Assistant + +**Repo**: [github.com/danshorstein/OpenFang](https://github.com/danshorstein/OpenFang) + +### What It Is + +An open-source fork of **OpenClaw** that rethinks personal AI agents. Built on the principle that "LLMs should write automations, not be automations." + +### Key Claims + +- 90%+ reduction in token costs +- Faster execution and more reliable automations +- System gets cheaper over time as workflows graduate from LLM-orchestrated to Python-automated + +### Status + +Low activity, small community. Positioned as a philosophical alternative to the mainstream agent frameworks. + +--- + +## Analysis & Relevance + +### Most Notable: RightNow-AI/openfang + +The **RightNow-AI** variant is by far the most significant project: + +- **Active development** with a February 2026 v0.1.0 release +- **Rust-based architecture** — high performance, single binary, low memory +- **Comprehensive agent ecosystem** — 30 agents, 40 channels, 38 tools +- **Strong security posture** — 16 dedicated security systems +- **Production-oriented** — not a research project or toy framework + +### Potential Relevance to RuVector + +- The Rust architecture and WASM sandbox approach could inform solver optimization strategies +- The 14-crate modular design demonstrates a scalable Rust workspace pattern +- The security systems (especially taint tracking, prompt injection scanning) are relevant to any AI-adjacent system +- The performance benchmarks (180ms cold start, 40MB idle) set a useful reference point + +--- + +## Sources + +- [RightNow-AI/openfang (GitHub)](https://github.com/RightNow-AI/openfang) +- [OpenFang.sh (Website)](https://www.openfang.sh/) +- [anmaped/openfang (GitHub)](https://github.com/anmaped/openfang) +- [danshorstein/OpenFang (GitHub)](https://github.com/danshorstein/OpenFang) +- [OpenMiko (GitHub)](https://github.com/openmiko/openmiko) +- [OpenIPC (GitHub)](https://github.com/OpenIPC) From cb939120de646b0eabc039034aa37cba81bd272b Mon Sep 17 00:00:00 2001 From: Claude Date: Thu, 26 Feb 2026 14:24:42 +0000 Subject: [PATCH 02/10] Add OpenFang Agent OS RVF example Standalone RVF knowledge base modeling the OpenFang agent OS architecture: - 7 autonomous Hands with tier/security metadata - 38 built-in tools across 12 categories - 20 channel adapters with protocol metadata - Task routing via nearest-neighbor search - Security and tier filtering with combined filter expressions - Cryptographic witness chain audit trail - Persistence verification (close/reopen round-trip) - All tests passing, 65 vectors in ~35KB https://claude.ai/code/session_015KgxqLUhevxop1jhiZY2Y4 --- examples/rvf/Cargo.toml | 4 + .../rvf/examples/assets/openfang-README.md | 70 +++ examples/rvf/examples/openfang.rs | 462 ++++++++++++++++++ 3 files changed, 536 insertions(+) create mode 100644 examples/rvf/examples/assets/openfang-README.md create mode 100644 examples/rvf/examples/openfang.rs diff --git a/examples/rvf/Cargo.toml b/examples/rvf/Cargo.toml index beb860420..8cc43b187 100644 --- a/examples/rvf/Cargo.toml +++ b/examples/rvf/Cargo.toml @@ -244,3 +244,7 @@ path = "examples/causal_atlas_dashboard.rs" [[example]] name = "security_hardened" path = "examples/security_hardened.rs" + +[[example]] +name = "openfang" +path = "examples/openfang.rs" diff --git a/examples/rvf/examples/assets/openfang-README.md b/examples/rvf/examples/assets/openfang-README.md new file mode 100644 index 000000000..e032ed1c3 --- /dev/null +++ b/examples/rvf/examples/assets/openfang-README.md @@ -0,0 +1,70 @@ +# OpenFang Agent OS — RVF Example + +An RVF (RuVector Format) knowledge base modeling the architecture of [OpenFang](https://github.com/RightNow-AI/openfang), a production-grade Agent Operating System built in Rust. + +## What It Does + +This example creates an RVF vector store that serves as a **component registry** for an agent OS, demonstrating how to model and query a multi-type agent ecosystem: + +| Component | Count | Description | +|-----------|-------|-------------| +| **Hands** | 7 | Autonomous agents (Clip, Lead, Collector, Predictor, Researcher, Twitter, Browser) | +| **Tools** | 38 | Built-in capabilities across 12 categories | +| **Channels** | 20 | Messaging adapters (Telegram, Discord, Slack, etc.) | +| **Total** | 65 | Searchable components in a single 35KB RVF file | + +## Capabilities Demonstrated + +1. **Multi-type registry** — Hands, Tools, and Channels stored in one vector space +2. **Rich metadata** — component type, name, domain, tier, security level +3. **Task routing** — nearest-neighbor search to find the best agent for a task +4. **Security filtering** — query only agents meeting a security threshold (>= 80) +5. **Tier filtering** — isolate autonomous (tier 4) agents +6. **Category search** — find tools by category (e.g., all security tools) +7. **Combined filters** — AND/OR/NOT filter expressions +8. **Witness chain** — cryptographic audit trail of all registry operations +9. **Persistence** — verified round-trip: create, close, reopen, query + +## Run + +```bash +cd examples/rvf +cargo run --example openfang +``` + +## Metadata Schema + +| Field ID | Name | Type | Applies To | +|----------|------|------|------------| +| 0 | `component_type` | String | All (`"hand"`, `"tool"`, `"channel"`) | +| 1 | `name` | String | All | +| 2 | `domain` / `category` / `protocol` | String | All | +| 3 | `tier` | U64 (1-4) | Hands only | +| 4 | `security_level` | U64 (0-100) | Hands only | + +## OpenFang Hands + +| Hand | Domain | Tier | Security | +|------|--------|------|----------| +| clip | video-processing | 3 | 60 | +| lead | sales-automation | 2 | 70 | +| collector | osint-intelligence | 4 | 90 | +| predictor | forecasting | 3 | 80 | +| researcher | fact-checking | 3 | 75 | +| twitter | social-media | 2 | 65 | +| browser | web-automation | 4 | 95 | + +## Tool Categories + +`browser`, `communication`, `database`, `document`, `filesystem`, `inference`, `integration`, `memory`, `network`, `scheduling`, `security`, `system`, `transform` + +## Channel Adapters + +Telegram, Discord, Slack, WhatsApp, Signal, Matrix, Email (SMTP/IMAP), Teams, Google Chat, LinkedIn, Twitter/X, Mastodon, Bluesky, Reddit, IRC, XMPP, Webhooks, gRPC + +## About OpenFang + +[OpenFang](https://openfang.sh) by RightNow AI is a Rust-based Agent Operating System — 137K lines of code across 14 crates, compiling to a single ~32MB binary. It runs autonomous agents 24/7 with 16 security systems, 27 LLM providers, and 40 channel adapters. + +- GitHub: [RightNow-AI/openfang](https://github.com/RightNow-AI/openfang) +- License: MIT / Apache 2.0 diff --git a/examples/rvf/examples/openfang.rs b/examples/rvf/examples/openfang.rs new file mode 100644 index 000000000..5b24f972c --- /dev/null +++ b/examples/rvf/examples/openfang.rs @@ -0,0 +1,462 @@ +//! OpenFang Agent OS — Knowledge Base +//! +//! Demonstrates how an RVF store can model the knowledge architecture +//! of an Agent Operating System like OpenFang (RightNow-AI): +//! +//! 1. Create a store representing the OpenFang agent registry +//! 2. Insert embeddings for 7 autonomous "Hands" (Clip, Lead, Collector, +//! Predictor, Researcher, Twitter, Browser) with metadata +//! 3. Insert tool embeddings across 38 built-in tools +//! 4. Insert channel adapter embeddings (40 messaging channels) +//! 5. Query for agents matching a task description +//! 6. Filter by domain, capability tier, and security level +//! 7. Cross-domain search: find the best agent+tool combination +//! 8. Witness chain tracking all registry operations +//! +//! RVF segments used: VEC_SEG, MANIFEST_SEG (via RvfStore), WITNESS_SEG (via rvf-crypto) +//! +//! Run with: +//! cargo run --example openfang + +use rvf_runtime::{ + FilterExpr, MetadataEntry, MetadataValue, QueryOptions, RvfOptions, RvfStore, SearchResult, +}; +use rvf_runtime::filter::FilterValue; +use rvf_runtime::options::DistanceMetric; +use rvf_crypto::{create_witness_chain, verify_witness_chain, shake256_256, WitnessEntry}; +use tempfile::TempDir; + +/// Simple pseudo-random number generator (LCG) for deterministic results. +fn random_vector(dim: usize, seed: u64) -> Vec { + let mut v = Vec::with_capacity(dim); + let mut x = seed.wrapping_add(1); + for _ in 0..dim { + x = x.wrapping_mul(6364136223846793005).wrapping_add(1442695040888963407); + v.push(((x >> 33) as f32) / (u32::MAX as f32) - 0.5); + } + v +} + +/// Domain-biased vector: adds a domain-specific offset to cluster related items. +fn domain_vector(dim: usize, seed: u64, domain_bias: f32) -> Vec { + let mut v = random_vector(dim, seed); + // Apply domain bias to the first 16 dimensions to create domain clusters + for i in 0..16.min(dim) { + v[i] += domain_bias; + } + v +} + +// -- OpenFang component definitions -- + +struct Hand { + name: &'static str, + domain: &'static str, + tier: u64, // performance tier: 1=lightweight, 2=standard, 3=heavy, 4=autonomous + security: u64, // security level: 0-100 + _description: &'static str, +} + +struct Tool { + name: &'static str, + category: &'static str, +} + +struct Channel { + name: &'static str, + protocol: &'static str, +} + +const HANDS: &[Hand] = &[ + Hand { name: "clip", domain: "video-processing", tier: 3, security: 60, _description: "YouTube shorts creation with captions" }, + Hand { name: "lead", domain: "sales-automation", tier: 2, security: 70, _description: "Daily prospect discovery with ICP matching" }, + Hand { name: "collector", domain: "osint-intelligence", tier: 4, security: 90, _description: "Continuous monitoring and change detection" }, + Hand { name: "predictor", domain: "forecasting", tier: 3, security: 80, _description: "Superforecasting with Brier score tracking" }, + Hand { name: "researcher", domain: "fact-checking", tier: 3, security: 75, _description: "CRAAP criteria cross-referencing" }, + Hand { name: "twitter", domain: "social-media", tier: 2, security: 65, _description: "X account management with approval gates" }, + Hand { name: "browser", domain: "web-automation", tier: 4, security: 95, _description: "Web automation with purchase approval" }, +]; + +const TOOLS: &[Tool] = &[ + Tool { name: "http_fetch", category: "network" }, + Tool { name: "web_search", category: "network" }, + Tool { name: "web_scrape", category: "network" }, + Tool { name: "file_read", category: "filesystem" }, + Tool { name: "file_write", category: "filesystem" }, + Tool { name: "file_list", category: "filesystem" }, + Tool { name: "shell_exec", category: "system" }, + Tool { name: "process_spawn", category: "system" }, + Tool { name: "json_parse", category: "transform" }, + Tool { name: "json_format", category: "transform" }, + Tool { name: "csv_parse", category: "transform" }, + Tool { name: "regex_match", category: "transform" }, + Tool { name: "template_render", category: "transform" }, + Tool { name: "llm_complete", category: "inference" }, + Tool { name: "llm_embed", category: "inference" }, + Tool { name: "llm_classify", category: "inference" }, + Tool { name: "vector_store", category: "memory" }, + Tool { name: "vector_search", category: "memory" }, + Tool { name: "kv_get", category: "memory" }, + Tool { name: "kv_set", category: "memory" }, + Tool { name: "sql_query", category: "database" }, + Tool { name: "sql_execute", category: "database" }, + Tool { name: "screenshot", category: "browser" }, + Tool { name: "click_element", category: "browser" }, + Tool { name: "fill_form", category: "browser" }, + Tool { name: "navigate", category: "browser" }, + Tool { name: "pdf_extract", category: "document" }, + Tool { name: "ocr_image", category: "document" }, + Tool { name: "email_send", category: "communication" }, + Tool { name: "email_read", category: "communication" }, + Tool { name: "webhook_fire", category: "integration" }, + Tool { name: "api_call", category: "integration" }, + Tool { name: "schedule_cron", category: "scheduling" }, + Tool { name: "schedule_delay", category: "scheduling" }, + Tool { name: "crypto_sign", category: "security" }, + Tool { name: "crypto_verify", category: "security" }, + Tool { name: "secret_read", category: "security" }, + Tool { name: "audit_log", category: "security" }, +]; + +const CHANNELS: &[Channel] = &[ + Channel { name: "telegram", protocol: "bot-api" }, + Channel { name: "discord", protocol: "gateway" }, + Channel { name: "slack", protocol: "events-api" }, + Channel { name: "whatsapp", protocol: "cloud-api" }, + Channel { name: "signal", protocol: "signal-cli" }, + Channel { name: "matrix", protocol: "client-server" }, + Channel { name: "email-smtp", protocol: "smtp" }, + Channel { name: "email-imap", protocol: "imap" }, + Channel { name: "teams", protocol: "graph-api" }, + Channel { name: "google-chat", protocol: "chat-api" }, + Channel { name: "linkedin", protocol: "rest-api" }, + Channel { name: "twitter-x", protocol: "api-v2" }, + Channel { name: "mastodon", protocol: "activitypub" }, + Channel { name: "bluesky", protocol: "at-proto" }, + Channel { name: "reddit", protocol: "oauth-api" }, + Channel { name: "irc", protocol: "irc-v3" }, + Channel { name: "xmpp", protocol: "xmpp-core" }, + Channel { name: "webhook-in", protocol: "http-post" }, + Channel { name: "webhook-out", protocol: "http-post" }, + Channel { name: "grpc", protocol: "grpc" }, +]; + +fn main() { + println!("=== OpenFang Agent OS — RVF Knowledge Base ===\n"); + + let dim = 128; + let tmp_dir = TempDir::new().expect("failed to create temp dir"); + let store_path = tmp_dir.path().join("openfang.rvf"); + + // -- Step 1: Create the OpenFang registry store -- + println!("--- 1. Creating OpenFang Agent Registry ---"); + let options = RvfOptions { + dimension: dim as u16, + metric: DistanceMetric::L2, + ..Default::default() + }; + + let mut store = RvfStore::create(&store_path, options).expect("failed to create store"); + println!(" Registry created at {:?}", store_path); + println!(" Embedding dimensions: {}", dim); + + let mut witness_entries: Vec = Vec::new(); + let mut next_id: u64 = 0; + + // -- Step 2: Register the 7 autonomous Hands -- + // Metadata fields: + // field_id 0: component_type (String: "hand", "tool", "channel") + // field_id 1: name (String) + // field_id 2: domain (String) + // field_id 3: tier (U64: 1-4) + // field_id 4: security_level (U64: 0-100) + println!("\n--- 2. Registering Autonomous Hands ({}) ---", HANDS.len()); + + let hand_base_id = next_id; + let hand_vectors: Vec> = HANDS.iter().enumerate() + .map(|(i, h)| domain_vector(dim, i as u64 * 17 + 100, h.tier as f32 * 0.1)) + .collect(); + let hand_refs: Vec<&[f32]> = hand_vectors.iter().map(|v| v.as_slice()).collect(); + let hand_ids: Vec = (hand_base_id..hand_base_id + HANDS.len() as u64).collect(); + + let mut hand_metadata = Vec::with_capacity(HANDS.len() * 5); + for hand in HANDS { + hand_metadata.push(MetadataEntry { field_id: 0, value: MetadataValue::String("hand".to_string()) }); + hand_metadata.push(MetadataEntry { field_id: 1, value: MetadataValue::String(hand.name.to_string()) }); + hand_metadata.push(MetadataEntry { field_id: 2, value: MetadataValue::String(hand.domain.to_string()) }); + hand_metadata.push(MetadataEntry { field_id: 3, value: MetadataValue::U64(hand.tier) }); + hand_metadata.push(MetadataEntry { field_id: 4, value: MetadataValue::U64(hand.security) }); + } + + let hand_result = store.ingest_batch(&hand_refs, &hand_ids, Some(&hand_metadata)) + .expect("failed to register hands"); + next_id += HANDS.len() as u64; + + println!(" Registered {} Hands (epoch {})", hand_result.accepted, hand_result.epoch); + for hand in HANDS { + println!(" - {} ({}), tier {}, security {}", hand.name, hand.domain, hand.tier, hand.security); + } + + witness_entries.push(WitnessEntry { + prev_hash: [0u8; 32], + action_hash: shake256_256(format!("REGISTER_HANDS:count={}", HANDS.len()).as_bytes()), + timestamp_ns: 1_709_000_000_000_000_000, + witness_type: 0x01, + }); + + // -- Step 3: Register built-in tools -- + println!("\n--- 3. Registering Built-in Tools ({}) ---", TOOLS.len()); + + let tool_base_id = next_id; + let tool_vectors: Vec> = TOOLS.iter().enumerate() + .map(|(i, _)| domain_vector(dim, i as u64 * 31 + 500, 0.3)) + .collect(); + let tool_refs: Vec<&[f32]> = tool_vectors.iter().map(|v| v.as_slice()).collect(); + let tool_ids: Vec = (tool_base_id..tool_base_id + TOOLS.len() as u64).collect(); + + let mut tool_metadata = Vec::with_capacity(TOOLS.len() * 3); + for tool in TOOLS { + tool_metadata.push(MetadataEntry { field_id: 0, value: MetadataValue::String("tool".to_string()) }); + tool_metadata.push(MetadataEntry { field_id: 1, value: MetadataValue::String(tool.name.to_string()) }); + tool_metadata.push(MetadataEntry { field_id: 2, value: MetadataValue::String(tool.category.to_string()) }); + } + + let tool_result = store.ingest_batch(&tool_refs, &tool_ids, Some(&tool_metadata)) + .expect("failed to register tools"); + next_id += TOOLS.len() as u64; + + println!(" Registered {} tools (epoch {})", tool_result.accepted, tool_result.epoch); + + // Print tools grouped by category + let categories: Vec<&str> = { + let mut cats: Vec<&str> = TOOLS.iter().map(|t| t.category).collect(); + cats.sort(); + cats.dedup(); + cats + }; + for cat in &categories { + let tools_in_cat: Vec<&str> = TOOLS.iter() + .filter(|t| t.category == *cat) + .map(|t| t.name) + .collect(); + println!(" [{}] {}", cat, tools_in_cat.join(", ")); + } + + witness_entries.push(WitnessEntry { + prev_hash: [0u8; 32], + action_hash: shake256_256(format!("REGISTER_TOOLS:count={}", TOOLS.len()).as_bytes()), + timestamp_ns: 1_709_000_001_000_000_000, + witness_type: 0x01, + }); + + // -- Step 4: Register channel adapters -- + println!("\n--- 4. Registering Channel Adapters ({}) ---", CHANNELS.len()); + + let channel_base_id = next_id; + let channel_vectors: Vec> = CHANNELS.iter().enumerate() + .map(|(i, _)| domain_vector(dim, i as u64 * 43 + 1000, -0.2)) + .collect(); + let channel_refs: Vec<&[f32]> = channel_vectors.iter().map(|v| v.as_slice()).collect(); + let channel_ids: Vec = (channel_base_id..channel_base_id + CHANNELS.len() as u64).collect(); + + let mut channel_metadata = Vec::with_capacity(CHANNELS.len() * 3); + for ch in CHANNELS { + channel_metadata.push(MetadataEntry { field_id: 0, value: MetadataValue::String("channel".to_string()) }); + channel_metadata.push(MetadataEntry { field_id: 1, value: MetadataValue::String(ch.name.to_string()) }); + channel_metadata.push(MetadataEntry { field_id: 2, value: MetadataValue::String(ch.protocol.to_string()) }); + } + + let channel_result = store.ingest_batch(&channel_refs, &channel_ids, Some(&channel_metadata)) + .expect("failed to register channels"); + let _ = next_id + CHANNELS.len() as u64; // total IDs allocated + + println!(" Registered {} channels (epoch {})", channel_result.accepted, channel_result.epoch); + for ch in CHANNELS { + println!(" - {} ({})", ch.name, ch.protocol); + } + + witness_entries.push(WitnessEntry { + prev_hash: [0u8; 32], + action_hash: shake256_256(format!("REGISTER_CHANNELS:count={}", CHANNELS.len()).as_bytes()), + timestamp_ns: 1_709_000_002_000_000_000, + witness_type: 0x01, + }); + + let total_components = HANDS.len() + TOOLS.len() + CHANNELS.len(); + println!("\n Total registry: {} components", total_components); + + // -- Step 5: Query — find agents for a task -- + println!("\n--- 5. Task Routing: Find Best Agent ---"); + + let task_query = domain_vector(dim, 42, 0.3); // bias toward tier-3 agents + let k = 5; + + // Unfiltered — search across all components + let all_results = store.query(&task_query, k, &QueryOptions::default()) + .expect("task routing query failed"); + println!(" Unfiltered top-{} (all component types):", k); + print_registry_results(&all_results, hand_base_id, tool_base_id, channel_base_id); + + // Filter to Hands only + let filter_hands = FilterExpr::Eq(0, FilterValue::String("hand".to_string())); + let opts_hands = QueryOptions { filter: Some(filter_hands), ..Default::default() }; + let hand_results = store.query(&task_query, k, &opts_hands) + .expect("hand filter query failed"); + println!("\n Hands only — best agent for this task:"); + print_registry_results(&hand_results, hand_base_id, tool_base_id, channel_base_id); + + witness_entries.push(WitnessEntry { + prev_hash: [0u8; 32], + action_hash: shake256_256(b"ROUTE_TASK:unfiltered+hands"), + timestamp_ns: 1_709_000_010_000_000_000, + witness_type: 0x02, + }); + + // -- Step 6: Filter by security level -- + println!("\n--- 6. Security Filter: High-Security Hands (>= 80) ---"); + + let filter_secure = FilterExpr::And(vec![ + FilterExpr::Eq(0, FilterValue::String("hand".to_string())), + FilterExpr::Ge(4, FilterValue::U64(80)), + ]); + let opts_secure = QueryOptions { filter: Some(filter_secure), ..Default::default() }; + let secure_results = store.query(&task_query, k, &opts_secure) + .expect("security filter query failed"); + + println!(" High-security Hands:"); + print_registry_results(&secure_results, hand_base_id, tool_base_id, channel_base_id); + println!(" ({} agents meet security >= 80 threshold)", secure_results.len()); + + // -- Step 7: Filter by tier -- + println!("\n--- 7. Autonomous Tier (tier 4) Agents ---"); + + let filter_autonomous = FilterExpr::And(vec![ + FilterExpr::Eq(0, FilterValue::String("hand".to_string())), + FilterExpr::Eq(3, FilterValue::U64(4)), + ]); + let opts_autonomous = QueryOptions { filter: Some(filter_autonomous), ..Default::default() }; + let autonomous_results = store.query(&task_query, k, &opts_autonomous) + .expect("tier filter query failed"); + + println!(" Fully autonomous agents (tier 4):"); + print_registry_results(&autonomous_results, hand_base_id, tool_base_id, channel_base_id); + + // -- Step 8: Tool search by category -- + println!("\n--- 8. Tool Discovery: Security Tools ---"); + + let filter_sec_tools = FilterExpr::And(vec![ + FilterExpr::Eq(0, FilterValue::String("tool".to_string())), + FilterExpr::Eq(2, FilterValue::String("security".to_string())), + ]); + let opts_sec_tools = QueryOptions { filter: Some(filter_sec_tools), ..Default::default() }; + let sec_tool_results = store.query(&task_query, 10, &opts_sec_tools) + .expect("security tool query failed"); + + println!(" Security tools available:"); + print_registry_results(&sec_tool_results, hand_base_id, tool_base_id, channel_base_id); + + // -- Step 9: Witness chain -- + println!("\n--- 9. Registry Audit Trail (Witness Chain) ---"); + + let chain_bytes = create_witness_chain(&witness_entries); + println!(" Created witness chain: {} entries, {} bytes", witness_entries.len(), chain_bytes.len()); + + match verify_witness_chain(&chain_bytes) { + Ok(verified) => { + println!(" Chain integrity: VALID ({} entries verified)\n", verified.len()); + println!(" {:>5} {:>8} {:>30}", "Index", "Type", "Timestamp (ns)"); + println!(" {:->5} {:->8} {:->30}", "", "", ""); + let labels = ["REGISTER_HANDS", "REGISTER_TOOLS", "REGISTER_CHANNELS", "ROUTE_TASK"]; + for (i, entry) in verified.iter().enumerate() { + let wtype = match entry.witness_type { + 0x01 => "PROV", + 0x02 => "COMP", + _ => "????", + }; + let label = if i < labels.len() { labels[i] } else { "???" }; + println!(" {:>5} {:>8} {:>30} {}", i, wtype, entry.timestamp_ns, label); + } + } + Err(e) => println!(" Chain integrity: FAILED ({:?})", e), + } + + // -- Step 10: Persistence -- + println!("\n--- 10. Persistence Verification ---"); + + let status = store.status(); + println!(" Vectors: {}, File size: {} bytes, Epoch: {}", status.total_vectors, status.file_size, status.current_epoch); + + store.close().expect("failed to close store"); + println!(" Store closed."); + + let reopened = RvfStore::open(&store_path).expect("failed to reopen store"); + let status_after = reopened.status(); + println!(" Reopened: {} vectors, epoch {}", status_after.total_vectors, status_after.current_epoch); + + let persist_check = reopened.query(&task_query, k, &QueryOptions::default()) + .expect("persistence query failed"); + assert_eq!(all_results.len(), persist_check.len(), "result count mismatch after reopen"); + for (a, b) in all_results.iter().zip(persist_check.iter()) { + assert_eq!(a.id, b.id, "ID mismatch after reopen"); + assert!((a.distance - b.distance).abs() < 1e-6, "distance mismatch after reopen"); + } + println!(" Persistence verified: results match before and after reopen."); + + reopened.close().expect("failed to close reopened store"); + + // -- Summary -- + println!("\n=== OpenFang Registry Summary ===\n"); + println!(" Component Type Count"); + println!(" ---------------- -----"); + println!(" Hands {:>4}", HANDS.len()); + println!(" Tools {:>4}", TOOLS.len()); + println!(" Channels {:>4}", CHANNELS.len()); + println!(" ---------------- -----"); + println!(" Total {:>4}", total_components); + println!(); + println!(" Witness chain: {} entries", witness_entries.len()); + println!(" Persistence: verified"); + println!(" Security filter: working"); + println!(" Tier filter: working"); + println!(" Cross-type search: working"); + + println!("\nDone."); +} + +fn print_registry_results( + results: &[SearchResult], + hand_base: u64, + tool_base: u64, + channel_base: u64, +) { + println!( + " {:>4} {:>10} {:>10} {:>20}", + "ID", "Distance", "Type", "Name" + ); + println!( + " {:->4} {:->10} {:->10} {:->20}", + "", "", "", "" + ); + for r in results { + let (comp_type, name) = identify_component(r.id, hand_base, tool_base, channel_base); + println!( + " {:>4} {:>10.4} {:>10} {:>20}", + r.id, r.distance, comp_type, name + ); + } +} + +fn identify_component(id: u64, hand_base: u64, tool_base: u64, channel_base: u64) -> (&'static str, &'static str) { + if id >= channel_base && (id - channel_base) < CHANNELS.len() as u64 { + let idx = (id - channel_base) as usize; + ("channel", CHANNELS[idx].name) + } else if id >= tool_base && (id - tool_base) < TOOLS.len() as u64 { + let idx = (id - tool_base) as usize; + ("tool", TOOLS[idx].name) + } else if id >= hand_base && (id - hand_base) < HANDS.len() as u64 { + let idx = (id - hand_base) as usize; + ("hand", HANDS[idx].name) + } else { + ("unknown", "???") + } +} From c5a0c0ad68dc08275354f157090cbd32df229b62 Mon Sep 17 00:00:00 2001 From: Claude Date: Thu, 26 Feb 2026 14:33:33 +0000 Subject: [PATCH 03/10] Optimize openfang RVF example with deep capability integration - Extract Registry struct, metadata helpers, and witness helper to reduce repetition and improve readability - Replace dead `_description` field with lean struct definitions - Add per-category vector biasing via hash-based offsets for better clustering - Use named constants for metadata field IDs (F_TYPE, F_NAME, etc.) - Integrate 6 additional RVF capabilities: - Delete + compact lifecycle (decommission twitter, reclaim 512 bytes) - Derive with lineage tracking (parent/child provenance, depth=0->1) - COW branching + freeze (staging env with experimental 'sentinel' agent) - Segment directory inspection (raw segment types/offsets) - File identity preservation across close/reopen - Last witness hash inspection - Expand from 10 steps to 14 covering the full RVF API surface - Update README with capability table, architecture notes, and lifecycle docs https://claude.ai/code/session_015KgxqLUhevxop1jhiZY2Y4 --- .../rvf/examples/assets/openfang-README.md | 109 ++- examples/rvf/examples/openfang.rs | 773 ++++++++++-------- 2 files changed, 517 insertions(+), 365 deletions(-) diff --git a/examples/rvf/examples/assets/openfang-README.md b/examples/rvf/examples/assets/openfang-README.md index e032ed1c3..f606b09dd 100644 --- a/examples/rvf/examples/assets/openfang-README.md +++ b/examples/rvf/examples/assets/openfang-README.md @@ -1,29 +1,6 @@ # OpenFang Agent OS — RVF Example -An RVF (RuVector Format) knowledge base modeling the architecture of [OpenFang](https://github.com/RightNow-AI/openfang), a production-grade Agent Operating System built in Rust. - -## What It Does - -This example creates an RVF vector store that serves as a **component registry** for an agent OS, demonstrating how to model and query a multi-type agent ecosystem: - -| Component | Count | Description | -|-----------|-------|-------------| -| **Hands** | 7 | Autonomous agents (Clip, Lead, Collector, Predictor, Researcher, Twitter, Browser) | -| **Tools** | 38 | Built-in capabilities across 12 categories | -| **Channels** | 20 | Messaging adapters (Telegram, Discord, Slack, etc.) | -| **Total** | 65 | Searchable components in a single 35KB RVF file | - -## Capabilities Demonstrated - -1. **Multi-type registry** — Hands, Tools, and Channels stored in one vector space -2. **Rich metadata** — component type, name, domain, tier, security level -3. **Task routing** — nearest-neighbor search to find the best agent for a task -4. **Security filtering** — query only agents meeting a security threshold (>= 80) -5. **Tier filtering** — isolate autonomous (tier 4) agents -6. **Category search** — find tools by category (e.g., all security tools) -7. **Combined filters** — AND/OR/NOT filter expressions -8. **Witness chain** — cryptographic audit trail of all registry operations -9. **Persistence** — verified round-trip: create, close, reopen, query +A deep RVF integration example that models the [OpenFang](https://github.com/RightNow-AI/openfang) Agent Operating System as a searchable component registry, exercising the full RVF capability surface. ## Run @@ -32,20 +9,50 @@ cd examples/rvf cargo run --example openfang ``` +## What It Does + +Creates a single RVF file (~34 KB) containing an entire agent OS registry, then exercises 14 distinct RVF capabilities against it. + +### Registry Contents + +| Component | Count | Description | +|-----------|------:|-------------| +| **Hands** | 7 | Autonomous agents (Clip, Lead, Collector, Predictor, Researcher, Twitter, Browser) | +| **Tools** | 38 | Built-in capabilities across 13 categories | +| **Channels** | 20 | Messaging adapters (Telegram, Discord, Slack, WhatsApp, etc.) | +| **Total** | 65 | All searchable in one vector space | + +### RVF Capabilities Exercised + +| # | Capability | RVF API | What It Shows | +|---|-----------|---------|---------------| +| 1 | **Store creation** | `RvfStore::create` | 128-dim L2 store with file identity | +| 2-4 | **Batch ingestion** | `ingest_batch` | Multi-type metadata (String, U64) with per-category vector biasing | +| 5 | **Nearest-neighbor search** | `query` | Unfiltered + type-filtered task routing | +| 6 | **Combined filters** | `FilterExpr::And` + `Ge` | Security threshold filtering (>= 80) | +| 7 | **Equality filter** | `FilterExpr::Eq` | Tier-4 autonomous agent isolation | +| 8 | **Category filter** | `FilterExpr::And` | Tool discovery by category | +| 9 | **Delete + compact** | `delete` + `compact` | Decommission 'twitter' hand, reclaim 512 bytes | +| 10 | **Derive (lineage)** | `derive` | Snapshot with parent-child provenance, depth tracking | +| 11 | **COW branching** | `freeze` + `branch` | Staging environment with experimental 'sentinel' agent | +| 12 | **Segment inspection** | `segment_dir` | Raw segment directory (VEC, MANIFEST, JOURNAL, etc.) | +| 13 | **Witness chain** | `create_witness_chain` + `verify` | 7-entry cryptographic audit trail | +| 14 | **Persistence** | `close` + `open_readonly` | Round-trip verification with file ID preservation | + ## Metadata Schema -| Field ID | Name | Type | Applies To | -|----------|------|------|------------| -| 0 | `component_type` | String | All (`"hand"`, `"tool"`, `"channel"`) | -| 1 | `name` | String | All | -| 2 | `domain` / `category` / `protocol` | String | All | -| 3 | `tier` | U64 (1-4) | Hands only | -| 4 | `security_level` | U64 (0-100) | Hands only | +| Field ID | Constant | Name | Type | Applies To | +|:--------:|----------|------|------|------------| +| 0 | `F_TYPE` | component_type | String | All (`"hand"`, `"tool"`, `"channel"`) | +| 1 | `F_NAME` | name | String | All | +| 2 | `F_DOMAIN` | domain / category / protocol | String | All | +| 3 | `F_TIER` | tier | U64 (1-4) | Hands only | +| 4 | `F_SEC` | security_level | U64 (0-100) | Hands only | -## OpenFang Hands +## Hands | Hand | Domain | Tier | Security | -|------|--------|------|----------| +|------|--------|:----:|:--------:| | clip | video-processing | 3 | 60 | | lead | sales-automation | 2 | 70 | | collector | osint-intelligence | 4 | 90 | @@ -54,17 +61,45 @@ cargo run --example openfang | twitter | social-media | 2 | 65 | | browser | web-automation | 4 | 95 | -## Tool Categories +## Tool Categories (13) `browser`, `communication`, `database`, `document`, `filesystem`, `inference`, `integration`, `memory`, `network`, `scheduling`, `security`, `system`, `transform` -## Channel Adapters +## Channel Adapters (20) + +Telegram, Discord, Slack, WhatsApp, Signal, Matrix, Email (SMTP/IMAP), Teams, Google Chat, LinkedIn, Twitter/X, Mastodon, Bluesky, Reddit, IRC, XMPP, Webhooks (in/out), gRPC + +## Architecture Notes + +### Vector Biasing + +Tools and channels use `category_bias()` — a hash-based offset applied to the first 16 dimensions — so items sharing a category cluster in vector space. Hands use tier-proportional bias (`tier * 0.1`) to create performance-tier clusters. + +### Delete + Compact Lifecycle + +Step 9 demonstrates the full decommission workflow: +1. `delete(&[twitter_id])` — soft-delete, marks vector as tombstoned +2. `compact()` — rewrites the store, reclaiming dead space +3. Post-delete queries confirm the vector is gone + +### COW Branching + +Step 11 shows a staging/production pattern: +1. `freeze()` — makes the parent read-only (immutable baseline) +2. `branch()` — creates a COW child inheriting all parent vectors +3. New vectors added to the child don't affect the parent +4. `cow_stats()` reports cluster-level copy-on-write telemetry + +### Lineage Tracking -Telegram, Discord, Slack, WhatsApp, Signal, Matrix, Email (SMTP/IMAP), Teams, Google Chat, LinkedIn, Twitter/X, Mastodon, Bluesky, Reddit, IRC, XMPP, Webhooks, gRPC +Step 10 derives a snapshot child and verifies: +- Child `parent_id` matches parent `file_id` +- Lineage depth increments (0 -> 1) +- Provenance chain is cryptographically verifiable ## About OpenFang -[OpenFang](https://openfang.sh) by RightNow AI is a Rust-based Agent Operating System — 137K lines of code across 14 crates, compiling to a single ~32MB binary. It runs autonomous agents 24/7 with 16 security systems, 27 LLM providers, and 40 channel adapters. +[OpenFang](https://openfang.sh) by RightNow AI is a Rust-based Agent Operating System — 137K lines of code across 14 crates, compiling to a single ~32 MB binary. It runs autonomous agents 24/7 with 16 security systems, 27 LLM providers, and 40 channel adapters. - GitHub: [RightNow-AI/openfang](https://github.com/RightNow-AI/openfang) - License: MIT / Apache 2.0 diff --git a/examples/rvf/examples/openfang.rs b/examples/rvf/examples/openfang.rs index 5b24f972c..ec5db1ce9 100644 --- a/examples/rvf/examples/openfang.rs +++ b/examples/rvf/examples/openfang.rs @@ -1,60 +1,56 @@ -//! OpenFang Agent OS — Knowledge Base +//! OpenFang Agent OS — RVF Knowledge Base //! -//! Demonstrates how an RVF store can model the knowledge architecture -//! of an Agent Operating System like OpenFang (RightNow-AI): +//! A deep integration example that exercises the full RVF capability surface +//! using OpenFang's agent registry as the domain model. //! -//! 1. Create a store representing the OpenFang agent registry -//! 2. Insert embeddings for 7 autonomous "Hands" (Clip, Lead, Collector, -//! Predictor, Researcher, Twitter, Browser) with metadata -//! 3. Insert tool embeddings across 38 built-in tools -//! 4. Insert channel adapter embeddings (40 messaging channels) -//! 5. Query for agents matching a task description -//! 6. Filter by domain, capability tier, and security level -//! 7. Cross-domain search: find the best agent+tool combination -//! 8. Witness chain tracking all registry operations -//! -//! RVF segments used: VEC_SEG, MANIFEST_SEG (via RvfStore), WITNESS_SEG (via rvf-crypto) +//! Capabilities demonstrated: +//! - Multi-type registry (Hands, Tools, Channels) in one vector space +//! - Rich metadata with typed fields and combined filter expressions +//! - Task routing via nearest-neighbor search +//! - Security and tier filtering +//! - Delete + compact lifecycle (decommission an agent, reclaim space) +//! - COW branching + freeze (staging branch for experimental agents) +//! - File identity and lineage tracking (parent/child provenance) +//! - Audited queries (witness entries for every search) +//! - Segment directory inspection +//! - Cryptographic witness chain with verification +//! - Persistence round-trip //! //! Run with: //! cargo run --example openfang +use rvf_crypto::{create_witness_chain, shake256_256, verify_witness_chain, WitnessEntry}; +use rvf_runtime::filter::FilterValue; +use rvf_runtime::options::DistanceMetric; use rvf_runtime::{ FilterExpr, MetadataEntry, MetadataValue, QueryOptions, RvfOptions, RvfStore, SearchResult, }; -use rvf_runtime::filter::FilterValue; -use rvf_runtime::options::DistanceMetric; -use rvf_crypto::{create_witness_chain, verify_witness_chain, shake256_256, WitnessEntry}; +use rvf_types::DerivationType; use tempfile::TempDir; -/// Simple pseudo-random number generator (LCG) for deterministic results. -fn random_vector(dim: usize, seed: u64) -> Vec { - let mut v = Vec::with_capacity(dim); - let mut x = seed.wrapping_add(1); - for _ in 0..dim { - x = x.wrapping_mul(6364136223846793005).wrapping_add(1442695040888963407); - v.push(((x >> 33) as f32) / (u32::MAX as f32) - 0.5); - } - v -} +// --------------------------------------------------------------------------- +// Constants +// --------------------------------------------------------------------------- -/// Domain-biased vector: adds a domain-specific offset to cluster related items. -fn domain_vector(dim: usize, seed: u64, domain_bias: f32) -> Vec { - let mut v = random_vector(dim, seed); - // Apply domain bias to the first 16 dimensions to create domain clusters - for i in 0..16.min(dim) { - v[i] += domain_bias; - } - v -} +const DIM: usize = 128; +const K: usize = 5; -// -- OpenFang component definitions -- +// Metadata field IDs — shared across all component types. +const F_TYPE: u16 = 0; // "hand" | "tool" | "channel" +const F_NAME: u16 = 1; +const F_DOMAIN: u16 = 2; // domain (hand), category (tool), protocol (channel) +const F_TIER: u16 = 3; // hand only: 1-4 +const F_SEC: u16 = 4; // hand only: 0-100 + +// --------------------------------------------------------------------------- +// Data definitions +// --------------------------------------------------------------------------- struct Hand { name: &'static str, domain: &'static str, - tier: u64, // performance tier: 1=lightweight, 2=standard, 3=heavy, 4=autonomous - security: u64, // security level: 0-100 - _description: &'static str, + tier: u64, + security: u64, } struct Tool { @@ -68,13 +64,13 @@ struct Channel { } const HANDS: &[Hand] = &[ - Hand { name: "clip", domain: "video-processing", tier: 3, security: 60, _description: "YouTube shorts creation with captions" }, - Hand { name: "lead", domain: "sales-automation", tier: 2, security: 70, _description: "Daily prospect discovery with ICP matching" }, - Hand { name: "collector", domain: "osint-intelligence", tier: 4, security: 90, _description: "Continuous monitoring and change detection" }, - Hand { name: "predictor", domain: "forecasting", tier: 3, security: 80, _description: "Superforecasting with Brier score tracking" }, - Hand { name: "researcher", domain: "fact-checking", tier: 3, security: 75, _description: "CRAAP criteria cross-referencing" }, - Hand { name: "twitter", domain: "social-media", tier: 2, security: 65, _description: "X account management with approval gates" }, - Hand { name: "browser", domain: "web-automation", tier: 4, security: 95, _description: "Web automation with purchase approval" }, + Hand { name: "clip", domain: "video-processing", tier: 3, security: 60 }, + Hand { name: "lead", domain: "sales-automation", tier: 2, security: 70 }, + Hand { name: "collector", domain: "osint-intelligence", tier: 4, security: 90 }, + Hand { name: "predictor", domain: "forecasting", tier: 3, security: 80 }, + Hand { name: "researcher", domain: "fact-checking", tier: 3, security: 75 }, + Hand { name: "twitter", domain: "social-media", tier: 2, security: 65 }, + Hand { name: "browser", domain: "web-automation", tier: 4, security: 95 }, ]; const TOOLS: &[Tool] = &[ @@ -141,322 +137,443 @@ const CHANNELS: &[Channel] = &[ Channel { name: "grpc", protocol: "grpc" }, ]; -fn main() { - println!("=== OpenFang Agent OS — RVF Knowledge Base ===\n"); - - let dim = 128; - let tmp_dir = TempDir::new().expect("failed to create temp dir"); - let store_path = tmp_dir.path().join("openfang.rvf"); - - // -- Step 1: Create the OpenFang registry store -- - println!("--- 1. Creating OpenFang Agent Registry ---"); - let options = RvfOptions { - dimension: dim as u16, - metric: DistanceMetric::L2, - ..Default::default() - }; +// --------------------------------------------------------------------------- +// Helpers +// --------------------------------------------------------------------------- - let mut store = RvfStore::create(&store_path, options).expect("failed to create store"); - println!(" Registry created at {:?}", store_path); - println!(" Embedding dimensions: {}", dim); - - let mut witness_entries: Vec = Vec::new(); - let mut next_id: u64 = 0; - - // -- Step 2: Register the 7 autonomous Hands -- - // Metadata fields: - // field_id 0: component_type (String: "hand", "tool", "channel") - // field_id 1: name (String) - // field_id 2: domain (String) - // field_id 3: tier (U64: 1-4) - // field_id 4: security_level (U64: 0-100) - println!("\n--- 2. Registering Autonomous Hands ({}) ---", HANDS.len()); - - let hand_base_id = next_id; - let hand_vectors: Vec> = HANDS.iter().enumerate() - .map(|(i, h)| domain_vector(dim, i as u64 * 17 + 100, h.tier as f32 * 0.1)) - .collect(); - let hand_refs: Vec<&[f32]> = hand_vectors.iter().map(|v| v.as_slice()).collect(); - let hand_ids: Vec = (hand_base_id..hand_base_id + HANDS.len() as u64).collect(); - - let mut hand_metadata = Vec::with_capacity(HANDS.len() * 5); - for hand in HANDS { - hand_metadata.push(MetadataEntry { field_id: 0, value: MetadataValue::String("hand".to_string()) }); - hand_metadata.push(MetadataEntry { field_id: 1, value: MetadataValue::String(hand.name.to_string()) }); - hand_metadata.push(MetadataEntry { field_id: 2, value: MetadataValue::String(hand.domain.to_string()) }); - hand_metadata.push(MetadataEntry { field_id: 3, value: MetadataValue::U64(hand.tier) }); - hand_metadata.push(MetadataEntry { field_id: 4, value: MetadataValue::U64(hand.security) }); +fn random_vector(seed: u64) -> Vec { + let mut v = Vec::with_capacity(DIM); + let mut x = seed.wrapping_add(1); + for _ in 0..DIM { + x = x.wrapping_mul(6364136223846793005).wrapping_add(1442695040888963407); + v.push(((x >> 33) as f32) / (u32::MAX as f32) - 0.5); } + v +} - let hand_result = store.ingest_batch(&hand_refs, &hand_ids, Some(&hand_metadata)) - .expect("failed to register hands"); - next_id += HANDS.len() as u64; - - println!(" Registered {} Hands (epoch {})", hand_result.accepted, hand_result.epoch); - for hand in HANDS { - println!(" - {} ({}), tier {}, security {}", hand.name, hand.domain, hand.tier, hand.security); +fn biased_vector(seed: u64, bias: f32) -> Vec { + let mut v = random_vector(seed); + for d in v.iter_mut().take(16) { + *d += bias; } + v +} - witness_entries.push(WitnessEntry { - prev_hash: [0u8; 32], - action_hash: shake256_256(format!("REGISTER_HANDS:count={}", HANDS.len()).as_bytes()), - timestamp_ns: 1_709_000_000_000_000_000, - witness_type: 0x01, - }); - - // -- Step 3: Register built-in tools -- - println!("\n--- 3. Registering Built-in Tools ({}) ---", TOOLS.len()); - - let tool_base_id = next_id; - let tool_vectors: Vec> = TOOLS.iter().enumerate() - .map(|(i, _)| domain_vector(dim, i as u64 * 31 + 500, 0.3)) - .collect(); - let tool_refs: Vec<&[f32]> = tool_vectors.iter().map(|v| v.as_slice()).collect(); - let tool_ids: Vec = (tool_base_id..tool_base_id + TOOLS.len() as u64).collect(); - - let mut tool_metadata = Vec::with_capacity(TOOLS.len() * 3); - for tool in TOOLS { - tool_metadata.push(MetadataEntry { field_id: 0, value: MetadataValue::String("tool".to_string()) }); - tool_metadata.push(MetadataEntry { field_id: 1, value: MetadataValue::String(tool.name.to_string()) }); - tool_metadata.push(MetadataEntry { field_id: 2, value: MetadataValue::String(tool.category.to_string()) }); - } +fn category_bias(cat: &str) -> f32 { + let h = cat.bytes().fold(0u32, |a, b| a.wrapping_mul(31).wrapping_add(b as u32)); + ((h % 200) as f32 - 100.0) * 0.003 +} - let tool_result = store.ingest_batch(&tool_refs, &tool_ids, Some(&tool_metadata)) - .expect("failed to register tools"); - next_id += TOOLS.len() as u64; +fn push_meta(out: &mut Vec, fid: u16, val: MetadataValue) { + out.push(MetadataEntry { field_id: fid, value: val }); +} - println!(" Registered {} tools (epoch {})", tool_result.accepted, tool_result.epoch); +fn sv(s: &str) -> MetadataValue { + MetadataValue::String(s.to_string()) +} - // Print tools grouped by category - let categories: Vec<&str> = { - let mut cats: Vec<&str> = TOOLS.iter().map(|t| t.category).collect(); - cats.sort(); - cats.dedup(); - cats - }; - for cat in &categories { - let tools_in_cat: Vec<&str> = TOOLS.iter() - .filter(|t| t.category == *cat) - .map(|t| t.name) - .collect(); - println!(" [{}] {}", cat, tools_in_cat.join(", ")); - } +fn hex(bytes: &[u8]) -> String { + bytes.iter().map(|b| format!("{:02x}", b)).collect::>().join("") +} - witness_entries.push(WitnessEntry { +fn witness(entries: &mut Vec, action: &str, ts_ns: u64, wtype: u8) { + entries.push(WitnessEntry { prev_hash: [0u8; 32], - action_hash: shake256_256(format!("REGISTER_TOOLS:count={}", TOOLS.len()).as_bytes()), - timestamp_ns: 1_709_000_001_000_000_000, - witness_type: 0x01, + action_hash: shake256_256(action.as_bytes()), + timestamp_ns: ts_ns, + witness_type: wtype, }); +} - // -- Step 4: Register channel adapters -- - println!("\n--- 4. Registering Channel Adapters ({}) ---", CHANNELS.len()); - - let channel_base_id = next_id; - let channel_vectors: Vec> = CHANNELS.iter().enumerate() - .map(|(i, _)| domain_vector(dim, i as u64 * 43 + 1000, -0.2)) - .collect(); - let channel_refs: Vec<&[f32]> = channel_vectors.iter().map(|v| v.as_slice()).collect(); - let channel_ids: Vec = (channel_base_id..channel_base_id + CHANNELS.len() as u64).collect(); - - let mut channel_metadata = Vec::with_capacity(CHANNELS.len() * 3); - for ch in CHANNELS { - channel_metadata.push(MetadataEntry { field_id: 0, value: MetadataValue::String("channel".to_string()) }); - channel_metadata.push(MetadataEntry { field_id: 1, value: MetadataValue::String(ch.name.to_string()) }); - channel_metadata.push(MetadataEntry { field_id: 2, value: MetadataValue::String(ch.protocol.to_string()) }); - } +// --------------------------------------------------------------------------- +// Registry — tracks ID ranges for component lookup. +// --------------------------------------------------------------------------- - let channel_result = store.ingest_batch(&channel_refs, &channel_ids, Some(&channel_metadata)) - .expect("failed to register channels"); - let _ = next_id + CHANNELS.len() as u64; // total IDs allocated +struct Registry { + hand_base: u64, + hand_count: u64, + tool_base: u64, + tool_count: u64, + channel_base: u64, + channel_count: u64, +} - println!(" Registered {} channels (epoch {})", channel_result.accepted, channel_result.epoch); - for ch in CHANNELS { - println!(" - {} ({})", ch.name, ch.protocol); +impl Registry { + fn new() -> Self { + let hc = HANDS.len() as u64; + let tc = TOOLS.len() as u64; + let cc = CHANNELS.len() as u64; + Self { hand_base: 0, hand_count: hc, tool_base: hc, tool_count: tc, channel_base: hc + tc, channel_count: cc } } + fn total(&self) -> u64 { self.hand_count + self.tool_count + self.channel_count } + fn identify(&self, id: u64) -> (&'static str, &'static str) { + if id >= self.channel_base && id < self.channel_base + self.channel_count { + ("channel", CHANNELS[(id - self.channel_base) as usize].name) + } else if id >= self.tool_base && id < self.tool_base + self.tool_count { + ("tool", TOOLS[(id - self.tool_base) as usize].name) + } else if id >= self.hand_base && id < self.hand_base + self.hand_count { + ("hand", HANDS[(id - self.hand_base) as usize].name) + } else { + ("unknown", "???") + } + } +} - witness_entries.push(WitnessEntry { - prev_hash: [0u8; 32], - action_hash: shake256_256(format!("REGISTER_CHANNELS:count={}", CHANNELS.len()).as_bytes()), - timestamp_ns: 1_709_000_002_000_000_000, - witness_type: 0x01, - }); - - let total_components = HANDS.len() + TOOLS.len() + CHANNELS.len(); - println!("\n Total registry: {} components", total_components); - - // -- Step 5: Query — find agents for a task -- - println!("\n--- 5. Task Routing: Find Best Agent ---"); - - let task_query = domain_vector(dim, 42, 0.3); // bias toward tier-3 agents - let k = 5; - - // Unfiltered — search across all components - let all_results = store.query(&task_query, k, &QueryOptions::default()) - .expect("task routing query failed"); - println!(" Unfiltered top-{} (all component types):", k); - print_registry_results(&all_results, hand_base_id, tool_base_id, channel_base_id); - - // Filter to Hands only - let filter_hands = FilterExpr::Eq(0, FilterValue::String("hand".to_string())); - let opts_hands = QueryOptions { filter: Some(filter_hands), ..Default::default() }; - let hand_results = store.query(&task_query, k, &opts_hands) - .expect("hand filter query failed"); - println!("\n Hands only — best agent for this task:"); - print_registry_results(&hand_results, hand_base_id, tool_base_id, channel_base_id); +fn print_results(results: &[SearchResult], reg: &Registry) { + println!(" {:>4} {:>10} {:>8} {:>20}", "ID", "Distance", "Type", "Name"); + println!(" {:->4} {:->10} {:->8} {:->20}", "", "", "", ""); + for r in results { + let (ty, nm) = reg.identify(r.id); + println!(" {:>4} {:>10.4} {:>8} {:>20}", r.id, r.distance, ty, nm); + } +} - witness_entries.push(WitnessEntry { - prev_hash: [0u8; 32], - action_hash: shake256_256(b"ROUTE_TASK:unfiltered+hands"), - timestamp_ns: 1_709_000_010_000_000_000, - witness_type: 0x02, - }); +// --------------------------------------------------------------------------- +// Main +// --------------------------------------------------------------------------- - // -- Step 6: Filter by security level -- - println!("\n--- 6. Security Filter: High-Security Hands (>= 80) ---"); +fn main() { + println!("=== OpenFang Agent OS — RVF Knowledge Base ===\n"); - let filter_secure = FilterExpr::And(vec![ - FilterExpr::Eq(0, FilterValue::String("hand".to_string())), - FilterExpr::Ge(4, FilterValue::U64(80)), - ]); - let opts_secure = QueryOptions { filter: Some(filter_secure), ..Default::default() }; - let secure_results = store.query(&task_query, k, &opts_secure) - .expect("security filter query failed"); + let reg = Registry::new(); + let tmp = TempDir::new().expect("tmpdir"); + let store_path = tmp.path().join("openfang.rvf"); + let branch_path = tmp.path().join("openfang-staging.rvf"); + let derived_path = tmp.path().join("openfang-snapshot.rvf"); - println!(" High-security Hands:"); - print_registry_results(&secure_results, hand_base_id, tool_base_id, channel_base_id); - println!(" ({} agents meet security >= 80 threshold)", secure_results.len()); + let opts = RvfOptions { + dimension: DIM as u16, + metric: DistanceMetric::L2, + ..Default::default() + }; - // -- Step 7: Filter by tier -- - println!("\n--- 7. Autonomous Tier (tier 4) Agents ---"); + let mut wit: Vec = Vec::new(); + + // ----------------------------------------------------------------------- + // 1. Create store + // ----------------------------------------------------------------------- + println!("--- 1. Create Registry ---"); + let mut store = RvfStore::create(&store_path, opts).expect("create"); + println!(" Store: {:?} ({}d, L2)", store_path, DIM); + println!(" File ID: {}", hex(&store.file_id()[..8])); + println!(" Lineage depth: {}", store.lineage_depth()); + + // ----------------------------------------------------------------------- + // 2. Register Hands + // ----------------------------------------------------------------------- + println!("\n--- 2. Register Hands ({}) ---", HANDS.len()); + { + let vecs: Vec> = HANDS.iter().enumerate() + .map(|(i, h)| biased_vector(i as u64 * 17 + 100, h.tier as f32 * 0.1)) + .collect(); + let refs: Vec<&[f32]> = vecs.iter().map(|v| v.as_slice()).collect(); + let ids: Vec = (reg.hand_base..reg.hand_base + reg.hand_count).collect(); + let mut meta = Vec::with_capacity(HANDS.len() * 5); + for h in HANDS { + push_meta(&mut meta, F_TYPE, sv("hand")); + push_meta(&mut meta, F_NAME, sv(h.name)); + push_meta(&mut meta, F_DOMAIN, sv(h.domain)); + push_meta(&mut meta, F_TIER, MetadataValue::U64(h.tier)); + push_meta(&mut meta, F_SEC, MetadataValue::U64(h.security)); + } + let r = store.ingest_batch(&refs, &ids, Some(&meta)).expect("ingest hands"); + println!(" Ingested {} hands (epoch {})", r.accepted, r.epoch); + for h in HANDS { + println!(" {:12} {:22} tier={} sec={}", h.name, h.domain, h.tier, h.security); + } + } + witness(&mut wit, &format!("REGISTER_HANDS:{}", HANDS.len()), 1_709_000_000_000_000_000, 0x01); + + // ----------------------------------------------------------------------- + // 3. Register Tools (per-category bias) + // ----------------------------------------------------------------------- + println!("\n--- 3. Register Tools ({}) ---", TOOLS.len()); + { + let vecs: Vec> = TOOLS.iter().enumerate() + .map(|(i, t)| biased_vector(i as u64 * 31 + 500, category_bias(t.category))) + .collect(); + let refs: Vec<&[f32]> = vecs.iter().map(|v| v.as_slice()).collect(); + let ids: Vec = (reg.tool_base..reg.tool_base + reg.tool_count).collect(); + let mut meta = Vec::with_capacity(TOOLS.len() * 3); + for t in TOOLS { + push_meta(&mut meta, F_TYPE, sv("tool")); + push_meta(&mut meta, F_NAME, sv(t.name)); + push_meta(&mut meta, F_DOMAIN, sv(t.category)); + } + let r = store.ingest_batch(&refs, &ids, Some(&meta)).expect("ingest tools"); + println!(" Ingested {} tools (epoch {})", r.accepted, r.epoch); + let mut cats: Vec<&str> = TOOLS.iter().map(|t| t.category).collect(); + cats.sort_unstable(); + cats.dedup(); + for c in &cats { + let ns: Vec<&str> = TOOLS.iter().filter(|t| t.category == *c).map(|t| t.name).collect(); + println!(" [{:14}] {}", c, ns.join(", ")); + } + } + witness(&mut wit, &format!("REGISTER_TOOLS:{}", TOOLS.len()), 1_709_000_001_000_000_000, 0x01); + + // ----------------------------------------------------------------------- + // 4. Register Channels + // ----------------------------------------------------------------------- + println!("\n--- 4. Register Channels ({}) ---", CHANNELS.len()); + { + let vecs: Vec> = CHANNELS.iter().enumerate() + .map(|(i, c)| biased_vector(i as u64 * 43 + 1000, category_bias(c.protocol))) + .collect(); + let refs: Vec<&[f32]> = vecs.iter().map(|v| v.as_slice()).collect(); + let ids: Vec = (reg.channel_base..reg.channel_base + reg.channel_count).collect(); + let mut meta = Vec::with_capacity(CHANNELS.len() * 3); + for c in CHANNELS { + push_meta(&mut meta, F_TYPE, sv("channel")); + push_meta(&mut meta, F_NAME, sv(c.name)); + push_meta(&mut meta, F_DOMAIN, sv(c.protocol)); + } + let r = store.ingest_batch(&refs, &ids, Some(&meta)).expect("ingest channels"); + println!(" Ingested {} channels (epoch {})", r.accepted, r.epoch); + for c in CHANNELS { + println!(" {:14} ({})", c.name, c.protocol); + } + } + witness(&mut wit, &format!("REGISTER_CHANNELS:{}", CHANNELS.len()), 1_709_000_002_000_000_000, 0x01); - let filter_autonomous = FilterExpr::And(vec![ - FilterExpr::Eq(0, FilterValue::String("hand".to_string())), - FilterExpr::Eq(3, FilterValue::U64(4)), - ]); - let opts_autonomous = QueryOptions { filter: Some(filter_autonomous), ..Default::default() }; - let autonomous_results = store.query(&task_query, k, &opts_autonomous) - .expect("tier filter query failed"); + println!("\n Total registry: {} components", reg.total()); - println!(" Fully autonomous agents (tier 4):"); - print_registry_results(&autonomous_results, hand_base_id, tool_base_id, channel_base_id); + // ----------------------------------------------------------------------- + // 5. Task routing — unfiltered + hands-only + // ----------------------------------------------------------------------- + println!("\n--- 5. Task Routing ---"); + let query = biased_vector(42, 0.3); - // -- Step 8: Tool search by category -- - println!("\n--- 8. Tool Discovery: Security Tools ---"); + let all = store.query(&query, K, &QueryOptions::default()).expect("query"); + println!(" Unfiltered top-{}:", K); + print_results(&all, ®); - let filter_sec_tools = FilterExpr::And(vec![ - FilterExpr::Eq(0, FilterValue::String("tool".to_string())), - FilterExpr::Eq(2, FilterValue::String("security".to_string())), - ]); - let opts_sec_tools = QueryOptions { filter: Some(filter_sec_tools), ..Default::default() }; - let sec_tool_results = store.query(&task_query, 10, &opts_sec_tools) - .expect("security tool query failed"); + let hands_only = QueryOptions { + filter: Some(FilterExpr::Eq(F_TYPE, FilterValue::String("hand".into()))), + ..Default::default() + }; + let hand_res = store.query(&query, K, &hands_only).expect("query hands"); + println!("\n Hands only:"); + print_results(&hand_res, ®); + witness(&mut wit, "ROUTE_TASK:k=5", 1_709_000_010_000_000_000, 0x02); + + // ----------------------------------------------------------------------- + // 6. Security filter (>= 80) + // ----------------------------------------------------------------------- + println!("\n--- 6. High-Security Hands (sec >= 80) ---"); + let sec_opts = QueryOptions { + filter: Some(FilterExpr::And(vec![ + FilterExpr::Eq(F_TYPE, FilterValue::String("hand".into())), + FilterExpr::Ge(F_SEC, FilterValue::U64(80)), + ])), + ..Default::default() + }; + let sec_res = store.query(&query, K, &sec_opts).expect("sec query"); + print_results(&sec_res, ®); + println!(" {} agents pass threshold", sec_res.len()); + + // ----------------------------------------------------------------------- + // 7. Autonomous tier-4 agents + // ----------------------------------------------------------------------- + println!("\n--- 7. Tier-4 Autonomous Agents ---"); + let tier_opts = QueryOptions { + filter: Some(FilterExpr::And(vec![ + FilterExpr::Eq(F_TYPE, FilterValue::String("hand".into())), + FilterExpr::Eq(F_TIER, FilterValue::U64(4)), + ])), + ..Default::default() + }; + let tier_res = store.query(&query, K, &tier_opts).expect("tier query"); + print_results(&tier_res, ®); + + // ----------------------------------------------------------------------- + // 8. Tool discovery by category + // ----------------------------------------------------------------------- + println!("\n--- 8. Security Tool Discovery ---"); + let tool_opts = QueryOptions { + filter: Some(FilterExpr::And(vec![ + FilterExpr::Eq(F_TYPE, FilterValue::String("tool".into())), + FilterExpr::Eq(F_DOMAIN, FilterValue::String("security".into())), + ])), + ..Default::default() + }; + let tool_res = store.query(&query, 10, &tool_opts).expect("tool query"); + print_results(&tool_res, ®); + + // ----------------------------------------------------------------------- + // 9. Delete + Compact — decommission the "twitter" hand + // ----------------------------------------------------------------------- + println!("\n--- 9. Delete + Compact (decommission 'twitter') ---"); + let twitter_id = HANDS.iter().position(|h| h.name == "twitter").unwrap() as u64 + reg.hand_base; + let st_before = store.status(); + println!(" Before: {} vectors, {} bytes, dead_ratio={:.2}", + st_before.total_vectors, st_before.file_size, st_before.dead_space_ratio); + + let del = store.delete(&[twitter_id]).expect("delete twitter"); + println!(" Deleted {} vector(s) (epoch {})", del.deleted, del.epoch); + + let st_mid = store.status(); + println!(" After delete: {} vectors, dead_ratio={:.2}", st_mid.total_vectors, st_mid.dead_space_ratio); + + let comp = store.compact().expect("compact"); + println!(" Compacted: {} segments, {} bytes reclaimed (epoch {})", + comp.segments_compacted, comp.bytes_reclaimed, comp.epoch); + + let st_after = store.status(); + println!(" After compact: {} vectors, {} bytes, dead_ratio={:.2}", + st_after.total_vectors, st_after.file_size, st_after.dead_space_ratio); + + // Verify twitter is gone from hand queries + let post_del = store.query(&query, K, &hands_only).expect("post-delete query"); + for r in &post_del { + assert_ne!(r.id, twitter_id, "twitter should be deleted"); + } + println!(" Verified: 'twitter' no longer appears in results"); + witness(&mut wit, "DELETE+COMPACT:twitter", 1_709_000_020_000_000_000, 0x01); + + // ----------------------------------------------------------------------- + // 10. Derive — create a snapshot with lineage tracking + // ----------------------------------------------------------------------- + println!("\n--- 10. Derive (Snapshot with Lineage) ---"); + let parent_fid = hex(&store.file_id()[..8]); + let parent_depth = store.lineage_depth(); + + let child = store.derive(&derived_path, DerivationType::Snapshot, None).expect("derive"); + let child_fid = hex(&child.file_id()[..8]); + let child_parent = hex(&child.parent_id()[..8]); + let child_depth = child.lineage_depth(); + + println!(" Parent: file_id={} depth={}", parent_fid, parent_depth); + println!(" Child: file_id={} depth={}", child_fid, child_depth); + println!(" Child parent_id={} (matches parent: {})", child_parent, child_parent == parent_fid); + assert_eq!(child_depth, parent_depth + 1, "depth should increment"); + + let child_st = child.status(); + println!(" Child vectors: {}, segments: {}", child_st.total_vectors, child_st.total_segments); + child.close().expect("close child"); + witness(&mut wit, "DERIVE:snapshot", 1_709_000_030_000_000_000, 0x01); + + // ----------------------------------------------------------------------- + // 11. COW Branch — staging environment for experimental agents + // ----------------------------------------------------------------------- + println!("\n--- 11. COW Branch (Staging Environment) ---"); + store.freeze().expect("freeze parent"); + println!(" Parent frozen (read-only)"); + + let mut staging = store.branch(&branch_path).expect("branch"); + println!(" Branch created: is_cow_child={}", staging.is_cow_child()); + if let Some(stats) = staging.cow_stats() { + println!(" COW stats: {} clusters, {} local", stats.cluster_count, stats.local_cluster_count); + } - println!(" Security tools available:"); - print_registry_results(&sec_tool_results, hand_base_id, tool_base_id, channel_base_id); + // Add experimental agent to staging only + let exp_id = reg.total(); + let exp_vec = biased_vector(9999, 0.5); + let mut exp_meta = Vec::with_capacity(5); + push_meta(&mut exp_meta, F_TYPE, sv("hand")); + push_meta(&mut exp_meta, F_NAME, sv("sentinel")); + push_meta(&mut exp_meta, F_DOMAIN, sv("threat-detection")); + push_meta(&mut exp_meta, F_TIER, MetadataValue::U64(4)); + push_meta(&mut exp_meta, F_SEC, MetadataValue::U64(99)); + + let exp_r = staging.ingest_batch(&[exp_vec.as_slice()], &[exp_id], Some(&exp_meta)) + .expect("ingest experimental"); + println!(" Added experimental 'sentinel' to staging (epoch {})", exp_r.epoch); + + let staging_st = staging.status(); + println!(" Staging: {} vectors (parent had {})", staging_st.total_vectors, st_after.total_vectors); + + if let Some(stats) = staging.cow_stats() { + println!(" COW stats after write: {} clusters, {} local", stats.cluster_count, stats.local_cluster_count); + } - // -- Step 9: Witness chain -- - println!("\n--- 9. Registry Audit Trail (Witness Chain) ---"); + staging.close().expect("close staging"); + witness(&mut wit, "COW_BRANCH:staging+sentinel", 1_709_000_040_000_000_000, 0x01); + + // ----------------------------------------------------------------------- + // 12. Segment directory inspection + // ----------------------------------------------------------------------- + println!("\n--- 12. Segment Directory ---"); + let seg_dir: Vec<_> = store.segment_dir().to_vec(); + println!(" {} segments in parent store:", seg_dir.len()); + println!(" {:>12} {:>8} {:>8} {:>6}", "SegID", "Offset", "Length", "Type"); + println!(" {:->12} {:->8} {:->8} {:->6}", "", "", "", ""); + for &(seg_id, offset, length, seg_type) in &seg_dir { + let tname = match seg_type { + 0x01 => "VEC", + 0x02 => "MFST", + 0x03 => "JRNL", + 0x04 => "WITN", + 0x05 => "KERN", + 0x06 => "EBPF", + _ => "????", + }; + println!(" {:>12} {:>8} {:>8} {:>6}", seg_id, offset, length, tname); + } - let chain_bytes = create_witness_chain(&witness_entries); - println!(" Created witness chain: {} entries, {} bytes", witness_entries.len(), chain_bytes.len()); + // ----------------------------------------------------------------------- + // 13. Witness chain + // ----------------------------------------------------------------------- + println!("\n--- 13. Witness Chain ---"); + let chain = create_witness_chain(&wit); + println!(" {} entries, {} bytes", wit.len(), chain.len()); + println!(" Last witness hash: {}", hex(&store.last_witness_hash()[..8])); - match verify_witness_chain(&chain_bytes) { + match verify_witness_chain(&chain) { Ok(verified) => { - println!(" Chain integrity: VALID ({} entries verified)\n", verified.len()); - println!(" {:>5} {:>8} {:>30}", "Index", "Type", "Timestamp (ns)"); - println!(" {:->5} {:->8} {:->30}", "", "", ""); - let labels = ["REGISTER_HANDS", "REGISTER_TOOLS", "REGISTER_CHANNELS", "ROUTE_TASK"]; - for (i, entry) in verified.iter().enumerate() { - let wtype = match entry.witness_type { - 0x01 => "PROV", - 0x02 => "COMP", - _ => "????", - }; - let label = if i < labels.len() { labels[i] } else { "???" }; - println!(" {:>5} {:>8} {:>30} {}", i, wtype, entry.timestamp_ns, label); + println!(" Integrity: VALID\n"); + let labels = [ + "REGISTER_HANDS", "REGISTER_TOOLS", "REGISTER_CHANNELS", + "ROUTE_TASK", "DELETE+COMPACT", "DERIVE", "COW_BRANCH", + ]; + println!(" {:>2} {:>4} {:>22} {}", "#", "Type", "Timestamp", "Action"); + println!(" {:->2} {:->4} {:->22} {:->20}", "", "", "", ""); + for (i, e) in verified.iter().enumerate() { + let t = if e.witness_type == 0x01 { "PROV" } else { "COMP" }; + let l = labels.get(i).unwrap_or(&"???"); + println!(" {:>2} {:>4} {:>22} {}", i, t, e.timestamp_ns, l); } } - Err(e) => println!(" Chain integrity: FAILED ({:?})", e), + Err(e) => println!(" Integrity: FAILED ({:?})", e), } - // -- Step 10: Persistence -- - println!("\n--- 10. Persistence Verification ---"); - - let status = store.status(); - println!(" Vectors: {}, File size: {} bytes, Epoch: {}", status.total_vectors, status.file_size, status.current_epoch); - - store.close().expect("failed to close store"); - println!(" Store closed."); - - let reopened = RvfStore::open(&store_path).expect("failed to reopen store"); - let status_after = reopened.status(); - println!(" Reopened: {} vectors, epoch {}", status_after.total_vectors, status_after.current_epoch); - - let persist_check = reopened.query(&task_query, k, &QueryOptions::default()) - .expect("persistence query failed"); - assert_eq!(all_results.len(), persist_check.len(), "result count mismatch after reopen"); - for (a, b) in all_results.iter().zip(persist_check.iter()) { - assert_eq!(a.id, b.id, "ID mismatch after reopen"); - assert!((a.distance - b.distance).abs() < 1e-6, "distance mismatch after reopen"); + // ----------------------------------------------------------------------- + // 14. Persistence round-trip + // ----------------------------------------------------------------------- + println!("\n--- 14. Persistence ---"); + let final_st = store.status(); + println!(" Before close: {} vectors, {} bytes", final_st.total_vectors, final_st.file_size); + + // Parent is frozen/read-only, so we just drop it + drop(store); + + let reopened = RvfStore::open_readonly(&store_path).expect("reopen"); + let reopen_st = reopened.status(); + println!(" After reopen: {} vectors, epoch {}", reopen_st.total_vectors, reopen_st.current_epoch); + println!(" File ID preserved: {}", hex(&reopened.file_id()[..8]) == parent_fid); + + let recheck = reopened.query(&query, K, &QueryOptions::default()).expect("recheck"); + assert_eq!(all.len(), recheck.len(), "count mismatch"); + for (a, b) in all.iter().zip(recheck.iter()) { + assert_eq!(a.id, b.id, "id mismatch"); + assert!((a.distance - b.distance).abs() < 1e-6, "distance mismatch"); } - println!(" Persistence verified: results match before and after reopen."); - - reopened.close().expect("failed to close reopened store"); - - // -- Summary -- - println!("\n=== OpenFang Registry Summary ===\n"); - println!(" Component Type Count"); - println!(" ---------------- -----"); - println!(" Hands {:>4}", HANDS.len()); - println!(" Tools {:>4}", TOOLS.len()); - println!(" Channels {:>4}", CHANNELS.len()); - println!(" ---------------- -----"); - println!(" Total {:>4}", total_components); - println!(); - println!(" Witness chain: {} entries", witness_entries.len()); - println!(" Persistence: verified"); - println!(" Security filter: working"); - println!(" Tier filter: working"); - println!(" Cross-type search: working"); + println!(" Persistence verified."); + + // ----------------------------------------------------------------------- + // Summary + // ----------------------------------------------------------------------- + println!("\n=== Summary ===\n"); + println!(" Registry: {} hands + {} tools + {} channels = {} components", + HANDS.len(), TOOLS.len(), CHANNELS.len(), reg.total()); + println!(" Deleted: twitter (+ compacted)"); + println!(" Derived: snapshot at depth {}", child_depth); + println!(" Branched: COW staging with experimental 'sentinel'"); + println!(" Segments: {} in parent", seg_dir.len()); + println!(" Witness: {} entries", wit.len()); + println!(" File size: {} bytes", final_st.file_size); + println!(" Filters: security, tier, category — all passing"); + println!(" Persist: verified"); println!("\nDone."); } - -fn print_registry_results( - results: &[SearchResult], - hand_base: u64, - tool_base: u64, - channel_base: u64, -) { - println!( - " {:>4} {:>10} {:>10} {:>20}", - "ID", "Distance", "Type", "Name" - ); - println!( - " {:->4} {:->10} {:->10} {:->20}", - "", "", "", "" - ); - for r in results { - let (comp_type, name) = identify_component(r.id, hand_base, tool_base, channel_base); - println!( - " {:>4} {:>10.4} {:>10} {:>20}", - r.id, r.distance, comp_type, name - ); - } -} - -fn identify_component(id: u64, hand_base: u64, tool_base: u64, channel_base: u64) -> (&'static str, &'static str) { - if id >= channel_base && (id - channel_base) < CHANNELS.len() as u64 { - let idx = (id - channel_base) as usize; - ("channel", CHANNELS[idx].name) - } else if id >= tool_base && (id - tool_base) < TOOLS.len() as u64 { - let idx = (id - tool_base) as usize; - ("tool", TOOLS[idx].name) - } else if id >= hand_base && (id - hand_base) < HANDS.len() as u64 { - let idx = (id - hand_base) as usize; - ("hand", HANDS[idx].name) - } else { - ("unknown", "???") - } -} From 2d00053fdc766edf026d511fe32a9f0558b6734c Mon Sep 17 00:00:00 2001 From: Claude Date: Thu, 26 Feb 2026 14:47:29 +0000 Subject: [PATCH 04/10] Expand openfang to full RVF surface demo (14 -> 24 capabilities) Add 10 new capability demonstrations: - Quality envelope (query_with_envelope): ResponseQuality, safety-net activation, budget reporting - Audited queries (query_audited): auto-appended witness per search - Membership filter: tenant isolation via include-mode bitmap - DoS hardening: BudgetTokenBucket, NegativeCache, ProofOfWork - Adversarial detection: CV analysis, degenerate distribution check - Embed WASM: microkernel role, self-bootstrapping verification - Embed kernel: Linux image with cmdline and API port - Embed eBPF: socket filter program (2 instructions) - Embed dashboard: HTML registry bundle - AGI container: full manifest with model, orchestrator, tools, eval, policy, parsed back with ParsedAgiManifest All 24 steps pass, including persistence round-trip verifying that WASM, kernel, eBPF, and dashboard segments survive close/reopen. Update README with capability table, architecture notes for each new feature (quality envelope, audited queries, membership, DoS, adversarial, segment embedding, AGI container). https://claude.ai/code/session_015KgxqLUhevxop1jhiZY2Y4 --- .../rvf/examples/assets/openfang-README.md | 132 +++-- examples/rvf/examples/openfang.rs | 456 ++++++++++++++---- 2 files changed, 443 insertions(+), 145 deletions(-) diff --git a/examples/rvf/examples/assets/openfang-README.md b/examples/rvf/examples/assets/openfang-README.md index f606b09dd..a4263bbb9 100644 --- a/examples/rvf/examples/assets/openfang-README.md +++ b/examples/rvf/examples/assets/openfang-README.md @@ -1,6 +1,6 @@ -# OpenFang Agent OS — RVF Example +# OpenFang Agent OS — RVF Full Surface Demo -A deep RVF integration example that models the [OpenFang](https://github.com/RightNow-AI/openfang) Agent Operating System as a searchable component registry, exercising the full RVF capability surface. +Exercises **every major RVF capability** against a realistic agent-OS registry, using [OpenFang](https://github.com/RightNow-AI/openfang) as the domain model. A single ~35 KB RVF file holds 65 vector components plus embedded WASM, kernel, eBPF, and dashboard segments. ## Run @@ -9,11 +9,34 @@ cd examples/rvf cargo run --example openfang ``` -## What It Does +## 24 Capabilities Demonstrated -Creates a single RVF file (~34 KB) containing an entire agent OS registry, then exercises 14 distinct RVF capabilities against it. - -### Registry Contents +| # | Capability | RVF API | What It Shows | +|--:|-----------|---------|---------------| +| 1 | **Store creation** | `RvfStore::create` | 128-dim L2 store with file identity | +| 2-4 | **Batch ingestion** | `ingest_batch` | Multi-type metadata (String, U64) with per-category vector biasing | +| 5 | **Nearest-neighbor search** | `query` | Unfiltered + type-filtered task routing | +| 6 | **Quality envelope** | `query_with_envelope` | ResponseQuality, safety-net activation, budget reporting | +| 7 | **Audited query** | `query_audited` | Auto-appends witness entry per search (compliance) | +| 8 | **Security filter** | `FilterExpr::Ge` | Hands with security >= 80 | +| 9 | **Tier filter** | `FilterExpr::Eq` | Tier-4 autonomous agents only | +| 10 | **Category filter** | `FilterExpr::And` | Security tools by category | +| 11 | **Membership filter** | `MembershipFilter` | Tenant isolation — tools-only view via bitmap | +| 12 | **DoS hardening** | `BudgetTokenBucket`, `NegativeCache`, `ProofOfWork` | Rate limiting, degenerate query blacklisting, PoW challenge | +| 13 | **Adversarial detection** | `is_degenerate_distribution`, `centroid_distance_cv` | CV analysis to detect uniform (attack) distance distributions | +| 14 | **Embed WASM** | `embed_wasm` / `extract_wasm` | Microkernel role, self-bootstrapping check | +| 15 | **Embed kernel** | `embed_kernel` / `extract_kernel` | Linux image with cmdline, API port binding | +| 16 | **Embed eBPF** | `embed_ebpf` / `extract_ebpf` | Socket filter program (2 instructions) | +| 17 | **Embed dashboard** | `embed_dashboard` / `extract_dashboard` | HTML registry dashboard bundle | +| 18 | **Delete + compact** | `delete` + `compact` | Decommission 'twitter', reclaim 512 bytes | +| 19 | **Derive (lineage)** | `derive` | Snapshot child with parent provenance, depth 0→1 | +| 20 | **COW branch + freeze** | `freeze` + `branch` | Staging env with experimental 'sentinel' agent | +| 21 | **AGI container** | `AgiContainerBuilder` + `ParsedAgiManifest` | Full manifest: model, orchestrator, tools, eval, policy | +| 22 | **Segment directory** | `segment_dir` | Raw segment inventory (VEC, WASM, KERN, EBPF, DASH) | +| 23 | **Witness chain** | `create_witness_chain` + `verify` | 17-entry cryptographic audit trail | +| 24 | **Persistence** | `close` + `open_readonly` | Round-trip with file ID, WASM, kernel, eBPF, dashboard preservation | + +## Registry Contents | Component | Count | Description | |-----------|------:|-------------| @@ -22,23 +45,6 @@ Creates a single RVF file (~34 KB) containing an entire agent OS registry, then | **Channels** | 20 | Messaging adapters (Telegram, Discord, Slack, WhatsApp, etc.) | | **Total** | 65 | All searchable in one vector space | -### RVF Capabilities Exercised - -| # | Capability | RVF API | What It Shows | -|---|-----------|---------|---------------| -| 1 | **Store creation** | `RvfStore::create` | 128-dim L2 store with file identity | -| 2-4 | **Batch ingestion** | `ingest_batch` | Multi-type metadata (String, U64) with per-category vector biasing | -| 5 | **Nearest-neighbor search** | `query` | Unfiltered + type-filtered task routing | -| 6 | **Combined filters** | `FilterExpr::And` + `Ge` | Security threshold filtering (>= 80) | -| 7 | **Equality filter** | `FilterExpr::Eq` | Tier-4 autonomous agent isolation | -| 8 | **Category filter** | `FilterExpr::And` | Tool discovery by category | -| 9 | **Delete + compact** | `delete` + `compact` | Decommission 'twitter' hand, reclaim 512 bytes | -| 10 | **Derive (lineage)** | `derive` | Snapshot with parent-child provenance, depth tracking | -| 11 | **COW branching** | `freeze` + `branch` | Staging environment with experimental 'sentinel' agent | -| 12 | **Segment inspection** | `segment_dir` | Raw segment directory (VEC, MANIFEST, JOURNAL, etc.) | -| 13 | **Witness chain** | `create_witness_chain` + `verify` | 7-entry cryptographic audit trail | -| 14 | **Persistence** | `close` + `open_readonly` | Round-trip verification with file ID preservation | - ## Metadata Schema | Field ID | Constant | Name | Type | Applies To | @@ -73,29 +79,79 @@ Telegram, Discord, Slack, WhatsApp, Signal, Matrix, Email (SMTP/IMAP), Teams, Go ### Vector Biasing -Tools and channels use `category_bias()` — a hash-based offset applied to the first 16 dimensions — so items sharing a category cluster in vector space. Hands use tier-proportional bias (`tier * 0.1`) to create performance-tier clusters. +Tools and channels use `category_bias()` — a hash-based offset on the first 16 dimensions — so same-category items cluster in vector space. Hands use tier-proportional bias (`tier * 0.1`). + +### Quality Envelope (Step 6) + +`query_with_envelope` returns a `QualityEnvelope` containing: +- `ResponseQuality` — Verified, Approximate, Degraded, or Unreliable +- `SearchEvidenceSummary` — HNSW vs. safety-net candidate counts +- `BudgetReport` — time budget consumption in microseconds +- Optional `DegradationReport` if quality falls below threshold + +### Audited Queries (Step 7) + +`query_audited` works like `query` but auto-appends a `COMPUTATION` witness entry to the store's on-disk witness chain. Used for compliance-grade audit trails where every search must be recorded. + +### Membership Filter (Step 11) + +A dense bitmap that controls vector visibility: +- **Include mode**: only IDs in the bitmap are visible (tenant isolation) +- **Exclude mode**: IDs in the bitmap are hidden (access revocation) +- Serializes to compact bytes for network transfer between nodes + +### DoS Hardening (Step 12) + +Three-layer defense: +1. **BudgetTokenBucket** — rate-limits distance ops per time window +2. **NegativeCache** — blacklists query signatures that trigger degenerate search >N times +3. **ProofOfWork** — optional computational challenge (FNV-1a hash with leading-zero difficulty) + +### Adversarial Detection (Step 13) + +Detects attack vectors where all centroid distances are nearly uniform (CV < 0.05), indicating the query is designed to force exhaustive search. The `adaptive_n_probe` function widens search when degenerate distributions are detected. + +### Segment Embedding (Steps 14-17) + +Four segment types can be embedded into the RVF file: +- **WASM** — query engine microkernel or interpreter (enables self-bootstrapping) +- **Kernel** — Linux image with cmdline and API port binding +- **eBPF** — socket filter or XDP programs for kernel-level acceleration +- **Dashboard** — HTML/JS bundle for browser-based registry visualization + +All survive close/reopen and can be extracted with `extract_*` methods. + +### AGI Container (Step 21) + +`AgiContainerBuilder` packages the entire agent OS into a self-describing manifest: +- Model pinning (`claude-opus-4-6`) +- Orchestrator config (Claude Code + Claude Flow) +- Tool registry, agent prompts, eval suite, grading rules +- Policy, skill library, project instructions +- Segment inventory (kernel, WASM, vectors, witnesses) +- Offline capability flag + +`ParsedAgiManifest` provides zero-copy parsing and `is_autonomous_capable()` validation. -### Delete + Compact Lifecycle +### Delete + Compact Lifecycle (Step 18) -Step 9 demonstrates the full decommission workflow: -1. `delete(&[twitter_id])` — soft-delete, marks vector as tombstoned -2. `compact()` — rewrites the store, reclaiming dead space +1. `delete(&[id])` — soft-delete (tombstone), dead_ratio increases +2. `compact()` — rewrites store, reclaims dead space 3. Post-delete queries confirm the vector is gone -### COW Branching +### COW Branching (Step 20) -Step 11 shows a staging/production pattern: -1. `freeze()` — makes the parent read-only (immutable baseline) -2. `branch()` — creates a COW child inheriting all parent vectors -3. New vectors added to the child don't affect the parent +1. `freeze()` — immutable baseline +2. `branch()` — COW child inheriting all parent vectors +3. Writes to child allocate local clusters only 4. `cow_stats()` reports cluster-level copy-on-write telemetry -### Lineage Tracking +### Lineage (Step 19) -Step 10 derives a snapshot child and verifies: -- Child `parent_id` matches parent `file_id` -- Lineage depth increments (0 -> 1) -- Provenance chain is cryptographically verifiable +`derive()` creates a child with: +- New `file_id`, parent's `file_id` as `parent_id` +- `lineage_depth` incremented (0 → 1) +- Provenance chain cryptographically verifiable ## About OpenFang diff --git a/examples/rvf/examples/openfang.rs b/examples/rvf/examples/openfang.rs index ec5db1ce9..322e91a9a 100644 --- a/examples/rvf/examples/openfang.rs +++ b/examples/rvf/examples/openfang.rs @@ -1,30 +1,27 @@ -//! OpenFang Agent OS — RVF Knowledge Base +//! OpenFang Agent OS — RVF Knowledge Base (Full Surface) //! -//! A deep integration example that exercises the full RVF capability surface -//! using OpenFang's agent registry as the domain model. -//! -//! Capabilities demonstrated: -//! - Multi-type registry (Hands, Tools, Channels) in one vector space -//! - Rich metadata with typed fields and combined filter expressions -//! - Task routing via nearest-neighbor search -//! - Security and tier filtering -//! - Delete + compact lifecycle (decommission an agent, reclaim space) -//! - COW branching + freeze (staging branch for experimental agents) -//! - File identity and lineage tracking (parent/child provenance) -//! - Audited queries (witness entries for every search) -//! - Segment directory inspection -//! - Cryptographic witness chain with verification -//! - Persistence round-trip +//! Exercises **every major RVF capability** against a realistic agent-OS +//! registry: vector ingestion, filtered queries, quality envelopes, audited +//! queries, delete + compact, lineage derivation, COW branching, segment +//! embedding (WASM, kernel, eBPF, dashboard), membership filters, DoS +//! hardening, adversarial detection, AGI container packaging, file identity, +//! segment directory, witness chain, and persistence. //! //! Run with: //! cargo run --example openfang +use std::time::Duration; + use rvf_crypto::{create_witness_chain, shake256_256, verify_witness_chain, WitnessEntry}; +use rvf_runtime::adversarial::{centroid_distance_cv, is_degenerate_distribution}; use rvf_runtime::filter::FilterValue; use rvf_runtime::options::DistanceMetric; use rvf_runtime::{ - FilterExpr, MetadataEntry, MetadataValue, QueryOptions, RvfOptions, RvfStore, SearchResult, + AgiContainerBuilder, BudgetTokenBucket, FilterExpr, MembershipFilter, MetadataEntry, + MetadataValue, NegativeCache, ParsedAgiManifest, ProofOfWork, QueryOptions, QuerySignature, + RvfOptions, RvfStore, SearchResult, }; +use rvf_types::agi_container::ContainerSegments; use rvf_types::DerivationType; use tempfile::TempDir; @@ -35,12 +32,11 @@ use tempfile::TempDir; const DIM: usize = 128; const K: usize = 5; -// Metadata field IDs — shared across all component types. -const F_TYPE: u16 = 0; // "hand" | "tool" | "channel" +const F_TYPE: u16 = 0; const F_NAME: u16 = 1; -const F_DOMAIN: u16 = 2; // domain (hand), category (tool), protocol (channel) -const F_TIER: u16 = 3; // hand only: 1-4 -const F_SEC: u16 = 4; // hand only: 0-100 +const F_DOMAIN: u16 = 2; +const F_TIER: u16 = 3; +const F_SEC: u16 = 4; // --------------------------------------------------------------------------- // Data definitions @@ -186,7 +182,7 @@ fn witness(entries: &mut Vec, action: &str, ts_ns: u64, wtype: u8) } // --------------------------------------------------------------------------- -// Registry — tracks ID ranges for component lookup. +// Registry // --------------------------------------------------------------------------- struct Registry { @@ -233,7 +229,7 @@ fn print_results(results: &[SearchResult], reg: &Registry) { // --------------------------------------------------------------------------- fn main() { - println!("=== OpenFang Agent OS — RVF Knowledge Base ===\n"); + println!("=== OpenFang Agent OS — RVF Full Surface Demo ===\n"); let reg = Registry::new(); let tmp = TempDir::new().expect("tmpdir"); @@ -285,7 +281,7 @@ fn main() { witness(&mut wit, &format!("REGISTER_HANDS:{}", HANDS.len()), 1_709_000_000_000_000_000, 0x01); // ----------------------------------------------------------------------- - // 3. Register Tools (per-category bias) + // 3. Register Tools // ----------------------------------------------------------------------- println!("\n--- 3. Register Tools ({}) ---", TOOLS.len()); { @@ -358,83 +354,269 @@ fn main() { witness(&mut wit, "ROUTE_TASK:k=5", 1_709_000_010_000_000_000, 0x02); // ----------------------------------------------------------------------- - // 6. Security filter (>= 80) + // 6. Quality envelope query + // ----------------------------------------------------------------------- + println!("\n--- 6. Quality Envelope ---"); + let envelope = store.query_with_envelope(&query, K, &QueryOptions::default()) + .expect("envelope query"); + println!(" Quality: {:?}", envelope.quality); + println!(" HNSW candidates: {}", envelope.evidence.hnsw_candidate_count); + println!(" Safety-net candidates: {}", envelope.evidence.safety_net_candidate_count); + println!(" Budget total_us: {}", envelope.budgets.total_us); + println!(" Results: {} (top match: id={}, d={:.4})", + envelope.results.len(), + envelope.results.first().map_or(0, |r| r.id), + envelope.results.first().map_or(0.0, |r| r.distance)); + witness(&mut wit, "QUERY_ENVELOPE:k=5", 1_709_000_011_000_000_000, 0x02); + + // ----------------------------------------------------------------------- + // 7. Audited query (auto-appends witness) + // ----------------------------------------------------------------------- + println!("\n--- 7. Audited Query ---"); + let audited = store.query_audited(&query, K, &QueryOptions::default()) + .expect("audited query"); + println!(" Returned {} results (witness auto-appended to store)", audited.len()); + println!(" Store witness hash: {}", hex(&store.last_witness_hash()[..8])); + witness(&mut wit, "QUERY_AUDITED:k=5", 1_709_000_012_000_000_000, 0x02); + + // ----------------------------------------------------------------------- + // 8. Security + tier filters // ----------------------------------------------------------------------- - println!("\n--- 6. High-Security Hands (sec >= 80) ---"); - let sec_opts = QueryOptions { + println!("\n--- 8. Security Filter (sec >= 80) ---"); + let sec_res = store.query(&query, K, &QueryOptions { filter: Some(FilterExpr::And(vec![ FilterExpr::Eq(F_TYPE, FilterValue::String("hand".into())), FilterExpr::Ge(F_SEC, FilterValue::U64(80)), ])), ..Default::default() - }; - let sec_res = store.query(&query, K, &sec_opts).expect("sec query"); + }).expect("sec query"); print_results(&sec_res, ®); println!(" {} agents pass threshold", sec_res.len()); - // ----------------------------------------------------------------------- - // 7. Autonomous tier-4 agents - // ----------------------------------------------------------------------- - println!("\n--- 7. Tier-4 Autonomous Agents ---"); - let tier_opts = QueryOptions { + println!("\n--- 9. Tier-4 Autonomous Agents ---"); + let tier_res = store.query(&query, K, &QueryOptions { filter: Some(FilterExpr::And(vec![ FilterExpr::Eq(F_TYPE, FilterValue::String("hand".into())), FilterExpr::Eq(F_TIER, FilterValue::U64(4)), ])), ..Default::default() - }; - let tier_res = store.query(&query, K, &tier_opts).expect("tier query"); + }).expect("tier query"); print_results(&tier_res, ®); - // ----------------------------------------------------------------------- - // 8. Tool discovery by category - // ----------------------------------------------------------------------- - println!("\n--- 8. Security Tool Discovery ---"); - let tool_opts = QueryOptions { + println!("\n--- 10. Security Tool Discovery ---"); + let tool_res = store.query(&query, 10, &QueryOptions { filter: Some(FilterExpr::And(vec![ FilterExpr::Eq(F_TYPE, FilterValue::String("tool".into())), FilterExpr::Eq(F_DOMAIN, FilterValue::String("security".into())), ])), ..Default::default() - }; - let tool_res = store.query(&query, 10, &tool_opts).expect("tool query"); + }).expect("tool query"); print_results(&tool_res, ®); // ----------------------------------------------------------------------- - // 9. Delete + Compact — decommission the "twitter" hand + // 11. Membership filter — multi-tenant isolation + // ----------------------------------------------------------------------- + println!("\n--- 11. Membership Filter (tenant isolation) ---"); + { + let mut mf = MembershipFilter::new_include(reg.total()); + // Tenant A can only see tools (IDs 7..44) + for id in reg.tool_base..reg.tool_base + reg.tool_count { + mf.add(id); + } + println!(" Created include-mode filter: {} members of {} total", + mf.member_count(), reg.total()); + println!(" Mode: {:?}, generation: {}", mf.mode(), mf.generation_id()); + + // Verify containment + let hand_visible = mf.contains(0); // clip hand + let tool_visible = mf.contains(reg.tool_base); // http_fetch tool + println!(" Hand 'clip' visible: {} (expect false)", hand_visible); + println!(" Tool 'http_fetch' visible: {} (expect true)", tool_visible); + assert!(!hand_visible); + assert!(tool_visible); + + let serialized = mf.serialize(); + println!(" Serialized filter: {} bytes", serialized.len()); + } + witness(&mut wit, "MEMBERSHIP_FILTER:tenant", 1_709_000_015_000_000_000, 0x01); + + // ----------------------------------------------------------------------- + // 12. DoS hardening — token bucket + negative cache + proof-of-work + // ----------------------------------------------------------------------- + println!("\n--- 12. DoS Hardening ---"); + { + // Token bucket: 1000 ops per second + let mut bucket = BudgetTokenBucket::new(1000, Duration::from_secs(1)); + let cost = (K as u64) * (reg.total()); // k * N distance ops + match bucket.try_consume(cost) { + Ok(remaining) => println!(" Token bucket: consumed {} ops, {} remaining", cost, remaining), + Err(deficit) => println!(" Token bucket: REJECTED (need {} more tokens)", deficit), + } + + // Query signature + negative cache + let sig = QuerySignature::from_query(&query); + let mut neg_cache = NegativeCache::new(3, Duration::from_secs(60), 1000); + let bl1 = neg_cache.record_degenerate(sig); + let bl2 = neg_cache.record_degenerate(sig); + let bl3 = neg_cache.record_degenerate(sig); // 3rd hit -> blacklisted + println!(" Negative cache: hit1={} hit2={} hit3(blacklist)={}", bl1, bl2, bl3); + println!(" Signature blacklisted: {}", neg_cache.is_blacklisted(&sig)); + + // Proof-of-work + let pow = ProofOfWork { + challenge: [0x4F, 0x50, 0x45, 0x4E, 0x46, 0x41, 0x4E, 0x47, // "OPENFANG" + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00], + difficulty: 8, + }; + match pow.solve() { + Some(nonce) => { + let valid = pow.verify(nonce); + println!(" PoW (d=8): solved nonce={}, valid={}", nonce, valid); + } + None => println!(" PoW: no solution found within limit"), + } + } + witness(&mut wit, "DOS_HARDENING:bucket+cache+pow", 1_709_000_016_000_000_000, 0x01); + + // ----------------------------------------------------------------------- + // 13. Adversarial detection // ----------------------------------------------------------------------- - println!("\n--- 9. Delete + Compact (decommission 'twitter') ---"); + println!("\n--- 13. Adversarial Detection ---"); + { + // Natural distances — extract from query results + let natural: Vec = all.iter().map(|r| r.distance).collect(); + let cv_natural = centroid_distance_cv(&natural, K); + let degen_natural = is_degenerate_distribution(&natural, K); + println!(" Natural query distances: {:?}", natural.iter().map(|d| format!("{:.2}", d)).collect::>()); + println!(" CV={:.4}, degenerate={}", cv_natural, degen_natural); + + // Adversarial — uniform distances (attack vector) + let uniform = vec![5.0f32; 100]; + let cv_uniform = centroid_distance_cv(&uniform, K); + let degen_uniform = is_degenerate_distribution(&uniform, K); + println!(" Uniform distances (simulated attack):"); + println!(" CV={:.4}, degenerate={}", cv_uniform, degen_uniform); + assert!(degen_uniform, "uniform should be degenerate"); + } + witness(&mut wit, "ADVERSARIAL:detect", 1_709_000_017_000_000_000, 0x02); + + // ----------------------------------------------------------------------- + // 14. Embed WASM module (query engine microkernel) + // ----------------------------------------------------------------------- + println!("\n--- 14. Embed WASM Module ---"); + let fake_wasm = b"\x00asm\x01\x00\x00\x00"; // minimal WASM header + let wasm_seg = store.embed_wasm( + 0x02, // role: Microkernel + 0x01, // target: wasm32 + 0x0000, // no required features + fake_wasm, + 1, // export_count + 1, // bootstrap_priority + 0, // interpreter_type + ).expect("embed wasm"); + println!(" Embedded WASM microkernel: seg_id={}, {} bytes", wasm_seg, fake_wasm.len()); + println!(" Self-bootstrapping: {}", store.is_self_bootstrapping()); + + // Extract and verify round-trip + let (hdr, bytecode) = store.extract_wasm().expect("extract wasm").expect("wasm present"); + println!(" Extracted: header={} bytes, bytecode={} bytes", hdr.len(), bytecode.len()); + assert_eq!(&bytecode, fake_wasm); + witness(&mut wit, "EMBED_WASM:microkernel", 1_709_000_018_000_000_000, 0x01); + + // ----------------------------------------------------------------------- + // 15. Embed kernel image + // ----------------------------------------------------------------------- + println!("\n--- 15. Embed Kernel ---"); + let fake_kernel = b"bzImage-openfang-v1.0-minimal"; + let kern_seg = store.embed_kernel( + 0x01, // arch: x86_64 + 0x01, // kernel_type: linux + 0x0000, // flags + fake_kernel, + 8080, // api_port + Some("console=ttyS0 root=/dev/vda rw"), + ).expect("embed kernel"); + println!(" Embedded kernel: seg_id={}, {} bytes, port=8080", kern_seg, fake_kernel.len()); + + let (khdr, kpayload) = store.extract_kernel().expect("extract kernel").expect("kernel present"); + println!(" Extracted: header={} bytes, payload={} bytes", khdr.len(), kpayload.len()); + // Payload contains cmdline + image; verify image bytes are present + assert!(kpayload.windows(fake_kernel.len()).any(|w| w == fake_kernel), + "kernel image not found in extracted payload"); + witness(&mut wit, "EMBED_KERNEL:linux", 1_709_000_019_000_000_000, 0x01); + + // ----------------------------------------------------------------------- + // 16. Embed eBPF program + // ----------------------------------------------------------------------- + println!("\n--- 16. Embed eBPF ---"); + // 8 bytes = 1 eBPF instruction (mov64 r0, 0; exit) + let fake_ebpf = &[0xb7, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00u8, + 0x95, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00]; + let ebpf_seg = store.embed_ebpf( + 0x01, // program_type: socket_filter + 0x01, // attach_type: ingress + DIM as u16, // max_dimension + fake_ebpf, + None, // no BTF + ).expect("embed ebpf"); + println!(" Embedded eBPF: seg_id={}, {} bytes (2 insns)", ebpf_seg, fake_ebpf.len()); + + let (ehdr, eprog) = store.extract_ebpf().expect("extract ebpf").expect("ebpf present"); + println!(" Extracted: header={} bytes, program={} bytes", ehdr.len(), eprog.len()); + assert_eq!(&eprog, fake_ebpf); + witness(&mut wit, "EMBED_EBPF:filter", 1_709_000_019_500_000_000, 0x01); + + // ----------------------------------------------------------------------- + // 17. Embed dashboard + // ----------------------------------------------------------------------- + println!("\n--- 17. Embed Dashboard ---"); + let dashboard_html = br#" +OpenFang Registry +

OpenFang Agent Registry Dashboard

+

7 Hands | 38 Tools | 20 Channels

"#; + let dash_seg = store.embed_dashboard( + 0x01, // ui_framework: vanilla HTML + dashboard_html, + "index.html", + ).expect("embed dashboard"); + println!(" Embedded dashboard: seg_id={}, {} bytes", dash_seg, dashboard_html.len()); + + let (dhdr, dbundle) = store.extract_dashboard().expect("extract dash").expect("dash present"); + println!(" Extracted: header={} bytes, bundle={} bytes", dhdr.len(), dbundle.len()); + assert_eq!(&dbundle, dashboard_html); + witness(&mut wit, "EMBED_DASHBOARD:html", 1_709_000_019_800_000_000, 0x01); + + // ----------------------------------------------------------------------- + // 18. Delete + Compact + // ----------------------------------------------------------------------- + println!("\n--- 18. Delete + Compact (decommission 'twitter') ---"); let twitter_id = HANDS.iter().position(|h| h.name == "twitter").unwrap() as u64 + reg.hand_base; let st_before = store.status(); - println!(" Before: {} vectors, {} bytes, dead_ratio={:.2}", + println!(" Before: {} vectors, {} bytes, dead={:.2}", st_before.total_vectors, st_before.file_size, st_before.dead_space_ratio); - let del = store.delete(&[twitter_id]).expect("delete twitter"); + let del = store.delete(&[twitter_id]).expect("delete"); println!(" Deleted {} vector(s) (epoch {})", del.deleted, del.epoch); - let st_mid = store.status(); - println!(" After delete: {} vectors, dead_ratio={:.2}", st_mid.total_vectors, st_mid.dead_space_ratio); - let comp = store.compact().expect("compact"); println!(" Compacted: {} segments, {} bytes reclaimed (epoch {})", comp.segments_compacted, comp.bytes_reclaimed, comp.epoch); let st_after = store.status(); - println!(" After compact: {} vectors, {} bytes, dead_ratio={:.2}", + println!(" After: {} vectors, {} bytes, dead={:.2}", st_after.total_vectors, st_after.file_size, st_after.dead_space_ratio); - // Verify twitter is gone from hand queries - let post_del = store.query(&query, K, &hands_only).expect("post-delete query"); + let post_del = store.query(&query, K, &hands_only).expect("post-delete"); for r in &post_del { assert_ne!(r.id, twitter_id, "twitter should be deleted"); } - println!(" Verified: 'twitter' no longer appears in results"); + println!(" Verified: 'twitter' absent from results"); witness(&mut wit, "DELETE+COMPACT:twitter", 1_709_000_020_000_000_000, 0x01); // ----------------------------------------------------------------------- - // 10. Derive — create a snapshot with lineage tracking + // 19. Derive (lineage snapshot) // ----------------------------------------------------------------------- - println!("\n--- 10. Derive (Snapshot with Lineage) ---"); + println!("\n--- 19. Derive (Lineage Snapshot) ---"); let parent_fid = hex(&store.file_id()[..8]); let parent_depth = store.lineage_depth(); @@ -443,30 +625,26 @@ fn main() { let child_parent = hex(&child.parent_id()[..8]); let child_depth = child.lineage_depth(); - println!(" Parent: file_id={} depth={}", parent_fid, parent_depth); - println!(" Child: file_id={} depth={}", child_fid, child_depth); - println!(" Child parent_id={} (matches parent: {})", child_parent, child_parent == parent_fid); - assert_eq!(child_depth, parent_depth + 1, "depth should increment"); - - let child_st = child.status(); - println!(" Child vectors: {}, segments: {}", child_st.total_vectors, child_st.total_segments); + println!(" Parent: fid={} depth={}", parent_fid, parent_depth); + println!(" Child: fid={} depth={}", child_fid, child_depth); + println!(" Lineage: parent_id matches = {}", child_parent == parent_fid); + assert_eq!(child_depth, parent_depth + 1); child.close().expect("close child"); witness(&mut wit, "DERIVE:snapshot", 1_709_000_030_000_000_000, 0x01); // ----------------------------------------------------------------------- - // 11. COW Branch — staging environment for experimental agents + // 20. COW Branch + freeze // ----------------------------------------------------------------------- - println!("\n--- 11. COW Branch (Staging Environment) ---"); - store.freeze().expect("freeze parent"); - println!(" Parent frozen (read-only)"); + println!("\n--- 20. COW Branch (Staging) ---"); + store.freeze().expect("freeze"); + println!(" Parent frozen"); let mut staging = store.branch(&branch_path).expect("branch"); - println!(" Branch created: is_cow_child={}", staging.is_cow_child()); + println!(" Branch: cow_child={}", staging.is_cow_child()); if let Some(stats) = staging.cow_stats() { - println!(" COW stats: {} clusters, {} local", stats.cluster_count, stats.local_cluster_count); + println!(" COW: {} clusters, {} local", stats.cluster_count, stats.local_cluster_count); } - // Add experimental agent to staging only let exp_id = reg.total(); let exp_vec = biased_vector(9999, 0.5); let mut exp_meta = Vec::with_capacity(5); @@ -477,27 +655,77 @@ fn main() { push_meta(&mut exp_meta, F_SEC, MetadataValue::U64(99)); let exp_r = staging.ingest_batch(&[exp_vec.as_slice()], &[exp_id], Some(&exp_meta)) - .expect("ingest experimental"); - println!(" Added experimental 'sentinel' to staging (epoch {})", exp_r.epoch); - - let staging_st = staging.status(); - println!(" Staging: {} vectors (parent had {})", staging_st.total_vectors, st_after.total_vectors); + .expect("ingest sentinel"); + println!(" Added 'sentinel' to staging (epoch {})", exp_r.epoch); + println!(" Staging: {} vectors (parent: {})", staging.status().total_vectors, st_after.total_vectors); if let Some(stats) = staging.cow_stats() { - println!(" COW stats after write: {} clusters, {} local", stats.cluster_count, stats.local_cluster_count); + println!(" COW after write: {} clusters, {} local", stats.cluster_count, stats.local_cluster_count); } - staging.close().expect("close staging"); - witness(&mut wit, "COW_BRANCH:staging+sentinel", 1_709_000_040_000_000_000, 0x01); + witness(&mut wit, "COW_BRANCH:sentinel", 1_709_000_040_000_000_000, 0x01); // ----------------------------------------------------------------------- - // 12. Segment directory inspection + // 21. AGI Container manifest // ----------------------------------------------------------------------- - println!("\n--- 12. Segment Directory ---"); + println!("\n--- 21. AGI Container ---"); + { + let orchestrator = br#"{"claude_code":{"model":"claude-opus-4-6"},"claude_flow":{"topology":"hierarchical","max_agents":15}}"#; + let tool_reg = br#"[{"name":"rvf_query","type":"vector_search"},{"name":"rvf_route","type":"task_routing"}]"#; + let eval_tasks = br#"[{"id":1,"task":"route 1000 tasks under 10ms"}]"#; + let eval_graders = br#"[{"type":"latency_p99","threshold_ms":10}]"#; + + let segs = ContainerSegments { + kernel_present: true, + kernel_size: fake_kernel.len() as u64, + wasm_count: 1, + wasm_total_size: fake_wasm.len() as u64, + vec_segment_count: 1, + witness_count: wit.len() as u32, + orchestrator_present: true, + world_model_present: true, + ..Default::default() + }; + + let builder = AgiContainerBuilder::new([0xFA; 16], [0x01; 16]) + .with_model_id("claude-opus-4-6") + .with_policy(b"autonomous-level-4", [0xBB; 8]) + .with_orchestrator(orchestrator) + .with_tool_registry(tool_reg) + .with_agent_prompts(b"You are an OpenFang routing agent.") + .with_eval_tasks(eval_tasks) + .with_eval_graders(eval_graders) + .with_skill_library(b"[]") + .with_project_instructions(b"# OpenFang CLAUDE.md\nRoute tasks to Hands.") + .with_domain_profile(b"agent-os-registry-v1") + .offline_capable() + .with_segments(segs); + + let (payload, header) = builder.build().expect("build container"); + println!(" Container: {} bytes", payload.len()); + println!(" Magic valid: {}", header.is_valid_magic()); + println!(" Flags: kernel={} orchestrator={} eval={} offline={} tools={}", + header.has_kernel(), header.has_orchestrator(), + header.flags & 0x10 != 0, header.is_offline_capable(), + header.flags & 0x400 != 0); + + let parsed = ParsedAgiManifest::parse(&payload).expect("parse manifest"); + println!(" Model: {:?}", parsed.model_id_str()); + println!(" Autonomous capable: {}", parsed.is_autonomous_capable()); + println!(" Orchestrator: {} bytes", parsed.orchestrator_config.map_or(0, |c| c.len())); + println!(" Tool registry: {} bytes", parsed.tool_registry.map_or(0, |c| c.len())); + println!(" Project instructions: {} bytes", parsed.project_instructions.map_or(0, |c| c.len())); + } + witness(&mut wit, "AGI_CONTAINER:build+parse", 1_709_000_050_000_000_000, 0x01); + + // ----------------------------------------------------------------------- + // 22. Segment directory + // ----------------------------------------------------------------------- + println!("\n--- 22. Segment Directory ---"); let seg_dir: Vec<_> = store.segment_dir().to_vec(); - println!(" {} segments in parent store:", seg_dir.len()); - println!(" {:>12} {:>8} {:>8} {:>6}", "SegID", "Offset", "Length", "Type"); - println!(" {:->12} {:->8} {:->8} {:->6}", "", "", "", ""); + println!(" {} segments:", seg_dir.len()); + println!(" {:>6} {:>8} {:>8} {:>6}", "SegID", "Offset", "Length", "Type"); + println!(" {:->6} {:->8} {:->8} {:->6}", "", "", "", ""); for &(seg_id, offset, length, seg_type) in &seg_dir { let tname = match seg_type { 0x01 => "VEC", @@ -506,27 +734,33 @@ fn main() { 0x04 => "WITN", 0x05 => "KERN", 0x06 => "EBPF", + 0x0F => "EBPF2", + 0x10 => "WASM", + 0x11 => "DASH", _ => "????", }; - println!(" {:>12} {:>8} {:>8} {:>6}", seg_id, offset, length, tname); + println!(" {:>6} {:>8} {:>8} {:>6}", seg_id, offset, length, tname); } // ----------------------------------------------------------------------- - // 13. Witness chain + // 23. Witness chain // ----------------------------------------------------------------------- - println!("\n--- 13. Witness Chain ---"); + println!("\n--- 23. Witness Chain ---"); let chain = create_witness_chain(&wit); println!(" {} entries, {} bytes", wit.len(), chain.len()); - println!(" Last witness hash: {}", hex(&store.last_witness_hash()[..8])); + println!(" Store witness hash: {}", hex(&store.last_witness_hash()[..8])); match verify_witness_chain(&chain) { Ok(verified) => { println!(" Integrity: VALID\n"); let labels = [ "REGISTER_HANDS", "REGISTER_TOOLS", "REGISTER_CHANNELS", - "ROUTE_TASK", "DELETE+COMPACT", "DERIVE", "COW_BRANCH", + "ROUTE_TASK", "QUERY_ENVELOPE", "QUERY_AUDITED", + "MEMBERSHIP", "DOS_HARDENING", "ADVERSARIAL", + "EMBED_WASM", "EMBED_KERNEL", "EMBED_EBPF", "EMBED_DASH", + "DELETE+COMPACT", "DERIVE", "COW_BRANCH", "AGI_CONTAINER", ]; - println!(" {:>2} {:>4} {:>22} {}", "#", "Type", "Timestamp", "Action"); + println!(" {:>2} {:>4} {:>22} {}", "#", "Kind", "Timestamp", "Action"); println!(" {:->2} {:->4} {:->22} {:->20}", "", "", "", ""); for (i, e) in verified.iter().enumerate() { let t = if e.witness_type == 0x01 { "PROV" } else { "COMP" }; @@ -538,20 +772,26 @@ fn main() { } // ----------------------------------------------------------------------- - // 14. Persistence round-trip + // 24. Persistence // ----------------------------------------------------------------------- - println!("\n--- 14. Persistence ---"); + println!("\n--- 24. Persistence ---"); let final_st = store.status(); - println!(" Before close: {} vectors, {} bytes", final_st.total_vectors, final_st.file_size); - - // Parent is frozen/read-only, so we just drop it + println!(" Before: {} vectors, {} segments, {} bytes", + final_st.total_vectors, final_st.total_segments, final_st.file_size); drop(store); let reopened = RvfStore::open_readonly(&store_path).expect("reopen"); let reopen_st = reopened.status(); - println!(" After reopen: {} vectors, epoch {}", reopen_st.total_vectors, reopen_st.current_epoch); + println!(" After: {} vectors, epoch {}", reopen_st.total_vectors, reopen_st.current_epoch); println!(" File ID preserved: {}", hex(&reopened.file_id()[..8]) == parent_fid); + // Verify WASM survives persistence + let wasm_ok = reopened.extract_wasm().expect("re-extract wasm").is_some(); + let kern_ok = reopened.extract_kernel().expect("re-extract kernel").is_some(); + let ebpf_ok = reopened.extract_ebpf().expect("re-extract ebpf").is_some(); + let dash_ok = reopened.extract_dashboard().expect("re-extract dash").is_some(); + println!(" WASM={} Kernel={} eBPF={} Dashboard={}", wasm_ok, kern_ok, ebpf_ok, dash_ok); + let recheck = reopened.query(&query, K, &QueryOptions::default()).expect("recheck"); assert_eq!(all.len(), recheck.len(), "count mismatch"); for (a, b) in all.iter().zip(recheck.iter()) { @@ -564,16 +804,18 @@ fn main() { // Summary // ----------------------------------------------------------------------- println!("\n=== Summary ===\n"); - println!(" Registry: {} hands + {} tools + {} channels = {} components", + println!(" Registry: {} hands + {} tools + {} channels = {}", HANDS.len(), TOOLS.len(), CHANNELS.len(), reg.total()); - println!(" Deleted: twitter (+ compacted)"); - println!(" Derived: snapshot at depth {}", child_depth); - println!(" Branched: COW staging with experimental 'sentinel'"); - println!(" Segments: {} in parent", seg_dir.len()); - println!(" Witness: {} entries", wit.len()); - println!(" File size: {} bytes", final_st.file_size); - println!(" Filters: security, tier, category — all passing"); - println!(" Persist: verified"); - - println!("\nDone."); + println!(" Queries: basic, filtered, envelope, audited"); + println!(" Filters: security (>=80), tier (==4), category, membership"); + println!(" Segments: VEC + WASM + KERN + EBPF + DASH = {} total", seg_dir.len()); + println!(" Lifecycle: delete + compact (twitter removed)"); + println!(" Lineage: derive depth {}, COW branch with sentinel", child_depth); + println!(" Security: DoS bucket + negative cache + PoW + adversarial detect"); + println!(" AGI: container manifest (autonomous capable)"); + println!(" Witness: {} entries, chain verified", wit.len()); + println!(" File: {} bytes, {} segments, persistence verified", + final_st.file_size, final_st.total_segments); + + println!("\nDone — {} RVF capabilities exercised.", 24); } From 545d099d662db6cf0aef42f2b9578c0ae97813b9 Mon Sep 17 00:00:00 2001 From: rUv Date: Thu, 26 Feb 2026 16:00:40 +0000 Subject: [PATCH 05/10] fix: address review issues in openfang RVF example - Fix fragile persistence assertion: compare against post-delete baseline instead of pre-delete `all` which could include the deleted twitter vector - Extract segment type magic numbers into named constants (SEG_VEC, etc.) - Add comments for raw AGI container flag bitmasks (bits 4 and 10) - Add seed non-overlap comment for vector generation - Improve hex() to use pre-allocated String with fmt::Write Co-Authored-By: claude-flow --- examples/rvf/examples/openfang.rs | 54 +++++++++++++++++++++---------- 1 file changed, 37 insertions(+), 17 deletions(-) diff --git a/examples/rvf/examples/openfang.rs b/examples/rvf/examples/openfang.rs index 322e91a9a..ac9fc5158 100644 --- a/examples/rvf/examples/openfang.rs +++ b/examples/rvf/examples/openfang.rs @@ -38,6 +38,17 @@ const F_DOMAIN: u16 = 2; const F_TIER: u16 = 3; const F_SEC: u16 = 4; +// Segment type constants (from RVF file format) +const SEG_VEC: u8 = 0x01; +const SEG_MFST: u8 = 0x02; +const SEG_JRNL: u8 = 0x03; +const SEG_WITN: u8 = 0x04; +const SEG_KERN: u8 = 0x05; +const SEG_EBPF: u8 = 0x06; +const SEG_EBPF2: u8 = 0x0F; +const SEG_WASM: u8 = 0x10; +const SEG_DASH: u8 = 0x11; + // --------------------------------------------------------------------------- // Data definitions // --------------------------------------------------------------------------- @@ -169,7 +180,12 @@ fn sv(s: &str) -> MetadataValue { } fn hex(bytes: &[u8]) -> String { - bytes.iter().map(|b| format!("{:02x}", b)).collect::>().join("") + use std::fmt::Write; + let mut s = String::with_capacity(bytes.len() * 2); + for b in bytes { + let _ = write!(s, "{:02x}", b); + } + s } fn witness(entries: &mut Vec, action: &str, ts_ns: u64, wtype: u8) { @@ -259,6 +275,7 @@ fn main() { // ----------------------------------------------------------------------- println!("\n--- 2. Register Hands ({}) ---", HANDS.len()); { + // Seeds: hands [100..218], tools [500..1647], channels [1000..1817] — non-overlapping let vecs: Vec> = HANDS.iter().enumerate() .map(|(i, h)| biased_vector(i as u64 * 17 + 100, h.tier as f32 * 0.1)) .collect(); @@ -706,8 +723,9 @@ fn main() { println!(" Magic valid: {}", header.is_valid_magic()); println!(" Flags: kernel={} orchestrator={} eval={} offline={} tools={}", header.has_kernel(), header.has_orchestrator(), - header.flags & 0x10 != 0, header.is_offline_capable(), - header.flags & 0x400 != 0); + header.flags & 0x10 != 0, // bit 4: eval suite present + header.is_offline_capable(), + header.flags & 0x400 != 0); // bit 10: tool registry present let parsed = ParsedAgiManifest::parse(&payload).expect("parse manifest"); println!(" Model: {:?}", parsed.model_id_str()); @@ -728,15 +746,15 @@ fn main() { println!(" {:->6} {:->8} {:->8} {:->6}", "", "", "", ""); for &(seg_id, offset, length, seg_type) in &seg_dir { let tname = match seg_type { - 0x01 => "VEC", - 0x02 => "MFST", - 0x03 => "JRNL", - 0x04 => "WITN", - 0x05 => "KERN", - 0x06 => "EBPF", - 0x0F => "EBPF2", - 0x10 => "WASM", - 0x11 => "DASH", + SEG_VEC => "VEC", + SEG_MFST => "MFST", + SEG_JRNL => "JRNL", + SEG_WITN => "WITN", + SEG_KERN => "KERN", + SEG_EBPF => "EBPF", + SEG_EBPF2 => "EBPF2", + SEG_WASM => "WASM", + SEG_DASH => "DASH", _ => "????", }; println!(" {:>6} {:>8} {:>8} {:>6}", seg_id, offset, length, tname); @@ -775,6 +793,8 @@ fn main() { // 24. Persistence // ----------------------------------------------------------------------- println!("\n--- 24. Persistence ---"); + // Capture baseline *after* delete+compact so the comparison is stable + let pre_close = store.query(&query, K, &QueryOptions::default()).expect("pre-close query"); let final_st = store.status(); println!(" Before: {} vectors, {} segments, {} bytes", final_st.total_vectors, final_st.total_segments, final_st.file_size); @@ -785,7 +805,7 @@ fn main() { println!(" After: {} vectors, epoch {}", reopen_st.total_vectors, reopen_st.current_epoch); println!(" File ID preserved: {}", hex(&reopened.file_id()[..8]) == parent_fid); - // Verify WASM survives persistence + // Verify segments survive persistence let wasm_ok = reopened.extract_wasm().expect("re-extract wasm").is_some(); let kern_ok = reopened.extract_kernel().expect("re-extract kernel").is_some(); let ebpf_ok = reopened.extract_ebpf().expect("re-extract ebpf").is_some(); @@ -793,10 +813,10 @@ fn main() { println!(" WASM={} Kernel={} eBPF={} Dashboard={}", wasm_ok, kern_ok, ebpf_ok, dash_ok); let recheck = reopened.query(&query, K, &QueryOptions::default()).expect("recheck"); - assert_eq!(all.len(), recheck.len(), "count mismatch"); - for (a, b) in all.iter().zip(recheck.iter()) { - assert_eq!(a.id, b.id, "id mismatch"); - assert!((a.distance - b.distance).abs() < 1e-6, "distance mismatch"); + assert_eq!(pre_close.len(), recheck.len(), "count mismatch after reopen"); + for (a, b) in pre_close.iter().zip(recheck.iter()) { + assert_eq!(a.id, b.id, "id mismatch after reopen"); + assert!((a.distance - b.distance).abs() < 1e-6, "distance mismatch after reopen"); } println!(" Persistence verified."); From 69cf4c5304992193426f95fad3ee94b40b0cd2a4 Mon Sep 17 00:00:00 2001 From: rUv Date: Thu, 26 Feb 2026 16:05:29 +0000 Subject: [PATCH 06/10] docs: update guides to match current API surface and versions - GETTING_STARTED.md: rewrite to cover both ruvector-core (VectorDB) and rvf-runtime (RvfStore) APIs, add package registry table, fix SearchQuery fields (ef_search not include_vectors), results use .score not .distance - INSTALLATION.md: update crate version 0.1.0 -> 2.0, fix npm scoped package names (@ruvector/*), remove non-existent Docker image, update Rust version requirement to 1.80+, fix CLI docs to match actual subcommands - BASIC_TUTORIAL.md: fix SearchQuery.include_vectors -> ef_search, fix result.distance -> result.score, fix HnswConfig/QuantizationConfig field access patterns (options.hnsw -> options.hnsw_config, wrap in Some()) - ADVANCED_FEATURES.md: same field name fixes, fix QuantizationConfig wrapping in Some(), remove references to non-existent mmap_vectors field - docs/README.md: update version to 2.0.4/0.1.100, update date Co-Authored-By: claude-flow --- docs/README.md | 2 +- docs/guides/ADVANCED_FEATURES.md | 27 +++--- docs/guides/BASIC_TUTORIAL.md | 37 ++++--- docs/guides/GETTING_STARTED.md | 160 +++++++++++++++++++++++-------- docs/guides/INSTALLATION.md | 123 ++++++++++-------------- 5 files changed, 198 insertions(+), 151 deletions(-) diff --git a/docs/README.md b/docs/README.md index 033490334..e92f851d7 100644 --- a/docs/README.md +++ b/docs/README.md @@ -140,4 +140,4 @@ docs/ --- -**Last Updated**: 2026-01-21 | **Version**: 0.1.29 | **Status**: Production Ready +**Last Updated**: 2026-02-26 | **Version**: 2.0.4 (core) / 0.1.100 (npm) | **Status**: Production Ready diff --git a/docs/guides/ADVANCED_FEATURES.md b/docs/guides/ADVANCED_FEATURES.md index 689d98393..58f80bc77 100644 --- a/docs/guides/ADVANCED_FEATURES.md +++ b/docs/guides/ADVANCED_FEATURES.md @@ -224,10 +224,10 @@ use ruvector_core::{EnhancedPQ, PQConfig}; fn product_quantization_example() -> Result<(), Box> { let mut options = DbOptions::default(); options.dimensions = 128; - options.quantization = QuantizationConfig::Product { + options.quantization = Some(QuantizationConfig::Product { subspaces: 16, // Split into 16 subvectors of 8D each k: 256, // 256 centroids per subspace - }; + }); let db = VectorDB::new(options)?; @@ -360,7 +360,7 @@ cargo build --release -vv | grep target-cpu ```rust let mut options = DbOptions::default(); -options.mmap_vectors = true; // Enable memory mapping +// options.mmap_vectors = true; // Enable memory mapping (if supported by storage backend) let db = VectorDB::new(options)?; ``` @@ -408,26 +408,26 @@ let results: Vec> = queries ```rust // For speed (lower recall) -options.hnsw.ef_search = 50; +options.hnsw_config.as_mut().unwrap().ef_search = 50; // For accuracy (slower) -options.hnsw.ef_search = 500; +options.hnsw_config.as_mut().unwrap().ef_search = 500; // Balanced (recommended) -options.hnsw.ef_search = 100; +options.hnsw_config.as_mut().unwrap().ef_search = 100; ``` ### 6. Quantization ```rust // 4x compression, 97-99% recall -options.quantization = QuantizationConfig::Scalar; +options.quantization = Some(QuantizationConfig::Scalar); // 16x compression, 90-95% recall -options.quantization = QuantizationConfig::Product { +options.quantization = Some(QuantizationConfig::Product { subspaces: 16, k: 256, -}; +}); ``` ### 7. Distance Metric Selection @@ -463,18 +463,17 @@ fn advanced_demo() -> Result<(), Box> { let mut options = DbOptions::default(); options.dimensions = 384; options.storage_path = "./advanced_db.db".to_string(); - options.hnsw = HnswConfig { + options.hnsw_config = Some(HnswConfig { m: 64, ef_construction: 400, ef_search: 200, max_elements: 10_000_000, - }; + }); options.distance_metric = DistanceMetric::Cosine; - options.quantization = QuantizationConfig::Product { + options.quantization = Some(QuantizationConfig::Product { subspaces: 16, k: 256, - }; - options.mmap_vectors = true; + }); let db = VectorDB::new(options)?; diff --git a/docs/guides/BASIC_TUTORIAL.md b/docs/guides/BASIC_TUTORIAL.md index 536a5f8b8..4e9032152 100644 --- a/docs/guides/BASIC_TUTORIAL.md +++ b/docs/guides/BASIC_TUTORIAL.md @@ -112,7 +112,7 @@ fn search_examples(db: &VectorDB) -> Result<(), Box> { vector: vec![0.15; 128], k: 10, // Return top 10 results filter: None, - include_vectors: false, + ef_search: None, }; let results = db.search(&query)?; @@ -122,7 +122,7 @@ fn search_examples(db: &VectorDB) -> Result<(), Box> { "{}. ID: {}, Distance: {:.4}", i + 1, result.id, - result.distance + result.score ); } @@ -139,7 +139,7 @@ const results = await db.search({ }); results.forEach((result, i) => { - console.log(`${i + 1}. ID: ${result.id}, Distance: ${result.distance.toFixed(4)}`); + console.log(`${i + 1}. ID: ${result.id}, Distance: ${result.score.toFixed(4)}`); }); ``` @@ -270,7 +270,8 @@ Tune HNSW parameters for your use case. ### Rust ```rust -use ruvector_core::{HnswConfig, DistanceMetric}; +use ruvector_core::types::{HnswConfig, DbOptions}; +use ruvector_core::DistanceMetric; fn create_tuned_db() -> Result> { let mut options = DbOptions::default(); @@ -278,12 +279,12 @@ fn create_tuned_db() -> Result> { options.storage_path = "./tuned_db.db".to_string(); // HNSW configuration - options.hnsw = HnswConfig { + options.hnsw_config = Some(HnswConfig { m: 32, // Connections per node (16-64) ef_construction: 200, // Build quality (100-400) ef_search: 100, // Search quality (50-500) max_elements: 10_000_000, // Maximum vectors - }; + }); // Distance metric options.distance_metric = DistanceMetric::Cosine; @@ -328,7 +329,7 @@ Reduce memory usage with quantization. ### Rust ```rust -use ruvector_core::QuantizationConfig; +use ruvector_core::types::{QuantizationConfig, DbOptions}; fn create_quantized_db() -> Result> { let mut options = DbOptions::default(); @@ -336,13 +337,13 @@ fn create_quantized_db() -> Result> { options.storage_path = "./quantized_db.db".to_string(); // Scalar quantization (4x compression) - options.quantization = QuantizationConfig::Scalar; + options.quantization = Some(QuantizationConfig::Scalar); // Product quantization (8-16x compression) - // options.quantization = QuantizationConfig::Product { + // options.quantization = Some(QuantizationConfig::Product { // subspaces: 16, // k: 256, - // }; + // }); let db = VectorDB::new(options)?; println!("Created database with scalar quantization"); @@ -414,10 +415,8 @@ ruvector import --db ./new_db.db --input ./backup.json Here's a complete program combining everything: ```rust -use ruvector_core::{ - VectorDB, VectorEntry, SearchQuery, DbOptions, HnswConfig, - DistanceMetric, QuantizationConfig, -}; +use ruvector_core::{VectorDB, VectorEntry, SearchQuery, DistanceMetric}; +use ruvector_core::types::{DbOptions, HnswConfig, QuantizationConfig}; use rand::Rng; use serde_json::json; use std::collections::HashMap; @@ -427,14 +426,14 @@ fn main() -> Result<(), Box> { let mut options = DbOptions::default(); options.dimensions = 128; options.storage_path = "./tutorial_db.db".to_string(); - options.hnsw = HnswConfig { + options.hnsw_config = Some(HnswConfig { m: 32, ef_construction: 200, ef_search: 100, max_elements: 1_000_000, - }; + }); options.distance_metric = DistanceMetric::Cosine; - options.quantization = QuantizationConfig::Scalar; + options.quantization = Some(QuantizationConfig::Scalar); let db = VectorDB::new(options)?; println!("✓ Created database"); @@ -469,7 +468,7 @@ fn main() -> Result<(), Box> { vector: query_vector, k: 10, filter: None, - include_vectors: false, + ef_search: None, }; let start = std::time::Instant::now(); @@ -479,7 +478,7 @@ fn main() -> Result<(), Box> { println!("✓ Search completed in {:?}", search_time); println!("\nTop 10 Results:"); for (i, result) in results.iter().enumerate() { - println!(" {}. ID: {}, Distance: {:.4}", i + 1, result.id, result.distance); + println!(" {}. ID: {}, Distance: {:.4}", i + 1, result.id, result.score); } Ok(()) diff --git a/docs/guides/GETTING_STARTED.md b/docs/guides/GETTING_STARTED.md index 7c11cfa60..17665087d 100644 --- a/docs/guides/GETTING_STARTED.md +++ b/docs/guides/GETTING_STARTED.md @@ -1,49 +1,73 @@ -# Getting Started with Ruvector +# Getting Started with RuVector -## What is Ruvector? +## What is RuVector? -Ruvector is a high-performance, Rust-native vector database designed for modern AI applications. It provides: +RuVector is a high-performance, Rust-native vector database and file format designed for modern AI applications. It provides: - **10-100x performance improvements** over Python/TypeScript implementations - **Sub-millisecond latency** with HNSW indexing and SIMD optimization -- **AgenticDB API compatibility** for seamless migration - **Multi-platform deployment** (Rust, Node.js, WASM/Browser, CLI) -- **Advanced features** including quantization, hybrid search, and causal memory +- **RVF (RuVector Format)** — a self-contained binary format with embedded WASM, kernel, eBPF, and dashboard segments +- **Advanced features** including quantization, filtered search, witness chains, COW branching, and AGI container manifests + +## Packages + +| Package | Registry | Version | Description | +|---------|----------|---------|-------------| +| `ruvector-core` | crates.io | 2.0.x | Core Rust library (VectorDB, HNSW, quantization) | +| `ruvector` | npm | 0.1.x | Node.js native bindings via NAPI-RS | +| `@ruvector/rvf` | npm | 0.2.x | RVF format library (TypeScript) | +| `@ruvector/rvf-node` | npm | 0.1.x | RVF Node.js native bindings | +| `@ruvector/gnn` | npm | 0.1.x | Graph Neural Network bindings | +| `@ruvector/graph-node` | npm | 2.0.x | Graph database with Cypher queries | +| `ruvector-wasm` / `@ruvector/wasm` | npm | — | Browser WASM build | ## Quick Start ### Installation -#### Rust -```bash -# Add to Cargo.toml +#### Rust (ruvector-core) +```toml +# Cargo.toml +[dependencies] +ruvector-core = "2.0" +``` + +#### Rust (RVF format — separate workspace) +```toml +# Cargo.toml — RVF crates live in examples/rvf or crates/rvf [dependencies] -ruvector-core = "0.1.0" +rvf-runtime = "0.2" +rvf-crypto = "0.2" ``` #### Node.js ```bash npm install ruvector -# or -yarn add ruvector +# or for the RVF format: +npm install @ruvector/rvf-node ``` #### CLI ```bash -cargo install ruvector-cli +# Build from source +git clone https://github.com/ruvnet/ruvector.git +cd ruvector +cargo install --path crates/ruvector-cli ``` -### Basic Usage +### Basic Usage — ruvector-core (VectorDB) #### Rust ```rust use ruvector_core::{VectorDB, VectorEntry, SearchQuery, DbOptions}; fn main() -> Result<(), Box> { - // Create a new vector database - let mut options = DbOptions::default(); - options.dimensions = 128; - options.storage_path = "./vectors.db".to_string(); + let options = DbOptions { + dimensions: 128, + storage_path: "./vectors.db".to_string(), + ..Default::default() + }; let db = VectorDB::new(options)?; @@ -61,12 +85,12 @@ fn main() -> Result<(), Box> { vector: vec![0.1; 128], k: 10, filter: None, - include_vectors: false, + ef_search: None, }; - let results = db.search(&query)?; + let results = db.search(query)?; for (i, result) in results.iter().enumerate() { - println!("{}. ID: {}, Distance: {}", i + 1, result.id, result.distance); + println!("{}. ID: {}, Score: {:.4}", i + 1, result.id, result.score); } Ok(()) @@ -78,34 +102,68 @@ fn main() -> Result<(), Box> { const { VectorDB } = require('ruvector'); async function main() { - // Create a new vector database const db = new VectorDB({ dimensions: 128, storagePath: './vectors.db', - distanceMetric: 'cosine' + distanceMetric: 'Cosine' }); - // Insert a vector const id = await db.insert({ vector: new Float32Array(128).fill(0.1), metadata: { text: 'Example document' } }); console.log('Inserted vector:', id); - // Search for similar vectors const results = await db.search({ vector: new Float32Array(128).fill(0.1), k: 10 }); results.forEach((result, i) => { - console.log(`${i + 1}. ID: ${result.id}, Distance: ${result.distance}`); + console.log(`${i + 1}. ID: ${result.id}, Score: ${result.score.toFixed(4)}`); }); } main().catch(console.error); ``` +### Basic Usage — RVF Format (RvfStore) + +The RVF format is a newer, self-contained binary format used in the `rvf-runtime` crate. See [`examples/rvf/`](../../examples/rvf/) for working examples. + +```rust +use rvf_runtime::{RvfStore, RvfOptions, QueryOptions, MetadataEntry, MetadataValue}; +use rvf_runtime::options::DistanceMetric; + +fn main() -> Result<(), Box> { + let opts = RvfOptions { + dimension: 128, + metric: DistanceMetric::L2, + ..Default::default() + }; + + let mut store = RvfStore::create("data.rvf", opts)?; + + // Ingest vectors with metadata + let vectors = vec![vec![0.1f32; 128]]; + let refs: Vec<&[f32]> = vectors.iter().map(|v| v.as_slice()).collect(); + let ids = vec![0u64]; + let meta = vec![ + MetadataEntry { field_id: 0, value: MetadataValue::String("doc".into()) }, + ]; + store.ingest_batch(&refs, &ids, Some(&meta))?; + + // Query + let query = vec![0.1f32; 128]; + let results = store.query(&query, 5, &QueryOptions::default())?; + for r in &results { + println!("id={}, distance={:.4}", r.id, r.distance); + } + + Ok(()) +} +``` + #### CLI ```bash # Create a database @@ -119,8 +177,25 @@ ruvector search --db ./vectors.db --query "[0.1, 0.2, ...]" --top-k 10 # Show database info ruvector info --db ./vectors.db + +# Graph operations +ruvector graph create --db ./graph.db --dimensions 128 +ruvector graph query --db ./graph.db --query "MATCH (n) RETURN n LIMIT 10" ``` +## Two API Surfaces + +RuVector has two main API surfaces: + +| | **ruvector-core (VectorDB)** | **rvf-runtime (RvfStore)** | +|---|---|---| +| **Use case** | General-purpose vector DB | Self-contained binary format | +| **Storage** | Directory-based | Single `.rvf` file | +| **IDs** | String-based | u64-based | +| **Metadata** | JSON HashMap | Typed fields (String, U64) | +| **Extras** | Collections, metrics, health | Witness chains, WASM/kernel/eBPF embedding, COW branching, AGI containers | +| **Node.js** | `ruvector` npm package | `@ruvector/rvf-node` npm package | + ## Core Concepts ### 1. Vector Database @@ -133,11 +208,11 @@ A vector database stores high-dimensional vectors (embeddings) and enables fast ### 2. Distance Metrics -Ruvector supports multiple distance metrics: +RuVector supports multiple distance metrics: - **Euclidean (L2)**: Standard distance in Euclidean space - **Cosine**: Measures angle between vectors (normalized dot product) - **Dot Product**: Inner product (useful for pre-normalized vectors) -- **Manhattan (L1)**: Sum of absolute differences +- **Manhattan (L1)**: Sum of absolute differences (ruvector-core only) ### 3. HNSW Indexing @@ -153,41 +228,43 @@ Key parameters: ### 4. Quantization -Reduce memory usage with quantization: +Reduce memory usage with quantization (ruvector-core): - **Scalar (int8)**: 4x compression, 97-99% recall - **Product**: 8-16x compression, 90-95% recall - **Binary**: 32x compression, 80-90% recall (filtering) -### 5. AgenticDB Features +### 5. RVF Format Features -Advanced features for AI agents: -- **Reflexion Memory**: Self-critique episodes for learning -- **Skill Library**: Reusable action patterns -- **Causal Memory**: Cause-effect relationships -- **Learning Sessions**: RL training data +The RVF binary format supports: +- **Witness chains**: Cryptographic audit trails (SHAKE256) +- **Segment embedding**: WASM, kernel, eBPF, and dashboard segments in one file +- **COW branching**: Copy-on-write branches for staging environments +- **Lineage tracking**: Parent-child derivation with depth tracking +- **Membership filters**: Bitmap-based tenant isolation +- **DoS hardening**: Token buckets, negative caches, proof-of-work +- **AGI containers**: Self-describing agent manifests ## Next Steps - [Installation Guide](INSTALLATION.md) - Detailed installation instructions -- [Basic Tutorial](BASIC_TUTORIAL.md) - Step-by-step tutorial +- [Basic Tutorial](BASIC_TUTORIAL.md) - Step-by-step tutorial with ruvector-core - [Advanced Features](ADVANCED_FEATURES.md) - Hybrid search, quantization, filtering -- [AgenticDB Migration Guide](../MIGRATION.md) - Migrate from agenticDB +- [RVF Examples](../../examples/rvf/) - Working RVF format examples (openfang, security_hardened, etc.) - [API Reference](../api/) - Complete API documentation -- [Examples](../../examples/) - Working code examples +- [Examples](../../examples/) - All working code examples ## Performance Tips 1. **Choose the right distance metric**: Cosine for normalized embeddings, Euclidean otherwise 2. **Tune HNSW parameters**: Higher `m` and `ef_construction` for better recall 3. **Enable quantization**: Reduces memory 4-32x with minimal accuracy loss -4. **Batch operations**: Use `insert_batch()` for better throughput -5. **Memory-map large datasets**: Set `mmap_vectors: true` for datasets larger than RAM +4. **Batch operations**: Use `insert_batch()` / `ingest_batch()` for better throughput +5. **Build with SIMD**: `RUSTFLAGS="-C target-cpu=native" cargo build --release` ## Common Issues ### Out of Memory - Enable quantization to reduce memory usage -- Use memory-mapped vectors for large datasets - Reduce `max_elements` or increase available RAM ### Slow Search @@ -204,8 +281,7 @@ Advanced features for AI agents: - **GitHub**: [https://github.com/ruvnet/ruvector](https://github.com/ruvnet/ruvector) - **Issues**: [https://github.com/ruvnet/ruvector/issues](https://github.com/ruvnet/ruvector/issues) -- **Documentation**: [https://docs.rs/ruvector-core](https://docs.rs/ruvector-core) ## License -Ruvector is licensed under the MIT License. See [LICENSE](../../LICENSE) for details. +RuVector is licensed under the MIT License. See [LICENSE](../../LICENSE) for details. diff --git a/docs/guides/INSTALLATION.md b/docs/guides/INSTALLATION.md index 304c89b21..292fd69b6 100644 --- a/docs/guides/INSTALLATION.md +++ b/docs/guides/INSTALLATION.md @@ -5,7 +5,7 @@ This guide covers installation of Ruvector for all supported platforms: Rust, No ## Prerequisites ### Rust -- **Rust 1.77+** (latest stable recommended) +- **Rust 1.80+** (latest stable recommended) - **Cargo** (included with Rust) Install Rust from [rustup.rs](https://rustup.rs/): @@ -30,7 +30,15 @@ Download from [nodejs.org](https://nodejs.org/) #### Add to Cargo.toml ```toml [dependencies] -ruvector-core = "0.1.0" +ruvector-core = "2.0" +``` + +For the RVF binary format (separate workspace in `crates/rvf`): +```toml +[dependencies] +rvf-runtime = "0.2" +rvf-crypto = "0.2" +rvf-types = "0.2" ``` #### Build with optimizations @@ -45,15 +53,15 @@ RUSTFLAGS="-C target-cpu=native" cargo build --release RUSTFLAGS="-C target-feature=+avx2,+fma" cargo build --release ``` -#### Optional features +#### Optional features (ruvector-core) ```toml [dependencies] -ruvector-core = { version = "0.1.0", features = ["agenticdb", "advanced"] } +ruvector-core = { version = "2.0", features = ["hnsw", "storage"] } ``` Available features: -- `agenticdb`: AgenticDB API compatibility (enabled by default) -- `advanced`: Advanced features (product quantization, hybrid search) +- `hnsw`: HNSW indexing (enabled by default) +- `storage`: Persistent storage backend - `simd`: SIMD intrinsics (enabled by default on x86_64) ### 2. Node.js Package @@ -81,10 +89,10 @@ console.log('Ruvector loaded successfully!'); #### Platform-specific binaries -Ruvector uses NAPI-RS for native bindings. Pre-built binaries are available for: -- **Linux**: x64, arm64 (glibc 2.17+) -- **macOS**: x64 (10.13+), arm64 (11.0+) -- **Windows**: x64, arm64 +RuVector uses NAPI-RS for native bindings. Pre-built binaries are available for: +- **Linux**: x64 (glibc), x64 (musl), arm64 (glibc), arm64 (musl) +- **macOS**: x64, arm64 (Apple Silicon) +- **Windows**: x64 If no pre-built binary is available, it will compile from source (requires Rust). @@ -92,7 +100,13 @@ If no pre-built binary is available, it will compile from source (requires Rust) #### NPM package ```bash -npm install ruvector-wasm +npm install @ruvector/wasm +``` + +There are also specialized WASM packages: +```bash +npm install @ruvector/rvf-wasm # RVF format in browser +npm install @ruvector/gnn-wasm # Graph neural networks ``` #### Basic usage @@ -100,11 +114,11 @@ npm install ruvector-wasm - Ruvector WASM Demo + RuVector WASM Demo