Merged
2 changes: 1 addition & 1 deletion Cargo.lock

Some generated files are not rendered by default.

25 changes: 14 additions & 11 deletions README.md
@@ -10,8 +10,8 @@

<p align="center">
<a href="LICENSE"><img src="https://img.shields.io/badge/License-MIT%2FApache--2.0-blue.svg" alt="License" /></a>
<a href="#current-status"><img src="https://img.shields.io/badge/Status-Beta-brightgreen.svg" alt="Status" /></a>
<a href="https://github.com/anaslimem/CortexaDB/releases"><img src="https://img.shields.io/badge/Version-0.1.8-blue.svg" alt="Version" /></a>
<a href="#current-status"><img src="https://img.shields.io/badge/Status-Stable-brightgreen.svg" alt="Status" /></a>
<a href="https://github.com/anaslimem/CortexaDB/releases"><img src="https://img.shields.io/badge/Version-1.0.0-blue.svg" alt="Version" /></a>
<a href="https://pepy.tech/projects/cortexadb"><img src="https://static.pepy.tech/personalized-badge/cortexadb?period=total&units=INTERNATIONAL_SYSTEM&left_color=GRAY&right_color=BLUE&left_text=downloads" alt="Downloads" /></a>
<a href="https://cortexa-db.vercel.app"><img src="https://img.shields.io/badge/Docs-cortexa--db.vercel.app-purple.svg" alt="Documentation" /></a>
</p>
@@ -82,23 +82,26 @@ pip install cortexadb[docs,pdf] # Optional: For PDF/Docx support
<details>
<summary><b>Technical Architecture & Benchmarks</b></summary>

### Performance Benchmarks (v0.1.8)
### Performance Benchmarks (v1.0.0)

CortexaDB `v0.1.8` introduced a new batching architecture. Measured on an M2 Mac with 1,000 chunks of text:
Measured on an M-series Mac — 10,000 embeddings × 384 dimensions.

| Operation | v0.1.6 (Sync) | v0.1.8 (Batch) | Improvement |
|-----------|---------------|----------------|-------------|
| Ingestion | 12.4s | **0.12s** | **103x Faster** |
| Memory Add| 15ms | 1ms | 15x Faster |
| HNSW Search| 0.3ms | 0.28ms | - |
| Operation | Latency / Time |
|-----------|---------------|
| Bulk Ingestion (1,000 chunks) | **0.12s** |
| Single Memory Add | **1ms** |
| HNSW Search p50 | **1.03ms** (debug) / ~0.3ms (release) |
| HNSW Recall | **95%** |

See the [full benchmark docs](https://cortexa-db.vercel.app/docs/resources/benchmarks) for HNSW vs Exact comparison and how to reproduce.

</details>

---

## License & Status
CortexaDB is currently in **Beta (v0.1.8)**. It is released under the **MIT** and **Apache-2.0** licenses.
We are actively refining the API and welcome feedback!
CortexaDB `v1.0.0` is a **stable release** available under the **MIT** and **Apache-2.0** licenses.
We welcome feedback and contributions!

---
> *CortexaDB — Because agents shouldn't have to choose between speed and a soul (memory).*
2 changes: 1 addition & 1 deletion crates/cortexadb-core/Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "cortexadb-core"
version = "0.1.8"
version = "1.0.0"
edition = "2021"
authors = ["Anas Limem <limemanas0@gmail.com>"]
description = "Fast, embedded vector + graph memory for AI agents"
14 changes: 6 additions & 8 deletions crates/cortexadb-core/src/engine.rs
@@ -271,7 +271,7 @@ impl Engine {
match &cmd {
Command::Add(entry) => {
// Write entry to segment storage
self._write_entry_to_segments(entry)?;
self.write_entry_to_segments(entry)?;
}
Command::Delete(id) => {
// Mark as deleted in segments
@@ -288,7 +288,7 @@
// In relaxed modes caller flushes later via sync policy.
self.state_machine.apply_command(cmd)?;

// 5. Update tracking
// 3. Update tracking
self.last_applied_id = cmd_id;

Ok(cmd_id)
@@ -425,8 +425,8 @@ impl Engine {
collection_bytes + content_bytes + embedding_bytes + metadata_bytes
}

/// Helper: Write entry to segments
fn _write_entry_to_segments(
/// Write entry to segments.
fn write_entry_to_segments(
&mut self,
entry: &crate::core::memory_entry::MemoryEntry,
) -> Result<()> {
@@ -439,10 +439,8 @@
&self.state_machine
}

/// Get mutable reference to the state machine
/// NOTE: If you modify state directly (not via execute_command),
/// you bypass WAL durability! Use execute_command() instead.
pub fn get_state_machine_mut(&mut self) -> &mut StateMachine {
/// Get mutable reference to the state machine
pub(crate) fn get_state_machine_mut(&mut self) -> &mut StateMachine {
&mut self.state_machine
}

2 changes: 1 addition & 1 deletion crates/cortexadb-core/src/lib.rs
@@ -9,5 +9,5 @@ pub mod store;

// Re-export the primary facade types for convenience.
pub use chunker::{chunk, ChunkMetadata, ChunkResult, ChunkingStrategy};
pub use facade::{CortexaDB, CortexaDBConfig, CortexaDBError, Memory, Stats};
pub use facade::{BatchRecord, CortexaDB, CortexaDBBuilder, CortexaDBConfig, CortexaDBError, Hit, Memory, Stats};
pub use index::{HnswBackend, HnswConfig, HnswError, IndexMode, MetricKind};
27 changes: 17 additions & 10 deletions crates/cortexadb-py/cortexadb/client.py
@@ -281,7 +281,8 @@ def search(
if self.get(target_id).collection not in collections:
continue
scored_candidates[target_id] = max(scored_candidates.get(target_id, 0), hit.score * 0.9)
except: pass
except Exception:
pass

if recency_bias:
now = time.time()
@@ -291,7 +292,8 @@
age = max(0, now - mem.created_at)
decay = 0.5 ** (age / (30 * 86400))
scored_candidates[obj_id] *= (1.0 + 0.2 * decay)
except: pass
except Exception:
pass

final = [Hit(mid, s) for mid, s in scored_candidates.items()]
final.sort(key=lambda h: h.score, reverse=True)
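The recency bias in this hunk applies an exponential decay with a 30-day half-life and caps the boost at 1.2x. A standalone sketch of that scoring step (same formula as the diff; the function name and example timestamps are illustrative):

```python
import time


def recency_boost(score, created_at, now=None):
    """Boost a search score by up to 20% based on memory age,
    halving the boost contribution every 30 days."""
    now = time.time() if now is None else now
    age = max(0, now - created_at)
    decay = 0.5 ** (age / (30 * 86400))  # 1.0 when fresh, 0.5 after 30 days
    return score * (1.0 + 0.2 * decay)


now = 1_000_000_000.0
fresh = recency_boost(1.0, now, now)                 # full 1.2x boost
month_old = recency_boost(1.0, now - 30 * 86400, now)  # 1.1x boost
```

Very old memories converge toward an unboosted 1.0x, so recency nudges ranking without overriding semantic relevance.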
@@ -311,28 +313,33 @@ def export_replay(self, path: str):
"""Export all memories to a replay log."""
from .replay import ReplayWriter
writer = ReplayWriter(path, dimension=self._dimension)
report = {"checked": 0, "exported": 0, "skipped_missing_embedding": 0}

# This is a bit slow as we iterate all IDs
report = {"checked": 0, "exported": 0, "skipped_missing_embedding": 0, "errors": []}

stats = self.stats()
for i in range(1, stats.entries + 1):
total_live = stats.entries
found = 0
mid = 1
scan_limit = max(total_live * 4, 1000)
while found < total_live and mid <= scan_limit:
report["checked"] += 1
try:
mem = self.get(i)
mem = self.get(mid)
if mem.embedding:
writer.record_add(
id=mem.id,
text=bytes(mem.content).decode("utf-8") if mem.content else "",
embedding=mem.embedding,
collection=mem.collection,
metadata=mem.metadata
metadata=mem.metadata,
)
report["exported"] += 1
else:
report["skipped_missing_embedding"] += 1
except:
found += 1
except Exception:
pass

mid += 1

writer.close()
self._last_export_replay_report = report

2 changes: 1 addition & 1 deletion crates/cortexadb-py/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "maturin"

[project]
name = "cortexadb"
version = "0.1.8"
version = "1.0.0"
requires-python = ">=3.9"
description = "Fast, embedded vector + graph memory for AI agents"
authors = [
66 changes: 39 additions & 27 deletions docs/content/docs/resources/benchmarks.mdx
@@ -1,22 +1,22 @@
---
title: Benchmarks
description: Performance benchmarks and methodology
description: Performance benchmarks and methodology for v1.0.0
---

CortexaDB delivers sub-millisecond query latency and rapid ingestion, optimized for local agentic workflows.
CortexaDB delivers fast, local vector search optimized for AI agent memory workloads. Numbers below are from a **debug build** on an M-series Mac — a release build is 5–10x faster.

## Performance Overview

Key metrics measured with **10,000 embeddings** (384 dimensions) on an M1 Pro Mac.
Key metrics measured with **10,000 embeddings** (384 dimensions) on an M-series Mac, v1.0.0 debug build.

<Cards>
<Card title="103x Faster Ingestion" icon={<Zap className="text-yellow-500" />}>
Batch ingestion processed 1,000 chunks in **0.12s** (formerly 12.4s).
<Card title="1.03ms p50 Search" icon={<Zap className="text-yellow-500" />}>
HNSW search on 10,000 vectors. Release build achieves **~0.3ms** p50.
</Card>
<Card title="3,200+ QPS" icon={<TrendingUp className="text-blue-500" />}>
High-throughput HNSW search with sub-millisecond p50 latency.
<Card title="952+ QPS" icon={<TrendingUp className="text-blue-500" />}>
HNSW throughput (debug build). Release build exceeds **3,000 QPS**.
</Card>
<Card title="95%+ Recall" icon={<Shield className="text-green-500" />}>
<Card title="95% Recall" icon={<Shield className="text-green-500" />}>
Approximate search maintains high accuracy relative to brute-force.
</Card>
</Cards>
@@ -25,47 +25,59 @@ Key metrics measured with **10,000 embeddings** (384 dimensions) on an M1 Pro Ma

## Retrieval Benchmarks

| Mode | Latency (p50) | Throughput | Recall | Index Time |
|------|---------------|------------|--------|------------|
| **HNSW** | **0.29ms** | **3,203 QPS** | 95% | 151s |
| Exact | 1.34ms | 690 QPS | 100% | 138s |
Measured: 10,000 embeddings × 384 dimensions, 1,000 queries, 100 warmup, top-10.

## Ingestion Benchmarks
| Mode | p50 | p95 | p99 | Throughput | Recall | Disk |
|------|-----|-----|-----|-----------|--------|------|
| **HNSW** | **1.03ms** | 1.18ms | 1.29ms | **952 QPS** | **95%** | 47 MB |
| Exact | 16.38ms | 22.69ms | 35.77ms | 56 QPS | 100% | 31 MB |

| Operation | Previous | Current | Speedup |
|-----------|----------|---------|---------|
| **Bulk Ingest** | 12.4s | **0.12s** | **103x** |
| Memory Add | 15ms | 1ms | 15x |
| HNSW Build | 151s | 151s | - |
> [!NOTE]
> These numbers are from a **debug build** (`maturin develop`). With a release build (`maturin develop --release`), HNSW achieves **~0.3ms p50** and **3,000+ QPS** — consistent with the ingestion benchmarks below.

---

## Ingestion

| Operation | Time |
|-----------|------|
| Bulk Ingest (1,000 chunks) | **0.12s** |
| Single Memory Add | **1ms** |
| HNSW Index Build (10,000 vectors) | ~286s (debug) / ~140s (release) |

---

## Methodology

- **Dataset**: 10,000 embeddings x 384 dimensions (Sentence-Transformers standard).
- **Environment**: MacBook Pro M1 Pro (16-core GPU, 32GB RAM).
- **Query Latency**: p50 measured across 1,000 queries after 100 warmup cycles.
- **Recall**: Percentage of HNSW results identical to brute-force exact scan.
- **Dataset**: 10,000 random embeddings × 384 dimensions.
- **Environment**: M-series Mac. Debug build via `maturin develop`.
- **Query Latency**: p50/p95/p99 measured across 1,000 queries after 100 warmup cycles.
- **Recall**: % of HNSW results identical to brute-force exact scan (100 queries, top-10).
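The recall metric in the methodology is the overlap between the approximate and exact top-k ID sets. A minimal sketch of that computation (the ID lists are made-up illustrative data, not benchmark output):

```python
def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k results that the approximate
    index also returned (order-insensitive)."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


# One query where HNSW misses one of the true top-10 neighbors:
exact = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
hnsw = [1, 2, 3, 4, 5, 6, 7, 8, 9, 42]
print(recall_at_k(hnsw, exact))  # 0.9
```

Averaging this over the 100 recall queries at top-10 yields the reported 95% figure.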

---

## Reproducing Results

Build the release extension:
Build the release extension for best performance:

```bash
cd crates/cortexadb-py
maturin develop --release
cd ../..
pip install numpy psutil
```

Run the automated benchmark suite:

```bash
# Generate 10k test vectors
python benchmark/generate_embeddings.py --count 10000 --dimensions 384
python3 benchmark/generate_embeddings.py --count 10000 --dimensions 384

# Benchmark HNSW performance
python benchmark/run_benchmark.py --index-mode hnsw
python3 benchmark/run_benchmark.py --index-mode hnsw

# Benchmark Exact performance
python3 benchmark/run_benchmark.py --index-mode exact
```

---
@@ -76,11 +88,11 @@ python benchmark/run_benchmark.py --index-mode hnsw
|--------|-----------|----------|
| **Dataset Size** | < 10,000 entries | > 10,000 entries |
| **Recall Needed** | 100% (Strict) | 95-99% (Semantic) |
| **Latency Target** | < 5ms | < 1ms |
| **Latency Target** | < 20ms (debug) / < 2ms (release) | < 5ms (debug) / < 1ms (release) |
| **Resource Profile** | Minimum Memory | High Performance |

> [!TIP]
> For datasets between 1k and 10k, **Exact mode** is often faster due to zero index-building overhead while maintaining sub-millisecond latency on modern CPUs.
> For datasets between 1k and 10k, **Exact mode** is often a good choice due to zero index-building overhead and 100% recall. HNSW shines at 10k+ entries where its sub-linear search complexity pays off.
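The intuition behind the tip is that Exact mode compares the query against every stored vector, so per-query cost scales linearly with dataset size. Quick arithmetic, assuming a simple linear model anchored on the measured 16.38ms exact p50 at 10,000 vectors (an extrapolation, not a measurement):

```python
# Exact search does roughly n * d multiply-adds per query.
d = 384                 # embedding dimensions
p50_at_10k_ms = 16.38   # measured exact p50 (debug build)

for n in (1_000, 10_000, 100_000):
    est_ms = p50_at_10k_ms * n / 10_000  # linear extrapolation
    print(f"n={n:>7}: ~{n * d:>11,} mul-adds/query, est p50 ~{est_ms:.2f}ms")
```

At 1,000 vectors the extrapolated exact latency is under 2ms even in debug, which is why small datasets rarely justify HNSW's index-build cost.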

---
