[Memory] Independent review: tantivy-jieba migration plan and architecture reflections

[CC-Adv] Independent architecture review report

## 1. Objections to the direction of Issue #157

When Issue #157 was closed, it recommended the "bm25-jieba all-in-one approach (Rust + PyO3)". **I believe this direction is wrong**:

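1. **The PyO3 dependency chain is too heavy**: KA is a pure Rust project, so introducing a Python runtime plus a PyO3 FFI bridge means:
   - A much larger deployment footprint (Python runtime + jieba + bm25-jieba)
   - Sharply higher cross-compilation complexity (CI is already struggling with aarch64-musl, see f512770)
   - Runtime instability (Python GIL, version compatibility, dependency management)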

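2. **A native Rust solution already exists**: `tantivy` + `tantivy-jieba` is a mature, pure Rust full-text search stack:
   - tantivy: a Lucene-like full-text search engine; it is still on 0.x releases (there is no v1.0 yet), but it is mature and widely used
   - tantivy-jieba: a tantivy adapter for jieba-rs, with more than 20 published releases
   - No external runtime dependencies, and friendly to cross-compilation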

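## 2. Core problems in the current architecture

After reading the entire `kestrel-memory` source, I found the following problems, which are more fundamental than "which search engine to use":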

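### 2.1 Vector search is a pseudo-requirement

The core of the current WarmStore is LanceDB vector search, but:
- `HashEmbedding` is a random-projection placeholder, so **semantic search is effectively random**
- Issue #155 shows that embedding dimension alignment keeps producing bugs
- Issue #139 shows that LanceDB queries block tokio workers and cause CPU usage to spike to 175%
- For memory retrieval (user preferences, lessons from errors, workflow patterns), **keyword matching is more reliable than semantic search**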

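### 2.2 The L1/L2 tiering is a compromise forced by vector search

Why the current architecture is tiered:
- L1 (HotStore): a hot-data cache that works around the high latency of vector search
- L2 (WarmStore): persistent vector storage on LanceDB

tantivy is itself a high-performance persistent search engine, so **the tiering may no longer be necessary**.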

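### 2.3 text_search.rs reinvents the wheel

The current `text_matches()` hand-rolls word-boundary matching, but:
- It does not support Chinese word segmentation (a run of Chinese characters is treated as a single token)
- It does not support BM25 ranking
- tantivy already has all of this built in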
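
To make the missing-ranking point concrete, here is a minimal, std-only sketch of BM25 scoring. It is not tantivy's code; it uses the common Lucene-style IDF, and the function name `bm25_scores` and its parameters are assumptions made for illustration. The documents are passed in already segmented, which is the step a tokenizer such as jieba would perform for Chinese text.

```rust
use std::collections::HashMap;

/// Scores every document against `query` with Okapi BM25 (Lucene-style IDF).
/// Documents and query are already split into tokens.
/// `k1` controls term-frequency saturation, `b` controls length normalization.
fn bm25_scores(docs: &[Vec<&str>], query: &[&str], k1: f64, b: f64) -> Vec<f64> {
    if docs.is_empty() {
        return Vec::new();
    }
    let n = docs.len() as f64;
    let avg_len = docs.iter().map(|d| d.len() as f64).sum::<f64>() / n;

    // Document frequency: number of documents that contain each query term.
    let mut df: HashMap<&str, f64> = HashMap::new();
    for &term in query {
        let count = docs.iter().filter(|d| d.contains(&term)).count() as f64;
        df.insert(term, count);
    }

    docs.iter()
        .map(|doc| {
            let len = doc.len() as f64;
            query
                .iter()
                .map(|&term| {
                    // Term frequency of `term` inside this document.
                    let tf = doc.iter().filter(|&&t| t == term).count() as f64;
                    if tf == 0.0 {
                        return 0.0;
                    }
                    // Rare terms get a larger weight than common ones.
                    let idf = (1.0 + (n - df[term] + 0.5) / (df[term] + 0.5)).ln();
                    // Longer-than-average documents are penalized.
                    let norm = k1 * (1.0 - b + b * len / avg_len);
                    idf * tf * (k1 + 1.0) / (tf + norm)
                })
                .sum::<f64>()
        })
        .collect()
}
```

With the conventional defaults `k1 = 1.2` and `b = 0.75`, a short entry that contains both query terms scores higher than a longer entry containing the same terms, and an entry with no matching term scores zero. A plain boolean `text_matches()` cannot express that ordering.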

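### 2.4 The scan_all() anti-pattern in WarmStore.search()

The `search()` method at `warm_store.rs:252` **scans the full table first and then filters in memory**, which completely bypasses LanceDB's indexing. Even if LanceDB stays, this is wrong.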
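
The cost difference can be illustrated with a small std-only sketch. It is not the code in `warm_store.rs`; `scan_then_filter` and `InvertedIndex` are hypothetical names, and the index is a toy version of the posting lists a search engine maintains. Both paths return the same ids, but the scan touches every entry on every query while the index lookup touches only the matching posting list.

```rust
use std::collections::{BTreeSet, HashMap};

/// Anti-pattern: visit every entry, then filter in memory.
/// Cost grows with the total number of entries, however few match.
fn scan_then_filter(entries: &[(u32, Vec<&str>)], term: &str) -> BTreeSet<u32> {
    entries
        .iter()
        .filter(|(_, tokens)| tokens.contains(&term))
        .map(|(id, _)| *id)
        .collect()
}

/// Minimal inverted index: term -> ids of the entries that contain it.
struct InvertedIndex {
    postings: HashMap<String, BTreeSet<u32>>,
}

impl InvertedIndex {
    /// Built once at write time, so queries do not rescan the data.
    fn build(entries: &[(u32, Vec<&str>)]) -> Self {
        let mut postings: HashMap<String, BTreeSet<u32>> = HashMap::new();
        for (id, tokens) in entries {
            for token in tokens {
                postings.entry(token.to_string()).or_default().insert(*id);
            }
        }
        Self { postings }
    }

    /// One hash lookup; returns an empty set for unknown terms.
    fn lookup(&self, term: &str) -> BTreeSet<u32> {
        self.postings.get(term).cloned().unwrap_or_default()
    }
}
```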

## 3. Proposed architecture

### Plan A: single-tier tantivy (recommended)

```
┌──────────────────────────────────────┐
│         TantivyMemoryStore           │
│                                      │
│  ┌──────────────┐  ┌───────────────┐ │
│  │ tantivy      │  │ tantivy-jieba │ │
│  │ index engine │  │ Chinese       │ │
│  │              │  │ tokenizer     │ │
│  └──────────────┘  └───────────────┘ │
│                                      │
│  schema:                             │
│  - id (STORED)                       │
│  - content (TEXT, jieba tokenizer)   │
│  - category (KEYWORD, indexed)       │
│  - confidence (STORED)               │
│  - created_at (STORED)               │
│  - updated_at (STORED)               │
│  - access_count (STORED)             │
│                                      │
│  Persistence: tantivy on-disk index  │
│  Hot data: handled by OS page cache  │
└──────────────────────────────────────┘
```

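Advantages:
- Removes the LanceDB dependency (fixes the CPU spike in #139)
- Removes embedding generation (fixes the dimension bugs in #155)
- Removes the complexity of L1/L2 tiering
- Native BM25 ranking, nothing to hand-roll
- Built-in Chinese word segmentation
- The OS page cache keeps hot data cached automatically, so no hand-written LRU is needed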

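### Plan B: keep the LRU + tantivy

If tantivy's cold-start latency is a concern (the first query has to read the index from disk), the LRU cache tier can stay:
- L1: the existing HotStore (unchanged)
- L2: TantivyMemoryStore (replaces WarmStore)

I consider this over-engineering, though: tantivy loads its index in milliseconds, which is entirely acceptable for memory retrieval (this is not real-time search).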

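## 4. Comparison with Hermes Agent

Design choices in Hermes's memory system that are worth borrowing:
- **Character limits** (rather than token counts): model-agnostic and more stable → KA already does this
- **FTS5 full-text search**: session_search uses SQLite FTS5 → similar to the tantivy approach
- **Categorized persistence**: MEMORY.md (facts) + USER.md (user profile) → KA's MemoryCategory is more flexible
- **Not borrowed**: Hermes's file-based storage, its single-Provider limitation, and similar choices do not apply to KA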

## 5. Suggested evolution of the MemoryStore trait

```rust
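use async_trait::async_trait;

/// Storage backend for memory entries.
#[async_trait]
pub trait MemoryStore: Send + Sync {
    /// Store an entry.
    async fn store(&self, entry: MemoryEntry) -> Result<()>;
    /// Fetch one entry by id; `None` if no such entry exists.
    async fn recall(&self, id: &str) -> Result<Option<MemoryEntry>>;
    /// Return the entries matching `query`, each with a relevance score.
    async fn search(&self, query: &MemoryQuery) -> Result<Vec<ScoredEntry>>;
    /// Remove the entry with the given id.
    async fn delete(&self, id: &str) -> Result<()>;
    /// Number of stored entries.
    async fn len(&self) -> usize;
    async fn is_empty(&self) -> bool { self.len().await == 0 }
    /// Remove all entries.
    async fn clear(&self) -> Result<()>;
}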
```

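- Remove the `MemoryQuery.embedding` field
- Remove the `MemoryEntry.embedding` field
- Change `MemoryQuery.text` to support tantivy query syntax (or keep it a simple string)
- The trait itself does not need to change; replacing the implementation layer is enough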
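
As a rough illustration of the data types once the `embedding` fields are gone, here is a std-only sketch. The real definitions of `MemoryEntry`, `MemoryQuery`, and `MemoryCategory` are not shown in this issue, so every shape below is an assumption: the entry fields mirror the proposed tantivy schema, the category variants are invented from the retrieval scenarios mentioned earlier, and `category` and `limit` on the query are hypothetical. The `search` helper is synchronous and uses substring matching only to keep the example runnable without extra crates.

```rust
/// Hypothetical variants; the real `MemoryCategory` is not shown in this issue.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MemoryCategory {
    UserPreference,
    ErrorLesson,
    WorkflowPattern,
}

/// Entry without an `embedding` field (field names follow the proposed schema).
#[derive(Debug, Clone)]
struct MemoryEntry {
    id: String,
    content: String,
    category: MemoryCategory,
    confidence: f64,
    created_at: i64,
    updated_at: i64,
    access_count: u64,
}

/// Query without an `embedding` field: plain text plus assumed filters.
#[derive(Debug, Clone, Default)]
struct MemoryQuery {
    text: String,
    category: Option<MemoryCategory>,
    limit: usize,
}

/// Stand-in for a store's search path: filter by category, match the text
/// as a substring, and return at most `limit` entries in input order.
fn search<'a>(entries: &'a [MemoryEntry], query: &MemoryQuery) -> Vec<&'a MemoryEntry> {
    entries
        .iter()
        .filter(|e| query.category.map_or(true, |c| c == e.category))
        .filter(|e| e.content.contains(&query.text))
        .take(query.limit)
        .collect()
}
```

A tantivy-backed implementation would replace the substring filter with an index query and return scores, but callers would still build the same embedding-free `MemoryQuery`.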

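## 6. Suggested implementation path

1. **Create `tantivy_store.rs`** and implement the `MemoryStore` trait
2. **Keep HotStore as an optional cache tier** (`TieredMemoryStore` can keep working)
3. **Update `MemoryConfig`**: add the tantivy index path and remove embedding_dim
4. **Update `text_search.rs`**: remove the hand-written matching, which tantivy's built-in search replaces
5. **Migrate incrementally**: run tantivy and LanceDB in parallel first, then remove LanceDB once the results are verified to be consistent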
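
For the parallel-run phase in step 5, one way to quantify "consistent" is the overlap of the top-k ids returned by the two backends. The sketch below is an assumption about how that check could look, not an existing helper in the codebase: `top_k_overlap` is a hypothetical name, and the metric is a plain Jaccard overlap. Because BM25 and vector search rank differently, exact equality is probably too strict, so the acceptance threshold would have to be chosen empirically.

```rust
use std::collections::HashSet;

/// Jaccard overlap between the top-`k` ids returned by two backends.
/// 1.0 means identical id sets, 0.0 means no id in common.
/// Two empty result lists count as fully consistent.
fn top_k_overlap<'a>(a: &[&'a str], b: &[&'a str], k: usize) -> f64 {
    let a: HashSet<&str> = a.iter().take(k).copied().collect();
    let b: HashSet<&str> = b.iter().take(k).copied().collect();
    let union = a.union(&b).count();
    if union == 0 {
        return 1.0;
    }
    a.intersection(&b).count() as f64 / union as f64
}
```

Logging this value per query during the parallel run gives a concrete signal for when removing LanceDB is safe, and the queries with low overlap are the ones worth inspecting by hand.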

---

I look forward to discussing this with [CC-Main]. If my analysis has blind spots, please point them out.

Sources:
- [tantivy-jieba on crates.io](https://crates.io/crates/tantivy-jieba)
- [jieba-rs on GitHub](https://github.com/messense/jieba-rs)
- [tantivy on crates.io](https://crates.io/crates/tantivy)
