diff --git a/docs/design/multi-tenant-design.md b/docs/design/multi-tenant-design.md index 5eeef560..0bfd966b 100644 --- a/docs/design/multi-tenant-design.md +++ b/docs/design/multi-tenant-design.md @@ -261,7 +261,7 @@ PUT /api/v1/admin/accounts/{account_id}/users/{uid}/role 修改用户角色 ( - **user**:同一 account 内,不同用户的私有数据互不可见。用户记忆、资源、session 属于用户本人 - **agent**:同一 account 内,agent 目录由 user_id + agent_id 共同决定,每用户独立(见 4.3) -**Space 标识符**:`UserIdentifier` 新增两个方法,拆分现有的 `unique_space_name()`: +**Space 标识符**:`UserIdentifier` 提供两个方法 `user_space_name()` 和 `agent_space_name()`: ```python def user_space_name(self) -> str: diff --git a/docs/en/concepts/04-viking-uri.md b/docs/en/concepts/04-viking-uri.md index 7b27d350..42bbd25c 100644 --- a/docs/en/concepts/04-viking-uri.md +++ b/docs/en/concepts/04-viking-uri.md @@ -120,7 +120,7 @@ viking:// │ ├── entities/ # Each independent │ └── events/ # Each independent │ -├── agent/{unique_space_name}/ # unique_space_name see UserIdentifier +├── agent/{agent_space}/ # agent_space = agent_space_name() │ ├── skills/ # Skill definitions │ ├── memories/ │ │ ├── cases/ @@ -128,7 +128,7 @@ viking:// │ ├── workspaces/ │ └── instructions/ │ -└── session/{unique_space_name}/{session_id}/ +└── session/{user_space}/{session_id}/ ├── messages/ ├── tools/ └── history/ diff --git a/docs/en/guides/03-deployment.md b/docs/en/guides/03-deployment.md index 9f57a764..10ffe708 100644 --- a/docs/en/guides/03-deployment.md +++ b/docs/en/guides/03-deployment.md @@ -191,6 +191,105 @@ curl http://localhost:1933/api/v1/fs/ls?uri=viking:// \ -H "X-API-Key: your-key" ``` +## Cloud Deployment + +### Docker + +```dockerfile +FROM python:3.11-slim +WORKDIR /app +COPY . . +RUN pip install -e . +EXPOSE 1933 +CMD ["python", "-m", "openviking", "serve", "--config", "/etc/openviking/ov.conf"] +``` + +```bash +docker build -t openviking . 
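# Optional: push the image to a registry the cluster can pull from.
# The registry host below is a placeholder; substitute your own:
#   docker tag openviking registry.example.com/openviking:latest
#   docker push registry.example.com/openviking:latest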
+docker run -d -p 1933:1933 \ + -v /path/to/ov.conf:/etc/openviking/ov.conf:ro \ + -v /data/openviking:/data/openviking \ + openviking +``` + +### Kubernetes + +```yaml +apiVersion: apps/v1 +kind: Deployment +metadata: + name: openviking +spec: + replicas: 1 + selector: + matchLabels: + app: openviking + template: + metadata: + labels: + app: openviking + spec: + containers: + - name: openviking + image: openviking:latest + ports: + - containerPort: 1933 + volumeMounts: + - name: config + mountPath: /etc/openviking + readOnly: true + - name: data + mountPath: /data/openviking + livenessProbe: + httpGet: + path: /health + port: 1933 + initialDelaySeconds: 5 + periodSeconds: 10 + readinessProbe: + httpGet: + path: /ready + port: 1933 + initialDelaySeconds: 10 + periodSeconds: 15 + volumes: + - name: config + configMap: + name: openviking-config + - name: data + persistentVolumeClaim: + claimName: openviking-data +--- +apiVersion: v1 +kind: Service +metadata: + name: openviking +spec: + selector: + app: openviking + ports: + - port: 1933 + targetPort: 1933 +``` + +## Health Checks + +| Endpoint | Auth | Purpose | +|----------|------|---------| +| `GET /health` | No | Liveness probe — returns `{"status": "ok"}` immediately | +| `GET /ready` | No | Readiness probe — checks AGFS, VectorDB, APIKeyManager | + +```bash +# Liveness +curl http://localhost:1933/health + +# Readiness +curl http://localhost:1933/ready +# {"status": "ready", "checks": {"agfs": "ok", "vectordb": "ok", "api_key_manager": "ok"}} +``` + +Use `/health` for Kubernetes liveness probes and `/ready` for readiness probes. 
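
In deployment automation it is convenient to block until the server passes its readiness check before registering tenants or routing traffic. The sketch below is a stdlib-only polling helper; it assumes nothing beyond the `/ready` endpoint and the `{"status": "ready", ...}` response shape documented above, and the function name and parameters are illustrative:

```python
import json
import time
import urllib.request


def wait_until_ready(base_url: str, timeout: float = 60.0, interval: float = 2.0) -> bool:
    """Poll GET /ready until the server reports ready; give up after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(f"{base_url}/ready", timeout=5) as resp:
                if json.load(resp).get("status") == "ready":
                    return True
        except (OSError, ValueError):
            pass  # connection refused or malformed body: server not up yet, keep polling
        time.sleep(interval)
    return False


# Example: wait_until_ready("http://localhost:1933", timeout=120)
```

The same loop works against `/health` if you only need a liveness signal rather than a full dependency check.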
+ ## Related Documentation - [Authentication](04-authentication.md) - API key setup diff --git a/docs/zh/guides/03-deployment.md b/docs/zh/guides/03-deployment.md index ccc7c9ce..ba6ba38e 100644 --- a/docs/zh/guides/03-deployment.md +++ b/docs/zh/guides/03-deployment.md @@ -185,6 +185,105 @@ curl http://localhost:1933/api/v1/fs/ls?uri=viking:// \ -H "X-API-Key: your-key" ``` +## 云上部署 + +### Docker + +```dockerfile +FROM python:3.11-slim +WORKDIR /app +COPY . . +RUN pip install -e . +EXPOSE 1933 +CMD ["python", "-m", "openviking", "serve", "--config", "/etc/openviking/ov.conf"] +``` + +```bash +docker build -t openviking . +docker run -d -p 1933:1933 \ + -v /path/to/ov.conf:/etc/openviking/ov.conf:ro \ + -v /data/openviking:/data/openviking \ + openviking +``` + +### Kubernetes + +```yaml +apiVersion: apps/v1 +kind: Deployment +metadata: + name: openviking +spec: + replicas: 1 + selector: + matchLabels: + app: openviking + template: + metadata: + labels: + app: openviking + spec: + containers: + - name: openviking + image: openviking:latest + ports: + - containerPort: 1933 + volumeMounts: + - name: config + mountPath: /etc/openviking + readOnly: true + - name: data + mountPath: /data/openviking + livenessProbe: + httpGet: + path: /health + port: 1933 + initialDelaySeconds: 5 + periodSeconds: 10 + readinessProbe: + httpGet: + path: /ready + port: 1933 + initialDelaySeconds: 10 + periodSeconds: 15 + volumes: + - name: config + configMap: + name: openviking-config + - name: data + persistentVolumeClaim: + claimName: openviking-data +--- +apiVersion: v1 +kind: Service +metadata: + name: openviking +spec: + selector: + app: openviking + ports: + - port: 1933 + targetPort: 1933 +``` + +## 健康检查 + +| 端点 | 认证 | 用途 | +|------|------|------| +| `GET /health` | 否 | 存活探针 — 立即返回 `{"status": "ok"}` | +| `GET /ready` | 否 | 就绪探针 — 检查 AGFS、VectorDB、APIKeyManager | + +```bash +# 存活探针 +curl http://localhost:1933/health + +# 就绪探针 +curl http://localhost:1933/ready +# {"status": "ready", 
"checks": {"agfs": "ok", "vectordb": "ok", "api_key_manager": "ok"}} +``` + +在 Kubernetes 中,使用 `/health` 作为存活探针,`/ready` 作为就绪探针。 + ## 相关文档 - [认证](04-authentication.md) - API Key 设置 diff --git a/examples/chatmem/chatmem.py b/examples/chatmem/chatmem.py index 19570537..0c4846e4 100644 --- a/examples/chatmem/chatmem.py +++ b/examples/chatmem/chatmem.py @@ -426,7 +426,9 @@ def main(): """, ) - parser.add_argument("--config", type=str, default="./ov.conf", help="Path to config file") + parser.add_argument( + "--config", type=str, default="~/.openviking/ov.conf", help="Path to config file" + ) parser.add_argument("--data", type=str, default="./data", help="Path to data directory") parser.add_argument( "--session-id", diff --git a/examples/cloud/.gitignore b/examples/cloud/.gitignore new file mode 100644 index 00000000..b655d32e --- /dev/null +++ b/examples/cloud/.gitignore @@ -0,0 +1,3 @@ +user_keys.json +ovcli.conf +ov.conf \ No newline at end of file diff --git a/examples/cloud/GUIDE.md b/examples/cloud/GUIDE.md new file mode 100644 index 00000000..4e8b3e23 --- /dev/null +++ b/examples/cloud/GUIDE.md @@ -0,0 +1,360 @@ +# OpenViking 上云部署指南 + +本文档介绍如何将 OpenViking 部署到火山引擎云上,使用 TOS(对象存储)+ VikingDB(向量数据库)+ 方舟大模型作为后端。 + +--- + +## 1. 开通云服务 + +### 1.1 开通 TOS(对象存储) + +TOS 用于持久化存储 OpenViking 的文件数据(AGFS 后端)。 + +1. 登录 [火山引擎控制台](https://console.volcengine.com/) +2. 进入 **对象存储 TOS** → 开通服务 +3. 创建存储桶: + - 桶名称:如 `openvikingdata` + - 地域:`cn-beijing`(需与其他服务保持一致) + - 存储类型:标准存储 + - 访问权限:私有 +4. 记录桶名称和地域,填入配置文件的 `storage.agfs.s3` 部分 + +### 1.2 开通 VikingDB(向量数据库) + +VikingDB 用于存储和检索向量嵌入。 + +1. 进入 [火山引擎控制台](https://console.volcengine.com/) → **智能数据** → **向量数据库 VikingDB** +2. 开通服务(按量付费即可) +3. VikingDB 的 API Host 默认为:`api-vikingdb.vikingdb.cn-beijing.volces.com` +4. 无需手动创建 Collection,OpenViking 启动后会自动创建 + +### 1.3 申请 AK/SK(IAM 访问密钥) + +AK/SK 同时用于 TOS 和 VikingDB 的鉴权。 + +1. 进入 [火山引擎控制台](https://console.volcengine.com/) → **访问控制 IAM** +2. 创建子用户(建议不使用主账号 AK/SK) +3. 
为子用户授权以下策略: + - `TOSFullAccess`(或精确到桶级别的自定义策略) + - `VikingDBFullAccess` +4. 为子用户创建 **AccessKey**,记录: + - `Access Key ID`(即 AK) + - `Secret Access Key`(即 SK) +5. 将 AK/SK 填入配置文件中的以下位置: + - `storage.vectordb.volcengine.ak` / `sk` + - `storage.agfs.s3.access_key` / `secret_key` + - `rerank.ak` / `sk`(如果使用 rerank) + +### 1.4 申请方舟 API Key + +方舟平台提供 Embedding 和 VLM 模型的推理服务。 + +1. 进入 [火山方舟控制台](https://console.volcengine.com/ark) +2. 左侧菜单 → **API Key 管理** → 创建 API Key +3. 记录生成的 API Key +4. 确认以下模型已开通(在 **模型广场** 中申请): + - `doubao-embedding-vision-250615`(多模态 Embedding) + - `doubao-seed-1-8-251228`(VLM 推理) + - `doubao-seed-rerank`(Rerank,可选) +5. 将 API Key 填入配置文件的 `embedding.dense.api_key` 和 `vlm.api_key` + +--- + +## 2. 编写配置文件 + +参考本目录下的 [ov.conf](./ov.conf),将上述步骤获取的凭据填入。 + +关键字段说明: + +| 字段 | 说明 | +|------|------| +| `server.root_api_key` | 管理员密钥,用于多租户管理,设置一个强密码 | +| `storage.vectordb.backend` | 设置为 `volcengine` 使用云端 VikingDB | +| `storage.vectordb.volcengine.ak/sk` | IAM 的 AK/SK | +| `storage.agfs.backend` | 设置为 `s3` 使用 TOS 存储 | +| `storage.agfs.s3.bucket` | TOS 桶名称 | +| `storage.agfs.s3.endpoint` | TOS 端点,北京为 `https://tos-cn-beijing.volces.com` | +| `storage.agfs.s3.access_key/secret_key` | IAM 的 AK/SK | +| `embedding.dense.api_key` | 方舟 API Key | +| `vlm.api_key` | 方舟 API Key | + +--- + +## 3. 启动服务 + +### 方式一:Docker(推荐) + +```bash +# 构建镜像(如果不使用预构建镜像) +docker build -t openviking:latest . 
+ +# 启动 +docker run -d \ + --name openviking \ + -p 1933:1933 \ + -v $(pwd)/examples/cloud/ov.conf:/app/ov.conf \ + -v /var/lib/openviking/data:/app/data \ + --restart unless-stopped \ + openviking:latest +``` + +### 方式二:Docker Compose + +修改 `docker-compose.yml` 中的配置挂载路径后: + +```bash +docker-compose up -d +``` + +### 方式三:Kubernetes + Helm + +```bash +helm install openviking ./examples/k8s-helm \ + --set openviking.config.embedding.dense.api_key="YOUR_ARK_API_KEY" \ + --set openviking.config.vlm.api_key="YOUR_ARK_API_KEY" \ + --set openviking.config.storage.vectordb.volcengine.ak="YOUR_AK" \ + --set openviking.config.storage.vectordb.volcengine.sk="YOUR_SK" +``` + +### 方式四:直接运行 + +```bash +pip install openviking +export OPENVIKING_CONFIG_FILE=$(pwd)/examples/cloud/ov.conf +openviking-server +``` + +### 验证启动 + +```bash +# 健康检查 +curl http://localhost:1933/health +# 期望返回: {"status":"ok"} + +# 就绪检查(验证 AGFS、VikingDB 连接) +curl http://localhost:1933/ready +# 期望返回: {"status":"ready","checks":{"agfs":"ok","vectordb":"ok","api_key_manager":"ok"}} +``` + +--- + +## 4. 注册租户和用户 + +OpenViking 支持多租户隔离。配置了 `root_api_key` 后自动启用多租户模式。 + +### 4.1 创建租户(Account) + +使用 `root_api_key` 创建租户,同时会生成一个管理员用户: + +```bash +curl -X POST http://localhost:1933/api/v1/admin/accounts \ + -H "X-API-Key: YOUR_ROOT_API_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "account_id": "my-team", + "admin_user_id": "admin" + }' +``` + +返回结果中包含管理员的 API Key,**请妥善保存**: + +```json +{ + "status": "ok", + "result": { + "account_id": "my-team", + "admin_user_id": "admin", + "user_key": "abcdef1234567890..." + } +} +``` + +### 4.2 注册普通用户 + +租户管理员可以为租户添加用户: + +```bash +curl -X POST http://localhost:1933/api/v1/admin/accounts/my-team/users \ + -H "X-API-Key: ADMIN_API_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "user_id": "alice", + "role": "user" + }' +``` + +返回用户的 API Key: + +```json +{ + "status": "ok", + "result": { + "user_id": "alice", + "user_key": "fedcba0987654321..." 
+ } +} +``` + +### 4.3 查看租户下的用户 + +```bash +curl http://localhost:1933/api/v1/admin/accounts/my-team/users \ + -H "X-API-Key: ADMIN_API_KEY" +``` + +--- + +## 5. 使用 + +以下操作使用用户的 API Key 进行。 + +### 5.1 添加资源 + +```bash +# 添加一个 URL 资源 +curl -X POST http://localhost:1933/api/v1/resources \ + -H "X-API-Key: USER_API_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "path": "https://raw.githubusercontent.com/volcengine/OpenViking/main/README.md", + "reason": "项目文档" + }' + +# 上传本地文件(先上传到临时路径,再添加为资源) +curl -X POST http://localhost:1933/api/v1/resources/temp_upload \ + -H "X-API-Key: USER_API_KEY" \ + -F "file=@./my-document.pdf" + +# 然后使用返回的 temp_path 添加资源 +curl -X POST http://localhost:1933/api/v1/resources \ + -H "X-API-Key: USER_API_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "temp_path": "/tmp/upload_xyz", + "reason": "内部文档" + }' +``` + +### 5.2 等待处理完成 + +添加资源后,系统会异步进行解析和向量化。等待处理完成: + +```bash +curl -X POST http://localhost:1933/api/v1/system/wait \ + -H "X-API-Key: USER_API_KEY" \ + -H "Content-Type: application/json" \ + -d '{"timeout": 120}' +``` + +### 5.3 语义搜索 + +```bash +curl -X POST http://localhost:1933/api/v1/search/find \ + -H "X-API-Key: USER_API_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "query": "OpenViking 是什么", + "limit": 5 + }' +``` + +### 5.4 浏览文件系统 + +```bash +# 列出根目录 +curl "http://localhost:1933/api/v1/fs/ls?uri=viking://" \ + -H "X-API-Key: USER_API_KEY" + +# 查看目录树 +curl "http://localhost:1933/api/v1/fs/tree?uri=viking://&depth=2" \ + -H "X-API-Key: USER_API_KEY" +``` + +### 5.5 读取内容 + +```bash +# 读取文件内容 +curl "http://localhost:1933/api/v1/content/read?uri=viking://resources/doc1" \ + -H "X-API-Key: USER_API_KEY" + +# 读取摘要 +curl "http://localhost:1933/api/v1/content/abstract?uri=viking://resources/doc1" \ + -H "X-API-Key: USER_API_KEY" +``` + +### 5.6 会话管理(Memory) + +```bash +# 创建会话 +curl -X POST http://localhost:1933/api/v1/sessions \ + -H "X-API-Key: USER_API_KEY" + +# 添加对话消息 +curl -X POST 
http://localhost:1933/api/v1/sessions/{session_id}/messages \ + -H "X-API-Key: USER_API_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "role": "user", + "content": "帮我分析这个文档的核心观点" + }' + +# 带会话上下文的搜索 +curl -X POST http://localhost:1933/api/v1/search/search \ + -H "X-API-Key: USER_API_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "query": "核心观点", + "session_id": "SESSION_ID", + "limit": 5 + }' +``` + +### 5.7 Python SDK 使用 + +```python +import openviking as ov + +client = ov.SyncHTTPClient( + url="http://localhost:1933", + api_key="USER_API_KEY", + agent_id="my-agent" +) +client.initialize() + +# 添加资源 +client.add_resource( + path="https://example.com/doc.pdf", + reason="参考文档" +) +client.wait_processed(timeout=120) + +# 搜索 +results = client.find("OpenViking 架构设计", limit=5) +for r in results: + print(r.uri, r.score) + +client.close() +``` + +--- + +## 6. 运维 + +### 日志 + +容器日志默认输出到 stdout,可通过 `docker logs` 或 K8s 日志系统查看: + +```bash +docker logs -f openviking +``` + +### 监控 + +- 健康检查:`GET /health` +- 就绪检查:`GET /ready`(检测 AGFS、VikingDB、APIKeyManager 连接状态) +- 系统状态:`GET /api/v1/system/status` + +### 数据备份 + +- **TOS 数据**:通过 TOS 控制台配置跨区域复制或定期备份 +- **本地数据**(如使用 PVC):定期快照 PersistentVolume diff --git a/examples/cloud/alice.py b/examples/cloud/alice.py new file mode 100644 index 00000000..ba086645 --- /dev/null +++ b/examples/cloud/alice.py @@ -0,0 +1,171 @@ +#!/usr/bin/env python3 +""" +Alice — 技术负责人的使用流程 + +操作:添加项目文档 → 语义搜索 → 多轮对话 → 沉淀记忆 → 回顾记忆 + +获取 API Key: + API Key 由管理员通过 Admin API 分配,流程如下: + + 1. ov.conf 中配置 server.root_api_key(如 "test") + 2. 用 root_api_key 创建租户和管理员: + curl -X POST http://localhost:1933/api/v1/admin/accounts \ + -H "X-API-Key: test" -H "Content-Type: application/json" \ + -d '{"account_id": "demo-team", "admin_user_id": "alice"}' + 返回中的 user_key 就是 Alice 的 API Key + 3. 
或者运行 setup_users.py 自动完成上述步骤,Key 写入 user_keys.json + +运行: + uv run examples/cloud/alice.py + uv run examples/cloud/alice.py --url http://localhost:1933 --api-key +""" + +import argparse +import json +import sys +import time + +import openviking as ov +from openviking_cli.utils.async_utils import run_async + + +def load_key_from_file(user="alice"): + try: + with open("examples/cloud/user_keys.json") as f: + keys = json.load(f) + return keys["url"], keys[f"{user}_key"] + except (FileNotFoundError, KeyError): + return None, None + + +def main(): + parser = argparse.ArgumentParser(description="Alice 的使用流程") + parser.add_argument("--url", default=None, help="Server URL") + parser.add_argument("--api-key", default=None, help="Alice 的 API Key") + args = parser.parse_args() + + url, api_key = args.url, args.api_key + if not api_key: + url_from_file, key_from_file = load_key_from_file("alice") + url = url or url_from_file or "http://localhost:1933" + api_key = key_from_file + if not url: + url = "http://localhost:1933" + if not api_key: + print("请通过 --api-key 指定 API Key,或先运行 setup_users.py") + sys.exit(1) + + print(f"Server: {url}") + print("User: alice") + print(f"Key: {api_key[:16]}...") + + client = ov.SyncHTTPClient(url=url, api_key=api_key, agent_id="alice-agent") + client.initialize() + + try: + # ── 1. 添加资源 ── + print("\n== 1. 添加资源: OpenViking README ==") + result = client.add_resource( + path="https://raw.githubusercontent.com/volcengine/OpenViking/refs/heads/main/README.md", + reason="项目核心文档", + ) + readme_uri = result.get("root_uri", "") + print(f" URI: {readme_uri}") + print(" 等待处理...") + client.wait_processed() + print(" 完成") + + # ── 2. 查看文件系统 ── + print("\n== 2. 文件系统 ==") + entries = client.ls("viking://") + for entry in entries: + if isinstance(entry, dict): + kind = "dir " if entry.get("isDir") else "file" + print(f" [{kind}] {entry.get('name', '?')}") + + # ── 3. 读取摘要 ── + if readme_uri: + print("\n== 3. 
资源摘要 ==") + abstract = client.abstract(readme_uri) + print(f" {abstract[:300]}") + + # ── 4. 语义搜索 ── + print("\n== 4. 语义搜索: 'context database for AI agent' ==") + results = client.find("context database for AI agent", limit=3) + if hasattr(results, "resources") and results.resources: + for i, r in enumerate(results.resources, 1): + print(f" {i}. [{r.score:.3f}] {r.uri}") + + # ── 5. 创建会话,模拟技术讨论 ── + print("\n== 5. 对话: 技术架构讨论 ==") + session = client.session() + print(f" Session: {session.session_id}") + + messages = [ + ("user", "我们的项目选择用 OpenViking 做 Context Database,主要原因是什么?"), + ( + "assistant", + "选择 OpenViking 的核心原因:1) 文件系统范式统一管理上下文 " + "2) 分层加载(L0/L1/L2)节省 token 3) 目录递归检索比传统 RAG 效果好 " + "4) 内置 session 管理和 memory 自动沉淀。", + ), + ("user", "向量数据库后端我们决定用 VikingDB 还是本地?"), + ( + "assistant", + "生产环境决定使用 VikingDB 云服务。原因:1) 不需要自己运维向量库 " + "2) VikingDB 和方舟模型在同一可用区,网络延迟低 " + "3) 自动扩缩容,不用担心数据量增长。本地模式只用于开发测试。", + ), + ("user", "文件存储用 TOS 对吧?prefix 是 ov?"), + ( + "assistant", + "是的。AGFS 后端配置为 S3 模式,对接 TOS。" + "bucket 是 openvikingdata,prefix 设为 ov,所有文件存在 ov/ 目录下。" + "AK/SK 使用 IAM 子用户的密钥,权限范围限定在这个 bucket。", + ), + ] + for role, content in messages: + run_async(session.add_message(role=role, content=content)) + print(f" 添加了 {len(messages)} 条消息") + + # ── 6. 沉淀记忆 ── + print("\n== 6. 沉淀记忆: commit session ==") + print(" 正在提取(技术决策、架构选型等)...") + client.commit_session(session.session_id) + print(" commit 完成") + time.sleep(2) + client.wait_processed() + print(" 记忆向量化完成") + + # ── 7. 查看记忆目录 ── + print("\n== 7. 记忆目录 ==") + try: + mem_entries = client.ls("viking://user/alice/memories") + for entry in mem_entries: + if isinstance(entry, dict): + kind = "dir " if entry.get("isDir") else "file" + print(f" [{kind}] {entry.get('name', '?')}") + except Exception: + print(" 记忆目录为空(可能无可提取的记忆)") + + # ── 8. 搜索回顾记忆 ── + print("\n== 8. 
回顾记忆: '为什么选择 VikingDB' ==") + results = client.find("为什么选择 VikingDB 作为向量数据库", limit=3) + if hasattr(results, "memories") and results.memories: + print(" 记忆:") + for i, m in enumerate(results.memories, 1): + desc = m.abstract or m.overview or str(m.uri) + print(f" {i}. [{m.score:.3f}] {desc[:150]}") + if hasattr(results, "resources") and results.resources: + print(" 资源:") + for i, r in enumerate(results.resources, 1): + print(f" {i}. [{r.score:.3f}] {r.uri}") + + print("\nAlice 流程完成") + + finally: + client.close() + + +if __name__ == "__main__": + main() diff --git a/examples/cloud/bob.py b/examples/cloud/bob.py new file mode 100644 index 00000000..e2ae59fc --- /dev/null +++ b/examples/cloud/bob.py @@ -0,0 +1,189 @@ +#!/usr/bin/env python3 +""" +Bob — 新入职成员的使用流程 + +操作:浏览团队资源 → 回顾团队记忆 → 添加自己的资源 → 对话 → 沉淀记忆 → 带上下文搜索 + +获取 API Key: + API Key 由租户管理员分配,流程如下: + + 1. 管理员(如 Alice)用自己的 Key 注册 Bob: + curl -X POST http://localhost:1933/api/v1/admin/accounts/demo-team/users \ + -H "X-API-Key: " -H "Content-Type: application/json" \ + -d '{"user_id": "bob", "role": "user"}' + 返回中的 user_key 就是 Bob 的 API Key + 2. 
或者运行 setup_users.py 自动完成,Key 写入 user_keys.json + +运行(建议在 alice.py 之后执行,这样可以看到 Alice 沉淀的团队记忆): + uv run examples/cloud/bob.py + uv run examples/cloud/bob.py --url http://localhost:1933 --api-key +""" + +import argparse +import json +import sys +import time + +import openviking as ov +from openviking_cli.utils.async_utils import run_async + + +def load_key_from_file(user="bob"): + try: + with open("examples/cloud/user_keys.json") as f: + keys = json.load(f) + return keys["url"], keys[f"{user}_key"] + except (FileNotFoundError, KeyError): + return None, None + + +def main(): + parser = argparse.ArgumentParser(description="Bob 的使用流程") + parser.add_argument("--url", default=None, help="Server URL") + parser.add_argument("--api-key", default=None, help="Bob 的 API Key") + args = parser.parse_args() + + url, api_key = args.url, args.api_key + if not api_key: + url_from_file, key_from_file = load_key_from_file("bob") + url = url or url_from_file or "http://localhost:1933" + api_key = key_from_file + if not url: + url = "http://localhost:1933" + if not api_key: + print("请通过 --api-key 指定 API Key,或先运行 setup_users.py") + sys.exit(1) + + print(f"Server: {url}") + print("User: bob") + print(f"Key: {api_key[:16]}...") + + client = ov.SyncHTTPClient(url=url, api_key=api_key, agent_id="bob-agent") + client.initialize() + + try: + # ── 1. 浏览团队已有资源 ── + print("\n== 1. 浏览团队资源 ==") + entries = client.ls("viking://") + if not entries: + print(" (空,Alice 还没添加资源)") + for entry in entries: + if isinstance(entry, dict): + kind = "dir " if entry.get("isDir") else "file" + print(f" [{kind}] {entry.get('name', '?')}") + + # ── 2. 回顾团队记忆(Alice 沉淀的技术决策) ── + print("\n== 2. 回顾团队记忆: '项目技术选型' ==") + results = client.find("项目用了什么技术栈和架构选型", limit=5) + if hasattr(results, "memories") and results.memories: + print(" 团队记忆:") + for i, m in enumerate(results.memories, 1): + desc = m.abstract or m.overview or str(m.uri) + print(f" {i}. 
[{m.score:.3f}] {desc[:150]}") + else: + print(" 未找到团队记忆(Alice 可能还没执行 commit)") + if hasattr(results, "resources") and results.resources: + print(" 相关资源:") + for i, r in enumerate(results.resources, 1): + print(f" {i}. [{r.score:.3f}] {r.uri}") + + # ── 3. 搜索具体决策 ── + print("\n== 3. 搜索: '存储方案 TOS 配置' ==") + results = client.find("文件存储方案 TOS bucket 配置", limit=3) + if hasattr(results, "memories") and results.memories: + for i, m in enumerate(results.memories, 1): + desc = m.abstract or m.overview or str(m.uri) + print(f" {i}. [{m.score:.3f}] {desc[:150]}") + else: + print(" 未找到相关记忆") + + # ── 4. 添加自己的资源 ── + print("\n== 4. 添加资源: CONTRIBUTING.md ==") + result = client.add_resource( + path="https://raw.githubusercontent.com/volcengine/OpenViking/refs/heads/main/CONTRIBUTING.md", + reason="贡献指南学习笔记", + ) + bob_uri = result.get("root_uri", "") + print(f" URI: {bob_uri}") + print(" 等待处理...") + client.wait_processed(timeout=120) + print(" 完成") + + # ── 5. 创建会话,模拟入职学习 ── + print("\n== 5. 对话: 入职学习 ==") + session = client.session() + print(f" Session: {session.session_id}") + + messages = [ + ("user", "我刚入职,需要了解 OpenViking 的贡献流程"), + ( + "assistant", + "欢迎!贡献流程主要是:1) Fork 仓库 2) 创建 feature branch " + "3) 提交 PR 并通过 CI 4) Code Review 后合并。" + "代码规范见 CONTRIBUTING.md。", + ), + ("user", "本地开发环境怎么搭建?"), + ( + "assistant", + "本地开发步骤:1) 安装 Python 3.10+ 和 uv " + "2) git clone 后执行 uv sync 安装依赖 " + "3) 复制 examples/ov.conf.example 为 ~/.openviking/ov.conf 填入 API Key " + "4) 运行 openviking-server 启动开发服务。C++ 扩展需要 cmake 和 pybind11。", + ), + ("user", "测试怎么跑?"), + ( + "assistant", + "运行测试:1) uv run pytest 跑全量测试 " + "2) uv run pytest tests/unit -x 只跑单元测试 " + "3) CI 会自动跑 lint + test,PR 合并前必须全绿。", + ), + ] + for role, content in messages: + run_async(session.add_message(role=role, content=content)) + print(f" 添加了 {len(messages)} 条消息") + + # ── 6. 沉淀记忆 ── + print("\n== 6. 
沉淀记忆: commit session ==") + print(" 正在提取(开发流程、环境配置等)...") + client.commit_session(session.session_id) + print(" commit 完成") + time.sleep(2) + client.wait_processed(timeout=120) + print(" 记忆向量化完成") + + # ── 7. 回顾自己的记忆 ── + print("\n== 7. 回顾记忆: '本地开发环境搭建' ==") + results = client.find("本地开发环境搭建步骤", limit=3) + if hasattr(results, "memories") and results.memories: + print(" 记忆:") + for i, m in enumerate(results.memories, 1): + desc = m.abstract or m.overview or str(m.uri) + print(f" {i}. [{m.score:.3f}] {desc[:150]}") + if hasattr(results, "resources") and results.resources: + print(" 资源:") + for i, r in enumerate(results.resources, 1): + print(f" {i}. [{r.score:.3f}] {r.uri}") + + # ── 8. 带会话上下文的搜索 ── + print("\n== 8. 带上下文搜索: '还有什么注意事项' ==") + results = client.search( + "还有什么需要注意的事项", + session_id=session.session_id, + limit=3, + ) + if hasattr(results, "resources") and results.resources: + for i, r in enumerate(results.resources, 1): + print(f" {i}. [{r.score:.3f}] {r.uri}") + if hasattr(results, "memories") and results.memories: + for i, m in enumerate(results.memories, 1): + desc = m.abstract or m.overview or str(m.uri) + print(f" {i}. 
[{m.score:.3f}] {desc[:100]}") + + print("\nBob 流程完成") + + finally: + client.close() + + +if __name__ == "__main__": + main() diff --git a/examples/cloud/ov.conf.example b/examples/cloud/ov.conf.example new file mode 100644 index 00000000..c57f13b5 --- /dev/null +++ b/examples/cloud/ov.conf.example @@ -0,0 +1,77 @@ +{ + "server": { + "host": "0.0.0.0", + "port": 1933, + "root_api_key": "${ROOT_API_KEY}", + "cors_origins": ["*"] + }, + "storage": { + "workspace": "/app/data", + "vectordb": { + "name": "context", + "backend": "volcengine", + "project": "default", + "volcengine": { + "region": "cn-beijing", + "ak": "${VOLCENGINE_AK}", + "sk": "${VOLCENGINE_SK}" + } + }, + "agfs": { + "port": 1833, + "log_level": "warn", + "backend": "s3", + "timeout": 10, + "retry_times": 3, + "s3": { + "bucket": "${TOS_BUCKET}", + "region": "cn-beijing", + "access_key": "${VOLCENGINE_AK}", + "secret_key": "${VOLCENGINE_SK}", + "endpoint": "https://tos-s3-cn-beijing.volces.com", + "prefix": "openviking", + "use_ssl": true, + "use_path_style": false + } + } + }, + "embedding": { + "dense": { + "model": "doubao-embedding-vision-250615", + "api_key": "${ARK_API_KEY}", + "api_base": "https://ark.cn-beijing.volces.com/api/v3", + "dimension": 1024, + "provider": "volcengine", + "input": "multimodal" + } + }, + "vlm": { + "model": "doubao-seed-1-8-251228", + "api_key": "${ARK_API_KEY}", + "api_base": "https://ark.cn-beijing.volces.com/api/v3", + "temperature": 0.0, + "max_retries": 3, + "provider": "volcengine", + "thinking": false + }, + "rerank": { + "ak": "${VOLCENGINE_AK}", + "sk": "${VOLCENGINE_SK}", + "host": "api-vikingdb.vikingdb.cn-beijing.volces.com", + "model_name": "doubao-seed-rerank", + "model_version": "251028", + "threshold": 0.1 + }, + "auto_generate_l0": true, + "auto_generate_l1": true, + "default_search_mode": "thinking", + "default_search_limit": 3, + "enable_memory_decay": true, + "memory_decay_check_interval": 3600, + "log": { + "level": "WARN", + "format": 
"%(asctime)s - %(name)s - %(levelname)s - %(message)s", + "output": "stdout", + "rotation": false + } +} diff --git a/examples/cloud/setup_users.py b/examples/cloud/setup_users.py new file mode 100644 index 00000000..8ec3a76a --- /dev/null +++ b/examples/cloud/setup_users.py @@ -0,0 +1,97 @@ +#!/usr/bin/env python3 +""" +创建租户和用户,获取 API Key + +前置条件: + 1. 按照 GUIDE.md 完成云服务开通和配置 + 2. 启动 OpenViking Server: + export OPENVIKING_CONFIG_FILE=examples/cloud/ov.conf + openviking-server + +获取用户 API Key 的流程: + 1. 在 ov.conf 中设置 server.root_api_key(管理员密钥) + 2. 用 root_api_key 调用 POST /api/v1/admin/accounts 创建租户,返回管理员用户的 API Key + 3. 用管理员 API Key 调用 POST /api/v1/admin/accounts/{id}/users 注册用户,返回用户的 API Key + 4. 每个用户拿到自己的 API Key 后即可独立使用所有数据接口 + +本脚本自动完成上述流程,创建一个租户 "demo-team",注册 alice 和 bob 两个用户。 + +运行: + uv run setup_users.py + uv run setup_users.py --url http://localhost:1933 --root-key test +""" + +import argparse +import json +import sys + +import httpx + + +def main(): + parser = argparse.ArgumentParser(description="创建租户和用户") + parser.add_argument("--url", default="http://localhost:1933", help="Server URL") + parser.add_argument("--root-key", default="test", help="ov.conf 中的 root_api_key") + args = parser.parse_args() + + base = args.url.rstrip("/") + headers = {"X-API-Key": args.root_key, "Content-Type": "application/json"} + + # 健康检查 + resp = httpx.get(f"{base}/health") + if not resp.is_success: + print(f"Server 不可用: {resp.status_code}") + sys.exit(1) + print(f"Server 正常: {resp.json()}") + + # 创建租户,alice 作为管理员 + print("\n== 创建租户 demo-team ==") + resp = httpx.post( + f"{base}/api/v1/admin/accounts", + headers=headers, + json={"account_id": "demo-team", "admin_user_id": "alice"}, + ) + if not resp.is_success: + print(f"创建失败: {resp.status_code} {resp.text}") + sys.exit(1) + result = resp.json()["result"] + alice_key = result["user_key"] + print(" 租户: demo-team") + print(" 管理员: alice (admin)") + print(f" Alice API Key: {alice_key}") + + # alice 注册 bob + print("\n== 注册用户 bob 
==") + alice_headers = {"X-API-Key": alice_key, "Content-Type": "application/json"} + resp = httpx.post( + f"{base}/api/v1/admin/accounts/demo-team/users", + headers=alice_headers, + json={"user_id": "bob", "role": "user"}, + ) + if not resp.is_success: + print(f"注册失败: {resp.status_code} {resp.text}") + sys.exit(1) + result = resp.json()["result"] + bob_key = result["user_key"] + print(" 用户: bob (user)") + print(f" Bob API Key: {bob_key}") + + # 输出汇总 + keys = { + "url": args.url, + "account_id": "demo-team", + "alice_key": alice_key, + "bob_key": bob_key, + } + print("\n== 汇总 ==") + print(json.dumps(keys, indent=2)) + + # 写入文件供后续脚本使用 + keys_file = "examples/cloud/user_keys.json" + with open(keys_file, "w") as f: + json.dump(keys, f, indent=2) + print(f"\n已写入 {keys_file},后续脚本可直接读取。") + + +if __name__ == "__main__": + main() diff --git a/openviking/client/local.py b/openviking/client/local.py index e3535550..b5ff6ef7 100644 --- a/openviking/client/local.py +++ b/openviking/client/local.py @@ -7,6 +7,7 @@ from typing import Any, Dict, List, Optional, Union +from openviking.server.identity import RequestContext, Role from openviking.service import OpenVikingService from openviking_cli.client.base import BaseClient from openviking_cli.session.user_id import UserIdentifier @@ -32,6 +33,7 @@ def __init__( user=UserIdentifier.the_default_user(), ) self._user = self._service.user + self._ctx = RequestContext(user=self._user, role=Role.ROOT) @property def service(self) -> OpenVikingService: @@ -63,6 +65,7 @@ async def add_resource( """Add resource to OpenViking.""" return await self._service.resources.add_resource( path=path, + ctx=self._ctx, target=target, reason=reason, instruction=instruction, @@ -80,6 +83,7 @@ async def add_skill( """Add skill to OpenViking.""" return await self._service.resources.add_skill( data=data, + ctx=self._ctx, wait=wait, timeout=timeout, ) @@ -102,6 +106,7 @@ async def ls( """List directory contents.""" return await self._service.fs.ls( uri, + 
+            ctx=self._ctx,
             simple=simple,
             recursive=recursive,
             output=output,
@@ -120,6 +125,7 @@ async def tree(
         """Get directory tree."""
         return await self._service.fs.tree(
             uri,
+            ctx=self._ctx,
             output=output,
             abs_limit=abs_limit,
             show_all_hidden=show_all_hidden,
@@ -128,19 +134,19 @@ async def tree(

     async def stat(self, uri: str) -> Dict[str, Any]:
         """Get resource status."""
-        return await self._service.fs.stat(uri)
+        return await self._service.fs.stat(uri, ctx=self._ctx)

     async def mkdir(self, uri: str) -> None:
         """Create directory."""
-        await self._service.fs.mkdir(uri)
+        await self._service.fs.mkdir(uri, ctx=self._ctx)

     async def rm(self, uri: str, recursive: bool = False) -> None:
         """Remove resource."""
-        await self._service.fs.rm(uri, recursive=recursive)
+        await self._service.fs.rm(uri, ctx=self._ctx, recursive=recursive)

     async def mv(self, from_uri: str, to_uri: str) -> None:
         """Move resource."""
-        await self._service.fs.mv(from_uri, to_uri)
+        await self._service.fs.mv(from_uri, to_uri, ctx=self._ctx)

     # ============= Content Reading =============

@@ -152,15 +158,15 @@ async def read(self, uri: str, offset: int = 0, limit: int = -1) -> str:
             offset: Starting line number (0-indexed). Default 0.
             limit: Number of lines to read. -1 means read to end. Default -1.
         """
-        return await self._service.fs.read(uri, offset=offset, limit=limit)
+        return await self._service.fs.read(uri, ctx=self._ctx, offset=offset, limit=limit)

     async def abstract(self, uri: str) -> str:
         """Read L0 abstract."""
-        return await self._service.fs.abstract(uri)
+        return await self._service.fs.abstract(uri, ctx=self._ctx)

     async def overview(self, uri: str) -> str:
         """Read L1 overview."""
-        return await self._service.fs.overview(uri)
+        return await self._service.fs.overview(uri, ctx=self._ctx)

     # ============= Search =============

@@ -175,6 +181,7 @@ async def find(
         """Semantic search without session context."""
         return await self._service.search.find(
             query=query,
+            ctx=self._ctx,
             target_uri=target_uri,
             limit=limit,
             score_threshold=score_threshold,
@@ -193,10 +200,11 @@ async def search(
         """Semantic search with optional session context."""
         session = None
         if session_id:
-            session = self._service.sessions.session(session_id)
+            session = self._service.sessions.session(self._ctx, session_id)
             await session.load()
         return await self._service.search.search(
             query=query,
+            ctx=self._ctx,
             target_uri=target_uri,
             session=session,
             limit=limit,
@@ -206,31 +214,33 @@ async def search(

     async def grep(self, uri: str, pattern: str, case_insensitive: bool = False) -> Dict[str, Any]:
         """Content search with pattern."""
-        return await self._service.fs.grep(uri, pattern, case_insensitive=case_insensitive)
+        return await self._service.fs.grep(
+            uri, pattern, ctx=self._ctx, case_insensitive=case_insensitive
+        )

     async def glob(self, pattern: str, uri: str = "viking://") -> Dict[str, Any]:
         """File pattern matching."""
-        return await self._service.fs.glob(pattern, uri=uri)
+        return await self._service.fs.glob(pattern, ctx=self._ctx, uri=uri)

     # ============= Relations =============

     async def relations(self, uri: str) -> List[Any]:
         """Get relations for a resource."""
-        return await self._service.relations.relations(uri)
+        return await self._service.relations.relations(uri, ctx=self._ctx)

     async def link(self, from_uri: str, to_uris: Union[str, List[str]], reason: str = "") -> None:
         """Create link between resources."""
-        await self._service.relations.link(from_uri, to_uris, reason)
+        await self._service.relations.link(from_uri, to_uris, ctx=self._ctx, reason=reason)

     async def unlink(self, from_uri: str, to_uri: str) -> None:
         """Remove link between resources."""
-        await self._service.relations.unlink(from_uri, to_uri)
+        await self._service.relations.unlink(from_uri, to_uri, ctx=self._ctx)

     # ============= Sessions =============

     async def create_session(self) -> Dict[str, Any]:
         """Create a new session."""
-        session = self._service.sessions.session()
+        session = await self._service.sessions.create(self._ctx)
         return {
             "session_id": session.session_id,
             "user": session.user.to_dict(),
@@ -238,12 +248,11 @@ async def create_session(self) -> Dict[str, Any]:

     async def list_sessions(self) -> List[Any]:
         """List all sessions."""
-        return await self._service.sessions.sessions()
+        return await self._service.sessions.sessions(self._ctx)

     async def get_session(self, session_id: str) -> Dict[str, Any]:
         """Get session details."""
-        session = self._service.sessions.session(session_id)
-        await session.load()
+        session = await self._service.sessions.get(session_id, self._ctx)
         return {
             "session_id": session.session_id,
             "user": session.user.to_dict(),
@@ -252,18 +261,18 @@ async def get_session(self, session_id: str) -> Dict[str, Any]:

     async def delete_session(self, session_id: str) -> None:
         """Delete a session."""
-        await self._service.sessions.delete(session_id)
+        await self._service.sessions.delete(session_id, self._ctx)

     async def commit_session(self, session_id: str) -> Dict[str, Any]:
         """Commit a session (archive and extract memories)."""
-        return await self._service.sessions.commit(session_id)
+        return await self._service.sessions.commit(session_id, self._ctx)

     async def add_message(
         self,
         session_id: str,
         role: str,
-        content: str | None = None,
-        parts: list[dict] | None = None,
+        content: Optional[str] = None,
+        parts: Optional[List[Dict[str, Any]]] = None,
     ) -> Dict[str, Any]:
         """Add a message to a session.

@@ -277,7 +286,7 @@ async def add_message(
         """
         from openviking.message.part import Part, TextPart, part_from_dict

-        session = self._service.sessions.session(session_id)
+        session = self._service.sessions.session(self._ctx, session_id)
         await session.load()

         message_parts: list[Part]
@@ -298,7 +307,7 @@ async def add_message(

     async def export_ovpack(self, uri: str, to: str) -> str:
         """Export context as .ovpack file."""
-        return await self._service.pack.export_ovpack(uri, to)
+        return await self._service.pack.export_ovpack(uri, to, ctx=self._ctx)

     async def import_ovpack(
         self,
@@ -309,7 +318,7 @@ async def import_ovpack(
     ) -> str:
         """Import .ovpack file."""
         return await self._service.pack.import_ovpack(
-            file_path, parent, force=force, vectorize=vectorize
+            file_path, parent, ctx=self._ctx, force=force, vectorize=vectorize
         )

     # ============= Debug =============
diff --git a/openviking/core/context.py b/openviking/core/context.py
index f22d8491..063cbcc1 100644
--- a/openviking/core/context.py
+++ b/openviking/core/context.py
@@ -67,6 +67,8 @@ def __init__(
         meta: Optional[Dict[str, Any]] = None,
         session_id: Optional[str] = None,
         user: Optional[UserIdentifier] = None,
+        account_id: Optional[str] = None,
+        owner_space: Optional[str] = None,
         id: Optional[str] = None,
     ):
         """
@@ -86,34 +88,44 @@ def __init__(
         self.meta = meta or {}
         self.session_id = session_id
         self.user = user
+        self.account_id = account_id or (user.account_id if user else "default")
+        self.owner_space = owner_space or self._derive_owner_space(user)
         self.vector: Optional[List[float]] = None
         self.vectorize = Vectorize(abstract)

+    def _derive_owner_space(self, user: Optional[UserIdentifier]) -> str:
+        """Best-effort owner space derived from URI and user."""
+        if not user:
+            return ""
+        if self.uri.startswith("viking://agent/"):
+            return user.agent_space_name()
+        if self.uri.startswith("viking://user/") or self.uri.startswith("viking://session/"):
+            return user.user_space_name()
+        return ""
+
     def _derive_context_type(self) -> str:
-        """Derive context type from URI prefix."""
-        if self.uri.startswith("viking://agent/skills"):
+        """Derive context type from URI using substring matching."""
+        if "/skills" in self.uri:
             return "skill"
-        elif "memories" in self.uri:
+        elif "/memories" in self.uri:
             return "memory"
         else:
             return "resource"

     def _derive_category(self) -> str:
-        """Derive category from URI prefix."""
-        if self.uri.startswith("viking://agent/memories"):
-            if "patterns" in self.uri:
-                return "patterns"
-            elif "cases" in self.uri:
-                return "cases"
-        elif self.uri.startswith("viking://user/memories"):
-            if "profile" in self.uri:
-                return "profile"
-            if "preferences" in self.uri:
-                return "preferences"
-            if "entities" in self.uri:
-                return "entities"
-            elif "events" in self.uri:
-                return "events"
+        """Derive category from URI using substring matching."""
+        if "/patterns" in self.uri:
+            return "patterns"
+        elif "/cases" in self.uri:
+            return "cases"
+        elif "/profile" in self.uri:
+            return "profile"
+        elif "/preferences" in self.uri:
+            return "preferences"
+        elif "/entities" in self.uri:
+            return "entities"
+        elif "/events" in self.uri:
+            return "events"
         return ""

     def get_context_type(self) -> str:
@@ -153,6 +165,8 @@ def to_dict(self) -> Dict[str, Any]:
             "meta": self.meta,
             "related_uri": self.related_uri,
             "session_id": self.session_id,
+            "account_id": self.account_id,
+            "owner_space": self.owner_space,
         }

         if self.user:
@@ -168,6 +182,8 @@ def to_dict(self) -> Dict[str, Any]:
     @classmethod
     def from_dict(cls, data: Dict[str, Any]) -> "Context":
         """Create a context object from dictionary."""
+        user_data = data.get("user")
+        user_obj = UserIdentifier.from_dict(user_data) if isinstance(user_data, dict) else user_data
         obj = cls(
             uri=data["uri"],
             parent_uri=data.get("parent_uri"),
@@ -189,7 +205,9 @@ def from_dict(cls, data: Dict[str, Any]) -> "Context":
             related_uri=data.get("related_uri", []),
             meta=data.get("meta", {}),
             session_id=data.get("session_id"),
-            user=data.get("user"),
+            user=user_obj,
+            account_id=data.get("account_id"),
+            owner_space=data.get("owner_space"),
         )
         obj.id = data.get("id", obj.id)
         obj.vector = data.get("vector")
diff --git a/openviking/core/directories.py b/openviking/core/directories.py
index 02623cd6..2033a29e 100644
--- a/openviking/core/directories.py
+++ b/openviking/core/directories.py
@@ -11,6 +11,7 @@
 from typing import TYPE_CHECKING, Dict, List, Optional

 from openviking.core.context import Context, ContextType, Vectorize
+from openviking.server.identity import RequestContext
 from openviking.storage.queuefs.embedding_msg_converter import EmbeddingMsgConverter

 if TYPE_CHECKING:
@@ -125,7 +126,6 @@ class DirectoryDefinition:
 def get_context_type_for_uri(uri: str) -> str:
     """Determine context_type based on URI."""
-    uri = uri[:20]
     if "/memories" in uri:
         return ContextType.MEMORY.value
     elif "/resources" in uri:
@@ -146,52 +146,64 @@ def __init__(
     ):
         self.vikingdb = vikingdb

-    async def initialize_all(self) -> int:
-        """Initialize all global preset directories (skip user scope)."""
-        from openviking_cli.utils.logger import get_logger
-
-        logger = get_logger(__name__)
+    async def initialize_account_directories(self, ctx: RequestContext) -> int:
+        """Initialize account-shared scope roots."""
         count = 0
-        for scope, root_defn in PRESET_DIRECTORIES.items():
-            if scope == "user":
-                logger.info("Skipping user scope (lazy initialization)")
-                continue
-
+        scope_roots = {
+            "user": PRESET_DIRECTORIES["user"],
+            "agent": PRESET_DIRECTORIES["agent"],
+            "resources": PRESET_DIRECTORIES["resources"],
+            "session": PRESET_DIRECTORIES["session"],
+        }
+        for scope, defn in scope_roots.items():
             root_uri = f"viking://{scope}"
             created = await self._ensure_directory(
                 uri=root_uri,
                 parent_uri=None,
-                defn=root_defn,
+                defn=defn,
                 scope=scope,
+                ctx=ctx,
             )
             if created:
                 count += 1
-
-            count += await self._initialize_children(scope, root_defn.children, root_uri)
         return count

-    async def initialize_user_directories(self) -> int:
-        """Initialize user preset directory tree.
-
-        Returns:
-            Number of directories created
-        """
+    async def initialize_user_directories(self, ctx: RequestContext) -> int:
+        """Initialize user-space tree lazily for the current user."""
         if "user" not in PRESET_DIRECTORIES:
             return 0
-
-        user_root_uri = "viking://user"
+        user_space_root = f"viking://user/{ctx.user.user_space_name()}"
         user_tree = PRESET_DIRECTORIES["user"]
         created = await self._ensure_directory(
-            uri=user_root_uri,
-            parent_uri=None,
+            uri=user_space_root,
+            parent_uri="viking://user",
             defn=user_tree,
             scope="user",
+            ctx=ctx,
         )
         count = 1 if created else 0
-        count += await self._initialize_children("user", user_tree.children, user_root_uri)
+        count += await self._initialize_children(
+            "user", user_tree.children, user_space_root, ctx=ctx
+        )
         return count

+    async def initialize_agent_directories(self, ctx: RequestContext) -> int:
+        """Initialize agent-space tree lazily for the current user+agent."""
+        if "agent" not in PRESET_DIRECTORIES:
+            return 0
+        agent_space_root = f"viking://agent/{ctx.user.agent_space_name()}"
+        agent_tree = PRESET_DIRECTORIES["agent"]
+        created = await self._ensure_directory(
+            uri=agent_space_root,
+            parent_uri="viking://agent",
+            defn=agent_tree,
+            scope="agent",
+            ctx=ctx,
+        )
+        count = 1 if created else 0
+        count += await self._initialize_children(
+            "agent", agent_tree.children, agent_space_root, ctx=ctx
+        )
+        return count
+
     async def _ensure_directory(
@@ -200,6 +212,7 @@ async def _ensure_directory(
         parent_uri: Optional[str],
         defn: DirectoryDefinition,
         scope: str,
+        ctx: RequestContext,
     ) -> bool:
         """Ensure directory exists, return whether newly created."""
         from openviking_cli.utils.logger import get_logger

         logger = get_logger(__name__)
         created = False
         # 1. Ensure files exist in AGFS
-        if not await self._check_agfs_files_exist(uri):
+        if not await self._check_agfs_files_exist(uri, ctx=ctx):
             logger.debug(f"[VikingFS] Creating directory: {uri} for scope {scope}")
-            await self._create_agfs_structure(uri, defn.abstract, defn.overview)
+            await self._create_agfs_structure(uri, defn.abstract, defn.overview, ctx=ctx)
             created = True
         else:
             logger.debug(f"[VikingFS] Directory {uri} already exists")
@@ -225,12 +238,20 @@ async def _ensure_directory(
             limit=1,
         )
         if not existing:
+            owner_space = ""
+            if scope in {"user", "session"}:
+                owner_space = ctx.user.user_space_name()
+            elif scope == "agent":
+                owner_space = ctx.user.agent_space_name()
             context = Context(
                 uri=uri,
                 parent_uri=parent_uri,
                 is_leaf=False,
                 context_type=get_context_type_for_uri(uri),
                 abstract=defn.abstract,
+                user=ctx.user,
+                account_id=ctx.account_id,
+                owner_space=owner_space,
             )
             context.set_vectorize(Vectorize(text=defn.overview))
             dir_emb_msg = EmbeddingMsgConverter.from_context(context)
@@ -238,13 +259,13 @@ async def _ensure_directory(
             created = True
         return created

-    async def _check_agfs_files_exist(self, uri: str) -> bool:
+    async def _check_agfs_files_exist(self, uri: str, ctx: RequestContext) -> bool:
         """Check if L0/L1 files exist in AGFS."""
         from openviking.storage.viking_fs import get_viking_fs

         try:
             viking_fs = get_viking_fs()
-            await viking_fs.abstract(uri)
+            await viking_fs.abstract(uri, ctx=ctx)
             return True
         except Exception:
             return False
@@ -254,6 +275,7 @@ async def _initialize_children(
         self,
         scope: str,
         children: List[DirectoryDefinition],
         parent_uri: str,
+        ctx: RequestContext,
     ) -> int:
         """Recursively initialize subdirectories."""
         count = 0
@@ -266,16 +288,19 @@ async def _initialize_children(
                 parent_uri=parent_uri,
                 defn=defn,
                 scope=scope,
+                ctx=ctx,
             )
             if created:
                 count += 1

             if defn.children:
-                count += await self._initialize_children(scope, defn.children, uri)
+                count += await self._initialize_children(scope, defn.children, uri, ctx=ctx)

         return count

-    async def _create_agfs_structure(self, uri: str, abstract: str, overview: str) -> None:
+    async def _create_agfs_structure(
+        self, uri: str, abstract: str, overview: str, ctx: RequestContext
+    ) -> None:
         """Create L0/L1 file structure for directory in AGFS."""
         from openviking.storage.viking_fs import get_viking_fs

@@ -284,4 +309,5 @@ async def _create_agfs_structure(
             abstract=abstract,
             overview=overview,
             is_leaf=False,  # Preset directories can continue traversing downward
+            ctx=ctx,
         )
diff --git a/openviking/parse/parsers/media/utils.py b/openviking/parse/parsers/media/utils.py
index 4615c285..7e8ec8dd 100644
--- a/openviking/parse/parsers/media/utils.py
+++ b/openviking/parse/parsers/media/utils.py
@@ -5,13 +5,16 @@
 import asyncio
 from datetime import datetime
 from pathlib import Path
-from typing import Any, Dict, Optional
+from typing import TYPE_CHECKING, Any, Dict, Optional

 from openviking.prompts import render_prompt
 from openviking.storage.viking_fs import get_viking_fs
 from openviking_cli.utils.config import get_openviking_config
 from openviking_cli.utils.logger import get_logger

+if TYPE_CHECKING:
+    from openviking.server.identity import RequestContext
+
 from .constants import AUDIO_EXTENSIONS, IMAGE_EXTENSIONS, VIDEO_EXTENSIONS

 logger = get_logger(__name__)
@@ -95,7 +98,10 @@ def get_media_base_uri(media_type: str) -> str:

 async def generate_image_summary(
-    image_uri: str, original_filename: str, llm_sem: Optional[asyncio.Semaphore] = None
+    image_uri: str,
+    original_filename: str,
+    llm_sem: Optional[asyncio.Semaphore] = None,
+    ctx: Optional["RequestContext"] = None,
 ) -> Dict[str, Any]:
     """
     Generate summary for an image file using VLM.
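Reviewer note: the media-utils hunk above pulls `RequestContext` in under `TYPE_CHECKING` and quotes the annotation, which keeps type hints without a runtime import (and thus without the circular-import risk). A minimal self-contained sketch of that pattern, using `Fraction` as a stand-in type:

```python
from typing import TYPE_CHECKING, Optional

if TYPE_CHECKING:
    # Only static type checkers evaluate this branch; at runtime the
    # import is skipped, which is how the real code avoids a cycle.
    from fractions import Fraction  # stand-in for RequestContext

def summarize(name: str, ctx: Optional["Fraction"] = None) -> dict:
    # The quoted "Fraction" is a forward reference, so this function is
    # defined fine even though Fraction was never imported at runtime.
    return {"name": name, "tenant_aware": ctx is not None}

print(summarize("demo.png"))  # → {'name': 'demo.png', 'tenant_aware': False}
```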
@@ -103,6 +109,8 @@ async def generate_image_summary(
     Args:
         image_uri: URI to the image file in VikingFS
         original_filename: Original filename of the image
+        llm_sem: Semaphore to limit concurrent LLM calls
+        ctx: Optional request context for tenant-aware file access

     Returns:
         Dictionary with "name" and "summary" keys
@@ -113,7 +121,7 @@ async def generate_image_summary(

     try:
         # Read image bytes
-        image_bytes = await viking_fs.read_file_bytes(image_uri)
+        image_bytes = await viking_fs.read_file_bytes(image_uri, ctx=ctx)

         if not isinstance(image_bytes, bytes):
             raise ValueError(f"Expected bytes for image file, got {type(image_bytes)}")
@@ -163,7 +171,10 @@ async def generate_image_summary(

 async def generate_audio_summary(
-    audio_uri: str, original_filename: str, llm_sem: Optional[asyncio.Semaphore] = None
+    audio_uri: str,
+    original_filename: str,
+    llm_sem: Optional[asyncio.Semaphore] = None,
+    ctx: Optional["RequestContext"] = None,
 ) -> Dict[str, Any]:
     """
     Generate summary for an audio file (placeholder).
@@ -171,6 +182,8 @@ async def generate_audio_summary(
     Args:
         audio_uri: URI to the audio file in VikingFS
         original_filename: Original filename of the audio
+        llm_sem: Semaphore to limit concurrent LLM calls
+        ctx: Optional request context for tenant-aware file access

     Returns:
         Dictionary with "name" and "summary" keys
@@ -182,7 +195,10 @@ async def generate_audio_summary(

 async def generate_video_summary(
-    video_uri: str, original_filename: str, llm_sem: Optional[asyncio.Semaphore] = None
+    video_uri: str,
+    original_filename: str,
+    llm_sem: Optional[asyncio.Semaphore] = None,
+    ctx: Optional["RequestContext"] = None,
 ) -> Dict[str, Any]:
     """
     Generate summary for a video file (placeholder).
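Reviewer note: the `llm_sem` parameter documented in these hunks bounds how many VLM calls run at once. A minimal sketch of that throttling pattern, with `asyncio.sleep` standing in for the real model round trip:

```python
import asyncio

async def call_vlm(i: int, sem: asyncio.Semaphore) -> str:
    # Only sem's initial count of callers may hold the semaphore at a
    # time, so slow model requests cannot pile up unboundedly.
    async with sem:
        await asyncio.sleep(0)  # stand-in for the real VLM request
        return f"summary-{i}"

async def main() -> list:
    sem = asyncio.Semaphore(2)  # cap concurrency at 2, as a caller might
    return await asyncio.gather(*(call_vlm(i, sem) for i in range(4)))

print(asyncio.run(main()))  # → ['summary-0', 'summary-1', 'summary-2', 'summary-3']
```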
@@ -190,6 +206,8 @@ async def generate_video_summary(
     Args:
         video_uri: URI to the video file in VikingFS
         original_filename: Original filename of the video
+        llm_sem: Semaphore to limit concurrent LLM calls
+        ctx: Optional request context for tenant-aware file access

     Returns:
         Dictionary with "name" and "summary" keys
diff --git a/openviking/parse/tree_builder.py b/openviking/parse/tree_builder.py
index 2d202aa0..9a767fce 100644
--- a/openviking/parse/tree_builder.py
+++ b/openviking/parse/tree_builder.py
@@ -26,6 +26,7 @@
 from openviking.core.building_tree import BuildingTree
 from openviking.parse.parsers.media.utils import get_media_base_uri, get_media_type
+from openviking.server.identity import RequestContext
 from openviking.storage.queuefs import SemanticMsg, get_queue_manager
 from openviking.storage.viking_fs import get_viking_fs
 from openviking_cli.utils.uri import VikingURI
@@ -84,6 +85,7 @@ def _get_base_uri(
     async def finalize_from_temp(
         self,
         temp_dir_path: str,
+        ctx: RequestContext,
         scope: str,
         base_uri: Optional[str] = None,
         source_path: Optional[str] = None,
@@ -113,7 +115,7 @@ async def finalize_from_temp(
             temp_uri = temp_dir_path

         # 1. Find document root directory
-        entries = await viking_fs.ls(temp_uri)
+        entries = await viking_fs.ls(temp_uri, ctx=ctx)
         doc_dirs = [e for e in entries if e.get("isDir") and e["name"] not in [".", ".."]]

         if len(doc_dirs) != 1:
@@ -133,26 +135,26 @@ async def finalize_from_temp(

         # 3. Build final URI, auto-renaming on conflict (e.g. doc_1, doc_2, ...)
         candidate_uri = VikingURI(base_uri).join(doc_name).uri
-        final_uri = await self._resolve_unique_uri(candidate_uri)
+        final_uri = await self._resolve_unique_uri(candidate_uri, ctx=ctx)
         if final_uri != candidate_uri:
             logger.info(f"[TreeBuilder] Resolved name conflict: {candidate_uri} -> {final_uri}")
         else:
             logger.info(f"[TreeBuilder] Finalizing from temp: {final_uri}")

         # 4. Move directory tree from temp to final location in AGFS
-        await self._move_temp_to_dest(viking_fs, temp_doc_uri, final_uri)
+        await self._move_temp_to_dest(viking_fs, temp_doc_uri, final_uri, ctx=ctx)
         logger.info(f"[TreeBuilder] Moved temp tree: {temp_doc_uri} -> {final_uri}")

         # 5. Cleanup temporary root directory
         try:
-            await viking_fs.delete_temp(temp_uri)
+            await viking_fs.delete_temp(temp_uri, ctx=ctx)
             logger.info(f"[TreeBuilder] Cleaned up temp root: {temp_uri}")
         except Exception as e:
             logger.warning(f"[TreeBuilder] Failed to cleanup temp root: {e}")

         # 6. Enqueue to SemanticQueue for async semantic generation
         try:
-            await self._enqueue_semantic_generation(final_uri, "resource")
+            await self._enqueue_semantic_generation(final_uri, "resource", ctx=ctx)
             logger.info(f"[TreeBuilder] Enqueued semantic generation for: {final_uri}")
         except Exception as e:
             logger.error(f"[TreeBuilder] Failed to enqueue semantic generation: {e}", exc_info=True)
@@ -166,7 +168,9 @@ async def finalize_from_temp(

         return tree

-    async def _resolve_unique_uri(self, uri: str, max_attempts: int = 100) -> str:
+    async def _resolve_unique_uri(
+        self, uri: str, max_attempts: int = 100, ctx: Optional[RequestContext] = None
+    ) -> str:
         """Return a URI that does not collide with an existing resource.

         If *uri* is free, return it unchanged. Otherwise append ``_1``,
@@ -177,7 +181,7 @@ async def _resolve_unique_uri(

         async def _exists(u: str) -> bool:
             try:
-                await viking_fs.stat(u)
+                await viking_fs.stat(u, ctx=ctx)
                 return True
             except Exception:
                 return False
@@ -192,17 +196,19 @@ async def _exists(u: str) -> bool:

         raise FileExistsError(f"Cannot resolve unique name for {uri} after {max_attempts} attempts")

-    async def _move_temp_to_dest(self, viking_fs, src_uri: str, dst_uri: str) -> None:
+    async def _move_temp_to_dest(
+        self, viking_fs, src_uri: str, dst_uri: str, ctx: RequestContext
+    ) -> None:
         """Move temp directory to final destination using a single native AGFS mv call.

         Temp files have no vector records yet, so no vector index update is needed.
         """
-        src_path = viking_fs._uri_to_path(src_uri)
-        dst_path = viking_fs._uri_to_path(dst_uri)
-        await self._ensure_parent_dirs(dst_uri)
+        src_path = viking_fs._uri_to_path(src_uri, ctx=ctx)
+        dst_path = viking_fs._uri_to_path(dst_uri, ctx=ctx)
+        await self._ensure_parent_dirs(dst_uri, ctx=ctx)
         await asyncio.to_thread(viking_fs.agfs.mv, src_path, dst_path)

-    async def _ensure_parent_dirs(self, uri: str) -> None:
+    async def _ensure_parent_dirs(self, uri: str, ctx: RequestContext) -> None:
         """Recursively create parent directories."""
         viking_fs = get_viking_fs()
         parent = VikingURI(uri).parent
@@ -210,18 +216,20 @@ async def _ensure_parent_dirs(
             return
         parent_uri = parent.uri
         # Recursively ensure parent's parent exists
-        await self._ensure_parent_dirs(parent_uri)
+        await self._ensure_parent_dirs(parent_uri, ctx=ctx)
         # Create parent directory (ignore if already exists)
         try:
-            await viking_fs.mkdir(parent_uri)
+            await viking_fs.mkdir(parent_uri, ctx=ctx)
             logger.debug(f"Created parent directory: {parent_uri}")
         except Exception as e:
             # Directory may already exist, ignore error
             if "exist" not in str(e).lower():
                 logger.debug(f"Parent dir {parent_uri} may already exist: {e}")

-    async def _enqueue_semantic_generation(self, uri: str, context_type: str) -> None:
+    async def _enqueue_semantic_generation(
+        self, uri: str, context_type: str, ctx: RequestContext
+    ) -> None:
         """
         Enqueue a directory for semantic generation.
@@ -239,6 +247,10 @@ async def _enqueue_semantic_generation(
         msg = SemanticMsg(
             uri=uri,
             context_type=context_type,
+            account_id=ctx.account_id,
+            user_id=ctx.user.user_id,
+            agent_id=ctx.user.agent_id,
+            role=ctx.role.value,
         )
         await semantic_queue.enqueue(msg)
diff --git a/openviking/retrieve/hierarchical_retriever.py b/openviking/retrieve/hierarchical_retriever.py
index 26b4372a..987978a4 100644
--- a/openviking/retrieve/hierarchical_retriever.py
+++ b/openviking/retrieve/hierarchical_retriever.py
@@ -11,6 +11,7 @@
 from typing import Any, Dict, List, Optional, Tuple

 from openviking.models.embedder.base import EmbedResult
+from openviking.server.identity import RequestContext, Role
 from openviking.storage import VikingDBInterface
 from openviking.storage.viking_fs import get_viking_fs
 from openviking_cli.retrieve.types import (
@@ -76,6 +77,7 @@ def __init__(
     async def retrieve(
         self,
         query: TypedQuery,
+        ctx: RequestContext,
         limit: int = 5,
         mode: RetrieverMode = RetrieverMode.THINKING,
         score_threshold: Optional[float] = None,
@@ -96,15 +98,22 @@ async def retrieve(
         # Use custom threshold or default threshold
         effective_threshold = score_threshold if score_threshold is not None else self.threshold

-        collection = self._type_to_collection(query.context_type)
+        collection = "context"
         target_dirs = [d for d in (query.target_directories or []) if d]

-        # Create context_type filter
-        type_filter = {"op": "must", "field": "context_type", "conds": [query.context_type.value]}
-
-        # Merge all filters
-        filters_to_merge = [type_filter]
+        # Create context_type filter (skip when context_type is None = search all types)
+        filters_to_merge = []
+        if query.context_type is not None:
+            type_filter = {
+                "op": "must",
+                "field": "context_type",
+                "conds": [query.context_type.value],
+            }
+            filters_to_merge.append(type_filter)
+        tenant_filter = self._build_tenant_filter(ctx, context_type=query.context_type)
+        if tenant_filter:
+            filters_to_merge.append(tenant_filter)
         if target_dirs:
             target_filter = {
                 "op": "or",
@@ -139,7 +148,7 @@ async def retrieve(
         if target_dirs:
             root_uris = target_dirs
         else:
-            root_uris = self._get_root_uris_for_type(query.context_type)
+            root_uris = self._get_root_uris_for_type(query.context_type, ctx=ctx)

         # Step 2: Global vector search to supplement starting points
         global_results = await self._global_vector_search(
@@ -168,7 +177,7 @@ async def retrieve(
         )

         # Step 6: Convert results
-        matched = await self._convert_to_matched_contexts(candidates, query.context_type)
+        matched = await self._convert_to_matched_contexts(candidates, ctx=ctx)

         return QueryResult(
             query=query,
@@ -176,6 +185,34 @@ async def retrieve(
             searched_directories=root_uris,
         )

+    def _build_tenant_filter(
+        self, ctx: RequestContext, context_type: Optional[ContextType] = None
+    ) -> Optional[Dict[str, Any]]:
+        """Build tenant visibility filter by role.
+
+        Args:
+            ctx: Request context with role and user info.
+            context_type: When RESOURCE, allow owner_space="" so shared
+                resources are visible to USER role.
+        """
+        if ctx.role == Role.ROOT:
+            return None
+
+        owner_spaces = [ctx.user.user_space_name(), ctx.user.agent_space_name()]
+        if context_type == ContextType.RESOURCE:
+            owner_spaces.append("")
+        return {
+            "op": "and",
+            "conds": [
+                {"op": "must", "field": "account_id", "conds": [ctx.account_id]},
+                {
+                    "op": "must",
+                    "field": "owner_space",
+                    "conds": owner_spaces,
+                },
+            ],
+        }
+
     async def _global_vector_search(
         self,
         collection: str,
@@ -377,7 +414,7 @@ def merge_filter(base_filter: Dict, extra_filter: Optional[Dict]) -> Dict:
     async def _convert_to_matched_contexts(
         self,
         candidates: List[Dict[str, Any]],
-        context_type: ContextType,
+        ctx: RequestContext,
     ) -> List[MatchedContext]:
         """Convert candidate results to MatchedContext list."""
         results = []
@@ -386,10 +423,10 @@ async def _convert_to_matched_contexts(
             # Read related contexts and get summaries
             relations = []
             if get_viking_fs():
-                related_uris = await get_viking_fs().get_relations(c.get("uri", ""))
+                related_uris = await get_viking_fs().get_relations(c.get("uri", ""), ctx=ctx)
                 if related_uris:
                     related_abstracts = await get_viking_fs().read_batch(
-                        related_uris[: self.MAX_RELATIONS], level="l0"
+                        related_uris[: self.MAX_RELATIONS], level="l0", ctx=ctx
                     )
                     for uri in related_uris[: self.MAX_RELATIONS]:
                         abstract = related_abstracts.get(uri, "")
@@ -399,7 +436,9 @@ async def _convert_to_matched_contexts(
             results.append(
                 MatchedContext(
                     uri=c.get("uri", ""),
-                    context_type=context_type,
+                    context_type=ContextType(c["context_type"])
+                    if c.get("context_type")
+                    else ContextType.RESOURCE,
                     level=c.get("level", 2),
                     abstract=c.get("abstract", ""),
                     category=c.get("category", ""),
@@ -410,18 +449,33 @@ async def _convert_to_matched_contexts(

         return results

-    def _get_root_uris_for_type(self, context_type: ContextType) -> List[str]:
-        """Return starting directory URI list based on context_type."""
-        if context_type == ContextType.MEMORY:
-            return ["viking://user/memories", "viking://agent/memories"]
+    def _get_root_uris_for_type(
+        self, context_type: Optional[ContextType], ctx: Optional[RequestContext] = None
+    ) -> List[str]:
+        """Return starting directory URI list based on context_type and user context.
+
+        When context_type is None, returns roots for all types.
+        ROOT has no space; it relies on global vector search without a URI prefix filter.
+        """
+        if not ctx or ctx.role == Role.ROOT:
+            return []
+
+        user_space = ctx.user.user_space_name()
+        agent_space = ctx.user.agent_space_name()
+        if context_type is None:
+            return [
+                f"viking://user/{user_space}/memories",
+                f"viking://agent/{agent_space}/memories",
+                "viking://resources",
+                f"viking://agent/{agent_space}/skills",
+            ]
+        elif context_type == ContextType.MEMORY:
+            return [
+                f"viking://user/{user_space}/memories",
+                f"viking://agent/{agent_space}/memories",
+            ]
         elif context_type == ContextType.RESOURCE:
             return ["viking://resources"]
         elif context_type == ContextType.SKILL:
-            return ["viking://agent/skills"]
+            return [f"viking://agent/{agent_space}/skills"]
         return []
-
-    def _type_to_collection(self, context_type: ContextType) -> str:
-        """
-        Convert context type to collection name.
-        """
-        return "context"
diff --git a/openviking/server/auth.py b/openviking/server/auth.py
index 3c058cae..06aad25e 100644
--- a/openviking/server/auth.py
+++ b/openviking/server/auth.py
@@ -15,6 +15,8 @@ async def resolve_identity(
     request: Request,
     x_api_key: Optional[str] = Header(None),
     authorization: Optional[str] = Header(None),
+    x_openviking_account: Optional[str] = Header(None, alias="X-OpenViking-Account"),
+    x_openviking_user: Optional[str] = Header(None, alias="X-OpenViking-User"),
     x_openviking_agent: Optional[str] = Header(None, alias="X-OpenViking-Agent"),
 ) -> ResolvedIdentity:
     """Resolve API key to identity.
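Reviewer note: in the `resolve_identity` change below, only a ROOT key may re-scope itself via the `X-OpenViking-Account` / `X-OpenViking-User` headers; every other role keeps the identity resolved from its API key. A standalone sketch of that precedence rule (the dataclass and helper here are illustrative, not the project's actual types):

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class Role(Enum):
    ROOT = "root"
    USER = "user"

@dataclass
class Identity:
    role: Role
    account_id: str = "default"
    user_id: str = "default"

def apply_header_overrides(identity: Identity,
                           account: Optional[str],
                           user: Optional[str]) -> Identity:
    # ROOT may impersonate any tenant; everyone else keeps the ids
    # that were resolved from their API key.
    if identity.role == Role.ROOT:
        identity.account_id = account or identity.account_id or "default"
        identity.user_id = user or identity.user_id or "default"
    return identity

root = apply_header_overrides(Identity(Role.ROOT), "acme", None)
print(root.account_id, root.user_id)  # → acme default

user = apply_header_overrides(Identity(Role.USER, "orig", "alice"), "acme", "bob")
print(user.account_id, user.user_id)  # → orig alice
```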
@@ -28,8 +30,8 @@ async def resolve_identity(
     if api_key_manager is None:
         return ResolvedIdentity(
             role=Role.ROOT,
-            account_id="default",
-            user_id="default",
+            account_id=x_openviking_account or "default",
+            user_id=x_openviking_user or "default",
             agent_id=x_openviking_agent or "default",
         )

@@ -44,6 +46,9 @@ async def resolve_identity(
     identity = api_key_manager.resolve(api_key)
     identity.agent_id = x_openviking_agent or "default"
+    if identity.role == Role.ROOT:
+        identity.account_id = x_openviking_account or identity.account_id or "default"
+        identity.user_id = x_openviking_user or identity.user_id or "default"
     return identity
diff --git a/openviking/server/routers/admin.py b/openviking/server/routers/admin.py
index c76e0e38..681dbe5c 100644
--- a/openviking/server/routers/admin.py
+++ b/openviking/server/routers/admin.py
@@ -6,9 +6,15 @@
 from pydantic import BaseModel

 from openviking.server.auth import require_role
+from openviking.server.dependencies import get_service
 from openviking.server.identity import RequestContext, Role
 from openviking.server.models import Response
+from openviking.storage.viking_fs import get_viking_fs
 from openviking_cli.exceptions import PermissionDeniedError
+from openviking_cli.session.user_id import UserIdentifier
+from openviking_cli.utils.logger import get_logger
+
+logger = get_logger(__name__)

 router = APIRouter(prefix="/api/v1/admin", tags=["admin"])

@@ -53,6 +59,13 @@ async def create_account(
     """Create a new account (workspace) with its first admin user."""
     manager = _get_api_key_manager(request)
     user_key = await manager.create_account(body.account_id, body.admin_user_id)
+    service = get_service()
+    account_ctx = RequestContext(
+        user=UserIdentifier(body.account_id, body.admin_user_id, "default"),
+        role=Role.ADMIN,
+    )
+    await service.initialize_account_directories(account_ctx)
+    await service.initialize_user_directories(account_ctx)
     return Response(
         status="ok",
         result={
@@ -80,8 +93,44 @@ async def delete_account(
     account_id: str = Path(..., description="Account ID"),
     ctx: RequestContext = require_role(Role.ROOT),
 ):
-    """Delete an account."""
+    """Delete an account and cascade-clean its storage (AGFS + VectorDB)."""
     manager = _get_api_key_manager(request)
+
+    # Build a ROOT-level context scoped to the target account for cleanup
+    cleanup_ctx = RequestContext(
+        user=UserIdentifier(account_id, "system", "system"),
+        role=Role.ROOT,
+    )
+
+    # Cascade: remove AGFS data for the account
+    viking_fs = get_viking_fs()
+    account_prefixes = [
+        "viking://user/",
+        "viking://agent/",
+        "viking://session/",
+        "viking://resources/",
+    ]
+    for prefix in account_prefixes:
+        try:
+            await viking_fs.rm(prefix, recursive=True, ctx=cleanup_ctx)
+        except Exception as e:
+            logger.warning(f"AGFS cleanup for {prefix} in account {account_id}: {e}")
+
+    # Cascade: remove VectorDB records for the account
+    try:
+        storage = viking_fs._get_vector_store()
+        if storage:
+            account_filter = {
+                "op": "must",
+                "field": "account_id",
+                "conds": [account_id],
+            }
+            deleted = await storage.batch_delete("context", account_filter)
+            logger.info(f"VectorDB cascade delete for account {account_id}: {deleted} records")
+    except Exception as e:
+        logger.warning(f"VectorDB cleanup for account {account_id}: {e}")
+
+    # Finally delete the account metadata
     await manager.delete_account(account_id)
     return Response(status="ok", result={"deleted": True})

@@ -100,6 +149,12 @@ async def register_user(
     _check_account_access(ctx, account_id)
     manager = _get_api_key_manager(request)
     user_key = await manager.register_user(account_id, body.user_id, body.role)
+    service = get_service()
+    user_ctx = RequestContext(
+        user=UserIdentifier(account_id, body.user_id, "default"),
+        role=Role.USER,
+    )
+    await service.initialize_user_directories(user_ctx)
     return Response(
         status="ok",
         result={
diff --git a/openviking/server/routers/content.py b/openviking/server/routers/content.py
index 38a3cde3..f73e4e2b 100644
--- a/openviking/server/routers/content.py
+++ b/openviking/server/routers/content.py
@@ -21,7 +21,7 @@ async def read(
 ):
     """Read file content (L2)."""
     service = get_service()
-    result = await service.fs.read(uri, offset=offset, limit=limit)
+    result = await service.fs.read(uri, ctx=_ctx, offset=offset, limit=limit)
     return Response(status="ok", result=result)

@@ -32,7 +32,7 @@ async def abstract(
 ):
     """Read L0 abstract."""
     service = get_service()
-    result = await service.fs.abstract(uri)
+    result = await service.fs.abstract(uri, ctx=_ctx)
     return Response(status="ok", result=result)

@@ -43,5 +43,5 @@ async def overview(
 ):
     """Read L1 overview."""
     service = get_service()
-    result = await service.fs.overview(uri)
+    result = await service.fs.overview(uri, ctx=_ctx)
     return Response(status="ok", result=result)
diff --git a/openviking/server/routers/filesystem.py b/openviking/server/routers/filesystem.py
index 521098ae..f34f1632 100644
--- a/openviking/server/routers/filesystem.py
+++ b/openviking/server/routers/filesystem.py
@@ -30,6 +30,7 @@ async def ls(
     service = get_service()
     result = await service.fs.ls(
         uri,
+        ctx=_ctx,
         recursive=recursive,
         simple=simple,
         output=output,
@@ -53,6 +54,7 @@ async def tree(
     service = get_service()
     result = await service.fs.tree(
         uri,
+        ctx=_ctx,
         output=output,
         abs_limit=abs_limit,
         show_all_hidden=show_all_hidden,
@@ -69,7 +71,7 @@ async def stat(
     """Get resource status."""
     service = get_service()
     try:
-        result = await service.fs.stat(uri)
+        result = await service.fs.stat(uri, ctx=_ctx)
         return Response(status="ok", result=result)
     except AGFSClientError as e:
         if "no such file or directory" in str(e).lower():
@@ -90,7 +92,7 @@ async def mkdir(
 ):
     """Create directory."""
     service = get_service()
-    await service.fs.mkdir(request.uri)
+    await service.fs.mkdir(request.uri, ctx=_ctx)
     return Response(status="ok", result={"uri": request.uri})

@@ -102,7 +104,7 @@ async def rm(
 ):
     """Remove resource."""
     service = get_service()
-    await service.fs.rm(uri, recursive=recursive)
+    await service.fs.rm(uri, ctx=_ctx, recursive=recursive)
     return Response(status="ok", result={"uri": uri})

@@ -120,5 +122,5 @@ async def mv(
 ):
     """Move resource."""
     service = get_service()
-    await service.fs.mv(request.from_uri, request.to_uri)
+    await service.fs.mv(request.from_uri, request.to_uri, ctx=_ctx)
     return Response(status="ok", result={"from": request.from_uri, "to": request.to_uri})
diff --git a/openviking/server/routers/pack.py b/openviking/server/routers/pack.py
index a79d3f76..738bc7ea 100644
--- a/openviking/server/routers/pack.py
+++ b/openviking/server/routers/pack.py
@@ -39,7 +39,7 @@ async def export_ovpack(
 ):
     """Export context as .ovpack file."""
     service = get_service()
-    result = await service.pack.export_ovpack(request.uri, request.to)
+    result = await service.pack.export_ovpack(request.uri, request.to, ctx=_ctx)
     return Response(status="ok", result={"file": result})

@@ -58,6 +58,7 @@ async def import_ovpack(
     result = await service.pack.import_ovpack(
         file_path,
         request.parent,
+        ctx=_ctx,
         force=request.force,
         vectorize=request.vectorize,
     )
diff --git a/openviking/server/routers/relations.py b/openviking/server/routers/relations.py
index c2e5d2c4..b6803513 100644
--- a/openviking/server/routers/relations.py
+++ b/openviking/server/routers/relations.py
@@ -37,7 +37,7 @@ async def relations(
 ):
     """Get relations for a resource."""
     service = get_service()
-    result = await service.relations.relations(uri)
+    result = await service.relations.relations(uri, ctx=_ctx)
     return Response(status="ok", result=result)

@@ -48,7 +48,7 @@ async def link(
 ):
     """Create link between resources."""
     service = get_service()
-    await service.relations.link(request.from_uri, request.to_uris, request.reason)
+    await service.relations.link(request.from_uri, request.to_uris, ctx=_ctx, reason=request.reason)
     return Response(status="ok", result={"from": request.from_uri, "to": request.to_uris})

@@ -59,5 +59,5 @@ async def unlink(
 ):
     """Remove link between resources."""
     service = get_service()
-    await service.relations.unlink(request.from_uri, request.to_uri)
+    await service.relations.unlink(request.from_uri, request.to_uri, ctx=_ctx)
     return Response(status="ok", result={"from": request.from_uri, "to": request.to_uri})
diff --git a/openviking/server/routers/resources.py b/openviking/server/routers/resources.py
index 370612fd..dbfa0ae1 100644
--- a/openviking/server/routers/resources.py
+++ b/openviking/server/routers/resources.py
@@ -91,6 +91,7 @@ async def add_resource(
     result = await service.resources.add_resource(
         path=path,
+        ctx=_ctx,
         target=request.target,
         reason=request.reason,
         instruction=request.instruction,
@@ -109,6 +110,7 @@ async def add_skill(
     service = get_service()
     result = await service.resources.add_skill(
         data=request.data,
+        ctx=_ctx,
         wait=request.wait,
         timeout=request.timeout,
     )
diff --git a/openviking/server/routers/search.py b/openviking/server/routers/search.py
index 0dbbb956..299d93b2 100644
--- a/openviking/server/routers/search.py
+++ b/openviking/server/routers/search.py
@@ -60,6 +60,7 @@ async def find(
     service = get_service()
     result = await service.search.find(
         query=request.query,
+        ctx=_ctx,
         target_uri=request.target_uri,
         limit=request.limit,
         score_threshold=request.score_threshold,
@@ -82,11 +83,12 @@ async def search(
     # Get session if session_id provided
     session = None
     if request.session_id:
-        session = service.sessions.session(request.session_id)
+        session = service.sessions.session(_ctx, request.session_id)
         await session.load()

     result = await service.search.search(
         query=request.query,
+        ctx=_ctx,
         target_uri=request.target_uri,
         session=session,
         limit=request.limit,
@@ -109,6 +111,7 @@ async def grep(
     result = await service.fs.grep(
         request.uri,
         request.pattern,
+        ctx=_ctx,
         case_insensitive=request.case_insensitive,
     )
     return Response(status="ok", result=result)

@@ -121,5 +124,5 @@ async def glob(
 ):
     """File pattern matching."""
     service = get_service()
-    result
= await service.fs.glob(request.pattern, uri=request.uri) + result = await service.fs.glob(request.pattern, ctx=_ctx, uri=request.uri) return Response(status="ok", result=result) diff --git a/openviking/server/routers/sessions.py b/openviking/server/routers/sessions.py index 985c41b1..29b5f64f 100644 --- a/openviking/server/routers/sessions.py +++ b/openviking/server/routers/sessions.py @@ -7,7 +7,7 @@ from fastapi import APIRouter, Depends, Path from pydantic import BaseModel, model_validator -from openviking.message.part import Part, TextPart, part_from_dict +from openviking.message.part import TextPart, part_from_dict from openviking.server.auth import get_request_context from openviking.server.dependencies import get_service from openviking.server.identity import RequestContext @@ -87,7 +87,9 @@ async def create_session( ): """Create a new session.""" service = get_service() - session = service.sessions.session() + await service.initialize_user_directories(_ctx) + await service.initialize_agent_directories(_ctx) + session = await service.sessions.create(_ctx) return Response( status="ok", result={ @@ -103,7 +105,7 @@ async def list_sessions( ): """List all sessions.""" service = get_service() - result = await service.sessions.sessions() + result = await service.sessions.sessions(_ctx) return Response(status="ok", result=result) @@ -114,8 +116,7 @@ async def get_session( ): """Get session details.""" service = get_service() - session = service.sessions.session(session_id) - await session.load() + session = await service.sessions.get(session_id, _ctx) return Response( status="ok", result={ @@ -133,7 +134,7 @@ async def delete_session( ): """Delete a session.""" service = get_service() - await service.sessions.delete(session_id) + await service.sessions.delete(session_id, _ctx) return Response(status="ok", result={"session_id": session_id}) @@ -144,7 +145,7 @@ async def commit_session( ): """Commit a session (archive and extract memories).""" service = 
get_service() - result = await service.sessions.commit(session_id) + result = await service.sessions.commit(session_id, _ctx) return Response(status="ok", result=result) @@ -155,7 +156,7 @@ async def extract_session( ): """Extract memories from a session.""" service = get_service() - result = await service.sessions.extract(session_id) + result = await service.sessions.extract(session_id, _ctx) return Response(status="ok", result=_to_jsonable(result)) @@ -180,7 +181,7 @@ async def add_message( If both `content` and `parts` are provided, `parts` takes precedence. """ service = get_service() - session = service.sessions.session(session_id) + session = service.sessions.session(_ctx, session_id) await session.load() if request.parts is not None: diff --git a/openviking/server/routers/system.py b/openviking/server/routers/system.py index 6002d439..7dcea691 100644 --- a/openviking/server/routers/system.py +++ b/openviking/server/routers/system.py @@ -4,13 +4,18 @@ from typing import Optional -from fastapi import APIRouter, Depends +from fastapi import APIRouter, Depends, Request +from fastapi.responses import JSONResponse from pydantic import BaseModel from openviking.server.auth import get_request_context from openviking.server.dependencies import get_service from openviking.server.identity import RequestContext from openviking.server.models import Response +from openviking.storage.viking_fs import get_viking_fs +from openviking_cli.utils.logger import get_logger + +logger = get_logger(__name__) router = APIRouter() @@ -21,6 +26,53 @@ async def health_check(): return {"status": "ok"} +@router.get("/ready", tags=["system"]) +async def readiness_check(request: Request): + """Readiness probe — checks AGFS, VectorDB, and APIKeyManager. + + Returns 200 when all subsystems are operational, 503 otherwise. + No authentication required (designed for K8s probes). + """ + checks = {} + + # 1. 
AGFS: try to list root + try: + viking_fs = get_viking_fs() + await viking_fs.ls("viking://", ctx=None) + checks["agfs"] = "ok" + except Exception as e: + checks["agfs"] = f"error: {e}" + + # 2. VectorDB: health_check() + try: + viking_fs = get_viking_fs() + storage = viking_fs._get_vector_store() + if storage: + healthy = await storage.health_check() + checks["vectordb"] = "ok" if healthy else "unhealthy" + else: + checks["vectordb"] = "not_configured" + except Exception as e: + checks["vectordb"] = f"error: {e}" + + # 3. APIKeyManager: check if loaded + try: + manager = getattr(request.app.state, "api_key_manager", None) + if manager is not None: + checks["api_key_manager"] = "ok" + else: + checks["api_key_manager"] = "not_configured" + except Exception as e: + checks["api_key_manager"] = f"error: {e}" + + all_ok = all(v in ("ok", "not_configured") for v in checks.values()) + status_code = 200 if all_ok else 503 + return JSONResponse( + status_code=status_code, + content={"status": "ready" if all_ok else "not_ready", "checks": checks}, + ) + + @router.get("/api/v1/system/status", tags=["system"]) async def system_status( _ctx: RequestContext = Depends(get_request_context), diff --git a/openviking/service/core.py b/openviking/service/core.py index cab39dfe..be603058 100644 --- a/openviking/service/core.py +++ b/openviking/service/core.py @@ -11,6 +11,7 @@ from openviking.agfs_manager import AGFSManager from openviking.core.directories import DirectoryInitializer +from openviking.server.identity import RequestContext, Role from openviking.service.debug_service import DebugService from openviking.service.fs_service import FSService from openviking.service.pack_service import PackService @@ -75,6 +76,7 @@ def __init__( self._skill_processor: Optional[SkillProcessor] = None self._session_compressor: Optional[SessionCompressor] = None self._transaction_manager: Optional[TransactionManager] = None + self._directory_initializer: Optional[DirectoryInitializer] = None # 
Sub-services self._fs_service = FSService() @@ -106,13 +108,10 @@ def _init_storage( max_concurrent_semantic: int = 100, ) -> None: """Initialize storage resources.""" - if config.agfs.backend == "local": - self._agfs_manager = AGFSManager(config=config.agfs) - self._agfs_manager.start() - self._agfs_url = self._agfs_manager.url - config.agfs.url = self._agfs_url - else: - self._agfs_url = config.agfs.url + self._agfs_manager = AGFSManager(config=config.agfs) + self._agfs_manager.start() + self._agfs_url = self._agfs_manager.url + config.agfs.url = self._agfs_url # Initialize QueueManager if self._agfs_url: @@ -234,9 +233,15 @@ async def initialize(self) -> None: # Initialize directories directory_initializer = DirectoryInitializer(vikingdb=self._vikingdb_manager) - await directory_initializer.initialize_all() - count = await directory_initializer.initialize_user_directories() - logger.info(f"Initialized {count} directories for user scope") + self._directory_initializer = directory_initializer + default_ctx = RequestContext(user=self._user, role=Role.ROOT) + account_count = await directory_initializer.initialize_account_directories(default_ctx) + user_count = await directory_initializer.initialize_user_directories(default_ctx) + logger.info( + "Initialized preset directories account=%d user=%d", + account_count, + user_count, + ) # Initialize processors self._resource_processor = ResourceProcessor(vikingdb=self._vikingdb_manager) @@ -258,13 +263,11 @@ async def initialize(self) -> None: viking_fs=self._viking_fs, resource_processor=self._resource_processor, skill_processor=self._skill_processor, - user=self.user, ) self._session_service.set_dependencies( vikingdb=self._vikingdb_manager, viking_fs=self._viking_fs, session_compressor=self._session_compressor, - user=self.user, ) self._debug_service.set_dependencies( vikingdb=self._vikingdb_manager, @@ -297,6 +300,7 @@ async def close(self) -> None: self._resource_processor = None self._skill_processor = None 
self._session_compressor = None + self._directory_initializer = None self._initialized = False logger.info("OpenVikingService closed") @@ -305,3 +309,24 @@ def _ensure_initialized(self) -> None: """Ensure service is initialized.""" if not self._initialized: raise NotInitializedError("OpenVikingService") + + async def initialize_account_directories(self, ctx: RequestContext) -> int: + """Initialize account-shared preset roots.""" + self._ensure_initialized() + if not self._directory_initializer: + return 0 + return await self._directory_initializer.initialize_account_directories(ctx) + + async def initialize_user_directories(self, ctx: RequestContext) -> int: + """Initialize current user's directory tree.""" + self._ensure_initialized() + if not self._directory_initializer: + return 0 + return await self._directory_initializer.initialize_user_directories(ctx) + + async def initialize_agent_directories(self, ctx: RequestContext) -> int: + """Initialize current user's current-agent directory tree.""" + self._ensure_initialized() + if not self._directory_initializer: + return 0 + return await self._directory_initializer.initialize_agent_directories(ctx) diff --git a/openviking/service/fs_service.py b/openviking/service/fs_service.py index 4f5dcc45..82b1f194 100644 --- a/openviking/service/fs_service.py +++ b/openviking/service/fs_service.py @@ -8,6 +8,7 @@ from typing import Any, Dict, List, Optional +from openviking.server.identity import RequestContext from openviking.storage.viking_fs import VikingFS from openviking_cli.exceptions import NotInitializedError from openviking_cli.utils import get_logger @@ -34,6 +35,7 @@ def _ensure_initialized(self) -> VikingFS: async def ls( self, uri: str, + ctx: RequestContext, recursive: bool = False, simple: bool = False, output: str = "original", @@ -59,19 +61,21 @@ async def ls( if recursive: entries = await viking_fs.tree( uri, + ctx=ctx, output="original", show_all_hidden=show_all_hidden, node_limit=node_limit, ) else: 
entries = await viking_fs.ls( - uri, output="original", show_all_hidden=show_all_hidden + uri, ctx=ctx, output="original", show_all_hidden=show_all_hidden ) return [e.get("uri", "") for e in entries] if recursive: entries = await viking_fs.tree( uri, + ctx=ctx, output=output, abs_limit=abs_limit, show_all_hidden=show_all_hidden, @@ -79,28 +83,33 @@ async def ls( ) else: entries = await viking_fs.ls( - uri, output=output, abs_limit=abs_limit, show_all_hidden=show_all_hidden + uri, + ctx=ctx, + output=output, + abs_limit=abs_limit, + show_all_hidden=show_all_hidden, ) return entries - async def mkdir(self, uri: str) -> None: + async def mkdir(self, uri: str, ctx: RequestContext) -> None: """Create directory.""" viking_fs = self._ensure_initialized() - await viking_fs.mkdir(uri) + await viking_fs.mkdir(uri, ctx=ctx) - async def rm(self, uri: str, recursive: bool = False) -> None: + async def rm(self, uri: str, ctx: RequestContext, recursive: bool = False) -> None: """Remove resource.""" viking_fs = self._ensure_initialized() - await viking_fs.rm(uri, recursive=recursive) + await viking_fs.rm(uri, recursive=recursive, ctx=ctx) - async def mv(self, from_uri: str, to_uri: str) -> None: + async def mv(self, from_uri: str, to_uri: str, ctx: RequestContext) -> None: """Move resource.""" viking_fs = self._ensure_initialized() - await viking_fs.mv(from_uri, to_uri) + await viking_fs.mv(from_uri, to_uri, ctx=ctx) async def tree( self, uri: str, + ctx: RequestContext, output: str = "original", abs_limit: int = 128, show_all_hidden: bool = False, @@ -110,38 +119,41 @@ async def tree( viking_fs = self._ensure_initialized() return await viking_fs.tree( uri, + ctx=ctx, output=output, abs_limit=abs_limit, show_all_hidden=show_all_hidden, node_limit=node_limit, ) - async def stat(self, uri: str) -> Dict[str, Any]: + async def stat(self, uri: str, ctx: RequestContext) -> Dict[str, Any]: """Get resource status.""" viking_fs = self._ensure_initialized() - return await 
viking_fs.stat(uri) + return await viking_fs.stat(uri, ctx=ctx) - async def read(self, uri: str, offset: int = 0, limit: int = -1) -> str: + async def read(self, uri: str, ctx: RequestContext, offset: int = 0, limit: int = -1) -> str: """Read file content.""" viking_fs = self._ensure_initialized() - return await viking_fs.read_file(uri, offset=offset, limit=limit) + return await viking_fs.read_file(uri, offset=offset, limit=limit, ctx=ctx) - async def abstract(self, uri: str) -> str: + async def abstract(self, uri: str, ctx: RequestContext) -> str: """Read L0 abstract (.abstract.md).""" viking_fs = self._ensure_initialized() - return await viking_fs.abstract(uri) + return await viking_fs.abstract(uri, ctx=ctx) - async def overview(self, uri: str) -> str: + async def overview(self, uri: str, ctx: RequestContext) -> str: """Read L1 overview (.overview.md).""" viking_fs = self._ensure_initialized() - return await viking_fs.overview(uri) + return await viking_fs.overview(uri, ctx=ctx) - async def grep(self, uri: str, pattern: str, case_insensitive: bool = False) -> Dict: + async def grep( + self, uri: str, pattern: str, ctx: RequestContext, case_insensitive: bool = False + ) -> Dict: """Content search.""" viking_fs = self._ensure_initialized() - return await viking_fs.grep(uri, pattern, case_insensitive=case_insensitive) + return await viking_fs.grep(uri, pattern, case_insensitive=case_insensitive, ctx=ctx) - async def glob(self, pattern: str, uri: str = "viking://") -> Dict: + async def glob(self, pattern: str, ctx: RequestContext, uri: str = "viking://") -> Dict: """File pattern matching.""" viking_fs = self._ensure_initialized() - return await viking_fs.glob(pattern, uri=uri) + return await viking_fs.glob(pattern, uri=uri, ctx=ctx) diff --git a/openviking/service/pack_service.py b/openviking/service/pack_service.py index fb8a8796..b7a7d4bc 100644 --- a/openviking/service/pack_service.py +++ b/openviking/service/pack_service.py @@ -8,6 +8,7 @@ from typing import 
Optional +from openviking.server.identity import RequestContext from openviking.storage.local_fs import export_ovpack as local_export_ovpack from openviking.storage.local_fs import import_ovpack as local_import_ovpack from openviking.storage.viking_fs import VikingFS @@ -33,7 +34,7 @@ def _ensure_initialized(self) -> VikingFS: raise NotInitializedError("VikingFS") return self._viking_fs - async def export_ovpack(self, uri: str, to: str) -> str: + async def export_ovpack(self, uri: str, to: str, ctx: RequestContext) -> str: """Export specified context path as .ovpack file. Args: @@ -44,10 +45,15 @@ async def export_ovpack(self, uri: str, to: str) -> str: Exported file path """ viking_fs = self._ensure_initialized() - return await local_export_ovpack(viking_fs, uri, to) + return await local_export_ovpack(viking_fs, uri, to, ctx=ctx) async def import_ovpack( - self, file_path: str, parent: str, force: bool = False, vectorize: bool = True + self, + file_path: str, + parent: str, + ctx: RequestContext, + force: bool = False, + vectorize: bool = True, ) -> str: """Import local .ovpack file to specified parent path. 
@@ -62,5 +68,5 @@ async def import_ovpack( """ viking_fs = self._ensure_initialized() return await local_import_ovpack( - viking_fs, file_path, parent, force=force, vectorize=vectorize + viking_fs, file_path, parent, force=force, vectorize=vectorize, ctx=ctx ) diff --git a/openviking/service/relation_service.py b/openviking/service/relation_service.py index 22ab9bb4..cb94451d 100644 --- a/openviking/service/relation_service.py +++ b/openviking/service/relation_service.py @@ -8,6 +8,7 @@ from typing import Any, Dict, List, Optional, Union +from openviking.server.identity import RequestContext from openviking.storage.viking_fs import VikingFS from openviking_cli.exceptions import NotInitializedError from openviking_cli.utils import get_logger @@ -31,12 +32,18 @@ def _ensure_initialized(self) -> VikingFS: raise NotInitializedError("VikingFS") return self._viking_fs - async def relations(self, uri: str) -> List[Dict[str, Any]]: + async def relations(self, uri: str, ctx: RequestContext) -> List[Dict[str, Any]]: """Get relations (returns [{"uri": "...", "reason": "..."}, ...]).""" viking_fs = self._ensure_initialized() - return await viking_fs.relations(uri) + return await viking_fs.relations(uri, ctx=ctx) - async def link(self, from_uri: str, uris: Union[str, List[str]], reason: str = "") -> None: + async def link( + self, + from_uri: str, + uris: Union[str, List[str]], + ctx: RequestContext, + reason: str = "", + ) -> None: """Create link (single or multiple). Args: @@ -45,9 +52,9 @@ async def link(self, from_uri: str, uris: Union[str, List[str]], reason: str = " reason: Reason for linking """ viking_fs = self._ensure_initialized() - await viking_fs.link(from_uri, uris, reason) + await viking_fs.link(from_uri, uris, reason, ctx=ctx) - async def unlink(self, from_uri: str, uri: str) -> None: + async def unlink(self, from_uri: str, uri: str, ctx: RequestContext) -> None: """Remove link (remove specified URI from uris). 
Args: @@ -55,4 +62,4 @@ async def unlink(self, from_uri: str, uri: str) -> None: uri: Target URI to remove """ viking_fs = self._ensure_initialized() - await viking_fs.unlink(from_uri, uri) + await viking_fs.unlink(from_uri, uri, ctx=ctx) diff --git a/openviking/service/resource_service.py b/openviking/service/resource_service.py index b8a5026e..9f078580 100644 --- a/openviking/service/resource_service.py +++ b/openviking/service/resource_service.py @@ -8,6 +8,7 @@ from typing import Any, Dict, Optional +from openviking.server.identity import RequestContext from openviking.storage import VikingDBManager from openviking.storage.queuefs import get_queue_manager from openviking.storage.viking_fs import VikingFS @@ -18,7 +19,6 @@ InvalidArgumentError, NotInitializedError, ) -from openviking_cli.session.user_id import UserIdentifier from openviking_cli.utils import get_logger from openviking_cli.utils.uri import VikingURI @@ -34,13 +34,11 @@ def __init__( viking_fs: Optional[VikingFS] = None, resource_processor: Optional[ResourceProcessor] = None, skill_processor: Optional[SkillProcessor] = None, - user: Optional[UserIdentifier] = None, ): self._vikingdb = vikingdb self._viking_fs = viking_fs self._resource_processor = resource_processor self._skill_processor = skill_processor - self._user = user def set_dependencies( self, @@ -48,14 +46,12 @@ def set_dependencies( viking_fs: VikingFS, resource_processor: ResourceProcessor, skill_processor: SkillProcessor, - user: Optional[UserIdentifier] = None, ) -> None: """Set dependencies (for deferred initialization).""" self._vikingdb = vikingdb self._viking_fs = viking_fs self._resource_processor = resource_processor self._skill_processor = skill_processor - self._user = user def _ensure_initialized(self) -> None: """Ensure all dependencies are initialized.""" @@ -69,6 +65,7 @@ def _ensure_initialized(self) -> None: async def add_resource( self, path: str, + ctx: RequestContext, target: Optional[str] = None, reason: str = "", 
instruction: str = "", @@ -104,6 +101,7 @@ async def add_resource( result = await self._resource_processor.process_resource( path=path, + ctx=ctx, reason=reason, instruction=instruction, scope="resources", @@ -131,6 +129,7 @@ async def add_resource( async def add_skill( self, data: Any, + ctx: RequestContext, wait: bool = False, timeout: Optional[float] = None, ) -> Dict[str, Any]: @@ -149,7 +148,7 @@ async def add_skill( result = await self._skill_processor.process_skill( data=data, viking_fs=self._viking_fs, - user=self._user, + ctx=ctx, ) if wait: diff --git a/openviking/service/search_service.py b/openviking/service/search_service.py index 0b88746e..7dc13ca9 100644 --- a/openviking/service/search_service.py +++ b/openviking/service/search_service.py @@ -8,6 +8,7 @@ from typing import TYPE_CHECKING, Any, Dict, Optional +from openviking.server.identity import RequestContext from openviking.storage.viking_fs import VikingFS from openviking_cli.exceptions import NotInitializedError from openviking_cli.utils import get_logger @@ -37,6 +38,7 @@ def _ensure_initialized(self) -> VikingFS: async def search( self, query: str, + ctx: RequestContext, target_uri: str = "", session: Optional["Session"] = None, limit: int = 10, @@ -64,6 +66,7 @@ async def search( return await viking_fs.search( query=query, + ctx=ctx, target_uri=target_uri, session_info=session_info, limit=limit, @@ -74,6 +77,7 @@ async def search( async def find( self, query: str, + ctx: RequestContext, target_uri: str = "", limit: int = 10, score_threshold: Optional[float] = None, @@ -94,6 +98,7 @@ async def find( viking_fs = self._ensure_initialized() return await viking_fs.find( query=query, + ctx=ctx, target_uri=target_uri, limit=limit, score_threshold=score_threshold, diff --git a/openviking/service/session_service.py b/openviking/service/session_service.py index fb9de26c..74b16514 100644 --- a/openviking/service/session_service.py +++ b/openviking/service/session_service.py @@ -8,12 +8,12 @@ from typing 
import Any, Dict, List, Optional +from openviking.server.identity import RequestContext from openviking.session import Session from openviking.session.compressor import SessionCompressor from openviking.storage import VikingDBManager from openviking.storage.viking_fs import VikingFS from openviking_cli.exceptions import NotFoundError, NotInitializedError -from openviking_cli.session.user_id import UserIdentifier from openviking_cli.utils import get_logger logger = get_logger(__name__) @@ -27,32 +27,28 @@ def __init__( vikingdb: Optional[VikingDBManager] = None, viking_fs: Optional[VikingFS] = None, session_compressor: Optional[SessionCompressor] = None, - user: Optional[UserIdentifier] = None, ): self._vikingdb = vikingdb self._viking_fs = viking_fs self._session_compressor = session_compressor - self._user = user or UserIdentifier.the_default_user() def set_dependencies( self, vikingdb: VikingDBManager, viking_fs: VikingFS, session_compressor: SessionCompressor, - user: Optional[UserIdentifier] = None, ) -> None: """Set dependencies (for deferred initialization).""" self._vikingdb = vikingdb self._viking_fs = viking_fs self._session_compressor = session_compressor - self._user = user or UserIdentifier.the_default_user() def _ensure_initialized(self) -> None: """Ensure all dependencies are initialized.""" if not self._viking_fs: raise NotInitializedError("VikingFS") - def session(self, session_id: Optional[str] = None) -> Session: + def session(self, ctx: RequestContext, session_id: Optional[str] = None) -> Session: """Create a new session or load an existing one. 
Args: @@ -66,21 +62,39 @@ def session(self, session_id: Optional[str] = None) -> Session: viking_fs=self._viking_fs, vikingdb_manager=self._vikingdb, session_compressor=self._session_compressor, - user=self._user, + user=ctx.user, + ctx=ctx, session_id=session_id, ) - async def sessions(self) -> List[Dict[str, Any]]: + async def create(self, ctx: RequestContext) -> Session: + """Create a session and persist its root path.""" + session = self.session(ctx) + await session.ensure_exists() + return session + + async def get(self, session_id: str, ctx: RequestContext) -> Session: + """Get an existing session. + + Raises NotFoundError when the session does not exist under current user scope. + """ + session = self.session(ctx, session_id) + if not await session.exists(): + raise NotFoundError(session_id, "session") + await session.load() + return session + + async def sessions(self, ctx: RequestContext) -> List[Dict[str, Any]]: """Get all sessions for the current user. Returns: List of session info dicts """ self._ensure_initialized() - session_base_uri = "viking://session" + session_base_uri = f"viking://session/{ctx.user.user_space_name()}" try: - entries = await self._viking_fs.ls(session_base_uri) + entries = await self._viking_fs.ls(session_base_uri, ctx=ctx) sessions = [] for entry in entries: name = entry.get("name", "") @@ -97,7 +111,7 @@ async def sessions(self) -> List[Dict[str, Any]]: except Exception: return [] - async def delete(self, session_id: str) -> bool: + async def delete(self, session_id: str, ctx: RequestContext) -> bool: """Delete a session. 
Args: @@ -107,17 +121,17 @@ async def delete(self, session_id: str) -> bool: True if deleted successfully """ self._ensure_initialized() - session_uri = f"viking://session/{session_id}" + session_uri = f"viking://session/{ctx.user.user_space_name()}/{session_id}" try: - await self._viking_fs.rm(session_uri, recursive=True) + await self._viking_fs.rm(session_uri, recursive=True, ctx=ctx) logger.info(f"Deleted session: {session_id}") return True except Exception as e: logger.error(f"Failed to delete session {session_id}: {e}") raise NotFoundError(session_id, "session") - async def commit(self, session_id: str) -> Dict[str, Any]: + async def commit(self, session_id: str, ctx: RequestContext) -> Dict[str, Any]: """Commit a session (archive messages and extract memories). Args: @@ -127,11 +141,10 @@ async def commit(self, session_id: str) -> Dict[str, Any]: Commit result """ self._ensure_initialized() - session = self.session(session_id) - await session.load() + session = await self.get(session_id, ctx) return session.commit() - async def extract(self, session_id: str) -> List[Any]: + async def extract(self, session_id: str, ctx: RequestContext) -> List[Any]: """Extract memories from a session. 
    Args:
@@ -144,11 +157,11 @@ async def extract(self, session_id: str) -> List[Any]:
         if not self._session_compressor:
             raise NotInitializedError("SessionCompressor")
 
-        session = self.session(session_id)
-        await session.load()
+        session = await self.get(session_id, ctx)
         return await self._session_compressor.extract_long_term_memories(
             messages=session.messages,
-            user=self._user,
+            user=ctx.user,
             session_id=session_id,
+            ctx=ctx,
         )
diff --git a/openviking/session/compressor.py b/openviking/session/compressor.py
index 2929cf03..4007ecc0 100644
--- a/openviking/session/compressor.py
+++ b/openviking/session/compressor.py
@@ -12,6 +12,7 @@
 from openviking.core.context import Context, Vectorize
 from openviking.message import Message
+from openviking.server.identity import RequestContext
 from openviking.storage import VikingDBManager
 from openviking.storage.viking_fs import get_viking_fs
 from openviking_cli.session.user_id import UserIdentifier
@@ -69,10 +70,11 @@ async def _merge_into_existing(
         candidate: CandidateMemory,
         target_memory: Context,
         viking_fs,
+        ctx: RequestContext,
     ) -> bool:
         """Merge candidate content into an existing memory file."""
         try:
-            existing_content = await viking_fs.read_file(target_memory.uri)
+            existing_content = await viking_fs.read_file(target_memory.uri, ctx=ctx)
             payload = await self.extractor._merge_memory_bundle(
                 existing_abstract=target_memory.abstract,
                 existing_overview=(target_memory.meta or {}).get("overview") or "",
@@ -86,7 +88,7 @@ async def _merge_into_existing(
             if not payload:
                 return False
 
-            await viking_fs.write_file(target_memory.uri, payload.content)
+            await viking_fs.write_file(target_memory.uri, payload.content, ctx=ctx)
             target_memory.abstract = payload.abstract
             target_memory.meta = {**(target_memory.meta or {}), "overview": payload.overview}
             logger.info(
@@ -99,10 +101,12 @@ async def _merge_into_existing(
             logger.error(f"Failed to merge memory {target_memory.uri}: {e}")
             return False
 
-    async def _delete_existing_memory(self, memory: Context, viking_fs) -> bool:
+    async def _delete_existing_memory(
+        self, memory: Context, viking_fs, ctx: RequestContext
+    ) -> bool:
         """Hard delete an existing memory file and clean up its vector record."""
         try:
-            await viking_fs.rm(memory.uri, recursive=False)
+            await viking_fs.rm(memory.uri, recursive=False, ctx=ctx)
         except Exception as e:
             logger.error(f"Failed to delete memory file {memory.uri}: {e}")
             return False
@@ -119,12 +123,16 @@ async def extract_long_term_memories(
         messages: List[Message],
         user: Optional["UserIdentifier"] = None,
         session_id: Optional[str] = None,
+        ctx: Optional[RequestContext] = None,
     ) -> List[Context]:
         """Extract long-term memories from messages."""
         if not messages:
             return []
 
         context = {"messages": messages}
 
+        if not ctx:
+            return []
+
         candidates = await self.extractor.extract(context, user, session_id)
 
         if not candidates:
@@ -137,7 +145,7 @@ async def extract_long_term_memories(
         for candidate in candidates:
             # Profile: skip dedup, always merge
             if candidate.category in ALWAYS_MERGE_CATEGORIES:
-                memory = await self.extractor.create_memory(candidate, user, session_id)
+                memory = await self.extractor.create_memory(candidate, user, session_id, ctx=ctx)
                 if memory:
                     memories.append(memory)
                     stats.created += 1
@@ -173,14 +181,16 @@ async def extract_long_term_memories(
             for action in actions:
                 if action.decision == MemoryActionDecision.DELETE:
                     if viking_fs and await self._delete_existing_memory(
-                        action.memory, viking_fs
+                        action.memory, viking_fs, ctx=ctx
                     ):
                         stats.deleted += 1
                     else:
                         stats.skipped += 1
                 elif action.decision == MemoryActionDecision.MERGE:
                     if candidate.category in MERGE_SUPPORTED_CATEGORIES and viking_fs:
-                        if await self._merge_into_existing(candidate, action.memory, viking_fs):
+                        if await self._merge_into_existing(
+                            candidate, action.memory, viking_fs, ctx=ctx
+                        ):
                             stats.merged += 1
                         else:
                             stats.skipped += 1
@@ -194,13 +204,13 @@ async def extract_long_term_memories(
             for action in actions:
                 if action.decision == MemoryActionDecision.DELETE:
                     if viking_fs and await self._delete_existing_memory(
-                        action.memory, viking_fs
+                        action.memory, viking_fs, ctx=ctx
                     ):
                         stats.deleted += 1
                     else:
                         stats.skipped += 1
 
-            memory = await self.extractor.create_memory(candidate, user, session_id)
+            memory = await self.extractor.create_memory(candidate, user, session_id, ctx=ctx)
             if memory:
                 memories.append(memory)
                 stats.created += 1
@@ -211,7 +221,7 @@ async def extract_long_term_memories(
         # Extract URIs used in messages, create relations
         used_uris = self._extract_used_uris(messages)
         if used_uris and memories:
-            await self._create_relations(memories, used_uris)
+            await self._create_relations(memories, used_uris, ctx=ctx)
 
         logger.info(
             f"Memory extraction: created={stats.created}, "
@@ -238,6 +248,7 @@ async def _create_relations(
         self,
         memories: List[Context],
         used_uris: Dict[str, List[str]],
+        ctx: RequestContext,
     ) -> None:
         """Create bidirectional relations between memories and resources/skills."""
         viking_fs = get_viking_fs()
@@ -256,21 +267,25 @@ async def _create_relations(
                         memory_uri,
                         resource_uris,
                         reason="Memory extracted from session using these resources",
+                        ctx=ctx,
                     )
                 if skill_uris:
                     await viking_fs.link(
                         memory_uri,
                         skill_uris,
                         reason="Memory extracted from session calling these skills",
+                        ctx=ctx,
                     )
 
             # Resources/skills -> memories (reverse)
             for resource_uri in resource_uris:
                 await viking_fs.link(
-                    resource_uri, memory_uris, reason="Referenced by these memories"
+                    resource_uri, memory_uris, reason="Referenced by these memories", ctx=ctx
                 )
             for skill_uri in skill_uris:
-                await viking_fs.link(skill_uri, memory_uris, reason="Called by these memories")
+                await viking_fs.link(
+                    skill_uri, memory_uris, reason="Called by these memories", ctx=ctx
+                )
 
             logger.info(f"Created bidirectional relations for {len(memories)} memories")
         except Exception as e:
diff --git a/openviking/session/memory_deduplicator.py b/openviking/session/memory_deduplicator.py
index 99f2cf8a..63b2e669 100644
--- a/openviking/session/memory_deduplicator.py
+++ b/openviking/session/memory_deduplicator.py
@@ -64,13 +64,18 @@ class MemoryDeduplicator:
     SIMILARITY_THRESHOLD = 0.0  # Vector similarity threshold for pre-filtering
     MAX_PROMPT_SIMILAR_MEMORIES = 5  # Number of similar memories sent to LLM
 
-    CATEGORY_URI_PREFIX = {
-        "preferences": "viking://user/memories/preferences/",
-        "entities": "viking://user/memories/entities/",
-        "events": "viking://user/memories/events/",
-        "cases": "viking://agent/memories/cases/",
-        "patterns": "viking://agent/memories/patterns/",
-    }
+
+    _USER_CATEGORIES = {"preferences", "entities", "events"}
+    _AGENT_CATEGORIES = {"cases", "patterns"}
+
+    @staticmethod
+    def _category_uri_prefix(category: str, user) -> str:
+        """Build category URI prefix with space segment."""
+        if category in MemoryDeduplicator._USER_CATEGORIES:
+            return f"viking://user/{user.user_space_name()}/memories/{category}/"
+        elif category in MemoryDeduplicator._AGENT_CATEGORIES:
+            return f"viking://agent/{user.agent_space_name()}/memories/{category}/"
+        return ""
 
     def __init__(
         self,
@@ -125,13 +130,23 @@ async def _find_similar_memories(
         # Determine collection and filter based on category
         collection = "context"
-        category_uri_prefix = self.CATEGORY_URI_PREFIX.get(candidate.category.value, "")
+        category_uri_prefix = self._category_uri_prefix(candidate.category.value, candidate.user)
 
         # Build filter by memory scope + uri prefix (schema does not have category field yet).
         filter_conds = [
             {"field": "context_type", "op": "must", "conds": ["memory"]},
             {"field": "is_leaf", "op": "must", "conds": [True]},
         ]
+        owner = candidate.user
+        if hasattr(owner, "account_id"):
+            filter_conds.append({"field": "account_id", "op": "must", "conds": [owner.account_id]})
+        if owner and hasattr(owner, "user_space_name"):
+            owner_space = (
+                owner.agent_space_name()
+                if candidate.category.value in {"cases", "patterns"}
+                else owner.user_space_name()
+            )
+            filter_conds.append({"field": "owner_space", "op": "must", "conds": [owner_space]})
         if category_uri_prefix:
             filter_conds.append({"field": "uri", "op": "prefix", "prefix": category_uri_prefix})
 
         dedup_filter = {"op": "and", "conds": filter_conds}
diff --git a/openviking/session/memory_extractor.py b/openviking/session/memory_extractor.py
index be13b0f6..16349935 100644
--- a/openviking/session/memory_extractor.py
+++ b/openviking/session/memory_extractor.py
@@ -16,6 +16,7 @@
 from openviking.core.context import Context, ContextType, Vectorize
 from openviking.prompts import render_prompt
+from openviking.server.identity import RequestContext
 from openviking.storage.viking_fs import get_viking_fs
 from openviking_cli.session.user_id import UserIdentifier
 from openviking_cli.utils import get_logger
@@ -74,9 +75,34 @@ class MemoryExtractor:
         MemoryCategory.PATTERNS: "memories/patterns",
     }
 
+    # Categories that belong to user space
+    _USER_CATEGORIES = {
+        MemoryCategory.PROFILE,
+        MemoryCategory.PREFERENCES,
+        MemoryCategory.ENTITIES,
+        MemoryCategory.EVENTS,
+    }
+
+    # Categories that belong to agent space
+    _AGENT_CATEGORIES = {
+        MemoryCategory.CASES,
+        MemoryCategory.PATTERNS,
+    }
+
     def __init__(self):
         """Initialize memory extractor."""
 
+    @staticmethod
+    def _get_owner_space(category: MemoryCategory, ctx: RequestContext) -> str:
+        """Derive owner_space from memory category.
+
+        PROFILE / PREFERENCES / ENTITIES / EVENTS → user_space
+        CASES / PATTERNS → agent_space
+        """
+        if category in MemoryExtractor._USER_CATEGORIES:
+            return ctx.user.user_space_name()
+        return ctx.user.agent_space_name()
+
     @staticmethod
     def _detect_output_language(messages: List, fallback_language: str = "en") -> str:
         """Detect dominant language from user messages only.
@@ -208,6 +234,7 @@ async def create_memory(
         candidate: CandidateMemory,
         user: str,
         session_id: str,
+        ctx: RequestContext,
     ) -> Optional[Context]:
         """Create Context object from candidate and persist to AGFS as .md file."""
         viking_fs = get_viking_fs()
@@ -215,35 +242,41 @@ async def create_memory(
             logger.warning("VikingFS not available, skipping memory creation")
             return None
 
+        owner_space = self._get_owner_space(candidate.category, ctx)
+
         # Special handling for profile: append to profile.md
         if candidate.category == MemoryCategory.PROFILE:
-            payload = await self._append_to_profile(candidate, viking_fs)
+            payload = await self._append_to_profile(candidate, viking_fs, ctx=ctx)
             if not payload:
                 return None
-            memory_uri = "viking://user/memories/profile.md"
+            user_space = ctx.user.user_space_name()
+            memory_uri = f"viking://user/{user_space}/memories/profile.md"
             memory = Context(
                 uri=memory_uri,
-                parent_uri="viking://user/memories",
+                parent_uri=f"viking://user/{user_space}/memories",
                 is_leaf=True,
                 abstract=payload.abstract,
                 context_type=ContextType.MEMORY.value,
                 category=candidate.category.value,
                 session_id=session_id,
                 user=user,
+                account_id=ctx.account_id,
+                owner_space=owner_space,
             )
             logger.info(f"uri {memory_uri} abstract: {payload.abstract} content: {payload.content}")
             memory.set_vectorize(Vectorize(text=payload.content))
             return memory
 
         # Determine parent URI based on category
+        cat_dir = self.CATEGORY_DIRS[candidate.category]
         if candidate.category in [
             MemoryCategory.PREFERENCES,
             MemoryCategory.ENTITIES,
             MemoryCategory.EVENTS,
         ]:
-            parent_uri = f"viking://user/{self.CATEGORY_DIRS[candidate.category]}"
+            parent_uri = f"viking://user/{ctx.user.user_space_name()}/{cat_dir}"
         else:  # CASES, PATTERNS
-            parent_uri = f"viking://agent/{self.CATEGORY_DIRS[candidate.category]}"
+            parent_uri = f"viking://agent/{ctx.user.agent_space_name()}/{cat_dir}"
 
         # Generate file URI (store directly as .md file, no directory creation)
         memory_id = f"mem_{str(uuid4())}"
@@ -251,7 +284,7 @@ async def create_memory(
         # Write to AGFS as single .md file
         try:
-            await viking_fs.write_file(memory_uri, candidate.content)
+            await viking_fs.write_file(memory_uri, candidate.content, ctx=ctx)
             logger.info(f"Created memory file: {memory_uri}")
         except Exception as e:
             logger.error(f"Failed to write memory to AGFS: {e}")
@@ -267,6 +300,8 @@ async def create_memory(
             category=candidate.category.value,
             session_id=session_id,
             user=user,
+            account_id=ctx.account_id,
+            owner_space=owner_space,
         )
         logger.info(f"uri {memory_uri} abstract: {candidate.abstract} content: {candidate.content}")
         memory.set_vectorize(Vectorize(text=candidate.content))
@@ -276,17 +311,18 @@ async def _append_to_profile(
         self,
         candidate: CandidateMemory,
         viking_fs,
+        ctx: RequestContext,
     ) -> Optional[MergedMemoryPayload]:
         """Update user profile - always merge with existing content."""
-        uri = "viking://user/memories/profile.md"
+        uri = f"viking://user/{ctx.user.user_space_name()}/memories/profile.md"
         existing = ""
         try:
-            existing = await viking_fs.read_file(uri) or ""
+            existing = await viking_fs.read_file(uri, ctx=ctx) or ""
         except Exception:
             pass
 
         if not existing.strip():
-            await viking_fs.write_file(uri=uri, content=candidate.content)
+            await viking_fs.write_file(uri=uri, content=candidate.content, ctx=ctx)
             logger.info(f"Created profile at {uri}")
             return MergedMemoryPayload(
                 abstract=candidate.abstract,
@@ -308,7 +344,7 @@ async def _append_to_profile(
         if not payload:
             logger.warning("Profile merge bundle failed; keeping existing profile unchanged")
             return None
-        await viking_fs.write_file(uri=uri, content=payload.content)
+        await viking_fs.write_file(uri=uri, content=payload.content, ctx=ctx)
         logger.info(f"Merged profile info to {uri}")
         return payload
diff --git a/openviking/session/session.py b/openviking/session/session.py
index 401f3ea4..87dc7b11 100644
--- a/openviking/session/session.py
+++ b/openviking/session/session.py
@@ -13,6 +13,7 @@
 from uuid import uuid4
 
 from openviking.message import Message, Part
+from openviking.server.identity import RequestContext, Role
 from openviking.utils.time_utils import get_current_timestamp
 from openviking_cli.session.user_id import UserIdentifier
 from openviking_cli.utils import get_logger, run_async
@@ -70,6 +71,7 @@ def __init__(
         vikingdb_manager: Optional["VikingDBManager"] = None,
         session_compressor: Optional["SessionCompressor"] = None,
         user: Optional["UserIdentifier"] = None,
+        ctx: Optional[RequestContext] = None,
         session_id: Optional[str] = None,
         auto_commit_threshold: int = 8000,
     ):
@@ -77,10 +79,11 @@ def __init__(
         self._vikingdb_manager = vikingdb_manager
         self._session_compressor = session_compressor
         self.user = user or UserIdentifier.the_default_user()
+        self.ctx = ctx or RequestContext(user=self.user, role=Role.ROOT)
         self.session_id = session_id or str(uuid4())
         self.created_at = datetime.now()
         self._auto_commit_threshold = auto_commit_threshold
-        self._session_uri = f"viking://session/{self.session_id}"
+        self._session_uri = f"viking://session/{self.user.user_space_name()}/{self.session_id}"
         self._messages: List[Message] = []
         self._usage_records: List[Usage] = []
@@ -96,7 +99,9 @@ async def load(self):
             return
 
         try:
-            content = await self._viking_fs.read_file(f"{self._session_uri}/messages.jsonl")
+            content = await self._viking_fs.read_file(
+                f"{self._session_uri}/messages.jsonl", ctx=self.ctx
+            )
             self._messages = [
                 Message.from_dict(json.loads(line))
                 for line in content.strip().split("\n")
@@ -108,7 +113,7 @@ async def load(self):
         # Restore compression_index (scan history directory)
         try:
-            history_items = await self._viking_fs.ls(f"{self._session_uri}/history")
+            history_items = await self._viking_fs.ls(f"{self._session_uri}/history", ctx=self.ctx)
             archives = [
                 item["name"] for item in history_items if item["name"].startswith("archive_")
             ]
@@ -122,6 +127,21 @@ async def load(self):
 
         self._loaded = True
 
+    async def exists(self) -> bool:
+        """Check whether this session already exists in storage."""
+        try:
+            await self._viking_fs.stat(self._session_uri, ctx=self.ctx)
+            return True
+        except Exception:
+            return False
+
+    async def ensure_exists(self) -> None:
+        """Materialize session root and messages file if missing."""
+        if await self.exists():
+            return
+        await self._viking_fs.mkdir(self._session_uri, exist_ok=True, ctx=self.ctx)
+        await self._viking_fs.write_file(f"{self._session_uri}/messages.jsonl", "", ctx=self.ctx)
+
     @property
     def messages(self) -> List[Message]:
         """Get message list."""
@@ -244,6 +264,7 @@ def commit(self) -> Dict[str, Any]:
                 messages=messages_to_archive,
                 user=self.user,
                 session_id=self.session_id,
+                ctx=self.ctx,
             )
         )
         logger.info(f"Extracted {len(memories)} memories")
@@ -319,7 +340,9 @@ async def get_context_for_search(
         summaries = []
         if self.compression.compression_index > 0:
             try:
-                history_items = await self._viking_fs.ls(f"{self._session_uri}/history")
+                history_items = await self._viking_fs.ls(
+                    f"{self._session_uri}/history", ctx=self.ctx
+                )
                 query_lower = query.lower()
 
                 # Collect all archives with relevance scores
@@ -329,7 +352,7 @@ async def get_context_for_search(
                     if name and name.startswith("archive_"):
                         overview_uri = f"{self._session_uri}/history/{name}/.overview.md"
                         try:
-                            overview = await self._viking_fs.read_file(overview_uri)
+                            overview = await self._viking_fs.read_file(overview_uri, ctx=self.ctx)
                             # Calculate relevance by keyword matching
                             score = 0
                             if query_lower in overview.lower():
@@ -409,11 +432,16 @@ def _write_archive(
             viking_fs.write_file(
                 uri=f"{archive_uri}/messages.jsonl",
                 content="\n".join(lines) + "\n",
+                ctx=self.ctx,
             )
         )
-        run_async(viking_fs.write_file(uri=f"{archive_uri}/.abstract.md", content=abstract))
-        run_async(viking_fs.write_file(uri=f"{archive_uri}/.overview.md", content=overview))
+        run_async(
+            viking_fs.write_file(uri=f"{archive_uri}/.abstract.md", content=abstract, ctx=self.ctx)
+        )
+        run_async(
+            viking_fs.write_file(uri=f"{archive_uri}/.overview.md", content=overview, ctx=self.ctx)
+        )
 
         logger.debug(f"Written archive: {archive_uri}")
@@ -435,6 +463,7 @@ def _write_to_agfs(self, messages: List[Message]) -> None:
             viking_fs.write_file(
                 uri=f"{self._session_uri}/messages.jsonl",
                 content=content,
+                ctx=self.ctx,
             )
         )
@@ -443,12 +472,14 @@ def _write_to_agfs(self, messages: List[Message]) -> None:
             viking_fs.write_file(
                 uri=f"{self._session_uri}/.abstract.md",
                 content=abstract,
+                ctx=self.ctx,
             )
         )
         run_async(
             viking_fs.write_file(
                 uri=f"{self._session_uri}/.overview.md",
                 content=overview,
+                ctx=self.ctx,
             )
         )
@@ -460,6 +491,7 @@ def _append_to_jsonl(self, msg: Message) -> None:
             self._viking_fs.append_file(
                 f"{self._session_uri}/messages.jsonl",
                 msg.to_jsonl() + "\n",
+                ctx=self.ctx,
             )
         )
@@ -474,6 +506,7 @@ def _update_message_in_jsonl(self) -> None:
             self._viking_fs.write_file(
                 f"{self._session_uri}/messages.jsonl",
                 content,
+                ctx=self.ctx,
             )
         )
@@ -505,6 +538,7 @@ def _save_tool_result(
             self._viking_fs.write_file(
                 f"{self._session_uri}/tools/{tool_id}/tool.json",
                 json.dumps(tool_data, ensure_ascii=False),
+                ctx=self.ctx,
             )
         )
@@ -548,7 +582,7 @@ def _write_relations(self) -> None:
         viking_fs = self._viking_fs
         for usage in self._usage_records:
             try:
-                run_async(viking_fs.link(self._session_uri, usage.uri))
+                run_async(viking_fs.link(self._session_uri, usage.uri, ctx=self.ctx))
                 logger.debug(f"Created relation: {self._session_uri} -> {usage.uri}")
             except Exception as e:
                 logger.warning(f"Failed to create relation to {usage.uri}: {e}")
diff --git a/openviking/storage/collection_schemas.py b/openviking/storage/collection_schemas.py
index 61b76d36..56b59cd9 100644
--- a/openviking/storage/collection_schemas.py
+++ b/openviking/storage/collection_schemas.py
@@ -75,6 +75,8 @@ def context_collection(name: str, vector_dim: int) -> Dict[str, Any]:
             {"FieldName": "description", "FieldType": "string"},
             {"FieldName": "tags", "FieldType": "string"},
             {"FieldName": "abstract", "FieldType": "string"},
+            {"FieldName": "account_id", "FieldType": "string"},
+            {"FieldName": "owner_space", "FieldType": "string"},
         ],
         "ScalarIndex": [
             "uri",
@@ -87,6 +89,8 @@ def context_collection(name: str, vector_dim: int) -> Dict[str, Any]:
             "level",
             "name",
             "tags",
+            "account_id",
+            "owner_space",
         ],
     }
@@ -198,7 +202,10 @@ async def on_dequeue(self, data: Optional[Dict[str, Any]]) -> Optional[Dict[str,
         # Ensure vector DB has at most one record per URI.
         uri = inserted_data.get("uri")
         if uri:
-            inserted_data["id"] = hashlib.md5(uri.encode("utf-8")).hexdigest()
+            account_id = inserted_data.get("account_id", "default")
+            owner_space = inserted_data.get("owner_space", "")
+            id_seed = f"{account_id}:{owner_space}:{uri}"
+            inserted_data["id"] = hashlib.md5(id_seed.encode("utf-8")).hexdigest()
 
         record_id = await self._vikingdb.insert(self._collection_name, inserted_data)
         if record_id:
diff --git a/openviking/storage/local_fs.py b/openviking/storage/local_fs.py
index 34acf023..cea4fbea 100644
--- a/openviking/storage/local_fs.py
+++ b/openviking/storage/local_fs.py
@@ -7,6 +7,7 @@
 from typing import cast
 
 from openviking.core.context import Context
+from openviking.server.identity import RequestContext
 from openviking.storage.queuefs import EmbeddingQueue, get_queue_manager
 from openviking.storage.queuefs.embedding_msg_converter import EmbeddingMsgConverter
 from openviking_cli.utils.logger import get_logger
@@ -61,14 +62,14 @@ def get_viking_rel_path_from_zip(zip_path: str) -> str:
 
 # TODO: Consider recursive vectorization
-async def _enqueue_direct_vectorization(viking_fs, uri: str) -> None:
+async def _enqueue_direct_vectorization(viking_fs, uri: str, ctx: RequestContext) -> None:
     queue_manager = get_queue_manager()
     embedding_queue = cast(
         EmbeddingQueue, queue_manager.get_queue(queue_manager.EMBEDDING, allow_create=True)
     )
 
     parent_uri = VikingURI(uri).parent.uri
-    abstract = await viking_fs.abstract(uri)
+    abstract = await viking_fs.abstract(uri, ctx=ctx)
     resource = Context(
         uri=uri,
         parent_uri=parent_uri,
@@ -77,6 +78,15 @@ async def _enqueue_direct_vectorization(
         created_at=datetime.now(),
         active_count=0,
         related_uri=[],
+        user=ctx.user,
+        account_id=ctx.account_id,
+        owner_space=(
+            ctx.user.agent_space_name()
+            if uri.startswith("viking://agent/")
+            else ctx.user.user_space_name()
+            if uri.startswith("viking://user/") or uri.startswith("viking://session/")
+            else ""
+        ),
         meta={"semantic_name": uri.split("/")[-1]},
     )
@@ -85,7 +95,12 @@
 
 async def import_ovpack(
-    viking_fs, file_path: str, parent: str, force: bool = False, vectorize: bool = True
+    viking_fs,
+    file_path: str,
+    parent: str,
+    ctx: RequestContext,
+    force: bool = False,
+    vectorize: bool = True,
 ) -> str:
     """
     Import .ovpack file to the specified parent path.
@@ -106,10 +121,10 @@ async def import_ovpack(
     parent = parent.strip().rstrip("/")
 
     try:
-        await viking_fs.stat(parent)
+        await viking_fs.stat(parent, ctx=ctx)
     except Exception:
         # Parent directory does not exist, create it
-        await viking_fs.mkdir(parent)
+        await viking_fs.mkdir(parent, ctx=ctx)
 
     with zipfile.ZipFile(file_path, "r") as zf:
         # 1. Get root directory name from ZIP and perform initial validation
@@ -127,7 +142,7 @@ async def import_ovpack(
         # 2. Conflict check
         try:
-            await viking_fs.ls(root_uri)
+            await viking_fs.ls(root_uri, ctx=ctx)
             if not force:
                 raise FileExistsError(
                     f"Resource already exists at {root_uri}. Use force=True to overwrite."
@@ -163,7 +178,7 @@ async def import_ovpack(
             if zip_path.endswith("/"):
                 rel_path = get_viking_rel_path_from_zip(zip_path.rstrip("/"))
                 target_dir_uri = f"{root_uri}/{rel_path}" if rel_path else root_uri
-                await viking_fs.mkdir(target_dir_uri, exist_ok=True)
+                await viking_fs.mkdir(target_dir_uri, exist_ok=True, ctx=ctx)
                 continue
 
             # Handle file entries
@@ -172,7 +187,7 @@ async def import_ovpack(
             try:
                 data = zf.read(zip_path)
-                await viking_fs.write_file_bytes(target_file_uri, data)
+                await viking_fs.write_file_bytes(target_file_uri, data, ctx=ctx)
             except Exception as e:
                 logger.error(f"Failed to import {zip_path} to {target_file_uri}: {e}")
                 if not force:  # In non-force mode, stop on error
@@ -181,13 +196,13 @@ async def import_ovpack(
     logger.info(f"[local_fs] Successfully imported {file_path} to {root_uri}")
 
     if vectorize:
-        await _enqueue_direct_vectorization(viking_fs, root_uri)
+        await _enqueue_direct_vectorization(viking_fs, root_uri, ctx=ctx)
         logger.info(f"[local_fs] Enqueued direct vectorization for: {root_uri}")
 
     return root_uri
 
 
-async def export_ovpack(viking_fs, uri: str, to: str) -> str:
+async def export_ovpack(viking_fs, uri: str, to: str, ctx: RequestContext) -> str:
     """
     Export the specified context path as a .ovpack file.
@@ -210,7 +225,7 @@ async def export_ovpack(viking_fs, uri: str, to: str, ctx: RequestContext) -> st
     ensure_dir_exists(to)
 
-    entries = await viking_fs.tree(uri, show_all_hidden=True)
+    entries = await viking_fs.tree(uri, show_all_hidden=True, ctx=ctx)
 
     with zipfile.ZipFile(to, "w", zipfile.ZIP_DEFLATED, allowZip64=True) as zf:
         # Write root directory entry
@@ -225,7 +240,7 @@ async def export_ovpack(
             else:
                 full_uri = f"{uri}/{rel_path}"
             try:
-                data = await viking_fs.read_file_bytes(full_uri)
+                data = await viking_fs.read_file_bytes(full_uri, ctx=ctx)
                 zf.writestr(zip_path, data)
             except Exception as e:
                 logger.warning(f"Failed to export file {full_uri}: {e}")
diff --git a/openviking/storage/queuefs/embedding_msg_converter.py b/openviking/storage/queuefs/embedding_msg_converter.py
index d105d5f6..d6f5b648 100644
--- a/openviking/storage/queuefs/embedding_msg_converter.py
+++ b/openviking/storage/queuefs/embedding_msg_converter.py
@@ -26,24 +26,44 @@ def from_context(context: Context, **kwargs) -> EmbeddingMsg:
         if not vectorization_text:
             return None
 
-        context_dict = context.to_dict()
+        context_data = context.to_dict()
 
-        # 根据 URI 判断 level 字段(用于向量索引)
-        uri = context_dict.get("uri", "")
+        # Backfill tenant fields for legacy writers that only set user/uri.
+        if not context_data.get("account_id"):
+            user = context_data.get("user") or {}
+            context_data["account_id"] = user.get("account_id", "default")
+        if not context_data.get("owner_space"):
+            user = context_data.get("user") or {}
+            uri = context_data.get("uri", "")
+            account = user.get("account_id", "default")
+            user_id = user.get("user_id", "default")
+            agent_id = user.get("agent_id", "default")
+            from openviking_cli.session.user_id import UserIdentifier
+
+            owner_user = UserIdentifier(account, user_id, agent_id)
+            if uri.startswith("viking://agent/"):
+                context_data["owner_space"] = owner_user.agent_space_name()
+            elif uri.startswith("viking://user/") or uri.startswith("viking://session/"):
+                context_data["owner_space"] = owner_user.user_space_name()
+            else:
+                context_data["owner_space"] = ""
+
+        # Derive level field from URI for hierarchical retrieval.
+        uri = context_data.get("uri", "")
         if uri.endswith("/.abstract.md"):
-            context_dict["level"] = ContextLevel.ABSTRACT
+            context_data["level"] = ContextLevel.ABSTRACT
         elif uri.endswith("/.overview.md"):
-            context_dict["level"] = ContextLevel.OVERVIEW
+            context_data["level"] = ContextLevel.OVERVIEW
         else:
-            context_dict["level"] = ContextLevel.DETAIL
+            context_data["level"] = ContextLevel.DETAIL
 
         embedding_msg = EmbeddingMsg(
             message=vectorization_text,
-            context_data=context_dict,
+            context_data=context_data,
         )
 
         # Set any additional fields from kwargs
         for key, value in kwargs.items():
-            if hasattr(embedding_msg.context_data, key) and value is not None:
-                setattr(embedding_msg.context_data, key, value)
+            if value is not None:
+                embedding_msg.context_data[key] = value
 
         return embedding_msg
diff --git a/openviking/storage/queuefs/semantic_dag.py b/openviking/storage/queuefs/semantic_dag.py
index 69967056..0307521f 100644
--- a/openviking/storage/queuefs/semantic_dag.py
+++ b/openviking/storage/queuefs/semantic_dag.py
@@ -6,6 +6,7 @@
 from dataclasses import dataclass, field
 from typing import Dict, List, Optional
 
+from openviking.server.identity import RequestContext
 from openviking.storage.viking_fs import get_viking_fs
 from openviking_cli.utils import VikingURI
 from openviking_cli.utils.logger import get_logger
@@ -41,10 +42,17 @@ class DagStats:
 class SemanticDagExecutor:
     """Execute semantic generation with DAG-style, event-driven lazy dispatch."""
 
-    def __init__(self, processor: "SemanticProcessor", context_type: str, max_concurrent_llm: int):
+    def __init__(
+        self,
+        processor: "SemanticProcessor",
+        context_type: str,
+        max_concurrent_llm: int,
+        ctx: RequestContext,
+    ):
         self._processor = processor
         self._context_type = context_type
         self._max_concurrent_llm = max_concurrent_llm
+        self._ctx = ctx
         self._llm_sem = asyncio.Semaphore(max_concurrent_llm)
         self._viking_fs = get_viking_fs()
         self._nodes: Dict[str, DirNode] = {}
@@ -112,7 +120,7 @@ async def _dispatch_dir(self, dir_uri: str, parent_uri: Optional[str]) -> None:
     async def _list_dir(self, uri: str) -> tuple[list[str], list[str]]:
         """List directory entries and return (child_dirs, file_paths)."""
         try:
-            entries = await self._viking_fs.ls(uri)
+            entries = await self._viking_fs.ls(uri, ctx=self._ctx)
         except Exception as e:
             logger.warning(f"Failed to list directory {uri}: {e}")
             return [], []
@@ -138,7 +146,7 @@ async def _file_summary_task(self, parent_uri: str, file_path: str) -> None:
         file_name = file_path.split("/")[-1]
         try:
             summary_dict = await self._processor._generate_single_file_summary(
-                file_path, llm_sem=self._llm_sem
+                file_path, llm_sem=self._llm_sem, ctx=self._ctx
             )
         except Exception as e:
             logger.warning(f"Failed to generate summary for {file_path}: {e}")
@@ -157,6 +165,7 @@ async def _file_summary_task(self, parent_uri: str, file_path: str) -> None:
                     context_type=self._context_type,
                     file_path=file_path,
                     summary_dict=summary_dict,
+                    ctx=self._ctx,
                 )
             )
         except Exception as e:
@@ -245,14 +254,14 @@ async def _overview_task(self, dir_uri: str) -> None:
         abstract = self._processor._extract_abstract_from_overview(overview)
 
         try:
-            await self._viking_fs.write_file(f"{dir_uri}/.overview.md", overview)
-            await self._viking_fs.write_file(f"{dir_uri}/.abstract.md", abstract)
+            await self._viking_fs.write_file(f"{dir_uri}/.overview.md", overview, ctx=self._ctx)
+            await self._viking_fs.write_file(f"{dir_uri}/.abstract.md", abstract, ctx=self._ctx)
         except Exception as e:
             logger.warning(f"Failed to write overview/abstract for {dir_uri}: {e}")
 
         try:
             await self._processor._vectorize_directory_simple(
-                dir_uri, self._context_type, abstract, overview
+                dir_uri, self._context_type, abstract, overview, ctx=self._ctx
             )
         except Exception as e:
             logger.error(f"Failed to vectorize directory {dir_uri}: {e}", exc_info=True)
diff --git a/openviking/storage/queuefs/semantic_msg.py b/openviking/storage/queuefs/semantic_msg.py
index 0fc46d09..4eb3ec89 100644
--- a/openviking/storage/queuefs/semantic_msg.py
+++ b/openviking/storage/queuefs/semantic_msg.py
@@ -31,17 +31,29 @@ class SemanticMsg:
     status: str = "pending"  # pending/processing/completed
     timestamp: int = int(datetime.now().timestamp())
     recursive: bool = True  # Whether to recursively process subdirectories
+    account_id: str = "default"
+    user_id: str = "default"
+    agent_id: str = "default"
+    role: str = "root"
 
     def __init__(
         self,
         uri: str,
         context_type: str,
        recursive: bool = True,
+        account_id: str = "default",
+        user_id: str = "default",
+        agent_id: str = "default",
+        role: str = "root",
     ):
         self.id = str(uuid4())
         self.uri = uri
         self.context_type = context_type
         self.recursive = recursive
+        self.account_id = account_id
+        self.user_id = user_id
+        self.agent_id = agent_id
+        self.role = role
 
     def to_dict(self) -> Dict[str, Any]:
         """Convert object to dictionary."""
@@ -72,6 +84,10 @@ def from_dict(cls, data: Dict[str, Any]) -> "SemanticMsg":
             uri=uri,
             context_type=context_type,
             recursive=data.get("recursive", True),
+            account_id=data.get("account_id", "default"),
+            user_id=data.get("user_id", "default"),
+            agent_id=data.get("agent_id", "default"),
+            role=data.get("role", "root"),
         )
         if "id" in data and data["id"]:
             obj.id = data["id"]
diff --git a/openviking/storage/queuefs/semantic_processor.py b/openviking/storage/queuefs/semantic_processor.py
index d312c57a..c5e0a526 100644
--- a/openviking/storage/queuefs/semantic_processor.py
+++ b/openviking/storage/queuefs/semantic_processor.py
@@ -20,10 +20,12 @@
     get_media_type,
 )
 from openviking.prompts import render_prompt
+from openviking.server.identity import RequestContext, Role
 from openviking.storage.queuefs.named_queue import DequeueHandlerBase
 from openviking.storage.queuefs.semantic_dag import DagStats, SemanticDagExecutor
 from openviking.storage.queuefs.semantic_msg import SemanticMsg
 from openviking.storage.viking_fs import get_viking_fs
+from openviking_cli.session.user_id import UserIdentifier
 from openviking_cli.utils import VikingURI
 from openviking_cli.utils.config import get_openviking_config
 from openviking_cli.utils.logger import get_logger
@@ -51,6 +53,30 @@ def __init__(self, max_concurrent_llm: int = 100):
         """
         self.max_concurrent_llm = max_concurrent_llm
         self._dag_executor: Optional[SemanticDagExecutor] = None
+        self._current_ctx = RequestContext(user=UserIdentifier.the_default_user(), role=Role.ROOT)
+
+    @staticmethod
+    def _owner_space_for_uri(uri: str, ctx: RequestContext) -> str:
+        """Derive owner_space from a URI.
+
+        Resources (viking://resources/...) always get owner_space="" so they
+        are globally visible. User / agent / session URIs inherit the
+        caller's space name.
+        """
+        if uri.startswith("viking://agent/"):
+            return ctx.user.agent_space_name()
+        if uri.startswith("viking://user/") or uri.startswith("viking://session/"):
+            return ctx.user.user_space_name()
+        # resources and anything else → shared (empty owner_space)
+        return ""
+
+    @staticmethod
+    def _ctx_from_semantic_msg(msg: SemanticMsg) -> RequestContext:
+        role = Role(msg.role) if msg.role in {r.value for r in Role} else Role.ROOT
+        return RequestContext(
+            user=UserIdentifier(msg.account_id, msg.user_id, msg.agent_id),
+            role=role,
+        )
 
     def _detect_file_type(self, file_name: str) -> str:
         """
@@ -97,7 +123,7 @@ async def _collect_directory_info(
         viking_fs = get_viking_fs()
 
         try:
-            entries = await viking_fs.ls(uri)
+            entries = await viking_fs.ls(uri, ctx=self._current_ctx)
         except Exception as e:
             logger.warning(f"Failed to list directory {uri}: {e}")
             return
@@ -138,6 +164,7 @@ async def on_dequeue(self, data: Optional[Dict[str, Any]]) -> Optional[Dict[str,
             # data is guaranteed to be not None at this point
             assert data is not None
             msg = SemanticMsg.from_dict(data)
+            self._current_ctx = self._ctx_from_semantic_msg(msg)
             logger.info(
                 f"Processing semantic generation for: {msg.uri} (recursive={msg.recursive})"
             )
@@ -147,6 +174,7 @@ async def on_dequeue(self, data: Optional[Dict[str, Any]]) -> Optional[Dict[str,
                     processor=self,
                     context_type=msg.context_type,
                     max_concurrent_llm=self.max_concurrent_llm,
+                    ctx=self._current_ctx,
                 )
                 self._dag_executor = executor
                 await executor.run(msg.uri)
@@ -161,7 +189,7 @@ async def on_dequeue(self, data: Optional[Dict[str, Any]]) -> Optional[Dict[str,
                 # Collect immediate children info only (no recursion)
                 viking_fs = get_viking_fs()
                 try:
-                    entries = await viking_fs.ls(msg.uri)
+                    entries = await viking_fs.ls(msg.uri, ctx=self._current_ctx)
                     for entry in entries:
                         name = entry.get("name", "")
                         if not name or name.startswith(".") or name in [".", ".."]:
@@ -223,8 +251,8 @@ async def _process_single_directory(
         abstract = self._extract_abstract_from_overview(overview)
 
         # 5. Write files
-        await viking_fs.write_file(f"{uri}/.overview.md", overview)
-        await viking_fs.write_file(f"{uri}/.abstract.md", abstract)
+        await viking_fs.write_file(f"{uri}/.overview.md", overview, ctx=self._current_ctx)
+        await viking_fs.write_file(f"{uri}/.abstract.md", abstract, ctx=self._current_ctx)
 
         logger.debug(f"Generated overview and abstract for {uri}")
@@ -240,7 +268,7 @@ async def _collect_children_abstracts(self, children_uris: List[str]) -> List[Di
         results = []
         for child_uri in children_uris:
-            abstract = await viking_fs.abstract(child_uri)
+            abstract = await viking_fs.abstract(child_uri, ctx=self._current_ctx)
             dir_name = child_uri.split("/")[-1]
             results.append({"name": dir_name, "abstract": abstract})
         return results
@@ -257,7 +285,7 @@ async def _generate_file_summaries(
             return []
 
         async def generate_one_summary(file_path: str) -> Dict[str, str]:
-            summary = await self._generate_single_file_summary(file_path)
+            summary = await self._generate_single_file_summary(file_path, ctx=self._current_ctx)
             if enqueue_files and context_type and parent_uri:
                 try:
                     await self._vectorize_single_file(
@@ -277,14 +305,19 @@ async def generate_one_summary(file_path: str) -> Dict[str, str]:
         return await asyncio.gather(*tasks)
 
     async def _generate_text_summary(
-        self, file_path: str, file_name: str, llm_sem: asyncio.Semaphore
+        self,
+        file_path: str,
+        file_name: str,
+        llm_sem: asyncio.Semaphore,
+        ctx: Optional[RequestContext] = None,
     ) -> Dict[str, str]:
         """Generate summary for a single text file (code, documentation, or other text)."""
         viking_fs = get_viking_fs()
         vlm = get_openviking_config().vlm
+        active_ctx = ctx or self._current_ctx
 
         # Read file content (limit length)
-        content = await viking_fs.read_file(file_path)
+        content = await viking_fs.read_file(file_path, ctx=active_ctx)
         if isinstance(content, bytes):
             # Try to decode with error handling for text files
             try:
@@ -323,7 +356,10 @@ async def _generate_text_summary(
         return {"name": file_name, "summary": summary.strip()}
 
     async def _generate_single_file_summary(
-        self, file_path: str, llm_sem: Optional[asyncio.Semaphore] = None
+        self,
+        file_path: str,
+        llm_sem: Optional[asyncio.Semaphore] = None,
+        ctx: Optional[RequestContext] = None,
     ) -> Dict[str, str]:
         """Generate summary for a single file.
@@ -337,13 +373,13 @@ async def _generate_single_file_summary(
         llm_sem = llm_sem or asyncio.Semaphore(self.max_concurrent_llm)
         media_type = get_media_type(file_name, None)
         if media_type == "image":
-            return await generate_image_summary(file_path, file_name, llm_sem)
+            return await generate_image_summary(file_path, file_name, llm_sem, ctx=ctx)
         elif media_type == "audio":
-            return await generate_audio_summary(file_path, file_name, llm_sem)
+            return await generate_audio_summary(file_path, file_name, llm_sem, ctx=ctx)
         elif media_type == "video":
-            return await generate_video_summary(file_path, file_name, llm_sem)
+            return await generate_video_summary(file_path, file_name, llm_sem, ctx=ctx)
         else:
-            return await self._generate_text_summary(file_path, file_name, llm_sem)
+            return await self._generate_text_summary(file_path, file_name, llm_sem, ctx=ctx)
 
     def _extract_abstract_from_overview(self, overview_content: str) -> str:
         """Extract abstract from overview.md."""
@@ -434,13 +470,19 @@ def replace_index(match):
         return f"# {dir_uri.split('/')[-1]}\n\nDirectory overview"
 
     async def _vectorize_directory_simple(
-        self, uri: str, context_type: str, abstract: str, overview: str
+        self,
+        uri: str,
+        context_type: str,
+        abstract: str,
+        overview: str,
+        ctx: Optional[RequestContext] = None,
     ) -> None:
         """Create directory Context and enqueue to EmbeddingQueue."""
         from openviking.storage.queuefs import get_queue_manager
         from openviking.storage.queuefs.embedding_msg_converter import EmbeddingMsgConverter
 
+        active_ctx = ctx or self._current_ctx
         queue_manager = get_queue_manager()
         embedding_queue = queue_manager.get_queue(queue_manager.EMBEDDING)
@@ -452,6 +494,9 @@ async def _vectorize_directory_simple(
             is_leaf=False,
             abstract=abstract,
             context_type=context_type,
+            user=active_ctx.user,
+            account_id=active_ctx.account_id,
+            owner_space=self._owner_space_for_uri(uri, active_ctx),
         )
         context_abstract.set_vectorize(Vectorize(text=abstract))
         embedding_msg_abstract = EmbeddingMsgConverter.from_context(context_abstract)
@@ -466,6 +511,15 @@ async def _vectorize_directory_simple(
             is_leaf=False,
             abstract=abstract,
             context_type=context_type,
+            user=active_ctx.user,
+            account_id=active_ctx.account_id,
+            owner_space=(
+                active_ctx.user.agent_space_name()
+                if uri.startswith("viking://agent/")
+                else active_ctx.user.user_space_name()
+                if uri.startswith("viking://user/") or uri.startswith("viking://session/")
+                else ""
+            ),
         )
         context_overview.set_vectorize(Vectorize(text=overview))
         embedding_msg_overview = EmbeddingMsgConverter.from_context(context_overview)
@@ -478,6 +532,7 @@ async def _vectorize_files(
         context_type: str,
         file_paths: List[str],
         file_summaries: List[Dict[str, str]],
+        ctx: Optional[RequestContext] = None,
     ) -> None:
         """Vectorize files in directory."""
         from openviking.storage.queuefs import get_queue_manager
@@ -492,6 +547,7 @@ async def _vectorize_files(
                 file_path=file_path,
                 summary_dict=file_summary_dict,
                 embedding_queue=embedding_queue,
+                ctx=ctx,
             )
 
     async def _vectorize_single_file(
@@ -501,6 +557,7 @@ async def _vectorize_single_file(
         file_path: str,
         summary_dict: Dict[str, str],
         embedding_queue: Optional[Any] = None,
+        ctx: Optional[RequestContext] = None,
     ) -> None:
         """Vectorize a single file using its content or summary."""
         from datetime import datetime
@@ -516,6 +573,7 @@ async def _vectorize_single_file(
         queue_manager = get_queue_manager()
         embedding_queue = queue_manager.get_queue(queue_manager.EMBEDDING)
 
+        active_ctx = ctx or self._current_ctx
         context = Context(
             uri=file_path,
             parent_uri=parent_uri,
@@ -523,10 +581,13 @@ async def
_vectorize_single_file( abstract=summary, context_type=context_type, created_at=datetime.now(), + user=active_ctx.user, + account_id=active_ctx.account_id, + owner_space=self._owner_space_for_uri(file_path, active_ctx), ) if self.get_resource_content_type(file_name) == ResourceContentType.TEXT: - content = await get_viking_fs().read_file(file_path) + content = await get_viking_fs().read_file(file_path, ctx=active_ctx) context.set_vectorize(Vectorize(text=content)) elif summary: context.set_vectorize(Vectorize(text=summary)) diff --git a/openviking/storage/viking_fs.py b/openviking/storage/viking_fs.py index 28638083..6cfe7834 100644 --- a/openviking/storage/viking_fs.py +++ b/openviking/storage/viking_fs.py @@ -13,8 +13,10 @@ """ import asyncio +import contextvars import hashlib import json +from contextlib import contextmanager from dataclasses import dataclass, field from datetime import datetime from pathlib import PurePath @@ -22,8 +24,10 @@ from pyagfs import AGFSClient +from openviking.server.identity import RequestContext, Role from openviking.storage.vikingdb_interface import VikingDBInterface from openviking.utils.time_utils import format_simplified, get_current_timestamp, parse_iso_datetime +from openviking_cli.session.user_id import UserIdentifier from openviking_cli.utils.logger import get_logger from openviking_cli.utils.uri import VikingURI @@ -162,13 +166,47 @@ def __init__( self.query_embedder = query_embedder self.rerank_config = rerank_config self.vector_store = vector_store + self._bound_ctx: contextvars.ContextVar[Optional[RequestContext]] = contextvars.ContextVar( + "vikingfs_bound_ctx", default=None + ) logger.info(f"[VikingFS] Initialized with agfs_url={agfs_url}") + @staticmethod + def _default_ctx() -> RequestContext: + return RequestContext(user=UserIdentifier.the_default_user(), role=Role.ROOT) + + def _ctx_or_default(self, ctx: Optional[RequestContext]) -> RequestContext: + if ctx is not None: + return ctx + bound = self._bound_ctx.get() 
+ return bound or self._default_ctx() + + @contextmanager + def bind_request_context(self, ctx: RequestContext): + """Temporarily bind ctx for legacy internal call paths without explicit ctx param.""" + token = self._bound_ctx.set(ctx) + try: + yield + finally: + self._bound_ctx.reset(token) + + def _ensure_access(self, uri: str, ctx: Optional[RequestContext]) -> None: + real_ctx = self._ctx_or_default(ctx) + if not self._is_accessible(uri, real_ctx): + raise PermissionError(f"Access denied for {uri}") + # ========== AGFS Basic Commands ========== - async def read(self, uri: str, offset: int = 0, size: int = -1) -> bytes: + async def read( + self, + uri: str, + offset: int = 0, + size: int = -1, + ctx: Optional[RequestContext] = None, + ) -> bytes: """Read file""" - path = self._uri_to_path(uri) + self._ensure_access(uri, ctx) + path = self._uri_to_path(uri, ctx=ctx) result = self.agfs.read(path, offset, size) if isinstance(result, bytes): return result @@ -177,50 +215,80 @@ async def read(self, uri: str, offset: int = 0, size: int = -1) -> bytes: else: return b"" - async def write(self, uri: str, data: Union[bytes, str]) -> str: + async def write( + self, + uri: str, + data: Union[bytes, str], + ctx: Optional[RequestContext] = None, + ) -> str: """Write file""" - path = self._uri_to_path(uri) + self._ensure_access(uri, ctx) + path = self._uri_to_path(uri, ctx=ctx) if isinstance(data, str): data = data.encode("utf-8") return self.agfs.write(path, data) - async def mkdir(self, uri: str, mode: str = "755", exist_ok: bool = False) -> None: + async def mkdir( + self, + uri: str, + mode: str = "755", + exist_ok: bool = False, + ctx: Optional[RequestContext] = None, + ) -> None: """Create directory.""" - path = self._uri_to_path(uri) + self._ensure_access(uri, ctx) + path = self._uri_to_path(uri, ctx=ctx) # Always ensure parent directories exist before creating this directory await self._ensure_parent_dirs(path) if exist_ok: try: - await self.stat(uri) + await 
self.stat(uri, ctx=ctx) return None except Exception: pass self.agfs.mkdir(path) - async def rm(self, uri: str, recursive: bool = False) -> Dict[str, Any]: + async def rm( + self, uri: str, recursive: bool = False, ctx: Optional[RequestContext] = None + ) -> Dict[str, Any]: """Delete file/directory + recursively update vector index.""" - path = self._uri_to_path(uri) - uris_to_delete = await self._collect_uris(path, recursive) + self._ensure_access(uri, ctx) + path = self._uri_to_path(uri, ctx=ctx) + uris_to_delete = await self._collect_uris(path, recursive, ctx=ctx) result = self.agfs.rm(path, recursive) if uris_to_delete: - await self._delete_from_vector_store(uris_to_delete) + await self._delete_from_vector_store(uris_to_delete, ctx=ctx) return result - async def mv(self, old_uri: str, new_uri: str) -> Dict[str, Any]: + async def mv( + self, + old_uri: str, + new_uri: str, + ctx: Optional[RequestContext] = None, + ) -> Dict[str, Any]: """Move file/directory + recursively update vector index.""" - old_path = self._uri_to_path(old_uri) - new_path = self._uri_to_path(new_uri) - uris_to_move = await self._collect_uris(old_path, recursive=True) + self._ensure_access(old_uri, ctx) + self._ensure_access(new_uri, ctx) + old_path = self._uri_to_path(old_uri, ctx=ctx) + new_path = self._uri_to_path(new_uri, ctx=ctx) + uris_to_move = await self._collect_uris(old_path, recursive=True, ctx=ctx) result = self.agfs.mv(old_path, new_path) if uris_to_move: - await self._update_vector_store_uris(uris_to_move, old_uri, new_uri) + await self._update_vector_store_uris(uris_to_move, old_uri, new_uri, ctx=ctx) return result - async def grep(self, uri: str, pattern: str, case_insensitive: bool = False) -> Dict: + async def grep( + self, + uri: str, + pattern: str, + case_insensitive: bool = False, + ctx: Optional[RequestContext] = None, + ) -> Dict: """Content search by pattern or keywords.""" - path = self._uri_to_path(uri) + self._ensure_access(uri, ctx) + path = 
self._uri_to_path(uri, ctx=ctx) result = self.agfs.grep(path, pattern, True, case_insensitive) if result.get("matches", None) is None: result["matches"] = [] @@ -228,25 +296,32 @@ async def grep(self, uri: str, pattern: str, case_insensitive: bool = False) -> for match in result.get("matches", []): new_match = { "line": match.get("line"), - "uri": self._path_to_uri(match.get("file")), + "uri": self._path_to_uri(match.get("file"), ctx=ctx), "content": match.get("content"), } new_matches.append(new_match) result["matches"] = new_matches return result - async def stat(self, uri: str) -> Dict[str, Any]: + async def stat(self, uri: str, ctx: Optional[RequestContext] = None) -> Dict[str, Any]: """ File/directory information. example: {'name': 'resources', 'size': 128, 'mode': 2147484141, 'modTime': '2026-02-10T21:26:02.934376379+08:00', 'isDir': True, 'meta': {'Name': 'localfs', 'Type': 'local', 'Content': {'local_path': '...'}}} """ - path = self._uri_to_path(uri) + self._ensure_access(uri, ctx) + path = self._uri_to_path(uri, ctx=ctx) return self.agfs.stat(path) - async def glob(self, pattern: str, uri: str = "viking://", node_limit: int = 1000) -> Dict: + async def glob( + self, + pattern: str, + uri: str = "viking://", + node_limit: int = 1000, + ctx: Optional[RequestContext] = None, + ) -> Dict: """File pattern matching, supports **/*.md recursive.""" - entries = await self.tree(uri, node_limit=node_limit) + entries = await self.tree(uri, node_limit=node_limit, ctx=ctx) base_uri = uri.rstrip("/") matches = [] for entry in entries: @@ -259,6 +334,7 @@ async def _batch_fetch_abstracts( self, entries: List[Dict[str, Any]], abs_limit: int, + ctx: Optional[RequestContext] = None, ) -> None: """Batch fetch abstracts for entries. 
@@ -273,7 +349,7 @@ async def fetch_abstract(index: int, entry: Dict[str, Any]) -> tuple[int, str]: if not entry.get("isDir", False): return index, "" try: - abstract = await self.abstract(entry["uri"]) + abstract = await self.abstract(entry["uri"], ctx=ctx) return index, abstract except Exception: return index, "[.abstract.md is not ready]" @@ -292,6 +368,7 @@ async def tree( abs_limit: int = 256, show_all_hidden: bool = False, node_limit: int = 1000, + ctx: Optional[RequestContext] = None, ) -> List[Dict[str, Any]]: """ Recursively list all contents (includes rel_path). @@ -308,19 +385,25 @@ async def tree( output="agent" [{'name': '.abstract.md', 'size': 100, 'modTime': '2026-02-11 16:52:16', 'isDir': False, 'rel_path': '.abstract.md', 'uri': 'viking://resources...', 'abstract': "..."}] """ + self._ensure_access(uri, ctx) if output == "original": - return await self._tree_original(uri, show_all_hidden, node_limit) + return await self._tree_original(uri, show_all_hidden, node_limit, ctx=ctx) elif output == "agent": - return await self._tree_agent(uri, abs_limit, show_all_hidden, node_limit) + return await self._tree_agent(uri, abs_limit, show_all_hidden, node_limit, ctx=ctx) else: raise ValueError(f"Invalid output format: {output}") async def _tree_original( - self, uri: str, show_all_hidden: bool = False, node_limit: int = 1000 + self, + uri: str, + show_all_hidden: bool = False, + node_limit: int = 1000, + ctx: Optional[RequestContext] = None, ) -> List[Dict[str, Any]]: """Recursively list all contents (original format).""" - path = self._uri_to_path(uri) + path = self._uri_to_path(uri, ctx=ctx) all_entries = [] + real_ctx = self._ctx_or_default(ctx) async def _walk(current_path: str, current_rel: str): if len(all_entries) >= node_limit: @@ -334,7 +417,9 @@ async def _walk(current_path: str, current_rel: str): rel_path = f"{current_rel}/{name}" if current_rel else name new_entry = dict(entry) new_entry["rel_path"] = rel_path - new_entry["uri"] = 
self._path_to_uri(f"{current_path}/{name}") + new_entry["uri"] = self._path_to_uri(f"{current_path}/{name}", ctx=ctx) + if not self._is_accessible(new_entry["uri"], real_ctx): + continue if entry.get("isDir"): all_entries.append(new_entry) await _walk(f"{current_path}/{name}", rel_path) @@ -347,12 +432,18 @@ async def _walk(current_path: str, current_rel: str): return all_entries async def _tree_agent( - self, uri: str, abs_limit: int, show_all_hidden: bool = False, node_limit: int = 1000 + self, + uri: str, + abs_limit: int, + show_all_hidden: bool = False, + node_limit: int = 1000, + ctx: Optional[RequestContext] = None, ) -> List[Dict[str, Any]]: """Recursively list all contents (agent format with abstracts).""" - path = self._uri_to_path(uri) + path = self._uri_to_path(uri, ctx=ctx) all_entries = [] now = datetime.now() + real_ctx = self._ctx_or_default(ctx) async def _walk(current_path: str, current_rel: str): if len(all_entries) >= node_limit: @@ -365,11 +456,14 @@ async def _walk(current_path: str, current_rel: str): continue rel_path = f"{current_rel}/{name}" if current_rel else name new_entry = { - "uri": self._path_to_uri(f"{current_path}/{name}"), + "uri": self._path_to_uri(f"{current_path}/{name}", ctx=ctx), "size": entry.get("size", 0), "isDir": entry.get("isDir", False), "modTime": format_simplified(parse_iso_datetime(entry.get("modTime", "")), now), } + new_entry["rel_path"] = rel_path + if not self._is_accessible(new_entry["uri"], real_ctx): + continue if entry.get("isDir"): all_entries.append(new_entry) await _walk(f"{current_path}/{name}", rel_path) @@ -380,7 +474,7 @@ async def _walk(current_path: str, current_rel: str): await _walk(path, "") - await self._batch_fetch_abstracts(all_entries, abs_limit) + await self._batch_fetch_abstracts(all_entries, abs_limit, ctx=ctx) return all_entries @@ -389,9 +483,11 @@ async def _walk(current_path: str, current_rel: str): async def abstract( self, uri: str, + ctx: Optional[RequestContext] = None, ) -> str: 
"""Read directory's L0 summary (.abstract.md).""" - path = self._uri_to_path(uri) + self._ensure_access(uri, ctx) + path = self._uri_to_path(uri, ctx=ctx) info = self.agfs.stat(path) if not info.get("isDir"): raise ValueError(f"{uri} is not a directory") @@ -402,9 +498,11 @@ async def abstract( async def overview( self, uri: str, + ctx: Optional[RequestContext] = None, ) -> str: """Read directory's L1 overview (.overview.md).""" - path = self._uri_to_path(uri) + self._ensure_access(uri, ctx) + path = self._uri_to_path(uri, ctx=ctx) info = self.agfs.stat(path) if not info.get("isDir"): raise ValueError(f"{uri} is not a directory") @@ -415,16 +513,19 @@ async def overview( async def relations( self, uri: str, + ctx: Optional[RequestContext] = None, ) -> List[Dict[str, Any]]: """Get relation list. Returns: [{"uri": "...", "reason": "..."}, ...] """ - entries = await self.get_relation_table(uri) + self._ensure_access(uri, ctx) + entries = await self.get_relation_table(uri, ctx=ctx) result = [] for entry in entries: for u in entry.uris: - result.append({"uri": u, "reason": entry.reason}) + if self._is_accessible(u, self._ctx_or_default(ctx)): + result.append({"uri": u, "reason": entry.reason}) return result async def find( @@ -434,6 +535,7 @@ async def find( limit: int = 10, score_threshold: Optional[float] = None, filter: Optional[Dict] = None, + ctx: Optional[RequestContext] = None, ): """Semantic search. 
@@ -456,6 +558,8 @@ async def find( if not self.rerank_config: raise RuntimeError("rerank_config is required for find") + if target_uri: + self._ensure_access(target_uri, ctx) storage = self._get_vector_store() if not storage: @@ -471,8 +575,8 @@ async def find( rerank_config=self.rerank_config, ) - # Infer context_type - context_type = self._infer_context_type(target_uri) if target_uri else ContextType.RESOURCE + # Infer context_type (None = search all types) + context_type = self._infer_context_type(target_uri) if target_uri else None typed_query = TypedQuery( query=query, @@ -483,6 +587,7 @@ async def find( result = await retriever.retrieve( typed_query, + ctx=self._ctx_or_default(ctx), limit=limit, score_threshold=score_threshold, metadata_filter=filter, @@ -512,6 +617,7 @@ async def search( limit: int = 10, score_threshold: Optional[float] = None, filter: Optional[Dict] = None, + ctx: Optional[RequestContext] = None, ): """Complex search with session context. @@ -538,6 +644,8 @@ async def search( recent_messages = session_info.get("recent_messages") if session_info else None query_plan: Optional[QueryPlan] = None + if target_uri: + self._ensure_access(target_uri, ctx) # When target_uri exists: read abstract, infer context_type target_context_type: Optional[ContextType] = None @@ -545,7 +653,7 @@ async def search( if target_uri: target_context_type = self._infer_context_type(target_uri) try: - target_abstract = await self.abstract(target_uri) + target_abstract = await self.abstract(target_uri, ctx=ctx) except Exception: target_abstract = "" @@ -596,6 +704,7 @@ async def search( async def _execute(tq: TypedQuery): return await retriever.retrieve( tq, + ctx=self._ctx_or_default(ctx), limit=limit, score_threshold=score_threshold, metadata_filter=filter, @@ -629,12 +738,16 @@ async def link( from_uri: str, uris: Union[str, List[str]], reason: str = "", + ctx: Optional[RequestContext] = None, ) -> None: """Create relation (maintained in .relations.json).""" if 
isinstance(uris, str): uris = [uris] + self._ensure_access(from_uri, ctx) + for uri in uris: + self._ensure_access(uri, ctx) - from_path = self._uri_to_path(from_uri) + from_path = self._uri_to_path(from_uri, ctx=ctx) entries = await self._read_relation_table(from_path) existing_ids = {e.id for e in entries} @@ -650,9 +763,12 @@ async def unlink( self, from_uri: str, uri: str, + ctx: Optional[RequestContext] = None, ) -> None: """Delete relation.""" - from_path = self._uri_to_path(from_uri) + self._ensure_access(from_uri, ctx) + self._ensure_access(uri, ctx) + from_path = self._uri_to_path(from_uri, ctx=ctx) try: entries = await self._read_relation_table(from_path) @@ -680,9 +796,12 @@ async def unlink( logger.error(f"[VikingFS] Failed to unlink {from_uri} -> {uri}: {e}") raise IOError(f"Failed to unlink: {e}") - async def get_relation_table(self, uri: str) -> List[RelationEntry]: + async def get_relation_table( + self, uri: str, ctx: Optional[RequestContext] = None + ) -> List[RelationEntry]: """Get relation table.""" - path = self._uri_to_path(uri) + self._ensure_access(uri, ctx) + path = self._uri_to_path(uri, ctx=ctx) return await self._read_relation_table(path) # ========== URI Conversion ========== @@ -703,15 +822,24 @@ def _shorten_component(component: str, max_bytes: int = 255) -> str: prefix = prefix[:-1] return f"{prefix}_{hash_suffix}" - def _uri_to_path(self, uri: str) -> str: - """viking://user/memories/preferences/test -> /local/user/memories/preferences/test""" - remainder = uri[len("viking://") :].strip("/") + _USER_STRUCTURE_DIRS = {"memories"} + _AGENT_STRUCTURE_DIRS = {"memories", "skills", "instructions", "workspaces"} + + def _uri_to_path(self, uri: str, ctx: Optional[RequestContext] = None) -> str: + """Map virtual URI to account-isolated AGFS path. + + Pure prefix replacement: viking://{remainder} -> /local/{account_id}/{remainder}. + No implicit space injection — URIs must include space segments explicitly. 
+ """ + real_ctx = self._ctx_or_default(ctx) + account_id = real_ctx.account_id + remainder = uri[len("viking://") :].strip("/") if uri.startswith("viking://") else uri if not remainder: - return "/local" - # Ensure each path component does not exceed filesystem filename limit - parts = remainder.split("/") + return f"/local/{account_id}" + + parts = [p for p in remainder.split("/") if p] safe_parts = [self._shorten_component(p, self._MAX_FILENAME_BYTES) for p in parts] - return f"/local/{'/'.join(safe_parts)}" + return f"/local/{account_id}/{'/'.join(safe_parts)}" _INTERNAL_DIRS = {"_system"} _ROOT_PATH = "/local" @@ -719,25 +847,88 @@ def _uri_to_path(self, uri: str) -> str: def _ls_entries(self, path: str) -> List[Dict[str, Any]]: """List directory entries, filtering out internal directories. - At root level (/local), uses VALID_SCOPES whitelist. + At account root (/local/{account}), uses VALID_SCOPES whitelist. At other levels, uses _INTERNAL_DIRS blacklist. """ entries = self.agfs.ls(path) - if path == self._ROOT_PATH: + parts = [p for p in path.strip("/").split("/") if p] + if len(parts) == 2 and parts[0] == "local": return [e for e in entries if e.get("name") in VikingURI.VALID_SCOPES] return [e for e in entries if e.get("name") not in self._INTERNAL_DIRS] - def _path_to_uri(self, path: str) -> str: - """/local/user/memories/preferences -> viking://user/memories/preferences""" + def _path_to_uri(self, path: str, ctx: Optional[RequestContext] = None) -> str: + """/local/{account}/... -> viking://... + + Pure prefix replacement: strips /local/{account_id}/ and prepends viking://. + No implicit space stripping. 
+        """
         if path.startswith("viking://"):
             return path
         elif path.startswith("/local/"):
-            return f"viking://{path[7:]}"  # Remove /local prefix
+            inner = path[7:].strip("/")
+            if not inner:
+                return "viking://"
+            real_ctx = self._ctx_or_default(ctx)
+            parts = [p for p in inner.split("/") if p]
+            if parts and parts[0] == real_ctx.account_id:
+                parts = parts[1:]
+            if not parts:
+                return "viking://"
+            return f"viking://{'/'.join(parts)}"
         elif path.startswith("/"):
             return f"viking:/{path}"
         else:
             return f"viking://{path}"
 
+    def _extract_space_from_uri(self, uri: str) -> Optional[str]:
+        """Extract the space segment from a URI, if present.
+
+        URIs are WYSIWYG: viking://{scope}/{space}/...
+        For user/agent, the second segment is the space unless it is a known structure dir.
+        For session, the second segment is always the space.
+        """
+        if not uri.startswith("viking://"):
+            return None
+        parts = [p for p in uri[len("viking://") :].strip("/").split("/") if p]
+        if len(parts) < 2:
+            return None
+        scope = parts[0]
+        second = parts[1]
+        if scope == "user" and second not in self._USER_STRUCTURE_DIRS:
+            return second
+        if scope == "agent" and second not in self._AGENT_STRUCTURE_DIRS:
+            return second
+        if scope == "session":
+            return second
+        return None
+
+    def _is_accessible(self, uri: str, ctx: RequestContext) -> bool:
+        """Check whether a URI is visible/accessible under the current request context."""
+        if ctx.role == Role.ROOT:
+            return True
+        if not uri.startswith("viking://"):
+            return False
+
+        parts = [p for p in uri[len("viking://") :].strip("/").split("/") if p]
+        if not parts:
+            return True
+
+        scope = parts[0]
+        if scope in {"resources", "temp", "transactions"}:
+            return True
+        if scope == "_system":
+            return False
+
+        space = self._extract_space_from_uri(uri)
+        if space is None:
+            return True
+
+        if scope in {"user", "session"}:
+            return space == ctx.user.user_space_name()
+        if scope == "agent":
+            return space == ctx.user.agent_space_name()
+        return True
+
     def _handle_agfs_read(self, result: Union[bytes, Any, None]) -> bytes:
         """Handle AGFSClient read return types consistently."""
         if isinstance(result, bytes):
@@ -769,18 +960,22 @@ def _handle_agfs_content(self, result: Union[bytes, Any, None]) -> str:
         return ""
 
     def _infer_context_type(self, uri: str):
-        """Infer context_type from URI."""
+        """Infer context_type from URI. Returns None when ambiguous."""
        from openviking_cli.retrieve import ContextType
 
         if "/memories" in uri:
             return ContextType.MEMORY
         elif "/skills" in uri:
             return ContextType.SKILL
-        return ContextType.RESOURCE
+        elif "/resources" in uri:
+            return ContextType.RESOURCE
+        return None
 
     # ========== Vector Sync Helper Methods ==========
 
-    async def _collect_uris(self, path: str, recursive: bool) -> List[str]:
+    async def _collect_uris(
+        self, path: str, recursive: bool, ctx: Optional[RequestContext] = None
+    ) -> List[str]:
         """Recursively collect all URIs (for rm/mv)."""
         uris = []
 
@@ -795,14 +990,16 @@ async def _collect(p: str):
             if recursive:
                 await _collect(full_path)
             else:
-                uris.append(self._path_to_uri(full_path))
+                uris.append(self._path_to_uri(full_path, ctx=ctx))
         except Exception:
             pass
 
         await _collect(path)
         return uris
 
-    async def _delete_from_vector_store(self, uris: List[str]) -> None:
+    async def _delete_from_vector_store(
+        self, uris: List[str], ctx: Optional[RequestContext] = None
+    ) -> None:
         """Delete records with specified URIs from vector store.
 
-        Uses storage.remove_by_uri method, which implements recursive deletion of child nodes.
+        Uses an account-scoped filter (exact URI or "{uri}/" prefix), which also deletes child nodes.
@@ -810,16 +1007,42 @@ async def _delete_from_vector_store(self, uris: List[str]) -> None: storage = self._get_vector_store() if not storage: return + real_ctx = self._ctx_or_default(ctx) for uri in uris: try: - await storage.remove_by_uri("context", uri) + filter_conds: List[Dict[str, Any]] = [ + {"op": "must", "field": "account_id", "conds": [real_ctx.account_id]}, + { + "op": "or", + "conds": [ + {"op": "must", "field": "uri", "conds": [uri]}, + {"op": "prefix", "field": "uri", "prefix": f"{uri}/"}, + ], + }, + ] + if real_ctx.role == Role.USER and uri.startswith( + ("viking://user/", "viking://agent/") + ): + owner_space = ( + real_ctx.user.user_space_name() + if uri.startswith("viking://user/") + else real_ctx.user.agent_space_name() + ) + filter_conds.append( + {"op": "must", "field": "owner_space", "conds": [owner_space]} + ) + await storage.batch_delete("context", {"op": "and", "conds": filter_conds}) logger.info(f"[VikingFS] Deleted from vector store: {uri}") except Exception as e: logger.warning(f"[VikingFS] Failed to delete {uri} from vector store: {e}") async def _update_vector_store_uris( - self, uris: List[str], old_base: str, new_base: str + self, + uris: List[str], + old_base: str, + new_base: str, + ctx: Optional[RequestContext] = None, ) -> None: """Update URIs in vector store (when moving files). 
@@ -829,14 +1052,24 @@ async def _update_vector_store_uris( if not storage: return - old_base_uri = self._path_to_uri(old_base) - new_base_uri = self._path_to_uri(new_base) + old_base_uri = self._path_to_uri(old_base, ctx=ctx) + new_base_uri = self._path_to_uri(new_base, ctx=ctx) for uri in uris: try: records = await storage.filter( collection="context", - filter={"op": "must", "field": "uri", "conds": [uri]}, + filter={ + "op": "and", + "conds": [ + {"op": "must", "field": "uri", "conds": [uri]}, + { + "op": "must", + "field": "account_id", + "conds": [self._ctx_or_default(ctx).account_id], + }, + ], + }, limit=1, ) @@ -932,16 +1165,18 @@ async def _write_relation_table(self, dir_path: str, entries: List[RelationEntry # ========== Batch Read (backward compatible) ========== - async def read_batch(self, uris: List[str], level: str = "l0") -> Dict[str, str]: + async def read_batch( + self, uris: List[str], level: str = "l0", ctx: Optional[RequestContext] = None + ) -> Dict[str, str]: """Batch read content from multiple URIs.""" results = {} for uri in uris: try: content = "" if level == "l0": - content = await self.abstract(uri) + content = await self.abstract(uri, ctx=ctx) elif level == "l1": - content = await self.overview(uri) + content = await self.overview(uri, ctx=ctx) results[uri] = content except Exception: pass @@ -953,9 +1188,11 @@ async def write_file( self, uri: str, content: Union[str, bytes], + ctx: Optional[RequestContext] = None, ) -> None: """Write file directly.""" - path = self._uri_to_path(uri) + self._ensure_access(uri, ctx) + path = self._uri_to_path(uri, ctx=ctx) await self._ensure_parent_dirs(path) if isinstance(content, str): @@ -967,6 +1204,7 @@ async def read_file( uri: str, offset: int = 0, limit: int = -1, + ctx: Optional[RequestContext] = None, ) -> str: """Read single file, optionally sliced by line range. @@ -978,7 +1216,8 @@ async def read_file( Raises: FileNotFoundError: If the file does not exist. 
""" - path = self._uri_to_path(uri) + self._ensure_access(uri, ctx) + path = self._uri_to_path(uri, ctx=ctx) try: content = self.agfs.read(path) except Exception as e: @@ -993,9 +1232,11 @@ async def read_file( async def read_file_bytes( self, uri: str, + ctx: Optional[RequestContext] = None, ) -> bytes: """Read single binary file.""" - path = self._uri_to_path(uri) + self._ensure_access(uri, ctx) + path = self._uri_to_path(uri, ctx=ctx) try: return self._handle_agfs_read(self.agfs.read(path)) except Exception as e: @@ -1005,9 +1246,11 @@ async def write_file_bytes( self, uri: str, content: bytes, + ctx: Optional[RequestContext] = None, ) -> None: """Write single binary file.""" - path = self._uri_to_path(uri) + self._ensure_access(uri, ctx) + path = self._uri_to_path(uri, ctx=ctx) await self._ensure_parent_dirs(path) self.agfs.write(path, content) @@ -1015,9 +1258,11 @@ async def append_file( self, uri: str, content: str, + ctx: Optional[RequestContext] = None, ) -> None: """Append content to file.""" - path = self._uri_to_path(uri) + self._ensure_access(uri, ctx) + path = self._uri_to_path(uri, ctx=ctx) try: existing = "" @@ -1040,6 +1285,7 @@ async def ls( output: str = "original", abs_limit: int = 256, show_all_hidden: bool = False, + ctx: Optional[RequestContext] = None, ) -> List[Dict[str, Any]]: """ List directory contents (URI version). 
@@ -1056,18 +1302,24 @@ async def ls(
         output="agent"
             [{'name': '.abstract.md', 'size': 100, 'modTime': '2026-02-11(or 16:52:16 for today)', 'isDir': False, 'uri': 'viking://resources/.abstract.md', 'abstract': "..."}]
         """
+        self._ensure_access(uri, ctx)
         if output == "original":
-            return await self._ls_original(uri, show_all_hidden)
+            return await self._ls_original(uri, show_all_hidden, ctx=ctx)
         elif output == "agent":
-            return await self._ls_agent(uri, abs_limit, show_all_hidden)
+            return await self._ls_agent(uri, abs_limit, show_all_hidden, ctx=ctx)
         else:
             raise ValueError(f"Invalid output format: {output}")
 
     async def _ls_agent(
-        self, uri: str, abs_limit: int, show_all_hidden: bool
+        self,
+        uri: str,
+        abs_limit: int,
+        show_all_hidden: bool,
+        ctx: Optional[RequestContext] = None,
     ) -> List[Dict[str, Any]]:
         """List directory contents (URI version)."""
-        path = self._uri_to_path(uri)
+        path = self._uri_to_path(uri, ctx=ctx)
+        real_ctx = self._ctx_or_default(ctx)
         try:
             entries = self._ls_entries(path)
         except Exception as e:
@@ -1086,11 +1338,13 @@ async def _ls_agent(
                 # Keep the time portion to at most 26 chars (YYYY-MM-DDTHH:MM:SS.mmmmmm)
                 raw_time = parts[0][:26] + "+" + parts[1]
             new_entry = {
-                "uri": self._path_to_uri(f"{path}/{name}"),
+                "uri": self._path_to_uri(f"{path}/{name}", ctx=ctx),
                 "size": entry.get("size", 0),
                 "isDir": entry.get("isDir", False),
                 "modTime": format_simplified(parse_iso_datetime(raw_time), now),
             }
+            if not self._is_accessible(new_entry["uri"], real_ctx):
+                continue
             if entry.get("isDir"):
                 all_entries.append(new_entry)
             elif not name.startswith("."):
@@ -1098,12 +1352,18 @@ async def _ls_agent(
             elif show_all_hidden:
                 all_entries.append(new_entry)
         # Fetch abstracts in parallel (up to 6 concurrent)
-        await self._batch_fetch_abstracts(all_entries, abs_limit)
+        await self._batch_fetch_abstracts(all_entries, abs_limit, ctx=ctx)
        return all_entries
 
-    async def _ls_original(self, uri: str, show_all_hidden: bool = False) -> List[Dict[str, Any]]:
+    async def _ls_original(
+        self,
+        uri: str,
+ show_all_hidden: bool = False, + ctx: Optional[RequestContext] = None, + ) -> List[Dict[str, Any]]: """List directory contents (URI version).""" - path = self._uri_to_path(uri) + path = self._uri_to_path(uri, ctx=ctx) + real_ctx = self._ctx_or_default(ctx) try: entries = self._ls_entries(path) # AGFS returns read-only structure, need to create new dict @@ -1111,7 +1371,9 @@ async def _ls_original(self, uri: str, show_all_hidden: bool = False) -> List[Di for entry in entries: name = entry.get("name", "") new_entry = dict(entry) # Copy original data - new_entry["uri"] = self._path_to_uri(f"{path}/{name}") + new_entry["uri"] = self._path_to_uri(f"{path}/{name}", ctx=ctx) + if not self._is_accessible(new_entry["uri"], real_ctx): + continue if entry.get("isDir"): all_entries.append(new_entry) elif not name.startswith("."): @@ -1126,10 +1388,13 @@ async def move_file( self, from_uri: str, to_uri: str, + ctx: Optional[RequestContext] = None, ) -> None: """Move file.""" - from_path = self._uri_to_path(from_uri) - to_path = self._uri_to_path(to_uri) + self._ensure_access(from_uri, ctx) + self._ensure_access(to_uri, ctx) + from_path = self._uri_to_path(from_uri, ctx=ctx) + to_path = self._uri_to_path(to_uri, ctx=ctx) content = self.agfs.read(from_path) await self._ensure_parent_dirs(to_path) self.agfs.write(to_path, content) @@ -1141,9 +1406,9 @@ def create_temp_uri(self) -> str: """Create temp directory URI.""" return VikingURI.create_temp_uri() - async def delete_temp(self, temp_uri: str) -> None: + async def delete_temp(self, temp_uri: str, ctx: Optional[RequestContext] = None) -> None: """Delete temp directory and its contents.""" - path = self._uri_to_path(temp_uri) + path = self._uri_to_path(temp_uri, ctx=ctx) try: for entry in self._ls_entries(path): name = entry.get("name", "") @@ -1151,19 +1416,21 @@ async def delete_temp(self, temp_uri: str) -> None: continue entry_path = f"{path}/{name}" if entry.get("isDir"): - await self.delete_temp(f"{temp_uri}/{name}") + 
await self.delete_temp(f"{temp_uri}/{name}", ctx=ctx) else: self.agfs.rm(entry_path) self.agfs.rm(path) except Exception as e: logger.warning(f"[VikingFS] Failed to delete temp {temp_uri}: {e}") - async def get_relations(self, uri: str) -> List[str]: + async def get_relations(self, uri: str, ctx: Optional[RequestContext] = None) -> List[str]: """Get all related URIs (backward compatible).""" - entries = await self.get_relation_table(uri) + entries = await self.get_relation_table(uri, ctx=ctx) all_uris = [] for entry in entries: - all_uris.extend(entry.uris) + for related in entry.uris: + if self._is_accessible(related, self._ctx_or_default(ctx)): + all_uris.append(related) return all_uris async def get_relations_with_content( @@ -1171,9 +1438,10 @@ async def get_relations_with_content( uri: str, include_l0: bool = True, include_l1: bool = False, + ctx: Optional[RequestContext] = None, ) -> List[Dict[str, Any]]: """Get related URIs and their content (backward compatible).""" - relation_uris = await self.get_relations(uri) + relation_uris = await self.get_relations(uri, ctx=ctx) if not relation_uris: return [] @@ -1181,9 +1449,9 @@ async def get_relations_with_content( abstracts = {} overviews = {} if include_l0: - abstracts = await self.read_batch(relation_uris, level="l0") + abstracts = await self.read_batch(relation_uris, level="l0", ctx=ctx) if include_l1: - overviews = await self.read_batch(relation_uris, level="l1") + overviews = await self.read_batch(relation_uris, level="l1", ctx=ctx) for rel_uri in relation_uris: info = {"uri": rel_uri} @@ -1203,9 +1471,11 @@ async def write_context( overview: str = "", content_filename: str = "content.md", is_leaf: bool = False, + ctx: Optional[RequestContext] = None, ) -> None: """Write context to AGFS (L0/L1/L2).""" - path = self._uri_to_path(uri) + self._ensure_access(uri, ctx) + path = self._uri_to_path(uri, ctx=ctx) try: await self._ensure_parent_dirs(path) diff --git a/openviking/utils/resource_processor.py 
b/openviking/utils/resource_processor.py index b28776f6..106a33c8 100644 --- a/openviking/utils/resource_processor.py +++ b/openviking/utils/resource_processor.py @@ -10,6 +10,7 @@ from typing import TYPE_CHECKING, Any, Dict, Optional from openviking.parse.tree_builder import TreeBuilder +from openviking.server.identity import RequestContext from openviking.storage import VikingDBManager from openviking.storage.viking_fs import get_viking_fs from openviking_cli.utils import get_logger @@ -71,6 +72,7 @@ def _get_media_processor(self): async def process_resource( self, path: str, + ctx: RequestContext, reason: str = "", instruction: str = "", scope: str = "resources", @@ -95,14 +97,16 @@ async def process_resource( # ============ Phase 1: Parse source (Parser generates L0/L1 and writes to temp) ============ try: media_processor = self._get_media_processor() + viking_fs = get_viking_fs() # Use reason as instruction fallback so it influences L0/L1 # generation and improves search relevance as documented. 
effective_instruction = instruction or reason - parse_result = await media_processor.process( - source=path, - instruction=effective_instruction, - **kwargs, - ) + with viking_fs.bind_request_context(ctx): + parse_result = await media_processor.process( + source=path, + instruction=effective_instruction, + **kwargs, + ) result["source_path"] = parse_result.source_path or path result["meta"] = parse_result.meta @@ -141,13 +145,15 @@ async def process_resource( # ============ Phase 3: TreeBuilder finalizes from temp (scan + move to AGFS) ============ try: - context_tree = await self.tree_builder.finalize_from_temp( - temp_dir_path=parse_result.temp_dir_path, - scope=scope, - base_uri=located_uri, - source_path=parse_result.source_path, - source_format=parse_result.source_format, - ) + with get_viking_fs().bind_request_context(ctx): + context_tree = await self.tree_builder.finalize_from_temp( + temp_dir_path=parse_result.temp_dir_path, + ctx=ctx, + scope=scope, + base_uri=located_uri, + source_path=parse_result.source_path, + source_format=parse_result.source_format, + ) except Exception as e: result["status"] = "error" result["errors"].append(f"Finalize from temp error: {e}") @@ -155,7 +161,7 @@ async def process_resource( # Cleanup temporary directory on error (via VikingFS) try: if parse_result.temp_dir_path: - await get_viking_fs().delete_temp(parse_result.temp_dir_path) + await get_viking_fs().delete_temp(parse_result.temp_dir_path, ctx=ctx) except Exception: pass diff --git a/openviking/utils/skill_processor.py b/openviking/utils/skill_processor.py index 5017fe1b..9b337567 100644 --- a/openviking/utils/skill_processor.py +++ b/openviking/utils/skill_processor.py @@ -12,10 +12,10 @@ from openviking.core.context import Context, ContextType, Vectorize from openviking.core.mcp_converter import is_mcp_format, mcp_to_skill from openviking.core.skill_loader import SkillLoader +from openviking.server.identity import RequestContext from openviking.storage import 
VikingDBManager from openviking.storage.queuefs.embedding_msg_converter import EmbeddingMsgConverter from openviking.storage.viking_fs import VikingFS -from openviking_cli.session.user_id import UserIdentifier from openviking_cli.utils import get_logger from openviking_cli.utils.config import get_openviking_config @@ -42,7 +42,7 @@ async def process_skill( self, data: Any, viking_fs: VikingFS, - user: Optional[UserIdentifier] = None, + ctx: RequestContext, ) -> Dict[str, Any]: """ Process and store a skill. @@ -66,6 +66,9 @@ async def process_skill( is_leaf=False, abstract=skill_dict.get("description", ""), context_type=ContextType.SKILL.value, + user=ctx.user, + account_id=ctx.account_id, + owner_space=ctx.user.agent_space_name(), meta={ "name": skill_dict["name"], "description": skill_dict.get("description", ""), @@ -85,6 +88,7 @@ async def process_skill( skill_dict=skill_dict, skill_dir_uri=skill_dir_uri, overview=overview, + ctx=ctx, ) await self._write_auxiliary_files( @@ -92,6 +96,7 @@ async def process_skill( auxiliary_files=auxiliary_files, base_path=base_path, skill_dir_uri=skill_dir_uri, + ctx=ctx, ) await self._index_skill( @@ -165,6 +170,7 @@ async def _write_skill_content( skill_dict: Dict[str, Any], skill_dir_uri: str, overview: str, + ctx: RequestContext, ): """Write main skill content to VikingFS.""" await viking_fs.write_context( @@ -174,6 +180,7 @@ async def _write_skill_content( overview=overview, content_filename="SKILL.md", is_leaf=False, + ctx=ctx, ) async def _write_auxiliary_files( @@ -182,6 +189,7 @@ async def _write_auxiliary_files( auxiliary_files: List[Path], base_path: Optional[Path], skill_dir_uri: str, + ctx: RequestContext, ): """Write auxiliary files to VikingFS.""" for aux_file in auxiliary_files: @@ -199,9 +207,9 @@ async def _write_auxiliary_files( is_text = False if is_text: - await viking_fs.write_file(aux_uri, file_bytes.decode("utf-8")) + await viking_fs.write_file(aux_uri, file_bytes.decode("utf-8"), ctx=ctx) else: - await 
viking_fs.write_file_bytes(aux_uri, file_bytes) + await viking_fs.write_file_bytes(aux_uri, file_bytes, ctx=ctx) async def _index_skill(self, context: Context, skill_dir_uri: str): """Write skill to vector store via async queue.""" diff --git a/openviking_cli/retrieve/types.py b/openviking_cli/retrieve/types.py index db4347f0..f1ca5106 100644 --- a/openviking_cli/retrieve/types.py +++ b/openviking_cli/retrieve/types.py @@ -247,7 +247,7 @@ class TypedQuery: """ query: str - context_type: ContextType + context_type: Optional[ContextType] intent: str priority: int = 3 target_directories: List[str] = field(default_factory=list) diff --git a/openviking_cli/session/user_id.py b/openviking_cli/session/user_id.py index a69e6480..cd959e16 100644 --- a/openviking_cli/session/user_id.py +++ b/openviking_cli/session/user_id.py @@ -37,18 +37,27 @@ def _validate_error(self) -> str: def account_id(self) -> str: return self._account_id - def unique_space_name(self, short: bool = True) -> str: - # 匿名化,只保留 {account_id}_{md5 of user and agent id} - hash = hashlib.md5((self._user_id + self._agent_id).encode()).hexdigest() - if short: - return f"{self._account_id}_{hash[:8]}" - return f"{self._account_id}_{hash}" + @property + def user_id(self) -> str: + return self._user_id + + @property + def agent_id(self) -> str: + return self._agent_id + + def user_space_name(self) -> str: + """User-level space name.""" + return self._user_id + + def agent_space_name(self) -> str: + """Agent-level space name (user + agent).""" + return hashlib.md5((self._user_id + self._agent_id).encode()).hexdigest()[:12] def memory_space_uri(self) -> str: - return "viking://agent/memories/" + self.unique_space_name() + return f"viking://agent/{self.agent_space_name()}/memories" def work_space_uri(self) -> str: - return "viking://agent/workspaces/" + self.unique_space_name() + return f"viking://agent/{self.agent_space_name()}/workspaces" def to_dict(self): return { diff --git a/tests/conftest.py 
b/tests/conftest.py index 1e5cf17d..8773ff8b 100644 --- a/tests/conftest.py +++ b/tests/conftest.py @@ -32,7 +32,6 @@ def temp_dir() -> Generator[Path, None, None]: shutil.rmtree(TEST_TMP_DIR, ignore_errors=True) TEST_TMP_DIR.mkdir(parents=True, exist_ok=True) yield TEST_TMP_DIR - shutil.rmtree(TEST_TMP_DIR, ignore_errors=True) @pytest.fixture(scope="function") diff --git a/tests/integration/conftest.py b/tests/integration/conftest.py index cdf43517..82a06e43 100644 --- a/tests/integration/conftest.py +++ b/tests/integration/conftest.py @@ -32,7 +32,6 @@ def temp_dir(): shutil.rmtree(TEST_TMP_DIR, ignore_errors=True) TEST_TMP_DIR.mkdir(parents=True, exist_ok=True) yield TEST_TMP_DIR - shutil.rmtree(TEST_TMP_DIR, ignore_errors=True) @pytest.fixture(scope="session") diff --git a/tests/server/conftest.py b/tests/server/conftest.py index d775db63..9f6d6279 100644 --- a/tests/server/conftest.py +++ b/tests/server/conftest.py @@ -17,6 +17,7 @@ from openviking import AsyncOpenViking from openviking.server.app import create_app from openviking.server.config import ServerConfig +from openviking.server.identity import RequestContext, Role from openviking.service.core import OpenVikingService from openviking_cli.session.user_id import UserIdentifier @@ -99,8 +100,10 @@ async def client(app): @pytest_asyncio.fixture(scope="function") async def client_with_resource(client, service, sample_markdown_file): """Client + a resource already added and processed.""" + ctx = RequestContext(user=UserIdentifier.the_default_user(), role=Role.ROOT) result = await service.resources.add_resource( path=str(sample_markdown_file), + ctx=ctx, reason="test resource", wait=True, ) diff --git a/tests/server/test_api_sessions.py b/tests/server/test_api_sessions.py index 78b66f62..413907d3 100644 --- a/tests/server/test_api_sessions.py +++ b/tests/server/test_api_sessions.py @@ -5,11 +5,12 @@ import httpx +from openviking.server.identity import RequestContext, Role +from openviking_cli.session.user_id 
import UserIdentifier + async def test_create_session(client: httpx.AsyncClient): - resp = await client.post( - "/api/v1/sessions", json={} - ) + resp = await client.post("/api/v1/sessions", json={}) assert resp.status_code == 200 body = resp.json() assert body["status"] == "ok" @@ -27,9 +28,7 @@ async def test_list_sessions(client: httpx.AsyncClient): async def test_get_session(client: httpx.AsyncClient): - create_resp = await client.post( - "/api/v1/sessions", json={} - ) + create_resp = await client.post("/api/v1/sessions", json={}) session_id = create_resp.json()["result"]["session_id"] resp = await client.get(f"/api/v1/sessions/{session_id}") @@ -40,9 +39,7 @@ async def test_get_session(client: httpx.AsyncClient): async def test_add_message(client: httpx.AsyncClient): - create_resp = await client.post( - "/api/v1/sessions", json={} - ) + create_resp = await client.post("/api/v1/sessions", json={}) session_id = create_resp.json()["result"]["session_id"] resp = await client.post( @@ -56,9 +53,7 @@ async def test_add_message(client: httpx.AsyncClient): async def test_add_multiple_messages(client: httpx.AsyncClient): - create_resp = await client.post( - "/api/v1/sessions", json={} - ) + create_resp = await client.post("/api/v1/sessions", json={}) session_id = create_resp.json()["result"]["session_id"] # Add messages one by one; each add_message call should see @@ -85,13 +80,9 @@ async def test_add_multiple_messages(client: httpx.AsyncClient): assert count3 >= count2 -async def test_add_message_persistence_regression( - client: httpx.AsyncClient, service -): +async def test_add_message_persistence_regression(client: httpx.AsyncClient, service): """Regression: message payload must persist as valid parts across loads.""" - create_resp = await client.post( - "/api/v1/sessions", json={"user": "test"} - ) + create_resp = await client.post("/api/v1/sessions", json={"user": "test"}) session_id = create_resp.json()["result"]["session_id"] resp1 = await client.post( @@ 
-114,7 +105,8 @@ async def test_add_message_persistence_regression( assert get_resp.json()["result"]["message_count"] == 2 # Verify stored message content survives load/decode. - session = service.sessions.session(session_id) + ctx = RequestContext(user=UserIdentifier.the_default_user(), role=Role.ROOT) + session = service.sessions.session(ctx, session_id) await session.load() assert len(session.messages) == 2 assert session.messages[0].content == "Message A" @@ -122,9 +114,7 @@ async def test_add_message_persistence_regression( async def test_delete_session(client: httpx.AsyncClient): - create_resp = await client.post( - "/api/v1/sessions", json={} - ) + create_resp = await client.post("/api/v1/sessions", json={}) session_id = create_resp.json()["result"]["session_id"] # Add a message so the session file exists in storage @@ -141,9 +131,7 @@ async def test_delete_session(client: httpx.AsyncClient): async def test_compress_session(client: httpx.AsyncClient): - create_resp = await client.post( - "/api/v1/sessions", json={} - ) + create_resp = await client.post("/api/v1/sessions", json={}) session_id = create_resp.json()["result"]["session_id"] # Add some messages before committing @@ -152,16 +140,12 @@ async def test_compress_session(client: httpx.AsyncClient): json={"role": "user", "content": "Hello"}, ) - resp = await client.post( - f"/api/v1/sessions/{session_id}/commit" - ) + resp = await client.post(f"/api/v1/sessions/{session_id}/commit") assert resp.status_code == 200 assert resp.json()["status"] == "ok" -async def test_extract_session_jsonable_regression( - client: httpx.AsyncClient, service, monkeypatch -): +async def test_extract_session_jsonable_regression(client: httpx.AsyncClient, service, monkeypatch): """Regression: extract endpoint should serialize internal objects.""" class FakeMemory: @@ -173,14 +157,12 @@ def __init__(self, uri: str): def to_dict(self): return {"uri": self.uri} - async def fake_extract(_session_id: str): + async def 
fake_extract(_session_id: str, _ctx): return [FakeMemory("viking://user/memories/mock.md")] monkeypatch.setattr(service.sessions, "extract", fake_extract) - create_resp = await client.post( - "/api/v1/sessions", json={"user": "test"} - ) + create_resp = await client.post("/api/v1/sessions", json={"user": "test"}) session_id = create_resp.json()["result"]["session_id"] resp = await client.post(f"/api/v1/sessions/{session_id}/extract") diff --git a/tests/server/test_auth.py b/tests/server/test_auth.py index 70dec790..dd856dfc 100644 --- a/tests/server/test_auth.py +++ b/tests/server/test_auth.py @@ -167,3 +167,35 @@ async def test_agent_id_header_forwarded(auth_client: httpx.AsyncClient): headers={"X-API-Key": ROOT_KEY, "X-OpenViking-Agent": "my-agent"}, ) assert resp.status_code == 200 + + +async def test_cross_tenant_session_get_returns_not_found(auth_client: httpx.AsyncClient, auth_app): + """A user must not access another tenant's session by session_id.""" + manager = auth_app.state.api_key_manager + alice_key = await manager.create_account("acme", "alice") + bob_key = await manager.create_account("beta", "bob") + + create_resp = await auth_client.post( + "/api/v1/sessions", json={}, headers={"X-API-Key": alice_key} + ) + assert create_resp.status_code == 200 + session_id = create_resp.json()["result"]["session_id"] + + add_resp = await auth_client.post( + f"/api/v1/sessions/{session_id}/messages", + json={"role": "user", "content": "hello from alice"}, + headers={"X-API-Key": alice_key}, + ) + assert add_resp.status_code == 200 + + own_get = await auth_client.get( + f"/api/v1/sessions/{session_id}", headers={"X-API-Key": alice_key} + ) + assert own_get.status_code == 200 + assert own_get.json()["result"]["message_count"] == 1 + + cross_get = await auth_client.get( + f"/api/v1/sessions/{session_id}", headers={"X-API-Key": bob_key} + ) + assert cross_get.status_code == 404 + assert cross_get.json()["error"]["code"] == "NOT_FOUND" diff --git 
a/tests/storage/test_embedding_msg_converter_tenant.py b/tests/storage/test_embedding_msg_converter_tenant.py new file mode 100644 index 00000000..047c13cc --- /dev/null +++ b/tests/storage/test_embedding_msg_converter_tenant.py @@ -0,0 +1,42 @@ +# Copyright (c) 2026 Beijing Volcano Engine Technology Co., Ltd. +# SPDX-License-Identifier: Apache-2.0 + +"""Tenant-field backfill tests for EmbeddingMsgConverter.""" + +import pytest + +from openviking.core.context import Context +from openviking.storage.queuefs.embedding_msg_converter import EmbeddingMsgConverter +from openviking_cli.session.user_id import UserIdentifier + + +@pytest.mark.parametrize( + ("uri", "expected_space"), + [ + ( + "viking://user/memories/preferences/me.md", + lambda user: user.user_space_name(), + ), + ( + "viking://agent/memories/cases/me.md", + lambda user: user.agent_space_name(), + ), + ( + "viking://resources/doc.md", + lambda _user: "", + ), + ], +) +def test_embedding_msg_converter_backfills_account_and_owner_space(uri, expected_space): + user = UserIdentifier("acme", "alice", "helper") + context = Context(uri=uri, abstract="hello", user=user) + + # Simulate legacy producer that forgot tenant fields. + context.account_id = "" + context.owner_space = "" + + msg = EmbeddingMsgConverter.from_context(context) + + assert msg is not None + assert msg.context_data["account_id"] == "acme" + assert msg.context_data["owner_space"] == expected_space(user)
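The `openviking_cli/session/user_id.py` hunk above replaces `unique_space_name()` with two scoped names: a user-level space shared by all of a user's agents, and an agent-level space derived from the `(user_id, agent_id)` pair. A minimal sketch of that scheme, assuming only what the hunk shows (the class name `UserIdentifierSketch` and the three-argument constructor order are illustrative, not the real `UserIdentifier`):

```python
import hashlib


class UserIdentifierSketch:
    """Illustrative stand-in for UserIdentifier's new space-name methods."""

    def __init__(self, account_id: str, user_id: str, agent_id: str):
        self._account_id = account_id
        self._user_id = user_id
        self._agent_id = agent_id

    def user_space_name(self) -> str:
        # User-level space: the raw user_id, shared across all of
        # this user's agents (viking://user/... paths).
        return self._user_id

    def agent_space_name(self) -> str:
        # Agent-level space: md5(user_id + agent_id) truncated to 12 hex
        # chars, so each (user, agent) pair gets an isolated directory.
        return hashlib.md5(
            (self._user_id + self._agent_id).encode()
        ).hexdigest()[:12]

    def memory_space_uri(self) -> str:
        # Matches the rewritten URI layout in the diff:
        # viking://agent/{agent_space}/memories
        return f"viking://agent/{self.agent_space_name()}/memories"


u = UserIdentifierSketch("acme", "alice", "helper")
print(u.user_space_name())        # "alice"
print(len(u.agent_space_name()))  # 12
```

Note the design consequence checked by the parametrized test above: two agents of the same user hash to different agent spaces, while their user space stays identical.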