5 changes: 5 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -30,6 +30,7 @@ build/

# MCP files
openspace/config/config_mcp.json
.mcp.json

# Logs
logs/
@@ -46,6 +47,7 @@ showcase/.openspace/*
# GDPVal benchmark cache
gdpval_bench/.openspace/*
!gdpval_bench/.openspace/*.db
gdpval_bench/results/

# Embedding cache
embedding_cache/
@@ -73,3 +75,6 @@ openspace/skills/*
node_modules/
# Frontend local dependency link
frontend/node_modules

# Local scratch
tmp/
87 changes: 87 additions & 0 deletions AGENTS.md
@@ -0,0 +1,87 @@
# AGENTS

## Project Skill Bucket

For this repository, the project-scoped OpenSpace skill bucket is:

- `~/.codex/projects/openspace/skills`
- index: `~/.codex/projects/openspace/SKILL_INDEX.md`

Routing preference for work inside this repo:

1. project bucket `openspace`
2. shared local bucket `default`
3. common global skills

Mirror OpenSpace's own pattern:
- first run `/Users/admin/.codex/tools/route_codex_skills_via_openspace.py`
- prefilter by skill header metadata first
- only open the most likely 1-2 `SKILL.md` files
- avoid scanning every project skill file unless the user explicitly asks
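The routing pattern above can be sketched as a header-metadata prefilter. This is a minimal illustration only: the real `route_codex_skills_via_openspace.py` tool and the actual `SKILL.md` header format may differ, and the keyword-overlap scoring here is an assumed stand-in for whatever ranking the tool uses.

```python
from pathlib import Path


def read_header(skill_md: Path, max_lines: int = 12) -> str:
    """Read only the leading header block of a SKILL.md, not the whole file."""
    with skill_md.open(encoding="utf-8") as fh:
        return "".join(line for _, line in zip(range(max_lines), fh))


def prefilter(skill_root: Path, query_terms: list[str], top_k: int = 2) -> list[Path]:
    """Rank skills by header keyword overlap; return only the top matches.

    Mirrors the routing preference: prefilter by header metadata first,
    then open at most `top_k` SKILL.md files instead of scanning them all.
    """
    scored = []
    for skill_md in sorted(skill_root.glob("*/SKILL.md")):
        header = read_header(skill_md).lower()
        score = sum(term.lower() in header for term in query_terms)
        if score:
            scored.append((score, skill_md))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [path for _, path in scored[:top_k]]
```

In this sketch `skill_root` would be the project bucket (`~/.codex/projects/openspace/skills`), falling back to the `default` bucket and then common global skills when nothing scores.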

## Codex Desktop Sidecar Evolution

Use this workflow when the user is coding in Codex Desktop with their normal subscription login and wants OpenSpace to do post-task skill capture through the isolated `openspace_evolution` sidecar.

Rules:
- Keep the main coding workflow unchanged.
- Do not switch the main Codex Desktop session to a provider-backed model.
- Do not modify code as part of sidecar evolution unless the user separately asks for code changes.
- Do not let OpenSpace take over the main task.
- Use the sidecar only for post-task skill capture.
- Prefer at most 1 new high-reuse skill per invocation unless the user explicitly asks for more.

When the user asks for sidecar self-evolution, call:
- `openspace_evolution.evolve_from_context`

Trigger phrases (literal Chinese strings matched verbatim; English glosses in parentheses):
- `sidecar 自进化一下` (run a sidecar self-evolution)
- `做一次 sidecar 自进化` (do one sidecar self-evolution pass)
- `对当前这轮工作做一次 sidecar 自进化` (run sidecar self-evolution on the current round of work)
- `用 sidecar 沉淀一个 skill` (use the sidecar to distill a skill)
- `基于当前改动做一次 sidecar skill capture` (run a sidecar skill capture based on the current changes)
- `不要改代码,做一次 sidecar 自进化` (don't change code; run a sidecar self-evolution)

If the user uses one of these phrases, default to this workflow automatically unless they explicitly ask for a different behavior.

Derive the tool inputs from:
- the current conversation
- the current `git diff`
- the key changed files

Behavior:
- Infer a concise `task`
- Infer a concise but specific `summary`
- Pass the most relevant changed files in `file_paths`
- Use `max_skills = 1` by default
- After the tool returns, report:
- the skill name
- the skill path
- why the skill is worth keeping
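The input-derivation steps above can be sketched as a small helper. The field names (`task`, `summary`, `file_paths`, `max_skills`) follow the inputs listed in this section, but the exact schema of `openspace_evolution.evolve_from_context` is an assumption here, as is the specific `git` command mentioned in the comment:

```python
def changed_files(name_only_output: str, max_files: int = 8) -> list[str]:
    """Parse `git diff --name-only` output into a bounded file list.

    The file list would typically come from:
        git diff --name-only HEAD
    """
    files = [line.strip() for line in name_only_output.splitlines() if line.strip()]
    return files[:max_files]


def build_payload(task: str, summary: str, files: list[str]) -> dict:
    """Assemble the inputs for the evolve_from_context call."""
    return {
        "task": task,          # concise, inferred from the conversation
        "summary": summary,    # concise but specific
        "file_paths": files,   # most relevant changed files
        "max_skills": 1,       # default per the rules above
    }
```

The `max_files` cap is illustrative; the point is to pass only the most relevant changed files rather than the whole diff.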

Recommended user-facing invocation (in Chinese, as the user would type it):

```text
对当前这轮工作做一次 sidecar 自进化。不要改代码,不要接管任务。请调用 openspace_evolution.evolve_from_context,基于当前对话、git diff 和关键改动,自动提炼 task/summary,最多生成 1 个高复用 skill,并告诉我 skill 名称、路径、为什么值得保留。
```

English gloss: "Run a sidecar self-evolution on the current round of work. Do not change code and do not take over the task. Call `openspace_evolution.evolve_from_context`; based on the current conversation, the git diff, and the key changes, infer the task/summary automatically, generate at most 1 high-reuse skill, and tell me the skill name, path, and why it is worth keeping."

## New Project Bootstrap

If the user wants this sidecar workflow in a new repository, treat it as a project bootstrap task first.

Bootstrap order:
- Add or update a project launcher before relying on sidecar evolution.
- Point `OPENSPACE_WORKSPACE` at the new repository root.
- Keep the user's main Codex Desktop workflow unchanged.
- Do not modify global `~/.codex` defaults unless the user explicitly asks.

Expected bootstrap outputs:
- a project-level launcher such as `scripts/codex-desktop-evolution`
- a project-level `AGENTS.md` section documenting the sidecar trigger phrases
- sidecar skill output routed to `~/.codex-openspace-desktop/projects/<project-name>/skills`

When a user asks to initialize a new project for this workflow, default to:
- creating the launcher first
- wiring `OPENSPACE_WORKSPACE` to the repository root
- preserving the normal Codex Desktop login path
- only then enabling phrases like `sidecar 自进化一下`
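The launcher wiring can be sketched as follows. This is a hypothetical shape for `scripts/codex-desktop-evolution`: the `openspace-mcp` command in `main()` is an assumed entrypoint, and the real launcher may set additional variables.

```python
#!/usr/bin/env python3
"""Hypothetical sketch of a project launcher (scripts/codex-desktop-evolution)."""
import os
import subprocess
from pathlib import Path


def sidecar_env(repo_root: Path, base_env: dict) -> dict:
    """Copy the caller's environment and point the sidecar at this repo.

    Only OPENSPACE_WORKSPACE is changed; the normal Codex Desktop login
    path and global ~/.codex defaults are left untouched.
    """
    env = dict(base_env)
    env["OPENSPACE_WORKSPACE"] = str(repo_root)
    return env


def main() -> int:
    repo_root = Path(__file__).resolve().parent.parent
    # Assumed sidecar entrypoint; substitute the real command for your setup.
    return subprocess.call(["openspace-mcp"], env=sidecar_env(repo_root, dict(os.environ)))
```

The design choice matches the bootstrap order above: the launcher owns the `OPENSPACE_WORKSPACE` wiring, so the main Codex Desktop session never needs to change.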
16 changes: 16 additions & 0 deletions README.md
@@ -157,6 +157,22 @@ pip install -e .
openspace-mcp --help # verify installation
```

> [!TIP]
> **Recommended split routing for OpenAI-compatible gateways**
>
> If your main model runs through an OpenAI-compatible provider or local relay (for example `gpt-5.4` via `http://127.0.0.1:8080/v1`), the recommended default is:
> - keep the main LLM on that provider via `OPENSPACE_LLM_*`
> - keep skill-router embeddings local via `OPENSPACE_SKILL_EMBEDDING_BACKEND=local`
> - use `OPENSPACE_SKILL_EMBEDDING_MODEL=BAAI/bge-small-en-v1.5`
>
> Why this is the default recommendation:
> - lower latency for routing and prefilter
> - no dependence on a remote `/v1/embeddings` endpoint
> - no extra token spend for embedding generation
> - stronger main LLM still handles final reasoning and selection
>
> Architecture notes and flow diagram: [`docs/current-routing-flow.md`](docs/current-routing-flow.md)
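The split-routing settings above can be sanity-checked with a small sketch. The environment variable names and recommended values come from this section; the check function itself is illustrative and not part of OpenSpace:

```python
import os

# Recommended split routing: main LLM via the provider (OPENSPACE_LLM_*),
# skill-router embeddings local.
RECOMMENDED = {
    "OPENSPACE_SKILL_EMBEDDING_BACKEND": "local",
    "OPENSPACE_SKILL_EMBEDDING_MODEL": "BAAI/bge-small-en-v1.5",
}


def check_split_routing(env: dict) -> list[str]:
    """Return warnings when the embedding side is not routed locally."""
    warnings = []
    for key, expected in RECOMMENDED.items():
        actual = env.get(key)
        if actual != expected:
            warnings.append(f"{key}={actual!r}, recommended {expected!r}")
    return warnings
```

A quick check against the current shell would be `check_split_routing(dict(os.environ))`; an empty list means the embedding side matches the recommended local configuration.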

> [!TIP]
> **Slow clone?** The `assets/` folder (~50 MB of images) makes the default clone large. Use this lightweight alternative to skip it:
> ```bash
16 changes: 16 additions & 0 deletions README_CN.md
@@ -157,6 +157,22 @@ pip install -e .
openspace-mcp --help # 验证安装
```

> [!TIP]
> **OpenAI 兼容网关下,默认推荐双路由方案 A**
>
> 如果你的主模型走的是 OpenAI 兼容 provider 或本地 relay(例如 `gpt-5.4` 走 `http://127.0.0.1:8080/v1`),当前最推荐的默认配置是:
> - 主 LLM 继续走 `OPENSPACE_LLM_*`
> - skill router 的 embedding 走本地:`OPENSPACE_SKILL_EMBEDDING_BACKEND=local`
> - 本地 embedding 模型使用:`OPENSPACE_SKILL_EMBEDDING_MODEL=BAAI/bge-small-en-v1.5`
>
> 这样选的原因:
> - 路由和预筛选延迟更低
> - 不依赖远程 `/v1/embeddings`
> - embedding 不额外消耗 provider token
> - 更强的主 LLM 仍然负责最终推理和选择
>
> 架构说明和流程图见:[`docs/current-routing-flow.md`](docs/current-routing-flow.md)

> [!TIP]
> **Clone 太慢?** `assets/` 目录包含约 50 MB 的图片文件,导致仓库较大。使用以下轻量方式跳过它:
> ```bash
28 changes: 28 additions & 0 deletions context/local-machine/admin-macos/README.md
@@ -0,0 +1,28 @@
## Admin macOS Local Context

This directory keeps machine-specific snapshots that are useful for future
deployment, migration, and debugging on other machines, while avoiding noise in
the repo root.

Included here:

- `mcp/repo-local.mcp.json`
- snapshot of the repo-local MCP wiring that was used during local debugging
- `gdpval_bench/...`
- selected benchmark result snapshots that were useful during local call-rate
and provider-path investigation

Intentional choices:

- absolute local paths are preserved because they are part of the context
- localhost API base values are preserved because they document the local stack
- secrets are not preserved
- any benchmark config copied here has API keys redacted

Intentionally omitted from this snapshot:

- SQLite/WAL benchmark databases
- raw recording directories
- ad hoc `tmp/` scratch files

Those source locations remain local-only and are ignored via `.gitignore`.
@@ -0,0 +1,19 @@
{
"clawwork_root": "/tmp/openspace-bench-mp4o3U",
"gdpval_path": null,
"model": "gpt-5.4",
"max_iterations": 20,
"backend_scope": [
"shell"
],
"use_clawwork_productivity": false,
"run_name": "codex_callrate_smoke",
"max_tasks": 3,
"per_occupation": null,
"sectors": null,
"occupations": null,
"task_ids": null,
"record_call_details": true,
"enable_evaluation": false,
"concurrency": 1
}
@@ -0,0 +1,19 @@
{
"clawwork_root": "/tmp/openspace-bench-noref-JwyV4l",
"gdpval_path": null,
"model": "gpt-5.4",
"max_iterations": 20,
"backend_scope": [
"shell"
],
"use_clawwork_productivity": false,
"run_name": "codex_callrate_smoke_noref",
"max_tasks": 3,
"per_occupation": null,
"sectors": null,
"occupations": null,
"task_ids": null,
"record_call_details": true,
"enable_evaluation": false,
"concurrency": 1
}
@@ -0,0 +1,6 @@
{"task_id": "0112fc9b-c3b2-4084-8993-5a4abb1f54f1", "phase": "phase1", "occupation": "Nurse Practitioners", "sector": "Health Care and Social Assistance", "task_value_usd": 0.0, "status": "error", "tokens": {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0, "llm_calls": 0, "cost_usd": 0.0, "wall_time_sec": 0.01, "agent_prompt_tokens": 0, "agent_completion_tokens": 0, "agent_total_tokens": 0, "agent_llm_calls": 0, "call_details": []}, "execution": {"iterations": 0, "tool_calls": 0, "time_sec": 0.01}, "skills": {"before": 2, "after": 2, "new_this_task": 0, "evolved": [], "used": []}, "evaluation": {"has_evaluation": false}, "timestamp": "2026-04-12T02:37:44.273256"}
{"task_id": "02314fc6-a24e-42f4-a8cd-362cae0f0ec1", "phase": "phase1", "occupation": "General and Operations Managers", "sector": "Retail Trade", "task_value_usd": 0.0, "status": "error", "tokens": {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0, "llm_calls": 0, "cost_usd": 0.0, "wall_time_sec": 0.0, "agent_prompt_tokens": 0, "agent_completion_tokens": 0, "agent_total_tokens": 0, "agent_llm_calls": 0, "call_details": []}, "execution": {"iterations": 0, "tool_calls": 0, "time_sec": 0.0}, "skills": {"before": 2, "after": 2, "new_this_task": 0, "evolved": [], "used": []}, "evaluation": {"has_evaluation": false}, "timestamp": "2026-04-12T02:37:44.278249"}
{"task_id": "02aa1805-c658-4069-8a6a-02dec146063a", "phase": "phase1", "occupation": "Project Management Specialists", "sector": "Professional, Scientific, and Technical Services", "task_value_usd": 0.0, "status": "error", "tokens": {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0, "llm_calls": 0, "cost_usd": 0.0, "wall_time_sec": 0.0, "agent_prompt_tokens": 0, "agent_completion_tokens": 0, "agent_total_tokens": 0, "agent_llm_calls": 0, "call_details": []}, "execution": {"iterations": 0, "tool_calls": 0, "time_sec": 0.0}, "skills": {"before": 2, "after": 2, "new_this_task": 0, "evolved": [], "used": []}, "evaluation": {"has_evaluation": false}, "timestamp": "2026-04-12T02:37:44.282846"}
{"task_id": "0112fc9b-c3b2-4084-8993-5a4abb1f54f1", "phase": "phase1", "occupation": "Nurse Practitioners", "sector": "Health Care and Social Assistance", "task_value_usd": 0.0, "status": "error", "tokens": {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0, "llm_calls": 0, "cost_usd": 0.0, "wall_time_sec": 0.0, "agent_prompt_tokens": 0, "agent_completion_tokens": 0, "agent_total_tokens": 0, "agent_llm_calls": 0, "call_details": []}, "execution": {"iterations": 0, "tool_calls": 0, "time_sec": 0.0}, "skills": {"before": 2, "after": 2, "new_this_task": 0, "evolved": [], "used": []}, "evaluation": {"has_evaluation": false}, "timestamp": "2026-04-12T02:38:19.523517"}
{"task_id": "02314fc6-a24e-42f4-a8cd-362cae0f0ec1", "phase": "phase1", "occupation": "General and Operations Managers", "sector": "Retail Trade", "task_value_usd": 0.0, "status": "error", "tokens": {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0, "llm_calls": 0, "cost_usd": 0.0, "wall_time_sec": 0.0, "agent_prompt_tokens": 0, "agent_completion_tokens": 0, "agent_total_tokens": 0, "agent_llm_calls": 0, "call_details": []}, "execution": {"iterations": 0, "tool_calls": 0, "time_sec": 0.0}, "skills": {"before": 2, "after": 2, "new_this_task": 0, "evolved": [], "used": []}, "evaluation": {"has_evaluation": false}, "timestamp": "2026-04-12T02:38:19.526964"}
{"task_id": "02aa1805-c658-4069-8a6a-02dec146063a", "phase": "phase1", "occupation": "Project Management Specialists", "sector": "Professional, Scientific, and Technical Services", "task_value_usd": 0.0, "status": "error", "tokens": {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0, "llm_calls": 0, "cost_usd": 0.0, "wall_time_sec": 0.0, "agent_prompt_tokens": 0, "agent_completion_tokens": 0, "agent_total_tokens": 0, "agent_llm_calls": 0, "call_details": []}, "execution": {"iterations": 0, "tool_calls": 0, "time_sec": 0.0}, "skills": {"before": 2, "after": 2, "new_this_task": 0, "evolved": [], "used": []}, "evaluation": {"has_evaluation": false}, "timestamp": "2026-04-12T02:38:19.529894"}