@@ -13,8 +13,6 @@
[](https://modelcontextprotocol.io/)
[](https://www.python.org/)
[](https://opensource.org/licenses/MIT/)
-[](./COMMUNICATION.md)
-[](./COMMUNICATION.md)
[](./README_CN.md)
**One Command to Evolve All Your AI Agents**: OpenClaw, nanobot, Claude Code, Codex, Cursor, and more.
@@ -126,9 +124,9 @@ On 50 professional tasks (**📈 [GDPVal Economic Benchmark](#-benchmark-gdpval)
- [📊 Local Dashboard](#-local-dashboard)
- [📈 Benchmark: GDPVal](#-benchmark-gdpval)
- [📊 Showcase: My Daily Monitor](#-showcase-my-daily-monitor)
-- [🏗️ Framework](#️-framework)
+- [🏗️ OpenSpace's Framework](#️-openspaces-framework)
- [🧬 Self-Evolution Engine](#-self-evolution-engine)
- - [🌐 Cloud Skill Community](#-cloud-skill-community)
+ - [🌐 Collaborative Skill Community](#-collaborative-skill-community)
- [🔧 Advanced Configuration](#-advanced-configuration)
- [📖 Code Structure](#-code-structure)
- [🤝 Contribute & Roadmap](#-contribute--roadmap)
@@ -141,7 +139,7 @@ On 50 professional tasks (**📈 [GDPVal Economic Benchmark](#-benchmark-gdpval)
🌐 **Just want to explore?** Browse community skills, evolution lineage at **[open-space.cloud](https://open-space.cloud)** — no installation needed.
```bash
-git clone https://github.com/HKUDS/OpenSpace.git && cd OpenSpace
+git clone https://github.com/Deepfreezechill/OpenSpace.git && cd OpenSpace
pip install -e .
openspace-mcp --help # verify installation
```
@@ -149,12 +147,20 @@ openspace-mcp --help # verify installation
> [!TIP]
> **Slow clone?** The `assets/` folder (~50 MB of images) makes the default clone large. Use this lightweight alternative to skip it:
> ```bash
-> git clone --filter=blob:none --sparse https://github.com/HKUDS/OpenSpace.git
+> git clone --filter=blob:none --sparse https://github.com/Deepfreezechill/OpenSpace.git
> cd OpenSpace
> git sparse-checkout set '/*' '!assets/'
> pip install -e .
> ```
+
+> [!NOTE]
+> **Windows users:** If `pip install` fails with a path-length error, enable long paths:
+> ```powershell
+> # Run PowerShell as Administrator, then restart your terminal
+> New-ItemProperty -Path "HKLM:\SYSTEM\CurrentControlSet\Control\FileSystem" -Name "LongPathsEnabled" -Value 1 -PropertyType DWORD -Force
+> ```
+> This is a one-time Windows configuration. Linux and macOS are unaffected.
+
**Choose your path:**
- **[Path A](#-path-a-for-your-agent)** — Plug OpenSpace into your agent
- **[Path B](#-path-b-as-your-co-worker)** — Use OpenSpace directly as your AI co-worker
@@ -427,6 +433,8 @@ Multi-Layer Tracking: Quality monitoring covers the entire execution stack — f
- Safety checks flag dangerous patterns (prompt injection, credential exfiltration)
- Evolved skills are validated before replacing predecessors
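A minimal sketch of the flag-then-validate flow (the patterns and function names here are illustrative only; the real safety checks work on AST blocklists, not regexes):

```python
import re

# Hypothetical sketch: OpenSpace's actual safety layer uses AST-based
# blocklists; this regex pass only illustrates the flag-before-replace idea.
DANGEROUS_PATTERNS = {
    "prompt_injection": re.compile(r"ignore (all )?previous instructions", re.I),
    "credential_exfiltration": re.compile(
        r"(api[_-]?key|secret|token)\s*[=:].*(http|requests\.post)", re.I
    ),
}

def flag_dangerous(skill_source: str) -> list[str]:
    """Return the names of all dangerous patterns found in a skill's source."""
    return [name for name, pat in DANGEROUS_PATTERNS.items() if pat.search(skill_source)]

def validate_before_replace(candidate: str, predecessor: str) -> str:
    """Keep the predecessor unless the evolved candidate passes every check."""
    return predecessor if flag_dangerous(candidate) else candidate
```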
+
+
**🌐 Collaborative Skill Community**
A collaborative registry where agents share evolved skills. When one agent evolves an improvement, every connected agent can discover, import, and build on it — turning individual progress into collective intelligence.
@@ -551,11 +559,11 @@ OpenSpace builds upon the following open-source projects. We sincerely thank the
If you find OpenSpace helpful, please consider giving us a star! ⭐
@@ -568,6 +576,6 @@ If you find OpenSpace helpful, please consider giving us a star! ⭐
❤️ Thanks for visiting ✨ OpenSpace!
-
diff --git a/README_CN.md b/README_CN.md
index eca8642..0b972f1 100644
--- a/README_CN.md
+++ b/README_CN.md
@@ -12,8 +12,6 @@
[](https://www.python.org/)
[](https://nodejs.org/)
[](https://opensource.org/licenses/MIT/)
-[](./COMMUNICATION.md)
-[](./COMMUNICATION.md)
**一条命令,进化你所有的 AI Agent**:OpenClaw、nanobot、Claude Code、Codex、Cursor 等
@@ -124,9 +122,9 @@ Skill 能够自动学习并持续提升
- [📊 本地仪表盘](#-本地仪表盘)
- [📈 基准测试:GDPVal](#-基准测试gdpval)
- [📊 案例展示:My Daily Monitor](#-案例展示my-daily-monitor)
-- [🏗️ 框架](#️-框架)
+- [🏗️ OpenSpace 框架](#️-openspace-框架)
- [🧬 自我进化引擎](#-自我进化引擎)
- - [🌐 云端 Skill 社区](#-云端-skill-社区)
+ - [🌐 协作 Skill 社区](#-协作-skill-社区)
- [🔧 高级配置](#-高级配置)
- [📖 代码结构](#-代码结构)
- [🤝 贡献与路线图](#-贡献与路线图)
@@ -139,7 +137,7 @@ Skill 能够自动学习并持续提升
🌐 **只想看看?** 在 **[open-space.cloud](https://open-space.cloud)** 浏览社区 Skill 和进化谱系——无需安装。
```bash
-git clone https://github.com/HKUDS/OpenSpace.git && cd OpenSpace
+git clone https://github.com/Deepfreezechill/OpenSpace.git && cd OpenSpace
pip install -e .
openspace-mcp --help # 验证安装
```
@@ -147,7 +145,7 @@ openspace-mcp --help # 验证安装
> [!TIP]
> **Clone 太慢?** `assets/` 目录包含约 50 MB 的图片文件,导致仓库较大。使用以下轻量方式跳过它:
> ```bash
-> git clone --filter=blob:none --sparse https://github.com/HKUDS/OpenSpace.git
+> git clone --filter=blob:none --sparse https://github.com/Deepfreezechill/OpenSpace.git
> cd OpenSpace
> git sparse-checkout set '/*' '!assets/'
> pip install -e .
@@ -425,6 +423,8 @@ OpenSpace 的核心。Skill 不是静态文件——它们是能够自动选择
- 安全检查标记危险模式(Prompt Injection、凭证窃取)
- 进化后的 Skill 经验证后才替换前代
+
+
**🌐 协作 Skill 社区**
一个协作式注册中心,Agent 在此共享进化后的 Skill。当一个 Agent 完成改进,所有连接的 Agent 都可以发现、导入并在此基础上构建——将个体进步转化为集体智慧。
@@ -554,6 +554,6 @@ OpenSpace 构建于以下开源项目之上。我们衷心感谢其作者和贡
❤️ 感谢访问 ✨ OpenSpace!
-
diff --git a/RUNBOOK.yaml b/RUNBOOK.yaml
deleted file mode 100644
index d77d77b..0000000
--- a/RUNBOOK.yaml
+++ /dev/null
@@ -1,663 +0,0 @@
-schema_version: '1.0'
-lock:
- active_claim: null
- claimed_by: null
- claimed_at: '2026-04-07T01:20:00Z'
-review:
- max_rounds: 3
- required_gates:
- - eight_eyes
- - collab
- merge_policy: human_approval
- rollback_on_max_rounds: true
-completed_phases:
-- id: P0
- name: "Phase 0 — Emergency Hardening"
- epics: 11
- pr_range: "#433–#467"
-- id: P1
- name: "Phase 1 — Foundation Architecture"
- epics: 8
- pr_range: "#451–#459"
-- id: P2
- name: "Phase 2 — Smart Sandbox"
- epics: 6
- pr_range: '#1'
-- id: P3
- name: "Phase 3 — Skill Store Decomposition"
- epics: 7
- pr_range: "#12–#28"
- notes: 'Monolithic store.py (1,345 lines) decomposed into 5 focused modules (SkillRepository,
- LineageTracker, AnalysisStore, TagSearch, MigrationManager) plus types.py. 7 epics
- + 3 validation/cleanup PRs. 1,356 tests passing.
-
- '
-- id: P4
- name: "Phase 4 — Tool Layer + MCP Decomposition"
- epics: 10
- pr_range: "#31–#38"
- notes: 'tool_layer.py (788 lines) decomposed into 4 focused modules (ToolRegistry,
- ExecutionEngine, RecordingService, LLMFactory) + facade cleanup. mcp_server.py
- (992 lines) decomposed into openspace/mcp/ package (server.py, tool_handlers.py)
- + backward-compat shim. 2 epics skipped (4.2 covered by 4.1, 4.8 covered by P3).
- 62 new tests. 1,408 passing.
-
- '
-active_phase:
- id: P5
- name: "Phase 5 — Skill Engine / Grounding Decomposition"
- status: in_progress
- epics_total: 11
- epics_merged: 0
-epics:
-- id: '2.6'
- phase: P2
- title: Secret Broker
- status: merged
- branch: epic/2.6-secret-broker
- pr: 1
- review_round: 5
- description: 'Scoped storage, encryption at rest, capability-based access with owner
- isolation. 64 tests, 1028 suite passing.
-
- '
- acceptance:
- - SecretBroker passes all 64 unit tests
- - Owner isolation enforced across get/revoke/list/put
- - Full suite 1028+ green
- - /8eyes + /collab buyoff
-- id: '3.1'
- phase: P3
- title: Extract SkillSchema + SkillVersion models
- status: merged
- branch: epic/3.1-skill-schema
- pr: 12
- review_round: 2
- source_file: openspace/skill_engine/store.py
- target_module: openspace/skill_engine/skill_schema.py
- description: 'Extract SkillSchema and SkillVersion frozen dataclasses/models from
- store.py. These are the core domain types for the skill persistence layer.
-
- '
- acceptance:
- - skill_schema.py contains all model definitions
- - store.py imports from skill_schema.py
- - All existing tests pass unchanged
-- id: '3.2'
- phase: P3
- title: Extract SkillRepository (CRUD)
- status: merged
- branch: epic/3.2-skill-repository
- pr: 15
- review_round: 2
- source_file: openspace/skill_engine/store.py
- target_module: openspace/skill_engine/skill_repository.py
- description: 'Extract save_record, load_record, load_all, load_active, delete_record,
- deactivate/reactivate and all sync variants into a focused repository class.
-
- '
- acceptance:
- - skill_repository.py handles all CRUD operations
- - store.py delegates to SkillRepository
- - All existing tests pass unchanged
-- id: '3.3'
- phase: P3
- title: Extract LineageTracker
- status: merged
- branch: epic/3.3-lineage-tracker
- pr: 17
- review_round: 2
- source_file: openspace/skill_engine/store.py
- target_module: openspace/skill_engine/lineage_tracker.py
- description: 'Extract get_ancestry, get_lineage_tree, _subtree, find_children, evolve_skill
- into a focused lineage/derivation graph module.
-
- '
- acceptance:
- - lineage_tracker.py owns all parent-child and derivation graph logic
- - All existing tests pass unchanged
-- id: '3.4'
- phase: P3
- title: Extract AnalysisStore
- status: merged
- branch: epic/3.4-analysis-store
- pr: 19
- review_round: 2
- source_file: openspace/skill_engine/store.py
- target_module: openspace/skill_engine/analysis_store.py
- description: 'Extract record_analysis, load_analyses, load_analyses_for_task, load_all_analyses,
- load_evolution_candidates, _insert_analysis into a focused analysis persistence
- module.
-
- '
- acceptance:
- - analysis_store.py owns quality scores and evolution trigger data
- - All existing tests pass unchanged
-- id: '3.5'
- phase: P3
- title: Extract TagIndex + SearchEngine
- status: merged
- branch: epic/3.5-tag-search
- pr: 21
- review_round: 2
- source_file: openspace/skill_engine/store.py
- target_module: openspace/skill_engine/tag_index_search.py
- description: 'Extract find_skills_by_tool, load_by_category, get_summary, get_stats,
- get_top_skills, get_task_skill_summary, count into search/analytics module.
-
- '
- acceptance:
- - tag_index_search.py owns all query and analytics operations
- - All existing tests pass unchanged
-- id: '3.6'
- phase: P3
- title: Extract MigrationManager
- status: merged
- branch: epic/3.6-migration-manager
- pr: 23
- review_round: 2
- source_file: openspace/skill_engine/store.py
- target_module: openspace/skill_engine/migration_manager.py
- description: 'Extract _init_db, _make_connection, _cleanup_wal_on_startup, _ensure_open,
- _db_retry, vacuum, clear into a DB lifecycle and schema migration module.
-
- '
- acceptance:
- - migration_manager.py owns DDL, WAL recovery, and connection lifecycle
- - All existing tests pass unchanged
-- id: '3.7'
- phase: P3
- title: P3 integration tests + store.py cleanup
- status: merged
- branch: epic/3.7-integration-cleanup
- pr: 25
- review_round: 4
- notes: 'Additional PRs: #26 (E2E validation, 7 blockers fixed), #27 (warning cleanup),
- #28 (batch hydration + mypy fixes). All merged. 1,356 tests passing.
-
- '
- source_file: openspace/skill_engine/store.py
- target_module: null
- description: 'Final integration epic: verify all 7 modules work together, add backward
- compatibility tests, then replace store.py with a thin re-export facade or delete
- entirely.
-
- '
- acceptance:
- - store.py is either deleted or reduced to re-export facade
- - All 7 extracted modules have dedicated tests
- - Full suite passes (1028+ tests)
- - /8eyes + /collab buyoff on complete P3
-- id: '4.1'
- phase: P4
- title: Extract ToolRegistry
- status: merged
- branch: epic/4.1-tool-registry
- pr: 31
- review_round: 2
- source_file: openspace/tool_layer.py
- target_module: openspace/tool_registry.py
- description: 'Extract tool registration, discovery, and backend listing from OpenSpace
- class.
-
- '
- acceptance:
- - tool_registry.py manages tool lifecycle independently
- - All existing tests pass unchanged
-- id: '4.2'
- phase: P4
- title: Extract SkillSelector
- status: skipped
- branch: null
- pr: null
- review_round: 0
- source_file: openspace/tool_layer.py
- target_module: openspace/skill_selector.py
- description: 'Extract _select_and_inject_skills, _get_skill_selection_llm into focused
- skill-first routing module.
-
- '
- acceptance:
- - "skill_selector.py owns the skill-first → fallback decision"
- - All existing tests pass unchanged
-- id: '4.3'
- phase: P4
- title: Extract ExecutionEngine
- status: merged
- branch: epic/4.3-execution-engine
- pr: 32
- review_round: 2
- source_file: openspace/tool_layer.py
- target_module: openspace/execution_engine.py
- description: 'Extract execute method and fallback orchestration logic.
-
- '
- acceptance:
- - execution_engine.py handles task execution with skill injection
- - All existing tests pass unchanged
-- id: '4.4'
- phase: P4
- title: Extract RecordingService
- status: merged
- branch: epic/4.4-recording-service
- pr: 33
- review_round: 1
- source_file: openspace/tool_layer.py
- target_module: openspace/recording_service.py
- description: 'Extract _maybe_analyze_execution, _maybe_evolve_quality and all trace
- capture/recording logic.
-
- '
- acceptance:
- - recording_service.py owns execution trace capture
- - All existing tests pass unchanged
-- id: '4.5'
- phase: P4
- title: Extract LLMAdapter
- status: merged
- branch: epic/4.5-llm-factory
- pr: 34
- review_round: 1
- source_file: openspace/tool_layer.py
- target_module: openspace/llm_adapter.py
- description: 'Extract LLM client initialization and provider abstraction.
-
- '
- acceptance:
- - llm_adapter.py provides a clean provider-agnostic LLM interface
- - All existing tests pass unchanged
-- id: '4.6'
- phase: P4
- title: Extract ToolLayerFacade
- status: merged
- branch: epic/4.6-facade-cleanup
- pr: 35
- review_round: 2
- source_file: openspace/tool_layer.py
- target_module: openspace/tool_layer_facade.py
- description: 'Reduce tool_layer.py to thin facade that composes the 5 extracted
- modules. Preserve public API (OpenSpace class, OpenSpaceConfig).
-
- '
- acceptance:
- - tool_layer.py is a thin re-export facade or deleted
- - Public API unchanged
- - All existing tests pass unchanged
-- id: '4.7'
- phase: P4
- title: Extract MCPToolHandlers
- status: merged
- branch: epic/4.7-mcp-tool-handlers
- pr: 36
- review_round: 2
- source_file: openspace/mcp_server.py
- target_module: openspace/mcp/tool_handlers.py
- description: 'Extract execute_task, search_skills, fix_skill, upload_skill handlers
- and their helpers into focused handler module.
-
- '
- acceptance:
- - tool_handlers.py contains all 4 MCP tool implementations
- - All existing tests pass unchanged
-- id: '4.8'
- phase: P4
- title: Extract MCPAuthMiddleware
- status: skipped
- branch: null
- pr: null
- review_round: 0
- source_file: openspace/mcp_server.py
- target_module: openspace/mcp/auth_middleware.py
- description: 'Extract RateLimitMiddleware, BearerTokenMiddleware, and HMAC auth
- into dedicated auth middleware module.
-
- '
- acceptance:
- - auth_middleware.py owns all MCP authentication/authorization
- - All existing tests pass unchanged
-- id: '4.9'
- phase: P4
- title: Extract MCPRouter + MCPServer
- status: merged
- branch: epic/4.9-mcp-server-extract
- pr: 37
- review_round: 2
- source_file: openspace/mcp_server.py
- target_module: openspace/mcp/server.py
- description: 'Extract endpoint wiring (MCPRouter) and server lifecycle (run_mcp_server)
- into clean server module. Create openspace/mcp/ package.
-
- '
- acceptance:
- - openspace/mcp/ package exists with clean separation
- - mcp_server.py is thin facade or deleted
- - All existing tests pass unchanged
-- id: '4.10'
- phase: P4
- title: P4 integration tests + MCP conformance
- status: merged
- branch: epic/4.10-integration-tests
- pr: 38
- review_round: 2
- source_file: null
- target_module: tests/test_mcp_conformance.py
- description: 'Integration tests verifying tool_layer and MCP server modules work
- together. MCP protocol conformance test harness.
-
- '
- acceptance:
- - Full suite passes (1028+ tests)
- - MCP protocol conformance tests pass
- - /8eyes + /collab buyoff on complete P4
-- id: '5.1'
- phase: P5
- title: Extract EvolutionModels
- status: done
- branch: null
- pr: 39
- review_round: 0
- source_file: openspace/skill_engine/evolver.py
- target_module: openspace/skill_engine/evolution/models.py
- description: 'Extract EvolutionTrigger, EvolutionContext, _sanitize_skill_name into
- domain models module.
-
- '
- acceptance:
- - models.py contains all evolution domain types
- - All existing tests pass unchanged
-- id: '5.2'
- phase: P5
- title: Extract EvolutionOrchestrator
- status: done
- branch: null
- pr: 40
- review_round: 0
- source_file: openspace/skill_engine/evolver.py
- target_module: openspace/skill_engine/evolution/orchestrator.py
- description: 'Extract evolve, _execute_contexts, schedule_background, _log_background_result
- into orchestration module.
-
- '
- acceptance:
- - orchestrator.py owns evolution scheduling and dispatch
- - All existing tests pass unchanged
-- id: '5.3'
- phase: P5
- title: Extract AnalysisEvolution + ToolDegradation + MetricMonitor
- status: done
- branch: null
- pr: 41
- review_round: 0
- source_file: openspace/skill_engine/evolver.py
- target_module: openspace/skill_engine/evolution/triggers/
- description: 'Extract process_analysis, process_tool_degradation, process_metric_check
- into three focused trigger modules.
-
- '
- acceptance:
- - Each trigger type has its own module
- - All existing tests pass unchanged
-- id: '5.4'
- phase: P5
- title: Extract EvolutionConfirmation
- status: done
- branch: null
- pr: 42
- review_round: 0
- source_file: openspace/skill_engine/evolver.py
- target_module: openspace/skill_engine/evolution/confirmation.py
- description: 'Extract _llm_confirm_evolution, _parse_confirmation into LLM confirmation/approval
- module.
-
- '
- acceptance:
- - confirmation.py owns the LLM-based evolution approval flow
- - All existing tests pass unchanged
-- id: '5.5'
- phase: P5
- title: Extract FixEngine + DerivedCapturedEngine
- status: done
- branch: null
- pr: 43
- review_round: 0
- source_file: openspace/skill_engine/evolver.py
- target_module: openspace/skill_engine/evolution/engines/
- description: 'Extract _evolve_fix, _evolve_derived, _evolve_captured, _run_evolution_loop,
- _parse_evolution_output, _apply_with_retry into evolution engine modules.
-
- '
- acceptance:
- - FIX, DERIVED, CAPTURED each have focused engine modules
- - All existing tests pass unchanged
-- id: '5.6'
- phase: P5
- title: "P5a integration — evolver.py deletion"
- status: done
- branch: null
- pr: 44
- review_round: 0
- source_file: openspace/skill_engine/evolver.py
- target_module: null
- description: 'Replace evolver.py with thin facade or delete. Verify all 9 evolution
- modules integrate correctly.
-
- '
- acceptance:
- - evolver.py is facade or deleted
- - openspace/skill_engine/evolution/ package complete
- - All existing tests pass
-- id: '5.7'
- phase: P5
- title: Extract SkillContextManager + MessageSafety
- status: done
- branch: null
- pr: 45
- review_round: 0
- source_file: openspace/agents/grounding_agent.py
- target_module: openspace/agents/grounding/
- description: 'Extract set_skill_context, clear_skill_context, _cap_message_content,
- _truncate_messages into focused modules.
-
- '
- acceptance:
- - Skill context and message safety are independent modules
- - All existing tests pass unchanged
-- id: '5.8'
- phase: P5
- title: Extract AgentExecutionLoop + PromptBuilder
- status: done
- branch: null
- pr: 46
- review_round: 0
- source_file: openspace/agents/grounding_agent.py
- target_module: openspace/agents/grounding/
- description: 'Extract process (core loop), construct_messages, _default_system_prompt
- into execution loop and prompt builder modules.
-
- '
- acceptance:
- - Core reasoning loop is independent of tool/visual/workspace concerns
- - All existing tests pass unchanged
-- id: '5.9'
- phase: P5
- title: Extract ToolCatalog + VisualAnalysis + WorkspaceInspector
- status: done
- branch: null
- pr: 47
- review_round: 0
- source_file: openspace/agents/grounding_agent.py
- target_module: openspace/agents/grounding/
- description: 'Extract _get_available_tools, _load_all_tools, _visual_analysis_callback,
- _scan_workspace_files, _check_workspace_artifacts into focused modules.
-
- '
- acceptance:
- - Tool discovery, visual analysis, and workspace scanning are independent
- - All existing tests pass unchanged
-- id: '5.10'
- phase: P5
- title: Extract ResultFormatter + Telemetry
- status: done
- branch: null
- pr: 48
- review_round: 0
- source_file: openspace/agents/grounding_agent.py
- target_module: openspace/agents/grounding/result_telemetry.py
- description: 'Extract _build_final_result, _format_tool_executions, _generate_final_summary,
- _record_agent_execution into result formatting and telemetry module.
-
- '
- acceptance:
- - Result formatting and execution recording are independent
- - All existing tests pass unchanged
-- id: '5.11'
- phase: P5
- title: "P5b integration — grounding_agent.py deletion"
- status: done
- branch: null
- pr: 49
- review_round: 0
- source_file: openspace/agents/grounding_agent.py
- target_module: null
- description: 'Replace grounding_agent.py with thin facade or delete. Verify all
- 8 grounding modules integrate correctly. Full /8eyes + /collab on P5.
-
- '
- acceptance:
- - grounding_agent.py is facade or deleted
- - openspace/agents/grounding/ package complete
- - Full suite passes (1028+ tests)
- - /8eyes + /collab buyoff on complete P5
-- id: '6.1'
- phase: P6
- title: Observability (metrics, tracing, health)
- status: done
- branch: epic/6.1-observability
- pr: 50
- review_round: 2
- description: 'Add structured metrics (execution latency, skill hit rate, evolution
- success rate), distributed tracing for multi-step workflows, and health check
- endpoints for MCP server.
-
- '
- acceptance:
- - Prometheus-compatible metrics endpoint
- - Health check endpoint returns structured status
- - Execution traces captured for debugging
-- id: '6.2'
- phase: P6
- title: SLOs (latency, error rate, availability)
- status: done
- branch: epic/6.2-slos
- pr: 51
- review_round: 2
- description: 'Define and enforce SLO targets: p99 execution latency, error rate
- budget, availability targets. Add SLO burn-rate alerting logic.
-
- '
- acceptance:
- - SLO targets documented and measurable
- - Burn-rate calculations implemented
- - Tests for SLO boundary conditions
-- id: '6.3'
- phase: P6
- title: Deployment architecture
- status: done
- branch: epic/6.3-deployment
- pr: 52
- review_round: 2
- description: 'Containerization (Dockerfile), config management (env-based), graceful
- shutdown, startup probes, resource limits.
-
- '
- acceptance:
- - Dockerfile builds and runs successfully
- - Config via environment variables documented
- - Graceful shutdown handles in-flight tasks
- notes:
- - DeployConfig frozen dataclass with env-based config
- - GracefulShutdownHandler with global monotonic deadline, 70/30 drain/hook budget
- - /health endpoint outside auth for K8s probes
- - HEALTHCHECK hits real /health endpoint
- - Dockerfile multi-stage, non-root, OPENSPACE_SHUTDOWN_TIMEOUT=8
- - 41 deployment tests, 1925 total passed
- - 2 review rounds, 9 agents total
-- id: '6.4'
- phase: P6
- title: DX polish (CLI help, errors, onboarding)
- status: done
- branch: epic/6.4-dx-polish
- pr: 53
- review_round: 2
- description: 'Improve CLI help text, error messages with actionable suggestions,
- first-run onboarding flow, quickstart documentation.
-
- '
- acceptance:
- - Every CLI command has --help with examples
- - Error messages include suggested fixes
- - Quickstart guide tested on clean install
- - /8eyes + /collab buyoff on complete P6
- notes:
- - Rich --help with epilog (examples, env vars, auth notes), --version
- - Preflight checks: bearer token, skill store, port, Python version
- - Platform-aware suggestions (PowerShell on Win, bash on Unix)
- - Output writes to sys.__stderr__ (console), not just logger (file)
- - 32 DX tests, 1957 total passed
- - 2 review rounds, 7 agents total (4 R1 + 3 R2)
-- id: '7.1'
- phase: P7
- title: Eight-eyes integration
- status: done
- branch: epic/7.1-review-gate
- pr: 54
- review_round: 3
- description: "Built-in /8eyes review for skill evolution — every evolved skill gets multi-agent review before activation.\n"
- acceptance:
- - Skill evolution triggers /8eyes review automatically
- - Evolved skills quarantined until review passes
- notes: |
- ReviewGate with 5-layer security: path traversal, extension allowlist,
- size limits, HIGH+CRITICAL AST blocking, lineage.validate().
- 63 tests (45 adversarial), 2020 total passed.
- 3 review rounds, 12 agent-reviews (impl, sec, GPT-5.4, Opus 4.6).
-- id: '7.2'
- phase: P7
- title: SkillGuard quality gates
- status: done
- branch: epic/7.2-skillguard
- pr: 55
- review_round: 2
- description: 'Pre-persist quality gates for all skill mutations. SkillGuard wraps
- ReviewGate+SkillStore. AST blocklist hardened to 40+ patterns covering RCE, sandbox
- escape, data exfil. Deep disk rollback on guard rejection. 2 review rounds, 7
- agent-reviews, all P0/P1s fixed.
-
- '
- acceptance:
- - SkillGuard blocks activation of unsafe skills
- - Capability scope validated against tier
- - Test coverage minimum enforced
- notes: '33 guard tests, 2056 total. Blocklist: builtins, __builtins__, sys.modules,
- os.exec*, os.spawn*, pty, code, http exfil, marshal/runpy, asyncio subprocess,
- signal, webbrowser, multiprocessing, shutil, os.remove, os.chmod. Deep rollback
- via rglob snapshot. sync_from_registry unguarded = accepted risk for 7.3.'
-- id: '7.3'
- phase: P7
- title: Launch checklist
- status: pending
- branch: null
- pr: null
- review_round: 0
- description: 'Final security audit, documentation completeness, release notes, changelog,
- license verification, dependency audit clean.
-
- '
- acceptance:
- - Security audit report clean
- - All docs up to date
- - CHANGELOG.md covers all phases
- - pip-audit clean (zero known vulnerabilities)
- - /8eyes + /collab buyoff on complete P7
- - PROJECT COMPLETE
diff --git a/SECURITY.md b/SECURITY.md
new file mode 100644
index 0000000..60febd1
--- /dev/null
+++ b/SECURITY.md
@@ -0,0 +1,49 @@
+# Security Policy
+
+## Supported Versions
+
+| Version | Supported |
+|---------|--------------------|
+| 2.0.x | ✅ Active support |
+| < 2.0 | ❌ Not supported |
+
+## Reporting a Vulnerability
+
+If you discover a security vulnerability in OpenSpace, please report it responsibly:
+
+1. **Do NOT open a public GitHub issue.**
+2. Email the maintainers at the address listed in the repository's GitHub Security Advisories tab.
+3. Alternatively, use [GitHub's private vulnerability reporting](https://github.com/Deepfreezechill/OpenSpace/security/advisories/new).
+
+### What to include
+
+- Description of the vulnerability
+- Steps to reproduce
+- Impact assessment (RCE, data leak, DoS, etc.)
+- Suggested fix (if any)
+
+### Response timeline
+
+- **Acknowledgment:** within 48 hours
+- **Initial assessment:** within 7 days
+- **Fix or mitigation:** within 30 days for critical/high severity
+
+## Security Architecture
+
+OpenSpace employs defense-in-depth:
+
+- **Bearer token + HMAC auth** on all MCP endpoints
+- **Capability lease model** with tier-gated access, expiration, and revocation
+- **5-layer ReviewGate** for skill evolution (path, extension, size, AST, lineage)
+- **40+ AST blocklist patterns** preventing RCE, sandbox escape, and data exfiltration
+- **Filesystem jailing** with TOCTOU race protection
+- **Network SSRF protection** blocking loopback, link-local, and domain rebinding
+- **E2B sandbox enforcement** as default execution environment
+- **PII redaction** in structured logging
+- **Dependency auditing** via `pip-audit` in CI
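A minimal sketch of the first layer, assuming a simple header scheme (the header names and signing format here are illustrative, not the exact wire protocol):

```python
import hashlib
import hmac

# Illustrative bearer-token + HMAC check. Constant-time comparisons
# (hmac.compare_digest) avoid timing side channels on both secrets.
def verify_request(headers: dict[str, str], body: bytes,
                   expected_token: str, signing_key: bytes) -> bool:
    token = headers.get("Authorization", "").removeprefix("Bearer ")
    if not hmac.compare_digest(token, expected_token):
        return False
    expected_sig = hmac.new(signing_key, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(headers.get("X-Signature", ""), expected_sig)
```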
+
+## Known Mitigations
+
+- `litellm` pinned to `>=1.83.0,<2` — excludes PYSEC-2026-2 (supply-chain compromise in 1.82.7/1.82.8) and fixes CVE-2026-35029 + CVE-2026-35030
+- All dependencies have upper-bound pins enforced by CI tests
+- Cloud auto-import disabled by default (must be explicitly enabled)
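The upper-bound rule could be enforced by a check along these lines (a sketch; the real CI test and requirement strings may differ):

```python
import re

# Sketch of a CI-style audit that every requirement carries a version
# ceiling. The requirement strings in the test are examples, not the
# project's actual lockfile.
def has_upper_bound(requirement: str) -> bool:
    """True if the version spec includes a '<', '==', or '~=' style ceiling."""
    spec = requirement.split(";")[0]  # drop environment markers
    return bool(re.search(r"(<=?|==|~=)\s*[\d.]", spec))

def audit(requirements: list[str]) -> list[str]:
    """Return requirements missing an upper bound (CI fails if non-empty)."""
    return [r for r in requirements if not has_upper_bound(r)]
```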
diff --git a/docs/enforcement/ARCHITECTURE.md b/docs/enforcement/ARCHITECTURE.md
index c42045e..20487d6 100644
--- a/docs/enforcement/ARCHITECTURE.md
+++ b/docs/enforcement/ARCHITECTURE.md
@@ -1,16 +1,15 @@
# 🔒 Phase Gate Enforcement Architecture
> **Classification:** Security-Critical Infrastructure
-> **Status:** DRAFT — Pending /8eyes Security Audit
-> **Author:** /collab Security & Enforcement Lead
-> **Pattern Lineage:** eight-eyes/circuit_breaker → squad-audit/label-enforce → openspace/phase-gates
+> **Status:** Stable
+> **Pattern Lineage:** circuit_breaker → label-enforce → openspace/phase-gates
---
## Executive Summary
A **fail-closed enforcement system** that makes it **technically impossible** to merge
-out-of-phase code into the openspace-upgrade repository without explicit, audited admin override.
+out-of-phase code into the OpenSpace repository without explicit, audited admin override.
**Key guarantee:** If the enforcement system crashes, is misconfigured, or encounters any
unexpected state — the merge is BLOCKED, not allowed.
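That guarantee can be captured in a few lines (an illustrative sketch, not the actual Action code):

```python
# Fail-closed sketch: any crash, misconfiguration, or unexpected return
# value in the gate check itself resolves to BLOCK, never ALLOW.
def enforce(check) -> str:
    """Run a gate check; only an explicit, clean True result allows merge."""
    try:
        result = check()
    except Exception:
        return "BLOCK"  # enforcement crashed: fail closed
    return "ALLOW" if result is True else "BLOCK"  # anything unexpected: BLOCK
```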
@@ -293,7 +292,7 @@ branches. The enforcement Action provides equivalent protection with zero branch
## 6. Failure Mode Analysis
-Inspired by eight-eyes circuit breaker: every failure mode is enumerated and handled.
+Inspired by the circuit breaker pattern: every failure mode is enumerated and handled.
| Failure | Category | System Behavior | Recovery |
|---------|----------|-----------------|----------|
@@ -308,9 +307,9 @@ Inspired by eight-eyes circuit breaker: every failure mode is enumerated and han
| Circular dependency in config | Config | FAIL: "circular dep detected" | Fix config |
| New phase added, config not updated | Config | FAIL: "milestone X not in config" | Update config |
-**Key insight from eight-eyes:** The circuit breaker pattern (HEALTHY → RETRY → CIRCUIT_OPEN)
+**Key insight from circuit breaker design:** The circuit breaker pattern (HEALTHY → RETRY → CIRCUIT_OPEN)
maps to our GitHub Actions retry strategy. We retry API calls 3x with backoff before failing.
-But unlike eight-eyes, we don't need AWAITING_USER because the "user" (developer) is already
+Unlike a full circuit breaker, we don't need AWAITING_USER because the "user" (developer) is already
present — they see the failed check and can re-run it.
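The retry strategy above, sketched with hypothetical function names:

```python
import time

# Three attempts with exponential backoff before the check fails
# (and, per the fail-closed design, a failed check blocks the merge).
def with_retry(call, attempts: int = 3, base_delay: float = 1.0, sleep=time.sleep):
    last_error = None
    for attempt in range(attempts):
        try:
            return call()
        except Exception as err:
            last_error = err
            if attempt < attempts - 1:
                sleep(base_delay * (2 ** attempt))  # 1s, then 2s
    raise RuntimeError("all retries exhausted; gate fails closed") from last_error
```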
## 7. Security Considerations
diff --git a/docs/enforcement/configure-branch-protection.sh b/docs/enforcement/configure-branch-protection.sh
index 1966fab..ef2dfc7 100644
--- a/docs/enforcement/configure-branch-protection.sh
+++ b/docs/enforcement/configure-branch-protection.sh
@@ -3,7 +3,7 @@
# configure-branch-protection.sh
# ============================================================================
#
-# Configures branch protection rules for the openspace-upgrade repo.
+# Configures branch protection rules for the OpenSpace repo.
# Run this ONCE after pushing the enforcement workflow.
#
# Prerequisites:
@@ -16,7 +16,7 @@
# ./configure-branch-protection.sh
#
# Example:
-# ./configure-branch-protection.sh Deepfreezechill openspace-upgrade
+# ./configure-branch-protection.sh Deepfreezechill OpenSpace
# ============================================================================
set -euo pipefail
diff --git a/docs/sdk-api-spec.md b/docs/sdk-api-spec.md
index 10db8b9..fa438c8 100644
--- a/docs/sdk-api-spec.md
+++ b/docs/sdk-api-spec.md
@@ -1,8 +1,7 @@
# SkillGuard SDK — REST API Specification
-> **Phase:** 1 (design only — implementation in Phase 6)
-> **Status:** Draft
-> **Version:** 0.1.0
+> **Version:** 2.0.0
+> **Status:** Stable
## Overview
@@ -297,7 +296,7 @@ Health check.
"ok": true,
"data": {
"status": "healthy",
- "version": "0.1.0",
+ "version": "2.0.0",
"initialized": true,
"backends": ["shell", "mcp", "gui"]
}
diff --git a/docs/sdk-public-surface.md b/docs/sdk-public-surface.md
index 56bcc6d..bb2df08 100644
--- a/docs/sdk-public-surface.md
+++ b/docs/sdk-public-surface.md
@@ -1,8 +1,7 @@
# SkillGuard SDK — Public API Surface
-> **Phase:** 1 (design only — implementation in Phase 6)
-> **Status:** Draft
-> **Version:** 0.1.0
+> **Version:** 2.0.0
+> **Status:** Stable
## Overview
diff --git a/openspace/__init__.py b/openspace/__init__.py
index 4868066..324d77b 100644
--- a/openspace/__init__.py
+++ b/openspace/__init__.py
@@ -10,7 +10,7 @@
from openspace.tool_layer import OpenSpace as OpenSpace
from openspace.tool_layer import OpenSpaceConfig as OpenSpaceConfig
-__version__ = "0.1.0"
+__version__ = "2.0.0"
__all__ = [
# Version
diff --git a/openspace/config/README.md b/openspace/config/README.md
index a68d4ac..355987b 100644
--- a/openspace/config/README.md
+++ b/openspace/config/README.md
@@ -5,7 +5,7 @@ All configuration applies to both Path A (host agent) and Path B (standalone). C
## 1. API Keys (`.env`)
> [!NOTE]
-> Create a `.env` file and add your API keys (refer to [`.env.example`](../../.env.example)). When used via host agent (Path A), LLM keys are auto-detected from your agent's config — `.env` is mainly needed for standalone mode.
+> Create a `.env` file and add your API keys (refer to [`.env.example`](../.env.example)). When used via host agent (Path A), LLM keys are auto-detected from your agent's config — `.env` is mainly needed for standalone mode.
## 2. Environment Variables
diff --git a/openspace/grounding/backends/mcp/transport/connectors/http.py b/openspace/grounding/backends/mcp/transport/connectors/http.py
index 0c0055b..9ab38eb 100644
--- a/openspace/grounding/backends/mcp/transport/connectors/http.py
+++ b/openspace/grounding/backends/mcp/transport/connectors/http.py
@@ -312,7 +312,7 @@ async def _try_jsonrpc_connection(self) -> None:
"params": {
"protocolVersion": "2024-11-05",
"capabilities": {},
- "clientInfo": {"name": "OpenSpace", "version": "1.0.0"},
+ "clientInfo": {"name": "OpenSpace", "version": "2.0.0"},
},
}
@@ -520,7 +520,7 @@ async def initialize(self) -> Dict[str, Any]:
{
"protocolVersion": "2024-11-05",
"capabilities": {},
- "clientInfo": {"name": "OpenSpace", "version": "1.0.0"},
+ "clientInfo": {"name": "OpenSpace", "version": "2.0.0"},
},
)
diff --git a/openspace/host_skills/README.md b/openspace/host_skills/README.md
index 0a555ed..0eb3c48 100644
--- a/openspace/host_skills/README.md
+++ b/openspace/host_skills/README.md
@@ -8,7 +8,7 @@ This guide covers **agent-specific setup** for integrating OpenSpace. For instal
|------------|-------------|
| **[nanobot](https://github.com/HKUDS/nanobot)** | [Setup for nanobot](#setup-for-nanobot) |
| **[openclaw](https://github.com/openclaw/openclaw)** | [Setup for openclaw](#setup-for-openclaw) |
-| **Other agents** | Follow the [generic setup](../../README.md#-path-a-empower-your-agent-with-openspace) in the main README |
+| **Other agents** | Follow the [generic setup](../../README.md#-path-a-for-your-agent) in the main README |
---
@@ -70,7 +70,7 @@ mcporter config add openspace --command "openspace-mcp" \
## Environment Variables (Agent-Specific)
-The three env vars in each agent's setup above are the most important. For the **full env var list**, config files reference, and advanced settings, see the [Configuration Guide](../../README.md#configuration-guide) in the main README.
+The three env vars in each agent's setup above are the most important. For the **full env var list**, config files reference, and advanced settings, see the [Advanced Configuration](../../README.md#-advanced-configuration) section in the main README.
What needs OPENSPACE_API_KEY?
diff --git a/openspace/local_server/main.py b/openspace/local_server/main.py
index 62530c6..677dbc4 100644
--- a/openspace/local_server/main.py
+++ b/openspace/local_server/main.py
@@ -169,7 +169,7 @@ def health_check():
{
"status": "ok",
"service": "OpenSpace Desktop Server",
- "version": "1.0.0",
+ "version": "2.0.0",
"platform": platform_name,
"features": features,
"timestamp": datetime.now().isoformat(),
diff --git a/openspace/mcp/server.py b/openspace/mcp/server.py
index 5dce6bf..2b6098b 100644
--- a/openspace/mcp/server.py
+++ b/openspace/mcp/server.py
@@ -185,7 +185,7 @@ def _probe_mcp_tools() -> HealthProbe:
# ---------------------------------------------------------------------------
# 4. CLI argument parser (extracted for testability)
# ---------------------------------------------------------------------------
-_VERSION = "0.1.0" # TODO: read from pyproject.toml or importlib.metadata
+_VERSION = "2.0.0" # keep in sync with pyproject.toml [project] version
_EPILOG = """\
examples:
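The removed TODO above suggested reading the version from package metadata rather than hard-coding it. A minimal sketch of that approach using the standard library's `importlib.metadata` (the `openspace` package name and the `2.0.0` fallback are taken from this diff; `get_version` itself is a hypothetical helper, not code from the repo):

```python
from importlib.metadata import PackageNotFoundError, version

def get_version(package: str = "openspace") -> str:
    """Resolve the installed package version, with a fallback for
    source checkouts that have not been pip-installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return "2.0.0"  # fallback mirroring pyproject.toml [project] version
```

This removes the need to bump `_VERSION` by hand on each release, since `pip install -e .` registers the metadata that `version()` reads.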
diff --git a/pr-25-diff-final.txt b/pr-25-diff-final.txt
deleted file mode 100644
index 9a77a89..0000000
--- a/pr-25-diff-final.txt
+++ /dev/null
@@ -1,2134 +0,0 @@
-diff --git a/openspace/skill_engine/lineage_tracker.py b/openspace/skill_engine/lineage_tracker.py
-index 4cc3755..1aa4379 100644
---- a/openspace/skill_engine/lineage_tracker.py
-+++ b/openspace/skill_engine/lineage_tracker.py
-@@ -192,7 +192,11 @@ class LineageTracker:
- """
- self._ensure_open()
- with self._mu:
-- self._conn.execute("BEGIN")
-+ owns_txn = not self._conn.in_transaction
-+ if owns_txn:
-+ self._conn.execute("BEGIN")
-+ else:
-+ self._conn.execute("SAVEPOINT sp_record_derivation")
- try:
- # For FIXED: deactivate same-name parents (superseded)
- if new_record.lineage.origin == SkillOrigin.FIXED:
-@@ -207,7 +211,10 @@ class LineageTracker:
- new_record.is_active = True
-
- self._repo._upsert(new_record)
-- self._conn.commit()
-+ if owns_txn:
-+ self._conn.commit()
-+ else:
-+ self._conn.execute("RELEASE sp_record_derivation")
-
- origin = new_record.lineage.origin.value
- logger.info(
-@@ -216,7 +223,10 @@ class LineageTracker:
- f"[{new_record.skill_id}] ← parents={parent_skill_ids}"
- )
- except Exception:
-- self._conn.rollback()
-+ if owns_txn:
-+ self._conn.rollback()
-+ else:
-+ self._conn.execute("ROLLBACK TO sp_record_derivation")
- raise
-
- # ── Lineage queries ───────────────────────────────────────────────
-diff --git a/openspace/skill_engine/skill_repository.py b/openspace/skill_engine/skill_repository.py
-index a7a302f..f8512e4 100644
---- a/openspace/skill_engine/skill_repository.py
-+++ b/openspace/skill_engine/skill_repository.py
-@@ -136,8 +136,9 @@ class SkillRepository:
- """Open a temporary read-only connection (WAL parallel reads)."""
- self._ensure_open()
- if not self._owns_conn:
-- # When sharing a connection, just use it directly
-- yield self._conn
-+ # When sharing a connection, acquire lock to prevent dirty reads
-+ with self._mu:
-+ yield self._conn
- return
- conn = self._make_connection(read_only=True)
- try:
-diff --git a/openspace/skill_engine/store.py b/openspace/skill_engine/store.py
-index ea0b476..f79e071 100644
---- a/openspace/skill_engine/store.py
-+++ b/openspace/skill_engine/store.py
-@@ -1,5 +1,16 @@
--"""
-+"""SkillStore — SQLite persistence facade for skill quality tracking and evolution.
-+
-+Architecture (Phase 3 decomposition):
-+
-+SkillStore (facade) — store.py
-+├── MigrationManager — Schema creation & versioning (migration_manager.py)
-+├── SkillRepository — CRUD operations (skill_repository.py)
-+├── LineageTracker — Lineage traversal & evolution (lineage_tracker.py)
-+├── AnalysisStore — Execution analysis persistence (analysis_store.py)
-+└── TagSearch — Tag indexing & search operations (tag_search.py)
-+
- Storage location: /.openspace/openspace.db
-+
- Tables:
- skill_records — SkillRecord main table
- skill_lineage_parents — Lineage parent-child relationships (many-to-many)
-@@ -7,6 +18,12 @@ Tables:
- skill_judgments — Per-skill judgments within an analysis
- skill_tool_deps — Tool dependencies
- skill_tags — Auxiliary tags
-+
-+The SkillStore class serves as a unified facade that:
-+1. Manages the persistent SQLite connection and transaction safety
-+2. Delegates specialized operations to focused modules
-+3. Coordinates cross-module workflows (e.g., evolve_skill touches both lineage & repository)
-+4. Provides async API via asyncio.to_thread for thread-safe database access
- """
-
- from __future__ import annotations
-@@ -81,9 +98,13 @@ def _db_retry(
-
-
- class SkillStore:
-- """SQLite persistence engine — Skill quality tracking and evolution ledger.
-+ """SQLite persistence facade for skill quality tracking and evolution.
-
-- Architecture:
-+ Phase 3 Architecture:
-+ Delegates specialized operations to focused modules while maintaining unified API.
-+ All modules share the same connection and lock for transaction consistency.
-+
-+ Concurrency:
- Write path: async method → asyncio.to_thread → _xxx_sync → self._mu lock → self._conn
- Read path: sync method → self._reader() → independent short connection (WAL parallel read)
-
-@@ -185,8 +206,6 @@ class SkillStore:
- if f.exists():
- f.unlink()
-
--
--
- # Lifecycle
- def close(self) -> None:
- """Close the persistent connection. Subsequent ops will raise.
-@@ -280,7 +299,11 @@ class SkillStore:
- created = 0
- refreshed = 0
- with self._mu:
-- self._conn.execute("BEGIN")
-+ owns_txn = not self._conn.in_transaction
-+ if owns_txn:
-+ self._conn.execute("BEGIN")
-+ else:
-+ self._conn.execute("SAVEPOINT sp_sync_from_registry")
- try:
- # Fetch all existing records keyed by skill_id
- rows = self._conn.execute(
-@@ -374,9 +397,15 @@ class SkillStore:
- created += 1
- logger.debug(f"sync_from_registry: created {meta.name} [{meta.skill_id}]")
-
-- self._conn.commit()
-+ if owns_txn:
-+ self._conn.commit()
-+ else:
-+ self._conn.execute("RELEASE sp_sync_from_registry")
- except Exception:
-- self._conn.rollback()
-+ if owns_txn:
-+ self._conn.rollback()
-+ else:
-+ self._conn.execute("ROLLBACK TO sp_sync_from_registry")
- raise
-
- if created or refreshed:
-@@ -399,6 +428,11 @@ class SkillStore:
- - total_completions += 1 (if applied and completed)
- - total_fallbacks += 1 (if not applied and not completed)
- - last_updated = now
-+
-+ Note: record_analysis() and evolve_skill() are individually atomic but
-+ not jointly atomic. If evolve_skill() fails after record_analysis()
-+ succeeded, the analysis persists (this is intentional — analysis records
-+ what happened, regardless of whether evolution succeeds).
- """
- await asyncio.to_thread(self._record_analysis_sync, analysis)
-
-@@ -422,6 +456,11 @@ class SkillStore:
-
- In the same SQL transaction, guaranteed by ``self._mu``.
-
-+ Note: record_analysis() and evolve_skill() are individually atomic but
-+ not jointly atomic. If evolve_skill() fails after record_analysis()
-+ succeeded, the analysis persists (this is intentional — analysis records
-+ what happened, regardless of whether evolution succeeds).
-+
- Args:
- new_record : SkillRecord
- New version record, ``lineage.parent_skill_ids`` must be non-empty.
-@@ -453,12 +492,22 @@ class SkillStore:
- """
- self._ensure_open()
- with self._mu:
-- self._conn.execute("BEGIN")
-+ owns_txn = not self._conn.in_transaction
-+ if owns_txn:
-+ self._conn.execute("BEGIN")
-+ else:
-+ self._conn.execute("SAVEPOINT sp_save_record")
- try:
- self._upsert(record)
-- self._conn.commit()
-+ if owns_txn:
-+ self._conn.commit()
-+ else:
-+ self._conn.execute("RELEASE sp_save_record")
- except Exception:
-- self._conn.rollback()
-+ if owns_txn:
-+ self._conn.rollback()
-+ else:
-+ self._conn.execute("ROLLBACK TO sp_save_record")
- raise
-
- @_db_retry()
-@@ -470,13 +519,23 @@ class SkillStore:
- """
- self._ensure_open()
- with self._mu:
-- self._conn.execute("BEGIN")
-+ owns_txn = not self._conn.in_transaction
-+ if owns_txn:
-+ self._conn.execute("BEGIN")
-+ else:
-+ self._conn.execute("SAVEPOINT sp_save_records")
- try:
- for r in records:
- self._upsert(r)
-- self._conn.commit()
-+ if owns_txn:
-+ self._conn.commit()
-+ else:
-+ self._conn.execute("RELEASE sp_save_records")
- except Exception:
-- self._conn.rollback()
-+ if owns_txn:
-+ self._conn.rollback()
-+ else:
-+ self._conn.execute("ROLLBACK TO sp_save_records")
- raise
-
- @_db_retry()
-@@ -493,7 +552,11 @@ class SkillStore:
- """
- self._ensure_open()
- with self._mu:
-- self._conn.execute("BEGIN")
-+ owns_txn = not self._conn.in_transaction
-+ if owns_txn:
-+ self._conn.execute("BEGIN")
-+ else:
-+ self._conn.execute("SAVEPOINT sp_record_analysis")
- try:
- # Delegate analysis storage to AnalysisStore (Epic 3.4)
- self._analyses.insert_analysis(analysis)
-@@ -517,9 +580,15 @@ class SkillStore:
- (applied, completed, fallback, now_iso, j.skill_id),
- )
-
-- self._conn.commit()
-+ if owns_txn:
-+ self._conn.commit()
-+ else:
-+ self._conn.execute("RELEASE sp_record_analysis")
- except Exception:
-- self._conn.rollback()
-+ if owns_txn:
-+ self._conn.rollback()
-+ else:
-+ self._conn.execute("ROLLBACK TO sp_record_analysis")
- raise
-
- @_db_retry()
-@@ -531,35 +600,94 @@ class SkillStore:
- """Atomic: insert new version + deactivate parents (for FIXED).
-
- Delegates to :class:`LineageTracker` (Epic 3.3).
-+
-+ Note: evolve_skill() is individually atomic but not jointly atomic
-+ with record_analysis(). If evolve fails after analysis succeeded,
-+ the analysis persists — this is intentional (analysis records what
-+ happened regardless of evolution outcome).
- """
-+ self._ensure_open()
-+ # Note: We don't acquire self._mu here because the lineage tracker
-+ # will acquire it and both use the same mutex (would cause deadlock).
-+ # The lineage tracker handles its own transaction management.
- self._lineage.record_derivation(new_record, parent_skill_ids)
-
- @_db_retry()
- def _deactivate_record_sync(self, skill_id: str) -> bool:
- self._ensure_open()
- with self._mu:
-- cur = self._conn.execute(
-- "UPDATE skill_records SET is_active=0, last_updated=? WHERE skill_id=?",
-- (datetime.now().isoformat(), skill_id),
-- )
-- self._conn.commit()
-- return cur.rowcount > 0
-+ owns_txn = not self._conn.in_transaction
-+ if owns_txn:
-+ self._conn.execute("BEGIN")
-+ else:
-+ self._conn.execute("SAVEPOINT sp_deactivate_record")
-+ try:
-+ cur = self._conn.execute(
-+ "UPDATE skill_records SET is_active=0, last_updated=? WHERE skill_id=?",
-+ (datetime.now().isoformat(), skill_id),
-+ )
-+ if owns_txn:
-+ self._conn.commit()
-+ else:
-+ self._conn.execute("RELEASE sp_deactivate_record")
-+ return cur.rowcount > 0
-+ except Exception:
-+ if owns_txn:
-+ self._conn.rollback()
-+ else:
-+ self._conn.execute("ROLLBACK TO sp_deactivate_record")
-+ raise
-
- @_db_retry()
- def _reactivate_record_sync(self, skill_id: str) -> bool:
- self._ensure_open()
- with self._mu:
-- cur = self._conn.execute(
-- "UPDATE skill_records SET is_active=1, last_updated=? WHERE skill_id=?",
-- (datetime.now().isoformat(), skill_id),
-- )
-- self._conn.commit()
-- return cur.rowcount > 0
-+ owns_txn = not self._conn.in_transaction
-+ if owns_txn:
-+ self._conn.execute("BEGIN")
-+ else:
-+ self._conn.execute("SAVEPOINT sp_reactivate_record")
-+ try:
-+ cur = self._conn.execute(
-+ "UPDATE skill_records SET is_active=1, last_updated=? WHERE skill_id=?",
-+ (datetime.now().isoformat(), skill_id),
-+ )
-+ if owns_txn:
-+ self._conn.commit()
-+ else:
-+ self._conn.execute("RELEASE sp_reactivate_record")
-+ return cur.rowcount > 0
-+ except Exception:
-+ if owns_txn:
-+ self._conn.rollback()
-+ else:
-+ self._conn.execute("ROLLBACK TO sp_reactivate_record")
-+ raise
-
- @_db_retry()
- def _delete_record_sync(self, skill_id: str) -> bool:
- self._ensure_open()
-- return self._repo.delete(skill_id)
-+ with self._mu:
-+ owns_txn = not self._conn.in_transaction
-+ if owns_txn:
-+ self._conn.execute("BEGIN")
-+ else:
-+ self._conn.execute("SAVEPOINT sp_delete_record")
-+ try:
-+ # Handle deletion directly rather than delegating to avoid double-commit
-+ cur = self._conn.execute("DELETE FROM skill_records WHERE skill_id=?", (skill_id,))
-+ result = cur.rowcount > 0
-+ if owns_txn:
-+ self._conn.commit()
-+ else:
-+ self._conn.execute("RELEASE sp_delete_record")
-+ return result
-+ except Exception:
-+ if owns_txn:
-+ self._conn.rollback()
-+ else:
-+ self._conn.execute("ROLLBACK TO sp_delete_record")
-+ raise
-
- # Read API (sync, each call opens its own read-only conn)
- @_db_retry()
-@@ -620,11 +748,14 @@ class SkillStore:
- ``/a/b/SKILL.md`` and ``/a/b/scenarios/x.md`` match ``/a/b``.
- Returns the newest active record (by ``last_updated DESC``).
- """
-- normalized = skill_dir.rstrip("/")
-+ normalized = skill_dir.rstrip("/").rstrip("\\")
-+ # Escape LIKE wildcards to prevent %, _, and \ injection
-+ # Backslash must be escaped FIRST (it's our ESCAPE char)
-+ escaped = normalized.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
- with self._reader() as conn:
- row = conn.execute(
-- "SELECT * FROM skill_records WHERE path LIKE ? AND is_active=1 ORDER BY last_updated DESC LIMIT 1",
-- (f"{normalized}%",),
-+ "SELECT * FROM skill_records WHERE path LIKE ? ESCAPE '\\' AND is_active=1 ORDER BY last_updated DESC LIMIT 1",
-+ (f"{escaped}%",),
- ).fetchone()
- return self._to_record(conn, row) if row else None
-
-@@ -811,16 +942,26 @@ class SkillStore:
- """Delete all data (keeps schema)."""
- self._ensure_open()
- with self._mu:
-- self._conn.execute("BEGIN")
-+ owns_txn = not self._conn.in_transaction
-+ if owns_txn:
-+ self._conn.execute("BEGIN")
-+ else:
-+ self._conn.execute("SAVEPOINT sp_clear")
- try:
- # CASCADE on skill_records cleans up: lineage_parents, tool_deps, tags
- self._conn.execute("DELETE FROM skill_records")
- # Delegate analysis clearing to AnalysisStore (Epic 3.4)
- self._analyses.clear_all_analyses()
-- self._conn.commit()
-+ if owns_txn:
-+ self._conn.commit()
-+ else:
-+ self._conn.execute("RELEASE sp_clear")
- logger.info("SkillStore cleared")
- except Exception:
-- self._conn.rollback()
-+ if owns_txn:
-+ self._conn.rollback()
-+ else:
-+ self._conn.execute("ROLLBACK TO sp_clear")
- raise
-
- def vacuum(self) -> None:
-@@ -1053,8 +1194,30 @@ class SkillStore:
- return self._tag_search.get_all_tags()
-
- def sync_tags(self, skill_id: str, tags: List[str]) -> None:
-- """Synchronize tags for a skill (facade to TagSearch)."""
-- return self._tag_search.sync_tags(skill_id, tags)
-+ """Synchronize tags for a skill (facade to TagSearch).
-+
-+ Must manage transaction explicitly — TagSearch shared-mode
-+ delegates commit responsibility to the caller (us).
-+ """
-+ self._ensure_open()
-+ with self._mu:
-+ owns_txn = not self._conn.in_transaction
-+ if owns_txn:
-+ self._conn.execute("BEGIN")
-+ else:
-+ self._conn.execute("SAVEPOINT sp_sync_tags")
-+ try:
-+ self._tag_search.sync_tags(skill_id, tags)
-+ if owns_txn:
-+ self._conn.commit()
-+ else:
-+ self._conn.execute("RELEASE sp_sync_tags")
-+ except Exception:
-+ if owns_txn:
-+ self._conn.rollback()
-+ else:
-+ self._conn.execute("ROLLBACK TO sp_sync_tags")
-+ raise
-
- # ── Migration Management (Epic 3.6) ─────────────────────────────────
-
-@@ -1070,6 +1233,7 @@ class SkillStore:
- """Set schema version (facade to MigrationManager).
-
- DEPRECATED: Use ensure_current_schema() instead.
-+ This method is kept for backward compatibility with existing tests.
- """
- return self._migrations._set_schema_version(version)
-
-diff --git a/openspace/skill_engine/tag_search.py b/openspace/skill_engine/tag_search.py
-index 4c3ebd6..30b969f 100644
---- a/openspace/skill_engine/tag_search.py
-+++ b/openspace/skill_engine/tag_search.py
-@@ -138,8 +138,6 @@ class TagSearch:
- finally:
- conn.close()
-
-- @_db_retry()
-- @_db_retry()
- @_db_retry()
- def _init_db(self) -> None:
- """Create tables if they don't exist (idempotent).
-@@ -461,7 +459,7 @@ class TagSearch:
- # Text search in name and description
- if query:
- # Escape SQL LIKE wildcards in user input
-- escaped_query = query.replace("%", "\\%").replace("_", "\\_")
-+ escaped_query = query.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
- conditions.append("(sr.name LIKE ? ESCAPE '\\' OR sr.description LIKE ? ESCAPE '\\')")
- like_pattern = f"%{escaped_query}%"
- params.extend([like_pattern, like_pattern])
-diff --git a/tests/test_p3_integration.py b/tests/test_p3_integration.py
-new file mode 100644
-index 0000000..b701beb
---- /dev/null
-+++ b/tests/test_p3_integration.py
-@@ -0,0 +1,1652 @@
-+"""Phase 3 Integration Tests — Epic 3.7
-+
-+Comprehensive tests exercising full workflows through the SkillStore facade,
-+verifying all extracted modules work together:
-+- SkillStore (facade) → store.py
-+- MigrationManager → migration_manager.py
-+- SkillRepository → skill_repository.py
-+- LineageTracker → lineage_tracker.py
-+- AnalysisStore → analysis_store.py
-+- TagSearch → tag_search.py
-+
-+Tests full end-to-end workflows, not individual module boundaries.
-+"""
-+
-+import asyncio
-+import concurrent.futures
-+import gc
-+import tempfile
-+import threading
-+import time
-+from datetime import datetime
-+from pathlib import Path
-+from typing import Dict, List
-+
-+import pytest
-+
-+from openspace.skill_engine.store import SkillStore
-+from openspace.skill_engine.types import (
-+ EvolutionSuggestion,
-+ EvolutionType,
-+ ExecutionAnalysis,
-+ SkillCategory,
-+ SkillJudgment,
-+ SkillLineage,
-+ SkillOrigin,
-+ SkillRecord,
-+ SkillVisibility,
-+)
-+
-+
-+@pytest.fixture
-+def temp_db():
-+ """Temporary database for each test."""
-+ with tempfile.TemporaryDirectory() as tmpdir:
-+ yield Path(tmpdir) / "test.db"
-+
-+
-+@pytest.fixture
-+def store(temp_db):
-+ """Clean SkillStore instance for each test."""
-+ store = SkillStore(temp_db)
-+ try:
-+ yield store
-+ finally:
-+ try:
-+ store.close()
-+ except Exception:
-+ pass
-+ # Force garbage collection to clean up any lingering connections
-+ gc.collect()
-+ # Small delay to let Windows finish closing file handles
-+ time.sleep(0.1)
-+
-+
-+@pytest.fixture
-+def sample_record():
-+ """Sample SkillRecord for testing."""
-+ return SkillRecord(
-+ skill_id="test_skill_v1",
-+ name="test_skill",
-+ description="Test skill for integration testing",
-+ category=SkillCategory.TOOL_GUIDE,
-+ visibility=SkillVisibility.PUBLIC,
-+ path="/test/skill.py",
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.IMPORTED,
-+ generation=0,
-+ content_snapshot={"skill.py": "def test_skill():\n pass"},
-+ ),
-+ tool_dependencies=[],
-+ critical_tools=[],
-+ total_selections=0,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+
-+
-+@pytest.fixture
-+def sample_analysis(sample_record):
-+ """Sample ExecutionAnalysis for testing."""
-+ return ExecutionAnalysis(
-+ task_id="task_123",
-+ timestamp=datetime.now(),
-+ task_completed=True,
-+ execution_note="Test execution completed successfully",
-+ skill_judgments=[
-+ SkillJudgment(
-+ skill_id="test_skill_v1",
-+ skill_applied=True,
-+ note="Skill executed successfully",
-+ )
-+ ],
-+ analyzed_at=datetime.now(),
-+ )
-+
-+
-+class TestSkillLifecycleWorkflow:
-+ """Test complete skill lifecycle: create → save → get → update → delete."""
-+
-+ @pytest.mark.asyncio
-+ async def test_basic_crud_operations(self, store, sample_record):
-+ """Test basic create, read, update, delete through facade."""
-+ # CREATE: Save record
-+ await store.save_record(sample_record)
-+
-+ # READ: Load record back
-+ loaded = store.load_record(sample_record.skill_id)
-+ assert loaded is not None
-+ assert loaded.skill_id == sample_record.skill_id
-+ assert loaded.name == sample_record.name
-+ assert loaded.description == sample_record.description
-+
-+ # UPDATE: Modify and save again
-+ sample_record.description = "Updated description"
-+ sample_record.total_selections = 5
-+ await store.save_record(sample_record)
-+
-+ updated = store.load_record(sample_record.skill_id)
-+ assert updated.description == "Updated description"
-+ assert updated.total_selections == 5
-+
-+ # DEACTIVATE: Soft delete
-+ result = await store.deactivate_record(sample_record.skill_id)
-+ assert result is True
-+
-+ # Should not appear in active queries
-+ active_skills = store.load_active()
-+ assert sample_record.skill_id not in active_skills
-+
-+ # But should still exist in full query
-+ all_skills = store.load_all(active_only=False)
-+ assert sample_record.skill_id in all_skills
-+
-+ # REACTIVATE
-+ result = await store.reactivate_record(sample_record.skill_id)
-+ assert result is True
-+
-+ active_skills = store.load_active()
-+ assert sample_record.skill_id in active_skills
-+
-+ # HARD DELETE
-+ result = await store.delete_record(sample_record.skill_id)
-+ assert result is True
-+
-+ deleted = store.load_record(sample_record.skill_id)
-+ assert deleted is None
-+
-+ @pytest.mark.asyncio
-+ async def test_batch_operations(self, store):
-+ """Test batch save and load operations."""
-+ # Create multiple records
-+ records = []
-+ for i in range(5):
-+ record = SkillRecord(
-+ skill_id=f"batch_skill_v{i}",
-+ name=f"batch_skill_{i}",
-+ description=f"Batch skill {i}",
-+ category=SkillCategory.TOOL_GUIDE,
-+ visibility=SkillVisibility.PUBLIC,
-+ path=f"/test/batch_{i}.py",
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.IMPORTED,
-+ generation=0,
-+ content_snapshot={f"batch_{i}.py": f"def batch_skill_{i}():\n pass"},
-+ ),
-+ tool_dependencies=[],
-+ critical_tools=[],
-+ total_selections=i,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+ records.append(record)
-+
-+ # Batch save
-+ await store.save_records(records)
-+
-+ # Verify all were saved
-+ all_skills = store.load_all()
-+ for record in records:
-+ assert record.skill_id in all_skills
-+ loaded = all_skills[record.skill_id]
-+ assert loaded.name == record.name
-+ assert loaded.total_selections == record.total_selections
-+
-+
-+class TestLineageWorkflow:
-+ """Test lineage workflow: create → evolve → record → traverse."""
-+
-+ @pytest.mark.asyncio
-+ async def test_evolution_and_lineage_tracking(self, store, sample_record):
-+ """Test skill evolution with lineage tracking."""
-+ # Save initial skill
-+ await store.save_record(sample_record)
-+
-+ # Evolve the skill
-+ evolved_record = SkillRecord(
-+ skill_id="evolved_test_skill_v1",
-+ name="evolved_test_skill",
-+ description="Evolved test skill with improvements",
-+ category=sample_record.category,
-+ visibility=sample_record.visibility,
-+ path=sample_record.path,
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.DERIVED,
-+ generation=1,
-+ parent_skill_ids=[sample_record.skill_id],
-+ change_summary="Added return value",
-+ content_snapshot={"skill.py": "def evolved_test_skill():\n return 'improved'"},
-+ ),
-+ tool_dependencies=[],
-+ critical_tools=[],
-+ total_selections=0,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+
-+ await store.evolve_skill(evolved_record, [sample_record.skill_id])
-+ evolved_skill_id = evolved_record.skill_id
-+
-+ assert evolved_skill_id != sample_record.skill_id
-+
-+ # Load evolved skill
-+ evolved = store.load_record(evolved_skill_id)
-+ assert evolved is not None
-+ assert evolved.name == "evolved_test_skill"
-+ assert evolved.lineage.generation == 1
-+ assert evolved.lineage.origin == SkillOrigin.DERIVED
-+
-+ # Check lineage relationships
-+ children = store.find_children(sample_record.skill_id)
-+ assert evolved_skill_id in children
-+
-+ # Get ancestry
-+ ancestry = store.get_ancestry(evolved_skill_id)
-+ assert len(ancestry) == 1 # just parent
-+ assert ancestry[0].skill_id == sample_record.skill_id
-+
-+ # Get lineage tree
-+ tree = store.get_lineage_tree(sample_record.skill_id)
-+ assert tree["skill_id"] == sample_record.skill_id
-+ assert len(tree["children"]) == 1
-+ assert tree["children"][0]["skill_id"] == evolved_skill_id
-+
-+ @pytest.mark.asyncio
-+ async def test_multi_generation_lineage(self, store, sample_record):
-+ """Test multiple generations of evolution."""
-+ # Generation 0: Original
-+ await store.save_record(sample_record)
-+
-+ # Generation 1: First evolution
-+ gen1_record = SkillRecord(
-+ skill_id="gen1_skill_v1",
-+ name="gen1_skill",
-+ description="First generation evolution",
-+ category=sample_record.category,
-+ visibility=sample_record.visibility,
-+ path=sample_record.path,
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.DERIVED,
-+ generation=1,
-+ parent_skill_ids=[sample_record.skill_id],
-+ change_summary="First evolution",
-+ content_snapshot={"skill.py": "def gen1():\n return 1"},
-+ ),
-+ tool_dependencies=[],
-+ critical_tools=[],
-+ total_selections=0,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+
-+ await store.evolve_skill(gen1_record, [sample_record.skill_id])
-+ gen1_id = gen1_record.skill_id
-+
-+ # Generation 2: Second evolution
-+ gen2_record = SkillRecord(
-+ skill_id="gen2_skill_v1",
-+ name="gen2_skill",
-+ description="Second generation evolution",
-+ category=sample_record.category,
-+ visibility=sample_record.visibility,
-+ path=sample_record.path,
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.DERIVED,
-+ generation=2,
-+ parent_skill_ids=[gen1_id],
-+ change_summary="Second evolution",
-+ content_snapshot={"skill.py": "def gen2():\n return 2"},
-+ ),
-+ tool_dependencies=[],
-+ critical_tools=[],
-+ total_selections=0,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+
-+ await store.evolve_skill(gen2_record, [gen1_id])
-+ gen2_id = gen2_record.skill_id
-+
-+ # Verify generations
-+ gen2 = store.load_record(gen2_id)
-+ assert gen2.lineage.generation == 2
-+
-+ # Check full ancestry
-+ ancestry = store.get_ancestry(gen2_id)
-+ assert len(ancestry) == 2 # gen1 + gen0 (ancestors only)
-+ assert ancestry[0].skill_id == sample_record.skill_id # gen0 (oldest)
-+ assert ancestry[1].skill_id == gen1_id # gen1 (newer)
-+
-+
-+class TestAnalysisWorkflow:
-+ """Test analysis workflow: save skill → record analysis → retrieve → evolve."""
-+
-+ @pytest.mark.asyncio
-+ async def test_analysis_recording_and_retrieval(self, store, sample_record, sample_analysis):
-+ """Test recording and retrieving execution analyses."""
-+ # Save skill first
-+ await store.save_record(sample_record)
-+
-+ # Record analysis
-+ await store.record_analysis(sample_analysis)
-+
-+ # Load analysis back
-+ loaded_analysis = store.load_analyses_for_task(sample_analysis.task_id)
-+ assert loaded_analysis is not None
-+ assert loaded_analysis.task_id == sample_analysis.task_id
-+ assert loaded_analysis.execution_note == sample_analysis.execution_note
-+ assert len(loaded_analysis.skill_judgments) == 1
-+
-+ # Load analyses for skill
-+ skill_analyses = store.load_analyses(sample_record.skill_id)
-+ assert len(skill_analyses) == 1
-+ assert skill_analyses[0].task_id == sample_analysis.task_id
-+
-+ # Get all analyses
-+ all_analyses = store.load_all_analyses()
-+ assert len(all_analyses) == 1
-+ assert all_analyses[0].task_id == sample_analysis.task_id
-+
-+ @pytest.mark.asyncio
-+ async def test_evolution_candidates(self, store, sample_record):
-+ """Test getting evolution candidates from analysis data."""
-+ # Save skill
-+ await store.save_record(sample_record)
-+
-+ # Create analysis with failure and evolution suggestion
-+ failed_analysis = ExecutionAnalysis(
-+ task_id="failed_task",
-+ timestamp=datetime.now(),
-+ task_completed=False, # Failed task
-+ execution_note="Skill failed to execute properly",
-+ skill_judgments=[
-+ SkillJudgment(
-+ skill_id=sample_record.skill_id,
-+ skill_applied=False, # Failed to apply
-+ note="Skill failed to execute",
-+ )
-+ ],
-+ evolution_suggestions=[
-+ EvolutionSuggestion(
-+ evolution_type=EvolutionType.FIX,
-+ target_skill_ids=[sample_record.skill_id],
-+ direction="Fix skill execution issues",
-+ )
-+ ],
-+ analyzed_at=datetime.now(),
-+ )
-+
-+ await store.record_analysis(failed_analysis)
-+
-+ # Get evolution candidates (should include skills with failures)
-+ candidates = store.load_evolution_candidates()
-+ assert len(candidates) > 0
-+
-+ # Should include our failed analysis
-+ task_ids = [c.task_id for c in candidates]
-+ assert failed_analysis.task_id in task_ids
-+
-+
-+class TestTagSearchWorkflow:
-+ """Test tag/search workflow: save with tags → search by tags → search by query."""
-+
-+ @pytest.mark.asyncio
-+ async def test_tag_operations(self, store, sample_record):
-+ """Test tag synchronization and retrieval."""
-+ # Save skill
-+ await store.save_record(sample_record)
-+
-+ # Add tags
-+ tags = ["python", "testing", "integration"]
-+ store.sync_tags(sample_record.skill_id, tags)
-+
-+ # Get tags back
-+ loaded_tags = store.get_tags(sample_record.skill_id)
-+ assert set(loaded_tags) == set(tags)
-+
-+ # Get all tags
-+ all_tags = store.get_all_tags()
-+ tag_names = [tag["tag"] for tag in all_tags]
-+ for tag in tags:
-+ assert tag in tag_names
-+
-+ # Find by tags
-+ found_skills = store.find_skills_by_tags(["python"])
-+ assert sample_record.skill_id in found_skills
-+
-+ found_skills = store.find_skills_by_tags(["python", "testing"], match_all=True)
-+ assert sample_record.skill_id in found_skills
-+
-+ found_skills = store.find_skills_by_tags(["nonexistent"])
-+ assert sample_record.skill_id not in found_skills
-+
-+ @pytest.mark.asyncio
-+ async def test_comprehensive_search(self, store):
-+ """Test comprehensive search functionality."""
-+ # Create skills with different attributes
-+ skills_data = [
-+ {
-+ "skill_id": "tool_skill_v1",
-+ "name": "tool_helper",
-+ "description": "Tool utility functions",
-+ "category": SkillCategory.TOOL_GUIDE,
-+ "tags": ["tool", "utility"]
-+ },
-+ {
-+ "skill_id": "workflow_skill_v1",
-+ "name": "workflow_helper",
-+ "description": "Workflow helper functions",
-+ "category": SkillCategory.WORKFLOW,
-+ "tags": ["workflow", "utility"]
-+ },
-+ {
-+ "skill_id": "private_skill_v1",
-+ "name": "private_helper",
-+ "description": "Private internal tool",
-+ "category": SkillCategory.REFERENCE,
-+ "tags": ["internal"],
-+ "visibility": SkillVisibility.PRIVATE
-+ }
-+ ]
-+
-+ for skill_data in skills_data:
-+ record = SkillRecord(
-+ skill_id=skill_data["skill_id"],
-+ name=skill_data["name"],
-+ description=skill_data["description"],
-+ category=skill_data["category"],
-+ visibility=skill_data.get("visibility", SkillVisibility.PUBLIC),
-+ path=f"/test/{skill_data['name']}.py",
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.IMPORTED,
-+ generation=0,
-+ content_snapshot={f"{skill_data['name']}.py": f"def {skill_data['name']}():\n pass"},
-+ ),
-+ tool_dependencies=[],
-+ critical_tools=[],
-+ total_selections=0,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+ await store.save_record(record)
-+ store.sync_tags(skill_data["skill_id"], skill_data["tags"])
-+
-+ # Search by text query
-+ results = store.search_skills("tool")
-+ assert len(results) >= 1
-+ skill_ids = [r["skill_id"] for r in results]
-+ assert "tool_skill_v1" in skill_ids
-+
-+ # Search by category
-+ results = store.search_skills(category=SkillCategory.TOOL_GUIDE)
-+ assert len(results) == 1
-+ assert results[0]["skill_id"] == "tool_skill_v1"
-+
-+ # Search by tags
-+ results = store.search_skills(tags=["utility"])
-+ assert len(results) == 2 # tool and workflow skills
-+
-+ # Search with visibility filter
-+ results = store.search_skills(visibility=SkillVisibility.PUBLIC)
-+ skill_ids = [r["skill_id"] for r in results]
-+ assert "private_skill_v1" not in skill_ids
-+
-+ # Comprehensive search with multiple filters
-+ results = store.search_skills(
-+ query="utility",
-+ tags=["tool"],
-+ category=SkillCategory.TOOL_GUIDE,
-+ limit=10
-+ )
-+ assert len(results) == 1
-+ assert results[0]["skill_id"] == "tool_skill_v1"
-+
-+
-+class TestMigrationWorkflow:
-+ """Test migration workflow: fresh DB → schema creation → all modules work."""
-+
-+ def test_fresh_database_initialization(self, temp_db):
-+ """Test that a fresh database is properly initialized."""
-+ # Create store which should initialize schema
-+ store = SkillStore(temp_db)
-+
-+ try:
-+ # Verify schema version
-+ version = store.get_schema_version()
-+ assert version == 1
-+
-+ # Verify we can perform basic operations
-+ stats = store.get_stats()
-+ assert stats["total_skills"] == 0
-+ assert stats["total_analyses"] == 0
-+
-+ # Test that all module operations work
-+ all_skills = store.load_all()
-+ assert len(all_skills) == 0
-+
-+ all_tags = store.get_all_tags()
-+ assert len(all_tags) == 0
-+
-+ finally:
-+ store.close()
-+
-+ def test_schema_migration_methods(self, store):
-+ """Test schema migration facade methods."""
-+ # Test version operations
-+ current_version = store.get_schema_version()
-+ assert current_version == 1
-+
-+ # Test ensure current schema (should be idempotent)
-+ store.ensure_current_schema()
-+ assert store.get_schema_version() == 1
-+
-+ # Test migration method exists
-+ store.migrate_to_version(1) # Should be no-op
-+ assert store.get_schema_version() == 1
-+
-+
-+class TestCrossModuleWorkflow:
-+ """Test cross-module workflow: save → tag → analyze → evolve → comprehensive retrieval."""
-+
-+ @pytest.mark.asyncio
-+ async def test_complete_skill_workflow(self, store, sample_record):
-+ """Test a complete workflow touching all modules."""
-+ # 1. REPOSITORY: Save initial skill
-+ await store.save_record(sample_record)
-+
-+ # 2. TAG SEARCH: Add tags
-+ tags = ["python", "testing", "v1"]
-+ store.sync_tags(sample_record.skill_id, tags)
-+
-+ # 3. ANALYSIS STORE: Record successful execution
-+ analysis = ExecutionAnalysis(
-+ task_id="workflow_test_123",
-+ timestamp=datetime.now(),
-+ task_completed=True,
-+ execution_note="Executed successfully in workflow test",
-+ skill_judgments=[
-+ SkillJudgment(
-+ skill_id=sample_record.skill_id,
-+ skill_applied=True,
-+ note="Executed successfully in workflow test",
-+ )
-+ ],
-+ analyzed_at=datetime.now(),
-+ )
-+ await store.record_analysis(analysis)
-+
-+ # 4. LINEAGE TRACKER: Evolve the skill
-+ evolved_record = SkillRecord(
-+ skill_id="evolved_workflow_skill_v1",
-+ name="evolved_workflow_skill",
-+ description="Enhanced workflow skill for testing",
-+ category=sample_record.category,
-+ visibility=sample_record.visibility,
-+ path=sample_record.path,
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.DERIVED,
-+ generation=1,
-+ parent_skill_ids=[sample_record.skill_id],
-+ change_summary="Enhanced for workflow testing",
-+ content_snapshot={"skill.py": "def evolved_workflow_skill():\n return 'enhanced'"},
-+ ),
-+ tool_dependencies=[],
-+ critical_tools=[],
-+ total_selections=0,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+
-+ await store.evolve_skill(evolved_record, [sample_record.skill_id])
-+ evolved_id = evolved_record.skill_id
-+
-+ # 5. COMPREHENSIVE VERIFICATION through facade
-+
-+ # Repository operations
-+ original = store.load_record(sample_record.skill_id)
-+ evolved = store.load_record(evolved_id)
-+ assert original is not None
-+ assert evolved is not None
-+ assert evolved.lineage.generation == 1
-+
-+ # Tag operations
-+ original_tags = store.get_tags(sample_record.skill_id)
-+ assert set(original_tags) == set(tags)
-+
-+ # Search operations
-+ found_by_tag = store.find_skills_by_tags(["python"])
-+ assert sample_record.skill_id in found_by_tag
-+
-+ search_results = store.search_skills("testing")
-+ skill_ids = [r["skill_id"] for r in search_results]
-+ assert sample_record.skill_id in skill_ids
-+
-+ # Analysis operations
-+ skill_analyses = store.load_analyses(sample_record.skill_id)
-+ assert len(skill_analyses) == 1
-+ assert skill_analyses[0].task_id == analysis.task_id
-+
-+ # Lineage operations
-+ children = store.find_children(sample_record.skill_id)
-+ assert evolved_id in children
-+
-+ ancestry = store.get_ancestry(evolved_id)
-+ assert len(ancestry) == 1 # just the parent
-+
-+ # Summary stats should reflect all data
-+ stats = store.get_stats()
-+ assert stats["total_skills"] == 2 # original + evolved
-+ assert stats["total_analyses"] == 1
-+
-+
-+class TestConcurrentAccess:
-+ """Test concurrent access to SkillStore."""
-+
-+ @pytest.mark.asyncio
-+ async def test_concurrent_reads_and_writes(self, store, sample_record):
-+ """Test that concurrent operations don't interfere."""
-+ # Save initial skill
-+ await store.save_record(sample_record)
-+
-+ async def reader_task(worker_id: int, iterations: int):
-+ """Read operations in concurrent task."""
-+ results = []
-+ for i in range(iterations):
-+ # Mix of read operations
-+ record = store.load_record(sample_record.skill_id)
-+ all_skills = store.load_all()
-+ stats = store.get_stats()
-+ results.append({
-+ 'worker': worker_id,
-+ 'iteration': i,
-+ 'record_found': record is not None,
-+ 'total_skills': len(all_skills),
-+ 'stats_count': stats.get('total_skills', 0)
-+ })
-+ return results
-+
-+ async def writer_task(worker_id: int, iterations: int):
-+ """Write operations in concurrent task."""
-+ for i in range(iterations):
-+ # Create unique records
-+ record = SkillRecord(
-+ skill_id=f"concurrent_skill_w{worker_id}_i{i}",
-+ name=f"concurrent_skill_{worker_id}_{i}",
-+ description=f"Concurrent test skill worker {worker_id} iteration {i}",
-+ category=SkillCategory.TOOL_GUIDE,
-+ visibility=SkillVisibility.PUBLIC,
-+ path=f"/test/concurrent_{worker_id}_{i}.py",
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.IMPORTED,
-+ generation=0,
-+ content_snapshot={f"concurrent_{worker_id}_{i}.py": f"def concurrent_skill_{worker_id}_{i}():\n pass"},
-+ ),
-+ tool_dependencies=[],
-+ critical_tools=[],
-+ total_selections=0,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+ await store.save_record(record)
-+
-+ # Run concurrent readers and writers
-+ num_readers = 3
-+ num_writers = 2
-+ iterations = 5
-+
-+ tasks = []
-+
-+ # Start reader tasks
-+ for i in range(num_readers):
-+ tasks.append(reader_task(i, iterations))
-+
-+ # Start writer tasks
-+ for i in range(num_writers):
-+ tasks.append(writer_task(i, iterations))
-+
-+ # Wait for all to complete
-+ results = await asyncio.gather(*tasks, return_exceptions=True)
-+
-+ # Verify no exceptions occurred
-+ for result in results:
-+ if isinstance(result, Exception):
-+ pytest.fail(f"Concurrent task failed: {result}")
-+
-+ # Verify final state
-+ final_count = store.count()
-+ expected_min = 1 + (num_writers * iterations) # initial + written records
-+ assert final_count >= expected_min
-+
-+ def test_concurrent_threading(self, store, sample_record):
-+ """Test concurrent access from multiple threads using threading."""
-+ # Save initial skill synchronously
-+ asyncio.run(store.save_record(sample_record))
-+
-+ results = {}
-+ errors = []
-+
-+ def reader_thread(thread_id: int):
-+ """Thread function for read operations."""
-+ try:
-+ for i in range(10):
-+ record = store.load_record(sample_record.skill_id)
-+ stats = store.get_stats()
-+ results[f"reader_{thread_id}_{i}"] = {
-+ 'found': record is not None,
-+ 'stats': stats.get('total_skills', 0)
-+ }
-+ time.sleep(0.001) # Small delay
-+ except Exception as e:
-+ errors.append(f"Reader {thread_id}: {e}")
-+
-+ def writer_thread(thread_id: int):
-+ """Thread function for write operations."""
-+ try:
-+ async def write_operations():
-+ for i in range(5):
-+ record = SkillRecord(
-+ skill_id=f"thread_skill_{thread_id}_{i}",
-+ name=f"thread_skill_{thread_id}_{i}",
-+ description=f"Thread test skill {thread_id}-{i}",
-+ category=SkillCategory.TOOL_GUIDE,
-+ visibility=SkillVisibility.PUBLIC,
-+ path=f"/test/thread_{thread_id}_{i}.py",
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.IMPORTED,
-+ generation=0,
-+ content_snapshot={f"thread_{thread_id}_{i}.py": f"def thread_skill_{thread_id}_{i}():\n pass"},
-+ ),
-+ tool_dependencies=[],
-+ critical_tools=[],
-+ total_selections=0,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+ await store.save_record(record)
-+
-+ # Run async operations in thread
-+ asyncio.run(write_operations())
-+
-+ except Exception as e:
-+ errors.append(f"Writer {thread_id}: {e}")
-+
-+ # Start threads
-+ threads = []
-+
-+ # Start 2 reader threads
-+ for i in range(2):
-+ thread = threading.Thread(target=reader_thread, args=(i,))
-+ threads.append(thread)
-+ thread.start()
-+
-+ # Start 1 writer thread
-+ writer = threading.Thread(target=writer_thread, args=(0,))
-+ threads.append(writer)
-+ writer.start()
-+
-+ # Wait for completion
-+ for thread in threads:
-+ thread.join(timeout=10.0)
-+ if thread.is_alive():
-+ pytest.fail("Thread did not complete within timeout")
-+
-+ # Check for errors
-+ if errors:
-+ pytest.fail(f"Concurrent threading errors: {errors}")
-+
-+ # Verify some operations succeeded
-+ assert len(results) > 0
-+
-+ # Verify final database state
-+ final_count = store.count()
-+ assert final_count >= 1 # At least the initial record
-+
-+
-+class TestEdgeCases:
-+ """Test edge cases and error conditions in integrated workflows."""
-+
-+ @pytest.mark.asyncio
-+ async def test_nonexistent_skill_operations(self, store):
-+ """Test operations on nonexistent skills."""
-+ nonexistent_id = "nonexistent_skill_v1"
-+
-+ # Load nonexistent
-+ record = store.load_record(nonexistent_id)
-+ assert record is None
-+
-+ # Deactivate nonexistent
-+ result = await store.deactivate_record(nonexistent_id)
-+ assert result is False
-+
-+ # Delete nonexistent
-+ result = await store.delete_record(nonexistent_id)
-+ assert result is False
-+
-+ # Get tags for nonexistent
-+ tags = store.get_tags(nonexistent_id)
-+ assert tags == []
-+
-+ # Find children of nonexistent
-+ children = store.find_children(nonexistent_id)
-+ assert children == []
-+
-+ @pytest.mark.asyncio
-+ async def test_empty_database_operations(self, store):
-+ """Test operations on empty database."""
-+ # Load operations on empty DB
-+ all_skills = store.load_all()
-+ assert len(all_skills) == 0
-+
-+ active_skills = store.load_active()
-+ assert len(active_skills) == 0
-+
-+ # Search operations on empty DB
-+ results = store.search_skills("anything")
-+ assert len(results) == 0
-+
-+ found = store.find_skills_by_tags(["any", "tags"])
-+ assert len(found) == 0
-+
-+ # Analysis operations on empty DB
-+ analyses = store.load_all_analyses()
-+ assert len(analyses) == 0
-+
-+ candidates = store.load_evolution_candidates()
-+ assert len(candidates) == 0
-+
-+ # Stats on empty DB
-+ stats = store.get_stats()
-+ assert stats["total_skills"] == 0
-+ assert stats["total_analyses"] == 0
-+
-+ count = store.count()
-+ assert count == 0
-+
-+
-+class TestExternalReaderVisibility:
-+ """Test that data committed by one SkillStore instance is visible to a SEPARATE SkillStore instance (external WAL reader)."""
-+
-+ @pytest.mark.asyncio
-+ async def test_save_visible_to_external_reader(self, sample_record):
-+ """Store A saves a record, Store B (same db_path, separate instance) can load it."""
-+ # NamedTemporaryFile avoids the race window of the deprecated tempfile.mktemp
-+ with tempfile.NamedTemporaryFile(suffix='.db', delete=False) as f:
-+ db_path = f.name
-+ store_a = None
-+ store_b = None
-+
-+ try:
-+ # Store A saves a record
-+ store_a = SkillStore(db_path)
-+ await store_a.save_record(sample_record)
-+
-+ # Store B (separate instance) should see it
-+ store_b = SkillStore(db_path)
-+ loaded_record = store_b.load_record(sample_record.skill_id)
-+
-+ assert loaded_record is not None
-+ assert loaded_record.skill_id == sample_record.skill_id
-+ assert loaded_record.name == sample_record.name
-+
-+ finally:
-+ if store_a:
-+ try:
-+ store_a.close()
-+ except Exception:
-+ pass
-+ if store_b:
-+ try:
-+ store_b.close()
-+ except Exception:
-+ pass
-+ gc.collect()
-+
-+ @pytest.mark.asyncio
-+ async def test_sync_tags_visible_to_external_reader(self, sample_record):
-+ """Store A saves + sync_tags, Store B sees the tags."""
-+ with tempfile.NamedTemporaryFile(suffix='.db', delete=False) as f:
-+ db_path = f.name
-+ store_a = None
-+ store_b = None
-+
-+ try:
-+ # Store A saves a record and syncs tags
-+ store_a = SkillStore(db_path)
-+ await store_a.save_record(sample_record)
-+ store_a.sync_tags(sample_record.skill_id, ["test-tag", "integration-tag"])
-+
-+ # Store B should see the tags
-+ store_b = SkillStore(db_path)
-+ tags = store_b.get_tags(sample_record.skill_id)
-+
-+ assert "test-tag" in tags
-+ assert "integration-tag" in tags
-+
-+ # And should be able to find the skill by tags
-+ found_skills = store_b.find_skills_by_tags(["test-tag"])
-+ assert len(found_skills) == 1
-+ assert found_skills[0] == sample_record.skill_id
-+
-+ finally:
-+ if store_a:
-+ try:
-+ store_a.close()
-+ except Exception:
-+ pass
-+ if store_b:
-+ try:
-+ store_b.close()
-+ except Exception:
-+ pass
-+ gc.collect()
-+
-+ @pytest.mark.asyncio
-+ async def test_evolve_visible_to_external_reader(self, sample_record, sample_analysis):
-+ """Store A does save → sync_tags → record_analysis → evolve, Store B can load the evolved record."""
-+ with tempfile.NamedTemporaryFile(suffix='.db', delete=False) as f:
-+ db_path = f.name
-+ store_a = None
-+ store_b = None
-+
-+ try:
-+ # Store A does full workflow
-+ store_a = SkillStore(db_path)
-+ await store_a.save_record(sample_record)
-+ store_a.sync_tags(sample_record.skill_id, ["evolution-test"])
-+ await store_a.record_analysis(sample_analysis)
-+
-+ # Create evolved record
-+ evolved_record = SkillRecord(
-+ skill_id="evolved_external_test_v1",
-+ name="evolved_external_test",
-+ description="Evolved record for external reader visibility test",
-+ category=sample_record.category,
-+ visibility=sample_record.visibility,
-+ path=sample_record.path,
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.DERIVED,
-+ generation=1,
-+ parent_skill_ids=[sample_record.skill_id],
-+ content_snapshot={"skill.py": "def evolved_test_skill():\n try:\n pass\n except Exception:\n pass"},
-+ ),
-+ tool_dependencies=[],
-+ critical_tools=[],
-+ total_selections=0,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+
-+ await store_a.evolve_skill(evolved_record, [sample_record.skill_id])
-+
-+ # Store B should see the evolved record
-+ store_b = SkillStore(db_path)
-+ loaded_evolved = store_b.load_record(evolved_record.skill_id)
-+
-+ assert loaded_evolved is not None
-+ assert loaded_evolved.skill_id == evolved_record.skill_id
-+ assert loaded_evolved.lineage.generation == 1
-+ assert loaded_evolved.lineage.parent_skill_ids == [sample_record.skill_id]
-+
-+ # And should see the parent relationship
-+ children = store_b.find_children(sample_record.skill_id)
-+ assert len(children) == 1
-+ assert children[0] == evolved_record.skill_id
-+
-+ finally:
-+ if store_a:
-+ try:
-+ store_a.close()
-+ except Exception:
-+ pass
-+ if store_b:
-+ try:
-+ store_b.close()
-+ except Exception:
-+ pass
-+ gc.collect()
-+
-+ @pytest.mark.asyncio
-+ async def test_full_workflow_external_parity(self, sample_record, sample_analysis):
-+ """Store A does save → sync_tags → record_analysis → evolve → deactivate parent → reactivate parent. Store B sees ALL state changes."""
-+ with tempfile.NamedTemporaryFile(suffix='.db', delete=False) as f:
-+ db_path = f.name
-+ store_a = None
-+ store_b = None
-+
-+ try:
-+ # Store A does full workflow
-+ store_a = SkillStore(db_path)
-+ await store_a.save_record(sample_record)
-+ store_a.sync_tags(sample_record.skill_id, ["workflow-test", "complete"])
-+ await store_a.record_analysis(sample_analysis)
-+
-+ # Create evolved record
-+ evolved_record = SkillRecord(
-+ skill_id="evolved_workflow_test_v1",
-+ name="evolved_workflow_test",
-+ description="Evolved record for full workflow test",
-+ category=sample_record.category,
-+ visibility=sample_record.visibility,
-+ path=sample_record.path,
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.DERIVED,
-+ generation=1,
-+ parent_skill_ids=[sample_record.skill_id],
-+ content_snapshot={"skill.py": "import logging\ndef test_skill():\n logging.info('test')\n pass"},
-+ ),
-+ tool_dependencies=[],
-+ critical_tools=[],
-+ total_selections=0,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+
-+ await store_a.evolve_skill(evolved_record, [sample_record.skill_id])
-+
-+ # Deactivate parent
-+ deactivate_result = await store_a.deactivate_record(sample_record.skill_id)
-+ assert deactivate_result is True
-+
-+ # Reactivate parent
-+ reactivate_result = await store_a.reactivate_record(sample_record.skill_id)
-+ assert reactivate_result is True
-+
-+ # Store B should see ALL state changes
-+ store_b = SkillStore(db_path)
-+
-+ # Original record should be active
-+ original = store_b.load_record(sample_record.skill_id)
-+ assert original is not None
-+ assert original.is_active is True
-+
-+ # Evolved record should exist
-+ evolved = store_b.load_record(evolved_record.skill_id)
-+ assert evolved is not None
-+
-+ # Tags should be visible
-+ tags = store_b.get_tags(sample_record.skill_id)
-+ assert "workflow-test" in tags
-+ assert "complete" in tags
-+
-+ # Analysis should be visible
-+ analyses = store_b.load_analyses(sample_record.skill_id)
-+ assert len(analyses) == 1
-+ assert analyses[0].task_id == sample_analysis.task_id
-+
-+ # Lineage should be intact
-+ children = store_b.find_children(sample_record.skill_id)
-+ assert len(children) == 1
-+ assert children[0] == evolved.skill_id
-+
-+ finally:
-+ if store_a:
-+ try:
-+ store_a.close()
-+ except Exception:
-+ pass
-+ if store_b:
-+ try:
-+ store_b.close()
-+ except Exception:
-+ pass
-+ gc.collect()
-+
-+
-+class TestRollbackRecovery:
-+ """Test that failed operations don't poison the connection for subsequent operations."""
-+
-+ @pytest.mark.asyncio
-+ async def test_failed_save_doesnt_poison_connection(self, temp_db):
-+ """Force a save failure (invalid record), then do a successful save. The successful save must be visible."""
-+ store = None
-+
-+ try:
-+ store = SkillStore(temp_db)
-+
-+ # Create an invalid record (missing required fields)
-+ invalid_record = SkillRecord(
-+ skill_id="invalid_skill_v1",
-+ name="", # Empty name should cause validation error
-+ description="Test invalid record",
-+ category=SkillCategory.TOOL_GUIDE,
-+ visibility=SkillVisibility.PUBLIC,
-+ path="", # Empty path should cause validation error
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.IMPORTED,
-+ generation=0,
-+ content_snapshot={}, # Empty snapshot
-+ ),
-+ tool_dependencies=[],
-+ critical_tools=[],
-+ total_selections=0,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+
-+ # Try to save invalid record - should fail
-+ with pytest.raises(Exception):
-+ await store.save_record(invalid_record)
-+
-+ # Now save a valid record - should succeed
-+ valid_record = SkillRecord(
-+ skill_id="valid_after_failure_v1",
-+ name="valid_after_failure",
-+ description="Valid record after failed save",
-+ category=SkillCategory.TOOL_GUIDE,
-+ visibility=SkillVisibility.PUBLIC,
-+ path="/test/valid_after_failure.py",
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.IMPORTED,
-+ generation=0,
-+ content_snapshot={"skill.py": "def valid_skill():\n pass"},
-+ ),
-+ tool_dependencies=[],
-+ critical_tools=[],
-+ total_selections=0,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+
-+ # This should succeed
-+ await store.save_record(valid_record)
-+
-+ # Verify it's actually saved and visible
-+ loaded = store.load_record(valid_record.skill_id)
-+ assert loaded is not None
-+ assert loaded.skill_id == valid_record.skill_id
-+
-+ finally:
-+ if store:
-+ try:
-+ store.close()
-+ except Exception:
-+ pass
-+ gc.collect()
-+
-+ @pytest.mark.asyncio
-+ async def test_failed_evolve_doesnt_poison_connection(self, temp_db, sample_record):
-+ """Force an evolve failure (missing parent), then do a successful evolve. Must work."""
-+ store = None
-+
-+ try:
-+ store = SkillStore(temp_db)
-+
-+ # Save a valid parent record first
-+ await store.save_record(sample_record)
-+
-+ # Try to evolve with an invalid record (empty skill_id, non-existent parent) - should fail validation
-+ invalid_evolved_record = SkillRecord(
-+ skill_id="", # Empty skill_id should cause validation failure
-+ name="invalid_evolved",
-+ description="Invalid evolved record",
-+ category=SkillCategory.TOOL_GUIDE,
-+ visibility=SkillVisibility.PUBLIC,
-+ path="/test/invalid_evolved.py",
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.DERIVED,
-+ generation=1,
-+ parent_skill_ids=["nonexistent_parent_v1"],
-+ content_snapshot={"skill.py": "def invalid_evolved():\n pass"},
-+ ),
-+ tool_dependencies=[],
-+ critical_tools=[],
-+ total_selections=0,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+
-+ with pytest.raises(Exception):
-+ await store.evolve_skill(invalid_evolved_record, ["nonexistent_parent_v1"])
-+
-+ # Now do a successful evolution - should work
-+ valid_evolved_record = SkillRecord(
-+ skill_id="valid_evolved_after_failure_v1",
-+ name="valid_evolved_after_failure",
-+ description="Valid evolved record after failure",
-+ category=SkillCategory.TOOL_GUIDE,
-+ visibility=SkillVisibility.PUBLIC,
-+ path="/test/valid_evolved_after_failure.py",
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.DERIVED,
-+ generation=1,
-+ parent_skill_ids=[sample_record.skill_id],
-+ content_snapshot={"skill.py": "def valid_evolved():\n pass"},
-+ ),
-+ tool_dependencies=[],
-+ critical_tools=[],
-+ total_selections=0,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+
-+ await store.evolve_skill(valid_evolved_record, [sample_record.skill_id])
-+
-+ # Verify the evolution worked
-+ evolved = store.load_record(valid_evolved_record.skill_id)
-+ assert evolved is not None
-+ assert evolved.lineage.parent_skill_ids == [sample_record.skill_id]
-+
-+ finally:
-+ if store:
-+ try:
-+ store.close()
-+ except Exception:
-+ pass
-+ gc.collect()
-+
-+ @pytest.mark.asyncio
-+ async def test_savepoint_rollback_isolation(self, temp_db, sample_record, sample_analysis):
-+ """Within a workflow, one step fails but prior committed steps survive. Specifically: save succeeds, record_analysis succeeds, evolve fails (bad record) → analysis and save are still visible, evolved record is NOT."""
-+ store = None
-+
-+ try:
-+ store = SkillStore(temp_db)
-+
-+ # Step 1: Save - should succeed
-+ await store.save_record(sample_record)
-+
-+ # Step 2: Record analysis - should succeed
-+ await store.record_analysis(sample_analysis)
-+
-+ # Verify both are committed and visible
-+ saved_record = store.load_record(sample_record.skill_id)
-+ assert saved_record is not None
-+
-+ analyses = store.load_analyses(sample_record.skill_id)
-+ assert len(analyses) == 1
-+ assert analyses[0].task_id == sample_analysis.task_id
-+
-+ # Step 3: Evolve with invalid data - should fail
-+ invalid_evolved_record = SkillRecord(
-+ skill_id="", # Empty skill_id should cause validation failure
-+ name="invalid_evolved_isolation",
-+ description="Invalid evolved record for isolation test",
-+ category=SkillCategory.TOOL_GUIDE,
-+ visibility=SkillVisibility.PUBLIC,
-+ path="/test/invalid_evolved_isolation.py",
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.DERIVED,
-+ generation=1,
-+ parent_skill_ids=[sample_record.skill_id],
-+ content_snapshot={"skill.py": "def invalid_evolved():\n pass"},
-+ ),
-+ tool_dependencies=[],
-+ critical_tools=[],
-+ total_selections=0,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+
-+ with pytest.raises(Exception):
-+ await store.evolve_skill(invalid_evolved_record, [sample_record.skill_id])
-+
-+ # Verify the previous steps are still intact (savepoint isolation worked)
-+ record_after_failure = store.load_record(sample_record.skill_id)
-+ assert record_after_failure is not None
-+ assert record_after_failure.skill_id == sample_record.skill_id
-+
-+ analyses_after_failure = store.load_analyses(sample_record.skill_id)
-+ assert len(analyses_after_failure) == 1
-+ assert analyses_after_failure[0].task_id == sample_analysis.task_id
-+
-+ # Verify no evolved record was created
-+ children = store.find_children(sample_record.skill_id)
-+ assert len(children) == 0
-+
-+ finally:
-+ if store:
-+ try:
-+ store.close()
-+ except Exception:
-+ pass
-+ gc.collect()
-+
-+
-+class TestSavepointNesting:
-+ """Test the SAVEPOINT pattern works for nested transaction scenarios."""
-+
-+ @pytest.mark.asyncio
-+ async def test_multiple_operations_in_sequence(self, temp_db, sample_record):
-+ """Test multiple operations work correctly in sequence, ensuring SAVEPOINT pattern handles internal nesting."""
-+ store = None
-+
-+ try:
-+ store = SkillStore(temp_db)
-+ await store.save_record(sample_record)
-+
-+ # Perform multiple operations that each internally use savepoints
-+ # This tests that savepoints can be nested without issues
-+
-+ # 1. Deactivate
-+ result1 = await store.deactivate_record(sample_record.skill_id)
-+ assert result1 is True
-+
-+ # 2. Reactivate (this should work after deactivate)
-+ result2 = await store.reactivate_record(sample_record.skill_id)
-+ assert result2 is True
-+
-+ # 3. Deactivate again
-+ result3 = await store.deactivate_record(sample_record.skill_id)
-+ assert result3 is True
-+
-+ # Verify final state
-+ record = store.load_record(sample_record.skill_id)
-+ assert record is not None
-+ assert record.is_active is False
-+
-+ finally:
-+ if store:
-+ try:
-+ store.close()
-+ except Exception:
-+ pass
-+ gc.collect()
-+
-+ @pytest.mark.asyncio
-+ async def test_concurrent_count_during_write(self, temp_db, sample_record):
-+ """Call count() from one thread while save_record is executing on another. Verify no crash and count returns a valid number."""
-+ store = None
-+
-+ try:
-+ store = SkillStore(temp_db)
-+
-+ # Create a barrier to coordinate timing
-+ barrier = threading.Barrier(2)
-+ results = {}
-+ exceptions = {}
-+
-+ def count_worker():
-+ """Worker that calls count() during a save operation."""
-+ try:
-+ barrier.wait() # Wait for save to start
-+ # Multiple counts to increase chance of hitting the save operation
-+ for i in range(10):
-+ count = store.count()
-+ results[f'count_{i}'] = count
-+ time.sleep(0.01) # Small delay
-+ except Exception as e:
-+ exceptions['count_worker'] = e
-+
-+ async def save_worker():
-+ """Worker that does a slow save operation."""
-+ try:
-+ barrier.wait() # Signal that save is starting
-+ # Create multiple records with small delays to make a "slow" save
-+ for i in range(5):
-+ record = SkillRecord(
-+ skill_id=f"concurrent_save_test_{i}_v1",
-+ name=f"concurrent_save_test_{i}",
-+ description=f"Concurrent save test record {i}",
-+ category=SkillCategory.TOOL_GUIDE,
-+ visibility=SkillVisibility.PUBLIC,
-+ path=f"/test/concurrent_{i}.py",
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.IMPORTED,
-+ generation=0,
-+ content_snapshot={f"concurrent_{i}.py": f"def concurrent_{i}():\n pass"},
-+ ),
-+ tool_dependencies=[],
-+ critical_tools=[],
-+ total_selections=0,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+ await store.save_record(record)
-+ await asyncio.sleep(0.01) # Small async delay
-+ except Exception as e:
-+ exceptions['save_worker'] = e
-+
-+ # Start count worker in thread
-+ count_thread = threading.Thread(target=count_worker)
-+ count_thread.start()
-+
-+ # Start save worker
-+ await save_worker()
-+
-+ # Wait for count worker to finish
-+ count_thread.join()
-+
-+ # Verify no exceptions occurred
-+ if exceptions:
-+ pytest.fail(f"Concurrent operations failed: {exceptions}")
-+
-+ # Verify count results are valid (non-negative integers)
-+ for key, count in results.items():
-+ assert isinstance(count, int)
-+ assert count >= 0
-+
-+ # Verify final state is consistent
-+ final_count = store.count()
-+ assert final_count >= 5 # At least the 5 records we saved
-+
-+ finally:
-+ if store:
-+ try:
-+ store.close()
-+ except Exception:
-+ pass
-+ gc.collect()
-+
-+
-+class TestLikeWildcardEscape:
-+ """Test LIKE wildcard injection is prevented."""
-+
-+ @pytest.mark.asyncio
-+ async def test_load_record_by_path_escapes_percent(self, temp_db):
-+ """Save a record with path `/skills/100%_complete/SKILL.md`, verify `load_record_by_path("/skills/100")` does NOT match it (the % is literal, not a wildcard)."""
-+ store = None
-+
-+ try:
-+ store = SkillStore(temp_db)
-+
-+ # Create record with % in path
-+ record_with_percent = SkillRecord(
-+ skill_id="percent_path_test_v1",
-+ name="percent_path_test",
-+ description="Test record with % in path",
-+ category=SkillCategory.TOOL_GUIDE,
-+ visibility=SkillVisibility.PUBLIC,
-+ path="/skills/100%_complete/SKILL.md", # % should be treated literally
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.IMPORTED,
-+ generation=0,
-+ content_snapshot={"SKILL.md": "# Skill with % in path"},
-+ ),
-+ tool_dependencies=[],
-+ critical_tools=[],
-+ total_selections=0,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+
-+ await store.save_record(record_with_percent)
-+
-+ # Also create a record that would match if % was treated as wildcard
-+ record_normal = SkillRecord(
-+ skill_id="normal_path_test_v1",
-+ name="normal_path_test",
-+ description="Test record with normal path",
-+ category=SkillCategory.TOOL_GUIDE,
-+ visibility=SkillVisibility.PUBLIC,
-+ path="/skills/100x_complete/SKILL.md", # Would match "/skills/100" + % wildcard
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.IMPORTED,
-+ generation=0,
-+ content_snapshot={"SKILL.md": "# Normal skill"},
-+ ),
-+ tool_dependencies=[],
-+ critical_tools=[],
-+ total_selections=0,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+
-+ await store.save_record(record_normal)
-+
-+ # Search with path prefix - should NOT match the % record if % is escaped
-+ match = store.load_record_by_path("/skills/100")
-+
-+ # If there's a match, it should NOT be the %_complete record
-+ if match:
-+ assert match.path != "/skills/100%_complete/SKILL.md", "% was treated as wildcard - security issue!"
-+
-+ finally:
-+ if store:
-+ try:
-+ store.close()
-+ except Exception:
-+ pass
-+ gc.collect()
-+
-+ @pytest.mark.asyncio
-+ async def test_load_record_by_path_escapes_underscore(self, temp_db):
-+ """Save records with paths `/skills/a_b/SKILL.md` and `/skills/axb/SKILL.md`, verify `load_record_by_path("/skills/a_b")` only matches the first."""
-+ store = None
-+
-+ try:
-+ store = SkillStore(temp_db)
-+
-+ # Record with literal underscore
-+ record_underscore = SkillRecord(
-+ skill_id="underscore_path_test_v1",
-+ name="underscore_path_test",
-+ description="Test record with _ in path",
-+ category=SkillCategory.TOOL_GUIDE,
-+ visibility=SkillVisibility.PUBLIC,
-+ path="/skills/a_b/SKILL.md", # _ should be treated literally
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.IMPORTED,
-+ generation=0,
-+ content_snapshot={"SKILL.md": "# Skill with _ in path"},
-+ ),
-+ tool_dependencies=[],
-+ critical_tools=[],
-+ total_selections=0,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+
-+ # Record that would match if _ was treated as single-char wildcard
-+ record_would_match = SkillRecord(
-+ skill_id="would_match_test_v1",
-+ name="would_match_test",
-+ description="Record that would match if _ is wildcard",
-+ category=SkillCategory.TOOL_GUIDE,
-+ visibility=SkillVisibility.PUBLIC,
-+ path="/skills/axb/SKILL.md", # Would match "a_b" if _ is wildcard
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.IMPORTED,
-+ generation=0,
-+ content_snapshot={"SKILL.md": "# Would match if _ is wildcard"},
-+ ),
-+ tool_dependencies=[],
-+ critical_tools=[],
-+ total_selections=0,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+
-+ await store.save_record(record_underscore)
-+ await store.save_record(record_would_match)
-+
-+ # Search for exact underscore path
-+ match = store.load_record_by_path("/skills/a_b")
-+
-+ if match:
-+ # Should match the literal underscore path
-+ assert match.path == "/skills/a_b/SKILL.md", "Literal underscore path should match"
-+
-+ # Verify we didn't get the "axb" path (which would mean _ was treated as wildcard)
-+ assert match.path != "/skills/axb/SKILL.md", "_ was treated as wildcard - security issue!"
-+
-+ finally:
-+ if store:
-+ try:
-+ store.close()
-+ except Exception:
-+ pass
-+ gc.collect()
-+
-+
-+if __name__ == "__main__":
-+ # Run tests if executed directly
-+ pytest.main([__file__, "-v"])
-\ No newline at end of file
diff --git a/pr-28-diff-r3.txt b/pr-28-diff-r3.txt
deleted file mode 100644
index 0aa6ac3..0000000
--- a/pr-28-diff-r3.txt
+++ /dev/null
@@ -1,788 +0,0 @@
-diff --git a/openspace/skill_engine/analysis_store.py b/openspace/skill_engine/analysis_store.py
-index 9426b30..306dd97 100644
---- a/openspace/skill_engine/analysis_store.py
-+++ b/openspace/skill_engine/analysis_store.py
-@@ -22,7 +22,7 @@ from contextlib import contextmanager
- from datetime import datetime
- from functools import wraps
- from pathlib import Path
--from typing import Any, Dict, Generator, List, Optional
-+from typing import Any, Callable, Dict, Generator, List, Optional
-
- from openspace.utils.logging import Logger
-
-@@ -43,12 +43,12 @@ def _db_retry(
- max_retries: int = 5,
- initial_delay: float = 0.1,
- backoff: float = 2.0,
--):
-+) -> Callable:
- """Retry on transient SQLite errors with exponential backoff."""
-
-- def decorator(func):
-+ def decorator(func: Callable) -> Callable:
- @wraps(func)
-- def wrapper(*args, **kwargs):
-+ def wrapper(*args: Any, **kwargs: Any) -> Any:
- delay = initial_delay
- for attempt in range(max_retries):
- try:
-@@ -78,7 +78,7 @@ class AnalysisStore:
- Usage (standalone)::
-
- store = AnalysisStore(db_path=Path("analyses.db"))
-- store.record_execution_analysis(analysis)
-+ store.record_execution_analysis_sync(analysis)
- recent = store.load_analyses(skill_id="some_skill")
- store.close()
-
-@@ -314,9 +314,70 @@ class AnalysisStore:
- last_updated=record.last_updated,
- recent_analyses=recent_analyses, # This is what we're hydrating
- tool_dependencies=record.tool_dependencies,
-+ critical_tools=record.critical_tools, # Fix: preserve critical_tools
- tags=record.tags,
- )
-
-+ def batch_load_recent_analyses(
-+ self, skill_ids: List[str], limit: int = 5
-+ ) -> Dict[str, List[ExecutionAnalysis]]:
-+ """Batch-load recent analyses for multiple skills.
-+
-+ Uses window functions to get top-N per skill in 2 queries total
-+ (analyses + judgments), avoiding O(N*M) fan-out.
-+ """
-+ if not skill_ids:
-+ return {}
-+ placeholders = ",".join("?" * len(skill_ids))
-+ with self._reader() as conn:
-+ # 1 query: ranked analyses per skill (DISTINCT via window over unique ea.id per skill)
-+ rows = conn.execute(
-+ f"SELECT sub.* FROM ("
-+ f" SELECT ea.*, sj.skill_id AS source_skill_id, "
-+ f" ROW_NUMBER() OVER ("
-+ f" PARTITION BY sj.skill_id ORDER BY ea.timestamp DESC"
-+ f" ) AS rn "
-+ f" FROM execution_analyses ea "
-+ f" JOIN skill_judgments sj ON ea.id = sj.analysis_id "
-+ f" WHERE sj.skill_id IN ({placeholders})"
-+ f") sub WHERE sub.rn <= ?",
-+ [*skill_ids, limit],
-+ ).fetchall()
-+
-+ # Collect analysis IDs and group by skill
-+ filtered: Dict[str, list] = {}
-+ analysis_ids: set = set()
-+ for row in rows:
-+ sid = row["source_skill_id"]
-+ filtered.setdefault(sid, []).append(row)
-+ analysis_ids.add(row["id"])
-+
-+ if not analysis_ids:
-+ return {}
-+
-+ # 1 query: all judgments for matched analyses
-+ aid_placeholders = ",".join("?" * len(analysis_ids))
-+ judgment_rows = conn.execute(
-+ f"SELECT analysis_id, skill_id, skill_applied, note "
-+ f"FROM skill_judgments WHERE analysis_id IN ({aid_placeholders})",
-+ list(analysis_ids),
-+ ).fetchall()
-+
-+ judgments_by_analysis: Dict[str, list] = {}
-+ for jr in judgment_rows:
-+ judgments_by_analysis.setdefault(jr["analysis_id"], []).append(jr)
-+
-+ # Deserialize with preloaded judgments
-+ result: Dict[str, List[ExecutionAnalysis]] = {}
-+ for sid, ea_rows in filtered.items():
-+ result[sid] = [
-+ self._to_analysis(
-+ conn, row, judgments_by_analysis.get(row["id"], [])
-+ )
-+ for row in ea_rows
-+ ]
-+ return result
-+
- # ── Bulk Operations ────────────────────────────────────────────────
-
- @_db_retry()
-@@ -495,6 +556,7 @@ class AnalysisStore:
- ),
- )
- analysis_id = cur.lastrowid
-+ assert analysis_id is not None, "INSERT must set lastrowid"
-
- for j in a.skill_judgments:
- self._conn.execute(
-@@ -505,14 +567,21 @@ class AnalysisStore:
- return analysis_id
-
- @staticmethod
-- def _to_analysis(conn: sqlite3.Connection, row: sqlite3.Row) -> ExecutionAnalysis:
-+ def _to_analysis(
-+ conn: sqlite3.Connection,
-+ row: sqlite3.Row,
-+ preloaded_judgments: Optional[List[sqlite3.Row]] = None,
-+ ) -> ExecutionAnalysis:
- """Deserialize an execution_analyses row + judgments → ExecutionAnalysis."""
- analysis_id = row["id"]
-
-- judgment_rows = conn.execute(
-- "SELECT skill_id, skill_applied, note FROM skill_judgments WHERE analysis_id=?",
-- (analysis_id,),
-- ).fetchall()
-+ if preloaded_judgments is not None:
-+ judgment_rows = preloaded_judgments
-+ else:
-+ judgment_rows = conn.execute(
-+ "SELECT skill_id, skill_applied, note FROM skill_judgments WHERE analysis_id=?",
-+ (analysis_id,),
-+ ).fetchall()
-
- suggestions: list[EvolutionSuggestion] = []
- raw_suggestions = row["evolution_suggestions"]
-diff --git a/openspace/skill_engine/lineage_tracker.py b/openspace/skill_engine/lineage_tracker.py
-index 409aabd..c36cede 100644
---- a/openspace/skill_engine/lineage_tracker.py
-+++ b/openspace/skill_engine/lineage_tracker.py
-@@ -116,10 +116,7 @@ class LineageTracker:
- if db_path is None:
- raise ValueError("Either db_path or conn must be provided")
- self._db_path = Path(db_path)
-- self._conn = sqlite3.connect(str(db_path), timeout=30)
-- self._conn.row_factory = sqlite3.Row
-- self._conn.execute("PRAGMA journal_mode=WAL")
-- self._conn.execute("PRAGMA busy_timeout=30000")
-+ self._conn = self._make_connection(read_only=False)
- # Standalone mode: ensure schema exists
- mm = MigrationManager(conn=self._conn, lock=self._mu)
- mm.initialize_schema()
-@@ -480,6 +477,7 @@ class LineageTracker:
- List of ancestor :class:`SkillRecord` objects, sorted by
- generation (nearest first).
- """
-+ max_depth = min(max_depth, 50) # Safety clamp to prevent DoS
- with self._reader() as conn:
- visited: set[str] = {skill_id} # Fix 1: Seed with starting skill_id to prevent cycles
- ancestors: List[SkillRecord] = []
-@@ -551,6 +549,7 @@ class LineageTracker:
- Returns:
- Nested dict representing the lineage tree.
- """
-+ max_depth = min(max_depth, 50) # Safety clamp to prevent DoS
- with self._reader() as conn:
- return self._subtree(conn, skill_id, max_depth, set())
-
-diff --git a/openspace/skill_engine/migration_manager.py b/openspace/skill_engine/migration_manager.py
-index 4776c2f..fd9ab78 100644
---- a/openspace/skill_engine/migration_manager.py
-+++ b/openspace/skill_engine/migration_manager.py
-@@ -171,7 +171,7 @@ class MigrationManager:
-
- # SkillStore creates us internally:
- # store._migrations = MigrationManager(conn=store._conn, lock=store._mu)
-- # SkillStore calls manager.initialize_schema() during __init__
-+ # SkillStore calls manager.ensure_current_schema() during __init__
- """
-
- def __init__(
-diff --git a/openspace/skill_engine/store.py b/openspace/skill_engine/store.py
-index f0d21ff..16f59ba 100644
---- a/openspace/skill_engine/store.py
-+++ b/openspace/skill_engine/store.py
-@@ -29,6 +29,7 @@ The SkillStore class serves as a unified facade that:
- from __future__ import annotations
-
- import asyncio
-+import dataclasses
- import json
- import sqlite3
- import threading
-@@ -681,14 +682,15 @@ class SkillStore:
-
- Delegates to :class:`LineageTracker` (Epic 3.3).
- """
-- # Fix 5: Hydrate recent_analyses after delegation
-+ # Fix: Fully hydrate records after delegation (tags, tool_deps, critical_tools, recent_analyses)
- records = self._lineage.get_evolution_chain(name)
-- return [self._hydrate_recent_analyses(record) for record in records]
-+ return self._hydrate_records(records)
-
- @_db_retry()
- def load_by_category(self, category: SkillCategory, *, active_only: bool = True) -> List[SkillRecord]:
- """Delegate to SkillRepository for category queries."""
-- return self._repo.load_by_category(category, active_only=active_only)
-+ records = self._repo.load_by_category(category, active_only=active_only)
-+ return self._hydrate_records(records)
-
- @_db_retry()
- def load_analyses(
-@@ -826,9 +828,9 @@ class SkillStore:
-
- Delegates to :class:`LineageTracker` (Epic 3.3).
- """
-- # Fix 5: Hydrate recent_analyses after delegation
-+ # Fix: Fully hydrate records after delegation (tags, tool_deps, critical_tools, recent_analyses)
- records = self._lineage.get_ancestors(skill_id, max_depth=max_depth)
-- return [self._hydrate_recent_analyses(record) for record in records]
-+ return self._hydrate_records(records)
-
- @_db_retry()
- def get_lineage_tree(self, skill_id: str, max_depth: int = 5) -> Dict[str, Any]:
-@@ -992,6 +994,82 @@ class SkillStore:
- """Hydrate recent_analyses for a SkillRecord from AnalysisStore delegation."""
- return self._analyses.hydrate_recent_analyses(record)
-
-+ def _hydrate_record(self, record: SkillRecord) -> SkillRecord:
-+ """Fully hydrate a SkillRecord with tags, tool_deps, critical_tools, and recent_analyses.
-+
-+ Used by facade methods that delegate to modules returning partially hydrated records.
-+ """
-+ with self._reader() as conn:
-+ # Hydrate tool dependencies and critical tools
-+ dep_rows = conn.execute(
-+ "SELECT tool_key, critical FROM skill_tool_deps WHERE skill_id=?",
-+ (record.skill_id,),
-+ ).fetchall()
-+
-+ tool_dependencies = [r["tool_key"] for r in dep_rows]
-+ critical_tools = [r["tool_key"] for r in dep_rows if r["critical"]]
-+
-+ # Hydrate tags using TagSearch
-+ tags = self._tag_search.get_tags(record.skill_id)
-+
-+ # Hydrate recent_analyses using AnalysisStore
-+ hydrated_record = self._analyses.hydrate_recent_analyses(record)
-+
-+ # Return a new record with all fields hydrated
-+ return dataclasses.replace(
-+ hydrated_record,
-+ tags=tags,
-+ tool_dependencies=tool_dependencies,
-+ critical_tools=critical_tools,
-+ )
-+
-+ def _hydrate_records(self, records: List[SkillRecord]) -> List[SkillRecord]:
-+ """Batch hydrate multiple SkillRecords — O(1) queries instead of O(N).
-+
-+ Batches tag, tool_dep, and analysis queries across all records.
-+ """
-+ if not records:
-+ return records
-+
-+ skill_ids = [r.skill_id for r in records]
-+ placeholders = ",".join("?" * len(skill_ids))
-+
-+ with self._reader() as conn:
-+ # Batch 1: tool_deps — 1 query
-+ dep_rows = conn.execute(
-+ f"SELECT skill_id, tool_key, critical FROM skill_tool_deps "
-+ f"WHERE skill_id IN ({placeholders})",
-+ skill_ids,
-+ ).fetchall()
-+
-+ deps_by_skill: Dict[str, List[str]] = {}
-+ critical_by_skill: Dict[str, List[str]] = {}
-+ for row in dep_rows:
-+ sid = row["skill_id"]
-+ deps_by_skill.setdefault(sid, []).append(row["tool_key"])
-+ if row["critical"]:
-+ critical_by_skill.setdefault(sid, []).append(row["tool_key"])
-+
-+ # Batch 2: tags — 1 query (via TagSearch)
-+ tags_by_skill = self._tag_search.get_tags_batch(skill_ids)
-+
-+ # Batch 3: analyses — 2 queries (via AnalysisStore)
-+ analyses_by_skill = self._analyses.batch_load_recent_analyses(
-+ skill_ids, SkillRecord.MAX_RECENT
-+ )
-+
-+ # Assemble
-+ return [
-+ dataclasses.replace(
-+ record,
-+ tags=tags_by_skill.get(record.skill_id, []),
-+ tool_dependencies=deps_by_skill.get(record.skill_id, []),
-+ critical_tools=critical_by_skill.get(record.skill_id, []),
-+ recent_analyses=analyses_by_skill.get(record.skill_id, []),
-+ )
-+ for record in records
-+ ]
-+
- # Deserialization
- def _to_record(self, conn: sqlite3.Connection, row: sqlite3.Row) -> SkillRecord:
- """Deserialize a skill_records row + related rows → SkillRecord."""
-@@ -1101,6 +1179,10 @@ class SkillStore:
- """Get all tags for a skill (facade to TagSearch)."""
- return self._tag_search.get_tags(skill_id)
-
-+ def get_tags_batch(self, skill_ids: List[str]) -> Dict[str, List[str]]:
-+ """Batch-get tags for multiple skills (facade to TagSearch)."""
-+ return self._tag_search.get_tags_batch(skill_ids)
-+
- def get_all_tags(self) -> List[Dict[str, Any]]:
- """Get all tags with usage counts (facade to TagSearch)."""
- return self._tag_search.get_all_tags()
-diff --git a/openspace/skill_engine/tag_search.py b/openspace/skill_engine/tag_search.py
-index 982e9bb..f7db667 100644
---- a/openspace/skill_engine/tag_search.py
-+++ b/openspace/skill_engine/tag_search.py
-@@ -18,7 +18,7 @@ import time
- from contextlib import contextmanager
- from functools import wraps
- from pathlib import Path
--from typing import Any, Dict, Generator, List, Optional
-+from typing import Any, Callable, Dict, Generator, List, Optional
-
- from openspace.utils.logging import Logger
-
-@@ -32,12 +32,12 @@ def _db_retry(
- max_retries: int = 5,
- initial_delay: float = 0.1,
- backoff: float = 2.0,
--):
-+) -> Callable:
- """Retry on transient SQLite errors with exponential backoff."""
-
-- def decorator(func):
-+ def decorator(func: Callable) -> Callable:
- @wraps(func)
-- def wrapper(*args, **kwargs):
-+ def wrapper(*args: Any, **kwargs: Any) -> Any:
- delay = initial_delay
- for attempt in range(max_retries):
- try:
-@@ -221,6 +221,23 @@ class TagSearch:
- ).fetchall()
- return [r["tag"] for r in rows]
-
-+ @_db_retry()
-+ def get_tags_batch(self, skill_ids: List[str]) -> Dict[str, List[str]]:
-+ """Batch-get tags for multiple skills. Returns {skill_id: [sorted tags]}."""
-+ if not skill_ids:
-+ return {}
-+ placeholders = ",".join("?" * len(skill_ids))
-+ with self._reader() as conn:
-+ rows = conn.execute(
-+ f"SELECT skill_id, tag FROM skill_tags "
-+ f"WHERE skill_id IN ({placeholders}) ORDER BY skill_id, tag",
-+ skill_ids,
-+ ).fetchall()
-+ result: Dict[str, List[str]] = {}
-+ for row in rows:
-+ result.setdefault(row["skill_id"], []).append(row["tag"])
-+ return result
-+
- @_db_retry()
- def find_skills_by_tags(
- self,
-@@ -448,8 +465,8 @@ class TagSearch:
- List of skill records matching the criteria.
- """
- with self._reader() as conn:
-- conditions = []
-- params = []
-+ conditions: List[str] = []
-+ params: List[Any] = []
-
- # Active filter
- if active_only:
-diff --git a/tests/test_facade_coverage.py b/tests/test_facade_coverage.py
-index cc5422f..ebf5b00 100644
---- a/tests/test_facade_coverage.py
-+++ b/tests/test_facade_coverage.py
-@@ -548,4 +548,390 @@ class TestClose:
- try:
- assert store2.count() == 0
- finally:
-- store2.close()
-\ No newline at end of file
-+ store2.close()
-+
-+
-+class TestFacadeHydrationFix:
-+ """Regression tests for facade methods returning fully hydrated SkillRecords.
-+
-+ Prior to the fix, get_versions(), get_ancestry(), and load_by_category()
-+ returned partially hydrated records missing tags, tool_deps, critical_tools,
-+ and/or recent_analyses.
-+ """
-+
-+ @pytest.mark.asyncio
-+ async def test_get_versions_returns_fully_hydrated_records(self, store):
-+ """get_versions() should return records with tags, tool_deps, critical_tools, and recent_analyses."""
-+ # Create a skill record with all fields populated
-+ record = SkillRecord(
-+ skill_id="hydration_test_v1",
-+ name="hydration_test",
-+ description="Test hydration fix",
-+ category=SkillCategory.TOOL_GUIDE,
-+ visibility=SkillVisibility.PUBLIC,
-+ path="/test/hydration.py",
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.IMPORTED,
-+ generation=0,
-+ content_snapshot={"hydration.py": "def test_skill():\n pass"},
-+ ),
-+ tags=["test-tag", "hydration-tag"],
-+ tool_dependencies=["tool1", "tool2"],
-+ critical_tools=["tool1"],
-+ total_selections=5,
-+ total_applied=3,
-+ total_completions=2,
-+ total_fallbacks=1,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+
-+ # Save the record
-+ await store.save_record(record)
-+
-+ # Add an analysis to test recent_analyses hydration
-+ analysis = ExecutionAnalysis(
-+ task_id="hydration_task_123",
-+ timestamp=datetime.now(),
-+ task_completed=True,
-+ execution_note="Test hydration analysis",
-+ skill_judgments=[
-+ SkillJudgment(
-+ skill_id=record.skill_id,
-+ skill_applied=True,
-+ note="Skill applied successfully in hydration test",
-+ )
-+ ],
-+ analyzed_at=datetime.now(),
-+ )
-+ await store.record_analysis(analysis)
-+
-+ # Test get_versions() - this delegates to LineageTracker
-+ versions = store.get_versions("hydration_test")
-+ assert len(versions) == 1
-+
-+ hydrated_record = versions[0]
-+
-+ # Verify ALL fields are hydrated
-+ assert set(hydrated_record.tags) == {"test-tag", "hydration-tag"}
-+ assert set(hydrated_record.tool_dependencies) == {"tool1", "tool2"}
-+ assert set(hydrated_record.critical_tools) == {"tool1"}
-+ assert len(hydrated_record.recent_analyses) == 1
-+ assert hydrated_record.recent_analyses[0].task_id == "hydration_task_123"
-+
-+ @pytest.mark.asyncio
-+ async def test_get_ancestry_returns_fully_hydrated_records(self, store):
-+ """get_ancestry() should return records with tags, tool_deps, critical_tools, and recent_analyses."""
-+ # Create a parent skill
-+ parent_record = SkillRecord(
-+ skill_id="parent_skill_v1",
-+ name="parent_skill",
-+ description="Parent skill for ancestry test",
-+ category=SkillCategory.WORKFLOW,
-+ visibility=SkillVisibility.PUBLIC,
-+ path="/test/parent.py",
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.IMPORTED,
-+ generation=0,
-+ content_snapshot={"parent.py": "def parent_skill():\n pass"},
-+ ),
-+ tags=["parent-tag", "ancestry-tag"],
-+ tool_dependencies=["parent-tool"],
-+ critical_tools=["parent-tool"],
-+ total_selections=2,
-+ total_applied=1,
-+ total_completions=1,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+
-+ # Create a child skill that derives from parent
-+ child_record = SkillRecord(
-+ skill_id="child_skill_v1",
-+ name="child_skill",
-+ description="Child skill derived from parent",
-+ category=SkillCategory.WORKFLOW,
-+ visibility=SkillVisibility.PUBLIC,
-+ path="/test/child.py",
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.DERIVED,
-+ generation=1,
-+ parent_skill_ids=[parent_record.skill_id],
-+ source_task_id="ancestry_derivation_task",
-+ change_summary="Derived from parent skill",
-+ content_diff="@@ -1,1 +1,1 @@\n-def parent_skill():\n+def child_skill():",
-+ content_snapshot={"child.py": "def child_skill():\n pass"},
-+ created_by="hydration_test",
-+ ),
-+ tags=["child-tag", "derived-tag"],
-+ tool_dependencies=["child-tool"],
-+ critical_tools=["child-tool"],
-+ total_selections=0,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+
-+ # Save both records
-+ await store.save_record(parent_record)
-+ await store.save_record(child_record)
-+
-+ # Add analyses for both to test recent_analyses hydration
-+ parent_analysis = ExecutionAnalysis(
-+ task_id="parent_task_456",
-+ timestamp=datetime.now(),
-+ task_completed=True,
-+ execution_note="Parent skill analysis",
-+ skill_judgments=[
-+ SkillJudgment(
-+ skill_id=parent_record.skill_id,
-+ skill_applied=True,
-+ note="Parent skill executed",
-+ )
-+ ],
-+ analyzed_at=datetime.now(),
-+ )
-+ await store.record_analysis(parent_analysis)
-+
-+ # Test get_ancestry() - this delegates to LineageTracker
-+ ancestry = store.get_ancestry(child_record.skill_id)
-+ assert len(ancestry) == 1
-+
-+ hydrated_parent = ancestry[0]
-+
-+ # Verify ALL fields are hydrated for the parent returned by get_ancestry
-+ assert hydrated_parent.skill_id == parent_record.skill_id
-+ assert set(hydrated_parent.tags) == {"parent-tag", "ancestry-tag"}
-+ assert set(hydrated_parent.tool_dependencies) == {"parent-tool"}
-+ assert set(hydrated_parent.critical_tools) == {"parent-tool"}
-+ assert len(hydrated_parent.recent_analyses) == 1
-+ assert hydrated_parent.recent_analyses[0].task_id == "parent_task_456"
-+
-+ @pytest.mark.asyncio
-+ async def test_load_by_category_returns_fully_hydrated_records(self, store):
-+ """load_by_category() should return records with tags, tool_deps, critical_tools, and recent_analyses."""
-+ # Create a skill record with all fields populated
-+ record = SkillRecord(
-+ skill_id="category_test_v1",
-+ name="category_test",
-+ description="Test category hydration",
-+ category=SkillCategory.REFERENCE,
-+ visibility=SkillVisibility.PUBLIC,
-+ path="/test/category.py",
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.IMPORTED,
-+ generation=0,
-+ content_snapshot={"category.py": "def category_skill():\n pass"},
-+ ),
-+ tags=["category-tag", "reference-tag"],
-+ tool_dependencies=["cat-tool1", "cat-tool2"],
-+ critical_tools=["cat-tool1"],
-+ total_selections=3,
-+ total_applied=2,
-+ total_completions=2,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+
-+ # Save the record
-+ await store.save_record(record)
-+
-+ # Add an analysis to test recent_analyses hydration
-+ analysis = ExecutionAnalysis(
-+ task_id="category_task_789",
-+ timestamp=datetime.now(),
-+ task_completed=True,
-+ execution_note="Category test analysis",
-+ skill_judgments=[
-+ SkillJudgment(
-+ skill_id=record.skill_id,
-+ skill_applied=True,
-+ note="Category skill applied successfully",
-+ )
-+ ],
-+ analyzed_at=datetime.now(),
-+ )
-+ await store.record_analysis(analysis)
-+
-+ # DEBUG: Let's verify the record was saved correctly
-+ saved_record = store.load_record("category_test_v1")
-+ assert saved_record is not None
-+
-+ # Test load_by_category() - this delegates to SkillRepository
-+ category_records = store.load_by_category(SkillCategory.REFERENCE)
-+ assert len(category_records) == 1
-+
-+ hydrated_record = category_records[0]
-+
-+ # Verify ALL fields are hydrated
-+ assert set(hydrated_record.tags) == {"category-tag", "reference-tag"}
-+ assert set(hydrated_record.tool_dependencies) == {"cat-tool1", "cat-tool2"}
-+ assert set(hydrated_record.critical_tools) == {"cat-tool1"}
-+ assert len(hydrated_record.recent_analyses) == 1
-+ assert hydrated_record.recent_analyses[0].task_id == "category_task_789"
-+
-+ @pytest.mark.asyncio
-+ async def test_hydration_consistency_across_facade_methods(self, store):
-+ """All facade methods should return consistently hydrated records for the same skill."""
-+ # Create a skill that we'll access through multiple facade methods
-+ record = SkillRecord(
-+ skill_id="consistency_test_v1",
-+ name="consistency_test",
-+ description="Test hydration consistency",
-+ category=SkillCategory.TOOL_GUIDE,
-+ visibility=SkillVisibility.PUBLIC,
-+ path="/test/consistency.py",
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.IMPORTED,
-+ generation=0,
-+ content_snapshot={"consistency.py": "def consistent_skill():\n pass"},
-+ ),
-+ tags=["consistency-tag", "facade-tag"],
-+ tool_dependencies=["cons-tool1", "cons-tool2"],
-+ critical_tools=["cons-tool1"],
-+ total_selections=1,
-+ total_applied=1,
-+ total_completions=1,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+
-+ # Save the record
-+ await store.save_record(record)
-+
-+ # Add an analysis
-+ analysis = ExecutionAnalysis(
-+ task_id="consistency_task_999",
-+ timestamp=datetime.now(),
-+ task_completed=True,
-+ execution_note="Consistency test analysis",
-+ skill_judgments=[
-+ SkillJudgment(
-+ skill_id=record.skill_id,
-+ skill_applied=True,
-+ note="Consistency skill applied",
-+ )
-+ ],
-+ analyzed_at=datetime.now(),
-+ )
-+ await store.record_analysis(analysis)
-+
-+ # Get the same skill through different facade methods
-+
-+ # Method 1: load_record (direct facade method, should be fully hydrated)
-+ direct_record = store.load_record(record.skill_id)
-+
-+ # Method 2: get_versions (delegates to LineageTracker)
-+ versions = store.get_versions("consistency_test")
-+ versions_record = versions[0]
-+
-+ # Method 3: load_by_category (delegates to SkillRepository)
-+ category_records = store.load_by_category(SkillCategory.TOOL_GUIDE)
-+ category_record = next(r for r in category_records if r.skill_id == record.skill_id)
-+
-+ # All three methods should return identically hydrated records
-+ expected_tags = {"consistency-tag", "facade-tag"}
-+ expected_tool_deps = {"cons-tool1", "cons-tool2"}
-+ expected_critical_tools = {"cons-tool1"}
-+
-+ for method_name, retrieved_record in [
-+ ("load_record", direct_record),
-+ ("get_versions", versions_record),
-+ ("load_by_category", category_record),
-+ ]:
-+ assert set(retrieved_record.tags) == expected_tags, f"{method_name} failed tag hydration"
-+ assert set(retrieved_record.tool_dependencies) == expected_tool_deps, f"{method_name} failed tool_deps hydration"
-+ assert set(retrieved_record.critical_tools) == expected_critical_tools, f"{method_name} failed critical_tools hydration"
-+ assert len(retrieved_record.recent_analyses) == 1, f"{method_name} failed recent_analyses hydration"
-+ assert retrieved_record.recent_analyses[0].task_id == "consistency_task_999", f"{method_name} failed recent_analyses content"
-+
-+ @pytest.mark.asyncio
-+ async def test_f5_evolved_child_inherits_tags(self, store):
-+ """F5: When skill A (with tags) is evolved to B, B should inherit A's tags (if expected)."""
-+ # Create a parent skill with tags
-+ parent_record = SkillRecord(
-+ skill_id="parent_f5_v1",
-+ name="parent_f5",
-+ description="Parent skill with tags",
-+ category=SkillCategory.WORKFLOW,
-+ visibility=SkillVisibility.PUBLIC,
-+ path="/test/parent_f5.py",
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.IMPORTED,
-+ generation=0,
-+ content_snapshot={"parent_f5.py": "def parent_f5():\n pass"},
-+ ),
-+ tags=["parent-tag", "workflow-tag"],
-+ tool_dependencies=["parent-tool"],
-+ critical_tools=["parent-tool"],
-+ total_selections=1,
-+ total_applied=1,
-+ total_completions=1,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+
-+ # Create a child skill that derives from parent (without explicitly copying tags)
-+ child_record = SkillRecord(
-+ skill_id="child_f5_v1",
-+ name="child_f5",
-+ description="Child skill derived from parent",
-+ category=SkillCategory.WORKFLOW,
-+ visibility=SkillVisibility.PUBLIC,
-+ path="/test/child_f5.py",
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.DERIVED,
-+ generation=1,
-+ parent_skill_ids=[parent_record.skill_id],
-+ source_task_id="f5_derivation_task",
-+ change_summary="Derived from parent skill",
-+ content_diff="@@ -1,1 +1,1 @@\n-def parent_f5():\n+def child_f5():",
-+ content_snapshot={"child_f5.py": "def child_f5():\n pass"},
-+ created_by="f5_test",
-+ ),
-+ tags=[], # No tags initially - should inherit from parent?
-+ tool_dependencies=["child-tool"],
-+ critical_tools=["child-tool"],
-+ total_selections=0,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+
-+ # Save parent, then evolve to child
-+ await store.save_record(parent_record)
-+ await store.evolve_skill(child_record, [parent_record.skill_id])
-+
-+ # Check if child inherited parent's tags
-+ child_loaded = store.load_record(child_record.skill_id)
-+ print(f"DEBUG F5: Parent tags: {parent_record.tags}")
-+ print(f"DEBUG F5: Child record tags: {child_record.tags}")
-+ print(f"DEBUG F5: Child loaded tags: {child_loaded.tags}")
-+
-+ # This test documents current behavior: whether tags are inherited or not.
-+ # Based on the findings, we may need to implement tag propagation in evolution;
-+ # for now, just record what actually happens.
-+ assert child_loaded is not None
-+
-+ # If tags should be inherited, this would be the assertion:
-+ # assert set(child_loaded.tags) >= {"parent-tag", "workflow-tag"}
-+ # But let's see what actually happens first
-+ if not child_loaded.tags:
-+ print("INFO F5: Child does NOT inherit parent tags (current behavior)")
-+ else:
-+ print(f"INFO F5: Child has tags: {child_loaded.tags}")
-\ No newline at end of file
diff --git a/pr-37-diff-r2.txt b/pr-37-diff-r2.txt
deleted file mode 100644
index 4214e32..0000000
--- a/pr-37-diff-r2.txt
+++ /dev/null
@@ -1,1374 +0,0 @@
-diff --git a/openspace/skill_engine/lineage_tracker.py b/openspace/skill_engine/lineage_tracker.py
-index 4cc3755..1aa4379 100644
---- a/openspace/skill_engine/lineage_tracker.py
-+++ b/openspace/skill_engine/lineage_tracker.py
-@@ -192,7 +192,11 @@ class LineageTracker:
- """
- self._ensure_open()
- with self._mu:
-- self._conn.execute("BEGIN")
-+ owns_txn = not self._conn.in_transaction
-+ if owns_txn:
-+ self._conn.execute("BEGIN")
-+ else:
-+ self._conn.execute("SAVEPOINT sp_record_derivation")
- try:
- # For FIXED: deactivate same-name parents (superseded)
- if new_record.lineage.origin == SkillOrigin.FIXED:
-@@ -207,7 +211,10 @@ class LineageTracker:
- new_record.is_active = True
-
- self._repo._upsert(new_record)
-- self._conn.commit()
-+ if owns_txn:
-+ self._conn.commit()
-+ else:
-+ self._conn.execute("RELEASE sp_record_derivation")
-
- origin = new_record.lineage.origin.value
- logger.info(
-@@ -216,7 +223,10 @@ class LineageTracker:
- f"[{new_record.skill_id}] ← parents={parent_skill_ids}"
- )
- except Exception:
-- self._conn.rollback()
-+ if owns_txn:
-+ self._conn.rollback()
-+ else:
-+ self._conn.execute("ROLLBACK TO sp_record_derivation")
- raise
-
- # ── Lineage queries ───────────────────────────────────────────────
-diff --git a/openspace/skill_engine/skill_repository.py b/openspace/skill_engine/skill_repository.py
-index a7a302f..f8512e4 100644
---- a/openspace/skill_engine/skill_repository.py
-+++ b/openspace/skill_engine/skill_repository.py
-@@ -136,8 +136,9 @@ class SkillRepository:
- """Open a temporary read-only connection (WAL parallel reads)."""
- self._ensure_open()
- if not self._owns_conn:
-- # When sharing a connection, just use it directly
-- yield self._conn
-+ # When sharing a connection, acquire lock to prevent dirty reads
-+ with self._mu:
-+ yield self._conn
- return
- conn = self._make_connection(read_only=True)
- try:
-diff --git a/openspace/skill_engine/store.py b/openspace/skill_engine/store.py
-index ea0b476..2c3a584 100644
---- a/openspace/skill_engine/store.py
-+++ b/openspace/skill_engine/store.py
-@@ -1,5 +1,16 @@
--"""
-+"""SkillStore — SQLite persistence facade for skill quality tracking and evolution.
-+
-+Architecture (Phase 3 decomposition):
-+
-+SkillStore (facade) — store.py
-+├── MigrationManager — Schema creation & versioning (migration_manager.py)
-+├── SkillRepository — CRUD operations (skill_repository.py)
-+├── LineageTracker — Lineage traversal & evolution (lineage_tracker.py)
-+├── AnalysisStore — Execution analysis persistence (analysis_store.py)
-+└── TagSearch — Tag indexing & search operations (tag_search.py)
-+
- Storage location: /.openspace/openspace.db
-+
- Tables:
- skill_records — SkillRecord main table
- skill_lineage_parents — Lineage parent-child relationships (many-to-many)
-@@ -7,6 +18,12 @@ Tables:
- skill_judgments — Per-skill judgments within an analysis
- skill_tool_deps — Tool dependencies
- skill_tags — Auxiliary tags
-+
-+The SkillStore class serves as a unified facade that:
-+1. Manages the persistent SQLite connection and transaction safety
-+2. Delegates specialized operations to focused modules
-+3. Coordinates cross-module workflows (e.g., evolve_skill touches both lineage & repository)
-+4. Provides async API via asyncio.to_thread for thread-safe database access
- """
-
- from __future__ import annotations
-@@ -81,9 +98,13 @@ def _db_retry(
-
-
- class SkillStore:
-- """SQLite persistence engine — Skill quality tracking and evolution ledger.
-+ """SQLite persistence facade for skill quality tracking and evolution.
-
-- Architecture:
-+ Phase 3 Architecture:
-+ Delegates specialized operations to focused modules while maintaining unified API.
-+ All modules share the same connection and lock for transaction consistency.
-+
-+ Concurrency:
- Write path: async method → asyncio.to_thread → _xxx_sync → self._mu lock → self._conn
- Read path: sync method → self._reader() → independent short connection (WAL parallel read)
-
-@@ -185,8 +206,6 @@ class SkillStore:
- if f.exists():
- f.unlink()
-
--
--
- # Lifecycle
- def close(self) -> None:
- """Close the persistent connection. Subsequent ops will raise.
-@@ -280,7 +299,11 @@ class SkillStore:
- created = 0
- refreshed = 0
- with self._mu:
-- self._conn.execute("BEGIN")
-+ owns_txn = not self._conn.in_transaction
-+ if owns_txn:
-+ self._conn.execute("BEGIN")
-+ else:
-+ self._conn.execute("SAVEPOINT sp_sync_from_registry")
- try:
- # Fetch all existing records keyed by skill_id
- rows = self._conn.execute(
-@@ -374,9 +397,15 @@ class SkillStore:
- created += 1
- logger.debug(f"sync_from_registry: created {meta.name} [{meta.skill_id}]")
-
-- self._conn.commit()
-+ if owns_txn:
-+ self._conn.commit()
-+ else:
-+ self._conn.execute("RELEASE sp_sync_from_registry")
- except Exception:
-- self._conn.rollback()
-+ if owns_txn:
-+ self._conn.rollback()
-+ else:
-+ self._conn.execute("ROLLBACK TO sp_sync_from_registry")
- raise
-
- if created or refreshed:
-@@ -399,6 +428,11 @@ class SkillStore:
- - total_completions += 1 (if applied and completed)
- - total_fallbacks += 1 (if not applied and not completed)
- - last_updated = now
-+
-+ Note: record_analysis() and evolve_skill() are individually atomic but
-+ not jointly atomic. If evolve_skill() fails after record_analysis()
-+ succeeded, the analysis persists (this is intentional — analysis records
-+ what happened, regardless of whether evolution succeeds).
- """
- await asyncio.to_thread(self._record_analysis_sync, analysis)
-
-@@ -422,6 +456,11 @@ class SkillStore:
-
- In the same SQL transaction, guaranteed by ``self._mu``.
-
-+ Note: record_analysis() and evolve_skill() are individually atomic but
-+ not jointly atomic. If evolve_skill() fails after record_analysis()
-+ succeeded, the analysis persists (this is intentional — analysis records
-+ what happened, regardless of whether evolution succeeds).
-+
- Args:
- new_record : SkillRecord
- New version record, ``lineage.parent_skill_ids`` must be non-empty.
-@@ -453,12 +492,22 @@ class SkillStore:
- """
- self._ensure_open()
- with self._mu:
-- self._conn.execute("BEGIN")
-+ owns_txn = not self._conn.in_transaction
-+ if owns_txn:
-+ self._conn.execute("BEGIN")
-+ else:
-+ self._conn.execute("SAVEPOINT sp_save_record")
- try:
- self._upsert(record)
-- self._conn.commit()
-+ if owns_txn:
-+ self._conn.commit()
-+ else:
-+ self._conn.execute("RELEASE sp_save_record")
- except Exception:
-- self._conn.rollback()
-+ if owns_txn:
-+ self._conn.rollback()
-+ else:
-+ self._conn.execute("ROLLBACK TO sp_save_record")
- raise
-
- @_db_retry()
-@@ -470,13 +519,23 @@ class SkillStore:
- """
- self._ensure_open()
- with self._mu:
-- self._conn.execute("BEGIN")
-+ owns_txn = not self._conn.in_transaction
-+ if owns_txn:
-+ self._conn.execute("BEGIN")
-+ else:
-+ self._conn.execute("SAVEPOINT sp_save_records")
- try:
- for r in records:
- self._upsert(r)
-- self._conn.commit()
-+ if owns_txn:
-+ self._conn.commit()
-+ else:
-+ self._conn.execute("RELEASE sp_save_records")
- except Exception:
-- self._conn.rollback()
-+ if owns_txn:
-+ self._conn.rollback()
-+ else:
-+ self._conn.execute("ROLLBACK TO sp_save_records")
- raise
-
- @_db_retry()
-@@ -493,7 +552,11 @@ class SkillStore:
- """
- self._ensure_open()
- with self._mu:
-- self._conn.execute("BEGIN")
-+ owns_txn = not self._conn.in_transaction
-+ if owns_txn:
-+ self._conn.execute("BEGIN")
-+ else:
-+ self._conn.execute("SAVEPOINT sp_record_analysis")
- try:
- # Delegate analysis storage to AnalysisStore (Epic 3.4)
- self._analyses.insert_analysis(analysis)
-@@ -517,9 +580,15 @@ class SkillStore:
- (applied, completed, fallback, now_iso, j.skill_id),
- )
-
-- self._conn.commit()
-+ if owns_txn:
-+ self._conn.commit()
-+ else:
-+ self._conn.execute("RELEASE sp_record_analysis")
- except Exception:
-- self._conn.rollback()
-+ if owns_txn:
-+ self._conn.rollback()
-+ else:
-+ self._conn.execute("ROLLBACK TO sp_record_analysis")
- raise
-
- @_db_retry()
-@@ -531,35 +600,94 @@ class SkillStore:
- """Atomic: insert new version + deactivate parents (for FIXED).
-
- Delegates to :class:`LineageTracker` (Epic 3.3).
-+
-+ Note: evolve_skill() is individually atomic but not jointly atomic
-+ with record_analysis(). If evolve fails after analysis succeeded,
-+ the analysis persists — this is intentional (analysis records what
-+ happened regardless of evolution outcome).
- """
-+ self._ensure_open()
-+ # Note: We don't acquire self._mu here because LineageTracker
-+ # acquires it itself; self._mu is not reentrant, so taking it here
-+ # would deadlock. The tracker handles its own transaction management.
- self._lineage.record_derivation(new_record, parent_skill_ids)
-
- @_db_retry()
- def _deactivate_record_sync(self, skill_id: str) -> bool:
- self._ensure_open()
- with self._mu:
-- cur = self._conn.execute(
-- "UPDATE skill_records SET is_active=0, last_updated=? WHERE skill_id=?",
-- (datetime.now().isoformat(), skill_id),
-- )
-- self._conn.commit()
-- return cur.rowcount > 0
-+ owns_txn = not self._conn.in_transaction
-+ if owns_txn:
-+ self._conn.execute("BEGIN")
-+ else:
-+ self._conn.execute("SAVEPOINT sp_deactivate_record")
-+ try:
-+ cur = self._conn.execute(
-+ "UPDATE skill_records SET is_active=0, last_updated=? WHERE skill_id=?",
-+ (datetime.now().isoformat(), skill_id),
-+ )
-+ if owns_txn:
-+ self._conn.commit()
-+ else:
-+ self._conn.execute("RELEASE sp_deactivate_record")
-+ return cur.rowcount > 0
-+ except Exception:
-+ if owns_txn:
-+ self._conn.rollback()
-+ else:
-+ self._conn.execute("ROLLBACK TO sp_deactivate_record")
-+ raise
-
- @_db_retry()
- def _reactivate_record_sync(self, skill_id: str) -> bool:
- self._ensure_open()
- with self._mu:
-- cur = self._conn.execute(
-- "UPDATE skill_records SET is_active=1, last_updated=? WHERE skill_id=?",
-- (datetime.now().isoformat(), skill_id),
-- )
-- self._conn.commit()
-- return cur.rowcount > 0
-+ owns_txn = not self._conn.in_transaction
-+ if owns_txn:
-+ self._conn.execute("BEGIN")
-+ else:
-+ self._conn.execute("SAVEPOINT sp_reactivate_record")
-+ try:
-+ cur = self._conn.execute(
-+ "UPDATE skill_records SET is_active=1, last_updated=? WHERE skill_id=?",
-+ (datetime.now().isoformat(), skill_id),
-+ )
-+ if owns_txn:
-+ self._conn.commit()
-+ else:
-+ self._conn.execute("RELEASE sp_reactivate_record")
-+ return cur.rowcount > 0
-+ except Exception:
-+ if owns_txn:
-+ self._conn.rollback()
-+ else:
-+ self._conn.execute("ROLLBACK TO sp_reactivate_record")
-+ raise
-
- @_db_retry()
- def _delete_record_sync(self, skill_id: str) -> bool:
- self._ensure_open()
-- return self._repo.delete(skill_id)
-+ with self._mu:
-+ owns_txn = not self._conn.in_transaction
-+ if owns_txn:
-+ self._conn.execute("BEGIN")
-+ else:
-+ self._conn.execute("SAVEPOINT sp_delete_record")
-+ try:
-+ # Handle deletion directly rather than delegating to avoid double-commit
-+ cur = self._conn.execute("DELETE FROM skill_records WHERE skill_id=?", (skill_id,))
-+ result = cur.rowcount > 0
-+ if owns_txn:
-+ self._conn.commit()
-+ else:
-+ self._conn.execute("RELEASE sp_delete_record")
-+ return result
-+ except Exception:
-+ if owns_txn:
-+ self._conn.rollback()
-+ else:
-+ self._conn.execute("ROLLBACK TO sp_delete_record")
-+ raise
-
- # Read API (sync, each call opens its own read-only conn)
- @_db_retry()
-@@ -621,10 +749,12 @@ class SkillStore:
- Returns the newest active record (by ``last_updated DESC``).
- """
- normalized = skill_dir.rstrip("/")
-+ # Escape LIKE wildcards to prevent % and _ injection
-+ escaped = normalized.replace("%", "\\%").replace("_", "\\_")
- with self._reader() as conn:
- row = conn.execute(
-- "SELECT * FROM skill_records WHERE path LIKE ? AND is_active=1 ORDER BY last_updated DESC LIMIT 1",
-- (f"{normalized}%",),
-+ "SELECT * FROM skill_records WHERE path LIKE ? ESCAPE '\\' AND is_active=1 ORDER BY last_updated DESC LIMIT 1",
-+ (f"{escaped}%",),
- ).fetchone()
- return self._to_record(conn, row) if row else None
-
-@@ -811,16 +941,26 @@ class SkillStore:
- """Delete all data (keeps schema)."""
- self._ensure_open()
- with self._mu:
-- self._conn.execute("BEGIN")
-+ owns_txn = not self._conn.in_transaction
-+ if owns_txn:
-+ self._conn.execute("BEGIN")
-+ else:
-+ self._conn.execute("SAVEPOINT sp_clear")
- try:
- # CASCADE on skill_records cleans up: lineage_parents, tool_deps, tags
- self._conn.execute("DELETE FROM skill_records")
- # Delegate analysis clearing to AnalysisStore (Epic 3.4)
- self._analyses.clear_all_analyses()
-- self._conn.commit()
-+ if owns_txn:
-+ self._conn.commit()
-+ else:
-+ self._conn.execute("RELEASE sp_clear")
- logger.info("SkillStore cleared")
- except Exception:
-- self._conn.rollback()
-+ if owns_txn:
-+ self._conn.rollback()
-+ else:
-+ self._conn.execute("ROLLBACK TO sp_clear")
- raise
-
- def vacuum(self) -> None:
-@@ -1053,8 +1193,30 @@ class SkillStore:
- return self._tag_search.get_all_tags()
-
- def sync_tags(self, skill_id: str, tags: List[str]) -> None:
-- """Synchronize tags for a skill (facade to TagSearch)."""
-- return self._tag_search.sync_tags(skill_id, tags)
-+ """Synchronize tags for a skill (facade to TagSearch).
-+
-+ Must manage the transaction explicitly: in shared-connection mode,
-+ TagSearch delegates commit responsibility to the caller (this facade).
-+ """
-+ self._ensure_open()
-+ with self._mu:
-+ owns_txn = not self._conn.in_transaction
-+ if owns_txn:
-+ self._conn.execute("BEGIN")
-+ else:
-+ self._conn.execute("SAVEPOINT sp_sync_tags")
-+ try:
-+ self._tag_search.sync_tags(skill_id, tags)
-+ if owns_txn:
-+ self._conn.commit()
-+ else:
-+ self._conn.execute("RELEASE sp_sync_tags")
-+ except Exception:
-+ if owns_txn:
-+ self._conn.rollback()
-+ else:
-+ self._conn.execute("ROLLBACK TO sp_sync_tags")
-+ raise
-
- # ── Migration Management (Epic 3.6) ─────────────────────────────────
-
-@@ -1070,6 +1232,7 @@ class SkillStore:
- """Set schema version (facade to MigrationManager).
-
- DEPRECATED: Use ensure_current_schema() instead.
-+ This method is kept for backward compatibility with existing tests.
- """
- return self._migrations._set_schema_version(version)
-
-diff --git a/openspace/skill_engine/tag_search.py b/openspace/skill_engine/tag_search.py
-index 4c3ebd6..7f170d4 100644
---- a/openspace/skill_engine/tag_search.py
-+++ b/openspace/skill_engine/tag_search.py
-@@ -138,8 +138,6 @@ class TagSearch:
- finally:
- conn.close()
-
-- @_db_retry()
-- @_db_retry()
- @_db_retry()
- def _init_db(self) -> None:
- """Create tables if they don't exist (idempotent).
-diff --git a/tests/test_p3_integration.py b/tests/test_p3_integration.py
-new file mode 100644
-index 0000000..4638c07
---- /dev/null
-+++ b/tests/test_p3_integration.py
-@@ -0,0 +1,904 @@
-+"""Phase 3 Integration Tests — Epic 3.7
-+
-+Comprehensive tests exercising full workflows through the SkillStore facade,
-+verifying all extracted modules work together:
-+- SkillStore (facade) → store.py
-+- MigrationManager → migration_manager.py
-+- SkillRepository → skill_repository.py
-+- LineageTracker → lineage_tracker.py
-+- AnalysisStore → analysis_store.py
-+- TagSearch → tag_search.py
-+
-+Tests full end-to-end workflows, not individual module boundaries.
-+"""
-+
-+import asyncio
-+import concurrent.futures
-+import gc
-+import tempfile
-+import threading
-+import time
-+from datetime import datetime
-+from pathlib import Path
-+from typing import Dict, List
-+
-+import pytest
-+
-+from openspace.skill_engine.store import SkillStore
-+from openspace.skill_engine.types import (
-+ EvolutionSuggestion,
-+ EvolutionType,
-+ ExecutionAnalysis,
-+ SkillCategory,
-+ SkillJudgment,
-+ SkillLineage,
-+ SkillOrigin,
-+ SkillRecord,
-+ SkillVisibility,
-+)
-+
-+
-+@pytest.fixture
-+def temp_db():
-+ """Temporary database for each test."""
-+ with tempfile.TemporaryDirectory() as tmpdir:
-+ yield Path(tmpdir) / "test.db"
-+
-+
-+@pytest.fixture
-+def store(temp_db):
-+ """Clean SkillStore instance for each test."""
-+ store = SkillStore(temp_db)
-+ try:
-+ yield store
-+ finally:
-+ try:
-+ store.close()
-+ except Exception:
-+ pass
-+ # Force garbage collection to clean up any lingering connections
-+ gc.collect()
-+ # Small delay to let Windows finish closing file handles
-+ time.sleep(0.1)
-+
-+
-+@pytest.fixture
-+def sample_record():
-+ """Sample SkillRecord for testing."""
-+ return SkillRecord(
-+ skill_id="test_skill_v1",
-+ name="test_skill",
-+ description="Test skill for integration testing",
-+ category=SkillCategory.TOOL_GUIDE,
-+ visibility=SkillVisibility.PUBLIC,
-+ path="/test/skill.py",
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.IMPORTED,
-+ generation=0,
-+ content_snapshot={"skill.py": "def test_skill():\n pass"},
-+ ),
-+ tool_dependencies=[],
-+ critical_tools=[],
-+ total_selections=0,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+
-+
-+@pytest.fixture
-+def sample_analysis(sample_record):
-+ """Sample ExecutionAnalysis for testing."""
-+ return ExecutionAnalysis(
-+ task_id="task_123",
-+ timestamp=datetime.now(),
-+ task_completed=True,
-+ execution_note="Test execution completed successfully",
-+ skill_judgments=[
-+ SkillJudgment(
-+ skill_id="test_skill_v1",
-+ skill_applied=True,
-+ note="Skill executed successfully",
-+ )
-+ ],
-+ analyzed_at=datetime.now(),
-+ )
-+
-+
-+class TestSkillLifecycleWorkflow:
-+ """Test complete skill lifecycle: create → save → get → update → delete."""
-+
-+ @pytest.mark.asyncio
-+ async def test_basic_crud_operations(self, store, sample_record):
-+ """Test basic create, read, update, delete through facade."""
-+ # CREATE: Save record
-+ await store.save_record(sample_record)
-+
-+ # READ: Load record back
-+ loaded = store.load_record(sample_record.skill_id)
-+ assert loaded is not None
-+ assert loaded.skill_id == sample_record.skill_id
-+ assert loaded.name == sample_record.name
-+ assert loaded.description == sample_record.description
-+
-+ # UPDATE: Modify and save again
-+ sample_record.description = "Updated description"
-+ sample_record.total_selections = 5
-+ await store.save_record(sample_record)
-+
-+ updated = store.load_record(sample_record.skill_id)
-+ assert updated.description == "Updated description"
-+ assert updated.total_selections == 5
-+
-+ # DEACTIVATE: Soft delete
-+ result = await store.deactivate_record(sample_record.skill_id)
-+ assert result is True
-+
-+ # Should not appear in active queries
-+ active_skills = store.load_active()
-+ assert sample_record.skill_id not in active_skills
-+
-+ # But should still exist in full query
-+ all_skills = store.load_all(active_only=False)
-+ assert sample_record.skill_id in all_skills
-+
-+ # REACTIVATE
-+ result = await store.reactivate_record(sample_record.skill_id)
-+ assert result is True
-+
-+ active_skills = store.load_active()
-+ assert sample_record.skill_id in active_skills
-+
-+ # HARD DELETE
-+ result = await store.delete_record(sample_record.skill_id)
-+ assert result is True
-+
-+ deleted = store.load_record(sample_record.skill_id)
-+ assert deleted is None
-+
-+ @pytest.mark.asyncio
-+ async def test_batch_operations(self, store):
-+ """Test batch save and load operations."""
-+ # Create multiple records
-+ records = []
-+ for i in range(5):
-+ record = SkillRecord(
-+ skill_id=f"batch_skill_v{i}",
-+ name=f"batch_skill_{i}",
-+ description=f"Batch skill {i}",
-+ category=SkillCategory.TOOL_GUIDE,
-+ visibility=SkillVisibility.PUBLIC,
-+ path=f"/test/batch_{i}.py",
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.IMPORTED,
-+ generation=0,
-+ content_snapshot={f"batch_{i}.py": f"def batch_skill_{i}():\n pass"},
-+ ),
-+ tool_dependencies=[],
-+ critical_tools=[],
-+ total_selections=i,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+ records.append(record)
-+
-+ # Batch save
-+ await store.save_records(records)
-+
-+ # Verify all were saved
-+ all_skills = store.load_all()
-+ for record in records:
-+ assert record.skill_id in all_skills
-+ loaded = all_skills[record.skill_id]
-+ assert loaded.name == record.name
-+ assert loaded.total_selections == record.total_selections
-+
-+
-+class TestLineageWorkflow:
-+ """Test lineage workflow: create → evolve → record → traverse."""
-+
-+ @pytest.mark.asyncio
-+ async def test_evolution_and_lineage_tracking(self, store, sample_record):
-+ """Test skill evolution with lineage tracking."""
-+ # Save initial skill
-+ await store.save_record(sample_record)
-+
-+ # Evolve the skill
-+ evolved_record = SkillRecord(
-+ skill_id="evolved_test_skill_v1",
-+ name="evolved_test_skill",
-+ description="Evolved test skill with improvements",
-+ category=sample_record.category,
-+ visibility=sample_record.visibility,
-+ path=sample_record.path,
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.DERIVED,
-+ generation=1,
-+ parent_skill_ids=[sample_record.skill_id],
-+ change_summary="Added return value",
-+ content_snapshot={"skill.py": "def evolved_test_skill():\n return 'improved'"},
-+ ),
-+ tool_dependencies=[],
-+ critical_tools=[],
-+ total_selections=0,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+
-+ await store.evolve_skill(evolved_record, [sample_record.skill_id])
-+ evolved_skill_id = evolved_record.skill_id
-+
-+ assert evolved_skill_id != sample_record.skill_id
-+
-+ # Load evolved skill
-+ evolved = store.load_record(evolved_skill_id)
-+ assert evolved is not None
-+ assert evolved.name == "evolved_test_skill"
-+ assert evolved.lineage.generation == 1
-+ assert evolved.lineage.origin == SkillOrigin.DERIVED
-+
-+ # Check lineage relationships
-+ children = store.find_children(sample_record.skill_id)
-+ assert evolved_skill_id in children
-+
-+ # Get ancestry
-+ ancestry = store.get_ancestry(evolved_skill_id)
-+ assert len(ancestry) == 1 # just parent
-+ assert ancestry[0].skill_id == sample_record.skill_id
-+
-+ # Get lineage tree
-+ tree = store.get_lineage_tree(sample_record.skill_id)
-+ assert tree["skill_id"] == sample_record.skill_id
-+ assert len(tree["children"]) == 1
-+ assert tree["children"][0]["skill_id"] == evolved_skill_id
-+
-+ @pytest.mark.asyncio
-+ async def test_multi_generation_lineage(self, store, sample_record):
-+ """Test multiple generations of evolution."""
-+ # Generation 0: Original
-+ await store.save_record(sample_record)
-+
-+ # Generation 1: First evolution
-+ gen1_record = SkillRecord(
-+ skill_id="gen1_skill_v1",
-+ name="gen1_skill",
-+ description="First generation evolution",
-+ category=sample_record.category,
-+ visibility=sample_record.visibility,
-+ path=sample_record.path,
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.DERIVED,
-+ generation=1,
-+ parent_skill_ids=[sample_record.skill_id],
-+ change_summary="First evolution",
-+ content_snapshot={"skill.py": "def gen1():\n return 1"},
-+ ),
-+ tool_dependencies=[],
-+ critical_tools=[],
-+ total_selections=0,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+
-+ await store.evolve_skill(gen1_record, [sample_record.skill_id])
-+ gen1_id = gen1_record.skill_id
-+
-+ # Generation 2: Second evolution
-+ gen2_record = SkillRecord(
-+ skill_id="gen2_skill_v1",
-+ name="gen2_skill",
-+ description="Second generation evolution",
-+ category=sample_record.category,
-+ visibility=sample_record.visibility,
-+ path=sample_record.path,
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.DERIVED,
-+ generation=2,
-+ parent_skill_ids=[gen1_id],
-+ change_summary="Second evolution",
-+ content_snapshot={"skill.py": "def gen2():\n return 2"},
-+ ),
-+ tool_dependencies=[],
-+ critical_tools=[],
-+ total_selections=0,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+
-+ await store.evolve_skill(gen2_record, [gen1_id])
-+ gen2_id = gen2_record.skill_id
-+
-+ # Verify generations
-+ gen2 = store.load_record(gen2_id)
-+ assert gen2.lineage.generation == 2
-+
-+ # Check full ancestry
-+ ancestry = store.get_ancestry(gen2_id)
-+ assert len(ancestry) == 2 # gen1 + gen0 (ancestors only)
-+ assert ancestry[0].skill_id == sample_record.skill_id # gen0 (oldest)
-+ assert ancestry[1].skill_id == gen1_id # gen1 (newer)
-+
-+
-+class TestAnalysisWorkflow:
-+ """Test analysis workflow: save skill → record analysis → retrieve → evolve."""
-+
-+ @pytest.mark.asyncio
-+ async def test_analysis_recording_and_retrieval(self, store, sample_record, sample_analysis):
-+ """Test recording and retrieving execution analyses."""
-+ # Save skill first
-+ await store.save_record(sample_record)
-+
-+ # Record analysis
-+ await store.record_analysis(sample_analysis)
-+
-+ # Load analysis back
-+ loaded_analysis = store.load_analyses_for_task(sample_analysis.task_id)
-+ assert loaded_analysis is not None
-+ assert loaded_analysis.task_id == sample_analysis.task_id
-+ assert loaded_analysis.execution_note == sample_analysis.execution_note
-+ assert len(loaded_analysis.skill_judgments) == 1
-+
-+ # Load analyses for skill
-+ skill_analyses = store.load_analyses(sample_record.skill_id)
-+ assert len(skill_analyses) == 1
-+ assert skill_analyses[0].task_id == sample_analysis.task_id
-+
-+ # Get all analyses
-+ all_analyses = store.load_all_analyses()
-+ assert len(all_analyses) == 1
-+ assert all_analyses[0].task_id == sample_analysis.task_id
-+
-+ @pytest.mark.asyncio
-+ async def test_evolution_candidates(self, store, sample_record):
-+ """Test getting evolution candidates from analysis data."""
-+ # Save skill
-+ await store.save_record(sample_record)
-+
-+ # Create analysis with failure and evolution suggestion
-+ failed_analysis = ExecutionAnalysis(
-+ task_id="failed_task",
-+ timestamp=datetime.now(),
-+ task_completed=False, # Failed task
-+ execution_note="Skill failed to execute properly",
-+ skill_judgments=[
-+ SkillJudgment(
-+ skill_id=sample_record.skill_id,
-+ skill_applied=False, # Failed to apply
-+ note="Skill failed to execute",
-+ )
-+ ],
-+ evolution_suggestions=[
-+ EvolutionSuggestion(
-+ evolution_type=EvolutionType.FIX,
-+ target_skill_ids=[sample_record.skill_id],
-+ direction="Fix skill execution issues",
-+ )
-+ ],
-+ analyzed_at=datetime.now(),
-+ )
-+
-+ await store.record_analysis(failed_analysis)
-+
-+ # Get evolution candidates (should include skills with failures)
-+ candidates = store.load_evolution_candidates()
-+ assert len(candidates) > 0
-+
-+ # Should include our failed analysis
-+ task_ids = [c.task_id for c in candidates]
-+ assert failed_analysis.task_id in task_ids
-+
-+
-+class TestTagSearchWorkflow:
-+ """Test tag/search workflow: save with tags → search by tags → search by query."""
-+
-+ @pytest.mark.asyncio
-+ async def test_tag_operations(self, store, sample_record):
-+ """Test tag synchronization and retrieval."""
-+ # Save skill
-+ await store.save_record(sample_record)
-+
-+ # Add tags
-+ tags = ["python", "testing", "integration"]
-+ store.sync_tags(sample_record.skill_id, tags)
-+
-+ # Get tags back
-+ loaded_tags = store.get_tags(sample_record.skill_id)
-+ assert set(loaded_tags) == set(tags)
-+
-+ # Get all tags
-+ all_tags = store.get_all_tags()
-+ tag_names = [tag["tag"] for tag in all_tags]
-+ for tag in tags:
-+ assert tag in tag_names
-+
-+ # Find by tags
-+ found_skills = store.find_skills_by_tags(["python"])
-+ assert sample_record.skill_id in found_skills
-+
-+ found_skills = store.find_skills_by_tags(["python", "testing"], match_all=True)
-+ assert sample_record.skill_id in found_skills
-+
-+ found_skills = store.find_skills_by_tags(["nonexistent"])
-+ assert sample_record.skill_id not in found_skills
-+
-+ @pytest.mark.asyncio
-+ async def test_comprehensive_search(self, store):
-+ """Test comprehensive search functionality."""
-+ # Create skills with different attributes
-+ skills_data = [
-+ {
-+ "skill_id": "tool_skill_v1",
-+ "name": "tool_helper",
-+ "description": "Tool utility functions",
-+ "category": SkillCategory.TOOL_GUIDE,
-+ "tags": ["tool", "utility"]
-+ },
-+ {
-+ "skill_id": "workflow_skill_v1",
-+ "name": "workflow_helper",
-+ "description": "Workflow helper functions",
-+ "category": SkillCategory.WORKFLOW,
-+ "tags": ["workflow", "utility"]
-+ },
-+ {
-+ "skill_id": "private_skill_v1",
-+ "name": "private_helper",
-+ "description": "Private internal tool",
-+ "category": SkillCategory.REFERENCE,
-+ "tags": ["internal"],
-+ "visibility": SkillVisibility.PRIVATE
-+ }
-+ ]
-+
-+ for skill_data in skills_data:
-+ record = SkillRecord(
-+ skill_id=skill_data["skill_id"],
-+ name=skill_data["name"],
-+ description=skill_data["description"],
-+ category=skill_data["category"],
-+ visibility=skill_data.get("visibility", SkillVisibility.PUBLIC),
-+ path=f"/test/{skill_data['name']}.py",
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.IMPORTED,
-+ generation=0,
-+ content_snapshot={f"{skill_data['name']}.py": f"def {skill_data['name']}():\n pass"},
-+ ),
-+ tool_dependencies=[],
-+ critical_tools=[],
-+ total_selections=0,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+ await store.save_record(record)
-+ store.sync_tags(skill_data["skill_id"], skill_data["tags"])
-+
-+ # Search by text query
-+ results = store.search_skills("tool")
-+ assert len(results) >= 1
-+ skill_ids = [r["skill_id"] for r in results]
-+ assert "tool_skill_v1" in skill_ids
-+
-+ # Search by category
-+ results = store.search_skills(category=SkillCategory.TOOL_GUIDE)
-+ assert len(results) == 1
-+ assert results[0]["skill_id"] == "tool_skill_v1"
-+
-+ # Search by tags
-+ results = store.search_skills(tags=["utility"])
-+ assert len(results) == 2 # tool and workflow skills
-+
-+ # Search with visibility filter
-+ results = store.search_skills(visibility=SkillVisibility.PUBLIC)
-+ skill_ids = [r["skill_id"] for r in results]
-+ assert "private_skill_v1" not in skill_ids
-+
-+ # Comprehensive search with multiple filters
-+ results = store.search_skills(
-+ query="utility",
-+ tags=["tool"],
-+ category=SkillCategory.TOOL_GUIDE,
-+ limit=10
-+ )
-+ assert len(results) == 1
-+ assert results[0]["skill_id"] == "tool_skill_v1"
-+
-+
-+class TestMigrationWorkflow:
-+ """Test migration workflow: fresh DB → schema creation → all modules work."""
-+
-+ def test_fresh_database_initialization(self, temp_db):
-+ """Test that a fresh database is properly initialized."""
-+ # Create store which should initialize schema
-+ store = SkillStore(temp_db)
-+
-+ try:
-+ # Verify schema version
-+ version = store.get_schema_version()
-+ assert version == 1
-+
-+ # Verify we can perform basic operations
-+ stats = store.get_stats()
-+ assert stats["total_skills"] == 0
-+ assert stats["total_analyses"] == 0
-+
-+ # Test that all module operations work
-+ all_skills = store.load_all()
-+ assert len(all_skills) == 0
-+
-+ all_tags = store.get_all_tags()
-+ assert len(all_tags) == 0
-+
-+ finally:
-+ store.close()
-+
-+ def test_schema_migration_methods(self, store):
-+ """Test schema migration facade methods."""
-+ # Test version operations
-+ current_version = store.get_schema_version()
-+ assert current_version == 1
-+
-+ # Test ensure current schema (should be idempotent)
-+ store.ensure_current_schema()
-+ assert store.get_schema_version() == 1
-+
-+ # Test migration method exists
-+ store.migrate_to_version(1) # Should be no-op
-+ assert store.get_schema_version() == 1
-+
-+
-+class TestCrossModuleWorkflow:
-+ """Test cross-module workflow: save → tag → analyze → evolve → comprehensive retrieval."""
-+
-+ @pytest.mark.asyncio
-+ async def test_complete_skill_workflow(self, store, sample_record):
-+ """Test a complete workflow touching all modules."""
-+ # 1. REPOSITORY: Save initial skill
-+ await store.save_record(sample_record)
-+
-+ # 2. TAG SEARCH: Add tags
-+ tags = ["python", "testing", "v1"]
-+ store.sync_tags(sample_record.skill_id, tags)
-+
-+ # 3. ANALYSIS STORE: Record successful execution
-+ analysis = ExecutionAnalysis(
-+ task_id="workflow_test_123",
-+ timestamp=datetime.now(),
-+ task_completed=True,
-+ execution_note="Executed successfully in workflow test",
-+ skill_judgments=[
-+ SkillJudgment(
-+ skill_id=sample_record.skill_id,
-+ skill_applied=True,
-+ note="Executed successfully in workflow test",
-+ )
-+ ],
-+ analyzed_at=datetime.now(),
-+ )
-+ await store.record_analysis(analysis)
-+
-+ # 4. LINEAGE TRACKER: Evolve the skill
-+ evolved_record = SkillRecord(
-+ skill_id="evolved_workflow_skill_v1",
-+ name="evolved_workflow_skill",
-+ description="Enhanced workflow skill for testing",
-+ category=sample_record.category,
-+ visibility=sample_record.visibility,
-+ path=sample_record.path,
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.DERIVED,
-+ generation=1,
-+ parent_skill_ids=[sample_record.skill_id],
-+ change_summary="Enhanced for workflow testing",
-+ content_snapshot={"skill.py": "def evolved_workflow_skill():\n return 'enhanced'"},
-+ ),
-+ tool_dependencies=[],
-+ critical_tools=[],
-+ total_selections=0,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+
-+ await store.evolve_skill(evolved_record, [sample_record.skill_id])
-+ evolved_id = evolved_record.skill_id
-+
-+ # 5. COMPREHENSIVE VERIFICATION through facade
-+
-+ # Repository operations
-+ original = store.load_record(sample_record.skill_id)
-+ evolved = store.load_record(evolved_id)
-+ assert original is not None
-+ assert evolved is not None
-+ assert evolved.lineage.generation == 1
-+
-+ # Tag operations
-+ original_tags = store.get_tags(sample_record.skill_id)
-+ assert set(original_tags) == set(tags)
-+
-+ # Search operations
-+ found_by_tag = store.find_skills_by_tags(["python"])
-+ assert sample_record.skill_id in found_by_tag
-+
-+ search_results = store.search_skills("testing")
-+ skill_ids = [r["skill_id"] for r in search_results]
-+ assert sample_record.skill_id in skill_ids
-+
-+ # Analysis operations
-+ skill_analyses = store.load_analyses(sample_record.skill_id)
-+ assert len(skill_analyses) == 1
-+ assert skill_analyses[0].task_id == analysis.task_id
-+
-+ # Lineage operations
-+ children = store.find_children(sample_record.skill_id)
-+ assert evolved_id in children
-+
-+ ancestry = store.get_ancestry(evolved_id)
-+ assert len(ancestry) == 1 # just the parent
-+
-+ # Summary stats should reflect all data
-+ stats = store.get_stats()
-+ assert stats["total_skills"] == 2 # original + evolved
-+ assert stats["total_analyses"] == 1
-+
-+
-+class TestConcurrentAccess:
-+ """Test concurrent access to SkillStore."""
-+
-+ @pytest.mark.asyncio
-+ async def test_concurrent_reads_and_writes(self, store, sample_record):
-+ """Test that concurrent operations don't interfere."""
-+ # Save initial skill
-+ await store.save_record(sample_record)
-+
-+ async def reader_task(worker_id: int, iterations: int):
-+ """Read operations in concurrent task."""
-+ results = []
-+ for i in range(iterations):
-+ # Mix of read operations
-+ record = store.load_record(sample_record.skill_id)
-+ all_skills = store.load_all()
-+ stats = store.get_stats()
-+ results.append({
-+ 'worker': worker_id,
-+ 'iteration': i,
-+ 'record_found': record is not None,
-+ 'total_skills': len(all_skills),
-+ 'stats_count': stats.get('total_skills', 0)
-+ })
-+ return results
-+
-+ async def writer_task(worker_id: int, iterations: int):
-+ """Write operations in concurrent task."""
-+ for i in range(iterations):
-+ # Create unique records
-+ record = SkillRecord(
-+ skill_id=f"concurrent_skill_w{worker_id}_i{i}",
-+ name=f"concurrent_skill_{worker_id}_{i}",
-+ description=f"Concurrent test skill worker {worker_id} iteration {i}",
-+ category=SkillCategory.TOOL_GUIDE,
-+ visibility=SkillVisibility.PUBLIC,
-+ path=f"/test/concurrent_{worker_id}_{i}.py",
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.IMPORTED,
-+ generation=0,
-+ content_snapshot={f"concurrent_{worker_id}_{i}.py": f"def concurrent_skill_{worker_id}_{i}():\n pass"},
-+ ),
-+ tool_dependencies=[],
-+ critical_tools=[],
-+ total_selections=0,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+ await store.save_record(record)
-+
-+ # Run concurrent readers and writers
-+ num_readers = 3
-+ num_writers = 2
-+ iterations = 5
-+
-+ tasks = []
-+
-+ # Start reader tasks
-+ for i in range(num_readers):
-+ tasks.append(reader_task(i, iterations))
-+
-+ # Start writer tasks
-+ for i in range(num_writers):
-+ tasks.append(writer_task(i, iterations))
-+
-+ # Wait for all to complete
-+ results = await asyncio.gather(*tasks, return_exceptions=True)
-+
-+ # Verify no exceptions occurred
-+ for result in results:
-+ if isinstance(result, Exception):
-+ pytest.fail(f"Concurrent task failed: {result}")
-+
-+ # Verify final state
-+ final_count = store.count()
-+ expected_min = 1 + (num_writers * iterations) # initial + written records
-+ assert final_count >= expected_min
-+
-+ def test_concurrent_threading(self, store, sample_record):
-+ """Test concurrent access from multiple threads using threading."""
-+ # Save initial skill synchronously
-+ asyncio.run(store.save_record(sample_record))
-+
-+ results = {}
-+ errors = []
-+
-+ def reader_thread(thread_id: int):
-+ """Thread function for read operations."""
-+ try:
-+ for i in range(10):
-+ record = store.load_record(sample_record.skill_id)
-+ stats = store.get_stats()
-+ results[f"reader_{thread_id}_{i}"] = {
-+ 'found': record is not None,
-+ 'stats': stats.get('total_skills', 0)
-+ }
-+ time.sleep(0.001) # Small delay
-+ except Exception as e:
-+ errors.append(f"Reader {thread_id}: {e}")
-+
-+ def writer_thread(thread_id: int):
-+ """Thread function for write operations."""
-+ try:
-+ async def write_operations():
-+ for i in range(5):
-+ record = SkillRecord(
-+ skill_id=f"thread_skill_{thread_id}_{i}",
-+ name=f"thread_skill_{thread_id}_{i}",
-+ description=f"Thread test skill {thread_id}-{i}",
-+ category=SkillCategory.TOOL_GUIDE,
-+ visibility=SkillVisibility.PUBLIC,
-+ path=f"/test/thread_{thread_id}_{i}.py",
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.IMPORTED,
-+ generation=0,
-+ content_snapshot={f"thread_{thread_id}_{i}.py": f"def thread_skill_{thread_id}_{i}():\n pass"},
-+ ),
-+ tool_dependencies=[],
-+ critical_tools=[],
-+ total_selections=0,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+ await store.save_record(record)
-+
-+ # Run async operations in thread
-+ asyncio.run(write_operations())
-+
-+ except Exception as e:
-+ errors.append(f"Writer {thread_id}: {e}")
-+
-+ # Start threads
-+ threads = []
-+
-+ # Start 2 reader threads
-+ for i in range(2):
-+ thread = threading.Thread(target=reader_thread, args=(i,))
-+ threads.append(thread)
-+ thread.start()
-+
-+ # Start 1 writer thread
-+ writer = threading.Thread(target=writer_thread, args=(0,))
-+ threads.append(writer)
-+ writer.start()
-+
-+ # Wait for completion
-+ for thread in threads:
-+ thread.join(timeout=10.0)
-+ if thread.is_alive():
-+ pytest.fail("Thread did not complete within timeout")
-+
-+ # Check for errors
-+ if errors:
-+ pytest.fail(f"Concurrent threading errors: {errors}")
-+
-+ # Verify some operations succeeded
-+ assert len(results) > 0
-+
-+ # Verify final database state
-+ final_count = store.count()
-+ assert final_count >= 1 # At least the initial record
-+
-+
-+class TestEdgeCases:
-+ """Test edge cases and error conditions in integrated workflows."""
-+
-+ @pytest.mark.asyncio
-+ async def test_nonexistent_skill_operations(self, store):
-+ """Test operations on nonexistent skills."""
-+ nonexistent_id = "nonexistent_skill_v1"
-+
-+ # Load nonexistent
-+ record = store.load_record(nonexistent_id)
-+ assert record is None
-+
-+ # Deactivate nonexistent
-+ result = await store.deactivate_record(nonexistent_id)
-+ assert result is False
-+
-+ # Delete nonexistent
-+ result = await store.delete_record(nonexistent_id)
-+ assert result is False
-+
-+ # Get tags for nonexistent
-+ tags = store.get_tags(nonexistent_id)
-+ assert tags == []
-+
-+ # Find children of nonexistent
-+ children = store.find_children(nonexistent_id)
-+ assert children == []
-+
-+ @pytest.mark.asyncio
-+ async def test_empty_database_operations(self, store):
-+ """Test operations on empty database."""
-+ # Load operations on empty DB
-+ all_skills = store.load_all()
-+ assert len(all_skills) == 0
-+
-+ active_skills = store.load_active()
-+ assert len(active_skills) == 0
-+
-+ # Search operations on empty DB
-+ results = store.search_skills("anything")
-+ assert len(results) == 0
-+
-+ found = store.find_skills_by_tags(["any", "tags"])
-+ assert len(found) == 0
-+
-+ # Analysis operations on empty DB
-+ analyses = store.load_all_analyses()
-+ assert len(analyses) == 0
-+
-+ candidates = store.load_evolution_candidates()
-+ assert len(candidates) == 0
-+
-+ # Stats on empty DB
-+ stats = store.get_stats()
-+ assert stats["total_skills"] == 0
-+ assert stats["total_analyses"] == 0
-+
-+ count = store.count()
-+ assert count == 0
-+
-+
-+if __name__ == "__main__":
-+ # Run tests if executed directly
-+ pytest.main([__file__, "-v"])
-\ No newline at end of file
diff --git a/pr-37-diff.txt b/pr-37-diff.txt
deleted file mode 100644
index e1f1c4c..0000000
--- a/pr-37-diff.txt
+++ /dev/null
@@ -1,1207 +0,0 @@
-diff --git a/openspace/skill_engine/lineage_tracker.py b/openspace/skill_engine/lineage_tracker.py
-index 4cc3755..c5d1f76 100644
---- a/openspace/skill_engine/lineage_tracker.py
-+++ b/openspace/skill_engine/lineage_tracker.py
-@@ -192,7 +192,15 @@ class LineageTracker:
- """
- self._ensure_open()
- with self._mu:
-- self._conn.execute("BEGIN")
-+ # Use conditional transaction logic to avoid nested transactions
-+ try:
-+ self._conn.execute("BEGIN")
-+ need_commit = True
-+ except sqlite3.OperationalError as e:
-+ if "cannot start a transaction within a transaction" in str(e):
-+ need_commit = False # Already in transaction
-+ else:
-+ raise # Some other error
- try:
- # For FIXED: deactivate same-name parents (superseded)
- if new_record.lineage.origin == SkillOrigin.FIXED:
-@@ -207,7 +215,8 @@ class LineageTracker:
- new_record.is_active = True
-
- self._repo._upsert(new_record)
-- self._conn.commit()
-+ if need_commit:
-+ self._conn.commit()
-
- origin = new_record.lineage.origin.value
- logger.info(
-@@ -216,7 +225,8 @@ class LineageTracker:
- f"[{new_record.skill_id}] ← parents={parent_skill_ids}"
- )
- except Exception:
-- self._conn.rollback()
-+ if need_commit:
-+ self._conn.rollback()
- raise
-
- # ── Lineage queries ───────────────────────────────────────────────
-diff --git a/openspace/skill_engine/store.py b/openspace/skill_engine/store.py
-index ea0b476..d532b0f 100644
---- a/openspace/skill_engine/store.py
-+++ b/openspace/skill_engine/store.py
-@@ -1,5 +1,16 @@
--"""
-+"""SkillStore — SQLite persistence facade for skill quality tracking and evolution.
-+
-+Architecture (Phase 3 decomposition):
-+
-+SkillStore (facade) — store.py
-+├── MigrationManager — Schema creation & versioning (migration_manager.py)
-+├── SkillRepository — CRUD operations (skill_repository.py)
-+├── LineageTracker — Lineage traversal & evolution (lineage_tracker.py)
-+├── AnalysisStore — Execution analysis persistence (analysis_store.py)
-+└── TagSearch — Tag indexing & search operations (tag_search.py)
-+
- Storage location: /.openspace/openspace.db
-+
- Tables:
- skill_records — SkillRecord main table
- skill_lineage_parents — Lineage parent-child relationships (many-to-many)
-@@ -7,6 +18,12 @@ Tables:
- skill_judgments — Per-skill judgments within an analysis
- skill_tool_deps — Tool dependencies
- skill_tags — Auxiliary tags
-+
-+The SkillStore class serves as a unified facade that:
-+1. Manages the persistent SQLite connection and transaction safety
-+2. Delegates specialized operations to focused modules
-+3. Coordinates cross-module workflows (e.g., evolve_skill touches both lineage & repository)
-+4. Provides async API via asyncio.to_thread for thread-safe database access
- """
-
- from __future__ import annotations
-@@ -81,9 +98,13 @@ def _db_retry(
-
-
- class SkillStore:
-- """SQLite persistence engine — Skill quality tracking and evolution ledger.
-+ """SQLite persistence facade for skill quality tracking and evolution.
-
-- Architecture:
-+ Phase 3 Architecture:
-+ Delegates specialized operations to focused modules while maintaining unified API.
-+ All modules share the same connection and lock for transaction consistency.
-+
-+ Concurrency:
- Write path: async method → asyncio.to_thread → _xxx_sync → self._mu lock → self._conn
- Read path: sync method → self._reader() → independent short connection (WAL parallel read)
-
-@@ -185,8 +206,6 @@ class SkillStore:
- if f.exists():
- f.unlink()
-
--
--
- # Lifecycle
- def close(self) -> None:
- """Close the persistent connection. Subsequent ops will raise.
-@@ -280,7 +299,15 @@ class SkillStore:
- created = 0
- refreshed = 0
- with self._mu:
-- self._conn.execute("BEGIN")
-+ # Try to start transaction; if we're already in one, continue without BEGIN
-+ try:
-+ self._conn.execute("BEGIN")
-+ need_commit = True
-+ except sqlite3.OperationalError as e:
-+ if "cannot start a transaction within a transaction" in str(e):
-+ need_commit = False # Already in transaction
-+ else:
-+ raise # Some other error
- try:
- # Fetch all existing records keyed by skill_id
- rows = self._conn.execute(
-@@ -374,9 +401,11 @@ class SkillStore:
- created += 1
- logger.debug(f"sync_from_registry: created {meta.name} [{meta.skill_id}]")
-
-- self._conn.commit()
-+ if need_commit:
-+ self._conn.commit()
- except Exception:
-- self._conn.rollback()
-+ if need_commit:
-+ self._conn.rollback()
- raise
-
- if created or refreshed:
-@@ -453,12 +482,22 @@ class SkillStore:
- """
- self._ensure_open()
- with self._mu:
-- self._conn.execute("BEGIN")
-+ # Try to start transaction; if we're already in one, continue without BEGIN
-+ try:
-+ self._conn.execute("BEGIN")
-+ need_commit = True
-+ except sqlite3.OperationalError as e:
-+ if "cannot start a transaction within a transaction" in str(e):
-+ need_commit = False # Already in transaction
-+ else:
-+ raise # Some other error
- try:
- self._upsert(record)
-- self._conn.commit()
-+ if need_commit:
-+ self._conn.commit()
- except Exception:
-- self._conn.rollback()
-+ if need_commit:
-+ self._conn.rollback()
- raise
-
- @_db_retry()
-@@ -470,13 +509,23 @@ class SkillStore:
- """
- self._ensure_open()
- with self._mu:
-- self._conn.execute("BEGIN")
-+ # Try to start transaction; if we're already in one, continue without BEGIN
-+ try:
-+ self._conn.execute("BEGIN")
-+ need_commit = True
-+ except sqlite3.OperationalError as e:
-+ if "cannot start a transaction within a transaction" in str(e):
-+ need_commit = False # Already in transaction
-+ else:
-+ raise # Some other error
- try:
- for r in records:
- self._upsert(r)
-- self._conn.commit()
-+ if need_commit:
-+ self._conn.commit()
- except Exception:
-- self._conn.rollback()
-+ if need_commit:
-+ self._conn.rollback()
- raise
-
- @_db_retry()
-@@ -493,7 +542,15 @@ class SkillStore:
- """
- self._ensure_open()
- with self._mu:
-- self._conn.execute("BEGIN")
-+ # Try to start transaction; if we're already in one, continue without BEGIN
-+ try:
-+ self._conn.execute("BEGIN")
-+ need_commit = True
-+ except sqlite3.OperationalError as e:
-+ if "cannot start a transaction within a transaction" in str(e):
-+ need_commit = False # Already in transaction
-+ else:
-+ raise # Some other error
- try:
- # Delegate analysis storage to AnalysisStore (Epic 3.4)
- self._analyses.insert_analysis(analysis)
-@@ -517,9 +574,11 @@ class SkillStore:
- (applied, completed, fallback, now_iso, j.skill_id),
- )
-
-- self._conn.commit()
-+ if need_commit:
-+ self._conn.commit()
- except Exception:
-- self._conn.rollback()
-+ if need_commit:
-+ self._conn.rollback()
- raise
-
- @_db_retry()
-@@ -532,6 +591,9 @@ class SkillStore:
-
- Delegates to :class:`LineageTracker` (Epic 3.3).
- """
-+ # Note: We don't acquire self._mu here because the lineage tracker
-+ # will acquire it and both use the same mutex (would cause deadlock).
-+ # The lineage tracker handles its own transaction management.
- self._lineage.record_derivation(new_record, parent_skill_ids)
-
- @_db_retry()
-@@ -811,16 +873,26 @@ class SkillStore:
- """Delete all data (keeps schema)."""
- self._ensure_open()
- with self._mu:
-- self._conn.execute("BEGIN")
-+ # Try to start transaction; if we're already in one, continue without BEGIN
-+ try:
-+ self._conn.execute("BEGIN")
-+ need_commit = True
-+ except sqlite3.OperationalError as e:
-+ if "cannot start a transaction within a transaction" in str(e):
-+ need_commit = False # Already in transaction
-+ else:
-+ raise # Some other error
- try:
- # CASCADE on skill_records cleans up: lineage_parents, tool_deps, tags
- self._conn.execute("DELETE FROM skill_records")
- # Delegate analysis clearing to AnalysisStore (Epic 3.4)
- self._analyses.clear_all_analyses()
-- self._conn.commit()
-+ if need_commit:
-+ self._conn.commit()
- logger.info("SkillStore cleared")
- except Exception:
-- self._conn.rollback()
-+ if need_commit:
-+ self._conn.rollback()
- raise
-
- def vacuum(self) -> None:
-@@ -1053,8 +1125,29 @@ class SkillStore:
- return self._tag_search.get_all_tags()
-
- def sync_tags(self, skill_id: str, tags: List[str]) -> None:
-- """Synchronize tags for a skill (facade to TagSearch)."""
-- return self._tag_search.sync_tags(skill_id, tags)
-+ """Synchronize tags for a skill (facade to TagSearch).
-+
-+ Must manage transaction explicitly — TagSearch shared-mode
-+ delegates commit responsibility to the caller (us).
-+ """
-+ self._ensure_open()
-+ with self._mu:
-+ try:
-+ self._conn.execute("BEGIN")
-+ need_commit = True
-+ except sqlite3.OperationalError as e:
-+ if "cannot start a transaction within a transaction" in str(e):
-+ need_commit = False
-+ else:
-+ raise
-+ try:
-+ self._tag_search.sync_tags(skill_id, tags)
-+ if need_commit:
-+ self._conn.commit()
-+ except Exception:
-+ if need_commit:
-+ self._conn.rollback()
-+ raise
-
- # ── Migration Management (Epic 3.6) ─────────────────────────────────
-
-@@ -1070,6 +1163,7 @@ class SkillStore:
- """Set schema version (facade to MigrationManager).
-
- DEPRECATED: Use ensure_current_schema() instead.
-+ This method is kept for backward compatibility with existing tests.
- """
- return self._migrations._set_schema_version(version)
-
-diff --git a/tests/test_p3_integration.py b/tests/test_p3_integration.py
-new file mode 100644
-index 0000000..4638c07
---- /dev/null
-+++ b/tests/test_p3_integration.py
-@@ -0,0 +1,904 @@
-+"""Phase 3 Integration Tests — Epic 3.7
-+
-+Comprehensive tests exercising full workflows through the SkillStore facade,
-+verifying all extracted modules work together:
-+- SkillStore (facade) → store.py
-+- MigrationManager → migration_manager.py
-+- SkillRepository → skill_repository.py
-+- LineageTracker → lineage_tracker.py
-+- AnalysisStore → analysis_store.py
-+- TagSearch → tag_search.py
-+
-+Tests full end-to-end workflows, not individual module boundaries.
-+"""
-+
-+import asyncio
-+import concurrent.futures
-+import gc
-+import tempfile
-+import threading
-+import time
-+from datetime import datetime
-+from pathlib import Path
-+from typing import Dict, List
-+
-+import pytest
-+
-+from openspace.skill_engine.store import SkillStore
-+from openspace.skill_engine.types import (
-+ EvolutionSuggestion,
-+ EvolutionType,
-+ ExecutionAnalysis,
-+ SkillCategory,
-+ SkillJudgment,
-+ SkillLineage,
-+ SkillOrigin,
-+ SkillRecord,
-+ SkillVisibility,
-+)
-+
-+
-+@pytest.fixture
-+def temp_db():
-+ """Temporary database for each test."""
-+ with tempfile.TemporaryDirectory() as tmpdir:
-+ yield Path(tmpdir) / "test.db"
-+
-+
-+@pytest.fixture
-+def store(temp_db):
-+ """Clean SkillStore instance for each test."""
-+ store = SkillStore(temp_db)
-+ try:
-+ yield store
-+ finally:
-+ try:
-+ store.close()
-+ except Exception:
-+ pass
-+ # Force garbage collection to clean up any lingering connections
-+ gc.collect()
-+ # Small delay to let Windows finish closing file handles
-+ time.sleep(0.1)
-+
-+
-+@pytest.fixture
-+def sample_record():
-+ """Sample SkillRecord for testing."""
-+ return SkillRecord(
-+ skill_id="test_skill_v1",
-+ name="test_skill",
-+ description="Test skill for integration testing",
-+ category=SkillCategory.TOOL_GUIDE,
-+ visibility=SkillVisibility.PUBLIC,
-+ path="/test/skill.py",
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.IMPORTED,
-+ generation=0,
-+ content_snapshot={"skill.py": "def test_skill():\n pass"},
-+ ),
-+ tool_dependencies=[],
-+ critical_tools=[],
-+ total_selections=0,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+
-+
-+@pytest.fixture
-+def sample_analysis(sample_record):
-+ """Sample ExecutionAnalysis for testing."""
-+ return ExecutionAnalysis(
-+ task_id="task_123",
-+ timestamp=datetime.now(),
-+ task_completed=True,
-+ execution_note="Test execution completed successfully",
-+ skill_judgments=[
-+ SkillJudgment(
-+ skill_id="test_skill_v1",
-+ skill_applied=True,
-+ note="Skill executed successfully",
-+ )
-+ ],
-+ analyzed_at=datetime.now(),
-+ )
-+
-+
-+class TestSkillLifecycleWorkflow:
-+ """Test complete skill lifecycle: create → save → get → update → delete."""
-+
-+ @pytest.mark.asyncio
-+ async def test_basic_crud_operations(self, store, sample_record):
-+ """Test basic create, read, update, delete through facade."""
-+ # CREATE: Save record
-+ await store.save_record(sample_record)
-+
-+ # READ: Load record back
-+ loaded = store.load_record(sample_record.skill_id)
-+ assert loaded is not None
-+ assert loaded.skill_id == sample_record.skill_id
-+ assert loaded.name == sample_record.name
-+ assert loaded.description == sample_record.description
-+
-+ # UPDATE: Modify and save again
-+ sample_record.description = "Updated description"
-+ sample_record.total_selections = 5
-+ await store.save_record(sample_record)
-+
-+ updated = store.load_record(sample_record.skill_id)
-+ assert updated.description == "Updated description"
-+ assert updated.total_selections == 5
-+
-+ # DEACTIVATE: Soft delete
-+ result = await store.deactivate_record(sample_record.skill_id)
-+ assert result is True
-+
-+ # Should not appear in active queries
-+ active_skills = store.load_active()
-+ assert sample_record.skill_id not in active_skills
-+
-+ # But should still exist in full query
-+ all_skills = store.load_all(active_only=False)
-+ assert sample_record.skill_id in all_skills
-+
-+ # REACTIVATE
-+ result = await store.reactivate_record(sample_record.skill_id)
-+ assert result is True
-+
-+ active_skills = store.load_active()
-+ assert sample_record.skill_id in active_skills
-+
-+ # HARD DELETE
-+ result = await store.delete_record(sample_record.skill_id)
-+ assert result is True
-+
-+ deleted = store.load_record(sample_record.skill_id)
-+ assert deleted is None
-+
-+ @pytest.mark.asyncio
-+ async def test_batch_operations(self, store):
-+ """Test batch save and load operations."""
-+ # Create multiple records
-+ records = []
-+ for i in range(5):
-+ record = SkillRecord(
-+ skill_id=f"batch_skill_v{i}",
-+ name=f"batch_skill_{i}",
-+ description=f"Batch skill {i}",
-+ category=SkillCategory.TOOL_GUIDE,
-+ visibility=SkillVisibility.PUBLIC,
-+ path=f"/test/batch_{i}.py",
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.IMPORTED,
-+ generation=0,
-+ content_snapshot={f"batch_{i}.py": f"def batch_skill_{i}():\n pass"},
-+ ),
-+ tool_dependencies=[],
-+ critical_tools=[],
-+ total_selections=i,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+ records.append(record)
-+
-+ # Batch save
-+ await store.save_records(records)
-+
-+ # Verify all were saved
-+ all_skills = store.load_all()
-+ for record in records:
-+ assert record.skill_id in all_skills
-+ loaded = all_skills[record.skill_id]
-+ assert loaded.name == record.name
-+ assert loaded.total_selections == record.total_selections
-+
-+
-+class TestLineageWorkflow:
-+ """Test lineage workflow: create → evolve → record → traverse."""
-+
-+ @pytest.mark.asyncio
-+ async def test_evolution_and_lineage_tracking(self, store, sample_record):
-+ """Test skill evolution with lineage tracking."""
-+ # Save initial skill
-+ await store.save_record(sample_record)
-+
-+ # Evolve the skill
-+ evolved_record = SkillRecord(
-+ skill_id="evolved_test_skill_v1",
-+ name="evolved_test_skill",
-+ description="Evolved test skill with improvements",
-+ category=sample_record.category,
-+ visibility=sample_record.visibility,
-+ path=sample_record.path,
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.DERIVED,
-+ generation=1,
-+ parent_skill_ids=[sample_record.skill_id],
-+ change_summary="Added return value",
-+ content_snapshot={"skill.py": "def evolved_test_skill():\n return 'improved'"},
-+ ),
-+ tool_dependencies=[],
-+ critical_tools=[],
-+ total_selections=0,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+
-+ await store.evolve_skill(evolved_record, [sample_record.skill_id])
-+ evolved_skill_id = evolved_record.skill_id
-+
-+ assert evolved_skill_id != sample_record.skill_id
-+
-+ # Load evolved skill
-+ evolved = store.load_record(evolved_skill_id)
-+ assert evolved is not None
-+ assert evolved.name == "evolved_test_skill"
-+ assert evolved.lineage.generation == 1
-+ assert evolved.lineage.origin == SkillOrigin.DERIVED
-+
-+ # Check lineage relationships
-+ children = store.find_children(sample_record.skill_id)
-+ assert evolved_skill_id in children
-+
-+ # Get ancestry
-+ ancestry = store.get_ancestry(evolved_skill_id)
-+ assert len(ancestry) == 1 # just parent
-+ assert ancestry[0].skill_id == sample_record.skill_id
-+
-+ # Get lineage tree
-+ tree = store.get_lineage_tree(sample_record.skill_id)
-+ assert tree["skill_id"] == sample_record.skill_id
-+ assert len(tree["children"]) == 1
-+ assert tree["children"][0]["skill_id"] == evolved_skill_id
-+
-+ @pytest.mark.asyncio
-+ async def test_multi_generation_lineage(self, store, sample_record):
-+ """Test multiple generations of evolution."""
-+ # Generation 0: Original
-+ await store.save_record(sample_record)
-+
-+ # Generation 1: First evolution
-+ gen1_record = SkillRecord(
-+ skill_id="gen1_skill_v1",
-+ name="gen1_skill",
-+ description="First generation evolution",
-+ category=sample_record.category,
-+ visibility=sample_record.visibility,
-+ path=sample_record.path,
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.DERIVED,
-+ generation=1,
-+ parent_skill_ids=[sample_record.skill_id],
-+ change_summary="First evolution",
-+ content_snapshot={"skill.py": "def gen1():\n return 1"},
-+ ),
-+ tool_dependencies=[],
-+ critical_tools=[],
-+ total_selections=0,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+
-+ await store.evolve_skill(gen1_record, [sample_record.skill_id])
-+ gen1_id = gen1_record.skill_id
-+
-+ # Generation 2: Second evolution
-+ gen2_record = SkillRecord(
-+ skill_id="gen2_skill_v1",
-+ name="gen2_skill",
-+ description="Second generation evolution",
-+ category=sample_record.category,
-+ visibility=sample_record.visibility,
-+ path=sample_record.path,
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.DERIVED,
-+ generation=2,
-+ parent_skill_ids=[gen1_id],
-+ change_summary="Second evolution",
-+ content_snapshot={"skill.py": "def gen2():\n return 2"},
-+ ),
-+ tool_dependencies=[],
-+ critical_tools=[],
-+ total_selections=0,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+
-+ await store.evolve_skill(gen2_record, [gen1_id])
-+ gen2_id = gen2_record.skill_id
-+
-+ # Verify generations
-+ gen2 = store.load_record(gen2_id)
-+ assert gen2.lineage.generation == 2
-+
-+ # Check full ancestry
-+ ancestry = store.get_ancestry(gen2_id)
-+ assert len(ancestry) == 2 # gen1 + gen0 (ancestors only)
-+ assert ancestry[0].skill_id == sample_record.skill_id # gen0 (oldest)
-+ assert ancestry[1].skill_id == gen1_id # gen1 (newer)
-+
-+
-+class TestAnalysisWorkflow:
-+ """Test analysis workflow: save skill → record analysis → retrieve → evolve."""
-+
-+ @pytest.mark.asyncio
-+ async def test_analysis_recording_and_retrieval(self, store, sample_record, sample_analysis):
-+ """Test recording and retrieving execution analyses."""
-+ # Save skill first
-+ await store.save_record(sample_record)
-+
-+ # Record analysis
-+ await store.record_analysis(sample_analysis)
-+
-+ # Load analysis back
-+ loaded_analysis = store.load_analyses_for_task(sample_analysis.task_id)
-+ assert loaded_analysis is not None
-+ assert loaded_analysis.task_id == sample_analysis.task_id
-+ assert loaded_analysis.execution_note == sample_analysis.execution_note
-+ assert len(loaded_analysis.skill_judgments) == 1
-+
-+ # Load analyses for skill
-+ skill_analyses = store.load_analyses(sample_record.skill_id)
-+ assert len(skill_analyses) == 1
-+ assert skill_analyses[0].task_id == sample_analysis.task_id
-+
-+ # Get all analyses
-+ all_analyses = store.load_all_analyses()
-+ assert len(all_analyses) == 1
-+ assert all_analyses[0].task_id == sample_analysis.task_id
-+
-+ @pytest.mark.asyncio
-+ async def test_evolution_candidates(self, store, sample_record):
-+ """Test getting evolution candidates from analysis data."""
-+ # Save skill
-+ await store.save_record(sample_record)
-+
-+ # Create analysis with failure and evolution suggestion
-+ failed_analysis = ExecutionAnalysis(
-+ task_id="failed_task",
-+ timestamp=datetime.now(),
-+ task_completed=False, # Failed task
-+ execution_note="Skill failed to execute properly",
-+ skill_judgments=[
-+ SkillJudgment(
-+ skill_id=sample_record.skill_id,
-+ skill_applied=False, # Failed to apply
-+ note="Skill failed to execute",
-+ )
-+ ],
-+ evolution_suggestions=[
-+ EvolutionSuggestion(
-+ evolution_type=EvolutionType.FIX,
-+ target_skill_ids=[sample_record.skill_id],
-+ direction="Fix skill execution issues",
-+ )
-+ ],
-+ analyzed_at=datetime.now(),
-+ )
-+
-+ await store.record_analysis(failed_analysis)
-+
-+ # Get evolution candidates (should include skills with failures)
-+ candidates = store.load_evolution_candidates()
-+ assert len(candidates) > 0
-+
-+ # Should include our failed analysis
-+ task_ids = [c.task_id for c in candidates]
-+ assert failed_analysis.task_id in task_ids
-+
-+
-+class TestTagSearchWorkflow:
-+ """Test tag/search workflow: save with tags → search by tags → search by query."""
-+
-+ @pytest.mark.asyncio
-+ async def test_tag_operations(self, store, sample_record):
-+ """Test tag synchronization and retrieval."""
-+ # Save skill
-+ await store.save_record(sample_record)
-+
-+ # Add tags
-+ tags = ["python", "testing", "integration"]
-+ store.sync_tags(sample_record.skill_id, tags)
-+
-+ # Get tags back
-+ loaded_tags = store.get_tags(sample_record.skill_id)
-+ assert set(loaded_tags) == set(tags)
-+
-+ # Get all tags
-+ all_tags = store.get_all_tags()
-+ tag_names = [tag["tag"] for tag in all_tags]
-+ for tag in tags:
-+ assert tag in tag_names
-+
-+ # Find by tags
-+ found_skills = store.find_skills_by_tags(["python"])
-+ assert sample_record.skill_id in found_skills
-+
-+ found_skills = store.find_skills_by_tags(["python", "testing"], match_all=True)
-+ assert sample_record.skill_id in found_skills
-+
-+ found_skills = store.find_skills_by_tags(["nonexistent"])
-+ assert sample_record.skill_id not in found_skills
-+
-+ @pytest.mark.asyncio
-+ async def test_comprehensive_search(self, store):
-+ """Test comprehensive search functionality."""
-+ # Create skills with different attributes
-+ skills_data = [
-+ {
-+ "skill_id": "tool_skill_v1",
-+ "name": "tool_helper",
-+ "description": "Tool utility functions",
-+ "category": SkillCategory.TOOL_GUIDE,
-+ "tags": ["tool", "utility"]
-+ },
-+ {
-+ "skill_id": "workflow_skill_v1",
-+ "name": "workflow_helper",
-+ "description": "Workflow helper functions",
-+ "category": SkillCategory.WORKFLOW,
-+ "tags": ["workflow", "utility"]
-+ },
-+ {
-+ "skill_id": "private_skill_v1",
-+ "name": "private_helper",
-+ "description": "Private internal tool",
-+ "category": SkillCategory.REFERENCE,
-+ "tags": ["internal"],
-+ "visibility": SkillVisibility.PRIVATE
-+ }
-+ ]
-+
-+ for skill_data in skills_data:
-+ record = SkillRecord(
-+ skill_id=skill_data["skill_id"],
-+ name=skill_data["name"],
-+ description=skill_data["description"],
-+ category=skill_data["category"],
-+ visibility=skill_data.get("visibility", SkillVisibility.PUBLIC),
-+ path=f"/test/{skill_data['name']}.py",
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.IMPORTED,
-+ generation=0,
-+ content_snapshot={f"{skill_data['name']}.py": f"def {skill_data['name']}():\n pass"},
-+ ),
-+ tool_dependencies=[],
-+ critical_tools=[],
-+ total_selections=0,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+ await store.save_record(record)
-+ store.sync_tags(skill_data["skill_id"], skill_data["tags"])
-+
-+ # Search by text query
-+ results = store.search_skills("tool")
-+ assert len(results) >= 1
-+ skill_ids = [r["skill_id"] for r in results]
-+ assert "tool_skill_v1" in skill_ids
-+
-+ # Search by category
-+ results = store.search_skills(category=SkillCategory.TOOL_GUIDE)
-+ assert len(results) == 1
-+ assert results[0]["skill_id"] == "tool_skill_v1"
-+
-+ # Search by tags
-+ results = store.search_skills(tags=["utility"])
-+ assert len(results) == 2 # tool and workflow skills
-+
-+ # Search with visibility filter
-+ results = store.search_skills(visibility=SkillVisibility.PUBLIC)
-+ skill_ids = [r["skill_id"] for r in results]
-+ assert "private_skill_v1" not in skill_ids
-+
-+ # Comprehensive search with multiple filters
-+ results = store.search_skills(
-+ query="utility",
-+ tags=["tool"],
-+ category=SkillCategory.TOOL_GUIDE,
-+ limit=10
-+ )
-+ assert len(results) == 1
-+ assert results[0]["skill_id"] == "tool_skill_v1"
-+
-+
-+class TestMigrationWorkflow:
-+ """Test migration workflow: fresh DB → schema creation → all modules work."""
-+
-+ def test_fresh_database_initialization(self, temp_db):
-+ """Test that a fresh database is properly initialized."""
-+ # Create store which should initialize schema
-+ store = SkillStore(temp_db)
-+
-+ try:
-+ # Verify schema version
-+ version = store.get_schema_version()
-+ assert version == 1
-+
-+ # Verify we can perform basic operations
-+ stats = store.get_stats()
-+ assert stats["total_skills"] == 0
-+ assert stats["total_analyses"] == 0
-+
-+ # Test that all module operations work
-+ all_skills = store.load_all()
-+ assert len(all_skills) == 0
-+
-+ all_tags = store.get_all_tags()
-+ assert len(all_tags) == 0
-+
-+ finally:
-+ store.close()
-+
-+ def test_schema_migration_methods(self, store):
-+ """Test schema migration facade methods."""
-+ # Test version operations
-+ current_version = store.get_schema_version()
-+ assert current_version == 1
-+
-+ # Test ensure current schema (should be idempotent)
-+ store.ensure_current_schema()
-+ assert store.get_schema_version() == 1
-+
-+ # Test migration method exists
-+ store.migrate_to_version(1) # Should be no-op
-+ assert store.get_schema_version() == 1
-+
-+
-+class TestCrossModuleWorkflow:
-+ """Test cross-module workflow: save → tag → analyze → evolve → comprehensive retrieval."""
-+
-+ @pytest.mark.asyncio
-+ async def test_complete_skill_workflow(self, store, sample_record):
-+ """Test a complete workflow touching all modules."""
-+ # 1. REPOSITORY: Save initial skill
-+ await store.save_record(sample_record)
-+
-+ # 2. TAG SEARCH: Add tags
-+ tags = ["python", "testing", "v1"]
-+ store.sync_tags(sample_record.skill_id, tags)
-+
-+ # 3. ANALYSIS STORE: Record successful execution
-+ analysis = ExecutionAnalysis(
-+ task_id="workflow_test_123",
-+ timestamp=datetime.now(),
-+ task_completed=True,
-+ execution_note="Executed successfully in workflow test",
-+ skill_judgments=[
-+ SkillJudgment(
-+ skill_id=sample_record.skill_id,
-+ skill_applied=True,
-+ note="Executed successfully in workflow test",
-+ )
-+ ],
-+ analyzed_at=datetime.now(),
-+ )
-+ await store.record_analysis(analysis)
-+
-+ # 4. LINEAGE TRACKER: Evolve the skill
-+ evolved_record = SkillRecord(
-+ skill_id="evolved_workflow_skill_v1",
-+ name="evolved_workflow_skill",
-+ description="Enhanced workflow skill for testing",
-+ category=sample_record.category,
-+ visibility=sample_record.visibility,
-+ path=sample_record.path,
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.DERIVED,
-+ generation=1,
-+ parent_skill_ids=[sample_record.skill_id],
-+ change_summary="Enhanced for workflow testing",
-+ content_snapshot={"skill.py": "def evolved_workflow_skill():\n return 'enhanced'"},
-+ ),
-+ tool_dependencies=[],
-+ critical_tools=[],
-+ total_selections=0,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+
-+ await store.evolve_skill(evolved_record, [sample_record.skill_id])
-+ evolved_id = evolved_record.skill_id
-+
-+ # 5. COMPREHENSIVE VERIFICATION through facade
-+
-+ # Repository operations
-+ original = store.load_record(sample_record.skill_id)
-+ evolved = store.load_record(evolved_id)
-+ assert original is not None
-+ assert evolved is not None
-+ assert evolved.lineage.generation == 1
-+
-+ # Tag operations
-+ original_tags = store.get_tags(sample_record.skill_id)
-+ assert set(original_tags) == set(tags)
-+
-+ # Search operations
-+ found_by_tag = store.find_skills_by_tags(["python"])
-+ assert sample_record.skill_id in found_by_tag
-+
-+ search_results = store.search_skills("testing")
-+ skill_ids = [r["skill_id"] for r in search_results]
-+ assert sample_record.skill_id in skill_ids
-+
-+ # Analysis operations
-+ skill_analyses = store.load_analyses(sample_record.skill_id)
-+ assert len(skill_analyses) == 1
-+ assert skill_analyses[0].task_id == analysis.task_id
-+
-+ # Lineage operations
-+ children = store.find_children(sample_record.skill_id)
-+ assert evolved_id in children
-+
-+ ancestry = store.get_ancestry(evolved_id)
-+ assert len(ancestry) == 1 # just the parent
-+
-+ # Summary stats should reflect all data
-+ stats = store.get_stats()
-+ assert stats["total_skills"] == 2 # original + evolved
-+ assert stats["total_analyses"] == 1
-+
-+
-+class TestConcurrentAccess:
-+ """Test concurrent access to SkillStore."""
-+
-+ @pytest.mark.asyncio
-+ async def test_concurrent_reads_and_writes(self, store, sample_record):
-+ """Test that concurrent operations don't interfere."""
-+ # Save initial skill
-+ await store.save_record(sample_record)
-+
-+ async def reader_task(worker_id: int, iterations: int):
-+ """Read operations in concurrent task."""
-+ results = []
-+ for i in range(iterations):
-+ # Mix of read operations
-+ record = store.load_record(sample_record.skill_id)
-+ all_skills = store.load_all()
-+ stats = store.get_stats()
-+ results.append({
-+ 'worker': worker_id,
-+ 'iteration': i,
-+ 'record_found': record is not None,
-+ 'total_skills': len(all_skills),
-+ 'stats_count': stats.get('total_skills', 0)
-+ })
-+ return results
-+
-+ async def writer_task(worker_id: int, iterations: int):
-+ """Write operations in concurrent task."""
-+ for i in range(iterations):
-+ # Create unique records
-+ record = SkillRecord(
-+ skill_id=f"concurrent_skill_w{worker_id}_i{i}",
-+ name=f"concurrent_skill_{worker_id}_{i}",
-+ description=f"Concurrent test skill worker {worker_id} iteration {i}",
-+ category=SkillCategory.TOOL_GUIDE,
-+ visibility=SkillVisibility.PUBLIC,
-+ path=f"/test/concurrent_{worker_id}_{i}.py",
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.IMPORTED,
-+ generation=0,
-+ content_snapshot={f"concurrent_{worker_id}_{i}.py": f"def concurrent_skill_{worker_id}_{i}():\n pass"},
-+ ),
-+ tool_dependencies=[],
-+ critical_tools=[],
-+ total_selections=0,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+ await store.save_record(record)
-+
-+ # Run concurrent readers and writers
-+ num_readers = 3
-+ num_writers = 2
-+ iterations = 5
-+
-+ tasks = []
-+
-+ # Start reader tasks
-+ for i in range(num_readers):
-+ tasks.append(reader_task(i, iterations))
-+
-+ # Start writer tasks
-+ for i in range(num_writers):
-+ tasks.append(writer_task(i, iterations))
-+
-+ # Wait for all to complete
-+ results = await asyncio.gather(*tasks, return_exceptions=True)
-+
-+ # Verify no exceptions occurred
-+ for result in results:
-+ if isinstance(result, Exception):
-+ pytest.fail(f"Concurrent task failed: {result}")
-+
-+ # Verify final state
-+ final_count = store.count()
-+ expected_min = 1 + (num_writers * iterations) # initial + written records
-+ assert final_count >= expected_min
-+
-+ def test_concurrent_threading(self, store, sample_record):
-+ """Test concurrent access from multiple threads using threading."""
-+ # Save initial skill synchronously
-+ asyncio.run(store.save_record(sample_record))
-+
-+ results = {}
-+ errors = []
-+
-+ def reader_thread(thread_id: int):
-+ """Thread function for read operations."""
-+ try:
-+ for i in range(10):
-+ record = store.load_record(sample_record.skill_id)
-+ stats = store.get_stats()
-+ results[f"reader_{thread_id}_{i}"] = {
-+ 'found': record is not None,
-+ 'stats': stats.get('total_skills', 0)
-+ }
-+ time.sleep(0.001) # Small delay
-+ except Exception as e:
-+ errors.append(f"Reader {thread_id}: {e}")
-+
-+ def writer_thread(thread_id: int):
-+ """Thread function for write operations."""
-+ try:
-+ async def write_operations():
-+ for i in range(5):
-+ record = SkillRecord(
-+ skill_id=f"thread_skill_{thread_id}_{i}",
-+ name=f"thread_skill_{thread_id}_{i}",
-+ description=f"Thread test skill {thread_id}-{i}",
-+ category=SkillCategory.TOOL_GUIDE,
-+ visibility=SkillVisibility.PUBLIC,
-+ path=f"/test/thread_{thread_id}_{i}.py",
-+ lineage=SkillLineage(
-+ origin=SkillOrigin.IMPORTED,
-+ generation=0,
-+ content_snapshot={f"thread_{thread_id}_{i}.py": f"def thread_skill_{thread_id}_{i}():\n pass"},
-+ ),
-+ tool_dependencies=[],
-+ critical_tools=[],
-+ total_selections=0,
-+ total_applied=0,
-+ total_completions=0,
-+ total_fallbacks=0,
-+ recent_analyses=[],
-+ first_seen=datetime.now(),
-+ last_updated=datetime.now(),
-+ )
-+ await store.save_record(record)
-+
-+ # Run async operations in thread
-+ asyncio.run(write_operations())
-+
-+ except Exception as e:
-+ errors.append(f"Writer {thread_id}: {e}")
-+
-+ # Start threads
-+ threads = []
-+
-+ # Start 2 reader threads
-+ for i in range(2):
-+ thread = threading.Thread(target=reader_thread, args=(i,))
-+ threads.append(thread)
-+ thread.start()
-+
-+ # Start 1 writer thread
-+ writer = threading.Thread(target=writer_thread, args=(0,))
-+ threads.append(writer)
-+ writer.start()
-+
-+ # Wait for completion
-+ for thread in threads:
-+ thread.join(timeout=10.0)
-+ if thread.is_alive():
-+ pytest.fail("Thread did not complete within timeout")
-+
-+ # Check for errors
-+ if errors:
-+ pytest.fail(f"Concurrent threading errors: {errors}")
-+
-+ # Verify some operations succeeded
-+ assert len(results) > 0
-+
-+ # Verify final database state
-+ final_count = store.count()
-+ assert final_count >= 1 # At least the initial record
-+
-+
-+class TestEdgeCases:
-+ """Test edge cases and error conditions in integrated workflows."""
-+
-+ @pytest.mark.asyncio
-+ async def test_nonexistent_skill_operations(self, store):
-+ """Test operations on nonexistent skills."""
-+ nonexistent_id = "nonexistent_skill_v1"
-+
-+ # Load nonexistent
-+ record = store.load_record(nonexistent_id)
-+ assert record is None
-+
-+ # Deactivate nonexistent
-+ result = await store.deactivate_record(nonexistent_id)
-+ assert result is False
-+
-+ # Delete nonexistent
-+ result = await store.delete_record(nonexistent_id)
-+ assert result is False
-+
-+ # Get tags for nonexistent
-+ tags = store.get_tags(nonexistent_id)
-+ assert tags == []
-+
-+ # Find children of nonexistent
-+ children = store.find_children(nonexistent_id)
-+ assert children == []
-+
-+ @pytest.mark.asyncio
-+ async def test_empty_database_operations(self, store):
-+ """Test operations on empty database."""
-+ # Load operations on empty DB
-+ all_skills = store.load_all()
-+ assert len(all_skills) == 0
-+
-+ active_skills = store.load_active()
-+ assert len(active_skills) == 0
-+
-+ # Search operations on empty DB
-+ results = store.search_skills("anything")
-+ assert len(results) == 0
-+
-+ found = store.find_skills_by_tags(["any", "tags"])
-+ assert len(found) == 0
-+
-+ # Analysis operations on empty DB
-+ analyses = store.load_all_analyses()
-+ assert len(analyses) == 0
-+
-+ candidates = store.load_evolution_candidates()
-+ assert len(candidates) == 0
-+
-+ # Stats on empty DB
-+ stats = store.get_stats()
-+ assert stats["total_skills"] == 0
-+ assert stats["total_analyses"] == 0
-+
-+ count = store.count()
-+ assert count == 0
-+
-+
-+if __name__ == "__main__":
-+ # Run tests if executed directly
-+ pytest.main([__file__, "-v"])
-\ No newline at end of file
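The concurrent-access tests above reduce to one orchestration pattern: fan out reader and writer coroutines with `asyncio.gather(..., return_exceptions=True)`, then surface any captured exception explicitly. A minimal self-contained sketch of that shape, using a plain dict as a stand-in for `SkillStore` (the store, worker counts, and key names here are illustrative, not from the codebase):

```python
import asyncio

# Stand-in "store": one pre-existing record, mirroring the saved sample_record.
store = {"initial": True}

async def reader(worker_id: int, iterations: int):
    seen = []
    for _ in range(iterations):
        seen.append(len(store))  # read-only snapshot of current size
        await asyncio.sleep(0)   # yield so writers can interleave
    return seen

async def writer(worker_id: int, iterations: int):
    for i in range(iterations):
        store[f"w{worker_id}_i{i}"] = True  # unique key per write
        await asyncio.sleep(0)

async def main():
    # 3 readers + 2 writers, 5 iterations each, matching the test's shape.
    tasks = [reader(i, 5) for i in range(3)] + [writer(i, 5) for i in range(2)]
    results = await asyncio.gather(*tasks, return_exceptions=True)
    for r in results:
        # gather swallowed exceptions; re-raise so failures are visible,
        # as the test does with pytest.fail.
        if isinstance(r, Exception):
            raise r
    return len(store)

final_count = asyncio.run(main())
print(final_count)  # 1 initial + 2 writers * 5 writes = 11
```

The `return_exceptions=True` flag is the key design choice: without it, the first failing task would cancel the others and hide whether the surviving operations corrupted state.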
diff --git a/pyproject.toml b/pyproject.toml
index 79e7512..95d270a 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
[project]
name = "openspace"
-version = "0.1.0"
+version = "2.0.0"
description = "OpenSpace"
readme = "README.md"
requires-python = ">=3.12"
@@ -14,7 +14,7 @@ authors = [
]
dependencies = [
- "litellm>=1.70.0,<1.82.7", # pinned to avoid PYSEC-2026-2 supply-chain compromise (1.82.7/1.82.8 were malicious)
+ "litellm>=1.83.0,<2", # >=1.83.0 fixes CVE-2026-35029 + CVE-2026-35030; skips PYSEC-2026-2 (1.82.7/1.82.8 malicious)
"python-dotenv>=1.0.0,<2",
"openai>=1.0.0,<3",
"jsonschema>=4.25.0,<5",
@@ -65,8 +65,8 @@ all = [
]
[project.urls]
-Repository = "https://github.com/HKUDS/OpenSpace"
-"Bug Tracker" = "https://github.com/HKUDS/OpenSpace/issues"
+Repository = "https://github.com/Deepfreezechill/OpenSpace"
+"Bug Tracker" = "https://github.com/Deepfreezechill/OpenSpace/issues"
[project.scripts]
openspace = "openspace.__main__:run_main"
@@ -77,7 +77,7 @@ openspace-upload-skill = "openspace.cloud.cli.upload_skill:main"
openspace-dashboard = "openspace.dashboard_server:main"
[tool.ruff]
-target-version = "py313"
+target-version = "py312"
line-length = 120
[tool.ruff.lint]
diff --git a/requirements.txt b/requirements.txt
index 83b43d7..19049fe 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -6,7 +6,7 @@
# It re-exports the core deps from pyproject.toml.
# ──────────────────────────────────────────────────────
-litellm>=1.70.0,<1.82.7
+litellm>=1.83.0,<2
python-dotenv>=1.0.0,<2
openai>=1.0.0,<3
jsonschema>=4.25.0,<5
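The pin change above (old `<1.82.7` ceiling replaced by a `>=1.83.0` floor) is guarded by `tests/test_dependency_security.py`. A stdlib-only sketch of that invariant, with the spec string copied from `requirements.txt` and a hypothetical `clauses` helper for illustration:

```python
SPEC = "litellm>=1.83.0,<2"  # copied from requirements.txt

def clauses(spec: str, package: str) -> list[str]:
    # Split a PEP 508-style requirement into its comma-separated
    # version clauses, dropping the package name from the first one.
    parts = [c.strip() for c in spec.split(",")]
    parts[0] = parts[0][len(package):]
    return parts

pin = clauses(SPEC, "litellm")
assert ">=1.83.0" in pin                  # new security floor is present
assert all(c != "<1.82.7" for c in pin)   # old ceiling is gone
print(pin)  # ['>=1.83.0', '<2']
```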
diff --git a/showcase/README.md b/showcase/README.md
index 15e94eb..aee291b 100644
--- a/showcase/README.md
+++ b/showcase/README.md
@@ -4,10 +4,10 @@
### **Your Entire Day on One Live Screen — with an AI Agent That Works for You**
-**Fully generated & evolved by [OpenSpace](https://github.com/HKUDS/OpenSpace) — zero human code**
+**Fully generated & evolved by [OpenSpace](https://github.com/Deepfreezechill/OpenSpace) — zero human code**
-
+
@@ -145,7 +145,7 @@ Open the dashboard and click **⚙ Settings** in the top-right corner. Add your
## 🧬 How Was It Generated? — OpenSpace Skill Evolution
-> **Zero human code was written.** The entire project — every panel, service, style, and API route — was generated and iteratively evolved by [OpenSpace](https://github.com/HKUDS/OpenSpace) with no manual coding involved.
+> **Zero human code was written.** The entire project — every panel, service, style, and API route — was generated and iteratively evolved by [OpenSpace](https://github.com/Deepfreezechill/OpenSpace) with no manual coding involved.
### The Process
@@ -187,6 +187,7 @@ The full evolution history — every skill version, derivation chain, and qualit
---
+
🏗️ Project Structure
@@ -230,5 +231,5 @@ my-daily-monitor/
## 🔗 Related
-- **[OpenSpace](https://github.com/HKUDS/OpenSpace)** — Self-evolving skill worker & community for AI agents, the engine that generated this entire project.
+- **[OpenSpace](https://github.com/Deepfreezechill/OpenSpace)** — Self-evolving skill worker & community for AI agents, the engine that generated this entire project.
- **[WorldMonitor](https://github.com/koala73/worldmonitor)** — Real-time global intelligence dashboard that served as the seed reference for initial skills extraction.
diff --git a/tests/test_auth_provider.py b/tests/test_auth_provider.py
index 584231b..4a0e4f1 100644
--- a/tests/test_auth_provider.py
+++ b/tests/test_auth_provider.py
@@ -610,7 +610,7 @@ def test_expired_entries_gc(self) -> None:
def test_fifo_flooding_does_not_resurrect_revoked_token(self) -> None:
"""Regression: flooding with revocations must NOT evict unexpired entries.
- PoC from /8eyes R1: revoke stolen admin token, then flood with 10K+
+ PoC from security review R1: revoke stolen admin token, then flood with 10K+
other revocations — the victim token MUST stay revoked.
"""
reg = TokenRegistry()
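The regression test above pins down a subtle property: a revocation registry must garbage-collect by expiry, never by insertion order, or an attacker can flood it to evict a still-valid revoked token. A hypothetical sketch of that property (this `RevocationSet` is a stand-in; `TokenRegistry`'s real API may differ):

```python
import time

class RevocationSet:
    def __init__(self) -> None:
        self._revoked: dict[str, float] = {}  # token -> expiry timestamp

    def revoke(self, token: str, ttl: float = 3600.0) -> None:
        now = time.monotonic()
        # GC drops only entries whose revocation window has lapsed;
        # a FIFO or size-capped eviction here would be the vulnerability.
        self._revoked = {t: exp for t, exp in self._revoked.items() if exp > now}
        self._revoked[token] = now + ttl

    def is_revoked(self, token: str) -> bool:
        expiry = self._revoked.get(token)
        return expiry is not None and expiry > time.monotonic()

reg = RevocationSet()
reg.revoke("stolen-admin-token")
for i in range(1_000):  # flood with unrelated revocations
    reg.revoke(f"junk-{i}")
still_revoked = reg.is_revoked("stolen-admin-token")
print(still_revoked)  # True: the flood never evicts an unexpired entry
```

The trade-off is unbounded growth within a TTL window; the safe mitigation is a shorter TTL or rate-limiting revocations, never count-based eviction.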
diff --git a/tests/test_dependency_security.py b/tests/test_dependency_security.py
index 10eba01..2e63de6 100644
--- a/tests/test_dependency_security.py
+++ b/tests/test_dependency_security.py
@@ -66,11 +66,11 @@ def test_windows_deps_have_upper_bounds(self, pyproject):
self._check_deps(deps, "windows")
def test_litellm_has_security_cap(self, pyproject):
- """litellm must keep <1.82.7 cap (PYSEC-2026-2)."""
+ """litellm must pin >=1.83.0 (fixes CVE-2026-35029/35030, skips PYSEC-2026-2)."""
deps = pyproject["project"]["dependencies"]
litellm_specs = [d for d in deps if d.startswith("litellm")]
assert len(litellm_specs) == 1
- assert "<1.82.7" in litellm_specs[0]
+ assert ">=1.83.0" in litellm_specs[0]
# ---------------------------------------------------------------------------
diff --git a/tests/test_e2e_outcomes.py b/tests/test_e2e_outcomes.py
new file mode 100644
index 0000000..ccd3814
--- /dev/null
+++ b/tests/test_e2e_outcomes.py
@@ -0,0 +1,985 @@
+"""End-to-end OUTCOME tests for OpenSpace v2.0.0.
+
+These tests validate that the system DOES WHAT USERS EXPECT, not just
+that individual functions return correct types. Real objects are used
+wherever possible; only LLM and cloud calls are mocked.
+
+Test Matrix (10 outcome categories):
+ 1. Health Check — system reports healthy with correct version
+ 2. Platform Info — server returns real platform metadata
+ 3. Skill Store — full CRUD + search + lineage lifecycle
+ 4. Review Gate — blocks unsafe code, passes safe code
+ 5. Execution Engine — submit task → get structured result (orchestration-level;
+ grounding agent is mocked since it requires a live LLM)
+ 6. MCP Server — tools are registered and discoverable
+ 7. Evolution — skill evolves, lineage preserved, quarantine works
+ 8. Command Execute — server runs commands and returns output
+ 9. Version — consistent version across all entry points
+ 10. Analysis — execution analysis persistence and counters
+
+NOTE on local server auth: The Flask desktop server (local_server/main.py)
+runs on 127.0.0.1 by design — it is a localhost-only desktop automation
+endpoint, not a network-facing service. Auth is enforced on the MCP server
+path (see test_auth_integration.py).
+"""
+
+from __future__ import annotations
+
+import asyncio
+import platform as _platform_mod
+import textwrap
+import uuid
+from datetime import datetime
+from unittest.mock import AsyncMock, MagicMock
+
+import pytest
+
+# Canonical version — single source of truth for all version assertions
+from openspace import __version__ as EXPECTED_VERSION
+
+# ---------------------------------------------------------------------------
+# Helpers
+# ---------------------------------------------------------------------------
+
+
+def _run(coro):
+ """Run an async coroutine from sync test code.
+
+ Handles both standalone and pytest-asyncio environments.
+ NOTE: SkillStore uses asyncio.to_thread internally, which creates its
+ own connections, so cross-thread SQLite access is safe.
+ """
+ try:
+ loop = asyncio.get_running_loop()
+ except RuntimeError:
+ loop = None
+
+ if loop and loop.is_running():
+ import concurrent.futures
+ with concurrent.futures.ThreadPoolExecutor() as pool:
+ return pool.submit(asyncio.run, coro).result()
+ else:
+ return asyncio.run(coro)
+
+
+def _make_skill_record(
+ *,
+ skill_id=None,
+ name="test_skill",
+ description="A test skill",
+ origin="imported",
+ generation=0,
+ parent_skill_ids=None,
+ content_snapshot=None,
+ is_active=True,
+ tags=None,
+):
+ """Build a SkillRecord with sensible defaults."""
+ from openspace.skill_engine.types import (
+ SkillCategory,
+ SkillLineage,
+ SkillOrigin,
+ SkillRecord,
+ )
+
+ if skill_id is None:
+ skill_id = f"{name}__e2e_{uuid.uuid4().hex[:8]}"
+ if parent_skill_ids is None:
+ parent_skill_ids = []
+ if content_snapshot is None:
+ content_snapshot = {
+ "SKILL.md": f"# {name}\n\nA test skill.\n\n## Steps\n\n1. Do something.\n",
+ "helper.py": "# Safe helper\ndef greet():\n return 'hello'\n",
+ }
+
+ return SkillRecord(
+ skill_id=skill_id,
+ name=name,
+ description=description,
+ path=f"/skills/{name}",
+ is_active=is_active,
+ category=SkillCategory.WORKFLOW,
+ tags=tags or ["test"],
+ lineage=SkillLineage(
+ origin=SkillOrigin(origin),
+ generation=generation,
+ parent_skill_ids=parent_skill_ids,
+ content_snapshot=content_snapshot,
+ change_summary="initial import" if not parent_skill_ids else "evolved",
+ ),
+ )
+
+
+# ═══════════════════════════════════════════════════════════════════════════
+# E2E Outcome 1: Health Check — "Can I trust the system is healthy?"
+# ═══════════════════════════════════════════════════════════════════════════
+
+
+class TestHealthCheckOutcome:
+ """User expectation: GET / returns status=ok with version 2.0.0."""
+
+ def test_health_returns_ok_status(self):
+ """OUTCOME: Health endpoint says the system is operational."""
+ from openspace.local_server.main import app
+
+ client = app.test_client()
+ resp = client.get("/")
+ data = resp.get_json()
+
+ assert resp.status_code == 200
+ assert data["status"] == "ok"
+ assert data["service"] == "OpenSpace Desktop Server"
+
+ def test_health_reports_correct_version(self):
+ """OUTCOME: Reported version matches the release we shipped."""
+ from openspace.local_server.main import app
+
+ client = app.test_client()
+ resp = client.get("/")
+ data = resp.get_json()
+
+ assert data["version"] == EXPECTED_VERSION, (
+ f"Health endpoint reports {data['version']}, expected {EXPECTED_VERSION}"
+ )
+
+ def test_health_includes_features(self):
+ """OUTCOME: User can see what capabilities are available."""
+ from openspace.local_server.main import app
+
+ client = app.test_client()
+ resp = client.get("/")
+ data = resp.get_json()
+
+ assert "features" in data, "Health response must include features dict"
+ assert isinstance(data["features"], dict)
+ assert len(data["features"]) > 0, (
+ "Features dict must not be empty — system should report capabilities"
+ )
+
+ def test_health_includes_fresh_timestamp(self):
+ """OUTCOME: User can verify the response is fresh, not cached stale."""
+ from openspace.local_server.main import app
+
+ client = app.test_client()
+ resp = client.get("/")
+ data = resp.get_json()
+
+ assert "timestamp" in data
+ ts = datetime.fromisoformat(data["timestamp"])
+ delta = abs((datetime.now() - ts).total_seconds())
+ assert delta < 5, f"Timestamp is {delta}s old — expected fresh (<5s)"
+
+
+# ═══════════════════════════════════════════════════════════════════════════
+# E2E Outcome 2: Platform Info — "What system am I running on?"
+# ═══════════════════════════════════════════════════════════════════════════
+
+
+class TestPlatformInfoOutcome:
+ """User expectation: GET /platform returns real system metadata."""
+
+ def test_platform_returns_system_info(self):
+ """OUTCOME: User gets real, actionable platform information."""
+ from openspace.local_server.main import app
+
+ client = app.test_client()
+ resp = client.get("/platform")
+ data = resp.get_json()
+
+ assert resp.status_code == 200
+ assert "system" in data
+ assert data["system"] in ("Windows", "Linux", "Darwin"), (
+ f"Unexpected platform: {data['system']}"
+ )
+ assert "release" in data
+ assert "machine" in data
+
+
+# ═══════════════════════════════════════════════════════════════════════════
+# E2E Outcome 3: Skill Store Lifecycle — "Can I manage skills?"
+# ═══════════════════════════════════════════════════════════════════════════
+
+
+class TestSkillStoreLifecycleOutcome:
+ """User expectation: I can create, find, browse, and evolve skills
+ using a real database — not mocks."""
+
+ def test_create_and_retrieve_skill(self, in_memory_store):
+ """OUTCOME: User saves a skill and can retrieve it later."""
+ record = _make_skill_record(name="weather_lookup")
+ _run(in_memory_store.save_record(record))
+
+ loaded = in_memory_store.load_record(record.skill_id)
+ assert loaded is not None, "Saved skill must be retrievable"
+ assert loaded.name == "weather_lookup"
+ assert loaded.description == "A test skill"
+ assert loaded.is_active is True
+
+ def test_search_skills_by_tag(self, in_memory_store):
+ """OUTCOME: User searches for skills by tag and finds matches."""
+ skills = [
+ _make_skill_record(name="api_caller", tags=["api", "http"]),
+ _make_skill_record(name="data_parser", tags=["data", "json"]),
+ _make_skill_record(name="api_tester", tags=["api", "testing"]),
+ ]
+ for s in skills:
+ _run(in_memory_store.save_record(s))
+
+ # find_skills_by_tags returns List[str] (skill IDs)
+ matching_ids = in_memory_store.find_skills_by_tags(["api"])
+ # Load the matching records to check names
+ matching_names = []
+ for sid in matching_ids:
+ rec = in_memory_store.load_record(sid)
+ if rec:
+ matching_names.append(rec.name)
+
+ assert "api_caller" in matching_names, "Tag search must find matching skills"
+ assert "api_tester" in matching_names
+ assert "data_parser" not in matching_names, "Non-matching skills must be excluded"
+
+ def test_evolve_skill_creates_new_version(self, in_memory_store):
+ """OUTCOME: When a skill evolves, a new version exists and the old one
+ is deactivated (for FIXED origin)."""
+ parent = _make_skill_record(name="code_review")
+ _run(in_memory_store.save_record(parent))
+
+ child = _make_skill_record(
+ name="code_review",
+ origin="fixed",
+ generation=1,
+ parent_skill_ids=[parent.skill_id],
+ content_snapshot={
+ "SKILL.md": "# code_review\n\nImproved version.\n\n## Steps\n\n1. Better review.\n",
+ "helper.py": "def review():\n return 'improved'\n",
+ },
+ )
+ _run(in_memory_store.evolve_skill(child, [parent.skill_id]))
+
+ # New version is active
+ loaded_child = in_memory_store.load_record(child.skill_id)
+ assert loaded_child is not None
+ assert loaded_child.is_active is True
+ assert loaded_child.lineage.generation == 1
+
+ # Old version is deactivated
+ loaded_parent = in_memory_store.load_record(parent.skill_id)
+ assert loaded_parent is not None
+ assert loaded_parent.is_active is False, (
+ "FIXED evolution must deactivate the parent version"
+ )
+
+ def test_skill_count_reflects_reality(self, in_memory_store):
+ """OUTCOME: User can count how many skills exist in the system."""
+ for i in range(5):
+ _run(in_memory_store.save_record(
+ _make_skill_record(name=f"skill_{i}")
+ ))
+
+ count = in_memory_store.count()
+ assert count == 5, f"Expected 5 skills, got {count}"
+
+ def test_delete_skill_removes_it(self, in_memory_store):
+ """OUTCOME: User deletes a skill and it's gone."""
+ record = _make_skill_record(name="obsolete_skill")
+ _run(in_memory_store.save_record(record))
+ assert in_memory_store.load_record(record.skill_id) is not None
+
+ _run(in_memory_store.delete_record(record.skill_id))
+ assert in_memory_store.load_record(record.skill_id) is None, (
+ "Deleted skill must not be retrievable"
+ )
+
+ def test_lineage_ancestry_chain(self, in_memory_store):
+ """OUTCOME: User can trace back through a skill's full evolution history."""
+ # Create a 3-generation lineage: v0 → v1 → v2
+ v0 = _make_skill_record(name="auto_deploy", tags=["deploy"])
+ _run(in_memory_store.save_record(v0))
+
+ v1 = _make_skill_record(
+ name="auto_deploy",
+ origin="fixed",
+ generation=1,
+ parent_skill_ids=[v0.skill_id],
+ tags=["deploy"],
+ )
+ _run(in_memory_store.evolve_skill(v1, [v0.skill_id]))
+
+ v2 = _make_skill_record(
+ name="auto_deploy",
+ origin="fixed",
+ generation=2,
+ parent_skill_ids=[v1.skill_id],
+ tags=["deploy"],
+ )
+ _run(in_memory_store.evolve_skill(v2, [v1.skill_id]))
+
+ # Only v2 should be active
+ all_records = in_memory_store.load_all() # Dict[str, SkillRecord]
+ active = [r for r in all_records.values() if r.is_active and r.name == "auto_deploy"]
+ assert len(active) == 1, "Only the latest version should be active"
+ assert active[0].skill_id == v2.skill_id
+
+ # Full ancestry exists
+ ancestors = in_memory_store.get_ancestry(v2.skill_id)
+ ancestor_ids = [a.skill_id for a in ancestors]
+ assert v1.skill_id in ancestor_ids, "v1 must be in v2's ancestry"
+ assert v0.skill_id in ancestor_ids, "v0 must be in v2's ancestry"
+
+
+# ═══════════════════════════════════════════════════════════════════════════
+# E2E Outcome 4: Review Gate — "Does the system block unsafe code?"
+# ═══════════════════════════════════════════════════════════════════════════
+
+
+class TestReviewGateSecurityOutcome:
+ """User expectation: Unsafe skills are caught and blocked before
+ they can harm the system. Safe skills pass through."""
+
+ def test_safe_skill_passes_review(self):
+ """OUTCOME: A well-formed, safe skill is approved by the gate."""
+ from openspace.skill_engine.review_gate import ReviewGate
+
+ record = _make_skill_record(
+ name="safe_skill",
+ content_snapshot={
+ "SKILL.md": "# Safe Skill\n\nDoes safe things.\n",
+ "main.py": "def run():\n return 'safe'\n",
+ },
+ )
+
+ gate = ReviewGate()
+ result = gate.review(record)
+
+ assert result.passed, (
+ f"Safe skill should pass review, but got: "
+ f"{[(c.name, c.verdict, c.detail) for c in result.checks]}"
+ )
+
+ def test_dangerous_code_blocked(self):
+ """OUTCOME: A skill with dangerous AST patterns is REJECTED."""
+ from openspace.skill_engine.review_gate import ReviewGate
+
+ record = _make_skill_record(
+ name="evil_skill",
+ content_snapshot={
+ "SKILL.md": "# Evil Skill\n\nDoes bad things.\n",
+ "exploit.py": textwrap.dedent("""\
+ import subprocess
+ subprocess.call(['id'])
+ """), # nosec B603 — test data for AST scanner validation
+ },
+ )
+
+ gate = ReviewGate()
+ result = gate.review(record)
+
+ assert not result.passed, "Dangerous code MUST be blocked by ReviewGate"
+ ast_check = next((c for c in result.checks if c.name == "ast-safety"), None)
+ assert ast_check is not None
+ assert ast_check.verdict == "fail"
+
+ def test_path_traversal_blocked(self):
+ """OUTCOME: A skill trying to escape its directory is REJECTED."""
+ from openspace.skill_engine.review_gate import ReviewGate
+
+ record = _make_skill_record(
+ name="traversal_skill",
+ content_snapshot={
+ "SKILL.md": "# Traversal\n\nNotes.\n",
+ "../../etc/passwd": "root:x:0:0:root:/root:/bin/bash",
+ },
+ )
+
+ gate = ReviewGate()
+ result = gate.review(record)
+
+ assert not result.passed, "Path traversal MUST be blocked"
+
+ def test_disallowed_file_types_blocked(self):
+ """OUTCOME: A skill containing forbidden file types is REJECTED."""
+ from openspace.skill_engine.review_gate import ReviewGate
+
+ record = _make_skill_record(
+ name="template_skill",
+ content_snapshot={
+ "SKILL.md": "# Template\n\nUses Jinja.\n",
+ "template.jinja": "{{ user_input | safe }}", # SSTI risk
+ },
+ )
+
+ gate = ReviewGate()
+ result = gate.review(record)
+
+ assert not result.passed, "Jinja template files MUST be blocked (SSTI risk)"
+
+ def test_missing_skill_md_blocked(self):
+ """OUTCOME: A skill without SKILL.md is incomplete and REJECTED."""
+ from openspace.skill_engine.review_gate import ReviewGate
+
+ record = _make_skill_record(
+ name="incomplete_skill",
+ content_snapshot={
+ "main.py": "def run():\n pass\n",
+ },
+ )
+
+ gate = ReviewGate()
+ result = gate.review(record)
+
+ assert not result.passed, "Skill without SKILL.md MUST fail content check"
+ content_check = next((c for c in result.checks if c.name == "content"), None)
+ assert content_check is not None
+ assert content_check.verdict == "fail"
+
+ def test_windows_reserved_names_blocked(self):
+ """OUTCOME: Skills with Windows reserved device names are REJECTED."""
+ from openspace.skill_engine.review_gate import ReviewGate
+
+ record = _make_skill_record(
+ name="windows_exploit",
+ content_snapshot={
+ "SKILL.md": "# WinExploit\n\nBad.\n",
+ "CON.py": "print('this hangs on Windows')",
+ },
+ )
+
+ gate = ReviewGate()
+ result = gate.review(record)
+
+ assert not result.passed, "Windows reserved names MUST be blocked"
+
+
+# ═══════════════════════════════════════════════════════════════════════════
+# E2E Outcome 5: Execution Engine — "Can I submit a task and get results?"
+# ═══════════════════════════════════════════════════════════════════════════
+
+
+class TestExecutionEngineOutcome:
+ """User expectation: Submit a task → get a structured result with
+ status, response, execution_time."""
+
+ @pytest.fixture
+ def fake_grounding_agent(self):
+ """A fake grounding agent that returns canned responses."""
+ agent = AsyncMock()
+ agent.process = AsyncMock(return_value={
+ "status": "completed",
+ "response": "The weather in Seattle is 65°F and cloudy.",
+ })
+ agent.clear_skill_context = MagicMock()
+ agent._active_skill_ids = set()
+ agent._last_tools = []
+ return agent
+
+ @pytest.fixture
+ def engine(self, fake_grounding_agent, tmp_path):
+ """Real ExecutionEngine with a fake grounding agent."""
+ from openspace.execution_engine import ExecutionEngine
+
+ config = MagicMock()
+ config.grounding_max_iterations = 10
+ config.workspace = str(tmp_path)
+
+ return ExecutionEngine(
+ config=config,
+ grounding_agent=fake_grounding_agent,
+ grounding_client=None,
+ )
+
+ def test_task_returns_structured_result(self, engine, fake_grounding_agent):
+ """OUTCOME: User submits a task and gets a response with expected fields.
+
+ NOTE: The grounding agent is mocked — this validates the ExecutionEngine
+ orchestration shell (busy-wait, task ID, recording, workspace), not the
+ LLM reasoning path. LLM-level E2E requires a live model.
+ """
+ result = _run(engine.execute("What is the weather in Seattle?"))
+
+ assert isinstance(result, dict), "Result must be a dict"
+ assert "status" in result, "Result must include status"
+ assert result["status"] == "completed"
+ assert "response" in result
+
+ def test_tasks_get_unique_ids_and_count_increments(self, engine, fake_grounding_agent):
+ """OUTCOME: Each task gets a unique identifier and the system tracks count.
+
+ Verifies engine-generated IDs (not caller-supplied) are unique by
+ inspecting the execution_context passed to the grounding agent.
+ """
+ assert engine.execution_count == 0
+
+ # Let engine auto-generate task IDs (don't supply them)
+ _run(engine.execute("Task 1"))
+ assert engine.execution_count == 1
+ ctx1 = fake_grounding_agent.process.call_args_list[0][0][0]
+
+ _run(engine.execute("Task 2"))
+ assert engine.execution_count == 2
+ ctx2 = fake_grounding_agent.process.call_args_list[1][0][0]
+
+ # Verify the engine assigned unique, non-empty task IDs
+ id1 = ctx1.get("task_id", "")
+ id2 = ctx2.get("task_id", "")
+ assert id1, "Engine must assign a task_id to execution context"
+ assert id2, "Engine must assign a task_id to execution context"
+ assert id1 != id2, f"Engine-generated task IDs must be unique: {id1!r} == {id2!r}"
+
+ def test_no_grounding_agent_raises(self, tmp_path):
+ """OUTCOME: System clearly tells user it's not ready if not initialized."""
+ from openspace.execution_engine import ExecutionEngine
+
+ config = MagicMock()
+ config.grounding_max_iterations = 10
+
+ engine = ExecutionEngine(
+ config=config,
+ grounding_agent=None,
+ grounding_client=None,
+ )
+
+ with pytest.raises(RuntimeError, match="not initialized"):
+ _run(engine.execute("This should fail"))
+
+
+# ═══════════════════════════════════════════════════════════════════════════
+# E2E Outcome 6: MCP Server — "Does the agent integration actually work?"
+# ═══════════════════════════════════════════════════════════════════════════
+
+
+class TestMCPServerOutcome:
+    """User expectation: I can connect my agent to OpenSpace via MCP and
+    discover the expected tools."""
+
+ def test_mcp_app_creates_successfully(self):
+ """OUTCOME: MCP server can be instantiated without errors."""
+ from openspace.mcp.server import create_mcp_app
+
+ app = create_mcp_app()
+ assert app is not None
+ assert app.name == "OpenSpace"
+
+ def test_mcp_has_expected_tools(self):
+ """OUTCOME: All documented tools are registered and discoverable."""
+ from openspace.mcp.server import create_mcp_app
+
+ app = create_mcp_app()
+
+ # The tools that users expect based on SDK docs
+ expected_tools = {
+ "execute_task",
+ "search_skills",
+ "fix_skill",
+ "upload_skill",
+ "health_check",
+ "get_metrics",
+ "get_execution_traces",
+ "check_slos",
+ }
+
+ # Use FastMCP's public list_tools() API
+ # NOTE: If FastMCP changes this API, we WANT the test to break — it means
+ # the tool discovery contract changed.
+ assert hasattr(app, "list_tools"), (
+ "FastMCP must expose list_tools() — tool discovery contract broken"
+ )
+ tools = _run(app.list_tools())
+ registered = {t.name for t in tools}
+
+ missing = expected_tools - registered
+ assert not missing, (
+ f"Missing MCP tools: {missing}. Registered: {registered}"
+ )
+
+ def test_multiple_mcp_apps_are_independent(self):
+ """OUTCOME: Creating multiple instances doesn't cause cross-contamination."""
+ from openspace.mcp.server import create_mcp_app
+
+ app1 = create_mcp_app()
+ app2 = create_mcp_app()
+
+ assert app1 is not app2, "Each call should produce an independent instance"
+
+ # Verify both have the same tools registered (independent copies)
+ tools1 = {t.name for t in _run(app1.list_tools())}
+ tools2 = {t.name for t in _run(app2.list_tools())}
+ assert tools1 == tools2, "Independent apps should have identical tool sets"
+
+
+# ═══════════════════════════════════════════════════════════════════════════
+# E2E Outcome 7: Evolution Pipeline — "Can broken skills be fixed?"
+# ═══════════════════════════════════════════════════════════════════════════
+
+
+class TestEvolutionPipelineOutcome:
+ """User expectation: A degraded skill can be evolved into a better
+ version, with full lineage tracking and security review."""
+
+ def test_evolution_with_review_gate_pass(self, in_memory_store):
+ """OUTCOME: A skill evolves, passes review, and the new version
+ is stored with correct lineage."""
+ from openspace.skill_engine.review_gate import ReviewGate
+
+ # Save the parent skill
+ parent = _make_skill_record(name="data_fetcher", tags=["data"])
+ _run(in_memory_store.save_record(parent))
+
+ # Simulate evolution — create an improved child
+ child = _make_skill_record(
+ name="data_fetcher",
+ origin="fixed",
+ generation=1,
+ parent_skill_ids=[parent.skill_id],
+ content_snapshot={
+ "SKILL.md": "# data_fetcher\n\nFetches data with retry.\n\n## Steps\n\n1. Fetch with retry.\n",
+ "helper.py": "import time\ndef fetch_with_retry():\n return 'data'\n",
+ },
+ tags=["data"],
+ )
+
+ # Review gate check
+ gate = ReviewGate()
+ result = gate.review(child)
+ assert result.passed, (
+ f"Evolved skill should pass review: "
+ f"{[(c.name, c.verdict, c.detail) for c in result.checks]}"
+ )
+
+ # Persist the evolution
+ _run(in_memory_store.evolve_skill(child, [parent.skill_id]))
+
+ # Verify the outcome
+ loaded_child = in_memory_store.load_record(child.skill_id)
+ assert loaded_child.is_active is True
+ assert loaded_child.lineage.origin.value == "fixed"
+ assert loaded_child.lineage.generation == 1
+ assert loaded_child.lineage.parent_skill_ids == [parent.skill_id]
+
+ loaded_parent = in_memory_store.load_record(parent.skill_id)
+ assert loaded_parent.is_active is False
+
+ def test_evolution_with_review_gate_block(self, in_memory_store):
+ """OUTCOME: If an evolved skill contains dangerous code, it's blocked
+ and the original skill remains active."""
+ from openspace.skill_engine.review_gate import ReviewGate
+
+ parent = _make_skill_record(name="safe_processor")
+ _run(in_memory_store.save_record(parent))
+
+ # Create a "poisoned" evolution
+ poisoned = _make_skill_record(
+ name="safe_processor",
+ origin="fixed",
+ generation=1,
+ parent_skill_ids=[parent.skill_id],
+ content_snapshot={
+ "SKILL.md": "# safe_processor\n\nProcesses data.\n",
+ "exploit.py": "import subprocess\nsubprocess.call(['id'])", # nosec B603 — test data
+ },
+ )
+
+ gate = ReviewGate()
+ result = gate.review(poisoned)
+ assert not result.passed, "Poisoned evolution MUST be blocked"
+
+ # Gate blocked it → do NOT save. Original must still be the only active record.
+ all_records = in_memory_store.load_all()
+ active = [r for r in all_records.values() if r.is_active and r.name == "safe_processor"]
+ assert len(active) == 1, "Only original skill should exist"
+ assert active[0].skill_id == parent.skill_id
+
+ def test_evolution_quarantine_after_save(self, in_memory_store):
+ """OUTCOME: Even if a bad skill was persisted, quarantine deactivates it."""
+ from openspace.skill_engine.review_gate import ReviewResult, quarantine_skill, CheckResult
+
+ record = _make_skill_record(name="bad_skill")
+ _run(in_memory_store.save_record(record))
+
+ failed_result = ReviewResult(
+ verdict="fail",
+ checks=[CheckResult(name="ast-safety", verdict="fail", detail="dangerous patterns")],
+ )
+
+ quarantined = _run(quarantine_skill(in_memory_store, record.skill_id, failed_result))
+ assert quarantined is True
+
+ loaded = in_memory_store.load_record(record.skill_id)
+ assert loaded.is_active is False, "Quarantined skill must be deactivated"
+
+ def test_derived_skill_multi_parent_evolution(self, in_memory_store):
+ """OUTCOME: A DERIVED skill merges two parents and both remain active."""
+ from openspace.skill_engine.review_gate import ReviewGate
+
+ parent_a = _make_skill_record(name="weather_skill", tags=["weather"])
+ parent_b = _make_skill_record(name="geocoding_skill", tags=["geo"])
+ _run(in_memory_store.save_record(parent_a))
+ _run(in_memory_store.save_record(parent_b))
+
+ derived = _make_skill_record(
+ name="location_forecast",
+ origin="derived",
+ generation=1,
+ parent_skill_ids=[parent_a.skill_id, parent_b.skill_id],
+ content_snapshot={
+ "SKILL.md": "# location_forecast\n\nCombines weather + geo.\n\n## Steps\n\n1. Geocode. 2. Forecast.\n",
+ "main.py": "def forecast(city):\n return f'Sunny in {city}'\n",
+ },
+ tags=["weather", "geo"],
+ )
+
+ gate = ReviewGate()
+ result = gate.review(derived)
+ assert result.passed, f"Derived skill should pass: {[(c.name, c.detail) for c in result.checks]}"
+
+ _run(in_memory_store.evolve_skill(derived, [parent_a.skill_id, parent_b.skill_id]))
+
+ # For DERIVED: parents remain active, child is also active
+ loaded_a = in_memory_store.load_record(parent_a.skill_id)
+ loaded_b = in_memory_store.load_record(parent_b.skill_id)
+ loaded_d = in_memory_store.load_record(derived.skill_id)
+
+ assert loaded_a.is_active is True, "DERIVED parent A must stay active"
+ assert loaded_b.is_active is True, "DERIVED parent B must stay active"
+ assert loaded_d.is_active is True, "Derived child must be active"
+ assert loaded_d.lineage.generation == 1
+
+ def test_lineage_validation_catches_invalid_evolution(self):
+ """OUTCOME: System rejects evolution attempts with broken lineage."""
+ from openspace.skill_engine.types import SkillLineage, SkillOrigin, ValidationError
+
+ # FIXED must have exactly 1 parent
+ bad_lineage = SkillLineage(
+ origin=SkillOrigin.FIXED,
+ generation=1,
+ parent_skill_ids=[], # Missing parent!
+ )
+ with pytest.raises(ValidationError, match="exactly 1 parent"):
+ bad_lineage.validate()
+
+ # DERIVED must have at least 1 parent
+ bad_derived = SkillLineage(
+ origin=SkillOrigin.DERIVED,
+ generation=1,
+ parent_skill_ids=[], # Missing parent!
+ )
+ with pytest.raises(ValidationError, match="at least 1 parent"):
+ bad_derived.validate()
+
+ # IMPORTED must have no parents
+ bad_imported = SkillLineage(
+ origin=SkillOrigin.IMPORTED,
+ generation=0,
+ parent_skill_ids=["some_parent"], # Shouldn't have one!
+ )
+ with pytest.raises(ValidationError, match="no parents"):
+ bad_imported.validate()
+
+
+# ═══════════════════════════════════════════════════════════════════════════
+# E2E Outcome 8: Command Execution — "Can the server run commands?"
+# ═══════════════════════════════════════════════════════════════════════════
+
+
+class TestCommandExecutionOutcome:
+ """User expectation: POST /execute runs a command and returns the output."""
+
+ def test_execute_returns_output(self):
+ """OUTCOME: User sends a command and gets stdout back."""
+ from openspace.local_server.main import app
+
+ client = app.test_client()
+ resp = client.post(
+ "/execute",
+ json={"command": "echo hello", "shell": True, "timeout": 10},
+ )
+ data = resp.get_json()
+
+ assert resp.status_code == 200
+ assert data["status"] == "success"
+ assert "hello" in data["output"]
+
+ def test_execute_captures_returncode(self):
+ """OUTCOME: User can check if the command succeeded or failed."""
+ from openspace.local_server.main import app
+
+ client = app.test_client()
+ resp = client.post(
+ "/execute",
+ json={"command": "echo success", "shell": True, "timeout": 10},
+ )
+ data = resp.get_json()
+
+ assert data["returncode"] == 0, "Successful command must return exit code 0"
+
+ def test_execute_reports_errors(self):
+ """OUTCOME: Failed commands report stderr and non-zero exit code."""
+ from openspace.local_server.main import app
+
+ client = app.test_client()
+ if _platform_mod.system() == "Windows":
+ cmd = "cmd /c exit 1"
+ else:
+ cmd = "false"
+
+ resp = client.post(
+ "/execute",
+ json={"command": cmd, "shell": True, "timeout": 10},
+ )
+ data = resp.get_json()
+
+ assert resp.status_code == 200 # HTTP 200 even for failed commands
+ assert data["returncode"] != 0, "Failed command must have non-zero exit code"
+ assert "error" in data or "stderr" in data, (
+ "Response must include an error/stderr field for failed commands"
+ )
+
+ def test_execute_timeout_enforcement(self):
+ """OUTCOME: A long-running command is killed after the timeout."""
+ from openspace.local_server.main import app
+
+ client = app.test_client()
+ if _platform_mod.system() == "Windows":
+ cmd = "ping -n 30 127.0.0.1"
+ else:
+ cmd = "sleep 30"
+
+ resp = client.post(
+ "/execute",
+ json={"command": cmd, "shell": True, "timeout": 1},
+ )
+ data = resp.get_json()
+
+ # Should return 408 (timeout) or 200 with error status
+ assert resp.status_code in (200, 408), f"Unexpected status: {resp.status_code}"
+ if resp.status_code == 408:
+ assert "timeout" in data.get("message", "").lower()
+ else:
+ assert data.get("status") == "error"
+
+ def test_local_server_binds_localhost_only(self):
+ """OUTCOME: The desktop server defaults to 127.0.0.1 — not exposed to network.
+
+ This is a security-by-design decision: the Flask desktop server runs
+ shell commands and does NOT have auth middleware. It must only bind
+ to localhost.
+ """
+ from openspace.local_server import main as local_main
+
+ # Verify run_server defaults to 127.0.0.1
+ import inspect
+ sig = inspect.signature(local_main.run_server)
+ host_default = sig.parameters["host"].default
+ assert host_default == "127.0.0.1", (
+ f"Local server must default to 127.0.0.1, got {host_default}"
+ )
+
+
+# ═══════════════════════════════════════════════════════════════════════════
+# Cross-cutting: Version Consistency
+# ═══════════════════════════════════════════════════════════════════════════
+
+
+class TestVersionConsistencyOutcome:
+ """User expectation: The version is consistent everywhere — no stale
+ strings leaking through different entry points."""
+
+ def test_package_version(self):
+ """OUTCOME: Python package reports correct version."""
+        assert EXPECTED_VERSION == "2.0.0", (
+            f"openspace.__version__ reports {EXPECTED_VERSION}, expected 2.0.0"
+        )
+
+ def test_mcp_server_version(self):
+ """OUTCOME: MCP server advertises correct version."""
+ from openspace.mcp.server import _VERSION
+ assert _VERSION == EXPECTED_VERSION
+
+ def test_local_server_health_version(self):
+ """OUTCOME: Local HTTP server health endpoint reports correct version."""
+ from openspace.local_server.main import app
+
+ client = app.test_client()
+ resp = client.get("/")
+ data = resp.get_json()
+ assert data["version"] == EXPECTED_VERSION
+
+ def test_pyproject_version(self):
+ """OUTCOME: pyproject.toml (packaging source of truth) matches."""
+        tomllib = pytest.importorskip("tomllib")  # stdlib tomllib requires Python 3.11+
+ from openspace.config.constants import PROJECT_ROOT
+
+ pyproject = PROJECT_ROOT / "pyproject.toml"
+        if not pyproject.exists():
+            pytest.skip("pyproject.toml not present (installed without source tree)")
+        with open(pyproject, "rb") as f:
+            data = tomllib.load(f)
+        pkg_version = data.get("project", {}).get("version", "")
+        assert pkg_version == EXPECTED_VERSION, (
+            f"pyproject.toml version={pkg_version}, expected {EXPECTED_VERSION}"
+        )
+
+
+# ═══════════════════════════════════════════════════════════════════════════
+# Cross-cutting: Execution Analysis Persistence
+# ═══════════════════════════════════════════════════════════════════════════
+
+
+class TestExecutionAnalysisOutcome:
+ """User expectation: After a task runs, I can see what happened —
+ which skills were used, whether they worked, and what to evolve."""
+
+ def test_record_and_retrieve_analysis(self, in_memory_store):
+ """OUTCOME: An execution analysis is persisted and retrievable."""
+ from openspace.skill_engine.types import ExecutionAnalysis, SkillJudgment
+
+ # Save a skill first
+ record = _make_skill_record(name="analyzer_test")
+ _run(in_memory_store.save_record(record))
+
+ # Record an analysis
+ analysis = ExecutionAnalysis(
+ task_id="task_e2e_001",
+ timestamp=datetime.now(),
+ task_completed=True,
+ execution_note="Task completed successfully using analyzer_test skill",
+ skill_judgments=[
+ SkillJudgment(
+ skill_id=record.skill_id,
+ skill_applied=True,
+ note="Skill was applied correctly",
+ )
+ ],
+ )
+ _run(in_memory_store.record_analysis(analysis))
+
+ # Verify counters were updated (exactly 1 analysis recorded)
+ loaded = in_memory_store.load_record(record.skill_id)
+ assert loaded.total_selections == 1, "Selection counter must be exactly 1"
+ assert loaded.total_applied == 1, "Applied counter must be exactly 1"
+ assert loaded.total_completions == 1, "Completion counter must be exactly 1"
+
+ def test_analysis_with_failed_skill(self, in_memory_store):
+ """OUTCOME: When a skill fails, the system records the failure for learning."""
+ from openspace.skill_engine.types import ExecutionAnalysis, SkillJudgment
+
+ record = _make_skill_record(name="flaky_skill")
+ _run(in_memory_store.save_record(record))
+
+ analysis = ExecutionAnalysis(
+ task_id="task_e2e_002",
+ timestamp=datetime.now(),
+ task_completed=False,
+ execution_note="Task failed — skill did not apply correctly",
+ skill_judgments=[
+ SkillJudgment(
+ skill_id=record.skill_id,
+ skill_applied=False,
+ note="Skill instructions were outdated",
+ )
+ ],
+ )
+ _run(in_memory_store.record_analysis(analysis))
+
+ loaded = in_memory_store.load_record(record.skill_id)
+ assert loaded.total_selections == 1
+ assert loaded.total_applied == 0, "Skill was NOT applied"
+ assert loaded.total_fallbacks == 1, "Fallback counter must be exactly 1"
diff --git a/tests/test_execution_engine.py b/tests/test_execution_engine.py
index 4417812..2bcd099 100644
--- a/tests/test_execution_engine.py
+++ b/tests/test_execution_engine.py
@@ -71,7 +71,11 @@ def mock_recording_manager():
     rm.recording_status = False
     rm.trajectory_dir = "/tmp/traj"
     rm.task_id = ""
- rm.start = AsyncMock()
+
+ async def _start():
+ rm.recording_status = True
+
+ rm.start = AsyncMock(side_effect=_start)
     rm.stop = AsyncMock()
     rm.add_metadata = AsyncMock()
     rm.save_execution_outcome = AsyncMock()
diff --git a/tests/test_net_proxy.py b/tests/test_net_proxy.py
index d029465..19768f6 100644
--- a/tests/test_net_proxy.py
+++ b/tests/test_net_proxy.py
@@ -528,7 +528,7 @@ def test_t1_schema_rejects_nonempty_allowed_domains(self) -> None:
 class TestSecurityRegressionsR2:
- """Regression tests for R2 /8eyes findings (apex rebinding + loopback SSRF)."""
+ """Regression tests for R2 security review findings (apex rebinding + loopback SSRF)."""
     # --- Apex DNS rebinding ---
diff --git a/tests/test_process_broker.py b/tests/test_process_broker.py
index 3c4436e..4d6ba62 100644
--- a/tests/test_process_broker.py
+++ b/tests/test_process_broker.py
@@ -648,7 +648,7 @@ def test_enforcement_order_deny_before_allow(self) -> None:
 # ═══════════════════════════════════════════════════════════════════════
-# R1 Review Fixes — /8eyes + /collab findings
+# R1 Review Fixes — security audit findings
 # ═══════════════════════════════════════════════════════════════════════
@@ -828,7 +828,7 @@ def test_rksh_blocked(self) -> None:
 # ═══════════════════════════════════════════════════════════════════════
-# R2 Review Fixes — /8eyes + Sonnet findings
+# R2 Review Fixes — adversarial review findings
 # ═══════════════════════════════════════════════════════════════════════
@@ -934,7 +934,7 @@ def test_env_S_rm_blocked(self, broker: ProcessBroker) -> None:
 # ═══════════════════════════════════════════════════════════════════════
-# R3 Review Fixes — /collab findings
+# R3 Review Fixes — peer review findings
 # ═══════════════════════════════════════════════════════════════════════
diff --git a/tests/test_sdk_contract.py b/tests/test_sdk_contract.py
index 84a988c..a48bad5 100644
--- a/tests/test_sdk_contract.py
+++ b/tests/test_sdk_contract.py
@@ -351,7 +351,7 @@ class TestHealthContract:
     def test_health_fields(self) -> None:
         health = HealthStatus(
             status="healthy",
- version="0.1.0",
+ version="2.0.0",
             initialized=True,
             backends=["shell", "mcp"],
         )
diff --git a/tests/test_traceback_safety.py b/tests/test_traceback_safety.py
index ffdf126..f5d7ec6 100644
--- a/tests/test_traceback_safety.py
+++ b/tests/test_traceback_safety.py
@@ -111,7 +111,7 @@ def test_long_message_truncated(self):
         result = sanitize_error(exc)
         assert len(result) <= 300
- # ── Regression tests from /collab + /8eyes review ────────────
+ # ── Regression tests from adversarial code review ────────────
     def test_windows_path_with_spaces_stripped(self):
         from openspace.errors import sanitize_error