Merged
21 changes: 21 additions & 0 deletions CHANGELOG.md
@@ -6,6 +6,27 @@ Format follows [Keep a Changelog](https://keepachangelog.com/). Versions follow

---

## [0.2.9] - 2026-03-28

### Added

- **Episodic Learning Tools** (Hook Wiring + Tools Exposure):
  - `task_episode_create`: Create task episode records manually
  - `task_episode_query`: Query episodes by scope and state
  - `similar_task_recall`: Find similar past tasks using vector similarity
  - `retry_budget_suggest`: Get retry budget suggestions based on history
  - `recovery_strategy_suggest`: Get recovery strategy suggestions after failures

- **Automatic Similar Task Recall**: Enhanced `session.idle` to inject similar-task context into the system prompt using vector similarity

- **Vector Similarity Upgrade**: `findSimilarTasks()` now supports vector-based similarity search, with a fallback to keyword matching

### Changed

- Extended `EpisodicTaskRecord` to support `taskDescriptionVector` for vector-based similarity

---

## [0.2.8] - 2026-03-28

### Fixed
@@ -0,0 +1,2 @@
schema: spec-driven
created: 2026-03-28
59 changes: 59 additions & 0 deletions openspec/changes/complete-episodic-learning-hooks/design.md
@@ -0,0 +1,59 @@
# Design: Complete Episodic Learning Hook Wiring + Tools Exposure

## Context

Building on the archived episodic learning specs, this change completes implementation by wiring hooks and exposing tools.

## Decisions

### Decision: Hook Architecture

| Event | Action | Store Method |
|-------|--------|--------------|
| `session.start` | Create new task episode | `createTaskEpisode()` |
| `tool.*` | Record command execution | `addCommandToEpisode()` |
| Validation events | Parse and store outcome | `addValidationOutcome()` + `classifyFailure()` |
| `session.end` | Finalize episode state | `updateTaskState()` |
| `session.idle` | Extract patterns + recall | `extractSuccessPatternsFromScope()` + `findSimilarTasks()` |

**Rationale**: Matches existing event pipeline pattern in index.ts
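The event-to-method mapping in the table can be sketched as a small dispatcher. This is a minimal illustration only: the `HookEvent` union, the `Episode` shape, and the `InMemoryEpisodeStore` class are hypothetical stand-ins, the event name `tool.execute` is taken from the hook spec in this change, and the real wiring in `src/index.ts` may look quite different.

```typescript
// Hypothetical event and record shapes, for illustration only.
type HookEvent =
  | { type: "session.start"; taskId: string; scope: string }
  | { type: "tool.execute"; command: string }
  | { type: "session.end"; outcome: "success" | "failed" };

interface Episode {
  taskId: string;
  scope: string;
  state: "pending" | "running" | "success" | "failed" | "timeout";
  commands: string[];
}

// In-memory stand-in for the store layer; method names follow the table above.
class InMemoryEpisodeStore {
  private episodes = new Map<string, Episode>();
  private nextId = 1;

  createTaskEpisode(taskId: string, scope: string): string {
    const id = `ep-${this.nextId++}`;
    this.episodes.set(id, { taskId, scope, state: "pending", commands: [] });
    return id;
  }

  addCommandToEpisode(id: string, command: string): void {
    this.episodes.get(id)?.commands.push(command);
  }

  updateTaskState(id: string, state: Episode["state"]): void {
    const episode = this.episodes.get(id);
    if (episode) episode.state = state;
  }

  get(id: string): Episode | undefined {
    return this.episodes.get(id);
  }
}

// Routes one event to its store method and returns the active episode ID
// (null when no episode is active after the event).
function dispatch(
  store: InMemoryEpisodeStore,
  event: HookEvent,
  activeId: string | null
): string | null {
  if (event.type === "session.start") {
    return store.createTaskEpisode(event.taskId, event.scope);
  }
  if (event.type === "tool.execute") {
    // With no active episode the command is silently skipped.
    if (activeId) store.addCommandToEpisode(activeId, event.command);
    return activeId;
  }
  // Remaining case: session.end finalizes the episode state.
  if (activeId) store.updateTaskState(activeId, event.outcome);
  return null;
}
```

Threading the active episode ID through the dispatcher keeps the sketch free of global state; a real plugin would more likely keep that ID in per-session state.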

### Decision: Tool Surface

| Tool Name | Purpose | Store Method |
|-----------|---------|---------------|
| `task_episode_create` | Manual episode creation | `createTaskEpisode()` |
| `task_episode_query` | Query episodes by scope/state | `queryTaskEpisodes()` |
| `similar_task_recall` | Find similar past tasks | `findSimilarTasks()` |
| `retry_budget_suggest` | Get retry budget suggestion | `suggestRetryBudget()` |
| `recovery_strategy_suggest` | Get recovery strategies | `suggestRecoveryStrategies()` |

**Rationale**: Consistent naming with existing memory_* tools

### Decision: Vector Similarity

- Upgrade `findSimilarTasks()` to use embedder for vector search
- Fall back to keyword matching if embedding unavailable

**Rationale**: Better semantic matching for task similarity
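A rough sketch of this decision, assuming cosine similarity for the vector path and token-set (Jaccard) overlap for the keyword fallback. The `TaskRecord` shape and the function names here are hypothetical; only the `taskDescriptionVector` field name comes from this change, and the defaults of 0.85 and 3 mirror the `similar_task_recall` tool schema.

```typescript
// Hypothetical record shape; only `taskDescriptionVector` is named in this change.
interface TaskRecord {
  taskId: string;
  description: string;
  taskDescriptionVector?: number[];
}

// Cosine similarity; returns 0 for empty, mismatched, or zero-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length === 0 || a.length !== b.length) return 0;
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Keyword fallback: Jaccard overlap of lowercase word tokens.
function keywordSimilarity(a: string, b: string): number {
  const tokens = (s: string) =>
    new Set(s.toLowerCase().split(/\W+/).filter(Boolean));
  const ta = tokens(a);
  const tb = tokens(b);
  if (ta.size === 0 || tb.size === 0) return 0;
  let shared = 0;
  ta.forEach((t) => {
    if (tb.has(t)) shared++;
  });
  return shared / (ta.size + tb.size - shared);
}

// Uses the vector path when both vectors exist, otherwise falls back to keywords.
function findSimilarSketch(
  query: string,
  queryVector: number[] | null,
  records: TaskRecord[],
  threshold = 0.85,
  limit = 3
): Array<{ taskId: string; score: number }> {
  return records
    .map((r) => ({
      taskId: r.taskId,
      score:
        queryVector && r.taskDescriptionVector
          ? cosineSimilarity(queryVector, r.taskDescriptionVector)
          : keywordSimilarity(query, r.description),
    }))
    .filter((m) => m.score >= threshold)
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}
```

Cosine and Jaccard scores are not on the same scale, so a single threshold shared by both paths is a simplification; a real implementation might use a separate threshold for the fallback.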

## Data Flow

```
User Task → session.start → createTaskEpisode()
tool execution → addCommandToEpisode()
validation → addValidationOutcome() + classifyFailure()
session.end → updateTaskState() + addSuccessPatterns()
session.idle → extractSuccessPatternsFromScope()
findSimilarTasks() → inject into system prompt
```

## Risks / Trade-offs

- [Risk] Hook overhead → **Mitigation**: Async execution, error handling with logging
- [Risk] Episode storage growth → **Mitigation**: TTL or manual cleanup (future)
- [Risk] Embedding unavailability → **Mitigation**: Fall back to keyword matching
46 changes: 46 additions & 0 deletions openspec/changes/complete-episodic-learning-hooks/proposal.md
@@ -0,0 +1,46 @@
# Proposal: Complete Episodic Learning Hook Wiring + Tools Exposure

**Change ID**: complete-episodic-learning-hooks
**Date**: 2026-03-28
**Status**: Proposed

## Problem Statement

The episodic learning features (BL-003, BL-014-020) were specified in three OpenSpec changes and partially implemented in v0.2.7-0.2.8:

- `2026-03-28-add-episodic-task-schema` - Schema + CRUD methods ✅
- `2026-03-28-add-task-episode-learning` - Episode capture, validation, pattern extraction ✅ (store layer)
- `2026-03-28-add-retry-recovery-evidence` - Retry/recovery tracking ✅ (store layer)

**However, the implementation is incomplete:**
1. ❌ No event hooks to trigger episode capture on session events
2. ❌ No public tools to expose episodic learning to users
3. ❌ Vector similarity not used for task matching (keyword matching only)
4. ❌ Validation outcome parsing not integrated with hooks

This change completes the implementation by adding Hook Wiring + Tools exposure.

## Goals

1. **Hook Wiring**: Connect existing store methods to OpenCode event hooks
2. **Tools Exposure**: Expose episodic learning capabilities as public tools
3. **Vector Similarity**: Upgrade task matching to use embeddings
4. **Integration**: Wire validation outcome parsing into the flow

## Non-Goals

- ML-based pattern extraction (rule-based only)
- Automatic retry execution (suggestions only)
- Changes to existing store schema or types

## Release Impact

**Type**: Internal API + New Tools
**Changelog Wording**: `user-facing` (new tools exposed)

## References

- `openspec/changes/archive/2026-03-28-add-episodic-task-schema/`
- `openspec/changes/archive/2026-03-28-add-task-episode-learning/`
- `openspec/changes/archive/2026-03-28-add-retry-recovery-evidence/`
- `docs/EPISODIC_LEARNING_INDEX.md`
@@ -0,0 +1,112 @@
## ADDED Requirements

### Requirement: task_episode_create tool
The system SHALL provide a tool to manually create task episode records.

#### Runtime Surface
- Surface: opencode-tool
- Entrypoint: src/index.ts → tool "task_episode_create"

#### Scenario: Create episode manually
- **WHEN** user calls `task_episode_create` with taskId, scope, and initial state
- **THEN** new episode record is created with provided fields
- **AND** episode ID is returned

#### Tool Schema
```typescript
{
  taskId: string,
  scope: string,
  initialState: "pending" | "running"
}
```

---

### Requirement: task_episode_query tool
The system SHALL provide a tool to query task episodes by scope and state.

#### Runtime Surface
- Surface: opencode-tool
- Entrypoint: src/index.ts → tool "task_episode_query"

#### Scenario: Query episodes
- **WHEN** user calls `task_episode_query` with optional scope and state filters
- **THEN** matching episode records are returned
- **AND** results include episode ID, task ID, state, timestamps

#### Tool Schema
```typescript
{
  scope?: string,
  state?: "pending" | "running" | "success" | "failed" | "timeout",
  limit?: number // default: 10
}
```
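The filter semantics of this schema could be sketched as below. The `EpisodeSummary` fields are assumed from the scenario ("episode ID, task ID, state, timestamps"), and `queryEpisodes` is a hypothetical helper, not the store's `queryTaskEpisodes()`.

```typescript
type EpisodeState = "pending" | "running" | "success" | "failed" | "timeout";

// Assumed result shape, based on the fields the scenario lists.
interface EpisodeSummary {
  episodeId: string;
  taskId: string;
  scope: string;
  state: EpisodeState;
  startedAt: number;
  endedAt?: number;
}

interface EpisodeQuery {
  scope?: string;
  state?: EpisodeState;
  limit?: number;
}

// Omitted filters match everything; the limit defaults to 10 as in the schema.
function queryEpisodes(
  all: EpisodeSummary[],
  query: EpisodeQuery = {}
): EpisodeSummary[] {
  const { scope, state, limit = 10 } = query;
  return all
    .filter((e) => scope === undefined || e.scope === scope)
    .filter((e) => state === undefined || e.state === state)
    .slice(0, Math.max(0, limit));
}
```

Calling `queryEpisodes(episodes, { state: "failed", limit: 2 })` would return at most two failed episodes in their stored order.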

---

### Requirement: similar_task_recall tool
The system SHALL provide a tool to find similar past tasks using vector similarity.

#### Runtime Surface
- Surface: opencode-tool
- Entrypoint: src/index.ts → tool "similar_task_recall"

#### Scenario: Find similar tasks
- **WHEN** user calls `similar_task_recall` with query and threshold
- **THEN** similar episodes are retrieved using vector search
- **AND** results include commands, validation outcomes, final state

#### Tool Schema
```typescript
{
  query: string,
  threshold?: number, // default: 0.85
  limit?: number // default: 3
}
```

---

### Requirement: retry_budget_suggest tool
The system SHALL provide a tool to suggest retry budgets based on historical data.

#### Runtime Surface
- Surface: opencode-tool
- Entrypoint: src/index.ts → tool "retry_budget_suggest"

#### Scenario: Get retry budget
- **WHEN** user calls `retry_budget_suggest` with error type
- **THEN** median-based retry budget is suggested
- **AND** stop conditions are provided if all retries failed historically

#### Tool Schema
```typescript
{
  errorType: "syntax" | "runtime" | "logic" | "resource" | "unknown",
  minSamples?: number // default: 3
}
```
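One possible reading of "median-based" is sketched below. It assumes the median is taken over the retry counts of episodes that eventually recovered, and that a stop condition is returned when no historical retry succeeded. The `RetryObservation` and `RetryBudget` shapes and the function name are hypothetical.

```typescript
type ErrorType = "syntax" | "runtime" | "logic" | "resource" | "unknown";

// Hypothetical history entry: how many retries an episode used and whether it recovered.
interface RetryObservation {
  errorType: ErrorType;
  retries: number;
  recovered: boolean;
}

interface RetryBudget {
  suggestedRetries: number | null; // null when there is too little history
  stop: boolean;
  reason: string;
}

function median(values: number[]): number {
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  const upper = sorted[mid] ?? 0;
  const lower = sorted[mid - 1] ?? upper;
  return sorted.length % 2 === 1 ? upper : (lower + upper) / 2;
}

function suggestRetryBudgetSketch(
  history: RetryObservation[],
  errorType: ErrorType,
  minSamples = 3
): RetryBudget {
  const samples = history.filter((o) => o.errorType === errorType);
  if (samples.length < minSamples) {
    return { suggestedRetries: null, stop: false, reason: "insufficient history" };
  }
  const recovered = samples.filter((o) => o.recovered);
  if (recovered.length === 0) {
    // Stop condition: every recorded retry for this error type failed.
    return { suggestedRetries: 0, stop: true, reason: "no retry has succeeded historically" };
  }
  return {
    suggestedRetries: Math.ceil(median(recovered.map((o) => o.retries))),
    stop: false,
    reason: "median retries among recovered episodes",
  };
}
```

Rounding the median up is a design choice in this sketch, so that an even-sized sample never suggests fewer retries than the middle of the observed range.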

---

### Requirement: recovery_strategy_suggest tool
The system SHALL provide a tool to suggest recovery strategies after failures.

#### Runtime Surface
- Surface: opencode-tool
- Entrypoint: src/index.ts → tool "recovery_strategy_suggest"

#### Scenario: Get recovery strategies
- **WHEN** user calls `recovery_strategy_suggest` with failure context
- **THEN** fallback and backoff strategies are suggested
- **AND** confidence scores are provided for each strategy

#### Tool Schema
```typescript
{
  failureType: "syntax" | "runtime" | "logic" | "resource" | "unknown",
  previousAttempts?: number
}
```
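The spec does not say how confidence scores are derived. As one assumption, the sketch below treats confidence as the historical success rate of each strategy for the given failure type, and pairs it with a capped exponential backoff delay driven by `previousAttempts`. The `RecoveryObservation` shape, the strategy names, and both function names are hypothetical.

```typescript
type FailureType = "syntax" | "runtime" | "logic" | "resource" | "unknown";

// Hypothetical history entry: which strategy was tried and whether it worked.
interface RecoveryObservation {
  failureType: FailureType;
  strategy: string;
  succeeded: boolean;
}

interface StrategySuggestion {
  strategy: string;
  confidence: number; // success rate in [0, 1]
}

// Ranks strategies by their historical success rate for one failure type.
function suggestRecoveryStrategiesSketch(
  history: RecoveryObservation[],
  failureType: FailureType
): StrategySuggestion[] {
  const tally = new Map<string, { wins: number; total: number }>();
  for (const o of history) {
    if (o.failureType !== failureType) continue;
    const entry = tally.get(o.strategy) ?? { wins: 0, total: 0 };
    entry.total += 1;
    if (o.succeeded) entry.wins += 1;
    tally.set(o.strategy, entry);
  }
  const suggestions: StrategySuggestion[] = [];
  tally.forEach((entry, strategy) => {
    suggestions.push({ strategy, confidence: entry.wins / entry.total });
  });
  return suggestions.sort((a, b) => b.confidence - a.confidence);
}

// Capped exponential backoff: baseMs * 2^previousAttempts, never above capMs.
function backoffDelayMs(
  previousAttempts: number,
  baseMs = 1000,
  capMs = 30000
): number {
  return Math.min(capMs, baseMs * Math.pow(2, Math.max(0, previousAttempts)));
}
```

A raw success rate over a handful of samples is noisy, so a real implementation might require a minimum sample count before reporting a confidence at all.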
@@ -0,0 +1,75 @@
## ADDED Requirements

### Requirement: Session start triggers task episode creation
The system SHALL create a task episode record when a new task session begins.

#### Runtime Surface
- Surface: hook-driven
- Entrypoint: src/index.ts → hook "session.start" (TBD: confirm that this hook exists)

#### Scenario: Session task starts
- **WHEN** a new task session begins (event type "session.start")
- **THEN** an episode record is created with state "pending" and start timestamp
- **AND** episode ID is stored in session state for subsequent operations

#### Scenario: Embedding unavailable
- **WHEN** `session.start` fires but the embedding service is unavailable
- **THEN** episode creation is skipped and a warning is logged
- **AND** creation is retried on the next session start

---

### Requirement: Tool execution records commands to episode
The system SHALL record command executions within a task episode.

#### Runtime Surface
- Surface: hook-driven
- Entrypoint: src/index.ts → hook "tool.execute"

#### Scenario: Tool executed within task
- **WHEN** a tool "bash" is executed with command "npm run build" within task "task-123"
- **THEN** the command is added to the episode's command list
- **AND** episode is updated with new command timestamp

#### Scenario: No active episode
- **WHEN** tool executes but no active episode exists
- **THEN** command is not recorded
- **AND** no error is thrown (graceful degradation)

---

### Requirement: Session end finalizes task episode
The system SHALL finalize task episode on session end with outcome.

#### Runtime Surface
- Surface: hook-driven
- Entrypoint: src/index.ts → hook "session.end"

#### Scenario: Task completes successfully
- **WHEN** task session ends with outcome "success"
- **THEN** episode record is updated with end timestamp and final state "success"
- **AND** success patterns are extracted and stored

#### Scenario: Task fails
- **WHEN** task session ends with outcome "failed"
- **THEN** episode record is updated with end timestamp and state "failed"
- **AND** failure is classified using classifyFailure()
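The proposal limits pattern extraction to rule-based logic, so failure classification might be a small ordered rule table like the sketch below. The categories are borrowed from the `errorType` enum in the tool schemas; the regular expressions and the name `classifyFailureSketch` are purely illustrative assumptions and are not the store's `classifyFailure()`.

```typescript
type FailureType = "syntax" | "runtime" | "logic" | "resource" | "unknown";

// Illustrative patterns only; rules are checked in order and the first match wins.
const FAILURE_RULES: Array<{ type: FailureType; pattern: RegExp }> = [
  { type: "syntax", pattern: /syntaxerror|unexpected token|parse error/i },
  { type: "resource", pattern: /out of memory|enomem|enospc|timed out|etimedout/i },
  { type: "logic", pattern: /assertion|test failed/i },
  { type: "runtime", pattern: /typeerror|referenceerror|exception/i },
];

// Maps raw command or validation output to a failure category.
function classifyFailureSketch(output: string): FailureType {
  for (const rule of FAILURE_RULES) {
    if (rule.pattern.test(output)) return rule.type;
  }
  return "unknown";
}
```

Rule order matters here: an `AssertionError` is matched by the `logic` rule before any `runtime` pattern is tried.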

---

### Requirement: Similar task recall on session idle
The system SHALL recall similar past tasks before execution context is injected.

#### Runtime Surface
- Surface: hook-driven
- Entrypoint: src/index.ts → hook "experimental.chat.system.transform"

#### Scenario: Similar task found
- **WHEN** session.idle fires and similar tasks exist (similarity >= 0.85)
- **THEN** similar task commands and outcomes are injected into system prompt
- **AND** recall is logged as event

#### Scenario: No similar task
- **WHEN** no similar tasks found
- **THEN** no injection occurs
- **AND** no error (normal behavior)
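The two scenarios above can be sketched as a pure function over the system prompt. It assumes the prompt is available as an array of strings and that recall results carry a similarity score; the `RecalledTask` shape, the section heading, and the function name are hypothetical, and the actual `experimental.chat.system.transform` hook signature is not reproduced here.

```typescript
// Hypothetical recall result; field names are assumptions.
interface RecalledTask {
  taskId: string;
  similarity: number;
  commands: string[];
  finalState: "success" | "failed" | "timeout";
}

// Returns the prompt unchanged when nothing clears the threshold,
// otherwise a new array with one extra section appended.
function injectSimilarTasks(
  system: string[],
  recalled: RecalledTask[],
  threshold = 0.85
): string[] {
  const relevant = recalled.filter((t) => t.similarity >= threshold);
  if (relevant.length === 0) return system;
  const lines = relevant.map(
    (t) =>
      `- ${t.taskId} (${t.finalState}, similarity ${t.similarity.toFixed(2)}): ` +
      t.commands.join(" && ")
  );
  return [...system, ["## Similar past tasks", ...lines].join("\n")];
}
```

Returning a new array instead of mutating the input keeps the no-match path trivially side-effect free, which matches the "no injection occurs" scenario.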