87 changes: 87 additions & 0 deletions .opencode/skills/backlog-complete-merge/SKILL.md
Original file line number Diff line number Diff line change
@@ -91,13 +91,96 @@ Proceed to Phase 2.

### Run unit tests

```bash
bun test
```

**Pass conditions**: All unit tests exit 0.

If `bun` is not available, try:

```bash
npm install && npm test
```

Or with Docker:

```bash
docker compose build --no-cache && docker compose up -d
docker compose exec opencode-dev npm run test:unit
```

**Pass conditions**: All unit tests exit 0.

### Run TypeScript verification (CRITICAL)

**Goal**: Catch TypeScript errors BEFORE pushing to CI.

This prevents CI failures due to:
- Missing type exports (TS2305: Module has no exported member)
- Import path errors (TS2459: Module declares locally but is not exported)
- Missing type definitions (TS2304: Cannot find name)

```bash
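# Run TypeScript type check
# (`| head -50` trims long output but also hides tsc's exit status;
# drop it when the command itself must fail on errors)
bun tsc --noEmit 2>&1 | head -50

# Or with npm
npx tsc --noEmit 2>&1 | head -50

# Or check only the modified TypeScript files
# (tsc ignores tsconfig.json when files are passed on the command line)
git diff --name-only --diff-filter=d HEAD -- '*.ts' '*.tsx' | xargs -I {} bun tsc --noEmit {} 2>&1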
```

**Pass conditions**: No TypeScript errors.

**If TypeScript errors found**:
- Check if any type exports were accidentally removed (search for `export type`)
- Verify all imports reference correct modules
- Look for duplicate interface definitions
- Run tests to confirm fix works: `bun test test/unit/<module>.test.ts`

### Common TypeScript Issues and Fixes

#### Issue: Missing type exports (TS2305)

```bash
# Check for removed exports in types.ts
git diff HEAD~1 -- src/types.ts | grep "^-export"
```

**Fix**: Restore missing exports at the top of types.ts:

```typescript
export type RetrievalMode = "hybrid" | "vector";
export type InjectionMode = "fixed" | "budget" | "adaptive";
export type SummarizationMode = "none" | "truncate" | "extract" | "auto";
export type CodeTruncationMode = "smart" | "signature" | "preserve";
export type ContentType = "text" | "code" | "mixed";
export interface ContentDetection {
  hasCode: boolean;
  isPureCode: boolean;
}
```

#### Issue: Wrong import path (TS2459)

```bash
# Check for import errors
git diff HEAD~1 -- test/ | grep "import.*from"
```

**Fix**: Ensure imports reference correct modules:

```typescript
// ❌ Wrong: import from embedder.js
import type { EmbeddingConfig } from "../../src/embedder.js";

// ✅ Correct: import from types.js
import type { EmbeddingConfig } from "../../src/types.js";
import type { Embedder } from "../../src/embedder.js";
```

### Run E2E tests (if applicable)

Check if E2E tests exist for this change:
@@ -364,6 +447,10 @@ git rev-parse --abbrev-ref --symbolic-full-name @{upstream}
openspec verify-change "<change-id>"

# Phase 2 — tests
bun test
# TypeScript type check (CRITICAL - run before push!)
bun tsc --noEmit 2>&1 | head -50
# Docker tests (alternative)
docker compose build --no-cache && docker compose up -d
docker compose exec opencode-dev npm run test:unit
docker compose exec opencode-dev npm run test:e2e
2 changes: 1 addition & 1 deletion docs/backlog.md
@@ -99,7 +99,7 @@
| BL-036 | LanceDB ANN fast-path for large scopes | P2 | planned | TBD | TBD | Add `LANCEDB_OPENCODE_PRO_VECTOR_INDEX_THRESHOLD` (default 1000); automatically build an IVF_PQ vector index when scope entries ≥ the threshold; expose a `searchMode` field in `memory_stats`; emit a warning log when `pruneScope` exceeds `maxEntriesPerScope` [Surface: Plugin] |
| BL-037 | Event table TTL / archival | P1 | planned | TBD | TBD | Add a retention period and archival mechanism for `effectiveness_events` to reduce long-term local store cost [Surface: Plugin] |
| BL-048 | LanceDB index conflict repair and backup safety mechanism | P1 | **done** | bl-048-lancedb-index-recovery | openspec/changes/bl-048-lancedb-index-recovery/ | Fix ensureIndexes() retry logic + optional periodic backup config [Surface: Plugin] v0.6.1 |
| BL-049 | Embedder error tolerance and graceful degradation | P1 | proposed | TBD | TBD | Retry/delay on embedder failure + BM25 fallback during search [Surface: Plugin] |
| BL-049 | Embedder error tolerance and graceful degradation | P1 | **done** | bl-049-embedder-error-tolerance | openspec/changes/archive/2026-04-03-bl-049-embedder-error-tolerance/ | Retry/delay on embedder failure + BM25 fallback during search [Surface: Plugin] |
| BL-050 | Built-in embedding model (transformers.js) | P1 | proposed | TBD | TBD | Add TransformersEmbedder to provide offline embedding capability [Surface: Plugin] |

## Epic 10 — Architecture maintainability and performance hardening
2 changes: 1 addition & 1 deletion docs/roadmap.md
@@ -414,7 +414,7 @@ OpenCode needs to evolve from "a tool with long-term memory" into "one that accumulates team work
14. Scope cache memory governance (Surface: Plugin) → BL-045 ✅ DONE
15. DB row runtime schema validation (Surface: Plugin + Test-infra) → BL-046
16. LanceDB index conflict repair and backup safety mechanism (Surface: Plugin) → BL-048 ✅ DONE v0.6.1
17. Embedder error tolerance and graceful degradation (Surface: Plugin) → BL-049 ⚠️ Research complete, pending implementation
17. Embedder error tolerance and graceful degradation (Surface: Plugin) → BL-049 ✅ DONE
18. Built-in embedding model (transformers.js) (Surface: Plugin) → BL-050 ⚠️ Research complete, pending implementation

### P2
@@ -0,0 +1,2 @@
schema: spec-driven
created: 2026-04-03
@@ -0,0 +1,66 @@
# Design: BL-049 Embedder Error Tolerance and Graceful Degradation

## Context

The current `src/embedder.ts` has basic timeout handling (6s default) but no retry logic. When Ollama/OpenAI is unreachable:
1. `embed()` throws immediately on timeout/network error
2. No attempt to recover automatically
3. Users get opaque errors and have no fallback search capability

The codebase already has BM25 search infrastructure in `store.ts`. The missing piece is the wiring to trigger BM25-only mode when embedder fails.

## Goals / Non-Goals

**Goals:**
- Add retry with exponential backoff for embedder `embed()` calls (configurable: max 3 attempts, 1s initial, 2x backoff)
- Auto-fallback to BM25-only search after embedder retry exhaustion
- Log structured warnings on embedder failures, retries, and fallback triggers
- Expose search mode and embedder health in `memory_stats`

**Non-Goals:**
- Do NOT implement embedder health check daemon (periodic polling)
- Do NOT add automatic embedder recovery (user must restart service)
- Do NOT change vector index creation fallback logic (already exists)
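
As a rough illustration of the retry goal above, the delay before each retry grows geometrically. The sketch below assumes the quoted values (3 attempts, 1s initial delay, 2x multiplier) are defaults; `backoffDelayMs` is a hypothetical helper name, not a function from the codebase.

```typescript
// Hypothetical helper: delay in ms before retry number `retry` (1-based).
// Assumes delay = initialDelayMs * backoffMultiplier^(retry - 1).
function backoffDelayMs(
  retry: number,
  initialDelayMs = 1000,
  backoffMultiplier = 2,
): number {
  return initialDelayMs * Math.pow(backoffMultiplier, retry - 1);
}

// Delays before the first three retries under the assumed defaults:
// 1000 ms, 2000 ms, 4000 ms.
const delays = [1, 2, 3].map((n) => backoffDelayMs(n));
console.log(delays);
```

Under these assumptions a fully exhausted retry sequence adds about 7 seconds of waiting on top of the per-call timeouts, which is the latency cost noted in the Decisions table.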

## Decisions

| Decision | Choice | Why | Trade-off |
|---|---|---|---|
| Runtime surface | hook-driven | Embedder is called from store during search/capture; retry/fallback logic integrates at call site | Extra latency on first embedder failure (backoff delays) |
| Entrypoint | `src/embedder.ts` → `embedWithRetry()` wrapper | Minimal invasion; existing embedders unchanged | Slight complexity in wrapper |
| Data model | Extend `EmbeddingConfig` with retry options + add `EmbedderHealth` type | No new tables; config-only + in-memory metrics | Metrics lost on restart (acceptable) |
| Failure handling | retry → fallback → throw | Matches existing fallback philosophy in codebase | BM25-only search may have lower relevance quality |
| Observability | Console warnings + `memory_stats` fields | Already exposed via existing tool; no new UI needed | Logs only (no structured events) |

### Alternatives Considered

1. **Health check daemon**: Polling embedder periodically to detect issues early. Rejected - adds complexity, not aligned with "react to failure" model.

2. **Circuit breaker**: Open the circuit after N failures and auto-reset after a timeout. Rejected - overkill for single-user plugin; retry/backoff is sufficient.

3. **User-configurable fallback**: Allow users to choose fallback (BM25, transformers.js, none). Deferred to BL-050 (built-in embedding model).

## Operability

- **Trigger path**: User calls `memory_search` or auto-capture triggers → `embedder.embed()` fails → retry with backoff → fallback to BM25 if exhausted
- **Expected visible output**: On embedder failure: `[warn] Embedder failed, retry 1/3 in 1000ms...` → `[warn] Embedder unavailable, falling back to BM25-only search`
- **Misconfiguration/failure behavior**: If user sets `retry.maxAttempts: 0`, no retry; immediate fallback. If BM25 also fails (rare), throw original embedder error.
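
The failure path described above (embedder error, then BM25 fallback, then rethrow of the original error if BM25 also fails) might be wired at the call site roughly as follows. `searchWithFallback`, `SearchDeps`, and its three callbacks are hypothetical names for illustration; the real `store.ts` API is not shown in this change.

```typescript
type Hit = { id: string; score: number };

// Hypothetical dependencies; `embedQuery` is assumed to retry internally.
interface SearchDeps {
  embedQuery: (text: string) => Promise<number[]>;
  vectorSearch: (vector: number[]) => Promise<Hit[]>;
  bm25Search: (text: string) => Promise<Hit[]>;
}

async function searchWithFallback(
  query: string,
  deps: SearchDeps,
): Promise<{ mode: "vector" | "bm25-only"; hits: Hit[] }> {
  let vector: number[] | undefined;
  let embedError: unknown;
  try {
    vector = await deps.embedQuery(query);
  } catch (err) {
    embedError = err;
  }

  if (vector !== undefined) {
    return { mode: "vector", hits: await deps.vectorSearch(vector) };
  }

  console.warn("[warn] Embedder unavailable, falling back to BM25-only search");
  try {
    return { mode: "bm25-only", hits: await deps.bm25Search(query) };
  } catch {
    // BM25 failed as well: surface the original embedder error.
    throw embedError;
  }
}
```

Only the embedding step is guarded, so errors from the vector search itself still propagate unchanged instead of silently triggering the fallback.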

## Migration Plan

1. Add retry config to `EmbeddingConfig` type and `config.ts`
2. Create `embedWithRetry()` wrapper in `embedder.ts`
3. Update `store.ts` to catch embedder errors and trigger fallback
4. Extend `memory_stats` output with search mode and embedder health
5. Add unit tests for retry logic, integration tests for fallback flow

## Risks / Trade-offs

- **[Risk] First-time users confused by BM25-only mode** → Mitigation: Log clear message indicating fallback, include in docs
- **[Risk] Fallback reduces search relevance** → Mitigation: Document that hybrid → BM25-only has lower semantic recall
- **[Risk] Retry adds latency on slow embedder** → Mitigation: Configurable delays, expose via `memory_stats` health

## Open Questions

- Should retry also apply to `dim()` (dimension probe)? Yes, include for consistency.
- Should fallback trigger only on explicit embedder errors or also on high latency? Current: errors only. Latency-based fallback is future work (BL-047 scope).
@@ -0,0 +1,30 @@
# Proposal: BL-049 Embedder Error Tolerance and Graceful Degradation

## Why

When the embedder (Ollama/OpenAI) is unreachable, times out, or returns invalid responses, the entire memory system becomes non-operational. This blocks both auto-capture and search functionality. Users must manually restart services or reconfigure, which creates poor UX. The system already has BM25 fallback for lexical search, but no structured retry/backoff or graceful degradation when embedder fails during vector operations.

## What Changes

- Add configurable retry with exponential backoff for embedder failures (timeout, HTTP errors, network issues)
- Add automatic fallback to BM25-only search when vector embedding fails after retry exhaustion
- Add structured warning logs and metrics for embedder degradation events
- Expose current search mode (vector, hybrid, bm25-only) in `memory_stats`

## Capabilities

### New Capabilities

- **embedder-retry**: Retry with exponential backoff when embedder fails (timeout, network, HTTP errors)
- **bm25-fallback**: Automatic BM25-only search fallback when embedder is unavailable after retry exhaustion
- **embedder-health-metrics**: Metrics and logs for embedder availability, retry counts, fallback events

### Modified Capabilities

- `memory-stats`: Extend to expose `searchMode: "vector" | "hybrid" | "bm25-only"` and embedder health status

## Impact

- **Affected modules**: `src/embedder.ts`, `src/store.ts`, `src/tools/memory.ts`, `src/config.ts`
- **Configuration**: Add `embedding.retry.enabled`, `embedding.retry.maxAttempts`, `embedding.retry.initialDelayMs`, `embedding.retry.backoffMultiplier`
- **Dependencies**: None new (existing fetch + logging infrastructure)
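
A minimal sketch of how the four `embedding.retry.*` keys might be resolved from user configuration. The numeric defaults follow the values quoted in the design document (3 attempts, 1s, 2x); `enabled: true` as a default, the clamping rules, and the `resolveRetryConfig` name are assumptions.

```typescript
interface RetryConfig {
  enabled: boolean;
  maxAttempts: number;
  initialDelayMs: number;
  backoffMultiplier: number;
}

const DEFAULT_RETRY: RetryConfig = {
  enabled: true, // assumed default, not stated in the proposal
  maxAttempts: 3,
  initialDelayMs: 1000,
  backoffMultiplier: 2,
};

// Hypothetical helper: apply user overrides over the defaults and clamp
// values that would make the retry loop misbehave.
function resolveRetryConfig(raw: Partial<RetryConfig> = {}): RetryConfig {
  return {
    enabled: raw.enabled ?? DEFAULT_RETRY.enabled,
    maxAttempts: Math.max(0, Math.floor(raw.maxAttempts ?? DEFAULT_RETRY.maxAttempts)),
    initialDelayMs: Math.max(0, raw.initialDelayMs ?? DEFAULT_RETRY.initialDelayMs),
    backoffMultiplier: Math.max(1, raw.backoffMultiplier ?? DEFAULT_RETRY.backoffMultiplier),
  };
}
```

Keeping `maxAttempts: 0` as a valid value preserves the "no retry, immediate fallback" behavior described in the design.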
@@ -0,0 +1,25 @@
# Spec: bm25-fallback

## ADDED Requirements

### Requirement: Automatic BM25-only search fallback when embedder unavailable

The system SHALL fall back to BM25-only search when embedder is unavailable after retry exhaustion.

Runtime Surface: hook-driven
Entrypoint: src/store.ts -> search() fallback branch

#### Scenario: Fallback to BM25 when embedder fails

- **WHEN** embedder has failed after max retry attempts and memory_search is invoked
- **THEN** system detects embedder unavailable and switches to BM25-only search mode

#### Scenario: Hybrid search normalizes to BM25-only

- **WHEN** config has retrieval.mode: hybrid and embedder is unavailable
- **THEN** effective weights normalize to vectorWeight: 0, bm25Weight: 1.0

#### Scenario: Embedder recovers mid-session

- **WHEN** embedder was unavailable and subsequent embed() call succeeds
- **THEN** system returns to normal vector/hybrid mode
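
The hybrid-normalization and recovery scenarios can be pictured as a small pure function. `effectiveWeights`, `fuseScore`, and the `Weights` shape are hypothetical; only the field names `vectorWeight` and `bm25Weight` come from the spec, and the weighted-sum fusion is an assumption about how hybrid scores are combined.

```typescript
interface Weights {
  vectorWeight: number;
  bm25Weight: number;
}

// When the embedder is unavailable, the vector component is zeroed out;
// once it is available again the configured weights apply unchanged.
function effectiveWeights(configured: Weights, embedderAvailable: boolean): Weights {
  if (!embedderAvailable) {
    return { vectorWeight: 0, bm25Weight: 1.0 };
  }
  return configured;
}

// Assumed fusion rule: a plain weighted sum of the two scores.
function fuseScore(vectorScore: number, bm25Score: number, w: Weights): number {
  return w.vectorWeight * vectorScore + w.bm25Weight * bm25Score;
}
```

Because availability is an input evaluated on every call, a later successful `embed()` restores hybrid scoring without any explicit recovery step.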
@@ -0,0 +1,30 @@
# Spec: embedder-health-metrics

## ADDED Requirements

### Requirement: Embedder health status exposed in memory_stats

The system SHALL expose embedder health status, retry counts, and current search mode via memory_stats.

Runtime Surface: opencode-tool
Entrypoint: src/tools/memory.ts -> memory_stats

#### Scenario: Memory stats shows embedder healthy

- **WHEN** embedder is reachable and last embed succeeded
- **THEN** memory_stats returns embedderHealth.status: "healthy"

#### Scenario: Memory stats shows embedder degraded

- **WHEN** embedder failed but fallback succeeded
- **THEN** memory_stats returns embedderHealth.status: "degraded" and fallbackActive: true

#### Scenario: Memory stats exposes search mode

- **WHEN** system is in any operational state
- **THEN** memory_stats returns searchMode: "vector" | "hybrid" | "bm25-only"

#### Scenario: Memory stats tracks retry count

- **WHEN** embedder had retry attempts
- **THEN** memory_stats returns embedderHealth.retryCount
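
One way to back these scenarios with in-memory state is a small tracker that the retry and fallback code updates and `memory_stats` reads. `EmbedderHealthTracker` and its methods are hypothetical; the field names `status`, `fallbackActive`, `retryCount`, and `searchMode` follow the scenarios above. Treating a fallback as the only trigger for `"degraded"` is a simplification.

```typescript
type SearchMode = "vector" | "hybrid" | "bm25-only";

interface EmbedderHealth {
  status: "healthy" | "degraded";
  fallbackActive: boolean;
  retryCount: number;
}

class EmbedderHealthTracker {
  private readonly configuredMode: "vector" | "hybrid";
  private health: EmbedderHealth = {
    status: "healthy",
    fallbackActive: false,
    retryCount: 0,
  };

  constructor(configuredMode: "vector" | "hybrid") {
    this.configuredMode = configuredMode;
  }

  recordRetry(): void {
    this.health.retryCount += 1;
  }

  recordFallback(): void {
    this.health.status = "degraded";
    this.health.fallbackActive = true;
  }

  recordSuccess(): void {
    this.health.status = "healthy";
    this.health.fallbackActive = false;
  }

  // Shape that a memory_stats handler could merge into its output.
  snapshot(): { searchMode: SearchMode; embedderHealth: EmbedderHealth } {
    return {
      searchMode: this.health.fallbackActive ? "bm25-only" : this.configuredMode,
      embedderHealth: { ...this.health },
    };
  }
}
```

The counters live only in process memory, matching the design's stated trade-off that metrics are lost on restart.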
@@ -0,0 +1,30 @@
# Spec: embedder-retry

## ADDED Requirements

### Requirement: Embedder retry with exponential backoff

The system SHALL retry embedder operations with exponential backoff when embedder fails due to timeout, network errors, or HTTP errors.

Runtime Surface: hook-driven
Entrypoint: src/embedder.ts -> embedWithRetry()

#### Scenario: Embedder timeout triggers retry

- **WHEN** embedder is slow to respond (> timeoutMs) and memory_search calls embed()
- **THEN** first attempt fails with timeout error and system retries after initialDelayMs

#### Scenario: Embedder network error triggers retry

- **WHEN** embedder endpoint is unreachable (connection refused, DNS failure)
- **THEN** first attempt fails with network error and system retries with backoff delay

#### Scenario: Retry exhaustion triggers fallback

- **WHEN** embedder has failed maxAttempts times
- **THEN** system logs warning and signals fallback handler to use BM25-only search

#### Scenario: Retry disabled via config

- **WHEN** config has retry.maxAttempts: 0 and embedder fails
- **THEN** no retry occurs and fallback is triggered immediately
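
The scenarios above could be satisfied by a wrapper along these lines. This is a sketch, not the merged implementation: the signature of `embedWithRetry`, the `RetryOptions` shape, the injectable `sleep`, and the reading of `maxAttempts` as the number of retries after the first call are all assumptions.

```typescript
interface RetryOptions {
  maxAttempts: number; // assumed: retries allowed after the initial call
  initialDelayMs: number;
  backoffMultiplier: number;
  sleep?: (ms: number) => Promise<void>; // injectable so tests need not wait
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

async function embedWithRetry<T>(
  embed: () => Promise<T>,
  opts: RetryOptions,
): Promise<T> {
  const sleep = opts.sleep ?? defaultSleep;
  let delayMs = opts.initialDelayMs;
  for (let retry = 0; ; retry++) {
    try {
      return await embed();
    } catch (err) {
      if (retry >= opts.maxAttempts) {
        // Retries exhausted: rethrow so the caller can fall back to BM25.
        throw err;
      }
      console.warn(
        `[warn] Embedder failed, retry ${retry + 1}/${opts.maxAttempts} in ${delayMs}ms...`,
      );
      await sleep(delayMs);
      delayMs *= opts.backoffMultiplier;
    }
  }
}
```

With `maxAttempts: 0` the first failure is rethrown immediately, which corresponds to the "retry disabled" scenario; the wrapper does not distinguish timeout, network, and HTTP errors, so every rejection is treated as retryable.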
@@ -0,0 +1,25 @@
# Tasks: BL-049 Embedder Error Tolerance

## Implementation Tasks

- [x] Add retry config to `EmbeddingConfig` type (`src/types.ts`)
- [x] Add retry config parsing in `src/config.ts`
- [x] Implement `embedWithRetry()` wrapper in `src/embedder.ts`
- [x] Update `src/store.ts` to catch embedder errors and trigger fallback
- [x] Extend `memory_stats` output with `searchMode` and `embedderHealth` in `src/tools/memory.ts`
- [x] Add unit tests for retry logic in `src/embedder.test.ts`
- [x] Add integration test for fallback flow

## Verification Matrix

| Requirement | Unit | Integration | E2E | Required to release |
|---|---|---|---|---|
| R1: Embedder retry with backoff | ✅ | ✅ | n/a | yes |
| R2: BM25 fallback when retry exhausted | ✅ | ✅ | n/a | yes |
| R3: Embedder health in memory_stats | ✅ | ✅ | n/a | yes |
| R4: Graceful degradation (vector→hybrid→bm25) | ✅ | ✅ | n/a | yes |
| R5: Observability (logs + stats) | ✅ | n/a | n/a | yes |

## Changelog Wording Class

`internal-only` — This is a foundation/internal improvement. Users benefit from improved reliability but the feature is not exposed as a user-facing tool. Changelog should note: "Improved embedder error handling and automatic fallback to BM25 search when embedding service is unavailable."
27 changes: 27 additions & 0 deletions openspec/specs/bm25-fallback/spec.md
@@ -0,0 +1,27 @@
# bm25-fallback Specification

## Purpose
TBD - created by archiving change bl-049-embedder-error-tolerance. Update Purpose after archive.
## Requirements
### Requirement: Automatic BM25-only search fallback when embedder unavailable

The system SHALL fall back to BM25-only search when embedder is unavailable after retry exhaustion.

Runtime Surface: hook-driven
Entrypoint: src/store.ts -> search() fallback branch

#### Scenario: Fallback to BM25 when embedder fails

- **WHEN** embedder has failed after max retry attempts and memory_search is invoked
- **THEN** system detects embedder unavailable and switches to BM25-only search mode

#### Scenario: Hybrid search normalizes to BM25-only

- **WHEN** config has retrieval.mode: hybrid and embedder is unavailable
- **THEN** effective weights normalize to vectorWeight: 0, bm25Weight: 1.0

#### Scenario: Embedder recovers mid-session

- **WHEN** embedder was unavailable and subsequent embed() call succeeds
- **THEN** system returns to normal vector/hybrid mode
