Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
9 changes: 9 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,15 @@

All notable changes to GBrain will be documented in this file.

## [0.10.1] - 2026-04-19

### Fixed

- **Action Brain accurately resolves who "you" means in your messages.** Owner context (`owner_name`, `owner_aliases`) is now correctly threaded all the way through the quality gate, so the extractor evaluates accuracy against real owner identity — not a stripped-down version of the prompt.
- **`action_ingest` accepts `owner_name` and `owner_aliases` over MCP.** You can now pass owner identity directly when calling `action_ingest`, letting the extractor know whose obligations to track in every batch.
- **Injection-hardened owner context.** `owner_name` and `owner_aliases` are sanitized before reaching the LLM prompt — control characters, XML-significant chars stripped, length capped. Mis-formed owner strings can no longer inject instructions into the extraction context.
- **Live validation harness ships with a sanitized gold set.** Real WhatsApp message content is no longer committed to the repository. Built-in gold set uses synthetic fixtures; real-message testing remains opt-in via `ACTION_BRAIN_LIVE_GOLDSET_PATH`.

## [0.10.0] - 2026-04-16

### Added
Expand Down
5 changes: 3 additions & 2 deletions CLAUDE.md
Original file line number Diff line number Diff line change
Expand Up @@ -64,7 +64,7 @@ markdown files (tool-agnostic, work with both CLI and plugin contexts).
- `src/action-brain/types.ts` — Action Brain shared types (ActionItem, CommitmentBatch, ExtractionResult)
- `src/action-brain/action-schema.ts` — PGLite DDL + idempotent schema init for action_items / action_history tables
- `src/action-brain/action-engine.ts` — Storage layer: CRUD, priority scoring (urgency × confidence × recency), PGLite lifecycle
- `src/action-brain/extractor.ts` — LLM commitment extraction (two-tier Haiku→Sonnet), XML delimiter defense, stable source IDs
- `src/action-brain/extractor.ts` — LLM commitment extraction (two-tier Haiku→Sonnet), XML delimiter defense, stable source IDs, owner context sanitization (control-char strip, length caps)
- `src/action-brain/brief.ts` — Morning priority brief generator: ranked action items, overdue detection, deduplication
- `src/action-brain/operations.ts` — 5 Action Brain operations (action_list, action_brief, action_resolve, action_mark_fp, action_ingest)

Expand Down Expand Up @@ -106,7 +106,8 @@ parity), `test/cli.test.ts` (CLI structure), `test/config.test.ts` (config redac
`test/action-brain/action-engine.test.ts` (CRUD, scoring, PGLite lifecycle),
`test/action-brain/extractor.test.ts` (extraction, source ID stability, injection defense, timestamp bounds),
`test/action-brain/brief.test.ts` (brief generation, scoring, dedup, overdue detection),
`test/action-brain/operations.test.ts` (all 5 ops, ingest trust boundary, batch fallbacks).
`test/action-brain/operations.test.ts` (all 5 ops, ingest trust boundary, batch fallbacks),
`test/action-brain/e2e-live-validation.test.ts` (matchCommitment, normalizeActionText, isTypeCompatible matching helpers).

E2E tests (`test/e2e/`): Run against real Postgres+pgvector. Require `DATABASE_URL`.
- `bun run test:e2e` runs Tier 1 (mechanical, all operations, no API keys)
Expand Down
2 changes: 1 addition & 1 deletion VERSION
Original file line number Diff line number Diff line change
@@ -1 +1 @@
0.10.0
0.10.1
25 changes: 21 additions & 4 deletions src/action-brain/extractor.ts
Original file line number Diff line number Diff line change
Expand Up @@ -198,6 +198,8 @@ async function evaluateQualityGate(
client: options.client,
model,
timeoutMs: options.timeoutMs,
ownerName: options.ownerName,
ownerAliases: options.ownerAliases,
});
const comparison = compareCommitments(testCase.expected, predicted);

Expand Down Expand Up @@ -233,17 +235,32 @@ function getClient(): AnthropicLike {
return anthropicClient;
}

const MAX_OWNER_NAME_LEN = 100;
const MAX_ALIAS_LEN = 50;
const MAX_ALIAS_COUNT = 10;

function sanitizeOwnerString(value: string, maxLen = MAX_OWNER_NAME_LEN): string {
// Strip newlines, tabs, null bytes, control chars (U+0000–U+001F), and XML-significant chars.
return value.replace(/[\x00-\x1F<>]/g, ' ').trim().slice(0, maxLen);
}

function buildExtractionRequest(
model: string,
messages: WhatsAppMessage[],
ownerName: string | null,
ownerAliases: string[]
): AnthropicCreateParams {
const ownerContext = ownerName
const safeName = ownerName ? sanitizeOwnerString(ownerName) : null;
const safeAliases = ownerAliases
.slice(0, MAX_ALIAS_COUNT)
.map((a) => sanitizeOwnerString(a, MAX_ALIAS_LEN))
.filter(Boolean);

const ownerContext = safeName
? [
`You are extracting commitments for the owner: ${ownerName}.`,
ownerAliases.length > 0
? `The owner may also appear as: ${ownerAliases.join(', ')}.`
`You are extracting commitments for the owner: ${safeName}.`,
safeAliases.length > 0
? `The owner may also appear as: ${safeAliases.join(', ')}.`
: '',
'When the owner sends a message (from_me), the "who" field should be their full name.',
'When someone addresses the owner as "you" or "customer" or "tenant", resolve "who" to the owner\'s name.',
Expand Down
16 changes: 16 additions & 0 deletions src/action-brain/operations.ts
Original file line number Diff line number Diff line change
Expand Up @@ -150,6 +150,13 @@ export const actionBrainOperations: Operation[] = [
commitments: { type: 'array', description: 'Optional pre-extracted commitments (bypass LLM)', items: { type: 'object' } },
model: { type: 'string', description: 'Anthropic model override' },
timeout_ms: { type: 'number', description: 'Extractor timeout in milliseconds' },
owner_name: { type: 'string', description: 'Owner name used to resolve "you"/"from_me" references', maxLength: 100 },
owner_aliases: {
type: 'array',
description: 'Optional owner aliases used by extractor context (max 10)',
items: { type: 'string', maxLength: 100 },
maxItems: 10,
},
actor: { type: 'string', description: 'Actor writing created events' },
},
mutating: true,
Expand All @@ -171,6 +178,8 @@ export const actionBrainOperations: Operation[] = [
: await extractCommitments(messages, {
model: asOptionalNonEmptyString(p.model) ?? undefined,
timeoutMs: asOptionalNumber(p.timeout_ms) ?? undefined,
ownerName: asOptionalNonEmptyString(p.owner_name) ?? undefined,
ownerAliases: parseStringArrayParam(p.owner_aliases) ?? undefined,
});

if (ctx.dryRun) {
Expand Down Expand Up @@ -358,6 +367,13 @@ function parseJsonArrayInput(value: unknown): unknown[] {
return [];
}

function parseStringArrayParam(value: unknown): string[] | undefined {
const values = parseJsonArrayInput(value)
.map((entry) => asOptionalNonEmptyString(entry))
.filter((entry): entry is string => entry !== null);
return values.length > 0 ? values : undefined;
}

function resolveSourceMessage(messages: WhatsAppMessage[], commitment: StructuredCommitment): WhatsAppMessage | null {
if (messages.length === 0) {
return null;
Expand Down
44 changes: 44 additions & 0 deletions test/action-brain/e2e-live-validation.test.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,44 @@
import { describe, expect, test } from 'bun:test';
import { matchCommitment, normalizeActionText, isTypeCompatible } from './e2e-live-validation.ts';
import type { StructuredCommitment } from '../../src/action-brain/extractor.ts';

function extracted(overrides: Partial<StructuredCommitment>): StructuredCommitment {
return {
who: overrides.who ?? null,
owes_what: overrides.owes_what ?? '',
to_whom: overrides.to_whom ?? null,
by_when: overrides.by_when ?? null,
confidence: overrides.confidence ?? 0.9,
type: overrides.type ?? 'commitment',
source_message_id: overrides.source_message_id ?? null,
};
}

describe('e2e live-validation matching', () => {
test('#1 strict match requires action text, not just actor + type', () => {
const result = matchCommitment(
extracted({ who: 'Jordan', owes_what: 'send vessel docs', type: 'commitment' }),
{ who: 'Jordan', action: 'send invoice bundle', type: 'waiting_on' }
);

expect(result).toBe(false);
});

test('#2 action normalization treats authorise and authorize as equivalent', () => {
expect(normalizeActionText('Please authorised trial payment.')).toBe('please authorized trial payment');

const result = matchCommitment(
extracted({ who: 'Abhinav Bansal', owes_what: 'Please authorize trial payment now', type: 'commitment' }),
{ who: 'Owner', action: 'authorise trial payment', type: 'owed_by_me' }
);

expect(result).toBe(true);
});

test('#3 type compatibility allows owed_by_me/waiting_on mapping only for actionable types', () => {
expect(isTypeCompatible('owed_by_me', 'commitment')).toBe(true);
expect(isTypeCompatible('waiting_on', 'follow_up')).toBe(true);
expect(isTypeCompatible('waiting_on', 'question')).toBe(false);
expect(isTypeCompatible('owed_by_me', 'decision')).toBe(false);
});
});
Loading