
feat: Add contextStrategy to control conversation growth and prevent token explosion in agent runs#88

Merged
JackChen-me merged 2 commits into JackChen-me:main from ibrahimkzmv:feat.context-strategy
Apr 12, 2026

Conversation

@ibrahimkzmv
Contributor

What

Introduces a contextStrategy option to manage conversation history during multi-turn agent runs. Supports sliding-window truncation, LLM-based summarization, and custom compression to prevent unbounded context growth.
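As a rough sketch of the idea (exact option names and message shapes here are assumptions for illustration, not the merged API), the strategy might look like:

```typescript
// Hypothetical shape of the contextStrategy option described in this PR.
// Names (slidingWindow, maxMessages, compress) are illustrative.
type Message = { role: "user" | "assistant"; content: string };

type ContextStrategy =
  | { type: "slidingWindow"; maxMessages: number }
  | { type: "summarize"; maxMessages: number }
  | { type: "custom"; compress: (history: Message[]) => Message[] };

// Sliding-window truncation: keep the first message (often the task
// framing) plus the most recent turns, dropping the middle.
function applySlidingWindow(history: Message[], maxMessages: number): Message[] {
  if (history.length <= maxMessages) return history;
  return [history[0], ...history.slice(history.length - (maxMessages - 1))];
}
```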

Why

Currently, agent conversations grow without limit, causing:

  • Context window overflows (especially for smaller/local models)
  • Rapidly increasing token costs due to full history being resent each turn
  • Degraded model performance from excessively long context

This change ensures context size is actively managed during execution, making long-running agent workflows production-ready and cost-efficient.

Closes #59

Checklist

  • npm run lint passes
  • npm test passes
  • Added/updated tests for changed behavior
  • No new runtime dependencies (or justified in the PR description)

@JackChen-me
Owner

Three issues found, two are blockers:

1. summarize produces orphaned tool_result_id after the 2nd compaction. At runner.ts:277, splitAt = Math.max(2, Math.floor(rest.length / 2)) can cut through a tool_use/tool_result pair once rest is odd-length (happens after the first summary injects its own user turn). Anthropic hard-rejects the next call with tool_result block missing corresponding tool_use block. Fix: align splitAt to an even offset, e.g. Math.max(2, Math.floor(rest.length / 4) * 2).
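To make the boundary issue concrete, here is a minimal sketch of the suggested fix (the function name is illustrative; the arithmetic matches the review comment):

```typescript
// Even-offset split point so a tool_use/tool_result pair is never cut
// in half. The original Math.max(2, Math.floor(restLength / 2)) can
// return an odd offset once rest is odd-length; rounding down to an
// even offset keeps pairs together.
function evenSplitAt(restLength: number): number {
  return Math.max(2, Math.floor(restLength / 4) * 2);
}
```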

2. Summary adapter.chat() usage isn't added to totalUsage. At runner.ts:311, the summary response's usage is emitted in a trace event but never fed back into the accumulator, so maxTokenBudget is silently bypassed and RunResult.tokenUsage under-reports cost. Fix: return { messages, usage } from summarizeMessages and accumulate at the call site.
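The accumulation side of the fix could look like this (field names are assumptions; the point is that the summary call's usage flows back into the same accumulator that enforces maxTokenBudget):

```typescript
// Illustrative usage accumulator. summarizeMessages would return
// { messages, usage } and the caller folds usage into totalUsage
// before checking the token budget.
type Usage = { inputTokens: number; outputTokens: number };

function addUsage(total: Usage, delta: Usage): Usage {
  return {
    inputTokens: total.inputTokens + delta.inputTokens,
    outputTokens: total.outputTokens + delta.outputTokens,
  };
}
```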

3. Synthetic marker/summary messages use role: 'user', producing sequences like [user, user, assistant, user]. Not fatal on Anthropic direct API (silent merge per anthropic-sdk-typescript#565), but breaks Bedrock and blurs prompt semantics on direct API (summary text concatenates onto the original prompt). Fix: tag as role: 'assistant' or fold into the next real user message.
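One way to implement the "fold into the next real user message" option (purely a sketch; the marker text and helper name are made up):

```typescript
// Instead of emitting a standalone synthetic { role: 'user' } summary
// message, prepend the summary to the next real user turn so the
// user/assistant alternation stays intact.
function foldSummaryIntoNextTurn(summary: string, nextUserContent: string): string {
  return `[Conversation summary]\n${summary}\n\n${nextUserContent}`;
}
```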

Repros on request.

@ibrahimkzmv
Contributor Author

Appreciate the review; you caught a subtle edge case with the odd-length rest.

Fixed a few things:

  • safe boundary split for tool_result alignment
  • usage tracking now includes summarizeMessages cost
  • removed synthetic user rows, moved notes into next turn

Thanks again for the sharp eye @JackChen-me

@JackChen-me JackChen-me merged commit 0fb8a38 into JackChen-me:main Apr 12, 2026
3 checks passed


Development

Successfully merging this pull request may close these issues.

[P1] Context Window Management — sliding window and summarization for long conversations

2 participants