fix(ollama): strip thinking tokens, raise max_tokens, fix panel summary cache #456

Merged

koala73 merged 2 commits into main from fix/ollama-thinking-tokens on Feb 27, 2026
Conversation


koala73 (Owner) commented Feb 27, 2026

Summary

  • Strip <think>...</think> tags from Ollama model responses to prevent chain-of-thought leaking into summaries
  • Raise max_tokens to avoid truncated summaries
  • Fix the panel summary cache to key by panelId, preventing cross-panel cache collisions

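A minimal sketch of the tag stripping, assuming a regex-based approach (the helper name is illustrative, not taken from the PR diff). Unterminated tags matter because a model that hits its token limit mid-thought never emits the closing tag:

```javascript
// Remove well-formed <think>...</think> blocks from a model response.
// If an opening tag is never closed (generation cut off mid-thought),
// drop everything from the tag to the end of the string.
function stripThinking(text) {
  return text
    .replace(/<think>[\s\S]*?<\/think>/g, '') // terminated blocks
    .replace(/<think>[\s\S]*$/, '')           // unterminated tail
    .trim();
}
```

Note the `[\s\S]` instead of `.`: reasoning blocks routinely span multiple lines, and `.` does not match newlines without the `s` flag.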
Test plan

  • node --test tests/summarize-reasoning.test.mjs — passes
  • Ollama summaries show clean output without reasoning tokens

vercel bot commented Feb 27, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project               Deployment  Actions           Updated (UTC)
worldmonitor          Ready       Preview, Comment  Feb 27, 2026 10:41am
worldmonitor-finance  Ready       Preview, Comment  Feb 27, 2026 10:41am
worldmonitor-happy    Ready       Preview, Comment  Feb 27, 2026 10:41am
worldmonitor-startup  Ready       Preview, Comment  Feb 27, 2026 10:41am


chatgpt-codex-connector commented

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

…ry cache (#450)

- Add OLLAMA_MAX_TOKENS env var (clamped 50-2000, default 300) so thinking
  models have enough budget for actual summaries instead of truncated reasoning
- Strip <|begin_of_thought|>/<|end_of_thought|> tags (terminated + unterminated)
- Add mode-scoped min-length gate: reject <20 char outputs for brief/analysis
- Extend TASK_NARRATION regex with first/step/my-task/to-summarize patterns
- Fix client-side summary cache: store headline signature in value, validate on
  read, auto-dismiss stale summaries on headline change, discard in-flight
  results when headlines change during generation
- Add tests for new patterns and negative cases (39/39 pass)
- Add hideSummary() call when headline signature changes mid-generation,
  preventing a stuck "Generating summary..." overlay
- Fix stale comment: cache version is v5, not v4
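The token-budget change above can be sketched as follows (the env var name and clamp bounds come from the commit message; the function name is an assumption for illustration):

```javascript
// Resolve the Ollama generation budget from OLLAMA_MAX_TOKENS,
// clamped to 50-2000 with a default of 300 when unset or invalid.
const DEFAULT_MAX_TOKENS = 300;
const MIN_MAX_TOKENS = 50;
const MAX_MAX_TOKENS = 2000;

function resolveMaxTokens(raw = process.env.OLLAMA_MAX_TOKENS) {
  const parsed = Number.parseInt(raw, 10);
  if (Number.isNaN(parsed)) return DEFAULT_MAX_TOKENS; // unset or non-numeric
  return Math.min(MAX_MAX_TOKENS, Math.max(MIN_MAX_TOKENS, parsed));
}
```

Clamping rather than rejecting out-of-range values keeps a misconfigured deployment functional: a value of 5000 degrades to 2000 instead of failing.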
koala73 force-pushed the fix/ollama-thinking-tokens branch from 81399f0 to 49de7bf on February 27, 2026 at 10:40
koala73 merged commit b520183 into main on Feb 27, 2026
6 checks passed
facusturla pushed a commit to facusturla/worldmonitor that referenced this pull request Feb 27, 2026
…ry cache (koala73#456)

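The headline-signature cache validation described in the commit message can be sketched as below. All names here are illustrative, and the real implementation also handles auto-dismissal and in-flight discards; this shows only the keying and staleness check:

```javascript
// Cache summaries per panel, storing a signature of the headlines the
// summary was generated from so stale entries are rejected on read.
const summaryCache = new Map();

function headlineSignature(headlines) {
  // Order-sensitive signature; \u0001 is unlikely to appear in headlines.
  return headlines.join('\u0001');
}

function cacheSummary(panelId, headlines, summary) {
  summaryCache.set(panelId, { sig: headlineSignature(headlines), summary });
}

function readSummary(panelId, headlines) {
  const entry = summaryCache.get(panelId);
  // Miss if no entry, or if the panel's headlines changed since caching.
  if (!entry || entry.sig !== headlineSignature(headlines)) return null;
  return entry.summary;
}
```

Keying by panelId fixes the cross-panel collisions mentioned in the PR summary, while validating the signature on read catches the other failure mode: the same panel's headlines changing underneath a cached summary.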