fix(llm): Gemini 2.5 compatibility — thinking parts and response parsing#98

Open
webframp wants to merge 4 commits into calesthio:master from webframp:fix/gemini-25-compatibility

Conversation

@webframp

Problem

Gemini 2.5 Flash and Pro models fail to generate trade ideas due to three compounding issues:

1. Thinking parts in multi-part responses

Gemini 2.5 models return parts arrays where the first element is a "thinking" part (thought: true) and the actual response is in subsequent parts. The provider reads only parts[0], so it gets raw reasoning instead of the JSON output.
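For illustration, a Gemini 2.5 candidate looks roughly like this (simplified sketch of the generateContent response shape; the text values are made up):

```javascript
// Illustrative shape of a Gemini 2.5 generateContent candidate
// (assumption: simplified; real responses carry more metadata).
const candidate = {
  content: {
    parts: [
      { text: "Let me reason about this request...", thought: true }, // reasoning
      { text: '```json\n[{"ticker":"XYZ"}]\n```' },                   // actual output
    ],
  },
};

// Reading only parts[0] yields the reasoning text, not the JSON payload:
const naive = candidate.content.parts[0].text;
```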

2. Truncated output from thinking token budget

Thinking tokens consume the maxOutputTokens budget, leaving insufficient tokens for the actual response. This causes the JSON array to be cut mid-object (e.g., 693 chars instead of ~2000+), making it unparseable.

3. Brittle code block extraction

The ideas parser only handled code blocks at exact string boundaries (`startsWith` / regex anchored to `$`). Gemini responses may have trailing whitespace, extra text, or different formatting that breaks extraction.

Result: `[LLM Ideas] No valid ideas parsed from response` on every sweep with Gemini 2.5 Flash/Pro, despite the model returning valid ideas inside the response.

Fix

lib/llm/gemini.mjs

  • Filter out thought parts from response, concatenate only non-thinking parts
  • Add thinkingConfig: { thinkingBudget: 1024 } to keep reasoning concise and preserve output token budget
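A minimal sketch of both changes, assuming the generateContent request/response shapes (`extractText` is a hypothetical helper name; the real patch is in lib/llm/gemini.mjs):

```javascript
// Concatenate only non-thinking parts of a Gemini candidate.
// Assumption: `candidate.content.parts` follows the generateContent shape.
function extractText(candidate) {
  const parts = candidate?.content?.parts ?? [];
  return parts
    .filter((p) => !p.thought && typeof p.text === "string")
    .map((p) => p.text)
    .join("");
}

// Request side: cap reasoning so it cannot eat the output budget
// (values are the ones this PR uses).
const generationConfig = {
  maxOutputTokens: 8192,
  thinkingConfig: { thinkingBudget: 1024 },
};
```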

lib/llm/ideas.mjs

  • Rewrite code block extraction to find ```` ```json … ``` ```` blocks anywhere in the response (not just at boundaries)
  • Add fallback: extract the JSON array ([...]) directly if no code block is found
  • Bump maxTokens from 4096 to 8192 for idea generation
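The extraction strategy can be sketched like this (helper name and exact regex are illustrative, not the literal code in lib/llm/ideas.mjs):

```javascript
// Extract a JSON array from an LLM response: prefer a fenced ```json block
// anywhere in the text, else fall back to the first [...] span.
function extractJsonArray(text) {
  const fence = text.match(/```(?:json)?\s*([\s\S]*?)```/);
  const body = fence ? fence[1] : text;
  const start = body.indexOf("[");
  const end = body.lastIndexOf("]");
  if (start === -1 || end <= start) return null;
  try {
    return JSON.parse(body.slice(start, end + 1));
  } catch {
    return null; // truncated or malformed JSON
  }
}
```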

Testing

Before: 0 ideas (llm-failed) on every sweep with gemini-2.5-flash
After: 5 ideas (llm) consistently generated and parsed correctly

exe.dev user and others added 4 commits April 20, 2026 04:16
The custom .env parser did not handle quoted values, causing passwords
and API keys containing special characters (|, ^, ", >, [, etc.) to
include the quote characters as part of the value or parse incorrectly.

This adds quote stripping for both single and double-quoted values,
matching the behavior of dotenv and other standard .env parsers.

Co-authored-by: Shelley <shelley@exe.dev>
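A minimal sketch of the quote-stripping rule, assuming a line-by-line .env parser (helper name hypothetical):

```javascript
// Strip matching single or double quotes around a .env value, so
// PASSWORD="p|^>[" style values keep their special characters intact.
function stripQuotes(value) {
  const v = value.trim();
  if (
    v.length >= 2 &&
    ((v.startsWith('"') && v.endsWith('"')) ||
      (v.startsWith("'") && v.endsWith("'")))
  ) {
    return v.slice(1, -1);
  }
  return v; // unquoted (or unbalanced) values pass through unchanged
}
```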
Slash command registration was called before client.login(), so
client.user.id was undefined and fell back to the string "me",
causing a Discord API error:

  Invalid Form Body
  application_id[NUMBER_TYPE_COERCE]: Value "me" is not snowflake.

This moves command registration into the ready event handler and
attaches that handler before login() to avoid a race condition
where the ready event fires before the listener is attached.

Co-authored-by: Shelley <shelley@exe.dev>
When running behind a reverse proxy or on a remote host, the /status
command in Telegram and Discord shows http://localhost:PORT which is
not reachable by users.

Adds a PUBLIC_URL env var that, when set, replaces the hardcoded
localhost URL in bot status responses. Falls back to localhost when
unset, so existing setups are unaffected.

Example: PUBLIC_URL=https://my-crucix.example.com
Co-authored-by: Shelley <shelley@exe.dev>
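A sketch of the fallback (the env var name comes from the commit; the status helper is hypothetical):

```javascript
// Prefer PUBLIC_URL when set; otherwise fall back to the local address,
// so existing localhost setups keep working unchanged.
function statusUrl(env, port) {
  return env.PUBLIC_URL || `http://localhost:${port}`;
}
```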
Three issues when using Gemini 2.5 Flash/Pro models:

1. Gemini 2.5 models return multi-part responses where the first part
   is a "thinking" part and the second is the actual content. The
   provider only read parts[0], getting thinking text instead of the
   response. Fixed by filtering out thought parts.

2. Thinking tokens consumed the maxOutputTokens budget, causing
   truncated JSON responses (cut mid-object). Added thinkingConfig
   with a 1024-token budget to keep reasoning concise, and bumped
   idea generation to 8192 output tokens.

3. The ideas response parser failed on Gemini output because it only
   handled code blocks at string boundaries. Rewrote to extract code
   blocks from anywhere in the response and fall back to finding the
   JSON array if no code block is present.

Co-authored-by: Shelley <shelley@exe.dev>
@webframp webframp requested a review from calesthio as a code owner April 20, 2026 05:21