Conversation

@rossmanko rossmanko (Contributor) commented Jan 30, 2026

Summary by CodeRabbit

  • Bug Fixes

    • Suppress noisy analytics errors.
  • New Features / Improvements

    • Redesigned message rendering with richer editing, attachments, streaming indicators and faster UI updates.
    • Streaming responses now retry with a fallback model for incomplete assistant replies.
  • Updates

    • Usage accounting prefers provider-reported costs when available.
    • Stricter moderation thresholds for non-paid users; reduced agent-mode step limits for some tiers; assistant replies require more substantial content.
  • Privacy

    • Logs now record only region (no detailed location).
  • Chores

    • Development image updated to include tmux.
  • Documentation

    • Large set of guideline docs added and some legacy guides removed.


rossmanko and others added 6 commits January 30, 2026 08:54
Include tmux in the Dockerfile to enhance terminal multiplexing capabilities. Update the installation verification step to confirm tmux is correctly installed alongside other tools.
…ption tier

Adjust the maximum steps for users in agent mode: ultra users now receive 40 steps instead of 50, and other users receive 20 steps instead of 25. This change aligns the step limits with updated subscription tier policies.
Remove city and country from user location logging for privacy.
The userLocation data is still available for tools that need it
(e.g., web search uses country), but only region is sent to logs.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
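
For illustration, a minimal TypeScript sketch of the shape of this change (the type and helper names here are hypothetical, not taken from the diff):

```ts
// Hypothetical names, for illustration only
type UserLocation = { city?: string; country?: string; region?: string };

// Tools still receive the full location object; only the region reaches logs
const toLoggedLocation = (loc?: UserLocation): { region?: string } => ({
  region: loc?.region,
});
```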
Update the maximum moderation level for non-paid users from 0.95 to 0.90 to enhance moderation control

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…alidPromptError

When fetching file URLs fails (e.g., S3 server error), file parts remained
in messages with type: "file" but no url property. This caused the AI SDK
to reject the messages with AI_InvalidPromptError since file parts require
actual content.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
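
The fix above amounts to filtering out file parts that never received a URL. A minimal sketch, assuming a simplified part shape (the walkthrough below names the real helper, removeFilePartsWithoutUrls in lib/utils/file-transform-utils.ts):

```ts
// Simplified part shape assumed for illustration
type MessagePart = { type: string; url?: string };

// Drop file parts whose URL fetch failed, so the AI SDK never sees
// a type: "file" part without content
const removeFilePartsWithoutUrls = (parts: MessagePart[]): MessagePart[] =>
  parts.filter((part) => part.type !== "file" || typeof part.url === "string");
```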
@vercel vercel bot commented Jan 30, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| hackerai | Ready | Preview, Comment | Jan 30, 2026 11:31pm |


@coderabbitai coderabbitai bot commented Jan 30, 2026

Caution

Review failed

The pull request is closed.

📝 Walkthrough

Adds PostHog no-op error handler; narrows user location to a scalar region; preserves provider metadata in message parts; drops file parts without URLs; introduces stream retry with a fallback model; threads provider-reported cost into usage deduction; renames and extends rate-limit API; large UI refactor to memoized message components; installs tmux in Docker; extensive doc additions/removals.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **Telemetry & Error Handling**<br>app/posthog.js | Registers a no-op PostHog error handler to suppress client errors. |
| **Chat handler & streaming**<br>lib/api/chat-handler.ts | Adds createStream wrapper and retry-with-fallback flow (isRetryWithFallback, fallback model), persists retry results, logs incomplete streams, and threads providerCost into usage deduction. |
| **Logging / Context**<br>lib/api/chat-logger.ts, lib/logger.ts | Replaces nested location with scalar region in user context; logger prefers provider-reported cost when present, falls back to token-based cost. |
| **Rate limiting / Billing**<br>lib/rate-limit/token-bucket.ts, lib/rate-limit/index.ts, lib/rate-limit/__tests__/* | Renames deductAgentUsage → deductUsage, adds optional providerCostDollars parameter to prefer provider cost, switches pricing calculations/tests to yearly pricing. |
| **Chat processing & moderation**<br>lib/chat/chat-processor.ts, lib/moderation.ts | Tightens assistant-message validation (requires meaningful assistant parts), reduces agent-mode max steps (ultra 50→40, non-ultra 25→20), lowers non-paid uncensor threshold (0.95→0.9). |
| **Message normalization & file utils**<br>lib/utils/message-processor.ts, lib/utils/file-transform-utils.ts | Stops stripping provider metadata from parts; adds removeFilePartsWithoutUrls() and invokes it to drop file parts lacking URLs. |
| **UI: message rendering & perf**<br>app/components/MessageItem.tsx, app/components/Messages.tsx, app/components/MessagePartHandler.tsx, app/components/ReasoningHandler.tsx, app/components/tools/*, components/ai-elements/* | Adds MessageItem, consolidates per-message rendering, converts many handlers/tools/reasoning components to memoized components with custom equality checks, hoists/memoizes selectors, stabilizes callbacks; removes or changes several ai-elements exports (some deleted, some API-updated). |
| **Message processor / provider metadata**<br>lib/utils/message-processor.ts | Preserves provider-related metadata (e.g., providerMetadata, thought_signature) in parts for Gemini/tool compatibility. |
| **Infrastructure**<br>docker/Dockerfile | Installs tmux in the main tool image and adds a presence check in validation. |
| **Docs & skills reorg**<br>.agents/..., .claude/..., .cursor/... (many files) | Adds vercel composition/react best-practices docs and many rule files while deleting/moving numerous existing skill docs; a large documentation reorganization. |
| **Formatting / tooling**<br>.prettierignore | Adds .agents/ to Prettier ignore list. |
| **Minor UI formatting**<br>app/components/AgentsTab.tsx | Minor import/formatting cleanup only. |
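
The provider-cost preference in the logging and rate-limit rows is easy to get wrong at the edges (see the review comments further down). A hedged TypeScript illustration, with names assumed rather than taken from the diff:

```ts
// Illustrative helper only: prefer a finite, non-negative provider-reported
// cost; otherwise fall back to the token-based estimate
const effectiveCostDollars = (
  providerCost: number | undefined,
  tokenBasedCost: number,
): number =>
  typeof providerCost === "number" &&
  Number.isFinite(providerCost) &&
  providerCost >= 0
    ? providerCost
    : tokenBasedCost;
```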

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant ChatHandler as Server/ChatHandler
    participant Model as ModelProvider
    participant DB
    participant Billing as RateLimiter

    Client->>ChatHandler: Start streaming request (chatId, model)
    ChatHandler->>Model: streamText(model)
    Model-->>ChatHandler: stream parts (may finish as single step-start)
    alt stream completes validly
        ChatHandler->>DB: persist assistant message
        ChatHandler->>Billing: deductUsage(providerCost?)
        ChatHandler-->>Client: finalize stream
    else stream ends incomplete (single step-start)
        ChatHandler->>DB: log incomplete stream
        ChatHandler->>ChatHandler: createStream(fallbackModel) [isRetryWithFallback=true]
        ChatHandler->>Model: streamText(fallbackModel)
        Model-->>ChatHandler: stream retry parts
        ChatHandler->>DB: persist assistant message (retry)
        ChatHandler->>Billing: deductUsage(providerCost?)
        ChatHandler-->>Client: finalize retried stream
    end
```
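
The retry gate in the diagram hinges on detecting an assistant message whose only part is step-start. A minimal sketch of that check (the message shape is assumed, not taken from the diff):

```ts
// Assumed shapes, for illustration only
type UIPart = { type: string };
type UIMessage = { role: "user" | "assistant"; parts: UIPart[] };

const endedWithOnlyStepStart = (messages: UIMessage[]): boolean => {
  // findLast (ES2023) picks the most recent assistant message
  const last = messages.findLast((m) => m.role === "assistant");
  if (!last) return false;
  return last.parts.length === 1 && last.parts[0]?.type === "step-start";
};
```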

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes


Poem

🐰 I hush PostHog's little cries and keep the logs polite,
I trim agent steps and guard the files that lost their sight.
I whisper provider costs into the billing stream,
Retry with a fallback model when the answer's just a dream.
A hop, a nibble, tidy refactor — carrots for the team!

🚥 Pre-merge checks | ✅ 2 passed | ❌ 1 failed

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title check | ❓ Inconclusive | The title 'Daily branch 2026 01 30' is vague and generic, conveying nothing about the changeset's actual content or primary objectives. | Replace the title with a concise, specific description of the main changes (e.g., 'Add message item component and optimize chat rendering' or 'Refactor chat handler with fallback model retry and moderation updates'). |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 85.71%, which meets the required threshold of 80.00%. |



Detect and log incomplete responses where the stream ends with only a
step-start part and no actual content.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@coderabbitai coderabbitai bot left a comment
Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
lib/chat/chat-processor.ts (1)

8-21: ⚠️ Potential issue | 🟡 Minor

Documentation doesn't match implementation.

The docstring (lines 10-11) states "Ultra: 50 steps, all other paid users: 25 steps" and the inline comment (line 17) says "others get 25 steps", but the code returns 40 for Ultra (line 19) and 20 for other agent users (line 21).

📝 Suggested fix to align documentation with code
```diff
 /**
  * Get maximum steps allowed for a user based on mode and subscription tier
- * Agent mode: Ultra: 50 steps, all other paid users: 25 steps
+ * Agent mode: Ultra: 40 steps, all other paid users: 20 steps
  * Ask mode: Free: 5 steps, Pro/Team: 10 steps, Ultra: 15 steps
  */
 export const getMaxStepsForUser = (
   mode: ChatMode,
   subscription: SubscriptionTier,
 ): number => {
-  // Agent mode: Ultra users get 40 steps, others get 25 steps
+  // Agent mode: Ultra users get 40 steps, others get 20 steps
   if (mode === "agent" && subscription === "ultra") {
     return 40;
   } else if (mode === "agent") {
     return 20;
   }
```
🧹 Nitpick comments (2)
app/posthog.js (1)

17-18: Consider logging suppressed errors for observability.

Silently swallowing all PostHog errors removes visibility into potential issues beyond payload-too-large (e.g., configuration problems, network failures, API changes). Consider logging at debug/warn level or filtering for specific expected error types.

♻️ Suggested improvement
```diff
-  // Suppress PostHog errors (e.g., payload too large)
-  posthogClient.on("error", () => {});
+  // Log PostHog errors at debug level (expected: payload too large)
+  posthogClient.on("error", (err) => {
+    console.debug("PostHog error (suppressed):", err?.message || err);
+  });
```
lib/api/chat-handler.ts (1)

616-636: LGTM! Good observability addition for detecting incomplete responses.

Minor optimization: consider using findLast instead of .slice().reverse().find() to avoid creating intermediate arrays.

♻️ Optional: Use findLast for cleaner iteration
```diff
-                const lastAssistantMessage = messages
-                  .slice()
-                  .reverse()
-                  .find((m) => m.role === "assistant");
+                const lastAssistantMessage = messages.findLast(
+                  (m) => m.role === "assistant"
+                );
```

…g/step-start

When user aborts during reasoning phase, the message only has step-start
and reasoning parts (no text). Gemini fails with "must include at least
one parts field" when this message is sent in the next request.

Assistant messages are now filtered more strictly: they need actual
content (text or tool parts), not just reasoning/step-start alone.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
lib/chat/chat-processor.ts (1)

8-21: ⚠️ Potential issue | 🟡 Minor

Documentation is inconsistent with implementation.

The function has multiple documentation inconsistencies:

  1. Docstring (lines 10-11) states Ultra gets 50 steps, but code returns 40
  2. Docstring states non-Ultra agent users get 25 steps, but code returns 20
  3. Inline comment (line 17) says "others get 25 steps" but line 21 returns 20
📝 Proposed fix to align documentation with implementation
```diff
 /**
  * Get maximum steps allowed for a user based on mode and subscription tier
- * Agent mode: Ultra: 50 steps, all other paid users: 25 steps
+ * Agent mode: Ultra: 40 steps, all other paid users: 20 steps
  * Ask mode: Free: 5 steps, Pro/Team: 10 steps, Ultra: 15 steps
  */
 export const getMaxStepsForUser = (
   mode: ChatMode,
   subscription: SubscriptionTier,
 ): number => {
-  // Agent mode: Ultra users get 40 steps, others get 25 steps
+  // Agent mode: Ultra users get 40 steps, others get 20 steps
   if (mode === "agent" && subscription === "ultra") {
     return 40;
   } else if (mode === "agent") {
     return 20;
   }
```

rossmanko and others added 2 commits January 30, 2026 15:32
…signatures

Stop stripping providerMetadata from message parts in normalizeMessages.
Gemini models require thought_signature in providerMetadata to be passed
back in subsequent requests for tool calls to work correctly.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
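
As a hedged sketch of the pattern this commit describes (the helper below is illustrative; the real transform lives in lib/utils/message-processor.ts), spreading the incoming part keeps provider fields intact:

```ts
// Illustrative shape; the real part type is richer
type ToolPart = {
  type: string;
  output?: unknown;
  providerMetadata?: Record<string, unknown>;
};

const transformToolPart = (part: ToolPart, output: unknown): ToolPart => ({
  ...part, // preserves providerMetadata (e.g., thought_signature) and other fields
  output,
});
```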
Move skill definitions to central .agents/skills/ directory and replace
duplicated files in .claude/skills/ and .cursor/skills/ with symlinks.
This eliminates duplication and ensures both tools share the same skill
definitions.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@coderabbitai coderabbitai bot left a comment
Actionable comments posted: 7

🤖 Fix all issues with AI agents
In @.agents/skills/vercel-composition-patterns/rules/react19-no-forwardref.md:
- Line 42: The final sentence in the rule text is a fragment; change it to a
complete statement by rewriting it to something like "use() can be called
conditionally, unlike useContext()." Update the rule file's last line where the
functions use() and useContext() are mentioned so the sentence is grammatically
complete and ends with punctuation.
- Around line 12-42: Update the explanatory text that currently says "use()
replaces useContext()" to instead state that "use() is preferred over
useContext() when reading context because it can be called conditionally," while
keeping the existing code examples; edit the paragraph referencing use() and
useContext() so it clarifies that useContext still exists but use() is
recommended for conditional reads (refer to the symbols use() and useContext()
in the description).

In @.agents/skills/vercel-react-best-practices/rules/js-min-max-loop.md:
- Line 82: Update the note about spread-operator limits to correct the numeric
limits and clarify what they apply to: replace the incorrect "approximately
124000 in Chrome 143 and 638000 in Safari 18" with the actual hard
argument-count limits ("Chrome/V8: 65,535 arguments; Safari/JavaScriptCore:
65,536 arguments"), and explicitly state that this limit applies to function
argument counts for spread usages like Math.max(...arr) and push(...arr), not to
array length itself; keep the recommendation to use a loop (e.g., a for loop or
reduce) for reliability with very large arrays.

In @.agents/skills/vercel-react-best-practices/rules/server-auth-actions.md:
- Line 10: Replace the emphasized line "**Impact: CRITICAL (prevents
unauthorized access to server mutations)**" with a proper Markdown heading
(e.g., "## Impact: CRITICAL (prevents unauthorized access to server mutations)")
so it is a heading instead of emphasized text, satisfying markdownlint MD036;
update the markup for the exact text string found in server-auth-actions.md to
use a heading level appropriate for the document structure.

In
@.agents/skills/vercel-react-best-practices/rules/server-parallel-fetching.md:
- Around line 56-74: The snippet uses ReactNode but doesn’t import it; add an
import for ReactNode from 'react' at the top of the file so the types used by
Layout are resolved. Locate the component definitions (Header, Sidebar, Layout)
and ensure there's a line importing ReactNode (e.g., import { ReactNode } from
'react') before the async functions that call fetchHeader and fetchSidebarItems
so the Layout({ children }: { children: ReactNode }) type is valid.

In @.agents/skills/web-design-guidelines/SKILL.md:
- Around line 4-7: The skill's front matter moved argument-hint under metadata
(metadata.argument-hint) which breaks consumers expecting a top-level
argument-hint; update any parsing code (e.g., YAML/front-matter readers, CLI
option loaders, and skill-parsing functions) to first check
metadata.argument-hint and fall back to a top-level argument-hint for backward
compatibility, or normalize the parsed object to always expose
argumentHint/argument-hint at the expected location before further processing;
adjust references in functions/classes that read argument-hint to use the
normalized property.

In `@lib/utils/message-processor.ts`:
- Around line 178-189: The transformer for terminal tool parts
(transformTerminalToolPart) currently constructs a fresh object and drops
provider metadata from the incoming TerminalToolPart; update
transformTerminalToolPart to spread the incoming terminalPart (the part passed
in from the message processing branch that uses terminalDataMap) into the
returned object so fields like thought_signature and other provider metadata are
preserved, while still applying the transformed streaming/output fields and any
new keys; ensure processedParts.push(transformedPart) and messageChanged logic
remain unchanged.
🧹 Nitpick comments (3)
.agents/skills/vercel-react-best-practices/rules/js-batch-dom-css.md (1)

78-105: Optional: add a note about useLayoutEffect/requestAnimationFrame for timing clarity in React.

This can help readers avoid paint flicker and make the “read vs write phase” split explicit in React examples.

✏️ Suggested doc tweak
````diff
-**React example:**
+**React example:**
+// Tip: if you must read/write layout in React, prefer `useLayoutEffect` (or split phases via `requestAnimationFrame`)
+// to avoid flicker and make layout timing explicit.
 ```tsx
 // Incorrect: interleaving style changes with layout queries
 function Box({ isHighlighted }: { isHighlighted: boolean }) {
````
.agents/skills/vercel-react-best-practices/rules/advanced-event-handler-refs.md (1)

15-15: Consider adding TypeScript type annotations to event parameters.

The handler signatures now accept an event parameter (e) => void, which is more realistic than the previous () => void. However, for better type safety and to demonstrate TypeScript best practices, consider specifying the event type, such as (e: Event) => void or a more specific type like (e: MouseEvent) => void depending on the use case.

📝 Example with type annotations
```diff
-function useWindowEvent(event: string, handler: (e) => void) {
+function useWindowEvent(event: string, handler: (e: Event) => void) {
```

Also applies to: 26-26, 45-45

.agents/skills/vercel-composition-patterns/SKILL.md (1)

75-78: Add a language to the fenced code block.

Markdownlint flags this fence without a language; use text (or plaintext) to satisfy MD040.

📝 Suggested fix
````diff
-```
+```text
 rules/architecture-avoid-boolean-props.md
 rules/state-context-interface.md
````

- Extract MessageItem component with custom memo comparison
- Add React.memo to MessagePartHandler with tool/text-specific comparison
- Memoize all tool handlers (Terminal, Web, Todo, File, Match, Memory, Summarization)
- Add useMemo for expensive computations (fileParts, webSources, output parsing)
- Add useCallback for event handlers to maintain stable references
- Hoist static values (regex patterns, pure functions) outside components
- Implement CSS content-visibility for off-screen message optimization
- Simplify Messages.tsx by delegating to MessageItem

These changes reduce unnecessary re-renders during streaming by comparing
only relevant props for each component type rather than full object equality.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
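
The commit above leans on React.memo with per-component comparators. A self-contained sketch of the pattern (the component and props here are illustrative, not from the diff):

```tsx
import { memo } from "react";

// Illustrative props; real handlers compare only the fields they render
type ToolProps = {
  status: string;
  part: { toolCallId: string; output?: string };
};

const ToolHandler = ({ status, part }: ToolProps) => (
  <div>
    {status}: {part.output ?? "…"}
  </div>
);

// Skip re-rendering unless a prop that affects the output actually changed
export default memo(
  ToolHandler,
  (prev, next) =>
    prev.status === next.status &&
    prev.part.toolCallId === next.part.toolCallId &&
    prev.part.output === next.part.output,
);
```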
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

🤖 Fix all issues with AI agents
In `@app/components/MessageItem.tsx`:
- Around line 50-78: The comparator areMessageItemPropsEqual is missing checks
for props that affect rendering (mode, isTemporaryChat, branchedFromChatId,
branchedFromChatTitle, branchBoundaryIndex, tempChatFileDetails), so add
comparisons for each of those fields to force re-render when they change; update
areMessageItemPropsEqual (which accepts MessageItemProps) to return false if
prev.mode !== next.mode, prev.isTemporaryChat !== next.isTemporaryChat,
prev.branchedFromChatId !== next.branchedFromChatId, prev.branchedFromChatTitle
!== next.branchedFromChatTitle, prev.branchBoundaryIndex !==
next.branchBoundaryIndex, and compare prev.tempChatFileDetails to
next.tempChatFileDetails (by reference or shallow-equality of its identifying
fields) so changes affecting shouldShowBranchIndicator, FinishReasonNotice,
MessageActions, and effectiveFileDetails will trigger re-render.

In `@app/components/MessagePartHandler.tsx`:
- Around line 32-74: The comparator arePropsEqual for MessagePartHandler is
missing a check for prop changes to partIndex, causing stale child rendering
when message.parts are reordered; update arePropsEqual (used by
MessagePartHandler) to return false when prevProps.partIndex !==
nextProps.partIndex so the parent/child re-render alignment matches the behavior
in ReasoningHandler and SummarizationHandler which rely on partIndex to access
the correct entry in message.parts.

In `@app/components/ReasoningHandler.tsx`:
- Around line 39-57: The comparator areReasoningPropsEqual currently only checks
part lengths and a single part's text which can miss changes in combined
reasoning; modify areReasoningPropsEqual to compute and compare the combined
reasoning string (using the same logic as collectReasoningText) for the
reasoning run that spans consecutive parts, i.e., derive the full reasoning text
from prev.message.parts and next.message.parts (using partIndex to find the run
boundaries) and return false if those combined strings differ; keep the existing
short-circuits for status, isLastMessage, partIndex, and parts.length but
replace the single-part text comparison with a compare of the collected
reasoning text for both prev and next.

In `@app/components/tools/TodoToolHandler.tsx`:
- Around line 15-25: The custom comparator function areTodoPropsEqual is missing
a check for message.id, causing stale messageId to be passed to TodoBlock;
update areTodoPropsEqual (used to decide re-rendering of TodoToolHandler) to
also compare prev.message.id !== next.message.id and return false when they
differ so the component updates when the message identity changes, ensuring the
TodoBlock's messageId prop is always in sync.

In `@app/components/tools/WebToolHandler.tsx`:
- Around line 34-44: areWebPropsEqual currently only compares status,
part.state, part.toolCallId and part.output, which lets changes to
part.input/part.toolName/part.type (and derived values like isOpenUrl, target,
query) slip through and render stale UI; update areWebPropsEqual to also compare
prev.part.input === next.part.input, prev.part.toolName === next.part.toolName
and prev.part.type === next.part.type (or the specific derived values if you
prefer comparing isOpenUrl/target/query directly) so any change to input/tool
metadata triggers a re-render; keep the existing checks and return false if any
of these new comparisons differ.
🧹 Nitpick comments (2)
app/components/MessagePartHandler.tsx (1)

72-74: Misleading comment and unreachable code path.

Line 73 is unreachable when prevProps.part !== nextProps.part since Line 44 already returns true for equal references. This effectively means unknown part types always trigger re-renders (returning false), which is safe behavior, but the comment "shallow compare part object" is misleading.

♻️ Suggested clarification
```diff
-  // Default: shallow compare part object
-  return prevProps.part === nextProps.part;
+  // Unknown part type - re-render to be safe
+  return false;
```
app/components/MessageItem.tsx (1)

162-166: Type assertion as any is a minor code smell.

The message as any cast at Line 165 bypasses type checking. Consider adding proper typing to extractWebSourcesFromMessage or defining a type that ChatMessage extends.

Comment on lines +39 to +57

```tsx
// Custom comparison for reasoning handler
function areReasoningPropsEqual(
  prev: ReasoningHandlerProps,
  next: ReasoningHandlerProps,
): boolean {
  if (prev.status !== next.status) return false;
  if (prev.isLastMessage !== next.isLastMessage) return false;
  if (prev.partIndex !== next.partIndex) return false;
  // Compare parts length and relevant reasoning content
  if (prev.message.parts.length !== next.message.parts.length) return false;
  // Compare the reasoning part text directly
  const prevPart = prev.message.parts[prev.partIndex];
  const nextPart = next.message.parts[next.partIndex];
  if (prevPart?.type !== nextPart?.type) return false;
  if (prevPart?.type === "reasoning" && nextPart?.type === "reasoning") {
    return prevPart.text === nextPart.text;
  }
  return true;
}
```

⚠️ Potential issue | 🟡 Minor

Comparator can miss updates to combined reasoning text.
Rendering uses collectReasoningText across consecutive reasoning parts, but the comparator only checks the current part text plus length. If later reasoning chunks change while length stays constant, this memoization can skip needed updates and show stale reasoning. Consider comparing the combined text for the reasoning run.

✅ Suggested fix
```diff
   const prevPart = prev.message.parts[prev.partIndex];
   const nextPart = next.message.parts[next.partIndex];
   if (prevPart?.type !== nextPart?.type) return false;
   if (prevPart?.type === "reasoning" && nextPart?.type === "reasoning") {
-    return prevPart.text === nextPart.text;
+    if (prevPart.text !== nextPart.text) return false;
+    const prevCombined = collectReasoningText(prev.message.parts, prev.partIndex);
+    const nextCombined = collectReasoningText(next.message.parts, next.partIndex);
+    return prevCombined === nextCombined;
   }
   return true;
```
🤖 Prompt for AI Agents
In `@app/components/ReasoningHandler.tsx` around lines 39 - 57, The comparator
areReasoningPropsEqual currently only checks part lengths and a single part's
text which can miss changes in combined reasoning; modify areReasoningPropsEqual
to compute and compare the combined reasoning string (using the same logic as
collectReasoningText) for the reasoning run that spans consecutive parts, i.e.,
derive the full reasoning text from prev.message.parts and next.message.parts
(using partIndex to find the run boundaries) and return false if those combined
strings differ; keep the existing short-circuits for status, isLastMessage,
partIndex, and parts.length but replace the single-part text comparison with a
compare of the collected reasoning text for both prev and next.

rossmanko and others added 3 commits January 30, 2026 17:37
Fixes Collapsible structure to use single wrapper, removes use-stick-to-bottom
dependency, adds useMemo for context, exports useReasoning hook, and adds
customizable getThinkingMessage prop.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…nOfThought

- Simplified import statements in AgentsTab.tsx
- Added .agents/ to .prettierignore to exclude agent-related files from formatting
- Removed ChainOfThought component and its related files to streamline the codebase

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Updated WideEventBuilder to extract and store provider cost from usage data.
- Refactored deductAgentUsage to deduct costs based on provider cost if available, improving accuracy.
- Renamed deductAgentUsage to deductUsage for consistency across the codebase.
- Adjusted tests to reflect the new function name and ensure correct behavior with provider cost.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@coderabbitai coderabbitai bot left a comment
Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
lib/logger.ts (1)

280-298: ⚠️ Potential issue | 🟡 Minor

Validate provider cost before storing it.

If the provider returns a non-finite or negative value, total_cost becomes invalid and the fallback calculation is skipped. Guard it so only valid numbers are persisted.

🛠️ Suggested fix
```diff
-      const rawCost = (usage as { raw?: { cost?: number } }).raw?.cost;
+      const rawCost = (usage as { raw?: { cost?: number } }).raw?.cost;
+      const providerCost =
+        typeof rawCost === "number" && Number.isFinite(rawCost) && rawCost >= 0
+          ? rawCost
+          : undefined;
@@
-        total_cost: rawCost,
+        total_cost: providerCost,
```

Also applies to: 351-359

lib/rate-limit/token-bucket.ts (1)

292-334: ⚠️ Potential issue | 🟠 Major

Treat provider-reported $0 cost as valid to avoid overcharging.

The > 0 check skips provider cost when it’s 0, forcing token-based charging. If a provider reports a real $0 (e.g., cached hit), this will overcharge.

🐛 Suggested fix
```diff
-    if (providerCostDollars !== undefined && providerCostDollars > 0) {
+    if (
+      providerCostDollars !== undefined &&
+      Number.isFinite(providerCostDollars) &&
+      providerCostDollars >= 0
+    ) {
```
🤖 Fix all issues with AI agents
In `@components/ai-elements/reasoning.tsx`:
- Around line 49-51: The effect in useEffect is forcing the panel closed
whenever isStreaming becomes false, overriding defaultOpen/user choice; change
the effect in reasoning.tsx so it only auto-opens when streaming starts (i.e.,
call setIsOpen(true) when isStreaming is true) and do not call setIsOpen(false)
when streaming stops, so user/state-managed closing or defaultOpen are
preserved; keep dependencies (isStreaming, setIsOpen) and ensure any initial
defaultOpen handling remains outside this effect.
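
A minimal sketch of the behavior that instruction asks for (the hook and state names are assumed):

```tsx
import { useEffect, useState } from "react";

export function useAutoOpenWhileStreaming(
  isStreaming: boolean,
  defaultOpen = false,
) {
  const [isOpen, setIsOpen] = useState(defaultOpen);
  useEffect(() => {
    // Auto-open when streaming starts; never force-close when it stops,
    // so defaultOpen and the user's own toggling are preserved
    if (isStreaming) setIsOpen(true);
  }, [isStreaming]);
  return [isOpen, setIsOpen] as const;
}
```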
🧹 Nitpick comments (2)
lib/rate-limit/token-bucket.ts (1)

56-64: Rename monthlyPrice to reflect yearly pricing.

This is a readability nit, but the variable name now contradicts the comment and source value.

♻️ Suggested refactor
```diff
-  const monthlyPrice = PRICING[subscription]?.yearly ?? 0;
-  const monthlyPoints = monthlyPrice * POINTS_PER_DOLLAR;
+  const yearlyPrice = PRICING[subscription]?.yearly ?? 0;
+  const yearlyPoints = yearlyPrice * POINTS_PER_DOLLAR;
@@
-    session: Math.round(monthlyPoints / 30), // Daily budget
-    weekly: Math.round((monthlyPoints * 7) / 30), // Weekly budget
+    session: Math.round(yearlyPoints / 30), // Daily budget
+    weekly: Math.round((yearlyPoints * 7) / 30), // Weekly budget
```
lib/api/chat-handler.ts (1)

621-641: Consider redacting parts before logging.

Logging full parts could capture user/tool payloads. Logging only types/counts keeps the signal without payload risk.

🧹 Suggested change
```diff
-                  parts: lastAssistantMessage.parts,
+                  parts: lastAssistantMessage.parts?.map((p) => p.type),
+                  partsCount: lastAssistantMessage.parts?.length,
```

…p-start

When the primary model (Google Vertex) returns only a step-start part with no
actual content, automatically retry with the fallback model (Google AI Studio).

Changes:
- Add agent-fallback-model using same gemini model via AI Studio provider
- Configure provider routing: Vertex for main models, AI Studio for fallback
- Extract createStream helper for reusable stream creation
- Implement retry logic in onFinish when step-start only detected
- Only save new assistant messages from retry (skip already-saved user messages)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@coderabbitai coderabbitai bot left a comment
Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
lib/api/chat-handler.ts (2)

566-596: ⚠️ Potential issue | 🟠 Major

Prevent double-charging when retrying after step-start-only responses.

deductUsage runs in the first stream’s onFinish before you detect the step-start-only case, then runs again for the fallback stream—charging twice for output/provider cost. Defer deduction until you know you’re not retrying (and apply it once for the final attempt).

✅ Proposed fix (defer deduction until after retry decision)
```diff
             onFinish: async ({ finishReason, usage, response }) => {
               // If preemptive timeout triggered, use "timeout" as finish reason
               if (preemptiveTimeout?.isPreemptive()) {
                 streamFinishReason = "timeout";
               } else {
                 streamFinishReason = finishReason;
               }
               // Capture full usage and model
               streamUsage = usage as Record<string, unknown>;
               responseModel = response?.modelId;

               // Update logger with model and usage
               chatLogger!.setStreamResponse(responseModel, streamUsage);
-
-              // Deduct additional cost (output + any input difference)
-              // Input cost was already deducted upfront in checkRateLimit
-              // Free users don't have token buckets, so skip for them
-              if (subscription !== "free" && usage) {
-                // Extract provider cost if available (more accurate than token calculation)
-                const providerCost = (usage as { raw?: { cost?: number } }).raw
-                  ?.cost;
-
-                await deductUsage(
-                  userId,
-                  subscription,
-                  estimatedInputTokens,
-                  usage.inputTokens || 0,
-                  usage.outputTokens || 0,
-                  extraUsageConfig,
-                  providerCost,
-                );
-              }
             },
               onFinish: async ({ messages, isAborted }) => {
                 // Check if stream finished with only step-start (indicates incomplete response)
                 const lastAssistantMessage = messages
                   .slice()
                   .reverse()
                   .find((m) => m.role === "assistant");
                 const hasOnlyStepStart =
                   lastAssistantMessage?.parts?.length === 1 &&
                   lastAssistantMessage.parts[0]?.type === "step-start";

                 if (hasOnlyStepStart) {
                   axiomLogger.error("Stream finished with only step-start", {
                     chatId,
                     endpoint,
                     mode,
                     model: selectedModel,
                     userId,
                     subscription,
                     isTemporary: temporary,
                     messageCount: messages.length,
                     parts: lastAssistantMessage?.parts,
                     isRetryWithFallback,
                   });

                   // Retry with fallback model if not already retrying
                   if (!isRetryWithFallback && !isAborted) {
                     isRetryWithFallback = true;
                     axiomLogger.info(
                       "Retrying with fallback model after step-start only",
                       {
                         chatId,
                         originalModel: selectedModel,
                         fallbackModel,
                         userId,
                       },
                     );

                     const retryResult = await createStream(fallbackModel);
                     const retryMessageId = generateId();

                     writer.merge(
                       retryResult.toUIMessageStream({
                         generateMessageId: () => retryMessageId,
                         onFinish: async ({
                           messages: retryMessages,
                           isAborted: retryAborted,
                         }) => {
                           // Cleanup for retry
                           preemptiveTimeout?.clear();
                           if (!subscriberStopped) {
                             await cancellationSubscriber.stop();
                             subscriberStopped = true;
                           }

                           chatLogger!.emitSuccess({
                             finishReason: streamFinishReason,
                             wasAborted: retryAborted,
                             wasPreemptiveTimeout: false,
                             hadSummarization: hasSummarized,
                           });

+                          // Deduct additional cost for the final (retry) attempt
+                          if (subscription !== "free" && streamUsage) {
+                            const providerCost = (
+                              streamUsage as { raw?: { cost?: number } }
+                            ).raw?.cost;
+                            await deductUsage(
+                              userId,
+                              subscription,
+                              estimatedInputTokens,
+                              (streamUsage as any).inputTokens || 0,
+                              (streamUsage as any).outputTokens || 0,
+                              extraUsageConfig,
+                              providerCost,
+                            );
+                          }

                           const generatedTitle = await titlePromise;
                           ...
                         },
                         sendReasoning: true,
                       }),
                     );

                     return; // Skip normal cleanup - retry handles it
                   }
                 }
+
+                // Deduct additional cost for non-retry flows
+                if (subscription !== "free" && streamUsage) {
+                  const providerCost = (
+                    streamUsage as { raw?: { cost?: number } }
+                  ).raw?.cost;
+                  await deductUsage(
+                    userId,
+                    subscription,
+                    estimatedInputTokens,
+                    (streamUsage as any).inputTokens || 0,
+                    (streamUsage as any).outputTokens || 0,
+                    extraUsageConfig,
+                    providerCost,
+                  );
+                }
```

Also applies to: 627-752


420-435: ⚠️ Potential issue | 🟠 Major

Retry uses model-specific settings from the original model (Gemini/system prompt).

currentSystemPrompt and isGeminiModel are derived once from selectedModel. If the fallback differs (e.g., Gemini vs non‑Gemini), retry may use the wrong prompt formatting / thought_signature handling and fail again. Recompute model-specific flags and the system prompt before createStream(fallbackModel).

🔧 Proposed fix (recompute model-specific context before retry)
```diff
-          const isGeminiModel =
+          let isGeminiModel =
             configuredModelId.includes("gemini") ||
             configuredModelId.includes("google");
                   if (!isRetryWithFallback && !isAborted) {
                     isRetryWithFallback = true;
                     axiomLogger.info(
                       "Retrying with fallback model after step-start only",
                       {
                         chatId,
                         originalModel: selectedModel,
                         fallbackModel,
                         userId,
                       },
                     );
+
+                    // Recompute model-specific context for fallback
+                    const fallbackModelId =
+                      trackedProvider.languageModel(fallbackModel).modelId;
+                    isGeminiModel =
+                      fallbackModelId.includes("gemini") ||
+                      fallbackModelId.includes("google");
+                    currentSystemPrompt = await systemPrompt(
+                      userId,
+                      mode,
+                      subscription,
+                      fallbackModel,
+                      userCustomization,
+                      temporary,
+                      chat?.finish_reason,
+                      sandboxContext,
+                    );
+                    if (isGeminiModel) {
+                      finalMessages =
+                        stripReasoningFromMessagesForGemini(finalMessages);
+                    }

                     const retryResult = await createStream(fallbackModel);
                     const retryMessageId = generateId();
```

Also applies to: 650-665

🤖 Fix all issues with AI agents
In `@lib/api/chat-handler.ts`:
- Around line 711-739: The retry branch that iterates retryMessages and calls
saveMessage(...) builds newFileIds via getFileAccumulator() but never emits file
metadata to resumable clients; after the loop that saves assistant messages,
call sendFileMetadataToStream(...) (using the same chatId and the newFileIds
produced above, plus the same stream/resumable context used by the normal path)
to emit the file metadata so resumable clients receive it on fallback success;
ensure you place this call immediately after the retry save loop and preserve
the same parameters/behavior as the normal-path sendFileMetadataToStream
invocation.

The retry path was missing sendFileMetadataToStream call which left
resumable stream clients without file metadata on fallback success.
Also adds temporary chat handling (file metadata + temp stream cleanup)
that was missing from the retry flow.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@rossmanko rossmanko merged commit 06e618a into main Jan 30, 2026
3 of 4 checks passed