
usage by action + ux sanding#489

Open
Rish-it wants to merge 1 commit into getnao:main from Rish-it:feat/usage-by-action

Conversation


@Rish-it Rish-it commented Mar 20, 2026

Summary

Implements #316 — surfaces LLM cost breakdown by action type (chat, test, memory, voice) in the admin Usage & Costs panel.

Backend

  • Schema: Added an estimated_cost column to the llm_inference table (both SQLite and PostgreSQL) for voice transcription costs, which use per-minute pricing instead of per-token pricing
  • Inference tracking: Every LLM call now saves a record to llm_inference via scheduleSaveLlmInferenceRecord:
    • chat — streaming responses in AgentManager.stream()
    • title_generation — chat title generation in AgentManager._generateTitle()
    • test — test runs and verifications in TestAgentService
    • voice — audio transcription in transcribeAudio() with duration-based cost
  • Usage query: New getActionCostsByDate() query aggregates costs from llm_inference grouped by action type, merged with existing message usage data. Chat cost includes chat, compaction, and title_generation types. Voice cost uses the pre-computed estimated_cost column instead of token math.
  • Date alignment fix: Usage records now merge dates from both message and action queries (previously action-only dates were silently dropped)
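
The date-alignment fix above can be sketched roughly as follows. This is an illustrative sketch, not the actual usage.queries.ts code; the record shapes and field names are assumptions:

```typescript
// Hypothetical row shapes for illustration; the real types live in the backend.
interface MessageUsageRow { date: string; inputCost: number; outputCost: number; }
interface ActionCostRow { date: string; chatCost: number; testCost: number; memoryCost: number; voiceCost: number; }

// Merge dates from BOTH queries so action-only dates are no longer dropped.
function mergeUsageByDate(messages: MessageUsageRow[], actions: ActionCostRow[]) {
  const messagesByDate = new Map(messages.map((r) => [r.date, r]));
  const actionsByDate = new Map(actions.map((r) => [r.date, r]));
  const allDates = [...new Set([...messagesByDate.keys(), ...actionsByDate.keys()])].sort();

  return allDates.map((date) => ({
    date,
    inputCost: messagesByDate.get(date)?.inputCost ?? 0,
    outputCost: messagesByDate.get(date)?.outputCost ?? 0,
    chatCost: actionsByDate.get(date)?.chatCost ?? 0,
    testCost: actionsByDate.get(date)?.testCost ?? 0,
    memoryCost: actionsByDate.get(date)?.memoryCost ?? 0,
    voiceCost: actionsByDate.get(date)?.voiceCost ?? 0,
  }));
}
```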

Frontend

  • Added "Actions" chart view to the usage page dropdown — stacked bar chart showing Chat, Test, Memory, and Voice costs over time
  • Granularity and provider filters apply to the actions view
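
Shaping the merged records for the stacked bar chart is a small transform along these lines (sketch only; the series keys and record shape are assumptions, and the real component lives in the usage route):

```typescript
interface UsageRecord { date: string; chatCost: number; testCost: number; memoryCost: number; voiceCost: number; }

// One row per date, one key per action type, ready for a stacked bar series.
function toStackedSeries(records: UsageRecord[]) {
  return records.map((r) => ({
    date: r.date,
    Chat: r.chatCost,
    Test: r.testCost,
    Memory: r.memoryCost,
    Voice: r.voiceCost,
  }));
}
```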

UX sanding

  • Added back button on story preview page to return to /stories

Test plan

  • Run npm run db:push to apply schema changes
  • Send chat messages and verify llm_inference records are created with type = 'chat'
  • Run nao test and verify type = 'test' records appear
  • Use voice transcription and verify type = 'voice' records with estimated_cost
  • Open admin Usage & Costs → select "Actions" view → verify stacked bar chart renders with 4 series
  • Toggle granularity (hour/day/month) and provider filter — chart updates correctly
  • Verify existing chart views (Messages, Tokens, Cost) are unaffected
  • Verify story preview back button navigates to /stories
  • npm run lint passes

Copilot AI review requested due to automatic review settings March 20, 2026 08:13

@cubic-dev-ai cubic-dev-ai bot left a comment


2 issues found across 15 files


<file name="apps/backend/src/services/transcribe.service.ts">

<violation number="1" location="apps/backend/src/services/transcribe.service.ts:85">
P2: Transcription cost calculation silently falls back to zero for missing duration or unknown model pricing, causing underreported voice usage costs.</violation>
</file>

<file name="apps/backend/src/queries/usage.queries.ts">

<violation number="1" location="apps/backend/src/queries/usage.queries.ts:70">
P2: New action-cost feature uses `llm_inference` provider filtering, but provider options are still sourced only from `chat_message`, so some valid providers may be impossible to select in the UI.</violation>
</file>


apps/backend/src/services/transcribe.service.ts (excerpt under review):

durationInSeconds: number | undefined,
): number {
if (!durationInSeconds) {
return 0;

@cubic-dev-ai cubic-dev-ai bot Mar 20, 2026


P2: Transcription cost calculation silently falls back to zero for missing duration or unknown model pricing, causing underreported voice usage costs.


<file context>
@@ -63,6 +76,19 @@ export async function getAvailableModels(projectId: string) {
+	durationInSeconds: number | undefined,
+): number {
+	if (!durationInSeconds) {
+		return 0;
+	}
+	const modelDef = TRANSCRIBE_PROVIDERS[provider].models.find((m) => m.id === modelId);
</file context>
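
A sketch of the kind of fix this comment suggests: surface the gap (here via a warning) instead of silently under-reporting. The TRANSCRIBE_PROVIDERS shape and the per-minute pricing field are assumptions inferred from the diff context, not the real module:

```typescript
// Hypothetical pricing table mirroring the shape implied by the diff context.
const TRANSCRIBE_PROVIDERS: Record<string, { models: { id: string; costPerMinute: number }[] }> = {
  openai: { models: [{ id: 'whisper-1', costPerMinute: 0.006 }] },
};

function estimateTranscriptionCost(
  provider: string,
  modelId: string,
  durationInSeconds: number | undefined,
): number {
  if (!durationInSeconds) {
    // Make the fallback visible instead of silently recording 0.
    console.warn(`Missing duration for ${provider}/${modelId}; voice cost recorded as 0`);
    return 0;
  }
  const modelDef = TRANSCRIBE_PROVIDERS[provider]?.models.find((m) => m.id === modelId);
  if (!modelDef) {
    console.warn(`Unknown transcription model ${provider}/${modelId}; voice cost recorded as 0`);
    return 0;
  }
  // Per-minute pricing: convert seconds to minutes.
  return (durationInSeconds / 60) * modelDef.costPerMinute;
}
```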

apps/backend/src/queries/usage.queries.ts (excerpt under review):

inputCacheReadCost: Number(row.inputCacheReadCost ?? 0),
inputCacheWriteCost: Number(row.inputCacheWriteCost ?? 0),
outputCost: Number(row.outputCost ?? 0),
const actionCostsByDate = await getActionCostsByDate(projectId, filter);

@cubic-dev-ai cubic-dev-ai bot Mar 20, 2026


P2: New action-cost feature uses llm_inference provider filtering, but provider options are still sourced only from chat_message, so some valid providers may be impossible to select in the UI.


<file context>
@@ -71,30 +67,44 @@ export const getMessagesUsage = async (projectId: string, filter: UsageFilter):
-			inputCacheReadCost: Number(row.inputCacheReadCost ?? 0),
-			inputCacheWriteCost: Number(row.inputCacheWriteCost ?? 0),
-			outputCost: Number(row.outputCost ?? 0),
+	const actionCostsByDate = await getActionCostsByDate(projectId, filter);
+
+	const messagesByDate = new Map(rows.map((row) => [row.date, row]));
</file context>
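
One way to address this comment is to union the providers seen in both tables when building the filter options, so llm_inference-only providers remain selectable. A minimal sketch (the function name is hypothetical; the real provider lists would come from queries against chat_message and llm_inference):

```typescript
// Union providers from both sources so the UI can filter on any of them.
function mergeProviderOptions(messageProviders: string[], inferenceProviders: string[]): string[] {
  return [...new Set([...messageProviders, ...inferenceProviders])].sort();
}
```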


Copilot AI left a comment


Pull request overview

Implements issue #316 by extending LLM inference tracking and surfacing cost breakdowns by action type (chat/test/memory/voice) in the admin Usage & Costs panel, plus a small Stories preview navigation improvement.

Changes:

  • Add estimated_cost to llm_inference (SQLite + Postgres) and record additional inference types (chat, test, voice) across agent/test/transcribe flows.
  • Extend usage aggregation to compute per-action daily/hourly/monthly costs from llm_inference and merge them with existing message usage data (including action-only dates).
  • Add an “Actions” chart view on the frontend usage page and add a back button on the story preview page.

Reviewed changes

Copilot reviewed 14 out of 15 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
package-lock.json Lockfile updates reflecting dependency metadata changes and added optional packages.
apps/frontend/src/routes/_sidebar-layout.stories.preview.$chatId.$storyId.tsx Adds a back button in the story preview header linking back to the Stories list.
apps/frontend/src/routes/_sidebar-layout.settings.usage.tsx Adds “Actions” chart view rendering cost series by action type.
apps/frontend/src/components/settings/usage-filters.tsx Extends chart view selector with the new actions option.
apps/backend/src/utils/date.ts Ensures fillMissingDates initializes new action-cost fields to 0.
apps/backend/src/types/usage.ts Extends UsageRecord with chatCost, testCost, memoryCost, voiceCost.
apps/backend/src/types/llm.ts Expands LLM_INFERENCE_TYPES to include chat, test, voice.
apps/backend/src/trpc/transcribe.routes.ts Passes ctx.user.id into transcription to attribute usage to a user.
apps/backend/src/services/transcribe.service.ts Tracks voice transcription in llm_inference using duration-based estimatedCost.
apps/backend/src/services/test-agent.service.ts Adds LLM inference tracking for test runs/verifications and threads userId through.
apps/backend/src/services/agent.ts Tracks chat streaming calls and title_generation in llm_inference.
apps/backend/src/routes/test.ts Threads authenticated userId into test agent calls for attribution.
apps/backend/src/queries/usage.queries.ts Adds getActionCostsByDate, merges action costs into usage output, and fixes action-only date dropping.
apps/backend/src/db/sqlite-schema.ts Adds estimated_cost column to SQLite llm_inference.
apps/backend/src/db/pg-schema.ts Adds estimated_cost column to Postgres llm_inference.


Comment on lines 36 to 40
updatedAt: Date.now(),
messages: [userMessage],
userId: 'test',
projectId,
};

Copilot AI Mar 20, 2026


runTest now accepts a userId (and uses it for the inference record), but the temporary chat passed into this.create(...) still hardcodes userId: 'test'. This means the agent run will load memory/context for a non-existent "test" user and can diverge from the usage attribution stored in llm_inference. Consider setting tempChat.userId to the passed userId (or, if intentionally anonymous, keep it consistent by also attributing the inference record to the same user).
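
The fix this comment asks for is small: thread the caller's userId into the temporary chat instead of hardcoding 'test'. A sketch under the shape implied by the quoted diff (the chat fields and helper name are assumptions):

```typescript
interface TempChat { updatedAt: number; messages: unknown[]; userId: string; projectId: string; }

function buildTempChat(userId: string, projectId: string, userMessage: unknown): TempChat {
  return {
    updatedAt: Date.now(),
    messages: [userMessage],
    userId, // was hardcoded to 'test'; now matches the llm_inference attribution
    projectId,
  };
}
```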

@github-actions

🚀 Preview Deployment

URL https://pr-489-3356974.preview.getnao.io
Commit 3356974

⚠️ No LLM API keys configured - you'll see the API key setup flow when trying to chat.


Preview will be automatically removed when this PR is closed.


Bl3f commented Mar 20, 2026

Hey, it seems the migration files are missing.

@cainemerrick98

Hey, is it possible to add a column to the llm_inference table to capture token usage by file reads, tool use, etc. (e.g. split a chat message by its message parts)? As an admin, the total cost and token usage from file reads in particular would be useful information, since it could guide me in making my context more efficient.
