feat: add transcription billing via SQS events #141
AnthonyRonning wants to merge 2 commits into master
Walkthrough
Appends PCR snapshot entries to dev and prod history JSON files, adds optional
Sequence Diagram
sequenceDiagram
participant Client
participant TranscriptionHandler as Transcription<br/>Handler
participant Estimator as Duration<br/>Estimator
participant Provider as Transcription<br/>Provider
participant SQS as SQS<br/>Publisher
Client->>TranscriptionHandler: Upload/submit audio
TranscriptionHandler->>Provider: Send audio for transcription (stream/chunk)
Provider-->>TranscriptionHandler: Transcription result + provider_name
TranscriptionHandler->>Estimator: estimate_audio_duration_seconds(file_size, content_type)
Estimator-->>TranscriptionHandler: duration_seconds
TranscriptionHandler->>SQS: publish_transcription_usage_event(UsageEvent{audio_seconds,event_type,provider})
SQS-->>TranscriptionHandler: publish acknowledgment
TranscriptionHandler-->>Client: return transcription result (includes provider)
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ 3 passed
Greptile Overview
Greptile Summary
This PR implements transcription billing by publishing SQS events with audio duration estimates. The implementation:
The duration estimation uses approximate average byte rates for common audio formats (MP3: 20KB/s, WAV: 176KB/s, OGG: 14KB/s, FLAC: 88KB/s, AAC: 20KB/s). The billing server will handle cost calculation based on the estimated audio_seconds.
Key observation: The implementation tracks only the first provider used when multiple chunks are processed via different providers (lines 1292-1300). This limitation was already noted in previous review threads.
Confidence Score: 4/5
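The byte rates quoted in the summary follow from simple bitrate arithmetic. A minimal sketch of that arithmetic, assuming the typical bitrates named in the code comments of this PR; the helper names `kbps_to_bytes_per_second` and `pcm_bytes_per_second` are hypothetical and do not appear in the PR:

```rust
/// Convert a compressed-audio bitrate in kilobits per second to bytes per second.
/// (Hypothetical helper, for illustration only.)
fn kbps_to_bytes_per_second(kbps: f64) -> f64 {
    // 1 kilobit = 1000 bits, 8 bits = 1 byte
    kbps * 1000.0 / 8.0
}

/// Byte rate of uncompressed PCM audio such as WAV.
/// (Hypothetical helper, for illustration only.)
fn pcm_bytes_per_second(sample_rate_hz: u32, bits_per_sample: u32, channels: u32) -> f64 {
    f64::from(sample_rate_hz) * f64::from(bits_per_sample) / 8.0 * f64::from(channels)
}
```

Under these assumptions, 160 kbps gives 20,000 bytes/s (the MP3 and AAC figure), 112 kbps gives 14,000 bytes/s (the OGG figure), and 44.1 kHz 16-bit stereo PCM gives 176,400 bytes/s (the WAV figure). Real files vary in bitrate, so the resulting duration is an estimate.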
Sequence Diagram
sequenceDiagram
participant Client
participant OpenAI Handler
participant AudioSplitter
participant Retry Logic
participant Provider (Tinfoil/Continuum)
participant SQS Publisher
participant Billing Server
Client->>OpenAI Handler: POST /v1/audio/transcriptions
OpenAI Handler->>OpenAI Handler: decode base64 audio
OpenAI Handler->>OpenAI Handler: validate file size
OpenAI Handler->>AudioSplitter: split audio into chunks
AudioSplitter-->>OpenAI Handler: return chunks
par Process chunks in parallel
OpenAI Handler->>Retry Logic: send_transcription_with_retries(chunk 0)
Retry Logic->>Provider (Tinfoil/Continuum): try primary provider
Provider (Tinfoil/Continuum)-->>Retry Logic: response
Retry Logic-->>OpenAI Handler: (response, provider_name)
and
OpenAI Handler->>Retry Logic: send_transcription_with_retries(chunk 1)
Retry Logic->>Provider (Tinfoil/Continuum): try primary/fallback
Provider (Tinfoil/Continuum)-->>Retry Logic: response
Retry Logic-->>OpenAI Handler: (response, provider_name)
end
OpenAI Handler->>OpenAI Handler: merge transcription results
OpenAI Handler->>OpenAI Handler: estimate_audio_duration_seconds()
OpenAI Handler->>SQS Publisher: publish_transcription_usage_event()
Note over SQS Publisher: event_type: "transcription"<br/>audio_seconds: estimated duration<br/>provider_name: first successful provider
SQS Publisher->>Billing Server: send UsageEvent to SQS queue
OpenAI Handler->>Client: return encrypted transcription
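The chunk-processing part of the diagram above can be sketched roughly as follows. This is a minimal sketch, not the handler in src/web/openai.rs: `transcribe_chunks` and the closure signature are hypothetical, and `std::thread::scope` stands in for the service's async runtime so the example stays self-contained. Each worker returns its chunk index, text, and provider name; the results are merged in index order, and the first chunk's provider is reported.

```rust
use std::thread;

/// Transcribe chunks in parallel, then merge the text in chunk order.
/// Returns the merged text and the provider of the first chunk.
/// (Hypothetical function; names and types are assumptions.)
fn transcribe_chunks<F>(chunks: &[Vec<u8>], transcribe: F) -> Result<(String, String), String>
where
    F: Fn(usize, &[u8]) -> Result<(String, String), String> + Sync,
{
    let mut results: Vec<(usize, String, String)> = thread::scope(|s| {
        // Spawn one worker per chunk; each returns (index, text, provider).
        let handles: Vec<_> = chunks
            .iter()
            .enumerate()
            .map(|(i, chunk)| {
                let transcribe = &transcribe;
                s.spawn(move || {
                    transcribe(i, chunk.as_slice()).map(|(text, provider)| (i, text, provider))
                })
            })
            .collect();
        // Any failed chunk fails the whole request in this sketch.
        handles
            .into_iter()
            .map(|h| h.join().expect("transcription worker panicked"))
            .collect::<Result<Vec<_>, String>>()
    })?;

    // Sort by chunk index so the merge stays correct even if results arrive out of order.
    results.sort_by_key(|(i, _, _)| *i);
    let provider = results
        .first()
        .map(|(_, _, p)| p.clone())
        .unwrap_or_default();
    let text = results
        .iter()
        .map(|(_, t, _)| t.as_str())
        .collect::<Vec<_>>()
        .join(" ");
    Ok((text, provider))
}
```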
97d55df to eaff978
src/web/openai.rs (Outdated)
    estimated_cost: BigDecimal::from(0),
    chat_time: Utc::now(),
    is_api_request,
    provider_name: "continuum".to_string(),
`provider_name` is hardcoded to "continuum", but transcriptions can be fulfilled by either the Tinfoil or the Continuum provider (see `send_transcription_with_retries`). Consider tracking which provider actually fulfilled the request, similar to how chat completions track `successful_provider` (line 633).
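One way to act on this suggestion is to have the retry helper return the provider name alongside the response. A minimal sketch, assuming a primary/fallback pair; `send_with_fallback` and its closure parameter are hypothetical and are not the signature of `send_transcription_with_retries`:

```rust
/// Try the primary provider, then the fallback, and report which one succeeded.
/// (Hypothetical helper; the real retry logic in the PR may differ.)
fn send_with_fallback<T, E>(
    primary: &str,
    fallback: &str,
    mut attempt: impl FnMut(&str) -> Result<T, E>,
) -> Result<(T, String), E> {
    match attempt(primary) {
        // The primary provider fulfilled the request.
        Ok(resp) => Ok((resp, primary.to_string())),
        // Primary failed: try the fallback and tag the response with its name.
        Err(_) => attempt(fallback).map(|resp| (resp, fallback.to_string())),
    }
}
```

The caller can then pass the returned name into the usage event instead of a hardcoded string.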
Adds audio duration estimation based on file size and content type, then publishes SQS events with event_type='transcription' and audio_seconds for the billing server to process.
- Tracks actual provider (tinfoil/continuum) that fulfilled request
- Publishes asynchronously via tokio::spawn (non-blocking)
- Rounds up to nearest second
Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
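The rounding and fire-and-forget publishing described in this commit message could look roughly like the sketch below. It rests on several assumptions: the `UsageEvent` fields shown are only the three named in this PR, `billable_seconds` and `publish_in_background` are hypothetical names, `std::thread::spawn` is swapped in for `tokio::spawn`, and an in-process channel stands in for the SQS queue.

```rust
use std::sync::mpsc;
use std::thread;

/// Simplified stand-in for the usage event (assumed fields only).
struct UsageEvent {
    event_type: String,
    audio_seconds: u64,
    provider_name: String,
}

/// Round an estimated duration up to whole seconds; negative input clamps to 0.
fn billable_seconds(estimated_seconds: f64) -> u64 {
    estimated_seconds.max(0.0).ceil() as u64
}

/// Publish without blocking the caller. A thread and channel stand in
/// for tokio::spawn and the SQS publisher, respectively.
fn publish_in_background(
    event: UsageEvent,
    queue: mpsc::Sender<UsageEvent>,
) -> thread::JoinHandle<()> {
    thread::spawn(move || {
        // A publish failure is logged, not propagated to the request path.
        if let Err(err) = queue.send(event) {
            eprintln!("failed to publish usage event: {err}");
        }
    })
}
```

Because the publish runs off the request path, a queue failure would not fail the transcription response in this sketch; whether that trade-off is acceptable for billing is a design decision for the PR author.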
eaff978 to ab17c23
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/web/openai.rs (1)
1271-1285: ⚠️ Potential issue | 🟡 Minor
Fix formatting to pass CI.
The pipeline indicates the `cargo fmt` check failed in this region. Run `cargo fmt --all` to fix the formatting issues.
+ let mut successful_provider = String::new();
  for result in results {
      match result {
-         Ok(r) => successful_results.push(r),
+         Ok((index, value, provider)) => {
+             successful_results.push((index, value));
+             // Track the provider (use the first one, or could use the last)
+             if successful_provider.is_empty() {
+                 successful_provider = provider;
+             }
When multiple chunks succeed via different providers (e.g., chunk 0 via Tinfoil, chunk 1 via Continuum), only the first provider is tracked for billing. Consider tracking all providers or using the most common one.
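The "most common provider" option could be sketched as below. This is an illustration only: `most_common_provider` is a hypothetical helper, and the tie-breaking rule (alphabetically first name wins) is an assumption chosen to keep the result deterministic.

```rust
use std::collections::BTreeMap;

/// Pick the provider that served the most chunks.
/// Ties go to the alphabetically first name; returns None for no chunks.
/// (Hypothetical helper, not part of the PR.)
fn most_common_provider(providers: &[String]) -> Option<String> {
    let mut counts: BTreeMap<&str, usize> = BTreeMap::new();
    for p in providers {
        *counts.entry(p.as_str()).or_insert(0) += 1;
    }
    counts
        .into_iter()
        // Higher count wins; on equal counts the reversed name comparison
        // makes the alphabetically first name the maximum.
        .max_by(|a, b| a.1.cmp(&b.1).then_with(|| b.0.cmp(a.0)))
        .map(|(name, _)| name.to_string())
}
```

Tracking all providers instead would mean changing the event shape, which depends on what the billing server accepts.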
fn estimate_audio_duration_seconds(file_size: usize, content_type: &str) -> f64 {
    // Bytes per second for common audio formats (approximate averages)
    let bytes_per_second: f64 = match content_type.to_lowercase().as_str() {
        // MP3: typically 128-192kbps, use ~160kbps = 20KB/s
        "audio/mpeg" | "audio/mp3" => 20_000.0,
        // WAV: 44.1kHz, 16-bit stereo = 176.4KB/s
        "audio/wav" | "audio/x-wav" | "audio/wave" => 176_400.0,
        // OGG/Opus/WebM: typically ~96-128kbps = ~12-16KB/s, use 14KB/s
        "audio/ogg" | "audio/opus" | "audio/webm" => 14_000.0,
        // FLAC: lossless, roughly half of WAV = ~88KB/s
        "audio/flac" | "audio/x-flac" => 88_000.0,
        // AAC/M4A: similar to MP3, ~160kbps = 20KB/s
        "audio/aac" | "audio/m4a" | "audio/mp4" | "audio/x-m4a" => 20_000.0,
        // Default: assume MP3-like compression
        _ => 20_000.0,
    };

    file_size as f64 / bytes_per_second
}
Consider adding a test case for zero or very small file sizes to ensure duration estimation handles edge cases gracefully.
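Such edge-case tests might look like the sketch below. The function body is a condensed copy of the estimator quoted in this review so the example is self-contained; the test names are hypothetical and are not the tests added in the PR.

```rust
/// Condensed copy of the estimator under review (same byte rates).
fn estimate_audio_duration_seconds(file_size: usize, content_type: &str) -> f64 {
    let bytes_per_second: f64 = match content_type.to_lowercase().as_str() {
        "audio/mpeg" | "audio/mp3" => 20_000.0,
        "audio/wav" | "audio/x-wav" | "audio/wave" => 176_400.0,
        "audio/ogg" | "audio/opus" | "audio/webm" => 14_000.0,
        "audio/flac" | "audio/x-flac" => 88_000.0,
        "audio/aac" | "audio/m4a" | "audio/mp4" | "audio/x-m4a" => 20_000.0,
        // Unknown types fall back to MP3-like compression
        _ => 20_000.0,
    };
    file_size as f64 / bytes_per_second
}

#[cfg(test)]
mod edge_case_tests {
    use super::*;

    #[test]
    fn zero_bytes_is_zero_seconds() {
        assert_eq!(estimate_audio_duration_seconds(0, "audio/mpeg"), 0.0);
    }

    #[test]
    fn tiny_file_is_positive_but_under_one_second() {
        let d = estimate_audio_duration_seconds(1, "audio/wav");
        assert!(d > 0.0 && d < 1.0);
    }

    #[test]
    fn unknown_content_type_uses_default_rate() {
        assert_eq!(
            estimate_audio_duration_seconds(20_000, "application/octet-stream"),
            1.0
        );
    }
}
```

Since every byte rate in the table is a nonzero constant, a zero-byte input yields exactly 0.0 and there is no division-by-zero path to guard.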
ab17c23 to 61bd304
Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
Adds audio transcription billing by publishing SQS events with:
- event_type: "transcription"
- audio_seconds: estimated duration based on file size and content type
Duration estimation lookup table:
The billing server will handle the actual cost calculation based on audio_seconds.
Includes unit tests for the duration estimation function.
Summary by CodeRabbit