Feat/aigateway transcription no UI #5

Open
JJassonn69 wants to merge 34 commits into next from feat/aigateway-transcription-noUI
@JJassonn69 commented Jan 8, 2026

No UI changes, only the essentials for adding aigateway.

Notion instructions for testing locally.


Note

Adds backend AI transcription via github.com/muxionlabs/ai-go-sdk with channel-driven GStreamer integration and event publishing.

  • New AI session flow: AISessionResources, SetupAISession, and NewAISinkCallback in pkg/media/*, plus aigateway_logger and TranscriptStore on MediaManager
  • Ingest updates: MKV/RTMP pipelines tee H264/AAC samples to AI (via WHIP) and continue normal segmenting; transcripts published on bus as place.stream.ai#dataOutput
  • Config: adds --ai-gateway-base-url and --ai-gateway-pipeline flags to wire up the gateway
  • UI: websocket-consumer.tsx logs AI data output to console (no visual changes)
  • Tooling: VSCode launch entries, .gitignore for video files, and util/live_mkv_with_new_did.sh helper
  • Dependencies: add ai-go-sdk; bump pion/webrtc and related libs

Written by Cursor Bugbot for commit d120b76. This will update automatically on new commits.
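
For reviewers who want to picture the config change, here is a minimal sketch of how the two flags named in the summary might be parsed with Go's standard `flag` package. Only the flag names `--ai-gateway-base-url` and `--ai-gateway-pipeline` come from this PR; the `AIGatewayConfig` struct, the `parseAIFlags` helper, the empty defaults, the "enabled when a base URL is set" rule, and the example URL and pipeline name are all assumptions made for illustration.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// AIGatewayConfig is a hypothetical holder for the two flags named in the PR
// summary; the real project may store them elsewhere.
type AIGatewayConfig struct {
	BaseURL  string
	Pipeline string
}

// Enabled reports whether a gateway URL was configured (an assumed convention).
func (c AIGatewayConfig) Enabled() bool { return c.BaseURL != "" }

// parseAIFlags parses args with a private FlagSet so it does not touch the
// process-wide flag state.
func parseAIFlags(args []string) (AIGatewayConfig, error) {
	var cfg AIGatewayConfig
	fs := flag.NewFlagSet("streamplace", flag.ContinueOnError)
	fs.StringVar(&cfg.BaseURL, "ai-gateway-base-url", "", "base URL of the AI gateway")
	fs.StringVar(&cfg.Pipeline, "ai-gateway-pipeline", "", "name of the AI pipeline to run")
	if err := fs.Parse(args); err != nil {
		return AIGatewayConfig{}, err
	}
	// Normalize so later path joins do not produce double slashes.
	cfg.BaseURL = strings.TrimRight(cfg.BaseURL, "/")
	return cfg, nil
}

func main() {
	// The URL and pipeline name below are placeholders, not values from the PR.
	cfg, err := parseAIFlags([]string{
		"--ai-gateway-base-url", "http://localhost:8935/",
		"--ai-gateway-pipeline", "transcription",
	})
	if err != nil {
		fmt.Println("flag error:", err)
		return
	}
	fmt.Println(cfg.Enabled(), cfg.BaseURL, cfg.Pipeline)
}
```

Leaving both flags unset yields a config whose `Enabled()` is false, which is one plausible way to keep the AI path opt-in.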

eliteprox and others added 21 commits December 13, 2025 19:08
- Add .mp4 and .mkv to .gitignore
- Update vscode launch configs to use node debugger with bash for better process control
- Add compound launch config for running streamplace + app dev server together
- Add subtitle toggle to viewer context menu
- Implement subtitle support in HLS player with proper track management
- Change default player protocol from WebRTC to HLS
- Add aigateway package for AI transcription gateway client with session
- Add subtitle offset controls to viewer context menu with +1s/-1s/reset options
- Display current subtitle offset in menu title and adjustment UI
- Pass subtitle offset as query parameter to HLS source URL
- Add subtitleOffsetMS state to player store with getter/setter
- Implement structured transcript segments with timing in aigateway
- Add TranscriptSegment and WordTimestamp types for precise subtitle timing
- Refactor TranscriptStore to use segments
Refactor AI gateway client to try multiple endpoint paths for better compatibility with different gateway versions. Remove RewriteURLsTo config option in favor of automatic URL normalization based on the configured base URL. Add StopURL field to Session response. Update TranscriptEvent to use RFC3339 timestamp format instead of milliseconds. Remove unused error tracking from RTMPPublisher and legacy transcript segment conversion logic.
Remove SegmentAndSignElemH264Parse function which was a duplicate of SegmentAndSignElem with identical functionality. Update WebRTCIngest to use the standard SegmentAndSignElem method instead.
Add debug segment dump when MP4 validation fails after signing to help diagnose segment validation issues.
Remove transient mux error retry logic and excessive timestamp zeroing from ConvergeSegment. Add debug segment dump at start of convergence and per-attempt debug file output when SegmentDebugDir is configured. Simplify MP4 metadata zeroing to only handle bitrate fields instead of all timestamps.
Decrease pipeline start timeout in MP4ToMPEGTS from 60 seconds to 10 seconds to fail faster when conversion stalls. Add missing newline at end of segment_converge.go.
Remove AIGatewayPathPrefix config option and related PathPrefix field from aigateway.Config. Simplify endpoint discovery to only try base paths without prefix variants, removing the /gateway/ path candidates that were previously used for proxy compatibility.
Remove the Text field from TranscriptEvent struct in favor of structured Segments data. Simplify SSE payload parsing to expect single-event format wrapped in array instead of handling multiple legacy formats. Update debug logging to remove text field reference.
Remove automatic deduplication of transcript segments by eliminating the seen map tracking and stableSegmentID generation. Remove segment replacement logic that attempted to update segments with similar timestamps. Remove sorting of segments in GetSegments and GenerateVTTForSegment, relying on insertion order instead. Simplify addSeg to only enforce MaxTranscriptEvents limit without deduplication checks.
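
The simplified store behaviour described in this commit (append in insertion order, no deduplication, no sorting, only a size cap) can be sketched roughly as follows. The names `TranscriptStore`, `TranscriptSegment`, `addSeg`, `GetSegments`, and `MaxTranscriptEvents` appear in this PR; the struct fields, the constructor, and the choice to drop the oldest segments when the cap is exceeded are assumptions.

```go
package main

import (
	"fmt"
	"sync"
)

// TranscriptSegment mirrors the type named in the PR; these fields are assumed.
type TranscriptSegment struct {
	StartMS int64
	EndMS   int64
	Text    string
}

// TranscriptStore keeps segments in insertion order, capped at max entries.
type TranscriptStore struct {
	mu   sync.Mutex
	max  int // plays the role of MaxTranscriptEvents
	segs []TranscriptSegment
}

// NewTranscriptStore is a hypothetical constructor; max <= 0 means no cap.
func NewTranscriptStore(max int) *TranscriptStore {
	return &TranscriptStore{max: max}
}

// addSeg appends without any deduplication and trims the oldest entries once
// the cap is exceeded (dropping the oldest is an assumption).
func (s *TranscriptStore) addSeg(seg TranscriptSegment) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.segs = append(s.segs, seg)
	if s.max > 0 && len(s.segs) > s.max {
		// Copy the tail so the trimmed prefix can be garbage collected.
		s.segs = append([]TranscriptSegment(nil), s.segs[len(s.segs)-s.max:]...)
	}
}

// GetSegments returns a copy in insertion order; no sorting is applied.
func (s *TranscriptStore) GetSegments() []TranscriptSegment {
	s.mu.Lock()
	defer s.mu.Unlock()
	return append([]TranscriptSegment(nil), s.segs...)
}

func main() {
	store := NewTranscriptStore(2)
	store.addSeg(TranscriptSegment{StartMS: 0, EndMS: 900, Text: "hello"})
	store.addSeg(TranscriptSegment{StartMS: 900, EndMS: 1800, Text: "hello"}) // duplicate text is kept
	store.addSeg(TranscriptSegment{StartMS: 1800, EndMS: 2500, Text: "world"})
	for _, seg := range store.GetSegments() {
		fmt.Println(seg.StartMS, seg.EndMS, seg.Text)
	}
}
```

Relying on insertion order is only safe if the gateway emits segments in time order, which this sketch does not check.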
…ayer components

- Eliminated subtitle display logic and offset management from the context menu and video player.
- Updated player state and store to remove subtitle-related properties and methods.
- Adjusted media handling to focus on WebRTC protocol, removing RTMP dependencies.
- Cleaned up AI gateway configurations by removing unused RTMP settings.
@JJassonn69 marked this pull request as draft January 8, 2026 19:02
@eliteprox left a comment

We should also explore moving aigateway into a Go module in a separate repo under the muxionlabs org for reusability.

Comment thread pkg/aigateway/aigateway.go Outdated
base := strings.TrimRight(cfg.BaseURL, "/")
startCandidates := []string{
base + "/process/stream/start",
base + "/ai/stream/start",

This can be removed when using the latest muxionlabs/go-livepeer:transcoding-with-ai.

Comment thread pkg/media/mkv_ingest.go

We can try moving this into segment_conv.go to see if it will cover WHIP ingest also. See https://github.com/streamplace/streamplace/pull/810/changes#diff-8d7219fc0d142be71d6d322c932bbd3cc7c63c1a11a7d8d4cc942138f753d8bd

Comment thread pkg/aigateway/aigateway.go Outdated
eliteprox and others added 7 commits January 8, 2026 19:51
- Removed the aigateway package and its related files, streamlining the codebase.
- Updated go.mod to reflect changes in dependencies, including the addition of muxionlabs/ai-go-sdk.
- Adjusted launch configuration for local development to use the correct AI gateway URL.
- Enhanced media handling by integrating the new ai-go-sdk for AI gateway interactions.
- Updated go.mod to use version 0.1.1 of muxionlabs/ai-go-sdk.
- Enhanced LivestreamState to include transcripts, allowing for real-time AI-generated text segments.
- Implemented handling of transcript segments in websocket consumer, ensuring they are stored and published correctly.
- Added new lexicon definitions for transcript segments and words to support structured data handling.
- Refactored media ingestion to initiate AI sessions and process transcript events, improving the overall media management workflow.
- Replaced AI gateway stream configuration with a new structure for better clarity.
- Increased buffer sizes for audio and video channels to improve performance.
- Adjusted pipeline configuration for video processing to streamline data flow.
- Removed redundant context handling for transcript streaming, ensuring cleaner code.
- Introduced AISessionResources structure to encapsulate video and audio channels along with cleanup functionality.
- Simplified transcript event handling by centralizing the publishing logic in PublishTranscriptToBus.
- Updated MKV and RTMP ingest methods to utilize the new AISessionResources for improved clarity and maintainability.
- Removed redundant code related to transcript event processing, enhancing overall code cleanliness.
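
As a rough illustration of the channel-driven design these commits describe, the sketch below groups buffered video and audio channels with a one-shot cleanup hook. Only the name `AISessionResources` and the idea of video/audio channels plus cleanup come from the PR; the `Sample` type, the field names, the constructor, the buffer size, and the drop-when-full policy in `offer` are assumptions.

```go
package main

import (
	"fmt"
	"sync"
)

// Sample is a hypothetical container for one encoded H264 or AAC buffer.
type Sample struct {
	Data []byte
	PTS  int64
}

// AISessionResources groups the channels feeding the AI session with a
// cleanup hook. The field layout here is assumed, not taken from the PR.
type AISessionResources struct {
	Video   chan Sample
	Audio   chan Sample
	once    sync.Once
	cleanup func()
}

// NewAISessionResources is a hypothetical constructor; buf sets both buffers.
func NewAISessionResources(buf int, cleanup func()) *AISessionResources {
	return &AISessionResources{
		Video:   make(chan Sample, buf),
		Audio:   make(chan Sample, buf),
		cleanup: cleanup,
	}
}

// offer does a non-blocking send so a slow AI consumer cannot stall the
// ingest pipeline; it returns false when the sample was dropped.
func offer(ch chan<- Sample, s Sample) bool {
	select {
	case ch <- s:
		return true
	default:
		return false
	}
}

// Close closes both channels and runs cleanup exactly once. Producers must
// stop calling offer before Close, since sending on a closed channel panics.
func (r *AISessionResources) Close() {
	r.once.Do(func() {
		close(r.Video)
		close(r.Audio)
		if r.cleanup != nil {
			r.cleanup()
		}
	})
}

func main() {
	res := NewAISessionResources(1, func() { fmt.Println("session cleaned up") })
	fmt.Println(offer(res.Video, Sample{PTS: 0}))  // fits in the buffer
	fmt.Println(offer(res.Video, Sample{PTS: 33})) // buffer full, sample dropped
	res.Close()
	res.Close() // safe: cleanup runs only once
}
```

Dropping samples when the buffer is full trades transcript completeness for ingest stability; larger buffers, as one of the commits above adopts, reduce how often that happens.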
@JJassonn69 marked this pull request as ready for review January 9, 2026 18:28
@cursor (bot) left a comment

This PR is being reviewed by Cursor Bugbot

Comment thread .vscode/launch.json
Comment thread pkg/media/rtmp_ingest.go
4 participants