Integrate MLX for on-device AI assistant and multi-platform streaming #239
arkavo-com wants to merge 14 commits into main
Conversation
Integrates mlx-swift-examples v2 into MuseCore for local LLM inference, with a streaming chat UI in ArkavoCreator that helps creators draft, rewrite, and adapt content across platforms (Bluesky, YouTube, Twitch, Reddit, Micro.blog) — no backend required.

MuseCore additions:
- StreamingLLMProvider protocol for async token streams
- MLXBackend wrapping the MLXLMCommon generate API
- ModelRegistry catalog (Gemma 270M default, Qwen 3.5 scale-up)
- ModelManager for lifecycle, memory budgeting, GPU coexistence

ArkavoCreator additions:
- PlatformContext protocol with per-platform constraints/actions
- AssistantPromptBuilder for context-aware system prompts
- AssistantChatView (full section) and AssistantPanelView (Cmd+Shift+A)
- AssistantViewModel with streaming generation and auto-unload in Studio
- localAssistant feature flag (ships enabled, independent of aiAgent)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
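A minimal sketch of what a protocol for async token streams like the StreamingLLMProvider above can look like — everything beyond the protocol name is an assumption for illustration, not the PR's actual declarations:

```swift
import Foundation

/// Hypothetical shape of a streaming LLM provider: the consumer iterates
/// tokens as they are produced instead of waiting for the full completion.
public protocol StreamingLLMProvider: Sendable {
    func generate(prompt: String) -> AsyncThrowingStream<String, Error>
}

/// Toy conforming type that replays canned tokens — handy for previews/tests.
struct EchoProvider: StreamingLLMProvider {
    func generate(prompt: String) -> AsyncThrowingStream<String, Error> {
        AsyncThrowingStream { continuation in
            for token in ["Hello", ", ", "world"] {
                continuation.yield(token)
            }
            continuation.finish()
        }
    }
}
```

A SwiftUI view model can then `for try await token in provider.generate(prompt:)` and append each token to the visible transcript.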
…ekick

Transform the text chatbot into a role-based architecture where Muse fills three distinct roles for creators. Producer monitors streams via a private Studio overlay panel. Publicist generates platform-native content across six connected platforms. Sidekick (Phase 1.1b) will be the on-camera avatar.

- Add AvatarRole enum and RolePromptProvider with per-role/locale prompts
- Add MLXResponseProvider adapter (MLXBackend → LLMResponseProvider)
- Add Producer panel (slide-in overlay in Studio, Cmd+P toggle)
- Add Publicist view (content workspace replacing chat UI)
- Add role-aware ConversationManager with context injection
- Fix model auto-load on restart (sandbox-aware cache detection)
- Fix concurrent model load race condition (generation counter)
- Fix <end_of_turn> token leaking into output (stop sequence filtering)
- Remove AssistantChatView, AssistantPanelView, AssistantViewModel
- Remove StreamingLLMProvider protocol (inlined into MLXBackend)
- Rename AssistantAction → PublicistAction

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
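The role/prompt split described above can be sketched roughly as follows — the enum cases come from the commit message, but the provider's shape and prompt strings are illustrative assumptions:

```swift
/// Sketch of a role-based prompt architecture: one enum of roles, one
/// provider that maps a role (and locale) to a system prompt. The PR's real
/// RolePromptProvider almost certainly differs in detail.
enum AvatarRole: String, CaseIterable {
    case producer, publicist, sidekick
}

struct RolePromptProvider {
    /// Returns a role-specific system prompt; `locale` is a stand-in for the
    /// per-locale variants the commit mentions.
    func systemPrompt(for role: AvatarRole, locale: String = "en") -> String {
        switch role {
        case .producer:
            return "You monitor the live stream and surface issues to the creator."
        case .publicist:
            return "You turn source material into platform-native posts."
        case .sidekick:
            return "You are the creator's on-camera companion."
        }
    }
}
```

Keeping the prompts in one provider means a conversation manager can swap roles by injecting a different system prompt rather than branching its own logic.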
Move Publicist from standalone sidebar section to a slide-in panel on Dashboard (⌘E), matching Producer's panel pattern on Studio (⌘P). Sidebar is now five items: Dashboard, Profile, Studio, Library, Settings.

- Wire Sidekick: MLXResponseProvider + ConversationManager into MuseAvatarViewModel's LLM fallback chain for on-device chat responses
- Add PublicistPanelView (compact panel for Dashboard trailing edge)
- Move Settings to bottom of sidebar (below divider)
- Move Send Feedback to top of Settings page
- Remove feedback toggle and unused appState references
- Dashboard subtitle: "Your Social Command Center"

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fix Settings sidebar using same NavigationLink highlight as other items
- Change LIVE button from blue/purple gradient to red (broadcast standard)
- Improve timer contrast and enlarge primary screen star badge
- Format recording titles as human-readable dates instead of raw timestamps
- Consolidate recording card metadata into a single compact subtitle line
- Demote Settings Reset button from destructive red to secondary gray
- Fix Settings text hierarchy and compact the feedback banner
- Add segmented control container styling to Publicist selectors
- Add placeholder text and border to Publicist source TextEditor
- Clarify character limit label from "max" to "chars"

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…io layout

Phase 2 visual overhaul:
- Replace flat controlBackgroundColor with .ultraThinMaterial + specular gradient borders (top-left light source) on all cards
- Add ambient gradient background behind NavigationSplitView for glass refraction
- Spring animations replace all easeInOut transitions
- Remove page headers (headerless pro layout), move Library metadata to native toolbar

Studio restructure:
- Fixed bottom control bar (edge-to-edge .regularMaterial, 68pt)
- Consolidate Chat + Producer into a unified Producer command center with an integrated dense monospaced chat feed at bottom
- Single panel toggle button replaces three separate panel buttons
- Audio controls: mic/speaker toggle + chevron volume popovers
- Scene picker moved next to timer as a chevron popover
- Control bar grouped: inputs (left), broadcast (center), toggle (right)
- LIVE button pulses red shadow when streaming

Recording cards:
- Flush thumbnails clipped by outer card shape (no internal cornerRadius)
- Adaptive grid columns (280-400pt) for fluid window resizing

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Enable YouTube feature flag and implement full broadcast lifecycle: create broadcast, bind stream, transition ready→testing→live, and end on stop. Add network.server entitlement for OAuth callback, silent audio generator for YouTube stream activation, and background RTMP server message handler for ping/pong and window acknowledgements to prevent connection drops. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
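The ready → testing → live progression above can be enforced with a small state machine; this is a sketch with hypothetical names, while the real work happens through the YouTube Live API's liveBroadcasts.insert, liveBroadcasts.bind, and liveBroadcasts.transition calls:

```swift
/// Illustrative state machine for the broadcast lifecycle the commit
/// describes. Only the next state in the ready → testing → live → complete
/// order is accepted, mirroring YouTube's transition rules.
enum BroadcastState: String {
    case ready, testing, live, complete
}

struct BroadcastSession {
    var state: BroadcastState = .ready

    /// Returns true and advances if `next` is the immediate successor state.
    mutating func transition(to next: BroadcastState) -> Bool {
        let order: [BroadcastState] = [.ready, .testing, .live, .complete]
        guard let from = order.firstIndex(of: state),
              let to = order.firstIndex(of: next),
              to == from + 1 else { return false }
        state = next
        return true
    }
}
```

Guarding the order locally avoids sending an invalid `liveBroadcasts.transition` request (YouTube rejects, for example, jumping straight from ready to live before the stream is active).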
Encode video/audio once, fan out to N RTMPPublisher instances in parallel. Each publisher has independent send tasks, sequence headers, AsyncStream continuations with .bufferingNewest(30) backpressure, and per-destination frame rate limiting to prevent YouTube "faster than realtime" errors.

- VideoEncoder: startStreaming(to:) accepts an array of destinations
- RecordingSession: multi-destination pass-through, per-platform stop
- StreamViewModel: selectedPlatforms set, per-platform PlatformConfig
- StreamDestinationPicker: multi-select toggle cards, bandwidth estimate
- StreamInfoFormView: universal form with platform-specific sections
- ChatPanelViewModel: concurrent Twitch IRC + YouTube polling into a unified message feed with per-platform connect/disconnect
- TwitchAuthClient: token refresh, EventSub scopes, keychain storage
- TwitchEventSubClient: WebSocket client for follows, subs, cheers, raids
- YouTubeClient: live chat polling via OAuth, Sendable response types

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
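The encode-once/fan-out pattern with .bufferingNewest backpressure can be sketched like this — types and names are illustrative stand-ins, not the PR's real API:

```swift
import Foundation

struct EncodedFrame: Sendable {
    let data: Data
    let isKeyframe: Bool
}

/// Each destination gets its own AsyncStream whose buffer keeps only the
/// newest 30 frames, so a slow RTMP connection drops frames rather than
/// stalling the encoder or the other destinations.
/// (Assumed to be configured from a single task before streaming starts,
/// hence the @unchecked Sendable.)
final class FrameFanOut: @unchecked Sendable {
    private var continuations: [AsyncStream<EncodedFrame>.Continuation] = []

    /// Creates one independently buffered stream per destination.
    func makeStream() -> AsyncStream<EncodedFrame> {
        let (stream, continuation) = AsyncStream.makeStream(
            of: EncodedFrame.self,
            bufferingPolicy: .bufferingNewest(30)
        )
        continuations.append(continuation)
        return stream
    }

    /// Called once per encoded frame; fans the frame out to every destination.
    func broadcast(_ frame: EncodedFrame) {
        for continuation in continuations {
            continuation.yield(frame)
        }
    }

    func finish() {
        continuations.forEach { $0.finish() }
    }
}
```

Each publisher then runs its own `for await frame in stream` send loop, which is what makes per-destination pacing and sequence headers independent.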
## Code Review: PR #239 — Integrate MLX for on-device AI assistant and multi-platform streaming

### Summary

This is a large PR (+4900/-998 across 44 files) that introduces three major feature areas:

- On-device LLM inference via MLX, surfaced as role-based Muse features (Producer, Publicist, Sidekick)
- Multi-platform simulcast streaming (encode once, fan out to N RTMP publishers)
- YouTube Live integration (broadcast lifecycle, OAuth, live chat polling)

Additionally includes: UI refresh with glassmorphism design, Swift 6.3 version bump, Twitch EventSub WebSocket client, token refresh flow, feedback UX refactor, and feature flag updates.

### Strengths

- Architecture & Design
- Code Quality
- Testing

### Issues / Concerns

#### 🔴 Critical

**1. Debug logging left in the production audio path**

```swift
// AudioEncoder.swift
nonisolated(unsafe) private var feedCount = 0
public func feed(_ sampleBuffer: CMSampleBuffer) {
    feedCount += 1
    if feedCount == 1 || feedCount % 500 == 0 {
        print("🔊 AudioEncoder.feed() called #\(feedCount)...")
    }
}
```

Recommendation: Remove or gate behind `#if DEBUG`.

**2. Silent audio generator has leftover scaffolding**

```swift
// Always generate silent audio as fallback
if true {
    // Create a CMSampleBuffer with silent PCM data
    var formatDesc: CMAudioFormatDescription?
```

Recommendation: Remove the `if true` wrapper.

**3. YouTube OAuth scope broadened significantly without justification**

```swift
// Before: youtube.readonly youtube.upload youtube.force-ssl
// After: youtube youtube.force-ssl
```

Recommendation: Use the minimum required scopes. For live streaming, `youtube.force-ssl` suffices; the broad `youtube` scope is redundant alongside it.

#### 🟡 Major

**4. `Task` handle stored inside a value type**

```swift
struct PlatformConfig {
    var streamKey: String = ""
    var broadcastId: String?
    var transitionTask: Task<Void, Never>? // ⚠️ Reference type in value type
```

Recommendation: Either make `PlatformConfig` a reference type or move the task handle out of the struct, so copies don't silently share a live task.

**5. `@unchecked Sendable` with unsynchronized mutable state**

```swift
public final class MLXResponseProvider: LLMResponseProvider, @unchecked Sendable {
    public var activeRole: AvatarRole = .sidekick // ⚠️ No synchronization
    public var voiceLocale: VoiceLocale = .english // ⚠️ No synchronization
    public var contextInjection: String? // ⚠️ No synchronization
```

Recommendation: Either make the type an actor or guard the mutable properties with a lock; `@unchecked Sendable` is only sound when synchronization actually exists.

**6. No error handling for failed RTMP destinations in simulcast**

```swift
try await withThrowingTaskGroup(of: StreamDestination.self) { group in
    for dest in destinations {
        group.addTask { /* connect */ }
    }
    for try await dest in group { // ⚠️ One failure kills all
        streamDestinations[dest.id] = dest
    }
}
```

Recommendation: Use a non-throwing task group that collects a per-destination `Result`, so one failed connection doesn't cancel the remaining destinations.

**7. YouTube broadcast hardcodes privacy status**

```swift
"status": ["privacyStatus": "public"] // ⚠️ Ignores user's selection
```

Recommendation: Pass the user's selected privacy status through to broadcast creation.

**8. Duplicate streaming logic between …**

Recommendation: Consolidate into …

#### 🟢 Minor

**9. …**

```swift
private struct BackendState: ~Copyable {
```

**10. Magic numbers in video bitrate estimation**

```swift
private var videoBitrate: Int {
    let cores = ProcessInfo.processInfo.activeProcessorCount
    if cores >= 8 { return 4_500_000 }
    if cores >= 4 { return 3_000_000 }
    return 1_500_000
}
```

These should be documented constants or reference the …

**11. …**

**12. Inconsistent platform identification strings**

Recommendation: Use the …

**13. Missing …**

**14. Hardcoded frame-rate drop threshold in video send task**

```swift
if elapsed < .milliseconds(Int(targetInterval * 900)) && !frame.isKeyframe {
    continue // Drop frame — too fast
}
```

The …

**15. Several …**

### Suggestions

### Questions

### Overall Recommendation

Request Changes 🔄

The architecture and feature design are strong, but the PR should address the critical issues (debug logging in production audio path, silent audio generator cleanup, overly broad YouTube OAuth scope) and the major issues listed above.

I'd also strongly recommend splitting this into smaller PRs if timeline permits — the blast radius of a 4900-line PR across core streaming infrastructure is high.
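One way to address the simulcast concern above (a single failed RTMP connect cancelling all destinations) is to collect a `Result` per destination in a non-throwing task group; this is a sketch with hypothetical names, not the PR's code:

```swift
/// Illustrative destination abstraction; the real type lives in the PR.
protocol StreamDestination: Sendable {
    var id: String { get }
    func connect() async throws
}

/// Connects every destination concurrently. Failures are captured as
/// .failure values instead of thrown, so the group is never cancelled and
/// the healthy destinations still come up.
func connectAll(
    _ destinations: [any StreamDestination]
) async -> [Result<any StreamDestination, Error>] {
    await withTaskGroup(of: Result<any StreamDestination, Error>.self) { group in
        for dest in destinations {
            group.addTask {
                do {
                    try await dest.connect() // hypothetical connect API
                    return .success(dest)
                } catch {
                    return .failure(error)
                }
            }
        }
        var results: [Result<any StreamDestination, Error>] = []
        for await result in group {
            results.append(result)
        }
        return results
    }
}
```

The caller can then register the successes and surface each failure in the UI per platform, rather than tearing down the whole simulcast.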
- Replace mlx-swift-examples 2.29.1 with mlx-swift 0.31.3 + mlx-swift-lm (arkavo-ai fork with Gemma 4 text model at 73.7 tok/s)
- Add MLXHuggingFace, Tokenizers, HuggingFace dependencies
- Update MLXBackend to use the #huggingFaceLoadModelContainer macro
- Add Gemma 4 E4B (8B params, 4B active MoE, 8-bit) to ModelRegistry
- Fix Sendable capture in MLXBackend.generate()

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sandboxed apps default to Library/Caches/huggingface/hub inside the app container. This means models downloaded by the Python CLI (at ~/.cache/huggingface/hub) aren't found, triggering a 9 GB re-download. Fix: explicitly configure HubClient with the shared cache path so the app finds models cached by any HF client. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Settings UI:
- Preferred model picker (Gemma 4 E4B, Qwen 3.5 0.8B, Qwen 3.5 9B)
- Model state indicator (idle/downloading/loading/ready/error)
- Load/Unload/Retry buttons
- Custom model cache folder with NSOpenPanel folder picker
- Persisted via UserDefaults

Debug logging:
- MLXBackend: logs cache directory resolution, model cache check, load start/success
- ModelManager: logs init state, auto-load decisions, load lifecycle, errors with full descriptions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sync project-level Package.resolved with workspace-level to fix Xcode Cloud build failure from stale mlx-swift-examples reference. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Capture Sendable values (String) before the Task boundary instead of sending the non-Sendable Profile across isolation domains. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
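The fix described above can be sketched like this — `Profile` and its fields are illustrative stand-ins for the PR's non-Sendable type:

```swift
/// Non-Sendable reference type standing in for the PR's Profile.
final class Profile {
    var displayName: String = "creator"
}

func scheduleUpload(of profile: Profile) {
    // Copy the Sendable String value *before* the Task boundary…
    let name = profile.displayName
    Task.detached {
        // …so the closure captures only Sendable data, never `profile`
        // itself, and no value crosses isolation domains unsafely.
        print("Uploading profile for \(name)")
    }
}
```

This keeps strict-concurrency checking happy without marking `Profile` `@unchecked Sendable`.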


