Skip to content

refactor: decompose TranscriptionEngine god object into focused managers#6

Closed
Newarr wants to merge 10 commits intorefactor/logging-migration-stackedfrom
refactor/transcription-engine-decomposition
Closed

refactor: decompose TranscriptionEngine god object into focused managers#6
Newarr wants to merge 10 commits intorefactor/logging-migration-stackedfrom
refactor/transcription-engine-decomposition

Conversation

@Newarr
Copy link
Copy Markdown
Owner

@Newarr Newarr commented Mar 29, 2026

Stacked on #5 — after the logging migration merges, this PR introduces the manager extraction.

What

Decomposes the 949-line TranscriptionEngine god object into 3 focused, testable managers:

Manager Lines Responsibility
ModelDownloadManager 219 Model download, loading, progress tracking
DeviceRoutingManager 333 CoreAudio listeners, device restart coordination
TranscriptionStreamCoordinator 374 Audio stream orchestration, transcribers

Result

  • TranscriptionEngine: 949 → 551 lines (42% reduction)
  • Clean separation of concerns with single-responsibility managers
  • Public API preserved for SwiftUI compatibility
  • Swift 6 strict concurrency throughout

Architecture

TranscriptionEngine (551 lines, facade)
├── ModelDownloadManager - model lifecycle
├── DeviceRoutingManager - device listeners/restarts  
├── TranscriptionStreamCoordinator - stream orchestration
└── Existing utilities (unchanged)

Key Changes

New Files

  • ModelDownloadManager.swift - Extracted model loading/download logic
  • DeviceRoutingManager.swift - Extracted CoreAudio device listeners
  • TranscriptionStreamCoordinator.swift - Extracted stream orchestration
  • Utils/Logging.swift - Centralized Log enum (from refactor: migrate diagLog to unified os.Logger #5)

TranscriptionEngine Simplification

  • Delegates model loading to ModelDownloadManager
  • Delegates device listeners to DeviceRoutingManager with async callbacks
  • Delegates stream management to TranscriptionStreamCoordinator
  • Maintains all @Observable properties for SwiftUI binding

Concurrency Fixes (from code review)

  • Converted fire-and-forget callbacks to async/await
  • Added bounded buffering .bufferingNewest(1) to AsyncStreams
  • Removed nonisolated(unsafe) patterns in favor of proper Sendable
  • Fixed diarization task lifecycle (proper cancel/await)

Testing

  • swift build compiles with Swift 6 strict concurrency
  • All 304 tests pass
  • No public API changes — existing SwiftUI bindings work unchanged

Migration Path

This is a pure refactoring with no behavioral changes. The public API remains:

@Observable @MainActor final class TranscriptionEngine {
    var isRunning: Bool
    var assetStatus: String
    var lastError: String?
    var needsModelDownload: Bool
    var downloadProgress: Double?
    var audioLevel: Float
    var isMicMuted: Bool
    
    func start(locale:inputDeviceID:transcriptionModel:) async
    func stop()
    func finalize() async
    func restartMic(inputDeviceID:)
}

Szymon Sypniewicz added 10 commits March 29, 2026 23:41
…iptionStreamCoordinator

Extract three managers from TranscriptionEngine god object:
- ModelDownloadManager: model download, loading, progress tracking
- DeviceRoutingManager: CoreAudio device listeners, restart coordination
- TranscriptionStreamCoordinator: audio stream capture, transcriber orchestration

Key fixes from code review:
- Centralized DownloadProgressDetail in ModelDownloadManager
- Added storeRestartState() for proper restart state management
- Added finalizeMicStream/finalizeSystemStream with draining
- Fixed Sendable closure isolation with @mainactor annotations
- Removed Sendable conformance from @mainactor classes
- Fixed clearCache() to use .info log level
- Added deinit cleanup for CoreAudio listeners
…view

- Add missing import FluidAudio to DeviceRoutingManager
- Add Utils/Logging.swift for centralized Log definitions
- Clear restart state in stopListening() to prevent memory leaks
  (VadManager, backends now properly released on session end)
- Add diarizationTask property to track diarization audio feed
- Cancel and await diarization task in stop/finalize paths
- Check Task.isCancelled in diarization loop to exit promptly
- Prevent old audio from draining into diarizer after session stop
Change onMicRestartRequested and onSystemRestartRequested from
synchronous to async closures to prevent overlapping restarts when
the engine responds with async work.

- Change callback types to (@sendable ...) async -> Void
- Await callbacks in performMicRestart() and performSystemAudioRestart()
- Fix deinit to dispatch cleanup to MainActor

Fixes fire-and-forget callback issue (lines 10, 117, 139, 276, 285)
Add bounded buffering to AsyncStream to prevent unbounded memory growth
when transcription/diarization falls behind. Add proper error propagation
with user-facing error messages.

- Use .bufferingNewest(1) for all AsyncStream.makeStream() calls
- Add TranscriptionStreamError enum with localized error descriptions
- Change startMicStream/startSystemStream to return Result type
- Provide user-friendly error messages for common failure modes

Fixes unbounded buffering (lines 153, 316) and error propagation issues
Fix stale state dependency where loadModel() trusted the cached
needsDownload flag instead of recomputing from the model argument.
If the caller changed models between calls, the manager misreported state.

- Recompute availability from model parameter at top of loadModel()
- Rename local variable to avoid shadowing property

Fixes stale state dependency (lines 49, 77)
Remove the duplicate DownloadProgressDetail struct from TranscriptionEngine.
The type is now defined in ModelDownloadManager and shared across the module.

Fixes invalid redeclaration compilation error.
- Remove nonisolated(unsafe) from diarization task (capture dm properly)
- Remove @unchecked Sendable Box struct from tappedStream
- Make MicCapture and SystemAudioCapture injectable via init

All unsafe concurrency patterns have been removed.
- TranscriptionEngine reduced from 949 to 551 lines (42% reduction)
- Delegates to ModelDownloadManager for model lifecycle
- Delegates to DeviceRoutingManager for device listeners/restarts
- Delegates to TranscriptionStreamCoordinator for stream orchestration
- Public API preserved for SwiftUI compatibility
Critical:
- Wire ModelDownloadManager progress callbacks to engine UI state
- Reset downloadConfirmed/needsModelDownload on model load failure
- Remove broken deinit (weak self always nil), document stopListening contract

High:
- Use shared static queue for CoreAudio listener add/remove calls
- Replace 4 force unwraps with guard let
- Remove duplicate micTask/sysTask ownership (coordinator is sole owner)
- Use bufferingOldest(128) instead of bufferingNewest(1) to prevent audio drops

Medium:
- Nil diarizationTask synchronously in stopSystemStream to prevent race
- Surface mic health check failure to user via lastError callback
- Remove dead clearSystemAudioErrorIfPresent, inline at restart success

Concurrency:
- Extract data from buffer before yielding to diarization continuation
- Use nonisolated(unsafe) for read-only AVAudioPCMBuffer across send boundaries
- Wrap tappedStream iteration in Sendable TapForwarder struct
@Newarr Newarr force-pushed the refactor/transcription-engine-decomposition branch from 4e72249 to d6bfcba Compare March 29, 2026 21:45
@Newarr Newarr changed the base branch from refactor/naming-collision to refactor/logging-migration-stacked March 29, 2026 21:45
@Newarr Newarr deleted the branch refactor/logging-migration-stacked March 29, 2026 23:49
@Newarr Newarr closed this Mar 29, 2026
@Newarr Newarr deleted the refactor/transcription-engine-decomposition branch March 29, 2026 23:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant