
Async post-processing and batch queue inserts for import service#529

Closed
javi11 wants to merge 1 commit into main from claude/optimize-file-import-GGQ1w

Conversation


javi11 commented Apr 23, 2026

Summary

This PR refactors the NZB import service to improve throughput and responsiveness by:

  1. Moving post-processing work (VFS notifications, symlinks, ARR updates, NZB cleanup) to background goroutines
  2. Implementing batch queue inserts during directory scanning to reduce database overhead
  3. Adding concurrency controls to prevent unbounded goroutine growth during large import bursts

Key Changes

Post-Processing Async Dispatch

  • Extracted post-processing logic from handleProcessingSuccess() into a new dispatchPostProcessing() method that runs in a background goroutine
  • Added postProcessWG (WaitGroup) to track in-flight post-processing goroutines for graceful shutdown
  • Added postProcessSem (buffered channel) to bound concurrent post-processing to 32 goroutines, preventing runaway growth during bulk imports
  • Updated Stop() to wait for in-flight post-processing with a 15-second timeout before shutdown
  • Queue items are now marked as completed immediately after storage path persistence, allowing workers to claim the next item without waiting for downstream operations (VFS propagation, ARR notifications, etc.)

Batch Queue Inserts

  • Added AddBatchToQueue() method to queueAdapterForScanner to insert multiple items in a single database transaction
  • Introduced QueueBatchItem struct in the scanner package to describe pending queue insertions
  • Modified DirectoryScanner.performScan() to accumulate discovered files in a pending batch (up to 100 items) before flushing to the database
  • This amortizes per-file insert overhead and improves scan throughput for large directory trees

Code Organization

  • Extracted buildQueueItem() helper method to reduce duplication between single and batch insert paths
  • Improved comments documenting the rationale for async post-processing and concurrency bounds

Implementation Details

  • Post-processing uses the service-level context (s.ctx) rather than per-worker context, ensuring work survives worker pool resizing but is cancelled on full service shutdown
  • Batch flush is triggered either when the pending buffer reaches 100 items or at the end of the directory walk
  • Post-processing failures are logged as warnings and do not block queue completion or worker pickup
  • Shutdown gracefully handles stuck external calls (ARR, rclone) with a bounded timeout

https://claude.ai/code/session_01RK87AYKMQKBxk1k9cWafDC

…erts

Queue workers previously blocked on post-processing (VFS notify + 1s FUSE
propagation sleep, symlink/STRM creation, health check scheduling, ARR
notifications) before picking up the next item. Dispatch that work to a
bounded background pool and mark the queue item completed synchronously so
workers return immediately after a successful import.

Directory scans also inserted one queue row per file. Accumulate discovered
files and flush every 100 into a single transaction via the existing
AddBatchToQueue path, cutting per-file DB overhead on large libraries.
javi11 closed this Apr 24, 2026
