fix: Persist watch_folder_processed + honor Skaldleita server_notice (#208) #213
Merged
deucebucket merged 1 commit into develop on Apr 18, 2026
Conversation
Problem 1: watch_folder_processed was an in-memory set() that got wiped on every restart. Any file that couldn't be processed (unknown author, ambiguous match, move failure, mtime churn) got re-submitted every scan cycle after a restart, forever. One LM instance generated ~48% of all Skaldleita /match traffic for days on the same filename.

Problem 2: Skaldleita PR #129 added a server_notice field to /match responses. When the server detects a retry loop it now sends {severity, code, message, action: abort_task, upgrade_url}. LM ignored it.

Fixes:
- New watch_folder_processed SQLite table (path PK, processed_at, outcome, error_message). outcome in {moved, move_failed, aborted_by_server}.
- watch_folder_is_processed() / watch_folder_mark_processed() helpers in library_manager/database.py. process_watch_folder swapped from set ops to these helpers. Restart no longer resets dedup.
- bookdb.py logs every server_notice (with upgrade_url). On action=abort_task it stashes the notice in a threading.local() slot so scope is per-thread (watch worker, API endpoint, pipeline layer don't cross-contaminate).
- process_watch_folder reads the abort slot after each identify attempt; if set, marks the item as aborted_by_server and skips the pipeline.

Bumps APP_VERSION to 0.9.0-beta.148.
🔍 Vibe Check Review — PR #213
Verdict: APPROVE · Scope: SCOPE_OK · Docs: ✅ CHANGELOG + README updated
Root-cause fix for Issue #208. Persistent SQLite dedup replaces the in-memory set() that was re-submitting the same failing filename every 30s across restarts (~48% of Skaldleita /match traffic from one instance). Skaldleita's server_notice with action=abort_task now short-circuits the retry loop via a thread-local signal.
Verified:
- New `watch_folder_is_processed` / `watch_folder_mark_processed` follow existing `database.py` patterns (raw `sqlite3.connect(timeout=30)` + try/finally, parameterized queries). WAL persists database-wide from `get_db()`.
- All three `watch_folder_processed` call-sites in `app.py` are migrated; `global` declaration trimmed.
- `threading.local()` in `bookdb.py` is safe: `worker.py` runs scan and watch in separate threads, so abort state can't cross-contaminate. The `get_and_clear_server_abort()` read-and-clear contract is correct.
- Notice parsing is placed before the `confidence < 0.5` early return in `search_bookdb`, so low-confidence responses still carry the abort.
- The abort check sits between the API lookup and `move_to_output_folder`, exactly where you want it.
No blocking issues. Only nit (not filed): all notices log at WARNING regardless of severity field — trivial cleanup later, not PR-blocking.
Review saved to pr_213_review.md.
Closes #208
Summary
Two related fixes that stop the watch-folder retry loop that had one LM instance generating ~48% of Skaldleita `/match` traffic for days.
1. `watch_folder_processed` is now persistent — was an in-memory `set()`, wiped on every restart, so any file that couldn't move (unknown author, ambiguous match, move failure, mtime churn) got re-submitted every scan cycle forever.
2. LM now honors Skaldleita's `server_notice` field (added in deucebucket/skaldleita#129). When the server detects a retry loop it sends `{severity, code, message, action: abort_task, upgrade_url}`. LM logs it, and on `action=abort_task` stops retrying that file immediately.
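The thread-local abort slot described above could be sketched roughly as follows. `get_and_clear_server_abort` is named in the review; `handle_server_notice` and the exact field handling are illustrative assumptions:

```python
import logging
import threading

log = logging.getLogger("bookdb")

# Per-thread slot: the watch worker, API endpoints, and pipeline layer each
# see only abort notices raised on their own thread.
_abort_slot = threading.local()

def handle_server_notice(notice):
    """Log any server_notice; stash abort_task notices for the caller's thread."""
    if not notice:
        return
    log.warning(
        "Skaldleita notice [%s/%s]: %s (upgrade: %s)",
        notice.get("severity"), notice.get("code"),
        notice.get("message"), notice.get("upgrade_url"),
    )
    if notice.get("action") == "abort_task":
        _abort_slot.notice = notice

def get_and_clear_server_abort():
    """Read-and-clear contract: return the pending abort notice once, then None."""
    notice = getattr(_abort_slot, "notice", None)
    _abort_slot.notice = None
    return notice
```

The read-and-clear shape means the caller in `process_watch_folder` can poll the slot after each identify attempt without a stale abort leaking into the next file.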
Changes
`library_manager/database.py`
`app.py`
`library_manager/providers/bookdb.py`
Behavior matrix
Test plan
Notes / follow-ups
References