
[BUG] watch_folder_processed is in-memory only + add handler for Skaldleita server_notice #208

@deucebucket

Description


Problem

A single LM instance (216.212.51.99) has been generating ~48% of all Skaldleita API traffic for days by re-querying /match every 30 seconds with the same filename. Server-side evidence:

  • UA: LibraryManager/unknown
  • Valid lm-public-* API key
  • Same filename (Star Trek - Star Trek Khan), same response (286 books, confidence 0.7), every 30s
  • 2,840 requests on 2026-04-16 from this single instance, another 648 before 05:00 UTC on 2026-04-17

Root cause

app.py (line ~6043):

```python
watch_folder_processed = set()
```

This set lives only in memory. Every LM restart wipes it, so the watch-folder worker (library_manager/worker.py) re-processes the same file every cycle. Whatever prevents the file from moving to the output folder (a mismatched author, the ambiguous 286-book match, a move failure, or mtime churn from an external downloader) keeps the loop running indefinitely.

What to fix

1. Persist watch_folder_processed to SQLite

Convert the in-memory set into a table in the existing LM database, e.g.:

```sql
CREATE TABLE IF NOT EXISTS watch_folder_processed (
    path TEXT PRIMARY KEY,
    processed_at TEXT DEFAULT (datetime('now')),
    outcome TEXT,  -- 'moved' | 'move_failed' | 'unknown_author' | 'aborted_by_server'
    error_message TEXT
);
```

Replace watch_folder_processed.add(x) with an INSERT OR REPLACE, and replace if x in watch_folder_processed: with an EXISTS query, so the state survives restarts; a minimal sketch follows.
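
A sketch of the two replacements, assuming a module-level sqlite3 connection named db; the actual connection handle and call sites in app.py will differ:

```python
import sqlite3

# Hypothetical handle; reuse LM's existing database connection instead.
db = sqlite3.connect('library_manager.db')

def mark_processed(path, outcome, error_message=None):
    # Replaces watch_folder_processed.add(path). INSERT OR REPLACE also
    # refreshes processed_at (the column's DEFAULT) on a re-insert.
    db.execute(
        "INSERT OR REPLACE INTO watch_folder_processed (path, outcome, error_message) "
        "VALUES (?, ?, ?)",
        (path, outcome, error_message),
    )
    db.commit()

def is_processed(path):
    # Replaces `if path in watch_folder_processed:`.
    (exists,) = db.execute(
        "SELECT EXISTS(SELECT 1 FROM watch_folder_processed WHERE path = ?)",
        (path,),
    ).fetchone()
    return bool(exists)
```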

2. Honor Skaldleita's new server_notice field

Skaldleita PR deucebucket/skaldleita#129 adds a server_notice field to /match responses. When the server detects a retry loop (3 identical queries within 5 minutes), the response now carries:

```json
{
  "series": { ... },
  "books": [ ... ],
  "confidence": 0.7,
  "server_notice": {
    "severity": "warning",
    "code": "retry_loop_detected",
    "message": "This filename has been queried N times in M minutes. Your Library Manager may have a retry-loop bug. Please upgrade.",
    "action": "abort_task",
    "upgrade_url": "https://github.com/deucebucket/library-manager/releases/latest"
  }
}
```

In library_manager/providers/bookdb.py, after the successful JSON parse, check for server_notice:

```python
notice = data.get('server_notice')
if notice:
    logger.warning(f"Skaldleita server notice [{notice.get('code')}]: {notice.get('message')}")
    if notice.get('action') == 'abort_task':
        # Stop retrying this file. Mark it in watch_folder_processed so the
        # watch-folder worker doesn't re-submit it next cycle.
        return {'_server_abort': True, '_server_message': notice.get('message')}
```

In the watch-folder worker, when the result dict contains _server_abort, mark the file with outcome='aborted_by_server' in the new persistent table and optionally surface the message in the UI with the upgrade link; see the sketch below.
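
A sketch of that worker-side check, assuming the mark_processed/is_processed helpers from fix (1); process_file and the exact control flow in library_manager/worker.py are stand-ins:

```python
def handle_watch_folder_file(path):
    # Persistent replacement for the old in-memory membership check.
    if is_processed(path):
        return

    result = process_file(path)  # stand-in for the existing match-and-move step

    if result.get('_server_abort'):
        # Skaldleita asked us to stop retrying this file: record the outcome so
        # neither the next cycle nor the next restart re-submits it.
        mark_processed(path, outcome='aborted_by_server',
                       error_message=result.get('_server_message'))
        return

    mark_processed(path, outcome='moved')
```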

3. Suggested UI surfacing

Show a one-time banner or notification when a server_notice with severity: warning is received, linking to the upgrade URL. Low priority relative to (1) and (2), but this is how users learn they need to upgrade.
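
One way to keep the banner one-time, assuming a small seen_notices table in the same SQLite database (show_banner and the table itself are hypothetical):

```python
def surface_notice_once(notice):
    # Show each distinct notice code at most once per install by remembering
    # the codes we've already displayed.
    db.execute("CREATE TABLE IF NOT EXISTS seen_notices (code TEXT PRIMARY KEY)")
    code = notice.get('code')
    (seen,) = db.execute(
        "SELECT EXISTS(SELECT 1 FROM seen_notices WHERE code = ?)", (code,)
    ).fetchone()
    if seen:
        return
    db.execute("INSERT OR IGNORE INTO seen_notices (code) VALUES (?)", (code,))
    db.commit()
    show_banner(message=notice.get('message'),  # hypothetical UI hook
                link=notice.get('upgrade_url'))
```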

Why this matters

The server-side cache in deucebucket/skaldleita#129 makes the loop free for us to absorb, so there's no production emergency — but the user's LM is still burning CPU/network/disk IO on a file it can never process. Fix (1) stops the loop dead on restart. Fix (2) stops it without restart (and gives us a reusable client↔server notification channel for future operational signals).

References

  • Skaldleita server-side mitigation PR: deucebucket/skaldleita#129
  • Skaldleita tracking issue: deucebucket/skaldleita#128
