From fbdd6184e8964fae9dfc4c6ea91322c4e98cf469 Mon Sep 17 00:00:00 2001
From: Sarah Wolff
Date: Fri, 27 Mar 2026 00:00:23 -0400
Subject: [PATCH 01/13] Security audit fixes + refactors (#4)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* Fix remaining book_id migration issues, guard Booklore cache, scope suggestions (#20)

Completes the ABS ID decoupling by fixing service/repository methods that
still used abs_id as lookup keys, removing 19 dead backward-compat methods,
and cleaning up unnecessary abs_id parameters.

Key changes:
- Fix reading_stats, alignment, storyteller, dashboard lookups to use book_id
- Guard Booklore cache loading behind is_configured() for unconfigured instances
- Scope suggestion operations by (source_id, source) composite key with
  unique index migration, preventing collisions across ABS/KoSync/Booklore
- Remove dead is_hash_linked_to_device methods from kosync and suggestion repos
- Add 14 new tests for book_id resolution, suggestion scoping, and alignment ops

All 458 tests passing.

* Fix database upgrade safety issues from v0.1.4 compatibility review

- Guard save_state() against double-NULL book_id/abs_id lookup
- Isolate per-column error handling in _ensure_model_columns
- Log orphaned rows in nullable table backfill migration
- Remove dead delete_hardcover_details_by_book_id method

* fix: sort imports to satisfy ruff I001

* Fix abs_id→book_id migration gaps from CodeRabbit review (#50)

Fixes 6 issues found during v0.1.5 PR review:
- Restore _rdAbsId JS variable in reading_detail.html (all action buttons broken)
- Key KoSync debounce, poll cache, and write-suppression by book.id not abs_id
  (ebook-only books have abs_id=None, collapsing all into one dict entry)
- Fix link_kosync_document to set linked_abs_id for backward compat; query
  linked/unlinked docs by linked_book_id (the canonical FK)
- Guard get_book_by_abs_id(None) with early return
- Gate Base.metadata.create_all() on migration success

* Fix ebook card display: portrait covers instead of square

Change .resource-card from forced 1:1 aspect ratio to portrait layout
matching audiobook cards. Add 2:3 aspect ratio to .resource-cover-container
with object-fit: cover. Increase resource grid min column width to 160px to
match audiobook grid. Ghost cards retain compact centered layout.

* Smart mode defaults: auto-detect available services

Default to Ebook Only mode when ABS is not configured. Detect all ebook
sources (Booklore, CWA, ABS ebook libs, local /books mount). Disable mode
buttons that have no backing service. Update subtitle from "ABS library" to
"audiobook library".

* Rename ABS-specific methods to generic audiobook names

get_abs_title → get_audiobook_title, get_abs_author → get_audiobook_author.
Change fallback source label from "abs" to "unknown" in suggestion
serialization. These methods extract generic metadata, not ABS-specific.

* Extract _create_book_mapping helper to deduplicate match logic

Single match POST and batch process_queue shared ~130 lines of identical
logic (Booklore lookup, KOSync, hash preservation, duplicate merge,
Hardcover, Booklore shelf, Storyteller, suggestion resolution). Both now
call _create_book_mapping(), reducing net code by ~100 lines.

* Add ebook-only support to batch match

Allow adding ebook-only items to the batch queue (no audiobook required).
Add ebook-only processing branch in process_queue. Update JS to enable
"Add to Queue" when ebook is selected without audiobook. Queue items now
use a generic queue_key for dedup.
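For context, the (source_id, source) scoping above amounts to a composite
uniqueness rule. The migration itself creates a unique index; at the model
level the equivalent shape looks roughly like the sketch below (column names
come from this patch, everything else is illustrative, not the project's
actual model):

    import sqlalchemy as sa
    from sqlalchemy.orm import DeclarativeBase


    class Base(DeclarativeBase):
        pass


    class PendingSuggestion(Base):
        # Illustrative shape only; the real model lives in src/db/models.py.
        __tablename__ = "pending_suggestions"

        id = sa.Column(sa.Integer, primary_key=True)
        source_id = sa.Column(sa.String(255), nullable=False)  # ID within the source system
        source = sa.Column(sa.String(32), nullable=False)      # e.g. "abs", "kosync", "booklore"

        __table_args__ = (
            # The same source_id may exist under different sources, but each
            # (source_id, source) pair is unique -- no cross-source collisions.
            sa.UniqueConstraint(
                "source_id", "source",
                name="ix_pending_suggestions_source_id_source",
            ),
        )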
* Clean up dead code and unnecessary getattr usage

Replace getattr() with direct attribute access in _copy_book_merge_metadata
since existing_book is always a SQLAlchemy Book model. Update docstring for
get_audiobook_author to remove ABS-specific language.

* Fix missing BookFusion covers and broken onerror fallback

Skip ABS cover proxy for bf- prefixed books (always 404'd), deduplicate
dashboard cover waterfall into resolve_book_covers(), fix onerror chain so
placeholder shows when KoSync fallback also fails, add branded BookFusion
placeholder logo.

* Consolidate suggestion serializer and remove dismissed status

Move _serialize_suggestion into helpers.py as shared utility, removing
duplicate definitions from api.py and matching_bp.py. Unify dismissed →
hidden status throughout suggestion_repository. Allow suggestion rescan to
proceed when ABS is unconfigured (BookFusion-only setups). Pass
storyteller_configured flag to match/batch_match templates.

* Hide Storyteller UI when unconfigured and fix ABS cover proxy fallback

Conditionally hide Storyteller column in match/batch_match when the
integration is not configured. ABS cover proxy now falls back to using the
raw book_ref as abs_id when no book record exists, allowing direct ABS ID
lookups without a mapped book.

* Improve suggestions page UX with inline refresh and modal errors

Replace location.reload() after rescan/link actions with inline data refresh
via refreshSuggestionsData(). Replace alert() calls with showErrorToast()
using the app's confirm modal. Update copy to be source-agnostic ("unmapped
book pairings" instead of "audiobook").

* Cache book metadata (author/subtitle) locally and clean up helpers

Add author and subtitle columns to Book model so these fields survive ABS
outages. Dashboard opportunistically refreshes from live ABS data and falls
back to cached values when disconnected. All book creation sites in
matching_bp now populate author/subtitle from ABS metadata. Extract shared
helpers (find_booklore_metadata, attempt_hardcover_automatch) to reduce
duplication across dashboard, reading, and matching blueprints. Remove dead
getattr calls for columns that have model defaults. Also includes ABS cover
proxy local caching for offline resilience.

* Show service logo placeholder when book cover is unavailable

Add placeholder_logo field to mapping/book data dicts, determined by primary
source (BookFusion, Booklore, or Audiobookshelf). Display the logo in all
cover placeholder divs across dashboard, reading log, reading detail, and
backlog cards.

* Deduplicate placeholder_logo logic, fix cover proxy streaming, and fix N+1 query

Extract resolve_placeholder_logo() into cover_resolver.py and return it from
resolve_book_covers(), removing duplicate 4-branch conditionals from
dashboard.py and reading_bp.py. Drop unnecessary stream=True from cover
proxy requests that immediately buffer via .content. Bulk-fetch Hardcover
details on the reading page to avoid per-book N+1 queries.

* De-center ABS on batch match page and fix BookFusion enabled check

Hide the audiobook column on batch match when ABS isn't configured, adapting
section numbers, hints, and status text to be service-agnostic. Fix
BookFusionClient.is_configured() to respect BOOKFUSION_ENABLED, matching the
pattern used by all other service clients.

* Hide Suggestions nav link and guard route when ABS is not configured

Suggestions require Audiobookshelf to produce results. Gate the nav link on
abs_url and redirect /suggestions to dashboard when ABS is unavailable.
Also conditionally hide BookFusion filter/stat on the suggestions page when
BookFusion is not enabled.
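The shared placeholder resolution mentioned above reduces to a single
source-to-logo mapping. A hedged sketch of that helper (the function name
and module come from this patch; the attribute name and asset paths are
assumptions):

    def resolve_placeholder_logo(book) -> str:
        """Pick a service logo for cover placeholders by primary source.

        Sketch only: the real helper lives in src/utils/cover_resolver.py;
        attribute and asset names here are illustrative.
        """
        source = (getattr(book, "source", "") or "").lower()
        if source == "bookfusion":
            return "/static/img/bookfusion_logo.svg"
        if source == "booklore":
            return "/static/img/booklore_logo.svg"
        if source == "abs":
            return "/static/img/audiobookshelf_logo.svg"
        return "/static/img/book_placeholder.svg"

Centralizing this in cover_resolver.py is what lets resolve_book_covers()
return the logo alongside the cover URLs, instead of each blueprint
re-deriving it.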
* Update suggestions page description wording

* Frontend overhaul + close testing gaps: extract inline JS, unify modals, add 260 tests

Frontend:
- Create shared utils.js (escapeHtml, debounce, toggleHiddenSection)
- Create unified confirm-modal.js with PKModal API (confirm, confirmForm, alert)
- Create shared confirm_modal.html partial, replacing 5 duplicate modal blocks
- Extract inline JS from 5 templates into external files: suggestions.html
  (445 lines), bookfusion.html (833 lines), logs.html (655 lines),
  match.html (363 lines), batch_match.html (162 lines)
- Use PK_PAGE_DATA pattern for Jinja2→JS data bridging
- Consolidate .btn-error into .btn-danger
- Wire dashboard.js to use shared PKModal via legacy bridge

Testing (461 → 721 tests):
- Expand conftest.py with canonical MockContainer, pytest fixtures, test helpers
- Add env var save/restore to flask_app fixture to prevent test pollution
- New blueprint tests: bookfusion routes (56), logs routes (27), dashboard errors (7)
- New service tests: BackgroundJobService (30), ReadingDateService (30),
  BookMetadataService (12), ReadingService (10), ClientPoller (8)
- New integration tests: settings hot-reload (10), sync concurrency (7)
- New error path tests: helpers (12), matching (9), reading bp (8), API (17)

* Address code review findings: error handling, modal convention, dead code

- bookfusion.js: capture error parameter in all 12 catch blocks, surface
  error messages to users instead of generic "Error" text
- confirm-modal.js: add null guard in _resolve() when modal partial is
  missing from the page
- dashboard.js: replace native confirm()/alert() with PKModal.confirm() and
  PKModal.alert() per project convention
- logs.js: remove undeclared lastLogTimestamp variable (dead code from
  extraction), replace with shownLogs.clear()
- reading_service.py: add warning log to pull_started_at catch block that
  was silently falling back to today's date

* Address code review findings: null guards, offset bug, import sorting, dead code

- Guard PKModal public methods against missing modal partial
- Fix double-increment of currentOffset in logs loadMore handler
- Add null guards in batch-match.js and utils.js DOM access
- Remove unused preselectedEb variable in match.js
- Guard against book.abs_id being None in cover proxy
- Default mock_abs_client.is_configured to False in test fixtures
- Wrap flask_app fixture in try/finally for safe teardown
- Remove unused imports and variables in test files
- Fix ruff I001 import sorting across all affected files

* Address remaining review findings: unify modals, fix state detection, clean up patterns

- Replace native confirm()/alert()/prompt() with PKModal on tbr-detail,
  dashboard, reading-detail, and settings pages
- Migrate settings.js custom modal system to shared PKModal API
- Fix dashboard refreshPaused flag never resetting on button-close
- Replace fragile textContent.includes() state detection in logs.js with
  filterPending boolean flag
- Add visible error display for live log fetch failures
- Fix double JSON.stringify in suggestions.js BookFusion flow
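The env var save/restore and try/finally teardown called out above combine
into one small pattern; a minimal sketch (the real fixture in
tests/conftest.py also builds the full app and DI container, which are
omitted here):

    import os

    import pytest
    from flask import Flask


    @pytest.fixture
    def flask_app():
        # Snapshot the environment so tests that mutate settings through
        # os.environ cannot leak state into later tests.
        saved = os.environ.copy()
        app = Flask(__name__)  # stand-in for the project's app factory
        try:
            yield app
        finally:
            # try/finally guarantees restoration even if a test errors out.
            os.environ.clear()
            os.environ.update(saved)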
* KoSync system overhaul: service extraction, document management, bug fixes

Major refactoring and feature additions for the KoSync subsystem:

Service extraction:
- Extract 375 lines of business logic from kosync_server.py into new
  KosyncService class (src/services/kosync_service.py)
- Decompose _try_find_epub_by_hash (151 lines) into 3 focused methods
- Remove dead code: _hash_cache, unused repository methods

KoSync Document Management page (/kosync-documents):
- New page accessible from Settings > KoSync tab
- Three sections: Healthy, Needs Attention, Stale (30+ days)
- Actions: Link to Book (search), Link to Self, Create Book, Clear Hash,
  Unlink, Delete
- Rich context: book titles, time-ago indicators, device vs bot labels
- Dashboard "Pending Identification" section for unlinked hashes with
  reading progress

Bug fixes:
- Fix sync direction inversion: mixed text-matched and percentage-fallback
  normalization could elect wrong leader (Entitled at 39% over Booklore at 45%)
- Fix Booklore get_text_from_current_state using wrong filename
- Fix Booklore 2 not showing as pairing option when only BL2 enabled
- Fix Booklore crash on books with no ebook filename
- Fix ebook-only books showing as unlinked (linked_abs_id vs linked_book_id)
- Fix Link to Self sending empty body (Flask 400)
- Fix external KoSync server missing credential fields and secret handling
- Prevent orphaned hashes by creating KosyncDocument on every book save

Improvements:
- Rename abs-kosync-bot to pagekeeper-bot, centralize in constants.py
- Remove legacy bot names (book-stitch, book-sync)
- Redesign KoSync settings tab: sync source at top, conditional sections
- Auto-create books for exact ABS title matches (skip suggestion approval)
- Downgrade noisy no-progress warnings to debug
- Include book title in Instant Sync log message
- Add external KoSync server credential fields (KOSYNC_SERVER_USER/KEY)

* Address code review findings: security, data integrity, dead code

Critical fixes:
- XSS: Replace |safe with |tojson for JSON in templates (kosync_documents,
  suggestions) — prevents script injection via book titles
- Path traversal: Use Path().name to strip directory components from
  Booklore filenames before cache write, add is_safe_path_within check
- Data integrity: Unlink/delete endpoints now clear book.kosync_doc_id to
  prevent recreating orphaned hashes
- Primary key mutation: Never mutate document_hash on cached doc — delete
  old record and create new one instead

Other fixes:
- Sanitize book.title in Instant Sync and Booklore log messages
- Settings: server-side radio state for builtin/external KoSync mode
- _test_kosync uses urlparse-based external detection for credential selection
- _is_external uses urlparse instead of fragile string matching
- Null guards in toggleKosyncSourceMode JS
- CSS: unquote Outfit font-family
- Remove dead get_all_books() call in Booklore discovery
- Strip server_id prefix from Booklore-cached filenames for title derivation
- Remove unused json import from matching_bp

* Address Macroscope review: linked_abs_id, LibraryService args, null guard, dead code

* Address remaining review findings: atomicity, metadata consistency, dead code

- Fix book_id None producing malformed "server_id:None" in Booklore search
- Return cached_doc.filename for consistent Booklore path format
- Reorder resolve-orphan to register hash before mutating books
- Populate filename on existing KosyncDocument when linking
- Add filename to register_hash_for_book new-doc path
- Expand localhost detection to cover localhost/::1 variants
- Add KOSYNC_SERVER_KEY to secret reveal whitelist
- Add credential fallback for external KoSync test
- Log instead of swallowing exceptions in _find_epub_in_db
- Extract _serialize_document helper to deduplicate listing code
- Fix isnot(None) style to match repo convention
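A hedged sketch of the path-traversal guard described above. The function
name is_safe_path_within comes from this patch, but its signature is not
shown in the diff here, so this is an assumed shape:

    from pathlib import Path


    def is_safe_path_within(base_dir: str, filename: str) -> bool:
        # Assumed shape of the check named above: strip any directory
        # components (Path().name), then verify the resolved target
        # stays inside the base directory.
        base = Path(base_dir).resolve()
        target = (base / Path(filename).name).resolve()
        return base in target.parents


    # e.g. a Booklore filename of "../../etc/passwd" collapses to
    # "passwd" and resolves safely under base_dir before the cache write.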
* Fix null guard, prefix stripping, enable toggle visibility, secret classification

- Use getattr for raw_metadata_dict to handle None values safely
- Only strip numeric prefixes (Booklore server IDs) from cached filenames
- Move KoSync Enable toggle to always-visible Sync Source section
- Remove KOSYNC_SERVER_USER from SECRET_SETTING_KEYS

* Decompose EbookParser god class into focused modules

Extract KoReaderXPathService and LocatorSearchService from the 1,140-line
EbookParser, reducing it to a 335-line facade. The two new stateless
services receive (full_text, spine_map) as arguments, making them
independently testable without file I/O or mocking.

- koreader_xpath.py: XPath generation/resolution, get_perfect_ko_xpath
  broken into 3 phases (text node location, hybrid BS4→LXML anchor mapping,
  BS4 structural fallback)
- locator_search.py: text search (anchor/exact/normalized/fuzzy), CFI
  resolution, Storyteller/Readium locator resolution
- Remove dead code: get_character_delta(), _has_text_content()
- Add 46 new unit tests (23 xpath, 23 locator search)
- Zero caller/DI changes — EbookParser facade preserves the public API

* Refactor KoSync server: split monolith, extract utilities, add tests

Split the 788-line kosync_server.py into focused modules:
- kosync_server.py (123 lines): thin protocol route handlers
- kosync_admin.py (284 lines): dashboard management routes
- kosync_auth.py (121 lines): shared auth decorators

Extract reusable utilities:
- rate_limiter.py: TokenBucketRateLimiter (thread-safe token bucket)
- debounce_manager.py: DebounceManager (PUT event debouncing)

Move PUT/GET business logic from route handlers to KosyncService:
- handle_put_progress: validation, furthest-wins, save, link, activity flag
- handle_get_progress: 4-step lookup chain
- resolve_best_progress: sibling doc selection + state fallback

Eliminate module-level globals — services stored in Flask app.config,
matching the existing blueprint helpers pattern.

Add 38 new tests (20 service, 7 rate limiter, 11 debounce). Update 3
existing tests for new architecture.
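For reference, a thread-safe token bucket of the kind TokenBucketRateLimiter
names works like the minimal sketch below (constructor arguments and method
name are assumptions, not the patch's exact API):

    import threading
    import time


    class TokenBucketRateLimiter:
        """Thread-safe token bucket: refill at `rate` tokens/sec up to `capacity`."""

        def __init__(self, rate: float, capacity: float):
            self.rate = rate
            self.capacity = capacity
            self.tokens = capacity
            self.updated = time.monotonic()
            self.lock = threading.Lock()

        def allow(self) -> bool:
            # Refill lazily based on elapsed time, then spend one token
            # if one is available; otherwise reject the request.
            with self.lock:
                now = time.monotonic()
                elapsed = now - self.updated
                self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
                self.updated = now
                if self.tokens >= 1.0:
                    self.tokens -= 1.0
                    return True
                return False

The lazy-refill design avoids a background timer thread: token counts are
only recomputed when a request arrives.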
* Security audit: fix unauthenticated secret endpoint, path traversal, XSS, add hardening

- Wire up _is_secret_request_authorized() in get_secret() (was defined but
  never called)
- Add is_safe_path_within() checks in covers endpoint and ebook cache fallback
- Apply sanitize_html filter to TBR description template
- Add X-Frame-Options and X-Content-Type-Options security headers
- Run container as non-root appuser in Dockerfile

* fix(security,perf): defusedxml, dead code cleanup, N+1 query, DOM perf

Security:
- Replace xml.etree.ElementTree with defusedxml in CWA client and SMIL
  extractor to prevent XML entity expansion attacks
- Remove dead session['is_admin'] check from secret auth function
- Mask ABS API token in stream URL log messages
- Set SESSION_COOKIE_SAMESITE=Lax and SESSION_COOKIE_HTTPONLY=True

Performance:
- Fix N+1 query in /api/processing-status using existing bulk method
- Remove redundant rglob("*") fallback in resolve_book_path
- Add 150ms debounce to reading page search input
- Use DocumentFragment for dashboard sort to batch DOM reflows
- Skip sorting hidden dashboard grids

Infrastructure:
- Add defusedxml==0.7.1 to requirements
- Run test container as root for pip install compatibility

* fix: SMIL indentation bug, restore secret reveal auth, sort hidden grids

Address PR #3 review feedback from Macroscope:
- Fix incorrect nesting of relative timestamp handler inside absolute block
- Restore session-based admin check for browser secret reveal endpoint
- Remove offsetParent guard so hidden dashboard grids get sorted

* fix: remove unused variable to pass ruff CI

* fix(security): sanitize KoSync endpoint inputs for Snyk XSS findings

Validate doc_id format on GET progress, type-check request body on PUT.

* fix(security): use make_response with explicit content type to break Snyk taint chain

* fix: address PR #4 review findings from Macroscope

- Guard suggestion cleanup when kept_ids is empty (ABS outage safety)
- Move thread-start check inside lock in debounce_manager (race fix)
- Remove offset addition on empty text nodes in xpath resolver
- Iterate sentence tags in document order, not tag priority order
- Map dismissed status to hidden in serialize_suggestion
- Fix refreshPaused scoping: move to module scope in dashboard.js
- Re-read localStorage in suggestions.js viewport change handler

* fix: distinguish Storyteller-only from ebook-only in batch match UI

* fix: map dismissed status to hidden in serialize_suggestion

* fix: handle IPv4-mapped IPv6 addresses in private IP check
---
 .gitignore | 10 +
 Dockerfile | 8 +-
 ...m5n6o7p8q9_migrate_child_fks_to_book_id.py | 12 +-
 ...t0u1_add_suggestion_source_unique_index.py | 40 +
 docker-compose.test.yml | 1 +
 requirements.txt | 1 +
 src/api/api_clients.py | 267 +++--
 src/api/bookfusion_client.py | 3 +
 src/api/booklore_client.py | 11 +-
 src/api/cwa_client.py | 156 +--
 src/api/kosync_admin.py | 284 +++++
 src/api/kosync_auth.py | 123 ++
 src/api/kosync_server.py | 1045 +----------------
 src/blueprints/abs_bp.py | 64 +-
 src/blueprints/api.py | 231 ++--
 src/blueprints/bookfusion_bp.py | 50 +-
 src/blueprints/books.py | 11 +-
 src/blueprints/covers.py | 78 +-
 src/blueprints/dashboard.py | 113 +-
 src/blueprints/helpers.py | 210 ++--
 src/blueprints/matching_bp.py | 607 +++++-----
 src/blueprints/reading_bp.py | 47 +-
 src/blueprints/settings_bp.py | 345 +++---
 src/blueprints/tbr_bp.py | 5 +-
 src/db/book_repository.py | 5 +
 src/db/bookfusion_repository.py | 99 +-
 src/db/database_service.py | 67 +-
 src/db/hardcover_repository.py | 7 -
 src/db/kosync_repository.py | 40 +-
 src/db/models.py | 6 +-
 src/db/storyteller_repository.py | 40 +-
 src/db/suggestion_repository.py | 64 +-
 src/db/tbr_repository.py | 19 +-
 src/services/abs_service.py | 4 +
 src/services/alignment_service.py | 122 +-
 src/services/background_job_service.py | 14 +-
 src/services/book_metadata_service.py | 20 +-
 src/services/client_poller.py | 6 +-
 src/services/hardcover_service.py | 6 +-
 src/services/kosync_service.py | 740 ++++++++++++
 src/services/progress_reset_service.py | 4 +-
 src/services/reading_service.py | 3 +-
 src/services/reading_stats_service.py | 4 +-
 .../storyteller_submission_service.py | 14 +-
 src/services/suggestion_service.py | 400 ++++---
 src/services/write_tracker.py | 12 +-
 src/sync_clients/abs_ebook_sync_client.py | 2 +-
 src/sync_clients/abs_sync_client.py | 8 +-
 src/sync_clients/booklore_sync_client.py | 8 +-
 src/sync_clients/hardcover_sync_client.py | 6 +-
 src/sync_clients/storyteller_sync_client.py | 2 +-
 src/sync_manager.py | 10 +-
 src/utils/constants.py | 13 +
 src/utils/cover_resolver.py | 26 +-
 src/utils/debounce_manager.py | 81 ++
 src/utils/ebook_utils.py | 996 ++--------
 src/utils/koreader_xpath.py | 510 ++++++++
 src/utils/locator_search.py | 435 +++++++
 src/utils/rate_limiter.py | 55 +
 src/utils/smil_extractor.py | 344 +++---
 src/web_server.py | 301 +++--
 static/css/components.css | 14 +-
 static/css/kosync.css | 340 ++++++
 static/css/match.css | 61 +-
 static/css/reading.css | 37 +-
 static/js/batch-match.js | 128 ++
 static/js/bookfusion.js | 888 ++++++++++++
 static/js/confirm-modal.js | 246 ++++
 static/js/dashboard.js | 187 +--
 static/js/kosync-documents.js | 746 ++++++++++++
 static/js/logs.js | 647 ++++++++++
 static/js/match.js | 391 ++++++
 static/js/reading.js | 4 +-
 static/js/settings.js | 79 +-
 static/js/suggestions.js | 441 +++++++
 static/js/tbr-detail.js | 27 +-
 static/js/utils.js | 45 +
 templates/batch_match.html | 184 +--
 templates/bookfusion.html | 836 +------------
 templates/index.html | 55 +-
 templates/kosync_documents.html | 77 ++
 templates/logs.html | 657 +----------
 templates/match.html | 400 +------
 templates/partials/book_card.html | 10 +-
 templates/partials/confirm_modal.html | 22 +
 templates/partials/navbar.html | 2 +-
 templates/reading.html | 17 +-
 templates/reading_detail.html | 29 +-
 templates/settings.html | 138 ++-
 templates/suggestions.html | 446 +------
 templates/tbr_detail.html | 4 +-
 tests/conftest.py | 230 ++++
 tests/test_abs_socket_listener.py | 241 ++--
 tests/test_alignment_service.py | 6 +-
 tests/test_api_errors.py | 253 ++++
 tests/test_apply_settings_integration.py | 305 +++++
 tests/test_background_job_service.py | 637 ++++++++++
 tests/test_book_metadata_service.py | 246 ++++
 tests/test_bookfusion_routes.py | 639 ++++++++++
 tests/test_booklore_client.py | 4 +-
 tests/test_client_poller.py | 313 +++++
 tests/test_cover_proxy.py | 110 +-
 tests/test_dashboard_errors.py | 124 ++
 tests/test_database_service_integration.py | 189 +++
 tests/test_debounce_manager.py | 152 +++
 tests/test_ebook_normalization.py | 14 +-
 tests/test_ebook_sentence_xpath_fallback.py | 16 +-
 tests/test_fix_sync_issues.py | 3 +-
 tests/test_hardcover_sync_client.py | 2 +-
 tests/test_helpers.py | 236 ++++
 tests/test_koreader_xpath.py | 247 ++++
 tests/test_kosync_server.py | 519 ++++----
 tests/test_kosync_service.py | 314 +++++
 tests/test_locator_search.py | 231 ++++
 tests/test_logs_routes.py | 404 +++++++
 tests/test_matching_errors.py | 265 +++++
 tests/test_rate_limiter.py | 62 +
 tests/test_reading_bp_errors.py | 230 ++++
 tests/test_reading_date_service.py | 477 ++++++++
 tests/test_reading_routes.py | 1 +
 tests/test_reading_service.py | 205 ++++
 tests/test_settings_comprehensive.py | 2 +-
 tests/test_storyteller_submission.py | 37 +-
 tests/test_storyteller_wordtimeline.py | 8 +-
 tests/test_suggestions_feature.py | 8 +-
 tests/test_sync_concurrency.py | 281 +++++
 tests/test_tbr_repository.py | 20 +-
 tests/test_webserver.py | 6 +-
 128 files changed, 15517 insertions(+), 6893 deletions(-)
 create mode 100644 alembic/versions/p6q7r8s9t0u1_add_suggestion_source_unique_index.py
 create mode 100644 src/api/kosync_admin.py
 create mode 100644 src/api/kosync_auth.py
 create mode 100644 src/services/kosync_service.py
 create mode 100644 src/utils/constants.py
 create mode 100644 src/utils/debounce_manager.py
 create mode 100644 src/utils/koreader_xpath.py
 create mode 100644 src/utils/locator_search.py
 create mode 100644 src/utils/rate_limiter.py
 create mode 100644 static/css/kosync.css
 create mode 100644 static/js/batch-match.js
 create mode 100644 static/js/bookfusion.js
 create mode 100644 static/js/confirm-modal.js
 create mode 100644 static/js/kosync-documents.js
 create mode 100644 static/js/logs.js
 create mode 100644 static/js/match.js
 create mode 100644 static/js/suggestions.js
 create mode 100644 static/js/utils.js
 create mode 100644 templates/kosync_documents.html
 create mode 100644 templates/partials/confirm_modal.html
 create mode 100644 tests/test_api_errors.py
 create mode 100644 tests/test_apply_settings_integration.py
 create mode 100644 tests/test_background_job_service.py
 create mode 100644 tests/test_book_metadata_service.py
 create mode 100644 tests/test_bookfusion_routes.py
 create mode 100644 tests/test_client_poller.py
 create mode 100644 tests/test_dashboard_errors.py
 create mode 100644 tests/test_debounce_manager.py
 create mode 100644 tests/test_helpers.py
 create mode 100644 tests/test_koreader_xpath.py
 create mode 100644 tests/test_kosync_service.py
 create mode 100644 tests/test_locator_search.py
 create mode 100644 tests/test_logs_routes.py
 create mode 100644 tests/test_matching_errors.py
 create mode 100644 tests/test_rate_limiter.py
 create mode 100644 tests/test_reading_bp_errors.py
 create mode 100644 tests/test_reading_date_service.py
 create mode 100644 tests/test_reading_service.py
 create mode 100644 tests/test_sync_concurrency.py

diff --git a/.gitignore b/.gitignore
index 60c8118..8f0ea7b 100644
--- a/.gitignore
+++ b/.gitignore
@@ -51,3 +51,13 @@ CLAUDE.md
 # Claude Code local settings (personal permissions/config)
 .claude/settings.local.json
 .kilocodemodes
+# Nightshift plan artifacts (keep out of version control)
+.nightshift-plan
+.todos/
+.sidecar/
+.sidecar-agent
+.sidecar-task
+.sidecar-pr
+.sidecar-start.sh
+.sidecar-base
+.td-root
diff --git a/Dockerfile b/Dockerfile index 33bb0d6..b368af4 100644 --- a/Dockerfile +++ b/Dockerfile @@ -32,8 +32,10 @@ RUN pip install --no-cache-dir --upgrade pip && \ pip install --no-cache-dir nvidia-cublas-cu12 nvidia-cudnn-cu12; \ fi -# 3. Create directories -RUN mkdir -p /app/src /app/templates /app/static /data/audio_cache /data/logs /data/transcripts /storyteller-import /storyteller-data +# 3. Create non-root user and directories +RUN useradd -r -u 1000 appuser && \ + mkdir -p /app/src /app/templates /app/static /data/audio_cache /data/logs /data/transcripts /storyteller-import /storyteller-data && \ + chown -R appuser:appuser /data /storyteller-import /storyteller-data # 4.
Copy Application Code COPY src/ /app/src/ @@ -46,6 +48,8 @@ COPY scripts/ /app/scripts/ COPY start.sh /app/start.sh RUN sed -i 's/\r$//' /app/start.sh && chmod +x /app/start.sh +USER appuser + EXPOSE 4477 HEALTHCHECK --interval=30s --timeout=10s --start-period=60s --retries=3 \ diff --git a/alembic/versions/l4m5n6o7p8q9_migrate_child_fks_to_book_id.py b/alembic/versions/l4m5n6o7p8q9_migrate_child_fks_to_book_id.py index 8936b0a..d74307e 100644 --- a/alembic/versions/l4m5n6o7p8q9_migrate_child_fks_to_book_id.py +++ b/alembic/versions/l4m5n6o7p8q9_migrate_child_fks_to_book_id.py @@ -17,6 +17,7 @@ from typing import Sequence import sqlalchemy as sa +from sqlalchemy.exc import OperationalError from alembic import op @@ -53,7 +54,7 @@ def _log_orphan_count(conn, table: str, fk_col: str = "abs_id") -> None: f"WHERE b.id IS NULL" )).scalar() if count: - logger.warning( + logger.error( "Dropping %d orphaned row(s) from '%s' with no matching book", count, table, @@ -293,10 +294,15 @@ def _upgrade_nullable_tables(conn) -> None: f"(SELECT id FROM books WHERE books.abs_id = [{table}].[{old_col}]) " f"WHERE [{old_col}] IS NOT NULL AND [{new_col}] IS NULL" )) + orphan_count = conn.execute(sa.text( + f"SELECT COUNT(*) FROM [{table}] WHERE [{old_col}] IS NOT NULL AND [{new_col}] IS NULL" + )).scalar() + if orphan_count: + logger.warning("%d row(s) in '%s' could not be linked to a book", orphan_count, table) # Create index if it doesn't exist (safe for re-runs) try: op.create_index(f'ix_{table}_{new_col}', table, [new_col]) - except Exception: + except OperationalError: pass # Index already exists from a previous partial run @@ -314,7 +320,7 @@ def downgrade() -> None: if _column_exists(conn, table, new_col): try: op.drop_index(f'ix_{table}_{new_col}', table_name=table) - except Exception: + except OperationalError: pass with op.batch_alter_table(table, recreate='always') as batch_op: batch_op.drop_column(new_col) diff --git a/alembic/versions/p6q7r8s9t0u1_add_suggestion_source_unique_index.py b/alembic/versions/p6q7r8s9t0u1_add_suggestion_source_unique_index.py new file mode 100644 index 0000000..a7baf7f --- /dev/null +++ b/alembic/versions/p6q7r8s9t0u1_add_suggestion_source_unique_index.py @@ -0,0 +1,40 @@ +"""add unique index on (source_id, source) to pending_suggestions + +Ensures that the same source_id can exist for different source types +without collision. + +Revision ID: p6q7r8s9t0u1 +Revises: o7p8q9r0s1t2 +Create Date: 2026-03-19 +""" + +from typing import Sequence + +import sqlalchemy as sa + +from alembic import op + +# revision identifiers, used by Alembic. +revision: str = 'p6q7r8s9t0u1' +down_revision: str = 'o7p8q9r0s1t2' +branch_labels: Sequence[str] | None = None +depends_on: str | None = None + + +def upgrade() -> None: + try: + op.create_index( + 'ix_pending_suggestions_source_id_source', + 'pending_suggestions', + ['source_id', 'source'], + unique=True, + ) + except sa.exc.OperationalError: + pass # Index already exists (idempotent) + + +def downgrade() -> None: + try: + op.drop_index('ix_pending_suggestions_source_id_source', table_name='pending_suggestions') + except sa.exc.OperationalError: + pass diff --git a/docker-compose.test.yml b/docker-compose.test.yml index b20db0b..6fb054d 100644 --- a/docker-compose.test.yml +++ b/docker-compose.test.yml @@ -4,6 +4,7 @@ services: context: . 
dockerfile: Dockerfile platform: linux/amd64 + user: root entrypoint: ["sh", "-c", "pip install -q pytest && python -m pytest tests/ \"$@\"", "--"] volumes: - ./src:/app/src:ro diff --git a/requirements.txt b/requirements.txt index dc54251..bc8865a 100644 --- a/requirements.txt +++ b/requirements.txt @@ -12,6 +12,7 @@ gTTS==2.5.4 flask==3.1.3 nh3==0.3.1 lxml==5.4.0 +defusedxml==0.7.1 dependency-injector==4.48.3 sqlalchemy==2.0.48 alembic==1.18.4 diff --git a/src/api/api_clients.py b/src/api/api_clients.py index 3e512a9..86cbc88 100644 --- a/src/api/api_clients.py +++ b/src/api/api_clients.py @@ -2,14 +2,17 @@ import os import time from concurrent.futures import ThreadPoolExecutor +from urllib.parse import urlparse import requests +from src.utils.constants import BOT_DEVICE_NAME, DEFAULT_COLLECTION_NAME from src.utils.kosync_headers import hash_kosync_key, kosync_auth_headers from src.utils.logging_utils import sanitize_log_data logger = logging.getLogger(__name__) + class ABSClient: def __init__(self): # Configuration is now dynamic via properties (no caching) @@ -23,9 +26,9 @@ def __init__(self): @property def base_url(self): """Dynamic base_url from environment (no caching).""" - url = os.environ.get("ABS_SERVER", "").rstrip('/') + url = os.environ.get("ABS_SERVER", "").rstrip("/") # Validate URL scheme to help catch configuration errors - if url and not url.startswith(('http://', 'https://')): + if url and not url.startswith(("http://", "https://")): logger.warning(f"ABS_SERVER missing http:// or https:// scheme: {url}") return url @@ -45,7 +48,7 @@ def _update_session_headers(self): def is_configured(self): """Check if ABS is enabled and configured with URL and token.""" - if os.environ.get("ABS_ENABLED", "true").lower() == 'false': + if os.environ.get("ABS_ENABLED", "true").lower() == "false": return False return bool(self.base_url and self.token) @@ -61,7 +64,7 @@ def check_connection(self): r = self.session.get(url, timeout=self.timeout) if r.status_code == 200: # If this is the first container start, show INFO for visibility; otherwise use DEBUG - first_run_marker = '/data/.first_run_done' + first_run_marker = "/data/.first_run_done" try: first_run = not os.path.exists(first_run_marker) except Exception: @@ -70,7 +73,7 @@ def check_connection(self): if first_run: logger.info(f"Connected to Audiobookshelf as user: {r.json().get('username', 'Unknown')}") try: - open(first_run_marker, 'w').close() + open(first_run_marker, "w").close() except Exception: pass return True @@ -86,7 +89,8 @@ def check_connection(self): return False def get_all_audiobooks(self): - if not self.is_configured(): return [] + if not self.is_configured(): + return [] # Return cached result if still fresh now = time.time() @@ -103,10 +107,10 @@ def get_all_audiobooks(self): logger.warning("ABS library fetch failed, returning stale cache") return self._audiobooks_cache return [] - libraries = r.json().get('libraries', []) + libraries = r.json().get("libraries", []) all_audiobooks = [] with ThreadPoolExecutor(max_workers=min(len(libraries) or 1, 4)) as pool: - for items in pool.map(self.get_audiobooks_for_lib, [lib['id'] for lib in libraries]): + for items in pool.map(self.get_audiobooks_for_lib, [lib["id"] for lib in libraries]): all_audiobooks.extend(items) self._audiobooks_cache = all_audiobooks self._audiobooks_cache_time = time.time() @@ -124,13 +128,14 @@ def invalidate_audiobooks_cache(self): self._audiobooks_cache_time = 0 def get_audiobooks_for_lib(self, lib: str): - if not self.is_configured(): return [] + 
if not self.is_configured(): + return [] self._update_session_headers() items_url = f"{self.base_url}/api/libraries/{lib}/items" params = {"mediaType": "audiobook"} r_items = self.session.get(items_url, params=params, timeout=self.timeout) if r_items.status_code == 200: - return r_items.json().get('results', []) + return r_items.json().get("results", []) logger.warning(f"ABS - Failed to fetch audiobooks for library '{lib}'") return [] @@ -143,8 +148,8 @@ def get_libraries(self) -> list: r = self.session.get(f"{self.base_url}/api/libraries", timeout=self.timeout) if r.status_code == 200: return [ - {'id': lib['id'], 'name': lib['name'], 'mediaType': lib.get('mediaType', 'book')} - for lib in r.json().get('libraries', []) + {"id": lib["id"], "name": lib["name"], "mediaType": lib.get("mediaType", "book")} + for lib in r.json().get("libraries", []) ] except Exception as e: logger.error(f"Failed to fetch ABS libraries: {e}") @@ -158,7 +163,8 @@ def get_audiobooks_for_libs(self, lib_ids: list) -> list: return all_items def get_audio_files(self, item_id): - if not self.is_configured(): return [] + if not self.is_configured(): + return [] self._update_session_headers() url = f"{self.base_url}/api/items/{item_id}" try: @@ -167,16 +173,13 @@ def get_audio_files(self, item_id): data = r.json() files = [] # Return list of dicts with stream_url and ext (for transcriber) - audio_files = data.get('media', {}).get('audioFiles', []) - audio_files.sort(key=lambda x: (x.get('disc', 0) or 0, x.get('track', 0) or 0)) + audio_files = data.get("media", {}).get("audioFiles", []) + audio_files.sort(key=lambda x: (x.get("disc", 0) or 0, x.get("track", 0) or 0)) for af in audio_files: stream_url = f"{self.base_url}/api/items/{item_id}/file/{af['ino']}?token={self.token}" # Return dict with stream URL and extension (default to mp3) - files.append({ - "stream_url": stream_url, - "ext": af.get("ext", "mp3") - }) + files.append({"stream_url": stream_url, "ext": af.get("ext", "mp3")}) return files return [] except Exception as e: @@ -185,26 +188,23 @@ def get_audio_files(self, item_id): def get_ebook_files(self, item_id): """Get ebook files for an item (from libraryFiles).""" - if not self.is_configured(): return [] + if not self.is_configured(): + return [] self._update_session_headers() url = f"{self.base_url}/api/items/{item_id}" try: r = self.session.get(url, timeout=self.timeout) if r.status_code == 200: data = r.json() - library_files = data.get('libraryFiles', []) + library_files = data.get("libraryFiles", []) ebook_files = [] for f in library_files: - ext = f.get('metadata', {}).get('ext') or f.get('ext') or "" - ext = ext.lower().replace('.', '') - if ext in ['epub', 'mobi', 'pdf', 'azw3']: - stream_url = f"{self.base_url}/api/items/{item_id}/file/{f['ino']}?token={self.token}" - ebook_files.append({ - "stream_url": stream_url, - "ext": ext, - "ino": f['ino'] - }) + ext = f.get("metadata", {}).get("ext") or f.get("ext") or "" + ext = ext.lower().replace(".", "") + if ext in ["epub", "mobi", "pdf", "azw3"]: + stream_url = f"{self.base_url}/api/items/{item_id}/file/{f['ino']}?token={self.token}" + ebook_files.append({"stream_url": stream_url, "ext": ext, "ino": f["ino"]}) return ebook_files return [] except Exception as e: @@ -213,7 +213,8 @@ def get_ebook_files(self, item_id): def search_ebooks(self, query): """Search for ebooks across all book libraries.""" - if not self.is_configured(): return [] + if not self.is_configured(): + return [] self._update_session_headers() results = [] try: @@ -223,17 +224,17 @@ 
def search_ebooks(self, query): logger.warning(f"ABS Search: Failed to get libraries (status {r_libs.status_code})") return [] - libraries = r_libs.json().get('libraries', []) + libraries = r_libs.json().get("libraries", []) logger.debug(f"ABS Search: Found {len(libraries)} libraries to search") # Search ALL libraries to support mixed content (e.g. ebooks in audiobook libraries) for lib in libraries: - lib_name = lib.get('name', 'Unknown') - lib_type = lib.get('mediaType', 'unknown') + lib_name = lib.get("name", "Unknown") + lib_type = lib.get("mediaType", "unknown") logger.debug(f" Searching library '{lib_name}' (type: {lib_type})") search_url = f"{self.base_url}/api/libraries/{lib['id']}/search" - params = {'q': query, 'limit': 10} + params = {"q": query, "limit": 10} r = self.session.get(search_url, params=params, timeout=self.timeout) if r.status_code == 200: @@ -246,27 +247,29 @@ def search_ebooks(self, query): logger.debug(f" Response keys: {list(data.keys())}") # Try different possible keys - items = data.get('book', []) or data.get('libraryItem', []) or data.get('results', []) + items = data.get("book", []) or data.get("libraryItem", []) or data.get("results", []) if items: logger.debug(f" ABS Search: Found {len(items)} hits in library '{lib_name}'") for item in items: # Handle different response structures if isinstance(item, dict): - lib_item = item.get('libraryItem', item) - metadata = lib_item.get('media', {}).get('metadata', {}) or lib_item.get('metadata', {}) - item_id = lib_item.get('id', item.get('id')) - title = metadata.get('title') or item.get('matchKey') - author = metadata.get('authorName') or metadata.get('author') - - results.append({ - "id": item_id, - "title": title, - "author": author, - "libraryId": lib['id'], - "source": "ABS", - "ext": "epub" - }) + lib_item = item.get("libraryItem", item) + metadata = lib_item.get("media", {}).get("metadata", {}) or lib_item.get("metadata", {}) + item_id = lib_item.get("id", item.get("id")) + title = metadata.get("title") or item.get("matchKey") + author = metadata.get("authorName") or metadata.get("author") + + results.append( + { + "id": item_id, + "title": title, + "author": author, + "libraryId": lib["id"], + "source": "ABS", + "ext": "epub", + } + ) else: logger.debug(f" No items found in library '{lib_name}'") else: @@ -281,10 +284,11 @@ def download_file(self, stream_url, output_path): """Download file from stream_url to output_path.""" self._update_session_headers() try: - logger.info(f"ABS: Downloading file from {stream_url}...") + safe_url = stream_url.split("?")[0] + "?token=***" if "?token=" in stream_url else stream_url + logger.info(f"ABS: Downloading file from {safe_url}...") with self.session.get(stream_url, stream=True, timeout=120) as r: r.raise_for_status() - with open(output_path, 'wb') as f: + with open(output_path, "wb") as f: for chunk in r.iter_content(chunk_size=8192): f.write(chunk) @@ -293,27 +297,32 @@ def download_file(self, stream_url, output_path): return False except Exception as e: logger.error(f"ABS Download failed: {e}") - if os.path.exists(output_path): os.remove(output_path) + if os.path.exists(output_path): + os.remove(output_path) return False def get_item_details(self, item_id): - if not self.is_configured(): return None + if not self.is_configured(): + return None self._update_session_headers() url = f"{self.base_url}/api/items/{item_id}" try: r = self.session.get(url, timeout=self.timeout) - if r.status_code == 200: return r.json() + if r.status_code == 200: + return r.json() except 
Exception: pass return None def get_progress(self, item_id): - if not self.is_configured(): return None + if not self.is_configured(): + return None self._update_session_headers() url = f"{self.base_url}/api/me/progress/{item_id}" try: r = self.session.get(url, timeout=self.timeout) - if r.status_code == 200: return r.json() + if r.status_code == 200: + return r.json() except Exception: logger.exception(f"Error fetching ABS progress for item {item_id}") pass @@ -359,10 +368,7 @@ def update_ebook_progress(self, item_id, progress, location): # Ensure we use a float for the progress progress = float(progress) url = f"{self.base_url}/api/me/progress/{item_id}" - payload = { - "ebookProgress": progress, - "ebookLocation": location - } + payload = {"ebookProgress": progress, "ebookLocation": location} try: r = self.session.patch(url, json=payload, timeout=self.timeout) @@ -390,10 +396,7 @@ def update_progress(self, abs_id, timestamp, time_listened): time_listened = 0.0 time_listened = float(time_listened) - payload = { - "currentTime": timestamp, - "timeListened": time_listened - } + payload = {"currentTime": timestamp, "timeListened": time_listened} return self.update_progress_using_payload(abs_id, payload) def update_progress_using_payload(self, abs_id, payload: dict): @@ -429,8 +432,8 @@ def get_all_progress_raw(self): r = self.session.get(url, timeout=self.timeout) if r.status_code == 200: data = r.json() - items = data if isinstance(data, list) else data.get('libraryItemsInProgress', []) - mapped_items = {item.get('libraryItemId'): item for item in items if item.get('libraryItemId')} + items = data if isinstance(data, list) else data.get("libraryItemsInProgress", []) + mapped_items = {item.get("libraryItemId"): item for item in items if item.get("libraryItemId")} return mapped_items elif r.status_code == 404: # Fallback to /api/me (normal for older ABS versions) @@ -440,11 +443,11 @@ def get_all_progress_raw(self): data = r2.json() # Try 'mediaInProgress' (some versions) or 'mediaProgress' (others) - items = data.get('mediaInProgress', []) + items = data.get("mediaInProgress", []) if not items: - items = data.get('mediaProgress', []) + items = data.get("mediaProgress", []) - return {item.get('libraryItemId'): item for item in items if item.get('libraryItemId')} + return {item.get("libraryItemId"): item for item in items if item.get("libraryItemId")} else: logger.warning(f"Fallback to /api/me failed: {r2.status_code}") else: @@ -461,37 +464,43 @@ def get_in_progress(self, min_progress=0.01): url = f"{self.base_url}/api/me/progress" try: r = self.session.get(url, timeout=self.timeout) - if r.status_code != 200: return [] + if r.status_code != 200: + return [] data = r.json() - items = data if isinstance(data, list) else data.get('libraryItemsInProgress', []) + items = data if isinstance(data, list) else data.get("libraryItemsInProgress", []) active_items = [] for item in items: # Filter for audiobooks only - if item.get('mediaType') and item.get('mediaType') != 'audiobook': continue + if item.get("mediaType") and item.get("mediaType") != "audiobook": + continue - duration = item.get('duration', 0) - current_time = item.get('currentTime', 0) - if duration == 0 or item.get('isFinished'): continue + duration = item.get("duration", 0) + current_time = item.get("currentTime", 0) + if duration == 0 or item.get("isFinished"): + continue pct = current_time / duration if pct >= min_progress: - lib_item_id = item.get('libraryItemId') or item.get('itemId') - if not lib_item_id: continue + lib_item_id = 
item.get("libraryItemId") or item.get("itemId") + if not lib_item_id: + continue # Return basic info without recursive detail fetch if possible # but if we need title/author we might still need it unless we have it in the list - title = item.get('metadata', {}).get('title') or "Unknown" - author = item.get('metadata', {}).get('authorName') - - active_items.append({ - "id": lib_item_id, - "title": title, - "author": author, - "progress": pct, - "duration": duration, - "source": "ABS", - "currentTime": current_time - }) + title = item.get("metadata", {}).get("title") or "Unknown" + author = item.get("metadata", {}).get("authorName") + + active_items.append( + { + "id": lib_item_id, + "title": title, + "author": author, + "progress": pct, + "duration": duration, + "source": "ABS", + "currentTime": current_time, + } + ) return active_items except Exception as e: logger.error(f"Error fetching ABS in-progress: {e}") @@ -503,23 +512,23 @@ def create_session(self, abs_id): play_url = f"{self.base_url}/api/items/{abs_id}/play" play_payload = { "deviceInfo": { - "id": "abs-kosync-bot", - "deviceId": "abs-kosync-bot", + "id": BOT_DEVICE_NAME, + "deviceId": BOT_DEVICE_NAME, "clientName": "PageKeeper", "clientVersion": "1.0", "manufacturer": "PageKeeper", "model": "Bridge", - "sdkVersion": "1.0" + "sdkVersion": "1.0", }, "mediaPlayer": "PageKeeper", "supportedMimeTypes": ["audio/mpeg", "audio/mp4"], "forceDirectPlay": True, - "forceTranscode": False + "forceTranscode": False, } try: r = self.session.post(play_url, json=play_payload, timeout=self.timeout) if r.status_code == 200: - id = r.json().get('id') + id = r.json().get("id") logger.debug(f"Created new ABS session for item {abs_id}, id: {id}") return id else: @@ -539,7 +548,7 @@ def close_session(self, session_id): def add_to_collection(self, item_id, collection_name=None): """Add an audiobook to a collection, creating the collection if it doesn't exist.""" if not collection_name: - collection_name = os.environ.get("ABS_COLLECTION_NAME", "abs-kosync") + collection_name = os.environ.get("ABS_COLLECTION_NAME", DEFAULT_COLLECTION_NAME) self._update_session_headers() try: @@ -548,17 +557,18 @@ def add_to_collection(self, item_id, collection_name=None): if r.status_code != 200: return False - collections = r.json().get('collections', []) - target_collection = next((c for c in collections if c.get('name') == collection_name), None) + collections = r.json().get("collections", []) + target_collection = next((c for c in collections if c.get("name") == collection_name), None) if not target_collection: lib_url = f"{self.base_url}/api/libraries" r_lib = self.session.get(lib_url) if r_lib.status_code == 200: - libraries = r_lib.json().get('libraries', []) + libraries = r_lib.json().get("libraries", []) if libraries: - r_create = self.session.post(collections_url, - json={"libraryId": libraries[0]['id'], "name": collection_name}) + r_create = self.session.post( + collections_url, json={"libraryId": libraries[0]["id"], "name": collection_name} + ) if r_create.status_code in [200, 201]: target_collection = r_create.json() @@ -570,7 +580,7 @@ def add_to_collection(self, item_id, collection_name=None): if r_add.status_code in [200, 201, 204]: try: details = self.get_item_details(item_id) - title = details.get('media', {}).get('metadata', {}).get('title') if details else None + title = details.get("media", {}).get("metadata", {}).get("title") if details else None except Exception: title = None logger.info(f"Added '{sanitize_log_data(title or str(item_id))}' to ABS 
Collection: {collection_name}") @@ -580,8 +590,10 @@ def add_to_collection(self, item_id, collection_name=None): logger.error(f"Error adding item to ABS collection: {e}") return False - def remove_from_collection(self, item_id, collection_name="abs-kosync"): + def remove_from_collection(self, item_id, collection_name=None): """Remove an audiobook from a collection.""" + if not collection_name: + collection_name = os.environ.get("ABS_COLLECTION_NAME", DEFAULT_COLLECTION_NAME) self._update_session_headers() try: # Get collection by name @@ -591,8 +603,8 @@ def remove_from_collection(self, item_id, collection_name="abs-kosync"): logger.warning(f"Failed to fetch collections to remove item '{item_id}'") return False - collections = r.json().get('collections', []) - target_collection = next((c for c in collections if c.get('name') == collection_name), None) + collections = r.json().get("collections", []) + target_collection = next((c for c in collections if c.get("name") == collection_name), None) if not target_collection: logger.warning(f"Collection '{collection_name}' not found, cannot remove item '{item_id}'") @@ -606,13 +618,16 @@ def remove_from_collection(self, item_id, collection_name="abs-kosync"): logger.info(f"Removed item '{item_id}' from ABS Collection: '{collection_name}'") return True else: - logger.warning(f"Failed to remove item '{item_id}' from collection '{collection_name}': {r_remove.status_code} - {r_remove.text}") + logger.warning( + f"Failed to remove item '{item_id}' from collection '{collection_name}': {r_remove.status_code} - {r_remove.text}" + ) return False except Exception as e: logger.error(f"Error removing item from ABS collection: {e}") return False + class KoSyncClient: def __init__(self): # Configuration is now dynamic via properties @@ -620,21 +635,36 @@ def __init__(self): @property def base_url(self): - url = os.environ.get("KOSYNC_SERVER", "").rstrip('/') + url = os.environ.get("KOSYNC_SERVER", "").rstrip("/") # Ensure scheme is present (case-insensitive check) - if url and not url.lower().startswith(('http://', 'https://')): + if url and not url.lower().startswith(("http://", "https://")): logger.warning(f"KOSYNC_SERVER missing scheme, auto-correcting: {url}") url = f"http://{url}" return url + @property + def _is_external(self): + if not self.base_url: + return False + hostname = urlparse(self.base_url).hostname or "" + return hostname not in ("127.0.0.1", "::1", "localhost") + @property def user(self): + if self._is_external: + ext_user = os.environ.get("KOSYNC_SERVER_USER") + if ext_user: + return ext_user return os.environ.get("KOSYNC_USER") @property def auth_token(self): + if self._is_external: + ext_key = os.environ.get("KOSYNC_SERVER_KEY", "") + if ext_key: + return hash_kosync_key(ext_key) key = os.environ.get("KOSYNC_KEY", "") if not key: return "" @@ -642,7 +672,7 @@ def auth_token(self): def is_configured(self): enabled_val = os.environ.get("KOSYNC_ENABLED", "").lower() - if enabled_val == 'false': + if enabled_val == "false": return False return bool(self.base_url and self.user) @@ -651,14 +681,14 @@ def check_connection(self): logger.warning("KoSync not configured (skipping)") return False - is_local = '127.0.0.1' in self.base_url or 'localhost' in self.base_url + is_local = "127.0.0.1" in self.base_url or "localhost" in self.base_url url = f"{self.base_url}/healthcheck" headers = kosync_auth_headers(self.user, self.auth_token) try: r = self.session.get(url, timeout=5, headers=headers) if r.status_code == 200: # First-run visible INFO, 
otherwise DEBUG - first_run_marker = '/data/.first_run_done' + first_run_marker = "/data/.first_run_done" try: first_run = not os.path.exists(first_run_marker) except Exception: @@ -667,7 +697,7 @@ def check_connection(self): if first_run: logger.info(f"Connected to KoSync Server at {self.base_url}") try: - open(first_run_marker, 'w').close() + open(first_run_marker, "w").close() except Exception: pass return True @@ -697,9 +727,9 @@ def get_progress(self, doc_id): r = self.session.get(url, headers=headers, timeout=10) if r.status_code == 200: data = r.json() - pct = float(data.get('percentage', 0)) + pct = float(data.get("percentage", 0)) # Grab the raw progress string (XPath) - xpath = data.get('progress') + xpath = data.get("progress") return pct, xpath else: logger.warning(f"KoSync GET progress for '{doc_id[:8]}...' returned {r.status_code}") @@ -708,7 +738,8 @@ def get_progress(self, doc_id): return None, None def update_progress(self, doc_id, percentage, xpath=None): - if not self.is_configured(): return False + if not self.is_configured(): + return False headers = { **kosync_auth_headers(self.user, self.auth_token), @@ -723,10 +754,10 @@ def update_progress(self, doc_id, percentage, xpath=None): "document": doc_id, "percentage": percentage, "progress": progress_val, - "device": "abs-sync-bot", - "device_id": "abs-sync-bot", + "device": BOT_DEVICE_NAME, + "device_id": BOT_DEVICE_NAME, "timestamp": int(time.time()), - "force": True # Force update to override server-side "furthest wins" logic + "force": True, # Force update to override server-side "furthest wins" logic } try: r = self.session.put(url, headers=headers, json=payload, timeout=10) diff --git a/src/api/bookfusion_client.py b/src/api/bookfusion_client.py index f846846..5b8fbe4 100644 --- a/src/api/bookfusion_client.py +++ b/src/api/bookfusion_client.py @@ -162,6 +162,9 @@ def upload_api_key(self) -> str: return os.environ.get('BOOKFUSION_UPLOAD_API_KEY', '') def is_configured(self) -> bool: + enabled_val = os.environ.get('BOOKFUSION_ENABLED', '').lower() + if enabled_val == 'false': + return False return bool(self.highlights_api_key) or bool(self.upload_api_key) def check_connection(self, api_key_override: str | None = None) -> tuple[bool, str]: diff --git a/src/api/booklore_client.py b/src/api/booklore_client.py index 84507d2..97cc572 100644 --- a/src/api/booklore_client.py +++ b/src/api/booklore_client.py @@ -7,6 +7,7 @@ import requests from src.sync_clients.sync_client_interface import LocatorResult +from src.utils.constants import DEFAULT_SHELF_NAME from src.utils.logging_utils import sanitize_log_data logger = logging.getLogger(__name__) @@ -28,7 +29,8 @@ def __init__(self, database_service=None, env_prefix="BOOKLORE", instance_id="de self.session = requests.Session() # Load cache from DB (and migrate legacy JSON if needed) - self._load_cache() + if self.is_configured(): + self._load_cache() @property def base_url(self) -> str: @@ -264,6 +266,9 @@ def _refresh_book_cache(self): Refresh the book cache using robust pagination. Fetches books in batches to ensure complete library sync. 
""" + if not self.is_configured(): + return + all_books_list = [] page = 0 batch_size = 200 # Reasonable chunk size @@ -728,7 +733,7 @@ def get_recent_activity(self, min_progress=0.01): def add_to_shelf(self, ebook_filename, shelf_name=None): """Add a book to a shelf, creating the shelf if it doesn't exist.""" if not shelf_name: - shelf_name = os.environ.get(f"{self.env_prefix}_SHELF_NAME") or "abs-kosync" + shelf_name = os.environ.get(f"{self.env_prefix}_SHELF_NAME") or DEFAULT_SHELF_NAME try: # Find the book @@ -779,7 +784,7 @@ def add_to_shelf(self, ebook_filename, shelf_name=None): def remove_from_shelf(self, ebook_filename, shelf_name=None): """Remove a book from a shelf.""" if not shelf_name: - shelf_name = os.environ.get(f"{self.env_prefix}_SHELF_NAME") or "abs-kosync" + shelf_name = os.environ.get(f"{self.env_prefix}_SHELF_NAME") or DEFAULT_SHELF_NAME try: # Find the book diff --git a/src/api/cwa_client.py b/src/api/cwa_client.py index 6f429c5..e8a155c 100644 --- a/src/api/cwa_client.py +++ b/src/api/cwa_client.py @@ -1,28 +1,31 @@ import base64 import logging import os -import xml.etree.ElementTree as ET from urllib.parse import quote +import defusedxml.ElementTree as ET import requests logger = logging.getLogger(__name__) + class CWAClient: def __init__(self): self.session = requests.Session() - self.session.headers.update({ - "User-Agent": "KOReader/2023.10", - "Accept": "application/atom+xml,application/xml,application/xhtml+xml,text/xml;q=0.9,*/*;q=0.8", - }) + self.session.headers.update( + { + "User-Agent": "KOReader/2023.10", + "Accept": "application/atom+xml,application/xml,application/xhtml+xml,text/xml;q=0.9,*/*;q=0.8", + } + ) self.search_template = None @property def base_url(self) -> str: - raw_url = os.environ.get("CWA_SERVER", "").rstrip('/') - if raw_url.endswith('/opds'): + raw_url = os.environ.get("CWA_SERVER", "").rstrip("/") + if raw_url.endswith("/opds"): raw_url = raw_url[:-5] - if raw_url and not raw_url.lower().startswith(('http://', 'https://')): + if raw_url and not raw_url.lower().startswith(("http://", "https://")): raw_url = f"http://{raw_url}" return raw_url @@ -55,8 +58,8 @@ def _make_request(self, url, **kwargs): """Helper to make requests with fresh auth headers and cleared cookies.""" try: self.session.cookies.clear() - kwargs.setdefault('timeout', self.timeout) - headers = {**self._get_auth_headers(), **kwargs.pop('headers', {})} + kwargs.setdefault("timeout", self.timeout) + headers = {**self._get_auth_headers(), **kwargs.pop("headers", {})} return self.session.get(url, headers=headers, **kwargs) except Exception as e: logger.error(f"CWA Request failed: {e}") @@ -79,8 +82,10 @@ def check_connection(self): # Check for soft login redirect (status 200 but HTML content) if r.status_code == 200: - if r.text.lstrip().lower().startswith(('/link", methods=["POST"]) +@admin_or_local_required +def api_link_kosync_document(doc_hash): + """Link a KOSync document to a book (by abs_id or book_id).""" + db = _get_db() + data = request.json + if not data: + return jsonify({"error": "Missing request body"}), 400 + + book = None + if data.get("abs_id"): + book = db.get_book_by_abs_id(data["abs_id"]) + elif data.get("book_id"): + book = db.get_book_by_id(data["book_id"]) + else: + return jsonify({"error": "Missing abs_id or book_id"}), 400 + + if not book: + return jsonify({"error": "Book not found"}), 404 + + doc = db.get_kosync_document(doc_hash) + if not doc: + return jsonify({"error": "KOSync document not found"}), 404 + + success = 
+ + +@kosync_admin_bp.route("/api/kosync-documents/<doc_hash>/unlink", methods=["POST"]) +@admin_or_local_required +def api_unlink_kosync_document(doc_hash): + """Remove the ABS book link from a KOSync document.""" + db = _get_db() + doc = db.get_kosync_document(doc_hash) + if doc and doc.linked_book_id: + book = db.get_book_by_id(doc.linked_book_id) + if book and book.kosync_doc_id == doc_hash: + book.kosync_doc_id = None + db.save_book(book) + + success = db.unlink_kosync_document(doc_hash) + if success: + _cleanup_cache_for_hash(doc_hash) + return jsonify({"success": True, "message": "Document unlinked"}) + return jsonify({"error": "Document not found"}), 404 + + +@kosync_admin_bp.route("/api/kosync-documents/<doc_hash>", methods=["DELETE"]) +@admin_or_local_required +def api_delete_kosync_document(doc_hash): + """Delete a KOSync document.""" + db = _get_db() + doc = db.get_kosync_document(doc_hash) + if doc and doc.linked_book_id: + book = db.get_book_by_id(doc.linked_book_id) + if book and book.kosync_doc_id == doc_hash: + book.kosync_doc_id = None + db.save_book(book) + + _cleanup_cache_for_hash(doc_hash) + success = db.delete_kosync_document(doc_hash) + if success: + return jsonify({"success": True, "message": "Document deleted"}) + return jsonify({"error": "Document not found"}), 404 + + +# ---------------- KOSync Document Management Page ---------------- + + +@kosync_admin_bp.route("/kosync-documents") +@admin_or_local_required +def kosync_documents_page(): + """Render the KoSync Document Management page.""" + db = _get_db() + svc = _get_svc() + docs = db.get_all_kosync_documents() + documents = [_serialize_document(doc) for doc in docs] + + orphaned = svc.get_orphaned_kosync_books() + orphaned_books = [ + { + "book_id": b.id, + "abs_id": b.abs_id, + "title": b.title, + "kosync_doc_id": b.kosync_doc_id, + "status": b.status, + "sync_mode": b.sync_mode, + } + for b in orphaned + ] + + return render_template("kosync_documents.html", documents=documents, orphaned_books=orphaned_books) + + +@kosync_admin_bp.route("/api/kosync-documents/orphaned", methods=["GET"]) +@admin_or_local_required +def api_get_orphaned_kosync_books(): + """Get books with kosync_doc_id set but no matching KosyncDocument.""" + orphaned = _get_svc().get_orphaned_kosync_books() + return jsonify( + [ + { + "book_id": b.id, + "abs_id": b.abs_id, + "title": b.title, + "kosync_doc_id": b.kosync_doc_id, + "status": b.status, + "sync_mode": b.sync_mode, + } + for b in orphaned + ] + ) + + +@kosync_admin_bp.route("/api/kosync-documents/clear-orphan/<int:book_id>", methods=["POST"]) +@admin_or_local_required +def api_clear_orphaned_hash(book_id): + """Clear kosync_doc_id from a book to stop the 502 cycle.""" + book = _get_svc().clear_orphaned_hash(book_id) + if book: + return jsonify({"success": True, "message": f"Cleared hash from {book.title}"}) + return jsonify({"error": "Book not found"}), 404
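The "orphaned" notion used by these routes — a Book still carrying a kosync_doc_id that no KosyncDocument row backs — would reduce to a query along these lines (a sketch only; the actual get_orphaned_kosync_books implementation is not shown in this patch):

    from sqlalchemy import select
    from src.db.models import Book, KosyncDocument

    def get_orphaned_kosync_books(session):
        """Books whose kosync_doc_id points at no existing KosyncDocument."""
        known_hashes = select(KosyncDocument.document_hash)
        stmt = (
            select(Book)
            .where(Book.kosync_doc_id.is_not(None))
            .where(Book.kosync_doc_id.not_in(known_hashes))
        )
        return session.execute(stmt).scalars().all()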
+ + +@kosync_admin_bp.route("/api/kosync-documents/resolve-orphan/<int:book_id>", methods=["POST"]) +@admin_or_local_required +def api_resolve_orphaned_hash(book_id): + """Create a KosyncDocument for an orphaned hash and link it to a book.""" + db = _get_db() + svc = _get_svc() + + source_book = db.get_book_by_id(book_id) + if not source_book or not source_book.kosync_doc_id: + return jsonify({"error": "Book not found or has no hash"}), 404 + + doc_hash = source_book.kosync_doc_id + data = request.json or {} + target_book_id = data.get("target_book_id") + + if target_book_id: + target_book = db.get_book_by_id(target_book_id) + if not target_book: + return jsonify({"error": "Target book not found"}), 404 + svc.register_hash_for_book(doc_hash, target_book) + source_book.kosync_doc_id = None + target_book.kosync_doc_id = doc_hash + db.save_book(source_book) + db.save_book(target_book) + return jsonify({"success": True, "message": f"Linked hash to {target_book.title}"}) + + svc.register_hash_for_book(doc_hash, source_book) + return jsonify({"success": True, "message": f"Linked hash to {source_book.title}"}) + + +@kosync_admin_bp.route("/api/kosync-documents/<doc_hash>/create-book", methods=["POST"]) +@admin_or_local_required +def api_create_book_from_hash(doc_hash): + """Create an ebook-only book from an unlinked KoSync document.""" + db = _get_db() + svc = _get_svc() + data = request.json + if not data or not data.get("title", "").strip(): + return jsonify({"error": "Title is required"}), 400 + + doc = db.get_kosync_document(doc_hash) + if not doc: + return jsonify({"error": "KoSync document not found"}), 404 + + title = data["title"].strip() + book = svc.create_ebook_only_book(doc_hash, title, doc.filename) + return jsonify( + { + "success": True, + "message": f'Created book "{book.title}"', + "book_id": book.id, + } + ) diff --git a/src/api/kosync_auth.py b/src/api/kosync_auth.py new file mode 100644 index 0000000..19288ab --- /dev/null +++ b/src/api/kosync_auth.py @@ -0,0 +1,123 @@ +""" +KoSync authentication decorators. + +Shared by both kosync_sync_bp (protocol routes) and kosync_admin_bp +(dashboard admin routes).
+""" + +import hmac +import ipaddress +import logging +import os +from functools import wraps + +from flask import current_app, jsonify, request + +from src.utils.kosync_headers import hash_kosync_key + +logger = logging.getLogger(__name__) + +_PRIVATE_NETWORKS = ( + ipaddress.ip_network("10.0.0.0/8"), + ipaddress.ip_network("172.16.0.0/12"), + ipaddress.ip_network("192.168.0.0/16"), + ipaddress.ip_network("127.0.0.0/8"), + ipaddress.ip_network("::1/128"), + ipaddress.ip_network("fd00::/8"), +) + + +def _is_private_ip(addr: str) -> bool: + """Check if an address is on a private/local network.""" + try: + ip = ipaddress.ip_address(addr) + if isinstance(ip, ipaddress.IPv6Address) and ip.ipv4_mapped: + ip = ip.ipv4_mapped + return any(ip in net for net in _PRIVATE_NETWORKS) + except (ValueError, TypeError): + return False + + +def kosync_auth_required(f): + """Decorator for KOSync authentication with rate limiting.""" + + @wraps(f) + def decorated_function(*args, **kwargs): + remote = request.remote_addr + is_loopback = remote in ("127.0.0.1", "::1") + + rate_limiter = current_app.config.get("rate_limiter") + if not is_loopback and rate_limiter: + from src.utils.rate_limiter import TokenBucketRateLimiter + + if not rate_limiter.check(remote, TokenBucketRateLimiter.AUTH_TOKEN_COST): + return jsonify({"error": "Too many requests"}), 429 + + user = request.headers.get("x-auth-user") + key = request.headers.get("x-auth-key") + + expected_user = os.environ.get("KOSYNC_USER") + expected_password = os.environ.get("KOSYNC_KEY") + + if not expected_user or not expected_password: + logger.error( + f"KOSync Integrated Server: Credentials not configured in settings (request from {request.remote_addr})" + ) + return jsonify({"error": "Server not configured"}), 500 + + expected_hash = hash_kosync_key(expected_password) + + if ( + user + and key + and expected_user + and user.lower() == expected_user.lower() + and hmac.compare_digest(key, expected_hash) + ): + return f(*args, **kwargs) + + logger.warning( + f"KOSync Integrated Server: Unauthorized access attempt from '{request.remote_addr}' (user: '{user}')" + ) + return jsonify({"error": "Unauthorized"}), 401 + + return decorated_function + + +def admin_or_local_required(f): + """Allow private IPs through; require KOSync credentials from public IPs. + + Safety: this decorator is only used on kosync_admin_bp routes, which are + registered exclusively on the LAN dashboard (port 4477). The internet-facing + sync port only serves kosync_sync_bp, so the proxy bypass here is never + reachable from outside the local network. 
+ """ + + @wraps(f) + def decorated_function(*args, **kwargs): + if _is_private_ip(request.remote_addr): + return f(*args, **kwargs) + + # Public IP — require KOSync credentials + user = request.headers.get("x-auth-user") + key = request.headers.get("x-auth-key") + expected_user = os.environ.get("KOSYNC_USER") + expected_password = os.environ.get("KOSYNC_KEY") + + if not expected_user or not expected_password: + return jsonify({"error": "Unauthorized"}), 401 + + expected_hash = hash_kosync_key(expected_password) + if ( + user + and expected_user + and user.lower() == expected_user.lower() + and key + and hmac.compare_digest(key, expected_hash) + ): + return f(*args, **kwargs) + + logger.warning(f"KOSync Admin: Unauthorized access attempt from public IP '{request.remote_addr}'") + return jsonify({"error": "Unauthorized"}), 401 + + return decorated_function diff --git a/src/api/kosync_server.py b/src/api/kosync_server.py index 99b9b55..63de5ea 100644 --- a/src/api/kosync_server.py +++ b/src/api/kosync_server.py @@ -1,205 +1,54 @@ -# KoSync Server - Extracted from web_server.py for clean code separation +# KoSync Protocol Server — KOReader sync endpoints # Implements KOSync protocol compatible with kosync-dotnet import hmac -import ipaddress import logging import os -import threading -import time -from datetime import datetime -from functools import wraps -from pathlib import Path -from flask import Blueprint, jsonify, request +from flask import Blueprint, current_app, jsonify, make_response, request +from src.api.kosync_auth import kosync_auth_required from src.utils.kosync_headers import hash_kosync_key -from src.utils.path_utils import is_safe_path_within logger = logging.getLogger(__name__) -# Create Blueprints for KoSync endpoints -# kosync_sync_bp: KOReader protocol routes (safe to expose to internet) -# kosync_admin_bp: Dashboard management routes (LAN only) -kosync_sync_bp = Blueprint('kosync', __name__) -kosync_admin_bp = Blueprint('kosync_admin', __name__) - -# Module-level references - set via init_kosync_server() -_database_service = None -_container = None -_manager = None -_hash_cache = None -_ebook_dir = None -_active_scans = set() -_active_scans_lock = threading.Lock() - -# KoSync PUT debounce state -_kosync_debounce: dict = {} # {abs_id: {'last_event': float, 'title': str, 'synced': bool}} -_kosync_debounce_lock = threading.Lock() -_debounce_thread_started = False - -# Rate limiting (token bucket per IP) -_RATE_LIMIT_CAPACITY = 30 # max burst -_RATE_LIMIT_REFILL = 2.0 # tokens per second -_AUTH_TOKEN_COST = 5 # auth attempts are expensive -_rate_limit_store: dict = {} # {ip: {'tokens': float, 'last': float}} -_rate_limit_lock = threading.Lock() - -# Stale entry cleanup threshold (seconds) -_STALE_ENTRY_SECONDS = 300 -# Debounce loop poll interval (seconds) -_DEBOUNCE_POLL_INTERVAL = 10 -# Auto-discovery concurrency cap -_MAX_ACTIVE_SCANS = 5 - - -def _rate_limit_check(ip: str, cost: int = 1) -> bool: - """Consume tokens from the bucket for `ip`. 
Returns True if allowed.""" - now = time.time() - with _rate_limit_lock: - bucket = _rate_limit_store.get(ip) - if bucket is None: - bucket = {'tokens': _RATE_LIMIT_CAPACITY, 'last': now} - _rate_limit_store[ip] = bucket - - elapsed = now - bucket['last'] - bucket['tokens'] = min(_RATE_LIMIT_CAPACITY, bucket['tokens'] + elapsed * _RATE_LIMIT_REFILL) - bucket['last'] = now - - if bucket['tokens'] >= cost: - bucket['tokens'] -= cost - return True - return False - - -def _prune_rate_limit_store(): - """Remove entries idle for more than 5 minutes.""" - now = time.time() - with _rate_limit_lock: - stale = [ip for ip, b in _rate_limit_store.items() if now - b['last'] > _STALE_ENTRY_SECONDS] - for ip in stale: - del _rate_limit_store[ip] - - -def init_kosync_server(database_service, container, manager, ebook_dir=None): - """Initialize KoSync server with required dependencies.""" - global _database_service, _container, _manager, _ebook_dir - _database_service = database_service - _container = container - _manager = manager - _ebook_dir = ebook_dir - - -def _record_kosync_event(abs_id: str, title: str) -> None: - """Record a KoSync PUT event for debounced sync triggering.""" - global _debounce_thread_started - with _kosync_debounce_lock: - _kosync_debounce[abs_id] = { - 'last_event': time.time(), - 'title': title, - 'synced': False, - } - if not _debounce_thread_started: - _debounce_thread_started = True - threading.Thread(target=_kosync_debounce_loop, daemon=True).start() - - -def _kosync_debounce_loop() -> None: - """Check periodically for books that stopped receiving KoSync PUTs.""" - debounce_seconds = int(os.environ.get('ABS_SOCKET_DEBOUNCE_SECONDS', '30')) - while True: - time.sleep(_DEBOUNCE_POLL_INTERVAL) - now = time.time() - to_sync = [] - - with _kosync_debounce_lock: - for abs_id, info in _kosync_debounce.items(): - if not info['synced'] and (now - info['last_event']) > debounce_seconds: - info['synced'] = True - to_sync.append((abs_id, info['title'])) - - for abs_id, title in to_sync: - if _manager: - book = _database_service.get_book_by_abs_id(abs_id) if _database_service else None - if not book: - logger.warning(f"KOSync PUT: No book found for '{abs_id}' — skipping sync") - continue - logger.info(f"KOSync PUT: Triggering sync for '{title}' (debounced)") - threading.Thread( - target=_manager.sync_cycle, - kwargs={'target_book_id': book.id}, - daemon=True, - ).start() - - # Clean up stale debounce entries - with _kosync_debounce_lock: - stale = [k for k, v in _kosync_debounce.items() if now - v['last_event'] > _STALE_ENTRY_SECONDS] - for k in stale: - del _kosync_debounce[k] - - # Prune stale rate-limit buckets - _prune_rate_limit_store() - - -def kosync_auth_required(f): - """Decorator for KOSync authentication.""" - @wraps(f) - def decorated_function(*args, **kwargs): - remote = request.remote_addr - is_loopback = remote in ('127.0.0.1', '::1') - if not is_loopback and not _rate_limit_check(remote, _AUTH_TOKEN_COST): - return jsonify({"error": "Too many requests"}), 429 - - user = request.headers.get('x-auth-user') - key = request.headers.get('x-auth-key') - - expected_user = os.environ.get("KOSYNC_USER") - expected_password = os.environ.get("KOSYNC_KEY") - - if not expected_user or not expected_password: - logger.error(f"KOSync Integrated Server: Credentials not configured in settings (request from {request.remote_addr})") - return jsonify({"error": "Server not configured"}), 500 - - expected_hash = hash_kosync_key(expected_password) - - if (user and key and expected_user - and 
user.lower() == expected_user.lower() - and hmac.compare_digest(key, expected_hash)): - return f(*args, **kwargs) - - logger.warning(f"KOSync Integrated Server: Unauthorized access attempt from '{request.remote_addr}' (user: '{user}')") - return jsonify({"error": "Unauthorized"}), 401 - return decorated_function +kosync_sync_bp = Blueprint("kosync", __name__) # ---------------- CORS: block browser preflight ---------------- + @kosync_sync_bp.before_request def _kosync_cors_preflight(): """Return bare 204 for OPTIONS requests. KOReader is native — it never sends Origin/OPTIONS. This blocks browser-based cross-origin abuse.""" - if request.method == 'OPTIONS': - return '', 204 + if request.method == "OPTIONS": + return "", 204 # ---------------- KOSync Protocol Endpoints ---------------- -@kosync_sync_bp.route('/healthcheck') -@kosync_sync_bp.route('/koreader/healthcheck') + +@kosync_sync_bp.route("/healthcheck") +@kosync_sync_bp.route("/koreader/healthcheck") def kosync_healthcheck(): """KOSync connectivity check""" return "OK", 200 -@kosync_sync_bp.route('/users/auth', methods=['GET']) -@kosync_sync_bp.route('/koreader/users/auth', methods=['GET']) +@kosync_sync_bp.route("/users/auth", methods=["GET"]) +@kosync_sync_bp.route("/koreader/users/auth", methods=["GET"]) def kosync_users_auth(): """KOReader auth check - validates credentials per kosync-dotnet spec""" remote = request.remote_addr - if remote not in ('127.0.0.1', '::1') and not _rate_limit_check(remote, _AUTH_TOKEN_COST): - return jsonify({"message": "Too many requests"}), 429 + rate_limiter = current_app.config.get("rate_limiter") + if remote not in ("127.0.0.1", "::1") and rate_limiter: + from src.utils.rate_limiter import TokenBucketRateLimiter - user = request.headers.get('x-auth-user') - key = request.headers.get('x-auth-key') + if not rate_limiter.check(remote, TokenBucketRateLimiter.AUTH_TOKEN_COST): + return jsonify({"message": "Too many requests"}), 429 + + user = request.headers.get("x-auth-user") + key = request.headers.get("x-auth-key") expected_user = os.environ.get("KOSYNC_USER") expected_password = os.environ.get("KOSYNC_KEY") @@ -222,832 +71,62 @@ def kosync_users_auth(): return jsonify({"message": "Unauthorized"}), 401 -@kosync_sync_bp.route('/users/create', methods=['POST']) -@kosync_sync_bp.route('/koreader/users/create', methods=['POST']) +@kosync_sync_bp.route("/users/create", methods=["POST"]) +@kosync_sync_bp.route("/koreader/users/create", methods=["POST"]) def kosync_users_create(): """Stub for KOReader user registration check""" remote = request.remote_addr - if remote not in ('127.0.0.1', '::1') and not _rate_limit_check(remote, _AUTH_TOKEN_COST): - return jsonify({"error": "Too many requests"}), 429 + rate_limiter = current_app.config.get("rate_limiter") + if remote not in ("127.0.0.1", "::1") and rate_limiter: + from src.utils.rate_limiter import TokenBucketRateLimiter - return jsonify({ - "id": 1, - "username": os.environ.get("KOSYNC_USER", "user") - }), 201 + if not rate_limiter.check(remote, TokenBucketRateLimiter.AUTH_TOKEN_COST): + return jsonify({"error": "Too many requests"}), 429 + return jsonify({"id": 1, "username": os.environ.get("KOSYNC_USER", "user")}), 201 -@kosync_sync_bp.route('/users/login', methods=['POST']) -@kosync_sync_bp.route('/koreader/users/login', methods=['POST']) + +@kosync_sync_bp.route("/users/login", methods=["POST"]) +@kosync_sync_bp.route("/koreader/users/login", methods=["POST"]) def kosync_users_login(): """Stub for KOReader login check""" remote = request.remote_addr 
- if remote not in ('127.0.0.1', '::1') and not _rate_limit_check(remote, _AUTH_TOKEN_COST): - return jsonify({"error": "Too many requests"}), 429 + rate_limiter = current_app.config.get("rate_limiter") + if remote not in ("127.0.0.1", "::1") and rate_limiter: + from src.utils.rate_limiter import TokenBucketRateLimiter + + if not rate_limiter.check(remote, TokenBucketRateLimiter.AUTH_TOKEN_COST): + return jsonify({"error": "Too many requests"}), 429 - return jsonify({ - "id": 1, - "username": os.environ.get("KOSYNC_USER", "user"), - "active": True - }), 200 + return jsonify({"id": 1, "username": os.environ.get("KOSYNC_USER", "user"), "active": True}), 200 -@kosync_sync_bp.route('/syncs/progress/<doc_id>', methods=['GET']) -@kosync_sync_bp.route('/koreader/syncs/progress/<doc_id>', methods=['GET']) +@kosync_sync_bp.route("/syncs/progress/<doc_id>", methods=["GET"]) +@kosync_sync_bp.route("/koreader/syncs/progress/<doc_id>", methods=["GET"]) @kosync_auth_required def kosync_get_progress(doc_id): - """ - Fetch progress for a specific document. - Returns 502 (not 404) if document not found, per kosync-dotnet spec. - - Lookup order: - 1. Direct hash match in kosync_documents - 2. Book lookup by kosync_doc_id - 3. Sibling hash resolution (same book, different epub hash) - 4. Background auto-discovery for completely unknown hashes - """ - if len(doc_id) > 64: - return jsonify({"error": "Document ID too long"}), 400 - - logger.info(f"KOSync: GET progress for doc {doc_id[:8]}... from {request.remote_addr}") - - # Step 1: Direct hash lookup - kosync_doc = _database_service.get_kosync_document(doc_id) - if kosync_doc: - # If linked to a book, always check siblings for freshest progress. - # This prevents "shadow" docs (created by sync-bot PUTs) from returning - # stale data when the real device hash has advanced further.
- if kosync_doc.linked_abs_id: - book = _database_service.get_book_by_abs_id(kosync_doc.linked_abs_id) - if book: - return _respond_from_book_states(doc_id, book) - - has_progress = kosync_doc.percentage and float(kosync_doc.percentage) > 0 - if has_progress: - return jsonify({ - "device": kosync_doc.device or "", - "device_id": kosync_doc.device_id or "", - "document": kosync_doc.document_hash, - "percentage": float(kosync_doc.percentage) if kosync_doc.percentage else 0, - "progress": kosync_doc.progress or "", - "timestamp": int(kosync_doc.timestamp.timestamp()) if kosync_doc.timestamp else 0 - }), 200 - # Document exists but has no progress and no linked book — fall through - # to try sibling resolution for better data - - # Step 2: Book lookup by kosync_doc_id - book = _database_service.get_book_by_kosync_id(doc_id) - if book: - return _respond_from_book_states(doc_id, book) - - # Step 3: Sibling hash resolution — find the book via other linked hashes - resolved_book = _resolve_book_by_sibling_hash(doc_id, existing_doc=kosync_doc) - if resolved_book: - _register_hash_for_book(doc_id, resolved_book) - return _respond_from_book_states(doc_id, resolved_book) - - # Step 4: Unknown hash — register stub and start background discovery - auto_create = os.environ.get('AUTO_CREATE_EBOOK_MAPPING', 'true').lower() == 'true' - start_discovery = False - with _active_scans_lock: - if auto_create and doc_id not in _active_scans and len(_active_scans) < _MAX_ACTIVE_SCANS: - _active_scans.add(doc_id) - start_discovery = True - if start_discovery: - from src.db.models import KosyncDocument as KD - stub = KD(document_hash=doc_id) - _database_service.save_kosync_document(stub) - logger.info(f"KOSync: Created stub for unknown hash {doc_id[:8]}..., starting background discovery") - threading.Thread(target=_run_get_auto_discovery, args=(doc_id,), daemon=True).start() - - logger.warning(f"KOSync: Document not found: {doc_id[:8]}... (GET from {request.remote_addr})") - return jsonify({"message": "Document not found on server"}), 502 - - -@kosync_sync_bp.route('/syncs/progress', methods=['PUT']) -@kosync_sync_bp.route('/koreader/syncs/progress', methods=['PUT']) + """Fetch progress for a specific document. + Returns 502 (not 404) if document not found, per kosync-dotnet spec.""" + if not doc_id or len(doc_id) > 64 or not all(c.isalnum() or c in "-_" for c in doc_id): + return jsonify({"message": "Invalid document ID"}), 400 + svc = current_app.config["kosync_service"] + result, status = svc.handle_get_progress(doc_id, request.remote_addr) + resp = make_response(jsonify(result), status) + resp.content_type = "application/json" + return resp + + +@kosync_sync_bp.route("/syncs/progress", methods=["PUT"]) +@kosync_sync_bp.route("/koreader/syncs/progress", methods=["PUT"]) @kosync_auth_required def kosync_put_progress(): - """ - Receive progress update from KOReader. - Stores ALL documents, whether mapped to ABS or not. 
- """ - from src.db.models import Book, KosyncDocument - + """Receive progress update from KOReader.""" data = request.json - if not data: - logger.warning(f"KOSync: PUT progress with no JSON data from {request.remote_addr}") - return jsonify({"error": "No data"}), 400 - - doc_hash = data.get('document') - if not doc_hash or not isinstance(doc_hash, str): - logger.warning(f"KOSync: PUT progress with no document ID from {request.remote_addr}") - return jsonify({"error": "Missing document ID"}), 400 - if len(doc_hash) > 64: - return jsonify({"error": "Document hash too long"}), 400 - - # Validate percentage - percentage = data.get('percentage', 0) - try: - percentage = float(percentage) - except (TypeError, ValueError): - return jsonify({"error": "Invalid percentage value"}), 400 - if percentage < 0.0 or percentage > 1.0: - return jsonify({"error": "Percentage must be between 0.0 and 1.0"}), 400 - - logger.info(f"KOSync: PUT progress request for doc {doc_hash[:8]}... from {request.remote_addr} (device: {data.get('device', 'unknown')})") - - progress = str(data.get('progress', ''))[:512] - device = str(data.get('device', ''))[:128] - device_id = str(data.get('device_id', ''))[:64] - - now = datetime.utcnow() - - kosync_doc = _database_service.get_kosync_document(doc_hash) - - # Optional "furthest wins" protection - furthest_wins = os.environ.get('KOSYNC_FURTHEST_WINS', 'true').lower() == 'true' - force_update = data.get('force', False) - - # Allow rewinds if: - # 1. Force flag is set (e.g. from SyncManager) - # 2. Update comes from the SAME device (user moved slider back) - same_device = (kosync_doc and kosync_doc.device_id == device_id) - - if furthest_wins and kosync_doc and kosync_doc.percentage and not force_update and not same_device: - existing_pct = float(kosync_doc.percentage) - new_pct = float(percentage) - - if new_pct < existing_pct - 0.0001: - logger.info(f"KOSync: Ignored progress from '{device}' for doc {doc_hash[:8]}... (server has higher: {existing_pct:.2f}% vs new {new_pct:.2f}%)") - return jsonify({ - "document": doc_hash, - "timestamp": int(kosync_doc.timestamp.timestamp()) if kosync_doc.timestamp else int(now.timestamp()) - }), 200 - - if kosync_doc is None: - kosync_doc = KosyncDocument( - document_hash=doc_hash, - progress=progress, - percentage=percentage, - device=device, - device_id=device_id, - timestamp=now - ) - logger.info(f"KOSync: New document tracked: {doc_hash[:8]}... from device '{device}'") - else: - logger.info(f"KOSync: Received progress from '{device}' for doc {doc_hash[:8]}... 
-> {float(percentage):.2f}% (Updated from {float(kosync_doc.percentage) if kosync_doc.percentage else 0:.2f}%)") - kosync_doc.progress = progress - kosync_doc.percentage = percentage - kosync_doc.device = device - kosync_doc.device_id = device_id - kosync_doc.timestamp = now - - _database_service.save_kosync_document(kosync_doc) - - # Update linked book if exists - linked_book = None - if kosync_doc.linked_abs_id: - linked_book = _database_service.get_book_by_abs_id(kosync_doc.linked_abs_id) - else: - linked_book = _database_service.get_book_by_kosync_id(doc_hash) - if linked_book: - _database_service.link_kosync_document(doc_hash, linked_book.abs_id) - - # AUTO-DISCOVERY - if not linked_book: - auto_create = os.environ.get('AUTO_CREATE_EBOOK_MAPPING', 'true').lower() == 'true' - - if auto_create: - start_discovery = False - with _active_scans_lock: - if doc_hash not in _active_scans and len(_active_scans) < _MAX_ACTIVE_SCANS: - _active_scans.add(doc_hash) - start_discovery = True - - if start_discovery: - def run_auto_discovery(doc_hash_val): - try: - import json - - from src.db.models import PendingSuggestion - - logger.info(f"KOSync: Scheduled auto-discovery for unmapped document {doc_hash_val[:8]}...") - epub_filename = _try_find_epub_by_hash(doc_hash_val) - - if not epub_filename: - logger.debug(f"Could not auto-match EPUB for KOSync document '{doc_hash_val[:8]}'") - return - - title = Path(epub_filename).stem - - # Step 1: Check if there's a matching audiobook in ABS - audiobook_matches = [] - if _container.abs_client().is_configured(): - try: - audiobooks = _container.abs_client().get_all_audiobooks() - search_term = title - - logger.debug(f"Auto-discovery: Searching for audiobook matching '{search_term}' in {len(audiobooks)} audiobooks") - - for ab in audiobooks: - media = ab.get('media', {}) - metadata = media.get('metadata', {}) - ab_title = (metadata.get('title') or ab.get('name', '')) - ab_author = metadata.get('authorName', '') - - # Use same simple matching as UI search (normalized substring) - def normalize(s): - import re - return re.sub(r'[^\w\s]', '', s.lower()) - - search_norm = normalize(search_term) - title_norm = normalize(ab_title) - - if (search_norm and title_norm) and (search_norm in title_norm or title_norm in search_norm): - # Skip books with high progress (>75%) - they're already mostly done - duration = media.get('duration', 0) - progress_pct = 0 - if duration > 0: - # Get progress from ABS for this audiobook - try: - ab_progress = _container.abs_client().get_progress(ab['id']) - if ab_progress: - progress_pct = ab_progress.get('progress', 0) * 100 - except Exception as e: - logger.debug(f"Failed to get ABS progress during auto-discovery: {e}") - - if progress_pct > 75: - logger.debug(f"Auto-discovery: Skipping '{ab_title}' - already {progress_pct:.0f}% complete") - continue - - logger.debug(f"Auto-discovery: Matched '{ab_title}' by {ab_author} for search term '{search_term}'") - audiobook_matches.append({ - "source": "abs", - "abs_id": ab['id'], - "title": ab_title, - "author": ab_author, - "duration": duration, - "confidence": "high" - }) - - except Exception as e: - logger.warning(f"Error searching ABS for audiobooks: {e}") - - # Step 2: If audiobook matches found, create a suggestion for user review - if audiobook_matches: - # Check if suggestion already exists (pending OR hidden - don't re-suggest) - if not _database_service.suggestion_exists(doc_hash_val): - suggestion = PendingSuggestion( - source_id=doc_hash_val, - title=title, - author=None, # Could 
extract from EPUB metadata - cover_url=f"/api/cover-proxy/{audiobook_matches[0]['abs_id']}", - matches_json=json.dumps(audiobook_matches + [{ - "source": "ebook", - "filename": epub_filename, - "confidence": "high" - }]), - source='kosync', - ) - _database_service.save_pending_suggestion(suggestion) - logger.info(f"Created suggestion for '{title}' - found {len(audiobook_matches)} audiobook match(es)") - return - - # Step 3: No audiobook found - fall back to ebook-only mapping - logger.info(f"No audiobook match for '{title}' - creating ebook-only mapping") - book = Book( - abs_id=None, - title=title, - ebook_filename=epub_filename, - kosync_doc_id=doc_hash_val, - transcript_file=None, - status='active', - duration=None, - sync_mode='ebook_only' - ) - _database_service.save_book(book, is_new=True) - _database_service.link_kosync_document(doc_hash_val, str(book.id)) - _database_service.resolve_suggestion(doc_hash_val) - logger.info(f"Auto-created ebook-only mapping: {book.id} -> {epub_filename}") - - if _manager: - _manager.sync_cycle(target_book_id=book.id) - - except Exception as e: - logger.error(f"Error in auto-discovery background task: {e}") - finally: - with _active_scans_lock: - _active_scans.discard(doc_hash_val) - - threading.Thread(target=run_auto_discovery, args=(doc_hash,), daemon=True).start() - - if linked_book: - # Flag activity on paused/DNF books - if linked_book.status in ('paused', 'dnf', 'not_started') and not linked_book.activity_flag: - linked_book.activity_flag = True - _database_service.save_book(linked_book) - logger.info(f"KOSync PUT: Activity detected on {linked_book.status} book '{linked_book.title}'") - - # NOTE: We intentionally do NOT update book_states here. - # The sync cycle is the only thing that should update book_states. - # This ensures proper delta detection between cycles. - logger.debug(f"KOSync: Updated linked book '{linked_book.title}' to {percentage:.2%}") - - # Debounce sync trigger — wait until the reader stops turning pages - # Skip if the update came from the sync bot itself (prevents sync→PUT→sync loop) - # Skip if instant sync is globally disabled. 
- is_internal = device and device.lower() in ('abs-sync-bot', 'book-stitch', 'book-sync', 'pagekeeper') - instant_sync_enabled = os.environ.get('INSTANT_SYNC_ENABLED', 'true').lower() != 'false' - if linked_book.status == 'active' and _manager and not is_internal and instant_sync_enabled: - logger.debug(f"KOSync PUT: Progress event recorded for '{linked_book.title}'") - _record_kosync_event(linked_book.abs_id, linked_book.title) - - response_timestamp = now.isoformat() + "Z" - if device and device.lower() == "booknexus": - # BookNexus expects an integer timestamp (Unix epoch) - response_timestamp = int(now.timestamp()) - - return jsonify({ - "document": doc_hash, - "timestamp": response_timestamp - }), 200 - - -# ---------------- Helper Functions ---------------- - -def _upsert_kosync_metadata(document_hash, filename, source, mtime=None, booklore_id=None): - """Cache hash metadata without overwriting any existing progress data.""" - from src.db.models import KosyncDocument - - existing = _database_service.get_kosync_document(document_hash) - if existing: - existing.filename = filename - existing.source = source - if mtime is not None: - existing.mtime = mtime - if booklore_id is not None: - existing.booklore_id = booklore_id - _database_service.save_kosync_document(existing) - else: - doc = KosyncDocument( - document_hash=document_hash, - filename=filename, - source=source, - mtime=mtime, - booklore_id=booklore_id, - ) - _database_service.save_kosync_document(doc) - - -def _try_find_epub_by_hash(doc_hash: str) -> str | None: - """Try to find matching EPUB file for a KOSync document hash.""" - try: - # Check database for linked document first - doc = _database_service.get_kosync_document(doc_hash) - if doc and doc.filename: - try: - _container.ebook_parser().resolve_book_path(doc.filename) - logger.info(f"Matched EPUB via DB: {doc.filename}") - return doc.filename - except FileNotFoundError: - logger.debug(f"DB suggested '{doc.filename}' but file is missing — Re-scanning") - - # Check if valid linked book exists with original filename - if doc and doc.linked_abs_id: - book = _database_service.get_book_by_abs_id(doc.linked_abs_id) - if book and book.original_ebook_filename: - try: - _container.ebook_parser().resolve_book_path(book.original_ebook_filename) - logger.info(f"Matched EPUB via Linked Book Original Filename: {book.original_ebook_filename}") - return book.original_ebook_filename - except Exception: - pass - - # Check filesystem - if _ebook_dir and _ebook_dir.exists(): - logger.info(f"Starting filesystem search in {_ebook_dir} for hash {doc_hash[:8]}...") - count = 0 - for epub_path in _ebook_dir.rglob("*.epub"): - count += 1 - if count % 100 == 0: - logger.debug(f"Checked {count} local EPUBs...") - - # Optimization: Check if we already have this file's hash in DB - cached_doc = _database_service.get_kosync_doc_by_filename(epub_path.name) - if cached_doc: - # Check mtime for invalidation - current_mtime = epub_path.stat().st_mtime - if cached_doc.mtime == current_mtime: - if cached_doc.document_hash == doc_hash: - logger.info(f"Matched EPUB via DB filename lookup: {epub_path.name}") - return epub_path.name - continue - - try: - computed_hash = _container.ebook_parser().get_kosync_id(epub_path) - - # Store/Update in DB - if cached_doc: - cached_doc.document_hash = computed_hash - cached_doc.mtime = epub_path.stat().st_mtime - cached_doc.source = 'filesystem' - _database_service.save_kosync_document(cached_doc) - else: - _upsert_kosync_metadata(computed_hash, epub_path.name, 
'filesystem', - mtime=epub_path.stat().st_mtime) - - if computed_hash == doc_hash: - logger.info(f"Matched EPUB via filesystem: {epub_path.name}") - return epub_path.name - except Exception as e: - logger.debug(f"Error checking file {epub_path.name}: {e}") - logger.info(f"Filesystem search finished. Checked {count} files. No match found") - - # Fallback to Booklore - bl_group = _container.booklore_client_group() - if bl_group.is_configured(): - logger.info("Starting Booklore API search...") - - try: - # Query BookloreBook cache in DB first (all servers) - books = _database_service.get_all_booklore_books() - if not books: - # If DB cache empty, fetch from API - bl_group.get_all_books() - logger.info("Booklore cache in DB is empty. Consider running a library sync.") - from src.services.library_service import LibraryService - lib_service = LibraryService(_database_service, bl_group) - lib_service.sync_library_books() - books = _database_service.get_all_booklore_books() - - logger.info(f"Scanning {len(books)} books from Booklore DB cache...") - - for book in books: - raw_id = book.raw_metadata_dict.get('id') if hasattr(book, 'raw_metadata_dict') else None - book_id = str(raw_id) if raw_id is not None else None - # Fallback to parsing raw_metadata if needed - if not book_id: - import json - try: - meta = json.loads(book.raw_metadata) - fallback_id = meta.get('id') - book_id = str(fallback_id) if fallback_id is not None else None - except (json.JSONDecodeError, AttributeError) as e: - logger.debug(f"Failed to parse raw_metadata JSON: {e}") - continue - - # Qualified booklore_id: "server_id:book_id" - qualified_id = f"{book.server_id}:{book_id}" - - # Check if we have a KosyncDocument for this Booklore ID - cached_doc = _database_service.get_kosync_doc_by_booklore_id(qualified_id) - if cached_doc: - if cached_doc.document_hash == doc_hash: - logger.info(f"Matched EPUB via Booklore ID in DB: {book.filename}") - return book.filename - - try: - # Use qualified ID for download so the group routes to the right server - book_content = bl_group.download_book(qualified_id) - if book_content: - computed_hash = _container.ebook_parser().get_kosync_id_from_bytes(book.filename, book_content) - - if computed_hash == doc_hash: - safe_title = f"{book.server_id}_{book.filename}" - cache_dir = _container.data_dir() / "epub_cache" - cache_dir.mkdir(parents=True, exist_ok=True) - cache_path = cache_dir / safe_title - with open(cache_path, 'wb') as f: - f.write(book_content) - logger.info(f"Persisted Booklore book to cache: {safe_title}") - - # Save/Update KosyncDocument in DB - if cached_doc: - cached_doc.document_hash = computed_hash - cached_doc.filename = safe_title - cached_doc.source = 'booklore' - _database_service.save_kosync_document(cached_doc) - else: - _upsert_kosync_metadata(computed_hash, safe_title, 'booklore', - booklore_id=qualified_id) - - logger.info(f"Matched EPUB via Booklore download: {safe_title}") - return safe_title - except Exception as e: - logger.warning(f"Failed to check Booklore book '{book.title}': {e}") - - logger.info(f"Booklore search finished. Checked {len(books)} books. No match found") - - except Exception as e: - logger.debug(f"Error querying Booklore for EPUB matching: {e}") - - except Exception as e: - logger.error(f"Error in EPUB auto-discovery: {e}") - - logger.info("Auto-discovery finished. 
No match found") - return None - - -# ---------------- GET Fallback Helpers ---------------- - -def _respond_from_book_states(doc_id, book): - """Build a GET response from a book's state data. Returns (response, status_code).""" - states = _database_service.get_states_for_book(book.id) - - # Also check sibling kosync_documents for device-specific progress - sibling_docs = _database_service.get_kosync_documents_for_book_by_book_id(book.id) - # Filter out stale siblings (not updated in >30 days), with fallback to any sibling with progress - now_ts = time.time() - docs_with_progress = [ - d for d in sibling_docs - if d.percentage and float(d.percentage) > 0 - and d.timestamp and (now_ts - d.timestamp.timestamp()) < 30 * 86400 - ] - if not docs_with_progress: - docs_with_progress = [ - d for d in sibling_docs - if d.percentage and float(d.percentage) > 0 and d.timestamp - ] - if docs_with_progress: - best_doc = max(docs_with_progress, key=lambda d: float(d.percentage)) - logger.info(f"KOSync: Resolved {doc_id[:8]}... to '{book.title}' via sibling hash {best_doc.document_hash[:8]}... ({float(best_doc.percentage):.2%})") - return jsonify({ - "device": best_doc.device or "pagekeeper", - "device_id": best_doc.device_id or "pagekeeper", - "document": doc_id, - "percentage": float(best_doc.percentage), - "progress": best_doc.progress or "", - "timestamp": int(best_doc.timestamp.timestamp()) if best_doc.timestamp else 0 - }), 200 - - if not states: - return jsonify({"message": "Document not found on server"}), 502 - - kosync_state = next((s for s in states if s.client_name.lower() == 'kosync'), None) - latest_state = kosync_state or max(states, key=lambda s: s.last_updated if s.last_updated else 0) - - return jsonify({ - "device": "pagekeeper", - "device_id": "pagekeeper", - "document": doc_id, - "percentage": float(latest_state.percentage) if latest_state.percentage else 0, - "progress": (latest_state.xpath or latest_state.cfi) if hasattr(latest_state, 'xpath') else "", - "timestamp": int(latest_state.last_updated) if latest_state.last_updated else 0 - }), 200 - - -def _resolve_book_by_sibling_hash(doc_id: str, existing_doc=None): - """ - Try to resolve an unknown hash to a known book using DB-only lookups. - Checks if any other KosyncDocument with the same filename is already linked. - """ - # Check if this hash has a filename cached (from a prior scan/PUT) - doc = existing_doc or _database_service.get_kosync_document(doc_id) - if doc and doc.filename: - # Find a sibling document with the same filename that's linked to a book - sibling = _database_service.get_kosync_doc_by_filename(doc.filename) - if sibling and sibling.linked_abs_id and sibling.document_hash != doc_id: - book = _database_service.get_book_by_abs_id(sibling.linked_abs_id) - if book: - logger.info(f"KOSync: Resolved {doc_id[:8]}... to '{book.title}' via filename sibling") - return book - - # Check if the filename matches a book's ebook_filename directly - book = _database_service.get_book_by_ebook_filename(doc.filename) - if book: - logger.info(f"KOSync: Resolved {doc_id[:8]}... 
to '{book.title}' via ebook filename match") - return book - - return None - - -def _register_hash_for_book(doc_id: str, book): - """Register a new hash and link it to an existing book.""" - from src.db.models import KosyncDocument as KD - - existing = _database_service.get_kosync_document(doc_id) - if existing: - if not existing.linked_abs_id: - _database_service.link_kosync_document(doc_id, book.abs_id) - logger.info(f"KOSync: Linked existing document {doc_id[:8]}... to '{book.title}'") - else: - doc = KD(document_hash=doc_id, linked_abs_id=book.abs_id) - _database_service.save_kosync_document(doc) - logger.info(f"KOSync: Created and linked new document {doc_id[:8]}... to '{book.title}'") - - -def _run_get_auto_discovery(doc_id: str): - """Background auto-discovery triggered by GET for an unknown hash. - Finds the matching epub and links the hash to an existing book.""" - try: - logger.info(f"KOSync: Background discovery (GET) for {doc_id[:8]}...") - epub_filename = _try_find_epub_by_hash(doc_id) - - if not epub_filename: - logger.info(f"KOSync: GET-discovery found no epub for {doc_id[:8]}...") - return - - # Update stub with filename - doc = _database_service.get_kosync_document(doc_id) - if doc and not doc.filename: - doc.filename = epub_filename - _database_service.save_kosync_document(doc) - - # Try to find an existing book that uses this epub - book = _database_service.get_book_by_ebook_filename(epub_filename) - if book: - _database_service.link_kosync_document(doc_id, book.abs_id) - logger.info(f"KOSync: GET-discovery linked {doc_id[:8]}... to '{book.title}'") - return - - logger.info(f"KOSync: GET-discovery found epub '{epub_filename}' but no matching book") - except Exception as e: - logger.error(f"Error in GET auto-discovery: {e}") - finally: - with _active_scans_lock: - _active_scans.discard(doc_id) - - -# ---------------- Admin Auth ---------------- - -_PRIVATE_NETWORKS = ( - ipaddress.ip_network('10.0.0.0/8'), - ipaddress.ip_network('172.16.0.0/12'), - ipaddress.ip_network('192.168.0.0/16'), - ipaddress.ip_network('127.0.0.0/8'), - ipaddress.ip_network('::1/128'), - ipaddress.ip_network('fd00::/8'), -) - - -def _is_private_ip(addr: str) -> bool: - """Check if an address is on a private/local network.""" - try: - ip = ipaddress.ip_address(addr) - return any(ip in net for net in _PRIVATE_NETWORKS) - except (ValueError, TypeError): - return False - - -def admin_or_local_required(f): - """Allow private IPs through; require KOSync credentials from public IPs. - - Safety: this decorator is only used on kosync_admin_bp routes, which are - registered exclusively on the LAN dashboard (port 4477). The internet-facing - sync port only serves kosync_sync_bp, so the proxy bypass here is never - reachable from outside the local network. 
- """ - @wraps(f) - def decorated_function(*args, **kwargs): - if _is_private_ip(request.remote_addr): - return f(*args, **kwargs) - - # Public IP — require KOSync credentials - user = request.headers.get('x-auth-user') - key = request.headers.get('x-auth-key') - expected_user = os.environ.get("KOSYNC_USER") - expected_password = os.environ.get("KOSYNC_KEY") - - if not expected_user or not expected_password: - return jsonify({"error": "Unauthorized"}), 401 - - expected_hash = hash_kosync_key(expected_password) - if (user and expected_user and user.lower() == expected_user.lower() - and key and hmac.compare_digest(key, expected_hash)): - return f(*args, **kwargs) - - logger.warning(f"KOSync Admin: Unauthorized access attempt from public IP '{request.remote_addr}'") - return jsonify({"error": "Unauthorized"}), 401 - return decorated_function - - -# ---------------- KOSync Document Management API ---------------- - -@kosync_admin_bp.route('/api/kosync-documents', methods=['GET']) -@admin_or_local_required -def api_get_kosync_documents(): - """Get all KOSync documents with their link status.""" - docs = _database_service.get_all_kosync_documents() - result = [] - for doc in docs: - linked_book = None - if doc.linked_abs_id: - linked_book = _database_service.get_book_by_abs_id(doc.linked_abs_id) - - result.append({ - 'document_hash': doc.document_hash, - 'progress': doc.progress, - 'percentage': float(doc.percentage) if doc.percentage else 0, - 'device': doc.device, - 'device_id': doc.device_id, - 'timestamp': doc.timestamp.isoformat() if doc.timestamp else None, - 'first_seen': doc.first_seen.isoformat() if doc.first_seen else None, - 'last_updated': doc.last_updated.isoformat() if doc.last_updated else None, - 'linked_abs_id': doc.linked_abs_id, - 'linked_book_title': linked_book.title if linked_book else None - }) - - return jsonify({ - 'documents': result, - 'total': len(result), - 'linked': sum(1 for d in result if d['linked_abs_id']), - 'unlinked': sum(1 for d in result if not d['linked_abs_id']) - }) - - -@kosync_admin_bp.route('/api/kosync-documents//link', methods=['POST']) -@admin_or_local_required -def api_link_kosync_document(doc_hash): - """Link a KOSync document to an ABS book.""" - data = request.json - if not data or 'abs_id' not in data: - return jsonify({'error': 'Missing abs_id'}), 400 - - abs_id = data['abs_id'] - - book = _database_service.get_book_by_abs_id(abs_id) - if not book: - return jsonify({'error': 'Book not found'}), 404 - - doc = _database_service.get_kosync_document(doc_hash) - if not doc: - return jsonify({'error': 'KOSync document not found'}), 404 - - success = _database_service.link_kosync_document(doc_hash, abs_id) - if success: - # [FIX] Always update the book's KOSync ID to match what we just linked. - # This handles cases where the book had a "wrong" hash (e.g. from Storyteller artifact) - # and we want to align it with the actual device hash. 
- current_id = book.kosync_doc_id - if current_id != doc_hash: - logger.info(f"Updating Book {book.title} KOSync ID: {current_id} -> {doc_hash}") - book.kosync_doc_id = doc_hash - _database_service.save_book(book) - elif not current_id: - book.kosync_doc_id = doc_hash - _database_service.save_book(book) - - # Cleanup: remove any actionable suggestion for this document since it's now linked - _database_service.resolve_suggestion(doc_hash) - - return jsonify({'success': True, 'message': f'Linked to {book.title}'}) - - return jsonify({'error': 'Failed to link document'}), 500 - - -@kosync_admin_bp.route('/api/kosync-documents/<doc_hash>/unlink', methods=['POST']) -@admin_or_local_required -def api_unlink_kosync_document(doc_hash): - """Remove the ABS book link from a KOSync document.""" - success = _database_service.unlink_kosync_document(doc_hash) - if success: - # Cleanup cached EPUB for this hash - _cleanup_cache_for_hash(doc_hash) - return jsonify({'success': True, 'message': 'Document unlinked'}) - return jsonify({'error': 'Document not found'}), 404 - - -@kosync_admin_bp.route('/api/kosync-documents/<doc_hash>', methods=['DELETE']) -@admin_or_local_required -def api_delete_kosync_document(doc_hash): - """Delete a KOSync document.""" - _cleanup_cache_for_hash(doc_hash) # Must run before delete (needs doc record for filename) - success = _database_service.delete_kosync_document(doc_hash) - if success: - return jsonify({'success': True, 'message': 'Document deleted'}) - return jsonify({'error': 'Document not found'}), 404 - - -def _cleanup_cache_for_hash(doc_hash): - """Delete cached EPUB file for a document.""" - try: - # Identify filename from DB - doc = _database_service.get_kosync_document(doc_hash) - filename = doc.filename if doc else None - - # Fallback: check linked book - if not filename and doc and doc.linked_abs_id: - book = _database_service.get_book_by_abs_id(doc.linked_abs_id) - if book: - filename = book.original_ebook_filename or book.ebook_filename - - if filename: - # Delete file if in epub_cache - if _container: - cache_dir = _container.data_dir() / "epub_cache" - file_path = cache_dir / filename - if not is_safe_path_within(file_path, cache_dir): - logger.warning(f"Blocked cache deletion — path escapes cache dir: '{filename}'") - elif file_path.exists(): - try: - os.remove(file_path) - logger.info(f"Deleted cached EPUB: {filename}") - except Exception as e: - logger.warning(f"Failed to delete cached file '{filename}': {e}") - - # Note: We don't delete the KosyncDocument record here, - # as it may contain important progress data. - # The filename/mtime/source fields just become stale or are cleared if unlinked. - - except Exception as e: - logger.error(f"Error cleaning up cache for '{doc_hash}': {e}") + if not data or not isinstance(data, dict): + return jsonify({"error": "Invalid request body"}), 400 + svc = current_app.config["kosync_service"] + debounce = current_app.config.get("debounce_manager") + result, status = svc.handle_put_progress(data, request.remote_addr, debounce) + resp = make_response(jsonify(result), status) + resp.content_type = "application/json" + return resp
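The is_safe_path_within guard used by the cleanup helper above comes from src/utils/path_utils.py, which this patch does not show; it amounts to resolving both paths and checking containment, roughly like this sketch (Python 3.9+ for Path.is_relative_to; the real helper may differ):

    from pathlib import Path

    def is_safe_path_within(candidate: Path, root: Path) -> bool:
        """True only if `candidate` stays inside `root` once symlinks and '..' resolve."""
        try:
            return candidate.resolve().is_relative_to(root.resolve())
        except OSError:
            return False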
diff --git a/src/blueprints/abs_bp.py b/src/blueprints/abs_bp.py index 475cd34..67aaab5 100644 --- a/src/blueprints/abs_bp.py +++ b/src/blueprints/abs_bp.py @@ -4,9 +4,9 @@ import re import requests -from flask import Blueprint, Response, jsonify +from flask import Blueprint, Response, jsonify, send_from_directory -from src.blueprints.helpers import get_abs_service, get_book_or_404, get_container +from src.blueprints.helpers import get_abs_service, get_container, get_covers_dir, get_database_service logger = logging.getLogger(__name__) @@ -25,33 +25,41 @@ def get_abs_libraries(): @abs_bp.route('/api/cover-proxy/<book_ref>') def proxy_cover(book_ref): - """Proxy cover access to allow loading covers from local network ABS instances.""" - abs_service = get_abs_service() - if not abs_service.is_available(): - return "ABS not configured", 404 - - book = get_book_or_404(book_ref) - abs_id = book.abs_id - if not abs_id: - return "Book has no ABS cover", 404 + """Proxy cover access with local caching for offline resilience.""" + book = get_database_service().get_book_by_ref(book_ref) + abs_id = book.abs_id if book and book.abs_id else book_ref if not re.fullmatch(r'[a-zA-Z0-9_\-]+', abs_id): return "Invalid ID", 400 - try: - container = get_container() - token = container.abs_client().token - base_url = container.abs_client().base_url - - url = f"{base_url.rstrip('/')}/api/items/{abs_id}/cover" - - req = requests.get(url, headers={"Authorization": f"Bearer {token}"}, stream=True, timeout=10) - if req.status_code == 200: - resp = Response(req.iter_content(chunk_size=1024), content_type=req.headers.get('content-type', 'image/jpeg')) - resp.headers['Cache-Control'] = 'public, max-age=86400, immutable' - return resp - else: - return "Cover not found", 404 - except Exception as e: - logger.error(f"Error proxying cover for '{abs_id}': {e}") - return "Error loading cover", 500 + covers_dir = get_covers_dir() + cache_file = covers_dir / f"abs-{abs_id}.jpg" + + # Try upstream when ABS is available + abs_service = get_abs_service() + if abs_service.is_available(): + try: + container = get_container() + token = container.abs_client().token + base_url = container.abs_client().base_url + url = f"{base_url.rstrip('/')}/api/items/{abs_id}/cover" + req = requests.get(url, headers={"Authorization": f"Bearer {token}"}, timeout=10) + if req.status_code == 200: + data = req.content + try: + cache_file.write_bytes(data) + except Exception: + logger.debug(f"Failed to cache cover for '{abs_id}'") + resp = Response(data, content_type=req.headers.get('content-type', 'image/jpeg')) + resp.headers['Cache-Control'] = 'public, max-age=86400, immutable' + return resp + except Exception as e: + logger.error(f"Error proxying cover for '{abs_id}': {e}") + + # Fall back to local cache + if cache_file.exists(): + resp = send_from_directory(covers_dir, cache_file.name) + resp.headers['Cache-Control'] = 'public, max-age=86400, immutable' + return resp + + return "Cover not found", 404 diff --git a/src/blueprints/api.py b/src/blueprints/api.py index 3273bdd..c8b2cd0 100644
--- a/src/blueprints/api.py +++ b/src/blueprints/api.py @@ -3,7 +3,6 @@ ABS-specific routes (/api/abs/*, /api/cover-proxy/*) are in abs_bp.py. """ -import json import logging from flask import Blueprint, current_app, jsonify, request @@ -15,45 +14,19 @@ get_container, get_database_service, get_kosync_id_for_ebook, + serialize_suggestion, ) from src.db.models import Book logger = logging.getLogger(__name__) -api_bp = Blueprint('api', __name__) - - -def _serialize_suggestion(s): - try: - matches = json.loads(s.matches_json) if s.matches_json else [] - except Exception as e: - logger.debug(f"Failed to parse matches_json for suggestion '{s.source_id}': {e}") - matches = [] - - for m in matches: - evidence = m.get('evidence') or [] - m['has_bookfusion'] = ( - m.get('source_family') == 'bookfusion' - or any(ev.startswith('bookfusion') for ev in evidence) - ) - - return { - "id": s.id, - "source_id": s.source_id, - "title": s.title, - "author": s.author, - "cover_url": s.cover_url, - "matches": matches, - "has_bookfusion_evidence": any(m.get('has_bookfusion') for m in matches), - "created_at": s.created_at.isoformat() if s.created_at else None, - "status": 'hidden' if s.status == 'dismissed' else s.status, - "hidden": s.status in ('hidden', 'dismissed'), - } +api_bp = Blueprint("api", __name__) # ---------------- Status ---------------- -@api_bp.route('/api/status') + +@api_bp.route("/api/status") def api_status(): """Return status of all books from database service""" database_service = get_database_service() @@ -70,45 +43,45 @@ def api_status(): state_by_client = {state.client_name: state for state in states_by_book.get(book.id, [])} mapping = { - 'id': book.id, - 'abs_id': book.abs_id, - 'title': book.title, - 'ebook_filename': book.ebook_filename, - 'kosync_doc_id': book.kosync_doc_id, - 'transcript_file': book.transcript_file, - 'status': book.status, - 'sync_mode': getattr(book, 'sync_mode', 'audiobook'), - 'duration': book.duration, - 'storyteller_uuid': book.storyteller_uuid, - 'states': {} + "id": book.id, + "abs_id": book.abs_id, + "title": book.title, + "ebook_filename": book.ebook_filename, + "kosync_doc_id": book.kosync_doc_id, + "transcript_file": book.transcript_file, + "status": book.status, + "sync_mode": book.sync_mode, + "duration": book.duration, + "storyteller_uuid": book.storyteller_uuid, + "states": {}, } for client_name, state in state_by_client.items(): pct_val = round(state.percentage * 100, 1) if state.percentage is not None else 0 - mapping['states'][client_name] = { - 'timestamp': state.timestamp or 0, - 'percentage': pct_val, - 'xpath': getattr(state, 'xpath', None), - 'last_updated': state.last_updated + mapping["states"][client_name] = { + "timestamp": state.timestamp or 0, + "percentage": pct_val, + "xpath": getattr(state, "xpath", None), + "last_updated": state.last_updated, } - if client_name == 'kosync': - mapping['kosync_pct'] = pct_val - mapping['kosync_xpath'] = getattr(state, 'xpath', None) - elif client_name == 'abs': - mapping['abs_pct'] = pct_val - mapping['abs_ts'] = state.timestamp - elif client_name == 'storyteller': - mapping['storyteller_pct'] = pct_val - mapping['storyteller_xpath'] = getattr(state, 'xpath', None) - elif client_name == 'booklore': - mapping['booklore_pct'] = pct_val - mapping['booklore_xpath'] = getattr(state, 'xpath', None) + if client_name == "kosync": + mapping["kosync_pct"] = pct_val + mapping["kosync_xpath"] = getattr(state, "xpath", None) + elif client_name == "abs": + mapping["abs_pct"] = pct_val + mapping["abs_ts"] = 
state.timestamp + elif client_name == "storyteller": + mapping["storyteller_pct"] = pct_val + mapping["storyteller_xpath"] = getattr(state, "xpath", None) + elif client_name == "booklore": + mapping["booklore_pct"] = pct_val + mapping["booklore_xpath"] = getattr(state, "xpath", None) # Compute unified_progress — max percentage across all clients - all_pcts = [s['percentage'] for s in mapping['states'].values()] - mapping['unified_progress'] = min(max(all_pcts), 100.0) if all_pcts else 0 + all_pcts = [s["percentage"] for s in mapping["states"].values()] + mapping["unified_progress"] = min(max(all_pcts), 100.0) if all_pcts else 0 mappings.append(mapping) @@ -117,74 +90,79 @@ def api_status(): # ---------------- Processing Status ---------------- -@api_bp.route('/api/processing-status') + +@api_bp.route("/api/processing-status") def api_processing_status(): """Return status and progress for all non-active (processing/pending/failed) books.""" database_service = get_database_service() books = database_service.get_all_books() + processing_books = [b for b in books if b.status in ("pending", "processing", "failed_retry_later")] + jobs_by_book = database_service.get_latest_jobs_bulk([b.id for b in processing_books]) result = {} - for book in books: - if book.status not in ('pending', 'processing', 'failed_retry_later'): - continue - job = database_service.get_latest_job(book.id) + for book in processing_books: + job = jobs_by_book.get(book.id) result[str(book.id)] = { - 'status': book.status, - 'job_progress': round((job.progress or 0.0) * 100, 1) if job else 0.0, - 'retry_count': (job.retry_count or 0) if job else 0, + "status": book.status, + "job_progress": round((job.progress or 0.0) * 100, 1) if job else 0.0, + "retry_count": (job.retry_count or 0) if job else 0, } return jsonify(result) # ---------------- Suggestions ---------------- -@api_bp.route('/api/suggestions', methods=['GET']) + +@api_bp.route("/api/suggestions", methods=["GET"]) def get_suggestions(): database_service = get_database_service() suggestions = database_service.get_all_actionable_suggestions() - return jsonify([_serialize_suggestion(s) for s in suggestions if s.matches]) + return jsonify([serialize_suggestion(s) for s in suggestions if s.matches]) -@api_bp.route('/api/suggestions/rescan', methods=['POST']) +@api_bp.route("/api/suggestions/rescan", methods=["POST"]) def rescan_suggestions(): container = get_container() data = request.get_json(silent=True) or {} - force = bool(data.get('force')) + force = bool(data.get("force")) stats = container.suggestion_service().request_rescan_library_suggestions(force=force) return jsonify({"success": True, **stats}) -@api_bp.route('/api/suggestions/rescan-status', methods=['GET']) +@api_bp.route("/api/suggestions/rescan-status", methods=["GET"]) def rescan_suggestions_status(): container = get_container() status = container.suggestion_service().get_rescan_status() return jsonify({"success": True, **status}) -@api_bp.route('/api/suggestions/<source_id>/hide', methods=['POST']) +@api_bp.route("/api/suggestions/<source_id>/hide", methods=["POST"]) def hide_suggestion(source_id): database_service = get_database_service() - if database_service.hide_suggestion(source_id): + source = request.args.get("source", "abs") + if database_service.hide_suggestion(source_id, source=source): return jsonify({"success": True}) return jsonify({"success": False, "error": "Not found"}), 404
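The hide/unhide/ignore handlers now thread a `source` query parameter through to the repository, which presumably resolves suggestions by the (source_id, source) pair rather than by source_id alone; a sketch of such a lookup, with the session and helper names as assumptions:

    from sqlalchemy import select
    from src.db.models import PendingSuggestion

    def get_pending_suggestion(session, source_id: str, source: str = "abs"):
        # Composite-key lookup: the same source_id can exist for several
        # sources (ABS, KoSync, Booklore), so both columns are filtered.
        stmt = select(PendingSuggestion).where(
            PendingSuggestion.source_id == source_id,
            PendingSuggestion.source == source,
        )
        return session.execute(stmt).scalar_one_or_none()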
 
 
 # ---------------- Suggestions ----------------
 
-@api_bp.route('/api/suggestions', methods=['GET'])
+
+@api_bp.route("/api/suggestions", methods=["GET"])
 def get_suggestions():
     database_service = get_database_service()
     suggestions = database_service.get_all_actionable_suggestions()
-    return jsonify([_serialize_suggestion(s) for s in suggestions if s.matches])
+    return jsonify([serialize_suggestion(s) for s in suggestions if s.matches])
 
 
-@api_bp.route('/api/suggestions/rescan', methods=['POST'])
+@api_bp.route("/api/suggestions/rescan", methods=["POST"])
 def rescan_suggestions():
     container = get_container()
     data = request.get_json(silent=True) or {}
-    force = bool(data.get('force'))
+    force = bool(data.get("force"))
     stats = container.suggestion_service().request_rescan_library_suggestions(force=force)
     return jsonify({"success": True, **stats})
 
 
-@api_bp.route('/api/suggestions/rescan-status', methods=['GET'])
+@api_bp.route("/api/suggestions/rescan-status", methods=["GET"])
 def rescan_suggestions_status():
     container = get_container()
     status = container.suggestion_service().get_rescan_status()
     return jsonify({"success": True, **status})
 
 
-@api_bp.route('/api/suggestions/<source_id>/hide', methods=['POST'])
+@api_bp.route("/api/suggestions/<source_id>/hide", methods=["POST"])
 def hide_suggestion(source_id):
     database_service = get_database_service()
-    if database_service.hide_suggestion(source_id):
+    source = request.args.get("source", "abs")
+    if database_service.hide_suggestion(source_id, source=source):
         return jsonify({"success": True})
     return jsonify({"success": False, "error": "Not found"}), 404
 
 
-@api_bp.route('/api/suggestions/<source_id>/unhide', methods=['POST'])
+@api_bp.route("/api/suggestions/<source_id>/unhide", methods=["POST"])
 def unhide_suggestion(source_id):
     database_service = get_database_service()
-    if database_service.unhide_suggestion(source_id):
+    source = request.args.get("source", "abs")
+    if database_service.unhide_suggestion(source_id, source=source):
         return jsonify({"success": True})
     return jsonify({"success": False, "error": "Not found"}), 404
 
 
-@api_bp.route('/api/suggestions/<source_id>/ignore', methods=['POST'])
+@api_bp.route("/api/suggestions/<source_id>/ignore", methods=["POST"])
 def ignore_suggestion(source_id):
     database_service = get_database_service()
-    if database_service.ignore_suggestion(source_id):
+    source = request.args.get("source", "abs")
+    if database_service.ignore_suggestion(source_id, source=source):
         return jsonify({"success": True})
     return jsonify({"success": False, "error": "Not found"}), 404
 
 
-@api_bp.route('/api/suggestions/clear_stale', methods=['POST'])
+@api_bp.route("/api/suggestions/clear_stale", methods=["POST"])
 def clear_stale_suggestions():
     database_service = get_database_service()
     count = database_service.clear_stale_suggestions()
@@ -192,54 +170,60 @@ def clear_stale_suggestions():
     return jsonify({"success": True, "count": count})
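Note on the hunks above: the hide/unhide/ignore mutations now carry a `source` query parameter so the repository can scope lookups by the (source_id, source) composite key added in the unique-index migration, which is what keeps ABS, KoSync, and Booklore IDs from colliding. Roughly what the scoped repository call looks like (SQLAlchemy-flavored sketch; PendingSuggestion field names assumed):

    def hide_suggestion(session, PendingSuggestion, source_id, source="abs"):
        """Hide exactly one suggestion, scoped to its source system."""
        row = (
            session.query(PendingSuggestion)
            .filter_by(source_id=source_id, source=source)
            .one_or_none()
        )
        if row is None:
            return False
        row.status = "hidden"
        session.commit()
        return True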
 
 
-@api_bp.route('/api/suggestions/<source_id>/link-bookfusion', methods=['POST'])
+@api_bp.route("/api/suggestions/<source_id>/link-bookfusion", methods=["POST"])
 def link_suggestion_bookfusion(source_id):
     database_service = get_database_service()
     container = get_container()
 
-    suggestion = database_service.get_pending_suggestion(source_id)
+    data = request.get_json(silent=True) or {}
+    source = data.get("source", "abs")
+    if source != "abs":
+        return jsonify({"success": False, "error": "BookFusion linking only supported for ABS suggestions"}), 400
+
+    suggestion = database_service.get_pending_suggestion(source_id, source=source)
     if not suggestion:
         return jsonify({"success": False, "error": "Suggestion not found"}), 404
 
-    data = request.get_json(silent=True) or {}
-    match_index = data.get('match_index')
+    match_index = data.get("match_index")
     matches = suggestion.matches or []
     if match_index is None or not isinstance(match_index, int) or match_index < 0 or match_index >= len(matches):
         return jsonify({"success": False, "error": "Valid match_index required"}), 400
 
     match = matches[match_index]
-    bookfusion_ids = match.get('bookfusion_ids') or []
-    if match.get('source_family') != 'bookfusion' or not bookfusion_ids:
+    bookfusion_ids = match.get("bookfusion_ids") or []
+    if match.get("source_family") != "bookfusion" or not bookfusion_ids:
         return jsonify({"success": False, "error": "Selected match is not a BookFusion candidate"}), 400
 
     abs_book = database_service.get_book_by_ref(source_id)
     if not abs_book:
         abs_client = container.abs_client()
         item = abs_client.get_item_details(source_id) if abs_client else None
-        metadata = (item or {}).get('media', {}).get('metadata', {})
+        metadata = (item or {}).get("media", {}).get("metadata", {})
         abs_book = Book(
             abs_id=source_id,
-            title=metadata.get('title') or suggestion.title or source_id,
-            status='not_started',
-            duration=(item or {}).get('media', {}).get('duration'),
-            sync_mode='audiobook',
+            title=metadata.get("title") or suggestion.title or source_id,
+            status="not_started",
+            duration=(item or {}).get("media", {}).get("duration"),
+            sync_mode="audiobook",
         )
         database_service.save_book(abs_book)
 
         abs_service = container.abs_service()
         if abs_service and abs_service.is_available():
             try:
-                abs_service.add_to_collection(source_id, current_app.config['ABS_COLLECTION_NAME'])
+                abs_service.add_to_collection(source_id, current_app.config["ABS_COLLECTION_NAME"])
             except Exception as e:
                 logger.warning(f"Failed to add '{source_id}' to ABS collection during BookFusion link: {e}")
 
+    # Re-fetch to get the auto-assigned book ID
+    abs_book = database_service.get_book_by_ref(source_id)
     for bid in bookfusion_ids:
-        database_service.set_bookfusion_book_match(bid, source_id)
-        database_service.link_bookfusion_book(bid, source_id)
+        database_service.set_bookfusion_book_match_by_book_id(bid, abs_book.id)
+        database_service.link_bookfusion_highlights_by_book_id(bid, abs_book.id)
 
     database_service.resolve_suggestion(source_id)
     return jsonify({"success": True, "abs_id": source_id})
 
 
-@api_bp.route('/api/sync-reading-dates', methods=['POST'])
+@api_bp.route("/api/sync-reading-dates", methods=["POST"])
 def sync_reading_dates_api():
     """Auto-complete books at 100% progress and fill missing dates."""
     container = get_container()
@@ -250,37 +234,38 @@ def sync_reading_dates_api():
 
 # ---------------- Storyteller ----------------
 
-@api_bp.route('/api/storyteller/search', methods=['GET'])
+
+@api_bp.route("/api/storyteller/search", methods=["GET"])
 def api_storyteller_search():
     container = get_container()
-    query = request.args.get('q', '')
+    query = request.args.get("q", "")
     if not query:
         return jsonify({"success": False, "error": "Query parameter 'q' is required"}), 400
     results = container.storyteller_client().search_books(query)
     return jsonify(results)
 
 
-@api_bp.route('/api/storyteller/link/<book_ref>', methods=['POST'])
+@api_bp.route("/api/storyteller/link/<book_ref>", methods=["POST"])
 def api_storyteller_link(book_ref):
     database_service = get_database_service()
     data = request.get_json()
-    if not data or 'uuid' not in data:
+    if not data or "uuid" not in data:
         return jsonify({"success": False, "error": "Missing 'uuid' in JSON payload"}), 400
 
-    storyteller_uuid = data['uuid']
+    storyteller_uuid = data["uuid"]
     book = get_book_or_404(book_ref)
 
     # Handle explicit unlinking
     if storyteller_uuid == "none" or not storyteller_uuid:
         logger.info(f"Unlinking Storyteller for '{book.title}'")
         book.storyteller_uuid = None
-        book.status = 'pending'
+        book.status = "pending"
         database_service.save_book(book)
         return jsonify({"message": "Storyteller unlinked successfully", "filename": book.ebook_filename}), 200
 
     book.storyteller_uuid = storyteller_uuid
-    book.status = 'pending'
+    book.status = "pending"
     database_service.save_book(book)
     if book.abs_id:
         database_service.resolve_suggestion(book.abs_id)
@@ -289,6 +274,7 @@ def api_storyteller_link(book_ref):
 
 # ---------------- Booklore ----------------
 
+
 def _get_booklore_libraries(client_getter, name):
     container = get_container()
     client = client_getter(container)
@@ -297,22 +283,22 @@ def _get_booklore_libraries(client_getter, name):
     return jsonify(client.get_libraries())
 
 
-@api_bp.route('/api/booklore/libraries', methods=['GET'])
+@api_bp.route("/api/booklore/libraries", methods=["GET"])
 def get_booklore_libraries():
     """Return available Booklore libraries."""
     return _get_booklore_libraries(lambda c: c.booklore_client(), "Booklore")
 
 
-@api_bp.route('/api/booklore2/libraries', methods=['GET'])
+@api_bp.route("/api/booklore2/libraries", methods=["GET"])
 def get_booklore2_libraries():
     """Return available Booklore 2 libraries."""
     return _get_booklore_libraries(lambda c: c.booklore_client_2(), "Booklore 2")
 
 
-@api_bp.route('/api/booklore/search', methods=['GET'])
+@api_bp.route("/api/booklore/search", methods=["GET"])
 def api_booklore_search():
     """Search Booklore books by title/author/filename."""
-    query = request.args.get('q', '').strip()
+    query = request.args.get("q", "").strip()
     if not query:
         return jsonify([])
 
@@ -324,21 +310,23 @@ def api_booklore_search():
         label = current_app.config.get("BOOKLORE_LABEL", "Booklore")
         results = []
         books = client.search_books(query)
-        for b in (books or []):
-            results.append({
-                'id': b.get('id'),
-                'title': b.get('title', ''),
-                'authors': b.get('authors', ''),
-                'fileName': b.get('fileName', ''),
-                'source': label,
-            })
+        for b in books or []:
+            results.append(
+                {
+                    "id": b.get("id"),
+                    "title": b.get("title", ""),
+                    "authors": b.get("authors", ""),
+                    "fileName": b.get("fileName", ""),
+                    "source": label,
+                }
+            )
         return jsonify(results)
     except Exception:
         logger.warning("Booklore search failed", exc_info=True)
         return jsonify([])
 
 
-@api_bp.route('/api/booklore/link/<book_ref>', methods=['POST'])
+@api_bp.route("/api/booklore/link/<book_ref>", methods=["POST"])
 def api_booklore_link(book_ref):
     """Link or unlink a PageKeeper book to a Booklore book by filename."""
     database_service = get_database_service()
@@ -348,11 +336,11 @@ def api_booklore_link(book_ref):
     if not isinstance(data, dict):
         return jsonify({"success": False, "error": "No data provided"}), 400
 
-    if 'filename' not in data:
+    if "filename" not in data:
         return jsonify({"success": False, "error": "Missing 'filename' in JSON payload"}), 400
 
-    filename_raw = data.get('filename')
+    filename_raw = data.get("filename")
     if filename_raw is None:
-        filename = ''
+        filename = ""
     elif not isinstance(filename_raw, str):
         return jsonify({"success": False, "error": "'filename' must be a string or null"}), 400
     else:
@@ -371,11 +359,14 @@ def api_booklore_link(book_ref):
     booklore_id = None
     bl_book, bl_client = find_in_booklore(filename)
     if bl_book:
-        booklore_id = bl_book.get('id')
+        booklore_id = bl_book.get("id")
 
     kosync_doc_id = get_kosync_id_for_ebook(filename, booklore_id, bl_client=bl_client)
     if kosync_doc_id:
         book.kosync_doc_id = kosync_doc_id
         book.original_ebook_filename = book.original_ebook_filename or filename
     database_service.save_book(book)
+    from src.services.kosync_service import ensure_kosync_document
+
+    ensure_kosync_document(book, database_service)
 
     logger.info(f"Linked Booklore file '{filename}' to '{book.title}'")
     return jsonify({"success": True, "message": "Linked successfully"})
diff --git a/src/blueprints/bookfusion_bp.py b/src/blueprints/bookfusion_bp.py
index c8ff212..4065ec1 100644
--- a/src/blueprints/bookfusion_bp.py
+++ b/src/blueprints/bookfusion_bp.py
@@ -134,12 +134,12 @@ def _auto_match_highlights(db_service) -> int:
     if not books:
         return 0
 
-    # Build normalized title → abs_id list map (detect ambiguous duplicates)
-    book_map: dict[str, list[str]] = defaultdict(list)
+    # Build normalized title → book_id list map (detect ambiguous duplicates)
+    book_map: dict[str, list[int]] = defaultdict(list)
     for b in books:
         if b.title:
             norm = normalize_title(b.title)
-            book_map[norm].append(b.abs_id)
+            book_map[norm].append(b.id)
 
     # Group unmatched by book_title
     title_groups: dict[str, list] = {}
@@ -152,11 +152,11 @@ def _auto_match_highlights(db_service) -> int:
     for bf_title, highlights in title_groups.items():
         norm_bf = normalize_title(bf_title)
-        abs_id = None
+        matched_book_id = None
 
         # Exact match (only if unambiguous)
         if norm_bf in book_map and len(book_map[norm_bf]) == 1:
-            abs_id = book_map[norm_bf][0]
+            matched_book_id = book_map[norm_bf][0]
         else:
             # Fuzzy match (only if unambiguous)
             candidates = []
@@ -169,12 +169,12 @@ def _auto_match_highlights(db_service) -> int:
             if candidates:
                 candidates.sort(key=lambda c: c[0], reverse=True)
                 if len(candidates) == 1 or (candidates[0][0] - candidates[1][0]) >= 0.05:
-                    abs_id = candidates[0][1]
+                    matched_book_id = candidates[0][1]
 
-        if abs_id:
+        if matched_book_id:
             bf_ids = {hl.bookfusion_book_id for hl in highlights if hl.bookfusion_book_id}
             for bf_id in bf_ids:
-                db_service.link_bookfusion_book(bf_id, abs_id)
+                db_service.link_bookfusion_highlights_by_book_id(bf_id, matched_book_id)
             matched_count += len(highlights)
 
     return matched_count
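Note on the matcher above: it only links when the result is unambiguous. Exact title matches must map to a single book, and a fuzzy winner must beat the runner-up by at least 0.05. A self-contained illustration of that decision rule (difflib stands in here for whatever similarity scorer the real code uses):

    from difflib import SequenceMatcher

    def pick_unambiguous(norm_title, book_map, threshold=0.85, margin=0.05):
        """Return the one matching book_id, or None when ambiguous or absent."""
        ids = book_map.get(norm_title, [])
        if len(ids) == 1:
            return ids[0]  # exact and unique
        candidates = []
        for title, book_ids in book_map.items():
            if len(book_ids) != 1:
                continue  # skip titles shared by several books
            score = SequenceMatcher(None, norm_title, title).ratio()
            if score >= threshold:
                candidates.append((score, book_ids[0]))
        if not candidates:
            return None
        candidates.sort(reverse=True)
        if len(candidates) == 1 or candidates[0][0] - candidates[1][0] >= margin:
            return candidates[0][1]
        return None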
@@ -337,7 +337,12 @@ def link_highlight():
         return jsonify({'error': 'bookfusion_book_id required'}), 400
 
     db_service = get_database_service()
-    db_service.link_bookfusion_book(bookfusion_book_id, abs_id or None)
+    if abs_id:
+        book = db_service.get_book_by_ref(abs_id)
+        book_id = book.id if book else None
+    else:
+        book_id = None
+    db_service.link_bookfusion_highlights_by_book_id(bookfusion_book_id, book_id)
     return jsonify({'success': True})
 
@@ -357,7 +362,8 @@ def save_highlight_to_journal():
     # When no highlights provided in the request, fetch them server-side
     if not highlights:
         db_service = get_database_service()
-        bf_highlights = db_service.get_bookfusion_highlights_for_book(abs_id)
+        book = db_service.get_book_by_ref(abs_id)
+        bf_highlights = db_service.get_bookfusion_highlights_for_book_by_book_id(book.id) if book else []
         if not bf_highlights:
             return jsonify({'error': 'No highlights found for this book'}), 400
         highlights = []
@@ -515,10 +521,14 @@ def add_to_dashboard():
     )
     db_service.save_book(book, is_new=True)
 
+    # Re-fetch book to get the auto-assigned ID
+    saved_book = db_service.get_book_by_ref(abs_id)
+    saved_book_id = saved_book.id if saved_book else None
+
     # Auto-link ALL catalog books + highlights in the group
     for bid in bookfusion_ids:
-        db_service.set_bookfusion_book_match(bid, abs_id)
-        db_service.link_bookfusion_book(bid, abs_id)
+        db_service.set_bookfusion_book_match_by_book_id(bid, saved_book_id)
+        db_service.link_bookfusion_highlights_by_book_id(bid, saved_book_id)
 
     # Auto-populate reading dates
     date_info = _estimate_reading_dates(db_service, abs_id, bookfusion_ids, title)
@@ -547,19 +557,21 @@ def match_to_book():
 
     db_service = get_database_service()
 
-    if abs_id and not db_service.get_book_by_ref(abs_id):
+    book = db_service.get_book_by_ref(abs_id) if abs_id else None
+    if abs_id and not book:
         return jsonify({'error': 'Book not found'}), 404
 
+    book_id = book.id if book else None
+
     # Link ALL catalog books + highlights in the group
     for bid in bookfusion_ids:
-        db_service.set_bookfusion_book_match(bid, abs_id or None)
-        db_service.link_bookfusion_book(bid, abs_id or None)
+        db_service.set_bookfusion_book_match_by_book_id(bid, book_id)
+        db_service.link_bookfusion_highlights_by_book_id(bid, book_id)
 
     resp = {'success': True, 'abs_id': abs_id}
 
     # Auto-populate reading dates if linking (not unlinking)
     if abs_id:
-        book = db_service.get_book_by_ref(abs_id)
         title = book.title if book else ''
         date_info = _estimate_reading_dates(db_service, abs_id, bookfusion_ids, title)
         resp.update(date_info)
@@ -590,7 +602,7 @@ def hide_book():
 
 @bookfusion_bp.route('/api/bookfusion/unlink', methods=['POST'])
 def unlink_book():
-    """Unlink a BookFusion book from a dashboard book by abs_id."""
+    """Unlink a BookFusion book from a dashboard book."""
     data = request.get_json()
     if not data:
         return jsonify({'error': 'No data provided'}), 400
@@ -600,5 +612,7 @@ def unlink_book():
         return jsonify({'error': 'abs_id required'}), 400
 
     db_service = get_database_service()
-    db_service.unlink_bookfusion_by_abs_id(abs_id)
+    book = db_service.get_book_by_ref(abs_id)
+    if book:
+        db_service.unlink_bookfusion_by_book_id(book.id)
     return jsonify({'success': True})
diff --git a/src/blueprints/books.py b/src/blueprints/books.py
index 6427b1b..6798624 100644
--- a/src/blueprints/books.py
+++ b/src/blueprints/books.py
@@ -86,7 +86,7 @@ def realign_book(book_ref):
     alignment_service = container.alignment_service()
 
     logger.info(f"Re-aligning '{sanitize_log_data(book.title or str(book.id))}'")
-    alignment_service.realign_book(book.abs_id or str(book.id))
+    alignment_service.realign_book(book.id)
     return jsonify({"success": True})
@@ -189,9 +189,12 @@ def update_hash(book_ref):
         flash("Could not recalculate hash (file not found?)", "error")
         return redirect(url_for('dashboard.index'))
 
-    if updated and book.kosync_doc_id != old_hash:
-        logger.info(f"Hash changed for '{sanitize_log_data(book.title)}' -- triggering instant sync to reconcile progress")
-        threading.Thread(target=manager.sync_cycle, kwargs={'target_book_id': book.id}, daemon=True).start()
+    if updated:
+        from src.services.kosync_service import ensure_kosync_document
+        ensure_kosync_document(book, get_database_service())
+        if book.kosync_doc_id != old_hash:
+            logger.info(f"Hash changed for '{sanitize_log_data(book.title)}' -- triggering instant sync to reconcile progress")
+            threading.Thread(target=manager.sync_cycle, kwargs={'target_book_id': book.id}, daemon=True).start()
 
     flash(f"Updated KoSync Hash for {book.title}", "success")
     return redirect(url_for('dashboard.index'))
diff --git a/src/blueprints/covers.py b/src/blueprints/covers.py
index 212a745..8da3d2f 100644
--- a/src/blueprints/covers.py
+++ b/src/blueprints/covers.py
@@ -6,23 +6,28 @@
 from flask import Blueprint, Response, send_from_directory
 
 from src.blueprints.helpers import get_container, get_covers_dir, get_database_service
+from src.utils.path_utils import is_safe_path_within
 
 logger = logging.getLogger(__name__)
 
-covers_bp = Blueprint('covers', __name__)
+covers_bp = Blueprint("covers", __name__)
 
 
-@covers_bp.route('/covers/<filename>')
+@covers_bp.route("/covers/<filename>")
 def serve_cover(filename):
     """Serve cover images with lazy extraction."""
     COVERS_DIR = get_covers_dir()
-    doc_hash = filename.replace('.jpg', '')
+
+    if not is_safe_path_within(COVERS_DIR / filename, COVERS_DIR):
+        return "Cover not found", 404
+
+    doc_hash = filename.replace(".jpg", "")
 
     # 1. Check if file exists
     cover_path = COVERS_DIR / filename
     if cover_path.exists():
         resp = send_from_directory(COVERS_DIR, filename)
-        resp.headers['Cache-Control'] = 'public, max-age=86400, immutable'
+        resp.headers["Cache-Control"] = "public, max-age=86400, immutable"
         return resp
 
     # 2. Try to extract
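Note on the guard above: is_safe_path_within() keeps /covers/<filename> from serving files outside the covers directory via ../ segments before the filename ever reaches the filesystem. The containment check is small enough to sketch whole; this is the shape, not necessarily the exact helper in src/utils/path_utils.py:

    from pathlib import Path

    def is_safe_path_within(candidate: Path, base: Path) -> bool:
        """True only when candidate resolves to a path inside base."""
        try:
            candidate.resolve().relative_to(base.resolve())
            return True
        except ValueError:  # resolved path escapes base
            return False

With this in place, a request for /covers/../../etc/passwd resolves outside COVERS_DIR and gets the same 404 as a missing cover.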
@@ -37,7 +42,7 @@ def serve_cover(filename):
 
             if parser.extract_cover(full_book_path, cover_path):
                 resp = send_from_directory(COVERS_DIR, filename)
-                resp.headers['Cache-Control'] = 'public, max-age=86400, immutable'
+                resp.headers["Cache-Control"] = "public, max-age=86400, immutable"
                 return resp
     except Exception as e:
         logger.debug(f"Lazy cover extraction failed: {e}")
@@ -45,38 +50,47 @@ def serve_cover(filename):
     return "Cover not found", 404
 
 
-@covers_bp.route('/api/cover-proxy/booklore/<book_id>')
+@covers_bp.route("/api/cover-proxy/booklore/<book_id>")
 def proxy_booklore_cover(book_id):
     """Proxy cover access to Booklore (auth via query-parameter JWT)."""
     container = get_container()
-    return _proxy_booklore_cover_for(container.booklore_client(), book_id)
+    return _proxy_booklore_cover_for(container.booklore_client(), book_id, cache_prefix="bl")
 
 
-@covers_bp.route('/api/cover-proxy/booklore2/<book_id>')
+@covers_bp.route("/api/cover-proxy/booklore2/<book_id>")
 def proxy_booklore2_cover(book_id):
     """Proxy cover access to Booklore 2nd instance."""
     container = get_container()
-    return _proxy_booklore_cover_for(container.booklore_client_2(), book_id)
-
-
-def _proxy_booklore_cover_for(bl_client, book_id):
-    """Shared cover proxy logic for any BookloreClient instance."""
-    try:
-        if not bl_client.is_configured():
-            return "Booklore not configured", 404
-
-        token = bl_client._get_fresh_token()
-        if not token:
-            return "Booklore auth failed", 500
-
-        url = f"{bl_client.base_url}/api/v1/media/book/{book_id}/cover"
-        req = requests.get(url, params={"token": token}, stream=True, timeout=10)
-        if req.status_code == 200:
-            resp = Response(req.iter_content(chunk_size=1024), content_type='image/jpeg')
-            resp.headers['Cache-Control'] = 'public, max-age=86400, immutable'
-            return resp
-        else:
-            return "Cover not found", 404
-    except Exception as e:
-        logger.error(f"Error proxying Booklore cover for book {book_id}: {e}")
-        return "Error loading cover", 500
+    return _proxy_booklore_cover_for(container.booklore_client_2(), book_id, cache_prefix="bl2")
+
+
+def _proxy_booklore_cover_for(bl_client, book_id, cache_prefix="bl"):
+    """Shared cover proxy logic with local caching for offline resilience."""
+    covers_dir = get_covers_dir()
+    cache_file = covers_dir / f"{cache_prefix}-{book_id}.jpg"
+
+    if bl_client.is_configured():
+        try:
+            token = bl_client._get_fresh_token()
+            if token:
+                url = f"{bl_client.base_url}/api/v1/media/book/{book_id}/cover"
+                req = requests.get(url, params={"token": token}, timeout=10)
+                if req.status_code == 200:
+                    data = req.content
+                    try:
+                        cache_file.write_bytes(data)
+                    except Exception:
+                        logger.debug(f"Failed to cache Booklore cover for book {book_id}")
+                    resp = Response(data, content_type="image/jpeg")
+                    resp.headers["Cache-Control"] = "public, max-age=86400, immutable"
+                    return resp
+        except Exception as e:
+            logger.error(f"Error proxying Booklore cover for book {book_id}: {e}")
+
+    # Fall back to local cache
+    if cache_file.exists():
+        resp = send_from_directory(covers_dir, cache_file.name)
+        resp.headers["Cache-Control"] = "public, max-age=86400, immutable"
+        return resp
+
+    return "Cover not found", 404
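Note on the rewrite above: it is a try-live-then-cache pattern. A successful upstream fetch refreshes the on-disk copy, and any failure path (unconfigured client, auth failure, network outage) falls through to the last cached bytes. Stripped of the Flask and Booklore specifics, the shape is roughly:

    from pathlib import Path
    import requests

    def fetch_with_cache(url, cache_file: Path, timeout=10):
        """Return fresh bytes when the upstream answers, cached bytes otherwise."""
        try:
            resp = requests.get(url, timeout=timeout)
            if resp.status_code == 200:
                try:
                    cache_file.write_bytes(resp.content)  # best-effort cache refresh
                except OSError:
                    pass
                return resp.content
        except requests.RequestException:
            pass
        if cache_file.exists():
            return cache_file.read_bytes()  # offline fallback
        return None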
diff --git a/src/blueprints/dashboard.py b/src/blueprints/dashboard.py
index 05449e4..16f2c57 100644
--- a/src/blueprints/dashboard.py
+++ b/src/blueprints/dashboard.py
@@ -9,7 +9,7 @@
 from flask import Blueprint, render_template
 
 from src.blueprints.helpers import (
-    booklore_cover_proxy_prefix,
+    find_booklore_metadata,
     get_abs_service,
     get_container,
     get_database_service,
@@ -17,6 +17,7 @@
     get_hardcover_book_url,
     get_service_web_url,
 )
+from src.utils.cover_resolver import resolve_book_covers
 from src.version import APP_VERSION
 
 logger = logging.getLogger(__name__)
@@ -97,6 +98,11 @@ def _run_date_sync():
     for client_name, client in sync_clients.items():
         integrations[client_name.lower()] = client.is_configured()
 
+    # Merge Booklore 2 into booklore flag so templates show the service
+    # when either instance is configured
+    if integrations.get('booklore2') and not integrations.get('booklore'):
+        integrations['booklore'] = True
+
     # BookFusion integration status
     bf_client = container.bookfusion_client()
     integrations["bookfusion"] = bf_client.is_configured()
@@ -106,8 +112,8 @@ def _run_date_sync():
     bf_linked_ids = set()
     bf_highlight_counts = {}
     try:
-        bf_linked_ids = database_service.get_bookfusion_linked_abs_ids()
-        bf_highlight_counts = database_service.get_bookfusion_highlight_counts()
+        bf_linked_ids = database_service.get_bookfusion_linked_book_ids()
+        bf_highlight_counts = database_service.get_bookfusion_highlight_counts_by_book_id()
     except Exception as e:
         logger.warning(f"Could not fetch BookFusion link data: {e}")
@@ -132,7 +138,7 @@ def _run_date_sync():
         states = states_by_book.get(book.id, [])
         state_by_client = {state.client_name: state for state in states}
 
-        sync_mode = getattr(book, "sync_mode", "audiobook")
+        sync_mode = book.sync_mode
         if sync_mode == "ebook_only":
             book_type = "ebook-only"
         elif not book.ebook_filename:
         else:
             book_type = "linked"
 
-        # Look up Booklore metadata by ebook_filename or original_ebook_filename
-        # Prefer entries that have a title, since we use this for display enrichment
         # bl_meta: filtered to enabled instances (used for covers/deep-links)
-        bl_meta = None
-        for fn in (book.ebook_filename, getattr(book, "original_ebook_filename", None)):
-            if fn:
-                candidates = booklore_by_filename.get(fn.lower(), [])
-                bl_meta = next((b for b in candidates if b.title), candidates[0] if candidates else None)
-                if bl_meta:
-                    break
+        bl_meta = find_booklore_metadata(book, booklore_by_filename)
 
         # bl_meta_enrichment: unfiltered fallback for title/author (stale metadata is fine)
-        bl_meta_enrichment = bl_meta
-        if not bl_meta_enrichment:
-            for fn in (book.ebook_filename, getattr(book, "original_ebook_filename", None)):
-                if fn:
-                    candidates = booklore_by_filename_all.get(fn.lower(), [])
-                    bl_meta_enrichment = next((b for b in candidates if b.title), candidates[0] if candidates else None)
-                    if bl_meta_enrichment:
-                        break
+        bl_meta_enrichment = bl_meta or find_booklore_metadata(book, booklore_by_filename_all)
 
         # Skip ABS metadata enrichment for ebook-only books (synthetic ID won't resolve)
         if book_type == "ebook-only":
@@ -167,25 +158,39 @@ def _run_date_sync():
             abs_author = (bl_meta_enrichment.authors or "") if bl_meta_enrichment else ""
         else:
             _abs_meta = abs_metadata_by_id.get(book.abs_id, {})
-            abs_subtitle = _abs_meta.get("subtitle", "")
-            abs_author = _abs_meta.get("author", "")
+            abs_subtitle = _abs_meta.get("subtitle", "") or book.subtitle or ""
+            abs_author = _abs_meta.get("author", "") or book.author or ""
 
         # Enrich title from Booklore if stored title looks like a filename
         enriched_title = book.title
         if bl_meta_enrichment and bl_meta_enrichment.title:
             stems = set()
-            for fn in (book.ebook_filename, getattr(book, "original_ebook_filename", None)):
+            for fn in (book.ebook_filename, book.original_ebook_filename):
                 if fn:
                     stems.add(Path(fn).stem)
             if book.title in stems or book.title in (
                 book.ebook_filename,
-                getattr(book, "original_ebook_filename", None),
+                book.original_ebook_filename,
             ):
                 enriched_title = bl_meta_enrichment.title
                 # Persist the improved title so it sticks (batched after loop)
                 book.title = bl_meta_enrichment.title
                 books_needing_title_save.append(book)
 
+        # Opportunistic refresh: cache author/subtitle from live ABS data
+        if book_type != "ebook-only" and book.abs_id in abs_metadata_by_id:
+            _live = abs_metadata_by_id[book.abs_id]
+            _live_author = _live.get("author", "")
+            _live_subtitle = _live.get("subtitle", "")
+            if _live_author and _live_author != book.author:
+                book.author = _live_author
+                if book not in books_needing_title_save:
+                    books_needing_title_save.append(book)
+            if _live_subtitle and _live_subtitle != book.subtitle:
+                book.subtitle = _live_subtitle
+                if book not in books_needing_title_save:
+                    books_needing_title_save.append(book)
+
         mapping = {
             "id": book.id,
             "abs_id": book.abs_id,
@@ -198,7 +203,7 @@ def _run_date_sync():
             "status": book.status,
             "sync_mode": sync_mode,
             "book_type": book_type,
-            "activity_flag": getattr(book, "activity_flag", False),
+            "activity_flag": book.activity_flag,
             "unified_progress": 0,
             "duration": book.duration or 0,
             "storyteller_uuid": book.storyteller_uuid,
@@ -208,7 +213,7 @@ def _run_date_sync():
         }
 
         # Storyteller submission status (from bulk-fetched dict)
-        st_submission = st_submissions_by_book.get(book.abs_id)  # still keyed by abs_id from bulk query
+        st_submission = st_submissions_by_book.get(book.id)
         if st_submission:
             mapping["storyteller_submission_status"] = st_submission.status
@@ -311,9 +316,9 @@ def _run_date_sync():
             mapping["hardcover_url"] = None
 
         # BookFusion link data
-        is_bf_linked = (book.abs_id in bf_linked_ids) or (book.abs_id or "").startswith("bf-")
+        is_bf_linked = (book.id in bf_linked_ids) or (book.abs_id or "").startswith("bf-")
         mapping["bookfusion_linked"] = is_bf_linked
-        mapping["bookfusion_highlight_count"] = bf_highlight_counts.get(book.abs_id, 0) if book.abs_id else 0
+        mapping["bookfusion_highlight_count"] = bf_highlight_counts.get(book.id, 0)
 
         mapping["unified_progress"] = min(max_progress, 100.0)
         mapping["latest_activity_at"] = latest_update_time or None
@@ -329,27 +334,12 @@ def _run_date_sync():
         else:
             mapping["last_sync"] = "Never"
 
-        if book.abs_id and book_type != "ebook-only":
-            mapping["cover_url"] = abs_service.get_cover_proxy_url(book.abs_id)
-        else:
-            mapping["cover_url"] = None
-
-        # Custom cover URL override (user-pasted) takes precedence over auto-discovered sources
-        if book.custom_cover_url:
-            mapping["cover_url"] = book.custom_cover_url
-
-        # Booklore cover fallback for books without an ABS or custom cover
-        if not mapping["cover_url"] and mapping.get("booklore_id") and bl_meta:
-            prefix = booklore_cover_proxy_prefix(bl_meta.server_id)
-            mapping["cover_url"] = f"{prefix}/{mapping['booklore_id']}"
-
-        # KOSync cover fallback (lazy extraction — covers endpoint extracts on demand)
-        if not mapping["cover_url"] and book.kosync_doc_id:
-            mapping["cover_url"] = f'/covers/{book.kosync_doc_id}.jpg'
-
-        # Hardcover cover fallback
-        if not mapping["cover_url"] and mapping.get("hardcover_cover_url"):
-            mapping["cover_url"] = mapping["hardcover_cover_url"]
+        covers = resolve_book_covers(
+            book, abs_service, database_service, book_type,
+            booklore_meta=bl_meta, hardcover_details=hardcover_details,
+        )
+        mapping["cover_url"] = covers['cover_url']
+        mapping["placeholder_logo"] = covers['placeholder_logo']
 
         duration = mapping.get("duration", 0)
         progress_pct = mapping.get("unified_progress", 0)
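Note on the hunk above: resolve_book_covers() centralizes the precedence chain this change deletes from dashboard.py (and that reading_bp.py duplicated): custom URL, then ABS proxy, then Booklore proxy, then lazy KoSync extraction, then Hardcover. A condensed sketch of that waterfall (simplified signature; the real helper also resolves placeholder_logo):

    def resolve_cover_url(book, book_type, booklore_cover=None, hardcover_cover=None):
        """First non-empty source in precedence order wins."""
        if book.custom_cover_url:
            return book.custom_cover_url
        if book.abs_id and book_type != "ebook-only":
            return f"/api/cover-proxy/{book.abs_id}"
        if booklore_cover:
            return booklore_cover
        if book.kosync_doc_id:
            return f"/covers/{book.kosync_doc_id}.jpg"
        return hardcover_cover  # may be None; templates then show the service logo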
@@ -373,6 +363,27 @@ def _run_date_sync():
 
     booklore_label = os.environ.get("BOOKLORE_LABEL", "Booklore") or "Booklore"
 
+    # Unlinked KoSync documents — for dashboard toast + pending identification section
+    kosync_unlinked_count = 0
+    unlinked_reading = []
+    kosync_active = os.environ.get('KOSYNC_ENABLED', '').lower() in ('true', '1', 'yes', 'on') or os.environ.get('KOSYNC_SERVER', '')
+    if kosync_active:
+        try:
+            unlinked_docs = database_service.get_unlinked_kosync_documents()
+            kosync_unlinked_count = len(unlinked_docs)
+            unlinked_reading = [
+                {
+                    'document_hash': doc.document_hash,
+                    'percentage': float(doc.percentage) if doc.percentage else 0,
+                    'device': doc.device,
+                    'last_updated': doc.last_updated.isoformat() if doc.last_updated else None,
+                }
+                for doc in unlinked_docs
+                if doc.percentage and float(doc.percentage) > 0
+            ]
+        except Exception:
+            pass
+
     return render_template(
         "index.html",
         mappings=mappings,
@@ -380,4 +391,6 @@ def _run_date_sync():
         progress=overall_progress,
         app_version=APP_VERSION,
         booklore_label=booklore_label,
+        kosync_unlinked_count=kosync_unlinked_count,
+        unlinked_reading=unlinked_reading,
     )
diff --git a/src/blueprints/helpers.py b/src/blueprints/helpers.py
index 9075386..8385c7e 100644
--- a/src/blueprints/helpers.py
+++ b/src/blueprints/helpers.py
@@ -24,28 +24,29 @@
 
 # --------------- Accessors for shared state ---------------
 
+
 def get_container():
-    return current_app.config['container']
+    return current_app.config["container"]
 
 
 def get_manager():
-    return current_app.config['sync_manager']
+    return current_app.config["sync_manager"]
 
 
 def get_database_service():
-    return current_app.config['database_service']
+    return current_app.config["database_service"]
 
 
 def get_ebook_dir():
-    return current_app.config['EBOOK_DIR']
+    return current_app.config["EBOOK_DIR"]
 
 
 def get_covers_dir():
-    return current_app.config['COVERS_DIR']
+    return current_app.config["COVERS_DIR"]
 
 
 def get_abs_service():
-    return current_app.config['abs_service']
+    return current_app.config["abs_service"]
 
 
 def get_book_or_404(ref):
@@ -58,6 +59,7 @@ def get_book_or_404(ref):
 
 # --------------- Booklore helpers ---------------
 
+
 def get_booklore_client():
     """Return the Booklore client group (facade over all instances)."""
     return get_container().booklore_client_group()
@@ -76,7 +78,7 @@ def find_in_booklore(filename):
     book = group.find_book_by_filename(filename)
     if book:
         # Resolve the specific client that owns this book
-        instance_id = book.get('_instance_id', 'default')
+        instance_id = book.get("_instance_id", "default")
         client = _resolve_booklore_instance(instance_id)
         return book, client
     return None, None
@@ -85,7 +87,7 @@ def find_in_booklore(filename):
 def _resolve_booklore_instance(instance_id):
     """Return the single BookloreClient for the given instance_id."""
     container = get_container()
-    if instance_id == '2':
+    if instance_id == "2":
         return container.booklore_client_2()
     return container.booklore_client()
@@ -93,7 +95,7 @@ def _resolve_booklore_instance(instance_id):
 
 def get_enabled_booklore_server_ids():
     """Return set of server_ids for enabled Booklore instances."""
     group = get_booklore_client()
-    active = getattr(group, '_active', None)
+    active = getattr(group, "_active", None)
     if not isinstance(active, (list, tuple)):
         return set()
     return {c.instance_id for c in active}
@@ -101,9 +103,9 @@ def get_enabled_booklore_server_ids():
 
 def booklore_cover_proxy_prefix(server_id):
     """Return the cover-proxy URL path prefix for a Booklore instance."""
-    if server_id == '2':
-        return '/api/cover-proxy/booklore2'
-    return '/api/cover-proxy/booklore'
+    if server_id == "2":
+        return "/api/cover-proxy/booklore2"
+    return "/api/cover-proxy/booklore"
 
 
 def any_booklore_configured():
@@ -113,23 +115,24 @@ def any_booklore_configured():
 
 def _booklore_label(instance_id):
     """Return the user-facing label for a Booklore instance."""
-    if instance_id == '2':
+    if instance_id == "2":
         return os.environ.get("BOOKLORE_2_LABEL", "Booklore 2")
     return os.environ.get("BOOKLORE_LABEL", "Booklore")
 
 
 # --------------- Helper functions ---------------
 
+
 def get_audiobooks_conditionally():
     """Get audiobooks from configured libraries (ABS_LIBRARY_IDS) or all libraries if not set."""
     return get_abs_service().get_audiobooks()
 
 
-def get_abs_author(ab):
-    """Extract author from ABS audiobook metadata."""
-    media = ab.get('media', {})
-    metadata = media.get('metadata', {})
-    return metadata.get('authorName') or (metadata.get('authors') or [{}])[0].get("name", "")
+def get_audiobook_author(ab):
+    """Extract author from audiobook metadata."""
+    media = ab.get("media", {})
+    metadata = media.get("metadata", {})
+    return metadata.get("authorName") or (metadata.get("authors") or [{}])[0].get("name", "")
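Note on the rename above: get_audiobook_author() prefers the flattened authorName field and falls back to the first entry of the authors list; every miss degrades to an empty string rather than a KeyError. Worked examples of the three paths, given the function as shown:

    ab = {"media": {"metadata": {"authors": [{"name": "N. K. Jemisin"}]}}}
    assert get_audiobook_author(ab) == "N. K. Jemisin"   # list fallback
    ab2 = {"media": {"metadata": {"authorName": "Ann Leckie"}}}
    assert get_audiobook_author(ab2) == "Ann Leckie"     # preferred field
    assert get_audiobook_author({"media": {}}) == ""     # nothing known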
 
 
 def audiobook_matches_search(ab, search_term):
@@ -137,10 +140,10 @@ def audiobook_matches_search(ab, search_term):
     manager = get_manager()
 
     def normalize(s):
-        return re.sub(r'[^\w\s]', '', s.lower())
+        return re.sub(r"[^\w\s]", "", s.lower())
 
-    title = normalize(manager.get_abs_title(ab))
-    author = normalize(get_abs_author(ab))
+    title = normalize(manager.get_audiobook_title(ab))
+    author = normalize(get_audiobook_author(ab))
     search_norm = normalize(search_term)
 
     # 1. Standard Search
@@ -212,7 +215,7 @@ def get_kosync_id_for_ebook(ebook_filename, booklore_id=None, original_filename=
             if not epub_cache.exists():
                 epub_cache.mkdir(parents=True, exist_ok=True)
 
-            if abs_client.download_file(target['stream_url'], cached_path):
+            if abs_client.download_file(target["stream_url"], cached_path):
                 logger.info(f" Downloaded ABS ebook to '{cached_path}'")
                 return container.ebook_parser().get_kosync_id(cached_path)
             else:
@@ -239,23 +242,23 @@ def get_kosync_id_for_ebook(ebook_filename, booklore_id=None, original_filename=
             results = cwa_client.search_ebooks(cwa_id)
 
             for res in results:
-                if str(res.get('id')) == cwa_id:
+                if str(res.get("id")) == cwa_id:
                     target = res
                     break
 
             if not target and len(results) == 1:
                 target = results[0]
 
-            if target and target.get('download_url'):
+            if target and target.get("download_url"):
                 logger.info(f"Using direct download link from search for '{target.get('title', 'Unknown')}'")
             else:
                 logger.debug("Search did not return a usable result, trying direct ID lookup")
                 target = cwa_client.get_book_by_id(cwa_id)
 
-            if target and target.get('download_url'):
+            if target and target.get("download_url"):
                 if not epub_cache.exists():
                     epub_cache.mkdir(parents=True, exist_ok=True)
 
-                if cwa_client.download_ebook(target['download_url'], cached_path):
+                if cwa_client.download_ebook(target["download_url"], cached_path):
                     logger.info(f" Downloaded CWA ebook to '{cached_path}'")
                     return container.ebook_parser().get_kosync_id(cached_path)
                 else:
@@ -272,7 +275,9 @@ def get_kosync_id_for_ebook(ebook_filename, booklore_id=None, original_filename=
             "or mount the ebooks directory to /books"
         )
     elif not booklore_id and not find_ebook_file(ebook_filename):
-        logger.warning(f"Cannot compute KOSync ID for '{ebook_filename}': File not found in Booklore, filesystem, or remote sources")
+        logger.warning(
+            f"Cannot compute KOSync ID for '{ebook_filename}': File not found in Booklore, filesystem, or remote sources"
+        )
 
     return None
@@ -280,11 +285,22 @@
 
 class EbookResult:
     """Wrapper to provide consistent interface for ebooks from Booklore, CWA, ABS, or filesystem."""
 
-    def __init__(self, name, title=None, subtitle=None, authors=None, booklore_id=None, path=None, source=None, source_id=None, cover_url=None):
+    def __init__(
+        self,
+        name,
+        title=None,
+        subtitle=None,
+        authors=None,
+        booklore_id=None,
+        path=None,
+        source=None,
+        source_id=None,
+        cover_url=None,
+    ):
         self.name = name
         self.title = title or Path(name).stem
-        self.subtitle = subtitle or ''
-        self.authors = authors or ''
+        self.subtitle = subtitle or ""
+        self.authors = authors or ""
         self.booklore_id = booklore_id
         self.path = path
         self.source = source
@@ -329,26 +345,28 @@ def get_searchable_ebooks(search_term):
         books = bl_group.search_books(search_term)
         if books:
             for b in books:
-                fname = b.get('fileName', '')
-                if fname.lower().endswith('.epub'):
+                fname = b.get("fileName", "")
+                if fname.lower().endswith(".epub"):
                     if fname.lower() in found_filenames:
                         continue
                     found_filenames.add(fname.lower())
                     found_stems.add(Path(fname).stem.lower())
-                    bl_id = b.get('id')
-                    instance_id = b.get('_instance_id', 'default')
+                    bl_id = b.get("id")
+                    instance_id = b.get("_instance_id", "default")
                     label = _booklore_label(instance_id)
                     cover_prefix = "booklore2" if instance_id == "2" else "booklore"
                     cover = f"/api/cover-proxy/{cover_prefix}/{bl_id}" if bl_id else None
-                    results.append(EbookResult(
-                        name=fname,
-                        title=b.get('title'),
-                        subtitle=b.get('subtitle'),
-                        authors=b.get('authors'),
-                        booklore_id=bl_id,
-                        source=label,
-                        cover_url=cover
-                    ))
+                    results.append(
+                        EbookResult(
+                            name=fname,
+                            title=b.get("title"),
+                            subtitle=b.get("subtitle"),
+                            authors=b.get("authors"),
+                            booklore_id=bl_id,
+                            source=label,
+                            cover_url=cover,
+                        )
+                    )
     except Exception as e:
         logger.warning(f"Booklore search failed: {e}")
@@ -359,22 +377,24 @@ def get_searchable_ebooks(search_term):
         abs_ebooks = abs_service.search_ebooks(search_term)
         if abs_ebooks:
             for ab in abs_ebooks:
-                ebook_files = abs_service.get_ebook_files(ab['id'])
+                ebook_files = abs_service.get_ebook_files(ab["id"])
                 if ebook_files:
                     ef = ebook_files[0]
                     fname = f"{ab['id']}_abs.{ef['ext']}"
                     if fname.lower() not in found_filenames:
-                        results.append(EbookResult(
-                            name=fname,
-                            title=ab.get('title'),
-                            authors=ab.get('author'),
-                            source='ABS',
-                            source_id=ab.get('id'),
-                            cover_url=f"/api/cover-proxy/{ab['id']}"
-                        ))
+                        results.append(
+                            EbookResult(
+                                name=fname,
+                                title=ab.get("title"),
+                                authors=ab.get("author"),
+                                source="ABS",
+                                source_id=ab.get("id"),
+                                cover_url=f"/api/cover-proxy/{ab['id']}",
+                            )
+                        )
                         found_filenames.add(fname.lower())
-                        if ab.get('title'):
-                            found_stems.add(ab['title'].lower().strip())
+                        if ab.get("title"):
+                            found_stems.add(ab["title"].lower().strip())
     except Exception as e:
         logger.warning(f"ABS ebook search failed: {e}")
@@ -388,16 +408,18 @@ def get_searchable_ebooks(search_term):
             for cr in cwa_results:
                 fname = f"cwa_{cr.get('id', 'unknown')}.{cr.get('ext', 'epub')}"
                 if fname.lower() not in found_filenames:
-                    results.append(EbookResult(
-                        name=fname,
-                        title=cr.get('title'),
-                        authors=cr.get('author'),
-                        source='CWA',
-                        source_id=cr.get('id')
-                    ))
+                    results.append(
+                        EbookResult(
+                            name=fname,
+                            title=cr.get("title"),
+                            authors=cr.get("author"),
+                            source="CWA",
+                            source_id=cr.get("id"),
+                        )
+                    )
                     found_filenames.add(fname.lower())
-                    if cr.get('title'):
-                        found_stems.add(cr['title'].lower().strip())
+                    if cr.get("title"):
+                        found_stems.add(cr["title"].lower().strip())
     except Exception as e:
         logger.warning(f"CWA search failed: {e}")
@@ -413,7 +435,7 @@ def get_searchable_ebooks(search_term):
                 continue
 
             if not search_term or search_term.lower() in fname_lower:
-                results.append(EbookResult(name=eb.name, path=eb, source='Local File'))
+                results.append(EbookResult(name=eb.name, path=eb, source="Local File"))
                 found_filenames.add(fname_lower)
                 found_stems.add(stem_lower)
@@ -458,7 +480,7 @@ def cleanup_mapping_resources(book):
     except Exception as e:
         logger.debug(f"Failed to get epub cache dir: {e}")
 
-    manager_cache_dir = getattr(manager, 'epub_cache_dir', None)
+    manager_cache_dir = getattr(manager, "epub_cache_dir", None)
     if manager_cache_dir:
         cache_dirs.append(manager_cache_dir)
@@ -481,11 +503,11 @@ def cleanup_mapping_resources(book):
         except Exception as e:
             logger.warning(f"Failed to delete cached ebook {book.ebook_filename}: {e}")
 
-    if getattr(book, 'sync_mode', 'audiobook') == 'ebook_only' and book.kosync_doc_id:
+    if book.sync_mode == "ebook_only" and book.kosync_doc_id:
         logger.info(f"Deleting KOSync document record for ebook-only mapping: '{book.kosync_doc_id[:8]}'")
         database_service.delete_kosync_document(book.kosync_doc_id)
 
-    collection_name = os.environ.get('ABS_COLLECTION_NAME', 'Synced with KOReader')
+    collection_name = os.environ.get("ABS_COLLECTION_NAME", "Synced with KOReader")
     try:
         get_abs_service().remove_from_collection(book.abs_id, collection_name)
     except Exception as e:
@@ -511,10 +533,62 @@ def restart_server():
     os.kill(os.getpid(), signal.SIGTERM)
 
 
+def serialize_suggestion(s):
+    """Shared serializer for PendingSuggestion → JSON-ready dict."""
+    matches = []
+    for m in s.matches:
+        evidence = m.get("evidence") or []
+        has_bookfusion = m.get("source_family") == "bookfusion" or any(ev.startswith("bookfusion") for ev in evidence)
+        matches.append(
+            {
+                **m,
+                "evidence": evidence,
+                "has_bookfusion": has_bookfusion,
+            }
+        )
+
+    has_bookfusion_evidence = any(m.get("has_bookfusion") for m in matches)
+    return {
+        "id": s.id,
+        "source_id": s.source_id,
+        "source": s.source or "unknown",
+        "title": s.title,
+        "author": s.author,
+        "cover_url": s.cover_url,
+        "matches": matches,
+        "created_at": s.created_at.isoformat() if s.created_at else None,
+        "has_bookfusion_evidence": has_bookfusion_evidence,
+        "top_match": matches[0] if matches else None,
+        "status": "hidden" if s.status == "dismissed" else s.status,
+        "hidden": s.status in ("hidden", "dismissed"),
+    }
+
+
+def find_booklore_metadata(book, booklore_by_filename):
+    """Find best Booklore metadata entry for a book by filename."""
+    for fn in (book.ebook_filename, book.original_ebook_filename):
+        if fn:
+            candidates = booklore_by_filename.get(fn.lower(), [])
+            match = next((b for b in candidates if b.title), candidates[0] if candidates else None)
+            if match:
+                return match
+    return None
+
+
+def attempt_hardcover_automatch(container, book):
+    """Best-effort Hardcover automatch after book creation."""
+    try:
+        hc_service = container.hardcover_service()
+        if hc_service.is_configured():
+            hc_service.automatch_hardcover(book, hardcover_sync_client=container.hardcover_sync_client())
+    except Exception as e:
+        logger.warning(f"Hardcover automatch failed (book saved): {e}")
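Note on the consolidation above: serialize_suggestion() is now the single serializer behind both /api/suggestions and the suggestions page, so its output contract is worth pinning down. An illustrative call against a stub object (field values invented), assuming the helper as defined above:

    from types import SimpleNamespace

    stub = SimpleNamespace(
        id=1, source_id="li_abc123", source="abs",
        title="Example", author="A. Writer", cover_url=None,
        created_at=None, status="dismissed",
        matches=[{"source_family": "bookfusion", "evidence": ["bookfusion:title"]}],
    )
    out = serialize_suggestion(stub)
    assert out["status"] == "hidden"            # legacy 'dismissed' normalized
    assert out["hidden"] is True
    assert out["top_match"]["has_bookfusion"] is True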
+
+
 def safe_folder_name(name: str) -> str:
     """Sanitize folder name for file system safe usage."""
     invalid = '<>:"/\\|?*'
     name = html.escape(str(name).strip())[:150]
     for c in invalid:
-        name = name.replace(c, '_')
+        name = name.replace(c, "_")
     return name.strip() or "Unknown"
diff --git a/src/blueprints/matching_bp.py b/src/blueprints/matching_bp.py
index 06ca611..2527ecf 100644
--- a/src/blueprints/matching_bp.py
+++ b/src/blueprints/matching_bp.py
@@ -1,6 +1,5 @@
 """Matching blueprint — suggestions, single match, batch match."""
 
-import json
 import logging
 import os
 import threading
@@ -9,17 +8,23 @@
 from flask import Blueprint, current_app, flash, redirect, render_template, request, session, url_for
 
 from src.blueprints.helpers import (
+    any_booklore_configured,
+    attempt_hardcover_automatch,
     audiobook_matches_search,
     find_in_booklore,
     get_abs_service,
+    get_audiobook_author,
     get_audiobooks_conditionally,
     get_container,
     get_database_service,
+    get_ebook_dir,
     get_kosync_id_for_ebook,
     get_manager,
     get_searchable_ebooks,
+    serialize_suggestion,
 )
 from src.db.models import Book, StorytellerSubmission
+from src.services.kosync_service import ensure_kosync_document
 from src.utils.logging_utils import sanitize_log_data
 from src.utils.path_utils import sanitize_filename
@@ -68,7 +73,8 @@ def _do_submit():
             logger.warning(f"Storyteller submission error for '{book_title}': {e}")
             try:
                 db_svc = container.database_service()
-                submission = db_svc.get_active_storyteller_submission(abs_id)
+                book = db_svc.get_book_by_abs_id(abs_id)
+                submission = db_svc.get_active_storyteller_submission_by_book_id(book.id) if book else None
                 if submission:
                     db_svc.update_storyteller_submission_status(submission.id, "failed")
             except Exception:
@@ -79,48 +85,142 @@ def _do_submit():
 
 def _copy_book_merge_metadata(existing_book, overrides=None):
     metadata = {
-        "storyteller_uuid": getattr(existing_book, "storyteller_uuid", None),
-        "original_ebook_filename": getattr(existing_book, "original_ebook_filename", None),
-        "abs_ebook_item_id": getattr(existing_book, "abs_ebook_item_id", None),
-        "ebook_item_id": getattr(existing_book, "ebook_item_id", None) or getattr(existing_book, "abs_ebook_item_id", None),
-        "custom_cover_url": getattr(existing_book, "custom_cover_url", None),
-        "started_at": getattr(existing_book, "started_at", None),
-        "finished_at": getattr(existing_book, "finished_at", None),
-        "rating": getattr(existing_book, "rating", None),
-        "read_count": getattr(existing_book, "read_count", 1),
+        "storyteller_uuid": existing_book.storyteller_uuid,
+        "original_ebook_filename": existing_book.original_ebook_filename,
+        "abs_ebook_item_id": existing_book.abs_ebook_item_id,
+        "ebook_item_id": existing_book.ebook_item_id or existing_book.abs_ebook_item_id,
+        "custom_cover_url": existing_book.custom_cover_url,
+        "started_at": existing_book.started_at,
+        "finished_at": existing_book.finished_at,
+        "rating": existing_book.rating,
+        "read_count": existing_book.read_count or 1,
     }
     if overrides:
        metadata.update({key: value for key, value in overrides.items() if value is not None})
     return metadata
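Note on the helper above: with direct attribute access it now assumes a real SQLAlchemy Book model. Overrides only win when they are not None, so a caller cannot accidentally blank out a carried-over field:

    meta = _copy_book_merge_metadata(existing_book, {"rating": 5, "storyteller_uuid": None})
    # meta["rating"] == 5 (override applied);
    # meta["storyteller_uuid"] keeps existing_book's value (None override dropped)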
-def _serialize_suggestion(s):
-    matches = []
-    for m in s.matches:
-        evidence = m.get("evidence") or []
-        has_bookfusion = m.get("source_family") == "bookfusion" or any(ev.startswith("bookfusion") for ev in evidence)
-        matches.append(
+def _create_book_mapping(container, abs_id, title, ebook_filename, duration,
+                         storyteller_uuid=None, storyteller_submit=False,
+                         author=None, subtitle=None):
+    """Create a book mapping with full pipeline: Booklore, KOSync, merge, Hardcover, etc.
+
+    Returns (book, error_message). On success error_message is None.
+    On failure book is None and error_message describes the problem.
+    """
+    database_service = get_database_service()
+    abs_service = get_abs_service()
+
+    # Booklore lookup
+    booklore_id = None
+    bl_match, bl_match_client = find_in_booklore(ebook_filename)
+    if bl_match:
+        booklore_id = bl_match.get("id")
+
+    # KOSync ID
+    kosync_doc_id = get_kosync_id_for_ebook(ebook_filename, booklore_id, bl_client=bl_match_client)
+    if not kosync_doc_id:
+        logger.warning(f"Cannot compute KOSync ID for '{sanitize_log_data(ebook_filename)}'")
+        return None, "Could not compute KOSync ID for ebook"
+
+    # Hash preservation
+    current_book_entry = database_service.get_book_by_ref(abs_id)
+    if current_book_entry and current_book_entry.kosync_doc_id:
+        logger.info(f"Preserving existing hash '{current_book_entry.kosync_doc_id}' for '{abs_id}'")
+        kosync_doc_id = current_book_entry.kosync_doc_id
+
+    # Duplicate merge detection
+    existing_book = database_service.get_book_by_kosync_id(kosync_doc_id)
+    migration_source_id = None
+    original_ebook_filename = None
+
+    if existing_book and existing_book.abs_id != abs_id:
+        logger.info(f"Merging existing '{existing_book.abs_id}' into '{abs_id}'")
+        migration_source_id = existing_book.abs_id
+        ebook_item_id = existing_book.ebook_item_id or existing_book.abs_ebook_item_id or existing_book.abs_id
+        original_ebook_filename = existing_book.original_ebook_filename or existing_book.ebook_filename
+        merge_metadata = _copy_book_merge_metadata(
+            existing_book,
             {
-                **m,
-                "evidence": evidence,
-                "has_bookfusion": has_bookfusion,
-            }
+                "abs_ebook_item_id": ebook_item_id,
+                "ebook_item_id": ebook_item_id,
+                "original_ebook_filename": original_ebook_filename,
+                "storyteller_uuid": storyteller_uuid or existing_book.storyteller_uuid,
+            },
         )
+    else:
+        merge_metadata = {
+            "storyteller_uuid": storyteller_uuid,
+            "original_ebook_filename": None,
+            "abs_ebook_item_id": None,
+            "ebook_item_id": None,
+        }
+
+    # Create book
+    book = Book(
+        abs_id=abs_id,
+        title=title,
+        ebook_filename=ebook_filename,
+        kosync_doc_id=kosync_doc_id,
+        transcript_file=None,
+        status="pending",
+        duration=duration,
+        author=author,
+        subtitle=subtitle,
+        **merge_metadata,
+    )
+    database_service.save_book(book, is_new=True)
+    ensure_kosync_document(book, database_service)
 
-    has_bookfusion_evidence = any(m.get("has_bookfusion") for m in matches)
-    return {
-        "id": s.id,
-        "source_id": s.source_id,
-        "title": s.title,
-        "author": s.author,
-        "cover_url": s.cover_url,
-        "matches": matches,
-        "created_at": s.created_at.isoformat() if s.created_at else None,
-        "has_bookfusion_evidence": has_bookfusion_evidence,
-        "top_match": matches[0] if matches else None,
-        "status": "hidden" if s.status == "dismissed" else s.status,
-        "hidden": s.status in ("hidden", "dismissed"),
-    }
+    # Storyteller reservation (before HTTP calls to prevent race)
+    if storyteller_submit:
+        _create_storyteller_reservation(database_service, abs_id)
+
+    # Duplicate merge migration
+    if migration_source_id:
+        try:
+            database_service.migrate_book_data(migration_source_id, abs_id)
+            database_service.delete_book(existing_book.id)
+            abs_service.add_to_collection(abs_id, current_app.config["ABS_COLLECTION_NAME"])
+            logger.info(f"Successfully merged {migration_source_id} into {abs_id}")
+        except Exception as e:
+            logger.error(f"Failed to merge book data: {e}")
+            raise
+
+    # Hardcover automatch
+    attempt_hardcover_automatch(container, book)
+
+    # ABS collection add
+    if not migration_source_id:
+        abs_service.add_to_collection(abs_id, current_app.config["ABS_COLLECTION_NAME"])
+
+    # Booklore shelf add
+    if bl_match_client:
+        shelf_filename = original_ebook_filename or ebook_filename
+        try:
+            bl_match_client.add_to_shelf(shelf_filename)
+        except Exception as e:
+            logger.warning(f"Booklore add_to_shelf failed for '{sanitize_log_data(shelf_filename)}': {e}")
+
+    # Storyteller submission (background thread)
+    if storyteller_submit:
+        _submit_to_storyteller_async(
+            container, abs_id, title, ebook_filename,
+            current_app.config.get("BOOKS_DIR", ""),
+            current_app.config.get("EPUB_CACHE_DIR", ""),
+        )
+
+    # Resolve suggestions
+    database_service.resolve_suggestion(abs_id)
+    database_service.resolve_suggestion(kosync_doc_id)
+    try:
+        device_doc = database_service.get_kosync_doc_by_filename(ebook_filename)
+        if device_doc and device_doc.document_hash != kosync_doc_id:
+            database_service.resolve_suggestion(device_doc.document_hash)
+    except Exception as e:
+        logger.warning(f"Failed to check/resolve device hash: {e}")
+
+    return book, None
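Note on the extraction above: both call sites consume the same (book, error) contract from _create_book_mapping(); the hunks below show the single-match POST reduced to one call, and the batch processor follows the same pattern. Sketch of a call, with placeholder values:

    book, error = _create_book_mapping(
        container, abs_id, title=title, ebook_filename=ebook_filename,
        duration=duration, storyteller_uuid=None, storyteller_submit=False,
    )
    if error:
        return error, 404  # e.g. "Could not compute KOSync ID for ebook"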
{}).get("subtitle") or None, ) database_service.save_book(book, is_new=True) abs_service.add_to_collection(abs_id, current_app.config["ABS_COLLECTION_NAME"]) - try: - hc_service = container.hardcover_service() - if hc_service.is_configured(): - hc_service.automatch_hardcover(book, hardcover_sync_client=container.hardcover_sync_client()) - except Exception as e: - logger.warning(f"Hardcover automatch failed (book saved): {e}") + attempt_hardcover_automatch(container, book) database_service.resolve_suggestion(abs_id) return redirect(url_for("dashboard.index")) @@ -264,6 +369,7 @@ def match(): storyteller_uuid=storyteller_uuid, ) database_service.save_book(book, is_new=True) + ensure_kosync_document(book, database_service) if kosync_doc_id: database_service.resolve_suggestion(kosync_doc_id) return redirect(url_for("dashboard.index")) @@ -288,6 +394,7 @@ def match(): book.kosync_doc_id = kosync_doc_id book.status = "pending" database_service.save_book(book) + ensure_kosync_document(book, database_service) if bl_client: try: bl_client.add_to_shelf(ebook_filename) @@ -314,12 +421,14 @@ def match(): return "Audiobook not found", 404 new_book = Book( abs_id=abs_id, - title=manager.get_abs_title(selected_ab), + title=manager.get_audiobook_title(selected_ab), ebook_filename=book.ebook_filename, kosync_doc_id=book.kosync_doc_id, status=book.status or "not_started", duration=manager.get_duration(selected_ab), sync_mode="audiobook", + author=get_audiobook_author(selected_ab), + subtitle=selected_ab.get("media", {}).get("metadata", {}).get("subtitle") or None, **_copy_book_merge_metadata( book, { @@ -329,6 +438,7 @@ def match(): ), ) database_service.save_book(new_book) + ensure_kosync_document(new_book, database_service) try: database_service.migrate_book_data(link_book_id, abs_id) database_service.delete_book(book.id) @@ -337,12 +447,7 @@ def match(): except Exception as e: logger.error(f"Failed to merge book data: {e}") raise - try: - hc_service = container.hardcover_service() - if hc_service.is_configured(): - hc_service.automatch_hardcover(new_book, hardcover_sync_client=container.hardcover_sync_client()) - except Exception as e: - logger.warning(f"Hardcover automatch failed (book saved): {e}") + attempt_hardcover_automatch(container, new_book) database_service.resolve_suggestion(abs_id) if new_book.kosync_doc_id: database_service.resolve_suggestion(new_book.kosync_doc_id) @@ -351,138 +456,28 @@ def match(): # --- Standard flow (requires audiobook) --- abs_service = get_abs_service() abs_id = request.form.get("audiobook_id") - selected_filename = sanitize_filename(request.form.get("ebook_filename")) - ebook_filename = selected_filename - original_ebook_filename = None + ebook_filename = sanitize_filename(request.form.get("ebook_filename")) + storyteller_uuid = request.form.get("storyteller_uuid") + storyteller_submit = request.form.get("storyteller_submit") + audiobooks = abs_service.get_audiobooks() selected_ab = next((ab for ab in audiobooks if ab["id"] == abs_id), None) if not selected_ab: return "Audiobook not found", 404 - booklore_id = None - storyteller_uuid = request.form.get("storyteller_uuid") - - bl_match, bl_match_client = find_in_booklore(ebook_filename) - if bl_match: - booklore_id = bl_match.get("id") - - kosync_doc_id = get_kosync_id_for_ebook(ebook_filename, booklore_id, bl_client=bl_match_client) - - if not kosync_doc_id: - logger.warning( - f"Cannot compute KOSync ID for '{sanitize_log_data(ebook_filename)}': File not found in Booklore or filesystem" - ) - return "Could not 
compute KOSync ID for ebook", 404 - - # Hash Preservation - current_book_entry = database_service.get_book_by_ref(abs_id) - if current_book_entry and current_book_entry.kosync_doc_id: - logger.info( - f"Preserving existing hash '{current_book_entry.kosync_doc_id}' for '{abs_id}' instead of new hash '{kosync_doc_id}'" - ) - kosync_doc_id = current_book_entry.kosync_doc_id - - # Duplicate Merge - existing_book = database_service.get_book_by_kosync_id(kosync_doc_id) - migration_source_id = None - - if existing_book and existing_book.abs_id != abs_id: - logger.info(f"Found existing book entry '{existing_book.abs_id}' for this ebook -- Merging into '{abs_id}'") - migration_source_id = existing_book.abs_id - ebook_item_id = existing_book.ebook_item_id or existing_book.abs_ebook_item_id or existing_book.abs_id - - if not original_ebook_filename: - original_ebook_filename = existing_book.original_ebook_filename or existing_book.ebook_filename - merge_metadata = _copy_book_merge_metadata( - existing_book, - { - "abs_ebook_item_id": ebook_item_id, - "ebook_item_id": ebook_item_id, - "original_ebook_filename": original_ebook_filename, - "storyteller_uuid": storyteller_uuid or existing_book.storyteller_uuid, - }, - ) - else: - ebook_item_id = None - merge_metadata = { - "storyteller_uuid": storyteller_uuid, - "original_ebook_filename": original_ebook_filename, - "abs_ebook_item_id": ebook_item_id, - "ebook_item_id": ebook_item_id, - } - - book = Book( - abs_id=abs_id, - title=manager.get_abs_title(selected_ab), + _ab_meta = selected_ab.get("media", {}).get("metadata", {}) + book, error = _create_book_mapping( + container, abs_id, + title=manager.get_audiobook_title(selected_ab), ebook_filename=ebook_filename, - kosync_doc_id=kosync_doc_id, - transcript_file=None, - status="pending", duration=manager.get_duration(selected_ab), - **merge_metadata, + storyteller_uuid=storyteller_uuid, + storyteller_submit=bool(storyteller_submit), + author=get_audiobook_author(selected_ab), + subtitle=_ab_meta.get("subtitle") or None, ) - - storyteller_submit = request.form.get("storyteller_submit") - - database_service.save_book(book, is_new=True) - - # Create Storyteller reservation immediately after saving the book — - # before any HTTP calls (Hardcover, Booklore, ABS) that could take - # seconds and let the sync cycle pick up the book without a reservation. 
-    if storyteller_submit:
-        _create_storyteller_reservation(database_service, abs_id)
-
-    # Duplicate Merge: Migrate
-    if migration_source_id:
-        try:
-            database_service.migrate_book_data(migration_source_id, abs_id)
-            database_service.delete_book(existing_book.id)
-            abs_service.add_to_collection(abs_id, current_app.config["ABS_COLLECTION_NAME"])
-            logger.info(f"Successfully merged {migration_source_id} into {abs_id}")
-        except Exception as e:
-            logger.error(f"Failed to merge book data: {e}")
-            raise
-
-    # Trigger Hardcover Automatch
-    try:
-        hc_service = container.hardcover_service()
-        if hc_service.is_configured():
-            hc_service.automatch_hardcover(book, hardcover_sync_client=container.hardcover_sync_client())
-    except Exception as e:
-        logger.warning(f"Hardcover automatch failed (book saved): {e}")
-
-    if not migration_source_id:
-        abs_service.add_to_collection(abs_id, current_app.config["ABS_COLLECTION_NAME"])
-    if bl_match_client:
-        shelf_filename = original_ebook_filename or ebook_filename
-        try:
-            bl_match_client.add_to_shelf(shelf_filename)
-        except Exception as e:
-            logger.warning(f"Booklore add_to_shelf failed for '{sanitize_log_data(shelf_filename)}': {e}")
-    # Storyteller submission (runs in background thread to avoid blocking)
-    if storyteller_submit:
-        _submit_to_storyteller_async(
-            container,
-            abs_id,
-            manager.get_abs_title(selected_ab),
-            ebook_filename,
-            current_app.config.get("BOOKS_DIR", ""),
-            current_app.config.get("EPUB_CACHE_DIR", ""),
-        )
-
-    # Remove resolved suggestions once the mapping is created
-    database_service.resolve_suggestion(abs_id)
-    database_service.resolve_suggestion(kosync_doc_id)
-
-    try:
-        device_doc = database_service.get_kosync_doc_by_filename(ebook_filename)
-        if device_doc and device_doc.document_hash != kosync_doc_id:
-            logger.info(
-                f"Resolving additional suggestion/hash for '{ebook_filename}': '{device_doc.document_hash}'"
-            )
-            database_service.resolve_suggestion(device_doc.document_hash)
-    except Exception as e:
-        logger.warning(f"Failed to check/resolve device hash: {e}")
+    if error:
+        return error, 404

     return redirect(url_for("dashboard.index"))
@@ -541,6 +536,16 @@ def match():
         pass
     storyteller_force_mode = os.environ.get("STORYTELLER_FORCE_MODE", "false").lower() == "true"
+    storyteller_configured = container.storyteller_client().is_configured()
+
+    # Detect available services for smart mode defaults
+    abs_configured = abs_service.is_available()
+    has_ebook_sources = (
+        any_booklore_configured()
+        or container.cwa_client().is_configured()
+        or abs_service.has_ebook_libraries()
+        or get_ebook_dir().exists()
+    )

     # Build sets of IDs already in the library for "In Library" badges
     library_abs_ids = set()
@@ -557,7 +562,7 @@ def match():
         ebooks=ebooks,
         storyteller_books=storyteller_books,
         search=search,
-        get_title=manager.get_abs_title,
+        get_title=manager.get_audiobook_title,
         attach_to=attach_to,
         attach_title=attach_title,
         link_to=link_to,
@@ -565,8 +570,11 @@
         preselect_abs_id=preselect_abs_id,
         storyteller_submit_available=storyteller_submit_available,
         storyteller_force_mode=storyteller_force_mode,
+        storyteller_configured=storyteller_configured,
         library_abs_ids=library_abs_ids,
         library_ebook_filenames=library_ebook_filenames,
+        abs_configured=abs_configured,
+        has_ebook_sources=has_ebook_sources,
     )
@@ -582,33 +590,55 @@ def batch_match():
    action = request.form.get("action")
    if action == "add_to_queue":
        session.setdefault("queue", [])
-        abs_id = request.form.get("audiobook_id")
+        abs_id = request.form.get("audiobook_id") or ""
        ebook_filename = sanitize_filename(request.form.get("ebook_filename", "")) or ""
        ebook_display_name = request.form.get("ebook_display_name", ebook_filename)
        storyteller_uuid = request.form.get("storyteller_uuid", "")
-        audiobooks = abs_service.get_audiobooks()
-        selected_ab = next((ab for ab in audiobooks if ab["id"] == abs_id), None)
-        if selected_ab:
-            if not any(item["abs_id"] == abs_id for item in session["queue"]):
-                is_audio_only = not ebook_filename and not storyteller_uuid
-                session["queue"].append(
-                    {
-                        "abs_id": abs_id,
-                        "title": manager.get_abs_title(selected_ab),
-                        "ebook_filename": ebook_filename,
-                        "ebook_display_name": ebook_display_name,
-                        "storyteller_uuid": storyteller_uuid,
-                        "storyteller_submit": bool(request.form.get("storyteller_submit")),
-                        "duration": manager.get_duration(selected_ab),
-                        "cover_url": abs_service.get_cover_proxy_url(abs_id),
-                        "audio_only": is_audio_only,
-                    }
-                )
-                session.modified = True
+
+        if not abs_id and not ebook_filename and not storyteller_uuid:
+            return redirect(url_for("matching.batch_match", search=request.form.get("search", "")))
+
+        # Resolve audiobook metadata if present
+        selected_ab = None
+        if abs_id:
+            audiobooks = abs_service.get_audiobooks()
+            selected_ab = next((ab for ab in audiobooks if ab["id"] == abs_id), None)
+            if not selected_ab:
+                return redirect(url_for("matching.batch_match", search=request.form.get("search", "")))
+
+        # Dedup key: abs_id if present, otherwise ebook_filename
+        queue_key = abs_id or ebook_filename
+        if not any(item.get("queue_key") == queue_key for item in session["queue"]):
+            is_ebook_only = not abs_id and (ebook_filename or storyteller_uuid)
+            is_audio_only = abs_id and not ebook_filename and not storyteller_uuid
+            title = (
+                manager.get_audiobook_title(selected_ab) if selected_ab
+                else ebook_display_name or Path(ebook_filename).stem if ebook_filename
+                else "Storyteller Book"
+            )
+            _ab_meta = (selected_ab or {}).get("media", {}).get("metadata", {})
+            session["queue"].append(
+                {
+                    "queue_key": queue_key,
+                    "abs_id": abs_id,
+                    "title": title,
+                    "ebook_filename": ebook_filename,
+                    "ebook_display_name": ebook_display_name,
+                    "storyteller_uuid": storyteller_uuid,
+                    "storyteller_submit": bool(request.form.get("storyteller_submit")),
+                    "duration": manager.get_duration(selected_ab) if selected_ab else 0,
+                    "cover_url": abs_service.get_cover_proxy_url(abs_id) if abs_id else None,
+                    "audio_only": is_audio_only,
+                    "ebook_only": is_ebook_only,
+                    "author": get_audiobook_author(selected_ab) if selected_ab else None,
+                    "subtitle": _ab_meta.get("subtitle") or None,
+                }
+            )
+            session.modified = True
        return redirect(url_for("matching.batch_match", search=request.form.get("search", "")))
    elif action == "remove_from_queue":
-        abs_id = request.form.get("abs_id")
-        session["queue"] = [item for item in session.get("queue", []) if item["abs_id"] != abs_id]
+        remove_key = request.form.get("queue_key") or request.form.get("abs_id")
+        session["queue"] = [item for item in session.get("queue", []) if item.get("queue_key", item.get("abs_id")) != remove_key]
        session.modified = True
        return redirect(url_for("matching.batch_match"))
    elif action == "clear_queue":
@@ -630,140 +660,61 @@ def batch_match():
                        status="not_started",
                        duration=item["duration"],
                        sync_mode="audiobook",
+                        author=item.get("author"),
+                        subtitle=item.get("subtitle"),
                    )
                    database_service.save_book(book, is_new=True)
                    abs_service.add_to_collection(item["abs_id"], current_app.config["ABS_COLLECTION_NAME"])
-                    try:
-                        hc_service = container.hardcover_service()
-                        if hc_service.is_configured():
-                            hc_service.automatch_hardcover(book, hardcover_sync_client=container.hardcover_sync_client())
-                    except Exception as e:
-                        logger.warning(f"Hardcover automatch failed (book saved): {e}")
+                    attempt_hardcover_automatch(container, book)
                    database_service.resolve_suggestion(item["abs_id"])
                    continue

-                ebook_filename = item["ebook_filename"]
-                storyteller_uuid = item.get("storyteller_uuid", "")
-                original_ebook_filename = None
-                duration = item["duration"]
-                booklore_id = None
-                kosync_doc_id = None
-
-                bl_match, bl_match_client = find_in_booklore(ebook_filename)
-                if bl_match:
-                    booklore_id = bl_match.get("id")
-
-                kosync_doc_id = get_kosync_id_for_ebook(ebook_filename, booklore_id, bl_client=bl_match_client)
-
-                if not kosync_doc_id:
-                    logger.warning(f"Could not compute KOSync ID for {sanitize_log_data(ebook_filename)}, skipping")
-                    failed_items.append(item.get("ebook_display_name") or ebook_filename)
-                    continue
-
-                # Hash Preservation
-                current_book_entry = database_service.get_book_by_ref(item["abs_id"])
-                if current_book_entry and current_book_entry.kosync_doc_id:
-                    logger.info(
-                        f"Preserving existing hash '{current_book_entry.kosync_doc_id}' for '{item['abs_id']}' instead of new hash '{kosync_doc_id}'"
-                    )
-                    kosync_doc_id = current_book_entry.kosync_doc_id
+                # Handle ebook-only queue items
+                if item.get("ebook_only"):
+                    ebook_filename = item["ebook_filename"]
+                    storyteller_uuid = item.get("storyteller_uuid") or None
+
+                    if ebook_filename:
+                        bl_book, bl_client = find_in_booklore(ebook_filename)
+                        booklore_id = bl_book.get("id") if bl_book else None
+                        kosync_doc_id = get_kosync_id_for_ebook(ebook_filename, booklore_id, bl_client=bl_client)
+                        if not kosync_doc_id:
+                            failed_items.append(item.get("ebook_display_name") or ebook_filename)
+                            continue
+                        title = item.get("ebook_display_name") or (bl_book.get("title") if bl_book else None) or Path(ebook_filename).stem
+                    else:
+                        title = item.get("title", "Storyteller Book")
+                        ebook_filename = None
+                        kosync_doc_id = None

-                # Duplicate Merge
-                existing_book = database_service.get_book_by_kosync_id(kosync_doc_id)
-                migration_source_id = None
-                ebook_item_id = None
-
-                if existing_book and existing_book.abs_id != item["abs_id"]:
-                    logger.info(
-                        f"Found existing book entry '{existing_book.abs_id}' for this ebook -- Merging into '{item['abs_id']}'"
-                    )
-                    migration_source_id = existing_book.abs_id
-                    ebook_item_id = existing_book.ebook_item_id or existing_book.abs_ebook_item_id or existing_book.abs_id
-                    if not original_ebook_filename:
-                        original_ebook_filename = (
-                            existing_book.original_ebook_filename or existing_book.ebook_filename
-                        )
-                    merge_metadata = _copy_book_merge_metadata(
-                        existing_book,
-                        {
-                            "abs_ebook_item_id": ebook_item_id,
-                            "ebook_item_id": ebook_item_id,
-                            "original_ebook_filename": original_ebook_filename,
-                            "storyteller_uuid": storyteller_uuid or existing_book.storyteller_uuid,
-                        },
+                    book = Book(
+                        abs_id=None,
+                        title=title,
+                        ebook_filename=ebook_filename,
+                        kosync_doc_id=kosync_doc_id,
+                        status="not_started",
+                        sync_mode="ebook_only",
+                        storyteller_uuid=storyteller_uuid,
                    )
-                else:
-                    merge_metadata = {
-                        "storyteller_uuid": storyteller_uuid or None,
-                        "original_ebook_filename": original_ebook_filename,
-                        "abs_ebook_item_id": ebook_item_id,
-                        "ebook_item_id": ebook_item_id,
-                    }
-
-                batch_storyteller_submit = item.get("storyteller_submit")
+                    database_service.save_book(book, is_new=True)
+                    ensure_kosync_document(book, database_service)
+                    if kosync_doc_id:
+                        database_service.resolve_suggestion(kosync_doc_id)
+                    continue

-                book = Book(
+                book, error = _create_book_mapping(
+                    container,
                    abs_id=item["abs_id"],
                    title=item["title"],
-                    ebook_filename=ebook_filename,
-                    kosync_doc_id=kosync_doc_id,
-                    transcript_file=None,
-                    status="pending",
-                    duration=duration,
-                    **merge_metadata,
+                    ebook_filename=item["ebook_filename"],
+                    duration=item["duration"],
+                    storyteller_uuid=item.get("storyteller_uuid", ""),
+                    storyteller_submit=bool(item.get("storyteller_submit")),
+                    author=item.get("author"),
+                    subtitle=item.get("subtitle"),
                )
-
-                database_service.save_book(book, is_new=True)
-
-                # Create reservation immediately after book save, before HTTP calls
-                if batch_storyteller_submit:
-                    _create_storyteller_reservation(database_service, item["abs_id"])
-
-                # Duplicate Merge: Migrate
-                if migration_source_id:
-                    database_service.migrate_book_data(migration_source_id, item["abs_id"])
-                    database_service.delete_book(existing_book.id)
-                    abs_service.add_to_collection(item["abs_id"], current_app.config["ABS_COLLECTION_NAME"])
-                    logger.info(f"Successfully merged {migration_source_id} into {item['abs_id']}")
-
-                # Trigger Hardcover Automatch
-                try:
-                    hc_service = container.hardcover_service()
-                    if hc_service.is_configured():
-                        hc_service.automatch_hardcover(book, hardcover_sync_client=container.hardcover_sync_client())
-                except Exception as e:
-                    logger.warning(f"Hardcover automatch failed (book saved): {e}")
-
-                if not migration_source_id:
-                    abs_service.add_to_collection(item["abs_id"], current_app.config["ABS_COLLECTION_NAME"])
-                if bl_match_client:
-                    shelf_filename = original_ebook_filename or ebook_filename
-                    try:
-                        bl_match_client.add_to_shelf(shelf_filename)
-                    except Exception as e:
-                        logger.warning(
-                            f"Booklore add_to_shelf failed for '{sanitize_log_data(shelf_filename)}': {e}"
-                        )
-                # Storyteller submission (runs in background thread)
-                if batch_storyteller_submit:
-                    _submit_to_storyteller_async(
-                        container,
-                        item["abs_id"],
-                        item["title"],
-                        ebook_filename,
-                        current_app.config.get("BOOKS_DIR", ""),
-                        current_app.config.get("EPUB_CACHE_DIR", ""),
-                    )
-
-                database_service.resolve_suggestion(item["abs_id"])
-                database_service.resolve_suggestion(kosync_doc_id)
-
-                try:
-                    device_doc = database_service.get_kosync_doc_by_filename(ebook_filename)
-                    if device_doc and device_doc.document_hash != kosync_doc_id:
-                        database_service.resolve_suggestion(device_doc.document_hash)
-                except Exception as e:
-                    logger.warning(f"Failed to check/resolve device hash: {e}")
+                if error:
+                    failed_items.append(item.get("ebook_display_name") or item["ebook_filename"])

            except Exception as e:
                logger.error(f"Failed to process queue item '{sanitize_log_data(item_label)}': {e}")
@@ -802,6 +753,15 @@ def batch_match():
        pass
    storyteller_force_mode = os.environ.get("STORYTELLER_FORCE_MODE", "false").lower() == "true"
+    storyteller_configured = container.storyteller_client().is_configured()
+
+    abs_configured = abs_service.is_available()
+    has_ebook_sources = (
+        any_booklore_configured()
+        or container.cwa_client().is_configured()
+        or abs_service.has_ebook_libraries()
+        or get_ebook_dir().exists()
+    )

    queue_view = _build_batch_queue_view(session.get("queue", []))
    return render_template(
@@ -812,7 +772,10 @@ def batch_match():
        queue=queue_view["items"],
        queue_summary=queue_view,
        search=search,
-        get_title=manager.get_abs_title,
+        get_title=manager.get_audiobook_title,
        storyteller_submit_available=storyteller_submit_available,
        storyteller_force_mode=storyteller_force_mode,
+        storyteller_configured=storyteller_configured,
+        abs_configured=abs_configured,
+        has_ebook_sources=has_ebook_sources,
    )
diff --git a/src/blueprints/reading_bp.py b/src/blueprints/reading_bp.py
index d20a2e4..527d6c5 100644
--- a/src/blueprints/reading_bp.py
+++ b/src/blueprints/reading_bp.py
@@ -9,6 +9,7 @@
 from flask import Blueprint, abort, jsonify, render_template, request

 from src.blueprints.helpers import (
+    find_booklore_metadata,
     get_abs_service,
     get_book_or_404,
     get_container,
@@ -48,9 +49,10 @@ def __init__(self):

 def _build_book_reading_data(book, database_service, abs_service, states_by_book,
-                             booklore_by_filename=None, abs_metadata_by_id=None):
+                             booklore_by_filename=None, abs_metadata_by_id=None,
+                             hardcover_details=None):
     """Build a reading-focused data dict for a single book."""
-    sync_mode = getattr(book, 'sync_mode', 'audiobook')
+    sync_mode = book.sync_mode
     if sync_mode == 'ebook_only':
         book_type = 'ebook-only'
     elif not book.ebook_filename:
@@ -65,20 +67,14 @@ def _build_book_reading_data(book, database_service, abs_service, states_by_book
     # Enrich title/author from Booklore or ABS metadata when available
     display_title = book.title or ''
     display_author = ''
-    bl_meta = None
-    if booklore_by_filename:
-        for fn in (book.ebook_filename, getattr(book, 'original_ebook_filename', None)):
-            if fn:
-                candidates = booklore_by_filename.get(fn.lower(), [])
-                bl_meta = next((b for b in candidates if b.title), candidates[0] if candidates else None)
-                if bl_meta and bl_meta.title:
-                    stems = set()
-                    for check_fn in (book.ebook_filename, getattr(book, 'original_ebook_filename', None)):
-                        if check_fn:
-                            stems.add(Path(check_fn).stem)
-                    if display_title in stems or display_title == book.ebook_filename:
-                        display_title = bl_meta.title
-                    break
+    bl_meta = find_booklore_metadata(book, booklore_by_filename) if booklore_by_filename else None
+    if bl_meta and bl_meta.title:
+        stems = set()
+        for check_fn in (book.ebook_filename, book.original_ebook_filename):
+            if check_fn:
+                stems.add(Path(check_fn).stem)
+        if display_title in stems or display_title == book.ebook_filename:
+            display_title = bl_meta.title

     if bl_meta and bl_meta.authors:
         display_author = bl_meta.authors
@@ -87,14 +83,14 @@ def _build_book_reading_data(book, database_service, abs_service, states_by_book
         abs_meta = (abs_metadata_by_id or {}).get(book.abs_id, {})
         display_author = abs_meta.get('author') or ''

-    if not display_author and getattr(book, 'author', None):
+    if not display_author and book.author:
         display_author = book.author

     if not display_author:
         display_author = book.ebook_filename or ''

     covers = resolve_book_covers(book, abs_service, database_service, book_type,
-                                 booklore_meta=bl_meta)
+                                 booklore_meta=bl_meta, hardcover_details=hardcover_details)

     return {
         'id': book.id,
@@ -107,6 +103,7 @@ def _build_book_reading_data(book, database_service, abs_service, states_by_book
         'book_type': book_type,
         'unified_progress': max_progress,
         'cover_url': covers['cover_url'],
+        'placeholder_logo': covers['placeholder_logo'],
         'custom_cover_url': covers['custom_cover_url'],
         'abs_cover_url': covers['abs_cover_url'],
         'fallback_cover_url': covers['fallback_cover_url'],
@@ -166,6 +163,10 @@ def reading_index():
     enabled_bl_ids = get_enabled_booklore_server_ids()
     booklore_by_filename = database_service.get_booklore_by_filename(enabled_server_ids=enabled_bl_ids)

+    # Bulk-fetch Hardcover details to avoid N+1 in resolve_book_covers
+    all_hardcover = database_service.get_all_hardcover_details()
+    hardcover_by_book = {h.book_id: h for h in all_hardcover}
+
     all_book_data = [
         _build_book_reading_data(
             b,
@@ -174,6 +175,7 @@ def reading_index():
             states_by_book,
             booklore_by_filename,
             abs_metadata_by_id,
+            hardcover_details=hardcover_by_book.get(b.id),
         )
         for b in books
     ]
@@ -307,8 +309,9 @@ def reading_detail(book_ref):
     enabled_bl_ids = get_enabled_booklore_server_ids()
     booklore_by_filename = database_service.get_booklore_by_filename(enabled_server_ids=enabled_bl_ids)

+    hc_details = database_service.get_hardcover_details(book.id)
     book_data = _build_book_reading_data(book, database_service, abs_service, states_by_book,
-                                         booklore_by_filename)
+                                         booklore_by_filename, hardcover_details=hc_details)

     journals = database_service.get_reading_journals(book.id)
     # Synthesize started/finished timeline entries from book dates if missing
@@ -323,12 +326,12 @@ def reading_detail(book_ref):
     journals.sort(key=lambda j: j.created_at or datetime.min, reverse=True)

     # BookFusion highlights matched to this book
-    bf_highlights = database_service.get_bookfusion_highlights_for_book(book.abs_id)
+    bf_highlights = database_service.get_bookfusion_highlights_for_book_by_book_id(book.id)

     has_bookfusion_link = (
         (book.abs_id or '').startswith('bf-')
         or len(bf_highlights) > 0
-        or database_service.is_bookfusion_linked(book.abs_id)
+        or database_service.is_bookfusion_linked_by_book_id(book.id)
     )

     container = get_container()
@@ -357,7 +360,7 @@ def reading_detail(book_ref):
     if show_alignment_tab:
         try:
             alignment_service = container.alignment_service()
-            alignment_info = alignment_service.get_alignment_info(book.abs_id or str(book.id))
+            alignment_info = alignment_service.get_alignment_info(book.id)
             if alignment_info:
                 book_duration = book.duration
                 max_ts = alignment_info['max_timestamp']
diff --git a/src/blueprints/settings_bp.py b/src/blueprints/settings_bp.py
index 8ee18d9..21e93c9 100644
--- a/src/blueprints/settings_bp.py
+++ b/src/blueprints/settings_bp.py
@@ -12,39 +12,58 @@
 logger = logging.getLogger(__name__)

-settings_bp = Blueprint('settings_page', __name__)
+settings_bp = Blueprint("settings_page", __name__)

 URL_SETTING_KEYS = {
-    'ABS_SERVER', 'BOOKLORE_SERVER', 'BOOKLORE_2_SERVER', 'STORYTELLER_API_URL', 'CWA_SERVER', 'KOSYNC_SERVER',
-    'ABS_WEB_URL', 'BOOKLORE_WEB_URL', 'BOOKLORE_2_WEB_URL', 'STORYTELLER_WEB_URL', 'CWA_WEB_URL', 'HARDCOVER_WEB_URL',
-    'ABS_WEB_URL_INTERNAL', 'ABS_WEB_URL_EXTERNAL',
-    'STORYTELLER_WEB_URL_INTERNAL', 'STORYTELLER_WEB_URL_EXTERNAL',
-    'BOOKLORE_WEB_URL_INTERNAL', 'BOOKLORE_WEB_URL_EXTERNAL',
-    'BOOKLORE_2_WEB_URL_INTERNAL', 'BOOKLORE_2_WEB_URL_EXTERNAL',
-    'CWA_WEB_URL_INTERNAL', 'CWA_WEB_URL_EXTERNAL',
-    'HARDCOVER_WEB_URL_EXTERNAL',
-    'KOSYNC_PUBLIC_URL',
+    "ABS_SERVER",
+    "BOOKLORE_SERVER",
+    "BOOKLORE_2_SERVER",
+    "STORYTELLER_API_URL",
+    "CWA_SERVER",
+    "KOSYNC_SERVER",
+    "ABS_WEB_URL",
+    "BOOKLORE_WEB_URL",
+    "BOOKLORE_2_WEB_URL",
+    "STORYTELLER_WEB_URL",
+    "CWA_WEB_URL",
+    "HARDCOVER_WEB_URL",
+    "ABS_WEB_URL_INTERNAL",
+    "ABS_WEB_URL_EXTERNAL",
+    "STORYTELLER_WEB_URL_INTERNAL",
+    "STORYTELLER_WEB_URL_EXTERNAL",
+    "BOOKLORE_WEB_URL_INTERNAL",
+    "BOOKLORE_WEB_URL_EXTERNAL",
+    "BOOKLORE_2_WEB_URL_INTERNAL",
+    "BOOKLORE_2_WEB_URL_EXTERNAL",
+    "CWA_WEB_URL_INTERNAL",
+    "CWA_WEB_URL_EXTERNAL",
+    "HARDCOVER_WEB_URL_EXTERNAL",
+    "KOSYNC_PUBLIC_URL",
 }

 SECRET_SETTING_KEYS = {
-    'ABS_KEY', 'STORYTELLER_PASSWORD', 'BOOKLORE_PASSWORD', 'BOOKLORE_2_PASSWORD',
-    'CWA_PASSWORD', 'KOSYNC_KEY', 'TELEGRAM_BOT_TOKEN', 'HARDCOVER_TOKEN',
-    'DEEPGRAM_API_KEY', 'BOOKFUSION_API_KEY', 'BOOKFUSION_UPLOAD_API_KEY',
+    "ABS_KEY",
+    "STORYTELLER_PASSWORD",
+    "BOOKLORE_PASSWORD",
+    "BOOKLORE_2_PASSWORD",
+    "CWA_PASSWORD",
+    "KOSYNC_KEY",
+    "KOSYNC_SERVER_KEY",
+    "TELEGRAM_BOT_TOKEN",
+    "HARDCOVER_TOKEN",
+    "DEEPGRAM_API_KEY",
+    "BOOKFUSION_API_KEY",
+    "BOOKFUSION_UPLOAD_API_KEY",
 }


 def _is_secret_request_authorized() -> bool:
-    """Authorize secret reveal requests.
-
-    Allowed if either:
-    - Session indicates an admin user, or
-    - Caller presents a valid internal service token.
-    """
-    if bool(session.get('is_admin')):
+    """Authorize secret reveal requests via admin session or service token."""
+    if bool(session.get("is_admin")):
         return True

-    expected_token = os.environ.get('INTERNAL_SERVICE_TOKEN', '').strip()
-    provided_token = request.headers.get('X-Internal-Service-Token', '').strip()
+    expected_token = os.environ.get("INTERNAL_SERVICE_TOKEN", "").strip()
+    provided_token = request.headers.get("X-Internal-Service-Token", "").strip()
     if expected_token and secrets_compare(expected_token, provided_token):
         return True

@@ -54,7 +73,7 @@ def _is_secret_request_authorized() -> bool:
 def _mask_secret(value: str) -> str:
     """Return a masked secret showing only the last 4 characters."""
     if not value:
-        return ''
+        return ""
     tail = value[-4:] if len(value) >= 4 else value
     return f"{'*' * max(0, len(value) - len(tail))}{tail}"

@@ -62,13 +81,14 @@ def _mask_secret(value: str) -> str:
 def secrets_compare(a: str, b: str) -> bool:
     """Constant-time secret comparison."""
     import hmac
+
     return hmac.compare_digest(a, b)


 def _normalize_url_value(value: str) -> str:
     clean_value = value.strip()
     if not clean_value:
-        return ''
+        return ""
     lower_val = clean_value.lower()
     if not (lower_val.startswith("http://") or lower_val.startswith("https://")):
         return f"http://{clean_value}"
@@ -77,7 +97,7 @@ def _normalize_url_value(value: str) -> str:

 def _request_payload() -> dict:
     try:
-        if request.method == 'POST' and request.is_json:
+        if request.method == "POST" and request.is_json:
             return request.get_json(silent=True) or {}
     except RuntimeError:
         pass
@@ -89,20 +109,20 @@ def _request_value(key: str, env_key: str | None = None, *, secret: bool = False
     source_key = env_key or key

     if key in payload:
-        value = str(payload.get(key, '') or '').strip()
-        if secret and value == '':
-            value = os.environ.get(source_key, '')
+        value = str(payload.get(key, "") or "").strip()
+        if secret and value == "":
+            value = os.environ.get(source_key, "")
         if normalize_url:
             return _normalize_url_value(value)
         return value

-    value = os.environ.get(source_key, '')
+    value = os.environ.get(source_key, "")
     if normalize_url:
         return _normalize_url_value(value)
     return value.strip()


-@settings_bp.route('/settings', methods=['GET', 'POST'])
+@settings_bp.route("/settings", methods=["GET", "POST"])
 def settings():
     """
     Handle the settings page: persist submitted configuration on POST and render the settings UI on GET.
@@ -121,39 +141,39 @@ def settings():
     """
     database_service = get_database_service()

-    if request.method == 'POST':
+    if request.method == "POST":
         bool_keys = [
-            'ABS_ENABLED',
-            'KOSYNC_USE_PERCENTAGE_FROM_SERVER',
-            'SYNC_ABS_EBOOK',
-            'XPATH_FALLBACK_TO_PREVIOUS_SEGMENT',
-            'KOSYNC_ENABLED',
-            'STORYTELLER_ENABLED',
-            'STORYTELLER_FORCE_MODE',
-            'BOOKLORE_ENABLED',
-            'BOOKLORE_2_ENABLED',
-            'CWA_ENABLED',
-            'HARDCOVER_ENABLED',
-            'TELEGRAM_ENABLED',
-            'SUGGESTIONS_ENABLED',
-            'REPROCESS_ON_CLEAR_IF_NO_ALIGNMENT',
-            'INSTANT_SYNC_ENABLED',
-            'ABS_SOCKET_ENABLED',
-            'BOOKFUSION_ENABLED',
+            "ABS_ENABLED",
+            "KOSYNC_USE_PERCENTAGE_FROM_SERVER",
+            "SYNC_ABS_EBOOK",
+            "XPATH_FALLBACK_TO_PREVIOUS_SEGMENT",
+            "KOSYNC_ENABLED",
+            "STORYTELLER_ENABLED",
+            "STORYTELLER_FORCE_MODE",
+            "BOOKLORE_ENABLED",
+            "BOOKLORE_2_ENABLED",
+            "CWA_ENABLED",
+            "HARDCOVER_ENABLED",
+            "TELEGRAM_ENABLED",
+            "SUGGESTIONS_ENABLED",
+            "REPROCESS_ON_CLEAR_IF_NO_ALIGNMENT",
+            "INSTANT_SYNC_ENABLED",
+            "ABS_SOCKET_ENABLED",
+            "BOOKFUSION_ENABLED",
         ]

         current_settings = database_service.get_all_settings()

         # 1. Handle Boolean Toggles
         for key in bool_keys:
-            is_checked = (key in request.form)
+            is_checked = key in request.form
             val_str = str(is_checked).lower()
             database_service.set_setting(key, val_str)
             os.environ[key] = val_str

         # 2. Handle Text Inputs
         for key, value in request.form.items():
-            if key == '_active_tab':
+            if key == "_active_tab":
                 continue
             if key in bool_keys:
                 continue
@@ -172,105 +192,111 @@ def settings():

         try:
             from src.web_server import apply_settings
+
             apply_settings(current_app._get_current_object())
-            session['message'] = "Settings saved successfully."
-            session['is_error'] = False
+            session["message"] = "Settings saved successfully."
+            session["is_error"] = False
         except Exception as e:
-            session['message'] = f"Error applying settings: {e}"
-            session['is_error'] = True
+            session["message"] = f"Error applying settings: {e}"
+            session["is_error"] = True
             logger.error(f"Error applying settings: {e}")

-        active_tab = request.form.get('_active_tab', 'general')
-        return redirect(url_for('settings_page.settings', tab=active_tab))
+        active_tab = request.form.get("_active_tab", "general")
+        return redirect(url_for("settings_page.settings", tab=active_tab))

     # GET Request
-    message = session.pop('message', None)
-    is_error = session.pop('is_error', False)
+    message = session.pop("message", None)
+    is_error = session.pop("is_error", False)

     latest_version, update_available = get_update_status()

-    return render_template('settings.html',
-                           message=message,
-                           is_error=is_error,
-                           app_version=APP_VERSION,
-                           update_available=update_available,
-                           latest_version=latest_version)
+    return render_template(
+        "settings.html",
+        message=message,
+        is_error=is_error,
+        app_version=APP_VERSION,
+        update_available=update_available,
+        latest_version=latest_version,
+    )


-@settings_bp.route('/api/settings/secret/<key>', methods=['GET'])
+@settings_bp.route("/api/settings/secret/<key>", methods=["GET"])
 def get_secret(key):
     """Return a stored secret value (for reveal-on-demand UI)."""
-    allowed = {'KOSYNC_KEY'}
+    if not _is_secret_request_authorized():
+        return jsonify({"error": "Unauthorized"}), 403
+
+    allowed = {"KOSYNC_KEY", "KOSYNC_SERVER_KEY"}
     if key not in allowed:
-        return jsonify({'error': 'Not allowed'}), 403
+        return jsonify({"error": "Not allowed"}), 403

-    caller = request.headers.get('X-Forwarded-For', request.remote_addr)
+    caller = request.headers.get("X-Forwarded-For", request.remote_addr)
     logger.info(f"AUDIT: Secret requested (key={key}, caller={caller})")

-    value = os.environ.get(key, '')
-    return jsonify({'value': value, 'present': bool(value)})
+    value = os.environ.get(key, "")
+    return jsonify({"value": value, "present": bool(value)})


-@settings_bp.route('/api/kosync/test', methods=['POST'])
+@settings_bp.route("/api/kosync/test", methods=["POST"])
 def test_kosync_connection():
     """Test connection to the configured KoSync server (legacy route)."""
-    return test_connection('kosync')
+    return test_connection("kosync")


-@settings_bp.route('/api/test-connection/<service>', methods=['GET', 'POST'])
+@settings_bp.route("/api/test-connection/<service>", methods=["GET", "POST"])
 def test_connection(service):
     """Test connectivity to a configured service. Returns JSON with success/detail."""
     testers = {
-        'abs': _test_abs,
-        'kosync': _test_kosync,
-        'storyteller': _test_storyteller,
-        'booklore': _test_booklore,
-        'booklore2': _test_booklore2,
-        'cwa': _test_cwa,
-        'hardcover': _test_hardcover,
-        'telegram': _test_telegram,
-        'bookfusion': _test_bookfusion,
-        'bookfusion_upload': _test_bookfusion_upload,
+        "abs": _test_abs,
+        "kosync": _test_kosync,
+        "storyteller": _test_storyteller,
+        "booklore": _test_booklore,
+        "booklore2": _test_booklore2,
+        "cwa": _test_cwa,
+        "hardcover": _test_hardcover,
+        "telegram": _test_telegram,
+        "bookfusion": _test_bookfusion,
+        "bookfusion_upload": _test_bookfusion_upload,
     }
     tester = testers.get(service)
     if not tester:
-        return jsonify({'success': False, 'detail': 'Unknown service'}), 400
+        return jsonify({"success": False, "detail": "Unknown service"}), 400
     try:
         success, detail = tester()
     except Exception as e:
         logger.warning(f"Connection test for '{service}' failed: {_redact_secrets(str(sanitize_log_data(e)))}")
         success, detail = False, _test_conn_error(e)
-    return jsonify({'success': success, 'detail': detail})
+    return jsonify({"success": success, "detail": detail})


 def _redact_secrets(msg: str) -> str:
     """Replace any known secret values in a string with a fixed mask."""
     for key in SECRET_SETTING_KEYS:
-        val = os.environ.get(key, '')
+        val = os.environ.get(key, "")
         if val and val in msg:
-            msg = msg.replace(val, '***')
+            msg = msg.replace(val, "***")
     return msg


 def _test_conn_error(e: Exception) -> str:
     """Return a user-friendly error string from a request exception."""
     msg = str(e)
-    if 'ConnectionError' in type(e).__name__ or 'connection' in msg.lower():
-        return 'Connection refused — is the server running?'
-    if 'Timeout' in type(e).__name__:
-        return 'Request timed out'
-    if 'NameResolutionError' in msg or 'getaddrinfo' in msg:
-        return 'Server hostname could not be resolved — check the URL'
+    if "ConnectionError" in type(e).__name__ or "connection" in msg.lower():
+        return "Connection refused — is the server running?"
+ if "Timeout" in type(e).__name__: + return "Request timed out" + if "NameResolutionError" in msg or "getaddrinfo" in msg: + return "Server hostname could not be resolved — check the URL" return _redact_secrets(str(sanitize_log_data(msg))) _HTTP_FRIENDLY = { - 401: 'Authentication failed — check your username and password', - 403: 'Access denied — your account may not have permission', - 404: 'Endpoint not found — check the server URL', - 500: 'Server returned an internal error', - 502: 'Bad gateway — is a reverse proxy misconfigured?', - 503: 'Service unavailable — the server may be starting up', + 401: "Authentication failed — check your username and password", + 403: "Access denied — your account may not have permission", + 404: "Endpoint not found — check the server URL", + 500: "Server returned an internal error", + 502: "Bad gateway — is a reverse proxy misconfigured?", + 503: "Service unavailable — the server may be starting up", } @@ -278,55 +304,66 @@ def _http_error(status_code: int) -> str: """Return a user-friendly message for an HTTP error status.""" friendly = _HTTP_FRIENDLY.get(status_code) if friendly: - return f'{friendly} (HTTP {status_code})' - return f'Unexpected response (HTTP {status_code})' + return f"{friendly} (HTTP {status_code})" + return f"Unexpected response (HTTP {status_code})" def _test_abs() -> tuple[bool, str]: - url = _request_value('server', 'ABS_SERVER', normalize_url=True).rstrip('/') - token = _request_value('token', 'ABS_KEY', secret=True) + url = _request_value("server", "ABS_SERVER", normalize_url=True).rstrip("/") + token = _request_value("token", "ABS_KEY", secret=True) if not url or not token: - return False, 'Server URL or API token not configured' + return False, "Server URL or API token not configured" resp = http_requests.get( f"{url}/api/me", headers={"Authorization": f"Bearer {token}"}, timeout=10, ) if resp.status_code == 200: - username = resp.json().get('username', 'unknown') - return True, f'Connected as {username}' + username = resp.json().get("username", "unknown") + return True, f"Connected as {username}" return False, _http_error(resp.status_code) def _test_kosync() -> tuple[bool, str]: - url = _request_value('server', 'KOSYNC_SERVER', normalize_url=True).rstrip('/') - user = _request_value('user', 'KOSYNC_USER') - key = _request_value('key', 'KOSYNC_KEY', secret=True) + from urllib.parse import urlparse + + url = _request_value("server", "KOSYNC_SERVER", normalize_url=True).rstrip("/") + hostname = urlparse(url).hostname or "" if url else "" + is_external = hostname not in ("127.0.0.1", "::1", "localhost", "") + if is_external: + user = _request_value("user", "KOSYNC_SERVER_USER") or _request_value("user", "KOSYNC_USER") + key = _request_value("key", "KOSYNC_SERVER_KEY", secret=True) or _request_value( + "key", "KOSYNC_KEY", secret=True + ) + else: + user = _request_value("user", "KOSYNC_USER") + key = _request_value("key", "KOSYNC_KEY", secret=True) if not url or not user: - return False, 'Server URL or credentials not configured' + return False, "Server URL or credentials not configured" headers = {} if key: from src.utils.kosync_headers import hash_kosync_key, kosync_auth_headers + headers = kosync_auth_headers(user, hash_kosync_key(key)) try: resp = http_requests.get(f"{url}/healthcheck", timeout=5, headers=headers) if resp.status_code == 200: - return True, 'Connected' + return True, "Connected" fallback = http_requests.get(f"{url}/syncs/progress/test-connection", timeout=5, headers=headers) if fallback.status_code == 200: - 
return True, 'Connected' + return True, "Connected" return False, _http_error(fallback.status_code) except Exception as e: return False, _test_conn_error(e) def _test_storyteller() -> tuple[bool, str]: - url = _request_value('api_url', 'STORYTELLER_API_URL', normalize_url=True).rstrip('/') - user = _request_value('user', 'STORYTELLER_USER') - pw = _request_value('password', 'STORYTELLER_PASSWORD', secret=True) + url = _request_value("api_url", "STORYTELLER_API_URL", normalize_url=True).rstrip("/") + user = _request_value("user", "STORYTELLER_USER") + pw = _request_value("password", "STORYTELLER_PASSWORD", secret=True) if not url or not user: - return False, 'API URL or credentials not configured' + return False, "API URL or credentials not configured" resp = http_requests.post( f"{url}/api/token", data={"username": user, "password": pw}, @@ -334,40 +371,40 @@ def _test_storyteller() -> tuple[bool, str]: timeout=10, ) if resp.status_code == 200: - return True, 'Authenticated' + return True, "Authenticated" return False, _http_error(resp.status_code) def _test_booklore() -> tuple[bool, str]: - return _test_booklore_instance('BOOKLORE') + return _test_booklore_instance("BOOKLORE") def _test_booklore2() -> tuple[bool, str]: - return _test_booklore_instance('BOOKLORE_2') + return _test_booklore_instance("BOOKLORE_2") def _test_booklore_instance(prefix: str) -> tuple[bool, str]: - url = _request_value('server', f'{prefix}_SERVER', normalize_url=True).rstrip('/') - user = _request_value('user', f'{prefix}_USER') - pw = _request_value('password', f'{prefix}_PASSWORD', secret=True) + url = _request_value("server", f"{prefix}_SERVER", normalize_url=True).rstrip("/") + user = _request_value("user", f"{prefix}_USER") + pw = _request_value("password", f"{prefix}_PASSWORD", secret=True) if not url or not user: - return False, 'Server URL or credentials not configured' + return False, "Server URL or credentials not configured" resp = http_requests.post( f"{url}/api/v1/auth/login", json={"username": user, "password": pw}, timeout=10, ) if resp.status_code == 200: - return True, 'Authenticated' + return True, "Authenticated" return False, _http_error(resp.status_code) def _test_cwa() -> tuple[bool, str]: - url = _request_value('server', 'CWA_SERVER', normalize_url=True).rstrip('/') - user = _request_value('user', 'CWA_USERNAME') - pw = _request_value('password', 'CWA_PASSWORD', secret=True) + url = _request_value("server", "CWA_SERVER", normalize_url=True).rstrip("/") + user = _request_value("user", "CWA_USERNAME") + pw = _request_value("password", "CWA_PASSWORD", secret=True) if not url or not user: - return False, 'Server URL or credentials not configured' + return False, "Server URL or credentials not configured" resp = http_requests.get( f"{url}/opds", auth=(user, pw), @@ -375,19 +412,19 @@ def _test_cwa() -> tuple[bool, str]: allow_redirects=False, ) # CWA redirects to login page on auth failure - if resp.status_code == 200 and 'login' not in resp.text[:500].lower(): - return True, 'Connected' + if resp.status_code == 200 and "login" not in resp.text[:500].lower(): + return True, "Connected" if resp.status_code in (301, 302): - return False, 'Redirected to login — check credentials' + return False, "Redirected to login — check credentials" return False, _http_error(resp.status_code) def _test_hardcover() -> tuple[bool, str]: - token = _request_value('token', 'HARDCOVER_TOKEN', secret=True) + token = _request_value("token", "HARDCOVER_TOKEN", secret=True) if not token: - return False, 'API token not 
configured' + return False, "API token not configured" resp = http_requests.post( - 'https://api.hardcover.app/v1/graphql', + "https://api.hardcover.app/v1/graphql", headers={ "Authorization": f"Bearer {token}", "Content-Type": "application/json", @@ -397,28 +434,28 @@ def _test_hardcover() -> tuple[bool, str]: ) if resp.status_code == 200: data = resp.json() - if data.get('errors'): - err_msg = data['errors'][0].get('message', 'Unknown GraphQL error') - return False, f'GraphQL error: {err_msg}' - me = data.get('data', {}).get('me') + if data.get("errors"): + err_msg = data["errors"][0].get("message", "Unknown GraphQL error") + return False, f"GraphQL error: {err_msg}" + me = data.get("data", {}).get("me") if isinstance(me, list): me = me[0] if me else {} elif not isinstance(me, dict): me = {} - if not isinstance(me, dict) or not me.get('id'): - return False, 'Authentication succeeded but user data is missing' - username = me.get('username', 'unknown') - return True, f'Connected as {username}' + if not isinstance(me, dict) or not me.get("id"): + return False, "Authentication succeeded but user data is missing" + username = me.get("username", "unknown") + return True, f"Connected as {username}" return False, _http_error(resp.status_code) def _test_telegram() -> tuple[bool, str]: - token = _request_value('bot_token', 'TELEGRAM_BOT_TOKEN', secret=True) - chat_id = _request_value('chat_id', 'TELEGRAM_CHAT_ID') + token = _request_value("bot_token", "TELEGRAM_BOT_TOKEN", secret=True) + chat_id = _request_value("chat_id", "TELEGRAM_CHAT_ID") if not token: - return False, 'Bot token not configured' + return False, "Bot token not configured" if not chat_id: - return False, 'Chat ID not configured' + return False, "Chat ID not configured" me_resp = http_requests.get( f"https://api.telegram.org/bot{token}/getMe", @@ -428,23 +465,23 @@ def _test_telegram() -> tuple[bool, str]: return False, _http_error(me_resp.status_code) data = me_resp.json() - bot_name = data.get('result', {}).get('first_name', 'Bot') + bot_name = data.get("result", {}).get("first_name", "Bot") send_resp = http_requests.post( f"https://api.telegram.org/bot{token}/sendMessage", data={ - 'chat_id': chat_id, - 'text': 'PageKeeper test message: Telegram notifications are configured correctly.', + "chat_id": chat_id, + "text": "PageKeeper test message: Telegram notifications are configured correctly.", }, timeout=10, ) if send_resp.status_code == 200: - return True, f'Test message sent via {bot_name}' + return True, f"Test message sent via {bot_name}" try: - description = send_resp.json().get('description', '') + description = send_resp.json().get("description", "") except Exception: - description = '' + description = "" if description: return False, _redact_secrets(description) return False, _http_error(send_resp.status_code) @@ -453,12 +490,12 @@ def _test_telegram() -> tuple[bool, str]: def _test_bookfusion() -> tuple[bool, str]: container = get_container() client = container.bookfusion_client() - return client.check_connection(api_key_override=_request_value('api_key', 'BOOKFUSION_API_KEY', secret=True)) + return client.check_connection(api_key_override=_request_value("api_key", "BOOKFUSION_API_KEY", secret=True)) def _test_bookfusion_upload() -> tuple[bool, str]: container = get_container() client = container.bookfusion_client() return client.check_upload_connection( - api_key_override=_request_value('api_key', 'BOOKFUSION_UPLOAD_API_KEY', secret=True) + api_key_override=_request_value("api_key", "BOOKFUSION_UPLOAD_API_KEY", 
secret=True) ) diff --git a/src/blueprints/tbr_bp.py b/src/blueprints/tbr_bp.py index 57cc6a0..2930b5f 100644 --- a/src/blueprints/tbr_bp.py +++ b/src/blueprints/tbr_bp.py @@ -132,7 +132,7 @@ def add_tbr_from_library(): item, created = database_service.add_tbr_item( title=book.title or book_ref, - author=getattr(book, 'author', None), + author=book.author, source='library', book_abs_id=book.abs_id, book_id=book.id, @@ -611,6 +611,7 @@ def search_library_books(): books = database_service.search_books(q, limit=10) return jsonify([ { + 'id': b.id, 'abs_id': b.abs_id, 'title': b.title, 'author': getattr(b, 'author', None) or '', @@ -633,7 +634,7 @@ def link_tbr_to_library(item_id): if not book: return jsonify({"success": False, "error": "Book not found in library"}), 404 - updated = database_service.link_tbr_to_book(item_id, book.id, book_abs_id=book.abs_id) + updated = database_service.link_tbr_to_book(item_id, book.id) if not updated: return jsonify({"success": False, "error": "TBR item not found"}), 404 diff --git a/src/db/book_repository.py b/src/db/book_repository.py index 462a7cd..1bbafea 100644 --- a/src/db/book_repository.py +++ b/src/db/book_repository.py @@ -22,6 +22,8 @@ class BookRepository(BaseRepository): # ── Book CRUD ── def get_book_by_abs_id(self, abs_id): + if not abs_id: + return None return self._get_one(Book, Book.abs_id == abs_id) def get_book_by_id(self, book_id): @@ -172,6 +174,9 @@ def get_all_states(self): return self._get_all(State) def save_state(self, state): + if not state.book_id and not state.abs_id: + logger.error("save_state called without book_id or abs_id — skipping") + return None # Prefer book_id for upsert lookup; fall back to abs_id for backward compat if state.book_id: lookup = [State.book_id == state.book_id, State.client_name == state.client_name] diff --git a/src/db/bookfusion_repository.py b/src/db/bookfusion_repository.py index 662b2eb..329aa49 100644 --- a/src/db/bookfusion_repository.py +++ b/src/db/bookfusion_repository.py @@ -89,45 +89,26 @@ def get_unmatched_bookfusion_highlights(self): with self.get_session() as session: highlights = ( session.query(BookfusionHighlight) - .filter(BookfusionHighlight.matched_abs_id.is_(None)) + .filter(BookfusionHighlight.matched_book_id.is_(None)) .order_by(BookfusionHighlight.book_title, BookfusionHighlight.id) .all() ) session.expunge_all() return highlights - def link_bookfusion_highlight(self, highlight_id, abs_id): - """Link by abs_id (backward compat). Also sets matched_book_id if possible.""" + def link_bookfusion_highlights_by_book_id(self, bookfusion_book_id, book_id): + """Link all highlights for a BookFusion book to a library book by book_id.""" with self.get_session() as session: - hl = session.query(BookfusionHighlight).filter(BookfusionHighlight.id == highlight_id).first() - if hl: - hl.matched_abs_id = abs_id - # Also set book_id from the books table - from .models import Book - book = session.query(Book).filter(Book.abs_id == abs_id).first() - if book: - hl.matched_book_id = book.id - return True - return False - - def link_bookfusion_book(self, bookfusion_book_id, abs_id): - """Link all highlights for a BookFusion book (backward compat). 
Also sets matched_book_id.""" - with self.get_session() as session: - from .models import Book - book = session.query(Book).filter(Book.abs_id == abs_id).first() - updates = {BookfusionHighlight.matched_abs_id: abs_id} - if book: - updates[BookfusionHighlight.matched_book_id] = book.id session.query(BookfusionHighlight).filter( BookfusionHighlight.bookfusion_book_id == bookfusion_book_id - ).update(updates, synchronize_session=False) + ).update({BookfusionHighlight.matched_book_id: book_id}, synchronize_session=False) - def get_bookfusion_highlights_for_book(self, abs_id): - """Get by abs_id (backward compat).""" + def get_bookfusion_highlights_for_book_by_book_id(self, book_id): + """Get highlights matched to a book by book_id.""" with self.get_session() as session: highlights = ( session.query(BookfusionHighlight) - .filter(BookfusionHighlight.matched_abs_id == abs_id) + .filter(BookfusionHighlight.matched_book_id == book_id) .order_by(BookfusionHighlight.highlighted_at.desc().nullslast(), BookfusionHighlight.id) .all() ) @@ -176,21 +157,17 @@ def save_bookfusion_books(self, books): def get_bookfusion_books(self): return self._get_all(BookfusionBook, order_by=BookfusionBook.title) - def is_bookfusion_linked(self, abs_id): - """Check by abs_id (backward compat).""" + def is_bookfusion_linked_by_book_id(self, book_id): + """Check if a book has a linked BookFusion catalog entry by book_id.""" with self.get_session() as session: - return session.query(BookfusionBook).filter(BookfusionBook.matched_abs_id == abs_id).first() is not None + return session.query(BookfusionBook).filter(BookfusionBook.matched_book_id == book_id).first() is not None - def set_bookfusion_book_match(self, bookfusion_id, abs_id): - """Match by abs_id (backward compat). Also sets matched_book_id.""" + def set_bookfusion_book_match_by_book_id(self, bookfusion_id, book_id): + """Match a BookFusion catalog book to a library book by book_id.""" with self.get_session() as session: - from .models import Book bf_book = session.query(BookfusionBook).filter(BookfusionBook.bookfusion_id == bookfusion_id).first() if bf_book: - bf_book.matched_abs_id = abs_id - book = session.query(Book).filter(Book.abs_id == abs_id).first() - if book: - bf_book.matched_book_id = book.id + bf_book.matched_book_id = book_id def set_bookfusion_books_hidden(self, bookfusion_ids, hidden): with self.get_session() as session: @@ -201,23 +178,9 @@ def set_bookfusion_books_hidden(self, bookfusion_ids, hidden): def get_bookfusion_book(self, bookfusion_id): return self._get_one(BookfusionBook, BookfusionBook.bookfusion_id == bookfusion_id) - def get_bookfusion_book_by_abs_id(self, abs_id): - """Get by abs_id (backward compat).""" - return self._get_one(BookfusionBook, BookfusionBook.matched_abs_id == abs_id) - def get_bookfusion_book_by_book_id(self, book_id): return self._get_one(BookfusionBook, BookfusionBook.matched_book_id == book_id) - def unlink_bookfusion_by_abs_id(self, abs_id): - """Unlink by abs_id (backward compat).""" - with self.get_session() as session: - session.query(BookfusionBook).filter(BookfusionBook.matched_abs_id == abs_id).update( - {BookfusionBook.matched_abs_id: None, BookfusionBook.matched_book_id: None}, synchronize_session=False - ) - session.query(BookfusionHighlight).filter(BookfusionHighlight.matched_abs_id == abs_id).update( - {BookfusionHighlight.matched_abs_id: None, BookfusionHighlight.matched_book_id: None}, synchronize_session=False - ) - def unlink_bookfusion_by_book_id(self, book_id): with self.get_session() as session: 
session.query(BookfusionBook).filter(BookfusionBook.matched_book_id == book_id).update( @@ -245,24 +208,6 @@ def get_bookfusion_highlight_date_range(self, bookfusion_book_ids): return result return None - def get_bookfusion_linked_abs_ids(self): - """Get all linked abs_ids (backward compat).""" - with self.get_session() as session: - book_ids = { - r[0] - for r in session.query(BookfusionBook.matched_abs_id) - .filter(BookfusionBook.matched_abs_id.isnot(None)) - .all() - } - highlight_ids = { - r[0] - for r in session.query(BookfusionHighlight.matched_abs_id) - .filter(BookfusionHighlight.matched_abs_id.isnot(None)) - .distinct() - .all() - } - return book_ids | highlight_ids - def get_bookfusion_linked_book_ids(self): with self.get_session() as session: book_ids = { @@ -280,19 +225,21 @@ def get_bookfusion_linked_book_ids(self): } return book_ids | highlight_ids - def get_bookfusion_highlight_counts(self): - """Return counts keyed by abs_id (backward compat).""" + def get_bookfusion_highlight_counts_by_book_id(self): + """Return highlight counts keyed by book_id.""" with self.get_session() as session: rows = ( - session.query(BookfusionHighlight.matched_abs_id, func.count(BookfusionHighlight.id)) - .filter(BookfusionHighlight.matched_abs_id.isnot(None)) - .group_by(BookfusionHighlight.matched_abs_id) + session.query(BookfusionHighlight.matched_book_id, func.count(BookfusionHighlight.id)) + .filter(BookfusionHighlight.matched_book_id.isnot(None)) + .group_by(BookfusionHighlight.matched_book_id) .all() ) - return {abs_id: count for abs_id, count in rows} + return {book_id: count for book_id, count in rows} def auto_link_by_title(self, book): """Auto-link unmatched BookFusion highlights to a book by title similarity.""" + if not book.title: + return try: import difflib @@ -307,8 +254,8 @@ def auto_link_by_title(self, book): norm_bf = normalize_title(bf_title) if norm_bf == norm_book or difflib.SequenceMatcher(None, norm_bf, norm_book).ratio() > 0.85: if hl.bookfusion_book_id: - self.link_bookfusion_book(hl.bookfusion_book_id, book.id) + self.link_bookfusion_highlights_by_book_id(hl.bookfusion_book_id, book.id) logger.info(f"Auto-linked BookFusion highlights for '{bf_title}' to book {book.id}") break - except Exception as e: + except (AttributeError, TypeError) as e: logger.warning(f"BookFusion auto-link failed: {e}") diff --git a/src/db/database_service.py b/src/db/database_service.py index 7bf1477..77a5ed9 100644 --- a/src/db/database_service.py +++ b/src/db/database_service.py @@ -51,7 +51,8 @@ def __init__(self, db_path: str): self._run_alembic_migrations() # Ensure all tables exist (covers new models not yet in migrations) - Base.metadata.create_all(self.db_manager.engine) + if not self._migration_failed: + Base.metadata.create_all(self.db_manager.engine) # Safety net: add any model columns missing from existing tables # Skip if migrations failed — adding columns without constraints would @@ -219,31 +220,34 @@ def _ensure_model_columns(self): existing_cols = {c["name"] for c in inspector.get_columns(table_name)} for col in model.columns: if col.name not in existing_cols: - col_type = col.type.compile(self.db_manager.engine.dialect) - default_clause = "" - if col.default is not None: - default_val = col.default.arg - if callable(default_val): - try: - default_val = default_val() - except TypeError: + try: + col_type = col.type.compile(self.db_manager.engine.dialect) + default_clause = "" + if col.default is not None: + default_val = col.default.arg + if callable(default_val): try: - 
default_val = default_val(None) - except Exception: - default_val = None - if isinstance(default_val, bool): - default_clause = f" DEFAULT {'TRUE' if default_val else 'FALSE'}" - elif isinstance(default_val, str): - escaped = default_val.replace("'", "''") - default_clause = f" DEFAULT '{escaped}'" - elif isinstance(default_val, (int, float)): - default_clause = f" DEFAULT {default_val}" - - alter = f"ALTER TABLE {table_name} ADD COLUMN {col.name} {col_type}{default_clause}" - with self.db_manager.engine.connect() as conn: - conn.execute(text(alter)) - conn.commit() - logger.info(f"Added missing column: {table_name}.{col.name}") + default_val = default_val() + except TypeError: + try: + default_val = default_val(None) + except Exception: + default_val = None + if isinstance(default_val, bool): + default_clause = f" DEFAULT {'TRUE' if default_val else 'FALSE'}" + elif isinstance(default_val, str): + escaped = default_val.replace("'", "''") + default_clause = f" DEFAULT '{escaped}'" + elif isinstance(default_val, (int, float)): + default_clause = f" DEFAULT {default_val}" + + alter = f"ALTER TABLE {table_name} ADD COLUMN {col.name} {col_type}{default_clause}" + with self.db_manager.engine.connect() as conn: + conn.execute(text(alter)) + conn.commit() + logger.info(f"Added missing column: {table_name}.{col.name}") + except Exception as e: + logger.warning("Could not add column %s.%s: %s", table_name, col.name, e) except Exception as e: logger.warning(f"Column check failed (non-fatal): {e}") @@ -336,8 +340,7 @@ def save_hardcover_details(self, details): hc_id = int(details.hardcover_book_id) tbr_item = self._tbr.find_tbr_by_hardcover_id(hc_id) if tbr_item and not tbr_item.book_id: - self._tbr.link_tbr_to_book(tbr_item.id, details.book_id, - book_abs_id=details.abs_id) + self._tbr.link_tbr_to_book(tbr_item.id, details.book_id) except (TypeError, ValueError): pass return result @@ -358,12 +361,6 @@ def get_bookfusion_sync_cursor(self): def set_bookfusion_sync_cursor(self, cursor): return self._settings.set_setting("BOOKFUSION_SYNC_CURSOR", cursor) - def find_tbr_by_abs_id(self, abs_id): - return self._tbr.find_by_abs_id(abs_id) - - def delete_tbr_by_abs_id(self, abs_id): - return self._tbr.delete_by_abs_id(abs_id) - def find_tbr_by_book_id(self, book_id): return self._tbr.find_by_book_id(book_id) @@ -390,8 +387,8 @@ def update_tbr_item(self, item_id, **fields): def delete_tbr_item(self, item_id): return self._tbr.delete_tbr_item(item_id) - def link_tbr_to_book(self, item_id, book_id, book_abs_id=None): - return self._tbr.link_tbr_to_book(item_id, book_id, book_abs_id=book_abs_id) + def link_tbr_to_book(self, item_id, book_id): + return self._tbr.link_tbr_to_book(item_id, book_id) def find_tbr_by_hardcover_id(self, hc_book_id): return self._tbr.find_tbr_by_hardcover_id(hc_book_id) diff --git a/src/db/hardcover_repository.py b/src/db/hardcover_repository.py index 486f611..8d41fab 100644 --- a/src/db/hardcover_repository.py +++ b/src/db/hardcover_repository.py @@ -43,13 +43,6 @@ def save_hardcover_details(self, details): ], ) - def delete_hardcover_details(self, abs_id): - """Delete by abs_id (backward compat).""" - return self._delete_one(HardcoverDetails, HardcoverDetails.abs_id == abs_id) - - def delete_hardcover_details_by_book_id(self, book_id): - return self._delete_one(HardcoverDetails, HardcoverDetails.book_id == book_id) - def get_all_hardcover_details(self): return self._get_all(HardcoverDetails) diff --git a/src/db/kosync_repository.py b/src/db/kosync_repository.py index d41f4c4..4411ced 
100644 --- a/src/db/kosync_repository.py +++ b/src/db/kosync_repository.py @@ -3,7 +3,7 @@ from datetime import UTC, datetime from .base_repository import BaseRepository -from .models import KosyncDocument +from .models import Book, KosyncDocument class KoSyncRepository(BaseRepository): @@ -26,14 +26,7 @@ def get_all_kosync_documents(self): def get_unlinked_kosync_documents(self): return self._get_all( KosyncDocument, - KosyncDocument.linked_abs_id == None, - order_by=KosyncDocument.last_updated.desc(), - ) - - def get_linked_kosync_documents(self): - return self._get_all( - KosyncDocument, - KosyncDocument.linked_abs_id != None, + KosyncDocument.linked_book_id == None, order_by=KosyncDocument.last_updated.desc(), ) @@ -64,17 +57,6 @@ def unlink_kosync_document(self, document_hash): def delete_kosync_document(self, document_hash): return self._delete_one(KosyncDocument, KosyncDocument.document_hash == document_hash) - def get_kosync_document_by_linked_book(self, abs_id): - """Get by abs_id (backward compat).""" - return self._get_one(KosyncDocument, KosyncDocument.linked_abs_id == abs_id) - - def get_kosync_document_by_linked_book_id(self, book_id): - return self._get_one(KosyncDocument, KosyncDocument.linked_book_id == book_id) - - def get_kosync_documents_for_book(self, abs_id): - """Get by abs_id (backward compat).""" - return self._get_all(KosyncDocument, KosyncDocument.linked_abs_id == abs_id) - def get_kosync_documents_for_book_by_book_id(self, book_id): return self._get_all(KosyncDocument, KosyncDocument.linked_book_id == book_id) @@ -88,11 +70,15 @@ def get_kosync_doc_by_booklore_id(self, booklore_id): return None return self._get_one(KosyncDocument, KosyncDocument.booklore_id == str(booklore_id)) - def is_hash_linked_to_device(self, doc_hash): - if not doc_hash: - return False + def get_orphaned_kosync_books(self): + """Get books with kosync_doc_id set but no matching KosyncDocument.""" with self.get_session() as session: - return session.query(KosyncDocument).filter( - KosyncDocument.document_hash == doc_hash, - KosyncDocument.linked_abs_id != None, - ).first() is not None + subq = session.query(KosyncDocument.document_hash) + results = (session.query(Book) + .filter(Book.kosync_doc_id != None) + .filter(~Book.kosync_doc_id.in_(subq)) + .all()) + for r in results: + session.expunge(r) + return results + diff --git a/src/db/models.py b/src/db/models.py index cd9214e..9ddcdfe 100644 --- a/src/db/models.py +++ b/src/db/models.py @@ -98,6 +98,8 @@ class Book(Base): abs_ebook_item_id = Column(String(255), nullable=True) ebook_item_id = Column(String(255), nullable=True) custom_cover_url = Column(String(500), nullable=True) + author = Column(String(500), nullable=True) + subtitle = Column(String(500), nullable=True) # Reading tracker fields started_at = Column(String(10), nullable=True) # YYYY-MM-DD @@ -125,7 +127,7 @@ def __init__(self, abs_id: str = None, title: str = None, ebook_filename: str = kosync_doc_id: str = None, transcript_file: str = None, status: str = 'not_started', duration: float = None, sync_mode: str = 'audiobook', storyteller_uuid: str = None, abs_ebook_item_id: str = None, ebook_item_id: str = None, - custom_cover_url: str = None, + custom_cover_url: str = None, author: str = None, subtitle: str = None, started_at: str = None, finished_at: str = None, rating: float = None, read_count: int = 1): self.abs_id = abs_id @@ -141,6 +143,8 @@ def __init__(self, abs_id: str = None, title: str = None, ebook_filename: str = self.abs_ebook_item_id = abs_ebook_item_id 
diff --git a/src/db/storyteller_repository.py b/src/db/storyteller_repository.py
index 84d2291..656f851 100644
--- a/src/db/storyteller_repository.py
+++ b/src/db/storyteller_repository.py
@@ -30,22 +30,6 @@ def save_storyteller_submission(self, submission):
             session.expunge(submission)
         return submission

-    def get_active_storyteller_submission(self, abs_id):
-        """Get by abs_id (backward compat)."""
-        with self.get_session() as session:
-            sub = (
-                session.query(StorytellerSubmission)
-                .filter(
-                    StorytellerSubmission.abs_id == abs_id,
-                    StorytellerSubmission.status.in_(["queued", "processing"]),
-                )
-                .order_by(StorytellerSubmission.submitted_at.desc())
-                .first()
-            )
-            if sub:
-                session.expunge(sub)
-            return sub
-
     def get_active_storyteller_submission_by_book_id(self, book_id):
         with self.get_session() as session:
             sub = (
@@ -75,19 +59,6 @@ def update_storyteller_submission_status(self, submission_id, status, last_check
             if submission_dir is not None:
                 sub.submission_dir = submission_dir

-    def get_storyteller_submission(self, abs_id):
-        """Get by abs_id (backward compat)."""
-        with self.get_session() as session:
-            sub = (
-                session.query(StorytellerSubmission)
-                .filter(StorytellerSubmission.abs_id == abs_id)
-                .order_by(StorytellerSubmission.submitted_at.desc())
-                .first()
-            )
-            if sub:
-                session.expunge(sub)
-            return sub
-
     def get_storyteller_submission_by_book_id(self, book_id):
         with self.get_session() as session:
             sub = (
@@ -103,15 +74,16 @@ def get_storyteller_submission_by_book_id(self, book_id):
     def get_all_storyteller_submissions_latest(self):
         """Get the most recent submission per book (for dashboard bulk display).

-        Returns a dict of {abs_id: StorytellerSubmission}.
+        Returns a dict of {book_id: StorytellerSubmission}.
         """
         with self.get_session() as session:
             latest = (
                 session.query(
-                    StorytellerSubmission.abs_id,
+                    StorytellerSubmission.book_id,
                     func.max(StorytellerSubmission.submitted_at).label("max_ts"),
                 )
-                .group_by(StorytellerSubmission.abs_id)
+                .filter(StorytellerSubmission.book_id.isnot(None))
+                .group_by(StorytellerSubmission.book_id)
                 .subquery()
             )

@@ -119,7 +91,7 @@ def get_all_storyteller_submissions_latest(self):
             session.query(StorytellerSubmission)
             .join(
                 latest,
-                (StorytellerSubmission.abs_id == latest.c.abs_id)
+                (StorytellerSubmission.book_id == latest.c.book_id)
                 & (StorytellerSubmission.submitted_at == latest.c.max_ts),
             )
             .all()
@@ -128,5 +100,5 @@ def get_all_storyteller_submissions_latest(self):
         result = {}
         for sub in rows:
             session.expunge(sub)
-            result[sub.abs_id] = sub
+            result[sub.book_id] = sub
         return result
diff --git a/src/db/suggestion_repository.py b/src/db/suggestion_repository.py
index 3d8624d..23c5469 100644
--- a/src/db/suggestion_repository.py
+++ b/src/db/suggestion_repository.py
@@ -1,49 +1,57 @@
 """Repository for pending suggestions."""
 from .base_repository import BaseRepository
-from .models import Book, KosyncDocument, PendingSuggestion
+from .models import PendingSuggestion


 class SuggestionRepository(BaseRepository):
-    ACTIONABLE_STATUSES = ('pending', 'hidden', 'dismissed')
+    ACTIONABLE_STATUSES = ('pending', 'hidden')

-    def get_suggestion(self, source_id):
+    def get_suggestion(self, source_id, source='abs'):
         return self._get_one(
             PendingSuggestion,
             PendingSuggestion.source_id == source_id,
+            PendingSuggestion.source == source,
         )

-    def get_pending_suggestion(self, source_id):
+    def get_pending_suggestion(self, source_id, source='abs'):
         return self._get_one(
             PendingSuggestion,
             PendingSuggestion.source_id == source_id,
+            PendingSuggestion.source == source,
             PendingSuggestion.status == 'pending',
         )

-    def suggestion_exists(self, source_id):
+    def suggestion_exists(self, source_id, source='abs'):
         with self.get_session() as session:
             return session.query(PendingSuggestion).filter(
-                PendingSuggestion.source_id == source_id
+                PendingSuggestion.source_id == source_id,
+                PendingSuggestion.source == source,
             ).first() is not None

-    def is_suggestion_ignored(self, source_id):
+    def is_suggestion_ignored(self, source_id, source='abs'):
         with self.get_session() as session:
             return session.query(PendingSuggestion).filter(
                 PendingSuggestion.source_id == source_id,
+                PendingSuggestion.source == source,
                 PendingSuggestion.status == 'ignored',
             ).first() is not None

     def save_pending_suggestion(self, suggestion):
         with self.get_session() as session:
             existing = session.query(PendingSuggestion).filter(
-                PendingSuggestion.source_id == suggestion.source_id
+                PendingSuggestion.source_id == suggestion.source_id,
+                PendingSuggestion.source == suggestion.source,
             ).first()
-            if existing and existing.status in ('hidden', 'dismissed') and suggestion.status == 'pending':
+            if existing and existing.status == 'hidden' and suggestion.status == 'pending':
                 suggestion.status = 'hidden'
         return self._upsert(
             PendingSuggestion,
-            [PendingSuggestion.source_id == suggestion.source_id],
+            [
+                PendingSuggestion.source_id == suggestion.source_id,
+                PendingSuggestion.source == suggestion.source,
+            ],
             suggestion,
             ['title', 'author', 'cover_url', 'matches_json', 'status'],
         )
@@ -65,47 +73,52 @@ def get_all_actionable_suggestions(self):
     def get_hidden_suggestions(self):
         return self._get_all(
             PendingSuggestion,
-            PendingSuggestion.status.in_(('hidden', 'dismissed')),
+            PendingSuggestion.status == 'hidden',
             order_by=PendingSuggestion.created_at.desc(),
         )

-    def delete_pending_suggestion(self, source_id):
+    def delete_pending_suggestion(self, source_id, source='abs'):
         return self._delete_one(
             PendingSuggestion,
             PendingSuggestion.source_id == source_id,
+            PendingSuggestion.source == source,
             PendingSuggestion.status == 'pending',
         )

-    def resolve_suggestion(self, source_id):
+    def resolve_suggestion(self, source_id, source='abs'):
         return self._delete_one(
             PendingSuggestion,
             PendingSuggestion.source_id == source_id,
+            PendingSuggestion.source == source,
         )

-    def hide_suggestion(self, source_id):
+    def hide_suggestion(self, source_id, source='abs'):
         with self.get_session() as session:
             suggestion = session.query(PendingSuggestion).filter(
-                PendingSuggestion.source_id == source_id
+                PendingSuggestion.source_id == source_id,
+                PendingSuggestion.source == source,
             ).first()
             if suggestion and suggestion.status != 'ignored':
                 suggestion.status = 'hidden'
                 return True
         return False

-    def unhide_suggestion(self, source_id):
+    def unhide_suggestion(self, source_id, source='abs'):
         with self.get_session() as session:
             suggestion = session.query(PendingSuggestion).filter(
-                PendingSuggestion.source_id == source_id
+                PendingSuggestion.source_id == source_id,
+                PendingSuggestion.source == source,
             ).first()
-            if suggestion and suggestion.status in ('hidden', 'dismissed'):
+            if suggestion and suggestion.status == 'hidden':
                 suggestion.status = 'pending'
                 return True
         return False

-    def ignore_suggestion(self, source_id):
+    def ignore_suggestion(self, source_id, source='abs'):
         with self.get_session() as session:
             suggestion = session.query(PendingSuggestion).filter(
-                PendingSuggestion.source_id == source_id
+                PendingSuggestion.source_id == source_id,
+                PendingSuggestion.source == source,
             ).first()
             if suggestion:
                 suggestion.status = 'ignored'
@@ -115,6 +128,8 @@ def ignore_suggestion(self, source_id):
     def clear_stale_suggestions(self):
         """Delete suggestions whose source_id is not in the books table."""
         from sqlalchemy import not_
+
+        from .models import Book
         with self.get_session() as session:
             count = session.query(PendingSuggestion).filter(
                 PendingSuggestion.source == 'abs',
@@ -129,12 +144,3 @@ def normalize_dismissed_suggestions(self):
                 PendingSuggestion.status == 'dismissed'
             ).update({'status': 'hidden'}, synchronize_session=False)
         return updated
-
-    def is_hash_linked_to_device(self, doc_hash):
-        if not doc_hash:
-            return False
-        with self.get_session() as session:
-            return session.query(KosyncDocument).filter(
-                KosyncDocument.document_hash == doc_hash,
-                KosyncDocument.linked_abs_id.isnot(None),
-            ).count() > 0
diff --git a/src/db/tbr_repository.py b/src/db/tbr_repository.py
index 587fd91..78e229a 100644
--- a/src/db/tbr_repository.py
+++ b/src/db/tbr_repository.py
@@ -102,14 +102,13 @@ def delete_tbr_item(self, item_id):
         """Remove a TBR item. Returns True if deleted."""
         return self._delete_one(TbrItem, TbrItem.id == item_id)

-    def link_tbr_to_book(self, item_id, book_id, book_abs_id=None):
+    def link_tbr_to_book(self, item_id, book_id):
         """Set book_id on a TBR item (linking it to an owned book)."""
         with self.get_session() as session:
             item = session.query(TbrItem).filter(TbrItem.id == item_id).first()
             if not item:
                 return None
             item.book_id = book_id
-            item.book_abs_id = book_abs_id
             session.flush()
             session.refresh(item)
             session.expunge(item)
@@ -128,22 +127,12 @@ def find_by_book_id(self, book_id):
         """Find a TBR item linked to a given library book."""
         return self._get_one(TbrItem, TbrItem.book_id == book_id)

-    def find_by_abs_id(self, abs_id):
-        """Find a TBR item linked to a given abs_id (legacy)."""
-        return self._get_one(TbrItem, TbrItem.book_abs_id == abs_id)
-
     def delete_by_book_id(self, book_id):
         """Delete any TBR items linked to the given book_id. Returns count deleted."""
         with self.get_session() as session:
             count = session.query(TbrItem).filter(TbrItem.book_id == book_id).delete()
             return count

-    def delete_by_abs_id(self, abs_id):
-        """Delete any TBR items linked to the given abs_id (legacy). Returns count deleted."""
-        with self.get_session() as session:
-            count = session.query(TbrItem).filter(TbrItem.book_abs_id == abs_id).delete()
-            return count
-
     def get_unlinked_items(self):
         """Return TBR items where book_id is NULL (not linked to a library book)."""
         with self.get_session() as session:
@@ -156,6 +145,8 @@ def get_unlinked_items(self):
     def auto_link_by_title(self, book):
         """Auto-link unlinked TBR items by normalized title match."""
+        if not book.title:
+            return
         try:
             unlinked = self.get_unlinked_items()
             if not unlinked:
@@ -163,8 +154,8 @@ def auto_link_by_title(self, book):
             norm_title = book.title.lower().strip()
             for item in unlinked:
                 if item.title and item.title.lower().strip() == norm_title:
-                    self.link_tbr_to_book(item.id, book.id, book_abs_id=book.abs_id)
+                    self.link_tbr_to_book(item.id, book.id)
                     logger.info(f"Auto-linked TBR item '{item.title}' to book {book.id}")
                     break
-        except Exception as e:
+        except (AttributeError, TypeError) as e:
             logger.warning(f"TBR auto-link failed: {e}")
diff --git a/src/services/abs_service.py b/src/services/abs_service.py
index 50c78ac..58e4891 100644
--- a/src/services/abs_service.py
+++ b/src/services/abs_service.py
@@ -51,6 +51,10 @@ def remove_from_collection(self, abs_id: str, collection_name: str) -> bool:
             return False
         return self.abs_client.remove_from_collection(abs_id, collection_name)

+    def has_ebook_libraries(self) -> bool:
+        """Return True if ABS is configured and capable of serving ebooks."""
+        return self.is_available()
+
     # --- Ebook operations ---

     def search_ebooks(self, query: str) -> list[dict]:
diff --git a/src/services/alignment_service.py b/src/services/alignment_service.py
index f6f7411..f284807 100644
--- a/src/services/alignment_service.py
+++ b/src/services/alignment_service.py
@@ -40,6 +40,7 @@ def normalize_for_cross_format_comparison(book, config, sync_clients, ebook_pars
     """
     has_abs = 'ABS' in config
     ebook_clients = [k for k in config.keys() if k != 'ABS']
+    book_label = book.title or str(book.id)

     if not ebook_clients:
         return None
@@ -53,11 +54,12 @@ def normalize_for_cross_format_comparison(book, config, sync_clients, ebook_pars
             full_text, _ = ebook_parser.extract_text_and_map(book_path)
             total_text_len = len(full_text)
         except Exception as e:
-            logger.debug(f"'{book.abs_id}' Could not load ebook for normalization: {e}")
+            logger.debug(f"'{book_label}' Could not load ebook for normalization: {e}")
             return None
         if not total_text_len:
             return None

         normalized = {}
+        used_fallback = False
         for client_name in ebook_clients:
             client = sync_clients.get(client_name)
             if not client:
@@ -68,6 +70,7 @@ def normalize_for_cross_format_comparison(book, config, sync_clients, ebook_pars
                 client_pct = max(0.0, min(1.0, float(client_pct)))
             except (TypeError, ValueError):
                 client_pct = 0.0
+            matched = False
             try:
                 text_snippet = client.get_text_from_current_state(book, client_state)
                 if text_snippet:
@@ -77,17 +80,27 @@ def normalize_for_cross_format_comparison(book, config, sync_clients, ebook_pars
                     )
                     if loc and loc.match_index is not None:
                         normalized[client_name] = loc.match_index
-                        logger.debug(f"'{book.abs_id}' Normalized '{client_name}' {client_pct:.2%} -> char {loc.match_index}")
-                        continue
+                        logger.debug(f"'{book_label}' Normalized '{client_name}' {client_pct:.2%} -> char {loc.match_index}")
+                        matched = True
             except Exception as e:
-                logger.debug(f"'{book.abs_id}' Text-based normalization failed for '{client_name}': {e}")
-            normalized[client_name] = int(client_pct * total_text_len)
-            logger.debug(f"'{book.abs_id}' Normalized '{client_name}' {client_pct:.2%} -> char {int(client_pct * total_text_len)} (pct fallback)")
+                logger.debug(f"'{book_label}' Text-based normalization failed for '{client_name}': {e}")
+            if not matched:
+                used_fallback = True
+                normalized[client_name] = int(client_pct * total_text_len)
+                logger.debug(f"'{book_label}' Normalized '{client_name}' {client_pct:.2%} -> char {int(client_pct * total_text_len)} (pct fallback)")
+
+        if used_fallback:
+            # Mixing text-matched positions with percentage-based estimates is
+            # unreliable — a precise XPath match and a rough percentage-to-char
+            # conversion can produce nearly identical values that invert the true
+            # ordering. Fall back to raw percentage comparison instead.
+            logger.debug(f"'{book_label}' Discarding character normalization — not all clients had text matches")
+            return None

         return normalized if len(normalized) > 1 else None

     # Audio + ebook path
     if not book.transcript_file:
-        logger.debug(f"'{book.abs_id}' No transcript available for cross-format normalization")
+        logger.debug(f"'{book_label}' No transcript available for cross-format normalization")
         return None

     normalized = {}
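# [Editor's note: toy numbers illustrating the inversion described in the
# mixed-normalization comment above; nothing here is part of the patch.]
total_text_len = 100_000
a_char = 41_800                      # client A: precise text match (truly at 45%)
b_char = int(0.43 * total_text_len)  # client B: pct fallback -> 43_000 (truly at 43%)
# Comparing mixed units says B (43_000) is ahead of A (41_800), even though
# A is really further along - hence the bail-out to raw percentage comparison.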
+ logger.debug(f"'{book_label}' Discarding character normalization — not all clients had text matches") + return None return normalized if len(normalized) > 1 else None # Audio + ebook path if not book.transcript_file: - logger.debug(f"'{book.abs_id}' No transcript available for cross-format normalization") + logger.debug(f"'{book_label}' No transcript available for cross-format normalization") return None normalized = {} @@ -117,12 +130,12 @@ def normalize_for_cross_format_comparison(book, config, sync_clients, ebook_pars txt = full_text[max(0, char_offset - 400):min(total_text_len, char_offset + 400)] if not txt: - logger.debug(f"'{book.abs_id}' Could not get text from '{client_name}' for normalization") + logger.debug(f"'{book_label}' Could not get text from '{client_name}' for normalization") continue if alignment_service: ts_for_text = alignment_service.get_time_for_text( - book.abs_id, + book.id, char_offset_hint=char_offset ) else: @@ -130,11 +143,11 @@ def normalize_for_cross_format_comparison(book, config, sync_clients, ebook_pars if ts_for_text is not None: normalized[client_name] = ts_for_text - logger.debug(f"'{book.abs_id}' Normalized '{client_name}' {client_pct:.2%} -> {ts_for_text:.1f}s") + logger.debug(f"'{book_label}' Normalized '{client_name}' {client_pct:.2%} -> {ts_for_text:.1f}s") else: - logger.debug(f"'{book.abs_id}' Could not find timestamp for '{client_name}' text") + logger.debug(f"'{book_label}' Could not find timestamp for '{client_name}' text") except Exception as e: - logger.warning(f"'{book.abs_id}' Cross-format normalization failed for '{client_name}': {sanitize_exception(e)}") + logger.warning(f"'{book_label}' Cross-format normalization failed for '{client_name}': {sanitize_exception(e)}") if len(normalized) > 1: return normalized @@ -145,11 +158,11 @@ def __init__(self, database_service, polisher: Polisher): self.database_service = database_service self.polisher = polisher - def has_alignment(self, abs_id: str) -> bool: - return bool(abs_id and self._get_alignment(abs_id)) + def has_alignment(self, book_id: int) -> bool: + return bool(book_id and self._get_alignment(book_id)) @time_execution - def align_and_store(self, abs_id: str, raw_segments: list[dict], ebook_text: str, spine_chapters: list[dict] = None, source: str = None): + def align_and_store(self, book_id: int, raw_segments: list[dict], ebook_text: str, spine_chapters: list[dict] = None, source: str = None): """ Main entry point for "Unified Alignment". @@ -161,7 +174,7 @@ def align_and_store(self, abs_id: str, raw_segments: list[dict], ebook_text: str 4. Rebuild: Fix fragmented sentences in transcript using ebook text as a guide. 5. Store: Save ONLY the mapping and essential metadata to DB. """ - logger.info(f"AlignmentService: Processing {abs_id} (Text: {len(ebook_text)} chars, Segments: {len(raw_segments)})") + logger.info(f"AlignmentService: Processing book {book_id} (Text: {len(ebook_text)} chars, Segments: {len(raw_segments)})") # 1. Validation (Spine Check) # Note: This is soft validation. If lengths assume vastly different sizes, warn. @@ -190,11 +203,11 @@ def align_and_store(self, abs_id: str, raw_segments: list[dict], ebook_text: str return False # 4. 
Store to Database - self._save_alignment(abs_id, alignment_map, source=source) + self._save_alignment(book_id, alignment_map, source=source) return True @time_execution - def align_storyteller_and_store(self, abs_id: str, storyteller_chapters: list[dict], ebook_text: str) -> bool: + def align_storyteller_and_store(self, book_id: int, storyteller_chapters: list[dict], ebook_text: str) -> bool: """Align using Storyteller's native word-level timing data. Converts wordTimeline entries into segments compatible with the existing @@ -203,7 +216,7 @@ def align_storyteller_and_store(self, abs_id: str, storyteller_chapters: list[di Each wordTimeline entry is expected to have 'startTime' (float seconds) and 'word' or 'text' (string). """ - logger.info(f"AlignmentService: Processing {abs_id} via Storyteller wordTimeline " + logger.info(f"AlignmentService: Processing book {book_id} via Storyteller wordTimeline " f"({len(storyteller_chapters)} chapters, {len(ebook_text)} chars)") # Build segments from wordTimeline data (~15-second groups) @@ -245,7 +258,7 @@ def align_storyteller_and_store(self, abs_id: str, storyteller_chapters: list[di }) if not segments: - logger.error(f"AlignmentService: No segments produced from wordTimeline for {abs_id}") + logger.error(f"AlignmentService: No segments produced from wordTimeline for book {book_id}") return False logger.info(f" Built {len(segments)} segments from wordTimeline data") @@ -261,14 +274,14 @@ def align_storyteller_and_store(self, abs_id: str, storyteller_chapters: list[di {"char": 0, "ts": 0.0}, {"char": len(ebook_text), "ts": total_duration}, ] - logger.warning(f" N-gram anchoring failed, using linear fallback for {abs_id}") + logger.warning(f" N-gram anchoring failed, using linear fallback for book {book_id}") - self._save_alignment(abs_id, alignment_map, source='storyteller') + self._save_alignment(book_id, alignment_map, source='storyteller') return True - def get_time_for_text(self, abs_id: str, char_offset_hint: int = None) -> float | None: + def get_time_for_text(self, book_id: int, char_offset_hint: int = None) -> float | None: """Look up a timestamp from the alignment map using a character offset.""" - alignment = self._get_alignment(abs_id) + alignment = self._get_alignment(book_id) if not alignment: return None @@ -296,7 +309,7 @@ def get_time_for_text(self, abs_id: str, char_offset_hint: int = None) -> float real_end = penultimate if target_offset > real_end['char']: - logger.warning(f"'{abs_id}' Char offset {target_offset} exceeds alignment range " + logger.warning(f"book {book_id}: Char offset {target_offset} exceeds alignment range " f"(max {real_end['char']}) — alignment may be partial") return None @@ -329,13 +342,13 @@ def get_time_for_text(self, abs_id: str, char_offset_hint: int = None) -> float return float(estimated_time) - def get_char_for_time(self, abs_id: str, timestamp: float) -> int | None: + def get_char_for_time(self, book_id: int, timestamp: float) -> int | None: """ Reverse lookup: Find character offset for a given timestamp. Returns None if the timestamp is beyond the alignment data range. """ # 1. 
Fetch Alignment Map - alignment = self._get_alignment(abs_id) + alignment = self._get_alignment(book_id) if not alignment: return None @@ -362,7 +375,7 @@ def get_char_for_time(self, abs_id: str, timestamp: float) -> int | None: if target_ts > real_end['ts']: # Timestamp is beyond the alignment data — can't determine position - logger.warning(f"'{abs_id}' Timestamp {target_ts:.1f}s exceeds alignment range " + logger.warning(f"book {book_id}: Timestamp {target_ts:.1f}s exceeds alignment range " f"(max {real_end['ts']:.1f}s) — alignment may be partial") return None @@ -530,10 +543,10 @@ def build_ngrams(items, is_book=False): logger.info(f" Anchored Alignment: Found {len(valid_anchors)} anchors (Total).") return final_map - def get_alignment_info(self, abs_id: str) -> dict | None: + def get_alignment_info(self, book_id: int) -> dict | None: """Return summary info about a book's alignment data without loading the full map.""" with self.database_service.get_session() as session: - row = session.query(BookAlignment).filter_by(abs_id=abs_id).first() + row = session.query(BookAlignment).filter_by(book_id=book_id).first() if not row: return None @@ -564,63 +577,60 @@ def get_alignment_info(self, abs_id: str) -> dict | None: 'source': row.source, } except (KeyError, TypeError, IndexError): - logger.warning(f"Malformed alignment data for {abs_id}") + logger.warning(f"Malformed alignment data for book {book_id}") return None - def delete_alignment(self, abs_id: str): + def delete_alignment(self, book_id: int): """Delete alignment data for a book.""" with self.database_service.get_session() as session: - session.query(BookAlignment).filter_by(abs_id=abs_id).delete() - logger.info(f"Deleted alignment data for {abs_id}") + session.query(BookAlignment).filter_by(book_id=book_id).delete() + logger.info(f"Deleted alignment data for book {book_id}") - def realign_book(self, abs_id: str): + def realign_book(self, book_id: int): """Atomically delete alignment + jobs and requeue book for re-processing.""" with self.database_service.get_session() as session: - session.query(BookAlignment).filter_by(abs_id=abs_id).delete() - session.query(Job).filter(Job.abs_id == abs_id).delete() - book = session.query(Book).filter_by(abs_id=abs_id).first() + session.query(BookAlignment).filter_by(book_id=book_id).delete() + session.query(Job).filter(Job.book_id == book_id).delete() + book = session.query(Book).filter_by(id=book_id).first() if book: book.transcript_file = None book.status = 'pending' - logger.info(f"Re-alignment queued for {abs_id}") + logger.info(f"Re-alignment queued for book {book_id}") - def _save_alignment(self, abs_id: str, alignment_map: list[dict], source: str = None, book_id: int = None): + def _save_alignment(self, book_id: int, alignment_map: list[dict], source: str = None): """Upsert alignment to SQLite.""" if not alignment_map: - logger.warning(f"Refusing to save empty alignment map for {abs_id}") + logger.warning(f"Refusing to save empty alignment map for book {book_id}") return with self.database_service.get_session() as session: json_blob = json.dumps(alignment_map) # Check exist - existing = session.query(BookAlignment).filter_by(abs_id=abs_id).first() + existing = session.query(BookAlignment).filter_by(book_id=book_id).first() if existing: existing.alignment_map_json = json_blob existing.last_updated = datetime.utcnow() if source: existing.source = source else: - if book_id is None: - book = self.database_service.get_book_by_abs_id(abs_id) - book_id = book.id if book else None - new_align = 
BookAlignment(abs_id=abs_id, book_id=book_id, alignment_map_json=json_blob, source=source) + new_align = BookAlignment(book_id=book_id, alignment_map_json=json_blob, source=source) session.add(new_align) # Context manager handles commit - logger.info(f" Saved alignment for {abs_id} to DB.") + logger.info(f" Saved alignment for book {book_id} to DB.") - def _get_alignment(self, abs_id: str) -> list[dict] | None: + def _get_alignment(self, book_id: int) -> list[dict] | None: with self.database_service.get_session() as session: - entry = session.query(BookAlignment).filter_by(abs_id=abs_id).first() + entry = session.query(BookAlignment).filter_by(book_id=book_id).first() if not entry: - logger.debug(f"No alignment row for {abs_id}") + logger.debug(f"No alignment row for book {book_id}") return None try: raw = json.loads(entry.alignment_map_json) except (json.JSONDecodeError, TypeError) as e: - logger.warning(f"Corrupt alignment JSON for {abs_id}: {e}") + logger.warning(f"Corrupt alignment JSON for book {book_id}: {e}") return None # Validate structure: each point must have int 'char' and float 'ts' @@ -630,18 +640,18 @@ def _get_alignment(self, abs_id: str) -> list[dict] | None: try: validated.append({'char': int(point['char']), 'ts': float(point['ts'])}) except (ValueError, TypeError): - logger.warning(f"Skipping invalid alignment point for {abs_id}: {point}") + logger.warning(f"Skipping invalid alignment point for book {book_id}: {point}") else: - logger.warning(f"Skipping malformed alignment point for {abs_id}: {point}") + logger.warning(f"Skipping malformed alignment point for book {book_id}: {point}") if not validated: - logger.warning(f"Alignment for {abs_id} has no valid points after validation") + logger.warning(f"Alignment for book {book_id} has no valid points after validation") return None return validated - def get_book_duration(self, abs_id: str) -> float | None: + def get_book_duration(self, book_id: int) -> float | None: """Get the total duration of the book from its alignment map.""" - alignment = self._get_alignment(abs_id) + alignment = self._get_alignment(book_id) if alignment and len(alignment) > 0: # The last point in the alignment map should have the max timestamp return float(alignment[-1]['ts']) diff --git a/src/services/background_job_service.py b/src/services/background_job_service.py index 9766b3b..4bbd507 100644 --- a/src/services/background_job_service.py +++ b/src/services/background_job_service.py @@ -80,7 +80,7 @@ def cleanup_stale_jobs(self): for book in candidates: has_alignment = False if self.alignment_service: - has_alignment = self.alignment_service.has_alignment(book.abs_id) + has_alignment = self.alignment_service.has_alignment(book.id) if has_alignment: if book.status != "active": @@ -266,6 +266,8 @@ def _phase_acquire_epub(self, book, update_progress): if not book.original_ebook_filename: book.original_ebook_filename = book.ebook_filename self.database_service.save_book(book) + from src.services.kosync_service import ensure_kosync_document + ensure_kosync_document(book, self.database_service) logger.info(f"Locked KOSync ID: {computed_hash}") except Exception as e: logger.warning(f"Failed to eager-lock KOSync ID: {e}") @@ -349,7 +351,7 @@ def _phase_alignment(self, book, abs_id, book_title, epub_path, raw_transcript, logger.info(f"Aligning transcript ({transcript_source}) using Anchored Alignment...") update_progress(0.1, 3) success = self.alignment_service.align_and_store( - abs_id, raw_transcript, book_text, chapters, source=transcript_source.lower() 
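# [Editor's note: usage sketch, not part of this patch. After this change the
# alignment API is keyed by the integer Book.id, which also exists for
# ebook-only books whose abs_id is None; segment shape and values below are
# hypothetical.]
segments = [{"start": 0.0, "text": "Chapter one. It was a dark..."}]
ok = alignment_service.align_and_store(
    book.id,            # canonical integer key instead of the ABS string ID
    segments,           # raw transcript segments
    ebook_text,         # full extracted EPUB text
    spine_chapters,     # optional chapter map
    source="whisper",   # provenance label stored on the alignment row
)
if ok:
    # Reverse lookups are keyed the same way.
    ts = alignment_service.get_time_for_text(book.id, char_offset_hint=12_345)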
@@ -390,7 +392,7 @@ def _try_storyteller_alignment(self, book, abs_id, book_text, update_progress) -
                 st_chapters = self.storyteller_client.get_word_timeline_chapters(book.storyteller_uuid)
                 if st_chapters:
                     # Transcriptions are ready — mark any active submission as done and align
-                    submission = self.database_service.get_active_storyteller_submission(abs_id)
+                    submission = self.database_service.get_active_storyteller_submission_by_book_id(book.id)
                     if submission and submission.status not in ("ready", "failed"):
                         if self.storyteller_submission_service:
                             self.storyteller_submission_service._update_submission_status(submission, "ready")
@@ -399,13 +401,13 @@ def _try_storyteller_alignment(self, book, abs_id, book_text, update_progress) -
             logger.debug(f"Direct Storyteller alignment check failed for '{sanitize_log_data(book.title)}': {e}")

         # Fall through: check submission status if no direct alignment was possible
-        submission = self.database_service.get_active_storyteller_submission(abs_id)
+        submission = self.database_service.get_active_storyteller_submission_by_book_id(book.id)
         if submission and submission.status in ("queued", "processing"):
             if self.storyteller_submission_service:
                 fresh_status = self.storyteller_submission_service.check_status(abs_id)
                 if fresh_status == "ready":
                     # Storyteller finished! Update the book's storyteller_uuid and proceed to alignment
-                    updated_sub = self.database_service.get_storyteller_submission(abs_id)
+                    updated_sub = self.database_service.get_storyteller_submission_by_book_id(book.id)
                     if updated_sub and updated_sub.storyteller_uuid and not book.storyteller_uuid:
                         book.storyteller_uuid = updated_sub.storyteller_uuid
                         self.database_service.save_book(book)
@@ -442,7 +444,7 @@ def _do_storyteller_alignment(self, book, abs_id, st_chapters, book_text, update
             f"Using Storyteller wordTimeline for '{sanitize_log_data(book.title)}' ({len(st_chapters)} chapters)"
         )
         update_progress(0.5, 2)
-        success = self.alignment_service.align_storyteller_and_store(abs_id, st_chapters, book_text)
+        success = self.alignment_service.align_storyteller_and_store(book.id, st_chapters, book_text)
         if success:
             update_progress(1.0, 2)
             return "STORYTELLER_NATIVE"
diff --git a/src/services/book_metadata_service.py b/src/services/book_metadata_service.py
index de687e2..3b33b70 100644
--- a/src/services/book_metadata_service.py
+++ b/src/services/book_metadata_service.py
@@ -16,7 +16,7 @@ def build_book_metadata(book, container, database_service, abs_service, booklore
     """
     abs_id = book.abs_id
     metadata = {}
-    sync_mode = getattr(book, 'sync_mode', 'audiobook')
+    sync_mode = book.sync_mode

     # ABS metadata (subtitle, author, narrator, duration, genres, description)
     if sync_mode != 'ebook_only':
@@ -37,6 +37,12 @@ def build_book_metadata(book, container, database_service, abs_service, booklore
         except Exception as e:
             logger.debug("abs_service.get_item_details failed for abs_id=%s: %s", abs_id, e, exc_info=True)

+    # Fall back to cached book metadata when ABS data is unavailable
+    if not metadata.get('author') and book.author:
+        metadata['author'] = book.author
+    if not metadata.get('subtitle') and book.subtitle:
+        metadata['subtitle'] = book.subtitle
+
     # Fallback duration from stored book data (in case ABS API call failed or was skipped)
     if not metadata.get('duration') and book.duration and book.duration > 0:
         hrs = int(book.duration // 3600)
@@ -88,7 +94,7 @@ def build_book_metadata(book, container, database_service, abs_service, booklore
     try:
         if bl_client and bl_client.is_configured():
             bl_book = bl_client.find_book_by_filename(book.ebook_filename, allow_refresh=False)
-            if not bl_book and getattr(book, 'original_ebook_filename', None):
+            if not bl_book and book.original_ebook_filename:
                 bl_book = bl_client.find_book_by_filename(book.original_ebook_filename, allow_refresh=False)
             if bl_book:
                 if not metadata.get('description') and bl_book.get('description'):
@@ -104,7 +110,7 @@ def build_book_metadata(book, container, database_service, abs_service, booklore
                      getattr(bl_client, 'base_url', '?'), e)

     # BookFusion catalog entry (tags, series)
-    bf_book = database_service.get_bookfusion_book_by_abs_id(abs_id)
+    bf_book = database_service.get_bookfusion_book_by_book_id(book.id)
     if bf_book:
         metadata['bf_tags'] = bf_book.tags or ''
         metadata['bf_series'] = bf_book.series or ''
@@ -121,12 +127,12 @@ def build_book_metadata(book, container, database_service, abs_service, booklore


 def build_service_info(book, states_by_book, container, abs_service, metadata,
-                       has_bookfusion_link, booklore_client=None):
+                       has_bookfusion_link):
     """Build per-service state data, integration flags, and enabled-service map.

     Returns (service_states, integrations, services_enabled).
     """
-    sync_mode = getattr(book, 'sync_mode', 'audiobook')
+    sync_mode = book.sync_mode
     hardcover = metadata.get('_hardcover')

     # Build per-service state data for the Services tab
@@ -148,14 +154,14 @@ def build_service_info(book, states_by_book, container, abs_service, metadata,
     storyteller = container.storyteller_client()
     hardcover = container.hardcover_client()
     bookfusion = container.bookfusion_client()
-    booklore = booklore_client or container.booklore_client()
+    bl_group = container.booklore_client_group()
     services_enabled = {
         'abs': abs_service is not None and abs_service.is_available(),
         'kosync': True,  # KoSync is always available (built-in server)
         'storyteller': storyteller is not None and storyteller.is_configured(),
         'hardcover': hardcover is not None and hardcover.is_configured(),
         'bookfusion': bookfusion is not None and bookfusion.is_configured(),
-        'booklore': booklore is not None and booklore.is_configured(),
+        'booklore': bl_group is not None and bl_group.is_configured(),
     }

     return service_states, integrations, services_enabled
diff --git a/src/services/client_poller.py b/src/services/client_poller.py
index 3e22201..7d2f352 100644
--- a/src/services/client_poller.py
+++ b/src/services/client_poller.py
@@ -30,7 +30,7 @@ def __init__(self, database_service, sync_manager, sync_clients_dict: dict):
         self._db = database_service
         self._sync_manager = sync_manager
         self._sync_clients = sync_clients_dict
-        self._last_known: dict[tuple, float] = {}  # {(client_name, abs_id): last_pct}
+        self._last_known: dict[tuple, float] = {}  # {(client_name, book_id): last_pct}
         self._last_poll: dict[str, float] = {}  # {client_name: last_poll_timestamp}
         self._running = False

@@ -115,7 +115,7 @@ def _poll_client(self, client_name: str) -> None:
                 continue

             checked += 1
-            cache_key = (client_name, book.abs_id)
+            cache_key = (client_name, book.id)
             last_pct = self._last_known.get(cache_key)

             if last_pct is None:
@@ -124,7 +124,7 @@ def _poll_client(self, client_name: str) -> None:
                 )
             elif abs(current_pct - last_pct) > 0.001:
                 # Check write-suppression before acting
-                if is_own_write(client_name, book.abs_id, state=current_state.current):
+                if is_own_write(client_name, book.id, state=current_state.current):
                     logger.debug(
                         f"{client_name} poll: Ignoring self-triggered change for '{book.title}'"
                     )
diff --git a/src/services/hardcover_service.py b/src/services/hardcover_service.py
index cb02a5a..72b0160 100644
--- a/src/services/hardcover_service.py
+++ b/src/services/hardcover_service.py
@@ -129,7 +129,7 @@ def push_local_status(self, book, status_label):
             )
             hardcover_details.hardcover_status_id = hc_status_id
             self.database_service.save_hardcover_details(hardcover_details)
-            record_write('Hardcover', book.abs_id, {'status': hc_status_id})
+            record_write('Hardcover', book.id, {'status': hc_status_id})
             log_hardcover_action(
                 self.database_service,
                 abs_id=book.abs_id,
@@ -174,7 +174,7 @@ def push_local_rating(self, book, rating):
         if not result:
             return {'hardcover_synced': False, 'hardcover_error': 'Hardcover rejected rating update'}

-        record_write('Hardcover', book.abs_id, {'rating': rating})
+        record_write('Hardcover', book.id, {'rating': rating})
         log_hardcover_action(
             self.database_service, abs_id=book.abs_id, book_title=sanitize_log_data(book.title),
@@ -244,7 +244,7 @@ def _get_or_create_user_book(self, book, hardcover_details, edition_id=None):
             hardcover_details.hardcover_user_book_id = result['id']
             hardcover_details.hardcover_status_id = result.get('status_id', hc_status_id)
             self.database_service.save_hardcover_details(hardcover_details)
-            record_write('Hardcover', book.abs_id, {'status': hardcover_details.hardcover_status_id})
+            record_write('Hardcover', book.id, {'status': hardcover_details.hardcover_status_id})
             log_hardcover_action(
                 self.database_service, abs_id=book.abs_id, book_title=sanitize_log_data(book.title),
diff --git a/src/services/kosync_service.py b/src/services/kosync_service.py
new file mode 100644
index 0000000..b3c6fbb
--- /dev/null
+++ b/src/services/kosync_service.py
@@ -0,0 +1,740 @@
+"""KoSync business logic extracted from kosync_server.py.
+
+Handles EPUB discovery, hash-to-book linking, auto-discovery, and
+document management. Route handlers in kosync_server.py delegate here.
+"""
+
+import json
+import logging
+import re
+import threading
+from pathlib import Path
+
+from src.db.models import Book, KosyncDocument, PendingSuggestion
+from src.utils.logging_utils import sanitize_log_data
+from src.utils.path_utils import is_safe_path_within
+
+logger = logging.getLogger(__name__)
+
+# Auto-discovery concurrency cap
+_MAX_ACTIVE_SCANS = 5
+
+
+def _normalize_title(s):
+    """Strip punctuation and lowercase for fuzzy title matching."""
+    return re.sub(r"[^\w\s]", "", s.lower())
+
+
+def ensure_kosync_document(book, database_service):
+    """Create a KosyncDocument for a book's kosync_doc_id if one doesn't exist.
+
+    Call after saving a book with a kosync_doc_id to prevent orphaned hashes
+    that cause 502 errors every sync cycle.
+    """
+    if not book or not book.kosync_doc_id or not book.id:
+        return
+    existing = database_service.get_kosync_document(book.kosync_doc_id)
+    if existing:
+        if not existing.linked_book_id:
+            database_service.link_kosync_document(book.kosync_doc_id, book.id, book.abs_id)
+            logger.info(f"KOSync: Linked existing document {book.kosync_doc_id[:8]}... to '{book.title}'")
+        if not existing.filename and book.ebook_filename:
+            existing.filename = book.ebook_filename
+            database_service.save_kosync_document(existing)
+    else:
+        doc = KosyncDocument(
+            document_hash=book.kosync_doc_id,
+            linked_book_id=book.id,
+            linked_abs_id=book.abs_id,
+            filename=book.ebook_filename,
+        )
+        database_service.save_kosync_document(doc)
+        logger.debug(f"KOSync: Created document {book.kosync_doc_id[:8]}... for '{book.title}'")
+
+
+class KosyncService:
+    """Business logic for KoSync document management and EPUB discovery."""
+
+    def __init__(self, database_service, container, manager=None, ebook_dir=None):
+        self._db = database_service
+        self._container = container
+        self._manager = manager
+        self._ebook_dir = ebook_dir
+        self._active_scans = set()
+        self._active_scans_lock = threading.Lock()
+
+    # ------------------------------------------------------------------ #
+    # Progress serialization (was duplicated 3x in kosync_server.py)
+    # ------------------------------------------------------------------ #
+
+    @staticmethod
+    def serialize_progress(doc, doc_id=None, device_default="pagekeeper"):
+        """Build a KoSync-protocol progress dict from a KosyncDocument."""
+        return {
+            "device": doc.device or device_default,
+            "device_id": doc.device_id or device_default,
+            "document": doc_id or doc.document_hash,
+            "percentage": float(doc.percentage) if doc.percentage else 0,
+            "progress": doc.progress or "",
+            "timestamp": int(doc.timestamp.timestamp()) if doc.timestamp else 0,
+        }
+
+    # ------------------------------------------------------------------ #
+    # Book resolution helpers
+    # ------------------------------------------------------------------ #
+
+    def resolve_book_by_sibling_hash(self, doc_id, existing_doc=None):
+        """Try to resolve an unknown hash to a book via sibling filename matches."""
+        doc = existing_doc or self._db.get_kosync_document(doc_id)
+        if doc and doc.filename:
+            # Find sibling document with same filename that's linked
+            sibling = self._db.get_kosync_doc_by_filename(doc.filename)
+            if sibling and sibling.linked_abs_id and sibling.document_hash != doc_id:
+                book = self._db.get_book_by_abs_id(sibling.linked_abs_id)
+                if book:
+                    logger.info(f"KOSync: Resolved {doc_id[:8]}... to '{book.title}' via filename sibling")
+                    return book
+
+            # Check if filename matches a book's ebook_filename directly
+            book = self._db.get_book_by_ebook_filename(doc.filename)
+            if book:
+                logger.info(f"KOSync: Resolved {doc_id[:8]}... to '{book.title}' via ebook filename match")
+                return book
+
+        return None
+
+    def register_hash_for_book(self, doc_id, book):
+        """Register a new hash and link it to an existing book."""
+        existing = self._db.get_kosync_document(doc_id)
+        if existing:
+            if not existing.linked_book_id:
+                self._db.link_kosync_document(doc_id, book.id, book.abs_id)
+                logger.info(f"KOSync: Linked existing document {doc_id[:8]}... to '{book.title}'")
+        else:
+            doc = KosyncDocument(
+                document_hash=doc_id,
+                linked_book_id=book.id,
+                linked_abs_id=book.abs_id,
+                filename=book.ebook_filename,
+            )
+            self._db.save_kosync_document(doc)
+            logger.info(f"KOSync: Created and linked new document {doc_id[:8]}... to '{book.title}'")
+
+    # ------------------------------------------------------------------ #
+    # EPUB discovery — decomposed from _try_find_epub_by_hash (151 lines)
+    # ------------------------------------------------------------------ #
+
+    def find_epub_by_hash(self, doc_hash):
+        """Try to find matching EPUB file for a KoSync document hash.
+
+        Searches in order: DB cache → filesystem → Booklore API.
+        Returns the epub filename on match, or None.
+        """
+ """ + try: + result = self._find_epub_in_db(doc_hash) + if result: + return result + + result = self._find_epub_in_filesystem(doc_hash) + if result: + return result + + result = self._find_epub_in_booklore(doc_hash) + if result: + return result + + except Exception as e: + logger.error(f"Error in EPUB auto-discovery: {e}") + + logger.info("Auto-discovery finished. No match found") + return None + + def _find_epub_in_db(self, doc_hash): + """Check DB for cached filename or linked book's original filename.""" + doc = self._db.get_kosync_document(doc_hash) + if doc and doc.filename: + try: + self._container.ebook_parser().resolve_book_path(doc.filename) + logger.info(f"Matched EPUB via DB: {doc.filename}") + return doc.filename + except FileNotFoundError: + logger.debug(f"DB suggested '{doc.filename}' but file is missing — Re-scanning") + + if doc and doc.linked_abs_id: + book = self._db.get_book_by_abs_id(doc.linked_abs_id) + if book and book.original_ebook_filename: + try: + self._container.ebook_parser().resolve_book_path(book.original_ebook_filename) + logger.info(f"Matched EPUB via Linked Book Original Filename: {book.original_ebook_filename}") + return book.original_ebook_filename + except Exception as e: + logger.debug(f"Failed to resolve original filename for {doc.linked_abs_id}: {e}") + + return None + + def _find_epub_in_filesystem(self, doc_hash): + """Scan configured ebook directory for matching hash.""" + if not self._ebook_dir or not self._ebook_dir.exists(): + return None + + logger.info(f"Starting filesystem search in {self._ebook_dir} for hash {doc_hash[:8]}...") + count = 0 + for epub_path in self._ebook_dir.rglob("*.epub"): + count += 1 + if count % 100 == 0: + logger.debug(f"Checked {count} local EPUBs...") + + # Optimization: check DB cache by filename first + cached_doc = self._db.get_kosync_doc_by_filename(epub_path.name) + if cached_doc: + current_mtime = epub_path.stat().st_mtime + if cached_doc.mtime == current_mtime: + if cached_doc.document_hash == doc_hash: + logger.info(f"Matched EPUB via DB filename lookup: {epub_path.name}") + return epub_path.name + continue + + try: + computed_hash = self._container.ebook_parser().get_kosync_id(epub_path) + + # Store/update in DB — never mutate document_hash (primary key) + if cached_doc and cached_doc.document_hash != computed_hash: + self._db.delete_kosync_document(cached_doc.document_hash) + cached_doc = None # force create below + if cached_doc: + cached_doc.mtime = epub_path.stat().st_mtime + cached_doc.source = "filesystem" + self._db.save_kosync_document(cached_doc) + else: + self._upsert_kosync_metadata( + computed_hash, epub_path.name, "filesystem", mtime=epub_path.stat().st_mtime + ) + + if computed_hash == doc_hash: + logger.info(f"Matched EPUB via filesystem: {epub_path.name}") + return epub_path.name + except Exception as e: + logger.debug(f"Error checking file {epub_path.name}: {e}") + + logger.info(f"Filesystem search finished. Checked {count} files. No match found") + return None + + def _find_epub_in_booklore(self, doc_hash): + """Search Booklore API for matching EPUB hash.""" + bl_group = self._container.booklore_client_group() + if not bl_group.is_configured(): + return None + + logger.info("Starting Booklore API search...") + + try: + books = self._db.get_all_booklore_books() + if not books: + logger.info("Booklore cache in DB is empty. 
Syncing library...") + bl_group.get_all_books() + books = self._db.get_all_booklore_books() + + logger.info(f"Scanning {len(books)} books from Booklore DB cache...") + + for book in books: + raw_id = book.raw_metadata_dict.get("id") if getattr(book, "raw_metadata_dict", None) else None + book_id = str(raw_id) if raw_id is not None else None + if not book_id: + try: + meta = json.loads(book.raw_metadata) + fallback_id = meta.get("id") + book_id = str(fallback_id) if fallback_id is not None else None + except (json.JSONDecodeError, AttributeError) as e: + logger.debug(f"Failed to parse raw_metadata JSON: {e}") + continue + + if not book_id: + continue + + qualified_id = f"{book.server_id}:{book_id}" + + # Check if we have a KosyncDocument for this Booklore ID + cached_doc = self._db.get_kosync_doc_by_booklore_id(qualified_id) + if cached_doc: + if cached_doc.document_hash == doc_hash: + logger.info(f"Matched EPUB via Booklore ID in DB: {cached_doc.filename}") + return cached_doc.filename + + try: + book_content = bl_group.download_book(qualified_id) + if book_content: + computed_hash = self._container.ebook_parser().get_kosync_id_from_bytes( + book.filename, book_content + ) + + if computed_hash == doc_hash: + safe_title = f"{book.server_id}_{Path(book.filename).name}" + cache_dir = self._container.data_dir() / "epub_cache" + cache_dir.mkdir(parents=True, exist_ok=True) + cache_path = cache_dir / safe_title + if not is_safe_path_within(cache_path, cache_dir): + logger.warning(f"Blocked cache write — path escapes cache dir: '{safe_title}'") + else: + with open(cache_path, "wb") as f: + f.write(book_content) + logger.info(f"Persisted Booklore book to cache: {safe_title}") + + self._upsert_kosync_metadata( + computed_hash, safe_title, "booklore", booklore_id=qualified_id + ) + + logger.info(f"Matched EPUB via Booklore download: {safe_title}") + return safe_title + except Exception as e: + logger.warning(f"Failed to check Booklore book '{sanitize_log_data(book.title)}': {e}") + + logger.info(f"Booklore search finished. Checked {len(books)} books. 
No match found") + + except Exception as e: + logger.debug(f"Error querying Booklore for EPUB matching: {e}") + + return None + + def _upsert_kosync_metadata(self, document_hash, filename, source, mtime=None, booklore_id=None): + """Cache hash metadata without overwriting any existing progress data.""" + existing = self._db.get_kosync_document(document_hash) + if existing: + existing.filename = filename + existing.source = source + if mtime is not None: + existing.mtime = mtime + if booklore_id is not None: + existing.booklore_id = booklore_id + self._db.save_kosync_document(existing) + else: + doc = KosyncDocument( + document_hash=document_hash, + filename=filename, + source=source, + mtime=mtime, + booklore_id=booklore_id, + ) + self._db.save_kosync_document(doc) + + # ------------------------------------------------------------------ # + # Auto-discovery (background threads) + # ------------------------------------------------------------------ # + + def start_discovery_if_available(self, doc_hash): + """Acquire a discovery slot and return True if started, False if skipped.""" + with self._active_scans_lock: + if doc_hash in self._active_scans or len(self._active_scans) >= _MAX_ACTIVE_SCANS: + return False + self._active_scans.add(doc_hash) + return True + + def finish_discovery(self, doc_hash): + """Release a discovery slot.""" + with self._active_scans_lock: + self._active_scans.discard(doc_hash) + + def run_get_auto_discovery(self, doc_id): + """Background discovery for GET: find epub and link to existing book.""" + try: + logger.info(f"KOSync: Background discovery (GET) for {doc_id[:8]}...") + epub_filename = self.find_epub_by_hash(doc_id) + + if not epub_filename: + logger.info(f"KOSync: GET-discovery found no epub for {doc_id[:8]}...") + return + + # Update stub with filename + doc = self._db.get_kosync_document(doc_id) + if doc and not doc.filename: + doc.filename = epub_filename + self._db.save_kosync_document(doc) + + # Try to find an existing book that uses this epub + book = self._db.get_book_by_ebook_filename(epub_filename) + if book: + self._db.link_kosync_document(doc_id, book.id, book.abs_id) + logger.info(f"KOSync: GET-discovery linked {doc_id[:8]}... 
to '{book.title}'") + return + + logger.info(f"KOSync: GET-discovery found epub '{epub_filename}' but no matching book") + except Exception as e: + logger.error(f"Error in GET auto-discovery: {e}") + finally: + self.finish_discovery(doc_id) + + def run_put_auto_discovery(self, doc_hash): + """Background discovery for PUT: find epub, match audiobook, create suggestion or book.""" + try: + logger.info(f"KOSync: Scheduled auto-discovery for unmapped document {doc_hash[:8]}...") + epub_filename = self.find_epub_by_hash(doc_hash) + + if not epub_filename: + logger.debug(f"Could not auto-match EPUB for KOSync document '{doc_hash[:8]}'") + return + + # Derive title from filename — strip server_id prefix from Booklore-cached files + stem = Path(epub_filename).stem + # Booklore cache files are named "{server_id}_{original}" — strip numeric prefix + if "_" in stem: + prefix, candidate = stem.split("_", 1) + if prefix.isdigit() and candidate: + stem = candidate + title = stem + + # Step 1: Search ABS for matching audiobooks + audiobook_matches = self._search_abs_audiobooks(title) + + # Step 2: If matches found, auto-create for single exact match or create suggestion + if audiobook_matches: + exact_matches = [m for m in audiobook_matches if m.get("confidence") == "exact"] + + if len(exact_matches) == 1: + # High confidence single match — auto-create book + match = exact_matches[0] + book = Book( + abs_id=match["abs_id"], + title=match["title"], + ebook_filename=epub_filename, + kosync_doc_id=doc_hash, + transcript_file=None, + status="active", + duration=match.get("duration"), + sync_mode="audiobook", + ) + self._db.save_book(book, is_new=True) + self._db.link_kosync_document(doc_hash, book.id, book.abs_id) + self._db.resolve_suggestion(doc_hash) + logger.info( + f"Auto-created book '{match['title']}' from exact title match (abs_id={match['abs_id']})" + ) + if self._manager: + self._manager.sync_cycle(target_book_id=book.id) + return + + # Multiple exact or only fuzzy matches — create suggestion for user review + if not self._db.suggestion_exists(doc_hash): + suggestion = PendingSuggestion( + source_id=doc_hash, + title=title, + author=None, + cover_url=f"/api/cover-proxy/{audiobook_matches[0]['abs_id']}", + matches_json=json.dumps( + audiobook_matches + [{"source": "ebook", "filename": epub_filename, "confidence": "high"}] + ), + source="kosync", + ) + self._db.save_pending_suggestion(suggestion) + logger.info( + f"Created suggestion for '{title}' - found {len(audiobook_matches)} match(es), {len(exact_matches)} exact" + ) + return + + # Step 3: No audiobook found — create ebook-only book + self.create_ebook_only_book(doc_hash, title, epub_filename) + + except Exception as e: + logger.error(f"Error in auto-discovery background task: {e}") + finally: + self.finish_discovery(doc_hash) + + def _search_abs_audiobooks(self, search_term): + """Search AudiobookShelf for audiobooks matching a title. 
Returns match list.""" + if not self._container.abs_client().is_configured(): + return [] + + matches = [] + try: + audiobooks = self._container.abs_client().get_all_audiobooks() + logger.debug( + f"Auto-discovery: Searching for audiobook matching '{search_term}' in {len(audiobooks)} audiobooks" + ) + search_norm = _normalize_title(search_term) + + for ab in audiobooks: + media = ab.get("media", {}) + metadata = media.get("metadata", {}) + ab_title = metadata.get("title") or ab.get("name", "") + ab_author = metadata.get("authorName", "") + title_norm = _normalize_title(ab_title) + + if not (search_norm and title_norm): + continue + if not (search_norm in title_norm or title_norm in search_norm): + continue + + # Skip books with high progress (>75%) + duration = media.get("duration", 0) + if duration > 0: + try: + ab_progress = self._container.abs_client().get_progress(ab["id"]) + if ab_progress and ab_progress.get("progress", 0) * 100 > 75: + logger.debug(f"Auto-discovery: Skipping '{ab_title}' - already >75% complete") + continue + except Exception as e: + logger.debug(f"Failed to get ABS progress during auto-discovery: {e}") + + confidence = "exact" if search_norm == title_norm else "high" + logger.debug(f"Auto-discovery: Matched '{ab_title}' by {ab_author} (confidence: {confidence})") + matches.append( + { + "source": "abs", + "abs_id": ab["id"], + "title": ab_title, + "author": ab_author, + "duration": duration, + "confidence": confidence, + } + ) + + except Exception as e: + logger.warning(f"Error searching ABS for audiobooks: {e}") + + return matches + + # ------------------------------------------------------------------ # + # Book creation and management + # ------------------------------------------------------------------ # + + def create_ebook_only_book(self, doc_hash, title, epub_filename=None): + """Create a new ebook-only Book and link the KosyncDocument to it.""" + book = Book( + abs_id=None, + title=title, + ebook_filename=epub_filename, + kosync_doc_id=doc_hash, + transcript_file=None, + status="active", + duration=None, + sync_mode="ebook_only", + ) + self._db.save_book(book, is_new=True) + self._db.link_kosync_document(doc_hash, book.id, book.abs_id) + self._db.resolve_suggestion(doc_hash) + logger.info(f"Created ebook-only book: {book.id} '{title}'" + (f" -> {epub_filename}" if epub_filename else "")) + + if self._manager: + self._manager.sync_cycle(target_book_id=book.id) + + return book + + def get_orphaned_kosync_books(self): + """Get books with kosync_doc_id set but no matching KosyncDocument.""" + return self._db.get_orphaned_kosync_books() + + def clear_orphaned_hash(self, book_id): + """Clear kosync_doc_id from a book to stop 502 cycle.""" + book = self._db.get_book_by_id(book_id) + if not book: + return None + old_hash = book.kosync_doc_id + book.kosync_doc_id = None + self._db.save_book(book) + logger.info(f"Cleared orphaned KoSync hash from '{book.title}' (was: {old_hash})") + return book + + # ------------------------------------------------------------------ # + # HTTP handler logic (moved from kosync_server.py route handlers) + # ------------------------------------------------------------------ # + + def handle_put_progress(self, data, remote_addr, debounce_manager=None): + """Process a KoSync PUT progress request. 
Returns (response_dict, status_code).""" + import os + from datetime import datetime + + from src.utils.constants import INTERNAL_DEVICE_NAMES + + if not data: + logger.warning(f"KOSync: PUT progress with no JSON data from {remote_addr}") + return {"error": "No data"}, 400 + + doc_hash = data.get("document") + if not doc_hash or not isinstance(doc_hash, str): + logger.warning(f"KOSync: PUT progress with no document ID from {remote_addr}") + return {"error": "Missing document ID"}, 400 + if len(doc_hash) > 64: + return {"error": "Document hash too long"}, 400 + + percentage = data.get("percentage", 0) + try: + percentage = float(percentage) + except (TypeError, ValueError): + return {"error": "Invalid percentage value"}, 400 + if percentage < 0.0 or percentage > 1.0: + return {"error": "Percentage must be between 0.0 and 1.0"}, 400 + + logger.info( + f"KOSync: PUT progress request for doc {doc_hash[:8]}... from {remote_addr} (device: {data.get('device', 'unknown')})" + ) + + progress = str(data.get("progress", ""))[:512] + device = str(data.get("device", ""))[:128] + device_id = str(data.get("device_id", ""))[:64] + + now = datetime.utcnow() + + kosync_doc = self._db.get_kosync_document(doc_hash) + + # Optional "furthest wins" protection + furthest_wins = os.environ.get("KOSYNC_FURTHEST_WINS", "true").lower() == "true" + force_update = data.get("force", False) + same_device = kosync_doc and kosync_doc.device_id == device_id + + if furthest_wins and kosync_doc and kosync_doc.percentage and not force_update and not same_device: + existing_pct = float(kosync_doc.percentage) + new_pct = float(percentage) + if new_pct < existing_pct - 0.0001: + logger.info( + f"KOSync: Ignored progress from '{device}' for doc {doc_hash[:8]}... (server has higher: {existing_pct:.2f}% vs new {new_pct:.2f}%)" + ) + return { + "document": doc_hash, + "timestamp": int(kosync_doc.timestamp.timestamp()) + if kosync_doc.timestamp + else int(now.timestamp()), + }, 200 + + if kosync_doc is None: + kosync_doc = KosyncDocument( + document_hash=doc_hash, + progress=progress, + percentage=percentage, + device=device, + device_id=device_id, + timestamp=now, + ) + logger.info(f"KOSync: New document tracked: {doc_hash[:8]}... from device '{device}'") + else: + logger.info( + f"KOSync: Received progress from '{device}' for doc {doc_hash[:8]}... 
+            kosync_doc.progress = progress
+            kosync_doc.percentage = percentage
+            kosync_doc.device = device
+            kosync_doc.device_id = device_id
+            kosync_doc.timestamp = now
+
+        self._db.save_kosync_document(kosync_doc)
+
+        # Update linked book if exists
+        linked_book = None
+        if kosync_doc.linked_abs_id:
+            linked_book = self._db.get_book_by_abs_id(kosync_doc.linked_abs_id)
+        else:
+            linked_book = self._db.get_book_by_kosync_id(doc_hash)
+            if linked_book:
+                self._db.link_kosync_document(doc_hash, linked_book.id, linked_book.abs_id)
+
+        # Auto-discovery: try to match an unknown hash to a library ebook
+        if not linked_book:
+            auto_create = os.environ.get("AUTO_CREATE_EBOOK_MAPPING", "true").lower() == "true"
+            if auto_create and self.start_discovery_if_available(doc_hash):
+                threading.Thread(target=self.run_put_auto_discovery, args=(doc_hash,), daemon=True).start()
+
+        if linked_book:
+            # Flag activity on paused/DNF books
+            if linked_book.status in ("paused", "dnf", "not_started") and not linked_book.activity_flag:
+                linked_book.activity_flag = True
+                self._db.save_book(linked_book)
+                logger.info(f"KOSync PUT: Activity detected on {linked_book.status} book '{linked_book.title}'")
+
+            logger.debug(f"KOSync: Updated linked book '{linked_book.title}' to {percentage:.2%}")
+
+            # Debounce sync trigger
+            is_internal = device and device.lower() in INTERNAL_DEVICE_NAMES
+            instant_sync_enabled = os.environ.get("INSTANT_SYNC_ENABLED", "true").lower() != "false"
+            if linked_book.status == "active" and self._manager and not is_internal and instant_sync_enabled:
+                if debounce_manager:
+                    logger.debug(f"KOSync PUT: Progress event recorded for '{linked_book.title}'")
+                    debounce_manager.record_event(linked_book.id, linked_book.title)
+
+        response_timestamp = now.isoformat() + "Z"
+        if device and device.lower() == "booknexus":
+            response_timestamp = int(now.timestamp())
+
+        return {"document": doc_hash, "timestamp": response_timestamp}, 200
+
+    def handle_get_progress(self, doc_id, remote_addr):
+        """Process a KoSync GET progress request. Returns (response_dict, status_code)."""
+        import os
+
+        if len(doc_id) > 64:
+            return {"error": "Document ID too long"}, 400
+
+        logger.info(f"KOSync: GET progress for doc {doc_id[:8]}... from {remote_addr}")
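+        # Lookup order mirrors the numbered steps below: direct hash doc,
+        # book by kosync_doc_id, sibling-hash resolution, then stub creation
+        # plus background auto-discovery for unknown hashes.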
+
+        # Step 1: Direct hash lookup
+        kosync_doc = self._db.get_kosync_document(doc_id)
+        if kosync_doc:
+            if kosync_doc.linked_abs_id:
+                book = self._db.get_book_by_abs_id(kosync_doc.linked_abs_id)
+                if book:
+                    return self.resolve_best_progress(doc_id, book)
+
+            has_progress = kosync_doc.percentage and float(kosync_doc.percentage) > 0
+            if has_progress:
+                return self.serialize_progress(kosync_doc, device_default=""), 200
+
+        # Step 2: Book lookup by kosync_doc_id
+        book = self._db.get_book_by_kosync_id(doc_id)
+        if book:
+            return self.resolve_best_progress(doc_id, book)
+
+        # Step 3: Sibling hash resolution
+        resolved_book = self.resolve_book_by_sibling_hash(doc_id, existing_doc=kosync_doc)
+        if resolved_book:
+            self.register_hash_for_book(doc_id, resolved_book)
+            return self.resolve_best_progress(doc_id, resolved_book)
+
+        # Step 4: Unknown hash — register stub and start background discovery
+        auto_create = os.environ.get("AUTO_CREATE_EBOOK_MAPPING", "true").lower() == "true"
+        if auto_create and self.start_discovery_if_available(doc_id):
+            stub = KosyncDocument(document_hash=doc_id)
+            self._db.save_kosync_document(stub)
+            logger.info(f"KOSync: Created stub for unknown hash {doc_id[:8]}..., starting background discovery")
+            threading.Thread(target=self.run_get_auto_discovery, args=(doc_id,), daemon=True).start()
+
+        logger.warning(f"KOSync: Document not found: {doc_id[:8]}... (GET from {remote_addr})")
+        return {"message": "Document not found on server"}, 502
+
+    def resolve_best_progress(self, doc_id, book):
+        """Find the best progress data for a book across sibling docs and states.
+
+        Returns (response_dict, status_code).
+        """
+        import time
+
+        states = self._db.get_states_for_book(book.id)
+
+        sibling_docs = self._db.get_kosync_documents_for_book_by_book_id(book.id)
+        now_ts = time.time()
+        docs_with_progress = [
+            d
+            for d in sibling_docs
+            if d.percentage
+            and float(d.percentage) > 0
+            and d.timestamp
+            and (now_ts - d.timestamp.timestamp()) < 30 * 86400
+        ]
+        if not docs_with_progress:
+            docs_with_progress = [d for d in sibling_docs if d.percentage and float(d.percentage) > 0 and d.timestamp]
+        if docs_with_progress:
+            best_doc = max(docs_with_progress, key=lambda d: float(d.percentage))
+            logger.info(
+                f"KOSync: Resolved {doc_id[:8]}... to '{book.title}' via sibling hash {best_doc.document_hash[:8]}... 
({float(best_doc.percentage):.2%})" + ) + return self.serialize_progress(best_doc, doc_id), 200 + + if not states: + return {"message": "Document not found on server"}, 502 + + kosync_state = next((s for s in states if s.client_name.lower() == "kosync"), None) + latest_state = kosync_state or max(states, key=lambda s: s.last_updated if s.last_updated else 0) + + return { + "device": "pagekeeper", + "device_id": "pagekeeper", + "document": doc_id, + "percentage": float(latest_state.percentage) if latest_state.percentage else 0, + "progress": (latest_state.xpath or latest_state.cfi) if hasattr(latest_state, "xpath") else "", + "timestamp": int(latest_state.last_updated) if latest_state.last_updated else 0, + }, 200 diff --git a/src/services/progress_reset_service.py b/src/services/progress_reset_service.py index 371e025..da17b96 100644 --- a/src/services/progress_reset_service.py +++ b/src/services/progress_reset_service.py @@ -151,7 +151,7 @@ def _finalize_clear_status(self, book_id): return book = self.database_service.get_book_by_ref(book_id) - has_alignment = bool(book and self.alignment_service and self.alignment_service.has_alignment(book.abs_id)) + has_alignment = bool(book and self.alignment_service and self.alignment_service.has_alignment(book.id)) if has_alignment: logger.info(f" Alignment map exists for '{sanitize_log_data(str(book_id))}' — no re-transcription needed") else: @@ -186,7 +186,7 @@ def _reset_external_clients(self, book_id): 'message': 'Reset to 0%' if result.success else 'Failed to reset' } if result.success: - record_write(client_name, book.abs_id) + record_write(client_name, book.id) logger.info(f"Reset '{client_name}' to 0%") else: logger.warning(f"Failed to reset '{client_name}'") diff --git a/src/services/reading_service.py b/src/services/reading_service.py index 602da36..b72875a 100644 --- a/src/services/reading_service.py +++ b/src/services/reading_service.py @@ -38,7 +38,8 @@ def pull_started_at(self, book_id, container): try: dates = container.reading_date_service().pull_reading_dates(book_id) return dates.get('started_at', date.today().isoformat()) - except Exception: + except Exception as e: + logger.warning("Could not pull started_at for book_id=%s, defaulting to today: %s", book_id, e) return date.today().isoformat() def update_status(self, book_id, new_status, container, *, allowed_from=None): diff --git a/src/services/reading_stats_service.py b/src/services/reading_stats_service.py index b584c9d..580bcf9 100644 --- a/src/services/reading_stats_service.py +++ b/src/services/reading_stats_service.py @@ -27,7 +27,7 @@ def get_year_stats(self, year: int) -> dict: ] states_by_book = {} for state in self.database_service.get_all_states(): - states_by_book.setdefault(state.abs_id, []).append(state) + states_by_book.setdefault(state.book_id, []).append(state) monthly_finished = [0] * 12 books_finished = 0 @@ -35,7 +35,7 @@ def get_year_stats(self, year: int) -> dict: ratings = [] for book in books: - progress = self._max_progress_percent(states_by_book.get(book.abs_id, [])) + progress = self._max_progress_percent(states_by_book.get(book.id, [])) if self._is_genuinely_reading(book, progress): currently_reading += 1 diff --git a/src/services/storyteller_submission_service.py b/src/services/storyteller_submission_service.py index 30d7028..244de72 100644 --- a/src/services/storyteller_submission_service.py +++ b/src/services/storyteller_submission_service.py @@ -127,15 +127,15 @@ def submit_book(self, abs_id: str, title: str, ebook_path: Path, audio_files: li # 
Update existing reservation or create new submission record from src.db.models import StorytellerSubmission - existing = self.database_service.get_active_storyteller_submission(abs_id) + book = self.database_service.get_book_by_abs_id(abs_id) + book_id = book.id if book else None + existing = self.database_service.get_active_storyteller_submission_by_book_id(book_id) if book_id else None if existing: submission = existing self.database_service.update_storyteller_submission_status( existing.id, "queued", submission_dir=dir_name, ) else: - book = self.database_service.get_book_by_abs_id(abs_id) - book_id = book.id if book else None submission = StorytellerSubmission( abs_id=abs_id, book_id=book_id, @@ -165,7 +165,8 @@ def check_status(self, abs_id: str) -> str: Returns: 'queued', 'processing', 'ready', 'failed', or 'not_found'. """ - submission = self.database_service.get_storyteller_submission(abs_id) + book = self.database_service.get_book_by_abs_id(abs_id) + submission = self.database_service.get_storyteller_submission_by_book_id(book.id) if book else None if not submission: return "not_found" @@ -307,7 +308,10 @@ def _update_submission_status(self, submission, new_status: str): def get_submission(self, abs_id: str): """Get the most recent submission for a book, if any.""" - return self.database_service.get_storyteller_submission(abs_id) + book = self.database_service.get_book_by_abs_id(abs_id) + if book: + return self.database_service.get_storyteller_submission_by_book_id(book.id) + return None def _trigger_processing_after_import(self, title: str, submission) -> str | None: """Wait for Storyteller to detect imported files, then trigger processing. diff --git a/src/services/suggestion_service.py b/src/services/suggestion_service.py index 69a02af..c024f63 100644 --- a/src/services/suggestion_service.py +++ b/src/services/suggestion_service.py @@ -30,14 +30,16 @@ class SuggestionService: """Handles suggestion discovery and creation for unmapped books.""" - def __init__(self, - database_service, - abs_client, - booklore_client, - storyteller_client, - library_service, - books_dir, - ebook_parser): + def __init__( + self, + database_service, + abs_client, + booklore_client, + storyteller_client, + library_service, + books_dir, + ebook_parser, + ): self.database_service = database_service self.abs_client = abs_client self.booklore_client = booklore_client @@ -77,21 +79,23 @@ def _normalize_title(self, title: str | None) -> str: if not title: return "" title = clean_book_title(title) - title = re.sub(r'\s*[\(\[].*?[\)\]]', '', title) - title = re.sub(r'\.(epub|mobi|azw3?|pdf|fb2|cbz|cbr|md)$', '', title, flags=re.IGNORECASE) - title = re.sub(r'[^\w\s]', ' ', title.lower()) - return ' '.join(title.split()) + title = re.sub(r"\s*[\(\[].*?[\)\]]", "", title) + title = re.sub(r"\.(epub|mobi|azw3?|pdf|fb2|cbz|cbr|md)$", "", title, flags=re.IGNORECASE) + title = re.sub(r"[^\w\s]", " ", title.lower()) + return " ".join(title.split()) def _normalize_author(self, author: str | None) -> str: if not author: return "" - author = re.sub(r'[^\w\s,]', ' ', author.lower()) - return ' '.join(author.split()) + author = re.sub(r"[^\w\s,]", " ", author.lower()) + return " ".join(author.split()) def _extract_title_numbers(self, normalized_title: str) -> set[str]: return {token for token in normalized_title.split() if token.isdigit()} - def _compute_match_score(self, source_title: str, source_author: str, candidate_title: str, candidate_author: str) -> tuple[float, list[str]]: + def _compute_match_score( + self, 
source_title: str, source_author: str, candidate_title: str, candidate_author: str + ) -> tuple[float, list[str]]: norm_source_title = self._normalize_title(source_title) norm_source_author = self._normalize_author(source_author) norm_candidate_title = self._normalize_title(candidate_title) @@ -120,7 +124,9 @@ def _compute_match_score(self, source_title: str, source_author: str, candidate_ author_score = max(author_score, _AUTHOR_PARTIAL_MATCH_FLOOR) evidence.append("author_partial") - score = (title_score * _TITLE_WEIGHT) + (author_score * _AUTHOR_WEIGHT if norm_source_author and norm_candidate_author else 0.0) + score = (title_score * _TITLE_WEIGHT) + ( + author_score * _AUTHOR_WEIGHT if norm_source_author and norm_candidate_author else 0.0 + ) source_numbers = self._extract_title_numbers(norm_source_title) candidate_numbers = self._extract_title_numbers(norm_candidate_title) @@ -144,15 +150,15 @@ def _get_bookfusion_context(self) -> dict: bf_books = [] try: - linked_abs_ids = list(self.database_service.get_bookfusion_linked_abs_ids() or []) + linked_book_ids = list(self.database_service.get_bookfusion_linked_book_ids() or []) except TypeError: - linked_abs_ids = [] + linked_book_ids = [] - visible_books = [b for b in bf_books if not getattr(b, 'hidden', False)] + visible_books = [b for b in bf_books if not getattr(b, "hidden", False)] by_title_author = {} by_title = {} for book in visible_books: - if book.matched_abs_id: + if book.matched_book_id: continue norm_title = self._normalize_title(book.title or book.filename or "") norm_author = self._normalize_author(book.authors or "") @@ -163,7 +169,7 @@ def _get_bookfusion_context(self) -> dict: by_title.setdefault(norm_title, []).append(book) return { "books": visible_books, - "linked_abs_ids": linked_abs_ids, + "linked_book_ids": linked_book_ids, "by_title_author": by_title_author, "by_title": by_title, "has_catalog": bool(visible_books), @@ -194,28 +200,32 @@ def request_rescan_library_suggestions(self, force: bool = False) -> dict: last_finished_at = self._rescan_status.get("last_finished_at") or 0 elapsed = time.time() - last_finished_at if last_finished_at else min_interval if not force and elapsed < min_interval: - self._rescan_status.update({ - "running": False, - "queued": False, - "rate_limited": True, - "next_allowed_in": max(0, min_interval - int(elapsed)), - "message": "Rescan recently completed. Please wait before running it again.", - }) + self._rescan_status.update( + { + "running": False, + "queued": False, + "rate_limited": True, + "next_allowed_in": max(0, min_interval - int(elapsed)), + "message": "Rescan recently completed. 
Please wait before running it again.", + } + ) return dict(self._rescan_status) - self._rescan_status.update({ - "running": True, - "queued": True, - "rate_limited": False, - "next_allowed_in": 0, - "last_started_at": time.time(), - "phase": "queued", - "message": "Queued suggestions rescan.", - "created": 0, - "updated": 0, - "deleted": 0, - "total": 0, - }) + self._rescan_status.update( + { + "running": True, + "queued": True, + "rate_limited": False, + "next_allowed_in": 0, + "last_started_at": time.time(), + "phase": "queued", + "message": "Queued suggestions rescan.", + "created": 0, + "updated": 0, + "deleted": 0, + "total": 0, + } + ) self._rescan_thread = threading.Thread( target=self._run_rescan_job, daemon=True, @@ -224,7 +234,9 @@ def request_rescan_library_suggestions(self, force: bool = False) -> dict: self._rescan_thread.start() return dict(self._rescan_status) - def _build_library_candidates(self, bookfusion_context: dict | None = None, include_filesystem: bool = True) -> list[dict]: + def _build_library_candidates( + self, bookfusion_context: dict | None = None, include_filesystem: bool = True + ) -> list[dict]: candidates = [] seen = set() @@ -233,23 +245,25 @@ def _build_library_candidates(self, bookfusion_context: dict | None = None, incl if bl_client and bl_client.is_configured(): try: for book in bl_client.get_all_books() or []: - filename = book.get('fileName', '') - if not filename or not filename.lower().endswith('.epub'): + filename = book.get("fileName", "") + if not filename or not filename.lower().endswith(".epub"): continue dedupe_key = ("booklore", filename.lower()) if dedupe_key in seen: continue seen.add(dedupe_key) - candidates.append({ - "source_family": "booklore", - "source": "booklore", - "source_key": f"booklore:{filename}", - "title": book.get('title') or Path(filename).stem, - "author": book.get('authors') or '', - "filename": filename, - "id": str(book.get('id') or ''), - "action_kind": "create_mapping", - }) + candidates.append( + { + "source_family": "booklore", + "source": "booklore", + "source_key": f"booklore:{filename}", + "title": book.get("title") or Path(filename).stem, + "author": book.get("authors") or "", + "filename": filename, + "id": str(book.get("id") or ""), + "action_kind": "create_mapping", + } + ) except Exception as e: logger.warning(f"Booklore cache scan failed during suggestions rescan: {e}") @@ -263,16 +277,18 @@ def _build_library_candidates(self, bookfusion_context: dict | None = None, incl if dedupe_key in seen: continue seen.add(dedupe_key) - candidates.append({ - "source_family": "filesystem", - "source": "filesystem", - "source_key": f"filesystem:{epub.name}", - "title": epub.stem, - "author": '', - "filename": epub.name, - "path": str(epub), - "action_kind": "create_mapping", - }) + candidates.append( + { + "source_family": "filesystem", + "source": "filesystem", + "source_key": f"filesystem:{epub.name}", + "title": epub.stem, + "author": "", + "filename": epub.name, + "path": str(epub), + "action_kind": "create_mapping", + } + ) if idx % batch_size == 0: self._update_rescan_status( phase="loading_filesystem", @@ -297,21 +313,25 @@ def _build_library_candidates(self, bookfusion_context: dict | None = None, incl last_highlighted_at = highlight_range[1].astimezone(UTC).isoformat() except Exception: last_highlighted_at = highlight_range[1].isoformat() - candidates.append({ - "source_family": "bookfusion", - "source": "bookfusion", - "source_key": f"bookfusion:{book.bookfusion_id}", - "title": book.title or book.filename 
or '', - "author": book.authors or '', - "bookfusion_ids": [book.bookfusion_id], - "highlight_count": book.highlight_count or 0, - "last_highlighted_at": last_highlighted_at, - "action_kind": "link_existing", - }) + candidates.append( + { + "source_family": "bookfusion", + "source": "bookfusion", + "source_key": f"bookfusion:{book.bookfusion_id}", + "title": book.title or book.filename or "", + "author": book.authors or "", + "bookfusion_ids": [book.bookfusion_id], + "highlight_count": book.highlight_count or 0, + "last_highlighted_at": last_highlighted_at, + "action_kind": "link_existing", + } + ) return candidates - def _apply_bookfusion_evidence(self, source_title: str, source_author: str, match: dict, bookfusion_context: dict) -> dict: + def _apply_bookfusion_evidence( + self, source_title: str, source_author: str, match: dict, bookfusion_context: dict + ) -> dict: evidence = list(match.get("evidence") or []) score = float(match.get("score") or 0.0) norm_title = self._normalize_title(source_title) @@ -341,7 +361,9 @@ def _apply_bookfusion_evidence(self, source_title: str, source_author: str, matc match["confidence"] = self._score_to_confidence(match["score"]) return match - def _rank_candidates_for_book(self, source_title: str, source_author: str, candidates: list[dict], bookfusion_context: dict | None = None) -> list[dict]: + def _rank_candidates_for_book( + self, source_title: str, source_author: str, candidates: list[dict], bookfusion_context: dict | None = None + ) -> list[dict]: ranked = [] for candidate in candidates: score, evidence = self._compute_match_score( @@ -364,7 +386,10 @@ def _rank_candidates_for_book(self, source_title: str, source_author: str, candi match = self._apply_bookfusion_evidence(source_title, source_author, match, bookfusion_context) ranked.append(match) - ranked.sort(key=lambda m: (m.get("score", 0.0), m.get("source_family") == "booklore", m.get("highlight_count", 0)), reverse=True) + ranked.sort( + key=lambda m: (m.get("score", 0.0), m.get("source_family") == "booklore", m.get("highlight_count", 0)), + reverse=True, + ) return ranked[:6] def queue_suggestion(self, abs_id: str) -> None: @@ -397,15 +422,17 @@ def check_for_suggestions(self, abs_progress_map, active_books): all_books = self.database_service.get_all_books() mapped_ids = {b.abs_id for b in all_books} - logger.debug(f"Checking for suggestions: {len(abs_progress_map)} books with progress, {len(mapped_ids)} already mapped") + logger.debug( + f"Checking for suggestions: {len(abs_progress_map)} books with progress, {len(mapped_ids)} already mapped" + ) for abs_id, item_data in abs_progress_map.items(): if abs_id in mapped_ids: logger.debug(f"Skipping {abs_id}: already mapped") continue - duration = item_data.get('duration', 0) - current_time = item_data.get('currentTime', 0) + duration = item_data.get("duration", 0) + current_time = item_data.get("currentTime", 0) if duration > 0: pct = current_time / duration @@ -461,10 +488,10 @@ def _check_reverse_suggestions(self): # Index audiobooks by cleaned title for fuzzy matching abs_by_title: dict[str, list[dict]] = {} for ab in all_audiobooks: - meta = ab.get('media', {}).get('metadata', {}) - title = meta.get('title', '') + meta = ab.get("media", {}).get("metadata", {}) + title = meta.get("title", "") if title: - clean = re.sub(r'\s*[\(\[].*?[\)\]]', '', title).strip().lower() + clean = re.sub(r"\s*[\(\[].*?[\)\]]", "", title).strip().lower() if clean: abs_by_title.setdefault(clean, []).append(ab) @@ -473,15 +500,15 @@ def 
_check_reverse_suggestions(self): try: positions = self.storyteller_client.get_all_positions_bulk() for title_lower, pos_data in positions.items(): - pct = pos_data.get('pct', 0) - uuid = pos_data.get('uuid') + pct = pos_data.get("pct", 0) + uuid = pos_data.get("uuid") if not uuid or pct < 0.01 or pct > 0.70: continue if uuid in mapped_storyteller_uuids: continue # Search ABS for a matching audiobook - clean_title = re.sub(r'\s*[\(\[].*?[\)\]]', '', title_lower).strip().lower() + clean_title = re.sub(r"\s*[\(\[].*?[\)\]]", "", title_lower).strip().lower() matches = self._find_abs_audiobook_matches(clean_title, abs_by_title, mapped_abs_ids) if matches: self._save_reverse_suggestion(matches, clean_title, f"storyteller:{uuid}") @@ -493,8 +520,8 @@ def _check_reverse_suggestions(self): try: bl_books = self.booklore_client.get_all_books() for bl_book in bl_books: - title = bl_book.get('title', '') - filename = bl_book.get('fileName', '') + title = bl_book.get("title", "") + filename = bl_book.get("fileName", "") if not title: continue @@ -502,7 +529,7 @@ def _check_reverse_suggestions(self): if not pct_raw or pct_raw < 0.01 or pct_raw > 0.70: continue - clean_title = re.sub(r'\s*[\(\[].*?[\)\]]', '', title).strip().lower() + clean_title = re.sub(r"\s*[\(\[].*?[\)\]]", "", title).strip().lower() source_key = f"booklore:{filename}" matches = self._find_abs_audiobook_matches(clean_title, abs_by_title, mapped_abs_ids) if matches: @@ -519,24 +546,26 @@ def _find_abs_audiobook_matches(self, clean_title: str, abs_by_title: dict, mapp # Check for substring match in either direction if clean_title in indexed_title or indexed_title in clean_title: for ab in audiobooks: - ab_id = ab.get('id') + ab_id = ab.get("id") if ab_id in mapped_abs_ids: continue - meta = ab.get('media', {}).get('metadata', {}) - matches.append({ - "source": "abs_audiobook", - "abs_id": ab_id, - "title": meta.get('title'), - "author": meta.get('authorName'), - "confidence": "high" if clean_title == indexed_title else "medium", - }) + meta = ab.get("media", {}).get("metadata", {}) + matches.append( + { + "source": "abs_audiobook", + "abs_id": ab_id, + "title": meta.get("title"), + "author": meta.get("authorName"), + "confidence": "high" if clean_title == indexed_title else "medium", + } + ) return matches def _save_reverse_suggestion(self, matches: list[dict], title: str, source_key: str): """Save a reverse suggestion (ebook → audiobook) using the first ABS match as source_id.""" # Use the best ABS match as the anchor - best = next((m for m in matches if m.get('confidence') == 'high'), matches[0]) - abs_id = best['abs_id'] + best = next((m for m in matches if m.get("confidence") == "high"), matches[0]) + abs_id = best["abs_id"] if self.database_service.is_suggestion_ignored(abs_id): return @@ -550,10 +579,10 @@ def _save_reverse_suggestion(self, matches: list[dict], title: str, source_key: def _match_key(match): return ( - match.get('abs_id'), - match.get('source_key'), - match.get('title'), - match.get('author'), + match.get("abs_id"), + match.get("source_key"), + match.get("title"), + match.get("author"), ) for match in (existing.matches if existing else []) + matches_with_provenance: @@ -567,13 +596,13 @@ def _match_key(match): current = merged_matches[prior] merged_matches[prior] = { **current, - **{k: v for k, v in match.items() if v not in (None, '')}, + **{k: v for k, v in match.items() if v not in (None, "")}, } suggestion = PendingSuggestion( source_id=abs_id, - title=(existing.title if existing and existing.title else 
best.get('title', title)), - author=(existing.author if existing and existing.author else best.get('author')), + title=(existing.title if existing and existing.title else best.get("title", title)), + author=(existing.author if existing and existing.author else best.get("author")), cover_url=(existing.cover_url if existing and existing.cover_url else cover), matches_json=json.dumps(merged_matches), ) @@ -614,20 +643,11 @@ def rescan_library_suggestions(self) -> dict: if os.environ.get("SUGGESTIONS_ENABLED", "true").lower() != "true": return {"created": 0, "updated": 0, "deleted": 0, "total": 0, "bookfusion_catalog": False} - if not self.abs_client: - return {"created": 0, "updated": 0, "deleted": 0, "total": 0, "bookfusion_catalog": False} - - try: - self._update_rescan_status(phase="loading_abs", message="Loading ABS audiobooks...") - all_abs_books = self.abs_client.get_all_audiobooks() or [] - except Exception as e: - logger.warning(f"Suggestions rescan failed to load ABS audiobooks: {e}") - return {"created": 0, "updated": 0, "deleted": 0, "total": 0, "bookfusion_catalog": False} - mapped_ids = {b.abs_id for b in self.database_service.get_all_books()} existing_actionable = { - s.source_id: s for s in self.database_service.get_all_actionable_suggestions() - if getattr(s, 'source', 'abs') == 'abs' + s.source_id: s + for s in self.database_service.get_all_actionable_suggestions() + if getattr(s, "source", "abs") == "abs" } bookfusion_context = self._get_bookfusion_context() candidates = self._build_library_candidates(bookfusion_context=bookfusion_context, include_filesystem=True) @@ -635,58 +655,68 @@ def rescan_library_suggestions(self) -> dict: created = 0 updated = 0 kept_ids = set() - total_books = len(all_abs_books) - - self._update_rescan_status(phase="scoring", message=f"Scoring {total_books} ABS books...") - for idx, abs_book in enumerate(all_abs_books, start=1): - abs_id = abs_book.get('id') - if not abs_id or abs_id in mapped_ids or self.database_service.is_suggestion_ignored(abs_id): - continue - meta = abs_book.get('media', {}).get('metadata', {}) - title = meta.get('title') or '' - author = meta.get('authorName') or '' - matches = self._rank_candidates_for_book(title, author, candidates, bookfusion_context=bookfusion_context) + if self.abs_client: + try: + self._update_rescan_status(phase="loading_abs", message="Loading ABS audiobooks...") + all_abs_books = self.abs_client.get_all_audiobooks() or [] + except Exception as e: + logger.warning(f"Suggestions rescan failed to load ABS audiobooks: {e}") + all_abs_books = [] + + total_books = len(all_abs_books) + self._update_rescan_status(phase="scoring", message=f"Scoring {total_books} ABS books...") + for idx, abs_book in enumerate(all_abs_books, start=1): + abs_id = abs_book.get("id") + if not abs_id or abs_id in mapped_ids or self.database_service.is_suggestion_ignored(abs_id): + continue - if not matches: - continue + meta = abs_book.get("media", {}).get("metadata", {}) + title = meta.get("title") or "" + author = meta.get("authorName") or "" + matches = self._rank_candidates_for_book( + title, author, candidates, bookfusion_context=bookfusion_context + ) - kept_ids.add(abs_id) - existing = existing_actionable.get(abs_id) - suggestion = PendingSuggestion( - source_id=abs_id, - title=title, - author=author, - cover_url=f"/api/cover-proxy/{abs_id}", - matches_json=json.dumps(matches), - status='hidden' if existing and getattr(existing, 'status', None) in ('hidden', 'dismissed') else 'pending', - ) - if abs_id in 
existing_actionable: - updated += 1 - else: - created += 1 - self.database_service.save_pending_suggestion(suggestion) + if not matches: + continue - if idx % 25 == 0: - self._update_rescan_status( - phase="scoring", - message=f"Scoring ABS books... {idx}/{total_books}", - created=created, - updated=updated, + kept_ids.add(abs_id) + existing = existing_actionable.get(abs_id) + suggestion = PendingSuggestion( + source_id=abs_id, + title=title, + author=author, + cover_url=f"/api/cover-proxy/{abs_id}", + matches_json=json.dumps(matches), + status="hidden" if existing and getattr(existing, "status", None) == "hidden" else "pending", ) - time.sleep(0.01) + if abs_id in existing_actionable: + updated += 1 + else: + created += 1 + self.database_service.save_pending_suggestion(suggestion) + + if idx % 25 == 0: + self._update_rescan_status( + phase="scoring", + message=f"Scoring ABS books... {idx}/{total_books}", + created=created, + updated=updated, + ) + time.sleep(0.01) deleted = 0 - self._update_rescan_status(phase="cleanup", message="Cleaning stale suggestions...") - for source_id in list(existing_actionable.keys()): - if source_id not in kept_ids: - if self.database_service.resolve_suggestion(source_id): - deleted += 1 + if kept_ids: + self._update_rescan_status(phase="cleanup", message="Cleaning stale suggestions...") + for source_id in list(existing_actionable.keys()): + if source_id not in kept_ids: + if self.database_service.resolve_suggestion(source_id): + deleted += 1 total = len(self.database_service.get_all_actionable_suggestions()) logger.info( - "Suggestions rescan completed: created=%s updated=%s deleted=%s total=%s", - created, updated, deleted, total + "Suggestions rescan completed: created=%s updated=%s deleted=%s total=%s", created, updated, deleted, total ) return { "created": created, @@ -711,10 +741,10 @@ def _create_suggestion(self, abs_id, progress_data): logger.debug(f"Suggestion failed: Could not get details for {abs_id}") return - media = item.get('media', {}) - metadata = media.get('metadata', {}) - title = metadata.get('title') or '' - author = metadata.get('authorName') or '' + media = item.get("media", {}) + metadata = media.get("metadata", {}) + title = metadata.get("title") or "" + author = metadata.get("authorName") or "" cover = f"/api/cover-proxy/{abs_id}" logger.debug(f"Checking suggestions for '{title}' (Author: {author})") @@ -738,16 +768,18 @@ def _create_suggestion(self, abs_id, progress_data): filename = book.get("fileName", "") if not filename or not filename.lower().endswith(".epub"): continue - live_candidates.append({ - "source_family": "booklore", - "source": "booklore", - "source_key": f"booklore:{filename}", - "title": book.get("title") or Path(filename).stem, - "author": book.get("authors") or "", - "filename": filename, - "id": str(book.get("id") or ""), - "action_kind": "create_mapping", - }) + live_candidates.append( + { + "source_family": "booklore", + "source": "booklore", + "source_key": f"booklore:{filename}", + "title": book.get("title") or Path(filename).stem, + "author": book.get("authors") or "", + "filename": filename, + "id": str(book.get("id") or ""), + "action_kind": "create_mapping", + } + ) matches.extend( self._rank_candidates_for_book( title, @@ -759,7 +791,11 @@ def _create_suggestion(self, abs_id, progress_data): except Exception as e: logger.warning(f"Booklore live search failed during suggestion: {e}") - if self.library_service and self.library_service.cwa_client and self.library_service.cwa_client.is_configured(): + if ( + 
self.library_service + and self.library_service.cwa_client + and self.library_service.cwa_client.is_configured() + ): try: cwa_results = self.library_service.cwa_client.search_ebooks(query) for cr in cwa_results or []: @@ -767,8 +803,8 @@ def _create_suggestion(self, abs_id, progress_data): "source_family": "cwa", "source": "cwa", "source_key": f"cwa:{cr.get('id')}", - "title": cr.get('title'), - "author": cr.get('author'), + "title": cr.get("title"), + "author": cr.get("author"), "filename": f"cwa_{cr.get('id', 'unknown')}.{cr.get('ext', 'epub')}", "action_kind": "create_mapping", } @@ -795,11 +831,7 @@ def _create_suggestion(self, abs_id, progress_data): return suggestion = PendingSuggestion( - source_id=abs_id, - title=title, - author=author, - cover_url=cover, - matches_json=json.dumps(matches) + source_id=abs_id, title=title, author=author, cover_url=cover, matches_json=json.dumps(matches) ) self.database_service.save_pending_suggestion(suggestion) match_count = len(matches) diff --git a/src/services/write_tracker.py b/src/services/write_tracker.py index 9f7f98e..a1bffb0 100644 --- a/src/services/write_tracker.py +++ b/src/services/write_tracker.py @@ -1,8 +1,8 @@ """ Write-suppression tracker — prevents self-triggered feedback loops. -Call record_write(client_name, abs_id) after Stitch successfully pushes -progress to any client. Call is_own_write(client_name, abs_id) before acting +Call record_write(client_name, book_id) after Stitch successfully pushes +progress to any client. Call is_own_write(client_name, book_id) before acting on a progress change from that client to suppress round-trip echoes. Supported client_name values: 'ABS', 'Storyteller', 'BookLore', 'KoSync' @@ -51,9 +51,9 @@ def _states_match(recorded: dict | None, incoming: dict | None) -> bool: return True -def record_write(client_name: str, abs_id: str, state: dict | None = None) -> None: +def record_write(client_name: str, book_id, state: dict | None = None) -> None: """Call after Stitch successfully pushes progress to a client.""" - key = f"{client_name}:{abs_id}" + key = f"{client_name}:{book_id}" with _writes_lock: _recent_writes[key] = { 'timestamp': time.time(), @@ -61,9 +61,9 @@ def record_write(client_name: str, abs_id: str, state: dict | None = None) -> No } -def is_own_write(client_name: str, abs_id: str, suppression_window: int = _DEFAULT_SUPPRESSION_WINDOW, state: dict | None = None) -> bool: +def is_own_write(client_name: str, book_id, suppression_window: int = _DEFAULT_SUPPRESSION_WINDOW, state: dict | None = None) -> bool: """Return True if a recent progress event for this client/book was caused by our own write.""" - key = f"{client_name}:{abs_id}" + key = f"{client_name}:{book_id}" with _writes_lock: last_write = _recent_writes.get(key) if last_write and time.time() - last_write['timestamp'] < suppression_window: diff --git a/src/sync_clients/abs_ebook_sync_client.py b/src/sync_clients/abs_ebook_sync_client.py index fb79b98..cff5e3b 100644 --- a/src/sync_clients/abs_ebook_sync_client.py +++ b/src/sync_clients/abs_ebook_sync_client.py @@ -39,7 +39,7 @@ def get_service_state(self, book: Book, prev_state: State | None, title_snip: st abs_pct, abs_cfi = response.get('ebookProgress'), response.get('ebookLocation') if response is not None else None if abs_pct is None: - logger.warning("ABS ebook percentage is None - returning None for service state") + logger.debug("ABS ebook percentage is None - returning None for service state") return None # Get previous ABS ebook state diff --git 
a/src/sync_clients/abs_sync_client.py b/src/sync_clients/abs_sync_client.py index 946aa1c..d266d3d 100644 --- a/src/sync_clients/abs_sync_client.py +++ b/src/sync_clients/abs_sync_client.py @@ -91,7 +91,7 @@ def _abs_to_percentage(self, abs_seconds, book: Book): if transcript_path == TRANSCRIPT_DB_MANAGED: if self.alignment_service: - dur = self.alignment_service.get_book_duration(book.abs_id) + dur = self.alignment_service.get_book_duration(book.id) if dur: return min(max(abs_seconds / dur, 0.0), 1.0) return None @@ -120,7 +120,7 @@ def get_text_from_current_state(self, book: Book, state: ServiceState) -> str | # DB Managed (Unified Architecture) if book.transcript_file == TRANSCRIPT_DB_MANAGED and self.alignment_service: # Inverse lookup: Time -> Char -> Text - char_offset = self.alignment_service.get_char_for_time(book.abs_id, abs_ts) + char_offset = self.alignment_service.get_char_for_time(book.id, abs_ts) if char_offset is not None: # Need book text book_path = self.ebook_parser.resolve_book_path(book.ebook_filename) @@ -139,7 +139,7 @@ def get_text_from_current_state(self, book: Book, state: ServiceState) -> str | if not path.exists() and self.alignment_service: logger.warning(f"'{book.abs_id}' Legacy transcript file missing: '{path}' — Attempting DB fallback") # Try DB lookup - char_offset = self.alignment_service.get_char_for_time(book.abs_id, abs_ts) + char_offset = self.alignment_service.get_char_for_time(book.id, abs_ts) if char_offset is not None: logger.info(f"'{book.abs_id}' Found in DB despite missing file — Self-healing state") # We can't easily save the book here without circular dependency or passing DB service @@ -186,7 +186,7 @@ def update_progress(self, book: Book, request: UpdateProgressRequest) -> SyncRes char_index = request.locator_result.match_index if char_index is not None: ts_for_text = self.alignment_service.get_time_for_text( - book.abs_id, + book.id, char_offset_hint=char_index ) else: diff --git a/src/sync_clients/booklore_sync_client.py b/src/sync_clients/booklore_sync_client.py index b942e4c..b8909ad 100644 --- a/src/sync_clients/booklore_sync_client.py +++ b/src/sync_clients/booklore_sync_client.py @@ -37,6 +37,8 @@ def get_supported_sync_types(self) -> set: def get_service_state(self, book: Book, prev_state: State | None, title_snip: str = "", bulk_context: dict = None) -> ServiceState | None: # FIX: Use original filename if available (Tri-Link), otherwise standard filename epub = book.original_ebook_filename or book.ebook_filename + if not epub: + return None if bulk_context is not None: lookup_key = Path(epub).name.lower() if epub else '' @@ -49,7 +51,7 @@ def get_service_state(self, book: Book, prev_state: State | None, title_snip: st bl_pct, _ = self.booklore_client.get_progress(epub) if bl_pct is None: - logger.warning("BookLore percentage is None - returning None for service state") + logger.debug("BookLore percentage is None - returning None for service state") return None # Get previous BookLore state @@ -69,7 +71,7 @@ def get_service_state(self, book: Book, prev_state: State | None, title_snip: st def get_text_from_current_state(self, book: Book, state: ServiceState) -> str | None: bl_pct = state.current.get('pct') - epub = book.ebook_filename + epub = book.original_ebook_filename or book.ebook_filename if bl_pct is not None and epub and self.ebook_parser: return self.ebook_parser.get_text_at_percentage(epub, bl_pct) return None @@ -82,7 +84,7 @@ def update_progress(self, book: Book, request: UpdateProgressRequest) -> SyncRes if success: try: 
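+                # Suppression key is the canonical book.id; abs_id can be None
+                # for ebook-only books, which would collapse distinct books
+                # into one write-tracker entry.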
from src.services.write_tracker import record_write - record_write(self.client_name, book.abs_id) + record_write(self.client_name, book.id) except ImportError: pass updated_state = { diff --git a/src/sync_clients/hardcover_sync_client.py b/src/sync_clients/hardcover_sync_client.py index 2aae57f..037586c 100644 --- a/src/sync_clients/hardcover_sync_client.py +++ b/src/sync_clients/hardcover_sync_client.py @@ -129,7 +129,7 @@ def _handle_status_transition(self, book, hardcover_details, current_status, per return current_status hardcover_details.hardcover_status_id = new_status self.database_service.save_hardcover_details(hardcover_details) - record_write('Hardcover', book.abs_id, {'status': new_status}) + record_write('Hardcover', book.id, {'status': new_status}) status_names = {1: 'Want to Read', 2: 'Currently Reading', 3: 'Read', 4: 'Paused', 5: 'DNF'} log_hardcover_action( @@ -225,7 +225,7 @@ def update_progress(self, book: Book, request: UpdateProgressRequest) -> SyncRes 'status': current_status, } - record_write('Hardcover', book.abs_id, updated_state) + record_write('Hardcover', book.id, updated_state) return SyncResult(actual_pct, True, updated_state) except Exception as e: @@ -259,7 +259,7 @@ def _update_audiobook_progress(self, book, hardcover_details, ub, percentage, au 'status': current_status, } - record_write('Hardcover', book.abs_id, updated_state) + record_write('Hardcover', book.id, updated_state) return SyncResult(percentage, True, updated_state) except Exception as e: diff --git a/src/sync_clients/storyteller_sync_client.py b/src/sync_clients/storyteller_sync_client.py index c211629..5fbdc44 100644 --- a/src/sync_clients/storyteller_sync_client.py +++ b/src/sync_clients/storyteller_sync_client.py @@ -122,7 +122,7 @@ def update_progress(self, book: Book, request: UpdateProgressRequest) -> SyncRes if success: try: from src.services.write_tracker import record_write - record_write('Storyteller', book.abs_id) + record_write('Storyteller', book.id) except ImportError as e: logger.debug(f"Write tracker not available for Storyteller: {e}") else: diff --git a/src/sync_manager.py b/src/sync_manager.py index d62ebcf..01326ef 100644 --- a/src/sync_manager.py +++ b/src/sync_manager.py @@ -259,7 +259,7 @@ def cleanup_cache(self): except Exception as e: logger.error(f"Error during cache cleanup: {e}") - def get_abs_title(self, ab): + def get_audiobook_title(self, ab): media = ab.get('media', {}) metadata = media.get('metadata', {}) return metadata.get('title') or ab.get('name', 'Unknown') @@ -514,8 +514,8 @@ def _prepare_sync_books(self, target_book_id): """Fetch active books, pre-fetch bulk states, and trigger suggestions.""" active_books = [] if target_book_id: - logger.info(f"Instant Sync triggered for book_id={target_book_id}") book = self.database_service.get_book_by_id(target_book_id) + logger.info(f"Instant Sync triggered for '{sanitize_log_data(book.title)}'" if book else f"Instant Sync triggered for book_id={target_book_id} (not found)") if book and book.status == 'active': active_books = [book] else: @@ -541,13 +541,13 @@ def _prepare_sync_books(self, target_book_id): def _sync_single_book(self, book, bulk_states_per_client): """Process a single book in the sync cycle.""" - abs_id = book.abs_id + abs_id = book.abs_id or f"book-{book.id}" title_snip = sanitize_log_data(book.title or 'Unknown') logger.info(f"'{abs_id}' Syncing '{title_snip}'") # Migration upgrade if self.alignment_service: - alignment = self.alignment_service._get_alignment(abs_id) + alignment = 
self.alignment_service._get_alignment(book.id) if alignment: if getattr(book, 'transcript_file', None) != 'DB_MANAGED': logger.info(f" Upgrading '{title_snip}' to DB_MANAGED unified architecture") @@ -778,7 +778,7 @@ def _execute_sync_update(self, book, config, abs_id, title_snip, active_clients) logger.info(f"'{abs_id}' '{title_snip}' Updated state data for '{client_name}': {state_data}") try: from src.services.write_tracker import record_write - record_write(client_name, abs_id, state_data) + record_write(client_name, book.id, state_data) except ImportError: pass client_state_model = State( diff --git a/src/utils/constants.py b/src/utils/constants.py new file mode 100644 index 0000000..2a5bc47 --- /dev/null +++ b/src/utils/constants.py @@ -0,0 +1,13 @@ +"""Centralized constants for PageKeeper.""" + +# Bot device name used when syncing progress to ABS and KoSync +BOT_DEVICE_NAME = "pagekeeper-bot" + +# Device names recognized as internal (not real user devices) +INTERNAL_DEVICE_NAMES = frozenset({"pagekeeper-bot", "pagekeeper"}) + +# Default ABS collection name for synced books +DEFAULT_COLLECTION_NAME = "pagekeeper" + +# Default Booklore/Grimmory shelf name +DEFAULT_SHELF_NAME = "pagekeeper" diff --git a/src/utils/cover_resolver.py b/src/utils/cover_resolver.py index b84a8fc..957e312 100644 --- a/src/utils/cover_resolver.py +++ b/src/utils/cover_resolver.py @@ -1,8 +1,19 @@ """Cover URL resolution waterfall for book display.""" +def resolve_placeholder_logo(book, book_type, booklore_meta): + """Determine the placeholder logo based on a book's primary source.""" + if (book.abs_id or "").startswith("bf-"): + return "/static/bookfusion-logo.svg" + elif book_type == "ebook-only" and booklore_meta: + return "/static/booklore.png" + elif book.abs_id: + return "/static/audiobookshelf.png" + return None + + def resolve_book_covers(book, abs_service, database_service, book_type, - booklore_meta=None): + booklore_meta=None, hardcover_details=None): """Resolve cover URLs for a book using the priority waterfall. Priority chain: @@ -13,12 +24,12 @@ def resolve_book_covers(book, abs_service, database_service, book_type, ``fallback_cover_url``. Returns dict with 'cover_url', 'custom_cover_url', 'abs_cover_url', - 'fallback_cover_url'. + 'fallback_cover_url', 'placeholder_logo'. """ custom_cover_url = book.custom_cover_url or None abs_cover_url = None - if book.abs_id and book_type != 'ebook-only': - abs_cover_url = abs_service.get_cover_proxy_url(book.abs_id) + if book.abs_id and book_type != 'ebook-only' and not book.abs_id.startswith('bf-'): + abs_cover_url = f"/api/cover-proxy/{book.abs_id}" # Cover URL -- preserve custom override, otherwise walk the waterfall. 
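+    # Custom covers always win; the Hardcover lookup further down only runs
+    # when the rest of the waterfall produced nothing ('bf-' books skip the
+    # ABS proxy entirely).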
cover_url = custom_cover_url @@ -37,9 +48,9 @@ def resolve_book_covers(book, abs_service, database_service, book_type, # Hardcover cover fallback if not cover_url and book.id: - hc_details = database_service.get_hardcover_details(book.id) - if hc_details and hc_details.hardcover_cover_url: - cover_url = hc_details.hardcover_cover_url + hc = hardcover_details if hardcover_details is not None else database_service.get_hardcover_details(book.id) + if hc and hc.hardcover_cover_url: + cover_url = hc.hardcover_cover_url non_abs_cover_url = cover_url if not custom_cover_url and abs_cover_url: @@ -53,4 +64,5 @@ def resolve_book_covers(book, abs_service, database_service, book_type, 'custom_cover_url': custom_cover_url, 'abs_cover_url': abs_cover_url, 'fallback_cover_url': fallback_cover_url, + 'placeholder_logo': resolve_placeholder_logo(book, book_type, booklore_meta), } diff --git a/src/utils/debounce_manager.py b/src/utils/debounce_manager.py new file mode 100644 index 0000000..2dda61e --- /dev/null +++ b/src/utils/debounce_manager.py @@ -0,0 +1,81 @@ +""" +Debounce manager for PageKeeper. + +Debounces rapid KoSync PUT events before triggering sync cycles. +Records (book_id, title) events. A background polling thread fires +the sync callback once no new events arrive within the debounce window. +""" + +import logging +import os +import threading +import time + +logger = logging.getLogger(__name__) + + +class DebounceManager: + def __init__(self, database_service, manager, rate_limiter=None, poll_interval=10, stale_seconds=300): + self._db = database_service + self._manager = manager + self._rate_limiter = rate_limiter + self._poll_interval = poll_interval + self._stale_seconds = stale_seconds + + self._entries: dict[int, dict] = {} + self._lock = threading.Lock() + self._thread_started = False + + def record_event(self, book_id: int, title: str) -> None: + """Record a PUT event for debounced sync triggering.""" + with self._lock: + self._entries[book_id] = { + "last_event": time.time(), + "title": title, + "synced": False, + } + if not self._thread_started: + self._thread_started = True + threading.Thread(target=self._poll_loop, daemon=True).start() + + def _poll_loop(self) -> None: + """Check periodically for books that stopped receiving PUTs.""" + while True: + time.sleep(self._poll_interval) + debounce_seconds = int(os.environ.get("ABS_SOCKET_DEBOUNCE_SECONDS", "30")) + now = time.time() + to_sync = [] + + with self._lock: + for book_id, info in self._entries.items(): + if not info["synced"] and (now - info["last_event"]) > debounce_seconds: + info["synced"] = True + to_sync.append((book_id, info["title"])) + + for book_id, title in to_sync: + self._trigger_sync(book_id, title) + + # Clean up stale entries + with self._lock: + stale = [k for k, v in self._entries.items() if now - v["last_event"] > self._stale_seconds] + for k in stale: + del self._entries[k] + + # Prune stale rate-limit buckets + if self._rate_limiter: + self._rate_limiter.prune() + + def _trigger_sync(self, book_id: int, title: str) -> None: + """Trigger sync for a debounced book.""" + if not self._manager: + return + book = self._db.get_book_by_id(book_id) if self._db else None + if not book: + logger.warning(f"KOSync PUT: No book found for id={book_id} — skipping sync") + return + logger.info(f"KOSync PUT: Triggering sync for '{title}' (debounced)") + threading.Thread( + target=self._manager.sync_cycle, + kwargs={"target_book_id": book.id}, + daemon=True, + ).start() diff --git a/src/utils/ebook_utils.py 
b/src/utils/ebook_utils.py index 208a563..c91fa39 100644 --- a/src/utils/ebook_utils.py +++ b/src/utils/ebook_utils.py @@ -1,23 +1,26 @@ """ Ebook Utilities for PageKeeper + +EbookParser is the public facade for all EPUB operations. It owns book +resolution, hashing, text extraction/caching, and cover extraction, and +delegates XPath generation/resolution and text search to focused services. """ + import glob import hashlib import logging import os -import re import threading from collections import OrderedDict from pathlib import Path import ebooklib -import epubcfi -import rapidfuzz -from bs4 import BeautifulSoup, Tag +from bs4 import BeautifulSoup from ebooklib import epub -from lxml import html -from src.sync_clients.sync_client_interface import LocatorResult +from src.utils.koreader_xpath import KoReaderXPathService +from src.utils.locator_search import LocatorSearchService +from src.utils.path_utils import is_safe_path_within logger = logging.getLogger(__name__) @@ -49,27 +52,26 @@ def clear(self): class EbookParser: - CRENGINE_FRAGILE_INLINE_TAGS = { - "span", "em", "strong", "b", "i", "u", "a", "font", "small", "big", "sub", "sup" - } - CRENGINE_STRUCTURAL_TAGS = { - "p", "div", "section", "article", "blockquote", - "h1", "h2", "h3", "h4", "h5", "h6", - "li", "header", "footer", "aside", - "td", "th", "dt", "dd", "figcaption", "pre" - } - def __init__(self, books_dir, epub_cache_dir=None): self.books_dir = Path(books_dir) self.epub_cache_dir = Path(epub_cache_dir) if epub_cache_dir else Path("/data/epub_cache") cache_size = int(os.getenv("EBOOK_CACHE_SIZE", 3)) self.cache = LRUCache(capacity=cache_size) - self.fuzzy_threshold = int(os.getenv("FUZZY_MATCH_THRESHOLD", 80)) self.hash_method = os.getenv("KOSYNC_HASH_METHOD", "content").lower() self.useXpathSegmentFallback = os.getenv("XPATH_FALLBACK_TO_PREVIOUS_SEGMENT", "false").lower() == "true" - logger.info(f"EbookParser initialized (cache={cache_size}, hash={self.hash_method}, xpath_fallback={self.useXpathSegmentFallback})") + fuzzy_threshold = int(os.getenv("FUZZY_MATCH_THRESHOLD", 80)) + self._ko_xpath = KoReaderXPathService() + self._locator = LocatorSearchService(fuzzy_threshold=fuzzy_threshold) + + logger.info( + f"EbookParser initialized (cache={cache_size}, hash={self.hash_method}, xpath_fallback={self.useXpathSegmentFallback})" + ) + + # ========================================================================= + # Book path resolution + # ========================================================================= def resolve_book_path(self, filename): try: @@ -78,28 +80,28 @@ def resolve_book_path(self, filename): except StopIteration: pass - for f in self.books_dir.rglob("*"): - if f.name == filename: - return f - if self.epub_cache_dir.exists(): cached_path = self.epub_cache_dir / filename - if cached_path.exists(): + if is_safe_path_within(cached_path, self.epub_cache_dir) and cached_path.exists(): return cached_path raise FileNotFoundError(f"Could not locate {filename}") + # ========================================================================= + # KOReader hashing + # ========================================================================= + def get_kosync_id(self, filepath): filepath = Path(filepath) if self.hash_method == "filename": - return hashlib.md5(filepath.name.encode('utf-8')).hexdigest() + return hashlib.md5(filepath.name.encode("utf-8")).hexdigest() md5 = hashlib.md5() try: file_size = os.path.getsize(filepath) - with open(filepath, 'rb') as f: + with open(filepath, "rb") as f: for i in range(-1, 11): - 
offset = 0 if i == -1 else 1024 * (4 ** i) + offset = 0 if i == -1 else 1024 * (4**i) if offset >= file_size: break f.seek(offset) @@ -117,11 +119,12 @@ def _compute_koreader_hash_from_bytes(self, content): try: file_size = len(content) for i in range(-1, 11): - offset = 0 if i == -1 else 1024 * (4 ** i) - if offset >= file_size: break - - chunk = content[offset:offset + 1024] - if not chunk: break + offset = 0 if i == -1 else 1024 * (4**i) + if offset >= file_size: + break + chunk = content[offset : offset + 1024] + if not chunk: + break md5.update(chunk) return md5.hexdigest() except Exception as e: @@ -130,30 +133,24 @@ def _compute_koreader_hash_from_bytes(self, content): def get_kosync_id_from_bytes(self, filename, content): if self.hash_method == "filename": - return hashlib.md5(filename.encode('utf-8')).hexdigest() + return hashlib.md5(filename.encode("utf-8")).hexdigest() return self._compute_koreader_hash_from_bytes(content) + # ========================================================================= + # Cover extraction + # ========================================================================= + def extract_cover(self, filepath, output_path): - """ - Extract cover image from EPUB to output_path. - Returns True if successful, False otherwise. - """ + """Extract cover image from EPUB to output_path. Returns True if successful.""" try: filepath = Path(filepath) - # 1. Try to get cover from metadata using ebooklib try: book = epub.read_epub(str(filepath)) - # Check for cover item cover_item = None - # Method A: get_item_with_id('cover') or similar - # ebooklib doesn't have a standard 'get_cover' but often it's in the manifest - - # Method B: Iterate items for item in book.get_items(): if item.get_type() == ebooklib.ITEM_IMAGE: - # naive check: is it named "cover"? - if 'cover' in item.get_name().lower(): + if "cover" in item.get_name().lower(): cover_item = item break if item.get_type() == ebooklib.ITEM_COVER: @@ -161,27 +158,27 @@ def extract_cover(self, filepath, output_path): break if cover_item: - with open(output_path, 'wb') as f: + with open(output_path, "wb") as f: f.write(cover_item.get_content()) logger.debug(f"Extracted cover for {filepath.name}") return True except Exception as e: logger.debug(f"ebooklib cover extraction failed for {filepath.name}: {e}") - # 2. Fallback: ZipFile (if ebooklib fails or returns nothing) - # (ebooklib is basically a zip wrapper anyway, but sometimes direct zip access is easier if we just want the file) - # For now, let's stick to the attempt above. If valid EPUB, ebooklib should handle it. - return False except Exception as e: logger.error(f"Error extracting cover from '{filepath}': {e}") return False + # ========================================================================= + # Text extraction and caching + # ========================================================================= + def extract_text_and_map(self, filepath, progress_callback=None): """ - Used for fuzzy matching and general content extraction. - Uses BeautifulSoup. + Parse EPUB into full text + spine map. Results are cached. + Uses BeautifulSoup for text extraction. 
""" filepath = Path(filepath) if not filepath.exists(): @@ -190,8 +187,9 @@ def extract_text_and_map(self, filepath, progress_callback=None): cached = self.cache.get(str_path) if cached: - if progress_callback: progress_callback(1.0) - return cached['text'], cached['map'] + if progress_callback: + progress_callback(1.0) + return cached["text"], cached["map"] logger.info(f"Parsing EPUB: {filepath.name}") @@ -209,26 +207,28 @@ def extract_text_and_map(self, filepath, progress_callback=None): item = book.get_item_with_id(item_ref[0]) if item.get_type() == ebooklib.ITEM_DOCUMENT: - soup = BeautifulSoup(item.get_content(), 'html.parser') - text = soup.get_text(separator=' ', strip=True) + soup = BeautifulSoup(item.get_content(), "html.parser") + text = soup.get_text(separator=" ", strip=True) start = current_idx length = len(text) end = current_idx + length - spine_map.append({ - "start": start, - "end": end, - "spine_index": i + 1, - "href": item.get_name(), - "content": item.get_content() - }) + spine_map.append( + { + "start": start, + "end": end, + "spine_index": i + 1, + "href": item.get_name(), + "content": item.get_content(), + } + ) full_text_parts.append(text) current_idx = end + 1 combined_text = " ".join(full_text_parts) - self.cache.put(str_path, {'text': combined_text, 'map': spine_map}) + self.cache.put(str_path, {"text": combined_text, "map": spine_map}) return combined_text, spine_map except Exception as e: @@ -245,7 +245,6 @@ def get_text_at_percentage(self, filename, percentage): return None target_pos = int(len(full_text) * percentage) - # Grab a window of text around the calculated character position start = max(0, target_pos - 400) end = min(len(full_text), target_pos + 400) @@ -254,887 +253,80 @@ def get_text_at_percentage(self, filename, percentage): logger.error(f"Error getting text at percentage: {e}") return None - def get_character_delta(self, filename, percentage_prev, percentage_new): - """Calculate character difference between two percentages.""" - try: - book_path = self.resolve_book_path(filename) - full_text, _ = self.extract_text_and_map(book_path) - if not full_text: - return None - total_len = len(full_text) - return abs(int(total_len * percentage_prev) - int(total_len * percentage_new)) - except Exception as e: - logger.error(f"Error calculating character delta: {e}") - return None - # ========================================================================= - # STORYTELLER / READIUM / GENERAL UTILS - # Uses BeautifulSoup for broad compatibility + # Delegated: KOReader XPath generation/resolution # ========================================================================= - def resolve_locator_id(self, filename, href, fragment_id): - """ - Returns a text snippet starting at the element identified by href + #fragment_id. - Useful for syncing from Storyteller or any Readium-based reader that uses DOM IDs. 
- """ + def get_perfect_ko_xpath(self, filename, position=0) -> str | None: try: book_path = self.resolve_book_path(filename) full_text, spine_map = self.extract_text_and_map(book_path) - - target_item = None - for item in spine_map: - if href in item['href'] or item['href'] in href: - target_item = item - break - - if not target_item: return None - - soup = BeautifulSoup(target_item['content'], 'html.parser') - clean_id = fragment_id.lstrip('#') - element = soup.find(id=clean_id) - - if not element: return None - - current_offset = 0 - found_offset = -1 - all_strings = soup.find_all(string=True) - - for s in all_strings: - if s.parent == element or element in s.parents: - found_offset = current_offset - break - text_len = len(s.strip()) - if text_len == 0: - continue - current_offset += text_len - - if found_offset == -1: - # Fallback - elem_text = element.get_text(separator=' ', strip=True) - chapter_text = soup.get_text(separator=' ', strip=True) - found_offset = chapter_text.find(elem_text) - - if found_offset == -1: return None - - global_offset = target_item['start'] + found_offset - start = max(0, global_offset) - end = min(len(full_text), global_offset + 500) - return full_text[start:end] - + if not full_text or not spine_map: + return None + return self._ko_xpath.generate_xpath(full_text, spine_map, position) except Exception as e: - logger.error(f"Error resolving locator ID '{fragment_id}' in '{filename}': {e}") + logger.error(f"Error generating KOReader XPath: {e}") return None - def _generate_css_selector(self, target_tag): - """Generate a Readium-compatible CSS selector.""" - if not target_tag: return "" - segments = [] - curr = target_tag - while curr and curr.name != '[document]': - if not isinstance(curr, Tag): - curr = curr.parent - continue - index = 1 - sibling = curr.previous_sibling - while sibling: - if isinstance(sibling, Tag): - index += 1 - sibling = sibling.previous_sibling - segments.append(f"{curr.name}:nth-child({index})") - curr = curr.parent - return " > ".join(reversed(segments)) - - def _generate_cfi(self, spine_index, html_content, local_target_index): - """Generate an EPUB CFI for Booklore/Readium.""" - soup = BeautifulSoup(html_content, 'html.parser') - current_char_count = 0 - target_tag = None - - elements = soup.find_all(string=True) - for string in elements: - text_len = len(string.strip()) - if text_len == 0: continue - if current_char_count + text_len >= local_target_index: - target_tag = string.parent - break - current_char_count += text_len - if current_char_count < local_target_index: - current_char_count += 1 - - if not target_tag: - spine_step = (spine_index + 1) * 2 - return f"epubcfi(/6/{spine_step}!/4/2/1:0)" - - path_segments = [] - curr = target_tag - while curr and curr.name != '[document]': - if curr.name == 'body': - path_segments.append("4") - break - index = 1 - sibling = curr.previous_sibling - while sibling: - if isinstance(sibling, Tag): - index += 1 - sibling = sibling.previous_sibling - path_segments.append(str(index * 2)) - curr = curr.parent - - spine_step = (spine_index + 1) * 2 - element_path = "/".join(reversed(path_segments)) - return f"epubcfi(/6/{spine_step}!/{element_path}:0)" - - def _generate_xpath_bs4(self, html_content, local_target_index): - """ - Original BS4 XPath generator (kept for fuzzy matching references). 
- Returns: (xpath_string, target_tag_object, is_anchored) - """ - soup = BeautifulSoup(html_content, 'html.parser') - current_char_count = 0 - target_tag = None - - elements = soup.find_all(string=True) - for string in elements: - text_len = len(string.strip()) - if text_len == 0: continue - if current_char_count + text_len >= local_target_index: - target_tag = string.parent - break - current_char_count += text_len - if current_char_count < local_target_index: - current_char_count += 1 - - if not target_tag: return "/body/div/p[1]", None, False - - path_segments = [] - curr = target_tag - found_anchor = False - - while curr and curr.name != '[document]': - if curr.name == 'body': - path_segments.append("body") - break - if curr.has_attr('id') and curr['id']: - path_segments.append(f"*[@id='{curr['id']}']") - found_anchor = True - break - index = 1 - sibling = curr.previous_sibling - while sibling: - if isinstance(sibling, Tag) and sibling.name == curr.name: - index += 1 - sibling = sibling.previous_sibling - path_segments.append(f"{curr.name}[{index}]") - curr = curr.parent - - if not path_segments: - return "/body/p[1]", target_tag, False - - xpath = "//" + "/".join(reversed(path_segments)) if found_anchor else "/" + "/".join(reversed(path_segments)) - xpath = xpath.rstrip("/") - if xpath in ("", "/", "//", "/body", "//body"): - xpath = "/body/p[1]" - found_anchor = False - return xpath, target_tag, found_anchor - - def find_text_location(self, filename, search_phrase, hint_percentage=None) -> LocatorResult | None: - """ - Uses BS4 Engine. Good for fuzzy matching phrases from external apps. - Returns: LocatorResult or None - """ + def get_sentence_level_ko_xpath(self, filename, percentage) -> str | None: try: book_path = self.resolve_book_path(filename) full_text, spine_map = self.extract_text_and_map(book_path) - if not full_text: return None - total_len = len(full_text) - - # 0. Global Uniqueness Check (The "Anchor" Logic) - # Try to find a 10-word sequence that appears EXACTLY once in the book. - # This prevents jumping to duplicate phrases (e.g., "Chapter 1" in the ToC vs the actual chapter). - clean_search = " ".join(search_phrase.split()) - words = clean_search.split() - - match_index = -1 - - if len(words) >= 10: - N = 10 - # Scan through the search phrase to find a unique anchor - for i in range(len(words) - N + 1): - candidate = " ".join(words[i:i+N]) - - # Check if this phrase exists exactly ONCE in the text - if full_text.count(candidate) == 1: - found_idx = full_text.find(candidate) - if found_idx != -1: - match_index = found_idx - logger.info(f"Found unique text anchor: '{candidate[:30]}...' at index {match_index}") - break - - # [End of NEW logic] - Continue to existing fallbacks - - # 1. Exact match (if anchor logic didn't find anything) - if match_index == -1: - match_index = full_text.find(search_phrase) - - # 2. Normalized match - if match_index == -1: - norm_content = self._normalize(full_text) - norm_search = self._normalize(search_phrase) - norm_index = norm_content.find(norm_search) - if norm_index != -1: - match_index = int((norm_index / len(norm_content)) * total_len) - - # 3. 
Fuzzy match - if match_index == -1: - cutoff = self.fuzzy_threshold - if hint_percentage is not None: - w_start = int(max(0, hint_percentage - 0.10) * total_len) - w_end = int(min(1.0, hint_percentage + 0.10) * total_len) - alignment = rapidfuzz.fuzz.partial_ratio_alignment( - search_phrase, full_text[w_start:w_end], score_cutoff=cutoff - ) - if alignment: match_index = w_start + alignment.dest_start - - if match_index == -1: - alignment = rapidfuzz.fuzz.partial_ratio_alignment( - search_phrase, full_text, score_cutoff=cutoff - ) - if alignment: match_index = alignment.dest_start - - if match_index != -1: - percentage = match_index / total_len - for item in spine_map: - if item['start'] <= match_index < item['end']: - local_index = match_index - item['start'] - - # Use BS4 generator here for Rich Locators - xpath_str, target_tag, is_anchored = self._generate_xpath_bs4(item['content'], local_index) - css_selector = self._generate_css_selector(target_tag) - cfi = self._generate_cfi(item['spine_index'] - 1, item['content'], local_index) - - # FIX: Handle double slashes gracefully - doc_frag_prefix = f"/body/DocFragment[{item['spine_index']}]" - if xpath_str.startswith('//'): - final_xpath = doc_frag_prefix + xpath_str[1:] # //id -> /DocFragment/id (or keep // if valid) - elif xpath_str.startswith('/'): - final_xpath = doc_frag_prefix + xpath_str - else: - final_xpath = f"{doc_frag_prefix}/{xpath_str}" - # Calculate chapter progress (critical for Storyteller) - # better: use start/end from map - spine_item_len = item['end'] - item['start'] - chapter_progress = 0.0 - if spine_item_len > 0: - chapter_progress = local_index / spine_item_len - - perfect_ko = self.get_perfect_ko_xpath(filename, match_index) - - return LocatorResult( - percentage=percentage, - xpath=final_xpath, - perfect_ko_xpath=perfect_ko, - match_index=match_index, - cfi=cfi, - href=item['href'], - fragment=None, - css_selector=css_selector, - chapter_progress=chapter_progress - ) - - return None + return self._ko_xpath.generate_sentence_level_xpath(full_text, spine_map, percentage) except Exception as e: - logger.error(f"Error finding text in '{filename}': {e}") - return None - - def _normalize(self, text): - return re.sub(r'[^a-z0-9]', '', text.lower()) - - - - def _local_tag_name(self, node) -> str: - tag = getattr(node, "tag", None) - if not isinstance(tag, str): - tag = getattr(node, "name", None) - if not isinstance(tag, str): - return "" - if "}" in tag: - tag = tag.split("}", 1)[1] - return tag.lower() - - def _get_parent_node(self, node): - if node is None: - return None - getparent = getattr(node, "getparent", None) - if callable(getparent): - return getparent() - return getattr(node, "parent", None) - - def _nearest_crengine_anchor(self, node): - current = node - while current is not None: - tag_name = self._local_tag_name(current) - if tag_name == "body": - return current - if tag_name in self.CRENGINE_STRUCTURAL_TAGS: - return current - if tag_name in ("html", "document", "[document]"): - break - current = self._get_parent_node(current) - return node - - def _first_non_empty_direct_text_suffix(self, element) -> str | None: - if element is None: + logger.error(f"Error generating sentence-level KOReader XPath: {e}") return None - try: - direct_text_nodes = element.xpath("text()") - for i, node in enumerate(direct_text_nodes, start=1): - if str(node).strip(): - return "/text()" if i == 1 else f"/text()[{i}]" - except Exception as e: - logger.debug(f"XPath text() node lookup failed: {e}") - - if isinstance(element, 
Tag): - text_nodes = [child for child in element.children if isinstance(child, str)] - for i, node in enumerate(text_nodes, start=1): - if str(node).strip(): - return "/text()" if i == 1 else f"/text()[{i}]" - return None - - def _build_crengine_safe_text_xpath(self, element, spine_index, html_content) -> str: - anchor = self._nearest_crengine_anchor(element) - suffix = self._first_non_empty_direct_text_suffix(anchor) - # If the text was inside a flattened inline tag, the anchor won't have direct text in XML. - # But Crengine WILL flatten it, so we trust the anchor and default to the first text node - # instead of falling back to the start of the chapter. - if not suffix: - suffix = "/text()" - - xpath_base = self._build_xpath(anchor) - return f"/body/DocFragment[{spine_index}]/{xpath_base}{suffix}.0" - - def _build_sentence_level_chapter_fallback_xpath(self, html_content, spine_index) -> str: - """ - Build a safe sentence-level XPath anchored to the first readable text node - in the chapter. This intentionally targets node starts (.0) instead of - character-level offsets. - """ - default_xpath = f"/body/DocFragment[{spine_index}]/body/p[1]/text().0" - try: - tree = html.fromstring(html_content) - except Exception as e: - logger.debug(f"Failed to parse HTML for sentence-level XPath (spine_index={spine_index}): {e}") - return default_xpath - - sentence_tags = ( - "p", "li", "h1", "h2", "h3", "h4", "h5", "h6", - "blockquote", "figcaption", "dd", "dt", "td", "th", - "div", "section", "article", "pre" - ) - - for tag in sentence_tags: - for element in tree.iter(tag): - suffix = self._first_non_empty_direct_text_suffix(element) - if suffix: - xpath_base = self._build_xpath(element) - return f"/body/DocFragment[{spine_index}]/{xpath_base}{suffix}.0" - - for element in tree.iter(): - if self._local_tag_name(element) not in self.CRENGINE_STRUCTURAL_TAGS: - continue - suffix = self._first_non_empty_direct_text_suffix(element) - if suffix: - xpath_base = self._build_xpath(element) - return f"/body/DocFragment[{spine_index}]/{xpath_base}{suffix}.0" - - return default_xpath - - def get_sentence_level_ko_xpath(self, filename, percentage) -> str | None: - """ - Resolve a sentence-level KOReader XPath from percentage. - Returns node-start offset (.0), not word-level offsets. - """ + def resolve_xpath(self, filename, xpath_str): try: book_path = self.resolve_book_path(filename) - full_text, _ = self.extract_text_and_map(book_path) + full_text, spine_map = self.extract_text_and_map(book_path) if not full_text: return None - - pct = float(percentage if percentage is not None else 0.0) - pct = max(0.0, min(1.0, pct)) - position = int((len(full_text) - 1) * pct) if len(full_text) > 1 else 0 - return self.get_perfect_ko_xpath(filename, position) + return self._ko_xpath.resolve_xpath(full_text, spine_map, xpath_str) except Exception as e: - logger.error(f"Error generating sentence-level KOReader XPath: {e}") + logger.error(f"Error resolving XPath '{xpath_str}': {e}") return None - def get_perfect_ko_xpath(self, filename, position=0) -> str | None: - """ - Generate KOReader XPath for a specific character position in the book. - Uses BeautifulSoup (Engine A) to perfectly align with the text extraction, - eliminating parser drift compared to the old LXML offset logic. 
- """ + # ========================================================================= + # Delegated: Text search and locator resolution + # ========================================================================= + + def find_text_location(self, filename, search_phrase, hint_percentage=None): try: - # Get full text and spine mapping book_path = self.resolve_book_path(filename) full_text, spine_map = self.extract_text_and_map(book_path) - - if not full_text or not spine_map: + if not full_text: return None - - # Clamp position to valid range - position = max(0, min(position, len(full_text) - 1)) - - # Find which spine item contains this position - target_item = next((item for item in spine_map - if item['start'] <= position < item['end']), spine_map[-1]) - - local_pos = position - target_item['start'] - - # Parse HTML content with BeautifulSoup - soup = BeautifulSoup(target_item['content'], 'html.parser') - - # Find the exact text element matching the character count - current_char_count = 0 - target_string = None - first_non_empty_string = None - last_non_empty_string = None - - elements = soup.find_all(string=True) - for string in elements: - # Count lengths exactly like extract_text_and_map's get_text(strip=True) - clean_text = string.strip() - text_len = len(clean_text) - - if text_len == 0: - continue - - if first_non_empty_string is None: - first_non_empty_string = string - last_non_empty_string = string - - if current_char_count + text_len > local_pos: - target_string = string - break - - current_char_count += text_len - # extract_text_and_map uses separator=' ', adding exactly 1 space between words - if current_char_count <= local_pos: - current_char_count += 1 - - if target_string is None: - target_string = last_non_empty_string or first_non_empty_string - - if not target_string: - logger.warning(f"No matching text element found in spine {target_item['spine_index']}") - return self._build_sentence_level_chapter_fallback_xpath( - target_item['content'], - target_item['spine_index'] - ) - - target_tag = target_string.parent - if not target_tag or target_tag.name == '[document]': - return self._build_sentence_level_chapter_fallback_xpath( - target_item['content'], - target_item['spine_index'] - ) - - # ================================================================= - # HYBRID ANCHOR MAPPING: BS4 -> LXML - # 1. We have the exact mathematical text offset via BS4. - # 2. We use the raw text as a unique "anchor" to find the exact - # same node in LXML's strictly structured tree. - # 3. This guarantees perfect KOReader XPaths with zero parser drift. 
- # ================================================================= - search_text = str(target_string) - occurrence_index = 0 - - # Count which occurrence of this exact text this is in the BS4 document - for string in elements: - if string is target_string: - break - if str(string) == search_text: - occurrence_index += 1 - - tree = html.fromstring(target_item['content']) - current_occurrence = 0 - - for el in tree.iter(): - if el.text and el.text == search_text: - if current_occurrence == occurrence_index: - return self._build_crengine_safe_text_xpath( - el, - target_item['spine_index'], - target_item['content'] - ) - current_occurrence += 1 - - if el.tail and el.tail == search_text: - if current_occurrence == occurrence_index: - parent = el.getparent() - node_to_build = parent if parent is not None else el - return self._build_crengine_safe_text_xpath( - node_to_build, - target_item['spine_index'], - target_item['content'] - ) - current_occurrence += 1 - - logger.warning(f"Hybrid Anchor mapping failed for '{search_text}'. Falling back to BS4 structural path.") - - # Build KOReader-compatible strictly positional XPath using BS4 (Fallback) - path_segments = [] - curr = target_tag - - while curr and curr.name != '[document]': - if curr.name == 'body': - path_segments.append("body") - break - - if curr.name in self.CRENGINE_FRAGILE_INLINE_TAGS: - curr = curr.parent - continue - - index = 1 - sibling = curr.previous_sibling - while sibling: - if isinstance(sibling, Tag) and sibling.name == curr.name: - index += 1 - sibling = sibling.previous_sibling - - path_segments.append(f"{curr.name}[{index}]") - curr = curr.parent - - # Ensure the path starts with body - if not path_segments or path_segments[-1] != 'body': - path_segments.append('body') - - xpath = "/".join(reversed(path_segments)) - if xpath == "body": - return self._build_sentence_level_chapter_fallback_xpath( - target_item['content'], - target_item['spine_index'] - ) - return f"/body/DocFragment[{target_item['spine_index']}]/{xpath}/text().0" - + result = self._locator.find_text_location(full_text, spine_map, search_phrase, hint_percentage) + # Facade coordinates: fill in perfect_ko_xpath from xpath service + if result and result.match_index is not None: + result.perfect_ko_xpath = self._ko_xpath.generate_xpath(full_text, spine_map, result.match_index) + return result except Exception as e: - logger.error(f"Error generating KOReader XPath: {e}") + logger.error(f"Error finding text in '{filename}': {e}") return None - def _has_text_content(self, element): - """Check if element directly contains text (not just in children).""" - return element.text and element.text.strip() and len(element.text.strip()) > 0 - - def _build_xpath(self, element): - """Build XPath for an element, ensuring proper KOReader format.""" - parts = [] - current = element - - while current is not None and current.tag not in ['html', 'document']: - tag_name = self._local_tag_name(current) - - if tag_name in self.CRENGINE_FRAGILE_INLINE_TAGS: - current = current.getparent() - continue - - # Get siblings of same tag to determine index - parent = current.getparent() - if parent is not None: - siblings = [s for s in parent if self._local_tag_name(s) == tag_name] - if len(siblings) > 1: - index = siblings.index(current) + 1 - parts.insert(0, f"{tag_name}[{index}]") - else: - parts.insert(0, tag_name) - else: - parts.insert(0, tag_name) - current = parent - - # Clean up the path - if parts and parts[0] == 'html': - parts.pop(0) - if not parts or parts[0] != 'body': - 
parts.insert(0, 'body') - - # If we have no meaningful path, create a default - if len(parts) <= 1: # Just 'body' or empty - parts = ['body', 'p[1]'] - - return '/'.join(parts) - - def resolve_xpath(self, filename, xpath_str): - """ - RESOLVER: - Uses LXML to find the target element, then searches for its text in the - BS4-generated full_text to ensure alignment (Fixes Parser Drift). - """ + def resolve_locator_id(self, filename, href, fragment_id): try: - logger.debug(f"Resolving XPath (Hybrid): {xpath_str}") - - match = re.search(r'DocFragment\[(\d+)]', xpath_str) - if not match: - return None - spine_index = int(match.group(1)) - book_path = self.resolve_book_path(filename) full_text, spine_map = self.extract_text_and_map(book_path) - - target_item = next((i for i in spine_map if i['spine_index'] == spine_index), None) - if not target_item: - return None - - # Parse path and offset - relative_path = xpath_str.split(f"DocFragment[{spine_index}]")[-1] - offset_match = re.search(r'/text\(\)\.(\d+)$', relative_path) - target_offset = int(offset_match.group(1)) if offset_match else 0 - clean_xpath = re.sub(r'/text\(\)\.(\d+)$', '', relative_path) - - if clean_xpath.startswith('/'): - clean_xpath = '.' + clean_xpath - - tree = html.fromstring(target_item['content']) - - elements = [] - try: - elements = tree.xpath(clean_xpath) - except Exception as e: - logger.debug(f"XPath query failed: {e}") - - # [Fallback logic from original code for finding elements...] - if not elements and clean_xpath.startswith('./'): - try: elements = tree.xpath(clean_xpath[2:]) - except Exception as e: logger.debug(f"XPath fallback (strip ./) failed: {e}") - - if not elements: - id_match = re.search(r"@id='([^']+)'", clean_xpath) - if id_match: - try: elements = tree.xpath(f"//*[@id='{id_match.group(1)}']") - except Exception as e: logger.debug(f"XPath fallback (id lookup) failed: {e}") - - if not elements: - simple_path = re.sub(r'\[\d+]', '', clean_xpath) - try: elements = tree.xpath(simple_path) - except Exception as e: logger.debug(f"XPath fallback (simplified path) failed: {e}") - - if not elements: - logger.warning(f"Could not resolve XPath in {filename}: {clean_xpath}") - return None - - target_node = elements[0] - - # [NEW LOGIC STARTS HERE] - # Instead of calculating offset via LXML iteration (which drifts), - # grab the text and FIND it in the spine item content. - - # 1. Extract a unique-ish fingerprint from the node - node_text = "" - if target_node.text: node_text += target_node.text.strip() - if target_node.tail: node_text += " " + target_node.tail.strip() - - # If node text is too short, grab parent context - if len(node_text) < 20: - parent = target_node.getparent() - if parent is not None: - node_text = parent.text_content().strip() - - clean_anchor = " ".join(node_text.split()) - if not clean_anchor: - return None - - # 2. Find this anchor in the BS4 content (spine_map item) - # We search specifically in this chapter's content to minimize false positives - bs4_chapter_text = BeautifulSoup(target_item['content'], 'html.parser').get_text(separator=' ', strip=True) - - local_start_index = bs4_chapter_text.find(clean_anchor) - - if local_start_index != -1: - # Found it! Calculate global position - # Add target_offset (clamped to length of anchor) - safe_offset = min(target_offset, len(clean_anchor)) - global_index = target_item['start'] + local_start_index + safe_offset - - # 3. 
Return text from the Main Source of Truth (full_text) - start = max(0, global_index) - end = min(len(full_text), global_index + 600) # Grab enough context - return full_text[start:end] - - else: - # Fallback: If exact match fails (rare), try the old calculation method - # (This preserves old behavior if the new matching fails) - logger.debug("Exact text match failed, falling back to LXML offset calculation") - # Falling back to strict calculation (Logic from original implementation) - - preceding_len = 0 - found_target = False - SEPARATOR_LEN = 1 - - for node in tree.iter(): - if node == target_node: - found_target = True - if node.text and target_offset > 0: - raw_segment = node.text[:min(len(node.text), target_offset)] - preceding_len += len(raw_segment.strip()) - elif target_offset > 0: - preceding_len += target_offset - break - - if node.text and node.text.strip(): - preceding_len += (len(node.text.strip()) + SEPARATOR_LEN) - if node.tail and node.tail.strip(): - preceding_len += (len(node.tail.strip()) + SEPARATOR_LEN) - - if found_target: - local_pos = preceding_len - global_offset = target_item['start'] + local_pos - start = max(0, global_offset) - end = min(len(full_text), global_offset + 500) - return full_text[start:end] - + if not full_text: return None - + return self._locator.resolve_locator_id(full_text, spine_map, href, fragment_id) except Exception as e: - logger.error(f"Error resolving XPath '{xpath_str}': {e}") + logger.error(f"Error resolving locator ID '{fragment_id}' in '{filename}': {e}") return None def get_text_around_cfi(self, filename, cfi, context=50): - """ - Returns a text fragment of length 2*context centered on the position indicated by the CFI. - Uses the epubcfi library for precise parsing. - - Example supported CFI: epubcfi(/6/16[chapter_6]!/4/2[book-columns]/2[book-inner]/268/4/2[kobo.134.3]/1:11) - """ try: - # Parse CFI using the epubcfi library - parsed_cfi = epubcfi.parse(cfi) - - # Extract spine information and element steps - spine_step = None - element_steps = [] - - for step in parsed_cfi.steps: - if hasattr(step, 'index'): - if step.index == 6: # Skip spine reference marker - continue - elif not spine_step and step.index > 6: # First step after /6/ is spine - spine_step = step.index - elif isinstance(step, epubcfi.cfi.Step): - element_steps.append(step) - # Skip Redirect objects (!) 
- - char_offset = parsed_cfi.offset.value if parsed_cfi.offset else 0 - - if not spine_step: - logger.error(f"Could not extract spine step from CFI: '{cfi}'") - return None - - # Load the EPUB and find the spine item book_path = self.resolve_book_path(filename) full_text, spine_map = self.extract_text_and_map(book_path) - - # Calculate spine index (CFI spine steps are 2x the actual index) - spine_index = (spine_step // 2) - 1 - if not (0 <= spine_index < len(spine_map)): - logger.error(f"Spine index {spine_index} out of range for CFI '{cfi}'") + if not full_text: return None - - item = spine_map[spine_index] - - # Parse the HTML content with lxml for precise navigation - tree = html.fromstring(item['content']) - - # Follow the CFI path precisely through the DOM - current_element = tree - text_count = 0 - - logger.debug(f"Following CFI path with {len(element_steps)} steps") - - for i, step in enumerate(element_steps): - if not hasattr(step, 'index'): - continue - - step_index = step.index - step_assertion = step.assertion - - logger.debug(f"Step {i}: index={step_index}, assertion={step_assertion}") - - if step_assertion: - # Look for element with specific ID or class - candidates = current_element.xpath(f".//*[contains(@id, '{step_assertion}') or contains(@class, '{step_assertion}')]") - if candidates: - current_element = candidates[0] - logger.debug(f"Found element with assertion: {step_assertion}") - continue - - # CFI uses 1-based indexing, even numbers for elements - if step_index % 2 == 0: # Even number = element - element_index = (step_index // 2) - 1 - children = [child for child in current_element if hasattr(child, 'tag')] - - if 0 <= element_index < len(children): - current_element = children[element_index] - logger.debug(f"Navigated to child element {element_index}: {current_element.tag}") - else: - logger.warning(f"Element index {element_index} out of range (have {len(children)} children)") - break - else: # Odd number = text node - text_index = (step_index // 2) - # For text nodes, we need to count text content - text_nodes = [] - for child in current_element: - if child.text and child.text.strip(): - text_nodes.append(child.text.strip()) - if child.tail and child.tail.strip(): - text_nodes.append(child.tail.strip()) - - if 0 <= text_index < len(text_nodes): - # Calculate position up to this text node - text_count += sum(len(text) for text in text_nodes[:text_index]) - logger.debug(f"Text node {text_index}, accumulated count: {text_count}") - break - - # Calculate text position within the current element - if current_element is not None: - # Get all text content up to the current element's position in the document - soup = BeautifulSoup(item['content'], 'html.parser') - chapter_text = soup.get_text(separator=' ', strip=True) - - # Find the current element's text in the chapter - element_text = "" - if hasattr(current_element, 'text_content'): - element_text = current_element.text_content() - - if element_text and len(element_text.strip()) > 5: - # Find where this element's content appears in the chapter - element_start = chapter_text.find(element_text.strip()[:50]) - if element_start != -1: - local_offset = element_start + char_offset - else: - # Fallback: use text_count + char_offset - local_offset = text_count + char_offset - else: - local_offset = text_count + char_offset - else: - local_offset = text_count + char_offset - - # Clamp to chapter bounds - chapter_text = BeautifulSoup(item['content'], 'html.parser').get_text(separator=' ', strip=True) - local_offset = min(max(0, 
local_offset), len(chapter_text)) - - # Calculate global position - global_offset = item['start'] + local_offset - - # Extract context - start_pos = max(0, global_offset - context) - end_pos = min(len(full_text), global_offset + context) - - snippet = full_text[start_pos:end_pos] - logger.info(f"Snippet extracted: {snippet[:30]}...") - return snippet - + return self._locator.get_text_around_cfi(full_text, spine_map, cfi, context) except Exception as e: logger.error(f"Error using epubcfi library for '{cfi}': {e}") return None - - diff --git a/src/utils/koreader_xpath.py b/src/utils/koreader_xpath.py new file mode 100644 index 0000000..466a05f --- /dev/null +++ b/src/utils/koreader_xpath.py @@ -0,0 +1,510 @@ +""" +KOReader XPath Generation and Resolution for PageKeeper. + +Generates CREngine-compatible XPaths from character positions in EPUB text, +and resolves existing XPaths back to text snippets. Uses a hybrid BS4→LXML +strategy: BS4 for exact text-offset alignment, LXML for structurally correct +XPath output. +""" + +import logging +import re + +from bs4 import BeautifulSoup, Tag +from lxml import html + +logger = logging.getLogger(__name__) + + +class KoReaderXPathService: + CRENGINE_FRAGILE_INLINE_TAGS = {"span", "em", "strong", "b", "i", "u", "a", "font", "small", "big", "sub", "sup"} + CRENGINE_STRUCTURAL_TAGS = { + "p", + "div", + "section", + "article", + "blockquote", + "h1", + "h2", + "h3", + "h4", + "h5", + "h6", + "li", + "header", + "footer", + "aside", + "td", + "th", + "dt", + "dd", + "figcaption", + "pre", + } + + def generate_xpath(self, full_text, spine_map, position) -> str | None: + """ + Generate a KOReader XPath for a specific character position in the book. + + Uses BS4 to find the exact text node matching the position, then maps + to LXML for a structurally correct XPath. Falls back to BS4 structural + path or sentence-level chapter fallback if hybrid mapping fails. + """ + try: + if not full_text or not spine_map: + return None + + position = max(0, min(position, len(full_text) - 1)) + + target_item = next((item for item in spine_map if item["start"] <= position < item["end"]), spine_map[-1]) + local_pos = position - target_item["start"] + + target_string, target_tag, elements = self._find_text_node(target_item, local_pos) + + if not target_string: + logger.warning(f"No matching text element found in spine {target_item['spine_index']}") + return self._build_sentence_level_chapter_fallback_xpath( + target_item["content"], target_item["spine_index"] + ) + + if not target_tag or target_tag.name == "[document]": + return self._build_sentence_level_chapter_fallback_xpath( + target_item["content"], target_item["spine_index"] + ) + + # Phase 1: Try hybrid BS4→LXML anchor mapping + result = self._hybrid_anchor_to_lxml(target_string, elements, target_item) + if result: + return result + + # Phase 2: Fall back to BS4 structural path + return self._bs4_structural_fallback(target_tag, target_item) + + except Exception as e: + logger.error(f"Error generating KOReader XPath: {e}") + return None + + def generate_sentence_level_xpath(self, full_text, spine_map, percentage) -> str | None: + """ + Resolve a sentence-level KOReader XPath from percentage. + Returns node-start offset (.0), not word-level offsets. 
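+
+        For example, percentage=0.5 resolves to character position
+        int((len(full_text) - 1) * 0.5), which generate_xpath() then
+        anchors to the nearest structural tag.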
+ """ + try: + if not full_text: + return None + pct = max(0.0, min(1.0, float(percentage if percentage is not None else 0.0))) + position = int((len(full_text) - 1) * pct) if len(full_text) > 1 else 0 + return self.generate_xpath(full_text, spine_map, position) + except Exception as e: + logger.error(f"Error generating sentence-level KOReader XPath: {e}") + return None + + def resolve_xpath(self, full_text, spine_map, xpath_str) -> str | None: + """ + Resolve a KOReader XPath back to a text snippet. + + Uses LXML to find the target element, then searches for its text in the + BS4-generated full_text to ensure alignment (fixes parser drift). + """ + try: + logger.debug(f"Resolving XPath (Hybrid): {xpath_str}") + + match = re.search(r"DocFragment\[(\d+)]", xpath_str) + if not match: + return None + spine_index = int(match.group(1)) + + target_item = next((i for i in spine_map if i["spine_index"] == spine_index), None) + if not target_item: + return None + + # Parse path and offset + relative_path = xpath_str.split(f"DocFragment[{spine_index}]")[-1] + offset_match = re.search(r"/text\(\)\.(\d+)$", relative_path) + target_offset = int(offset_match.group(1)) if offset_match else 0 + clean_xpath = re.sub(r"/text\(\)\.(\d+)$", "", relative_path) + + if clean_xpath.startswith("/"): + clean_xpath = "." + clean_xpath + + tree = html.fromstring(target_item["content"]) + + elements = self._resolve_xpath_elements(tree, clean_xpath) + if not elements: + logger.warning(f"Could not resolve XPath: {clean_xpath}") + return None + + target_node = elements[0] + + # Try text anchor matching first (avoids parser drift) + result = self._resolve_via_text_anchor(target_node, target_item, target_offset, full_text) + if result: + return result + + # Fallback to LXML offset calculation + return self._resolve_via_lxml_offset(tree, target_node, target_item, target_offset, full_text) + + except Exception as e: + logger.error(f"Error resolving XPath '{xpath_str}': {e}") + return None + + # ========================================================================= + # PRIVATE: Text node location (Phase 0) + # ========================================================================= + + def _find_text_node(self, target_item, local_pos): + """ + Walk BS4 text nodes to find the NavigableString at the given local + character position within a spine item. + + Returns (target_string, target_tag, all_elements) or (None, None, []). 
+ """ + soup = BeautifulSoup(target_item["content"], "html.parser") + current_char_count = 0 + target_string = None + first_non_empty_string = None + last_non_empty_string = None + + elements = soup.find_all(string=True) + for string in elements: + clean_text = string.strip() + text_len = len(clean_text) + if text_len == 0: + continue + + if first_non_empty_string is None: + first_non_empty_string = string + last_non_empty_string = string + + if current_char_count + text_len > local_pos: + target_string = string + break + + current_char_count += text_len + if current_char_count <= local_pos: + current_char_count += 1 + + if target_string is None: + target_string = last_non_empty_string or first_non_empty_string + + target_tag = target_string.parent if target_string else None + return target_string, target_tag, elements + + # ========================================================================= + # PRIVATE: Hybrid BS4→LXML anchor mapping (Phase 1) + # ========================================================================= + + def _hybrid_anchor_to_lxml(self, target_string, elements, target_item): + """ + Map a BS4 NavigableString to its LXML equivalent by counting + text occurrences, then build a CREngine-safe XPath. + + Returns the xpath string, or None if mapping fails. + """ + search_text = str(target_string) + occurrence_index = 0 + + for string in elements: + if string is target_string: + break + if str(string) == search_text: + occurrence_index += 1 + + tree = html.fromstring(target_item["content"]) + current_occurrence = 0 + + for el in tree.iter(): + if el.text and el.text == search_text: + if current_occurrence == occurrence_index: + return self._build_crengine_safe_text_xpath(el, target_item["spine_index"], target_item["content"]) + current_occurrence += 1 + + if el.tail and el.tail == search_text: + if current_occurrence == occurrence_index: + parent = el.getparent() + node_to_build = parent if parent is not None else el + return self._build_crengine_safe_text_xpath( + node_to_build, target_item["spine_index"], target_item["content"] + ) + current_occurrence += 1 + + logger.warning(f"Hybrid Anchor mapping failed for '{search_text}'. Falling back to BS4 structural path.") + return None + + # ========================================================================= + # PRIVATE: BS4 structural fallback (Phase 2) + # ========================================================================= + + def _bs4_structural_fallback(self, target_tag, target_item): + """ + Build a positional XPath by walking BS4 parents, skipping + CREngine-fragile inline tags. 
+ """ + path_segments = [] + curr = target_tag + + while curr and curr.name != "[document]": + if curr.name == "body": + path_segments.append("body") + break + + if curr.name in self.CRENGINE_FRAGILE_INLINE_TAGS: + curr = curr.parent + continue + + index = 1 + sibling = curr.previous_sibling + while sibling: + if isinstance(sibling, Tag) and sibling.name == curr.name: + index += 1 + sibling = sibling.previous_sibling + + path_segments.append(f"{curr.name}[{index}]") + curr = curr.parent + + if not path_segments or path_segments[-1] != "body": + path_segments.append("body") + + xpath = "/".join(reversed(path_segments)) + if xpath == "body": + return self._build_sentence_level_chapter_fallback_xpath(target_item["content"], target_item["spine_index"]) + return f"/body/DocFragment[{target_item['spine_index']}]/{xpath}/text().0" + + # ========================================================================= + # PRIVATE: XPath building helpers + # ========================================================================= + + def _build_crengine_safe_text_xpath(self, element, spine_index, html_content) -> str: + anchor = self._nearest_crengine_anchor(element) + suffix = self._first_non_empty_direct_text_suffix(anchor) + if not suffix: + suffix = "/text()" + xpath_base = self._build_xpath(anchor) + return f"/body/DocFragment[{spine_index}]/{xpath_base}{suffix}.0" + + def _build_sentence_level_chapter_fallback_xpath(self, html_content, spine_index) -> str: + """ + Build a safe sentence-level XPath anchored to the first readable text node + in the chapter. Targets node starts (.0) instead of character-level offsets. + """ + default_xpath = f"/body/DocFragment[{spine_index}]/body/p[1]/text().0" + try: + tree = html.fromstring(html_content) + except Exception as e: + logger.debug(f"Failed to parse HTML for sentence-level XPath (spine_index={spine_index}): {e}") + return default_xpath + + sentence_tags = ( + "p", + "li", + "h1", + "h2", + "h3", + "h4", + "h5", + "h6", + "blockquote", + "figcaption", + "dd", + "dt", + "td", + "th", + "div", + "section", + "article", + "pre", + ) + + sentence_tag_set = set(sentence_tags) + for element in tree.iter(): + if self._local_tag_name(element) in sentence_tag_set: + suffix = self._first_non_empty_direct_text_suffix(element) + if suffix: + xpath_base = self._build_xpath(element) + return f"/body/DocFragment[{spine_index}]/{xpath_base}{suffix}.0" + + for element in tree.iter(): + if self._local_tag_name(element) not in self.CRENGINE_STRUCTURAL_TAGS: + continue + suffix = self._first_non_empty_direct_text_suffix(element) + if suffix: + xpath_base = self._build_xpath(element) + return f"/body/DocFragment[{spine_index}]/{xpath_base}{suffix}.0" + + return default_xpath + + def _build_xpath(self, element): + """Build XPath for an lxml element, ensuring proper KOReader format.""" + parts = [] + current = element + + while current is not None and current.tag not in ["html", "document"]: + tag_name = self._local_tag_name(current) + + if tag_name in self.CRENGINE_FRAGILE_INLINE_TAGS: + current = current.getparent() + continue + + parent = current.getparent() + if parent is not None: + siblings = [s for s in parent if self._local_tag_name(s) == tag_name] + if len(siblings) > 1: + index = siblings.index(current) + 1 + parts.insert(0, f"{tag_name}[{index}]") + else: + parts.insert(0, tag_name) + else: + parts.insert(0, tag_name) + current = parent + + if parts and parts[0] == "html": + parts.pop(0) + if not parts or parts[0] != "body": + parts.insert(0, "body") + if len(parts) 
<= 1: + parts = ["body", "p[1]"] + + return "/".join(parts) + + def _nearest_crengine_anchor(self, node): + current = node + while current is not None: + tag_name = self._local_tag_name(current) + if tag_name == "body": + return current + if tag_name in self.CRENGINE_STRUCTURAL_TAGS: + return current + if tag_name in ("html", "document", "[document]"): + break + current = self._get_parent_node(current) + return node + + def _first_non_empty_direct_text_suffix(self, element) -> str | None: + if element is None: + return None + try: + direct_text_nodes = element.xpath("text()") + for i, node in enumerate(direct_text_nodes, start=1): + if str(node).strip(): + return "/text()" if i == 1 else f"/text()[{i}]" + except Exception as e: + logger.debug(f"XPath text() node lookup failed: {e}") + + if isinstance(element, Tag): + text_nodes = [child for child in element.children if isinstance(child, str)] + for i, node in enumerate(text_nodes, start=1): + if str(node).strip(): + return "/text()" if i == 1 else f"/text()[{i}]" + return None + + def _local_tag_name(self, node) -> str: + tag = getattr(node, "tag", None) + if not isinstance(tag, str): + tag = getattr(node, "name", None) + if not isinstance(tag, str): + return "" + if "}" in tag: + tag = tag.split("}", 1)[1] + return tag.lower() + + def _get_parent_node(self, node): + if node is None: + return None + getparent = getattr(node, "getparent", None) + if callable(getparent): + return getparent() + return getattr(node, "parent", None) + + # ========================================================================= + # PRIVATE: XPath resolution helpers + # ========================================================================= + + def _resolve_xpath_elements(self, tree, clean_xpath): + """Try multiple XPath resolution strategies, returning matched elements.""" + elements = [] + try: + elements = tree.xpath(clean_xpath) + except Exception as e: + logger.debug(f"XPath query failed: {e}") + + if not elements and clean_xpath.startswith("./"): + try: + elements = tree.xpath(clean_xpath[2:]) + except Exception as e: + logger.debug(f"XPath fallback (strip ./) failed: {e}") + + if not elements: + id_match = re.search(r"@id='([^']+)'", clean_xpath) + if id_match: + try: + elements = tree.xpath(f"//*[@id='{id_match.group(1)}']") + except Exception as e: + logger.debug(f"XPath fallback (id lookup) failed: {e}") + + if not elements: + simple_path = re.sub(r"\[\d+]", "", clean_xpath) + try: + elements = tree.xpath(simple_path) + except Exception as e: + logger.debug(f"XPath fallback (simplified path) failed: {e}") + + return elements + + def _resolve_via_text_anchor(self, target_node, target_item, target_offset, full_text): + """Resolve XPath by finding the target node's text in the BS4 chapter text.""" + node_text = "" + if target_node.text: + node_text += target_node.text.strip() + if target_node.tail: + node_text += " " + target_node.tail.strip() + + if len(node_text) < 20: + parent = target_node.getparent() + if parent is not None: + node_text = parent.text_content().strip() + + clean_anchor = " ".join(node_text.split()) + if not clean_anchor: + return None + + bs4_chapter_text = BeautifulSoup(target_item["content"], "html.parser").get_text(separator=" ", strip=True) + local_start_index = bs4_chapter_text.find(clean_anchor) + + if local_start_index != -1: + safe_offset = min(target_offset, len(clean_anchor)) + global_index = target_item["start"] + local_start_index + safe_offset + start = max(0, global_index) + end = min(len(full_text), global_index + 
600) + return full_text[start:end] + + return None + + def _resolve_via_lxml_offset(self, tree, target_node, target_item, target_offset, full_text): + """Fallback: calculate position by iterating LXML tree nodes.""" + logger.debug("Exact text match failed, falling back to LXML offset calculation") + + preceding_len = 0 + found_target = False + SEPARATOR_LEN = 1 + + for node in tree.iter(): + if node == target_node: + found_target = True + if node.text and target_offset > 0: + raw_segment = node.text[: min(len(node.text), target_offset)] + preceding_len += len(raw_segment.strip()) + break + + if node.text and node.text.strip(): + preceding_len += len(node.text.strip()) + SEPARATOR_LEN + if node.tail and node.tail.strip(): + preceding_len += len(node.tail.strip()) + SEPARATOR_LEN + + if found_target: + local_pos = preceding_len + global_offset = target_item["start"] + local_pos + start = max(0, global_offset) + end = min(len(full_text), global_offset + 500) + return full_text[start:end] + + return None diff --git a/src/utils/locator_search.py b/src/utils/locator_search.py new file mode 100644 index 0000000..417a27b --- /dev/null +++ b/src/utils/locator_search.py @@ -0,0 +1,435 @@ +""" +Locator Search and Resolution for PageKeeper. + +Finds text positions in EPUBs using anchor/exact/normalized/fuzzy matching, +resolves Readium/Storyteller locators (href + fragment ID), and resolves +EPUB CFIs back to text snippets. +""" + +import logging +import re + +import epubcfi +import rapidfuzz +from bs4 import BeautifulSoup, Tag +from lxml import html + +from src.sync_clients.sync_client_interface import LocatorResult + +logger = logging.getLogger(__name__) + + +class LocatorSearchService: + def __init__(self, fuzzy_threshold: int = 80): + self.fuzzy_threshold = fuzzy_threshold + + def find_text_location(self, full_text, spine_map, search_phrase, hint_percentage=None) -> LocatorResult | None: + """ + Search for text in the EPUB using multiple strategies: + 1. Unique 10-word anchor (avoids ToC duplicates) + 2. Exact match + 3. Normalized match (case/punctuation insensitive) + 4. Fuzzy match (with optional percentage hint for windowed search) + + Returns LocatorResult with perfect_ko_xpath=None (caller fills it in). + """ + try: + if not full_text: + return None + total_len = len(full_text) + + clean_search = " ".join(search_phrase.split()) + words = clean_search.split() + + match_index = -1 + + # 1. Unique anchor: find a 10-word subsequence that appears exactly once + if len(words) >= 10: + N = 10 + for i in range(len(words) - N + 1): + candidate = " ".join(words[i : i + N]) + if full_text.count(candidate) == 1: + found_idx = full_text.find(candidate) + if found_idx != -1: + match_index = found_idx + logger.info(f"Found unique text anchor: '{candidate[:30]}...' at index {match_index}") + break + + # 2. Exact match + if match_index == -1: + match_index = full_text.find(search_phrase) + + # 3. Normalized match + if match_index == -1: + norm_content = self._normalize(full_text) + norm_search = self._normalize(search_phrase) + norm_index = norm_content.find(norm_search) + if norm_index != -1: + match_index = int((norm_index / len(norm_content)) * total_len) + + # 4. 
Fuzzy match + if match_index == -1: + match_index = self._fuzzy_match(full_text, search_phrase, hint_percentage, total_len) + + if match_index == -1: + return None + + return self._build_locator_result(full_text, spine_map, match_index, total_len) + + except Exception as e: + logger.error(f"Error finding text: {e}") + return None + + def resolve_locator_id(self, full_text, spine_map, href, fragment_id) -> str | None: + """ + Resolve a Storyteller/Readium locator (href + fragment ID) to a text snippet. + """ + try: + target_item = None + for item in spine_map: + if href in item["href"] or item["href"] in href: + target_item = item + break + + if not target_item: + return None + + soup = BeautifulSoup(target_item["content"], "html.parser") + clean_id = fragment_id.lstrip("#") + element = soup.find(id=clean_id) + + if not element: + return None + + current_offset = 0 + found_offset = -1 + all_strings = soup.find_all(string=True) + + for s in all_strings: + if s.parent == element or element in s.parents: + found_offset = current_offset + break + text_len = len(s.strip()) + if text_len == 0: + continue + current_offset += text_len + + if found_offset == -1: + elem_text = element.get_text(separator=" ", strip=True) + chapter_text = soup.get_text(separator=" ", strip=True) + found_offset = chapter_text.find(elem_text) + + if found_offset == -1: + return None + + global_offset = target_item["start"] + found_offset + start = max(0, global_offset) + end = min(len(full_text), global_offset + 500) + return full_text[start:end] + + except Exception as e: + logger.error(f"Error resolving locator ID '{fragment_id}': {e}") + return None + + def get_text_around_cfi(self, full_text, spine_map, cfi, context=50) -> str | None: + """ + Returns a text fragment of length 2*context centered on the position indicated by the CFI. + Uses the epubcfi library for precise parsing. 
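+
+        Example supported CFI:
+        epubcfi(/6/16[chapter_6]!/4/2[book-columns]/2[book-inner]/268/4/2[kobo.134.3]/1:11)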
+ """ + try: + parsed_cfi = epubcfi.parse(cfi) + + spine_step = None + element_steps = [] + + for step in parsed_cfi.steps: + if hasattr(step, "index"): + if step.index == 6: + continue + elif not spine_step and step.index > 6: + spine_step = step.index + elif isinstance(step, epubcfi.cfi.Step): + element_steps.append(step) + + char_offset = parsed_cfi.offset.value if parsed_cfi.offset else 0 + + if not spine_step: + logger.error(f"Could not extract spine step from CFI: '{cfi}'") + return None + + spine_index = (spine_step // 2) - 1 + if not (0 <= spine_index < len(spine_map)): + logger.error(f"Spine index {spine_index} out of range for CFI '{cfi}'") + return None + + item = spine_map[spine_index] + + tree = html.fromstring(item["content"]) + + current_element = tree + text_count = 0 + + logger.debug(f"Following CFI path with {len(element_steps)} steps") + + for i, step in enumerate(element_steps): + if not hasattr(step, "index"): + continue + + step_index = step.index + step_assertion = step.assertion + + logger.debug(f"Step {i}: index={step_index}, assertion={step_assertion}") + + if step_assertion: + candidates = current_element.xpath( + f".//*[contains(@id, '{step_assertion}') or contains(@class, '{step_assertion}')]" + ) + if candidates: + current_element = candidates[0] + logger.debug(f"Found element with assertion: {step_assertion}") + continue + + if step_index % 2 == 0: + element_index = (step_index // 2) - 1 + children = [child for child in current_element if hasattr(child, "tag")] + + if 0 <= element_index < len(children): + current_element = children[element_index] + logger.debug(f"Navigated to child element {element_index}: {current_element.tag}") + else: + logger.warning(f"Element index {element_index} out of range (have {len(children)} children)") + break + else: + text_index = step_index // 2 + text_nodes = [] + for child in current_element: + if child.text and child.text.strip(): + text_nodes.append(child.text.strip()) + if child.tail and child.tail.strip(): + text_nodes.append(child.tail.strip()) + + if 0 <= text_index < len(text_nodes): + text_count += sum(len(text) for text in text_nodes[:text_index]) + logger.debug(f"Text node {text_index}, accumulated count: {text_count}") + break + + if current_element is not None: + soup = BeautifulSoup(item["content"], "html.parser") + chapter_text = soup.get_text(separator=" ", strip=True) + + element_text = "" + if hasattr(current_element, "text_content"): + element_text = current_element.text_content() + + if element_text and len(element_text.strip()) > 5: + element_start = chapter_text.find(element_text.strip()[:50]) + if element_start != -1: + local_offset = element_start + char_offset + else: + local_offset = text_count + char_offset + else: + local_offset = text_count + char_offset + else: + local_offset = text_count + char_offset + + chapter_text = BeautifulSoup(item["content"], "html.parser").get_text(separator=" ", strip=True) + local_offset = min(max(0, local_offset), len(chapter_text)) + + global_offset = item["start"] + local_offset + + start_pos = max(0, global_offset - context) + end_pos = min(len(full_text), global_offset + context) + + snippet = full_text[start_pos:end_pos] + logger.info(f"Snippet extracted: {snippet[:30]}...") + return snippet + + except Exception as e: + logger.error(f"Error using epubcfi library for '{cfi}': {e}") + return None + + # ========================================================================= + # PRIVATE: Search helpers + # 
========================================================================= + + def _normalize(self, text): + return re.sub(r"[^a-z0-9]", "", text.lower()) + + def _fuzzy_match(self, full_text, search_phrase, hint_percentage, total_len): + """Run fuzzy matching, optionally windowed around hint_percentage.""" + cutoff = self.fuzzy_threshold + match_index = -1 + + if hint_percentage is not None: + w_start = int(max(0, hint_percentage - 0.10) * total_len) + w_end = int(min(1.0, hint_percentage + 0.10) * total_len) + alignment = rapidfuzz.fuzz.partial_ratio_alignment( + search_phrase, full_text[w_start:w_end], score_cutoff=cutoff + ) + if alignment: + match_index = w_start + alignment.dest_start + + if match_index == -1: + alignment = rapidfuzz.fuzz.partial_ratio_alignment(search_phrase, full_text, score_cutoff=cutoff) + if alignment: + match_index = alignment.dest_start + + return match_index + + # ========================================================================= + # PRIVATE: Locator building helpers + # ========================================================================= + + def _build_locator_result(self, full_text, spine_map, match_index, total_len): + """Build a LocatorResult from a match position.""" + percentage = match_index / total_len + for item in spine_map: + if item["start"] <= match_index < item["end"]: + local_index = match_index - item["start"] + + xpath_str, target_tag, is_anchored = self._generate_xpath_bs4(item["content"], local_index) + css_selector = self._generate_css_selector(target_tag) + cfi = self._generate_cfi(item["spine_index"] - 1, item["content"], local_index) + + doc_frag_prefix = f"/body/DocFragment[{item['spine_index']}]" + if xpath_str.startswith("//"): + final_xpath = doc_frag_prefix + xpath_str[1:] + elif xpath_str.startswith("/"): + final_xpath = doc_frag_prefix + xpath_str + else: + final_xpath = f"{doc_frag_prefix}/{xpath_str}" + + spine_item_len = item["end"] - item["start"] + chapter_progress = 0.0 + if spine_item_len > 0: + chapter_progress = local_index / spine_item_len + + return LocatorResult( + percentage=percentage, + xpath=final_xpath, + perfect_ko_xpath=None, + match_index=match_index, + cfi=cfi, + href=item["href"], + fragment=None, + css_selector=css_selector, + chapter_progress=chapter_progress, + ) + + return None + + def _generate_css_selector(self, target_tag): + """Generate a Readium-compatible CSS selector.""" + if not target_tag: + return "" + segments = [] + curr = target_tag + while curr and curr.name != "[document]": + if not isinstance(curr, Tag): + curr = curr.parent + continue + index = 1 + sibling = curr.previous_sibling + while sibling: + if isinstance(sibling, Tag): + index += 1 + sibling = sibling.previous_sibling + segments.append(f"{curr.name}:nth-child({index})") + curr = curr.parent + return " > ".join(reversed(segments)) + + def _generate_cfi(self, spine_index, html_content, local_target_index): + """Generate an EPUB CFI for Booklore/Readium.""" + soup = BeautifulSoup(html_content, "html.parser") + current_char_count = 0 + target_tag = None + + elements = soup.find_all(string=True) + for string in elements: + text_len = len(string.strip()) + if text_len == 0: + continue + if current_char_count + text_len >= local_target_index: + target_tag = string.parent + break + current_char_count += text_len + if current_char_count < local_target_index: + current_char_count += 1 + + if not target_tag: + spine_step = (spine_index + 1) * 2 + return f"epubcfi(/6/{spine_step}!/4/2/1:0)" + + path_segments = [] + curr = 
target_tag + while curr and curr.name != "[document]": + if curr.name == "body": + path_segments.append("4") + break + index = 1 + sibling = curr.previous_sibling + while sibling: + if isinstance(sibling, Tag): + index += 1 + sibling = sibling.previous_sibling + path_segments.append(str(index * 2)) + curr = curr.parent + + spine_step = (spine_index + 1) * 2 + element_path = "/".join(reversed(path_segments)) + return f"epubcfi(/6/{spine_step}!/{element_path}:0)" + + def _generate_xpath_bs4(self, html_content, local_target_index): + """ + BS4 XPath generator for find_text_location. + Returns: (xpath_string, target_tag_object, is_anchored) + """ + soup = BeautifulSoup(html_content, "html.parser") + current_char_count = 0 + target_tag = None + + elements = soup.find_all(string=True) + for string in elements: + text_len = len(string.strip()) + if text_len == 0: + continue + if current_char_count + text_len >= local_target_index: + target_tag = string.parent + break + current_char_count += text_len + if current_char_count < local_target_index: + current_char_count += 1 + + if not target_tag: + return "/body/div/p[1]", None, False + + path_segments = [] + curr = target_tag + found_anchor = False + + while curr and curr.name != "[document]": + if curr.name == "body": + path_segments.append("body") + break + if curr.has_attr("id") and curr["id"]: + path_segments.append(f"*[@id='{curr['id']}']") + found_anchor = True + break + index = 1 + sibling = curr.previous_sibling + while sibling: + if isinstance(sibling, Tag) and sibling.name == curr.name: + index += 1 + sibling = sibling.previous_sibling + path_segments.append(f"{curr.name}[{index}]") + curr = curr.parent + + if not path_segments: + return "/body/p[1]", target_tag, False + + xpath = "//" + "/".join(reversed(path_segments)) if found_anchor else "/" + "/".join(reversed(path_segments)) + xpath = xpath.rstrip("/") + if xpath in ("", "/", "//", "/body", "//body"): + xpath = "/body/p[1]" + found_anchor = False + return xpath, target_tag, found_anchor diff --git a/src/utils/rate_limiter.py b/src/utils/rate_limiter.py new file mode 100644 index 0000000..0e691b1 --- /dev/null +++ b/src/utils/rate_limiter.py @@ -0,0 +1,55 @@ +""" +Token bucket rate limiter for PageKeeper. + +Thread-safe per-IP rate limiting. Each IP gets a bucket with configurable +capacity that refills at a steady rate. Requests consume tokens; excess +requests are rejected. +""" + +import threading +import time + + +class TokenBucketRateLimiter: + DEFAULT_CAPACITY = 30 + DEFAULT_REFILL_RATE = 2.0 + AUTH_TOKEN_COST = 5 + STALE_SECONDS = 300 + + def __init__(self, capacity: int = None, refill_rate: float = None): + self._capacity = capacity or self.DEFAULT_CAPACITY + self._refill_rate = refill_rate or self.DEFAULT_REFILL_RATE + self._store: dict[str, dict] = {} + self._lock = threading.Lock() + + def check(self, ip: str, cost: int = 1) -> bool: + """Consume `cost` tokens for `ip`. 
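Refill is lazy: tokens = min(capacity, tokens + elapsed * refill_rate).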
Returns True if allowed.""" + now = time.time() + with self._lock: + bucket = self._store.get(ip) + if bucket is None: + bucket = {"tokens": self._capacity, "last": now} + self._store[ip] = bucket + + elapsed = now - bucket["last"] + bucket["tokens"] = min(self._capacity, bucket["tokens"] + elapsed * self._refill_rate) + bucket["last"] = now + + if bucket["tokens"] >= cost: + bucket["tokens"] -= cost + return True + return False + + def prune(self, max_idle_seconds: int = None) -> None: + """Remove entries idle for more than max_idle_seconds.""" + threshold = max_idle_seconds if max_idle_seconds is not None else self.STALE_SECONDS + now = time.time() + with self._lock: + stale = [ip for ip, b in self._store.items() if now - b["last"] > threshold] + for ip in stale: + del self._store[ip] + + def clear(self) -> None: + """Clear all buckets (useful for testing).""" + with self._lock: + self._store.clear() diff --git a/src/utils/smil_extractor.py b/src/utils/smil_extractor.py index 4198486..cf4b6ae 100644 --- a/src/utils/smil_extractor.py +++ b/src/utils/smil_extractor.py @@ -4,12 +4,13 @@ import urllib.parse import zipfile from pathlib import Path -from xml.etree import ElementTree as ET from bs4 import BeautifulSoup +from defusedxml import ElementTree as ET logger = logging.getLogger(__name__) + class SmilExtractor: """ Extracts transcript data from EPUB3 media overlays. @@ -27,38 +28,42 @@ def __init__(self): def _strip_namespaces(self, xml_string: str) -> str: # Remove default xmlns - xml_string = re.sub(r'\sxmlns="[^"]+"', '', xml_string) + xml_string = re.sub(r'\sxmlns="[^"]+"', "", xml_string) # Remove named namespaces (xmlns:foo="bar") - xml_string = re.sub(r'\sxmlns:[a-zA-Z0-9-]+\s*=\s*"[^"]+"', '', xml_string) + xml_string = re.sub(r'\sxmlns:[a-zA-Z0-9-]+\s*=\s*"[^"]+"', "", xml_string) # Remove tag prefixes ( -> ) - xml_string = re.sub(r'<([a-zA-Z0-9-]+):', '<', xml_string) - xml_string = re.sub(r' textref="foo") # Match whitespace, then prefix:name= - xml_string = re.sub(r'(\s)[a-zA-Z0-9-]+:([a-zA-Z0-9-]+\s*=)', r'\1\2', xml_string) + xml_string = re.sub(r"(\s)[a-zA-Z0-9-]+:([a-zA-Z0-9-]+\s*=)", r"\1\2", xml_string) return xml_string def has_media_overlays(self, epub_path: str) -> bool: """Check if an EPUB has media overlay (SMIL) files.""" try: - with zipfile.ZipFile(epub_path, 'r') as zf: + with zipfile.ZipFile(epub_path, "r") as zf: opf_path = self._find_opf_path(zf) - if not opf_path: return False + if not opf_path: + return False - opf_content = zf.read(opf_path).decode('utf-8') + opf_content = zf.read(opf_path).decode("utf-8") root = ET.fromstring(opf_content) - manifest = root.find('.//{http://www.idpf.org/2007/opf}manifest') - if manifest is None: return False + manifest = root.find(".//{http://www.idpf.org/2007/opf}manifest") + if manifest is None: + return False - for item in manifest.findall('{http://www.idpf.org/2007/opf}item'): - if item.get('media-type') == 'application/smil+xml': + for item in manifest.findall("{http://www.idpf.org/2007/opf}item"): + if item.get("media-type") == "application/smil+xml": return True return False except Exception as e: logger.debug(f"Error checking media overlays: {e}") return False - def extract_transcript(self, epub_path: str, abs_chapters: list[dict] = None, audio_offset: float = 0.0) -> list[dict]: + def extract_transcript( + self, epub_path: str, abs_chapters: list[dict] = None, audio_offset: float = 0.0 + ) -> list[dict]: """ Extract transcript from EPUB SMIL files. 
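A minimal usage sketch for the TokenBucketRateLimiter added above (illustrative only; the client IP, the handler shape, and the 429 response are assumptions about the caller, not part of this patch):

    from src.utils.rate_limiter import TokenBucketRateLimiter

    limiter = TokenBucketRateLimiter(capacity=10, refill_rate=1.0)

    def handle_request(ip: str):
        # Ordinary requests cost 1 token; auth attempts could pass
        # cost=TokenBucketRateLimiter.AUTH_TOKEN_COST to drain the bucket faster.
        if not limiter.check(ip):
            return "Too Many Requests", 429
        return "OK", 200

    handle_request("203.0.113.7")  # hypothetical client IP
    limiter.prune()  # drop buckets idle longer than STALE_SECONDS (300s by default)
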
@@ -71,16 +76,17 @@ def extract_transcript(self, epub_path: str, abs_chapters: list[dict] = None, au self._xhtml_cache = {} try: - with zipfile.ZipFile(epub_path, 'r') as zf: + with zipfile.ZipFile(epub_path, "r") as zf: opf_path = self._find_opf_path(zf) if not opf_path: logger.error(f"Could not find OPF file in EPUB: '{epub_path}'") return [] opf_dir = str(Path(opf_path).parent) - if opf_dir == '.': opf_dir = '' + if opf_dir == ".": + opf_dir = "" - opf_content = zf.read(opf_path).decode('utf-8') + opf_content = zf.read(opf_path).decode("utf-8") smil_files = self._get_smil_files_in_order(opf_content, opf_dir, zf) if not smil_files: @@ -95,7 +101,7 @@ def extract_transcript(self, epub_path: str, abs_chapters: list[dict] = None, au timestamp_mode = self._detect_timestamp_mode(zf, smil_files) logger.info(f"Detected timestamp mode: {timestamp_mode}") - if timestamp_mode == 'absolute': + if timestamp_mode == "absolute": # Process all SMIL files with absolute timestamps for idx, smil_path in enumerate(smil_files): segments = self._process_smil_absolute(zf, smil_path) @@ -104,10 +110,15 @@ def extract_transcript(self, epub_path: str, abs_chapters: list[dict] = None, au if logger.isEnabledFor(logging.DEBUG): if idx < 3 or idx == len(smil_files) - 1: if segments: - logger.debug(f" {Path(smil_path).name}: {len(segments)} segments ({segments[0]['start']:.1f}s - {segments[-1]['end']:.1f}s)") + logger.debug( + f" {Path(smil_path).name}: {len(segments)} segments ({segments[0]['start']:.1f}s - {segments[-1]['end']:.1f}s)" + ) + elif timestamp_mode == "relative": # Relative timestamps - calculate offsets if abs_chapters: - logger.info(f" Using Smart Duration Mapping (Files: {len(smil_files)}, Chapters: {len(abs_chapters)})") + logger.info( + f" Using Smart Duration Mapping (Files: {len(smil_files)}, Chapters: {len(abs_chapters)})" + ) transcript = self._process_relative_with_chapters(zf, smil_files, abs_chapters) else: logger.info(" Using Sequential Stacking (No ABS chapters provided)") @@ -117,13 +128,13 @@ def extract_transcript(self, epub_path: str, abs_chapters: list[dict] = None, au transcript = self._process_auto_sequence(zf, smil_files) # Sort and deduplicate - transcript.sort(key=lambda x: (x['start'], x['end'])) + transcript.sort(key=lambda x: (x["start"], x["end"])) # Remove exact duplicates seen = set() unique_transcript = [] for seg in transcript: - key = (seg['start'], seg['end'], seg['text']) + key = (seg["start"], seg["end"], seg["text"]) if key not in seen: seen.add(key) unique_transcript.append(seg) @@ -131,17 +142,17 @@ def extract_transcript(self, epub_path: str, abs_chapters: list[dict] = None, au # Post-processing: Clamp to audiobook duration if known if abs_chapters: - abs_end = float(abs_chapters[-1].get('end', 0)) + abs_end = float(abs_chapters[-1].get("end", 0)) if abs_end > 0: original_count = len(transcript) - transcript = [s for s in transcript if s['start'] < abs_end] + transcript = [s for s in transcript if s["start"] < abs_end] removed = original_count - len(transcript) if removed > 0: logger.debug(f" Removed {removed} segments starting after audiobook end ({abs_end:.0f}s)") for s in transcript: - if s['end'] > abs_end: - s['end'] = min(s['end'], abs_end) + if s["end"] > abs_end: + s["end"] = min(s["end"], abs_end) total_segments = len(transcript) logger.info(f"SMIL extraction complete: {total_segments} segments from {len(smil_files)} files") @@ -155,6 +166,7 @@ def extract_transcript(self, epub_path: str, abs_chapters: list[dict] = None, au except Exception as e: 
logger.error(f"Error extracting SMIL transcript: {e}") import traceback + logger.error(traceback.format_exc()) return [] @@ -171,19 +183,19 @@ def _detect_timestamp_mode(self, zf: zipfile.ZipFile, smil_files: list[str]) -> """ sample_ranges = [] # List of (min_ts, max_ts) for each file - for smil_path in smil_files[:min(10, len(smil_files))]: # Sample more files + for smil_path in smil_files[: min(10, len(smil_files))]: # Sample more files try: - smil_content = zf.read(smil_path).decode('utf-8') + smil_content = zf.read(smil_path).decode("utf-8") smil_content = self._strip_namespaces(smil_content) root = ET.fromstring(smil_content) timestamps = [] - for par in root.iter('par'): - audio_elem = par.find('audio') + for par in root.iter("par"): + audio_elem = par.find("audio") if audio_elem is not None: - clip_begin = self._parse_timestamp(audio_elem.get('clipBegin', '0s')) - clip_end = self._parse_timestamp(audio_elem.get('clipEnd', '0s')) + clip_begin = self._parse_timestamp(audio_elem.get("clipBegin", "0s")) + clip_end = self._parse_timestamp(audio_elem.get("clipEnd", "0s")) timestamps.append(clip_begin) timestamps.append(clip_end) @@ -194,7 +206,7 @@ def _detect_timestamp_mode(self, zf: zipfile.ZipFile, smil_files: list[str]) -> continue if len(sample_ranges) < 2: - return 'absolute' # Default to absolute (safer - no offset applied) + return "absolute" # Default to absolute (safer - no offset applied) # Check how many files start near 0 files_starting_near_zero = sum(1 for min_ts, _ in sample_ranges if min_ts < 30) @@ -203,26 +215,26 @@ def _detect_timestamp_mode(self, zf: zipfile.ZipFile, smil_files: list[str]) -> sorted_ranges = sorted(sample_ranges, key=lambda x: x[0]) sequential_count = 0 for i in range(1, len(sorted_ranges)): - if sorted_ranges[i][0] >= sorted_ranges[i-1][1] - 10: # 10s tolerance + if sorted_ranges[i][0] >= sorted_ranges[i - 1][1] - 10: # 10s tolerance sequential_count += 1 # If most files don't start near 0 OR ranges are mostly sequential → absolute # But if multiple files start near 0, assume auto/mixed to be safe. 
if files_starting_near_zero <= 1 and sequential_count >= len(sorted_ranges) - 2: - return 'absolute' + return "absolute" # If most files start near 0 → relative if files_starting_near_zero >= len(sample_ranges) * 0.7: - return 'relative' + return "relative" # Mixed case - use auto/smart sequence logger.info(f" Mixed timestamp patterns: {files_starting_near_zero}/{len(sample_ranges)} start near 0") - return 'auto' + return "auto" def _get_raw_info(self, zf: zipfile.ZipFile, smil_path: str) -> tuple[float, float, str | None]: """Get the min start, max end, and audio src from a SMIL file.""" try: - smil_content = zf.read(smil_path).decode('utf-8') + smil_content = zf.read(smil_path).decode("utf-8") # Strip namespaces smil_content = self._strip_namespaces(smil_content) @@ -231,13 +243,13 @@ def _get_raw_info(self, zf: zipfile.ZipFile, smil_path: str) -> tuple[float, flo ends = [] audio_src = None - for par in root.iter('par'): - audio = par.find('audio') + for par in root.iter("par"): + audio = par.find("audio") if audio is not None: if audio_src is None: - audio_src = audio.get('src') - starts.append(self._parse_timestamp(audio.get('clipBegin', '0s'))) - ends.append(self._parse_timestamp(audio.get('clipEnd', '0s'))) + audio_src = audio.get("src") + starts.append(self._parse_timestamp(audio.get("clipBegin", "0s"))) + ends.append(self._parse_timestamp(audio.get("clipEnd", "0s"))) if starts: return min(starts), max(ends), audio_src @@ -250,45 +262,44 @@ def _process_smil_absolute(self, zf: zipfile.ZipFile, smil_path: str) -> list[di """Process SMIL file using timestamps directly (absolute mode).""" segments = [] try: - smil_content = zf.read(smil_path).decode('utf-8') + smil_content = zf.read(smil_path).decode("utf-8") smil_dir = str(Path(smil_path).parent) - if smil_dir == '.': smil_dir = '' + if smil_dir == ".": + smil_dir = "" smil_content = self._strip_namespaces(smil_content) root = ET.fromstring(smil_content) - for par in root.iter('par'): - text_elem = par.find('text') - audio_elem = par.find('audio') + for par in root.iter("par"): + text_elem = par.find("text") + audio_elem = par.find("audio") if text_elem is None or audio_elem is None: continue - clip_begin = self._parse_timestamp(audio_elem.get('clipBegin', '0s')) - clip_end = self._parse_timestamp(audio_elem.get('clipEnd', '0s')) + clip_begin = self._parse_timestamp(audio_elem.get("clipBegin", "0s")) + clip_end = self._parse_timestamp(audio_elem.get("clipEnd", "0s")) - text_src = urllib.parse.unquote(text_elem.get('src', '')) + text_src = urllib.parse.unquote(text_elem.get("src", "")) text_content = self._get_text_content(zf, smil_dir, text_src) if text_content: - segments.append({ - 'start': round(clip_begin, 3), - 'end': round(clip_end, 3), - 'text': text_content - }) + segments.append({"start": round(clip_begin, 3), "end": round(clip_end, 3), "text": text_content}) else: logger.debug(f" Text content empty for '{text_src}' (decoded)") except Exception as e: logger.warning(f"Error processing SMIL '{smil_path}': {e}") import traceback + logger.debug(traceback.format_exc()) return segments - def _process_relative_with_chapters(self, zf: zipfile.ZipFile, smil_files: list[str], - abs_chapters: list[dict]) -> list[dict]: + def _process_relative_with_chapters( + self, zf: zipfile.ZipFile, smil_files: list[str], abs_chapters: list[dict] + ) -> list[dict]: """Process SMIL files with relative timestamps using Smart Duration Mapping.""" transcript = [] @@ -313,7 +324,7 @@ def _process_relative_with_chapters(self, zf: zipfile.ZipFile, 
smil_files: list[ best_match_idx = -1 best_offset = current_sequential_offset - smallest_diff = float('inf') + smallest_diff = float("inf") # 2. Search forward in ABS chapters for a duration match # Look ahead up to 6 chapters to account for skipped intro/prologue tracks @@ -322,8 +333,8 @@ def _process_relative_with_chapters(self, zf: zipfile.ZipFile, smil_files: list[ for abs_idx in range(search_start, search_end): ch = abs_chapters[abs_idx] - ch_start = float(ch.get('start', 0)) - ch_end = float(ch.get('end', 0)) + ch_start = float(ch.get("start", 0)) + ch_end = float(ch.get("end", 0)) ch_duration = ch_end - ch_start diff = abs(ch_duration - smil_duration) @@ -339,12 +350,16 @@ def _process_relative_with_chapters(self, zf: zipfile.ZipFile, smil_files: list[ if last_matched_abs_idx != -1 and best_match_idx > last_matched_abs_idx + 1: logger.info(f" Skipped {best_match_idx - last_matched_abs_idx - 1} ABS tracks to find match.") - logger.debug(f" Matched SMIL {Path(smil_path).name} ({smil_duration:.1f}s) to ABS Ch {best_match_idx} ({abs_chapters[best_match_idx].get('start', 0):.1f}s) - diff: {smallest_diff:.1f}s") + logger.debug( + f" Matched SMIL {Path(smil_path).name} ({smil_duration:.1f}s) to ABS Ch {best_match_idx} ({abs_chapters[best_match_idx].get('start', 0):.1f}s) - diff: {smallest_diff:.1f}s" + ) last_matched_abs_idx = best_match_idx offset = best_offset - current_sequential_offset = float(abs_chapters[best_match_idx].get('end', 0)) + current_sequential_offset = float(abs_chapters[best_match_idx].get("end", 0)) else: - logger.warning(f" No duration match for {Path(smil_path).name} ({smil_duration:.1f}s). Falling back to sequential offset {current_sequential_offset:.1f}s") + logger.warning( + f" No duration match for {Path(smil_path).name} ({smil_duration:.1f}s). Falling back to sequential offset {current_sequential_offset:.1f}s" + ) offset = current_sequential_offset current_sequential_offset += smil_duration @@ -357,8 +372,9 @@ def _process_relative_with_chapters(self, zf: zipfile.ZipFile, smil_files: list[ return transcript - def _process_relative_sequential(self, zf: zipfile.ZipFile, smil_files: list[str], - initial_offset: float) -> list[dict]: + def _process_relative_sequential( + self, zf: zipfile.ZipFile, smil_files: list[str], initial_offset: float + ) -> list[dict]: """Process SMIL files with relative timestamps, stacking sequentially.""" transcript = [] current_offset = initial_offset @@ -368,7 +384,7 @@ def _process_relative_sequential(self, zf: zipfile.ZipFile, smil_files: list[str if segments: transcript.extend(segments) - current_offset = max(s['end'] for s in segments) + current_offset = max(s["end"] for s in segments) if idx < 3 or idx == len(smil_files) - 1: if segments: @@ -389,7 +405,7 @@ def _process_auto_sequence(self, zf: zipfile.ZipFile, smil_files: list[str]) -> # State tracking current_audio_src = None - part_max_end = 0.0 # The furthest point reached in the current Part + part_max_end = 0.0 # The furthest point reached in the current Part # We need to track the cumulative offset of all PREVIOUS parts cumulative_previous_duration = 0.0 @@ -415,7 +431,9 @@ def _process_auto_sequence(self, zf: zipfile.ZipFile, smil_files: list[str]) -> current_audio_src = audio_src elif audio_src != current_audio_src: # NEW PART Detected - logger.info(f" Audio source changed at {Path(smil_path).name} ({current_audio_src} -> {audio_src}). Stacking.") + logger.info( + f" Audio source changed at {Path(smil_path).name} ({current_audio_src} -> {audio_src}). Stacking." 
+ ) # Update cumulative duration with the length of the *previous* part cumulative_previous_duration += part_max_end @@ -445,46 +463,51 @@ def _process_auto_sequence(self, zf: zipfile.ZipFile, smil_files: list[str]) -> if idx < 3 or idx == len(smil_files) - 1: seg_len = len(segments) if segments else 0 - logger.debug(f" {Path(smil_path).name}: {seg_len} segs (src {audio_src}, raw {start_raw:.1f}-{end_raw:.1f} → abs {start_raw+current_offset:.1f}-{end_raw+current_offset:.1f})") + logger.debug( + f" {Path(smil_path).name}: {seg_len} segs (src {audio_src}, raw {start_raw:.1f}-{end_raw:.1f} → abs {start_raw + current_offset:.1f}-{end_raw + current_offset:.1f})" + ) return transcript - def _process_smil_with_offset(self, zf: zipfile.ZipFile, smil_path: str, - offset: float) -> list[dict]: + def _process_smil_with_offset(self, zf: zipfile.ZipFile, smil_path: str, offset: float) -> list[dict]: """Process SMIL file adding an offset to all timestamps.""" segments = [] try: - smil_content = zf.read(smil_path).decode('utf-8') + smil_content = zf.read(smil_path).decode("utf-8") smil_dir = str(Path(smil_path).parent) - if smil_dir == '.': smil_dir = '' + if smil_dir == ".": + smil_dir = "" smil_content = self._strip_namespaces(smil_content) root = ET.fromstring(smil_content) - for par in root.iter('par'): - text_elem = par.find('text') - audio_elem = par.find('audio') + for par in root.iter("par"): + text_elem = par.find("text") + audio_elem = par.find("audio") if text_elem is None or audio_elem is None: continue - clip_begin = self._parse_timestamp(audio_elem.get('clipBegin', '0s')) - clip_end = self._parse_timestamp(audio_elem.get('clipEnd', '0s')) + clip_begin = self._parse_timestamp(audio_elem.get("clipBegin", "0s")) + clip_end = self._parse_timestamp(audio_elem.get("clipEnd", "0s")) - text_src = urllib.parse.unquote(text_elem.get('src', '')) + text_src = urllib.parse.unquote(text_elem.get("src", "")) text_content = self._get_text_content(zf, smil_dir, text_src) if text_content: - segments.append({ - 'start': round(clip_begin + offset, 3), - 'end': round(clip_end + offset, 3), - 'text': text_content - }) + segments.append( + { + "start": round(clip_begin + offset, 3), + "end": round(clip_end + offset, 3), + "text": text_content, + } + ) except Exception as e: logger.warning(f"Error processing SMIL '{smil_path}': {e}") import traceback + logger.debug(traceback.format_exc()) return segments @@ -497,9 +520,9 @@ def _log_gap_analysis(self, transcript: list[dict], abs_chapters: list[dict] = N # Find gaps > 100 seconds gaps = [] for i in range(1, len(transcript)): - gap = transcript[i]['start'] - transcript[i-1]['end'] + gap = transcript[i]["start"] - transcript[i - 1]["end"] if gap > 100: - gaps.append((transcript[i-1]['end'], transcript[i]['start'], gap)) + gaps.append((transcript[i - 1]["end"], transcript[i]["start"], gap)) if gaps: logger.warning(f"Found {len(gaps)} gaps > 100s in transcript") @@ -508,61 +531,73 @@ def _log_gap_analysis(self, transcript: list[dict], abs_chapters: list[dict] = N # Check coverage if abs_chapters: - abs_end = float(abs_chapters[-1].get('end', 0)) - transcript_end = transcript[-1]['end'] + abs_end = float(abs_chapters[-1].get("end", 0)) + transcript_end = transcript[-1]["end"] coverage = (transcript_end / abs_end * 100) if abs_end > 0 else 0 if coverage < 90: - logger.warning(f"Low transcript coverage: {coverage:.1f}% (ends at {transcript_end:.0f}s, audiobook ends at {abs_end:.0f}s)") + logger.warning( + f"Low transcript coverage: {coverage:.1f}% (ends at 
{transcript_end:.0f}s, audiobook ends at {abs_end:.0f}s)" + ) elif coverage > 105: - logger.warning(f"Transcript exceeds audiobook: {coverage:.1f}% (ends at {transcript_end:.0f}s, audiobook ends at {abs_end:.0f}s)") + logger.warning( + f"Transcript exceeds audiobook: {coverage:.1f}% (ends at {transcript_end:.0f}s, audiobook ends at {abs_end:.0f}s)" + ) def _is_front_matter(self, filename: str) -> bool: """Check if filename indicates front matter using word boundary matching.""" # Use word boundaries to avoid matching 'toc' in 'TOCREF' etc. front_patterns = [ - r'\bcontents\b', r'\btoc\b', r'\bcopyright\b', r'\btitle\b', - r'\bcover\b', r'\bdedication\b', r'\backnowledgment\b', - r'\bpreface\b', r'\bforeword\b', r'\bfm0\b', r'\bfrontmatter\b' + r"\bcontents\b", + r"\btoc\b", + r"\bcopyright\b", + r"\btitle\b", + r"\bcover\b", + r"\bdedication\b", + r"\backnowledgment\b", + r"\bpreface\b", + r"\bforeword\b", + r"\bfm0\b", + r"\bfrontmatter\b", ] return any(re.search(p, filename, re.IGNORECASE) for p in front_patterns) def _find_opf_path(self, zf: zipfile.ZipFile) -> str | None: try: - container = zf.read('META-INF/container.xml').decode('utf-8') + container = zf.read("META-INF/container.xml").decode("utf-8") root = ET.fromstring(container) for rootfile in root.iter(): - if rootfile.tag.endswith('rootfile'): - return rootfile.get('full-path') + if rootfile.tag.endswith("rootfile"): + return rootfile.get("full-path") except (KeyError, UnicodeDecodeError, ET.ParseError) as e: logger.debug(f"Failed to read OPF path from container.xml: {e}") return None def _natural_sort_key(self, s): - return [int(text) if text.isdigit() else text.lower() - for text in re.split(r'(\d+)', s)] + return [int(text) if text.isdigit() else text.lower() for text in re.split(r"(\d+)", s)] def _get_smil_files_in_order(self, opf_content: str, opf_dir: str, zf: zipfile.ZipFile) -> list[str]: root = ET.fromstring(opf_content) - manifest = root.find('.//{http://www.idpf.org/2007/opf}manifest') - spine = root.find('.//{http://www.idpf.org/2007/opf}spine') - if manifest is None: return [] + manifest = root.find(".//{http://www.idpf.org/2007/opf}manifest") + spine = root.find(".//{http://www.idpf.org/2007/opf}spine") + if manifest is None: + return [] smil_items = {} content_to_overlay = {} - for item in manifest.findall('{http://www.idpf.org/2007/opf}item'): - if item.get('media-type') == 'application/smil+xml': - smil_items[item.get('id')] = item.get('href') - if item.get('media-overlay'): - content_to_overlay[item.get('id')] = item.get('media-overlay') + for item in manifest.findall("{http://www.idpf.org/2007/opf}item"): + if item.get("media-type") == "application/smil+xml": + smil_items[item.get("id")] = item.get("href") + if item.get("media-overlay"): + content_to_overlay[item.get("id")] = item.get("media-overlay") smil_files = [] seen_smil = set() if spine is not None: - for itemref in spine.findall('{http://www.idpf.org/2007/opf}itemref'): - idref = itemref.get('idref') + for itemref in spine.findall("{http://www.idpf.org/2007/opf}itemref"): + idref = itemref.get("idref") if idref in content_to_overlay: smil_id = content_to_overlay[idref] if smil_id in smil_items and smil_id not in seen_smil: @@ -577,7 +612,7 @@ def _get_smil_files_in_order(self, opf_content: str, opf_dir: str, zf: zipfile.Z valid_files = [] for smil_path in smil_files: - for path_variant in [smil_path, smil_path.lstrip('/'), smil_path.replace('\\', '/')]: + for path_variant in [smil_path, smil_path.lstrip("/"), smil_path.replace("\\", "/")]: try: 
zf.getinfo(path_variant) valid_files.append(path_variant) @@ -587,77 +622,92 @@ def _get_smil_files_in_order(self, opf_content: str, opf_dir: str, zf: zipfile.Z return valid_files def _resolve_path(self, base_dir: str, relative_path: str) -> str: - if not base_dir: return relative_path + if not base_dir: + return relative_path full = str(Path(base_dir) / relative_path) parts = [] - for part in full.replace('\\', '/').split('/'): - if part == '..': - if parts: parts.pop() - elif part and part != '.': + for part in full.replace("\\", "/").split("/"): + if part == "..": + if parts: + parts.pop() + elif part and part != ".": parts.append(part) - return '/'.join(parts) + return "/".join(parts) def _parse_timestamp(self, ts_str: str) -> float: - if not ts_str: return 0.0 + if not ts_str: + return 0.0 ts_str = ts_str.strip() - if ts_str.endswith('ms'): - try: return float(ts_str.replace('ms', '')) / 1000.0 - except ValueError: return 0.0 - - ts_str = ts_str.replace('s', '') - if ':' in ts_str: - parts = ts_str.split(':') - return sum(float(p) * (60 ** i) for i, p in enumerate(reversed(parts))) - try: return float(ts_str) - except ValueError: return 0.0 - - def _get_text_content(self, zf: zipfile.ZipFile, smil_dir: str, - text_src: str) -> str | None: - if not text_src: return None - if '#' in text_src: file_path, fragment_id = text_src.split('#', 1) - else: file_path, fragment_id = text_src, None + if ts_str.endswith("ms"): + try: + return float(ts_str.replace("ms", "")) / 1000.0 + except ValueError: + return 0.0 + + ts_str = ts_str.replace("s", "") + if ":" in ts_str: + parts = ts_str.split(":") + return sum(float(p) * (60**i) for i, p in enumerate(reversed(parts))) + try: + return float(ts_str) + except ValueError: + return 0.0 + + def _get_text_content(self, zf: zipfile.ZipFile, smil_dir: str, text_src: str) -> str | None: + if not text_src: + return None + if "#" in text_src: + file_path, fragment_id = text_src.split("#", 1) + else: + file_path, fragment_id = text_src, None full_path = self._resolve_path(smil_dir, file_path) if full_path not in self._xhtml_cache: - for variant in [full_path, full_path.lstrip('/'), full_path.replace('\\', '/')]: + for variant in [full_path, full_path.lstrip("/"), full_path.replace("\\", "/")]: try: - content = zf.read(variant).decode('utf-8') - self._xhtml_cache[full_path] = BeautifulSoup(content, 'html.parser') + content = zf.read(variant).decode("utf-8") + self._xhtml_cache[full_path] = BeautifulSoup(content, "html.parser") break - except KeyError: continue + except KeyError: + continue soup = self._xhtml_cache.get(full_path) - if not soup: return None + if not soup: + return None if fragment_id: element = soup.find(id=fragment_id) if element: - text = element.get_text(separator=' ', strip=True) - return re.sub(r'\s+', ' ', text).strip() + text = element.get_text(separator=" ", strip=True) + return re.sub(r"\s+", " ", text).strip() return None -def extract_transcript_from_epub(epub_path: str, abs_chapters: list[dict] = None, - output_path: str = None) -> str | None: +def extract_transcript_from_epub( + epub_path: str, abs_chapters: list[dict] = None, output_path: str = None +) -> str | None: extractor = SmilExtractor() - if not extractor.has_media_overlays(epub_path): return None + if not extractor.has_media_overlays(epub_path): + return None transcript = extractor.extract_transcript(epub_path, abs_chapters) - if not transcript: return None + if not transcript: + return None if output_path is None: - output_path = 
str(Path(epub_path).with_suffix('.transcript.json')) + output_path = str(Path(epub_path).with_suffix(".transcript.json")) - with open(output_path, 'w', encoding='utf-8') as f: + with open(output_path, "w", encoding="utf-8") as f: json.dump(transcript, f, ensure_ascii=False) return output_path -if __name__ == '__main__': +if __name__ == "__main__": import sys + logging.basicConfig(level=logging.DEBUG) if len(sys.argv) < 2: print("Usage: python smil_extractor.py ") diff --git a/src/web_server.py b/src/web_server.py index e55ea2f..6c88cae 100644 --- a/src/web_server.py +++ b/src/web_server.py @@ -13,7 +13,8 @@ from markupsafe import Markup from src.api.hardcover_routes import hardcover_bp, init_hardcover_routes -from src.api.kosync_server import init_kosync_server, kosync_admin_bp, kosync_sync_bp +from src.api.kosync_admin import kosync_admin_bp +from src.api.kosync_server import kosync_sync_bp from src.blueprints import register_blueprints from src.blueprints.helpers import safe_folder_name from src.utils.config_loader import ConfigLoader @@ -27,7 +28,7 @@ def _reconfigure_logging(): Reads LOG_LEVEL (default "INFO"), resolves it to a logging level constant, sets the root logger to that level, and logs the outcome. On failure, emits a warning describing the error. """ try: - new_level_str = os.environ.get('LOG_LEVEL', 'INFO').upper() + new_level_str = os.environ.get("LOG_LEVEL", "INFO").upper() new_level = getattr(logging, new_level_str, logging.INFO) root = logging.getLogger() @@ -37,6 +38,7 @@ def _reconfigure_logging(): except Exception as e: logger.warning(f"Failed to reconfigure logging: {e}") + def apply_settings(app): """Hot-reload settings that don't propagate automatically via os.environ. @@ -54,15 +56,15 @@ def apply_settings(app): # 2. Reschedule sync_cycle job with new period try: - sync_mgr = app.config.get('sync_manager') - raw_period = os.environ.get('SYNC_PERIOD_MINS', '5') + sync_mgr = app.config.get("sync_manager") + raw_period = os.environ.get("SYNC_PERIOD_MINS", "5") new_period = int(raw_period) if new_period <= 0: raise ValueError("SYNC_PERIOD_MINS must be an integer greater than 0") - schedule.clear('sync_cycle') + schedule.clear("sync_cycle") if sync_mgr: - schedule.every(new_period).minutes.do(sync_mgr.sync_cycle).tag('sync_cycle') + schedule.every(new_period).minutes.do(sync_mgr.sync_cycle).tag("sync_cycle") logger.info(f"Sync schedule updated to every {new_period} minutes") except Exception as e: errors.append(f"sync reschedule failed: {e}") @@ -74,12 +76,13 @@ def apply_settings(app): errors.append(f"socket listener reconciliation failed: {e}") # 4. Refresh config values that blueprints read from app.config - app.config['ABS_COLLECTION_NAME'] = os.environ.get('ABS_COLLECTION_NAME', 'Synced with KOReader') - app.config['SUGGESTIONS_ENABLED'] = os.environ.get('SUGGESTIONS_ENABLED', 'false').lower() == 'true' + app.config["ABS_COLLECTION_NAME"] = os.environ.get("ABS_COLLECTION_NAME", "Synced with KOReader") + app.config["SUGGESTIONS_ENABLED"] = os.environ.get("SUGGESTIONS_ENABLED", "false").lower() == "true" # 5. 
Reconcile Telegram logging handler state try: from src.utils.logging_utils import reconcile_telegram_logging + reconcile_telegram_logging() except Exception as e: errors.append(f"telegram logging reconciliation failed: {e}") @@ -108,36 +111,36 @@ def _reconcile_socket_listener(app): """ from src.services.abs_socket_listener import ABSSocketListener - instant_sync = os.environ.get('INSTANT_SYNC_ENABLED', 'true').lower() != 'false' - socket_enabled = os.environ.get('ABS_SOCKET_ENABLED', 'true').lower() != 'false' - abs_server = os.environ.get('ABS_SERVER', '') - abs_key = os.environ.get('ABS_KEY', '') + instant_sync = os.environ.get("INSTANT_SYNC_ENABLED", "true").lower() != "false" + socket_enabled = os.environ.get("ABS_SOCKET_ENABLED", "true").lower() != "false" + abs_server = os.environ.get("ABS_SERVER", "") + abs_key = os.environ.get("ABS_KEY", "") should_run = instant_sync and socket_enabled and abs_server and abs_key - current: ABSSocketListener | None = app.config.get('abs_listener') - current_server = app.config.get('_abs_listener_server', '') - current_key = app.config.get('_abs_listener_key', '') + current: ABSSocketListener | None = app.config.get("abs_listener") + current_server = app.config.get("_abs_listener_server", "") + current_key = app.config.get("_abs_listener_key", "") if should_run and current is None: # Start new listener listener = ABSSocketListener( abs_server_url=abs_server, abs_api_token=abs_key, - database_service=app.config['database_service'], - sync_manager=app.config['sync_manager'], + database_service=app.config["database_service"], + sync_manager=app.config["sync_manager"], ) threading.Thread(target=listener.start, daemon=True).start() - app.config['abs_listener'] = listener - app.config['_abs_listener_server'] = abs_server - app.config['_abs_listener_key'] = abs_key + app.config["abs_listener"] = listener + app.config["_abs_listener_server"] = abs_server + app.config["_abs_listener_key"] = abs_key logger.info("ABS Socket.IO listener started via hot-reload") elif not should_run and current is not None: # Stop running listener current.stop() - app.config['abs_listener'] = None - app.config['_abs_listener_server'] = '' - app.config['_abs_listener_key'] = '' + app.config["abs_listener"] = None + app.config["_abs_listener_server"] = "" + app.config["_abs_listener_key"] = "" logger.info("ABS Socket.IO listener stopped via hot-reload") elif should_run and current is not None and (abs_server != current_server or abs_key != current_key): @@ -146,13 +149,13 @@ def _reconcile_socket_listener(app): listener = ABSSocketListener( abs_server_url=abs_server, abs_api_token=abs_key, - database_service=app.config['database_service'], - sync_manager=app.config['sync_manager'], + database_service=app.config["database_service"], + sync_manager=app.config["sync_manager"], ) threading.Thread(target=listener.start, daemon=True).start() - app.config['abs_listener'] = listener - app.config['_abs_listener_server'] = abs_server - app.config['_abs_listener_key'] = abs_key + app.config["abs_listener"] = listener + app.config["_abs_listener_server"] = abs_server + app.config["_abs_listener_key"] = abs_key logger.info("ABS Socket.IO listener restarted via hot-reload (credentials changed)") @@ -161,6 +164,7 @@ def _reconcile_socket_listener(app): manager = None database_service = None + def setup_dependencies(app, test_container=None): """ Initialize dependencies for the web server. 
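The three-way hot-reload logic in _reconcile_socket_listener reduces to a small decision table. The sketch below restates it with a hypothetical helper (not in the patch), assuming should_run and the credential comparison are computed exactly as above:

    def reconcile_action(should_run: bool, running: bool, creds_changed: bool) -> str:
        """Hypothetical condensation of _reconcile_socket_listener's branches."""
        if should_run and not running:
            return "start"
        if not should_run and running:
            return "stop"
        if should_run and running and creds_changed:
            return "restart"  # stop the old listener, start one with new credentials
        return "no-op"

    assert reconcile_action(True, False, False) == "start"
    assert reconcile_action(False, True, False) == "stop"
    assert reconcile_action(True, True, True) == "restart"
    assert reconcile_action(True, True, False) == "no-op"
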
@@ -173,6 +177,7 @@ def setup_dependencies(app, test_container=None): # Initialize Database Service from src.db.migration_utils import initialize_database + database_service = initialize_database(os.environ.get("DATA_DIR", "/data")) # Load settings from DB @@ -182,13 +187,13 @@ def setup_dependencies(app, test_container=None): logger.info("Settings loaded into environment variables") # Migrate ABS_LIBRARY_ID -> ABS_LIBRARY_IDS - old_lib_id = os.environ.get('ABS_LIBRARY_ID', '') - new_lib_ids = os.environ.get('ABS_LIBRARY_IDS', '') + old_lib_id = os.environ.get("ABS_LIBRARY_ID", "") + new_lib_ids = os.environ.get("ABS_LIBRARY_IDS", "") if old_lib_id and not new_lib_ids: - old_only_search = os.environ.get('ABS_ONLY_SEARCH_IN_ABS_LIBRARY_ID', 'false') - if old_only_search.lower() == 'true': - database_service.set_setting('ABS_LIBRARY_IDS', old_lib_id) - os.environ['ABS_LIBRARY_IDS'] = old_lib_id + old_only_search = os.environ.get("ABS_ONLY_SEARCH_IN_ABS_LIBRARY_ID", "false") + if old_only_search.lower() == "true": + database_service.set_setting("ABS_LIBRARY_IDS", old_lib_id) + os.environ["ABS_LIBRARY_IDS"] = old_lib_id logger.info(f"Migrated ABS_LIBRARY_ID '{old_lib_id}' to ABS_LIBRARY_IDS") # Force reconfigure logging level based on new settings @@ -214,6 +219,7 @@ def _get_float_env(key, default): else: # Create production container AFTER loading settings from src.utils.di_container import create_container + container = create_container() # Override the container's database_service with our already-initialized instance @@ -233,22 +239,34 @@ def _get_float_env(key, default): COVERS_DIR.mkdir(parents=True, exist_ok=True) # Store shared state on app.config for blueprint access - app.config['container'] = container - app.config['sync_manager'] = manager - app.config['database_service'] = database_service - if hasattr(container, 'abs_service'): - app.config['abs_service'] = container.abs_service() + app.config["container"] = container + app.config["sync_manager"] = manager + app.config["database_service"] = database_service + if hasattr(container, "abs_service"): + app.config["abs_service"] = container.abs_service() else: from src.services.abs_service import ABSService - app.config['abs_service'] = ABSService(container.abs_client()) - app.config['DATA_DIR'] = DATA_DIR - app.config['EBOOK_DIR'] = EBOOK_DIR - app.config['COVERS_DIR'] = COVERS_DIR - app.config['ABS_COLLECTION_NAME'] = os.environ.get('ABS_COLLECTION_NAME', 'Synced with KOReader') - app.config['SUGGESTIONS_ENABLED'] = os.environ.get('SUGGESTIONS_ENABLED', 'false').lower() == 'true' - - # Register KoSync Blueprint and initialize with dependencies - init_kosync_server(database_service, container, manager, EBOOK_DIR) + + app.config["abs_service"] = ABSService(container.abs_client()) + app.config["DATA_DIR"] = DATA_DIR + app.config["EBOOK_DIR"] = EBOOK_DIR + app.config["COVERS_DIR"] = COVERS_DIR + app.config["ABS_COLLECTION_NAME"] = os.environ.get("ABS_COLLECTION_NAME", "Synced with KOReader") + app.config["SUGGESTIONS_ENABLED"] = os.environ.get("SUGGESTIONS_ENABLED", "false").lower() == "true" + + # Register KoSync blueprints and initialize services + from src.services.kosync_service import KosyncService + from src.utils.debounce_manager import DebounceManager + from src.utils.rate_limiter import TokenBucketRateLimiter + + rate_limiter = TokenBucketRateLimiter() + kosync_service = KosyncService(database_service, container, manager, EBOOK_DIR) + debounce_manager = DebounceManager(database_service, manager, rate_limiter=rate_limiter) + + 
app.config["kosync_service"] = kosync_service + app.config["debounce_manager"] = debounce_manager + app.config["rate_limiter"] = rate_limiter + app.register_blueprint(kosync_sync_bp) app.register_blueprint(kosync_admin_bp) @@ -276,46 +294,48 @@ def inject_global_vars(): - get_bool (callable): get_bool(key) returns `True` if get_val(key, 'false') yields a case-insensitive value in ('true', '1', 'yes', 'on'), `False` otherwise. """ - pagekeeper_env = os.environ.get('PAGEKEEPER_ENV', '').strip().lower() - is_dev_container = pagekeeper_env == 'dev' - title_prefix = '[DEV] ' if is_dev_container else '' + pagekeeper_env = os.environ.get("PAGEKEEPER_ENV", "").strip().lower() + is_dev_container = pagekeeper_env == "dev" + title_prefix = "[DEV] " if is_dev_container else "" def get_val(key, default_val=None): - if key in os.environ: return os.environ[key] + if key in os.environ: + return os.environ[key] DEFAULTS = { - 'TZ': 'America/New_York', - 'LOG_LEVEL': 'INFO', - 'DATA_DIR': '/data', - 'BOOKS_DIR': '/books', - 'ABS_COLLECTION_NAME': 'Synced with KOReader', - 'BOOKLORE_SHELF_NAME': 'Kobo', - 'SYNC_PERIOD_MINS': '5', - 'SYNC_DELTA_ABS_SECONDS': '60', - 'SYNC_DELTA_KOSYNC_PERCENT': '0.5', - 'SYNC_DELTA_BETWEEN_CLIENTS_PERCENT': '0.5', - 'SYNC_DELTA_KOSYNC_WORDS': '400', - 'FUZZY_MATCH_THRESHOLD': '80', - 'WHISPER_MODEL': 'tiny', - 'JOB_MAX_RETRIES': '5', - 'JOB_RETRY_DELAY_MINS': '15', - 'MONITOR_INTERVAL': '3600', - 'AUDIOBOOKS_DIR': '/audiobooks', - 'ABS_PROGRESS_OFFSET_SECONDS': '0', - 'EBOOK_CACHE_SIZE': '3', - 'KOSYNC_HASH_METHOD': 'content', - 'TELEGRAM_LOG_LEVEL': 'ERROR', - 'ABS_ENABLED': 'true', - 'KOSYNC_ENABLED': 'false', - 'STORYTELLER_ENABLED': 'false', - 'BOOKLORE_ENABLED': 'false', - 'HARDCOVER_ENABLED': 'false', - 'TELEGRAM_ENABLED': 'false', - 'SUGGESTIONS_ENABLED': 'false', - 'BOOKFUSION_ENABLED': 'false', - 'REPROCESS_ON_CLEAR_IF_NO_ALIGNMENT': 'true' + "TZ": "America/New_York", + "LOG_LEVEL": "INFO", + "DATA_DIR": "/data", + "BOOKS_DIR": "/books", + "ABS_COLLECTION_NAME": "Synced with KOReader", + "BOOKLORE_SHELF_NAME": "Kobo", + "SYNC_PERIOD_MINS": "5", + "SYNC_DELTA_ABS_SECONDS": "60", + "SYNC_DELTA_KOSYNC_PERCENT": "0.5", + "SYNC_DELTA_BETWEEN_CLIENTS_PERCENT": "0.5", + "SYNC_DELTA_KOSYNC_WORDS": "400", + "FUZZY_MATCH_THRESHOLD": "80", + "WHISPER_MODEL": "tiny", + "JOB_MAX_RETRIES": "5", + "JOB_RETRY_DELAY_MINS": "15", + "MONITOR_INTERVAL": "3600", + "AUDIOBOOKS_DIR": "/audiobooks", + "ABS_PROGRESS_OFFSET_SECONDS": "0", + "EBOOK_CACHE_SIZE": "3", + "KOSYNC_HASH_METHOD": "content", + "TELEGRAM_LOG_LEVEL": "ERROR", + "ABS_ENABLED": "true", + "KOSYNC_ENABLED": "false", + "STORYTELLER_ENABLED": "false", + "BOOKLORE_ENABLED": "false", + "HARDCOVER_ENABLED": "false", + "TELEGRAM_ENABLED": "false", + "SUGGESTIONS_ENABLED": "false", + "BOOKFUSION_ENABLED": "false", + "REPROCESS_ON_CLEAR_IF_NO_ALIGNMENT": "true", } - if key in DEFAULTS: return DEFAULTS[key] - return default_val if default_val is not None else '' + if key in DEFAULTS: + return DEFAULTS[key] + return default_val if default_val is not None else "" def get_bool(key): """ @@ -327,22 +347,23 @@ def get_bool(key): Returns: bool: `True` if the variable's value (case-insensitive) is one of `'true'`, `'1'`, `'yes'`, or `'on'`; `False` otherwise. 
""" - val = get_val(key, 'false') - return val.lower() in ('true', '1', 'yes', 'on') + val = get_val(key, "false") + return val.lower() in ("true", "1", "yes", "on") def get_header_service_url(service_name): from src.utils.service_url_helper import get_service_web_url + prefix = service_name.upper() - if not get_bool(f'{prefix}_ENABLED'): - return '' + if not get_bool(f"{prefix}_ENABLED"): + return "" return get_service_web_url(prefix) def is_active_path(path): - req_path = request.path.rstrip('/') or '/' - target_path = path.rstrip('/') or '/' - if target_path == '/': - return req_path == '/' - return req_path == target_path or req_path.startswith(f'{target_path}/') + req_path = request.path.rstrip("/") or "/" + target_path = path.rstrip("/") or "/" + if target_path == "/": + return req_path == "/" + return req_path == target_path or req_path.startswith(f"{target_path}/") return dict( abs_server=os.environ.get("ABS_SERVER", ""), @@ -365,18 +386,19 @@ def sync_daemon(): Schedules the main sync cycle to run every SYNC_PERIOD_MINS minutes and a pending-job checker every minute, performs an initial sync once at startup, then enters a loop that runs scheduled jobs and sleeps between checks. Errors during the initial sync or in the main loop are logged; the daemon continues retrying after failures. """ try: - schedule.every(int(SYNC_PERIOD_MINS)).minutes.do(manager.sync_cycle).tag('sync_cycle') - schedule.every(1).minutes.do(manager.check_pending_jobs).tag('check_jobs') + schedule.every(int(SYNC_PERIOD_MINS)).minutes.do(manager.sync_cycle).tag("sync_cycle") + schedule.every(1).minutes.do(manager.check_pending_jobs).tag("check_jobs") logger.info(f"Sync daemon started (period: {SYNC_PERIOD_MINS} minutes)") # Wait for the built-in KoSync split-port server to be ready def _wait_for_local_services(timeout=30): - kosync_port = os.environ.get('KOSYNC_PORT') - if not kosync_port or kosync_port == '4477': + kosync_port = os.environ.get("KOSYNC_PORT") + if not kosync_port or kosync_port == "4477": return # No split-port server to wait for import urllib.request + url = f"http://127.0.0.1:{kosync_port}/healthcheck" deadline = time.time() + timeout while time.time() < deadline: @@ -434,75 +456,86 @@ def _get_or_create_secret_key() -> str: def _log_security_warnings(): """Log warnings for common security misconfigurations at startup.""" - kosync_user = os.environ.get('KOSYNC_USER', '') - kosync_key = os.environ.get('KOSYNC_KEY', '') - kosync_port = os.environ.get('KOSYNC_PORT', '') - public_url = os.environ.get('KOSYNC_PUBLIC_URL', '') + kosync_user = os.environ.get("KOSYNC_USER", "") + kosync_key = os.environ.get("KOSYNC_KEY", "") + kosync_port = os.environ.get("KOSYNC_PORT", "") + public_url = os.environ.get("KOSYNC_PUBLIC_URL", "") if not kosync_user or not kosync_key: logger.warning("SECURITY: KOSYNC_USER/KOSYNC_KEY not configured — sync endpoints will reject all requests") elif len(kosync_key) < 8: logger.warning("SECURITY: KOSYNC_KEY is shorter than 8 characters — consider using a stronger password") - if not kosync_port or kosync_port == '4477': - logger.warning("SECURITY: Split-port mode not active — dashboard and sync API share port 4477. " - "Set KOSYNC_PORT to a different port before exposing sync to the internet.") + if not kosync_port or kosync_port == "4477": + logger.warning( + "SECURITY: Split-port mode not active — dashboard and sync API share port 4477. " + "Set KOSYNC_PORT to a different port before exposing sync to the internet." 
+ ) if public_url: from urllib.parse import urlsplit, urlunsplit + parts = urlsplit(public_url) safe_netloc = parts.hostname or "" if parts.port: safe_netloc = f"{safe_netloc}:{parts.port}" safe_url = urlunsplit((parts.scheme, safe_netloc, parts.path or "", "", "")) logger.info(f"KOSync public URL: {safe_url}") - elif kosync_port and kosync_port != '4477': + elif kosync_port and kosync_port != "4477": logger.info("Tip: Set KOSYNC_PUBLIC_URL in settings if you expose KOSync through a reverse proxy") -_ALLOWED_HTML_TAGS = {'p', 'br', 'b', 'i', 'em', 'strong', 'ul', 'ol', 'li'} +_ALLOWED_HTML_TAGS = {"p", "br", "b", "i", "em", "strong", "ul", "ol", "li"} def _sanitize_html(value): """Allow only safe formatting tags and strip all attributes/protocols.""" if not value: - return '' + return "" cleaned = nh3.clean(str(value), tags=_ALLOWED_HTML_TAGS, attributes={}) return Markup(cleaned) # --- Application Factory --- def create_app(test_container=None): - STATIC_DIR = os.environ.get('STATIC_DIR', '/app/static') - TEMPLATE_DIR = os.environ.get('TEMPLATE_DIR', '/app/templates') - app = Flask(__name__, static_folder=STATIC_DIR, static_url_path='/static', template_folder=TEMPLATE_DIR) + STATIC_DIR = os.environ.get("STATIC_DIR", "/app/static") + TEMPLATE_DIR = os.environ.get("TEMPLATE_DIR", "/app/templates") + app = Flask(__name__, static_folder=STATIC_DIR, static_url_path="/static", template_folder=TEMPLATE_DIR) app.secret_key = _get_or_create_secret_key() + app.config["SESSION_COOKIE_SAMESITE"] = "Lax" + app.config["SESSION_COOKIE_HTTPONLY"] = True # Setup dependencies and inject into app context setup_dependencies(app, test_container=test_container) # Register context processors, jinja globals app.context_processor(inject_global_vars) - app.jinja_env.globals['safe_folder_name'] = safe_folder_name - app.jinja_env.filters['sanitize_html'] = _sanitize_html + app.jinja_env.globals["safe_folder_name"] = safe_folder_name + app.jinja_env.filters["sanitize_html"] = _sanitize_html # Register all application blueprints register_blueprints(app) + @app.after_request + def set_security_headers(response): + response.headers["X-Frame-Options"] = "DENY" + response.headers["X-Content-Type-Options"] = "nosniff" + return response + # Return both app and container for external reference return app, container # ---------------- MAIN ---------------- -if __name__ == '__main__': - +if __name__ == "__main__": # Setup signal handlers to catch unexpected kills import signal + def handle_exit_signal(signum, frame): logger.warning(f"Received signal {signum} - Shutting down...") for handler in logger.handlers: handler.flush() - if hasattr(logging.getLogger(), 'handlers'): + if hasattr(logging.getLogger(), "handlers"): for handler in logging.getLogger().handlers: handler.flush() sys.exit(0) @@ -522,25 +555,26 @@ def handle_exit_signal(signum, frame): logger.info("Sync daemon thread started") # Start ABS Socket.IO listener for real-time / instant sync - instant_sync_enabled = os.environ.get('INSTANT_SYNC_ENABLED', 'true').lower() != 'false' - abs_socket_enabled = os.environ.get('ABS_SOCKET_ENABLED', 'true').lower() != 'false' + instant_sync_enabled = os.environ.get("INSTANT_SYNC_ENABLED", "true").lower() != "false" + abs_socket_enabled = os.environ.get("ABS_SOCKET_ENABLED", "true").lower() != "false" if instant_sync_enabled and abs_socket_enabled and container.abs_client().is_configured(): from src.services.abs_socket_listener import ABSSocketListener + abs_listener = ABSSocketListener( - 
abs_server_url=os.environ.get('ABS_SERVER', ''), - abs_api_token=os.environ.get('ABS_KEY', ''), + abs_server_url=os.environ.get("ABS_SERVER", ""), + abs_api_token=os.environ.get("ABS_KEY", ""), database_service=database_service, - sync_manager=manager + sync_manager=manager, ) threading.Thread(target=abs_listener.start, daemon=True).start() - app.config['abs_listener'] = abs_listener - app.config['_abs_listener_server'] = os.environ.get('ABS_SERVER', '') - app.config['_abs_listener_key'] = os.environ.get('ABS_KEY', '') + app.config["abs_listener"] = abs_listener + app.config["_abs_listener_server"] = os.environ.get("ABS_SERVER", "") + app.config["_abs_listener_key"] = os.environ.get("ABS_KEY", "") logger.info("ABS Socket.IO listener started (instant sync enabled)") else: - app.config['abs_listener'] = None - app.config['_abs_listener_server'] = '' - app.config['_abs_listener_key'] = '' + app.config["abs_listener"] = None + app.config["_abs_listener_server"] = "" + app.config["_abs_listener_key"] = "" if not instant_sync_enabled: logger.info("ABS Socket.IO listener disabled (INSTANT_SYNC_ENABLED=false)") elif not abs_socket_enabled: @@ -548,6 +582,7 @@ def handle_exit_signal(signum, frame): # Start per-client poller from src.services.client_poller import ClientPoller + client_poller = ClientPoller( database_service=database_service, sync_manager=manager, @@ -574,17 +609,23 @@ def handle_exit_signal(signum, frame): logger.info("Web interface starting on port 4477") # --- Split-Port Mode --- - sync_port = os.environ.get('KOSYNC_PORT') + sync_port = os.environ.get("KOSYNC_PORT") if sync_port and int(sync_port) != 4477: + def run_sync_only_server(port): sync_app = Flask(__name__) + sync_app.config["kosync_service"] = app.config["kosync_service"] + sync_app.config["debounce_manager"] = app.config["debounce_manager"] + sync_app.config["rate_limiter"] = app.config["rate_limiter"] sync_app.register_blueprint(kosync_sync_bp) - @sync_app.route('/') + + @sync_app.route("/") def sync_health(): return "Sync Server OK", 200 - sync_app.run(host='0.0.0.0', port=port, debug=False, use_reloader=False) + + sync_app.run(host="0.0.0.0", port=port, debug=False, use_reloader=False) threading.Thread(target=run_sync_only_server, args=(int(sync_port),), daemon=True).start() logger.info(f"Split-Port Mode Active: Sync-only server on port {sync_port}") - app.run(host='0.0.0.0', port=4477, debug=False) + app.run(host="0.0.0.0", port=4477, debug=False) diff --git a/static/css/components.css b/static/css/components.css index 3007262..cde6b96 100644 --- a/static/css/components.css +++ b/static/css/components.css @@ -54,12 +54,6 @@ font-size: 12px; } -.btn-error { - background: var(--color-error-light); - color: var(--color-error); - border: 1px solid var(--color-error-border); -} - .btn-danger { background: var(--color-error-light); color: var(--color-error); @@ -354,6 +348,14 @@ select:focus { font-size: 36px; } +.book-cover-placeholder-logo { + width: 40%; + max-width: 48px; + opacity: 0.3; + filter: grayscale(1); + object-fit: contain; +} + /* Info Section */ .book-info { flex: 1; diff --git a/static/css/kosync.css b/static/css/kosync.css new file mode 100644 index 0000000..082e41e --- /dev/null +++ b/static/css/kosync.css @@ -0,0 +1,340 @@ +/* KoSync Document Management page styles */ + +.kosync-shell { + max-width: 900px; + padding-bottom: 60px; +} + +/* ── Page header ── */ +.kosync-page-header { + display: flex; + align-items: flex-start; + justify-content: space-between; + gap: 16px; + margin-bottom: 20px; +} + 
+.kosync-page-kicker { + font-size: 12px; + font-weight: 600; + text-transform: uppercase; + letter-spacing: 0.08em; + color: var(--color-text-faint); + margin: 0 0 4px; +} + +.kosync-page-title { + font-family: Outfit, sans-serif; + font-size: 24px; + font-weight: 700; + margin: 0 0 4px; + color: var(--color-text); +} + +.kosync-page-description { + font-size: 13px; + color: var(--color-text-muted); + margin: 0; + line-height: 1.5; +} + +/* ── Stats strip ── */ +.kosync-stats { + display: flex; + gap: 10px; + flex-wrap: wrap; + margin-bottom: 24px; +} + +.kosync-stat { + font-size: 12px; + color: var(--color-text-muted); + background: var(--color-surface-2); + padding: 4px 12px; + border-radius: 20px; + white-space: nowrap; +} + +.kosync-stat strong { + color: var(--color-text); + font-weight: 600; +} + +/* ── Sections ── */ +.kosync-section { + margin-bottom: 32px; +} + +.kosync-section-header { + display: flex; + align-items: center; + gap: 10px; + margin-bottom: 12px; +} + +.kosync-section-title { + font-family: Outfit, sans-serif; + font-size: 16px; + font-weight: 600; + color: var(--color-text); + margin: 0; +} + +.kosync-section-toggle { + font-size: 12px; + color: var(--color-text-faint); + background: none; + border: none; + cursor: pointer; + padding: 2px 8px; + border-radius: 4px; +} + +.kosync-section-toggle:hover { + background: var(--color-surface-2); + color: var(--color-text-muted); +} + +.kosync-section-help { + font-size: 13px; + color: var(--color-text-muted); + margin: 0 0 12px; + line-height: 1.5; +} + +/* ── Cards ── */ +.kosync-card { + display: flex; + align-items: flex-start; + justify-content: space-between; + gap: 16px; + padding: 14px 16px; + background: var(--color-surface-2); + border-radius: 10px; + margin-bottom: 8px; + transition: background 0.15s; +} + +.kosync-card:hover { + background: var(--color-surface-3); +} + +.kosync-card-info { + flex: 1; + min-width: 0; +} + +.kosync-card-title { + font-size: 14px; + font-weight: 600; + color: var(--color-text); + margin: 0 0 4px; + white-space: nowrap; + overflow: hidden; + text-overflow: ellipsis; +} + +.kosync-card-meta { + display: flex; + gap: 12px; + flex-wrap: wrap; + font-size: 12px; + color: var(--color-text-faint); +} + +.kosync-hash { + font-family: 'IBM Plex Mono', monospace; + font-size: 12px; + color: var(--color-text-muted); + background: var(--color-surface-1); + padding: 2px 6px; + border-radius: 4px; + letter-spacing: 0.02em; +} + +.kosync-card-actions { + display: flex; + gap: 6px; + flex-shrink: 0; + align-items: flex-start; +} + +.kosync-card-actions .btn { + font-size: 11px; + padding: 4px 10px; + white-space: nowrap; +} + +/* ── Inline search panel ── */ +.kosync-search-panel { + margin-top: 10px; + padding: 10px 12px; + background: var(--color-surface-1); + border-radius: 8px; +} + +.kosync-search-panel .search-box { + width: 100%; + margin-bottom: 8px; + box-sizing: border-box; +} + +.kosync-search-result { + display: flex; + align-items: center; + justify-content: space-between; + gap: 8px; + padding: 6px 0; + border-bottom: 1px solid var(--color-border); +} + +.kosync-search-result:last-child { + border-bottom: none; +} + +.kosync-search-result-info { + flex: 1; + min-width: 0; +} + +.kosync-search-result-title { + font-size: 13px; + font-weight: 600; + white-space: nowrap; + overflow: hidden; + text-overflow: ellipsis; +} + +.kosync-search-result-status { + font-size: 11px; + color: var(--color-text-muted); +} + +/* ── Empty state ── */ +.kosync-empty { + text-align: center; + padding: 
24px; + color: var(--color-text-faint); + font-size: 13px; +} + +/* ── Tags ── */ +.kosync-tag { + display: inline-block; + font-size: 10px; + font-weight: 600; + text-transform: uppercase; + letter-spacing: 0.06em; + padding: 2px 8px; + border-radius: 4px; + margin-bottom: 4px; +} + +.kosync-tag--unlinked { + background: rgba(239, 68, 68, 0.15); + color: #ef4444; +} + +.kosync-tag--orphaned { + background: rgba(245, 158, 11, 0.15); + color: #f59e0b; +} + +/* ── Card variants ── */ +.kosync-card--attention { + border-left: 3px solid rgba(239, 68, 68, 0.5); +} + +.kosync-card--stale { + opacity: 0.7; +} + +/* ── Meta highlights ── */ +.kosync-meta--highlight { + color: var(--color-text); + font-weight: 500; +} + +.kosync-meta--warn { + color: #f59e0b; + font-weight: 500; +} + +/* ── Stat variants ── */ +.kosync-stat--alert strong { + color: #ef4444; +} + +.kosync-stat--warn strong { + color: #f59e0b; +} + +/* ── Dashboard: Pending Identification ── */ +.pending-id-help { + font-size: 13px; + color: var(--color-text-muted); + margin: 0 0 12px; +} + +.pending-id-grid { + display: flex; + flex-direction: column; + gap: 6px; +} + +.pending-id-card { + display: flex; + align-items: center; + justify-content: space-between; + gap: 12px; + padding: 10px 14px; + background: var(--color-surface-2); + border-radius: 8px; + border-left: 3px solid rgba(245, 158, 11, 0.5); +} + +.pending-id-info { + display: flex; + align-items: center; + gap: 12px; + flex-wrap: wrap; + font-size: 13px; + color: var(--color-text-muted); +} + +.pending-id-pct { + font-weight: 600; + color: var(--color-text); +} + +.pending-id-device { + font-size: 12px; +} + +.pending-id-btn { + font-size: 11px; + padding: 4px 10px; + flex-shrink: 0; +} + +/* ── Mobile responsive ── */ +@media (max-width: 600px) { + .kosync-card { + flex-direction: column; + gap: 10px; + } + + .kosync-card-actions { + width: 100%; + flex-wrap: wrap; + } + + .kosync-card-actions .btn { + flex: 1; + text-align: center; + } + + .kosync-page-header { + flex-direction: column; + } +} diff --git a/static/css/match.css b/static/css/match.css index b795f93..8afbed8 100644 --- a/static/css/match.css +++ b/static/css/match.css @@ -57,12 +57,12 @@ margin: 20px 0; } -/* Resource grid (square cards) */ +/* Resource grid (portrait cards, matching audiobook grid) */ .resource-grid { display: grid; - grid-template-columns: repeat(auto-fill, minmax(140px, 1fr)); - gap: 16px; - margin: 16px 0; + grid-template-columns: repeat(auto-fill, minmax(160px, 1fr)); + gap: 24px; + margin: 20px 0; } /* Audiobook option card */ @@ -124,28 +124,55 @@ background: rgba(124, 58, 237, 0.08) !important; } -/* Resource card (square) */ +/* Resource card (portrait layout matching audiobook cards) */ .resource-card { - aspect-ratio: 1/1; display: flex; flex-direction: column; + gap: 0; + padding: 0; + text-align: center; + overflow: hidden; +} + +.resource-card .resource-title, +.resource-card .resource-subtitle, +.resource-card .source-badge { + padding-left: 8px; + padding-right: 8px; +} + +.resource-card .resource-title { + padding-top: 10px; +} + +.resource-card .source-badge:last-child { + margin-bottom: 10px; +} + +/* Ghost/icon-only cards (no cover image) keep compact centered layout */ +.resource-card.ghost-card { align-items: center; justify-content: center; gap: 8px; padding: 12px 8px; - text-align: center; +} + +.resource-card.ghost-card .resource-title, +.resource-card.ghost-card .resource-subtitle { + padding-top: 0; } .resource-cover-container { width: 100%; - flex-shrink: 0; + 
aspect-ratio: 2/3; overflow: hidden; + flex-shrink: 0; } .resource-cover { width: 100%; - height: auto; - border-radius: var(--radius); + height: 100%; + object-fit: cover; } .resource-icon { @@ -301,12 +328,17 @@ input[type="radio"] { } .r-match-mode-btn.active, -.r-match-mode-btn:hover { +.r-match-mode-btn:hover:not(:disabled) { color: var(--color-text); background: rgba(255, 255, 255, 0.08); border-color: rgba(255, 255, 255, 0.08); } +.r-match-mode-btn:disabled { + opacity: 0.35; + cursor: not-allowed; +} + /* ── Match page workspace (single-column progressive layout) ── */ .match-workspace { margin-bottom: 28px; @@ -815,7 +847,8 @@ button.batch-select-card { color: #6ee7b7; } -.batch-status-pill.audio-only { +.batch-status-pill.audio-only, +.batch-status-pill.ebook-only { background: rgba(245, 158, 11, 0.16); color: #fbbf24; } @@ -920,8 +953,8 @@ button.batch-select-card { } .resource-grid { - grid-template-columns: repeat(auto-fill, minmax(110px, 1fr)); - gap: 10px; + grid-template-columns: repeat(auto-fill, minmax(130px, 1fr)); + gap: 16px; } .mt-16 { diff --git a/static/css/reading.css b/static/css/reading.css index 0f76fea..3a6c9d6 100644 --- a/static/css/reading.css +++ b/static/css/reading.css @@ -569,6 +569,17 @@ width: 100%; height: 100%; background: var(--gradient-primary); + display: flex; + align-items: center; + justify-content: center; +} + +.r-cover-placeholder .book-cover-placeholder-logo { + width: 40%; + max-width: 48px; + opacity: 0.3; + filter: grayscale(1); + object-fit: contain; } .r-card-body { @@ -1041,10 +1052,17 @@ aspect-ratio: 1 / 1; object-fit: cover; border-radius: var(--radius); - display: block; box-shadow: 0 8px 32px rgba(0, 0, 0, 0.5); } +.r-hero-cover img { + display: block; +} + +.r-hero-cover .book-cover-placeholder-logo { + max-width: 56px; +} + .r-hero-cover--ebook img, .r-hero-cover--ebook .r-cover-placeholder { aspect-ratio: 2 / 3; @@ -3123,6 +3141,23 @@ a.r-service-row-name:hover { text-decoration: underline; } color: var(--color-text-faint); } +.r-tbr-card-cover-placeholder { + width: 100%; + height: 100%; + background: var(--gradient-primary); + display: flex; + align-items: center; + justify-content: center; +} + +.r-tbr-card-cover-placeholder .book-cover-placeholder-logo { + width: 40%; + max-width: 36px; + opacity: 0.3; + filter: grayscale(1); + object-fit: contain; +} + .r-tbr-card-body { flex: 1; min-width: 0; diff --git a/static/js/batch-match.js b/static/js/batch-match.js new file mode 100644 index 0000000..96e194e --- /dev/null +++ b/static/js/batch-match.js @@ -0,0 +1,128 @@ +/* ═══════════════════════════════════════════ + PageKeeper — batch match page + ═══════════════════════════════════════════ */ + +var selectionState = { + audiobook: null, + storyteller: null, + ebook: null, + ebookDisplayName: '', +}; + +function applySelection(card) { + var group = card.dataset.selectGroup; + var value = card.dataset.value || ''; + var targetInputId = card.dataset.targetInput; + var targetInput = document.getElementById(targetInputId); + if (!targetInput) return; + + document.querySelectorAll('[data-select-group="' + group + '"]').forEach(function (el) { + el.classList.remove('selected'); + }); + + card.classList.add('selected'); + targetInput.value = value; + + if (group === 'audiobook') { + selectionState.audiobook = value || null; + } else if (group === 'storyteller') { + selectionState.storyteller = value || null; + } else if (group === 'ebook') { + selectionState.ebook = value || null; + selectionState.ebookDisplayName = 
card.dataset.displayName || value; + var displayInput = document.getElementById(card.dataset.displayInput); + if (displayInput) { + displayInput.value = selectionState.ebookDisplayName; + } + } + + if (group !== 'ebook' && !selectionState.ebook) { + var displayNameInput = document.getElementById('selected_ebook_display_name'); + if (displayNameInput) displayNameInput.value = ''; + } + + updateBatchActionState(); +} + +function updateBatchActionState() { + var addButton = document.getElementById('addToQueueBtn'); + var statusLabel = document.getElementById('selectionStatus'); + if (!addButton || !statusLabel) return; + var hasAudiobook = Boolean(selectionState.audiobook); + var hasEbook = Boolean(selectionState.ebook); + var hasStoryteller = Boolean(selectionState.storyteller); + var hasLinkedSource = hasEbook || hasStoryteller; + var hasAnything = hasAudiobook || hasLinkedSource; + + addButton.disabled = !hasAnything; + + if (hasAudiobook && !hasLinkedSource) { + addButton.textContent = 'Add Audio Only to Queue'; + } else if (!hasAudiobook && hasEbook) { + addButton.textContent = 'Add Ebook Only to Queue'; + } else if (!hasAudiobook && hasStoryteller && !hasEbook) { + addButton.textContent = 'Add Storyteller Only to Queue'; + } else { + addButton.textContent = 'Add to Queue'; + } + + if (!hasAnything) { + statusLabel.textContent = 'Select a book to enable queueing.'; + return; + } + + if (!hasAudiobook && hasStoryteller && !hasEbook) { + statusLabel.textContent = 'Queue will be created as Storyteller-only.'; + return; + } + + if (!hasAudiobook) { + statusLabel.textContent = 'Queue will be created as ebook-only.'; + return; + } + + if (!hasLinkedSource) { + statusLabel.textContent = 'Queue will be created as audio-only.'; + return; + } + + if (hasStoryteller && hasEbook) { + statusLabel.textContent = 'Queue will include both Storyteller and ebook.'; + return; + } + + if (hasStoryteller) { + statusLabel.textContent = 'Queue will use Storyteller as the linked source.'; + return; + } + + statusLabel.textContent = 'Queue will use the selected ebook source.'; +} + +(function initBatchMatch() { + document.querySelectorAll('.batch-select-card').forEach(function (card) { + card.addEventListener('click', function () { applySelection(card); }); + }); + + var preselectedAudiobook = document.querySelector('[data-select-group="audiobook"].selected'); + if (preselectedAudiobook) { + selectionState.audiobook = preselectedAudiobook.dataset.value || null; + document.getElementById('selected_audiobook_id').value = selectionState.audiobook || ''; + } + + var preselectedStoryteller = document.querySelector('[data-select-group="storyteller"].selected'); + if (preselectedStoryteller) { + selectionState.storyteller = preselectedStoryteller.dataset.value || null; + document.getElementById('selected_storyteller_uuid').value = preselectedStoryteller.dataset.value || ''; + } + + var preselectedEbook = document.querySelector('[data-select-group="ebook"].selected'); + if (preselectedEbook) { + selectionState.ebook = preselectedEbook.dataset.value || null; + selectionState.ebookDisplayName = preselectedEbook.dataset.displayName || selectionState.ebook || ''; + document.getElementById('selected_ebook_filename').value = selectionState.ebook || ''; + document.getElementById('selected_ebook_display_name').value = selectionState.ebookDisplayName; + } + + updateBatchActionState(); +})(); diff --git a/static/js/bookfusion.js b/static/js/bookfusion.js new file mode 100644 index 0000000..874a1ec --- /dev/null +++ 
b/static/js/bookfusion.js @@ -0,0 +1,888 @@ +/* ═══════════════════════════════════════════ + PageKeeper — BookFusion page + ═══════════════════════════════════════════ + Depends on: utils.js + No Jinja2 vars — clean extraction. + ═══════════════════════════════════════════ */ + +/* ── Helpers ── */ + +function getSpinnerHtml() { + return ''; +} + +function isMobileViewport() { + return window.matchMedia('(max-width: 768px)').matches; +} + +function keepElementVisible(el, block) { + if (!el || !isMobileViewport()) return; + window.setTimeout(function () { + try { + el.scrollIntoView({ behavior: 'smooth', block: block || 'center', inline: 'nearest' }); + } catch (e) { + el.scrollIntoView(); + } + }, 180); +} + +function revealFirstMobileResult(listId) { + if (!isMobileViewport()) return; + var active = document.activeElement; + if (!active || (active.id !== 'bf-search-input' && active.id !== 'bf-library-search')) return; + var first = document.querySelector('#' + listId + ' .bf-book-item, #' + listId + ' .bf-highlight-group, #' + listId + ' .bf-empty'); + if (first) keepElementVisible(first, 'nearest'); +} + +function scrollActiveTabIntoView(tab) { + var activeTab = document.querySelector('.bf-tab[data-tab="' + tab + '"]'); + if (!activeTab || !isMobileViewport()) return; + activeTab.scrollIntoView({ behavior: 'smooth', inline: 'center', block: 'nearest' }); +} + +/* + * All user-facing text in dynamically generated HTML is passed through + * escapeHtml() (from utils.js) before insertion. + */ + +function createComboboxHtml(options, placeholder, onChangeFnName, extraAttrs) { + extraAttrs = extraAttrs || ''; + var optionsHtml = options.map(function (opt) { + return '
<div class="bf-combobox-option" data-value="' + escapeHtml(opt.value) + '" onclick="handleComboboxSelect(this)">' + escapeHtml(opt.label) + '</div>
'; + }).join(''); + + var selectedOpt = options.find(function (o) { return o.selected; }); + var initialValue = selectedOpt ? escapeHtml(selectedOpt.label) : ''; + var initialDataValue = selectedOpt ? escapeHtml(selectedOpt.value) : ''; + + return '
<div class="bf-combobox" ' + extraAttrs + ' data-on-change="' + onChangeFnName + '">' +
        '<input type="text" class="bf-combobox-input" placeholder="' + escapeHtml(placeholder) + '" value="' + initialValue + '" data-selected-value="' + initialDataValue + '" oninput="handleComboboxFilter(this)" onfocus="this.parentElement.classList.add(\'open\')">' +
        '<div class="bf-combobox-options">' + optionsHtml + '</div>' +
        '</div>
'; +} + +function handleComboboxFilter(input) { + var filter = input.value.toLowerCase(); + var options = input.nextElementSibling.querySelectorAll('.bf-combobox-option'); + options.forEach(function (opt) { + if (opt.textContent.toLowerCase().indexOf(filter) !== -1) { + opt.classList.remove('hidden'); + } else { + opt.classList.add('hidden'); + } + }); + input.dataset.selectedValue = ''; +} + +function handleComboboxSelect(optionEl) { + var combobox = optionEl.closest('.bf-combobox'); + var input = combobox.querySelector('.bf-combobox-input'); + input.value = optionEl.textContent; + input.dataset.selectedValue = optionEl.dataset.value; + combobox.classList.remove('open'); + + if (combobox.dataset.onChange) { + window[combobox.dataset.onChange](combobox); + } +} + +var _newHighlightIds = []; + +/* ── Tab switching ── */ +function switchBFTab(tab) { + _newHighlightIds = []; + document.querySelectorAll('.bf-tab').forEach(function (t) { t.classList.remove('active'); }); + document.querySelectorAll('.bf-panel').forEach(function (p) { p.classList.remove('active'); }); + document.getElementById('bf-panel-' + tab).classList.add('active'); + document.querySelectorAll('.bf-tab').forEach(function (t) { + if (t.dataset.tab === tab) t.classList.add('active'); + }); + scrollActiveTabIntoView(tab); + + if (tab === 'highlights') loadHighlights(); + if (tab === 'library') loadLibrary(); +} + +/* ── Upload Tab ── */ +var searchTimer; + +function debounceSearch() { + clearTimeout(searchTimer); + searchTimer = setTimeout(searchBooks, 300); +} + +function searchBooks() { + var q = document.getElementById('bf-search-input').value.trim(); + var list = document.getElementById('bf-book-list'); + if (!q) { + list.textContent = ''; + var emptyEl = document.createElement('div'); + emptyEl.className = 'bf-empty'; + var h = document.createElement('div'); h.className = 'bf-empty-heading'; h.textContent = 'Search your Booklore library'; + var d = document.createElement('div'); d.className = 'bf-empty-desc'; d.textContent = 'Type above to find books to upload'; + emptyEl.appendChild(h); emptyEl.appendChild(d); + list.appendChild(emptyEl); + return; + } + fetch('/api/bookfusion/booklore-books?q=' + encodeURIComponent(q)) + .then(function (r) { + if (!r.ok) throw new Error('Search failed'); + return r.json(); + }) + .then(function (books) { + list.textContent = ''; + if (!books.length) { + var emptyEl = document.createElement('div'); + emptyEl.className = 'bf-empty'; + var icon = document.createElement('div'); icon.className = 'bf-empty-icon'; icon.textContent = '\uD83D\uDCDA'; + var h = document.createElement('div'); h.className = 'bf-empty-heading'; h.textContent = 'No books found'; + var d = document.createElement('div'); d.className = 'bf-empty-desc'; d.textContent = 'Try a different search term'; + emptyEl.appendChild(icon); emptyEl.appendChild(h); emptyEl.appendChild(d); + list.appendChild(emptyEl); + return; + } + books.forEach(function (b) { + var item = document.createElement('div'); + item.className = 'bf-book-item'; + + var info = document.createElement('div'); + info.className = 'bf-book-info'; + + var titleEl = document.createElement('div'); + titleEl.className = 'bf-book-title'; + titleEl.textContent = b.title || b.fileName; + + var metaEl = document.createElement('div'); + metaEl.className = 'bf-book-meta'; + var metaText = ''; + if (b.authors) metaText += b.authors + ' \u00B7 '; + metaText += b.fileName; + metaEl.textContent = metaText; + var sourceTag = document.createElement('span'); + sourceTag.className 
= 'bf-source-tag'; + sourceTag.textContent = b.source; + metaEl.appendChild(document.createTextNode(' ')); + metaEl.appendChild(sourceTag); + + info.appendChild(titleEl); + info.appendChild(metaEl); + + var btn = document.createElement('button'); + btn.className = 'bf-upload-btn'; + btn.textContent = 'Upload'; + btn.dataset.bookId = b.id; + btn.dataset.title = b.title || ''; + btn.dataset.authors = b.authors || ''; + btn.dataset.fileName = b.fileName || ''; + btn.addEventListener('click', function () { handleUploadClick(btn); }); + + item.appendChild(info); + item.appendChild(btn); + list.appendChild(item); + }); + revealFirstMobileResult('bf-book-list'); + }) + .catch(function (err) { + list.textContent = ''; + var emptyEl = document.createElement('div'); + emptyEl.className = 'bf-empty'; + var icon = document.createElement('div'); icon.className = 'bf-empty-icon'; icon.textContent = '\u26A0\uFE0F'; + var h = document.createElement('div'); h.className = 'bf-empty-heading'; h.textContent = 'Search failed'; + var d = document.createElement('div'); d.className = 'bf-empty-desc'; d.textContent = 'Please try again'; + emptyEl.appendChild(icon); emptyEl.appendChild(h); emptyEl.appendChild(d); + list.appendChild(emptyEl); + }); +} + +function handleUploadClick(btn) { + var book = { + id: btn.dataset.bookId, + title: btn.dataset.title, + authors: btn.dataset.authors, + fileName: btn.dataset.fileName + }; + uploadBook(btn, book); +} + +function uploadBook(btn, book) { + btn.disabled = true; + btn.innerHTML = getSpinnerHtml() + 'Uploading\u2026'; + + fetch('/api/bookfusion/upload', { + method: 'POST', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify({ + book_id: book.id, + title: book.title, + authors: book.authors, + fileName: book.fileName + }) + }) + .then(function (r) { + if (!r.ok) throw new Error('Upload failed'); + return r.json(); + }) + .then(function (data) { + if (data.success) { + btn.textContent = 'Done'; + btn.classList.add('done'); + } else { + btn.textContent = data.error || 'Error'; + btn.classList.add('error'); + btn.disabled = false; + } + }) + .catch(function (err) { + btn.textContent = err.message || 'Upload failed'; + btn.classList.add('error'); + btn.disabled = false; + }); +} + +/* ── Highlights Tab ── */ +function syncHighlights(fullResync) { + var btn = document.getElementById('bf-sync-btn'); + var resyncBtn = document.getElementById('bf-resync-btn'); + var info = document.getElementById('bf-sync-info'); + + var activeBtn = fullResync ? resyncBtn : btn; + var originalText = activeBtn.textContent; + + btn.disabled = true; + resyncBtn.disabled = true; + activeBtn.innerHTML = getSpinnerHtml() + (fullResync ? 'Re-syncing\u2026' : 'Syncing\u2026'); + info.textContent = ''; + + fetch('/api/bookfusion/sync-highlights', { + method: 'POST', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify({ full_resync: !!fullResync }) + }) + .then(function (r) { + if (!r.ok) throw new Error('Sync failed'); + return r.json(); + }) + .then(function (data) { + btn.disabled = false; + resyncBtn.disabled = false; + activeBtn.textContent = originalText; + if (data.success) { + var parts = []; + if (data.new_highlights) parts.push(data.new_highlights + ' new highlight' + (data.new_highlights !== 1 ? 's' : '')); + if (data.books_saved) parts.push(data.books_saved + ' book' + (data.books_saved !== 1 ? 's' : '') + ' cataloged'); + info.textContent = parts.length ? 
'Synced: ' + parts.join(', ') : 'Up to date'; + _newHighlightIds = data.new_ids || []; + loadHighlights(); + } else { + info.textContent = data.error || 'Sync failed'; + } + }) + .catch(function (err) { + btn.disabled = false; + resyncBtn.disabled = false; + activeBtn.textContent = originalText; + info.textContent = err.message || 'Sync failed'; + }); +} + +var _pkBooks = []; +var _currentHighlightGroups = {}; + +function loadHighlights() { + fetch('/api/bookfusion/highlights') + .then(function (r) { + if (!r.ok) throw new Error('Failed to load highlights'); + return r.json(); + }) + .then(function (data) { + var container = document.getElementById('bf-highlights-container'); + + _pkBooks = data.books || []; + var groups = data.highlights; + _currentHighlightGroups = groups; + var bookNames = Object.keys(groups); + + if (!bookNames.length) { + container.textContent = ''; + var emptyEl = document.createElement('div'); + emptyEl.className = 'bf-empty'; + var icon = document.createElement('div'); icon.className = 'bf-empty-icon'; icon.textContent = '\uD83D\uDCDD'; + var h = document.createElement('div'); h.className = 'bf-empty-heading'; + h.textContent = data.has_synced ? 'No highlights found' : 'Sync your highlights'; + var d = document.createElement('div'); d.className = 'bf-empty-desc'; + d.textContent = data.has_synced ? 'Your synced books have no highlights yet' : 'Click "Sync Highlights" to fetch highlights from BookFusion'; + emptyEl.appendChild(icon); emptyEl.appendChild(h); emptyEl.appendChild(d); + container.appendChild(emptyEl); + return; + } + + /* Highlight data is escaped via escapeHtml before insertion */ + container.innerHTML = bookNames.map(function (book) { // eslint-disable-line no-unsanitized/property + var groupData = groups[book]; + var hls = groupData.highlights; + var matchedAbsId = groupData.matched_abs_id; + + var options = _pkBooks.map(function (b) { + return { + value: b.abs_id, + label: b.title, + selected: (matchedAbsId && b.abs_id === matchedAbsId) + }; + }); + + var comboboxHtml = createComboboxHtml(options, 'Match to book\u2026', 'handleHighlightLinkChange', 'data-book-title="' + escapeHtml(book) + '" data-bf-id="' + escapeHtml(groupData.bookfusion_book_id) + '"'); + + var hlsHtml = hls.map(function (h) { + var metaParts = []; + if (h.chapter_heading) metaParts.push(h.chapter_heading.replace(/^#{1,6}\s*/, '')); + if (h.date) metaParts.push(h.date); + var metaText = metaParts.map(escapeHtml).join(' · '); + var isNew = _newHighlightIds.length && h.highlight_id && _newHighlightIds.indexOf(h.highlight_id) !== -1; + var newBadge = isNew ? 'New' : ''; + var newAttr = isNew ? ' data-new-highlight="1"' : ''; + + return '
<div class="bf-highlight"' + newAttr + '>' +
        '<div class="bf-highlight-quote">' + newBadge + escapeHtml(h.quote || '') + '</div>' +
        '<div class="bf-highlight-meta">' + metaText + '</div>' +
        '</div>
'; + }).join(''); + + return '
<div class="bf-highlight-group">' +
        '<div class="bf-group-header collapsed" onclick="toggleGroup(event, this)">' +
        '<span class="bf-group-arrow">\u25BC</span>' +
        '<span class="bf-group-title">' + escapeHtml(book) + '</span>' +
        '<span class="bf-group-count">(' + hls.length + ')</span>' +
        (matchedAbsId ? '<span class="bf-group-linked">\u2714 Linked</span>' : '') +
        '<span class="bf-journal-controls">' + comboboxHtml +
        '<button class="bf-journal-save" data-book-title="' + escapeHtml(book) + '" onclick="handleSaveJournalClick(this)">Save to Journal</button>' +
        '</span>' +
        '</div>' +
        '<div class="bf-group-body hidden">' + hlsHtml + '</div>' +
        '</div>
'; + }).join(''); + revealFirstMobileResult('bf-highlights-container'); + + if (_newHighlightIds.length) { + var firstNew = container.querySelector('[data-new-highlight]'); + if (firstNew) { + var group = firstNew.closest('.bf-group-body'); + if (group && group.classList.contains('hidden')) { + group.classList.remove('hidden'); + var header = group.previousElementSibling; + if (header) header.classList.remove('collapsed'); + } + setTimeout(function () { + firstNew.scrollIntoView({ behavior: 'smooth', block: 'center' }); + }, 100); + } + } + }) + .catch(function (err) { + var container = document.getElementById('bf-highlights-container'); + container.textContent = ''; + var emptyEl = document.createElement('div'); + emptyEl.className = 'bf-empty'; + var icon = document.createElement('div'); icon.className = 'bf-empty-icon'; icon.textContent = '\u26A0\uFE0F'; + var h = document.createElement('div'); h.className = 'bf-empty-heading'; h.textContent = 'Failed to load highlights'; + var d = document.createElement('div'); d.className = 'bf-empty-desc'; d.textContent = 'Please try again'; + emptyEl.appendChild(icon); emptyEl.appendChild(h); emptyEl.appendChild(d); + container.appendChild(emptyEl); + }); +} + +function toggleGroup(e, headerEl) { + if (e.target.closest('.bf-journal-controls')) return; + headerEl.classList.toggle('collapsed'); + headerEl.nextElementSibling.classList.toggle('hidden'); +} + +function handleHighlightLinkChange(comboboxEl) { + var input = comboboxEl.querySelector('.bf-combobox-input'); + var absId = input.dataset.selectedValue; + var bookfusionBookId = comboboxEl.dataset.bfId; + linkHighlight(bookfusionBookId, absId); +} + +function handleSaveJournalClick(btn) { + var bookTitle = btn.dataset.bookTitle; + var groupData = _currentHighlightGroups[bookTitle]; + if (!groupData) return; + var comboboxEl = btn.previousElementSibling; + var input = comboboxEl.querySelector('.bf-combobox-input'); + saveToJournal(btn, input.dataset.selectedValue, groupData.highlights); +} + +function saveToJournal(btn, absId, highlights) { + if (!absId) { + btn.textContent = 'Select a book first'; + btn.classList.add('error'); + setTimeout(function () { + btn.textContent = 'Save to Journal'; + btn.classList.remove('error'); + }, 2000); + return; + } + btn.disabled = true; + btn.innerHTML = getSpinnerHtml() + 'Saving\u2026'; + + var payload = highlights.map(function (h) { + return { + quote: h.quote || '', + chapter: (h.chapter_heading || '').replace(/^#{1,6}\s*/, ''), + highlighted_at: h.date || '' + }; + }); + + fetch('/api/bookfusion/save-journal', { + method: 'POST', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify({ abs_id: absId, highlights: payload }) + }) + .then(function (r) { + if (!r.ok) throw new Error('Save failed'); + return r.json(); + }) + .then(function (data) { + if (data.success) { + btn.textContent = '\u2714 Saved ' + data.saved; + btn.classList.add('done'); + } else { + btn.textContent = data.error || 'Error'; + btn.classList.add('error'); + btn.disabled = false; + } + }) + .catch(function (err) { + btn.textContent = err.message || 'Save failed'; + btn.classList.add('error'); + btn.disabled = false; + }); +} + +function linkHighlight(bookfusionBookId, absId) { + var inputs = document.querySelectorAll('.bf-combobox[data-book-title] .bf-combobox-input'); + inputs.forEach(function (s) { s.disabled = true; }); + fetch('/api/bookfusion/link-highlight', { + method: 'POST', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify({ 
bookfusion_book_id: bookfusionBookId, abs_id: absId || null }) + }) + .then(function (r) { + if (!r.ok) throw new Error('Link failed'); + return r.json(); + }) + .then(function (data) { + inputs.forEach(function (s) { s.disabled = false; }); + if (data.success) { + loadHighlights(); + } else { + console.error('Link failed:', data.error || 'unknown error'); + loadHighlights(); + } + }) + .catch(function (err) { + inputs.forEach(function (s) { s.disabled = false; }); + console.error('Link request failed:', err); + loadHighlights(); + }); +} + +/* ── Library Tab ── */ +var _libraryData = []; +var _dashboardBooks = []; +var _currentRenderedBooks = []; + +function loadLibrary() { + var list = document.getElementById('bf-library-list'); + list.textContent = ''; + var emptyEl = document.createElement('div'); + emptyEl.className = 'bf-empty'; + var icon = document.createElement('div'); icon.className = 'bf-empty-icon'; icon.textContent = '\u23F3'; + var h = document.createElement('div'); h.className = 'bf-empty-heading'; h.textContent = 'Loading\u2026'; + emptyEl.appendChild(icon); emptyEl.appendChild(h); + list.appendChild(emptyEl); + + fetch('/api/bookfusion/library') + .then(function (r) { + if (!r.ok) throw new Error('Failed to load library'); + return r.json(); + }) + .then(function (data) { + _libraryData = data.books || []; + _dashboardBooks = data.dashboard_books || []; + var urlQ = new URLSearchParams(window.location.search).get('q'); + if (urlQ) { + document.getElementById('bf-library-search').value = urlQ; + filterLibrary(); + } else { + renderLibrary(_libraryData); + } + }) + .catch(function (err) { + list.textContent = ''; + var errEl = document.createElement('div'); + errEl.className = 'bf-empty'; + var icon = document.createElement('div'); icon.className = 'bf-empty-icon'; icon.textContent = '\u26A0\uFE0F'; + var h = document.createElement('div'); h.className = 'bf-empty-heading'; h.textContent = 'Failed to load library'; + var d = document.createElement('div'); d.className = 'bf-empty-desc'; d.textContent = 'Please try again'; + errEl.appendChild(icon); errEl.appendChild(h); errEl.appendChild(d); + list.appendChild(errEl); + }); +} + +function filterLibrary() { + var q = document.getElementById('bf-library-search').value.trim().toLowerCase(); + if (!q) { + renderLibrary(_libraryData); + return; + } + var filtered = _libraryData.filter(function (b) { + var searchable = (b.title || '') + ' ' + (b.authors || '') + ' ' + (b.series || '') + ' ' + (b.filenames || []).join(' '); + return searchable.toLowerCase().indexOf(q) !== -1; + }); + renderLibrary(filtered); +} + +function _extractExt(filename) { + var dot = filename.lastIndexOf('.'); + if (dot > 0) return filename.substring(dot + 1).toUpperCase(); + return ''; +} + +function _formatDateNote(data) { + if (!data.dates_set) return null; + var parts = []; + if (data.started_at) parts.push(data.started_at); + if (data.finished_at) parts.push(data.finished_at); + if (!parts.length) return null; + var range = parts.join(' to '); + if (data.dates_source === 'hardcover') { + return 'Reading dates set from Hardcover \u2014 ' + range; + } + return 'Dates estimated from highlights \u2014 ' + range; +} + +/* Library item rendering — all user-facing text is escapeHtml-sanitized */ +function _renderBookItem(b, i) { + var metaHtml = ''; + if (b.authors) metaHtml += escapeHtml(b.authors); + if (b.series) { + if (b.authors) metaHtml += ' · '; + metaHtml += escapeHtml(b.series); + } + if (b.highlight_count > 0) { + metaHtml += ' ' + b.highlight_count 
+ ' highlight' + (b.highlight_count !== 1 ? 's' : '') + ''; + } + + var filenames = b.filenames || (b.filename ? [b.filename] : []); + if (filenames.length) { + metaHtml += ' '; + var exts = []; + filenames.forEach(function (fn) { + var ext = _extractExt(fn); + if (ext && ext !== 'MD' && exts.indexOf(ext) === -1) exts.push(ext); + }); + exts.forEach(function (ext) { + metaHtml += '' + escapeHtml(ext) + ''; + }); + } + + var actionsHtml = ''; + if (b.on_dashboard) { + actionsHtml = '\u2714 Matched' + + ''; + } else { + var options = _dashboardBooks.map(function (db) { + return { value: db.abs_id, label: db.title, selected: false }; + }); + var comboboxHtml = createComboboxHtml(options, 'Match to book\u2026', ''); + actionsHtml = comboboxHtml + + '' + + '' + + ''; + } + + var hideBtn = b.hidden + ? '' + : ''; + + return '
<div class="bf-book-item">' +
        '<div class="bf-book-info">' +
        '<div class="bf-book-title">' + escapeHtml(b.title || b.filename) + '</div>' +
        '<div class="bf-book-meta">' + metaHtml + '</div>' +
        '</div>' +
        '<div class="bf-library-actions" data-index="' + i + '">' +
        actionsHtml +
        hideBtn +
        '</div>' +
        '</div>
'; +} + +function renderLibrary(books) { + var list = document.getElementById('bf-library-list'); + _currentRenderedBooks = books; + + var visible = books.filter(function (b) { return !b.hidden; }); + var hidden = books.filter(function (b) { return b.hidden; }); + + if (!books.length) { + list.textContent = ''; + var emptyEl = document.createElement('div'); + emptyEl.className = 'bf-empty'; + var heading = _libraryData.length ? 'No matches' : 'No books in catalog'; + var desc = _libraryData.length ? 'Try a different filter' : 'Run a Full Re-sync to populate your library'; + var icon = document.createElement('div'); icon.className = 'bf-empty-icon'; icon.textContent = '\uD83D\uDCDA'; + var h = document.createElement('div'); h.className = 'bf-empty-heading'; h.textContent = heading; + var d = document.createElement('div'); d.className = 'bf-empty-desc'; d.textContent = desc; + emptyEl.appendChild(icon); emptyEl.appendChild(h); emptyEl.appendChild(d); + list.appendChild(emptyEl); + return; + } + + /* Library book data is escaped via escapeHtml before HTML insertion */ + var html = ''; + + if (visible.length) { + html += visible.map(function (b) { return _renderBookItem(b, books.indexOf(b)); }).join(''); + } else if (hidden.length) { + html += '
<div class="bf-empty"><div class="bf-empty-icon">\uD83D\uDCDA</div><div class="bf-empty-heading">All books are hidden</div><div class="bf-empty-desc">Expand the hidden section below to manage them</div></div>
'; + } + + if (hidden.length) { + html += '
' + + '
' + + '\u25BC' + + 'Hidden' + + '(' + hidden.length + ')' + + '
' + + '' + + '
'; + } + + list.innerHTML = html; // eslint-disable-line no-unsanitized/property + revealFirstMobileResult('bf-library-list'); +} + +/* toggleHiddenSection — provided by utils.js */ + +function handleHideClick(btn, index) { + var book = _currentRenderedBooks[index]; + btn.disabled = true; + btn.innerHTML = getSpinnerHtml() + 'Hiding\u2026'; + fetch('/api/bookfusion/hide', { + method: 'POST', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify({ bookfusion_ids: book.bookfusion_ids || [book.bookfusion_id], hidden: true }) + }) + .then(function (r) { + if (!r.ok) throw new Error('Hide failed'); + return r.json(); + }) + .then(function (data) { + if (data.success) { + book.hidden = true; + renderLibrary(_currentRenderedBooks); + } else { + btn.textContent = 'Hide'; + btn.disabled = false; + } + }) + .catch(function (err) { + btn.textContent = 'Hide'; + btn.disabled = false; + }); +} + +function handleUnhideClick(btn, index) { + var book = _currentRenderedBooks[index]; + btn.disabled = true; + btn.innerHTML = getSpinnerHtml() + 'Unhiding\u2026'; + fetch('/api/bookfusion/hide', { + method: 'POST', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify({ bookfusion_ids: book.bookfusion_ids || [book.bookfusion_id], hidden: false }) + }) + .then(function (r) { + if (!r.ok) throw new Error('Unhide failed'); + return r.json(); + }) + .then(function (data) { + if (data.success) { + book.hidden = false; + renderLibrary(_currentRenderedBooks); + } else { + btn.textContent = 'Unhide'; + btn.disabled = false; + } + }) + .catch(function (err) { + btn.textContent = 'Unhide'; + btn.disabled = false; + }); +} + +function handleUnlinkClick(btn, index) { + var book = _currentRenderedBooks[index]; + btn.disabled = true; + btn.innerHTML = getSpinnerHtml() + 'Unlinking\u2026'; + fetch('/api/bookfusion/unlink', { + method: 'POST', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify({ abs_id: book.abs_id }) + }) + .then(function (r) { + if (!r.ok) throw new Error('Unlink failed'); + return r.json(); + }) + .then(function (data) { + if (data.success) { + book.on_dashboard = false; + book.abs_id = null; + renderLibrary(_currentRenderedBooks); + } else { + btn.textContent = 'Unlink'; + btn.disabled = false; + } + }) + .catch(function (err) { + btn.textContent = 'Unlink'; + btn.disabled = false; + }); +} + +function handleLinkClick(btn, index) { + var book = _currentRenderedBooks[index]; + var comboboxEl = btn.previousElementSibling; + var input = comboboxEl.querySelector('.bf-combobox-input'); + if (input.dataset.selectedValue) { + matchToBook(btn, input, book, index); + } else { + btn.classList.add('error'); + btn.textContent = 'Select a book first'; + setTimeout(function () { + btn.classList.remove('error'); + btn.textContent = 'Link'; + }, 2000); + } +} + +function handleAddClick(btn, index, status) { + var book = _currentRenderedBooks[index]; + addToDashboard(btn, book, index, status); +} + +function _showDateNote(actionsEl, data) { + var note = _formatDateNote(data); + if (!note) return; + var noteEl = document.createElement('div'); + noteEl.className = 'bf-date-note'; + noteEl.textContent = note; + noteEl.style.cssText = 'font-size: 0.75rem; color: var(--color-text-muted); margin-top: 4px; opacity: 0; transition: opacity 0.4s; width: 100%;'; + actionsEl.appendChild(noteEl); + requestAnimationFrame(function () { noteEl.style.opacity = '1'; }); +} + +function matchToBook(btn, input, book, index) { + var absId = input.dataset.selectedValue; + 
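// absId is the combobox's data-selected-value, set by handleComboboxSelect;
// handleLinkClick only calls matchToBook after verifying it is non-empty.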
btn.disabled = true; + input.disabled = true; + btn.innerHTML = getSpinnerHtml() + 'Linking\u2026'; + + fetch('/api/bookfusion/match-to-book', { + method: 'POST', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify({ bookfusion_ids: book.bookfusion_ids || [book.bookfusion_id], abs_id: absId }) + }) + .then(function (r) { + if (!r.ok) throw new Error('Match failed'); + return r.json(); + }) + .then(function (data) { + if (data.success) { + book.on_dashboard = true; + book.abs_id = absId; + renderLibrary(_currentRenderedBooks); + setTimeout(function () { + var actionsEls = document.querySelectorAll('.bf-library-actions'); + var actionsEl = Array.from(actionsEls).find(function (el) { return el.dataset.index == index; }); + if (actionsEl) _showDateNote(actionsEl, data); + }, 50); + } else { + btn.textContent = 'Link'; + btn.disabled = false; + input.disabled = false; + } + }) + .catch(function (err) { + btn.textContent = 'Link'; + btn.disabled = false; + input.disabled = false; + }); +} + +function addToDashboard(btn, book, index, status) { + btn.disabled = true; + btn.textContent = 'Adding\u2026'; + + var payload = { bookfusion_ids: book.bookfusion_ids || [book.bookfusion_id] }; + if (status) payload.status = status; + + fetch('/api/bookfusion/add-to-dashboard', { + method: 'POST', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify(payload) + }) + .then(function (r) { + if (!r.ok) throw new Error('Add failed'); + return r.json(); + }) + .then(function (data) { + if (data.success) { + book.on_dashboard = true; + book.abs_id = data.abs_id; + renderLibrary(_currentRenderedBooks); + setTimeout(function () { + var actionsEls = document.querySelectorAll('.bf-library-actions'); + var actionsEl = Array.from(actionsEls).find(function (el) { return el.dataset.index == index; }); + if (actionsEl) _showDateNote(actionsEl, data); + }, 50); + } else { + btn.textContent = data.error || 'Error'; + btn.classList.add('error'); + btn.disabled = false; + } + }) + .catch(function (err) { + btn.textContent = err.message || 'Error'; + btn.classList.add('error'); + btn.disabled = false; + }); +} + +/* ── Init ── */ +document.addEventListener('focusin', function (e) { + if (e.target.matches('#bf-search-input, #bf-library-search, .bf-combobox-input')) { + keepElementVisible(e.target, 'center'); + } +}); + +window.addEventListener('resize', function () { + var active = document.activeElement; + if (active && active.matches && active.matches('#bf-search-input, #bf-library-search, .bf-combobox-input')) { + keepElementVisible(active, 'center'); + } +}); + +var urlTab = new URLSearchParams(window.location.search).get('tab'); +if (urlTab && document.getElementById('bf-panel-' + urlTab)) { + switchBFTab(urlTab); +} else { + loadLibrary(); +} diff --git a/static/js/confirm-modal.js b/static/js/confirm-modal.js new file mode 100644 index 0000000..29195fe --- /dev/null +++ b/static/js/confirm-modal.js @@ -0,0 +1,246 @@ +/* ═══════════════════════════════════════════ + PageKeeper — unified confirm modal + ═══════════════════════════════════════════ + Requires the #pk-confirm-modal partial + (templates/partials/confirm_modal.html). + + Usage: + + // JS callback mode + PKModal.confirm({ + title: 'Delete Book', + message: 'Are you sure?', + confirmLabel: 'Delete', // optional, default 'Confirm' + confirmClass: 'btn btn-danger', // optional, default 'btn btn-warning' + onConfirm: function() { ... 
} + }); + + // Form POST mode + PKModal.confirmForm({ + title: 'Clear Progress', + message: 'Clear all progress?', + formAction: '/clear-progress/123', + hiddenFields: { action: 'clear_queue' }, // optional + confirmLabel: 'Clear', + confirmClass: 'btn btn-warning' + }); + + // Alert / info mode (OK button only) + PKModal.alert({ + title: 'Success', + message: 'Operation complete.' + }); + + PKModal.close(); + ═══════════════════════════════════════════ */ + +var PKModal = (function () { + 'use strict'; + + /* ── cached DOM refs (resolved lazily) ── */ + var _modal, _icon, _title, _message, _cancelBtn, _confirmBtn, _form, _hiddenContainer; + var _onConfirmCallback = null; + + function _el(id) { return document.getElementById(id); } + + function _resolve() { + if (_modal) return; + _modal = _el('pk-confirm-modal'); + if (!_modal) { + console.error('PKModal: #pk-confirm-modal not found — is confirm_modal.html included?'); + return; + } + _icon = _el('pk-modal-icon'); + _title = _el('pk-modal-title'); + _message = _el('pk-modal-message'); + _cancelBtn = _el('pk-modal-cancel'); + _confirmBtn = _el('pk-modal-confirm'); + _form = _el('pk-modal-form'); + _hiddenContainer = _el('pk-modal-hidden-fields'); + } + + /* ── internal helpers ── */ + + function _clearChildren(el) { + while (el.firstChild) el.removeChild(el.firstChild); + } + + function _setIcon(accentClass) { + _icon.className = 'confirm-modal-icon'; + _icon.textContent = '\u26A0'; + if (accentClass) _icon.classList.add(accentClass); + } + + function _accentClassFromBtn(btnClass) { + if (!btnClass) return 'confirm-icon-warning'; + if (btnClass.indexOf('danger') !== -1) return 'confirm-icon-danger'; + return 'confirm-icon-warning'; + } + + function _open() { + _modal.style.display = 'flex'; + } + + function _handleConfirmClick() { + if (_onConfirmCallback) { + var cb = _onConfirmCallback; + close(); + cb(); + } + } + + /* ── public API ── */ + + /** + * JS callback mode — shows modal, calls onConfirm when confirmed. + */ + function confirm(opts) { + _resolve(); + if (!_modal) return; + var confirmClass = opts.confirmClass || 'btn btn-warning'; + + _setIcon(_accentClassFromBtn(confirmClass)); + _title.textContent = opts.title || 'Confirm'; + _message.textContent = opts.message || ''; + + /* Cancel button */ + _cancelBtn.textContent = 'Cancel'; + _cancelBtn.style.display = ''; + + /* Confirm button (plain button, not form submit) */ + _confirmBtn.style.display = ''; + _confirmBtn.className = confirmClass; + _confirmBtn.textContent = opts.confirmLabel || 'Confirm'; + _confirmBtn.type = 'button'; + _onConfirmCallback = opts.onConfirm || null; + + /* Hide form */ + _form.style.display = 'none'; + + _open(); + } + + /** + * Form POST mode — shows modal, submits form on confirm. 
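* On confirm the browser submits #pk-modal-form as a regular POST (full page
* navigation). Hidden fields are cleared and rebuilt from opts.hiddenFields on
* every call, so stale fields from a previous dialog never leak through.
* See the usage block at the top of this file for a call example.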
+ */ + function confirmForm(opts) { + _resolve(); + if (!_modal) return; + var confirmClass = opts.confirmClass || 'btn btn-warning'; + + _setIcon(_accentClassFromBtn(confirmClass)); + _title.textContent = opts.title || 'Confirm'; + _message.textContent = opts.message || ''; + + /* Cancel button */ + _cancelBtn.textContent = 'Cancel'; + _cancelBtn.style.display = ''; + + /* Hide plain confirm button */ + _confirmBtn.style.display = 'none'; + _onConfirmCallback = null; + + /* Configure form */ + _form.style.display = ''; + _form.action = opts.formAction || ''; + + /* Hidden fields — clear old ones safely, add new */ + _clearChildren(_hiddenContainer); + if (opts.hiddenFields) { + Object.keys(opts.hiddenFields).forEach(function (name) { + var input = document.createElement('input'); + input.type = 'hidden'; + input.name = name; + input.value = opts.hiddenFields[name]; + _hiddenContainer.appendChild(input); + }); + } + + /* Submit button inside form */ + var submitBtn = _form.querySelector('button[type="submit"]'); + submitBtn.className = confirmClass; + submitBtn.textContent = opts.confirmLabel || opts.title || 'Confirm'; + + _open(); + } + + /** + * Alert / info mode — OK button only, no confirm action. + */ + function alert(opts) { + _resolve(); + if (!_modal) return; + + _icon.className = 'confirm-modal-icon'; + _icon.textContent = ''; + + _title.textContent = opts.title || ''; + _message.textContent = opts.message || ''; + if (opts.preserveWhitespace) { + _message.style.whiteSpace = 'pre-line'; + } else { + _message.style.whiteSpace = ''; + } + + /* Only OK button */ + _cancelBtn.textContent = 'OK'; + _cancelBtn.style.display = ''; + + _confirmBtn.style.display = 'none'; + _form.style.display = 'none'; + _onConfirmCallback = null; + + _open(); + } + + /** + * Close the modal. + */ + function close() { + _resolve(); + if (!_modal) return; + _modal.style.display = 'none'; + _onConfirmCallback = null; + _message.style.whiteSpace = ''; + } + + /* ── keyboard support ── */ + document.addEventListener('keydown', function (e) { + if (e.key === 'Escape' && _modal && _modal.style.display !== 'none') { + close(); + } + }); + + /* ── delegate click on confirm button ── */ + document.addEventListener('click', function (e) { + if (e.target && e.target.id === 'pk-modal-confirm') { + _handleConfirmClick(); + } + }); + + return { + confirm: confirm, + confirmForm: confirmForm, + alert: alert, + close: close + }; +})(); + +/* ─── Legacy bridge ─── + Maps the old showConfirmModal(title, message, formAction, accentType) + signature used in book_card.html and batch_match.html to PKModal. + Remove once all call-sites are migrated. + ─────────────────────────────────────────── */ +function showConfirmModal(title, message, formAction, accentType) { + PKModal.confirmForm({ + title: title, + message: message, + formAction: formAction, + confirmLabel: title, + confirmClass: accentType === 'danger' ? 'btn btn-danger' : 'btn btn-warning' + }); +} + +function closeConfirmModal() { + PKModal.close(); +} diff --git a/static/js/dashboard.js b/static/js/dashboard.js index 95a382d..37617a6 100644 --- a/static/js/dashboard.js +++ b/static/js/dashboard.js @@ -1,6 +1,8 @@ /* PageKeeper — Dashboard JS Extracted from index.html inline scripts. */ +var refreshPaused = false; + function initDashboard() { const processingGrid = document.getElementById('processing-grid'); const currentlyReadingGrid = document.getElementById('currently-reading-grid'); @@ -159,7 +161,9 @@ function initDashboard() { return direction === 'asc' ? 
comparison : -comparison; }); - sortedCards.forEach(card => grid.appendChild(card)); + var fragment = document.createDocumentFragment(); + sortedCards.forEach(function(card) { fragment.appendChild(card); }); + grid.appendChild(fragment); } function applySorting(sortBy) { @@ -224,8 +228,6 @@ function initDashboard() { dashboardSearch.addEventListener('input', filterBooks); } - let refreshPaused = false; - function refreshDashboard() { if (refreshPaused) { setTimeout(refreshDashboard, 30000); @@ -292,6 +294,11 @@ function initDashboard() { document.body.addEventListener('click', e => { if (e.target.classList.contains('modal-overlay')) refreshPaused = false; }); + document.addEventListener('click', e => { + if (e.target.id === 'pk-modal-cancel' || e.target.id === 'pk-modal-confirm') { + refreshPaused = false; + } + }); setTimeout(refreshDashboard, 30000); @@ -390,22 +397,68 @@ function updateKoSyncHash(event) { const title = item.dataset.title; const currentHash = item.dataset.hash; - const msg = `Enter new KoSync MD5 Hash for '${title}'\n\nCurrent: ${currentHash}\n\n(Leave empty to automatically recalculate from the ebook file)`; - const input = prompt(msg); - - if (input !== null) { - const form = document.createElement('form'); + var backdrop = document.createElement('div'); + backdrop.className = 'modal-backdrop'; + backdrop.style.zIndex = '1100'; + + var content = document.createElement('div'); + content.className = 'modal-content'; + content.style.maxWidth = '420px'; + content.style.padding = '24px'; + + var heading = document.createElement('h3'); + heading.style.cssText = 'margin: 0 0 8px; font-size: 16px; font-weight: 600;'; + heading.textContent = 'Update KoSync Hash'; + content.appendChild(heading); + + var desc = document.createElement('p'); + desc.style.cssText = 'margin: 0 0 6px; font-size: 13px; color: var(--color-text-muted); line-height: 1.5;'; + desc.textContent = `Enter new KoSync MD5 Hash for "${title}".`; + content.appendChild(desc); + + var current = document.createElement('p'); + current.style.cssText = 'margin: 0 0 12px; font-size: 12px; color: var(--color-text-muted); font-family: monospace;'; + current.textContent = `Current: ${currentHash}`; + content.appendChild(current); + + var hashInput = document.createElement('input'); + hashInput.type = 'text'; + hashInput.className = 'form-control'; + hashInput.placeholder = 'Leave empty to auto-recalculate from ebook file'; + hashInput.style.cssText = 'width: 100%; margin-bottom: 16px; box-sizing: border-box;'; + content.appendChild(hashInput); + + var btns = document.createElement('div'); + btns.style.cssText = 'display: flex; gap: 8px; justify-content: flex-end;'; + + var cancelBtn = document.createElement('button'); + cancelBtn.className = 'btn btn-secondary'; + cancelBtn.textContent = 'Cancel'; + cancelBtn.addEventListener('click', function () { backdrop.remove(); }); + btns.appendChild(cancelBtn); + + var submitBtn = document.createElement('button'); + submitBtn.className = 'btn btn-warning'; + submitBtn.textContent = 'Update'; + submitBtn.addEventListener('click', function () { + var form = document.createElement('form'); form.method = 'POST'; form.action = `/update-hash/${encodeURIComponent(bookId)}`; - const inputField = document.createElement('input'); - inputField.type = 'hidden'; - inputField.name = 'new_hash'; - inputField.value = input.trim(); - - form.appendChild(inputField); + var hidden = document.createElement('input'); + hidden.type = 'hidden'; + hidden.name = 'new_hash'; + hidden.value = 
hashInput.value.trim(); + form.appendChild(hidden); document.body.appendChild(form); form.submit(); - } + }); + btns.appendChild(submitBtn); + + content.appendChild(btns); + backdrop.appendChild(content); + backdrop.addEventListener('click', function (e) { if (e.target === backdrop) backdrop.remove(); }); + document.body.appendChild(backdrop); + hashInput.focus(); } function syncNow(bookId, btn) { @@ -546,24 +599,28 @@ function resumeBook(bookId, btn) { function dnfBook(bookId, title) { closeAllMenus(); - if (!confirm('Mark "' + title + '" as Did Not Finish? This book will be excluded from syncing.')) { - return; - } - - fetch('/api/dnf/' + encodeURIComponent(bookId), { - method: 'POST', - headers: { 'Content-Type': 'application/json' } - }).then(response => response.json()) - .then(data => { - if (data.success) { - window.location.reload(); - } else { - alert('Error marking book as DNF: ' + (data.error || 'Unknown error')); - } - }).catch(error => { - console.error('Error:', error); - alert('Connection error while marking book as DNF'); - }); + PKModal.confirm({ + title: 'Did Not Finish', + message: 'Mark "' + title + '" as Did Not Finish? This book will be excluded from syncing.', + confirmLabel: 'Mark DNF', + confirmClass: 'btn btn-warning', + onConfirm: function () { + fetch('/api/dnf/' + encodeURIComponent(bookId), { + method: 'POST', + headers: { 'Content-Type': 'application/json' } + }).then(function (response) { return response.json(); }) + .then(function (data) { + if (data.success) { + window.location.reload(); + } else { + PKModal.alert({ title: 'Error', message: data.error || 'Unknown error' }); + } + }).catch(function (error) { + console.error('Error:', error); + PKModal.alert({ title: 'Error', message: 'Connection error while marking book as DNF' }); + }); + } + }); } function retryTranscription(bookId, btn) { @@ -598,18 +655,24 @@ function retryTranscription(bookId, btn) { function markComplete(bookId, title) { closeAllMenus(); - if (!confirm('Are you sure you want to mark "' + title + '" as complete? This will set progress to 100% on all synced platforms.')) { - return; - } - window._mcBookId = bookId; - const modal = document.getElementById('delete-mapping-modal'); - if (modal) modal.style.display = 'flex'; + PKModal.confirm({ + title: 'Mark Complete', + message: 'Mark "' + title + '" as complete? 
This will set progress to 100% on all synced platforms.', + confirmLabel: 'Mark Complete', + confirmClass: 'btn btn-warning', + onConfirm: function () { + window._mcBookId = bookId; + var modal = document.getElementById('delete-mapping-modal'); + if (modal) modal.style.display = 'flex'; + } + }); } function closeDeleteMappingModal() { const modal = document.getElementById('delete-mapping-modal'); if (modal) modal.style.display = 'none'; window._mcBookId = null; + refreshPaused = false; } function _dmExecuteFetch(bookId, shouldDelete) { closeDeleteMappingModal(); @@ -631,11 +694,11 @@ function _dmExecuteFetch(bookId, shouldDelete) { window.location.reload(); } } else { - alert('Error marking book as complete: ' + (data.error || 'Unknown error')); + PKModal.alert({ title: 'Error', message: data.error || 'Unknown error' }); } - }).catch(error => { + }).catch(function (error) { console.error('Error:', error); - alert('Connection error while marking book as complete'); + PKModal.alert({ title: 'Error', message: 'Connection error while marking book as complete' }); }); } @@ -707,6 +770,7 @@ function closeActionPanel() { } panel.style.display = 'none'; + refreshPaused = false; } function closeAllMenus() { @@ -741,41 +805,12 @@ document.addEventListener('keydown', function(e) { } }); -function showConfirmModal(title, message, formAction, accentType) { +/* Override the legacy bridge from confirm-modal.js to also close card menus */ +var _baseShowConfirmModal = showConfirmModal; +showConfirmModal = function(title, message, formAction, accentType) { closeAllMenus(); - const modal = document.getElementById('confirm-modal'); - if (!modal) return; - const iconEl = document.getElementById('confirm-modal-icon'); - const titleEl = document.getElementById('confirm-modal-title'); - const msgEl = document.getElementById('confirm-modal-message'); - const formEl = document.getElementById('confirm-modal-form'); - const submitBtn = document.getElementById('confirm-modal-submit'); - - titleEl.textContent = title; - msgEl.textContent = message; - formEl.action = formAction; - - submitBtn.className = 'btn'; - iconEl.className = 'confirm-modal-icon'; - - if (accentType === 'danger') { - submitBtn.classList.add('btn-danger'); - iconEl.textContent = '\u26A0'; - iconEl.classList.add('confirm-icon-danger'); - } else { - submitBtn.classList.add('btn-warning'); - iconEl.textContent = '\u26A0'; - iconEl.classList.add('confirm-icon-warning'); - } - - submitBtn.textContent = title; - modal.style.display = 'flex'; -} - -function closeConfirmModal() { - const modal = document.getElementById('confirm-modal'); - if (modal) modal.style.display = 'none'; -} + _baseShowConfirmModal(title, message, formAction, accentType); +}; document.addEventListener('click', function(e) { const trigger = e.target.closest('.card-menu-trigger'); if (trigger) { diff --git a/static/js/kosync-documents.js b/static/js/kosync-documents.js new file mode 100644 index 0000000..7e973f3 --- /dev/null +++ b/static/js/kosync-documents.js @@ -0,0 +1,746 @@ +/* PageKeeper — KoSync Document Management */ +(function () { + 'use strict'; + + var data = window.PK_PAGE_DATA; + var documents = data.documents || []; + var orphanedBooks = data.orphanedBooks || []; + + var STALE_DAYS = 30; + + // ── Toast ── + + function showToast(message) { + var existing = document.querySelector('.r-tbr-toast'); + if (existing) existing.remove(); + var toast = document.createElement('div'); + toast.className = 'r-tbr-toast'; + toast.textContent = message; + 
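// Single-toast policy: any existing toast was removed above; the new one
// fades out after 3 seconds and is then detached from the DOM.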
document.body.appendChild(toast); + setTimeout(function () { + toast.style.transition = 'opacity 0.3s'; + toast.style.opacity = '0'; + setTimeout(function () { toast.remove(); }, 300); + }, 3000); + } + + // ── Helpers ── + + function truncHash(hash) { + return hash ? hash.substring(0, 12) + '\u2026' : ''; + } + + function timeAgo(isoStr) { + if (!isoStr) return 'never'; + var diff = Date.now() - new Date(isoStr).getTime(); + var mins = Math.floor(diff / 60000); + if (mins < 1) return 'just now'; + if (mins < 60) return mins + 'm ago'; + var hours = Math.floor(mins / 60); + if (hours < 24) return hours + 'h ago'; + var days = Math.floor(hours / 24); + if (days < 30) return days + 'd ago'; + var months = Math.floor(days / 30); + return months + 'mo ago'; + } + + function daysSince(isoStr) { + if (!isoStr) return Infinity; + return (Date.now() - new Date(isoStr).getTime()) / 86400000; + } + + function clearEl(el) { + while (el.firstChild) el.removeChild(el.firstChild); + } + + function makeEmpty(msg) { + var el = document.createElement('div'); + el.className = 'kosync-empty'; + el.textContent = msg; + return el; + } + + function isRealDevice(doc) { + var d = (doc.device || '').toLowerCase(); + return d && d !== 'pagekeeper-bot' && d !== 'pagekeeper'; + } + + // ── Categorize documents ── + + function categorize() { + var needsAttention = []; // unlinked docs + orphaned books + var healthy = []; // linked, updated within STALE_DAYS + var stale = []; // linked, not updated within STALE_DAYS + + // Unlinked documents -> needs attention + documents.forEach(function (doc) { + if (!doc.linked_book_id) { + needsAttention.push({ type: 'unlinked', doc: doc }); + } else { + var days = daysSince(doc.last_updated); + if (days <= STALE_DAYS) { + healthy.push(doc); + } else { + stale.push(doc); + } + } + }); + + // Orphaned books -> needs attention + orphanedBooks.forEach(function (book) { + needsAttention.push({ type: 'orphaned', book: book }); + }); + + return { needsAttention: needsAttention, healthy: healthy, stale: stale }; + } + + // ── Stats ── + + function renderStats(cats) { + var statsEl = document.getElementById('kosync-stats'); + clearEl(statsEl); + + var items = [ + { label: 'need attention', count: cats.needsAttention.length, cls: cats.needsAttention.length > 0 ? 'kosync-stat--alert' : '' }, + { label: 'healthy', count: cats.healthy.length, cls: '' }, + { label: 'stale (30d+)', count: cats.stale.length, cls: cats.stale.length > 0 ? 'kosync-stat--warn' : '' }, + { label: 'total docs', count: documents.length, cls: '' } + ]; + + items.forEach(function (item) { + var pill = document.createElement('div'); + pill.className = 'kosync-stat' + (item.cls ? 
' ' + item.cls : ''); + var strong = document.createElement('strong'); + strong.textContent = item.count; + pill.appendChild(strong); + pill.appendChild(document.createTextNode(' ' + item.label)); + statsEl.appendChild(pill); + }); + } + + // ── Needs Attention section ── + + function renderNeedsAttention(items) { + var list = document.getElementById('attention-list'); + clearEl(list); + + if (!items.length) { + list.appendChild(makeEmpty('Nothing needs attention.')); + document.getElementById('attention-section').style.display = 'none'; + return; + } + document.getElementById('attention-section').style.display = ''; + + items.forEach(function (item) { + if (item.type === 'unlinked') { + list.appendChild(buildUnlinkedCard(item.doc)); + } else { + list.appendChild(buildOrphanedCard(item.book)); + } + }); + } + + function buildUnlinkedCard(doc) { + var card = document.createElement('div'); + card.className = 'kosync-card kosync-card--attention'; + + var info = document.createElement('div'); + info.className = 'kosync-card-info'; + + // Tag + var tag = document.createElement('span'); + tag.className = 'kosync-tag kosync-tag--unlinked'; + tag.textContent = 'Unlinked Hash'; + info.appendChild(tag); + + // Hash + var hashEl = document.createElement('span'); + hashEl.className = 'kosync-hash'; + hashEl.textContent = truncHash(doc.document_hash); + hashEl.title = doc.document_hash; + info.appendChild(hashEl); + + // Meta line + var meta = document.createElement('div'); + meta.className = 'kosync-card-meta'; + + if (doc.device) { + var devSpan = document.createElement('span'); + devSpan.textContent = isRealDevice(doc) ? doc.device : doc.device + ' (bot)'; + if (isRealDevice(doc)) devSpan.className = 'kosync-meta--highlight'; + meta.appendChild(devSpan); + } + if (doc.percentage) { + var pctSpan = document.createElement('span'); + pctSpan.textContent = (doc.percentage * 100).toFixed(1) + '%'; + meta.appendChild(pctSpan); + } + if (doc.first_seen) { + var seenSpan = document.createElement('span'); + seenSpan.textContent = 'First seen ' + timeAgo(doc.first_seen); + meta.appendChild(seenSpan); + } + info.appendChild(meta); + card.appendChild(info); + + // Actions + var actions = document.createElement('div'); + actions.className = 'kosync-card-actions'; + + var linkBtn = document.createElement('button'); + linkBtn.className = 'btn btn-primary'; + linkBtn.textContent = 'Link to Book'; + linkBtn.type = 'button'; + linkBtn.addEventListener('click', function () { toggleSearchPanel(card, doc.document_hash); }); + actions.appendChild(linkBtn); + + var createBtn = document.createElement('button'); + createBtn.className = 'btn btn-secondary'; + createBtn.textContent = 'Create Book'; + createBtn.type = 'button'; + createBtn.addEventListener('click', function () { showCreateBookModal(doc.document_hash); }); + actions.appendChild(createBtn); + + var deleteBtn = document.createElement('button'); + deleteBtn.className = 'btn btn-danger'; + deleteBtn.textContent = 'Delete'; + deleteBtn.type = 'button'; + deleteBtn.addEventListener('click', function () { + PKModal.confirm({ + title: 'Delete Document', + message: 'Delete KoSync document ' + truncHash(doc.document_hash) + '? 
This removes all stored progress for this hash.', + confirmLabel: 'Delete', + confirmClass: 'btn btn-danger', + onConfirm: function () { deleteDocument(doc.document_hash); } + }); + }); + actions.appendChild(deleteBtn); + + card.appendChild(actions); + return card; + } + + function buildOrphanedCard(book) { + var card = document.createElement('div'); + card.className = 'kosync-card kosync-card--attention'; + + var info = document.createElement('div'); + info.className = 'kosync-card-info'; + + var tag = document.createElement('span'); + tag.className = 'kosync-tag kosync-tag--orphaned'; + tag.textContent = 'Orphaned Hash'; + info.appendChild(tag); + + var title = document.createElement('div'); + title.className = 'kosync-card-title'; + title.textContent = book.title; + info.appendChild(title); + + var meta = document.createElement('div'); + meta.className = 'kosync-card-meta'; + + var hashSpan = document.createElement('span'); + hashSpan.className = 'kosync-hash'; + hashSpan.textContent = truncHash(book.kosync_doc_id); + hashSpan.title = book.kosync_doc_id; + meta.appendChild(hashSpan); + + var statusSpan = document.createElement('span'); + statusSpan.textContent = (book.status || '').replace(/_/g, ' '); + meta.appendChild(statusSpan); + + var helpSpan = document.createElement('span'); + helpSpan.className = 'kosync-meta--warn'; + helpSpan.textContent = 'No device is syncing this hash \u2014 causes 502 errors each sync cycle'; + meta.appendChild(helpSpan); + + info.appendChild(meta); + card.appendChild(info); + + var actions = document.createElement('div'); + actions.className = 'kosync-card-actions'; + + var linkBookBtn = document.createElement('button'); + linkBookBtn.className = 'btn btn-primary'; + linkBookBtn.textContent = 'Link to Book'; + linkBookBtn.type = 'button'; + linkBookBtn.title = 'Search your library and link this hash to a book'; + linkBookBtn.addEventListener('click', function () { + toggleOrphanSearchPanel(card, book.book_id); + }); + actions.appendChild(linkBookBtn); + + var resolveBtn = document.createElement('button'); + resolveBtn.className = 'btn btn-secondary'; + resolveBtn.textContent = 'Link to Self'; + resolveBtn.type = 'button'; + resolveBtn.title = 'Create the missing record and link it back to ' + book.title; + resolveBtn.addEventListener('click', function () { + PKModal.confirm({ + title: 'Link to "' + book.title + '"', + message: 'This creates the missing KoSync document record and links it to "' + book.title + '". The sync engine will then be able to track ebook progress for this hash.', + confirmLabel: 'Link', + confirmClass: 'btn btn-primary', + onConfirm: function () { resolveOrphanedHash(book.book_id); } + }); + }); + actions.appendChild(resolveBtn); + + var clearBtn = document.createElement('button'); + clearBtn.className = 'btn btn-warning'; + clearBtn.textContent = 'Clear Hash'; + clearBtn.type = 'button'; + clearBtn.title = 'Remove the hash from this book entirely'; + clearBtn.addEventListener('click', function () { + PKModal.confirm({ + title: 'Clear Orphaned Hash', + message: 'This removes the pre-calculated KoSync hash from "' + book.title + '". The sync engine will stop trying to look up ebook progress for this book, which eliminates the repeated 502 errors. 
If a KOReader device later syncs this ebook, it will appear as a new unlinked hash you can link manually.', + confirmLabel: 'Clear Hash', + confirmClass: 'btn btn-warning', + onConfirm: function () { clearOrphanedHash(book.book_id); } + }); + }); + actions.appendChild(clearBtn); + card.appendChild(actions); + return card; + } + + // ── Healthy section ── + + function renderHealthy(items) { + var list = document.getElementById('healthy-list'); + clearEl(list); + + if (!items.length) { + list.appendChild(makeEmpty('No active linked documents.')); + return; + } + + items.forEach(function (doc) { + list.appendChild(buildLinkedCard(doc)); + }); + } + + // ── Stale section ── + + function renderStale(items) { + var section = document.getElementById('stale-section'); + var list = document.getElementById('stale-list'); + clearEl(list); + + if (!items.length) { + section.style.display = 'none'; + return; + } + section.style.display = ''; + + items.forEach(function (doc) { + list.appendChild(buildLinkedCard(doc, true)); + }); + } + + function buildLinkedCard(doc, isStale) { + var card = document.createElement('div'); + card.className = 'kosync-card' + (isStale ? ' kosync-card--stale' : ''); + + var info = document.createElement('div'); + info.className = 'kosync-card-info'; + + // Title prominently + var title = document.createElement('div'); + title.className = 'kosync-card-title'; + title.textContent = doc.linked_book_title || '(unknown book)'; + info.appendChild(title); + + var meta = document.createElement('div'); + meta.className = 'kosync-card-meta'; + + // Hash (secondary) + var hashSpan = document.createElement('span'); + hashSpan.className = 'kosync-hash'; + hashSpan.textContent = truncHash(doc.document_hash); + hashSpan.title = doc.document_hash; + meta.appendChild(hashSpan); + + // Percentage + if (doc.percentage) { + var pctSpan = document.createElement('span'); + pctSpan.textContent = (doc.percentage * 100).toFixed(1) + '%'; + meta.appendChild(pctSpan); + } + + // Device with bot indicator + if (doc.device) { + var devSpan = document.createElement('span'); + if (isRealDevice(doc)) { + devSpan.textContent = doc.device; + devSpan.className = 'kosync-meta--highlight'; + } else { + devSpan.textContent = doc.device + ' (bot)'; + } + meta.appendChild(devSpan); + } + + // Last updated as time ago + if (doc.last_updated) { + var updSpan = document.createElement('span'); + updSpan.textContent = timeAgo(doc.last_updated); + if (isStale) updSpan.className = 'kosync-meta--warn'; + meta.appendChild(updSpan); + } + + info.appendChild(meta); + card.appendChild(info); + + // Actions + var actions = document.createElement('div'); + actions.className = 'kosync-card-actions'; + + var unlinkBtn = document.createElement('button'); + unlinkBtn.className = 'btn btn-secondary'; + unlinkBtn.textContent = 'Unlink'; + unlinkBtn.type = 'button'; + unlinkBtn.addEventListener('click', function () { + PKModal.confirm({ + title: 'Unlink Document', + message: 'Unlink ' + truncHash(doc.document_hash) + ' from "' + (doc.linked_book_title || 'unknown') + '"?', + confirmLabel: 'Unlink', + confirmClass: 'btn btn-warning', + onConfirm: function () { unlinkDocument(doc.document_hash); } + }); + }); + actions.appendChild(unlinkBtn); + + var deleteBtn = document.createElement('button'); + deleteBtn.className = 'btn btn-danger'; + deleteBtn.textContent = 'Delete'; + deleteBtn.type = 'button'; + deleteBtn.addEventListener('click', function () { + PKModal.confirm({ + title: 'Delete Document', + message: 'Delete KoSync document for 
"' + (doc.linked_book_title || 'unknown') + '"? This removes stored progress.', + confirmLabel: 'Delete', + confirmClass: 'btn btn-danger', + onConfirm: function () { deleteDocument(doc.document_hash); } + }); + }); + actions.appendChild(deleteBtn); + + card.appendChild(actions); + return card; + } + + // ── Inline book search ── + + function toggleSearchPanel(card, docHash) { + var existing = card.querySelector('.kosync-search-panel'); + if (existing) { + existing.remove(); + return; + } + + var panel = document.createElement('div'); + panel.className = 'kosync-search-panel'; + + var input = document.createElement('input'); + input.type = 'text'; + input.className = 'search-box'; + input.placeholder = 'Search library by title...'; + input.autocomplete = 'off'; + panel.appendChild(input); + + var results = document.createElement('div'); + panel.appendChild(results); + + card.querySelector('.kosync-card-info').appendChild(panel); + input.focus(); + + var timer = null; + input.addEventListener('input', function () { + clearTimeout(timer); + var q = input.value.trim(); + if (q.length < 2) { clearEl(results); return; } + timer = setTimeout(function () { searchBooks(q, results, docHash); }, 350); + }); + } + + function searchBooks(query, resultsEl, docHash) { + clearEl(resultsEl); + resultsEl.appendChild(makeEmpty('Searching...')); + + fetch('/api/reading/library-search?q=' + encodeURIComponent(query)) + .then(function (r) { return r.json(); }) + .then(function (books) { + clearEl(resultsEl); + if (!books.length) { + resultsEl.appendChild(makeEmpty('No matching books found.')); + return; + } + books.forEach(function (book) { + var row = document.createElement('div'); + row.className = 'kosync-search-result'; + + var info = document.createElement('div'); + info.className = 'kosync-search-result-info'; + var t = document.createElement('div'); + t.className = 'kosync-search-result-title'; + t.textContent = book.title || book.abs_id || '(untitled)'; + info.appendChild(t); + var s = document.createElement('div'); + s.className = 'kosync-search-result-status'; + s.textContent = (book.status || '').replace(/_/g, ' '); + info.appendChild(s); + row.appendChild(info); + + var btn = document.createElement('button'); + btn.className = 'btn btn-primary'; + btn.textContent = 'Link'; + btn.type = 'button'; + btn.style.cssText = 'flex-shrink: 0; padding: 4px 12px; font-size: 12px;'; + btn.addEventListener('click', function () { linkDocument(docHash, book); }); + row.appendChild(btn); + resultsEl.appendChild(row); + }); + }) + .catch(function () { + clearEl(resultsEl); + resultsEl.appendChild(makeEmpty('Search failed.')); + }); + } + + // ── Orphan search panel (link hash to a different book) ── + + function toggleOrphanSearchPanel(card, sourceBookId) { + var existing = card.querySelector('.kosync-search-panel'); + if (existing) { existing.remove(); return; } + + var panel = document.createElement('div'); + panel.className = 'kosync-search-panel'; + + var input = document.createElement('input'); + input.type = 'text'; + input.className = 'search-box'; + input.placeholder = 'Search library by title...'; + input.autocomplete = 'off'; + panel.appendChild(input); + + var results = document.createElement('div'); + panel.appendChild(results); + + card.querySelector('.kosync-card-info').appendChild(panel); + input.focus(); + + var timer = null; + input.addEventListener('input', function () { + clearTimeout(timer); + var q = input.value.trim(); + if (q.length < 2) { clearEl(results); return; } + timer = 
setTimeout(function () { searchBooksForOrphan(q, results, sourceBookId); }, 350); + }); + } + + function searchBooksForOrphan(query, resultsEl, sourceBookId) { + clearEl(resultsEl); + resultsEl.appendChild(makeEmpty('Searching...')); + + fetch('/api/reading/library-search?q=' + encodeURIComponent(query)) + .then(function (r) { return r.json(); }) + .then(function (books) { + clearEl(resultsEl); + if (!books.length) { resultsEl.appendChild(makeEmpty('No matching books found.')); return; } + books.forEach(function (book) { + var row = document.createElement('div'); + row.className = 'kosync-search-result'; + + var info = document.createElement('div'); + info.className = 'kosync-search-result-info'; + var t = document.createElement('div'); + t.className = 'kosync-search-result-title'; + t.textContent = book.title || '(untitled)'; + info.appendChild(t); + var s = document.createElement('div'); + s.className = 'kosync-search-result-status'; + s.textContent = (book.status || '').replace(/_/g, ' '); + info.appendChild(s); + row.appendChild(info); + + var btn = document.createElement('button'); + btn.className = 'btn btn-primary'; + btn.textContent = 'Link'; + btn.type = 'button'; + btn.style.cssText = 'flex-shrink: 0; padding: 4px 12px; font-size: 12px;'; + btn.addEventListener('click', function () { + resolveOrphanedHash(sourceBookId, book.id); + }); + row.appendChild(btn); + resultsEl.appendChild(row); + }); + }) + .catch(function () { + clearEl(resultsEl); + resultsEl.appendChild(makeEmpty('Search failed.')); + }); + } + + // ── Create book modal ── + + function showCreateBookModal(docHash) { + var backdrop = document.createElement('div'); + backdrop.className = 'modal-backdrop'; + backdrop.style.zIndex = '1100'; + + var content = document.createElement('div'); + content.className = 'modal-content'; + content.style.maxWidth = '420px'; + content.style.padding = '24px'; + + var heading = document.createElement('h3'); + heading.style.cssText = 'margin: 0 0 8px; font-size: 16px; font-weight: 600;'; + heading.textContent = 'Create Ebook-Only Book'; + content.appendChild(heading); + + var desc = document.createElement('p'); + desc.style.cssText = 'margin: 0 0 12px; font-size: 13px; color: var(--color-text-muted); line-height: 1.5;'; + desc.textContent = 'Create a new book in your library linked to this KoSync hash.'; + content.appendChild(desc); + + var titleInput = document.createElement('input'); + titleInput.type = 'text'; + titleInput.className = 'search-box'; + titleInput.placeholder = 'Book title'; + titleInput.style.cssText = 'width: 100%; margin-bottom: 16px; box-sizing: border-box;'; + content.appendChild(titleInput); + + var btns = document.createElement('div'); + btns.style.cssText = 'display: flex; gap: 8px; justify-content: flex-end;'; + + var cancelBtn = document.createElement('button'); + cancelBtn.className = 'btn btn-secondary'; + cancelBtn.textContent = 'Cancel'; + cancelBtn.type = 'button'; + cancelBtn.addEventListener('click', function () { backdrop.remove(); }); + btns.appendChild(cancelBtn); + + var createBtn = document.createElement('button'); + createBtn.className = 'btn btn-primary'; + createBtn.textContent = 'Create'; + createBtn.type = 'button'; + createBtn.addEventListener('click', function () { + var title = titleInput.value.trim(); + if (!title) { titleInput.focus(); return; } + backdrop.remove(); + createBookFromHash(docHash, title); + }); + btns.appendChild(createBtn); + + content.appendChild(btns); + backdrop.appendChild(content); + backdrop.addEventListener('click', 
function (e) { + if (e.target === backdrop) backdrop.remove(); + }); + document.body.appendChild(backdrop); + titleInput.focus(); + } + + // ── Toggle sections ── + + function setupToggle(btnId, listId) { + var btn = document.getElementById(btnId); + var list = document.getElementById(listId); + if (btn && list) { + btn.addEventListener('click', function () { + var visible = list.style.display !== 'none'; + list.style.display = visible ? 'none' : 'block'; + btn.textContent = visible ? 'show' : 'hide'; + }); + } + } + + setupToggle('toggle-healthy', 'healthy-list'); + setupToggle('toggle-stale', 'stale-list'); + + // ── API actions ── + + function linkDocument(docHash, book) { + var body = book.abs_id ? { abs_id: book.abs_id } : { book_id: book.id }; + fetch('/api/kosync-documents/' + encodeURIComponent(docHash) + '/link', { + method: 'POST', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify(body) + }) + .then(function (r) { return r.json(); }) + .then(function (d) { showToast(d.success ? (d.message || 'Linked') : (d.error || 'Failed')); if (d.success) refreshAll(); }) + .catch(function () { showToast('Link failed'); }); + } + + function deleteDocument(docHash) { + fetch('/api/kosync-documents/' + encodeURIComponent(docHash), { method: 'DELETE' }) + .then(function (r) { return r.json(); }) + .then(function (d) { showToast(d.success ? (d.message || 'Deleted') : (d.error || 'Failed')); if (d.success) refreshAll(); }) + .catch(function () { showToast('Delete failed'); }); + } + + function unlinkDocument(docHash) { + fetch('/api/kosync-documents/' + encodeURIComponent(docHash) + '/unlink', { method: 'POST' }) + .then(function (r) { return r.json(); }) + .then(function (d) { showToast(d.success ? (d.message || 'Unlinked') : (d.error || 'Failed')); if (d.success) refreshAll(); }) + .catch(function () { showToast('Unlink failed'); }); + } + + function resolveOrphanedHash(bookId, targetBookId) { + var body = targetBookId ? { target_book_id: targetBookId } : {}; + var opts = { + method: 'POST', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify(body), + }; + fetch('/api/kosync-documents/resolve-orphan/' + encodeURIComponent(bookId), opts) + .then(function (r) { return r.json(); }) + .then(function (d) { showToast(d.success ? (d.message || 'Linked') : (d.error || 'Failed')); if (d.success) refreshAll(); }) + .catch(function () { showToast('Resolve failed'); }); + } + + function clearOrphanedHash(bookId) { + fetch('/api/kosync-documents/clear-orphan/' + encodeURIComponent(bookId), { method: 'POST' }) + .then(function (r) { return r.json(); }) + .then(function (d) { showToast(d.success ? (d.message || 'Cleared') : (d.error || 'Failed')); if (d.success) refreshAll(); }) + .catch(function () { showToast('Clear failed'); }); + } + + function createBookFromHash(docHash, title) { + fetch('/api/kosync-documents/' + encodeURIComponent(docHash) + '/create-book', { + method: 'POST', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify({ title: title }) + }) + .then(function (r) { return r.json(); }) + .then(function (d) { showToast(d.success ? 
(d.message || 'Created') : (d.error || 'Failed')); if (d.success) refreshAll(); }) + .catch(function () { showToast('Create failed'); }); + } + + // ── Refresh ── + + function refreshAll() { + Promise.all([ + fetch('/api/kosync-documents').then(function (r) { return r.json(); }), + fetch('/api/kosync-documents/orphaned').then(function (r) { return r.json(); }) + ]).then(function (results) { + documents = results[0].documents || []; + orphanedBooks = results[1] || []; + renderAll(); + }).catch(function () { showToast('Failed to refresh'); }); + } + + // ── Render all ── + + function renderAll() { + var cats = categorize(); + renderStats(cats); + renderNeedsAttention(cats.needsAttention); + renderHealthy(cats.healthy); + renderStale(cats.stale); + } + + renderAll(); +})(); diff --git a/static/js/logs.js b/static/js/logs.js new file mode 100644 index 0000000..c5eb8b6 --- /dev/null +++ b/static/js/logs.js @@ -0,0 +1,647 @@ +/* ═══════════════════════════════════════════ + PageKeeper — logs page + ═══════════════════════════════════════════ + Depends on: utils.js (escapeHtml, debounce) + No Jinja2 vars — clean extraction. + ═══════════════════════════════════════════ */ + +let autoRefreshInterval = null; +let liveRefreshInterval = null; +let currentOffset = 0; +let shownLogs = new Set(); + +// DOM elements +const logContent = document.getElementById('logContent'); +const logLevel = document.getElementById('logLevel'); +const searchInput = document.getElementById('searchInput'); +const linesCount = document.getElementById('linesCount'); +const autoRefresh = document.getElementById('autoRefresh'); +const liveMode = document.getElementById('liveMode'); +const refreshBtn = document.getElementById('refreshBtn'); +const totalLinesStats = document.getElementById('totalLinesStats'); +const displayedLinesStats = document.getElementById('displayedLinesStats'); +const lastUpdated = document.getElementById('lastUpdated'); +const scrollToTop = document.getElementById('scrollToTop'); +const scrollToBottom = document.getElementById('scrollToBottom'); +const loadMore = document.getElementById('loadMore'); + +let isAtBottom = true; +let userScrolled = false; +let filterPending = false; + +function showNoLogsMessage() { + logContent.textContent = ''; + var noLogsLine = document.createElement('div'); + noLogsLine.className = 'log-line'; + var noLogsLevel = document.createElement('span'); + noLogsLevel.className = 'log-level INFO'; + noLogsLevel.textContent = 'INFO'; + var noLogsMsg = document.createElement('span'); + noLogsMsg.className = 'log-message'; + noLogsMsg.textContent = 'No logs found matching current filters'; + noLogsLine.appendChild(noLogsLevel); + noLogsLine.appendChild(noLogsMsg); + logContent.appendChild(noLogsLine); +} + +logContent.addEventListener('scroll', () => { + const isCurrentlyAtBottom = logContent.scrollTop + logContent.clientHeight >= logContent.scrollHeight - 10; + if (!isCurrentlyAtBottom && !userScrolled) { + userScrolled = true; + } + isAtBottom = isCurrentlyAtBottom; +}); + +async function fetchLogs(offset = 0, append = false) { + try { + const params = new URLSearchParams({ + level: logLevel.value, + search: searchInput.value, + lines: linesCount.value, + offset: offset + }); + + const response = await fetch(`/api/logs?${params}`); + const data = await response.json(); + + if (!response.ok) { + throw new Error(data.error || 'Failed to fetch logs'); + } + + displayLogs(data, append); + updateStats(data); + updateLastUpdated(); + + return data; + } catch (error) { + 
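+        // fetch() only rejects on network failure; non-2xx responses are thrown
+        // above, so both cases land here and render as an inline ERROR row.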
console.error('Error fetching logs:', error); + logContent.textContent = ''; + const line = document.createElement('div'); + line.className = 'log-line'; + const level = document.createElement('span'); + level.className = 'log-level ERROR'; + level.textContent = 'ERROR'; + const msg = document.createElement('span'); + msg.className = 'log-message'; + msg.textContent = `Failed to fetch logs: ${error.message}`; + line.appendChild(level); + line.appendChild(msg); + logContent.appendChild(line); + } +} + +async function fetchLiveLogs() { + try { + const params = new URLSearchParams({ + level: logLevel.value, + search: searchInput.value, + count: 50 + }); + + const response = await fetch(`/api/logs/live?${params}`); + const data = await response.json(); + + if (!response.ok) { + throw new Error(data.error || 'Failed to fetch live logs'); + } + + if (data.logs && data.logs.length > 0) { + appendNewLogs(data.logs, filterPending); + updateLastUpdated(); + } else if (filterPending) { + filterPending = false; + showNoLogsMessage(); + } + + return data; + } catch (error) { + console.error('Error fetching live logs:', error); + filterPending = false; + var errDiv = document.createElement('div'); + errDiv.className = 'log-line'; + var errLevel = document.createElement('span'); + errLevel.className = 'log-level ERROR'; + errLevel.textContent = 'ERROR'; + var errMsg = document.createElement('span'); + errMsg.className = 'log-message'; + errMsg.textContent = 'Failed to fetch live logs: ' + (error.message || error); + errDiv.appendChild(errLevel); + errDiv.appendChild(errMsg); + logContent.appendChild(errDiv); + } +} + +function appendNewLogs(logs, forceShow = false) { + if (!logs || logs.length === 0) { + if (filterPending) { + filterPending = false; + showNoLogsMessage(); + } + return; + } + + const scrollToBottomAfter = isAtBottom; + + if (filterPending) { + filterPending = false; + logContent.textContent = ''; + shownLogs.clear(); + forceShow = true; + } + + logs.forEach(log => { + const logId = `${log.timestamp}|${log.message}`; + + if (forceShow || !shownLogs.has(logId)) { + const logLine = document.createElement('div'); + logLine.className = 'log-line'; + + var ts = document.createElement('span'); + ts.className = 'log-timestamp'; + ts.textContent = log.timestamp; + logLine.appendChild(ts); + + var lvl = document.createElement('span'); + lvl.className = 'log-level ' + log.level; + lvl.textContent = log.level; + logLine.appendChild(lvl); + + var mod = document.createElement('span'); + mod.className = 'log-module'; + mod.textContent = log.module || 'unknown'; + logLine.appendChild(mod); + + var msgEl = document.createElement('span'); + msgEl.className = 'log-message'; + msgEl.textContent = log.message; + logLine.appendChild(msgEl); + + if (!forceShow) { + logLine.style.background = 'rgba(124, 58, 237, 0.1)'; + setTimeout(() => { + logLine.style.background = ''; + }, 2000); + } + + logContent.appendChild(logLine); + shownLogs.add(logId); + } + }); + + if (scrollToBottomAfter) { + setTimeout(() => { + logContent.scrollTop = logContent.scrollHeight; + }, 10); + } + + const currentLines = logContent.children.length; + displayedLinesStats.textContent = `Showing: ${currentLines} lines (live mode)`; +} + +function displayLogs(data, append = false) { + const logs = data.logs || []; + + if (!append) { + logContent.textContent = ''; + currentOffset = 0; + shownLogs.clear(); + userScrolled = false; + isAtBottom = true; + } + + if (logs.length === 0) { + if (!append) { + showNoLogsMessage(); + } + return; + } + + 
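+    // Each entry is rendered from plain DOM nodes + textContent (no innerHTML).
+    // Expected shape, inferred from the fields read below (not a documented
+    // contract): { timestamp, level, module, message }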
logs.forEach(log => { + const logLine = document.createElement('div'); + logLine.className = 'log-line'; + + var ts = document.createElement('span'); + ts.className = 'log-timestamp'; + ts.textContent = log.timestamp; + logLine.appendChild(ts); + + var lvl = document.createElement('span'); + lvl.className = 'log-level ' + log.level; + lvl.textContent = log.level; + logLine.appendChild(lvl); + + var mod = document.createElement('span'); + mod.className = 'log-module'; + mod.style.cssText = 'color: rgba(255,255,255,0.6); font-size: 11px; min-width: 80px;'; + mod.textContent = log.module || 'unknown'; + logLine.appendChild(mod); + + var msgEl = document.createElement('span'); + msgEl.className = 'log-message'; + msgEl.textContent = log.message; + logLine.appendChild(msgEl); + + logContent.appendChild(logLine); + + const logId = `${log.timestamp}|${log.message}`; + shownLogs.add(logId); + }); + + if (liveMode.checked) { + loadMore.style.display = 'none'; + } else if (data.has_more) { + loadMore.style.display = 'block'; + currentOffset = data.displayed_lines; + } else { + loadMore.style.display = 'none'; + } + + if (!append) { + setTimeout(() => { + logContent.scrollTop = logContent.scrollHeight; + }, 10); + } +} + +function updateStats(data) { + if (data.total_lines !== undefined) { + totalLinesStats.textContent = `Total: ${data.total_lines || 0} lines`; + } + if (data.displayed_lines !== undefined) { + displayedLinesStats.textContent = `Showing: ${data.displayed_lines || 0} lines`; + } +} + +function updateLastUpdated() { + const now = new Date(); + lastUpdated.textContent = `Last updated: ${now.toLocaleTimeString()}`; +} + +/* escapeHtml and debounce are provided by utils.js */ + +function toggleLiveMode() { + if (liveMode.checked) { + fetchLogs().then(() => { + displayedLinesStats.textContent = `Showing: ${logContent.children.length} lines (live mode)`; + }); + liveRefreshInterval = setInterval(fetchLiveLogs, 2000); + loadMore.style.display = 'none'; + + if (autoRefresh.checked) { + autoRefresh.checked = false; + if (autoRefreshInterval) { + clearInterval(autoRefreshInterval); + autoRefreshInterval = null; + } + } + } else { + if (liveRefreshInterval) { + clearInterval(liveRefreshInterval); + liveRefreshInterval = null; + } + fetchLogs(); + } +} + +// Event listeners +refreshBtn.addEventListener('click', () => { + if (liveMode.checked) { + fetchLiveLogs(); + } else { + fetchLogs(); + } +}); + +logLevel.addEventListener('change', () => { + if (liveMode.checked) { + logContent.textContent = ''; + filterPending = true; + var loadingDiv = document.createElement('div'); + loadingDiv.className = 'loading'; + var spinner = document.createElement('div'); + spinner.className = 'spinner'; + loadingDiv.appendChild(spinner); + loadingDiv.appendChild(document.createTextNode('Switching filter...')); + logContent.appendChild(loadingDiv); + shownLogs.clear(); + setTimeout(() => { + fetchLiveLogs().then(() => { + if (filterPending) { + filterPending = false; + showNoLogsMessage(); + } + }); + }, 500); + } else { + fetchLogs(); + } +}); + +linesCount.addEventListener('change', () => { + if (!liveMode.checked) { + fetchLogs(); + } +}); + +searchInput.addEventListener('input', debounce(() => { + if (liveMode.checked) { + logContent.textContent = ''; + filterPending = true; + var loadingDiv = document.createElement('div'); + loadingDiv.className = 'loading'; + var spinner = document.createElement('div'); + spinner.className = 'spinner'; + loadingDiv.appendChild(spinner); + 
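+        // Same pending-filter flow as the level-filter handler above: show a
+        // spinner, clear the dedupe set, then let fetchLiveLogs() decide
+        // whether anything matches the new query.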
loadingDiv.appendChild(document.createTextNode('Applying filter...')); + logContent.appendChild(loadingDiv); + shownLogs.clear(); + setTimeout(() => { + fetchLiveLogs().then(() => { + if (filterPending) { + filterPending = false; + showNoLogsMessage(); + } + }); + }, 500); + } else { + fetchLogs(); + } +}, 500)); + +autoRefresh.addEventListener('change', (e) => { + if (e.target.checked) { + if (liveMode.checked) { + liveMode.checked = false; + toggleLiveMode(); + } + autoRefreshInterval = setInterval(() => fetchLogs(), 30000); + } else { + if (autoRefreshInterval) { + clearInterval(autoRefreshInterval); + autoRefreshInterval = null; + } + } +}); + +liveMode.addEventListener('change', toggleLiveMode); + +scrollToTop.addEventListener('click', () => { + logContent.scrollTop = 0; + userScrolled = true; + isAtBottom = false; +}); + +scrollToBottom.addEventListener('click', () => { + logContent.scrollTop = logContent.scrollHeight; + userScrolled = false; + isAtBottom = true; +}); + +loadMore.addEventListener('click', async () => { + if (!liveMode.checked) { + await fetchLogs(currentOffset, true); + } +}); + +// Keyboard shortcuts +document.addEventListener('keydown', (e) => { + if (e.ctrlKey || e.metaKey) { + switch (e.key) { + case 'r': + e.preventDefault(); + if (liveMode.checked) { + fetchLiveLogs(); + } else { + fetchLogs(); + } + break; + case 'f': + e.preventDefault(); + searchInput.focus(); + break; + case 'l': + e.preventDefault(); + liveMode.checked = !liveMode.checked; + toggleLiveMode(); + break; + } + } +}); + +// Initial load +logLevel.value = 'INFO'; +liveMode.checked = true; +autoRefresh.checked = false; +linesCount.value = '1000'; +searchInput.value = ''; +toggleLiveMode(); + +// Cleanup on page unload +window.addEventListener('beforeunload', () => { + if (autoRefreshInterval) { + clearInterval(autoRefreshInterval); + } + if (liveRefreshInterval) { + clearInterval(liveRefreshInterval); + } +}); + +// ── Tab Switching ── +const mainTabs = Array.from(document.querySelectorAll('.r-main-tab-btn')); +const mainPanels = Array.from(document.querySelectorAll('.r-main-panel')); + +function setMainTab(tabName) { + mainTabs.forEach(tab => { + const active = tab.dataset.mainTab === tabName; + tab.classList.toggle('active', active); + tab.setAttribute('aria-selected', active ? 
'true' : 'false'); + }); + mainPanels.forEach(panel => { + panel.hidden = panel.dataset.mainPanel !== tabName; + }); + if (tabName === 'hardcover' && !hcLoaded) { + fetchHardcoverLogs(); + } +} + +mainTabs.forEach(tab => { + tab.addEventListener('click', () => setMainTab(tab.dataset.mainTab)); +}); + +// ── Hardcover Sync Logs ── +let hcPage = 1; +let hcTotalPages = 1; +let hcLoaded = false; +const hcLogBody = document.getElementById('hcLogBody'); +const hcStatsLine = document.getElementById('hcStatsLine'); +const hcPageInfo = document.getElementById('hcPageInfo'); +const hcPrevBtn = document.getElementById('hcPrevBtn'); +const hcNextBtn = document.getElementById('hcNextBtn'); + +const HC_STATUS_NAMES = {1: 'Want to Read', 2: 'Currently Reading', 3: 'Read', 4: 'Paused', 5: 'DNF'}; +const HC_ACTION_LABELS = { + status_update: 'Status Update', + status_transition: 'Status Transition', + status_pull: 'Status Pull', + create_user_book: 'Create User Book', + adopt_user_book: 'Adopt User Book', + rating: 'Rating', + date_pull: 'Date Pull', + date_push: 'Date Push', + automatch: 'Automatch', + manual_match: 'Manual Match', + journal_note: 'Journal Note', +}; +const HC_PRIVACY_NAMES = {1: 'public', 2: 'followers', 3: 'private'}; + +function formatHcDetail(log) { + if (log.error_message) return log.error_message; + const d = log.detail; + if (!d || typeof d !== 'object') return d ? String(d) : ''; + const action = log.action || ''; + + if (action === 'journal_note') { + const privacy = HC_PRIVACY_NAMES[d.privacy] || 'private'; + const preview = d.entry_preview || ''; + const source = d.source && d.source !== 'note' ? ` [${d.source}]` : ''; + return preview + ? `\u201c${preview}\u201d (${privacy})${source}` + : `Pushed note (${privacy})${source}`; + } + if (action === 'rating') { + return d.rating != null ? `Set rating to ${d.rating}` : 'Cleared rating'; + } + if (action === 'status_update') { + const label = d.status_label || ''; + const hcId = d.hc_status_id; + const hcName = HC_STATUS_NAMES[hcId] || ''; + return label ? `${label} \u2192 ${hcName || 'HC ' + hcId}` : (hcName || ''); + } + if (action === 'status_transition') { + const from = HC_STATUS_NAMES[d.from] || d.from; + const to = HC_STATUS_NAMES[d.to] || d.to; + return `${from} \u2192 ${to}`; + } + if (action === 'status_pull') { + return `${d.old_status || '?'} \u2192 ${d.new_status || '?'} (HC status ${d.hc_status_id || '?'})`; + } + if (action === 'date_pull' || action === 'date_push') { + const parts = []; + if (d.started_at) parts.push(`started: ${d.started_at}`); + if (d.finished_at) parts.push(`finished: ${d.finished_at}`); + return parts.join(', ') || ''; + } + if (action === 'automatch') { + const by = d.matched_by || 'unknown'; + return `Matched by ${by}` + (d.slug ? ` (${d.slug})` : ''); + } + if (action === 'manual_match') { + return `Linked to ${d.slug || d.input || 'HC ' + (d.hardcover_book_id || '?')}`; + } + if (action === 'adopt_user_book') { + const status = HC_STATUS_NAMES[d.status_id] || ''; + return status ? `Adopted existing (${status})` : 'Adopted existing'; + } + if (action === 'create_user_book') { + const status = HC_STATUS_NAMES[d.status_id] || ''; + return status ? 
`Created as ${status}` : 'Created'; + } + return Object.entries(d) + .filter(([, v]) => v != null) + .map(([k, v]) => k.replace(/_/g, ' ') + ': ' + v) + .join(', '); +} + +function renderHcRow(log) { + const tr = document.createElement('tr'); + + const tdTs = document.createElement('td'); + tdTs.className = 'hc-ts'; + tdTs.textContent = log.created_at ? new Date(log.created_at).toLocaleString() : ''; + tr.appendChild(tdTs); + + const tdDir = document.createElement('td'); + const badge = document.createElement('span'); + badge.className = 'hc-direction-badge hc-dir-' + (log.direction || 'push'); + badge.textContent = log.direction || ''; + tdDir.appendChild(badge); + tr.appendChild(tdDir); + + const tdAction = document.createElement('td'); + tdAction.textContent = HC_ACTION_LABELS[log.action] || (log.action || '').replace(/_/g, ' '); + tr.appendChild(tdAction); + + const tdTitle = document.createElement('td'); + tdTitle.className = 'hc-title'; + tdTitle.textContent = log.book_title || '\u2014'; + tr.appendChild(tdTitle); + + const tdDetail = document.createElement('td'); + tdDetail.className = 'hc-detail'; + tdDetail.textContent = formatHcDetail(log); + tr.appendChild(tdDetail); + + const tdStatus = document.createElement('td'); + const icon = document.createElement('span'); + icon.className = 'hc-status-icon ' + (log.success ? 'success' : 'failure'); + icon.title = log.success ? 'Success' : 'Failed'; + icon.textContent = log.success ? '\u2713' : '\u2717'; + tdStatus.appendChild(icon); + tr.appendChild(tdStatus); + + return tr; +} + +async function fetchHardcoverLogs() { + const params = new URLSearchParams({ page: hcPage, per_page: 50 }); + const dir = document.getElementById('hcDirection').value; + const act = document.getElementById('hcAction').value; + const search = document.getElementById('hcSearch').value; + if (dir) params.set('direction', dir); + if (act) params.set('action', act); + if (search) params.set('search', search); + + try { + const res = await fetch('/api/logs/hardcover?' + params); + if (!res.ok) throw new Error('Server returned ' + res.status); + const data = await res.json(); + hcLoaded = true; + hcTotalPages = data.total_pages || 1; + + const showing = data.logs ? 
data.logs.length : 0; + hcStatsLine.textContent = 'Showing ' + showing + ' of ' + (data.total || 0) + ' entries'; + hcPageInfo.textContent = 'Page ' + (data.page || 1) + ' of ' + hcTotalPages; + hcPrevBtn.disabled = (data.page || 1) <= 1; + hcNextBtn.disabled = (data.page || 1) >= hcTotalPages; + + hcLogBody.textContent = ''; + if (!data.logs || data.logs.length === 0) { + const emptyRow = document.createElement('tr'); + const emptyTd = document.createElement('td'); + emptyTd.colSpan = 6; + emptyTd.className = 'hc-empty'; + emptyTd.textContent = 'No sync events recorded yet.'; + emptyRow.appendChild(emptyTd); + hcLogBody.appendChild(emptyRow); + return; + } + + data.logs.forEach(log => hcLogBody.appendChild(renderHcRow(log))); + } catch (err) { + hcLogBody.textContent = ''; + const errRow = document.createElement('tr'); + const errTd = document.createElement('td'); + errTd.colSpan = 6; + errTd.className = 'hc-empty'; + errTd.textContent = 'Error loading logs: ' + err.message; + errRow.appendChild(errTd); + hcLogBody.appendChild(errRow); + } +} + +document.getElementById('hcRefreshBtn').addEventListener('click', () => { hcPage = 1; fetchHardcoverLogs(); }); +document.getElementById('hcDirection').addEventListener('change', () => { hcPage = 1; fetchHardcoverLogs(); }); +document.getElementById('hcAction').addEventListener('change', () => { hcPage = 1; fetchHardcoverLogs(); }); +document.getElementById('hcSearch').addEventListener('input', debounce(() => { hcPage = 1; fetchHardcoverLogs(); }, 500)); +hcPrevBtn.addEventListener('click', () => { if (hcPage > 1) { hcPage--; fetchHardcoverLogs(); } }); +hcNextBtn.addEventListener('click', () => { if (hcPage < hcTotalPages) { hcPage++; fetchHardcoverLogs(); } }); diff --git a/static/js/match.js b/static/js/match.js new file mode 100644 index 0000000..fbccfb3 --- /dev/null +++ b/static/js/match.js @@ -0,0 +1,391 @@ +/* ═══════════════════════════════════════════ + PageKeeper — match page + ═══════════════════════════════════════════ + Dependencies: + - static/js/utils.js (escapeHtml, debounce, toggleHiddenSection) + - static/js/confirm-modal.js (PKModal) + - templates/partials/confirm_modal.html + + Expects a global PK_PAGE_DATA object with: + isAttachEbook (boolean) + isAttachAudiobook (boolean) + storytellerForceMode (boolean) + absConfigured (boolean) + hasEbookSources (boolean) + ═══════════════════════════════════════════ */ + +(function () { + 'use strict'; + + var isAttachEbook = PK_PAGE_DATA.isAttachEbook; + var isAttachAudiobook = PK_PAGE_DATA.isAttachAudiobook; + var isAttachFlow = isAttachEbook || isAttachAudiobook; + var hasStorytellerSection = !!document.getElementById('storytellerSection'); + var storytellerForceMode = PK_PAGE_DATA.storytellerForceMode; + + var absConfigured = PK_PAGE_DATA.absConfigured; + var hasEbookSources = PK_PAGE_DATA.hasEbookSources; + var currentMode = absConfigured ? 'match' : (hasEbookSources ? 'ebook' : 'match'); + var currentPhase = (currentMode === 'match') ? 
'select-audio' : 'done'; + + /* ── Remove from library ── */ + + function showRemoveModal(absId, title) { + var nextUrl = window.location.pathname + window.location.search; + PKModal.confirmForm({ + title: 'Remove from Library', + message: 'Remove "' + title + '" and all its cached files from your library?', + formAction: '/delete/' + encodeURIComponent(String(absId)) + '?next=' + encodeURIComponent(nextUrl), + confirmLabel: 'Remove', + confirmClass: 'btn btn-danger' + }); + } + + /* ── Mode / phase helpers ── */ + + function setMode(mode) { + currentMode = mode; + currentPhase = (mode === 'match') ? 'select-audio' : 'done'; + + clearSelections('audiobook_id'); + clearSelections('ebook_filename'); + clearSelections('storyteller_uuid'); + + // Restore storyteller default (skip selected) and clear stale title + var skipRadio = document.querySelector('.st-option.ghost-card input[type="radio"]'); + if (skipRadio) { + skipRadio.checked = true; + skipRadio.closest('.st-option').classList.add('selected'); + } + var stTitleInput = document.getElementById('input_storyteller_title'); + if (stTitleInput) stTitleInput.value = ''; + + // Adapt Storyteller hint and transcription visibility per mode + var stHint = document.querySelector('.match-storyteller-toggle-hint'); + if (stHint) { + stHint.textContent = (mode === 'ebook') + ? 'Link to an existing Storyteller book' + : 'Link to Storyteller for synced audio + text playback'; + } + var stHelperText = document.getElementById('storytellerHelperText'); + if (stHelperText) { + stHelperText.textContent = (mode === 'ebook') + ? 'If this book already exists in Storyteller, select it to sync your reading position.' + : 'If this book already exists in Storyteller, select it to sync your reading position and use its word-level timings for more precise audio-to-text alignment.'; + } + var stSubmitOption = document.querySelector('.storyteller-submit-option'); + var stSubmitDetail = document.querySelector('.storyteller-submit-detail'); + var stDivider = document.querySelector('.storyteller-divider'); + if (stSubmitOption) stSubmitOption.style.display = (mode === 'ebook') ? 'none' : ''; + if (stSubmitDetail) stSubmitDetail.style.display = (mode === 'ebook') ? 'none' : ''; + if (stDivider) stDivider.style.display = (mode === 'ebook') ? 
'none' : ''; + + var stSubmitCheckbox = document.querySelector('input[type="checkbox"][name="storyteller_submit"]'); + var stSubmitHidden = document.querySelector('input[type="hidden"][name="storyteller_submit"]'); + if (stSubmitCheckbox && !storytellerForceMode) { + stSubmitCheckbox.disabled = (mode === 'ebook'); + if (mode === 'ebook') stSubmitCheckbox.checked = false; + } + if (stSubmitHidden) stSubmitHidden.disabled = (mode === 'ebook'); + + updateLayout(); + updateFooter(); + } + + function setPhase(phase) { + currentPhase = phase; + updateLayout(); + updateFooter(); + } + + function clearSelections(groupName) { + var cls = ''; + if (groupName === 'audiobook_id') cls = '.ab-option'; + if (groupName === 'storyteller_uuid') cls = '.st-option'; + if (groupName === 'ebook_filename') cls = '.eb-option'; + if (cls) { + document.querySelectorAll(cls).forEach(function (el) { + el.classList.remove('selected'); + var r = el.querySelector('input[type="radio"]'); + if (r) r.checked = false; + }); + } + } + + function updateLayout() { + var audioSection = document.getElementById('audiobookSection'); + var ebookSection = document.getElementById('ebookSection'); + var stSection = document.getElementById('storytellerSection'); + var chip = document.getElementById('selectedChip'); + + if (isAttachFlow) { + // Attach flows: show only the relevant section, no chip + if (audioSection) audioSection.style.display = isAttachAudiobook ? '' : 'none'; + if (ebookSection) ebookSection.style.display = isAttachEbook ? '' : 'none'; + if (stSection) stSection.style.display = 'none'; + if (chip) chip.style.display = 'none'; + return; + } + + if (currentMode === 'match') { + if (currentPhase === 'select-audio') { + if (audioSection) audioSection.style.display = ''; + if (ebookSection) ebookSection.style.display = 'none'; + if (stSection) stSection.style.display = 'none'; + if (chip) chip.style.display = 'none'; + } else { + // select-ebook phase + if (audioSection) audioSection.style.display = 'none'; + if (ebookSection) ebookSection.style.display = ''; + if (stSection) stSection.style.display = ''; + if (chip) chip.style.display = ''; + updateChip(); + } + } else if (currentMode === 'audio') { + if (audioSection) audioSection.style.display = ''; + if (ebookSection) ebookSection.style.display = 'none'; + if (stSection) stSection.style.display = 'none'; + if (chip) chip.style.display = 'none'; + } else if (currentMode === 'ebook') { + if (audioSection) audioSection.style.display = 'none'; + if (ebookSection) ebookSection.style.display = ''; + if (stSection) stSection.style.display = ''; + if (chip) chip.style.display = 'none'; + } + } + + function updateChip() { + var selected = document.querySelector('.ab-option.selected'); + if (!selected) return; + var title = selected.dataset.title || ''; + var cover = selected.dataset.cover || ''; + var chipCover = document.getElementById('chipCover'); + var chipTitle = document.getElementById('chipTitle'); + if (chipTitle) chipTitle.textContent = title; + if (chipCover) { + if (cover) { + chipCover.src = cover; + chipCover.style.display = ''; + } else { + chipCover.style.display = 'none'; + } + } + } + + function updateFooter() { + var btn = document.getElementById('actionBtn'); + var status = document.getElementById('actionStatus'); + if (!btn || !status) return; + + var ab = document.querySelector('input[name="audiobook_id"]:checked'); + var eb = document.querySelector('input[name="ebook_filename"]:checked'); + var st = 
document.querySelector('input[name="storyteller_uuid"]:checked'); + var hasAudio = !!ab; + var ebVal = eb ? eb.value : ''; + var stVal = st ? st.value : ''; + var hasText = ebVal !== '' || stVal !== ''; + + // Summary chips + var chips = document.getElementById('summaryChips'); + if (chips) { + var showChips = !isAttachFlow && currentMode === 'match' && currentPhase === 'select-ebook'; + chips.style.display = showChips ? '' : 'none'; + if (showChips) { + var truncate = function (s, n) { return s && s.length > n ? s.substring(0, n) + '\u2026' : (s || ''); }; + var chipAudio = document.getElementById('summaryAudio'); + var chipEbook = document.getElementById('summaryEbook'); + var chipSt = document.getElementById('summaryStoryteller'); + + var abLabel = ab ? ab.closest('.ab-option') : null; + var abTitle = abLabel ? (abLabel.dataset.title || '') : ''; + chipAudio.textContent = abTitle ? '\uD83C\uDFA7 ' + truncate(abTitle, 28) : '\uD83C\uDFA7 \u2014'; + chipAudio.dataset.empty = abTitle ? 'false' : 'true'; + + var ebLabel = eb ? eb.closest('.eb-option') : null; + var ebTitleEl = ebLabel ? ebLabel.querySelector('.resource-title') : null; + var ebTitle = ebTitleEl ? ebTitleEl.textContent.trim() : ''; + chipEbook.textContent = ebVal ? '\uD83D\uDCD6 ' + truncate(ebTitle, 28) : '\uD83D\uDCD6 \u2014'; + chipEbook.dataset.empty = ebVal ? 'false' : 'true'; + + if (hasStorytellerSection) { + var stLabel = st ? st.closest('.st-option') : null; + var stTitleEl = (stVal && stLabel) ? stLabel.querySelector('.resource-title') : null; + var stTitle = stTitleEl ? stTitleEl.textContent.trim() : ''; + chipSt.textContent = stVal ? '\uD83D\uDCDA ' + truncate(stTitle, 28) : '\uD83D\uDCDA \u2014'; + chipSt.dataset.empty = stVal ? 'false' : 'true'; + chipSt.style.display = ''; + } else { + chipSt.style.display = 'none'; + } + } + } + + if (isAttachEbook) { + var ready = ebVal !== ''; + btn.disabled = !ready; + btn.textContent = 'Attach Ebook'; + status.textContent = ready ? 'eBook selected. Ready to attach.' : 'Select an ebook to attach.'; + status.dataset.state = ready ? 'ready' : 'warning'; + return; + } + + if (isAttachAudiobook) { + btn.disabled = !hasAudio; + btn.textContent = 'Attach Audiobook'; + status.textContent = hasAudio ? 'Audiobook selected. Ready to attach.' : 'Select an audiobook to link.'; + status.dataset.state = hasAudio ? 'ready' : 'warning'; + return; + } + + // Normal modes + if (currentMode === 'match') { + if (currentPhase === 'select-audio') { + btn.disabled = true; + btn.textContent = 'Create Mapping'; + status.textContent = hasAudio ? 'Audiobook selected.' : 'Select an audiobook to continue.'; + status.dataset.state = 'warning'; + } else { + // select-ebook phase + var ready = hasText; + btn.disabled = !ready; + btn.textContent = 'Create Mapping'; + document.getElementById('input_action').value = ''; + status.textContent = ready + ? 'Ready to create mapping.' + : 'Select an ebook to complete the mapping.'; + status.dataset.state = ready ? 'ready' : 'warning'; + } + } else if (currentMode === 'audio') { + btn.disabled = !hasAudio; + btn.textContent = 'Add Audio Only'; + document.getElementById('input_action').value = 'audio_only'; + status.textContent = hasAudio ? 'Ready to add audiobook.' : 'Select an audiobook.'; + status.dataset.state = hasAudio ? 'ready' : 'warning'; + } else if (currentMode === 'ebook') { + btn.disabled = !hasText; + btn.textContent = 'Add eBook'; + document.getElementById('input_action').value = 'ebook_only'; + status.textContent = hasText ? 'Ready to add ebook.' 
: 'Select an ebook.';
+            status.dataset.state = hasText ? 'ready' : 'warning';
+        }
+    }
+
+    function selectItem(element, groupName) {
+        var wrapperClass = '';
+        if (groupName === 'audiobook_id') wrapperClass = '.ab-option';
+        if (groupName === 'storyteller_uuid') wrapperClass = '.st-option';
+        if (groupName === 'ebook_filename') wrapperClass = '.eb-option';
+
+        if (wrapperClass) {
+            document.querySelectorAll(wrapperClass).forEach(function (el) {
+                el.classList.remove('selected');
+            });
+        }
+
+        element.classList.add('selected');
+        var radio = element.querySelector('input[type="radio"]');
+        if (radio) radio.checked = true;
+
+        if (groupName === 'storyteller_uuid') {
+            var titleEl = element.querySelector('.resource-title');
+            var stInput = document.getElementById('input_storyteller_title');
+            if (stInput) {
+                stInput.value = (radio && radio.value) ? (titleEl ? titleEl.textContent.trim() : '') : '';
+            }
+        }
+
+        // In Match mode, selecting an audiobook advances to the ebook phase
+        if (groupName === 'audiobook_id' && currentMode === 'match' && currentPhase === 'select-audio') {
+            setPhase('select-ebook');
+            return;
+        }
+
+        updateFooter();
+    }
+
+    /* ── DOMContentLoaded ── */
+
+    document.addEventListener('DOMContentLoaded', function () {
+        // ── Attach flow: simple init ──
+        if (isAttachFlow) {
+            updateLayout();
+            var initialEb = document.querySelector('input[name="ebook_filename"]:checked');
+            if (initialEb) {
+                var ebLabel = initialEb.closest('.eb-option');
+                if (ebLabel) ebLabel.classList.add('selected');
+            }
+            var initialAb = document.querySelector('input[name="audiobook_id"]:checked');
+            if (initialAb) {
+                var abLabel = initialAb.closest('.ab-option');
+                if (abLabel) abLabel.classList.add('selected');
+            }
+            updateFooter();
+            return;
+        }
+
+        // ── Mode selector ──
+        var modeBar = document.getElementById('match-mode-bar');
+        if (modeBar) {
+            var modeBtns = modeBar.querySelectorAll('.r-match-mode-btn');
+            modeBtns.forEach(function (btn) {
+                btn.addEventListener('click', function () {
+                    modeBtns.forEach(function (b) { b.classList.remove('active'); });
+                    btn.classList.add('active');
+                    setMode(btn.dataset.mode);
+                });
+            });
+        }
+
+        // ── Chip "Change" button ──
+        var chipChange = document.getElementById('chipChange');
+        if (chipChange) {
+            chipChange.addEventListener('click', function () {
+                clearSelections('ebook_filename');
+                clearSelections('storyteller_uuid');
+                // Restore storyteller skip and clear stale title
+                var skipRadio = document.querySelector('.st-option.ghost-card input[type="radio"]');
+                if (skipRadio) {
+                    skipRadio.checked = true;
+                    skipRadio.closest('.st-option').classList.add('selected');
+                }
+                var stTitleInput = document.getElementById('input_storyteller_title');
+                if (stTitleInput) stTitleInput.value = '';
+                setPhase('select-audio');
+            });
+        }
+
+        // ── Storyteller disclosure toggle ──
+        var stToggle = document.getElementById('storytellerToggle');
+        var stBody = document.getElementById('storytellerBody');
+        var stArrow = document.getElementById('storytellerArrow');
+        if (stToggle && stBody) {
+            // Start expanded so the Storyteller options are visible by default
+            stBody.classList.add('expanded');
+            if (stArrow) stArrow.classList.add('expanded');
+
+            stToggle.addEventListener('click', function () {
+                stBody.classList.toggle('expanded');
+                if (stArrow) stArrow.classList.toggle('expanded');
+            });
+        }
+
+        // ── Handle preselected audiobook or single match ──
+        var preselectedAb = document.querySelector('.ab-option.selected');
+        if (preselectedAb) {
+            if (currentMode === 'match') {
setPhase('select-ebook'); + } + } + + updateLayout(); + updateFooter(); + }); + + /* ── Expose functions needed by inline onclick handlers ── */ + window.selectItem = selectItem; + window.showRemoveModal = showRemoveModal; + +})(); diff --git a/static/js/reading.js b/static/js/reading.js index d983d18..21df948 100644 --- a/static/js/reading.js +++ b/static/js/reading.js @@ -100,11 +100,13 @@ function initReadingPage(currentYear, activeTab) { }); }); + var _filterTimer; [searchInput, mobileSearchInput].forEach(input => { if (!input) return; input.addEventListener('input', () => { syncSearchInputs(input); - applyFiltersAndSort(); + clearTimeout(_filterTimer); + _filterTimer = setTimeout(applyFiltersAndSort, 150); }); }); diff --git a/static/js/settings.js b/static/js/settings.js index f18511d..df65f27 100644 --- a/static/js/settings.js +++ b/static/js/settings.js @@ -96,8 +96,8 @@ function getServiceTestPayload(service) { if (service === 'kosync') { return { server: getInputValue('kosync_external_url'), - user: getInputValue('KOSYNC_USER'), - key: getInputValue('KOSYNC_KEY') + user: getInputValue('KOSYNC_SERVER_USER') || getInputValue('KOSYNC_USER'), + key: getInputValue('kosync_server_key_input') || getInputValue('KOSYNC_KEY') }; } return {}; @@ -109,12 +109,12 @@ function setTestButtonState(btn, success, detail, originalText) { btn.classList.add('btn-success'); } else { btn.textContent = '\u2717 ' + (detail || 'Failed'); - btn.classList.add('btn-error'); + btn.classList.add('btn-danger'); } setTimeout(function() { btn.textContent = originalText; btn.disabled = false; - btn.classList.remove('btn-success', 'btn-error'); + btn.classList.remove('btn-success', 'btn-danger'); }, 3000); } @@ -187,20 +187,20 @@ function fetchBookloreLibs(url, event) { .then(function(response) { return response.json(); }) .then(function(data) { if (data.error) { - showSettingsModal('Error', data.error); + PKModal.alert({ title: 'Error', message: data.error }); return; } if (data.length === 0) { - showSettingsModal('No Libraries', 'No libraries found. Check connection or try syncing first.'); + PKModal.alert({ title: 'No Libraries', message: 'No libraries found. Check connection or try syncing first.' 
}); return; } var lines = data.map(function(lib) { return 'ID: ' + lib.id + ' \u2014 ' + lib.name; }); - showSettingsModal('Found Libraries', lines.join('\n')); + PKModal.alert({ title: 'Found Libraries', message: lines.join('\n'), preserveWhitespace: true }); }) - .catch(function(err) { showSettingsModal('Error', 'Failed to fetch libraries: ' + err); }) + .catch(function(err) { PKModal.alert({ title: 'Error', message: 'Failed to fetch libraries: ' + err }); }) .finally(function() { btn.textContent = originalText; btn.disabled = false; @@ -230,15 +230,19 @@ function togglePollSeconds(rowId, mode) { function toggleKosyncSourceMode() { var isBuiltin = document.getElementById('kosync_mode_builtin').checked; var hiddenInput = document.getElementById('kosync_server_input'); - var externalGroup = document.getElementById('kosync_external_group'); + var builtinSection = document.getElementById('kosync_builtin_section'); + var externalSection = document.getElementById('kosync_external_section'); var externalUrl = document.getElementById('kosync_external_url'); + if (!hiddenInput || !builtinSection || !externalSection || !externalUrl) return; var builtinUrl = 'http://127.0.0.1:' + SETTINGS_CONFIG.kosyncPort; if (isBuiltin) { hiddenInput.value = builtinUrl; - externalGroup.classList.add('hidden'); + builtinSection.classList.remove('hidden'); + externalSection.classList.add('hidden'); } else { - externalGroup.classList.remove('hidden'); + builtinSection.classList.add('hidden'); + externalSection.classList.remove('hidden'); if (externalUrl.value && externalUrl.value !== builtinUrl) { hiddenInput.value = externalUrl.value; } else { @@ -258,41 +262,16 @@ function copyInputValue(inputId) { setTimeout(function() { copyText.classList.remove('input-copied'); }, 200); } -/* ─── Modal System ─── */ -function showSettingsModal(title, message, onConfirm) { - var modal = document.getElementById('settings-modal'); - document.getElementById('settings-modal-title').textContent = title; - /* Use textContent for safety — message is always plain text */ - document.getElementById('settings-modal-message').textContent = message; - - var confirmBtn = document.getElementById('settings-modal-confirm'); - var cancelBtn = document.getElementById('settings-modal-cancel'); - - if (onConfirm) { - confirmBtn.style.display = ''; - cancelBtn.textContent = 'Cancel'; - confirmBtn.onclick = function() { - closeSettingsModal(); - onConfirm(); - }; - } else { - confirmBtn.style.display = 'none'; - cancelBtn.textContent = 'OK'; - } - modal.style.display = 'flex'; -} - -function closeSettingsModal() { - document.getElementById('settings-modal').style.display = 'none'; -} /* ─── Tool Actions ─── */ function clearStaleSuggestions() { - showSettingsModal( - 'Clear Stale Suggestions', - 'This will permanently delete all suggestions for books that are not currently matched in your bridge. Books you are already syncing will be preserved.', - function() { + PKModal.confirm({ + title: 'Clear Stale Suggestions', + message: 'This will permanently delete all suggestions for books that are not currently matched in your bridge. 
Books you are already syncing will be preserved.', + confirmLabel: 'Clear', + confirmClass: 'btn btn-danger', + onConfirm: function() { fetch('/api/suggestions/clear_stale', { method: 'POST', headers: { 'Content-Type': 'application/json' } @@ -300,17 +279,17 @@ function clearStaleSuggestions() { .then(function(r) { return r.json(); }) .then(function(data) { if (data.success) { - showSettingsModal('Success', 'Cleared ' + data.count + ' stale suggestions.'); + PKModal.alert({ title: 'Success', message: 'Cleared ' + data.count + ' stale suggestions.' }); } else { - showSettingsModal('Error', 'Failed to clear suggestions: ' + (data.error || 'Unknown error')); + PKModal.alert({ title: 'Error', message: 'Failed to clear suggestions: ' + (data.error || 'Unknown error') }); } }) .catch(function(err) { console.error('Error clearing suggestions:', err); - showSettingsModal('Error', 'An error occurred while clearing suggestions.'); + PKModal.alert({ title: 'Error', message: 'An error occurred while clearing suggestions.' }); }); } - ); + }); } function syncReadingDates(btn) { @@ -327,14 +306,14 @@ function syncReadingDates(btn) { if (data.updated) parts.push(data.updated + ' updated'); if (data.completed) parts.push(data.completed + ' newly completed'); if (data.errors) parts.push(data.errors + ' errors'); - showSettingsModal('Sync Complete', parts.length ? parts.join(', ') + '.' : 'All reading dates are already up to date.'); + PKModal.alert({ title: 'Sync Complete', message: parts.length ? parts.join(', ') + '.' : 'All reading dates are already up to date.' }); } else { - showSettingsModal('Error', 'Failed: ' + (data.error || 'Unknown error')); + PKModal.alert({ title: 'Error', message: 'Failed: ' + (data.error || 'Unknown error') }); } }) .catch(function(err) { console.error('Error syncing reading dates:', err); - showSettingsModal('Error', 'An error occurred. Check console for details.'); + PKModal.alert({ title: 'Error', message: 'An error occurred. Check console for details.' }); }) .finally(function() { btn.disabled = false; @@ -427,7 +406,7 @@ function initDynamicForms() { if (deviceSelect) { deviceSelect.addEventListener('change', function () { if (this.value === 'cuda') { - showSettingsModal('NVIDIA GPU', 'To use an NVIDIA GPU, you must modify your docker-compose.yml to include the \'deploy\' block with \'capabilities: [gpu]\' and ensure the NVIDIA Container Toolkit is installed on your host.'); + PKModal.alert({ title: 'NVIDIA GPU', message: 'To use an NVIDIA GPU, you must modify your docker-compose.yml to include the \'deploy\' block with \'capabilities: [gpu]\' and ensure the NVIDIA Container Toolkit is installed on your host.' 
});
          }
        });
      }
diff --git a/static/js/suggestions.js b/static/js/suggestions.js
new file mode 100644
index 0000000..55ef9ab
--- /dev/null
+++ b/static/js/suggestions.js
@@ -0,0 +1,441 @@
+/* ═══════════════════════════════════════════
+   PageKeeper — suggestions page
+   ═══════════════════════════════════════════
+   Depends on: utils.js, confirm-modal.js
+   Reads: window.PK_PAGE_DATA.suggestionsData
+          window.PK_PAGE_DATA.selectedSourceId
+   ═══════════════════════════════════════════ */
+
+(function () {
+  'use strict';
+
+  var suggestionData = window.PK_PAGE_DATA.suggestionsData;
+  var selectedSourceId = window.PK_PAGE_DATA.selectedSourceId;
+  var rescanPollTimer = null;
+  var desktopMedia = window.matchMedia('(min-width: 961px)');
+  var currentView = 'list';
+
+  /* ── helpers ── */
+
+  function formatEvidence(evidence) {
+    return (evidence || []).map(function (item) {
+      return '<span class="chip">' + escapeHtml(item.split('_').join(' ')) + '</span>';
+    }).join('');
+  }
+
+  function confidenceRank(confidence) {
+    if (confidence === 'high') return 3;
+    if (confidence === 'medium') return 2;
+    return 1;
+  }
+
+  function filterSuggestions() {
+    var query = (document.getElementById('suggestion-search').value || '').toLowerCase().trim();
+    var confidenceFilter = document.getElementById('confidence-filter').value;
+    var bfFilterEl = document.getElementById('bookfusion-filter');
+    var bookfusionFilter = bfFilterEl ? bfFilterEl.value : 'all';
+
+    return suggestionData.filter(function (suggestion) {
+      if (selectedSourceId && suggestion.source_id !== selectedSourceId) return false;
+      if (bookfusionFilter === 'bookfusion' && !suggestion.has_bookfusion_evidence) return false;
+
+      var topConfidence = suggestion.top_match ? suggestion.top_match.confidence : 'low';
+      if (confidenceFilter === 'high' && topConfidence !== 'high') return false;
+      if (confidenceFilter === 'medium' && confidenceRank(topConfidence) < 2) return false;
+
+      if (!query) return true;
+      var haystack = [
+        suggestion.title,
+        suggestion.author
+      ].concat((suggestion.matches || []).map(function (match) {
+        return [match.title, match.author, match.filename, match.source_family, (match.evidence || []).join(' ')].join(' ');
+      })).join(' ').toLowerCase();
+      return haystack.indexOf(query) !== -1;
+    });
+  }
+
+  /* ── rendering ──
+     Note: all user-facing strings are passed through escapeHtml() (from utils.js)
+     before insertion into HTML markup strings. */
+
+  function renderCandidate(match, suggestion, index) {
+    var confidenceClass = 'chip--confidence-' + (match.confidence || 'low');
+    var actions = [];
+    var sgSource = suggestion.source || 'unknown';
+
+    if (!suggestion.hidden) {
+      if (match.source_family === 'bookfusion') {
+        actions.push('<button class="btn" onclick="PK_Suggestions.linkBookFusion(\'' + escapeHtml(suggestion.source_id) + '\', ' + index + ', \'' + escapeHtml(sgSource) + '\')">Link</button>');
+        if (match.bookfusion_ids && match.bookfusion_ids.length) {
+          actions.push('<button class="btn" onclick="PK_Suggestions.addBookFusionToDashboard([' + match.bookfusion_ids.join(',') + '])">Add to Dashboard</button>');
+        }
+      } else {
+        var mappingUrl = '/match?search=' + encodeURIComponent(suggestion.title || '');
+        if (sgSource === 'abs') {
+          mappingUrl = '/match?abs_id=' + encodeURIComponent(suggestion.source_id) + '&search=' + encodeURIComponent(suggestion.title || '');
+        }
+        actions.push('<a class="btn" href="' + mappingUrl + '">Create Mapping</a>');
+      }
+    }
+
+    return '<div class="sg-candidate">' +
+      '<div class="sg-candidate-head">' +
+        '<div class="sg-candidate-titles">' +
+          '<div class="sg-candidate-title">' + escapeHtml(match.title || match.filename || 'Untitled') + '</div>' +
+          '<div class="sg-candidate-author">' + escapeHtml(match.author || match.source_family || '') + '</div>' +
+        '</div>' +
+        '<div class="sg-candidate-scores">' +
+          '<span class="chip ' + confidenceClass + '">' + escapeHtml(match.confidence || 'low') + '</span>' +
+          '<span class="chip">' + Math.round((match.score || 0) * 100) + '%</span>' +
+        '</div>' +
+      '</div>' +
+      '<div class="sg-candidate-evidence">' +
+        '<span class="chip">' + escapeHtml(match.source_family || 'unknown') + '</span>' +
+        formatEvidence(match.evidence) +
+      '</div>' +
+      (match.highlight_count ? '<div class="sg-candidate-highlights">BookFusion highlights: ' + escapeHtml(match.highlight_count) + '</div>' : '') +
+      (actions.length ? '<div class="sg-candidate-actions">' + actions.join('') + '</div>' : '') +
+    '</div>';
+  }
+
+  function renderSuggestionCard(suggestion) {
+    var matches = (suggestion.matches || []).map(function (match, index) {
+      return renderCandidate(match, suggestion, index);
+    }).join('');
+
+    var suggestionSource = suggestion.source || 'unknown';
+    var actionButtons = suggestion.hidden
+      ? '<button class="btn" onclick="PK_Suggestions.unhideSuggestion(\'' + escapeHtml(suggestion.source_id) + '\', \'' + escapeHtml(suggestionSource) + '\', this)">Unhide</button>'
+      : '<button class="btn" onclick="PK_Suggestions.hideSuggestion(\'' + escapeHtml(suggestion.source_id) + '\', \'' + escapeHtml(suggestionSource) + '\', this)">Hide</button>' +
+        '<button class="btn btn-danger" onclick="PK_Suggestions.ignoreSuggestion(\'' + escapeHtml(suggestion.source_id) + '\', \'' + escapeHtml(suggestionSource) + '\', this)">Never Ask</button>';
+
+    return '<div class="sg-card' + (suggestion.hidden ? ' sg-card--hidden' : '') + '">' +
+      '<div class="sg-card-header">' +
+        (suggestion.cover_url
+          ? '<img class="sg-card-cover" src="' + escapeHtml(suggestion.cover_url) + '" alt="" loading="lazy">'
+          : '<div class="sg-card-cover sg-card-cover--placeholder"></div>') +
+        '<div class="sg-card-meta">' +
+          '<div class="sg-card-title">' + escapeHtml(suggestion.title) + '</div>' +
+          '<div class="sg-card-author">' + escapeHtml(suggestion.author || 'Unknown author') + '</div>' +
+          '<div class="sg-card-chips">' +
+            '<span class="chip">' + escapeHtml((suggestion.matches || []).length) + ' candidates</span>' +
+            (suggestion.hidden ? '<span class="chip">Hidden</span>' : '') +
+            (suggestion.has_bookfusion_evidence ? '<span class="chip">BookFusion evidence</span>' : '') +
+          '</div>' +
+        '</div>' +
+      '</div>' +
+      '<div class="sg-card-body">' +
+        '<div class="sg-card-candidates">' + matches + '</div>' +
+        '<div class="sg-card-actions">' +
+          actionButtons +
+        '</div>' +
+      '</div>' +
+    '</div>' + '
'; + } + + /* ── view toggle ── */ + + function setView(view, persist) { + var results = document.getElementById('suggestions-results'); + var hiddenGrid = document.getElementById('hidden-grid'); + var viewButtons = document.querySelectorAll('.sg-view-btn'); + var forcedView = desktopMedia.matches ? view : 'list'; + + currentView = forcedView; + + if (results) { + results.classList.toggle('sg-grid-view', forcedView === 'grid'); + results.classList.toggle('sg-list-view', forcedView !== 'grid'); + } + + if (hiddenGrid) { + hiddenGrid.classList.toggle('sg-list-grid', forcedView !== 'grid'); + } + + viewButtons.forEach(function (btn) { + var isActive = btn.dataset.view === forcedView; + btn.classList.toggle('active', isActive); + btn.disabled = !desktopMedia.matches; + btn.setAttribute('aria-pressed', isActive ? 'true' : 'false'); + }); + + if (persist && desktopMedia.matches) { + try { localStorage.setItem('pk-suggestions-view', forcedView); } catch (e) {} + } + } + + /* ── main render ── */ + + function renderSuggestions() { + var filtered = filterSuggestions(); + var visible = filtered.filter(function (item) { return !item.hidden; }); + var hidden = filtered.filter(function (item) { return item.hidden; }); + var grid = document.getElementById('suggestion-grid'); + var hiddenSection = document.getElementById('hidden-section'); + var hiddenGrid = document.getElementById('hidden-grid'); + var hiddenCount = document.getElementById('hidden-section-count'); + var empty = document.getElementById('empty-state'); + + document.getElementById('visible-count').textContent = visible.length; + document.getElementById('hidden-count').textContent = hidden.length; + document.getElementById('total-count').textContent = filtered.length; + + /* All values passed to renderSuggestionCard are escapeHtml-sanitized */ + grid.innerHTML = visible.map(renderSuggestionCard).join(''); // eslint-disable-line no-unsanitized/property + + if (hidden.length) { + hiddenSection.classList.remove('hidden'); + hiddenGrid.innerHTML = hidden.map(renderSuggestionCard).join(''); // eslint-disable-line no-unsanitized/property + hiddenCount.textContent = '(' + hidden.length + ')'; + } else { + hiddenSection.classList.add('hidden'); + hiddenGrid.textContent = ''; + hiddenCount.textContent = '(0)'; + } + + if (!visible.length) { + empty.classList.remove('hidden'); + } else { + empty.classList.add('hidden'); + } + } + + /* ── state management ── */ + + function updateSuggestionState(sourceId, updater) { + suggestionData = suggestionData.map(function (item) { + if (item.source_id !== sourceId) return item; + return updater(Object.assign({}, item)); + }).filter(Boolean); + renderSuggestions(); + } + + function showErrorToast(message) { + PKModal.alert({ title: 'Error', message: message }); + } + + function actOnSuggestion(url, btn, onSuccess) { + if (btn) btn.disabled = true; + fetch(url, { method: 'POST' }) + .then(function (r) { return r.json(); }) + .then(function (data) { + if (!data.success) throw new Error(data.error || 'Request failed'); + onSuccess(); + }) + .catch(function (err) { + if (btn) btn.disabled = false; + showErrorToast(err.message || String(err)); + }); + } + + /* ── actions ── */ + + function hideSuggestion(sourceId, source, btn) { + PKModal.confirm({ + title: 'Hide Suggestion?', + message: 'This suggestion will move to the Hidden section. 
You can restore it later.', + confirmLabel: 'Hide', + confirmClass: 'btn', + onConfirm: function () { + actOnSuggestion('/api/suggestions/' + encodeURIComponent(sourceId) + '/hide?source=' + encodeURIComponent(source || 'abs'), btn, function () { + updateSuggestionState(sourceId, function (item) { + item.status = 'hidden'; + item.hidden = true; + return item; + }); + }); + } + }); + } + + function ignoreSuggestion(sourceId, source, btn) { + PKModal.confirm({ + title: 'Never Ask Again?', + message: 'This suggestion will be permanently ignored and will not return on future rescans.', + confirmLabel: 'Never Ask', + confirmClass: 'btn btn-danger', + onConfirm: function () { + actOnSuggestion('/api/suggestions/' + encodeURIComponent(sourceId) + '/ignore?source=' + encodeURIComponent(source || 'abs'), btn, function () { + updateSuggestionState(sourceId, function () { + return null; + }); + }); + } + }); + } + + function unhideSuggestion(sourceId, source, btn) { + actOnSuggestion('/api/suggestions/' + encodeURIComponent(sourceId) + '/unhide?source=' + encodeURIComponent(source || 'abs'), btn, function () { + updateSuggestionState(sourceId, function (item) { + item.status = 'pending'; + item.hidden = false; + return item; + }); + }); + } + + function linkBookFusion(sourceId, matchIndex, source) { + fetch('/api/suggestions/' + encodeURIComponent(sourceId) + '/link-bookfusion', { + method: 'POST', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify({ match_index: matchIndex, source: source || 'abs' }) + }) + .then(function (r) { return r.json(); }) + .then(function (data) { + if (!data.success) throw new Error(data.error || 'Link failed'); + refreshSuggestionsData('BookFusion link created.'); + }) + .catch(function (err) { + showErrorToast(err.message || String(err)); + }); + } + + function addBookFusionToDashboard(bookfusionIds) { + fetch('/api/bookfusion/add-to-dashboard', { + method: 'POST', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify({ bookfusion_ids: bookfusionIds }) + }) + .then(function (r) { return r.json(); }) + .then(function (data) { + if (!data.success) throw new Error(data.error || 'Add failed'); + window.location.href = '/'; + }) + .catch(function (err) { + showErrorToast(err.message || String(err)); + }); + } + + /* ── rescan ── */ + + function rescanSuggestions() { + var btn = document.getElementById('rescan-btn'); + var status = document.getElementById('rescan-status'); + btn.disabled = true; + status.textContent = 'Queued library rescan...'; + fetch('/api/suggestions/rescan', { + method: 'POST', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify({}) + }) + .then(function (r) { return r.json(); }) + .then(function (data) { + if (!data.success) throw new Error(data.error || 'Rescan failed'); + if (data.rate_limited) { + status.textContent = data.message || ('Please wait ' + (data.next_allowed_in || 0) + 's before rescanning again.'); + btn.disabled = false; + return; + } + status.textContent = data.message || 'Suggestions rescan started...'; + pollRescanStatus(); + }) + .catch(function (err) { + status.textContent = err.message || String(err); + btn.disabled = false; + }); + } + + function pollRescanStatus() { + if (rescanPollTimer) { + clearTimeout(rescanPollTimer); + rescanPollTimer = null; + } + fetch('/api/suggestions/rescan-status') + .then(function (r) { return r.json(); }) + .then(function (data) { + if (!data.success) throw new Error(data.error || 'Status failed'); + var status = 
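        /* Assumed shape of the /api/suggestions/rescan-status JSON consumed
           below (a sketch based on the fields this handler reads, not a
           canonical schema):

           { success: true,            // false: error thrown above
             running: false,           // true: keep polling every 1.5s
             phase: 'complete',        // 'complete' triggers a data refresh
             message: 'Rescan complete.',
             rate_limited: false,
             next_allowed_in: 0 }      // seconds until rescan is allowed
        */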
document.getElementById('rescan-status'); + var btn = document.getElementById('rescan-btn'); + + if (data.running) { + status.textContent = data.message || 'Rescan in progress...'; + btn.disabled = true; + rescanPollTimer = setTimeout(pollRescanStatus, 1500); + return; + } + + if (data.phase === 'complete') { + refreshSuggestionsData(data.message || 'Rescan complete.'); + return; + } + + if (data.rate_limited) { + status.textContent = data.message || ('Please wait ' + (data.next_allowed_in || 0) + 's before rescanning again.'); + } else if (data.message) { + status.textContent = data.message; + } + btn.disabled = false; + }) + .catch(function (err) { + document.getElementById('rescan-status').textContent = err.message || String(err); + document.getElementById('rescan-btn').disabled = false; + }); + } + + function refreshSuggestionsData(statusMessage) { + fetch('/api/suggestions') + .then(function (r) { return r.json(); }) + .then(function (data) { + suggestionData = data; + renderSuggestions(); + var status = document.getElementById('rescan-status'); + var btn = document.getElementById('rescan-btn'); + if (statusMessage) status.textContent = statusMessage; + btn.disabled = false; + }) + .catch(function (err) { + var status = document.getElementById('rescan-status'); + var btn = document.getElementById('rescan-btn'); + status.textContent = 'Refresh failed: ' + (err.message || String(err)); + btn.disabled = false; + }); + } + + /* ── init ── */ + + document.querySelectorAll('.sg-view-btn').forEach(function (btn) { + btn.addEventListener('click', function () { + setView(btn.dataset.view, true); + }); + }); + + try { + var savedView = localStorage.getItem('pk-suggestions-view'); + setView(savedView === 'grid' ? 'grid' : 'list', false); + } catch (e) { + setView('list', false); + } + + function handleViewportChange() { + var savedView = 'list'; + try { savedView = localStorage.getItem('pk-suggestions-view') || 'list'; } catch (e) {} + setView(savedView, false); + } + + if (desktopMedia.addEventListener) { + desktopMedia.addEventListener('change', handleViewportChange); + } else if (desktopMedia.addListener) { + desktopMedia.addListener(handleViewportChange); + } + + /* Wire up filter inputs */ + document.getElementById('suggestion-search').addEventListener('input', renderSuggestions); + document.getElementById('confidence-filter').addEventListener('change', renderSuggestions); + var bfFilter = document.getElementById('bookfusion-filter'); + if (bfFilter) bfFilter.addEventListener('change', renderSuggestions); + + /* Wire up rescan button */ + document.getElementById('rescan-btn').addEventListener('click', rescanSuggestions); + + renderSuggestions(); + pollRescanStatus(); + + /* ── expose functions called from inline onclick in rendered HTML ── */ + window.PK_Suggestions = { + hideSuggestion: hideSuggestion, + unhideSuggestion: unhideSuggestion, + ignoreSuggestion: ignoreSuggestion, + linkBookFusion: linkBookFusion, + addBookFusionToDashboard: addBookFusionToDashboard + }; +})(); diff --git a/static/js/tbr-detail.js b/static/js/tbr-detail.js index 6806972..30fc906 100644 --- a/static/js/tbr-detail.js +++ b/static/js/tbr-detail.js @@ -241,9 +241,13 @@ if (isHc) { showRemoveConfirm(); } else { - if (confirm('Remove this book from your TBR list?')) { - doRemove(false); - } + PKModal.confirm({ + title: 'Remove from TBR', + message: 'Remove this book from your TBR list?', + confirmLabel: 'Remove', + confirmClass: 'btn btn-danger', + onConfirm: function () { doRemove(false); } + }); } }); } @@ -443,11 
+447,18 @@
 var unlinkHcBtn = document.getElementById('unlink-hc-btn');
 if (unlinkHcBtn) {
   unlinkHcBtn.addEventListener('click', function () {
-    if (!confirm('Unlink this book from Hardcover?')) return;
-    patchItem({ hardcover_book_id: null, hardcover_slug: null }).then(function (ok) {
-      if (ok) {
-        showToast('Unlinked from Hardcover');
-        location.reload();
+    PKModal.confirm({
+      title: 'Unlink from Hardcover',
+      message: 'Unlink this book from Hardcover?',
+      confirmLabel: 'Unlink',
+      confirmClass: 'btn btn-warning',
+      onConfirm: function () {
+        patchItem({ hardcover_book_id: null, hardcover_slug: null }).then(function (ok) {
+          if (ok) {
+            showToast('Unlinked from Hardcover');
+            location.reload();
+          }
+        });
       }
     });
   });
diff --git a/static/js/utils.js b/static/js/utils.js
new file mode 100644
index 0000000..47043cf
--- /dev/null
+++ b/static/js/utils.js
@@ -0,0 +1,45 @@
+/* ═══════════════════════════════════════════
+   PageKeeper — shared utilities
+   ═══════════════════════════════════════════ */
+
+/**
+ * Escape HTML special characters to prevent XSS.
+ * @param {*} text — value to escape (coerced to string)
+ * @returns {string}
+ */
+function escapeHtml(text) {
+  var s = String(text == null ? '' : text);
+  return s
+    .replace(/&/g, '&amp;')
+    .replace(/</g, '&lt;')
+    .replace(/>/g, '&gt;')
+    .replace(/"/g, '&quot;')
+    .replace(/'/g, '&#39;');
+}
+
+/**
+ * Returns a debounced version of `func` that delays invocation
+ * until `wait` ms have elapsed since the last call.
+ * @param {Function} func
+ * @param {number} wait — milliseconds
+ * @returns {Function}
+ */
+function debounce(func, wait) {
+  var timeout;
+  return function () {
+    var ctx = this, args = arguments;
+    clearTimeout(timeout);
+    timeout = setTimeout(function () { func.apply(ctx, args); }, wait);
+  };
+}
+
+/**
+ * Toggle a collapsible "hidden" section.
+ * Expects `headerEl` followed by a sibling whose visibility is toggled.
+ * @param {HTMLElement} headerEl — the clickable header element
+ */
+function toggleHiddenSection(headerEl) {
+  headerEl.classList.toggle('collapsed');
+  var sibling = headerEl.nextElementSibling;
+  if (sibling) sibling.classList.toggle('hidden');
+}
diff --git a/templates/batch_match.html b/templates/batch_match.html
index 5343773..7ebef93 100644
--- a/templates/batch_match.html
+++ b/templates/batch_match.html
@@ -63,7 +63,7 @@

Batch Match

@@ -82,12 +82,16 @@

Batch Match

{% set single_st_match = storyteller_books|length == 1 %} {% set single_eb_match = ebooks|length == 1 %} + {% set abs_ok = abs_configured|default(true) %} + {% set ebook_ok = has_ebook_sources|default(true) %} +
+ {% if abs_ok %}

1Audiobooks

-

Choose one audiobook from your ABS library.

+

Choose one audiobook from your audiobook library.

{% if audiobooks %} @@ -122,10 +126,12 @@

No audiobooks found

{% endif %}
+ {% endif %} + {% if storyteller_configured %}
-

2Storyteller

+

{% if abs_ok %}2{% else %}1{% endif %}Storyteller

Optional narrated ebook sync source.

@@ -170,11 +176,12 @@

2Storyteller

{% endif %}
+ {% endif %}
-

3eBooks

-

Optional ebook source for progress sync.

+

{% if abs_ok and storyteller_configured %}3{% elif abs_ok or storyteller_configured %}2{% else %}1{% endif %}eBooks

+

{% if abs_ok %}Optional ebook source for progress sync.{% else %}Choose an ebook for progress sync.{% endif %}

{% if ebooks %} @@ -213,7 +220,7 @@

3eBooks

No ebooks found

-

You can still queue an audio-only mapping.

+

{% if abs_ok %}You can still queue an audio-only mapping.{% else %}Search to load ebook matches.{% endif %}

{% endif %}
@@ -222,7 +229,11 @@

No ebooks found

+ {% if storyteller_configured %} {{ item.storyteller_label }} + {% endif %} {{ item.ebook_label }} @@ -284,7 +297,7 @@

Queue Builder

- +
@@ -324,156 +337,11 @@

Validation

- - - - + {% include 'partials/confirm_modal.html' %} - + + + diff --git a/templates/bookfusion.html b/templates/bookfusion.html index 1a56a42..33ab958 100644 --- a/templates/bookfusion.html +++ b/templates/bookfusion.html @@ -74,840 +74,8 @@

BookFusion

- + + diff --git a/templates/index.html b/templates/index.html index 1f174f5..25dd6b0 100644 --- a/templates/index.html +++ b/templates/index.html @@ -80,6 +80,27 @@

Processing

{% endif %} + {% if unlinked_reading %} +
+
+

Pending Identification

+
+

Books being read on KOReader that haven't been matched to your library yet.

+
+ {% for doc in unlinked_reading %} +
+
+ {{ doc.document_hash[:12] }}… + {{ '%.1f'|format(doc.percentage * 100) }}% + {% if doc.device %}{{ doc.device }}{% endif %} +
+ Manage +
+ {% endfor %} +
+
+ {% endif %} +

Currently Reading

@@ -256,21 +277,7 @@

Link Storyteller

- - + {% include 'partials/confirm_modal.html' %} + + + {% if kosync_unlinked_count and kosync_unlinked_count > 0 %} + + {% endif %} diff --git a/templates/kosync_documents.html b/templates/kosync_documents.html new file mode 100644 index 0000000..bade34c --- /dev/null +++ b/templates/kosync_documents.html @@ -0,0 +1,77 @@ + + + + + + {{ title_prefix }}KoSync Documents - PageKeeper + + + + + + + + + + + + + + + {% include 'partials/navbar.html' %} + +
+
+
+

KoSync

+

Document Management

+

Manage KOReader sync hashes and book linkages

+
+ +
+ +
+ + {# Healthy: linked and recently active #} +
+
+

Healthy

+ +
+
+
+ + {# Needs Attention: unlinked docs + orphaned hashes #} +
+
+

Needs Attention

+
+

Unlinked hashes from KOReader devices and orphaned hashes causing sync errors.

+
+
+ + {# Stale: linked but no activity in 30+ days #} +
+
+

Stale (30+ days)

+ +
+ +
+
+ + {% include 'partials/confirm_modal.html' %} + + + + + + + diff --git a/templates/logs.html b/templates/logs.html index 3644659..a50347b 100644 --- a/templates/logs.html +++ b/templates/logs.html @@ -170,661 +170,8 @@

Log Entries

- + + diff --git a/templates/match.html b/templates/match.html index 7bdb887..0795c16 100644 --- a/templates/match.html +++ b/templates/match.html @@ -23,7 +23,7 @@

Attach Ebook

Choose an ebook source to attach to {{ attach_title }}.

{% elif link_to %}

Attach Audiobook

-

Choose an audiobook from your ABS library to link to {{ link_title }}.

+

Choose an audiobook from your audiobook library to link to {{ link_title }}.

{% else %}

Add Book

{% endif %} @@ -44,10 +44,15 @@

Add Book

{% if not attach_to and not link_to %} + {% set abs_ok = abs_configured|default(true) %} + {% set ebook_ok = has_ebook_sources|default(true) %}
- - - + + +
{% endif %} @@ -125,7 +130,7 @@

No audiobooks found

{# ── Storyteller disclosure ── #} {% if not link_to and not attach_to %} - {% if storyteller_books or storyteller_submit_available %} + {% if storyteller_configured %}
- - + {% include 'partials/confirm_modal.html' %} + + + diff --git a/templates/partials/book_card.html b/templates/partials/book_card.html index 983d847..52a1639 100644 --- a/templates/partials/book_card.html +++ b/templates/partials/book_card.html @@ -33,10 +33,14 @@
{% if mapping.cover_url %} Cover - + onerror="{% if mapping.kosync_doc_id %}this.onerror=function(){this.style.display='none'; this.nextElementSibling.classList.remove('hidden');}; this.src='/covers/{{ mapping.kosync_doc_id }}.jpg';{% else %}this.style.display='none'; this.nextElementSibling.classList.remove('hidden');{% endif %}"> + {% else %} -
+
+ {% if mapping.placeholder_logo %}{% endif %} +
{% endif %}
diff --git a/templates/partials/confirm_modal.html b/templates/partials/confirm_modal.html new file mode 100644 index 0000000..2a20cdf --- /dev/null +++ b/templates/partials/confirm_modal.html @@ -0,0 +1,22 @@ +{# ── Unified confirm modal ── + Include once per page. Controlled by static/js/confirm-modal.js (PKModal). + Supports: callback mode, form-POST mode, and alert/info mode. #} + + diff --git a/templates/partials/navbar.html b/templates/partials/navbar.html index d918e8b..0380633 100644 --- a/templates/partials/navbar.html +++ b/templates/partials/navbar.html @@ -85,7 +85,7 @@

PageKeeper

{% if show_bookfusion_page %} BookFusion Books {% endif %} - {% if get_bool('SUGGESTIONS_ENABLED') %} + {% if get_bool('SUGGESTIONS_ENABLED') and abs_url %} Suggestions {% endif %} Batch diff --git a/templates/reading.html b/templates/reading.html index 1918b3c..2e9f9c0 100644 --- a/templates/reading.html +++ b/templates/reading.html @@ -21,9 +21,13 @@ {% if book.cover_url %} - + {% else %} -
+
+ {% if book.placeholder_logo %}{% endif %} +
{% endif %} {% endmacro %} @@ -286,9 +290,14 @@

Want to Read

{% if book.cover_url %} + onerror="this.style.display='none'; this.nextElementSibling.classList.remove('hidden');"> + {% else %} -
+
+ {% if book.placeholder_logo %}{% endif %} +
{% endif %}
diff --git a/templates/reading_detail.html b/templates/reading_detail.html index d08cfbf..9f6675f 100644 --- a/templates/reading_detail.html +++ b/templates/reading_detail.html @@ -44,9 +44,13 @@ {% if book.cover_url %} - + {% else %} -
+
+ {% if book.placeholder_logo %}{% endif %} +
{% endif %}
- {% set kosync_configured = get_val('KOSYNC_SERVER') != '' %} - {% set kosync_enabled = get_bool('KOSYNC_ENABLED') or (kosync_configured and get_val('KOSYNC_ENABLED') == '') %} + {% set kosync_enabled = get_bool('KOSYNC_ENABLED') %} + {% set kosync_builtin_url = 'http://127.0.0.1:' ~ get_val('KOSYNC_PORT', '4477') %} + {% set kosync_server_val = get_val('KOSYNC_SERVER') %} + {% set kosync_is_external = kosync_server_val and kosync_server_val != '' and '127.0.0.1' not in kosync_server_val and 'localhost' not in kosync_server_val and '::1' not in kosync_server_val %} -
Use the LAN address for devices on your local network. Set the Public URL if you expose KOSync through a reverse proxy for remote access.
- -
+ {# ── Section 2B: External Server Config ── #} +
-

Sync Engine Source

+

External Server

-

Controls where PageKeeper's sync engine reads KoSync data from.

+

PageKeeper will read KOReader progress from this server instead of its built-in KoSync bridge.

-
- - +
+ +
+ + +
+
- +
+ +

Credentials PageKeeper uses to authenticate with the external server.

- - + {# ── Section 3: Advanced ── #}

Advanced

@@ -517,6 +544,19 @@

Advanced

+ + {# ── Section 4: Document Management ── #} +
+
+

Document Management

+
+
+

View and manage KoSync hash-to-book mappings, clear orphaned hashes, and link unmatched documents.

+ +
+
@@ -1005,18 +1045,7 @@

Transcription Engine

- - + {% include 'partials/confirm_modal.html' %}
+ diff --git a/templates/suggestions.html b/templates/suggestions.html index fd4cd48..5d33000 100644 --- a/templates/suggestions.html +++ b/templates/suggestions.html @@ -23,25 +23,27 @@

Discovery Queue

Pairing Suggestions

-

Review unmapped audiobook pairings across your indexed ebook sources.

+

Review potential audiobook & ebook pairings

- +
- - + - + {% endif %}
@@ -90,434 +94,18 @@

No Visible Suggestions

- + {% include 'partials/confirm_modal.html' %} + + + diff --git a/templates/tbr_detail.html b/templates/tbr_detail.html index c03d8fc..4f19ae5 100644 --- a/templates/tbr_detail.html +++ b/templates/tbr_detail.html @@ -129,7 +129,7 @@

{{ item.title|e }}{% if item.subtitle %}: {{ item.sub {% if item.description %}

About

-
{{ item.description }}
+
{{ item.description | sanitize_html }}
{% endif %} @@ -208,6 +208,8 @@

Library Link

var TBR_ITEM_ID = {{ item.id }}; var HC_CONFIGURED = {{ 'true' if hc_configured else 'false' }}; + {% include 'partials/confirm_modal.html' %} + diff --git a/tests/conftest.py b/tests/conftest.py index 07a7a5a..1409c9f 100644 --- a/tests/conftest.py +++ b/tests/conftest.py @@ -1,10 +1,240 @@ """Shared pytest fixtures and module stubs for the test suite.""" +import os +import shutil import sys +import tempfile +from pathlib import Path from types import ModuleType +from unittest.mock import Mock + +import pytest # Stub native modules only available inside Docker so that test files # can import production code without raising ImportError. for _mod_name in ('epubcfi',): if _mod_name not in sys.modules: sys.modules[_mod_name] = ModuleType(_mod_name) + + +# ── MockABSService ───────────────────────────────────────────────── +# Lightweight stand-in for ABSService that avoids network calls. + +class MockABSService: + """Minimal ABS service mock suitable for route-level tests.""" + + def is_available(self): + return True + + def get_audiobooks(self): + return [] + + def get_cover_proxy_url(self, abs_id): + return f'/covers/{abs_id}.jpg' + + def add_to_collection(self, abs_id, collection_name): + pass + + +# ── Canonical MockContainer ──────────────────────────────────────── +# Superset of every per-file variant. Individual tests can override +# attributes after construction (e.g. ``mc.mock_abs_client.is_configured …``). + +class MockContainer: + """Test-friendly replacement for the DI Container. + + Every service accessor is backed by a ``Mock()`` instance stored as + ``self.mock_``. Override attributes/return-values in your + test setUp or fixture as needed. + """ + + def __init__(self): + # ── Database ── + self.mock_database_service = Mock() + self.mock_database_service.get_all_settings.return_value = {} + self.mock_database_service.get_book_by_ref.return_value = None + self.mock_database_service.get_all_books.return_value = [] + self.mock_database_service.get_books_by_status.return_value = [] + self.mock_database_service.get_all_pending_suggestions.return_value = [] + self.mock_database_service.get_all_actionable_suggestions.return_value = [] + self.mock_database_service.get_bookfusion_books.return_value = [] + self.mock_database_service.get_bookfusion_linked_book_ids.return_value = set() + self.mock_database_service.get_bookfusion_highlight_counts_by_book_id.return_value = {} + + # ── API Clients ── + self.mock_abs_client = Mock() + self.mock_abs_client.is_configured.return_value = False + self.mock_booklore_client = Mock() + self.mock_booklore_client.is_configured.return_value = False + self.mock_storyteller_client = Mock() + self.mock_storyteller_client.is_configured.return_value = False + self.mock_hardcover_client = Mock() + self.mock_hardcover_client.is_configured.return_value = False + self.mock_bookfusion_client = Mock() + self.mock_bookfusion_client.is_configured.return_value = False + self.mock_bookfusion_client.highlights_api_key = '' + self.mock_bookfusion_client.upload_api_key = '' + + # ── Services ── + self.mock_abs_service = MockABSService() + self.mock_hardcover_service = Mock() + self.mock_hardcover_service.is_configured.return_value = False + self.mock_reading_date_service = Mock() + self.mock_reading_date_service.pull_reading_dates.return_value = {} + self.mock_reading_date_service.push_dates_to_hardcover.return_value = (True, "Dates synced") + + # ── Sync Clients ── + self.mock_hardcover_sync_client = Mock() + self.mock_hardcover_sync_client.is_configured.return_value = 
False + + # ── Utilities ── + self.mock_ebook_parser = Mock() + + # ── Manager ── + self.mock_sync_manager = Mock() + self.mock_sync_manager.abs_client = self.mock_abs_client + self.mock_sync_manager.booklore_client = self.mock_booklore_client + self.mock_sync_manager.storyteller_client = self.mock_storyteller_client + self.mock_sync_manager.get_audiobook_title.return_value = 'Test Book Title' + self.mock_sync_manager.get_duration.return_value = 3600 + self.mock_sync_manager.clear_progress = Mock() + + # ── Paths (temp) ── + self._tmp = Path(tempfile.gettempdir()) + + # ── Accessors (match Container's callable interface) ── + + def database_service(self): + return self.mock_database_service + + def sync_manager(self): + return self.mock_sync_manager + + def abs_client(self): + return self.mock_abs_client + + def abs_service(self): + return self.mock_abs_service + + def booklore_client(self): + return self.mock_booklore_client + + def booklore_client_group(self): + return self.mock_booklore_client + + def storyteller_client(self): + return self.mock_storyteller_client + + def bookfusion_client(self): + return self.mock_bookfusion_client + + def hardcover_client(self): + return self.mock_hardcover_client + + def hardcover_service(self): + return self.mock_hardcover_service + + def hardcover_sync_client(self): + return self.mock_hardcover_sync_client + + def reading_date_service(self): + return self.mock_reading_date_service + + def ebook_parser(self): + return self.mock_ebook_parser + + def sync_clients(self): + return {} + + def data_dir(self): + return self._tmp / 'test_data' + + def books_dir(self): + return self._tmp / 'test_books' + + def epub_cache_dir(self): + return self._tmp / 'test_epub_cache' + + +# ── Pytest fixtures ──────────────────────────────────────────────── + +@pytest.fixture() +def mock_container(): + """Yield a fresh MockContainer for each test.""" + return MockContainer() + + +@pytest.fixture() +def flask_app(mock_container, tmp_path): + """Create a Flask test app wired to the given mock_container.""" + saved_env = os.environ.copy() + os.environ['DATA_DIR'] = str(tmp_path) + + import src.db.migration_utils + original_init_db = src.db.migration_utils.initialize_database + src.db.migration_utils.initialize_database = lambda data_dir: mock_container.mock_database_service + + try: + from src.web_server import create_app + app, _ = create_app(test_container=mock_container) + app.config['TESTING'] = True + yield app + finally: + src.db.migration_utils.initialize_database = original_init_db + # Restore environment to avoid leaking bootstrapped settings + os.environ.clear() + os.environ.update(saved_env) + + +@pytest.fixture() +def client(flask_app): + """Return a Flask test client.""" + return flask_app.test_client() + + +# ── Test data helpers ────────────────────────────────────────────── + +def make_test_book(**overrides): + """Build a dict resembling a book/mapping row, with sensible defaults. + + Usage:: + + book = make_test_book(title='Dune', abs_id='abc-123') + """ + defaults = { + 'id': 1, + 'abs_id': 'test-abs-id', + 'title': 'Test Book', + 'author': 'Test Author', + 'status': 'active', + 'ebook_source': None, + 'ebook_id': None, + 'audio_progress': 0.0, + 'ebook_progress': 0.0, + 'duration': 3600, + 'hardcover_book_id': None, + 'hardcover_edition_id': None, + } + defaults.update(overrides) + return defaults + + +def make_test_state(**overrides): + """Build a dict resembling a sync state row, with sensible defaults. 
+ + Usage:: + + state = make_test_state(abs_id='abc-123', audio_progress=0.5) + """ + defaults = { + 'id': 1, + 'abs_id': 'test-abs-id', + 'audio_progress': 0.0, + 'audio_current_time': 0.0, + 'ebook_progress': 0.0, + 'ebook_cfi': None, + 'last_sync_source': None, + 'last_updated': None, + } + defaults.update(overrides) + return defaults diff --git a/tests/test_abs_socket_listener.py b/tests/test_abs_socket_listener.py index b0f1f1b..71e9058 100644 --- a/tests/test_abs_socket_listener.py +++ b/tests/test_abs_socket_listener.py @@ -129,12 +129,14 @@ def test_handles_nested_data_format(self): book = self._make_active_book("nested-id") self.mock_db.get_book_by_abs_id.return_value = book - self.listener._handle_progress_event({ - "id": "34621755-32df-4876-b235-abc123", - "sessionId": "session-1", - "deviceDescription": "Windows 10 / Firefox", - "data": {"libraryItemId": "nested-id", "progress": 0.42} - }) + self.listener._handle_progress_event( + { + "id": "34621755-32df-4876-b235-abc123", + "sessionId": "session-1", + "deviceDescription": "Windows 10 / Firefox", + "data": {"libraryItemId": "nested-id", "progress": 0.42}, + } + ) self.assertIn("nested-id", self.listener._pending) def test_handles_top_level_library_item_id(self): @@ -166,159 +168,126 @@ def test_url_stripping(self): class TestKosyncPutInstantSync(unittest.TestCase): """Test that KoSync PUT records debounce events for active linked books. - Sync fires via _kosync_debounce_loop (background thread). These tests verify - that events are correctly recorded in (or excluded from) the debounce queue — - the actual fire timing is handled by the debounce loop, not tested here. + These tests call KosyncService.handle_put_progress directly with a mock + debounce_manager, verifying that events are correctly recorded (or not). 
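+
+    Sketch of the pattern under test (names taken from the tests below; an
+    illustration, not a canonical example):
+
+        svc = KosyncService(mock_db, MagicMock(), mock_manager)
+        svc.handle_put_progress(
+            {"document": "x" * 32, "percentage": 0.55, "device": "TestDevice", "device_id": "D1"},
+            "127.0.0.1",
+            debounce_manager=mock_debounce,
+        )
+        mock_debounce.record_event.assert_called_once_with(42, "Instant Sync Book")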
""" def setUp(self): import os - os.environ.setdefault('DATA_DIR', '/tmp/test_kosync_instant') - os.environ.setdefault('KOSYNC_USER', 'testuser') - os.environ.setdefault('KOSYNC_KEY', 'testpass') - os.environ['INSTANT_SYNC_ENABLED'] = 'true' - import src.api.kosync_server as ks - # Reset debounce state so each test starts clean - ks._debounce_thread_started = False - with ks._kosync_debounce_lock: - ks._kosync_debounce.clear() + os.environ.setdefault("DATA_DIR", "/tmp/test_kosync_instant") + os.environ["INSTANT_SYNC_ENABLED"] = "true" def tearDown(self): import os - os.environ.pop('INSTANT_SYNC_ENABLED', None) - - def _make_put_context(self, doc_hash, percentage=0.55, device='TestDevice'): - from flask import Flask - app = Flask(__name__) - return app.test_request_context( - '/syncs/progress', - method='PUT', - json={ - 'document': doc_hash, - 'percentage': percentage, - 'progress': '/body/test', - 'device': device, - 'device_id': 'D1', - }, - content_type='application/json', - ) + + os.environ.pop("INSTANT_SYNC_ENABLED", None) def test_put_records_debounce_event_for_active_linked_book(self): """PUT for a linked active book should record a debounce event.""" - import src.api.kosync_server as ks - from src.db.models import KosyncDocument + from src.services.kosync_service import KosyncService mock_db = MagicMock() - original_db = ks._database_service - ks._database_service = mock_db - original_manager = ks._manager - ks._manager = MagicMock() - - try: - mock_book = MagicMock() - mock_book.abs_id = "test-instant-sync" - mock_book.title = "Instant Sync Book" - mock_book.status = "active" - mock_book.kosync_doc_id = "x" * 32 - - mock_doc = MagicMock(spec=KosyncDocument) - mock_doc.linked_abs_id = "test-instant-sync" - mock_doc.percentage = 0.3 - mock_doc.device_id = "D1" - - mock_db.get_kosync_document.return_value = mock_doc - mock_db.get_book_by_abs_id.return_value = mock_book - mock_db.get_book_by_kosync_id.return_value = None - - with self._make_put_context('x' * 32): - ks.kosync_put_progress.__wrapped__() - - # Event should be queued for the debounce loop to fire - with ks._kosync_debounce_lock: - self.assertIn("test-instant-sync", ks._kosync_debounce) - self.assertFalse(ks._kosync_debounce["test-instant-sync"]["synced"]) - self.assertEqual(ks._kosync_debounce["test-instant-sync"]["title"], "Instant Sync Book") - - finally: - ks._database_service = original_db - ks._manager = original_manager + mock_manager = MagicMock() + svc = KosyncService(mock_db, MagicMock(), mock_manager) + + mock_book = MagicMock() + mock_book.id = 42 + mock_book.abs_id = "test-instant-sync" + mock_book.title = "Instant Sync Book" + mock_book.status = "active" + mock_book.activity_flag = False + + mock_doc = MagicMock() + mock_doc.linked_abs_id = "test-instant-sync" + mock_doc.percentage = 0.3 + mock_doc.device_id = "D1" + + mock_db.get_kosync_document.return_value = mock_doc + mock_db.get_book_by_abs_id.return_value = mock_book + + mock_debounce = MagicMock() + + with patch.dict("os.environ", {"KOSYNC_FURTHEST_WINS": "false", "INSTANT_SYNC_ENABLED": "true"}): + svc.handle_put_progress( + { + "document": "x" * 32, + "percentage": 0.55, + "progress": "/body/test", + "device": "TestDevice", + "device_id": "D1", + }, + "127.0.0.1", + debounce_manager=mock_debounce, + ) + + mock_debounce.record_event.assert_called_once_with(42, "Instant Sync Book") def test_instant_sync_disabled_skips_debounce(self): """PUT should NOT record a debounce event when INSTANT_SYNC_ENABLED=false.""" - import os + from 
src.services.kosync_service import KosyncService - import src.api.kosync_server as ks - from src.db.models import KosyncDocument - - os.environ['INSTANT_SYNC_ENABLED'] = 'false' mock_db = MagicMock() - original_db = ks._database_service - ks._database_service = mock_db - original_manager = ks._manager - ks._manager = MagicMock() - - try: - mock_book = MagicMock() - mock_book.abs_id = "test-disabled" - mock_book.title = "Disabled Book" - mock_book.status = "active" - mock_book.kosync_doc_id = "d" * 32 - - mock_doc = MagicMock(spec=KosyncDocument) - mock_doc.linked_abs_id = "test-disabled" - mock_doc.percentage = 0.1 - mock_doc.device_id = "D1" - - mock_db.get_kosync_document.return_value = mock_doc - mock_db.get_book_by_abs_id.return_value = mock_book - mock_db.get_book_by_kosync_id.return_value = None - - with self._make_put_context('d' * 32): - ks.kosync_put_progress.__wrapped__() - - # No debounce event should have been recorded - with ks._kosync_debounce_lock: - self.assertNotIn("test-disabled", ks._kosync_debounce) - - finally: - ks._database_service = original_db - ks._manager = original_manager + mock_manager = MagicMock() + svc = KosyncService(mock_db, MagicMock(), mock_manager) + + mock_book = MagicMock() + mock_book.abs_id = "test-disabled" + mock_book.title = "Disabled Book" + mock_book.status = "active" + mock_book.activity_flag = False + + mock_doc = MagicMock() + mock_doc.linked_abs_id = "test-disabled" + mock_doc.percentage = 0.1 + mock_doc.device_id = "D1" + + mock_db.get_kosync_document.return_value = mock_doc + mock_db.get_book_by_abs_id.return_value = mock_book + + mock_debounce = MagicMock() + + with patch.dict("os.environ", {"KOSYNC_FURTHEST_WINS": "false", "INSTANT_SYNC_ENABLED": "false"}): + svc.handle_put_progress( + {"document": "d" * 32, "percentage": 0.55, "device": "TestDevice", "device_id": "D1"}, + "127.0.0.1", + debounce_manager=mock_debounce, + ) + + mock_debounce.record_event.assert_not_called() def test_put_does_not_record_debounce_event_for_inactive_book(self): """PUT for a linked but inactive book should NOT record a debounce event.""" - import src.api.kosync_server as ks + from src.services.kosync_service import KosyncService mock_db = MagicMock() - original_db = ks._database_service - ks._database_service = mock_db - original_manager = ks._manager - ks._manager = MagicMock() - - try: - mock_book = MagicMock() - mock_book.abs_id = "test-inactive" - mock_book.title = "Inactive Book" - mock_book.status = "pending" - - mock_doc = MagicMock() - mock_doc.linked_abs_id = "test-inactive" - mock_doc.percentage = 0.1 - mock_doc.device_id = "D1" - - mock_db.get_kosync_document.return_value = mock_doc - mock_db.get_book_by_abs_id.return_value = mock_book - mock_db.get_book_by_kosync_id.return_value = None - - with self._make_put_context('y' * 32): - ks.kosync_put_progress.__wrapped__() - - with ks._kosync_debounce_lock: - self.assertNotIn("test-inactive", ks._kosync_debounce) - - finally: - ks._database_service = original_db - ks._manager = original_manager + mock_manager = MagicMock() + svc = KosyncService(mock_db, MagicMock(), mock_manager) + + mock_book = MagicMock() + mock_book.abs_id = "test-inactive" + mock_book.title = "Inactive Book" + mock_book.status = "pending" + mock_book.activity_flag = False + + mock_doc = MagicMock() + mock_doc.linked_abs_id = "test-inactive" + mock_doc.percentage = 0.1 + mock_doc.device_id = "D1" + + mock_db.get_kosync_document.return_value = mock_doc + mock_db.get_book_by_abs_id.return_value = mock_book + + mock_debounce = 
MagicMock() + + with patch.dict("os.environ", {"KOSYNC_FURTHEST_WINS": "false", "INSTANT_SYNC_ENABLED": "true"}): + svc.handle_put_progress( + {"document": "y" * 32, "percentage": 0.55, "device": "TestDevice", "device_id": "D1"}, + "127.0.0.1", + debounce_manager=mock_debounce, + ) + + mock_debounce.record_event.assert_not_called() if __name__ == "__main__": diff --git a/tests/test_alignment_service.py b/tests/test_alignment_service.py index 5ad1445..7974091 100644 --- a/tests/test_alignment_service.py +++ b/tests/test_alignment_service.py @@ -36,7 +36,7 @@ def test_align_and_store_success(service, mock_db): # Ensure DB query returns None (Simulate no existing record) session.query.return_value.filter_by.return_value.first.return_value = None - result = service.align_and_store("test_id", segments, ebook_text) + result = service.align_and_store(42, segments, ebook_text) assert result == True session.add.assert_called() @@ -87,9 +87,9 @@ def test_get_time_for_text(service, mock_db): session.query.return_value.filter_by.return_value.first.return_value = mock_entry # Test Exact - ts = service.get_time_for_text("test_id", char_offset_hint=0) + ts = service.get_time_for_text(42, char_offset_hint=0) assert ts == 0.0 # Test Interpolation (50 chars -> 5.0s) - ts = service.get_time_for_text("test_id", char_offset_hint=50) + ts = service.get_time_for_text(42, char_offset_hint=50) assert ts == 5.0 diff --git a/tests/test_api_errors.py b/tests/test_api_errors.py new file mode 100644 index 0000000..8668d27 --- /dev/null +++ b/tests/test_api_errors.py @@ -0,0 +1,253 @@ +"""Tests for error paths in API blueprint (src/blueprints/api.py).""" + +from unittest.mock import Mock + +import pytest + +# ── Suggestion resolve/ignore/hide: DB raises ───────────────────── + +def test_hide_suggestion_returns_404_when_not_found(client, mock_container): + """hide_suggestion returns 404 when DB returns False (not found).""" + mock_container.mock_database_service.hide_suggestion.return_value = False + + response = client.post("/api/suggestions/nonexistent/hide") + + assert response.status_code == 404 + data = response.get_json() + assert data["success"] is False + assert "Not found" in data["error"] + + +def test_hide_suggestion_succeeds(client, mock_container): + """hide_suggestion returns 200 on success.""" + mock_container.mock_database_service.hide_suggestion.return_value = True + + response = client.post("/api/suggestions/abc123/hide") + + assert response.status_code == 200 + assert response.get_json()["success"] is True + + +def test_unhide_suggestion_returns_404_when_not_found(client, mock_container): + """unhide_suggestion returns 404 when DB returns False.""" + mock_container.mock_database_service.unhide_suggestion.return_value = False + + response = client.post("/api/suggestions/nonexistent/unhide") + + assert response.status_code == 404 + + +def test_ignore_suggestion_returns_404_when_not_found(client, mock_container): + """ignore_suggestion returns 404 when DB returns False.""" + mock_container.mock_database_service.ignore_suggestion.return_value = False + + response = client.post("/api/suggestions/nonexistent/ignore") + + assert response.status_code == 404 + data = response.get_json() + assert data["success"] is False + + +def test_ignore_suggestion_succeeds(client, mock_container): + """ignore_suggestion returns 200 on success.""" + mock_container.mock_database_service.ignore_suggestion.return_value = True + + response = client.post("/api/suggestions/abc123/ignore") + + assert response.status_code == 200 + 
assert response.get_json()["success"] is True + + +def test_hide_suggestion_with_source_param(client, mock_container): + """hide_suggestion passes source query param to DB service.""" + mock_container.mock_database_service.hide_suggestion.return_value = True + + response = client.post("/api/suggestions/abc/hide?source=kosync") + + assert response.status_code == 200 + mock_container.mock_database_service.hide_suggestion.assert_called_once_with("abc", source="kosync") + + +# ── Booklore search: client raises ──────────────────────────────── + +def test_booklore_search_returns_empty_when_not_configured(flask_app, mock_container): + """Booklore search returns empty list when client is not configured.""" + mock_container.mock_booklore_client.is_configured.return_value = False + + with flask_app.test_client() as client: + response = client.get("/api/booklore/search?q=test") + + assert response.status_code == 200 + assert response.get_json() == [] + + +def test_booklore_search_returns_empty_when_client_raises(flask_app, mock_container): + """Booklore search returns empty list when search_books throws.""" + mock_container.mock_booklore_client.is_configured.return_value = True + mock_container.mock_booklore_client.search_books.side_effect = Exception("Booklore down") + + with flask_app.test_client() as client: + response = client.get("/api/booklore/search?q=test") + + assert response.status_code == 200 + assert response.get_json() == [] + + +def test_booklore_search_returns_empty_for_empty_query(flask_app, mock_container): + """Booklore search returns empty list for empty query.""" + with flask_app.test_client() as client: + response = client.get("/api/booklore/search?q=") + + assert response.status_code == 200 + assert response.get_json() == [] + + +def test_booklore_search_returns_results(flask_app, mock_container): + """Booklore search returns formatted results on success.""" + mock_container.mock_booklore_client.is_configured.return_value = True + mock_container.mock_booklore_client.search_books.return_value = [ + {"id": 1, "title": "Dune", "authors": "Frank Herbert", "fileName": "dune.epub"}, + ] + + with flask_app.test_client() as client: + response = client.get("/api/booklore/search?q=dune") + + assert response.status_code == 200 + data = response.get_json() + assert len(data) == 1 + assert data[0]["title"] == "Dune" + assert data[0]["source"] == "Booklore" + + +# ── Storyteller search: client raises ───────────────────────────── + +def test_storyteller_search_raises(flask_app, mock_container): + """Storyteller search returns 500 when search_books throws (no try/except in route).""" + mock_container.mock_storyteller_client.is_configured.return_value = True + mock_container.mock_storyteller_client.search_books.side_effect = Exception("Storyteller down") + + # Disable exception propagation so Flask returns a 500 response + flask_app.config['TESTING'] = False + flask_app.config['PROPAGATE_EXCEPTIONS'] = False + + with flask_app.test_client() as client: + response = client.get("/api/storyteller/search?q=test") + + assert response.status_code == 500 + + +def test_storyteller_search_missing_query(flask_app, mock_container): + """Storyteller search returns 400 when query param is missing.""" + with flask_app.test_client() as client: + response = client.get("/api/storyteller/search?q=") + + assert response.status_code == 400 + data = response.get_json() + assert data["success"] is False + + +# ── Suggestion link-bookfusion: edge cases ──────────────────────── + +def 
test_link_bookfusion_suggestion_not_found(client, mock_container): + """link-bookfusion returns 404 when suggestion doesn't exist.""" + mock_container.mock_database_service.get_pending_suggestion.return_value = None + + response = client.post( + "/api/suggestions/abc/link-bookfusion", + json={"source": "abs", "match_index": 0}, + ) + + assert response.status_code == 404 + data = response.get_json() + assert data["success"] is False + + +def test_link_bookfusion_invalid_match_index(client, mock_container): + """link-bookfusion returns 400 when match_index is out of range.""" + suggestion = Mock() + suggestion.matches = [{"ebook_filename": "test.epub"}] + mock_container.mock_database_service.get_pending_suggestion.return_value = suggestion + + response = client.post( + "/api/suggestions/abc/link-bookfusion", + json={"source": "abs", "match_index": 5}, + ) + + assert response.status_code == 400 + + +def test_link_bookfusion_non_bookfusion_match(client, mock_container): + """link-bookfusion returns 400 when selected match is not a BookFusion candidate.""" + suggestion = Mock() + suggestion.matches = [{"ebook_filename": "test.epub", "source_family": "booklore", "bookfusion_ids": []}] + mock_container.mock_database_service.get_pending_suggestion.return_value = suggestion + + response = client.post( + "/api/suggestions/abc/link-bookfusion", + json={"source": "abs", "match_index": 0}, + ) + + assert response.status_code == 400 + + +def test_link_bookfusion_non_abs_source(client, mock_container): + """link-bookfusion returns 400 for non-abs source.""" + response = client.post( + "/api/suggestions/abc/link-bookfusion", + json={"source": "kosync"}, + ) + + assert response.status_code == 400 + data = response.get_json() + assert "ABS" in data["error"] + + +# ── Booklore link: recompute KOSync ────────────────────────────── + +def test_booklore_link_unlink(flask_app, mock_container): + """Booklore link with null filename unlinks the book.""" + book = Mock() + book.title = "Test" + book.ebook_filename = "old.epub" + book.original_ebook_filename = "old.epub" + book.kosync_doc_id = "hash" + mock_container.mock_database_service.get_book_by_ref.return_value = book + + with flask_app.test_client() as client: + response = client.post( + "/api/booklore/link/test-abs", + json={"filename": None}, + ) + + assert response.status_code == 200 + data = response.get_json() + assert data["success"] is True + assert "unlinked" in data["message"].lower() + + +# ── Booklore libraries: not configured ──────────────────────────── + +def test_booklore_libraries_not_configured(flask_app, mock_container): + """Booklore libraries returns 400 when not configured.""" + mock_container.mock_booklore_client.is_configured.return_value = False + + with flask_app.test_client() as client: + response = client.get("/api/booklore/libraries") + + assert response.status_code == 400 + data = response.get_json() + assert data["success"] is False + + +# ── Clear stale suggestions ─────────────────────────────────────── + +def test_clear_stale_suggestions(client, mock_container): + """clear_stale_suggestions returns count of cleared items.""" + mock_container.mock_database_service.clear_stale_suggestions.return_value = 5 + + response = client.post("/api/suggestions/clear_stale") + + assert response.status_code == 200 + data = response.get_json() + assert data["success"] is True + assert data["count"] == 5 diff --git a/tests/test_apply_settings_integration.py b/tests/test_apply_settings_integration.py new file mode 100644 index 0000000..4a5c7a0 
--- /dev/null
+++ b/tests/test_apply_settings_integration.py
@@ -0,0 +1,305 @@
+"""Integration tests for apply_settings hot-reload behaviour."""
+
+import os
+from unittest.mock import MagicMock, Mock, patch
+
+import pytest
+import schedule
+
+# ---------------------------------------------------------------------------
+# Unit-level tests for apply_settings (mock dependencies directly)
+# ---------------------------------------------------------------------------
+
+class TestApplySettingsSyncPeriod:
+    """SYNC_PERIOD_MINS reschedule logic."""
+
+    def setup_method(self):
+        schedule.clear()
+
+    def teardown_method(self):
+        schedule.clear()
+
+    def test_valid_sync_period_reschedules(self):
+        """apply_settings clears the old job and creates a new one with the correct period."""
+        from src.web_server import apply_settings
+
+        app = MagicMock()
+        sync_mgr = Mock()
+        app.config = {
+            'sync_manager': sync_mgr,
+            'abs_listener': None,
+            '_abs_listener_server': '',
+            '_abs_listener_key': '',
+        }
+
+        with patch.dict(os.environ, {
+            'SYNC_PERIOD_MINS': '10',
+            'LOG_LEVEL': 'INFO',
+            'INSTANT_SYNC_ENABLED': 'false',
+            'ABS_SOCKET_ENABLED': 'false',
+            'TELEGRAM_ENABLED': 'false',
+        }), patch('src.utils.logging_utils.reconcile_telegram_logging'):
+            apply_settings(app)
+
+        # Exactly one job should remain, tagged 'sync_cycle'
+        jobs = schedule.get_jobs('sync_cycle')
+        assert len(jobs) == 1
+        assert jobs[0].interval == 10
+
+    def test_invalid_sync_period_non_integer(self):
+        """Non-integer SYNC_PERIOD_MINS is collected and surfaced as a RuntimeError rather than crashing mid-apply."""
+        from src.web_server import apply_settings
+
+        app = MagicMock()
+        app.config = {
+            'sync_manager': Mock(),
+            'abs_listener': None,
+            '_abs_listener_server': '',
+            '_abs_listener_key': '',
+        }
+
+        with patch.dict(os.environ, {
+            'SYNC_PERIOD_MINS': 'abc',
+            'LOG_LEVEL': 'INFO',
+            'INSTANT_SYNC_ENABLED': 'false',
+            'ABS_SOCKET_ENABLED': 'false',
+            'TELEGRAM_ENABLED': 'false',
+        }), patch('src.utils.logging_utils.reconcile_telegram_logging'):
+            with pytest.raises(RuntimeError, match='sync reschedule failed'):
+                apply_settings(app)
+
+    def test_zero_sync_period_raises(self):
+        """SYNC_PERIOD_MINS=0 is rejected."""
+        from src.web_server import apply_settings
+
+        app = MagicMock()
+        app.config = {
+            'sync_manager': Mock(),
+            'abs_listener': None,
+            '_abs_listener_server': '',
+            '_abs_listener_key': '',
+        }
+
+        with patch.dict(os.environ, {
+            'SYNC_PERIOD_MINS': '0',
+            'LOG_LEVEL': 'INFO',
+            'INSTANT_SYNC_ENABLED': 'false',
+            'ABS_SOCKET_ENABLED': 'false',
+            'TELEGRAM_ENABLED': 'false',
+        }), patch('src.utils.logging_utils.reconcile_telegram_logging'):
+            with pytest.raises(RuntimeError, match='must be an integer greater than 0'):
+                apply_settings(app)
+
+    def test_negative_sync_period_raises(self):
+        """Negative SYNC_PERIOD_MINS is rejected."""
+        from src.web_server import apply_settings
+
+        app = MagicMock()
+        app.config = {
+            'sync_manager': Mock(),
+            'abs_listener': None,
+            '_abs_listener_server': '',
+            '_abs_listener_key': '',
+        }
+
+        with patch.dict(os.environ, {
+            'SYNC_PERIOD_MINS': '-5',
+            'LOG_LEVEL': 'INFO',
+            'INSTANT_SYNC_ENABLED': 'false',
+            'ABS_SOCKET_ENABLED': 'false',
+            'TELEGRAM_ENABLED': 'false',
+        }), patch('src.utils.logging_utils.reconcile_telegram_logging'):
+            with pytest.raises(RuntimeError, match='must be an integer greater than 0'):
+                apply_settings(app)
+
+
+# ---------------------------------------------------------------------------
+# Socket listener reconciliation
+#
--------------------------------------------------------------------------- + +class TestSocketListenerReconciliation: + """_reconcile_socket_listener start/stop/restart behaviour.""" + + @patch('src.services.abs_socket_listener.ABSSocketListener') + @patch('threading.Thread') + def test_starts_listener_when_enabled_and_none_running(self, mock_thread_cls, mock_listener_cls): + """Listener is created and started when config says enabled and none exists.""" + from src.web_server import _reconcile_socket_listener + + mock_listener = Mock() + mock_listener_cls.return_value = mock_listener + + mock_thread = Mock() + mock_thread_cls.return_value = mock_thread + + app = MagicMock() + app.config = { + 'abs_listener': None, + '_abs_listener_server': '', + '_abs_listener_key': '', + 'database_service': Mock(), + 'sync_manager': Mock(), + } + + with patch.dict(os.environ, { + 'INSTANT_SYNC_ENABLED': 'true', + 'ABS_SOCKET_ENABLED': 'true', + 'ABS_SERVER': 'http://abs:13378', + 'ABS_KEY': 'secret-key', + }): + _reconcile_socket_listener(app) + + mock_listener_cls.assert_called_once() + mock_thread.start.assert_called_once() + assert app.config['abs_listener'] is mock_listener + + @patch('src.services.abs_socket_listener.ABSSocketListener') + def test_stops_listener_when_disabled(self, mock_listener_cls): + """Running listener is stopped when socket is disabled.""" + from src.web_server import _reconcile_socket_listener + + existing_listener = Mock() + app = MagicMock() + app.config = { + 'abs_listener': existing_listener, + '_abs_listener_server': 'http://abs:13378', + '_abs_listener_key': 'old-key', + 'database_service': Mock(), + 'sync_manager': Mock(), + } + + with patch.dict(os.environ, { + 'INSTANT_SYNC_ENABLED': 'true', + 'ABS_SOCKET_ENABLED': 'false', + 'ABS_SERVER': 'http://abs:13378', + 'ABS_KEY': 'secret-key', + }): + _reconcile_socket_listener(app) + + existing_listener.stop.assert_called_once() + assert app.config['abs_listener'] is None + + @patch('src.services.abs_socket_listener.ABSSocketListener') + @patch('threading.Thread') + def test_restarts_listener_when_credentials_change(self, mock_thread_cls, mock_listener_cls): + """Listener is restarted when server URL or key change.""" + from src.web_server import _reconcile_socket_listener + + old_listener = Mock() + new_listener = Mock() + mock_listener_cls.return_value = new_listener + mock_thread_cls.return_value = Mock() + + app = MagicMock() + app.config = { + 'abs_listener': old_listener, + '_abs_listener_server': 'http://old-server:13378', + '_abs_listener_key': 'old-key', + 'database_service': Mock(), + 'sync_manager': Mock(), + } + + with patch.dict(os.environ, { + 'INSTANT_SYNC_ENABLED': 'true', + 'ABS_SOCKET_ENABLED': 'true', + 'ABS_SERVER': 'http://new-server:13378', + 'ABS_KEY': 'new-key', + }): + _reconcile_socket_listener(app) + + old_listener.stop.assert_called_once() + mock_listener_cls.assert_called_once() + assert app.config['abs_listener'] is new_listener + + +# --------------------------------------------------------------------------- +# Telegram reconciliation failure +# --------------------------------------------------------------------------- + +class TestTelegramReconciliationFailure: + """Telegram failure is collected but other settings still apply.""" + + def setup_method(self): + schedule.clear() + + def teardown_method(self): + schedule.clear() + + def test_telegram_failure_collects_error_others_still_apply(self): + """If reconcile_telegram_logging raises, error is collected in RuntimeError + but sync schedule 
and config refresh still happen.""" + from src.web_server import apply_settings + + app = MagicMock() + sync_mgr = Mock() + app.config = { + 'sync_manager': sync_mgr, + 'abs_listener': None, + '_abs_listener_server': '', + '_abs_listener_key': '', + } + + with patch.dict(os.environ, { + 'SYNC_PERIOD_MINS': '7', + 'LOG_LEVEL': 'INFO', + 'INSTANT_SYNC_ENABLED': 'false', + 'ABS_SOCKET_ENABLED': 'false', + 'ABS_COLLECTION_NAME': 'MyCollection', + 'SUGGESTIONS_ENABLED': 'true', + }), patch('src.utils.logging_utils.reconcile_telegram_logging', side_effect=RuntimeError('telegram boom')): + with pytest.raises(RuntimeError, match='telegram logging reconciliation failed'): + apply_settings(app) + + # Sync schedule was still updated despite telegram failure + jobs = schedule.get_jobs('sync_cycle') + assert len(jobs) == 1 + assert jobs[0].interval == 7 + + # Config refresh still happened + assert app.config['ABS_COLLECTION_NAME'] == 'MyCollection' + assert app.config['SUGGESTIONS_ENABLED'] is True + + +# --------------------------------------------------------------------------- +# Route-level: POST /settings → apply_settings called → schedule changed +# --------------------------------------------------------------------------- + +class TestSettingsRouteIntegration: + """Full POST /settings verifying apply_settings is called and schedule is updated.""" + + def test_post_settings_calls_apply_and_updates_schedule(self, mock_container, flask_app, client): + """POST /settings saves settings to DB and calls apply_settings.""" + schedule.clear() + + with patch('src.blueprints.settings_bp.get_database_service', + return_value=mock_container.mock_database_service): + with patch('src.web_server.apply_settings') as mock_apply: + mock_apply.return_value = True + + resp = client.post('/settings', data={ + 'SYNC_PERIOD_MINS': '15', + '_active_tab': 'general', + }, follow_redirects=False) + + # Should redirect back to settings page + assert resp.status_code == 302 + + # apply_settings was invoked + mock_apply.assert_called_once() + + def test_post_settings_apply_failure_sets_error_session(self, mock_container, flask_app, client): + """When apply_settings raises, the session message is an error.""" + with patch('src.blueprints.settings_bp.get_database_service', + return_value=mock_container.mock_database_service): + with patch('src.web_server.apply_settings', side_effect=RuntimeError('boom')): + resp = client.post('/settings', data={ + 'SYNC_PERIOD_MINS': '15', + '_active_tab': 'general', + }, follow_redirects=False) + + assert resp.status_code == 302 + + # Follow the redirect to GET /settings to verify the error message + with patch('src.version.get_update_status', return_value=(None, False)): + get_resp = client.get('/settings') + assert b'Error applying settings' in get_resp.data diff --git a/tests/test_background_job_service.py b/tests/test_background_job_service.py new file mode 100644 index 0000000..09f2982 --- /dev/null +++ b/tests/test_background_job_service.py @@ -0,0 +1,637 @@ +"""Tests for BackgroundJobService — focused on error paths and job lifecycle.""" + +import time +from pathlib import Path +from unittest.mock import MagicMock, Mock, patch + +import pytest + +from src.db.models import Job +from src.services.background_job_service import BackgroundJobService +from src.services.storyteller_submission_service import StorytellerDeferral + + +def _make_service(**overrides): + """Create a BackgroundJobService with mock dependencies.""" + defaults = dict( + database_service=Mock(), + abs_client=Mock(), + 
booklore_client=Mock(), + ebook_parser=Mock(), + transcriber=Mock(), + alignment_service=Mock(), + library_service=Mock(), + storyteller_client=None, + storyteller_submission_service=None, + epub_cache_dir="/tmp/epub_cache", + data_dir="/tmp/data", + books_dir="/tmp/books", + ) + defaults.update(overrides) + return BackgroundJobService(**defaults) + + +def _make_book(**kwargs): + """Create a mock book object with sensible defaults.""" + book = Mock() + book.id = kwargs.get("id", 1) + book.abs_id = kwargs.get("abs_id", "abs-123") + book.title = kwargs.get("title", "Test Book") + book.status = kwargs.get("status", "pending") + book.ebook_filename = kwargs.get("ebook_filename", "test.epub") + book.storyteller_uuid = kwargs.get("storyteller_uuid", None) + book.kosync_doc_id = kwargs.get("kosync_doc_id", None) + book.original_ebook_filename = kwargs.get("original_ebook_filename", None) + book.transcript_file = kwargs.get("transcript_file", None) + return book + + +def _make_job(**kwargs): + """Create a mock job object.""" + job = Mock(spec=Job) + job.retry_count = kwargs.get("retry_count", 0) + job.last_attempt = kwargs.get("last_attempt", 0) + job.last_error = kwargs.get("last_error", None) + job.progress = kwargs.get("progress", 0.0) + job.book_id = kwargs.get("book_id", 1) + job.abs_id = kwargs.get("abs_id", "abs-123") + return job + + +# --------------------------------------------------------------------------- +# check_pending_jobs +# --------------------------------------------------------------------------- + + +class TestCheckPendingJobs: + def test_no_pending_or_failed_jobs(self): + db = Mock() + db.get_books_by_status.return_value = [] + service = _make_service(database_service=db) + + service.check_pending_jobs() + + # Called for "pending", then "failed_retry_later" + assert db.get_books_by_status.call_count == 2 + db.save_book.assert_not_called() + db.save_job.assert_not_called() + + def test_skips_if_thread_alive(self): + db = Mock() + service = _make_service(database_service=db) + service._job_thread = Mock() + service._job_thread.is_alive.return_value = True + + service.check_pending_jobs() + + db.get_books_by_status.assert_not_called() + + def test_starts_thread_for_pending_book(self): + book = _make_book() + db = Mock() + db.get_books_by_status.return_value = [book] + db.get_latest_job.return_value = None + service = _make_service(database_service=db) + + with patch("threading.Thread") as mock_thread: + mock_instance = Mock() + mock_thread.return_value = mock_instance + mock_instance.is_alive.return_value = False + + service.check_pending_jobs() + + assert book.status == "processing" + db.save_book.assert_called_with(book) + db.save_job.assert_called_once() + mock_thread.assert_called_once() + mock_instance.start.assert_called_once() + + def test_retry_respects_max_retries(self): + """Books at max retries are skipped.""" + book = _make_book(status="failed_retry_later") + job = _make_job(retry_count=5, last_attempt=0) + + db = Mock() + db.get_books_by_status.side_effect = lambda s: ( + [] if s == "pending" else [book] if s == "failed_retry_later" else [] + ) + db.get_latest_job.return_value = job + + service = _make_service(database_service=db) + + with patch.dict("os.environ", {"JOB_MAX_RETRIES": "5"}): + service.check_pending_jobs() + + db.save_book.assert_not_called() + + def test_retry_respects_delay(self): + """Books within retry delay window are skipped.""" + book = _make_book(status="failed_retry_later") + job = _make_job(retry_count=1, last_attempt=time.time()) # just 
now + + db = Mock() + db.get_books_by_status.side_effect = lambda s: ( + [] if s == "pending" else [book] if s == "failed_retry_later" else [] + ) + db.get_latest_job.return_value = job + + service = _make_service(database_service=db) + + with patch.dict("os.environ", {"JOB_RETRY_DELAY_MINS": "15"}): + service.check_pending_jobs() + + db.save_book.assert_not_called() + + def test_retry_picks_up_eligible_book(self): + """Book past retry delay and under max retries gets picked up.""" + book = _make_book(status="failed_retry_later") + job = _make_job(retry_count=1, last_attempt=time.time() - 3600) + + db = Mock() + db.get_books_by_status.side_effect = lambda s: ( + [] if s == "pending" else [book] if s == "failed_retry_later" else [] + ) + db.get_latest_job.return_value = job + + service = _make_service(database_service=db) + + with patch("threading.Thread") as mock_thread, patch.dict( + "os.environ", {"JOB_MAX_RETRIES": "5", "JOB_RETRY_DELAY_MINS": "15"} + ): + mock_instance = Mock() + mock_thread.return_value = mock_instance + mock_instance.is_alive.return_value = False + + service.check_pending_jobs() + + assert book.status == "processing" + mock_instance.start.assert_called_once() + + def test_preserves_existing_retry_count(self): + """When creating a job record for a pending book with existing job, retry_count is preserved.""" + book = _make_book() + existing_job = _make_job(retry_count=3) + + db = Mock() + db.get_books_by_status.return_value = [book] + db.get_latest_job.return_value = existing_job + + service = _make_service(database_service=db) + + with patch("threading.Thread") as mock_thread: + mock_instance = Mock() + mock_thread.return_value = mock_instance + mock_instance.is_alive.return_value = False + + service.check_pending_jobs() + + saved_job = db.save_job.call_args[0][0] + assert saved_job.retry_count == 3 + + +# --------------------------------------------------------------------------- +# _run_background_job +# --------------------------------------------------------------------------- + + +class TestRunBackgroundJob: + def test_success_flow(self): + """All three phases succeed — book becomes active.""" + book = _make_book() + db = Mock() + job = _make_job() + db.get_latest_job.return_value = job + service = _make_service(database_service=db) + + service._phase_acquire_epub = Mock(return_value=(Path("/tmp/test.epub"), {})) + service._phase_transcription = Mock(return_value=("transcript", "WHISPER", "text", [])) + service._phase_alignment = Mock() + + service._run_background_job(book) + + service._phase_acquire_epub.assert_called_once() + service._phase_transcription.assert_called_once() + service._phase_alignment.assert_called_once() + + def test_storyteller_deferral_does_not_increment_retry(self): + """StorytellerDeferral sets failed_retry_later without incrementing retry_count.""" + book = _make_book() + db = Mock() + existing_job = _make_job(retry_count=2, progress=0.3) + db.get_latest_job.return_value = existing_job + service = _make_service(database_service=db) + + service._phase_acquire_epub = Mock(side_effect=StorytellerDeferral("waiting")) + + service._run_background_job(book) + + assert book.status == "failed_retry_later" + db.save_book.assert_called_with(book) + saved_job = db.save_job.call_args[0][0] + assert saved_job.retry_count == 2 # not incremented + + def test_exception_increments_retry_count(self): + """Generic exception increments retry_count.""" + book = _make_book() + db = Mock() + existing_job = _make_job(retry_count=1, progress=0.1) + 
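        # A prior attempt exists (retry_count=1); the RuntimeError below should bump it to 2. + 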
db.get_latest_job.return_value = existing_job + service = _make_service(database_service=db) + + service._phase_acquire_epub = Mock(side_effect=RuntimeError("disk full")) + + with patch.dict("os.environ", {"JOB_MAX_RETRIES": "5"}): + service._run_background_job(book) + + assert book.status == "failed_retry_later" + saved_job = db.save_job.call_args[0][0] + assert saved_job.retry_count == 2 + assert saved_job.last_error == "disk full" + + def test_max_retries_marks_permanent_failure(self): + """Hitting max retries sets status to failed_permanent.""" + book = _make_book() + db = Mock() + existing_job = _make_job(retry_count=4, progress=0.5) + db.get_latest_job.return_value = existing_job + service = _make_service(database_service=db) + + service._phase_acquire_epub = Mock(side_effect=RuntimeError("always fails")) + + with patch.dict("os.environ", {"JOB_MAX_RETRIES": "5"}): + service._run_background_job(book) + + assert book.status == "failed_permanent" + + def test_max_retries_cleans_audio_cache(self): + """On permanent failure, audio cache directory is cleaned up.""" + book = _make_book(abs_id="abc-456") + db = Mock() + existing_job = _make_job(retry_count=4) + db.get_latest_job.return_value = existing_job + service = _make_service(database_service=db, data_dir="/tmp/data") + + service._phase_acquire_epub = Mock(side_effect=RuntimeError("fail")) + + with patch.dict("os.environ", {"JOB_MAX_RETRIES": "5"}), \ + patch("shutil.rmtree") as mock_rmtree, \ + patch.object(Path, "exists", return_value=True): + service._run_background_job(book) + + mock_rmtree.assert_called_once() + + def test_max_retries_audio_cache_cleanup_failure_logged(self): + """Cache cleanup failure is caught and logged, not re-raised.""" + book = _make_book(abs_id="abc-456") + db = Mock() + existing_job = _make_job(retry_count=4) + db.get_latest_job.return_value = existing_job + service = _make_service(database_service=db, data_dir="/tmp/data") + + service._phase_acquire_epub = Mock(side_effect=RuntimeError("fail")) + + with patch.dict("os.environ", {"JOB_MAX_RETRIES": "5"}), \ + patch("shutil.rmtree", side_effect=OSError("permission denied")), \ + patch.object(Path, "exists", return_value=True): + # Should not raise + service._run_background_job(book) + + assert book.status == "failed_permanent" + + def test_exception_with_no_existing_job(self): + """When get_latest_job returns None, retry_count starts from 0.""" + book = _make_book() + db = Mock() + db.get_latest_job.return_value = None + service = _make_service(database_service=db) + + service._phase_acquire_epub = Mock(side_effect=RuntimeError("oops")) + + with patch.dict("os.environ", {"JOB_MAX_RETRIES": "5"}): + service._run_background_job(book) + + saved_job = db.save_job.call_args[0][0] + assert saved_job.retry_count == 1 + assert book.status == "failed_retry_later" + + +# --------------------------------------------------------------------------- +# cleanup_stale_jobs +# --------------------------------------------------------------------------- + + +class TestCleanupStaleJobs: + def test_resets_crashed_books(self): + crashed_book = _make_book(status="crashed") + db = Mock() + db.get_books_by_status.side_effect = lambda s: ( + [crashed_book] if s == "crashed" else [] + ) + service = _make_service(database_service=db) + + service.cleanup_stale_jobs() + + assert crashed_book.status == "active" + db.save_book.assert_called_with(crashed_book) + + def test_recovers_processing_book_with_alignment(self): + """Processing book with existing alignment is set to active.""" + 
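        # A 'processing' status at startup means a run was interrupted; existing alignment data lets it recover. + 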
book = _make_book(status="processing") + alignment = Mock() + alignment.has_alignment.return_value = True + + db = Mock() + db.get_books_by_status.side_effect = lambda s: ( + [book] if s == "processing" else [] + ) + service = _make_service(database_service=db, alignment_service=alignment) + + service.cleanup_stale_jobs() + + assert book.status == "active" + + def test_recovers_processing_book_without_alignment(self): + """Processing book without alignment is set to failed_retry_later with a job record.""" + book = _make_book(status="processing") + existing_job = _make_job(retry_count=2) + + alignment = Mock() + alignment.has_alignment.return_value = False + + db = Mock() + db.get_books_by_status.side_effect = lambda s: ( + [book] if s == "processing" else [] + ) + db.get_latest_job.return_value = existing_job + service = _make_service(database_service=db, alignment_service=alignment) + + service.cleanup_stale_jobs() + + assert book.status == "failed_retry_later" + saved_job = db.save_job.call_args[0][0] + assert saved_job.retry_count == 2 + assert saved_job.last_error == "Interrupted by restart" + + def test_processing_book_no_existing_job_gets_zero_retries(self): + """Processing book without existing job record gets retry_count=0.""" + book = _make_book(status="processing") + alignment = Mock() + alignment.has_alignment.return_value = False + + db = Mock() + db.get_books_by_status.side_effect = lambda s: ( + [book] if s == "processing" else [] + ) + db.get_latest_job.return_value = None + service = _make_service(database_service=db, alignment_service=alignment) + + service.cleanup_stale_jobs() + + saved_job = db.save_job.call_args[0][0] + assert saved_job.retry_count == 0 + + def test_exception_in_cleanup_is_caught(self): + """Errors during cleanup are logged, not raised.""" + db = Mock() + db.get_books_by_status.side_effect = RuntimeError("db gone") + service = _make_service(database_service=db) + + # Should not raise + service.cleanup_stale_jobs() + + def test_failed_permanent_with_alignment_becomes_active(self): + """Even failed_permanent books become active if alignment data exists.""" + book = _make_book(status="failed_permanent") + alignment = Mock() + alignment.has_alignment.return_value = True + + db = Mock() + db.get_books_by_status.side_effect = lambda s: ( + [book] if s == "failed_permanent" else [] + ) + service = _make_service(database_service=db, alignment_service=alignment) + + service.cleanup_stale_jobs() + + assert book.status == "active" + + +# --------------------------------------------------------------------------- +# prune_hardcover_sync_logs +# --------------------------------------------------------------------------- + + +class TestPruneHardcoverSyncLogs: + def test_prune_calls_db(self): + db = Mock() + db.prune_hardcover_sync_logs.return_value = 5 + service = _make_service(database_service=db) + + with patch.dict("os.environ", {"HARDCOVER_LOG_RETENTION_DAYS": "30"}): + service.prune_hardcover_sync_logs() + + db.prune_hardcover_sync_logs.assert_called_once() + + def test_prune_zero_deleted_no_error(self): + db = Mock() + db.prune_hardcover_sync_logs.return_value = 0 + service = _make_service(database_service=db) + + service.prune_hardcover_sync_logs() + + db.prune_hardcover_sync_logs.assert_called_once() + + def test_prune_exception_is_caught(self): + db = Mock() + db.prune_hardcover_sync_logs.side_effect = RuntimeError("db locked") + service = _make_service(database_service=db) + + # Should not raise + service.prune_hardcover_sync_logs() + + +# 
--------------------------------------------------------------------------- +# _phase_acquire_epub +# --------------------------------------------------------------------------- + + +class TestPhaseAcquireEpub: + def test_library_service_failure_falls_back(self): + """If library_service.acquire_ebook raises, falls back to get_local_epub.""" + book = _make_book() + db = Mock() + lib = Mock() + lib.acquire_ebook.side_effect = RuntimeError("nfs down") + abs_client = Mock() + abs_client.get_item_details.return_value = {"some": "details"} + ebook_parser = Mock() + ebook_parser.get_kosync_id.return_value = None + + service = _make_service( + database_service=db, + abs_client=abs_client, + library_service=lib, + ebook_parser=ebook_parser, + ) + + with patch("src.services.background_job_service.get_local_epub", return_value="/tmp/test.epub"): + epub_path, details = service._phase_acquire_epub(book, Mock()) + + assert epub_path == Path("/tmp/test.epub") + + def test_no_epub_found_raises(self): + """FileNotFoundError when no epub source works.""" + book = _make_book() + abs_client = Mock() + abs_client.get_item_details.return_value = {} + service = _make_service(abs_client=abs_client, library_service=None) + + with patch("src.services.background_job_service.get_local_epub", return_value=None): + with pytest.raises(FileNotFoundError, match="Could not locate"): + service._phase_acquire_epub(book, Mock()) + + def test_kosync_lock_failure_is_caught(self): + """Failure to lock KOSync ID is caught, not raised.""" + book = _make_book(kosync_doc_id=None) + abs_client = Mock() + abs_client.get_item_details.return_value = {"id": "test"} # truthy so library_service is used + lib = Mock() + lib.acquire_ebook.return_value = "/tmp/test.epub" + ebook_parser = Mock() + ebook_parser.get_kosync_id.side_effect = RuntimeError("epub corrupt") + db = Mock() + + service = _make_service( + database_service=db, + abs_client=abs_client, + library_service=lib, + ebook_parser=ebook_parser, + ) + + epub_path, _ = service._phase_acquire_epub(book, Mock()) + + assert epub_path == Path("/tmp/test.epub") + assert book.kosync_doc_id is None # not set due to error + + +# --------------------------------------------------------------------------- +# _phase_alignment +# --------------------------------------------------------------------------- + + +class TestPhaseAlignment: + def test_storyteller_native_skips_transcript(self): + """STORYTELLER_NATIVE source skips transcript requirement.""" + book = _make_book() + db = Mock() + job = _make_job() + db.get_latest_job.return_value = job + service = _make_service(database_service=db) + + service._phase_alignment( + book, "abs-123", "Test", Path("/tmp/t.epub"), + None, "STORYTELLER_NATIVE", "text", [], Mock(), + ) + + assert book.status == "active" + + def test_no_transcript_raises(self): + """No transcript with non-Storyteller source raises.""" + service = _make_service() + with pytest.raises(Exception, match="Failed to generate transcript"): + service._phase_alignment( + _make_book(), "abs-123", "Test", Path("/tmp/t.epub"), + None, "WHISPER", "text", [], Mock(), + ) + + def test_no_alignment_service_raises(self): + """Missing alignment_service raises.""" + service = _make_service(alignment_service=None) + with pytest.raises(Exception, match="alignment_service not available"): + service._phase_alignment( + _make_book(), "abs-123", "Test", Path("/tmp/t.epub"), + "transcript", "WHISPER", "text", [], Mock(), + ) + + def test_alignment_failure_raises(self): + 
"""alignment_service.align_and_store returning False raises.""" + alignment = Mock() + alignment.align_and_store.return_value = False + service = _make_service(alignment_service=alignment) + + with pytest.raises(Exception, match="Alignment failed"): + service._phase_alignment( + _make_book(), "abs-123", "Test", Path("/tmp/t.epub"), + "transcript", "WHISPER", "text", [], Mock(), + ) + + def test_success_updates_book_and_job(self): + """Successful alignment updates book status and clears job error.""" + book = _make_book() + alignment = Mock() + alignment.align_and_store.return_value = True + db = Mock() + job = _make_job(retry_count=2, last_error="prev error") + db.get_latest_job.return_value = job + service = _make_service(database_service=db, alignment_service=alignment) + + service._phase_alignment( + book, "abs-123", "Test", Path("/tmp/test.epub"), + "transcript", "WHISPER", "text", [], Mock(), + ) + + assert book.status == "active" + assert book.ebook_filename == "test.epub" + assert job.retry_count == 0 + assert job.last_error is None + assert job.progress == 1.0 + + def test_no_job_record_on_completion_does_not_crash(self): + """If job record is missing at completion, it logs but does not crash.""" + book = _make_book() + alignment = Mock() + alignment.align_and_store.return_value = True + db = Mock() + db.get_latest_job.return_value = None + service = _make_service(database_service=db, alignment_service=alignment) + + # Should not raise + service._phase_alignment( + book, "abs-123", "Test", Path("/tmp/test.epub"), + "transcript", "WHISPER", "text", [], Mock(), + ) + + assert book.status == "active" + + +# --------------------------------------------------------------------------- +# _phase_transcription (SMIL except block) +# --------------------------------------------------------------------------- + + +class TestPhaseTranscription: + def test_smil_failure_falls_back_to_whisper(self): + """SMIL extraction exception falls through to Whisper.""" + book = _make_book(storyteller_uuid=None) + abs_client = Mock() + abs_client.get_audio_files.return_value = ["file.mp3"] + ebook_parser = Mock() + ebook_parser.extract_text_and_map.return_value = ("book text", {}) + transcriber = Mock() + transcriber.transcribe_from_smil.side_effect = RuntimeError("bad smil") + transcriber.process_audio.return_value = "whisper transcript" + + service = _make_service( + abs_client=abs_client, + ebook_parser=ebook_parser, + transcriber=transcriber, + storyteller_client=None, + ) + + with patch.dict("os.environ", {"STORYTELLER_FORCE_MODE": "false"}): + raw, source, text, chapters = service._phase_transcription( + book, "abs-123", "Test", Path("/tmp/t.epub"), {"media": {"chapters": []}}, Mock() + ) + + assert source == "WHISPER" + assert raw == "whisper transcript" diff --git a/tests/test_book_metadata_service.py b/tests/test_book_metadata_service.py new file mode 100644 index 0000000..bed6756 --- /dev/null +++ b/tests/test_book_metadata_service.py @@ -0,0 +1,246 @@ +"""Tests for book_metadata_service — focused on error paths and fallback behavior.""" + +import sys +from pathlib import Path +from unittest.mock import Mock, patch + +import pytest + +sys.path.insert(0, str(Path(__file__).parent.parent)) + +from src.services.book_metadata_service import build_book_metadata, build_service_info + + +def _make_book(**overrides): + book = Mock() + book.id = overrides.get('id', 1) + book.abs_id = overrides.get('abs_id', 'abs-123') + book.sync_mode = overrides.get('sync_mode', 'audio_ebook') + book.author = 
overrides.get('author', 'Cached Author') + book.subtitle = overrides.get('subtitle', 'Cached Subtitle') + book.duration = overrides.get('duration', 7200) + book.ebook_filename = overrides.get('ebook_filename', 'book.epub') + book.original_ebook_filename = overrides.get('original_ebook_filename', None) + book.kosync_doc_id = overrides.get('kosync_doc_id', None) + book.storyteller_uuid = overrides.get('storyteller_uuid', None) + book.title = overrides.get('title', 'Test Book') + book.status = overrides.get('status', 'active') + return book + + +def _make_container(hc_configured=False, bl_configured=False): + container = Mock() + + hc_client = Mock() + hc_client.is_configured.return_value = hc_configured + container.hardcover_client.return_value = hc_client + + bl_client = Mock() + bl_client.is_configured.return_value = bl_configured + container.booklore_client_group.return_value = bl_client + container.booklore_client.return_value = bl_client + + container.storyteller_client.return_value = Mock(is_configured=Mock(return_value=False)) + container.bookfusion_client.return_value = Mock(is_configured=Mock(return_value=False)) + + return container + + +def _make_abs_service(available=True, item_details=None): + svc = Mock() + svc.is_available.return_value = available + svc.get_item_details.return_value = item_details + svc.abs_client = Mock(base_url='http://abs:13378') + return svc + + +def _make_db_service(hardcover=None, bf_book=None): + db = Mock() + db.get_hardcover_details.return_value = hardcover + db.get_bookfusion_book_by_book_id.return_value = bf_book + return db + + +# --------------------------------------------------------------------------- +# ABS API failure +# --------------------------------------------------------------------------- + +class TestABSFailureFallback: + """When abs_service.get_item_details raises, metadata should fall back to cached book fields.""" + + def test_abs_api_exception_falls_back_to_cached_author(self): + book = _make_book(author='Fallback Author', duration=3700) + abs_service = _make_abs_service() + abs_service.get_item_details.side_effect = ConnectionError("ABS unreachable") + db = _make_db_service() + container = _make_container() + + result = build_book_metadata(book, container, db, abs_service) + + assert result['author'] == 'Fallback Author' + + def test_abs_api_exception_falls_back_to_cached_subtitle(self): + book = _make_book(subtitle='A Subtitle') + abs_service = _make_abs_service() + abs_service.get_item_details.side_effect = RuntimeError("timeout") + db = _make_db_service() + container = _make_container() + + result = build_book_metadata(book, container, db, abs_service) + + assert result['subtitle'] == 'A Subtitle' + + def test_abs_api_exception_falls_back_to_cached_duration(self): + book = _make_book(duration=5400) # 1h 30m + abs_service = _make_abs_service() + abs_service.get_item_details.side_effect = Exception("fail") + db = _make_db_service() + container = _make_container() + + result = build_book_metadata(book, container, db, abs_service) + + assert result['duration'] == '1h 30m' + + def test_abs_returns_none_falls_back_to_cached(self): + book = _make_book(author='Cached Author') + abs_service = _make_abs_service(item_details=None) + db = _make_db_service() + container = _make_container() + + result = build_book_metadata(book, container, db, abs_service) + + assert result['author'] == 'Cached Author' + + +# --------------------------------------------------------------------------- +# Booklore API failure +# 
--------------------------------------------------------------------------- + +class TestBookloreFailure: + """When Booklore client raises, the metadata build should still succeed.""" + + def test_booklore_exception_does_not_crash(self): + book = _make_book(ebook_filename='test.epub') + abs_service = _make_abs_service(item_details=None) + db = _make_db_service() + + bl_client = Mock() + bl_client.is_configured.return_value = True + bl_client.find_book_by_filename.side_effect = ConnectionError("Booklore down") + + result = build_book_metadata(book, Mock(), db, abs_service, booklore_client=bl_client) + + assert 'booklore_url' not in result + + def test_booklore_not_configured_skips_lookup(self): + book = _make_book(ebook_filename='test.epub') + abs_service = _make_abs_service(item_details=None) + db = _make_db_service() + + bl_client = Mock() + bl_client.is_configured.return_value = False + + result = build_book_metadata(book, Mock(), db, abs_service, booklore_client=bl_client) + + bl_client.find_book_by_filename.assert_not_called() + assert 'booklore_url' not in result + + +# --------------------------------------------------------------------------- +# Hardcover metadata failure +# --------------------------------------------------------------------------- + +class TestHardcoverFailure: + """When hardcover_client.get_book_metadata raises, metadata build still succeeds.""" + + def test_hardcover_metadata_exception_does_not_crash(self): + book = _make_book(sync_mode='ebook_only') + abs_service = _make_abs_service(item_details=None) + + hardcover = Mock() + hardcover.isbn = '1234567890' + hardcover.asin = None + hardcover.hardcover_pages = 300 + hardcover.hardcover_slug = 'test-book' + hardcover.hardcover_book_id = '42' + hardcover.hardcover_status_id = None + hardcover.matched_by = 'manual' + + db = _make_db_service(hardcover=hardcover) + + container = _make_container(hc_configured=True) + hc_client = container.hardcover_client() + hc_client.get_book_metadata.side_effect = TimeoutError("HC API timeout") + + result = build_book_metadata(book, container, db, abs_service) + + # Hardcover DB details should still be present + assert result['isbn'] == '1234567890' + assert result['pages'] == 300 + # But HC API metadata (description, tags) should be absent + assert 'hc_tags' not in result + + +# --------------------------------------------------------------------------- +# Caching / fallback behavior +# --------------------------------------------------------------------------- + +class TestCachingBehavior: + """Verify that cached book fields are used when API data is missing.""" + + def test_ebook_only_skips_abs_call(self): + book = _make_book(sync_mode='ebook_only', author='Ebook Author') + abs_service = _make_abs_service() + db = _make_db_service() + container = _make_container() + + result = build_book_metadata(book, container, db, abs_service) + + abs_service.get_item_details.assert_not_called() + assert result['author'] == 'Ebook Author' + + def test_abs_author_overrides_cached(self): + """When ABS returns metadata, it takes priority over cached fields.""" + book = _make_book(author='Old Author') + abs_item = { + 'media': { + 'metadata': { + 'authorName': 'ABS Author', + 'narratorName': 'Narrator', + 'subtitle': '', + 'description': 'A book', + 'genres': ['Fiction'], + }, + 'duration': 3661, + } + } + abs_service = _make_abs_service(item_details=abs_item) + db = _make_db_service() + container = _make_container() + + result = build_book_metadata(book, container, db, abs_service) + + assert 
result['author'] == 'ABS Author' + assert result['duration'] == '1h 1m' + + def test_duration_minutes_only_when_under_one_hour(self): + book = _make_book(duration=1800, sync_mode='ebook_only') # 30 minutes + abs_service = _make_abs_service() + db = _make_db_service() + container = _make_container() + + result = build_book_metadata(book, container, db, abs_service) + + assert result['duration'] == '30m' + + def test_no_ebook_filename_skips_booklore(self): + book = _make_book(ebook_filename=None) + abs_service = _make_abs_service(item_details=None) + db = _make_db_service() + + bl_client = Mock() + container = _make_container() + + build_book_metadata(book, container, db, abs_service, booklore_client=bl_client) + + bl_client.find_book_by_filename.assert_not_called() diff --git a/tests/test_bookfusion_routes.py b/tests/test_bookfusion_routes.py new file mode 100644 index 0000000..7465fc1 --- /dev/null +++ b/tests/test_bookfusion_routes.py @@ -0,0 +1,639 @@ +"""Tests for BookFusion blueprint routes.""" + +from datetime import datetime +from unittest.mock import Mock + +from tests.conftest import MockContainer + +# ── Helpers ──────────────────────────────────────────────────────── + +def _make_mock_book(abs_id='test-abs-id', title='Test Book', book_id=1, status='active'): + """Return a Mock that behaves like a Book ORM instance.""" + book = Mock() + book.id = book_id + book.abs_id = abs_id + book.title = title + book.status = status + book.started_at = None + book.finished_at = None + book.sync_mode = 'audiobook' + return book + + +def _make_bf_book(bookfusion_id='bf-123', title='BF Book', authors='Author', + filename='book.epub', highlight_count=3, matched_abs_id=None, + matched_book_id=None, hidden=False, series=None, tags=None): + """Return a Mock that behaves like a BookfusionBook ORM instance.""" + bf = Mock() + bf.bookfusion_id = bookfusion_id + bf.title = title + bf.authors = authors + bf.filename = filename + bf.highlight_count = highlight_count + bf.matched_abs_id = matched_abs_id + bf.matched_book_id = matched_book_id + bf.hidden = hidden + bf.series = series + bf.tags = tags + bf.frontmatter = None + return bf + + +def _make_bf_highlight(hl_id=1, highlight_id='hl-1', bookfusion_book_id='bf-123', + book_title='BF Book', content='Some highlight text', + quote_text=None, chapter_heading=None, matched_abs_id=None, + highlighted_at=None): + """Return a Mock that behaves like a BookfusionHighlight ORM instance.""" + hl = Mock() + hl.id = hl_id + hl.highlight_id = highlight_id + hl.bookfusion_book_id = bookfusion_book_id + hl.book_title = book_title + hl.content = content + hl.quote_text = quote_text + hl.chapter_heading = chapter_heading + hl.matched_abs_id = matched_abs_id + hl.highlighted_at = highlighted_at + return hl + + +# ── Booklore Books ───────────────────────────────────────────────── + +def test_booklore_books_returns_supported_formats(client, mock_container): + mock_container.mock_booklore_client.is_configured.return_value = True + mock_container.mock_booklore_client.get_all_books.return_value = [ + {'id': 1, 'title': 'Book One', 'authors': 'Author A', 'fileName': 'book1.epub'}, + {'id': 2, 'title': 'Book Two', 'authors': 'Author B', 'fileName': 'book2.txt'}, + {'id': 3, 'title': 'Book Three', 'authors': 'Author C', 'fileName': 'book3.pdf'}, + ] + resp = client.get('/api/bookfusion/booklore-books') + assert resp.status_code == 200 + data = resp.get_json() + assert len(data) == 2 + assert data[0]['title'] == 'Book One' + assert data[1]['title'] == 'Book Three' + + +def 
test_booklore_books_with_search_query(client, mock_container): + mock_container.mock_booklore_client.is_configured.return_value = True + mock_container.mock_booklore_client.search_books.return_value = [ + {'id': 5, 'title': 'Searched', 'authors': 'A', 'fileName': 'searched.epub'}, + ] + resp = client.get('/api/bookfusion/booklore-books?q=searched') + assert resp.status_code == 200 + data = resp.get_json() + assert len(data) == 1 + mock_container.mock_booklore_client.search_books.assert_called_once_with('searched') + + +def test_booklore_books_not_configured(client, mock_container): + mock_container.mock_booklore_client.is_configured.return_value = False + resp = client.get('/api/bookfusion/booklore-books') + assert resp.status_code == 200 + assert resp.get_json() == [] + + +def test_booklore_books_exception_returns_empty(client, mock_container): + mock_container.mock_booklore_client.is_configured.return_value = True + mock_container.mock_booklore_client.get_all_books.side_effect = Exception('Booklore down') + resp = client.get('/api/bookfusion/booklore-books') + assert resp.status_code == 200 + assert resp.get_json() == [] + + +# ── Upload Book ──────────────────────────────────────────────────── + +def test_upload_book_success(client, mock_container): + mock_container.mock_bookfusion_client.upload_api_key = 'key-123' + mock_container.mock_booklore_client.is_configured.return_value = True + mock_container.mock_booklore_client.download_book.return_value = b'file-bytes' + mock_container.mock_bookfusion_client.upload_book.return_value = {'id': 'new-bf-id'} + + resp = client.post('/api/bookfusion/upload', json={ + 'book_id': 10, 'title': 'My Book', 'authors': 'Auth', 'fileName': 'my.epub', + }) + assert resp.status_code == 200 + data = resp.get_json() + assert data['success'] is True + assert data['result'] == {'id': 'new-bf-id'} + + +def test_upload_book_no_data(client, mock_container): + resp = client.post('/api/bookfusion/upload', content_type='application/json', data='') + assert resp.status_code == 400 + + +def test_upload_book_missing_book_id(client, mock_container): + resp = client.post('/api/bookfusion/upload', json={'title': 'Missing ID'}) + assert resp.status_code == 400 + assert 'book_id required' in resp.get_json()['error'] + + +def test_upload_book_no_api_key(client, mock_container): + mock_container.mock_bookfusion_client.upload_api_key = '' + resp = client.post('/api/bookfusion/upload', json={'book_id': 1}) + assert resp.status_code == 400 + assert 'API key' in resp.get_json()['error'] + + +def test_upload_book_booklore_not_configured(client, mock_container): + mock_container.mock_bookfusion_client.upload_api_key = 'key' + mock_container.mock_booklore_client.is_configured.return_value = False + resp = client.post('/api/bookfusion/upload', json={'book_id': 1}) + assert resp.status_code == 400 + assert 'Booklore not configured' in resp.get_json()['error'] + + +def test_upload_book_download_fails(client, mock_container): + mock_container.mock_bookfusion_client.upload_api_key = 'key' + mock_container.mock_booklore_client.is_configured.return_value = True + mock_container.mock_booklore_client.download_book.return_value = None + resp = client.post('/api/bookfusion/upload', json={'book_id': 1}) + assert resp.status_code == 500 + assert 'Failed to download' in resp.get_json()['error'] + + +def test_upload_book_upload_fails(client, mock_container): + mock_container.mock_bookfusion_client.upload_api_key = 'key' + mock_container.mock_booklore_client.is_configured.return_value = True + 
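    # Booklore download succeeds; the BookFusion upload itself is what fails (returns None). + 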
mock_container.mock_booklore_client.download_book.return_value = b'bytes' + mock_container.mock_bookfusion_client.upload_book.return_value = None + resp = client.post('/api/bookfusion/upload', json={ + 'book_id': 1, 'fileName': 'x.epub', + }) + assert resp.status_code == 500 + assert 'Upload to BookFusion failed' in resp.get_json()['error'] + + +# ── Sync Highlights ──────────────────────────────────────────────── + +def test_sync_highlights_success(client, mock_container): + mock_container.mock_bookfusion_client.highlights_api_key = 'hlkey' + mock_container.mock_bookfusion_client.sync_all_highlights.return_value = { + 'new_highlights': 5, 'books_saved': 2, 'new_ids': ['a', 'b'], + } + mock_container.mock_database_service.get_unmatched_bookfusion_highlights.return_value = [] + + resp = client.post('/api/bookfusion/sync-highlights') + assert resp.status_code == 200 + data = resp.get_json() + assert data['success'] is True + assert data['new_highlights'] == 5 + assert data['books_saved'] == 2 + assert data['auto_matched'] == 0 + + +def test_sync_highlights_no_api_key(client, mock_container): + mock_container.mock_bookfusion_client.highlights_api_key = '' + resp = client.post('/api/bookfusion/sync-highlights') + assert resp.status_code == 400 + assert 'API key' in resp.get_json()['error'] + + +def test_sync_highlights_full_resync_clears_cursor(client, mock_container): + mock_container.mock_bookfusion_client.highlights_api_key = 'hlkey' + mock_container.mock_bookfusion_client.sync_all_highlights.return_value = { + 'new_highlights': 0, 'books_saved': 0, 'new_ids': [], + } + mock_container.mock_database_service.get_unmatched_bookfusion_highlights.return_value = [] + + resp = client.post('/api/bookfusion/sync-highlights', json={'full_resync': True}) + assert resp.status_code == 200 + mock_container.mock_database_service.set_bookfusion_sync_cursor.assert_called_once_with(None) + + +def test_sync_highlights_exception(client, mock_container): + mock_container.mock_bookfusion_client.highlights_api_key = 'hlkey' + mock_container.mock_bookfusion_client.sync_all_highlights.side_effect = Exception('API error') + + resp = client.post('/api/bookfusion/sync-highlights') + assert resp.status_code == 500 + assert 'failed' in resp.get_json()['error'].lower() + + +# ── Get Highlights ───────────────────────────────────────────────── + +def test_get_highlights_empty(client, mock_container): + mock_container.mock_database_service.get_bookfusion_highlights.return_value = [] + mock_container.mock_database_service.get_bookfusion_sync_cursor.return_value = None + mock_container.mock_database_service.get_all_books.return_value = [] + + resp = client.get('/api/bookfusion/highlights') + assert resp.status_code == 200 + data = resp.get_json() + assert data['highlights'] == {} + assert data['has_synced'] is False + + +def test_get_highlights_with_data(client, mock_container): + hl = _make_bf_highlight( + highlighted_at=datetime(2025, 1, 15, 10, 30, 0), + quote_text='A great quote', + matched_abs_id='book-1', + ) + mock_container.mock_database_service.get_bookfusion_highlights.return_value = [hl] + mock_container.mock_database_service.get_bookfusion_sync_cursor.return_value = 'cursor-abc' + mock_container.mock_database_service.get_all_books.return_value = [ + _make_mock_book(abs_id='book-1', title='Dashboard Book'), + ] + + resp = client.get('/api/bookfusion/highlights') + assert resp.status_code == 200 + data = resp.get_json() + assert data['has_synced'] is True + assert len(data['books']) == 1 + # The highlight should be 
grouped under the book title + assert len(data['highlights']) == 1 + + +# ── Link Highlight ───────────────────────────────────────────────── + +def test_link_highlight_success(client, mock_container): + book = _make_mock_book() + mock_container.mock_database_service.get_book_by_ref.return_value = book + + resp = client.post('/api/bookfusion/link-highlight', json={ + 'bookfusion_book_id': 'bf-123', 'abs_id': 'test-abs-id', + }) + assert resp.status_code == 200 + assert resp.get_json()['success'] is True + mock_container.mock_database_service.link_bookfusion_highlights_by_book_id.assert_called_once_with( + 'bf-123', book.id, + ) + + +def test_link_highlight_unlink(client, mock_container): + resp = client.post('/api/bookfusion/link-highlight', json={ + 'bookfusion_book_id': 'bf-123', 'abs_id': '', + }) + assert resp.status_code == 200 + mock_container.mock_database_service.link_bookfusion_highlights_by_book_id.assert_called_once_with( + 'bf-123', None, + ) + + +def test_link_highlight_no_data(client, mock_container): + resp = client.post('/api/bookfusion/link-highlight', content_type='application/json', data='') + assert resp.status_code == 400 + + +def test_link_highlight_missing_bookfusion_id(client, mock_container): + resp = client.post('/api/bookfusion/link-highlight', json={'abs_id': 'x'}) + assert resp.status_code == 400 + assert 'bookfusion_book_id required' in resp.get_json()['error'] + + +# ── Save Journal ─────────────────────────────────────────────────── + +def test_save_journal_success(client, mock_container): + book = _make_mock_book() + mock_container.mock_database_service.get_book_by_ref.return_value = book + mock_container.mock_database_service.cleanup_bookfusion_import_notes.return_value = {'deleted': 0} + + resp = client.post('/api/bookfusion/save-journal', json={ + 'abs_id': 'test-abs-id', + 'highlights': [ + {'quote': 'Great quote', 'chapter': 'Ch 1', 'highlighted_at': '2025-01-15 10:30:00'}, + {'quote': 'Another quote'}, + ], + }) + assert resp.status_code == 200 + data = resp.get_json() + assert data['success'] is True + assert data['saved'] == 2 + + +def test_save_journal_no_data(client, mock_container): + resp = client.post('/api/bookfusion/save-journal', content_type='application/json', data='') + assert resp.status_code == 400 + + +def test_save_journal_missing_abs_id(client, mock_container): + resp = client.post('/api/bookfusion/save-journal', json={'highlights': []}) + assert resp.status_code == 400 + assert 'abs_id required' in resp.get_json()['error'] + + +def test_save_journal_book_not_found(client, mock_container): + mock_container.mock_database_service.get_book_by_ref.return_value = None + mock_container.mock_database_service.get_bookfusion_highlights_for_book_by_book_id.return_value = [] + + resp = client.post('/api/bookfusion/save-journal', json={'abs_id': 'nonexistent'}) + # No highlights provided and none server-side -> error + assert resp.status_code in (400, 404) + + +def test_save_journal_skips_empty_quotes(client, mock_container): + book = _make_mock_book() + mock_container.mock_database_service.get_book_by_ref.return_value = book + mock_container.mock_database_service.cleanup_bookfusion_import_notes.return_value = {} + + resp = client.post('/api/bookfusion/save-journal', json={ + 'abs_id': 'test-abs-id', + 'highlights': [ + {'quote': '', 'chapter': 'Ch 1'}, + {'quote': ' '}, + {'quote': 'Valid quote'}, + ], + }) + assert resp.status_code == 200 + assert resp.get_json()['saved'] == 1 + + +def test_save_journal_fetches_server_side_highlights(client, 
mock_container): + book = _make_mock_book() + mock_container.mock_database_service.get_book_by_ref.return_value = book + mock_container.mock_database_service.cleanup_bookfusion_import_notes.return_value = {} + + hl = _make_bf_highlight(quote_text='Server-side quote', highlighted_at=datetime(2025, 3, 1)) + mock_container.mock_database_service.get_bookfusion_highlights_for_book_by_book_id.return_value = [hl] + + resp = client.post('/api/bookfusion/save-journal', json={'abs_id': 'test-abs-id'}) + assert resp.status_code == 200 + assert resp.get_json()['saved'] == 1 + + +# ── Library ──────────────────────────────────────────────────────── + +def test_library_returns_books(client, mock_container): + bf = _make_bf_book(bookfusion_id='bf-1', title='Library Book') + mock_container.mock_database_service.get_bookfusion_books.return_value = [bf] + mock_container.mock_database_service.get_all_books.return_value = [] + + resp = client.get('/api/bookfusion/library') + assert resp.status_code == 200 + data = resp.get_json() + assert len(data['books']) == 1 + assert data['books'][0]['title'] == 'Library Book' + assert data['books'][0]['on_dashboard'] is False + + +def test_library_marks_on_dashboard(client, mock_container): + bf = _make_bf_book(bookfusion_id='bf-1', matched_abs_id='dash-1') + dashboard_book = _make_mock_book(abs_id='dash-1', title='Dashboard Book') + mock_container.mock_database_service.get_bookfusion_books.return_value = [bf] + mock_container.mock_database_service.get_all_books.return_value = [dashboard_book] + + resp = client.get('/api/bookfusion/library') + assert resp.status_code == 200 + data = resp.get_json() + assert data['books'][0]['on_dashboard'] is True + assert data['books'][0]['abs_id'] == 'dash-1' + + +def test_library_merges_duplicate_titles(client, mock_container): + bf1 = _make_bf_book(bookfusion_id='bf-1', title='Same Title', filename='book.epub', highlight_count=5) + bf2 = _make_bf_book(bookfusion_id='bf-2', title='Same Title', filename='book.mobi', highlight_count=2) + mock_container.mock_database_service.get_bookfusion_books.return_value = [bf1, bf2] + mock_container.mock_database_service.get_all_books.return_value = [] + + resp = client.get('/api/bookfusion/library') + assert resp.status_code == 200 + data = resp.get_json() + # Duplicate titles should be merged into a single entry + assert len(data['books']) == 1 + assert data['books'][0]['highlight_count'] == 7 + assert len(data['books'][0]['bookfusion_ids']) == 2 + + +def test_library_hidden_books(client, mock_container): + bf = _make_bf_book(bookfusion_id='bf-1', title='Hidden Book', hidden=True) + mock_container.mock_database_service.get_bookfusion_books.return_value = [bf] + mock_container.mock_database_service.get_all_books.return_value = [] + + resp = client.get('/api/bookfusion/library') + assert resp.status_code == 200 + data = resp.get_json() + assert data is not None + assert data['books'][0]['hidden'] is True + + +def test_library_empty(client, mock_container): + mock_container.mock_database_service.get_bookfusion_books.return_value = [] + mock_container.mock_database_service.get_all_books.return_value = [] + + resp = client.get('/api/bookfusion/library') + assert resp.status_code == 200 + data = resp.get_json() + assert data['books'] == [] + + +# ── Add to Dashboard ─────────────────────────────────────────────── + +def test_add_to_dashboard_success(client, mock_container): + bf = _make_bf_book(bookfusion_id='bf-new', title='New Book') + 
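    # Catalog lookup returns the BookFusion record; no matching dashboard book exists yet. + 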
mock_container.mock_database_service.get_bookfusion_book.return_value = bf + + saved_book = _make_mock_book(abs_id='bf-bf-new', title='New Book', book_id=42) + saved_book.started_at = None + saved_book.finished_at = None + # First call: not yet on dashboard (None), second: after save, third: _estimate_reading_dates + mock_container.mock_database_service.get_book_by_ref.side_effect = [None, saved_book, saved_book] + mock_container.mock_database_service.get_hardcover_details.return_value = None + mock_container.mock_database_service.get_bookfusion_highlight_date_range.return_value = None + mock_container.mock_hardcover_client.is_configured.return_value = False + + resp = client.post('/api/bookfusion/add-to-dashboard', json={ + 'bookfusion_ids': ['bf-new'], + }) + assert resp.status_code == 200 + data = resp.get_json() + assert data['success'] is True + assert data['abs_id'] == 'bf-bf-new' + + +def test_add_to_dashboard_already_exists(client, mock_container): + existing = _make_mock_book(abs_id='bf-bf-1', title='Already There') + mock_container.mock_database_service.get_bookfusion_book.return_value = _make_bf_book() + mock_container.mock_database_service.get_book_by_ref.return_value = existing + + resp = client.post('/api/bookfusion/add-to-dashboard', json={ + 'bookfusion_ids': ['bf-1'], + }) + assert resp.status_code == 200 + data = resp.get_json() + assert data['already_existed'] is True + + +def test_add_to_dashboard_no_data(client, mock_container): + resp = client.post('/api/bookfusion/add-to-dashboard', content_type='application/json', data='') + assert resp.status_code == 400 + + +def test_add_to_dashboard_missing_id(client, mock_container): + # Empty dict is falsy in Python, so this gets "No data provided" + resp = client.post('/api/bookfusion/add-to-dashboard', json={}) + assert resp.status_code == 400 + + # With a key but no bookfusion_ids, we get "bookfusion_id required" + resp = client.post('/api/bookfusion/add-to-dashboard', json={'foo': 'bar'}) + assert resp.status_code == 400 + assert 'bookfusion_id required' in resp.get_json()['error'] + + +def test_add_to_dashboard_book_not_in_catalog(client, mock_container): + mock_container.mock_database_service.get_bookfusion_book.return_value = None + resp = client.post('/api/bookfusion/add-to-dashboard', json={ + 'bookfusion_id': 'nonexistent', + }) + assert resp.status_code == 404 + assert 'not found' in resp.get_json()['error'].lower() + + +def test_add_to_dashboard_single_id_fallback(client, mock_container): + """When bookfusion_ids is absent, falls back to bookfusion_id.""" + bf = _make_bf_book(bookfusion_id='bf-single', title='Single') + mock_container.mock_database_service.get_bookfusion_book.return_value = bf + existing = _make_mock_book(abs_id='bf-bf-single') + mock_container.mock_database_service.get_book_by_ref.return_value = existing + + resp = client.post('/api/bookfusion/add-to-dashboard', json={ + 'bookfusion_id': 'bf-single', + }) + assert resp.status_code == 200 + + +# ── Match to Book ────────────────────────────────────────────────── + +def test_match_to_book_link(client, mock_container): + book = _make_mock_book(abs_id='dash-1', title='Dashboard Book') + book.started_at = None + book.finished_at = None + mock_container.mock_database_service.get_book_by_ref.return_value = book + mock_container.mock_database_service.get_hardcover_details.return_value = None + mock_container.mock_database_service.get_bookfusion_highlight_date_range.return_value = None + mock_container.mock_hardcover_client.is_configured.return_value = 
False + + resp = client.post('/api/bookfusion/match-to-book', json={ + 'bookfusion_ids': ['bf-1', 'bf-2'], 'abs_id': 'dash-1', + }) + assert resp.status_code == 200 + data = resp.get_json() + assert data['success'] is True + assert data['abs_id'] == 'dash-1' + assert mock_container.mock_database_service.set_bookfusion_book_match_by_book_id.call_count == 2 + + +def test_match_to_book_unlink(client, mock_container): + resp = client.post('/api/bookfusion/match-to-book', json={ + 'bookfusion_ids': ['bf-1'], + }) + assert resp.status_code == 200 + mock_container.mock_database_service.set_bookfusion_book_match_by_book_id.assert_called_once_with( + 'bf-1', None, + ) + + +def test_match_to_book_no_data(client, mock_container): + resp = client.post('/api/bookfusion/match-to-book', content_type='application/json', data='') + assert resp.status_code == 400 + + +def test_match_to_book_missing_bf_id(client, mock_container): + resp = client.post('/api/bookfusion/match-to-book', json={'abs_id': 'x'}) + assert resp.status_code == 400 + assert 'bookfusion_id required' in resp.get_json()['error'] + + +def test_match_to_book_book_not_found(client, mock_container): + mock_container.mock_database_service.get_book_by_ref.return_value = None + resp = client.post('/api/bookfusion/match-to-book', json={ + 'bookfusion_ids': ['bf-1'], 'abs_id': 'nonexistent', + }) + assert resp.status_code == 404 + assert 'not found' in resp.get_json()['error'].lower() + + +def test_match_to_book_single_id_fallback(client, mock_container): + resp = client.post('/api/bookfusion/match-to-book', json={ + 'bookfusion_id': 'bf-single', + }) + assert resp.status_code == 200 + mock_container.mock_database_service.set_bookfusion_book_match_by_book_id.assert_called_once() + + +# ── Hide / Unhide ────────────────────────────────────────────────── + +def test_hide_book_success(client, mock_container): + resp = client.post('/api/bookfusion/hide', json={ + 'bookfusion_ids': ['bf-1', 'bf-2'], 'hidden': True, + }) + assert resp.status_code == 200 + assert resp.get_json()['success'] is True + mock_container.mock_database_service.set_bookfusion_books_hidden.assert_called_once_with( + ['bf-1', 'bf-2'], True, + ) + + +def test_unhide_book(client, mock_container): + resp = client.post('/api/bookfusion/hide', json={ + 'bookfusion_ids': ['bf-1'], 'hidden': False, + }) + assert resp.status_code == 200 + mock_container.mock_database_service.set_bookfusion_books_hidden.assert_called_once_with( + ['bf-1'], False, + ) + + +def test_hide_book_no_data(client, mock_container): + resp = client.post('/api/bookfusion/hide', content_type='application/json', data='') + assert resp.status_code == 400 + + +def test_hide_book_missing_id(client, mock_container): + resp = client.post('/api/bookfusion/hide', json={'hidden': True}) + assert resp.status_code == 400 + assert 'bookfusion_id required' in resp.get_json()['error'] + + +def test_hide_book_single_id_fallback(client, mock_container): + resp = client.post('/api/bookfusion/hide', json={ + 'bookfusion_id': 'bf-single', 'hidden': True, + }) + assert resp.status_code == 200 + mock_container.mock_database_service.set_bookfusion_books_hidden.assert_called_once_with( + ['bf-single'], True, + ) + + +# ── Unlink Book ──────────────────────────────────────────────────── + +def test_unlink_book_success(client, mock_container): + book = _make_mock_book(book_id=7) + mock_container.mock_database_service.get_book_by_ref.return_value = book + + resp = client.post('/api/bookfusion/unlink', json={'abs_id': 'test-abs-id'}) + assert 
resp.status_code == 200 + assert resp.get_json()['success'] is True + mock_container.mock_database_service.unlink_bookfusion_by_book_id.assert_called_once_with(7) + + +def test_unlink_book_not_found_still_succeeds(client, mock_container): + mock_container.mock_database_service.get_book_by_ref.return_value = None + resp = client.post('/api/bookfusion/unlink', json={'abs_id': 'missing'}) + assert resp.status_code == 200 + assert resp.get_json()['success'] is True + mock_container.mock_database_service.unlink_bookfusion_by_book_id.assert_not_called() + + +def test_unlink_book_no_data(client, mock_container): + resp = client.post('/api/bookfusion/unlink', content_type='application/json', data='') + assert resp.status_code == 400 + + +def test_unlink_book_missing_abs_id(client, mock_container): + # Empty dict is falsy → "No data provided" + resp = client.post('/api/bookfusion/unlink', json={}) + assert resp.status_code == 400 + + # With a key but no abs_id → "abs_id required" + resp = client.post('/api/bookfusion/unlink', json={'foo': 'bar'}) + assert resp.status_code == 400 + assert 'abs_id required' in resp.get_json()['error'] + + +# ── BookFusion Page ──────────────────────────────────────────────── + +def test_bookfusion_page_renders(client, mock_container): + resp = client.get('/bookfusion') + assert resp.status_code == 200 diff --git a/tests/test_booklore_client.py b/tests/test_booklore_client.py index de9d81d..d0f7caf 100644 --- a/tests/test_booklore_client.py +++ b/tests/test_booklore_client.py @@ -45,7 +45,7 @@ def test_init_loads_from_db(mock_db): mock_db.get_all_booklore_books.return_value = [mock_book] - with patch.dict(os.environ, {"DATA_DIR": "/tmp/data"}): + with patch.dict(os.environ, {"BOOKLORE_SERVER": "http://mock", "BOOKLORE_USER": "u", "BOOKLORE_PASSWORD": "p", "DATA_DIR": "/tmp/data"}): client = BookloreClient(database_service=mock_db) assert "test_book.epub" in client._book_cache @@ -72,7 +72,7 @@ def test_migration_from_legacy_json(mock_db): with patch("json.load", return_value=legacy_data): with patch.object(Path, "exists", return_value=True): with patch.object(Path, "rename") as mock_rename: - with patch.dict(os.environ, {"DATA_DIR": "/tmp/data"}): + with patch.dict(os.environ, {"BOOKLORE_SERVER": "http://mock", "BOOKLORE_USER": "u", "BOOKLORE_PASSWORD": "p", "DATA_DIR": "/tmp/data"}): BookloreClient(database_service=mock_db) # Verification diff --git a/tests/test_client_poller.py b/tests/test_client_poller.py new file mode 100644 index 0000000..d559b41 --- /dev/null +++ b/tests/test_client_poller.py @@ -0,0 +1,313 @@ +"""Tests for ClientPoller — error isolation and timeout handling.""" + +import os +import sys +from pathlib import Path +from unittest.mock import Mock, call, patch + +import pytest + +sys.path.insert(0, str(Path(__file__).parent.parent)) + +from src.services.client_poller import ClientPoller + + +def _make_book(book_id=1, title='Test Book'): + book = Mock() + book.id = book_id + book.title = title + return book + + +def _make_sync_client(configured=True, states=None): + """Create a mock sync client. + + states: dict mapping book_id -> mock state (or exception). 
+ """ + client = Mock() + client.is_configured.return_value = configured + + if states is None: + states = {} + + def get_state(book, prev_state=None): + val = states.get(book.id) + if isinstance(val, Exception): + raise val + return val + + client.get_service_state.side_effect = get_state + return client + + +def _make_state_result(pct): + """Create a mock state result with current.get('pct') returning pct.""" + result = Mock() + result.current = {'pct': pct} + return result + + +class TestClientRaisesDoesNotBlockOthers: + """One client raising should not prevent other clients from being polled.""" + + @patch.dict(os.environ, { + 'STORYTELLER_POLL_MODE': 'custom', + 'STORYTELLER_POLL_SECONDS': '10', + 'BOOKLORE_POLL_MODE': 'custom', + 'BOOKLORE_POLL_SECONDS': '10', + 'BOOKLORE_2_POLL_MODE': 'global', + 'HARDCOVER_POLL_MODE': 'global', + }) + def test_failing_client_does_not_block_next_client(self): + db = Mock() + books = [_make_book(1, 'Book A')] + db.get_books_by_status.return_value = books + + storyteller = _make_sync_client(configured=True) + storyteller.get_service_state.side_effect = RuntimeError("Storyteller crashed") + + booklore_state = _make_state_result(0.5) + booklore = _make_sync_client(configured=True, states={1: booklore_state}) + + sync_clients = { + 'Storyteller': storyteller, + 'BookLore': booklore, + } + sync_manager = Mock() + + poller = ClientPoller(db, sync_manager, sync_clients) + poller._poll_cycle() + + # Storyteller failed, but BookLore should still have been polled + booklore.get_service_state.assert_called_once() + + @patch.dict(os.environ, { + 'STORYTELLER_POLL_MODE': 'custom', + 'STORYTELLER_POLL_SECONDS': '10', + 'BOOKLORE_POLL_MODE': 'global', + 'BOOKLORE_2_POLL_MODE': 'global', + 'HARDCOVER_POLL_MODE': 'global', + }) + def test_per_book_exception_does_not_block_other_books(self): + """If get_service_state raises for book A, book B should still be checked.""" + book_a = _make_book(1, 'Book A') + book_b = _make_book(2, 'Book B') + + db = Mock() + db.get_books_by_status.return_value = [book_a, book_b] + + state_b = _make_state_result(0.3) + storyteller = Mock() + storyteller.is_configured.return_value = True + + def get_state(book, prev_state=None): + if book.id == 1: + raise ValueError("corrupt state for book A") + return state_b + + storyteller.get_service_state.side_effect = get_state + + sync_clients = {'Storyteller': storyteller} + sync_manager = Mock() + + poller = ClientPoller(db, sync_manager, sync_clients) + poller._poll_cycle() + + # Both books attempted + assert storyteller.get_service_state.call_count == 2 + # Book B's position was cached + assert poller._last_known[('Storyteller', 2)] == 0.3 + + @patch.dict(os.environ, { + 'STORYTELLER_POLL_MODE': 'custom', + 'STORYTELLER_POLL_SECONDS': '10', + 'BOOKLORE_POLL_MODE': 'global', + 'BOOKLORE_2_POLL_MODE': 'global', + 'HARDCOVER_POLL_MODE': 'global', + }) + def test_db_failure_in_get_active_books_returns_early(self): + db = Mock() + db.get_books_by_status.side_effect = RuntimeError("DB is down") + + storyteller = _make_sync_client(configured=True) + sync_clients = {'Storyteller': storyteller} + sync_manager = Mock() + + poller = ClientPoller(db, sync_manager, sync_clients) + # Should not crash + poller._poll_cycle() + + # Client state was never queried because we couldn't get books + storyteller.get_service_state.assert_not_called() + + +class TestTimeoutHandling: + """Interval and poll timing tests.""" + + @patch.dict(os.environ, { + 'STORYTELLER_POLL_MODE': 'custom', + 'STORYTELLER_POLL_SECONDS': 
'60', + 'BOOKLORE_POLL_MODE': 'global', + 'BOOKLORE_2_POLL_MODE': 'global', + 'HARDCOVER_POLL_MODE': 'global', + }) + def test_client_not_polled_before_interval_elapses(self): + """A client polled recently should be skipped until its interval elapses.""" + db = Mock() + db.get_books_by_status.return_value = [] + + storyteller = _make_sync_client(configured=True) + sync_clients = {'Storyteller': storyteller} + sync_manager = Mock() + + poller = ClientPoller(db, sync_manager, sync_clients) + + # First cycle: should poll (last_poll is 0) + poller._poll_cycle() + assert db.get_books_by_status.call_count == 1 + + # Second cycle immediately after: interval not elapsed, should skip + db.reset_mock() + poller._poll_cycle() + assert db.get_books_by_status.call_count == 0 + + @patch.dict(os.environ, { + 'STORYTELLER_POLL_MODE': 'custom', + 'STORYTELLER_POLL_SECONDS': '10', + 'BOOKLORE_POLL_MODE': 'global', + 'BOOKLORE_2_POLL_MODE': 'global', + 'HARDCOVER_POLL_MODE': 'global', + }) + def test_client_polled_again_after_interval(self): + """After the interval elapses, the client should be polled again.""" + import time as time_module + + db = Mock() + db.get_books_by_status.return_value = [] + + storyteller = _make_sync_client(configured=True) + sync_clients = {'Storyteller': storyteller} + sync_manager = Mock() + + poller = ClientPoller(db, sync_manager, sync_clients) + + # First poll + poller._poll_cycle() + assert db.get_books_by_status.call_count == 1 + + # Simulate time passing beyond the interval + poller._last_poll['Storyteller'] -= 20 # subtract 20s, interval is 10s + db.reset_mock() + + poller._poll_cycle() + assert db.get_books_by_status.call_count == 1 + + @patch.dict(os.environ, { + 'STORYTELLER_POLL_MODE': 'custom', + 'STORYTELLER_POLL_SECONDS': 'not_a_number', + 'BOOKLORE_POLL_MODE': 'global', + 'BOOKLORE_2_POLL_MODE': 'global', + 'HARDCOVER_POLL_MODE': 'global', + }) + def test_invalid_poll_seconds_uses_default(self): + """Non-numeric POLL_SECONDS should fall back to the default (300s).""" + db = Mock() + db.get_books_by_status.return_value = [] + sync_clients = {'Storyteller': _make_sync_client()} + sync_manager = Mock() + + poller = ClientPoller(db, sync_manager, sync_clients) + interval = poller._get_interval('STORYTELLER') + + assert interval == 300 + + @patch.dict(os.environ, { + 'STORYTELLER_POLL_MODE': 'global', + 'BOOKLORE_POLL_MODE': 'global', + 'BOOKLORE_2_POLL_MODE': 'global', + 'HARDCOVER_POLL_MODE': 'global', + }) + def test_global_mode_clients_are_skipped(self): + """Clients in 'global' poll mode should never be individually polled.""" + db = Mock() + storyteller = _make_sync_client(configured=True) + sync_clients = {'Storyteller': storyteller} + sync_manager = Mock() + + poller = ClientPoller(db, sync_manager, sync_clients) + poller._poll_cycle() + + db.get_books_by_status.assert_not_called() + storyteller.get_service_state.assert_not_called() + + +class TestChangeDetection: + """Verify that sync is triggered only when progress actually changes.""" + + @patch.dict(os.environ, { + 'STORYTELLER_POLL_MODE': 'custom', + 'STORYTELLER_POLL_SECONDS': '10', + 'BOOKLORE_POLL_MODE': 'global', + 'BOOKLORE_2_POLL_MODE': 'global', + 'HARDCOVER_POLL_MODE': 'global', + }) + @patch('src.services.client_poller.threading') + def test_sync_triggered_on_position_change(self, mock_threading): + db = Mock() + book = _make_book(1, 'Moving Book') + db.get_books_by_status.return_value = [book] + + state = _make_state_result(0.6) + storyteller = _make_sync_client(configured=True, states={1: state}) 
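A minimal sketch of the change-detection step these tests pin down, assuming the poller caches last-seen positions in _last_known keyed by (client_name, book.id) as the assertions imply; is_own_write's exact signature and the sync_manager attribute name are guesses, not confirmed API:

    import threading

    from src.services import write_tracker  # module patched in these tests

    def check_one_book(poller, client_name, client, book):
        state = client.get_service_state(book, prev_state=None)
        if state is None:
            return
        pct = state.current.get('pct')
        key = (client_name, book.id)
        last = poller._last_known.get(key)
        poller._last_known[key] = pct
        if last is None or pct == last:
            return  # first observation, or no movement: nothing to do
        if write_tracker.is_own_write(client_name, book.id):  # signature assumed
            return  # our own write echoed back; suppress the sync
        threading.Thread(
            target=poller.sync_manager.sync_cycle,  # attribute name assumed
            kwargs={'target_book_id': book.id},
            daemon=True,
        ).start()
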
+ sync_manager = Mock() + + poller = ClientPoller(db, sync_manager, {'Storyteller': storyteller}) + + # First poll: caches position, no sync + poller._poll_cycle() + mock_threading.Thread.assert_not_called() + + # Second poll with changed position + poller._last_poll['Storyteller'] -= 20 + new_state = _make_state_result(0.8) + storyteller.get_service_state.side_effect = lambda book, prev_state=None: new_state + + with patch('src.services.write_tracker.is_own_write', return_value=False): + poller._poll_cycle() + + mock_threading.Thread.assert_called_once() + call_kwargs = mock_threading.Thread.call_args + assert call_kwargs.kwargs['kwargs'] == {'target_book_id': 1} + + @patch.dict(os.environ, { + 'STORYTELLER_POLL_MODE': 'custom', + 'STORYTELLER_POLL_SECONDS': '10', + 'BOOKLORE_POLL_MODE': 'global', + 'BOOKLORE_2_POLL_MODE': 'global', + 'HARDCOVER_POLL_MODE': 'global', + }) + @patch('src.services.client_poller.threading') + def test_own_write_suppresses_sync(self, mock_threading): + """Changes caused by our own writes should not trigger sync.""" + db = Mock() + book = _make_book(1, 'Self-updated Book') + db.get_books_by_status.return_value = [book] + + state = _make_state_result(0.5) + storyteller = _make_sync_client(configured=True, states={1: state}) + sync_manager = Mock() + + poller = ClientPoller(db, sync_manager, {'Storyteller': storyteller}) + + # Cache initial position + poller._poll_cycle() + + # Change position + poller._last_poll['Storyteller'] -= 20 + new_state = _make_state_result(0.7) + storyteller.get_service_state.side_effect = lambda book, prev_state=None: new_state + + with patch('src.services.write_tracker.is_own_write', return_value=True): + poller._poll_cycle() + + mock_threading.Thread.assert_not_called() diff --git a/tests/test_cover_proxy.py b/tests/test_cover_proxy.py index bbfd3a5..8979b96 100644 --- a/tests/test_cover_proxy.py +++ b/tests/test_cover_proxy.py @@ -1,6 +1,7 @@ """Tests for Booklore cover proxy endpoint and auth contract.""" import sys +import tempfile import unittest from pathlib import Path from unittest.mock import Mock, patch @@ -13,6 +14,13 @@ class TestBookloreCoverProxy(unittest.TestCase): """Verify _proxy_booklore_cover_for sends correct URL, auth, and headers.""" + def setUp(self): + self._tmp = tempfile.TemporaryDirectory() + self.covers_dir = Path(self._tmp.name) + + def tearDown(self): + self._tmp.cleanup() + def _make_client(self, configured=True, token='fake-jwt-token'): client = Mock() client.is_configured.return_value = configured @@ -22,9 +30,11 @@ def _make_client(self, configured=True, token='fake-jwt-token'): # ── URL and auth contract ────────────────────────────────────── + @patch('src.blueprints.covers.get_covers_dir') @patch('src.blueprints.covers.requests.get') - def test_uses_media_endpoint_path(self, mock_get): + def test_uses_media_endpoint_path(self, mock_get, mock_dir): """API path must be /api/v1/media/book/{id}/cover.""" + mock_dir.return_value = self.covers_dir mock_get.return_value = Mock(status_code=404) bl = self._make_client() @@ -33,9 +43,11 @@ def test_uses_media_endpoint_path(self, mock_get): called_url = mock_get.call_args[0][0] self.assertEqual(called_url, 'http://booklore.local/api/v1/media/book/3880/cover') + @patch('src.blueprints.covers.get_covers_dir') @patch('src.blueprints.covers.requests.get') - def test_auth_via_query_param_not_header(self, mock_get): + def test_auth_via_query_param_not_header(self, mock_get, mock_dir): """JWT must be sent as ?token= query param, not Authorization header.""" + 
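The contract under test here, sketched; bl.server is a stand-in attribute and the helper name is illustrative, only the URL path, query-param auth, pinned content type, and cache header come from the assertions:

    import requests
    from flask import Response

    def fetch_booklore_cover_sketch(bl, book_id):
        # JWT travels as a ?token= query param, never an Authorization header.
        url = f"{bl.server}/api/v1/media/book/{book_id}/cover"
        upstream = requests.get(url, params={'token': bl.get_auth_token()}, timeout=10)
        if upstream.status_code != 200:
            return None
        # Content type is pinned to image/jpeg regardless of upstream headers.
        resp = Response(upstream.content, content_type='image/jpeg')
        resp.headers['Cache-Control'] = 'public, max-age=86400, immutable'
        return resp
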
mock_dir.return_value = self.covers_dir mock_get.return_value = Mock(status_code=404) bl = self._make_client(token='my-secret-jwt') @@ -48,12 +60,12 @@ def test_auth_via_query_param_not_header(self, mock_get): # ── Response contract ────────────────────────────────────────── + @patch('src.blueprints.covers.get_covers_dir') @patch('src.blueprints.covers.requests.get') - def test_content_type_hardcoded_jpeg(self, mock_get): + def test_content_type_hardcoded_jpeg(self, mock_get, mock_dir): """Response must be image/jpeg regardless of upstream Content-Type.""" - upstream = Mock(status_code=200) - upstream.headers = {'content-type': 'application/json'} - upstream.iter_content.return_value = iter([b'\xff\xd8\xff\xe0']) + mock_dir.return_value = self.covers_dir + upstream = Mock(status_code=200, content=b'\xff\xd8\xff\xe0') mock_get.return_value = upstream bl = self._make_client() @@ -61,11 +73,12 @@ def test_content_type_hardcoded_jpeg(self, mock_get): self.assertEqual(resp.content_type, 'image/jpeg') + @patch('src.blueprints.covers.get_covers_dir') @patch('src.blueprints.covers.requests.get') - def test_cache_control_header_set(self, mock_get): + def test_cache_control_header_set(self, mock_get, mock_dir): """Successful proxy response must set long-lived cache headers.""" - upstream = Mock(status_code=200) - upstream.iter_content.return_value = iter([b'imgdata']) + mock_dir.return_value = self.covers_dir + upstream = Mock(status_code=200, content=b'imgdata') mock_get.return_value = upstream bl = self._make_client() @@ -73,12 +86,12 @@ def test_cache_control_header_set(self, mock_get): self.assertEqual(resp.headers.get('Cache-Control'), 'public, max-age=86400, immutable') + @patch('src.blueprints.covers.get_covers_dir') @patch('src.blueprints.covers.requests.get') - def test_streams_upstream_body(self, mock_get): - """Proxy must stream the upstream body content through.""" - chunks = [b'chunk1', b'chunk2'] - upstream = Mock(status_code=200) - upstream.iter_content.return_value = iter(chunks) + def test_streams_upstream_body(self, mock_get, mock_dir): + """Proxy must pass the upstream body content through.""" + mock_dir.return_value = self.covers_dir + upstream = Mock(status_code=200, content=b'chunk1chunk2') mock_get.return_value = upstream bl = self._make_client() @@ -87,31 +100,86 @@ def test_streams_upstream_body(self, mock_get): body = b''.join(resp.response) self.assertEqual(body, b'chunk1chunk2') + # ── Caching behaviour ───────────────────────────────────────── + + @patch('src.blueprints.covers.get_covers_dir') + @patch('src.blueprints.covers.requests.get') + def test_successful_fetch_writes_cache_file(self, mock_get, mock_dir): + """On upstream 200, cover bytes must be written to cache file.""" + mock_dir.return_value = self.covers_dir + upstream = Mock(status_code=200, content=b'cover-bytes') + mock_get.return_value = upstream + bl = self._make_client() + + _proxy_booklore_cover_for(bl, 42, cache_prefix="bl") + + cache_file = self.covers_dir / "bl-42.jpg" + self.assertTrue(cache_file.exists()) + self.assertEqual(cache_file.read_bytes(), b'cover-bytes') + + @patch('src.blueprints.covers.send_from_directory') + @patch('src.blueprints.covers.get_covers_dir') + def test_serves_from_cache_when_unconfigured(self, mock_dir, mock_send): + """When Booklore is not configured but cache exists, serve cached cover.""" + mock_dir.return_value = self.covers_dir + cache_file = self.covers_dir / "bl-7.jpg" + cache_file.write_bytes(b'cached-data') + mock_send.return_value = Mock(headers={}) + bl = 
self._make_client(configured=False) + + _proxy_booklore_cover_for(bl, 7) + + mock_send.assert_called_once_with(self.covers_dir, "bl-7.jpg") + # ── Error paths ──────────────────────────────────────────────── - def test_not_configured_returns_404(self): + @patch('src.blueprints.covers.get_covers_dir') + def test_not_configured_no_cache_returns_404(self, mock_dir): + mock_dir.return_value = self.covers_dir bl = self._make_client(configured=False) result = _proxy_booklore_cover_for(bl, 1) - self.assertEqual(result, ("Booklore not configured", 404)) + self.assertEqual(result, ("Cover not found", 404)) - def test_auth_failure_returns_500(self): + @patch('src.blueprints.covers.get_covers_dir') + def test_auth_failure_no_cache_returns_404(self, mock_dir): + mock_dir.return_value = self.covers_dir bl = self._make_client(token=None) result = _proxy_booklore_cover_for(bl, 1) - self.assertEqual(result, ("Booklore auth failed", 500)) + self.assertEqual(result, ("Cover not found", 404)) + @patch('src.blueprints.covers.get_covers_dir') @patch('src.blueprints.covers.requests.get') - def test_upstream_404_returns_404(self, mock_get): + def test_upstream_404_no_cache_returns_404(self, mock_get, mock_dir): + mock_dir.return_value = self.covers_dir mock_get.return_value = Mock(status_code=404) bl = self._make_client() result = _proxy_booklore_cover_for(bl, 9999) self.assertEqual(result, ("Cover not found", 404)) + @patch('src.blueprints.covers.get_covers_dir') @patch('src.blueprints.covers.requests.get') - def test_network_error_returns_500(self, mock_get): + def test_network_error_no_cache_returns_404(self, mock_get, mock_dir): + mock_dir.return_value = self.covers_dir mock_get.side_effect = ConnectionError("refused") bl = self._make_client() result = _proxy_booklore_cover_for(bl, 1) - self.assertEqual(result, ("Error loading cover", 500)) + self.assertEqual(result, ("Cover not found", 404)) + + @patch('src.blueprints.covers.send_from_directory') + @patch('src.blueprints.covers.get_covers_dir') + @patch('src.blueprints.covers.requests.get') + def test_network_error_with_cache_serves_cached(self, mock_get, mock_dir, mock_send): + """When upstream fails but cache exists, serve cached cover.""" + mock_dir.return_value = self.covers_dir + cache_file = self.covers_dir / "bl-5.jpg" + cache_file.write_bytes(b'old-cover') + mock_get.side_effect = ConnectionError("refused") + mock_send.return_value = Mock(headers={}) + bl = self._make_client() + + _proxy_booklore_cover_for(bl, 5) + + mock_send.assert_called_once_with(self.covers_dir, "bl-5.jpg") if __name__ == '__main__': diff --git a/tests/test_dashboard_errors.py b/tests/test_dashboard_errors.py new file mode 100644 index 0000000..e549108 --- /dev/null +++ b/tests/test_dashboard_errors.py @@ -0,0 +1,124 @@ +"""Tests for dashboard graceful degradation when services fail.""" + +from unittest.mock import Mock + + +def _setup_dashboard_db_defaults(mock_db): + """Configure database_service mock with defaults the dashboard route needs.""" + mock_db.get_all_books.return_value = [] + mock_db.get_setting.return_value = None + mock_db.get_states_by_book.return_value = {} + mock_db.get_all_hardcover_details.return_value = [] + mock_db.get_booklore_by_filename.return_value = {} + mock_db.get_bookfusion_linked_book_ids.return_value = set() + mock_db.get_bookfusion_highlight_counts_by_book_id.return_value = {} + mock_db.get_all_storyteller_submissions_latest.return_value = {} + mock_db.get_latest_jobs_bulk.return_value = {} + + +# ── ABS errors 
──────────────────────────────────────────────────── + +def test_index_renders_when_abs_get_audiobooks_raises(flask_app, mock_container): + """Dashboard should render 200 even when ABS get_audiobooks() throws.""" + _setup_dashboard_db_defaults(mock_container.mock_database_service) + # Replace the abs_service in app config directly (it was wired at app creation) + failing_abs = Mock() + failing_abs.get_audiobooks.side_effect = Exception("ABS down") + failing_abs.is_available.return_value = False + flask_app.config['abs_service'] = failing_abs + with flask_app.test_client() as client: + response = client.get("/") + assert response.status_code == 200 + + +def test_index_renders_when_abs_service_unavailable(flask_app, mock_container): + """Dashboard should render 200 when ABS service is not available.""" + _setup_dashboard_db_defaults(mock_container.mock_database_service) + unavailable_abs = Mock() + unavailable_abs.get_audiobooks.return_value = [] + unavailable_abs.is_available.return_value = False + flask_app.config['abs_service'] = unavailable_abs + with flask_app.test_client() as client: + response = client.get("/") + assert response.status_code == 200 + + +# ── BookFusion errors ───────────────────────────────────────────── + +def test_index_renders_when_bookfusion_linked_ids_raises(client, mock_container): + """Dashboard should render 200 when BookFusion linked IDs query fails.""" + _setup_dashboard_db_defaults(mock_container.mock_database_service) + mock_container.mock_database_service.get_bookfusion_linked_book_ids.side_effect = Exception( + "BookFusion DB error" + ) + response = client.get("/") + assert response.status_code == 200 + + +def test_index_renders_when_bookfusion_highlight_counts_raises(client, mock_container): + """Dashboard should render 200 when BookFusion highlight counts query fails.""" + _setup_dashboard_db_defaults(mock_container.mock_database_service) + mock_container.mock_database_service.get_bookfusion_highlight_counts_by_book_id.side_effect = ( + Exception("BookFusion highlights error") + ) + response = client.get("/") + assert response.status_code == 200 + + +# ── Storyteller errors ──────────────────────────────────────────── + +def test_index_renders_when_storyteller_submissions_raises(flask_app, mock_container): + """Dashboard should render 200 when Storyteller submission fetch fails.""" + _setup_dashboard_db_defaults(mock_container.mock_database_service) + # Register a storyteller sync client so integrations["storyteller"] is True + st_sync_client = Mock() + st_sync_client.is_configured.return_value = True + mock_container.sync_clients = lambda: {"storyteller": st_sync_client} + mock_container.mock_database_service.get_all_storyteller_submissions_latest.side_effect = ( + Exception("Storyteller DB error") + ) + with flask_app.test_client() as client: + response = client.get("/") + assert response.status_code == 200 + + +# ── Empty state ─────────────────────────────────────────────────── + +def test_index_renders_with_no_books(client, mock_container): + """Dashboard should render 200 with zero books (empty library).""" + _setup_dashboard_db_defaults(mock_container.mock_database_service) + response = client.get("/") + assert response.status_code == 200 + + +# ── Multiple service failures ───────────────────────────────────── + +def test_index_renders_when_all_external_services_fail(flask_app, mock_container): + """Dashboard should render 200 even when ABS, BookFusion, and Storyteller all fail.""" + 
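All of these dashboard tests reduce to one pattern: every external lookup is individually guarded and degrades to an empty default instead of failing the page. A hedged sketch; the helper and its defaults are illustrative, not the route's actual code:

    import logging

    logger = logging.getLogger(__name__)

    def safe_fetch(fetch, default, label):
        """Run one external lookup; fall back to a default instead of crashing."""
        try:
            return fetch()
        except Exception as exc:
            logger.warning("dashboard: %s unavailable: %s", label, exc)
            return default

    # Usage, roughly as the index route would:
    # audiobooks = safe_fetch(abs_service.get_audiobooks, [], "ABS")
    # bf_linked = safe_fetch(db.get_bookfusion_linked_book_ids, set(), "BookFusion")
    # st_latest = safe_fetch(db.get_all_storyteller_submissions_latest, {}, "Storyteller")
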
_setup_dashboard_db_defaults(mock_container.mock_database_service) + + # ABS failure (must patch app config directly) + failing_abs = Mock() + failing_abs.get_audiobooks.side_effect = Exception("ABS down") + failing_abs.is_available.return_value = False + flask_app.config['abs_service'] = failing_abs + + # BookFusion failure + mock_container.mock_database_service.get_bookfusion_linked_book_ids.side_effect = Exception( + "BF down" + ) + mock_container.mock_database_service.get_bookfusion_highlight_counts_by_book_id.side_effect = ( + Exception("BF down") + ) + + # Storyteller failure (need sync_clients to include storyteller) + st_sync_client = Mock() + st_sync_client.is_configured.return_value = True + mock_container.sync_clients = lambda: {"storyteller": st_sync_client} + mock_container.mock_database_service.get_all_storyteller_submissions_latest.side_effect = ( + Exception("ST down") + ) + + with flask_app.test_client() as client: + response = client.get("/") + assert response.status_code == 200 diff --git a/tests/test_database_service_integration.py b/tests/test_database_service_integration.py index b061357..4739594 100644 --- a/tests/test_database_service_integration.py +++ b/tests/test_database_service_integration.py @@ -700,6 +700,140 @@ def test_migration_partial_data(self): finally: migration_db_service.db_manager.close() + # ── T1: get_book_by_ref() resolution — all 5 branches ── + + def test_get_book_by_ref_none(self): + """get_book_by_ref(None) returns None.""" + self.assertIsNone(self.db_service.get_book_by_ref(None)) + + def test_get_book_by_ref_integer_id(self): + """get_book_by_ref with an integer returns the book by primary key.""" + book = self.Book(abs_id='ref-int-test', title='Ref Int', status='active') + saved = self.db_service.save_book(book) + result = self.db_service.get_book_by_ref(saved.id) + self.assertIsNotNone(result) + self.assertEqual(result.id, saved.id) + + def test_get_book_by_ref_abs_id_string(self): + """get_book_by_ref with a string abs_id returns the matching book.""" + book = self.Book(abs_id='abs-ref-test', title='Ref Abs', status='active') + self.db_service.save_book(book) + result = self.db_service.get_book_by_ref('abs-ref-test') + self.assertIsNotNone(result) + self.assertEqual(result.abs_id, 'abs-ref-test') + + def test_get_book_by_ref_numeric_string_fallthrough(self): + """get_book_by_ref with a numeric string falls through to book_id lookup.""" + book = self.Book(abs_id='numeric-fall', title='Numeric Fallthrough', status='active') + saved = self.db_service.save_book(book) + # Use the numeric book ID as a string — should fall through abs_id miss to int lookup + result = self.db_service.get_book_by_ref(str(saved.id)) + self.assertIsNotNone(result) + self.assertEqual(result.id, saved.id) + + def test_get_book_by_ref_nonexistent_string(self): + """get_book_by_ref with a non-numeric string that doesn't match any abs_id returns None.""" + result = self.db_service.get_book_by_ref('does-not-exist') + self.assertIsNone(result) + + # ── T2: Book without abs_id — full lifecycle ── + + def test_book_without_abs_id_lifecycle(self): + """Create a Book with no abs_id, verify auto-id, retrieve, attach states.""" + book = self.Book(title='Standalone Ebook', status='not_started', sync_mode='ebook_only') + saved = self.db_service.save_book(book, is_new=True) + self.assertIsNotNone(saved.id) + self.assertIsNone(saved.abs_id) + + retrieved = self.db_service.get_book_by_ref(saved.id) + self.assertIsNotNone(retrieved) + self.assertEqual(retrieved.title, 'Standalone 
Ebook') + + # Attach a state + state = self.State( + book_id=saved.id, + client_name='KoSync', + timestamp=100.0, + percentage=0.25, + last_updated=1000.0, + ) + self.db_service.save_state(state) + states = self.db_service.get_states_by_book() + self.assertIn(saved.id, states) + self.assertEqual(len(states[saved.id]), 1) + + # ── T3: save_book() three-way upsert ── + + def test_save_book_with_id(self): + """save_book with book.id set updates the existing record.""" + book = self.Book(abs_id='upsert-by-id', title='Original', status='active') + saved = self.db_service.save_book(book, is_new=True) + saved.title = 'Updated by ID' + self.db_service.save_book(saved) + result = self.db_service.get_book_by_ref(saved.id) + self.assertEqual(result.title, 'Updated by ID') + + def test_save_book_with_abs_id(self): + """save_book with book.abs_id set (no id) upserts by abs_id.""" + book = self.Book(abs_id='upsert-by-abs', title='Upserted', status='active') + self.db_service.save_book(book, is_new=True) + book2 = self.Book(abs_id='upsert-by-abs', title='Re-upserted', status='active') + self.db_service.save_book(book2) + all_books = [b for b in self.db_service.get_all_books() if b.abs_id == 'upsert-by-abs'] + self.assertEqual(len(all_books), 1) + self.assertEqual(all_books[0].title, 'Re-upserted') + + def test_save_book_new_no_abs_id(self): + """save_book with neither id nor abs_id creates a new book.""" + book = self.Book(title='Brand New', status='not_started') + saved = self.db_service.save_book(book, is_new=True) + self.assertIsNotNone(saved.id) + self.assertIsNone(saved.abs_id) + + # ── T4: AlignmentService with book_id ── + + def test_alignment_save_retrieve_delete_by_book_id(self): + """Save, retrieve, and delete alignment by book_id.""" + from unittest.mock import MagicMock + + from src.services.alignment_service import AlignmentService + + book = self.Book(abs_id='align-test', title='Alignment Test', status='active') + saved = self.db_service.save_book(book) + + polisher = MagicMock() + svc = AlignmentService(self.db_service, polisher) + + # Save alignment + svc._save_alignment(saved.id, [{"char": 0, "ts": 0.0}, {"char": 1000, "ts": 60.0}], source='test') + + # has_alignment + self.assertTrue(svc.has_alignment(saved.id)) + self.assertFalse(svc.has_alignment(999999)) + + # get_alignment_info + info = svc.get_alignment_info(saved.id) + self.assertIsNotNone(info) + self.assertEqual(info['num_points'], 2) + + # get_time_for_text + ts = svc.get_time_for_text(saved.id, char_offset_hint=500) + self.assertIsNotNone(ts) + self.assertAlmostEqual(ts, 30.0, delta=1.0) + + # get_char_for_time + char = svc.get_char_for_time(saved.id, timestamp=30.0) + self.assertIsNotNone(char) + self.assertAlmostEqual(char, 500, delta=10) + + # get_book_duration + dur = svc.get_book_duration(saved.id) + self.assertAlmostEqual(dur, 60.0, delta=0.1) + + # delete_alignment + svc.delete_alignment(saved.id) + self.assertFalse(svc.has_alignment(saved.id)) + class TestLegacyDatabaseMigration(unittest.TestCase): """ @@ -1042,5 +1176,60 @@ def test_fresh_db_still_initializes_correctly(self): self.assertTrue(Path(db_path).exists(), "Database file was not created") + + +class TestSuggestionSourceScoping(unittest.TestCase): + """Tests that suggestion operations are scoped by (source_id, source).""" + + def setUp(self): + self.temp_dir = tempfile.mkdtemp() + self.test_db_path = str(Path(self.temp_dir) / 'test_database.db') + from src.db.database_service import DatabaseService + from src.db.models import PendingSuggestion + 
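For reference, the composite scoping these tests exercise, sketched with SQLAlchemy; the model body and table name are illustrative, only the (source_id, source) uniqueness is taken from the tests:

    from sqlalchemy import Column, Integer, String, UniqueConstraint
    from sqlalchemy.orm import declarative_base

    Base = declarative_base()

    class PendingSuggestionSketch(Base):
        __tablename__ = 'pending_suggestions_sketch'
        id = Column(Integer, primary_key=True)
        source_id = Column(String, nullable=False)
        source = Column(String, nullable=False)
        title = Column(String)
        # One row per (source_id, source): 'id1' under 'abs' and 'id1' under
        # 'kosync' coexist instead of colliding.
        __table_args__ = (UniqueConstraint('source_id', 'source'),)

    def suggestion_exists(session, source_id, source):
        q = session.query(PendingSuggestionSketch)
        return q.filter_by(source_id=source_id, source=source).first() is not None
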
self.db_service = DatabaseService(self.test_db_path) + self.PendingSuggestion = PendingSuggestion + + def tearDown(self): + if hasattr(self, 'db_service') and hasattr(self.db_service, 'db_manager'): + self.db_service.db_manager.close() + import shutil + shutil.rmtree(self.temp_dir, ignore_errors=True) + + def test_suggestion_exists_scoped_by_source(self): + """suggestion_exists returns False when the source_id exists under a different source.""" + suggestion = self.PendingSuggestion( + source_id='id1', title='Test', source='kosync', + ) + self.db_service.save_pending_suggestion(suggestion) + self.assertTrue(self.db_service.suggestion_exists('id1', source='kosync')) + self.assertFalse(self.db_service.suggestion_exists('id1', source='abs')) + + def test_upsert_different_source_creates_two_rows(self): + """save_pending_suggestion with same source_id but different source creates distinct rows.""" + s1 = self.PendingSuggestion(source_id='id1', title='ABS Title', source='abs') + s2 = self.PendingSuggestion(source_id='id1', title='KOSync Title', source='kosync') + self.db_service.save_pending_suggestion(s1) + self.db_service.save_pending_suggestion(s2) + + abs_suggestion = self.db_service.get_suggestion('id1', source='abs') + kosync_suggestion = self.db_service.get_suggestion('id1', source='kosync') + self.assertIsNotNone(abs_suggestion) + self.assertIsNotNone(kosync_suggestion) + self.assertEqual(abs_suggestion.title, 'ABS Title') + self.assertEqual(kosync_suggestion.title, 'KOSync Title') + + def test_resolve_scoped_by_source(self): + """resolve_suggestion only deletes the row matching the given source.""" + s1 = self.PendingSuggestion(source_id='id1', title='ABS Title', source='abs') + s2 = self.PendingSuggestion(source_id='id1', title='KOSync Title', source='kosync') + self.db_service.save_pending_suggestion(s1) + self.db_service.save_pending_suggestion(s2) + + self.db_service.resolve_suggestion('id1', source='abs') + + self.assertFalse(self.db_service.suggestion_exists('id1', source='abs')) + self.assertTrue(self.db_service.suggestion_exists('id1', source='kosync')) + + if __name__ == '__main__': unittest.main(verbosity=2) diff --git a/tests/test_debounce_manager.py b/tests/test_debounce_manager.py new file mode 100644 index 0000000..da7a846 --- /dev/null +++ b/tests/test_debounce_manager.py @@ -0,0 +1,152 @@ +"""Tests for DebounceManager.""" + +import time +from unittest.mock import MagicMock, patch + +from src.utils.debounce_manager import DebounceManager + + +class TestRecordEvent: + def test_stores_entry(self): + mgr = DebounceManager(MagicMock(), MagicMock(), poll_interval=999) + mgr.record_event(42, "Test Book") + assert 42 in mgr._entries + assert mgr._entries[42]["title"] == "Test Book" + assert mgr._entries[42]["synced"] is False + + def test_updates_existing_entry(self): + mgr = DebounceManager(MagicMock(), MagicMock(), poll_interval=999) + mgr.record_event(42, "Test Book") + first_time = mgr._entries[42]["last_event"] + time.sleep(0.01) + mgr.record_event(42, "Test Book") + assert mgr._entries[42]["last_event"] > first_time + assert mgr._entries[42]["synced"] is False + + def test_re_record_resets_synced_flag(self): + mgr = DebounceManager(MagicMock(), MagicMock(), poll_interval=999) + mgr.record_event(42, "Test Book") + mgr._entries[42]["synced"] = True + mgr.record_event(42, "Test Book") + assert mgr._entries[42]["synced"] is False + + +class TestTriggerSync: + def test_skips_missing_book(self): + db = MagicMock() + db.get_book_by_id.return_value = None + manager = MagicMock() + 
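Consolidated from the poll-loop tests below, a sketch of the debounce window check they replicate inline; only the env var name comes from the tests, the 5-second default is an assumption:

    import os
    import time

    def due_entries(entries, lock):
        window = float(os.environ.get("ABS_SOCKET_DEBOUNCE_SECONDS", "5"))
        now = time.time()
        to_sync = []
        with lock:
            for book_id, info in entries.items():
                if not info["synced"] and (now - info["last_event"]) > window:
                    info["synced"] = True  # fire once per quiet period
                    to_sync.append((book_id, info["title"]))
        return to_sync
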
mgr = DebounceManager(db, manager) + mgr._trigger_sync(999, "Ghost Book") + manager.sync_cycle.assert_not_called() + + def test_calls_sync_cycle_for_found_book(self): + db = MagicMock() + book = MagicMock() + book.id = 42 + db.get_book_by_id.return_value = book + manager = MagicMock() + mgr = DebounceManager(db, manager) + + with patch("src.utils.debounce_manager.threading") as mock_threading: + mgr._trigger_sync(42, "Real Book") + mock_threading.Thread.assert_called_once() + call_kwargs = mock_threading.Thread.call_args[1] + assert call_kwargs["target"] == manager.sync_cycle + assert call_kwargs["kwargs"] == {"target_book_id": 42} + + def test_skips_when_no_manager(self): + db = MagicMock() + mgr = DebounceManager(db, None) + mgr._trigger_sync(42, "No Manager") + db.get_book_by_id.assert_not_called() + + +class TestPollLoop: + def test_triggers_sync_after_debounce_window(self): + db = MagicMock() + book = MagicMock() + book.id = 1 + db.get_book_by_id.return_value = book + manager = MagicMock() + mgr = DebounceManager(db, manager, poll_interval=999) + + # Manually add an entry that's already past debounce window + mgr._entries[1] = { + "last_event": time.time() - 100, + "title": "Overdue Book", + "synced": False, + } + + # Run one poll iteration manually (extract logic from _poll_loop) + with patch.dict("os.environ", {"ABS_SOCKET_DEBOUNCE_SECONDS": "1"}): + with patch("src.utils.debounce_manager.threading"): + now = time.time() + to_sync = [] + with mgr._lock: + for book_id, info in mgr._entries.items(): + if not info["synced"] and (now - info["last_event"]) > 1: + info["synced"] = True + to_sync.append((book_id, info["title"])) + for book_id, title in to_sync: + mgr._trigger_sync(book_id, title) + + assert mgr._entries[1]["synced"] is True + db.get_book_by_id.assert_called_once_with(1) + + def test_does_not_trigger_during_debounce_window(self): + mgr = DebounceManager(MagicMock(), MagicMock(), poll_interval=999) + # Entry just recorded — still within debounce window + mgr._entries[1] = { + "last_event": time.time(), + "title": "Fresh Book", + "synced": False, + } + + with patch.dict("os.environ", {"ABS_SOCKET_DEBOUNCE_SECONDS": "9999"}): + now = time.time() + to_sync = [] + with mgr._lock: + for book_id, info in mgr._entries.items(): + if not info["synced"] and (now - info["last_event"]) > 9999: + info["synced"] = True + to_sync.append((book_id, info["title"])) + + assert len(to_sync) == 0 + assert mgr._entries[1]["synced"] is False + + def test_prunes_stale_entries(self): + mgr = DebounceManager(MagicMock(), MagicMock(), stale_seconds=0, poll_interval=999) + mgr._entries[1] = { + "last_event": time.time() - 10, + "title": "Stale Book", + "synced": True, + } + now = time.time() + with mgr._lock: + stale = [k for k, v in mgr._entries.items() if now - v["last_event"] > 0] + for k in stale: + del mgr._entries[k] + + assert 1 not in mgr._entries + + def test_multiple_books_debounced_independently(self): + mgr = DebounceManager(MagicMock(), MagicMock(), poll_interval=999) + # Book 1: old event (ready to sync) + mgr._entries[1] = {"last_event": time.time() - 100, "title": "Book A", "synced": False} + # Book 2: fresh event (not ready) + mgr._entries[2] = {"last_event": time.time(), "title": "Book B", "synced": False} + + with patch.dict("os.environ", {"ABS_SOCKET_DEBOUNCE_SECONDS": "1"}): + now = time.time() + to_sync = [] + with mgr._lock: + for book_id, info in mgr._entries.items(): + if not info["synced"] and (now - info["last_event"]) > 1: + info["synced"] = True + to_sync.append((book_id, 
info["title"])) + + assert len(to_sync) == 1 + assert to_sync[0] == (1, "Book A") + assert mgr._entries[1]["synced"] is True + assert mgr._entries[2]["synced"] is False diff --git a/tests/test_ebook_normalization.py b/tests/test_ebook_normalization.py index 98d99ef..0bf41fa 100644 --- a/tests/test_ebook_normalization.py +++ b/tests/test_ebook_normalization.py @@ -108,8 +108,8 @@ def test_returns_none_without_ebook_filename(self): ) assert result is None - def test_falls_back_to_pct_when_text_extraction_fails(self): - """When get_text_from_current_state returns None, fall back to pct * total_len.""" + def test_returns_none_when_text_extraction_fails(self): + """When text match fails for any client, return None to force raw percentage comparison.""" full_text = "B" * 80_000 parser = MagicMock() parser.resolve_book_path.return_value = '/books/book.epub' @@ -131,10 +131,12 @@ def test_falls_back_to_pct_when_text_extraction_fails(self): } result = mgr._normalize_for_cross_format_comparison(_make_book(), config) - assert result == {'KoSync': 20_000, 'Booklore': 60_000} + # Fallback-only normalization is unreliable — returns None so sync + # manager uses raw percentage comparison instead + assert result is None - def test_falls_back_to_pct_when_find_text_location_returns_none(self): - """When text is found but can't be located in the EPUB, fall back to pct.""" + def test_returns_none_when_find_text_location_returns_none(self): + """When text is found but can't be located in the EPUB, return None.""" full_text = "C" * 50_000 parser = MagicMock() parser.resolve_book_path.return_value = '/books/book.epub' @@ -157,7 +159,7 @@ def test_falls_back_to_pct_when_find_text_location_returns_none(self): } result = mgr._normalize_for_cross_format_comparison(_make_book(), config) - assert result == {'KoSync': 20_000, 'Booklore': 30_000} + assert result is None # ── Integration: _determine_leader picks correct leader via char offsets ── diff --git a/tests/test_ebook_sentence_xpath_fallback.py b/tests/test_ebook_sentence_xpath_fallback.py index 2ee186c..24ff88d 100644 --- a/tests/test_ebook_sentence_xpath_fallback.py +++ b/tests/test_ebook_sentence_xpath_fallback.py @@ -6,28 +6,30 @@ from lxml import html -from src.utils.ebook_utils import EbookParser +from src.utils.koreader_xpath import KoReaderXPathService +from src.utils.locator_search import LocatorSearchService class TestEbookSentenceXPathFallback(unittest.TestCase): def setUp(self): - self.parser = EbookParser(books_dir=".") + self.service = KoReaderXPathService() def test_chapter_fallback_uses_sentence_text_node(self): html_content = "
<html><body><p>First sentence.</p></body></html>
" - xpath = self.parser._build_sentence_level_chapter_fallback_xpath(html_content, 7) + xpath = self.service._build_sentence_level_chapter_fallback_xpath(html_content, 7) self.assertTrue(xpath.startswith("/body/DocFragment[7]/")) self.assertTrue(xpath.endswith(".0")) self.assertIn("/text()", xpath) def test_chapter_fallback_returns_default_when_no_text(self): html_content = "
" - xpath = self.parser._build_sentence_level_chapter_fallback_xpath(html_content, 5) + xpath = self.service._build_sentence_level_chapter_fallback_xpath(html_content, 5) self.assertEqual(xpath, "/body/DocFragment[5]/body/p[1]/text().0") def test_generate_xpath_bs4_never_returns_root_or_trailing_slash(self): + locator_service = LocatorSearchService() html_content = "Single sentence only." - xpath, _, _ = self.parser._generate_xpath_bs4(html_content, 0) + xpath, _, _ = locator_service._generate_xpath_bs4(html_content, 0) self.assertEqual(xpath, "/body/p[1]") self.assertFalse(xpath.endswith("/")) @@ -36,7 +38,7 @@ def test_crengine_safe_xpath_collapses_inline_target_to_structural_anchor(self): tree = html.fromstring(html_content) span = tree.xpath("//span")[0] - xpath = self.parser._build_crengine_safe_text_xpath(span, 3, html_content) + xpath = self.service._build_crengine_safe_text_xpath(span, 3, html_content) self.assertEqual(xpath, "/body/DocFragment[3]/body/p/text().0") self.assertNotIn("/span", xpath) @@ -46,7 +48,7 @@ def test_crengine_safe_xpath_falls_back_when_anchor_has_no_direct_text(self): tree = html.fromstring(html_content) span = tree.xpath("//span")[0] - xpath = self.parser._build_crengine_safe_text_xpath(span, 8, html_content) + xpath = self.service._build_crengine_safe_text_xpath(span, 8, html_content) self.assertEqual(xpath, "/body/DocFragment[8]/body/p/text().0") self.assertNotIn("/span", xpath) diff --git a/tests/test_fix_sync_issues.py b/tests/test_fix_sync_issues.py index bb8b09f..e52e442 100644 --- a/tests/test_fix_sync_issues.py +++ b/tests/test_fix_sync_issues.py @@ -36,6 +36,7 @@ def test_smart_fallback_missing_file_db_success(self): """ abs_id = "test-book-id" book = Book(abs_id=abs_id, ebook_filename="test.epub") + book.id = 42 book.transcript_file = "/tmp/does_not_exist.json" # Mock State @@ -63,7 +64,7 @@ def test_smart_fallback_missing_file_db_success(self): result = self.client.get_text_from_current_state(book, state) # Verify - self.mock_alignment_service.get_char_for_time.assert_called_with(abs_id, 100.0) + self.mock_alignment_service.get_char_for_time.assert_called_with(42, 100.0) self.mock_ebook_parser.resolve_book_path.assert_called() self.mock_ebook_parser.extract_text_and_map.assert_called() diff --git a/tests/test_hardcover_sync_client.py b/tests/test_hardcover_sync_client.py index c308f96..40155b3 100644 --- a/tests/test_hardcover_sync_client.py +++ b/tests/test_hardcover_sync_client.py @@ -455,7 +455,7 @@ def test_automatch_uses_local_status_for_new_user_book(self, mock_record_write): # 'completed' maps to HC_READ (3) self.mock_hardcover_client.update_status.assert_called_once_with(50, HC_READ, 200) - mock_record_write.assert_called_once_with('Hardcover', 'test-hardcover-book', {'status': HC_READ}) + mock_record_write.assert_called_once_with('Hardcover', self.test_book.id, {'status': HC_READ}) @patch('src.services.hardcover_service.record_write') def test_manual_match_preserves_existing_hardcover_status(self, mock_record_write): diff --git a/tests/test_helpers.py b/tests/test_helpers.py new file mode 100644 index 0000000..95b7aee --- /dev/null +++ b/tests/test_helpers.py @@ -0,0 +1,236 @@ +"""Tests for error paths in src/blueprints/helpers.py.""" + +from unittest.mock import Mock, patch + +# ── get_kosync_id_for_ebook: Booklore download failure ──────────── + +def test_get_kosync_id_booklore_download_raises(flask_app, mock_container): + """When Booklore download_book raises, fall through to filesystem lookup.""" + bl_client = Mock() + 
bl_client.is_configured.return_value = True + bl_client.download_book.side_effect = Exception("Booklore network error") + + mock_container.mock_ebook_parser.get_kosync_id.return_value = None + + with flask_app.app_context(): + from src.blueprints.helpers import get_kosync_id_for_ebook + + result = get_kosync_id_for_ebook("book.epub", booklore_id=42, bl_client=bl_client) + + # Should return None because filesystem also doesn't have the file + assert result is None + bl_client.download_book.assert_called_once_with(42) + + +def test_get_kosync_id_booklore_download_returns_none(flask_app, mock_container): + """When Booklore download_book returns None, fall through to filesystem.""" + bl_client = Mock() + bl_client.is_configured.return_value = True + bl_client.download_book.return_value = None + + with flask_app.app_context(): + from src.blueprints.helpers import get_kosync_id_for_ebook + + result = get_kosync_id_for_ebook("book.epub", booklore_id=42, bl_client=bl_client) + + assert result is None + + +def test_get_kosync_id_abs_download_raises(flask_app, mock_container): + """When ABS on-demand download raises, should return None gracefully.""" + mock_container.mock_abs_client.is_configured.return_value = True + mock_container.mock_abs_client.get_ebook_files.side_effect = Exception("ABS timeout") + + with flask_app.app_context(): + from src.blueprints.helpers import get_kosync_id_for_ebook + + result = get_kosync_id_for_ebook("someitem_abs.epub") + + assert result is None + + +def test_get_kosync_id_cwa_download_raises(flask_app, mock_container): + """When CWA on-demand download raises, should return None gracefully.""" + mock_cwa = Mock() + mock_cwa.is_configured.return_value = True + mock_cwa.search_ebooks.side_effect = Exception("CWA error") + mock_container.cwa_client = lambda: mock_cwa + + with flask_app.app_context(): + from src.blueprints.helpers import get_kosync_id_for_ebook + + result = get_kosync_id_for_ebook("cwa_123.epub") + + assert result is None + + +# ── find_in_booklore: API raises ────────────────────────────────── + +def test_find_in_booklore_empty_filename(flask_app, mock_container): + """find_in_booklore returns (None, None) for empty filename.""" + with flask_app.app_context(): + from src.blueprints.helpers import find_in_booklore + + book, client = find_in_booklore("") + + assert book is None + assert client is None + + +def test_find_in_booklore_none_filename(flask_app, mock_container): + """find_in_booklore returns (None, None) for None filename.""" + with flask_app.app_context(): + from src.blueprints.helpers import find_in_booklore + + book, client = find_in_booklore(None) + + assert book is None + assert client is None + + +def test_find_in_booklore_not_configured(flask_app, mock_container): + """find_in_booklore returns (None, None) when Booklore is not configured.""" + mock_container.mock_booklore_client.is_configured.return_value = False + + with flask_app.app_context(): + from src.blueprints.helpers import find_in_booklore + + book, client = find_in_booklore("test.epub") + + assert book is None + assert client is None + + +def test_find_in_booklore_no_match(flask_app, mock_container): + """find_in_booklore returns (None, None) when no book matches.""" + mock_container.mock_booklore_client.is_configured.return_value = True + mock_container.mock_booklore_client.find_book_by_filename.return_value = None + + with flask_app.app_context(): + from src.blueprints.helpers import find_in_booklore + + book, client = find_in_booklore("missing.epub") + + assert book is None 
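A sketch of the guard chain find_in_booklore is being pinned to here: empty input, an unconfigured client, and a cache miss all yield (None, None):

    def find_in_booklore_sketch(filename, bl_client):
        if not filename:
            return None, None
        if not bl_client.is_configured():
            return None, None
        book = bl_client.find_book_by_filename(filename)
        if book is None:
            return None, None
        return book, bl_client
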
+ assert client is None + + +# ── serialize_suggestion with None fields ───────────────────────── + +def test_serialize_suggestion_with_none_fields(): + """serialize_suggestion handles None created_at and empty matches.""" + from src.blueprints.helpers import serialize_suggestion + + suggestion = Mock() + suggestion.id = 1 + suggestion.source_id = "abc" + suggestion.source = None + suggestion.title = None + suggestion.author = None + suggestion.cover_url = None + suggestion.matches = [] + suggestion.created_at = None + suggestion.status = "pending" + + result = serialize_suggestion(suggestion) + + assert result["id"] == 1 + assert result["source"] == "unknown" + assert result["title"] is None + assert result["created_at"] is None + assert result["matches"] == [] + assert result["top_match"] is None + assert result["hidden"] is False + + +def test_serialize_suggestion_with_bookfusion_evidence(): + """serialize_suggestion flags bookfusion evidence correctly.""" + from src.blueprints.helpers import serialize_suggestion + + suggestion = Mock() + suggestion.id = 2 + suggestion.source_id = "def" + suggestion.source = "abs" + suggestion.title = "Test Book" + suggestion.author = "Author" + suggestion.cover_url = "/cover.jpg" + suggestion.matches = [ + {"ebook_filename": "test.epub", "evidence": ["bookfusion_catalog"], "source_family": "bookfusion", "bookfusion_ids": [1]}, + ] + suggestion.created_at = None + suggestion.status = "pending" + + result = serialize_suggestion(suggestion) + + assert result["has_bookfusion_evidence"] is True + assert result["matches"][0]["has_bookfusion"] is True + assert result["top_match"] is not None + + +def test_serialize_suggestion_hidden_status(): + """serialize_suggestion correctly reports hidden=True for hidden status.""" + from src.blueprints.helpers import serialize_suggestion + + suggestion = Mock() + suggestion.id = 3 + suggestion.source_id = "ghi" + suggestion.source = "abs" + suggestion.title = "Hidden Book" + suggestion.author = None + suggestion.cover_url = None + suggestion.matches = [{"ebook_filename": "x.epub", "evidence": []}] + suggestion.created_at = None + suggestion.status = "hidden" + + result = serialize_suggestion(suggestion) + + assert result["hidden"] is True + + +# ── attempt_hardcover_automatch: exception swallowed ────────────── + +def test_attempt_hardcover_automatch_swallows_exception(flask_app, mock_container): + """attempt_hardcover_automatch logs but does not raise on failure.""" + mock_container.mock_hardcover_service.is_configured.return_value = True + mock_container.mock_hardcover_service.automatch_hardcover.side_effect = Exception("HC down") + + with flask_app.app_context(): + from src.blueprints.helpers import attempt_hardcover_automatch + + book = Mock() + # Should not raise + attempt_hardcover_automatch(mock_container, book) + + mock_container.mock_hardcover_service.automatch_hardcover.assert_called_once() + + +# ── find_booklore_metadata ──────────────────────────────────────── + +def test_find_booklore_metadata_no_match(): + """find_booklore_metadata returns None when no filename matches.""" + from src.blueprints.helpers import find_booklore_metadata + + book = Mock() + book.ebook_filename = "missing.epub" + book.original_ebook_filename = None + + result = find_booklore_metadata(book, {}) + + assert result is None + + +def test_find_booklore_metadata_matches_original(): + """find_booklore_metadata falls back to original_ebook_filename.""" + from src.blueprints.helpers import find_booklore_metadata + + book = Mock() + 
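The filename fallback under test, sketched; the dict maps filename to a list of Booklore metadata entries, per the fixture below:

    def find_booklore_metadata_sketch(book, booklore_by_filename):
        # Prefer the current filename, then fall back to the original one.
        for name in (book.ebook_filename, book.original_ebook_filename):
            if name and booklore_by_filename.get(name):
                return booklore_by_filename[name][0]
        return None
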
book.ebook_filename = "renamed.epub" + book.original_ebook_filename = "original.epub" + + meta = Mock() + meta.title = "Original Title" + booklore_by_filename = {"original.epub": [meta]} + + result = find_booklore_metadata(book, booklore_by_filename) + + assert result == meta diff --git a/tests/test_koreader_xpath.py b/tests/test_koreader_xpath.py new file mode 100644 index 0000000..b7ac41e --- /dev/null +++ b/tests/test_koreader_xpath.py @@ -0,0 +1,247 @@ +""" +Tests for KoReaderXPathService — KOReader XPath generation and resolution. + +All tests use crafted HTML and synthetic spine maps — no file I/O or mocking. +""" + +import pytest + +pytestmark = pytest.mark.docker + +from lxml import html + +from src.utils.koreader_xpath import KoReaderXPathService + + +def _make_spine_map(html_contents): + """Build a spine map from a list of HTML strings, matching extract_text_and_map format.""" + from bs4 import BeautifulSoup + + spine_map = [] + full_text_parts = [] + current_idx = 0 + + for i, content in enumerate(html_contents): + if isinstance(content, str): + content = content.encode("utf-8") + soup = BeautifulSoup(content, "html.parser") + text = soup.get_text(separator=" ", strip=True) + start = current_idx + end = current_idx + len(text) + spine_map.append( + { + "start": start, + "end": end, + "spine_index": i + 1, + "href": f"chapter{i + 1}.xhtml", + "content": content, + } + ) + full_text_parts.append(text) + current_idx = end + 1 + + full_text = " ".join(full_text_parts) + return full_text, spine_map + + +class TestGenerateXpath: + def setup_method(self): + self.service = KoReaderXPathService() + + def test_simple_paragraph_text(self): + content = "
<html><body><p>Hello world this is a test.</p></body></html>
" + full_text, spine_map = _make_spine_map([content]) + xpath = self.service.generate_xpath(full_text, spine_map, 0) + assert xpath is not None + assert xpath.startswith("/body/DocFragment[1]/") + assert "/text()" in xpath + assert xpath.endswith(".0") + + def test_inline_tags_skipped_to_structural_parent(self): + content = "
<html><body><p>Lead text <em>emphasized word</em> more text.</p></body></html>
" + full_text, spine_map = _make_spine_map([content]) + # Position inside the text + em_start = full_text.find("emphasized") + xpath = self.service.generate_xpath(full_text, spine_map, em_start) + assert xpath is not None + assert "/em" not in xpath + assert "/p" in xpath or "body" in xpath + + def test_duplicate_text_correct_occurrence(self): + content = "
<html><body><p>Hello world.</p><p>Hello world.</p><p>Unique ending.</p></body></html>
" + full_text, spine_map = _make_spine_map([content]) + # Target the second "Hello world." occurrence + first = full_text.find("Hello world.") + second = full_text.find("Hello world.", first + 1) + xpath = self.service.generate_xpath(full_text, spine_map, second) + assert xpath is not None + assert "/body/DocFragment[1]/" in xpath + + def test_position_at_start(self): + content = "
<html><body><p>First paragraph.</p><p>Second paragraph.</p></body></html>
" + full_text, spine_map = _make_spine_map([content]) + xpath = self.service.generate_xpath(full_text, spine_map, 0) + assert xpath is not None + assert "/body/DocFragment[1]/" in xpath + + def test_position_at_end(self): + content = "
<html><body><p>First paragraph.</p><p>Last paragraph here.</p></body></html>
" + full_text, spine_map = _make_spine_map([content]) + xpath = self.service.generate_xpath(full_text, spine_map, len(full_text) - 1) + assert xpath is not None + assert "/body/DocFragment[1]/" in xpath + + def test_empty_chapter_falls_back_to_sentence_level(self): + content = "
" + spine_map = [ + { + "start": 0, + "end": 0, + "spine_index": 1, + "href": "chapter1.xhtml", + "content": content.encode("utf-8"), + } + ] + xpath = self.service.generate_xpath("x", spine_map, 0) + # Should get a fallback xpath + assert xpath is not None + assert "/body/DocFragment[1]/" in xpath + + def test_multiple_spine_items_correct_docfragment(self): + ch1 = "
<html><body><p>Chapter one text here.</p></body></html>
" + ch2 = "
<html><body><p>Chapter two different content.</p></body></html>
" + full_text, spine_map = _make_spine_map([ch1, ch2]) + # Target text in chapter 2 + ch2_start = full_text.find("Chapter two") + xpath = self.service.generate_xpath(full_text, spine_map, ch2_start) + assert xpath is not None + assert "/body/DocFragment[2]/" in xpath + + def test_nested_structural_tags(self): + content = "
<html><body><div><p><span>Nested deeply text.</span></p></div></body></html>
" + full_text, spine_map = _make_spine_map([content]) + xpath = self.service.generate_xpath(full_text, spine_map, 0) + assert xpath is not None + assert "/span" not in xpath + + def test_returns_none_for_empty_inputs(self): + assert self.service.generate_xpath("", [], 0) is None + assert self.service.generate_xpath("text", [], 0) is None + assert self.service.generate_xpath("", [{"start": 0, "end": 0}], 0) is None + + +class TestGenerateSentenceLevelXpath: + def setup_method(self): + self.service = KoReaderXPathService() + + def test_valid_percentage(self): + content = "
<html><body><p>Some text in a paragraph.</p></body></html>
" + full_text, spine_map = _make_spine_map([content]) + xpath = self.service.generate_sentence_level_xpath(full_text, spine_map, 0.5) + assert xpath is not None + assert "/body/DocFragment[1]/" in xpath + assert xpath.endswith(".0") + + def test_percentage_zero(self): + content = "
<html><body><p>Beginning text.</p></body></html>
" + full_text, spine_map = _make_spine_map([content]) + xpath = self.service.generate_sentence_level_xpath(full_text, spine_map, 0.0) + assert xpath is not None + assert "/body/DocFragment[1]/" in xpath + + def test_percentage_near_end(self): + ch1 = "
<html><body><p>Chapter one.</p></body></html>
" + ch2 = "
<html><body><p>Chapter two final.</p></body></html>
" + full_text, spine_map = _make_spine_map([ch1, ch2]) + xpath = self.service.generate_sentence_level_xpath(full_text, spine_map, 0.99) + assert xpath is not None + assert "/body/DocFragment[2]/" in xpath + + def test_returns_none_for_empty_text(self): + assert self.service.generate_sentence_level_xpath("", [], 0.5) is None + + +class TestResolveXpath: + def setup_method(self): + self.service = KoReaderXPathService() + + def test_round_trip_generate_then_resolve(self): + content = "
<html><body><p>The quick brown fox jumps over the lazy dog.</p></body></html>
" + full_text, spine_map = _make_spine_map([content]) + position = full_text.find("brown fox") + xpath = self.service.generate_xpath(full_text, spine_map, position) + assert xpath is not None + + resolved = self.service.resolve_xpath(full_text, spine_map, xpath) + assert resolved is not None + assert "brown fox" in resolved or "quick" in resolved + + def test_xpath_with_id_anchor_fallback(self): + content = "
<html><body><div id="chapter3"><p>Content in identified div.</p></div></body></html>
" + full_text, spine_map = _make_spine_map([content]) + xpath = "/body/DocFragment[1]/body/div[@id='chapter3']/p/text().0" + # The @id fallback in _resolve_xpath_elements should find this + resolved = self.service.resolve_xpath(full_text, spine_map, xpath) + # May or may not resolve depending on exact xpath format; just shouldn't crash + assert resolved is None or isinstance(resolved, str) + + def test_no_matching_elements_returns_none(self): + content = "
<html><body><p>Simple text.</p></body></html>
" + full_text, spine_map = _make_spine_map([content]) + xpath = "/body/DocFragment[1]/body/section[99]/p[42]/text().0" + resolved = self.service.resolve_xpath(full_text, spine_map, xpath) + assert resolved is None + + def test_missing_docfragment_returns_none(self): + content = "
<html><body><p>Some text.</p></body></html>
" + full_text, spine_map = _make_spine_map([content]) + result = self.service.resolve_xpath(full_text, spine_map, "/body/p/text().0") + assert result is None + + def test_wrong_spine_index_returns_none(self): + content = "
<html><body><p>Only chapter.</p></body></html>
" + full_text, spine_map = _make_spine_map([content]) + result = self.service.resolve_xpath(full_text, spine_map, "/body/DocFragment[99]/body/p/text().0") + assert result is None + + +class TestHelperMethods: + def setup_method(self): + self.service = KoReaderXPathService() + + def test_build_crengine_safe_xpath_collapses_inline(self): + html_content = "

<html><body><p>Lead <span>target</span> text</p></body></html>

" + tree = html.fromstring(html_content) + span = tree.xpath("//span")[0] + xpath = self.service._build_crengine_safe_text_xpath(span, 3, html_content) + assert "/body/DocFragment[3]/" in xpath + assert "/span" not in xpath + assert "/text()" in xpath + + def test_sentence_level_fallback_with_paragraph(self): + html_content = "

<html><body><p>First real text.</p></body></html>

" + xpath = self.service._build_sentence_level_chapter_fallback_xpath(html_content, 5) + assert xpath.startswith("/body/DocFragment[5]/") + assert "/text()" in xpath + assert xpath.endswith(".0") + + def test_sentence_level_fallback_no_text_returns_default(self): + html_content = "" + xpath = self.service._build_sentence_level_chapter_fallback_xpath(html_content, 2) + assert xpath == "/body/DocFragment[2]/body/p[1]/text().0" + + def test_build_xpath_with_indexed_siblings(self): + html_content = "

<html><body><p>First</p><p>Second</p><p>Third</p></body></html>

" + tree = html.fromstring(html_content) + paragraphs = tree.xpath("//p") + # Third paragraph should get index [3] + xpath = self.service._build_xpath(paragraphs[2]) + assert "p[3]" in xpath + + def test_build_xpath_single_child_no_index(self): + html_content = "

<html><body><p>Only child</p></body></html>

" + tree = html.fromstring(html_content) + p = tree.xpath("//p")[0] + xpath = self.service._build_xpath(p) + # Single p child shouldn't need an index + assert "p[" not in xpath or "p[1]" not in xpath diff --git a/tests/test_kosync_server.py b/tests/test_kosync_server.py index 71d8dbf..8979ba1 100644 --- a/tests/test_kosync_server.py +++ b/tests/test_kosync_server.py @@ -2,6 +2,7 @@ Tests for KOSync server functionality. Verifies compatibility with kosync-dotnet behavior. """ + import os import shutil import tempfile @@ -12,10 +13,10 @@ from unittest.mock import MagicMock, Mock, patch # Set test environment -TEST_DIR = '/tmp/test_kosync' -os.environ['DATA_DIR'] = TEST_DIR -os.environ['KOSYNC_USER'] = 'testuser' -os.environ['KOSYNC_KEY'] = 'testpass' +TEST_DIR = "/tmp/test_kosync" +os.environ["DATA_DIR"] = TEST_DIR +os.environ["KOSYNC_USER"] = "testuser" +os.environ["KOSYNC_KEY"] = "testpass" # Ensure test directory exists @@ -34,7 +35,7 @@ class TestKosyncDocument(unittest.TestCase): @classmethod def setUpClass(cls): """Set up test database.""" - cls.db_path = os.path.join(TEST_DIR, 'test.db') + cls.db_path = os.path.join(TEST_DIR, "test.db") cls.db_service = DatabaseService(cls.db_path) def setUp(self): @@ -47,63 +48,54 @@ def setUp(self): def test_create_kosync_document(self): """Test creating a new KOSync document.""" doc = KosyncDocument( - document_hash='a' * 32, - progress='/body/div[1]/p[1]', + document_hash="a" * 32, + progress="/body/div[1]/p[1]", percentage=0.25, - device='TestDevice', - device_id='TEST123' + device="TestDevice", + device_id="TEST123", ) saved = self.db_service.save_kosync_document(doc) - self.assertEqual(saved.document_hash, 'a' * 32) + self.assertEqual(saved.document_hash, "a" * 32) # Handle float/decimal comparison loosely self.assertAlmostEqual(float(saved.percentage), 0.25) - self.assertEqual(saved.device, 'TestDevice') + self.assertEqual(saved.device, "TestDevice") def test_get_kosync_document(self): """Test retrieving a KOSync document.""" # Create first - doc = KosyncDocument( - document_hash='b' * 32, - percentage=0.5 - ) + doc = KosyncDocument(document_hash="b" * 32, percentage=0.5) self.db_service.save_kosync_document(doc) # Retrieve - retrieved = self.db_service.get_kosync_document('b' * 32) + retrieved = self.db_service.get_kosync_document("b" * 32) self.assertIsNotNone(retrieved) self.assertAlmostEqual(float(retrieved.percentage), 0.5) def test_get_nonexistent_document(self): """Test retrieving a document that doesn't exist.""" - retrieved = self.db_service.get_kosync_document('nonexistent' + '0' * 21) + retrieved = self.db_service.get_kosync_document("nonexistent" + "0" * 21) self.assertIsNone(retrieved) def test_update_kosync_document(self): """Test updating an existing KOSync document.""" - doc = KosyncDocument( - document_hash='c' * 32, - percentage=0.1 - ) + doc = KosyncDocument(document_hash="c" * 32, percentage=0.1) self.db_service.save_kosync_document(doc) # Update doc.percentage = 0.9 - doc.progress = '/body/div[99]' + doc.progress = "/body/div[99]" self.db_service.save_kosync_document(doc) # Verify - retrieved = self.db_service.get_kosync_document('c' * 32) + retrieved = self.db_service.get_kosync_document("c" * 32) self.assertAlmostEqual(float(retrieved.percentage), 0.9) - self.assertEqual(retrieved.progress, '/body/div[99]') + self.assertEqual(retrieved.progress, "/body/div[99]") def test_link_kosync_document(self): """Test linking a document to an ABS book.""" # Create doc - doc = KosyncDocument( - document_hash='d' * 32, - 
percentage=0.3 - ) + doc = KosyncDocument(document_hash="d" * 32, percentage=0.3) self.db_service.save_kosync_document(doc) # Create book @@ -111,39 +103,33 @@ def test_link_kosync_document(self): book = self.db_service.save_book(book) # Link - result = self.db_service.link_kosync_document('d' * 32, book.id, abs_id='book-1') + result = self.db_service.link_kosync_document("d" * 32, book.id) self.assertTrue(result) # Verify - retrieved = self.db_service.get_kosync_document('d' * 32) - self.assertEqual(retrieved.linked_abs_id, 'book-1') + retrieved = self.db_service.get_kosync_document("d" * 32) + self.assertEqual(retrieved.linked_book_id, book.id) def test_get_unlinked_documents(self): """Test retrieving unlinked documents.""" - doc = KosyncDocument( - document_hash='e' * 32, - percentage=0.4 - ) + doc = KosyncDocument(document_hash="e" * 32, percentage=0.4) self.db_service.save_kosync_document(doc) unlinked = self.db_service.get_unlinked_kosync_documents() hashes = [d.document_hash for d in unlinked] - self.assertIn('e' * 32, hashes) + self.assertIn("e" * 32, hashes) def test_delete_kosync_document(self): """Test deleting a KOSync document.""" - doc = KosyncDocument( - document_hash='f' * 32, - percentage=0.6 - ) + doc = KosyncDocument(document_hash="f" * 32, percentage=0.6) self.db_service.save_kosync_document(doc) # Delete - result = self.db_service.delete_kosync_document('f' * 32) + result = self.db_service.delete_kosync_document("f" * 32) self.assertTrue(result) # Verify gone - retrieved = self.db_service.get_kosync_document('f' * 32) + retrieved = self.db_service.get_kosync_document("f" * 32) self.assertIsNone(retrieved) @@ -159,7 +145,7 @@ def __init__(self): self.mock_sync_manager.abs_client = self.mock_abs_client self.mock_sync_manager.booklore_client = self.mock_booklore_client - self.mock_sync_manager.get_title.return_value = 'Test Book' + self.mock_sync_manager.get_title.return_value = "Test Book" self.mock_sync_manager.get_duration.return_value = 3600 def sync_manager(self): @@ -187,10 +173,10 @@ def data_dir(self): return Path(TEST_DIR) def books_dir(self): - return Path(TEST_DIR) / 'books' + return Path(TEST_DIR) / "books" def epub_cache_dir(self): - return Path(TEST_DIR) / 'epub_cache' + return Path(TEST_DIR) / "epub_cache" class TestKosyncEndpoints(unittest.TestCase): @@ -199,12 +185,13 @@ class TestKosyncEndpoints(unittest.TestCase): @classmethod def setUpClass(cls): # Setup DB one time - cls.db_path = os.path.join(TEST_DIR, 'test.db') + cls.db_path = os.path.join(TEST_DIR, "test.db") from src import web_server + web_server.database_service = DatabaseService(cls.db_path) # Use MockContainer to avoid epubcfi import chain cls.mock_container = _KosyncMockContainer() - if not hasattr(web_server, 'app'): + if not hasattr(web_server, "app"): web_server.app, _ = web_server.create_app(test_container=cls.mock_container) cls.app = web_server.app cls.client = cls.app.test_client() @@ -212,218 +199,205 @@ def setUpClass(cls): def setUp(self): # Auth headers import hashlib + self.auth_headers = { - 'x-auth-user': 'testuser', - 'x-auth-key': hashlib.md5(b'testpass').hexdigest(), - 'Content-Type': 'application/json' + "x-auth-user": "testuser", + "x-auth-key": hashlib.md5(b"testpass").hexdigest(), + "Content-Type": "application/json", } # Clear specific tables from src import web_server + with web_server.database_service.get_session() as session: - session.query(KosyncDocument).delete() + session.query(KosyncDocument).delete() # Reset rate limiter between tests - from src.api import 
kosync_server - with kosync_server._rate_limit_lock: - kosync_server._rate_limit_store.clear() + with self.app.app_context(): + rate_limiter = self.app.config.get("rate_limiter") + if rate_limiter: + rate_limiter.clear() def test_put_progress_creates_document(self): """Test that PUT creates a new document.""" # Case 1: Standard device (should return String timestamp) response = self.client.put( - '/syncs/progress', + "/syncs/progress", headers=self.auth_headers, json={ - 'document': 'g' * 32, - 'progress': '/body/test', - 'percentage': 0.33, - 'device': 'TestKobo', - 'device_id': 'KOBO123' - } + "document": "g" * 32, + "progress": "/body/test", + "percentage": 0.33, + "device": "TestKobo", + "device_id": "KOBO123", + }, ) self.assertEqual(response.status_code, 200) data = response.get_json() - self.assertEqual(data['document'], 'g' * 32) - self.assertIn('timestamp', data) + self.assertEqual(data["document"], "g" * 32) + self.assertIn("timestamp", data) # PUT response timestamp should be ISO 8601 string (kosync-dotnet behavior) - self.assertIsInstance(data['timestamp'], str) - self.assertIn('T', data['timestamp']) # ISO format contains 'T' + self.assertIsInstance(data["timestamp"], str) + self.assertIn("T", data["timestamp"]) # ISO format contains 'T' # Case 2: BookNexus device (should return Int timestamp) response_bn = self.client.put( - '/syncs/progress', + "/syncs/progress", headers=self.auth_headers, json={ - 'document': 'bn' * 16, - 'progress': '/body/test2', - 'percentage': 0.44, - 'device': 'BookNexus', - 'device_id': 'BN123' - } + "document": "bn" * 16, + "progress": "/body/test2", + "percentage": 0.44, + "device": "BookNexus", + "device_id": "BN123", + }, ) self.assertEqual(response_bn.status_code, 200) data_bn = response_bn.get_json() - self.assertIsInstance(data_bn['timestamp'], int) + self.assertIsInstance(data_bn["timestamp"], int) def test_get_progress_returns_502_for_missing(self): """Test that GET returns 502 (not 404) for missing document.""" - response = self.client.get( - '/syncs/progress/' + 'z' * 32, - headers=self.auth_headers - ) + response = self.client.get("/syncs/progress/" + "z" * 32, headers=self.auth_headers) self.assertEqual(response.status_code, 502) data = response.get_json() - self.assertIn('message', data) - self.assertIn('not found', data['message'].lower()) + self.assertIn("message", data) + self.assertIn("not found", data["message"].lower()) def test_get_progress_returns_full_data(self): """Test that GET returns all fields.""" # First PUT self.client.put( - '/syncs/progress', + "/syncs/progress", headers=self.auth_headers, json={ - 'document': 'h' * 32, - 'progress': '/body/chapter[5]', - 'percentage': 0.55, - 'device': 'TestKindle', - 'device_id': 'KINDLE456' - } + "document": "h" * 32, + "progress": "/body/chapter[5]", + "percentage": 0.55, + "device": "TestKindle", + "device_id": "KINDLE456", + }, ) # Then GET - response = self.client.get( - '/syncs/progress/' + 'h' * 32, - headers=self.auth_headers - ) + response = self.client.get("/syncs/progress/" + "h" * 32, headers=self.auth_headers) self.assertEqual(response.status_code, 200) data = response.get_json() # Verify all fields present (matching kosync-dotnet) - self.assertEqual(data['document'], 'h' * 32) - self.assertEqual(data['progress'], '/body/chapter[5]') - self.assertAlmostEqual(data['percentage'], 0.55) - self.assertEqual(data['device'], 'TestKindle') - self.assertEqual(data['device_id'], 'KINDLE456') - self.assertIn('timestamp', data) + self.assertEqual(data["document"], "h" * 32) + 
self.assertEqual(data["progress"], "/body/chapter[5]") + self.assertAlmostEqual(data["percentage"], 0.55) + self.assertEqual(data["device"], "TestKindle") + self.assertEqual(data["device_id"], "KINDLE456") + self.assertIn("timestamp", data) def test_furthest_wins_rejects_backwards(self): """Test that backwards progress is rejected when KOSYNC_FURTHEST_WINS=true.""" # First PUT at 50% self.client.put( - '/syncs/progress', + "/syncs/progress", headers=self.auth_headers, json={ - 'document': 'i' * 32, - 'percentage': 0.50, - 'progress': '/body/middle', - 'device': 'Device1', - 'device_id': 'D1' - } + "document": "i" * 32, + "percentage": 0.50, + "progress": "/body/middle", + "device": "Device1", + "device_id": "D1", + }, ) # Try to go backwards to 25% - should be REJECTED response = self.client.put( - '/syncs/progress', + "/syncs/progress", headers=self.auth_headers, json={ - 'document': 'i' * 32, - 'percentage': 0.25, - 'progress': '/body/earlier', - 'device': 'Device2', - 'device_id': 'D2' - } + "document": "i" * 32, + "percentage": 0.25, + "progress": "/body/earlier", + "device": "Device2", + "device_id": "D2", + }, ) self.assertEqual(response.status_code, 200) # Verify progress stayed at 50% (not overwritten) - get_response = self.client.get( - '/syncs/progress/' + 'i' * 32, - headers=self.auth_headers - ) + get_response = self.client.get("/syncs/progress/" + "i" * 32, headers=self.auth_headers) data = get_response.get_json() - self.assertAlmostEqual(data['percentage'], 0.50) + self.assertAlmostEqual(data["percentage"], 0.50) def test_furthest_wins_allows_equal(self): """Test that equal progress values are accepted (not rejected as backwards).""" # First PUT at 50% self.client.put( - '/syncs/progress', + "/syncs/progress", headers=self.auth_headers, json={ - 'document': 'j' * 32, - 'percentage': 0.50, - 'progress': '/body/middle', - 'device': 'Device1', - 'device_id': 'D1' - } + "document": "j" * 32, + "percentage": 0.50, + "progress": "/body/middle", + "device": "Device1", + "device_id": "D1", + }, ) # Send same percentage again - should be ACCEPTED response = self.client.put( - '/syncs/progress', + "/syncs/progress", headers=self.auth_headers, json={ - 'document': 'j' * 32, - 'percentage': 0.50, - 'progress': '/body/middle-updated', - 'device': 'Device2', - 'device_id': 'D2' - } + "document": "j" * 32, + "percentage": 0.50, + "progress": "/body/middle-updated", + "device": "Device2", + "device_id": "D2", + }, ) self.assertEqual(response.status_code, 200) # Verify progress field was updated (same percentage, different xpath) - get_response = self.client.get( - '/syncs/progress/' + 'j' * 32, - headers=self.auth_headers - ) + get_response = self.client.get("/syncs/progress/" + "j" * 32, headers=self.auth_headers) data = get_response.get_json() - self.assertEqual(data['progress'], '/body/middle-updated') - self.assertEqual(data['device'], 'Device2') + self.assertEqual(data["progress"], "/body/middle-updated") + self.assertEqual(data["device"], "Device2") def test_furthest_wins_allows_forward(self): """Test that forward progress is accepted.""" # First PUT at 25% self.client.put( - '/syncs/progress', + "/syncs/progress", headers=self.auth_headers, json={ - 'document': 'k' * 32, - 'percentage': 0.25, - 'progress': '/body/early', - 'device': 'Device1', - 'device_id': 'D1' - } + "document": "k" * 32, + "percentage": 0.25, + "progress": "/body/early", + "device": "Device1", + "device_id": "D1", + }, ) # Go forward to 75% - should be ACCEPTED response = self.client.put( - '/syncs/progress', + 
"/syncs/progress", headers=self.auth_headers, json={ - 'document': 'k' * 32, - 'percentage': 0.75, - 'progress': '/body/later', - 'device': 'Device2', - 'device_id': 'D2' - } + "document": "k" * 32, + "percentage": 0.75, + "progress": "/body/later", + "device": "Device2", + "device_id": "D2", + }, ) self.assertEqual(response.status_code, 200) # Verify progress moved forward - get_response = self.client.get( - '/syncs/progress/' + 'k' * 32, - headers=self.auth_headers - ) + get_response = self.client.get("/syncs/progress/" + "k" * 32, headers=self.auth_headers) data = get_response.get_json() - self.assertAlmostEqual(data['percentage'], 0.75) - + self.assertAlmostEqual(data["percentage"], 0.75) def test_get_progress_unknown_hash_creates_stub(self): """Test that GET for a completely unknown hash returns 502 and creates a stub for background discovery.""" @@ -431,45 +405,42 @@ def test_get_progress_unknown_hash_creates_stub(self): # Create a book with a known kosync_doc_id book = Book( - abs_id='test-sibling-book', - title='Sibling Test Book', - kosync_doc_id='a' * 32, - ebook_filename='sibling_test.epub', - status='active', - sync_mode='ebook_only' + abs_id="test-sibling-book", + title="Sibling Test Book", + kosync_doc_id="a" * 32, + ebook_filename="sibling_test.epub", + status="active", + sync_mode="ebook_only", ) book = web_server.database_service.save_book(book) # Create a KosyncDocument for hash_A linked to the book, with progress self.client.put( - '/syncs/progress', + "/syncs/progress", headers=self.auth_headers, json={ - 'document': 'a' * 32, - 'progress': '/body/chapter[3]', - 'percentage': 0.45, - 'device': 'Device1', - 'device_id': 'D1' - } + "document": "a" * 32, + "progress": "/body/chapter[3]", + "percentage": 0.45, + "device": "Device1", + "device_id": "D1", + }, ) # Link it to the book - web_server.database_service.link_kosync_document('a' * 32, book.id, abs_id='test-sibling-book') + web_server.database_service.link_kosync_document("a" * 32, book.id) # Now GET with an unknown hash_B — should resolve via the book's sibling docs # First, we need hash_B to be findable. The sibling resolution requires # the unknown hash to have a filename in common. Since hash_B is brand new # with no filename, it will fall through to Step 4 (background discovery). # So this tests that the 502 + stub creation path works. 
- response = self.client.get( - '/syncs/progress/' + 'b' * 32, - headers=self.auth_headers - ) + response = self.client.get("/syncs/progress/" + "b" * 32, headers=self.auth_headers) # Unknown hash with no filename link returns 502 self.assertEqual(response.status_code, 502) # Clean up with web_server.database_service.get_session() as session: - session.query(Book).filter(Book.abs_id == 'test-sibling-book').delete() + session.query(Book).filter(Book.abs_id == "test-sibling-book").delete() def test_get_progress_resolves_via_book_kosync_id(self): """Test that GET resolves via book.kosync_doc_id fallback (Step 2) and returns sibling progress.""" @@ -477,46 +448,43 @@ def test_get_progress_resolves_via_book_kosync_id(self): # Create a book whose kosync_doc_id matches the GET hash book = Book( - abs_id='test-step2-book', - title='Step2 Test Book', - kosync_doc_id='s' * 32, - ebook_filename='step2_test.epub', - status='active', - sync_mode='ebook_only' + abs_id="test-step2-book", + title="Step2 Test Book", + kosync_doc_id="s" * 32, + ebook_filename="step2_test.epub", + status="active", + sync_mode="ebook_only", ) web_server.database_service.save_book(book) # Retrieve saved book to get its DB-assigned id - saved_book = web_server.database_service.get_book_by_abs_id('test-step2-book') + saved_book = web_server.database_service.get_book_by_abs_id("test-step2-book") # Create a sibling KosyncDocument linked to the same book with progress sibling_doc = KosyncDocument( - document_hash='t' * 32, - progress='/body/chapter[7]', + document_hash="t" * 32, + progress="/body/chapter[7]", percentage=0.60, - device='Sibling', - device_id='S1', + device="Sibling", + device_id="S1", timestamp=datetime.utcnow(), - linked_abs_id='test-step2-book', + linked_abs_id="test-step2-book", linked_book_id=saved_book.id, ) web_server.database_service.save_kosync_document(sibling_doc) # GET with the book's kosync_doc_id (not in kosync_documents itself) - response = self.client.get( - '/syncs/progress/' + 's' * 32, - headers=self.auth_headers - ) + response = self.client.get("/syncs/progress/" + "s" * 32, headers=self.auth_headers) self.assertEqual(response.status_code, 200) data = response.get_json() # Should return sibling's progress since it's linked to the same book - self.assertAlmostEqual(data['percentage'], 0.60) - self.assertEqual(data['document'], 's' * 32) + self.assertAlmostEqual(data["percentage"], 0.60) + self.assertEqual(data["document"], "s" * 32) # Clean up with web_server.database_service.get_session() as session: - session.query(Book).filter(Book.abs_id == 'test-step2-book').delete() + session.query(Book).filter(Book.abs_id == "test-step2-book").delete() def test_get_progress_sibling_via_filename(self): """Test that GET resolves an unknown hash when a sibling with the same filename is linked to a book.""" @@ -524,162 +492,150 @@ def test_get_progress_sibling_via_filename(self): # Create a book book = Book( - abs_id='test-filename-book', - title='Filename Test Book', - kosync_doc_id='f' * 32, - ebook_filename='shared_name.epub', - status='active', - sync_mode='ebook_only' + abs_id="test-filename-book", + title="Filename Test Book", + kosync_doc_id="f" * 32, + ebook_filename="shared_name.epub", + status="active", + sync_mode="ebook_only", ) web_server.database_service.save_book(book) - saved_book = web_server.database_service.get_book_by_abs_id('test-filename-book') + saved_book = web_server.database_service.get_book_by_abs_id("test-filename-book") # Create a KosyncDocument for hash_A linked to the book, with a 
filename and progress doc_a = KosyncDocument( - document_hash='f' * 32, - progress='/body/chapter[5]', + document_hash="f" * 32, + progress="/body/chapter[5]", percentage=0.50, - device='DeviceA', - device_id='DA', + device="DeviceA", + device_id="DA", timestamp=datetime.utcnow(), - filename='shared_name.epub', - linked_abs_id='test-filename-book', + filename="shared_name.epub", + linked_abs_id="test-filename-book", linked_book_id=saved_book.id, ) web_server.database_service.save_kosync_document(doc_a) # Create a KosyncDocument for hash_B with the SAME filename but NOT linked - doc_b = KosyncDocument( - document_hash='e' * 32, - filename='shared_name.epub' - ) + doc_b = KosyncDocument(document_hash="e" * 32, filename="shared_name.epub") web_server.database_service.save_kosync_document(doc_b) # GET with hash_B — should resolve via filename sibling to the book - response = self.client.get( - '/syncs/progress/' + 'e' * 32, - headers=self.auth_headers - ) + response = self.client.get("/syncs/progress/" + "e" * 32, headers=self.auth_headers) self.assertEqual(response.status_code, 200) data = response.get_json() - self.assertAlmostEqual(data['percentage'], 0.50) - self.assertEqual(data['document'], 'e' * 32) + self.assertAlmostEqual(data["percentage"], 0.50) + self.assertEqual(data["document"], "e" * 32) # Clean up with web_server.database_service.get_session() as session: - session.query(Book).filter(Book.abs_id == 'test-filename-book').delete() - + session.query(Book).filter(Book.abs_id == "test-filename-book").delete() # ---------------- Security Tests ---------------- def test_auth_rejects_raw_password(self): """Raw password (not MD5 hash) should be rejected.""" bad_headers = { - 'x-auth-user': 'testuser', - 'x-auth-key': 'testpass', # raw, not hashed - 'Content-Type': 'application/json' + "x-auth-user": "testuser", + "x-auth-key": "testpass", # raw, not hashed + "Content-Type": "application/json", } - response = self.client.get('/users/auth', headers=bad_headers) + response = self.client.get("/users/auth", headers=bad_headers) self.assertEqual(response.status_code, 401) def test_auth_accepts_md5_hash(self): """MD5-hashed password should be accepted.""" - response = self.client.get('/users/auth', headers=self.auth_headers) + response = self.client.get("/users/auth", headers=self.auth_headers) self.assertEqual(response.status_code, 200) data = response.get_json() - self.assertEqual(data['username'], 'testuser') + self.assertEqual(data["username"], "testuser") def test_auth_rejects_wrong_user(self): """Wrong username should be rejected even with correct key.""" import hashlib + bad_headers = { - 'x-auth-user': 'wronguser', - 'x-auth-key': hashlib.md5(b'testpass').hexdigest(), - 'Content-Type': 'application/json' + "x-auth-user": "wronguser", + "x-auth-key": hashlib.md5(b"testpass").hexdigest(), + "Content-Type": "application/json", } - response = self.client.get('/users/auth', headers=bad_headers) + response = self.client.get("/users/auth", headers=bad_headers) self.assertEqual(response.status_code, 401) def test_auth_rejects_missing_headers(self): """Missing auth headers should return 401.""" - response = self.client.get('/users/auth') + response = self.client.get("/users/auth") self.assertEqual(response.status_code, 401) def test_login_does_not_leak_token(self): """Login response must not contain token, key, or password.""" - response = self.client.post('/users/login') + response = self.client.post("/users/login") self.assertEqual(response.status_code, 200) data = response.get_json() - 
self.assertNotIn('token', data) - self.assertNotIn('key', data) - self.assertNotIn('password', data) + self.assertNotIn("token", data) + self.assertNotIn("key", data) + self.assertNotIn("password", data) def test_put_validates_percentage_range(self): """Percentage > 1.0 should return 400.""" response = self.client.put( - '/syncs/progress', + "/syncs/progress", headers=self.auth_headers, json={ - 'document': 'v' * 32, - 'percentage': 1.5, - 'progress': '/body/test', - 'device': 'Test', - 'device_id': 'T1' - } + "document": "v" * 32, + "percentage": 1.5, + "progress": "/body/test", + "device": "Test", + "device_id": "T1", + }, ) self.assertEqual(response.status_code, 400) def test_put_validates_percentage_type(self): """Non-numeric percentage should return 400.""" response = self.client.put( - '/syncs/progress', + "/syncs/progress", headers=self.auth_headers, json={ - 'document': 'w' * 32, - 'percentage': 'not-a-number', - 'progress': '/body/test', - 'device': 'Test', - 'device_id': 'T1' - } + "document": "w" * 32, + "percentage": "not-a-number", + "progress": "/body/test", + "device": "Test", + "device_id": "T1", + }, ) self.assertEqual(response.status_code, 400) def test_put_validates_missing_document(self): """PUT with missing document hash should return 400.""" response = self.client.put( - '/syncs/progress', + "/syncs/progress", headers=self.auth_headers, - json={ - 'percentage': 0.5, - 'progress': '/body/test', - 'device': 'Test', - 'device_id': 'T1' - } + json={"percentage": 0.5, "progress": "/body/test", "device": "Test", "device_id": "T1"}, ) self.assertEqual(response.status_code, 400) def test_get_validates_doc_id_length(self): """Oversized doc ID should return 400.""" - long_id = 'x' * 100 - response = self.client.get( - f'/syncs/progress/{long_id}', - headers=self.auth_headers - ) + long_id = "x" * 100 + response = self.client.get(f"/syncs/progress/{long_id}", headers=self.auth_headers) self.assertEqual(response.status_code, 400) def test_rate_limiting_triggers(self): """Rapid auth requests should eventually return 429.""" - from src.api import kosync_server - with kosync_server._rate_limit_lock: - kosync_server._rate_limit_store.clear() + with self.app.app_context(): + rate_limiter = self.app.config.get("rate_limiter") + if rate_limiter: + rate_limiter.clear() got_429 = False for _ in range(20): - response = self.client.get('/users/auth', headers={ - 'x-auth-user': 'testuser', - 'x-auth-key': 'wrongkey' - }, environ_base={'REMOTE_ADDR': '203.0.113.10'}) + response = self.client.get( + "/users/auth", + headers={"x-auth-user": "testuser", "x-auth-key": "wrongkey"}, + environ_base={"REMOTE_ADDR": "203.0.113.10"}, + ) if response.status_code == 429: got_429 = True break @@ -690,9 +646,20 @@ def test_rate_limiting_triggers(self): class TestCleanupCacheTraversal(unittest.TestCase): """Ensure _cleanup_cache_for_hash rejects traversal-style filenames.""" + @classmethod + def setUpClass(cls): + cls.db_path = os.path.join(TEST_DIR, "test.db") + from src import web_server + + web_server.database_service = DatabaseService(cls.db_path) + cls.mock_container = _KosyncMockContainer() + if not hasattr(web_server, "app"): + web_server.app, _ = web_server.create_app(test_container=cls.mock_container) + cls.app = web_server.app + def test_traversal_filename_blocked(self): """A filename containing '../' must be rejected, not deleted.""" - from src.api import kosync_server + from src.api import kosync_admin mock_doc = Mock() mock_doc.filename = "../evil.txt" @@ -703,23 +670,17 @@ def 
test_traversal_filename_blocked(self): mock_container = _KosyncMockContainer() - orig_db = kosync_server._database_service - orig_container = kosync_server._container - try: - kosync_server._database_service = mock_db - kosync_server._container = mock_container + with self.app.app_context(): + self.app.config["database_service"] = mock_db + self.app.config["container"] = mock_container - with patch.object(kosync_server.logger, 'warning') as mock_warn, \ - patch('os.remove') as mock_remove: - kosync_server._cleanup_cache_for_hash("fakehash") + with patch.object(kosync_admin.logger, "warning") as mock_warn, patch("os.remove") as mock_remove: + kosync_admin._cleanup_cache_for_hash("fakehash") mock_warn.assert_called_once() self.assertIn("Blocked cache deletion", mock_warn.call_args[0][0]) mock_remove.assert_not_called() - finally: - kosync_server._database_service = orig_db - kosync_server._container = orig_container -if __name__ == '__main__': +if __name__ == "__main__": unittest.main() diff --git a/tests/test_kosync_service.py b/tests/test_kosync_service.py new file mode 100644 index 0000000..0496576 --- /dev/null +++ b/tests/test_kosync_service.py @@ -0,0 +1,314 @@ +""" +Tests for KosyncService business logic. + +Tests handle_put_progress, handle_get_progress, resolve_best_progress, +serialize_progress, resolve_book_by_sibling_hash, and register_hash_for_book. +All tests use mocked database_service and container — no Flask app needed. +""" + +import os +from datetime import datetime +from unittest.mock import MagicMock, Mock, patch + +import pytest + +from src.services.kosync_service import KosyncService + + +def _make_service(db=None, container=None, manager=None): + db = db or MagicMock() + container = container or MagicMock() + return KosyncService(db, container, manager) + + +def _make_doc( + doc_hash="a" * 32, + percentage=0.5, + device="TestDevice", + device_id="DEV1", + linked_abs_id=None, + linked_book_id=None, + filename=None, + timestamp=None, +): + doc = MagicMock() + doc.document_hash = doc_hash + doc.percentage = percentage + doc.device = device + doc.device_id = device_id + doc.linked_abs_id = linked_abs_id + doc.linked_book_id = linked_book_id + doc.filename = filename + doc.timestamp = timestamp or datetime(2025, 1, 1, 12, 0, 0) + doc.progress = "/body/p[1]" + return doc + + +def _make_book( + book_id=1, + abs_id="book-1", + title="Test Book", + status="active", + kosync_doc_id=None, + ebook_filename=None, + activity_flag=False, +): + book = MagicMock() + book.id = book_id + book.abs_id = abs_id + book.title = title + book.status = status + book.kosync_doc_id = kosync_doc_id + book.ebook_filename = ebook_filename + book.activity_flag = activity_flag + return book + + +class TestSerializeProgress: + def test_full_data(self): + doc = _make_doc() + result = KosyncService.serialize_progress(doc) + assert result["device"] == "TestDevice" + assert result["percentage"] == 0.5 + assert result["document"] == "a" * 32 + + def test_defaults_for_missing_fields(self): + doc = _make_doc(device=None, device_id=None, percentage=None) + result = KosyncService.serialize_progress(doc, device_default="fallback") + assert result["device"] == "fallback" + assert result["device_id"] == "fallback" + assert result["percentage"] == 0 + + +class TestResolveBookBySiblingHash: + def test_filename_match_finds_linked_book(self): + db = MagicMock() + sibling = _make_doc(doc_hash="b" * 32, linked_abs_id="book-1", filename="test.epub") + db.get_kosync_document.return_value = _make_doc(filename="test.epub") 
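+        # Assumed lookup chain: the unknown hash's own document carries a
+        # filename, get_kosync_doc_by_filename yields a linked sibling, and
+        # get_book_by_abs_id resolves that sibling's book. Every hop is a
+        # mock, so only the wiring of resolve_book_by_sibling_hash is tested.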
+ db.get_kosync_doc_by_filename.return_value = sibling + book = _make_book() + db.get_book_by_abs_id.return_value = book + + svc = _make_service(db=db) + result = svc.resolve_book_by_sibling_hash("c" * 32) + assert result == book + + def test_no_match_returns_none(self): + db = MagicMock() + db.get_kosync_document.return_value = _make_doc(filename=None) + svc = _make_service(db=db) + result = svc.resolve_book_by_sibling_hash("d" * 32) + assert result is None + + +class TestRegisterHashForBook: + def test_new_hash_creates_and_links(self): + db = MagicMock() + db.get_kosync_document.return_value = None + svc = _make_service(db=db) + book = _make_book() + svc.register_hash_for_book("e" * 32, book) + db.save_kosync_document.assert_called_once() + + def test_existing_unlinked_gets_linked(self): + db = MagicMock() + existing = _make_doc(linked_book_id=None) + db.get_kosync_document.return_value = existing + svc = _make_service(db=db) + book = _make_book() + svc.register_hash_for_book("f" * 32, book) + db.link_kosync_document.assert_called_once() + + +class TestHandlePutProgress: + def test_creates_new_document(self): + db = MagicMock() + db.get_kosync_document.return_value = None + db.get_book_by_kosync_id.return_value = None + svc = _make_service(db=db) + + result, status = svc.handle_put_progress( + {"document": "a" * 32, "percentage": 0.5, "progress": "/body/p[1]", "device": "Kobo"}, "127.0.0.1" + ) + assert status == 200 + assert result["document"] == "a" * 32 + db.save_kosync_document.assert_called_once() + + def test_updates_existing_document(self): + db = MagicMock() + existing = _make_doc(percentage=0.3) + db.get_kosync_document.return_value = existing + db.get_book_by_abs_id.return_value = None + svc = _make_service(db=db) + + with patch.dict(os.environ, {"KOSYNC_FURTHEST_WINS": "false"}): + result, status = svc.handle_put_progress( + {"document": "a" * 32, "percentage": 0.7, "device": "Kobo", "device_id": "DEV2"}, "127.0.0.1" + ) + assert status == 200 + db.save_kosync_document.assert_called() + + def test_furthest_wins_rejects_backward(self): + db = MagicMock() + existing = _make_doc(percentage=0.8, device_id="DEV-A") + db.get_kosync_document.return_value = existing + svc = _make_service(db=db) + + with patch.dict(os.environ, {"KOSYNC_FURTHEST_WINS": "true"}): + result, status = svc.handle_put_progress( + {"document": "a" * 32, "percentage": 0.2, "device": "Other", "device_id": "DEV-B"}, "127.0.0.1" + ) + assert status == 200 + assert result["document"] == "a" * 32 + # Should NOT have saved (rejected) + db.save_kosync_document.assert_not_called() + + def test_furthest_wins_allows_same_device(self): + db = MagicMock() + existing = _make_doc(percentage=0.8, device_id="DEV-A") + db.get_kosync_document.return_value = existing + db.get_book_by_abs_id.return_value = None + svc = _make_service(db=db) + + with patch.dict(os.environ, {"KOSYNC_FURTHEST_WINS": "true"}): + result, status = svc.handle_put_progress( + {"document": "a" * 32, "percentage": 0.2, "device": "Kobo", "device_id": "DEV-A"}, "127.0.0.1" + ) + assert status == 200 + db.save_kosync_document.assert_called() + + def test_furthest_wins_allows_force(self): + db = MagicMock() + existing = _make_doc(percentage=0.8, device_id="DEV-A") + db.get_kosync_document.return_value = existing + db.get_book_by_abs_id.return_value = None + svc = _make_service(db=db) + + with patch.dict(os.environ, {"KOSYNC_FURTHEST_WINS": "true"}): + result, status = svc.handle_put_progress( + {"document": "a" * 32, "percentage": 0.1, "device": "sync-bot", 
"device_id": "BOT", "force": True}, + "127.0.0.1", + ) + assert status == 200 + db.save_kosync_document.assert_called() + + def test_links_to_existing_book(self): + db = MagicMock() + db.get_kosync_document.return_value = _make_doc(linked_abs_id=None) + book = _make_book() + db.get_book_by_kosync_id.return_value = book + svc = _make_service(db=db) + + with patch.dict(os.environ, {"KOSYNC_FURTHEST_WINS": "false"}): + result, status = svc.handle_put_progress( + {"document": "a" * 32, "percentage": 0.5, "device": "Kobo", "device_id": "DEV1"}, "127.0.0.1" + ) + assert status == 200 + db.link_kosync_document.assert_called_once() + + def test_sets_activity_flag_on_paused_book(self): + db = MagicMock() + db.get_kosync_document.return_value = _make_doc(linked_abs_id="book-1") + book = _make_book(status="paused", activity_flag=False) + db.get_book_by_abs_id.return_value = book + svc = _make_service(db=db) + + with patch.dict(os.environ, {"KOSYNC_FURTHEST_WINS": "false"}): + result, status = svc.handle_put_progress( + {"document": "a" * 32, "percentage": 0.6, "device": "Kobo", "device_id": "DEV1"}, "127.0.0.1" + ) + assert status == 200 + assert book.activity_flag is True + # save_book should be called for the activity flag update + assert db.save_book.called + + def test_validation_errors(self): + svc = _make_service() + assert svc.handle_put_progress(None, "127.0.0.1")[1] == 400 + assert svc.handle_put_progress({}, "127.0.0.1")[1] == 400 + assert svc.handle_put_progress({"document": "a" * 32, "percentage": 2.0}, "127.0.0.1")[1] == 400 + assert svc.handle_put_progress({"document": "x" * 100}, "127.0.0.1")[1] == 400 + + +class TestHandleGetProgress: + def test_direct_hash_match(self): + db = MagicMock() + doc = _make_doc(linked_abs_id="book-1") + db.get_kosync_document.return_value = doc + book = _make_book() + db.get_book_by_abs_id.return_value = book + db.get_kosync_documents_for_book_by_book_id.return_value = [doc] + db.get_states_for_book.return_value = [] + svc = _make_service(db=db) + + result, status = svc.handle_get_progress("a" * 32, "127.0.0.1") + assert status == 200 + assert result["percentage"] == 0.5 + + def test_lookup_via_book_kosync_id(self): + db = MagicMock() + db.get_kosync_document.return_value = None + book = _make_book() + db.get_book_by_kosync_id.return_value = book + db.get_kosync_documents_for_book_by_book_id.return_value = [_make_doc(percentage=0.7)] + db.get_states_for_book.return_value = [] + svc = _make_service(db=db) + + result, status = svc.handle_get_progress("a" * 32, "127.0.0.1") + assert status == 200 + + def test_unknown_hash_returns_502(self): + db = MagicMock() + db.get_kosync_document.return_value = None + db.get_book_by_kosync_id.return_value = None + svc = _make_service(db=db) + svc.resolve_book_by_sibling_hash = MagicMock(return_value=None) + svc.start_discovery_if_available = MagicMock(return_value=False) + + result, status = svc.handle_get_progress("unknown" + "0" * 25, "127.0.0.1") + assert status == 502 + + def test_doc_id_too_long(self): + svc = _make_service() + result, status = svc.handle_get_progress("x" * 100, "127.0.0.1") + assert status == 400 + + +class TestResolveBestProgress: + def test_picks_highest_percentage_sibling(self): + db = MagicMock() + doc_a = _make_doc(doc_hash="a" * 32, percentage=0.3) + doc_b = _make_doc(doc_hash="b" * 32, percentage=0.8) + db.get_kosync_documents_for_book_by_book_id.return_value = [doc_a, doc_b] + db.get_states_for_book.return_value = [] + svc = _make_service(db=db) + + result, status = 
svc.resolve_best_progress("c" * 32, _make_book()) + assert status == 200 + assert result["percentage"] == 0.8 + + def test_falls_back_to_states(self): + db = MagicMock() + db.get_kosync_documents_for_book_by_book_id.return_value = [] + state = MagicMock() + state.client_name = "KoSync" + state.percentage = 0.4 + state.xpath = "/body/p[2]" + state.cfi = None + state.last_updated = 1700000000 + db.get_states_for_book.return_value = [state] + svc = _make_service(db=db) + + result, status = svc.resolve_best_progress("d" * 32, _make_book()) + assert status == 200 + assert result["percentage"] == 0.4 + assert result["device"] == "pagekeeper" + + def test_no_data_returns_502(self): + db = MagicMock() + db.get_kosync_documents_for_book_by_book_id.return_value = [] + db.get_states_for_book.return_value = [] + svc = _make_service(db=db) + + result, status = svc.resolve_best_progress("e" * 32, _make_book()) + assert status == 502 diff --git a/tests/test_locator_search.py b/tests/test_locator_search.py new file mode 100644 index 0000000..89313b7 --- /dev/null +++ b/tests/test_locator_search.py @@ -0,0 +1,231 @@ +""" +Tests for LocatorSearchService — text search, locator ID resolution, CFI resolution. + +All tests use crafted text and synthetic spine maps — no file I/O or mocking. +""" + +import pytest + +pytestmark = pytest.mark.docker + +from bs4 import BeautifulSoup + +from src.utils.locator_search import LocatorSearchService + + +def _make_spine_map(html_contents): + """Build a spine map from a list of HTML strings, matching extract_text_and_map format.""" + spine_map = [] + full_text_parts = [] + current_idx = 0 + + for i, content in enumerate(html_contents): + if isinstance(content, str): + content = content.encode("utf-8") + soup = BeautifulSoup(content, "html.parser") + text = soup.get_text(separator=" ", strip=True) + start = current_idx + end = current_idx + len(text) + spine_map.append( + { + "start": start, + "end": end, + "spine_index": i + 1, + "href": f"chapter{i + 1}.xhtml", + "content": content, + } + ) + full_text_parts.append(text) + current_idx = end + 1 + + full_text = " ".join(full_text_parts) + return full_text, spine_map + + +class TestFindTextLocation: + def setup_method(self): + self.service = LocatorSearchService(fuzzy_threshold=80) + + def test_exact_match(self): + content = "

<html><body><p>The quick brown fox jumps over the lazy dog.</p></body></html>

" + full_text, spine_map = _make_spine_map([content]) + result = self.service.find_text_location(full_text, spine_map, "brown fox") + assert result is not None + assert result.match_index == full_text.find("brown fox") + assert 0 < result.percentage < 1 + + def test_unique_anchor_preferred_over_first_occurrence(self): + # "Chapter One" appears twice (ToC and body), but 10-word anchor is unique + toc = "

<html><body><p>Table of Contents: Chapter One - Introduction</p></body></html>

" + body = "

<html><body><p>Chapter One - Introduction to the wonderful world of testing software applications today</p></body></html>

" + full_text, spine_map = _make_spine_map([toc, body]) + # The 10-word unique sequence should match in the body, not the ToC + search = "Chapter One - Introduction to the wonderful world of testing software" + result = self.service.find_text_location(full_text, spine_map, search) + assert result is not None + # Should find in second chapter (body), not first (ToC) + body_start = spine_map[1]["start"] + assert result.match_index >= body_start + + def test_short_phrase_uses_exact_match(self): + content = "

<html><body><p>Short phrase here.</p></body></html>

" + full_text, spine_map = _make_spine_map([content]) + result = self.service.find_text_location(full_text, spine_map, "Short phrase") + assert result is not None + assert result.match_index == 0 + + def test_normalized_match_ignores_case_and_punctuation(self): + content = "

<html><body><p>Hello, World! This is great.</p></body></html>

" + full_text, spine_map = _make_spine_map([content]) + result = self.service.find_text_location(full_text, spine_map, "hello world") + assert result is not None + + def test_fuzzy_match_finds_approximate(self): + content = ( + "

<html><body><p>The extraordinary adventures of a curious explorer in distant lands.</p></body></html>

" + ) + full_text, spine_map = _make_spine_map([content]) + # Slightly different text should fuzzy-match + result = self.service.find_text_location(full_text, spine_map, "extraordinary adventures of curious explorer") + assert result is not None + + def test_fuzzy_match_with_hint_percentage(self): + ch1 = "

<html><body><p>Some filler text to pad the beginning of the book nicely.</p></body></html>

" + ch2 = "

<html><body><p>The target sentence we want to find in the second half.</p></body></html>

" + full_text, spine_map = _make_spine_map([ch1, ch2]) + result = self.service.find_text_location(full_text, spine_map, "target sentence we want", hint_percentage=0.7) + assert result is not None + + def test_no_match_returns_none(self): + content = "

<html><body><p>Simple text here.</p></body></html>

" + full_text, spine_map = _make_spine_map([content]) + result = self.service.find_text_location(full_text, spine_map, "completely unrelated xyz123 gibberish") + assert result is None + + def test_empty_text_returns_none(self): + result = self.service.find_text_location("", [], "anything") + assert result is None + + def test_returns_valid_cfi(self): + content = "

<html><body><p>Text for CFI generation testing.</p></body></html>

" + full_text, spine_map = _make_spine_map([content]) + result = self.service.find_text_location(full_text, spine_map, "CFI generation") + assert result is not None + assert result.cfi is not None + assert result.cfi.startswith("epubcfi(") + + def test_returns_chapter_progress(self): + content = "

<html><body><p>First half of content. Second half of content here now.</p></body></html>

" + full_text, spine_map = _make_spine_map([content]) + result = self.service.find_text_location(full_text, spine_map, "Second half") + assert result is not None + assert result.chapter_progress is not None + assert 0 < result.chapter_progress < 1 + + def test_perfect_ko_xpath_is_none(self): + """Facade is responsible for filling in perfect_ko_xpath, not this service.""" + content = "

<html><body><p>Some text to find.</p></body></html>

" + full_text, spine_map = _make_spine_map([content]) + result = self.service.find_text_location(full_text, spine_map, "text to find") + assert result is not None + assert result.perfect_ko_xpath is None + + +class TestResolveLocatorId: + def setup_method(self): + self.service = LocatorSearchService() + + def test_known_fragment_returns_text(self): + content = "

<html><body><p>Intro text.</p><p id="target">Target paragraph with content.</p></body></html>

" + full_text, spine_map = _make_spine_map([content]) + result = self.service.resolve_locator_id(full_text, spine_map, "chapter1.xhtml", "target") + assert result is not None + assert "Target paragraph" in result + + def test_unknown_fragment_returns_none(self): + content = "

<html><body><p>Simple content.</p></body></html>

" + full_text, spine_map = _make_spine_map([content]) + result = self.service.resolve_locator_id(full_text, spine_map, "chapter1.xhtml", "nonexistent") + assert result is None + + def test_wrong_href_returns_none(self): + content = "

<html><body><p id="found">Found me.</p></body></html>

" + full_text, spine_map = _make_spine_map([content]) + result = self.service.resolve_locator_id(full_text, spine_map, "wrong_chapter.xhtml", "found") + assert result is None + + def test_fragment_with_hash_prefix(self): + content = "

<html><body><p id="myid">Identified paragraph.</p></body></html>

" + full_text, spine_map = _make_spine_map([content]) + result = self.service.resolve_locator_id(full_text, spine_map, "chapter1.xhtml", "#myid") + assert result is not None + assert "Identified paragraph" in result + + +class TestGetTextAroundCfi: + def setup_method(self): + self.service = LocatorSearchService() + + def test_spine_index_out_of_range_returns_none(self): + content = "

<html><body><p>Only chapter.</p></body></html>

" + full_text, spine_map = _make_spine_map([content]) + # CFI pointing to spine item 99 (doesn't exist) + result = self.service.get_text_around_cfi(full_text, spine_map, "epubcfi(/6/200!/4/2/1:0)") + assert result is None + + def test_malformed_cfi_returns_none(self): + content = "

<html><body><p>Some text.</p></body></html>

" + full_text, spine_map = _make_spine_map([content]) + result = self.service.get_text_around_cfi(full_text, spine_map, "not_a_valid_cfi") + assert result is None + + +class TestNormalize: + def test_strips_non_alphanumeric(self): + service = LocatorSearchService() + assert service._normalize("Hello, World!") == "helloworld" + assert service._normalize("test-case_123") == "testcase123" + assert service._normalize("") == "" + + +class TestGenerateXpathBs4: + def setup_method(self): + self.service = LocatorSearchService() + + def test_nested_elements_correct_path(self): + html_content = "

<html><body><div><p>First</p><p>Second</p></div></body></html>

" + xpath, tag, anchored = self.service._generate_xpath_bs4(html_content, 0) + assert xpath.startswith("/body") + assert tag is not None + assert not anchored + + def test_element_with_id_uses_anchor(self): + html_content = "

<html><body><div id="ch1"><p>Content here.</p></div></body></html>

" + xpath, tag, anchored = self.service._generate_xpath_bs4(html_content, 0) + assert anchored + assert "@id='ch1'" in xpath + assert xpath.startswith("//") + + def test_empty_body_returns_default(self): + html_content = "" + xpath, tag, anchored = self.service._generate_xpath_bs4(html_content, 0) + assert xpath == "/body/div/p[1]" + assert tag is None + assert not anchored + + +class TestGenerateCfi: + def setup_method(self): + self.service = LocatorSearchService() + + def test_produces_valid_cfi_format(self): + html_content = "

<html><body><p>Some text content for CFI.</p></body></html>

" + cfi = self.service._generate_cfi(0, html_content, 5) + assert cfi.startswith("epubcfi(/6/") + assert cfi.endswith(":0)") + + def test_no_text_produces_fallback_cfi(self): + html_content = "" + cfi = self.service._generate_cfi(2, html_content, 0) + assert cfi.startswith("epubcfi(/6/") + assert ":0)" in cfi diff --git a/tests/test_logs_routes.py b/tests/test_logs_routes.py new file mode 100644 index 0000000..acaa284 --- /dev/null +++ b/tests/test_logs_routes.py @@ -0,0 +1,404 @@ +"""Route tests for the Logs blueprint (/logs, /api/logs, /api/logs/live, /api/logs/hardcover).""" + +import json +import sys +from datetime import datetime +from pathlib import Path +from types import SimpleNamespace +from unittest.mock import MagicMock, patch + +import pytest + +sys.path.insert(0, str(Path(__file__).parent.parent)) + + +# --------------------------------------------------------------------------- +# GET /logs — renders the page +# --------------------------------------------------------------------------- + +def test_logs_view_renders(client): + resp = client.get('/logs') + assert resp.status_code == 200 + assert b'' in resp.data or b'' in resp.data.lower() + + +# --------------------------------------------------------------------------- +# GET /view_log — legacy redirect +# --------------------------------------------------------------------------- + +def test_view_log_redirects_to_logs(client): + resp = client.get('/view_log') + assert resp.status_code == 302 + assert '/logs' in resp.headers['Location'] + + +# --------------------------------------------------------------------------- +# GET /api/logs — file-based log reading +# --------------------------------------------------------------------------- + +class TestApiLogs: + """Tests for the /api/logs endpoint that reads from the log file.""" + + def test_no_log_file(self, client): + """When LOG_PATH does not exist, returns an empty log list.""" + with patch('src.blueprints.logs.LOG_PATH', Path('/tmp/nonexistent_log_file.log')): + resp = client.get('/api/logs') + data = resp.get_json() + assert resp.status_code == 200 + assert data['logs'] == [] + assert data['total_lines'] == 0 + + def test_well_formed_lines(self, client, tmp_path): + """Properly formatted log lines are parsed into structured entries.""" + log_file = tmp_path / 'test.log' + log_file.write_text( + '[2026-03-19 10:00:00] INFO - src.app: Application started\n' + '[2026-03-19 10:00:01] WARNING - src.sync: Sync delayed\n' + '[2026-03-19 10:00:02] ERROR - src.db: Connection lost\n' + ) + with patch('src.blueprints.logs.LOG_PATH', log_file): + resp = client.get('/api/logs') + data = resp.get_json() + assert resp.status_code == 200 + assert data['total_lines'] == 3 + assert len(data['logs']) == 3 + + first = data['logs'][0] + assert first['level'] == 'INFO' + assert first['module'] == 'src.app' + assert first['message'] == 'Application started' + assert first['timestamp'] == '2026-03-19 10:00:00' + + def test_malformed_lines_treated_as_info(self, client, tmp_path): + """Lines that don't match the expected format are still returned as INFO.""" + log_file = tmp_path / 'test.log' + log_file.write_text('just some random text\n') + with patch('src.blueprints.logs.LOG_PATH', log_file): + resp = client.get('/api/logs') + data = resp.get_json() + assert resp.status_code == 200 + assert len(data['logs']) == 1 + assert data['logs'][0]['level'] == 'INFO' + assert data['logs'][0]['message'] == 'just some random text' + + def test_level_filter(self, client, tmp_path): + """The ?level= 
parameter filters out lower-severity entries.""" + log_file = tmp_path / 'test.log' + log_file.write_text( + '[2026-03-19 10:00:00] DEBUG - mod: debug msg\n' + '[2026-03-19 10:00:01] INFO - mod: info msg\n' + '[2026-03-19 10:00:02] ERROR - mod: error msg\n' + ) + with patch('src.blueprints.logs.LOG_PATH', log_file): + resp = client.get('/api/logs?level=ERROR') + data = resp.get_json() + assert len(data['logs']) == 1 + assert data['logs'][0]['level'] == 'ERROR' + + def test_search_filter(self, client, tmp_path): + """The ?search= parameter filters entries by substring match.""" + log_file = tmp_path / 'test.log' + log_file.write_text( + '[2026-03-19 10:00:00] INFO - mod: Alpha message\n' + '[2026-03-19 10:00:01] INFO - mod: Beta message\n' + '[2026-03-19 10:00:02] INFO - mod: Alpha again\n' + ) + with patch('src.blueprints.logs.LOG_PATH', log_file): + resp = client.get('/api/logs?search=alpha') + data = resp.get_json() + assert len(data['logs']) == 2 + assert all('Alpha' in log['message'] or 'alpha' in log['message'] for log in data['logs']) + + def test_lines_limit(self, client, tmp_path): + """The ?lines= parameter caps the number of returned entries.""" + log_file = tmp_path / 'test.log' + lines = ''.join( + f'[2026-03-19 10:00:{i:02d}] INFO - mod: Line {i}\n' for i in range(20) + ) + log_file.write_text(lines) + with patch('src.blueprints.logs.LOG_PATH', log_file): + resp = client.get('/api/logs?lines=5') + data = resp.get_json() + assert data['displayed_lines'] == 5 + assert data['total_lines'] == 20 + + def test_offset_parameter(self, client, tmp_path): + """The ?offset= parameter skips the most recent N entries.""" + log_file = tmp_path / 'test.log' + lines = ''.join( + f'[2026-03-19 10:00:{i:02d}] INFO - mod: Line {i}\n' for i in range(10) + ) + log_file.write_text(lines) + with patch('src.blueprints.logs.LOG_PATH', log_file): + resp_all = client.get('/api/logs') + resp_offset = client.get('/api/logs?offset=3') + all_data = resp_all.get_json() + offset_data = resp_offset.get_json() + assert offset_data['displayed_lines'] == all_data['displayed_lines'] - 3 + + def test_empty_lines_skipped(self, client, tmp_path): + """Blank lines in the log file are ignored.""" + log_file = tmp_path / 'test.log' + log_file.write_text( + '\n\n[2026-03-19 10:00:00] INFO - mod: One line\n\n\n' + ) + with patch('src.blueprints.logs.LOG_PATH', log_file): + resp = client.get('/api/logs') + data = resp.get_json() + assert data['total_lines'] == 1 + + def test_line_without_colon_separator(self, client, tmp_path): + """A timestamped line with no ': ' is treated as INFO with unknown module.""" + log_file = tmp_path / 'test.log' + log_file.write_text('[2026-03-19 10:00:00] some text without colon separator\n') + with patch('src.blueprints.logs.LOG_PATH', log_file): + resp = client.get('/api/logs') + data = resp.get_json() + assert len(data['logs']) == 1 + assert data['logs'][0]['level'] == 'INFO' + assert data['logs'][0]['module'] == 'unknown' + + def test_log_path_none(self, client): + """When LOG_PATH is None, returns empty results.""" + with patch('src.blueprints.logs.LOG_PATH', None): + resp = client.get('/api/logs') + data = resp.get_json() + assert resp.status_code == 200 + assert data['logs'] == [] + + +# --------------------------------------------------------------------------- +# GET /api/logs/live — memory-based live logs +# --------------------------------------------------------------------------- + +class TestApiLogsLive: + """Tests for the /api/logs/live endpoint backed by MemoryLogHandler.""" + 
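+
+    # memory_log_handler is patched in every test here, so only its assumed
+    # get_recent_logs() contract matters: a list of dicts with timestamp,
+    # level, message, and module keys (mirrored by _make_log_entries below).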
+ def _make_log_entries(self, count=5, level='INFO'): + return [ + { + 'timestamp': f'2026-03-19 10:00:{i:02d}', + 'level': level, + 'message': f'Live message {i}', + 'module': 'src.test', + } + for i in range(count) + ] + + def test_happy_path(self, client): + """Returns recent memory logs as JSON.""" + entries = self._make_log_entries(3) + with patch('src.blueprints.logs.memory_log_handler') as mock_handler: + mock_handler.get_recent_logs.return_value = entries + resp = client.get('/api/logs/live') + data = resp.get_json() + assert resp.status_code == 200 + assert len(data['logs']) == 3 + assert 'timestamp' in data + + def test_level_filter(self, client): + """The ?level= parameter filters memory logs by severity.""" + entries = [ + {'timestamp': 't1', 'level': 'DEBUG', 'message': 'dbg', 'module': 'm'}, + {'timestamp': 't2', 'level': 'ERROR', 'message': 'err', 'module': 'm'}, + ] + with patch('src.blueprints.logs.memory_log_handler') as mock_handler: + mock_handler.get_recent_logs.return_value = entries + resp = client.get('/api/logs/live?level=ERROR') + data = resp.get_json() + assert len(data['logs']) == 1 + assert data['logs'][0]['level'] == 'ERROR' + + def test_search_filter(self, client): + """The ?search= parameter filters memory logs by substring.""" + entries = [ + {'timestamp': 't1', 'level': 'INFO', 'message': 'apple pie', 'module': 'm'}, + {'timestamp': 't2', 'level': 'INFO', 'message': 'banana split', 'module': 'm'}, + ] + with patch('src.blueprints.logs.memory_log_handler') as mock_handler: + mock_handler.get_recent_logs.return_value = entries + resp = client.get('/api/logs/live?search=apple') + data = resp.get_json() + assert len(data['logs']) == 1 + assert 'apple' in data['logs'][0]['message'] + + def test_count_parameter(self, client): + """The ?count= parameter limits the number of returned entries.""" + entries = self._make_log_entries(10) + with patch('src.blueprints.logs.memory_log_handler') as mock_handler: + mock_handler.get_recent_logs.return_value = entries + resp = client.get('/api/logs/live?count=3') + data = resp.get_json() + assert len(data['logs']) == 3 + + def test_error_returns_500(self, client): + """When an exception is raised, the endpoint returns 500.""" + with patch('src.blueprints.logs.memory_log_handler') as mock_handler: + mock_handler.get_recent_logs.side_effect = RuntimeError('boom') + resp = client.get('/api/logs/live') + assert resp.status_code == 500 + data = resp.get_json() + assert 'error' in data + assert data['logs'] == [] + + +# --------------------------------------------------------------------------- +# GET /api/logs/hardcover — database-backed Hardcover sync logs +# --------------------------------------------------------------------------- + +class TestApiLogsHardcover: + """Tests for the /api/logs/hardcover endpoint.""" + + @staticmethod + def _make_entry(id=1, abs_id='abc-123', book_title='Test Book', + direction='pull', action='update_progress', + detail=None, success=True, error_message=None): + entry = SimpleNamespace( + id=id, + abs_id=abs_id, + book_title=book_title, + direction=direction, + action=action, + detail=detail, + success=success, + error_message=error_message, + created_at=datetime(2026, 3, 19, 12, 0, 0), + ) + return entry + + def test_basic_pagination(self, client, mock_container): + """Returns paginated results with correct metadata.""" + entries = [self._make_entry(id=i) for i in range(3)] + mock_container.mock_database_service.get_hardcover_sync_logs.return_value = (entries, 3) + + resp = 
client.get('/api/logs/hardcover') + data = resp.get_json() + assert resp.status_code == 200 + assert data['total'] == 3 + assert len(data['logs']) == 3 + assert data['page'] == 1 + assert data['total_pages'] == 1 + + def test_page_and_per_page(self, client, mock_container): + """The ?page= and ?per_page= parameters are forwarded to the database.""" + mock_container.mock_database_service.get_hardcover_sync_logs.return_value = ([], 100) + + resp = client.get('/api/logs/hardcover?page=3&per_page=10') + data = resp.get_json() + assert resp.status_code == 200 + assert data['page'] == 3 + assert data['per_page'] == 10 + assert data['total_pages'] == 10 + + call_kwargs = mock_container.mock_database_service.get_hardcover_sync_logs.call_args + assert call_kwargs.kwargs['page'] == 3 + assert call_kwargs.kwargs['per_page'] == 10 + + def test_direction_and_action_filters(self, client, mock_container): + """The ?direction= and ?action= query params are forwarded.""" + mock_container.mock_database_service.get_hardcover_sync_logs.return_value = ([], 0) + + resp = client.get('/api/logs/hardcover?direction=push&action=update_progress') + assert resp.status_code == 200 + + call_kwargs = mock_container.mock_database_service.get_hardcover_sync_logs.call_args + assert call_kwargs.kwargs['direction'] == 'push' + assert call_kwargs.kwargs['action'] == 'update_progress' + + def test_search_filter(self, client, mock_container): + """The ?search= query param is forwarded.""" + mock_container.mock_database_service.get_hardcover_sync_logs.return_value = ([], 0) + + resp = client.get('/api/logs/hardcover?search=Dune') + assert resp.status_code == 200 + + call_kwargs = mock_container.mock_database_service.get_hardcover_sync_logs.call_args + assert call_kwargs.kwargs['search'] == 'Dune' + + def test_json_detail_parsed(self, client, mock_container): + """When entry.detail is valid JSON, it is returned as a parsed object.""" + entry = self._make_entry(detail='{"progress": 0.5}') + mock_container.mock_database_service.get_hardcover_sync_logs.return_value = ([entry], 1) + + resp = client.get('/api/logs/hardcover') + data = resp.get_json() + assert data['logs'][0]['detail'] == {'progress': 0.5} + + def test_non_json_detail_returned_as_is(self, client, mock_container): + """When entry.detail is not valid JSON, it is returned as a plain string.""" + entry = self._make_entry(detail='plain text detail') + mock_container.mock_database_service.get_hardcover_sync_logs.return_value = ([entry], 1) + + resp = client.get('/api/logs/hardcover') + data = resp.get_json() + assert data['logs'][0]['detail'] == 'plain text detail' + + def test_null_detail(self, client, mock_container): + """When entry.detail is None, detail is returned as null.""" + entry = self._make_entry(detail=None) + mock_container.mock_database_service.get_hardcover_sync_logs.return_value = ([entry], 1) + + resp = client.get('/api/logs/hardcover') + data = resp.get_json() + assert data['logs'][0]['detail'] is None + + def test_entry_fields_serialized(self, client, mock_container): + """All expected fields are present in the serialized log entry.""" + entry = self._make_entry( + id=42, abs_id='xyz', book_title='Dune', direction='push', + action='set_status', success=False, error_message='timeout', + ) + mock_container.mock_database_service.get_hardcover_sync_logs.return_value = ([entry], 1) + + resp = client.get('/api/logs/hardcover') + log = resp.get_json()['logs'][0] + assert log['id'] == 42 + assert log['abs_id'] == 'xyz' + assert log['book_title'] == 'Dune' + 
assert log['direction'] == 'push' + assert log['action'] == 'set_status' + assert log['success'] is False + assert log['error_message'] == 'timeout' + assert log['created_at'] == '2026-03-19T12:00:00' + + def test_null_created_at(self, client, mock_container): + """When created_at is None, it is serialized as null.""" + entry = self._make_entry() + entry.created_at = None + mock_container.mock_database_service.get_hardcover_sync_logs.return_value = ([entry], 1) + + resp = client.get('/api/logs/hardcover') + assert resp.get_json()['logs'][0]['created_at'] is None + + def test_error_returns_500(self, client, mock_container): + """When an exception is raised, the endpoint returns 500.""" + mock_container.mock_database_service.get_hardcover_sync_logs.side_effect = RuntimeError('db down') + + resp = client.get('/api/logs/hardcover') + assert resp.status_code == 500 + data = resp.get_json() + assert 'error' in data + assert data['logs'] == [] + assert data['total'] == 0 + + def test_empty_filters_become_none(self, client, mock_container): + """Empty string filter params are converted to None.""" + mock_container.mock_database_service.get_hardcover_sync_logs.return_value = ([], 0) + + resp = client.get('/api/logs/hardcover?direction=&action=&search=') + assert resp.status_code == 200 + + call_kwargs = mock_container.mock_database_service.get_hardcover_sync_logs.call_args + assert call_kwargs.kwargs['direction'] is None + assert call_kwargs.kwargs['action'] is None + assert call_kwargs.kwargs['search'] is None + + def test_per_page_clamped_to_max(self, client, mock_container): + """per_page cannot exceed 200.""" + mock_container.mock_database_service.get_hardcover_sync_logs.return_value = ([], 0) + + resp = client.get('/api/logs/hardcover?per_page=999') + assert resp.status_code == 200 + + call_kwargs = mock_container.mock_database_service.get_hardcover_sync_logs.call_args + assert call_kwargs.kwargs['per_page'] == 200 diff --git a/tests/test_matching_errors.py b/tests/test_matching_errors.py new file mode 100644 index 0000000..36d3eac --- /dev/null +++ b/tests/test_matching_errors.py @@ -0,0 +1,265 @@ +"""Tests for error paths in matching blueprint (src/blueprints/matching_bp.py).""" + +from unittest.mock import Mock, patch + + +def _setup_matching_db_defaults(mock_db): + """Configure database_service mock for matching routes.""" + mock_db.get_all_books.return_value = [] + mock_db.get_book_by_ref.return_value = None + mock_db.get_book_by_kosync_id.return_value = None + mock_db.get_kosync_doc_by_filename.return_value = None + mock_db.get_all_actionable_suggestions.return_value = [] + mock_db.get_bookfusion_books.return_value = [] + + +# ── _create_book_mapping: Booklore lookup fails ────────────────── + +def test_create_book_mapping_booklore_raises(flask_app, mock_container): + """_create_book_mapping proceeds when find_in_booklore raises internally. + + find_in_booklore catches its own errors, so _create_book_mapping only sees + (None, None). The mapping should still succeed if KOSync ID can be computed. 
+ """ + _setup_matching_db_defaults(mock_container.mock_database_service) + + mock_container.mock_booklore_client.is_configured.return_value = True + mock_container.mock_booklore_client.find_book_by_filename.return_value = None + + with flask_app.app_context(): + with patch("src.blueprints.matching_bp.find_in_booklore", return_value=(None, None)): + with patch("src.blueprints.matching_bp.get_kosync_id_for_ebook", return_value="hash123"): + from src.blueprints.matching_bp import _create_book_mapping + + book, error = _create_book_mapping( + mock_container, + abs_id="test-abs", + title="Test Book", + ebook_filename="test.epub", + duration=3600, + ) + + assert book is not None + assert error is None + mock_container.mock_database_service.save_book.assert_called() + + +def test_create_book_mapping_kosync_id_fails(flask_app, mock_container): + """_create_book_mapping returns error when KOSync ID cannot be computed.""" + _setup_matching_db_defaults(mock_container.mock_database_service) + + mock_container.mock_booklore_client.is_configured.return_value = False + mock_container.mock_ebook_parser.get_kosync_id.return_value = None + + with flask_app.app_context(): + from src.blueprints.matching_bp import _create_book_mapping + + book, error = _create_book_mapping( + mock_container, + abs_id="test-abs", + title="Test Book", + ebook_filename="test.epub", + duration=3600, + ) + + assert book is None + assert "KOSync ID" in error + + +def test_create_book_mapping_hardcover_automatch_fails(flask_app, mock_container): + """_create_book_mapping still returns the book when Hardcover automatch throws.""" + _setup_matching_db_defaults(mock_container.mock_database_service) + + mock_container.mock_booklore_client.is_configured.return_value = False + mock_container.mock_hardcover_service.is_configured.return_value = True + mock_container.mock_hardcover_service.automatch_hardcover.side_effect = Exception("HC timeout") + + with flask_app.app_context(): + with patch("src.blueprints.matching_bp.find_in_booklore", return_value=(None, None)): + with patch("src.blueprints.matching_bp.get_kosync_id_for_ebook", return_value="hash456"): + from src.blueprints.matching_bp import _create_book_mapping + + book, error = _create_book_mapping( + mock_container, + abs_id="test-abs", + title="Test Book", + ebook_filename="test.epub", + duration=3600, + ) + + # Book is created despite HC failure + assert book is not None + assert error is None + + +def test_create_book_mapping_booklore_add_to_shelf_fails(flask_app, mock_container): + """_create_book_mapping logs but succeeds when Booklore add_to_shelf throws.""" + _setup_matching_db_defaults(mock_container.mock_database_service) + + bl_client = Mock() + bl_client.is_configured.return_value = True + bl_client.add_to_shelf.side_effect = Exception("Shelf error") + + mock_container.mock_booklore_client.is_configured.return_value = True + mock_container.mock_booklore_client.find_book_by_filename.return_value = { + "id": 99, "_instance_id": "default" + } + mock_container.mock_ebook_parser.get_kosync_id_from_bytes.return_value = "hash789" + bl_client.download_book.return_value = b"fake epub" + mock_container.mock_ebook_parser.get_kosync_id_from_bytes.return_value = "hash789" + + # Patch find_in_booklore to return our bl_client + with flask_app.app_context(): + with patch("src.blueprints.matching_bp.find_in_booklore", return_value=({"id": 99}, bl_client)): + with patch("src.blueprints.matching_bp.get_kosync_id_for_ebook", return_value="hash789"): + from src.blueprints.matching_bp import 
_create_book_mapping + + book, error = _create_book_mapping( + mock_container, + abs_id="test-abs-2", + title="Test Book 2", + ebook_filename="test2.epub", + duration=3600, + ) + + assert book is not None + assert error is None + bl_client.add_to_shelf.assert_called_once() + + +# ── Batch match: individual book failure continues ──────────────── + +def test_batch_match_process_continues_on_individual_failure(flask_app, mock_container, client): + """Batch match should continue processing remaining items when one fails.""" + _setup_matching_db_defaults(mock_container.mock_database_service) + + call_count = {"n": 0} + + def create_mapping_side_effect(*args, **kwargs): + call_count["n"] += 1 + if call_count["n"] == 1: + raise Exception("First book fails") + return Mock(), None + + with flask_app.test_client() as test_client: + with test_client.session_transaction() as sess: + sess["queue"] = [ + { + "queue_key": "abs-1", + "abs_id": "abs-1", + "title": "Book 1", + "ebook_filename": "book1.epub", + "duration": 100, + }, + { + "queue_key": "abs-2", + "abs_id": "abs-2", + "title": "Book 2", + "ebook_filename": "book2.epub", + "duration": 200, + }, + ] + + with patch("src.blueprints.matching_bp._create_book_mapping", side_effect=create_mapping_side_effect): + response = test_client.post("/batch-match", data={"action": "process_queue"}) + + # Should redirect to dashboard (302) + assert response.status_code == 302 + # Both items were attempted (the second should succeed) + assert call_count["n"] == 2 + + +def test_batch_match_audio_only_continues_on_failure(flask_app, mock_container, client): + """Batch match should continue when an audio-only item fails save.""" + _setup_matching_db_defaults(mock_container.mock_database_service) + mock_container.mock_database_service.save_book.side_effect = [ + Exception("DB error on first save"), + None, # second save succeeds + ] + + with flask_app.test_client() as test_client: + with test_client.session_transaction() as sess: + sess["queue"] = [ + { + "queue_key": "abs-audio", + "abs_id": "abs-audio", + "title": "Audio Book", + "ebook_filename": "", + "duration": 100, + "audio_only": True, + }, + { + "queue_key": "abs-audio2", + "abs_id": "abs-audio2", + "title": "Audio Book 2", + "ebook_filename": "", + "duration": 200, + "audio_only": True, + }, + ] + + response = test_client.post("/batch-match", data={"action": "process_queue"}) + + assert response.status_code == 302 + + +def test_batch_match_ebook_only_kosync_failure_adds_to_failed(flask_app, mock_container, client): + """Ebook-only batch items that fail KOSync ID computation are added to failed list.""" + _setup_matching_db_defaults(mock_container.mock_database_service) + + with flask_app.test_client() as test_client: + with test_client.session_transaction() as sess: + sess["queue"] = [ + { + "queue_key": "ebook1.epub", + "abs_id": "", + "title": "Ebook Only", + "ebook_filename": "ebook1.epub", + "ebook_display_name": "My Ebook", + "duration": 0, + "ebook_only": True, + }, + ] + + with patch("src.blueprints.matching_bp.find_in_booklore", return_value=(None, None)): + with patch("src.blueprints.matching_bp.get_kosync_id_for_ebook", return_value=None): + response = test_client.post("/batch-match", data={"action": "process_queue"}) + + # Should redirect with a flash warning about failed items + assert response.status_code == 302 + + +# ── Suggestions page: serialize edge cases ──────────────────────── + +def test_suggestions_page_filters_no_matches(flask_app, mock_container): + """Suggestions page filters out 
suggestions with empty matches.""" + _setup_matching_db_defaults(mock_container.mock_database_service) + + suggestion_no_matches = Mock() + suggestion_no_matches.matches = [] + suggestion_no_matches.id = 1 + suggestion_no_matches.source_id = "abc" + + suggestion_with_matches = Mock() + suggestion_with_matches.id = 2 + suggestion_with_matches.source_id = "def" + suggestion_with_matches.source = "abs" + suggestion_with_matches.title = "Test" + suggestion_with_matches.author = None + suggestion_with_matches.cover_url = None + suggestion_with_matches.matches = [{"ebook_filename": "test.epub", "evidence": []}] + suggestion_with_matches.created_at = None + suggestion_with_matches.status = "pending" + + mock_container.mock_database_service.get_all_actionable_suggestions.return_value = [ + suggestion_no_matches, suggestion_with_matches + ] + + # Need ABS available for suggestions route + flask_app.config['abs_service'] = Mock() + flask_app.config['abs_service'].is_available.return_value = True + + with flask_app.test_client() as test_client: + response = test_client.get("/suggestions") + + assert response.status_code == 200 diff --git a/tests/test_rate_limiter.py b/tests/test_rate_limiter.py new file mode 100644 index 0000000..372be9a --- /dev/null +++ b/tests/test_rate_limiter.py @@ -0,0 +1,62 @@ +"""Tests for TokenBucketRateLimiter.""" + +import time + +from src.utils.rate_limiter import TokenBucketRateLimiter + + +class TestTokenBucketRateLimiter: + def test_allows_within_capacity(self): + limiter = TokenBucketRateLimiter(capacity=5, refill_rate=0) + for _ in range(5): + assert limiter.check("1.2.3.4") is True + + def test_rejects_over_capacity(self): + limiter = TokenBucketRateLimiter(capacity=3, refill_rate=0) + for _ in range(3): + limiter.check("1.2.3.4") + assert limiter.check("1.2.3.4") is False + + def test_refills_over_time(self): + limiter = TokenBucketRateLimiter(capacity=2, refill_rate=100) + # Exhaust tokens + limiter.check("1.2.3.4") + limiter.check("1.2.3.4") + assert limiter.check("1.2.3.4") is False + # Sleep briefly — high refill rate means quick recovery + time.sleep(0.05) + assert limiter.check("1.2.3.4") is True + + def test_auth_cost_exhausts_faster(self): + limiter = TokenBucketRateLimiter(capacity=10, refill_rate=0) + # AUTH_TOKEN_COST = 5, so 2 auth checks exhaust 10 tokens + assert limiter.check("1.2.3.4", cost=TokenBucketRateLimiter.AUTH_TOKEN_COST) is True + assert limiter.check("1.2.3.4", cost=TokenBucketRateLimiter.AUTH_TOKEN_COST) is True + assert limiter.check("1.2.3.4", cost=1) is False + + def test_prune_removes_stale(self): + limiter = TokenBucketRateLimiter(capacity=10, refill_rate=1) + limiter.check("stale-ip") + # Prune with a very short threshold + limiter.prune(max_idle_seconds=0) + # Internal store should be empty — next check gets fresh bucket + assert limiter.check("stale-ip") is True + # Verify a second full capacity is available (bucket was recreated) + for _ in range(9): + assert limiter.check("stale-ip") is True + + def test_clear_empties_store(self): + limiter = TokenBucketRateLimiter(capacity=2, refill_rate=0) + limiter.check("a") + limiter.check("b") + limiter.clear() + # After clear, both IPs get fresh buckets + assert limiter.check("a") is True + assert limiter.check("b") is True + + def test_separate_buckets_per_ip(self): + limiter = TokenBucketRateLimiter(capacity=1, refill_rate=0) + assert limiter.check("ip-a") is True + assert limiter.check("ip-a") is False + # Different IP should still have tokens + assert limiter.check("ip-b") is True 
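
The suite above pins down the limiter's public contract: per-key buckets seeded
with capacity tokens, refill_rate tokens restored per second, a cost parameter
with AUTH_TOKEN_COST = 5 for expensive auth attempts, plus prune() and clear()
for housekeeping. A minimal sketch that satisfies these tests is shown below;
the internals (dict store, monotonic clock, locking) are assumptions for
illustration, not necessarily what src/utils/rate_limiter.py ships:

    import threading
    import time

    class TokenBucketRateLimiter:
        AUTH_TOKEN_COST = 5  # auth attempts drain a bucket faster than normal requests

        def __init__(self, capacity, refill_rate):
            self.capacity = capacity
            self.refill_rate = refill_rate  # tokens restored per second of idle time
            self._buckets = {}  # key -> (tokens_remaining, last_seen)
            self._lock = threading.Lock()

        def check(self, key, cost=1):
            """Spend `cost` tokens for `key`; False means rate-limited."""
            now = time.monotonic()
            with self._lock:
                tokens, last = self._buckets.get(key, (self.capacity, now))
                # Lazy refill: credit tokens for idle time, capped at capacity.
                tokens = min(self.capacity, tokens + (now - last) * self.refill_rate)
                allowed = tokens >= cost
                if allowed:
                    tokens -= cost
                self._buckets[key] = (tokens, now)
                return allowed

        def prune(self, max_idle_seconds=3600):
            """Drop buckets that have been idle longer than the threshold."""
            cutoff = time.monotonic() - max_idle_seconds
            with self._lock:
                self._buckets = {k: v for k, v in self._buckets.items() if v[1] > cutoff}

        def clear(self):
            with self._lock:
                self._buckets.clear()

Keeping a last-seen timestamp per bucket is what makes both the lazy refill
(test_refills_over_time) and the idle pruning (test_prune_removes_stale) cheap:
no background refill thread is required.
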
diff --git a/tests/test_reading_bp_errors.py b/tests/test_reading_bp_errors.py new file mode 100644 index 0000000..fa5ffa6 --- /dev/null +++ b/tests/test_reading_bp_errors.py @@ -0,0 +1,230 @@ +"""Tests for error paths in reading blueprint (src/blueprints/reading_bp.py).""" + +from unittest.mock import Mock + + +def _make_mock_book(**overrides): + """Create a mock Book object with reading-relevant fields.""" + defaults = { + "id": 1, + "abs_id": "test-abs", + "title": "Test Book", + "author": "Author", + "ebook_filename": "test.epub", + "original_ebook_filename": None, + "kosync_doc_id": "hash123", + "status": "active", + "sync_mode": "audiobook", + "duration": 3600, + "started_at": "2026-01-01", + "finished_at": None, + "rating": None, + "read_count": 1, + "subtitle": None, + "storyteller_uuid": None, + "custom_cover_url": None, + "abs_ebook_item_id": None, + "ebook_item_id": None, + } + defaults.update(overrides) + book = Mock() + for k, v in defaults.items(): + setattr(book, k, v) + return book + + +def _setup_reading_db_defaults(mock_db, book=None): + """Configure mock DB with defaults for reading routes.""" + mock_db.get_all_books.return_value = [book] if book else [] + mock_db.get_all_states.return_value = [] + mock_db.get_states_by_book.return_value = {} + mock_db.get_states_for_book.return_value = [] + mock_db.get_booklore_by_filename.return_value = {} + mock_db.get_all_hardcover_details.return_value = [] + mock_db.get_hardcover_details.return_value = None + mock_db.get_reading_goal.return_value = None + mock_db.get_reading_journals.return_value = [] + mock_db.get_bookfusion_highlights_for_book_by_book_id.return_value = [] + mock_db.is_bookfusion_linked_by_book_id.return_value = False + mock_db.find_tbr_by_book_id.return_value = None + mock_db.get_tbr_count.return_value = 0 + mock_db.get_tbr_items.return_value = [] + mock_db.get_book_by_ref.return_value = book + + +# ── Reading index: ABS metadata fetch fails ─────────────────────── + +def test_reading_index_renders_when_abs_metadata_fails(flask_app, mock_container): + """Reading page should render 200 even when ABS get_audiobooks raises.""" + book = _make_mock_book() + _setup_reading_db_defaults(mock_container.mock_database_service, book) + + failing_abs = Mock() + failing_abs.get_audiobooks.side_effect = Exception("ABS unavailable") + failing_abs.is_available.return_value = False + failing_abs.get_cover_proxy_url.return_value = None + flask_app.config['abs_service'] = failing_abs + + with flask_app.test_client() as client: + response = client.get("/reading") + + assert response.status_code == 200 + + +def test_reading_index_renders_when_hardcover_check_fails(flask_app, mock_container): + """Reading page should render when Hardcover is_configured raises.""" + _setup_reading_db_defaults(mock_container.mock_database_service) + mock_container.mock_hardcover_client.is_configured.side_effect = Exception("HC error") + + with flask_app.test_client() as client: + response = client.get("/reading") + + assert response.status_code == 200 + + +def test_reading_index_renders_when_tbr_items_fail(flask_app, mock_container): + """Reading page should render when TBR items query fails.""" + _setup_reading_db_defaults(mock_container.mock_database_service) + mock_container.mock_database_service.get_tbr_items.side_effect = Exception("TBR DB error") + + with flask_app.test_client() as client: + response = client.get("/reading") + + assert response.status_code == 200 + + +# ── Rating: Hardcover sync fails but local save succeeds ────────── + +def 
test_update_rating_hardcover_sync_fails_local_succeeds(flask_app, mock_container): + """Rating update should succeed locally even when Hardcover push throws.""" + book = _make_mock_book(rating=4.0) + mock_container.mock_database_service.get_book_by_ref.return_value = book + + updated_book = _make_mock_book(rating=4.0) + mock_container.mock_database_service.update_book_reading_fields.return_value = updated_book + + mock_container.mock_hardcover_service.is_configured.return_value = True + mock_container.mock_hardcover_service.push_local_rating.side_effect = Exception("HC push failed") + + with flask_app.test_client() as client: + response = client.post( + "/api/reading/book/test-abs/rating", + json={"rating": 4.0}, + ) + + data = response.get_json() + assert response.status_code == 200 + assert data["success"] is True + assert data["rating"] == 4.0 + assert data["hardcover_synced"] is False + assert data["hardcover_error"] == "HC push failed" + + +def test_update_rating_hardcover_not_configured(flask_app, mock_container): + """Rating update should succeed when Hardcover is not configured.""" + book = _make_mock_book(rating=3.5) + mock_container.mock_database_service.get_book_by_ref.return_value = book + + updated_book = _make_mock_book(rating=3.5) + mock_container.mock_database_service.update_book_reading_fields.return_value = updated_book + + mock_container.mock_hardcover_service.is_configured.return_value = False + + with flask_app.test_client() as client: + response = client.post( + "/api/reading/book/test-abs/rating", + json={"rating": 3.5}, + ) + + data = response.get_json() + assert response.status_code == 200 + assert data["success"] is True + assert data["hardcover_synced"] is False + assert data["hardcover_error"] is None + + +def test_update_rating_invalid_value_returns_400(flask_app, mock_container): + """Rating update with non-numeric value returns 400.""" + book = _make_mock_book() + mock_container.mock_database_service.get_book_by_ref.return_value = book + + with flask_app.test_client() as client: + response = client.post( + "/api/reading/book/test-abs/rating", + json={"rating": "not-a-number"}, + ) + + assert response.status_code == 400 + data = response.get_json() + assert data["success"] is False + + +def test_update_rating_out_of_range_returns_400(flask_app, mock_container): + """Rating outside 0-5 returns 400.""" + book = _make_mock_book() + mock_container.mock_database_service.get_book_by_ref.return_value = book + + with flask_app.test_client() as client: + response = client.post( + "/api/reading/book/test-abs/rating", + json={"rating": 6.0}, + ) + + assert response.status_code == 400 + + +# ── Reading detail: alignment info failure ──────────────────────── + +def test_reading_detail_alignment_failure_swallowed(flask_app, mock_container): + """Reading detail page should render when alignment_service raises.""" + book = _make_mock_book(sync_mode="audiobook") + mock_container.mock_database_service.get_book_by_ref.return_value = book + mock_container.mock_database_service.get_hardcover_details.return_value = None + mock_container.mock_database_service.get_reading_journals.return_value = [] + mock_container.mock_database_service.get_bookfusion_highlights_for_book_by_book_id.return_value = [] + mock_container.mock_database_service.is_bookfusion_linked_by_book_id.return_value = False + mock_container.mock_database_service.get_states_by_book.return_value = {} + mock_container.mock_database_service.get_booklore_by_filename.return_value = {} + 
mock_container.mock_database_service.find_tbr_by_book_id.return_value = None + + failing_alignment = Mock() + failing_alignment.get_alignment_info.side_effect = Exception("Alignment DB error") + mock_container.alignment_service = lambda: failing_alignment + + with flask_app.test_client() as client: + response = client.get("/reading/book/test-abs") + + assert response.status_code == 200 + + +# ── TBR detail: HC check failure ────────────────────────────────── + +def test_tbr_detail_hardcover_check_fails(flask_app, mock_container): + """TBR detail page renders when Hardcover is_configured raises.""" + tbr_item = Mock() + tbr_item.id = 1 + tbr_item.title = "TBR Book" + tbr_item.author = "Author" + tbr_item.genres = None + tbr_item.book_id = None + tbr_item.book_abs_id = None + tbr_item.cover_url = None + tbr_item.hardcover_book_id = None + tbr_item.notes = None + tbr_item.priority = 0 + tbr_item.source = "manual" + tbr_item.created_at = None + tbr_item.rating = None + tbr_item.page_count = None + tbr_item.release_year = None + tbr_item.added_at = None + tbr_item.description = None + tbr_item.status = "want_to_read" + + mock_container.mock_database_service.get_tbr_item.return_value = tbr_item + mock_container.mock_hardcover_client.is_configured.side_effect = Exception("HC config error") + + with flask_app.test_client() as client: + response = client.get("/reading/tbr/1") + + assert response.status_code == 200 diff --git a/tests/test_reading_date_service.py b/tests/test_reading_date_service.py new file mode 100644 index 0000000..aabdf6f --- /dev/null +++ b/tests/test_reading_date_service.py @@ -0,0 +1,477 @@ +"""Tests for ReadingDateService — focused on error paths and edge cases.""" + +from unittest.mock import MagicMock, patch + +import pytest + +from src.services.reading_date_service import ReadingDateService, push_booklore_read_status + +# --------------------------------------------------------------------------- +# Fixtures +# --------------------------------------------------------------------------- + +@pytest.fixture +def mock_db(): + return MagicMock() + + +@pytest.fixture +def mock_hc_client(): + return MagicMock() + + +@pytest.fixture +def mock_abs_client(): + return MagicMock() + + +@pytest.fixture +def service(mock_db, mock_hc_client, mock_abs_client): + return ReadingDateService(mock_db, mock_hc_client, mock_abs_client) + + +def _make_book(**overrides): + book = MagicMock() + book.id = overrides.get("id", 1) + book.abs_id = overrides.get("abs_id", "abs-123") + book.title = overrides.get("title", "Test Book") + book.status = overrides.get("status", "active") + book.started_at = overrides.get("started_at", None) + book.finished_at = overrides.get("finished_at", None) + book.ebook_filename = overrides.get("ebook_filename", None) + return book + + +# =========================================================================== +# pull_reading_dates +# =========================================================================== + +class TestPullReadingDates: + """Tests for pull_reading_dates (ABS date retrieval).""" + + def test_returns_empty_when_book_not_found(self, service, mock_db): + mock_db.get_book_by_id.return_value = None + assert service.pull_reading_dates(99) == {} + + def test_returns_empty_when_book_has_no_abs_id(self, service, mock_db): + mock_db.get_book_by_id.return_value = _make_book(abs_id=None) + assert service.pull_reading_dates(1) == {} + + def test_returns_empty_when_abs_not_configured(self, service, mock_db, mock_abs_client): + mock_db.get_book_by_id.return_value = 
_make_book() + mock_abs_client.is_configured.return_value = False + assert service.pull_reading_dates(1) == {} + + def test_returns_empty_when_abs_progress_is_none(self, service, mock_db, mock_abs_client): + mock_db.get_book_by_id.return_value = _make_book() + mock_abs_client.is_configured.return_value = True + mock_abs_client.get_progress.return_value = None + assert service.pull_reading_dates(1) == {} + + def test_parses_started_and_finished_timestamps(self, service, mock_db, mock_abs_client): + mock_db.get_book_by_id.return_value = _make_book() + mock_abs_client.is_configured.return_value = True + # 1750032000000 ms = 2025-06-16 00:00:00 UTC + mock_abs_client.get_progress.return_value = { + "startedAt": 1750032000000, + "finishedAt": 1750118400000, # +1 day + } + dates = service.pull_reading_dates(1) + assert dates["started_at"] == "2025-06-16" + assert dates["finished_at"] == "2025-06-17" + + def test_returns_only_started_when_finished_missing(self, service, mock_db, mock_abs_client): + mock_db.get_book_by_id.return_value = _make_book() + mock_abs_client.is_configured.return_value = True + mock_abs_client.get_progress.return_value = {"startedAt": 1750032000000} + dates = service.pull_reading_dates(1) + assert "started_at" in dates + assert "finished_at" not in dates + + def test_returns_empty_when_api_raises(self, service, mock_db, mock_abs_client): + """Exception in ABS client should be caught and return empty dict.""" + mock_db.get_book_by_id.return_value = _make_book() + mock_abs_client.is_configured.return_value = True + mock_abs_client.get_progress.side_effect = ConnectionError("timeout") + assert service.pull_reading_dates(1) == {} + + def test_returns_empty_when_db_raises(self, service, mock_db): + """Exception in database lookup should be caught and return empty dict.""" + mock_db.get_book_by_id.side_effect = RuntimeError("DB gone") + assert service.pull_reading_dates(1) == {} + + def test_zero_timestamp_not_included(self, service, mock_db, mock_abs_client): + """A startedAt of 0 is falsy and should not produce a date.""" + mock_db.get_book_by_id.return_value = _make_book() + mock_abs_client.is_configured.return_value = True + mock_abs_client.get_progress.return_value = {"startedAt": 0, "finishedAt": 1750032000000} + dates = service.pull_reading_dates(1) + assert "started_at" not in dates + assert "finished_at" in dates + + +# =========================================================================== +# push_dates_to_hardcover +# =========================================================================== + +class TestPushDatesToHardcover: + """Tests for push_dates_to_hardcover.""" + + def test_not_configured(self, service, mock_hc_client): + mock_hc_client.is_configured.return_value = False + ok, msg = service.push_dates_to_hardcover(1) + assert ok is False + assert "not configured" in msg + + def test_no_hardcover_details(self, service, mock_db, mock_hc_client): + mock_hc_client.is_configured.return_value = True + mock_db.get_hardcover_details.return_value = None + ok, msg = service.push_dates_to_hardcover(1) + assert ok is False + assert "not linked" in msg + + def test_book_not_found(self, service, mock_db, mock_hc_client): + mock_hc_client.is_configured.return_value = True + hc_details = MagicMock() + hc_details.hardcover_book_id = "42" + mock_db.get_hardcover_details.return_value = hc_details + mock_db.get_book_by_id.return_value = None + ok, msg = service.push_dates_to_hardcover(1) + assert ok is False + assert "not found" in msg.lower() + + def test_no_local_dates(self, 
service, mock_db, mock_hc_client): + mock_hc_client.is_configured.return_value = True + hc_details = MagicMock() + hc_details.hardcover_book_id = "42" + mock_db.get_hardcover_details.return_value = hc_details + mock_db.get_book_by_id.return_value = _make_book() + ok, msg = service.push_dates_to_hardcover(1) + assert ok is False + assert "No local dates" in msg + + def test_creates_read_when_no_reads_exist(self, service, mock_db, mock_hc_client): + mock_hc_client.is_configured.return_value = True + hc_details = MagicMock() + hc_details.hardcover_book_id = "42" + mock_db.get_hardcover_details.return_value = hc_details + book = _make_book(started_at="2025-01-01") + mock_db.get_book_by_id.return_value = book + mock_hc_client.find_user_book.return_value = { + "id": 10, "edition_id": 5, "user_book_reads": [], + } + mock_hc_client.create_read_with_dates.return_value = 99 + + with patch("src.services.reading_date_service.log_hardcover_action"): + ok, msg = service.push_dates_to_hardcover(1) + + assert ok is True + assert "Created" in msg + mock_hc_client.create_read_with_dates.assert_called_once() + + def test_create_read_failure(self, service, mock_db, mock_hc_client): + mock_hc_client.is_configured.return_value = True + hc_details = MagicMock() + hc_details.hardcover_book_id = "42" + mock_db.get_hardcover_details.return_value = hc_details + mock_db.get_book_by_id.return_value = _make_book(started_at="2025-01-01") + mock_hc_client.find_user_book.return_value = { + "id": 10, "edition_id": 5, "user_book_reads": [], + } + mock_hc_client.create_read_with_dates.return_value = None + ok, msg = service.push_dates_to_hardcover(1) + assert ok is False + assert "Failed to create" in msg + + def test_updates_existing_read_missing_dates(self, service, mock_db, mock_hc_client): + """Default mode: only fills in missing HC dates.""" + mock_hc_client.is_configured.return_value = True + hc_details = MagicMock() + hc_details.hardcover_book_id = "42" + mock_db.get_hardcover_details.return_value = hc_details + mock_db.get_book_by_id.return_value = _make_book( + started_at="2025-01-01", finished_at="2025-02-01" + ) + mock_hc_client.find_user_book.return_value = { + "id": 10, "user_book_reads": [ + {"id": 7, "started_at": "2025-01-01", "finished_at": None} + ], + } + mock_hc_client.update_read_dates.return_value = True + + with patch("src.services.reading_date_service.log_hardcover_action"): + ok, msg = service.push_dates_to_hardcover(1) + + assert ok is True + # started_at should NOT be pushed (HC already has it); finished_at should be pushed + call_kwargs = mock_hc_client.update_read_dates.call_args + assert call_kwargs[1]["finished_at"] == "2025-02-01" + + def test_skips_when_dates_already_match(self, service, mock_db, mock_hc_client): + mock_hc_client.is_configured.return_value = True + hc_details = MagicMock() + hc_details.hardcover_book_id = "42" + mock_db.get_hardcover_details.return_value = hc_details + mock_db.get_book_by_id.return_value = _make_book(started_at="2025-01-01") + mock_hc_client.find_user_book.return_value = { + "id": 10, "user_book_reads": [ + {"id": 7, "started_at": "2025-01-01", "finished_at": None} + ], + } + ok, msg = service.push_dates_to_hardcover(1) + assert ok is False + assert "already match" in msg + + def test_force_overwrites_existing_dates(self, service, mock_db, mock_hc_client): + mock_hc_client.is_configured.return_value = True + hc_details = MagicMock() + hc_details.hardcover_book_id = "42" + mock_db.get_hardcover_details.return_value = hc_details + 
mock_db.get_book_by_id.return_value = _make_book( + started_at="2025-03-01", finished_at="2025-04-01" + ) + mock_hc_client.find_user_book.return_value = { + "id": 10, "user_book_reads": [ + {"id": 7, "started_at": "2025-01-01", "finished_at": "2025-02-01"} + ], + } + mock_hc_client.update_read_dates.return_value = True + + with patch("src.services.reading_date_service.log_hardcover_action"): + ok, msg = service.push_dates_to_hardcover(1, force=True) + + assert ok is True + call_kwargs = mock_hc_client.update_read_dates.call_args + assert call_kwargs[1]["started_at"] == "2025-03-01" + assert call_kwargs[1]["finished_at"] == "2025-04-01" + + def test_hardcover_rejects_update(self, service, mock_db, mock_hc_client): + mock_hc_client.is_configured.return_value = True + hc_details = MagicMock() + hc_details.hardcover_book_id = "42" + mock_db.get_hardcover_details.return_value = hc_details + mock_db.get_book_by_id.return_value = _make_book(started_at="2025-01-01") + mock_hc_client.find_user_book.return_value = { + "id": 10, "user_book_reads": [ + {"id": 7, "started_at": None, "finished_at": None} + ], + } + mock_hc_client.update_read_dates.return_value = False + ok, msg = service.push_dates_to_hardcover(1) + assert ok is False + assert "rejected" in msg + + def test_exception_returns_error_tuple(self, service, mock_hc_client): + """Any unexpected exception should be caught and return a clean error.""" + mock_hc_client.is_configured.side_effect = RuntimeError("boom") + ok, msg = service.push_dates_to_hardcover(1) + assert ok is False + assert "Unexpected error" in msg + + def test_user_book_not_found(self, service, mock_db, mock_hc_client): + mock_hc_client.is_configured.return_value = True + hc_details = MagicMock() + hc_details.hardcover_book_id = "42" + mock_db.get_hardcover_details.return_value = hc_details + mock_db.get_book_by_id.return_value = _make_book(started_at="2025-01-01") + mock_hc_client.find_user_book.return_value = None + ok, msg = service.push_dates_to_hardcover(1) + assert ok is False + assert "not found in your Hardcover library" in msg + + +# =========================================================================== +# pull_dates_from_hardcover +# =========================================================================== + +class TestPullDatesFromHardcover: + """Tests for pull_dates_from_hardcover.""" + + def test_not_configured(self, service, mock_hc_client): + mock_hc_client.is_configured.return_value = False + ok, msg, dates = service.pull_dates_from_hardcover(1) + assert ok is False + assert dates == {} + + def test_no_hc_reads(self, service, mock_db, mock_hc_client): + mock_hc_client.is_configured.return_value = True + hc_details = MagicMock() + hc_details.hardcover_book_id = "42" + mock_db.get_hardcover_details.return_value = hc_details + mock_db.get_book_by_id.return_value = _make_book() + mock_hc_client.find_user_book.return_value = {"user_book_reads": []} + ok, msg, dates = service.pull_dates_from_hardcover(1) + assert ok is False + assert "No reading sessions" in msg + + def test_truncates_iso_timestamps_to_date(self, service, mock_db, mock_hc_client): + """HC may return full ISO timestamps; service should truncate to YYYY-MM-DD.""" + mock_hc_client.is_configured.return_value = True + hc_details = MagicMock() + hc_details.hardcover_book_id = "42" + mock_db.get_hardcover_details.return_value = hc_details + book = _make_book() + mock_db.get_book_by_id.return_value = book + mock_hc_client.find_user_book.return_value = { + "user_book_reads": [ + {"started_at": 
"2025-03-15T14:30:00Z", "finished_at": "2025-04-20T09:00:00Z"} + ], + } + updated_book = _make_book(started_at="2025-03-15", finished_at="2025-04-20") + mock_db.get_book_by_ref.return_value = updated_book + + with patch("src.services.reading_date_service.log_hardcover_action"): + ok, msg, dates = service.pull_dates_from_hardcover(1) + + assert ok is True + # Verify DB was called with truncated dates + mock_db.update_book_reading_fields.assert_called_once_with( + 1, started_at="2025-03-15", finished_at="2025-04-20" + ) + + def test_no_dates_on_hardcover(self, service, mock_db, mock_hc_client): + mock_hc_client.is_configured.return_value = True + hc_details = MagicMock() + hc_details.hardcover_book_id = "42" + mock_db.get_hardcover_details.return_value = hc_details + mock_db.get_book_by_id.return_value = _make_book() + mock_hc_client.find_user_book.return_value = { + "user_book_reads": [{"started_at": None, "finished_at": None}], + } + ok, msg, dates = service.pull_dates_from_hardcover(1) + assert ok is False + assert "No dates found" in msg + + def test_local_dates_already_match(self, service, mock_db, mock_hc_client): + mock_hc_client.is_configured.return_value = True + hc_details = MagicMock() + hc_details.hardcover_book_id = "42" + mock_db.get_hardcover_details.return_value = hc_details + mock_db.get_book_by_id.return_value = _make_book(started_at="2025-01-01") + mock_hc_client.find_user_book.return_value = { + "user_book_reads": [{"started_at": "2025-01-01", "finished_at": None}], + } + ok, msg, dates = service.pull_dates_from_hardcover(1) + assert ok is False + assert "already match" in msg + + def test_exception_returns_error_triple(self, service, mock_hc_client): + mock_hc_client.is_configured.side_effect = RuntimeError("boom") + ok, msg, dates = service.pull_dates_from_hardcover(1) + assert ok is False + assert "Unexpected error" in msg + assert dates == {} + + +# =========================================================================== +# auto_complete_finished_books +# =========================================================================== + +class TestAutoCompleteFinishedBooks: + """Tests for auto_complete_finished_books.""" + + @patch("src.services.status_machine.StatusMachine") + def test_skips_non_active_books(self, MockMachine, service, mock_db): + mock_db.get_all_books.return_value = [ + _make_book(status="completed"), + _make_book(status="not_started"), + ] + container = MagicMock() + stats = service.auto_complete_finished_books(container) + assert stats == {"completed": 0, "errors": 0} + + @patch("src.services.status_machine.StatusMachine") + def test_individual_book_failure_continues(self, MockMachine, service, mock_db, mock_abs_client): + """One book raising should not stop processing of remaining books.""" + book1 = _make_book(id=1, status="active") + book2 = _make_book(id=2, status="active") + mock_db.get_all_books.return_value = [book1, book2] + + # Both books are "finished" by state + state = MagicMock() + state.percentage = 1.0 + mock_db.get_states_for_book.return_value = [state] + + # ABS not configured so pull_reading_dates is quick + mock_abs_client.is_configured.return_value = False + + machine_inst = MockMachine.return_value + # First book's transition raises, second succeeds + machine_inst.transition.side_effect = [RuntimeError("oops"), None] + + container = MagicMock() + container.sync_clients.return_value = {} + + stats = service.auto_complete_finished_books(container) + assert stats["errors"] == 1 + assert stats["completed"] == 1 + + 
@patch("src.services.status_machine.StatusMachine") + def test_below_threshold_not_completed(self, MockMachine, service, mock_db): + book = _make_book(status="active") + mock_db.get_all_books.return_value = [book] + + state = MagicMock() + state.percentage = 0.5 + mock_db.get_states_for_book.return_value = [state] + + container = MagicMock() + stats = service.auto_complete_finished_books(container) + assert stats["completed"] == 0 + + +# =========================================================================== +# _push_completion_to_clients +# =========================================================================== + +class TestPushCompletionToClients: + """Tests for _push_completion_to_clients (internal helper).""" + + def test_individual_client_failure_continues(self, service, mock_db): + """Failure pushing to one client should not prevent pushing to others.""" + book = _make_book(ebook_filename=None) # no Booklore push + container = MagicMock() + + client_a = MagicMock() + client_a.is_configured.return_value = True + client_a.update_progress.side_effect = ConnectionError("dead") + + client_b = MagicMock() + client_b.is_configured.return_value = True + client_b.update_progress.return_value = None + + container.sync_clients.return_value = {"storyteller": client_a, "bookfusion": client_b} + + service._push_completion_to_clients(book, container) + + # client_b should still have been called despite client_a failing + client_b.update_progress.assert_called_once() + # State saved only for the successful client + assert mock_db.save_state.call_count == 1 + + +# =========================================================================== +# push_booklore_read_status (module-level helper) +# =========================================================================== + +class TestPushBookloreReadStatus: + def test_exception_is_swallowed(self): + container = MagicMock() + bl_client = MagicMock() + container.booklore_client.return_value = bl_client + bl_client.is_configured.return_value = True + bl_client.update_read_status.side_effect = RuntimeError("Booklore down") + + book = _make_book(ebook_filename="book.epub") + # Should not raise + push_booklore_read_status(book, container, "READ") + + def test_skips_when_not_configured(self): + container = MagicMock() + bl_client = MagicMock() + container.booklore_client.return_value = bl_client + bl_client.is_configured.return_value = False + + book = _make_book(ebook_filename="book.epub") + push_booklore_read_status(book, container, "READ") + bl_client.update_read_status.assert_not_called() diff --git a/tests/test_reading_routes.py b/tests/test_reading_routes.py index 14987e5..fec3319 100644 --- a/tests/test_reading_routes.py +++ b/tests/test_reading_routes.py @@ -249,6 +249,7 @@ def test_reading_page_renders_log_and_stats_tabs(self): self.db.get_states_by_book.return_value = {None: [state]} self.db.get_booklore_by_filename.return_value = {} self.db.get_all_booklore_books.return_value = [] + self.db.get_all_hardcover_details.return_value = [] self.db.get_reading_goal.return_value = None resp = self.client.get('/reading') diff --git a/tests/test_reading_service.py b/tests/test_reading_service.py new file mode 100644 index 0000000..8f5a17e --- /dev/null +++ b/tests/test_reading_service.py @@ -0,0 +1,205 @@ +"""Tests for ReadingService and ReadingStatsService — error paths and edge cases.""" + +import sys +from pathlib import Path +from unittest.mock import Mock, patch + +import pytest + +sys.path.insert(0, str(Path(__file__).parent.parent)) + +from 
src.services.reading_service import ReadingService +from src.services.reading_stats_service import ReadingStatsService + +# --------------------------------------------------------------------------- +# ReadingStatsService.get_year_stats +# --------------------------------------------------------------------------- + +def _make_mock_book(**overrides): + book = Mock() + book.id = overrides.get('id', 1) + book.status = overrides.get('status', 'active') + book.finished_at = overrides.get('finished_at', None) + book.rating = overrides.get('rating', None) + book.title = overrides.get('title', 'Test Book') + return book + + +def _make_state(book_id, percentage=0.5): + state = Mock() + state.book_id = book_id + state.percentage = percentage + return state + + +class TestGetYearStatsNoStates: + """get_year_stats when there are no states or no books.""" + + def test_no_books_returns_zero_stats(self): + db = Mock() + db.get_all_books.return_value = [] + db.get_all_states.return_value = [] + db.get_reading_goal.return_value = None + + svc = ReadingStatsService(database_service=db) + result = svc.get_year_stats(2026) + + assert result['books_finished'] == 0 + assert result['currently_reading'] == 0 + assert result['total_tracked'] == 0 + assert result['average_rating'] is None + assert result['monthly_finished'] == [0] * 12 + assert result['goal_target'] is None + assert result['goal_percent'] is None + + def test_active_books_with_no_states_not_counted_as_reading(self): + """A book with 'active' status but zero progress is not 'genuinely reading'.""" + book = _make_mock_book(id=1, status='active') + db = Mock() + db.get_all_books.return_value = [book] + db.get_all_states.return_value = [] # No states at all + db.get_reading_goal.return_value = None + + svc = ReadingStatsService(database_service=db) + result = svc.get_year_stats(2026) + + assert result['currently_reading'] == 0 + assert result['total_tracked'] == 1 + + def test_active_book_with_low_progress_not_counted(self): + """Progress <= 1% means not genuinely reading.""" + book = _make_mock_book(id=1, status='active') + state = _make_state(1, percentage=0.005) # 0.5% + db = Mock() + db.get_all_books.return_value = [book] + db.get_all_states.return_value = [state] + db.get_reading_goal.return_value = None + + svc = ReadingStatsService(database_service=db) + result = svc.get_year_stats(2026) + + assert result['currently_reading'] == 0 + + def test_completed_book_counted_for_correct_year(self): + book = _make_mock_book(id=1, status='completed', finished_at='2026-06-15', rating=4.5) + db = Mock() + db.get_all_books.return_value = [book] + db.get_all_states.return_value = [] + db.get_reading_goal.return_value = None + + svc = ReadingStatsService(database_service=db) + result = svc.get_year_stats(2026) + + assert result['books_finished'] == 1 + assert result['monthly_finished'][5] == 1 # June = index 5 + assert result['average_rating'] == 4.5 + + def test_completed_book_wrong_year_not_counted(self): + book = _make_mock_book(id=1, status='completed', finished_at='2025-12-31') + db = Mock() + db.get_all_books.return_value = [book] + db.get_all_states.return_value = [] + db.get_reading_goal.return_value = None + + svc = ReadingStatsService(database_service=db) + result = svc.get_year_stats(2026) + + assert result['books_finished'] == 0 + + +class TestGetYearStatsWhenComputationRaises: + """Verify behavior when database calls raise exceptions.""" + + def test_get_all_books_raises_propagates(self): + """If the DB is down, get_year_stats should propagate 
the exception.""" + db = Mock() + db.get_all_books.side_effect = RuntimeError("DB connection lost") + + svc = ReadingStatsService(database_service=db) + + with pytest.raises(RuntimeError, match="DB connection lost"): + svc.get_year_stats(2026) + + def test_get_all_states_raises_propagates(self): + db = Mock() + db.get_all_books.return_value = [_make_mock_book()] + db.get_all_states.side_effect = RuntimeError("states table locked") + + svc = ReadingStatsService(database_service=db) + + with pytest.raises(RuntimeError, match="states table locked"): + svc.get_year_stats(2026) + + def test_get_reading_goal_raises_propagates(self): + db = Mock() + db.get_all_books.return_value = [] + db.get_all_states.return_value = [] + db.get_reading_goal.side_effect = RuntimeError("goal fetch failed") + + svc = ReadingStatsService(database_service=db) + + with pytest.raises(RuntimeError, match="goal fetch failed"): + svc.get_year_stats(2026) + + +# --------------------------------------------------------------------------- +# ReadingService — error paths +# --------------------------------------------------------------------------- + +class TestReadingServiceMaxProgress: + def test_max_progress_empty_states(self): + assert ReadingService.max_progress([]) == 0.0 + + def test_max_progress_as_percent(self): + states = [_make_state(1, 0.75), _make_state(1, 0.50)] + assert ReadingService.max_progress(states, as_percent=True) == 75.0 + + def test_max_progress_caps_at_100(self): + states = [_make_state(1, 1.5)] # overflows + assert ReadingService.max_progress(states, as_percent=True) == 100.0 + + +class TestReadingServiceSetProgress: + def test_set_progress_book_not_found(self): + db = Mock() + db.get_book_by_id.return_value = None + + svc = ReadingService(db) + result = svc.set_progress(999, 0.5, Mock()) + + assert result['success'] is False + assert 'not found' in result['error'].lower() + + @patch('src.services.reading_service.StatusMachine') + def test_set_progress_sync_propagation_failure_still_succeeds(self, _mock_sm): + """If sync propagation to clients fails, set_progress still returns success.""" + book = _make_mock_book(id=1, status='active') + book.abs_id = 'abs-1' + book.started_at = '2026-01-01' + db = Mock() + db.get_book_by_id.return_value = book + + container = Mock() + failing_client = Mock() + failing_client.is_configured.return_value = True + failing_client.update_progress.side_effect = ConnectionError("unreachable") + container.sync_clients.return_value = {'Storyteller': failing_client} + + svc = ReadingService(db) + result = svc.set_progress(1, 0.5, container) + + assert result['success'] is True + assert result['percentage'] == 0.5 + + +class TestReadingServiceUpdateStatus: + @patch('src.services.reading_service.StatusMachine') + def test_update_status_book_not_found(self, _mock_sm): + db = Mock() + db.get_book_by_id.return_value = None + + svc = ReadingService(db) + result = svc.update_status(999, 'active', Mock()) + + assert result['success'] is False + assert 'not found' in result['error'].lower() diff --git a/tests/test_settings_comprehensive.py b/tests/test_settings_comprehensive.py index e706912..6da208e 100644 --- a/tests/test_settings_comprehensive.py +++ b/tests/test_settings_comprehensive.py @@ -16,7 +16,7 @@ def __init__(self): self.mock_database_service = Mock() self.mock_database_service.get_all_settings.return_value = {} self.mock_sync_manager = Mock() - self.mock_sync_manager.get_abs_title.return_value = 'Test' + self.mock_sync_manager.get_audiobook_title.return_value = 'Test' 
def database_service(self): return self.mock_database_service def sync_manager(self): return self.mock_sync_manager diff --git a/tests/test_storyteller_submission.py b/tests/test_storyteller_submission.py index 78e9e76..7601579 100644 --- a/tests/test_storyteller_submission.py +++ b/tests/test_storyteller_submission.py @@ -25,7 +25,12 @@ def setUp(self): self.mock_db = Mock() # Default: no existing reservation (submit_book creates a new record) - self.mock_db.get_active_storyteller_submission.return_value = None + self.mock_db.get_active_storyteller_submission_by_book_id.return_value = None + + # Service resolves book by abs_id before submission lookups + mock_book = Mock() + mock_book.id = 1 + self.mock_db.get_book_by_abs_id.return_value = mock_book self.service = StorytellerSubmissionService( storyteller_client=self.mock_storyteller, @@ -107,7 +112,7 @@ def test_submit_creates_correct_directory_structure(self, mock_download): @patch.object(StorytellerSubmissionService, "_download_file", side_effect=lambda url, dest: dest) def test_submit_persists_submission_record_only_on_success(self, mock_download): # No existing reservation — should create a new submission record - self.mock_db.get_active_storyteller_submission.return_value = None + self.mock_db.get_active_storyteller_submission_by_book_id.return_value = None result = self.service.submit_book( abs_id="book-123", title="Test Book", @@ -196,19 +201,19 @@ def test_submit_cleans_up_on_audio_download_failure(self, mock_download): # ── check_status ── def test_check_status_returns_not_found_when_no_submission(self): - self.mock_db.get_storyteller_submission.return_value = None + self.mock_db.get_storyteller_submission_by_book_id.return_value = None assert self.service.check_status("book-123") == "not_found" def test_check_status_returns_ready_when_already_ready(self): submission = Mock() submission.status = "ready" - self.mock_db.get_storyteller_submission.return_value = submission + self.mock_db.get_storyteller_submission_by_book_id.return_value = submission assert self.service.check_status("book-123") == "ready" def test_check_status_returns_failed_when_already_failed(self): submission = Mock() submission.status = "failed" - self.mock_db.get_storyteller_submission.return_value = submission + self.mock_db.get_storyteller_submission_by_book_id.return_value = submission assert self.service.check_status("book-123") == "failed" @patch.dict(os.environ, {"STORYTELLER_ASSETS_DIR": ""}) @@ -218,7 +223,7 @@ def test_check_status_returns_processing_when_no_transcriptions(self): submission.submission_dir = "Test Book" submission.storyteller_uuid = None submission.submitted_at = datetime.utcnow() - self.mock_db.get_storyteller_submission.return_value = submission + self.mock_db.get_storyteller_submission_by_book_id.return_value = submission self.mock_storyteller.is_configured.return_value = False book = Mock() @@ -235,7 +240,7 @@ def test_check_status_returns_ready_when_transcriptions_exist(self): submission.submission_dir = "Test Book" submission.storyteller_uuid = None submission.submitted_at = datetime.utcnow() - self.mock_db.get_storyteller_submission.return_value = submission + self.mock_db.get_storyteller_submission_by_book_id.return_value = submission book = Mock() book.storyteller_uuid = None @@ -262,7 +267,7 @@ def test_check_status_returns_ready_via_uuid(self): submission.submission_dir = "Test Book" submission.storyteller_uuid = "st-uuid-123" submission.submitted_at = datetime.utcnow() - self.mock_db.get_storyteller_submission.return_value 
= submission + self.mock_db.get_storyteller_submission_by_book_id.return_value = submission self.mock_storyteller.get_word_timeline_chapters.return_value = [{"words": []}] @@ -280,7 +285,7 @@ def test_check_status_times_out_after_max_wait(self): submission.submission_dir = "Test Book" submission.storyteller_uuid = "st-uuid-123" submission.submitted_at = datetime.utcnow() - timedelta(hours=13) - self.mock_db.get_storyteller_submission.return_value = submission + self.mock_db.get_storyteller_submission_by_book_id.return_value = submission with patch.dict(os.environ, {"STORYTELLER_MAX_WAIT_HOURS": "12"}): result = self.service.check_status("book-123") @@ -296,7 +301,7 @@ def test_check_status_uuid_check_works_without_submission_dir(self): submission.submission_dir = None submission.storyteller_uuid = "st-uuid-456" submission.submitted_at = datetime.utcnow() - self.mock_db.get_storyteller_submission.return_value = submission + self.mock_db.get_storyteller_submission_by_book_id.return_value = submission self.mock_storyteller.get_word_timeline_chapters.return_value = [{"words": []}] @@ -316,7 +321,7 @@ def test_check_status_propagates_uuid_from_book(self): submission.submission_dir = "Test Book" submission.storyteller_uuid = None submission.submitted_at = datetime.utcnow() - self.mock_db.get_storyteller_submission.return_value = submission + self.mock_db.get_storyteller_submission_by_book_id.return_value = submission book = Mock() book.storyteller_uuid = "book-uuid-abc" @@ -337,7 +342,7 @@ def test_check_status_no_propagation_when_book_has_no_uuid(self): submission.submission_dir = "Test Book" submission.storyteller_uuid = None submission.submitted_at = datetime.utcnow() - self.mock_db.get_storyteller_submission.return_value = submission + self.mock_db.get_storyteller_submission_by_book_id.return_value = submission book = Mock() book.storyteller_uuid = None @@ -358,7 +363,7 @@ def test_check_status_fuzzy_dir_match_with_suffix(self): submission.submission_dir = "Bury Our Bones in the Midnight Soil" submission.storyteller_uuid = None submission.submitted_at = datetime.utcnow() - self.mock_db.get_storyteller_submission.return_value = submission + self.mock_db.get_storyteller_submission_by_book_id.return_value = submission book = Mock() book.storyteller_uuid = None @@ -384,7 +389,7 @@ def test_check_status_fuzzy_dir_no_false_positive(self): submission.submission_dir = "Test Book" submission.storyteller_uuid = None submission.submitted_at = datetime.utcnow() - self.mock_db.get_storyteller_submission.return_value = submission + self.mock_db.get_storyteller_submission_by_book_id.return_value = submission book = Mock() book.storyteller_uuid = None @@ -409,7 +414,7 @@ def test_check_status_fuzzy_dir_glob_special_chars(self): submission.submission_dir = "What If [Revised]" submission.storyteller_uuid = None submission.submitted_at = datetime.utcnow() - self.mock_db.get_storyteller_submission.return_value = submission + self.mock_db.get_storyteller_submission_by_book_id.return_value = submission book = Mock() book.storyteller_uuid = None @@ -466,7 +471,7 @@ def test_check_status_returns_ready_without_extra_queries(self): """When submission is already ready, no filesystem/API checks should run.""" submission = Mock() submission.status = "ready" - self.mock_db.get_storyteller_submission.return_value = submission + self.mock_db.get_storyteller_submission_by_book_id.return_value = submission assert self.service.check_status("book-123") == "ready" # Should NOT have called update_storyteller_submission_status 
self.mock_db.update_storyteller_submission_status.assert_not_called() diff --git a/tests/test_storyteller_wordtimeline.py b/tests/test_storyteller_wordtimeline.py index e7639b0..b613a8c 100644 --- a/tests/test_storyteller_wordtimeline.py +++ b/tests/test_storyteller_wordtimeline.py @@ -131,12 +131,12 @@ def setUp(self): self.service = AlignmentService(self.mock_db, self.polisher) def test_empty_chapters_returns_false(self): - result = self.service.align_storyteller_and_store('abs-1', [], 'Some ebook text here') + result = self.service.align_storyteller_and_store(1, [], 'Some ebook text here') self.assertFalse(result) def test_empty_words_returns_false(self): chapters = [{'words': []}] - result = self.service.align_storyteller_and_store('abs-1', chapters, 'Some text') + result = self.service.align_storyteller_and_store(1, chapters, 'Some text') self.assertFalse(result) def test_builds_segments_and_aligns(self): @@ -149,7 +149,7 @@ def test_builds_segments_and_aligns(self): chapters = [{'words': words}] - result = self.service.align_storyteller_and_store('abs-test', chapters, text) + result = self.service.align_storyteller_and_store(42, chapters, text) self.assertTrue(result) # Verify _save_alignment was called @@ -165,7 +165,7 @@ def test_linear_fallback_on_no_anchors(self): # Ebook text has completely different content ebook_text = "Completely different content that shares no words with the transcript" - result = self.service.align_storyteller_and_store('abs-fallback', chapters, ebook_text) + result = self.service.align_storyteller_and_store(99, chapters, ebook_text) self.assertTrue(result) diff --git a/tests/test_suggestions_feature.py b/tests/test_suggestions_feature.py index 3dd7056..440eb1d 100644 --- a/tests/test_suggestions_feature.py +++ b/tests/test_suggestions_feature.py @@ -43,7 +43,7 @@ def __init__(self): # Link up the manager self.mock_sync_manager.abs_client = self.mock_abs_client - self.mock_sync_manager.get_abs_title.return_value = 'Test Book Title' + self.mock_sync_manager.get_audiobook_title.return_value = 'Test Book Title' self.mock_sync_manager.get_duration.return_value = 3600 def sync_manager(self): return self.mock_sync_manager @@ -186,6 +186,7 @@ def test_suggestions_page_renders_hidden_section(self): SimpleNamespace( id=1, source_id='visible-1', + source='abs', title='Visible Book', author='Visible Author', cover_url=None, @@ -196,6 +197,7 @@ def test_suggestions_page_renders_hidden_section(self): SimpleNamespace( id=2, source_id='hidden-1', + source='abs', title='Hidden Book', author='Hidden Author', cover_url=None, @@ -222,8 +224,8 @@ def test_hide_and_unhide_suggestion_api(self): self.assertEqual(hide_response.status_code, 200) self.assertEqual(unhide_response.status_code, 200) - self.mock_container.mock_database_service.hide_suggestion.assert_called_once_with('test-source') - self.mock_container.mock_database_service.unhide_suggestion.assert_called_once_with('test-source') + self.mock_container.mock_database_service.hide_suggestion.assert_called_once_with('test-source', source='abs') + self.mock_container.mock_database_service.unhide_suggestion.assert_called_once_with('test-source', source='abs') if __name__ == '__main__': unittest.main() diff --git a/tests/test_sync_concurrency.py b/tests/test_sync_concurrency.py new file mode 100644 index 0000000..0a6a34d --- /dev/null +++ b/tests/test_sync_concurrency.py @@ -0,0 +1,281 @@ +"""Concurrency tests for SyncManager parallel fetching and sync_cycle locking.""" + +import threading +import time +from unittest.mock 
import MagicMock, Mock, PropertyMock, patch + +import pytest + +from src.sync_manager import SyncManager + +# --------------------------------------------------------------------------- +# Helpers +# --------------------------------------------------------------------------- + +def _make_sync_manager(**overrides): + """Build a SyncManager with fully mocked dependencies (skips startup_checks).""" + db = Mock() + db.get_all_settings.return_value = {} + db.get_books_by_status.return_value = [] + db.get_all_books.return_value = [] + db.get_book_by_id.return_value = None + + defaults = dict( + abs_client=Mock(), + booklore_client=Mock(), + hardcover_client=Mock(), + transcriber=Mock(), + ebook_parser=Mock(), + database_service=db, + storyteller_client=Mock(), + sync_clients={}, + alignment_service=Mock(), + library_service=Mock(), + migration_service=Mock(), + suggestion_service=Mock(), + background_job_service=Mock(), + data_dir=None, + books_dir=None, + epub_cache_dir='/tmp/test_epub_cache', + ) + defaults.update(overrides) + + with patch.object(SyncManager, 'startup_checks'): + mgr = SyncManager(**defaults) + return mgr + + +def _make_mock_client(name, state=None, delay=0, error=None): + """Create a mock sync client whose get_service_state behaves as configured.""" + client = Mock() + client.is_configured.return_value = True + client.check_connection.return_value = True + client.fetch_bulk_state.return_value = None + + def get_service_state(book, prev_state, title_snip, bulk_ctx=None): + if delay: + time.sleep(delay) + if error: + raise error + return state + + client.get_service_state = get_service_state + return client + + +# --------------------------------------------------------------------------- +# _fetch_states_parallel: one client times out / raises +# --------------------------------------------------------------------------- + +class TestFetchStatesParallel: + """_fetch_states_parallel resilience when individual clients fail.""" + + def test_one_client_raises_others_still_return(self): + """If one client throws, the other clients' states are still collected.""" + good_state = Mock() + good_state.delta = 0.1 + good_state.current = {'pct': 0.5} + + clients = { + 'Good': _make_mock_client('Good', state=good_state), + 'Bad': _make_mock_client('Bad', error=RuntimeError('API down')), + } + + mgr = _make_sync_manager(sync_clients=clients) + # Manually set sync_clients since _setup_sync_clients filters by is_configured + mgr.sync_clients = clients + + book = Mock() + book.abs_id = 'test-book' + result = mgr._fetch_states_parallel( + book, prev_states_by_client={}, title_snip='Test', + clients_to_use=clients, + ) + + assert 'Good' in result + assert result['Good'] is good_state + assert 'Bad' not in result + + def test_one_client_times_out_others_still_return(self): + """A slow client that exceeds the timeout does not block results from fast clients.""" + fast_state = Mock() + fast_state.delta = 0.05 + fast_state.current = {'pct': 0.3} + + clients = { + 'Fast': _make_mock_client('Fast', state=fast_state), + 'Slow': _make_mock_client('Slow', state=Mock(), delay=2), + } + + mgr = _make_sync_manager(sync_clients=clients) + mgr.sync_clients = clients + + book = Mock() + book.abs_id = 'timeout-book' + + result = mgr._fetch_states_parallel( + book, prev_states_by_client={}, title_snip='Timeout', + clients_to_use=clients, + ) + + # Fast client should always be present + assert 'Fast' in result + assert result['Fast'] is fast_state + # Slow client may or may not be present depending on executor 
timeout; + # the important thing is that the call completes and Fast is not lost + + def test_all_clients_raise_returns_empty(self): + """When every client fails, result is an empty dict (no crash).""" + clients = { + 'A': _make_mock_client('A', error=ConnectionError('refused')), + 'B': _make_mock_client('B', error=TimeoutError('too slow')), + } + + mgr = _make_sync_manager(sync_clients=clients) + mgr.sync_clients = clients + + book = Mock() + book.abs_id = 'all-fail' + result = mgr._fetch_states_parallel( + book, prev_states_by_client={}, title_snip='Fail', + clients_to_use=clients, + ) + + assert result == {} + + def test_client_returns_none_is_excluded(self): + """A client that returns None is not included in results.""" + clients = { + 'NoneClient': _make_mock_client('NoneClient', state=None), + } + + mgr = _make_sync_manager(sync_clients=clients) + mgr.sync_clients = clients + + book = Mock() + book.abs_id = 'none-book' + result = mgr._fetch_states_parallel( + book, prev_states_by_client={}, title_snip='None', + clients_to_use=clients, + ) + + assert 'NoneClient' not in result + + +# --------------------------------------------------------------------------- +# sync_cycle concurrent access: no state corruption +# --------------------------------------------------------------------------- + +class TestSyncCycleConcurrency: + """sync_cycle called from two threads must not corrupt shared state.""" + + def test_concurrent_daemon_calls_one_wins(self): + """Two daemon (non-targeted) sync_cycle calls: only one runs, the other skips.""" + mgr = _make_sync_manager() + mgr.database_service.get_books_by_status.return_value = [] + + call_count = 0 + original_internal = mgr._sync_cycle_internal + + def counting_internal(target_book_id=None): + nonlocal call_count + call_count += 1 + time.sleep(0.2) # simulate work so overlap is likely + original_internal(target_book_id) + + mgr._sync_cycle_internal = counting_internal + + t1 = threading.Thread(target=mgr.sync_cycle) + t2 = threading.Thread(target=mgr.sync_cycle) + t1.start() + t2.start() + t1.join(timeout=5) + t2.join(timeout=5) + + # Exactly one should have run; the other was skipped (non-blocking acquire) + assert call_count == 1 + + def test_targeted_sync_waits_for_daemon(self): + """Instant-sync (targeted) waits for the daemon lock, then runs.""" + mgr = _make_sync_manager() + mgr.database_service.get_books_by_status.return_value = [] + + book = Mock() + book.abs_id = 'targeted-book' + book.id = 42 + book.status = 'active' + mgr.database_service.get_book_by_id.return_value = book + + call_order = [] + + original_internal = mgr._sync_cycle_internal + + def tracking_internal(target_book_id=None): + label = 'targeted' if target_book_id else 'daemon' + call_order.append(f'{label}_start') + time.sleep(0.1) + original_internal(target_book_id) + call_order.append(f'{label}_end') + + mgr._sync_cycle_internal = tracking_internal + + # Start daemon first, then targeted shortly after + t_daemon = threading.Thread(target=mgr.sync_cycle) + t_targeted = threading.Thread(target=lambda: mgr.sync_cycle(target_book_id=42)) + + t_daemon.start() + time.sleep(0.02) # let daemon grab the lock first + t_targeted.start() + + t_daemon.join(timeout=5) + t_targeted.join(timeout=12) + + # Both should have run (targeted waits up to 10s) + assert 'daemon_start' in call_order + assert 'daemon_end' in call_order + assert 'targeted_start' in call_order + assert 'targeted_end' in call_order + + # daemon must finish before targeted starts + daemon_end_idx = 
call_order.index('daemon_end') + targeted_start_idx = call_order.index('targeted_start') + assert daemon_end_idx < targeted_start_idx + + def test_sync_lock_not_held_after_exception(self): + """If _sync_cycle_internal raises, the lock is still released.""" + mgr = _make_sync_manager() + + mgr._sync_cycle_internal = Mock(side_effect=RuntimeError('unexpected')) + + mgr.sync_cycle() # should not raise — exception is caught + + # Lock must be released: another acquire should succeed immediately + acquired = mgr._sync_lock.acquire(blocking=False) + assert acquired, "Lock was not released after exception" + mgr._sync_lock.release() + + def test_pending_clears_not_corrupted_by_concurrent_access(self): + """_pending_clears set modifications under lock are safe across threads.""" + mgr = _make_sync_manager() + + def writer(): + for i in range(100): + with mgr._pending_clears_lock: + mgr._pending_clears.add(i) + time.sleep(0.001) + + def reader(): + for _ in range(100): + with mgr._pending_clears_lock: + # Just read — should never see a partial state + set(mgr._pending_clears) # read under lock + time.sleep(0.001) + + threads = [threading.Thread(target=writer), threading.Thread(target=reader)] + for t in threads: + t.start() + for t in threads: + t.join(timeout=5) + + # All 100 items should be present + assert len(mgr._pending_clears) == 100 diff --git a/tests/test_tbr_repository.py b/tests/test_tbr_repository.py index f73a048..85e441b 100644 --- a/tests/test_tbr_repository.py +++ b/tests/test_tbr_repository.py @@ -114,20 +114,20 @@ def test_no_dedup_without_keys(self): # -- Linking -- def test_link_tbr_to_book(self): - """link_tbr_to_book sets book_abs_id on the item.""" + """link_tbr_to_book sets book_id on the item.""" from src.db.models import Book book = self.db.save_book(Book(abs_id='abs-123', title='Owned Copy', status='active')) item, _ = self.db.add_tbr_item('Dune') - self.assertIsNone(item.book_abs_id) + self.assertIsNone(item.book_id) - linked = self.db.link_tbr_to_book(item.id, book.id, book_abs_id='abs-123') + linked = self.db.link_tbr_to_book(item.id, book.id) self.assertIsNotNone(linked) - self.assertEqual(linked.book_abs_id, 'abs-123') + self.assertEqual(linked.book_id, book.id) # Verify persisted refreshed = self.db.get_tbr_item(item.id) - self.assertEqual(refreshed.book_abs_id, 'abs-123') + self.assertEqual(refreshed.book_id, book.id) def test_link_tbr_not_found(self): """link_tbr_to_book returns None for missing item ID.""" @@ -177,7 +177,7 @@ def test_save_hardcover_details_auto_links_tbr(self): # Create a TBR item with hardcover_book_id=42 (not yet linked) item, created = self.db.add_tbr_item('Dune', hardcover_book_id=42) self.assertTrue(created) - self.assertIsNone(item.book_abs_id) + self.assertIsNone(item.book_id) # Save HardcoverDetails linking abs-1 to hardcover_book_id=42 hc = HardcoverDetails(abs_id='abs-1', book_id=book.id, hardcover_book_id='42') @@ -185,7 +185,7 @@ def test_save_hardcover_details_auto_links_tbr(self): # Verify TBR item is now linked refreshed = self.db.get_tbr_item(item.id) - self.assertEqual(refreshed.book_abs_id, 'abs-1') + self.assertEqual(refreshed.book_id, book.id) def test_save_hardcover_details_no_tbr_match(self): """save_hardcover_details with no matching TBR item is a no-op.""" @@ -204,14 +204,14 @@ def test_save_hardcover_details_skips_already_linked(self): book2 = self.db.save_book(Book(abs_id='abs-2', title='Dune Messiah', status='active')) item, _ = self.db.add_tbr_item('Dune', hardcover_book_id=42) - self.db.link_tbr_to_book(item.id, 
book2.id, book_abs_id='abs-2') # pre-linked to abs-2 + self.db.link_tbr_to_book(item.id, book2.id) # pre-linked to book2 hc = HardcoverDetails(abs_id='abs-1', book_id=book1.id, hardcover_book_id='42') self.db.save_hardcover_details(hc) - # Should still be linked to abs-2, not overwritten to abs-1 + # Should still be linked to book2, not overwritten to book1 refreshed = self.db.get_tbr_item(item.id) - self.assertEqual(refreshed.book_abs_id, 'abs-2') + self.assertEqual(refreshed.book_id, book2.id) # -- Source filtering -- diff --git a/tests/test_webserver.py b/tests/test_webserver.py index 6f97857..f59e421 100644 --- a/tests/test_webserver.py +++ b/tests/test_webserver.py @@ -28,8 +28,8 @@ def __init__(self): self.mock_database_service = Mock() self.mock_database_service.get_all_settings.return_value = {} # Default empty settings self.mock_database_service.get_book_by_ref.return_value = None - self.mock_database_service.get_bookfusion_linked_abs_ids.return_value = set() - self.mock_database_service.get_bookfusion_highlight_counts.return_value = {} + self.mock_database_service.get_bookfusion_linked_book_ids.return_value = set() + self.mock_database_service.get_bookfusion_highlight_counts_by_book_id.return_value = {} self.mock_ebook_parser = Mock() self.mock_sync_clients = Mock() self.mock_bookfusion_client = Mock() @@ -46,7 +46,7 @@ def __init__(self): self.mock_sync_manager.abs_client = self.mock_abs_client self.mock_sync_manager.booklore_client = self.mock_booklore_client self.mock_sync_manager.storyteller_client = self.mock_storyteller_client - self.mock_sync_manager.get_abs_title.return_value = 'Test Book Title' + self.mock_sync_manager.get_audiobook_title.return_value = 'Test Book Title' self.mock_sync_manager.get_duration.return_value = 3600 self.mock_sync_manager.clear_progress = Mock() From cc29fa4dde287861b549866785938fdfd0aba7ef Mon Sep 17 00:00:00 2001 From: Sarah Wolff Date: Fri, 27 Mar 2026 10:42:09 -0400 Subject: [PATCH 02/13] Align Hardcover API usage with docs, cache read IDs (#6) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * Fix remaining book_id migration issues, guard Booklore cache, scope suggestions (#20) Completes the ABS ID decoupling by fixing service/repository methods that still used abs_id as lookup keys, removing 19 dead backward-compat methods, and cleaning up unnecessary abs_id parameters. Key changes: - Fix reading_stats, alignment, storyteller, dashboard lookups to use book_id - Guard Booklore cache loading behind is_configured() for unconfigured instances - Scope suggestion operations by (source_id, source) composite key with unique index migration, preventing collisions across ABS/KoSync/Booklore - Remove dead is_hash_linked_to_device methods from kosync and suggestion repos - Add 14 new tests for book_id resolution, suggestion scoping, and alignment ops All 458 tests passing. 
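To make the (source_id, source) scoping concrete, a minimal sketch of the composite-key pattern follows. The model and method names below (PendingSuggestion, hide_suggestion) are inferred from this commit text and the test assertions, not copied verbatim from the codebase, and the real change enforces uniqueness via an Alembic unique index (ix_pending_suggestions_source_id_source, shown later in this patch) rather than a model-level constraint:

    # Sketch: scope suggestion rows by a (source_id, source) composite key so the
    # same source_id arriving from different services (abs/kosync/booklore) cannot
    # collide or shadow one another. Names are illustrative, not the project's code.
    from sqlalchemy import Column, Integer, String, UniqueConstraint, create_engine
    from sqlalchemy.orm import Session, declarative_base

    Base = declarative_base()

    class PendingSuggestion(Base):
        __tablename__ = "pending_suggestions"
        id = Column(Integer, primary_key=True)
        source_id = Column(String, nullable=False)   # ID within the originating service
        source = Column(String, nullable=False)      # 'abs', 'kosync', 'booklore', ...
        status = Column(String, default="visible")
        __table_args__ = (
            UniqueConstraint("source_id", "source", name="ix_pending_suggestions_source_id_source"),
        )

    def hide_suggestion(session, source_id, source):
        # Both halves of the composite key are required; filtering on source_id
        # alone could hide an unrelated row that another service produced.
        row = (
            session.query(PendingSuggestion)
            .filter_by(source_id=source_id, source=source)
            .one_or_none()
        )
        if row:
            row.status = "hidden"
            session.commit()

    engine = create_engine("sqlite://")
    Base.metadata.create_all(engine)
    with Session(engine) as session:
        session.add_all([
            PendingSuggestion(source_id="test-source", source="abs"),
            PendingSuggestion(source_id="test-source", source="kosync"),
        ])
        session.commit()
        hide_suggestion(session, "test-source", source="abs")  # only the ABS row is hidden

Passing both halves of the key is the point: "test-source" from ABS and "test-source" from KoSync now resolve to different rows, which is exactly what the updated hide/unhide test assertions check.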
* Fix database upgrade safety issues from v0.1.4 compatibility review - Guard save_state() against double-NULL book_id/abs_id lookup - Isolate per-column error handling in _ensure_model_columns - Log orphaned rows in nullable table backfill migration - Remove dead delete_hardcover_details_by_book_id method * Fix abs_id→book_id migration gaps from CodeRabbit review (#50) Fixes 6 issues found during v0.1.5 PR review: - Restore _rdAbsId JS variable in reading_detail.html (all action buttons broken) - Key KoSync debounce, poll cache, and write-suppression by book.id not abs_id (ebook-only books have abs_id=None, collapsing all into one dict entry) - Fix link_kosync_document to set linked_abs_id for backward compat; query linked/unlinked docs by linked_book_id (the canonical FK) - Guard get_book_by_abs_id(None) with early return - Gate Base.metadata.create_all() on migration success * Smart mode defaults: auto-detect available services Default to Ebook Only mode when ABS is not configured. Detect all ebook sources (Booklore, CWA, ABS ebook libs, local /books mount). Disable mode buttons that have no backing service. Update subtitle from "ABS library" to "audiobook library". * Fix missing BookFusion covers and broken onerror fallback Skip ABS cover proxy for bf- prefixed books (always 404'd), deduplicate dashboard cover waterfall into resolve_book_covers(), fix onerror chain so placeholder shows when KoSync fallback also fails, add branded BookFusion placeholder logo. * Consolidate suggestion serializer and remove dismissed status Move _serialize_suggestion into helpers.py as shared utility, removing duplicate definitions from api.py and matching_bp.py. Unify dismissed → hidden status throughout suggestion_repository. Allow suggestion rescan to proceed when ABS is unconfigured (BookFusion-only setups). Pass storyteller_configured flag to match/batch_match templates. * Hide Storyteller UI when unconfigured and fix ABS cover proxy fallback Conditionally hide Storyteller column in match/batch_match when the integration is not configured. ABS cover proxy now falls back to using the raw book_ref as abs_id when no book record exists, allowing direct ABS ID lookups without a mapped book. * Show service logo placeholder when book cover is unavailable Add placeholder_logo field to mapping/book data dicts, determined by primary source (BookFusion, Booklore, or Audiobookshelf). Display the logo in all cover placeholder divs across dashboard, reading log, reading detail, and backlog cards. * Deduplicate placeholder_logo logic, fix cover proxy streaming, and fix N+1 query Extract resolve_placeholder_logo() into cover_resolver.py and return it from resolve_book_covers(), removing duplicate 4-branch conditionals from dashboard.py and reading_bp.py. Drop unnecessary stream=True from cover proxy requests that immediately buffer via .content. Bulk-fetch Hardcover details on the reading page to avoid per-book N+1 queries. 
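The N+1 fix above follows the standard bulk-fetch shape: collect all book IDs up front, issue one IN query, and join in memory via a dict. A rough sketch, assuming HardcoverDetails is the SQLAlchemy model the tests reference (the helper name get_hardcover_details_bulk and the import path are illustrative, not necessarily the repo's actual method):

    # Sketch: replace per-book lookups (one query per book, i.e. N+1) with a
    # single IN query keyed by book_id.
    from sqlalchemy import select

    from src.db.models import HardcoverDetails  # import path assumed from the tests

    def get_hardcover_details_bulk(session, book_ids):
        if not book_ids:
            return {}
        rows = (
            session.execute(
                select(HardcoverDetails).where(HardcoverDetails.book_id.in_(book_ids))
            )
            .scalars()
            .all()
        )
        # One round trip for every book on the page.
        return {row.book_id: row for row in rows}

    # Usage on the reading page:
    #   details_by_id = get_hardcover_details_bulk(session, [b.id for b in books])
    #   for book in books:
    #       hc = details_by_id.get(book.id)  # dict lookup, no extra query

The page then resolves Hardcover details with a dict lookup per book instead of a query per book, which is where the N+1 savings come from.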
* KoSync system overhaul: service extraction, document management, bug fixes Major refactoring and feature additions for the KoSync subsystem: Service extraction: - Extract 375 lines of business logic from kosync_server.py into new KosyncService class (src/services/kosync_service.py) - Decompose _try_find_epub_by_hash (151 lines) into 3 focused methods - Remove dead code: _hash_cache, unused repository methods KoSync Document Management page (/kosync-documents): - New page accessible from Settings > KoSync tab - Three sections: Healthy, Needs Attention, Stale (30+ days) - Actions: Link to Book (search), Link to Self, Create Book, Clear Hash, Unlink, Delete - Rich context: book titles, time-ago indicators, device vs bot labels - Dashboard "Pending Identification" section for unlinked hashes with reading progress Bug fixes: - Fix sync direction inversion: mixed text-matched and percentage-fallback normalization could elect wrong leader (Entitled at 39% over Booklore at 45%) - Fix Booklore get_text_from_current_state using wrong filename - Fix Booklore 2 not showing as pairing option when only BL2 enabled - Fix Booklore crash on books with no ebook filename - Fix ebook-only books showing as unlinked (linked_abs_id vs linked_book_id) - Fix Link to Self sending empty body (Flask 400) - Fix external KoSync server missing credential fields and secret handling - Prevent orphaned hashes by creating KosyncDocument on every book save Improvements: - Rename abs-kosync-bot to pagekeeper-bot, centralize in constants.py - Remove legacy bot names (book-stitch, book-sync) - Redesign KoSync settings tab: sync source at top, conditional sections - Auto-create books for exact ABS title matches (skip suggestion approval) - Downgrade noisy no-progress warnings to debug - Include book title in Instant Sync log message - Add external KoSync server credential fields (KOSYNC_SERVER_USER/KEY) * Address remaining review findings: error handling, atomicity, TypeError guard * fix(hardcover): align API usage with Hardcover docs, cache read IDs - Replace undocumented `public` field with `privacy_setting_id` on lists - Add `search_by_asin` using dedicated `editions.asin` field - Prefer `reading_format_id` over undocumented format string fields - Cache `user_book_read_id` to skip extra API call per progress update - Add `distinct_on: book_id` to user_books queries per docs - Increase request timeout from 10s to 20s (server allows 30s) - Clamp ratings to 0-5 range with 0.5 increments - Add `get_book_series` method for series metadata - Extract dominant color from `cached_image` into `cover_color` * fix: address PR #6 review findings - Add author/subtitle to save_book update_attrs (silent field drops) - Prefer linked_book_id over linked_abs_id in KoSync lookups (ebook-only support) - Persist book before passing to _get_or_create_user_book (id=None guard) - Remove max(ts_gap, 1) clamp in alignment sentinel detection - Fix test_matching_errors to mock get_kosync_id_for_ebook (correct target) - Set linked_book_id=None on MagicMock kosync docs in tests * fix: address low-priority PR #6 review findings - Cast title/authors to str in bookfusion upload (type safety) - Use coalesce in get_latest_jobs_bulk join (NULL timestamp handling) - Replace sleep with threading.Event in concurrency test (deterministic) - Remove early break in auto_link_by_title (link all matches) --- ...t0u1_add_suggestion_source_unique_index.py | 12 +- src/api/hardcover_client.py | 165 ++++-- src/blueprints/abs_bp.py | 14 +- src/blueprints/bookfusion_bp.py | 
339 ++++++------ src/blueprints/dashboard.py | 28 +- src/blueprints/matching_bp.py | 42 +- src/blueprints/reading_bp.py | 488 ++++++++++-------- src/db/book_repository.py | 126 +++-- src/db/bookfusion_repository.py | 6 +- src/db/hardcover_repository.py | 1 - src/db/kosync_repository.py | 15 +- src/db/suggestion_repository.py | 145 ++++-- src/services/alignment_service.py | 266 +++++----- src/services/hardcover_service.py | 264 +++++----- src/services/kosync_service.py | 66 ++- src/sync_clients/hardcover_sync_client.py | 20 +- src/utils/cover_resolver.py | 20 +- templates/settings.html | 1 + tests/test_abs_socket_listener.py | 3 + tests/test_book_status.py | 225 ++++---- tests/test_hardcover_sync_client.py | 6 +- tests/test_matching_errors.py | 32 +- tests/test_sync_concurrency.py | 95 ++-- 23 files changed, 1349 insertions(+), 1030 deletions(-) diff --git a/alembic/versions/p6q7r8s9t0u1_add_suggestion_source_unique_index.py b/alembic/versions/p6q7r8s9t0u1_add_suggestion_source_unique_index.py index a7baf7f..ecbe9a0 100644 --- a/alembic/versions/p6q7r8s9t0u1_add_suggestion_source_unique_index.py +++ b/alembic/versions/p6q7r8s9t0u1_add_suggestion_source_unique_index.py @@ -15,8 +15,8 @@ from alembic import op # revision identifiers, used by Alembic. -revision: str = 'p6q7r8s9t0u1' -down_revision: str = 'o7p8q9r0s1t2' +revision: str = "p6q7r8s9t0u1" +down_revision: str = "o7p8q9r0s1t2" branch_labels: Sequence[str] | None = None depends_on: str | None = None @@ -24,9 +24,9 @@ def upgrade() -> None: try: op.create_index( - 'ix_pending_suggestions_source_id_source', - 'pending_suggestions', - ['source_id', 'source'], + "ix_pending_suggestions_source_id_source", + "pending_suggestions", + ["source_id", "source"], unique=True, ) except sa.exc.OperationalError: @@ -35,6 +35,6 @@ def upgrade() -> None: def downgrade() -> None: try: - op.drop_index('ix_pending_suggestions_source_id_source', table_name='pending_suggestions') + op.drop_index("ix_pending_suggestions_source_id_source", table_name="pending_suggestions") except sa.exc.OperationalError: pass diff --git a/src/api/hardcover_client.py b/src/api/hardcover_client.py index 9e5cf01..ab4ca68 100644 --- a/src/api/hardcover_client.py +++ b/src/api/hardcover_client.py @@ -92,7 +92,7 @@ def query(self, query: str, variables: dict | None = None) -> dict | None: self.api_url, json={"query": query, "variables": variables or {}}, headers=self.headers, - timeout=10, + timeout=20, ) # Handle rate limiting (429) with exponential backoff @@ -111,7 +111,7 @@ def query(self, query: str, variables: dict | None = None) -> dict | None: self.api_url, json={"query": query, "variables": variables or {}}, headers=self.headers, - timeout=10, + timeout=20, ) backoff *= 2 @@ -156,7 +156,7 @@ def get_user_book(self, book_id): query = """ query GetUserBook($book_id: Int!, $user_id: Int!) { - user_books(where: {book_id: {_eq: $book_id}, user_id: {_eq: $user_id}}) { + user_books(where: {book_id: {_eq: $book_id}, user_id: {_eq: $user_id}}, distinct_on: book_id) { id status_id } @@ -179,7 +179,7 @@ def get_user_book(self, book_id): def _extract_cover_url(self, cached_image) -> str | None: """Extract a cover image URL from the cached_image jsonb field. - The field is a JSON object like {"url": "https://..."} or similar. + The field is a JSON object like {"url": "https://...", "color": "#hex"} or similar. 
""" if not cached_image: return None @@ -189,6 +189,12 @@ def _extract_cover_url(self, cached_image) -> str | None: return cached_image return None + def _extract_cover_color(self, cached_image) -> str | None: + """Extract the dominant color from cached_image (hex string like '#1a2b3c').""" + if isinstance(cached_image, dict): + return cached_image.get("color") + return None + def _normalize_book(self, book: dict) -> dict: """Extract standard book metadata from a Hardcover book object. @@ -208,6 +214,7 @@ def _normalize_book(self, book: dict) -> dict: "title": book.get("title", ""), "author": authors[0] if authors else "", "cached_image": self._extract_cover_url(book.get("cached_image")), + "cover_color": self._extract_cover_color(book.get("cached_image")), "slug": book.get("slug"), "pages": book.get("pages"), "rating": parsed_rating, @@ -270,6 +277,36 @@ def search_by_isbn(self, isbn: str) -> dict | None: } return None + def search_by_asin(self, asin: str) -> dict | None: + """Search by ASIN using the dedicated editions.asin field.""" + query = """ + query ($asin: String!) { + editions(where: { asin: { _eq: $asin } }) { + id + pages + book { + id + title + slug + cached_image + } + } + } + """ + + result = self.query(query, {"asin": str(asin)}) + if result and result.get("editions") and len(result["editions"]) > 0: + edition = result["editions"][0] + return { + "book_id": edition["book"]["id"], + "slug": edition["book"].get("slug"), + "edition_id": edition["id"], + "pages": edition["pages"], + "title": edition["book"]["title"], + "cached_image": self._extract_cover_url(edition["book"].get("cached_image")), + } + return None + def search_by_title_author(self, title: str, author: str | None = None) -> dict | None: """Search by title and author, returning the best fuzzy match.""" # Clean the input title for better matching comparison @@ -439,6 +476,8 @@ def get_book_author(self, book_id: int) -> str | None: return authors[0] return None + READING_FORMAT_LABELS = {1: "Physical", 2: "Audiobook", 3: "Physical + Audio", 4: "eBook"} + def get_book_editions(self, book_id: int) -> list: """Fetch all editions for a book with format, pages, duration, and year.""" query = """ @@ -447,6 +486,7 @@ def get_book_editions(self, book_id: int) -> list: id pages audio_seconds + reading_format_id edition_format physical_format release_date @@ -457,18 +497,17 @@ def get_book_editions(self, book_id: int) -> list: if result and result.get("editions"): editions = [] for ed in result["editions"]: - # Determine format label: prefer edition_format, fall back to physical_format - format_label = ed.get("edition_format") or ed.get("physical_format") + format_label = self.READING_FORMAT_LABELS.get(ed.get("reading_format_id")) + if not format_label: + format_label = ed.get("edition_format") or ed.get("physical_format") if not format_label: - # Infer format from available data if ed.get("audio_seconds") and ed.get("audio_seconds") > 0: format_label = "Audiobook" elif ed.get("pages") and ed.get("pages") > 0: format_label = "Book" else: format_label = "Unknown" - # Normalize format label - if format_label and format_label != "Unknown": + elif format_label not in self.READING_FORMAT_LABELS.values(): format_lower = format_label.lower() if format_lower == "ebook": format_label = "eBook" @@ -607,7 +646,7 @@ def find_user_book(self, book_id: int) -> dict | None: """Find existing user_book with read info.""" query = """ query ($bookId: Int!, $userId: Int!) 
{ - user_books(where: { book_id: { _eq: $bookId }, user_id: { _eq: $userId }}) { + user_books(where: { book_id: { _eq: $bookId }, user_id: { _eq: $userId }}, distinct_on: book_id) { id status_id edition_id @@ -766,39 +805,46 @@ def update_progress( audio_seconds: int = None, started_at: str = None, finished_at: str = None, - ) -> bool: + cached_read_id: int = None, + ) -> dict | None: """ - Update reading progress. + Update reading progress. Returns {"success": bool, "read_id": int} or None. + Uses current_percentage > 0.02 (2%) to decide when to set 'started_at'. For audiobook editions, pass audio_seconds to use progress_seconds instead of progress_pages. Optional started_at/finished_at (YYYY-MM-DD strings) override the default of using today's date when filling missing dates on the Hardcover read. - """ - # First check if there's an existing read - read_query = """ - query ($userBookId: Int!) { - user_book_reads(where: { user_book_id: { _eq: $userBookId }}, order_by: {id: desc}, limit: 1) { - id - started_at - finished_at - } - } - """ - read_result = self.query(read_query, {"userBookId": user_book_id}) + If cached_read_id is provided, skips the fetch query for existing reads. + """ today = self._get_today_date() - - # LOGIC: Only set started date if we are past 2% should_start = current_percentage > 0.02 - if ( - read_result - and read_result.get("user_book_reads") - and len(read_result["user_book_reads"]) > 0 - ): + # Use cached read ID if available, otherwise fetch + existing_read = None + if cached_read_id: + existing_read = {"id": cached_read_id, "started_at": None, "finished_at": None} + else: + read_query = """ + query ($userBookId: Int!) { + user_book_reads(where: { user_book_id: { _eq: $userBookId }}, order_by: {id: desc}, limit: 1) { + id + started_at + finished_at + } + } + """ + read_result = self.query(read_query, {"userBookId": user_book_id}) + if ( + read_result + and read_result.get("user_book_reads") + and len(read_result["user_book_reads"]) > 0 + ): + existing_read = read_result["user_book_reads"][0] + + if existing_read: # --- UPDATE EXISTING READ --- - existing_read = read_result["user_book_reads"][0] read_id = existing_read["id"] started_at_val = existing_read.get("started_at") @@ -866,9 +912,9 @@ def update_progress( if result and result.get("update_user_book_read"): if result["update_user_book_read"].get("error"): - return False - return True - return False + return None + return {"success": True, "read_id": read_id} + return None else: # --- CREATE NEW READ --- @@ -927,9 +973,11 @@ def update_progress( if result and result.get("insert_user_book_read"): if result["insert_user_book_read"].get("error"): - return False - return True - return False + return None + read = result["insert_user_book_read"].get("user_book_read") + new_read_id = read["id"] if read else None + return {"success": True, "read_id": new_read_id} + return None def get_book_metadata(self, book_id: int) -> dict | None: """Fetch enrichment metadata for a book. @@ -1174,6 +1222,42 @@ def get_all_editions(self, book_id: int) -> dict: editions['audio'] = book["default_audio_edition"] return editions + def get_book_series(self, book_id: int) -> list[dict]: + """Fetch series info for a book (excluding compilations).""" + query = """ + query ($bookId: Int!) 
{ + books_by_pk(id: $bookId) { + book_series(where: {compilation: {_eq: false}}) { + position + details + series { + id + name + slug + books_count + } + } + } + } + """ + result = self.query(query, {"bookId": book_id}) + if not result or not result.get("books_by_pk"): + return [] + + entries = result["books_by_pk"].get("book_series", []) + return [ + { + "position": bs.get("position"), + "details": bs.get("details"), + "series_id": bs["series"]["id"], + "series_name": bs["series"]["name"], + "series_slug": bs["series"].get("slug"), + "series_books_count": bs["series"].get("books_count"), + } + for bs in entries + if bs.get("series") + ] + # ── TBR / Want-to-Read methods ── def get_want_to_read_books(self) -> list[dict]: @@ -1232,7 +1316,7 @@ def get_user_lists(self) -> list[dict]: name description books_count - public + privacy_setting_id updated_at } } @@ -1247,7 +1331,8 @@ def get_user_lists(self) -> list[dict]: "name": lst.get("name", ""), "description": lst.get("description", ""), "books_count": lst.get("books_count", 0), - "public": lst.get("public", False), + "privacy_setting_id": lst.get("privacy_setting_id", 3), + "public": lst.get("privacy_setting_id") == 1, "updated_at": lst.get("updated_at"), } for lst in result["lists"] diff --git a/src/blueprints/abs_bp.py b/src/blueprints/abs_bp.py index 67aaab5..f5c27ce 100644 --- a/src/blueprints/abs_bp.py +++ b/src/blueprints/abs_bp.py @@ -10,10 +10,10 @@ logger = logging.getLogger(__name__) -abs_bp = Blueprint('abs', __name__) +abs_bp = Blueprint("abs", __name__) -@abs_bp.route('/api/abs/libraries', methods=['GET']) +@abs_bp.route("/api/abs/libraries", methods=["GET"]) def get_abs_libraries(): """Return available ABS libraries.""" abs_service = get_abs_service() @@ -23,13 +23,13 @@ def get_abs_libraries(): return jsonify(libraries) -@abs_bp.route('/api/cover-proxy/') +@abs_bp.route("/api/cover-proxy/") def proxy_cover(book_ref): """Proxy cover access with local caching for offline resilience.""" book = get_database_service().get_book_by_ref(book_ref) abs_id = book.abs_id if book and book.abs_id else book_ref - if not re.fullmatch(r'[a-zA-Z0-9_\-]+', abs_id): + if not re.fullmatch(r"[a-zA-Z0-9_\-]+", abs_id): return "Invalid ID", 400 covers_dir = get_covers_dir() @@ -50,8 +50,8 @@ def proxy_cover(book_ref): cache_file.write_bytes(data) except Exception: logger.debug(f"Failed to cache cover for '{abs_id}'") - resp = Response(data, content_type=req.headers.get('content-type', 'image/jpeg')) - resp.headers['Cache-Control'] = 'public, max-age=86400, immutable' + resp = Response(data, content_type=req.headers.get("content-type", "image/jpeg")) + resp.headers["Cache-Control"] = "public, max-age=86400, immutable" return resp except Exception as e: logger.error(f"Error proxying cover for '{abs_id}': {e}") @@ -59,7 +59,7 @@ def proxy_cover(book_ref): # Fall back to local cache if cache_file.exists(): resp = send_from_directory(covers_dir, cache_file.name) - resp.headers['Cache-Control'] = 'public, max-age=86400, immutable' + resp.headers["Cache-Control"] = "public, max-age=86400, immutable" return resp return "Cover not found", 404 diff --git a/src/blueprints/bookfusion_bp.py b/src/blueprints/bookfusion_bp.py index 4065ec1..c995e33 100644 --- a/src/blueprints/bookfusion_bp.py +++ b/src/blueprints/bookfusion_bp.py @@ -13,24 +13,24 @@ logger = logging.getLogger(__name__) -bookfusion_bp = Blueprint('bookfusion', __name__) +bookfusion_bp = Blueprint("bookfusion", __name__) -SUPPORTED_FORMATS = {'.epub', '.mobi', '.azw3', '.pdf', '.azw', '.fb2', 
'.cbz', '.cbr'} +SUPPORTED_FORMATS = {".epub", ".mobi", ".azw3", ".pdf", ".azw", ".fb2", ".cbz", ".cbr"} def _is_supported(filename: str) -> bool: return any(filename.lower().endswith(ext) for ext in SUPPORTED_FORMATS) -@bookfusion_bp.route('/bookfusion') +@bookfusion_bp.route("/bookfusion") def bookfusion_page(): - return render_template('bookfusion.html') + return render_template("bookfusion.html") -@bookfusion_bp.route('/api/bookfusion/booklore-books') +@bookfusion_bp.route("/api/bookfusion/booklore-books") def booklore_books(): """List Booklore books for upload selection, filtered by supported formats.""" - q = request.args.get('q', '').strip() + q = request.args.get("q", "").strip() results = [] client = get_booklore_client() @@ -38,62 +38,64 @@ def booklore_books(): try: label = current_app.config.get("BOOKLORE_LABEL", "Booklore") books = client.search_books(q) if q else client.get_all_books() - for b in (books or []): - fname = b.get('fileName', '') + for b in books or []: + fname = b.get("fileName", "") if not _is_supported(fname): continue - results.append({ - 'id': b.get('id'), - 'title': b.get('title', ''), - 'authors': b.get('authors', ''), - 'fileName': fname, - 'source': label, - }) + results.append( + { + "id": b.get("id"), + "title": b.get("title", ""), + "authors": b.get("authors", ""), + "fileName": fname, + "source": label, + } + ) except Exception as e: logger.warning(f"Booklore search failed: {e}") return jsonify(results) -@bookfusion_bp.route('/api/bookfusion/upload', methods=['POST']) +@bookfusion_bp.route("/api/bookfusion/upload", methods=["POST"]) def upload_book(): """Upload a book from Booklore to BookFusion.""" data = request.get_json() if not data: - return jsonify({'error': 'No data provided'}), 400 + return jsonify({"error": "No data provided"}), 400 - book_id = data.get('book_id') - title = data.get('title', '') - authors = data.get('authors', '') - filename = data.get('fileName', '') + book_id = data.get("book_id") + title = str(data.get("title", "")) + authors = str(data.get("authors", "")) + filename = data.get("fileName", "") if not book_id: - return jsonify({'error': 'book_id required'}), 400 + return jsonify({"error": "book_id required"}), 400 container = get_container() bf_client = container.bookfusion_client() if not bf_client.upload_api_key: - return jsonify({'error': 'BookFusion upload API key not configured'}), 400 + return jsonify({"error": "BookFusion upload API key not configured"}), 400 bl_client = get_booklore_client() if not bl_client.is_configured(): - return jsonify({'error': 'Booklore not configured'}), 400 + return jsonify({"error": "Booklore not configured"}), 400 # Download from Booklore file_bytes = bl_client.download_book(book_id) if not file_bytes: - return jsonify({'error': 'Failed to download book from Booklore'}), 500 + return jsonify({"error": "Failed to download book from Booklore"}), 500 # Upload to BookFusion logger.info(f"BookFusion upload request: title='{title}', authors='{authors}', filename='{filename}'") result = bf_client.upload_book(filename, file_bytes, title, authors) if result: - return jsonify({'success': True, 'result': result}) - return jsonify({'error': 'Upload to BookFusion failed'}), 500 + return jsonify({"success": True, "result": result}) + return jsonify({"error": "Upload to BookFusion failed"}), 500 -@bookfusion_bp.route('/api/bookfusion/sync-highlights', methods=['POST']) +@bookfusion_bp.route("/api/bookfusion/sync-highlights", methods=["POST"]) def sync_highlights(): """Trigger highlight sync from 
BookFusion.""" container = get_container() @@ -101,27 +103,27 @@ def sync_highlights(): db_service = get_database_service() if not bf_client.highlights_api_key: - return jsonify({'error': 'BookFusion highlights API key not configured'}), 400 + return jsonify({"error": "BookFusion highlights API key not configured"}), 400 data = request.get_json(silent=True) or {} - if data.get('full_resync'): + if data.get("full_resync"): db_service.set_bookfusion_sync_cursor(None) try: result = bf_client.sync_all_highlights(db_service) matched = _auto_match_highlights(db_service) - return jsonify({ - 'success': True, - 'new_highlights': result['new_highlights'], - 'books_saved': result['books_saved'], - 'auto_matched': matched, - 'new_ids': result.get('new_ids', []), - }) + return jsonify( + { + "success": True, + "new_highlights": result["new_highlights"], + "books_saved": result["books_saved"], + "auto_matched": matched, + "new_ids": result.get("new_ids", []), + } + ) except Exception: logger.exception("BookFusion highlight sync failed") - return jsonify({'error': 'BookFusion highlight sync failed'}), 500 - - + return jsonify({"error": "BookFusion highlight sync failed"}), 500 def _auto_match_highlights(db_service) -> int: @@ -144,7 +146,7 @@ def _auto_match_highlights(db_service) -> int: # Group unmatched by book_title title_groups: dict[str, list] = {} for hl in unmatched: - title = clean_book_title(hl.book_title or '') + title = clean_book_title(hl.book_title or "") title_groups.setdefault(title, []).append(hl) matched_count = 0 @@ -212,23 +214,23 @@ def _estimate_reading_dates(db_service, abs_id: str, bookfusion_ids: list[str], new_details = HardcoverDetails( abs_id=abs_id, book_id=book.id, - hardcover_book_id=str(search_result['book_id']), - hardcover_slug=search_result.get('slug'), - hardcover_cover_url=search_result.get('cached_image'), - matched_by='title', + hardcover_book_id=str(search_result["book_id"]), + hardcover_slug=search_result.get("slug"), + hardcover_cover_url=search_result.get("cached_image"), + matched_by="title", ) db_service.save_hardcover_details(new_details) - user_book = hc_client.find_user_book(search_result['book_id']) + user_book = hc_client.find_user_book(search_result["book_id"]) if user_book: - reads = user_book.get('user_book_reads', []) + reads = user_book.get("user_book_reads", []) if reads: read = reads[0] - if read.get('started_at'): - started_at = read['started_at'] - if read.get('finished_at'): - finished_at = read['finished_at'] + if read.get("started_at"): + started_at = read["started_at"] + if read.get("finished_at"): + finished_at = read["finished_at"] if started_at or finished_at: - source = 'hardcover' + source = "hardcover" except Exception as e: logger.debug(f"Hardcover date lookup failed for '{abs_id}': {e}") @@ -237,10 +239,10 @@ def _estimate_reading_dates(db_service, abs_id: str, bookfusion_ids: list[str], date_range = db_service.get_bookfusion_highlight_date_range(bookfusion_ids) if date_range: earliest, latest, count = date_range - started_at = earliest.strftime('%Y-%m-%d') if earliest else None + started_at = earliest.strftime("%Y-%m-%d") if earliest else None if count > 1 and latest: - finished_at = latest.strftime('%Y-%m-%d') - source = 'highlights' + finished_at = latest.strftime("%Y-%m-%d") + source = "highlights" estimated = True if not source: @@ -249,9 +251,9 @@ def _estimate_reading_dates(db_service, abs_id: str, bookfusion_ids: list[str], # Apply dates and status updates = {} if started_at: - updates['started_at'] = started_at + 
updates["started_at"] = started_at if finished_at: - updates['finished_at'] = finished_at + updates["finished_at"] = finished_at if updates: updated_book = db_service.update_book_reading_fields(book.id, **updates) if updated_book: @@ -260,21 +262,22 @@ def _estimate_reading_dates(db_service, abs_id: str, bookfusion_ids: list[str], # Update status via ReadingService for proper journal entries + HC sync if finished_at or started_at: from src.services.reading_service import ReadingService + reading_svc = ReadingService(db_service) - target_status = 'completed' if finished_at else 'active' + target_status = "completed" if finished_at else "active" if book.status != target_status: reading_svc.update_status(abs_id, target_status, container) return { - 'dates_set': True, - 'dates_source': source, - 'dates_estimated': estimated, - 'started_at': started_at, - 'finished_at': finished_at, + "dates_set": True, + "dates_source": source, + "dates_estimated": estimated, + "started_at": started_at, + "finished_at": finished_at, } -@bookfusion_bp.route('/api/bookfusion/highlights') +@bookfusion_bp.route("/api/bookfusion/highlights") def get_highlights(): """Return cached highlights from DB, grouped by book.""" db_service = get_database_service() @@ -282,32 +285,34 @@ def get_highlights(): grouped = {} for hl in highlights: - key = (hl.bookfusion_book_id, hl.matched_abs_id, clean_book_title(hl.book_title or 'Unknown Book')) + key = (hl.bookfusion_book_id, hl.matched_abs_id, clean_book_title(hl.book_title or "Unknown Book")) if key not in grouped: grouped[key] = { - 'highlights': [], - 'matched_abs_id': hl.matched_abs_id, - 'bookfusion_book_id': hl.bookfusion_book_id, - 'display_title': clean_book_title(hl.book_title or 'Unknown Book'), + "highlights": [], + "matched_abs_id": hl.matched_abs_id, + "bookfusion_book_id": hl.bookfusion_book_id, + "display_title": clean_book_title(hl.book_title or "Unknown Book"), + } + date_str = hl.highlighted_at.strftime("%Y-%m-%d %H:%M:%S") if hl.highlighted_at else None + grouped[key]["highlights"].append( + { + "id": hl.id, + "highlight_id": hl.highlight_id, + "quote": hl.quote_text or hl.content, + "date": date_str, + "chapter_heading": hl.chapter_heading, + "matched_abs_id": hl.matched_abs_id, } - date_str = hl.highlighted_at.strftime('%Y-%m-%d %H:%M:%S') if hl.highlighted_at else None - grouped[key]['highlights'].append({ - 'id': hl.id, - 'highlight_id': hl.highlight_id, - 'quote': hl.quote_text or hl.content, - 'date': date_str, - 'chapter_heading': hl.chapter_heading, - 'matched_abs_id': hl.matched_abs_id, - }) + ) # Sort highlights within each book by date for key in grouped: - grouped[key]['highlights'].sort(key=lambda h: h['date'] or '', reverse=True) + grouped[key]["highlights"].sort(key=lambda h: h["date"] or "", reverse=True) # Re-key by display title for the frontend (API contract uses title as key) display = {} for _key, group in grouped.items(): - title = group.pop('display_title') + title = group.pop("display_title") # Disambiguate if two different books share the same cleaned title display_key = title if display_key in display: @@ -318,23 +323,23 @@ def get_highlights(): # Include list of PageKeeper books for journal matching books = db_service.get_all_books() - book_list = [{'abs_id': b.abs_id, 'title': b.title} for b in books if b.title] + book_list = [{"abs_id": b.abs_id, "title": b.title} for b in books if b.title] - return jsonify({'highlights': display, 'has_synced': cursor is not None, 'books': book_list}) + return jsonify({"highlights": display, 
"has_synced": cursor is not None, "books": book_list}) -@bookfusion_bp.route('/api/bookfusion/link-highlight', methods=['POST']) +@bookfusion_bp.route("/api/bookfusion/link-highlight", methods=["POST"]) def link_highlight(): """Manually link or unlink a BookFusion book's highlights to a PageKeeper book.""" data = request.get_json() if not data: - return jsonify({'error': 'No data provided'}), 400 + return jsonify({"error": "No data provided"}), 400 - bookfusion_book_id = data.get('bookfusion_book_id') - abs_id = data.get('abs_id') # None or empty to unlink + bookfusion_book_id = data.get("bookfusion_book_id") + abs_id = data.get("abs_id") # None or empty to unlink if not bookfusion_book_id: - return jsonify({'error': 'bookfusion_book_id required'}), 400 + return jsonify({"error": "bookfusion_book_id required"}), 400 db_service = get_database_service() if abs_id: @@ -343,21 +348,21 @@ def link_highlight(): else: book_id = None db_service.link_bookfusion_highlights_by_book_id(bookfusion_book_id, book_id) - return jsonify({'success': True}) + return jsonify({"success": True}) -@bookfusion_bp.route('/api/bookfusion/save-journal', methods=['POST']) +@bookfusion_bp.route("/api/bookfusion/save-journal", methods=["POST"]) def save_highlight_to_journal(): """Save BookFusion highlights as reading journal entries for a book.""" data = request.get_json() if not data: - return jsonify({'error': 'No data provided'}), 400 + return jsonify({"error": "No data provided"}), 400 - abs_id = data.get('abs_id') - highlights = data.get('highlights', []) + abs_id = data.get("abs_id") + highlights = data.get("highlights", []) if not abs_id: - return jsonify({'error': 'abs_id required'}), 400 + return jsonify({"error": "abs_id required"}), 400 # When no highlights provided in the request, fetch them server-side if not highlights: @@ -365,26 +370,28 @@ def save_highlight_to_journal(): book = db_service.get_book_by_ref(abs_id) bf_highlights = db_service.get_bookfusion_highlights_for_book_by_book_id(book.id) if book else [] if not bf_highlights: - return jsonify({'error': 'No highlights found for this book'}), 400 + return jsonify({"error": "No highlights found for this book"}), 400 highlights = [] for hl in bf_highlights: - highlights.append({ - 'quote': hl.quote_text or hl.content, - 'chapter': hl.chapter_heading or '', - 'highlighted_at': hl.highlighted_at.strftime('%Y-%m-%d %H:%M:%S') if hl.highlighted_at else '', - }) + highlights.append( + { + "quote": hl.quote_text or hl.content, + "chapter": hl.chapter_heading or "", + "highlighted_at": hl.highlighted_at.strftime("%Y-%m-%d %H:%M:%S") if hl.highlighted_at else "", + } + ) db_service = get_database_service() book = db_service.get_book_by_ref(abs_id) if not book: - return jsonify({'error': 'Book not found'}), 404 + return jsonify({"error": "Book not found"}), 404 cleanup_stats = db_service.cleanup_bookfusion_import_notes(abs_id) saved = 0 for hl in highlights: - quote = hl.get('quote', '').strip() - chapter = hl.get('chapter', '') - highlighted_at_raw = (hl.get('highlighted_at') or '').strip() + quote = hl.get("quote", "").strip() + chapter = hl.get("chapter", "") + highlighted_at_raw = (hl.get("highlighted_at") or "").strip() if not quote: continue entry = quote @@ -392,7 +399,7 @@ def save_highlight_to_journal(): entry += f"\n— {chapter}" created_at = None if highlighted_at_raw: - for fmt in ('%Y-%m-%d %H:%M:%S', '%b %d, %Y'): + for fmt in ("%Y-%m-%d %H:%M:%S", "%b %d, %Y"): try: created_at = datetime.strptime(highlighted_at_raw, fmt) break @@ -401,15 +408,15 @@ 
def save_highlight_to_journal(): if not created_at: logger.debug("Could not parse BookFusion highlight timestamp '%s'", highlighted_at_raw) try: - db_service.add_reading_journal(book.id, 'highlight', entry=entry, created_at=created_at, abs_id=book.abs_id) + db_service.add_reading_journal(book.id, "highlight", entry=entry, created_at=created_at, abs_id=book.abs_id) saved += 1 except Exception as e: logger.warning(f"Failed to save journal entry: {e}") - return jsonify({'success': True, 'saved': saved, 'cleanup': cleanup_stats}) + return jsonify({"success": True, "saved": saved, "cleanup": cleanup_stats}) -@bookfusion_bp.route('/api/bookfusion/library') +@bookfusion_bp.route("/api/bookfusion/library") def get_library(): """Return BookFusion library catalog for the Library tab, merging duplicate titles.""" db_service = get_database_service() @@ -418,12 +425,12 @@ def get_library(): # Check which books are already on the dashboard (by bf- prefix or highlight match) all_books = db_service.get_all_books() dashboard_ids = {b.abs_id for b in all_books} - book_list = [{'abs_id': b.abs_id, 'title': b.title} for b in all_books if b.title] + book_list = [{"abs_id": b.abs_id, "title": b.title} for b in all_books if b.title] # Group by normalized title to merge format duplicates groups = defaultdict(list) for b in bf_books: - norm = normalize_title(b.title or b.filename or '') + norm = normalize_title(b.title or b.filename or "") groups[norm].append(b) result = [] @@ -432,10 +439,10 @@ def get_library(): group.sort(key=lambda b: b.highlight_count or 0, reverse=True) primary = group[0] - title = clean_book_title(primary.title or primary.filename or '') - authors = '' - series = '' - tags = '' + title = clean_book_title(primary.title or primary.filename or "") + authors = "" + series = "" + tags = "" for b in group: if not authors and b.authors: authors = b.authors @@ -462,62 +469,64 @@ def get_library(): # A group is hidden if any entry in the group is hidden is_hidden = any(b.hidden for b in group) - result.append({ - 'bookfusion_id': bookfusion_ids[0], - 'bookfusion_ids': bookfusion_ids, - 'title': title, - 'authors': authors, - 'filenames': filenames, - 'filename': primary.filename or '', - 'series': series, - 'tags': tags, - 'highlight_count': highlight_count, - 'on_dashboard': matched_abs_id is not None, - 'abs_id': matched_abs_id, - 'hidden': is_hidden, - }) - - return jsonify({'books': result, 'dashboard_books': book_list}) - - -@bookfusion_bp.route('/api/bookfusion/add-to-dashboard', methods=['POST']) + result.append( + { + "bookfusion_id": bookfusion_ids[0], + "bookfusion_ids": bookfusion_ids, + "title": title, + "authors": authors, + "filenames": filenames, + "filename": primary.filename or "", + "series": series, + "tags": tags, + "highlight_count": highlight_count, + "on_dashboard": matched_abs_id is not None, + "abs_id": matched_abs_id, + "hidden": is_hidden, + } + ) + + return jsonify({"books": result, "dashboard_books": book_list}) + + +@bookfusion_bp.route("/api/bookfusion/add-to-dashboard", methods=["POST"]) def add_to_dashboard(): """Add a BookFusion book to the reading dashboard.""" data = request.get_json() if not data: - return jsonify({'error': 'No data provided'}), 400 + return jsonify({"error": "No data provided"}), 400 - bookfusion_ids = data.get('bookfusion_ids') or [] + bookfusion_ids = data.get("bookfusion_ids") or [] if not bookfusion_ids: - single = data.get('bookfusion_id') + single = data.get("bookfusion_id") if single: bookfusion_ids = [single] if not bookfusion_ids: - 
return jsonify({'error': 'bookfusion_id required'}), 400 + return jsonify({"error": "bookfusion_id required"}), 400 primary_id = bookfusion_ids[0] db_service = get_database_service() bf_book = db_service.get_bookfusion_book(primary_id) if not bf_book: - return jsonify({'error': 'BookFusion book not found in catalog'}), 404 + return jsonify({"error": "BookFusion book not found in catalog"}), 404 abs_id = f"bf-{primary_id}" # Check if already on dashboard existing = db_service.get_book_by_ref(abs_id) if existing: - return jsonify({'success': True, 'abs_id': abs_id, 'already_existed': True}) + return jsonify({"success": True, "abs_id": abs_id, "already_existed": True}) # Create dashboard book entry - title = clean_book_title(bf_book.title or bf_book.filename or 'Unknown') - initial_status = data.get('status', 'not_started') - if initial_status not in ('not_started', 'active'): - initial_status = 'not_started' + title = clean_book_title(bf_book.title or bf_book.filename or "Unknown") + initial_status = data.get("status", "not_started") + if initial_status not in ("not_started", "active"): + initial_status = "not_started" book = Book( abs_id=abs_id, title=title, status=initial_status, - sync_mode='ebook_only', + sync_mode="ebook_only", ) db_service.save_book(book, is_new=True) @@ -533,33 +542,33 @@ def add_to_dashboard(): # Auto-populate reading dates date_info = _estimate_reading_dates(db_service, abs_id, bookfusion_ids, title) - resp = {'success': True, 'abs_id': abs_id} + resp = {"success": True, "abs_id": abs_id} resp.update(date_info) return jsonify(resp) -@bookfusion_bp.route('/api/bookfusion/match-to-book', methods=['POST']) +@bookfusion_bp.route("/api/bookfusion/match-to-book", methods=["POST"]) def match_to_book(): """Match a BookFusion catalog book to an existing dashboard book (link highlights).""" data = request.get_json() if not data: - return jsonify({'error': 'No data provided'}), 400 + return jsonify({"error": "No data provided"}), 400 - bookfusion_ids = data.get('bookfusion_ids') or [] + bookfusion_ids = data.get("bookfusion_ids") or [] if not bookfusion_ids: - single = data.get('bookfusion_id') + single = data.get("bookfusion_id") if single: bookfusion_ids = [single] - abs_id = data.get('abs_id') # None/empty to unlink + abs_id = data.get("abs_id") # None/empty to unlink if not bookfusion_ids: - return jsonify({'error': 'bookfusion_id required'}), 400 + return jsonify({"error": "bookfusion_id required"}), 400 db_service = get_database_service() book = db_service.get_book_by_ref(abs_id) if abs_id else None if abs_id and not book: - return jsonify({'error': 'Book not found'}), 404 + return jsonify({"error": "Book not found"}), 404 book_id = book.id if book else None @@ -568,51 +577,51 @@ def match_to_book(): db_service.set_bookfusion_book_match_by_book_id(bid, book_id) db_service.link_bookfusion_highlights_by_book_id(bid, book_id) - resp = {'success': True, 'abs_id': abs_id} + resp = {"success": True, "abs_id": abs_id} # Auto-populate reading dates if linking (not unlinking) if abs_id: - title = book.title if book else '' + title = book.title if book else "" date_info = _estimate_reading_dates(db_service, abs_id, bookfusion_ids, title) resp.update(date_info) return jsonify(resp) -@bookfusion_bp.route('/api/bookfusion/hide', methods=['POST']) +@bookfusion_bp.route("/api/bookfusion/hide", methods=["POST"]) def hide_book(): """Hide or unhide a BookFusion library book.""" data = request.get_json() if not data: - return jsonify({'error': 'No data provided'}), 400 + return 
jsonify({"error": "No data provided"}), 400 - bookfusion_ids = data.get('bookfusion_ids') or [] + bookfusion_ids = data.get("bookfusion_ids") or [] if not bookfusion_ids: - single = data.get('bookfusion_id') + single = data.get("bookfusion_id") if single: bookfusion_ids = [single] if not bookfusion_ids: - return jsonify({'error': 'bookfusion_id required'}), 400 + return jsonify({"error": "bookfusion_id required"}), 400 - hidden = data.get('hidden', True) + hidden = data.get("hidden", True) db_service = get_database_service() db_service.set_bookfusion_books_hidden(bookfusion_ids, hidden) - return jsonify({'success': True}) + return jsonify({"success": True}) -@bookfusion_bp.route('/api/bookfusion/unlink', methods=['POST']) +@bookfusion_bp.route("/api/bookfusion/unlink", methods=["POST"]) def unlink_book(): """Unlink a BookFusion book from a dashboard book.""" data = request.get_json() if not data: - return jsonify({'error': 'No data provided'}), 400 + return jsonify({"error": "No data provided"}), 400 - abs_id = data.get('abs_id') + abs_id = data.get("abs_id") if not abs_id: - return jsonify({'error': 'abs_id required'}), 400 + return jsonify({"error": "abs_id required"}), 400 db_service = get_database_service() book = db_service.get_book_by_ref(abs_id) if book: db_service.unlink_bookfusion_by_book_id(book.id) - return jsonify({'success': True}) + return jsonify({"success": True}) diff --git a/src/blueprints/dashboard.py b/src/blueprints/dashboard.py index 16f2c57..c9b3554 100644 --- a/src/blueprints/dashboard.py +++ b/src/blueprints/dashboard.py @@ -100,8 +100,8 @@ def _run_date_sync(): # Merge Booklore 2 into booklore flag so templates show the service # when either instance is configured - if integrations.get('booklore2') and not integrations.get('booklore'): - integrations['booklore'] = True + if integrations.get("booklore2") and not integrations.get("booklore"): + integrations["booklore"] = True # BookFusion integration status bf_client = container.bookfusion_client() @@ -335,11 +335,15 @@ def _run_date_sync(): mapping["last_sync"] = "Never" covers = resolve_book_covers( - book, abs_service, database_service, book_type, - booklore_meta=bl_meta, hardcover_details=hardcover_details, + book, + abs_service, + database_service, + book_type, + booklore_meta=bl_meta, + hardcover_details=hardcover_details, ) - mapping["cover_url"] = covers['cover_url'] - mapping["placeholder_logo"] = covers['placeholder_logo'] + mapping["cover_url"] = covers["cover_url"] + mapping["placeholder_logo"] = covers["placeholder_logo"] duration = mapping.get("duration", 0) progress_pct = mapping.get("unified_progress", 0) @@ -366,17 +370,19 @@ def _run_date_sync(): # Unlinked KoSync documents — for dashboard toast + pending identification section kosync_unlinked_count = 0 unlinked_reading = [] - kosync_active = os.environ.get('KOSYNC_ENABLED', '').lower() in ('true', '1', 'yes', 'on') or os.environ.get('KOSYNC_SERVER', '') + kosync_active = os.environ.get("KOSYNC_ENABLED", "").lower() in ("true", "1", "yes", "on") or os.environ.get( + "KOSYNC_SERVER", "" + ) if kosync_active: try: unlinked_docs = database_service.get_unlinked_kosync_documents() kosync_unlinked_count = len(unlinked_docs) unlinked_reading = [ { - 'document_hash': doc.document_hash, - 'percentage': float(doc.percentage) if doc.percentage else 0, - 'device': doc.device, - 'last_updated': doc.last_updated.isoformat() if doc.last_updated else None, + "document_hash": doc.document_hash, + "percentage": float(doc.percentage) if doc.percentage else 0, + 
"device": doc.device, + "last_updated": doc.last_updated.isoformat() if doc.last_updated else None, } for doc in unlinked_docs if doc.percentage and float(doc.percentage) > 0 diff --git a/src/blueprints/matching_bp.py b/src/blueprints/matching_bp.py index 2527ecf..97dbc38 100644 --- a/src/blueprints/matching_bp.py +++ b/src/blueprints/matching_bp.py @@ -42,7 +42,9 @@ def _create_storyteller_reservation(database_service, abs_id): """ book = database_service.get_book_by_ref(abs_id) storyteller_uuid = book.storyteller_uuid if book else None - submission = StorytellerSubmission(abs_id=abs_id, book_id=book.id if book else None, status="queued", storyteller_uuid=storyteller_uuid) + submission = StorytellerSubmission( + abs_id=abs_id, book_id=book.id if book else None, status="queued", storyteller_uuid=storyteller_uuid + ) database_service.save_storyteller_submission(submission) return submission @@ -100,9 +102,17 @@ def _copy_book_merge_metadata(existing_book, overrides=None): return metadata -def _create_book_mapping(container, abs_id, title, ebook_filename, duration, - storyteller_uuid=None, storyteller_submit=False, - author=None, subtitle=None): +def _create_book_mapping( + container, + abs_id, + title, + ebook_filename, + duration, + storyteller_uuid=None, + storyteller_submit=False, + author=None, + subtitle=None, +): """Create a book mapping with full pipeline: Booklore, KOSync, merge, Hardcover, etc. Returns (book, error_message). On success error_message is None. @@ -205,7 +215,10 @@ def _create_book_mapping(container, abs_id, title, ebook_filename, duration, # Storyteller submission (background thread) if storyteller_submit: _submit_to_storyteller_async( - container, abs_id, title, ebook_filename, + container, + abs_id, + title, + ebook_filename, current_app.config.get("BOOKS_DIR", ""), current_app.config.get("EPUB_CACHE_DIR", ""), ) @@ -467,7 +480,8 @@ def match(): _ab_meta = selected_ab.get("media", {}).get("metadata", {}) book, error = _create_book_mapping( - container, abs_id, + container, + abs_id, title=manager.get_audiobook_title(selected_ab), ebook_filename=ebook_filename, duration=manager.get_duration(selected_ab), @@ -612,8 +626,10 @@ def batch_match(): is_ebook_only = not abs_id and (ebook_filename or storyteller_uuid) is_audio_only = abs_id and not ebook_filename and not storyteller_uuid title = ( - manager.get_audiobook_title(selected_ab) if selected_ab - else ebook_display_name or Path(ebook_filename).stem if ebook_filename + manager.get_audiobook_title(selected_ab) + if selected_ab + else ebook_display_name or Path(ebook_filename).stem + if ebook_filename else "Storyteller Book" ) _ab_meta = (selected_ab or {}).get("media", {}).get("metadata", {}) @@ -638,7 +654,9 @@ def batch_match(): return redirect(url_for("matching.batch_match", search=request.form.get("search", ""))) elif action == "remove_from_queue": remove_key = request.form.get("queue_key") or request.form.get("abs_id") - session["queue"] = [item for item in session.get("queue", []) if item.get("queue_key", item.get("abs_id")) != remove_key] + session["queue"] = [ + item for item in session.get("queue", []) if item.get("queue_key", item.get("abs_id")) != remove_key + ] session.modified = True return redirect(url_for("matching.batch_match")) elif action == "clear_queue": @@ -681,7 +699,11 @@ def batch_match(): if not kosync_doc_id: failed_items.append(item.get("ebook_display_name") or ebook_filename) continue - title = item.get("ebook_display_name") or (bl_book.get("title") if bl_book else None) or 
Path(ebook_filename).stem + title = ( + item.get("ebook_display_name") + or (bl_book.get("title") if bl_book else None) + or Path(ebook_filename).stem + ) else: title = item.get("title", "Storyteller Book") ebook_filename = None diff --git a/src/blueprints/reading_bp.py b/src/blueprints/reading_bp.py index 527d6c5..0c40ac7 100644 --- a/src/blueprints/reading_bp.py +++ b/src/blueprints/reading_bp.py @@ -24,7 +24,7 @@ logger = logging.getLogger(__name__) -reading_bp = Blueprint('reading', __name__) +reading_bp = Blueprint("reading", __name__) def _get_reading_service(): @@ -37,6 +37,7 @@ def _get_reading_stats_service(): def _synthetic_journal(abs_id, event, date_str, percentage=None): """Create a lightweight object mimicking ReadingJournal for timeline display.""" + class _SyntheticJournal: def __init__(self): self.id = None @@ -44,29 +45,36 @@ def __init__(self): self.event = event self.entry = None self.percentage = percentage - self.created_at = datetime.strptime(date_str, '%Y-%m-%d') if date_str else None + self.created_at = datetime.strptime(date_str, "%Y-%m-%d") if date_str else None + return _SyntheticJournal() -def _build_book_reading_data(book, database_service, abs_service, states_by_book, - booklore_by_filename=None, abs_metadata_by_id=None, - hardcover_details=None): +def _build_book_reading_data( + book, + database_service, + abs_service, + states_by_book, + booklore_by_filename=None, + abs_metadata_by_id=None, + hardcover_details=None, +): """Build a reading-focused data dict for a single book.""" sync_mode = book.sync_mode - if sync_mode == 'ebook_only': - book_type = 'ebook-only' + if sync_mode == "ebook_only": + book_type = "ebook-only" elif not book.ebook_filename: - book_type = 'audio-only' + book_type = "audio-only" else: - book_type = 'linked' + book_type = "linked" # Get unified progress from states states = states_by_book.get(book.id, []) max_progress = ReadingService.max_progress(states, as_percent=True) # Enrich title/author from Booklore or ABS metadata when available - display_title = book.title or '' - display_author = '' + display_title = book.title or "" + display_author = "" bl_meta = find_booklore_metadata(book, booklore_by_filename) if booklore_by_filename else None if bl_meta and bl_meta.title: stems = set() @@ -79,38 +87,39 @@ def _build_book_reading_data(book, database_service, abs_service, states_by_book if bl_meta and bl_meta.authors: display_author = bl_meta.authors - if not display_author and book_type != 'ebook-only': + if not display_author and book_type != "ebook-only": abs_meta = (abs_metadata_by_id or {}).get(book.abs_id, {}) - display_author = abs_meta.get('author') or '' + display_author = abs_meta.get("author") or "" if not display_author and book.author: display_author = book.author if not display_author: - display_author = book.ebook_filename or '' + display_author = book.ebook_filename or "" - covers = resolve_book_covers(book, abs_service, database_service, book_type, - booklore_meta=bl_meta, hardcover_details=hardcover_details) + covers = resolve_book_covers( + book, abs_service, database_service, book_type, booklore_meta=bl_meta, hardcover_details=hardcover_details + ) return { - 'id': book.id, - 'abs_id': book.abs_id, - 'title': display_title, - 'abs_author': display_author, - 'ebook_filename': book.ebook_filename, - 'kosync_doc_id': book.kosync_doc_id, - 'status': book.status, - 'book_type': book_type, - 'unified_progress': max_progress, - 'cover_url': covers['cover_url'], - 'placeholder_logo': covers['placeholder_logo'], - 
'custom_cover_url': covers['custom_cover_url'], - 'abs_cover_url': covers['abs_cover_url'], - 'fallback_cover_url': covers['fallback_cover_url'], - 'started_at': book.started_at, - 'finished_at': book.finished_at, - 'rating': book.rating, - 'read_count': book.read_count or 1, + "id": book.id, + "abs_id": book.abs_id, + "title": display_title, + "abs_author": display_author, + "ebook_filename": book.ebook_filename, + "kosync_doc_id": book.kosync_doc_id, + "status": book.status, + "book_type": book_type, + "unified_progress": max_progress, + "cover_url": covers["cover_url"], + "placeholder_logo": covers["placeholder_logo"], + "custom_cover_url": covers["custom_cover_url"], + "abs_cover_url": covers["abs_cover_url"], + "fallback_cover_url": covers["fallback_cover_url"], + "started_at": book.started_at, + "finished_at": book.finished_at, + "rating": book.rating, + "read_count": book.read_count or 1, } @@ -121,16 +130,16 @@ def _is_genuinely_reading(book_data): We only count it as "currently reading" if it has meaningful progress (>1%). We don't trust started_at alone because ABS/Hardcover auto-set it on first sync. """ - if book_data['status'] == 'not_started': + if book_data["status"] == "not_started": return False - if book_data['status'] != 'active': + if book_data["status"] != "active": return True # paused/completed/dnf are explicit user actions - return book_data['unified_progress'] > 1.0 + return book_data["unified_progress"] > 1.0 -@reading_bp.route('/reading') -@reading_bp.route('/reading/tbr') -@reading_bp.route('/reading/stats') +@reading_bp.route("/reading") +@reading_bp.route("/reading/tbr") +@reading_bp.route("/reading/stats") def reading_index(): """Render the main reading tab page.""" database_service = get_database_service() @@ -139,19 +148,19 @@ def reading_index(): books = database_service.get_all_books() # Only include books with reading-relevant statuses - reading_statuses = {'active', 'completed', 'paused', 'dnf', 'not_started'} + reading_statuses = {"active", "completed", "paused", "dnf", "not_started"} books = [b for b in books if b.status in reading_statuses] abs_metadata_by_id = {} try: all_abs_books = abs_service.get_audiobooks() for ab in all_abs_books: - ab_id = ab.get('id') + ab_id = ab.get("id") if not ab_id: continue - metadata = ab.get('media', {}).get('metadata', {}) + metadata = ab.get("media", {}).get("metadata", {}) abs_metadata_by_id[ab_id] = { - 'author': metadata.get('authorName') or '', + "author": metadata.get("authorName") or "", } except Exception as e: logger.warning(f"Could not fetch ABS metadata for reading log enrichment: {e}") @@ -188,43 +197,43 @@ def reading_index(): not_started = [] for bd in all_book_data: - if bd['status'] == 'completed': - bd['display_status'] = 'finished' + if bd["status"] == "completed": + bd["display_status"] = "finished" finished.append(bd) - elif bd['status'] == 'paused': - bd['display_status'] = 'paused' + elif bd["status"] == "paused": + bd["display_status"] = "paused" paused.append(bd) - elif bd['status'] == 'dnf': - bd['display_status'] = 'dnf' + elif bd["status"] == "dnf": + bd["display_status"] = "dnf" dnf.append(bd) - elif bd['status'] == 'not_started': - bd['display_status'] = 'not_started' + elif bd["status"] == "not_started": + bd["display_status"] = "not_started" not_started.append(bd) elif _is_genuinely_reading(bd): - bd['display_status'] = 'reading' + bd["display_status"] = "reading" currently_reading.append(bd) else: - bd['display_status'] = 'not_started' + bd["display_status"] = "not_started" 
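# A minimal sanity check of the ">1% progress" gate that _is_genuinely_reading
# applies above: a self-contained mirror of the visible logic with toy dicts
# (the sample values are illustrative, not from the patch).
def is_genuinely_reading(book_data):
    if book_data["status"] == "not_started":
        return False
    if book_data["status"] != "active":
        return True  # paused/completed/dnf are explicit user actions
    return book_data["unified_progress"] > 1.0

assert not is_genuinely_reading({"status": "active", "unified_progress": 0.4})  # auto-set start, no real reading
assert is_genuinely_reading({"status": "active", "unified_progress": 37.5})
assert is_genuinely_reading({"status": "paused", "unified_progress": 0.0})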
not_started.append(bd) # Sort each section for default ordering - currently_reading.sort(key=lambda b: b['unified_progress'], reverse=True) - finished.sort(key=lambda b: b['finished_at'] or '', reverse=True) - paused.sort(key=lambda b: (b['title'] or '').lower()) - dnf.sort(key=lambda b: (b['title'] or '').lower()) - not_started.sort(key=lambda b: (b['title'] or '').lower()) + currently_reading.sort(key=lambda b: b["unified_progress"], reverse=True) + finished.sort(key=lambda b: b["finished_at"] or "", reverse=True) + paused.sort(key=lambda b: (b["title"] or "").lower()) + dnf.sort(key=lambda b: (b["title"] or "").lower()) + not_started.sort(key=lambda b: (b["title"] or "").lower()) section_counts = { - 'reading': len(currently_reading), - 'finished': len(finished), - 'paused': len(paused), - 'dnf': len(dnf), - 'not_started': len(not_started), + "reading": len(currently_reading), + "finished": len(finished), + "paused": len(paused), + "dnf": len(dnf), + "not_started": len(not_started), } # Collect unique years from finished books for year dividers finished_years = sorted( - {bd['finished_at'][:4] for bd in finished if bd.get('finished_at')}, + {bd["finished_at"][:4] for bd in finished if bd.get("finished_at")}, reverse=True, ) @@ -233,23 +242,23 @@ def reading_index(): goal = database_service.get_reading_goal(current_year) reading_sections = [ { - 'id': 'continue', - 'title': 'Continue Reading', - 'description': 'Books with active progress and quick resume context.', - 'books': currently_reading, + "id": "continue", + "title": "Continue Reading", + "description": "Books with active progress and quick resume context.", + "books": currently_reading, }, { - 'id': 'finished', - 'title': 'Recently Finished', - 'description': 'Completed books grouped by finish year.', - 'books': finished, - 'group_by_year': True, + "id": "finished", + "title": "Recently Finished", + "description": "Completed books grouped by finish year.", + "books": finished, + "group_by_year": True, }, { - 'id': 'stalled', - 'title': 'Paused and DNF', - 'description': 'Books you may revisit or archive.', - 'books': paused + dnf, + "id": "stalled", + "title": "Paused and DNF", + "description": "Books you may revisit or archive.", + "books": paused + dnf, }, ] @@ -263,8 +272,8 @@ def reading_index(): hc_configured = False # Determine active tab from route - path_to_tab = {'/reading/tbr': 'tbr', '/reading/stats': 'stats'} - active_tab = path_to_tab.get(request.path, 'log') + path_to_tab = {"/reading/tbr": "tbr", "/reading/stats": "stats"} + active_tab = path_to_tab.get(request.path, "log") # Build TBR-linked abs_ids set for "Add to Want to Read" visibility tbr_linked_abs_ids = set() @@ -275,7 +284,7 @@ def reading_index(): logger.debug(f"Could not load TBR items: {e}") return render_template( - 'reading.html', + "reading.html", all_books=currently_reading + finished + paused + dnf + not_started, reading_sections=reading_sections, section_counts=section_counts, @@ -293,10 +302,10 @@ def reading_index(): ) -@reading_bp.route('/reading/book/') +@reading_bp.route("/reading/book/") def reading_detail(book_ref): """Render the book detail view with journal.""" - active_tab = request.args.get('tab', 'overview') + active_tab = request.args.get("tab", "overview") database_service = get_database_service() abs_service = get_abs_service() @@ -310,17 +319,18 @@ def reading_detail(book_ref): booklore_by_filename = database_service.get_booklore_by_filename(enabled_server_ids=enabled_bl_ids) hc_details = 
database_service.get_hardcover_details(book.id) - book_data = _build_book_reading_data(book, database_service, abs_service, states_by_book, - booklore_by_filename, hardcover_details=hc_details) + book_data = _build_book_reading_data( + book, database_service, abs_service, states_by_book, booklore_by_filename, hardcover_details=hc_details + ) journals = database_service.get_reading_journals(book.id) # Synthesize started/finished timeline entries from book dates if missing existing_events = {j.event for j in journals} synthetic = [] - if book.started_at and 'started' not in existing_events: - synthetic.append(_synthetic_journal(book.abs_id, 'started', book.started_at)) - if book.finished_at and 'finished' not in existing_events: - synthetic.append(_synthetic_journal(book.abs_id, 'finished', book.finished_at, percentage=1.0)) + if book.started_at and "started" not in existing_events: + synthetic.append(_synthetic_journal(book.abs_id, "started", book.started_at)) + if book.finished_at and "finished" not in existing_events: + synthetic.append(_synthetic_journal(book.abs_id, "finished", book.finished_at, percentage=1.0)) if synthetic: journals = list(journals) + synthetic journals.sort(key=lambda j: j.created_at or datetime.min, reverse=True) @@ -329,17 +339,22 @@ def reading_detail(book_ref): bf_highlights = database_service.get_bookfusion_highlights_for_book_by_book_id(book.id) has_bookfusion_link = ( - (book.abs_id or '').startswith('bf-') + (book.abs_id or "").startswith("bf-") or len(bf_highlights) > 0 or database_service.is_bookfusion_linked_by_book_id(book.id) ) container = get_container() metadata = build_book_metadata(book, container, database_service, abs_service) - hardcover = metadata.get('_hardcover') + hardcover = metadata.get("_hardcover") service_states, integrations, services_enabled = build_service_info( - book, states_by_book, container, abs_service, metadata, has_bookfusion_link, + book, + states_by_book, + container, + abs_service, + metadata, + has_bookfusion_link, ) # Check if this book already has a linked TBR item @@ -347,41 +362,41 @@ def reading_detail(book_ref): # Alignment tab data alignment_info = None - show_alignment_tab = book.sync_mode != 'ebook_only' + show_alignment_tab = book.sync_mode != "ebook_only" # Validate active_tab against actually available tabs - valid_tabs = {'overview', 'journal'} + valid_tabs = {"overview", "journal"} if bf_highlights: - valid_tabs.add('highlights') + valid_tabs.add("highlights") if show_alignment_tab: - valid_tabs.add('alignment') + valid_tabs.add("alignment") if active_tab not in valid_tabs: - active_tab = 'overview' + active_tab = "overview" if show_alignment_tab: try: alignment_service = container.alignment_service() alignment_info = alignment_service.get_alignment_info(book.id) if alignment_info: book_duration = book.duration - max_ts = alignment_info['max_timestamp'] + max_ts = alignment_info["max_timestamp"] if book_duration and book_duration > 0: coverage = min(max_ts / book_duration, 1.0) - alignment_info['coverage'] = coverage - alignment_info['coverage_hours'] = max_ts / 3600 - alignment_info['total_hours'] = book_duration / 3600 - alignment_info['status'] = 'active' if coverage >= 0.9 else 'partial' + alignment_info["coverage"] = coverage + alignment_info["coverage_hours"] = max_ts / 3600 + alignment_info["total_hours"] = book_duration / 3600 + alignment_info["status"] = "active" if coverage >= 0.9 else "partial" else: - alignment_info['coverage'] = None - alignment_info['status'] = 'active' + 
alignment_info["coverage"] = None + alignment_info["status"] = "active" # Infer source for legacy data - if not alignment_info['source'] and book.storyteller_uuid: - alignment_info['source'] = 'storyteller' + if not alignment_info["source"] and book.storyteller_uuid: + alignment_info["source"] = "storyteller" except Exception as e: logger.debug(f"Failed to load alignment info for book {book.id}: {e}") return render_template( - 'reading_detail.html', + "reading_detail.html", book=book_data, journals=journals, bf_highlights=bf_highlights, @@ -391,9 +406,8 @@ def reading_detail(book_ref): services_enabled=services_enabled, service_states=service_states, integrations=integrations, - hardcover_rating_sync_available=services_enabled['hardcover'] and bool( - hardcover and hardcover.hardcover_book_id - ), + hardcover_rating_sync_available=services_enabled["hardcover"] + and bool(hardcover and hardcover.hardcover_book_id), hardcover_linked=bool(hardcover and hardcover.hardcover_book_id), active_tab=active_tab, show_alignment_tab=show_alignment_tab, @@ -401,7 +415,7 @@ def reading_detail(book_ref): ) -@reading_bp.route('/reading/tbr/') +@reading_bp.route("/reading/tbr/") def tbr_detail(item_id): """Render the TBR book detail page.""" database_service = get_database_service() @@ -427,7 +441,7 @@ def tbr_detail(item_id): hc_configured = False return render_template( - 'tbr_detail.html', + "tbr_detail.html", item=item, genres=genres, linked_book=linked_book, @@ -439,13 +453,13 @@ def tbr_detail(item_id): # ─── API Endpoints ─────────────────────────────────────────────────── -@reading_bp.route('/api/reading/book//rating', methods=['POST']) +@reading_bp.route("/api/reading/book//rating", methods=["POST"]) def update_rating(book_ref): """Set or update the rating for a book.""" database_service = get_database_service() book = get_book_or_404(book_ref) data = request.json or {} - rating = data.get('rating') + rating = data.get("rating") if rating is not None: try: @@ -468,25 +482,27 @@ def update_rating(book_ref): hc_service = container.hardcover_service() if hc_service.is_configured(): sync_result = hc_service.push_local_rating(book, rating) - hardcover_synced = bool(sync_result.get('hardcover_synced')) - hardcover_error = sync_result.get('hardcover_error') + hardcover_synced = bool(sync_result.get("hardcover_synced")) + hardcover_error = sync_result.get("hardcover_error") except Exception as e: hardcover_error = str(e) - return jsonify({ - "success": True, - "rating": book.rating, - "hardcover_synced": hardcover_synced, - "hardcover_error": hardcover_error, - }) + return jsonify( + { + "success": True, + "rating": book.rating, + "hardcover_synced": hardcover_synced, + "hardcover_error": hardcover_error, + } + ) -@reading_bp.route('/api/reading/book//progress', methods=['POST']) +@reading_bp.route("/api/reading/book//progress", methods=["POST"]) def update_progress(book_ref): """Manually set reading progress for a book (e.g. 
BookFusion books without auto-sync).""" book = get_book_or_404(book_ref) data = request.json or {} - percentage = data.get('percentage') + percentage = data.get("percentage") if percentage is None: return jsonify({"success": False, "error": "percentage is required"}), 400 @@ -501,13 +517,13 @@ def update_progress(book_ref): container = get_container() result = _get_reading_service().set_progress(book.id, percentage, container) - if not result['success']: + if not result["success"]: return jsonify(result), 404 return jsonify({"success": True, "percentage": percentage}) -@reading_bp.route('/api/reading/book//dates', methods=['POST']) +@reading_bp.route("/api/reading/book//dates", methods=["POST"]) def update_dates(book_ref): """Update started_at and/or finished_at dates.""" database_service = get_database_service() @@ -515,12 +531,12 @@ def update_dates(book_ref): data = request.json or {} updates = {} - for field in ('started_at', 'finished_at'): + for field in ("started_at", "finished_at"): if field in data: val = data[field] if val: try: - datetime.strptime(val, '%Y-%m-%d') + datetime.strptime(val, "%Y-%m-%d") except ValueError: return jsonify({"success": False, "error": f"Invalid date format for {field}"}), 400 updates[field] = val or None @@ -528,8 +544,8 @@ def update_dates(book_ref): if not updates: return jsonify({"success": False, "error": "No date fields provided"}), 400 - effective_started = updates.get('started_at') or (book.started_at if 'started_at' not in updates else None) - effective_finished = updates.get('finished_at') or (book.finished_at if 'finished_at' not in updates else None) + effective_started = updates.get("started_at") or (book.started_at if "started_at" not in updates else None) + effective_finished = updates.get("finished_at") or (book.finished_at if "finished_at" not in updates else None) if effective_started and effective_finished and effective_started > effective_finished: return jsonify({"success": False, "error": "started_at cannot be after finished_at"}), 400 @@ -538,24 +554,26 @@ def update_dates(book_ref): return jsonify({"success": False, "error": "Book not found"}), 404 # Sync corresponding journal entry timestamps to match the edited dates - event_map = {'started_at': 'started', 'finished_at': 'finished'} + event_map = {"started_at": "started", "finished_at": "finished"} for field, event in event_map.items(): if field in updates: journal = database_service.find_journal_by_event(book.id, event) if journal: new_date = updates[field] if new_date: - new_dt = datetime.strptime(new_date, '%Y-%m-%d') + new_dt = datetime.strptime(new_date, "%Y-%m-%d") database_service.update_reading_journal(journal.id, created_at=new_dt) - return jsonify({ - "success": True, - "started_at": book.started_at, - "finished_at": book.finished_at, - }) + return jsonify( + { + "success": True, + "started_at": book.started_at, + "finished_at": book.finished_at, + } + ) -@reading_bp.route('/api/reading/book//dates/sync-hardcover', methods=['POST']) +@reading_bp.route("/api/reading/book//dates/sync-hardcover", methods=["POST"]) def sync_dates_to_hardcover(book_ref): """Push local started_at/finished_at to Hardcover, overwriting HC dates.""" book = get_book_or_404(book_ref) @@ -566,7 +584,7 @@ def sync_dates_to_hardcover(book_ref): return jsonify({"success": False, "error": message}), 400 -@reading_bp.route('/api/reading/book//dates/pull-hardcover', methods=['POST']) +@reading_bp.route("/api/reading/book//dates/pull-hardcover", methods=["POST"]) def pull_dates_from_hardcover(book_ref): 
"""Pull started_at/finished_at from Hardcover into local DB.""" book = get_book_or_404(book_ref) @@ -577,13 +595,13 @@ def pull_dates_from_hardcover(book_ref): return jsonify({"success": False, "error": message}), 400 -@reading_bp.route('/api/reading/book//journal', methods=['POST']) +@reading_bp.route("/api/reading/book//journal", methods=["POST"]) def add_journal(book_ref): """Add a journal note for a book.""" database_service = get_database_service() book = get_book_or_404(book_ref) data = request.json or {} - entry = (data.get('entry') or '').strip() + entry = (data.get("entry") or "").strip() if not entry: return jsonify({"success": False, "error": "Entry text is required"}), 400 @@ -593,23 +611,28 @@ def add_journal(book_ref): max_pct = ReadingService.max_progress(book_states) journal = database_service.add_reading_journal( - book.id, event='note', entry=entry, percentage=max_pct if max_pct > 0 else None, + book.id, + event="note", + entry=entry, + percentage=max_pct if max_pct > 0 else None, abs_id=book.abs_id, ) - return jsonify({ - "success": True, - "journal": { - "id": journal.id, - "event": journal.event, - "entry": journal.entry, - "percentage": journal.percentage, - "created_at": journal.created_at.isoformat() if journal.created_at else None, + return jsonify( + { + "success": True, + "journal": { + "id": journal.id, + "event": journal.event, + "entry": journal.entry, + "percentage": journal.percentage, + "created_at": journal.created_at.isoformat() if journal.created_at else None, + }, } - }) + ) -@reading_bp.route('/api/reading/journal/', methods=['DELETE']) +@reading_bp.route("/api/reading/journal/", methods=["DELETE"]) def delete_journal(journal_id): """Delete a journal entry (cascades to book dates for started/finished).""" database_service = get_database_service() @@ -628,88 +651,94 @@ def delete_journal(journal_id): # If this was the last started/finished journal, clear the corresponding book field cleared_field = None - if event in ('started', 'finished'): + if event in ("started", "finished"): remaining = database_service.find_journal_by_event(book_id, event) if not remaining: - cleared_field = 'started_at' if event == 'started' else 'finished_at' + cleared_field = "started_at" if event == "started" else "finished_at" database_service.update_book_reading_fields(book_id, **{cleared_field: None}) return jsonify({"success": True, "cleared_field": cleared_field}) -@reading_bp.route('/api/reading/journal/', methods=['PATCH']) +@reading_bp.route("/api/reading/journal/", methods=["PATCH"]) def update_journal(journal_id): """Update a journal entry (notes: text; started/finished: date).""" database_service = get_database_service() data = request.json or {} - entry = (data.get('entry') or '').strip() + entry = (data.get("entry") or "").strip() existing = database_service.get_reading_journal(journal_id) if not existing: return jsonify({"success": False, "error": "Journal entry not found"}), 404 # Started/finished entries: only allow editing the date (created_at), not text - if existing.event in ('started', 'finished'): - date_str = (data.get('created_at') or '').strip() + if existing.event in ("started", "finished"): + date_str = (data.get("created_at") or "").strip() if not date_str: return jsonify({"success": False, "error": "created_at date is required for started/finished entries"}), 400 try: - new_dt = datetime.strptime(date_str, '%Y-%m-%d') + new_dt = datetime.strptime(date_str, "%Y-%m-%d") except ValueError: return jsonify({"success": False, "error": "Invalid date format 
(expected YYYY-MM-DD)"}), 400 journal = database_service.update_reading_journal(journal_id, created_at=new_dt) # Also update the corresponding book field - field = 'started_at' if existing.event == 'started' else 'finished_at' + field = "started_at" if existing.event == "started" else "finished_at" database_service.update_book_reading_fields(existing.book_id, **{field: date_str}) - return jsonify({ - "success": True, - "journal": { - "id": journal.id, - "event": journal.event, - "entry": journal.entry, - "percentage": journal.percentage, - "created_at": journal.created_at.isoformat() if journal.created_at else None, + return jsonify( + { + "success": True, + "journal": { + "id": journal.id, + "event": journal.event, + "entry": journal.entry, + "percentage": journal.percentage, + "created_at": journal.created_at.isoformat() if journal.created_at else None, + }, } - }) + ) - if existing.event != 'note': + if existing.event != "note": return jsonify({"success": False, "error": "Only notes can be edited"}), 400 if not entry: return jsonify({"success": False, "error": "entry is required"}), 400 journal = database_service.update_reading_journal(journal_id, entry=entry) - return jsonify({ - "success": True, - "journal": { - "id": journal.id, - "event": journal.event, - "entry": journal.entry, - "percentage": journal.percentage, - "created_at": journal.created_at.isoformat() if journal.created_at else None, + return jsonify( + { + "success": True, + "journal": { + "id": journal.id, + "event": journal.event, + "entry": journal.entry, + "percentage": journal.percentage, + "created_at": journal.created_at.isoformat() if journal.created_at else None, + }, } - }) + ) -@reading_bp.route('/api/reading/goal/', methods=['GET']) +@reading_bp.route("/api/reading/goal/", methods=["GET"]) def get_goal(year): """Get the reading goal for a given year.""" database_service = get_database_service() stats = _get_reading_stats_service().get_year_stats(year) goal = database_service.get_reading_goal(year) - return jsonify({ - "year": year, - "target": goal.target_books if goal else None, - "completed": stats['books_finished'], - }) + return jsonify( + { + "year": year, + "target": goal.target_books if goal else None, + "completed": stats["books_finished"], + } + ) -@reading_bp.route('/api/reading/goal/', methods=['POST']) +@reading_bp.route("/api/reading/goal/", methods=["POST"]) def set_goal(year): """Set or update the yearly reading goal.""" database_service = get_database_service() data = request.json or {} - target = data.get('target_books') + target = data.get("target_books") if target is None: return jsonify({"success": False, "error": "target_books is required"}), 400 @@ -729,13 +758,13 @@ def set_goal(year): # ─── New API Endpoints (Step 1 — Issue #16 leftovers) ──────────────── -@reading_bp.route('/api/reading/books', methods=['GET']) +@reading_bp.route("/api/reading/books", methods=["GET"]) def get_reading_books(): """Return all books with reading data (status, progress, dates, rating).""" database_service = get_database_service() books = database_service.get_all_books() - reading_statuses = {'active', 'completed', 'paused', 'dnf', 'not_started'} + reading_statuses = {"active", "completed", "paused", "dnf", "not_started"} books = [b for b in books if b.status in reading_statuses] all_states = database_service.get_all_states() @@ -748,21 +777,23 @@ def get_reading_books(): states = states_by_book.get(book.id, []) max_progress = ReadingService.max_progress(states, as_percent=True) - result.append({ - 'abs_id': 
book.abs_id, - 'title': book.title, - 'status': book.status, - 'unified_progress': min(max_progress, 100.0), - 'started_at': book.started_at, - 'finished_at': book.finished_at, - 'rating': book.rating, - 'read_count': book.read_count or 1, - }) + result.append( + { + "abs_id": book.abs_id, + "title": book.title, + "status": book.status, + "unified_progress": min(max_progress, 100.0), + "started_at": book.started_at, + "finished_at": book.finished_at, + "rating": book.rating, + "read_count": book.read_count or 1, + } + ) return jsonify(result) -@reading_bp.route('/api/reading/book/', methods=['GET']) +@reading_bp.route("/api/reading/book/", methods=["GET"]) def get_reading_book(book_ref): """Single book detail with journals.""" database_service = get_database_service() @@ -773,28 +804,33 @@ def get_reading_book(book_ref): max_progress = ReadingService.max_progress(states, as_percent=True) journals = database_service.get_reading_journals(book.id) - journal_list = [{ - 'id': j.id, - 'event': j.event, - 'entry': j.entry, - 'percentage': j.percentage, - 'created_at': j.created_at.isoformat() if j.created_at else None, - } for j in journals] - - return jsonify({ - 'abs_id': book.abs_id, - 'title': book.title, - 'status': book.status, - 'unified_progress': min(max_progress, 100.0), - 'started_at': book.started_at, - 'finished_at': book.finished_at, - 'rating': book.rating, - 'read_count': book.read_count or 1, - 'journals': journal_list, - }) - - -@reading_bp.route('/api/reading/book//status', methods=['POST']) + journal_list = [ + { + "id": j.id, + "event": j.event, + "entry": j.entry, + "percentage": j.percentage, + "created_at": j.created_at.isoformat() if j.created_at else None, + } + for j in journals + ] + + return jsonify( + { + "abs_id": book.abs_id, + "title": book.title, + "status": book.status, + "unified_progress": min(max_progress, 100.0), + "started_at": book.started_at, + "finished_at": book.finished_at, + "rating": book.rating, + "read_count": book.read_count or 1, + "journals": journal_list, + } + ) + + +@reading_bp.route("/api/reading/book//status", methods=["POST"]) def update_status(book_ref): """Update reading status for a book (with journal auto-creation). 
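The endpoints above give the reading tab a small JSON API. A quick smoke-test sketch, assuming a local instance; the base URL and the "42" book reference are placeholders, and the routes take a book_ref path segment as the handler signatures suggest:

    import requests

    BASE = "http://localhost:5000"  # hypothetical host/port

    # List all books with reading data
    books = requests.get(f"{BASE}/api/reading/books").json()
    print(len(books), "books with reading data")

    # Fetch a single book with its journals (see get_reading_book above)
    detail = requests.get(f"{BASE}/api/reading/book/42").json()
    print(detail["status"], detail["unified_progress"], len(detail["journals"]))

    # Update status; per update_status below, 404 comes back for unknown
    # books and 400 for other rejected updates
    resp = requests.post(f"{BASE}/api/reading/book/42/status", json={"status": "active"})
    print(resp.status_code, resp.json())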
@@ -802,18 +838,18 @@ def update_status(book_ref): """ book = get_book_or_404(book_ref) data = request.json or {} - new_status = data.get('status') + new_status = data.get("status") container = get_container() result = _get_reading_service().update_status(book.id, new_status, container) - if not result['success']: - code = 404 if result.get('error') == 'Book not found' else 400 + if not result["success"]: + code = 404 if result.get("error") == "Book not found" else 400 return jsonify(result), code return jsonify(result) -@reading_bp.route('/api/reading/stats/', methods=['GET']) +@reading_bp.route("/api/reading/stats/", methods=["GET"]) def get_stats(year): """Reading stats for a given year.""" stats = _get_reading_stats_service().get_year_stats(year) diff --git a/src/db/book_repository.py b/src/db/book_repository.py index 1bbafea..44bfffc 100644 --- a/src/db/book_repository.py +++ b/src/db/book_repository.py @@ -18,7 +18,6 @@ class BookRepository(BaseRepository): - # ── Book CRUD ── def get_book_by_abs_id(self, abs_id): @@ -68,10 +67,7 @@ def search_books(self, query, limit=10): if not query or not query.strip(): return [] with self.get_session() as session: - results = (session.query(Book) - .filter(Book.title.ilike(f'%{query}%')) - .limit(limit) - .all()) + results = session.query(Book).filter(Book.title.ilike(f"%{query}%")).limit(limit).all() for r in results: session.expunge(r) return results @@ -79,19 +75,34 @@ def search_books(self, query, limit=10): def get_book_by_ebook_filename(self, filename): """Find a book by its ebook filename (current or original).""" from sqlalchemy import or_ - return self._get_one( - Book, - or_(Book.ebook_filename == filename, Book.original_ebook_filename == filename) - ) + + return self._get_one(Book, or_(Book.ebook_filename == filename, Book.original_ebook_filename == filename)) def create_book(self, book): return self._save_new(book) def save_book(self, book): - update_attrs = ['title', 'ebook_filename', 'original_ebook_filename', 'kosync_doc_id', - 'transcript_file', 'status', 'duration', 'sync_mode', 'storyteller_uuid', - 'abs_ebook_item_id', 'ebook_item_id', 'activity_flag', 'custom_cover_url', - 'started_at', 'finished_at', 'rating', 'read_count'] + update_attrs = [ + "title", + "author", + "subtitle", + "ebook_filename", + "original_ebook_filename", + "kosync_doc_id", + "transcript_file", + "status", + "duration", + "sync_mode", + "storyteller_uuid", + "abs_ebook_item_id", + "ebook_item_id", + "activity_flag", + "custom_cover_url", + "started_at", + "finished_at", + "rating", + "read_count", + ] if book.id: return self._upsert(Book, [Book.id == book.id], book, update_attrs) elif book.abs_id: @@ -101,9 +112,9 @@ def save_book(self, book): def delete_book(self, book_id): with self.get_session() as session: - session.query(KosyncDocument).filter( - KosyncDocument.linked_book_id == book_id - ).update({KosyncDocument.linked_abs_id: None, KosyncDocument.linked_book_id: None}) + session.query(KosyncDocument).filter(KosyncDocument.linked_book_id == book_id).update( + {KosyncDocument.linked_abs_id: None, KosyncDocument.linked_book_id: None} + ) book = session.query(Book).filter(Book.id == book_id).first() if book: session.delete(book) @@ -125,9 +136,7 @@ def migrate_book_data(self, old_abs_id, new_abs_id): # Delete states for the new abs_id that would conflict incoming_clients = { - r[0] for r in session.query(State.client_name).filter( - State.book_id == book.id - ).all() + r[0] for r in session.query(State.client_name).filter(State.book_id == 
book.id).all() } target_book = session.query(Book).filter(Book.abs_id == new_abs_id).first() if target_book: @@ -145,17 +154,21 @@ def migrate_book_data(self, old_abs_id, new_abs_id): # Update denormalized abs_id on child rows session.query(State).filter(State.book_id == book.id).update( - {State.abs_id: new_abs_id}, synchronize_session=False) + {State.abs_id: new_abs_id}, synchronize_session=False + ) session.query(Job).filter(Job.book_id == book.id).update( - {Job.abs_id: new_abs_id}, synchronize_session=False) + {Job.abs_id: new_abs_id}, synchronize_session=False + ) session.query(ReadingJournal).filter(ReadingJournal.book_id == book.id).update( - {ReadingJournal.abs_id: new_abs_id}, synchronize_session=False) + {ReadingJournal.abs_id: new_abs_id}, synchronize_session=False + ) session.query(StorytellerSubmission).filter(StorytellerSubmission.book_id == book.id).update( - {StorytellerSubmission.abs_id: new_abs_id}, synchronize_session=False) + {StorytellerSubmission.abs_id: new_abs_id}, synchronize_session=False + ) - session.query(KosyncDocument).filter( - KosyncDocument.linked_abs_id == old_abs_id - ).update({KosyncDocument.linked_abs_id: new_abs_id}, synchronize_session=False) + session.query(KosyncDocument).filter(KosyncDocument.linked_abs_id == old_abs_id).update( + {KosyncDocument.linked_abs_id: new_abs_id}, synchronize_session=False + ) logger.info(f"Migrated book identity from '{old_abs_id}' to '{new_abs_id}'") except Exception as e: @@ -186,7 +199,7 @@ def save_state(self, state): State, lookup, state, - ['last_updated', 'percentage', 'timestamp', 'xpath', 'cfi', 'abs_id', 'book_id'], + ["last_updated", "percentage", "timestamp", "xpath", "cfi", "abs_id", "book_id"], ) def delete_states_for_book(self, book_id): @@ -199,9 +212,7 @@ def delete_states_for_book(self, book_id): def get_latest_job(self, book_id): with self.get_session() as session: - job = session.query(Job).filter( - Job.book_id == book_id - ).order_by(Job.last_attempt.desc()).first() + job = session.query(Job).filter(Job.book_id == book_id).order_by(Job.last_attempt.desc()).first() if job: session.expunge(job) return job @@ -228,7 +239,7 @@ def get_latest_jobs_bulk(self, book_ids): .join( latest, (Job.book_id == latest.c.book_id) - & (Job.last_attempt == latest.c.max_ts), + & (func.coalesce(Job.last_attempt, "1970-01-01") == func.coalesce(latest.c.max_ts, "1970-01-01")), ) .all() ) @@ -249,9 +260,7 @@ def save_job(self, job): def update_latest_job(self, book_id, **kwargs): with self.get_session() as session: - job = session.query(Job).filter( - Job.book_id == book_id - ).order_by(Job.last_attempt.desc()).first() + job = session.query(Job).filter(Job.book_id == book_id).order_by(Job.last_attempt.desc()).first() if job: for key, value in kwargs.items(): if hasattr(job, key): @@ -274,22 +283,31 @@ def delete_jobs_for_book(self, book_id): def get_books_with_recent_activity(self, limit=10): with self.get_session() as session: - latest = session.query( - State.book_id, - func.max(State.last_updated).label('max_updated') - ).group_by(State.book_id).subquery() - books = session.query(Book).join( - latest, Book.id == latest.c.book_id - ).order_by(latest.c.max_updated.desc()).limit(limit).all() + latest = ( + session.query(State.book_id, func.max(State.last_updated).label("max_updated")) + .group_by(State.book_id) + .subquery() + ) + books = ( + session.query(Book) + .join(latest, Book.id == latest.c.book_id) + .order_by(latest.c.max_updated.desc()) + .limit(limit) + .all() + ) for book in books: session.expunge(book) return 
books def get_failed_jobs(self, limit=20): with self.get_session() as session: - jobs = session.query(Job).filter( - Job.last_error.isnot(None) - ).order_by(Job.last_attempt.desc()).limit(limit).all() + jobs = ( + session.query(Job) + .filter(Job.last_error.isnot(None)) + .order_by(Job.last_attempt.desc()) + .limit(limit) + .all() + ) for job in jobs: session.expunge(job) return jobs @@ -297,16 +315,14 @@ def get_failed_jobs(self, limit=20): def get_statistics(self): with self.get_session() as session: stats = { - 'total_books': session.query(Book).count(), - 'active_books': session.query(Book).filter(Book.status == 'active').count(), - 'paused_books': session.query(Book).filter(Book.status == 'paused').count(), - 'dnf_books': session.query(Book).filter(Book.status == 'dnf').count(), - 'total_states': session.query(State).count(), - 'total_jobs': session.query(Job).count(), - 'failed_jobs': session.query(Job).filter(Job.last_error.isnot(None)).count(), + "total_books": session.query(Book).count(), + "active_books": session.query(Book).filter(Book.status == "active").count(), + "paused_books": session.query(Book).filter(Book.status == "paused").count(), + "dnf_books": session.query(Book).filter(Book.status == "dnf").count(), + "total_states": session.query(State).count(), + "total_jobs": session.query(Job).count(), + "failed_jobs": session.query(Job).filter(Job.last_error.isnot(None)).count(), } - client_counts = session.query( - State.client_name, func.count(State.id) - ).group_by(State.client_name).all() - stats['states_by_client'] = {client: count for client, count in client_counts} + client_counts = session.query(State.client_name, func.count(State.id)).group_by(State.client_name).all() + stats["states_by_client"] = {client: count for client, count in client_counts} return stats diff --git a/src/db/bookfusion_repository.py b/src/db/bookfusion_repository.py index 329aa49..fc8d3ba 100644 --- a/src/db/bookfusion_repository.py +++ b/src/db/bookfusion_repository.py @@ -13,7 +13,6 @@ class BookFusionRepository(BaseRepository): - # ── BookFusion Highlights ── def save_bookfusion_highlights(self, highlights): @@ -73,7 +72,7 @@ def save_bookfusion_highlights(self, highlights): existing.highlighted_at = h.get("highlighted_at") existing.quote_text = h.get("quote_text") session.flush() - return {'saved': saved, 'new_ids': new_ids} + return {"saved": saved, "new_ids": new_ids} def get_bookfusion_highlights(self): with self.get_session() as session: @@ -250,12 +249,11 @@ def auto_link_by_title(self, book): return norm_book = normalize_title(book.title) for hl in unmatched: - bf_title = clean_book_title(hl.book_title or '') + bf_title = clean_book_title(hl.book_title or "") norm_bf = normalize_title(bf_title) if norm_bf == norm_book or difflib.SequenceMatcher(None, norm_bf, norm_book).ratio() > 0.85: if hl.bookfusion_book_id: self.link_bookfusion_highlights_by_book_id(hl.bookfusion_book_id, book.id) logger.info(f"Auto-linked BookFusion highlights for '{bf_title}' to book {book.id}") - break except (AttributeError, TypeError) as e: logger.warning(f"BookFusion auto-link failed: {e}") diff --git a/src/db/hardcover_repository.py b/src/db/hardcover_repository.py index 8d41fab..635bcd9 100644 --- a/src/db/hardcover_repository.py +++ b/src/db/hardcover_repository.py @@ -9,7 +9,6 @@ class HardcoverRepository(BaseRepository): - # ── Hardcover Details ── def get_hardcover_details(self, book_id): diff --git a/src/db/kosync_repository.py b/src/db/kosync_repository.py index 4411ced..2c53223 100644 --- 
a/src/db/kosync_repository.py +++ b/src/db/kosync_repository.py @@ -7,7 +7,6 @@ class KoSyncRepository(BaseRepository): - def get_kosync_document(self, document_hash): return self._get_one(KosyncDocument, KosyncDocument.document_hash == document_hash) @@ -32,9 +31,7 @@ def get_unlinked_kosync_documents(self): def link_kosync_document(self, document_hash, book_id, abs_id=None): with self.get_session() as session: - doc = session.query(KosyncDocument).filter( - KosyncDocument.document_hash == document_hash - ).first() + doc = session.query(KosyncDocument).filter(KosyncDocument.document_hash == document_hash).first() if doc: doc.linked_book_id = book_id doc.linked_abs_id = abs_id @@ -44,9 +41,7 @@ def link_kosync_document(self, document_hash, book_id, abs_id=None): def unlink_kosync_document(self, document_hash): with self.get_session() as session: - doc = session.query(KosyncDocument).filter( - KosyncDocument.document_hash == document_hash - ).first() + doc = session.query(KosyncDocument).filter(KosyncDocument.document_hash == document_hash).first() if doc: doc.linked_abs_id = None doc.linked_book_id = None @@ -74,11 +69,7 @@ def get_orphaned_kosync_books(self): """Get books with kosync_doc_id set but no matching KosyncDocument.""" with self.get_session() as session: subq = session.query(KosyncDocument.document_hash) - results = (session.query(Book) - .filter(Book.kosync_doc_id != None) - .filter(~Book.kosync_doc_id.in_(subq)) - .all()) + results = session.query(Book).filter(Book.kosync_doc_id != None).filter(~Book.kosync_doc_id.in_(subq)).all() for r in results: session.expunge(r) return results - diff --git a/src/db/suggestion_repository.py b/src/db/suggestion_repository.py index 23c5469..d25f4af 100644 --- a/src/db/suggestion_repository.py +++ b/src/db/suggestion_repository.py @@ -5,46 +5,60 @@ class SuggestionRepository(BaseRepository): - ACTIONABLE_STATUSES = ('pending', 'hidden') + ACTIONABLE_STATUSES = ("pending", "hidden") - def get_suggestion(self, source_id, source='abs'): + def get_suggestion(self, source_id, source="abs"): return self._get_one( PendingSuggestion, PendingSuggestion.source_id == source_id, PendingSuggestion.source == source, ) - def get_pending_suggestion(self, source_id, source='abs'): + def get_pending_suggestion(self, source_id, source="abs"): return self._get_one( PendingSuggestion, PendingSuggestion.source_id == source_id, PendingSuggestion.source == source, - PendingSuggestion.status == 'pending', + PendingSuggestion.status == "pending", ) - def suggestion_exists(self, source_id, source='abs'): + def suggestion_exists(self, source_id, source="abs"): with self.get_session() as session: - return session.query(PendingSuggestion).filter( - PendingSuggestion.source_id == source_id, - PendingSuggestion.source == source, - ).first() is not None - - def is_suggestion_ignored(self, source_id, source='abs'): + return ( + session.query(PendingSuggestion) + .filter( + PendingSuggestion.source_id == source_id, + PendingSuggestion.source == source, + ) + .first() + is not None + ) + + def is_suggestion_ignored(self, source_id, source="abs"): with self.get_session() as session: - return session.query(PendingSuggestion).filter( - PendingSuggestion.source_id == source_id, - PendingSuggestion.source == source, - PendingSuggestion.status == 'ignored', - ).first() is not None + return ( + session.query(PendingSuggestion) + .filter( + PendingSuggestion.source_id == source_id, + PendingSuggestion.source == source, + PendingSuggestion.status == "ignored", + ) + .first() + is not 
None + ) def save_pending_suggestion(self, suggestion): with self.get_session() as session: - existing = session.query(PendingSuggestion).filter( - PendingSuggestion.source_id == suggestion.source_id, - PendingSuggestion.source == suggestion.source, - ).first() - if existing and existing.status == 'hidden' and suggestion.status == 'pending': - suggestion.status = 'hidden' + existing = ( + session.query(PendingSuggestion) + .filter( + PendingSuggestion.source_id == suggestion.source_id, + PendingSuggestion.source == suggestion.source, + ) + .first() + ) + if existing and existing.status == "hidden" and suggestion.status == "pending": + suggestion.status = "hidden" return self._upsert( PendingSuggestion, @@ -53,13 +67,13 @@ def save_pending_suggestion(self, suggestion): PendingSuggestion.source == suggestion.source, ], suggestion, - ['title', 'author', 'cover_url', 'matches_json', 'status'], + ["title", "author", "cover_url", "matches_json", "status"], ) def get_all_pending_suggestions(self): return self._get_all( PendingSuggestion, - PendingSuggestion.status == 'pending', + PendingSuggestion.status == "pending", order_by=PendingSuggestion.created_at.desc(), ) @@ -73,55 +87,67 @@ def get_all_actionable_suggestions(self): def get_hidden_suggestions(self): return self._get_all( PendingSuggestion, - PendingSuggestion.status == 'hidden', + PendingSuggestion.status == "hidden", order_by=PendingSuggestion.created_at.desc(), ) - def delete_pending_suggestion(self, source_id, source='abs'): + def delete_pending_suggestion(self, source_id, source="abs"): return self._delete_one( PendingSuggestion, PendingSuggestion.source_id == source_id, PendingSuggestion.source == source, - PendingSuggestion.status == 'pending', + PendingSuggestion.status == "pending", ) - def resolve_suggestion(self, source_id, source='abs'): + def resolve_suggestion(self, source_id, source="abs"): return self._delete_one( PendingSuggestion, PendingSuggestion.source_id == source_id, PendingSuggestion.source == source, ) - def hide_suggestion(self, source_id, source='abs'): + def hide_suggestion(self, source_id, source="abs"): with self.get_session() as session: - suggestion = session.query(PendingSuggestion).filter( - PendingSuggestion.source_id == source_id, - PendingSuggestion.source == source, - ).first() - if suggestion and suggestion.status != 'ignored': - suggestion.status = 'hidden' + suggestion = ( + session.query(PendingSuggestion) + .filter( + PendingSuggestion.source_id == source_id, + PendingSuggestion.source == source, + ) + .first() + ) + if suggestion and suggestion.status != "ignored": + suggestion.status = "hidden" return True return False - def unhide_suggestion(self, source_id, source='abs'): + def unhide_suggestion(self, source_id, source="abs"): with self.get_session() as session: - suggestion = session.query(PendingSuggestion).filter( - PendingSuggestion.source_id == source_id, - PendingSuggestion.source == source, - ).first() - if suggestion and suggestion.status == 'hidden': - suggestion.status = 'pending' + suggestion = ( + session.query(PendingSuggestion) + .filter( + PendingSuggestion.source_id == source_id, + PendingSuggestion.source == source, + ) + .first() + ) + if suggestion and suggestion.status == "hidden": + suggestion.status = "pending" return True return False - def ignore_suggestion(self, source_id, source='abs'): + def ignore_suggestion(self, source_id, source="abs"): with self.get_session() as session: - suggestion = session.query(PendingSuggestion).filter( - PendingSuggestion.source_id == 
source_id, - PendingSuggestion.source == source, - ).first() + suggestion = ( + session.query(PendingSuggestion) + .filter( + PendingSuggestion.source_id == source_id, + PendingSuggestion.source == source, + ) + .first() + ) if suggestion: - suggestion.status = 'ignored' + suggestion.status = "ignored" return True return False @@ -130,17 +156,24 @@ def clear_stale_suggestions(self): from sqlalchemy import not_ from .models import Book + with self.get_session() as session: - count = session.query(PendingSuggestion).filter( - PendingSuggestion.source == 'abs', - PendingSuggestion.status.in_(self.ACTIONABLE_STATUSES), - not_(PendingSuggestion.source_id.in_(session.query(Book.abs_id))) - ).delete(synchronize_session=False) + count = ( + session.query(PendingSuggestion) + .filter( + PendingSuggestion.source == "abs", + PendingSuggestion.status.in_(self.ACTIONABLE_STATUSES), + not_(PendingSuggestion.source_id.in_(session.query(Book.abs_id))), + ) + .delete(synchronize_session=False) + ) return count def normalize_dismissed_suggestions(self): with self.get_session() as session: - updated = session.query(PendingSuggestion).filter( - PendingSuggestion.status == 'dismissed' - ).update({'status': 'hidden'}, synchronize_session=False) + updated = ( + session.query(PendingSuggestion) + .filter(PendingSuggestion.status == "dismissed") + .update({"status": "hidden"}, synchronize_session=False) + ) return updated diff --git a/src/services/alignment_service.py b/src/services/alignment_service.py index f284807..275d5a1 100644 --- a/src/services/alignment_service.py +++ b/src/services/alignment_service.py @@ -38,8 +38,8 @@ def normalize_for_cross_format_comparison(book, config, sync_clients, ebook_pars Returns: dict of {client_name: normalized_value} for comparison, or None """ - has_abs = 'ABS' in config - ebook_clients = [k for k in config.keys() if k != 'ABS'] + has_abs = "ABS" in config + ebook_clients = [k for k in config.keys() if k != "ABS"] book_label = book.title or str(book.id) if not ebook_clients: @@ -65,7 +65,7 @@ def normalize_for_cross_format_comparison(book, config, sync_clients, ebook_pars if not client: continue client_state = config[client_name] - client_pct = client_state.current.get('pct', 0) + client_pct = client_state.current.get("pct", 0) try: client_pct = max(0.0, min(1.0, float(client_pct))) except (TypeError, ValueError): @@ -74,20 +74,21 @@ def normalize_for_cross_format_comparison(book, config, sync_clients, ebook_pars try: text_snippet = client.get_text_from_current_state(book, client_state) if text_snippet: - loc = ebook_parser.find_text_location( - book.ebook_filename, text_snippet, - hint_percentage=client_pct - ) + loc = ebook_parser.find_text_location(book.ebook_filename, text_snippet, hint_percentage=client_pct) if loc and loc.match_index is not None: normalized[client_name] = loc.match_index - logger.debug(f"'{book_label}' Normalized '{client_name}' {client_pct:.2%} -> char {loc.match_index}") + logger.debug( + f"'{book_label}' Normalized '{client_name}' {client_pct:.2%} -> char {loc.match_index}" + ) matched = True except Exception as e: logger.debug(f"'{book_label}' Text-based normalization failed for '{client_name}': {e}") if not matched: used_fallback = True normalized[client_name] = int(client_pct * total_text_len) - logger.debug(f"'{book_label}' Normalized '{client_name}' {client_pct:.2%} -> char {int(client_pct * total_text_len)} (pct fallback)") + logger.debug( + f"'{book_label}' Normalized '{client_name}' {client_pct:.2%} -> char {int(client_pct * 
total_text_len)} (pct fallback)" + ) if used_fallback: # Mixing text-matched positions with percentage-based estimates is @@ -105,9 +106,9 @@ def normalize_for_cross_format_comparison(book, config, sync_clients, ebook_pars normalized = {} - abs_state = config['ABS'] - abs_ts = abs_state.current.get('ts', 0) - normalized['ABS'] = abs_ts + abs_state = config["ABS"] + abs_ts = abs_state.current.get("ts", 0) + normalized["ABS"] = abs_ts for client_name in ebook_clients: client = sync_clients.get(client_name) @@ -115,7 +116,7 @@ def normalize_for_cross_format_comparison(book, config, sync_clients, ebook_pars continue client_state = config[client_name] - client_pct = client_state.current.get('pct', 0) + client_pct = client_state.current.get("pct", 0) try: client_pct = max(0.0, min(1.0, float(client_pct))) except (TypeError, ValueError): @@ -127,17 +128,14 @@ def normalize_for_cross_format_comparison(book, config, sync_clients, ebook_pars total_text_len = len(full_text) char_offset = int(client_pct * total_text_len) - txt = full_text[max(0, char_offset - 400):min(total_text_len, char_offset + 400)] + txt = full_text[max(0, char_offset - 400) : min(total_text_len, char_offset + 400)] if not txt: logger.debug(f"'{book_label}' Could not get text from '{client_name}' for normalization") continue if alignment_service: - ts_for_text = alignment_service.get_time_for_text( - book.id, - char_offset_hint=char_offset - ) + ts_for_text = alignment_service.get_time_for_text(book.id, char_offset_hint=char_offset) else: ts_for_text = None @@ -147,12 +145,15 @@ def normalize_for_cross_format_comparison(book, config, sync_clients, ebook_pars else: logger.debug(f"'{book_label}' Could not find timestamp for '{client_name}' text") except Exception as e: - logger.warning(f"'{book_label}' Cross-format normalization failed for '{client_name}': {sanitize_exception(e)}") + logger.warning( + f"'{book_label}' Cross-format normalization failed for '{client_name}': {sanitize_exception(e)}" + ) if len(normalized) > 1: return normalized return None + class AlignmentService: def __init__(self, database_service, polisher: Polisher): self.database_service = database_service @@ -162,7 +163,14 @@ def has_alignment(self, book_id: int) -> bool: return bool(book_id and self._get_alignment(book_id)) @time_execution - def align_and_store(self, book_id: int, raw_segments: list[dict], ebook_text: str, spine_chapters: list[dict] = None, source: str = None): + def align_and_store( + self, + book_id: int, + raw_segments: list[dict], + ebook_text: str, + spine_chapters: list[dict] = None, + source: str = None, + ): """ Main entry point for "Unified Alignment". @@ -174,7 +182,9 @@ def align_and_store(self, book_id: int, raw_segments: list[dict], ebook_text: st 4. Rebuild: Fix fragmented sentences in transcript using ebook text as a guide. 5. Store: Save ONLY the mapping and essential metadata to DB. """ - logger.info(f"AlignmentService: Processing book {book_id} (Text: {len(ebook_text)} chars, Segments: {len(raw_segments)})") + logger.info( + f"AlignmentService: Processing book {book_id} (Text: {len(ebook_text)} chars, Segments: {len(raw_segments)})" + ) # 1. Validation (Spine Check) # Note: This is soft validation. If lengths assume vastly different sizes, warn. @@ -182,12 +192,12 @@ def align_and_store(self, book_id: int, raw_segments: list[dict], ebook_text: st # For now, we trust the inputs but log warnings. 
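            # A worked instance of the soft validation below (hypothetical
            # numbers, not taken from the patch): a 520,000-char ebook whose
            # joined transcript text totals 480,000 chars gives
            # ratio = 480000 / 520000 ≈ 0.92, inside the accepted [0.5, 1.5]
            # band, so no warning fires; an abridged recording at 210,000
            # chars gives ratio ≈ 0.40 and logs the size-mismatch warning.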
ebook_len = len(ebook_text) # Estimate audio text length - audio_text_rough = " ".join([s['text'] for s in raw_segments]) + audio_text_rough = " ".join([s["text"] for s in raw_segments]) audio_len = len(audio_text_rough) ratio = audio_len / ebook_len if ebook_len > 0 else 0 if ratio < 0.5 or ratio > 1.5: - logger.warning(f"Alignment Size Mismatch: Audio text is {ratio:.2%} of Ebook text size.") + logger.warning(f"Alignment Size Mismatch: Audio text is {ratio:.2%} of Ebook text size.") # 2. Normalize & Rebuild # Fix fragmented sentences (Mr. Smith case) @@ -216,8 +226,10 @@ def align_storyteller_and_store(self, book_id: int, storyteller_chapters: list[d Each wordTimeline entry is expected to have 'startTime' (float seconds) and 'word' or 'text' (string). """ - logger.info(f"AlignmentService: Processing book {book_id} via Storyteller wordTimeline " - f"({len(storyteller_chapters)} chapters, {len(ebook_text)} chars)") + logger.info( + f"AlignmentService: Processing book {book_id} via Storyteller wordTimeline " + f"({len(storyteller_chapters)} chapters, {len(ebook_text)} chars)" + ) # Build segments from wordTimeline data (~15-second groups) SEGMENT_DURATION = 15.0 @@ -227,9 +239,9 @@ def align_storyteller_and_store(self, book_id: int, storyteller_chapters: list[d last_word_start = 0.0 for chapter in storyteller_chapters: - for entry in chapter.get('words', []): - start_time = entry.get('startTime', 0.0) - word = entry.get('word') or entry.get('text', '') + for entry in chapter.get("words", []): + start_time = entry.get("startTime", 0.0) + word = entry.get("word") or entry.get("text", "") if not word: continue @@ -241,21 +253,25 @@ def align_storyteller_and_store(self, book_id: int, storyteller_chapters: list[d # Close segment when duration exceeds threshold if start_time - segment_start >= SEGMENT_DURATION and len(current_words) > 1: - segments.append({ - 'start': segment_start, - 'end': start_time, - 'text': ' '.join(current_words), - }) + segments.append( + { + "start": segment_start, + "end": start_time, + "text": " ".join(current_words), + } + ) current_words = [] # Flush remaining words if current_words: end_time = max(last_word_start, segment_start + 1.0) - segments.append({ - 'start': segment_start, - 'end': end_time, - 'text': ' '.join(current_words), - }) + segments.append( + { + "start": segment_start, + "end": end_time, + "text": " ".join(current_words), + } + ) if not segments: logger.error(f"AlignmentService: No segments produced from wordTimeline for book {book_id}") @@ -269,14 +285,14 @@ def align_storyteller_and_store(self, book_id: int, storyteller_chapters: list[d if not alignment_map: # Fallback: linear map from total duration - total_duration = segments[-1]['end'] + total_duration = segments[-1]["end"] alignment_map = [ {"char": 0, "ts": 0.0}, {"char": len(ebook_text), "ts": total_duration}, ] logger.warning(f" N-gram anchoring failed, using linear fallback for book {book_id}") - self._save_alignment(book_id, alignment_map, source='storyteller') + self._save_alignment(book_id, alignment_map, source="storyteller") return True def get_time_for_text(self, book_id: int, char_offset_hint: int = None) -> float | None: @@ -295,29 +311,31 @@ def get_time_for_text(self, book_id: int, char_offset_hint: int = None) -> float left = 0 right = len(map_points) - 1 - if target_offset < map_points[0]['char']: - return map_points[0]['ts'] + if target_offset < map_points[0]["char"]: + return map_points[0]["ts"] # Detect partial alignment: use second-to-last point as the real # data boundary 
(last point may be a sentinel mapping to epub end) real_end = map_points[-1] if len(map_points) >= 2: penultimate = map_points[-2] - char_gap = real_end['char'] - penultimate['char'] - ts_gap = real_end['ts'] - penultimate['ts'] - if ts_gap > 0 and char_gap / max(ts_gap, 1) > 1000: + char_gap = real_end["char"] - penultimate["char"] + ts_gap = real_end["ts"] - penultimate["ts"] + if ts_gap > 0 and char_gap / ts_gap > 1000: real_end = penultimate - if target_offset > real_end['char']: - logger.warning(f"book {book_id}: Char offset {target_offset} exceeds alignment range " - f"(max {real_end['char']}) — alignment may be partial") + if target_offset > real_end["char"]: + logger.warning( + f"book {book_id}: Char offset {target_offset} exceeds alignment range " + f"(max {real_end['char']}) — alignment may be partial" + ) return None # Manual binary search to find floor floor_idx = 0 while left <= right: mid = (left + right) // 2 - if map_points[mid]['char'] <= target_offset: + if map_points[mid]["char"] <= target_offset: floor_idx = mid left = mid + 1 else: @@ -329,16 +347,17 @@ def get_time_for_text(self, book_id: int, char_offset_hint: int = None) -> float if floor_idx + 1 < len(map_points): p2 = map_points[floor_idx + 1] else: - return p1['ts'] + return p1["ts"] # Linear Interpolation - char_span = p2['char'] - p1['char'] - time_span = p2['ts'] - p1['ts'] + char_span = p2["char"] - p1["char"] + time_span = p2["ts"] - p1["ts"] - if char_span == 0: return p1['ts'] + if char_span == 0: + return p1["ts"] - ratio = (target_offset - p1['char']) / char_span - estimated_time = p1['ts'] + (time_span * ratio) + ratio = (target_offset - p1["char"]) / char_span + estimated_time = p1["ts"] + (time_span * ratio) return float(estimated_time) @@ -359,30 +378,32 @@ def get_char_for_time(self, book_id: int, timestamp: float) -> int | None: left = 0 right = len(map_points) - 1 - if target_ts <= map_points[0]['ts']: - return int(map_points[0]['char']) + if target_ts <= map_points[0]["ts"]: + return int(map_points[0]["char"]) # Detect partial alignment: use second-to-last point as the real # data boundary (last point may be a sentinel mapping to epub end) real_end = map_points[-1] if len(map_points) >= 2: penultimate = map_points[-2] - char_gap = real_end['char'] - penultimate['char'] - ts_gap = real_end['ts'] - penultimate['ts'] + char_gap = real_end["char"] - penultimate["char"] + ts_gap = real_end["ts"] - penultimate["ts"] # If the last point has a disproportionate char jump, it's a sentinel - if ts_gap > 0 and char_gap / max(ts_gap, 1) > 1000: + if ts_gap > 0 and char_gap / ts_gap > 1000: real_end = penultimate - if target_ts > real_end['ts']: + if target_ts > real_end["ts"]: # Timestamp is beyond the alignment data — can't determine position - logger.warning(f"book {book_id}: Timestamp {target_ts:.1f}s exceeds alignment range " - f"(max {real_end['ts']:.1f}s) — alignment may be partial") + logger.warning( + f"book {book_id}: Timestamp {target_ts:.1f}s exceeds alignment range " + f"(max {real_end['ts']:.1f}s) — alignment may be partial" + ) return None floor_idx = 0 while left <= right: mid = (left + right) // 2 - if map_points[mid]['ts'] <= target_ts: + if map_points[mid]["ts"] <= target_ts: floor_idx = mid left = mid + 1 else: @@ -392,16 +413,17 @@ def get_char_for_time(self, book_id: int, timestamp: float) -> int | None: if floor_idx + 1 < len(map_points): p2 = map_points[floor_idx + 1] else: - return int(p1['char']) + return int(p1["char"]) # 3. 
Interpolate - time_span = p2['ts'] - p1['ts'] - char_span = p2['char'] - p1['char'] + time_span = p2["ts"] - p1["ts"] + char_span = p2["char"] - p1["char"] - if time_span == 0: return int(p1['char']) + if time_span == 0: + return int(p1["char"]) - ratio = (target_ts - p1['ts']) / time_span - estimated_char = p1['char'] + (char_span * ratio) + ratio = (target_ts - p1["ts"]) / time_span + estimated_char = p1["char"] + (char_span * ratio) return int(estimated_char) @@ -414,32 +436,33 @@ def _generate_alignment_map(self, segments: list[dict], full_text: str) -> list[ # 1. Tokenize Transcript transcript_words = [] for seg in segments: - raw_words = seg['text'].split() - if not raw_words: continue + raw_words = seg["text"].split() + if not raw_words: + continue - duration = seg['end'] - seg['start'] + duration = seg["end"] - seg["start"] per_word = duration / len(raw_words) for i, w in enumerate(raw_words): norm = self.polisher.normalize(w) - if not norm: continue - transcript_words.append({ - "word": norm, - "ts": seg['start'] + (i * per_word), - "orig_index": len(transcript_words) # Keep track for slicing - }) + if not norm: + continue + transcript_words.append( + { + "word": norm, + "ts": seg["start"] + (i * per_word), + "orig_index": len(transcript_words), # Keep track for slicing + } + ) # 2. Tokenize Book book_words = [] - for match in re.finditer(r'\S+', full_text): + for match in re.finditer(r"\S+", full_text): raw_w = match.group() norm = self.polisher.normalize(raw_w) - if not norm: continue - book_words.append({ - "word": norm, - "char": match.start(), - "orig_index": len(book_words) - }) + if not norm: + continue + book_words.append({"word": norm, "char": match.start(), "orig_index": len(book_words)}) if not transcript_words or not book_words: return [] @@ -450,9 +473,10 @@ def _find_anchors(t_tokens, b_tokens, n_size): def build_ngrams(items, is_book=False): grams = {} for i in range(len(items) - n_size + 1): - keys = [x['word'] for x in items[i:i+n_size]] + keys = [x["word"] for x in items[i : i + n_size]] key = "_".join(keys) - if key not in grams: grams[key] = [] + if key not in grams: + grams[key] = [] # Store entire object to retrieve ts/char/index grams[key].append(items[i]) return grams @@ -462,44 +486,46 @@ def build_ngrams(items, is_book=False): found = [] for key, t_list in t_grams.items(): - if len(t_list) == 1: # Unique in transcript slice - if key in b_grams and len(b_grams[key]) == 1: # Unique in book slice + if len(t_list) == 1: # Unique in transcript slice + if key in b_grams and len(b_grams[key]) == 1: # Unique in book slice # Safe access using indices b_item = b_grams[key][0] t_item = t_list[0] - found.append({ - "ts": t_item['ts'], - "char": b_item['char'], - "t_idx": t_item['orig_index'], - "b_idx": b_item['orig_index'] - }) + found.append( + { + "ts": t_item["ts"], + "char": b_item["char"], + "t_idx": t_item["orig_index"], + "b_idx": b_item["orig_index"], + } + ) return found # 3. PASS 1: Global Search (N=12) anchors = _find_anchors(transcript_words, book_words, n_size=12) # Sort by character position - anchors.sort(key=lambda x: x['char']) + anchors.sort(key=lambda x: x["char"]) # Filter Monotonic (Global) valid_anchors = [] if anchors: valid_anchors.append(anchors[0]) for a in anchors[1:]: - if a['ts'] > valid_anchors[-1]['ts']: + if a["ts"] > valid_anchors[-1]["ts"]: valid_anchors.append(a) # 4. PASS 2: Backfill Start (N=6) "Work Backwards" # If the first anchor is significantly into the book, try to recover the intro. 
# Threshold: First anchor is > 1000 chars in AND > 30 seconds in - if valid_anchors and valid_anchors[0]['char'] > 1000 and valid_anchors[0]['ts'] > 30.0: + if valid_anchors and valid_anchors[0]["char"] > 1000 and valid_anchors[0]["ts"] > 30.0: first = valid_anchors[0] logger.info(f" Late start detected (Char: {first['char']}, TS: {first['ts']:.1f}s) — Attempting backfill") # Slice the data: Everything BEFORE the first anchor # We use the indices we stored during tokenization - t_slice = transcript_words[:first['t_idx']] - b_slice = book_words[:first['b_idx']] + t_slice = transcript_words[: first["t_idx"]] + b_slice = book_words[: first["b_idx"]] if t_slice and b_slice: # Run with reduced N-Gram (N=6) @@ -507,12 +533,12 @@ def build_ngrams(items, is_book=False): early_anchors = _find_anchors(t_slice, b_slice, n_size=6) # Filter Early Anchors (Must be monotonic with themselves) - early_anchors.sort(key=lambda x: x['char']) + early_anchors.sort(key=lambda x: x["char"]) valid_early = [] if early_anchors: valid_early.append(early_anchors[0]) for a in early_anchors[1:]: - if a['ts'] > valid_early[-1]['ts']: + if a["ts"] > valid_early[-1]["ts"]: valid_early.append(a) if valid_early: @@ -520,24 +546,22 @@ def build_ngrams(items, is_book=False): # Prepend to main list valid_anchors = valid_early + valid_anchors - - # 5. Build Final Map final_map = [] if not valid_anchors: return [] # Force 0,0 if still missing (Linear Interpolation fallback) - if valid_anchors[0]['char'] > 0: + if valid_anchors[0]["char"] > 0: final_map.append({"char": 0, "ts": 0.0}) final_map.extend(valid_anchors) # Force End last = valid_anchors[-1] - if last['char'] < len(full_text): + if last["char"] < len(full_text): # Safe check for segments - end_ts = segments[-1]['end'] if segments else last['ts'] + end_ts = segments[-1]["end"] if segments else last["ts"] final_map.append({"char": len(full_text), "ts": end_ts}) logger.info(f" Anchored Alignment: Found {len(valid_anchors)} anchors (Total).") @@ -563,18 +587,18 @@ def get_alignment_info(self, book_id: int) -> dict | None: real_end = data[-1] if len(data) >= 2: penultimate = data[-2] - char_gap = real_end['char'] - penultimate['char'] - ts_gap = real_end['ts'] - penultimate['ts'] - if ts_gap > 0 and char_gap / max(ts_gap, 1) > 1000: + char_gap = real_end["char"] - penultimate["char"] + ts_gap = real_end["ts"] - penultimate["ts"] + if ts_gap > 0 and char_gap / ts_gap > 1000: real_end = penultimate return { - 'num_points': len(data), - 'max_timestamp': real_end['ts'], - 'max_char': real_end['char'], - 'total_chars': data[-1]['char'], - 'last_updated': row.last_updated, - 'source': row.source, + "num_points": len(data), + "max_timestamp": real_end["ts"], + "max_char": real_end["char"], + "total_chars": data[-1]["char"], + "last_updated": row.last_updated, + "source": row.source, } except (KeyError, TypeError, IndexError): logger.warning(f"Malformed alignment data for book {book_id}") @@ -594,7 +618,7 @@ def realign_book(self, book_id: int): book = session.query(Book).filter_by(id=book_id).first() if book: book.transcript_file = None - book.status = 'pending' + book.status = "pending" logger.info(f"Re-alignment queued for book {book_id}") def _save_alignment(self, book_id: int, alignment_map: list[dict], source: str = None): @@ -636,9 +660,9 @@ def _get_alignment(self, book_id: int) -> list[dict] | None: # Validate structure: each point must have int 'char' and float 'ts' validated = [] for point in raw: - if isinstance(point, dict) and 'char' in point and 'ts' in point: + if 
isinstance(point, dict) and "char" in point and "ts" in point: try: - validated.append({'char': int(point['char']), 'ts': float(point['ts'])}) + validated.append({"char": int(point["char"]), "ts": float(point["ts"])}) except (ValueError, TypeError): logger.warning(f"Skipping invalid alignment point for book {book_id}: {point}") else: @@ -654,5 +678,5 @@ def get_book_duration(self, book_id: int) -> float | None: alignment = self._get_alignment(book_id) if alignment and len(alignment) > 0: # The last point in the alignment map should have the max timestamp - return float(alignment[-1]['ts']) + return float(alignment[-1]["ts"]) return None diff --git a/src/services/hardcover_service.py b/src/services/hardcover_service.py index 72b0160..fdc00b1 100644 --- a/src/services/hardcover_service.py +++ b/src/services/hardcover_service.py @@ -26,19 +26,18 @@ PROGRESS_COMPLETE_THRESHOLD = 0.99 LOCAL_TO_HC_STATUS = { - 'not_started': HC_WANT_TO_READ, - 'active': HC_CURRENTLY_READING, - 'completed': HC_READ, - 'paused': HC_PAUSED, - 'dnf': HC_DNF, + "not_started": HC_WANT_TO_READ, + "active": HC_CURRENTLY_READING, + "completed": HC_READ, + "paused": HC_PAUSED, + "dnf": HC_DNF, } class HardcoverService: """Non-sync Hardcover operations: status push, ratings, matching.""" - def __init__(self, hardcover_client: HardcoverClient, database_service, - abs_client=None): + def __init__(self, hardcover_client: HardcoverClient, database_service, abs_client=None): self.hardcover_client = hardcover_client self.database_service = database_service self.abs_client = abs_client @@ -61,8 +60,8 @@ def _require_hardcover_details(self, book): def select_edition_id(self, book, hardcover_details): """Select the appropriate edition based on sync source.""" - sync_source = getattr(book, 'sync_source', None) - if sync_source == 'audiobook' and hardcover_details.hardcover_audio_edition_id: + sync_source = getattr(book, "sync_source", None) + if sync_source == "audiobook" and hardcover_details.hardcover_audio_edition_id: return hardcover_details.hardcover_audio_edition_id return hardcover_details.hardcover_edition_id @@ -82,21 +81,21 @@ def resolve_editions(self, hardcover_details): return False editions = self.hardcover_client.get_all_editions(int(book_id)) - page_ed = editions.get('ebook') or editions.get('physical') - audio_ed = editions.get('audio') - - if page_ed and page_ed.get('pages') and page_ed['pages'] > 0: - hardcover_details.hardcover_pages = page_ed['pages'] - hardcover_details.hardcover_edition_id = page_ed['id'] - if audio_ed and audio_ed.get('audio_seconds') and audio_ed['audio_seconds'] > 0: - hardcover_details.hardcover_audio_edition_id = str(audio_ed['id']) - hardcover_details.hardcover_audio_seconds = audio_ed['audio_seconds'] + page_ed = editions.get("ebook") or editions.get("physical") + audio_ed = editions.get("audio") + + if page_ed and page_ed.get("pages") and page_ed["pages"] > 0: + hardcover_details.hardcover_pages = page_ed["pages"] + hardcover_details.hardcover_edition_id = page_ed["id"] + if audio_ed and audio_ed.get("audio_seconds") and audio_ed["audio_seconds"] > 0: + hardcover_details.hardcover_audio_edition_id = str(audio_ed["id"]) + hardcover_details.hardcover_audio_seconds = audio_ed["audio_seconds"] self.database_service.save_hardcover_details(hardcover_details) return True - elif audio_ed and audio_ed.get('audio_seconds') and audio_ed['audio_seconds'] > 0: - hardcover_details.hardcover_audio_seconds = audio_ed['audio_seconds'] - hardcover_details.hardcover_audio_edition_id = str(audio_ed['id']) 
- hardcover_details.hardcover_edition_id = audio_ed['id'] + elif audio_ed and audio_ed.get("audio_seconds") and audio_ed["audio_seconds"] > 0: + hardcover_details.hardcover_audio_seconds = audio_ed["audio_seconds"] + hardcover_details.hardcover_audio_edition_id = str(audio_ed["id"]) + hardcover_details.hardcover_edition_id = audio_ed["id"] hardcover_details.hardcover_pages = -1 # Audio-only self.database_service.save_hardcover_details(hardcover_details) return True @@ -129,13 +128,15 @@ def push_local_status(self, book, status_label): ) hardcover_details.hardcover_status_id = hc_status_id self.database_service.save_hardcover_details(hardcover_details) - record_write('Hardcover', book.id, {'status': hc_status_id}) + record_write("Hardcover", book.id, {"status": hc_status_id}) log_hardcover_action( - self.database_service, abs_id=book.abs_id, + self.database_service, + abs_id=book.abs_id, book_title=sanitize_log_data(book.title), - direction='push', action='status_update', - detail={'status_label': status_label, 'hc_status_id': hc_status_id}, + direction="push", + action="status_update", + detail={"status_label": status_label, "hc_status_id": hc_status_id}, ) logger.info( f"Local → Hardcover status: '{sanitize_log_data(book.title)}' " @@ -143,11 +144,14 @@ def push_local_status(self, book, status_label): ) except Exception as e: log_hardcover_action( - self.database_service, abs_id=book.abs_id, + self.database_service, + abs_id=book.abs_id, book_title=sanitize_log_data(book.title), - direction='push', action='status_update', - success=False, error_message=str(e), - detail={'status_label': status_label, 'hc_status_id': hc_status_id}, + direction="push", + action="status_update", + success=False, + error_message=str(e), + detail={"status_label": status_label, "hc_status_id": hc_status_id}, ) logger.warning(f"Failed to push status to Hardcover: {e}") @@ -156,42 +160,51 @@ def push_local_status(self, book, status_label): def push_local_rating(self, book, rating): """Mirror a local rating change to Hardcover when a link exists.""" if not self.is_configured(): - return {'hardcover_synced': False, 'hardcover_error': 'Hardcover not configured'} + return {"hardcover_synced": False, "hardcover_error": "Hardcover not configured"} hardcover_details = self._require_hardcover_details(book) if not hardcover_details: - return {'hardcover_synced': False, 'hardcover_error': 'Book is not linked to Hardcover'} + return {"hardcover_synced": False, "hardcover_error": "Book is not linked to Hardcover"} try: ub = self._get_or_create_user_book(book, hardcover_details) - if not ub or not ub.get('id'): - return {'hardcover_synced': False, 'hardcover_error': 'Could not resolve Hardcover user_book'} + if not ub or not ub.get("id"): + return {"hardcover_synced": False, "hardcover_error": "Could not resolve Hardcover user_book"} + + hc_rating = None + if rating is not None: + hc_rating = round(min(max(float(rating), 0), 5) * 2) / 2 result = self.hardcover_client.update_user_book( - int(ub['id']), - {'rating': float(rating) if rating is not None else None}, + int(ub["id"]), + {"rating": hc_rating}, ) if not result: - return {'hardcover_synced': False, 'hardcover_error': 'Hardcover rejected rating update'} + return {"hardcover_synced": False, "hardcover_error": "Hardcover rejected rating update"} - record_write('Hardcover', book.id, {'rating': rating}) + record_write("Hardcover", book.id, {"rating": rating}) log_hardcover_action( - self.database_service, abs_id=book.abs_id, + self.database_service, + abs_id=book.abs_id, 
book_title=sanitize_log_data(book.title), - direction='push', action='rating', - detail={'rating': rating}, + direction="push", + action="rating", + detail={"rating": rating}, ) - return {'hardcover_synced': True, 'hardcover_error': None} + return {"hardcover_synced": True, "hardcover_error": None} except Exception as e: log_hardcover_action( - self.database_service, abs_id=book.abs_id, + self.database_service, + abs_id=book.abs_id, book_title=sanitize_log_data(book.title), - direction='push', action='rating', - success=False, error_message=str(e), - detail={'rating': rating}, + direction="push", + action="rating", + success=False, + error_message=str(e), + detail={"rating": rating}, ) logger.warning(f"Failed to push rating to Hardcover: {e}") - return {'hardcover_synced': False, 'hardcover_error': str(e)} + return {"hardcover_synced": False, "hardcover_error": str(e)} # ── User Book Lifecycle ─────────────────────────────────────────── @@ -204,21 +217,23 @@ def _get_or_create_user_book(self, book, hardcover_details, edition_id=None): # 1. Return cached IDs if available if hardcover_details.hardcover_user_book_id and hardcover_details.hardcover_status_id: return { - 'id': hardcover_details.hardcover_user_book_id, - 'status_id': hardcover_details.hardcover_status_id, + "id": hardcover_details.hardcover_user_book_id, + "status_id": hardcover_details.hardcover_status_id, } # 2. Try to adopt an existing user_book from HC ub = self.hardcover_client.get_user_book(hardcover_details.hardcover_book_id) if ub: - hardcover_details.hardcover_user_book_id = ub['id'] - hardcover_details.hardcover_status_id = ub.get('status_id') + hardcover_details.hardcover_user_book_id = ub["id"] + hardcover_details.hardcover_status_id = ub.get("status_id") self.database_service.save_hardcover_details(hardcover_details) log_hardcover_action( - self.database_service, abs_id=book.abs_id, + self.database_service, + abs_id=book.abs_id, book_title=sanitize_log_data(book.title), - direction='pull', action='adopt_user_book', - detail={'user_book_id': ub['id'], 'status_id': ub.get('status_id')}, + direction="pull", + action="adopt_user_book", + detail={"user_book_id": ub["id"], "status_id": ub.get("status_id")}, ) logger.info( f"Hardcover: adopted existing user_book {ub['id']} " @@ -238,18 +253,20 @@ def _get_or_create_user_book(self, book, hardcover_details, edition_id=None): hc_status_id, int(edition_id) if edition_id else None, ) - if not result or not result.get('id'): + if not result or not result.get("id"): return None - hardcover_details.hardcover_user_book_id = result['id'] - hardcover_details.hardcover_status_id = result.get('status_id', hc_status_id) + hardcover_details.hardcover_user_book_id = result["id"] + hardcover_details.hardcover_status_id = result.get("status_id", hc_status_id) self.database_service.save_hardcover_details(hardcover_details) - record_write('Hardcover', book.id, {'status': hardcover_details.hardcover_status_id}) + record_write("Hardcover", book.id, {"status": hardcover_details.hardcover_status_id}) log_hardcover_action( - self.database_service, abs_id=book.abs_id, + self.database_service, + abs_id=book.abs_id, book_title=sanitize_log_data(book.title), - direction='push', action='create_user_book', - detail={'user_book_id': result['id'], 'status_id': hardcover_details.hardcover_status_id}, + direction="push", + action="create_user_book", + detail={"user_book_id": result["id"], "status_id": hardcover_details.hardcover_status_id}, ) return result @@ -273,9 +290,9 @@ def _pull_dates_at_match(self, 
book): read = reads[0] updates = {} if not book.started_at and read.get("started_at"): - updates['started_at'] = read["started_at"] + updates["started_at"] = read["started_at"] if not book.finished_at and read.get("finished_at"): - updates['finished_at'] = read["finished_at"] + updates["finished_at"] = read["finished_at"] if updates: self.database_service.update_book_reading_fields(book.id, **updates) logger.info(f"Pulled dates at match time for '{book.abs_id}': {updates}") @@ -294,24 +311,23 @@ def push_initial_progress(self, book, hardcover_sync_client): try: from src.sync_clients.sync_client_interface import LocatorResult, UpdateProgressRequest + locator = LocatorResult(percentage=max_pct) request = UpdateProgressRequest(locator_result=locator, txt="Initial sync", previous_location=None) result = hardcover_sync_client.update_progress(book, request) if result and result.success: import time - pct = result.updated_state.get('pct', max_pct) if result.updated_state else max_pct + + pct = result.updated_state.get("pct", max_pct) if result.updated_state else max_pct state = State( abs_id=book.abs_id, book_id=book.id, - client_name='hardcover', + client_name="hardcover", last_updated=time.time(), percentage=pct, ) self.database_service.save_state(state) - logger.info( - f"Hardcover: pushed initial progress {max_pct:.1%} for " - f"'{sanitize_log_data(book.title)}'" - ) + logger.info(f"Hardcover: pushed initial progress {max_pct:.1%} for '{sanitize_log_data(book.title)}'") except Exception as e: logger.warning(f"Failed to push initial progress to Hardcover: {e}") @@ -334,7 +350,7 @@ def backfill_hardcover_states(self): # Check if a Hardcover state already exists states = self.database_service.get_states_for_book(details.book_id) - hc_states = [s for s in states if s.client_name == 'hardcover'] + hc_states = [s for s in states if s.client_name == "hardcover"] if hc_states: continue @@ -347,7 +363,7 @@ def backfill_hardcover_states(self): state = State( abs_id=details.abs_id, book_id=details.book_id, - client_name='hardcover', + client_name="hardcover", last_updated=time.time(), percentage=max_pct, ) @@ -365,7 +381,7 @@ def _try_match_with_strategy(self, search_func, strategy_name, book_title): if not match: return None, None - pages = match.get('pages') + pages = match.get("pages") if not pages or pages <= 0: logger.info(f"'{book_title}' could not find valid page count using '{strategy_name}' match") return None, match @@ -389,11 +405,11 @@ def automatch_hardcover(self, book, hardcover_sync_client=None): if not item: return - meta = item.get('media', {}).get('metadata', {}) - isbn = meta.get('isbn') - asin = meta.get('asin') - title = meta.get('title') - author = meta.get('authorName') + meta = item.get("media", {}).get("metadata", {}) + isbn = meta.get("isbn") + asin = meta.get("asin") + title = meta.get("title") + author = meta.get("authorName") match = None matched_by = None @@ -401,10 +417,14 @@ def automatch_hardcover(self, book, hardcover_sync_client=None): first_rejected_by = None search_strategies = [ - (lambda: self.hardcover_client.search_by_isbn(isbn) if isbn else None, 'isbn', isbn), - (lambda: self.hardcover_client.search_by_isbn(asin) if asin else None, 'asin', asin), - (lambda: self.hardcover_client.search_by_title_author(title, author) if (title and author) else None, 'title_author', title and author), - (lambda: self.hardcover_client.search_by_title_author(title, "") if title else None, 'title', title), + (lambda: self.hardcover_client.search_by_isbn(isbn) if isbn else None, 
"isbn", isbn), + (lambda: self.hardcover_client.search_by_asin(asin) if asin else None, "asin", asin), + ( + lambda: self.hardcover_client.search_by_title_author(title, author) if (title and author) else None, + "title_author", + title and author, + ), + (lambda: self.hardcover_client.search_by_title_author(title, "") if title else None, "title", title), ] for search_func, strategy_name, condition in search_strategies: @@ -422,32 +442,34 @@ def automatch_hardcover(self, book, hardcover_sync_client=None): audio_seconds = None audio_edition_id = None if not match and first_rejected: - book_id = first_rejected.get('book_id') + book_id = first_rejected.get("book_id") if book_id: editions = self.hardcover_client.get_all_editions(book_id) - audio_ed = editions.get('audio') - if audio_ed and audio_ed.get('audio_seconds') and audio_ed['audio_seconds'] > 0: + audio_ed = editions.get("audio") + if audio_ed and audio_ed.get("audio_seconds") and audio_ed["audio_seconds"] > 0: match = first_rejected matched_by = first_rejected_by - audio_seconds = audio_ed['audio_seconds'] - audio_edition_id = str(audio_ed['id']) - page_ed = editions.get('ebook') or editions.get('physical') - if page_ed and page_ed.get('pages') and page_ed['pages'] > 0: - match['edition_id'] = page_ed['id'] - match['pages'] = page_ed['pages'] + audio_seconds = audio_ed["audio_seconds"] + audio_edition_id = str(audio_ed["id"]) + page_ed = editions.get("ebook") or editions.get("physical") + if page_ed and page_ed.get("pages") and page_ed["pages"] > 0: + match["edition_id"] = page_ed["id"] + match["pages"] = page_ed["pages"] else: - match['edition_id'] = audio_ed['id'] - match['pages'] = -1 - logger.info(f"Hardcover: '{sanitize_log_data(meta.get('title'))}' matched as audiobook ({audio_seconds}s)") + match["edition_id"] = audio_ed["id"] + match["pages"] = -1 + logger.info( + f"Hardcover: '{sanitize_log_data(meta.get('title'))}' matched as audiobook ({audio_seconds}s)" + ) if match: hardcover_details = HardcoverDetails( abs_id=book.abs_id, book_id=book.id, - hardcover_book_id=match.get('book_id'), - hardcover_slug=match.get('slug'), - hardcover_edition_id=match.get('edition_id'), - hardcover_pages=match.get('pages'), + hardcover_book_id=match.get("book_id"), + hardcover_slug=match.get("slug"), + hardcover_edition_id=match.get("edition_id"), + hardcover_pages=match.get("pages"), hardcover_audio_seconds=audio_seconds, isbn=isbn, asin=asin, @@ -457,14 +479,15 @@ def automatch_hardcover(self, book, hardcover_sync_client=None): self.database_service.save_hardcover_details(hardcover_details) self.resolve_editions(hardcover_details) - self._get_or_create_user_book(book, hardcover_details, match.get('edition_id')) + self._get_or_create_user_book(book, hardcover_details, match.get("edition_id")) self._pull_dates_at_match(book) log_hardcover_action( - self.database_service, abs_id=book.abs_id, - book_title=sanitize_log_data(meta.get('title')), - direction='push', action='automatch', - detail={'matched_by': matched_by, 'hardcover_book_id': match.get('book_id'), - 'slug': match.get('slug')}, + self.database_service, + abs_id=book.abs_id, + book_title=sanitize_log_data(meta.get("title")), + direction="push", + action="automatch", + detail={"matched_by": matched_by, "hardcover_book_id": match.get("book_id"), "slug": match.get("slug")}, ) logger.info(f"Hardcover: '{sanitize_log_data(meta.get('title'))}' matched (matched by {matched_by})") if hardcover_sync_client: @@ -490,9 +513,9 @@ def set_manual_match(self, book_abs_id: str, input_str: str) -> bool: 
try: item = self.abs_client.get_item_details(book_abs_id) if item: - meta = item.get('media', {}).get('metadata', {}) - isbn = meta.get('isbn') - asin = meta.get('asin') + meta = item.get("media", {}).get("metadata", {}) + isbn = meta.get("isbn") + asin = meta.get("asin") except Exception as e: logger.warning(f"Failed to fetch ABS details during manual match: {e}") @@ -501,29 +524,32 @@ def set_manual_match(self, book_abs_id: str, input_str: str) -> bool: details = HardcoverDetails( abs_id=book_abs_id, book_id=book_id, - hardcover_book_id=match['book_id'], - hardcover_slug=match.get('slug'), - hardcover_edition_id=match.get('edition_id'), - hardcover_pages=match.get('pages'), - hardcover_audio_seconds=match.get('audio_seconds'), + hardcover_book_id=match["book_id"], + hardcover_slug=match.get("slug"), + hardcover_edition_id=match.get("edition_id"), + hardcover_pages=match.get("pages"), + hardcover_audio_seconds=match.get("audio_seconds"), isbn=isbn, asin=asin, - matched_by='manual', + matched_by="manual", ) self.database_service.save_hardcover_details(details) self.resolve_editions(details) log_hardcover_action( - self.database_service, abs_id=book_abs_id, book_id=book_id, - book_title=sanitize_log_data(match.get('title', '')), - direction='push', action='manual_match', - detail={'hardcover_book_id': match['book_id'], 'slug': match.get('slug'), - 'input': input_str}, + self.database_service, + abs_id=book_abs_id, + book_id=book_id, + book_title=sanitize_log_data(match.get("title", "")), + direction="push", + action="manual_match", + detail={"hardcover_book_id": match["book_id"], "slug": match.get("slug"), "input": input_str}, ) logger.info(f"Manually matched ABS {book_abs_id} to Hardcover {match['book_id']} ({match.get('title')})") if not book: - book = Book(abs_id=book_abs_id, title='', status='') - self._get_or_create_user_book(book, details, match.get('edition_id')) + book = Book(abs_id=book_abs_id, title="", status="") + book = self.database_service.save_book(book) + self._get_or_create_user_book(book, details, match.get("edition_id")) self._pull_dates_at_match(book) return True diff --git a/src/services/kosync_service.py b/src/services/kosync_service.py index b3c6fbb..eb57d23 100644 --- a/src/services/kosync_service.py +++ b/src/services/kosync_service.py @@ -33,23 +33,29 @@ def ensure_kosync_document(book, database_service): """ if not book or not book.kosync_doc_id or not book.id: return - existing = database_service.get_kosync_document(book.kosync_doc_id) - if existing: - if not existing.linked_book_id: - database_service.link_kosync_document(book.kosync_doc_id, book.id, book.abs_id) - logger.info(f"KOSync: Linked existing document {book.kosync_doc_id[:8]}... to '{book.title}'") - if not existing.filename and book.ebook_filename: - existing.filename = book.ebook_filename - database_service.save_kosync_document(existing) - else: - doc = KosyncDocument( - document_hash=book.kosync_doc_id, - linked_book_id=book.id, - linked_abs_id=book.abs_id, - filename=book.ebook_filename, + try: + existing = database_service.get_kosync_document(book.kosync_doc_id) + if existing: + if not existing.linked_book_id: + database_service.link_kosync_document(book.kosync_doc_id, book.id, book.abs_id) + logger.info(f"KOSync: Linked existing document {book.kosync_doc_id[:8]}... 
to '{book.title}'") + if not existing.filename and book.ebook_filename: + existing.filename = book.ebook_filename + database_service.save_kosync_document(existing) + else: + doc = KosyncDocument( + document_hash=book.kosync_doc_id, + linked_book_id=book.id, + linked_abs_id=book.abs_id, + filename=book.ebook_filename, + ) + database_service.save_kosync_document(doc) + logger.debug(f"KOSync: Created document {book.kosync_doc_id[:8]}... for '{book.title}'") + except Exception: + logger.warning( + f"KOSync: Failed to ensure document for book {book.id} " + f"(hash {book.kosync_doc_id[:8]}...) — will retry on next sync cycle" ) - database_service.save_kosync_document(doc) - logger.debug(f"KOSync: Created document {book.kosync_doc_id[:8]}... for '{book.title}'") class KosyncService: @@ -89,8 +95,12 @@ def resolve_book_by_sibling_hash(self, doc_id, existing_doc=None): if doc and doc.filename: # Find sibling document with same filename that's linked sibling = self._db.get_kosync_doc_by_filename(doc.filename) - if sibling and sibling.linked_abs_id and sibling.document_hash != doc_id: - book = self._db.get_book_by_abs_id(sibling.linked_abs_id) + if sibling and (sibling.linked_book_id or sibling.linked_abs_id) and sibling.document_hash != doc_id: + book = ( + self._db.get_book_by_id(sibling.linked_book_id) + if sibling.linked_book_id + else self._db.get_book_by_abs_id(sibling.linked_abs_id) + ) if book: logger.info(f"KOSync: Resolved {doc_id[:8]}... to '{book.title}' via filename sibling") return book @@ -160,8 +170,12 @@ def _find_epub_in_db(self, doc_hash): except FileNotFoundError: logger.debug(f"DB suggested '{doc.filename}' but file is missing — Re-scanning") - if doc and doc.linked_abs_id: - book = self._db.get_book_by_abs_id(doc.linked_abs_id) + if doc and (doc.linked_book_id or doc.linked_abs_id): + book = ( + self._db.get_book_by_id(doc.linked_book_id) + if doc.linked_book_id + else self._db.get_book_by_abs_id(doc.linked_abs_id) + ) if book and book.original_ebook_filename: try: self._container.ebook_parser().resolve_book_path(book.original_ebook_filename) @@ -244,7 +258,7 @@ def _find_epub_in_booklore(self, doc_hash): meta = json.loads(book.raw_metadata) fallback_id = meta.get("id") book_id = str(fallback_id) if fallback_id is not None else None - except (json.JSONDecodeError, AttributeError) as e: + except (json.JSONDecodeError, AttributeError, TypeError) as e: logger.debug(f"Failed to parse raw_metadata JSON: {e}") continue @@ -617,7 +631,9 @@ def handle_put_progress(self, data, remote_addr, debounce_manager=None): # Update linked book if exists linked_book = None - if kosync_doc.linked_abs_id: + if kosync_doc.linked_book_id: + linked_book = self._db.get_book_by_id(kosync_doc.linked_book_id) + elif kosync_doc.linked_abs_id: linked_book = self._db.get_book_by_abs_id(kosync_doc.linked_abs_id) else: linked_book = self._db.get_book_by_kosync_id(doc_hash) @@ -665,7 +681,11 @@ def handle_get_progress(self, doc_id, remote_addr): # Step 1: Direct hash lookup kosync_doc = self._db.get_kosync_document(doc_id) if kosync_doc: - if kosync_doc.linked_abs_id: + if kosync_doc.linked_book_id: + book = self._db.get_book_by_id(kosync_doc.linked_book_id) + if book: + return self.resolve_best_progress(doc_id, book) + elif kosync_doc.linked_abs_id: book = self._db.get_book_by_abs_id(kosync_doc.linked_abs_id) if book: return self.resolve_best_progress(doc_id, book) diff --git a/src/sync_clients/hardcover_sync_client.py b/src/sync_clients/hardcover_sync_client.py index 037586c..b917e3d 100644 --- 
a/src/sync_clients/hardcover_sync_client.py +++ b/src/sync_clients/hardcover_sync_client.py @@ -204,7 +204,7 @@ def update_progress(self, book: Book, request: UpdateProgressRequest) -> SyncRes current_status = self._handle_status_transition(book, hardcover_details, current_status, percentage, is_finished) try: - self.hardcover_client.update_progress( + result = self.hardcover_client.update_progress( ub['id'], page_num, edition_id=edition_id, @@ -212,8 +212,16 @@ def update_progress(self, book: Book, request: UpdateProgressRequest) -> SyncRes current_percentage=percentage, started_at=book.started_at, finished_at=book.finished_at, + cached_read_id=hardcover_details.hardcover_user_book_read_id, ) + if not result or not result.get("success"): + return SyncResult(None, False) + + if result.get("read_id") and result["read_id"] != hardcover_details.hardcover_user_book_read_id: + hardcover_details.hardcover_user_book_read_id = result["read_id"] + self.database_service.save_hardcover_details(hardcover_details) + actual_pct = 1.0 if is_finished and total_pages > 0 else ( min(page_num / total_pages, 1.0) if total_pages > 0 else percentage ) @@ -241,7 +249,7 @@ def _update_audiobook_progress(self, book, hardcover_details, ub, percentage, au try: progress_seconds = int(audio_seconds * percentage) - self.hardcover_client.update_progress( + result = self.hardcover_client.update_progress( ub['id'], 0, edition_id=edition_id or hardcover_details.hardcover_edition_id, @@ -250,8 +258,16 @@ def _update_audiobook_progress(self, book, hardcover_details, ub, percentage, au audio_seconds=audio_seconds, started_at=book.started_at, finished_at=book.finished_at, + cached_read_id=hardcover_details.hardcover_user_book_read_id, ) + if not result or not result.get("success"): + return SyncResult(None, False) + + if result.get("read_id") and result["read_id"] != hardcover_details.hardcover_user_book_read_id: + hardcover_details.hardcover_user_book_read_id = result["read_id"] + self.database_service.save_hardcover_details(hardcover_details) + updated_state = { 'pct': percentage, 'progress_seconds': progress_seconds, diff --git a/src/utils/cover_resolver.py b/src/utils/cover_resolver.py index 957e312..604dc07 100644 --- a/src/utils/cover_resolver.py +++ b/src/utils/cover_resolver.py @@ -12,8 +12,7 @@ def resolve_placeholder_logo(book, book_type, booklore_meta): return None -def resolve_book_covers(book, abs_service, database_service, book_type, - booklore_meta=None, hardcover_details=None): +def resolve_book_covers(book, abs_service, database_service, book_type, booklore_meta=None, hardcover_details=None): """Resolve cover URLs for a book using the priority waterfall. Priority chain: @@ -28,7 +27,7 @@ def resolve_book_covers(book, abs_service, database_service, book_type, """ custom_cover_url = book.custom_cover_url or None abs_cover_url = None - if book.abs_id and book_type != 'ebook-only' and not book.abs_id.startswith('bf-'): + if book.abs_id and book_type != "ebook-only" and not book.abs_id.startswith("bf-"): abs_cover_url = f"/api/cover-proxy/{book.abs_id}" # Cover URL -- preserve custom override, otherwise walk the waterfall. 
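The waterfall that resolve_book_covers() walks can be pictured as an ordered chain of candidates where the first non-empty URL wins. A minimal self-contained sketch of that shape; the function name and example values here are illustrative, not the module's actual API:

    def resolve_cover_url(custom, abs_proxy, booklore, kosync, hardcover):
        # Walk the waterfall in priority order; the first truthy URL wins.
        for url in (custom, abs_proxy, booklore, kosync, hardcover):
            if url:
                return url
        return None

    # A book with no custom override but a mapped ABS item resolves to the
    # ABS proxy URL (hypothetical id).
    print(resolve_cover_url(None, "/api/cover-proxy/li_abc123", None, None, None))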
@@ -37,14 +36,15 @@ def resolve_book_covers(book, abs_service, database_service, book_type, # Booklore cover (authenticated proxy, always available if metadata exists) if not cover_url and booklore_meta: - bl_id = (booklore_meta.raw_metadata_dict or {}).get('id') + bl_id = (booklore_meta.raw_metadata_dict or {}).get("id") if bl_id: from src.blueprints.helpers import booklore_cover_proxy_prefix + prefix = booklore_cover_proxy_prefix(booklore_meta.server_id) cover_url = f"{prefix}/{bl_id}" if not cover_url and book.kosync_doc_id: - cover_url = f'/covers/{book.kosync_doc_id}.jpg' + cover_url = f"/covers/{book.kosync_doc_id}.jpg" # Hardcover cover fallback if not cover_url and book.id: @@ -60,9 +60,9 @@ def resolve_book_covers(book, abs_service, database_service, book_type, fallback_cover_url = None return { - 'cover_url': cover_url, - 'custom_cover_url': custom_cover_url, - 'abs_cover_url': abs_cover_url, - 'fallback_cover_url': fallback_cover_url, - 'placeholder_logo': resolve_placeholder_logo(book, book_type, booklore_meta), + "cover_url": cover_url, + "custom_cover_url": custom_cover_url, + "abs_cover_url": abs_cover_url, + "fallback_cover_url": fallback_cover_url, + "placeholder_logo": resolve_placeholder_logo(book, book_type, booklore_meta), } diff --git a/templates/settings.html b/templates/settings.html index d05a5e9..e7a1ae9 100644 --- a/templates/settings.html +++ b/templates/settings.html @@ -386,6 +386,7 @@

KOSync

Sync Source

+
Enable
' + '
' + '
' + - '' + escapeHtml(match.source_family || 'unknown') + '' + + '' + escapeHtml(match.source_family || match.source || 'unknown') + '' + formatEvidence(match.evidence) + '
' + (match.highlight_count ? '
BookFusion highlights: ' + escapeHtml(match.highlight_count) + '
' : '') + @@ -119,6 +124,7 @@ '

' + escapeHtml(suggestion.title) + '

' + '

' + escapeHtml(suggestion.author || 'Unknown author') + '

' + '
' + + (suggestion.source && suggestion.source !== 'abs' ? '' + escapeHtml(suggestion.source) + '' : '') + '' + escapeHtml((suggestion.matches || []).length) + ' candidates' + (suggestion.hidden ? 'Hidden' : '') + (suggestion.has_bookfusion_evidence ? 'BookFusion evidence' : '') + diff --git a/templates/bookfusion.html b/templates/bookfusion.html index 33ab958..c32395f 100644 --- a/templates/bookfusion.html +++ b/templates/bookfusion.html @@ -63,11 +63,11 @@

BookFusion

Search your Booklore library
Type above to find books to upload
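The suggestion-card script above falls back through source_family, then source, then 'unknown' when labelling a match. A plausible sketch of the server-side shape those lookups imply; field names are inferred from the template, not copied from the shared serializer:

    def serialize_match(match: dict) -> dict:
        # Emit both the coarse family label and the raw source so the client
        # can degrade gracefully when either field is missing.
        return {
            "title": match.get("title"),
            "abs_id": match.get("abs_id"),
            "source": match.get("source"),
            "source_family": match.get("source_family"),
            "evidence": match.get("evidence", []),
            "highlight_count": match.get("highlight_count", 0),
        }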
diff --git a/templates/index.html b/templates/index.html index 25dd6b0..d266ff5 100644 --- a/templates/index.html +++ b/templates/index.html @@ -74,7 +74,7 @@

Processing

{% for mapping in processing_books %} {{ render_book_card(mapping, integrations, booklore_label) }} {% endfor %}
@@ -101,6 +101,43 @@

Pending Identification

{% endif %} + {% if top_suggestions %} +
+ +

High-confidence matches found for books you're listening to.

+
+ {% for sg in top_suggestions %} +
+ +
+
{{ sg.title }}
+ {% if sg.author %}
{{ sg.author }}
{% endif %} + {% if sg.top_match %} +
+ {{ sg.top_match.title }} + {{ sg.top_match.confidence }} +
+ {% endif %} +
+ {% if sg.source == 'abs' %} + Map Now + {% elif sg.top_match and sg.top_match.abs_id %} + Map Now + {% else %} + Review + {% endif %} + +
+
+
+ {% endfor %} +
+
+ {% endif %} +
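How the dashboard's top_suggestions list is assembled is not visible in this hunk; a plausible sketch, assuming each pending suggestion carries candidate matches with a numeric confidence (function name, threshold, and field names are assumptions):

    def pick_top_suggestions(pending, limit=4, threshold=0.85):
        # Keep suggestions whose best candidate clears the confidence bar,
        # attach that candidate as top_match, and surface the newest first.
        ranked = []
        for sg in pending:
            matches = sg.get("matches") or []
            if not matches:
                continue
            best = max(matches, key=lambda m: m.get("confidence", 0))
            if best.get("confidence", 0) >= threshold:
                ranked.append({**sg, "top_match": best})
        ranked.sort(key=lambda s: s.get("created_at", 0), reverse=True)
        return ranked[:limit]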

Currently Reading

@@ -109,7 +146,7 @@

Currently Reading

{% for mapping in mappings %} {% if mapping.unified_progress > 0 and mapping.unified_progress < 100 and mapping.status not in ['completed', 'paused', 'dnf', 'not_started', 'pending', 'processing', 'failed_retry_later'] %} {{ render_book_card(mapping, integrations, booklore_label) }} {% endif %} {% endfor %}
{% set finished_books = mappings|selectattr('status', 'equalto', 'completed')|list + mappings|rejectattr('status', 'equalto', 'completed')|selectattr('unified_progress', 'ge', 100)|list %} @@ -119,7 +156,7 @@

Finished

{% for mapping in finished_books %} {{ render_book_card(mapping, integrations, booklore_label) }} {% endfor %}
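The finished_books expression above unions two disjoint selections: books explicitly marked completed, plus books whose unified progress reached 100% without the status ever being set, so no mapping is rendered twice. The same selection in plain Python, assuming dict-shaped mappings:

    def finished_books(mappings):
        completed = [m for m in mappings if m.get("status") == "completed"]
        by_progress = [
            m for m in mappings
            if m.get("status") != "completed" and m.get("unified_progress", 0) >= 100
        ]
        return completed + by_progress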
@@ -131,7 +168,7 @@

Paused

{% for mapping in paused_books %} {{ render_book_card(mapping, integrations, booklore_label) }} {% endfor %}
@@ -143,7 +180,7 @@

Did Not Finish

{% for mapping in dnf_books %} {{ render_book_card(mapping, integrations, booklore_label) }} {% endfor %}
@@ -155,7 +192,7 @@

All Books

{% for mapping in mappings %} {{ render_book_card(mapping, integrations, booklore_label) }} {% endfor %}
@@ -227,13 +264,13 @@

Link to Hardcover

- + - + {% set abs_url = get_header_service_url('ABS') %} {% set storyteller_url = get_header_service_url('STORYTELLER') %} {% set booklore_url = get_header_service_url('BOOKLORE') %} {% set booklore2_url = get_header_service_url('BOOKLORE_2') %} {% set cwa_url = get_header_service_url('CWA') %} {% set hardcover_url = get_header_service_url('HARDCOVER') %} {% set show_bookfusion_page = get_bool('BOOKFUSION_ENABLED') and (get_val('BOOKFUSION_API_KEY') or get_val('BOOKFUSION_UPLOAD_API_KEY')) %} {% set show_bookfusion_icon = get_bool('BOOKFUSION_ENABLED') %} {% set has_any_link = abs_url or storyteller_url or booklore_url or booklore2_url or cwa_url or hardcover_url or show_bookfusion_icon %} {% if has_any_link %} {{ service_links('library-links-desktop') }} @@ -85,8 +85,8 @@

PageKeeper

{% if show_bookfusion_page %} BookFusion Books {% endif %} - {% if get_bool('SUGGESTIONS_ENABLED') and abs_url %} - Suggestions + {% if get_bool('SUGGESTIONS_ENABLED') %} + Suggestions{% if suggestion_count %} {{ suggestion_count }}{% endif %} {% endif %} Batch Logs diff --git a/templates/reading_detail.html b/templates/reading_detail.html index 9f6675f..ec61ee0 100644 --- a/templates/reading_detail.html +++ b/templates/reading_detail.html @@ -177,8 +177,8 @@

{{ book.title }}{% if metadata.subtitle %}: {{ metada {# ── Overview Tab ── #}
{# Services — dashboard order: ABS, KoSync, Storyteller, Booklore, Hardcover, BookFusion #} {% set any_service = integrations.abs or integrations.kosync or integrations.storyteller or integrations.booklore or integrations.hardcover or integrations.bookfusion or services_enabled.storyteller or services_enabled.booklore or services_enabled.hardcover or services_enabled.bookfusion or (services_enabled.abs and not integrations.abs and book.book_type == 'ebook-only') %} {% if any_service %}
@@ -230,26 +230,26 @@

{{ book.title }}{% if metadata.subtitle %}: {{ metada {% endif %} {% endif %} {# ── Booklore ── #} {% if services_enabled.booklore %} {% if integrations.booklore %}
BL {% if metadata.booklore_url %} Booklore {% else %} Booklore {% endif %} {% if service_states.booklore is defined %}{{ '%.1f' | format(service_states.booklore.percentage) }}%{% else %}--{% endif %}
{% else %}
BL Booklore
{% endif %} {% endif %} @@ -745,13 +745,13 @@

Link to Hardcover

{# ── Booklore Link Modal ── #}
@@ -287,83 +287,83 @@

Storyteller

- -
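The service rows in reading_detail.html read per-client percentages out of a service_states mapping guarded with "is defined", so unlinked services simply render "--". A minimal sketch of how such a mapping could be built from stored State rows (the model shape is a stand-in, and whether percentages are stored as 0-1 fractions or 0-100 values is an assumption here):

    from dataclasses import dataclass

    @dataclass
    class StateRow:  # stand-in for the ORM State model
        client_name: str
        last_updated: float
        percentage: float

    def build_service_states(states):
        # Keep only the most recently updated row per client; the template
        # formats with '%.1f' and appends '%', so expose 0-100 values.
        latest = {}
        for st in states:
            cur = latest.get(st.client_name)
            if cur is None or st.last_updated > cur.last_updated:
                latest[st.client_name] = st
        return {name: {"percentage": (st.percentage or 0.0) * 100} for name, st in latest.items()}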