[MEDIUM] Document watcher: folder-deletion handling, debouncing, thread cleanup #38

@matthewod11-stack

Description

The file watcher in documents.rs has several resilience gaps that could cause data loss, deadlocks, or resource leaks.

Current State

  • src-tauri/src/documents.rs:1073+ — watcher thread spins on rx.recv_timeout(Duration::from_secs(2)) indefinitely. If the watched folder is deleted / renamed / unmounted (external drive), notify stops delivering events but the thread keeps looping.
  • The next scan via discover_files on a missing folder returns an empty list, so the chunk index gets wiped.
  • WatcherState.handle uses std::sync::Mutex; start() calls stop(), which calls h.join() and blocks the calling thread, while still holding the mutex. On the Tauri tokio thread pool this can deadlock.
  • No explicit debouncing (notify 7 itself does not appear to debounce; companion crates such as notify-debouncer-full do), so rapid file churn (e.g., git operations in the watched folder) fires many events.
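
The locking hazard in the third bullet comes from joining while the guard is still alive. As a minimal std-only sketch of the alternative ordering (not the tokio refactor proposed in this issue), the handle can be moved out of the mutex first and joined afterwards. The struct below is a hypothetical stand-in for WatcherState: the stop_flag field and the sleep-based loop body are assumptions, not the code in documents.rs.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical stand-in for WatcherState; field names are assumptions.
#[derive(Default)]
pub struct WatcherState {
    handle: Mutex<Option<JoinHandle<()>>>,
    stop_flag: Arc<AtomicBool>,
}

impl WatcherState {
    /// Stops any previous watcher, then spawns a new one.
    /// Concurrent start() calls would still need extra serialization.
    pub fn start(&self) {
        // No lock is held at this point, so the join inside stop()
        // cannot block other threads that want the mutex.
        self.stop();
        self.stop_flag.store(false, Ordering::SeqCst);
        let flag = Arc::clone(&self.stop_flag);
        let worker = thread::spawn(move || {
            while !flag.load(Ordering::SeqCst) {
                // Placeholder for the real rx.recv_timeout(..) loop body.
                thread::sleep(Duration::from_millis(10));
            }
        });
        *self.handle.lock().unwrap() = Some(worker);
    }

    /// Returns true if a running watcher thread was joined.
    pub fn stop(&self) -> bool {
        self.stop_flag.store(true, Ordering::SeqCst);
        // The guard is a temporary that is dropped at the end of this
        // statement, so the mutex is already released when join() runs.
        let taken = self.handle.lock().unwrap().take();
        match taken {
            Some(h) => {
                let _ = h.join();
                true
            }
            None => false,
        }
    }
}
```

Writing `match self.handle.lock().unwrap().take() { .. }` directly would keep the guard alive for the whole match, which is why the sketch binds the result first. The join still blocks the calling thread, so on an async runtime this only narrows the problem; it does not replace the async stop() suggested below.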

Suggested Fix

  • On notify::EventKind::Remove targeting the watched root path, call stop() on self and emit a documents-folder-missing Tauri event so the UI can prompt the user to reselect a folder.
  • Before each rescan, check Path::exists(&folder_path) and skip the rescan if it returns false, to prevent the index wipe.
  • Replace std::sync::Mutex + std::thread::JoinHandle with tokio::sync::Mutex + tokio::task::JoinHandle; make stop() async.
  • Add a debouncer (e.g., notify-debouncer-full in a release that targets notify 7; 0.3 appears to depend on notify 6, so verify before pinning) or a manual 500ms collector to coalesce rapid events.
  • Add a test (or manual verification checklist) for: folder deleted mid-session, folder renamed, external drive unmounted.
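
The first two bullets could be sketched roughly as follows. This is a hedged illustration, not the code in documents.rs: RescanOutcome, removal_hits_root, guarded_rescan, and list_files are hypothetical names, the index is modeled as a plain Vec<PathBuf>, and list_files merely stands in for discover_files by returning an empty list when the folder cannot be read.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical result type so the caller can decide whether to notify the UI.
#[derive(Debug, PartialEq)]
pub enum RescanOutcome {
    Indexed(usize),
    FolderMissing,
}

/// True when the removed path is the watched root itself or one of its
/// ancestors (removing a parent directory also removes the root).
pub fn removal_hits_root(removed: &Path, root: &Path) -> bool {
    root.starts_with(removed)
}

/// Stand-in for discover_files: returns an empty list on any read error,
/// which is the behavior that makes an unguarded rescan dangerous.
pub fn list_files(root: &Path) -> Vec<PathBuf> {
    std::fs::read_dir(root)
        .map(|entries| {
            entries
                .filter_map(|e| e.ok())
                .map(|e| e.path())
                .collect::<Vec<_>>()
        })
        .unwrap_or_default()
}

/// Replaces the index only when the root still exists; otherwise the
/// previous index is left untouched and the caller is told why.
pub fn guarded_rescan(
    root: &Path,
    index: &mut Vec<PathBuf>,
    discover: impl Fn(&Path) -> Vec<PathBuf>,
) -> RescanOutcome {
    if !root.exists() {
        return RescanOutcome::FolderMissing;
    }
    *index = discover(root);
    RescanOutcome::Indexed(index.len())
}
```

A caller would presumably map FolderMissing to the documents-folder-missing event. Note that the exists() check is inherently racy (the folder can vanish between the check and the scan), and an unmounted drive may leave behind an empty mount point that still exists, so the manual checks in the last bullet remain necessary.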

Verification

  • cargo test documents passes
  • Manual: delete the watched folder → UI prompt appears, no data loss.
  • Manual: rapid file churn (e.g., for i in {1..100}; do touch file$i; done) → watcher processes one coalesced event, not 100.
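
The churn check could also be approximated in a unit test without touching the filesystem. The sketch below assumes the "manual collector" option and models events as bare PathBuf values on a std mpsc channel; collect_batch and its quiet/max_wait parameters are hypothetical names, and a real loop would likely keep recv_timeout for the first event so it can still observe a stop signal.

```rust
use std::collections::HashSet;
use std::path::PathBuf;
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::{Duration, Instant};

/// Blocks until one path arrives, then keeps collecting until the channel
/// has been quiet for `quiet` or `max_wait` has elapsed since the first
/// event. Returns None once the sender is gone and nothing is pending.
pub fn collect_batch(
    rx: &Receiver<PathBuf>,
    quiet: Duration,
    max_wait: Duration,
) -> Option<HashSet<PathBuf>> {
    let first = rx.recv().ok()?;
    let started = Instant::now();
    let mut batch = HashSet::new();
    batch.insert(first);
    loop {
        // Cap the total wait so constant churn cannot starve the rescan.
        let remaining = max_wait.saturating_sub(started.elapsed());
        if remaining.is_zero() {
            break;
        }
        match rx.recv_timeout(quiet.min(remaining)) {
            // Duplicate paths collapse because the batch is a set.
            Ok(path) => {
                batch.insert(path);
            }
            Err(RecvTimeoutError::Timeout) | Err(RecvTimeoutError::Disconnected) => break,
        }
    }
    Some(batch)
}
```

With quiet set to 500ms, a burst of 100 touch events would yield a single batch and therefore a single rescan, which is the behavior the manual check above looks for.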

Automation Hints

scope: src-tauri/src/documents.rs
do-not-touch: FTS schema, chunking logic
approach: refactor-to-config
risk: medium (async refactor)
max-files-changed: 2
blocked-by: none
bail-if: notify 7 lacks sufficient debouncing primitives without a second crate

Priority

Medium — avoids data-loss edge cases for document-heavy users.

Metadata

Labels

  • bug: Something isn't working
  • hardening: Reliability or defense-in-depth improvement
