feat: Implement Logseq directory import and file sync with simplified DDD architecture #3

cyrusagent · 2025-10-19T03:41:59Z

Summary

Implements the core file processing system for importing and syncing Logseq markdown directories using a pragmatic DDD architecture suitable for a personal project.

Resolves #PER-5

Core Features Implemented

🗂️ ImportService

✅ Import entire Logseq directories (pages/ and journals/)
✅ Bounded concurrency (4-6 files at once, configurable)
✅ Real-time progress tracking with callbacks
✅ Graceful error handling (continues on individual file failures)
✅ Returns ImportSummary with detailed statistics
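
The "continue on error, report progress" behavior listed above can be sketched with the standard library alone. This is only an illustration: `SummarySketch`, `import_all`, and the callback shapes are hypothetical names, not the PR's actual `ImportService` or `ImportSummary` API, and the real service runs asynchronously.

```rust
/// Hypothetical stand-in for the PR's ImportSummary (real fields are not shown in this PR text).
#[derive(Debug, Default)]
struct SummarySketch {
    succeeded: usize,
    failures: Vec<(String, String)>, // (file name, error message)
}

/// Processes every file, records failures instead of aborting, and reports
/// progress as (files done, total files) after each file.
fn import_all<P, C>(files: &[&str], mut process: P, mut on_progress: C) -> SummarySketch
where
    P: FnMut(&str) -> Result<(), String>,
    C: FnMut(usize, usize),
{
    let mut summary = SummarySketch::default();
    for (index, file) in files.iter().copied().enumerate() {
        match process(file) {
            Ok(()) => summary.succeeded += 1,
            // A single bad file is collected and the loop carries on.
            Err(message) => summary.failures.push((file.to_string(), message)),
        }
        on_progress(index + 1, files.len());
    }
    summary
}

fn main() {
    let files = ["pages/a.md", "pages/broken.md", "journals/2025_10_19.md"];
    let summary = import_all(
        &files,
        |file| {
            if file.contains("broken") {
                Err("parse error".to_string())
            } else {
                Ok(())
            }
        },
        |done, total| println!("{}/{} files processed", done, total),
    );
    println!("{:?}", summary);
}
```

Collecting failures in the summary keeps the caller in charge of how to surface them, which fits the "direct callbacks" approach described in this PR.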

🔄 SyncService

✅ Incremental file synchronization with file watching
✅ 500ms debouncing window (configurable)
✅ Auto-sync on file changes (create, update, delete)
✅ Event callbacks for sync operations
✅ Runs indefinitely watching for changes
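
To make the debouncing window concrete, here is a simplified model of trailing-edge debouncing over timestamped events. It is a sketch under stated assumptions: the `debounce` function is hypothetical, timestamps are plain milliseconds, and this is not the algorithm used inside notify-debouncer-mini, only an illustration of the idea that a burst of changes to one file yields a single sync.

```rust
use std::collections::HashMap;

/// Collapses bursts of events per path: a path is emitted once, `window_ms`
/// after its most recent event, provided no newer event arrived before then.
/// `events` must be sorted by timestamp (milliseconds).
fn debounce(events: &[(u64, &str)], window_ms: u64) -> Vec<(u64, String)> {
    let mut pending: HashMap<&str, u64> = HashMap::new(); // path -> last event time
    let mut emitted = Vec::new();

    for &(time, path) in events {
        // Flush every path whose quiet period ended at or before this event.
        let ready: Vec<&str> = pending
            .iter()
            .filter(|(_, last)| **last + window_ms <= time)
            .map(|(path, _)| *path)
            .collect();
        for quiet in ready {
            let last = pending.remove(quiet).unwrap();
            emitted.push((last + window_ms, quiet.to_string()));
        }
        pending.insert(path, time);
    }

    // End of input: everything still pending fires after its own window.
    for (path, last) in pending {
        emitted.push((last + window_ms, path.to_string()));
    }
    emitted.sort();
    emitted
}

fn main() {
    // Two quick saves of a.md collapse into one emission.
    let events = [(0, "a.md"), (100, "a.md"), (200, "b.md"), (900, "a.md")];
    for (time, path) in debounce(&events, 500) {
        println!("t={}ms sync {}", time, path);
    }
}
```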

📝 Logseq Markdown Parser

✅ Async file parsing with Tokio
✅ Indentation-based hierarchy parsing (tabs or 2-space indents)
✅ Automatic URL extraction from content
✅ Page reference ([[page]]) and tag (#tag) extraction
✅ Converts markdown files to Page and Block domain objects
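
A minimal sketch of indentation-based hierarchy parsing, assuming one nesting level per leading tab or per two leading spaces and a `- ` bullet prefix. `OutlineBlock`, `indent_level`, and `parse_outline` are hypothetical names; the PR's parser builds `Page` and `Block` domain objects, which are not reproduced here.

```rust
/// Hypothetical flat representation of a parsed block.
#[derive(Debug)]
struct OutlineBlock {
    depth: usize,
    parent: Option<usize>, // index of the parent block, None for root blocks
    content: String,
}

/// Nesting depth of a line: one level per leading tab or per two leading spaces.
fn indent_level(line: &str) -> usize {
    let mut spaces = 0;
    let mut level = 0;
    for ch in line.chars() {
        match ch {
            '\t' => level += 1,
            ' ' => {
                spaces += 1;
                if spaces == 2 {
                    level += 1;
                    spaces = 0;
                }
            }
            _ => break,
        }
    }
    level
}

/// Parses bullet lines into blocks, linking each block to its nearest
/// shallower predecessor. Blank lines are skipped.
fn parse_outline(text: &str) -> Vec<OutlineBlock> {
    let mut blocks: Vec<OutlineBlock> = Vec::new();
    let mut ancestors: Vec<usize> = Vec::new(); // indices of currently open ancestors

    for line in text.lines().filter(|line| !line.trim().is_empty()) {
        let depth = indent_level(line);
        // Close every ancestor that is at the same depth or deeper.
        while let Some(&top) = ancestors.last() {
            if blocks[top].depth >= depth {
                ancestors.pop();
            } else {
                break;
            }
        }
        let trimmed = line.trim_start();
        let content = trimmed.strip_prefix("- ").unwrap_or(trimmed).to_string();
        blocks.push(OutlineBlock { depth, parent: ancestors.last().copied(), content });
        ancestors.push(blocks.len() - 1);
    }
    blocks
}

fn main() {
    let text = "- parent\n  - child\n\t\t- grandchild\n- sibling";
    for block in parse_outline(text) {
        println!("{}{} (parent: {:?})", "  ".repeat(block.depth), block.content, block.parent);
    }
}
```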

📂 File System Utilities

✅ Recursive markdown file discovery
✅ Logseq-specific directory filtering
✅ Cross-platform file watching with debouncing
✅ Event filtering to .md files only
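
Recursive discovery limited to `pages/` and `journals/` might look roughly like the following. This is a synchronous `std::fs` stand-in with hypothetical names (`find_markdown_files`, `find_logseq_files`); the PR's `discover_markdown_files()` and `discover_logseq_files()` are async and their signatures are not shown here.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collects every `.md` file under `dir`.
fn find_markdown_files(dir: &Path, found: &mut Vec<PathBuf>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            find_markdown_files(&path, found)?;
        } else if path.extension().map_or(false, |ext| ext == "md") {
            found.push(path);
        }
    }
    Ok(())
}

/// Looks only inside the Logseq content directories, ignoring everything else.
fn find_logseq_files(root: &Path) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    for sub in ["pages", "journals"] {
        let dir = root.join(sub);
        if dir.is_dir() {
            find_markdown_files(&dir, &mut found)?;
        }
    }
    found.sort(); // deterministic order for display and tests
    Ok(found)
}

fn main() -> io::Result<()> {
    // Build a throwaway directory tree so the example runs anywhere.
    let root = std::env::temp_dir().join(format!("logseq_demo_{}", std::process::id()));
    fs::create_dir_all(root.join("pages").join("nested"))?;
    fs::create_dir_all(root.join("journals"))?;
    fs::create_dir_all(root.join("assets"))?;
    fs::write(root.join("pages").join("a.md"), "- a")?;
    fs::write(root.join("pages").join("nested").join("b.md"), "- b")?;
    fs::write(root.join("journals").join("2025_10_19.md"), "- entry")?;
    fs::write(root.join("assets").join("ignored.md"), "- not scanned")?;

    for file in find_logseq_files(&root)? {
        println!("{}", file.display());
    }
    fs::remove_dir_all(&root)
}
```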

Architecture

This implementation follows a three-layer DDD architecture:

Domain Layer: Value objects (LogseqDirectoryPath, ImportProgress) and events
Application Layer: Services (ImportService, SyncService)
Infrastructure Layer: File system operations, markdown parsing

See IMPLEMENTATION.md for detailed architecture documentation.

Key Design Decisions

Following simplified DDD for personal projects:

✅ No complex event sourcing (events for notifications only)
✅ Direct callbacks (no event bus/CQRS complexity)
✅ Simple error handling (continue on error, collect failures)
✅ File system as source of truth (no conflict resolution)
✅ Bounded concurrency using tokio::sync::Semaphore
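
The bounded-concurrency decision can be illustrated without an async runtime. The sketch below swaps `tokio::sync::Semaphore` for a small counting semaphore built from `std::sync::Mutex` and `Condvar`, and async tasks for scoped threads, so it is a stand-in for the technique rather than the PR's code. `CountingSemaphore` and `process_bounded` are hypothetical names, and the sleep stands in for parsing a file.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Condvar, Mutex};
use std::thread;
use std::time::Duration;

/// Minimal counting semaphore built from std primitives (illustrative only).
struct CountingSemaphore {
    permits: Mutex<usize>,
    freed: Condvar,
}

impl CountingSemaphore {
    fn new(permits: usize) -> Self {
        Self { permits: Mutex::new(permits), freed: Condvar::new() }
    }

    /// Blocks until a permit is available, then takes it.
    fn acquire(&self) {
        let mut permits = self.permits.lock().unwrap();
        while *permits == 0 {
            permits = self.freed.wait(permits).unwrap();
        }
        *permits -= 1;
    }

    /// Returns a permit and wakes one waiting thread.
    fn release(&self) {
        *self.permits.lock().unwrap() += 1;
        self.freed.notify_one();
    }
}

/// Processes every file with at most `limit` in flight (`limit` must be > 0).
/// Returns (files processed, peak number of files in flight).
fn process_bounded(files: &[&str], limit: usize) -> (usize, usize) {
    let semaphore = CountingSemaphore::new(limit);
    let in_flight = AtomicUsize::new(0);
    let peak = AtomicUsize::new(0);
    let processed = AtomicUsize::new(0);

    thread::scope(|scope| {
        for _file in files {
            scope.spawn(|| {
                semaphore.acquire();
                let now = in_flight.fetch_add(1, Ordering::SeqCst) + 1;
                peak.fetch_max(now, Ordering::SeqCst);
                thread::sleep(Duration::from_millis(5)); // stand-in for parsing a file
                in_flight.fetch_sub(1, Ordering::SeqCst);
                processed.fetch_add(1, Ordering::SeqCst);
                semaphore.release();
            });
        }
    });

    (processed.load(Ordering::SeqCst), peak.load(Ordering::SeqCst))
}

fn main() {
    let files = ["a.md", "b.md", "c.md", "d.md", "e.md", "f.md", "g.md", "h.md"];
    let (processed, peak) = process_bounded(&files, 4);
    println!("processed {} files, peak concurrency {}", processed, peak);
}
```

With an async runtime the same shape applies: acquire a permit before starting work on a file and release it when the file is done, so the limit holds no matter how many files are queued.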

Dependencies Added

notify (6.1): Cross-platform file watching
notify-debouncer-mini (0.4): Event debouncing
tokio (1.41): Async runtime
uuid (1.11): UUID generation
thiserror, anyhow: Error handling
tracing: Structured logging
tempfile (dev): Testing utilities

Testing

✅ Comprehensive unit tests for all components
✅ Domain layer: Value objects and events
✅ Infrastructure layer: Parser, file discovery, watcher
✅ Application layer: Import service statistics
✅ Integration test structure ready

Documentation

✅ Comprehensive IMPLEMENTATION.md
✅ CHANGELOG.md with detailed changes
✅ Inline code documentation
✅ Usage examples

What's Next

Future enhancements (not included in this PR):

SQLite persistence for PageRepository
File→Page mapping for proper deletion handling
Tauri integration (commands and event emitters)
Full-text search with Tantivy
UI components for import/sync status

Testing Instructions

Build the project:
```
cargo build
```
Run tests:
```
cargo test
```
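Run with logging (cargo captures output from passing tests by default, so pass --nocapture to see log lines):
```
RUST_LOG=debug cargo test -- --nocapture
```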

Review Focus Areas

Architecture: Is the simplified DDD approach appropriate?
Error Handling: Does the "continue on error" strategy make sense?
Concurrency: Is the bounded concurrency (4-6 files) reasonable?
API Design: Are the service interfaces intuitive?
Testing: Is test coverage adequate?

🤖 Generated with Claude Code

… DDD architecture

This commit implements the core file processing system for importing and syncing Logseq markdown directories, following a pragmatic DDD approach suitable for a personal project.

## Core Features

### ImportService
- Import entire Logseq directories (pages/ and journals/)
- Bounded concurrency (4-6 files at once, configurable)
- Real-time progress tracking with callbacks
- Graceful error handling (continues on individual file failures)
- Returns ImportSummary with statistics

### SyncService
- Incremental file synchronization with file watching
- 500ms debouncing window (configurable)
- Auto-sync on file changes (create, update, delete)
- Event callbacks for sync operations
- Runs indefinitely watching for changes

## Domain Layer Changes

### New Value Objects
- LogseqDirectoryPath: Validated directory with pages/ and journals/
- ImportProgress: Tracks import progress (files, percentage)

### New Domain Events
- Import events: ImportStarted, FileProcessed, ImportCompleted, ImportFailed
- Sync events: SyncStarted, FileCreatedEvent, FileUpdatedEvent, FileDeletedEvent, SyncCompleted

## Infrastructure Layer (New)

### Logseq Markdown Parser
- Async file parsing with Tokio
- Indentation-based hierarchy (tabs or 2-space indents)
- URL extraction (http://, https://)
- Page reference ([[page]]) and tag (#tag) extraction
- Converts markdown to Page/Block domain objects

### File System Utilities
- discover_markdown_files(): Recursive .md file discovery
- discover_logseq_files(): Find files in pages/ and journals/
- LogseqFileWatcher: Cross-platform file watching with debouncing
- Filters to only .md files in Logseq directories

## Dependencies Added
- notify (6.1): Cross-platform file watching
- notify-debouncer-mini (0.4): Event debouncing
- tokio (1.41): Async runtime with fs, rt-multi-thread, macros, sync, time
- serde, serde_json: Serialization
- thiserror, anyhow: Error handling
- tracing, tracing-subscriber: Structured logging
- uuid (1.11): UUID generation for IDs
- tempfile (3.14): Dev dependency for tests

## Architecture Decisions

Following simplified DDD for personal projects:
- No complex event sourcing (events for notifications only)
- Direct callbacks (no event bus/CQRS complexity)
- Simple error handling (continue on error, collect failures)
- File system as source of truth (no conflict resolution)
- No import session persistence

## Documentation
- Comprehensive IMPLEMENTATION.md with architecture, components, usage examples
- CHANGELOG.md documenting all changes
- Inline code documentation and tests

## Testing
- Unit tests for all components
- Integration test structure ready
- Test coverage for domain, infrastructure, and application layers

Resolves: PER-5

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

- Fix partial move error in sync_service.rs by borrowing operation in match
- Add Entity trait import to import_service.rs tests
- Update watcher.rs to use DebouncedEventKind from notify-debouncer-mini
- Fix Block constructor calls to match correct signatures (new_root, new_child)
- Fix iterator issues in tests by collecting all_blocks() before indexing
- Add Box::pin to discover_markdown_files for recursive async function
- Remove unused imports and variables to eliminate warnings

- Fix type mismatch in sync_service.rs: clone PathBuf when passing to event
- Fix non-exhaustive pattern match in watcher.rs: add wildcard pattern
- Remove unused imports from import_service.rs, entities.rs, and aggregates.rs
- Remove unused variable path_buf from import_service.rs

These fixes address all compilation errors reported by the GitHub Actions CI.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Fixed the extract_page_references function to preserve the order in which [[page references]] and #tags appear in the markdown content.

Previously, all [[brackets]] were extracted first, then all #tags, which broke the ordering expected by the tests. Rewrote the parser as a single character-by-character pass that keeps references and tags in source order.

Fixes the test_extract_page_references test failure.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
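
The single-pass idea can be sketched as follows. This is not the PR's `extract_page_references` implementation: `extract_references` is a hypothetical name, and the tag rules (a `#` at the start of the text or after whitespace, followed by letters, digits, `_` or `-`) are assumptions made for the example.

```rust
/// Scans the text once, left to right, and returns page references and tags
/// in the order they appear.
fn extract_references(text: &str) -> Vec<String> {
    let chars: Vec<char> = text.chars().collect();
    let mut refs = Vec::new();
    let mut i = 0;

    while i < chars.len() {
        if chars[i] == '[' && chars.get(i + 1) == Some(&'[') {
            // Possible [[page reference]]: look for the closing "]]".
            let start = i + 2;
            let mut close = None;
            let mut j = start;
            while j + 1 < chars.len() {
                if chars[j] == ']' && chars[j + 1] == ']' {
                    close = Some(j);
                    break;
                }
                j += 1;
            }
            if let Some(end) = close {
                let name: String = chars[start..end].iter().collect();
                if !name.is_empty() {
                    refs.push(name);
                }
                i = end + 2;
                continue;
            }
        } else if chars[i] == '#' && (i == 0 || chars[i - 1].is_whitespace()) {
            // Possible #tag: consume the run of tag characters after '#'.
            let start = i + 1;
            let mut end = start;
            while end < chars.len()
                && (chars[end].is_alphanumeric() || chars[end] == '_' || chars[end] == '-')
            {
                end += 1;
            }
            if end > start {
                refs.push(chars[start..end].iter().collect());
                i = end;
                continue;
            }
        }
        i += 1;
    }
    refs
}

fn main() {
    let line = "read #rust notes in [[Project Plan]] before #review";
    println!("{:?}", extract_references(line));
}
```

Because both kinds of reference are pushed onto the same vector as the scan reaches them, source order falls out of the loop with no sorting step.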

Implemented sync_once() method that performs a one-time synchronization of a Logseq directory, detecting and handling:
- New files (creates pages)
- Updated files (compares modification time and updates pages)
- Deleted files (removes from repository)
- Unchanged files (skips processing)

Features:
- Maintains sync registry to track file metadata and modification times
- Intelligent change detection using file modification timestamps
- Proper deletion handling with title-based lookup
- Support for optional callbacks to track sync progress
- Returns detailed SyncSummary with operation counts and errors

Added comprehensive test coverage:
- test_sync_once_new_files
- test_sync_once_updated_files
- test_sync_once_unchanged_files
- test_sync_once_deleted_files
- test_sync_once_mixed_operations
- test_sync_once_with_journals
- test_sync_once_with_callback

All 125 tests passing.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
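
The change detection described in this commit amounts to comparing a registry of previously seen modification times against the current state on disk. A minimal sketch, assuming the registry is a map from path to modification time: the `Change` enum and `classify` function are hypothetical, and treating any differing timestamp as an update is an assumption, not necessarily what `sync_once()` does.

```rust
use std::collections::HashMap;
use std::path::PathBuf;
use std::time::{Duration, SystemTime};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Change {
    New,
    Updated,
    Unchanged,
    Deleted,
}

/// Compares the last-synced registry with the files currently on disk.
fn classify(
    registry: &HashMap<PathBuf, SystemTime>,
    on_disk: &HashMap<PathBuf, SystemTime>,
) -> Vec<(PathBuf, Change)> {
    let mut changes = Vec::new();
    for (path, modified) in on_disk {
        let change = match registry.get(path) {
            None => Change::New,
            // Assumption: any differing timestamp counts as an update.
            Some(previous) if previous != modified => Change::Updated,
            Some(_) => Change::Unchanged,
        };
        changes.push((path.clone(), change));
    }
    // Anything the registry knows about that is gone from disk was deleted.
    for path in registry.keys() {
        if !on_disk.contains_key(path) {
            changes.push((path.clone(), Change::Deleted));
        }
    }
    changes.sort_by(|a, b| a.0.cmp(&b.0)); // HashMap order is arbitrary
    changes
}

fn main() {
    let earlier = SystemTime::UNIX_EPOCH;
    let later = earlier + Duration::from_secs(60);

    let mut registry = HashMap::new();
    registry.insert(PathBuf::from("pages/a.md"), earlier);
    registry.insert(PathBuf::from("pages/b.md"), earlier);
    registry.insert(PathBuf::from("pages/c.md"), earlier);

    let mut on_disk = HashMap::new();
    on_disk.insert(PathBuf::from("pages/a.md"), earlier);
    on_disk.insert(PathBuf::from("pages/b.md"), later);
    on_disk.insert(PathBuf::from("pages/d.md"), later);

    for (path, change) in classify(&registry, &on_disk) {
        println!("{:?}: {}", change, path.display());
    }
}
```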

Fixed test_parse_with_urls_and_references to use root_blocks() which preserves insertion order, instead of all_blocks() which iterates over a HashMap with non-deterministic order.

Also added more detailed assertions to verify the parsed content includes correct URLs and page references.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Cyrus AI and others added 6 commits October 19, 2025 03:41

weswalla merged commit b6b3991 into main Oct 19, 2025
1 check passed
