feat: vision rework with per-inbox config and Ollama support #11

Merged
spignotti merged 13 commits into main from feat/vision-rework on Apr 6, 2026

Conversation

@spignotti
Owner

Summary

Hybrid text-first extraction with vision fallback for scanned PDFs, per-inbox configuration support, and first-class Ollama backend support.

What changed

  • Per-inbox config: New [[inbox]] TOML format allows different templates, languages, and prompts per folder
  • Hybrid extraction: Text-based PDFs use pypdf (no vision costs); scans fall back to pymupdf rendering
  • Ollama support: First-class local model support via LiteLLM with example config
  • Backwards compat: The legacy inbox_paths format still works, with a deprecation warning
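
A minimal sketch of how the per-inbox merge could work, assuming each optional field on InboxConfig falls back to a global default when it is unset. The names InboxConfig, EffectiveInboxConfig, and get_effective_config come from the commit list in this PR; the GlobalDefaults holder, its default values, and the function body are assumptions for illustration, not the merged implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InboxConfig:
    """One [[inbox]] entry; unset fields mean 'use the global default'."""
    path: str
    filename_template: Optional[str] = None
    language: Optional[str] = None
    rename_prompt: Optional[str] = None


@dataclass(frozen=True)
class EffectiveInboxConfig:
    """Fully resolved settings for one inbox (no optional fields left)."""
    path: str
    filename_template: str
    language: str
    rename_prompt: str


@dataclass
class GlobalDefaults:
    """Hypothetical holder for top-level defaults; values are placeholders."""
    filename_template: str = "{date}_{title}"
    language: str = "en"
    rename_prompt: str = "Suggest a descriptive filename."


def get_effective_config(inbox: InboxConfig, defaults: GlobalDefaults) -> EffectiveInboxConfig:
    """Merge per-inbox overrides over the global defaults."""

    def pick(override: Optional[str], fallback: str) -> str:
        # Only None means "not set"; an explicit empty string is kept as-is.
        return fallback if override is None else override

    return EffectiveInboxConfig(
        path=inbox.path,
        filename_template=pick(inbox.filename_template, defaults.filename_template),
        language=pick(inbox.language, defaults.language),
        rename_prompt=pick(inbox.rename_prompt, defaults.rename_prompt),
    )
```

With this shape, an inbox that sets only language = "de" would keep the global template and prompt while switching the language.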

Why it changed

  • Simpler codebase: no separate OCR model needed
  • Cost savings: text-based PDFs sent to cloud providers don't trigger vision API calls
  • Flexibility: per-folder settings for different document types
  • Privacy: fully local processing option with Ollama

Validation

  • All unit tests pass (14 tests in test_models.py)
  • Lint checks pass (ruff check)
  • Compile checks pass for all modified files

Breaking changes

  • Config format changed from flat inbox_paths = [...] to [[inbox]] array-of-tables
  • The legacy format remains supported for one major version, with a deprecation warning

Migration

Update your config.toml:

# Old (deprecated)
inbox_paths = ["/path/to/folder"]
language = "en"

# New
[[inbox]]
path = "/path/to/folder"

spignotti added 13 commits April 6, 2026 14:52
- Add InboxConfig class with optional filename_template, language, rename_prompt
- Add EffectiveInboxConfig dataclass for resolved values
- Update AppConfig to use List[InboxConfig] + global defaults
- Add get_effective_config() merge logic
- Add backwards compatibility for inbox_paths with deprecation warning
- Add extract_content() to preview.py for text-first, vision-fallback strategy
- Update renamer.py to iterate List[InboxConfig] and compute effective config
- Update metadata.py to accept effective config values (language, prompt, llm_config)
- Update init command to generate [[inbox]] format config
- Add Ollama setup example in comments
- Update --inbox CLI override to use new InboxConfig model
- Update config.toml.example with new format
- Reorder sections: Quick Start → Configuration → Ollama Setup → Cloud Providers → CLI → Privacy
- Add Ollama Setup section with copy-paste example
- Update Configuration section with [[inbox]] format documentation
- Add privacy callout for local models
- Document backwards compatibility for legacy inbox_paths
- Add tests for effective config merge logic
- Add tests for backwards compatibility with inbox_paths
- Add tests for hybrid extraction (text vs scan PDF)
- Update test_renamer.py to use new inboxes format
- Update CHANGELOG.md with v1.2.0 release notes
- Bump version in pyproject.toml from 1.1.1 to 1.2.0
- Use validation_alias="inbox" for TOML [[inbox]] array-of-tables parsing
- Fix backwards compat check to use raw_config.get("inbox") not "inboxes"
- Update tests to use inbox=... instead of inboxes=... for direct construction
- Add model_config with populate_by_name to AppConfig for field/alias flexibility
- Fix test_metadata.py to use new extract_metadata signature
- Fix test_cli.py to expect new [[inbox]] format in generated config
- Fix test_renamer.py and test_models.py to use inboxes= consistently
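
For readers who want a picture of the text-first, vision-fallback strategy named in the commits above, here is a minimal sketch. It assumes the decision is "use extracted text if there is enough of it, otherwise render pages to images". The extract_content name comes from the commit list; the signature, the ExtractedContent type, the injected callables, and the min_chars threshold are all assumptions, and the real function in preview.py may differ.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ExtractedContent:
    """Either plain text or rendered page images (hypothetical type)."""
    text: Optional[str] = None
    images: Optional[List[bytes]] = None

    @property
    def needs_vision(self) -> bool:
        # No usable text means the caller must send images to a vision model.
        return self.text is None


def extract_content(
    pdf_path: str,
    extract_text: Callable[[str], str],
    render_pages: Callable[[str], List[bytes]],
    min_chars: int = 50,  # assumed threshold for "this PDF has real text"
) -> ExtractedContent:
    """Try cheap text extraction first; render pages only for scans."""
    text = extract_text(pdf_path).strip()
    if len(text) >= min_chars:
        return ExtractedContent(text=text)
    # Too little text: treat the file as a scan and fall back to images.
    return ExtractedContent(images=render_pages(pdf_path))
```

In the tool itself, extract_text would presumably wrap pypdf and render_pages would wrap pymupdf, as described in the summary; passing them in as callables keeps the sketch free of third-party imports and easy to test with fakes.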
@spignotti spignotti merged commit 6618615 into main Apr 6, 2026
2 checks passed
@spignotti spignotti deleted the feat/vision-rework branch April 6, 2026 14:35