39 changes: 27 additions & 12 deletions README.md
@@ -12,9 +12,9 @@ AI-powered CLI that renames files based on their content.
Scanned documents, downloads, and exported files often arrive with useless names like `scan_001.pdf` or `IMG_5847.jpg`. renamr reads each file — extracting text from PDFs, rendering pages as images for vision models, or encoding photos directly — sends a preview to an LLM, and renames the file to a structured format based on the content it actually finds.

```
scan_001.pdf -> 240115_ACME_Rechnung.pdf
IMG_5847.jpg -> 241203_DeutschePost_Zustellbenachrichtigung.jpg
invoice_download.pdf -> 250110_Amazon_Bestellbestaetigung.pdf
scan_001.pdf -> 240115_ACME_Invoice.pdf
IMG_5847.jpg -> 241203_PostOffice_DeliveryNotice.jpg
invoice_download.pdf -> 250110_Amazon_OrderConfirmation.pdf
```

Only the filename changes. Files are never modified.
@@ -24,7 +24,9 @@ Only the filename changes. Files are never modified.
- Content-aware renaming via any LiteLLM-supported provider (OpenAI, OpenRouter, Anthropic, local models)
- PDF text extraction for text-based documents
- Vision model support for scanned PDFs and image files
- iCloud evicted file handling — triggers download via `brctl` before processing (macOS)
- iCloud evicted file handling — auto-downloads stubs via `brctl` before processing (macOS only)
- Multi-inbox support — configure one or more folders in a single config
- Configurable output language — extracted metadata returned in any language
- Dry-run mode to preview renames without touching files
- Undo the last run with a single command
- Configurable output template (`{date}_{sender}_{subject}`), file extensions, and system prompt
@@ -45,19 +47,22 @@ uv tool install renamr
## Quick Start

```bash
# Create config.toml and data/ in the current directory
# One-time global install
uv tool install renamr # or: pip install renamr

# First-run setup — creates ~/.config/renamr/config.toml
renamr init

# Set your API key
export OPENAI_API_KEY="your-key"

# Preview renames without touching any files
# Preview renames
renamr run --dry-run

# Rename files
renamr run

# Undo the last run
# Undo last run
renamr undo
```

@@ -69,10 +74,14 @@ renamr run --inbox ~/Documents/inbox --dry-run

## Configuration

`renamr init` creates a `config.toml` in the current directory. The full set of options:
`renamr init` creates `~/.config/renamr/config.toml` by default. If `XDG_CONFIG_HOME` is set
(on any platform), the path becomes `$XDG_CONFIG_HOME/renamr/config.toml` instead.

The full set of options:

```toml
inbox_path = "."
inbox_paths = ["/path/to/your/folder"]
language = "en"
file_extensions = [".pdf", ".jpg", ".jpeg", ".png", ".txt"]
recursive = false
filename_template = "{date}_{sender}_{subject}"
@@ -95,9 +104,15 @@ level = "INFO"
json_logs = false
```

`filename_template` supports three placeholders: `{date}`, `{sender}`, `{subject}`. The date is extracted from document content when available, falling back to the file's creation timestamp.
`inbox_paths` accepts one or more folders. `renamr run` processes all of them in one pass.
Use `--inbox /some/folder` for a one-off override without editing the config.

`filename_template` supports three placeholders: `{date}`, `{sender}`, `{subject}`. Changing
the order does not affect metadata extraction — the model still returns the same fields, and
renamr only changes how they are assembled into the final filename.
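
As a rough illustration of the placeholder behavior described above, the sketch below fills a template with `str.format`. The `build_stem` helper and the sample field values are hypothetical; this diff does not show how renamr assembles filenames internally.

```python
# Hypothetical metadata values; in renamr these would come from the model response.
fields = {"date": "240115", "sender": "ACME", "subject": "Invoice"}


def build_stem(template: str, values: dict[str, str]) -> str:
    """Fill the {date}, {sender}, and {subject} placeholders of a template."""
    return template.format(**values)


# The same extracted fields, assembled in two different orders.
default_stem = build_stem("{date}_{sender}_{subject}", fields)
reordered_stem = build_stem("{sender}_{date}_{subject}", fields)

print(default_stem)
print(reordered_stem)
```

Only the assembly step sees the template, which is why reordering placeholders leaves the extracted fields untouched.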

`data/undo.json` is stored relative to the config file. Always run `renamr run` and `renamr undo` with the same `--config` path, or from the same directory when using the default.
`undo.json` is stored next to the config file. With the default setup, that means
`~/.config/renamr/undo.json`.

**Switching providers.** Change `model` and set `api_base`. For OpenRouter:

@@ -129,7 +144,7 @@ Then set `OPENROUTER_API_KEY` instead of `OPENAI_API_KEY`. Any provider supporte
Additional notes:

- Always use an `https://` endpoint for `api_base`. An `http://` URL sends file content unencrypted.
- Keep `data/undo.json` private on shared systems — it contains the file paths from the last run.
- Keep `~/.config/renamr/undo.json` private on shared systems — it contains the file paths from the last run.
- Avoid sharing verbose log output publicly; failed auth responses may include API key fragments.

## Maintenance
1 change: 0 additions & 1 deletion pyproject.toml
@@ -12,7 +12,6 @@ dependencies = [
"litellm>=1.82.1",
"pillow>=12.1.1",
"pydantic>=2.12.5",
"pydantic-settings>=2.13.1",
"pymupdf>=1.27.2",
"pypdf>=6.8.0",
"structlog>=25.5.0",
61 changes: 44 additions & 17 deletions src/renamr/cli.py
@@ -2,7 +2,7 @@

from __future__ import annotations

import importlib.resources
import os
from pathlib import Path
from typing import Annotated

@@ -24,61 +24,88 @@
console = Console()


def _config_dir() -> Path:
"""Return the default directory for renamr config and runtime files."""
xdg_config_home = os.environ.get("XDG_CONFIG_HOME")
if xdg_config_home:
return Path(xdg_config_home) / "renamr"
return Path.home() / ".config" / "renamr"


@app.command()
def version() -> None:
"""Print the installed version."""
typer.echo(__version__)


@app.command()
def init() -> None:
def init(
config: Annotated[Path | None, typer.Option("--config")] = None,
) -> None:
"""Create a local config file and data directory."""
setup_logging("INFO", False)
config_path = Path("config.toml")
config_path = config or _config_dir() / "config.toml"
if config_path.exists():
typer.echo("config.toml already exists")
else:
example = importlib.resources.files("renamr").joinpath("config.toml.example").read_text()
config_path.write_text(example)
typer.echo("Created config.toml")
Path("data").mkdir(parents=True, exist_ok=True)
typer.echo("Ensured data/ exists")
typer.echo(f"{config_path} already exists. Delete it to reinitialize.")
return
config_path.parent.mkdir(parents=True, exist_ok=True)

inbox_path = Path(typer.prompt("Inbox folder path")).expanduser().resolve()
language = typer.prompt("Language for extracted metadata", default="en")
model = typer.prompt("LLM model", default="gpt-4o-mini")

config_path.write_text(
"\n".join(
[
f'inbox_paths = ["{inbox_path}"]',
f'language = "{language}"',
"",
"[llm]",
f'model = "{model}"',
"",
]
)
)
typer.echo(f"Created {config_path}")
typer.echo(f"Ensured {config_path.parent} exists")


@app.command()
def run(
config: Annotated[Path, typer.Option("--config")] = Path("config.toml"),
config: Annotated[Path | None, typer.Option("--config")] = None,
dry_run: Annotated[bool, typer.Option("--dry-run/--no-dry-run")] = False,
compress: Annotated[bool | None, typer.Option("--compress/--no-compress")] = None,
inbox: Annotated[Path | None, typer.Option("--inbox")] = None,
recursive: Annotated[bool | None, typer.Option("--recursive/--no-recursive")] = None,
verbose: Annotated[bool, typer.Option("--verbose/--no-verbose")] = False,
) -> None:
"""Scan files, extract metadata, and rename them."""
if not config.exists():
config_path = config or _config_dir() / "config.toml"
if not config_path.exists():
typer.secho("Missing config.toml. Run `renamr init` first.", fg=typer.colors.RED)
raise typer.Exit(code=1)
app_config = load_config(config)
app_config = load_config(config_path)
if inbox is not None:
app_config = app_config.model_copy(update={"inbox_path": str(inbox)})
app_config = app_config.model_copy(update={"inbox_paths": [str(inbox)]})
if recursive is not None:
app_config = app_config.model_copy(update={"recursive": recursive})
if compress is None:
compress = app_config.compress.enabled
log_level = "DEBUG" if verbose else app_config.logging.level
setup_logging(log_level, app_config.logging.json_logs)
data_dir = config.parent / "data"
data_dir = config_path.parent
summary = run_pipeline(app_config, dry_run=dry_run, compress=compress, data_dir=data_dir)
_print_summary(summary)


@app.command()
def undo(
config: Annotated[Path, typer.Option("--config")] = Path("config.toml"),
config: Annotated[Path | None, typer.Option("--config")] = None,
) -> None:
"""Undo the last successful rename run."""
setup_logging("INFO", False)
data_dir = config.parent / "data"
config_path = config or _config_dir() / "config.toml"
data_dir = config_path.parent
reversed_pairs = undo_last_run(data_dir)
if not reversed_pairs:
typer.secho("Nothing to undo.", fg=typer.colors.YELLOW)
16 changes: 0 additions & 16 deletions src/renamr/config.py

This file was deleted.

13 changes: 3 additions & 10 deletions src/renamr/config.toml.example
@@ -1,18 +1,11 @@
inbox_path = "."
inbox_paths = ["/path/to/your/folder"]
# inbox_paths = ["/folder/one", "/folder/two"] # multiple inboxes supported
language = "en" # language for extracted metadata: "en", "de", or any locale name
file_extensions = [".pdf", ".jpg", ".jpeg", ".png", ".txt"]
recursive = false
filename_template = "{date}_{sender}_{subject}"
# Full default prompt lives in src/renamr/models.py as DEFAULT_RENAME_PROMPT.
# Copy it here and customize if you want provider-specific behavior.
rename_prompt = """
---
language: en
output_format: json_only
---

# Purpose
Extract sender, subject, document date, and filename format from a document for file renaming.
"""

[llm]
model = "gpt-4o-mini"
22 changes: 20 additions & 2 deletions src/renamr/metadata.py
@@ -48,8 +48,12 @@ def extract_metadata(
filename_format="date_subject",
)
prompt = _build_user_prompt(filename, created_at, preview_text)
system_content = (
f"Language for all extracted metadata values: {config.language}\n\n"
f"{config.rename_prompt}"
)
messages = [
{"role": "system", "content": config.rename_prompt},
{"role": "system", "content": system_content},
{"role": "user", "content": _build_user_content(prompt, image_base64)},
]
for attempt in range(config.llm.max_retries + 1):
@@ -140,7 +144,10 @@ def _parse_date_string(value: str) -> date | None:
if not match:
continue
if fmt is not None:
return datetime.strptime(match.group(0), fmt).date()
try:
return datetime.strptime(match.group(0), fmt).date()
except ValueError:
continue
parsed = _parse_ambiguous_date(match.group(1), match.group(2), match.group(3))
if parsed is not None:
return parsed
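
The `try`/`except ValueError` added in this hunk matters because a regex can match text that is shaped like a date but is not a real calendar date, and `datetime.strptime` raises in that case. The sketch below shows the behavior in isolation. The pattern list and the `parse_first_valid_date` name are hypothetical stand-ins, since the real pattern table in `metadata.py` is outside this hunk.

```python
from __future__ import annotations

import re
from datetime import date, datetime

# Hypothetical (pattern, format) pairs; the real table is not shown in the diff.
_PATTERNS = [
    (re.compile(r"\d{4}-\d{2}-\d{2}"), "%Y-%m-%d"),
    (re.compile(r"\d{2}\.\d{2}\.\d{4}"), "%d.%m.%Y"),
]


def parse_first_valid_date(value: str) -> date | None:
    """Return the first pattern match that is also a real calendar date."""
    for pattern, fmt in _PATTERNS:
        match = pattern.search(value)
        if not match:
            continue
        try:
            return datetime.strptime(match.group(0), fmt).date()
        except ValueError:
            # The shape matched, but the date does not exist (e.g. 31 February).
            # Fall through to the next pattern instead of aborting.
            continue
    return None


print(parse_first_valid_date("Invoice 2024-02-31, paid 15.03.2024"))
```

Without the `except` branch, the impossible date in the first match would raise and the valid second date would never be tried.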
@@ -200,28 +207,39 @@ def _normalize_umlauts(value: str) -> str:

_MONTH_MAP = {
"januar": 1,
"january": 1,
"jan": 1,
"februar": 2,
"february": 2,
"feb": 2,
"maerz": 3,
"marz": 3,
"march": 3,
"mrz": 3,
"mar": 3,
"april": 4,
"apr": 4,
"mai": 5,
"may": 5,
"juni": 6,
"june": 6,
"jun": 6,
"juli": 7,
"july": 7,
"jul": 7,
"august": 8,
"aug": 8,
"september": 9,
"sep": 9,
"sept": 9,
"oktober": 10,
"october": 10,
"okt": 10,
"oct": 10,
"november": 11,
"nov": 11,
"december": 12,
"dezember": 12,
"dec": 12,
"dez": 12,
}
3 changes: 2 additions & 1 deletion src/renamr/models.py
@@ -118,11 +118,12 @@ class CompressConfig(BaseModel):
class AppConfig(BaseModel):
"""Top-level application configuration."""

inbox_path: str = Field(default=".")
inbox_paths: list[str] = Field(default_factory=lambda: ["."])
file_extensions: list[str] = Field(
default_factory=lambda: [".pdf", ".jpg", ".jpeg", ".png", ".txt"]
)
recursive: bool = Field(default=False)
language: str = Field(default="en")
filename_template: str = Field(default="{date}_{sender}_{subject}")
rename_prompt: str = Field(default=DEFAULT_RENAME_PROMPT)
llm: LLMConfig = Field(default_factory=LLMConfig)
19 changes: 12 additions & 7 deletions src/renamr/renamer.py
@@ -77,7 +77,8 @@ def scan_files(inbox: Path, extensions: list[str], recursive: bool) -> list[Path
def process_file(filepath: Path, config: AppConfig, dry_run: bool) -> RenameResult:
"""Extract metadata, build a filename, and optionally rename the file."""
try:
created_at = datetime.fromtimestamp(filepath.stat().st_ctime, tz=UTC)
stat = filepath.stat()
created_at = datetime.fromtimestamp(getattr(stat, "st_birthtime", stat.st_mtime), tz=UTC)
preview_text = extract_text_preview(filepath)
image_base64 = _get_image_payload(filepath, preview_text)
metadata = extract_metadata(
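
The switch from `st_ctime` to `st_birthtime` with an `st_mtime` fallback in this hunk can be read in isolation as follows. On Linux, `st_ctime` is the inode change time rather than the creation time, and `st_birthtime` is only present on some platforms, hence the `getattr` fallback. `file_created_at` is a hypothetical helper name, and the sketch uses `timezone.utc` instead of the `UTC` alias so it also runs on Python versions before 3.11.

```python
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def file_created_at(path: Path) -> datetime:
    """Best-effort creation time: st_birthtime where available, else st_mtime."""
    stat = path.stat()
    timestamp = getattr(stat, "st_birthtime", stat.st_mtime)
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


# Usage with a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "scan_001.pdf"
    sample.write_bytes(b"placeholder")
    created = file_created_at(sample)
    sample_stat = sample.stat()
    expected = getattr(sample_stat, "st_birthtime", sample_stat.st_mtime)

print(created.isoformat())
```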
@@ -105,12 +106,16 @@ def process_file(filepath: Path, config: AppConfig, dry_run: bool) -> RenameResu

def run(config: AppConfig, dry_run: bool, compress: bool, data_dir: Path) -> RunSummary:
"""Run the rename pipeline over configured files."""
inbox = Path(config.inbox_path)
if not inbox.exists():
raise FileNotFoundError(f"Inbox path does not exist: {inbox}")
results = [_download_stub(stub) for stub in _scan_icloud_stubs(inbox, config.recursive)]
filepaths = scan_files(inbox, config.file_extensions, config.recursive)
results.extend(process_file(path, config, dry_run) for path in filepaths)
inboxes = [Path(path_str) for path_str in config.inbox_paths]
for inbox in inboxes:
if not inbox.exists():
raise FileNotFoundError(f"Inbox path does not exist: {inbox}")

results: list[RenameResult] = []
for inbox in inboxes:
results.extend(_download_stub(stub) for stub in _scan_icloud_stubs(inbox, config.recursive))
filepaths = scan_files(inbox, config.file_extensions, config.recursive)
results.extend(process_file(path, config, dry_run) for path in filepaths)
if compress and not dry_run:
_compress_renamed_pdfs(results, config)
if not dry_run:
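
The two-loop structure in `run` above validates every inbox before processing any of them, so a typo in a later path fails the run before files in an earlier inbox are renamed. A minimal sketch of that validate-first step, with `validate_inboxes` as a hypothetical helper name:

```python
import tempfile
from pathlib import Path


def validate_inboxes(paths: list[str]) -> list[Path]:
    """Convert configured paths and fail fast if any inbox is missing."""
    inboxes = [Path(path_str) for path_str in paths]
    for inbox in inboxes:
        if not inbox.exists():
            raise FileNotFoundError(f"Inbox path does not exist: {inbox}")
    return inboxes


# Usage: one real folder plus one missing folder raises before any work starts.
with tempfile.TemporaryDirectory() as tmp:
    valid = validate_inboxes([tmp])
    missing = str(Path(tmp) / "does-not-exist")
    try:
        validate_inboxes([tmp, missing])
        failed_fast = False
    except FileNotFoundError:
        failed_fast = True

print(failed_fast)
```

Checking inside the processing loop instead would leave a partially renamed first inbox behind when the second path is wrong.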