Self-hosted media acquisition for deterministic local libraries.
URLs, playlists, search, Spotify sync, and library imports into a clean, predictable media archive.
Follow us on X: https://x.com/sudoStacks
Retreivr is a self-hosted acquisition engine for building and maintaining a clean local media archive.
It takes your intent, resolves the target, downloads the media, normalizes metadata and naming, and writes predictable files to disk.
Retreivr is not a streaming server. It is the acquisition layer.
- Deterministic acquisition instead of one-off, chaotic downloads
- MusicBrainz-first metadata authority for music workflows
- Clean filesystem output with canonical naming and finalization rules
- Unified queue, worker, watcher, scheduler, and review flows
- Web UI and API for operations, recovery, and automation
- Built for intentional local ownership, not algorithmic consumption
- Community-cache publish branch resets now use the correct GitHub ref-update endpoint
- Manual publish runs recover more reliably after branch cleanup or rejected PRs
- The cache publish/validate/merge loop is closer to the final
v1.0.0stability bar
- Acquire from direct URLs, playlists, search, Spotify sync, and library-import files
- Resolve media into canonical download and metadata workflows
- Finalize files into a clean, predictable local library
- Keep ingestion repeatable through queueing, retries, and review paths
- Expose live status, logs, metrics, and API endpoints for operators
Search by artist, album, or track and let Retreivr resolve downloads into structured local music files with metadata-first workflows.
Bring in Apple Music XML or similar exports and resolve them into queued acquisition jobs with import progress, rejection reasons, and recovery controls.
Use watcher and scheduler flows to poll sources, detect new content, and ingest it into the same queue and finalization system.
Sync intent from Spotify without turning Retreivr into a playback server.
Monitor queue health, review blocked work, recover stale jobs, and track subsystem state from the UI or API.
If you want a more visual middle section, add a row of 3 to 5 images or thumbnails here.
Recommended concepts:
- A YouTube playlist watcher view with newly detected items
- A Music Mode album search with strong matches
- A library import run with batch counters and rejection summaries
- A review queue screenshot showing operator approval workflow
Suggested assets:
docs/images/readme/watcher-example.pngdocs/images/readme/album-search-example.pngdocs/images/readme/library-import-example.pngdocs/images/readme/review-queue-example.png
Retreivr follows a simple acquisition model:
- Input arrives from URL, search, playlist, Spotify sync, or library import.
- Resolver logic identifies the best target and metadata authority.
- Jobs enter the queue and are claimed by workers.
- Media is downloaded, post-processed, tagged, and finalized.
- The UI and API expose status, logs, review states, and recovery actions.
- Prepare files:
cp docker/docker-compose.yml.example docker/docker-compose.yml
cp .env.example .env- Start Retreivr:
docker compose -f docker/docker-compose.yml up -d- Open the UI:
http://localhost:8090
Default mapping is 8090:8000 (host:container).
- Open
Config - Add playlist, search, or music settings
- Set destination folders under
/downloads - Optionally configure Spotify OAuth and Telegram
- GitHub Container Registry image:
ghcr.io/sudostacks/retreivr:<tag> - Docker Hub image:
sudostacks/retreivr:<tag> - GitHub Release asset:
retreivr-docker-starter-<tag>.zip
The Docker starter bundle contains:
docker-compose.yml.env.exampleconfig/config.json.exampleREADME-runtime.md
Retreivr is now part of a broader resolution-network ecosystem. If you want to participate beyond running a single node, these side repositories matter:
- Community cache dataset:
https://github.com/sudoStacks/retreivr-community-cache- canonical public transport-resolution dataset
- trusted publisher policy, validation rules, and contribution flow
- Jellyfin plugin:
https://github.com/sudoStacks/retreivr-jellyfin-plugin- early Jellyfin-side integration for search, availability, and Retreivr-backed acquisition
- Plex plugin:
https://github.com/sudoStacks/retreivr-plex-plugin- experimental legacy Plex integration path
If you want to contribute verified mappings back to the shared network, start with the community cache repository and its trusted publisher workflow.
Use these container paths for predictable behavior:
/downloadsmedia output/dataruntime DB and temp/configconfig JSON/logslogs/tokensauth and cookies
Requirements:
- Python
3.11.x ffmpegon PATH
Run:
python3.11 scripts/archiver.py --config data/config/config.jsonRun API/UI locally:
python3.11 -m uvicorn api.main:app --host 127.0.0.1 --port 8000Then open:
http://localhost:8000
- Unified queue for import, watcher, search, and direct acquisition flows
- Live status for active jobs, queue health, watcher state, and import progress
- Recovery controls for stale or blocked work
- Review path for low-confidence music matches
- Metrics and API endpoints for operational visibility
GET /api/statusGET /api/metricsPOST /api/runGET /api/download_jobsPOST /api/import/playlistGET /docs
Retreivr currently supports one cache concept in the acquisition pipeline:
community_cache_lookup_enabled: Enables reading shared community transport hints. Defaults totrue.community_cache_publish_enabled: Enables local proposal emission for contributing verified matches. Defaults tofalse.
Related controls:
community_cache_publish_mode:off | dry_run | write_outboxcommunity_cache_publish_min_scorecommunity_cache_publish_outbox_dircommunity_cache_publish_repocommunity_cache_publish_target_branchcommunity_cache_publish_branchcommunity_cache_publish_open_prcommunity_cache_publish_poll_minutescommunity_cache_publish_token_envcommunity_cache_publish_batch_size
The resolution_api block controls optional node-to-node dataset sync for the Resolution API layer.
resolution_api.upstream_base_url: Base URL of another Retreivr Resolution API node to sync from.resolution_api.sync_enabled: Enables scheduled pulls from that upstream node.resolution_api.sync_poll_minutes: How often to check for updates.resolution_api.sync_batch_size: How many records to request per sync batch.resolution_api.local_node_id: Stable identifier for this node in sync and verification flows.
When to use it:
- Leave it off for a single-node install that is already doing its own local acquisition and community-cache publishing.
- Enable it when you want this Retreivr instance to mirror resolution data from another Resolution API node for faster local lookups.
What the UI buttons do:
Run Cache Sync Now: immediately pulls a sync batch from the configured upstream API.Refresh Sync Status: refreshes the last-sync state shown in Settings.
Minimum working setup for sync:
- set
resolution_api.upstream_base_url - enable
resolution_api.sync_enabled - choose a
resolution_api.local_node_id
If upstream_base_url is blank, local cache sync should remain disabled.
- Docker runtime notes: docker/README.md
- Portainer deployment: docs/portainer.md
- Container/data path guidance: docs/paths.md
- Runtime starter bundle notes: README-runtime.md
If you are upgrading to 0.9.20, pull the latest image and restart:
docker compose -f docker/docker-compose.yml pull
docker compose -f docker/docker-compose.yml up -dYour mounted /data, /downloads, /config, and /tokens persist.
Retreivr does:
- Acquire media reliably
- Normalize metadata and output structure
- Keep playlist and sync ingestion deterministic
Retreivr does not:
- Stream media
- Replace Plex or Jellyfin players
- Bypass DRM or protected content
Use this section as a build list for the final polished README assets.
- Hero dashboard screenshot
- Home or search workflow screenshot
- Music Mode / album-resolution screenshot
- Operations Status screenshot
- Optional watcher or playlist-ingest screenshot
- Optional review queue screenshot
- Simple architecture diagram
Retreivr is licensed under the GNU Affero General Public License v3.0 (AGPLv3). See the LICENSE file for full details.



