Fork of Hyaxia/blogwatcher.
A Go CLI tool to track blog articles, detect new posts, and manage read/unread status. It supports RSS/Atom feeds, with HTML scraping as a fallback.
- Dual Source Support - Tries RSS feeds first, falls back to HTML scraping
- Automatic Feed Discovery - Detects RSS/Atom URLs from blog homepages
- Read/Unread Management - Track which articles you've read
- Blog Filtering - View articles from specific blogs
- Duplicate Prevention - Never tracks the same article twice
- Colored CLI Output - User-friendly terminal interface
# Install via go
go install github.com/JulienTant/blogwatcher-cli/cmd/blogwatcher-cli@latest
# Or build locally
go build ./cmd/blogwatcher-cli
# Or run via Docker
docker run --rm -v blogwatcher-cli:/data ghcr.io/julientant/blogwatcher-cli
Pre-built binaries for Linux, macOS, and Windows are available on the GitHub Releases page.
# Add a blog (auto-discovers RSS feed)
blogwatcher-cli add "My Favorite Blog" https://example.com/blog
# Add with explicit feed URL
blogwatcher-cli add "Tech Blog" https://techblog.com --feed-url https://techblog.com/rss.xml
# Add with HTML scraping selector (for blogs without feeds)
blogwatcher-cli add "No-RSS Blog" https://norss.com --scrape-selector "article h2 a"
# List all tracked blogs
blogwatcher-cli blogs
# Remove a blog (and all its articles)
blogwatcher-cli remove "My Favorite Blog"
# Remove without confirmation
blogwatcher-cli remove "My Favorite Blog" -y
# Scan all blogs for new articles
blogwatcher-cli scan
# Scan a specific blog
blogwatcher-cli scan "Tech Blog"
# List unread articles
blogwatcher-cli articles
# List all articles (including read)
blogwatcher-cli articles --all
# List articles from a specific blog
blogwatcher-cli articles --blog "Tech Blog"
# Mark an article as read (use article ID from articles list)
blogwatcher-cli read 42
# Mark an article as unread
blogwatcher-cli unread 42
# Mark all unread articles as read
blogwatcher-cli read-all
# Mark all unread articles as read for a blog (skip prompt)
blogwatcher-cli read-all --blog "Tech Blog" --yes
- For each tracked blog, blogwatcher-cli first attempts to parse the RSS/Atom feed
- If no feed URL is configured, it tries to auto-discover one from the blog homepage
- If RSS parsing fails and a scrape_selector is configured, it falls back to HTML scraping
- New articles are saved to the database as unread
- Already-tracked articles are skipped
blogwatcher-cli searches for feeds in two ways:
- Looking for <link rel="alternate"> tags with RSS/Atom types
- Checking common feed paths: /feed, /rss, /feed.xml, /atom.xml, etc.
When RSS isn't available, provide a CSS selector that matches article links:
# Example selectors
--scrape-selector "article h2 a" # Links inside article h2 tags
--scrape-selector ".post-title a" # Links inside elements with the post-title class
--scrape-selector "#blog-posts a" # Links inside the element with ID blog-posts
blogwatcher-cli stores data in SQLite at ~/.blogwatcher-cli/blogwatcher-cli.db.
If upgrading from the original Hyaxia/blogwatcher, migrate your existing database:
mv ~/.blogwatcher/blogwatcher.db ~/.blogwatcher-cli/blogwatcher-cli.db
Tables:
- blogs - Tracked blogs (name, URL, feed URL, scrape selector)
- articles - Discovered articles (title, URL, dates, read status)
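As a rough mental model, the two tables could map onto Go types like the ones below. Every field name here is an assumption made for illustration (including which dates are stored and how articles link to blogs); the actual schema lives in the SQLite database and may differ. The `unread` helper is likewise hypothetical and only shows how the read status might be used to filter a listing.

```go
package main

import (
	"fmt"
	"time"
)

// Blog is a hypothetical mirror of the blogs table.
type Blog struct {
	ID             int64
	Name           string
	URL            string
	FeedURL        string // empty when no feed is known
	ScrapeSelector string // empty when scraping is not configured
}

// Article is a hypothetical mirror of the articles table.
type Article struct {
	ID           int64
	BlogID       int64 // assumed link back to the owning blog
	Title        string
	URL          string
	PublishedAt  time.Time // assumed: date reported by the feed or page
	DiscoveredAt time.Time // assumed: date the scan first saw the article
	Read         bool
}

// unread returns the articles that have not been marked as read.
func unread(articles []Article) []Article {
	var out []Article
	for _, a := range articles {
		if !a.Read {
			out = append(out, a)
		}
	}
	return out
}

func main() {
	articles := []Article{
		{ID: 41, Title: "Old post", Read: true},
		{ID: 42, Title: "Fresh post"},
	}
	for _, a := range unread(articles) {
		fmt.Println(a.ID, a.Title)
	}
}
```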
- mise (manages Go, golangci-lint, gotestsum, goreleaser)
mise install
# Run all tests
gotestsum -- ./...
# Run e2e tests only
gotestsum -- ./e2e/ -count=1
# Update e2e expected output after intentional changes
UPDATE_EXPECTED=1 go test ./e2e/ -run TestE2E/flags
Push a tag to trigger a release (binaries + Docker images to GHCR):
git tag vX.Y.Z
git push origin vX.Y.Z
MIT