blogwatcher-cli

A Go CLI tool to track blog articles, detect new posts, and manage read/unread status. It supports RSS/Atom feeds, with HTML scraping as a fallback.

Features

Dual Source Support - Tries RSS feeds first, falls back to HTML scraping
Automatic Feed Discovery - Detects RSS/Atom URLs from blog homepages
Read/Unread Management - Track which articles you've read
Blog Filtering - View articles from specific blogs
Duplicate Prevention - Never tracks the same article twice
Colored CLI Output - User-friendly terminal interface

Installation

# Install via go
go install github.com/JulienTant/blogwatcher-cli/cmd/blogwatcher-cli@latest

# Or build locally
go build ./cmd/blogwatcher-cli

# Or run via Docker
docker run --rm -v blogwatcher-cli:/data ghcr.io/julientant/blogwatcher-cli

Pre-built binaries for Linux, macOS, and Windows are available on the GitHub Releases page.

Usage

Adding Blogs

# Add a blog (auto-discovers RSS feed)
blogwatcher-cli add "My Favorite Blog" https://example.com/blog

# Add with explicit feed URL
blogwatcher-cli add "Tech Blog" https://techblog.com --feed-url https://techblog.com/rss.xml

# Add with HTML scraping selector (for blogs without feeds)
blogwatcher-cli add "No-RSS Blog" https://norss.com --scrape-selector "article h2 a"

Managing Blogs

# List all tracked blogs
blogwatcher-cli blogs

# Remove a blog (and all its articles)
blogwatcher-cli remove "My Favorite Blog"

# Remove without confirmation
blogwatcher-cli remove "My Favorite Blog" -y

Scanning for New Articles

# Scan all blogs for new articles
blogwatcher-cli scan

# Scan a specific blog
blogwatcher-cli scan "Tech Blog"

Viewing Articles

# List unread articles
blogwatcher-cli articles

# List all articles (including read)
blogwatcher-cli articles --all

# List articles from a specific blog
blogwatcher-cli articles --blog "Tech Blog"

Managing Read Status

# Mark an article as read (use article ID from articles list)
blogwatcher-cli read 42

# Mark an article as unread
blogwatcher-cli unread 42

# Mark all unread articles as read
blogwatcher-cli read-all

# Mark all unread articles as read for a blog (skip prompt)
blogwatcher-cli read-all --blog "Tech Blog" --yes

How It Works

Scanning Process

For each tracked blog, blogwatcher-cli first attempts to parse the RSS/Atom feed
If no feed URL is configured, it tries to auto-discover one from the blog homepage
If RSS parsing fails and a scrape_selector is configured, it falls back to HTML scraping
New articles are saved to the database as unread
Already-tracked articles are skipped
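
The steps above can be sketched in Go roughly as follows. This is a minimal illustration, not the project's actual code: the Blog fields, the Fetchers hooks, and scanBlog are hypothetical names, and treating "no feed found at all" the same as a failed feed parse is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoFeed is a hypothetical sentinel for "no feed URL configured or discovered".
var errNoFeed = errors.New("no feed URL available")

// Blog holds the per-blog settings the scan needs (field names are assumptions).
type Blog struct {
	Name           string
	URL            string
	FeedURL        string
	ScrapeSelector string
}

// Fetchers groups the network steps as plain functions so the control
// flow can be exercised without real HTTP calls. All hooks are hypothetical.
type Fetchers struct {
	DiscoverFeed func(homepage string) (string, error)
	ParseFeed    func(feedURL string) ([]string, error)
	Scrape       func(pageURL, selector string) ([]string, error)
}

// scanBlog returns the article URLs that are not yet in seen and records them.
func scanBlog(b Blog, f Fetchers, seen map[string]bool) ([]string, error) {
	feedURL := b.FeedURL
	if feedURL == "" {
		// No configured feed: try auto-discovery from the homepage.
		if found, err := f.DiscoverFeed(b.URL); err == nil {
			feedURL = found
		}
	}

	var links []string
	err := errNoFeed
	if feedURL != "" {
		links, err = f.ParseFeed(feedURL)
	}
	if err != nil {
		if b.ScrapeSelector == "" {
			return nil, fmt.Errorf("scan %q: %w", b.Name, err)
		}
		// Feed unavailable or unparsable: fall back to HTML scraping.
		links, err = f.Scrape(b.URL, b.ScrapeSelector)
		if err != nil {
			return nil, fmt.Errorf("scrape %q: %w", b.Name, err)
		}
	}

	var fresh []string
	for _, link := range links {
		if seen[link] {
			continue // already tracked, skip
		}
		seen[link] = true
		fresh = append(fresh, link)
	}
	return fresh, nil
}

func main() {
	f := Fetchers{
		DiscoverFeed: func(string) (string, error) { return "https://example.com/feed.xml", nil },
		ParseFeed: func(string) ([]string, error) {
			return []string{"https://example.com/a", "https://example.com/b"}, nil
		},
		Scrape: func(string, string) ([]string, error) { return nil, errNoFeed },
	}
	seen := map[string]bool{"https://example.com/a": true}
	fresh, err := scanBlog(Blog{Name: "Example", URL: "https://example.com"}, f, seen)
	fmt.Println(fresh, err) // prints: [https://example.com/b] <nil>
}
```

Passing the fetch steps in as functions keeps the fallback order easy to test with fakes; the in-memory seen map stands in for the database lookup.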

Feed Auto-Discovery

blogwatcher-cli searches for feeds in two ways:

Looking for <link rel="alternate"> tags with RSS/Atom types
Checking common feed paths: /feed, /rss, /feed.xml, /atom.xml, etc.
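
As a rough illustration of those two strategies, the sketch below builds a list of candidate feed URLs using only the standard library. Several things here are assumptions rather than the tool's actual behavior: the function names, the use of regular expressions instead of a real HTML parser (it only handles double-quoted attributes), trying link tags before common paths, and resolving the common paths against the site root.

```go
package main

import (
	"fmt"
	"net/url"
	"regexp"
	"strings"
)

// commonFeedPaths lists only the paths named above; the real list may be longer.
var commonFeedPaths = []string{"/feed", "/rss", "/feed.xml", "/atom.xml"}

var (
	linkTagRe = regexp.MustCompile(`(?i)<link\b[^>]*>`)
	attrRe    = regexp.MustCompile(`(?i)([a-z-]+)\s*=\s*"([^"]*)"`)
)

// feedLinks returns the hrefs of <link rel="alternate"> tags whose type is
// RSS or Atom, resolved against base.
func feedLinks(base *url.URL, page string) []string {
	var out []string
	for _, tag := range linkTagRe.FindAllString(page, -1) {
		attrs := map[string]string{}
		for _, m := range attrRe.FindAllStringSubmatch(tag, -1) {
			attrs[strings.ToLower(m[1])] = m[2]
		}
		typ := strings.ToLower(attrs["type"])
		isFeed := typ == "application/rss+xml" || typ == "application/atom+xml"
		if strings.ToLower(attrs["rel"]) != "alternate" || !isFeed || attrs["href"] == "" {
			continue
		}
		ref, err := url.Parse(attrs["href"])
		if err != nil {
			continue
		}
		out = append(out, base.ResolveReference(ref).String())
	}
	return out
}

// candidateFeedURLs lists link-tag feeds first, then the common paths.
func candidateFeedURLs(homepage, page string) ([]string, error) {
	base, err := url.Parse(homepage)
	if err != nil {
		return nil, err
	}
	out := feedLinks(base, page)
	for _, p := range commonFeedPaths {
		out = append(out, base.ResolveReference(&url.URL{Path: p}).String())
	}
	return out, nil
}

func main() {
	page := `<head><link rel="alternate" type="application/atom+xml" href="/index.xml"></head>`
	urls, err := candidateFeedURLs("https://example.com/blog", page)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, u := range urls {
		fmt.Println(u)
	}
}
```

A caller would then request each candidate in order and keep the first one that parses as a valid feed.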

HTML Scraping

When RSS isn't available, provide a CSS selector that matches article links:

# Example selectors
--scrape-selector "article h2 a"      # Links inside article h2 tags
--scrape-selector ".post-title a"     # Links with post-title class
--scrape-selector "#blog-posts a"     # Links inside blog-posts ID

Database

blogwatcher-cli stores data in SQLite at ~/.blogwatcher-cli/blogwatcher-cli.db.

If upgrading from the original Hyaxia/blogwatcher, migrate your existing database:

mkdir -p ~/.blogwatcher-cli
mv ~/.blogwatcher/blogwatcher.db ~/.blogwatcher-cli/blogwatcher-cli.db

Tables:

blogs - Tracked blogs (name, URL, feed URL, scrape selector)
articles - Discovered articles (title, URL, dates, read status)
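
One way to picture the two tables is as Go types, as in the sketch below. The field names, the ID and BlogID keys, and the split of "dates" into PublishedAt and DiscoveredAt are all assumptions made for illustration; the actual schema may differ. The unread helper mirrors what listing unread articles for one blog amounts to.

```go
package main

import (
	"fmt"
	"time"
)

// Blog corresponds to a row in the blogs table (hypothetical field names).
type Blog struct {
	ID             int64
	Name           string
	URL            string
	FeedURL        string
	ScrapeSelector string
}

// Article corresponds to a row in the articles table (hypothetical field names).
type Article struct {
	ID           int64
	BlogID       int64
	Title        string
	URL          string
	PublishedAt  time.Time // assumed: date reported by the feed or page
	DiscoveredAt time.Time // assumed: when a scan first saw the article
	Read         bool
}

// unread returns the articles that are not marked read. A blogID of 0
// means "all blogs"; any other value restricts the result to that blog.
func unread(articles []Article, blogID int64) []Article {
	var out []Article
	for _, a := range articles {
		if a.Read || (blogID != 0 && a.BlogID != blogID) {
			continue
		}
		out = append(out, a)
	}
	return out
}

func main() {
	now := time.Now()
	articles := []Article{
		{ID: 1, BlogID: 1, Title: "First post", DiscoveredAt: now, Read: true},
		{ID: 2, BlogID: 1, Title: "Second post", DiscoveredAt: now},
		{ID: 3, BlogID: 2, Title: "Other blog post", DiscoveredAt: now},
	}
	for _, a := range unread(articles, 1) {
		fmt.Println(a.ID, a.Title) // prints: 2 Second post
	}
}
```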

Development

Requirements

mise (manages Go, golangci-lint, gotestsum, goreleaser)

mise install

Running Tests

# Run all tests
gotestsum -- ./...

# Run e2e tests only
gotestsum -- ./e2e/ -count=1

# Update e2e expected output after intentional changes
UPDATE_EXPECTED=1 go test ./e2e/ -run TestE2E/flags

Publishing

Push a tag to trigger a release (binaries + Docker images to GHCR):

git tag vX.Y.Z
git push origin vX.Y.Z

License

MIT
