Add category support for RSS/Atom feed articles by JulienTant · Pull Request #12 · JulienTant/blogwatcher-cli

JulienTant · 2026-04-03T22:10:49Z

Summary

Parse and store categories/tags from RSS and Atom feeds into articles
Add --category / -c flag to the articles command for filtering
Display categories in article output when present
New database migration (000002_add_categories) adds the categories column
Backward compatible: existing articles get nil categories gracefully

Inspired by Hyaxia/blogwatcher#11 by @weronikakombat, reimplemented to follow our project patterns (squirrel, golang-migrate, context threading, testify).

Usage

# Show all articles
blogwatcher-cli articles --all

# Filter by category
blogwatcher-cli articles --category "AI"

# Combined with other filters
blogwatcher-cli articles --blog "TechCrunch" --category "Security"

Test plan

Unit tests for category round-trip in storage layer
Unit tests for ListArticles category filtering
Unit tests for bulk insert with categories
Unit tests for RSS feed category parsing
Unit tests for scanner category pass-through
Unit tests for controller category filter
All existing e2e tests pass (29 tests, 0 failures)
golangci-lint: 0 issues

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- SSRF protection enabled by default; added explicit --unsafe-client flag to allow unsafe requests
- Article categories: persisted, shown in article output, and filterable via a --category option
Tests
- New/updated unit and e2e tests covering SSRF protection, category parsing, persistence, and filtering; updated fixtures
Documentation
- Developer guide: added a “Before speaking” section about grounding and verifying claims

- Use DataDog go-secure-sdk for SSRF protection (blocks private IPs, loopback, link-local, cloud metadata endpoints) - Refactor rss.Fetcher and scraper.Scraper as structs holding *http.Client - Scanner accepts Fetcher + Scraper as dependencies (no global state) - Add --unsafe-client / BLOGWATCHER_UNSAFE_CLIENT flag for local dev - ScanBlog now returns error (hard fail, not soft error string) - E2e test verifies safe client blocks loopback requests - Tests use plain http.Client (no SSRF check needed for httptest) - Update AGENTS.md: ground claims in evidence, not guesses Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@weronikakombat

Parse and store categories/tags from RSS and Atom feeds, with a new --category flag on the articles command to filter by category. Inspired by Hyaxia/blogwatcher#11 (by @weronikakombat). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-04-03T22:12:26Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e78d57f8-dbd9-489d-879a-d9dbec9c1895

📥 Commits

Reviewing files that changed from the base of the PR and between d9ecfa5 and cea7563.

📒 Files selected for processing (1)

internal/storage/database.go

🚧 Files skipped from review as they are similar to previous changes (1)

internal/storage/database.go

📝 Walkthrough

Walkthrough

Adds SSRF-safe configurable HTTP clients and propagates them into RSS fetcher and scraper; refactors scanner into a dependency-injected type returning explicit errors; introduces Article.Categories (parsing, DB column, JSON helpers, filtering, CLI flag), updates tests/e2e fixtures, and extends AGENTS.md with a “Before speaking” section.

Changes

Cohort / File(s)	Summary
Docs `AGENTS.md`	Adds a "Before speaking" section requiring claims be grounded in files/docs/command output and instructing verification steps for uncertain dependencies/APIs.
CLI root & e2e harness `internal/cli/root.go`, `e2e/e2e_test.go`	Adds persistent `--unsafe-client` flag; e2e test harness forces unsafe-client in flags/env modes; adds `TestSSRFProtection` and a new category-filter e2e assertion.
HTTP client & deps `go.mod`, `internal/cli/commands.go`	Adds go-secure-sdk and x/sync; introduces `newHTTPClient` factory to produce safe/unsafe *http.Client and wires it into downstream components.
RSS fetcher & tests `internal/rss/rss.go`, `internal/rss/rss_test.go`, `e2e/testdata/github_blog.rss`	Replaces package-level feed functions with `rss.Fetcher` carrying an injected *http.Client; parses `<category>` into `FeedArticle.Categories`; tests and fixtures updated.
Scraper & tests `internal/scraper/scraper.go`, `internal/scraper/scraper_test.go`	Converts standalone scraper function into `Scraper` type with injected *http.Client and method `ScrapeBlog`; tests use `NewScraper`.
Scanner refactor & tests `internal/scanner/scanner.go`, `internal/scanner/scanner_test.go`	Introduces `Scanner` (fetcher+scraper), converts free Scan* functions to methods that return explicit errors, removes `ScanResult.Error`; updates tests and adds RSS-with-categories scan test.
Model, DB schema & storage tests `internal/model/model.go`, `internal/storage/database.go`, `internal/storage/database_test.go`, `internal/storage/migrations/*`	Adds `Article.Categories []string`, persists categories as JSON in new `articles.categories` column, adds JSON helpers, extends `ListArticles` with optional `category *string` filter, updates migrations and many DB tests.
Controller & CLI articles flag `internal/controller/controller.go`, `internal/controller/controller_test.go`, `internal/cli/commands.go`	Extends `GetArticles` signature with `category string`, threads filter to storage, adds `--category/-c` flag, and updates/adds controller tests.
E2E expected outputs `e2e/expected/*`	Updates e2e expected fixtures to include `Categories:` lines for affected entries and adds a new expected file for category-filtered output.

Sequence Diagram(s)

mermaid
sequenceDiagram
participant CLI
participant NewHTTP as NewHTTPClient
participant Scanner
participant Fetcher
participant Scraper
participant DB
participant External as BlogServer
CLI->>NewHTTP: create client (safe/unsafe)
NewHTTP->>Scanner: provide *http.Client
CLI->>Scanner: trigger scan or list (with category)
Scanner->>Fetcher: DiscoverFeedURL / ParseFeed (uses client)
Fetcher->>External: HTTP request
External-->>Fetcher: feed response
alt RSS parse succeeds
Fetcher-->>Scanner: Feed articles (with Categories)
else RSS parse fails and ScrapeSelector set
Scanner->>Scraper: ScrapeBlog (uses same client)
Scraper->>External: HTTP request
External-->>Scraper: HTML response
Scraper-->>Scanner: scraped articles
end
Scanner->>DB: persist articles (including Categories)
DB-->>Scanner: ack / list results

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~70 minutes

Possibly related PRs

Add e2e tests and switch CI to mise #7: Overlaps changes to e2e harness, scanner/rss/scraper refactors and test wiring.
Add env variable support and custom DB path #1: Related modifications to CLI flag/config wiring (root flags and viper usage).

Poem

🐰 I hopped through code beneath the pale monitor light,
Clients now guarded to keep loopbacks tight,
Categories sprout on each article stem,
Scanners and fetchers all passing to them,
I nibble a carrot — tests green and bright!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 6.25% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'Add category support for RSS/Atom feed articles' clearly summarizes the main change: adding category parsing and display functionality for feed articles.
Description check	✅ Passed	The description includes all required template sections: a comprehensive summary explaining what was implemented and why, and a detailed test plan with all items checked off, covering unit tests, e2e tests, and linting.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/category-support

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

internal/scanner/scanner.go (1)

153-190: ⚠️ Potential issue | 🔴 Critical

The current worker-pool error path can deadlock ScanAllBlogs.

jobs is unbuffered, and errs is not drained until after every job has been queued. If all workers exit early, the send loop blocks forever; if only one fails, the function can still return while other workers are still writing to the database.

💡 Suggested fix

-	jobs := make(chan job)
+	jobs := make(chan job, len(blogs))
 	results := make([]ScanResult, len(blogs))
 	errs := make(chan error, workers)
@@
-	for i := 0; i < workers; i++ {
-		if err := <-errs; err != nil {
-			return nil, err
-		}
-	}
-
-	return results, nil
+	var firstErr error
+	for i := 0; i < workers; i++ {
+		if err := <-errs; err != nil && firstErr == nil {
+			firstErr = err
+		}
+	}
+	if firstErr != nil {
+		return nil, firstErr
+	}
+
+	return results, nil

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@internal/scanner/scanner.go` around lines 153 - 190, ScanAllBlogs can
deadlock because jobs is unbuffered and errs is only read after enqueuing all
jobs; fix by adding cancellation and proper synchronization: create a
cancellable context (ctx, cancel := context.WithCancel(ctx)), make errs buffered
to capacity workers, use a sync.WaitGroup to track worker goroutines, have each
worker select on ctx.Done() while reading from jobs and when encountering an
error call cancel() and send the error into errs (use non-blocking send or
buffered errs so send cannot block), ensure the job-sender loop selects on
ctx.Done() to stop enqueuing when cancelled, close jobs after enqueuing, wait
for wg.Done and then close(errs) and range over errs to return the first non-nil
error; reference the jobs channel, errs channel, the worker goroutine that calls
storage.OpenDatabase and s.ScanBlog, and ensure workerDB.Close runs in defer as
currently written.

🧹 Nitpick comments (2)

AGENTS.md (1)

3-3: Consider varying the sentence structure.

Three consecutive sentences begin with "If you're unsure," which slightly reduces readability. You might rephrase for variety while keeping the same meaning.

✍️ Optional rephrasing for variety

-Never guess. Ground every claim in actual files, docs, or command output. If you're unsure about a dependency's size, run `go get` and check. If you're unsure about an API, read the source or docs. If you're unsure about behavior, write a test. State what you verified and how — don't speculate.
+Never guess. Ground every claim in actual files, docs, or command output. When uncertain about a dependency's size, run `go get` and check. For APIs, read the source or documentation. For behavior, write a test to confirm. Always state what you verified and how — don't speculate.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@AGENTS.md` at line 3, Update the three consecutive sentences that start with
"If you're unsure" in AGENTS.md to vary sentence openings while preserving
meaning; reword one or two to begin differently (e.g., "When in doubt, run `go
get`...", "To confirm an API's behavior, read the source or docs", "If behavior
is unclear, write a test") and keep each instruction concrete about what to
verify and how to verify.

internal/rss/rss_test.go (1)

48-80: Consider adding an Atom fixture here too.

This currently only locks in RSS category parsing. An Atom sample would cover the second advertised feed format and make regressions here much easier to spot.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@internal/rss/rss_test.go` around lines 48 - 80, TestParseFeedWithCategories
only exercises RSS; add an Atom fixture and assert Atom category parsing to
prevent regressions. Update the test (TestParseFeedWithCategories) to include an
Atom feed string (with <entry> elements and <category> tags) and either start a
second httptest server or extend the existing handler to serve the Atom fixture
at a distinct path, then call newTestFetcher().ParseFeed for that Atom URL and
add assertions analogous to require.Equal/require.Nil that validate the returned
articles' Categories for the Atom entries. Ensure the test references ParseFeed
and the same article struct expectations so Atom parsing is covered alongside
RSS.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/scanner/scanner.go`:
- Around line 45-49: The code currently sets blog.FeedURL = discovered and calls
db.UpdateBlog(ctx, blog) immediately when a discovery string is returned, which
can persist a bad URL before ParseFeed runs; change the flow in the scanner
(scanner.go) so that you do NOT call db.UpdateBlog or mutate blog.FeedURL when
discovered != "" up front—instead assign feedURL = discovered only for use in
the following ParseFeed call (the ParseFeed function invocation), attempt to
parse, and only if ParseFeed succeeds update blog.FeedURL and call
db.UpdateBlog(ctx, blog); ensure any early returns on parse failure do not
persist the discovered value so future scans can re-discover.

In `@internal/storage/database.go`:
- Around line 363-365: The code currently uses sq.Like on the "categories"
string column (checking variable category with query.Where(sq.Like{"categories":
"%" + *category + "%"}) ), which causes boundary-blind matches; change the
storage and query to a structured categories representation (e.g., JSONB array
or SQL array) and perform exact matching using the DB's containment or
membership operator (e.g., jsonb @> or ARRAY containment) instead of LIKE.
Update the query construction in the database.go block that references category
(and the similar logic around lines 523-535) to use sq.Expr with the appropriate
operator (e.g., "categories @> to_jsonb(array[?])" or "categories @> ?::jsonb"/"
? = ANY(categories)" depending on chosen schema) and add a migration to convert
the existing comma-joined values to the new array/JSONB format.

---

Outside diff comments:
In `@internal/scanner/scanner.go`:
- Around line 153-190: ScanAllBlogs can deadlock because jobs is unbuffered and
errs is only read after enqueuing all jobs; fix by adding cancellation and
proper synchronization: create a cancellable context (ctx, cancel :=
context.WithCancel(ctx)), make errs buffered to capacity workers, use a
sync.WaitGroup to track worker goroutines, have each worker select on ctx.Done()
while reading from jobs and when encountering an error call cancel() and send
the error into errs (use non-blocking send or buffered errs so send cannot
block), ensure the job-sender loop selects on ctx.Done() to stop enqueuing when
cancelled, close jobs after enqueuing, wait for wg.Done and then close(errs) and
range over errs to return the first non-nil error; reference the jobs channel,
errs channel, the worker goroutine that calls storage.OpenDatabase and
s.ScanBlog, and ensure workerDB.Close runs in defer as currently written.

---

Nitpick comments:
In `@AGENTS.md`:
- Line 3: Update the three consecutive sentences that start with "If you're
unsure" in AGENTS.md to vary sentence openings while preserving meaning; reword
one or two to begin differently (e.g., "When in doubt, run `go get`...", "To
confirm an API's behavior, read the source or docs", "If behavior is unclear,
write a test") and keep each instruction concrete about what to verify and how
to verify.

In `@internal/rss/rss_test.go`:
- Around line 48-80: TestParseFeedWithCategories only exercises RSS; add an Atom
fixture and assert Atom category parsing to prevent regressions. Update the test
(TestParseFeedWithCategories) to include an Atom feed string (with <entry>
elements and <category> tags) and either start a second httptest server or
extend the existing handler to serve the Atom fixture at a distinct path, then
call newTestFetcher().ParseFeed for that Atom URL and add assertions analogous
to require.Equal/require.Nil that validate the returned articles' Categories for
the Atom entries. Ensure the test references ParseFeed and the same article
struct expectations so Atom parsing is covered alongside RSS.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4f3a0f4b-53b5-4c04-9edc-d850611b07ae

📥 Commits

Reviewing files that changed from the base of the PR and between 2972f35 and 61dd750.

⛔ Files ignored due to path filters (1)

go.sum is excluded by !**/*.sum

📒 Files selected for processing (18)

AGENTS.md
e2e/e2e_test.go
go.mod
internal/cli/commands.go
internal/cli/root.go
internal/controller/controller.go
internal/controller/controller_test.go
internal/model/model.go
internal/rss/rss.go
internal/rss/rss_test.go
internal/scanner/scanner.go
internal/scanner/scanner_test.go
internal/scraper/scraper.go
internal/scraper/scraper_test.go
internal/storage/database.go
internal/storage/database_test.go
internal/storage/migrations/000002_add_categories.down.sql
internal/storage/migrations/000002_add_categories.up.sql

internal/scanner/scanner.go

- Only persist discovered feed URL after ParseFeed succeeds - Fix category LIKE query to use delimited matching (prevents "AI" matching "FAIR") - Replace scanner worker pool with errgroup to prevent deadlock - Add categories to GitHub RSS fixture and e2e test for --category filter Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Use json.Marshal/Unmarshal for categories serialization - Query with json_each() for exact element matching instead of LIKE Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai

🧹 Nitpick comments (2)

internal/rss/rss_test.go (1)
79-79: Avoid nil-specific assertion for empty categories

Line 79 is stricter than needed and can make the test brittle. Prefer asserting emptiness rather than exact nil.
Proposed test assertion change
-	require.Nil(t, articles[1].Categories)
+	require.Empty(t, articles[1].Categories)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/rss/rss_test.go` at line 79, The test is too strict by asserting a
nil slice; update the assertion to check for emptiness instead—replace the
nil-specific check on articles[1].Categories with an emptiness assertion (e.g.,
require.Empty(t, articles[1].Categories)) so the test passes for both nil and
zero-length slices while still verifying no categories are present.
internal/storage/database.go (1)
363-372: Delimited LIKE matching improves but doesn't fully solve boundary issues.

The four-way OR pattern correctly handles prefix/suffix/middle positions to avoid "AI" matching "FAIR". However, if a category value itself contains a comma (e.g., "AI, ML"), the round-trip through categoriesToString/categoriesFromString will corrupt it into two separate categories.

This is acceptable for an MVP if category values are controlled/sanitized upstream (RSS feeds rarely use commas in tags). Consider documenting this limitation or escaping/quoting values if user-defined categories are planned.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/storage/database.go` around lines 363 - 372, The current delimited
LIKE matching around the `category` filter avoids partial matches but breaks
when a category value contains commas (since
`categoriesToString`/`categoriesFromString` round-trip will split on commas);
either document this limitation near the serialization helpers or implement
escaping/quoting in `categoriesToString` and `categoriesFromString` and update
the SQL matching logic to use the escaped form so commas are preserved; locate
the serialization helpers (`categoriesToString`, `categoriesFromString`) and
either add a clear comment about the comma limitation or change them to escape
commas (and adjust the code that builds `query` using `category` accordingly).

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@internal/rss/rss_test.go`:
- Line 79: The test is too strict by asserting a nil slice; update the assertion
to check for emptiness instead—replace the nil-specific check on
articles[1].Categories with an emptiness assertion (e.g., require.Empty(t,
articles[1].Categories)) so the test passes for both nil and zero-length slices
while still verifying no categories are present.

In `@internal/storage/database.go`:
- Around line 363-372: The current delimited LIKE matching around the `category`
filter avoids partial matches but breaks when a category value contains commas
(since `categoriesToString`/`categoriesFromString` round-trip will split on
commas); either document this limitation near the serialization helpers or
implement escaping/quoting in `categoriesToString` and `categoriesFromString`
and update the SQL matching logic to use the escaped form so commas are
preserved; locate the serialization helpers (`categoriesToString`,
`categoriesFromString`) and either add a clear comment about the comma
limitation or change them to escape commas (and adjust the code that builds
`query` using `category` accordingly).

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1b02f9dc-76c3-438a-a4ea-2ed17dd1a59b

📥 Commits

Reviewing files that changed from the base of the PR and between 61dd750 and 0b0c115.

📒 Files selected for processing (10)

e2e/e2e_test.go
e2e/expected/11_articles_unread.txt
e2e/expected/12b_articles_filter_category.txt
e2e/expected/20_articles_all.txt
e2e/testdata/github_blog.rss
go.mod
internal/rss/rss.go
internal/rss/rss_test.go
internal/scanner/scanner.go
internal/storage/database.go

✅ Files skipped from review due to trivial changes (5)

e2e/expected/12b_articles_filter_category.txt
e2e/testdata/github_blog.rss
e2e/expected/11_articles_unread.txt
e2e/expected/20_articles_all.txt
go.mod

🚧 Files skipped from review as they are similar to previous changes (1)

internal/rss/rss.go

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

internal/storage/database.go (1)

223-257: ⚠️ Potential issue | 🟠 Major

Don't silently suppress category serialization/deserialization errors.

categoriesToJSON and categoriesFromJSON swallow JSON errors by returning nil, which silently loses categories when marshaling or database values are malformed. This violates the guideline to always handle errors in Go code.

Modify both helpers to return errors, and update call sites in AddArticle (line 224), AddArticlesBulk (line 257), and scanArticle (line 481) to propagate them.

Proposed patch

-func categoriesToJSON(categories []string) *string {
+func categoriesToJSON(categories []string) (*string, error) {
 	if len(categories) == 0 {
-		return nil
+		return nil, nil
 	}
 	b, err := json.Marshal(categories)
 	if err != nil {
-		return nil
+		return nil, err
 	}
 	s := string(b)
-	return &s
+	return &s, nil
 }

-func categoriesFromJSON(s sql.NullString) []string {
+func categoriesFromJSON(s sql.NullString) ([]string, error) {
 	if !s.Valid || s.String == "" {
-		return nil
+		return nil, nil
 	}
 	var cats []string
 	if err := json.Unmarshal([]byte(s.String), &cats); err != nil {
-		return nil
+		return nil, err
 	}
-	return cats
+	return cats, nil
 }

-		Values(article.BlogID, article.Title, article.URL, formatTimePtr(article.PublishedDate), formatTimePtr(article.DiscoveredDate), article.IsRead, categoriesToJSON(article.Categories)).
+	categoriesJSON, err := categoriesToJSON(article.Categories)
+	if err != nil {
+		return article, err
+	}
+	result, err := sq.Insert("articles").
+		Columns("blog_id", "title", "url", "published_date", "discovered_date", "is_read", "categories").
+		Values(article.BlogID, article.Title, article.URL, formatTimePtr(article.PublishedDate), formatTimePtr(article.DiscoveredDate), article.IsRead, categoriesJSON).

-			categoriesToJSON(article.Categories),
+			categoriesJSON,

-		Categories: categoriesFromJSON(categories),
+	cats, err := categoriesFromJSON(categories)
+	if err != nil {
+		return nil, err
+	}
+	article := &model.Article{
+		ID:         id,
+		BlogID:     blogID,
+		Title:      title,
+		URL:        url,
+		IsRead:     isRead,
+		Categories: cats,
+	}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@internal/storage/database.go` around lines 223 - 257, The helpers
categoriesToJSON and categoriesFromJSON currently swallow JSON errors; change
their signatures to return ([]byte, error) and ([]string, error) (or appropriate
types plus error) and propagate JSON marshal/unmarshal errors instead of
returning nil; update all call sites — AddArticle (where categoriesToJSON is
used in the single-insert block), AddArticlesBulk (inside the loop building
insert.Values), and scanArticle (where categoriesFromJSON is used when scanning
DB rows) — to check the error, return it to the caller, and in AddArticlesBulk
ensure the transaction is aborted (tx.Rollback) on any serialization error
before returning so partial inserts are not committed. Ensure function
signatures and returned errors are updated accordingly throughout the call
chain.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/storage/database.go`:
- Around line 363-367: Replace the raw SQL predicate string in the categories
filter with a squirrel expression: instead of query.Where("EXISTS (SELECT 1 FROM
json_each(categories) WHERE json_each.value = ?)", *category), call query =
query.Where(sq.Expr("EXISTS (SELECT 1 FROM json_each(categories) WHERE
json_each.value = ?)", *category)) so the filter uses a squirrel expression like
the existing blogID filter that uses sq.Eq{}; update the code around the query,
category, and categories references accordingly.

---

Outside diff comments:
In `@internal/storage/database.go`:
- Around line 223-257: The helpers categoriesToJSON and categoriesFromJSON
currently swallow JSON errors; change their signatures to return ([]byte, error)
and ([]string, error) (or appropriate types plus error) and propagate JSON
marshal/unmarshal errors instead of returning nil; update all call sites —
AddArticle (where categoriesToJSON is used in the single-insert block),
AddArticlesBulk (inside the loop building insert.Values), and scanArticle (where
categoriesFromJSON is used when scanning DB rows) — to check the error, return
it to the caller, and in AddArticlesBulk ensure the transaction is aborted
(tx.Rollback) on any serialization error before returning so partial inserts are
not committed. Ensure function signatures and returned errors are updated
accordingly throughout the call chain.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e4d4e2d4-3123-47a1-a7b4-2dabf4145f39

📥 Commits

Reviewing files that changed from the base of the PR and between 0b0c115 and 10c9812.

📒 Files selected for processing (1)

internal/storage/database.go

coderabbitai · 2026-04-04T00:34:51Z

internal/storage/database.go

+	if category != nil && *category != "" {
+		// Categories are stored as a JSON string array. Use json_each()
+		// for exact element matching.
+		query = query.Where("EXISTS (SELECT 1 FROM json_each(categories) WHERE json_each.value = ?)", *category)
+	}


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Verify no raw SQL predicate remains for category filtering and sq.Expr is used. rg -nP --type=go 'Where\("EXISTS \(SELECT 1 FROM json_each\(categories\) WHERE json_each\.value = \?"' internal/storage/database.go rg -nP --type=go 'Where\(sq\.Expr\("EXISTS \(SELECT 1 FROM json_each\(categories\) WHERE json_each\.value = \?"' internal/storage/database.go

Repository: JulienTant/blogwatcher-cli

Length of output: 52

🏁 Script executed:

# Read the actual content at the specified lines sed -n '360,370p' internal/storage/database.go

Repository: JulienTant/blogwatcher-cli

Length of output: 465

Use a squirrel expression instead of the raw SQL string in the Where call.

Line 366 uses a raw SQL predicate string, which violates the project's requirement to use the squirrel library exclusively. The blogID filter on line 362 correctly uses sq.Eq{}, so apply the same pattern for the category filter.

Proposed patch

- query = query.Where("EXISTS (SELECT 1 FROM json_each(categories) WHERE json_each.value = ?)", *category) + query = query.Where(sq.Expr("EXISTS (SELECT 1 FROM json_each(categories) WHERE json_each.value = ?)", *category))

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@internal/storage/database.go` around lines 363 - 367, Replace the raw SQL predicate string in the categories filter with a squirrel expression: instead of query.Where("EXISTS (SELECT 1 FROM json_each(categories) WHERE json_each.value = ?)", *category), call query = query.Where(sq.Expr("EXISTS (SELECT 1 FROM json_each(categories) WHERE json_each.value = ?)", *category)) so the filter uses a squirrel expression like the existing blogID filter that uses sq.Eq{}; update the code around the query, category, and categories references accordingly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

JulienTant and others added 2 commits April 3, 2026 14:58

Merge branch 'main' into feat/category-support

93a740a

coderabbitai bot reviewed Apr 3, 2026

View reviewed changes

internal/scanner/scanner.go Outdated Show resolved Hide resolved

JulienTant and others added 3 commits April 3, 2026 17:24

Fix test compatibility with Fetcher struct API

0b0c115

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Store categories as JSON array instead of comma-separated string

10c9812

- Use json.Marshal/Unmarshal for categories serialization - Query with json_each() for exact element matching instead of LIKE Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai bot reviewed Apr 4, 2026

View reviewed changes

Case-insensitive category filtering with LOWER() + test

d9ecfa5

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai bot reviewed Apr 4, 2026

View reviewed changes

Propagate JSON serialization errors in category helpers

cea7563

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

JulienTant merged commit a1f2b0a into main Apr 4, 2026
2 checks passed

JulienTant deleted the feat/category-support branch April 4, 2026 00:47

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add category support for RSS/Atom feed articles#12

Add category support for RSS/Atom feed articles#12
JulienTant merged 8 commits intomainfrom
feat/category-support

JulienTant commented Apr 3, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Apr 3, 2026 •

edited

Loading

Reviews paused

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Apr 4, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

JulienTant commented Apr 3, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Usage

Test plan

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Apr 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Apr 4, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

JulienTant commented Apr 3, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Apr 3, 2026 •

edited

Loading