Fix duplicate feed links extracted from HTTP pages (#331)
Conversation
This change is also very welcome, but #219 was about another issue:
But don't close this PR, it is also a nice UX touch. We need it as well (I will review it tomorrow).
Maybe I misunderstood, but I remember how I found this small bug. We discussed it on Discord: if you open a site via HTTP, the feed link extracted from the anchor appears in the list as http, creating a duplicate of the same link that already exists as https. So I assumed that only this behavior needed to be fixed. Now I need to think about this subscription issue 😀
I think I see the core issue now. We use the feed URL itself as the filter key. The clean solution would be to introduce a normalized identifier, but as a simpler workaround without schema changes, we could try checking the feed with both URL variants.
Or we can first check the exact URL and, if nothing is found, make another check with the other protocol (http/https).
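The two-step lookup suggested above could be sketched roughly as follows. This is a hypothetical illustration, not the PR's actual code: `findFeedByUrl` stands in for whatever storage lookup the app really uses, and `swapProtocol` is a helper invented here for the example.

```typescript
type Feed = { url: string; title: string };

// Swap http <-> https on a URL; return other URLs unchanged.
function swapProtocol(url: string): string {
  if (url.startsWith("https://")) return "http://" + url.slice("https://".length);
  if (url.startsWith("http://")) return "https://" + url.slice("http://".length);
  return url;
}

// Two-step lookup: exact match first, then retry with the other protocol.
function findFeed(
  findFeedByUrl: (url: string) => Feed | undefined,
  url: string,
): Feed | undefined {
  const exact = findFeedByUrl(url);
  if (exact) return exact;
  return findFeedByUrl(swapProtocol(url));
}
```

This keeps the stored URLs untouched (no schema change) at the cost of one extra lookup on a miss.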
Thanks for the UX improvement!
Adds a `dedupeLinks` function to ensure that feed links differing only by protocol are not duplicated.

Motivation
When a user provides an HTTP URL, the fetched HTML may reference the same feed using both absolute HTTPS links and relative links resolved from the HTTP page. This resulted in duplicate feed candidates being returned.
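A minimal sketch of what such a dedupe step could look like; the actual `dedupeLinks` in this PR may differ. The assumption here is that links differing only by protocol should collapse to one entry, preferring the https variant when both appear.

```typescript
// Collapse feed links that differ only by protocol (http vs https).
function dedupeLinks(links: string[]): string[] {
  const byKey = new Map<string, string>();
  for (const link of links) {
    // Key ignores the protocol so http/https variants collide.
    const key = link.replace(/^https?:\/\//, "");
    const existing = byKey.get(key);
    // Keep the https variant when both protocols are present.
    if (!existing || link.startsWith("https://")) {
      byKey.set(key, link);
    }
  }
  return [...byKey.values()];
}
```

Note that this only handles the protocol; other URL variations (trailing slashes, `www.` prefixes) would still produce distinct entries.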
Note
For manual testing, you can use this page. It contains three potential feeds that would appear as duplicates if you use HTTP without this fix.