
Prevent event reindexing (on same HS) by only allowing a cursor to change incrementally#662

Open
ok300 wants to merge 7 commits into pubky:main from ok300:ok300-prevent-hs-reindexing

Conversation

@ok300
Contributor

@ok300 ok300 commented Jan 2, 2026

This PR brings in ok300#12.

@ok300 ok300 force-pushed the ok300-prevent-hs-reindexing branch from 41fa484 to 0691214 Compare January 2, 2026 14:40
@ok300 ok300 marked this pull request as ready for review January 2, 2026 15:50
@ok300 ok300 requested review from SHAcollision and tipogi January 2, 2026 15:50
Collaborator

@SHAcollision SHAcollision left a comment

Hey @ok300, nice work!

Two concerns:

  1. Backwards compatibility: I am not sure about this, but the old Homeserver cursors may have been stored in Redis as strings because of their Crockford encoding (e.g., "0000000000000"). With the field now defined as u32 and deserialized directly from Redis, any existing string cursor would fail to deserialize, causing get_from_index/validate_cursor_change to error before processing. A data migration, or a custom deserializer that accepts both strings and numbers, may be needed to avoid runtime failures during rollout. Have you tried running this fix over an existing homeserver?

  2. Cursor range may be too small: The new u32 type limits cursor values to ~4.29B. If homeserver cursors can exceed this (the former 13-digit cursors likely did), .parse::<u32>() will reject valid cursors and prevent further updates. Is the cursor on pubky-core/pubky-homeserver also u32? We could consider u64 (or string+numeric parsing) if we might sometimes expect larger cursors.
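To make the range concern concrete, here is a minimal sketch of how decimal cursor strings behave under `.parse::<u32>()` versus `.parse::<u64>()`. The helper names `parse_cursor_u32` and `parse_cursor_u64` are hypothetical and are not taken from the PR; only the standard library's `str::parse` is used.

```rust
use std::num::ParseIntError;

/// Hypothetical helper: parse a decimal cursor string into the new `u32` type.
/// Fails for any value above `u32::MAX` (4_294_967_295).
fn parse_cursor_u32(raw: &str) -> Result<u32, ParseIntError> {
    raw.parse::<u32>()
}

/// Hypothetical helper: the wider `u64` alternative raised in this review.
fn parse_cursor_u64(raw: &str) -> Result<u64, ParseIntError> {
    raw.parse::<u64>()
}
```

Under this sketch, a 13-digit decimal value such as "9999999999999" is rejected by the `u32` variant and accepted by the `u64` variant, while an all-zero string parses to 0 in both, since leading zeros are accepted.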

claude and others added 3 commits January 12, 2026 10:54
Add custom serde deserializer for the cursor field to handle backwards
compatibility with old cached data where cursor was stored as a string
(e.g., "0000000000000"). This allows deserialization from Redis to work
with both legacy string format and the new u32 numeric format.
…g-ko5Qm

Support deserializing Homeserver cursor from both string and number
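
The idea in the commit message above can be sketched without any crates. The `RawCursor` enum, `CursorError` type, and `normalize_cursor` function below are hypothetical names for illustration and are not the PR's actual code. With serde, an `#[serde(untagged)]` enum is one common way to obtain such a two-shaped value; that wiring is assumed and omitted here.

```rust
use std::convert::TryFrom;

/// The two shapes a cached cursor might take (assumption: legacy string
/// or new numeric form, as described in the commit message).
#[derive(Debug, Clone, PartialEq)]
enum RawCursor {
    Text(String),
    Number(u64),
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum CursorError {
    /// The string could not be parsed as a decimal `u32`.
    Unparseable(String),
    /// The number does not fit into `u32`.
    OutOfRange(u64),
}

/// Normalize either shape into the `u32` cursor type.
fn normalize_cursor(raw: RawCursor) -> Result<u32, CursorError> {
    match raw {
        RawCursor::Text(s) => {
            // Leading zeros are fine: "0000000000000" parses to 0.
            let parsed = s.trim().parse::<u32>();
            match parsed {
                Ok(n) => Ok(n),
                Err(_) => Err(CursorError::Unparseable(s)),
            }
        }
        RawCursor::Number(n) => {
            u32::try_from(n).map_err(|_| CursorError::OutOfRange(n))
        }
    }
}
```

In this sketch a non-decimal legacy cursor (for example a Crockford-encoded timestamp) is reported as `Unparseable` rather than silently mapped to a number; whether the real deserializer treats such values the same way is not stated in the thread.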
@ok300
Contributor Author

ok300 commented Jan 12, 2026

To your 1st point:

As part of the Postgres change, the HS also migrated its cursor logic such that the /events endpoint accepts both timestamp- and incremental-event-ID-based cursors, but returns the new cursor in the latter, incremental ID format.

High-level view of the flow:

  1. Nexus calls ?cursor={timestamp}
  2. The HS recognizes the timestamp and converts it into the corresponding event ID
  3. The HS processes the request as usual
  4. HS returns the new event ID cursor
  5. Nexus persists it and uses it as starting cursor in subsequent requests

In other words, when the HS version with the Postgres change went live, cursors were seamlessly migrated to the integer format. So no timestamp-based cursors remain in Nexus.
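
The flow described above can be illustrated with a toy in-memory model. Everything here is an assumption made for illustration: the `Cursor`, `Event`, and `ToyHomeserver` types, the rule that a timestamp maps to the last event at or before it, and the convention that event IDs start at 1 (so 0 means "before everything"). It is not the homeserver's actual implementation.

```rust
/// A cursor as Nexus might send it: a legacy timestamp or an event ID.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Cursor {
    Timestamp(u64),
    EventId(u64),
}

/// Toy event record; real HS events carry more fields.
#[derive(Debug, Clone, PartialEq)]
struct Event {
    id: u64,
    timestamp: u64,
}

/// In-memory stand-in for the HS event store, ordered by ascending `id`.
struct ToyHomeserver {
    events: Vec<Event>,
}

impl ToyHomeserver {
    /// Step 2: map a timestamp cursor to the ID of the last event at or
    /// before that timestamp (assumed rule); event-ID cursors pass through.
    fn to_event_id(&self, cursor: Cursor) -> u64 {
        match cursor {
            Cursor::EventId(id) => id,
            Cursor::Timestamp(ts) => self
                .events
                .iter()
                .filter(|e| e.timestamp <= ts)
                .map(|e| e.id)
                .max()
                .unwrap_or(0),
        }
    }

    /// Steps 3-4: return up to `limit` events after the cursor, together
    /// with the next cursor, always expressed as an event ID.
    fn events_after(&self, cursor: Cursor, limit: usize) -> (Vec<Event>, u64) {
        let start = self.to_event_id(cursor);
        let page: Vec<Event> = self
            .events
            .iter()
            .filter(|e| e.id > start)
            .take(limit)
            .cloned()
            .collect();
        // If nothing new was found, the cursor stays where it was.
        let next = page.last().map(|e| e.id).unwrap_or(start);
        (page, next)
    }
}
```

In this model, step 5 amounts to Nexus storing the returned `u64` and sending it back as `Cursor::EventId` on the next call, so a timestamp cursor is only ever sent once.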

As a live example, see https://homeserver.pubky.app/events/?cursor=0032W24W13DBW&limit=10 (it returns an integer as next cursor).

Have you tried running this fix over an existing homeserver?

Yes, I ran it against the Prod HS and it works.

I also tested it against an HS with no events, where the Nexus Redis already stored its cursor as the string "0000000000000". With the original PR it failed, but it's fixed in the latest commit da5e471.

It's important to note this won't work with old, pre-Postgres homeservers, as they're not using incremental event IDs.

@ok300
Contributor Author

ok300 commented Jan 12, 2026

To your second point:

Is the cursor on pubky-core/pubky-homeserver also u32 ?

It's i64 (link), probably due to the Postgres schema. This means the positive range is effectively u32.

If the HS gets a larger range for the event ID, we can easily adopt it.
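
As a sanity check on the types mentioned here, the following sketch converts an `i64` event ID into a `u32` cursor using the standard library's `u32::try_from`. The function name `event_id_to_cursor` is hypothetical. Note that `i64::MAX` (2^63 - 1) is far larger than `u32::MAX` (2^32 - 1), so the conversion is fallible and the sketch makes that explicit rather than truncating.

```rust
use std::convert::TryFrom;

/// Hypothetical helper: convert an HS event ID (`i64`) into a `u32` cursor.
/// Returns `None` for negative IDs and for IDs above `u32::MAX`.
fn event_id_to_cursor(event_id: i64) -> Option<u32> {
    u32::try_from(event_id).ok()
}
```

If event IDs ever approach `u32::MAX`, widening the cursor to `u64` would cover the whole non-negative `i64` range.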

@ok300 ok300 force-pushed the ok300-prevent-hs-reindexing branch from 09f19f7 to 0a7c417 Compare January 12, 2026 12:51
@ok300 ok300 force-pushed the ok300-prevent-hs-reindexing branch from 0a7c417 to 58724ea Compare January 12, 2026 13:32
re-indexing

A homeserver's cursor is not allowed to decrease, which is what would be
needed for this re-indexing test portion.
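
The rule stated in this commit message can be sketched as a small check. The function name `validate_cursor_change` appears in the review discussion, but its signature, the `CursorChangeError` type, and the choice to accept an unchanged cursor are assumptions for illustration; the PR's actual logic may differ.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum CursorChangeError {
    /// The proposed cursor is lower than the stored one.
    Decreased { current: u32, proposed: u32 },
}

/// Accept a cursor update only if it does not move backwards.
/// Assumption: an unchanged cursor (no new events) is allowed.
fn validate_cursor_change(current: u32, proposed: u32) -> Result<u32, CursorChangeError> {
    if proposed < current {
        Err(CursorChangeError::Decreased { current, proposed })
    } else {
        Ok(proposed)
    }
}
```

Under this sketch, resetting a stored cursor of 100 back to 0 is rejected, which is why a test that relies on re-indexing the same homeserver can no longer pass.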
@tipogi tipogi added the 👀 watcher Nexus indexer related operations label Feb 12, 2026
@tipogi tipogi added this to the 2026-Q1 milestone Feb 12, 2026