Prevent event reindexing (on same HS) by only allowing a cursor to change incrementally#662
Prevent event reindexing (on same HS) by only allowing a cursor to change incrementally#662ok300 wants to merge 7 commits intopubky:mainfrom
Conversation
41fa484 to
0691214
Compare
SHAcollision
left a comment
There was a problem hiding this comment.
Hey @ok300 , nice work!
Two concerns:
-
Backwards compatibility: I am not sure about this, but maybe the previous Homeserver cursors were previously stored in Redis as strings because of their crockford encoding (e.g.,
"0000000000000") ?. With the field now defined asu32and deserialized directly from Redis, any existing cursor will fail to deserialize, causingget_from_index/validate_cursor_changeto error before processing. Maybe a data migration or a custom deserializer that accepts both strings and numbers is needed to avoid runtime failures during rollout. Have you tried running this fix over an existing homeserver? -
Cursor range maybe be too small?: The new
u32type limits cursor values to ~4.29B. If homeserver cursors can exceed this (the former 13-digit likely did),.parse::<u32>()will reject valid cursors and prevent further updates. Is the cursor onpubky-core/pubky-homeserveralsou32? We could consideru64(or string+numeric parsing) if we might sometimes expect larger cursors.
Add custom serde deserializer for the cursor field to handle backwards compatibility with old cached data where cursor was stored as a string (e.g., "0000000000000"). This allows deserialization from Redis to work with both legacy string format and the new u32 numeric format.
…g-ko5Qm Support deserializing Homeserver cursor from both string and number
|
To your 1st point: As part of the Postgres change, the HS also migrated its cursor logic such that the High-level view of the flow:
In other words, when the HS version with the Postgres change went live, cursors were seamlessly migrated to the integer format. So no timestamp-based cursors remain in Nexus. As a live example, see https://homeserver.pubky.app/events/?cursor=0032W24W13DBW&limit=10 (it returns an integer as next cursor).
Yes, I ran it against the Prod HS and it works. I also tested it against a HS with no events, where the Nexus Redis already stored its cursor as the string It's important to note this won't work with old, pre-Postgres homeservers, as they're not using incremental event IDs. |
|
To your second point:
It's If the HS gets a larger range for the event ID, we can easily adopt it. |
09f19f7 to
0a7c417
Compare
0a7c417 to
58724ea
Compare
re-indexing A homeserver's cursor is not allowed to decrease, which is what would be needed for this re-indexing test portion.
This PR brings in ok300#12 .