fix: put_index_list de-duplicates when appending to list #691
ok300 wants to merge 1 commit into pubky:main

Conversation
Modified lists::put to use an LREM + RPUSH pattern in a pipeline to prevent duplicate entries when re-indexing operations are performed.

For each value being added:
1. LREM removes all existing occurrences from the list (count=0 removes all)
2. RPUSH appends the value to the end of the list

This approach:
- Prevents duplicates during re-indexing operations
- Makes the operation idempotent (calling multiple times = calling once)
- Uses atomic pipeline execution (no Lua scripts required)
- Preserves append semantics (new values added to end)

Note: If a value already exists, it will be removed from its original position and re-added at the end of the list.

Files changed:
- nexus-common/src/db/kv/index/lists.rs: Updated put() implementation
- nexus-common/src/db/kv/traits.rs: Removed TODO warning about unsafe re-indexing
// Use a pipeline to remove duplicates then append each value
// LREM removes all existing occurrences (0 = all), then RPUSH appends to end
// This is atomic from the perspective of other Redis clients
That comment is not strictly correct in Redis terms:
- A pipeline sends multiple commands in one batch over the same connection, reducing round trips.
- But it does not make them atomic. Other clients’ commands can still be interleaved between these commands on the server.
To get true atomicity in Redis, you’d need MULTI / EXEC (transaction) or a Lua script (EVAL).
Two comments from me:
- A much better alternative for list uses that should not have duplicates and where order does not matter might be to use a Redis Set instead. So we might simply identify these use cases (if any) and migrate them to a set?
- I am not 100% sure that we never want duplicates in ANY of our uses of Redis Lists 🤔 Certainly this makes our implementation of "put_to_list" stop behaving like a regular list, so we need to account for that when we implement new features that should behave as lists in the future. In fact, I see we are using the List type for /events, which is a use case where we do expect and want duplicates. I believe this is our only List usage.
IMO the TODO was on the wrong side and we probably should not merge this. How about we just remove the TODO? 😄
Codex review:
⚠️ Ordering considerations (behavior change)
The new behavior changes ordering for any value that already exists in the list; it gets moved to the tail. That is a meaningful semantic change and should be confirmed as acceptable. If list order is significant (e.g., chronological or priority ordering), this change could be surprising.
⚠️ Partial update visibility / atomicity
LREM + RPUSH in a pipeline is not atomic. Other clients can observe transient states where the item has been removed but not yet re-added. If atomicity matters, this should be done inside a Lua script or a MULTI/EXEC transaction. A pipeline improves throughput, but does not provide isolation.
⚠️ Duplicate handling within a single batch
If the same value appears multiple times in the input (e.g., due to a caller bug), the current described approach will still append duplicates. If true de-duplication is required, the input list should be de-duplicated before issuing Redis commands.
Performance Review
- LREM is O(N) per value. If these lists are large or a large batch is appended, total cost can become O(N*M). Depending on dataset size, this might be acceptable but should be acknowledged.
- Pipelining mitigates round-trip cost but doesn't change algorithmic cost on Redis.
Tests / Validation
No tests were described. This change affects persistence semantics and ordering, so targeted tests would be valuable:
- Re-indexing does not create duplicates.
- Existing values move to the end (if that’s intended).
- No duplicate values remain after a batch insert.
- Concurrency scenario test (if possible), or at least documentation of atomicity limits.
Merge Readiness
Not yet ready. The deduplication behavior is good, but I recommend clarifying and/or addressing the following before merge:
- Confirm that “move-to-end” semantics are desired.
- Address atomicity, or explicitly document that transient removal is acceptable.
- Add tests to lock in the new behavior and prevent regressions.
This PR addresses a TODO in put_index_list which mentioned the method only inserts and doesn't check for duplicates.

The fix modifies lists::put to use an LREM + RPUSH pattern in a pipeline to prevent duplicate entries. For each value being added:
- LREM removes all existing occurrences from the list (count=0 removes all)
- RPUSH appends the value to the end of the list

Note: If a value already exists, it will be removed from its original position and re-added at the end of the list.