⚡ Optimize RedisNodeDiscoveryRepository.list_all_active_nodes performance #187

Open

Jeshua Ben Joseph (Theaxiom) wants to merge 3 commits into main from
perf/redis-node-discovery-8901727981925052959

Conversation

@Theaxiom
Member

💡 What: Replaced the scan_iter followed by a single massive mget in RedisNodeDiscoveryRepository.list_all_active_nodes with a paginated approach using scan(count=100) and chunked mget calls.
🎯 Why: The previous implementation loaded all keys into memory at once and executed a single mget for all active nodes. This was inefficient, blocked the Redis server for prolonged periods during the massive mget, and caused peak memory usage spikes in Python. The optimized chunked retrieval strategy mitigates these issues without altering the method's behavior.
📊 Measured Improvement:
A dedicated benchmark using fakeredis, comparing scan_iter + single mget against scan(count=100) + chunked mget over 10,000 mock nodes, showed a significant speedup:

  • Original implementation: 3.8764 seconds
  • Optimized implementation: 0.5520 seconds (~7x improvement)
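The chunked retrieval strategy described above could be sketched as follows. This is a simplified illustration assuming a redis-py style client: the name SCAN_BATCH_SIZE comes from the review discussion below, while _FakeRedis is a stand-in added here only so the sketch runs without a Redis server (it returns all keys in a single SCAN page, unlike a real server).

```python
import json

SCAN_BATCH_SIZE = 100  # constant name taken from the review discussion

def list_all_active_nodes(client, pattern="node:*"):
    """Incrementally SCAN for matching keys, fetching values in bounded MGET batches."""
    nodes = []
    cursor = 0
    while True:
        cursor, keys = client.scan(cursor=cursor, match=pattern, count=SCAN_BATCH_SIZE)
        # SCAN may return more than `count` keys per page, so chunk before each MGET.
        for i in range(0, len(keys), SCAN_BATCH_SIZE):
            for raw in client.mget(keys[i:i + SCAN_BATCH_SIZE]):
                if raw is not None:
                    nodes.append(json.loads(raw))
        if cursor == 0:  # the server signals completion with cursor 0
            break
    return nodes

class _FakeRedis:
    """Tiny in-memory stand-in so the sketch runs without a Redis server."""
    def __init__(self, data):
        self._data = data
    def scan(self, cursor=0, match=None, count=100):
        # Returns everything in one page; a real server paginates.
        return 0, (list(self._data) if cursor == 0 else [])
    def mget(self, keys):
        return [self._data.get(k) for k in keys]

client = _FakeRedis({f"node:{i}": json.dumps({"id": i}) for i in range(250)})
nodes = list_all_active_nodes(client)
print(len(nodes))  # 250
```

Peak memory now scales with the batch size rather than the total key count, and no single MGET holds the server for the whole key space.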

PR created automatically by Jules for task 8901727981925052959 started by Jeshua Ben Joseph (@Theaxiom)

Co-authored-by: Theaxiom <57013+Theaxiom@users.noreply.github.com>
@google-labs-jules
Contributor

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

Copilot AI review requested due to automatic review settings April 30, 2026 01:01
Contributor

Copilot AI left a comment


Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Optimizes Redis node discovery by switching from loading all keys + one large MGET to incremental SCAN + per-page fetching, and adds a local benchmark to compare approaches.

Changes:

  • Updated RedisNodeDiscoveryRepository.list_all_active_nodes to iterate with scan(count=100) and mget per scan page.
  • Added a fakeredis benchmark module comparing the original vs. optimized retrieval strategies.

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 5 comments.

File Description
coordinator/infrastructure/persistence/inference_repositories.py Implements incremental SCAN loop and per-page MGET for active nodes.
coordinator/tests/benchmark/test_redis_perf.py Adds a benchmark script to measure performance differences with fakeredis.


Comment on lines +41 to +44
m = MockRedis(10000)

print(f"Original: {m.original():.4f}s")
print(f"Optimized: {m.optimized():.4f}s")
Comment on lines +37 to +40

    cursor = "0"
    nodes = []
    for raw_data in raw_nodes:
        if raw_data:
            nodes.append(Node.from_dict(json.loads(raw_data)))
    while True:
        cursor, keys = self.redis.scan(cursor=cursor, match="node:*", count=100)
Comment on lines +46 to +47
    if cursor == 0 or cursor == "0" or cursor == b"0":
        break
Contributor


Addressed! The cursor is now correctly initialized to 0 instead of "0", and the loop terminates when cursor == 0.
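In redis-py the SCAN cursor is returned as an integer, so a single integer comparison is enough to terminate the loop. A minimal illustration, not the repository's exact code (_PagedFake is a hypothetical in-memory client added so the snippet runs standalone):

```python
def drain_scan(client, pattern="node:*", count=100):
    """Collect all keys matching `pattern`, relying on the integer-cursor contract."""
    cursor, keys = 0, []
    while True:
        cursor, page = client.scan(cursor=cursor, match=pattern, count=count)
        keys.extend(page)
        if cursor == 0:  # a zero cursor means the scan is complete
            return keys

class _PagedFake:
    """In-memory stand-in that paginates like a real server."""
    def __init__(self, keys, page_size=3):
        self._keys, self._page = keys, page_size
    def scan(self, cursor=0, match=None, count=100):
        page = self._keys[cursor:cursor + self._page]
        nxt = cursor + self._page
        return (0 if nxt >= len(self._keys) else nxt), page

found = drain_scan(_PagedFake([f"node:{i}" for i in range(10)]))
print(len(found))  # 10
```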

Comment on lines +40 to +42
    cursor, keys = self.redis.scan(cursor=cursor, match="node:*", count=100)
    if keys:
        raw_nodes = self.redis.mget(keys)
Contributor


Good catch! I've explicitly chunked the keys returned by the scan into batches of at most SCAN_BATCH_SIZE, guaranteeing that each mget call stays bounded, since scan can indeed return more keys than the specified count.
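The chunking described here needs only a small helper. A sketch, with `chunked` being a hypothetical name rather than necessarily what the PR uses:

```python
def chunked(seq, size):
    """Yield successive slices of `seq`, each no longer than `size` items."""
    for i in range(0, len(seq), size):
        yield seq[i:i + size]

# Seven keys with a batch size of 3 produce batches of 3, 3, and 1.
batches = list(chunked([f"node:{i}" for i in range(7)], 3))
print([len(b) for b in batches])  # [3, 3, 1]
```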

    if raw_data:
        nodes.append(Node.from_dict(json.loads(raw_data)))
while True:
    cursor, keys = self.redis.scan(cursor=cursor, match="node:*", count=100)
Contributor


Addressed! I've extracted the 100 into a SCAN_BATCH_SIZE constant inside the method for better clarity and discoverability.

@Theaxiom
Member Author

Copilot apply changes based on the comments in this thread

Co-authored-by: Theaxiom <57013+Theaxiom@users.noreply.github.com>
Contributor

Copilot AI commented Apr 30, 2026

Copilot apply changes based on the comments in this thread

All review feedback has been addressed across commits 483ef77 and faa2439:

  • SCAN_BATCH_SIZE constant – extracted to a named module-level constant for safer tuning.
  • Cursor handling – initialized as int(0) with a single if cursor == 0: break termination condition throughout.
  • Bounded mget sizes – added explicit key chunking before every mget call so batch size is always ≤ SCAN_BATCH_SIZE, regardless of how many keys scan returns in one page.
  • Benchmark __main__ guard – wrapped the top-level execution in if __name__ == "__main__": so pytest no longer collects or runs the benchmark during CI; also fixed the benchmark's optimized() method to use integer cursor with a single termination check.
