
Rippleitinnz-bloom-filtering-patch1 [DO NOT MERGE]#405

Open
rippleitinnz wants to merge 16 commits into EvernodeXRPL:main from rippleitinnz:rippleitinnz-bloom-patch1


rippleitinnz commented Jun 20, 2025

Moving deduplication to bloom filtering.

Key changes:

Created a bloom_filter class that uses 32MB of memory
Added a type alias, using rollover_hashset = bloom_filter;, for compatibility
Created global instances for recent_peermsg_hashes and recent_selfmsg_hashes
All files now include bloom_filter.hpp instead of rollover_hashset.hpp
Removed local declarations of the hash sets since they're now global

This maintains full compatibility with the existing code while switching to the bloom filter implementation.
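
For readers unfamiliar with the approach, here is a minimal sketch of a bloom filter exposing a try_emplace-style call. It is not the code in bloom_filter.hpp: the class name bloom_filter_sketch, the choice of 4 probes, the double-hashing scheme, and the assumption that try_emplace returns true for a hash not seen before are all illustrative.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string_view>
#include <vector>

// Hypothetical sketch, not the PR's bloom_filter class.
class bloom_filter_sketch
{
public:
    // num_bits: size of the bit array (must be > 0); num_hashes: probes per key (assumed 4).
    explicit bloom_filter_sketch(std::size_t num_bits, unsigned num_hashes = 4)
        : words_((num_bits + 63) / 64, 0), num_bits_(num_bits), num_hashes_(num_hashes) {}

    // Returns true if the key was not present and has now been recorded.
    // Returns false if every probed bit was already set, which means the key
    // was seen before or this is a false positive.
    bool try_emplace(std::string_view key)
    {
        const std::uint64_t h1 = std::hash<std::string_view>{}(key);
        const std::uint64_t h2 = mix(h1) | 1; // odd step for double hashing
        bool already_present = true;
        for (unsigned i = 0; i < num_hashes_; ++i)
        {
            const std::uint64_t bit = (h1 + i * h2) % num_bits_;
            const std::uint64_t mask = std::uint64_t{1} << (bit % 64);
            if (!(words_[bit / 64] & mask))
            {
                already_present = false;
                words_[bit / 64] |= mask;
            }
        }
        return !already_present;
    }

private:
    // splitmix64 finalizer, used here only to derive a second hash from the first.
    static std::uint64_t mix(std::uint64_t x)
    {
        x ^= x >> 30; x *= 0xbf58476d1ce4e5b9ULL;
        x ^= x >> 27; x *= 0x94d049bb133111ebULL;
        return x ^ (x >> 31);
    }

    std::vector<std::uint64_t> words_;
    std::size_t num_bits_;
    unsigned num_hashes_;
};
```

A 32MB filter would be constructed as bloom_filter_sketch filter(32ull * 1024 * 1024 * 8). This sketch is single-threaded and never clears itself, so it fills up over time.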

Increase MAX_QUEUE_SIZE size to handle larger UNL
increase MAX_NPL_MSG_QUEUE_SIZE and MAX_CONTROL_MSG_QUEUE_SIZE
Updating deduplication to bloom filtering
moving deduplication to bloom filters
Moving deduplication to bloom filters
moving deduplication to bloom filters
moving deduplication to bloom filters
Moving deduplication to bloom filters
moving deduplication to bloom filters
moving deduplication to bloom filters
moving deduplication to bloom filters
@rippleitinnz rippleitinnz requested a review from RichardAH June 20, 2025 21:43
util::rollover_hashset recent_peermsg_hashes(200);

/**
/**
Collaborator


Where's the actual use of the bloom filter to deduplicate? You probably need to run two at the same time, so that you can periodically clear one or the other and prevent them from filling up and becoming useless.
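
To put a rough number on "filling up": the standard estimate for a bloom filter's false-positive rate with m bits, k hash probes and n inserted items is (1 - e^(-kn/m))^k. The helper below is a sketch of that formula only. The function name is hypothetical, and k = 4 in the usage note is an assumption, since the PR text does not state how many probes the filter uses.

```cpp
#include <cmath>

// Approximate false-positive rate of a bloom filter with `bits` bits,
// `hashes` probes per key, after `items` distinct insertions.
// Hypothetical helper for illustration; not part of the PR.
double bloom_false_positive_rate(double bits, double hashes, double items)
{
    return std::pow(1.0 - std::exp(-hashes * items / bits), hashes);
}
```

For a 32MB filter (268,435,456 bits) with an assumed k = 4, the estimate stays far below one in a million after a million insertions, but approaches 1 once the number of insertions reaches the order of a billion. At that point nearly every new message would be reported as a duplicate, which is why a filter that is never cleared eventually stops being useful.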

Contributor Author


The bloom filter implementation maintains the exact same interface (try_emplace) so no other code changes are needed. The global recent_peermsg_hashes is now defined in bloom_filter.hpp and works exactly the same way.

Will look at running two bloom filters at the same time and will amend.

Added in rolling bloom filter.
Two 16MB filters (32MB total) that rotate every 5 minutes
When checking, we look in both filters. When inserting, we add to both.
Every 5 minutes, the currently active filter is cleared and starts building again, and the other filter becomes the active one
Since we insert into both filters, messages are retained for 5-10 minutes
Uses atomic operations to ensure only one thread performs rotation

The behaviour is:

Minutes 0-5: Filter 1 is active, Filter 2 is building
Minute 5: Clear Filter 1, switch to Filter 2 as active
Minutes 5-10: Filter 2 is active, Filter 1 is building
Minute 10: Clear Filter 2, switch to Filter 1 as active
and so on...
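
The rotation described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the PR: the class name rolling_bloom_sketch, the 4 probes, the hashing scheme, and passing the current time in milliseconds as a parameter (to keep the example deterministic) are all hypothetical. Duplicates are not re-inserted here, so a message is forgotten 5 to 10 minutes after its first sighting.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string_view>
#include <vector>

// Hypothetical sketch of a two-filter rolling bloom filter.
class rolling_bloom_sketch
{
    using words = std::vector<std::atomic<std::uint64_t>>;

public:
    rolling_bloom_sketch(std::size_t bits_per_filter, std::int64_t interval_ms, unsigned num_hashes = 4)
        : filters_{words((bits_per_filter + 63) / 64), words((bits_per_filter + 63) / 64)},
          num_bits_(bits_per_filter), num_hashes_(num_hashes),
          interval_ms_(interval_ms), next_rotation_ms_(interval_ms) {}

    // Returns true if the key is new (and records it in both filters),
    // false if either filter already reports it.
    bool try_emplace(std::string_view key, std::int64_t now_ms)
    {
        rotate_if_due(now_ms);
        const std::uint64_t h1 = std::hash<std::string_view>{}(key);
        const std::uint64_t h2 = mix(h1) | 1; // odd step for double hashing
        if (contains(filters_[0], h1, h2) || contains(filters_[1], h1, h2))
            return false;
        insert(filters_[0], h1, h2);
        insert(filters_[1], h1, h2);
        return true;
    }

private:
    void rotate_if_due(std::int64_t now_ms)
    {
        std::int64_t due = next_rotation_ms_.load();
        if (now_ms < due)
            return;
        // Only the thread whose compare-exchange succeeds performs the rotation.
        if (!next_rotation_ms_.compare_exchange_strong(due, now_ms + interval_ms_))
            return;
        const int old_active = active_.load();
        for (auto &w : filters_[old_active])
            w.store(0, std::memory_order_relaxed); // clear the old active filter
        active_.store(1 - old_active);             // the other filter becomes active
    }

    bool contains(const words &f, std::uint64_t h1, std::uint64_t h2) const
    {
        for (unsigned i = 0; i < num_hashes_; ++i)
        {
            const std::uint64_t bit = (h1 + i * h2) % num_bits_;
            if (!(f[bit / 64].load(std::memory_order_relaxed) & (std::uint64_t{1} << (bit % 64))))
                return false;
        }
        return true;
    }

    void insert(words &f, std::uint64_t h1, std::uint64_t h2)
    {
        for (unsigned i = 0; i < num_hashes_; ++i)
        {
            const std::uint64_t bit = (h1 + i * h2) % num_bits_;
            f[bit / 64].fetch_or(std::uint64_t{1} << (bit % 64), std::memory_order_relaxed);
        }
    }

    // splitmix64 finalizer, used only to derive a second hash from the first.
    static std::uint64_t mix(std::uint64_t x)
    {
        x ^= x >> 30; x *= 0xbf58476d1ce4e5b9ULL;
        x ^= x >> 27; x *= 0x94d049bb133111ebULL;
        return x ^ (x >> 31);
    }

    std::array<words, 2> filters_;
    std::size_t num_bits_;
    unsigned num_hashes_;
    std::int64_t interval_ms_;
    std::atomic<std::int64_t> next_rotation_ms_;
    std::atomic<int> active_{0};
};
```

Two caveats about this sketch: the check-then-insert in try_emplace is not atomic as a whole, so two threads racing on the same new key could both get true; and after an idle gap longer than two intervals only one filter is cleared per call, so stale entries can linger for one more interval.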