Enhancement: bulk prefetch IOCs before processing to eliminate per-IOC SELECT queries. Closes #1241 #1297
Open
rahulgunwanistudy-2005 wants to merge 2 commits into GreedyBear-Project:develop from
Description
The extraction pipeline was calling `ioc_repo.get_ioc_by_name()` once per IOC inside `add_ioc()`, so a batch of U unique IPs meant U SELECT queries. On top of that, `_add_fks()` in `BaseExtractionStrategy` was re-fetching IOCs that `add_ioc()` had just processed. Tanner had a manual local `ioc_cache` in `_classify_attacks()` as a partial workaround for the same problem.

This PR fixes all three by introducing a bulk prefetch mechanism in `IocProcessor`:

- Adds `_ioc_cache` and `prefetch_iocs()` to `IocProcessor`.
- Each `_get_scanners()` now bulk-loads all IOCs for the batch in one query before the loop, reducing U SELECTs to 1.
- `add_ioc()` checks the cache first, falls back to the repo only on a miss, and updates the cache after every save.
- `_add_fks()` now builds an in-memory map from `self.ioc_records` instead of making extra DB calls for IOCs already processed.
- The local `ioc_cache` in `_classify_attacks()` is removed and replaced by the processor-level cache.

Related issues
Closes #1241
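For reviewers, the prefetch-then-cache flow from the description can be sketched roughly as below. This is a minimal illustration, not the code in this PR: only `IocProcessor`, `_ioc_cache`, `prefetch_iocs()`, `add_ioc()` and `get_ioc_by_name()` are named in the description. The `Ioc` dataclass, `FakeIocRepo`, its `get_iocs_by_names()` bulk lookup, `save()`, and the `attack_count` merge are hypothetical stand-ins so the example runs without a database.

```python
from dataclasses import dataclass


@dataclass
class Ioc:
    # Hypothetical, simplified record; the real model has many more fields.
    name: str
    attack_count: int = 1


class FakeIocRepo:
    """Hypothetical in-memory repository that counts simulated SELECTs."""

    def __init__(self, stored=()):
        self._rows = {ioc.name: ioc for ioc in stored}
        self.select_count = 0

    def get_ioc_by_name(self, name):
        self.select_count += 1  # one simulated SELECT per call
        return self._rows.get(name)

    def get_iocs_by_names(self, names):
        # Assumed bulk lookup: one simulated SELECT ... WHERE name IN (...)
        self.select_count += 1
        return [self._rows[n] for n in set(names) if n in self._rows]

    def save(self, ioc):
        self._rows[ioc.name] = ioc
        return ioc


class IocProcessor:
    def __init__(self, ioc_repo):
        self.ioc_repo = ioc_repo
        self._ioc_cache = {}  # name -> Ioc

    def prefetch_iocs(self, names):
        # Load every known IOC of the batch with a single query.
        for ioc in self.ioc_repo.get_iocs_by_names(names):
            self._ioc_cache[ioc.name] = ioc

    def add_ioc(self, ioc):
        # Cache first; fall back to the repo only on a miss.
        existing = self._ioc_cache.get(ioc.name)
        if existing is None:
            existing = self.ioc_repo.get_ioc_by_name(ioc.name)
        if existing is not None:
            existing.attack_count += ioc.attack_count  # illustrative merge
            ioc = existing
        saved = self.ioc_repo.save(ioc)
        self._ioc_cache[saved.name] = saved  # keep the cache current after save
        return saved


# Usage: three sightings of two already-stored IPs cost one SELECT in total.
repo = FakeIocRepo([Ioc("1.1.1.1"), Ioc("2.2.2.2")])
processor = IocProcessor(repo)
batch = [Ioc("1.1.1.1"), Ioc("2.2.2.2"), Ioc("1.1.1.1")]
processor.prefetch_iocs({ioc.name for ioc in batch})
results = [processor.add_ioc(ioc) for ioc in batch]
```

In this sketch a name that is absent from the prefetch result still triggers one fallback lookup the first time it is seen; after the save it is served from the cache.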
Type of change
Checklist
Formalities
Docs and tests
GUI changes
No GUI changes in this PR.