
Optimization: Bulk Enrichment of IOCs #1241

@rootp1

Hi @regulartim,
I had a thought about a safe optimization we could queue up. Right now the IOC_from_hits path is functionally correct, but it still issues queries per unique IP.
So even when we process a huge batch efficiently in memory, DB calls still grow as 3 + 4U + 1, where U is the number of new src_ip values. That feels expensive. We can keep all current behavior exactly the same and just move the enrichment internals to batch operations, so we stop repeating the same lookups for every IOC.

If we do that, query growth drops to about 6 + U (plus a small, bounded M2M overhead), which is a big improvement.

For example, at 1,000 unique IPs we go from roughly 4,004 queries to ~1,006, about a 75% reduction with no intended change in output.
The practical win is better throughput headroom, more stable worker latency, and less DB pressure during spikes.
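
To make the arithmetic easy to check, here is a minimal sketch of the two growth formulas quoted above. The function names are made up for illustration, and the small bounded M2M cost is left out of the batched estimate.

```python
def per_ip_queries(unique_ips: int) -> int:
    """Current path as described above: 3 + 4U + 1 queries."""
    return 3 + 4 * unique_ips + 1


def batched_queries(unique_ips: int) -> int:
    """Proposed path: about 6 + U queries (bounded M2M cost not included)."""
    return 6 + unique_ips


def reduction(unique_ips: int) -> float:
    """Fraction of queries saved by switching to the batched path."""
    before = per_ip_queries(unique_ips)
    after = batched_queries(unique_ips)
    return (before - after) / before


if __name__ == "__main__":
    for u in (10, 100, 1000):
        print(u, per_ip_queries(u), batched_queries(u), f"{reduction(u):.1%}")
```

For U = 1,000 this gives 4,004 and 1,006 queries, matching the figures above. The saving tends toward 75% as U grows, since (3U - 2) / (4U + 4) approaches 3/4.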

We can do it by writing one optimized enrichment path.
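
As a rough illustration of the general idea (not the project's actual code), the sketch below contrasts one lookup per IP with a single batched `IN (...)` lookup, using the standard-library `sqlite3` module. The `ioc` table, its `name` and `attack_count` columns, and the `CountingDB` helper are all hypothetical, chosen only to make the query counts visible.

```python
import sqlite3


class CountingDB:
    """Hypothetical helper: in-memory SQLite DB that counts lookups sent through query()."""

    def __init__(self, known_ips):
        self.conn = sqlite3.connect(":memory:")
        self.count = 0
        # Illustrative schema only; not the real IOC model.
        self.conn.execute(
            "CREATE TABLE ioc (name TEXT PRIMARY KEY, attack_count INTEGER)"
        )
        self.conn.executemany(
            "INSERT INTO ioc VALUES (?, ?)", [(ip, 1) for ip in known_ips]
        )

    def query(self, sql, params=()):
        self.count += 1
        return self.conn.execute(sql, params).fetchall()


def lookup_per_ip(db, ips):
    """One SELECT per IP: the query count grows linearly with len(ips)."""
    found = {}
    for ip in ips:
        rows = db.query(
            "SELECT name, attack_count FROM ioc WHERE name = ?", (ip,)
        )
        if rows:
            found[ip] = rows[0][1]
    return found


def lookup_batched(db, ips, chunk_size=500):
    """One SELECT ... IN (...) per chunk of IPs instead of one per IP."""
    found = {}
    ips = list(ips)
    for start in range(0, len(ips), chunk_size):
        chunk = ips[start:start + chunk_size]
        placeholders = ",".join("?" * len(chunk))
        rows = db.query(
            f"SELECT name, attack_count FROM ioc WHERE name IN ({placeholders})",
            chunk,
        )
        found.update(rows)  # each row is a (name, attack_count) pair
    return found
```

Both functions return the same mapping; only the number of round trips differs. In a Django codebase the analogous move would presumably be a `name__in` filter plus bulk create/update calls, but that mapping is an assumption here.
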
Please share your thoughts on it.

Labels: enhancement, python