Hi @regulartim,
I had a thought on a safe optimization we can queue up. Right now the IOC_from_hits path is functionally correct, but it still queries the database once per unique IP.
So even when we process a huge batch efficiently in memory, DB calls still grow as 3 + 4U + 1 (U = number of new unique src_ip values). That feels expensive. We can keep all current behavior exactly the same and just shift the enrichment internals to batch operations, so we stop repeating the same lookups per IOC.
If we do that, query growth drops to about 6 + U (plus a small, bounded number of M2M queries), which is a big improvement.
For example, at 1,000 unique IPs we go from roughly 4,004 queries to ~1,006, so about a 75% reduction with no intended change in output.
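To make the arithmetic easy to check, here is a quick sketch that just evaluates the two formulas above. The function names are mine and only for illustration, and the small bounded M2M term is left out.

```python
def queries_per_ip(u: int) -> int:
    """Current growth: 3 + 4U + 1, where U is the number of new unique src_ip values."""
    return 3 + 4 * u + 1


def queries_batched(u: int) -> int:
    """Estimated growth after batching: about 6 + U (bounded M2M term ignored)."""
    return 6 + u


u = 1000
before = queries_per_ip(u)
after = queries_batched(u)
reduction = (before - after) / before

print(before, after, f"{reduction:.1%}")  # 4004 1006 74.9%
```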
The practical win is better throughput headroom, more stable worker latency, and less DB pressure during spikes.
We can do it by writing one optimized enrichment path.
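As a rough illustration of the idea (not the actual code), here is a toy sketch of per-IP lookups versus one batched lookup. `CountingStore`, `enrich_per_ip` and `enrich_batched` are hypothetical names I made up; the store is an in-memory stand-in that only counts round trips. In a real ORM, `get_many` would typically be a single IN-style query.

```python
class CountingStore:
    """Hypothetical stand-in for the database; counts round trips."""

    def __init__(self, rows):
        self.rows = rows
        self.queries = 0

    def get(self, ip):
        # One round trip per IP (the current pattern).
        self.queries += 1
        return self.rows.get(ip)

    def get_many(self, ips):
        # One round trip for the whole batch.
        self.queries += 1
        return {ip: self.rows[ip] for ip in ips if ip in self.rows}


def enrich_per_ip(store, ips):
    """Look up each unique IP individually."""
    return {ip: store.get(ip) for ip in ips}


def enrich_batched(store, ips):
    """Fetch all IPs at once, then fill in the result in memory."""
    found = store.get_many(ips)
    return {ip: found.get(ip) for ip in ips}


rows = {"10.0.0.1": "known", "10.0.0.2": "known"}
ips = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]

slow, fast = CountingStore(rows), CountingStore(rows)
same_output = enrich_per_ip(slow, ips) == enrich_batched(fast, ips)
print(same_output, slow.queries, fast.queries)
```

The point of the sketch is only that the output stays identical while the number of round trips stops scaling with the number of IPs for that lookup.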
Please share your thoughts on this.