
Conversation

brandon-pereira (Contributor) commented Sep 17, 2025

When the search row limit is set very high (e.g. the max of 100k), the app quickly consumes all available memory and crashes.

This adds some improvements to help mitigate the problem:

  1. QueryKey issues - The queryKey was generating a ton of extra entries every time processedRows changed (which is every 5s in live mode), and the queryKey and result are cached regardless of whether enabled is true or false. The default hashFn strategy is to stringify the key objects, which creates a very large string held in memory. I tried to fix this by providing a custom queryKeyHashFn to useQuery, but it was too slow, and the faster browser-based hashing functions return a promise, which useQuery doesn't support at this time. The easiest solution I found was to short-circuit the hash generation when we are not denoising (see the sketch after this list).
  2. Sync gcTime - We already set gcTime in useOffsetPaginatedQuery, so I added that field here too. This helps keep memory usage lower while denoising rows (though memory usage is still much higher).
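
A minimal, condensed sketch of both changes (the queryKey label and fetchResults are placeholders, not the exact code from this PR; denoiseResults, processedRows, and isLive come from the surrounding component scope; the conditional spread and gcTime values mirror the diff further down):

```ts
import { useQuery } from '@tanstack/react-query';
import ms from 'ms';

const results = useQuery({
  queryKey: [
    'search-results', // placeholder label
    denoiseResults,
    // Short-circuit: only feed the (huge) processedRows into the key when
    // denoising actually needs them, so the default stringify-based hashFn
    // never has to serialize them otherwise.
    ...(denoiseResults ? [processedRows] : []),
  ],
  queryFn: fetchResults, // placeholder
  // Match useOffsetPaginatedQuery: gc cached data aggressively in live
  // mode, since live tail can accumulate a lot of rows.
  gcTime: isLive ? ms('30s') : ms('5m'),
});
```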

The app still uses a lot of memory, simply from the sheer number of rows being captured and processed, but it no longer crashes. There are definitely further optimizations we could make to reduce this. One solution that comes to mind is storing a hash/unique id of each row server-side before sending it to the client; the client could then key off that id instead of a stringified object (a rough sketch follows).
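
A rough, hypothetical sketch of that server-side idea (the function name withRowIds and the _rowId field are invented for illustration, not part of this PR):

```ts
import { createHash } from 'node:crypto';

// Attach a stable id to each row before it is sent to the client, so the
// client can build cache keys from small ids instead of stringified rows.
function withRowIds<T extends object>(rows: T[]): Array<T & { _rowId: string }> {
  return rows.map(row => ({
    ...row,
    // A truncated sha1 of the serialized row; 16 hex chars is plenty to
    // disambiguate rows for cache-key purposes.
    _rowId: createHash('sha1')
      .update(JSON.stringify(row))
      .digest('hex')
      .slice(0, 16),
  }));
}
```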

Before (after 1 min): [screenshot]

After (after 5 mins): [screenshot]

Fixes: HDX-2409

changeset-bot (bot) commented Sep 17, 2025

🦋 Changeset detected

Latest commit: 10560c9

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 2 packages:

| Name         | Type  |
| ------------ | ----- |
| @hyperdx/app | Patch |
| @hyperdx/api | Patch |


vercel (bot) commented Sep 17, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project            | Deployment | Preview | Comments | Updated (UTC)        |
| ------------------ | ---------- | ------- | -------- | -------------------- |
| hyperdx-v2-oss-app | Ready      | Preview | Comment  | Sep 17, 2025 10:12pm |

github-actions (bot) commented Sep 17, 2025

Stably Runner - Test Suite - 'Smoke Test'

Test Suite Run Result: 🔴 Failure (1/4 tests failed) [dashboard]

Failed Tests:


This comment was generated from stably-runner-action

```ts
  }
  return undefined;
},
gcTime: isLive ? ms('30s') : ms('5m'), // more aggressive gc for live data, since it can end up holding lots of data
```
wrn14897 (Member) commented Sep 17, 2025

oh this is neat. theoretically we don't need to keep the old page if live tail is enabled

Comment on lines +1271 to +1275

```ts
denoiseResults,
// Only include processed rows if denoising is enabled
// This helps prevent the queryKey from getting extremely large
// and causing memory issues, when it's not used.
...(denoiseResults ? [processedRows] : []),
```
Member commented:

I'm scratching my head as to why this is relevant. Even if denoise is disabled, react-query still tries to cache the key that blows up the memory?

brandon-pereira (Contributor, author) replied:

Yeah, exactly!

You'd think the enabled flag being false would turn all this off. When I get back from vacation I can check whether a bug already exists in the TanStack Query repo and file one if not, if we want to provide that feedback upstream.
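
For context, a minimal repro sketch of that behavior (variable names are placeholders): React Query still hashes the queryKey to identify the query even when enabled is false, so the stringified rows end up in memory regardless.

```ts
import { useQuery } from '@tanstack/react-query';

// `hugeProcessedRows` is stringified by the default hashFn on every
// render to identify this query, even though `enabled: false` means
// fetchDenoisedRows (a placeholder) is never actually called.
const q = useQuery({
  queryKey: ['denoise', hugeProcessedRows],
  queryFn: fetchDenoisedRows,
  enabled: false,
});
```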

Comment on lines +1271 to +1275 (same snippet as above)
Member commented:

perf nit: instead of passing all the rows, one idea is to generate an ID for processedRows. For example, we could take a fixed step to sample the rows and then compute a hash from those samples (see the sketch below).
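
For illustration, a minimal sketch of that sampling idea (sampledRowsKey, the Row type, the step size, and the FNV-1a hash are assumptions, not code from this PR; the hash is synchronous, which matters because async hashes like crypto.subtle.digest can't serve as a queryKeyHashFn):

```ts
type Row = Record<string, unknown>; // illustrative row shape

// Sample every `step`-th row and fold the samples into a small 32-bit
// FNV-1a hash, so the queryKey carries a short string instead of the
// entire processedRows array.
function sampledRowsKey(rows: Row[], step = 100): string {
  let hash = 0x811c9dc5; // FNV-1a offset basis
  for (let i = 0; i < rows.length; i += step) {
    const s = JSON.stringify(rows[i]);
    for (let j = 0; j < s.length; j++) {
      hash ^= s.charCodeAt(j);
      hash = Math.imul(hash, 0x01000193); // FNV-1a prime
    }
  }
  // Fold in the total row count so rows added between samples still
  // change the key.
  return `${rows.length}:${(hash >>> 0).toString(16)}`;
}

// Usage in the queryKey (sketch):
//   ...(denoiseResults ? [sampledRowsKey(processedRows)] : []),
```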

brandon-pereira (Contributor, author) replied Sep 17, 2025:

Ah, good idea. I spent a large portion of my time messing around with their queryKeyHashFn and couldn't get a clean, fast solution on the full processedRows dataset, but if we sample the results and hash just those, I can definitely improve the perf on that front. This should reduce memory when denoising is enabled.

Let me take a look at this next week when I get back from vacation!

Contributor commented:

I don't think we need to go too deep optimizing this path. My preference is that if scaling issues continue to come up for this feature, we re-evaluate pushing denoising down into the ClickHouse query itself, as opposed to making incremental improvements to the current implementation.
