[PERFORMANCES] Pre-allocate arrays and tensors to use less RAM #36
Hello,
First, thanks for providing the code associated with your paper, which I read after seeing your work at PAISS. It has proved valuable in my work estimating image generation quality on specific domains that are not well described by ImageNet-based scores.
This MR intends to fix memory issues that arise when the dataset is too large:

1. The `packet_tensor` buffer in `compute_packet_statistics` eats up all the RAM. For example, for 10,000 RGB 256x256 images it uses 30 GiB. Allocating the tensor directly up front avoids over-using memory in the call to `th.stack`. For even bigger datasets the problem will persist; a solution could be to split the computation by averaging `mu` and `sigma` across multiple packets, but that would yield a biased estimate of the true quantities...
2. `sigma = th.stack([gpu_cov(packet_tensor[p, :, :].to(device)) for p in range(P)], dim=0).numpy()` creates a list of P tensors before stacking them. It is faster to pre-allocate the corresponding `np.array` and fill it in. This saves only a tiny bit of memory, as the overhead still comes from `packet_tensor`. On a dataset of 20k (3, 256, 256) images, it saves up to 30% of peak RAM.
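To illustrate point 2, here is a minimal numpy sketch of the pre-allocation pattern (the function name `cov_stack_prealloc` and the shapes are illustrative, not the repo's actual code):

```python
import numpy as np

def cov_stack_prealloc(packet_tensor):
    """Per-packet covariance matrices without first building a
    Python list of P results and stacking them.

    packet_tensor: array of shape (P, N, D) -- P packets,
    N samples per packet, D features (illustrative shapes).
    """
    P, N, D = packet_tensor.shape
    # Pre-allocate the output once; each iteration writes into it
    # directly, so peak memory is the output plus one packet's result.
    sigma = np.empty((P, D, D), dtype=packet_tensor.dtype)
    for p in range(P):
        # rowvar=False: rows are observations, columns are variables.
        sigma[p] = np.cov(packet_tensor[p], rowvar=False)
    return sigma
```

Compared with `np.stack([np.cov(x[p], rowvar=False) for p in range(P)])`, this avoids holding P intermediate arrays plus their stacked copy alive at the same time.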
The MR also removes the exception raised when the imaginary component in the Fréchet distance calculation is too big. I'm not so sure about this one...
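For context, the imaginary component comes from `sqrtm` on a near-singular covariance product. A common middle ground (sketched below with an illustrative `frechet_distance` helper, not the repo's code) is to discard a numerically negligible imaginary part but still raise when it is genuinely large:

```python
import numpy as np
from scipy import linalg

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Fréchet distance between two Gaussians (mu, sigma).

    sqrtm can return a result with a small imaginary component for
    near-singular inputs; tolerate it below a threshold instead of
    always raising.
    """
    diff = mu1 - mu2
    covmean, _ = linalg.sqrtm(sigma1 @ sigma2, disp=False)
    if np.iscomplexobj(covmean):
        if not np.allclose(np.diagonal(covmean).imag, 0, atol=1e-3):
            # Only a large imaginary part signals a real numerical problem.
            raise ValueError("sqrtm returned a large imaginary component")
        covmean = covmean.real
    return (diff @ diff + np.trace(sigma1) + np.trace(sigma2)
            - 2 * np.trace(covmean))
```

Dropping the check entirely would silently hide genuinely ill-conditioned covariance estimates, which may be why the original code raised.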
I'm interested in hearing your thoughts about these issues.