Conversation

@orasagar
Contributor

Raise the default xz compression level from 0 to 3 for improved compression efficiency.

For smaller archives:
xz -0: 533M total --> 26M total in 16.4695 sec
xz -3: 533M total --> 13M total in 24.1473 sec

For larger archives:
xz -0: 15G total --> 695M total in 815.4636 sec
xz -3: 15G total --> 428M total in 1016.3921 sec
xz_benchmark_small_archive.log
xz_benchmark_big_archive.log

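For anyone wanting to reproduce numbers like these, here is a minimal sketch of such a level comparison in Python; the archive directory path and the set of levels are assumptions, not the exact setup used for the figures above.

```python
#!/usr/bin/env python3
# Minimal sketch: compress a scratch copy of pmlogger archive files at a few
# xz levels and report total size and wall-clock time per level.
import glob
import os
import subprocess
import time

ARCHIVE_DIR = "/tmp/pmlogger-bench"  # hypothetical scratch copy of archives

def total_size(paths):
    return sum(os.path.getsize(p) for p in paths)

for level in (0, 2, 3):
    files = [p for p in glob.glob(os.path.join(ARCHIVE_DIR, "*"))
             if os.path.isfile(p) and not p.endswith(".xz")]
    before = total_size(files)
    start = time.monotonic()
    # -k keeps the originals so every level compresses the same input
    subprocess.run(["xz", "-k", "-f", f"-{level}"] + files, check=True)
    elapsed = time.monotonic() - start
    compressed = glob.glob(os.path.join(ARCHIVE_DIR, "*.xz"))
    print(f"xz -{level}: {before} -> {total_size(compressed)} bytes "
          f"in {elapsed:.1f}s")
    for p in compressed:  # clean up before the next level
        os.remove(p)
```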
@christianhorn
Collaborator

christianhorn commented Nov 21, 2025

Raise the default xz compression level from 0 to 3 for improved compression efficiency. [..]

While the commit increases the compression ratio, "efficiency" relates to whichever resources matter most to the user.
Increasing the compression level could also be seen as a waste of CPU cycles and energy for relatively little storage saving, depending on how you value those resources.

Not saying it should not be done, but it should be thought through well.

@natoscott natoscott requested a review from kmcdonell November 23, 2025 22:52
@natoscott
Member

For reference, commit fdfa25c is the original change that introduced "xz -0" as the default. It seems to focus on comparing the -0 option to the -6 option, and -3 is not mentioned as a (possibly ideal?) midpoint.

FWIW, I think we could make this change. A smaller on-disk footprint, at the cost of some increase in the CPU required to inflate (up to a point), can make sense from an archive replay latency POV, in addition to the more obvious space savings.

@christianhorn
Collaborator

Some quick testing, compressing 4.36GB of pmlogger archives:

method  | compressed size (bytes) | ratio  | compress time | uncompress time
xz -0   | 368055576               | 8.47%  | 93s           | 26s
xz -1   | 56707348                | 1.30%  | 40s           | 11s
xz -2   | 51921808                | 1.19%  | 64s           | 8s
xz -3   | 46975632                | 1.10%  | 169s          | 7s
xz -4   | 49721836                | 1.14%  | 357s          | 10s
xz -5   | 44449408                | 1.02%  | 565s          | 9s
xz -6   | 41469712                | 0.95%  | 486s          | 8s
xz -7   | 39125352                | 0.90%  | 682s          | 9s
xz -8   | 37248892                | 0.85%  | 682s          | 8s
xz -9   | 36137088                | 0.83%  | 613s          | 8s
gzip -1 | 1246678262              | 28.57% | 110s          | 27s
gzip -2 | 1265609529              | 29.01% | 60s           | 28s
gzip -4 | 1193171742              | 27.35% | 83s           | 27s
gzip -6 | 1039722752              | 23.83% | 203s          | 27s
gzip -9 | 1018464795              | 23.34% | 1227s         | 23s

I agree we should move off "xz -0"; it did surprisingly badly. I would put the sweet spot at -1 or -2, though, not -3. I was not expecting gzip to do that badly on this test.

Also worth noting: xz moved from single-threaded to multi-threaded compression by default in 2013, so by default the compression puts all cores of the system under load.
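If multi-threading matters for the comparison, one way to separate elapsed time from CPU cost is to record both wall-clock time and the CPU time charged to the child process, and to pin the thread count explicitly with xz's -T option. A rough sketch (the input file name is hypothetical):

```python
import resource
import subprocess
import time

INPUT = "20251121.0"  # hypothetical pmlogger data volume

def timed_run(cmd):
    """Run cmd and return (wall-clock seconds, child CPU seconds)."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.monotonic()
    subprocess.run(cmd, check=True)
    wall = time.monotonic() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
    return wall, cpu

# -T1 forces a single compression thread; -T0 lets xz use all cores.
for threads in ("-T1", "-T0"):
    wall, cpu = timed_run(["xz", "-k", "-f", "-2", threads, INPUT])
    print(f"xz -2 {threads}: wall={wall:.1f}s cpu={cpu:.1f}s")
```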

@natoscott
Member

Thanks @chorn - agreed, -2 looks best for xz. @orasagar can you confirm with your archives, and update the PR if so?

@christianhorn
Collaborator

@myllynen wondered about zstd:

level    | compressed size (bytes) | ratio | compress user time (wc = wall clock) | uncompress time
zstd -1  | 237199142               | 5.44% | 4s (2s wc)                           | 3s
zstd -2  | 155115074               | 3.56% | 4s (2s wc)                           | 2s
zstd -3  | 106141422               | 2.43% | 5s (2s wc)                           | 2s
zstd -5  | 101441517               | 2.33% | 8s (4s wc)                           | 3s
zstd -7  | 93525759                | 2.14% | 16s (12s wc)                         | 2s
zstd -9  | 78489614                | 1.80% | 24s (12s wc)                         | 2s
zstd -11 | 78182625                | 1.79% | 43s (22s wc)                         | 2s
zstd -13 | 77520510                | 1.78% | 132s (66s wc)                        | 2s
zstd -15 | 77315212                | 1.77% | 382s (192s wc)                       | 2s
zstd -17 | 66314359                | 1.52% | 313s (217s wc)                       | 2s
zstd -19 | 62033538                | 1.42% | 751s (378s wc)                       | 2s

zstd used 2 threads for compression, so "compress user time" is the time spent on the CPUs summed up, comparable to my results above where a single thread was enforced. "wc" is wall clock time, i.e. how long the compression actually took; roughly half of the user time due to the 2 threads.

My interpretation: it's amazing what compression ratios we get with zstd on our data while spending just a few compute cycles. But when spending even more compute cycles, the compression ratio does not improve as much as it does for xz: "xz -2" runs for 64s and reaches a 1.19% ratio, while zstd spending about the same time only accomplishes a ~1.78% ratio.

So I think "xz -2" is OK. If a customer does not want to spend many CPU cycles on compression and/or wants to optimize for quicker extraction time, they might want to consider zstd (not sure how hard we make it to switch).
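As a quick sanity check, the ratio columns above can be recomputed from the byte counts and the ~4.36 GB input size; treating that as 4.36 * 10^9 bytes (an assumption about the unit) reproduces the quoted percentages:

```python
ORIGINAL = 4.36e9  # assumed: "4.36GB" taken as decimal gigabytes

samples = {
    "xz -1":   56707348,
    "xz -2":   51921808,
    "zstd -3": 106141422,
    "zstd -9": 78489614,
}
for name, size in samples.items():
    print(f"{name}: {100 * size / ORIGINAL:.2f}%")  # 1.30, 1.19, 2.43, 1.80
```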

@christianhorn
Collaborator

I also tried brotli. It does not affect our conclusion here, I think, but sharing the results:

method    | compressed size (bytes) | ratio  | compress time | uncompress time
brotli -0 | 864404557               | 19.81% | 10s           | 10s
brotli -1 | 704265093               | 16.14% | 13s           | 12s
brotli -2 | 83160772                | 1.91%  | 9s            | 3s
brotli -4 | 69778319                | 1.60%  | 20s           | 2s
brotli -5 | 58698136                | 1.35%  | 23s           | 2s
brotli -6 | 57786988                | 1.32%  | 36s           | 2s
brotli -7 | 57117606                | 1.31%  | 52s           | 2s
brotli -8 | 56712559                | 1.30%  | 95s           | 2s
brotli -9 | 56507445                | 1.30%  | 173s          | 2s

@orasagar
Contributor Author

orasagar commented Dec 1, 2025

@natoscott I do see that -2 looks best in most cases. I have tested it out and it looks good. I will update the code for it.
