Skip to content

rotaki/fastzipf

Repository files navigation

A psuedo-random number generator and zipfian distribution implementation in Rust

Features

Benchmark Notes 1

Generate 100M random numbers and sum them up. The sum is not important, but the time taken is.

SplitMix64:  sum = f8587b57c65abb04, time = 81.601865ms
Xoroshiro128++: sum = a7f5dd1e31fd4db7, time = 103.242012ms
Xoroshiro128**: sum = cac3b95d0cd46896, time = 96.135394ms
Xoshiro256++: sum = a3e6e3015c3a4c69, time = 78.411429ms
(rand crate) thread_rng:   sum = 1a0db0133aa10031, time = 192.123836ms
(rand crate) small_rng:    sum = 87c5fb9d063b7620, time = 74.777916ms // (internally uses Xoshiro256++)

Benchmark Notes 2

Generate Zipfian distribution with different domain sizes, comparing the original Zipf implementation and the FastZipf implementation.

The Zipf implementation builds a cumulative distribution table and uses binary search to find the index. The FastZipf implementation does not build a table, but uses a mathematical formula to find the index, which does not depend on the domain size. Thus, as the domain size increases, the Zipf implementation will take longer to generate the random number, while the FastZipf implementation will take a constant amount of time.

Benchmarking Zipf and FastZipf with theta = 0.5
Iterations: 100000000
Number of items (nr): 10
Zipf:         sum = 2, time = 321.209353ms
FastZipf:     sum = 0, time = 1.259187054s

Benchmarking Zipf and FastZipf with theta = 0.5
Iterations: 100000000
Number of items (nr): 100
Zipf:         sum = 2a, time = 606.840563ms
FastZipf:     sum = 37, time = 1.299370813s

Benchmarking Zipf and FastZipf with theta = 0.5
Iterations: 100000000
Number of items (nr): 10000
Zipf:         sum = 252, time = 1.531311041s
FastZipf:     sum = 982, time = 1.331495367s

Benchmarking Zipf and FastZipf with theta = 0.5
Iterations: 100000000
Number of items (nr): 100000
Zipf:         sum = ad5b, time = 2.584903649s
FastZipf:     sum = 10193, time = 1.338737406s

Benchmarking Zipf and FastZipf with theta = 0.5
Iterations: 100000000
Number of items (nr): 1000000
Zipf:         sum = 6b1c, time = 5.087740815s
FastZipf:     sum = 89355, time = 1.321986307s

Benchmarking Zipf and FastZipf with theta = 0.5
Iterations: 100000000
Number of items (nr): 10000000
Zipf:         sum = cecec1, time = 26.17055999s
FastZipf:     sum = 8bab96, time = 1.326641607s

Checks

Distribution of random numbers generated by the Zipf implementation

Distribution of random numbers generated by the FastZipf implementation

About

fast zipfan distribution generator and psuedo-random generator

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages