Random: faster seed hashing #60206

rfourquet · 2025-11-22T15:19:11Z

When calling `seed!(rng, seed)`, the `seed` is first hashed into pseudorandom bytes, which are then expanded to produce the initialization state for `rng`. The purpose of hashing is to ensure that streams from related seeds, such as `MyRNG(1)` and `MyRNG(2)`, are uncorrelated.

In practice, this call is implemented as `seed!(rng, SeedHasher(seed))`, assuming that `rng` implements `seed!(::AbstractRNG)` for initialization from another RNG.
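To make the contract above concrete, here is a minimal sketch of a user-defined RNG that can be initialized from another RNG. `MyRNG` and its single-word state are hypothetical, and `Xoshiro(1)` only stands in for the internal `SeedHasher(seed)` object, on the assumption that it behaves like any other `AbstractRNG` when sampled.

```julia
using Random

# Hypothetical toy RNG with a single 64-bit word of state.
mutable struct MyRNG <: AbstractRNG
    state::UInt64
end

# Initialization from another RNG: draw as many random words as the state needs.
function Random.seed!(rng::MyRNG, seeder::AbstractRNG)
    rng.state = rand(seeder, UInt64)
    return rng
end

# Stand-in for `seed!(rng, SeedHasher(seed))`: any AbstractRNG can feed the state.
r1 = Random.seed!(MyRNG(0), Xoshiro(1))
r2 = Random.seed!(MyRNG(0), Xoshiro(1))
r3 = Random.seed!(MyRNG(0), Xoshiro(2))

println(r1.state == r2.state)  # same seeder, same state
println(r1.state == r3.state)  # different seeder, different state (with overwhelming probability)
```

With this one method defined, an RNG type would not need its own seed-hashing logic: it only consumes words from whatever seeder it is handed.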

Previously, SeedHasher worked in three stages:

1. encode `seed` into bytes
2. hash these bytes using SHA-2
3. expand the resulting digest with an ad-hoc construction

This approach was functional but relatively slow.
This commit replaces stages 2 and 3 with an algorithm designed specifically for seed generation by M. E. O'Neill, described at: https://www.pcg-random.org/posts/developing-a-seed_seq-alternative.html

The implementation is adapted from O'Neill's `seed_seq_fe` C++ reference (MIT license). NumPy uses the same algorithm for its `SeedSequence`.
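For readers unfamiliar with the algorithm, the following is a rough Julia sketch of its structure as publicly described: seed words are hashed into a small pool, the pool entries are mixed into one another, and output words are produced by a second hash over the pool. It is not the code from this PR. The function names, the pool size of 4, and the 32-bit word size are assumptions, and the constants are reproduced from the public reference as best understood here, so they should be checked against it before any real use.

```julia
# Constants as understood from the public seed_seq_fe reference (verify before use).
const INIT_A     = 0x43b0d7e5
const MULT_A     = 0x931e8875
const INIT_B     = 0x8b51f9dd
const MULT_B     = 0x58f38ded
const MIX_MULT_L = 0xca01f9dd
const MIX_MULT_R = 0x4973f715
const XSHIFT     = 16            # half the width of a UInt32

# Hash one word; the multiplier evolves on every call, so position matters.
function hashmix(value::UInt32, hc::Ref{UInt32})
    value ⊻= hc[]
    hc[] *= MULT_A               # UInt32 arithmetic wraps modulo 2^32
    value *= hc[]
    return value ⊻ (value >> XSHIFT)
end

# Combine two words asymmetrically.
function mix(x::UInt32, y::UInt32)
    r = MIX_MULT_L * x - MIX_MULT_R * y
    return r ⊻ (r >> XSHIFT)
end

# Absorb the seed words into a fixed-size pool.
function mix_entropy(entropy::AbstractVector{UInt32}, pool_size::Int = 4)
    hc = Ref(INIT_A)
    pool = Vector{UInt32}(undef, pool_size)
    for i in 1:pool_size
        pool[i] = hashmix(i <= length(entropy) ? entropy[i] : zero(UInt32), hc)
    end
    # Let every pool entry influence every other entry.
    for src in 1:pool_size, dst in 1:pool_size
        if src != dst
            pool[dst] = mix(pool[dst], hashmix(pool[src], hc))
        end
    end
    # Fold in any seed words beyond the pool size.
    for src in (pool_size + 1):length(entropy), dst in 1:pool_size
        pool[dst] = mix(pool[dst], hashmix(entropy[src], hc))
    end
    return pool
end

# Expand the pool into n output words by cycling over it.
function generate(pool::Vector{UInt32}, n::Int)
    hc = INIT_B
    out = Vector{UInt32}(undef, n)
    for i in 1:n
        v = pool[mod1(i, length(pool))] ⊻ hc
        hc *= MULT_B
        v *= hc
        out[i] = v ⊻ (v >> XSHIFT)
    end
    return out
end

state1 = generate(mix_entropy(UInt32[1]), 8)
state2 = generate(mix_entropy(UInt32[2]), 8)
```

Every step is a handful of 32-bit multiplications, XORs, and shifts, which is consistent with the large gap to a cryptographic hash such as SHA-2 reported in the timings.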

Here are some numbers:

@btime Xoshiro(1)
@btime Xoshiro($(rand(UInt)))
@btime Xoshiro($(rand(UInt, 4)))
@btime Xoshiro($(rand(UInt, 8)))
s = Random.SeedHasher(); @btime rand($s, UInt)

On master:

  398.755 ns (9 allocations: 448 bytes)
  403.610 ns (9 allocations: 448 bytes)
  486.254 ns (9 allocations: 448 bytes)
  998.182 ns (9 allocations: 448 bytes)
  73.213 ns (0 allocations: 63 bytes)

On this PR:

  36.807 ns (3 allocations: 256 bytes)
  54.717 ns (3 allocations: 256 bytes)
  144.917 ns (3 allocations: 256 bytes)
  228.738 ns (3 allocations: 256 bytes)
  2.454 ns (0 allocations: 0 bytes)
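
To put the two sets of timings side by side, the short script below divides each `master` timing by the corresponding PR timing. It assumes the result lines appear in the same order as the five benchmark expressions; the labels are only descriptive shorthand for those expressions.

```julia
# Timings in nanoseconds, copied from the two result listings above.
labels    = ["Xoshiro(1)", "Xoshiro(::UInt)", "Xoshiro(4 x UInt)",
             "Xoshiro(8 x UInt)", "rand(::SeedHasher, UInt)"]
master_ns = [398.755, 403.610, 486.254, 998.182, 73.213]
pr_ns     = [36.807, 54.717, 144.917, 228.738, 2.454]

# Speedup factor of the PR relative to master, rounded to one decimal.
speedup = round.(master_ns ./ pr_ns; digits = 1)

for (label, s) in zip(labels, speedup)
    println(rpad(label, 28), s, "x")
end
```

On these numbers, constructing a `Xoshiro` is roughly 3x to 11x faster depending on the seed size, and drawing a `UInt` from the `SeedHasher` is roughly 30x faster.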


rfourquet added the `performance` (Must go faster) and `randomness` (Random number generation and the Random stdlib) labels on Nov 22, 2025
