Contributor

@studzien studzien commented Jul 16, 2025

Summary

This PR adds key-based partitioning for duplicate Registry entries, allowing optimization for workloads with many keys and few entries per key (e.g., many topics with few subscribers each).

Background

This feature addresses performance concerns first noticed in phoenixframework/phoenix_pubsub#198, where Registry's PID-based partitioning wasn't optimal for workloads with many topics and relatively few subscribers per topic.

Solution

Enhanced Keys API

  • :unique: Traditional unique registry behavior
  • :duplicate: Traditional duplicate registry with PID-based partitioning (default)
  • {:duplicate, :pid}: Explicit PID-based partitioning for duplicate registries
  • {:duplicate, :key}: New key-based partitioning for duplicate registries

Performance Optimization

  • Key-based lookups with {:duplicate, :key} now only need to check a single partition
  • Reduces the number of partitions a key-based operation must visit from all of them, O(partitions), to exactly one, O(1)
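
To illustrate the difference, here is a minimal sketch of how the two strategies decide which partitions a lookup has to visit. The PartitionSketch module and its function names are hypothetical and are not the Registry internals; the sketch only assumes that a term is hashed into a partition index with :erlang.phash2/2.

```elixir
defmodule PartitionSketch do
  # Hypothetical helper, not part of Registry: maps any term to a
  # partition index in 0..partitions-1 using :erlang.phash2/2.
  def partition_for(term, partitions), do: :erlang.phash2(term, partitions)

  # PID-based partitioning: entries for one key are spread across partitions
  # according to the registering process, so a lookup by key has to visit
  # every partition.
  def partitions_to_check(:pid, _key, partitions) do
    Enum.to_list(0..(partitions - 1))
  end

  # Key-based partitioning: every entry for a key lands in the partition
  # derived from the key itself, so a lookup visits exactly one partition.
  def partitions_to_check(:key, key, partitions) do
    [partition_for(key, partitions)]
  end
end

# With 4 partitions, a PID-partitioned lookup touches all 4 of them...
IO.inspect(PartitionSketch.partitions_to_check(:pid, "topic:1", 4))

# ...while a key-partitioned lookup touches a single one.
IO.inspect(length(PartitionSketch.partitions_to_check(:key, "topic:1", 4)))
```

The trade-off is that operations addressed by PID rather than by key can no longer be answered from a single partition under key-based partitioning.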

Changes

lib/elixir/lib/registry.ex:

  • Updated keys typespec to include {:duplicate, :key} and {:duplicate, :pid}
  • Enhanced all internal functions to handle key-based partitioning strategy
  • Updated documentation with usage examples and performance guidance
  • Modified lookup/2, values/3, match/3, count/1, count_match/3, select/2 and other functions to support both partitioning strategies

lib/elixir/test/elixir/registry_test.exs:

  • Added comprehensive tests for all partitioning strategies
  • Extended cleanup and functionality tests to cover key-based partitioning
  • Added specific test cases for {:duplicate, :key} behavior

API Changes

New Keys Format

# Traditional duplicate registry (PID partitioning)
Registry.start_link(keys: :duplicate, name: MyRegistry)

# Explicit PID partitioning  
Registry.start_link(keys: {:duplicate, :pid}, name: MyRegistry)

# New key-based partitioning
Registry.start_link(keys: {:duplicate, :key}, name: MyRegistry)

Usage Guidelines

  • Use :duplicate or {:duplicate, :pid}: When you have few keys with many entries (e.g., one topic with many subscribers)
  • Use {:duplicate, :key}: When you have many keys with few entries each (e.g., many topics with few subscribers)
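
As a rough usage sketch of the many-topics case, the script below starts a key-partitioned registry, subscribes the calling process to two topics, and broadcasts to one of them. The TopicRegistry name, the topic strings, the values, and the partition count are illustrative assumptions, and it needs an Elixir build that includes this PR.

```elixir
# Assumes an Elixir build that includes this PR; older versions do not
# accept {:duplicate, :key}. TopicRegistry is a made-up name.
{:ok, _} =
  Registry.start_link(keys: {:duplicate, :key}, name: TopicRegistry, partitions: 4)

# The calling process subscribes to two topics; the values are arbitrary.
{:ok, _} = Registry.register(TopicRegistry, "room:1", :subscriber_a)
{:ok, _} = Registry.register(TopicRegistry, "room:2", :subscriber_b)

# Look up the subscribers of a single topic by key.
entries = Registry.lookup(TopicRegistry, "room:1")
IO.inspect(entries)

# Broadcast to every subscriber of that topic.
Registry.dispatch(TopicRegistry, "room:1", fn subscribers ->
  for {pid, _value} <- subscribers, do: send(pid, {:broadcast, "hello"})
end)

receive do
  {:broadcast, message} -> IO.puts(message)
after
  1_000 -> IO.puts("no message received")
end
```

The calls are the same as for a plain :duplicate registry; only the keys option passed at startup changes.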

Backward Compatibility

This change is fully backward compatible:

  • Default behavior remains unchanged (:duplicate still uses PID partitioning)
  • Existing code continues to work without modification
  • Only adds new functionality for those who opt into {:duplicate, :key} partitioning
  • Key-based partitioning is only supported for duplicate registries (validated at startup)

🤖 Generated with Claude Code

@studzien
Contributor Author

studzien commented Jul 17, 2025

@josevalim I updated it to use {:duplicate, :key} or {:duplicate, :pid}.
I was not certain whether the partition supervisor strategy should change, but I think it should stay the same (:one_for_one), since we partition both ETS tables by the same term (:key or :pid).

Will update the PR description shortly.

end

{kind, _, _} ->
{kind, _, _, _} ->
Member

This clause needs to be reverted, but we need to be careful with #{kind} below, which will no longer work.

Member

@josevalim josevalim left a comment

I have dropped a few comments.

I also think we need to better parameterize the test:

  1. Break the current RegistryTest into Registry.UniqueTest and Registry.DuplicateTest
  2. Make sure the DuplicateTest is parameterized by partition count AND partition by

The new tests you added can be part of the new DuplicateTest.

@studzien
Contributor Author

Excellent feedback, thank you!
I will have time to continue working on this around Wednesday/Thursday this week.

@josevalim
Member

Ping! Last call if you want this included in Elixir v1.19 :)

@studzien
Contributor Author

studzien commented Oct 2, 2025

Ping! Last call if you want this included in Elixir v1.19 :)

Thanks for the ping and sorry for the delay!
If I have this ready on Monday, can it still make the release cut?

@josevalim
Member

Yes!

@studzien studzien force-pushed the registry/duplicate-keys-partition-by-key branch from b83745c to eefdbcf on October 6, 2025 15:01
Co-authored-by: José Valim <jose.valim@gmail.com>
@studzien
Contributor Author

studzien commented Oct 6, 2025

  1. I split the tests between unique and duplicate test modules. Running the fully parameterized tests on the duplicate test suite uncovered a bug in the Registry.keys/2 function (for {:duplicate, :key} we need to iterate over all partitions). I'm not happy with how the function looks now (it's a bit long), so please let me know if there's a pattern for shortening it that I could apply.
  2. I removed the tests I added earlier that covered successful registry startup for all permutations of kinds and partitions; parameterized tests now cover it
  3. I reproduced the issue with the missed clause by adding a test for the update_value function in the duplicate registry.
  4. I left the cleanup test code in the RegistryTest

@josevalim
Member

Beautiful work. It all looks great to me! It will be merged once tests pass.

@josevalim josevalim merged commit fecb221 into elixir-lang:main Oct 6, 2025
13 checks passed