@redactedontop

Howdy!

I've refactored most of your code (in a day! turns out it's exhausting xD), and here are most of the things I did:

  1. Switched to the Rust 2024 edition
  2. Removed useless impls and slow code from Tag
  3. Optimized add_ref and clear_for_drop majorly
  4. Made a lot of style changes

Once you have some time, I'd ask you to review this code and, for consistency, try to change what I missed to the new code style (as long as you like it). Feel free to propose any changes!

Thanks,
Alex <3

@wvwwvwwv
Owner

Awesome!

I'll review the change possibly by next week. In the meantime, could you please give me some performance numbers? Thanks a lot!

@redactedontop
Author

Hey!

Hell yeah! Give me some time (as it's exam season), and I'll get back to you ASAP.

Thanks,
Alex <3

@redactedontop
Author

Hey, sorry!

No benches yet, will try today (on my decent computer).

Thanks,
Alex <3

@redactedontop
Author

Main branch results:

```
                        change: [-9.3250% -6.7385% -3.7397%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 18 outliers among 100 measurements (18.00%)
  5 (5.00%) high mild
  13 (13.00%) high severe

EBR: superposed guard   time:   [490.25 ps 496.28 ps 502.93 ps]
                        change: [-2.2388% +0.0442% +2.1496%] (p = 0.97 > 0.05)
                        No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) low mild
  2 (2.00%) high mild
```

@redactedontop
Author

My results:

```
                        change: [-11.632% -9.5819% -7.8764%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 11 outliers among 100 measurements (11.00%)
  5 (5.00%) high mild
  6 (6.00%) high severe

EBR: superposed guard   time:   [719.66 ps 726.75 ps 734.34 ps]
                        change: [+66.492% +69.035% +71.453%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) high mild
  2 (2.00%) high severe
```

Curious why the regression happens.

@redactedontop
Author

Turns out it's LTO.

@redactedontop
Author

Have refactored the code a bit more. Please take a look at the updated code. I've reduced boilerplate with the 2 Atomic types by using the type-state pattern.
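
For readers unfamiliar with the type-state pattern mentioned here, a minimal sketch follows. The names `Ownership`, `Owned`, `Shared`, and `Handle` are hypothetical and not taken from the PR; the point is only that a zero-sized marker type selects behavior at compile time, so code common to both variants is written once.

```rust
use std::marker::PhantomData;
use std::sync::atomic::{AtomicUsize, Ordering::Relaxed};

/// Hypothetical marker trait: each implementor represents one "state".
trait Ownership {
    /// Whether handles in this state may be duplicated.
    const CLONEABLE: bool;
}

/// Hypothetical zero-sized state markers.
struct Owned;
struct Shared;

impl Ownership for Owned {
    const CLONEABLE: bool = false;
}

impl Ownership for Shared {
    const CLONEABLE: bool = true;
}

/// One generic type stands in for two near-identical concrete types.
struct Handle<O: Ownership> {
    value: AtomicUsize,
    _state: PhantomData<O>,
}

/// Methods available in every state.
impl<O: Ownership> Handle<O> {
    fn new(value: usize) -> Self {
        Self {
            value: AtomicUsize::new(value),
            _state: PhantomData,
        }
    }

    fn load(&self) -> usize {
        self.value.load(Relaxed)
    }

    fn is_cloneable(&self) -> bool {
        O::CLONEABLE
    }
}

/// Method available only in the `Shared` state.
impl Handle<Shared> {
    fn duplicate(&self) -> Self {
        Self::new(self.load())
    }
}
```

Calling `duplicate` on a `Handle<Owned>` is a compile error, which is the main benefit over a runtime flag.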

@redactedontop
Author

Another approach would be to use traits, which could produce cleaner code, but I'm not sure I'm ready for the task yet.

@redactedontop
Author

As the Epoch API has changed, I bumped the major version.

@redactedontop
Author

redactedontop commented Apr 4, 2025

Benchmarks (no LTO):

```
EBR: guard - time: [6.3678 ns 6.4250 ns 6.5398 ns]
Found 17 outliers among 100 measurements (17.00%)
  9 (9.00%) high mild
  8 (8.00%) high severe

EBR: superposed guard - time: [462.72 ps 466.61 ps 470.89 ps]
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) high mild
  3 (3.00%) high severe
```

@wvwwvwwv
Owner

Hi, thanks for the updates. I really didn't have a chance to look into this PR, but I'll take some time next week.
-> In the meantime, I'll approve workflow runs.

@redactedontop
Author

Hey, can you please approve new runs? I've fixed the issue with the old ones.

@redactedontop
Author

MSRV needs bump. Could you do it pre-merge? I can't figure out which version to set it to.

Owner

@wvwwvwwv left a comment

Hi, I only reviewed half of the change. I'll review the rest by the end of this week.


[dev-dependencies]
criterion = "0.5"
criterion = "0.5.1"
Owner

I intentionally omitted patch versions in Cargo.toml - revert it to "0.5".

Author

Fair.

[dependencies]
loom = { version = "0.7", optional = true }
[dependencies.loom]
version = "0.7.2"
Owner

"0.7"

Author

Fair. I also need to make it optional.

pub(super) mod ownership {
use crate::ref_counted::RefCounted;

pub(super) trait Type {
Owner

(nit, optional) Can you add doc to each method/type?

Author

I see why it's important but honestly I'm wayyy too lazy to... could you take a shot at it?

if let Some(f) = self.f.take() {
f();
}
let Some(f) = self.f.take() else {
Owner

This bumps MSRV, and I don't see any readability improvements here. Can you revert it?

Author

We're bumping the MSRV anyway (because of the atomic add, for its performance), and I believe let-else is covered by that bump as well.
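
To make the trade-off under discussion concrete, here is a hedged side-by-side sketch of the two forms. `RunOnce` and the `bool` return values are hypothetical additions for illustration; only the `self.f.take()` pattern comes from the diff. The `let ... else` form was stabilized in Rust 1.65, which is why it affects the MSRV.

```rust
/// Hypothetical stand-in for a type that runs a stored closure at most once.
struct RunOnce<F: FnOnce()> {
    f: Option<F>,
}

impl<F: FnOnce()> RunOnce<F> {
    /// `if let` form: compiles on older toolchains.
    fn run_if_let(&mut self) -> bool {
        if let Some(f) = self.f.take() {
            f();
            return true;
        }
        false
    }

    /// `let ... else` form: requires Rust 1.65 or newer.
    fn run_let_else(&mut self) -> bool {
        let Some(f) = self.f.take() else {
            // The `else` branch must diverge (return, break, panic, ...).
            return false;
        };
        f();
        true
    }
}
```

Both methods behave identically; the difference is purely in nesting depth and the minimum compiler version.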

}
}
} else {
if (*collector_ptr).num_readers > 0 {
Owner

Any justification for this refactoring? Comparing against 0 is "usually" better than '>' in terms of performance.

Author

The compiler can most probably optimize the difference away, and it makes the code much cleaner.
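
The original comparison is not shown in this excerpt. Assuming it was an equality test against zero and that `num_readers` is an unsigned integer (both are assumptions), the two forms below are logically equivalent, so the compiler is free to emit the same instructions for either. Whether it actually does is best confirmed by inspecting the generated assembly.

```rust
/// Assumption: the counter is unsigned, so it can never be negative.
fn has_readers_gt(num_readers: usize) -> bool {
    num_readers > 0
}

/// Equivalent check written as an inequality against zero.
fn has_readers_ne(num_readers: usize) -> bool {
    num_readers != 0
}
```

For a signed counter the two checks would differ for negative values, so the equivalence depends on the field's type.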

let mut current = GLOBAL_ROOT.chain_head.load(Relaxed);
loop {

unsafe {
Owner

This repeats lines 244-248. Can you revert it?

Author

It's either repetition or some ugly nesting; I thought this would be preferred.

current_collector_ptr = (*collector_ptr).next_link.load(Relaxed);
continue;
}
let Ok(mut current_ptr) = Self::lock_chain() else {
Owner

This is debatable; your code is clearly "clearer", but this bumps MSRV. I'll need to think about the pros and cons here.
-> Since you bumped SDD to 4, we can make breaking changes, though keeping MSRV as low as possible is good for dependent libraries.

Author

See the previous comment about let-else.

@wvwwvwwv
Owner

> MSRV needs bump. Could you do it pre-merge? I can't figure out which version to set it to.

You can test it with cargo msrv verify; the lower the better!

Tag::Second => 2,
Tag::Both => 3,
}
fn not(self) -> usize {
Owner

Why is it needed? What are the semantics of this method? A Tag is intentionally a four-state type, but the output of ! is a usize, which doesn't quite make sense to me...

Author

Can you explain further?

2 => Ok(Tag::Second),
3 => Ok(Tag::Both),
_ => Err(val),
fn add(self, rhs: Self) -> Tag {
Owner

Tag is supposed to contain one of four possible values; this doesn't seem right.

Author

Please explain further as well: this is used for later code replacing a match statement with arithmetic (just like the previous comment!).
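
One way to read the concern about arithmetic on a four-state type is sketched below. The `Second` and `Both` variants and their values come from the diff; the `None` and `First` variants and the helpers `checked_sum` and `union` are assumptions added for illustration. The sketch shows that an arithmetic sum of two tags can leave the range 0..=3, whereas a bitwise OR cannot.

```rust
use std::convert::TryFrom;

/// Four-state tag; `None` and `First` are assumed variant names.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Tag {
    None,
    First,
    Second,
    Both,
}

impl Tag {
    fn value(self) -> usize {
        match self {
            Tag::None => 0,
            Tag::First => 1,
            Tag::Second => 2,
            Tag::Both => 3,
        }
    }
}

impl TryFrom<usize> for Tag {
    type Error = usize;

    fn try_from(val: usize) -> Result<Self, usize> {
        match val {
            0 => Ok(Tag::None),
            1 => Ok(Tag::First),
            2 => Ok(Tag::Second),
            3 => Ok(Tag::Both),
            _ => Err(val),
        }
    }
}

/// Hypothetical helper: arithmetic sum, which may fall outside 0..=3.
fn checked_sum(a: Tag, b: Tag) -> Result<Tag, usize> {
    Tag::try_from(a.value() + b.value())
}

/// Hypothetical helper: bitwise union, which always stays within 0..=3.
fn union(a: Tag, b: Tag) -> Tag {
    Tag::try_from(a.value() | b.value()).expect("OR of two 2-bit values fits in 2 bits")
}
```

With these definitions, `Second + Both` has no valid `Tag` result, so an `Add` implementation returning `Tag` would have to wrap, saturate, or panic in that case.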


/// [`Tag`] is a four-state `Enum` that can be embedded in a pointer as the two least
/// significant bits of the pointer value.
#[repr(usize)]
Owner

usize is a waste of memory for a four-state type.

Author

It improves performance since you don't need to convert a u8 to a usize, though I believe the compiler would optimize that conversion away anyway.
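
The memory side of this exchange can be checked directly. The sketch below uses two hypothetical enums, `SmallTag` and `WideTag`, that differ only in their `repr`; it is an illustration of the size difference, not the crate's actual definition.

```rust
use std::mem::size_of;

/// Hypothetical four-state enum stored in one byte.
#[allow(dead_code)]
#[repr(u8)]
enum SmallTag {
    None,
    First,
    Second,
    Both,
}

/// Hypothetical four-state enum stored in a full machine word.
#[allow(dead_code)]
#[repr(usize)]
enum WideTag {
    None,
    First,
    Second,
    Both,
}

/// Returns the sizes of the two representations in bytes.
fn tag_sizes() -> (usize, usize) {
    (size_of::<SmallTag>(), size_of::<WideTag>())
}

/// Converting the one-byte form to `usize` is a plain widening cast.
fn small_to_usize(tag: SmallTag) -> usize {
    tag as u8 as usize
}
```

On a 64-bit target `WideTag` occupies 8 bytes against 1 for `SmallTag`; whether the widening cast costs anything measurable would need a benchmark.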

pub(super) fn drop_ref(&self) -> bool {
// It does not have to be a load-acquire as everything's synchronized via the global
// epoch.
let mut current = self.ref_cnt().load(Relaxed);
Owner

Thanks!

  • Could you revive the debug assertion?

Author

Would be hard without messing with the code a lot.

}
},
)
.fetch_update(order, order, |r| (r & 1 == 1).then_some(r + 2))
Owner

Nice!

Author

Thanks!
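
For context, the `fetch_update` one-liner praised above can be exercised in isolation. The wrapper `try_add_ref` is a hypothetical name, and what the low bit means in the crate is not shown in this excerpt; the sketch only demonstrates that the closure adds 2 when the lowest bit is set and otherwise leaves the value untouched.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Adds 2 to the counter only while its lowest bit is set.
/// Returns `true` if the counter was updated.
///
/// Note: `fetch_update` uses `order` as the failure/load ordering too, so
/// `Release` and `AcqRel` are not valid here and would panic.
fn try_add_ref(counter: &AtomicUsize, order: Ordering) -> bool {
    counter
        // `then_some` yields `Some(r + 2)` when the condition holds, which
        // tells `fetch_update` to store the new value; `None` aborts.
        .fetch_update(order, order, |r| (r & 1 == 1).then_some(r + 2))
        .is_ok()
}
```

`bool::then_some` requires Rust 1.62 or newer, which is relevant to the MSRV discussion elsewhere in this thread.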

pub fn get_guarded_ref<'g>(&self, _guard: &'g Guard) -> &'g T {
unsafe { std::mem::transmute::<&T, _>(&**self) }
pub fn get_guarded_ref<'g>(&self, _: &'g Guard) -> &'g T {
#[allow(clippy::missing_transmute_annotations)]
Owner

Could you keep the type annotations here instead of omitting them?
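
A hedged sketch of why explicit annotations on `transmute` are useful: with both types spelled out, the call can only change what the annotations say, here just the lifetime. `Guard` and `Holder` below are hypothetical stand-ins, and the method is marked `unsafe` in this sketch because nothing in it ties the value's lifetime to the guard.

```rust
/// Hypothetical stand-in for the crate's guard type.
struct Guard;

/// Hypothetical owner of a heap-allocated value.
struct Holder<T> {
    value: Box<T>,
}

impl<T> Holder<T> {
    /// Returns a reference bound to the guard's lifetime instead of `self`'s.
    ///
    /// # Safety
    /// The caller must ensure the boxed value is not dropped while the
    /// returned reference is in use.
    unsafe fn get_guarded_ref<'g>(&self, _guard: &'g Guard) -> &'g T {
        // Source and target types are both written out, so a later change to
        // the argument type would fail to compile instead of silently
        // transmuting something else.
        unsafe { std::mem::transmute::<&T, &'g T>(&*self.value) }
    }
}
```

Omitting the annotations and silencing `clippy::missing_transmute_annotations` compiles as well, but then the compiler infers both sides and offers no such check.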

@@ -0,0 +1,97 @@
#[cfg(feature = "loom")]
Owner

Please remove this file.

Author

Will do tomorrow.

/// let prev_prev = prev.prev();
/// assert!(prev_prev < prev);
/// ```
#[allow(clippy::precedence)]
Owner

Could you just add parentheses instead of suppressing the lint?

Author

Well thought of.
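
The expression that triggered `clippy::precedence` is not shown in this excerpt, so the following is only an illustrative sketch with made-up functions. In Rust, `+` binds tighter than `<<`, which is the kind of mix the lint asks to parenthesize.

```rust
/// Without parentheses: parsed as `x << (n + 1)` because `+` binds tighter.
#[allow(clippy::precedence)]
fn implicit(x: u32, n: u32) -> u32 {
    x << n + 1
}

/// Parenthesized to match the default parse; the lint stays quiet.
fn explicit_add_first(x: u32, n: u32) -> u32 {
    x << (n + 1)
}

/// Parenthesized the other way; this is a different computation.
fn explicit_shift_first(x: u32, n: u32) -> u32 {
    (x << n) + 1
}
```

Adding the parentheses documents which of the two computations is intended, which an `#[allow]` attribute does not.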

@wvwwvwwv
Owner

Hi, any chance to update this PR?

@loyd

loyd commented Sep 30, 2025

intel i9-14900, no difference:

group                    main                                   pr3
-----                    ----                                   ---
EBR: guard               1.00      6.0±0.02ns        ? ?/sec    1.00      6.0±0.01ns        ? ?/sec
EBR: superposed guard    1.00      0.4±0.00ns        ? ?/sec    1.00      0.4±0.00ns        ? ?/sec

@loyd

loyd commented Sep 30, 2025

and 3rd major update in one year =(

@redactedontop
Author

> intel i9-14900, no difference:
>
> group                    main                                   pr3
> -----                    ----                                   ---
> EBR: guard               1.00      6.0±0.02ns        ? ?/sec    1.00      6.0±0.01ns        ? ?/sec
> EBR: superposed guard    1.00      0.4±0.00ns        ? ?/sec    1.00      0.4±0.00ns        ? ?/sec

Compile with target-cpu.

@redactedontop
Author

> Hi, any chance to update this PR?

Soon
