Stop emitting UbChecks on every Vec→Slice #150265
@bors try @rust-timer queue
Finished benchmarking commit (d3405d7): comparison URL.
Overall result: ❌✅ regressions and improvements - please read the text below
Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.
Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @bors rollup=never
Instruction count: Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage): Results (primary 1.5%, secondary 1.9%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Cycles: Results (primary -1.0%, secondary -1.9%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary size: Results (primary -0.2%, secondary -0.5%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Bootstrap: 481.34s -> 483.129s (0.37%)
Spotted this in #148766's test changes. It doesn't seem like this ubcheck would catch anything useful; let's see if skipping it helps perf.
9ca160d to fd8744f
r? @ibraheemdev
rustbot has assigned @ibraheemdev.
Reconfirming after rebasing, but the results should be basically the same.
    fn not_equal(&self, other: &[B]) -> bool {
        !self.equal(other)
    }
Annotation: nothing actually overrode this anywhere, so I removed it in favour of the usual PartialEq::ne.
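A minimal sketch of why such a hook is redundant, assuming a hypothetical helper trait `SliceEq` and wrapper type `Wrapper` (neither name is from the PR): when an impl provides only `eq`, `PartialEq::ne` falls back to its provided default, which is `!self.eq(other)`.

```rust
// Hypothetical stand-in for an internal helper trait; only `equal` remains.
trait SliceEq<B> {
    fn equal(&self, other: &[B]) -> bool;
}

impl<A: PartialEq<B>, B> SliceEq<B> for [A] {
    fn equal(&self, other: &[B]) -> bool {
        // Lengths must match, then every pair of elements must compare equal.
        self.len() == other.len() && self.iter().zip(other).all(|(a, b)| a == b)
    }
}

// Illustrative wrapper type, used only to show a hand-written `PartialEq` impl.
struct Wrapper<'a, T>(&'a [T]);

impl<'a, 'b, A: PartialEq<B>, B> PartialEq<Wrapper<'b, B>> for Wrapper<'a, A> {
    fn eq(&self, other: &Wrapper<'b, B>) -> bool {
        SliceEq::equal(self.0, other.0)
    }
    // No `ne` here: the default provided by `PartialEq` is `!self.eq(other)`.
}

// Exercises `!=`, which resolves to the default `PartialEq::ne`.
fn wrappers_differ(a: &[i32], b: &[i32]) -> bool {
    Wrapper(a) != Wrapper(b)
}
```

Since nothing specializes the negated form, keeping a separate `not_equal` method would only duplicate what the default already does.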
    StorageLive(_38);
    _36 = copy _29 as &[u8] (Transmute);
    _38 = copy _28 as &[u8] (Transmute);
    _7 = <[u8] as PartialEq>::eq(move _36, move _38) -> [return: bb19, unwind unreachable];
Annotation: note that we're still inlining the whole &String → &str → &[u8] part (since it'll essentially disappear in LLVM), just stopping at <[_]>::eq, which is probably best shared at the MIR level.
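For readers less used to MIR, the conversion chain in that snippet can be spelled out in surface Rust. This is an illustrative sketch only; `string_eq` is a hypothetical name, and the standard library reaches the same call through its own `PartialEq` impls rather than through code like this.

```rust
// Illustrative only: spells out the steps that comparing two `String`s goes through.
#[allow(clippy::ptr_arg)]
fn string_eq(a: &String, b: &String) -> bool {
    // &String -> &str
    let a: &str = a.as_str();
    let b: &str = b.as_str();
    // &str -> &[u8] (the `Transmute` casts in the MIR)
    let a: &[u8] = a.as_bytes();
    let b: &[u8] = b.as_bytes();
    // The one call that stays un-inlined at the MIR level.
    <[u8] as PartialEq>::eq(a, b)
}
```

The first two steps are pure reinterpretations of the same pointer and length, which is why inlining them costs little, while the final slice comparison is a single shared function.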
Finished benchmarking commit (fa07ba2): comparison URL.
Overall result: ❌✅ regressions and improvements - please read the text below
Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.
Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @bors rollup=never
Instruction count: Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage): Results (primary -0.1%, secondary 1.6%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Cycles: Results (primary -1.0%, secondary -1.4%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary size: Results (primary -0.2%, secondary -0.5%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Bootstrap: 481.395s -> 479.879s (-0.31%)
    unsafe { slice::from_raw_parts(self.as_ptr(), self.len) }
    unsafe {
        // normally this would use `slice::from_raw_parts`, but it's
        // hot enough that avoiding the UB check is worth it
I wouldn't say "hot" here; I think some readers might take this comment to mean there is runtime overhead to UB checks.
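The two ways of building the slice can be compared in a small sketch. `RawView` and its methods are hypothetical names, and the exact replacement used in the PR is not shown in this excerpt, so using `ptr::slice_from_raw_parts` here is an assumption about one way to skip the precondition check, not a copy of the actual change.

```rust
use std::marker::PhantomData;
use std::ptr;
use std::slice;

// Hypothetical miniature of a pointer-plus-length container.
struct RawView<'a, T> {
    ptr: *const T,
    len: usize,
    _marker: PhantomData<&'a [T]>,
}

impl<'a, T> RawView<'a, T> {
    fn new(s: &'a [T]) -> Self {
        RawView { ptr: s.as_ptr(), len: s.len(), _marker: PhantomData }
    }

    // `slice::from_raw_parts` asserts its safety preconditions in builds
    // where UB checks are enabled.
    fn as_slice_checked(&self) -> &'a [T] {
        // SAFETY: `ptr` and `len` came from a valid slice that lives for `'a`.
        unsafe { slice::from_raw_parts(self.ptr, self.len) }
    }

    // Builds the raw slice pointer with a safe function and dereferences it
    // directly, so no precondition assertion is involved.
    fn as_slice_unchecked(&self) -> &'a [T] {
        // SAFETY: same invariant as above; the type guarantees it by construction.
        unsafe { &*ptr::slice_from_raw_parts(self.ptr, self.len) }
    }
}
```

Both methods return the same slice; the difference lies only in whether the extra assertion is emitted for the compiler to process, which is where the perf runs in this thread measure the cost.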
    // It's not worth trying to inline the loops underneath here *in MIR*,
    // and preventing it encourages more useful inlining upstream,
    // such as in `<str as PartialEq>::eq`.
    // The codegen backend can still inline it later if needed.
    #[rustc_no_mir_inline]
If the problem is the loop, why not put the attribute on the generic form that actually has a loop in it, instead of covering the entire entrypoint into the specialized-upon trait?
Spotted this in #148766's test changes. It doesn't seem like this ubcheck would catch anything useful; let's see if skipping it helps perf. (After all, this is inside every [] on a vec, among other things.)