Simplify jemalloc setup (without perf regression)
#148925
I could reproduce the performance regression locally, let's see if rustc-perf agrees:
@bors try
Fix performance regression with jemalloc
d4e5eb3 to
84a974f
Compare
@bors try
Fix performance regression with jemalloc
Finished benchmarking commit (aef7be6): comparison URL.
Overall result: no relevant changes - no action needed
Benchmarking this pull request means it may be perf-sensitive - we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.
@bors rollup=never
Instruction count: This benchmark run did not return any relevant results for this metric.
Max RSS (memory usage): Results (primary -0.6%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Cycles: Results (secondary 2.2%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary size: This benchmark run did not return any relevant results for this metric.
Bootstrap: 475.201s -> 474.025s (-0.25%)
Thanks! This is a great cleanup.
Feel free to r=me once you undraft the PR.
/// See docs in https://github.com/rust-lang/rust/blob/HEAD/compiler/rustc/src/main.rs
/// and https://github.com/rust-lang/rust/pull/146627 for why we need this `use` statement.
#[cfg(any(target_os = "linux", target_os = "macos"))]
use tikv_jemalloc_sys as _;
Would `extern crate` be needed here too, instead of the `use`?
Nope, Miri already uses `tikv_jemalloc_sys` from Cargo, so `extern crate` and `use` work the same here.
Does that mean Miri suffers from the cc-rs issue? The comment in clippy seems to say that the `extern crate` thing is needed to avoid the cc-rs issue.
> Does that mean Miri suffers from the cc-rs issue? The comment in clippy seems to say that the `extern crate` thing is needed to avoid the cc-rs issue.
I haven't tested it, but on reflection I'm fairly sure that it does, yeah.
Uh then maybe let's not land this as-is? Please don't break Miri.^^
As in, jemalloc isn't LTO optimized for the miri Rustup component, neither before nor after this PR.
But jemalloc is used both before and after.
FWIW, LTO for jemalloc resulted maybe in a ~1% total instruction count win for rustc, IIRC, it wasn't a huge deal.
Ah okay. I'd appreciate your help with fixing that (if only just to have things consistent across tools), but it doesn't have to be in this PR then.
May be a bit troublesome, because Miri supports being compiled outside the rust-lang/rust workspace (?)
If that's not an issue, then it'd be enough to just do:
diff --git a/src/tools/miri/Cargo.toml b/src/tools/miri/Cargo.toml
index 611e549930a..2235203e2d7 100644
--- a/src/tools/miri/Cargo.toml
+++ b/src/tools/miri/Cargo.toml
@@ -29,13 +29,6 @@ directories = "6"
bitflags = "2.6"
serde_json = { version = "1.0", optional = true }
-# Copied from `compiler/rustc/Cargo.toml`.
-# But only for some targets, it fails for others. Rustc configures this in its CI, but we can't
-# easily use that since we support of-tree builds.
-[target.'cfg(any(target_os = "linux", target_os = "macos"))'.dependencies.tikv-jemalloc-sys]
-version = "0.6.1"
-features = ['override_allocator_on_supported_platforms']
-
[target.'cfg(unix)'.dependencies]
libc = "0.2"
# native-lib dependencies
@@ -75,6 +68,7 @@ stack-cache = []
expensive-consistency-checks = ["stack-cache"]
tracing = ["serde_json"]
native-lib = ["dep:libffi", "dep:libloading", "dep:capstone", "dep:ipc-channel", "dep:nix", "dep:serde"]
+jemalloc = []
[lints.rust.unexpected_cfgs]
level = "warn"
diff --git a/src/tools/miri/src/bin/miri.rs b/src/tools/miri/src/bin/miri.rs
index d7c5cb68e4f..cab31d159d3 100644
--- a/src/tools/miri/src/bin/miri.rs
+++ b/src/tools/miri/src/bin/miri.rs
@@ -22,8 +22,12 @@
/// See docs in https://github.com/rust-lang/rust/blob/HEAD/compiler/rustc/src/main.rs
/// and https://github.com/rust-lang/rust/pull/146627 for why we need this `use` statement.
-#[cfg(any(target_os = "linux", target_os = "macos"))]
-use tikv_jemalloc_sys as _;
+///
+/// FIXME(madsmtm): This is loaded from the sysroot that was built with the other `rustc` crates
+/// above, instead of via Cargo as you'd normally do. This is currently needed for LTO due to
+/// https://github.com/rust-lang/cc-rs/issues/1613.
+#[cfg(feature = "jemalloc")]
+extern crate tikv_jemalloc_sys as _;
mod log;
I can put up a PR, but I'm unfamiliar with how CI and perf runs work in r-l/rust vs. r-l/miri, so I wouldn't know which repo to target. I'd also fear breaking jemalloc for a use-case that I don't know about. And besides, I'm kinda hoping to resolve the cc-rs issue, in which case we wouldn't have to do anything in Miri.
> May be a bit troublesome, because Miri supports being compiled outside the rust-lang/rust workspace (?)
That's also the case for clippy though.
It seems the main difference is using tikv-jemalloc-sys from the sysroot vs having it as a dependency directly in the crate. And if it's already in the sysroot then it seems reasonable to use that, and avoid any risk of it being duplicated or so.
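To make the sysroot-versus-dependency distinction concrete, here is a minimal sketch that uses the standard `alloc` crate as a stand-in, since `tikv_jemalloc_sys` is not available outside the rustc build. `alloc` ships in the sysroot but is not handed to the crate as a Cargo dependency, so it is brought into scope with `extern crate` rather than being resolved through Cargo. The function name `sum_via_sysroot_crate` is hypothetical and only there for illustration.

```rust
// `alloc` lives in the sysroot and is not listed in Cargo.toml.
// `extern crate` asks the compiler to locate and link it from the sysroot,
// which is the same mechanism the PR relies on for `tikv_jemalloc_sys`.
extern crate alloc;

/// Hypothetical helper: builds a vector through the sysroot-loaded crate
/// and sums its elements.
fn sum_via_sysroot_crate(values: &[u32]) -> u32 {
    let copy: alloc::vec::Vec<u32> = values.to_vec();
    copy.iter().sum()
}

fn main() {
    println!("{}", sum_via_sysroot_crate(&[1, 2, 3]));
}
```

For a crate that Cargo already passes to the compiler as a dependency, `use some_crate as _;` and `extern crate some_crate as _;` end up linking the same thing, which matches the point made earlier in the thread about Miri.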
Some changes occurred in src/tools/clippy
cc @rust-lang/clippy
These commits modify the
If this was unintentional then you should revert the changes before this PR is merged.
The Miri subtree was changed
cc @rust-lang/miri
jemalloc setup (without perf regression)
c690fa1 to
139754b
Compare
Using the new `override_allocator_on_supported_platforms` feature in `tikv-jemalloc-sys v0.6.1` we can avoid the manual statics.
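As a rough illustration of what "manual statics" can look like in general, the sketch below pins a function with a `#[used]` static holding a function pointer. It is an assumption about the general pattern, not the code this PR removes: `demo_calloc_size` and `KEEP_DEMO_CALLOC` are hypothetical names, and the function is a trivial stand-in rather than a real allocator entry point.

```rust
/// Hypothetical stand-in for an allocator entry point.
/// It only computes a byte count; it does not allocate anything.
extern "C" fn demo_calloc_size(count: usize, size: usize) -> usize {
    count.saturating_mul(size)
}

/// `#[used]` tells the compiler to keep this static in the emitted object
/// file even though nothing reads it. Because the static stores a pointer
/// to `demo_calloc_size`, the function is referenced and kept as well.
/// The linker may still apply its own rules afterwards.
#[used]
static KEEP_DEMO_CALLOC: extern "C" fn(usize, usize) -> usize = demo_calloc_size;

fn main() {
    // Call through the static to show it really holds the function pointer.
    println!("{}", KEEP_DEMO_CALLOC(4, 8));
}
```

With a crate feature that overrides the allocator symbols itself, one such static per symbol no longer has to be maintained by hand, which is the simplification the commit message describes.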
139754b to
73cecf3
Compare
This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed; this note is just to help reviewers.
The Clippy subtree PR was merged, so this should be good to go.
Simplify `jemalloc` setup (without perf regression) Reland #146627 after fixing [the performance regression](#148851 (comment)) that caused it to be reverted in #148896. This avoids 65f0b7a (second commit in the initial PR), and adds a comment explaining why `extern crate` is needed here instead of `use` (we need to load `tikv_jemalloc_sys` from the sysroot because of rust-lang/cc-rs#1613). r? Kobzol
@bors r-
@bors retry
@bors r=Kobzol
Reland #146627 after fixing the performance regression that caused it to be reverted in #148896.
This avoids 65f0b7a (second commit in the initial PR), and adds a comment explaining why `extern crate` is needed here instead of `use` (we need to load `tikv_jemalloc_sys` from the sysroot because of rust-lang/cc-rs#1613).
r? Kobzol