Description
I tried this code (artificial, for the sake of a self-contained example):
#![feature(debug_closure_helpers)]
use core::fmt::{self, Debug};
pub struct Point {
    pub x: u32,
    pub y: u32,
}
impl fmt::Debug for Point {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        f.debug_struct("Point")
            .field_with("x", |f| self.x.fmt(f))
            .field_with("y", |f| self.y.fmt(f))
            .finish()
    }
}
I expected to see this happen: generating roughly comparable amounts of code to the more conventional approach with DebugStruct::field.
Instead, this happened: every use of field_with with a distinct closure monomorphizes all of field_with's code, adding a hundred lines of LLVM IR per call site. For the simple example above that only calls field_with twice, cargo-llvm-lines counts 578 lines, while the version using .field("x", &self.x).field("y", &self.y) generates 42 lines.
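For comparison, here is the conventional version referred to above, written out in full. It is the same Point type with .field in place of .field_with. DebugStruct::field takes its value as &dyn Debug, so no closure type is involved and the builder code does not need to be instantiated again for each call site.

```rust
use core::fmt;

pub struct Point {
    pub x: u32,
    pub y: u32,
}

impl fmt::Debug for Point {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // `field` accepts `&dyn Debug`, so both calls share one
        // non-generic copy of the builder code.
        f.debug_struct("Point")
            .field("x", &self.x)
            .field("y", &self.y)
            .finish()
    }
}
```

Formatting Point { x: 1, y: 2 } with {:?} yields Point { x: 1, y: 2 }, the same text the field_with version produces.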
A real-world example where I've run into this is the impl Debug for raw pointers. Here, the use of field_with causes a lot of code to be generated per pointee type. For example, consider this program:
use core::ptr::null;
use core::fmt::Debug;
type Pointers = (*mut u8, *mut u16, *mut u32, *mut u64);
pub fn force_codegen() -> Box<dyn Debug> {
    Box::new(Pointers::default())
}
cargo llvm-lines output
Lines Copies Function name
----- ------ -------------
1591 38 (TOTAL)
868 (54.6%, 54.6%) 4 (10.5%, 10.5%) core::fmt::builders::DebugStruct::field_with::{{closure}}
180 (11.3%, 65.9%) 4 (10.5%, 21.1%) core::fmt::builders::DebugStruct::field_with
120 (7.5%, 73.4%) 4 (10.5%, 31.6%) <*const T as core::fmt::Pointer>::fmt
103 (6.5%, 79.9%) 1 (2.6%, 34.2%) alloc::alloc::Global::alloc_impl
50 (3.1%, 83.0%) 1 (2.6%, 36.8%) core::tuple::::default
40 (2.5%, 85.5%) 4 (10.5%, 47.4%) <*const T as core::fmt::Pointer>::fmt::{{closure}}
39 (2.5%, 88.0%) 1 (2.6%, 50.0%) alloc::alloc::exchange_malloc
37 (2.3%, 90.3%) 1 (2.6%, 52.6%) core::alloc::layout::Layout::from_size_align_unchecked::precondition_check
36 (2.3%, 92.6%) 4 (10.5%, 63.2%) <*mut T as core::fmt::Debug>::fmt
33 (2.1%, 94.7%) 1 (2.6%, 65.8%) core::ptr::non_null::NonNull::new_unchecked::precondition_check
28 (1.8%, 96.4%) 4 (10.5%, 76.3%) <&T as core::fmt::Debug>::fmt
23 (1.4%, 97.9%) 1 (2.6%, 78.9%) <(W,V,U,T) as core::fmt::Debug>::fmt
17 (1.1%, 98.9%) 1 (2.6%, 81.6%) alloc::boxed::Box::new
6 (0.4%, 99.3%) 1 (2.6%, 84.2%) <() as core::fmt::Debug>::fmt
6 (0.4%, 99.7%) 1 (2.6%, 86.8%) test_field_with::force_codegen
4 (0.3%, 99.9%) 4 (10.5%, 97.4%) core::ptr::mut_ptr::::default
1 (0.1%,100.0%) 1 (2.6%,100.0%) <() as core::unit::IsUnit>::is_unit
Note:
- These numbers are for the initial LLVM IR. LLVM easily optimizes this into a jump to the fmt_pointer_inner function for Sized pointees. But generating all that LLVM IR still wastes compile time, and for unsized pointee types (e.g., *mut [T]) this will probably impact binary size even with optimizations.
- This specific impl can be fixed to avoid field_with (and that may be useful as a minor optimization even if field_with improves), but it's a nice example of how easy it is to run into the code size footgun of the current way these helpers are implemented.
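As an illustration of how a caller might keep closure-based field formatting without going through field_with, here is a minimal sketch on stable Rust. DebugWith and debug_with are hypothetical names introduced for this example and are not part of core. The sketch also assumes the closure can be called through a shared reference (Fn). Only the small forwarding impl is generic over the closure type, and the builder itself is reached through the non-generic DebugStruct::field. The effect on generated code has not been measured here, so treat it as an idea to test, not a confirmed fix.

```rust
use core::fmt;

/// Hypothetical adapter, not part of `core`: lets a closure be passed
/// wherever a `&dyn Debug` is expected.
pub struct DebugWith<F>(F);

impl<F> fmt::Debug for DebugWith<F>
where
    F: Fn(&mut fmt::Formatter<'_>) -> fmt::Result,
{
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Only this forwarding function is generic over the closure type.
        (self.0)(f)
    }
}

/// Hypothetical constructor; the bound lets the compiler infer the
/// closure's argument type at the call site.
pub fn debug_with<F>(f: F) -> DebugWith<F>
where
    F: Fn(&mut fmt::Formatter<'_>) -> fmt::Result,
{
    DebugWith(f)
}

pub struct Point {
    pub x: u32,
    pub y: u32,
}

impl fmt::Debug for Point {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Each closure is wrapped and handed to the non-generic `field`
        // as a `&dyn Debug`.
        f.debug_struct("Point")
            .field("x", &debug_with(|f| fmt::Debug::fmt(&self.x, f)))
            .field("y", &debug_with(|f| fmt::Debug::fmt(&self.y, f)))
            .finish()
    }
}
```

The output is unchanged from the other versions: Point { x: 1, y: 2 } formats as Point { x: 1, y: 2 } with {:?}.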
Meta
rustc --version --verbose:
rustc 1.93.0-nightly (01867557c 2025-11-12)
binary: rustc
commit-hash: 01867557cd7dbe256a031a7b8e28d05daecd75ab
commit-date: 2025-11-12
host: x86_64-unknown-linux-gnu
release: 1.93.0-nightly
LLVM version: 21.1.5