35 changes: 25 additions & 10 deletions compiler/rustc_codegen_ssa/src/mir/naked_asm.rs
@@ -127,6 +127,8 @@ fn prefix_and_suffix<'tcx>(

let is_arm = tcx.sess.target.arch == Arch::Arm;
let is_thumb = tcx.sess.unstable_target_features.contains(&sym::thumb_mode);
let function_sections =
tcx.sess.opts.unstable_opts.function_sections.unwrap_or(tcx.sess.target.function_sections);

let attrs = tcx.codegen_instance_attrs(instance.def);
let link_section = attrs.link_section.map(|symbol| symbol.as_str().to_string());
@@ -201,8 +203,6 @@ fn prefix_and_suffix<'tcx>(
let mut end = String::new();
match asm_binary_format {
BinaryFormat::Elf => {
let section = link_section.unwrap_or_else(|| format!(".text.{asm_name}"));

let progbits = match is_arm {
true => "%progbits",
false => "@progbits",
@@ -213,7 +213,11 @@ fn prefix_and_suffix<'tcx>(
false => "@function",
};

writeln!(begin, ".pushsection {section},\"ax\", {progbits}").unwrap();
if let Some(section) = &link_section {
writeln!(begin, ".pushsection {section},\"ax\", {progbits}").unwrap();
} else if function_sections {
writeln!(begin, ".pushsection .text.{asm_name},\"ax\", {progbits}").unwrap();
}
writeln!(begin, ".balign {align_bytes}").unwrap();
write_linkage(&mut begin).unwrap();
match item_data.visibility {
@@ -232,14 +236,18 @@ fn prefix_and_suffix<'tcx>(
// pattern match on assembly generated by LLVM.
writeln!(end, ".Lfunc_end_{asm_name}:").unwrap();
writeln!(end, ".size {asm_name}, . - {asm_name}").unwrap();
writeln!(end, ".popsection").unwrap();
if link_section.is_some() || function_sections {
writeln!(end, ".popsection").unwrap();
}
if !arch_suffix.is_empty() {
writeln!(end, "{}", arch_suffix).unwrap();
}
}
BinaryFormat::MachO => {
let section = link_section.unwrap_or_else(|| "__TEXT,__text".to_string());
writeln!(begin, ".pushsection {},regular,pure_instructions", section).unwrap();
// NOTE: LLVM ignores `-Zfunction-sections` on macos.
if let Some(section) = &link_section {
writeln!(begin, ".pushsection {section},regular,pure_instructions").unwrap();
}
writeln!(begin, ".balign {align_bytes}").unwrap();
write_linkage(&mut begin).unwrap();
match item_data.visibility {
@@ -250,14 +258,19 @@ fn prefix_and_suffix<'tcx>(

writeln!(end).unwrap();
writeln!(end, ".Lfunc_end_{asm_name}:").unwrap();
writeln!(end, ".popsection").unwrap();
if link_section.is_some() {
writeln!(end, ".popsection").unwrap();
}
if !arch_suffix.is_empty() {
writeln!(end, "{}", arch_suffix).unwrap();
}
}
BinaryFormat::Coff => {
let section = link_section.unwrap_or_else(|| format!(".text.{asm_name}"));
writeln!(begin, ".pushsection {},\"xr\"", section).unwrap();
if let Some(section) = &link_section {
writeln!(begin, ".pushsection {section},\"xr\"").unwrap()
} else if function_sections {
writeln!(begin, ".pushsection .text${asm_name},\"xr\"").unwrap()
Member

On -msvc targets, function_sections is ignored. .text$sym is only used on -gnu targets.

Member

Also LLVM currently generates the following line on COFF targets:

.section {section},"xr",one_only,{sym},unique,0

Should we be doing the same here?

Contributor Author

> On -msvc targets, function_sections is ignored.

Meaning that it's just always assumed to be on? https://godbolt.org/z/a6ofW49YK

> .text$sym is only used on -gnu targets.

It should still work for msvc, no?

> Should we be doing the same here?

I don't think we can reliably emit the unique,id bit of that line: how would we generate that ID?

Contributor Author

Actually using -Zfunction-sections=no on msvc does have an effect. So I'm not really sure what it being ignored means then. Maybe this is a more recent LLVM change?

Member

No, it's always assumed to be off on msvc. You can see it always uses .text.

Contributor Author

It does always use .text but normally it emits the line you quoted:

  .section .text,"xr",one_only,foo,unique,0

i.e. it uses a subsection which, as far as I understand, does allow DCE. With -Zfunction-sections=no it emits just

  .text

so DCE is then impossible.

Contributor Author

The unique,0 bit is apparently an LLVM extension https://llvm.org/docs/Extensions.html#id2. It is documented as ELF-specific, but clearly it's used for COFF as well...

I just don't see how we can generate the unique ID in a reliable way. Also, because we would generate one section per function, the function name (which should be unique?) should be sufficient to disambiguate the sections.

In short, I think the implementation in this PR is the best we can reliably do.
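
For readers following along, the section-selection rules in this diff can be condensed into a small standalone sketch. The `Format` enum and the `pushsection_line` helper are hypothetical names made up for illustration; the actual change writes directly into the `begin` and `end` strings inside `prefix_and_suffix`. The sketch only covers the `.pushsection` line, not alignment, linkage, or symbol type directives.

```rust
/// Object formats handled in the diff. This mirrors the `BinaryFormat` variants
/// matched there, redefined locally so the sketch compiles on its own.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Format {
    Elf,
    MachO,
    Coff,
}

/// Hypothetical helper: returns the `.pushsection` line to emit, or `None` when
/// the function stays in the current section. In the `None` case the matching
/// `.popsection` has to be skipped as well.
fn pushsection_line(
    format: Format,
    link_section: Option<&str>,
    function_sections: bool,
    asm_name: &str,
    is_arm: bool,
) -> Option<String> {
    // Same `%progbits` / `@progbits` choice as in the diff.
    let progbits = if is_arm { "%progbits" } else { "@progbits" };
    match (format, link_section) {
        // An explicit `#[link_section]` always gets its own `.pushsection`.
        (Format::Elf, Some(s)) => Some(format!(".pushsection {s},\"ax\", {progbits}")),
        (Format::MachO, Some(s)) => Some(format!(".pushsection {s},regular,pure_instructions")),
        (Format::Coff, Some(s)) => Some(format!(".pushsection {s},\"xr\"")),
        // Otherwise a per-function section is used only when function sections are on.
        (Format::Elf, None) if function_sections => {
            Some(format!(".pushsection .text.{asm_name},\"ax\", {progbits}"))
        }
        (Format::Coff, None) if function_sections => {
            Some(format!(".pushsection .text${asm_name},\"xr\""))
        }
        // Mach-O without `link_section`, or function sections turned off.
        _ => None,
    }
}

fn main() {
    for fs in [true, false] {
        for format in [Format::Elf, Format::MachO, Format::Coff] {
            let line = pushsection_line(format, None, fs, "naked_empty", false);
            println!("{format:?}, function_sections={fs}: {line:?}");
        }
    }
}
```

Returning an `Option` makes it hard to emit a `.popsection` without a matching `.pushsection`, which is the invariant the three `if link_section.is_some() || function_sections` checks in the diff maintain by hand.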

}
writeln!(begin, ".balign {align_bytes}").unwrap();
write_linkage(&mut begin).unwrap();
writeln!(begin, ".def {asm_name}").unwrap();
@@ -268,7 +281,9 @@ fn prefix_and_suffix<'tcx>(

writeln!(end).unwrap();
writeln!(end, ".Lfunc_end_{asm_name}:").unwrap();
writeln!(end, ".popsection").unwrap();
if link_section.is_some() || function_sections {
writeln!(end, ".popsection").unwrap();
}
if !arch_suffix.is_empty() {
writeln!(end, "{}", arch_suffix).unwrap();
}
70 changes: 38 additions & 32 deletions tests/codegen-llvm/naked-fn/naked-functions.rs
@@ -1,12 +1,14 @@
//@ add-minicore
//@ revisions: linux win_x86 win_i686 macos thumb
//@ revisions: linux win_x86_msvc win_x86_gnu win_i686_gnu macos thumb
//
//@[linux] compile-flags: --target x86_64-unknown-linux-gnu
//@[linux] needs-llvm-components: x86
//@[win_x86] compile-flags: --target x86_64-pc-windows-gnu
//@[win_x86] needs-llvm-components: x86
//@[win_i686] compile-flags: --target i686-pc-windows-gnu
//@[win_i686] needs-llvm-components: x86
//@[win_x86_gnu] compile-flags: --target x86_64-pc-windows-gnu
//@[win_x86_gnu] needs-llvm-components: x86
//@[win_x86_msvc] compile-flags: --target x86_64-pc-windows-msvc
//@[win_x86_msvc] needs-llvm-components: x86
//@[win_i686_gnu] compile-flags: --target i686-pc-windows-gnu
//@[win_i686_gnu] needs-llvm-components: x86
//@[macos] compile-flags: --target aarch64-apple-darwin
//@[macos] needs-llvm-components: aarch64
//@[thumb] compile-flags: --target thumbv7em-none-eabi
@@ -22,9 +24,10 @@ use minicore::*;
// linux,win: .intel_syntax
//
// linux: .pushsection .text.naked_empty,\22ax\22, @progbits
// macos: .pushsection __TEXT,__text,regular,pure_instructions
// win_x86: .pushsection .text.naked_empty,\22xr\22
// win_i686: .pushsection .text._naked_empty,\22xr\22
// macos-NOT: .pushsection
// win_x86_msvc: .pushsection .text$naked_empty,\22xr\22
// win_x86_gnu-NOT: .pushsection
// win_i686_gnu-NOT: .pushsection
// thumb: .pushsection .text.naked_empty,\22ax\22, %progbits
//
// CHECK: .balign 4
@@ -37,12 +40,12 @@ use minicore::*;
//
// linux: .type naked_empty, @function
//
// win_x86: .def naked_empty
// win_i686: .def _naked_empty
// win_x86_msvc,win_x86_gnu: .def naked_empty
// win_i686_gnu: .def _naked_empty
//
// win_x86,win_i686: .scl 2
// win_x86,win_i686: .type 32
// win_x86,win_i686: .endef
// win_x86_msvc,win_x86_gnu,win_i686_gnu: .scl 2
// win_x86_msvc,win_x86_gnu,win_i686_gnu: .type 32
// win_x86_msvc,win_x86_gnu,win_i686_gnu: .endef
//
// thumb: .type naked_empty, %function
// thumb: .thumb
@@ -53,7 +56,8 @@ use minicore::*;
// linux,macos,win: ret
// thumb: bx lr
//
// CHECK: .popsection
// linux,windows,win_x86_msvc,thumb: .popsection
// win_x86_gnu-NOT,win_i686_gnu-NOT: .popsection
//
// thumb: .thumb
//
@@ -72,9 +76,10 @@ pub extern "C" fn naked_empty() {
// linux,win: .intel_syntax
//
// linux: .pushsection .text.naked_with_args_and_return,\22ax\22, @progbits
// macos: .pushsection __TEXT,__text,regular,pure_instructions
// win_x86: .pushsection .text.naked_with_args_and_return,\22xr\22
// win_i686: .pushsection .text._naked_with_args_and_return,\22xr\22
// macos-NOT: .pushsection
// win_x86_msvc: .pushsection .text$naked_with_args_and_return,\22xr\22
// win_x86_gnu-NOT: .pushsection
// win_i686_gnu-NOT: .pushsection
// thumb: .pushsection .text.naked_with_args_and_return,\22ax\22, %progbits
//
// CHECK: .balign 4
@@ -87,27 +92,28 @@ pub extern "C" fn naked_empty() {
//
// linux: .type naked_with_args_and_return, @function
//
// win_x86: .def naked_with_args_and_return
// win_i686: .def _naked_with_args_and_return
// win_x86_msvc,win_x86_gnu: .def naked_with_args_and_return
// win_i686_gnu: .def _naked_with_args_and_return
//
// win_x86,win_i686: .scl 2
// win_x86,win_i686: .type 32
// win_x86,win_i686: .endef
// win_x86_msvc,win_x86_gnu,win_i686_gnu: .scl 2
// win_x86_msvc,win_x86_gnu,win_i686_gnu: .type 32
// win_x86_msvc,win_x86_gnu,win_i686_gnu: .endef
//
// thumb: .type naked_with_args_and_return, %function
// thumb: .thumb
// thumb: .thumb_func
//
// CHECK-LABEL: naked_with_args_and_return:
//
// linux, win_x86,win_i686: lea rax, [rdi + rsi]
// linux,win_x86_msvc,win_x86_gnu,win_i686_gnu: lea rax, [rdi + rsi]
// macos: add x0, x0, x1
// thumb: adds r0, r0, r1
//
// linux,macos,win: ret
// thumb: bx lr
//
// CHECK: .popsection
// linux,windows,win_x86_msvc,thumb: .popsection
// win_x86_gnu-NOT,win_i686_gnu-NOT: .popsection
//
// thumb: .thumb
//
@@ -134,7 +140,7 @@ pub extern "C" fn naked_with_args_and_return(a: isize, b: isize) -> isize {

// linux: .pushsection .text.some_different_name,\22ax\22, @progbits
// macos: .pushsection .text.some_different_name,regular,pure_instructions
// win_x86,win_i686: .pushsection .text.some_different_name,\22xr\22
// win_x86_msvc,win_x86_gnu,win_i686_gnu: .pushsection .text.some_different_name,\22xr\22
// thumb: .pushsection .text.some_different_name,\22ax\22, %progbits
// CHECK-LABEL: test_link_section:
#[no_mangle]
@@ -148,15 +154,15 @@ pub extern "C" fn test_link_section() {
naked_asm!("bx lr");
}

// win_x86: .def fastcall_cc
// win_i686: .def @fastcall_cc@4
// win_x86_msvc,win_x86_gnu: .def fastcall_cc
// win_i686_gnu: .def @fastcall_cc@4
//
// win_x86,win_i686: .scl 2
// win_x86,win_i686: .type 32
// win_x86,win_i686: .endef
// win_x86_msvc,win_x86_gnu,win_i686_gnu: .scl 2
// win_x86_msvc,win_x86_gnu,win_i686_gnu: .type 32
// win_x86_msvc,win_x86_gnu,win_i686_gnu: .endef
//
// win_x86-LABEL: fastcall_cc:
// win_i686-LABEL: @fastcall_cc@4:
// win_x86_msvc-LABEL,win_x86_gnu-LABEL: fastcall_cc:
// win_i686_gnu-LABEL: @fastcall_cc@4:
#[cfg(target_os = "windows")]
#[no_mangle]
#[unsafe(naked)]
24 changes: 24 additions & 0 deletions tests/run-make/naked-dead-code-elimination/main.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
use std::arch::naked_asm;

#[unsafe(naked)]
#[no_mangle]
extern "C" fn used() {
naked_asm!("ret")
}

#[unsafe(naked)]
#[no_mangle]
extern "C" fn unused() {
naked_asm!("ret")
}

#[unsafe(naked)]
#[link_section = "foobar"]
#[no_mangle]
extern "C" fn unused_link_section() {
naked_asm!("ret")
}
Comment on lines +9 to +20
Contributor Author

@bjorn3 does it make sense to you that these functions are not removed at link time on Windows? They are not present on Linux at least, but on Windows they are. So either we're missing something, or the Windows linker just keeps these functions despite them being unreachable.

Member

I bet they'd get removed before the changes in this PR. Since function sections are disabled for windows-gnu, beginning with this PR they are no longer output in separate sections, so ld.bfd doesn't GC them.
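
To illustrate the reasoning above, here is a toy model of section-level garbage collection. It is only a sketch under simplifying assumptions: the `Section` struct and `surviving_symbols` function are hypothetical, and real linkers such as ld.bfd and LLD also deal with COMDAT groups, relocations, and export rules that are ignored here. The point it shows is that a linker can only drop whole sections, so an unused function survives whenever it shares a section with a used one.

```rust
use std::collections::{HashMap, HashSet};

/// A toy input section: the symbols it defines and the symbols it references.
struct Section {
    defines: Vec<&'static str>,
    references: Vec<&'static str>,
}

/// Marks sections reachable from `roots` and returns every symbol defined in a
/// live section. Whole sections are kept or dropped, never individual symbols.
fn surviving_symbols(sections: &[Section], roots: &[&'static str]) -> HashSet<&'static str> {
    // Map each symbol to the index of the section that defines it.
    let mut owner: HashMap<&'static str, usize> = HashMap::new();
    for (i, section) in sections.iter().enumerate() {
        for &sym in &section.defines {
            owner.insert(sym, i);
        }
    }

    let mut live = vec![false; sections.len()];
    let mut worklist: Vec<&'static str> = roots.to_vec();
    while let Some(sym) = worklist.pop() {
        if let Some(&i) = owner.get(sym) {
            if !live[i] {
                live[i] = true;
                worklist.extend(sections[i].references.iter().copied());
            }
        }
    }

    let mut survivors = HashSet::new();
    for (i, section) in sections.iter().enumerate() {
        if live[i] {
            survivors.extend(section.defines.iter().copied());
        }
    }
    survivors
}

/// One section per function, as with function sections enabled.
fn per_function_layout() -> Vec<Section> {
    vec![
        Section { defines: vec!["main"], references: vec!["used"] },
        Section { defines: vec!["used"], references: vec![] },
        Section { defines: vec!["unused"], references: vec![] },
    ]
}

/// Both naked functions land in one shared section, as without function sections.
fn merged_layout() -> Vec<Section> {
    vec![
        Section { defines: vec!["main"], references: vec!["used"] },
        Section { defines: vec!["used", "unused"], references: vec![] },
    ]
}

fn main() {
    let split = surviving_symbols(&per_function_layout(), &["main"]);
    let merged = surviving_symbols(&merged_layout(), &["main"]);
    println!("per-function sections keep `unused`: {}", split.contains("unused"));
    println!("shared section keeps `unused`: {}", merged.contains("unused"));
}
```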

Contributor Author

In that case I'd expect -Zfunction-sections=yes to cause them to be removed again, but that is not what I see. In fact even unused_link_section is not removed, and it always has its own section.

I'm using cross-compilation here so maybe the issue is there? But it looks like unused sections just aren't removed. The same is true when I use normal (i.e. non-naked) functions.

Member

You are right, they aren't removed fully. Cross-compilation should be irrelevant.

Before this PR, or with -Zfunction-sections=yes, nm shows them as undefined:

❯ nm main.exe | rg unused
                 U unused
                 U unused_link_section

With this PR but without -Zfunction-sections=yes only unused_link_section is undefined:

❯ nm main.exe | rg unused
0000000140001540 T unused
                 U unused_link_section

Linking the same code with LLD (-Clink-arg=-fuse-ld=lld) doesn't remove them at all:

❯ nm main-lld.exe | rg unused
000000014009a230 T unused
00000001400d5000 T unused_link_section

In fact, LLD didn't even discard the foobar section.

> The same is true when I use normal (i.e. non-naked) functions.

With GNU ld yes, but with LLD the normal function gets removed. If I add:

#[no_mangle]
extern "C" fn unused_regular() {}

unused_regular is defined without function sections, made undefined with function sections, and removed with LLD + function sections:

Details
❯ nm main.exe | rg '\sused|unused'
                 U unused
                 U unused_link_section
0000000140001630 T unused_regular
000000014009a150 T used

❯ RUSTC_BOOTSTRAP=1 rustc --target x86_64-pc-windows-gnu main.rs -Zfunction-sections=yes

❯ nm main.exe | rg '\sused|unused'
                 U unused
                 U unused_link_section
                 U unused_regular
000000014009a148 T used

❯ RUSTC_BOOTSTRAP=1 rustc --target x86_64-pc-windows-gnu main.rs -Clink-arg=-fuse-ld=lld

❯ nm main.exe | rg '\sused|unused'
000000014009a230 T unused
00000001400d5000 T unused_link_section
0000000140001630 T unused_regular
000000014009a234 T used

❯ RUSTC_BOOTSTRAP=1 rustc --target x86_64-pc-windows-gnu main.rs -Clink-arg=-fuse-ld=lld -Zfunction-sections=yes

❯ nm main.exe | rg '\sused|unused'
000000014009a230 T unused
00000001400d5000 T unused_link_section
000000014009a234 T used

Quite a mess nonetheless.
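
The three outcomes seen in the listings above (defined, undefined, and not listed at all) can be told apart mechanically. The following is a minimal sketch, assuming the default `nm` text layout shown in this thread; `SymbolState` and `symbol_state` are hypothetical names, not part of `run_make_support`. The two sample outputs are copied from the listings above.

```rust
/// How a symbol shows up in `nm` output.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SymbolState {
    /// Listed with an address, e.g. `0000000140001630 T unused_regular`.
    Defined,
    /// Listed without an address and with type `U`.
    Undefined,
    /// Not listed at all.
    Absent,
}

/// Looks up `symbol` by exact name in the text printed by `nm`.
fn symbol_state(nm_output: &str, symbol: &str) -> SymbolState {
    for line in nm_output.lines() {
        let fields: Vec<&str> = line.split_whitespace().collect();
        match fields.as_slice() {
            // Undefined symbols have no address column: `U name`.
            [kind, name] if *kind == "U" && *name == symbol => return SymbolState::Undefined,
            // Defined symbols have three columns: `address kind name`.
            [_addr, _kind, name] if *name == symbol => return SymbolState::Defined,
            _ => {}
        }
    }
    SymbolState::Absent
}

/// GNU ld, no function sections (first listing in the details above).
const GNU_LD_DEFAULT: &str = "\
                 U unused
                 U unused_link_section
0000000140001630 T unused_regular
000000014009a150 T used
";

/// LLD with `-Zfunction-sections=yes` (last listing in the details above).
const LLD_FUNCTION_SECTIONS: &str = "\
000000014009a230 T unused
00000001400d5000 T unused_link_section
000000014009a234 T used
";

fn main() {
    for sym in ["used", "unused", "unused_link_section", "unused_regular"] {
        println!(
            "{sym}: GNU ld = {:?}, LLD + function sections = {:?}",
            symbol_state(GNU_LD_DEFAULT, sym),
            symbol_state(LLD_FUNCTION_SECTIONS, sym),
        );
    }
}
```

Matching on the exact name matters here, since a substring search for `unused` would also hit `unused_link_section` and `unused_regular`.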

Contributor Author

Thanks for that detective work!

> With GNU ld yes, but with LLD the normal function gets removed

So, that is different from the unused naked function, right? Are you able to identify what the difference is?

Regardless, what should we test for Windows here? Obviously we could ignore it entirely; that would be simplest. Alternatively we could add regular function equivalents and assert that the naked function is treated the same as a non-naked function?

Member

@mati865 mati865 Mar 30, 2026


> So, that is different from the unused naked function, right?

Correct.

> Are you able to identify what the difference is?

COMDAT usage for the regular function is the most striking difference to me:

❯ llvm-readobj -Sst main.o | less
...
  Section {
    Number: 4
    Name: .text$unused (2F 31 31 35 00 00 00 00)
...
    Characteristics [ (0x60300020)
      IMAGE_SCN_ALIGN_4BYTES (0x300000)
      IMAGE_SCN_CNT_CODE (0x20)
      IMAGE_SCN_MEM_EXECUTE (0x20000000)
      IMAGE_SCN_MEM_READ (0x40000000)
    ]
  }
...
  Section {
    Number: 22
    Name: .text$unused_regular (2F 32 34 00 00 00 00 00)
...
    Characteristics [ (0x60501020)
      IMAGE_SCN_ALIGN_16BYTES (0x500000)
      IMAGE_SCN_CNT_CODE (0x20)
      IMAGE_SCN_LNK_COMDAT (0x1000)
      IMAGE_SCN_MEM_EXECUTE (0x20000000)
      IMAGE_SCN_MEM_READ (0x40000000)
    ]
  }

With a C example:

cat main.c
void used() {}
void unused() {}
int main() { used(); return 0; }

Building with x86_64-w64-mingw32-gcc main.c -Wl,--gc-sections -ffunction-sections -fno-asynchronous-unwind-tables, I'm seeing a similar thing.
With GCC, the unused function is undefined with GNU ld and defined with LLD; with Clang, GNU ld behaves the same way but LLD removes the symbol.

Once again this seems to boil down to IMAGE_SCN_LNK_COMDAT (0x1000), which is set in the intermediate object only with Clang.
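
As a quick check of the two Characteristics values from the llvm-readobj output, the bit fields can be decoded by hand. This is a sketch only: the flag constants are the values printed in the dump above, `IMAGE_SCN_ALIGN_MASK` is the 4-bit alignment field from the PE/COFF section flags, and the helper names (`is_comdat`, `alignment_bytes`, `flag_names`) are hypothetical.

```rust
// Flag values as printed in the llvm-readobj dump.
const IMAGE_SCN_CNT_CODE: u32 = 0x0000_0020;
const IMAGE_SCN_LNK_COMDAT: u32 = 0x0000_1000;
const IMAGE_SCN_MEM_EXECUTE: u32 = 0x2000_0000;
const IMAGE_SCN_MEM_READ: u32 = 0x4000_0000;
/// The alignment is a 4-bit field (bits 20..24), not a set of independent flags.
const IMAGE_SCN_ALIGN_MASK: u32 = 0x00F0_0000;

/// True when the section is marked as COMDAT.
fn is_comdat(characteristics: u32) -> bool {
    (characteristics & IMAGE_SCN_LNK_COMDAT) != 0
}

/// Decodes the alignment field: value `n` in 1..=14 means `2^(n-1)` bytes.
fn alignment_bytes(characteristics: u32) -> Option<u32> {
    match (characteristics & IMAGE_SCN_ALIGN_MASK) >> 20 {
        n @ 1..=14 => Some(1 << (n - 1)),
        _ => None,
    }
}

/// Names of the non-alignment flags from the dump that are set.
fn flag_names(characteristics: u32) -> Vec<&'static str> {
    let table = [
        (IMAGE_SCN_CNT_CODE, "IMAGE_SCN_CNT_CODE"),
        (IMAGE_SCN_LNK_COMDAT, "IMAGE_SCN_LNK_COMDAT"),
        (IMAGE_SCN_MEM_EXECUTE, "IMAGE_SCN_MEM_EXECUTE"),
        (IMAGE_SCN_MEM_READ, "IMAGE_SCN_MEM_READ"),
    ];
    table
        .iter()
        .filter(|(bit, _)| (characteristics & *bit) != 0)
        .map(|(_, name)| *name)
        .collect()
}

fn main() {
    // `.text$unused` (naked) and `.text$unused_regular` from the dump above.
    for (name, value) in [(".text$unused", 0x6030_0020u32), (".text$unused_regular", 0x6050_1020)] {
        println!(
            "{name}: comdat={}, align={:?}, flags={:?}",
            is_comdat(value),
            alignment_bytes(value),
            flag_names(value),
        );
    }
}
```

Apart from alignment, the COMDAT bit is the only difference between the two values, which matches the observation above.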

Perhaps my memory fails me, but I think GNU ld used to remove such symbols in the past. Anyway, for this PR this solution seems nice if it doesn't add too much work:

> Alternatively we could add regular function equivalents and assert that the naked function is treated the same as a non-naked function?

CI only tests the regular x86_64-pc-windows-gnu target, which gives us no function sections and GNU ld as the linker, so in both cases the symbols will be undefined (assuming that the older ld installed on CI also doesn't get rid of the symbol).
With the x86_64-pc-windows-gnullvm target this test would fail because the regular function would be removed. But since it is not run on CI, this is actually a good thing, showing potential for improvement.


fn main() {
used();
}
10 changes: 10 additions & 0 deletions tests/run-make/naked-dead-code-elimination/rmake.rs
@@ -0,0 +1,10 @@
//@ needs-asm-support

use run_make_support::symbols::object_contains_any_symbol;
use run_make_support::{bin_name, rustc};

fn main() {
rustc().input("main.rs").opt().run();
let mut unused = vec!["unused", "unused_link_section"];
assert!(!object_contains_any_symbol(bin_name("main"), &unused));
}