
[lld][COFF] Add /discard option to discard input sections by name #189542

Open
HaohaiWen wants to merge 2 commits into llvm:main from HaohaiWen:cmd


@HaohaiWen
Contributor

@HaohaiWen HaohaiWen commented Mar 31, 2026

This provides a general mechanism similar to ELF linker scripts'
/DISCARD/ for COFF. Though the intention is to explicitly discard
.llvmbc and .llvmcmd sections. (See discussion in #150897, #188398
for more details.)

Add a -strip-embedded-bitcode option so that users who compile with
-fembed-bitcode but don't want these sections in the final binary can
explicitly opt out, while preserving the default behavior.

The .llvmbc and .llvmcmd sections were previously stripped from the final
binary unconditionally (llvm#150897). However, this broke the
-lto-embed-bitcode workflow, and llvm#188398 reverted it.

The test in this PR is taken from llvm#150897.
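
To illustrate the name-based discard described above, here is a minimal standalone sketch. It is not lld's code: `DiscardConfig`, `shouldDiscard`, and `keptSections` are hypothetical names, and a real implementation would hook into `ObjFile::readSection` rather than filter a list of strings.

```cpp
#include <string>
#include <string_view>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for the linker configuration; not lld's real type.
struct DiscardConfig {
  // Section names the user asked to discard, e.g. ".llvmbc".
  std::unordered_set<std::string> discardSections;
};

// Returns true when an input section with this name should be dropped.
inline bool shouldDiscard(const DiscardConfig &cfg, std::string_view name) {
  return cfg.discardSections.count(std::string(name)) != 0;
}

// Keeps only the section names that are not listed for discarding.
inline std::vector<std::string>
keptSections(const DiscardConfig &cfg, const std::vector<std::string> &inputs) {
  std::vector<std::string> kept;
  for (const std::string &name : inputs)
    if (!shouldDiscard(cfg, name))
      kept.push_back(name);
  return kept;
}
```

With an empty set nothing is discarded, which matches the goal of preserving the default behavior.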
@llvmbot
Member

llvmbot commented Mar 31, 2026

@llvm/pr-subscribers-lld-coff
@llvm/pr-subscribers-platform-windows

@llvm/pr-subscribers-lld

Author: Haohai Wen (HaohaiWen)

Changes

Add a -strip-embedded-bitcode option so that users who compile with
-fembed-bitcode but don't want these sections in the final binary can
explicitly opt out, while preserving the default behavior.

The .llvmbc and .llvmcmd sections were previously stripped from the final
binary unconditionally (#150897). However, this broke the
-lto-embed-bitcode workflow, and #188398 reverted it.

The test in this PR is taken from #150897.


Full diff: https://github.com/llvm/llvm-project/pull/189542.diff

5 Files Affected:

  • (modified) lld/COFF/Config.h (+1)
  • (modified) lld/COFF/Driver.cpp (+3)
  • (modified) lld/COFF/InputFiles.cpp (+4)
  • (modified) lld/COFF/Options.td (+2)
  • (added) lld/test/COFF/embed-bitcode.test (+39)
diff --git a/lld/COFF/Config.h b/lld/COFF/Config.h
index 1c0f874ddfd79..c15e11444092e 100644
--- a/lld/COFF/Config.h
+++ b/lld/COFF/Config.h
@@ -134,6 +134,7 @@ struct Configuration {
   bool forceUnresolved = false;
   bool debug = false;
   bool includeDwarfChunks = false;
+  bool stripEmbeddedBitcode = false;
   bool debugGHashes = false;
   bool writeSymtab = false;
   bool driver = false;
diff --git a/lld/COFF/Driver.cpp b/lld/COFF/Driver.cpp
index df76f05ed5a06..9416f1c1147bc 100644
--- a/lld/COFF/Driver.cpp
+++ b/lld/COFF/Driver.cpp
@@ -2129,6 +2129,9 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
   if (auto *arg = args.getLastArg(OPT_sectionlayout))
     parseSectionLayout(arg->getValue());
 
+  // Handle /strip-embedded-bitcode
+  config->stripEmbeddedBitcode = args.hasArg(OPT_strip_embedded_bitcode);
+
   // Handle /align
   if (auto *arg = args.getLastArg(OPT_align)) {
     parseNumbers(arg->getValue(), &config->align);
diff --git a/lld/COFF/InputFiles.cpp b/lld/COFF/InputFiles.cpp
index 0cc3aaeba41e3..3f9eac77f1fc0 100644
--- a/lld/COFF/InputFiles.cpp
+++ b/lld/COFF/InputFiles.cpp
@@ -403,6 +403,10 @@ SectionChunk *ObjFile::readSection(uint32_t sectionNumber,
     return nullptr;
   }
 
+  if (symtab.ctx.config.stripEmbeddedBitcode &&
+      (name == ".llvmbc" || name == ".llvmcmd"))
+    return nullptr;
+
   // Object files may have DWARF debug info or MS CodeView debug info
   // (or both).
   //
diff --git a/lld/COFF/Options.td b/lld/COFF/Options.td
index fb762b880c2cb..a2f08726d6db9 100644
--- a/lld/COFF/Options.td
+++ b/lld/COFF/Options.td
@@ -110,6 +110,8 @@ def pdbstream : Joined<["/", "-", "/?", "-?"], "pdbstream:">,
 def section : P<"section", "Specify section attributes">;
 def sectionlayout : P<"sectionlayout", "Specifies the layout strategy for output sections">;
 def stack   : P<"stack", "Size of the stack">;
+def strip_embedded_bitcode : F<"strip-embedded-bitcode">,
+    HelpText<"Strip .llvmbc and .llvmcmd sections from the output">;
 def stub    : P<"stub", "Specify DOS stub file">;
 def subsystem : P<"subsystem", "Specify subsystem">;
 def timestamp : P<"timestamp", "Specify the PE header timestamp">;
diff --git a/lld/test/COFF/embed-bitcode.test b/lld/test/COFF/embed-bitcode.test
new file mode 100644
index 0000000000000..d452c35fd9f82
--- /dev/null
+++ b/lld/test/COFF/embed-bitcode.test
@@ -0,0 +1,39 @@
+# RUN: yaml2obj %s -o %t.obj
+
+## By default, .llvmbc and .llvmcmd sections are preserved.
+# RUN: lld-link /entry:main /subsystem:console /out:%t.exe %t.obj
+# RUN: llvm-readobj -S %t.exe | FileCheck --check-prefix=PRESERVE %s
+
+# PRESERVE: Name: .llvmbc
+# PRESERVE: Name: .llvmcmd
+
+## -strip-embedded-bitcode strips .llvmbc and .llvmcmd sections.
+# RUN: lld-link /entry:main /subsystem:console /strip-embedded-bitcode /out:%t-stripped.exe %t.obj
+# RUN: llvm-readobj -S %t-stripped.exe | FileCheck --check-prefix=STRIP %s
+
+# STRIP-NOT: Name: .llvmbc
+# STRIP-NOT: Name: .llvmcmd
+
+--- !COFF
+header:
+  Machine:         IMAGE_FILE_MACHINE_AMD64
+
+sections:
+  - Name:            .text
+    Characteristics: [ IMAGE_SCN_CNT_CODE, IMAGE_SCN_MEM_EXECUTE, IMAGE_SCN_MEM_READ ]
+    SectionData:     "C3"
+  - Name:           .llvmbc
+    Characteristics: [ IMAGE_SCN_MEM_DISCARDABLE ]
+    SectionData:     "4243C0DE"
+  - Name:           .llvmcmd
+    Characteristics: [ IMAGE_SCN_MEM_DISCARDABLE ]
+    SectionData:     "2D63633100"
+
+symbols:
+  - Name:            main
+    Value:           0
+    SectionNumber:   1
+    SimpleType:      IMAGE_SYM_TYPE_NULL
+    ComplexType:     IMAGE_SYM_DTYPE_FUNCTION
+    StorageClass:    IMAGE_SYM_CLASS_EXTERNAL
+...

Contributor

Copilot AI left a comment


Pull request overview

Adds a COFF lld-link switch to optionally strip embedded bitcode sections (.llvmbc / .llvmcmd) from the final output while keeping the default behavior of preserving them.

Changes:

  • Introduce /strip-embedded-bitcode (also accepted with -) to remove .llvmbc and .llvmcmd from output.
  • Implement stripping by skipping those sections during COFF object section ingestion when the flag is enabled.
  • Add a COFF regression test covering default “preserve” behavior and the new “strip” behavior.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

Summary per file:

  • lld/test/COFF/embed-bitcode.test: Verifies embedded bitcode sections are kept by default and removed with /strip-embedded-bitcode.
  • lld/COFF/Options.td: Defines the new strip-embedded-bitcode driver flag and help text.
  • lld/COFF/InputFiles.cpp: Skips .llvmbc / .llvmcmd sections when stripEmbeddedBitcode is enabled.
  • lld/COFF/Driver.cpp: Parses the new option and sets config->stripEmbeddedBitcode.
  • lld/COFF/Config.h: Adds the stripEmbeddedBitcode configuration field.

@zmodem
Collaborator

zmodem commented Mar 31, 2026

Can you add more context on your use case? Isn't the point of -fembed-bitcode that the bitcode gets embedded in the final executable or library?

Do the other lld versions (ELF and Mach-O) have a similar flag?

@HaohaiWen
Contributor Author

Can you add more context on your use case? Isn't the point of -fembed-bitcode that the bitcode gets embedded in the final executable or library?

For a large application, we get bitcode/cmd from the object files, or from the bitcode files in LTO mode. Embedding the concatenated bitcode/cmd into the final binary usually bloats it past the 2 GB PE size limit on Windows (https://learn.microsoft.com/en-us/windows/win32/debug/pe-format#optional-header-image-only).
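
As a rough illustration of the arithmetic, the sketch below sums per-object embedded bitcode sizes on top of the regular image size and compares the result to a limit. The 2 GiB constant is an assumption based on the figure quoted above, and `totalSize` and `exceedsLimit` are hypothetical helpers, not lld functions.

```cpp
#include <cstdint>
#include <vector>

// Assumed limit: the "2 GB" quoted in the comment, read here as 2 GiB.
constexpr std::uint64_t kAssumedImageLimit = 2ull * 1024 * 1024 * 1024;

// Sums the sizes of the embedded bitcode payloads, which end up
// concatenated in the output when every object contributes one.
inline std::uint64_t totalSize(const std::vector<std::uint64_t> &sizes) {
  std::uint64_t total = 0;
  for (std::uint64_t s : sizes)
    total += s;
  return total;
}

// Returns true when code/data plus all embedded bitcode exceeds the limit.
inline bool exceedsLimit(std::uint64_t codeAndData,
                         const std::vector<std::uint64_t> &bitcodeSizes) {
  return codeAndData + totalSize(bitcodeSizes) > kAssumedImageLimit;
}
```

In this model an image that fits on its own can still cross the limit once the embedded payloads are added, which is the case the option is meant to avoid.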

Do the other lld versions (ELF and Mach-O) have a similar flag?

For ELF, please refer to #150897 (comment)
For wasm, they have been removed just like the previous implementation #150897:

// These custom sections are generated by `clang -fembed-bitcode`.
// These are used by the rust toolchain to ship LTO data along with
// compiled object code, but they don't want this included in the linker
// output.
if (name == ".llvmbc" || name == ".llvmcmd")
  continue;

For MachO I haven't seen any explicit handling for them.

@aganea
Member

aganea commented Mar 31, 2026

Can you add more context on your use case? Isn't the point of -fembed-bitcode that the bitcode gets embedded in the final executable or library?

We get bitcode/cmd from object file or bitcode (LTO mode) for large application.

Do you mean from external libraries where you don’t control the build flags?

@rnk
Collaborator

rnk commented Mar 31, 2026

I think it's worth it. I have a low bar for adding flags, I bias to "yes". LLD COFF doesn't have linker scripts, so there's no simple general way to discard sections that you don't want.

I can imagine use cases where you want to get bitcode for some interesting large product executable, but you share objects with a collection of test executables. Say, I'm interested in clang bitcode analysis, but not opt, llc, FileCheck etc, so I drop the bitcode for those exes, and retain it for clang, and I don't want to compile twice.

If we wanted a more general mechanism, I think a flag similar to /merge:.sec1=.sec2 would be the way to go, something like /discard:.llvmbc.
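
A minimal sketch of how an argument of that shape could be split into a section name is shown below. It assumes the `/discard:NAME` spelling floated above and a `/` or `-` prefix; `parseDiscardArg` is a hypothetical helper, and lld itself would declare the option in Options.td rather than parse strings by hand.

```cpp
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper: extracts NAME from "/discard:NAME" or "-discard:NAME".
// Returns std::nullopt for any other argument or for an empty name.
inline std::optional<std::string> parseDiscardArg(std::string_view arg) {
  // Accept either option prefix.
  if (arg.empty() || (arg.front() != '/' && arg.front() != '-'))
    return std::nullopt;
  arg.remove_prefix(1);

  // The option name must match exactly (case-sensitive in this sketch).
  constexpr std::string_view key = "discard:";
  if (arg.substr(0, key.size()) != key)
    return std::nullopt;
  arg.remove_prefix(key.size());

  // An empty section name is rejected.
  if (arg.empty())
    return std::nullopt;
  return std::string(arg);
}
```

Passing the flag once per section, for example `/discard:.llvmbc /discard:.llvmcmd`, would cover the embedded bitcode case under this assumed syntax.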

@HaohaiWen
Contributor Author

Can you add more context on your use case? Isn't the point of -fembed-bitcode that the bitcode gets embedded in the final executable or library?

We get bitcode/cmd from object file or bitcode (LTO mode) for large application.

Do you mean from external libraries where you don’t control the build flags?

No. We only need the .llvmbc and .llvmcmd sections from the inputs (obj/bc) to lld.

This provides a general mechanism similar to ELF linker scripts'
/DISCARD/ for COFF. Though the intention is to explicitly discard
.llvmbc and .llvmcmd sections. (See discussion in llvm#150897, llvm#188398
for more details.)
@HaohaiWen
Contributor Author

I think it's worth it. I have a low bar for adding flags, I bias to "yes". LLD COFF doesn't have linker scripts, so there's no simple general way to discard sections that you don't want.

I can imagine use cases where you want to get bitcode for some interesting large product executable, but you share objects with a collection of test executables. Say, I'm interested in clang bitcode analysis, but not opt, llc, FileCheck etc, so I drop the bitcode for those exes, and retain it for clang, and I don't want to compile twice.

If we wanted a more general mechanism, I think a flag similar to /merge:.sec1=.sec2 would be the way to go, something like /discard:.llvmbc.

Good suggestion. Let's use /discard: to make it more general.

@HaohaiWen HaohaiWen changed the title [lld][COFF] Add -strip-embedded-bitcode option [lld][COFF] Add /discard option to discard input sections by name Apr 1, 2026
@MaskRay
Member

MaskRay commented Apr 1, 2026

GNU ld has --discard-all and --discard-locals. For the mingw port --discard-section would be better.

For the link.exe-like interface, consider /discard-section:?
