Skip to content

[lld][COFF] Restore lto-embed-bitcode and -fembed-bitcode Bitcode Embedding Features#188398

Merged
mstorsjo merged 5 commits intollvm:mainfrom
realoriginal:llvm-22.x-restore-embed-bitcode-feature
Mar 30, 2026
Merged

[lld][COFF] Restore lto-embed-bitcode and -fembed-bitcode Bitcode Embedding Features#188398
mstorsjo merged 5 commits intollvm:mainfrom
realoriginal:llvm-22.x-restore-embed-bitcode-feature

Conversation

@realoriginal
Copy link
Copy Markdown
Contributor

@realoriginal realoriginal commented Mar 25, 2026

Removes the patches introduced by #150897 which broke LTO embed documented features for creating whole-program-bitcode representations of executables, used in production analysis/rewriting toolsets. This was a documented feature available up until 21.1.8 broken by 22.x release.

This previously allowed the users to have a whole-program-bitcode section .llvmbc embedded inside of the final executable.

@realoriginal realoriginal changed the title [lld][COFF] Restore lto-embed-bitcode and -fembed-bitcode Bitcode Embedding Features From [lld][COFF] Restore lto-embed-bitcode and -fembed-bitcode Bitcode Embedding Features Mar 25, 2026
@realoriginal
Copy link
Copy Markdown
Contributor Author

@mstorsjo any way we can get this merged?

Copy link
Copy Markdown
Contributor

@HaohaiWen HaohaiWen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will always keep .llvmbc and .llvmcmd in the final binary if -fembed-bitcode is presented.
As I mentioned in the #150897 (comment). Sometimes, we just want it exist in the object/bitcode but not in the final binary to avoid 2GB PE limitation.

Name == getInstrProfSectionName(IPSK_covname, Triple::COFF,
/*AddSegmentInfo=*/false) ||
Name == ".llvmbc" || Name == ".llvmcmd")
/*AddSegmentInfo=*/false))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They're metadata, right?

Copy link
Copy Markdown
Contributor Author

@realoriginal realoriginal Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it has to be metadata, I'm fine with that, as long it ends up in the final binary as per lto-embed-bitcode's documented feature allows so that myself and others whom build tools on this documented assumption can extract it

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting is to metadata does not mean lld will drop them automatically.
I think they should be metadata (IMAGE_SCN_MEM_DISCARDABLEj).

Copy link
Copy Markdown
Contributor Author

@realoriginal realoriginal Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll wait for what @MaskRay says, so we can get this merged in and fixed quickly

Copy link
Copy Markdown
Contributor

@HaohaiWen HaohaiWen Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverting this also affect our production, as bitcode/cmdline sections are usually too big to be embedded in. Sometimes lld even failed for large application.

if (fileSize > UINT32_MAX)
Fatal(ctx) << "image size (" << fileSize << ") "
<< "exceeds maximum allowable size (" << UINT32_MAX << ")";

-lto-embed-bitcode is indeed an issue that I did not aware of. I aggress to revert the lld code, and I suggest to add an /strip-embedded-bitcode option so that both products can work.
Regarding metadata, I don't think this will affect existence of those sections. ELF also set them as metadata sections.

Copy link
Copy Markdown
Member

@aganea aganea Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need these sections to remain in the .EXE file? I think it could be straightforward to move them in a PDB named stream? For the reasons @HaohaiWen mentionned, ie. PE has a hard size limit, whereas PDB has a 16 TB limit (when using /PDBPAGESIZE:xxx).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It’s just kind of strange to completely break a documented linker flag that is fully optional and only affects people who explicitly set it.

For ‘-fembed-bitcode` it’s a different story, I wouldn’t expect that to end up in the final executable since enabling it signals you are processing the objects individually anyway.

@HaohaiWen HaohaiWen requested a review from MaskRay March 25, 2026 03:33
@realoriginal
Copy link
Copy Markdown
Contributor Author

realoriginal commented Mar 25, 2026

@HaohaiWen I would agree with you if the case was it was always embedded, however as far as I've seen there is no code path I've seen that leads to this case. Theres alternatives if you just want the bitcode by itself without it being in the final binary, and the linker flag -lto-embed-bitcode is supposed to embed the bitcode in the final binary. If you don't specify that, it shouldn't end up in the final binary. As far as I've ever seen anyway.

@HaohaiWen
Copy link
Copy Markdown
Contributor

HaohaiWen commented Mar 25, 2026

How about adding a new option to explicitly drop it in lld? e.g.
/strip-embedded-bitcode

@realoriginal realoriginal force-pushed the llvm-22.x-restore-embed-bitcode-feature branch from 951486c to 16b841f Compare March 25, 2026 13:27
@realoriginal
Copy link
Copy Markdown
Contributor Author

I've reverted the metadata patch so its re-applied, at this point, does at least restoring the original behavior and taking care of an additional flag fine as a seperate PR seem sufficient so this can be merged? @HaohaiWen / @aganea / @MaskRay / @mstorsjo

@llvmbot
Copy link
Copy Markdown
Member

llvmbot commented Mar 25, 2026

@llvm/pr-subscribers-lld-coff

Author: Austin Hudson (realoriginal)

Changes

Removes the patches introduced by #150879 and #150897 which broke LTO embed documented features for creating whole-program-bitcode representations of executables, used in production analysis/rewriting toolsets. This was a documented feature available up until 21.1.8 broken by 22.x release.

This previously allowed the users to have a whole-program-bitcode section .llvmbc embedded inside of the final executable.

If there is anything else necessary to getting this merged, please let me know as this feature is critical to myself and others.


Full diff: https://github.com/llvm/llvm-project/pull/188398.diff

2 Files Affected:

  • (modified) lld/COFF/InputFiles.cpp (-5)
  • (removed) lld/test/COFF/embed-bitcode.test (-30)
diff --git a/lld/COFF/InputFiles.cpp b/lld/COFF/InputFiles.cpp
index 492b2ad80166d..4320a39644616 100644
--- a/lld/COFF/InputFiles.cpp
+++ b/lld/COFF/InputFiles.cpp
@@ -404,11 +404,6 @@ SectionChunk *ObjFile::readSection(uint32_t sectionNumber,
     return nullptr;
   }
 
-  // Those sections are generated by -fembed-bitcode and do not need to be kept
-  // in executable files.
-  if (name == ".llvmbc" || name == ".llvmcmd")
-    return nullptr;
-
   // Object files may have DWARF debug info or MS CodeView debug info
   // (or both).
   //
diff --git a/lld/test/COFF/embed-bitcode.test b/lld/test/COFF/embed-bitcode.test
deleted file mode 100644
index 10f88c5c0117d..0000000000000
--- a/lld/test/COFF/embed-bitcode.test
+++ /dev/null
@@ -1,30 +0,0 @@
-# RUN: yaml2obj %s -o %t.obj
-# RUN: lld-link /entry:main /subsystem:console /out:%t.exe %t.obj
-# RUN: llvm-readobj -S %t.exe | FileCheck %s
-
-# CHECK-NOT: Name: .llvmbc
-# CHECK-NOT: Name: .llvmcmd
-
---- !COFF
-header:
-  Machine:         IMAGE_FILE_MACHINE_AMD64
-
-sections:
-  - Name:            .text
-    Characteristics: [ IMAGE_SCN_CNT_CODE, IMAGE_SCN_MEM_EXECUTE, IMAGE_SCN_MEM_READ ]
-    SectionData:     "C3"
-  - Name:           .llvmbc
-    Characteristics: [ IMAGE_SCN_MEM_DISCARDABLE ]
-    SectionData:     "4243C0DE"
-  - Name:           .llvmcmd
-    Characteristics: [ IMAGE_SCN_MEM_DISCARDABLE ]
-    SectionData:     "2D63633100"
-
-symbols:
-  - Name:            main
-    Value:           0
-    SectionNumber:   1
-    SimpleType:      IMAGE_SYM_TYPE_NULL
-    ComplexType:     IMAGE_SYM_DTYPE_FUNCTION
-    StorageClass:    IMAGE_SYM_CLASS_EXTERNAL
-...

@llvmbot
Copy link
Copy Markdown
Member

llvmbot commented Mar 25, 2026

@llvm/pr-subscribers-platform-windows

Author: Austin Hudson (realoriginal)

Changes

Removes the patches introduced by #150879 and #150897 which broke LTO embed documented features for creating whole-program-bitcode representations of executables, used in production analysis/rewriting toolsets. This was a documented feature available up until 21.1.8 broken by 22.x release.

This previously allowed the users to have a whole-program-bitcode section .llvmbc embedded inside of the final executable.

If there is anything else necessary to getting this merged, please let me know as this feature is critical to myself and others.


Full diff: https://github.com/llvm/llvm-project/pull/188398.diff

2 Files Affected:

  • (modified) lld/COFF/InputFiles.cpp (-5)
  • (removed) lld/test/COFF/embed-bitcode.test (-30)
diff --git a/lld/COFF/InputFiles.cpp b/lld/COFF/InputFiles.cpp
index 492b2ad80166d..4320a39644616 100644
--- a/lld/COFF/InputFiles.cpp
+++ b/lld/COFF/InputFiles.cpp
@@ -404,11 +404,6 @@ SectionChunk *ObjFile::readSection(uint32_t sectionNumber,
     return nullptr;
   }
 
-  // Those sections are generated by -fembed-bitcode and do not need to be kept
-  // in executable files.
-  if (name == ".llvmbc" || name == ".llvmcmd")
-    return nullptr;
-
   // Object files may have DWARF debug info or MS CodeView debug info
   // (or both).
   //
diff --git a/lld/test/COFF/embed-bitcode.test b/lld/test/COFF/embed-bitcode.test
deleted file mode 100644
index 10f88c5c0117d..0000000000000
--- a/lld/test/COFF/embed-bitcode.test
+++ /dev/null
@@ -1,30 +0,0 @@
-# RUN: yaml2obj %s -o %t.obj
-# RUN: lld-link /entry:main /subsystem:console /out:%t.exe %t.obj
-# RUN: llvm-readobj -S %t.exe | FileCheck %s
-
-# CHECK-NOT: Name: .llvmbc
-# CHECK-NOT: Name: .llvmcmd
-
---- !COFF
-header:
-  Machine:         IMAGE_FILE_MACHINE_AMD64
-
-sections:
-  - Name:            .text
-    Characteristics: [ IMAGE_SCN_CNT_CODE, IMAGE_SCN_MEM_EXECUTE, IMAGE_SCN_MEM_READ ]
-    SectionData:     "C3"
-  - Name:           .llvmbc
-    Characteristics: [ IMAGE_SCN_MEM_DISCARDABLE ]
-    SectionData:     "4243C0DE"
-  - Name:           .llvmcmd
-    Characteristics: [ IMAGE_SCN_MEM_DISCARDABLE ]
-    SectionData:     "2D63633100"
-
-symbols:
-  - Name:            main
-    Value:           0
-    SectionNumber:   1
-    SimpleType:      IMAGE_SYM_TYPE_NULL
-    ComplexType:     IMAGE_SYM_DTYPE_FUNCTION
-    StorageClass:    IMAGE_SYM_CLASS_EXTERNAL
-...

@llvmbot
Copy link
Copy Markdown
Member

llvmbot commented Mar 25, 2026

@llvm/pr-subscribers-lld

Author: Austin Hudson (realoriginal)

Changes

Removes the patches introduced by #150879 and #150897 which broke LTO embed documented features for creating whole-program-bitcode representations of executables, used in production analysis/rewriting toolsets. This was a documented feature available up until 21.1.8 broken by 22.x release.

This previously allowed the users to have a whole-program-bitcode section .llvmbc embedded inside of the final executable.

If there is anything else necessary to getting this merged, please let me know as this feature is critical to myself and others.


Full diff: https://github.com/llvm/llvm-project/pull/188398.diff

2 Files Affected:

  • (modified) lld/COFF/InputFiles.cpp (-5)
  • (removed) lld/test/COFF/embed-bitcode.test (-30)
diff --git a/lld/COFF/InputFiles.cpp b/lld/COFF/InputFiles.cpp
index 492b2ad80166d..4320a39644616 100644
--- a/lld/COFF/InputFiles.cpp
+++ b/lld/COFF/InputFiles.cpp
@@ -404,11 +404,6 @@ SectionChunk *ObjFile::readSection(uint32_t sectionNumber,
     return nullptr;
   }
 
-  // Those sections are generated by -fembed-bitcode and do not need to be kept
-  // in executable files.
-  if (name == ".llvmbc" || name == ".llvmcmd")
-    return nullptr;
-
   // Object files may have DWARF debug info or MS CodeView debug info
   // (or both).
   //
diff --git a/lld/test/COFF/embed-bitcode.test b/lld/test/COFF/embed-bitcode.test
deleted file mode 100644
index 10f88c5c0117d..0000000000000
--- a/lld/test/COFF/embed-bitcode.test
+++ /dev/null
@@ -1,30 +0,0 @@
-# RUN: yaml2obj %s -o %t.obj
-# RUN: lld-link /entry:main /subsystem:console /out:%t.exe %t.obj
-# RUN: llvm-readobj -S %t.exe | FileCheck %s
-
-# CHECK-NOT: Name: .llvmbc
-# CHECK-NOT: Name: .llvmcmd
-
---- !COFF
-header:
-  Machine:         IMAGE_FILE_MACHINE_AMD64
-
-sections:
-  - Name:            .text
-    Characteristics: [ IMAGE_SCN_CNT_CODE, IMAGE_SCN_MEM_EXECUTE, IMAGE_SCN_MEM_READ ]
-    SectionData:     "C3"
-  - Name:           .llvmbc
-    Characteristics: [ IMAGE_SCN_MEM_DISCARDABLE ]
-    SectionData:     "4243C0DE"
-  - Name:           .llvmcmd
-    Characteristics: [ IMAGE_SCN_MEM_DISCARDABLE ]
-    SectionData:     "2D63633100"
-
-symbols:
-  - Name:            main
-    Value:           0
-    SectionNumber:   1
-    SimpleType:      IMAGE_SYM_TYPE_NULL
-    ComplexType:     IMAGE_SYM_DTYPE_FUNCTION
-    StorageClass:    IMAGE_SYM_CLASS_EXTERNAL
-...

@mstorsjo
Copy link
Copy Markdown
Member

If @HaohaiWen is ok with doing this (and looking into adding an option for this afterwards), then I guess we can go ahead and merge this (after a couple of CI checks run complete).

@mstorsjo
Copy link
Copy Markdown
Member

@realoriginal For merging, I noticed that you email address on github is set to private; see https://github.com/llvm/llvm-project/blob/main/.github/workflows/email-check.yaml#L34-L37 for our policy on the matter.

@realoriginal
Copy link
Copy Markdown
Contributor Author

Good to go, I've made it public @mstorsjo

# RUN: lld-link /entry:main /subsystem:console /out:%t.exe %t.obj
# RUN: llvm-readobj -S %t.exe | FileCheck %s

# CHECK-NOT: Name: .llvmbc
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of deleting the test, you can check that they are still present

@mstorsjo mstorsjo merged commit 1e99c9e into llvm:main Mar 30, 2026
13 checks passed
@mstorsjo
Copy link
Copy Markdown
Member

As the regression was in 22.x, I think this should be backported as well.

@github-project-automation github-project-automation bot moved this to Needs Triage in LLVM Release Status Mar 30, 2026
@github-project-automation github-project-automation bot moved this from Needs Triage to Done in LLVM Release Status Mar 30, 2026
@mstorsjo
Copy link
Copy Markdown
Member

/cherry-pick 1e99c9e

@llvmbot
Copy link
Copy Markdown
Member

llvmbot commented Mar 30, 2026

/pull-request #189375

HaohaiWen added a commit to HaohaiWen/llvm-project that referenced this pull request Mar 31, 2026
Add a -strip-embedded-bitcode option so that users who compile with
-fembed-bitcode but don't want these sections in the final binary can
explicitly opt out, while preserving the default behavior.

The .llvmbc and .llvmcmd sections was previously stripped from the final
binary unconditionally (llvm#150897). However, this broke the workflow of
-lto-embed-bitcode and llvm#188398 reverted it.

The test of this PR is from llvm#150897.
c-rhodes pushed a commit to llvmbot/llvm-project that referenced this pull request Mar 31, 2026
… Embedding Features (llvm#188398)

Removes the patches introduced by llvm#150897 which broke LTO embed
documented features for creating whole-program-bitcode representations
of executables, used in production analysis/rewriting toolsets. This was
a documented feature available up until 21.1.8 broken by 22.x release.

This previously allowed the users to have a whole-program-bitcode
section `.llvmbc` embedded inside of the final executable.

(cherry picked from commit 1e99c9e)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Development

Successfully merging this pull request may close these issues.

7 participants