[LLD][COFF] Discard .llvmbc and .llvmcmd sections #150897
Those sections are generated by -fembed-bitcode and do not need to be kept in executable files.
@llvm/pr-subscribers-platform-windows @llvm/pr-subscribers-lld
Author: Haohai Wen (HaohaiWen)
Changes: Those sections are generated by -fembed-bitcode and do not need to be kept in executable files.
Full diff: https://github.com/llvm/llvm-project/pull/150897.diff
2 Files Affected:
diff --git a/lld/COFF/InputFiles.cpp b/lld/COFF/InputFiles.cpp
index 2a6b63cbacca1..c08099b8810bb 100644
--- a/lld/COFF/InputFiles.cpp
+++ b/lld/COFF/InputFiles.cpp
@@ -403,6 +403,11 @@ SectionChunk *ObjFile::readSection(uint32_t sectionNumber,
     return nullptr;
   }
 
+  // Those sections are generated by -fembed-bitcode and do not need to be kept
+  // in executable files.
+  if (name == ".llvmbc" || name == ".llvmcmd")
+    return nullptr;
+
   // Object files may have DWARF debug info or MS CodeView debug info
   // (or both).
   //
diff --git a/lld/test/COFF/embed-bitcode.test b/lld/test/COFF/embed-bitcode.test
new file mode 100644
index 0000000000000..10f88c5c0117d
--- /dev/null
+++ b/lld/test/COFF/embed-bitcode.test
@@ -0,0 +1,30 @@
+# RUN: yaml2obj %s -o %t.obj
+# RUN: lld-link /entry:main /subsystem:console /out:%t.exe %t.obj
+# RUN: llvm-readobj -S %t.exe | FileCheck %s
+
+# CHECK-NOT: Name: .llvmbc
+# CHECK-NOT: Name: .llvmcmd
+
+--- !COFF
+header:
+  Machine:         IMAGE_FILE_MACHINE_AMD64
+
+sections:
+  - Name:            .text
+    Characteristics: [ IMAGE_SCN_CNT_CODE, IMAGE_SCN_MEM_EXECUTE, IMAGE_SCN_MEM_READ ]
+    SectionData:     "C3"
+  - Name:            .llvmbc
+    Characteristics: [ IMAGE_SCN_MEM_DISCARDABLE ]
+    SectionData:     "4243C0DE"
+  - Name:            .llvmcmd
+    Characteristics: [ IMAGE_SCN_MEM_DISCARDABLE ]
+    SectionData:     "2D63633100"
+
+symbols:
+  - Name:            main
+    Value:           0
+    SectionNumber:   1
+    SimpleType:      IMAGE_SYM_TYPE_NULL
+    ComplexType:     IMAGE_SYM_DTYPE_FUNCTION
+    StorageClass:    IMAGE_SYM_CLASS_EXTERNAL
+...
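For readers skimming the diff, the effect of the new check can be sketched outside of LLD. This is only an illustration under assumptions: `isEmbeddedBitcodeSection` and `keptSections` are hypothetical helpers, not LLD functions, and the real patch returns `nullptr` from `ObjFile::readSection` rather than filtering a list of names.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper (not part of LLD): the same name comparison the patch
// adds to ObjFile::readSection.
constexpr bool isEmbeddedBitcodeSection(std::string_view name) {
  return name == ".llvmbc" || name == ".llvmcmd";
}

// Hypothetical model of the link step: keep every input section name except
// the two sections produced by -fembed-bitcode.
std::vector<std::string> keptSections(const std::vector<std::string> &inputs) {
  std::vector<std::string> kept;
  for (const std::string &name : inputs)
    if (!isEmbeddedBitcodeSection(name))
      kept.push_back(name);
  return kept;
}
```

Fed the three section names from the test input (`.text`, `.llvmbc`, `.llvmcmd`), this model keeps only `.text`, which is what the two `CHECK-NOT` lines assert about the linked executable.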
@mstorsjo Could this be reverted? This ends up breaking |
@HaohaiWen who authored this PR, can you comment? Actually, that makes me wonder what the motivation for this whole PR is to begin with: If building with |
Some discussion about this: https://discourse.llvm.org/t/end-to-end-fembed-bitcode-llvmbc-and-llvmcmd/56265/3
If we compile the source code with -fembed-bitcode, what ends up inside the final executable is the merged .llvmbc and .llvmcmd sections. There's no guaranteed mapping between the N-th command line and the N-th bitcode module unless the order is preserved. LLVM also doesn't have an existing API to split them apart from the final executable. Those two sections occupy a lot of space, making it easier to exceed the 2GB limit. I wonder what the motivation is for using the merged .llvmbc and .llvmcmd in the final binary? I think using bitcode/object files is friendlier and easier.
However, when I request it to embed bitcode, I need it to embed bitcode. LLVM doesn't intentionally do that unless you ask for it.
@mstorsjo I'll submit a quick PR removing this patch, since it appears it broke more than just
Just FYI the |
They were also removed from wasm already. |
This reverts commit 41f3332.
I was also wary of this change, as I mentioned in this related comment. The points raised by @realoriginal and @mrexodia reinforce that certain workflows rely on the concatenated bitcode kept in the .llvmbc section. lld/ELF’s support for linker scripts allows for fine-grained control, such as using /DISCARD/ to remove unwanted input sections.
I've created #188398 to fix this, so hopefully we can get it merged before the next version.
… Embedding Features (#188398)
Removes the patches introduced by #150897, which broke the documented LTO bitcode-embedding features used to create whole-program-bitcode representations of executables in production analysis/rewriting toolsets. This was a documented feature available up to 21.1.8 and broken by the 22.x release. It previously allowed users to have a whole-program-bitcode section `.llvmbc` embedded inside the final executable.
Add a -strip-embedded-bitcode option so that users who compile with -fembed-bitcode but don't want these sections in the final binary can explicitly opt out, while preserving the default behavior. The .llvmbc and .llvmcmd sections were previously stripped from the final binary unconditionally (llvm#150897). However, this broke the -lto-embed-bitcode workflow, and llvm#188398 reverted it. The test in this PR is taken from llvm#150897.
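The difference between the original unconditional check and the opt-in behavior described here can be sketched as follows. This is an assumption-laden illustration, not the code from that PR: `LinkConfig`, its `stripEmbeddedBitcode` field, and `shouldDiscard` are hypothetical names chosen for the example, and how the real option is parsed and stored is not shown in this thread.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical configuration struct; the field name is an assumption made for
// illustration and is not taken from LLD's sources.
struct LinkConfig {
  // Defaults to false so that, unless the user asks for stripping,
  // .llvmbc and .llvmcmd are preserved in the output.
  bool stripEmbeddedBitcode = false;
};

// Hypothetical predicate: discard the embedded-bitcode sections only when the
// user opted in. Every other section is always kept by this check.
constexpr bool shouldDiscard(const LinkConfig &cfg, std::string_view name) {
  return cfg.stripEmbeddedBitcode &&
         (name == ".llvmbc" || name == ".llvmcmd");
}
```

With the default configuration nothing is dropped, which keeps the -lto-embed-bitcode workflow intact; setting the flag reproduces the behavior of llvm#150897 for users who want smaller binaries.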
… Embedding Features (llvm#188398)
Removes the patches introduced by llvm#150897, which broke the documented LTO bitcode-embedding features used to create whole-program-bitcode representations of executables in production analysis/rewriting toolsets. This was a documented feature available up to 21.1.8 and broken by the 22.x release. It previously allowed users to have a whole-program-bitcode section `.llvmbc` embedded inside the final executable.
(cherry picked from commit 1e99c9e)
This provides a general mechanism for COFF similar to ELF linker scripts' /DISCARD/, though the intention is to explicitly discard the .llvmbc and .llvmcmd sections. (See the discussion in llvm#150897 and llvm#188398 for more details.)