proposal: debug/pe: add BigObj COFF support via new fields on File struct #75659

@Peter0x44

Proposal Details

The debug/pe package currently uses this file struct:

// A File represents an open PE file.
type File struct {
	FileHeader
	OptionalHeader any // of type *OptionalHeader32 or *OptionalHeader64
	Sections       []*Section
	Symbols        []*Symbol    // COFF symbols with auxiliary symbol records removed
	COFFSymbols    []COFFSymbol // all COFF symbols (including auxiliary symbol records)
	StringTable    StringTable

	closer io.Closer
}

Unfortunately, to support BigObj COFF (golang/go#24341), a few things need to change.
BigObj files use a different type of header (ANON_OBJECT_HEADER_BIGOBJ) with largely the same fields as the FileHeader (IMAGE_FILE_HEADER) present already in the File struct.

You can find the C struct declarations in winnt.h.

The most important difference in the header is that FileHeader.NumberOfSections is now 32 bits, instead of 16 bits.
Supporting more sections in the file is the entire point of BigObj. To deal with this, I propose adding one extra field to the File struct:

SectionCount int32

If the section count is greater than INT16_MAX, File.FileHeader.NumberOfSections will be zero. File.SectionCount will always contain the accurate number of sections.

File.FileHeader.NumberOfSections will be deprecated and new code should not rely on it, and instead refer to File.SectionCount to get the true number of sections.

The other fields in the header have 1:1 equivalents in the old FileHeader, so the code can just copy over the values from the BigObj header back to the FileHeader, and they can be accessed from that struct instead.

A similar approach has to be taken for Symbols and COFFSymbols.

Symbols inside regular COFF files are of the format IMAGE_SYMBOL (also present in winnt.h).
BigObj files instead have the format IMAGE_SYMBOL_EX. The only difference between them is that the SectionNumber field got expanded to 32 bits.
Unfortunately, the current COFFSymbol and Symbol structs have a 16 bit SectionNumber.

I propose adding two new fields:

BigObjSymbols []*BigObjSymbol
BigObjCOFFSymbols []BigObjCOFFSymbol

The only change in these structs will be a 32 bit SectionNumber.

Similar to the header, the old Symbol and COFFSymbol structs will be deprecated, and the existing Symbols and COFFSymbols slices will both be empty if the section count is greater than INT16_MAX. New code should always prefer the new BigObj fields.

For more details on BigObj, I wrote up a blog post explaining my findings when writing my initial patch:
https://peter0x44.github.io/posts/bigobj_format_explained/
Unfortunately, Microsoft doesn't document the binary format. And it seems no one else did either. So you just have to "Trust Me, Bro".
#75631

My patch is functional. The issue is that it breaks the Go compatibility guarantee.

I am happy to bikeshed over the naming of the structs in this API. I couldn't come up with any I fully liked.