A program may have a lot of unused code and data. Many linkers provide garbage collection features to discard such code and data.
In GNU linkers, the feature is called "garbage collection" (ld --gc-sections). On macOS, ld64 calls it "dead stripping" (ld64 -dead_strip). On Windows, link.exe "eliminates functions and data that are never referenced" (link.exe /OPT:REF).
In many binary formats, a section is an inseparable unit. The garbage collection feature starts with a list of sections which should be retained (GC roots). For each retained section, the linker marks the sections it references via relocations and its associated sections (ELF section group and PE IMAGE_COMDAT_SELECT_ASSOCIATIVE).
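The marking step can be sketched as a worklist traversal. This is only an illustration under simplifying assumptions, not any linker's real implementation: the Section struct, the fixed-size edge arrays, and the gc_mark function are hypothetical names, and both relocation targets and associated sections are modeled as plain index lists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_EDGES 4

/* Hypothetical in-memory model of an input section. */
typedef struct {
    const char *name;
    size_t refs[MAX_EDGES];   /* sections referenced via relocations */
    size_t num_refs;
    size_t assoc[MAX_EDGES];  /* associated sections (e.g. same group) */
    size_t num_assoc;
    bool is_root;             /* GC root: always retained */
    bool live;                /* result, set by gc_mark */
} Section;

/* Mark section idx live and push it onto the worklist, once. */
static void enqueue(Section *secs, size_t n, size_t idx,
                    size_t *work, size_t *top) {
    assert(idx < n);
    if (!secs[idx].live) {
        secs[idx].live = true;
        work[(*top)++] = idx;
    }
}

/* Mark everything reachable from the roots and return the live count.
 * `work` must have room for n entries; each section is pushed at most once. */
size_t gc_mark(Section *secs, size_t n, size_t *work) {
    size_t top = 0, live = 0;
    for (size_t i = 0; i < n; i++)
        secs[i].live = false;
    for (size_t i = 0; i < n; i++)
        if (secs[i].is_root)
            enqueue(secs, n, i, work, &top);
    while (top > 0) {
        Section *s = &secs[work[--top]];
        live++;
        for (size_t k = 0; k < s->num_refs; k++)
            enqueue(secs, n, s->refs[k], work, &top);
        for (size_t k = 0; k < s->num_assoc; k++)
            enqueue(secs, n, s->assoc[k], work, &top);
    }
    return live;
}
```

Sections whose live flag is still false after gc_mark returns are the ones a linker would discard.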
ELF
To leverage section-based GC, -ffunction-sections and -fdata-sections are needed. Otherwise, monolithic text and data sections are produced and they will likely be retained as a whole.
-ffunction-sections causes section names like .text.name. -fdata-sections causes section names like .data.name, .rodata.name, .data.rel.local.name. The section names are unique and can consume a lot of string table space. Clang provides -fno-unique-section-names to save string table space: stem names such as .text, .data, .rodata are used.
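As a small illustration, the file below shows where definitions are expected to land when compiled with both options. The file name demo.c, the object main.o, and all symbol names are made up for this sketch. The section names in the comments follow the naming scheme described above, and can be checked with readelf -S on the object file.

```c
/* demo.c (hypothetical file name).
 *
 * Compile with one section per function and per variable:
 *   cc -ffunction-sections -fdata-sections -c demo.c
 * Link with GC enabled, together with an object that defines main
 * (main.o is assumed here):
 *   cc -Wl,--gc-sections main.o demo.o -o demo
 */

int scale = 3;            /* expected section: .data.scale */
const int limit = 100;    /* expected section: .rodata.limit */

/* expected section: .text.scaled */
int scaled(int x) { return x * scale; }

/* expected section: .text.clamp */
int clamp(int x) { return x > limit ? limit : x; }

/* expected section: .text.unused_helper. If nothing references it and it
 * is not a GC root, the linker is free to discard this section. */
int unused_helper(int x) { return x + 1; }
```

Without the two options, all three functions would share a single .text section, so keeping one of them would keep the others too.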
-fno-unique-section-names needs to create multiple sections with the same name. This requires a new assembler syntax: .section .text,"a",@progbits,unique,1. H.J. Lu implemented the syntax for binutils 2.35.
About LTO: code generation options can be categorized into IR generation options (e.g. -g) and object file generation options (e.g. -ffunction-sections -fdata-sections). Without LTO, you usually do not see a line between the two because they are both performed under the hood. With LTO, the distinction becomes clearer: IR generation options have effects on .c -> .ll/.bc (though people usually reuse the ".o" extension for the output) and object file generation options have effects on .ll/.bc -> .o. In LLVM, -ffunction-sections and -fdata-sections are object file generation options and do not affect IR generation. LLD and LLVMgold.so enable function sections and data sections automatically.
GNU ld
The following sections are GC roots.
- Sections which define a symbol specified by -u, --entry, --init, or --fini
- Sections which define a symbol exported to .dynsym
  - STV_DEFAULT or STV_PROTECTED
  - -shared or --export-dynamic or --gc-keep-exported
  - ...
- Sections which have the SHF_GNU_RETAIN flag
- SHT_PREINIT_ARRAY/SHT_INIT_ARRAY/SHT_FINI_ARRAY
- Non-SHF_ALLOC non-SHF_LINK_ORDER SHT_NOTE
- Linker created sections (SEC_LINKER_CREATED)
- Sections which are matched by an input section description with the KEEP keyword
- ...
If there is at least one retained non-SHT_NOTE SHF_ALLOC section, GNU ld will mark more sections. Otherwise, most sections (e.g. .debug_*) will be discarded. (See bfd/elflink.c:_bfd_elf_gc_mark_extra_sections.) This heuristic is interesting: it can discard quite a few debug sections when no non-SHT_NOTE SHF_ALLOC section is retained.
The extra sections marked in that case are:
- Non-SHF_ALLOC non-SHF_LINK_ORDER sections which are not in a section group
  - Except .debug_line.* which is associated to a discarded text section
- Group sections which contain just non-SHF_ALLOC sections
ld.lld
The following sections are GC roots.
- Sections which define a symbol specified by -u, --entry, --init, or --fini
- Sections which define a symbol exported to .dynsym
- Sections which define a symbol referenced by a linker script
- Sections which have the SHF_GNU_RETAIN flag
- SHT_PREINIT_ARRAY/SHT_INIT_ARRAY/SHT_FINI_ARRAY
- SHT_NOTE not within a section group (this rule is for Fedora watermark)
- .ctors/.dtors/.init/.fini/.jcr
- Personality routines or language-specific data area referenced by .eh_frame. LLD handles --gc-sections before .eh_frame deduplication, so this may retain more sections than needed.
- Sections which are matched by an input section description with the KEEP keyword
Non-SHF_ALLOC non-SHF_LINK_ORDER non-SHT_REL[A] non-SHF_GROUP sections are retained, although they are not GC roots.
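The rule for sections that are retained without being GC roots can be written as a small predicate. This is a sketch of the rule as stated above, not LLD's source code: the function name is hypothetical, and the ELF constants are defined locally (with their generic ABI values) so the example does not depend on a system elf.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values from the ELF generic ABI, defined locally for this sketch. */
#define SHT_RELA        4u
#define SHT_REL         9u
#define SHF_ALLOC       0x2u
#define SHF_LINK_ORDER  0x80u
#define SHF_GROUP       0x200u

/* Returns true if a section is kept regardless of reachability.
 * Returns false if the rule does not apply, in which case liveness is
 * decided by normal marking from the GC roots. */
bool retained_without_being_root(uint32_t sh_type, uint64_t sh_flags) {
    if (sh_flags & SHF_ALLOC)
        return false;   /* allocated sections must be reachable */
    if (sh_flags & SHF_LINK_ORDER)
        return false;   /* follows the section it is linked to */
    if (sh_flags & SHF_GROUP)
        return false;   /* group members live or die with the group */
    if (sh_type == SHT_REL || sh_type == SHT_RELA)
        return false;   /* relocation sections follow their target */
    return true;
}
```

A plain .debug_info section (no flags set) satisfies the predicate, which matches the observation in the next section that debug sections are not discarded.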
Debug sections
Linkers do not discard debug sections. This is the "smart format, dumb linker" philosophy. DWARF sections are large. Optimizing them would take a significant amount of link time.
There are, however, instances where discarded text sections can lead to some tombstone values in DWARF.
- .debug_ranges & .debug_loc: 1 (LLD<11: 0+addend; GNU ld uses 1 for .debug_ranges)
- .debug_*: 0 (LLD<11: 0+addend; GNU ld uses 0; future LLD: 0xffffffff or 0xffffffffffffffff)
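The choice of tombstone value by section name can be sketched as follows. This only encodes the main values from the list above (not the parenthesized variants). The function name tombstone_for is hypothetical, and real linkers make this decision while applying relocations rather than through a standalone helper.

```c
#include <stdint.h>
#include <string.h>

/* Value written in place of an address that pointed into a discarded
 * section, keyed by the name of the debug section being relocated. */
uint64_t tombstone_for(const char *section_name) {
    if (strcmp(section_name, ".debug_ranges") == 0 ||
        strcmp(section_name, ".debug_loc") == 0)
        return 1;   /* special-cased: these two sections get 1 */
    return 0;       /* every other .debug_* section gets 0 */
}
```

A plausible reason for the special case is that in .debug_ranges and .debug_loc a pair of zeros reads as an end-of-list entry, so using 0 there could truncate the list for consumers.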
Misc
In binutils 2.36, GNU as introduced the flag R to represent SHF_GNU_RETAIN on FreeBSD and Linux emulations. I have added the support to the LLVM integrated assembler and allowed the syntax on all ELF platforms.
    .section meta,"aR",@progbits
With GCC>=11 or Clang>=13 (https://reviews.llvm.org/D97447), you can write:
    __attribute__((retain,used,section("meta")))
The used attribute, when attached to a function or variable definition, indicates that there may be references to the entity which are not apparent in the source code. On COFF and Mach-O targets (Windows and Apple platforms), the used attribute prevents symbols from being removed by linker section GC. On ELF targets, GNU ld/gold/LLD may remove the definition if it is not otherwise referenced.
The retain attribute was introduced in GCC 11 to set the SHF_GNU_RETAIN flag on ELF targets.
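A complete definition using both attributes might look like the sketch below. The macro KEEP_IN_META and the variable meta_version are hypothetical names. The preprocessor guards are an assumption added so the file still compiles with compilers that lack retain or on non-ELF targets, where only used is applied.

```c
/* Prefer retain+used on ELF targets whose compiler knows `retain`. */
#if defined(__ELF__) && defined(__has_attribute)
#  if __has_attribute(retain)
#    define KEEP_IN_META __attribute__((retain, used, section("meta")))
#  endif
#endif
/* Older compilers on ELF: `used` only, so the linker may still GC it. */
#if !defined(KEEP_IN_META) && defined(__ELF__)
#  define KEEP_IN_META __attribute__((used, section("meta")))
#endif
/* Non-ELF targets: skip the ELF-style section name entirely. */
#ifndef KEEP_IN_META
#  define KEEP_IN_META __attribute__((used))
#endif

/* Nothing in this file references meta_version. `used` keeps the compiler
 * from dropping it; `retain` asks for SHF_GNU_RETAIN so that a linker
 * honoring the flag keeps the `meta` section under --gc-sections. */
KEEP_IN_META const int meta_version = 42;
```

Whether the flag ended up in the object file can be checked by looking at the flags column for meta in the readelf -S output.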
The typical solution before SHF_GNU_RETAIN is:
    asm(".pushsection .init_array,\"aw\",@init_array\n" \
The idea is that SHT_INIT_ARRAY sections are GC roots. An empty SHT_INIT_ARRAY does not change the output. The artificial reference keeps meta live.
I added .reloc support for R_ARM_NONE/R_AARCH64_NONE/R_386_NONE/R_X86_64_NONE/R_PPC_NONE/R_PPC64_NONE in LLVM 9.0.0.
PE-COFF
lld-link
The following sections are GC roots.
- Sections which define a symbol specified by /include:
- Sections which define a symbol specified by llvm.used (in bitcode, the /include: workaround happens too late)
- Sections which define a symbol specified by /export:
- Sections which define the delay load helper if /delayload is specified
Misc
Unfortunately, GC roots require linker options. For a local symbol, there is no good way to retain the definition because /include: cannot be used. Using a uniquely named COMDAT with its name referenced by a /include: directive is a possible workaround.
Mach-O
As a binary format, Mach-O is quite limited. You cannot have more than 255 sections. Thankfully, there is an interesting feature, .subsections_via_symbols, which is enabled by default. The idea is that if we combine ELF .text.name sections, we get back the monolithic section. If we use defined symbols as separators, we can get conceptual subsections.
-ffunction-sections and -fdata-sections are no-ops.
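The "symbols as separators" idea can be sketched as splitting one section into half-open ranges at each defined symbol. This is a conceptual illustration only, not ld64's algorithm: the Subsection type and split_by_symbols function are hypothetical, and treating bytes before the first symbol as an anonymous piece is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* A conceptual subsection: the half-open byte range [start, end). */
typedef struct {
    size_t start;
    size_t end;
} Subsection;

/* Split a section of `section_size` bytes at each symbol offset.
 * `sym_offsets` must be strictly increasing and below section_size.
 * `out` needs room for num_syms + 1 entries. Returns the piece count. */
size_t split_by_symbols(size_t section_size, const size_t *sym_offsets,
                        size_t num_syms, Subsection *out) {
    size_t count = 0;
    /* Bytes before the first symbol (or a symbol-less section) form one
     * anonymous piece. */
    size_t first = num_syms ? sym_offsets[0] : section_size;
    if (first > 0)
        out[count++] = (Subsection){0, first};
    for (size_t i = 0; i < num_syms; i++) {
        size_t end = (i + 1 < num_syms) ? sym_offsets[i + 1] : section_size;
        assert(sym_offsets[i] < end);
        out[count++] = (Subsection){sym_offsets[i], end};
    }
    return count;
}
```

Each resulting range can then play the role that an individual .text.name section plays on ELF: it is kept or discarded as a unit.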
LLVM IR
The global variable llvm.used is an appending linkage array which contains a list of GlobalValues (most are GlobalObjects). If a symbol appears in the list, the compiler, assembler and linker cannot discard it.
The global variable llvm.compiler.used is an appending linkage array which contains a list of GlobalValues (most are GlobalObjects). It is similar to llvm.used, except that the linker can discard the symbol.
__attribute__((used)) in GNU C maps to llvm.compiler.used on ELF and llvm.used on COFF/Mach-O/wasm.
Linux kernel
Since v4.10, the CONFIG_LD_DEAD_CODE_DATA_ELIMINATION configuration option is provided to leverage ld --gc-sections.