Linker garbage collection
2021-02-28 17:00:00 Author: maskray.me

A program may have a lot of unused code and data. Many linkers provide garbage collection features to discard such code and data.

In GNU linkers, the feature is called "garbage collection" (ld --gc-sections). On macOS, ld64 calls it "dead stripping" (ld64 -dead_strip). On Windows, link.exe /OPT:REF "eliminates functions and data that are never referenced".

In many binary formats, a "section" is an indivisible unit. Garbage collection starts with a list of sections which must be retained (GC roots). For each retained section, the linker marks the sections it references via relocations and its associated sections (ELF section group and PE IMAGE_COMDAT_SELECT_ASSOCIATIVE). Sections that are never marked are discarded.
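The marking step is a plain graph traversal. Here is a minimal sketch in C, assuming a hypothetical in-memory model where each section stores the indices of the sections it references or is associated with. The struct, the function name and the fixed limits are made up for illustration and are not taken from any real linker.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SECTIONS 64
#define MAX_EDGES 8

/* Hypothetical model of an input section. */
struct section {
    size_t num_edges;
    size_t edges[MAX_EDGES]; /* indices of sections referenced via relocations
                                or associated with this one */
    bool is_root;            /* set by the root rules of the linker */
    bool live;               /* computed by mark_live */
};

/* Mark every section reachable from a GC root. Sections left with
   live == false are the ones a linker would discard. */
void mark_live(struct section *secs, size_t n) {
    size_t worklist[MAX_SECTIONS];
    size_t top = 0;
    assert(n <= MAX_SECTIONS);

    /* Seed the worklist with the roots. */
    for (size_t i = 0; i < n; i++) {
        secs[i].live = secs[i].is_root;
        if (secs[i].is_root)
            worklist[top++] = i;
    }
    /* Each section is pushed at most once, so the worklist cannot overflow. */
    while (top > 0) {
        const struct section *s = &secs[worklist[--top]];
        for (size_t e = 0; e < s->num_edges; e++) {
            size_t target = s->edges[e];
            assert(target < n);
            if (!secs[target].live) {
                secs[target].live = true;
                worklist[top++] = target;
            }
        }
    }
}
```

Note that two sections which only reference each other stay dead: liveness comes from reachability from a root, not from having an incoming reference.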

ELF

To leverage section-based GC, -ffunction-sections and -fdata-sections are needed. Otherwise, monolithic text and data sections are produced and they will likely be retained as a whole.
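A small translation unit shows the effect. The file name and the identifiers below are hypothetical. The section names in the comments are what GCC and Clang typically produce with these options on ELF targets.

```c
/* gc_demo.c (hypothetical file name) */

/* With -ffunction-sections this lands in its own section, .text.square. */
int square(int x) { return x * x; }

/* Placed in .text.never_called. If nothing references it, --gc-sections
   can discard it independently of square. */
int never_called(int x) { return x + 1; }

/* With -fdata-sections this goes to .data.unused_table. */
int unused_table[4] = {1, 2, 3, 4};
```

One way to try it, assuming a separate file defines main and calls only square: compile with cc -ffunction-sections -fdata-sections -c gc_demo.c and link the executable with -Wl,--gc-sections. Adding -Wl,--print-gc-sections asks GNU ld or LLD to list the discarded sections.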

-ffunction-sections causes section names like .text.name. -fdata-sections causes section names like .data.name, .rodata.name, .data.rel.local.name. The section names are unique and can consume a lot of string table space. Clang provides -fno-unique-section-names to save string table space: stem names such as .text, .data, .rodata are used.

-fno-unique-section-names needs to create multiple sections with the same name. This requires a new assembler syntax: .section .text,"a",@progbits,unique,1. H.J. Lu implemented the syntax for binutils 2.35.

About LTO: code generation options can be categorized into IR generation options (e.g. -g) and object file generation options (e.g. -ffunction-sections -fdata-sections). Without LTO, you usually do not see a line between the two because they are both performed under the hood. With LTO, the distinction becomes clearer: IR generation options have effects on .c -> .ll/.bc (though usually people reuse ".o") and object file generation options have effects on .ll/.bc -> .o. In LLVM, -ffunction-sections and -fdata-sections are object file generating options and do not affect IR generation. LLD and LLVMgold.so enable function sections and data sections automatically.

GNU ld

The following sections are GC roots.

  • Sections which define a symbol specified by -u, --entry, --init, or --fini
  • Sections which define a symbol exported to .dynsym
    • STV_DEFAULT or STV_PROTECTED
    • -shared or --export-dynamic or --gc-keep-exported
    • ...
  • Sections which have the SHF_GNU_RETAIN flag
  • SHT_PREINIT_ARRAY/SHT_INIT_ARRAY/SHT_FINI_ARRAY
  • Non-SHF_ALLOC non-SHF_LINK_ORDER SHT_NOTE
  • Linker created sections (SEC_LINKER_CREATED)
  • Sections which are matched by an input section description with the KEEP keyword
  • ...

If there is at least one retained non-SHT_NOTE SHF_ALLOC section, GNU ld will mark more sections. Otherwise, most sections (e.g. .debug_*) will be discarded. (See bfd/elflink.c:_bfd_elf_gc_mark_extra_sections) This heuristic is interesting: it can discard quite a few debug sections when no non-SHT_NOTE SHF_ALLOC section is retained.

  • Non-SHF_ALLOC non-SHF_LINK_ORDER sections which are not in a section group
    • Except .debug_line.* which is associated to a discarded text section
  • Group sections which contain just non-SHF_ALLOC sections

ld.lld

The following sections are GC roots.

  • Sections which define a symbol specified by -u, --entry, --init, or --fini
  • Sections which define a symbol exported to .dynsym
  • Sections which define a symbol referenced by a linker script
  • Sections which have the SHF_GNU_RETAIN flag
  • SHT_PREINIT_ARRAY/SHT_INIT_ARRAY/SHT_FINI_ARRAY
  • SHT_NOTE not within a section group (this rule is for Fedora watermark)
  • .ctors/.dtors/.init/.fini/.jcr
  • Personality routines or language-specific data area referenced by .eh_frame. LLD handles --gc-sections before .eh_frame deduplication, so this may retain more sections than needed.
  • Sections which are matched by an input section description with the KEEP keyword

Non-SHF_ALLOC non-SHF_LINK_ORDER non-SHT_REL[A] non-SHF_GROUP sections are retained, although they are not GC roots.
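Some of the ld.lld rules above can be decided from the section header alone. The following C sketch covers only those rules: the symbol-based roots, KEEP and .eh_frame handling are left out. The struct and function names are hypothetical, the ELF constants are spelled out so that the code does not depend on <elf.h>, and the REL[A] exclusion is treated as a check on the section types SHT_REL and SHT_RELA. It is an illustration of the rules, not LLD's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* ELF section types and flags used by the rules. */
enum {
    SHT_RELA = 4, SHT_NOTE = 7, SHT_REL = 9,
    SHT_INIT_ARRAY = 14, SHT_FINI_ARRAY = 15, SHT_PREINIT_ARRAY = 16
};
#define SHF_ALLOC      0x2u
#define SHF_LINK_ORDER 0x80u
#define SHF_GROUP      0x200u
#define SHF_GNU_RETAIN 0x200000u

/* Hypothetical, trimmed-down section header. */
struct sec_hdr {
    const char *name;
    uint32_t type;
    uint64_t flags;
};

/* GC roots that can be recognized from the header alone. */
bool is_header_gc_root(const struct sec_hdr *s) {
    static const char *const names[] = {".ctors", ".dtors", ".init", ".fini", ".jcr"};
    assert(s && s->name);
    if (s->flags & SHF_GNU_RETAIN)
        return true;
    switch (s->type) {
    case SHT_PREINIT_ARRAY:
    case SHT_INIT_ARRAY:
    case SHT_FINI_ARRAY:
        return true;
    case SHT_NOTE:
        /* Only notes outside a section group are roots. */
        if (!(s->flags & SHF_GROUP))
            return true;
        break;
    default:
        break;
    }
    for (size_t i = 0; i < sizeof names / sizeof names[0]; i++)
        if (strcmp(s->name, names[i]) == 0)
            return true;
    return false;
}

/* Retained without being roots: non-SHF_ALLOC, non-SHF_LINK_ORDER,
   non-group sections that are not relocation sections. */
bool is_retained_non_alloc(const struct sec_hdr *s) {
    assert(s);
    return !(s->flags & (SHF_ALLOC | SHF_LINK_ORDER | SHF_GROUP)) &&
           s->type != SHT_REL && s->type != SHT_RELA;
}
```

The difference between the two predicates matters for marking: a root pulls in everything it references, while a section that is merely retained is kept without its references being followed.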

Debug sections

Linkers do not discard debug sections. This is the "smart format, dumb linker" philosophy. DWARF sections are large. Optimizing them would take a significant amount of link time.

There are, however, instances where discarded text sections can lead to some tombstone values in DWARF.

  • .debug_ranges & .debug_loc: 1 (LLD<11: 0+addend; GNU ld uses 1 for .debug_ranges)
  • .debug_*: 0 (LLD<11: 0+addend; GNU ld uses 0; future LLD: 0xffffffff or 0xffffffffffffffff)
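The list above boils down to a lookup by section name. A tiny C sketch with a hypothetical helper name, covering only the current values and not the "LLD<11" or "future LLD" variants:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: the value written for a relocation in debug section
   `name` whose target section was discarded, following the list above. */
uint64_t tombstone_for(const char *name) {
    assert(name);
    /* In .debug_ranges and .debug_loc a (0, 0) pair is the end-of-list
       marker, so 0 would cut the list short. 1 is used instead. */
    if (strcmp(name, ".debug_ranges") == 0 || strcmp(name, ".debug_loc") == 0)
        return 1;
    return 0;
}
```

For example, tombstone_for(".debug_info") returns 0 and tombstone_for(".debug_loc") returns 1.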

Misc

In binutils 2.36, GNU as introduced the flag R to represent SHF_GNU_RETAIN on FreeBSD and Linux emulations. I have added support to the LLVM integrated assembler and allowed the syntax on all ELF platforms.

.section meta,"aR",@progbits

With GCC>=11 or Clang>=13 (https://reviews.llvm.org/D97447), you can write:

__attribute__((retain,used,section("meta")))
static const char dummy[0];

The used attribute, when attached to a function or variable definition, indicates that there may be references to the entity which are not apparent in the source code. On COFF and Mach-O targets (Windows and Apple platforms), the used attribute prevents symbols from being removed by linker section GC. On ELF targets, GNU ld/gold/LLD may remove the definition if it is not otherwise referenced.

The retain attribute was introduced in GCC 11 to set the SHF_GNU_RETAIN flag on ELF targets.

The typical solution before SHF_GNU_RETAIN is:

asm(".pushsection .init_array,\"aw\",@init_array\n"
    ".reloc ., R_AARCH64_NONE, meta\n"
    ".popsection\n");

The idea is that SHT_INIT_ARRAY sections are GC roots. An empty SHT_INIT_ARRAY does not change the output. The artificial reference keeps meta live.

I added .reloc support for R_ARM_NONE/R_AARCH64_NONE/R_386_NONE/R_X86_64_NONE/R_PPC_NONE/R_PPC64_NONE in LLVM 9.0.0.

PE-COFF

The following sections are GC roots.

  • Sections which define a symbol specified by /include:
  • Sections which define a symbol specified by llvm.used (in bitcode, the /include: workaround happens too late)
  • Sections which define a symbol specified by /export:
  • Sections which define the delay load helper if /delayload is specified

Misc

Unfortunately, GC roots require linker options. For a local symbol, there is no good way to retain the definition because /include: cannot be used. Using a COMDAT with a unique name, referenced by a /include: directive, is a possible workaround.

Mach-O

As a binary format Mach-O is quite limited. You cannot have more than 255 sections. Thankfully, there is an interesting feature .subsections_via_symbols which is enabled by default. The idea is that if we combine ELF .text.name sections, we get back the monolithic section. If we use defined symbols as separators, we can get conceptual subsections.
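The "symbols as separators" idea can be sketched as splitting one section at the offsets of its defined symbols. This is a simplified C illustration with hypothetical names. It ignores alignment and any rules about which symbols count as separators, and it is not how ld64 is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open byte range [start, end) within a section. */
struct subsection {
    uint64_t start;
    uint64_t end;
};

/* Split a section of `size` bytes at the given symbol offsets, which must be
   sorted in ascending order and smaller than `size`. Writes the resulting
   pieces to `out`, which needs room for nsyms + 1 entries, and returns how
   many were written. */
size_t split_at_symbols(uint64_t size, const uint64_t *sym_offsets,
                        size_t nsyms, struct subsection *out) {
    size_t n = 0;
    uint64_t start = 0;
    for (size_t i = 0; i < nsyms; i++) {
        uint64_t off = sym_offsets[i];
        assert(off >= start && off < size);
        /* Symbols at the same offset share one subsection. */
        if (off > start) {
            out[n++] = (struct subsection){start, off};
            start = off;
        }
    }
    if (size > start)
        out[n++] = (struct subsection){start, size};
    return n;
}
```

Each resulting piece can then play the role that an ELF .text.name section plays for garbage collection.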

-ffunction-sections and -fdata-sections are no-ops.

LLVM IR

The global variable llvm.used is an appending linkage array which contains a list of GlobalValues (most are GlobalObjects). If a symbol appears in the list, the compiler, assembler and linker cannot discard it.

The global variable llvm.compiler.used is an appending linkage array which contains a list of GlobalValues (most are GlobalObjects). It is similar to llvm.used, except that the linker can discard the symbol.

__attribute__((used)) in GNU C maps to llvm.compiler.used on ELF and llvm.used on COFF/Mach-O/wasm.

Linux kernel

Since v4.10, the CONFIG_LD_DEAD_CODE_DATA_ELIMINATION configuration option is provided to leverage ld --gc-sections.


Source: http://maskray.me/blog/2021-02-28-linker-garbage-collection