I was asked about a segfault related to an lld-linked musl libc.so on PowerPC64.
/usr/lib/ld-musl-powerpc64le.so.1 /path/to/thing worked. The kernel ELF loader loads rtld and rtld loads the executable.
/path/to/thing segfaulted. The kernel ELF loader loads both rtld and the executable.
Therefore the bug is likely due to a difference between the two modes.
The section and program header dump from readelf looked like the following. I annotated the interesting lines with !!!.
1 | There are 36 section headers, starting at offset 0x2e4890: |
There were two PT_LOAD program headers with the PF_R|PF_W flags and the unusual property p_filesz < p_memsz. For both RW PT_LOAD program headers, we had roundUp(p_vaddr+p_filesz, pagesz) < roundUp(p_vaddr+p_memsz, pagesz). It turns out that when the Linux kernel loads an interpreter (PT_INTERP; see fs/binfmt_elf.c:load_elf_interp), it only supports one PT_LOAD with p_filesz < p_memsz.
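The page-rounding condition above can be sketched in a few lines of Python. This is only an illustration: the Phdr tuple, the helper names, and the sample values in the usage note are made up for this sketch and are not taken from the real readelf dump.

```python
from typing import List, NamedTuple

PT_LOAD = 1
PF_W = 2
PF_R = 4


class Phdr(NamedTuple):
    """Hypothetical, trimmed-down program header (only the fields used here)."""
    p_type: int
    p_flags: int
    p_vaddr: int
    p_filesz: int
    p_memsz: int


def round_up(value: int, pagesz: int) -> int:
    """Round value up to a multiple of pagesz (pagesz must be a power of two)."""
    return (value + pagesz - 1) & ~(pagesz - 1)


def needs_extra_pages(ph: Phdr, pagesz: int) -> bool:
    """True if the zero-filled tail extends past the last file-backed page."""
    file_end = round_up(ph.p_vaddr + ph.p_filesz, pagesz)
    mem_end = round_up(ph.p_vaddr + ph.p_memsz, pagesz)
    return file_end < mem_end


def risky_rw_loads(phdrs: List[Phdr], pagesz: int) -> List[Phdr]:
    """RW PT_LOAD headers whose tail needs pages that no file content backs."""
    rw = PF_R | PF_W
    return [ph for ph in phdrs
            if ph.p_type == PT_LOAD
            and (ph.p_flags & rw) == rw
            and needs_extra_pages(ph, pagesz)]
```

For example, risky_rw_loads([Phdr(PT_LOAD, PF_R | PF_W, 0x10000, 0x1000, 0x1040)], 0x1000) returns that header, because the file content ends exactly on a page boundary and the extra 0x40 bytes need a fresh page. More than one such header in an interpreter is the situation described above.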
Note: it is typical for lld output to have two RW PT_LOAD program headers, one for RELRO sections (PT_GNU_RELRO) and the other for non-RELRO sections. This may look unusual at first glance but it avoids the alignment padding used in GNU ld's single RW PT_LOAD layout. See Explain GNU style linker options#-z relro.
In the PowerPC ELFv2 ABI, .plt is like GOTPLT on other architectures (it holds resolved addresses for PLT entries) and has the SHT_NOBITS type. With -z now, .plt can be eagerly resolved and become read-only after relocation processing, therefore it is part of PT_GNU_RELRO. When lld lays out sections, .plt is placed in the first RW PT_LOAD. In the unlucky libc.so, .plt is 64 bytes (2 reserved pointer entries plus 6 pointer entries for malloc/calloc/realloc/memalign/aligned_alloc/free), so p_memsz = p_filesz + 64. If roundUp(p_vaddr+p_filesz, pagesz) < roundUp(p_vaddr+p_memsz, pagesz), relocation processing will access an unmapped memory page and segfault. If the two rounded values are equal and we just have p_filesz < p_memsz, the kernel will fail to zero some bytes but the bytes will be overwritten by rtld anyway.
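Whether a 64-byte SHT_NOBITS tail hits the bad case depends only on where the file-backed part of the segment ends within its page. A minimal sketch, assuming a 64 KiB page size and made-up end addresses (neither comes from the actual libc.so):

```python
def round_up(value: int, pagesz: int) -> int:
    """Round value up to a multiple of pagesz (pagesz must be a power of two)."""
    return (value + pagesz - 1) & ~(pagesz - 1)


def tail_crosses_page(file_end: int, nobits_size: int, pagesz: int) -> bool:
    """True if the NOBITS tail starting at file_end needs a page beyond the file-backed ones.

    file_end stands for p_vaddr + p_filesz; file_end + nobits_size for p_vaddr + p_memsz.
    """
    return round_up(file_end, pagesz) < round_up(file_end + nobits_size, pagesz)


PLT_SIZE = 64        # size of .plt in the unlucky libc.so, per the text
PAGESZ = 0x10000     # assumption: 64 KiB pages

# Hypothetical values of p_vaddr + p_filesz.
for file_end in (0x2FFF8, 0x2FF00, 0x30000):
    print(hex(file_end), tail_crosses_page(file_end, PLT_SIZE, PAGESZ))
```

With r = file_end % pagesz, the tail crosses a page boundary exactly when r is 0 or r + 64 exceeds pagesz. In every other layout the tail stays inside the last file-backed page, which is the harmless case.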
Clang passes -z now to ld for Alpine Linux. Chimera Linux has patched the Clang driver to pass -z now for all musl target triples.
- Pros: GOTPLT is part of RELRO, which provides security hardening. In addition, the DF_1_NOW flag avoids an allocation in musl's rtld. See the musl commit "emulate lazy relocation as deferrable relocation".
- Cons: There is a slight size increase of .dynamic: it will always have a DT_FLAGS holding DF_BIND_NOW. In most cases DT_FLAGS can be absent if -z now is not used.
Workarounds
There are multiple ways to work around the issue.
-z lazy
The easiest is to build musl with LDFLAGS=-Wl,-z,lazy to override the driver-specified -z now. I verified with a local cross-compilation build.
1 | mkdir out/ppc64le && cd out/ppc64le |
Cons: loses some security hardening.
(If you use GCC's powerpc64 port, avoid -Os. lld has not implemented _savefpr* and _restfpr* functions.)
SHT_PROGBITS .plt
The linker-synthesized .plt has the SHT_NOBITS type. We can link a relocatable object file with an empty SHT_PROGBITS .plt.
1 | .section .plt,"awR",@progbits |
The output section will have the SHT_PROGBITS type.
Note the R flag, which stands for SHF_GNU_RETAIN. Without the flag, the linker option --gc-sections drops the input .plt so that it cannot affect the output section type.
Prevent roundUp(p_vaddr+p_filesz, pagesz) < roundUp(p_vaddr+p_memsz, pagesz)
The musl build system forces -Wl,--hash-style=both. We can specify LDFLAGS=-Wl,--hash-style=gnu to drop .hash.
Alternatively, we may pad .plt or any preceding output section so that the property no longer holds.
1 | // a.lds |
Use clang -o libc.so ... a.lds.
You may be tempted to keep -z now and link libc.so with a linker script:
1 | SECTIONS { .plt : {} } INSERT AFTER .bss; |
Unfortunately that would create discontiguous RELRO sections, which is unsupported by linkers and most rtld implementations.
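PT_GNU_RELRO describes a single address range, so the RELRO sections have to form one contiguous run in address order. A small sketch of that check, using hypothetical section layouts (the names and their order are illustrative and not taken from a real link map):

```python
from itertools import groupby
from typing import List, Tuple


def relro_is_contiguous(layout: List[Tuple[str, bool]]) -> bool:
    """layout: (section_name, is_relro) pairs in address order.

    RELRO sections are contiguous if they form at most one run.
    """
    runs = [is_relro for is_relro, _ in groupby(flag for _, flag in layout)]
    return runs.count(True) <= 1


# Hypothetical layout with .plt kept among the RELRO sections.
default_layout = [(".data.rel.ro", True), (".got", True), (".plt", True),
                  (".data", False), (".bss", False)]

# Hypothetical layout after moving .plt behind .bss with INSERT AFTER.
moved_layout = [(".data.rel.ro", True), (".got", True),
                (".data", False), (".bss", False), (".plt", True)]

print(relro_is_contiguous(default_layout))
print(relro_is_contiguous(moved_layout))
```

The second layout has two separate RELRO runs, which a single PT_GNU_RELRO program header cannot cover.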
glibc
glibc adopts a separate rtld and libc.so design. Its rtld has no JUMP_SLOT (JMP_SLOT) relocations.
The powerpc64 port has been buildable since lld 13. There is no .plt section, therefore the first RW PT_LOAD has p_filesz == p_memsz. The built rtld works with the Linux kernel.
I got powerpc64le-linux-gnu-gcc and binutils from system packages. I have installed /usr/local/bin/powerpc64le-linux-gnu-ld.lld so that powerpc64le-linux-gnu-gcc -fuse-ld=lld works.
1 | mkdir out/ppc64le && cd out/ppc64le |
Reliable reproducer
Here is the main trick: assemble the following assembly file toc.s and link it into musl lib/libc.so.
1 | .section .toc,"aw",@nobits |
.toc is recognized as a RELRO section in ld.lld, even if the architecture is not PowerPC64 :)
The section is 4096 bytes (the page size), therefore we can ensure roundUp(p_vaddr+p_filesz, pagesz) < roundUp(p_vaddr+p_memsz, pagesz).
Compile the following C program and link it with toc.o using -Wl,--dynamic-linker=path/to/libc.so.
1 | #include <assert.h> |
The output will have two RW PT_LOAD program headers with p_filesz < p_memsz.
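One way to double-check this without reading a readelf dump is to parse the program headers directly. The following is a minimal sketch that only handles little-endian ELF64 files; the function name rw_loads_with_zero_tail is made up for illustration, and the struct layouts follow Elf64_Ehdr and Elf64_Phdr.

```python
import struct
import sys

PT_LOAD = 1
PF_W = 2
PF_R = 4
EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # Elf64_Ehdr, little-endian
PHDR = struct.Struct("<IIQQQQQQ")          # Elf64_Phdr, little-endian


def rw_loads_with_zero_tail(data: bytes):
    """Return (p_vaddr, p_filesz, p_memsz) for RW PT_LOADs with p_filesz < p_memsz."""
    ident = data[:16]
    # EI_CLASS == ELFCLASS64 (2) and EI_DATA == ELFDATA2LSB (1)
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("expected a little-endian ELF64 file")
    fields = EHDR.unpack_from(data, 0)
    e_phoff, e_phentsize, e_phnum = fields[5], fields[9], fields[10]
    rw = PF_R | PF_W
    result = []
    for i in range(e_phnum):
        (p_type, p_flags, _offset, p_vaddr, _paddr,
         p_filesz, p_memsz, _align) = PHDR.unpack_from(data, e_phoff + i * e_phentsize)
        if p_type == PT_LOAD and (p_flags & rw) == rw and p_filesz < p_memsz:
            result.append((p_vaddr, p_filesz, p_memsz))
    return result


def main(argv):
    # Print the matching program headers of every file named on the command line.
    for path in argv[1:]:
        with open(path, "rb") as f:
            for vaddr, filesz, memsz in rw_loads_with_zero_tail(f.read()):
                print(f"{path}: RW PT_LOAD vaddr={vaddr:#x} "
                      f"filesz={filesz:#x} memsz={memsz:#x}")


if __name__ == "__main__":
    main(sys.argv)
```

Running the script on the linked libc.so should list two entries if the reproducer worked as described.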
You can add custom sections to the PT_GNU_RELRO program header using a full linker script with DATA_SEGMENT_ALIGN and DATA_SEGMENT_RELRO_END (implemented in ld.lld 15).
1 | . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE)); |
Stress test
To test that the kernel ELF loader can handle more RW PT_LOAD program headers, we can add a few more SHF_ALLOC|SHF_WRITE sections (abbreviated as RW below). We can place a read-only section after .bss followed by a RW section. The read-only section will form a read-only PT_LOAD and the RW section will form a RW PT_LOAD.
Create some files. If you have split-file (a test utility from llvm-project), you may place the following content into a.txt.
1 | #--- a.c |
Then run:
1 | split-file a.txt a |
Note: when a SHT_NOBITS section is followed by another section, the SHT_NOBITS section behaves as if it occupies the file offset range. This is because ld.lld does not implement a file size optimization.
Test a patched kernel
Pedro Falcato has a kernel patch [PATCH] fs/binfmt_elf: Fix memsz > filesz handling to fix the issue. Let's verify it.
1 | // In linux |
Now prepare an initrd image with the test program. https://github.com/ClangBuiltLinux/boot-utils has a prebuilt image and adding extra files is convenient.
1 | mkdir /tmp/initrd && cd /tmp/initrd |
Copy musl lib/libc.so to /tmp/initrd/lib/libc.so and our toy program to /tmp/initrd/toy. Edit /tmp/initrd/init to run /toy || echo failed: $?. Rebuild the initrd image.
1 | find . | sudo cpio -o --format=newc | zstd > ~/Dev/ClangBuiltLinux/boot-utils/images/x86_64/rootfs.cpio.zst |
With an unpatched kernel, /toy segfaults as expected:
1 | % ~/Dev/ClangBuiltLinux/boot-utils/boot-qemu.py -a x86_64 -k /tmp/linux/x86_64 |
With a patched kernel, /toy succeeds.
1 | % ~/Dev/ClangBuiltLinux/boot-utils/boot-qemu.py -a x86_64 -k /tmp/linux/x86_64 |