zstd compressed debug sections
2022-09-09 15:00:00 Author: maskray.me

In January I wrote "Compressed debug sections". The venerable zlib shows its age, and there are replacements that are better in every metric except adoption. The obvious choice was Zstandard, but I was not confident about solving the ecosystem issue if we adopted it for ELF compressed debug sections.

In June, Cole Kissane posted "[RFC] Zstandard as a second compression method to LLVM" on the LLVM Discourse forums. I learned that other folks were investigating a better compression format for ELF compressed debug sections and told myself: it's high time to add ELFCOMPRESS_ZSTD to the generic System V Application Binary Interface (generic ABI).

ELF is an elegant format which has stood the test of time. Many things created by the forefathers 30 years ago carry over and are still used today. Every new feature, even a small addition like a new constant, has to clear a high bar for acceptance. There were many discussions on "Add new ch_type value: ELFCOMPRESS_ZSTD".

Personally, I think a selected format needs to have these properties:

  • It has an open compression algorithm and implementation.
  • It provides significant benefits (compression speed, decompression speed, compression ratio) with a decent memory footprint.
  • It has full backward compatibility. In 20 years I want to be able to decompress a debug section created today.
  • It has wide and active use cases.
  • It has good documentation.
  • It's easy to use.

Given an executable, you may extract its .debug_info section with llvm-objcopy --dump-section .debug_info=debug_info a.out /dev/null. I have tried bzip2, gzip, lz4, lzo, pigz, xz, zstd, and manually verified that zstd is the best considering compression speed, decompression speed, and compression ratio. The built-in parallel compression support is a plus.
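A comparison of this kind can be approximated in a few lines of Python. This is only a rough sketch, not the procedure I actually used (I ran the command-line tools): it times the standard-library codecs and, as an assumption, adds zstd only when the third-party `zstandard` package happens to be installed. The helper names `available_codecs` and `compare` and the stand-in sample data are hypothetical. For a real measurement, replace `sample` with the bytes of the dumped `debug_info` file.

```python
import bz2
import lzma
import time
import zlib


def available_codecs():
    """Return {name: (compress, decompress)} for the codecs usable in this interpreter."""
    table = {
        "zlib": (zlib.compress, zlib.decompress),
        "bz2": (bz2.compress, bz2.decompress),
        "xz": (lzma.compress, lzma.decompress),
    }
    try:
        # Assumption: the third-party `zstandard` package; skipped when missing.
        import zstandard

        table["zstd"] = (
            zstandard.ZstdCompressor().compress,
            zstandard.ZstdDecompressor().decompress,
        )
    except ImportError:
        pass
    return table


def compare(data):
    """Return {name: (compression ratio, compress seconds, decompress seconds)}."""
    results = {}
    for name, (compress, decompress) in available_codecs().items():
        t0 = time.perf_counter()
        packed = compress(data)
        t1 = time.perf_counter()
        unpacked = decompress(packed)
        t2 = time.perf_counter()
        if unpacked != data:
            raise ValueError(f"{name} did not round-trip")
        results[name] = (len(data) / len(packed), t1 - t0, t2 - t1)
    return results


# Stand-in input; real debug info compresses very differently.
sample = b"DW_TAG_subprogram DW_AT_name main " * 2000
for name, (ratio, c_time, d_time) in compare(sample).items():
    print(f"{name:5} ratio={ratio:8.1f} compress={c_time:.4f}s decompress={d_time:.4f}s")
```

Single-run wall-clock timings like these are noisy, and library default levels differ from the CLI defaults, so the numbers only indicate rough ordering.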

It took about one month for ELFCOMPRESS_ZSTD to be accepted. Toolchain components then need to add support:

  • binutils: feature request
    • addr2line: symbolization needs to decompress debug sections
    • gas: compress debug sections
    • ld: decompress compressed input sections and compress output debug sections
    • objcopy: --decompress-debug-sections and --compress-debug-sections=zstd
    • dwp: decompress compressed .dwo
  • gdb: needs to support executables, shared objects, and .dwo files using ELFCOMPRESS_ZSTD (feature request)
  • GCC: support -gz=zstd
  • llvm-project: most features have been implemented (milestone: 16.0.0)
  • dwz

On the llvm-project side, there was a lot of debate on what the API should look like. This week we (Cole Kissane, David Blaikie, and I) reached an agreement that the free function style compression API is acceptable. I pushed some changes, so llvm-objcopy --compress-debug-sections=zstd, clang -gz=zstd, and ld.lld --compress-debug-sections=zstd are available now.

% cat a.cc
#include <iostream>
int main() { std::cout << "zstd"; }
% clang -c -g -gz=zstd a.cc
% readelf -x .debug_info a.o

Hex dump of section '.debug_info':
NOTE: This section has relocations against it, but these have NOT been applied to this dump.
0x00000000 02000000 00000000 1a180000 00000000 ................
0x00000010 01000000 00000000 28b52ffd 601a17ed ........(./.`...
...

The first 4 bytes of a compressed section are the ch_type field of its compression header, so ELFCOMPRESS_ZSTD (2) can be identified there. In a little-endian object file, it displays as 02000000.
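To make the layout concrete, here is a small sketch that decodes the first two rows of the dump above as an Elf64_Chdr (ch_type and ch_reserved are 32-bit, ch_size and ch_addralign are 64-bit) and checks that the payload starts with the zstd frame magic. The function name `parse_chdr64` is hypothetical, and the sketch assumes a 64-bit little-endian object, which matches this dump.

```python
import struct

ELFCOMPRESS_ZLIB = 1
ELFCOMPRESS_ZSTD = 2
# zstd frame magic number 0xFD2FB528, as stored on disk (little-endian).
ZSTD_FRAME_MAGIC = bytes.fromhex("28b52ffd")

# The first two rows of the readelf hex dump, copied as raw bytes.
section_start = bytes.fromhex(
    "02000000 00000000 1a180000 00000000 "
    "01000000 00000000 28b52ffd 601a17ed"
)


def parse_chdr64(data, little_endian=True):
    """Decode an Elf64_Chdr and return (ch_type, ch_size, ch_addralign, payload)."""
    fmt = ("<" if little_endian else ">") + "IIQQ"
    ch_type, _ch_reserved, ch_size, ch_addralign = struct.unpack_from(fmt, data, 0)
    return ch_type, ch_size, ch_addralign, data[struct.calcsize(fmt):]


ch_type, ch_size, ch_addralign, payload = parse_chdr64(section_start)
print(f"ch_type={ch_type} ch_size={ch_size:#x} ch_addralign={ch_addralign}")
print("zstd frame follows header:", payload.startswith(ZSTD_FRAME_MAGIC))
```

Here ch_size is the uncompressed size of the section, so a consumer knows how large a buffer to allocate before decompressing.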

If llvm-objcopy is built with zstd support, use --decompress-debug-sections to decompress an object file:

% llvm-objcopy --decompress-debug-sections a.o a.o.decompressed
% readelf -x .debug_info a.o.decompressed

Hex dump of section '.debug_info':
NOTE: This section has relocations against it, but these have NOT been applied to this dump.
0x00000000 16180000 05000108 00000000 01002100 ..............!.
0x00000010 01000000 00000000 00020000 00000000 ................
...
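As a sanity check on the decompressed output, the first bytes can be read as a DWARF version 5 unit header in the 32-bit DWARF format: unit_length, version, unit_type, address_size, debug_abbrev_offset. The sketch below is only illustrative. It assumes a little-endian object and that .debug_info holds a single unit, in which case unit_length + 4 should equal the ch_size value (bytes 1a180000 00000000) seen in the compressed dump. The function name `parse_unit_header_v5` is hypothetical.

```python
import struct

# The first 12 bytes of the decompressed .debug_info dump.
decompressed_start = bytes.fromhex("16180000 05000108 00000000")
# The ch_size field copied from the compressed dump (64-bit little-endian).
ch_size = struct.unpack("<Q", bytes.fromhex("1a180000 00000000"))[0]


def parse_unit_header_v5(data):
    """Decode a 32-bit-format DWARF v5 unit header (little-endian assumed)."""
    unit_length, version, unit_type, address_size, abbrev_offset = struct.unpack_from(
        "<IHBBI", data, 0
    )
    return unit_length, version, unit_type, address_size, abbrev_offset


unit_length, version, unit_type, address_size, abbrev_offset = parse_unit_header_v5(
    decompressed_start
)
print(f"unit_length={unit_length:#x} version={version} "
      f"unit_type={unit_type} address_size={address_size}")

# unit_length excludes its own 4 bytes, so a single unit spans unit_length + 4.
print("matches ch_size:", unit_length + 4 == ch_size)
```

A unit_type of 1 is DW_UT_compile, so the section starts with an ordinary compile unit.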

Cole's lldb patch will be merged soon. On the llvm-project side we should reach full feature readiness soon.

It would be nice if someone picked up the work items on the GNU side so that Linux distributions can start investigating the adoption of zstd compressed debug sections.


Source: https://maskray.me/blog/2022-09-09-zstd-compressed-debug-sections