In C++, dynamic initializations for non-local variables happen before the first statement of main. All (most?) implementations just ensure such dynamic initializations happen before main.
As an extension, GCC supports __attribute__((constructor)), which can make an arbitrary function run before main. A constructor function can have an optional priority (__attribute__((constructor(N)))).
Priorities from 0 to 100 are reserved for the implementation (-Wprio-ctor-dtor catches violations), e.g. gcov uses __attribute__((destructor(100))). Applications can use 101 to 65535. 65535 (.init_array or .ctors, without a suffix) has the same priority as a non-local variable's dynamic initialization in C++.
struct S { S(); };
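As a minimal sketch of the priority rules above, the following translation unit logs the order in which its initializers run. The names record, early, later, plain, and init_order are invented for this illustration, and the constructor attribute is a GCC/Clang extension. The log lives in constant-initialized storage so that it is safe to touch before any dynamic initialization has happened.

```cpp
#include <cassert>

// Constant-initialized log: usable before any dynamic initialization runs.
static char order_log[8];
static int order_len = 0;
static void record(char c) {
  if (order_len < 7) order_log[order_len++] = c;
}

// Lower priority numbers run earlier; 101 is the first value open to applications.
__attribute__((constructor(101))) static void early() { record('A'); }
__attribute__((constructor(200))) static void later() { record('B'); }

// No explicit priority: same level as C++ dynamic initialization (65535).
__attribute__((constructor)) static void plain() { record('C'); }

struct S {
  S() { record('D'); }
};
static S s;  // dynamic initialization, also at the default priority

// Returns the NUL-terminated log of initializer tags in execution order.
const char* init_order() { return order_log; }
```

Calling init_order() from main should yield a four-character string that starts with AB. The relative order of C and D, which share the default priority, is deliberately not relied upon here.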
Under the hood, on ELF platforms, the initialization functions or constructors are implemented in two schemes. The legacy one uses .init/.ctors while the new one uses .init_array.
.section .text.startup,"ax",@progbits
.init and .fini
System V release 4 introduced the dynamic tags DT_INIT and DT_FINI to implement ELF initialization and termination functions. Today it is difficult to figure out what the original implementation actually did, but many systems using GCC used the design described below.
On a GCC+glibc system, traditionally the section .init in an executable/shared object consisted of four fragments:
glibc crti.o:(.init) _init
The linker combines .init input sections and places the fragments into the .init output section. _init is defined at offset 0 in the first input section, so its address equals the address of the .init output section. The linker defines DT_INIT according to the value of _init (which can be changed by the -init linker option).
.fini is similar:
glibc crti.o:(.fini) _fini
The linker defines DT_FINI according to the value of _fini (which can be changed by the -fini linker option).
In glibc x86-64, sysdeps/x86_64/crti.S and sysdeps/x86_64/crtn.S provide the definitions:
# crti.o
crti.o calls __gmon_start__ (gmon profiling system) if defined. This is used by gcc -pg.
musl just provides empty crti.o and crtn.o.
.ctors and .dtors
In GCC libgcc/crtstuff.c, when __LIBGCC_INIT_ARRAY_SECTION_ASM_OP__ is not defined and __LIBGCC_INIT_SECTION_ASM_OP__ is defined (HAVE_INITFINI_ARRAY_SUPPORT is 0 in $builddir/gcc/auto-host.h), the following scheme is used. Note: the condition is not satisfied on modern systems.
C++ dynamic initializations and __attribute__((constructor)) do not use _init directly. They are implemented as ELF functions. Their addresses are collected in the .ctors section, whose entries are called by the runtime. Assume that we have two object files a.o and b.o with .ctors sections of different priorities. The layout of the .ctors output section is:
crtbegin.o:(.ctors) __CTOR_LIST__
.dtors is similar:
crtbegin.o:(.dtors) __DTOR_LIST__
- crtbegin.o defines .ctors and .dtors with one element, -1 (0xffffffff on 32-bit platforms and 0xffffffffffffffff on 64-bit platforms).
- crtend.o defines .ctors and .dtors with one element, 0.
- crtend.o defines a .init section which calls __do_global_ctors_aux. __do_global_ctors_aux calls the static constructors in the .ctors section. The -1 and 0 sentinels are skipped.
- crtbegin.o defines a .fini section which calls __do_global_dtors_aux. __do_global_dtors_aux calls the static destructors in the .dtors section. The -1 and 0 sentinels are skipped.
Reversed execution order
Here is an interesting property: .ctors elements are run in the reversed order and .dtors elements are run in the regular order. E.g. for a.o:(.ctors) b.o:(.ctors), b.o's constructor runs before a.o's.
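The sentinel handling and the two traversal directions can be modeled in a few lines. This is a toy model and not the libgcc source: ModelList, from_a, from_b, run_ctors_model, and run_dtors_model are names invented here, and the section is represented as a plain array of address-sized words.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

using func_ptr = void (*)();

static std::string trace;               // records which model function ran
static void from_a() { trace += 'a'; }  // stands in for an entry from a.o
static void from_b() { trace += 'b'; }  // stands in for an entry from b.o

// A linked .ctors/.dtors section modeled as address-sized words:
// -1 sentinel (crtbegin.o), entries in link order, 0 terminator (crtend.o).
struct ModelList {
  std::uintptr_t words[4];
};

static ModelList make_list() {
  return {{~std::uintptr_t{0},
           reinterpret_cast<std::uintptr_t>(&from_a),
           reinterpret_cast<std::uintptr_t>(&from_b),
           0}};
}

// .ctors model: start just before the terminator and walk backwards until -1.
std::string run_ctors_model() {
  const ModelList list = make_list();
  trace.clear();
  for (const std::uintptr_t* p = list.words + 2; *p != ~std::uintptr_t{0}; --p)
    reinterpret_cast<func_ptr>(*p)();
  return trace;
}

// .dtors model: start just after the -1 sentinel and walk forwards until 0.
std::string run_dtors_model() {
  const ModelList list = make_list();
  trace.clear();
  for (const std::uintptr_t* p = list.words + 1; *p != 0; ++p)
    reinterpret_cast<func_ptr>(*p)();
  return trace;
}
```

In this model run_ctors_model() visits the b.o entry before the a.o entry, while run_dtors_model() visits them in link order.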
This is to make dynamic linking similar to static linking for .ctors sections without a suffix (having the lowest priority).
The origin may be related to a generic ABI promise: if a.so depends on b.so, then b.so's constructors run first. If we only look at .ctors sections without a suffix, the behavior of ld main.o a.so b.so may be quite similar to the static linking ld main.o a.a b.a.
.dtors can be seen as undoing .ctors, so its order is the reverse of .ctors, which is the regular order.
.init_array
HP-UX developers noticed that the .init/.ctors scheme has multiple problems:
- Fragmented _init function is ugly and error-prone.
- Sentinel values in .ctors are ugly.
- .init and .ctors use magic names instead of dedicated section types.
They invented DT_INIT_ARRAY as an alternative. glibc implemented the scheme in 1999. The GCC and binutils implementations were also quite old.
In this scheme, .init_array and .init_array.N sections have a dedicated type SHT_INIT_ARRAY. crtbegin.o and crtend.o do not provide fragments.
Below is a layout.
a.o:(.init_array.1) b.o:(.init_array.1)
Note: ctors_priority = 65535-init_array_priority
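The relation in the note can be checked with a small helper. The function and constant names below are invented for this sketch.

```cpp
#include <cassert>

// Both section families use priorities in the range 0..65535.
constexpr int kMaxPriority = 65535;

// Maps an .init_array.N suffix to the equivalent .ctors.M suffix.
constexpr int ctors_priority_from_init_array(int init_array_priority) {
  return kMaxPriority - init_array_priority;
}

// The earliest-running .init_array priority pairs with the highest .ctors suffix.
static_assert(ctors_priority_from_init_array(0) == 65535,
              ".init_array.0 corresponds to .ctors.65535");
// A constructor with priority 101 corresponds to .ctors.65434.
static_assert(ctors_priority_from_init_array(101) == 65434,
              ".init_array.101 corresponds to .ctors.65434");
```

The mapping is its own inverse, so the same subtraction converts a .ctors suffix back to an .init_array suffix.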
The linker defines DT_INIT_ARRAY and DT_INIT_ARRAYSZ according to the address and size of .init_array.
Unlike .ctors, the execution order of .init_array is forward (follows .init). a.o:(.init_array) b.o:(.init_array) has a different order from a.o:(.ctors) b.o:(.ctors). In a future section we will discuss that this difference can expose a type of very subtle bugs called "static initialization order fiasco".
In GCC, newer ABI implementations like AArch64 and RISC-V only use .init_array and don't provide .ctors.
Runtime behavior
The generic ABI says:
If an object contains both DT_INIT and DT_INIT_ARRAY entries, the function referenced by the DT_INIT entry is processed before those referenced by the DT_INIT_ARRAY entry for that object. If an object contains both DT_FINI and DT_FINI_ARRAY entries, the functions referenced by the DT_FINI_ARRAY entry are processed before the one referenced by the DT_FINI entry for that object.
If the executable a depends on b.so and c.so (in order), the glibc ld.so and libc behavior is:
- ld.so runs c.so:DT_INIT. The crtend.o fragment of _init calls .ctors
- ld.so runs c.so:DT_INIT_ARRAY
- ld.so runs b.so:DT_INIT. The crtend.o fragment of _init calls .ctors
- ld.so runs b.so:DT_INIT_ARRAY
- libc_nonshared.a runs a:DT_INIT. The crtend.o fragment of _init calls .ctors
- libc_nonshared.a runs a:DT_INIT_ARRAY
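The sequence above can be written down as a toy model. This is a sketch of the ordering only and does not reflect how ld.so is implemented: ModelObject and model_init_sequence are invented names, and the "reverse of the listed order" rule is an assumption that holds for unrelated dependencies like b.so and c.so in this example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy description of a loaded object; the flags model the presence of
// DT_INIT and DT_INIT_ARRAY in its dynamic table.
struct ModelObject {
  std::string name;
  bool has_init;
  bool has_init_array;
};

// Initializes the dependencies in reverse of the listed order, then the
// executable, running DT_INIT before DT_INIT_ARRAY within each object.
std::vector<std::string> model_init_sequence(
    const ModelObject& exe, const std::vector<ModelObject>& deps) {
  std::vector<std::string> seq;
  auto init_one = [&seq](const ModelObject& o) {
    if (o.has_init) seq.push_back(o.name + ":DT_INIT");
    if (o.has_init_array) seq.push_back(o.name + ":DT_INIT_ARRAY");
  };
  for (auto it = deps.rbegin(); it != deps.rend(); ++it) init_one(*it);
  init_one(exe);
  return seq;
}
```

Setting has_init to false for every object models a port where DT_INIT is not run, leaving only the DT_INIT_ARRAY steps.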
As a new ABI, glibc's RISC-V port doesn't define ELF_INITFINI, so DT_INIT does not run.
.ctors to .init_array transition
In 2010-12, Mike Hommey filed Replace .ctors/.dtors with .init_array/.fini_array on targets supporting them which I believe was related to his ELF hack work for Firefox.
Switching sections needed to consider backward compatibility: how to handle old object files using .ctors sections. In RFC: Support mixing .init_array.* and .ctors.* input sections, H.J. Lu proposed that the internal linker script of GNU ld could be changed to place .ctors .ctors.N .init_array .init_array.N input sections into the .init_array output section.
With this GNU ld support, GCC 4.7 made the switch.
% clang -fpic -shared b.cc -o b.so
gold doesn't have the concept of an internal linker script. Ian Lance Taylor added the enabled-by-default linker option --ctors-in-init-array to emulate the GNU ld behavior.
Since .ctors is rare, ld.lld does not implement converting .ctors into .init_array.
Linux remnant of .ctors in 2021
If you don't use prebuilt object files from GCC<4.7, it is difficult to see .ctors on Linux in 2021. However, I found two exceptions.
First, a libgcc file for the split stack implementation had .ctors.65535 assembly code. I filed morestack.S should support .init_array.0 besides .ctors.65535, which was fixed in 2021-10.
Second, GCC cross compilers targeting Linux did not enable --enable-initfini-array. H.J. Lu reported --enable-initfini-array should be enabled for cross compiler to Linux and fixed it for GCC 12. This affected GCC 11 builds by scripts/build-many-glibcs.py.
C++ dynamic initialization
In a typical C++ object, most .init_array elements are dynamic initializations, so I will spend some paragraphs describing them.
The standard defines the order for various initializations.
- Constant initialization and zero initialization
- Dynamic initialization
- main
- Deferred dynamic initialization (e.g. optimized out, on-demand shared library)
Dynamic initialization has three types with different degrees of order guarantee:
- Unordered dynamic initialization (static data members of class templates and variable templates that are not explicitly specialized)
- Partially-ordered dynamic initialization (inline variables that are not an implicitly or explicitly instantiated specialization)
- Ordered dynamic initialization (other non-local variables)
Basically, in one translation unit, the order of dynamic initializations usually matches the intuition, e.g. a's initializations happen before b's below.
struct A { A(); };
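A minimal sketch of ordered dynamic initialization within one translation unit follows. The counter next_seq and the seq members are additions made for this illustration so that the order can be observed.

```cpp
#include <cassert>

// Constant-initialized counter, so the constructors below can use it safely.
static int next_seq = 0;

struct A {
  int seq;
  A() : seq(next_seq++) {}
};
struct B {
  int seq;
  B() : seq(next_seq++) {}
};

A a;  // defined first in this translation unit, so initialized first
B b;  // defined second, so initialized after a
```

Both variables have ordered initialization, so a receives the smaller sequence number.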
C++ static initialization order fiasco
If no appearance-ordered relationship is defined, we say that two initializations are indeterminately sequenced. Relying on a particular order is called "static initialization order fiasco". (I don't know what "static" refers to. Perhaps it refers to static variables or static storage duration.)
Below is a registry example. The order in which a, b, and c are registered depends on the link order. If somehow only one order works, then the program may be brittle.
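A hypothetical registry in this spirit could look like the sketch below. It is an illustration written for this passage rather than the original listing, and Plugin, registry_head, registry_tail, and registered_names are invented names. The list head is constant-initialized, so registering is always safe; what varies with the link order is the sequence in which the entries appear.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Intrusive registry node. Each instance appends itself on construction.
struct Plugin {
  const char* name;
  Plugin* next;
  explicit Plugin(const char* n);
};

// Constant-initialized list head and tail: valid before any dynamic initializer.
static Plugin* registry_head = nullptr;
static Plugin** registry_tail = &registry_head;

Plugin::Plugin(const char* n) : name(n), next(nullptr) {
  *registry_tail = this;  // append, preserving registration order
  registry_tail = &next;
}

// In a real program these definitions would live in a.cc, b.cc, and c.cc,
// and their relative order would follow the link order of the object files.
static Plugin plugin_a("a");
static Plugin plugin_b("b");
static Plugin plugin_c("c");

// Returns the names in the order in which the plugins registered themselves.
std::vector<std::string> registered_names() {
  std::vector<std::string> names;
  for (Plugin* p = registry_head; p != nullptr; p = p->next)
    names.push_back(p->name);
  return names;
}
```

Within this single file the order is a, b, c. Code that only works for that particular order is the brittle case described above.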
Fixing such bugs requires thoughts on the initialization order. Basically one needs to do one of the following:
- constant initialization
- lazy initialization (dynamic initialization of function-local static, llvm::ManagedStatic, etc)
- manual initialization
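The lazy option can be sketched with a function-local static. The names lazy_registry and Registrar are invented for this example. The container is constructed the first time the accessor is called, so it exists before any registrant uses it, whichever translation unit's initializers happen to run first.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Constructed on first use rather than at a link-order-dependent point.
std::vector<std::string>& lazy_registry() {
  static std::vector<std::string> names;
  return names;
}

// Registers a name during dynamic initialization of the enclosing variable.
struct Registrar {
  explicit Registrar(const char* name) { lazy_registry().push_back(name); }
};

// These could be spread over several translation units without risk of
// touching an unconstructed container.
static Registrar reg_x("x");
static Registrar reg_y("y");
```

This removes the dependency on the container being initialized first. The order of the registered entries can still vary with the link order.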
Some ways to prevent static initialization order fiasco:
- constexpr
- constinit (constexpr - const)
- clang -Wglobal-constructors: warning: declaration requires a global constructor [-Wglobal-constructors]
AddressSanitizer check_initialization_order is enabled by default due to strict_init_order. It enforces that a dynamic initialization does not touch memory regions of other global variables. Unfortunately, in practice it misses many cases.
ld.lld --shuffle-sections=.init_array=-1
Rafael Espindola added --shuffle-sections, motivated by stabilizing tests. I changed the option to apply to .init_array/.fini_array as well and later changed it to the current form: --shuffle-sections=<section-glob>=<seed>: shuffle matched input sections using the given seed before mapping them to the output sections. I specialized the seed value -1 to mean the deterministic reversed order.
In practice, most static initialization order fiasco bugs are due to the order between two translation units. For a mostly statically linked executable, testing the regular order and the reversed order is sufficient to catch such bugs.
However, for a program with many shared objects, the checks may be insufficient. An executable and all its DT_NEEDED transitive dependencies form a dependency graph. The generic ABI requirement "if a.so depends on b.so, then b.so's initialization functions run first" imposes some orders in the graph.
Suppose the executable a depends on b.so and c.so, and b.so and c.so are unrelated. We may consider it a bug if c.so's initialization functions need to run before b.so's, but the linker cannot do anything to improve the checks. There is also no ld.so feature altering the order.