C/C++ projects can use precompiled headers to reduce compile time. GCC added support for precompiled headers in 2003 (version 3.4), and the current documentation can be found at https://gcc.gnu.org/onlinedocs/gcc/Precompiled-Headers.html.
Even with the emergence of C++ modules, precompiled headers remain relevant for several reasons:
- Precompiled headers share implementation aspects with modules (e.g., AST serialization in Clang).
- Many C++ projects rely on the traditional compilation model and are not converted to C++ modules.
- Modules may use some preamble-like technology to accelerate IDE-centric operations.
- C doesn't have C++ modules.
This article focuses on Clang precompiled headers (PCH). Let's begin with an example.
1 | cat > a1.cc <<'eof' |
We compile b.hh using -c, just like we would compile a non-header file. Clang parses the file, performs semantic analysis, and writes the precompiled header (as a serialized AST file) into b.hh.pch.
When compiling a.cc, we use -include-pch to load b.hh.pch as a prefix header. This means that the translation unit sees b.hh twice: once from b.hh.pch and once from the textual #include of b.hh. The same applies to a1.cc.
To avoid a "redefinition of 'fb'" error, b.hh should have a header guard or use #pragma once.
Now, let's examine the steps in detail.
PCH generation
Given a header file as input, Clang determines the input type as either c-header (.h) or c++-header (.hh/.hpp/.hxx) based on the file extension.
For compilation actions, either clang or clang++ can be used. If we treat .h as a C++ header, we need to specify -xc++-header (e.g., clang -c -xc++-header b.h -o b.h.pch). (It's worth noting that the behavior of clang++ -c a.h, which treats the .h file as a C++ header, is deprecated. Other than that, the only significant difference between clang and clang++ is the linking process, specifically whether the C++ standard library is linked.)
When the input type is c-header or c++-header, Clang Driver selects the -emit-pch frontend action. (Note: c++-user-header/c++-system-header are used for C++ modules and have different functionality.)
Conventionally, the extension used for Clang precompiled headers is .pch (similar to MSVC). However, to match GCC, when the -o option is omitted, the default output file is input_file + ".gch" (see Driver::GetNamedOutputPath). For example, when the input file is d/b.hh, the output is d/b.hh.gch. This differs from d/b.cc, whose output goes to b.o in the current directory.
The frontend parses the file, performs semantic analysis, and writes the precompiled header (as a serialized AST file) (see PCHGenerator). For the serialized format, refer to Precompiled Header and Modules Internals.
Using PCH
-include-pch b.hh.pch (PreprocessorOptions::ImplicitPCHInclude) loads the precompiled header b.hh.pch as a prefix header.
We can also write -include b.hh, and Clang will probe b.hh.pch/b.hh.gch and use the file if present. This is a behavior ported from GCC.
-include and -include-pch may specify a directory. Clang will search for a suitable precompiled header in the directory (see ASTReader::isAcceptableASTFile). The directory may contain precompiled headers for different compiler options. This is another behavior ported from GCC.
1 | echo 'extern int X;' > d.hh |
1 | % clang++ -c -DX=z -include d.hh e.cc |
PCH validation
When we generate and use a precompiled header with different compiler options, the behavior will be a combination of those options. Consequently, the behavior of -include b.hh may differ depending on the presence of b.hh.pch/b.hh.gch.
To identify this common pitfall, Clang performs PCH validation (see PCHValidator) to check for inconsistent options, similar to how MSVC handles it. The validated options include those that can affect AST generation, such as language options (driver -std=, -fPIC, -fPIE), target options (driver --target), file system options, header search options, and preprocessor options.
Modules employ the same validation mechanism, but PCH validation is stricter (!AllowCompatibleConfigurationMismatch). This means that COMPATIBLE_LANGOPT/COMPATIBLE_ENUM_LANGOPT/COMPATIBLE_VALUE_LANGOPT options (e.g., whether the built-in macro __OPTIMIZE__ is defined) must match as well.
If either the precompiled header or the user code is compiled with a -D option, the other side should either use the same -D option or omit it entirely.
1 | clang -c b.hh -o b.hh.pch -DB=1 |
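The -D rule can be tried out with a sketch like the following. The file contents are assumptions; only the macro B and the file names come from the text. The last compile is expected to be rejected by PCH validation because B has a conflicting definition.

```shell
# Assumed contents: only the names b.hh, a.cc, and fb come from the text.
cat > b.hh <<'eof'
#ifndef B_HH
#define B_HH
inline int fb() { return 42; }
#endif
eof
echo 'int main() { return fb() == 42 ? 0 : 1; }' > a.cc

clang++ -c b.hh -o b.hh.pch -DB=1

# Same -D on both sides: accepted.
clang++ -c -include-pch b.hh.pch -DB=1 a.cc -o same.o
# -D omitted by the user code: accepted.
clang++ -c -include-pch b.hh.pch a.cc -o omitted.o
# Conflicting definition: expected to fail PCH validation.
if clang++ -c -include-pch b.hh.pch -DB=2 a.cc -o differ.o 2>/dev/null; then
  result=accepted
else
  result=rejected
fi
echo "-DB=2: $result"
```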
As an escape hatch, -Xclang -fno-validate-pch disables PCH validation.
Performance optimization
In order to achieve better performance, it is possible to make certain compromises on properties such as language standard conformance.
-fpch-instantiate-templates
-fpch-instantiate-templates allows pending template instantiations to be performed in the PCH file. This means that these instantiations do not need to be repeated in every translation unit that includes the PCH file. This optimization can significantly improve the speed of certain projects. However, the option changes the point of instantiation for certain function templates, which is non-conforming. Nevertheless, the altered behavior is harmless in most cases.
1 | #ifndef HEADER |
1 | % clang++ -c -xc++-header a.cc -o a.pch |
-fpch-instantiate-templates is the default when using the clang-cl driver mode, as MSVC appears to have a similar behavior.
Modular code generation was initially implemented for modules. It was later extended to support precompiled headers in Clang 11. To utilize this feature, you can specify -Xclang -fmodules-codegen as a command-line option or use the driver option -fpch-codegen.
When generating a serialized AST file for PCH or modules, Clang identifies non-always-inline functions that do not depend on template parameters and have linkages other than GVA_Internal or GVA_AvailableExternally. These functions are then serialized (see ASTWriter::ModularCodegenDecls).
In an importer that encounters such a definition, the linkage is adjusted to GVA_AvailableExternally. This allows for discarding of the definition if it is not required for optimizations.
Let's consider an example using the files a.cc, a1.cc, and b.hh from the initial example provided at the beginning of this article.
1 | echo 'module b { header "b.hh" }' > module.modulemap |
Both a.cc and a1.cc include b.hh and obtain an inline definition of fb. In a regular build, the fb definition has GVA_DiscardableODR linkage and is compiled twice into a.o and a1.o. These duplicate definitions are then deduplicated by the linker, following COMDAT semantics.
In a modular code generation build, fb is assigned GVA_StrongODR linkage in b.pcm and is emitted into b.o. The copies of fb in a.cc and a1.cc are adjusted to GVA_AvailableExternally. They are used for optimizations by callers but are not emitted otherwise. In a -O0 build, the GVA_AvailableExternally definitions are simply discarded. Regardless, both the code generator and the linker have reduced work, resulting in decreased build time.
However, there are two primary differences in behavior.
First, if b.hh contains a GVA_StrongExternal definition, a regular build will encounter a linker error due to a duplicate symbol. However, in the prebuilt modules build using -fmodules-codegen, this error does not occur.
Second, in a regular build, if fb is unused, no translation unit will contain its COMDAT definition. On the other hand, in the prebuilt modules build using -fmodules-codegen, we compile the prebuilt module b.pcm into b.o and link b.o into the executable, always resulting in a definition of fb, even when using -O1 or above. To discard fb, linker garbage collection can be leveraged by using -Wl,--gc-sections (-ffunction-sections is unneeded even for ELF targets, since COMDAT functions are emitted in separate sections anyway).
If b.hh contains an inline variable with an initializer involving a side effect (e.g., inline int vb = (puts("vb"), 1);), the modular code generation build will always observe the side effect. In contrast, a regular build may not observe the side effect if, for example, the containing header is not included in any translation unit.
Nevertheless, these behavior differences are almost always benign, and the speedup gained in build time may outweigh the downsides.
Note: Clang exhibits a similar behavior when compiling module interface units and module partitions for strong definitions.
Modular code generation has been extended to support PCH in Clang 11.
We specify -fpch-codegen to pass -fmodules-codegen to the frontend.
1 | clang++ -c -fpch-codegen b.hh -o b.hh.pch |
Compared to using PCH without -fpch-codegen, using -fpch-codegen additionally requires compiling the PCH file b.hh.pch into b.o and linking b.o into the executable. If b.o is not linked, a linker error for an undefined fb() will occur.
Here is a CMake feature request for -fpch-codegen.
-Xclang -fmodules-debuginfo
The cc1 option -fmodules-debuginfo serves a similar purpose to -fmodules-codegen, but specifically for debug information descriptions of non-dependent CXXRecordDecl. When using -g, the compiled PCH or module file emits type definitions while importers just emit declarations (DIFlagFwdDecl).
-fpch-debuginfo instructs Clang Driver to pass -fmodules-debuginfo to the frontend.
Here is an example of using MSVC-style precompiled headers with clang-cl (a CL-style driver mode), using the files a.cc, a1.cc, and b.hh from the initial example provided at the beginning of this article.
1 | clang-cl /c /Ycb.hh a.cc |
The /Ycb.hh option instructs the Clang Driver to perform two frontend actions. First, Clang parses the base source file a.cc up to and including #include "b.hh", performs semantic analysis, and writes the precompiled header into b.pch. It replaces the header file extension with .pch, unlike GCC. Second, Clang compiles a.cc using -include-pch b.pch, but it skips preprocessing tokens up to and including #include "b.hh" (see Preprocessor::SkippingUntilPCHThroughHeader).
The /Yub.hh option is similar to the second frontend action of /Ycb.hh. It compiles a1.cc using -include-pch b.pch, and it also skips preprocessing tokens up to and including #include "b.hh".
Internally, /Ycb.hh and /Yub.hh instruct the driver to pass -pch-through-header=b.hh to the frontend. This helps Clang detect common pitfalls by examining whether the source file contains the #include "b.hh" directive.
It is also possible to use /Yc and /Yu without specifying a filename. In this case, the precompiled header region is determined by #pragma hdrstop or the end of the source file. For more details, refer to /Yc (Create Precompiled Header File).
In MSVC, an inline function with the dllexport attribute is exported, whether or not it is referenced. LLVM IR models such a definition with the weak_odr linkage. When using clang-cl /Yc, the cc1 option -building-pch-with-obj is passed to the frontend. This option instructs the frontend to serialize inline functions with the dllexport attribute, similar to -fmodules-codegen. In an importer that encounters such a definition, the linkage is adjusted to GVA_AvailableExternally.
See also /Zc:dllexportInlines-.
TODO: PCH signature and linker
Precompiled preamble
During IDE-centric operations, such as code completion and reporting diagnostics, an approach called precompiled preamble (introduced to Clang in 2010) is used to avoid re-parsing the entire file after each change. The preamble is the initial region of the source file that contains only preprocessor directives and comments. It is precompiled and then reused to speed up subsequent reparsing.
A precompiled preamble can be serialized to disk, similar to precompiled headers.
-Xclang -fno-pch-timestamp
By default, the generated PCH file records the modification time of the headers it depends on. When using the PCH file, PCH validation checks whether the included headers have changed based on their sizes and modification times.
1 | clang -c b.hh -o b.hh.pch |
This timestamp safeguard, however, renders builds non-reproducible and can cause inconvenience for distributed build systems.
The cc1 option -fno-pch-timestamp can be used to disable timestamps for PCH generation and bypass the timestamp validation.
For distributed build systems, precompiled headers have another major issue: they are usually much larger than the source they are built from, so the added network traffic can offset some of the build-time gains.