Clang is a C/C++ compiler that generates LLVM IR and uses LLVM to generate relocatable object files. Using the classic three-stage compiler structure, the stages can be described as follows:
C/C++ =(front end)=> LLVM IR =(middle end)=> LLVM IR (optimized) =(back end)=> relocatable object file
If we follow the representation of functions and instructions, a more detailed diagram looks like this:
C/C++ =(front end)=> LLVM IR =(middle end)=> LLVM IR (optimized) =(instruction selector)=> MachineInstr =(AsmPrinter)=> MCInst =(assembler)=> relocatable object file
LLVM and Clang are designed as a collection of libraries. This post describes how different libraries work together to create the final relocatable object file. I will focus on how a function goes through the multiple compilation stages.
Compiler frontend
The compiler frontend primarily comprises the following libraries:
- clangDriver
- clangFrontend
- clangParse and clangSema
- clangCodeGen
Let's use a C++ source file as an example.
% cat a.cc
The entry point of the Clang executable is implemented in clang/tools/driver/. clang_main creates a clang::driver::Driver instance, calls BuildCompilation to construct a clang::driver::Compilation instance, and then calls ExecuteCompilation.
clangDriver
clangDriver parses the command line arguments, constructs compilation actions, assigns actions to tools, generates commands for these tools, and executes the commands.
You may read "Compiler driver and cross compilation" for additional information.
BuildCompilation
For clang++ -g a.cc, clangDriver identifies the following phases: preprocessor, compiler (C++ to LLVM IR), backend, assembler, and linker. The first four phases can be performed by a single clang::driver::tools::Clang object (also known as Clang cc1), while the final phase requires an external program (the linker).
% clang++ -g a.cc '-###'
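The grouping of phases into jobs can be pictured with a toy model. This is only a sketch under my own assumptions: Phase, Job, and planJobs are made-up names, not clangDriver APIs, and the real driver decides this through its Action and Tool classes with many more inputs (options, toolchain, integrated assembler availability, and so on).

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified model of phase-to-job grouping.
// None of these names exist in clangDriver.
enum class Phase { Preprocess, Compile, Backend, Assemble, Link };

struct Job {
  std::string tool;           // program that runs the job
  std::vector<Phase> phases;  // phases folded into this job
};

// Fold every phase that the cc1 process can perform into one job and
// give the link phase to an external linker.
std::vector<Job> planJobs(const std::vector<Phase> &phases) {
  std::vector<Job> jobs;
  Job cc1{"clang -cc1", {}};
  for (Phase p : phases) {
    if (p == Phase::Link) {
      if (!cc1.phases.empty()) {
        jobs.push_back(cc1);
        cc1.phases.clear();
      }
      jobs.push_back(Job{"linker", {p}});
    } else {
      cc1.phases.push_back(p);
    }
  }
  if (!cc1.phases.empty())
    jobs.push_back(cc1);
  return jobs;
}
```

Calling planJobs with all five phases yields two jobs in this model: one cc1 job covering the first four phases and one linker job.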
cc1_main in clangDriver calls ExecuteCompilerInvocation defined in clangFrontend.
clangFrontend
clangFrontend defines CompilerInstance, which manages various classes, including CompilerInvocation, DiagnosticsEngine, TargetInfo, FileManager, SourceManager, Preprocessor, ASTContext, ASTConsumer, and Sema.
ExecuteCompilerInvocation
In ExecuteCompilerInvocation, a FrontendAction is created based on the CompilerInstance argument and then executed. When using the -emit-obj option, the selected FrontendAction is an EmitObjAction, which is derived from CodeGenAction.
During FrontendAction::BeginSourceFile, several classes mentioned earlier are created, and a BackendConsumer is also created. The BackendConsumer serves as a wrapper around CodeGenerator, which is another derivative of ASTConsumer. Finally, in FrontendAction::BeginSourceFile, CompilerInstance::setASTConsumer is called to create a CodeGenModule object, responsible for managing an LLVM IR module.
In FrontendAction::Execute, CodeGenAction::ExecuteAction is invoked. Its own logic primarily handles the compilation of LLVM IR input files. For source input, it calls the base function ASTFrontendAction::ExecuteAction, which, in essence, calls the entry point of clangParse: ParseAST.
clangParse and clangSema
clangParse consumes tokens from clangLex and invokes parser actions, many of which are named Act*, defined in clangSema. clangSema performs semantic analysis and generates AST nodes.
ParseAST
In the end, we get a full AST (actually a misnomer as the representation is not abstract, not only about syntax, and is not a tree). ParseAST calls the virtual functions HandleTopLevelDecl and HandleTranslationUnit.
clangCodeGen
BackendConsumer, defined in clangCodeGen, overrides HandleTopLevelDecl and HandleTranslationUnit to perform LLVM IR and machine code generation.
BackendConsumer::HandleTopLevelDecl
BackendConsumer::HandleTopLevelDecl generates LLVM IR for each top-level declaration. This means that Clang generates IR one function at a time.
BackendConsumer::HandleTranslationUnit invokes EmitBackendOutput to create an LLVM IR file, an assembly file, or a relocatable object file. EmitBackendOutput establishes an optimization pipeline and a machine code generation pipeline.
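The callback structure between the parser and the consumer can be sketched with toy stand-ins. The types below are my own simplifications, not the real classes: the actual HandleTopLevelDecl and HandleTranslationUnit take different parameters (a declaration group and the AST context), and ToyBackendConsumer and toyParseAST are hypothetical names.

```cpp
#include <string>
#include <vector>

// Toy stand-ins; the real classes have far richer interfaces and
// different signatures.
struct Decl {
  std::string name;
};

struct ASTConsumer {
  virtual ~ASTConsumer() = default;
  virtual void HandleTopLevelDecl(const Decl &) {}
  virtual void HandleTranslationUnit() {}
};

// Plays the role of BackendConsumer: lower each declaration as it
// arrives, then run the backend once at the end of the translation unit.
struct ToyBackendConsumer : ASTConsumer {
  std::vector<std::string> log;
  void HandleTopLevelDecl(const Decl &d) override {
    log.push_back("emit IR for " + d.name);
  }
  void HandleTranslationUnit() override {
    log.push_back("optimize and emit object file");
  }
};

// Plays the role of ParseAST: hand over declarations one at a time and
// then signal the end of the translation unit.
void toyParseAST(const std::vector<Decl> &decls, ASTConsumer &consumer) {
  for (const Decl &d : decls)
    consumer.HandleTopLevelDecl(d);
  consumer.HandleTranslationUnit();
}
```

The point of the sketch is the ordering: IR is produced per declaration while parsing is still in progress, whereas optimization and machine code generation run once, after the last declaration.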
Now let's explore CodeGenFunction::EmitFunctionBody. Generating IR for a variable declaration and a return statement involves the following functions, among others:
EmitFunctionBody
After generating the LLVM IR, clangCodeGen proceeds to execute EmitAssemblyHelper::RunOptimizationPipeline to perform middle-end optimizations and subsequently EmitAssemblyHelper::RunCodegenPipeline to generate machine code.
Compiler middle end
EmitAssemblyHelper::RunOptimizationPipeline creates a pass manager to schedule the middle-end optimization pipeline. This pass manager executes numerous optimization passes and analyses.
The option -mllvm -print-pipeline-passes provides insight into these passes:
% clang -c -O1 -mllvm -print-pipeline-passes a.c
Compiler back end
The boundary between the middle end and the back end is not entirely clear-cut. Within LLVMTargetMachine::addPassesToEmitFile, several IR passes are scheduled. It's reasonable to consider these IR passes as part of the middle end, while the phase beginning with instruction selection can be regarded as the actual back end.
Here is an overview of LLVMTargetMachine::addPassesToEmitFile:
LLVMTargetMachine::addPassesToEmitFile
These IR and machine passes are scheduled by the legacy pass manager. The option -mllvm -debug-pass=Structure provides insight into these passes:
clang -c -O1 a.c -mllvm -debug-pass=Structure
Instruction selector
There are three instruction selectors: SelectionDAG, FastISel, and GlobalISel. FastISel is integrated within the SelectionDAG framework.
For most targets, FastISel is the default for clang -O0 while SelectionDAG is the default for optimized builds. However, for most AArch64 -O0 configurations, GlobalISel is the default.
SelectionDAG
See https://llvm.org/docs/WritingAnLLVMBackend.html#instruction-selector.
SelectionDAG: normal code path
TargetPassConfig::addCoreISelPasses
Each backend implements a derived class of SelectionDAGISel. For example, the X86 backend implements X86DAGToDAGISel and overrides runOnMachineFunction to set up variables like X86Subtarget and then invokes the base function SelectionDAGISel::runOnMachineFunction.
SelectionDAGISel creates a SelectionDAGBuilder. For each basic block, SelectionDAGISel::SelectBasicBlock iterates over all IR instructions and calls SelectionDAGBuilder::visit on them, creating a new SDNode for each Value that becomes part of the DAG.
The initial DAG may contain types and operations that are not natively supported by the target. SelectionDAGISel::CodeGenAndEmitDAG invokes LegalizeTypes and Legalize to convert unsupported types and operations to supported ones.
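To give a feel for the type half of legalization, here is a rough sketch for integer widths only. It is a toy under my own assumptions, not LLVM's algorithm: LegalizeAction, LegalizeResult, and legalizeIntegerWidth are hypothetical names, widths are assumed to be powers of two when they are split, and the real legalizer handles vectors, floating point, and per-operation actions as well.

```cpp
#include <algorithm>
#include <vector>

enum class LegalizeAction { Legal, Promote, Expand };

struct LegalizeResult {
  LegalizeAction action;
  unsigned bits;  // width of the type to use after this step
};

// legalWidths: integer widths the hypothetical target supports natively,
// sorted in ascending order and non-empty.
LegalizeResult legalizeIntegerWidth(unsigned bits,
                                    const std::vector<unsigned> &legalWidths) {
  if (std::find(legalWidths.begin(), legalWidths.end(), bits) !=
      legalWidths.end())
    return {LegalizeAction::Legal, bits};
  // Narrower than some legal width: widen to the smallest one that fits.
  for (unsigned w : legalWidths)
    if (w > bits)
      return {LegalizeAction::Promote, w};
  // Wider than every legal width: split in half. A caller would then
  // legalize each half again.
  return {LegalizeAction::Expand, bits / 2};
}
```

For a hypothetical target whose only legal integer width is 32, an 8-bit value is promoted to 32 bits and a 64-bit value is expanded into two 32-bit halves.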
For llvm.memset, the call stack may resemble the following:
SelectionDAGBuilder::visit
ScheduleDAGSDNodes::EmitSchedule emits the machine code (MachineInstrs) in the scheduled order.
FastISel, typically used for clang -O0, is a fast path of SelectionDAG that generates less optimized machine code.
When FastISel is enabled, SelectAllBasicBlocks tries to skip SelectBasicBlock and select instructions with FastISel. However, FastISel only handles a subset of IR instructions. For unhandled instructions, SelectAllBasicBlocks falls back to SelectBasicBlock to handle the remaining instructions in the basic block.
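The fallback behaviour can be illustrated with a small model. This is a simplification under my own assumptions: SelectionTrace and selectBlock are hypothetical names, instructions are reduced to opcode strings, and the sketch ignores details of the real SelectAllBasicBlocks such as the order in which it walks a block and its special handling of some instructions.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

struct SelectionTrace {
  std::vector<std::string> fast;  // opcodes selected by the fast path
  std::vector<std::string> dag;   // opcodes left for the DAG path
};

// fastSupported: opcodes the hypothetical fast selector can handle.
SelectionTrace selectBlock(const std::vector<std::string> &block,
                           const std::set<std::string> &fastSupported) {
  SelectionTrace trace;
  std::size_t i = 0;
  // Stay on the fast path while every instruction is supported.
  for (; i < block.size() && fastSupported.count(block[i]); ++i)
    trace.fast.push_back(block[i]);
  // From the first unsupported instruction on, the rest of the block
  // goes to the slower, more general selector.
  for (; i < block.size(); ++i)
    trace.dag.push_back(block[i]);
  return trace;
}
```

In this model, one unsupported instruction in the middle of a block sends all later instructions of that block down the general path, even those the fast path could have handled.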
GlobalISel
GlobalISel is a new instruction selection framework that operates on the entire function, in contrast to the basic block view of SelectionDAG. GlobalISel offers improved performance and modularity.
The generic MachineInstr replaces SDNode, the intermediate representation used by the SelectionDAG framework.
LLVM IR =(IRTranslator)=> generic MachineInstr =(Legalizer,RegBankSelect,GlobalInstructionSelect)=> MachineInstr
TargetPassConfig::addCoreISelPasses
Machine passes
TargetPassConfig::addMachinePasses
AsmPrinter
This target-specific AsmPrinter pass converts MachineInstrs to MCInsts and emits them to an MCStreamer.
MC
Clang has the capability to output either assembly code or an object file. Generating an object file directly without involving an assembler is referred to as "direct object emission".
To provide a unified interface, MCStreamer is created to handle the emission of both assembly code and object files. The two primary subclasses of MCStreamer are MCAsmStreamer and MCObjectStreamer, responsible for emitting assembly code and machine code respectively.
LLVMAsmPrinter calls the MCStreamer API to emit assembly code or machine code.
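The streamer abstraction can be sketched with toy classes. Everything here is a stand-in of my own: ToyInst, ToyStreamer, ToyAsmStreamer, ToyObjectStreamer, and emitFunction are hypothetical names, the instruction carries its encoding up front instead of going through a code emitter, and the real MCStreamer interface is much larger.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Toy instruction: a printable form plus an already-known encoding.
struct ToyInst {
  std::string text;
  std::vector<std::uint8_t> bytes;
};

struct ToyStreamer {
  virtual ~ToyStreamer() = default;
  virtual void emitLabel(const std::string &name) = 0;
  virtual void emitInstruction(const ToyInst &inst) = 0;
};

// Counterpart of MCAsmStreamer: produce textual assembly.
struct ToyAsmStreamer : ToyStreamer {
  std::string out;
  void emitLabel(const std::string &name) override { out += name + ":\n"; }
  void emitInstruction(const ToyInst &inst) override {
    out += "\t" + inst.text + "\n";
  }
};

// Counterpart of MCObjectStreamer: append encoded bytes and remember
// where each label landed.
struct ToyObjectStreamer : ToyStreamer {
  std::vector<std::uint8_t> section;
  std::map<std::string, std::size_t> symbols;
  void emitLabel(const std::string &name) override {
    symbols[name] = section.size();
  }
  void emitInstruction(const ToyInst &inst) override {
    section.insert(section.end(), inst.bytes.begin(), inst.bytes.end());
  }
};

// The producer is written once against the base interface and works
// with either streamer.
void emitFunction(ToyStreamer &s) {
  s.emitLabel("foo");
  s.emitInstruction({"xorl %eax, %eax", {0x31, 0xc0}});
  s.emitInstruction({"retq", {0xc3}});
}
```

Passing a ToyAsmStreamer to emitFunction collects text, while passing a ToyObjectStreamer collects three bytes and a symbol offset. The producer does not need to know which output was requested.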
In the case of an assembly input file, LLVM creates an MCAsmParser object (LLVMMCParser) and a target-specific MCTargetAsmParser object. The MCAsmParser is responsible for tokenizing the input, parsing assembler directives, and invoking the MCTargetAsmParser to parse an instruction.
Both the MCAsmParser and MCTargetAsmParser objects can call the MCStreamer API to emit assembly code or machine code.
For an instruction parsed by the MCTargetAsmParser, if the streamer is an MCAsmStreamer, the MCInst will be pretty-printed. If the streamer is an MCELFStreamer (other object file formats are similar), MCELFStreamer::emitInstToData will use ${Target}MCCodeEmitter from LLVM${Target}Desc to encode the MCInst, emit its byte sequence, and record the needed relocations. An ELFObjectWriter object is used to write the relocatable object file.
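As an illustration of "emit bytes and record relocations", here is a toy encoder for a single x86-64 instruction. The names ToyReloc, ToySection, and emitCall are hypothetical and the structure is my own simplification, not the MCCodeEmitter or ELFObjectWriter interface. The sketch assumes a call to a symbol that is unknown at assembly time.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct ToyReloc {
  std::size_t offset;   // where in the section the field to patch starts
  std::string symbol;   // symbol the linker must resolve
  std::string type;     // relocation type name
  std::int64_t addend;
};

struct ToySection {
  std::vector<std::uint8_t> data;
  std::vector<ToyReloc> relocs;
};

// Encode an x86-64 `call symbol` whose target is not yet known:
// opcode 0xE8 followed by a 32-bit displacement left as zero.
void emitCall(ToySection &sec, const std::string &symbol) {
  sec.data.push_back(0xE8);
  // The displacement is relative to the end of the instruction, which
  // is 4 bytes past the start of the field, hence the addend of -4.
  sec.relocs.push_back({sec.data.size(), symbol, "R_X86_64_PLT32", -4});
  sec.data.insert(sec.data.end(), std::size_t{4}, std::uint8_t{0});
}
```

After emitCall(sec, "bar") on an empty section, the section holds five bytes and one relocation at offset 1. A linker would later patch the four zero bytes.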
You may read my post "Assemblers" for more information about the LLVM integrated assembler.