-march, -mcpu, and -mtune
2022-08-28 15:00:00 Author: maskray.me

In GCC and Clang, there are three major options specifying the architecture and microarchitecture the generated code can run on. The general semantics are described below, but each target machine may assign different semantics.

  • -march=X: (execution domain) Generate code that can use instructions available in the architecture X
  • -mtune=X: (optimization domain) Optimize for the microarchitecture X, without changing the ABI or making assumptions about available instructions
  • -mcpu=X: Specify both -march= and -mtune=; an explicit -march= or -mtune= still overrides the corresponding part. The supported values are generally the same as those of -mtune=. The architecture name is inferred from X

AArch32

See https://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html.

AArch32 follows the general description.

-march=name[+extension...] may specify architecture extensions.

AArch64

See https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html.

AArch64 follows the general description.

PowerPC

See https://gcc.gnu.org/onlinedocs/gcc/RS_002f6000-and-PowerPC-Options.html.

-march= is not implemented in GCC.

% powerpc64le-linux-gnu-gcc -fsyntax-only -march=power10 a.c
powerpc64le-linux-gnu-gcc: error: unrecognized command-line option ‘-march=power10’; did you mean ‘-mcpu=power10’?

RISC-V

RISC-V follows the general description.

-mtune= must specify an element in gcc/config/riscv/riscv-cores.def. The default -mtune= value is RISCV_TUNE_STRING_DEFAULT.

-mcpu= does not seem to infer -march= in GCC.

x86

See https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html.

-mcpu= has been deprecated since 2003-02. It is an alias for -mtune=.

-march= specifies a cpu-type. cpu-type is a microarchitecture name (e.g. skylake, znver3), a microarchitecture level (x86-64, x86-64-v2, x86-64-v3, x86-64-v4), or native. -march=native enables all instruction subsets supported by the local machine and recognized by the GCC version.

If -mtune= is not specified, GCC tunes for the -march= value or for generic.

-march= and -mtune= must specify an element in gcc/common/config/i386/i386-common.cc:processor_alias_table. -mtune= must specify an element without the PTA_NO_TUNE flag.

-mtune= decides features in gcc/config/i386/i386-options.cc:ix86_tune_features:

% gcc -c -mdump-tune-features a.c
List of x86 specific tuning parameter names:
schedule : on
partial_reg_dependency : on
sse_partial_reg_dependency : on
sse_split_regs : off
...

-mtune= defines some built-in macros __tune_* (see gcc/config/i386/i386-c.cc:ix86_target_macros_internal).

x86-64, x86-64-v2, x86-64-v3, x86-64-v4

The x86-64 psABI defines some microarchitecture levels. They form a very small set of choices with vendor-neutral names (i.e. not tied to an AMD or Intel microarchitecture name), suitable as a Linux distribution default. See "New x86-64 micro-architecture levels" for the original proposal on libc-alpha.

Recommendation

For the simplest case where only one option is desired, use -march= for x86 and -mcpu= for other targets.

When optimizing for the local machine, just use -march=native for x86 and -mcpu=native for other targets.

When the architecture and subarchitecture are both specified, i.e. when both the execution domain and the optimization domain need to be specified, specify -march= and -mtune=, and avoid -mcpu=. On PowerPC, specify both -mcpu= and -mtune=.


Source: https://maskray.me/blog/2022-08-28-march-mcpu-mtune