49

AMD, Intel, Red Hat, and SUSE have defined a set of "architecture levels" for x86-64 CPUs. For example, x86-64-v2 means that a CPU supports not only the basic x86-64 instruction set, but also other instructions such as SSE4.2, SSSE3, or POPCNT.

How can I check which architecture levels are supported by my CPU?

9 Answers

46

This is based on gioele’s answer; the whole script might as well be written in AWK:

    #!/usr/bin/awk -f

    BEGIN {
        while (!/flags/) if (getline < "/proc/cpuinfo" != 1) exit 1
        if (/lm/&&/cmov/&&/cx8/&&/fpu/&&/fxsr/&&/mmx/&&/syscall/&&/sse2/) level = 1
        if (level == 1 && /cx16/&&/lahf/&&/popcnt/&&/sse4_1/&&/sse4_2/&&/ssse3/) level = 2
        if (level == 2 && /avx/&&/avx2/&&/bmi1/&&/bmi2/&&/f16c/&&/fma/&&/abm/&&/movbe/&&/xsave/) level = 3
        if (level == 3 && /avx512f/&&/avx512bw/&&/avx512cd/&&/avx512dq/&&/avx512vl/) level = 4
        if (level > 0) { print "CPU supports x86-64-v" level; exit level + 1 }
        exit 1
    }

This also checks for the baseline (“level 1” here), only outputs the highest supported level, and exits with an exit code matching the first unsupported level.

8
  • 3
    Some of the checking of earlier levels is redundant, but not a bad idea I guess. In real hardware, SSE4.2 for example already implies support for all previous (Intel) SSE versions (but not AMD SSE4a). In a virtual machine, CPUID is virtualized, so it's theoretically possible to indicate SSSE3 support without SSE3. Only in a software emulator could you make SSE3 instructions fault while SSSE3 instructions didn't. (BTW, you omitted /sse3/.) The de-facto standard is that runtime CPU dispatching only needs to check the highest SSE feature flag it depends on. Commented Jan 27, 2021 at 19:02
  • 1
    There are other de-facto standards like SSE4.2 implying popcnt, but that's good to check explicitly. And other non-SIMD extensions like BMI1 are fully independent of SIMD (although since some BMI1/2 instructions use VEX encoding, they're normally only found on CPUs that support AVX. And unfortunately Intel even disables BMI1/2 on their Pentium/Celeron CPUs, perhaps as a way of fully disabling AVX.). Commented Jan 27, 2021 at 19:08
  • 1
    BTW, level 2 = Nehalem and current Silvermont, and current-gen Pentium/Celeron. Also AMD Bulldozer family since even Excavator doesn't have BMI2, only AVX2 and FMA3. Level 3 = Haswell (and Zen), and includes most of the really good stuff. MacOS apparently can make fat binaries with baseline x86-64 and Haswell feature-level, allowing usage of BMI2 efficient shift instructions all over the place, and of AVX everywhere. Level 4 = -march=skylake-avx512. Commented Jan 27, 2021 at 19:12
  • 1
    @PeterCordes yes, there are a number of deficiencies and redundancies here (in particular, I should check full fields instead of using regexes, since for example /lm/ will match anything containing those characters). I followed the exhaustive level definitions as used in the first answer (that’s where /ssse3/ without /sse3/ came from), even though as you say many of them are redundant. (I’ve been following the discussions leading up to the definition of these levels.) Commented Jan 27, 2021 at 19:29
  • 1
    TBH this was more an exercise in showing that all the checks could be done in AWK instead of a mixture of AWK and shell, rather than coming up with the best level checker ;-). Commented Jan 27, 2021 at 19:32
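The substring pitfall raised in these comments (/lm/ matching any flag that merely contains those letters) can be avoided by comparing whole fields. The following is a minimal sketch in POSIX shell, assuming the flags are available as one space-separated string; has_flag is a hypothetical helper name and the sample flag list is made up for illustration.

```shell
#!/bin/sh
# has_flag FLAGS FLAG
# Succeeds only if FLAG appears in FLAGS as a whole space-separated word.
# (has_flag is an illustrative name, not part of the answer's script.)
has_flag() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Made-up sample; a real list would come from the "flags" line of /proc/cpuinfo.
sample="fpu cx8 cmov mmx fxsr sse2 syscall lahf_lm"

has_flag "$sample" lahf_lm && echo "lahf_lm: present"
has_flag "$sample" lm      || echo "lm: absent (a regex such as /lm/ would have matched lahf_lm)"
```

Padding both the list and the pattern with spaces is what turns the substring test into a whole-word test, so no regular expressions are needed.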
37

Originally copied from https://gitlab.archlinux.org/archlinux/rfcs/-/merge_requests/2/diffs

With glibc 2.33 or later (Arch Linux, Debian 12, Ubuntu 21.04, Fedora 34, etc.), or a patched glibc (RHEL 8), you can see which architecture levels are supported by your CPU by running:

    $ /lib/ld-linux-x86-64.so.2 --help
    Subdirectories of glibc-hwcaps directories, in priority order:
      x86-64-v4
      x86-64-v3 (supported, searched)
      x86-64-v2 (supported, searched)

On Debian derivatives the path is different: you need to run /lib64/ld-linux-x86-64.so.2 --help.
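If only the highest level is needed in a script, the listing can be filtered. This is a rough sketch, assuming the glibc-hwcaps list keeps the format shown above (one x86-64-vN entry per line, with "(supported, ...)" appended to usable levels); highest_supported_level is a hypothetical helper name, not something glibc provides.

```shell
#!/bin/sh
# highest_supported_level: read `ld.so --help` output on stdin and print the
# highest N for which an "x86-64-vN (supported, ...)" line is present.
# Prints nothing if no level is marked as supported.
# (Hypothetical helper; it relies on the list format quoted in the answer.)
highest_supported_level() {
    sed -n 's/^[[:space:]]*x86-64-v\([0-9]\{1,\}\)[[:space:]]*(supported.*/\1/p' |
        sort -n | tail -n 1
}

# Example using the output quoted in the answer:
highest_supported_level <<'EOF'
Subdirectories of glibc-hwcaps directories, in priority order:
  x86-64-v4
  x86-64-v3 (supported, searched)
  x86-64-v2 (supported, searched)
EOF
# prints: 3
```

On a real system the input would come from the loader itself, for example `/lib64/ld-linux-x86-64.so.2 --help | highest_supported_level`. A CPU that only meets the baseline would produce no output, since the listing starts at x86-64-v2.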

1
  • 8
    The AMD64 ABI remarks that /lib64/ld-linux-x86-64.so.2 is the standard place (on Linux) for the program interpreter. Commented Jan 17, 2023 at 11:50
15

On Linux, one can check the CPU capabilities reported by /proc/cpuinfo against the requirements described in the x86-psABI documentation.

The following script automates that process (the exit code is the number of the first non-supported architecture level).

    #!/bin/sh -eu

    flags=$(cat /proc/cpuinfo | grep flags | head -n 1 | cut -d: -f2)

    supports_v2='awk "/cx16/&&/lahf/&&/popcnt/&&/sse4_1/&&/sse4_2/&&/ssse3/ {found=1} END {exit !found}"'
    supports_v3='awk "/avx/&&/avx2/&&/bmi1/&&/bmi2/&&/f16c/&&/fma/&&/abm/&&/movbe/&&/xsave/ {found=1} END {exit !found}"'
    supports_v4='awk "/avx512f/&&/avx512bw/&&/avx512cd/&&/avx512dq/&&/avx512vl/ {found=1} END {exit !found}"'

    echo "$flags" | eval $supports_v2 || exit 2 && echo "CPU supports x86-64-v2"
    echo "$flags" | eval $supports_v3 || exit 3 && echo "CPU supports x86-64-v3"
    echo "$flags" | eval $supports_v4 || exit 4 && echo "CPU supports x86-64-v4"
4
  • 2
    Instead of using a variable and evaling it, you could have used a function Commented Jan 27, 2021 at 8:48
  • As an FYI, my old AMD FX-6100 supports v2, but not v3 or v4. Commented Jan 27, 2021 at 18:47
  • 1
    @RonJohn: Yup, even Bulldozer-family is only "level 2", even though Excavator has AVX2 and FMA. It's missing BMI2 and movbe. (Piledriver / Steamroller have AVX1 and FMA; Bulldozer has AVX1 and FMA4 but not FMA3; Intel pulled the rug out from under AMD as late as they could. See Stop the instruction set war on Agner Fog's blog.) To be fair, having another level with AVX but not BMI2 would be of limited value, and BMI2 is quite nice for Intel CPUs: variable-count shifts with SHLX/SHRX are 1 uop instead of 3, and can use any reg instead of CL Commented Jan 27, 2021 at 19:16
  • 1
    Level 3 = Haswell and Zen1. Level 4 = -march=skylake-avx512. Commented Jan 27, 2021 at 19:16
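The suggestion above to use functions instead of eval could look roughly like this. It is a sketch that keeps the answer's awk patterns unchanged (including their substring matching); the report_levels wrapper is a hypothetical addition so that the checks take the flag line as an argument and return a status instead of exiting.

```shell
#!/bin/sh
# Each check reads one line of CPU flags on stdin and succeeds if all of its
# patterns match, like the eval'ed awk commands in the answer's script.
supports_v2() { awk '/cx16/&&/lahf/&&/popcnt/&&/sse4_1/&&/sse4_2/&&/ssse3/ {found=1} END {exit !found}'; }
supports_v3() { awk '/avx/&&/avx2/&&/bmi1/&&/bmi2/&&/f16c/&&/fma/&&/abm/&&/movbe/&&/xsave/ {found=1} END {exit !found}'; }
supports_v4() { awk '/avx512f/&&/avx512bw/&&/avx512cd/&&/avx512dq/&&/avx512vl/ {found=1} END {exit !found}'; }

# report_levels FLAGS
# Prints one line per supported level and returns the number of the first
# unsupported level (2, 3 or 4), or 0 if x86-64-v4 is supported.
# (Hypothetical wrapper, not part of the original answer.)
report_levels() {
    echo "$1" | supports_v2 || return 2
    echo "CPU supports x86-64-v2"
    echo "$1" | supports_v3 || return 3
    echo "CPU supports x86-64-v3"
    echo "$1" | supports_v4 || return 4
    echo "CPU supports x86-64-v4"
}
```

With the flags extracted as in the answer, the script body then reduces to `report_levels "$flags"; exit $?`. Functions avoid the nested quoting that the eval approach needs.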
7

Here's a shell script to determine the x86_64 CPU architecture level on Linux. It's compatible with BusyBox. With the option -v, it shows which flags you're missing to reach the next level. See What do the flags in /proc/cpuinfo mean? for an explanation of the flags.

    #!/bin/sh
    set -e

    verbose=
    while getopts v OPTLET; do
      case "$OPTLET" in
        v) verbose=1;;
        \?) exit 2;;
      esac
    done

    flags=$(grep '^flags\b' </proc/cpuinfo | head -n 1)
    flags=" ${flags#*:} "

    has_flags () {
      for flag; do
        case "$flags" in
          *" $flag "*) :;;
          *)
            if [ -n "$verbose" ]; then
              echo >&2 "Missing $flag for the next level"
            fi
            return 1;;
        esac
      done
    }

    determine_level () {
      level=0
      has_flags lm cmov cx8 fpu fxsr mmx syscall sse2 || return 0
      level=1
      has_flags cx16 lahf_lm popcnt sse4_1 sse4_2 ssse3 || return 0
      level=2
      has_flags avx avx2 bmi1 bmi2 f16c fma abm movbe xsave || return 0
      level=3
      has_flags avx512f avx512bw avx512cd avx512dq avx512vl || return 0
      level=4
    }

    determine_level
    echo "$level"

(Acknowledgement: I reused the list of flags from Stephen Kitt's answer, which in turn builds on gioele's answer.)

3
  • .@Gilles, thanks for this. I created a 'x86-64-level' tool (<github.com/HenrikBengtsson/x86-64-level/>) that originated from your script here. I did it to give it a home, make it downloadable, and to be able to add a README with more details. I'd like to add a FOSS license to it; do you have a preference? Commented Dec 17, 2022 at 3:49
  • @HenrikB Anything posted on this site is already open source. But if you prefer a different license: I hereby allow anyone to make a derivative work based on the code in unix.stackexchange.com/a/631320 and license it (or not) however they please. Commented Dec 17, 2022 at 11:22
  • Thank you @Gilles. I've added the 'CC BY-SA 4.0' license per the default here. Also, I've listed you as one of the authors - I hope that's okay. Also, if you want to be listed with anything other than your StackOverflow user name, please let me know. Commented Dec 18, 2022 at 0:06
3

On more modern Fedora / Red Hat systems do this:

    $ /usr/lib64/ld-linux-x86-64.so.2 --help
    Usage: /usr/lib64/ld-linux-x86-64.so.2 [OPTION]... EXECUTABLE-FILE [ARGS-FOR-PROGRAM...]
    You have invoked 'ld.so', the program interpreter for dynamically-linked
    ELF programs. Usually, the program interpreter is invoked automatically
    when a dynamically-linked executable is started.

    You may invoke the program interpreter program directly from the command
    line to load and run an ELF executable file; this is like executing that
    file itself, but always uses the program interpreter you invoked,
    instead of the program interpreter specified in the executable file you
    run. Invoking the program interpreter directly provides access to
    additional diagnostics, and changing the dynamic linker behavior without
    setting environment variables (which would be inherited by subprocesses).

      --list                list all dependencies and how they are resolved
      --verify              verify that given object really is a dynamically linked
                            object we can handle
      --inhibit-cache       Do not use /etc/ld.so.cache
      --library-path PATH   use given PATH instead of content of the environment
                            variable LD_LIBRARY_PATH
      --glibc-hwcaps-prepend LIST
                            search glibc-hwcaps subdirectories in LIST
      --glibc-hwcaps-mask LIST
                            only search built-in subdirectories if in LIST
      --inhibit-rpath LIST  ignore RUNPATH and RPATH information in object names
                            in LIST
      --audit LIST          use objects named in LIST as auditors
      --preload LIST        preload objects named in LIST
      --argv0 STRING        set argv[0] to STRING before running
      --list-tunables       list all tunables with minimum and maximum values
      --list-diagnostics    list diagnostics information
      --help                display this help and exit
      --version             output version information and exit

    This program interpreter self-identifies as: /lib64/ld-linux-x86-64.so.2

    Shared library search path:
      (libraries located via /etc/ld.so.cache)
      /lib64 (system search path)
      /usr/lib64 (system search path)

    Subdirectories of glibc-hwcaps directories, in priority order:
      x86-64-v4
      x86-64-v3
      x86-64-v2 (supported, searched)
3

Here's a script to determine the x86_64 level and tell which instructions are missing for the next higher level. Compatible with Linux and Unix (tested on Debian and OpenBSD).

    #!/usr/bin/awk -f

    BEGIN {
        # Collect CPU features from lscpu
        cmd = "lscpu | grep 'Flags:' | awk '{for (i=2; i<=NF; i++) print $i}'"
        while (cmd | getline) {
            features = features " " $0
        }
        close(cmd)

        # Define required features for each x86-64-v level
        levels[1] = "lm cmov cx8 fpu fxsr mmx syscall sse2"
        levels[2] = "cx16 lahf_lm popcnt sse4_1 sse4_2 ssse3"
        levels[3] = "avx avx2 bmi1 bmi2 f16c fma abm movbe xsave"
        levels[4] = "avx512f avx512bw avx512cd avx512dq avx512vl"

        level = 0
        missing = ""

        # Check features for each level
        for (i = 1; i <= 4; i++) {
            level_met = 1
            split(levels[i], flags, " ")
            missing = ""
            for (j in flags) {
                if (features !~ flags[j]) {
                    missing = missing flags[j] " "
                    level_met = 0
                }
            }
            if (level_met == 1) {
                level = i
            } else {
                print "Current level: x86-64-v" level
                print "Missing for x86-64-v" i ": " missing
                exit i
            }
        }

        # Report the highest supported level
        print "Current level: x86-64-v" level
        exit level + 1
    }

Example result:

    $ ./x86_64_check.sh
    Current level: x86-64-v1
    Missing for x86-64-v2: popcnt sse4_2

    $ ./x86_64_check.sh
    Current level: x86-64-v3
    Missing for x86-64-v4: avx512f avx512bw avx512cd avx512dq avx512vl
2
  • Seems like the current_level_met variable is unnecessary.  You set it to 0 just before exiting, so that value will never be seen (i.e., tested), so that statement is pointless.  So how can you get to the test (at the bottom) without it being set? Commented Aug 15, 2024 at 8:42
  • Yes that's right. Thank you. I've removed current_level_met declarations, and updated my code in the previous post. Commented Aug 16, 2024 at 14:47
2

I've created an x86-64-level tool based on the suggestions here. Examples:

    $ x86-64-level
    3

    $ level=$(x86-64-level)
    $ echo "x86-64-v${level}"
    x86-64-v3

    ## Output an explanation to stderr
    $ x86-64-level --verbose
    Identified x86-64-v3, because x86-64-v4 requires 'avx512f', which is not supported by this CPU [Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz]
    3

If you want to assert that the current machine supports a certain x86-64 level in a shell script, add the following one-line gatekeeper:

x86-64-level --assert=4 || exit 1 

This will be silent if the host supports x86-64-v4, otherwise it'll output:

The CPU [Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz] on this host ('dev2') supports x86-64-v3, which is less than the required x86-64-v4 

and exit with exit value 1.

The x86-64-level tool is a standalone Bash script that's available at https://github.com/HenrikBengtsson/x86-64-level.
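The same gatekeeper pattern can be sketched for scripts that already hold a level number, whether it came from `x86-64-level` or from one of the other scripts on this page. The require_level helper below is hypothetical and not part of the x86-64-level tool; it assumes both arguments are plain integers.

```shell
#!/bin/sh
# require_level DETECTED REQUIRED
# Succeeds silently if DETECTED >= REQUIRED; otherwise prints a message to
# stderr and returns 1. (Hypothetical helper, not part of x86-64-level.)
require_level() {
    if [ "$1" -ge "$2" ]; then
        return 0
    fi
    echo "CPU supports x86-64-v$1, which is less than the required x86-64-v$2" >&2
    return 1
}
```

A script could then start with `require_level "$(x86-64-level)" 3 || exit 1`, assuming the tool is on the PATH.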

2

The only thing that worked for me was to use GCC's __builtin_cpu_supports built-in. Since I ran it under MSYS, it is likely to work on Windows too. It can be done in C++ as well.

    // test_cpu.c
    #ifndef __GNUC__
    #error "You must use gnu"
    #endif
    #include <stdio.h>

    int main() {
        if (__builtin_cpu_supports("x86-64-v4")) puts("v=4");
        else if (__builtin_cpu_supports("x86-64-v3")) puts("v=3");
        else if (__builtin_cpu_supports("x86-64-v2")) puts("v=2");
        else puts("v=1");
    }

Usage:

    $ gcc /test_cpu.c -o /test_cpu
    $ /test_cpu
    v=3
1

One way is to use the Function Multiversioning feature in GCC: write a test program and see which version of the function it picks for your CPU.

The foo function from the program below produces multiple symbols in the binary, and the "best" version is picked at runtime:

    $ nm a.out | grep foo
    0000000000402236 T _Z3foov
    000000000040224c T _Z3foov.arch_x86_64
    0000000000402257 T _Z3foov.arch_x86_64_v2
    0000000000402262 T _Z3foov.arch_x86_64_v3
    000000000040226d T _Z3foov.arch_x86_64_v4
    0000000000402290 W _Z3foov.resolver
    0000000000402241 T _Z3foov.sse4.2
    0000000000402290 i _Z7_Z3foovv

    // multiversioning.c
    #include <stdio.h>

    __attribute__ ((target ("default")))
    const char* foo () { return "default"; }

    __attribute__ ((target ("sse4.2")))
    const char* foo () { return "sse4.2"; }

    __attribute__ ((target ("arch=x86-64")))
    const char* foo () { return "x86-64-v1"; }

    __attribute__ ((target ("arch=x86-64-v2")))
    const char* foo () { return "x86-64-v2"; }

    __attribute__ ((target ("arch=x86-64-v3")))
    const char* foo () { return "x86-64-v3"; }

    __attribute__ ((target ("arch=x86-64-v4")))
    const char* foo () { return "x86-64-v4"; }

    int main () {
        printf("%s\n", foo());
        return 0;
    }

On my laptop, this prints

    $ g++ multiversioning.c
    $ ./a.out
    x86-64-v3

Note that the use of g++ is intentional here.

If I used gcc to compile, it would fail with error: redefinition of ‘foo’.
