  • 3
    Some of the checking of earlier levels is redundant, but not a bad idea, I guess. In real hardware, SSE4.2, for example, already implies support for all previous (Intel) SSE versions (but not AMD SSE4a). In a virtual machine, CPUID is virtualized, so it's theoretically possible to indicate SSSE3 support without SSE3. Only in a software emulator could you make SSE3 instructions fault while SSSE3 instructions didn't. (BTW, you omitted /sse3/.) The de-facto standard is that runtime CPU dispatching only needs to check the highest SSE feature flag it depends on. Commented Jan 27, 2021 at 19:02
  • 1
    There are other de-facto standards like SSE4.2 implying popcnt, but that's good to check explicitly. Other non-SIMD extensions like BMI1 are fully independent of SIMD, although since some BMI1/2 instructions use VEX encoding, they're normally only found on CPUs that support AVX. Unfortunately, Intel even disables BMI1/2 on their Pentium/Celeron CPUs, perhaps as a way of fully disabling AVX. Commented Jan 27, 2021 at 19:08
  • 1
    BTW, level 2 = Nehalem and current Silvermont, and current-gen Pentium/Celeron. Also AMD Bulldozer family since even Excavator doesn't have BMI2, only AVX2 and FMA3. Level 3 = Haswell (and Zen), and includes most of the really good stuff. MacOS apparently can make fat binaries with baseline x86-64 and Haswell feature-level, allowing use of BMI2's efficient shift instructions all over the place, and of AVX everywhere. Level 4 = -march=skylake-avx512. Commented Jan 27, 2021 at 19:12
  • 1
    @PeterCordes yes, there are a number of deficiencies and redundancies here (in particular, I should check full fields instead of using regexes, since for example /lm/ will match anything containing those characters). I followed the exhaustive level definitions as used in the first answer (that’s where /ssse3/ without /sse3/ came from), even though as you say many of them are redundant. (I’ve been following the discussions leading up to the definition of these levels.) Commented Jan 27, 2021 at 19:29
  • 1
    TBH this was more an exercise in showing that all the checks could be done in AWK instead of a mixture of AWK and shell, rather than coming up with the best level checker ;-). Commented Jan 27, 2021 at 19:32
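
The point raised in the comments about checking full fields rather than using regexes can be illustrated with a minimal Python sketch. It is not the AWK script under discussion: the helper names `parse_flags` and `has_all`, the `SAMPLE` text, and the short list of required flags are all assumptions made for illustration, and the list is not a complete level definition.

```python
def parse_flags(cpuinfo_text):
    """Return the set of whole flag names from the first 'flags' line.

    Hypothetical helper: expects text in the style of /proc/cpuinfo,
    where each line looks like 'key<tabs>: value'.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            # Splitting on whitespace yields complete flag names only.
            return set(value.split())
    return set()


def has_all(flags, required):
    """True if every required flag is present as a whole field."""
    return set(required) <= flags


# Made-up sample: note there is no standalone 'lm' flag here.
SAMPLE = (
    "processor\t: 0\n"
    "flags\t\t: fpu sse sse2 ssse3 sse4_1 sse4_2 popcnt lahf_lm pclmulqdq\n"
)

flags = parse_flags(SAMPLE)

# A substring (regex-style) test is fooled by 'lahf_lm' and 'pclmulqdq'.
substring_hit = "lm" in SAMPLE
# A whole-field test is not.
whole_field_hit = "lm" in flags

# Illustrative subset only, following the "check the highest flag,
# plus popcnt explicitly" suggestion from the comments.
level_ok = has_all(flags, ["sse4_2", "popcnt"])
```

With this sample, the substring test reports a match for `lm` while the whole-field test does not, which is the false positive described above.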