I'm trying to assemble a file that uses ARM's CRC instruction. The assembler is producing an error Error: selected processor does not support 'crc32b w1,w0,w0'.
There are runtime checks in place, so we are safe with the instruction. The technique works fine on i686 and x86_64. For example, I can assemble a file that uses Intel CRC intrinsics or SHA Intrinsics without -mcrc or -msha (and on a machine without the features).
Here is the test case:
$ cat test.cxx #include <arm_neon.h> #define GCC_INLINE_ATTRIB __attribute__((__gnu_inline__, __always_inline__, __artificial__)) #if defined(__GNUC__) && !defined(__ARM_FEATURE_CRC32) __inline unsigned int GCC_INLINE_ATTRIB CRC32B(unsigned int crc, unsigned char v) { unsigned int r; asm ("crc32b %w2, %w1, %w0" : "=r"(r) : "r"(crc), "r"((unsigned int)v)); return r; } #else // Use the intrinsic # define CRC32B(a,b) __crc32b(a,b) #endif int main(int argc, char* argv[]) { return CRC32B(argc, argc); } And here is the result:
$ g++ test.cxx -c /tmp/ccqHBPUf.s: Assembler messages: /tmp/ccqHBPUf.s:23: Error: selected processor does not support `crc32b w1,w0,w0' Placing the ASM code in a source file and compiling with different options is not feasible because CRC32B will be used in C++ header files, too.
How do I get GAS to assemble the instruction?
GCC's configuration and options are the reason we are trying to do things this way. User's don't read manuals, so they won't add -march=armv8-a+crc+crypto -mtune=cortex-a53 to CFLAGS and CXXFLAGS.
In addition, distros compile to a "least capable" machine, so we want the hardware acceleration routines available. When the library is provided by a distro like Linaro, both code paths (software CRC and hardware accelerated CRC) will be available.
The machine is a LeMaker HiKey, which is ARMv8/Aarch64. It has an A53 processor with CRC and Crypto (CRC and Crypto is optional under the architecture):
$ cat /proc/cpuinfo Processor : AArch64 Processor rev 3 (aarch64) processor : 0 ... processor : 7 Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 CPU implementer : 0x41 CPU architecture: AArch64 GCC lacks most of the usual defines one expects to be present by default:
$ g++ -dM -E - </dev/null | sort | egrep -i '(arm|neon|aarch|asimd)' #define __aarch64__ 1 #define __AARCH64_CMODEL_SMALL__ 1 #define __AARCH64EL__ 1 Using GCC's -march=native does not work on ARM:
$ g++ -march=native -dM -E - </dev/null | sort | egrep -i '(arm|neon|aarch|asimd)' cc1: error: unknown value ‘native’ for -march And Clang:
$ clang++ -dM -E - </dev/null | sort | egrep -i '(arm|neon|aarch|asimd)' #define __AARCH64EL__ 1 #define __ARM_64BIT_STATE 1 #define __ARM_ACLE 200 #define __ARM_ALIGN_MAX_STACK_PWR 4 #define __ARM_ARCH 8 #define __ARM_ARCH_ISA_A64 1 #define __ARM_ARCH_PROFILE 'A' #define __ARM_FEATURE_CLZ 1 #define __ARM_FEATURE_DIV 1 #define __ARM_FEATURE_FMA 1 #define __ARM_FEATURE_UNALIGNED 1 #define __ARM_FP 0xe #define __ARM_FP16_FORMAT_IEEE 1 #define __ARM_FP_FENV_ROUNDING 1 #define __ARM_NEON 1 #define __ARM_NEON_FP 0xe #define __ARM_PCS_AAPCS64 1 #define __ARM_SIZEOF_MINIMAL_ENUM 4 #define __ARM_SIZEOF_WCHAR_T 4 #define __aarch64__ 1 GCC version:
$ gcc -v ... gcc version 4.9.2 (Debian/Linaro 4.9.2-10) GAS version:
$ as -v GNU assembler version 2.24 (aarch64-linux-gnu) using BFD version (GNU Binutils for Ubuntu) 2.24
.arch_extension name. Perhaps added directly to this asm instruction. According to the docs, this allows you to add or remove extensions incrementally to the architecture being compiled for. Failing that, perhaps adding an.arch nameas a "top level" bit of 'basic' asm?.arch_extensionyesterday, but it resulted in errors..arch_extensionneeds Binutils 2.26 from 2016. 2.26 has the support for both Aarch32 and Aarch64. Also see Error: unknown pseudo-op: `.arch_extension' on the Linaro Toolchain mailing list.#pragma GCC targetis going to be a better bet than using (a potentially conflicting).archasm at file scope. Something to keep in mind though.#pragma GCC targetneeds GCC 6 for Aarch64. And a correction: Binutils 2.26 is from 2014; not 2016.