Timeline for x64 assembly clearmem / zeromem

4 events

when	what		by	license	comment
Apr 1, 2018 at 19:11	comment	added	FUZxxl		Note that the special path seems to be only for `stosb`. The instruction `stosd` is still slow as far as I am concerned.
Aug 26, 2016 at 3:45	comment	added	Peter Cordes		The `test` is redundant. Use `dec rcx / jnz loop_top`, because `dec` already sets ZF according to the result. `dec`/`jnz` can even macro-fuse on Intel SnB-family microarchitectures. BTW yes, `loop` is slow on most CPUs, except AMD Bulldozer-family. And yes, `rep stosq` is highly recommended, as long as the buffer is aligned. Otherwise, do one unaligned store, then `rep stos`. (See also stackoverflow.com/tags/x86/info for more perf links)
Mar 25, 2014 at 14:28	vote	accept	Drew Chapin
Mar 12, 2014 at 19:01	history	answered	Jerry Coffin	CC BY-SA 3.0