Timeline for x64 assembly clearmem / zeromem
Current License: CC BY-SA 3.0
4 events
| when | what | | by | license | comment |
|---|---|---|---|---|---|
| Apr 1, 2018 at 19:11 | comment | added | FUZxxl | | Note that the special path seems to be only for stosb. The instruction stosd is still slow as far as I am concerned. |
| Aug 26, 2016 at 3:45 | comment | added | Peter Cordes | | The test is redundant. Use dec rcx / jnz loop_top, because dec already sets ZF according to the result. dec/jnz can even macro-fuse on Intel SnB-family microarchitectures. BTW yes, loop is slow on most CPUs, except AMD Bulldozer-family. And yes, rep stosq is highly recommended, as long as the buffer is aligned. Otherwise, do one unaligned store then rep stos. (See also stackoverflow.com/tags/x86/info for more perf links) |
| Mar 25, 2014 at 14:28 | vote | accept | Drew Chapin | | |
| Mar 12, 2014 at 19:01 | history | answered | Jerry Coffin | CC BY-SA 3.0 | |
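
The advice in the Aug 26, 2016 comment (count down so the decrement itself supplies the loop condition, and cover a misaligned buffer with one unaligned store before the aligned bulk fill) can be sketched in C. This is only an illustration, not code from the answer or the comments: the name `zero_mem` is hypothetical, the C loops stand in for `dec`/`jnz` and `rep stosq`, and the extra unaligned store for the tail is an assumption added here so that lengths that are not a multiple of 8 are handled.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the original answer): zero n bytes at dst. */
static void zero_mem(void *dst, size_t n)
{
    unsigned char *p = dst;
    const uint64_t zero = 0;

    if (n < 8) {
        /* Too small for an 8-byte store: plain byte loop that counts
         * down to zero, the same shape as a dec / jnz loop. */
        while (n != 0) {
            *p++ = 0;
            n--;
        }
        return;
    }

    /* One possibly unaligned 8-byte store covers the head ... */
    memcpy(p, &zero, 8);
    /* ... and one covers the tail (an assumption added for this sketch). */
    memcpy(p + n - 8, &zero, 8);

    /* Skip forward to the next 8-byte boundary (0..7 bytes). */
    size_t skip = (8 - ((uintptr_t)p & 7)) & 7;
    unsigned char *q = p + skip;
    size_t qwords = (n - skip) / 8;

    /* Aligned bulk fill: the part that rep stosq would do in assembly.
     * A fixed-size 8-byte memcpy is used instead of a uint64_t* store
     * to stay clear of strict-aliasing problems. */
    while (qwords != 0) {
        memcpy(q, &zero, 8);
        q += 8;
        qwords--;
    }
}
```

The head and tail stores overlap the aligned region by a few bytes, which is harmless because every store writes zero. An optimizing compiler may well replace the loops with its own `memset` call or a `rep stos` sequence, so the sketch shows the structure of the technique rather than the exact instructions that end up running.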