Since you are writing FizzBuzz in assembler, you are obviously concerned about performance and code size.
For performance, div is one of the worst instructions. Since the divisibility repeats modulo 15, you could define some constants:
const divisible_by_3 = 0b1001001001001001 const divisible_by_5 = 0b1000010000100001
To test the divisibility, have an extra register that is initialized with i mod 15 and adjusted whenever i changes. The basic idea is:
dec i dec i_mod_15 cmovs i_mod_15, 14 ; the maximum value mod 15
You can also combine divisible_by_3 and divisible_by_5 into a bit vector of two-bit entries (divisible) and define a jump table based on that. To do the actual testing, use bit-shifting.
const divisible_by_3 = 0b_1_0_0_1_0_0_1_0_0_1_0_0_1_0_0_1 const divisible_by_5 = 0b1_0_0_0_0_1_0_0_0_0_1_0_0_0_0_1_ const divisible = 0b11000001001001000001100001000011
Another idea is to use unroll the loop by using Duff's Device.
Right now, your code is a really boring, straight-forward translation of some higher-level language, probably C. In assembler, you have much more potential to compress the code (DRY principle). For example, you can jmp do_printf instead of writing push ecx; call printf; pop ecx several times.
… 90 minutes later …
Based on the above ideas, the code might look like this, tested and works.
format PE console entry main include 'macro/import32.inc' section '.rdata' data readable ; Each of these messages takes exactly 8 bytes, ; except for the last one. ; This is important for addressing them efficiently. messages db '%d', 13, 10, 0, 0, 0, 0 db 'Fizz', 13, 10, 0, 0 db 'Buzz', 13, 10, 0, 0 db 'FizzBuzz', 13, 10, 0 ; This bit mask selects one of the above messages ; to be printed. The counter is always passed to ; printf, and in 8/15 cases it is ignored. ; The uu entry at the end is unused. ; ; 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 ; div3 mask = 1 0 0 1 0 0 1 0 0 1 0 0 1 0 0 u ; div5 mask = 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 u div_mask = 11_00_00_01_00_10_01_00_00_01_10_00_01_00_00_00b section '.text' code readable main: push ebp mov ebp, esp mov ecx, 100 ; ecx = the counter mov eax, ecx xor edx, edx mov ebx, 15 div ebx ; edx = counter mod 15 mov ebx, div_mask ; ebx = bit mask for selecting the message rol ebx, 2 xchg edx, ecx ; Variable-width rotation is only possible rol ebx, cl ; with cl, therefore the temporary swap rol ebx, cl ; of edx and ecx. xchg edx, ecx ; .again: mov eax, ebx ; Select the message format for printf. and eax, 11b shl eax, 3 add eax, messages push ebx ; save registers before printf push ecx push edx push ecx ; actually call printf push eax call [printf] add esp, 8 pop edx ; restore registers after printf pop ecx pop ebx dec edx ; Adjust counter, counter mod 15 jns .normal ; and bit mask. add edx, 15 ror ebx, 2 ; Rotate one extra time since .normal: ; the bit mask has 16 entries. ror ebx, 2 dec ecx jnz .again xor eax, eax pop ebp ret section '.idata' import data readable library msvcrt, 'msvcrt.dll' import msvcrt, printf, 'printf', system, 'system'
Some more things I took care of:
- The code must not be in a writeable section.
- Since the program can return from
main, it should do so. To make that work, I had to add the pop ebp that corresponds to the push ebp at the very top; in your code that was missing. - I carefully avoided many branching statements, since they are poisonous to performance. See Why is it faster to process a sorted array.
- The one remaining conditional behaves well in that it follows the jump in 14/15 cases, which is easily predictable.
- Of course, using
printf and the C stdio for output ruins all performance effects. But that's outside the scope for this little fun experiment. - All arguments to
printf may be modified by printf. There's no guarantee that you get your ecx back at the point where you commented ; Get the counter back into ecx. To hide the counter from printf, you must push it once more to the stack. That's why I explicitly commented on this saving/calling/restoring in my code.
A nice thing is that you can play with the bit mask, which feels just like an ordinary configuration file.