Skip to main content
1 of 4
Reto Koradi
  • 4.9k
  • 1
  • 14
  • 19

C

Initial version:

#include <stdint.h> #include <stdlib.h> #include <stdio.h> static int W, H, W1; static char Out[64 * 2 + 1]; static void place(uint64_t cMask, int nSteps) { if (nSteps) { int pos = __builtin_ctzl(~cMask); uint64_t mask = 1ul << pos; cMask |= mask; int y = pos / W; int x = pos - y * W; uint64_t neighMask = mask << 1; if ((x < W - 1) && !(cMask & neighMask)) { Out[W1 * y + x] = '-'; Out[W1 * y + x + 1] = '-'; place(cMask | neighMask, nSteps - 1); } neighMask = mask << W; if ((y < H - 1) && !(cMask & neighMask)) { Out[W1 * y + x] = '|'; Out[W1 * (y + 1) + x] = '|'; place(cMask | neighMask, nSteps - 1); } } else { puts(Out); } } int main(int argc, char* argv[]) { W = atoi(argv[1]); H = atoi(argv[2]); W1 = W + 1; int nSteps = W * H; if (nSteps & 1) { return 0; } nSteps >>= 1; for (int y = 0; y < H; ++y) { Out[W1 * y + W] = '\n'; } Out[W1 * H] = '\0'; place(0, nSteps); return 0; } 

This is not much optimized yet. But optimization of the actual algorithm code is largely moot because most of the time is used for output anyway, and compilation time is already similar to execution time. I had a version that only counted the solutions instead of printing them, and it took about 8 milliseconds for the 8x6 size on my modest laptop.

Build/test instructions:

> gcc -O0 Code.c > ./a.out 8 6 

My development was on a Mac using clang as the compiler. But I hope that gcc will behave similarly. The execution time is reduced when compiling with e.g. -O3, but the increase in compile time more than compensated for it in my tests. Ideally it could be tried with -O0 to -O3 on the benchmark system to determine which results in the lowest value for the sum.

I hope you'll also test for correctness, because I don't have much to validate it beyond the 4x2 case.

Reto Koradi
  • 4.9k
  • 1
  • 14
  • 19