I'm writing a digital filter, and I need to keep the last X values and sum them all together.
Now there are two possible approaches to this. Either I shift the whole array using memmove to make room for the next value, and have the right indexes to the array as hard-coded values in my summing algorithm.
memmove(&Fifo[0], &Fifo[1], 12 * 4); // Shift array to the left Result += Factor[1] * (Fifo[5] + Fifo[7]); Result += Factor[2] * (Fifo[4] + Fifo[8]); Result += Factor[3] * (Fifo[3] + Fifo[9]); Result += Factor[4] * (Fifo[2] + Fifo[10]); Result += Factor[5] * (Fifo[1] + Fifo[11]); Result += Factor[6] * (Fifo[0] + Fifo[12]); Or alternatively, I don't copy any memory, but increment a counter instead, and calculate each index from that using a modulo operation (like a circular buffer).
i++; // Increment the index Result += Factor[1] * (Fifo[(i + 5) % 13] + Fifo[(i + 7) % 13]); Result += Factor[2] * (Fifo[(i + 4) % 13] + Fifo[(i + 8) % 13]); Result += Factor[3] * (Fifo[(i + 3) % 13] + Fifo[(i + 9) % 13]); Result += Factor[4] * (Fifo[(i + 2) % 13] + Fifo[(i + 10) % 13]); Result += Factor[5] * (Fifo[(i + 1) % 13] + Fifo[(i + 11) % 13]); Result += Factor[6] * (Fifo[(i + 0) % 13] + Fifo[(i + 12) % 13]); Since its an embedded ARM cpu, I was wondering what would be more efficient. Since I assume that the CPU has to move at least one 32-bit value internally to do the modulo operation, could it be that just moving the whole array is just as fast as calculating the right indexes?