After some latency measurements, I figured out that I need to optimize a Pythagorean (hypotenuse) calculation done on an embedded CPU with a pretty slow FPU.
The problem is that when these calculations occur, they come in large numbers, and this messes up the timing. I cannot reduce the absolute number of calculations, so they somehow need to get faster ... by at least a factor of 5. :-/
I'm currently thinking of pre-computing these calculations, since the input range is limited to about 300-500 distinct permutations and interpolating between two table entries should suffice. But I was also wondering whether some constraints of the problem could be exploited to speed up this code:

    float h = 0.f, v = 0.f;
    // ...
    float const d = std::sqrt((h * h) + (v * v));

These are the constraints I haven't used yet:
- The required accuracy of the result d is low: no more than 3 fractional digits are needed
- The legs of the triangle (h, v) are always in an aspect ratio of 4:3 or 16:9
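
To make the second point concrete: if the ratio holds exactly, the hypotenuse is a fixed multiple of one leg, so the square root might collapse into a single multiplication. The following is only a rough sketch under that assumption. The names `Aspect` and `diagonal_from_h` are made up for illustration, and it assumes h is the longer leg (v = h * 3/4 or v = h * 9/16).

```cpp
#include <cmath>

// Hypothetical: which of the two aspect ratios the legs (h, v) follow.
enum class Aspect { R4_3, R16_9 };

// Assumes h is the longer leg and v is EXACTLY h * 3/4 or h * 9/16.
// Under that assumption:
//   4:3   -> d = h * sqrt(4*4 + 3*3) / 4    = h * 5/4
//   16:9  -> d = h * sqrt(16*16 + 9*9) / 16 = h * sqrt(337)/16
inline float diagonal_from_h(float h, Aspect a) {
    constexpr float k4_3  = 1.25f;       // 5/4, exact
    constexpr float k16_9 = 1.1473475f;  // sqrt(337)/16, rounded
    return h * (a == Aspect::R4_3 ? k4_3 : k16_9);
}
```

If the legs only approximately follow the ratio, the error of this shortcut grows with the deviation, so it would have to be checked against the 3-digit requirement.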
I don't know whether an integer or fixed-point square root is available, whether the function can be replaced by one with less accuracy, or whether the aspect ratio can be used somehow.
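
For reference, the kind of integer approach I have in mind is the classic bit-by-bit square root, which needs no FPU at all. This is just a sketch: `isqrt64` and `hypot_milli` are illustrative names, and scaling the legs by 1000 (to keep 3 fractional digits) is my assumption, not something I have in the code yet. I have not measured whether it is actually faster than `std::sqrt` on my target.

```cpp
#include <cstdint>

// floor(sqrt(x)) via the bit-by-bit (digit-by-digit) method, integer ops only.
inline std::uint32_t isqrt64(std::uint64_t x) {
    std::uint64_t result = 0;
    std::uint64_t bit = std::uint64_t{1} << 62;  // highest power of four in 64 bits
    while (bit > x) bit >>= 2;                   // skip leading zero digit pairs
    while (bit != 0) {
        if (x >= result + bit) {
            x -= result + bit;
            result = (result >> 1) + bit;
        } else {
            result >>= 1;
        }
        bit >>= 2;
    }
    return static_cast<std::uint32_t>(result);
}

// Hypothetical fixed-point wrapper: legs are given in units of 1/1000,
// and the result is in the same unit, rounded down.
// Assumes each scaled leg is below 2^31 so the sum of squares fits in 64 bits.
inline std::uint32_t hypot_milli(std::uint32_t h_milli, std::uint32_t v_milli) {
    const std::uint64_t h = h_milli;
    const std::uint64_t v = v_milli;
    return isqrt64(h * h + v * v);
}
```

For example, `hypot_milli(4000, 3000)` gives 5000, i.e. d = 5.000 for h = 4.000 and v = 3.000. The result is truncated rather than rounded, so the last digit can be off by one.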
Any ideas?
Thank you!