I am working on a block design that computes the coordinate in the complex plane represented by a pixel. Given the x and y pixel values, the step size, and the starting x and starting y, I need to compute the corresponding coordinate in the complex plane. For instance, x' = start_x + step * x.
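To make the mapping concrete, here is a minimal software sketch of it, not the hardware itself. The function name `pixel_to_complex` is made up for illustration, and the y formula is assumed to mirror the x formula (y' = start_y + step * y). Python floats are double precision, so results may differ slightly from whatever precision the floating point IP uses.

```python
def pixel_to_complex(x, y, start_x, start_y, step):
    """Map an integer pixel (x, y) to a point (x', y') in the complex plane.

    Mirrors the hardware pipeline: integer-to-float conversion followed by
    a multiply-add. The y mapping is assumed to match the x mapping.
    """
    fx = float(x)  # integer -> floating point conversion stage
    fy = float(y)
    x_prime = start_x + step * fx  # multiply-add stage
    y_prime = start_y + step * fy
    return x_prime, y_prime


# Example: pixel (2, 3) with the origin at (-2.0, -1.0) and a step of 0.5
point = pixel_to_complex(2, 3, start_x=-2.0, start_y=-1.0, step=0.5)
```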
Since x is an integer, I first pass it through a floating point conversion module and then through a floating point fused multiply-add. Together these have a total latency of 24 cycles. The problem is that I also need to provide a memory address at the input and get it back out 24 cycles later, so that it arrives at the next module at the same time as x' and y'. I am looking for an IP which can help with this. The closest thing I have found is a shift register, but I would need 24 of them. I was also thinking of maybe using a FIFO. Is there anything that just acts as a latency delay for data while other calculations are being performed?
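The behavior I am after can be sketched in software as a fixed-depth delay line. This is only a behavioral model under my own assumptions, not an existing IP: the class name `DelayLine`, the `clock` method, and the `None` fill value are all hypothetical. A value pushed in on one cycle comes back out exactly `depth` cycles later.

```python
from collections import deque


class DelayLine:
    """Behavioral model of a fixed-latency delay (a chain of `depth` registers)."""

    def __init__(self, depth, fill=None):
        if depth < 1:
            raise ValueError("depth must be at least 1")
        # One slot per register stage, all holding `fill` after reset.
        self._regs = deque([fill] * depth, maxlen=depth)

    def clock(self, value):
        """Advance one cycle: return the oldest value and shift in `value`."""
        out = self._regs[0]        # value that entered `depth` cycles ago
        self._regs.append(value)   # maxlen discards the oldest entry
        return out


# Feed one address per cycle through a 24-cycle delay, matching the
# latency of the conversion plus multiply-add path.
LATENCY = 24
delay = DelayLine(LATENCY)
outputs = [delay.clock(addr) for addr in range(30)]
# outputs[i] is None for i < 24, and i - 24 from cycle 24 onwards.
```

In this model the address that enters on cycle 0 leaves on cycle 24, which is when the matching x' and y' would come out of the arithmetic path.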
PS: Not sure why I called my Fused-Mul-Add fmax
