NumPy has complex64, corresponding to two float32's. It also has float16, but no complex32.
How come? I have a signal processing calculation involving FFTs where I think I'd be fine with complex32, but I don't see how to get there. In particular, I was hoping for a speedup on an NVIDIA GPU with CuPy.
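For concreteness, the closest workaround I can see (a sketch of my own, not a real complex32) is to keep the real and imaginary parts as float16 arrays and upcast to complex64 only for the transform itself:

```python
import numpy as np

def fft_float16_pairs(re16, im16):
    """Upcast float16 real/imag parts to complex64, run the FFT,
    then downcast the result back to float16 pairs.

    Note: numpy.fft returns complex128 regardless of the input dtype;
    scipy.fft or cupy.fft would keep the result in complex64.
    """
    z = re16.astype(np.float32) + 1j * im16.astype(np.float32)  # complex64
    Z = np.fft.fft(z)
    return Z.real.astype(np.float16), Z.imag.astype(np.float16)

# Example: 1024-point FFT of half-precision data
re16 = np.random.standard_normal(1024).astype(np.float16)
im16 = np.random.standard_normal(1024).astype(np.float16)
Re, Im = fft_float16_pairs(re16, im16)
```

That saves memory and transfer bandwidth, but the arithmetic of the FFT itself still runs at single (or double) precision.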
However, it seems that float16 is slower on the GPU rather than faster.
Why is half-precision unsupported and/or overlooked?
A related question is why we don't have complex integers, as those might also present an opportunity for speedup.
Is it that the underlying library (the C code) is optimized for 32- and 64-bit processing? Most of us aren't using 8-bit processors any more!

Answer:

On the statement "However, it seems that float16 is slower on the GPU rather than faster": it's certainly possible for an FP16 FFT on a GPU to be faster than a corresponding FP32 (or FP64) FFT. GPU type matters, of course. You seem to have pointed this out in an oblique fashion in your comments, so I'm not sure why you left that statement in your question unedited. I'll just leave this here for future readers.
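If you want to check on your own hardware, here is a rough timing sketch with CuPy. Since there is no complex32 dtype, it compares complex64 against complex128; the same measurement approach applies if your CuPy/cuFFT build exposes half-precision transforms:

```python
import time
import cupy as cp

def time_fft(a, n_repeat=100):
    """Average wall-clock time of cupy.fft.fft on `a` (seconds per call)."""
    cp.fft.fft(a)                      # warm-up: plan creation, kernel load
    cp.cuda.Device().synchronize()
    t0 = time.perf_counter()
    for _ in range(n_repeat):
        cp.fft.fft(a)
    cp.cuda.Device().synchronize()
    return (time.perf_counter() - t0) / n_repeat

n = 1 << 20  # 1M-point transform
x64 = (cp.random.standard_normal(n) + 1j * cp.random.standard_normal(n)).astype(cp.complex64)
x128 = x64.astype(cp.complex128)

print("complex64 :", time_fft(x64))
print("complex128:", time_fft(x128))
```

The point is simply to measure rather than assume: relative FFT throughput at different precisions depends heavily on the GPU generation and the transform size.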