
Is there such a thing as half precision floating point in CUDA?

Background: I want to manipulate an OpenGL texture using glTexSubImage3D, with data from a PBO that I generate using CUDA. The texture is stored in GL_INTENSITY16 format (which is a half precision floating type, afaik), and I don't want to use glPixelTransferf(GL_x_SCALE,...) to scale from integer values, since it seems to be much faster without the scaling.

Any advice?

1 Answer


CUDA natively supports only 32- and 64-bit floating point types.

Both the driver and runtime APIs support binding to half-float textures, but a read inside the kernel returns the value promoted to a 32-bit floating point number. The CUDA standard library includes the __half2float() and __float2half_rn() functions for converting between half and single precision floating point types (the half float stored in a 16-bit integer). So it should be possible to do the manipulation in 32-bit precision inside the kernel, with reads and writes done using the 16-bit type. But for native 16-bit floating point arithmetic, I think you are out of luck.
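To illustrate, here is a minimal sketch of that pattern: the half floats live in memory as unsigned short bit patterns, the arithmetic happens in 32-bit precision, and the intrinsics convert on read and write. The kernel name and the scale-by-a-factor operation are just made-up placeholders for whatever manipulation you actually do:

```cuda
#include <cuda_fp16.h>

// Hypothetical example: scale an array of half floats (stored as unsigned
// short bit patterns, e.g. a mapped PBO) by a constant factor.
__global__ void scaleHalves(unsigned short *data, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float v = __half2float(data[i]);  // promote 16-bit half to 32-bit float
        v *= factor;                      // do the actual work in 32-bit precision
        data[i] = __float2half_rn(v);     // round back to the nearest half
    }
}
```

Only the loads and stores are 16 bits wide; the ALU work is ordinary single precision.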


EDIT to add that in 2015, NVIDIA extended half precision floating point support in the CUDA 7.5 toolkit by adding half and half2 types, along with intrinsic functions to handle them. It has also been announced that the (not yet released) Pascal architecture will support IEEE 754-2008 compliant half precision operations in hardware.
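With the CUDA 7.5 additions, the half2 type packs two halves into 32 bits and the new intrinsics operate on both lanes at once. A small sketch (assuming a device of compute capability 5.3 or higher, where these intrinsics are available; on earlier hardware they are not supported natively):

```cuda
#include <cuda_fp16.h>

// Sketch of the half2 intrinsics added in CUDA 7.5: elementwise addition of
// two arrays of paired halves, two half-precision adds per __hadd2 call.
__global__ void addHalf2(const half2 *a, const half2 *b, half2 *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = __hadd2(a[i], b[i]);  // adds both 16-bit lanes of the pair
}
```

Operating on half2 rather than scalar half keeps loads, stores, and arithmetic at full 32-bit width, which is why the packed form is the recommended way to use these types.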


2 Comments

Storing half-precision in the textures WILL save you bandwidth, even if the computation is full 32-bit. So if your app is bandwidth bound at all, it might be worth it.
(-1) __half2float IS "native support." The point is that __half2float is an intrinsic (single-cycle, I'm pretty sure), while in SSE it takes many instructions to do right. The fact that the ALU is 32-bit doesn't matter; halfs let you save memory and memory bandwidth.
