
Does CUDA support double precision floating point numbers?

Also, what are the reasons for this?

4 Answers


If your GPU has compute capability 1.3 then you can do double precision. You should be aware though that 1.3 hardware has only one double precision FP unit per MP, which has to be shared by all the threads on that MP, whereas there are 8 single precision FPUs, so each active thread has its own single precision FPU. In other words you may well see 8x worse performance with double precision than with single precision.
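
If you want to check this programmatically rather than looking up your card, here is a minimal sketch using the CUDA runtime API (cudaGetDeviceProperties) to read the compute capability of device 0 and test it against the 1.3 threshold mentioned above:

    #include <cstdio>
    #include <cuda_runtime.h>

    int main()
    {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {   // query device 0
            fprintf(stderr, "No CUDA device found\n");
            return 1;
        }
        printf("Compute capability: %d.%d\n", prop.major, prop.minor);

        // Double precision requires compute capability 1.3 or higher
        bool hasDouble = (prop.major > 1) || (prop.major == 1 && prop.minor >= 3);
        printf("Double precision %s\n", hasDouble ? "supported" : "NOT supported");
        return 0;
    }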


10 Comments

Thanks for the tip Paul. I wanted to switch to double precision mostly for accuracy. I'm consulting on a side project where I'm converting Python code to C++ / CUDA, and the Python code uses double precision everywhere. I noticed that when I switched to using float I had a maximum absolute difference of 1e-06 in the results. I wasn't too satisfied with that, but I'd rather take the bullet with the accuracy than the performance. Thanks! +1.
Ha - commenting on 7-year-old answers now, Ray? ;-) Seriously though, this may be a bit out of date now - I haven't played with CUDA for a few years and the latest nVidia hardware may well have better double precision support by now, for all I know.
Hehe, I didn't notice the year. I looked up the capability before I commented :). The card I'm working on for my client only has compute capability 3.0, and its double precision performance is still only half that of single precision. It has only been fully supported since compute capability 6.0... Pity. Thanks nonetheless, even if this was 7 years old!
One other thing to consider is that if the GPU is old, but the CPU is reasonably new (and particularly if it has a good number of cores), then you may get better results with a good FFT library (e.g. FFTW) on the CPU, which is a lot easier to implement and manage. Anyway, good luck with whichever route you go down!
@Suparshva Ah I see. No, my first comment says at the end "... but I'd rather take the bullet with the accuracy than the performance", meaning that I ended up using single precision instead. I also didn't go with any FFT-based solution because it wasn't required for my specific use case (even though I did implement a convolution in 2D).

As a tip:

If you want to use double precision you have to set the GPU architecture to sm_13 (if your GPU supports it).

Otherwise it will still convert all doubles to floats and give only a warning (as seen in faya's post). (Very annoying if you get an error because of this :-) )

The flag is: -arch=sm_13
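
For example (a minimal sketch; the file name kernel.cu is made up), a kernel that uses double only actually computes in double precision when built for sm_13 or later:

    // kernel.cu -- trivial kernel that relies on double precision
    __global__ void scale(double *x, double a, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            x[i] *= a;    // done in double only if the target arch supports it
    }

    // nvcc -arch=sm_13 kernel.cu   -> doubles are kept as doubles
    // nvcc kernel.cu               -> on old toolchains the default target is sm_10,
    //                                 so doubles are demoted to float (warning only)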



Following on from Paul R's comments, Compute Capability 2.0 devices (aka Fermi) have much improved double-precision support, with performance only half that of single-precision.

This Fermi whitepaper has more details about the double-precision performance of the new devices.

3 Comments

+1: thanks for that additional info - I haven't worked with CUDA for about a year now and wasn't aware of Compute Capability 2.0 - nothing in tech stays still for very long!
Be aware though that Fermi's double precision performance is (artificially) lower for GeForce cards than for Teslas. Quadro cards should have the same performance level as Tesla cards.
Unfortunately, Quadro cards appear to be priced at around 10 times the price of GeForce cards with corresponding GPUs (though Quadro cards come with more memory).

As mentioned by others, older CUDA cards don't support the double type. But if you want more precision than your old GPU provides, you can use the float-float technique, which is similar to the double-double technique. For more information, read up on double-double arithmetic and error-free transformations.

Of course, on modern GPUs you can also use double-double to achieve higher precision than double. double-double is also used to implement long double on PowerPC.
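
As an illustration, here is a minimal sketch of the float-float idea (the struct and function names are made up): a value is stored as an unevaluated sum hi + lo of two floats, and an error-free transformation (Knuth's two-sum) keeps the rounding error of each addition in the low part:

    // Represent one higher-precision value as the unevaluated sum hi + lo.
    struct floatfloat { float hi, lo; };

    __host__ __device__ inline floatfloat ff_add(floatfloat a, floatfloat b)
    {
        // Knuth's two-sum: s + e equals a.hi + b.hi exactly
        float s = a.hi + b.hi;
        float v = s - a.hi;
        float e = (a.hi - (s - v)) + (b.hi - v);

        // Fold in the low-order parts, then renormalise the result
        e += a.lo + b.lo;
        floatfloat r;
        r.hi = s + e;
        r.lo = e - (r.hi - s);
        return r;
    }

Note that this relies on strict IEEE single-precision arithmetic, so it breaks if the compiler is allowed to reassociate or contract the operations (e.g. under nvcc's --use_fast_math).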

