
Does CUDA support double precision floating point numbers?

Also, what are the reasons for this?

4 Answers


If your GPU has compute capability 1.3 then you can do double precision. You should be aware though that 1.3 hardware has only one double precision FP unit per MP, which has to be shared by all the threads on that MP, whereas there are 8 single precision FPUs, so each active thread has its own single precision FPU. In other words you may well see 8x worse performance with double precision than with single precision.
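
If you want to check this programmatically rather than looking up your card, here is a minimal sketch using the CUDA runtime API (cudaGetDeviceProperties) to read the compute capability of device 0 and test it against the 1.3 threshold mentioned above:

    #include <cstdio>
    #include <cuda_runtime.h>

    int main()
    {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {   // query device 0
            fprintf(stderr, "No CUDA device found\n");
            return 1;
        }
        printf("Compute capability: %d.%d\n", prop.major, prop.minor);

        // Double precision requires compute capability 1.3 or higher
        bool hasDouble = (prop.major > 1) || (prop.major == 1 && prop.minor >= 3);
        printf("Double precision %s\n", hasDouble ? "supported" : "NOT supported");
        return 0;
    }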


10 Comments

Thanks for the tip Paul. I wanted to switch to double precision mostly for accuracy. I'm consulting on a side project where I'm converting Python code to C++ / CUDA, and the Python code uses double precision everywhere. I noticed that when I switched to using float I had a maximum absolute difference of 1e-06 in the results. I wasn't too satisfied with that, but I'd rather take the bullet with the accuracy than the performance. Thanks! +1.
Ha - commenting on 7-year-old answers now, Ray? ;-) Seriously though, this may be a bit out of date now - I haven't played with CUDA for a few years and the latest nVidia hardware may well have better double precision support by now, for all I know.
Hehe, I didn't notice the year. I looked up the capability before I commented :). The card I'm working on for my client only has compute capability 3.0, and its double precision performance is still only half that of single precision. It has only been fully supported since compute capability 6.0... Pity. Thanks nonetheless, even if this was 7 years old!
One other thing to consider is that if the GPU is old, but the CPU is reasonably new (and particularly if it has a good number of cores), then you may get better results with a good FFT library (e.g. FFTW) on the CPU, which is a lot easier to implement and manage. Anyway, good luck with whichever route you go down!
@Suparshva Ah I see. No, my first comment says at the end "... but I'd rather take the bullet with the accuracy than the performance", meaning that I ended up using single precision instead. I also didn't go with any FFT-based solution because it wasn't required for my specific use case (even though I did implement a convolution in 2D).

As a tip:

If you want to use double precision you have to set the GPU architecture to sm_13 (if your GPU supports it).

Otherwise it will still convert all doubles to floats and give only a warning (as seen in faya's post). (Very annoying if you get an error because of this :-) )

The flag is: -arch=sm_13
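
For example (a minimal sketch; the file name kernel.cu is made up), a kernel that uses double only actually computes in double precision when built for sm_13 or later:

    // kernel.cu -- trivial kernel that relies on double precision
    __global__ void scale(double *x, double a, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            x[i] *= a;    // done in double only if the target arch supports it
    }

    // nvcc -arch=sm_13 kernel.cu   -> doubles are kept as doubles
    // nvcc kernel.cu               -> on old toolchains the default target is sm_10,
    //                                 so doubles are demoted to float (warning only)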



Following on from Paul R's comments, Compute Capability 2.0 devices (aka Fermi) have much improved double-precision support, with performance only half that of single-precision.

This Fermi whitepaper has more details about the double-precision performance of the new devices.

3 Comments

+1: thanks for that additional info - I haven't worked with CUDA for about a year now and wasn't aware of Compute Capability 2.0 - nothing in tech stays still for very long!
Be aware though that Fermi's double precision performance is (artificially) lower for GeForce cards than for Teslas. Quadro cards should have the same performance level as Tesla cards.
Unfortunately, Quadro cards appear to be priced at around 10 times the price of GeForce cards with corresponding GPUs (though Quadro cards come with more memory).

As mentioned by others, older CUDA cards don't support the double type. But if you want more precision than your old GPU provides, you can use the float-float technique, which is similar to the double-double technique. For more information, read up on double-double arithmetic and error-free transformations.

Of course, on modern GPUs you can also use double-double to achieve higher precision than double. double-double is also used to implement long double on PowerPC.
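
As an illustration, here is a minimal sketch of the float-float idea (the struct and function names are made up): a value is stored as an unevaluated sum hi + lo of two floats, and an error-free transformation (Knuth's two-sum) keeps the rounding error of each addition in the low part:

    // Represent one higher-precision value as the unevaluated sum hi + lo.
    struct floatfloat { float hi, lo; };

    __host__ __device__ inline floatfloat ff_add(floatfloat a, floatfloat b)
    {
        // Knuth's two-sum: s + e equals a.hi + b.hi exactly
        float s = a.hi + b.hi;
        float v = s - a.hi;
        float e = (a.hi - (s - v)) + (b.hi - v);

        // Fold in the low-order parts, then renormalise the result
        e += a.lo + b.lo;
        floatfloat r;
        r.hi = s + e;
        r.lo = e - (r.hi - s);
        return r;
    }

Note that this relies on strict IEEE single-precision arithmetic, so it breaks if the compiler is allowed to reassociate or contract the operations (e.g. under nvcc's --use_fast_math).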

