Now that we’ve had a glimpse of GPU programming with slightly larger applications, it is time to learn in more detail what gives GPUs such a speed advantage. First, we will look at the concepts that map to the GPU cores and how we address them.
After that, we will go through more advanced concepts that explain how modern GPUs execute so much work concurrently. Finally, we will take a first look at how to improve memory access times inside the GPU environment, and consider why this matters.
With the solid foundation of the previous chapters to build on – we’ve seen how to compile and run CUDA programs, and have learnt about the effects of memory transfers on performance – we are well placed to understand in more detail what is needed to create a high-performing GPU application. By the end of this chapter, you will have mastered the abstractions...