Questions tagged [cuda]

Question 1

I was recently working on my CUDA wrappers library, and this particular class is one of the oldest pieces of code in the entire project. Since that time, I added tons of other features (for example <...

Question 2

When I discovered that CUDA device memory was represented by plain old void* I was horrified by having to deal with C-style type safety and resource ownership (i.e. ...

Question 3

I've implemented a resource management class for CUDA interop using RAII to ensure exception safety. The goal is to handle the registration/unregistration and mapping/unmapping of graphics resources (...

Question 4

This is a follow-up to my previous question; this one focuses more on the actual tessellation pipeline. What I changed from the previous question: Implemented the async sphere ...

Question 5

I was working on my version of "Universe Sandbox", and the first thought that comes to mind is "where the hell are my planets?", so I thought loading models sucks and made this thing. It'...

Question 6

I've implemented a feature in my C++ fractal explorer application to switch between CUDA and NVRTC. The main reason for the NVRTC/Driver API context is to support runtime compilation of custom CUDA ...

Question 7

I'm looking for feedback and suggestions on improving the performance and quality of my CUDA kernel for rendering the Mandelbrot set. I've implemented a "ping-pong" style coloring and ...

Question 8

I'm developing a fractal renderer in CUDA and need advice on tracking the total number of iterations performed during rendering. This is important for real-time dragging and zooming performance. ...

Question 9

I am writing a fractal renderer using CUDA, SFML, and C++. I recently optimized it to use less memory, and now I am going to optimize the actual fractals, because for some reason, it is the most holding back ...

Question 10

One instance of the following module uses almost 75% of my VRAM, so I was wondering how I could reduce that without slowing down runtime too much. The code is below: ...

Question 11

I'm a new student in reinforcement learning. Below is the code that I wrote for deep Q learning: ...

Question 12

To multiply the matrices A and B using the outer product of vectors, we can express each row of matrix A as a row vector and each column of matrix B as a column vector. Then, we can take the outer ...

Question 13

I need to apply the coint function from the statsmodels library to 207 time series with 1397 points each, two by two. Currently, it takes between 35 and 40 minutes on my computer with an Intel 24 Cores ...

Question 14

Do you have any suggestions for improving the efficiency of the code below? I believe that better optimization can be implemented in the GPU function cuKer_sum, which is located in the ...

Question 15

My first time writing anything significant in CUDA. This kernel takes two arrays representing square matrices and compares them pair-wise. It takes into consideration large input arrays, and ...

RAII Wrapper For CUDA Pointers

Strongly-typed CUDA device memory

RAII Wrapper For Registering/Mapping CUDA Resources

Sphere Generation System With CUDA-OpenGL Interop

CUDA Sphere Tessellation With Support For LOD

CUDA/NVRTC context switching function

CUDA Mandelbrot Kernel

Tracking total iterations in CUDA fractal renderer

Fractal Rendering on GPU with CUDA

I have a PyTorch module that takes in some parameters and predicts the difference between one of its inputs and the target

PyTorch code running slow for Deep Q learning (Reinforcement Learning)

A CUDA kernel for a matrix product as outer product vectors

Applying cointegration function from statsmodels on a large dataframe

Summation over different determinants that are independently computed using CUDA

CUDA kernel to compare pairs of matrices
