CUDA 4.1 Update

Question

I'm currently porting a particle system so that it updates on the GPU using CUDA. I've already allocated memory on the GPU and copied the required data over from the host. The project builds fine, but when I run it, the program reports that I need to allocate my h_position pointer. This pointer is my host pointer and is meant to hold the data.

I know I need to pass the current particle positions to the cudaMemcpy call. They are currently stored in a list, and a for loop iterates over the particles, running the following line of code for each one:

m_particleList[i].positionY = m_particleList[i].positionY - (m_particleList[i].velocity * frameTime * 0.001f);

My current host side cuda code looks like this:

float* h_position; // Your host pointer. This holds the data (I assume it's already filled with the data.)
float* d_position; // Your device pointer, we will allocate and fill this
float* d_velocity;
float* d_time;
int threads_per_block = 128; // You should play with this value
int blocks = m_maxParticles/threads_per_block + ( (m_maxParticles%threads_per_block)?1:0 );
const int N = 10;
size_t size = N * sizeof(float);
cudaMalloc( (void**)&d_position, m_maxParticles * sizeof(float) );
cudaMemcpy( d_position, h_position, m_maxParticles * sizeof(float), cudaMemcpyHostToDevice);

Both of these are in my UpdateParticle() method. I had originally thought it would be a simple case of replacing the h_position variable in the cudaMemcpy call with m_particleList[i], but then I get the following error:

no suitable conversion function from "ParticleSystemClass::ParticleType" to "const void *" exists

I've probably messed up somewhere, but could someone please help me fix the issues I'm facing? Everything else seems to be fine; it's only when I try to run the program that things hit the fan.

user13213 · Accepted Answer · 2012-05-28 21:36:50Z

I don't see any place in your code where you assign anything to h_position, so naturally the cudaMemcpy call fails.

Your m_particleList seems to contain ParticleType objects, which hold at least position and velocity attributes, not just a single float.

Additionally, do your particles have x, y, z positions and velocities? If so, you can't just pass the pointer to m_particleList into cudaMemcpy: your list of n ParticleTypes will not fit in an array of n floats.

You might need to create something along the lines of

struct particle{ float x,y,z,vx,vy,vz; };

in both host and device code, and use particle* instead of float*.
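To make the size mismatch concrete, here is a minimal host-side sketch, assuming the six-float struct above. The helper name particleBytes is hypothetical and only there for illustration.

```cpp
#include <cstddef>

// The struct suggested above: six floats per particle.
struct particle { float x, y, z, vx, vy, vz; };

// Six floats of equal alignment normally need no padding; this check
// makes that assumption explicit instead of relying on it silently.
static_assert(sizeof(particle) == 6 * sizeof(float),
              "particle is expected to be six tightly packed floats");

// Hypothetical helper: number of bytes that n particles occupy. This is
// the size cudaMalloc and cudaMemcpy would need, not n * sizeof(float).
std::size_t particleBytes(std::size_t n) {
    return n * sizeof(particle);
}
```

With this layout, a buffer sized for n floats is six times too small for n particles, which is why the pointer type and the byte count have to change together.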

kevintodisco · Accepted Answer · 2012-10-26 03:19:29Z

As melak47 has pointed out, the first problem is that you need to allocate memory for h_position:

// malloc returns void*, so C++ (and therefore CUDA) code needs the cast
h_position = (float*)malloc(sizeof(float) * m_maxParticles);
memset(h_position, 0, sizeof(float) * m_maxParticles);
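Once the buffer exists, it still has to be filled from the particle list before it is copied to the device. A rough sketch of that step follows, using a std::vector<float> in place of a raw malloc'd pointer. The ParticleType layout here is an assumption (only positionY and velocity appear in the question), and gatherPositionsY is a hypothetical helper name.

```cpp
#include <cstddef>
#include <vector>

// Assumed stand-in for ParticleSystemClass::ParticleType; the real
// struct likely has more fields than the question shows.
struct ParticleType {
    float positionX, positionY, positionZ;
    float velocity;
};

// Copy each particle's positionY into one contiguous float buffer.
// cudaMemcpy needs a pointer to contiguous memory like this, which is
// why passing m_particleList[i] directly does not compile.
std::vector<float> gatherPositionsY(const std::vector<ParticleType>& particleList) {
    std::vector<float> h_position(particleList.size());
    for (std::size_t i = 0; i < particleList.size(); ++i) {
        h_position[i] = particleList[i].positionY;
    }
    return h_position;
}
```

The source pointer for cudaMemcpy would then be h_position.data(), with h_position.size() * sizeof(float) as the byte count.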

But it also seems like you're trying to copy your integrated positions back into your particle array on the host. I really don't recommend copying large amounts of data from the GPU to the CPU every frame. Only do it if you absolutely must. You haven't mentioned what, if anything, you need to compute on the CPU, but I would say you can just push the particles to the GPU once and let it handle things from there. To do that, you can declare your particle struct:

struct Particle
{
    float x, y, z;
    float vx, vy, vz;

    __device__ __host__ Particle()
    {
        x = 0; y = 0; z = 0;
        vx = 0; vy = 0; vz = 0;
    }
};

Prefixing the default constructor with __device__ and __host__ lets it be called from both host (CPU) and device (GPU) code, so the Particle struct can be used on either side. Now, you can allocate space for your particles on the CPU, and fill out the information:

Particle* particles = (Particle*)malloc(m_maxParticles * sizeof(Particle));
// Fill out positions and initial velocities as you'd like.

And then copy the data over to the GPU:

Particle* d_particles;
// Pass the address of the pointer so cudaMalloc can write the device address into it
cudaMalloc( (void**)&d_particles, m_maxParticles * sizeof(Particle) );
cudaMemcpy( d_particles, particles, m_maxParticles * sizeof(Particle), cudaMemcpyHostToDevice );

As I said before, you should probably only do this copy once, unless you are adding or removing particles on a frame-by-frame basis.
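For completeness, here is a rough sketch of what letting the GPU handle things could look like: a per-particle integration step shared by host and device, plus a kernel that applies it. All names here (integrate, updateParticles, launchUpdate, updateParticlesHost) are hypothetical. The Particle in this sketch is a plain aggregate without the constructor, and the simple position += velocity * dt step is an assumption standing in for the update line from the question. The CUDA-only parts are guarded so that the same file also builds with an ordinary C++ compiler for testing on the CPU.

```cpp
// Without nvcc, make the CUDA qualifiers expand to nothing.
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

struct Particle {
    float x, y, z;
    float vx, vy, vz;
};

// One integration step for one particle; callable on CPU and GPU.
__host__ __device__ inline void integrate(Particle& p, float dt) {
    p.x += p.vx * dt;
    p.y += p.vy * dt;
    p.z += p.vz * dt;
}

#ifdef __CUDACC__
// One thread per particle; the bounds check covers the last, partially
// filled block.
__global__ void updateParticles(Particle* particles, int count, float dt) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < count) {
        integrate(particles[i], dt);
    }
}

// d_particles must point to device memory (for example from cudaMalloc).
void launchUpdate(Particle* d_particles, int count, float dt) {
    const int threadsPerBlock = 128;
    const int blocks = (count + threadsPerBlock - 1) / threadsPerBlock;  // round up
    updateParticles<<<blocks, threadsPerBlock>>>(d_particles, count, dt);
}
#endif

// CPU reference that runs the same step, useful for checking results.
void updateParticlesHost(Particle* particles, int count, float dt) {
    for (int i = 0; i < count; ++i) {
        integrate(particles[i], dt);
    }
}
```

With the data resident on the device, each frame would only need a call such as launchUpdate(d_particles, m_maxParticles, frameTime * 0.001f), with no per-frame cudaMemcpy back to the host.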
