
The cudaMalloc() documentation says:

The allocated memory is suitably aligned for any kind of variable.

But...

  • What affects the actual alignment? Compute capability? CUDA driver version? The specific kind of card? The allocation size?
  • Can I determine the minimum / typical allocation alignment as a function of these parameters?
  • (a) Not documented. (b) No. Commented Apr 10, 2016 at 19:41
  • @DmitriBudnikov: It seems to me the allocation quantum is significantly larger than alignof(long long)... isn't it? Commented Apr 11, 2016 at 6:47
  • @DmitriBudnikov: Sorry, I meant aligned by a larger quantum. I notice quite a few 0's at the lower bits of allocated areas. Commented Apr 11, 2016 at 7:39
  • Relevant: stackoverflow.com/a/14083295/2386951. Also consider @njuffa's comment on that. Commented Apr 12, 2016 at 5:58

1 Answer

You can assume, without any guarantee, that the alignment is at least cudaDeviceProp::textureAlignment
(i.e. 256 bytes on Fermi, 512 bytes on Kepler and Maxwell).

@sgarizvi reports that, in his experiments on multiple devices, the alignment of allocated blocks of device memory was never less than the texture alignment field of the device properties (cudaDeviceProp::textureAlignment). For Kepler and Maxwell devices this is 512 bytes.

Of course, as @talonmies notes, this is neither guaranteed nor documented.
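
Since none of this is documented, the practical option is to measure it on your own device. Below is a minimal host-side sketch of how one might do that; the helper names observed_alignment and is_aligned_to are hypothetical (not part of the CUDA API), and the code only inspects the numeric value of a pointer, so it compiles as plain C++ without the CUDA toolkit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: largest power of two that divides the address,
// i.e. the alignment this particular pointer happens to have.
// Returns 0 for a null pointer.
inline std::size_t observed_alignment(const void* p) {
    const auto addr = reinterpret_cast<std::uintptr_t>(p);
    // addr & (two's complement of addr) isolates the lowest set bit.
    return static_cast<std::size_t>(addr & (~addr + 1));
}

// Hypothetical helper: true if p is a multiple of `alignment`.
// `alignment` must be a non-zero power of two.
inline bool is_aligned_to(const void* p, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}
```

To use it, allocate with cudaMalloc(&ptr, size), fetch the device properties with cudaGetDeviceProperties(&prop, device), and check is_aligned_to(ptr, prop.textureAlignment). Repeating this over a range of allocation sizes and taking the minimum of observed_alignment(ptr) gives an empirical lower bound for that device and driver, which is still an observation rather than a guarantee.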
