
I'm trying to understand how DirectByteBuffer works on Linux and wrote the following very simple program to run under strace:

public static void main(String[] args) {
    while (true) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(8192);
    }
}

I expected to see mmap or brk syscalls allocating memory from the operating system directly, but instead strace only shows read and write protection being set on the requested pages. I mean something like:

mprotect(0x7fa9681ef000, 8192, PROT_READ|PROT_WRITE) = 0 

This seems to be the reason that allocating a direct buffer is slower than allocating a heap buffer: each allocation requires a syscall.

Please correct me if I'm wrong, but heap buffer allocation (if it happens inside a TLAB) is equivalent to returning a pointer to pre-allocated heap memory.

QUESTION: Why can't we do the same for direct memory? Return a pointer to pre-allocated memory?

3 Comments
  • I couldn't find details, but it looks like the reason may be that the JVM (at least JRockit) crashed when using mmap in some cases: "Crashes on Linux 64. Using mprotect instead of mmap/munmap when possible stopped the crashes and prevented us from leaking guard pages" Commented Oct 6, 2018 at 8:23
  • @syntagma "Crashes on Linux 64. Using mprotect instead of mmap/munmap when possible stopped the crashes and prevented us from leaking guard pages." Very interesting, but did they expand on the reason, or did this just happen to work better? Commented Oct 6, 2018 at 8:25
  • I would also like to know the explanation but couldn't find it. Commented Oct 6, 2018 at 8:26

1 Answer


In Oracle/OpenJDK, ByteBuffer.allocateDirect(n) uses Unsafe.allocateMemory(n), which in turn calls malloc on Linux.

On Linux, malloc serves smaller allocations such as 8 KB from a pool of already-mapped memory; however, for allocations of 128 KB or more (glibc's default mmap threshold) it issues a new mmap.

I expected actually some mmap or sys_brk syscalls to allocate memory from the operating system directly

Try allocating 128 << 10 bytes (128 KB) at a time.
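A sketch of that experiment (the class name and iteration count are mine; run it under strace to see mmap/munmap pairs instead of mprotect):

```java
import java.nio.ByteBuffer;

// Allocation loop sized to cross glibc's default mmap threshold (128 KB),
// so strace should show mmap/munmap calls rather than pool reuse.
public class LargeDirectAlloc {
    static final int SIZE = 128 << 10; // 131072 bytes

    static ByteBuffer allocate() {
        return ByteBuffer.allocateDirect(SIZE);
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1_000; i++) {
            ByteBuffer buffer = allocate();
            // The buffer becomes garbage immediately; its cleaner frees the
            // native memory, which at this size means a munmap.
        }
    }
}
```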

This seems the reason that allocating direct buffer is slower than allocating heap buffer since it requires syscall for each allocation.

The syscall adds about two microseconds. Direct ByteBuffers are not intended to be allocated and freed often; you should find ways to reuse these buffers.
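A minimal sketch of the reuse idea. The ThreadLocal cache here is my illustration of the pattern, not a JDK facility: each thread pays for the allocation once and hands back the same native memory afterwards.

```java
import java.nio.ByteBuffer;

// Reuse one direct buffer per thread instead of reallocating each time.
// The ThreadLocal cache is an illustrative pattern, not a JDK API.
public class BufferReuse {
    private static final ThreadLocal<ByteBuffer> CACHE =
        ThreadLocal.withInitial(() -> ByteBuffer.allocateDirect(8192));

    static ByteBuffer acquire() {
        ByteBuffer buf = CACHE.get();
        buf.clear(); // reset position/limit so the buffer can be refilled
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer first = acquire();
        ByteBuffer second = acquire();
        // The same native memory is handed back: no new malloc, no syscall.
        System.out.println(first == second); // prints "true"
    }
}
```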

Please correct me if I'm wrong, but heap buffer allocation (if happens inside TLAB) is equivalent to returning a pointer to pre-allocated heap memory.

Correct. Smaller allocations in native memory use the native heap.

QUESTION: Why can't we do the same for direct memory?

It does.

Return a pointer to pre-allocated memory?

It doesn't do this for allocations of 128 KB or more, so that the memory can be released back to the OS when freed.


4 Comments

I got more confusing behavior while trying it with 128 KB. I added System.out.println(Long.toHexString(((DirectBuffer) buf).address())) to catch the actual pointer and looked at the strace output. Here it is: mprotect(0x7f3fdf6e6000, 131072, PROT_READ|PROT_WRITE) = 0 followed by write(1, "7f3fdf6e5a20", 12) = 12. The address being protected is different from the one returned by the address() method.
@St.Antario my understanding is that each allocation has a header describing the allocated data. I am surprised the address in Java is before the start of the region rather than after. Are you sure you have the association right, and did you do many of these?
Maybe I misinterpreted the strace output... The thing I could notice is that the returned address was incremented by 8192 + 16 bytes (probably for a header). The rest of the addresses returned look like magic. Probably the only way to understand it is to look at the malloc implementation...
Note that malloc behavior usually depends on the libc, but it can also be overridden in various other ways, e.g. with mallopt in glibc.
