  • 6
    Very instructive answer and pointers! This clearly deserves more votes! Commented Feb 28, 2018 at 13:44
  • 9
    @user3927312: agner.org/optimize is one of the best and most coherent guides to low-level stuff for x86 specifically, but some of the general ideas apply to other ISAs. As well as asm guides, Agner has an optimizing C++ PDF. For other performance / CPU-architecture links, see stackoverflow.com/tags/x86/info. I've also written some about optimizing C++ by helping the compiler make better asm for critical loops, when it's worth having a look at the compiler's asm output: C++ code for testing the Collatz conjecture faster than hand-written asm? Commented Apr 9, 2019 at 7:31
  • 2
    @PeterCordes: "large pages" are what Intel and AMD have always called 2 MiB (and 4 MiB) pages. Windows also calls them large pages (e.g. MEM_LARGE_PAGES flag for VirtualAlloc()). Linux seems to support one or the other but not both at the same time, and uses the same word for either case. Note that it's relatively shocking how crippled operating systems are (Windows not supporting 1 GiB pages at all, requiring special permission just to use 2 MiB pages, not allowing 2 MiB pages to be "pageable"; and Linux having a cesspool of hackery with 2 separate systems and no way for user-space to choose). Commented Dec 15, 2020 at 5:15
  • 1
    @Brendan: Linux certainly can combine multiple small pages into a large page; see kernel.org/doc/Documentation/vm/transhuge.txt. Active scavenging (by defragging) is what khugepaged does, unless you disable it with echo 0 >/sys/kernel/mm/transparent_hugepage/khugepaged/defrag. There are some other tunable settings to control whether an mmap allocation and/or madvise waits for defragging, or starts with small pages and lets the defragging happen in the background (echo defer+madvise > /sys/kernel/mm/transparent_hugepage/defrag). If you didn't know about this, Linux is less bad than you think! Commented Dec 15, 2020 at 13:12
  • 2
    @PeterCordes: Ah, you're right (kcompactd is the clue I was missing). It's actually worse/more awful than I suspected - like the name suggests, it literally does "compaction" (moving everything from one end of a physical memory zone to another) rather than only bothering when most small pages belonging to the larger page are already free. Commented Dec 16, 2020 at 10:06
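
For anyone who wants to check how the transparent-hugepage knobs mentioned in the comments above are currently set, here is a minimal Python sketch. It assumes a Linux system that exposes /sys/kernel/mm/transparent_hugepage; the multi-choice files there list every option and put the active one in square brackets (for example `always [madvise] never`), while khugepaged/defrag holds a plain 0 or 1. The helper names `selected_option` and `read_thp_setting` are made up for illustration, and the `enabled` knob is not named in the thread; it is included only because it lives in the same directory.

```python
from pathlib import Path
import re

# Directory named in the comments above; only present on Linux with THP support.
THP_DIR = Path("/sys/kernel/mm/transparent_hugepage")


def selected_option(text):
    """Return the bracketed choice from a line such as 'always [madvise] never'.

    Files that hold a plain value (e.g. '1') have no brackets, so the
    stripped text is returned unchanged in that case.
    """
    match = re.search(r"\[([^\]]+)\]", text)
    return match.group(1) if match else text.strip()


def read_thp_setting(relative_name, base=THP_DIR):
    """Read one knob under `base`; return None if it cannot be read."""
    try:
        return selected_option((base / relative_name).read_text())
    except OSError:
        # Missing file, non-Linux system, or insufficient permissions.
        return None


if __name__ == "__main__":
    for name in ("enabled", "defrag", "khugepaged/defrag"):
        print(name, "->", read_thp_setting(name))
```

Reading these files normally needs no special privileges; changing them (the `echo ... >` commands quoted above) needs root. On a machine without the directory, every knob simply reports None.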