Smallest executable program (x86-64 Linux)

Question

I recently came across this post describing the smallest possible ELF executable for Linux. However, the post was written for 32-bit, and I was unable to get the final version to compile on my machine.

This brings me to the question: what's the smallest x86-64 ELF executable it's possible to write that runs without error?

Abusing or violating the ELF specification is ok as long as current Linux kernels in practice will run it, as with the last couple versions of the MuppetLabs 32-bit teensy ELF executable article.

What machine do you have? Windows Subsystem for Linux (which doesn't support 32-bit executables at all)? Or a proper Linux kernel built without IA-32 compat? What do you mean you couldn't get the final version to even compile? Surely you got a binary file, but couldn't run it? (Anyway, I know your question isn't about that, but if you couldn't even compile the 32-bit version, you probably won't be able to use NASM's flat-binary output to create a 64-bit executable with code packed into the ELF headers either.)
– Peter Cordes, Commented Nov 19, 2018 at 21:10
Can you use 32-bit int 0x80 system calls in your 64-bit executable? If so, you probably don't need to change much. I know there's some overlap of ELF header fields being interpreted as part of the machine code, so some change might be needed for ELF64.
– Peter Cordes, Commented Nov 19, 2018 at 21:13
For 64-bit mode, you basically need to recreate the entire program, as both the machine code and the layout of the ELF header are quite different. While this is a nice exercise for an experienced programmer, I'm not sure if you are going to get an answer to your question within the scope of this site.
– fuz, Commented Nov 19, 2018 at 21:33
I'm voting to close this question as off-topic because code golf questions are off-topic on StackOverflow.
– Ross Ridge, Commented Nov 19, 2018 at 22:32
This is not just a "code golf" question IMO; it has practical value as well. I came here because I was interested in writing a tiny assembly program by hand, and was looking for a starting point.
– Brandon, Commented Oct 23, 2022 at 1:05

Matteo Italia · Accepted Answer · 2018-11-19 22:25:49Z

Starting from an answer of mine about the "real" entrypoint of an ELF executable on Linux and "raw" syscalls, we can strip it down to

    bits 64
    global _start
    _start:
        mov di,42       ; only the low byte of the exit code is kept,
                        ; so we can use di instead of the full edi/rdi
        xor eax,eax
        mov al,60       ; shorter than mov eax,60
        syscall         ; perform the syscall

I don't think you can get it to be any smaller without going out of specs - in particular, the psABI doesn't guarantee anything about the state of eax. This gets assembled to precisely 10 bytes (as opposed to the 7 bytes of the 32 bit payload):

66 bf 2a 00 31 c0 b0 3c 0f 05
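To see where the 10 bytes come from, here is a small Python sketch that splits the dump above by instruction. The split is my own reading of the encoding (not assembler output), and the names `PAYLOAD` and `payload_bytes` are just for illustration.

```python
# Machine code of the payload above, split by instruction (hand-annotated).
PAYLOAD = [
    ("mov di, 42",   bytes.fromhex("66 bf 2a 00")),  # 0x66 prefix selects 16-bit operand size
    ("xor eax, eax", bytes.fromhex("31 c0")),
    ("mov al, 60",   bytes.fromhex("b0 3c")),
    ("syscall",      bytes.fromhex("0f 05")),
]


def payload_bytes(instructions):
    """Concatenate the encoded instructions into one byte string."""
    return b"".join(code for _, code in instructions)


if __name__ == "__main__":
    for text, code in PAYLOAD:
        print(f"{code.hex(' '):<12} {text:<14} {len(code)} bytes")
    print("total:", len(payload_bytes(PAYLOAD)), "bytes")
```

The operand-size prefix on `mov di, 42` is what makes that instruction 4 bytes long.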

The straightforward way (assemble with nasm, link with ld) gives me a 352-byte executable.

The first "real" transformation the author of the original 32-bit article does is building the ELF "by hand"; doing this (with some modifications, as the ELF header for x86_64 is a bit bigger)

    bits 64
                org 0x08048000

    ehdr:                                   ; Elf64_Ehdr
                db 0x7F, "ELF", 2, 1, 1, 0  ;   e_ident
        times 8 db 0
                dw 2                        ;   e_type
                dw 62                       ;   e_machine
                dd 1                        ;   e_version
                dq _start                   ;   e_entry
                dq phdr - $$                ;   e_phoff
                dq 0                        ;   e_shoff
                dd 0                        ;   e_flags
                dw ehdrsize                 ;   e_ehsize
                dw phdrsize                 ;   e_phentsize
                dw 1                        ;   e_phnum
                dw 0                        ;   e_shentsize
                dw 0                        ;   e_shnum
                dw 0                        ;   e_shstrndx

    ehdrsize    equ $ - ehdr

    phdr:                                   ; Elf64_Phdr
                dd 1                        ;   p_type
                dd 5                        ;   p_flags
                dq 0                        ;   p_offset
                dq $$                       ;   p_vaddr
                dq $$                       ;   p_paddr
                dq filesize                 ;   p_filesz
                dq filesize                 ;   p_memsz
                dq 0x1000                   ;   p_align

    phdrsize    equ $ - phdr

    _start:
                mov di,42       ; only the low byte of the exit code is kept,
                                ; so we can use di instead of the full edi/rdi
                xor eax,eax
                mov al,60       ; shorter than mov eax,60
                syscall         ; perform the syscall

    filesize    equ $ - $$

we get down to 130 bytes. This is a tad bigger than the 91-byte executable, but that comes from the fact that several fields become 64 bits instead of 32.
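The 130 vs. 91 difference can be checked with a few lines of Python. This is only a sketch: the `struct` format strings below are my transcription of the Elf32/Elf64 header layouts, and the constant names are made up for the example.

```python
import struct

# Little-endian layouts of the ELF file header and program header.
# 16s = e_ident, H = 16-bit field, I = 32-bit field, Q = 64-bit field.
ELF32_EHDR = "<16sHHIIIIIHHHHHH"
ELF32_PHDR = "<IIIIIIII"
ELF64_EHDR = "<16sHHIQQQIHHHHHH"
ELF64_PHDR = "<IIQQQQQQ"

PAYLOAD32 = 7    # size of the 32-bit payload, as stated above
PAYLOAD64 = 10   # size of the 64-bit payload from this answer


def file_size(ehdr_fmt, phdr_fmt, payload):
    """Size of a file made of one ELF header, one program header and the code."""
    return struct.calcsize(ehdr_fmt) + struct.calcsize(phdr_fmt) + payload


if __name__ == "__main__":
    print("32-bit:", file_size(ELF32_EHDR, ELF32_PHDR, PAYLOAD32))
    print("64-bit:", file_size(ELF64_EHDR, ELF64_PHDR, PAYLOAD64))
```

Of the 39 extra bytes, 36 come from the wider headers (64 + 56 instead of 52 + 32) and 3 from the longer payload.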

We can then apply some tricks similar to his; the partial overlap of phdr and ehdr can be done, although the order of fields in phdr is different, and we have to overlap p_flags with e_shnum (which however should be ignored due to e_shentsize being 0).

Moving the code inside the header is slightly more difficult, as it's 3 bytes larger, but that part of the header is just as big as in the 32-bit case. We overcome this by starting 2 bytes earlier, at offset 6 of e_ident, overwriting the OS/ABI byte (EI_OSABI, normally 0; ok) and the ELF version byte (EI_VERSION; not ok, but still works).

So, we reach:

    bits 64
                org 0x08048000

    ehdr:                                   ; Elf64_Ehdr
                db 0x7F, "ELF", 2, 1        ;   e_ident
    _start:
                mov di,42       ; only the low byte of the exit code is kept,
                                ; so we can use di instead of the full edi/rdi
                xor eax,eax
                mov al,60       ; shorter than mov eax,60
                syscall         ; perform the syscall
                dw 2                        ;   e_type
                dw 62                       ;   e_machine
                dd 1                        ;   e_version
                dq _start                   ;   e_entry
                dq phdr - $$                ;   e_phoff
                dq 0                        ;   e_shoff
                dd 0                        ;   e_flags
                dw ehdrsize                 ;   e_ehsize
                dw phdrsize                 ;   e_phentsize
    phdr:                                   ; Elf64_Phdr
                dw 1                        ;   e_phnum         p_type
                dw 0                        ;   e_shentsize
                dw 5                        ;   e_shnum         p_flags
                dw 0                        ;   e_shstrndx
    ehdrsize    equ $ - ehdr
                dq 0                        ;                   p_offset
                dq $$                       ;                   p_vaddr
                dq $$                       ;                   p_paddr
                dq filesize                 ;                   p_filesz
                dq filesize                 ;                   p_memsz
                dq 0x1000                   ;                   p_align

    phdrsize    equ $ - phdr
    filesize    equ $ - $$

which is 112 bytes long.
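For anyone who wants to check the overlap without NASM, here is a Python transcription of the listing above. It is a sketch: the machine code bytes are the 10-byte payload from earlier in this answer, the function name `build_image` and the constants are mine, and I only check the layout here, not that every kernel accepts the file.

```python
import struct

BASE = 0x08048000          # org address from the listing
EHDR_SIZE, PHDR_SIZE = 64, 56
PHDR_OFF = 56              # phdr starts 8 bytes before the end of ehdr
FILESIZE = 112

CODE = bytes.fromhex("66 bf 2a 00 31 c0 b0 3c 0f 05")  # the 10-byte payload


def build_image():
    image = b"\x7fELF\x02\x01"                 # first 6 bytes of e_ident
    image += CODE                              # _start, at offset 6
    image += struct.pack("<HHIQQQIHH",
                         2, 62, 1,             # e_type, e_machine, e_version
                         BASE + 6,             # e_entry = _start
                         PHDR_OFF,             # e_phoff
                         0, 0,                 # e_shoff, e_flags
                         EHDR_SIZE, PHDR_SIZE) # e_ehsize, e_phentsize
    # Shared bytes: e_phnum/e_shentsize double as p_type,
    # e_shnum/e_shstrndx double as p_flags.
    image += struct.pack("<HHHH", 1, 0, 5, 0)
    image += struct.pack("<QQQQQQ",
                         0, BASE, BASE,        # p_offset, p_vaddr, p_paddr
                         FILESIZE, FILESIZE,   # p_filesz, p_memsz
                         0x1000)               # p_align
    assert len(image) == FILESIZE
    return image


if __name__ == "__main__":
    print(len(build_image()), "bytes")
```

Reading the last 56 bytes back as an Elf64_Phdr gives p_type = 1 (PT_LOAD) and p_flags = 5 (R+X), which is the point of the overlap.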

Here I stop for the moment, as I don't have much time for this right now. You now have the basic layout with the relevant modifications for 64-bit, so you just have to experiment with more audacious overlaps.

If you're golfing for code-size and you still want to _exit(42) instead of xor edi,edi like a normal person, you'd use push 42/pop rdi (3 bytes) instead of a 4-byte 66 mov-di imm16. And then a 3-byte lea eax, [rdi - 42 + 60] or another push/pop. Tips for golfing in x86/x64 machine code. Of course in practice Linux does zero all the registers before process startup. Depending on your golfing rules, you might take advantage. (codegolf.SE only requires that code work on at least one implementation, not necessarily all.)
To set only the low byte, another option is mov al,42 (2 bytes) /xchg eax,edi (1 byte).
@PeterCordes: argh the usual push/pop trick, I keep forgetting it... probably it's because I usually golf in 16 bit x86, where they aren't as useful (except for segment registers). _exit(42) is there to match the original, otherwise I would have just made it exit with whatever happened to be in rdi :-D. Unfortunately, as this is not a "regular" code-golf, there aren't really well-defined rules...
I am at 9 Bytes with use64; xor edi, edi; mov al, 42; xchg eax, edi; mov al, 60; syscall?
@mercury0114: the code itself is 12 bytes, the rest is various headers, the symbol table, the definition of other standard executable sections and stuff like that. Assembling your code with nasm -felf and linking it with ld -m elf_i386 I get 484 bytes, doing strip -s over the resulting binary gets down to 248 (you can get an idea of the content before/after using objdump -x -D).
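To compare the payload variants mentioned in these comments, here is a rough Python tally. The byte encodings are hand-assembled by me (an assumption, not assembler output), and `VARIANTS` / `size_of` are illustrative names.

```python
# Hand-encoded x86-64 machine code for three exit(42) payloads.
VARIANTS = {
    "answer: mov di / xor / mov al": [
        ("mov di, 42",               "66 bf 2a 00"),
        ("xor eax, eax",             "31 c0"),
        ("mov al, 60",               "b0 3c"),
        ("syscall",                  "0f 05"),
    ],
    "push/pop + lea": [
        ("push 42",                  "6a 2a"),
        ("pop rdi",                  "5f"),
        ("lea eax, [rdi - 42 + 60]", "8d 47 12"),
        ("syscall",                  "0f 05"),
    ],
    # Only correct if the registers start out as zero.
    "mov al / xchg": [
        ("mov al, 42",               "b0 2a"),
        ("xchg eax, edi",            "97"),
        ("mov al, 60",               "b0 3c"),
        ("syscall",                  "0f 05"),
    ],
}


def size_of(variant):
    """Total encoded size of a list of (mnemonic, hex bytes) pairs."""
    return sum(len(bytes.fromhex(code)) for _, code in variant)


if __name__ == "__main__":
    for name, variant in VARIANTS.items():
        print(f"{size_of(variant):2d} bytes  {name}")
```

The push/pop version does not depend on the initial register state, while the last one does.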

Peter Cordes · Accepted Answer · 2024-11-26 19:43:08Z

Most articles out there give up on ld and resort to hand-crafting the ELF headers waaay too soon, including the amazing answer from Matteo Italia.

I've discovered you can get to the standard ELF header + program code limit of 120-ish bytes using only standard tools, with no need to insert the ELF header in your ASM.

Standard assembly code, with a few tricks:

    ; tiny.asm
    BITS 64
    SECTION .text align=1
    GLOBAL _start
    _start:
        ; _exit(42)
        ; Linux zeroes the registers at process start, so al/dil are safe to use
        mov al, 60      ; Select the _exit syscall (60 in Linux ABI)
        mov dil, 42     ; Set the exit code argument for _exit syscall
        syscall         ; Perform the selected syscall

Remarks:

Using al/dil instead of the common eax/edi or the naive rax/rdi, for a 7-byte code payload. This relies on Linux zeroing the general-purpose registers at process start, which it does in practice even though, as noted in Matteo's answer, the psABI doesn't guarantee it.
align=1 so ld with its default linker script for a non-PIE can pick 0x400078 as the program entry point address, putting the payload right after the ELF header. As explained by @ecm, NASM's default is align=16, which makes ld use 8 bytes of padding to get to 0x400080.
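The effect of the section alignment on the entry point is simple arithmetic. Here is a sketch, assuming ld's usual non-PIE base of 0x400000 and a single program header; `align_up` and `entry_point` are names I made up for the example.

```python
ELF64_EHDR_SIZE = 64
ELF64_PHDR_SIZE = 56
LOAD_BASE = 0x400000   # assumption: default base address for a non-PIE x86-64 link


def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def entry_point(section_align, phnum=1):
    """Address of the first .text byte placed right after the headers."""
    headers_end = LOAD_BASE + ELF64_EHDR_SIZE + phnum * ELF64_PHDR_SIZE
    return align_up(headers_end, section_align)


if __name__ == "__main__":
    print(hex(entry_point(1)))    # with align=1
    print(hex(entry_point(16)))   # with NASM's default align=16
```

The 8 bytes between the two results are exactly the padding that align=1 avoids.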

And now some fine-tuned command-line arguments:

nasm -f elf64 tiny.asm && ld -s -no-pie -z noseparate-code tiny.o -o tiny

Results:

    $ wc -c tiny && ./tiny; echo $?
    336 tiny
    42

That's already better than the 352 bytes Matteo had in his last ld attempt. And the code payload only accounts for 3 of the 16 bytes saved.

But the payload is not the point here. My goal is to get rid of all section headers, so we get to the 120+payload size which is the absolute minimum before manually fiddling with the ELF header.

Given our 7-byte payload, we aim for a 127-byte binary, breaking the ~300 bytes barrier.

    $ strip --strip-section-headers tiny && wc -c tiny && ./tiny; echo $?
    127 tiny
    42

A 62% reduction with a single strip, and goal achieved!

    $ readelf -Wa tiny
    ELF Header:
      Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
      Class:                             ELF64
      Data:                              2's complement, little endian
      Version:                           1 (current)
      OS/ABI:                            UNIX - System V
      ABI Version:                       0
      Type:                              EXEC (Executable file)
      Machine:                           Advanced Micro Devices X86-64
      Version:                           0x1
      Entry point address:               0x400078
      Start of program headers:          64 (bytes into file)
      Start of section headers:          0 (bytes into file)
      Flags:                             0x0
      Size of this header:               64 (bytes)
      Size of program headers:           56 (bytes)
      Number of program headers:         1
      Size of section headers:           0 (bytes)
      Number of section headers:         0
      Section header string table index: 0

    There are no sections in this file.

    There are no section groups in this file.

    Program Headers:
      Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
      LOAD           0x000000 0x0000000000400000 0x0000000000400000 0x00007f 0x00007f R E 0x1000

    There is no dynamic section in this file.

    There are no relocations in this file.
    No processor specific unwind information to decode

    Dynamic symbol information is not available for displaying symbols.

    No version information found in this file.

This is a result Matteo and most articles only achieved by pasting the ELF header in ASM and editing the loader address by hand, but now we tamed nasm, ld and strip to do it automatically for us.
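If readelf isn't handy, the same check can be done with a few lines of Python. This is a sketch, with the Elf64_Ehdr layout written out as a `struct` format string; the helper names are mine.

```python
import struct
import sys

# Elf64_Ehdr layout (little-endian), 64 bytes in total.
EHDR = struct.Struct("<16sHHIQQQIHHHHHH")
FIELDS = (
    "e_ident", "e_type", "e_machine", "e_version", "e_entry", "e_phoff",
    "e_shoff", "e_flags", "e_ehsize", "e_phentsize", "e_phnum",
    "e_shentsize", "e_shnum", "e_shstrndx",
)


def parse_ehdr(data):
    """Return the ELF64 file header fields of `data` as a dict."""
    return dict(zip(FIELDS, EHDR.unpack_from(data)))


def has_section_headers(data):
    """True if the header points at a non-empty section header table."""
    hdr = parse_ehdr(data)
    return hdr["e_shoff"] != 0 and hdr["e_shnum"] != 0


# Usage: python3 check.py tiny   (the script name is arbitrary)
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as f:
        image = f.read()
    hdr = parse_ehdr(image)
    print("size:", len(image))
    print("entry: %#x" % hdr["e_entry"])
    print("section headers:", has_section_headers(image))
```

For the stripped binary above, I would expect it to report no section headers, matching the readelf output.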

And, for completeness, the 4-byte payload true clone that yields an impressive 124 bytes in just 5 lines, which I believe is the smallest possible size before non-standard approaches like overlapping headers and embedding the payload in them:

    SECTION .text align=1
    GLOBAL _start
    _start:
        mov al, 60
        syscall

nasm -f elf64 tiny.asm && ld -s -no-pie -z noseparate-code tiny.o -o tiny && strip --strip-section-headers tiny && wc -c tiny && ./tiny; echo $?

    124 tiny
    0

A tiny executable and a tiny source!

Nice Result! I tried to reproduce it and got stuck at 'strip'. It seems like I don't have the '--strip-section-headers' available. Any hint?
@bennib22 : It is a fairly recent addition to strip, added in binutils 2.41 released in July 2023, according to its release notes.

Peter Cordes · Accepted Answer · 2024-01-31 06:12:34Z

Updated Answer

After seeing the tricks used in @Matteo Italia's answer, I found it's possible to reach 112 bytes, since we can hide not only the string but also the code in the ELF header.

Explanations: The key idea is hiding everything in the header, including the string "Hello World!\n" and the code to print the string. We should first test which parts of the header are modifiable (i.e. we can change the value and the program still executes). Then we hide our data and code in the header, as the following code shows (assemble with nasm -f bin ./x.asm):

This source code is based on @Matteo Italia's answer but completes the part he didn't show, of printing Hello World as well as exiting. There doesn't seem to be a way to make it any shorter; the kernel requires the file to be big enough to contain the ELF headers.
This version has some nop instructions in other space that's available for use inside / between the ELF headers which we can't avoid. We still have space to waste in p_paddr and p_align.

    bits 64
                org 0x08048000

    ehdr:                                   ; Elf64_Ehdr
                db 0x7F, "ELF"              ;   e_ident
    _start:
                mov dl, 13
                mov esi,STR
                pop rax
                syscall
                jmp _S0
                dw 2                        ;   e_type
                dw 62                       ;   e_machine
                dd 0xff                     ;   e_version
                dq _start                   ;   e_entry
                dq phdr - $$                ;   e_phoff
    STR:
                db "Hello Wo"               ;   e_shoff
                db "rld!"                   ;   e_flags
                dw 0x0a                     ;   e_ehsize, the place where we hide the newline character
                dw phdrsize                 ;   e_phentsize
    phdr:                                   ; Elf64_Phdr
                dw 1                        ;   e_phnum         p_type
                dw 0                        ;   e_shentsize
                dw 5                        ;   e_shnum         p_flags
                dw 0                        ;   e_shstrndx
    ehdrsize    equ $ - ehdr
                dq 0                        ;                   p_offset
                dq $$                       ;                   p_vaddr
    _S0:
                nop                         ; unused space for more code
                nop
                nop
                nop
                nop
                nop
                jmp _S1                     ; p_paddr: these 8 bytes belong to p_paddr,
                                            ; I nop them to show we can add some asm code here
                dq filesize                 ;                   p_filesz
                dq filesize                 ;                   p_memsz
    _S1:
                mov eax,60                  ;                   p_align[0:5]
                syscall                     ;                   p_align[5:7]
                nop                         ;                   p_align[7:8]

    phdrsize    equ $ - phdr
    filesize    equ $ - $$
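To double-check the offsets in the listing above, here is a Python transcription that builds the same 112 bytes. Treat it as a sketch: the machine code is hand-encoded by me, the two jump displacements are computed from the offsets I worked out for `_S0` and `_S1`, and `build_hello` is a made-up name.

```python
import struct

BASE = 0x08048000            # org address from the listing
STR_OFF, PHDR_OFF = 40, 56   # file offsets of STR and phdr
S0_OFF, S1_OFF = 80, 104     # file offsets of _S0 and _S1
FILESIZE = 112


def build_hello():
    code = (b"\xb2\x0d"                                    # mov dl, 13
            + b"\xbe" + struct.pack("<I", BASE + STR_OFF)   # mov esi, STR
            + b"\x58"                                       # pop rax (argc, 1 when run without arguments)
            + b"\x0f\x05"                                   # syscall
            + b"\xeb" + bytes([S0_OFF - 16]))               # jmp _S0 (rel8 from offset 16)
    image = b"\x7fELF" + code                               # 4 + 12 = 16 bytes of e_ident
    image += struct.pack("<HHIQQ", 2, 62, 0xFF, BASE + 4, PHDR_OFF)
    image += b"Hello Wo" + b"rld!"                          # e_shoff, e_flags
    image += struct.pack("<HH", 0x0A, 56)                   # e_ehsize holds the newline, e_phentsize
    image += struct.pack("<HHHH", 1, 0, 5, 0)               # e_phnum.. overlapping p_type, p_flags
    image += struct.pack("<QQ", 0, BASE)                    # p_offset, p_vaddr
    image += b"\x90" * 6 + b"\xeb" + bytes([S1_OFF - 88])   # _S0: nops, jmp _S1 (inside p_paddr)
    image += struct.pack("<QQ", FILESIZE, FILESIZE)         # p_filesz, p_memsz
    image += b"\xb8" + struct.pack("<I", 60)                # _S1: mov eax, 60 (inside p_align)
    image += b"\x0f\x05" + b"\x90"                          # syscall, nop
    assert len(image) == FILESIZE
    return image


if __name__ == "__main__":
    image = build_hello()
    print(len(image), "bytes, string:", image[STR_OFF:STR_OFF + 13])
```

Note how the 13-byte string straddles e_shoff, e_flags and the low byte of e_ehsize.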

Original Post:

I have a 129-byte x64 "Hello World!".

Step 1. Assemble the following code with nasm -f bin hw.asm

    ; hw.asm
    BITS 64
      org 0x400000

    ehdr:                   ; Elf64_Ehdr
      db 0x7f, "ELF", 2, 1, 1, 0 ; e_ident
      times 8 db 0
      dw  2                 ; e_type
      dw  0x3e              ; e_machine
      dd  1                 ; e_version
      dq  _start            ; e_entry
      dq  phdr - $$         ; e_phoff
      dq  0                 ; e_shoff
      dd  0                 ; e_flags
      dw  ehdrsize          ; e_ehsize
      dw  phdrsize          ; e_phentsize
    phdr:                   ; Elf64_Phdr
      dd  1                 ; e_phnum      ; p_type
                            ; e_shentsize
      dd  5                 ; e_shnum      ; p_flags
                            ; e_shstrndx
    ehdrsize  equ  $ - ehdr
      dq  0                 ; p_offset
      dq  $$                ; p_vaddr
      dq  $$                ; p_paddr
      dq  filesize          ; p_filesz
      dq  filesize          ; p_memsz
      dq  0x1000            ; p_align
    phdrsize  equ  $ - phdr

    _start:
      ; write "Hello World!" to fd 0 (rdi is still 0), then exit
      pop rax               ; rax = argc = 1 = write
      mov dl, 60            ; length 60, so write returns 60 = exit
      mov esi, hello
      syscall
      syscall
    hello: db "Hello World!", 10 ; 10 is the ASCII code for newline
    filesize  equ  $ - $$

Step 2. Modify it with the following Python script

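    # pwntools is imported and configured here, but nothing below actually uses it
    from pwn import *
    context.log_level='debug'
    context.arch='amd64'
    context.terminal = ['tmux', 'splitw', '-h', '-F' '#{pane_pid}', '-P']

    with open('./hw','rb') as f:
        pro = f.read()
    print(len(pro))
    pro = list(pro)
    cut = 0x68                      # offset of p_align, the last 8 bytes of the program header
    pro[0x18] = cut                 # low byte of e_entry: the code now starts at 0x68, not 0x70
    pro[0x74] = 0x7c-(0x70-cut)     # low byte of the string address in mov esi, hello
    pro = pro[:cut]+pro[0x70:]      # drop the 8 bytes of p_align
    print(pro)
    x = b''
    for _ in pro:
        x+=_.to_bytes(1,'little')
    with open("X",'wb') as f:
        f.write(x)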

You should get a 129-byte "Hello World".

    [18:19:02] n132 :: xps ➜ /tmp » strace ./X
    execve("./X", ["./X"], 0x7fffba3db670 /* 72 vars */) = 0
    write(0, "Hello World!\n\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 60Hello World!
    ) = 60
    exit(0)                                 = ?
    +++ exited with 0 +++
    [18:19:04] n132 :: xps ➜ /tmp » ./X
    Hello World!
    [18:19:11] n132 :: xps ➜ /tmp » ls -la ./X
    -rwxrwxr-x 1 n132 n132 129 Jan 29 18:18 ./X
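To make it clearer what the patch in Step 2 does, here is the same transformation with named offsets instead of magic numbers. It is a sketch that assumes the 137-byte layout produced by Step 1 (load address 0x400000, code at 0x70); the constant and function names are hypothetical.

```python
P_ALIGN_OFF = 0x68   # p_align: last 8 bytes of the program header
CODE_OFF = 0x70      # _start in the unpatched file
E_ENTRY_OFF = 0x18   # low byte of e_entry
STR_IMM_OFF = 0x74   # low byte of the imm32 in "mov esi, hello"


def drop_p_align(image):
    """Remove p_align and shift the entry point and string address down by 8."""
    removed = CODE_OFF - P_ALIGN_OFF
    out = bytearray(image)
    out[E_ENTRY_OFF] -= removed     # 0x70 -> 0x68
    out[STR_IMM_OFF] -= removed     # 0x7c -> 0x74
    del out[P_ALIGN_OFF:CODE_OFF]   # cut the 8 bytes of p_align
    return bytes(out)


if __name__ == "__main__":
    with open("hw", "rb") as f:
        patched = drop_p_align(f.read())
    with open("X", "wb") as f:
        f.write(patched)
    print(len(patched), "bytes")
```

Everything after the program header, including the code and the string, moves 8 bytes closer to the start of the file, which is why both addresses need adjusting.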

What changes does the Python code make, and why do that with Python instead of NASM macros + directives? Clever trick, though, to write with a length of 60 = __NR_exit including trailing garbage, so the return value is the call number for the next syscall. And to use rax = argc as __NR_write. This also depends on stdin (fd 0) being a read-write FD that's open on the terminal, since you write(0, hello, 60).
This program doesn't respect ./hello > /dev/null, but breaks if you close or redirect stdin. Which is fine, it still works in a normal terminal, but worth at least a mention in the comments to document that it's intentionally writing stdin to save bytes (because Linux initializes register values to 0 in a freshly-execed process.)
You are right. I used stdin to save bytes as well as truncate the ELF header. I don't know how to do that with NASM, so I just used Python, and found that at most we can drop the last 8 bytes of the header. Also, we can hide the string "Hello World!\n" in the ELF header. I got a 118-byte Hello World by using this trick. (For this case I have to set RDX for SYS_write and RAX for SYS_exit correctly since there are non-zero bytes after "Hello World\n".) It's still not hard to make it smaller than 118 bytes.
You can't truncate or overwrite bytes you've already emitted with NASM, so just comment out the dq 0x1000 ; p_align line to not emit those 8 bytes in the first place. (Leaving it there commented out is a good way to document what you're doing, along with other comments. Unlike your Python code full of magic numbers with no comments.)
since there are non-zero bytes after "Hello World\n" - can you put the code inside the ELF header instead of the string? The string is 13 bytes, the machine code is 12. Or do the bytes need to have certain values, and re-ordering your asm instructions can't achieve that?
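The length-60 trick discussed in these comments works because write returns the number of bytes written, and that return value is left in rax for the next syscall. A loose user-space analogy in Python (an assumption-laden sketch, writing to the null device rather than fd 0; `bytes_written` is a made-up helper):

```python
import os

SYS_EXIT = 60   # x86-64 Linux syscall number of exit


def bytes_written(message, length):
    """Write `message` padded with NUL bytes to `length` and return write's result."""
    buf = message.ljust(length, b"\0")
    fd = os.open(os.devnull, os.O_WRONLY)
    try:
        return os.write(fd, buf)
    finally:
        os.close(fd)


if __name__ == "__main__":
    result = bytes_written(b"Hello World!\n", SYS_EXIT)
    print("write returned", result, "which equals the exit syscall number:", result == SYS_EXIT)
```

In the assembly version the padding is whatever bytes follow the string in memory, which is why the strace output shows trailing zero bytes.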
