
I tried to write the smallest possible x86_64 ELF hello world program by hand, but I receive a Segmentation fault when trying to run it.

gdb says: During startup program terminated with signal SIGSEGV, Segmentation fault.

Here is the hexdump:

    00000000: 7f45 4c46 0201 0100 0000 0000 0000 0000  .ELF............
    00000010: 0200 3e00 0100 0000 7800 0000 0000 0000  ..>.....x.......
    00000020: 4000 0000 0000 0000 0000 0000 0000 0000  @...............
    00000030: 0000 0000 4000 3800 0100 0000 0000 0000  ....@.8.........
    00000040: 0100 0000 0500 0000 0000 0000 0000 0000  ................
    00000050: 0000 0000 0000 0000 0000 0000 0000 0000  ................
    00000060: 3100 0000 0000 0000 3100 0000 0000 0000  1.......1.......
    00000070: 0200 0000 0000 0000 b801 0000 00bf 0100  ................
    00000080: 0000 be9a 0000 00ba 0a00 0000 0f05 b83c  ...............<
    00000090: 0000 00bf 0000 0000 0f05 4865 6c6c 6f2c  ..........Hello,
    000000a0: 2057 6f72 6c64 210a 00                    World!..

Here is the output of readelf -a:

    ELF Header:
      Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
      Class:                             ELF64
      Data:                              2's complement, little endian
      Version:                           1 (current)
      OS/ABI:                            UNIX - System V
      ABI Version:                       0
      Type:                              EXEC (Executable file)
      Machine:                           Advanced Micro Devices X86-64
      Version:                           0x1
      Entry point address:               0x78
      Start of program headers:          64 (bytes into file)
      Start of section headers:          0 (bytes into file)
      Flags:                             0x0
      Size of this header:               64 (bytes)
      Size of program headers:           56 (bytes)
      Number of program headers:         1
      Size of section headers:           0 (bytes)
      Number of section headers:         0
      Section header string table index: 0

    There are no sections in this file.

    There are no section groups in this file.

    Program Headers:
      Type           Offset             VirtAddr           PhysAddr
                     FileSiz            MemSiz              Flags  Align
      LOAD           0x0000000000000000 0x0000000000000000 0x0000000000000000
                     0x0000000000000031 0x0000000000000031  R E    0x2

    There is no dynamic section in this file.

    There are no relocations in this file.

    No processor specific unwind information to decode

    Dynamic symbol information is not available for displaying symbols.

    No version information found in this file.

And here is the code:

    0xb8 0x01 0x00 0x00 0x00    /* mov %rax, 1 ; sys_write */
    0xbf 0x01 0x00 0x00 0x00    /* mov %rdi, 1 ; STDOUT */
    0xbe 0x9a 0x00 0x00 0x00    /* mov %rsi, 0x9a ; address of string */
    0xba 0x0a 0x00 0x00 0x00    /* mov %rdi, 15 ; size of string */
    0x0f 0x05                   /* syscall */
    0xb8 0x3c 0x00 0x00 0x00    /* mov %rax, 60 ; sys_exit */
    0xbf 0x00 0x00 0x00 0x00    /* mov %rdi, 0 ; exit status */
    0x0f 0x05                   /* syscall */

The "Hello, World!\n" string follows immediately afterwards. I have been using this MOV instruction. Playing around with the program header offset, alignment and virtual address fields did not yield anything. The manpage is a little confusing in this section. I also tried comparing this binary to one written in assembly, but I've found nothing useful.

Now to my question: Can you tell me what the mistake is and/or how I can debug this binary?

  • Don't you have to have a section, in order for anything to be loaded? The elf header isn't loaded into your process address space. Commented Jul 10, 2022 at 18:59
  • If the address of the entry point is 78, then the address of the string is 9a, not 22. Commented Jul 10, 2022 at 19:01
  • "Don't you have to have a section, in order for anything to be loaded?" According to the specification, they are optional. "If the address of the entry point is 78, then the address of the string is 9a, not 22." Ah, thanks, I forgot to modify this value after playing around with the program header. It doesn't solve the segfault, though. Commented Jul 10, 2022 at 19:04
  • Perhaps have a look at this for inspiration? muppetlabs.com/~breadbox/software/tiny/teensy.html Commented Jul 10, 2022 at 19:35
  • The full site has a lot of good info: Muppet Labs - The Teensy Files. Commented Jul 10, 2022 at 20:15

2 Answers


I tried to write the smallest possible x86_64 ELF hello world program by hand

You should provide a source for your program, so we can fix it.

gdb says: During startup program terminated with signal SIGSEGV

This is GDB telling you that it called fork/execve to create the target program, and expected the kernel to notify GDB that the program is now ready to be debugged. Instead, the kernel notified GDB that the program has died with SIGSEGV, without ever reaching its first instruction.

GDB didn't expect that. Why would this happen?

This happens when the kernel looks at your executable, and says "I can't create a running program out of that".

Why is that the case here? Because this LOAD segment:

    Program Headers:
      Type           Offset             VirtAddr           PhysAddr
                     FileSiz            MemSiz              Flags  Align
      LOAD           0x0000000000000000 0x0000000000000000 0x0000000000000000
                     0x0000000000000031 0x0000000000000031  R E    0x2

is asking the kernel to map 0x31 bytes from offset 0 in the file to virtual address 0. But the kernel (rightfully) refuses such a nonsensical request, and terminates the program with SIGSEGV before returning from execve.

You could probably avoid this by making the file ET_DYN instead of ET_EXEC -- that would change the meaning of your program header from "map this segment at 0" to "map this segment anywhere".

You could definitely avoid this by keeping the ET_EXEC, but changing the .p_vaddr and .p_paddr of the segment to something like 0x10000.

TL;DR: Your program and file headers must make sense to the kernel, or you'll never get off the ground.
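One likely reason a mapping at virtual address 0 is refused on Linux is the vm.mmap_min_addr sysctl, which a later comment on this page also points to. Here is a minimal sketch for checking a candidate p_vaddr against it. The helper names are made up for illustration, and the fallback of 65536 is only an assumption about a common distribution default, not something stated in this thread.

```python
from pathlib import Path

# Linux exposes the lowest address an unprivileged mmap may use here.
MMAP_MIN_ADDR_PATH = Path("/proc/sys/vm/mmap_min_addr")
ASSUMED_DEFAULT_MIN_ADDR = 65536  # assumption: common default, used as fallback


def read_mmap_min_addr(path=MMAP_MIN_ADDR_PATH, default=ASSUMED_DEFAULT_MIN_ADDR):
    """Return the system's mmap_min_addr, or the assumed default if unreadable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return default


def vaddr_is_mappable(p_vaddr, min_addr):
    """True if a segment starting at p_vaddr is not below the minimum address."""
    return p_vaddr >= min_addr


if __name__ == "__main__":
    min_addr = read_mmap_min_addr()
    for vaddr in (0x0, 0x1000, 0x10000):
        verdict = "ok" if vaddr_is_mappable(vaddr, min_addr) else "too low"
        print(f"p_vaddr={vaddr:#x}: {verdict} (mmap_min_addr={min_addr:#x})")
```

This only covers the address check. It says nothing about the other conditions the kernel places on a program header.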


5 Comments

ET_DYN (a PIE executable) would require a RIP-relative lea rsi, [rip + msg], not the current mov esi, 0x9a. Fortunately there's tons of room to save space on the other instructions (like push 1 / pop rdi (3B) / mov eax, edi 2B / lea edx, [rdi-1 + msg.len] 3B), so the total payload can still fit into whatever space they found to tuck it in to. Unless all those zeroes in the machine code needed to be zeros in ELF header fields, like how muppetlabs.com/~breadbox/software/tiny/teensy.html eventually has a version with the text segment overlapping with the ELF headers.
@Employed Russian Thanks for the advice, but changing p_vaddr and p_paddr to 0x1000, p_offset to 0x78 (the start of the code) and the string address in the code to 0x22 0x00 0x01 still results in a segfault. As for the source... I put it in the question. I wrote this binary by hand.
@DieDummheitInPerson You can't set .p_vaddr = 0x1000 and .p_offset = 0x78 -- they must be congruent modulo the page size. Leave .p_offset = 0 and try again.
@Employed Russian Still gives a segfault. If you know a bit more about the "congruent modulo pagesize" or have better resources than the manpage, could you tell me? My best guess is, that it has something to do with this rule. EDIT: Now gdb outputs something different. I will try to debug this with the comments from Peter Cordes and update this comment if I find something out.
@Employed Russian Thank you very much, it worked. Just had to sort out some issues with the code.

The answer that I accepted did the trick. I just want to share the new hexdump of the binary here:

    00000000: 7f45 4c46 0201 0100 0000 0000 0000 0000  .ELF............
    00000010: 0200 3e00 0100 0000 7800 0100 0000 0000  ..>.....x.......
    00000020: 4000 0000 0000 0000 0000 0000 0000 0000  @...............
    00000030: 0000 0000 4000 3800 0100 0000 0000 0000  ....@.8.........
    00000040: 0100 0000 0500 0000 0000 0000 0000 0000  ................
    00000050: 0000 0100 0000 0000 0000 0100 0000 0000  ................
    00000060: 3100 0000 0000 0000 3100 0000 0000 0000  1.......1.......
    00000070: 0200 0000 0000 0000 b801 0000 00bf 0100  ................
    00000080: 0000 be9a 0001 00ba 0f00 0000 0f05 b83c  ...............<
    00000090: 0000 00bf 0000 0000 0f05 4865 6c6c 6f2c  ..........Hello,
    000000a0: 2057 6f72 6c64 210a 00                    World!..

readelf -a:

    ELF Header:
      Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
      Class:                             ELF64
      Data:                              2's complement, little endian
      Version:                           1 (current)
      OS/ABI:                            UNIX - System V
      ABI Version:                       0
      Type:                              EXEC (Executable file)
      Machine:                           Advanced Micro Devices X86-64
      Version:                           0x1
      Entry point address:               0x10078
      Start of program headers:          64 (bytes into file)
      Start of section headers:          0 (bytes into file)
      Flags:                             0x0
      Size of this header:               64 (bytes)
      Size of program headers:           56 (bytes)
      Number of program headers:         1
      Size of section headers:           0 (bytes)
      Number of section headers:         0
      Section header string table index: 0

    There are no sections in this file.

    There are no section groups in this file.

    Program Headers:
      Type           Offset             VirtAddr           PhysAddr
                     FileSiz            MemSiz              Flags  Align
      LOAD           0x0000000000000000 0x0000000000010000 0x0000000000010000
                     0x0000000000000031 0x0000000000000031  R E    0x2

    There is no dynamic section in this file.

    There are no relocations in this file.

    No processor specific unwind information to decode

    Dynamic symbol information is not available for displaying symbols.

    No version information found in this file.

The code:

    0xb8, 0x01, 0x00, 0x00, 0x00,    /* mov $0x1,%rax ; sys_write */
    0xbf, 0x01, 0x00, 0x00, 0x00,    /* mov $0x1,%rdi ; STDOUT */
    0xbe, 0x9a, 0x00, 0x01, 0x00,    /* mov $0x1009a,%rsi ; address of string */
    0xba, 0x0f, 0x00, 0x00, 0x00,    /* mov $0xf,%rdx ; size of string */
    0x0f, 0x05,                      /* syscall */
    0xb8, 0x3c, 0x00, 0x00, 0x00,    /* mov $0x3c,%rax ; sys_exit */
    0xbf, 0x00, 0x00, 0x00, 0x00,    /* mov $0x0,%edi ; exit status */
    0x0f, 0x05                       /* syscall */

As mentioned in the comments, this is not the smallest possible x86_64 ELF binary. The code could be improved, and if you want to be crazy, you can put stuff in unused parts of the ELF header. But in any case, I'm quite satisfied with a file size of 169 bytes.
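For anyone who wants to play with the fields instead of hex-editing, here is a sketch of a small Python script that regenerates the 169-byte file above from named header fields. It is not how the binary was originally made (that was done by hand). The function names are made up for illustration, and the field values are simply copied from the readelf output and hexdump in this answer.

```python
import os
import struct

BASE = 0x10000                        # p_vaddr / p_paddr of the LOAD segment
EHDR_SIZE = 64                        # size of the ELF64 file header
PHDR_SIZE = 56                        # size of one ELF64 program header
CODE_OFFSET = EHDR_SIZE + PHDR_SIZE   # 0x78: code follows the headers directly
MSG = b"Hello, World!\n\x00"          # 15 bytes, trailing NUL kept as in the dump


def mov_imm32(opcode, value):
    """Encode a one-byte-opcode 'mov $imm32, %reg32' instruction."""
    return bytes([opcode]) + struct.pack("<I", value)


def build_code(msg_addr):
    """The machine code from the listing above, with the string address patched in."""
    return b"".join([
        mov_imm32(0xB8, 1),           # mov $1, %eax     ; sys_write
        mov_imm32(0xBF, 1),           # mov $1, %edi     ; stdout
        mov_imm32(0xBE, msg_addr),    # mov $msg, %esi   ; address of string
        mov_imm32(0xBA, len(MSG)),    # mov $15, %edx    ; size of string
        b"\x0f\x05",                  # syscall
        mov_imm32(0xB8, 60),          # mov $60, %eax    ; sys_exit
        mov_imm32(0xBF, 0),           # mov $0, %edi     ; exit status
        b"\x0f\x05",                  # syscall
    ])


def build_elf():
    """Return the complete file image: ELF header, program header, code, string."""
    code_len = len(build_code(0))
    payload = build_code(BASE + CODE_OFFSET + code_len) + MSG
    ehdr = struct.pack(
        "<16sHHIQQQIHHHHHH",
        b"\x7fELF\x02\x01\x01",       # e_ident; "16s" zero-pads the rest
        2,                            # e_type: ET_EXEC
        0x3E,                         # e_machine: x86-64
        1,                            # e_version
        BASE + CODE_OFFSET,           # e_entry: 0x10078
        EHDR_SIZE,                    # e_phoff: program header right after this one
        0, 0,                         # e_shoff, e_flags
        EHDR_SIZE, PHDR_SIZE, 1,      # e_ehsize, e_phentsize, e_phnum
        0, 0, 0,                      # e_shentsize, e_shnum, e_shstrndx
    )
    phdr = struct.pack(
        "<IIQQQQQQ",
        1,                            # p_type: PT_LOAD
        5,                            # p_flags: R + X
        0,                            # p_offset
        BASE, BASE,                   # p_vaddr, p_paddr
        len(payload), len(payload),   # p_filesz, p_memsz: 0x31, mirroring the dump
        2,                            # p_align, as in the dump
    )
    return ehdr + phdr + payload


def write_executable(path):
    """Write the image to 'path' and mark it executable."""
    with open(path, "wb") as f:
        f.write(build_elf())
    os.chmod(path, 0o755)
```

Calling write_executable("hello") should give a file whose hexdump matches the one above. Note that p_filesz and p_memsz deliberately mirror the dump (0x31, the size of code plus string); a stricter header would probably use the full file length, since the segment starts at file offset 0.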

2 Comments

I'd suggest at least the standard optimization of xor %edi, %edi to zero EDI. Or omit that entirely and let your exit status be 1: you still printed Hello World, and nobody said the exit status had to be 0. See "What is the best way to set a register to zero in x86 assembly: xor, mov or and?". Also, the basic golf of mov %eax, %edi takes advantage of __NR_write == STDOUT_FILENO (probably not a coincidence, since Linux x86-64 also chose __NR_read == STDIN_FILENO (0)).
Also, if you have source code for your binary, that would be a good thing to post for future readers that want to play around. (e.g. a .s file with .byte and .quad directives? Or did you just create it with a hex editor so you had nowhere to put comments about which field was what?) But anyway, seems like the key change here was putting the virtual address of your segment above mmap_min_addr.
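To put a number on the two suggestions above (xor %edi, %edi and mov %eax, %edi), here is a rough sketch comparing the original instruction bytes with a lightly golfed variant. The helper names are hypothetical. The entry address 0x10078 and the 15-byte string come from the answer. Shrinking the code moves the string, so the address loaded into %esi and the sizes in the program header would have to be updated as well.

```python
import struct

CODE_VADDR = 0x10078           # entry point from the readelf output in the answer
MSG = b"Hello, World!\n\x00"   # 15 bytes, as in the hexdump


def mov_imm32(opcode, value):
    """Encode a one-byte-opcode 'mov $imm32, %reg32' instruction (5 bytes)."""
    return bytes([opcode]) + struct.pack("<I", value)


def original_code(msg_addr):
    return b"".join([
        mov_imm32(0xB8, 1),         # mov $1, %eax
        mov_imm32(0xBF, 1),         # mov $1, %edi
        mov_imm32(0xBE, msg_addr),  # mov $msg, %esi
        mov_imm32(0xBA, len(MSG)),  # mov $15, %edx
        b"\x0f\x05",                # syscall
        mov_imm32(0xB8, 60),        # mov $60, %eax
        mov_imm32(0xBF, 0),         # mov $0, %edi
        b"\x0f\x05",                # syscall
    ])


def golfed_code(msg_addr):
    return b"".join([
        mov_imm32(0xB8, 1),         # mov $1, %eax
        b"\x89\xc7",                # mov %eax, %edi  (2 bytes instead of 5)
        mov_imm32(0xBE, msg_addr),  # mov $msg, %esi
        mov_imm32(0xBA, len(MSG)),  # mov $15, %edx
        b"\x0f\x05",                # syscall
        mov_imm32(0xB8, 60),        # mov $60, %eax
        b"\x31\xff",                # xor %edi, %edi  (2 bytes instead of 5)
        b"\x0f\x05",                # syscall
    ])


def with_string_address(encoder):
    """Encode once to learn the length, then again with the real string address."""
    length = len(encoder(0))
    return encoder(CODE_VADDR + length)


if __name__ == "__main__":
    for name, encoder in (("original", original_code), ("golfed", golfed_code)):
        code = with_string_address(encoder)
        print(f"{name}: {len(code)} bytes of code, string at {CODE_VADDR + len(code):#x}")
```

This deliberately leaves out the more aggressive tricks mentioned earlier in the thread (push/pop, lea, overlapping the code with header fields).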
