Hello! So it seems to me that there are two major paths to decide between for the thing to add next: * Control structures (if/else, loops) * Write compiled programs to ELF executables Both will be challenging. I'm leaning towards ELF at the moment. We'll see what I think when I come back tomorrow night. Two nights later: Yup, gonna try to write an ELF executable. This is gonna be cool! First, I need to test writing a file. Then write the ELF header, then the contents of a word. I'll start by having 'make_elf' take a token (will use as an output filename for writing the executable and, later, the word get the machine code from). Then I'll write the string 'ELF' to that file. (Which is very appropriate because Bytes 2-5 of a _real_ ELF header are that string.) Next night: So I've got my test 'make_elf' and it is supposed to be writing to whatever filename you want: make_elf foo That should write the string 'ELF' to a file called 'foo', but it's not. So I've inserted a DEBUG to see what the fd returned from 'open' is: $ mr make_elf foo new fd: fffffffe Goodbye. Exit status: 0 Yeah, that's definitely an error. While looking for how to decode that error (the open(2) man page explains the errors, but they're all C mnemonic constants, of course), I came across this excellent suggestion on SO: https://stackoverflow.com/a/68155464 Which was to use strace to decode the error for me! $ strace ./meow5 execve("./meow5", ["./meow5"], 0x7fff2d4ec190 /* 60 vars */) = 0 [ Process PID=2579 runs in 32 bit mode. ] read(0, make_elf foo "make_elf foo\n", 1024) = 13 open("foo", O_WRONLY|0xc) = -1 ENOENT (No such file or directory) write(1, "new fd: ", 8new fd: ) = 8 write(1, "fffffffe\n", 9fffffffe ) = 9 write(-2, "ELF", 3) = -1 EBADF (Bad file descriptor) read(0, "", 1024) = 0 write(1, "Goodbye.\n", 9Goodbye. ) = 9 exit(0) = ? +++ exited with 0 +++ Huh, so something's wrong with my attempt to open the output file with write-only, create, and truncate flags. Here's what I'm sending: ; From open(2) man page: ; A call to creat() is equivalent to calling open() ; with flags equal to O_CREAT|O_WRONLY|O_TRUNC. ; I got the flags by searching all of /usr/include and ; finding /usr/include/asm-generic/fcntl.h ; That yielded (along with bizarre comment "not fcntl"): ; #define O_CREAT 00000100 ; #define O_WRONLY 00000001 ; #define O_TRUNC 00001000 ; Hence this flag value for 'open': mov ecx, 1101b But from the strace above, it looks like it sees O_WRONLY and...0xC - which is, indeed 1100... Sounds like I've got a mystery for tomorrow night. Two nights later: I bet somebody out there is screaming. Ha ha. Those numbers are in octal, not binary (despite looking for all the world like bit flags). So I fixed that one night. Then I had to learn how to set the mode (permissions), which was, like, freakishly hard to find online. All the 'open' examples I found were opening existing files. But since CREAT is an option, obviously there was a way to do it... The search "32 x86 assembly linux syscall table" is the blessed way to ask the major search engines. The answer is: the mode bits (in the usual unix octal owner/group/all format) go in register edx. So: ; ebx contains null-terminated word name (see above) mov ecx, (0100o | 0001o | 1000o) ; open flags mov edx, 666o ; mode (permissions) mov eax, SYS_OPEN int 80h ; now eax will contain the new file desc. And when I went to test it, I was sleepy and forgot that since I was running the binary from strace, it wasn't gonna re-build from source like my shell aliases 'mr', 'mb', 'mt' do, so I couldn't figure out why it wasn't working... ...until I woke up in the middle of the night with the realization. Anyway, next morning, here goes: $ strace ./meow5 execve("./meow5", ["./meow5"], 0x7fff56d5ec40 /* 60 vars */) = 0 [ Process PID=1377 runs in 32 bit mode. ] read(0, make_elf foo "make_elf foo\n", 1024) = 13 open("foo", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3 write(1, "new fd: ", 8new fd: ) = 8 write(1, "00000003\n", 900000003 ) = 9 write(3, "ELF", 3) = 3 read(0, "", 1024) = 0 write(1, "Goodbye.\n", 9Goodbye. ) = 9 exit(0) = ? +++ exited with 0 +++ Awesome, we can see the flags being correctly decoded and the mode/permission param: open("foo", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3 So I've learned that strace rules for this sort of thing! But did it work? $ cat foo ELF Yahoo! Ha ha, I have written a string to a new file. Jeez, that was way harder than I expected. But now I can actually try writing an ELF header. I'm excited. ------------------------------------------------------- 11 nights later: It's the holiday season, which is a lot of exhausting activity (if you're a parent) under the best of circumstances and this was an unusually hard one for the family. So what I could easily have done in a single night ended up stretching out for many nights. But I finally finished the header portion in the .data section and am writing it with the 'make_elf' word (I am *not* writing the word yet). Let's see what it does so far: $ mr make_elf exit new fd: 00000003 Goodbye. Exit status: 0 The "new fd" message is a DEBUG statement I apparently left in there to make sure I was opening the file correctly. If I've done everything correctly, this will have written a file named "exit" with a more-or-less correct ELF header. Let's see what 'file' thinks of it: $ file exit exit: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), can't read elf program headers at 184, no section header Not bad! The program headers error might be due to a bug in my headers or just the fact that I'm not writing the program to the file yet. Let's see what 'readelf' says: $ readelf -a exit ELF Header: Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 Class: ELF32 Data: 2's complement, little endian Version: 1 (current) OS/ABI: UNIX - System V ABI Version: 0 Type: EXEC (Executable file) Machine: Intel 80386 Version: 0x1 Entry point address: 0x8048000 Start of program headers: 184 (bytes into file) Start of section headers: 0 (bytes into file) Flags: 0x0 Size of this header: 52 (bytes) Size of program headers: 32 (bytes) Number of program headers: 1 Size of section headers: 0 (bytes) Number of section headers: 0 Section header string table index: 0 ... readelf: exit: Error: Reading 32 bytes extends past end of file for program headers Yeah, so it looks like my program header offset might be wrong. But otherwise, the decoding looks correct! Next night: Okay, I don't see anything wrong with my header data (program header offset), so I'm gonna try just writing out a program (word) and see what happens. I'm overwriting the program size portion of the program header in data and then writing the header, *then* writing the actual program after that. Every time I call 'make_elf' my elf_header data will contain the last word's size that was written. Anyway, here goes: $ mr make_elf exit prog bytes: 00000008 new fd: 00000003 Goodbye. My 'exit' word is 8 bytes, that sounds right. What does file say? $ file exit exit: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), statically linked, no section header Ooh! No more errors there! And readelf? $ readelf exit ELF Header: Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 Class: ELF32 Data: 2's complement, little endian Version: 1 (current) OS/ABI: UNIX - System V ABI Version: 0 Type: EXEC (Executable file) Machine: Intel 80386 Version: 0x1 Entry point address: 0x8048000 Start of program headers: 52 (bytes into file) Start of section headers: 0 (bytes into file) Flags: 0x0 Size of this header: 52 (bytes) Size of program headers: 32 (bytes) Number of program headers: 1 Size of section headers: 0 (bytes) Number of section headers: 0 Section header string table index: 0 ... Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align LOAD 0x000000 0x08048100 0x00000000 0x00008 0x00008 R E 0 ... Cool! That looks good. My program takes up 8 bytes in memory TOTAL. It doesn't allocate ANY memory for a stack or data or anything, which is correct. Next morning (I fell asleep): Now for the moment of truth, does the program properly exit? $ ./exit bash: ./exit: Permission denied LOL. Yeah, it literally doesn't have execute permission: -rw-r--r-- 1 dave users 92 Dec 30 09:04 exit Weird. That's not the permissions I thought I was setting via the edx register for the sys 'write' call: mov edx, 555o ; mode (permissions) Well, I'll figure that out in a bit. Right now I just wanna see if I can run this thing. $ chmod +x exit $ ./exit Segmentation fault Oops, nope. Let's see what GDB says about this program. Looks like I have to break via explicit address since there's no debugging symbols... $ gdb exit ... (gdb) break *0x08048100 Breakpoint 1 at 0x8048100 (gdb) run ... Segmentation fault. Argh. Shouldn't it have halted at the first instruction? Hmm... I'm thinking maybe my program section doesn't have execution permissions or something, in which case it might die before it can even look at the first instruction? Anyway, now I know what I'm gonna start looking at next time. Next night: no, the flags (pretty sure they're R=read, E=execute) look right for a text/executable segment. And at any rate, as near as I can tell (and meow5 wouldn't work the way it does if it weren't true), Linux ignores the flags anyway! Instead, I had mis-typed the entry point address in the main header vs the program header. Now I've made them the same: $ readelf -a exit ... Entry point address: 0x8048100 ... Program Headers: Type Offset VirtAddr ... LOAD 0x000000 0x08048100 ... Kinda weird that there's a leading 0 on one, but not the other, right? But I don't see any harm per se. Also, the meow5 executable shows the same thing (though it executes starting in the second segment and I don't claim to entirely understand the program segment addressing yet, so I may well be missing something important. I need to read that chapter of the ELF document properly...) Anyway, does it work now? $ ./exit Segmentation fault Bah. Okay, let's see if I can figure out some stuff with GDB. (gdb) file exit Reading symbols from exit... (No debugging symbols found in exit) (gdb) info file Symbols from "/home/dave/meow5/exit". Hmmm. I thought 'info file' would at least show the entry point, but no luck there. (gdb) break *0x08048100 Breakpoint 1 at 0x8048100 (gdb) run Starting program: /home/dave/meow5/exit During startup program terminated with signal SIGSEGV, Segmentation fault. Another mystery. Well, my meow5 executable starts each LOAD segment at even 1000 byte marks - which I guess has something to do with page sizes? (Again, I need to read that ELF document chapter, and I will, but I just wanna see this working!) So I updated my addresses to 0x08048000 at an even 1000 (in hex). I double-checked them with 'readelf -hl exit', which I'll spare you from here. But running it: $ ./exit Segmentation fault Argh. I'll take a look with GDB: (gdb) file exit Reading symbols from exit... (No debugging symbols found in exit) (gdb) r Starting program: /home/dave/meow5/exit Program received signal SIGSEGV, Segmentation fault. 0x08048047 in ?? () Wait a second! That *is* progress. Now it's showing me the address of the crash. I wasn't getting that before. And it looks like it's crashing 47 bytes into memory (which is way larger than my exit code). So it could be that my program just isn't executing correctly... So I'll set a breakpoint at the entry point (with GBD's '*' address syntax) and see if I can figure out how to view what's running. (gdb) break *0x08048000 Breakpoint 1 at 0x8048000 (gdb) r The program being debugged has been started already. Start it from the beginning? (y or n) y Starting program: /home/dave/meow5/exit Breakpoint 1, 0x08048000 in ?? () Cool! I've finally paused the darn thing. (gdb) disass *0x08048000 No function contains specified address. I guess without symbols, 'disassemble' won't cooperate? Can I at least step? (gdb) s Cannot find bounds of current function Oh, right. I know this one. There's a separate 'stepi' to step through the program at the instruction level since there are no 'lines' to step through! (gdb) stepi 0x08048047 in ?? () Huh? Why am I now at that '...8047' address? Turns out there's an 'i' format that will display whatever memory you want as an instruction. So, after the fact, here's that first instruction we just ran: (gdb) x/i 0x08048000 0x8048000: jg 0x8048047 Ha ha, well, that certainly explains what's happening. But how did that get there? Here's the bytes of that machine code: (gdb) x/x 0x8048000 0x8048000: 0x464c457f Since it's so tiny, I'm just gonna hex dump exit entirely to see where that is: 00000000: 7f45 4c46 0101 0100 0000 0000 0000 0000 .ELF............ 00000010: 0200 0300 0100 0000 0080 0408 3400 0000 ............4... 00000020: 0000 0000 0000 0000 3400 2000 0100 0000 ........4. ..... 00000030: 0000 0000 0100 0000 0000 0000 0080 0408 ................ 00000040: 0000 0000 0800 0000 0800 0000 0500 0000 ................ 00000050: 0000 0000 5bb8 0100 0000 cd80 ....[....... Ha ha, I see it right away (though little-endian always makes it harder because the bytes are reversed). The memory we're trying to execute is the 'ELF' magic string from the header! Okay, apparently I really need to read that chapter about program segments and how they're loaded into memory now. But I gotta say, I really don't regret getting this wrong to begin with. Now I have a concrete example of what's happening and the information in that chapter is going to make *so* much more sense to me. Sometimes getting it right the first time "by the book" doesn't teach me nearly as much as getting it wrong on my own and *then* learning how to do it properly. It just sticks better. Some number of nights later: First of all, the file creation permissions here _were_ working. I've also updated them to 755: mov edx, 755o ; mode (permissions) Which shows up correctly: $ ls -l exit -rwxr-xr-x 1 dave users 92 Jan 3 22:01 exit And as for my executable trying to run the ELF header itself...ha ha, well, I did read Part 2: "Program Loading and Dynamic Linking" of the System V ELF spec and the answer was so simple, it was downright silly. When you specify that the ELF executable wants to load the file segment into (one of) the program's virtual memory segments (which is what my single "LOAD" type program header is requesting), it will load the ELF header itself, followed by whatever data (or machine code, in this case) follows the header. So you always need to account for the ELF header when determining the execution entry point address. In other words, where I was pointing to the very first byte of my requested virtual address: dd 0x08048000 ; entry - Execution start address I needed to offset it by the elf header size: dd elf_va + elf_size ; entry - execution start address Oh, right, and I also made a NASM macro to contain that address so I wouldn't have the bare value in multiple places: %assign elf_va 0x08048000 ; elf virt mem start address Okay, crossing my fingers and toes... $ mr make_elf exit prog bytes: 00000008 new fd: 00000003 Goodbye. Exit status: 0 $ ./exit $ Gasp! It worked! My executable exited cleanly! That can only happen if the exit syscall was called correctly. But a *real* test would be to call the exit syscall with a unique value so we can *see* it doing something. Do I dare hope? I'm going to try making a new word with a constant value and "calling" the 'exit' word and see if I can write that out as a new ELF executable: $ mr : foo 42 exit ; make_elf foo prog bytes: 0000000d new fd: 00000003 Goodbye. Exit status: 0 Indeed, that wrote a 97 byte ELF file containing 0xD (13) bytes of machine code: $ ls -l foo -rwxr-xr-x 1 dave users 97 Jan 3 22:25 foo But does it work?! Drum roll... $ ./foo $ echo $? 42 $ Ha ha! No way! It totally works. Initial ELF creation is a success! I think I'll figure out how to handle memory in my ELF output next. It would be amazing to be able to write a stand-alone executable that prints "Meow. Meow. Meow..." See you in the next log!