1 Hello! So it seems to me that there are two major paths
2 to decide between for the thing to add next:
3
4 * Control structures (if/else, loops)
5 * Write compiled programs to ELF executables
6
7 Both will be challenging. I'm leaning towards ELF at the
8 moment. We'll see what I think when I come back tomorrow
9 night.
10
11 Two nights later: Yup, gonna try to write an ELF
12 executable. This is gonna be cool!
13
14 First, I need to test writing a file. Then write the ELF
15 header, then the contents of a word.
16
17 I'll start by having 'make_elf' take a token (which I'll
18 use as the output filename for the executable and,
19 later, as the word to get the machine code from).
20
21 Then I'll write the string 'ELF' to that file. (Which is
22 very appropriate because bytes 2-4 of a _real_ ELF
23 header are that string.)
24
25 Next night: So I've got my test 'make_elf' and it is
26 supposed to be writing to whatever filename you want:
27
28 make_elf foo
29
30 That should write the string 'ELF' to a file called
31 'foo', but it's not. So I've inserted a DEBUG to see
32 what the fd returned from 'open' is:
33
34 $ mr
35 make_elf foo
36 new fd: fffffffe
37 Goodbye.
38 Exit status: 0
39
40 Yeah, that's definitely an error.
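
For reference, the raw value can also be decoded by hand. This
is a minimal Python sketch (not part of meow5), assuming the
usual Linux convention that a raw syscall returns -errno on
failure. The helper name decode_syscall_return is made up for
illustration.

```python
import errno
import os


def decode_syscall_return(raw: int) -> str:
    """Interpret a 32-bit register value as a raw Linux syscall result."""
    # Reinterpret the unsigned 32-bit value as a signed integer.
    signed = raw - (1 << 32) if raw & 0x80000000 else raw
    # Assumption: failures come back as -errno, in the range -4095..-1.
    if -4095 <= signed < 0:
        code = -signed
        name = errno.errorcode.get(code, "E?")
        return f"{name} ({os.strerror(code)})"
    return f"ok, value {signed}"


print(decode_syscall_return(0xFFFFFFFE))  # the "new fd" printed above
print(decode_syscall_return(0x00000003))  # what a successful open looks like
```

So fffffffe is -2, and errno 2 is ENOENT.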
41
42 While looking for how to decode that error (the open(2)
43 man page explains the errors, but they're all C mnemonic
44 constants, of course), I came across this excellent
45 suggestion on SO: https://stackoverflow.com/a/68155464
46
47 Which was to use strace to decode the error for me!
48
49 $ strace ./meow5
50 execve("./meow5", ["./meow5"], 0x7fff2d4ec190 /* 60 vars */) = 0
51 [ Process PID=2579 runs in 32 bit mode. ]
52 read(0, make_elf foo
53 "make_elf foo\n", 1024) = 13
54 open("foo", O_WRONLY|0xc) = -1 ENOENT (No such file or directory)
55 write(1, "new fd: ", 8new fd: ) = 8
56 write(1, "fffffffe\n", 9fffffffe
57 ) = 9
58 write(-2, "ELF", 3) = -1 EBADF (Bad file descriptor)
59 read(0, "", 1024) = 0
60 write(1, "Goodbye.\n", 9Goodbye.
61 ) = 9
62 exit(0) = ?
63 +++ exited with 0 +++
64
65 Huh, so something's wrong with my attempt to open the
66 output file with write-only, create, and truncate flags.
67
68 Here's what I'm sending:
69
70 ; From open(2) man page:
71 ; A call to creat() is equivalent to calling open()
72 ; with flags equal to O_CREAT|O_WRONLY|O_TRUNC.
73 ; I got the flags by searching all of /usr/include and
74 ; finding /usr/include/asm-generic/fcntl.h
75 ; That yielded (along with bizarre comment "not fcntl"):
76 ; #define O_CREAT 00000100
77 ; #define O_WRONLY 00000001
78 ; #define O_TRUNC 00001000
79 ; Hence this flag value for 'open':
80 mov ecx, 1101b
81
82 But from the strace above, it looks like it sees
83 O_WRONLY and...0xC - which is, indeed 1100...
84
85 Sounds like I've got a mystery for tomorrow night.
86
87 Two nights later: I bet somebody out there is
88 screaming. Ha ha. Those numbers are in octal, not binary
89 (despite looking for all the world like bit flags).
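
A quick way to see the difference is to compute both readings
of those constants. This is a small Python sketch; the three
values are copied from the header comments above and the
variable names are only for illustration.

```python
# In C, a literal with a leading 0 is octal, so these are octal values.
O_WRONLY = 0o1
O_CREAT = 0o100
O_TRUNC = 0o1000

intended = O_WRONLY | O_CREAT | O_TRUNC
misread = 0b1101  # what "mov ecx, 1101b" actually loads

print(f"intended flags: {intended:#x}")
print(f"binary misreading: {misread:#x}")
# Strip O_WRONLY from the misread value to get the leftover bits.
print(f"leftover bits: {misread & ~O_WRONLY:#x}")
# The misread value does not contain the O_CREAT bit at all.
print(f"has O_CREAT: {bool(misread & O_CREAT)}")
```

The leftover bits come out as 0xc, which matches the
"O_WRONLY|0xc" that strace printed, and without O_CREAT the
open fails with ENOENT when the file does not exist yet.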
90
91 So I fixed that one night. Then I had to learn how to
92 set the mode (permissions), which was, like, freakishly
93 hard to find online. All the 'open' examples I found
94 were opening existing files. But since CREAT is an
95 option, obviously there was a way to do it...
96
97 The search "32 x86 assembly linux syscall table" is the
98 blessed way to ask the major search engines.
99
100 The answer is: the mode bits (in the usual unix octal
101 owner/group/other format) go in register edx. So:
102
103 ; ebx contains null-terminated word name (see above)
104 mov ecx, (0100o | 0001o | 1000o) ; open flags
105 mov edx, 666o ; mode (permissions)
106 mov eax, SYS_OPEN
107 int 80h ; now eax will contain the new file desc.
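
The same three-argument call can be tried from a higher-level
language to check the flag and mode values before committing
them to assembly. This is a Python sketch, not meow5 code; the
function name write_elf_marker and the temporary directory are
assumptions made for the example.

```python
import os
import tempfile


def write_elf_marker(path: str) -> int:
    """Create (or truncate) path and write the 3-byte string 'ELF' to it."""
    # Same arguments as the assembly: path, flags, then mode.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o666)
    try:
        return os.write(fd, b"ELF")
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "foo")
    written = write_elf_marker(target)
    with open(target, "rb") as fh:
        contents = fh.read()

print(written, contents)
```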
108
109 And when I went to test it, I was sleepy and forgot that
110 since I was running the binary from strace, it wasn't
111 gonna re-build from source like my shell aliases 'mr',
112 'mb', 'mt' do, so I couldn't figure out why it wasn't
113 working...
114
115 ...until I woke up in the middle of the night with the
116 realization.
117
118 Anyway, next morning, here goes:
119
120 $ strace ./meow5
121 execve("./meow5", ["./meow5"], 0x7fff56d5ec40 /* 60 vars */) = 0
122 [ Process PID=1377 runs in 32 bit mode. ]
123 read(0, make_elf foo
124 "make_elf foo\n", 1024) = 13
125 open("foo", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
126 write(1, "new fd: ", 8new fd: ) = 8
127 write(1, "00000003\n", 900000003
128 ) = 9
129 write(3, "ELF", 3) = 3
130 read(0, "", 1024) = 0
131 write(1, "Goodbye.\n", 9Goodbye.
132 ) = 9
133 exit(0) = ?
134 +++ exited with 0 +++
135
136 Awesome, we can see the flags being correctly decoded
137 and the mode/permission param:
138
139 open("foo", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
140
141 So I've learned that strace rules for this sort of thing!
142
143 But did it work?
144
145 $ cat foo
146 ELF
147
148 Yahoo! Ha ha, I have written a string to a new file.
149 Jeez, that was way harder than I expected.
150
151 But now I can actually try writing an ELF header. I'm
152 excited.
153
154 -------------------------------------------------------
155
156 11 nights later: It's the holiday season, which is a lot
157 of exhausting activity (if you're a parent) under the
158 best of circumstances and this was an unusually hard one
159 for the family. So what I could easily have done in a
160 single night ended up stretching out for many nights.
161 But I finally finished the header portion in the .data
162 section and am writing it with the 'make_elf' word (I am
163 *not* writing the word's machine code yet).
164
165 Let's see what it does so far:
166
167 $ mr
168 make_elf exit
169 new fd: 00000003
170 Goodbye.
171 Exit status: 0
172
173 The "new fd" message is a DEBUG statement I apparently
174 left in there to make sure I was opening the file
175 correctly.
176
177 If I've done everything correctly, this will have
178 written a file named "exit" with a more-or-less correct
179 ELF header.
180
181 Let's see what 'file' thinks of it:
182
183 $ file exit
184 exit: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), can't read elf program headers at 184, no section header
185
186 Not bad! The program headers error might be due to a bug
187 in my headers or just the fact that I'm not writing the
188 program to the file yet.
189
190 Let's see what 'readelf' says:
191
192 $ readelf -a exit
193 ELF Header:
194 Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
195 Class: ELF32
196 Data: 2's complement, little endian
197 Version: 1 (current)
198 OS/ABI: UNIX - System V
199 ABI Version: 0
200 Type: EXEC (Executable file)
201 Machine: Intel 80386
202 Version: 0x1
203 Entry point address: 0x8048000
204 Start of program headers: 184 (bytes into file)
205 Start of section headers: 0 (bytes into file)
206 Flags: 0x0
207 Size of this header: 52 (bytes)
208 Size of program headers: 32 (bytes)
209 Number of program headers: 1
210 Size of section headers: 0 (bytes)
211 Number of section headers: 0
212 Section header string table index: 0
213
214 ...
215
216 readelf: exit: Error: Reading 32 bytes extends past end of
217 file for program headers
218
219 Yeah, so it looks like my program header offset might be
220 wrong. But otherwise, the decoding looks correct!
221
222 Next night: Okay, I don't see anything wrong with my
223 header data (program header offset), so I'm gonna try
224 just writing out a program (word) and see what
225 happens.
226
227 I'm overwriting the program size portion of the program
228 header in data and then writing the header, *then*
229 writing the actual program after that. Every time I call
230 'make_elf' my elf_header data will contain the last
231 word's size that was written.
232
233 Anyway, here goes:
234
235 $ mr
236 make_elf exit
237 prog bytes: 00000008
238 new fd: 00000003
239 Goodbye.
240
241 My 'exit' word is 8 bytes, that sounds right.
242
243 What does file say?
244
245 $ file exit
246 exit: ELF 32-bit LSB executable, Intel 80386, version 1
247 (SYSV), statically linked, no section header
248
249 Ooh! No more errors there!
250
251 And readelf?
252
253 $ readelf -a exit
254 ELF Header:
255 Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
256 Class: ELF32
257 Data: 2's complement, little endian
258 Version: 1 (current)
259 OS/ABI: UNIX - System V
260 ABI Version: 0
261 Type: EXEC (Executable file)
262 Machine: Intel 80386
263 Version: 0x1
264 Entry point address: 0x8048000
265 Start of program headers: 52 (bytes into file)
266 Start of section headers: 0 (bytes into file)
267 Flags: 0x0
268 Size of this header: 52 (bytes)
269 Size of program headers: 32 (bytes)
270 Number of program headers: 1
271 Size of section headers: 0 (bytes)
272 Number of section headers: 0
273 Section header string table index: 0
274
275 ...
276
277 Program Headers:
278 Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
279 LOAD 0x000000 0x08048100 0x00000000 0x00008 0x00008 R E 0
280
281 ...
282
283 Cool! That looks good. My program takes up 8 bytes in
284 memory TOTAL. It doesn't allocate ANY memory for a stack
285 or data or anything, which is correct.
286
287 Next morning (I fell asleep): Now for the moment of
288 truth, does the program properly exit?
289
290 $ ./exit
291 bash: ./exit: Permission denied
292
293 LOL. Yeah, it literally doesn't have execute permission:
294
295 -rw-r--r-- 1 dave users 92 Dec 30 09:04 exit
296
297 Weird. That's not the permissions I thought I was
298 setting via the edx register for the sys 'open' call:
299
300 mov edx, 666o ; mode (permissions)
301
302 Well, I'll figure that out in a bit. Right now I just
303 wanna see if I can run this thing.
304
305 $ chmod +x exit
306 $ ./exit
307 Segmentation fault
308
309 Oops, nope. Let's see what GDB says about this
310 program.
311
312 Looks like I have to break via explicit address since
313 there's no debugging symbols...
314
315 $ gdb exit
316 ...
317 (gdb) break *0x08048100
318 Breakpoint 1 at 0x8048100
319 (gdb) run
320 ...
321 Segmentation fault.
322
323 Argh. Shouldn't it have halted at the first instruction?
324 Hmm...
325
326 I'm thinking maybe my program section doesn't have
327 execution permissions or something, in which case it
328 might die before it can even look at the first
329 instruction?
330
331 Anyway, now I know what I'm gonna start looking at next
332 time.
333
334 Next night: no, the flags (pretty sure they're R=read,
335 E=execute) look right for a text/executable segment. And
336 at any rate, as near as I can tell (and meow5 wouldn't
337 work the way it does if it weren't true), Linux ignores
338 the flags anyway!
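
For what it's worth, the flag letters map onto three bits in
the program header's p_flags field. This is a small Python
sketch of that mapping, assuming the raw value in this header
is 5 (which is what "R E" corresponds to). The function name
flag_letters is made up for illustration.

```python
# Segment permission bits as defined in the ELF specification.
PF_X, PF_W, PF_R = 0x1, 0x2, 0x4


def flag_letters(p_flags: int) -> str:
    """Render p_flags as three columns (R, W, E), blank where a bit is clear."""
    return "".join(
        letter if p_flags & bit else " "
        for bit, letter in ((PF_R, "R"), (PF_W, "W"), (PF_X, "E"))
    )


print(repr(flag_letters(5)))  # readable + executable
print(repr(flag_letters(6)))  # readable + writable
```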
339
340 Instead, I had mis-typed the entry point address in the
341 main header vs the program header. Now I've made them
342 the same:
343
344 $ readelf -a exit
345 ...
346 Entry point address: 0x8048100
347 ...
348 Program Headers:
349 Type Offset VirtAddr ...
350 LOAD 0x000000 0x08048100 ...
351
352 Kinda weird that there's a leading 0 on one, but not the
353 other, right? But I don't see any harm per se. Also, the
354 meow5 executable shows the same thing (though it
355 executes starting in the second segment and I don't
356 claim to entirely understand the program segment
357 addressing yet, so I may well be missing something
358 important. I need to read that chapter of the ELF
359 document properly...)
360
361 Anyway, does it work now?
362
363 $ ./exit
364 Segmentation fault
365
366 Bah.
367
368 Okay, let's see if I can figure out some stuff with GDB.
369
370 (gdb) file exit
371 Reading symbols from exit...
372 (No debugging symbols found in exit)
373 (gdb) info file
374 Symbols from "/home/dave/meow5/exit".
375
376 Hmmm. I thought 'info file' would at least show the
377 entry point, but no luck there.
378
379 (gdb) break *0x08048100
380 Breakpoint 1 at 0x8048100
381 (gdb) run
382 Starting program: /home/dave/meow5/exit
383 During startup program terminated with signal SIGSEGV,
384 Segmentation fault.
385
386 Another mystery. Well, my meow5 executable starts each
387 LOAD segment at even 0x1000 byte marks - which I guess has
388 something to do with page sizes? (Again, I need to read
389 that ELF document chapter, and I will, but I just wanna
390 see this working!)
391
392 So I updated my addresses to 0x08048000 at an even 1000
393 (in hex). I double-checked them with 'readelf -hl exit',
394 which I'll spare you from here.
395
396 But running it:
397
398 $ ./exit
399 Segmentation fault
400
401 Argh.
402
403 I'll take a look with GDB:
404
405 (gdb) file exit
406 Reading symbols from exit...
407 (No debugging symbols found in exit)
408 (gdb) r
409 Starting program: /home/dave/meow5/exit
410
411 Program received signal SIGSEGV, Segmentation fault.
412 0x08048047 in ?? ()
413
414 Wait a second! That *is* progress. Now it's showing me
415 the address of the crash. I wasn't getting that before.
416 And it looks like it's crashing 0x47 (71) bytes into
417 memory (which is way past the end of my exit code). So
418 it could be that my program just isn't executing correctly...
419
420 So I'll set a breakpoint at the entry point (with GDB's
421 '*' address syntax) and see if I can figure out how to
422 view what's running.
423
424 (gdb) break *0x08048000
425 Breakpoint 1 at 0x8048000
426 (gdb) r
427 The program being debugged has been started already.
428 Start it from the beginning? (y or n) y
429 Starting program: /home/dave/meow5/exit
430
431 Breakpoint 1, 0x08048000 in ?? ()
432
433 Cool! I've finally paused the darn thing.
434
435 (gdb) disass *0x08048000
436 No function contains specified address.
437
438 I guess without symbols, 'disassemble' won't cooperate?
439 Can I at least step?
440
441 (gdb) s
442 Cannot find bounds of current function
443
444 Oh, right. I know this one. There's a separate 'stepi'
445 to step through the program at the instruction level
446 since there are no 'lines' to step through!
447
448 (gdb) stepi
449 0x08048047 in ?? ()
450
451 Huh? Why am I now at that '...8047' address?
452
453 Turns out there's an 'i' format that will display
454 whatever memory you want as an instruction. So, after
455 the fact, here's that first instruction we just ran:
456
457 (gdb) x/i 0x08048000
458 0x8048000: jg 0x8048047
459
460 Ha ha, well, that certainly explains what's happening.
461 But how did that get there? Here's the bytes of that
462 machine code:
463
464 (gdb) x/x 0x8048000
465 0x8048000: 0x464c457f
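
To connect that 32-bit value with the 'jg' that GDB showed,
here is a short Python sketch that unpacks the word into its
in-memory byte order and works out the jump target. It assumes
the standard x86 encoding where opcode 0x7F is the two-byte
short jump "JG rel8". The variable names are only for
illustration.

```python
import struct

word = 0x464C457F   # value printed by "x/x 0x8048000"
addr = 0x08048000   # address of the instruction

# Little-endian: the lowest byte of the word comes first in memory.
raw = struct.pack("<I", word)
opcode = raw[0]
rel8 = struct.unpack_from("<b", raw, 1)[0]  # signed 8-bit displacement

# A short jump is 2 bytes long and its displacement is relative
# to the address of the following instruction.
target = addr + 2 + rel8

print(raw)
print(f"opcode {opcode:#x}, displacement {rel8:#x}, target {target:#x}")
```

So the 0x7f byte is being run as a conditional jump, and the
'E' (0x45) after it is the jump distance.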
466
467 Since it's so tiny, I'm just gonna hex dump exit
468 entirely to see where that is:
469
470 00000000: 7f45 4c46 0101 0100 0000 0000 0000 0000 .ELF............
471 00000010: 0200 0300 0100 0000 0080 0408 3400 0000 ............4...
472 00000020: 0000 0000 0000 0000 3400 2000 0100 0000 ........4. .....
473 00000030: 0000 0000 0100 0000 0000 0000 0080 0408 ................
474 00000040: 0000 0000 0800 0000 0800 0000 0500 0000 ................
475 00000050: 0000 0000 5bb8 0100 0000 cd80 ....[.......
476
477 Ha ha, I see it right away (though little-endian always
478 makes it harder because the bytes are reversed).
479
480 The memory we're trying to execute is the 'ELF' magic
481 string from the header!
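
The same conclusion can be reached by decoding the dump
mechanically. This is a Python sketch that unpacks the ELF32
file header and the single program header from the 92 bytes
above, then maps the entry address back to a file offset. The
struct formats follow the ELF32 layout; the variable names are
assumptions for the example.

```python
import struct

# The 92 bytes from the hex dump above.
DUMP = bytes.fromhex(
    "7f45 4c46 0101 0100 0000 0000 0000 0000"
    "0200 0300 0100 0000 0080 0408 3400 0000"
    "0000 0000 0000 0000 3400 2000 0100 0000"
    "0000 0000 0100 0000 0000 0000 0080 0408"
    "0000 0000 0800 0000 0800 0000 0500 0000"
    "0000 0000 5bb8 0100 0000 cd80"
)

# ELF32 file header: 16-byte ident, then 13 little-endian fields.
ehdr = struct.unpack_from("<16sHHIIIIIHHHHHH", DUMP, 0)
e_entry, e_phoff = ehdr[4], ehdr[5]
e_phentsize, e_phnum = ehdr[9], ehdr[10]

# ELF32 program header: eight 32-bit fields.
(p_type, p_offset, p_vaddr, p_paddr,
 p_filesz, p_memsz, p_flags, p_align) = struct.unpack_from("<8I", DUMP, e_phoff)

# Where in the file does the entry address point?
entry_file_offset = e_entry - p_vaddr + p_offset
# Where does the machine code actually start?
code_file_offset = e_phoff + e_phnum * e_phentsize

print(f"entry {e_entry:#x} maps to file offset {entry_file_offset}")
print(f"bytes there: {DUMP[entry_file_offset:entry_file_offset + 4]!r}")
print(f"machine code starts at file offset {code_file_offset}")
```

The entry address maps to file offset 0 (the magic bytes),
while the machine code sits 84 bytes in, after both headers.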
482
483 Okay, apparently I really need to read that chapter
484 about program segments and how they're loaded into
485 memory now.
486
487 But I gotta say, I really don't regret getting this
488 wrong to begin with. Now I have a concrete example of
489 what's happening and the information in that chapter is
490 going to make *so* much more sense to me. Sometimes
491 getting it right the first time "by the book" doesn't
492 teach me nearly as much as getting it wrong on my own
493 and *then* learning how to do it properly. It just
494 sticks better.
495
496 Some number of nights later: First of all, the file
497 creation permissions here _were_ working. I've also
498 updated them to 755:
499
500 mov edx, 755o ; mode (permissions)
501
502 Which shows up correctly:
503
504 $ ls -l exit
505 -rwxr-xr-x 1 dave users 92 Jan 3 22:01 exit
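
The earlier listing makes sense once the process umask is taken
into account. This is a Python sketch of that arithmetic,
assuming a umask of 022 (a common default, but an assumption
here since the log never shows it). The function name
created_perms is made up for illustration.

```python
import stat


def created_perms(mode: int, umask: int = 0o022) -> str:
    """Return the ls-style permission string for a newly created file."""
    # The kernel clears any bits that are set in the umask.
    effective = mode & ~umask
    return stat.filemode(stat.S_IFREG | effective)


print(created_perms(0o666))  # the mode from the first attempt
print(created_perms(0o755))  # the updated mode
```

Under that assumption, a requested mode of 666 yields
-rw-r--r-- and 755 yields -rwxr-xr-x, matching both listings.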
506
507 And as for my executable trying to run the ELF header
508 itself...ha ha, well, I did read Part 2: "Program
509 Loading and Dynamic Linking" of the System V ELF spec
510 and the answer was so simple, it was downright silly.
511
512 When you specify that the ELF executable wants to load
513 the file segment into (one of) the program's virtual
514 memory segments (which is what my single "LOAD" type
515 program header is requesting), it will load the ELF
516 header itself, followed by whatever data (or machine
517 code, in this case) follows the header.
518
519 So when a LOAD segment starts at file offset 0 (like
520 mine), the entry point address has to skip the headers.
521
522 In other words, where I was pointing to the very first
523 byte of my requested virtual address:
524
525 dd 0x08048000 ; entry - Execution start address
526
527 I needed to offset it by the elf header size:
528
529 dd elf_va + elf_size ; entry - execution start address
530
531 Oh, right, and I also made a NASM macro to contain that
532 address so I wouldn't have the bare value in multiple
533 places:
534
535 %assign elf_va 0x08048000 ; elf virt mem start address
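
To make the offset concrete, here is a Python sketch that
builds the same kind of image (file header, one LOAD program
header, then code) and computes the entry point from the header
sizes. It is not how meow5 does it. It assumes elf_size covers
both headers (52 + 32 = 84 bytes), which is what the 92-byte
file with 8 bytes of code implies. The names ELF_VA and
build_elf are made up for the example.

```python
import struct

ELF_VA = 0x08048000                 # same value as elf_va
EHDR_FMT = "<16sHHIIIIIHHHHHH"      # ELF32 file header, 52 bytes
PHDR_FMT = "<8I"                    # ELF32 program header, 32 bytes


def build_elf(code: bytes) -> bytes:
    """Return a tiny ELF32 image: file header, one LOAD header, then code."""
    headers_size = struct.calcsize(EHDR_FMT) + struct.calcsize(PHDR_FMT)
    entry = ELF_VA + headers_size   # skip the headers, which get mapped too
    ident = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9)
    ehdr = struct.pack(
        EHDR_FMT, ident,
        2,           # e_type: EXEC
        3,           # e_machine: Intel 80386
        1,           # e_version
        entry,       # e_entry
        52,          # e_phoff: program header follows the file header
        0, 0,        # e_shoff, e_flags
        52, 32, 1,   # e_ehsize, e_phentsize, e_phnum
        0, 0, 0,     # e_shentsize, e_shnum, e_shstrndx
    )
    phdr = struct.pack(
        PHDR_FMT,
        1,           # p_type: LOAD
        0,           # p_offset: map the file from its first byte
        ELF_VA,      # p_vaddr
        0,           # p_paddr
        len(code),   # p_filesz (mirrors the value readelf showed)
        len(code),   # p_memsz
        5,           # p_flags: R + E
        0,           # p_align
    )
    return ehdr + phdr + code


exit_code = bytes.fromhex("5bb801000000cd80")  # the 8 bytes from the dump
image = build_elf(exit_code)
entry = struct.unpack_from("<I", image, 24)[0]
print(f"{len(image)} bytes, entry point {entry:#x}")
```

With those sizes the entry point lands on 0x8048054, the first
byte of the machine code rather than the magic string.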
536
537 Okay, crossing my fingers and toes...
538
539 $ mr
540 make_elf exit
541 prog bytes: 00000008
542 new fd: 00000003
543 Goodbye.
544 Exit status: 0
545 $ ./exit
546 $
547
548 Gasp! It worked! My executable exited cleanly! That can
549 only happen if the exit syscall was called correctly.
550
551 But a *real* test would be to call the exit syscall with
552 a unique value so we can *see* it doing something.
553
554 Do I dare hope? I'm going to try making a new word with
555 a constant value and "calling" the 'exit' word and see
556 if I can write that out as a new ELF executable:
557
558 $ mr
559 : foo 42 exit ;
560 make_elf foo
561 prog bytes: 0000000d
562 new fd: 00000003
563 Goodbye.
564 Exit status: 0
565
566 Indeed, that wrote a 97 byte ELF file containing 0xD
567 (13) bytes of machine code:
568
569 $ ls -l foo
570 -rwxr-xr-x 1 dave users 97 Jan 3 22:25 foo
571
572 But does it work?!
573
574 Drum roll...
575
576 $ ./foo
577 $ echo $?
578 42
579 $
580
581 Ha ha! No way!
582
583 It totally works.
584
585 Initial ELF creation is a success!
586
587 I think I'll figure out how to handle memory in my ELF
588 output next. It would be amazing to be able to write a
589 stand-alone executable that prints "Meow. Meow. Meow..."
590
591 See you in the next log!