Dave's nasmjf Dev Log 06
Created: 2022-07-22
This is an entry in my developer’s log series written between December 2021 and August 2022 (started project in September). I wrote these as I completed my port of the JONESFORTH assembly language Forth interpreter.
Exciting stuff! There are now enough "code words" defined in machine code to create the COLON (":") word as a pure Forth definition of other words. Let's see if it works. Reading symbols from nasmjf... Wait, where is code_COLON? (gdb) break code_C code_CHAR code_COMMA code_CREATE Oh, ha ha. Right. No such thing. COLON is defined entirely in a data segment. There is no machine code portion. Well, let's break right before it gets called, then. INTERPRET.execute is the point at which we've matcched the user input with a word in the dictionary and we hand control over to it via the pointer right after the "header" portion of the word definition. I'll type a new definition using ':' to make a word called "five" that pushes 5 on the stack: (gdb) break code_INTERPRET.execute Breakpoint 3 at 0x8049096: file nasmjf.asm, line 254. (gdb) c Continuing. : five 5 ; Breakpoint 3, code_INTERPRET.execute () at nasmjf.asm:254 254 mov ecx,[interpret_is_lit] ; Literal? 255 test ecx,ecx ; Literal? 256 jnz .do_literal 260 jmp [eax] So we should be about to jump to DOCOL, which should be the machine code COLON points to in the "interpreter" pointer at the beginning of the word definition. (This is confusing because the "interpreter" of a word is not the same as the interpreter INTERPRET that just took our typed input... Anyway, let's take a look at eax. Looks like it points to COLON + 1. Wait! Shouldn't it just be COLON? (gdb) p /x $eax $1 = 0x804a11d (gdb) info symbol $eax COLON + 1 in section .data of /home/dave/nasmjf/nasmjf And what's at that address (* treats the value as a pointer to memory)? It's...uh, not quite the value I was expecting (an address in the 0x804000 range). Yeah, the pointer there is no good. (gdb) p /x *$eax $2 = 0x64080490 (gdb) info symbol *$eax No symbol matches *$eax. Where is DOCOL? (gdb) info address DOCOL Symbol "DOCOL" is at 0x8049000 in a file compiled without debugging. The next night: Oh, now I see it! The COLON + 1 was, indeed, the problem. Check it out, the pointer in eax is shifted 1 byte off from the correct DOCOL address: 0x8049000 <--- DOCOL 0x64080490 <--- eax And sure enough, letting it run causes a segfault: Program received signal SIGSEGV, Segmentation fault. 0x64080490 in ?? () So where is it going wrong? Register al contains just 00000001 (/t means binary formatting, of COURSE). (gdb) p/t $al $1 = 1 392 and al,F_LENMASK ; Just the length, not the flags. We can't examine F_LENMASK in GDB because it was a NASM constant. But we can see what it was with a disassembly: 0x1f (gdb) disass Dump of assembler code for function _TCFA: 0x0804918d <+0>: xor eax,eax 0x0804918f <+2>: add edi,0x4 0x08049192 <+5>: mov al,BYTE PTR [edi] 0x08049194 <+7>: inc edi => 0x08049195 <+8>: and al,0x1f 0x08049197 <+10>: add edi,eax 0x08049199 <+12>: add edi,0x3 0x0804919c <+15>: and edi,0xfffffffd 0x0804919f <+18>: ret End of assembler dump. Which is 00011111 in binary - so it masks off all but the last five bits from al. This currently has no effect (no flags were set on COLON) and the name ':' is, indeed, one characer long. (gdb) p/t 0x1f $2 = 11111 So after this, edi should contain the address of the pointer stored after the name. 393 add edi,eax ; Skip the name. (gdb) p/x $eax $3 = 0x1 (gdb) p/x $edi $4 = 0x804a119 Ah, but first we have to make sure we're pointed at the pointer stored after the name AND aligned to the next 4 bytes. Apparently, adding 3 and masking with -3 does the trick. How does this work? So aligning on 4 bytes means that the last two bits of the address have to be 0. And to get to the next four bytes, we would always need to advance to the NEXT 4 byte-aligned addr, so we can't just mask off the last two digits. All three of these addreses need to advance to the same next 4 byte-aligned address: 00001001 --> 00001100 00001010 --> 00001100 00001011 --> 00001100 Adding 3 (11) to each of these would produce: 00001100 00001101 00001110 respectively. So that advances the 4's place bit as needed, now we just need to mask off the last two digits and we're set. (Also, adding 3 (11) to an already-aligned address will do no harm since it wouldn't advance the 4's place bit: 1000 + 11 = 1011) So what I don't understand is why we're masking with -3, which is this value when stored with two's complement: 0x0804919c <+15>: and edi,0xfffffffd which is ...1111111101 because you invert and add one to make a number negative. This seems like a mistake (and exactly the off-by-one mistake we've got here). To mask off the last two digits, don't we want -4 instead? 00000100 2 11111011 invert digits 11111100 add one Anyway, let's examine the actual values... 394 add edi,3 ; The codeword is 4-byte aligned. (gdb) p/x $edi $5 = 0x804a11a (gdb) p/t $edi $7 = 1000000001001010000100011101 395 and edi,-3 (gdb) p/x $edi $8 = 0x804a11d (gdb) p/t $edi $9 = 1000000001001010000100011101 (gdb) info symbol $edi COLON + 1 in section .data of /home/dave/nasmjf/nasmjf Now the off-by-one makes plenty of sense. I'll try a -4 now, but why... Argh! I just looked at the jonesforth source again. It's not -3, it's ~3! Which is unary NOT 3 (11111100). Bah! Of course it is. Here's the original GAS line: andl $~3,%edi NASM uses ~ for unary not as well. I bet it'll work now. (gdb) break _TCFA Breakpoint 2 at 0x804918d: file nasmjf.asm, line 388. (gdb) c Continuing. : FIVE 5 ; Breakpoint 2, _TCFA () at nasmjf.asm:388 388 xor eax,eax 389 add edi,4 ; Skip link pointer. 390 mov al,[edi] ; Load flags+len into %al. 391 inc edi ; Skip flags+len byte. 392 and al,F_LENMASK ; Just the length, not the flags. 393 add edi,eax ; Skip the name. Let's check this each step of the way. edi points to the name (header) portion of COLON. It ends in a 2 (10) so we'll need to advance it to the next 4-byte alignment where the COLON code begins. (gdb) info symbol $edi name_COLON + 6 in section .data of /home/dave/nasmjf/nasmjf (gdb) p/t $edi $2 = 1000000001001010000100011010 Now the 4's place is incremented. But the address ends in 1. 394 add edi,3 ; The codeword is 4-byte aligned: (gdb) p/t $edi $3 = 1000000001001010000100011101 Finally, we mask with NOT 3. Now edi is aligned and points to the code definition! 395 and edi,~3 ; Add ...00000011 and mask ...11111100. 396 ret ; For more, see log06.txt in this repo. (gdb) p/t $edi $4 = 1000000001001010000100011100 (gdb) info symbol $edi COLON in section .data of /home/dave/nasmjf/nasmjf We'll skip some stuff and take a look at what INTERPRET.execute now does with these results. 260 jmp [eax] (gdb) info symbol $eax COLON in section .data of /home/dave/nasmjf/nasmjf (gdb) info symbol *$eax DOCOL in section .text of /home/dave/nasmjf/nasmjf Excellent! The address at our word's definition contains another address. This one is for the DOCOL word, which starts the chain reaction that executes the rest of the words in the definition of COLON. So it turned out that the alignment bug had just been waiting to crop up. I still get a segfault after this point, so the debugging will continue in log07.txt.