
nasmjf

A NASM assembler port of JONESFORTH
git clone http://ratfactor.com/repos/nasmjf/nasmjf.git

nasmjf/devlog/log02.txt


Tonight, we'll see how much of the FIND word works. FIND looks up
words in the "dictionary" of defined Forth words by walking a linked list.
The interpreter uses it to look up the addresses of the word
implementations so it can "compile" them into new word definitions.

    Reading symbols from nasmjf...
    Breakpoint 1 at 0x804900e: file nasmjf.asm, line 80.

Now that I'm using GNU Screen with windows for Vim and GDB that
will close if either application exits, I need to reload the
file in GDB when I make changes (previously I was just restarting
GDB).

    (gdb) file nasmjf
    Reading symbols from nasmjf...

No need to step through everything up to this point because I
already know it works: WORD collects a word entered through STDIN.

So I break when we enter the implementation of FIND and then
continue (run the program). The "foo" below is where nasmjf is
asking for input and I type "foo" and hit enter.

    (gdb) break _FIND
    Breakpoint 2 at 0x804920b: file nasmjf.asm, line 485.
    (gdb) c
    Continuing.
    foo

Our breakpoint triggers. Now we're in FIND. It checks if we've
run out of entries.

    Breakpoint 2, _FIND () at nasmjf.asm:485
    485         push esi                ; _FIND! Save esi, we'll use this reg for string comparison
    488         mov edx,[var_LATEST]    ; LATEST points to name header of the latest word in the dictionary
    _FIND.test_word () at nasmjf.asm:490
    490         test edx,edx            ; NULL pointer?  (end of the linked list)
    491         je .not_found

And then I think this is clever: instead of immediately
checking if the name strings match, it checks the precalculated
and stored length of the name first. Comparing one byte is much
cheaper than comparing the strings.

    496         xor eax,eax
    497         mov al, [edx+4]         ; al = flags+length field
    498         and al,(F_HIDDEN|F_LENMASK) ; al = name length
    499         cmp cl,al               ; Length is the same?
    500         jne .prev_word          ; nope, try prev

And that's what happens here: the length doesn't match, so
we move to the previous word in the linked list. And it
starts over at .test_word...

    _FIND.prev_word () at nasmjf.asm:517
    517         mov edx,[edx]           ; Move back through the link field to the previous word
    518         jmp .test_word          ; loop, test prev word
    _FIND.test_word () at nasmjf.asm:490
    490         test edx,edx            ; NULL pointer?  (end of the linked list)

So I set a new breakpoint back in INTERPRET right after FIND
returns to see how a "not found" condition is handled.

    (gdb) break 215
    Breakpoint 3 at 0x8049043: file nasmjf.asm, line 215.
    (gdb) c
    Continuing.
    Breakpoint 3, code_INTERPRET () at nasmjf.asm:215
    215         test eax,eax            ; Found?

If FIND fails, INTERPRET checks if the input is a numeric literal.

    216         jz .try_literal
    code_INTERPRET.try_literal () at nasmjf.asm:230
    230         inc byte [interpret_is_lit] ; DID NOT MATCH a word, trying literal number
    231         call _NUMBER            ; Returns the parsed number in %eax, %ecx > 0 if error
    _NUMBER () at nasmjf.asm:407
    407         xor eax,eax
    408         xor ebx,ebx
    410         test ecx,ecx            ; trying to parse a zero-length string is an error, but returns 0
    411         jz .return

It's neat how Forth supports numeric input in the base
of your choice without any extra syntax. Just set BASE.

    413         mov edx, [var_BASE]     ; get BASE (in dl)
    416         mov bl,[edi]            ; bl = first character in string
    417         inc edi
    418         push eax                ; push 0 on stack
    _NUMBER () at nasmjf.asm:419
    419         cmp bl,'-'              ; negative number?
    420         jnz .convert_char
    _NUMBER.convert_char () at nasmjf.asm:435
    435         sub bl,'0'              ; < '0'?
    436         jb .negate
    437         cmp bl,10               ; <= '9'?
    438         jb .compare_base
    439         sub bl,17               ; < 'A'? (17 is 'A'-'0')
    440         jb .negate
    441         add bl,10
    _NUMBER.compare_base () at nasmjf.asm:444
    444         cmp bl,dl               ; >= BASE?
    445         jge .negate
    _NUMBER.negate () at nasmjf.asm:453
    453         pop ebx
    _NUMBER.negate () at nasmjf.asm:454
    454         test ebx,ebx
    455         jz .return
    _NUMBER.return () at nasmjf.asm:459
    459         ret

Coming back from NUMBER, a value > 0 in ecx indicates an error
in trying to parse a numeric value.

    code_INTERPRET.try_literal () at nasmjf.asm:232
    232         test ecx,ecx
    233         jnz .parse_error

And sure enough, "foo" is not a valid base-ten (the default)
value, so we jump to the parse_error section. This should
print an error message.

    code_INTERPRET.parse_error () at nasmjf.asm:267
    267         mov ebx,2               ; 1st param: stderr
    268         mov ecx,errmsg          ; 2nd param: error message
    269         mov edx,(errmsgend - errmsg) ; 3rd param: length of string
    270         mov eax,[__NR_write]    ; write syscall

But oops! Looks like I've got an error.

    Program received signal SIGSEGV, Segmentation fault.
    code_INTERPRET.parse_error () at nasmjf.asm:270
    270         mov eax,[__NR_write]    ; write syscall

The next evening, I load it up again to see what's going on...

    Reading symbols from nasmjf...
    (gdb) break code_INTERPRET.parse_error
    Breakpoint 2 at 0x80490a6: file nasmjf.asm, line 267.
    (gdb) cont
    Continuing.
    foo

    Breakpoint 2, code_INTERPRET.parse_error () at nasmjf.asm:267
    267         mov ebx,2               ; 1st param: stderr
    268         mov ecx,errmsg          ; 2nd param: error message
    269         mov edx,(errmsgend - errmsg) ; 3rd param: length of string

First I try to print the value at errmsg as a string. It
should be the string "PARSE ERROR: ".

    (gdb) x/s $ecx
    0x804a315 <errmsg>:     ""

Weird. Let's look at the first 4 bytes:

    (gdb) x/4x $ecx
    0x804a315 <errmsg>:     0x00    0x00    0x00    0x53

Weird! Looking at stuff...

    (gdb) info addr errmsg
    Symbol "errmsg" is at 0x804a315 in a file compiled without debugging.
    (gdb) info addr errmsgend
    Symbol "errmsgend" is at 0x804a322 in a file compiled without debugging.
    (gdb) x/10c $ecx
    0x804a315 <errmsg>:     0 '\000' 0 '\000' 0 '\000' 83 'S'  69 'E'  32 ' '  69 'E'  82 'R'
    0x804a31d:      82 'R'  79 'O'

Huh, so I've basically got "---SE ERROR: " (where '-' is NUL). Something
is happening to the first three bytes of my string. Or is this some
alignment issue? I'll see... To be continued.
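
Aside: the FIND walk traced earlier in this log can be sketched in a higher-level language. This is only an illustration, not the nasmjf code. The Entry class, make_entry, and find are hypothetical names, and the flag values (F_HIDDEN = 0x20, F_LENMASK = 0x1f) are assumed from the original JONESFORTH source. It shows the length-first check, and why masking with F_HIDDEN|F_LENMASK also makes hidden words fail the length comparison.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed values, taken from the original JONESFORTH source.
F_HIDDEN = 0x20
F_LENMASK = 0x1F


@dataclass
class Entry:
    """Hypothetical stand-in for one dictionary header."""
    link: Optional["Entry"]  # previous word, None at the end of the list
    flags_len: int           # flags OR'd with the name length (one byte)
    name: bytes


def make_entry(link, name, flags=0):
    # Store the length next to the flags, like the flags+length byte.
    return Entry(link, flags | len(name), name)


def find(latest, name):
    """Walk the list from the newest word back; return the Entry or None."""
    entry = latest
    while entry is not None:  # test edx,edx / je .not_found
        # Cheap check first: compare the stored length (cmp cl,al).
        # A hidden word keeps its F_HIDDEN bit after masking, so its
        # masked value can never equal a real name length.
        if (entry.flags_len & (F_HIDDEN | F_LENMASK)) == len(name):
            if entry.name == name:  # full comparison only on a length match
                return entry
        entry = entry.link  # mov edx,[edx]
    return None


# A tiny example dictionary: DROP <- SWAP <- DUP (hidden) <- FIND
drop = make_entry(None, b"DROP")
swap = make_entry(drop, b"SWAP")
hidden_dup = make_entry(swap, b"DUP", flags=F_HIDDEN)
latest = make_entry(hidden_dup, b"FIND")
```

With this setup, find(latest, b"SWAP") returns the swap entry, while find(latest, b"foo") and find(latest, b"DUP") both return None: the first because no entry has that name, the second because the entry is hidden.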
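
Aside: the digit conversion in the NUMBER trace earlier in this log can be sketched as follows. This is an illustration under assumptions, not the nasmjf code. The function name number and its return shape are hypothetical, and the parts of the routine the trace never reaches (the digit loop and the lone "-" case) follow my reading of the original JONESFORTH NUMBER. It returns the parsed value and the count of unparsed characters, so a count greater than zero means a parse error.

```python
def number(text: bytes, base: int = 10):
    """Return (value, unparsed). unparsed > 0 signals a parse error."""
    value = 0
    remaining = len(text)
    if remaining == 0:
        return 0, 0  # zero-length string: returns 0 with nothing unparsed

    pos = 0
    negative = False
    if text[0] == ord("-"):  # cmp bl,'-'
        negative = True
        pos = 1
        remaining -= 1
        if remaining == 0:
            return 0, 1  # a lone "-" is an error

    while remaining > 0:
        digit = text[pos] - ord("0")  # sub bl,'0'
        if digit < 0:                 # below '0'
            break
        if digit >= 10:               # not '0'..'9', try letters
            digit -= 17               # 17 is 'A' - '0'
            if digit < 0:             # between '9' and 'A'
                break
            digit += 10               # 'A' becomes 10, 'B' becomes 11, ...
        if digit >= base:             # cmp bl,dl / jge .negate
            break
        value = value * base + digit
        pos += 1
        remaining -= 1

    if negative:
        value = -value
    return value, remaining
```

For the "foo" typed in this session, 'f' maps to 47, which is not below BASE (10), so number(b"foo") stops at the first character and reports three unparsed characters. Only uppercase letters work as digits in this scheme: number(b"FF", 16) parses, but number(b"ff", 16) does not.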