1 Tonight, we'll see how much of the FIND word works. FIND looks up
2 words in the "dictionary" of defined Forth words, which is a linked list.
3 The interpreter uses it to look up the addresses of word
4 implementations so it can "compile" them into new word definitions.
5
6 Reading symbols from nasmjf...
7 Breakpoint 1 at 0x804900e: file nasmjf.asm, line 80.
8
9 Now that I'm using GNU Screen with windows for Vim and GDB that
10 will close if either application exits, I need to reload the
11 file in GDB when I make changes (previously I was just restarting
12 GDB).
13
14 (gdb) file nasmjf
15 Reading symbols from nasmjf...
16
17 No need to step through everything until this point because I
18 already know it works: WORD collects a word entered through STDIN.
19
20 So I break when we enter the implementation for FIND and then
21 continue (run the program). The "foo" below is where nasmjf is
22 asking for input and I type "foo" and hit enter.
23
24 (gdb) break _FIND
25 Breakpoint 2 at 0x804920b: file nasmjf.asm, line 485.
26 (gdb) c
27 Continuing.
28 foo
29
30 Our breakpoint triggers. Now we're in FIND. It checks if we've
31 run out of entries.
32
33 Breakpoint 2, _FIND () at nasmjf.asm:485
34 485 push esi ; _FIND! Save esi, we'll use this reg for string comparison
35 488 mov edx,[var_LATEST] ; LATEST points to name header of the latest word in the dictionary
37 _FIND.test_word () at nasmjf.asm:490
38 490 test edx,edx ; NULL pointer? (end of the linked list)
39 491 je .not_found
40
41 And then I think this is clever: instead of immediately
42 checking if the name strings match, it checks the precalculated
43 and stored length of the name first. Much more efficient.
44
45 496 xor eax,eax
46 497 mov al, [edx+4] ; al = flags+length field
47 498 and al,(F_HIDDEN|F_LENMASK) ; al = name length
48 499 cmp cl,al ; Length is the same?
49 500 jne .prev_word ; nope, try prev
50
51 And that's what happens here: the length doesn't match, so
52 we move to the previous word in the linked list. And it
53 starts over at .test_word...
54
55 _FIND.prev_word () at nasmjf.asm:517
56 517 mov edx,[edx] ; Move back through the link field to the previous word
57 518 jmp .test_word ; loop, test prev word
58 _FIND.test_word () at nasmjf.asm:490
59 490 test edx,edx ; NULL pointer? (end of the linked list)
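
For reference, here is a rough C sketch of the loop traced above. It is not the nasmjf code: the struct layout (a name pointer instead of inline name bytes), the function name find, and the flag values 0x20 and 0x1f are assumptions for illustration, chosen to match what I believe JONESFORTH uses. It also assumes the search length is 31 or less.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define F_HIDDEN  0x20  /* assumed value of the "hidden" flag bit */
#define F_LENMASK 0x1f  /* assumed mask for the name-length bits */

/* Hypothetical C view of one dictionary header. */
struct entry {
    struct entry *link;   /* previous word, or NULL at the end of the list */
    uint8_t flags_len;    /* flag bits OR'd with the name length */
    const char *name;     /* name characters, not NUL-terminated in Forth */
};

/* Walk back from the latest word; return the matching entry or NULL. */
static const struct entry *find(const struct entry *latest,
                                const char *name, size_t len)
{
    for (const struct entry *e = latest; e != NULL; e = e->link) {
        /* Cheap test first. F_HIDDEN stays in the masked value, so a
           hidden word can never equal a real length of 31 or less. */
        if ((size_t)(e->flags_len & (F_HIDDEN | F_LENMASK)) != len)
            continue;                       /* .prev_word */
        if (memcmp(e->name, name, len) == 0)
            return e;                       /* names match */
    }
    return NULL;                            /* .not_found */
}
```

With only a handful of entries the length check hardly matters, but in a full dictionary it skips the string comparison for most words.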
60
61 So I set a new breakpoint back in INTERPRET right after FIND
62 returns to see how a "not found" condition is handled.
63
64 (gdb) break 215
65 Breakpoint 3 at 0x8049043: file nasmjf.asm, line 215.
66 (gdb) c
67 Continuing.
68 Breakpoint 3, code_INTERPRET () at nasmjf.asm:215
69 215 test eax,eax ; Found?
70
71 If FIND fails, INTERPRET checks if the input is a numeric literal.
72
73 216 jz .try_literal
74 code_INTERPRET.try_literal () at nasmjf.asm:230
75 230 inc byte [interpret_is_lit] ; DID NOT MATCH a word, trying literal number
76 231 call _NUMBER ; Returns the parsed number in %eax, %ecx > 0 if error
77 _NUMBER () at nasmjf.asm:407
78 407 xor eax,eax
79 408 xor ebx,ebx
80 410 test ecx,ecx ; trying to parse a zero-length string is an error, but returns 0
82 411 jz .return
83
84 It's neat how Forth supports numeric input in the base
85 of your choice without any extra syntax. Just set BASE.
86
87 413 mov edx, [var_BASE] ; get BASE (in dl)
88 416 mov bl,[edi] ; bl = first character in string
89 417 inc edi
90 418 push eax ; push 0 on stack
91 _NUMBER () at nasmjf.asm:419
92 419 cmp bl,'-' ; negative number?
93 420 jnz .convert_char
94 _NUMBER.convert_char () at nasmjf.asm:435
95 435 sub bl,'0' ; < '0'?
96 436 jb .negate
97 437 cmp bl,10 ; <= '9'?
98 438 jb .compare_base
99 439 sub bl,17 ; < 'A'? (17 is 'A'-'0')
100 440 jb .negate
101 441 add bl,10
102 _NUMBER.compare_base () at nasmjf.asm:444
103 444 cmp bl,dl ; >= BASE?
104 445 jge .negate
105 _NUMBER.negate () at nasmjf.asm:453
106 453 pop ebx
107 _NUMBER.negate () at nasmjf.asm:454
108 454 test ebx,ebx
109 455 jz .return
110 _NUMBER.return () at nasmjf.asm:459
111 459 ret
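
A simplified C sketch of the digit conversion traced above, under some assumptions: the function name number and its signature are made up for illustration, and it returns the count of unparsed characters the way the assembly leaves it in ecx. On a failed parse the partial value may differ from what the assembly leaves in eax, since the assembly multiplies by BASE before it reads the next character.

```c
#include <stddef.h>

/* Parse `len` characters of `s` in `base`. Stores the value in *out and
   returns how many characters were left unparsed (0 means success). */
static size_t number(const char *s, size_t len, int base, int *out)
{
    int value = 0, negative = 0;
    size_t i = 0;

    *out = 0;
    if (len == 0)
        return 0;              /* zero-length string: an error, but returns 0 */
    if (s[0] == '-') {
        negative = 1;
        i = 1;
        if (len == 1)
            return 1;          /* a lone '-' is not a number */
    }
    for (; i < len; i++) {
        int digit = s[i] - '0';
        if (digit < 0)
            break;             /* character is below '0' */
        if (digit >= 10) {
            digit -= 17;       /* 17 is 'A' - '0' */
            if (digit < 0)
                break;         /* character is between '9' and 'A' */
            digit += 10;       /* 'A' becomes 10, 'B' becomes 11, ... */
        }
        if (digit >= base)
            break;             /* not a valid digit in this base */
        value = value * base + digit;
    }
    *out = negative ? -value : value;
    return len - i;
}
```

For "foo" in base ten, 'f' works out to 47, which is not below 10, so the loop stops at the first character and all three characters are reported as unparsed.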
112
113 Coming back from NUMBER, a value > 0 in ecx indicates an error
114 in trying to parse a numeric value.
115
116 code_INTERPRET.try_literal () at nasmjf.asm:232
117 232 test ecx,ecx
118 233 jnz .parse_error
119
120 And sure enough, "foo" is not a valid number in base ten (the
121 default), so we jump to the parse_error section. This should
122 print an error message.
123
124 code_INTERPRET.parse_error () at nasmjf.asm:267
125 267 mov ebx,2 ; 1st param: stderr
126 268 mov ecx,errmsg ; 2nd param: error message
127 269 mov edx,(errmsgend - errmsg) ; 3rd param: length of string
128 270 mov eax,[__NR_write] ; write syscall
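
The instructions above set up a Linux write system call: file descriptor in ebx, buffer in ecx, length in edx, syscall number in eax. As a point of comparison, this is roughly the call they are meant to make, sketched in C. The function name print_parse_error is hypothetical, and the message text is the one I expect to find at errmsg.

```c
#include <unistd.h>

/* The message expected at errmsg: 13 bytes, sent without a trailing NUL. */
static const char errmsg[] = "PARSE ERROR: ";

/* Write the message to stderr; returns bytes written, or -1 on error. */
static long print_parse_error(void)
{
    return (long)write(STDERR_FILENO, errmsg, sizeof errmsg - 1);
}
```
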
129
130 But oops! Looks like I've got an error.
131
132 Program received signal SIGSEGV, Segmentation fault.
133 code_INTERPRET.parse_error () at nasmjf.asm:270
134 270 mov eax,[__NR_write] ; write syscall
135
136 The next evening, I load it up again to see what's going on...
137
138 Reading symbols from nasmjf...
139 (gdb) break code_INTERPRET.parse_error
140 Breakpoint 2 at 0x80490a6: file nasmjf.asm, line 267.
141 (gdb) cont
142 Continuing.
143 foo
144
145 Breakpoint 2, code_INTERPRET.parse_error () at nasmjf.asm:267
146 267 mov ebx,2 ; 1st param: stderr
147 268 mov ecx,errmsg ; 2nd param: error message
148 269 mov edx,(errmsgend - errmsg) ; 3rd param: length of string
149
150 First I try to print the value at errmsg as a string. It
151 should be the string "PARSE ERROR: ".
152
153 (gdb) x/s $ecx
154 0x804a315 <errmsg>: ""
155
156 Weird. Let's look at the first 4 bytes:
157
158 (gdb) x/4x $ecx
159 0x804a315 <errmsg>: 0x00 0x00 0x00 0x53
160
161 Weird! Looking at stuff...
162
163 (gdb) info addr errmsg
164 Symbol "errmsg" is at 0x804a315 in a file compiled without debugging.
165 (gdb) info addr errmsgend
166 Symbol "errmsgend" is at 0x804a322 in a file compiled without debugging.
167 (gdb) x/10c $ecx
168 0x804a315 <errmsg>: 0 '\000' 0 '\000' 0 '\000' 83 'S' 69 'E' 32 ' ' 69 'E' 82 'R'
170 0x804a31d: 82 'R' 79 'O'
171
172 Huh, so I've basically got "---SE ERROR: " (where '-' is NUL). Something
173 is happening to the first three bytes of my string. Or is this some
174 alignment issue? I'll see... To be continued.