1 Exciting stuff! There are now enough "code words" defined
2 in machine code to create the COLON (":") word as a pure
3 Forth definition built from other words.
4
5 Let's see if it works.
6
7 Reading symbols from nasmjf...
8
9 Wait, where is code_COLON?
10
11 (gdb) break code_C
12 code_CHAR code_COMMA code_CREATE
13
14 Oh, ha ha. Right. No such thing. COLON is defined entirely
15 in a data segment. There is no machine code portion.
16
17 Well, let's break right before it gets called, then.
18 INTERPRET.execute is the point at which we've matched
19 the user input with a word in the dictionary and we
20 hand control over to it via the pointer right after
21 the "header" portion of the word definition.
22
23 I'll type a new definition using ':' to make a word
24 called "five" that pushes 5 on the stack:
25
26 (gdb) break code_INTERPRET.execute
27 Breakpoint 3 at 0x8049096: file nasmjf.asm, line 254.
28 (gdb) c
29 Continuing.
30 : five 5 ;
31
32 Breakpoint 3, code_INTERPRET.execute () at nasmjf.asm:254
33 254 mov ecx,[interpret_is_lit] ; Literal?
34 255 test ecx,ecx ; Literal?
35 256 jnz .do_literal
36 260 jmp [eax]
37
38 So we should be about to jump to DOCOL, which should be
39 the machine code COLON points to in the "interpreter"
40 pointer at the beginning of the word definition. (This
41 is confusing because the "interpreter" of a word is
42 not the same as the interpreter INTERPRET that just
43 took our typed input...)
44
45 Anyway, let's take a look at eax. Looks like it points
46 to COLON + 1. Wait! Shouldn't it just be COLON?
47
48 (gdb) p /x $eax
49 $1 = 0x804a11d
50 (gdb) info symbol $eax
51 COLON + 1 in section .data of /home/dave/nasmjf/nasmjf
52
53 And what's at that address (* treats the value as
54 a pointer to memory)? It's...uh, not quite the
55 value I was expecting (an address in the 0x8049000
56 range).
57
58 Yeah, the pointer there is no good.
59
60 (gdb) p /x *$eax
61 $2 = 0x64080490
62 (gdb) info symbol *$eax
63 No symbol matches *$eax.
64
65 Where is DOCOL?
66
67 (gdb) info address DOCOL
68 Symbol "DOCOL" is at 0x8049000 in a file compiled without debugging.
69
70 The next night: Oh, now I see it! The COLON + 1 was,
71 indeed, the problem. Check it out, the value loaded from
72 [eax] is the correct DOCOL address shifted 1 byte off:
73
74 0x8049000 <--- DOCOL
75 0x64080490 <--- [eax]
76
77 And sure enough, letting it run causes a segfault:
78
79 Program received signal SIGSEGV, Segmentation fault.
80 0x64080490 in ?? ()
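
A minimal sketch of why the bad value looks like DOCOL's address with the bytes slid over, assuming the cell at COLON holds 0x08049000 in little-endian order. The fifth byte (0x64) is inferred from the value GDB printed, not read from the real data segment:

```python
import struct

# Hypothetical reconstruction of the five bytes starting at COLON.
# The first four are DOCOL's address (0x08049000) stored little-endian;
# the fifth (0x64) is inferred from the bad value GDB printed.
memory_at_colon = bytes([0x00, 0x90, 0x04, 0x08, 0x64])

def read_u32_le(buf, offset):
    """Read a 32-bit little-endian value, the way a 32-bit x86 load does."""
    return struct.unpack_from("<I", buf, offset)[0]

good = read_u32_le(memory_at_colon, 0)  # load from COLON
bad = read_u32_le(memory_at_colon, 1)   # load from COLON + 1

print(hex(good))  # 0x8049000
print(hex(bad))   # 0x64080490
```

Reading one byte late drops the low 0x00 byte and pulls in whatever byte follows the cell, which is how a valid .text address turns into 0x64080490.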
81
82 So where is it going wrong?
83
84 Register al contains just 00000001 (/t means binary
85 formatting, of COURSE).
86
87 (gdb) p/t $al
88 $1 = 1
89 392 and al,F_LENMASK ; Just the length, not the flags.
90
91 We can't examine F_LENMASK in GDB because it was a
92 NASM constant.
93
94 But we can see what it was with a disassembly: 0x1f
95
96 (gdb) disass
97 Dump of assembler code for function _TCFA:
98 0x0804918d <+0>: xor eax,eax
99 0x0804918f <+2>: add edi,0x4
100 0x08049192 <+5>: mov al,BYTE PTR [edi]
101 0x08049194 <+7>: inc edi
102 => 0x08049195 <+8>: and al,0x1f
103 0x08049197 <+10>: add edi,eax
104 0x08049199 <+12>: add edi,0x3
105 0x0804919c <+15>: and edi,0xfffffffd
106 0x0804919f <+18>: ret
107 End of assembler dump.
108
109 Which is 00011111 in binary - so it masks off all
110 but the last five bits from al. This currently
111 has no effect (no flags were set on COLON) and
112 the name ':' is, indeed, one character long.
113
114 (gdb) p/t 0x1f
115 $2 = 11111
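
Here is a small sketch of what that mask does to the flags+len byte. F_LENMASK = 0x1f is confirmed by the disassembly above. The F_IMMED and F_HIDDEN values are the ones jonesforth defines, and I'm assuming this port keeps them:

```python
F_IMMED = 0x80    # flag bit as defined in jonesforth (assumed unchanged here)
F_HIDDEN = 0x20   # flag bit as defined in jonesforth (assumed unchanged here)
F_LENMASK = 0x1F  # confirmed by the disassembly: and al,0x1f

def name_length(flags_and_len):
    """Keep only the low five bits: the name length without the flags."""
    return flags_and_len & F_LENMASK

colon_byte = 1                # ':' is one character long, no flags set
immediate_byte = F_IMMED | 1  # hypothetical one-character immediate word

print(name_length(colon_byte))      # 1
print(name_length(immediate_byte))  # 1
```

With no flags set the mask changes nothing, which matches what al shows in the session.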
116
117 So after this, edi should contain the address of
118 the pointer stored after the name.
119
120 393 add edi,eax ; Skip the name.
121 (gdb) p/x $eax
122 $3 = 0x1
123 (gdb) p/x $edi
124 $4 = 0x804a119
125
126 Ah, but first we have to make sure we're pointed
127 at the pointer stored after the name AND aligned
128 to the next 4 bytes.
129
130 Apparently, adding 3 and masking with -3 does
131 the trick. How does this work?
132
133 So aligning on 4 bytes means that the last two
134 bits of the address have to be 0. And to get to
135 the next four bytes, we would always need to
136 advance to the NEXT 4 byte-aligned addr, so we
137 can't just mask off the last two digits.
138
139 All three of these addresses need to advance
140 to the same next 4 byte-aligned address:
141
142 00001001 --> 00001100
143 00001010 --> 00001100
144 00001011 --> 00001100
145
146 Adding 3 (11) to each of these would produce:
147
148 00001100
149 00001101
150 00001110
151
152 respectively. So that advances the 4's place
153 bit as needed. Now we just need to mask off
154 the last two digits and we're set.
155
156 (Also, adding 3 (11) to an already-aligned
157 address will do no harm since it wouldn't
158 advance the 4's place bit: 1000 + 11 = 1011)
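
The add-then-mask idea above can be sketched like this. The function name and the 32-bit mask constant are mine for illustration, not anything from the assembly source:

```python
# All ones except the last two bits, for a 32-bit address.
CLEAR_LOW_TWO_BITS = 0xFFFFFFFC

def align4(addr):
    """Round addr up to the next multiple of 4 (unchanged if already aligned)."""
    return (addr + 3) & CLEAR_LOW_TWO_BITS

for addr in (0b00001001, 0b00001010, 0b00001011, 0b00001000):
    print(format(addr, "08b"), "-->", format(align4(addr), "08b"))
# 00001001 --> 00001100
# 00001010 --> 00001100
# 00001011 --> 00001100
# 00001000 --> 00001000
```

The last row is the already-aligned case: adding 3 only touches the low two bits, and the mask throws them away again.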
159
160 So what I don't understand is why we're masking with -3,
161 which is this value when stored with two's complement:
162
163 0x0804919c <+15>: and edi,0xfffffffd
164
165 which is ...1111111101 because you invert and add one to
166 make a number negative.
167
168 This seems like a mistake (and exactly the off-by-one
169 mistake we've got here).
170
171 To mask off the last two digits, don't we want
172 -4 instead?
173
174 00000100 4
175 11111011 invert digits
176 11111100 add one
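
To double-check the invert-and-add-one arithmetic, here is a quick sketch of 32-bit two's complement. The helper names are made up for this note:

```python
MASK32 = 0xFFFFFFFF

def negate_32(n):
    """Two's complement negation by hand: invert all 32 bits, then add one."""
    return ((n ^ MASK32) + 1) & MASK32

def as_u32(n):
    """View a (possibly negative) Python int as a 32-bit unsigned pattern."""
    return n & MASK32

print(hex(negate_32(3)))  # 0xfffffffd, the mask in the disassembly
print(hex(negate_32(4)))  # 0xfffffffc, low two bits cleared
print(negate_32(3) == as_u32(-3), negate_32(4) == as_u32(-4))  # True True
```

So -3 leaves bit 0 set and clears only bit 1, while -4 clears both of the low bits.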
177
178 Anyway, let's examine the actual values...
179
180 394 add edi,3 ; The codeword is 4-byte aligned.
181 (gdb) p/x $edi
182 $5 = 0x804a11a
183 (gdb) p/t $edi
184 $7 = 1000000001001010000100011101
185 395 and edi,-3
186 (gdb) p/x $edi
187 $8 = 0x804a11d
188 (gdb) p/t $edi
189 $9 = 1000000001001010000100011101
190 (gdb) info symbol $edi
191 COLON + 1 in section .data of /home/dave/nasmjf/nasmjf
192
193 Now the off-by-one makes plenty of sense. I'll try a -4
194 now, but why...
195
196 Argh! I just looked at the jonesforth source again.
197 It's not -3, it's ~3! Which is unary NOT 3 (11111100).
198 Bah! Of course it is. Here's the original GAS line:
199
200 andl $~3,%edi
201
202 NASM uses ~ for unary not as well. I bet it'll work now.
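
A side-by-side sketch of the two masks, fed the edi value from the session above (0x804a11a, just after the name has been skipped). The function names are invented for the comparison:

```python
MASK32 = 0xFFFFFFFF

def align_with_minus_3(edi):
    """The mistaken version: add 3, then AND with -3 (0xfffffffd)."""
    return (edi + 3) & (-3 & MASK32)

def align_with_not_3(edi):
    """The jonesforth version: add 3, then AND with ~3 (0xfffffffc)."""
    return (edi + 3) & (~3 & MASK32)

edi = 0x804A11A  # from the GDB session

print(hex(align_with_minus_3(edi)))  # 0x804a11d  (COLON + 1)
print(hex(align_with_not_3(edi)))    # 0x804a11c  (COLON)
print((~3 & MASK32) == (-4 & MASK32))  # True: ~3 and -4 are the same bits
```

The last line also answers the earlier question: -4 would have worked too, since it is the same bit pattern as ~3.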
203
204 (gdb) break _TCFA
205 Breakpoint 2 at 0x804918d: file nasmjf.asm, line 388.
206 (gdb) c
207 Continuing.
208 : FIVE 5 ;
209
210 Breakpoint 2, _TCFA () at nasmjf.asm:388
211 388 xor eax,eax
212 389 add edi,4 ; Skip link pointer.
213 390 mov al,[edi] ; Load flags+len into %al.
214 391 inc edi ; Skip flags+len byte.
215 392 and al,F_LENMASK ; Just the length, not the flags.
216 393 add edi,eax ; Skip the name.
217
218 Let's check this each step of the way. edi points to
219 the name (header) portion of COLON. It ends in a 2 (10)
220 so we'll need to advance it to the next 4-byte alignment
221 where the COLON code begins.
222
223 (gdb) info symbol $edi
224 name_COLON + 6 in section .data of /home/dave/nasmjf/nasmjf
225 (gdb) p/t $edi
226 $2 = 1000000001001010000100011010
227
228 Now the 4's place is incremented. But the address
229 ends in 1.
230
231 394 add edi,3 ; The codeword is 4-byte aligned:
232 (gdb) p/t $edi
233 $3 = 1000000001001010000100011101
234
235 Finally, we mask with NOT 3. Now edi is aligned and
236 points to the code definition!
237
238 395 and edi,~3 ; Add ...00000011 and mask ...11111100.
239 396 ret ; For more, see log06.txt in this repo.
240 (gdb) p/t $edi
241 $4 = 1000000001001010000100011100
242 (gdb) info symbol $edi
243 COLON in section .data of /home/dave/nasmjf/nasmjf
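
Putting the steps together, here is a rough Python model of what _TCFA does, following the assembly line by line. The header bytes are a reconstruction: the link pointer value is a placeholder (0), and the base address 0x804a114 is worked out from "name_COLON + 6" being 0x804a11a:

```python
import struct

F_LENMASK = 0x1F

def tcfa(memory, base, header_addr):
    """Model of _TCFA: map a dictionary header address to its codeword address."""
    edi = header_addr
    edi += 4                      # add edi,4       ; skip link pointer
    al = memory[edi - base]       # mov al,[edi]    ; flags+len byte
    edi += 1                      # inc edi         ; skip flags+len byte
    al &= F_LENMASK               # and al,F_LENMASK
    edi += al                     # add edi,eax     ; skip the name
    edi += 3                      # add edi,3
    edi &= ~3                     # and edi,~3      ; 4-byte align
    return edi

NAME_COLON = 0x804A114  # derived: name_COLON + 6 == 0x804a11a

# Hypothetical header image: link (placeholder 0), flags+len = 1,
# the name ":", then two padding bytes up to the aligned codeword.
header = struct.pack("<IB", 0, 1) + b":" + b"\x00\x00"

print(hex(tcfa(header, NAME_COLON, NAME_COLON)))  # 0x804a11c
```

0x804a11c is the address GDB reports as COLON, so the model agrees with the fixed code.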
244
245 We'll skip some stuff and take a look at what
246 INTERPRET.execute now does with these results.
247
248 260 jmp [eax]
249 (gdb) info symbol $eax
250 COLON in section .data of /home/dave/nasmjf/nasmjf
251 (gdb) info symbol *$eax
252 DOCOL in section .text of /home/dave/nasmjf/nasmjf
253
254 Excellent! The address at our word's definition
255 contains another address. This one is for DOCOL,
256 the machine code that starts the chain reaction
257 that executes the rest of the words in the
258 definition of COLON.
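
As a toy model of that double indirection (and nothing like the real machine code), the two lookups behind 'jmp [eax]' can be sketched with a pair of dictionaries. The addresses come from the GDB output above. The docol function is a stand-in that just reports it was reached:

```python
DOCOL_ADDR = 0x8049000  # from "info address DOCOL"
COLON_ADDR = 0x804A11C  # COLON + 1 was 0x804a11d

def docol():
    """Stand-in for the DOCOL machine code."""
    return "DOCOL entered"

code = {DOCOL_ADDR: docol}       # .text: address -> routine
data = {COLON_ADDR: DOCOL_ADDR}  # .data: address -> 32-bit cell

def jmp_indirect(eax):
    """Model of 'jmp [eax]': read the cell at eax, then jump to that address."""
    target = data[eax]
    return code[target]()

print(jmp_indirect(COLON_ADDR))  # DOCOL entered
```

In this model an unaligned eax such as COLON_ADDR + 1 raises KeyError, which is the toy equivalent of the earlier segfault.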
259
260 So it turned out that the alignment bug had just
261 been waiting to crop up.
262
263 I still get a segfault after this point, so the
264 debugging will continue in log07.txt.