1 The mind-destroying temporal manipulation operators
2 continue now with a new word called
3
4 [COMPILE]
5
6 Which compiles the next "word" of input even if it would
7 otherwise have been immediate.
8
9 This allows you to be your own father's uncle or
10 something like that.
11
12 And, of course, [COMPILE] itself is IMMEDIATE. Let's
13 take a look at the whole definition with some Jones
14 comments:
15
16 : [COMPILE] IMMEDIATE
17 WORD \ get the next word
18 FIND \ find it in the dictionary
19 >CFA \ get its codeword
20 , \ and compile that
21 ;
22
23 I have to keep reminding myself that even though
24 [COMPILE] is IMMEDIATE, the WORD word's next word isn't
25 FIND, it's whatever follows the *use* of [COMPILE],
26 which is probably during compilation. I think we have
27 exceeded the limitations of the English language now.
28 Throw away your dictionaries, clocks, and family trees.
29 Where we're going, we won't be needing those anymore.
30 Set phasers for "make love to your own grandma" and
31 prepare to jump to lightspeed toward Planet FORTH.
32
33 I'm tired.
34
35 Next night: Okay, I figured out how to test [COMPILE].
36 I'll make a new IMMEDIATE word called foo that emits 'z'
37 when it runs.
38
39 : foo IMMEDIATE [ CHAR z ] LITERAL EMIT ;
40 foo
41 z
42
43 : bar 'A' EMIT foo 'A' EMIT ;
44 z
45
46 bar
47 AA
48
49 Yup, foo runs when bar is compiled, not when bar runs.
50
51 Then I'll see if I can use [COMPILE] to call foo at
52 "runtime" instead:
53
54 : bar2 'A' EMIT [COMPILE] foo 'A' EMIT ;
55 bar2
56 AzA
57
58 Yes! Nailed it.
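
For a more practical use of [COMPILE], stock jonesforth.f builds a new control word out of IF in roughly this way. This is quoted from memory, so treat it as a sketch that assumes NOT, ' (tick) and the immediate word IF are already defined:

```forth
\ Sketch after the UNLESS definition in stock jonesforth.f.
\ Assumes NOT, ' (tick) and the immediate word IF already exist.
: UNLESS IMMEDIATE
    ' NOT ,         \ compile NOT so the test is inverted
    [COMPILE] IF    \ compile a call to IF rather than running IF right now
;
```

Without [COMPILE], IF would run while UNLESS itself was being compiled. With it, IF runs later, whenever UNLESS is used inside another definition.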
59
60 The next word defined in jonesforth.f is RECURSE.
61
62 Based on what's come before, you'd think this was a
63 total mind-wrecking word. But RECURSE just lets you call
64 a word from itself. Just like it sounds.
65
66 The only reason you can't _normally_ call a word from
67 itself is that a word is usually marked as hidden until
68 it's done being compiled. (Which allows you to call the
69 previous definition.)
70
71 And that should make this real easy to test:
72
73 : foo 1 . ;
74 foo
75 1
76 : foo 2 . foo ;
77 foo
78 21
79 : foo 3 . RECURSE ;
80 foo
81 321
82
83 Hmmm... that's not what I expected it to do. The second
84 foo calls the first, as expected. But the third should
85 have called itself, not the second, right?
86
87 I could debug this in GDB, but quite frankly, that's
88 gonna suck. The level of abstraction is all wrong. I
89 really need some proper introspective debugging in the
90 interpreter itself. And I feel like FORTH is uniquely
91 suited to introspection.
92
93 At a minimum, I'd like a list of words in the dictionary
94 with addresses. I'd like to see what HERE points to.
95 And the contents of the stack.
96
97 I'm pretty sure I've got all the primitives I need to
98 get those things. I just have to solve the puzzle.
99
100 Then maybe I'll be able to figure out why RECURSE isn't
101 doing what I think it should do.
102
103 (Update: this didn't work. Feel free to skim ahead...)
104
105 First: printing the pointers HERE and LATEST sounds
106 pretty easy.
107
108 HERE .
109 134537812
110 LATEST .
111 134537776
112
113 I presume those decimal addresses are correct.
114
115 Two nights later: Now I've added a PRINTWORD word in
116 assembly that takes the address of a word (its header)
117 and prints its name.
118
119 LATEST PRINTWORD
120 RECURSE
121 LATEST @ PRINTWORD
122 [COMPILE]
123 LATEST @ @ PRINTWORD
124 '.'
125
126 The above example is printing the last three words in
127 the dictionary by using @ to fetch the address of the
128 previous word from the linked list.
129
130 I'd like to see some additional info about words like
131 how long their definitions are and whether they're
132 immediate and/or hidden.
133
134 HERE LATEST - .
135 36
136
137 That tells me that RECURSE is 36 bytes long, I guess.
138
139 The length + flags byte (the flags being hidden and
140 immediate) should be here:
141
142 LATEST 4 + C@ .
143 135
144
145 Hmmmm... that's 10000111 in binary, so looks like the
146 length portion is 111 (7). That checks out, 'RECURSE' is
147 seven letters long. (See bc session below for the binary
148 conversion, etc.)
149
150 And it should be immediate. Does that correspond with
151 10000111? Let's see:
152
153 %assign F_IMMED 0x80
154 %assign F_HIDDEN 0x20
155
156 And a little bc session to confirm some stuff:
157
158 eeepc:~$ bc
159 bc 1.32.1
160 Adapted from https://github.com/gavinhoward/bc
161 Original code (c) 2018 Gavin D. Howard and contributors
162 obase=2
163 135
164 10000111
165 7
166 111
167 ibase=16
168 80
169 10000000
170 20
171 100000
172
173 Yup! RECURSE is immediate. If it were hidden, it would
174 have had 10100111 in the length+flags.
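
The same check can be done in the interpreter with AND instead of bc. A small sketch, assuming AND is available, BASE is decimal, and the low five bits hold the name length (the F_LENMASK of 0x1f from stock jonesforth, which is not shown above):

```forth
\ 135 is the length+flags byte read above.
135 128 AND .   \ F_IMMED  (0x80): prints 128, so the immediate bit is set
135  32 AND .   \ F_HIDDEN (0x20): prints 0, so the hidden bit is clear
135  31 AND .   \ length mask (0x1f, assumed): prints 7, the length of RECURSE
```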
175
176 If you're skimming, stop here!
177
178 Next night. Well, the above was neat and a good
179 refresher for me but I messed around with it quite a bit
180 trying to debug RECURSE without solving it. However,
181 during all of that messing around, I did figure out
182 what's going wrong. Here's Jones's RECURSE:
183
184 : RECURSE IMMEDIATE
185 LATEST @ \ LATEST points to the word being compiled at the moment
186 >CFA \ get the codeword
187 , \ compile it
188 ;
189
190 The problem is that LATEST already points to the word
191 being compiled. LATEST @ fetches the _value_ at that
192 address. Well, that's a pointer to the _previous_ word.
193 Which completely explains the behavior I've seen.
194
195 I'm baffled. This is the exact same "bug" I encountered
196 in the COLON and SEMICOLON words.
197
198 Several days pass: Ha ha, wow. So I ended up creating an
199 actual web page containing the conundrum and posted that
200 to reddit.com/r/forth and got an *excellent* explanation
201 in just a couple hours:
202
203 I WAS MISSING THIS FUNDAMENTAL FACT ABOUT VARIABLES:
204
205 THEY DON'T LEAVE THEIR VALUE ON THE STACK.
206
207 THEY LEAVE THEIR ADDRESS ON THE STACK.
208
209 Why? Because that way you can also write to them.
210 Providing an address instead allows for both STORE
211 (!) and FETCH (@).
212
213 (I believe constants, by contrast, leave their values on
214 the stack.)
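
A short sketch of the difference, assuming a Forth where VARIABLE and CONSTANT are defined (stock jonesforth once all of jonesforth.f has loaded, or any standard Forth). The names counter and seven are made up for the example:

```forth
VARIABLE counter    \ hypothetical variable
5 counter !         \ counter pushes its address; ! stores 5 at that address
counter @ .         \ counter pushes its address; @ fetches the value; prints 5
7 CONSTANT seven    \ hypothetical constant
seven .             \ a constant pushes its value directly; prints 7
```

In jonesforth, built-in variables such as LATEST and HERE follow the same pattern: LATEST pushes the address of the variable, and LATEST @ is the value stored there.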
215
216 Anyway, fixing my bug comes down to restoring FETCH in
217 the ':' and ';' definitions:
218
219 dd LATEST, FETCH, HIDDEN ; Make the word hidden while it's being compiled.
220
221 dd LATEST, FETCH, HIDDEN ; Unhide word now that it's been compiled.
222
223 And now to test. This should be an infinite loop.
224
225 : all-nines 9 . RECURSE ;
226 all-nines
227 999999999999999999999999999999999999999999999999999999999999999999...
228 ...999999999999999999999999
229 Program received signal SIGSEGV, Segmentation fault.
230 0x0804e278 in ?? ()
231 (gdb)
232
233 I'm not 100% sure what caused that segmentation fault.
234 Oh, probably the return stack overflowed. That makes sense
235 because a recursive word that never stops also never
236 gets to the EXIT that ';' compiles into the end of the
237 word. And EXIT is the only mechanism that automatically
238 pops the return stack.
239
240 By the way, I did try to make a non-infinite recursive
241 word using 0BRANCH and gave up. That's worse than
242 writing assembly with no labels!
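
For comparison, a terminating recursive word is much easier once the IF/ELSE/THEN words from jonesforth.f are loaded. A minimal sketch under that assumption (countdown is a made-up name; 0>, 1-, DUP and DROP are primitives in stock jonesforth):

```forth
: countdown         \ ( n -- ) print n, n-1, ... down to 0
    DUP .           \ print the current value
    DUP 0> IF       \ still above zero?
        1- RECURSE  \ yes: call this same word with n-1
    ELSE
        DROP        \ no: base case, discard the counter
    THEN
;
3 countdown         \ prints 3, 2, 1, 0 in that order
```

Each level eventually reaches the EXIT compiled by ';', so the return stack unwinds instead of overflowing.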
243
244 Anyway, I'll go back to earlier logs now to add a note
245 so no one else is led astray by my fundamental
246 misunderstanding. Yikes:
247
248 r! ag -l 'fetch|latest|@' log* | sort
249 log02.txt
250 log04.txt
251 log07.txt
252 log09.txt
253 log10.txt <-- here is where I'm first confused
254 log11.txt <-- here I drop FETCH from : and ;
255 log15.txt <-- wrong variable examples
256 log16.txt <-- wrong variable examples
257 log17.txt
258 log19.txt
259
260 And is it possible that I can now read all of
261 jonesforth.f without errors?
262
263 I'll try it by setting the __lines_of_jf_to_read to the
264 end of the file: 1790...
265
266 PARSE ERROR: ( look it up in the dictionary )
267 >DFA
268 PARSE ERROR: ( look it up in the dictionary )
269 >DFA
270
271 Program received signal SIGSEGV, Segmentation fault.
272 _COMMA () at nasmjf.asm:689
273 689 stosd ; puts the value in eax at edi, increments edi
274 (gdb)
275
276 Ha ha, nope. But I think I'm getting further. Those
277 PARSE ERROR messages are new. Weird, I don't see why it
278 would choke on a comment when there are other '(...)'
279 comments before that:
280
281 : TO IMMEDIATE ( n -- )
282 WORD ( get the name of the value )
283 FIND ( look it up in the dictionary )
284 >DFA ( get a pointer to the first data field (the 'LIT') )
285
286 Well, I'll set the lines to read before that parse error
287 and keep working my way down. At least RECURSE works
288 now...