nasmjf

A NASM assembler port of JONESFORTH
git clone http://ratfactor.com/repos/nasmjf/nasmjf.git

nasmjf/devlog/log19.txt

The mind-destroying temporal manipulation operators
continue now with a new word called

    [COMPILE]

Which compiles the next "word" of input even if it would
otherwise have been immediate.

This allows you to be your own father's uncle or
something like that.

And, of course, [COMPILE] itself is IMMEDIATE. Let's
take a look at the whole definition with some Jones
comments:

    : [COMPILE] IMMEDIATE
        WORD        \ get the next word
        FIND        \ find it in the dictionary
        >CFA        \ get its codeword
        ,           \ and compile that
    ;

I have to keep reminding myself that even though
[COMPILE] is IMMEDIATE, the WORD word's next word isn't
FIND, it's whatever follows the *use* of [COMPILE],
which is probably during compilation. I think we have
exceeded the limitations of the English language now.
Throw away your dictionaries, clocks, and family trees.
Where we're going, we won't be needing those anymore.
Set phasers for "make love to your own grandma" and
prepare to jump to lightspeed toward Planet FORTH.

I'm tired.

Next night: Okay, I figured out how to test [COMPILE].
I'll make a new IMMEDIATE word called foo that emits 'z'
when it runs.

    : foo IMMEDIATE [ CHAR z ] LITERAL EMIT ;
    foo
    z

    : bar 'A' EMIT foo 'A' EMIT ;
    z

    bar
    AA

Yup, foo runs when bar is compiled, not when bar runs.

Then I'll see if I can use [COMPILE] to call foo at
"runtime" instead:

    : bar2 'A' EMIT [COMPILE] foo 'A' EMIT ;
    bar2
    AzA

Yes! Nailed it.

The next word defined in jonesforth.f is RECURSE.

Based on what's come before, you'd think this was a
total mind-wrecking word. But RECURSE just lets you call
a word from itself. Just like it sounds.

The only reason you can't _normally_ call a word from
itself is that a word is usually marked as hidden until
it's done being compiled. (Which allows you to call the
previous definition.)
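
Side note: the hide-while-compiling idea can be sketched
outside of FORTH. This is only a toy model in Python, not
how nasmjf or JONESFORTH store anything. The names Entry,
find, and the "body" strings are all made up for
illustration. It just shows why a lookup of a word's own
name during compilation lands on the previous definition.

```python
# Toy model of a Forth-style dictionary: newest entry last,
# lookups search newest-first and skip hidden entries.
# Entry, find, and the body strings are hypothetical names,
# not anything taken from nasmjf or JONESFORTH.
from dataclasses import dataclass


@dataclass
class Entry:
    name: str
    body: str
    hidden: bool = False


def find(dictionary, name):
    """Return the newest visible entry called `name`, or None."""
    for entry in reversed(dictionary):
        if entry.name == name and not entry.hidden:
            return entry
    return None


dictionary = [Entry("foo", "first foo")]

# ':' creates the new header and hides it while the body compiles.
new_foo = Entry("foo", "second foo", hidden=True)
dictionary.append(new_foo)

# Looking up "foo" inside the new body skips the hidden entry.
seen_during_compile = find(dictionary, "foo")

# ';' unhides the finished word, so later lookups get the new one.
new_foo.hidden = False
seen_after_compile = find(dictionary, "foo")

print(seen_during_compile.body)  # first foo
print(seen_after_compile.body)   # second foo
```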

And that should make this real easy to test:

    : foo 1 . ;
    foo
    1
    : foo 2 . foo ;
    foo
    21
    : foo 3 . RECURSE ;
    foo
    321

Hmmm... that's not what I expected it to do. The second
foo calls the first, as expected. But the third should
have called itself, not the second, right?

I could debug this in GDB, but quite frankly, that's
gonna suck. The level of abstraction is all wrong. I
really need some proper introspective debugging in the
interpreter itself. And I feel like FORTH is uniquely
suited to introspection.

At a minimum, I'd like a list of words in the dictionary
with addresses. I'd like to see what HERE points to.
And the contents of the stack.

I'm pretty sure I've got all the primitives I need to
get those things. I just have to solve the puzzle.

Then maybe I'll be able to figure out why RECURSE isn't
doing what I think it should do.

(Update: this didn't work. Feel free to skim ahead...)

First: printing the pointers HERE and LATEST sounds
pretty easy.

    HERE .
    134537812
    LATEST .
    134537776

I presume those decimal addresses are correct.

Two nights later: Now I've added a PRINTWORD word in
assembly that takes the address of a word (its header)
and prints its name.

    LATEST PRINTWORD
    RECURSE
    LATEST @ PRINTWORD
    [COMPILE]
    LATEST @ @ PRINTWORD
    '.'

The above example is printing the last three words in
the dictionary by using @ to fetch the address of the
previous word from the linked list.

I'd like to see some additional info about words like
how long their definitions are and whether they're
immediate and/or hidden.

    HERE LATEST - .
    36

That tells me that RECURSE is 36 bytes long, I guess.

The length + flags (it's hidden and immediate) should be
here:

    LATEST 4 + C@ .
    135

Hmmmm...
that's 10000111 in binary, so it looks like the
length portion is 111 (7). That checks out, 'RECURSE' is
seven letters long. (See bc session below for the binary
conversion, etc.)

And it should be immediate. Does that correspond with
10000111? Let's see:

    %assign F_IMMED 0x80
    %assign F_HIDDEN 0x20

And a little bc session to confirm some stuff:

    eeepc:~$ bc
    bc 1.32.1
    Adapted from https://github.com/gavinhoward/bc
    Original code (c) 2018 Gavin D. Howard and contributors
    obase=2
    135
    10000111
    7
    111
    ibase=16
    80
    10000000
    20
    100000

Yup! RECURSE is immediate. If it were hidden, it would
have had 10100111 in the length+flags.

If you're skimming, stop here!

Next night. Well, the above was neat and a good
refresher for me, but I messed around with it quite a bit
trying to debug RECURSE without solving it. However,
during all of that messing around, I did figure out
what's going wrong. Here's Jones's RECURSE:

    : RECURSE IMMEDIATE
        LATEST @    \ LATEST points to the word being compiled at the moment
        >CFA        \ get the codeword
        ,           \ compile it
    ;

The problem is that LATEST already points to the word
being compiled. LATEST @ fetches the _value_ at that
address. Well, that's a pointer to the _previous_ word.
Which completely explains the behavior I've seen.

I'm baffled. This is the exact same "bug" I encountered
in the COLON and SEMICOLON words.

Several days pass: Ha ha, wow. So I ended up creating an
actual web page containing the conundrum and posted that
to reddit.com/r/forth and got an *excellent* explanation
in just a couple hours:

I WAS MISSING THIS FUNDAMENTAL FACT ABOUT VARIABLES:

    THEY DON'T LEAVE THEIR VALUE ON THE STACK.

    THEY LEAVE THEIR ADDRESS ON THE STACK.

Why? Because that way you can also write to them.
By providing an address instead, it allows for both STORE
(!) as well as FETCH (@).

(I believe constants, by contrast, leave their values on
the stack.)

Anyway, the fix for my bug comes down to restoring FETCH
in the ':' and ';' definitions:

    dd LATEST, FETCH, HIDDEN ; Make the word hidden while it's being compiled.

    dd LATEST, FETCH, HIDDEN ; Unhide word now that it's been compiled.

And now to test. This should be an infinite loop.

    : all-nines 9 . RECURSE ;
    all-nines
    999999999999999999999999999999999999999999999999999999999999999999...
    ...999999999999999999999999
    Program received signal SIGSEGV, Segmentation fault.
    0x0804e278 in ?? ()
    (gdb)

I'm 100% not sure what caused that segmentation fault.
Oh, probably the return stack overflowed. That makes
sense because a recursive word that never stops also
never gets to the EXIT that ';' compiles into the end of
the word. And EXIT is the only mechanism that
automatically pops the return stack.

By the way, I did try to make a non-infinite recursive
word using 0BRANCH and gave up. That's worse than
writing assembly with no labels!

Anyway, I'll go back to earlier logs now to add a note
so no one else is led astray by my fundamental
misunderstanding. Yikes:

    r! ag -l 'fetch|latest|@' log* | sort
    log02.txt
    log04.txt
    log07.txt
    log09.txt
    log10.txt  <-- here is where I'm first confused
    log11.txt  <-- here I drop FETCH from : and ;
    log15.txt  <-- wrong variable examples
    log16.txt  <-- wrong variable examples
    log17.txt
    log19.txt

And is it possible that I can now read all of
jonesforth.f without errors?

I'll try it by setting the __lines_of_jf_to_read to the
end of the file: 1790...

    PARSE ERROR: ( look it up in the dictionary )
    >DFA
    PARSE ERROR: ( look it up in the dictionary )
    >DFA

    Program received signal SIGSEGV, Segmentation fault.
    _COMMA () at nasmjf.asm:689
    689 stosd ; puts the value in eax at edi, increments edi
    (gdb)

Ha ha, nope. But I think I'm getting further. Those
PARSE ERROR messages are new. Weird, I don't see why it
would choke on a comment when there are other '(...)'
comments before that:

    : TO IMMEDIATE ( n -- )
        WORD        ( get the name of the value )
        FIND        ( look it up in the dictionary )
        >DFA        ( get a pointer to the first data field (the 'LIT') )

Well, I'll set the lines to read before that parse error
and keep working my way down. At least RECURSE works
now...
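
Postscript: the variable lesson from this log (a
variable leaves its ADDRESS, and you need @ to get the
value) can be sketched as a toy model in Python. This is
only an illustration of the corrected understanding, not
nasmjf code. The addresses and the names VAR_LATEST,
OLD_WORD, and NEW_WORD are invented, and memory is just a
dict of cells rather than real bytes.

```python
# Toy model: memory is a dict from address to cell value.
# Every address and name below is made up for illustration.
memory = {}
stack = []

VAR_LATEST = 1000   # address of the cell holding LATEST's value
OLD_WORD = 2000     # header of the previous word (first cell = link)
NEW_WORD = 2036     # header of the newest word (first cell = link)

memory[OLD_WORD] = 0           # link field: nothing older in this toy
memory[NEW_WORD] = OLD_WORD    # link field points at the previous header
memory[VAR_LATEST] = NEW_WORD  # the variable's value is the newest header


def latest():
    """LATEST: a variable pushes the ADDRESS of its cell."""
    stack.append(VAR_LATEST)


def fetch():
    """@ ( addr -- value ): replace an address with the cell's value."""
    stack.append(memory[stack.pop()])


def store():
    """! ( value addr -- ): write value into the cell at addr."""
    addr = stack.pop()
    memory[addr] = stack.pop()


# LATEST alone gives the variable's own address.
latest()
just_latest = stack.pop()

# LATEST @ gives the header of the newest word.
latest(); fetch()
latest_fetch = stack.pop()

# LATEST @ @ follows the link field to the previous word.
latest(); fetch(); fetch()
latest_fetch_fetch = stack.pop()

# Because LATEST leaves an address, the same word works for writing.
stack.append(3000); latest(); store()

print(just_latest, latest_fetch, latest_fetch_fetch)  # 1000 2036 2000
print(memory[VAR_LATEST])                              # 3000
```

In this model, a word that takes "the header of the word
being compiled" needs LATEST @, which is why FETCH belongs
between LATEST and HIDDEN in ':' and ';'.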