
nasmjf

A NASM assembler port of JONESFORTH
git clone http://ratfactor.com/repos/nasmjf/nasmjf.git

nasmjf/devlog/log24.txt


This log starts with neat stuff: checking and allocating
memory!

Memory allocation in Forth couldn't be less complicated.
The application has some amount of memory upon startup.
As you use it for variables, constants, strings, and new
word definitions, the HERE pointer advances to the next
unused spot.

You check how much memory is left with UNUSED and
request more with the deliciously retro-sounding command
MORECORE.

In this Linux interpreter, memory checking and
allocation are handled with the brk system call.

The "break" address (end of memory allocated for us by
Linux) minus HERE gives us the amount of unused memory:

    JONESFORTH VERSION 1
    20643 CELLS REMAINING
    OK
    GET-BRK .
    151916544
    HERE .
    134522628
    UNUSED .
    20643

That checks out. Let's add a bit more:

    1024 MORECORE
    UNUSED .
    21667

Sweet!

I did not have very much fun with the file I/O words.
Though I did manage to create an empty file with:

    S" foo.txt" R/W CREATE-FILE
    CLOSE-FILE
    BYE
    $ ls
    foo.txt

But I'm not sure what to do with a file descriptor for
writing. Words like TELL are hard-coded to use STDOUT.

Eh, no big deal. Learning how to read and write files is
not one of the goals here. :-)

But you know what _is_ a big deal? The proof-of-concept
assembler implemented near the end of jonesforth.f:

    ;CODE - ends a colon word like usual, but it
            appends a machine code NEXT (just like
            our NEXT macro in NASM) and then alters
            the "codeword" link in the just-compiled
            word to point to the "data" we compiled
            into the word definition.

You use it like so:

    : foo <machine code/assembly> ;CODE

Then Jones defines some assembly mnemonics for things
like the registers EAX, ECX, EDX, etc. and assembly
mnemonics for PUSH and POP.
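
To make the mnemonic idea concrete, here is a rough model in
Python (not Forth, and not Jones's actual definitions). It
assumes the standard 32-bit x86 encoding, where each register
has a number and PUSH/POP of a register is a single byte: a
base opcode (0x50 or 0x58) plus that number. The names
REGISTERS, push and pop are made up for this sketch.

```python
# Model of the one-byte x86 PUSH/POP encodings (32-bit mode).
# REGISTERS, push() and pop() are hypothetical names for this sketch.

REGISTERS = {"EAX": 0, "ECX": 1, "EDX": 2, "EBX": 3,
             "ESP": 4, "EBP": 5, "ESI": 6, "EDI": 7}


def push(reg):
    """PUSH r32 is the single byte 0x50 + register number."""
    return bytes([0x50 + REGISTERS[reg]])


def pop(reg):
    """POP r32 is the single byte 0x58 + register number."""
    return bytes([0x58 + REGISTERS[reg]])


if __name__ == "__main__":
    for name in ("EAX", "ECX", "EDX"):
        print(name, "PUSH ->", push(name).hex(), " POP ->", pop(name).hex())
```

Under that encoding, EAX PUSH boils down to the byte 50 and
EDX PUSH to the byte 52.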

Finally, a fun instruction for the Pentium (and later)
x86 CPUs: RDTSC, which returns a 64-bit count of clock
cycles.

The assembled word that makes use of the mnemonics looks
like this:

    : RDTSC ( -- lsb msb )
        RDTSC       ( writes the result in %edx:%eax )
        EAX PUSH    ( push lsb )
        EDX PUSH    ( push msb )
    ;CODE

Let's try it!

    RDTSC . .
    7 193238564
    RDTSC DROP RDTSC DROP SWAP - .
    8815
    RDTSC DROP RDTSC DROP SWAP - .
    9068

I'm dropping the most significant bytes since I'm
measuring smaller amounts of time (which wouldn't be
correct if the least significant bytes rolled over!)

Apparently a couple instructions take over 8,000 CPU
cycles? Very interesting!

Well, I could try adding some new x86 instructions, I
suppose. I thought about it. But I think the
proof-of-concept is plenty.

One thing's for sure: Forth with an assembler has to be
the most flexible programming system ever.

The final trick in the jonesforth.f file is INLINE:

    INLINE can be used to inline an assembler
    primitive into the current (assembler) word.

    For example:

        : 2DROP INLINE DROP INLINE DROP ;CODE

Looking at the implementation, it literally copies the
machine code from the word to be inlined up to the NEXT
macro (which is no longer needed, since execution can just
keep running into the next inlined word, and so on).

What's wild is that this is exactly what I was
contemplating when I first learned how the "threaded"
code in Forth works. (Threaded code is great when memory
and disk space are at an absolute premium. But on our
modern machines, a lot of what Forth does to save space
seems downright silly.) I thought, "why couldn't I just
copy the contents of these words rather than their
addresses?" Especially since most of the words are so
tiny, often just a handful of machine instructions. It
seems silly to JMP to them!
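
Here is a rough Python model of that copying idea (a sketch,
not the INLINE implementation from jonesforth.f). It assumes
every primitive's machine code ends with the three NEXT bytes
AD FF 20 (lodsd; jmp [eax]) and that the body of DUP is
8B 04 24 50 (mov eax,[esp]; push eax). The function name
inline is made up for illustration.

```python
# Model of inlining: copy each primitive's bytes up to (not including)
# its NEXT, then finish the new word with a single NEXT.

NEXT = bytes([0xAD, 0xFF, 0x20])  # lodsd ; jmp [eax]

# Assumed machine code for DUP: mov eax,[esp] ; push eax ; NEXT
DUP = bytes([0x8B, 0x04, 0x24, 0x50]) + NEXT


def inline(*primitives):
    """Concatenate primitive bodies without their NEXT, then add one NEXT."""
    out = bytearray()
    for code in primitives:
        end = code.find(NEXT)
        if end < 0:
            raise ValueError("primitive does not contain NEXT")
        out += code[:end]
    return bytes(out + NEXT)


if __name__ == "__main__":
    six_dup = inline(*[DUP] * 6)
    print(six_dup.hex(" "))
```

A plain byte search like this could stop early if AD FF 20
happened to show up inside an instruction's operands, so it is
only good enough as a model.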

Anyway, I want to test this out. Reading a hex dump is a
pain, but I figure with a ton of repetition, I'll be
able to see if the code is, indeed, inlined:

    JONESFORTH VERSION 1
    20643 CELLS REMAINING
    OK
    : 6DUP INLINE DUP INLINE DUP INLINE DUP INLINE DUP INLINE DUP INLINE DUP ;CODE
    42 6DUP .S
    42 42 42 42 42 42 42

It works, now I'll make a silly word to dump memory at
a word definition so I can compare them:

    : foo WORD FIND 64 DUMP ;
    foo DUP
    804A250 40 A2 4 8 3 44 55 50 6 94 4 8 50 A2 4 8 @....DUP....P...
    804A260 4 4F 56 45 52 90 90 90 D 94 4 8 5C A2 4 8 .OVER.......\...
    804A270 3 52 4F 54 15 94 4 8 6C A2 4 8 4 2D 52 4F .ROT....l....-RO
    804A280 54 90 90 90 1E 94 4 8 78 A2 4 8 5 32 44 52 T.......x....2DR
    foo 6DUP
    A011D74 D8 1C 1 A 4 36 44 55 50 0 0 0 84 1D 1 A .....6DUP.......
    A011D84 8B 4 24 50 8B 4 24 50 8B 4 24 50 8B 4 24 50 ..$P..$P..$P..$P
    A011D94 8B 4 24 50 8B 4 24 50 AD FF 20 0 74 1D 1 A ..$P..$P.. .t...
    A011DA4 3 66 6F 6F 5A 90 4 8 C4 A0 4 8 48 A1 4 8 .fooZ.......H...

Yeah, clearly the $P bit is repeated six times. And it
looks like each DUP is 4 bytes of machine code.

    8B 04 24 50

Oh yeah, I can check that out with GDB, huh?

    (gdb) disassemble /r code_DUP
    Dump of assembler code for function code_DUP:
       0x08049406 <+0>:     8b 04 24    mov    eax,DWORD PTR [esp]
       0x08049409 <+3>:     50          push   eax
       0x0804940a <+4>:     ad          lods   eax,DWORD PTR ds:[esi]
       0x0804940b <+5>:     ff 20       jmp    DWORD PTR [eax]
    End of assembler dump.

Yup, that checks out.

Well, gosh. This concludes jonesforth/jonesforth.f.

Next, I'll take a look at the test files in the
jonesforth/ dir.

Next night: okay, so jonesforth/Makefile has this test
target:

    test_%.test: test_%.f jonesforth
        @echo -n "$< ... "
        @rm -f .$@
        @cat <(echo ': TEST-MODE ;') jonesforth.f $< <(echo 'TEST') | \
          ./jonesforth 2>&1 | \
          sed 's/DSP=[0-9]*//g' > .$@
        @diff -u .$@ $<.out
        @rm -f .$@
        @echo "ok"

So make isn't my favorite thing, but I understand that
it's going to run a selected <test>.f file and write the
output to <test>.test and diff it with <test>.out, which
contains the expected output.

The 'TEST-MODE' word definition simply causes JONESFORTH
to not display its welcome message. Hmmm...that's a
little tricky because my port runs that before anything
from STDIN. There are ways to make that work, but I'm
thinking I'll just make my test script ignore the
welcome instead.

Then the 'TEST' invocation runs whatever test word was
defined in the <test>.f file. Which it could have done
itself. But at least I can do that like the makefile
does!

Okay, let's see if we can do some basic script input
first:

    cat <(echo 'CR ." BEEP BOOP. Test mode activated." CR ') | ./nasmjf

    JONESFORTH VERSION 1
    20643 CELLS REMAINING
    OK
    BEEP BOOP. Test mode activated.
    $

LOL. Awesome.

I sometimes forget that this is now a "real" program and
it wouldn't take much to make it a somewhat useful UNIX
citizen...

Anyway, now I just need to feed in each test file,
followed by the TEST invocation.

I'll use sed to skip the welcome message (and my silly
"test mode" message, which I'm totally keeping in
there).
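
The clean-up step could look something like this in Python
(the log itself uses sed; this is only a sketch of the same
idea). It assumes the welcome banner, when present, is the
run of leading lines ending with a lone "OK", and that stack
traces contain DSP= followed by digits, as in the Makefile's
sed expression. The helper name normalize is hypothetical.

```python
import re


def normalize(output):
    """Drop the welcome banner and strip DSP=<digits> from stack traces."""
    lines = output.splitlines()
    # Assumption: the banner is the leading lines up to and including "OK".
    if lines and lines[0].startswith("JONESFORTH VERSION"):
        for i, line in enumerate(lines):
            if line.strip() == "OK":
                lines = lines[i + 1:]
                break
    text = "\n".join(lines)
    # Same substitution as the Makefile's sed 's/DSP=[0-9]*//g'
    return re.sub(r"DSP=[0-9]*", "", text)


if __name__ == "__main__":
    # Made-up sample input, modeled on the output shown in this log.
    sample = (
        "JONESFORTH VERSION 1\n"
        "20643 CELLS REMAINING\n"
        "OK\n"
        "TEST3+8 CATCH ( DSP=3218223136 ) TEST2+8\n"
    )
    print(normalize(sample))
```

Stripping the DSP value matters because the stack pointer
differs from run to run, so leaving it in would make every
stack trace test fail the diff.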

Here's what the output looks like (condensed a bit to
make it more compact):

    eeepc:~/nasmjf$ ./test.sh
    2DROP: 2 1
    t e s t i n g
    0 1 0 1 1 0
    1 0 1 0 0 1
    1 1 1 0 1 0 1 1 0
    1 1 1 1 0 1 0 0 1
    1 0 1 0 1
    0 1 0 1 0
    0 1 0
    1 0 1
    0 0 1
    1 0 0
    0 1 1
    1 1 0
    TEST4+0 TEST3+8 CATCH+28 CATCH ( DSP=3218223136 ) TEST2+8 TEST+0
    TEST4+0 TEST3+20 CATCH+28 CATCH ( DSP=3218223136 ) TEST2+8 TEST+0
    TEST3 threw exception 26
    TEST4+0 TEST3+8 TEST2+68 TEST+0
    TEST4+0 TEST3+20 TEST2+68 TEST+0
    UNCAUGHT THROW 26
    123
    -127
    7FF77FF7
    -1111111111101110111111111110111
    7FF77FF7
    test_read_file.f.out: ERRNO=2
    0
    42 42
    0
    1 2
    1 2 1
    2 1 3
    1 3 2
    2 1
    4 3 4 3 2 1
    2 1 4 3
    0
    TEST4+0 TEST3+0 TEST2+0 TEST+0
    3
    TEST4+0 TEST3+32 TEST2+0 TEST+0
    TEST4+0 TEST3+0 TEST2+4 TEST+0
    3
    TEST4+0 TEST3+32 TEST2+4 TEST+0

As far as I know, that's all good except the failure to
open "test_read_file.f.out", which is because I'm running
the tests one directory up from where they would normally
be run.

I'll just go ahead and modify the test:

    - S" test_read_file.f.out" R/O OPEN-FILE
    + S" jonesforth/test_read_file.f.out" R/O OPEN-FILE

Now I compare that output with what's expected in
Jones's .out files by adding a call to diff to my loop
and we'll see if any fail. After a couple tweaks
(whitespace, adding the stripping of "DSP=nnnnn" from
the stack trace tests using sed, etc.), I was able to get
a nice clean run!

    eeepc:~/nasmjf$ ./test.sh
    Testing: jonesforth/test_assembler.f
    Testing: jonesforth/test_comparison.f
    Testing: jonesforth/test_exception.f
    Testing: jonesforth/test_number.f
    Testing: jonesforth/test_read_file.f
    Testing: jonesforth/test_stack.f
    Testing: jonesforth/test_stack_trace.f

Well, then I call this port complete! I'm going to clean
up the assembly source (which is a royal mess) and see
if I can't maybe improve on the comments, etc. And
probably the README as well.

I've got more fun Forth stuff planned next.

    : bye BEGIN ." Goodbye! " AGAIN ;
    bye
    Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye!
    oodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye!
    odbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! G
    dbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Go
    bye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goo
    ye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Good
    e! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodb
    ! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodby
    Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye
    Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye!
    oodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye!
    odbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! G
    dbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Go
    bye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goo
    ye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Good
    e! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodb
    ! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodby
    Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye
    Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye!
    oodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye!
    odbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! G
    dbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Go
    bye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goo
    ye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Good
    e! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodb
    ! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodby