Dave's nasmjf Dev Log 24
Created: 2022-07-22
This is an entry in my developer’s log series written between December 2021 and August 2022 (started project in September). I wrote these as I completed my port of the JONESFORTH assembly language Forth interpreter.
This log starts with neat stuff: checking and allocating
memory!
Memory alloction in Forth couldn't be less complicated.
The application has some amount of memory upon startup.
As you use it for variables, constants, strings, and new
word definitions, the HERE pointer advances to the next
unused spot.
You check how much memory is left with UNUSED and
request more with the deliciously retro-sounding command
MORECORE.
In this Linux interpreter, memory checking and
allocation is handled with the brk system call.
The "break" address (end of memory allocated for us by
Linux) minus HERE gives us the amount of unused memory:
JONESFORTH VERSION 1
20643 CELLS REMAINING
OK
GET-BRK .
151916544
HERE .
134522628
UNUSED .
20643
That checks out. Let's add a bit more:
1024 MORECORE
UNUSED .
21667
Sweet!
I did not have very much fun with the file io words.
Though I did manage to create an empty file with:
S" foo.txt" R/W CREATE-FILE
CLOSE-FILE
BYE
$ ls
foo.txt
But I'm not sure what to do with a file descriptor for
writing. Words like TELL are hard-coded to use STDOUT.
Eh, no big deal. Learning how to read and write files is
not one of goals here. :-)
But you know what _is_ a big deal? The proof-of-concept
assembler implemented near the end of jonesforth.f:
;CODE - ends a colon word like usual, but it
appends a machine code NEXT (just like
our NEXT macro in NASM) and then alters
the "codeword" link in the just-compiled
word to point to the "data" we compiled
into the word definition.
You use it like so:
: foo <machine code/assembly> ;CODE
Then Jones defines some assembly mnemonics for things
like the registers EAX, ECD, EDX, etc. and assembly
mnemonics for PUSH and POP.
Finally, a fun instruction for the Pentium (and later)
x86 CPUs, RDTSC that returns 64 bits worth of clock
cycles.
The assembled word that makes use of the mnemonics looks
like this:
: RDTSC ( -- lsb msb )
RDTSC ( writes the result in %edx:%eax )
EAX PUSH ( push lsb )
EDX PUSH ( push msb )
;CODE
Let's try it!
RDTSC . .
7 193238564
RDTSC DROP RDTSC DROP SWAP - .
8815
RDTSC DROP RDTSC DROP SWAP - .
9068
I'm dropping the most significant bytes since I'm
measuring smaller amounts of time (which wouldn't be
correct if the least significant bytes rolled over!)
Apparently a couple instructions takes over 8,000 CPU
cycles? Very interesting!
Well, I could try adding some new x86 instructions, I
suppose. I thought about it. But I think the
proof-of-concept is plenty.
One thing's for sure, Forth with an assembler is surely
the most flexible programming system ever.
The final trick in the jonesforth.f file is INLINE:
INLINE can be used to inline an assembler
primitive into the current (assembler) word.
For example:
: 2DROP INLINE DROP INLINE DROP ;CODE
Looking at the implementation, it literally copies the
machine code from the word to be inlined until the next
macro (which is no longer needed since the code can just
keep running to the next word and so on.
What's wild is that this is exactly what I was
contemplating when I first learned how the "threaded"
code in Forth works. (Threaded code is great when memory
and disk space are at an absolute premium. But on our
modern machines, a lot of what Forth does to save space
seems downright silly.) I thought, "why couldn't I just
copy the contents of these words rather than their
addresses?" Especially since most of the words are so
tiny, often just a handful of machine instructions. It
seems silly to JMP to them!
Anyway, I want to test this out. Reading a hex dump is a
pain, but I figure with a ton of repetition, I'll be
able to see if the code is, indeed, inlined:
JONESFORTH VERSION 1
20643 CELLS REMAINING
OK
: 6DUP INLINE DUP INLINE DUP INLINE DUP INLINE DUP INLINE DUP INLINE DUP ;CODE
42 6DUP .S
42 42 42 42 42 42 42
It works, now I'll make a silly word to dump memory at
a word definition so I can compare them:
: foo WORD FIND 64 DUMP ;
foo DUP
804A250 40 A2 4 8 3 44 55 50 6 94 4 8 50 A2 4 8 @....DUP....P...
804A260 4 4F 56 45 52 90 90 90 D 94 4 8 5C A2 4 8 .OVER.......\...
804A270 3 52 4F 54 15 94 4 8 6C A2 4 8 4 2D 52 4F .ROT....l....-RO
804A280 54 90 90 90 1E 94 4 8 78 A2 4 8 5 32 44 52 T.......x....2DR
foo 6DUP
A011D74 D8 1C 1 A 4 36 44 55 50 0 0 0 84 1D 1 A .....6DUP.......
A011D84 8B 4 24 50 8B 4 24 50 8B 4 24 50 8B 4 24 50 ..$P..$P..$P..$P
A011D94 8B 4 24 50 8B 4 24 50 AD FF 20 0 74 1D 1 A ..$P..$P.. .t...
A011DA4 3 66 6F 6F 5A 90 4 8 C4 A0 4 8 48 A1 4 8 .fooZ.......H...
Yeah, clearly the $P bit is repeated six times. And it
looks like each DUP is 4 bytes of machine code.
8B 04 24 50
Oh yeah, I can check that out with GDB, huh?
(gdb) disassemble /r code_DUP
Dump of assembler code for function code_DUP:
0x08049406 <+0>: 8b 04 24 mov eax,DWORD PTR [esp]
0x08049409 <+3>: 50 push eax
0x0804940a <+4>: ad lods eax,DWORD PTR ds:[esi]
0x0804940b <+5>: ff 20 jmp DWORD PTR [eax]
End of assembler dump.
Yup, that checks out.
Well, gosh. This concludes jonesforth/jonesforth.f.
Next, I'll take a look at the test files in the
jonesforth/ dir.
Next night: okay, so jonesforth/Makefile has this test
target:
test_%.test: test_%.f jonesforth
@echo -n "$< ... "
@rm -f .$@
@cat <(echo ': TEST-MODE ;') jonesforth.f $< <(echo 'TEST') | \
./jonesforth 2>&1 | \
sed 's/DSP=[0-9]*//g' > .$@
@diff -u .$@ $<.out
@rm -f .$@
@echo "ok"
So make isn't my favorite thing, but I understand that
it's going to run a selected <test>.f file and write the
output to <test>.test and diff it with <test>.out, which
contains the expected output.
The 'TEST-MODE' word definition simply causes JONESFORTH
to not display its welcome message. Hmmm...that's a
little tricky because my port runs that before anything
from STDIN. There are ways to make that work, but I'm
thinking I'll just make my test script ignore the
welcome instead.
Then the 'TEST' invocation runs whatever test word was
defined in the <test>.f file. Which it could have done
itself. But at least I can do that like the makefile
does!
Okay, let's see if we can do some basic script input
first:
cat <(echo 'CR ." BEEP BOOP. Test mode activated." CR ') | ./nasmjf
JONESFORTH VERSION 1
20643 CELLS REMAINING
OK
BEEP BOOP. Test mode activated.
$
LOL. Awesome.
I sometimes forget that this is now a "real" program and
it would't take much to make it a somewhat useful UNIX
citizen...
Anyway, now I just need to redirect each test file in
followed by the TEST invocation.
I'll use sed to skip the welcome message (and my silly
"test mode" message, which I'm totally keeping in
there).
Here's what the output looks like (condensed a bit to
make it a bit more compact to look at:
eeepc:~/nasmjf$ ./test.sh
2DROP: 2 1
t e s t i n g
0 1 0 1 1 0
1 0 1 0 0 1
1 1 1 0 1 0 1 1 0
1 1 1 1 0 1 0 0 1
1 0 1 0 1
0 1 0 1 0
0 1 0
1 0 1
0 0 1
1 0 0
0 1 1
1 1 0
TEST4+0 TEST3+8 CATCH+28 CATCH ( DSP=3218223136 ) TEST2+8 TEST+0
TEST4+0 TEST3+20 CATCH+28 CATCH ( DSP=3218223136 ) TEST2+8 TEST+0
TEST3 threw exception 26
TEST4+0 TEST3+8 TEST2+68 TEST+0
TEST4+0 TEST3+20 TEST2+68 TEST+0
UNCAUGHT THROW 26
123
-127
7FF77FF7
-1111111111101110111111111110111
7FF77FF7
test_read_file.f.out: ERRNO=2
0
42 42
0
1 2
1 2 1
2 1 3
1 3 2
2 1
4 3 4 3 2 1
2 1 4 3
0
TEST4+0 TEST3+0 TEST2+0 TEST+0
3
TEST4+0 TEST3+32 TEST2+0 TEST+0
TEST4+0 TEST3+0 TEST2+4 TEST+0
3
TEST4+0 TEST3+32 TEST2+4 TEST+0
As far as I know, that's all good except the failure to
open "test_read_file.f.out" which will be due to the
fact that I'm running the tests up a directory from
where they would normally be run.
I'll just go ahead and modify the test:
- S" test_read_file.f.out" R/O OPEN-FILE
+ S" jonesforth/test_read_file.f.out" R/O OPEN-FILE
Now I compare that output with what's expected in
Jones's .out files by adding a call to diff to my loop
and we'll see if any fail. After a couple tweaks
(whitespace, adding the stripping of "DSP=nnnnn" from
the stack trace tests using sed, etc.), I was able to get
a nice clean run!
eeepc:~/nasmjf$ ./test.sh
Testing: jonesforth/test_assembler.f
Testing: jonesforth/test_comparison.f
Testing: jonesforth/test_exception.f
Testing: jonesforth/test_number.f
Testing: jonesforth/test_read_file.f
Testing: jonesforth/test_stack.f
Testing: jonesforth/test_stack_trace.f
Well, then I call this port complete! I'm going to clean
up the assembly source (which is a royal mess) and see
if I can't maybe improve on the comments, etc. And
probably the README as well.
I've got more fun Forth stuff planned next.
: bye BEGIN ." Goodbye! " AGAIN ;
bye
Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye!
oodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye!
odbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! G
dbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Go
bye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goo
ye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Good
e! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodb
! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodby
Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye
Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye!
oodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye!
odbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! G
dbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Go
bye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goo
ye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Good
e! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodb
! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodby
Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye
Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye!
oodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye!
odbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! G
dbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Go
bye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goo
ye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Good
e! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodb
! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodby