Dave's nasmjf Dev Log 24

Created: 2022-07-22

nasmjf home

This is an entry in my developer’s log series written between December 2021 and August 2022 (started project in September). I wrote these as I completed my port of the JONESFORTH assembly language Forth interpreter.

← Previous 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 Next →
    This log starts with neat stuff: checking and allocating
    memory!

    Memory alloction in Forth couldn't be less complicated.
    The application has some amount of memory upon startup.
    As you use it for variables, constants, strings, and new
    word definitions, the HERE pointer advances to the next
    unused spot.

    You check how much memory is left with UNUSED and
    request more with the deliciously retro-sounding command
    MORECORE.

    In this Linux interpreter, memory checking and
    allocation is handled with the brk system call.

    The "break" address (end of memory allocated for us by
    Linux) minus HERE gives us the amount of unused memory:

JONESFORTH VERSION 1
20643 CELLS REMAINING
OK
GET-BRK .
151916544
HERE .
134522628
UNUSED .
20643

    That checks out. Let's add a bit more:

1024 MORECORE
UNUSED .
21667

    Sweet!

    I did not have very much fun with the file io words.
    Though I did manage to create an empty file with:

S" foo.txt" R/W CREATE-FILE
CLOSE-FILE
BYE
$ ls
foo.txt

    But I'm not sure what to do with a file descriptor for
    writing. Words like TELL are hard-coded to use STDOUT.

    Eh, no big deal. Learning how to read and write files is
    not one of goals here. :-)

    But you know what _is_ a big deal? The proof-of-concept
    assembler implemented near the end of jonesforth.f:

        ;CODE - ends a colon word like usual, but it
                appends a machine code NEXT (just like
                our NEXT macro in NASM) and then alters
                the "codeword" link in the just-compiled
                word to point to the "data" we compiled
                into the word definition.

    You use it like so:

        : foo <machine code/assembly> ;CODE

    Then Jones defines some assembly mnemonics for things
    like the registers EAX, ECD, EDX, etc. and assembly
    mnemonics for PUSH and POP.

    Finally, a fun instruction for the Pentium (and later)
    x86 CPUs, RDTSC that returns 64 bits worth of clock
    cycles.

    The assembled word that makes use of the mnemonics looks
    like this:


        : RDTSC ( -- lsb msb )
                RDTSC        ( writes the result in %edx:%eax )
                EAX PUSH     ( push lsb )
                EDX PUSH     ( push msb )
        ;CODE

    Let's try it!

RDTSC . .
7 193238564
RDTSC DROP RDTSC DROP SWAP - .
8815
RDTSC DROP RDTSC DROP SWAP - .
9068

    I'm dropping the most significant bytes since I'm
    measuring smaller amounts of time (which wouldn't be
    correct if the least significant bytes rolled over!)

    Apparently a couple instructions takes over 8,000 CPU
    cycles? Very interesting!

    Well, I could try adding some new x86 instructions, I
    suppose. I thought about it. But I think the
    proof-of-concept is plenty.

    One thing's for sure, Forth with an assembler is surely
    the most flexible programming system ever.

    The final trick in the jonesforth.f file is INLINE:


	INLINE can be used to inline an assembler
        primitive into the current (assembler) word.

	For example:

		: 2DROP INLINE DROP INLINE DROP ;CODE

    Looking at the implementation, it literally copies the
    machine code from the word to be inlined until the next
    macro (which is no longer needed since the code can just
    keep running to the next word and so on.

    What's wild is that this is exactly what I was
    contemplating when I first learned how the "threaded"
    code in Forth works. (Threaded code is great when memory
    and disk space are at an absolute premium. But on our
    modern machines, a lot of what Forth does to save space
    seems downright silly.)  I thought, "why couldn't I just
    copy the contents of these words rather than their
    addresses?" Especially since most of the words are so
    tiny, often just a handful of machine instructions. It
    seems silly to JMP to them!

    Anyway, I want to test this out. Reading a hex dump is a
    pain, but I figure with a ton of repetition, I'll be
    able to see if the code is, indeed, inlined:

JONESFORTH VERSION 1
20643 CELLS REMAINING
OK
: 6DUP INLINE DUP INLINE DUP INLINE DUP INLINE DUP INLINE DUP INLINE DUP ;CODE
42 6DUP .S
42 42 42 42 42 42 42

    It works, now I'll make a silly word to dump memory at
    a word definition so I can compare them:

: foo WORD FIND 64 DUMP ;
foo DUP
 804A250 40 A2  4  8  3 44 55 50  6 94  4  8 50 A2  4  8 @....DUP....P...
 804A260  4 4F 56 45 52 90 90 90  D 94  4  8 5C A2  4  8 .OVER.......\...
 804A270  3 52 4F 54 15 94  4  8 6C A2  4  8  4 2D 52 4F .ROT....l....-RO
 804A280 54 90 90 90 1E 94  4  8 78 A2  4  8  5 32 44 52 T.......x....2DR
foo 6DUP
 A011D74 D8 1C  1  A  4 36 44 55 50  0  0  0 84 1D  1  A .....6DUP.......
 A011D84 8B  4 24 50 8B  4 24 50 8B  4 24 50 8B  4 24 50 ..$P..$P..$P..$P
 A011D94 8B  4 24 50 8B  4 24 50 AD FF 20  0 74 1D  1  A ..$P..$P.. .t...
 A011DA4  3 66 6F 6F 5A 90  4  8 C4 A0  4  8 48 A1  4  8 .fooZ.......H...

    Yeah, clearly the $P bit is repeated six times. And it
    looks like each DUP is 4 bytes of machine code.

        8B 04 24 50

    Oh yeah, I can check that out with GDB, huh?

(gdb) disassemble /r code_DUP
Dump of assembler code for function code_DUP:
   0x08049406 <+0>:     8b 04 24        mov    eax,DWORD PTR [esp]
   0x08049409 <+3>:     50      push   eax
   0x0804940a <+4>:     ad      lods   eax,DWORD PTR ds:[esi]
   0x0804940b <+5>:     ff 20   jmp    DWORD PTR [eax]
End of assembler dump.

    Yup, that checks out.

    Well, gosh. This concludes jonesforth/jonesforth.f.

    Next, I'll take a look at the test files in the
    jonesforth/ dir.

    Next night: okay, so jonesforth/Makefile has this test
    target:

        test_%.test: test_%.f jonesforth
                @echo -n "$< ... "
                @rm -f .$@
                @cat <(echo ': TEST-MODE ;') jonesforth.f $< <(echo 'TEST') | \
                  ./jonesforth 2>&1 | \
                  sed 's/DSP=[0-9]*//g' > .$@
                @diff -u .$@ $<.out
                @rm -f .$@
                @echo "ok"

    So make isn't my favorite thing, but I understand that
    it's going to run a selected <test>.f file and write the
    output to <test>.test and diff it with <test>.out, which
    contains the expected output.

    The 'TEST-MODE' word definition simply causes JONESFORTH
    to not display its welcome message. Hmmm...that's a
    little tricky because my port runs that before anything
    from STDIN. There are ways to make that work, but I'm
    thinking I'll just make my test script ignore the
    welcome instead.

    Then the 'TEST' invocation runs whatever test word was
    defined in the <test>.f file. Which it could have done
    itself. But at least I can do that like the makefile
    does!

    Okay, let's see if we can do some basic script input
    first:

cat <(echo 'CR ." BEEP BOOP. Test mode activated." CR ') | ./nasmjf

JONESFORTH VERSION 1
20643 CELLS REMAINING
OK
BEEP BOOP. Test mode activated.
$

    LOL. Awesome.

    I sometimes forget that this is now a "real" program and
    it would't take much to make it a somewhat useful UNIX
    citizen...

    Anyway, now I just need to redirect each test file in
    followed by the TEST invocation.

    I'll use sed to skip the welcome message (and my silly
    "test mode" message, which I'm totally keeping in
    there).

    Here's what the output looks like (condensed a bit to
    make it a bit more compact to look at:

eeepc:~/nasmjf$ ./test.sh
2DROP: 2 1
t e s t i n g
0 1 0 1 1 0
1 0 1 0 0 1
1 1 1 0 1 0 1 1 0
1 1 1 1 0 1 0 0 1
1 0 1 0 1
0 1 0 1 0
0 1 0
1 0 1
0 0 1
1 0 0
0 1 1
1 1 0
TEST4+0 TEST3+8 CATCH+28 CATCH ( DSP=3218223136 ) TEST2+8 TEST+0
TEST4+0 TEST3+20 CATCH+28 CATCH ( DSP=3218223136 ) TEST2+8 TEST+0
TEST3 threw exception 26
TEST4+0 TEST3+8 TEST2+68 TEST+0
TEST4+0 TEST3+20 TEST2+68 TEST+0
UNCAUGHT THROW 26
123
-127
7FF77FF7
-1111111111101110111111111110111
7FF77FF7
test_read_file.f.out: ERRNO=2
0
42 42
0
1 2
1 2 1
2 1 3
1 3 2
2 1
4 3 4 3 2 1
2 1 4 3
0
TEST4+0 TEST3+0 TEST2+0 TEST+0
3
TEST4+0 TEST3+32 TEST2+0 TEST+0
TEST4+0 TEST3+0 TEST2+4 TEST+0
3
TEST4+0 TEST3+32 TEST2+4 TEST+0

    As far as I know, that's all good except the failure to
    open "test_read_file.f.out" which will be due to the
    fact that I'm running the tests up a directory from
    where they would normally be run.

    I'll just go ahead and modify the test:

        - S" test_read_file.f.out" R/O OPEN-FILE
        + S" jonesforth/test_read_file.f.out" R/O OPEN-FILE

    Now I compare that output with what's expected in
    Jones's .out files by adding a call to diff to my loop
    and we'll see if any fail. After a couple tweaks
    (whitespace, adding the stripping of "DSP=nnnnn" from
    the stack trace tests using sed, etc.), I was able to get
    a nice clean run!

eeepc:~/nasmjf$ ./test.sh
Testing: jonesforth/test_assembler.f
Testing: jonesforth/test_comparison.f
Testing: jonesforth/test_exception.f
Testing: jonesforth/test_number.f
Testing: jonesforth/test_read_file.f
Testing: jonesforth/test_stack.f
Testing: jonesforth/test_stack_trace.f

    Well, then I call this port complete! I'm going to clean
    up the assembly source (which is a royal mess) and see
    if I can't maybe improve on the comments, etc. And
    probably the README as well.

    I've got more fun Forth stuff planned next.

: bye BEGIN ." Goodbye! " AGAIN ;
bye
Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye!
oodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye!
odbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! G
dbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Go
bye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goo
ye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Good
e! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodb
! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodby
 Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye
Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye!
oodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye!
odbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! G
dbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Go
bye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goo
ye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Good
e! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodb
! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodby
 Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye
Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye!
oodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye!
odbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! G
dbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Go
bye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goo
ye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Good
e! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodb
! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodbye! Goodby
← Previous 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 Next →