nasmjf is my NASM assembler port of the JONESFORTH (wrmj.wordpress.com) Forth interpreter by Richard W.M. Jones.
JONESFORTH is a lavishly commented Forth implementation in assembly and…Forth. Between the comments, ASCII diagrams, and source, you can learn how a typical Forth implementation works! That’s not an easy task, by the way, because Forth is made of a bunch of little independent moving parts. More about that in a minute.
My port is 1,600 lines of (heavily commented) i386 assembly. It runs the original, unaltered jonesforth.f (1,700 lines of Forth) to form a usable base Forth implementation.
Here’s the nasmjf repo (github.com).
Here’s Assembly Nights, my article about the pleasure of writing nasmjf. (In a lot of ways, the process was more important than the product.)
Compiling and running a "hello world" word:
$ ./nasmjf
JONESFORTH VERSION 1
20643 CELLS REMAINING
OK
: hello ." Hello world!" CR ;
hello
Hello world!
BYE
$
Traditional Forth is a "threaded interpreted language" (TIL). All functionality is implemented in functional units called "words". Nearly all of them can be executed as you type them in the interpreter.
The lowest level words (aka "code words") are written in machine code. Higher level words are composed of other words. More specifically, they’re composed of the addresses of other words. This is called "threading" and has nothing to do with "threads" (wikipedia.org) in the concurrent programming sense.
Every word contains its own "interpreter", which may be as little as a single machine code instruction (in the case of "code words", it simply passes execution to the rest of the word). Code words pass execution to the next word using a "NEXT" macro, which updates a pointer (much like the CPU’s instruction pointer). Higher level "regular" words "EXIT" back to whatever called them, and ultimately to the outer interpreter (via a return stack).
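To make the NEXT/EXIT dance concrete, here is a toy model in Python rather than assembly. It is only a sketch under loud assumptions: list indices stand in for addresses, a central loop plays the role of NEXT (in real JONESFORTH, NEXT is a macro at the end of each code word), and helper names such as `code_word`, `colon_word`, `HALT`, and `FOURTH` are made up for the illustration.

```python
# Toy model of indirect threading: list indices stand in for addresses.
mem = []      # each cell holds a codeword (Python function) or an address (int)
stack = []    # data stack
rstack = []   # return stack
vm = {"ip": 0, "running": False}

def code_word(fn):
    """A code word is just its codeword cell, which is the primitive itself."""
    mem.append(fn)
    return len(mem) - 1

def docol(word_addr):
    """The 'interpreter' of a colon word: save ip, then enter the word's body."""
    rstack.append(vm["ip"])
    vm["ip"] = word_addr + 1

def colon_word(*addresses):
    """A colon word is docol, then the addresses of other words, then EXIT."""
    start = len(mem)
    mem.append(docol)
    mem.extend(addresses)
    mem.append(EXIT)
    return start

def do_exit(_addr):
    vm["ip"] = rstack.pop()          # resume the word that called us

def do_halt(_addr):
    vm["running"] = False            # hypothetical helper to stop the toy loop

def do_dup(_addr):
    stack.append(stack[-1])

def do_mul(_addr):
    stack.append(stack.pop() * stack.pop())

EXIT = code_word(do_exit)
HALT = code_word(do_halt)
DUP = code_word(do_dup)
MUL = code_word(do_mul)
SQUARE = colon_word(DUP, MUL)        # like  : SQUARE DUP * ;
FOURTH = colon_word(SQUARE, SQUARE)  # a word made of the addresses of words

def run(word_addr):
    boot = len(mem)
    mem.extend([word_addr, HALT])    # tiny bootstrap thread
    vm["ip"] = boot
    vm["running"] = True
    while vm["running"]:
        target = mem[vm["ip"]]       # NEXT: fetch the address of the next word...
        vm["ip"] += 1                # ...advance the pointer...
        mem[target](target)          # ...and jump through that word's codeword

stack.append(3)
run(FOURTH)
print(stack)  # [81]
```

Note that `FOURTH` contains no code at all, only addresses; everything it "does" happens because each address leads to a codeword that knows how to run the rest of that word.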
One of the earliest "Huh?" moments I had while reading JONESFORTH was trying to understand why the whole system starts with "QUIT". But that’s the outer interpreter loop and when you’re in QUIT, you have "quit" from interpreting a word and are ready for the next one…
So, yeah, there are two levels of interpreter: the word-level ones that handle the execution of each type of word, and an outer one that takes a stream of space-separated tokens (confusingly, also called "words"). The outer interpreter can be in one of two modes: immediate mode, where every word is executed as soon as it’s entered; or compile mode, where each word’s address is "compiled" into the memory where, presumably, you’re defining a new word. I say "presumably", because all of this behavior can be overridden at any time for special effects.
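The two modes of the outer interpreter can be sketched in a few lines of Python. This is a simplified model, not how JONESFORTH is written: `ToyForth` and its methods are hypothetical names, `:` and `;` are special-cased here (in JONESFORTH they are ordinary words, with `;` flagged IMMEDIATE), and "compiling an address" is modeled by appending a reference to the word's entry.

```python
class ToyForth:
    def __init__(self):
        self.stack = []
        self.compiling = False     # False: immediate mode, True: compile mode
        self.name = None           # name of the word being defined
        self.body = None           # "memory" the new word is compiled into
        s = self.stack
        self.words = {
            "DUP": lambda: s.append(s[-1]),
            "+": lambda: s.append(s.pop() + s.pop()),
            "*": lambda: s.append(s.pop() * s.pop()),
        }

    def run(self, entry):
        """Execute a primitive (callable) or a compiled body (list)."""
        if callable(entry):
            entry()
        else:
            for item in entry:
                if isinstance(item, int):
                    self.stack.append(item)   # compiled literal
                else:
                    self.run(item)

    def interpret(self, line):
        tokens = iter(line.split())
        for tok in tokens:
            if tok == ":":
                self.name, self.body = next(tokens), []
                self.compiling = True
            elif tok == ";":
                self.words[self.name] = self.body
                self.compiling = False
            else:
                item = self.words[tok] if tok in self.words else int(tok)
                if self.compiling:
                    self.body.append(item)    # compile mode: store, don't run
                elif isinstance(item, int):
                    self.stack.append(item)   # immediate mode: push number
                else:
                    self.run(item)            # immediate mode: run right away

forth = ToyForth()
forth.interpret(": SQUARE DUP * ;")
forth.interpret("7 SQUARE 1 +")
print(forth.stack)  # [50]
```

The same token (`DUP`, say) either runs or gets stored depending on nothing more than the `compiling` flag, which is the whole trick.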
The incredible flexibility of Forth comes not just from being composed of a bunch of separate little mechanisms, but from the fact that you can coerce them into doing whatever you can dream up while the system is running. For example, to add syntactic elements to your Forth system, you can add new IMMEDIATE words to your dictionary. These run immediately, even in compile mode! Because of this, you can manipulate the incoming tokens to do whatever you want.
This sort of self-modifying system is not merely encouraged, it’s how a traditional Forth is created!
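Here is a minimal Python sketch of why IMMEDIATE words are enough to add syntax. It models compile mode only, and everything in it is an assumption made for illustration: `compile_definition` is a hypothetical helper, compiled words are stored as names rather than addresses, and `TWICE` is an invented word. The `(` comment word is real Forth, though, and JONESFORTH does define it as an IMMEDIATE word in jonesforth.f.

```python
def compile_definition(source, words, immediate):
    """Compile mode only: return the compiled body for a new word."""
    tokens = iter(source.split())
    body = []
    for tok in tokens:
        if tok in immediate:
            # IMMEDIATE: runs right now, with access to the rest of the
            # token stream and to the definition being built.
            words[tok](tokens, body)
        elif tok in words:
            body.append(tok)          # ordinary word: compile it
        else:
            body.append(int(tok))     # number: compile a literal
    return body

def paren_comment(tokens, body):
    """( ... ) comment: swallow tokens up to the closing paren."""
    for tok in tokens:
        if tok == ")":
            break

def twice(tokens, body):
    """Hypothetical syntax: TWICE FOO compiles FOO two times."""
    word = next(tokens)
    body.extend([word, word])

words = {"DUP": None, "*": None, "(": paren_comment, "TWICE": twice}
immediate = {"(", "TWICE"}

body = compile_definition("DUP ( n n ) * TWICE DUP", words, immediate)
print(body)  # ['DUP', '*', 'DUP', 'DUP']
```

Neither the comment nor `TWICE` leaves any trace of itself in the compiled body; they only change what ends up there.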
Figuring out if the code I was looking at was going to execute at compile
time or runtime and what that meant was often one of the hardest things for me
to wrap my head around, particularly in the early parts of jonesforth.f, where
fundamental control structure words like IF … THEN and BEGIN … UNTIL are
defined.
Many of Forth’s idiosyncratic features are best understood from the context of
the late 1960s and early 1970s computing environment in which Charles H. Moore
conceived it. Such machines were extremely limited in main memory, so being
able to have only the parts of the language you needed made lots of sense, as
did storing words as a series of addresses rather than explicit calls to
subroutines (because just foo is shorter than call foo, even though you
have to write more code to execute it).
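The size argument is easy to put numbers on. The figures below are for i386 (not the machines Moore was using) and are only meant to show the shape of the trade-off: a threaded body stores one 4-byte address per word, while a near `call` with a 32-bit displacement takes 5 bytes (opcode E8 plus the displacement). The function names are made up, and per-word overhead such as the codeword and EXIT versus a `ret` is ignored.

```python
CELL_BYTES = 4        # one 32-bit address per word in a threaded body
CALL_REL32_BYTES = 5  # i386 near call: opcode E8 + 32-bit displacement

def threaded_size(n_words):
    """Bytes needed for a body that is just a list of addresses."""
    return n_words * CELL_BYTES

def subroutine_size(n_words):
    """Bytes needed for the same body written as explicit call instructions."""
    return n_words * CALL_REL32_BYTES

for n in (10, 100, 1000):
    saved = subroutine_size(n) - threaded_size(n)
    print(n, threaded_size(n), subroutine_size(n), saved)
```

On a machine with only a few kilobytes of memory, saving one byte in five across the whole system is not a rounding error.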
Because Richard Jones released his work as public domain, it’s only right that I should release my port also as public domain. So it is.
What’s different about my nasmjf port?
Like the original, my port also targets 32-bit x86 (i386) Linux but is a re-write in NASM (nasm.us), the excellent "Netwide Assembler".
NASM is very nice and the documentation is excellent. NASM’s macro system allowed me to make some improvements such as automatically calculating the length of word-creation macro string parameters rather than have to count them and manually enter them myself!
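For a sense of what that length is used for, here is a rough Python sketch (not the NASM macro itself) of a JONESFORTH-style dictionary header: a 4-byte link to the previous word, one byte holding flags plus the name length, the name, and padding to a 4-byte boundary. The function name `word_header` is hypothetical; the point is simply that the length is computed from the name instead of being typed in by hand.

```python
import struct

F_IMMED = 0x80     # flag bits as used by JONESFORTH
F_LENMASK = 0x1F   # the name length lives in the low 5 bits

def word_header(name, link, flags=0):
    """Build the bytes of a dictionary header for `name`."""
    name_bytes = name.encode("ascii")
    if len(name_bytes) > F_LENMASK:
        raise ValueError("name too long for the length field")
    # Little-endian 32-bit link pointer, then the flags+length byte.
    header = struct.pack("<IB", link, flags | len(name_bytes)) + name_bytes
    padding = -len(header) % 4          # pad up to the next 4-byte boundary
    return header + b"\x00" * padding

print(word_header("DUP", 0))
print(len(word_header("DOUBLE", 0x1000)))
```

Counting characters by hand for every word is exactly the sort of chore that is easy to get wrong, which is why letting the assembler do it is such a relief.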
I also prefer NASM’s "Intel flavor" over GAS’s "AT&T flavor" of assembly
because it puts the operands in the destination = source order I’m familiar
with from other languages and because the addressing syntax (address is just
the address, [address] is the value at that address) is much easier for me
to read and understand.
The other big difference between my port and the original JONESFORTH is that it
loads the jonesforth.f source file upon start. This makes running the
application much simpler (original JONESFORTH requires you to pipe the "second
half" of the interpreter into itself!). The file opening and reading could also
be exposed to FORTH and new words created to allow loading arbitrary files
within the interpreter.
If you’re going to use nasmjf/JONESFORTH for anything longer than a simple line or
two, I highly recommend wrapping it in readline with the excellent rlwrap
(https://github.com/hanslub42/rlwrap) like so:
$ rlwrap ./nasmjf
Then you’ll have line editing, history (up arrow lets you re-enter and edit the last line), and even persistent history between Forth sessions!
The original JONESFORTH comes with a test suite, and I adapted it to a Bash script because I don’t use make enough for Makefiles to make any sense to me. Anyway, you can run that. Check out log24, where I get the test suite running! It’s a lot of fun.
This project has been an incredible learning experience, and I’m planning to learn more by porting more software.
This was a "bucket list" project for several reasons, but especially because I was able to confirm the legend of changing integer values for myself. See log20 where I try it out for the first time. (Search for the word "legendary".)
Forth is a language that made me rethink a lot of what I took for granted in computing and programming. Implementing (well, porting) one in assembly took that to another level entirely. Above all, it makes me want to experiment with other simple language implementations because once you see how this stuff is made, it’s kind of addicting.
Some books I met along the way
I’ve also been reading Forth books before, during, and after this project:
Starting Forth by Leo Brodie - 5 stars. My review: "A fantastic language introduction for beginners and fascinating artifact in its own right. The writing is friendly and conversational. The illustrations help make it memorable. The book takes you from the simplest concepts to the hardcore metaprogramming internals of Forth itself. I don’t think you could ask for more than this from an introductory programming book."
Threaded Interpretive Languages: Their Design and Implementation by R.G. Loeliger - 5 stars. My review: "This took me forever to get through. At least three months. As the author admits, TILs are a real bear to wrap your head around because they involve so many independent moving parts. Loeliger does a great job of giving a high-level and low-level explanation of the entire sequence of events, but I’d still be in awe of anyone who could simply read this book straight through and actually GROK how it all works together. Maybe if you work out some of it on paper as well?"
Programming A Problem Oriented Language: Forth - how the internals work by Charles H. Moore himself - 3 stars. My review: "This is not an amazing book by any means - but if you’re into the Forth language/ecosystem like I am right now, it’s pretty high on the list of things to read. It’s Moore’s own explanation of: 1) What he built, 2) How he reasons about a self-bootstrapping programming language (namely, Forth), and 3) His general philosophy of software development.
I gotta say, while you can argue about how Forth helps or hurts the cause, his First Principle of keeping it simple is spot on! And he’s practical about it too.
I like that we get an opinionated take on a variety of topics such as the pros and cons of word (function) name lookup strategies, etc. I love how practical he is about complexity vs. storage concerns vs. compute time concerns.
It’s a bit rambling and meandering and is not a highly professional and polished book. According to the introduction, it’s a manuscript that essentially sat in a drawer for decades. A later chapter even has a note that goes something like: "I’m not sure why I thought this part was needed, but here it is anyway…" Which I enjoy. Moore is human too!"
Thinking Forth by Leo Brodie - Enjoying this very much so far. Reminds me of Ousterhout’s 2018 A Philosophy of Software Design. I’m really curious how Forth fits into the age-old "simplicity" puzzle of software development. Update: Finished reading. 5 stars. My Review: "The high praise for this book was well-earned. It just goes to show that the problems we’ve been dealing with since the dawn of programming have remained, largely, the same. While this is definitely a Forth-oriented book (and rooted in its time), the philosophy of simplicity and the encouragement to turn a problem on its head until it can be expressed simply is utterly timeless. The interviews were great and the examples were well chosen. In the year 2022, we still don’t have the answers to a lot of these questions. Just opinions and more questions."
Other people’s JONESFORTH projects
There are quite a number of projects related to JONESFORTH:
Here’s a bare metal Raspberry Pi JONESFORTH ARM port and operating system! (github.com/organix)
Here’s another JONESFORTH ARM port with APL symbols (github.com/narenratan) - be sure to check out the modified jonesforth.f to see the madness, LOL.
Here’s a JONESFORTH Windows port in C, which I definitely want to look at because I’m very interested in Forths in higher level languages and seeing this design that I now know intimately should be a huge help.
Here’s a JONESFORTH RISC-V port (github.com/jjyr). The "issues" there contains this interesting comment by Albert van der Horst, creator of ciforth:
"P.S. jonesforth is loosely based on ciforth (present on this github) which is based on good old fig-Forth. If you borrow from ciforth you may end up with a fully documented system with comprehensive testing. Its documentation for users is far superior than jonesforth, and the Forth is more powerful."
Make of that what you will. (Actually, what I find really compelling is van der Horst’s self-contained yourforth written in FASM (github.com/albertvanderhorst) which seems to have very similar goals to JONESFORTH. I will definitely be taking a good look at this. It looks very compact and I’ve been curious about FASM for a while.)
Another JONESFORTH RISC-V port (github.com/nickpascucci). No readme, but I see that it’s got PDFs for the SiFive FE310-G002.
There’s lots of really interesting talk on Jones’s submission to lambda-the-ultimate.org. (But watch out! That site is a dangerous rabbit hole of fun for anyone interested in programming languages!)
If I run into any other links, I’ll add them here.
At one point I misunderstood the workings of Forth variables, thinking they should leave their values on the stack rather than their addresses. My confusion was compounded by my misreading of the comments in JONESFORTH. I got immediate help on Reddit. Here’s my confusion: The Latest Word and you can see me struggle and then facepalm in log19. Humbling, but ultimately satisfying.
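The distinction that caused the confusion fits in a tiny Python model. This is only a sketch: `memory` is a made-up array of cells, `make_variable` is a hypothetical helper, and `COUNT` is an invented variable living at an arbitrary address. The behavior it mimics is standard Forth: a variable pushes its address, `@` fetches the value at an address, and `!` stores a value to an address.

```python
memory = [0] * 16   # toy RAM; indices stand in for addresses
stack = []

def make_variable(address):
    """Create a word that pushes the variable's ADDRESS, not its value."""
    def push_address():
        stack.append(address)
    return push_address

def fetch():
    """@ ( addr -- value )"""
    addr = stack.pop()
    stack.append(memory[addr])

def store():
    """! ( value addr -- )"""
    addr = stack.pop()
    value = stack.pop()
    memory[addr] = value

COUNT = make_variable(5)   # hypothetical variable stored in cell 5

stack.append(42)           # 42 COUNT !
COUNT()
store()

COUNT()                    # COUNT   leaves the address...
print(stack)  # [5]
fetch()                    # @       ...which @ turns into the value
print(stack)  # [42]
```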