Assembly Nights 2

Created: 2022-10-01 Updated: 2022-11-10
Assembly Nights Season One
Assembly Nights Season Two <---- You are here.

During Assembly Nights Season One (porting JONESFORTH to NASM), I kept having ideas for other Forth-like things I’d like to try. But I was good: I wrote them down and stayed the course with the port.

But when the port was done, one of those ideas just wouldn’t let me rest and I was compelled to start on it immediately!

Now when the lights go out and everyone else in the house has gone to sleep, I start writing assembly code…​

my lenovo 11e thinkpad with assembly code waiting romantically on the bed with a candle

The physical subject for this season is a Lenovo ThinkPad 11e, shown here editing and debugging "meow5", my new experimental Forth-like ("concatenative") language.

Why a new computer? Well, the old EeePC is currently out of commission (SSD circa 2006(?) croaked, which you can read about here).

(By the way, how incredible is this: I just realized that I am creating this page on October 1st, exactly one year to the day after I created the first Assembly Nights page. Completely unplanned.)

The project

The idea that wouldn’t let me rest was a "super concatenative" language in which functions would be the inlined source of other functions all the way down to tiny assembly language primitives. In other words, a language with not just a concatenative data flow (via the stack), but one that makes executable code via concatenative compilation.

You see, traditional Forth goes to great lengths to minimize memory usage (programs written for indirect "threaded" interpreters can actually be shorter than the equivalent machine code!) which made tons of sense in the late 1960s, early 1970s when "core memory" meant magnetic-core memory (wikipedia.org).

Ferrite core memory
Cropped from this original image (wikimedia.org) by Konstantin Lanzet, CC BY-SA 3.0.

In the image to the right, you can actually see the bits in this ferrite core memory. This was dear stuff indeed!

But when every byte becomes precious, there’s a trade-off in terms of complexity and runtime performance.

Since our home computers today have more L1 cache on the CPU die than most computers had in main memory when Forth was born, I couldn’t help wondering what would happen if I traded space-savings for sheer simplicity.

So instead of composing words (functions) from the bare addresses of other words (or even with CALL instructions to those addresses like a normal human being), what if I just brute-force inlined the raw machine code of each word, concatenating them into larger words?

The logical conclusion of this is that a program would be a single word made of the concatenated machine code of every word it contained recursively down to a handful of primitives.

Using the familiar Forth colon and semicolon syntax (: …​ ;) to define words:

> : meow "Meow. " print ;
> : meow5 meow meow meow meow meow ;
> meow5
Meow. Meow. Meow. Meow. Meow.

In this canonical example, the "Meow. " string becomes data with an address. The meow word becomes the machine code to push the string address on the stack followed by the machine code to print whatever string’s address is on the stack. And the word meow5 is five whole copies of meow (but not the string data).
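To make the inlining concrete, here is a toy model in plain shell. It is only a sketch: the opcode names (push_addr:meow_str, pop_addr, write_stdout) and the define helper are made up for illustration, and real Meow5 copies actual x86 machine code rather than strings. What it does show is the one rule that matters: defining a word means concatenating the already-compiled bodies of the words inside it.

```shell
#!/bin/sh
# Toy model of "concatenative compilation" (illustrative only).
# Each word's compiled body is a string of pretend opcodes stored in a
# variable named code_<word>. The opcode names are invented for this sketch.

# Two primitives. The string itself lives elsewhere as data; the code
# only carries its address.
code_push_str='push_addr:meow_str'
code_print='pop_addr write_stdout'

# define NAME WORD...
# Build code_NAME by inlining (copying) the compiled body of each WORD.
define() {
    name=$1; shift
    out=
    for word in "$@"; do
        eval "piece=\$code_$word"        # look up the word's compiled body
        out="${out:+$out }$piece"        # append it to the new body
    done
    eval "code_$name=\$out"
}

define meow  push_str print
define meow5 meow meow meow meow meow

echo "meow:  $code_meow"
echo "meow5: $code_meow5"
```

The meow5 line that comes out is simply the meow sequence five times over, with no calls and no addresses of other words, and the string is only ever referenced by address.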

It seemed simple. Wasteful? Maybe. But surely not so bad if you planned the program with even a little bit of care.

Would it work? Would it be legal? Would I be struck by lightning for my heresy?

There was only one way to find out.

(Spoiler alert - as of 2022-11-07, the above example works exactly as written!)

Naturally, such a conCATenative (sorry) language would need to go "meow" and multiple meows, say five of them, seemed like a good example of the concept. And thus the "meow5" project had its name.

You can follow along with my progress by reading the dev log text files (log*.txt) in the repo, which I hope are fairly entertaining to read. (I write them as I go, so you get to come along for the whole rollercoaster of emotions as I succeed and fail!)

And here’s the web page: Meow5: The Web Page

Eventually Meow5 will get its own page like nasmjf did when Assembly Nights Season One was complete. But it’s too soon to predict what it will become. Silly toy? Mainstream Rust contender?

The habit

I believe one of the reasons the first Assembly Nights was so successful is that I gave myself a specific time and a specific, constrained environment. This second "season" is no different.

The time slot is exactly the same: When the kids have had their bedtime stories and the lights are out, I hop into bed and fire up the laptop which has spent the day propped on its side, charging, between my nightstand and bed.

I get to work on the project immediately. Sometimes I’ll re-read the last bit of my log file to remind myself what I’m doing. Sometimes I just run the program to see where it crashes.

I program as long as I like. Sometimes I don’t end up writing any code. On really good days, my line count for the night is negative as I delete old stuff I don’t need anymore.

Low friction environment

Why is "low friction" important? Having an extremely low friction development setup when I’m tired can mean the difference between writing code for 5 minutes versus 0 minutes. And what I’m finding is that 5 minutes counts for way more than just a line or two of code. That 5 minutes loads the problem in my head. It keeps the whole project "alive" in my subconscious workspace for another day. Which means I’m doing creative problem solving on it while I sleep or empty the dishwasher or shower.

Why separate hardware? I never set out to have a separate computer for each of my projects, but that’s what I seem to be ending up with. That seems nuts unless you know that these are older, low-powered computers that would likely be sitting in a landfill if I weren’t using them. They’re not powered on unless I’m using them, so they take no energy the rest of the time. (Most of the laptops aren’t even plugged in to charge unless I’m definitely going to use them.)

Assembly Nights Season Two is powered by a Lenovo ThinkPad 11e ("e" for Education Series, I guess) which has a really nice size and weight and decent keyboard. I’m pretty sure the 11 indicates the screen size. The total width is 12 inches (30cm). The 4 GB of RAM and Celeron N4120 (designed to run at just 4.8 Watts!) feel extremely generous after the EeePC. (Also, I’d like to say how absolutely amazing it is to actually have a laptop that can run off the battery for a long time. This is a first for me!) It cost me a whopping USD $120 on eBay.

It’s running my favorite Linux distro, a full install of Slackware 15.0, which comes with all the dev tools you might need (at least three assemblers, for example). Though of course I still install some of my applications, like The Silver Searcher.

What is low friction startup? Thinking I would automate my setup in the terminal, I started off (re-)learning tmux by reading the entire man page and wrote some project-specific setup for it. That was neat and I don’t regret the investment.

But then I gave in to the siren call of awesome, the Lua-driven window manager. When I’m running X, I’m a big fan of tiling window managers. For years, Suckless’s dwm has been my go-to. But now that I’m experimenting with portrait-oriented monitors and other unusual setups, I’ve been wanting easier configuration options (dwm requires a recompile whenever you change anything) and better ability to share configurations across systems.

Running X instead of restricting myself to the Linux console does leave me vulnerable to the temptation to open a Web browser (and I do open Firefox occasionally to look up x86 references), but I’ve been pretty successful at ignoring this ability.

X has another advantage: auto-login. It may be a controversial feature, but being able to turn on the laptop and have it dump me right into my project environment without even having to log in is huge in terms of reducing friction. (Lest you think I’m being hyperbolic: with the EeePC, I had, on multiple occasions, fallen asleep at the login prompt. Its early Intel Atom also took at least twice as long to boot as the 4-core Celeron. Now I at least see a bit of my code every night.)

The point is this: The easier it is to press a button and work on my project, the easier it is to stick to the habit.

Tools

I’m still using NASM and still liking it. I was briefly tempted to try YASM or FASM since they both can apparently produce ELF executables without the use of a linker. But the linker becomes basically invisible once I script the assemble/link/run process and I knew better than to switch too many tools all at once. So I’m sticking with what I know for now.

As always, I’m using Vim for all my editing needs. (I did try NeoVim for a bit, but I’ll be darned if I didn’t have some compatibility problems and the feature I was most interested in, Lua integration, was harder to learn about than I’d expected.) Again, sticking with what I know.

I’m also using GDB for debugging, though it still feels very ill-suited for debugging my NASM assembly and extremely ill-suited for debugging the machine code produced by my program! Of course, I’m thankful that it’s powerful and free and already installed. And again, sticking with what I know.

The other unsung hero in this setup is Bash for shell aliases that reduce the repetitive typing of the project’s most common commands.

The 'meow' setup function

My Assembly Nights Season One setup had a bunch of little scripts and shell aliases to perform everything I might need to do in a couple keystrokes.

Season Two works that way too. But there are only so many one-letter and two-letter aliases you can create before you start getting some collisions.

So I came up with a solution that’s still dirt-simple but avoids collisions and gives me more features. It’s a function in my .bashrc that puts me in the project directory and sets up short aliases for the project and reminds me how to use them. Here’s the whole thing:

meow ()
{
    cd "$HOME/meow5" || return;
    alias nd="vim ../nasmdoc.txt";
    alias ndf="less /usr/doc/nasm-2.15.05/nasmdoc.txt";
    alias mvim="vim meow5.asm log_latest.txt";
    alias mrun="./build.sh run";
    alias mbug="./build.sh gdb";
    echo "Ready to meow!";
    echo "  nd = nasmdoc";
    echo "  ndf = full nasmdoc";
    echo "  mvim = open asm and log in vim";
    echo "  mrun = build+run";
    echo "  mbug = build+gdb"
}

So when I start up the laptop, I open two terminals (I could automate this part with awesome - and maybe I will eventually, but it’s so easy to do manually that I haven’t bothered yet…​) and then type meow in each terminal.

In the left terminal, I type mvim to open the assembly source and the latest dev log text file. In the right terminal, I either type mrun to assemble, link, and run or mbug to assemble, link, and debug with GDB. It’s super fast and efficient.
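The build.sh behind mrun and mbug isn’t shown here, but a script along these lines would do the job. This is a hypothetical sketch, not the real file from the repo: the output names meow5.o and meow5 are assumptions, as is the choice to build a 32-bit ELF with debug info.

```shell
#!/bin/sh
# Hypothetical build.sh sketch (NOT the actual script from the meow5 repo).
# Usage: ./build.sh run    assemble, link, and run
#        ./build.sh gdb    assemble, link, and debug with GDB

SRC=meow5.asm   # source name taken from the mvim alias
OBJ=meow5.o     # assumed object file name
BIN=meow5       # assumed executable name

build() {
    # -f elf32: emit a 32-bit ELF object; -g: include debug info for GDB
    nasm -f elf32 -g -o "$OBJ" "$SRC" &&
    # -m elf_i386: link as a 32-bit executable, even on a 64-bit host
    ld -m elf_i386 -o "$BIN" "$OBJ"
}

case "${1:-}" in
    run) build && "./$BIN" ;;
    gdb) build && gdb "./$BIN" ;;
    *)   echo "usage: build.sh run|gdb" ;;
esac
```

Because each step is chained with &&, a failed assemble never reaches the linker, and a failed link never launches a stale binary.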

I think I was pretty clever with how I open the latest dev log file. Instead of having to change my alias every time I make a new file (e.g. going from log05.txt to log06.txt), I’ve simply symlinked the latest one to log_latest.txt so my alias is always correct. As a bonus, the repo contains the symlink as well, so it serves as meta-info about which log was active with each commit.
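Moving to a new log then only means repointing the symlink. Here’s a small hypothetical helper (newlog is a made-up name, and it assumes two-digit names like log05.txt in the current directory) that creates the next log file and updates log_latest.txt:

```shell
#!/bin/sh
# Hypothetical helper: start the next numbered log and repoint the symlink.
# Assumes logs are named log01.txt, log02.txt, ... in the current directory.
newlog() {
    # Find the highest-numbered existing log, if any.
    last=$(ls log[0-9][0-9].txt 2>/dev/null | sort | tail -n 1)
    n=${last#log}
    n=${n%.txt}
    n=${n#0}                      # drop a leading zero so 08 and 09 are not read as octal
    next=$(printf 'log%02d.txt' $(( ${n:-0} + 1 )))
    touch "$next"
    # -s: symbolic link; -f: replace the existing log_latest.txt
    ln -sf "$next" log_latest.txt
    echo "$next"
}
```

The link target is a bare relative filename, so the symlink keeps working when the directory is cloned somewhere else.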

You’ll also notice the nd and ndf aliases, which are both short for 'nasmdoc'. The first one opens my own personal copy of NASM’s excellent nasmdoc.txt. The second one opens the full original document.

What’s the difference? Well, I found that I kept needing to reference the same sections of the documentation over and over again. Finally, it occurred to me that no one (not even the police) could stop me from making a copy of the file and deleting everything but the parts I needed. Thus, ~/nasmdoc.txt has become a pared-down cheatsheet of the stuff I always forget. Very handy and I highly recommend doing something like this yourself - it turns out nobody can stop you!

Learning

Assembly Nights Season One was a massive learning experience. Porting an existing JONESFORTH was a really big deal for me and I learned a lot about Forth, i386 assembly, and how computers really work under all (well, most) of the abstraction.

Inventing Meow5 as I go is an equally big deal in terms of learning. I’ve heard it said that Charles H. Moore says he "discovered" Forth rather than invented it. And I see what he means. Even though I could do anything I like with this language, I find myself making a lot of the same choices. Some paths are just easier to implement or easier to understand than others.

So now I really, really understand why Forth works this way instead of that way. It’s because that way would be harder.

Season One Lesson: Port stuff to learn it inside and out (and possibly one or more languages at the same time).

Season Two Lesson: Reinvent stuff to learn why it was built that way.

I’ve also been (very slowly) working through the book Programming From the Ground Up: An Introduction to Programming Using Linux Assembly Language by Jonathan Bartlett (blogspot.com). That link has the full book contents online courtesy of the author. A scanned copy is also on archive.org. I’m working from a used physical copy because I like physical books. I put bookmarks in it and read it on the couch. PFTGU uses GAS syntax, so, as with JONESFORTH, I’ve been (again, very slowly) porting the examples to NASM so that I know I fully understand them. Repo with my NASM ports here (github.com/ratfactor).

The book impresses me a lot because it doesn’t assume any other particular systems-level programming experience and because it’s not afraid to challenge the reader. For example, here’s a quote from page 30: "Load the next value in the list into the current value register (%eax). What addressing mode might we use here? Why?" It’s way more of a true teaching aid than most of the "x86 reference" stuff you’ll find online. It also teaches important related eat-your-vegetables things like the C calling convention and the UNIX file philosophy. Good stuff. Note that it’s 32-bit and Linux centric.

By the way, I still highly recommend starting your 32-bit x86 Linux assembly journey here:

https://asmtutor.com/ - NASM Assembly Language Tutorial by Daniel Givney

Note that I have been sorely tempted to start learning x86-64 because it sounds like a lot of fun compared to 32-bit i386, especially now that I’m on a 64-bit laptop! But I have been able to convince myself that sticking with what I’ve learned for now is 100% the way to go since creating an experimental language interpreter/compiler/runtime is plenty of new stuff to tackle already!

I digress. The point is, there are many ways to learn. Learning in public is also pretty cool.

Do you agree? Here’s a thing (created just now with Inkscape). Save it and put it on your own website, do your own Assembly Nights, hack the planet!

cool vector floppy labeled: ratfactor.com assembly is cool. keep computers weird. hack the planet. learn in public. assembly nights.

As I mentioned above, I’m keeping a running log of my progress from the highs of Meow to the lows of SEGFAULT.

Slow and steady

As before, I sometimes struggle to keep from trying to push this project too quickly or "get it over with" so I can move on to all the other ideas I want to work on. It takes a certain amount of willpower, sometimes, to leave the program in a broken state so I can figure it out the next night.

I truly believe that slow and steady progress is better because I seem to be less prone to the sort of "tunnel vision" where I get invested in a particular solution and try to force it to work. Thinking more and typing less, I think, produces better ideas over the long run.

Slow and steady does not come naturally to me. Quite the opposite. But the more I do it, the easier it becomes to work this way. I think that’s because past success has taught me that I can keep my interest going on something long-term if I just keep showing up.

Thanks for reading!


The font used in the "title card" for Assembly Nights Season Two (except for the actual 'II') is White Dream, a "clean, swashy and beautiful script typeface" by Måns Grebäck. One really neat thing about it is the swash ability:

Use underscore _ anywhere in a word to make a swash.
Example: Caste_la
Use multiple underscores for different swashes.
Example: Cinde__rella

I forget how many underscores I used to produce the swash I picked. It was a lot. There are a lot of swash choices. My princess dreams have come true.