Meow5: An Extremely Concatenative Programming Language

Created: 2022-11-10 Updated: 2022-11-20

meow5 kitty logo is black with two bright teal eyes and a pink nose

Meow5 is a "concatenative" (wikipedia.org) programming language. I also consider it to be a "Forth-like" language since it follows the same stack-based value passing method and borrows some of Forth’s syntax and nomenclature.

Hello world (in cat):

"Meow!" print
Meow!

(To see where I’m currently at, scroll down to Current progress below.)

Its unique feature is its "inline all the things!" compiling method. A Meow5 function is the recursively concatenated machine code of every function it calls. At the bottom are "primitive" functions written in assembly language, which is where the machine code comes from.

Wait, you mean every time I call a function in another function, I’m making a complete copy of that function, which is a complete copy of every function that function calls, all the way down?

Yup! Pretty nuts, right? Feel free to run away from this page screaming now.

But when a primitive is just a dozen bytes long, you can make quite a few copies of it before it’s a problem. I’m very curious to see how it benchmarks against other languages in various ways: size, speed, cuteness.

Syntax

Meow5 borrows heavily from the ur-language Forth, which was delivered to our dimension by the prophet Charles H. Moore in the late 1960s.

Creating a new function, meow, that prints the string "Meow." looks like this:

: meow "Meow." print ;

Creating another function, meow5, which executes meow five times:

: meow5 meow meow meow meow meow ;

Invoke their names to call these functions:

meow
Meow.

meow5
Meow.Meow.Meow.Meow.Meow.

That’s a lot of meows! Who’s a hungry kitty?

To drive home an earlier point: the above meow5 function doesn’t just call meow five times, it actually contains five consecutive copies of meow!
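To make that concrete, here is a minimal Python sketch of definition by concatenation. It is only a model, not how Meow5 itself works internally (Meow5 is written in NASM assembly). The byte strings are placeholders standing in for real machine code, meow is treated as if it were a primitive to keep things short, and the names dictionary, define_primitive, define, and meow25 are made up for this illustration.

```python
# Toy model of "inline all the things!": NOT real Meow5 internals.
# Each function is stored as a bytes object standing in for machine code.

dictionary = {}  # hypothetical: function name -> stand-in "machine code"


def define_primitive(name, code):
    """Register a function whose code is given directly (in Meow5: from assembly)."""
    dictionary[name] = bytes(code)


def define(name, body_names):
    """Define a new function as the concatenated code of every function it uses."""
    dictionary[name] = b"".join(dictionary[n] for n in body_names)


# Placeholder bytes only; a real primitive would be actual i386 instructions.
define_primitive("meow", b"\x90" * 12)

define("meow5", ["meow"] * 5)     # five consecutive copies of meow
define("meow25", ["meow5"] * 5)   # five copies of meow5, so 25 copies of meow

print(len(dictionary["meow"]))    # 12
print(len(dictionary["meow5"]))   # 60
print(len(dictionary["meow25"]))  # 300
```

Sizes multiply at every level of nesting, which is why keeping the primitives tiny matters so much.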

Current Progress

Here’s a list of milestones reached. The log text file references are from my devlog files which I write as I’m creating Meow5. You can find them in the repo!

(Note to self, when I put the logs up here like I did for nasmjf, link these!)

  • (log01.txt) Executing a copy of the machine code to print "Meow" at runtime

  • (log01.txt) Programmatically inlining multiple copies of the "Meow" printer

  • (log02.txt) Added function "tail" metadata for linked list dictionary, to use Forth terminology

  • (log02.txt) Added find function to look up functions by name for inlining

  • (log03.txt) Figured out how to call these machine code chunks so they can return to the call site

  • (log03.txt) Have interpreter parsing tokens from a hard-coded input string and executing them

  • (log04.txt) Have COMPILE vs IMMEDIATE modes and functions may opt-in for either

  • (log05.txt) Using NASM macros to inline functions at the assembly level to build interpreter

  • (log05.txt) Major: Finished : ("colon") and ; ("semicolon") for function definitions!

  • (log06.txt) Added string literals to the language, so you can do this: "Meow!" print

  • (log06.txt) Have string literals working in compiled functions!

  • (log06.txt) DEBUG printing (can be inserted anywhere in the assembly)

  • (log06.txt) Major: The canonical meow5 function works as shown on this page!

  • (log07.txt) Added str2num and num2str for numeric input/output

  • (log07.txt) Have numeric interpolation in strings with $ placeholders

  • (log07.txt) Have string escape sequences (e.g. \n)

  • (log07.txt) Functions 'inspect' and 'inspect_all' print all functions with size in bytes

  • (log08.txt) Major: Reading interpreter input from STDIN!

>o.o<   --"Why yes, I do speak ASCII!"

Technical Details

32-bit x86 Linux: Meow5 is written with the NASM assembler (https://nasm.us/) and targets the 32-bit i386 (x86) architecture. It makes direct Linux syscalls and uses no external libraries. Portability is not a goal.

Stack-oriented: Meow5 uses the i386 PUSH and POP instructions in the normal fashion to set/get items from the stack (a region of memory to which items are added or removed sequentially in a "first in, last out" fashion). One thing that’s neat about this is that it alleviates the difficult task of putting names to data. It also generally produces compact and uncluttered syntax. https://en.wikipedia.org/wiki/Stack-oriented_programming
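As a rough illustration of stack-based value passing, here is a small Python sketch in which a list stands in for the i386 stack. The push and pop helpers mirror the PUSH and POP instructions only loosely, and add is a hypothetical operation invented for this example, not a claim about which functions Meow5 provides.

```python
# Toy model of stack-based value passing: a Python list stands in for the stack.
stack = []


def push(value):
    """Put a value on top of the stack."""
    stack.append(value)


def pop():
    """Take the most recently pushed value off the stack."""
    return stack.pop()


def add():
    """Hypothetical operation: replace the top two values with their sum."""
    b = pop()
    a = pop()
    push(a + b)


push(2)
push(3)
add()           # no argument names needed: the operands are just "on the stack"
result = pop()
print(result)   # 5
```

Notice that add never names its inputs. That is the "no need to put names to data" property in miniature.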

Just in Time Compiled: The moment you start compiling a new function foo, its machine code is being inlined into memory. When you execute foo, a JMP instruction is issued to its machine code, which then runs until it hits a chunk of "return code" inlined at the end of the definition, which jumps back to the call site. It does not use CALL and RET and there can be only one level of direct function call in this fashion, which, it turns out, is all you need for an interactive Meow5 interpreter.

Inline All the Things!: If you use function foo in the definition of a new function bar, the machine code of foo (not including the "return code" mentioned above or the "tail" with function metadata) is inlined into bar.
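Here is a hedged Python sketch of that layout: each compiled function is modeled as body + "return code" + "tail", and inlining copies only the body. The byte values, the tail format, and the names RETURN_CODE, make_tail, compile_function, and inline are all placeholders made up for this illustration. The real chunks are i386 machine code and the real tail layout is not modeled here.

```python
# Toy model: what gets inlined and what doesn't. NOT real Meow5 internals.

RETURN_CODE = b"\xEE\xEE"  # placeholder bytes for the "return code" chunk

functions = {}  # hypothetical: name -> body only (no return code, no tail)


def make_tail(name):
    """Placeholder metadata; the real tail's layout is not modeled here."""
    return b"TAIL:" + name.encode("ascii")


def compile_function(name, body):
    """Remember the body for later inlining; return the full in-memory image."""
    functions[name] = body
    return body + RETURN_CODE + make_tail(name)


def inline(names):
    """Concatenate only the bodies of the named functions."""
    return b"".join(functions[n] for n in names)


foo_image = compile_function("foo", b"\x01\x02\x03")
bar_image = compile_function("bar", inline(["foo", "foo"]))

print(foo_image)
print(bar_image)
```

In this model bar ends up with two copies of foo's body but only one return code and one tail: its own.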

The logical conclusion is that a Meow5 program is a top-level function containing a continuous stream of concatenated machine code. What I would like to do at some point is write out such a program with an ELF header and see if I can make stand-alone Linux executables!

OpenBSD

I was really curious what OpenBSD would think of this since my understanding is that it has W^X (write xor execute) by default.

Sure enough, even the linker catches this:

ld: error: can't create dynamic relocation R_386_32 against local symbol in
readonly segment; recompile object files with -fPIC or pass '-Wl,-z,notext'
to allow text relocations in the output
>>> defined in meow5.o
>>> referenced by meow5.asm:167
>>>               meow5.o:(tail_strlen)

which is in reference to:

%macro ENDWORD 3
...
tail_%1:
    dd LAST_WORD_TAIL ; <---- this line

Where I’m putting the address of the previous function in the tail of the current one to make a linked list.

I’m not sure what those two suggested options are. I might try them to see how far I get.
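For anyone unfamiliar with the linked list of tails mentioned above, here is a minimal Python sketch of the idea: each new function's tail stores a link to the previously defined function's tail, and a lookup walks the chain from newest to oldest. The Tail class, add_function, and this find are hypothetical stand-ins for illustration; last_tail merely plays the role that LAST_WORD_TAIL plays in the macro.

```python
# Toy model of a linked-list dictionary built from function "tails".


class Tail:
    def __init__(self, name, prev):
        self.name = name
        self.prev = prev  # previously defined function's tail, or None


last_tail = None  # plays the role of LAST_WORD_TAIL in the macro


def add_function(name):
    """Create a tail that links back to the previous one, then make it the newest."""
    global last_tail
    last_tail = Tail(name, last_tail)
    return last_tail


def find(name):
    """Walk the list from the newest function to the oldest, matching by name."""
    tail = last_tail
    while tail is not None:
        if tail.name == name:
            return tail
        tail = tail.prev
    return None


add_function("strlen")
add_function("print")
add_function("meow")

found = find("print")
print(found.name)        # print
print(found.prev.name)   # strlen
print(find("woof"))      # None
```

In the assembly version the link is a raw address stored in the tail, which is exactly the kind of relocation the OpenBSD linker is complaining about.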


I made the SVG Meow5 logo in Inkscape 0.92. It uses the Sazanami Mincho font by Yasuyuki Furukawa. Mincho evidently means something very similar to what 'serif' means in Latin typefaces. See Ming typefaces (wikipedia.org).