Meow5: An Extremely Concatenative Programming Language

Meow5 is a "concatenative" (wikipedia.org) programming language. I also consider it to be a "Forth-like" language since it follows the same stack-based value passing method and borrows some of Forth’s syntax and nomenclature.
Hello world (in cat):
    "Meow!" print
    Meow!
- The Repo: https://github.com/ratfactor/meow5
- Read more: Assembly Nights "Season 2" might explain why I would do something like this
(To see where I’m currently at, scroll down to Current progress below.)
Its unique feature is its "inline all the things!" compiling method. A Meow5 function is the recursively concatenated machine code of every function it calls. At the bottom are "primitive" functions written in assembly language, which is where the machine code comes from.
Wait, you mean every time I call a function in another function, I’m making a complete copy of that function, which is a complete copy of every function that function calls, all the way down?
Yup! Pretty nuts, right? Feel free to run away from this page screaming now.
But when a primitive is just a dozen bytes long, you can make quite a few copies of it before it’s a problem. I’m very curious to see how it benchmarks against other languages in various ways: size, speed, cuteness.
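To put rough numbers on that, here is a minimal Python sketch of how the size of a function grows when every callee is copied in. The 12-byte primitive size and the definition table are assumptions for illustration (real sizes depend on the assembly, and the code that pushes a string literal is ignored here).

```python
# Hypothetical primitive size in bytes ("a dozen bytes"); not measured from Meow5.
PRIMITIVE_SIZES = {"print": 12}

# Hypothetical compound definitions: each maps a name to the functions it uses.
DEFINITIONS = {
    "meow": ["print"],
    "meow5": ["meow"] * 5,
}

def inlined_size(name):
    """Size in bytes of `name` once every callee is copied in, all the way down."""
    if name in PRIMITIVE_SIZES:
        return PRIMITIVE_SIZES[name]
    return sum(inlined_size(callee) for callee in DEFINITIONS[name])

for name in ("print", "meow", "meow5"):
    print(name, inlined_size(name))
```

Under these assumptions the size of a function is simply the sum of the primitives it eventually reaches, so growth is driven by how many times a definition is used, not by how deep the definitions nest.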
Syntax
Meow5 borrows heavily from the ur-language Forth, which was delivered to our dimension by the prophet Charles H. Moore in the late 1960s.
Creating a new function, meow, that prints the string "Meow." looks like this:
    : meow "Meow." print ;
Creating another function, meow5, which executes meow five times:
    : meow5 meow meow meow meow meow ;
Invoke their names to call these functions:
    meow
    Meow.
    meow5
    Meow.Meow.Meow.Meow.Meow.
That’s a lot of meows! Who’s a hungry kitty?
To drive home an earlier point: the above meow5 function doesn’t just call meow five times, it actually contains five consecutive copies of meow!
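The meow and meow5 definitions above can be modelled with a toy interpreter in Python. This is only a sketch of the idea, not how Meow5 is built: "machine code" is a list of made-up (operation, argument) pairs, and the handling of : and ; is hard-coded instead of using the real COMPILE and IMMEDIATE modes.

```python
import re

def tokenize(source):
    # A token is a double-quoted string literal or a run of non-space characters.
    return re.findall(r'"[^"]*"|\S+', source)

def execute(code, stack, out):
    # Run a flat list of (operation, argument) pairs.
    for op, arg in code:
        if op == "push":
            stack.append(arg)
        elif op == "print":
            out.append(stack.pop())

def run(source):
    """Interpret `source`; return the printed text and the table of definitions."""
    out, stack = [], []
    words = {"print": [("print", None)]}   # the only "primitive" in this model
    compiling = None                       # (name, body) while inside ':' ... ';'
    tokens = iter(tokenize(source))
    for tok in tokens:
        if tok == ":":
            compiling = (next(tokens), [])
        elif tok == ";":
            name, body = compiling
            words[name] = body
            compiling = None
        else:
            code = [("push", tok[1:-1])] if tok.startswith('"') else words[tok]
            if compiling is not None:
                compiling[1].extend(code)  # inline a copy instead of a reference
            else:
                execute(code, stack, out)
    return "".join(out), words

output, words = run(': meow "Meow." print ; : meow5 meow meow meow meow meow ; meow5')
print(output)
print(len(words["meow"]), len(words["meow5"]))
```

In this model the body of meow5 holds five full copies of the body of meow, which is the property the paragraph above describes.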
Current Progress
Here’s a list of milestones reached. The log text file references are from my devlog files which I write as I’m creating Meow5. You can find them in the repo!
(Note to self, when I put the logs up here like I did for nasmjf, link these!)
- (log01.txt) Executing a copy of the machine code to print "Meow" at runtime
- (log01.txt) Programmatically inlining multiple copies of the "Meow" printer
- (log02.txt) Added function "tail" metadata for linked list dictionary, to use Forth terminology
- (log02.txt) Added find function to look up functions by name for inlining
- (log03.txt) Figured out how to call these machine code chunks so they can return to the call site
- (log03.txt) Have interpreter parsing tokens from a hard-coded input string and executing them
- (log04.txt) Have COMPILE vs IMMEDIATE modes and functions may opt-in for either
- (log05.txt) Using NASM macros to inline functions at the assembly level to build interpreter
- (log05.txt) Major: Finished : ("colon") and ; ("semicolon") for function definitions!
- (log06.txt) Added string literals to the language; it can now do this: "Meow!" print
- (log06.txt) Have string literals working in compiled functions!
- (log06.txt) DEBUG printing (can be inserted anywhere in the assembly)
- (log06.txt) Major: The canonical meow5 function works as shown on this page!
- (log07.txt) Added str2num and num2str for numeric input/output
- (log07.txt) Have numeric interpolation in strings with $ placeholders
- (log07.txt) Have string escape sequences (e.g. \n)
- (log07.txt) Functions 'inspect' and 'inspect_all' print all functions with size in bytes
- (log08.txt) Major: Reading interpreter input from STDIN!
>o.o< --"Why yes, I do speak ASCII!"
Technical Details
32-bit x86 Linux: Meow5 is written with the NASM assembler (https://nasm.us/) and targets the 32-bit i386 (x86) architecture. It makes direct Linux syscalls and uses no external libraries. Portability is not a goal.
Stack-oriented: Meow5 uses the i386 PUSH and POP instructions in the normal fashion to put items on and take items off the stack (a region of memory to which items are added or removed sequentially in a "first in, last out" fashion). One thing that’s neat about this is that it alleviates the difficult task of putting names to data. It also generally produces compact and uncluttered syntax.
https://en.wikipedia.org/wiki/Stack-oriented_programming
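The "first in, last out" behaviour can be seen with a plain Python list standing in for the hardware stack. This is just an analogy for PUSH and POP, with made-up values.

```python
stack = []

# Like PUSH: each item goes on top of the previous one.
stack.append("first")
stack.append("second")
stack.append("third")

# Like POP: items come back off in the reverse order.
popped = [stack.pop(), stack.pop(), stack.pop()]
print(popped)
```

Note that neither side ever names the values. A producer leaves them on the stack and a consumer takes them off, which is how "Meow!" print passes its string along.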
Just in Time Compiled: The moment you start compiling a new function foo, its machine code is being inlined into memory. When you execute foo, a JMP instruction is issued to its machine code, which then runs until it hits a chunk of "return code" inlined at the end of the definition, which jumps back to the call site. It does not use CALL and RET and there can be only one level of direct function call in this fashion, which, it turns out, is all you need for an interactive Meow5 interpreter.
Inline All the Things!: If you use function foo in the definition of a new function bar, the machine code of foo (not including the "return code" mentioned above or the "tail" with function metadata) is inlined into bar.
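As a rough Python model of that layout: each function is a body, followed by return code, followed by a tail, and only the body is copied when the function is used in a new definition. The byte values and the tail format here are placeholders invented for the sketch; they are not real x86 code or the actual Meow5 tail layout.

```python
from dataclasses import dataclass

RETURN_CODE = b"\xEE\xEE"   # placeholder bytes standing in for the "return code"

@dataclass
class Function:
    name: str
    body: bytes             # the only part that gets inlined elsewhere

    def in_memory(self):
        # Hypothetical layout: body, then return code, then a metadata tail.
        tail = len(self.body).to_bytes(4, "little") + self.name.encode()
        return self.body + RETURN_CODE + tail

def define(name, callees):
    # A new function's body is the callees' bodies, concatenated in order.
    return Function(name, b"".join(f.body for f in callees))

foo = Function("foo", b"\x01\x02\x03")   # made-up "machine code"
bar = define("bar", [foo, foo])

print(bar.body.hex())
print(bar.in_memory().hex())
```

Because the return code and tail are left behind, bar ends up with exactly one copy of each, no matter how many functions were inlined into it.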
The logical conclusion is that a Meow5 program is a top-level function containing a continuous stream of concatenated machine code. What I would like to do at some point is write out such a program with an ELF header and see if I can make stand-alone Linux executables!
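For a sense of what the ELF idea might involve, here is a hedged Python sketch that wraps a blob of 32-bit machine code in a minimal ELF header plus one program header. The load address, the single read+execute segment, and the sample payload (an exit syscall with status 42) are all assumptions for illustration; nothing here comes from the Meow5 source.

```python
import struct

LOAD_ADDR = 0x08048000           # conventional i386 load address (an assumption)
EHDR_SIZE, PHDR_SIZE = 52, 32    # sizes of Elf32_Ehdr and Elf32_Phdr

def make_elf32(code: bytes) -> bytes:
    """Return the bytes of a minimal 32-bit x86 ELF executable wrapping `code`."""
    entry = LOAD_ADDR + EHDR_SIZE + PHDR_SIZE      # code starts right after headers
    total = EHDR_SIZE + PHDR_SIZE + len(code)
    # Magic, 32-bit class, little-endian, ident version 1, then zero padding.
    e_ident = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9)
    ehdr = struct.pack(
        "<16sHHIIIIIHHHHHH",
        e_ident,
        2,            # e_type: ET_EXEC
        3,            # e_machine: EM_386
        1,            # e_version
        entry,        # e_entry
        EHDR_SIZE,    # e_phoff: program header follows the ELF header
        0, 0,         # e_shoff, e_flags
        EHDR_SIZE, PHDR_SIZE, 1,   # e_ehsize, e_phentsize, e_phnum
        0, 0, 0,      # no section headers
    )
    phdr = struct.pack(
        "<IIIIIIII",
        1,            # p_type: PT_LOAD
        0,            # p_offset: map the file from its start
        LOAD_ADDR, LOAD_ADDR,      # p_vaddr, p_paddr
        total, total,              # p_filesz, p_memsz
        5,            # p_flags: read + execute
        0x1000,       # p_align
    )
    return ehdr + phdr + code

# mov eax, 1 ; mov ebx, 42 ; int 0x80   (the Linux i386 exit syscall)
EXIT_42 = b"\xb8\x01\x00\x00\x00\xbb\x2a\x00\x00\x00\xcd\x80"
elf = make_elf32(EXIT_42)
print(len(elf), "bytes")
```

The block only builds the bytes; writing them to a file and marking it executable is left out. A read+execute segment would suit a finished program, but not an interpreter that keeps writing new code into memory.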
OpenBSD
I was really curious what OpenBSD would think of this since my understanding is that it has W^X (write xor execute) by default.
Sure enough, even the linker catches this:
    ld: error: can't create dynamic relocation R_386_32 against local symbol in readonly segment; recompile object files with -fPIC or pass '-Wl,-z,notext' to allow text relocations in the output
    >>> defined in meow5.o
    >>> referenced by meow5.asm:167
    >>> meow5.o:(tail_strlen)
which is in reference to:
    %macro ENDWORD 3
        ...
        tail_%1:
            dd LAST_WORD_TAIL ; <---- this line
Where I’m putting the address of the previous function in the tail of the current one to make a linked list.
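The linked list formed by those tails can be sketched in Python as follows. The class and method names are hypothetical, and Python object references stand in for the stored addresses. The newest-first search order is the usual Forth dictionary behaviour and is assumed here, not confirmed for Meow5.

```python
class Tail:
    def __init__(self, name, prev):
        self.name = name
        self.prev = prev      # tail of the previously defined function, or None

class Dictionary:
    def __init__(self):
        self.last = None      # plays the role of LAST_WORD_TAIL

    def define(self, name):
        # The new tail points back at the previous one, extending the list.
        self.last = Tail(name, self.last)
        return self.last

    def find(self, name):
        # Walk from the newest definition back towards the oldest.
        tail = self.last
        while tail is not None:
            if tail.name == name:
                return tail
            tail = tail.prev
        return None

d = Dictionary()
for word in ("strlen", "print", "meow"):
    d.define(word)

print(d.find("print").prev.name)
print(d.find("purr"))
```

Each stored link is an absolute address inside the code itself, which fits the relocation the linker is complaining about.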
I’m not sure what those two suggested options are. I might try them to see how far I get.
I made the SVG Meow5 logo in Inkscape 0.92. It uses the Sazanami Mincho font by Yasuyuki Furukawa. Mincho evidently means something very similar to what 'serif' means in Latin typefaces. See Ming typefaces (wikipedia.org).