Okay, so here's the plan. I'm going to get Meow5 to the point where it can write out working ELF files that can print the "meow" repetition example that inspired the name of this language/program.

To do that, I need to:

    1. Write the region from 'data_area' to 'free' (where strings are stored) to the ELF file after the executable portion.

    2. Update the ELF program header so it loads the additional data.

    3. Ensure that the compiled executable can print the string from the memory loaded at runtime.

The last one is the biggest challenge just because I'm not entirely sure how that's normally done. I could hack together something that works, but I'm more interested in learning how, say, GCC goes about it.

Two nights later: Okay, it's just like assembly: data ends up in another segment.

I also re-read portions of Charles Moore's Programming a Problem-Oriented Language this morning and he talks about how annoying strings are because they're chunks of data that we necessarily intersperse in our programs.

A week later...

I've mulled this over night after night. I'm calling it: Meow5 is a _failure_.

Well, it's a complete success as a toy experiment! But to answer the question posed by Meow5: No. Inlining does not make everything simple, elegant, and easy.

It works just fine for code and integers. But it fails with _data_. And strings are the first and most conspicuous example of data.

Let's look at the three methods I was considering above:

Whether I have a separate segment or a single segment makes little difference - I can't easily store a string in the running interpreter and also have that same reference work in a compiled executable without a level of ELF mastery I simply don't possess. I could easily store a _relative_ address, but a relative address is instantly wrong when I copy the word it's stored in... I could also invent mechanisms to update a relative address when a word is inlined, but that does not exactly scream "simplicity and elegance" to me.
I'd rather put that effort into something more interesting.

The alternative is to treat strings the way Moore describes in PaPOL (mentioned above): write them inline in the dictionary with an instruction to skip over them when executing and another instruction to push their addresses onto the stack so they can be referenced. (Forth also pushes the length, a "counted string".)

That is particularly horrible for Meow5 for two reasons:

    1. I'll end up with five copies of the string "Meow. " in my canonical example program.

    2. I'll have a jump instruction in every single word to skip over the string.

I can live with the first problem. It highlights how ridiculous the data problem is, but it's merely wasteful.

But the second problem isn't okay. One of the driving factors of "inline everything" was to _avoid_ jumps. Now I've gotta jump over every string...it just ruins it.

I keep thinking there's gotta be some escape hatch, but everything I think of won't fly because it's hopelessly complicated OR it requires some sort of linker/loader-level wizardry that is no doubt possible, but that energy would be better spent on a less silly toy thingy.

So here's what I'm gonna do:

    1. I'll do those dang jump-over-me strings.

    2. I'll make sure I can write a stand-alone meow5 executable that prints "Meow. Meow..."

    3. Publish an obituary for the project. :-)

A week later...

I couldn't do it. I'm totally okay with the conclusion that meow5 is not a viable way to build software...but NOT because I couldn't figure out how to make ELF segments!

If I can put data in a specific spot in memory while running the interpreter and then count on that data being in the same spot (virtual address, obviously) when the ELF executable runs, then it'll work. That was the original plan.

So I "gave up" making the segments in meow5...but there's no reason I can't experiment in a much easier environment. And I've already got my 'mez' repo for reading elves with Zig, so why not do the same for writing them?
What I need to be able to do is iterate quickly and test theories quickly.

Okay, now I'm all excited again! Time to put on the tool-making cape and hat and figure out The Mystery of the Elf File Offsets.

+-----------------------------------------------------+
|                                                     |
|                  Five Months Later                  |
|                                                     |
+-----------------------------------------------------+

Wow, so, uh, here I am again. Hello.

I've actually been extremely productive on this project for the last couple weeks. Indeed, I did the Zig experimentation in the 'mez' repo as described above.

Specifically, I made the 'mez' program display EXACTLY what I needed from a 32-bit Linux ELF executable file to debug my problems (and made it quite nice to look at, if I say so myself).

Then I made a companion 'zem' program that writes a two-segment executable (it has two LOAD program headers that create two memory segments - one with the executable program and one with data).

As is often the case, *writing* the programs armed me with the intimate understanding of the problem which has rendered the programs themselves largely superfluous.

And this intimacy has drawn me to an uncomfortable conclusion:

    THERE'S NO EASY WAY TO MOVE ADDRESSES AROUND :-(

It's the problem I wrote about above, but now with all of the hand-wavy stuff gone.

If I put a string into "data" memory in my running Meow5 interpreter and reference it by address in a program, there is no EASY way to put that same string in a different location in an exported ELF executable of the same program.

I can likely keep them at the *same* address by padding the exported executable out the right amount (tricky!) to get everything to align.

Or I could even try to _translate_ the addresses when written (tricky and fatal to the program if done incorrectly).

As a matter of fact, having more than one segment seems to require a certain amount of padding anyway - I'm still a bit fuzzy about segment alignment and I should probably do some experimentation to get it 100% clear.
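
For reference, the padding arithmetic behind "padding the exported executable out the right amount" can be sketched in a few lines of Python. This is only an illustration, not code from Meow5, mez, or zem. It assumes the usual ELF rule that a LOAD segment's file offset and virtual address must be congruent modulo the alignment (0x1000 here). The helper names and both example addresses are made up.

```python
# Sketch: how much padding a second LOAD segment needs in the file.
# Assumption: p_offset and p_vaddr must be congruent modulo p_align.
# The function names and the example addresses are hypothetical.

PAGE = 0x1000


def padding_for_segment(file_offset, vaddr, align=PAGE):
    """Pad bytes to insert so (file_offset + pad) % align == vaddr % align."""
    return (vaddr - file_offset) % align


def is_loadable(p_offset, p_vaddr, align=PAGE):
    """True if the offset/address pair satisfies the congruence rule."""
    return p_offset % align == p_vaddr % align


# Made-up layout: the code ends at file offset 0x1b4 and the
# interpreter kept its strings at virtual address 0x0804a000.
code_end = 0x1B4
data_vaddr = 0x0804A000

pad = padding_for_segment(code_end, data_vaddr)
data_offset = code_end + pad

print(hex(pad), hex(data_offset))  # 0xe4c 0x1000
```

With these made-up numbers, 0xe4c bytes of padding are needed just to keep one small string at its original address, which is most of a page.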

With the benefit of a better feel for the problem and time to think about it, I conclude that having the string live 'inline' with the word that uses it and ending up with multiple copies of that string will:

    1. Result in a *much* smaller ELF file anyway (padding to 0x1000 is way bigger than a couple copies of a small string!)

    2. Be perfectly in line with the philosophy of this silly project - plus, it will even show up better in a hex dump when you can SEE five copies of the "Meow" string in memory!

So there we have it: I'm gonna do "inline" strings.

This log has been super weird. I'm going to start the next one with a TODO list for making those strings work!
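
As a rough illustration of what an inline string could look like as machine code, here is a small Python sketch that builds the bytes for one. It is not how Meow5 does it. Instead of the jump-plus-push pair described earlier, it swaps in the common position-independent trick of using a 32-bit x86 `call rel32` to skip the string, which leaves the string's address on the stack as the "return address". A `push imm32` then adds the length. The function name is hypothetical.

```python
import struct


def inline_string(text):
    """Build x86-32 bytes: call over the string, then push its length.

    Hypothetical layout, not Meow5's actual encoding:
        E8 <rel32>   call to just past the string (pushes string address)
        <bytes>      the string itself
        68 <imm32>   push the string length
    """
    data = text.encode("ascii")
    # rel32 is relative to the byte after the call, i.e. the string start,
    # so a displacement of len(data) lands on the push instruction.
    call_over = b"\xE8" + struct.pack("<i", len(data))
    push_len = b"\x68" + struct.pack("<I", len(data))
    return call_over + data + push_len


word = inline_string("Meow. ")
five_copies = word * 5  # what inlining the word five times would emit

print(len(word), len(five_copies))  # 16 80
print(word.hex(" "))
```

Because the displacement is relative, each copy still works wherever it is inlined. Five copies come to 80 bytes here, against 0x1000 (4096) bytes for a single page of padding.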