
meow5

A stack-based pure inlining concatenative programming language written in NASM assembly
git clone http://ratfactor.com/repos/meow5/meow5.git

meow5/log12.txt


Okay, so here's the plan. I'm going to get Meow5 to the
point where it can write out working ELF files that can
print the "meow" repetition example that inspired the
name of this language/program.

To do that, I need to:

    1. Write the region from 'data_area' to 'free' (where
       strings are stored) to the ELF file after the
       executable portion.

    2. Update the ELF program header so it loads the
       additional data.

    3. Ensure that the compiled executable can print the
       string from the memory loaded at runtime.

The last one is the biggest challenge just because I'm
not entirely sure how that's normally done.

I could hack together something that works, but I'm more
interested in learning how, say, GCC goes about it.

Two nights later: Okay, it's just like assembly: data
ends up in another segment.

I also re-read portions of Charles Moore's Programming a
Problem-Oriented Language this morning and he talks
about how annoying strings are because they're chunks of
data that we necessarily intersperse in our programs.


A week later...


I've mulled this over night after night. I'm calling it:
Meow5 is a _failure_. Well, it's a complete success as a
toy experiment!

But to answer the question posed by Meow5: No. Inlining
does not make everything simple, elegant, and easy.

It works just fine for code and integers. But it fails
with _data_. And strings are the first and most
conspicuous example of data.

Let's look at the three methods I was considering above:

Whether I have a separate segment or a single segment
makes little difference - I can't easily store a string
in the running interpreter and also have the same
address refer to it in a compiled executable without a
level of ELF mastery I simply don't possess.

I could easily store a _relative_ address, but a
relative address is instantly wrong when I copy the word
it's stored in...
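
To make that concrete, here's a minimal Python sketch (not
Meow5 code) of a "word" that stores a relative offset to
its string. The memory layout, the addresses, and the
names make_word and resolve are all made up for
illustration. The one assumption that matters is that the
offset is relative to the word's own address.

```python
import struct

STRING = b"Meow. "

def make_word(word_addr, string_addr):
    # Hypothetical word body: a 4-byte little-endian offset
    # from the word's own address to the string.
    return struct.pack("<i", string_addr - word_addr)

def resolve(word_addr, word_bytes):
    # Turn the stored relative offset back into an address.
    (rel,) = struct.unpack("<i", word_bytes)
    return word_addr + rel

memory = bytearray(64)

# The string lives at address 40 in this pretend memory.
string_addr = 40
memory[string_addr:string_addr + len(STRING)] = STRING

# The original word lives at address 0.
word_addr = 0
word = make_word(word_addr, string_addr)
memory[word_addr:word_addr + 4] = word

# "Inlining" copies the same bytes, unchanged, to address 16.
copy_addr = 16
memory[copy_addr:copy_addr + 4] = word

original_target = resolve(word_addr, bytes(memory[word_addr:word_addr + 4]))
copied_target = resolve(copy_addr, bytes(memory[copy_addr:copy_addr + 4]))

print(original_target, bytes(memory[original_target:original_target + 6]))
print(copied_target, bytes(memory[copied_target:copied_target + 6]))
```

The original resolves to the string. The copy carries the
same offset from a different starting point, so it
resolves 16 bytes further along, to empty memory.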

I could also invent mechanisms to update a relative
address when a word is inlined, but that does not
exactly scream "simplicity and elegance" to me. I'd
rather put that effort into something more interesting.

The alternative is to treat strings the way Moore
describes in PaPOL (mentioned above): write them inline
in the dictionary with an instruction to skip over them
when executing and another instruction to push their
addresses onto the stack so they can be referenced.
(Forth also pushes the length, a "counted string".)

That is particularly horrible for Meow5 for two reasons:

    1. I'll end up with five copies of the string "Meow. "
       in my canonical example program.

    2. I'll have a jump instruction in every single word
       that uses a string, just to skip over the string.

I can live with the first problem. It highlights how
ridiculous the data problem is, but it's merely
wasteful.

But the second problem isn't okay. One of the driving
factors of "inline everything" was to _avoid_ jumps. Now
I've gotta jump over every string...it just ruins it.

I keep thinking there's gotta be some escape hatch, but
everything I think of won't fly because it's either
overly complicated OR it requires some sort of
linker/loader-level wizardry that is no doubt possible,
but that energy would be better spent on a less silly
toy thingy.

So here's what I'm gonna do:

    1. I'll do those dang jump-over-me strings.
    2. I'll make sure I can write a stand-alone meow5
       executable that prints "Meow. Meow..."
    3. Publish an obituary for the project. :-)

A week later...

I couldn't do it. I'm totally okay with the conclusion
that meow5 is not a viable way to build software...but
NOT because I couldn't figure out how to make ELF
segments!

If I can put data in a specific spot in memory while
running the interpreter and then count on that data
being in the same spot (virtual address, obviously) when
the ELF executable runs, then it'll work. That was the
original plan.

So I "gave up" making the segments in meow5...but
there's no reason I can't experiment in a much easier
environment. And I've already got my 'mez' repo for
reading elves with Zig, so why not do the same for
writing them? What I need to be able to do is iterate
quickly and test theories quickly.

Okay, now I'm all excited again! Time to put on the
tool-making cape and hat and figure out The Mystery of
the Elf File Offsets.

+-----------------------------------------------------+
|                                                     |
|                  Five Months Later                  |
|                                                     |
+-----------------------------------------------------+

Wow, so, uh, here I am again. Hello.

I've actually been extremely productive on this project
for the last couple weeks. Indeed, I did the Zig
experimentation in the 'mez' repo as described above.

Specifically, I made the 'mez' program display EXACTLY
what I needed from a 32-bit Linux ELF executable file to
debug my problems (and made it quite nice to look at, if
I say so myself).

Then I made a companion 'zem' program that writes a
two-segment executable (has two LOAD program headers
that create two memory segments - one with the
executable program and one with data).

As is often the case, *writing* the programs armed me
with an intimate understanding of the problem, which has
rendered the programs themselves largely superfluous.

And this intimacy has drawn me to an uncomfortable
conclusion:

    THERE'S NO EASY WAY TO MOVE ADDRESSES AROUND :-(

It's the problem I wrote about above, but now with all
of the hand-wavy stuff gone.
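
For reference, here's roughly what two LOAD program
headers hold, as a Python sketch. This is NOT the actual
'zem' code (that's Zig). The function names, the sizes
passed in, and the 0x08048000 base address are
assumptions for illustration. The field order and sizes
follow the 32-bit ELF program header (Elf32_Phdr), and
the check at the end is the rule from the ELF spec that
a loadable segment's p_vaddr and p_offset must be
congruent modulo p_align. This sketch takes the simple
route of padding the file so the data starts on a page
boundary.

```python
import struct

PT_LOAD = 1                    # p_type for a loadable segment
PF_X, PF_W, PF_R = 1, 2, 4     # p_flags permission bits
PAGE = 0x1000                  # p_align used for both segments
EHDR_SIZE = 52                 # size of a 32-bit ELF header
PHDR_SIZE = 32                 # size of one 32-bit program header

def align_up(n, align=PAGE):
    # Round n up to the next multiple of align.
    return (n + align - 1) // align * align

def pack_phdr(offset, vaddr, size, flags):
    # Field order: p_type, p_offset, p_vaddr, p_paddr,
    #              p_filesz, p_memsz, p_flags, p_align
    return struct.pack("<8I", PT_LOAD, offset, vaddr, vaddr,
                       size, size, flags, PAGE)

def two_load_headers(code_size, data_size, base=0x08048000):
    # Code goes right after the ELF header and both program headers.
    code_off = EHDR_SIZE + 2 * PHDR_SIZE
    # Assumption: pad the file so data starts on a page boundary.
    data_off = align_up(code_off + code_size)
    code_ph = pack_phdr(code_off, base + code_off, code_size, PF_R | PF_X)
    data_ph = pack_phdr(data_off, base + data_off, data_size, PF_R | PF_W)
    padding = data_off - (code_off + code_size)
    return code_ph, data_ph, padding

code_ph, data_ph, padding = two_load_headers(code_size=100, data_size=6)

for ph in (code_ph, data_ph):
    _, p_offset, p_vaddr, *_ = struct.unpack("<8I", ph)
    # The alignment rule: address and file offset agree modulo p_align.
    assert p_vaddr % PAGE == p_offset % PAGE

print("padding bytes:", padding)
```

With 100 bytes of code and a 6-byte string, this layout
writes 3880 bytes of padding before the data. And the
data's address depends entirely on where the file layout
puts it, which is the whole problem.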

If I put a string into "data" memory in my running Meow5
interpreter and reference it by address in a program,
there is no EASY way to put that same string in a
different location in an exported ELF executable of the
same program.

I can likely keep them at the *same* address by padding
the exported executable out the right amount (tricky!)
to get everything to align. Or I could even try to
_translate_ the addresses when written (tricky and fatal
to the program if done incorrectly).

As a matter of fact, having more than one segment seems
to require a certain amount of padding anyway - I'm
still a bit fuzzy about segment alignment and I should
probably do some experimentation to get it 100% clear.

With the benefit of a better feel for the problem and
time to think about it, I conclude that having the
string live 'inline' with the word that uses it and
ending up with multiple copies of that string will:

    1. Result in a *much* smaller ELF file anyway
       (padding to 0x1000 is way bigger than a couple
       copies of a small string!)
    2. Be perfectly in line with the philosophy of this
       silly project anyway - plus, it will even show up
       better in a hex dump when you can SEE five copies
       of the "Meow" string in memory!

So there we have it: I'm gonna do "inline" strings.

This log has been super weird. I'm going to start the
next one with a TODO list for making those strings work!
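
As a rough idea of what an "inline" string could look
like in 32-bit x86 machine code, here's a Python sketch
that builds the bytes: a short jump over the string, the
string itself, then pushes of its address and length.
The opcodes (EB = jmp short, 68 = push imm32, 6A = push
imm8) are real x86 encodings, but the layout, the
function names, and the 0x08048000 base address are
assumptions, not how Meow5 actually does it. Note that
the sketch computes the address fresh for each copy
rather than copying bytes blindly.

```python
import struct

PAGE = 0x1000

def inline_string(text, addr):
    # addr is where these bytes will sit in memory (assumed known).
    data = text.encode("ascii")
    if len(data) > 127:
        raise ValueError("a short jmp can only skip 127 bytes")
    code = bytes([0xEB, len(data)])                 # jmp short over the string
    code += data                                    # the string bytes themselves
    code += b"\x68" + struct.pack("<I", addr + 2)   # push imm32: string address
    code += bytes([0x6A, len(data)])                # push imm8: string length
    return code

def inline_copies(text, count, base=0x08048000):
    # Lay down `count` copies back to back, as inlining would.
    out = b""
    for _ in range(count):
        out += inline_string(text, base + len(out))
    return out

program = inline_copies("Meow. ", 5)

print(len(program), "bytes for", program.count(b"Meow. "), "copies")
print("one page of padding would be", PAGE, "bytes")
```

Each copy costs 15 bytes here (2 for the jump, 6 for the
string, 7 for the two pushes), so all five come to 75
bytes - a tiny fraction of a 0x1000-byte page.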