1 Okay, so here's the plan. I'm going to get Meow5 to the
2 point where it can write out working ELF files that can
3 print the "meow" repetition example that inspired the
4 name of this language/program.
5
6 To do that, I need to:
7
8 1. Write the 'data_area' to 'free' range (where strings
9 are stored) to the ELF file after the executable
10 portion.
11
12 2. Update the ELF program header so it loads the
13 additional data.
14
15 3. Ensure that the compiled executable can print the
16 string from the memory loaded at runtime.
17
18 The last one is the biggest challenge just because I'm
19 not entirely sure how that's normally done.
20
21 I could hack together something that works, but I'm more
22 interested in learning how, say, GCC goes about it.
23
24 Two nights later: Okay, it's just like assembly: data
25 ends up in another segment.
26
27 I also re-read portions of Charles Moore's Programming a
28 Problem-Oriented Language this morning and he talks
29 about how annoying strings are because they're chunks of
30 data that we necessarily intersperse in our programs.
31
32
33 A week later...
34
35
36 I've mulled this over night after night. I'm calling it:
37 Meow5 is a _failure_. Well, it's a complete success as a
38 toy experiment!
39
40 But to answer the question posed by Meow5: No. Inlining
41 does not make everything simple, elegant, and easy.
42
43 It works just fine for code and integers. But it fails
44 with _data_. And strings are the first and most
45 conspicuous example of data.
46
47 Let's look at the three methods I was considering above:
48
49 Whether I have a separate segment or a single segment
50 makes little difference - I can't easily store a string
51 in the running interpreter and also have the same code
52 reference it in a compiled executable without a level of
53 ELF mastery I simply don't possess.
54
55 I could easily store a _relative_ address, but a
56 relative address is instantly wrong when I copy the word
57 it's stored in...
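
A minimal sketch of why a relative address breaks when the word holding it is copied. This is not Meow5 code: the flat 'memory' buffer, the one-byte offset encoding, and the helper names are all assumptions made up for illustration.

```python
# Hypothetical flat memory image holding one string and one "word".
memory = bytearray(64)
text = b"Meow. "
STRING_AT = 40
memory[STRING_AT:STRING_AT + len(text)] = text

def write_word(mem, at, string_at):
    """Store a 'word' as one signed byte: the offset from the word to its string."""
    mem[at] = (string_at - at) & 0xFF

def read_string(mem, at, length):
    """Follow the relative offset stored in the word at `at`."""
    offset = mem[at] - 256 if mem[at] > 127 else mem[at]
    start = at + offset
    return bytes(mem[start:start + length])

# The word lives at address 8 and points 32 bytes ahead, at the string.
write_word(memory, 8, STRING_AT)
original = read_string(memory, 8, len(text))

# "Inline" the word by copying its raw byte to address 16.
# The stored offset is unchanged, so it now lands 8 bytes past the string.
memory[16] = memory[8]
copied = read_string(memory, 16, len(text))

print(original)  # the string
print(copied)    # whatever happens to sit at the wrong address
```

The copy is byte-for-byte identical, which is exactly the problem: the offset only meant something at the word's original location.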
58
59 I could also invent mechanisms to update a relative
60 address when a word is inlined, but that does not
61 exactly scream "simplicity and elegance" to me. I'd
62 rather put that effort into something more interesting.
63
64 The alternative is to treat strings the way Moore
65 describes in PaPOL (mentioned above): write them inline
66 in the dictionary with an instruction to skip over them
67 when executing and another instruction to push their
68 addresses onto the stack so they can be referenced.
69 (Forth also pushes the length, a "counted string".)
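
Here is a rough sketch of what such a jump-over-me string could look like as 32-bit x86 machine code. The function name, the load address, and the choice to push the address first and the length second are assumptions for illustration, not how Meow5 or any particular Forth lays it out. The opcodes themselves (EB for a short jump, 68 for push imm32) are standard x86.

```python
import struct

def inline_string(text: bytes, load_addr: int) -> bytes:
    """Emit: jmp over the string, the string itself, push address, push length.

    `load_addr` is the (hypothetical) address where the first byte of this
    sequence will live at runtime.
    """
    if len(text) > 127:
        raise ValueError("a short jump can only skip up to 127 bytes")
    jump = bytes([0xEB, len(text)])           # jmp short: skip len(text) bytes
    string_addr = load_addr + len(jump)       # string starts right after the jump
    push_addr = b"\x68" + struct.pack("<I", string_addr)  # push imm32 (address)
    push_len = b"\x68" + struct.pack("<I", len(text))     # push imm32 (length)
    return jump + text + push_addr + push_len

code = inline_string(b"Meow. ", 0x08048100)
print(code.hex(" "))
```

Note that the pushed address is absolute in this sketch, so each inlined copy would have to be emitted with the address of its own destination.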
70
71 That is particularly horrible for Meow5 for two reasons:
72
73 1. I'll end up with five copies of the string "Meow. "
74 in my canonical example program.
75
76 2. I'll have a jump instruction in every single word to
77 skip over the string.
78
79 I can live with the first problem. It highlights how
80 ridiculous the data problem is, but it's merely
81 wasteful.
82
83 But the second problem isn't okay. One of the driving
84 factors of "inline everything" was to _avoid_ jumps. Now
85 I've gotta jump over every string...it just ruins it.
86
87 I keep thinking there's gotta be some escape hatch, but
88 everything I think of won't fly because it's hopelessly
89 complicated OR it requires some sort of
90 linker/loader-level wizardry that is no doubt possible,
91 but that energy would be better spent on a less silly
92 toy thingy.
93
94 So here's what I'm gonna do:
95
96 1. I'll do those dang jump-over-me strings.
97 2. I'll make sure I can write a stand-alone meow5
98 executable that prints "Meow. Meow..."
99 3. Publish an obituary for the project. :-)
100
101 A week later...
102
103 I couldn't do it. I'm totally okay with the conclusion
104 that meow5 is not a viable way to build software...but
105 NOT because I couldn't figure out how to make ELF
106 segments!
107
108 If I can put data in a specific spot in memory while
109 running the interpreter and then count on that data
110 being in the same spot (virtual address, obviously) when
111 the ELF executable runs, then it'll work. That was the
112 original plan.
113
114 So I "gave up" making the segments in meow5...but
115 there's no reason I can't experiment in a much easier
116 environment. And I've already got my 'mez' repo for
117 reading elves with Zig, so why not do the same for writing
118 them? What I need to be able to do is iterate quickly
119 and test theories quickly.
120
121 Okay, now I'm all excited again! Time to put on the
122 tool-making cape and hat and figure out The Mystery of
123 the Elf File Offsets.
124
125 +-----------------------------------------------------+
126 | |
127 | Five Months Later |
128 | |
129 +-----------------------------------------------------+
130
131 Wow, so, uh, here I am again. Hello.
132
133 I've actually been extremely productive on this project
134 for the last couple weeks. Indeed, I did the Zig
135 experimentation in the 'mez' repo as described above.
136
137 Specifically, I made the 'mez' program display EXACTLY
138 what I needed from a 32-bit Linux ELF executable file to
139 debug my problems (and made it quite nice to look at, if
140 I say so myself).
141
142 Then I made a companion 'zem' program that writes a
143 two-segment executable (has two LOAD program headers
144 that create two memory segments - one with the
145 executable program and one with data).
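
For reference, a small sketch of what two LOAD program headers look like in a 32-bit little-endian ELF file. This is not the 'zem' source (that is written in Zig); it is a Python illustration, and the offsets, addresses, and sizes are made-up values. The field order and the PT_LOAD / PF_* constants follow the standard Elf32_Phdr layout.

```python
import struct

PT_LOAD = 1                   # segment type: loadable
PF_X, PF_W, PF_R = 1, 2, 4    # segment permission flags

def phdr(offset, vaddr, size, flags, align=0x1000):
    """Pack one Elf32_Phdr: type, offset, vaddr, paddr, filesz, memsz, flags, align."""
    return struct.pack("<8I", PT_LOAD, offset, vaddr, vaddr, size, size, flags, align)

# Hypothetical layout: code in the first page of the file, data in the second.
code_hdr = phdr(0x0000, 0x08048000, 0x0100, PF_R | PF_X)
data_hdr = phdr(0x1000, 0x08049000, 0x0020, PF_R | PF_W)

print(len(code_hdr), len(data_hdr))  # each header is 8 four-byte fields
```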
146
147 As is often the case, *writing* the programs armed me
148 with the intimate understanding of the problem which has
149 rendered the programs themselves largely superfluous.
150
151 And this intimacy has drawn me to an uncomfortable
152 conclusion:
153
154 THERE'S NO EASY WAY TO MOVE ADDRESSES AROUND :-(
155
156 It's the problem I wrote about above, but now with all
157 of the hand-wavy stuff gone.
158
159 If I put a string into "data" memory in my running Meow5
160 interpreter and reference it by address in a program,
161 there is no EASY way to put that same string in a
162 different location in an exported ELF executable of the
163 same program.
164
165 I can likely keep them at the *same* address by padding
166 the exported executable out the right amount (tricky!)
167 to get everything to align. Or I could even try to
168 _translate_ the addresses when written (tricky and fatal
169 to the program if done incorrectly).
170
171 As a matter of fact, having more than one segment seems
172 to require a certain amount of padding anyway - I'm
173 still a bit fuzzy about segment alignment and I should
174 probably do some experimentation to get it 100% clear.
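
The alignment rule as I understand it from the ELF specification: for a loadable segment, the file offset and the virtual address should be equal modulo the alignment. A quick sketch of checking that and computing the padding needed, with hypothetical helper names and example numbers:

```python
def is_loadable(offset, vaddr, align=0x1000):
    """True when file offset and virtual address agree modulo the alignment."""
    return offset % align == vaddr % align

def padding_to_match(offset, vaddr, align=0x1000):
    """Bytes of padding to add before `offset` so it lines up with `vaddr`."""
    return (vaddr - offset) % align

# Hypothetical: code ends at file offset 0x120, data should load at 0x08049000.
end_of_code = 0x120
data_vaddr = 0x08049000

pad = padding_to_match(end_of_code, data_vaddr)
print(hex(pad))                                     # padding needed
print(is_loadable(end_of_code, data_vaddr))         # before padding
print(is_loadable(end_of_code + pad, data_vaddr))   # after padding
```

If the data address is free to move, picking a virtual address that already matches the offset would avoid the padding, but that is not an option when the address has to stay the same as in the interpreter.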
175
176 With the benefit of a better feel for the problem and
177 time to think about it, I conclude that having the
178 string live 'inline' with the word that uses it and
179 ending up with multiple copies of that string will:
180
181 1. Result in a *much* smaller ELF file anyway
182 (padding to 0x1000 is way bigger than a couple
183 copies of a small string!)
184 2. Be perfectly in line with the philosophy of this
185 silly project anyway - plus, it will even show up
186 better in a hex dump when you can SEE five copies
187 of the "Meow" string in memory!
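
Some back-of-the-envelope arithmetic for point 1, assuming a made-up code size of 0x120 bytes, a two-byte jump in front of every inline copy, and a data segment that gets padded out to the next 0x1000 boundary:

```python
PAGE = 0x1000
text = b"Meow. "
copies = 5
code_size = 0x120        # hypothetical size of the executable portion
jump_size = 2            # assumed: one short jump per inline string

# Inline: every copy carries the string plus its jump.
inline_cost = copies * (len(text) + jump_size)

# Separate segment: one copy of the string, plus padding up to the page boundary.
padding = -code_size % PAGE
separate_cost = len(text) + padding

print(inline_cost, separate_cost)
```

With these numbers the inline version costs tens of bytes and the padded version costs thousands.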
188
189 So there we have it: I'm gonna do "inline" strings.
190
191 This log has been super weird. I'm going to start the
192 next one with a TODO list for making those strings work!