I left myself a nice easy one to start this log:

        [x] Make all words take params from the stack, not
            from pre-defined registers.

    Which ought to be simple: just push the values
    I need before calling the function. Then have
    the function pop the values into the registers
    and off we go.

    +----------------------------------------------------+
    |   NOTE: I'm still using call/ret to use the        |
    |   'find' and 'inline' words when the program       |
    |   initially runs. I've got a bit of a              |
    |   chicken-and-egg problem here because without a   |
    |   return, these won't seamlessly move on to the    |
    |   next instructions when they're done and I        |
    |   can't find or inline them because THEY are       |
    |   'find' and 'inline'!                             |
    |                                                    |
    |   I feel like that will be solved when I've got    |
    |   more of the interpreter or REPL in place. If     |
    |   not, I've got a puzzle on my hands. For the      |
    |   moment, things are just a bit...messy.           |
    +----------------------------------------------------+

    Well, I thought that was going to be easy. I mean, it is
    pretty easy. But it has a few snags I hadn't yet
    considered.

    It turns out, using the same stack for call/ret return
    address storage AND for passing values between functions
    in a truly concatenative manner gets real complicated
    real quick. And since I was using call/ret temporarily
    anyway, I have zero desire to do anything fancy to make
    it work.

    So I'm going to basically do my own return by storing a
    return address and jumping to it at the end of both
    'inline' and 'find'.

    I'm making a new variable in BSS to hold my return
    address. I only need one, not a stack, because I'm not
    making any nested calls.

        temp_return_addr: resb 4

    I'll put in a mockup of the code to get the assembled
    instruction lengths right (I hope) so I can figure out
    the address we just jump back to as a "return". (I'm
    pretty sure I can't just store the instruction pointer
    register because that'll be a point before the "call"
    jump and then I'll have an infinite loop, right?)

    I'll use NASM's listing feature for that. It comes out
    super wide (well, compared to the 60 columns I give
    myself on my little split screen setup!), so I'll see if
    I can reformat it enough to fit here:

152 00DA 68[0600]         push temp_meow_name
153 00DD 66C706[0904]-    mov dword [temp_return_addr], $
153 00E2 [DD000000]
154 00E6 EB8A             jmp find
155 00E8 6650             push eax

    The listing is so fun to look at and I find it almost
    fetishisticly beautiful. I mean, I've had all of these
    _questions_ about how all of this actually works and
    here, if you can read them, are all of the _answers_. I
    mean, I know the CPU still has secrets down below even
    this machine code layer. But for the application
    programmer, this is _it_. This is the bedrock upon which
    we lay all of our hopes and dreams. In the hex column on
    the left are the real instructions, no longer hidden by
    mnemonics or symbols.

    Anyway, where was I?

    Oh, yeah, so '$' is NASM for "the address at the
    beginning of this line". Which is very handy. And that's
    exactly what's gonna be put into temp_return_addr:

        0904 is little-endian for temp_return_addr at 0409
             (which I could see further up my listing)

        DD000000 is the address returned by $
                 (again, in little-endian)

    And assuming the assembled code won't change, it looks
    like I want my return address to be an additional...

        E8 - DD

    ...bytes. Which, uh, I'll ask my cat to subtract for me.
    
    Hmm. No, he purred, but was not forthcoming with the
    answer. Okay, how about dc?

        $ dc
        16 i
        e8 dd - p
        dc: 'e' (0145) unimplemented
        E8 DD - p
        11

    Okay, so dc hissed at me once for not entering the hex
    values in upper case. So score one point for my cat. But
    then it gave me the correct answer after that, so score
    one point for dc. Looks like this match is even.

    So I wanna add 11 bytes to my return addresses.

    Here's the new listing:

153 00DD 66C706[0904]-    mov dword [temp_return_addr], ($ + 11)
153 00E2 [E8000000]
154 00E6 EB8A             jmp find 
155 00E8 6650             push eax 

    Looks right to me, we want to jump ("ret") back to 00E8
    after the jump ("call") to find.

    Of course, this seems super fragile, but it's also super
    temporary. Let's just see if it works...

    Okay, dang it, a segfault. My changes have required
    another change and now the addresses are a little
    different, but the 11 bytes should still be the same:

154 000000DF 66C706[0904]- mov dword [temp_return_addr], ($ + 11)
154 000000E4 [EA000000]         
155 000000E8 EB88          jmp find
156 000000EA 6650          push eax

    Let's try it now:

(gdb) break find.found_it 
Breakpoint 1, find.found_it () at meow5.asm:116
116	    mov eax, edx  ; pointer to tail of dictionary word
(gdb) p/a $edx
$1 = 0x8049030 <meow_tail>

	So far so good. Now the return jump?

(gdb) s
117	    jmp [temp_return_addr]
(gdb) p/a (int)temp_return_addr 
$2 = 0x80490df <inline_a_meow+16>

	And just where might that be, exactly?


0x080490d4 <+5>:	c7 05 19 a4 04 08 df 90 04 08	movl
                                           $0x80490df, 0x804a419
0x080490de <+15>:	eb 88	jmp    0x8049068 <find>
0x080490e0 <+17>:	50	push   %eax
0x080490e1 <+18>:	e8 57 ff ff ff	call   0x804903d <inline>

	Hmmm...looks off by 1. 0x80490df points to the second
    byte of the jmp find instruction...

Program received signal SIGSEGV, Segmentation fault.
0x080490df in inline_a_meow () at meow5.asm:155
155	    jmp find           ; answer will be in eax

    Yeah. So... 12 bytes?

    And to think, I waxed all poetic about the NASM listing.
    I don't know how to explain the byte discrepancy. Let's
    see if this works:

(gdb) break find.found_it 
Breakpoint 1, find.found_it () at meow5.asm:116
116	    mov eax, edx  ; pointer to tail of dictionary word
(gdb) s
117	    jmp [temp_return_addr]
(gdb) s
inline_a_meow () at meow5.asm:156
156	    push eax            ; put it on the stack for inline

	Yes! But that was evidently even _more_ fragile than I'd
    expected. So I'll just bit the bullet and hold my nose
    and use some temporary labels. It's still quite
    compact, so I'll just paste it here:

            push temp_meow_name ; the name string to find
            mov dword [temp_return_addr], t1
            jmp find           ; answer will be in eax
        t1: push eax            ; put it on the stack for inline
            mov dword [temp_return_addr], t2
            jmp inline
        t2: dec byte [meow_counter]
            jnz inline_a_meow

            ; inline exit
            push temp_exit_name ; the name string to find
            mov dword [temp_return_addr], t3
            jmp find           ; answer will be in eax
        t3: push eax            ; put it on the stack for inline
            mov dword [temp_return_addr], t4
            jmp inline
        t4:
            ; Run!
            push 0           ; push exit code to stack for exit
            jmp data_segment ; jump to the "compiled" program

    Does it work?

dave@cygnus~/meow5$ mr
Meow.
Meow.
Meow.
Meow.
Meow.

    Yes!

    One last thing, now - 'find' is still leaving its answer
    in the eax register. If I have it push the answer to the
    stack instead, 'inline' will pop it and have what it
    needs - no need for that "push eax" beween the two
    functions/words (at labels t1 and t3 above).

    Now find.not_found and find.found_it push their return
    values on the stack:

        .not_found:
            push 0   ; return 0 to indicate not found
            jmp [temp_return_addr]

        .found_it:
            push edx ; return  pointer to tail of dictionary word
            jmp [temp_return_addr]

    And the calls simply flow one after the other without
    any explicit data passing:

            jmp find
        t1: mov dword [temp_return_addr], t2
            jmp inline
        t2: ...

    And does _that_ work?

dave@cygnus~/meow5$ mr
Meow.
Meow.
Meow.
Meow.
Meow.

    Yes, and now I can check that little box at the top of
    this log. We're doing pure stack-based concatenative
    programming now.

    Next step:

        [x] Parse the string "meow meow meow meow meow exit"
            as a program (pretend we're already in "compile
            mode" and we're gathering word tokens and
            compiling them) and execute it.

    It begins! Here's the string in the .data segment:

        input_buffer_start:
            db 'meow meow meow meow meow exit', 0
        input_buffer_end:

    And here's the .bss segment "variables":

        token_buffer: resb 32    ; For get_token
        input_buffer_pos: resb 4 ; Save position of read tokens

    Yup, just 32 chars for token names (well, 31 because I'm
    null-terminating the string). Hey, it's my language. Ha
    ha, I can always bump this up later. But 31 is actually
    quite long, you know?

        abcdefghijklmnopqrstuvwxyz01234

    I've created a word called 'get_token' which will do the
    job of both 'WORD' and 'KEY' in Forth. And I was just
    about to 'call' it to test it, but I can't bear to put
    in another manual temporary label 

    So, it's macro time!

        %macro CALLWORD 1
                mov dword [return_addr], %%return_to
                jmp %1
            %%return_to:
        %endmacro

    And it should be super easy to use. First, I'll test my
    temporary "manual" 'meow' and 'exit' inlines to make
    sure it works. They're about to go away, but they'll
    make a good test.

    Look at how clean the 'exit' one is:

        push temp_exit_name ; the name string to find
        CALLWORD find
        CALLWORD inline

     But does it work?

dave@cygnus~/meow5$ mr
Meow.
Meow.
Meow.
Meow.
Meow.

    First try! No way. I mean, of _course_ it worked first
    try and I never doubted it would.

    Okay, now let's get into this get_token function:

(gdb) break get_next_token 
Breakpoint 1 at 0x804912e: file meow5.asm, line 189.
(gdb) r
Starting program: /home/dave/meow5/meow5 

Breakpoint 1, get_next_token () at meow5.asm:189
189	        mov dword [return_addr], %%return_to ; CALLWORD
190	        jmp %1                               ; CALLWORD
150	    mov ebx, [input_buffer_pos] ; set input read addr
151	    mov edx, token_buffer       ; set output write addr
152	    mov ecx, 0                  ; position index
154	    mov al, [ebx + ecx] ; input addr + position index
155	    cmp al, 0           ; end of input?
(gdb) p/c $al
$2 = 109 'm'

    Nice! So the 'm' from the first 'meow' has been
    collected so far. Now the rest of the token...

(gdb) break 155
Breakpoint 2 at 0x80490c7: file meow5.asm, line 155.
(gdb) c
155	    cmp al, 0           ; end of input?
(gdb) p/c $al
$3 = 101 'e'
...
$4 = 111 'o'
$5 = 119 'w'
$6 = 32 ' '

    We have 'meow' and the space should be our token
    separator.

155	    cmp al, 0           ; end of input?
156	    je .end_of_input    ; yes
157	    cmp al, ' '         ; token separator? (space)
158	    je .return_token    ; yes
170	    add [input_buffer_pos], ecx ; save input position
171	    mov [edx + ecx], byte 0     ; terminate str null

    Looks good. Did we collect 4 characters as expected?

(gdb) p $ecx
$7 = 4

    Yup. Then get_token will "return" the token string
    address so 'find' can use it to find the 'meow' word:

172	    push DWORD token_buffer     ; return str address
173	    jmp [return_addr]
219	    cmp DWORD [esp], 0  ; check return without popping
220	    je run_it           ; all out of tokens!
189	        mov dword [return_addr], %%return_to ; CALLWORD
190	        jmp %1                               ; CALLWORD
96	    pop ebp ; first param from stack!
find () at meow5.asm:99
99	    mov edx, [last]

    Okay, the execution looks right. And did we pass the
    address correctly on the stack?

(gdb) p $ebp
$9 = (void *) 0x804a43d <token_buffer>

    Yup! And does it contain the expected 'meow' token?

(gdb) x/s $ebp
0x804a43d <token_buffer>:	"meow"

    Nice!

    I'm going to assume 'find' and 'inline' will work
    correctly. Let's see if we can get the next token from
    the input string:

(gdb) c
Continuing.
Breakpoint 2, get_token.get_char () at meow5.asm:155
155	    cmp al, 0           ; end of input?

    Alright, we're back in get_token. This should be the
    first character of the second 'meow' token:

(gdb) p/c $al
$10 = 32 ' '

    Uh oh. That doesn't look right. I'll continue anyway...

(gdb) c
Continuing.

Program received signal SIGSEGV, Segmentation fault.
inline () at meow5.asm:76
76	    mov ecx, [esi + 4] ; get len into ecx

    Yeah, that makes sense. 'find' will have
    failed to find the '' token and then 'inline' crashes when
    trying to read from address 0 (the null pointer return
    value from 'find').

    The best way to handle this is probably to ignore any
    leading spaces - that will not only be useful later, it
    will take care of this current character problem.

    In a higher-level language, I might choose to do this
    with nested logic, like so:

        if (char === ' ')
            if (token.len > 0)
                'eat' space (move to next input char)
            else
                return the token
            end
        end

    But in assembly, this all gets flattened. It's a
    surprisingly interesting exercise to formulate the logic
    in terms of jumps. (At least at first. I'm sure the
    novelty wears off after a while.)

    Anyway, here's my solution:

            cmp al, ' '         ; token separator? (space)
            jne .add_char       ; nope! get char
            cmp ecx, 0          ; yup! do we have a token yet?
            je .eat_space       ; no
            jmp .return_token   ; yes, return it
        .eat_space:
            inc ebx             ; 'eat' space by advancing input
            jmp .get_char

    I'll make sure that works in GDB. I changed the input
    string to:

        db ' meow   meow meow meow meow exit', 0

    with a leading space and two spaces before the second
    meow token. That'll make it easy to test:

155	    cmp al, 0           ; end of input?
(gdb) p/c $al
$1 = 32 ' '
157	    cmp al, ' '         ; token separator? (space)
158	    jne .add_char       ; nope! get char
162	    cmp ecx, 0
163	    je .eat_space
171	    inc ebx
172	    jmp .get_char

    Yup! That line 171 is my "eat the leading space" action
    and now we should get the 'm' in "meow" and store it:

154	    mov al, [ebx + ecx] ; input addr + position index
155	    cmp al, 0           ; end of input?
156	    je .end_of_input    ; yes
(gdb) p/c $al
$2 = 109 'm'
157	    cmp al, ' '         ; token separator? (space)
158	    jne .add_char       ; nope! get char
175	    mov [edx + ecx], al ; write character

    Yeah. This is looking good.

    You know what? I'm just gonna go for it:

dave@cygnus~/meow5$ mr
Meow.
Meow.
Meow.
Meow.
Meow.

    Yes! So that's another item checked off!

    This is really coming along.

    At nearly 500 lines, this log is complete. I'll see you
    in the next one, log04.txt. :-)