nasmjf

A NASM assembler port of JONESFORTH
git clone http://ratfactor.com/repos/nasmjf/nasmjf.git

nasmjf/devlog/log09.txt

Okay, since the last log, the var_HERE typo has been
fixed. CREATE is now getting the address _at_ HERE, not
the address _of_ HERE. We'll test that in this log.

And the memory allocator is in. I moved it from the bottom
where Jones had it up into _start where it's used. I get
why it could be considered distracting from the Forth
concepts. But it's short and important and this is my
port, so up it goes.

We'll start by trying it out:

    _start () at nasmjf.asm:101
    101         xor ebx, ebx
    102         mov eax, [__NR_brk] ; syscall brk

    Program received signal SIGSEGV, Segmentation fault.

Oops! __NR_brk is a NASM preprocessor-assigned value,
not a runtime variable, so the brackets tried to load
from that number as if it were an address!

    (gdb) load nasmjf
    `/home/dave/nasmjf/nasmjf' has changed; re-reading symbols.
    (gdb) r
    Starting program: /home/dave/nasmjf/nasmjf
    _start () at nasmjf.asm:101
    101         xor ebx, ebx
    102         mov eax, __NR_brk   ; syscall brk
    103         int 0x80
    (gdb) p/x $eax
    $1 = 0x804e000

Okay, so we now have our old "break" address. By the way,
see the source for my explanation of how the brk syscall
works. A lot of man pages, web pages, and discussions are
about the C stdlib brk() and sbrk(), but those are NOT
identical in usage to the syscall!

Then we request a new break address which is 0x16000
bytes "larger" than the old one. When we do this, Linux
reserves the memory in between for us!

    104         mov [var_HERE], eax ; eax has start addr of data segment
    105         add eax, 0x16000    ; add our desired number of bytes to break addr
    106         mov ebx, eax        ; reserve memory by setting this new break addr
    107         mov eax, __NR_brk   ; syscall brk again
    108         int 0x80
    (gdb) p/x $eax
    $2 = 0x8064000

That looks right: getting the new address back means the
request succeeded.

    previous break addr: 0x804e000
                       +   0x16000
         new break addr: 0x8064000

Now the rest of the startup continues.
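
The two brk calls above lean on the raw syscall's return
convention: on success it returns the new break, and on
failure (which includes asking for address 0) it returns
the current break unchanged. A toy Python model of that
convention follows. It is only a sketch: FakeBreak and its
limit value are made up for illustration, and nothing here
touches the real syscall.

```python
# Toy model of the raw Linux brk syscall's return convention.
# NOT the real syscall and NOT code from nasmjf. The limit is a
# hypothetical stand-in for whatever the kernel would allow.
class FakeBreak:
    def __init__(self, current, limit):
        self.start = current    # model: the break may not move below this
        self.current = current
        self.limit = limit      # hypothetical upper bound

    def brk(self, addr):
        # Success: move the break to the requested address.
        if self.start <= addr <= self.limit:
            self.current = addr
        # Either way, report where the break is now. On failure
        # (including addr == 0) that is simply the unchanged break.
        return self.current


kernel = FakeBreak(current=0x804E000, limit=0x9000000)
old = kernel.brk(0)               # like ebx = 0: ask where the break is
new = kernel.brk(old + 0x16000)   # request 0x16000 more bytes
print(hex(old), hex(new), new == old + 0x16000)
# prints: 0x804e000 0x8064000 True
```

Comparing the returned address with the requested one is
how a caller of the raw syscall can tell success from
failure, which is the check done by eye in the session
above.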

    112         mov esi, cold_start
    27          lodsd     ; NEXT: Load from memory into eax, inc esi to point to next word.
    28          jmp [eax] ; Jump to whatever code we're now pointing at.

It's time to see if the new memory allocation and the
var_HERE fix are working properly together to allow the
creation of new words in the dictionary.

    (gdb) break code_CREATE
    Breakpoint 3 at 0x8049251: file nasmjf.asm, line 559.
    (gdb) c
    Continuing.
    : FIVE 5 ;

    Breakpoint 3, code_CREATE () at nasmjf.asm:559
    559         pop ecx             ; length of word name
    560         pop ebx             ; address of word name
    563         mov edi, [var_HERE] ; the address of the header
    (gdb) p/x (int)var_HERE
    $4 = 0x804e000

Excellent, the address at HERE looks like the start of
the space we reserved (the original "break" address
and the new break address mark the start and end of the
data section we've reserved).

Now we're going to store the link to the last dictionary
word entry in LATEST as the first 4 bytes of the header
of the new FIVE word we're compiling right now.

Right now LATEST should point to its own header (that's
how I chose to do it), which is labeled "name_LATEST":

    564         mov eax, [var_LATEST] ; get link pointer
    565         stosd                 ; and store it in the header.
    (gdb) p/x $eax
    $6 = 0x804a3ac
    (gdb) info sym $eax
    name_LATEST in section .data of /home/dave/nasmjf/nasmjf

So far so good. We'll see if it stores it correctly in a
moment.
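
What stosd does with that link pointer can be modeled in a
few lines of Python. This is a sketch, not code from the
port: the bytearray stands in for the reserved memory, the
stosd helper is a made-up name, and it assumes the
direction flag is clear so edi moves forward.

```python
import struct

# Sketch only: a bytearray stands in for the reserved data segment.
BASE = 0x804E000      # the break address seen in this session
mem = bytearray(16)   # hypothetical tiny window onto that memory


def stosd(mem, edi, eax):
    """Store eax as 4 little-endian bytes at edi and return edi + 4."""
    struct.pack_into("<I", mem, edi - BASE, eax)
    return edi + 4


edi = stosd(mem, BASE, 0x0804A3AC)   # the link to name_LATEST
print(mem[:4].hex(), hex(edi))
# prints: aca30408 0x804e004
```

The bytes land least-significant first because x86 is
little-endian, and edi is left pointing at the next free
byte of the header.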
Now we store the rest of the header:

    -- Header With Name --
    4 bytes   - link to previous word     <--- done
    1 byte    - length of name + flags
    N bytes   - the ascii characters of the name
    0-3 bytes - possible empty space for 4 byte alignment
    -- Code Body --
    <link to DOCOL to "interpret" the rest>
    <the rest of the word addresses>

Neither the header nor the body symbols (name_FIVE,
code_FIVE) will exist in GDB since they're not written
in NASM, so there aren't any symbols for them in the
DWARF2 debugging information in the executable.
From now on, we're making words with real Forth!

    568         mov al, cl          ; Get the length.
    569         stosb               ; Store the length/flags byte.
    570         push esi
    571         mov esi, ebx        ; esi = word
    572         rep movsb           ; Copy the word
    573         pop esi
    574         add edi, 3          ; Align to next 4 byte boundary. See TCFA
    575         and edi, ~3

Okay, let's see if the header is correct. First, HERE
should still be pointing to the beginning of the new
word's header because we haven't updated it yet.

And the very first thing in the header should be a link
to the previous word in the dictionary.

    (gdb) x/xw (int)var_HERE
    0x804e000:      0x0804a3ac

Yup, that looks like the address of name_LATEST we saw
earlier.

Next is the length plus flags. In this case, just the
length, which should be 4 for the four characters in the
name "FIVE".

    (gdb) x/xb (int)var_HERE + 4
    0x804e004:      0x04

Excellent, and finally, we should have the string "FIVE"
stored as ascii characters in the next four bytes.

    (gdb) x/4cb (int)var_HERE + 5
    0x804e005:      70 'F'  73 'I'  86 'V'  69 'E'

Bingo!

Now CREATE updates HERE to point at the address after
the header (aligned to 4 bytes) and LATEST to point to
the header of our new word.

    578         mov eax, [var_HERE]
    579         mov [var_LATEST], eax
    580         mov [var_HERE], edi

Now our old pal NEXT will be moving on to the next word
in COLON to continue the compilation process.

Here's the entire definition of COLON:

    DEFWORD ":",1,,COLON
    dd FWORD                 ; Get the name of the new word
    dd CREATE                ; CREATE the dictionary entry / header
    dd LIT, DOCOL, COMMA     ; Append DOCOL (the codeword).
    dd LATEST, FETCH, HIDDEN ; Make the word hidden while it's being compiled.
    dd RBRAC                 ; Go into compile mode.
    dd EXIT                  ; Return from the function.

So it looks like LIT is next.

    27          lodsd     ; NEXT: Load from memory into eax, inc esi to point to next word.
    28          jmp [eax] ; Jump to whatever code we're now pointing at.
    code_LIT () at nasmjf.asm:493
    493         lodsd     ; loads the value at esi into eax, increments esi
    494         push eax  ; push the literal number on to stack

Yup! Well, this has been great progress. The header for
our new word has been stored in memory we reserved.

I keep falling asleep, so the next log will pick up
where this left off. Then I can figure out what the heck
LIT is supposed to be accomplishing here.
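
As a recap, the CREATE steps traced in this log can be
modeled in a few lines of Python. This is only a sketch
under stated assumptions, not code from the port: the
create function, mem, and BASE are made-up names, and a
bytearray stands in for the memory reserved with brk.

```python
import struct

BASE = 0x804E000         # start of the reserved region in this session
mem = bytearray(0x100)   # hypothetical small stand-in for that region


def create(mem, here, latest, name):
    """Write a dictionary header at `here`; return (new_here, new_latest)."""
    edi = here
    struct.pack_into("<I", mem, edi - BASE, latest)  # link pointer (stosd)
    edi += 4
    mem[edi - BASE] = len(name)                      # length/flags byte (stosb)
    edi += 1
    mem[edi - BASE:edi - BASE + len(name)] = name    # name bytes (rep movsb)
    edi += len(name)
    edi = (edi + 3) & ~3                             # round up to a multiple of 4
    return edi, here                                 # HERE = edi, LATEST = old HERE


here, latest = create(mem, BASE, 0x0804A3AC, b"FIVE")
print(hex(here), hex(latest))
# prints: 0x804e00c 0x804e000
```

In this model the header takes 4 + 1 + 4 = 9 bytes, so the
add-3-then-mask step rounds the end up to offset 12, and
the first nine bytes of mem match what the x commands
showed: the link, then 0x04, then "FIVE".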