Nerd Sniped
My “friends” have well and truly nerd-sniped me. We’re going to digress to look at Forth’s tiny primitives that can be used to build words like IF-ELSE-THEN and such. Spoiler: A lot of thinking, and then a tiny but significant SUCCESS!
It all started, doctor, when Hill asked me whether he would be able to define new immediate words with this little Forth thing that I’m building. We kicked it around briefly, and decided that it wouldn’t be too hard. But then Jason on Mastodon said:
IIRC, words like : ; IF ELSE and THEN were usually written in Forth using words from the tiny core of primitives. I’m not suggesting you should do that, but remembering that there could be words for reading from the token stream might be helpful. Immediate words now give me the vibe of macros or special forms in lisp. And I think you can use them to add new syntax to Forth.
So that sent me back to “the literature”. I couldn’t find much of use in Loeliger, although it may be there somewhere, but on forth.com, I’ve found some information sprinkled among its Starting Forth pages.
So this morning we’ll mostly examine some of those primitive words and speculate about how we might use them in our Forth. I think this will pay off in understanding, and it may well enable us to improve our design a bit.
Initial Understanding
The following discussion is certainly wrong but probably somewhat near the mark:
I am somewhat aware of some of the primitives. There are some with ‘@’ in their name, and they generally convert an address on the stack to the contents of that address. And some with ‘!’ in their name take two stack items, one of them an address, and store the other at that address.
The comma operator, ‘,’ stores a value in the next available word in the Forth dictionary. You’d use it to put things into a definition, I imagine. There is a related variable, CP, which is a variable that points to the next cell in the dictionary. And HERE is equal to CP @, i.e. the current address of the next available cell.
There’s the tick, “’”, which looks up the next word from the input stream in the dictionary and provides its “execution token”, or “xt”, which I believe to be the address of the word’s code definition.
There is the related “[’]”, which looks up the word in the definition we’re currently building and compiles it into the output. I freely grant that I do not grasp these ideas. I list them here because they are surely going to be important.
(Much of the above is gleaned from Under the Hood on forth.com.)
In a classical Forth, each dictionary entry has a pointer to code that is particular to the type of thing it defines, constants, variables, primary words, secondary words, etc. In the case of a secondary word, the code pointed to is, I think, code that pushes a return onto the return stack and then begins to execute the words in the definition. I think it is common to have the last word executed always be the same, a word that pops the return stack.
In that classical forth, the code bits are executed simply by branching to them, not calling them like a subroutine or method. The returns are managed explicitly in the particular bits one links to. This is why it’s called “threaded code”: things are tied together by threads (links, pointers) which are traced one after another.
How Might This Affect Us?
Let’s look at how some of our current primitives might be built from something smaller. Here’s IF-THEN-ELSE, which is pretty simple and might be our first attempt if we try this:
@staticmethod
def _define_if_else_then(lex):
def _if(forth):
forth.compile_conditional('*IF', forth.word_list)
def _else(forth):
forth.patch_the_skip(['*IF'], 1, forth.word_list)
forth.compile_conditional('*ELSE', forth.word_list)
def _then(forth):
forth.patch_the_skip(['*IF', '*ELSE'], -1, forth.word_list)
lex.append(PrimaryWord('IF', _if, immediate=True))
lex.append(PrimaryWord('ELSE', _else, immediate=True))
lex.append(PrimaryWord('THEN', _then, immediate=True))
def compile_conditional(self, word_to_compile, word_list):
self.compile_stack.push((word_to_compile, len(word_list) + 1))
word_list.append(self.find_word(word_to_compile))
word_list.append(0)
def patch_the_skip(self, expected, skip_adjustment, word_list):
key, patch_loc = self.compile_stack.pop()
last_loc = len(word_list) + skip_adjustment
word_list[patch_loc] = last_loc - patch_loc
This is a bit hard to decode. I think we may want to change the factoring. Anyway, the code:
flag IF foo ELSE bar THEN
Gets compiled via this sequence:
- IF stacks the current word list index + 1
- IF compiles
*IF 0into the definition - foo gets compiled into the definition
- ELSE unstacks the index and fixes up the zero after the
*IFto skip just past the following: - ELSE compiles
*ELSE 0 - ELSE re-stacks the current word index + 1
- THEN unstacks the index and fixes up the zero after the
*ELSEto skip to where we are.
The resulting code is therefore:
[flag] pop *IF n foo *THEN m bar ...
v----------------^
v----------^
n and m are set to however many words to skip, that is, however many have been compiled between when the IF or ELSE appeared and the closing ELSE or THEN occurred.
A real Forth would do this more simply and elegantly than I did. It has the word HERE, which is the address of the next word in the definition we’re compiling. That’s what I’m doing with my len(word_list)+1
stuff: I’ve just got things in an awkward order. Forth would stack HERE at the right moment: I stack my equivalent. Forth would likely patch in the actual addresses, not the skips like we do here. Then it would compile machine code for something like this:
[flag] pop JZ na foo J ma bar
Where JZ na is the machine code for jump to address na, and J is the unconditional jump to address ma. So the entire conditional compiles into two jump instructions, one conditional and one not.
If we look at our *IF and *ELSE, they are equivalent to that machine code:
lex.append(PrimaryWord('*IF', lambda f: f.star_if()))
lex.append(PrimaryWord('*ELSE', lambda f: f.active_word.skip(f.next_word())))
def star_if(self):
jump = self.next_word()
flag = self.stack.pop()
if not flag:
self.active_word.skip(jump)
The *IF skips if the flag is false, just like the JZ. The *ELSE skips unconditionally just as does the J.
Hill (private communication last year) mentioned a primitive 0BRAN, which I take to be like our JZ above, probably 0BRAN would pop the stack and then branch if the result is zero. Our implementation here has the popping and pushing built into the individual operations. That’s probably how we have to do it.
So would we perhaps define IF something like this:
: IF ' 0BRAN , 0 , HERE R> ;
Where >R means to pop the stack (HERE) and push the result onto the return stack.
Something like that. I’m not clear on when I need to tick and when I don’t. I’ll have to code some of these to figure it out.
Break
I’m going to take a break, do some reading, let my brain settle down. I think we could invent something right now, but I’d rather let the ideas perk and maybe see if I can find anything useful in Loeliger.
Unbreak
Hi, I’m back. Since I was right there, I noticed these two little methods:
def star_if(self):
jump = self.next_word()
flag = self.stack.pop()
if not flag:
self.active_word.skip(jump)
def star_until(self):
# if pop is true, skip one else skip in word + 1
to_check = self.stack.pop()
skip_back = self.next_word()
if to_check == 0:
self.active_word.skip(skip_back)
I have a feeling that exploring these will lead somewhere good. They have a certain, shall we say, similarity. Let’s make more.
def star_if(self):
jump = self.next_word()
flag = self.stack.pop()
if flag == 0:
self.active_word.skip(jump)
def star_until(self):
skip_back = self.next_word()
to_check = self.stack.pop()
if to_check == 0:
self.active_word.skip(skip_back)
Commit: tidying. Let’s make more similarity, just for fun:
def star_if(self):
jump = self.next_word()
if self.stack.pop() == 0:
self.active_word.skip(jump)
def star_until(self):
skip_back = self.next_word()
if self.stack.pop() == 0:
self.active_word.skip(skip_back)
Commit again. OK, enough messing around. I think this is in fact branch if zero, and if Python allowed names to start with 0 I’d name it 0BRAN but no. There is surely only one reference to each of these.
Rename one of them to a more generic name:
def zero_branch(self):
branch_distance = self.next_word()
if self.stack.pop() == 0:
self.active_word.skip(branch_distance)
And connect the other case:
@staticmethod
def define_skippers(lex):
lex.append(PrimaryWord('*#', lambda f: f.stack.push(f.next_word())))
lex.append(PrimaryWord('*IF', lambda f: f.zero_branch()))
lex.append(PrimaryWord('*ELSE', lambda f: f.active_word.skip(f.next_word())))
lex.append(PrimaryWord('*UNTIL', lambda f: f.zero_branch()))
lex.append(PrimaryWord('*DO', lambda f: f.star_do()))
lex.append(PrimaryWord('*LOOP', lambda f: f.star_loop()))
Commit. And remove the second method. Commit again.
We haven’t converted the define above to use the new scheme with embedded local functions, but we soon will.
Look at *ELSE. That is just an unconditional branch, similar to our conditional ones. We may find some use for that again.
What about the *DO and *LOOP ones? They look very simple, especially *DO:
def star_do(self):
start = self.stack.pop()
limit = self.stack.pop()
self.return_stack.push((start, limit))
def star_loop(self):
jump = self.next_word()
index, limit = self.return_stack.pop()
index += 1
if index < limit:
self.return_stack.push((index, limit))
self.active_word.skip(jump)
Let’s rename in star_loop:
def star_loop(self):
beginning_of_do_loop = self.next_word()
index, limit = self.return_stack.pop()
index += 1
if index < limit:
self.return_stack.push((index, limit))
self.active_word.skip(beginning_of_do_loop)
Not much improvement. Maybe a little. Commit it. But wait!
Remember R> and <R?
lex.append(PrimaryWord('>R', lambda f: f.return_stack.push(f.stack.pop())))
lex.append(PrimaryWord('R>', lambda f: f.stack.push(f.return_stack.pop())))
(I mistakenly named <R as >R but it’s fixed now.)
star_do and star_loop could be defined like this:
def i_word(self):
index = self.return_stack[-1]
self.stack.push(index)
def star_do(self):
start = self.stack.pop()
limit = self.stack.pop()
self.return_stack.push(limit)
self.return_stack.push(start)
def star_loop(self):
beginning_of_do_loop = self.next_word()
index = self.return_stack.pop()
limit = self.return_stack.pop()
index += 1
if index < limit:
self.return_stack.push(limit)
self.return_stack.push(index)
self.active_word.skip(beginning_of_do_loop)
We’ve moved from having a pair (index, limit) on the return stack to having two stack entries, index over limit.
(We have to keep i_word in sync with those.)
Now as those are coded, star_do at least could almost be coded using >R and part of star_loop could be using <R. Let’s code to_r and from_r as a temporary idea:
def to_r(self):
self.return_stack.push(self.stack.pop())
def from_r(self):
self.stack.push(self.return_stack.pop())
Now we can use them …
def star_do(self):
self.stack.swap()
self.to_r()
self.to_r()
Since that works, we can now see that *DO could be compiled like this:
: *DO SWAP >R >R ;
Shall we try it? We shall.
def define_skippers(self,lex):
lex.append(PrimaryWord('*#', lambda f: f.stack.push(f.next_word())))
lex.append(PrimaryWord('*IF', lambda f: f.zero_branch()))
lex.append(PrimaryWord('*ELSE', lambda f: f.active_word.skip(f.next_word())))
lex.append(PrimaryWord('*UNTIL', lambda f: f.zero_branch()))
lex.append(PrimaryWord('*LOOP', lambda f: f.star_loop()))
self.compile(': *DO SWAP >R >R ;')
Once I got the primary definitions in the right order, this works. In principle, we need to have this definition available before its actual use in ‘DO’, and it can’t compile until after SWAP and >R are defined. It can be defined after ‘DO’ is defined, so long as it is before ‘DO’ is actually compiled.
So this is actually, despite feeling kind of anticlimactic, a significant breakthrough. We have a key internal word, *DO, defined, not as a primary but instead with a compiled colon definition.
Surely this opens the door to doing more. Equally surely, I suspect we’ve opened the door to a need to keep the order of definition pretty clear in our mind(s), but the tests will fail when we get things wrong.
I’m not sure what’s next, but we have an answer to at least part of the question: yes, we can define key words using colon definitions. We have not yet marked one immediate, but I feel sure that that will be easy. ( could be wrong: I frequently am.)
- Note
- One way in which I think I am likely wrong is in the R words. I think that probably ‘>R’ should put things into R and ‘R>’’ should take them out. I’ll make that change right now. Fixed the code above.
- One More Thing
- I note that
i_word, which copies the word at the top of the return stack to the real stack, is the word known to some as ‘R@’. I’ll try to keep that in mind for later.
So, good news! See you next time!