Now that we have a Hex monitor, we are now capable of either creating a text file (no ability to correct mistakes along the way) or any arbitrary hex program we want.
I don't know about you but I don't want to manually type in every hex program forever, so I am going to use the capability of the Hex monitor to write text files full of commented Hex code.
But to usefully build those code files, we need a hex assembler and here is how we make one.
There are two methods,
1) Cheat and simply ./bin/hex < stage1/stage1_assembler-0.hex0 > roms/stage1_assembler-0
From this point on I will assume you are going to take the easy route and simply load the source code using some mechanism you trust or are using SET to manually duplicate the entries of the code files used in each of the proceeding steps.
Now that it is easy(ish) to create text files and we have a really stupid hex assembler, we probably don't want to manually calculate offsets and jumps any more.
So we are going to limit ourselves to single character labels and pointers (:a and @a respectively) which should be enough to make the next step reasonable.
Now that we have labels and pointers, I want the ability to have labels like :main_function and :stack_start and be able to reference the absolute address of things in my code like $stack_start and complex objects that have 32bit pointers like &foo_bar.
Hopefully 64 char labels is enough (if not, simply add more NOPs at the end and fix the 2 table address references [Which will alter the checksum accordingly])
I don't know about you but at this point, I don't wanna convert another instruction into HEX by hand, so to save myself the pain we are going to write the most often ignored but important development in computer programming. The LINE MACRO ASSEMBLER.
At 1792 bytes large and 448 hand converted instructions, this program will allow us to write proper assembly code provided we prefix our assembly code with a definitions file (Which by the way is High_level_prototypes/defs for our VM)
Now with our new line macro assembler, which with our definition file High_level_prototypes/defs means we can write straight assembly from here on out.
And now you have the ability to simply feed tape_01 with tapes in any order you desire and the combination of all those tapes will be in tape_02.
This also of course could be used for tape duplication if you are noticing wear on any of your tapes.
* Step 6a Build us a Lisp
If you are anything like me, you'll probably spend some time trying to find the easiest to implement High level programming language for your step after assembly.
The One thing that you'll find is that actually, Lisp is probably the easiest langauge to implement after you have a working assembler.
Since forth fans kept telling me Forth is easier to implement than lisp, I also implemented a Forth but it certainly took far longer to get the bugs out.
You'll note that this forth is only 4372 bytes large (almost half the size of the lisp) but you'll also note that it is also in a much more primitive state.
Having done all of the above; I must say if I had to start with this forth or the hex monitor, I would always choose the forth but I wouldn't dare dream of trying to jump straight to forth from Hex.
If you want to bootstrap anything, NEVER EVER EVER START WITH FORTH. Get yourself a good assembler first and you might find writing a garbage collected lisp or a C compiler is actually much easier.
Since I got tired of people saying you can't write a C compiler in assembly, I did exactly that and as you can see it was completed relatively quickly.
You'll note that the C compiler is 16370 bytes large, making it larger than the Lisp and FORTH put together but not much harder to implement than the lisp.
Having done all of the above; I must say one needs a really solid assembler and a minimal disassembler to effectively bootstrap a C compiler (alteast until you get the point that the tokenizer is working).
roms/cc_x86 should with the sha256sum of 12bb96de936fff18b27c2382ddcee2db6afb6a94b9f4c6e9e9b3d1d0d0d3b0ed
Our C compiler will read any code in tape_01 and output the compiled output to tape_02.
The compiled output is macro assembly (allowing for easy inspection) which then must go through the appropriate macro assembler and hex2 steps to become a working binary.
Should you wish to port the C compiler to a different target, one need only update the strings and a few minor places in the code; looking at M2-Planet's multi-arch support will cover all of those questions quickly.