A set of minimal dependency bootstrap binaries

-*- mode: org; -*-
* Copyright header
## Copyright (C) 2016 Jeremiah Orians
## This file is part of stage0.
## stage0 is free software: you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation, either version 3 of the License, or
## (at your option) any later version.
## stage0 is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
## You should have received a copy of the GNU General Public License
## along with stage0. If not, see <http://www.gnu.org/licenses/>.

The master repository for this work is located at:

* If you wish to contribute:
pull requests can be made at https://github.com/oriansj/stage0
and https://gitlab.com/janneke/stage0
or patches/diffs can be sent via email to Jeremiah (at) pdp10 [dot] guru
or join us on freenode's #bootstrappable
or update the wiki at https://bootstrapping.miraheze.org/wiki/Stage0

* Goal
This is a set of manually created hex programs, written in a Cthulhu Path to
madness fashion, whose only goal is to create a bootstrapping path to a C compiler
capable of compiling GCC, with the only explicit requirement being a single binary
of 1 KByte or less.

Additionally, all code must be understandable by 70% of the population of programmers.
If the code cannot be understood by that share of programmers, it needs to be altered until it satisfies this requirement.

* Also found within
This repo contains a few of my false start pieces that may be of interest to people who
want to independently create the root binary. I welcome all bug fixes and code that aids
in the above stated goal.

I'll be adding more code and documentation as I build pieces.
ALL code in this REPO is under the GPLv3 or Later.

In order to build stage0 and all the pieces, one only needs to run make all.
Each individual piece can be built by simply running make $piece with $piece being replaced by the actual part you want to make.

The only pieces that have any external dependencies are the Web IDE (Python3+CherryPy), libvm (GCC) and vm (GCC+GNU getopt).

* Future
** Software
Add more ports to more hardware platforms.

** Hardware
Implement the Knight processor in FPGA and then convert into TTL.

* Need to know information
** stage0
The stage0 is the lowest level of bootstrap and is useful for systems without
firmware, operating systems or any other provided software functionality.
Those with such capabilities can skip this stage as it requires human input.

*** Hex0_monitor
The Hex0_monitor provides dual functionality:
1) It assembles hex0 programs that are manually typed in.
2) It writes out the characters typed in, providing minimal text input functionality.

The first is essential for creating the root binaries.
The second is essential for creating source files before you have an editor.
The distinction is important because only the Hex0 assembler in stage1 is built
by the Hex0_monitor and from that point onwards it is used as a minimal text
editor until a more advanced text editor can be bootstrapped.

** stage1
The stage1 is dependent on the availability of text source files and at least a
hex0 monitor or assembler. The steps in this stage can be fully automated should
one trust their automation or performed manually on any hardware they trust.

Regardless of which method is selected, the resulting binaries MUST be identical.
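
One way to check that requirement is to compare cryptographic hashes of the
binaries produced by each method. The following is only a sketch: the function
names and the choice of SHA-256 are assumptions made for illustration and are
not part of stage0.

```python
import hashlib


def sha256_of(path):
    """Return the SHA-256 hex digest of the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large binaries need not fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def builds_match(manual_path, automated_path):
    """True when both build methods produced byte-identical output."""
    return sha256_of(manual_path) == sha256_of(automated_path)
```

Any mismatch means one of the two build paths cannot be trusted until the
difference is explained.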

*** Hex0
The Hex0 assembler or stage1_assembler-0 is the head node of the stage1 bootstrap.
Its functionality is reduced compared to the stage0 monitor simply because it
only performs half of the required functions; that of generating binaries from
hex0 source files.

Its most notable features are ; line comments and # line comments, as careful
notes are essential for this stage.
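
The behaviour described above can be sketched in a few lines of Python. This is
an illustration only, not the stage0 implementation (which is written in hex);
the function name is hypothetical, and the sketch assumes that any character
that is neither a hex digit nor part of a comment is silently ignored.

```python
def hex0_assemble(source):
    """Convert hex0 text to bytes, skipping ';' and '#' line comments."""
    out = bytearray()
    high = None          # pending high nibble, if one has been read
    in_comment = False
    for ch in source:
        if in_comment:
            # A line comment runs until the end of the line.
            if ch == "\n":
                in_comment = False
            continue
        if ch in ";#":
            in_comment = True
            continue
        if ch in "0123456789abcdefABCDEF":
            nibble = int(ch, 16)
            if high is None:
                high = nibble
            else:
                out.append((high << 4) | nibble)
                high = None
        # Assumption: whitespace and all other characters are ignored.
    return bytes(out)
```

For example, the input "48 65 ; He" followed by "6C 6C 6F # llo" on the next
line assembles to the five bytes of the ASCII text Hello.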

*** Hex1
The Hex1 assembler or stage1_assembler-1 is the next logical extension of the
Hex0 assembler, adding single character labels and relative displacements using
a prefix. In this case labels start with : thus the label a must be written :a
and the prefix for relative offsets is @ thus the pointer must be written @a.
Further, because of the mescc-tools standardization of syntax, @label indicates
a 16bit relative displacement.

Ports to alternative architectures need not limit themselves to 16bit
displacements, but they must provide at least 1 size of displacement.
Alternatively, they may skip Hex1 and write their Hex2 assembler in Hex0, but as
it is a much larger program, I recommend against it.
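
A two-pass sketch of the label handling described above follows. It is not the
stage0 code: the function names are hypothetical, and it assumes that the
displacement is measured from the address immediately after the 2-byte pointer
and is emitted big-endian. Both choices are architecture specific and are made
here purely for illustration.

```python
import struct

HEX_DIGITS = "0123456789abcdefABCDEF"


def _tokens(source):
    """Yield ("hex", nibble), ("label", name) or ("rel", name) tuples."""
    i = 0
    while i < len(source):
        ch = source[i]
        if ch in ";#":
            # Line comment: skip to the end of the line.
            while i < len(source) and source[i] != "\n":
                i += 1
        elif ch in ":@":
            # A sigil is followed by a single-character label name.
            if i + 1 >= len(source):
                raise ValueError("sigil without a label name")
            yield ("label" if ch == ":" else "rel", source[i + 1])
            i += 2
        else:
            if ch in HEX_DIGITS:
                yield ("hex", int(ch, 16))
            i += 1


def hex1_assemble(source):
    # Pass 1: record the address of every :label.
    labels, ip, nibbles = {}, 0, 0
    for kind, value in _tokens(source):
        if kind == "hex":
            nibbles += 1
            ip += nibbles // 2      # advance once per completed byte
            nibbles %= 2
        elif kind == "label":
            labels[value] = ip
        else:
            ip += 2                 # a @pointer occupies two bytes

    # Pass 2: emit bytes, resolving @label to a signed 16-bit displacement.
    out, high = bytearray(), None
    for kind, value in _tokens(source):
        if kind == "hex":
            if high is None:
                high = value
            else:
                out.append((high << 4) | value)
                high = None
        elif kind == "rel":
            base = len(out) + 2     # assumed: address just after the pointer
            out += struct.pack(">h", labels[value] - base)
    return bytes(out)
```

Two passes are needed because a pointer such as @b may refer to a label that is
only defined later in the source.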

*** Hex2
The Hex2 assembler or stage1_assembler-2 or hex2_linker is as complex as a hex
language can be while remaining both meaningful and worth the effort.

Hex2's important advances over Hex1 are as follows:
Support for long labels (Minimal 42 chars, ideally unlimited)
Support for Absolute addressing ($label for 16bit absolute addresses)
Support for Alternative pointer sizes (%label for 32bit relative and &label for
32bit absolute addresses)

Optionally, support for !label (8bit relative addressing), ?label
(Architecture specific size/properties) and/or @label1>label2 and %label1>label2
displacements may be implemented, should the specific architecture require it
for human readable hex2 source files (such as ELF headers).
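
The four required pointer forms can be sketched as a small table-driven linker.
This is a hedged illustration rather than the hex2_linker itself: the names
POINTERS, tokenize, hex2_link and base_address are hypothetical, tokens are
assumed to be separated by whitespace, hex is assumed to arrive as whole bytes,
and pointers are assumed to be big-endian with relative displacements measured
from the address just after the pointer. The optional !, ? and > forms are
left out.

```python
import struct

# sigil -> (size in bytes, struct format, is_relative)
POINTERS = {
    "@": (2, ">h", True),    # 16bit relative
    "$": (2, ">H", False),   # 16bit absolute
    "%": (4, ">i", True),    # 32bit relative
    "&": (4, ">I", False),   # 32bit absolute
}


def tokenize(source):
    """Drop ';' and '#' line comments and split on whitespace."""
    tokens = []
    for line in source.splitlines():
        for mark in ";#":
            line = line.split(mark, 1)[0]
        tokens.extend(line.split())
    return tokens


def hex2_link(source, base_address=0):
    tokens = tokenize(source)

    # Pass 1: assign an address to every :long_label.
    labels, ip = {}, base_address
    for tok in tokens:
        if tok[0] == ":":
            labels[tok[1:]] = ip
        elif tok[0] in POINTERS:
            ip += POINTERS[tok[0]][0]
        else:
            ip += len(bytes.fromhex(tok))

    # Pass 2: emit raw bytes and resolved pointers.
    out, ip = bytearray(), base_address
    for tok in tokens:
        if tok[0] == ":":
            continue
        if tok[0] in POINTERS:
            size, fmt, relative = POINTERS[tok[0]]
            ip += size
            target = labels[tok[1:]]
            out += struct.pack(fmt, target - ip if relative else target)
        else:
            data = bytes.fromhex(tok)
            out += data
            ip += len(data)
    return bytes(out)
```

Absolute pointers depend on where the program is loaded, which is why the
sketch takes a base address while Hex1-style relative pointers do not need one.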

*** M0
M0 or M0-macro or M1-macro is the minimal string replacement program with string
processing functionality required to convert an Assembly-like syntax into Hex2
programs that can be compiled. Its rules are merely an extension of Hex2 with
the goal of reducing the amount of hex that one would need to write.

The 3 essential pieces are:
1) DEFINE STRING1 HEX_CHARACTERS (no extra whitespace, \t or \n inside).
2) "Raw strings" allow every character except " as there is no support for
string escapes (including NULL). They are converted to Hex chars for Hex2, which
converts them back to the chars inside of the "quotes" with the addition of a
trailing NULL character or the number of NULLs desired (must be at least 1, no
upper bound); restrictions such as padding to word boundaries are acceptable.
3) 'Raw char strings' pass through anything inside of them (except ' which
terminates the string).

Thus by combining :label, @label, DEFINE SYSCALL 0F05, Raw strings and chars;
one has created a rather flexible and powerful Assembler capable of building
far more ambitious pieces in "Macro Assembly".
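
The three rules can be sketched as a small string-replacement pass. This is an
illustration only and not the M0 source: the names TOKEN and m0_expand are
hypothetical, exactly one trailing NULL is appended to each raw string, no
padding to word boundaries is done, and raw strings are assumed to be ASCII.

```python
import re

# A token is a "raw string", a 'raw char string', a line comment,
# or a run of non-whitespace characters.
TOKEN = re.compile(r'"[^"]*"|\'[^\']*\'|[;#][^\n]*|\S+')


def m0_expand(source):
    """Expand DEFINEs and raw strings into Hex2-style text."""
    tokens = [t for t in TOKEN.findall(source) if t[0] not in ";#"]

    # Collect every DEFINE NAME HEX_CHARACTERS triple.
    defines, body, i = {}, [], 0
    while i < len(tokens):
        if tokens[i] == "DEFINE":
            defines[tokens[i + 1]] = tokens[i + 2]
            i += 3
        else:
            body.append(tokens[i])
            i += 1

    out = []
    for tok in body:
        if tok[0] == '"':
            # Raw string: hex-encode the characters plus one trailing NULL.
            out.append((tok[1:-1].encode("ascii") + b"\x00").hex().upper())
        elif tok[0] == "'":
            # Raw char string: pass the contents through untouched.
            out.append(tok[1:-1])
        else:
            # Defined names are replaced; labels, pointers and hex pass through.
            out.append(defines.get(tok, tok))
    return " ".join(out)
```

With DEFINE SYSCALL 0F05 in effect, the line :main SYSCALL "Hi" @main expands
to :main 0F05 486900 @main, which Hex2 can then process.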

** stage2
The stage2 is dependent on the availability of text source files and at least a
functional macro assembler and can be used to build operating systems or other
"Bootstrap" functionality that might be required to enable functional binaries;
such as programs that set execute bits or generate dwarf stubs.

*** FORTH
Because a great many people stated FORTH would be an ideal bootstrapping language,
the time and effort was put forth by Caleb and Jeremiah to provide a framework
for those people to contribute immediately; thus the FORTH was born.

Several efforts were taken to make the FORTH more standard but ultimately it was
determined, Assembly was preferable as the underlying architecture wasn't total

It now sits waiting for any FORTH programmer who wishes to prove FORTH is a real
bootstrapping language.

*** Lisp
The next recommendation in bootstrapping was Lisp, so efforts were taken to
design the most minimal Lisp with all of the functionality described in the
original Lisp papers. The task was completed relatively quickly compared to the
FORTH and even had enhancements such as a compacting garbage collector.

Ultimately it was found that the Lisp many rave about isn't entirely compatible
with modern Lisps or Schemes; thus it was shelved for any Lisper who wishes to
pick it up.

*** C
After being told for months that there is no way to write a proper C compiler in
assembly, and after months of research turned up no human written C compilers in
assembly, Jeremiah decided to prove the point: the first C compiler on the
bootstrap would actually be a cross-compiler for x86, such that everyone would
be able to verify it did exactly what it was supposed to and see it self-host
its C version.

** stage3
The stage3 is dependent on the availability of text source files and at least a
functional M2-Planet level C compiler, FORTH and a Minimal Garbage collecting
Lisp and can be used to build more advanced tools that can be used in
bootstrapping whole operating systems with modern tool stacks.

*** initial_library
A library collection of very useful FORTH functionality designed to make the
lives of any FORTH programmer easier.

It now sits waiting for any FORTH programmer who wishes to build upon it.

*** ascension
A library collection of useful Lisp functionality designed to make the lives
of any Lisp programmer easier.

As it depends on an archaic Lisp dialect, it will likely need to be replaced
should the Lisp be properly fixed.

*** blood-elf_x86
The x86 program for a dwarf stub generator used in mescc-tools bootstrapping,
specifically in mescc-tools-seed generation, which can be used to build
M2-Planet and thus complete the circle.

*** get_machine_x86
The trivial x86 program that allows one to skip tests or scripts that will not
run on that specific platform or run alternative commands depending upon the

*** hex2_linker_x86
The program that allows one to build the hex2 programs for any hardware platform
on x86 and thus verify software builds for hardware one does not even have.

*** M1-macro_x86
The program that allows one to build the M1 program for any hardware platform
on x86 and thus verify software builds for hardware one does not even have.

*** M2-Planet_x86
The x86 port of the M2-Planet C compiler v1.0 used as one of the paths in
bootstrapping M2-Planet on x86 hardware.

* Inspirations
This work wouldn't have come so far without the inspirational work of others.
They are listed in alphabetical order of the authors' last names.

GRIMLEY EVANS, Edmund - bcompiler [http://homepage.ntlworld.com/edmund.grimley-evans/bcompiler.html] :: The inspiration for hex0, hex1 and hex2
GRIMLEY EVANS, Edmund - cc500 [http://homepage.ntlworld.com/edmund.grimley-evans/cc500] :: The inspiration for M2-Planet
Jones, Richard W.M. - jonesforth [http://git.annexia.org/?p=jonesforth.git] :: The inspiration for stage2 FORTH and initial_library
Piner, Steve and Deutsch, L. Peter - Expensive Typewriter [http://archive.computerhistory.org/resources/text/DEC/pdp-1/DEC.pdp_1.1972.102650079.pdf] :: The inspiration for SET
kragensitaker - The Monitor [https://old.reddit.com/r/programming/comments/9x15g/programming_thought_experiment_stuck_in_a_room/c0ewj2c/] :: The inspiration for the hex0-monitor