Commit a5be1d0

author
Jaakko Heusala
committed
New README
0 parents  commit a5be1d0

1 file changed

+110
-0
lines changed

README.md

Lines changed: 110 additions & 0 deletions
@@ -0,0 +1,110 @@
Gendo is a local-first programming system that treats plain-language prompts,
readable code, binary output and tests as equally important, version-controlled
artefacts. Every piece of a program lives in ordinary files under Git, so
history, branching and review need no extra tooling. A unit of code is
identified by its pathname and base-name. Case is ignored and dots stand in for
directory slashes, so the file core/arithmetic/abs.gnd declares the operation
core.arithmetic.abs. It can be invoked simply as abs by another file in the
same folder, or as ../arithmetic/abs by a nearby module. A unit’s human-written
header ends with .llm and states purpose, dependencies and constraints in free
prose. From that header a developer or the tooling may generate a prompt file
ending in .gnd.llm; the local model expands that prompt into a readable
implementation saved in .gnd. When performance or distribution requires it, the
interpreter turns that implementation into a dense byte stream saved in .obj.
Tests live in .test files if they are executable assertions, or in .test.llm
files if they are questions for the model. The same base-name and any numeric
prefix tie all these files together. If several numbered fragments exist, the
build concatenates them in numeric order before parsing; unnumbered fragments
come last. Numeric prefixes and case are ignored during name matching, so
010-Abs.gnd and abs.gnd are the same operation.
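
The naming rules above can be sketched in Go, the language of the interpreter.
This is only an illustration under stated assumptions: the function name
OperationName, the suffix list and the "digits followed by a hyphen" prefix
pattern are inferred from the examples in this README, not taken from the
Gendo source.

```go
package main

import (
	"fmt"
	"path"
	"regexp"
	"strings"
)

// numericPrefix matches a leading fragment number such as "010-".
// Assumption: the prefix syntax is guessed from the example 010-Abs.gnd.
var numericPrefix = regexp.MustCompile(`^[0-9]+-`)

// knownSuffixes lists the companion-file endings named in the README,
// longest first so that ".gnd.llm" is tried before ".llm" or ".gnd".
var knownSuffixes = []string{".gnd.llm", ".test.llm", ".test", ".gnd", ".obj", ".llm"}

// OperationName (hypothetical helper) maps a slash-separated relative path
// to a dotted, lower-case operation name.
func OperationName(relPath string) string {
	dir, base := path.Split(relPath)
	base = strings.ToLower(base)
	for _, s := range knownSuffixes {
		if strings.HasSuffix(base, s) {
			base = strings.TrimSuffix(base, s)
			break
		}
	}
	// Drop the fragment number so 010-abs and abs compare equal.
	base = numericPrefix.ReplaceAllString(base, "")
	dir = strings.ToLower(strings.Trim(dir, "/"))
	if dir == "" {
		return base
	}
	// Dots stand in for directory slashes.
	return strings.ReplaceAll(dir, "/", ".") + "." + base
}

func main() {
	fmt.Println(OperationName("core/arithmetic/abs.gnd"))
	fmt.Println(OperationName("010-Abs.gnd") == OperationName("abs.gnd"))
}
```

Relative invocations such as ../arithmetic/abs are left out of the sketch; they
would need the caller's own folder as extra input.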

A .gnd file is a sequence of single-line instructions, each beginning with the
opcode mnemonic, followed by the destination slot and any number of input slots
or literals. No equals sign is needed. Slots are immutable identifiers: once
bound they never change. The special slot name underscore discards a result
unless a later instruction explicitly reads from it; this enables pipelining
without cluttering the namespace. Each opcode consumes all its inputs and
produces exactly one output, but that output can be any value, including
records and arrays, so multi-field results are carried in one object. Literals
may be decimal, hexadecimal, floating-point or quoted strings with C-style
escapes. There is no inline expression syntax: every transformation, no matter
how small, takes its own line. Control flow is data-driven. The select opcode
consumes a boolean, a then value and an else value and writes one of them to
its destination slot. The iterate opcode consumes a function token, an
accumulator, an iterable object and a step limit, and yields a final
accumulator. All side effects travel through an explicit world token. Every
IO-capable opcode must be declared in the header and whitelisted in the project
policy; it consumes the current world token as an input and produces a new
world token as its single output. Two special opcodes bridge to the model: llm
sends a prompt string and receives a response string; compile sends a prompt
and receives freshly generated code that must pass the same verifier as
hand-written .gnd. If no .gnd implementation exists for a mnemonic, the loader
looks for an executable with a .sh or .bin extension; it stringifies the
inputs, runs the program, captures stdout as the opcode’s output and turns a
non-zero exit status into an exception that carries stderr.
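
To make the instruction shape concrete, here is a minimal Go sketch of
"mnemonic, destination, inputs" parsing, single-assignment slots and the select
opcode. Everything in it is an assumption for illustration: the type and
function names are hypothetical, operands are treated as slot names only, and
quoted literals and comments are not handled. Letting underscore be rebound is
one possible reading of the discard rule, not a confirmed one.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Instruction is one parsed line: opcode mnemonic, destination slot, inputs.
type Instruction struct {
	Op   string
	Dest string
	Args []string
}

// ParseLine splits a line on whitespace. Quoted strings containing spaces
// and comments are deliberately out of scope for this sketch.
func ParseLine(line string) (Instruction, error) {
	f := strings.Fields(line)
	if len(f) < 2 {
		return Instruction{}, errors.New("need an opcode and a destination slot")
	}
	return Instruction{Op: f[0], Dest: f[1], Args: f[2:]}, nil
}

// Slots holds bound values and enforces that a slot is bound only once.
type Slots map[string]any

// Bind stores v under name. Rebinding is an error, except for "_",
// which this sketch treats as a reusable scratch slot (an assumption).
func (s Slots) Bind(name string, v any) error {
	if _, bound := s[name]; bound && name != "_" {
		return fmt.Errorf("slot %q is already bound", name)
	}
	s[name] = v
	return nil
}

// Exec runs one instruction. Only select is implemented here:
// select dest cond then else
func Exec(s Slots, in Instruction) error {
	if in.Op != "select" {
		return fmt.Errorf("unknown opcode %q", in.Op)
	}
	if len(in.Args) != 3 {
		return errors.New("select needs a boolean, a then value and an else value")
	}
	cond, ok := s[in.Args[0]].(bool)
	if !ok {
		return fmt.Errorf("slot %q is not a bound boolean", in.Args[0])
	}
	chosen := in.Args[2]
	if cond {
		chosen = in.Args[1]
	}
	v, ok := s[chosen]
	if !ok {
		return fmt.Errorf("slot %q is unbound", chosen)
	}
	return s.Bind(in.Dest, v)
}

func main() {
	slots := Slots{"neg": true, "x": -5, "y": 5}
	in, err := ParseLine("select out neg y x")
	if err != nil {
		panic(err)
	}
	if err := Exec(slots, in); err != nil {
		panic(err)
	}
	fmt.Println(slots["out"])
	// Running the same line again fails, because out is already bound.
	fmt.Println(Exec(slots, in))
}
```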

The interpreter is a small Go binary that loads .gnd or .obj, resolves
mnemonics to opcodes, applies a deterministic fuel charge to each step, and
runs with a single bounded heap, no reflection and no unsafe pointers. It is
the only piece of Go that remains once the language is self-hosted. All other
logic, including the compiler itself, moves into Gendo scripts. The compiler
pipeline performs six passes: gather, tokenise, parse, verify, resolve and
serialise. Gather walks the directory tree, concatenates numbered fragments
and groups companion files. Tokenise splits each line into identifiers,
literals and comments. Parse checks syntax and produces a list of statements,
one record per instruction. Verify checks naming rules, slot immutability and
whitelist conformance. Resolve maps each mnemonic to an integer opcode and
each slot name to an index. Serialise produces the final byte stream. All
passes are themselves expressed in .gnd instructions that call a fixed set of
primitive operations. Those primitives form the seed vocabulary the Go kernel
must provide.
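
As an illustration of one pass, the resolve step described above might look
roughly like this in Go. The record types, the opcode numbers and the choice to
number slots in order of first appearance are all assumptions made for the
sketch; literals are ignored, so every operand is treated as a slot name.

```go
package main

import "fmt"

// Statement is the parser's output: one record per instruction.
type Statement struct {
	Op   string
	Dest string
	Args []string
}

// Resolved is the same instruction with names replaced by integers.
type Resolved struct {
	Opcode int
	Dest   int
	Args   []int
}

// Resolve maps each mnemonic to an integer opcode and each slot name to an
// index. Indices are handed out in order of first appearance, which keeps
// the result deterministic for a given input.
func Resolve(stmts []Statement, opcodes map[string]int) ([]Resolved, error) {
	slotIndex := map[string]int{}
	index := func(name string) int {
		if i, ok := slotIndex[name]; ok {
			return i
		}
		i := len(slotIndex)
		slotIndex[name] = i
		return i
	}

	out := make([]Resolved, 0, len(stmts))
	for _, st := range stmts {
		code, ok := opcodes[st.Op]
		if !ok {
			return nil, fmt.Errorf("unknown mnemonic %q", st.Op)
		}
		r := Resolved{Opcode: code, Dest: index(st.Dest)}
		for _, a := range st.Args {
			r.Args = append(r.Args, index(a))
		}
		out = append(out, r)
	}
	return out, nil
}

func main() {
	// Hypothetical opcode numbers; the real table is not given in the README.
	opcodes := map[string]int{"identity": 1, "select": 2}
	stmts := []Statement{
		{Op: "identity", Dest: "a", Args: []string{"x"}},
		{Op: "select", Dest: "b", Args: []string{"c", "a", "x"}},
	}
	resolved, err := Resolve(stmts, opcodes)
	if err != nil {
		panic(err)
	}
	fmt.Println(resolved)
}
```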

The seed vocabulary contains twenty primitives. File-read takes a world token
and a path string and returns the file’s contents and a new world token.
File-list returns an array of names in a directory. Emit-file writes a byte
sequence to a path. String-split splits a string by a delimiter. String-match
applies a regular expression and returns match objects. Tokenise, together
with the higher-order combinators list-map, list-filter and list-fold, drives
the compiler passes. Dict-get and dict-set access records. Concat appends two
strings or two arrays. Format fills placeholders in a template string.
Parse-number converts a string to an integer or float. Serialise-obj packs
opcodes and operands into bytes. Iterate walks a list calling a function token.
Select chooses between two values. Identity copies its input. Make-error raises
an exception. Llm-call bridges to the local model. These primitives, plus a
thin error-handling shell, are the only Go functions the interpreter must know
at boot time. Each Go primitive is exposed to Gendo through a one-line wrapper
such as prim-file-read.gnd, preserving a uniform call style.
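
The world-token convention can be illustrated with a small Go sketch of a
file-read primitive. It is not the kernel's actual code: the World and
ReadResult types are hypothetical, an in-memory map stands in for the real
filesystem so the example stays deterministic, and the sequence counter is
just one way to make each new token distinguishable from the previous one.

```go
package main

import "fmt"

// World is an explicit token for "the state of the outside world".
// Seq increases with every side-effecting call. Files is an in-memory
// stand-in for the filesystem (an assumption made for this sketch).
type World struct {
	Seq   uint64
	Files map[string][]byte
}

// ReadResult carries both values in one object, in line with the rule
// that an opcode produces exactly one output.
type ReadResult struct {
	Contents []byte
	World    World
}

// FileRead consumes a world token and a path and returns the contents
// together with a new world token. The input token is not modified.
func FileRead(w World, path string) (ReadResult, error) {
	data, ok := w.Files[path]
	if !ok {
		return ReadResult{}, fmt.Errorf("file-read: %s: not found", path)
	}
	next := World{Seq: w.Seq + 1, Files: w.Files}
	return ReadResult{Contents: data, World: next}, nil
}

func main() {
	w0 := World{Files: map[string][]byte{"abs.gnd": []byte("identity out x")}}
	r, err := FileRead(w0, "abs.gnd")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(r.Contents), w0.Seq, r.World.Seq)
}
```

Threading the token through each call gives every IO operation a visible
position in the data flow, which is what lets the verifier check IO against the
whitelist.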

Bootstrapping proceeds in stages. Stage Zero delivers the interpreter and the
twenty Go primitives with their one-line wrappers. Stage One writes real
arithmetic, list and string utilities in .gnd by composing primitive wrappers;
each new module lives in its own folder with .llm, .gnd.llm, .gnd and .test
files. Stage Two implements the six compiler passes in .gnd using those
utilities. Stage Three runs the interpreter on the compiler source to produce
compiler.obj, resets the opcode map to use compiler.obj, recompiles the same
source tree and compares hashes; if they match, Gendo is self-hosting. Stage
Four rewrites as many primitives as feasible in Gendo, leaving only file and
directory IO, bit-level serialisation and the model bridge in Go. At that
point every future change (language evolution, new libraries, new compiler
passes) occurs within plain .gnd and .llm files under Git, with reproducible
builds, deterministic execution and an auditable prompt history.
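
The Stage Three check is a fixed-point test: the compiler built by the seed
compiler must reproduce itself byte for byte. A hedged Go sketch of that
comparison follows. The Compiler type, the load callback and the toy
upper-casing "compiler" are all hypothetical stand-ins, and SHA-256 is an
assumed choice, since the README does not name a hash function.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// Compiler turns a source tree (flattened to bytes here) into an object
// byte stream such as compiler.obj.
type Compiler func(source []byte) []byte

// ReachedFixedPoint compiles source with the seed compiler, loads the
// result as a new compiler, recompiles the same source and compares hashes.
func ReachedFixedPoint(source []byte, seed Compiler, load func(obj []byte) Compiler) bool {
	first := seed(source)          // interpreter runs the compiler source
	second := load(first)(source)  // compiler.obj recompiles the same tree
	return sha256.Sum256(first) == sha256.Sum256(second)
}

// ToyCompile is a deterministic placeholder for a real compiler.
func ToyCompile(source []byte) []byte {
	return bytes.ToUpper(source)
}

// LoadToy pretends to load an object file; it ignores obj and returns the
// same toy compiler, so the fixed point is reached trivially.
func LoadToy(obj []byte) Compiler {
	return ToyCompile
}

func main() {
	src := []byte("identity out x\n")
	fmt.Println(ReachedFixedPoint(src, ToyCompile, LoadToy))
}
```

Any nondeterminism in a pass, such as unordered directory walks or embedded
timestamps, would make the two hashes differ, so the check also exercises the
reproducible-build claim.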

Writing new code follows the same pattern developers will later use. They
create a header in .llm describing the operation, its dependencies and its
constraints. They write or auto-generate a .gnd.llm prompt that outlines the
pipeline. They run the current compiler, which expands the prompt into .gnd
code, emits documentation in .md and metadata in .json, and runs the tests. If
tests fail or the verifier rejects something, the compiler stops; the developer
edits the prompt or the generated code and tries again. Because every step is
plain text and every artefact is version-controlled, the bootstrapping trail
remains visible in the Git history: the system grows from the twenty primitive
wrappers to a readable, self-compiled corpus.

Once bootstrapped, Gendo can distribute itself as a single interpreter binary
plus a directory tree of .obj files, or as a slimmed-down Go binary that embeds
the interpreter and the compiled standard library. Either way the entire
toolchain runs offline with a two-gigabyte local model, remaining
deterministic, sandboxed and audit-friendly while still supporting optional
cloud models through declared dependencies and policy routing. With these
pieces in place, the language is ready for day-to-day AI-assisted development
on laptops, servers or embedded boards, entirely under the developer’s control.
