An evolutionary framework for AI agent self-improvement.
Treat your AI agent's configuration as a living organism, with a genome that evolves through Lamarckian inheritance, horizontal gene transfer, and human selection pressure.
"One day, frontier AI research used to be done by meat computers..." β @karpathy, March 2026
AI agents that run persistently (on platforms like OpenClaw, AutoGPT, CrewAI) accumulate learned behaviors over time:
failure → rule → habit → identity
A failure gets logged. The log becomes a rule. The rule shapes future behavior. Every session inherits these acquired traits. This is Lamarckian evolution, and it can move faster than Darwinian evolution because each failure feeds directly into the next generation instead of waiting for random variation and selection.
GENOME.md is the mechanism that makes this inheritance explicit, trackable, and evolvable.
graph TD
A[Agent runs tasks] --> B{Success?}
B -->|Yes| C[Reinforce behavior]
B -->|No| D[Log failure to feedback.md]
D --> E[Extract pattern]
E --> F[Propose mutation to GENOME.md]
F --> G{Human review}
G -->|Accept| H[Mutation applied]
G -->|Reject| I[Discarded]
H --> J[Measure fitness metrics]
J --> K{Improved?}
K -->|Yes| L[Keep mutation ✅]
K -->|No| M[Auto-revert ↩️]
L --> A
M --> A
C --> A
style A fill:#2d3748,stroke:#4fd1c5,color:#fff
style H fill:#2d3748,stroke:#48bb78,color:#fff
style M fill:#2d3748,stroke:#fc8181,color:#fff
style G fill:#2d3748,stroke:#f6e05e,color:#fff
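
The loop in the diagram can be sketched in a few lines of Python. This is a minimal illustration, not part of the framework: `Genome`, `run_mutation_cycle`, and the `review` and `measure_fitness` callbacks are hypothetical names, and "improved" is assumed to mean a strictly higher fitness score after the mutation is applied.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Genome:
    """In-memory stand-in for GENOME.md: an ordered list of rules (assumed shape)."""
    rules: list[str] = field(default_factory=list)


def run_mutation_cycle(
    genome: Genome,
    proposed_rule: str,
    review: Callable[[str], bool],
    measure_fitness: Callable[[Genome], float],
) -> str:
    """Propose -> review -> apply -> measure -> keep or revert."""
    if not review(proposed_rule):        # human review gate
        return "discarded"
    baseline = measure_fitness(genome)   # fitness before the mutation
    genome.rules.append(proposed_rule)   # apply the mutation
    if measure_fitness(genome) > baseline:
        return "kept"
    genome.rules.pop()                   # auto-revert: no measurable improvement
    return "reverted"


# Toy usage: fitness rises only when the new rule is present.
genome = Genome(rules=["Retry failed API calls once"])
fitness = lambda g: 0.9 if "Validate JSON before sending" in g.rules else 0.7
print(run_mutation_cycle(genome, "Validate JSON before sending", lambda r: True, fitness))  # kept
```

In a real agent, `review` would block on a human decision and `measure_fitness` would be computed over many tasks rather than a single call.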
We classify AI agents using biological taxonomy, where each level captures a distinct evolutionary characteristic:
graph LR
D[Domain<br/>Autonomy] --> K[Kingdom<br/>Architecture]
K --> P[Phylum<br/>Memory]
P --> C[Class<br/>Evolution]
C --> O[Order<br/>Mutation Rate]
O --> F[Family<br/>Selection]
F --> G[Genus<br/>Specialization]
G --> S[Species<br/>Instance]
style D fill:#1a365d,stroke:#63b3ed,color:#fff
style K fill:#1a365d,stroke:#63b3ed,color:#fff
style P fill:#1a365d,stroke:#63b3ed,color:#fff
style C fill:#1a365d,stroke:#63b3ed,color:#fff
style O fill:#1a365d,stroke:#63b3ed,color:#fff
style F fill:#1a365d,stroke:#63b3ed,color:#fff
style G fill:#1a365d,stroke:#63b3ed,color:#fff
style S fill:#1a365d,stroke:#63b3ed,color:#fff
| Domain | Description | Example |
|---|---|---|
| Automatia | Fixed behavior, no learning | Bash scripts, cron jobs |
| Adaptia | Learns within session, no persistence | ChatGPT conversations |
| Evolventia | Persistent memory + self-modification | OpenClaw, autoresearch |
| Kingdom | Description | Example |
|---|---|---|
| Monagentia | Single agent | Solo coding assistant |
| Polyagentia | Multi-agent with specialization | Coordinator + specialist team |
| Swarmia | Emergent behavior from simple agents | Ant colony task swarms |
| Phylum | Description | Example |
|---|---|---|
| Amnesia | No persistent memory | Stateless API calls |
| Episodia | Event log memory | Session transcripts |
| Hierarchia | Tiered memory with compression | Facts/episodic/working tiers |
| Genetica | Memory encoded in instructions | GENOME.md: behaviors become rules |
| Class | Description | Example |
|---|---|---|
| Darwinia | Random mutation + automated selection | DGM, autoresearch |
| Lamarckia | Acquired traits directly inherited | Failure logs → operational rules |
| Lysenkoism | Human-directed evolution | Human reviews mutation proposals |
| Symbiotica | Evolution via acquired capabilities | Installing shared skills/plugins |
| Order | Cycle | Risk |
|---|---|---|
| Tachymutas | Minutes | Low (training loss) |
| Mesomutas | Daily | Medium (config tweaks) |
| Bradymutas | Weekly | Higher (rule changes) |
| Glaciomutas | Monthly+ | Highest (identity changes) |
| Family | Selector | Signal |
|---|---|---|
| Autoselectae | Automated metric | Success rate ↑, cost ↓ |
| Homoselectae | Human review | Accept / reject |
| Hybridselectae | Auto for safe, human for risky | Best of both |
| Genus | Role |
|---|---|
| Investigator | Research, information gathering |
| Fabricator | Code, building |
| Narrator | Content, writing |
| Custos | Security, auditing |
| Strategus | Business, planning |
| Coordinator | Orchestration |
Evolventia.Polyagentia.Hierarchia.Lamarckia.Bradymutas.Hybridselectae.Coordinator.my-agent-v1
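
A full classification string like the one above is easy to handle programmatically. The sketch below is an assumption about how you might parse it, not something the framework prescribes: `Classification` and `parse_classification` are hypothetical names, and the parser expects all eight ranks in order, so shortened four-rank forms are rejected.

```python
from dataclasses import dataclass

# The eight ranks, in the order they appear in a dotted classification string.
RANKS = ("domain", "kingdom", "phylum", "class_", "order", "family", "genus", "species")


@dataclass(frozen=True)
class Classification:
    domain: str
    kingdom: str
    phylum: str
    class_: str   # trailing underscore because "class" is a Python keyword
    order: str
    family: str
    genus: str
    species: str


def parse_classification(text: str) -> Classification:
    """Split a dotted string into its eight ranks.

    maxsplit keeps any dots inside the species name (e.g. a version suffix) intact.
    """
    parts = text.split(".", maxsplit=len(RANKS) - 1)
    if len(parts) != len(RANKS):
        raise ValueError(f"expected {len(RANKS)} ranks, got {len(parts)}")
    return Classification(*parts)


agent = parse_classification(
    "Evolventia.Polyagentia.Hierarchia.Lamarckia.Bradymutas.Hybridselectae.Coordinator.my-agent-v1"
)
print(agent.class_, agent.species)  # Lamarckia my-agent-v1
```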
An agent's genome is the complete set of heritable configuration:
graph TB
subgraph GENOME["🧬 GENOME.md"]
ID[Identity<br/><i>SOUL.md</i>]
CONST[Constitution<br/><i>AGENTS.md</i>]
IMMUNE[Immune System<br/><i>Safety rules</i>]
LEARNED[Learned Behaviors<br/><i>feedback.md</i>]
SKILLS[Skills / Genes<br/><i>skills/*.md</i>]
MEMORY[Memory Architecture<br/><i>Tiered storage</i>]
META[Metabolism<br/><i>Cron jobs</i>]
end
subgraph FROZEN["Essential Genes (frozen)"]
ID
IMMUNE
end
subgraph MUTABLE["Evolvable"]
CONST
LEARNED
SKILLS
MEMORY
META
end
style GENOME fill:#1a202c,stroke:#4fd1c5,color:#fff
style FROZEN fill:#2d1b1b,stroke:#fc8181,color:#fff
style MUTABLE fill:#1b2d1b,stroke:#48bb78,color:#fff
| Gene | File(s) | Mutability | Function |
|---|---|---|---|
| Identity | SOUL.md | Frozen | Who the agent is |
| Constitution | AGENTS.md | Weekly | Operational rules |
| Immune System | Safety rules | Frozen | Self-protection |
| Learned Behaviors | feedback.md | Continuous | Acquired mutations |
| Skills | skills/*.md | | Portable capabilities |
| Memory | Tiered storage | Evolving | Information architecture |
| Metabolism | Cron jobs | Fast | Automated behaviors |
| Nervous System | DAG pipelines | Structural | Coordination |
| Sensory Organs | Channels | Fixed | I/O interfaces |
| Microbiome | Sub-agents | Symbiotic | Specialist organisms |
| Epigenetics | Context holds | Temporary | Expression modifiers |
Every agent gets a proper biological name: Genus epithet
graph LR
subgraph "How Names Work"
R[Agent Role] -->|"Latinize"| G["<b>Genus</b><br/>Coordinatrix<br/>Fabricor<br/>Sentinax"]
T[Agent Traits] -->|"Weight & pick"| A["<b>Auto Epithet</b><br/>memorialis<br/>velocis<br/>stabilis"]
H[Human Choice] -->|"Override"| C["<b>Custom Epithet</b><br/>kei<br/>noctis<br/>prime"]
G --> B["🧬 <b>Binomial</b>"]
A --> B
C -.->|"optional"| B
T -->|"Morpheme mix"| P["<b>Common Name</b><br/>(Pokémon-style)"]
end
style G fill:#1a365d,stroke:#63b3ed,color:#fff
style A fill:#2a2d5e,stroke:#a78bfa,color:#fff
style C fill:#2d1b4e,stroke:#d6bcfa,color:#fff
style B fill:#1b2d1b,stroke:#48bb78,color:#fff
style P fill:#2d2d1b,stroke:#f6e05e,color:#fff
| Component | Source | Example |
|---|---|---|
| Genus | Latinized from role (deterministic) | Coordinatrix, Fabricor, Sentinax |
| Epithet (auto) | Weighted Latin adjective from traits | memorialis (Lamarckian), velocis (fast mutation) |
| Epithet (custom) | Human-chosen: any word or name | kei, noctis, prime, rex |
| Common name | Pokémon-style morpheme compound | Archevonexus, Evoshieldur, Wrenchrandal |
| Agent | Binomial | Common Name | Rarity |
|---|---|---|---|
| Multi-agent coordinator | Orchestrus kei | Archevonexus | 🟡 Legendary |
| Solo coding assistant | Architectus moderatus | Archsmithstn | ⚪ Common |
| Swarm code builder | Faber transiens | Wrenchrandal | 🟢 Uncommon |
| ChatGPT | Omnifex fidelis | Neoflexguidn | ⚪ Common |
| Security agent | Sentinax noctis | Evoshieldur | 🔵 Rare |
The epithet can be auto-generated from traits (a Latin descriptor) or chosen by the human, much as biologists name species after people or places. Orchestrus kei is named by its operator. Sentinax noctis ("the night sentinel") was chosen for its aesthetic.
→ Full naming guide: etymology tables, weight distribution, morpheme banks, naming conventions
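
To make the naming rules concrete, here is a deliberately tiny sketch. It is not the logic of `scripts/species_namer.py`. The genus and epithet strings come from the tables above, but the role and trait keys, the `binomial` function, and the "highest weight wins" rule are all assumptions made for illustration.

```python
from typing import Optional

# Genus names are taken from the examples above; the role keys are assumed.
GENUS_BY_ROLE = {
    "coordinator": "Coordinatrix",
    "fabricator": "Fabricor",
    "security": "Sentinax",
}

# Auto epithets from the examples above; the trait keys are assumed.
EPITHET_BY_TRAIT = {
    "lamarckian": "memorialis",
    "fast-mutation": "velocis",
}


def binomial(
    role: str,
    trait_weights: dict[str, float],
    custom_epithet: Optional[str] = None,
) -> str:
    """Build "Genus epithet"; a human-chosen epithet overrides the automatic one."""
    genus = GENUS_BY_ROLE[role]  # deterministic lookup from role
    if custom_epithet:
        return f"{genus} {custom_epithet}"
    known = {t: w for t, w in trait_weights.items() if t in EPITHET_BY_TRAIT}
    if not known:
        raise ValueError("no known traits to derive an epithet from")
    # Assumption: the heaviest trait wins. The real weighting may differ.
    top_trait = max(known, key=known.get)
    return f"{genus} {EPITHET_BY_TRAIT[top_trait]}"


print(binomial("security", {}, custom_epithet="noctis"))                      # Sentinax noctis
print(binomial("fabricator", {"lamarckian": 0.2, "fast-mutation": 0.8}))      # Fabricor velocis
```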
The most powerful analogy: SKILL.md files are portable genes.
graph LR
subgraph GENE_BANK["Gene Bank (clawhub.com)"]
S1[skill-a.md]
S2[skill-b.md]
S3[skill-c.md]
end
subgraph AGENT["Agent Taxonomy"]
G1[skill-a.md ✅ Active]
G2[skill-d.md ✅ Active]
G3[skill-e.md Pseudogene]
end
S2 -->|"Horizontal<br/>Gene Transfer"| AGENT
G3 -.->|"Never triggers<br/>(silenced)"| X[No phenotype]
G1 -->|"Expression"| P1[Behavior A]
G2 -->|"Expression"| P2[Behavior D]
style GENE_BANK fill:#2a2d5e,stroke:#a78bfa,color:#fff
style AGENT fill:#1a365d,stroke:#63b3ed,color:#fff
| Biology | AI Agent |
|---|---|
| Gene | SKILL.md file |
| Phenotype | Behavior the skill produces |
| Gene expression | Skill router matches and loads |
| Pseudogene | Installed skill that never triggers |
| Horizontal gene transfer | Installing a skill from a shared repository |
| Point mutation | Changing a skill's trigger description |
| Gene bank | Shared skill marketplace |
| Provenance | Skill source: official vs community vs self-created |
| Essential gene | Safety/identity skill (frozen, never mutated) |
| Regulatory gene | Skill that controls when other skills activate |
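
Pseudogenes are straightforward to detect if your agent records which skills actually fire. The sketch below assumes you can export a flat list of triggered skill names; `find_pseudogenes`, `skill_utilization`, and that log format are hypothetical and not part of any particular platform.

```python
from collections import Counter


def find_pseudogenes(installed: list[str], trigger_log: list[str]) -> list[str]:
    """Return installed skills that never appear in the trigger log."""
    counts = Counter(trigger_log)
    return [skill for skill in installed if counts[skill] == 0]


def skill_utilization(installed: list[str], trigger_log: list[str]) -> float:
    """Fraction of installed skills that triggered at least once."""
    if not installed:
        return 0.0
    triggered = set(trigger_log) & set(installed)
    return len(triggered) / len(installed)


installed = ["skill-a.md", "skill-d.md", "skill-e.md"]
log = ["skill-a.md", "skill-d.md", "skill-a.md"]
print(find_pseudogenes(installed, log))  # ['skill-e.md']
```

A skill that shows up here is a candidate for removal or for a point mutation to its trigger description.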
In biology: DNA → RNA → Protein. In agent evolution:
graph LR
G[GENOME.md<br/><i>DNA</i>] -->|"Transcription<br/>(prompt assembly)"| P[System Prompt<br/><i>RNA</i>]
P -->|"Translation<br/>(model inference)"| B[Agent Behavior<br/><i>Protein</i>]
B -->|"Lamarckian<br/>feedback loop"| G
style G fill:#2d3748,stroke:#f6e05e,color:#fff
style P fill:#2d3748,stroke:#4fd1c5,color:#fff
style B fill:#2d3748,stroke:#48bb78,color:#fff
Unlike biology, the feedback loop is closed: behavior directly modifies the genome. This is why agent evolution is Lamarckian, not Darwinian.
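
The "transcription" step is just prompt assembly. As a rough sketch, assuming the genome files have already been read into a dict of file name to text (`transcribe` and the `## file name` section headers are illustrative choices, not a required format):

```python
def transcribe(genes: dict[str, str], order: list[str]) -> str:
    """Assemble a system prompt ("RNA") from genome files ("DNA").

    Files listed in `order` but missing from `genes` are skipped,
    like genes that are absent from the genome.
    """
    sections = [
        f"## {name}\n{genes[name].strip()}"
        for name in order
        if name in genes
    ]
    return "\n\n".join(sections)


genes = {
    "SOUL.md": "Be helpful.\n",
    "AGENTS.md": "Log every failure.",
}
prompt = transcribe(genes, ["SOUL.md", "AGENTS.md", "feedback.md"])
print(prompt)
```

"Translation" is then whatever your model does with that prompt, and the Lamarckian step is writing what you learn back into the files.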
How do you know if your agent is getting better?
| Metric | Good Direction | Measures |
|---|---|---|
| Task success rate | ↑ | Are tasks completing? |
| Failure velocity | ↓ | Are new failures decreasing? |
| Token efficiency | ↑ or stable | Is cost per task improving? |
| Rule accumulation rate | ↓ over time | Is the agent stabilizing? |
| Skill utilization | ↑ | Are installed skills actually triggering? |
| Revert rate | ↓ | Are accepted mutations sticking? |
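
A few of these metrics can be computed from a simple task log. The sketch below is one possible set of definitions, not the framework's: `TaskRecord` and its fields are hypothetical, and token efficiency is assumed to mean successful tasks per 1,000 tokens, so higher is better.

```python
from dataclasses import dataclass


@dataclass
class TaskRecord:
    succeeded: bool
    tokens_used: int


def success_rate(records: list[TaskRecord]) -> float:
    """Fraction of tasks that completed successfully."""
    if not records:
        return 0.0
    return sum(r.succeeded for r in records) / len(records)


def token_efficiency(records: list[TaskRecord]) -> float:
    """Successful tasks per 1,000 tokens spent (higher is better)."""
    total_tokens = sum(r.tokens_used for r in records)
    if total_tokens == 0:
        return 0.0
    return sum(r.succeeded for r in records) / (total_tokens / 1000)


def revert_rate(accepted: int, reverted: int) -> float:
    """Fraction of accepted mutations that were later reverted (lower is better)."""
    return reverted / accepted if accepted else 0.0


records = [TaskRecord(True, 1000), TaskRecord(False, 500), TaskRecord(True, 500)]
print(round(success_rate(records), 2))  # 0.67
print(token_efficiency(records))        # 1.0
print(revert_rate(accepted=4, reverted=1))  # 0.25
```

Comparing these numbers before and after a mutation is what turns "measure fitness" into a keep-or-revert decision.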
Not all genes should be mutable. Essential genes are frozen:
| Risk Level | Example | Gate |
|---|---|---|
| 🟢 Safe | Timeout adjustments, log formatting | Automated |
| 🟡 Medium | Operational rules, skill triggers | Human review |
| 🔴 High | Identity, values, safety constraints | Frozen / emergency only |
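
The gating table maps naturally onto a lookup. In this sketch the category keys, the `Gate` enum, and `gate_for` are hypothetical, and sending unknown change types to human review is an assumption (fail toward caution), not something the framework specifies.

```python
from enum import Enum


class Gate(Enum):
    AUTOMATED = "automated"
    HUMAN_REVIEW = "human review"
    FROZEN = "frozen"


# Categories mirror the examples in the table; the key names are assumed.
GATE_BY_CATEGORY = {
    "timeout": Gate.AUTOMATED,
    "log_format": Gate.AUTOMATED,
    "operational_rule": Gate.HUMAN_REVIEW,
    "skill_trigger": Gate.HUMAN_REVIEW,
    "identity": Gate.FROZEN,
    "values": Gate.FROZEN,
    "safety": Gate.FROZEN,
}


def gate_for(category: str) -> Gate:
    """Pick the review gate for a proposed mutation.

    Unknown categories fall back to human review (an assumption).
    """
    return GATE_BY_CATEGORY.get(category, Gate.HUMAN_REVIEW)


print(gate_for("timeout").value)   # automated
print(gate_for("identity").value)  # frozen
```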
| Paper/Project | Key Insight | Link |
|---|---|---|
| Darwin Gödel Machine | Self-modifying agents with evolutionary selection | arxiv.org/abs/2505.22954 |
| autoresearch | AI modifies code, trains, evaluates, keeps/discards overnight | github.com/karpathy/autoresearch |
| RLM | Recursive LLM spawning for complex tasks | arxiv.org/abs/2512.24601 |
| MemGPT | Virtual memory paging for LLM context management | arxiv.org/abs/2310.08560 |
| STOP | Self-taught prompt optimizer | arxiv.org/abs/2310.02304 |
| DSPy | Programmatic prompt optimization against metrics | github.com/stanfordnlp/dspy |
| Constitutional AI | AI systems with explicit value constraints | Anthropic |
This is a research framework, not a library. To apply it to your agent:
- Classify your agent: take the interactive quiz to discover your species
- Map your config files to genome components
- Identify your frozen genes: what should never change?
- Set up fitness metrics: how do you measure improvement?
- Build a mutation loop: propose → review → apply → measure → keep/revert
Genome analyses for different agent architectures:
| Agent Type | Classification | Key Trait |
|---|---|---|
| OpenAI Custom GPT | Adaptia.Monagentia.Episodia.Lysenkoism | No evolution loop; forgets everything |
| AutoGPT | Evolventia.Monagentia.Episodia.Darwinia | Autonomous but no immune system |
| Cursor / Devin | Evolventia.Monagentia.Episodia.Lysenkoism | Has a genome (.cursorrules) that doesn't know it's a genome |
| OpenClaw Multi-Agent | Evolventia.Polyagentia.Hierarchia.Lamarckia | Full Lamarckian loop with 11/11 genome components |
- Naming Guide: how binomial names are generated, etymology tables, weight distribution, custom vs auto
- Biology-to-AI Mapping: where the analogy holds, where it breaks, and why Lamarck was right about AI
- References: 25+ papers and projects (DGM, EvolveR, PromptBreeder, MemGPT, STOP, DSPy, and more)
agent-taxonomy/
├── README.md                    # This file: taxonomy + framework
├── CONTRIBUTING.md              # How to contribute
├── LICENSE                      # MIT
├── docs/
│   ├── naming.md                # Binomial nomenclature guide
│   ├── biology-mapping.md       # Detailed analogy analysis
│   └── references.md            # Papers + projects
├── examples/
│   ├── classify-your-agent.md   # Interactive questionnaire
│   ├── genome-openai-assistant.md
│   ├── genome-autogpt.md
│   ├── genome-cursor-devin.md
│   └── genome-openclaw-multi.md
├── scripts/
│   └── species_namer.py         # Species classifier + name generator
└── diagrams/                    # Visual assets (coming soon)
🧬 This is a fun experiment, not a PhD thesis.
We're curious how far the biology-to-AI-agent analogy holds and, more importantly, where it breaks down. The breaks are the interesting part. If your agent doesn't fit any of our boxes, open an issue: that's discovery, not failure.
Try it. Name your agent. See if the taxonomy makes you think about your system differently. If it doesn't, that's useful data too.
Contributions welcome. See CONTRIBUTING.md.
MIT