# Baby Dragon Hatchling Continual Learning (BDH-CL)

**Fork of**: [pathwaycom/bdh](https://github.com/pathwaycom/bdh)

***

## Introduction

This repository extends the original Baby Dragon Hatchling (BDH) architecture, a biologically inspired large language model that bridges transformers and models of neural computation, with **continual learning** mechanisms inspired by biological synaptic plasticity.

The key contribution of this fork is **Adaptive Synaptic Consolidation**, which lets BDH learn multiple tasks sequentially without catastrophic forgetting, in the spirit of Zenke et al.'s *Continual Learning Through Synaptic Intelligence* (ICML 2017).

***

## Highlights of Changes and Improvements

### Continual Learning Integration

- Added **Elastic Weight Consolidation (EWC)** with Fisher information estimation to protect important weights from being overwritten while training on new tasks.
- Implemented **adaptive synaptic gates** that regulate plasticity at the neuron level, inspired by biological metaplasticity.
- Integrated **path-integral online importance measures** for efficient tracking of weight significance during training (a minimal sketch follows this list).
- Added support for **multi-task sequential training**, enabling scalable lifelong learning.
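
Below is a minimal sketch of the consolidation idea in the spirit of EWC and Synaptic Intelligence: a running path integral of gradient times parameter change is converted into a per-weight importance at each task boundary, and a quadratic penalty then keeps important weights near their consolidated values. The class and method names (`SynapticConsolidation`, `accumulate`, `consolidate`, `penalty`) are illustrative placeholders, not the exact API of this repository:

```python
import torch

class SynapticConsolidation:
    """Path-integral importance tracking with a quadratic consolidation penalty.
    Illustrative sketch only (Synaptic Intelligence / EWC style)."""

    def __init__(self, model, damping=1e-3):
        self.model = model
        self.damping = damping
        params = dict(model.named_parameters())
        self.path_integral = {n: torch.zeros_like(p) for n, p in params.items()}  # running omega
        self.importance = {n: torch.zeros_like(p) for n, p in params.items()}     # consolidated Omega
        self.anchors = {n: p.detach().clone() for n, p in params.items()}         # reference weights
        self._prev = {n: p.detach().clone() for n, p in params.items()}

    def accumulate(self):
        """Call after every optimizer step: omega += -grad * (theta_new - theta_old)."""
        for n, p in self.model.named_parameters():
            if p.grad is not None:
                step = p.detach() - self._prev[n]
                self.path_integral[n] += (-p.grad.detach()) * step
            self._prev[n] = p.detach().clone()

    def consolidate(self):
        """Call at the end of each task: convert the path integral into importance."""
        for n, p in self.model.named_parameters():
            task_drift = p.detach() - self.anchors[n]
            self.importance[n] += self.path_integral[n] / (task_drift.pow(2) + self.damping)
            self.anchors[n] = p.detach().clone()
            self.path_integral[n].zero_()

    def penalty(self, strength=1.0):
        """Regularizer that keeps important weights close to their anchors."""
        loss = 0.0
        for n, p in self.model.named_parameters():
            loss = loss + (self.importance[n] * (p - self.anchors[n]).pow(2)).sum()
        return strength * loss
```

A usage sketch for sequential training appears under *How to Use* below.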

## Benchmarking Suite

| ![Permuted MNIST](res/PERMUTED_MNIST.PNG) | ![Rotated MNIST](res/ROTATED_MNIST.PNG) |
|:----------------------------------------------------:|:-------------------------------------------------:|
| Permuted MNIST (Simple) | Rotated MNIST (Simple) |

| ![Split CIFAR](res/SPLIT_CIFAR.PNG) | ![Sequence](res/SEQUENCE.PNG) |
|:----------------------------------------------------:|:-------------------------------------------------:|
| Split CIFAR (Simple) | Sequence (Simple) |
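
For reference, a permuted-MNIST task sequence of the kind shown above is typically built by applying one fixed random pixel permutation per task. The sketch below uses `torchvision` and is illustrative only; the repository's `simple_benchmark.py` may construct its tasks differently:

```python
import torch
from torch.utils.data import DataLoader, Dataset
from torchvision import datasets, transforms

class PermutedMNIST(Dataset):
    """MNIST with a fixed pixel permutation; each permutation defines one task."""
    def __init__(self, base, perm):
        self.base, self.perm = base, perm

    def __len__(self):
        return len(self.base)

    def __getitem__(self, idx):
        img, label = self.base[idx]               # img: 1x28x28 tensor
        return img.view(-1)[self.perm], label     # flattened, permuted pixels

def make_permuted_mnist_tasks(num_tasks=5, batch_size=128):
    base = datasets.MNIST("./data", train=True, download=True,
                          transform=transforms.ToTensor())
    return [DataLoader(PermutedMNIST(base, torch.randperm(28 * 28)),
                       batch_size=batch_size, shuffle=True)
            for _ in range(num_tasks)]
```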

<img src="figs/architecture.png" width="600"/>

***

## How to Use

<img src="figs/vocab.png" width="600"/>
- Install dependencies:

```bash
pip install -r requirements.txt
```

- Train BDH-CL with continual learning enabled:

```bash
python train.py --continual_learning
```

<img src="figs/bdh_scaling.png" width="600"/>
- Run simple benchmarks:

```bash
python simple_benchmark.py --benchmark permuted_mnist --num_tasks 5 --epochs 10

python simple_benchmark.py --benchmark split_cifar --num_tasks 5 --epochs 10

python simple_benchmark.py --benchmark rotated_mnist --num_tasks 10 --epochs 10

python simple_benchmark.py --benchmark sequence --num_tasks 5 --epochs 10
```
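
Programmatically, sequential training with the consolidation penalty might look like the sketch below, which reuses the `SynapticConsolidation` and `make_permuted_mnist_tasks` sketches from earlier sections. The function and argument names here are placeholders, not this repository's actual training API:

```python
import torch
import torch.nn.functional as F

def train_sequentially(model, tasks, epochs=10, reg_strength=1.0, lr=1e-3):
    """Train one task after another, penalizing drift of consolidated weights."""
    consolidation = SynapticConsolidation(model)     # sketch class shown earlier
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for task_loader in tasks:                         # e.g. make_permuted_mnist_tasks()
        for _ in range(epochs):
            for inputs, targets in task_loader:
                optimizer.zero_grad()
                loss = F.cross_entropy(model(inputs), targets)
                loss = loss + consolidation.penalty(reg_strength)
                loss.backward()
                optimizer.step()
                consolidation.accumulate()            # update path-integral importance
        consolidation.consolidate()                   # freeze anchors at the task boundary
    return model
```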

***

## Credits

This project builds upon and extends the original [Baby Dragon Hatchling repository by Pathway](https://github.com/pathwaycom/bdh).

The original authors' foundational work on biologically inspired neural architectures underpins this extension.

*Note:* parts of this fork were written with AI assistance ("vibe coding"), and Python is not the author's primary language.

***

## References

- F. Zenke, B. Poole, S. Ganguli. *Continual Learning Through Synaptic Intelligence*, ICML 2017.
- A. Kosowski, P. Uznański, J. Chorowski, Z. Stamirowska, M. Bartoszkiewicz. [*The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain*](https://doi.org/10.48550/arXiv.2509.26507), arXiv, 2025.

***

## Summary

BDH-CL introduces practical, biologically inspired continual learning into the BDH architecture, enabling lifelong learning over task sequences rather than the single-task training of the original. It combines ideas from neuroscience with modern machine-learning regularization techniques in a next-generation language model architecture.

***