Goal:
Develop MLPs that start small and grow adaptively during training, guided by internal signals indicating capacity bottlenecks or saturation. The aim is to improve sample efficiency, reduce compute costs, and maintain or improve generalization.
This framework uses internal diagnostics—such as mutual information, gradient norms, and activation dynamics—to decide when and how the model should expand during training.
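As a concrete illustration, below is a minimal sketch of one possible trigger based on gradient norms. The class name `GrowthTrigger`, the window size, and the plateau heuristic are illustrative assumptions, not a fixed design; mutual-information or activation-based signals would slot into the same interface.

```python
# Hypothetical sketch: a gradient-norm-based growth trigger.
# Names (GrowthTrigger, should_grow) are illustrative, not an existing API.
from collections import deque

import torch
import torch.nn as nn


class GrowthTrigger:
    """Tracks a running window of gradient norms and fires when they plateau,
    which we treat here as a proxy for capacity saturation."""

    def __init__(self, window: int = 50, rel_change_threshold: float = 0.02):
        self.history = deque(maxlen=window)
        self.rel_change_threshold = rel_change_threshold

    def update(self, model: nn.Module) -> None:
        # Global L2 norm of all parameter gradients, taken after loss.backward().
        total = 0.0
        for p in model.parameters():
            if p.grad is not None:
                total += p.grad.detach().pow(2).sum().item()
        self.history.append(total ** 0.5)

    def should_grow(self) -> bool:
        # Fire only once the window is full and the norm has stopped changing.
        if len(self.history) < self.history.maxlen:
            return False
        half = len(self.history) // 2
        first_half = sum(list(self.history)[:half])
        second_half = sum(list(self.history)[half:])
        rel_change = abs(second_half - first_half) / (first_half + 1e-12)
        return rel_change < self.rel_change_threshold
```

In a training loop, `trigger.update(model)` would run after each backward pass, and a growth step would be taken whenever `trigger.should_grow()` returns True.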
Modern neural networks are typically overparameterized from the start—wasting compute and memory on capacity that may not be needed.
We propose a different principle:
Grow only when necessary.
This approach promises to:
- Save computation and memory during early training
- Adaptively match model complexity to data
- Provide interpretability via explicit growth triggers
Research questions:
- Which internal signals most reliably indicate capacity saturation?
- Does online model growth improve training efficiency or generalization?
- What growth strategies are most effective (depth-first, width-first, hybrid)?
- How should new neurons or layers be initialized to integrate seamlessly? (One candidate scheme is sketched below.)
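Addressing the last two questions, the following sketch assumes width-first growth with function-preserving initialization (new outgoing weights set to zero, in the spirit of Net2Net-style widening). The helper `widen_hidden_layer` is hypothetical and shown only to make the idea concrete.

```python
# Hypothetical sketch of width-first growth for one hidden layer of an MLP.
# New incoming weights are small and random; new outgoing weights are zero,
# so the network's function is unchanged at the moment of expansion.
import torch
import torch.nn as nn


@torch.no_grad()
def widen_hidden_layer(fc_in: nn.Linear, fc_out: nn.Linear, new_units: int):
    """Return widened copies of (fc_in, fc_out) with `new_units` extra hidden neurons."""
    old_units = fc_in.out_features

    wider_in = nn.Linear(fc_in.in_features, old_units + new_units)
    wider_out = nn.Linear(old_units + new_units, fc_out.out_features)

    # Copy the existing parameters into the widened layers.
    wider_in.weight[:old_units] = fc_in.weight
    wider_in.bias[:old_units] = fc_in.bias
    wider_out.weight[:, :old_units] = fc_out.weight
    wider_out.bias.copy_(fc_out.bias)

    # New neurons: small random fan-in, zero fan-out (function-preserving).
    nn.init.normal_(wider_in.weight[old_units:], std=0.01)
    nn.init.zeros_(wider_in.bias[old_units:])
    nn.init.zeros_(wider_out.weight[:, old_units:])

    return wider_in, wider_out
```

Because the parameter tensors are replaced, the optimizer state for the affected layers would need to be rebuilt (or the new parameters registered with the optimizer) after each growth step.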
Planned experiments:
- Tasks: MNIST, Fashion-MNIST, Parity Task, flattened CIFAR-10
- Baselines: fixed-size MLPs, dropout, early stopping
- Ablations: growth trigger type (MI vs gradients), growth direction, init schemes
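To make the ablation axes concrete, a configuration object might look like the sketch below; the field names and defaults are purely illustrative assumptions.

```python
# Hypothetical experiment configuration; field names and defaults are illustrative.
from dataclasses import dataclass


@dataclass
class GrowthConfig:
    trigger: str = "grad_norm"        # "grad_norm" or "mutual_info"
    direction: str = "width"          # "width", "depth", or "hybrid"
    init_scheme: str = "zero_fanout"  # "zero_fanout" or "small_random"
    window: int = 50                  # steps used to detect a plateau
    max_growth_steps: int = 5         # cap on how many times the model may grow
```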
Deliverables:
- A modular, extensible PyTorch codebase
- Built-in diagnostic toolkit
- Benchmark results and visualizations
- Draft paper summarizing methods and findings
Let the model grow only when it needs to.