
Implement mini-batch gradient descent for faster training #1

Open
winpat wants to merge 1 commit into main from claude/optimize-training-speed-yXixY

Conversation


winpat (Owner) commented on Jan 14, 2026

This commit introduces mini-batch training to significantly improve
training performance. Instead of updating weights after each sample,
gradients are accumulated over a batch and applied once per batch.
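
In symbols (a standard statement of mini-batch gradient descent, not code taken from this diff): with learning rate $\eta$ and batch size $B$, each batch applies the single averaged update

$$
w \leftarrow w - \frac{\eta}{B} \sum_{i=1}^{B} \nabla_w L(w;\, x_i, y_i)
$$

rather than $B$ separate per-sample updates.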

Changes:

- Added gradient accumulation to Linear layer (see the Linear sketch after this list)
  - New zeroGradients() method to reset gradients at batch start
  - Modified backward() to accumulate gradients instead of overwriting
  - Updated applyGradients() to average gradients by batch size
- Updated Network training loop (see the training loop sketch after this list)
  - Added batch_size parameter to train()
  - Process data in configurable mini-batches
  - Zero gradients at start of each batch
  - Accumulate gradients over batch samples
  - Apply averaged gradients once per batch
- Updated all train() calls to include batch_size parameter
  - main.zig: Uses batch_size=32 for Iris dataset
  - net.zig test: Uses batch_size=1 for backward compatibility
  - README.md: Updated example with batch_size=32
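
A minimal Zig sketch of the two pieces above. All names here (Linear, weight_grads, accumulateGradients, Network, layers, forward, backward) are assumptions for illustration; the actual definitions in net.zig may differ.

```zig
// Sketch only: field and method names are assumed, not the real net.zig code.
pub const Linear = struct {
    weights: []f64, // flattened [outputs x inputs]
    biases: []f64,
    weight_grads: []f64, // accumulated over the current mini-batch
    bias_grads: []f64,

    /// Reset accumulated gradients at the start of a mini-batch.
    pub fn zeroGradients(self: *Linear) void {
        @memset(self.weight_grads, 0);
        @memset(self.bias_grads, 0);
    }

    /// Add one sample's gradients into the accumulators (+= instead of =),
    /// mirroring what backward() now does per the description above.
    pub fn accumulateGradients(self: *Linear, w_grad: []const f64, b_grad: []const f64) void {
        for (self.weight_grads, w_grad) |*acc, g| acc.* += g;
        for (self.bias_grads, b_grad) |*acc, g| acc.* += g;
    }

    /// Apply the batch-averaged gradients once per batch.
    pub fn applyGradients(self: *Linear, learning_rate: f64, batch_size: usize) void {
        const scale = learning_rate / @as(f64, @floatFromInt(batch_size));
        for (self.weights, self.weight_grads) |*w, g| w.* -= scale * g;
        for (self.biases, self.bias_grads) |*b, g| b.* -= scale * g;
    }
};
```

And the reworked training loop, assuming Network keeps a slice of layers and the existing forward()/backward() helpers:

```zig
// Sketch of Network.train with mini-batching; types and helpers are assumed.
pub fn train(
    self: *Network,
    inputs: []const []const f64,
    targets: []const []const f64,
    epochs: usize,
    learning_rate: f64,
    batch_size: usize,
) void {
    for (0..epochs) |_| {
        var start: usize = 0;
        while (start < inputs.len) : (start += batch_size) {
            const end = @min(start + batch_size, inputs.len);

            // 1. Zero the accumulated gradients in every layer.
            for (self.layers) |*layer| layer.zeroGradients();

            // 2. Accumulate gradients over the samples in this batch.
            for (inputs[start..end], targets[start..end]) |x, y| {
                const prediction = self.forward(x);
                self.backward(prediction, y); // += into each layer's gradient buffers
            }

            // 3. Apply the averaged gradients once per batch.
            for (self.layers) |*layer| {
                layer.applyGradients(learning_rate, end - start);
            }
        }
    }
}
```

Note that the last batch may be smaller than batch_size, which is why the average uses `end - start` rather than `batch_size`.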

Benefits:

- Faster training through reduced weight-update overhead
- More stable gradient estimates from batch averaging
- Better convergence properties
- Configurable batch size for different datasets

The default batch_size of 32 provides a good balance between
speed and gradient stability for most datasets.
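
Assuming the train() signature sketched above (parameter names, order, and the surrounding variable names are guesses), the Iris call in main.zig would look roughly like:

```zig
// Hypothetical call site; the actual arguments in main.zig may differ.
net.train(iris_inputs, iris_targets, 100, 0.01, 32); // 32 = batch_size
```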

