ScratchML is a Python library for implementing machine learning and deep learning models from scratch. The goal of this project is to provide a deeper understanding of how these algorithms work under the hood by building them without relying on high-level libraries like TensorFlow, PyTorch, or Scikit-learn.
- Linear Regression (With Gradient Descent & Normal Equation)
- Ridge Regression (L2 Regularization)
- Lasso Regression (L1 Regularization)
- Polynomial Regression
- Logistic Regression (Binary & Multi-class Classification)
- K-Nearest Neighbors (KNN) (Distance-based classification)
- Naïve Bayes (Gaussian & Multinomial)
- Decision Tree (Entropy, Gini Index)
- Random Forest (Ensemble Learning)
- Support Vector Machine (SVM) (Hard & Soft Margin, Kernel Trick)
- Boosting (AdaBoost, and gradient boosting in the style of XGBoost and LightGBM)
- K-Means Clustering (Centroid-based clustering)
- Hierarchical Clustering (Agglomerative & Divisive)
- DBSCAN (Density-Based Spatial Clustering)
- Principal Component Analysis (PCA)
- Autoencoders (A neural network-based unsupervised model)
- Perceptron (The foundation of neural networks)
- Multi-Layer Perceptron (MLP) (Backpropagation from scratch)
- Convolutional Neural Networks (CNNs) (With Conv, Pooling, Dropout)
- Recurrent Neural Networks (RNNs) (Basic RNN for time-series/text)
- Long Short-Term Memory (LSTMs) (For NLP and sequential tasks)
- Transformers (Self-Attention, Positional Encoding)
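As an illustration of the "from scratch" approach, the first item in the list (linear regression via the normal equation and via gradient descent) can be sketched with nothing but NumPy. This is a minimal sketch, not ScratchML's actual API: the function names `fit_normal_equation` and `fit_gradient_descent`, the learning rate, the iteration count, and the synthetic data are all assumptions made for the example.

```python
import numpy as np


def fit_normal_equation(X, y):
    """Closed-form least squares: solves X^T X theta = X^T y."""
    Xb = np.c_[np.ones(len(X)), X]  # prepend a bias column of ones
    # lstsq avoids explicitly inverting X^T X, which is numerically safer
    theta, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return theta


def fit_gradient_descent(X, y, lr=0.1, n_iters=2000):
    """Batch gradient descent on the mean squared error."""
    Xb = np.c_[np.ones(len(X)), X]
    theta = np.zeros(Xb.shape[1])
    for _ in range(n_iters):
        # gradient of (1/n) * ||Xb @ theta - y||^2 with respect to theta
        grad = (2.0 / len(y)) * Xb.T @ (Xb @ theta - y)
        theta -= lr * grad
    return theta


# Synthetic data (hypothetical): y = 3 + 2x plus a little Gaussian noise
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 1))
y = 3.0 + 2.0 * X[:, 0] + rng.normal(scale=0.1, size=200)

theta_ne = fit_normal_equation(X, y)
theta_gd = fit_gradient_descent(X, y)
print("normal equation :", theta_ne)
print("gradient descent:", theta_gd)
```

Both routines minimize the same convex objective, so on well-conditioned data like this they should agree closely; the gradient descent version is the one that generalizes to models without a closed-form solution.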
Clone the repository to your local machine:
git clone https://github.com/your-username/ScratchML.git
cd ScratchML

Ensure you have Python 3.7+ installed along with the required dependencies:
pip install -r requirements.txt

ScratchML/
├── supervised_learning/
│ ├── regression.py # Contains regression models (Linear, Ridge, Lasso, Polynomial)
│ └── __init__.py # Package initialization
├── deep_learning/ # (Planned) Deep learning models and utilities
├── tests/ # Unit tests for the library
├── requirements.txt # Python dependencies
└── README.md # Project documentation
- Educational Purpose: This library is designed to help developers and students understand the inner workings of machine learning and deep learning algorithms.
- Extendability: The library is modular, making it easy to add new models and features.
- Performance: Although the focus is on understanding, the implementations also aim to be efficient and scalable.
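To make the extendability point concrete, here is a minimal sketch of what adding a model could look like if models follow a `fit`/`predict` convention. That convention, the class name `RidgeSketch`, and the `theta_` attribute are assumptions for illustration only; the source does not specify the interface used in `regression.py`.

```python
import numpy as np


class RidgeSketch:
    """Hypothetical ridge regression model (not ScratchML's actual class)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # strength of the L2 penalty

    def fit(self, X, y):
        Xb = np.c_[np.ones(len(X)), X]  # prepend a bias column of ones
        penalty = self.alpha * np.eye(Xb.shape[1])
        penalty[0, 0] = 0.0  # leave the bias term unregularized
        # Closed form: (X^T X + alpha * I) theta = X^T y
        self.theta_ = np.linalg.solve(Xb.T @ Xb + penalty, Xb.T @ y)
        return self

    def predict(self, X):
        return np.c_[np.ones(len(X)), X] @ self.theta_


# Tiny noiseless dataset lying on y = 1 + 2x
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

plain = RidgeSketch(alpha=0.0).fit(X, y)    # reduces to ordinary least squares
shrunk = RidgeSketch(alpha=10.0).fit(X, y)  # slope is pulled toward zero
print(plain.theta_, shrunk.theta_)
```

With `alpha=0.0` the penalty vanishes and the fit recovers the exact line; increasing `alpha` shrinks the slope, which is the behavior L2 regularization is meant to produce.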
Contributions are welcome! If you'd like to add new models, improve existing ones, or fix bugs, feel free to open a pull request or submit an issue.
This project is licensed under the MIT License. See the LICENSE file for details.
This project is inspired by the desire to learn and teach the fundamentals of machine learning and deep learning by building models from scratch.