Multivariate-Linear-Regression

This project implements a multivariate linear regression model from scratch using Python and NumPy, and compares it against scikit-learn implementations. It walks through the entire machine learning pipeline, including data exploration, feature scaling, gradient descent optimization, convergence analysis, prediction, and model evaluation.


Main Concepts Covered

This project focuses on building strong intuition for the mathematical and algorithmic foundations of linear regression:

  • Multivariate linear regression hypothesis
  • Feature scaling (mean normalization & standardization)
  • Cost function (Mean Squared Error)
  • Gradient computation (∂J/∂W, ∂J/∂b)
  • Gradient descent optimization
  • Learning rate and convergence behavior
  • Residual analysis and error interpretation
  • Model evaluation metrics (R², MSE, RMSE, MAE)
  • Comparison with scikit-learn implementations
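The core concepts above (standardization, the MSE cost, the gradients ∂J/∂W and ∂J/∂b, and the descent loop) can be sketched in plain NumPy. This is an illustrative sketch, not the notebook's exact code; function names are hypothetical.

```python
import numpy as np

def standardize(X):
    # Mean normalization + scaling by standard deviation (z-score).
    mu, sigma = X.mean(axis=0), X.std(axis=0)
    return (X - mu) / sigma, mu, sigma

def predict(X, W, b):
    # Multivariate hypothesis: y_hat = X @ W + b
    return X @ W + b

def cost(X, y, W, b):
    # Mean Squared Error: J = (1 / 2m) * sum((y_hat - y)^2)
    m = len(y)
    err = predict(X, W, b) - y
    return (err @ err) / (2 * m)

def gradient_descent(X, y, alpha=0.1, iters=1000):
    # Plain batch gradient descent; records the cost each iteration
    # so convergence can be plotted afterwards.
    m, n = X.shape
    W, b = np.zeros(n), 0.0
    history = []
    for _ in range(iters):
        err = predict(X, W, b) - y
        dW = (X.T @ err) / m      # ∂J/∂W
        db = err.mean()           # ∂J/∂b
        W -= alpha * dW
        b -= alpha * db
        history.append(cost(X, y, W, b))
    return W, b, history
```

Standardizing the features first keeps the cost surface well-conditioned, which is why a single learning rate works across all features.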

Tech Stack

  • Python
  • NumPy
  • Pandas
  • Matplotlib & Seaborn
  • scikit-learn
  • Jupyter Notebook

Repository Structure

.
├── multiple_regression.ipynb   # Complete implementation and analysis
└── README.md                   # Project documentation

How to Run

  1. Clone the repository:

    git clone https://github.com/<your-username>/<repo-name>.git
    cd <repo-name>
  2. Install dependencies:

    pip install numpy pandas matplotlib seaborn scikit-learn
  3. Open the notebook:

    jupyter notebook multiple_regression.ipynb
  4. Run the notebook cells sequentially.


Model Workflow

  1. Data creation & exploration
  2. Visualization using pair plots and correlation heatmaps
  3. Feature scaling to improve gradient descent convergence
  4. Custom gradient descent training
  5. Convergence analysis using cost vs iterations
  6. Predictions on unseen inputs
  7. Comparison with scikit-learn LinearRegression & SGDRegressor
  8. Evaluation using standard regression metrics
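Step 7 above can be illustrated with a small sketch (assumed names and synthetic data, not the notebook's code): fit scikit-learn's LinearRegression and check that it agrees with a direct least-squares solve of the same problem.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data with known coefficients, purely for illustration.
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + 4.0 + rng.normal(scale=0.1, size=100)

# scikit-learn baseline.
model = LinearRegression().fit(X, y)

# Direct least squares with a bias column appended to X.
A = np.hstack([np.ones((len(X), 1)), X])
theta = np.linalg.lstsq(A, y, rcond=None)[0]

print(model.intercept_, theta[0])   # intercepts should agree
print(model.coef_, theta[1:])       # coefficients should agree
```

A well-converged custom gradient descent should land on the same parameters, since the MSE cost for linear regression is convex and has a unique minimizer (when the features are not collinear).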

Results & Observations

  • Feature scaling significantly improves gradient descent stability
  • Custom implementation closely matches scikit-learn results
  • Convergence curves provide insight into optimization behavior
  • Residual plots help validate linear model assumptions
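The evaluation metrics mentioned above (R², MSE, RMSE, MAE) reduce to a few lines of NumPy. A minimal sketch with a hypothetical helper name:

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    resid = y_true - y_pred
    mse = np.mean(resid ** 2)            # Mean Squared Error
    rmse = np.sqrt(mse)                  # Root Mean Squared Error
    mae = np.mean(np.abs(resid))         # Mean Absolute Error
    ss_res = np.sum(resid ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1 - ss_res / ss_tot             # coefficient of determination
    return {"MSE": mse, "RMSE": rmse, "MAE": mae, "R2": r2}

y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.8, 5.1, 7.2, 8.9])
print(regression_metrics(y_true, y_pred))
```

RMSE reports error in the target's own units, while R² compares the model against always predicting the mean, so the two complement each other when judging a fit.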

License

This project is intended for educational purposes. A license can be added later if the repository is extended or shared for reuse.


Author

Abhi
Learning-focused Machine Learning & Python projects
