A freelance data science project delivering actionable insights through linear regression modeling. The results supported the academic research of a Production Engineering PhD candidate at UFES (Federal University of Espírito Santo).
This project applies linear regression techniques to real-world data, covering both model construction and interpretability. Implemented in Python with pandas, NumPy, statsmodels, and scikit-learn, the notebook walks through a complete analysis pipeline, from data exploration to model diagnostics.
✅ Designed to deliver clean, reproducible, and academically sound results.
- 📊 Data Cleaning & EDA: Initial exploration and preprocessing to ensure quality inputs.
- 🧮 Model Training: Linear regression using statsmodels and scikit-learn.
- 📉 Model Evaluation: Diagnostic plots, residual analysis, and performance metrics.
- 📌 Result Interpretation: Coefficient analysis and insights for domain-specific applications.
- Python 3.x
- Jupyter Notebook
- pandas, numpy
- matplotlib, seaborn
- scikit-learn
- statsmodels
- main.ipynb: The complete notebook with code, analysis, and comments.
- Clone this repository:

      git clone https://github.com/Pedro2um/LinearRegression-2024.git
      cd LinearRegression-2024

- Install dependencies (preferably in a virtual environment):

      pip install -r requirements.txt

- Launch the notebook:

      jupyter notebook main.ipynb
Special thanks to the PhD candidate from UFES for trusting this collaboration in their research journey.
Hi! I'm Pedro, a Computer Science student and aspiring Data Scientist / ML Engineer.
I love building things that are statistically sound and practically useful.
📫 Check out my GitHub