Zimal-Fatemah/Machine-Learning-Labs

Machine Learning Labs

A collection of hands-on Machine Learning lab notebooks covering core ML concepts, with models trained on real-world datasets. Each technique is implemented both from scratch and with scikit-learn. Completed as part of a fifth-semester Machine Learning course.


📂 Contents

| Notebook | Topics Covered |
|---|---|
| data_preprocessing.py | Handling missing values, encoding categorical features, data cleaning |
| Linear_Regression.ipynb | Simple & multiple linear regression from scratch, RMSE, R² score |
| Linear_Regression_Sklearn.ipynb | Linear regression using scikit-learn, feature scaling, evaluation |
| Logistic_Polynomial_Regression.ipynb | Logistic regression (scratch + sklearn), polynomial regression |
| KNN_Distance_Metrics.ipynb | K-Nearest Neighbors with different distance metrics (Euclidean, Manhattan) |
| KNN_SVM_Ensemble.ipynb | KNN vs SVM comparison, ensemble methods (Bagging, Boosting, XGBoost) |
| K_Means_Clustering.ipynb | K-Means clustering from scratch, elbow method, DBSCAN |
| KMeans_PCA.ipynb | Dimensionality reduction with PCA, visualizing clusters |
| Descision_Trees.ipynb | Decision Tree classifier with entropy criterion, confusion matrix, accuracy score & tree visualization |
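
As an illustration of the "from scratch" style listed for the linear regression notebook, here is a minimal sketch of simple linear regression with RMSE and R² in NumPy. It is not taken from the notebooks: the function names and the synthetic data are assumptions made for this example.

```python
import numpy as np


def fit_simple_linear_regression(x, y):
    """Closed-form least-squares fit of y = b0 + b1 * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # Slope: covariance of x and y divided by the variance of x.
    b1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # Intercept: the fitted line passes through the point of means.
    b0 = y_mean - b1 * x_mean
    return float(b0), float(b1)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Synthetic data (an assumption for this sketch, not one of the lab datasets).
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 3.0 + 2.0 * x + rng.normal(scale=0.5, size=x.size)

b0, b1 = fit_simple_linear_regression(x, y)
y_pred = b0 + b1 * x
print(f"intercept={b0:.3f}, slope={b1:.3f}")
print(f"RMSE={rmse(y, y_pred):.3f}, R2={r2_score(y, y_pred):.3f}")
```

The same `rmse` and `r2_score` helpers could be used to sanity-check a scikit-learn model fitted on the same data.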

📊 Datasets Used

These labs use real-world datasets to train and evaluate models — not just toy data:

| Dataset | Used For |
|---|---|
| 1000 Companies | Multiple Linear Regression (profit prediction) |
| Car Price Prediction | Regression (predicting used car prices) |
| House / Admission Predict | Linear Regression (admission chance prediction) |
| Head & Brain | Simple Linear Regression (head size vs brain weight) |
| Medical Cost Personal | Regression (insurance cost prediction) |
| Mall Customers | K-Means Clustering (customer segmentation) |
| Telco Customer Churn | Classification (churn prediction) |
| Income Evaluation | SVM classification |
| Social Network Ads | Logistic Regression, KNN, SVM |
| Mushroom Dataset | KNN & Ensemble classification |
| COVID-19 Dataset | Data analysis & visualization |
| Iris | Clustering, PCA visualization |
| Bill Authentication | Decision Trees |
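
Real-world tables like these usually mix numeric and categorical columns and contain gaps, so they need cleaning before training. The sketch below shows one common approach with pandas. The tiny DataFrame and its column names are hypothetical stand-ins, not rows from any dataset listed above, and the imputation choices (median, most frequent value) are assumptions rather than the labs' actual method.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame with one missing numeric and one missing categorical value.
df = pd.DataFrame(
    {
        "age": [25, np.nan, 47, 31],
        "plan": ["basic", "premium", None, "basic"],
        "churn": ["no", "yes", "no", "yes"],
    }
)

clean = df.copy()

# Numeric column: fill missing values with the column median.
clean["age"] = clean["age"].fillna(clean["age"].median())

# Categorical column: fill missing values with the most frequent category.
clean["plan"] = clean["plan"].fillna(clean["plan"].mode()[0])

# Binary target: map the labels to 0/1.
clean["churn"] = clean["churn"].map({"no": 0, "yes": 1})

# Nominal feature: one-hot encode into indicator columns.
clean = pd.get_dummies(clean, columns=["plan"])

print(clean)
```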

Tech Stack

  • Python 3
  • NumPy — numerical computations
  • Pandas — data manipulation
  • Matplotlib / Seaborn — data visualization
  • Scikit-learn — ML models and evaluation

Concepts Covered

  • Supervised Learning — Linear Regression, Logistic Regression, Polynomial Regression, KNN, SVM, Decision Trees
  • Unsupervised Learning — K-Means Clustering, DBSCAN
  • Dimensionality Reduction — Principal Component Analysis (PCA)
  • Ensemble Methods — Random Forests, AdaBoost, Gradient Boosting, XGBoost
  • Model Evaluation — RMSE, R², Accuracy, Confusion Matrix, ROC Curve, K-Fold Cross Validation
  • Data Preprocessing — Missing values, categorical encoding, feature scaling
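
To make one of these concepts concrete, here is a minimal sketch of K-Nearest Neighbors classification with a choice of Euclidean or Manhattan distance. The function names and the toy points are assumptions for illustration and are not copied from the notebooks.

```python
from collections import Counter

import numpy as np


def distances(X_train, x_query, metric="euclidean"):
    """Distance from x_query to every row of X_train."""
    diff = X_train - x_query  # broadcasting: (n, d) - (d,)
    if metric == "euclidean":
        return np.sqrt(np.sum(diff ** 2, axis=1))
    if metric == "manhattan":
        return np.sum(np.abs(diff), axis=1)
    raise ValueError(f"unknown metric: {metric}")


def knn_predict(X_train, y_train, x_query, k=3, metric="euclidean"):
    """Predict the label of x_query by majority vote among its k nearest neighbors."""
    X_train = np.asarray(X_train, dtype=float)
    x_query = np.asarray(x_query, dtype=float)
    dists = distances(X_train, x_query, metric)
    nearest = np.argsort(dists)[:k]  # indices of the k smallest distances
    votes = Counter(y_train[int(i)] for i in nearest)
    return votes.most_common(1)[0][0]


# Toy data: two well-separated groups (hypothetical, for illustration only).
X_train = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
y_train = ["a", "a", "a", "b", "b", "b"]

for metric in ("euclidean", "manhattan"):
    print(metric, knn_predict(X_train, y_train, [0.5, 0.5], k=3, metric=metric))
```

On well-separated data both metrics agree. Differences tend to appear when features have very different scales or when points lie near a class boundary.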

How to Run

  1. Clone the repository:

    git clone https://github.com/Zimal-Fatemah/Machine-Learning-Labs.git
    cd Machine-Learning-Labs
  2. Install dependencies:

    pip install numpy pandas matplotlib seaborn scikit-learn
  3. Open any notebook:

    jupyter notebook

Notebooks can also be opened directly in Google Colab.
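
After the install step, a quick check like the one below can confirm that the packages are visible to the Python interpreter. This is an optional sketch using only the standard library. The `OPTIONAL` list is an assumption: XGBoost ships as the separate `xgboost` package, and the `jupyter notebook` command relies on `notebook`, neither of which appears in the install command above.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names from the install step above.
REQUIRED = ["numpy", "pandas", "matplotlib", "seaborn", "scikit-learn"]
# Assumption: likely needed for the XGBoost section and for `jupyter notebook`.
OPTIONAL = ["xgboost", "notebook"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(REQUIRED + OPTIONAL).items():
    print(f"{name}: {ver if ver is not None else 'not installed'}")
```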


📌 Notes

  • Implementations are done both from scratch (using NumPy) and using scikit-learn, to build a solid understanding of the underlying math.
  • Each notebook includes data loading, preprocessing, model training, and evaluation steps.
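
The four steps named above (data loading, preprocessing, model training, evaluation) can be sketched with scikit-learn as follows. This is an illustrative outline, not code from a specific notebook: it uses the Iris data bundled with scikit-learn, and the choice of model, split size, and random seed are assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# 1. Data loading: Iris ships with scikit-learn, so no download is needed.
X, y = load_iris(return_X_y=True)

# Hold out 20% of the samples for evaluation, keeping class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# 2. Preprocessing + 3. Training: the scaler is fitted on training data only,
# then the classifier is trained on the scaled features.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=200))
model.fit(X_train, y_train)

# 4. Evaluation on the held-out split.
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)

print(f"accuracy: {accuracy:.3f}")
print(cm)
```

Wrapping the scaler and the classifier in one pipeline keeps test data out of the scaling statistics, which avoids a common source of optimistic evaluation scores.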

Author

Zimal Fatemah
BS Artificial Intelligence
GitHub

About

Hands-on ML labs training models on 12+ real-world datasets — covering regression, classification, clustering, and dimensionality reduction, built from scratch and with scikit-learn.
