An interactive Water Quality Intelligence System built with machine learning and Streamlit.
This project predicts the Water Quality Index (WQI), a numeric score, and the Water Quality Classification (WQC), one of Excellent / Good / Medium / Poor / Very Poor, from measured water parameters.
- End-to-end pipeline: cleaning → feature engineering → modeling → deployment
- Handles missing values (KNN/median strategies) and outliers (IQR capping + domain thresholds)
- Computes WQI dynamically and maps it to a class (WQC)
- Supports both regression (WQI prediction) and classification (WQC prediction)
- Deployed as a sleek Streamlit web app
- Python (Pandas, NumPy, Scikit-learn, XGBoost)
- Streamlit for web UI
- Joblib for model persistence
- GitHub + Streamlit Cloud for deployment
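
The missing-value and outlier handling listed above can be illustrated with a minimal sketch. It assumes KNN imputation followed by IQR capping at 1.5 × IQR. The function name `impute_and_cap`, the column names, and the sample values are hypothetical and not taken from this project's code. The domain thresholds the project also applies are not shown.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer


def impute_and_cap(df, n_neighbors=3, iqr_factor=1.5):
    """Fill missing values with KNN imputation, then cap outliers at the IQR fences.

    Hypothetical helper for illustration; expects numeric columns only.
    """
    imputer = KNNImputer(n_neighbors=n_neighbors)
    filled = pd.DataFrame(
        imputer.fit_transform(df), columns=df.columns, index=df.index
    )

    # Per-column fences: Q1 - k*IQR and Q3 + k*IQR.
    q1 = filled.quantile(0.25)
    q3 = filled.quantile(0.75)
    iqr = q3 - q1
    lower = q1 - iqr_factor * iqr
    upper = q3 + iqr_factor * iqr

    # Clip each column to its own fences instead of dropping rows.
    return filled.clip(lower=lower, upper=upper, axis=1)


# Made-up sample with two gaps and one implausible pH reading (14.0).
sample = pd.DataFrame(
    {
        "ph": [7.1, 7.3, np.nan, 6.9, 7.0, 7.2, 7.4, 14.0],
        "dissolved_oxygen": [6.5, np.nan, 7.0, 6.8, 6.6, 6.9, 7.1, 6.0],
    }
)
cleaned = impute_and_cap(sample)
print(cleaned)
```

Capping keeps every row, which matters for a dataset of this size. Swapping `KNNImputer` for `SimpleImputer(strategy="median")` would give the median strategy instead.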
Source: Kaggle (Indian River Water Quality dataset)
- 1991 rows, 8 key water quality parameters
- Target variables engineered: WQI and WQC
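
One way the WQC target could be derived from WQI is by binning the score. The sketch below assumes a 0 to 100 scale where a higher WQI means cleaner water. The cut points (25, 50, 70, 90) and the name `classify_wqi` are hypothetical, chosen only for illustration. The project's actual thresholds may differ.

```python
import pandas as pd

# Hypothetical cut points on a 0-100 scale (higher WQI = cleaner water).
WQC_BINS = [0, 25, 50, 70, 90, 100]
WQC_LABELS = ["Very Poor", "Poor", "Medium", "Good", "Excellent"]


def classify_wqi(wqi: pd.Series) -> pd.Series:
    """Map numeric WQI scores to WQC labels using right-closed bins."""
    # include_lowest=True puts a score of exactly 0 in the first bin.
    return pd.cut(wqi, bins=WQC_BINS, labels=WQC_LABELS, include_lowest=True)


scores = pd.Series([10, 60, 95])
classes = classify_wqi(scores)
print(classes.tolist())
```

Scores outside the 0 to 100 range come back as `NaN`, so they are easy to spot before training a classifier.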