A streamlined predictive modeling project built to analyze health-related datasets and identify stroke risk factors using Python and statistical learning techniques.
Developed as part of PolyX University Research (December 2023 – April 2024) under the guidance of faculty mentors and presented at the University Research Fair.
Lead Researcher: PolyX University Research Initiative
Duration: December 2023 – April 2024
Led a collaborative research project analyzing large-scale flight pricing datasets to identify cost-effective booking times by season, day, and time.
Developed a predictive model using Python, Pandas, and NumPy; presented findings through Matplotlib visualizations and comprehensive statistical reports.
The project was showcased at the University Project Fair, earning faculty commendations for analytical depth and presentation clarity.
- Data Cleaning & Preprocessing: Removal of outliers, normalization, and categorical encoding
- Feature Engineering: Extraction of relevant predictors to enhance model performance
- Modeling Techniques: Logistic Regression, Random Forest, and Support Vector Machines
- Evaluation Metrics: Accuracy, Precision, Recall, F1-Score, ROC Curve visualization
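
The steps above can be sketched end to end with scikit-learn. This is a minimal illustration on synthetic data, not the project's actual code: the column names (`age`, `avg_glucose_level`, `bmi`, `smoking_status`, `stroke`), the data generator, and the 1st/99th percentile clipping bounds are all assumptions made for the example. Outliers are clipped rather than removed, and "normalization" is interpreted here as standardization with `StandardScaler`.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in data; the column names are hypothetical, not the project's schema.
rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({
    "age": rng.integers(20, 90, n).astype(float),
    "avg_glucose_level": rng.normal(105, 40, n),
    "bmi": rng.normal(28, 6, n),
    "smoking_status": rng.choice(["never", "former", "current"], n),
})
# Make the label depend on age and glucose so the model has some signal to learn.
logit = 0.06 * (df["age"] - 60) + 0.01 * (df["avg_glucose_level"] - 105)
df["stroke"] = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

numeric = ["age", "avg_glucose_level", "bmi"]
categorical = ["smoking_status"]

# Outlier handling (assumed approach): clip each numeric column to its
# 1st-99th percentile range. In a real study the bounds would be estimated
# on the training split only.
for col in numeric:
    lo, hi = df[col].quantile([0.01, 0.99])
    df[col] = df[col].clip(lo, hi)

X = df[numeric + categorical]
y = df["stroke"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scale numeric features and one-hot encode categorical ones.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), numeric),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])
model = Pipeline([
    ("preprocess", preprocess),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred, zero_division=0),
    "recall": recall_score(y_test, y_pred, zero_division=0),
    "f1": f1_score(y_test, y_pred, zero_division=0),
}
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Swapping the `"clf"` step for `RandomForestClassifier` or `SVC` from scikit-learn would reuse the same preprocessing, which is the main reason for wrapping everything in one `Pipeline`.
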
- Statistical Charts: Distribution plots and feature correlation heatmaps
- Performance Graphs: ROC and precision-recall curves for comparative analysis
- Exploratory Dashboards: Interactive views using Matplotlib and Seaborn
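
As a rough sketch of the statistical charts listed above, the following draws a distribution plot and a correlation heatmap with plain Matplotlib. The data is synthetic and the column names are hypothetical; Seaborn's `heatmap` would be a shorter route to the second panel, but Matplotlib alone keeps the example dependency-light.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in data; column names are assumptions for illustration.
rng = np.random.default_rng(1)
n = 500
age = rng.uniform(20, 90, n)
df = pd.DataFrame({
    "age": age,
    "avg_glucose_level": 80 + 0.5 * age + rng.normal(0, 25, n),
    "bmi": rng.normal(28, 6, n),
})

# Pairwise Pearson correlations between the numeric features.
corr = df.corr()

fig, (ax_hist, ax_heat) = plt.subplots(1, 2, figsize=(10, 4))

# Left panel: distribution of a single feature.
ax_hist.hist(df["age"], bins=20, color="steelblue", edgecolor="white")
ax_hist.set(title="Age distribution", xlabel="age", ylabel="count")

# Right panel: correlation heatmap with the coefficient printed in each cell.
im = ax_heat.imshow(corr.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)
ax_heat.set_xticks(range(len(corr.columns)))
ax_heat.set_xticklabels(corr.columns, rotation=45, ha="right")
ax_heat.set_yticks(range(len(corr.columns)))
ax_heat.set_yticklabels(corr.columns)
ax_heat.set_title("Feature correlation")
for i in range(corr.shape[0]):
    for j in range(corr.shape[1]):
        ax_heat.text(j, i, f"{corr.iloc[i, j]:.2f}", ha="center", va="center")
fig.colorbar(im, ax=ax_heat)
fig.tight_layout()
```

In a Jupyter or Colab notebook the figure renders inline; in a script, `fig.savefig(...)` or `plt.show()` would be needed to see it.
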
- Predictive identification of high-risk individuals based on clinical and behavioral factors
- Statistical interpretation of model weights for actionable insights
- Scalable workflow adaptable to healthcare and public safety domains
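
One common way to read model weights statistically is to convert logistic regression coefficients into odds ratios. The sketch below assumes that interpretation; the source does not say which method the project used. The data is synthetic and the feature names are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data; feature names are assumptions for illustration.
rng = np.random.default_rng(2)
n = 2000
X = pd.DataFrame({
    "age": rng.uniform(20, 90, n),
    "avg_glucose_level": rng.normal(105, 40, n),
    "bmi": rng.normal(28, 6, n),
})
# Risk rises with age and glucose; bmi has no effect by construction.
logit = 0.07 * (X["age"] - 60) + 0.012 * (X["avg_glucose_level"] - 105)
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

# Standardize so coefficients are comparable across features.
X_scaled = StandardScaler().fit_transform(X)
clf = LogisticRegression(max_iter=1000).fit(X_scaled, y)

# exp(coefficient) is the multiplicative change in the odds of the positive
# class for a one-standard-deviation increase in that feature.
weights = pd.DataFrame({
    "feature": X.columns,
    "coefficient": clf.coef_[0],
    "odds_ratio": np.exp(clf.coef_[0]),
}).sort_values("odds_ratio", ascending=False, ignore_index=True)
print(weights.to_string(index=False))
```

An odds ratio above 1 marks a feature associated with higher predicted risk, below 1 with lower risk. Note that scikit-learn's `LogisticRegression` applies L2 regularization by default, which shrinks the coefficients somewhat toward zero.
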
| Category | Tools |
|---|---|
| Languages | Python |
| Libraries | Pandas, NumPy, Matplotlib, Scikit-learn |
| Environment | Jupyter Notebook, Google Colab |
| Version Control | Git, GitHub |
| Visualization | Matplotlib, Seaborn |
- Correlation matrix visualization for key features
- ROC curve comparison between Logistic Regression and Random Forest models
- Feature importance ranking to identify key risk predictors
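
A comparison like the one listed above could be produced along these lines. This is a hedged sketch on synthetic data with hypothetical feature names, and it does not reproduce the project's reported results; the hyperparameters are illustrative choices.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data; feature names are assumptions for illustration.
rng = np.random.default_rng(3)
n = 2000
X = pd.DataFrame({
    "age": rng.uniform(20, 90, n),
    "avg_glucose_level": rng.normal(105, 40, n),
    "bmi": rng.normal(28, 6, n),
})
logit = 0.07 * (X["age"] - 60) + 0.012 * (X["avg_glucose_level"] - 105)
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

# Fit each model and collect its ROC curve points and AUC on the held-out split.
roc = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores = model.predict_proba(X_test)[:, 1]
    fpr, tpr, _ = roc_curve(y_test, scores)
    roc[name] = {"fpr": fpr, "tpr": tpr, "auc": roc_auc_score(y_test, scores)}
    print(f"{name}: AUC = {roc[name]['auc']:.3f}")

# Impurity-based feature importances from the random forest, highest first.
importances = pd.Series(
    models["Random Forest"].feature_importances_, index=X.columns
).sort_values(ascending=False)
print(importances)
```

Plotting each `roc[name]["fpr"]` against `roc[name]["tpr"]` on shared axes, with a diagonal reference line, gives the ROC comparison chart. Impurity-based importances can favor high-cardinality features, so `sklearn.inspection.permutation_importance` is a reasonable cross-check.
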
- Clone the repository:

      git clone https://github.com/nathanslee/Stroke-Predictor.git
      cd Stroke-Predictor