Data Science & AI Professional | Machine Learning Enthusiast | Analytical Storyteller | Pennsylvania State University
I specialize in transforming complex, multi-source data into actionable insights and predictive models. My work bridges advanced mathematical theory (Graph Theory) and real-world applications (Sports Analytics & Urban Safety), and pairs technical depth with clear stakeholder communication.
- Status: Selected as a top-tier instructional guide for beginners.
- The Goal: To democratize data visualization by guiding novice analysts through the end-to-end process of building, filtering, and publishing interactive dashboards.
- Key Skill: Technical Communication. I successfully translated high-level Tableau operations (Data Pane, Marks Card, Dashboard Actions) into a step-by-step pedagogical framework.
- Tech: Python, XGBoost, Scikit-Learn, Pandas.
- The Problem: Improving match outcome forecasts by combining team-level, player-level, and betting market data.
- Insight: Applied Entity Resolution to harmonize naming conventions across three disparate databases, and engineered Rolling Form features that raised the model's ROC-AUC to 0.78.
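
A minimal sketch of the two ideas in this project, not the project's actual code. It assumes team names are harmonized through a small alias table and that "rolling form" means a team's average points over its previous few matches. The `ALIASES` table, the function names, and the sample matches are all hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical alias table: maps source-specific spellings to one canonical name.
ALIASES = {
    "man utd": "Manchester United",
    "manchester utd": "Manchester United",
    "manchester united fc": "Manchester United",
}

def canonical_name(raw: str) -> str:
    """Return the canonical team name, or the stripped input if the spelling is unknown."""
    key = " ".join(raw.lower().split())  # lowercase and collapse whitespace
    return ALIASES.get(key, raw.strip())

def rolling_form(matches, window=3):
    """Mean points over each team's previous `window` matches.

    `matches` is a chronologically ordered list of (team, points) pairs.
    The current match is excluded from its own feature to avoid leakage;
    a team with no history gets None.
    """
    history = defaultdict(lambda: deque(maxlen=window))
    features = []
    for team, points in matches:
        past = history[canonical_name(team)]
        features.append(sum(past) / len(past) if past else None)
        past.append(points)  # update history only after computing the feature
    return features

# Illustrative data: three spellings of the same team across sources.
matches = [("Man Utd", 3), ("Manchester Utd", 1), ("manchester united fc", 0), ("Man Utd", 3)]
form = rolling_form(matches, window=2)
print(form)
```

Computing the feature before appending the current result keeps the match outcome out of its own predictor, which matters when the feature feeds a forecasting model.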
- Tech: Tableau, Audience Analysis, Iterative Prototyping.
- The Problem: Providing new residents with data-driven safety insights to inform housing and commuting decisions.
- Insight: Implemented a full UX/UI workflow, from audience persona definition to iterative validation. Discovered a distinct "Noon Spike" in crime frequency through temporal analysis.
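
The temporal analysis above was built in Tableau. As a rough Python equivalent, the sketch below groups incident timestamps by hour of day, which is the kind of aggregation that would surface a midday peak. The function name and the sample timestamps are made up for illustration and are not project data.

```python
from collections import Counter
from datetime import datetime

def incidents_by_hour(timestamps):
    """Count incidents per hour of day (index 0-23) from ISO 8601 timestamp strings."""
    counts = Counter(datetime.fromisoformat(ts).hour for ts in timestamps)
    return [counts.get(hour, 0) for hour in range(24)]

# Made-up timestamps, for illustration only.
sample = [
    "2023-01-05T12:10:00",
    "2023-01-05T12:45:00",
    "2023-01-06T23:30:00",
    "2023-01-07T12:05:00",
]
hourly = incidents_by_hour(sample)
peak_hour = max(range(24), key=hourly.__getitem__)  # hour with the most incidents
print(peak_hour, hourly[peak_hour])
```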
- Tech: Discrete Mathematics, Dijkstra's Algorithm, A*.
- The Problem: Analyzing the mathematical optimization behind Google Maps' routing and traffic forecasting.
- Insight: Bridged abstract math with production software. I learned to translate real-world constraints (traffic/weather) into mathematical weights in a dynamic graph.
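
One way to picture "constraints as weights" is a sketch of Dijkstra's algorithm in which each edge's base travel time is scaled by a condition multiplier. This is an illustrative toy, not how Google Maps is implemented. The graph, the `multipliers` mapping, and the function name are hypothetical, and multipliers are assumed to be positive so that edge weights stay non-negative.

```python
import heapq

def shortest_time(graph, source, target, multipliers=None):
    """Dijkstra's algorithm over edge weights scaled by current conditions.

    graph: {node: [(neighbor, base_minutes), ...]}
    multipliers: {(u, v): factor}, e.g. 3.0 for heavy traffic on edge u -> v.
    Returns the minimum travel time, or infinity if target is unreachable.
    """
    multipliers = multipliers or {}
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry; a shorter path was already found
        for neighbor, base in graph.get(node, []):
            weight = base * multipliers.get((node, neighbor), 1.0)
            candidate = dist + weight
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return float("inf")

# Toy road network with base travel times in minutes.
graph = {"A": [("B", 5), ("C", 2)], "C": [("B", 2)], "B": []}
free_flow = shortest_time(graph, "A", "B")                     # route A -> C -> B
congested = shortest_time(graph, "A", "B", {("C", "B"): 3.0})  # route A -> B
print(free_flow, congested)
```

With no congestion the detour through C is faster. Once the C to B edge is slowed threefold, the direct edge wins, so the best route changes even though the graph's structure does not.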
- Tech: Climate Data Synthesis, GIS Mapping, Socio-Ecological Modeling.
- The Problem: Assessing the health of Lake Baikal by synthesizing 40 years of climate trends with social policy.
- Insight: Developed Systems Thinking skills, analyzing how industrial history and overtourism interact with ecological stressors like surface water warming.
- Languages: Python (Pandas, NumPy, Scikit-Learn), R (R Markdown, Tidyverse), SQL.
- AI/ML: XGBoost, Random Forest, NLP (Intent Classification), Neural Networks.
- Visualization: Tableau (Advanced, Interactive Dashboards, Storyboarding), Matplotlib, Seaborn.
- Tools: Git/GitHub, Docker, Google Colab, VS Code, Excel.
- Specialties: Data Fusion, Graph Theory, AI Ethics & Fairness, Root Cause Analysis.
- Email: moulikjain04@gmail.com
- LinkedIn: Moulik Jain
"I don't just build models; I build narratives that make data understandable and ethical."