gogoharrison/README.md


Gogo Isaac Harrison

Data Scientist · Machine Learning Engineer · Analytics

LinkedIn · Portfolio · Medium · Email


About Me

I'm a Data Scientist and Machine Learning Engineer focused on turning complex datasets into decisions that drive measurable business value. I build end-to-end ML systems — from raw data pipelines to deployed models — with a strong emphasis on analytical rigour and practical impact.

I'm currently deepening my expertise in data engineering, MLOps, and scalable data pipelines.


Core Skills

| Area | Tools & Technologies |
| --- | --- |
| Languages | Python, SQL |
| Machine Learning | scikit-learn, XGBoost, K-Means, feature engineering, model evaluation |
| Deep Learning | TensorFlow / Keras, Neural Networks, NLP |
| Data Engineering | PySpark, Apache Airflow, ETL pipelines, Docker |
| Data & Analytics | Pandas, NumPy, statistical modelling, EDA |
| Visualisation | Matplotlib, Seaborn, Plotly, Metabase |
| Cloud & Infra | Azure, PostgreSQL, Docker Compose |
| Explainability | SHAP, model interpretability |

Featured Projects

⚙️ Spark ETL Movie Analytics Pipeline

PySpark · Apache Airflow · PostgreSQL · Docker · Python

Built a production-grade, end-to-end data engineering pipeline processing 33.8 million MovieLens ratings. Orchestrated a 15-task Airflow DAG with MD5 hash-based change detection that cut weekly run time from 90 minutes to 2 minutes on unchanged data. Outputs a PostgreSQL star schema powering Metabase dashboards — running on a fully automated weekly schedule at zero cloud cost.
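
The hash-based change detection mentioned above can be sketched roughly as follows. This is a minimal illustration and not the pipeline's actual code: `file_md5`, `has_changed`, and the sidecar state file are hypothetical names, and the assumption is that a stored digest is compared with the current one before the expensive Spark tasks run.

```python
import hashlib
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def has_changed(data_path, state_path):
    """Compare the current digest with the one saved by the previous run.

    Hypothetical helper: returns True and records the new digest when the
    data differs (or no digest was saved yet), otherwise returns False.
    """
    current = file_md5(data_path)
    state = Path(state_path)
    previous = state.read_text().strip() if state.exists() else None
    if current == previous:
        return False
    state.write_text(current)
    return True
```

In an Airflow DAG, a check like this could gate the downstream tasks, for example through a `ShortCircuitOperator`, so that an unchanged input skips the heavy work. Whether the project wires it up this way is an assumption.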

→ View Repository


🧠 Sentiment Analysis with Neural Networks

TensorFlow · Keras · NLP · Python

Developed an end-to-end deep learning classifier to categorise e-commerce customer reviews as positive or negative. Achieved ~91% validation accuracy using a TextVectorization + Embedding + Dense architecture trained on 11,158 labelled reviews. Demonstrated full ML lifecycle from preprocessing through inference, with a foundation suitable for real-time customer feedback monitoring.

→ View Repository


👥 Customer Segmentation Using K-Means Clustering

scikit-learn · K-Means · Python · EDA

Applied unsupervised machine learning to segment retail customers by purchasing behaviour and demographics. Used the elbow method and silhouette scoring (0.55) to identify 5 well-defined customer clusters, delivering actionable targeting strategies for high-income/low-spend vs. low-income/high-spend segments.
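
For readers unfamiliar with the silhouette score quoted above, here is a dependency-free sketch of how the metric is defined. The project itself lists scikit-learn, which provides its own `silhouette_score`; the function name `mean_silhouette` and the toy points below are illustrative assumptions, not project code or data.

```python
from math import dist
from statistics import mean


def mean_silhouette(points, labels):
    """Mean silhouette coefficient over all points, using Euclidean distance."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the point's zero distance to itself adds nothing to the sum)
        a = sum(dist(point, other) for other in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            mean(dist(point, other) for other in members)
            for other_label, members in clusters.items()
            if other_label != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return mean(scores)


# Toy example: two tight, well-separated groups score close to 1.
toy_points = [(0, 0), (0, 1), (10, 0), (10, 1)]
toy_score = mean_silhouette(toy_points, [0, 0, 1, 1])
```

Scores range from -1 to 1, with higher values indicating tighter, better-separated clusters.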

→ View Repository


🏗️ Building Energy Efficiency — Load Prediction & Explainability

scikit-learn · SHAP · Linear Regression · Python

Modelled heating and cooling energy demand from architectural design parameters. Achieved R² ~0.92 for heating load and ~0.89 for cooling load using interpretable linear regression. Applied SHAP analysis to confirm physical drivers (compactness, glazing area), producing stakeholder-ready, explainable outputs for early-stage design decisions.
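
As a reminder of what the R² figures above measure, the sketch below fits a one-feature least-squares line and computes the coefficient of determination. It is a simplified stand-in: the project uses several design parameters and scikit-learn, while `fit_line`, `r_squared`, and the toy numbers here are hypothetical and chosen only for illustration.

```python
def fit_line(x, y):
    """Ordinary least squares for a single feature; returns (slope, intercept)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot: share of variance explained by the predictions."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up toy data (not from the project): one feature, one target.
feature = [1.0, 2.0, 3.0, 4.0]
target = [2.1, 3.9, 6.2, 7.8]
slope, intercept = fit_line(feature, target)
predictions = [slope * v + intercept for v in feature]
score = r_squared(target, predictions)
```

An R² of 1 means the predictions match the targets exactly, while 0 means the model does no better than always predicting the mean.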

→ View Repository


Open To

I'm actively looking for opportunities in:

  • Data Science — predictive modelling, experimentation, statistical analysis
  • Machine Learning Engineering — model development and deployment
  • Data Engineering — pipeline design, orchestration, scalable data systems
  • Analytics & Decision Science — business intelligence, insight generation

If you're building data-driven products and need someone who bridges strong analytical thinking with hands-on implementation, let's talk.

📧 gogoharrison66@gmail.com


Open to remote, hybrid, and on-site roles globally.

Pinned

  1. brazilian-ecommerce-analytics (Public)

    End-to-end data science project analyzing 1.5M+ Brazilian e-commerce transactions. Features ML models (Random Forest, XGBoost) for delivery prediction, churn modeling & lead scoring, interactive Da…

    Jupyter Notebook

  2. sentiment-analysis-system (Public)

    The aim of this project is to build an end-to-end sentiment analysis solution that can extract, preprocess, analyze, and visualize customer sentiment from textual reviews. This solution will provid…

    Jupyter Notebook

  3. sales-funnel-analysis (Public)

    E-commerce funnel analysis from MQL to delivery, integrating marketing, sales, and fulfillment data. Reveals 5.4x order expansion per seller with 99% delivery success. Identifies channel performanc…

  4. discount-impact-analysis (Public)

    This project analyzes the impact of a 15% discount campaign on sales, profitability, and customer behavior. The goal is to determine whether the discount increased revenue, attracted new customers,…

    Python 1

  5. Heart-Disease-Prediction (Public)

    This project aims to build a machine-learning model that can predict the likelihood of a person having a heart disease based on some features.

    Jupyter Notebook

  6. global-suicide-rate-analytics (Public)

    A longitudinal study (1987–2016) using Tableau to analyze global suicide patterns against GDP and demographics. Features interactive dashboards, statistical correlation, and trend forecasting to id…