Skip to content

wavde/product-analytics-deepdive

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

9 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Product Analytics Deep-Dive

End-to-end product analytics case studies: funnels, retention cohorts, engagement segmentation, and north-star metric design. SQL-first, memo-driven.

CI

hero

A small set of case studies covering the analytics work that sits behind product decisions: defining the metric, writing the query, reading the cohort, and writing the memo. Each case is framed around a concrete business question rather than a method in isolation.

Intended audience: analysts and data scientists who want a reference for shape-of-the-answer product work, and hiring managers or interviewers who want to see SQL and memo output side by side.

Case studies

# Case study Primary concepts Status
01 Funnel and retention on simulated e-commerce Conversion funnel, cohort retention, north-star framing Complete
02 Cohort engagement and segmentation RFM, L-28 power-user curve, k-means Complete
03 North-star metric design memo Metric tradeoffs, WAU × conversion decomposition, gaming risk Complete

Each case includes:

  1. A business question framed as a PM / DS partnership would frame it.
  2. An explicit metric definition with the tradeoffs stated.
  3. SQL (Postgres, Spark SQL, DuckDB) as the primary tool.
  4. Python where SQL does not express it cleanly (bootstrap intervals, charts).
  5. A memo that would survive an interview take-home review.

Stack

  • DuckDB as the portable SQL engine (window functions, runs locally).
  • Spark SQL for scale patterns, with dialect notes where it differs.
  • Python: pandas, numpy, matplotlib for analysis and charts.
  • experiment-toolkit where the analytics layer needs CUPED, sequential tests, or staggered DiD.

How to run

python -m venv .venv
.venv\Scripts\activate          # Windows
pip install -r requirements.txt
pytest                          # smoke tests; regenerates data and validates queries

Each case study has its own README and reproducer.

Out of scope

No production pipeline, no orchestration framework, no warehouse modelling guidance beyond what the queries demonstrate. The datasets are simulated; they are structured to behave realistically, not to represent any specific company's data.

License

MIT. See LICENSE.

About

End-to-end product analytics case studies: funnel analysis, cohort retention, engagement segmentation, north-star metric framework. SQL-first (DuckDB + Spark SQL), with a practitioner decision lens.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages