Skip to content

Latest commit

 

History

History

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

README.md

Analytics SQL: Production Problem Set

A curated set of SQL problems drawn from the kinds of production questions analytics teams at large-scale consumer tech companies actually answer. Useful whether you're preparing for an analytics interview or assembling a question bank to run one.

Each file is self-contained:

  1. A concrete business scenario naming a team and the decision at stake
  2. The formal prompt
  3. The schema
  4. The solution (PostgreSQL syntax by default)
  5. Notes on approach, pitfalls, and Spark SQL / other dialect differences

Problems

Core (window-function fundamentals)

# Problem Concepts
01 Nth highest salary Window functions, DENSE_RANK
02 Session reconstruction Gaps-and-islands, LAG, window sums
03 Funnel conversion Conditional aggregation, self-joins
04 Rolling 7-day retention Date arithmetic, LEFT JOIN with window
05 Median without percentile_cont Window functions, NTILE tricks
06 Top N per group Partitioned window ranking, DENSE_RANK vs ROW_NUMBER
07 Month-over-month growth LAG, NULLIF, calendar-spine pitfalls
08 Longest active streak Gaps-and-islands (date - row_number trick)
09 30-day rolling MAU Distinct-count-over-window, HLL approximation tradeoff
10 Cohort retention table Cohort bucketing, conditional aggregation, pivoted output

Advanced

# Problem Concepts
11 Last-touch attribution As-of join, LEFT JOIN + window ROW_NUMBER, default bucket
12 Two-step funnel with timing MIN per step, chained interval checks, ordering bugs
13 Current active streak Gaps-and-islands, take the last island only
14 Percentile distribution by group PERCENTILE_CONT vs DISC, small-n reliability flag
15 Recursive manager hierarchy Recursive CTE, cycle defence, up-vs-down traversal
16 Co-purchase product pairs Self-join with <, anti-double-count, lift extension
17 Top 3-step user paths LAG(..., k), users vs occurrences distinction
18 Event deduplication Burst-collapse gaps-and-islands, transitive-window subtlety

Files are intentionally short — roughly 2–3 KB each and readable top-to-bottom. The "pitfalls" and "dialect notes" sections are the point: they capture where each pattern generalises and where it quietly breaks in production.