A curated set of SQL problems drawn from the kinds of production questions analytics teams at large-scale consumer tech companies actually answer. Useful whether you're preparing for an analytics interview or assembling a question bank to run one.
Each file is self-contained:
- A concrete business scenario naming a team and the decision at stake
- The formal prompt
- The schema
- The solution (PostgreSQL syntax by default)
- Notes on approach, pitfalls, and Spark SQL / other dialect differences
| # | Problem | Concepts |
|---|---|---|
| 01 | Nth highest salary | Window functions, DENSE_RANK |
| 02 | Session reconstruction | Gaps-and-islands, LAG, window sums |
| 03 | Funnel conversion | Conditional aggregation, self-joins |
| 04 | Rolling 7-day retention | Date arithmetic, LEFT JOIN with window |
| 05 | Median without percentile_cont | Window functions, NTILE tricks |
| 06 | Top N per group | Partitioned window ranking, DENSE_RANK vs ROW_NUMBER |
| 07 | Month-over-month growth | LAG, NULLIF, calendar-spine pitfalls |
| 08 | Longest active streak | Gaps-and-islands (date - row_number trick) |
| 09 | 30-day rolling MAU | Distinct-count-over-window, HLL approximation tradeoff |
| 10 | Cohort retention table | Cohort bucketing, conditional aggregation, pivoted output |
| # | Problem | Concepts |
|---|---|---|
| 11 | Last-touch attribution | As-of join, LEFT JOIN + window ROW_NUMBER, default bucket |
| 12 | Two-step funnel with timing | MIN per step, chained interval checks, ordering bugs |
| 13 | Current active streak | Gaps-and-islands, take the last island only |
| 14 | Percentile distribution by group | PERCENTILE_CONT vs DISC, small-n reliability flag |
| 15 | Recursive manager hierarchy | Recursive CTE, cycle defence, up-vs-down traversal |
| 16 | Co-purchase product pairs | Self-join with <, anti-double-count, lift extension |
| 17 | Top 3-step user paths | LAG(..., k), users vs occurrences distinction |
| 18 | Event deduplication | Burst-collapse gaps-and-islands, transitive-window subtlety |
Files are intentionally short — roughly 2–3 KB each and readable top-to-bottom. The "pitfalls" and "dialect notes" sections are the point: they capture where each pattern generalises and where it quietly breaks in production.