A financial model estimating annual disposable income (DI) for US households as a function of salary, age, and household composition.
Disposable income is decomposed into three components:
DI = After-Tax Income − Essential Spending − Irregular/Shock Costs
After-tax income comes from a ground-truth bracket-by-bracket computation using IRS 2025 tax brackets, FICA (Social Security + Medicare), and the Child Tax Credit. A smooth log-fit approximation is also calibrated via OLS regression for use in closed-form analysis:
τ(S) = 0.0693 · ln(S/1000) − 0.1104 (max residual: 1.09 percentage points)
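As a rough illustration of the two tax paths, the sketch below computes a bracket-by-bracket effective rate and the log-fit rate side by side. It is not the code in `tax.py`: the function names are hypothetical, it assumes a single filer taking the standard deduction with no Child Tax Credit, it reads τ(S) as an effective rate on gross salary, and the bracket and FICA figures are the 2025 values as understood here and should be checked against the cited IRS and SSA sources.

```python
import math

# Assumed 2025 single-filer figures; verify against IRS Rev. Proc. 2024-61 and SSA.
STANDARD_DEDUCTION = 15_000
BRACKETS = [  # (upper bound of taxable income, marginal rate)
    (11_925, 0.10),
    (48_475, 0.12),
    (103_350, 0.22),
    (197_300, 0.24),
    (250_525, 0.32),
    (626_350, 0.35),
    (math.inf, 0.37),
]
SS_RATE, SS_WAGE_BASE = 0.062, 176_100
MEDICARE_RATE = 0.0145
ADDL_MEDICARE_RATE, ADDL_MEDICARE_THRESHOLD = 0.009, 200_000


def federal_income_tax(salary: float) -> float:
    """Walk the brackets, taxing each slice of taxable income at its own rate."""
    taxable = max(0.0, salary - STANDARD_DEDUCTION)
    tax, lower = 0.0, 0.0
    for upper, rate in BRACKETS:
        if taxable <= lower:
            break
        tax += (min(taxable, upper) - lower) * rate
        lower = upper
    return tax


def fica(salary: float) -> float:
    """Employee-side Social Security (capped) plus Medicare (with surtax)."""
    social_security = SS_RATE * min(salary, SS_WAGE_BASE)
    medicare = MEDICARE_RATE * salary
    medicare += ADDL_MEDICARE_RATE * max(0.0, salary - ADDL_MEDICARE_THRESHOLD)
    return social_security + medicare


def bracket_rate(salary: float) -> float:
    """Effective rate (income tax + FICA) as a fraction of gross salary."""
    return (federal_income_tax(salary) + fica(salary)) / salary


def logfit_rate(salary: float) -> float:
    """Smooth approximation quoted above: 0.0693 * ln(S/1000) - 0.1104."""
    return 0.0693 * math.log(salary / 1000.0) - 0.1104


# Print both rates for a few salaries to eyeball the residual.
for s in (30_000, 65_000, 150_000):
    print(s, round(bracket_rate(s), 4), round(logfit_rate(s), 4))
```

The residuals printed here will not necessarily match the 1.09 percentage point figure, since the calibration range and filing assumptions behind that number are not restated in this section.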
Essential spending uses an income-dependent exponential share function, scaled by household size (OECD equivalence scale) and age (logistic sigmoid to model rising healthcare costs):
s_ess(S) = 0.35 + 0.60 · exp(−S / 60,000)
Falls from ~78% of income at $20k to ~37% at $200k, consistent with BLS Consumer Expenditure Survey data.
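The share function is easy to check numerically. The sketch below evaluates it at the two salaries quoted above and also includes the standard OECD-modified equivalence weights (1.0 for the first adult, 0.5 for each additional adult, 0.3 for each child). The function names are hypothetical, and how the model combines the scale and the age sigmoid with the share is not specified in this section, so that step is left out.

```python
import math


def essential_share(salary: float) -> float:
    """Base essential-spending share: 0.35 + 0.60 * exp(-S / 60,000)."""
    return 0.35 + 0.60 * math.exp(-salary / 60_000)


def oecd_modified_scale(n_adults: int, n_children: int) -> float:
    """OECD-modified equivalence scale for a household."""
    if n_adults < 1:
        raise ValueError("a household needs at least one adult")
    return 1.0 + 0.5 * (n_adults - 1) + 0.3 * n_children


# Share at the two salaries quoted in the text.
for s in (20_000, 200_000):
    print(s, round(essential_share(s), 3))

# Equivalence weight for two adults and two children.
print(oecd_modified_scale(2, 2))
```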
Irregular/shock costs are modelled as a compound Poisson-lognormal process: N ~ Poisson(λ_m) shock events per month, each with LogNormal severity (CV = 1.0 for heavy tails). The calibration is based on the Federal Reserve SHED finding that emergency expenses average ~5% of after-tax income annually.
| Profile | Salary | Expected DI | P(DI < 0) |
|---|---|---|---|
| Young single, low income | $30k | $4.5k | 0.4% |
| Young single, mid income | $55k | $12.2k | 0.1% |
| Single parent (1 child) | $50k | −$4.6k | 99.3% |
| Mid-career couple | $80k | +$0.3k | 40.9% |
| Family 2 adults + 2 children | $75k | −$20.2k | 100% |
| High earner, single | $150k | $48.4k | 0% |
The mid-career couple at $80k sits at a critical threshold: expected DI is barely positive, yet there is a roughly 41% chance of ending the year in deficit.
The most honest takeaway from this project is that we went too deep, too fast.
The PRD asked for a model that estimates disposable income given salary and household demographics. Fairly contained scope. Deterministic expected values with an optional stochastic extension.
We quickly found ourselves building:
- A 6-category monthly spending breakdown (housing, utilities, food, transport, healthcare, discretionary) with independent seasonal multipliers and per-category noise parameters
- Child-specific seasonal cost arrays — a full 12-month array for school supplies, summer camps, holiday gifts — that we essentially made up, without any explicit empirical citations to defend them
- A biweekly payroll simulation with 3-check months (July and December) because that's how US payroll actually works
- Credit card debt accrual at 22.99% APR for months when cash flow goes negative — which wasn't in the PRD at all
- Seasonal Poisson λ for irregular shocks (higher in December/January for winter emergencies, lower in spring)
- A full MATLAB port of everything
- 8 publication-quality figures and 90+ unit tests
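To make the credit card item in the list above concrete, here is a small sketch of debt accrual at 22.99% APR. It is not the logic in `disposable_income.py`: the function name is hypothetical, and it assumes interest compounds monthly at APR/12, that deficits are charged to the card in full, and that any surplus first pays down the balance.

```python
def accrue_card_debt(monthly_cash_flows, apr=0.2299):
    """Return the end-of-month card balance for each month.

    Assumptions: monthly compounding at apr / 12, interest applied to the
    opening balance, deficits added to the balance, surpluses used to repay it.
    """
    monthly_rate = apr / 12
    debt = 0.0
    history = []
    for flow in monthly_cash_flows:
        debt *= 1.0 + monthly_rate      # interest on the carried balance
        if flow < 0:
            debt += -flow               # shortfall goes on the card
        else:
            debt -= min(flow, debt)     # surplus repays debt, never below zero
        history.append(debt)
    return history


# One deficit month, one break-even month, then a partial repayment.
print([round(b, 2) for b in accrue_card_debt([-1000, 0, 500])])
```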
By the time we had all of this working, we'd turned a modelling exercise into an engineering project. Every parameter felt load-bearing. The child seasonal array had specific percentages for each month — 5% in January, 30% in August for back-to-school — but if a judge asked "where does the 30% come from?", the honest answer was "it felt right."
The model was genuinely richer and more realistic than the baseline requirements. The Monte Carlo converged correctly. The tests passed. But we'd raised the stakes on ourselves:
- More parameters = more surface area for criticism
- Author-defined seasonal values without formal citations = harder to defend
- Monthly cash flow tracking with CC debt = useful feature, but not what was asked for, and adds logic that could diverge from how the judges expected the model to behave
The PRD already specified the key formulas. A clean solution was:
- Implement the tax function accurately (IRS brackets + FICA)
- Fit a smooth approximation
- Use the BLS essential share formula as written
- Add a simple annual lognormal multiplier for stochastic variation
- Report expected DI and P(DI < 0) for a set of profiles
That's probably 300–400 lines of Python, no MATLAB port, no seasonal arrays, no credit card debt. And it would have been fully defensible because every number traces back to a cited source.
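The simpler pipeline described above can be sketched in a few dozen lines. This is an illustration under stated assumptions, not the project's code and not a reproduction of the results table: it uses the log-fit tax rate instead of the bracket computation, applies the essential share and a flat 5% shock share to after-tax income, ignores household size and age, and applies a single mean-one lognormal multiplier with a made-up `sigma=0.10` to total costs. All function names are hypothetical.

```python
import math
import random


def logfit_tax_rate(salary: float) -> float:
    """Smooth tax approximation from the document, floored at zero."""
    return max(0.0, 0.0693 * math.log(salary / 1000.0) - 0.1104)


def essential_share(salary: float) -> float:
    """Essential-spending share from the document."""
    return 0.35 + 0.60 * math.exp(-salary / 60_000)


def simulate_di(salary: float, n_sim: int = 10_000, sigma: float = 0.10,
                shock_share: float = 0.05, seed: int = 0) -> dict:
    """Estimate expected DI and P(DI < 0) with an annual lognormal multiplier."""
    rng = random.Random(seed)
    after_tax = salary * (1.0 - logfit_tax_rate(salary))
    # Assumption: both shares are fractions of after-tax income.
    expected_costs = (essential_share(salary) + shock_share) * after_tax
    mu = -0.5 * sigma ** 2  # makes the multiplier mean-one
    draws = [after_tax - expected_costs * rng.lognormvariate(mu, sigma)
             for _ in range(n_sim)]
    return {
        "expected_di": sum(draws) / n_sim,
        "p_deficit": sum(d < 0 for d in draws) / n_sim,
    }


print(simulate_di(150_000))
```

Every number in this version is either quoted from the formulas above or flagged as an assumption, which is the property the simpler design was meant to have.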
Sophistication without citation is a liability in a competition context. The more we extended the model, the more we owned the numbers — and the harder it became to say "this comes from the data." Building realism into a model is good. Building in realism that you can't trace to a primary source in a graded context is a risk.
The model works. But if we were doing it again, we'd get the core right first, validate it against the PRD, and only then ask whether extensions were worth the added exposure.
.
├── config.py # All parameters (IRS brackets, BLS shares, seasonal multipliers)
├── tax.py # Tax computation: ground-truth bracket + smooth log-fit
├── calibration.py # OLS regression to fit smooth tax approximation
├── essentials.py # Essential spending: deterministic + stochastic monthly model
├── irregular.py # Irregular costs: Poisson-lognormal with seasonal λ
├── disposable_income.py # Combined DI model, monthly cash flow, CC debt accrual
├── visualizations.py # 8 publication-quality figures
├── tests/
│ ├── test_tax.py # 18 tests: brackets, CTC, residuals
│ ├── test_essentials.py # 16 tests: share bounds, MC convergence, age factor
│ ├── test_irregular.py # 13 tests: monthly λ, distribution statistics
│ ├── test_combined.py # 5 tests: cash flow format, 3-paycheck months, CC debt
│ ├── test_edge_cases.py # 34 parametrized tests: minimum wage, $1M, extreme households
│ └── test_sensitivity.py # 4 tests: σ_v, λ, CV, w_fix sweeps
├── matlab/ # MATLAB port (feature-parity with Python)
├── M3-Q1-PRD.md # Problem requirements and mathematical specification
└── M3-Q1-Report.md # Technical report with calibration results and validation tables
from disposable_income import compute_disposable_income
# Single person, age 35, $65k salary
result = compute_disposable_income(salary=65_000, age=35, n_adults=1, n_children=0)
print(result)
# With Monte Carlo simulation (1000 samples)
result = compute_disposable_income(salary=65_000, age=35, n_adults=1, n_children=0, n_sim=1000)

| Component | Source |
|---|---|
| Tax brackets & standard deduction | IRS Rev. Proc. 2024-61 (2025 tables) |
| FICA rates and caps | SSA 2025 ($176,100 SS wage base) |
| Essential spending shares | BLS Consumer Expenditure Survey 2022–2023 |
| Emergency cost frequency | Federal Reserve SHED 2023 |
| Age-healthcare cost relationship | CMS National Health Expenditure Data |
| Household equivalence scaling | OECD-modified equivalence scale |