Recruiting funnel performance, conversion analysis, and diversity impact monitoring. End-to-end analytics a People Analytics team delivers to the CHRO every quarter: stage-to-stage conversion, source effectiveness, time-to-fill, and statistically rigorous demographic pass-rate monitoring.
Live dashboard → (no install required)
Recruiting is one of the largest items in any employer's HR operating budget, but the quality of that spend is rarely measured. Most TA organizations track applications, interviews, and hires — not the ratios between them, the variance across sources, or the demographic differences in pass rates.
This project demonstrates the four analyses that drive real TA strategy decisions:
- Funnel conversion — where in the pipeline do candidates drop?
- Source effectiveness — which channels actually produce hires per dollar spent?
- Time-to-fill — which roles need pipelines built before reqs open?
- Demographic pass-rate analysis — does the funnel treat candidates equally at each gate?
Simulated 12-month hiring funnel: ~50,000 candidate applications across 8 roles, 6 sources, 6 locations.
| Field | Type | Notes |
|---|---|---|
| candidate_id | string | Unique identifier |
| role | categorical | 8 roles: Sales Rep, Software Engineer, ML Engineer, Product Manager, Data Analyst, Security Engineer, Recruiter, HR Generalist |
| location | categorical | 6 locations: NY, SF, Austin, Remote, Seattle, Denver |
| source | categorical | 6 sources: Referral, LinkedIn, Job Board, Company Site, Agency, Event/Conference |
| gender | categorical | F / M / NB (non-binary) |
| is_urm | boolean | URM flag (Black, Hispanic, Other/Multi) |
| applied_date | date | Application date across the 12-month window |
| stage_reached | ordered categorical | Furthest stage reached: applied → screened → phone_screen → onsite → offer → accepted → hired |
| days_in_stage | integer | Days spent at the terminal stage |
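A minimal loading sketch for working with the simulated export. The data/candidates.csv path and exact dtypes are assumptions for illustration, not the simulator's actual output contract:

```python
import pandas as pd

# Funnel stages in order, used to make stage_reached an ordered categorical.
STAGE_ORDER = ["applied", "screened", "phone_screen", "onsite",
               "offer", "accepted", "hired"]

# Path and dtype mapping are illustrative; adjust to wherever src/simulate.py writes its output.
df = pd.read_csv(
    "data/candidates.csv",
    parse_dates=["applied_date"],
    dtype={"candidate_id": "string", "is_urm": "bool"},
)

# Ordered categorical lets comparisons and sorts respect funnel order.
df["stage_reached"] = pd.Categorical(df["stage_reached"], categories=STAGE_ORDER, ordered=True)
```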
Real hiring funnel data is always proprietary. The simulator is calibrated against published industry benchmarks:
- Overall applied → hired conversion: ~1.5% (aligns with Talent Board / Jobvite corporate-role data)
- Source mix: ~15% referrals, ~35% LinkedIn, ~25% job boards (SHRM 2023)
- Phone screen → onsite: ~40–45% pass rate
- Offer → accept: ~80–90% (role dependent)
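One way to encode these benchmarks as simulator parameters. The constant names and exact values below are illustrative (roughly back-solved from the funnel table further down), not the actual constants in src/simulate.py:

```python
# Illustrative calibration constants; the real values live in src/simulate.py.
BASE_STAGE_PASS_RATES = {
    "applied": 0.20,        # resume review -> screened
    "screened": 0.47,       # screened -> phone_screen
    "phone_screen": 0.45,   # phone_screen -> onsite (~40-45% benchmark)
    "onsite": 0.44,         # onsite -> offer
    "offer": 0.89,          # offer -> accepted (~80-90%, role dependent)
    "accepted": 0.96,       # accepted -> hired (started)
}

SOURCE_MIX = {              # top-of-funnel application share (SHRM-style mix)
    "Referral": 0.15, "LinkedIn": 0.35, "Job Board": 0.25,
    "Company Site": 0.12, "Agency": 0.08, "Event/Conference": 0.05,
}

REFERRAL_STAGE_MULTIPLIER = 1.6   # referral pass-rate boost applied at every gate
```

Multiplying the stage rates through gives an applied → hired conversion of roughly 1.6%, in line with the ~1.5% benchmark above.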
| Signal | Effect | Where it shows up |
|---|---|---|
| Referral boost | 1.6x pass-rate multiplier at every stage | Source effectiveness page: referrals show ~40x hire rate vs job boards |
| Role difficulty | ML and Security roles: ~30% lower screened→phone pass rate | Specialized roles show lower overall conversion and longer time-to-fill |
| URM screen bias | 12% lower relative URM pass rate at the phone-screen gate only | Bias detection page: 4-5 percentage point absolute gap, p < 0.001 |
| Stage | Reached | % of applied | Stage pass rate (%) |
|---|---|---|---|
| applied | 50138 | 100.00 | 19.81 |
| screened | 9932 | 19.81 | 46.81 |
| phone_screen | 4649 | 9.27 | 45.13 |
| onsite | 2098 | 4.18 | 43.76 |
| offer | 918 | 1.83 | 88.78 |
| accepted | 815 | 1.63 | 95.83 |
| hired | 781 | 1.56 | — |
What this tells us: the single biggest drop is at the applied→screened gate (80% elimination at resume review). Offer→accept at ~89% is healthy — below 75% would flag a comp-competitiveness problem.
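A sketch of how the table above can be computed from stage_reached with pandas. Column names follow the schema; the helper itself is illustrative, not the exact code in src/analyze.py:

```python
import pandas as pd

STAGE_ORDER = ["applied", "screened", "phone_screen", "onsite",
               "offer", "accepted", "hired"]

def funnel_table(df: pd.DataFrame) -> pd.DataFrame:
    """Count candidates who reached at least each stage, then derive conversion rates."""
    stage_idx = df["stage_reached"].map({s: i for i, s in enumerate(STAGE_ORDER)})
    reached = pd.Series(
        [(stage_idx >= i).sum() for i in range(len(STAGE_ORDER))],
        index=STAGE_ORDER, name="reached",
    )
    out = reached.to_frame()
    out["pct_of_applied"] = 100 * out["reached"] / out["reached"].iloc[0]
    # Pass rate from each stage into the next; the terminal stage has no next gate.
    out["stage_pass_rate_pct"] = 100 * out["reached"].shift(-1) / out["reached"]
    return out.round(2)
```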
Referrals produce hires at a 6.5% rate. Job boards produce hires at a 0.15% rate — a 42x difference.
This is the quantitative backing for every referral bonus program ever run. It's also why TA budget should rarely be dominated by job-board spend: high volume, terrible conversion. The number of organizations that still allocate 60%+ of sourcing budget to Indeed/LinkedIn ads despite this finding is one of the most robust patterns in TA operations.
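A minimal sketch of the per-source hire-rate calculation behind that comparison (assumes the same dataframe as above; not the exact src/analyze.py implementation):

```python
import pandas as pd

def source_effectiveness(df: pd.DataFrame) -> pd.DataFrame:
    """Applications, hires, and hire rate per source, sorted best to worst."""
    grouped = df.groupby("source").agg(
        applications=("candidate_id", "count"),
        hires=("stage_reached", lambda s: (s == "hired").sum()),
    )
    grouped["hire_rate_pct"] = 100 * grouped["hires"] / grouped["applications"]
    return grouped.sort_values("hire_rate_pct", ascending=False)
```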
Specialized roles (ML, Security) show the most variance in fill time. This is the argument for pipelining before the req opens — the roles where waiting to source until a req is approved means 2-3 months of missed productivity.
A two-proportion z-test compares URM and non-URM candidate pass rates at a specific funnel gate. This is the same hypothesis test used in clinical trials for comparing outcome rates between treatment arms — applied here to hiring gates instead of medical endpoints.
The math. For two groups with pass rates p_URM and p_nonURM and sample sizes n_URM and n_nonURM:
p_pooled = (passes_URM + passes_nonURM) / (n_URM + n_nonURM)
standard_error = √( p_pooled × (1 − p_pooled) × (1/n_URM + 1/n_nonURM) )
z = (p_URM − p_nonURM) / standard_error
p_value = 2 × (1 − Φ(|z|)) where Φ is the standard normal CDF
A p-value below 0.05 indicates the observed pass-rate difference is unlikely to have arisen by chance under the null hypothesis of equal pass rates. This is evidence of a systematic difference — not proof of discrimination, but enough to trigger a targeted review.
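The same test in a few lines of Python with scipy. This is a sketch of the approach; src/analyze.py may organize it differently, and the example counts are back-derived from the reported pass rates and sample sizes below, so they are approximate:

```python
import numpy as np
from scipy.stats import norm

def two_proportion_ztest(passes_a: int, n_a: int, passes_b: int, n_b: int):
    """Two-sided two-proportion z-test; returns (z statistic, p-value)."""
    p_a, p_b = passes_a / n_a, passes_b / n_b
    p_pooled = (passes_a + passes_b) / (n_a + n_b)
    se = np.sqrt(p_pooled * (1 - p_pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = 2 * (1 - norm.cdf(abs(z)))
    return z, p_value

# Phone-screen gate, URM vs non-URM (counts derived from the reported rates below).
z, p = two_proportion_ztest(passes_a=1201, n_a=2759, passes_b=3448, n_b=7173)
# z ≈ -4.06, p < 0.001
```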
Recovered result on simulated data (injected 12% relative URM screen bias):
| Metric | Value |
|---|---|
| n_urm | 2,759 |
| n_nonurm | 7,173 |
| pass_rate_urm | 0.4353 |
| pass_rate_nonurm | 0.4807 |
| absolute_gap_pp | 4.54 |
| relative_gap_pct | 9.44 |
| z_stat | -4.06 |
| p_value | < 0.001 |
A 4.5 percentage-point gap at the phone-screen gate, p < 0.001.
Set the injected URM bias to zero and re-run. The recovered gap drops to 0.7 percentage points with p = 0.51 — not statistically distinguishable from zero. The test is not generating false positives, which is the check that matters for any bias-detection methodology.
The stage-by-stage breakdown shows the gap concentrates at phone screen — exactly where the affinity-bias signal was injected. A real audit would use this to target intervention specifically at that gate (structured screen rubric, recruiter calibration, blind screen trial) rather than applying generic "DEI training" to the whole pipeline.
| File | Purpose |
|---|---|
| src/simulate.py | Generates a calibrated 12-month hiring funnel dataset. |
| src/analyze.py | Core analytics: conversion, sources, time-to-fill, bias tests. |
| src/visualize.py | Regenerates every figure in this README. |
| notebooks/01_funnel_analytics.ipynb | Full analyst walkthrough with TA-side interpretation. |
| docs/ | Generated visualizations. |
A multi-page Streamlit app ships with the repo. Try the live version or run it locally.
- Sidebar: funnel parameters (months to simulate, injected URM bias percent, random seed) and drill-down filters (role, location multi-select)
- KPI strip: total applications, total hires, applied→hired rate, offer→accept rate, top role by volume
- Candidate volume bar chart at each stage of the funnel
- Stage-to-stage conversion rates table with green gradient on pass rate
- CSV export of filtered candidate data
Where recruiting investment actually produces hires.
- KPI strip: best source and its hire rate, worst source and its hire rate, best-to-worst multiplier
- Grouped bar chart: conversion at three gates (phone screen, offer, hire) by source
- Source ranking table with green gradient on hire rate
- Volume-vs-effectiveness scatter (Plotly): each source plotted by application volume versus hire rate, with bubble size proportional to volume. The ideal sits top-right (high volume, high hire rate); budget drains sit bottom-right (high volume, low hire rate)
- Dynamic budget reallocation recommendation that names the specific high-yield and low-yield sources based on the current filtered data
- CSV export
The most important page for a People Analytics audience.
- KPI strip: URM pass rate with sample size, non-URM pass rate with sample size, absolute percentage-point gap, p-value (color-coded significant vs not)
- Alert banner: green success if no significant gap detected, red error if gap is statistically significant — includes the relative gap percentage and suggested next-step language
- Stage-by-stage pass rate bar chart split by URM status — shows where in the funnel the divergence occurs (intervention at phone screen is a different operational response than intervention at onsite)
- Stage detail table with red-green heatmap on the gap column
- Phone-screen gap by role: aggregate bias numbers hide variance across hiring managers and screeners. This view identifies which specific roles to focus intervention on rather than applying generic training across the board
- Disclaimer block explaining what the test does and does not mean (evidence of something, not proof of discrimination; could reflect resume quality, recruiter assignment, rubric design, or structured vs unstructured screens)
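A minimal sketch of the alert-banner logic on that page, using standard Streamlit calls. The function name, thresholds, and wording are illustrative, not the app's exact copy:

```python
import streamlit as st

def render_bias_alert(gap_pp: float, relative_gap_pct: float, p_value: float, alpha: float = 0.05):
    """Green banner when the gap is not statistically significant, red when it is."""
    if p_value < alpha:
        st.error(
            f"Statistically significant URM pass-rate gap: {gap_pp:.1f} pp "
            f"({relative_gap_pct:.1f}% relative), p = {p_value:.4f}. "
            "Suggested next step: targeted review of the phone-screen rubric and recruiter calibration."
        )
    else:
        st.success(f"No statistically significant gap detected (p = {p_value:.2f}).")
```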
pip install -r requirements.txt
streamlit run app/streamlit_app.py

Tableau / Power BI users: generate Tableau-ready CSVs:

python -m src.export_tableau   # writes tableau/candidates.csv + aggregated CSVs

Then follow docs/TABLEAU.md for the recipe to build four dashboard views.

python -m src.analyze      # runs all analytics, prints tables
python -m src.visualize    # regenerates the visualizations
jupyter lab notebooks/01_funnel_analytics.ipynb

| Finding | Action |
|---|---|
| Referrals 40x better than job boards | Shift sourcing budget; expand referral bonus program. |
| ML Eng / Security Eng: highest time-to-fill | Build pipelines 30-60 days before the req opens; consider contract-to-hire. |
| URM phone-screen gap of 4.5pp, p<0.001 | Review screen rubric; audit which recruiters drive the gap; pilot structured screens. |
| Offer-accept rate < 80% for any role | Comp benchmark review; geographic premium review. |
| Top-of-funnel volume by source ≠ hires by source | Reallocate recruiter time to high-yield channels. |
The measurement loop: every quarterly review uses the same metrics against the same definitions. The intervention's effect is visible in the next quarter's numbers. That's what distinguishes analytics from reporting.
This project is structured as a decision-support tool, not a production system. The notes below document what would change in an enterprise rollout.
| Feature category | Production source |
|---|---|
| Candidate records + stage progression | ATS (Greenhouse / Lever / iCIMS / Workday Recruiting) Application + Job_Application_Stage |
| Source attribution | ATS source + UTM parameters from career-site analytics |
| Role, business unit, location | ATS Job + HRIS Job_Requisition |
| Recruiter assignment | ATS user records |
| Demographic attributes (self-ID) | ATS self-ID module, subject to jurisdiction and consent |
| Time-to-fill, time-in-stage | Derived from ATS stage-transition timestamps |
- Required fields non-null: role, source, stage, applied date
- Duplicate candidate detection across requisitions (same person re-applying)
- Source canonicalization — "LinkedIn", "Linkedin", "LI-sourcer" collapsed to a single channel; free-text source values quarantined pending mapping
- Stage ordering integrity: stage transitions must move forward (catching ATS configurations that allow out-of-order moves)
- Withdrawn vs rejected disambiguation: candidates who withdraw their application are not the same as candidates rejected by the employer; mixing them distorts conversion math
- Refresh cadence: daily for operational dashboards; near-real-time unnecessary for strategic funnel work
- Role-based access: recruiters see their own requisitions; TA leaders see their portfolio; People Analytics sees enterprise-wide. Candidate PII is visible only to recruiters actively working the requisition and their manager.
- Demographic data is sensitive: bias detection outputs must never include individual candidate names. Aggregate stage pass-rates by URM status only; never join back to candidate records in shared dashboards.
- Audit log: every bias-detection query is logged with the user, filters applied, and timestamp.
- Auditability: exports include the source-data snapshot timestamp so findings are reproducible.
- Baseline refresh: recompute the stage-pass-rate baselines monthly; alert if any stage rate drifts more than 10% from its trailing-90-day average (see the sketch after this list).
- Bias-test re-run: run the two-proportion z-test quarterly; flag any stage where the p-value crosses 0.05 in either direction.
- Source-mix drift: track the source mix of top-of-funnel applications; major shifts can invalidate the previous quarter's conclusions and warrant a baseline reset.
- Reviewer variance: when available, stratify by recruiter to isolate whether a bias signal is driven by specific screeners rather than by the rubric.
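One way to express the stage-rate drift check referenced above. The function and argument names are illustrative, and the dictionaries of stage pass rates are assumed inputs:

```python
def stage_rate_drift(current: dict, trailing_90d: dict, threshold: float = 0.10) -> list:
    """Flag stages whose pass rate moved more than `threshold` (relative) vs the trailing-90-day baseline."""
    alerts = []
    for stage, baseline in trailing_90d.items():
        observed = current.get(stage)
        if observed is None or baseline == 0:
            continue
        relative_change = abs(observed - baseline) / baseline
        if relative_change > threshold:
            alerts.append((stage, baseline, observed, round(relative_change, 3)))
    return alerts
```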
- Bias findings carry significant legal sensitivity. A statistically significant gap is evidence of something — not proof of discrimination — but how that finding is documented and shared has litigation implications.
- Intervention design: every intervention in response to a bias finding (rubric changes, recruiter recalibration, blind-screen pilot) requires Legal review. Some interventions (quotas, preferential routing) are not permitted in most U.S. jurisdictions.
- EEOC / OFCCP reporting: for federal contractors, any systemic bias finding may trigger reporting obligations. Employment law counsel must be in the loop before detected findings are documented in writing.
- Candidate-facing communications: candidates have a right to understand how automated screening affects them; in many jurisdictions, disclosing it is a legal requirement.
| Role | Permitted use |
|---|---|
| Recruiters | Operate within their own requisitions; see funnel health for their portfolio |
| TA leaders | See source effectiveness, time-to-fill, and aggregate funnel across their teams |
| People Analytics | Maintain the model, investigate drift and bias, support intervention design |
| Employment Law / Compliance | Review bias findings before they are shared outside the core analytics team |
| HR Business Partners | Consume aggregate findings in their business unit |
| Executives | Consume enterprise summaries; approve cross-portfolio intervention decisions |
- "These bias tests control for nothing." Correct — the headline tests are marginal. A complete audit regresses pass rate on demographics after controlling for resume strength, role, source, and time period. Included in the notebook as a follow-on analysis.
- "Referral boost is confounded with role/level." Also correct — referrals tend to concentrate in higher-level engineering hires, where overall pass rates are higher anyway. A rigorous source analysis would include fixed effects for role.
- "Simulated data can't demonstrate this works on real data." Fair. The value is showing the methodology — what metrics to compute, what tests to run, what breakdowns to present. Any real deployment would port the same analysis functions to an actual ATS export (Greenhouse, Lever, Workday Recruiting).
- "Bias detection alone doesn't fix the problem." Absolutely. Detection is step 1; intervention design is step 2; impact measurement is step 3. The loop is what creates change, not any single quarter's analysis.
Part of a People Analytics portfolio covering workforce planning, recruiting, compensation equity, and retention. Companion repositories:
- workforce-planning-demand-forecast — strategic workforce planning and recruiter capacity
- compensation-equity-analysis — regression-based pay equity audit
- hr-attrition-predictor — responsible retention risk modeling
Maintainer: Jeff Otterson. Libraries: pandas, scipy, streamlit, plotly. MIT licensed.