# experience-rating

Experience modification factors, schedule rating, and NCD/bonus-malus systems for UK non-life insurance pricing. For teams whose experience rating logic lives in a spreadsheet no one fully understands.
Fleet motor is the clearest case. A 200-vehicle fleet has three years of claims history: 14 incidents, £320,000 in paid losses, £45,000 incurred but not yet settled. Market rating gives you the base. Experience rating answers what you actually want to know: how much should this account's own history move the price?
The maths is not hard, but the choices are. What credibility weight? What ballast? How do you cap a single catastrophic loss so it does not blow up the mod? These decisions are regularly buried in an Excel cell with no audit trail. This library makes them explicit and auditable.
Fleet also has no NCD system to fall back on — unlike personal motor, where a bad risk eventually self-selects up to a worse NCD level, fleet pricing is largely prospective. The experience mod factor is doing the full job of distinguishing good accounts from bad.
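The mod calculation itself is a one-liner once those choices are made. A minimal sketch, assuming the standard credibility blend mod = 1 + Z × (actual/expected − 1) with Z = weight × E/(E + B); the function name, default parameters, and the £300k expected-loss figure for the fleet above are illustrative assumptions, not this library's API:

```python
# Hand-rolled credibility-weighted mod factor (illustrative, not the library API).
def mod_factor(expected, actual, weight=0.65, ballast=8_000.0, cap=2.0, floor=0.5):
    z = weight * expected / (expected + ballast)  # credibility given to own experience
    mod = 1.0 + z * (actual / expected - 1.0)     # blend towards the market rate
    return min(max(mod, floor), cap)              # cap/floor tame catastrophic years

# The fleet above: £365k incurred (£320k paid + £45k outstanding) against an
# assumed £300k market-expected figure over the same three years.
print(round(mod_factor(300_000.0, 365_000.0), 3))  # → 1.137
```

A single £1m loss year on the same expected base would push the raw mod to roughly 2.5; the cap holds it at 2.0, which is exactly the kind of decision the library makes explicit rather than burying in a spreadsheet.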
Personal motor NCD is the secondary use case here. If your team is asking "what is the steady-state distribution of our book across NCD levels at 10% claim frequency?" or "at what claim amount should a 65% NCD customer absorb the loss rather than claim?", this library handles that too. But the core experience rating machinery was designed around commercial lines where NCD does not exist.
## What it does not do

It does not calibrate BM scales from data (that requires a GLM pipeline and historical claims). It does not model policyholder heterogeneity (see the credibility library for that). It does not optimise NCD system design: it analyses a system you have already specified.
## Installation

```shell
uv add experience-rating
```

Requires Python 3.10+. Dependencies: polars, numpy, scipy.
## Experience modification

Fleet is the primary use case. No NCD scale exists — the mod factor is the full adjustment.
```python
import polars as pl

from experience_rating import ExperienceModFactor
from experience_rating.experience_mod import CredibilityParams

# With A=0.65 and B=£8,000, a fleet with £8k expected losses has 32.5% sensitivity
# to its own experience (A * E/(E+B) = 0.65 * 8k/16k). At £80k expected losses
# sensitivity rises to ~59% — the ballast-to-expected ratio drives how much own
# experience matters; larger fleets are more heavily experience-rated.
params = CredibilityParams(credibility_weight=0.65, ballast=8_000.0)
emod = ExperienceModFactor(params)

fleet_accounts = pl.DataFrame({
    "risk_id": ["Alpha Logistics", "Beta Haulage", "Gamma Couriers"],
    "expected_losses": [25_000.0, 80_000.0, 12_000.0],
    "actual_losses": [32_000.0, 65_000.0, 4_000.0],
})

result = emod.predict_batch(fleet_accounts, cap=2.0, floor=0.5)
print(result)
# Alpha: slightly above 1.0 (worse than expected, moderate size)
# Beta: below 1.0 (better than expected, high credibility)
# Gamma: well below 1.0 (much better than expected, small fleet —
#        high ballast-to-expected ratio damps the result)
```

## Schedule rating

Use this alongside the mod factor for discretionary underwriter adjustments.
```python
from experience_rating import ScheduleRating

sr = ScheduleRating(max_total_debit=0.25, max_total_credit=0.25)
sr.add_factor("Premises", min_credit=-0.10, max_debit=0.10, description="Premises condition")
sr.add_factor("Management", min_credit=-0.07, max_debit=0.07, description="Management quality")
sr.add_factor("Risk_Controls", min_credit=-0.08, max_debit=0.08, description="Risk controls")

factor = sr.rate({"Premises": 0.05, "Management": -0.03, "Risk_Controls": 0.02})
print(f"Schedule rating factor: {factor:.4f}")  # 1.0400
```

## NCD / bonus-malus

NCD is the secondary use case — for personal lines teams who need to model the BM system analytically.
```python
from experience_rating import BonusMalusScale, BonusMalusSimulator

# Commonly used UK NCD scale: levels 0%-65%, step up on claim-free year,
# back two on one claim, back to zero on two or more claims.
scale = BonusMalusScale.from_uk_standard()
sim = BonusMalusSimulator(scale, claim_frequency=0.10)

# Analytical stationary distribution (left eigenvector of transition matrix)
dist = sim.stationary_distribution(method="analytical")
print(dist)

# Expected premium factor at steady state
epf = sim.expected_premium_factor()
print(f"Average NCD at steady state: {(1 - epf) * 100:.1f}%")
```

```python
from experience_rating import ClaimThreshold

ct = ClaimThreshold(scale, discount_rate=0.05)

# Customer at 65% NCD paying £280/year after discount.
# Over a 3-year horizon, should they claim a £450 repair?
threshold = ct.threshold(current_level=9, annual_premium=280.0, years_horizon=3)
print(f"Claim only if loss exceeds £{threshold:.0f}")

should = ct.should_claim(
    current_level=9, claim_amount=450, annual_premium=280.0, years_horizon=3
)
print("Claiming is rational" if should else "Better to pay out of pocket")
```

## API

### ExperienceModFactor

| Method | Description |
|---|---|
| `from_exposure(actual, full_credibility, ballast, formula)` | Construct from exposure-based credibility |
| `predict(expected_losses, actual_losses, cap, floor)` | Single-risk mod factor |
| `predict_batch(df, expected_col, actual_col, cap, floor)` | Portfolio mod factors (Polars DataFrame) |
| `sensitivity(expected_losses, actual_range, n_points)` | Mod vs actual loss curve |
### ScheduleRating

| Method | Description |
|---|---|
| `add_factor(name, min_credit, max_debit, description)` | Register a rating factor (chainable) |
| `rate(features)` | Multiplicative schedule factor for one risk |
| `rate_batch(df)` | Schedule factors for a portfolio DataFrame |
| `summary()` | Registered factors as a Polars DataFrame |
### BonusMalusScale

| Method | Description |
|---|---|
| `from_uk_standard()` | Commonly used UK NCD scale: 10 levels (0%-65%) |
| `from_dict(spec)` | Build from a dictionary specification |
| `transition_matrix(claim_frequency)` | Row-stochastic transition matrix (Poisson claims) |
| `summary()` | Polars DataFrame of level definitions |
### BonusMalusSimulator

| Method | Description |
|---|---|
| `simulate(n_policyholders, n_years)` | Monte Carlo simulation of level flows |
| `stationary_distribution(method)` | `"analytical"` (eigenvector) or `"simulation"` |
| `expected_premium_factor(method)` | Probability-weighted average premium factor at steady state |
### ClaimThreshold

| Method | Description |
|---|---|
| `threshold(current_level, annual_premium, years_horizon)` | Minimum loss amount that makes claiming rational |
| `should_claim(current_level, claim_amount, annual_premium, years_horizon)` | Boolean claiming decision |
| `threshold_curve(current_level, annual_premium, max_horizon)` | Threshold vs horizon DataFrame |
| `full_analysis(annual_premium, years_horizon)` | Thresholds for every level in the scale |
```python
spec = {
    "levels": [
        {
            "index": 0, "name": "No NCD", "premium_factor": 1.00, "ncd_percent": 0,
            "transitions": {"claim_free_level": 1, "claim_levels": {"1": 0, "2": 0}}
        },
        {
            "index": 1, "name": "20% NCD", "premium_factor": 0.80, "ncd_percent": 20,
            "transitions": {"claim_free_level": 2, "claim_levels": {"1": 0, "2": 0}}
        },
        {
            "index": 2, "name": "40% NCD", "premium_factor": 0.60, "ncd_percent": 40,
            "transitions": {"claim_free_level": 2, "claim_levels": {"1": 1, "2": 0}}
        },
    ]
}
scale = BonusMalusScale.from_dict(spec)
```

## Design notes

**Why expose ballast directly rather than deriving it?** Because the choice of ballast is a deliberate actuarial decision that affects which risks get charged more and which get discounted. Hiding it inside a calibration function obscures a regulatory-facing choice. For fleet, this matters — a low ballast means small fleets are fully experience-rated, which can produce volatile pricing. Most underwriters choose a ballast that gives 50% credibility at around £5,000-£10,000 expected losses.
**Why additive schedule rating (not multiplicative)?** UK commercial practice is additive: factors are debits/credits expressed as percentage adjustments summed together. The aggregate cap is where you control total swing. Multiplicative schedule rating is used in some US lines but is not standard in UK admitted business.
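The additive mechanics are compact enough to sketch: clamp each factor to its own bounds, sum the adjustments, cap the aggregate swing, and only then convert to a multiplier. The helper below mirrors the earlier `ScheduleRating` example (same factor names and bounds) but is an illustrative reimplementation, not the library's code:

```python
def schedule_factor(adjustments, bounds, max_total=0.25):
    # Clamp each debit/credit to its per-factor bounds, then sum additively.
    total = 0.0
    for name, adj in adjustments.items():
        lo, hi = bounds[name]
        total += min(max(adj, lo), hi)
    # Cap the aggregate swing before converting to a multiplier.
    total = min(max(total, -max_total), max_total)
    return 1.0 + total

bounds = {
    "Premises": (-0.10, 0.10),
    "Management": (-0.07, 0.07),
    "Risk_Controls": (-0.08, 0.08),
}
# +5% - 3% + 2% = +4%, matching the 1.0400 in the example above.
print(schedule_factor({"Premises": 0.05, "Management": -0.03, "Risk_Controls": 0.02}, bounds))
```

Note the order of operations: per-factor clamping happens before the aggregate cap, so no single underwriter judgement can dominate even when the total is within bounds.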
**Why eigenvector for stationary distribution?** It is exact (no simulation noise) and fast. The simulation method exists as a sanity check: if the two disagree by more than a few percent, the transition matrix is probably not ergodic.
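On a toy 3-level chain the two methods can be run side by side: the left eigenvector of the transition matrix against noise-free power iteration (a deterministic stand-in for the simulation check). The scale and transition rule here are simplified assumptions, not `from_uk_standard()`:

```python
import numpy as np

p = 0.10   # annual probability of at least one claim
q = 1 - p  # claim-free year

# Rows: current level 0/1/2; columns: next level. Any claim drops to level 0 here.
P = np.array([
    [p, q, 0.0],
    [p, 0.0, q],
    [p, 0.0, q],
])

# Left eigenvector for eigenvalue 1: pi @ P == pi.
vals, vecs = np.linalg.eig(P.T)
pi = np.real(vecs[:, np.argmax(np.real(vals))])
pi = pi / pi.sum()

# Cross-check with power iteration from a uniform start.
mu = np.full(3, 1 / 3)
for _ in range(500):
    mu = mu @ P
assert np.allclose(pi, mu, atol=1e-8)

print(pi)
```

With a 10% claim probability this toy chain settles at [0.1, 0.09, 0.81]: most of the book piles up at the top level, which is why the steady-state distribution question in the intro matters for premium adequacy.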
## Development

```shell
uv add "experience-rating[dev]"
pytest
```

105 tests covering scale construction, transition matrix properties, stationary distribution (analytical vs simulation agreement), claiming thresholds, the experience modification formula, and schedule rating bounds validation.
## Benchmark

Benchmarked against a flat portfolio rate (every policyholder charged the portfolio mean frequency, no individual adjustment) on a synthetic motor portfolio with a known data-generating process: 10,000 policyholders, 4 years of claims history, 10% annual mean frequency, with true individual frequencies drawn from a Gamma distribution (shape=2, mean=10%) — producing realistic heterogeneity across good, average, and bad risks.
The benchmark tests two components independently:
- **Experience modification factor:** credibility-weighted mod formula applied to 4 years of aggregate loss experience per policyholder, with cap=2.0 and floor=0.50.
- **NCD / bonus-malus system:** 10,000 policyholders simulated through the commonly used UK motor NCD scale over 4 history years. The NCD level at year 5 is used as a premium predictor. This is a deliberately conservative test — the NCD level is a lossy encoding of history (level only, not raw claim counts), so some discrimination signal is discarded.
| Method | Gini vs holdout claims | MSE vs DGP true frequency | Notes |
|---|---|---|---|
| Flat portfolio rate | baseline | baseline | No individual adjustment |
| NCD / BM system | expected +2 to +6 pp | expected 5–15% improvement | Lossy history encoding |
| Experience mod factor | expected +5 to +12 pp | expected 10–25% improvement | Full loss history retained |
Gini and MSE figures are labelled "expected" because exact values depend on the random seed. The direction and ordering are consistent: experience mod outperforms NCD (because it uses full loss amounts rather than level transitions), and both outperform the flat rate whenever the portfolio has genuine individual risk heterogeneity (true frequency CV > 0.5, which this DGP produces by construction).
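For reference, a Gini of the kind tabulated above can be computed from a Lorenz curve over policyholders ranked by predicted risk. The sketch below re-creates the benchmark's DGP (Gamma(shape=2, mean=10%) frequencies, Poisson claims) and compares an informative predictor with the flat rate; exact figures vary with the seed, as the "expected" label warns:

```python
import numpy as np

def gini(predicted, actual):
    # Rank policyholders by predicted risk (ascending) and accumulate their
    # actual claims; the further the Lorenz curve bows below the diagonal,
    # the better the predictor separates good risks from bad.
    order = np.argsort(predicted, kind="stable")
    lorenz = np.concatenate([[0.0], np.cumsum(actual[order]) / actual.sum()])
    # Trapezoidal area under the Lorenz curve on a unit x-grid.
    area = (lorenz[1:] + lorenz[:-1]).sum() / (2 * (len(lorenz) - 1))
    return 1.0 - 2.0 * area

rng = np.random.default_rng(42)
true_freq = rng.gamma(shape=2.0, scale=0.05, size=10_000)  # mean 10%, as in the DGP
claims = rng.poisson(true_freq)

print(gini(true_freq, claims))               # informative predictor: well above zero
print(gini(np.full(10_000, 0.10), claims))   # flat rate: no ranking power, near zero
```

Ranking by the true frequency recovers a Gini close to the concentration of the underlying Gamma frequencies, while the constant predictor cannot rank risks at all, which is the gap the benchmark table quantifies.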
The NCD system's A/E ratio converges toward 1.0 within each risk quality tier (good/average/bad) more quickly than the flat rate, confirming that the NCD level discriminates underlying risk quality even though it is not designed for this purpose.
Both methods run in under 1 second for 100,000 risks; once the data pipeline exists, there is no computational argument for a flat rate over experience rating.
Run notebooks/benchmark.py on Databricks to reproduce.
A ready-to-run Databricks notebook benchmarking this library against standard approaches is available in burning-cost-examples.
## Related libraries

- insurance-credibility - Bühlmann-Straub credibility weighting for scheme and affinity pricing. The experience mod factor here uses a simple credibility weight; `insurance-credibility` gives you the full structural parameter estimation (EPV, VHM, k) when you have panel data across multiple groups.
- insurance-multilevel - Two-stage CatBoost + REML approach when individual risk factors and group factors need to be modelled jointly.
### Model building
| Library | Description |
|---|---|
| shap-relativities | Extract rating relativities from GBMs using SHAP |
| insurance-cv | Walk-forward cross-validation respecting IBNR structure |
### Uncertainty quantification
| Library | Description |
|---|---|
| insurance-conformal | Distribution-free prediction intervals for Tweedie models |
| bayesian-pricing | Hierarchical Bayesian models for thin-data segments |
### Deployment and optimisation
| Library | Description |
|---|---|
| insurance-optimise | Constrained rate change optimisation with FCA PS21/5 compliance |
| insurance-demand | Conversion, retention, and price elasticity modelling |
| Library | What it does |
|---|---|
| insurance-credibility | Bühlmann-Straub group credibility and Bayesian experience rating at policy level |
| insurance-multilevel | Two-stage CatBoost + REML random effects — applies the same credibility logic to broker and scheme factors |
MIT