Modeling Car Insurances

Overview

This project aims to develop a predictive model to estimate the likelihood of a customer making a claim on their car insurance during the policy period. Commissioned by a fictional car insurance company, the project seeks to optimize pricing strategies and enhance risk assessment capabilities, crucial for maintaining a competitive edge in the large car insurance market.

Data Contents

The dataset, car_insurance.csv, includes various customer attributes:

Column	Description
`id`	Unique client identifier
`age`	Client's age: `0`: 16-25 `1`: 26-39 `2`: 40-64 `3`: 65+
`gender`	Client's gender: `0`: Female `1`: Male
`driving_experience`	Years the client has been driving: `0`: 0-9 `1`: 10-19 `2`: 20-29 `3`: 30+
`education`	Client's level of education: `0`: No education `1`: High school `2`: University
`income`	Client's income level: `0`: Poverty `1`: Working class `2`: Middle class `3`: Upper class
`credit_score`	Client's credit score (between zero and one)
`vehicle_ownership`	Client's vehicle ownership status: `0`: Does not own their vehicle (paying off finance) `1`: Owns their vehicle
`vehicle_year`	Year of vehicle registration: `0`: Before 2015 `1`: 2015 or later
`married`	Client's marital status: `0`: Not married `1`: Married
`children`	Client's number of children
`postal_code`	Client's postal code
`annual_mileage`	Number of miles driven by the client each year
`vehicle_type`	Type of car: `0`: Sedan `1`: Sports car
`speeding_violations`	Total number of speeding violations received by the client
`duis`	Number of times the client has been caught driving under the influence of alcohol
`past_accidents`	Total number of previous accidents the client has been involved in
`outcome`	Whether the client made a claim on their car insurance (response variable): `0`: No claim `1`: Made a claim
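
Because most columns are stored as integer codes, it can help to translate them back into labels before plotting or reporting. The sketch below does this for a subset of the columns, assuming the CSV stores the integer codes exactly as listed in the table. The names `CATEGORY_LABELS` and `decode_categories` are illustrative and do not come from the notebook, and the two-row frame is made up for the demonstration.

```python
import pandas as pd

# Code-to-label mappings copied from the column table above (subset of columns).
CATEGORY_LABELS = {
    "age": {0: "16-25", 1: "26-39", 2: "40-64", 3: "65+"},
    "gender": {0: "Female", 1: "Male"},
    "driving_experience": {0: "0-9", 1: "10-19", 2: "20-29", 3: "30+"},
    "education": {0: "No education", 1: "High school", 2: "University"},
    "income": {0: "Poverty", 1: "Working class", 2: "Middle class", 3: "Upper class"},
    "vehicle_type": {0: "Sedan", 1: "Sports car"},
    "outcome": {0: "No claim", 1: "Made a claim"},
}


def decode_categories(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with coded columns replaced by readable labels."""
    decoded = df.copy()
    for column, labels in CATEGORY_LABELS.items():
        if column in decoded.columns:
            # Codes missing from the mapping become NaN rather than raising.
            decoded[column] = decoded[column].map(labels)
    return decoded


# Tiny made-up frame for illustration only; real rows come from car_insurance.csv.
sample = pd.DataFrame({"age": [0, 3], "income": [1, 2], "outcome": [0, 1]})
decoded_sample = decode_categories(sample)
print(decoded_sample)
```

Keeping the mappings in one dictionary means the same labels can be reused for plot axes and report tables without touching the numeric columns used for modeling.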

Methodology

Data Cleaning and Transformation: Handled missing values and transformed categorical variables into numerical representations.
Exploratory Data Analysis (EDA): Analyzed correlations and visualized data distributions to understand feature relationships.
Feature Selection: Applied logistic regression to assess each feature's predictive power.
Modeling:
- Logistic Regression: Tuned using GridSearchCV, achieving an accuracy of 79%.
- Random Forest: Tuned with parameters max_depth: None, min_samples_split: 2, n_estimators: 50, achieving an accuracy of 82%.
Evaluation: Used confusion matrices and classification reports to evaluate model performance.
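
The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's code: it runs on a small synthetic stand-in for car_insurance.csv, and the mean imputation, the scaling step, the 75/25 split, the 3-fold cross-validation, and the parameter grid are all assumptions chosen for the example. The helper names (`make_synthetic_data`, `rank_features`, `tune_random_forest`) are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def make_synthetic_data(n_rows: int = 400, seed: int = 0) -> pd.DataFrame:
    """Build a small made-up stand-in for car_insurance.csv (not real data)."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "driving_experience": rng.integers(0, 4, n_rows),
        "credit_score": rng.random(n_rows),
        "annual_mileage": rng.integers(2, 22, n_rows) * 1000.0,
    })
    # Assumed relationship for the demo: less experience, more claims.
    claim_prob = 0.85 - 0.27 * df["driving_experience"]
    df["outcome"] = (rng.random(n_rows) < claim_prob).astype(int)
    # Blank out some values so the imputation step has something to do.
    df.loc[rng.choice(n_rows, 30, replace=False), "credit_score"] = np.nan
    return df


def rank_features(X: pd.DataFrame, y: pd.Series) -> pd.Series:
    """Test accuracy of a logistic regression fit on each feature alone."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=0, stratify=y
    )
    scores = {}
    for column in X.columns:
        model = make_pipeline(
            SimpleImputer(strategy="mean"),
            StandardScaler(),
            LogisticRegression(max_iter=1000),
        )
        model.fit(X_train[[column]], y_train)
        scores[column] = model.score(X_test[[column]], y_test)
    return pd.Series(scores).sort_values(ascending=False)


def tune_random_forest(X: pd.DataFrame, y: pd.Series) -> GridSearchCV:
    """Grid-search a random forest over an assumed, illustrative grid."""
    pipeline = make_pipeline(
        SimpleImputer(strategy="mean"),
        RandomForestClassifier(random_state=0),
    )
    param_grid = {
        "randomforestclassifier__n_estimators": [50, 100],
        "randomforestclassifier__max_depth": [None, 5],
        "randomforestclassifier__min_samples_split": [2, 5],
    }
    search = GridSearchCV(pipeline, param_grid, cv=3, scoring="accuracy")
    search.fit(X, y)
    return search


data = make_synthetic_data()
X, y = data.drop(columns="outcome"), data["outcome"]
feature_ranking = rank_features(X, y)
search = tune_random_forest(X, y)
print(feature_ranking)
print(search.best_params_)
```

Wrapping the imputer and the estimator in one pipeline keeps the imputation inside each cross-validation fold, so statistics from held-out rows do not leak into training.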

Results

Best Feature: driving_experience was the most predictive single feature; a logistic regression using it alone reached an accuracy of 77.71%.
Model Performance:
- Logistic Regression: Precision of 0.92 for "No Claim" and 0.63 for "Claim".
- Random Forest: Precision of 0.85 for "No Claim" and 0.70 for "Claim".
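
To make the per-class precision figures concrete, the sketch below shows how they follow from a confusion matrix. The ten labels are hypothetical values invented for the arithmetic and are not results from this project; only the scikit-learn functions are real.

```python
from sklearn.metrics import classification_report, confusion_matrix, precision_score

# Hypothetical labels for ten clients (0 = no claim, 1 = claim); NOT project results.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0]

# Rows are true classes, columns are predicted classes.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

# Precision for a class = correct predictions of that class / all predictions of it.
precision_no_claim = tn / (tn + fn)
precision_claim = tp / (tp + fp)

# The same numbers via scikit-learn, one class at a time.
sk_no_claim = precision_score(y_true, y_pred, pos_label=0)
sk_claim = precision_score(y_true, y_pred, pos_label=1)

print(f"No Claim precision: {precision_no_claim:.2f}")
print(f"Claim precision:    {precision_claim:.2f}")
print(classification_report(y_true, y_pred, target_names=["No Claim", "Claim"]))
```

Read this way, a "Claim" precision of 0.70 means that 70% of the clients the model flags as likely claimants actually made a claim.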

Visualizations

Confusion matrices and correlation heatmaps are included to provide visual insights into model performance and feature relationships.

Conclusion

The project identified key predictive features, led by driving_experience, and developed models to support pricing and risk-assessment decisions. The Random Forest model is recommended for deployment for its higher accuracy (82% versus 79% for Logistic Regression), providing a balance between simplicity and performance. This approach lets the company start with a straightforward model in production, minimizing the need for complex infrastructure and expertise.

Getting Started

To run this project, ensure you have the following libraries installed:

pandas
numpy
matplotlib
seaborn
scikit-learn

Clone the repository and run the Jupyter Notebook to explore the analysis and predictions.

License

This project is licensed under the MIT License - see the LICENSE file for details.
