Predicting Quote Conversion in Insurance

An AI-powered tool that helps insurance underwriters focus on the quotes most likely to convert.


Overview

This project began after a series of brainstorming sessions with an insurance client, who wanted to explore new ways to capitalise on the data captured by their insurance marketplace platform.

One idea stood out: a tool that would let users upload their current pipeline of quotes and instantly see predictions of how likely each quote is to bind.

At the time, only around 10% of submitted quotes were converting, meaning large amounts of time were being spent chasing low-probability deals.

We set out to build an AI model, API, and simple web interface that could quickly process quote data and present ranked predictions.

The result was a solution that prioritised each quote by its likelihood to convert, enabling brokers and underwriters to focus on the opportunities that matter most.

Data Extraction & Assembly

Before any analysis could take place, we first needed to collect and prepare the data.

Quote information was spread across several SQL tables, with some important fields, such as limits and premiums, stored as embedded JSON arrays within columns. To create a unified dataset, we used SQL to combine and “explode” these nested structures, transforming them into a consistent, tabular format suitable for analysis.
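The explode-and-flatten step was done in SQL, but the same transformation can be sketched in pandas; the column names and JSON shape here are hypothetical stand-ins for the real schema:

```python
import json

import pandas as pd

# Hypothetical quotes table with limits embedded as a JSON array per row.
quotes = pd.DataFrame({
    "quote_id": [1, 2],
    "limits_json": [
        '[{"type": "liability", "limit": 1000000}]',
        '[{"type": "property", "limit": 500000}, {"type": "liability", "limit": 250000}]',
    ],
})

# Parse the JSON strings, emit one row per array element ("explode"),
# then flatten each element's fields into ordinary columns.
quotes["limits"] = quotes["limits_json"].apply(json.loads)
exploded = quotes.explode("limits", ignore_index=True)
flat = pd.concat(
    [exploded[["quote_id"]], pd.json_normalize(exploded["limits"].tolist())],
    axis=1,
)
print(flat)  # one tabular row per (quote, limit) pair
```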

Once assembled, the combined dataset captured several years of quote history across multiple products and countries, providing a solid foundation for exploratory analysis and model development.

Exploratory Data Analysis

We explored the dataset in a Jupyter notebook using Python, combining visual inspection with statistical methods to understand relationships between features and the target variable.

Our first step was to evaluate basic distributions, missing values, and outliers. Summary statistics and boxplots quickly highlighted inconsistencies in numeric fields, including several quotes with unusually large premium values that appeared to be data entry errors.
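This screen can be sketched with pandas on toy data, using the same 1.5 × IQR rule that defines a boxplot's whiskers (the values and threshold are illustrative, not the client's data):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Toy premium data with two implausibly large entries injected,
# mimicking the data-entry errors found during EDA.
df = pd.DataFrame({"premium": np.append(rng.lognormal(7, 1, 500), [5e9, 9e9])})

print(df["premium"].describe())  # summary statistics

# Flag values above the upper whisker (Q3 + 1.5 * IQR), as a boxplot would.
q1, q3 = df["premium"].quantile([0.25, 0.75])
iqr = q3 - q1
outliers = df[df["premium"] > q3 + 1.5 * iqr]
print(f"{len(outliers)} quotes fall above the upper whisker")
```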

Boxplots immediately showed heavily skewed distributions for premiums and revenue, which is to be expected for financial data. As the plots below show, many values fall outside the whiskers.

Premium

[Figure: boxplot of raw premium values. The distribution is highly skewed.]

Revenue

[Figure: boxplot of raw revenue values. The distribution is highly skewed.]

Limit

[Figure: boxplot of raw coverage limits. Right-skewed, though with a compact interquartile range.]

To understand the predictive power of individual features, we calculated univariate separability scores using the Area Under the ROC Curve (AUC). This revealed that one field had an unusually high correlation with the outcome, indicating potential data leakage. On review, we confirmed the field was only populated after a quote was bound, so it was removed.
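A minimal sketch of this univariate AUC screen on synthetic data (the feature names are hypothetical): a score near 0.5 means no signal, while a score near 1.0, as for a post-bind flag, is a leakage red flag.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)
n = 1000
y = rng.binomial(1, 0.1, n)  # ~10% bind rate, as in the real data

# Toy features: one genuinely informative, one random, one that leaks the label.
X = pd.DataFrame({
    "premium_log1p": y * 0.8 + rng.normal(0, 1, n),
    "created_hour": rng.integers(0, 24, n),
    "post_bind_flag": y,  # leakage: only populated after binding
})

# Univariate separability: AUC of each feature against the target.
scores = {c: roc_auc_score(y, X[c]) for c in X.columns}
for name, auc in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {auc:.3f}")
```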

We discovered that one product type behaved differently from all others. All quotes for this product were consistently binding, and including them would have dominated the model, causing it to focus on this single, perfectly predictable pattern instead of learning the more subtle relationships across other product types.

Finally, we analysed the industry variable, which was encoded using NAICS industry classification codes. These codes proved too granular, producing hundreds of rarely occurring categories.

Through this analysis, we developed a deeper understanding of the data and identified the transformations required to make it reliable and predictive. The EDA process also validated our feature assumptions and helped guide the next phase of model development.

[Figure: feature correlation heatmap, used to detect multicollinearity between features.]

Data Preparation & Feature Engineering

Following the exploratory phase, we prepared the data for modelling by applying the insights gathered during analysis. The goal was to create a clean, consistent, and informative dataset that captured meaningful business patterns without introducing data leakage.

We began by removing problematic and irrelevant rows identified during EDA, including every quote for the product type that consistently bound, along with several entries with implausibly high premium values. Columns that risked leaking the outcome, such as post-bind status indicators, were excluded from the training set.

Categorical variables, including product type and country, were standardised and one-hot encoded to make them compatible with machine learning algorithms. The NAICS industry codes were too granular for the available data, so we grouped them into broader industry categories before one-hot encoding.
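A minimal sketch of this encoding step in pandas, using hypothetical codes and a toy sector map:

```python
import pandas as pd

# Hypothetical mapping from granular NAICS-style codes to broad sectors.
SECTOR_MAP = {"3341": "Manufacturing", "5242": "Finance & Insurance"}

df = pd.DataFrame({
    "product_type": ["cyber", "property", "cyber"],
    "country": ["UK", "US", "UK"],
    "naic_code": ["3341", "5242", "9999"],
})

# Map codes to broad sectors, bucket unmapped codes under "Other",
# then one-hot encode all categoricals in one pass.
df["industry_sector"] = df["naic_code"].map(SECTOR_MAP).fillna("Other")
encoded = pd.get_dummies(
    df[["product_type", "country", "industry_sector"]], prefix_sep="="
)
print(encoded.columns.tolist())
```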

Feature Engineering

To improve model performance and interpretability, a shared feature engineering pipeline was built to convert raw quote data into structured, model-ready features. The same transformations are applied consistently at both training and inference time, ensuring alignment between model development and production scoring.

Temporal Features

Quote creation timestamps were used to derive several time-based indicators capturing seasonality and operational behaviour.

| Feature | Description |
| --- | --- |
| created_dow | Day of week the quote was created (0–6). |
| created_hour | Hour of day of creation, capturing working-hour effects. |
| created_month | Month of year to identify seasonal trends. |
| is_weekend / is_business_hours / is_morning / is_end_of_month | Binary indicators for operational timing patterns. |
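These temporal features fall out of the creation timestamp directly with pandas; the cutoffs (9–17 business hours, day ≥ 28 for end of month) are illustrative assumptions:

```python
import pandas as pd

quotes = pd.DataFrame({
    "created_at": pd.to_datetime(["2024-03-30 09:15", "2024-06-01 22:40"]),
})

ts = quotes["created_at"].dt
quotes["created_dow"] = ts.dayofweek                      # 0 = Monday ... 6 = Sunday
quotes["created_hour"] = ts.hour
quotes["created_month"] = ts.month
quotes["is_weekend"] = (ts.dayofweek >= 5).astype(int)
quotes["is_business_hours"] = ts.hour.between(9, 17).astype(int)
quotes["is_morning"] = (ts.hour < 12).astype(int)
quotes["is_end_of_month"] = (ts.day >= 28).astype(int)
print(quotes.drop(columns="created_at"))
```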

Financial Normalisation and Scaling

Financial magnitudes were highly skewed, so log transformations were introduced to stabilise the data and improve comparability.

| Feature | Transformation | Purpose |
| --- | --- | --- |
| revenue_log1p | log(1 + revenue) | Compress extreme values and improve scale. |
| premium_log1p | log(1 + premium) | Compress extreme values and improve scale. |
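The transformation itself is a one-liner with NumPy; log1p maps zero to zero, so quotes with no recorded premium remain valid (toy values):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"premium": [0.0, 950.0, 12_000.0, 2_500_000.0]})

# log(1 + x) compresses the heavy right tail that dominated the raw boxplots.
df["premium_log1p"] = np.log1p(df["premium"])
print(df)
```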

After applying log transformations to premium and revenue, the boxplots show a healthy distribution for modelling.

Premium log1p

[Figure: boxplot of log1p-transformed premium, showing a much healthier distribution.]

Revenue log1p

[Figure: boxplot of log1p-transformed revenue, showing a much healthier distribution.]

Ratio and Relational Features

Ratios between premiums and limits were added to provide context on pricing and risk appetite.

| Feature | Description |
| --- | --- |
| premium_to_limit | Ratio of premium to insured limit. |

Industry Grouping

Industry codes were mapped to higher-level sectors to reduce sparsity and improve generalisation across unseen data.

| Feature | Description |
| --- | --- |
| industry_sector_name | Mapped from NAICS-style codes to broad sectors such as Manufacturing or Finance & Insurance. |
| industry_subsector_grouped | Rare subsectors grouped under "Other" based on minimum sample thresholds. |
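The minimum-sample grouping can be sketched in pandas as follows; the threshold of 3 is a hypothetical choice for illustration:

```python
import pandas as pd

MIN_SAMPLES = 3  # hypothetical threshold below which a subsector becomes "Other"

df = pd.DataFrame({"subsector": ["Retail"] * 5 + ["Mining"] * 2 + ["Utilities"]})

# Keep subsectors with enough support; bucket the rest under "Other"
# so unseen or rare categories generalise instead of fragmenting the data.
counts = df["subsector"].value_counts()
keep = counts[counts >= MIN_SAMPLES].index
df["industry_subsector_grouped"] = df["subsector"].where(
    df["subsector"].isin(keep), "Other"
)
print(df["industry_subsector_grouped"].value_counts())
```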

Before training, the data was split into training and validation sets using stratified sampling to maintain the class balance of bound and non-bound quotes. This ensured the evaluation results reflected real-world behaviour and not random distribution effects.
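The stratified split is a single scikit-learn call; the synthetic data below stands in for the real dataset at roughly its 10% bind rate:

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = rng.binomial(1, 0.1, 1000)  # ~10% bind rate

# stratify=y preserves the bound/non-bound ratio in both splits.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
print(f"train bind rate: {y_train.mean():.3f}, val bind rate: {y_val.mean():.3f}")
```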

The resulting dataset was clean, structured, and model-ready, providing a reliable foundation for the development and evaluation of predictive models.

Model Development & Evaluation

With the dataset prepared and validated, the next step was to identify which modelling approach would best capture the patterns that influence whether a quote binds. We evaluated a range of supervised classification algorithms, balancing model complexity, interpretability, and performance.

Model Selection

Because the dataset combined numeric, categorical, and engineered ratio features, we focused primarily on tree-based ensemble methods, known for modelling non-linear relationships and handling mixed feature types with minimal preprocessing. For benchmarking, we also included baseline linear and kernel-based models to illustrate the performance gap between traditional techniques and gradient boosting.

| Model | Type |
| --- | --- |
| LightGBM | Gradient Boosting (Histogram-based) |
| XGBoost | Gradient Boosting (Tree-based) |
| Gradient Boosting (sklearn) | Baseline Boosting |
| Random Forest | Bagging Ensemble |
| Logistic Regression | Linear Model |
| Support Vector Machine | Kernel Method |

Model Comparison Routine

To ensure fairness across experiments, all models were trained using the same stratified train/validation split and evaluated on identical data. A custom model comparison script automated this process, standardising metrics, reproducibility settings, and evaluation outputs.

Each model was trained with a fixed random seed to guarantee repeatable results and tuned with a consistent set of hyperparameters optimised for generalisation rather than overfitting. The evaluation script computed key performance metrics, ROC-AUC, Precision-Recall AUC (PR-AUC), and Lift@50, providing a balanced view of overall ranking quality and top-of-list performance.
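A condensed sketch of such a comparison routine. LightGBM and XGBoost are swapped for scikit-learn models here so the example stays self-contained, `lift_at_k` is a hypothetical helper, and the dataset is synthetic:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import train_test_split

def lift_at_k(y_true, y_score, k=50):
    """Bind rate among the top-k scored quotes divided by the overall bind rate."""
    top_k = np.argsort(y_score)[::-1][:k]
    return y_true[top_k].mean() / y_true.mean()

# Imbalanced synthetic data (~10% positives), split once and shared by all models.
X, y = make_classification(n_samples=2000, weights=[0.9], random_state=0)
X_tr, X_va, y_tr, y_va = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

models = {
    "Random Forest": RandomForestClassifier(random_state=42),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}
results = {}
for name, model in models.items():
    proba = model.fit(X_tr, y_tr).predict_proba(X_va)[:, 1]
    results[name] = {
        "roc_auc": roc_auc_score(y_va, proba),
        "pr_auc": average_precision_score(y_va, proba),
        "lift@50": lift_at_k(y_va, proba),
    }
    print(name, results[name])
```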

The table below summarises the final model comparison results:

Final Rankings (by PR-AUC)

| Model | ROC-AUC | PR-AUC | Lift@50 | Top50 Binds |
| --- | --- | --- | --- | --- |
| LightGBM | 0.928911 | 0.702337 | 8.594118 | 15 |
| XGBoost | 0.952065 | 0.675751 | 8.021176 | 14 |
| Gradient Boosting | 0.919024 | 0.611051 | 8.021176 | 14 |
| Random Forest | 0.917647 | 0.439035 | 7.448235 | 13 |
| Logistic Regression | 0.537547 | 0.047643 | 2.291765 | 4 |
| SVM | 0.481977 | 0.043755 | 0.572941 | 1 |

The LightGBM model achieved the best overall balance of recall and precision, with a PR-AUC of 0.70 and over eight-fold lift at the top 50 predictions compared to random chance. Although XGBoost recorded a slightly higher ROC-AUC, LightGBM demonstrated superior precision-recall performance, making it the preferred choice for deployment.

The results also illustrate the strength of gradient-boosting techniques on structured business data and the limitations of traditional linear approaches in this context. Logistic Regression and SVM models struggled to capture the complex interactions between features, while ensemble tree methods handled them naturally.

LightGBM was selected as the production model for its combination of high accuracy and speed.

The Solution

We packaged the model into a simple, production-ready API that fits seamlessly into the client’s workflow:

  • API: A lightweight FastAPI endpoint that accepts a CSV of new quotes and returns an ordered list ranked by bind probability.
  • Demo Interface: A simple drag-and-drop tool that demonstrates the API, letting users upload a CSV file and view the ranked results.

What’s Next

We’ll continue to collect data and periodically revisit the model comparison to make sure we’re extracting the maximum predictive power from it.

We also plan to introduce interpretability dashboards to help users understand which factors most influence each prediction, building transparency and trust in AI-assisted decision making.

Key Takeaways

  • Identified high-probability quotes using AI ranking
  • Fast, self-contained API and upload tool
  • ROC-AUC of 0.93 and PR-AUC of 0.70, indicating strong predictive performance
  • Improved broker efficiency and decision focus