Insurance Risk Management

Risk management is central to the insurance industry, because insurance means that insurers take over risks from their customers. Insurers consider every available quantifiable factor to develop profiles of high and low insurance risk. The level of risk determines the insurance premium: in general, policies involving factors with a greater risk of claims are charged at a higher rate. The more information insurers have at hand, the more accurately they can evaluate the risk of a policy. To this end, insurers collect a vast amount of information about policy holders and insured objects. Statistical methods and tools based on data mining techniques can then be used to analyze and determine insurance policy risk levels.

Insurance Risk Analysis

This page describes the following insurance risk analysis methods:

  • Insurance risk factor profiling
  • Insurance predictive modeling
  • Insurance risk modeling
  • Insurance scoring
  • Insurance risk-level classification


If you have past insurance records, developing risk models is not that difficult. If you are interested in developing insurance scoring models yourself, please write to us and we will send you the step-by-step "Modeling Guide to Credit and Insurance Scoring". It details the procedures described here: data preparation, variable relevancy analysis, hotspot and exception analysis, predictive decision tree probability modeling, neural network modeling, model validation, and rule-based model integration. A trial license for the fully featured insurance risk analysis software for insurance scoring modeling is also available.

Software downloads: An evaluation copy of the Cramer decision tree for modeling and drill-down segmentation analysis is available from CTM - Downloads.

Profiling of Risky Segments

Profiling insurance risk factors is very important. The Pareto principle suggests that 80% to 90% of insurance claims may come from 10% to 20% of the insurance segment groups. Profiling these segments can reveal invaluable information for insurance risk management. Insurance providers often collect a large amount of information on insured entities. Policy information (for automobile insurance, life insurance, general insurance, etc.) often consists of dozens or even hundreds of variables, involving both categorical and numerical data, some of it noisy. Profiling means identifying the factors and variables that best summarize these segments.
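
As a rough illustration of the Pareto check described above, the following Python sketch ranks segments by claim amount and reports what fraction of the segments accounts for a given share of total claims. The segment names and amounts are made-up illustration data, and `pareto_share` is a hypothetical helper, not part of any product.

```python
def pareto_share(claims_by_segment, target=0.8):
    """Fraction of segments (largest first) needed to reach
    `target` share of the total claim amount."""
    amounts = sorted(claims_by_segment.values(), reverse=True)
    total = sum(amounts)
    if total <= 0:
        raise ValueError("total claims must be positive")
    running = 0.0
    for count, amount in enumerate(amounts, start=1):
        running += amount
        if running / total >= target:
            return count / len(amounts)
    return 1.0


# Made-up claim totals per segment (illustration only).
example = {"A": 500, "B": 300, "C": 50, "D": 40, "E": 30,
           "F": 25, "G": 20, "H": 15, "I": 10, "J": 10}
share = pareto_share(example, target=0.8)
print(f"{share:.0%} of segments produce 80% of claims")
```

With this toy data, two of the ten segments carry 80% of the claim amount, which is the kind of concentration the Pareto principle describes.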

Combinational factor analysis and combinatorial explosion

Analyzing such a vast amount of information is an extremely difficult and challenging task. In conventional profiling methods, factor analysis is performed on a few (up to several) variables at a time using statistical software. As the total number of variables increases, the number of combinations to be examined in this way grows combinatorially. When a large number of variables is involved, the number of combinations becomes so large that a thorough, systematic analysis is all but impossible. A conventional workaround is to examine only the combinations that are thought likely to have influence. However, relying on a hunch can leave out important factors without anyone noticing.
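
To give a feel for this growth, the short Python sketch below counts how many groups of one to three variables can be formed from a given number of variables. It counts variable subsets only; the number of value combinations within each subset would be larger still. The function name `combos_up_to` is an illustrative assumption.

```python
from math import comb


def combos_up_to(n_vars, max_size):
    """Number of variable subsets of size 1..max_size out of n_vars."""
    return sum(comb(n_vars, k) for k in range(1, max_size + 1))


for n in (10, 50, 100):
    print(n, "variables ->", combos_up_to(n, 3), "subsets of up to 3 variables")
```

Ten variables give 175 subsets of up to three variables, while one hundred variables give 166,750.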

Fortunately, this problem can be overcome with StarProbe Hotspot Profiling Analysis Tools. Hotspot profiling analysis drills down into the data systematically and accurately detects important relationships, co-factors, interactions, dependencies and associations among many variables and values, using Artificial Intelligence techniques such as incremental learning and searching, and generates profiles of the most interesting segments. Note that insurance premiums are normally stipulated using profiles of risky (or very low-risk) policy holders. Hotspot analysis can identify profiles of high (and low) risk policies accurately through thorough analysis of all available insurance data. The following are examples of risk factor profiles. The same approach can be applied to other quantifiable-risk insurance such as credit insurance, general insurance, and so on.

High risk healthcare coverage risk factor profiling

An insurance company keeps health care insurance coverage (health insurance for short) or life insurance records in its database: gender, age, education, smoking, drinking, sun activity, height, weight (= obesity level), claim payment, etc., as well as other contact information. The company wishes to know which health insurance groups are at the highest risk, i.e., have the highest claim ratio. The following is a possible output of hotspot profiling analysis:

Figure: High-risk health insurance profiling.

High risk auto insurance risk factor profiling

An insurance company keeps motor vehicle insurance (automobile insurance) records in its database, containing driver and vehicle information: gender, age, license experience, education, occupation, drinking, smoking, mobile phone use; vehicle manufacturer, type, model, year of make, and so on. The company wishes to know which motor vehicle insurance groups are at the highest risk, i.e., have the highest average insurance payouts. The following is a possible output of hotspot profiling analysis:

Figure: High-risk auto insurance profiling.

The following figure shows an example of StarProbe Data Miner hotspot analysis output. The top-left pane is the hotspot drill-down tree. The top-right pane shows detailed statistics of the selected hotspots. The bottom-left and bottom-right panes provide gains/lift factor analysis.

Figure: Insurance risk factor profiling using hotspot analysis.
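
The idea of ranking segments by claim ratio can be sketched in a few lines of Python. This is a naive exhaustive search over small variable combinations, written under stated assumptions: it is not the StarProbe hotspot algorithm, and the function name `hotspots`, the field names and the records are all made up for illustration.

```python
from collections import defaultdict
from itertools import combinations


def hotspots(records, target, max_vars=2, min_size=2):
    """Rank segments (attribute=value combinations) by claim ratio.

    Returns a list of (claim_ratio, segment_size, profile) tuples,
    highest ratio first. Segments smaller than min_size are skipped.
    """
    attrs = [k for k in records[0] if k != target]
    results = []
    for size in range(1, max_vars + 1):
        for subset in combinations(attrs, size):
            stats = defaultdict(lambda: [0, 0])  # key -> [claims, count]
            for rec in records:
                key = tuple(rec[a] for a in subset)
                stats[key][0] += rec[target]
                stats[key][1] += 1
            for key, (claims, count) in stats.items():
                if count >= min_size:
                    profile = dict(zip(subset, key))
                    results.append((claims / count, count, profile))
    results.sort(key=lambda r: (-r[0], -r[1]))
    return results


# Made-up policy records: claimed is 1 if a claim was paid, else 0.
records = [
    {"smoker": "yes", "age": "50+", "claimed": 1},
    {"smoker": "yes", "age": "50+", "claimed": 1},
    {"smoker": "yes", "age": "<50", "claimed": 0},
    {"smoker": "no", "age": "50+", "claimed": 1},
    {"smoker": "no", "age": "50+", "claimed": 0},
    {"smoker": "no", "age": "<50", "claimed": 0},
    {"smoker": "no", "age": "<50", "claimed": 0},
    {"smoker": "no", "age": "<50", "claimed": 0},
]
for ratio, size, profile in hotspots(records, "claimed")[:3]:
    print(f"{ratio:.2f}  n={size}  {profile}")
```

The exhaustive loop is exactly what becomes infeasible with hundreds of variables, which is why a guided search is needed in practice.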

Insurance Risk Modeling

If the past is any guide to future events, predictive modeling is an excellent technique for insurance risk management. Predictive models are developed from historical records of insurance policies, containing financial, demographic, psychographic and geographic information, along with properties of the insured objects. From this past policy information, predictive models can learn the patterns associated with different insurance claim ratios, and can then be used to predict the risk levels of future insurance policies. It is important to note that this statistical process requires a substantially large number of historical records (or insurance policies) containing useful information. Useful information here means any factor that differentially affects insurance claim ratios.

Insurance Predictive Modeling and Tools

StarProbe data miner supports robust, easy-to-use predictive modeling tools. Users can develop models with the help of intuitive model visualization tools. Applying and deploying insurance risk models is also very simple. StarProbe supports the following predictive modeling tools:

  • Neural Network is a very powerful modeling tool. It generally offers the most accurate and versatile predictive models. It is very easy to develop neural network predictive models with StarProbe. Network visualization tools guide users through configuration, training, testing and, more importantly, direct application to databases.
  • Cramer Decision Tree produces the most compact, and thus most general, decision trees. A decision tree can be used to predict the segmentation-based statistical probability of insurance claims.
  • Regression produces mathematical functions for predicting insurance claims. It can be very limiting as a general-purpose method for predictive modeling of insurance claims. However, when used together with the methods above, it can be very useful.

Pitfalls of classification modeling techniques

Classification models assign events to categorical cases, say, "risky" or "safe". Classification is primarily supported by decision trees, SVMs, neural networks, etc. Intuitively, classification is a very appealing approach, because predictions are expressed in terms that anyone can understand without professional interpretation. However, there is a serious drawback in applying classification techniques to insurance risk management.

The problem is that insurance claims are in general low-ratio events, say, less than 10%. Developing predictive models with such skewed data is very difficult, especially with decision tree classification. A decision tree develops a predictive model by repeatedly segmenting the population into smaller groups, and it uses the dominant (most frequent) value of each segment as the predicted value for that segment. With two classes, the dominant value is the one represented by over 50% of the segment population. Insurance customers are already well screened, so it is possible that no segment contains more than 50% risky customers. Even if such segments exist, they may be only slightly over 50%. A segment in which 49% of customers have a claim history will be predicted as "not risky", although it is a very high risk group. Such a model will have very low accuracy in predicting risky customers as "risky". Worse, a segment that is only slightly over 50% risky is labeled "risky" as a whole, so many non-risky customers may end up being classified as "risky". These are not useful properties.

It is important to note that every classification technique that reports only the dominant class has this limitation. To overcome the problem, some may be tempted to rebalance the data by introducing extra instances. However, such tricks necessarily distort the overall representation of the population, and the problem remains. A better approach is insurance scoring using statistical probability, described in the next section.
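
The 49% example can be made concrete with a small Python sketch that contrasts a dominant-class label with the plain claim ratio of the same segment. The segment names and counts are made up, and both helper functions are illustrative assumptions.

```python
def majority_label(claims, total):
    """Classification-style output: the dominant class of the segment."""
    return "risky" if claims / total > 0.5 else "safe"


def claim_ratio(claims, total):
    """Scoring-style output: the fraction of policies with a claim."""
    return claims / total


# Made-up segments as (policies with claims, total policies).
segments = {"S1": (49, 100), "S2": (5, 100), "S3": (51, 100)}
labels = {name: majority_label(c, n) for name, (c, n) in segments.items()}
scores = {name: claim_ratio(c, n) for name, (c, n) in segments.items()}
for name in segments:
    print(name, labels[name], scores[name])
```

Segments S1 and S2 both receive the label "safe" even though their claim ratios are 0.49 and 0.05, while S1 and S3 receive opposite labels despite nearly identical ratios. The claim ratio keeps the distinction that the label throws away.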


Insurance Scoring

Insurance scoring is the numerical rating of insurance policies. It measures the level of risk that a claim will be made on a policy. This section describes advanced insurance risk modeling and insurance scoring methods:

Method 1: Cramer decision tree predicting claims probability

A decision tree divides insurance customer segments into smaller sub-segments recursively. At each segment, the split is made in a way that boosts the proportion of either claimed policies or no-claim policies in each resulting sub-segment. This process repeats until no further improvement can be made.

Figure: Decision trees - statistical probability insurance scoring.

The above figure shows a StarProbe data miner decision tree. Insurance segments are partitioned recursively in a way that increases the proportion of either claimed policies or no-claim policies. In the figure, red represents the claimed-policy portion and green the no-claim portion. Nodes in red indicate that over 50% of the customers in the segment have claimed policies. Green nodes have less than 50% claimed policies.

For a new insurance application, applying the customer's information to the tree will normally lead to a terminal node segment. The claims ratio of that node is used as the insurance score of the customer or policy. If the segment had a 35% claims ratio in the past, the score will be 35% (or 0.35). For more information, please read Decision Tree Software.
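
The scoring step described above can be sketched in Python with a small hand-built tree. This is only an illustration of using a terminal node's claims ratio as the score; the `Node` structure, the split conditions and the counts are hypothetical and do not reflect how the Cramer decision tree is built or stored.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Node:
    claims: int = 0          # past policies with claims (terminal nodes)
    total: int = 0           # all past policies (terminal nodes)
    test: Optional[Callable[[dict], bool]] = None  # None marks a terminal node
    yes: Optional["Node"] = None
    no: Optional["Node"] = None


def score(node, applicant):
    """Walk the tree and return the claims ratio of the terminal node."""
    while node.test is not None:
        node = node.yes if node.test(applicant) else node.no
    return node.claims / node.total


# Hypothetical tree with made-up counts.
tree = Node(
    test=lambda a: a["smoker"],
    yes=Node(claims=35, total=100),
    no=Node(
        test=lambda a: a["age"] >= 60,
        yes=Node(claims=12, total=80),
        no=Node(claims=4, total=200),
    ),
)
print(score(tree, {"smoker": True, "age": 40}))
```

Every applicant who lands in the same terminal node receives the same score, which is why tree-based scoring is coarse.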

Method 2: Neural network predicting relative claims risk level

Tree-based insurance scoring provides coarse-level prediction. It lacks the accuracy that neural network models can produce. A neural network is a very powerful predictive modeling technique, modeled on animal nervous systems (e.g., the human brain). Neural networks can learn to predict in fine detail with high accuracy. The following shows the neural network module of StarProbe data miner:

Figure: Neural network for credit scoring.

A neural network works differently from a decision tree. It can be trained to predict either relative claim levels or expected claim amounts. The following histograms show the distribution of insurance scores predicted by a neural network insurance scoring model. Red represents claimed policies and green represents no-claim policies. Clearly, the neural network model gives claimed policies higher scores and no-claim policies lower scores. By analyzing the distribution of scores, the claims probability may be deduced.
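
One simple way to deduce a claims probability from a score distribution is to group scored policies into bins and compute the observed claim ratio in each bin. The Python sketch below does this under stated assumptions: scores lie in [0, 1], bins are equal-width, and the scores and outcomes are made-up data rather than output of any real model.

```python
def claim_rate_by_bin(scores, claimed, n_bins=4):
    """Observed claim ratio per equal-width score bin over [0, 1].

    Returns one value per bin, or None for a bin with no policies.
    """
    counts = [0] * n_bins
    claims = [0] * n_bins
    for s, c in zip(scores, claimed):
        idx = min(int(s * n_bins), n_bins - 1)  # a score of 1.0 goes in the last bin
        counts[idx] += 1
        claims[idx] += c
    return [claims[i] / counts[i] if counts[i] else None
            for i in range(n_bins)]


# Made-up model scores and actual outcomes (1 = claimed, 0 = no claim).
scores = [0.05, 0.10, 0.20, 0.30, 0.40, 0.45, 0.60, 0.70, 0.80, 0.90]
claimed = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]
rates = claim_rate_by_bin(scores, claimed)
print(rates)
```

A new policy's score can then be mapped to the claim ratio of the bin it falls into. With real data each bin needs enough policies for the ratio to be reliable.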

Figure: Proportional distribution of insurance scores; insurance score distribution histogram.

Method 3: Incorporating advanced insurance scoring techniques

The decision tree method provides coarse-level general insurance scoring, where all customers in the same segment are given the same rating. The neural network method offers finer insurance scoring. Combining the two methods can yield much more robust insurance scoring systems. Rule-based Modeling provides an environment where rules and mathematical formulas can be expressed together with predictive models (such as neural networks and decision trees), so the two insurance scoring methods can be combined using rule-based modeling.

Rule-based modeling is a powerful tool for modeling complex problems such as insurance scoring. For example, the following rule-based modeling expressions combine three insurance scoring models: one decision tree, "insure-tree", and two neural networks, "insure-net1" and "insure-net2". First, applicants who have had previous heart attacks are at much higher risk, so they are handled specially by the first rule, which returns the tree's probability but never less than 0.7. Second, if the tree's prediction belongs to a segment specially marked as 'pareto-group', the score is computed as the maximum of the probability predicted by the tree and the average of the values predicted by the two neural networks. Otherwise, the decision tree probability is returned by the third statement:

      // For customers having previous heart attacks;
      // return the larger of the decision tree probability and 0.7;
      IF (Heart_attacks > 0) {
         RETURN MAX(MODEL("insure-tree", 'claimed'), 0.7) // at least 70%!
      END ;

      // Although no previous heart attacks, belongs to the 80~90% Pareto group;
      IF (MODEL("insure-tree", LABEL)='pareto-group') {
         RETURN MAX(MODEL("insure-tree", 'claimed'),
             AVG(MODEL("insure-net1"), MODEL("insure-net2")))
      END ;

      // For the rest;
      RETURN MODEL("insure-tree", 'claimed') ;
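
For readers who do not use the rule-based modeling language, the same three rules can be sketched in Python. This is a minimal sketch under stated assumptions: the models are passed in as plain callables, the neural network outputs are assumed to be on the same 0 to 1 scale as the tree probability, and every name here (`insurance_score`, `tree_prob`, `tree_label`, `net1`, `net2`, `heart_attacks`) is hypothetical rather than a StarProbe API.

```python
def insurance_score(applicant, tree_prob, tree_label, net1, net2):
    """Combine one tree and two network models with three rules."""
    p_tree = tree_prob(applicant)

    # Rule 1: previous heart attacks mean a score of at least 0.7.
    if applicant.get("heart_attacks", 0) > 0:
        return max(p_tree, 0.7)

    # Rule 2: in the Pareto group, take the larger of the tree
    # probability and the average of the two network predictions.
    if tree_label(applicant) == "pareto-group":
        return max(p_tree, (net1(applicant) + net2(applicant)) / 2)

    # Rule 3: everyone else gets the tree probability.
    return p_tree


# Stub models with made-up outputs, for illustration only.
result = insurance_score(
    {"heart_attacks": 0},
    tree_prob=lambda a: 0.3,
    tree_label=lambda a: "pareto-group",
    net1=lambda a: 0.5,
    net2=lambda a: 0.7,
)
print(result)
```

Passing the models in as callables keeps the rule logic separate from the models, so a retrained tree or network can be swapped in without touching the rules.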

Insurance Fraud Detection

Generally speaking, data mining fraud detection techniques do not work well. One problem is that fraudulent claims are only partially discovered, so it is difficult to develop predictive models: predictive modeling requires a substantially large amount of training data. When data is available in bulk, predictive modeling may be used to identify similar claims that have not yet been discovered. In addition, the way healthcare insurance operates provides indirect approaches to fraud detection. For more, please read Healthcare Insurance Fraud Detection.

New way of deploying credit risk models for insurance actuaries

Conventionally, risk models are made available to insurance actuaries through purpose-written programs, normally desktop programs with graphical user interfaces (GUI). However, developing fully featured GUI programs is very costly and time consuming. In addition, distribution is very difficult to manage. Risk models should be revised periodically to reflect the changing business environment, so model upgrades are another problematic area.

StarProbe Rule-based Modeling Environment and Web Service Kit provides a robust environment for deploying sophisticated insurance risk management systems over the web in a simple manner, without any programming effort. Credit managers can access credit scoring models using web browsers, as shown in the following figure. Deploying models over StarProbe RME Webkit is simple and, more importantly, secure and centrally controlled. Model upgrades reach all users instantaneously.

Figure: Credit risk scoring on web.

Turn your risk models into robust audit and monitoring agents!

Insurance transactions are often processed automatically with minimal human intervention. Sophisticated credit scoring models may be embedded into insurance processing systems to audit and filter out risky transactions. The core technology is based on SOA (Service Oriented Architecture) and embedded packages. For more, please read Transaction Audit and Event monitoring.

Harness your risk management products with Embedded intelligence!

StarProbe Data Miner and Rule-based Modeling Environment provide ideal end-to-end solutions for developing and deploying intelligent systems. They provide robust modeling power incorporating rules, formulas, and predictive models. They are based on component technology and can be easily integrated into your risk management software using program API calls, SOA, SOAP, Web Services, Java/J2EE, Servlets, JSPs, XML, etc. This is essential technology for systems integrators, outsourcing companies, and service providers. For more, please Contact Us.