Credit and Finance Risk Management

Credit risk analysis (finance risk analysis, loan default risk analysis) and credit risk management are important to financial institutions that provide loans to businesses and individuals. Credit arises in many forms: bank mortgages (or home loans), motor vehicle purchase finance, credit card purchases, installment purchases, and so on. All credit, loans, and finance carry the risk of default. To understand the risk levels of credit users, credit providers normally collect vast amounts of information on borrowers. Statistical predictive analytic techniques can be used to analyze or determine the risk levels involved in credit, finance, and loans, i.e., default risk levels.

Why internal credit scoring?

Personal credit scores are normally computed from information available in credit reports collected by external credit bureaus and ratings agencies. Credit scores may indicate a person's financial history and current situation. However, they do not tell you exactly what distinguishes a "good" score from a "bad" score. More specifically, they do not tell you the level of risk for the lending you may be considering. The internal credit scoring methods described on this page address this problem. Internal credit scoring techniques can be applied to commercial credit as well.

Credit Risk Analysis and Modeling

On this page, the following credit risk analysis methods are described:

  • Credit risk factor profiling or loans default analysis.
  • Credit predictive modeling or loans default predictive modeling.
  • Credit risk modeling or finance risk modeling.
  • Credit scoring (Internal).

Was subprime meltdown necessary?

If you have past credit records, developing risk models is not that difficult! If you are interested in developing credit scoring models yourself, please write to us. We will send you the step-by-step "Modeling Guide to Credit and Insurance Scoring". It details the procedures described here: data preparation, variable relevancy analysis, hotspot and exception analysis, predictive decision tree probability modeling, neural network modeling, model validation, and rule-based model integration. A trial license for the fully featured credit risk analysis software for internal credit scoring modeling is also available.

Software downloads: An evaluation copy of the Cramer decision tree for modeling and drill-down segmentation analysis is available from CTM - Downloads.

Profiling Risky Credit Segments

Credit risk profiling (finance risk profiling) is very important. The Pareto principle suggests that 80%~90% of credit defaults may come from 10%~20% of the lending segments. Profiling these segments can reveal useful information for credit risk management. Credit providers often collect a vast amount of information on credit users. Information on credit users (or borrowers) often consists of dozens or even hundreds of variables, involving both categorical and numerical data with noisy information. Profiling identifies the factors or variables that best summarize these segments.
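
As a rough illustration of the Pareto idea, the following Python sketch groups loans by one variable and reports how defaults concentrate in each segment. The function name, the field names ("occupation", "default"), and the toy records are assumptions made up for this example; they are not StarProbe output or real data.

```python
from collections import defaultdict

def default_concentration(loans, segment_key):
    """Rank segments by default count and report the cumulative share of all defaults."""
    totals = defaultdict(int)
    defaults = defaultdict(int)
    for loan in loans:
        seg = loan[segment_key]
        totals[seg] += 1
        defaults[seg] += 1 if loan["default"] else 0

    all_defaults = sum(defaults.values())
    rows = []
    cumulative = 0
    # Segments with the most defaults come first.
    for seg in sorted(totals, key=lambda s: defaults[s], reverse=True):
        cumulative += defaults[seg]
        rows.append({
            "segment": seg,
            "loans": totals[seg],
            "default_ratio": defaults[seg] / totals[seg],
            "cumulative_share": cumulative / all_defaults if all_defaults else 0.0,
        })
    return rows

# Toy, made-up records for illustration only.
loans = (
    [{"occupation": "A", "default": True}] * 8
    + [{"occupation": "A", "default": False}] * 12
    + [{"occupation": "B", "default": True}] * 1
    + [{"occupation": "B", "default": False}] * 39
    + [{"occupation": "C", "default": True}] * 1
    + [{"occupation": "C", "default": False}] * 39
)

report = default_concentration(loans, "occupation")
for row in report:
    print(row)
```

In this toy data, segment "A" holds 20 of the 100 loans but 8 of the 10 defaults, which is the kind of concentration the Pareto principle describes.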

Combinational factor analysis and Combinatorial blowout!

Analyzing such vast information is an extremely difficult and challenging task! In conventional methods, factor analysis is performed on a few (to several) variables at a time using statistical software. As the total number of variables increases, the number of combinations to be examined in this way grows combinatorially. When a large number of variables is involved, there are too many combinations to examine manually. Thorough, systematic, accurate analysis is all but impossible! A conventional workaround is to examine only the combinations that seem likely to have influence. However, relying on hunches can leave out important factors without anyone noticing.
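
To give a feel for the growth, this small Python sketch counts how many groups of one, two, or three variables would have to be examined for a given number of variables. The function name and the choice of groups up to size three are assumptions for illustration; the count covers variable groups only, before the individual values of each variable are even considered.

```python
from math import comb

def combinations_to_examine(num_variables, max_group_size):
    """Count all variable groups of size 1..max_group_size."""
    return sum(comb(num_variables, k) for k in range(1, max_group_size + 1))

for n in (10, 50, 100):
    print(n, "variables ->", combinations_to_examine(n, 3), "groups of up to 3 variables")
```

With 10 variables there are 175 such groups; with 100 variables there are 166,750.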

Fortunately, this problem can be overcome with StarProbe Hotspot Profiling Analysis. Hotspot profiling analysis drills down into data systematically, accurately detects important relationships, co-factors, interactions, dependencies and associations amongst many variables and values using Artificial Intelligence techniques, and generates profiles of the most interesting segments. Hotspot analysis can identify profiles of high (and low) risk loans accurately through thorough systematic analysis of all available data. The following are examples of hotspot profiling applied to credit information.

Finance risk factor profiling examples

Finance risk factor profiles can be easily developed with StarProbe data miner. The following describes how StarProbe hotspot analysis tools can be used in developing profiles.

[Example 1] A financing firm (or bank) keeps loan records on motor vehicle purchases in its database, including default information: gender, age, education, occupation, income; vehicle type, manufacturer, model, year of make, price, loan amount, default, default amount, etc. The firm wishes to know which types of loans for motor vehicle purchases are at the highest risk, i.e., have the highest default ratio (probability of default):

Credit, loans, default, finance risk factor profiling example 1.

[Example 2] For the same data, the bank wishes to know which types of loans for motor vehicle purchases are at the lowest risk in terms of lowest average default amounts:

Credit, loans, default, finance risk factor profiling example 2.
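
The two questions above can be approximated with a brute-force sketch in Python. This is not StarProbe's hotspot algorithm, whose internals are not described here; it simply enumerates segments defined by one or two variable=value conditions, discards small segments, and ranks the rest. The field names ("age_band", "vehicle_type", "default", "default_amount"), the `min_loans` threshold, and the records are assumptions invented for illustration.

```python
from itertools import combinations

def segment_stats(loans, variables, max_conditions=2, min_loans=3):
    """Compute default statistics for every segment defined by up to
    max_conditions variable=value conditions, keeping segments with
    at least min_loans loans."""
    stats = {}
    for loan in loans:
        for size in range(1, max_conditions + 1):
            for combo in combinations(variables, size):
                key = tuple((var, loan[var]) for var in combo)
                entry = stats.setdefault(
                    key, {"loans": 0, "defaults": 0, "default_amount": 0.0})
                entry["loans"] += 1
                entry["defaults"] += 1 if loan["default"] else 0
                entry["default_amount"] += loan["default_amount"]
    return [
        {
            "segment": key,
            "loans": e["loans"],
            "default_ratio": e["defaults"] / e["loans"],
            "avg_default_amount": e["default_amount"] / e["loans"],
        }
        for key, e in stats.items()
        if e["loans"] >= min_loans
    ]

def loan(age_band, vehicle_type, default_amount=0.0):
    """Build one toy loan record; a positive default amount means a default."""
    return {"age_band": age_band, "vehicle_type": vehicle_type,
            "default": default_amount > 0, "default_amount": default_amount}

# Toy, made-up records for illustration only.
loans = [
    loan("18-25", "sports", 9000.0), loan("18-25", "sports", 7000.0),
    loan("18-25", "sports"), loan("26-40", "sports"),
    loan("26-40", "sedan"), loan("26-40", "sedan"),
    loan("26-40", "sedan", 2000.0), loan("18-25", "sedan"),
    loan("41+", "sedan"), loan("41+", "sedan"), loan("41+", "sedan"),
]

results = segment_stats(loans, ["age_band", "vehicle_type"])

# Example 1: segment with the highest default ratio.
riskiest = max(results, key=lambda r: r["default_ratio"])
# Example 2: segment with the lowest average default amount
# (ties go to the segment with fewer conditions).
safest = min(results, key=lambda r: (r["avg_default_amount"], len(r["segment"])))

print("highest default ratio:", riskiest)
print("lowest average default amount:", safest)
```

Exhaustive enumeration like this only scales to a handful of variables, which is exactly the combinatorial problem described earlier.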

The following figure shows an example of StarProbe Data Miner hotspot analysis output. The top-left pane is the hotspot drill-down tree. The top-right pane shows detailed statistics of the selected hotspots. The bottom-left and bottom-right panes provide lift factor analysis.

Credit, loans, default, finance risk factor profiling using hotspot analysis.

Credit Risk Modeling

If the past is any guide for predicting future events, predictive modeling is an excellent technique for credit risk management. Predictive models are developed from past historical records of credit loans, containing financial, demographic, psychographic, geographic information, etc. From this past credit information, predictive models can learn patterns of different credit default ratios and can then be used to predict the risk levels of future credit loans. It is important to note that this statistical process requires a substantially large number of historical records (or customer loans) containing useful information. Useful information is anything that can be a factor that differentially affects credit default ratios.

Credit Risk Predictive Modeling and Tools

StarProbe data miner supports robust, easy-to-use predictive modeling tools. Users can develop models with the help of intuitive model visualization tools. Application and deployment of credit risk models is also very simple. StarProbe supports the following predictive modeling tools:

  • Neural Network is a very powerful modeling tool. It generally offers the most accurate and versatile models. It is very easy to develop neural network predictive models with StarProbe. Network visualization tools guide users through configuration, training, testing, and, more importantly, direct application to databases.
  • Cramer Decision Tree produces the most compact and thus most general decision trees. Decision trees can be used for predicting the segmentation-based statistical probability of credit loan defaults.
  • Regression produces mathematical functions for predicting default risk levels. It can be very limiting as a general-purpose credit risk predictive modeling method. However, when used together with the above methods, it can be very useful.

Pitfalls of classification modeling techniques

Classification models assign events to categorical classes, say, "risky" or "safe". Classification is supported by decision trees, SVMs, neural networks, etc. Intuitively, this is a very appealing approach, as predictions are made in terms that anyone can understand! However, there is a serious drawback in applying classification techniques to credit risk management. The problem is that credit defaults are in general low-ratio events, say, less than 10%. Developing predictive models with such skewed data is very difficult, especially with decision tree classification.

Decision trees develop predictive models by segmenting populations into smaller groups recursively. A tree uses the dominant category (or most frequent value) of each segment as the predicted value for the segment. With two classes, the dominant category is the value held by over 50% of the segment population. Credit users are already well screened, so it is possible that no segment contains more than 50% risky customers! Even if one exists, it may be only slightly over 50%! A segment in which 49% of customers have a default history will be predicted as "not" risky, although it is a very high-risk segment! Models of this type will have very low accuracy in predicting risky customers as "risky". Much worse, as a consequence, more non-risky customers may end up being classified as "risky". Not very useful properties!

It is important to note that all classification techniques have this limitation. To overcome the problem, you may be tempted to use tricks such as introducing extra instances. However, such tricks necessarily distort the overall representation of the population, and the problem still remains! A better approach is credit scoring using statistical probability, described in the next sections.
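
The majority-vote problem can be seen in a few lines of Python. The segment names and default ratios below are made-up values for illustration, and `classify_by_majority` is a hypothetical helper that mimics how a classification tree labels a segment by its dominant category.

```python
def classify_by_majority(default_ratio):
    """Label a segment by its dominant category, as a classification tree would."""
    return "risky" if default_ratio > 0.5 else "safe"

# Hypothetical segments with their historical default ratios.
segments = {"segment A": 0.49, "segment B": 0.05, "segment C": 0.02}

labels = {name: classify_by_majority(ratio) for name, ratio in segments.items()}
for name, ratio in segments.items():
    print(f"{name}: label={labels[name]}, default probability={ratio:.2f}")
```

All three segments receive the same "safe" label, even though segment A defaults roughly ten times as often as segment B. Keeping the default ratio itself as the score preserves that difference.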


Credit Scoring

An (internal) credit score is a numerical rating of a credit loan. It measures the level of risk of default. The level of default risk is best predicted with predictive modeling. Credit scores can be measured in terms of default probability or relative numerical ratings. The following subsections outline several credit scoring methods:

Method 1: Cramer decision tree predicting default probability

A decision tree divides customer loan segments into smaller sub-segments recursively. At each segment, the split is made in a way that boosts the proportion of either defaulted loans or fully-recovered loans in each resulting sub-segment. This process repeats until no further improvement can be made.
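
One step of such a split can be sketched in Python as follows. The splitting criterion of the Cramer decision tree is not described on this page, so this sketch substitutes weighted Gini impurity, a common stand-in that likewise favors sub-segments dominated by either defaulted or fully-recovered loans. The function names, field names, and records are assumptions for illustration.

```python
def gini(defaults, total):
    """Gini impurity of a two-class segment (0.0 means a pure segment)."""
    if total == 0:
        return 0.0
    p = defaults / total
    return 2 * p * (1 - p)

def best_split(loans, variables):
    """Find the variable=value test whose two sub-segments have the
    lowest weighted Gini impurity. Returns (score, variable, value)."""
    best = None
    n = len(loans)
    for var in variables:
        for value in sorted({loan[var] for loan in loans}):
            inside = [l for l in loans if l[var] == value]
            outside = [l for l in loans if l[var] != value]
            if not inside or not outside:
                continue  # not a real split
            score = sum(
                len(part) / n * gini(sum(l["default"] for l in part), len(part))
                for part in (inside, outside)
            )
            if best is None or score < best[0]:
                best = (score, var, value)
    return best

# Toy, made-up records for illustration only.
loans = [
    {"income_band": "low", "vehicle_type": "sports", "default": True},
    {"income_band": "low", "vehicle_type": "sedan", "default": True},
    {"income_band": "low", "vehicle_type": "sports", "default": True},
    {"income_band": "high", "vehicle_type": "sports", "default": False},
    {"income_band": "high", "vehicle_type": "sedan", "default": False},
    {"income_band": "high", "vehicle_type": "sedan", "default": False},
    {"income_band": "high", "vehicle_type": "sports", "default": False},
    {"income_band": "low", "vehicle_type": "sedan", "default": False},
]

split = best_split(loans, ["income_band", "vehicle_type"])
print(split)
```

Here the income split wins because it separates a segment with no defaults from one where three of four loans defaulted. A full tree would apply the same step recursively to each sub-segment.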

Decision trees - statistical probability credit scoring

The above figure shows a StarProbe data miner decision tree. Customer loan segments are partitioned recursively in a way that increases the proportion of either defaulted or fully-recovered loans. In the figure, red represents the defaulted loan portion and green the fully-recovered portion. Nodes in red indicate that over 50% of the customers in the segment have defaulted loans. Green nodes have fewer than 50% defaulted customers.

For a new loan application, applying the customer's information to the tree will normally lead to a terminal node segment. The default ratio of that node is used as the credit score of the customer. If the segment had a 35% default ratio in the past, the score will be 35% (or 0.35). For more information, please read Decision Tree Software.
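
Scoring a new application against an already-built tree can be sketched as below. The nested-dictionary tree layout, the variable names, and the leaf default ratios are all hypothetical and chosen only to mirror the 35% example; they do not reflect StarProbe's internal model format.

```python
# Hypothetical tree: internal nodes test one variable, terminal nodes
# store the historical default ratio of their segment.
tree = {
    "variable": "income_band",
    "branches": {
        "low": {
            "variable": "vehicle_type",
            "branches": {
                "sports": {"default_ratio": 0.35},
                "sedan": {"default_ratio": 0.12},
            },
        },
        "high": {"default_ratio": 0.03},
    },
}

def credit_score(node, applicant):
    """Walk from the root to a terminal node and return its default ratio."""
    while "default_ratio" not in node:
        node = node["branches"][applicant[node["variable"]]]
    return node["default_ratio"]

applicant = {"income_band": "low", "vehicle_type": "sports"}
print(credit_score(tree, applicant))
```

In this sketch an applicant with a value the tree has never seen raises a `KeyError`; a production system would need a fallback, such as using the default ratio of the last node reached.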

Method 2: Neural network predicting relative default risk level

Tree-based credit scoring provides coarse-level prediction. It lacks the accuracy that neural network models can produce. Neural networks are a very powerful predictive modeling technique, inspired by animal nervous systems (e.g., human brains). Neural networks can learn to predict in detail with high accuracy. The following shows the neural network module of StarProbe data miner:

neural network for credit scoring.

A neural network works differently from a decision tree. It can be trained to predict either relative default levels or expected default amounts. When trained on the former, the network predicts the relative level of credit default; when trained on the latter, it predicts expected default amounts. The following histograms show the distribution of credit scores predicted by a neural network credit scoring model. Red represents defaulted loans and green represents fully recovered loans. Clearly, the neural network model gives higher scores to defaulted loans and lower scores to fully recovered loans. By analyzing the distribution of scores, default probability may be deduced.
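
One simple way to deduce default probability from a score distribution is to bin the scores and take the observed default ratio in each bin. The Python sketch below assumes scores scaled to the range 0 to 1 and uses made-up scores and outcomes; the function name and the number of bins are likewise illustrative assumptions.

```python
def default_ratio_by_score_bin(scores, defaulted, num_bins=4):
    """Split the 0..1 score range into equal-width bins and return the
    observed default ratio per bin (None for an empty bin)."""
    counts = [0] * num_bins
    defaults = [0] * num_bins
    for score, is_default in zip(scores, defaulted):
        bin_index = min(int(score * num_bins), num_bins - 1)
        counts[bin_index] += 1
        defaults[bin_index] += 1 if is_default else 0
    return [
        defaults[i] / counts[i] if counts[i] else None
        for i in range(num_bins)
    ]

# Toy, made-up model scores and actual outcomes (True = defaulted).
scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.35, 0.40, 0.45,
          0.55, 0.60, 0.65, 0.70, 0.80, 0.85, 0.90, 0.95]
defaulted = [False, False, False, False, False, False, False, True,
             False, True, False, True, True, True, True, False]

ratios = default_ratio_by_score_bin(scores, defaulted)
for i, ratio in enumerate(ratios):
    print(f"scores {i / 4:.2f} to {(i + 1) / 4:.2f}: default ratio {ratio}")
```

A new loan whose score falls in a given bin can then be assigned that bin's default ratio as an estimated default probability, provided each bin contains enough past loans.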

proportional distribution of credit scores. credit score distribution histogram.

Method 3: Incorporating advanced credit scoring techniques

The decision tree method provides coarse-level general credit scoring, where customers of the same segment are treated equally with the same rating. The neural network method offers finer credit scoring. Combining the two methods can yield much more robust credit scoring systems. Rule-based Modeling provides an environment where rules and mathematical formulas can be expressed together with predictive models (such as neural networks and decision trees). The two credit scoring methods can be combined using rule-based modeling.

Rule-based modeling is a powerful tool for modeling complex problems such as credit scoring. For example, the following rule-based modeling expressions combine three credit scoring models: one decision tree, "credit-tree", and two neural networks, "credit-net1" and "credit-net2". First, applicants with previous credit defaults are handled specially by the first rule. Next, if the tree's prediction belongs to a segment specially labeled as 'pareto-group', the score is computed as the maximum of the probability predicted by the tree and the average of the values predicted by the two neural networks. Otherwise, the third statement returns the decision tree probability.

      // For customers having previous loan defaults, return the larger of the
      // decision tree probability and a linear formula on Prev_Defaults, capped at 1.0;
      IF (Prev_Defaults > 0) { 
         RETURN MIN(MAX(MODEL("credit-tree", 'defaulted'), Prev_Defaults*0.01+0.85), 1.0)
      END ;

      // Although no previous loan defaults, belongs to the 80~90% Pareto group;
      IF (MODEL("credit-tree", LABEL)='pareto-group') {
         RETURN MAX(MODEL("credit-tree", 'defaulted'), 
                  AVG(MODEL("credit-net1"), MODEL("credit-net2")))
      END ;

      // For the rest;
      RETURN MODEL("credit-tree", 'defaulted') ;

New way of deploying credit risk models for credit managers

Conventionally, risk models are made available to credit managers through purpose-written programs, normally desktop programs with graphical user interfaces (GUI). However, developing fully-featured GUI programs is very costly and time consuming. In addition, distribution is very difficult to manage. Risk models should be revised periodically to reflect the changing business environment, so model upgrades are another problematic area.

StarProbe Rule-based Modeling Environment and Web Service Kit provide a robust environment for deploying sophisticated credit scoring systems over the web in a simple manner without any programming effort. Credit managers can access credit scoring models using web browsers, as shown in the following figure. Deploying models over StarProbe RME Webkit is simple and, more importantly, secure and centrally controlled. Model upgrades reach all users instantaneously.

credit risk scoring on web.

Turn your risk models into robust audit and monitoring agents!

Credit transactions are often processed automatically with minimal human intervention. Sophisticated credit scoring models may be embedded into credit processing systems to audit and filter out risky transactions. The core technology is based on SOA (Service Oriented Architecture) and embedded packages. For more, please read Transaction Audit and Event monitoring.

Harness your risk management products with Embedded intelligence!

StarProbe Data Miner and Rule-based Modeling Environment provide ideal end-to-end solutions for developing and deploying intelligent systems. They provide robust modeling power incorporating rules, formulas, and predictive models. They are based on component technology and can be easily integrated into your risk management software using program API calls, SOA, SOAP, Web Services, Java/J2EE, Servlets, JSPs, XML, etc. Essential technology for systems integrators, outsourcing companies, and service providers! For more, please Contact Us.