Predictive Modeling and Predictive Models

A predictive model is a system built and used to make predictions. Predictive models can forecast a variety of things and events, such as future share prices, credit defaults, insurance claims, and customer orders. They are developed from historical data or from data purposely collected through sampling. Typical examples include:

Insurance
Annual insurance policy applications and claims records can be used to develop models that predict the probability (or level of risk) of insurance claims. These models use demographic and financial information about policyholders, along with characteristics of the insured objects, to determine risk levels. For more, please read Insurance Risk Modeling.

Credit loans
Predicting default risk for loan applications is another use of predictive models. Data collected from past customer loans, including demographic and financial information about borrowers, can be used to build models that predict the likelihood of a loan defaulting. For more, please read Credit Risk Modeling.

Marketing
Predictive modeling can be used for various marketing tasks. For example, from past customer purchasing records, you can develop models that select customers who are likely to buy your new products. Another example is customer churn detection: using past customer information, you can build models that predict which customers are likely to churn in the near future. This can be very useful in retention marketing. For more, please read customer retention.

Balanced Scorecard
The Balanced Scorecard (BSC) is a framework for managing business performance. Predictive analytics is a powerful tool to improve Balanced Scorecards with enhanced business visibility and corporate governance.

Predictive Modeling Software Program Tools

Predictive modeling is done automatically by computer software that learns patterns from data. CMSR supports powerful predictive modeling tools. Users can build models with the help of intuitive model visualization tools, and can apply models directly to their own data using the built-in database interface tools. CMSR comes with the following predictive modeling software programs:

  • Predictive Neural Network
    A neural network is a predictive model loosely based on the architecture of the brain. It can predict both numerical values and categorical classifications. Generally speaking, neural networks offer the most accurate and versatile predictive models. For more, please read Neural Network.

  • Decision Tree
    A decision tree develops predictive models by recursive segmentation, so the resulting models have tree-like structures. As a rule, the decision at a node follows a winner-takes-all majority principle: the category with the largest number of cases at the node becomes the predicted category. This leads to certain limitations, for example when the category of interest is rare and never forms the majority at any node. To overcome this, StarProbe data miner also uses probability. For more, please read Decision Tree Classifier Software.

  • Regression
    Compared to the above methods, regression can be limiting and inflexible, since all categorical information must be encoded into numerical variables. However, regression can be very useful for developing mathematically oriented models with simple variable sets. In particular, time-series regression analytics is very useful in balanced scorecard applications.
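
The difference between the winner-takes-all rule and a probability-based reading of a decision node can be illustrated with a minimal Python sketch. It is not taken from CMSR or StarProbe; the function names and the two example leaves are hypothetical and exist only to show the idea.

```python
from collections import Counter

def majority_category(leaf_cases):
    """Winner-takes-all: return the most frequent category in a leaf."""
    return Counter(leaf_cases).most_common(1)[0][0]

def category_probability(leaf_cases, category):
    """Share of the cases in the leaf that belong to the given category."""
    return leaf_cases.count(category) / len(leaf_cases)

# Hypothetical leaves: outcomes of past cases that fell into each segment.
leaf_a = ["no_claim"] * 98 + ["claim"] * 2
leaf_b = ["no_claim"] * 70 + ["claim"] * 30

for name, leaf in [("A", leaf_a), ("B", leaf_b)]:
    # Both leaves get the same majority label, but very different claim rates.
    print(name, majority_category(leaf), category_probability(leaf, "claim"))
```

Both leaves are labeled "no_claim" by majority, yet the claim rate in leaf B is far higher than in leaf A. Keeping the probability preserves that difference, which is the kind of information a majority label alone discards.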

[Figures: decision tree classification predictive modeling; neural network predictive modeling]

Neural Clustering and Radial Basis Functions (RBF)

Neural clustering, also known as Self-Organizing Maps (SOM), is a very powerful clustering and segmentation tool. It clusters objects so that similar objects are placed in the same or nearby cells, as shown in the following figure. When combined with predictive modeling methods, it yields very powerful Radial Basis Functions. For more, please read Neural Clustering and Radial Basis Functions.

[Figure: neural clustering for expert systems]
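
How a self-organizing map places similar objects in the same or nearby cells can be sketched in a few lines of Python. This is a generic, minimal one-dimensional SOM written for illustration, not the CMSR implementation; the function names, the training schedule, and the sample data are all assumptions.

```python
import math
import random

def best_cell(cells, x):
    """Index of the cell whose weight vector is closest to x."""
    return min(range(len(cells)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(cells[i], x)))

def train_som(data, n_cells=4, epochs=200, seed=0):
    """Train a tiny one-dimensional self-organizing map on numeric vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Each cell holds a weight vector, initialised randomly in [0, 1).
    cells = [[rng.random() for _ in range(dim)] for _ in range(n_cells)]
    for epoch in range(epochs):
        frac = epoch / epochs
        rate = 0.5 * (1.0 - frac)                        # learning rate decays
        radius = max(0.5, (n_cells / 2) * (1.0 - frac))  # neighbourhood shrinks
        for x in rng.sample(data, len(data)):            # shuffled pass
            winner = best_cell(cells, x)
            for i, w in enumerate(cells):
                # Cells close to the winner on the map are pulled towards x too.
                pull = math.exp(-((i - winner) ** 2) / (2 * radius ** 2))
                for d in range(dim):
                    w[d] += rate * pull * (x[d] - w[d])
    return cells

# Hypothetical data: two groups of similar objects described by two features.
data = [[0.10, 0.10], [0.15, 0.12], [0.12, 0.08],
        [0.90, 0.90], [0.88, 0.92], [0.93, 0.85]]
cells = train_som(data)
for x in data:
    print(x, "-> cell", best_cell(cells, x))
```

Because every update also moves the winner's neighbors on the map, objects that resemble each other tend to end up in the same cell or in adjacent ones, while dissimilar objects are assigned to cells further apart.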

How can you develop predictive models? To learn more about predictive modeling,
please read The Cookbook for Predictive Analytics.


Key requirement for predictive modeling

The most important factor in successfully implementing predictive modeling is the availability of useful information. Predictive models are statistically developed patterns extracted from historical data or from purposely collected survey samples. With proper data that represents the predictive patterns of the application domain, accurate predictive models can be developed quite easily. For more, please read Cookbook for Predictive Analytics.

Do regression methods work?

Generally speaking, regression methods don't work well for complex modeling. This is especially true if the modeling data are severely skewed, in which case regression tends to produce rather random predictions. The following histograms compare different modeling techniques under severe data skew:

By Neural Network
A neural network is a very powerful modeling framework. As shown in the figure, it can learn in great detail. Most green areas are located below 0.4, and most red areas are located above 0.4.
By Cramer Tree Segmentation
This figure is a result of probability modeling using Cramer decision tree segmentation. Although it is not as good as the neural network, it still produces useful result patterns.
By Regression: General Linear Model
This result is produced with general linear regression models. Even with a general linear model of RR=0.99936, it produces totally useless predictive patterns: the figure shows no pattern in the distribution of reds.

Predictive Modeling in Rule Inference Systems

Generally, predictive modeling is of little use if it cannot deal with the complexities of real-world requirements. Predictive modeling is very effective when it is applied to very specific problems with a well-defined scope that is visible in past data. But most real-world problems are more readily defined with rather complex rules and mathematical formulas. This leaves predictive modeling with less ground to stand on, which means fewer applicable domains.

Rule-based modeling provides an environment where predictive models can be used in conjunction with rules, mathematical formulas and clustering (or segmentation) methods. In other words, rules and formulas are used to model real-world problems, while predictive models and clustering schemes are used as functions inside those rules and formulas. This provides a well-suited modeling environment for intelligent applications that involve complex rules and predictive models.
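
The idea of calling a predictive model as a function inside rules can be sketched in plain Python. The example below is only an illustration under stated assumptions: `Application`, `default_risk`, `decide`, and all thresholds are hypothetical, and `default_risk` is a hand-written stand-in rather than a trained model.

```python
from dataclasses import dataclass

@dataclass
class Application:
    age: int
    income: float
    loan_amount: float

def default_risk(app):
    """Stand-in for a trained predictive model; returns a score in [0, 1].

    Hypothetical formula for illustration only, not a fitted model.
    """
    ratio = app.loan_amount / max(app.income, 1.0)
    return min(1.0, ratio / 10.0)

def decide(app):
    """Rules written as ordinary conditions; the model is one input among them."""
    if app.age < 18:                       # hard business rule
        return "reject"
    if app.loan_amount > 5 * app.income:   # formula-based rule
        return "refer"
    if default_risk(app) > 0.3:            # predictive model used inside a rule
        return "refer"
    return "approve"

print(decide(Application(age=30, income=50000, loan_amount=100000)))
```

The hard rule and the formula handle requirements that are known in advance, and the model score is consulted only where past data is the best available guide.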

RME-EP is a Rete-like expert system shell. It provides a mix of predictive modeling and rule-based model evaluation by incorporating Rete-style rule evaluation with predictive modeling. The Rete engine is a de facto industry standard technique for rule-based expert systems, and rule engines of this kind are also known as expert system shells. RME-EP combines the two using its intuitive, SQL-like rule specification language. The following is an example of an RME-EP rule-based model. By combining rules and time-series regression, the model detects new sales trends and raises alerts automatically:

// SALES TREND MONITORING;
RULE 1: // if sales dropped over 5% with correlation coefficient 0.3 and over;
IF TIMESERIES(RR, LINEAR, 1, Month1, Month2, Month3, Month4, Month5, Month6) > 0.3 
     AND TIMESERIES(GROWTH_RATE, LINEAR, 1, Month1, Month2, Month3, Month4, Month5, Month6) < -0.05 
  THEN SET sales.trend AS 'declining' END;

RULE 2:  // if sales increased over 15% with correlation coefficient 0.3 and over;
IF TIMESERIES(RR, LINEAR, 1, Month1, Month2, Month3, Month4, Month5, Month6) > 0.3 
     AND TIMESERIES(GROWTH_RATE, LINEAR, 1, Month1, Month2, Month3, Month4, Month5, Month6) > 0.15 
  THEN SET sales.trend AS 'increasing' END;

RULE 3: // alert if something detected;
IF sales.trend IS NOT NULL 
   THEN THROWEVENT('alert', Region, Channels, sales.trend) END;
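
For readers unfamiliar with the rule language, the logic of rules 1 and 2 can be approximated in Python. This is a rough sketch, not a description of how RME-EP evaluates TIMESERIES: it assumes that RR is the R-squared of a least-squares line through the six monthly values, and that GROWTH_RATE is the fitted change per month relative to the fitted first month. Both definitions, the function names, and the sample figures are assumptions, and the third TIMESERIES argument is not modeled.

```python
def linear_fit(values):
    """Least-squares line through (1, v1), (2, v2), ...

    Returns (slope, intercept, r_squared).
    """
    n = len(values)
    xs = range(1, n + 1)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    syy = sum((y - mean_y) ** 2 for y in values)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r_squared

def sales_trend(months, min_fit=0.3, drop=-0.05, rise=0.15):
    """Return 'declining', 'increasing', or None, mirroring rules 1 and 2."""
    slope, intercept, r_squared = linear_fit(months)
    first_fitted = intercept + slope  # fitted value for month 1
    if first_fitted == 0:
        return None
    # Assumed definition of growth rate: fitted monthly change / fitted month 1.
    growth_rate = slope / first_fitted
    if r_squared > min_fit and growth_rate < drop:
        return "declining"
    if r_squared > min_fit and growth_rate > rise:
        return "increasing"
    return None

# Hypothetical six-month sales figures for one region.
print(sales_trend([100, 92, 85, 80, 71, 65]))
```

Requiring a minimum fit before looking at the growth rate keeps noisy series, where a fitted line explains little, from triggering an alert.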

Knowledge-embedded Predictive Balanced Scorecards

Balanced scorecards provide concise, predictive and actionable information about how a company is performing and may perform in the future. Knowledge-enhanced predictive balanced scorecards can improve business visibility by combining balanced scorecards with predictive modeling and with business logic expressed through expert systems. For more, please read The Predictive Balanced Scorecard.

Audit and Fraud detection

Real-time audit of business transactions is required for various reasons. Laws and government regulations prohibit certain types of transactions, and internal policies may forbid others. In addition, auditing can support fraud detection and risk management. The RME-EP engine is an excellent platform for implementing real-time auditing systems. With the expressive power of its rule language integrated with predictive modeling, complex auditing rules can be expressed very easily. For more, please read real-time transaction audit and event monitoring systems.


For more information and a software trial, please write to us.