Rosella Predictive Knowledge & Data Mining
CramerTree - Cramer Decision Tree
The quality of a predictive model is measured by its accuracy on unseen data. For decision trees, two factors can be measured: accuracy and the number of nodes. The latter is an important indicator of how a tree will behave on unseen data. For the same or similar accuracy, fewer nodes mean that the tree was constructed with more general splitting criteria, so it is likely to work better on unseen data. Conversely, a higher node count implies that the tree splits on variables with large numbers of values, which can hurt accuracy on unseen data.
CramerTree uses Cramer coefficients as its node splitting criterion. In general, it produces the best trees as measured by both accuracy and number of nodes. Most node splitting criteria inherently favor splits with many branches, which tends to result in trees with a large number of nodes. The Cramer coefficient offsets this tendency by taking degrees of freedom into account, so it tends to produce the most compact decision trees. Fewer branching nodes mean larger nodes, that is, more records per node, which translates to higher statistical support. Generally, this leads to better predictive accuracy on unseen data. We performed three experiments on personnel data as follows.
The following is the outcome of the experiments. Note that "(b)" denotes binary splits. The table shows that CramerTree in general produces trees with fewer nodes while maintaining high accuracy. This justifies the use of CramerTree as the default splitting criterion.
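To make the splitting criterion discussed above more concrete, here is a minimal Python sketch. It assumes that the Cramer coefficient is the standard Cramér's V statistic, V = sqrt(chi2 / (n * min(r - 1, c - 1))), computed between a candidate splitting variable and the class label. How CramerTree computes or applies the coefficient internally is not stated here, so this is an illustration only. The function names cramers_v and best_split and the toy records are hypothetical and are not taken from the experiments.

```python
from collections import Counter
from math import sqrt


def cramers_v(split_values, class_labels):
    """Cramér's V between a candidate splitting variable and the class label.

    Assumption: the standard definition sqrt(chi2 / (n * min(r - 1, c - 1))),
    where r is the number of branches and c the number of classes.
    """
    n = len(class_labels)
    cell = Counter(zip(split_values, class_labels))  # observed (branch, class) counts
    row = Counter(split_values)                      # records per branch
    col = Counter(class_labels)                      # records per class
    dof = min(len(row) - 1, len(col) - 1)            # degrees-of-freedom term
    if n == 0 or dof == 0:
        return 0.0                                   # no usable split
    chi2 = 0.0
    for r, r_count in row.items():
        for c, c_count in col.items():
            expected = r_count * c_count / n
            observed = cell.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    return sqrt(chi2 / (n * dof))


def best_split(variables, class_labels):
    """Return (name, V) for the variable with the highest Cramér's V.

    `variables` maps a variable name to its list of values (hypothetical layout).
    """
    scores = {name: cramers_v(values, class_labels)
              for name, values in variables.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Made-up toy records, for illustration only.
labels = ["stay", "stay", "leave", "leave", "stay", "leave", "stay", "leave"]
variables = {
    "dept":  ["a", "a", "b", "b", "a", "b", "a", "b"],
    "shift": ["d", "n", "d", "n", "d", "n", "d", "n"],
}

print(cramers_v(variables["dept"], labels))   # 1.0
print(cramers_v(variables["shift"], labels))  # 0.5
print(best_split(variables, labels))          # ('dept', 1.0)
```

Dividing the chi-square statistic by n * min(r - 1, c - 1) is what keeps a variable with many values from scoring well merely because it creates many branches, which is the effect of the degrees of freedom described above.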
For more, read decision trees and drill-down analysis.
For information about the software, please read Data Mining Software. The software can be downloaded from that page.
Applications of Decision Tree Classification Predictive Modeling
Find out how decision trees are used in Database Marketing.
Find out how decision trees are used in Targeted Marketing.
Find out how decision trees are used in Direct Marketing.
Find out how decision trees are used in Direct Mail Marketing.
Find out how decision trees are used in Credit Predictive Modeling and Credit Scoring.