CHAID Visualizations

The following tables and options are available for CHAID visualizations.

Model Evaluation panel

For classification models, the Model Evaluation panel shows a bar graph of the overall prediction accuracy (the proportion of correct predictions) and a table of evaluation statistics. The graph is not shown if the prediction accuracy is exactly 0. The evaluation statistics include the overall accuracy and a series of weighted measures. Each weighted measure is calculated by treating each category of the target field in turn as the category of interest (the positive response) and then averaging the resulting statistics across categories, with weights proportional to the observed proportion of instances in each category. The weighted measures include the true positive rate (TPR), false positive rate (FPR), precision, recall, and the F1 measure, which is the harmonic mean of precision and recall. Because the weights are based on observed proportions, the weighted true positive rate and weighted recall are both equal to the overall accuracy.
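
The weighting scheme described above can be sketched in a few lines of Python. This is an illustrative sketch, not the product's implementation: the confusion matrix values are made up, the helper names (safe_div, weighted_metrics) are hypothetical, undefined ratios are assumed to be reported as 0, and the weighted F1 is assumed to be the weighted average of the per-category F1 values.

```python
# Hypothetical 3-category confusion matrix: rows are observed categories,
# columns are predicted categories.
confusion = [
    [50, 5, 5],
    [4, 30, 6],
    [2, 3, 15],
]


def safe_div(numerator, denominator):
    # Assumption: an undefined ratio (zero denominator) is reported as 0.0.
    return numerator / denominator if denominator else 0.0


def weighted_metrics(cm):
    k = len(cm)
    total = sum(sum(row) for row in cm)
    result = {"tpr": 0.0, "fpr": 0.0, "precision": 0.0, "recall": 0.0, "f1": 0.0}
    for i in range(k):
        # Treat category i as the category of interest (positive response).
        tp = cm[i][i]
        fn = sum(cm[i]) - tp
        fp = sum(cm[r][i] for r in range(k)) - tp
        tn = total - tp - fn - fp
        # Weight is the observed proportion of instances in category i.
        weight = (tp + fn) / total
        recall = safe_div(tp, tp + fn)
        precision = safe_div(tp, tp + fp)
        per_category = {
            "tpr": recall,
            "fpr": safe_div(fp, fp + tn),
            "precision": precision,
            "recall": recall,
            "f1": safe_div(2 * precision * recall, precision + recall),
        }
        for name, value in per_category.items():
            result[name] += weight * value
    result["accuracy"] = sum(cm[i][i] for i in range(k)) / total
    return result


metrics = weighted_metrics(confusion)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In this sketch the weighted TPR and weighted recall come out equal to the overall accuracy, because each category's weight cancels the denominator of its recall and the sum reduces to the total of correct predictions divided by the total number of records.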

For regression models, the panel shows a bar graph of R² (the coefficient of determination) as a measure of prediction accuracy, and a table with the same measure.
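
As a point of reference, the following sketch computes the coefficient of determination using the common definition of 1 minus the ratio of the residual sum of squares to the total sum of squares. The page does not state which formula the panel uses, so treat this definition as an assumption; the function name r_squared and the sample values are hypothetical.

```python
def r_squared(observed, predicted):
    # Assumed definition: 1 - SS_residual / SS_total.
    mean_observed = sum(observed) / len(observed)
    ss_total = sum((y - mean_observed) ** 2 for y in observed)
    ss_residual = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1.0 - ss_residual / ss_total


# Made-up target values and model predictions.
observed = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 6.5, 9.5]

r2 = r_squared(observed, predicted)
print(f"R squared = {r2:.2f}")
```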

Model Information table

This table shows the type of model fitted, the target field, the number of input features, and the size of the resulting tree model.

Predictor Importance chart

This chart displays bars representing the predictors in descending order of relative importance for predicting the target, as determined by a variance-based sensitivity analysis algorithm. The values for each predictor are scaled so that they add to 1.
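
The scaling and ordering can be illustrated with a short sketch. It does not reproduce the sensitivity analysis itself; it only assumes that some raw importance score is already available for each predictor. The predictor names and raw scores are made up.

```python
# Hypothetical raw importance scores from a sensitivity analysis.
raw_importance = {"income": 0.45, "age": 0.9, "region": 0.15}

# Scale the scores so that they sum to 1.
total = sum(raw_importance.values())
scaled = {name: value / total for name, value in raw_importance.items()}

# Order the predictors from most to least important, as in the chart.
ranked = sorted(scaled.items(), key=lambda item: item[1], reverse=True)

for name, value in ranked:
    print(f"{name}: {value:.2f}")
```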

Top Decision Rules table

This table shows the decision rules leading to the terminal (leaf) nodes that contain the highest percentages of the total records. Each rule is a list of conditions that defines how the algorithm partitions the data, and can be used to assign individual records to child nodes based on their predictor values.
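
To make the idea of a rule as a list of conditions concrete, the sketch below assigns records to leaf nodes and ranks the nodes by their share of the records. The rule representation, field names, node IDs, and records are all hypothetical and are not the format the product uses.

```python
import operator
from collections import Counter

# Supported comparison operators for the hypothetical rule format.
OPS = {
    "<=": operator.le,
    ">": operator.gt,
    "in": lambda value, allowed: value in allowed,
}

# Each rule lists the conditions on the path from the root to one leaf node.
rules = [
    {"node": 3, "conditions": [("age", "<=", 30), ("region", "in", {"north", "east"})]},
    {"node": 4, "conditions": [("age", "<=", 30), ("region", "in", {"south", "west"})]},
    {"node": 2, "conditions": [("age", ">", 30)]},
]


def assign_node(record, rules):
    # Return the ID of the first leaf whose conditions all hold, else None.
    for rule in rules:
        if all(OPS[op](record[field], value) for field, op, value in rule["conditions"]):
            return rule["node"]
    return None


records = [
    {"age": 25, "region": "east"},
    {"age": 45, "region": "south"},
    {"age": 52, "region": "north"},
    {"age": 19, "region": "west"},
    {"age": 38, "region": "east"},
]

# Rank leaf nodes by the share of records they receive.
counts = Counter(assign_node(record, rules) for record in records)
top_rules = [(node, count / len(records)) for node, count in counts.most_common()]

for node, share in top_rules:
    print(f"node {node}: {share:.0%} of records")
```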

Tree Diagram

This diagram illustrates the derived tree structure for the model. Two check boxes control whether labels are shown on nodes and on branches. Hovering over a branch displays the split field and split value. Hovering over a node displays a table with the node ID, the model's score for records in that node, the confidence, the count and percentage of the total records in that node, and the number of records at each value of the target.

Confusion Matrix (Classification Table)

The confusion matrix, or classification table, contains a cross-classification of observed by predicted labels or groups. The numbers of correct predictions are shown in the cells along the main diagonal. Percentages of correct predictions are shown for each row, for each column, and overall:

  • The percent correct for each row shows what percentage of the observations with that observed label were correctly predicted by the model. If a given label is considered a target label, this is known as sensitivity, recall or true positive rate (TPR). In a 2 x 2 confusion matrix, if one label is considered the non-target label, the percentage for that row is known as the specificity or true negative rate (TNR).
  • The percent correct for each column shows the percentage of observations with that predicted label that were correctly predicted. If a given predicted label is considered a target label, this is known as precision or positive predictive value (PPV). For a 2 x 2 confusion matrix, if one label is considered the non-target label, the percentage for that column is known as the negative predictive value (NPV).
  • The percent correct at the bottom right of the table gives the overall percentage of correctly classified observations, known as the overall accuracy.
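
The three kinds of percentages can be reproduced from the cell counts. The following sketch uses a made-up 2 x 2 matrix with rows as observed labels and columns as predicted labels, and assumes the first label is the target label; the variable names are illustrative only.

```python
# Hypothetical 2 x 2 confusion matrix: rows are observed, columns are predicted.
# Index 0 is assumed to be the target label, index 1 the non-target label.
cm_2x2 = [
    [40, 10],
    [5, 45],
]

k = len(cm_2x2)
total = sum(sum(row) for row in cm_2x2)

# Row percentages: sensitivity (TPR) for the target row, specificity (TNR) for the other.
row_pct = [100 * cm_2x2[i][i] / sum(cm_2x2[i]) for i in range(k)]

# Column percentages: precision (PPV) for the target column, NPV for the other.
col_pct = [100 * cm_2x2[j][j] / sum(cm_2x2[r][j] for r in range(k)) for j in range(k)]

# Bottom-right cell: overall accuracy.
overall_pct = 100 * sum(cm_2x2[i][i] for i in range(k)) / total

print(f"sensitivity: {row_pct[0]:.1f}%, specificity: {row_pct[1]:.1f}%")
print(f"precision: {col_pct[0]:.1f}%, NPV: {col_pct[1]:.1f}%")
print(f"overall accuracy: {overall_pct:.1f}%")
```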

Next steps

Like your visualization? Why not deploy it? For more information, see Deploy a model.