Auto Classifier Visualizations

The following tables and options are available for Auto Classifier visualizations.

Models table

Contains a set of one or more models generated by the Auto Classifier node, each of which can be individually viewed or selected for use in scoring. For each model, the table shows the model type, the time taken to build the model, the number of fields or features used, the prediction accuracy, and a thumbnail bar chart of observed and predicted counts for each target category. The Actions column lets you view the details of an individual model or remove it from the screen (removing a model here does not change the results in the Ensemble Model Evaluation panel or the Ensemble Confusion Matrix).

Ensemble Model Evaluation panel

This panel displays a bar graph of the overall prediction accuracy (the proportion of correct predictions) and a table of evaluation statistics. If the prediction accuracy is exactly 0, the graph is not shown. The evaluation statistics include the overall accuracy and a series of measures computed by treating each category of the target field in turn as the category of interest (the positive response) and averaging the resulting statistics across categories, with weights proportional to the observed proportion of instances in each category. The weighted measures include the true and false positive rates (TPR and FPR), precision, recall, and the F1 measure, which is the harmonic mean of precision and recall. When weighted in this manner (by observed proportions), the weighted true positive rate and weighted recall are identical to the overall accuracy.
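
To make the weighting concrete, the following minimal Python sketch (not part of the product; the category labels and counts are purely hypothetical) computes per-category recall, precision, and F1 from a small confusion matrix and averages them with weights equal to the observed category proportions. Running it confirms that the weighted recall equals the overall accuracy, as described above.

    # Hypothetical 3-category confusion matrix: rows = observed label, columns = predicted label.
    labels = ["A", "B", "C"]
    cm = [
        [50,  5,  5],   # observed A
        [ 8, 30,  2],   # observed B
        [ 4,  6, 40],   # observed C
    ]

    total = sum(sum(row) for row in cm)
    overall_accuracy = sum(cm[i][i] for i in range(len(labels))) / total

    weighted_recall = weighted_precision = weighted_f1 = 0.0
    for i, label in enumerate(labels):
        observed = sum(cm[i])                    # instances with this observed label
        predicted = sum(row[i] for row in cm)    # instances predicted as this label
        recall = cm[i][i] / observed             # TPR for this category
        precision = cm[i][i] / predicted         # PPV for this category
        f1 = 2 * precision * recall / (precision + recall)
        weight = observed / total                # observed proportion of the category
        weighted_recall += weight * recall
        weighted_precision += weight * precision
        weighted_f1 += weight * f1

    print(f"overall accuracy:   {overall_accuracy:.3f}")
    print(f"weighted recall:    {weighted_recall:.3f}")    # same as overall accuracy
    print(f"weighted precision: {weighted_precision:.3f}")
    print(f"weighted F1:        {weighted_f1:.3f}")

Because each per-category recall is weighted by that category's share of the observations, the weighted sum reduces to the number of correct predictions divided by the total number of observations, which is exactly the overall accuracy.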

Note that all evaluation measures are based on the full ensemble incorporating all models run in the Auto Classifier node. The results do not change if one or more models are removed from the ensemble in the output node.

Confusion Matrix (Classification Table)

The confusion matrix, or classification table, contains a cross-classification of observed by predicted labels or groups. The numbers of correct predictions appear in the cells along the main diagonal. Correct percentages are shown for each row, each column, and overall (a worked 2 x 2 example follows the list):

  • The percent correct for each row shows what percentage of the observations with that observed label were correctly predicted by the model. If a given label is considered a target label, this is known as sensitivity, recall or true positive rate (TPR). In a 2 x 2 confusion matrix, if one label is considered the non-target label, the percentage for that row is known as the specificity or true negative rate (TNR).
  • The percent correct for each column shows the percentage of observations with that predicted label that were correctly predicted. If a given predicted label is considered a target label, this is known as precision or positive predictive value (PPV). For a 2 x 2 confusion matrix, if one label is considered the non-target label, the percentage for that column is known as the negative predictive value (NPV).
  • The percent correct at the bottom right of the table gives the overall percentage of correctly classified observations, known as the overall accuracy.
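
As a worked example of these row and column percentages, here is a minimal Python sketch for a hypothetical 2 x 2 confusion matrix with "yes" as the target label (the label names and counts are illustrative only, not output from the node):

    # Hypothetical 2 x 2 confusion matrix, "yes" = target label, "no" = non-target label.
    tp = 80   # observed yes, predicted yes (true positives)
    fn = 20   # observed yes, predicted no  (false negatives)
    fp = 10   # observed no,  predicted yes (false positives)
    tn = 90   # observed no,  predicted no  (true negatives)
    total = tp + fn + fp + tn

    sensitivity = tp / (tp + fn)          # row percent correct for "yes": recall / TPR
    specificity = tn / (tn + fp)          # row percent correct for "no": TNR
    ppv = tp / (tp + fp)                  # column percent correct for "yes": precision / PPV
    npv = tn / (tn + fn)                  # column percent correct for "no": NPV
    overall_accuracy = (tp + tn) / total  # bottom-right percent correct

    print(f"sensitivity (recall, TPR): {sensitivity:.2f}")      # 0.80
    print(f"specificity (TNR):         {specificity:.2f}")      # 0.90
    print(f"PPV (precision):           {ppv:.2f}")              # 0.89
    print(f"NPV:                       {npv:.2f}")              # 0.82
    print(f"overall accuracy:          {overall_accuracy:.2f}")  # 0.85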

Note that all evaluation measures are based on the full ensemble incorporating all models run in the Auto Classifier node. The results do not change if one or more models are removed from the ensemble in the output node.

Next steps

Like your visualization? Why not deploy it? For more information, see Deploy a model.