Selecting an AutoAI model

AutoAI automatically prepares data, applies algorithms, and attempts to build model pipelines best suited for your data and use case. This topic describes how to evaluate the model pipelines.

During AutoAI training, your data set is split into a training part and a holdout part. The training part is used by the AutoAI training stages to generate the AutoAI model pipelines and the cross-validation scores that are used to rank them. After AutoAI training, the holdout part is used to evaluate the resulting pipeline models and to compute performance information such as ROC curves and confusion matrices, which are shown in the leaderboard. The training/holdout split ratio is 90/10.
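As a rough sketch of this protocol, the following scikit-learn example reproduces the same split-and-rank flow: hold out 10% of the data, rank candidate models by cross-validation score on the remaining 90%, then report a holdout score. The data set and candidate estimators here are placeholders, not what AutoAI actually uses internally.

```python
# Conceptual sketch of AutoAI's 90/10 train/holdout protocol (scikit-learn).
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# 90% for training and cross-validation, 10% held out for final evaluation.
X_train, X_hold, y_train, y_hold = train_test_split(
    X, y, test_size=0.1, random_state=42, stratify=y
)

candidates = {
    "logistic_regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "random_forest": RandomForestClassifier(random_state=42),
}

for name, model in candidates.items():
    # Rank candidates by cross-validation score on the training part only.
    cv_auc = cross_val_score(model, X_train, y_train, scoring="roc_auc").mean()
    # The holdout score is computed once, after training, for reporting.
    holdout_auc = roc_auc_score(
        y_hold, model.fit(X_train, y_train).predict_proba(X_hold)[:, 1]
    )
    print(f"{name}: cv_auc={cv_auc:.3f}, holdout_auc={holdout_auc:.3f}")
```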

As the training progresses, you are presented with a dynamic infographic and leaderboard. Hover over nodes in the infographic to see the factors that pipelines share as well as the properties that make a pipeline unique. For a guide to the data in the infographic, click the Legend tab in the information pane. Or, to see a different view of pipeline creation, click the Experiment details tab of the notification pane, then click Switch views to view the progress map. In either view, click a pipeline node to view the associated pipeline in the leaderboard. The leaderboard contains the model pipelines, ranked by cross-validation score.

View the pipeline transformations

Hover over a node in the infographic to view the transformations for a pipeline. The sequence consists of a pre-processing transformer and, if feature engineering was performed for the pipeline, a sequence of data transformers. The algorithm is determined by the model selection and optimization steps during AutoAI training.

Pipeline transformation for AutoAI models

See Implementation details to review the technical details for creating the pipelines.
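As an illustration of this shape (a pre-processing transformer, optional feature-engineering transformers, then the selected algorithm), here is a minimal scikit-learn pipeline. The specific steps are examples only; they are not the transformers or algorithm AutoAI would choose for your data.

```python
# Illustrative pipeline with the same structure AutoAI describes.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

pipeline = Pipeline(
    steps=[
        ("preprocess", SimpleImputer(strategy="median")),  # pre-processing transformer
        ("scale", StandardScaler()),                       # feature-engineering transformer
        ("features", PolynomialFeatures(degree=2)),        # feature-engineering transformer
        ("estimator", GradientBoostingClassifier()),       # algorithm from model selection
    ]
)

# Fit and predict on synthetic data to show the pipeline end to end.
X, y = make_classification(n_samples=200, random_state=0)
pipeline.fit(X, y)
print(pipeline.predict(X[:5]))
```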

View the leaderboard

Each model pipeline is scored on a variety of metrics and then ranked. The default ranking metric is the area under the ROC curve for binary classification models, accuracy for multi-class classification models, and root mean squared error (RMSE) for regression models. The highest-ranked pipelines are displayed in a leaderboard, where you can view more information about them. The leaderboard also provides the option to save selected model pipelines after reviewing them.

Leaderboard AutoAI models
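For reference, the three default ranking metrics can be computed with scikit-learn as follows. The labels and predictions below are invented purely to show how each metric is calculated.

```python
# The three default ranking metrics, shown on toy predictions.
from sklearn.metrics import accuracy_score, mean_squared_error, roc_auc_score

# Binary classification: area under the ROC curve (uses predicted probabilities).
y_true_bin, y_prob = [0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]
print("ROC AUC:", roc_auc_score(y_true_bin, y_prob))

# Multi-class classification: accuracy (uses predicted labels).
y_true_mc, y_pred_mc = [0, 1, 2, 2], [0, 2, 2, 2]
print("Accuracy:", accuracy_score(y_true_mc, y_pred_mc))

# Regression: root mean squared error (square root of the MSE).
y_true_reg, y_pred_reg = [3.0, 5.0, 2.5], [2.5, 5.0, 4.0]
print("RMSE:", mean_squared_error(y_true_reg, y_pred_reg) ** 0.5)
```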

You can evaluate the pipelines as follows:

  • Click a pipeline in the leaderboard to view more detail about the metrics and performance.
  • Click Compare to view how the top pipelines compare.
  • Sort the leaderboard by a different metric.

Expanding an AutoAI pipeline

View the confusion matrix

One of the details you can view for a pipeline for a binary classification experiment is a Confusion matrix.

The confusion matrix is based on the holdout data: the portion of the data set that is not used to train the model pipeline, but only to measure its performance on data that it did not see during training.

In a binary classification problem with a positive class and a negative class, the confusion matrix summarizes the pipeline model's positive and negative predictions in four quadrants, according to whether each prediction is correct with respect to the positive or negative class label in the holdout data set.

For example, the Bank sample experiment seeks to identify customers that will take promotions offered to them. The confusion matrix for the top-ranked pipeline is:

Confusion matrix


The positive class is ‘yes’ (meaning a customer will take the promotion), so you can see that the count of true negatives, that is, customers that the model correctly predicted would refuse the promotion, is fairly high.
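The following sketch shows how the four quadrants relate to the positive class. The holdout labels and predictions are invented to mimic the Bank sample's ‘yes’/‘no’ classes; they are not the actual experiment data.

```python
# Sketch of the four confusion-matrix quadrants on synthetic holdout labels,
# with 'yes' as the positive class as in the Bank sample.
from sklearn.metrics import confusion_matrix

y_holdout = ["no", "no", "no", "yes", "yes", "no", "yes", "no"]
y_pred    = ["no", "no", "yes", "yes", "no", "no", "yes", "no"]

# With labels=["no", "yes"], rows are actual and columns are predicted:
# [[true negatives,  false positives],
#  [false negatives, true positives]]
tn, fp, fn, tp = confusion_matrix(y_holdout, y_pred, labels=["no", "yes"]).ravel()
print(f"TN={tn} FP={fp} FN={fn} TP={tp}")
```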

Click the items in the navigation menu to view other details about the selected pipeline. For example, Feature importance shows which data features contribute most to your prediction output.
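AutoAI computes its own feature importance scores, so the following is only a conceptual sketch of the idea: train a model and rank the input features by how much each one contributes to the predictions. The estimator and impurity-based scores here are one common approach, not necessarily the method AutoAI uses.

```python
# Illustrative feature-importance ranking with a tree ensemble.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()
model = RandomForestClassifier(random_state=0).fit(data.data, data.target)

# Rank features by the model's impurity-based importance scores.
ranked = sorted(zip(data.feature_names, model.feature_importances_),
                key=lambda pair: pair[1], reverse=True)
for name, score in ranked[:5]:
    print(f"{name}: {score:.3f}")
```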

Save a pipeline as a model

When you are satisfied with a pipeline, click Save model to save the candidate as a model to your project so you can test and deploy it.

Next steps

Promote the trained model to a deployment space so that you can test it with new data and generate predictions.