AutoAI automatically prepares data, applies algorithms, and builds model pipelines that are best suited for your data and use case. Learn how to generate the model pipelines that you can save as machine learning models.
Follow these steps to upload data and have AutoAI create the best model for your data and use case.
Collect and prepare your training data. For details on allowable data sources, see AutoAI overview.
Note:
If you are creating an experiment with a single training data source, you have the option of using a second data source specifically as testing, or holdout, data for validating the pipelines.
Open the AutoAI tool
AutoAI uses the default storage that is associated with your project to store your data and to save model results.
Open your project.
Click the Assets tab.
Click New asset > Build machine learning models or Retrieval-augmented generation patterns automatically.
Note: After you create an AutoAI asset, it appears on the Assets page for your project in the **AutoAI experiments** section, so you can return to it.
Specify details of your experiment
Specify a name and description for your experiment.
Select a machine learning service instance and provide task credentials if prompted. Then click Create.
Choose data from your project or upload it from your file system or from the asset browser, then press Continue. Click the preview icon to review your data. (Optional) Add a second file as holdout data for testing the trained
pipelines.
Choose the Column to predict: the column that contains the values you want the experiment to predict.
Based on analyzing a subset of the data set, AutoAI selects a default model type: binary classification, multiclass classification, or regression. Binary classification is selected if the target column has two possible values. Multiclass classification is selected if the target column has a discrete set of three or more values. Regression is selected if the target column contains a continuous numeric variable. You can optionally override this selection.
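The selection rule described above can be sketched in a few lines of Python. This is only an illustrative heuristic, not AutoAI's actual logic: the function name `infer_prediction_type` is hypothetical, and treating "continuous numeric" as "numeric with at least one non-integer value" is a simplifying assumption.

```python
from typing import Sequence


def infer_prediction_type(target: Sequence[object]) -> str:
    """Guess a prediction type from the values of a target column.

    Hypothetical helper for illustration only; it mirrors the rule in the
    text (two values, three or more discrete values, continuous numeric).
    """
    values = [v for v in target if v is not None]
    unique = set(values)

    # Two possible values: binary classification.
    if len(unique) == 2:
        return "binary"

    # Assumption: a column counts as continuous when every value is numeric
    # and at least one value has a fractional part.
    all_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    has_fraction = any(isinstance(v, float) and not v.is_integer() for v in values)
    if all_numeric and has_fraction:
        return "regression"

    # A discrete set of three or more values: multiclass classification.
    if len(unique) >= 3:
        return "multiclass"

    raise ValueError("target column needs at least two distinct values")
```

A column of integer codes with many distinct values is ambiguous under this heuristic, which is one reason the product lets you override the default selection.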
Note: The limit on values to classify is 200. Creating a classification experiment with many unique values in the prediction column is resource-intensive and affects the experiment's performance and training time. To maintain the quality of the experiment:
- AutoAI chooses a default metric for optimizing. For example, the default metric for a binary classification model is *Accuracy*.
- By default, 10% of the training data is held out to test the performance of the model.
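To make the two numbers in this note concrete, here is a minimal sketch of a 200-value check and a 10% holdout split. The names `check_class_limit` and `split_holdout` are hypothetical, and the seeded random shuffle is an assumption; the source does not say how AutoAI selects the holdout rows.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")

MAX_CLASS_VALUES = 200           # limit on values to classify, from the note
DEFAULT_HOLDOUT_FRACTION = 0.10  # default share of training data held out


def check_class_limit(target: Sequence[object], limit: int = MAX_CLASS_VALUES) -> int:
    """Return the number of unique target values, or raise if over the limit."""
    n_unique = len(set(target))
    if n_unique > limit:
        raise ValueError(f"{n_unique} unique values exceeds the limit of {limit}")
    return n_unique


def split_holdout(
    rows: Sequence[T],
    fraction: float = DEFAULT_HOLDOUT_FRACTION,
    seed: int = 0,
) -> Tuple[List[T], List[T]]:
    """Split rows into (train, holdout) using a seeded random selection.

    Assumption: rows are chosen uniformly at random; the real tool may
    sample differently (for example, stratified by class).
    """
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must be between 0 and 1")
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_holdout = round(len(rows) * fraction)
    holdout_idx = set(indices[:n_holdout])
    # Preserve the original row order within each partition.
    train = [row for i, row in enumerate(rows) if i not in holdout_idx]
    holdout = [row for i, row in enumerate(rows) if i in holdout_idx]
    return train, holdout
```

With 100 rows and the default fraction, this sketch yields 90 training rows and 10 holdout rows.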
Click Run Experiment to begin model pipeline creation.
An infographic shows you the creation of pipelines for your data. The duration of this phase depends on the size of your data set. A notification message informs you whether the processing time will be brief or will require more time. You can work in other parts of the product while the pipelines build.
Hover over nodes in the infographic to explore the factors that pipelines share and the properties that make each pipeline unique. For a guide to the data in the infographic, click the Legend tab in the information panel. To see a different view of the pipeline creation, click the Experiment details tab of the notification pane, then click Switch views to view the progress map. In either view, click a pipeline node to view the associated pipeline in the leaderboard.
View the results
When the pipeline generation process completes, you can view the ranked model candidates and evaluate them before you save a pipeline as a model.
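As a rough illustration of what a ranked list of candidates means, the sketch below scores each candidate's holdout predictions by accuracy and sorts them from best to worst. The functions `accuracy` and `rank_candidates` and the candidate names in the usage note are hypothetical; the actual leaderboard ranks by whichever metric the experiment optimizes.

```python
from typing import Dict, List, Sequence, Tuple


def accuracy(y_true: Sequence[object], y_pred: Sequence[object]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def rank_candidates(
    y_true: Sequence[object],
    predictions: Dict[str, Sequence[object]],
) -> List[Tuple[str, float]]:
    """Return (candidate name, accuracy) pairs sorted from best to worst."""
    scored = [(name, accuracy(y_true, y_pred)) for name, y_pred in predictions.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

For example, calling `rank_candidates` with holdout labels and a dictionary such as `{"candidate_a": [...], "candidate_b": [...]}` returns the candidates ordered by their holdout accuracy, which is the kind of comparison you make before saving one pipeline as a model.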