AutoAI Overview

The AutoAI graphical tool in Watson Studio automatically analyzes your data and generates candidate model pipelines customized for your predictive modeling problem. These pipelines are created iteratively as the AutoAI algorithms learn more about your dataset and discover the data transformations, estimator algorithms, and parameter settings that work best for your problem. Results are displayed on a leaderboard that ranks the automatically generated model pipelines according to the optimization objective for your problem.

Required service: Watson Machine Learning service
Data format: Tabular (CSV files)
Data size: Less than 100 MB

For more information on choosing the right tool for your data and use case, see Choosing a tool.

AutoAI process

Using AutoAI, you can build and deploy a machine learning model with sophisticated training features and no coding. The tool does most of the work for you.

Figure: The AutoAI process

The AutoAI process follows this sequence to build candidate pipelines:

Data pre-processing

Most datasets contain mixed data formats and missing values, but standard machine learning algorithms expect numeric input with no missing values. AutoAI applies various algorithms to analyze, clean, and prepare your raw data for machine learning. It automatically detects and categorizes features based on data type, such as categorical or numerical. Depending on the categorization, it uses hyperparameter optimization to determine the best combination of strategies for missing value imputation, feature encoding, and feature scaling for your data.
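The following is a minimal sketch of the three pre-processing operations named above (missing value imputation, feature encoding, and feature scaling), written with the Python standard library only. It is not AutoAI code. The function names are hypothetical, and the strategies are fixed here as assumptions (mean or most-frequent imputation, one-hot encoding, standard scaling), whereas AutoAI searches for the best combination of strategies.

```python
from statistics import mean, pstdev

MISSING = (None, "")  # assumption: how missing cells appear after reading a CSV


def is_number(value):
    """Return True if the value can be parsed as a float."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


def detect_type(values):
    """Classify a column as 'numerical' or 'categorical' from its non-missing cells."""
    present = [v for v in values if v not in MISSING]
    if present and all(is_number(v) for v in present):
        return "numerical"
    return "categorical"


def preprocess_column(values):
    """Impute missing cells, then scale (numerical) or one-hot encode (categorical)."""
    kind = detect_type(values)
    present = [v for v in values if v not in MISSING]
    if kind == "numerical":
        fill = mean(float(v) for v in present)            # mean imputation
        filled = [fill if v in MISSING else float(v) for v in values]
        mu = mean(filled)
        sigma = pstdev(filled) or 1.0                     # avoid dividing by zero
        return kind, [[(x - mu) / sigma] for x in filled]  # standard scaling
    # Most-frequent imputation; ties resolve to the alphabetically first value.
    fill = max(sorted(set(present)), key=present.count) if present else "missing"
    filled = [fill if v in MISSING else v for v in values]
    categories = sorted(set(filled))
    # One-hot encoding: one 0/1 indicator per category.
    return kind, [[1.0 if v == c else 0.0 for c in categories] for v in filled]


kind_age, age = preprocess_column(["1", "", "3"])
kind_city, city = preprocess_column(["a", None, "b", "a"])
```

Each column comes back as a list of numeric rows with no missing values, which is the form that most estimators expect.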

Automated model selection

The next step is automated model selection, which finds the estimator that best matches your data. AutoAI uses a novel approach that tests and ranks candidate estimators against small subsets of the data, then gradually increases the subset size for the most promising estimators. This approach makes it possible to rank a large number of candidate estimators and select the best match for the data, saving time without sacrificing performance.
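The idea of ranking estimators on small subsets and giving more data only to the promising ones can be illustrated with a successive-halving-style loop. This is a simplified stand-in, not the algorithm AutoAI uses. The three toy estimators, the synthetic data, and all function names are assumptions made for the example.

```python
import random
from statistics import mean


def fit_mean(xs, ys):
    """Baseline estimator: always predict the training mean."""
    m = mean(ys)
    return lambda x: m


def fit_linear(xs, ys):
    """One-variable least-squares line."""
    mx, my = mean(xs), mean(ys)
    var = sum((x - mx) ** 2 for x in xs) or 1.0
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var
    return lambda x: my + slope * (x - mx)


def fit_nearest(xs, ys):
    """1-nearest-neighbor estimator."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def score(fit, train, valid):
    """Negative mean squared error on the validation rows (higher is better)."""
    model = fit([x for x, _ in train], [y for _, y in train])
    return -mean((model(x) - y) ** 2 for x, y in valid)


def select_estimator(candidates, train, valid, start=20):
    """Rank candidates on growing subsets, dropping the weaker half each round."""
    alive, size = dict(candidates), start
    while len(alive) > 1:
        subset = train[:min(size, len(train))]
        ranked = sorted(alive, key=lambda n: score(alive[n], subset, valid), reverse=True)
        alive = {n: alive[n] for n in ranked[:(len(ranked) + 1) // 2]}
        size *= 2  # survivors are re-evaluated on twice as much data
    return next(iter(alive))


# Synthetic data: a noisy linear relationship.
rng = random.Random(0)
rows = [(x, 3.0 * x + rng.gauss(0, 0.5)) for x in (rng.uniform(0, 10) for _ in range(400))]
train, valid = rows[:300], rows[300:]
candidates = {"mean": fit_mean, "linear": fit_linear, "nearest": fit_nearest}
best = select_estimator(candidates, train, valid)
```

Weak candidates are eliminated after seeing only a few rows, so most of the training cost is spent on the estimators that are still in contention.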

Automated feature engineering

Feature engineering attempts to transform the raw data into the combination of features that best represents the problem, in order to achieve the most accurate prediction. AutoAI uses a novel approach that explores various feature construction choices in a structured, non-exhaustive manner, while progressively maximizing model accuracy using reinforcement learning. The result is an optimized sequence of data transformations that best matches the estimators chosen in the model selection step.
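As a rough illustration of what "feature construction choices" means, the sketch below scores a few candidate transformations of a single column by how well each transformed feature explains the target. It is deliberately much simpler than the approach described above: it checks every option in a tiny, hypothetical catalog and does not use reinforcement learning. The catalog, the scoring function, and the synthetic data are all assumptions.

```python
import math
import random
from statistics import mean

# Hypothetical catalog of candidate transformations.
TRANSFORMS = {
    "identity": lambda x: x,
    "square": lambda x: x * x,
    "sqrt": lambda x: math.sqrt(abs(x)),
    "log1p": lambda x: math.log1p(abs(x)),
}


def r_squared(feature, target):
    """Squared Pearson correlation, equal to R^2 of a one-variable linear fit."""
    mf, mt = mean(feature), mean(target)
    cov = sum((f - mf) * (t - mt) for f, t in zip(feature, target))
    var_f = sum((f - mf) ** 2 for f in feature)
    var_t = sum((t - mt) ** 2 for t in target)
    if var_f == 0 or var_t == 0:
        return 0.0
    return cov * cov / (var_f * var_t)


def best_transform(column, target):
    """Score each candidate transformation of one column and return the best."""
    scores = {
        name: r_squared([fn(x) for x in column], target)
        for name, fn in TRANSFORMS.items()
    }
    return max(scores, key=scores.get), scores


# Synthetic data: the target depends on the square of the raw feature.
rng = random.Random(1)
xs = [rng.uniform(0, 5) for _ in range(200)]
ys = [x * x + rng.gauss(0, 0.3) for x in xs]
chosen, scores = best_transform(xs, ys)
```

With a realistic catalog and many columns, trying every combination quickly becomes too expensive, which is why a guided, non-exhaustive search is needed in practice.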

Hyperparameter optimization

Finally, a hyperparameter optimization step refines the best-performing model pipelines. AutoAI uses a novel hyperparameter optimization algorithm designed for costly function evaluations, such as the model training and scoring that are typical in machine learning. This approach enables fast convergence to a good solution despite the long evaluation time of each iteration.
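When each evaluation means training and scoring a model, the number of evaluations is the main cost, so a tuner works within a fixed budget. The sketch below shows that constraint with plain random search and a cache that avoids repeating an evaluation. Random search is used only because it is short; it is not the optimization algorithm that AutoAI uses. The k-nearest-neighbor model, the search range, and the function names are assumptions.

```python
import math
import random
from statistics import mean


def knn_predict(train, x, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return mean(y for _, y in nearest)


def validation_error(k, train, valid):
    """The costly step: score one hyperparameter value on held-out rows."""
    return mean((knn_predict(train, x, k) - y) ** 2 for x, y in valid)


def random_search(train, valid, budget=8, seed=0):
    """Try at most `budget` values of k and return the best one found."""
    rng = random.Random(seed)
    cache = {}  # k -> validation error, so no value is evaluated twice
    for _ in range(budget):
        k = rng.randint(1, 30)
        if k not in cache:
            cache[k] = validation_error(k, train, valid)
    best_k = min(cache, key=cache.get)
    return best_k, cache


# Synthetic data: a noisy sine curve.
rng = random.Random(42)
points = [(x, math.sin(x) + rng.gauss(0, 0.2)) for x in (rng.uniform(0, 6) for _ in range(200))]
train, valid = points[:150], points[150:]
best_k, tried = random_search(train, valid)
```

An optimizer built for expensive evaluations aims to reach a comparable or better result in fewer evaluations by using earlier results to decide which setting to try next.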

Next step

Follow the steps in the topic Creating an AutoAI experiment from sample data to build and deploy a sample application, or use your own data to build an AutoAI model.