0 / 0
Tutorial: AutoAI multivariate time series experiment with Supporting features
Tutorial: AutoAI multivariate time series experiment with Supporting features

Tutorial: AutoAI multivariate time series experiment with Supporting features

Use sample data to train a multivariate time series experiment that predicts pollution rate and temperature with the help of supporting features that influence the prediction fields.

When you set up the experiment, you load sample data that tracks weather conditions in Beijing from 2010 to 2014. The experiment generates a set of pipelines that use algorithms to predict future pollution and temperature with supporting features, including dew, pressure, snow, and rain. After generating the pipelines, AutoAI compares and tests them, chooses the best performers, and presents them in a leaderboard for you to review.

Before you begin

Make sure you have the following resources:

Data set overview

For this tutorial, you will use the Beijing PM 2.5 data set from the Watson Studio gallery. This data set describes the weather conditions in Beijing from 2010 to 2014, which are measured in 1-day steps, or increments. You will use this data set to configure your AutoAI experiment and select Supporting features. Details about the data set are described here:

  • Each column, other than the date column, represents a weather condition that impacts pollution index.
  • The Gallery shows the origin of the data. You can preview the file before downloading.
  • The sample data is structured in rows and columns and saved as a .csv file.

Data set preview

Tasks overview

In this tutorial, you follow steps to create a multivariate time series experiment that uses Supporting features:

  1. Create a Watson Studio project
  2. Create an AutoAI experiment
  3. Configure the experiment
  4. Review experiment results
  5. Deploy the trained model
  6. Test the deployed model

Create a Watson Studio project

Follow these steps to create an empty project and download the Beijing PM 2.5 data set from the Watson Studio Gallery:

  1. From the main navigation pane, click Projects > View all projects, then click New Project.
    a. Click Create an empty project.
    b. Enter a name and optional description for your project.
    c. Click Create.
  2. From the main navigation panel, click Gallery and download a local copy of the Beijing PM 2.5 data set.

Create an AutoAI experiment

Follow these steps to create an AutoAI experiment and add sample data to your experiment:

  1. Open your project and go to the Assets tab.
  2. Click New asset and choose AutoAI.
  3. Specify a name and optional description for your experiment.
  4. Click Associate a Machine Learning service instance to associate a machine learning service instance with your experiment.
  5. In the Associate service page, select Watson Machine Learning. Note: You can also select any other service instance of type Machine Learning.
  6. Click Associate. In the Create an AutoAI experiment dialog box, click Reload to confirm your configuration.

  7. Click Create.

  8. To add sample data, choose one of the following:
    • If you downloaded your file locally, upload the training data file, PM25.csv by clicking Browse and then following the prompts.
    • If you already uploaded your file to your project, click Select from project, then select the Data asset tab and choose Beijing PM 25.csv.

Configure the experiment

Follow these steps to configure your multivariate AutoAI time series experiment:

  1. Click Yes for the option to create a Time Series Forecast.
  2. Choose as prediction columns: pollution, temp.
  3. Choose as the date/time column: date.

    Configuring experiment settings. Yes, to time series forecast and pollution and temp as the prediction columns with date as the date/time column.

  4. Click Experiment settings to configure the experiment:
    a. In the Prediction page, accept the default selection for Algorithms to include. Algorithms that allow you to use Supporting features are indicated by a checkmark in the column Allows supporting features.
    Configuring experiment settings. Algorithms that support the use of Supporting features

    b. Go to the Data Source page. For this tutorial, you will supply future values of Supporting features while testing. Future values are helpful when values for the supporting features are knowable for the prediction horizon. Accept the default enablement for Leverage future values of supporting features. Additionally, accept the default selection for columns that will be used as Supporting features.
    Supporting features
    c. Click Cancel to exit from Experiment settings.

  5. Click Run experiment to begin the training.

Review experiment results

The experiment takes several minutes to complete. As the experiment trains, the relationship map shows the transformations that are used to create pipelines. Follow these steps to review experiment results and save the pipeline with the best performance.

  1. Optional: Hover over any node in the relationship map to get details on the transformation for a particular pipeline. Relationship map
  2. Optional: After the pipelines are listed on the leaderboard, click Pipeline comparison to see how they differ. For example: Pipeline comparison
  3. When the training completes, the top three best performing pipelines are saved to the leaderboard. Click any pipeline name to review details.
    Note: Pipelines that use Supporting features are indicated by SUP enhancement. Pipeline leaderboard
  4. Select the pipeline with Rank 1 and click Save as to create your model. Then, click Create. This saves the pipeline under the Models section in the Assets tab.

Deploy the trained model

Before you can use your trained model to make predictions on new data, you must deploy the model. Follow these steps to promote your trained model to a deployment space:

  1. You can deploy the model from the model details page. To access the model details page, choose one of these options:
    • Click the model’s name in the notification that is displayed when you save the model.
    • Open the Assets page for the project containing the model and click the model’s name in the Machine Learning Model section.
  2. Select Promote to Deployment Space, then select or create a deployment space where the model will be deployed.
    Optional: Follow these steps to create a deployment space:
    a. From the Target space list, select Create a new deployment space.
    b. Enter a name for your deployment space.
    c. To associate a machine learning instance, go to Select machine learning service (optional) and select a machine learning instance from the list.
    d. Click Create.
  3. Once you have selected or created your space, click Promote.
  4. Click the deployment space link from the notification.
  5. From the Assets tab of the deployment space:
    a. Hover over the model’s name and click the deployment icon Deploy icon.
    b. In the page that opens, complete the fields:
    • Select Online as the Deployment type.
    • Specify a name for the deployment.
    • Click Create.

After the deployment is complete, click the Deployments tab and select the deployment name to view the details page.

Test the deployed model

Follow these steps to test the deployed model from the deployment details page:

  1. On the Test tab of the deployment details page, go to New observations (optional) spreadsheet and enter the following values:
    pollution (double): 80.417
    temp (double): -5.5
    dew (double): -7.083
    press (double): 1020.667
    wnd_spd (double): 9.518
    snow (double): 0
    rain (double): 0

    New observations

  2. To add future values of Supporting features, go to Future exogenous features (optional) spreadsheet and enter the following values:
    dew (double): -12.667
    press (double): 1023.708
    wnd_spd (double): 9.518
    snow (double): 0
    rain (double): 0.042
    Note: You must provide the same number of values for future exogenous features as the prediction horizon that you set during experiment configuration stage.

    Future exogenous values

  3. Click Predict. The resulting prediction indicates values for pollution and temperature.
    Note: Prediction values shown in the output may differ when you test your deployment.

    Resulting prediction

Learn more

Parent topic: Building a time series experiment