The automatic setup option for machine learning model evaluations sets up a machine learning environment, a database, and a sample model for you. Follow the steps in the guided tour to learn how to evaluate the sample model. After the setup is complete, you can add your own model to the dashboard.
Sample model
The automatic setup uses the sample data set German Credit Risk to demonstrate key features of model evaluations.
Overview of the sample data
The German Credit Risk sample data provides a collection of bank customer records that was used to train the sample model. It contains 20 attributes for each loan applicant. The sample models that are provisioned as part of the automatic setup are trained to predict the level of credit risk for new customers. Two of the attributes that are considered for the prediction - sex and age - can be tested for bias to make sure that outcomes are consistent regardless of the gender or age of customers.
To evaluate the outcomes, results are divided into groups. The Reference groups are the groups that are considered most likely to receive positive outcomes. In this case, the Reference groups are male customers and customers over the age of 25. The Monitored groups are the groups that you want to review to ensure that their results do not differ greatly from the results for the Reference groups. In this case, the Monitored groups are female customers and customers aged 19 - 25.
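The group comparison above can be sketched as a disparate impact calculation: the favorable-outcome rate of each Monitored group divided by the rate of its Reference group. The records, field names, and the 0.8 alert threshold below are illustrative assumptions, not the actual Watson OpenScale implementation.

```python
# Toy payload records with the two attributes that are tested for bias.
records = [
    {"Sex": "male", "Age": 45, "Risk": "No Risk"},
    {"Sex": "male", "Age": 30, "Risk": "No Risk"},
    {"Sex": "female", "Age": 22, "Risk": "Risk"},
    {"Sex": "female", "Age": 40, "Risk": "No Risk"},
]

def favorable_rate(rows):
    """Fraction of rows that received the favorable outcome ("No Risk")."""
    return sum(r["Risk"] == "No Risk" for r in rows) / len(rows)

def disparate_impact(rows, monitored, reference):
    """Favorable rate of the monitored group over that of the reference group."""
    mon = [r for r in rows if monitored(r)]
    ref = [r for r in rows if reference(r)]
    return favorable_rate(mon) / favorable_rate(ref)

# Sex: monitored = female, reference = male.
sex_di = disparate_impact(records,
                          lambda r: r["Sex"] == "female",
                          lambda r: r["Sex"] == "male")
# Age: monitored = 19 - 25, reference = over 25.
age_di = disparate_impact(records,
                          lambda r: 19 <= r["Age"] <= 25,
                          lambda r: r["Age"] > 25)

# A ratio below a chosen threshold (commonly 0.8) would raise a fairness alert.
print(f"Sex disparate impact: {sex_di:.2f}")
print(f"Age disparate impact: {age_di:.2f}")
```

A ratio of 1.0 means both groups receive favorable outcomes at the same rate; the farther the ratio falls below the threshold, the stronger the indication of bias against the Monitored group.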
Running the automatic setup
Follow these steps to run the automatic setup:
- Launch Watson OpenScale.
- Choose the Auto setup option.
The process takes about 10 minutes to complete. Three deployments are configured during the setup:
Model | Binding | Description |
---|---|---|
GermanCreditRiskModelPreProd | Pre-production, approved | This deployment represents the current approved model that is being evaluated in the internal environment. |
GermanCreditRiskModelChallenger | Pre-production | The challenger model is deployed to compare performance and other attributes against the approved pre-production model deployment. |
GermanCreditRiskModel | Production | Of the approved pre-production model and the challenger model, the one that delivers more favorable results is selected for production and deployed from the production space. |
After the setup is complete, follow the guided tour to learn the features of model evaluations.
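The champion/challenger selection described in the table can be sketched as comparing each candidate's evaluation metrics and promoting the better scorer. The metric names, values, and equal weighting below are illustrative assumptions, not metrics that the automatic setup reports.

```python
# Illustrative metrics for the two pre-production deployments (made-up values).
candidates = {
    "GermanCreditRiskModelPreProd":    {"accuracy": 0.81, "fairness": 0.92},
    "GermanCreditRiskModelChallenger": {"accuracy": 0.84, "fairness": 0.90},
}

def score(metrics):
    # Assumed weighting: quality and fairness count equally.
    return 0.5 * metrics["accuracy"] + 0.5 * metrics["fairness"]

# Promote whichever model delivers the more favorable combined results.
production_model = max(candidates, key=lambda name: score(candidates[name]))
print(f"Promote to production: {production_model}")
```

In practice you would weight the monitors according to your own requirements; the point is that the production deployment is chosen from a side-by-side comparison of the two pre-production deployments.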
Guided tour highlights
The guided tour demonstrates these features:
- Introduction to the user interface (UI): The four main areas of the UI are Insights, Explanations, Configuration, and Support.
- Monitoring and viewing results for the German credit risk model: Use predefined monitors to evaluate your model for fairness, quality, and drift. You can also use custom monitors for model evaluation.
- Exploring Fairness monitor: Use the Fairness monitor to look for biased outcomes from your model. If a fairness issue is found, an alert is triggered based on configurable thresholds.
- Exploring data sets: Toggle between balanced, payload, training, and debiased data sets to see how they affect the fairness score of your model.
- Introduction to transactions: Review transactions from the payload data set for group bias and individual bias.
- Explaining model outcomes: Understand the features that led to the model prediction to build trust in the model. Additionally, learn how to change feature values to receive more favorable model outcomes.
- Exploring Drift monitor: Use the Drift monitor to determine whether changes in the data that the model processes are causing a drop in accuracy.
- Reviewing transactions: Review the transactions list to investigate the drop in accuracy.
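The accuracy-drop check that the Drift monitor performs can be sketched as comparing a training-time baseline against accuracy on recent scoring batches. The baseline value, the batches, and the five-point alert threshold below are illustrative assumptions, not the monitor's actual algorithm.

```python
# Accuracy that was measured when the model was trained (assumed value).
baseline_accuracy = 0.82

# Recent scoring batches: (number of correct predictions, batch size).
recent_batches = [(74, 100), (69, 100), (71, 100)]
correct = sum(c for c, _ in recent_batches)
total = sum(n for _, n in recent_batches)
recent_accuracy = correct / total

# Alert when accuracy falls by more than five percentage points (assumed threshold).
drop = baseline_accuracy - recent_accuracy
drift_alert = drop > 0.05
print(f"Accuracy drop: {drop:.3f}, alert: {drift_alert}")
```

When the alert fires, the transactions list is where you drill into the individual records that the monitor flagged as contributing to the drop.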
Touring a specific page
To use the automatic setup guided tour for a specific page, follow these steps:
- Open the page for which you would like to follow the guided tour.
- Open the Support tab and select Tour this page.
Resetting the tour
To reset the automatic setup tour, open the Support tab and select Reset auto setup.
Parent topic: Setup options for model evaluations