Configuring fairness evaluations in Watson OpenScale
Watson OpenScale evaluates models for bias to help ensure fair outcomes among different groups.
Evaluating the model for fairness
You can use fairness evaluations to determine whether your model produces biased outcomes. The fairness evaluation checks whether the model shows a tendency to provide a favorable (preferable) outcome more often for one group than for another.
The fairness evaluation generates a set of metrics every hour by default. You can generate these metrics on demand by clicking Evaluate fairness now or by using the Python client.
When you test an evaluation in a pre-production environment, you evaluate fairness based on test data. Test data must have the same format and schema as the training data you used to train the model.
In a production environment, you monitor feedback data, which is the actual data logged with payload logging. For proper monitoring, you must regularly log feedback data to Watson OpenScale. You can provide feedback data by clicking Upload feedback data on the Evaluations page of the Watson OpenScale Insights dashboard. You can also provide feedback data by using the Python client or REST API.
Before you begin
To configure fairness evaluations for unstructured text and image models, you must provide payload data that contains meta fields, such as Gender, to calculate disparate impact. To calculate performance metrics for unstructured text and image models, you must also provide feedback data that contains meta fields with the correctly predicted outcomes.
You must complete similar requirements when you configure fairness evaluations for indirect bias. When you configure fairness evaluations for unstructured text and image models, you don't have to provide training data.
Configuring the evaluation
You can configure fairness evaluations manually or you can run a custom notebook to generate a configuration file. You can upload the configuration file to specify the settings for your evaluation.
When you configure fairness evaluations manually, you can specify the reference group (value) that you expect to represent favorable outcomes. You can also select the corresponding model attributes (features) to monitor for bias (for example, Age or Sex), that will be compared against the reference group. Depending on your training data, you can also specify the minimum and maximum sample size that will be evaluated.
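The settings described above can be sketched as a plain data structure. This is an illustrative sketch only: the dictionary keys below are hypothetical and do not correspond to the Watson OpenScale API; they simply mirror the manual configuration items (favorable outcomes, monitored features, and sample size bounds).

```python
# Hypothetical configuration sketch; key names are illustrative,
# not the Watson OpenScale API.
fairness_config = {
    "favorable_outcomes": ["Loan Granted", "Loan Partially Granted"],
    "features": [
        {"name": "Sex", "monitored_group": ["Non-binary"]},
        {"name": "Age", "monitored_group": [(18, 25)]},
    ],
    "min_records": 100,    # delay evaluation until this many records exist
    "max_records": 10000,  # optional cap on the sample that is evaluated
}

def validate_config(cfg):
    """Basic sanity checks on the sketch configuration."""
    assert cfg["favorable_outcomes"], "at least one favorable outcome required"
    assert cfg["min_records"] <= cfg["max_records"]
    return True
```

A notebook-generated configuration file would capture the same kind of information before you upload it to specify the evaluation settings.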
Favorable and unfavorable outcomes
The output of the model is categorized as either favorable or unfavorable. For example, if the model recommends whether a person gets a loan, the favorable outcome might be Loan Granted or Loan Partially Granted, and the unfavorable outcome might be Loan Denied.
The values that represent a favorable outcome are derived from the label column in the training data. By default, the predictedLabel column is set as the prediction column. Favorable and unfavorable values must be specified by using the value of the prediction column as a string data type, such as 1, when you upload training data.
In this section, you can select all of the metrics that you want to configure. By default, only the disparate impact metric is computed.
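Disparate impact is commonly computed as the ratio of the favorable-outcome rate in the monitored group to the favorable-outcome rate in the reference group. The following sketch uses that standard definition; it is not a view into OpenScale internals.

```python
def disparate_impact(monitored_outcomes, reference_outcomes, favorable):
    """Ratio of favorable-outcome rates: monitored group / reference group.
    Values near 1.0 indicate similar treatment across groups; low values
    suggest bias against the monitored group."""
    def favorable_rate(outcomes):
        return sum(1 for o in outcomes if o in favorable) / len(outcomes)
    return favorable_rate(monitored_outcomes) / favorable_rate(reference_outcomes)

monitored = ["Loan Granted", "Loan Denied", "Loan Denied", "Loan Denied"]
reference = ["Loan Granted", "Loan Granted", "Loan Denied", "Loan Granted"]
di = disparate_impact(monitored, reference, {"Loan Granted"})
# monitored rate 0.25, reference rate 0.75 -> ratio of about 0.33
```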
Minimum sample size
Setting a minimum sample size delays the fairness evaluation until a minimum number of records are available in the evaluation data set, which ensures that the sample is not so small that it skews the results. When the fairness monitor runs, it uses the minimum sample size to determine the number of records that are evaluated.
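The gating behavior described above can be sketched in a few lines. The function name is illustrative, not an OpenScale API.

```python
def should_evaluate(record_count, min_sample_size):
    """Run the fairness evaluation only after enough records have
    accumulated, so that a tiny sample cannot skew the metrics."""
    return record_count >= min_sample_size
```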
Features: Reference and monitored groups
The values of each feature are specified as either a reference or a monitored group. The monitored group represents the values that are most at risk of biased outcomes. For example, for the Sex feature, you can set Non-binary as the monitored group. For a numeric feature, such as Age, you can set [18-25] as the monitored group. All other values of the feature are then considered the reference group, for example, ages outside the 18-25 range.
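The partition described above, where the listed values form the monitored group and every other value falls into the reference group, can be sketched as follows. The function and record fields are hypothetical, for illustration only.

```python
def split_groups(records, feature, monitored_values):
    """Partition records on one feature: values listed as monitored form
    the monitored group; every other value is the reference group."""
    monitored, reference = [], []
    for rec in records:
        (monitored if rec[feature] in monitored_values else reference).append(rec)
    return monitored, reference

people = [
    {"Sex": "Female", "outcome": "Loan Granted"},
    {"Sex": "Non-binary", "outcome": "Loan Denied"},
    {"Sex": "Male", "outcome": "Loan Granted"},
]
mon, ref = split_groups(people, "Sex", {"Non-binary"})
# mon holds the one Non-binary record; ref holds the other two
```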
Configuring the monitor
To configure the model for fairness, follow these steps:
- Select a model deployment tile and click Configure monitors.
- Select Fairness in the Evaluations section of the Configure tab.
- For each configuration item, click Edit to specify the Favorable outcomes, Sample Size, and the features to evaluate.
The features are the model attributes that are evaluated to detect bias. For example, you can configure the fairness monitor to evaluate a feature such as
Age for bias. Only features of categorical,
numeric (integer), float, or double data type are supported.
Fairness alert threshold
The fairness alert threshold specifies an acceptable difference between the percentage of favorable outcomes for the monitored group and the percentage of favorable outcomes for the reference group. For example, if the fairness threshold is set to 80% and the monitored group receives favorable outcomes only 70% as often as the reference group, then the fairness monitor detects bias in your model.
Testing for indirect bias
If you select a field that is not a training feature, called an added field, Watson OpenScale looks for indirect bias by finding associated values in the training features. For example, the profession "student" may imply a younger individual even though the Age field was excluded from model training. For details on configuring the Fairness monitor to consider indirect bias, see Configuring the Fairness monitor for indirect bias.
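Indirect bias arises when a training feature acts as a proxy for an excluded protected attribute. A crude way to spot such a proxy, sketched here with hypothetical field names and sample data, is to compare how the protected attribute is distributed across the values of the candidate feature: large differences hint at an association like the student/age example above.

```python
from collections import defaultdict

def group_rates(records, feature, protected, protected_value):
    """For each value of `feature`, the fraction of records whose excluded
    protected attribute equals `protected_value`. Large rate differences
    across feature values suggest the feature may be a proxy."""
    counts = defaultdict(lambda: [0, 0])  # feature value -> [matches, total]
    for rec in records:
        bucket = counts[rec[feature]]
        bucket[1] += 1
        if rec[protected] == protected_value:
            bucket[0] += 1
    return {value: matches / total for value, (matches, total) in counts.items()}

sample = [
    {"Profession": "Student", "AgeGroup": "18-25"},
    {"Profession": "Student", "AgeGroup": "18-25"},
    {"Profession": "Student", "AgeGroup": "26+"},
    {"Profession": "Manager", "AgeGroup": "26+"},
    {"Profession": "Manager", "AgeGroup": "26+"},
]
rates = group_rates(sample, "Profession", "AgeGroup", "18-25")
# "Student" is heavily associated with the 18-25 group; "Manager" is not
```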
Watson OpenScale uses two types of debiasing: passive and active. Passive debiasing reveals bias, while active debiasing prevents you from carrying that bias forward by changing the model in real time for the current application. For details on interpreting results and mitigating bias in a model, see Reviewing results from a Fairness evaluation.
Parent topic: Configuring model evaluations