You can configure fairness evaluations to determine whether your model produces biased outcomes. Use fairness evaluations to identify when your model tends to provide favorable outcomes more often for one group than for another.
Configuring fairness evaluations for machine learning models
You can configure fairness evaluations manually or you can run a custom notebook to generate a configuration
file. You can upload the configuration file to specify the settings for your evaluation.
When you configure fairness evaluations manually, you can specify the reference group (value) that you expect to represent favorable outcomes. You can also select the corresponding model attributes (features) to monitor for bias (for example, Age or Sex); the monitored groups for those features are compared against the reference group. Depending on your training data, you can also specify the minimum and maximum sample size for evaluations.
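For orientation, the following minimal Python sketch shows the kinds of settings that such a configuration brings together. The structure and field names are illustrative placeholders only, not the product's actual configuration schema:
# Illustrative only: hypothetical field names, not the product's actual schema.
fairness_config = {
    "favorable_outcomes": ["No Risk"],    # values of the prediction column
    "unfavorable_outcomes": ["Risk"],
    "features": [
        {
            "name": "Sex",
            "monitored": ["Female", "Non-binary"],  # values most at risk of bias
            "reference": ["Male"],
            "threshold": 0.8,                       # acceptable fairness score
        },
        {
            "name": "Age",
            "monitored": [[18, 25]],                # numeric range
            "reference": [[26, 100]],
            "threshold": 0.8,
        },
    ],
    "min_sample_size": 100,
    "max_sample_size": 10000,
}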
Select favorable and unfavorable outcomes
You must specify favorable and unfavorable outcomes when you configure fairness evaluations. The values that represent a favorable outcome are derived from the label column in the training data.
By default, the predictedLabel column is set as the prediction column. When you upload training data, you must specify favorable and unfavorable values by using the value of the prediction column as a string data type, such as 0 or 1.
Select features
You must select the features, which are the model attributes that you want to evaluate to detect bias. For example, you can evaluate features such as Sex or Age for bias. Only features with a categorical, numeric (integer), float, or double data type are supported.
The values of the features are specified as either a reference or monitored group. The monitored group represents the values that are most at risk for biased outcomes. For example, for the Sex feature, you can
set Female and Non-binary as the monitored groups. For a numeric feature, such as Age, you can set [18-25] as the monitored group. All other values for the feature are then considered the reference group, for example, Sex=Male or Age=[26-100].
Set fairness threshold
You can set the fairness threshold to specify an acceptable difference between the percentage of favorable outcomes for the monitored group and the percentage of favorable outcomes for the reference group. For example, if the rate of favorable outcomes for the monitored group is 70% of the rate for the reference group and the fairness threshold is set to 80%, the fairness monitor detects bias in your model.
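A tiny sketch of that check in Python (the variable names are ours, for illustration):
# Fairness score: favorable-outcome rate for the monitored group,
# expressed as a percentage of the rate for the reference group.
fairness_score = 70      # percent
fairness_threshold = 80  # percent

if fairness_score < fairness_threshold:
    print("Bias detected: fairness score is below the threshold")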
Set sample size
Sample sizes specify how many of the transactions are processed during an evaluation. You must set a minimum sample size to indicate the lowest number of transactions that you want to evaluate. You can also set a maximum sample size to indicate the highest number of transactions that you want to evaluate.
Testing for indirect bias
If you select a field that is not a training feature, called an added field, indirect bias is identified by finding associated values in the training features. For example, the profession “student” may imply a younger individual even though
the Age field was excluded from model training. For details on configuring fairness evaluations to consider indirect bias, see Configuring the Fairness monitor for indirect bias.
Mitigating bias
Passive and active debiasing are used for machine learning model evaluations. Passive debiasing reveals bias, while active debiasing prevents you from carrying that bias forward by changing the model in real time for the current application.
For details on interpreting results and mitigating bias in a model, see Reviewing results from a Fairness evaluation.
Configuring fairness evaluations in watsonx.governance
When you evaluate prompt templates, you can review a summary of fairness evaluation results for text classification tasks.
Select favorable and unfavorable outcomes
You must specify favorable and unfavorable outcomes when you configure fairness evaluations. The values that represent a favorable outcome are derived from the label column in the test data that you provide. By default, the predictedLabel column is set as the prediction column. When you upload training data, you must specify favorable and unfavorable values by using the value of the prediction column as a string data type, such as 0 or 1.
Select meta-fields
You must select meta-fields to enable watsonx.governance to identify fields that are not specified as features in the test data that you provide.
Set fairness thresholds
To configure fairness evaluations with your own settings, you can set a minimum and maximum sample size for each metric. The minimum or maximum sample size indicates the minimum or maximum number of model transactions that you want to evaluate.
You can also configure baseline data and set threshold values for each metric. Threshold values create alerts on the evaluation summary page that appear when metric scores violate your thresholds. The values must be in the range 0 to 1. Metric scores must be lower than the threshold values to avoid violations.
Set sample size
Watsonx.governance uses sample sizes to determine how many transactions are processed during evaluations. You must set a minimum sample size to indicate the lowest number of transactions that you want watsonx.governance to evaluate. You can also set a maximum sample size to indicate the highest number of transactions that you want watsonx.governance to evaluate.
Supported fairness metrics
Supported languages: English only
When you enable fairness evaluations for machine learning models or generative AI assets, you can view a summary of evaluation results with metrics for the type of model that you're evaluating.
You can view the results of your fairness evaluations for machine learning models on the Insights dashboard. For more information, see Reviewing fairness results.
The following metrics are supported by fairness evaluations:
Disparate impact
Disparate impact is the metric behind the fairness scores that are shown for different groups. It compares the percentage of favorable outcomes for a monitored group to the percentage of favorable outcomes for a reference group.
How it works: When you view the details of a model deployment, the Fairness section of the model summary provides the fairness scores for different groups, described as metrics. The fairness scores are calculated with the disparate impact formula.
Uses the confusion matrix to measure performance: No
The num_positives value represents the number of individuals in the group who received a positive outcome, and the num_instances value represents the total number of individuals in the group. The privileged=False label specifies unprivileged groups and the privileged=True label specifies privileged groups. The positive outcomes are designated as the favorable outcomes, and the negative outcomes are designated as the unfavorable outcomes.
The privileged group is designated as the reference group, and the unprivileged group is designated as the monitored group.
The following formula calculates disparate impact:
                     num_positives(privileged=False) / num_instances(privileged=False)
Disparate impact = _____________________________________________________________________
                     num_positives(privileged=True) / num_instances(privileged=True)
The calculation produces a percentage that compares the rate at which the unprivileged group receives the positive outcome with the rate at which the privileged group receives it. For example, if a credit risk model assigns the “no risk” prediction to 80% of unprivileged applicants and to 100% of privileged applicants, that model has a disparate impact of 80%.
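As a rough illustration, the following Python sketch reproduces the credit-risk example; the helper name is ours, and num_positives and num_instances follow the definitions in this section:
def favorable_rate(num_positives, num_instances):
    # Rate of positive (favorable) outcomes within a group.
    return num_positives / num_instances

# Monitored (privileged=False) and reference (privileged=True) groups.
monitored_rate = favorable_rate(num_positives=80, num_instances=100)   # 0.80
reference_rate = favorable_rate(num_positives=100, num_instances=100)  # 1.00

# Disparate impact: monitored rate divided by reference rate.
disparate_impact = monitored_rate / reference_rate
print(disparate_impact)  # 0.8, that is, 80%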
Supported fairness details
The following details for fairness metrics are supported:
The favorable percentages for each of the groups
Fairness averages for all the fairness groups
Distribution of the data for each of the monitored groups
Distribution of payload data
Statistical parity difference
The statistical parity difference compares the percentage of favorable outcomes for monitored groups to reference groups.
Description: Fairness metric that describes the fairness of the model predictions. It is the difference between the rates of favorable outcomes in the monitored and reference groups.
Under 0: Higher benefit for the monitored group.
At 0: Both groups have equal benefit.
Over 0: Higher benefit for the reference group.
Uses the confusion matrix to measure performance: Yes
The impact score compares the rate at which monitored groups are selected to receive favorable outcomes with the rate at which reference groups are selected to receive favorable outcomes.
Do the math:
The following formula calculates the selection rate for each group:
number of individuals receiving favorable outcomes
Selection rate = ________________________________________________________
total number of individuals
The following formula calculates the impact score:
selection rate for monitored groups
Impact score = ________________________________________________________
selection rate for reference groups
Thresholds:
Lower bound: 0.8
Upper bound: 1.0
How it works: Higher scores indicate higher selection rates for monitored groups
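Sketched in Python with made-up counts, including a check against the bounds above (the helper name is ours):
def selection_rate(num_favorable, num_total):
    # Rate at which a group receives favorable outcomes.
    return num_favorable / num_total

monitored_rate = selection_rate(num_favorable=35, num_total=50)  # 0.70
reference_rate = selection_rate(num_favorable=45, num_total=50)  # 0.90

impact_score = monitored_rate / reference_rate  # about 0.78

lower_bound, upper_bound = 0.8, 1.0
if not lower_bound <= impact_score <= upper_bound:
    print("Impact score falls outside the fair range")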
False negative rate difference
The false negative rate difference gives the percentage of positive transactions that were incorrectly scored as negative by your model.
Description: Returns the difference in false negative rates for the monitored and reference groups
At 0: Both groups have equal benefit.
Uses the confusion matrix to measure performance: Yes
Do the math:
The following formula is used for calculating false negative rate (FNR):
false negatives
False negative rate = __________________________
all positives
The following formula is used for calculating false negative rate difference:
False negative rate difference = FNR of monitored group - FNR of reference group
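Sketched in Python from per-group confusion-matrix counts (the counts and helper name are illustrative):
def false_negative_rate(fn, tp):
    # All positives = false negatives + true positives.
    return fn / (fn + tp)

fnr_monitored = false_negative_rate(fn=20, tp=80)  # 0.20
fnr_reference = false_negative_rate(fn=10, tp=90)  # 0.10

fnr_difference = fnr_monitored - fnr_reference     # 0.10
The false positive, false discovery, and false omission rate differences that follow are computed the same way from the corresponding quadrants of the confusion matrix.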
False positive rate difference
The false positive rate difference gives the percentage of negative transactions that were incorrectly scored as positive by your model.
Description: Returns the difference in false positive rate for the monitored and reference groups.
At 0: Both groups have equal odds.
Uses the confusion matrix to measure performance: Yes
Do the math:
The following formula is used for calculating false positive rate (FPR):
false positives
False positive rate = ________________________
total negatives
The following formula is used for calculating false positive rate difference:
False positive rate difference = FPR of monitored group - FPR of reference group
False discovery rate difference
The false discovery rate difference gives the number of false positive transactions as a percentage of all transactions with a positive outcome. It describes the pervasiveness of false positives among all positive transactions.
Description: Returns the difference in false discovery rate for the monitored and reference groups.
At 0: Both groups have equal odds.
Uses the confusion matrix to measure performance: Yes
Do the math:
The following formula is used for calculating the false discovery rate (FDR):
                                  false positives
False discovery rate = _________________________________________
                           false positives + true positives
The following formula is used for calculating the false discovery rate difference:
False discovery rate difference = FDR of monitored group - FDR of reference group
False omission rate difference
The false omission rate difference gives the number of false negative transactions as a percentage of all transactions with a negative outcome. It describes the pervasiveness of false negatives among all negative transactions.
Description: Returns the difference in false omission rate for the monitored and reference groups
At 0: Both groups have equal odds.
Uses the confusion matrix to measure performance: Yes
Do the math:
The following formula is used for calculating the false omission rate (FOR):
                                 false negatives
False omission rate = _________________________________________
                          false negatives + true negatives
The following formula is used for the false omission rate difference:
False omission rate difference = FOR of monitored group - FOR of reference group
Error rate difference
The error rate difference calculates the percentage of transactions that are incorrectly scored by your model.
Description: Returns the difference in error rate for the monitored and reference groups.
At 0: Both groups have equal odds.
Uses the confusion matrix to measure performance: Yes
Do the math:
The following formula is used for calculating the error rate (ER):
false positives + false negatives
Error rate = ___________________________________________
all positives + all negatives
The following formula is used for calculating the error rate difference:
Error rate difference = ER of monitored group - ER of reference group
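A minimal sketch with illustrative counts (the helper name is ours):
def error_rate(fp, fn, num_positives, num_negatives):
    # Share of all transactions that the model scores incorrectly.
    return (fp + fn) / (num_positives + num_negatives)

er_monitored = error_rate(fp=15, fn=20, num_positives=100, num_negatives=100)  # 0.175
er_reference = error_rate(fp=10, fn=10, num_positives=100, num_negatives=100)  # 0.10

er_difference = er_monitored - er_reference  # 0.075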
Average odds difference
The average odds difference gives the average of the difference in false positive rates and true positive rates between monitored and reference groups.
Description: Returns the average of the difference in false positive rate and true positive rate for the monitored and reference groups.
At 0: Both groups have equal odds.
Uses the confusion matrix to measure performance: Yes
Do the math:
The following formula is used for calculating false positive rate (FPR):
false positives
False positive rate = _________________________
total negatives
The following formula is used for calculating true positive rate (TPR):
True positives
True positive rate = ______________________
All positives
The following formula is used for calculating average odds difference:
(FPR monitored group - FPR reference group) + (TPR monitored group - TPR reference group)
Average odds difference = ___________________________________________________________________________________________
2
Average absolute odds difference
The average absolute odds difference gives the average of the absolute differences in false positive rates and true positive rates between monitored groups and reference groups.
Description: Returns the average of the absolute difference in false positive rate and true positive rate for the monitored and reference groups.
At 0: Both groups have equal odds.
Uses the confusion matrix to measure performance: Yes
Do the math:
The following formula is used for calculating false positive rate (FPR):
false positives
False positive rate = ____________________________
all negatives
The following formula is used for calculating true positive rate (TPR):
True positives
True positive rate = ________________________
All positives
The following formula is used for calculating average absolute odds difference:
|FPR monitored group - FPR reference group| + |TPR monitored group - TPR reference group|
Average absolute odds difference = ______________________________________________________________________________________________
2
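Both averages can be sketched in Python from the same per-group rates; the rates here are made up for illustration:
# Per-group false positive and true positive rates (illustrative values).
fpr_monitored, fpr_reference = 0.12, 0.08
tpr_monitored, tpr_reference = 0.70, 0.85

fpr_diff = fpr_monitored - fpr_reference  # about  0.04
tpr_diff = tpr_monitored - tpr_reference  # about -0.15

average_odds_difference = (fpr_diff + tpr_diff) / 2                      # about -0.055
average_absolute_odds_difference = (abs(fpr_diff) + abs(tpr_diff)) / 2   # about  0.095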
Measuring performance with the confusion matrix
The confusion matrix measures performance by categorizing positive and negative predictions into four quadrants that represent the actual and predicted values, as shown in the following example:
Actual/Predicted    Negative    Positive
Negative            TN          FP
Positive            FN          TP
The true negative (TN) quadrant represents values that are actually negative and predicted as negative, and the true positive (TP) quadrant represents values that are actually positive and predicted as positive. The false positive (FP) quadrant represents values that are actually negative but predicted as positive, and the false negative (FN) quadrant represents values that are actually positive but predicted as negative.
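For reference, a minimal Python sketch of how the four quadrants can be tallied from actual and predicted binary labels (where 1 is the positive outcome):
actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]

pairs = list(zip(actual, predicted))
tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # true positives:  3
tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # true negatives:  3
fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false positives: 1
fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # false negatives: 1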
Note: Performance measures are not supported for regression models.