Model fairness is about assessing how a model's outputs can impact different subgroups of the population.
Tool:
TensorFlow Fairness Indicators
- For binary and multi-class classifiers
- Compares various performance metrics across slices of the data (slices defined by features as well as the label)
- Accuracy
- AUC ROC
- AUC precision recall
- Loss
- Precision, recall, etc. (see Advanced Classification Metrics: Precision, Recall, and more)
- Built on top of TensorFlow Model Analysis (TFMA); see the configuration sketch below
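A minimal sketch of how Fairness Indicators plugs into a TFMA evaluation. The `model_dir` and `data_path` locations, the `label` key, and the `gender` slice column are hypothetical placeholders, not part of these notes:

```python
# Sketch only: wiring Fairness Indicators into a TFMA evaluation.
# 'model_dir', 'data_path', 'label', and 'gender' are hypothetical placeholders.
import tensorflow_model_analysis as tfma

eval_config = tfma.EvalConfig(
    model_specs=[tfma.ModelSpec(label_key='label')],
    metrics_specs=[
        tfma.MetricsSpec(metrics=[
            tfma.MetricConfig(class_name='ExampleCount'),
            # Fairness Indicators reports positive/negative rates, TPR, FPR,
            # etc. at each of the listed decision thresholds.
            tfma.MetricConfig(
                class_name='FairnessIndicators',
                config='{"thresholds": [0.3, 0.5, 0.7]}'),
        ]),
    ],
    slicing_specs=[
        tfma.SlicingSpec(),                         # overall (unsliced) metrics
        tfma.SlicingSpec(feature_keys=['gender']),  # metrics per gender slice
    ],
)

eval_result = tfma.run_model_analysis(
    eval_shared_model=tfma.default_eval_shared_model(
        eval_saved_model_path='model_dir', eval_config=eval_config),
    eval_config=eval_config,
    data_location='data_path',  # TFRecord file(s) of tf.Example protos
)
```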
How to measure fairness?
- Positive/negative rate: the fraction of examples the model labels as positive/negative (see the sketch after this example)
- Ex: in a loan approval classifier
- Positive = approved loan
- Negative = declined loan
- Is the positive rate different between the male and female populations?
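A minimal sketch of this check on a hypothetical DataFrame of model decisions (the column names are made up for illustration):

```python
# Sketch: compare the positive (approval) rate across gender slices.
import pandas as pd

# Hypothetical model decisions: 1 = approved loan, 0 = declined loan.
df = pd.DataFrame({
    'gender':   ['M', 'F', 'F', 'M', 'F', 'M'],
    'approved': [1,   0,   1,   1,   0,   1],
})

# Positive rate = fraction of examples labeled positive, per slice.
positive_rate = df.groupby('gender')['approved'].mean()
print(positive_rate)  # a large gap between slices warrants investigation
```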
- Establish context for user groups
- Requires domain experts
- Understand people's social identities, social structures, and cultural systems
- Talk to social scientists, linguists, anthropologists
- Slice data widely and wisely (see the sketch below)
- Race, ethnicity, gender, nationality, sexual orientation, income, disability status
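A sketch of slicing widely on a hypothetical evaluation frame; the columns and the accuracy metric are placeholders, and the same groupby pattern works for any per-slice metric:

```python
# Sketch: evaluate a metric over single features and their intersections.
import pandas as pd

# Hypothetical eval data: true labels, predictions, demographic features.
df = pd.DataFrame({
    'y_true': [1, 0, 1, 1, 0, 1, 0, 1],
    'y_pred': [1, 0, 0, 1, 1, 1, 0, 0],
    'gender': ['M', 'M', 'F', 'F', 'M', 'F', 'F', 'M'],
    'income': ['low', 'high', 'low', 'high', 'low', 'high', 'low', 'high'],
})

def accuracy_by_slice(frame, slice_cols):
    # Per-slice accuracy; swap in any metric of interest here.
    return frame.groupby(slice_cols).apply(
        lambda g: (g['y_true'] == g['y_pred']).mean())

# Slice by each feature alone, then by their intersection.
for cols in (['gender'], ['income'], ['gender', 'income']):
    print(accuracy_by_slice(df, cols))
```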
- Evaluate metrics across different decision thresholds (sketch below)
- Check low-margin cases, i.e. examples whose scores fall close to the threshold
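A sketch of a threshold sweep on hypothetical scores; a gap that is invisible at 0.5 can appear at other operating points:

```python
# Sketch: recompute positive rate and TPR at several decision thresholds.
import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])                     # ground truth
scores = np.array([0.9, 0.4, 0.55, 0.7, 0.35, 0.45, 0.6, 0.2])  # model scores

for t in (0.3, 0.5, 0.7):
    y_pred = (scores >= t).astype(int)
    pos_rate = y_pred.mean()                 # fraction predicted positive
    tpr = y_pred[y_true == 1].mean()         # TP / (TP + FN)
    print(f'threshold={t}: positive rate={pos_rate:.2f}, TPR={tpr:.2f}')
```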
- Compare TPR (true positive rate) vs. FNR (false negative rate)
- TPR: predicted positive out of all actually positive, TP / (TP + FN)
- FNR: predicted negative out of all actually positive, FN / (TP + FN); note that FNR = 1 - TPR
- Compare TPR across different subgroups
- Ex: ensure the same approval rate across genders among qualified loan applicants (equal opportunity; see the sketch below)
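A sketch of that equal-opportunity check on hypothetical data: among actually qualified applicants, compare the approval rate, i.e. the TPR, per gender:

```python
# Sketch: compare TPR (and hence FNR) across subgroups for qualified applicants.
import numpy as np

y_true = np.array([1, 1, 0, 1, 1, 0, 1, 0])           # 1 = actually qualified
y_pred = np.array([1, 0, 0, 1, 1, 1, 0, 0])           # 1 = loan approved
group  = np.array(['M', 'F', 'M', 'F', 'M', 'F', 'F', 'M'])

for g in ('M', 'F'):
    mask = (group == g) & (y_true == 1)               # qualified members of g
    tpr = y_pred[mask].mean()                         # TP / (TP + FN) within g
    fnr = 1.0 - tpr                                   # FNR = 1 - TPR
    print(f'{g}: TPR={tpr:.2f}, FNR={fnr:.2f}')
```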