What Do We Understand by the Term Fair?
Machine learning and big data are becoming increasingly relevant in our daily lives, with measurable impacts on society. It is therefore important to address bias and fairness in machine learning implementations. Over the course of this blog we will pin down what these two words (fairness and bias) actually mean, explore how to detect bias, and finally look at methodologies for ensuring that a model is free from bias and fair in its outcomes.
Most of us can agree that we would not want models to be negatively biased toward people based on characteristics such as gender, race, political orientation, sexual preference, or religion. For the purposes of this blog, we will define fairness as consistency in producing equal outcomes for two different individuals unless a logical and meaningful distinction exists between them. In other words, the model should not decide for or against a specific social subgroup when the underlying relevant factors/features are the same.
Why Does It Matter?
Considering fairness and bias in machine learning models is crucial. We need to ensure that the negative biases and prejudices that plague our society, and that are otherwise irrelevant to the decision-making process, do not get transferred into machine learning systems.
How to Reduce Bias in Machine Learning?
There are several ways to try to reduce bias in machine learning models:
Use a diverse training dataset: Make sure that the training dataset represents a diverse and representative sample of the population that the model will be used on. If the dataset is not representative, the model is likely to be biased.
Remove sensitive variables: Remove any sensitive variables (such as race, gender, or age) from the training dataset, as these can easily introduce bias into the model. Keep in mind, however, that other features can act as proxies for the removed attributes, so this step alone does not guarantee fairness.
Use bias mitigation techniques: There are several techniques that can be used to mitigate bias in machine learning models, such as reweighting training examples, resampling under-represented groups, and adversarial debiasing.
Regularly evaluate the model: Regularly evaluate the model’s performance on different subgroups to check for any potential biases.
Use human oversight: Use human oversight and decision-making in conjunction with the model to ensure that any biases are identified and addressed.
Data pre-processing: Another way to reduce bias in the data is to pre-process it in a way that balances the representation of different groups. For example, oversampling the minority class in a dataset can help to balance the distribution of the classes.
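To make the pre-processing idea above concrete, here is a minimal sketch of oversampling the minority class with scikit-learn's resample utility. The file name and the 'label' column are hypothetical placeholders, and dedicated libraries such as imbalanced-learn offer more sophisticated options.

import pandas as pd
from sklearn.utils import resample

# Load the (hypothetical) dataset
data = pd.read_csv('data.csv')

# Split the rows by class
majority = data[data['label'] == 0]
minority = data[data['label'] == 1]

# Sample the minority class with replacement until both classes are the same size
minority_upsampled = resample(minority, replace=True, n_samples=len(majority), random_state=42)
balanced = pd.concat([majority, minority_upsampled])

print(balanced['label'].value_counts())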
Techniques Used to Reduce Bias in Machine Learning Implementation
Some widely used techniques for increasing the transparency of models, such as LIME, SHAP, and rule lists, can help us understand which features are driving a model's predictions and identify potential sources of bias. Let's discuss these techniques in greater depth.
LIME
LIME (Local Interpretable Model-Agnostic Explanations) is a technique that can be used to explain the predictions of a machine learning model. It is particularly useful when the model is complex and difficult to interpret, such as deep neural networks.
The idea behind LIME is to locally approximate the complex model by a simpler model, such as a linear model, that is easier to interpret. The local approximation is done by generating a new dataset, called a perturbed dataset, that consists of the input data point of interest and its perturbed versions, for example by adding random noise.
The simpler model is then fit to this perturbed dataset, and the importance of each feature can be read from the weight it receives in that model. By identifying which features are driving a particular prediction and how much weight each feature is assigned, LIME can help to identify potential sources of bias in the model.
LIME can also be used to generate explanations of the model’s predictions, both globally and locally, which can be used to communicate the model’s decision-making process to stakeholders in a clear and interpretable way. By providing interpretable explanations of model predictions, LIME can help with detecting and mitigating bias by identifying problem areas, and also can help communicate this to domain experts or end-users.
It’s worth noting that while LIME can be a helpful tool for detecting and mitigating bias, it is not a panacea, and it’s important to be aware of its limitations. For example, LIME assumes that the model is locally linear, which may not hold in some situations. Additionally, LIME may not always provide the complete picture of a model’s decision-making process, so it should be used in combination with other techniques to evaluate a model’s fairness. A sample implementation looks like this:
import lime
import lime.lime_tabular
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Load the data
data = pd.read_csv('data.csv')

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    data.drop('label', axis=1), data['label'], test_size=0.2, random_state=42)

# Train a random forest classifier
clf = RandomForestClassifier()
clf.fit(X_train, y_train)

# Create an explainer object (LIME expects NumPy arrays and a list of feature names)
explainer = lime.lime_tabular.LimeTabularExplainer(
    X_train.values,
    feature_names=X_train.columns.tolist(),
    class_names=['0', '1'],
    discretize_continuous=True)

# Explain the prediction for a specific instance
i = 10
exp = explainer.explain_instance(X_test.iloc[i].values, clf.predict_proba, num_features=10)

# Print the explanation
print(exp.as_list())
The output of the above code will be a list of (feature, weight) tuples, where each tuple represents the contribution of a feature to the final prediction of the instance at index i of the test data. The list will be limited by the number of features specified in the num_features parameter passed to the explain_instance method.
[('age', 0.03), ('income', -0.02), ('education', 0.01), ('gender', -0.01), ('marital_status', 0.01), ('children', -0.01), ('employment', 0.01), ('location', 0.01), ('home_ownership', -0.01), ('credit_score', 0.01)]
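Building on that output, a simple follow-up is to scan the explanation for sensitive features that carry noticeable weight. The list of sensitive features and the threshold below are illustrative assumptions, not part of LIME itself.

# 'exp' is the explanation object produced by explain_instance in the example above.
# The sensitive-feature names and the 0.005 threshold are arbitrary illustrative choices.
sensitive_features = ['gender', 'marital_status', 'age']

for feature, weight in exp.as_list():
    # LIME's feature labels can look like 'age <= 35.00', so use substring matching
    if any(name in feature for name in sensitive_features) and abs(weight) > 0.005:
        print(f"Potential bias signal: {feature} contributes {weight:+.3f}")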
SHAP
SHAP (SHapley Additive exPlanations) is a technique for explaining the predictions of a machine learning model. Like LIME, it can be used to identify which features are driving a particular prediction, but unlike LIME, SHAP values are based on a solid mathematical foundation and have a number of advantages over LIME.
SHAP values can be used to quantify the contribution of each feature to the prediction for a specific instance. They are based on a concept from cooperative game theory called Shapley values, which provides a way to fairly distribute a payout among a group of players (in this case, the features) by considering all possible coalitions.
SHAP values provide a unified measure of feature importance: they can handle both categorical and numerical variables as well as interactions between features, and they are consistent with the model's output. The most important feature is the one with the highest total absolute contribution.
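To make the coalition idea concrete, here is a minimal brute-force sketch of the Shapley computation for a single prediction. It assumes a simple value function that replaces "absent" features with values from a baseline (for example, the dataset means); real SHAP implementations use far more efficient, model-specific algorithms, and exact enumeration like this is only practical for a handful of features.

import itertools
import math
import numpy as np

def shapley_values(predict, x, baseline):
    # predict: function mapping a 2-D NumPy array to a 1-D array of scores
    # x: 1-D NumPy array, the instance being explained
    # baseline: 1-D NumPy array used for features "absent" from a coalition
    n = len(x)

    def value(coalition):
        # Model output when only the coalition's features come from x
        z = baseline.copy()
        idx = list(coalition)
        z[idx] = x[idx]
        return predict(z.reshape(1, -1))[0]

    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            for S in itertools.combinations(others, size):
                # Shapley weight: |S|! * (n - |S| - 1)! / n!
                w = math.factorial(size) * math.factorial(n - size - 1) / math.factorial(n)
                phi[i] += w * (value(S + (i,)) - value(S))
    return phi

# Hypothetical usage for a binary classifier:
# phi = shapley_values(lambda a: model.predict_proba(a)[:, 1], X[10], X.mean(axis=0))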
In terms of reducing bias, SHAP values can be used to detect if certain feature groups are disproportionately affecting certain groups of instances. For example, you could use SHAP values to compare the feature importances between different subpopulations of data and see if the feature importances are different for different demographic groups.
By identifying which features are driving a particular prediction, and how much weight each feature is assigned, SHAP can help detect potential sources of bias in the model and give insights on what data to collect and how to pre-process it.
Like LIME, SHAP is just one tool for detecting and mitigating bias. It is important to use it in combination with other techniques and to be aware of its limitations, such as being computationally expensive for high-dimensional data; still, it is a useful tool for analyzing models and detecting bias in them. A sample code implementation is as follows:
import shap
import xgboost
from sklearn.datasets import load_iris

# Load the data
iris = load_iris()
X = iris["data"]
y = iris["target"]

# Train an xgboost model
model = xgboost.XGBClassifier()
model.fit(X, y)

# Create an explainer object (TreeExplainer is well suited to tree ensembles)
explainer = shap.TreeExplainer(model)

# Explain the prediction for a specific instance (note the 2-D slice)
i = 10
exp = explainer.shap_values(X[i:i+1, :])

# Print the explanation
print(exp)
The output looks like this:
[[-0.07611141  0.0589547  -0.04200161  0.01705649]
 [-0.01354728 -0.0372259   0.04737315 -0.01396475]
 [ 0.08565869 -0.0144791  -0.00707554 -0.03309174]]
The output is an array with the same number of columns as features in the input data, and one row per output class. Each element of the array represents the contribution of a feature to the final prediction of the instance. The values are signed, so a negative value means that a feature opposes the predicted class.
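Building on the subpopulation comparison described earlier, the sketch below (not tied to the iris example above) assumes a binary classifier, a tabular DataFrame X_df that contains a 'gender' column, and a fitted TreeExplainer whose shap_values call returns a single 2-D array; all of these names are illustrative assumptions.

import numpy as np

# Assumed: a fitted explainer for a binary classifier and a DataFrame X_df
# whose 'gender' column encodes the protected attribute
features = X_df.drop(columns=['gender'])
shap_matrix = explainer.shap_values(features.values)  # shape: (n_samples, n_features)

group_a = (X_df['gender'] == 'female').values
group_b = (X_df['gender'] == 'male').values

# Compare the mean absolute contribution of each feature between the two groups
mean_abs_a = np.abs(shap_matrix[group_a]).mean(axis=0)
mean_abs_b = np.abs(shap_matrix[group_b]).mean(axis=0)

for name, a, b in zip(features.columns, mean_abs_a, mean_abs_b):
    print(f"{name}: female={a:.4f}, male={b:.4f}, gap={abs(a - b):.4f}")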
How to Ensure Fairness?
One common approach is to use fairness constraints during the training of the model. This can be done by adding a term to the loss function that penalizes the model for making decisions that are unfair to certain groups. Several different types of fairness constraints can be used, depending on the problem and the dataset. Some common types include:
Demographic parity: Demographic parity ensures that the model's decisions are independent of the protected attribute; in particular, the rate of positive predictions should be the same across groups. In other words, the model should not discriminate based on protected attributes such as race, gender, or age (a minimal way to measure this, together with equal opportunity, is sketched after this list).
Equal opportunity: Equal opportunity ensures that the model’s true positive rate is equal across different protected groups. For example, the model’s true positive rate for one protected group should not be significantly different from the true positive rate for another protected group.
Predictive parity: Predictive parity ensures that the model's precision (positive predictive value) is equal across different protected groups. In other words, among the instances the model flags as positive, the fraction that are truly positive should not differ systematically between protected groups.
Calibration: Calibration ensures that the model's predicted probabilities match the observed outcome rates within each protected group. For example, among individuals who receive a predicted probability of 0.8, roughly 80% should actually belong to the positive class, regardless of group membership.
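As a concrete starting point, the minimal sketch below measures the demographic parity and equal opportunity gaps on held-out predictions. The arrays y_true, y_pred, and group are placeholders for your own 0/1 labels, predictions, and protected-attribute encoding.

import numpy as np

def demographic_parity_gap(y_pred, group):
    # Difference in positive-prediction rates between the two groups
    return abs(y_pred[group == 0].mean() - y_pred[group == 1].mean())

def equal_opportunity_gap(y_true, y_pred, group):
    # Difference in true positive rates between the two groups
    tpr = lambda g: y_pred[(group == g) & (y_true == 1)].mean()
    return abs(tpr(0) - tpr(1))

# Hypothetical usage with NumPy arrays of 0s and 1s:
# print(demographic_parity_gap(y_pred, group))
# print(equal_opportunity_gap(y_true, y_pred, group))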
These are some examples of fairness constraints. During the training process, such constraints can be included in the optimization problem so that the model is optimized for both accuracy and fairness. The choice of constraints and their implementation will vary depending on the specific problem and dataset, as well as the algorithm and framework being used; the sketch below shows one open-source option.
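One possible way to enforce such a constraint in practice is the open-source fairlearn library. The sketch below uses its exponentiated-gradient reduction with a demographic parity constraint on a small synthetic dataset, purely for illustration; it is one option among many, not the only way to add fairness constraints.

import numpy as np
from fairlearn.reductions import ExponentiatedGradient, DemographicParity
from sklearn.linear_model import LogisticRegression

# Small synthetic dataset, purely for illustration
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 4))
sensitive = rng.integers(0, 2, size=500)  # a binary protected attribute
y = (X[:, 0] + 0.5 * sensitive + rng.normal(scale=0.5, size=500) > 0).astype(int)

# Wrap a plain classifier in a reduction that enforces (approximate)
# demographic parity with respect to the sensitive attribute
mitigator = ExponentiatedGradient(
    estimator=LogisticRegression(),
    constraints=DemographicParity(),
)
mitigator.fit(X, y, sensitive_features=sensitive)

y_pred = mitigator.predict(X)
for g in (0, 1):
    print(f"group {g}: positive-prediction rate = {y_pred[sensitive == g].mean():.3f}")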
Conclusion
In conclusion, reducing bias in machine learning models is essential to ensure that their predictions and decisions are fair. Several steps can be taken to mitigate bias, such as using a diverse and representative training dataset, removing sensitive variables, applying bias mitigation techniques, regularly evaluating the model, and using human oversight.
While it is not possible to completely eliminate bias, these steps can help to reduce it and keep the model as fair and unbiased as possible. Bias in machine learning models can be difficult to detect, and addressing it usually requires a combination of the approaches listed above, along with regular review and updates to ensure the model remains fair over time.