The most obvious first step in detecting bias in an AI system is simply to look at the data more closely: checks for class imbalance, proper representation of the target population in the dataset, and so on. Beyond these basic checks, some of the more sophisticated tools for detecting bias in a model, and in the dataset used to train it, are as follows:
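As a minimal sketch of what such a basic check might look like (the threshold and the toy dataset below are illustrative assumptions, not a standard recipe):

```python
from collections import Counter

def check_balance(labels, groups, threshold=0.8):
    """Flag class imbalance and under-represented groups.

    A class (or group) is flagged when its share of the data is less than
    `threshold` times the share it would have under a uniform split.
    """
    findings = []
    for name, values in (("label", labels), ("group", groups)):
        counts = Counter(values)
        uniform_share = 1 / len(counts)
        for value, count in counts.items():
            share = count / len(values)
            if share < threshold * uniform_share:
                findings.append((name, value, round(share, 3)))
    return findings

# Toy example: the labels are heavily skewed, and group "B" is under-represented.
labels = ["pos"] * 90 + ["neg"] * 10
groups = ["A"] * 85 + ["B"] * 15
print(check_balance(labels, groups))  # → [('label', 'neg', 0.1), ('group', 'B', 0.15)]
```

In practice the same idea is usually applied with pandas value counts per class and per demographic group, but the flagging logic is the same.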
1. Google's What-If Tool
The idea behind this open-source tool from Google is to ask the model various sets of questions to determine whether its predictions are biased. It primarily looks for changes in the model's prediction behavior when individual data points are altered. Training and testing against a huge number of such changes is a difficult and tedious task if done manually; the tool makes these experiments quick and structured. It also includes a GUI that visualizes the impact of each change on the predictions, making the tool interactive and easy to use.
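The core experiment — change one attribute of a data point and check whether the prediction flips — can be sketched by hand. The model, feature names, and data below are hypothetical stand-ins, not part of the tool's API:

```python
def counterfactual_flips(model, rows, feature, alternatives):
    """Count how often changing `feature` to another value flips the prediction."""
    flips = 0
    for row in rows:
        baseline = model(row)
        for alt in alternatives:
            if alt == row[feature]:
                continue
            edited = dict(row, **{feature: alt})  # copy the row with one feature changed
            if model(edited) != baseline:
                flips += 1
    return flips

# Hypothetical biased model: approves only when gender == "M" and score > 50.
biased_model = lambda row: int(row["gender"] == "M" and row["score"] > 50)

rows = [{"gender": "M", "score": 80}, {"gender": "F", "score": 80}]
print(counterfactual_flips(biased_model, rows, "gender", ["M", "F"]))  # → 2
```

A count well above zero on a feature that should be irrelevant, such as gender here, is exactly the kind of signal the tool's visualizations surface.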
2. Crowdsourcing
The most obvious, brute-force way of finding very niche bias issues in a machine learning model's predictions is to have people look for specific problems in the model, devising new techniques to detect issues that automated tools miss. Many large organizations have taken this path to assess biases in their algorithms that none of the existing tools had detected. A good example is Microsoft and the University of Maryland using crowdsourcing to detect biases in their natural language processing applications.
3. AI Fairness 360
This tool, introduced by IBM, goes a step further: it not only detects biases in models but also suggests ways to rectify them. It includes more than 70 built-in fairness metrics that help you quantify the biases present in your models.
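Two of the best-known metrics of this kind, disparate impact and statistical parity difference, are simple enough to compute by hand; the toy outcome data below is a made-up example, not taken from the toolkit:

```python
def favorable_rate(outcomes, groups, group):
    """Fraction of favorable outcomes (1s) received by one group."""
    selected = [o for o, g in zip(outcomes, groups) if g == group]
    return sum(selected) / len(selected)

def disparate_impact(outcomes, groups, unprivileged, privileged):
    """Ratio of favorable-outcome rates; values far below 1.0 suggest bias."""
    return (favorable_rate(outcomes, groups, unprivileged)
            / favorable_rate(outcomes, groups, privileged))

def statistical_parity_difference(outcomes, groups, unprivileged, privileged):
    """Difference of favorable-outcome rates; 0.0 means parity."""
    return (favorable_rate(outcomes, groups, unprivileged)
            - favorable_rate(outcomes, groups, privileged))

# Toy data: 1 = favorable outcome. "M" is favored 3/4 of the time, "F" only 1/4.
outcomes = [1, 1, 1, 0, 1, 0, 0, 0]
groups   = ["M", "M", "M", "M", "F", "F", "F", "F"]
print(disparate_impact(outcomes, groups, "F", "M"))              # → 0.333…
print(statistical_parity_difference(outcomes, groups, "F", "M"))  # → -0.5
```

A common rule of thumb treats a disparate impact below 0.8 as evidence of adverse impact, which the toy data above fails badly.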
This toolbox also has an open-source Python implementation available and can be used to audit almost any predictive machine learning model for inherent biases. The main objective here is to measure how much the model's output changes with small shifts in the input features.
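That perturbation-based audit — shift each input feature slightly and measure how much the output moves — can be sketched with finite differences. The linear model below is a hypothetical stand-in for any black-box predictor:

```python
def sensitivity(model, x, delta=1e-4):
    """Approximate each feature's influence on the model's output.

    Nudges one feature at a time by `delta` and reports the resulting
    rate of change in the prediction (a finite-difference estimate).
    """
    base = model(x)
    scores = []
    for i in range(len(x)):
        shifted = list(x)
        shifted[i] += delta
        scores.append(abs(model(shifted) - base) / delta)
    return scores

# Stand-in black-box model: feature 1 dominates, feature 2 is ignored.
model = lambda x: 0.1 * x[0] + 2.0 * x[1] + 0.0 * x[2]

print(sensitivity(model, [1.0, 1.0, 1.0]))  # roughly [0.1, 2.0, 0.0]
```

If a sensitive attribute (say, a demographic feature) shows a large sensitivity score, the model is leaning on it, which is exactly the kind of inherent bias such an audit is meant to expose.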