In his visionary paper “Computing Machinery and Intelligence”, Alan Turing proposes a test for measuring whether computers can be regarded as intelligent, which he calls “The Imitation Game”. Rather than ponder what it actually means to be able to “think” or “feel”, which will always be defined differently by different individuals, Turing searches for a pragmatic definition. The idea he comes up with is as follows, if one cannot tell the difference between man and machine in some test – we will call such a machine an “intelligent” being. Thus perhaps the most fundamental question in the field of artificial intelligence begins with a game.
In this post, we will discuss some recent uses of games in the field of Machine Learning. Games can relate to any sort of system with multiple agents that have different goals they are trying to achieve. These games are often used in order to define a complex objective that cannot be explicitly stated. I believe that the power of these tools is not yet fully realized.
What are GANs?
GANs (Generative Adversarial Networks) were introduced by Goodfellow et al. (2014). The problem they aim to solve is the generation of synthetic examples that are “similar” to a given dataset. More specifically, we are interested in creating a probability distribution over examples that enables us to turn our discrete finite dataset into something continuous. The idea of GANs had a huge impact on the ML world, enabling the generation of Deepfakes and achieving state-of-the-art results in Computer Vision tasks.
GANs provide the technology for creating Deepfakes (source)
While discriminative networks typically take in high dimensional data and then output something low dimensional, generative networks typically take in low dimensional data and create something high dimensional. Thus generative networks can be used to generate synthetic pictures of faces, English text, and even new musical compositions
How does it work?
So far we’ve covered the generative part, which is the goal. Now we will focus on the adversarial part – the means.
It is not so straightforward to define whether a generated example could have been sampled from the original dataset. Of course, we could label each example from the dataset as 1, and anything else as 0, but that will incentivize our model to simply memorize the dataset and sample from it which is not our intention.
Perhaps one way to define our objective is generating examples that a human can’t distinguish from the original dataset. This is the main idea behind the proposed method. We train two models simultaneously: G (the generator) and D (the discriminator). The discriminator’s objective is to detect whether a given input came from the original dataset, or from the generator’s output, while the generator attempts to output examples that will fool the discriminator. This can be framed as a two-player minimax game with the following value function:
In less mathematical terms, we can imagine that G is a counterfeiter who prints fake money, while D is the cop trying to detect the fake banknotes.
During the training process, the generator becomes increasingly good at producing fake examples that seem real, while the discriminator becomes better at discerning what is real and what is fake.
A current hot topic in the ML world is fairness. A fair ML model is one that is not biased with respect to attributes such as gender, race, or other sensitive parameters. Intuitively, we might think that simply throwing away protected attributes from the data before training or inference will suffice. However, it turns out that it is not always so simple to define which attributes are considered protected. For example, a CV of a job applicant might include the words “my husband”, which implies that the applicant is probably female. And thus, even if we remove the field “gender” from the structured data, we may very likely have correlated features that remain in the processed data. Thus, there is a need for more sophisticated methods for creating unbiased ML models.
One such method is called adversarial debiasing. In this method, we do not throw away any attributes from the input data, but rather we ensure that the protected attributes can not be predicted with high probability from our model’s prediction. In order to guarantee this, we train an adversarial model A which aims to detect the protected attributes based on the predictor (P) output. Our total loss function is composed of the predictor loss and the adversarial loss.
The problem of creating unbiased predictive models can be viewed as an instance of a constrained optimization problem. We are interested in minimizing the original loss function of the predictor while satisfying the constraints – having an unbiased model with respect to sensitive attributes.
One of the most well-known methods for solving constrained optimization problems is the method of Lagrange multipliers. If it’s been a while since you took some calculus courses, or if you haven’t heard of this method before here’s a quick refresher.
Suppose we want to minimize the following function :
Subject to the constraint:
If we would not have the constraint, we could simply find the solution by equating the gradient to zero:
But how do we fit the constraints into the problem?
Lagrange’s method tells us to introduce a new variable, and create the Lagrangian function:
We now search for extremum points of a new unconstrained problem, and these give us candidates for the original problem. Looks like magic, right?
An Intuitive Game-theoretic Explanation
It turns out that the Lagrangian can be viewed as the value function of a zero-sum game. Player 1 wants to minimize the expression while they can control the values of x and y. Player 2 is looking to maximize the expression while controlling the value of. The Nash Equilibrium of this game gives us the values of x and y that satisfy the original constrained problem. How does this make sense?
Assume player 1 chooses a point that does not satisfy the constraint. In this case player 2 can choose a value that will make the total expression explode. Thus, player 1 is forced to try to minimize the function f while satisfying the constraint g.
This intuitive interpretation for the Lagrange multipliers method offers an insightful outlook for solving complex problems. Instead of attempting to have a single agent satisfy a complex constrained problem, we introduce a game with multiple agents, in which each player focuses on a simpler task.
To conclude, we have seen multiple use cases for training competing ML models simultaneously in a game-like setup. This can prove useful in cases where it is hard to define the objective function explicitly (as is the case with GANs), or where we want to solve a constrained optimization problem where it would be easier to use a “divide and conquer” policy. In my humble opinion, this idea is one of the neatest concepts in Machine Learning, and I believe it will be applied and generalized to more use cases in the near future.
M. TURING, I.—COMPUTING MACHINERY AND INTELLIGENCE, Mind, Volume LIX, Issue 236, October 1950, Pages 433–460
Zhang, B.H., Lemoine, B., & Mitchell, M. (2018). Mitigating Unwanted Biases with Adversarial Learning. Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society
Beutel, A., Chen, J., Zhao, Z., & Chi, E.H. (2017). Data Decisions and Theoretical Implications when Adversarially Learning Fair Representations. ArXiv, abs/1707.00075
Auto-Regressive Generative Models (PixelRNN, PixelCNN++)
Gordon, G., Linear Programming, Lagrange Multipliers, and Duality