Created to solve practical problems, artificial neural networks, or ANNs, can perform complex (if ultimately mundane) tasks like identifying spoken words or optimizing financial networks. But the structure of these networks also lends itself well to untangling the complicated problems that real humans face.

Now DeepMind, the artificial intelligence company owned by Alphabet, is taking some of the oldest thought experiments in game theory and using ANNs to find unbiased takes on the "best" strategy for each. What the researchers are finding could revolutionize thinking in the social sciences, and begin to introduce a whole new way of running experiments in human behavior.

One of the oldest problems in game theory is called the Prisoner's Dilemma, in which two people are imprisoned and held in separate cells. The cops need confessions to make the big charge stick, so they give each prisoner a choice: If you testify against the other prisoner while he stays silent, you will be released without any punishment, but he will serve three years in prison. If you each blame the other, you will both serve two years. If neither of you defects, the cops will press the lesser charge and you will each receive a single year's sentence.

The goal for each prisoner is to minimize his or her time in jail, but every outcome depends on what the other prisoner chooses, and neither knows which choice the other is going to make. This tension between individual and collective interest is called a social dilemma, and it's classically very difficult for a human being to accurately think through all the third- and fourth-order consequences of their own actions within such a recursively interconnected system.
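The sentences described above can be written out as a small payoff table. The sketch below (an illustration, not DeepMind's code; the function name `best_response` is my own) shows why the dilemma is so sharp: whatever the other prisoner does, testifying never earns you a longer sentence.

```python
# Sentences (in years) from the article's version of the Prisoner's Dilemma,
# indexed by (my choice, other prisoner's choice). "defect" means testifying
# against the other prisoner; "cooperate" means staying silent.
SENTENCE = {
    ("cooperate", "cooperate"): 1,  # neither defects: lesser charge
    ("cooperate", "defect"): 3,     # I stay silent, the other testifies
    ("defect", "cooperate"): 0,     # I testify, the other stays silent
    ("defect", "defect"): 2,        # we blame each other
}

def best_response(other_choice):
    """Return the choice that minimizes my sentence, given the other's choice."""
    return min(("cooperate", "defect"),
               key=lambda mine: SENTENCE[(mine, other_choice)])

# Whichever move the other prisoner makes, defecting is the better reply:
print(best_response("cooperate"))  # "defect" (0 years beats 1)
print(best_response("defect"))     # "defect" (2 years beats 3)
```

Because defection is the best reply to either move, two purely self-interested prisoners end up with two years each, even though mutual silence would have cost them only one year apiece.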

But artificial neural networks are built specifically to see patterns within complex self-interacting systems, so maybe they can help. DeepMind's engineers created two versions of similar social dilemmas and let deep learning ANNs figure out how to maximize their success under very simple gameplay rules.

In one, two actors (a red square and a blue square) have to collect as many "apples" (green squares) as they can. The twist is that the number of apples available per unit time slowly decreases, leading to resource shortages and, naturally, competition. Each actor can also "shoot" its opponent to freeze it: firing costs the shooter a turn it could have spent gathering apples, but keeps the frozen opponent out of the game even longer.

As you might imagine, the fighting grew far fiercer as the apples got scarcer. In such a situation, it suddenly generates better outcomes to spend a large portion of time on offense rather than gathering, but as in the Prisoner's Dilemma, this strategy is available to both squares. The game also revealed that more complex actors, with a wider array of possible actions, will exploit that extra freedom to act less cooperatively and more aggressively.


However, the second of DeepMind's digital thought experiments showed the opposite effect. While the apple-gathering game offered an advantage for aggressiveness, Wolfpack offers an advantage for cooperation, and when the actors became more complex, they again used that complexity in the most advantageous way available. This game tended to naturally produce cooperative routines, and the "smarter" the agents got, the more cooperative they became.

Graphs of agent behavior in DeepMind's two virtual social experiments. Credit: DeepMind

The team's overall approach is to model social dilemmas as what it calls sequential social dilemmas (SSDs), which are much easier to put into practice in software. In the non-sequential Prisoner's Dilemma, each agent makes a single decision and goes to jail a single time, which removes any chance of learning and adaptation. In a sequential game, each decision adjusts the environment in which the next decision occurs, a setup that much more closely mirrors the real social and economic systems people interact with in their daily lives.
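The difference between a one-shot and a sequential dilemma can be sketched in a few lines. The strategies and payoffs below are textbook illustrations, not DeepMind's learned agents or reward values: a conditional cooperator ("tit for tat") can only exist when there is a history to react to.

```python
# A minimal sketch of what makes a dilemma "sequential": the game repeats,
# and each agent can condition its next move on what happened before.

def tit_for_tat(opponent_history):
    # Cooperate on the first round, then copy the opponent's last move.
    return opponent_history[-1] if opponent_history else "cooperate"

def always_defect(opponent_history):
    return "defect"

# Standard illustrative rewards: (my move, their move) -> my payoff.
PAYOFF = {
    ("cooperate", "cooperate"): 3,
    ("cooperate", "defect"): 0,
    ("defect", "cooperate"): 5,
    ("defect", "defect"): 1,
}

def play(strategy_a, strategy_b, rounds=10):
    """Play a repeated game; each agent sees only the other's past moves."""
    seen_by_a, seen_by_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(seen_by_a)
        move_b = strategy_b(seen_by_b)
        score_a += PAYOFF[(move_a, move_b)]
        score_b += PAYOFF[(move_b, move_a)]
        seen_by_a.append(move_b)
        seen_by_b.append(move_a)
    return score_a, score_b

# Two conditional cooperators sustain cooperation across all ten rounds...
print(play(tit_for_tat, tit_for_tat))    # (30, 30)
# ...while an unconditional defector wins the first round, then gets
# punished for the remaining nine.
print(play(tit_for_tat, always_defect))  # (9, 14)
```

In the one-shot version of this game, the history lists are always empty and every strategy collapses to a single unconditional choice; only repetition lets behavior like retaliation or trust-building emerge at all.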

ANNs have long been used to model real processes, from traffic congestion to autistic brain structures. But as these computer models become more nuanced, they are beginning to capture far more complex ideas, looking less at what people will do and more at why they choose to do it.