October 22, 2004

New Tack Wins Prisoner's Dilemma

Here's a classic example of how a competition based on the economics problem known as the Prisoner's Dilemma has been exploited: A seemingly complete theory has once again been turned on its head. All's fair in love and war, and the best attacks come when we challenge the other guy's assumptions.

New Tack Wins Prisoner's Dilemma
By Wendy M. Grossman
Story location: http://www.wired.com/news/culture/0,1284,65317,00.html
02:00 AM Oct. 13, 2004 PT

Proving that a new approach can secure victory in a classic strategy game, a team from England's Southampton University has won the 20th-anniversary Iterated Prisoner's Dilemma competition, toppling the long-term winner from its throne.

The Southampton group, whose primary research area is software agents, said its strategy involved a series of moves allowing players to recognize each other and act cooperatively.

The Prisoner's Dilemma is a game-theory problem for two players. As typically described, two accomplices are arrested and separated for interrogation by the police, who give each the same choice: confess to authorities (defect) or remain silent (cooperate). If one defects and the other cooperates, the defector walks free and the cooperator gets 10 years in jail. If both cooperate, both get six months. If both defect, both get six years. Neither suspect knows the other's choice.
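The jail terms above can be written as a simple lookup table. This is a sketch in Python using the sentences from the story (so lower is better); the competition itself scores points rather than years, but the structure is identical:

```python
# Jail terms from the story, in years (lower is better).
# 'C' = cooperate (stay silent), 'D' = defect (confess).
JAIL_YEARS = {
    ('C', 'C'): (0.5, 0.5),  # both stay silent: six months each
    ('C', 'D'): (10, 0),     # lone cooperator gets 10 years, defector walks
    ('D', 'C'): (0, 10),
    ('D', 'D'): (6, 6),      # both confess: six years each
}

def play_round(move_a, move_b):
    """Return the (years_a, years_b) outcome of one round."""
    return JAIL_YEARS[(move_a, move_b)]
```

Note the tension that makes it a dilemma: whatever the other prisoner does, defecting gives you a shorter sentence, yet mutual defection leaves both worse off than mutual cooperation.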

"The Prisoner's Dilemma is this canonical problem of how to get cooperation to emerge from selfish agents," said Nick Jennings, a professor in computer science at Southampton University and leader of the winning team along with his Ph.D. student, Gopal Ramchurn. "People are very keen on it because they can see so many parallels in real life."

Before Southampton came along, a strategy called Tit for Tat had a consistent record of winning the game. Under that strategy, a player's first move is always to cooperate with other players. Afterward, the player echoes whatever the other players do. The strategy is similar to the one nuclear powers adopted during the Cold War, each promising not to use its weaponry so long as the other side refrained from doing so as well.
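Tit for Tat is short enough to state in a few lines. A minimal sketch, taking the opponent's move history and returning this player's next move:

```python
def tit_for_tat(their_history):
    """Cooperate on the first move, then echo the opponent's last move."""
    if not their_history:
        return 'C'            # open by cooperating
    return their_history[-1]  # thereafter, mirror whatever they just did
```

Against a consistent cooperator it cooperates forever; against a defector it retaliates on the very next move, but forgives as soon as the opponent returns to cooperating.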

The 20th-anniversary competition was the brainchild of Graham Kendall, a lecturer in the University of Nottingham's School of Computer Science and Information Technology and a researcher in game theory, and was based on the original 1984 competition run by a University of Michigan political scientist, Robert Axelrod.

The Iterated Prisoner's Dilemma is a version of the game in which the choice is repeated over and over again and in which the players can remember their previous moves, allowing them to evolve a cooperative strategy. The 2004 competition had 223 entries, with each player playing all the other players in a round robin setup. Because Axelrod's original competition was run twice, Kendall will run a second competition in April 2005, for which he hopes to attract even more entries.
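A round-robin iterated tournament of this kind can be sketched as follows. The article does not give the 2004 scoring, so the point values below are the conventional Axelrod tournament scores, used here as an assumption:

```python
import itertools

# Conventional Axelrod tournament points (an assumption; the article
# doesn't state the 2004 values): mutual cooperation 3 each, mutual
# defection 1 each, lone defector 5, lone cooperator 0.
POINTS = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def iterated_match(strat_a, strat_b, rounds=200):
    """Play one iterated match; each strategy sees both histories."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strat_a(hist_a, hist_b)
        b = strat_b(hist_b, hist_a)
        hist_a.append(a)
        hist_b.append(b)
        pa, pb = POINTS[(a, b)]
        score_a += pa
        score_b += pb
    return score_a, score_b

def round_robin(strategies, rounds=200):
    """Every named strategy plays every other once; return total points."""
    totals = {name: 0 for name in strategies}
    for (na, sa), (nb, sb) in itertools.combinations(strategies.items(), 2):
        pa, pb = iterated_match(sa, sb, rounds)
        totals[na] += pa
        totals[nb] += pb
    return totals
```

With 223 entries this pairing loop runs 223 × 222 / 2 = 24,753 matches, which is why even simple strategies generate a substantial table of results.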

Teams could submit multiple strategies, or players, and the Southampton team submitted 60 programs. These, Jennings explained, were all slight variations on a theme and were designed to execute a known series of five to 10 moves by which they could recognize each other. Once two Southampton players recognized each other, they were designed to immediately assume "master and slave" roles -- one would sacrifice itself so the other could win repeatedly.

If the program recognized that another player was not a Southampton entry, it would immediately defect to act as a spoiler for the non-Southampton player. The result is that Southampton had the top three performers -- but also a load of utter failures at the bottom of the table who sacrificed themselves for the good of the team.
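The article only outlines the handshake, so the following is a hypothetical sketch of the mechanism it describes: a made-up five-move signature, recognition, then master/slave roles. The real signature, role assignment, and 60-program lineup are the Southampton team's own and are not public here:

```python
# Hypothetical handshake: the real Southampton signature is not given
# in the article, so this five-move sequence is invented.
SIGNATURE = ['C', 'D', 'D', 'C', 'C']

def make_colluder(role):
    """Build a colluding strategy; role is 'master' or 'slave'."""
    def strategy(my_hist, their_hist):
        turn = len(my_hist)
        if turn < len(SIGNATURE):
            return SIGNATURE[turn]  # still transmitting the handshake
        if their_hist[:len(SIGNATURE)] == SIGNATURE:
            # Teammate recognized: the slave feeds the master by always
            # cooperating while the master always defects.
            return 'D' if role == 'master' else 'C'
        return 'D'  # outsider detected: defect as a spoiler
    return strategy
```

In a real entry the signature would need to be long enough that an outsider is unlikely to produce it by accident; five moves is purely illustrative.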

Another twist to the game was the addition of noise, which allowed some moves to be deliberately misrepresented. In the original game, the two prisoners could not communicate. But Southampton's design lets the prisoners do the equivalent of signaling to each other their intentions by tapping in Morse code on the prison wall.

Kendall noted that there was nothing in the competition rules to preclude such a strategy, though he admitted that the ability to submit multiple players means it's difficult to tell whether this strategy would really beat Tit for Tat in the original version. But he believes it would be impossible to prevent collusion between entrants.

"Ultimately," he said, "what's more important is the research."

"What's interesting from our point of view," he said, "was to test some ideas we had about teamwork in general agent systems, and this detection of working together as a team is a quite fundamental problem. What was interesting was to see how many colluders you need in a population. It turns out we had far too many -- we would have won with around 20."

Jennings is also interested in testing the strategy on an evolutionary variant of the game in which each player plays only its neighbors on a grid. If your neighbors do better than you do, you adopt their strategy.

"Our initial results tell us that ours is an evolutionarily stable strategy -- if we start off with a reasonable number of our colluders in the system, in the end everyone will be a colluder like ours," he said.
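The imitate-your-best-neighbor rule of that spatial variant can be sketched in miniature. Assumptions not in the article: each strategy is collapsed to a single fixed move, the grid wraps around (a torus), each cell has four neighbors, and the payoffs are the conventional point values:

```python
# Toy spatial Prisoner's Dilemma: each cell plays its four neighbors,
# then copies its best-scoring neighbor if that neighbor did better.
# Strategies are collapsed to one fixed move ('C' or 'D'), a big
# simplification of the full iterated game.
POINTS = {('C', 'C'): 3, ('C', 'D'): 0, ('D', 'C'): 5, ('D', 'D'): 1}

def step(grid):
    """One generation: score every cell, then imitate better neighbors."""
    n = len(grid)

    def neighbors(i, j):  # wrap around the edges (torus)
        return [((i - 1) % n, j), ((i + 1) % n, j),
                (i, (j - 1) % n), (i, (j + 1) % n)]

    score = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for x, y in neighbors(i, j):
                score[i][j] += POINTS[(grid[i][j], grid[x][y])]

    new = [row[:] for row in grid]
    for i in range(n):
        for j in range(n):
            bx, by = max(neighbors(i, j), key=lambda p: score[p[0]][p[1]])
            if score[bx][by] > score[i][j]:
                new[i][j] = grid[bx][by]
    return new
```

Under this rule a uniform population is stable (all scores are equal, so nobody switches), which is the flavor of "evolutionarily stable" result Jennings describes: once enough colluders are present, imitation spreads them rather than displacing them.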

The winners don't get much -- an unexpected $50 check and a small plaque. But, says Kendall, "Everybody in our field knows the name of Anatol Rapoport, who won the Axelrod competition. So if you can win the 20th-anniversary one, in our field there's a certain historical significance."

Posted by iang at October 22, 2004 10:56 AM | TrackBack

I read about this one -- IMHO they made a meta-error. Because now you just have a prisoner's dilemma regarding how the agents WITHIN that "Southampton group" should act -- i.e., whether or not to follow the scheme the programmers set, as it were.

BTW, talking of the _Evolution of Cooperation_ book, it's a great pity Axelrod more or less went crazy later in life and joined the U.N.!!!!

Posted by: JPMay at October 23, 2004 07:14 AM


well, you *could* say that ... I'd say the agents were always supposed to cooperate anyway; that's the starting assumption for the two prisoners.

What is interesting, though, is that up until now everyone assumed it was always exactly two prisoners, and that only what happened between those two prisoners mattered. Of course, in the game there were multiple sets of two prisoners, and nobody thought about collusion across sets, only within each set, as if each pair were the entire universe. A classic example of being stuck in the box of one's assumptions!

But in practice, what the Southampton team looked at is more like the "real problem." When you think about those two prisoners, the reason they cooperate is that the Don back home with their wives and children will rub out anyone who doesn't cooperate...

So there are more than two players in the game, in general. And the simplification to two players that was made for explanatory purposes back when Axelrod was sane has kind of become an accepted tenet of the science, at least as the game would have it.

Posted by: Iang at October 23, 2004 07:19 AM