book excerptise:   a book unexamined is wasting trees

The evolution of cooperation

Robert M. Axelrod

Axelrod, Robert M.;

The evolution of cooperation

Basic Books, 1984, 241 pages

ISBN 0465021220, 9780465021222

topics: |  social | psychology | evolution |


The classic experiment in the game-theoretic analysis of multi-agent
interactions.  This body of work may well be the most cited game theory work
in areas from statistical mechanics to behavioural economics to cognitive
psychology.  For example, see Matt Ridley's
The Origins of Virtue: Human Instincts and the Evolution of Cooperation
(1998) or Philip Ball's Critical Mass: How
One Thing Leads to Another (2006)

The book arose out of an experiment that Axelrod conducted in which he
invited programmers to send in code for playing the Iterated Prisoner's
Dilemma game.

Read on...


Excerpts

I invited experts in game theory to submit programs for a Computer
Prisoner's Dilemma tournament — much like a computer chess tournament. vii

Each program would have available to it the history of the interaction so
far and could use this history in making its choice of whether or not to
cooperate on the current move. Entries came from game theorists in
economics, psychology, sociology, political science, and mathematics. I ran
the fourteen entries and a random rule against each other in a round robin
tournament. To my considerable surprise, the winner was the simplest of all
the programs submitted, TIT FOR TAT.

TIT FOR TAT is merely the strategy of starting with cooperation, and
thereafter doing what the other player did on the previous move.
[defect if it defects, else keep cooperating]
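The rule is tiny in code; a minimal Python sketch (the function name and the 'C'/'D' move encoding are illustrative, not Axelrod's notation):

```python
def tit_for_tat(opponent_history):
    """TIT FOR TAT: cooperate on the first move, thereafter copy
    whatever the other player did on the previous move.
    `opponent_history` is the list of the opponent's past moves."""
    if not opponent_history:      # first move: cooperate
        return 'C'
    return opponent_history[-1]   # echo the opponent's last move
```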

I then circulated the results and solicited entries for a second round of
the tournament. This time I received sixty-two entries from six
countries. Most of the contestants were computer hobbyists, but there were
also professors of evolutionary biology, physics, and computer science, as
well as the five disciplines represented in the first round. As in the
first round, some very elaborate programs were submitted. There were also a
number of attempts to improve on TIT FOR TAT itself. TIT FOR TAT was again
sent in by the winner of the first round, Anatol Rapoport of the University
of Toronto. Again it won.

I suspected that the properties that made TIT FOR TAT so successful in the
tournaments would work in a world where any strategy was possible. If so,
then cooperation based solely on reciprocity seemed possible.

The tournament results were published in the Journal of Conflict Resolution
(Axelrod 1980a and 1980b), and are presented here in revised form in
chapter 2. The theoretical results about initial viability, robustness, and
stability were published in the American Political Science Review (Axelrod
1981). These findings provide the basis for chapter 3.

After thinking about the evolution of cooperation in a social context, I
realized that the findings also had implications for biological
evolution. So I collaborated with a biologist— William Hamilton—to develop
the biological implications of these strategic ideas. This resulted in a
paper published in Science (Axelrod and Hamilton 1981) which appears here
in revised form as chapter 5.


1 The Problem of Cooperation

US Senate


Each senator has an incentive to appear effective to his or her constituents,
even at the expense of conflicting with other senators who are trying to
appear effective to their constituents. But this is hardly a situation of
completely opposing interests, a zero-sum game. On the contrary, there are
many opportunities for mutually rewarding activities by two senators. These
mutually rewarding actions have led to the creation of an elaborate set of
norms, or folkways, in the Senate. Among the most important of these is the
norm of reciprocity—a folkway which involves helping out a colleague and
getting repaid in kind. It includes vote trading but extends to so many types
of mutually rewarding behavior that "it is not an exaggeration to say that
reciprocity is a way of life in the Senate" (Matthews 1960, p. 100; see also
Mayhew 1975).

Washington was not always like this. Early observers saw the members of the
Washington community as quite unscrupulous, unreliable, and characterized by
"falsehood, deceit, treachery" (Smith 1906, p. 190). By the 1980s the
practice of reciprocity was well established.

Prisoner's Dilemma game


First num = payoff for ROW

                    ---------------------------------
                    |         Column player         |
                    |    co-op          defect      |
--------------------|-------------------------------|
            co-op   |    3, 3            0, 5       |
 Row player         |                               |
            defect  |    5, 0            1, 1       |
-----------------------------------------------------

	5 = Temptation to defect
	0 = Sucker's payoff
	3,3 = win-win
	1,1 = lose-lose

Dilemma:
	Whatever the other player does, each player can see that it is
	better off defecting.  But if both follow this logic they end up
	at (1,1), which is worse for each than the (3,3) of mutual
	cooperation.  Thus individually rational decisions lead to a poor
	outcome for both.  This is the dilemma.
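The dominance argument can be checked mechanically; a small Python sketch (the payoff encoding is mine, with values taken from the table above):

```python
# Payoffs as (row, column) pairs: T=5 temptation, R=3 reward,
# P=1 punishment, S=0 sucker's payoff
PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def row_payoff(row_move, col_move):
    return PAYOFF[(row_move, col_move)][0]

# Whatever the column player does, Row earns more by defecting...
for col in ('C', 'D'):
    assert row_payoff('D', col) > row_payoff('C', col)

# ...yet mutual defection (1,1) is worse for both than mutual cooperation (3,3)
assert PAYOFF[('D', 'D')] < PAYOFF[('C', 'C')]
```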

2 The Tournament


The most important kingmaker was based on an "outcome maximization" principle
originally developed as a possible interpretation of what human subjects do
in the Prisoner's Dilemma laboratory experiments (Downing 1975). This rule,
called DOWNING, is a particularly interesting...

If the other player does not seem responsive, DOWNING will try to get away
with whatever it can by defecting. On the other hand, if the other player
does seem responsive, DOWNING will cooperate. To
judge the other's responsiveness, DOWNING estimates the probability that the
other player cooperates after it (DOWNING) cooperates, and also the
probability that the other player cooperates after DOWNING defects. For each
move, it updates its estimate of these two conditional probabilities and then
selects the choice which will maximize its own long-term payoff under the
assumption that it has correctly modeled the other player. If the two
conditional probabilities have similar values, DOWNING determines that it
pays to defect, since the other player seems to be doing the
same thing whether DOWNING cooperates or not. Conversely, if the other player
tends to cooperate after a cooperation but not after a defection by DOWNING,
then the other player seems responsive, and DOWNING will calculate that the
best thing to do with a responsive player is to cooperate. Under certain
circumstances, DOWNING will even determine that the best strategy is to
alternate cooperation and defection.
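A rough Python sketch of the DOWNING idea (a reconstruction from the description above, not Axelrod's or Downing's code; the one-step expected-value shortcut and the initial counts are my assumptions, since the real DOWNING maximizes long-term payoff):

```python
class Downing:
    """Estimate P(opponent cooperates | I cooperated) and
    P(opponent cooperates | I defected); pick the move with the
    higher expected payoff under that model of the opponent."""
    def __init__(self):
        # start from neutral 0.5 estimates (an assumption): [coops, trials]
        self.coop_after = {'C': [1, 2], 'D': [1, 2]}
        self.my_last = None

    def observe(self, opponent_move):
        # update the conditional-probability estimate for my last move
        if self.my_last is not None:
            s, n = self.coop_after[self.my_last]
            self.coop_after[self.my_last] = [s + (opponent_move == 'C'), n + 1]

    def choose(self):
        pC = self.coop_after['C'][0] / self.coop_after['C'][1]
        pD = self.coop_after['D'][0] / self.coop_after['D'][1]
        # one-step expected payoffs with T=5, R=3, P=1, S=0
        ev_C = 3 * pC + 0 * (1 - pC)
        ev_D = 5 * pD + 1 * (1 - pD)
        # note: if pC and pD are similar, ev_D > ev_C, so DOWNING defects,
        # matching the text's point about unresponsive opponents
        self.my_last = 'C' if ev_C > ev_D else 'D'
        return self.my_last
```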

The one that scored lowest was also the one that was least forgiving. This is
FRIEDMAN, a totally unforgiving rule that employs permanent retaliation.  It
is never the first to defect, but once the other defects even once, FRIEDMAN
defects from then on. In contrast, the winner, TIT FOR TAT, is unforgiving
for one move, but thereafter is totally forgiving of that defection.  After
one punishment, it lets bygones be bygones.
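FRIEDMAN, often called a "grim trigger" strategy, is one line of logic; a minimal Python sketch (names and move encoding are illustrative):

```python
def friedman(opponent_history):
    """FRIEDMAN (permanent retaliation): never the first to defect,
    but once the other player defects even once, defect forever."""
    return 'D' if 'D' in opponent_history else 'C'
```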

JOSS is a sneaky rule that tries to get away with an occasional defection.
This decision rule is a variation of TIT FOR TAT. Like TIT FOR TAT, it always
defects immediately after the other player defects. But instead of always
cooperating after the other player cooperates, 10 percent of the time it
defects after the other player cooperates. Thus it tries to sneak in an
occasional exploitation of the other player.  This decision rule seems like a
fairly small variation of TIT FOR TAT, but in fact its overall performance
was much worse, and it is interesting to see exactly why. Table 1 shows the
move-by-move history of a game between JOSS and TIT FOR TAT. At first both
players cooperated, but on the sixth move JOSS selected one of its
probabilistic defections. On the next move JOSS cooperated again, but TIT FOR
TAT defected in response to JOSS's previous defection. Then JOSS defected in
response to TIT FOR TAT's defection. In effect, the single defection of JOSS
on the sixth move created an echo back and forth between JOSS and TIT FOR
TAT. This echo resulted in JOSS defecting on all the subsequent even numbered
moves and TIT FOR TAT defecting on all the subsequent odd numbered moves.
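The echo can be replayed in a toy simulation in which JOSS's one probabilistic defection is pinned to move six (a sketch, not the tournament code; move numbers are 1-based):

```python
def play_echo(rounds=12, joss_defects_at=6):
    """Replay the JOSS vs TIT FOR TAT echo, with JOSS's single
    probabilistic defection forced at a fixed move."""
    joss_hist, tft_hist = [], []
    for move in range(1, rounds + 1):
        # TIT FOR TAT: cooperate first, then copy JOSS's previous move
        tft = 'C' if not joss_hist else joss_hist[-1]
        # JOSS: like TFT, but here its one 10-percent defection is
        # pinned deterministically to move `joss_defects_at`
        joss = 'C' if not tft_hist else tft_hist[-1]
        if move == joss_defects_at:
            joss = 'D'
        joss_hist.append(joss)
        tft_hist.append(tft)
    return joss_hist, tft_hist

joss, tft = play_echo()
# after move 6, JOSS defects on the even moves and TFT on the odd ones
```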

Second Round

[The] entrants to the second round were all given the detailed
analysis of the first round, including a discussion of the
supplemental rules that would have done very well in the
environment of the first round.

An important element (ch.8) was the addition of noise:

	A player might not be certain about the choice actually made by the
	other player in the previous move. There could be problems of random
	noise or systematic misperception (Jervis 1976). To study this, the
	first round of the tournament was rerun with a 1 percent chance of
	misperception of the other's previous move.  This resulted in yet
	another victory for TIT FOR TAT. This result indicates that TIT FOR
	TAT is relatively robust under conditions of moderate error in
	perception.
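The misperception rerun can be modeled as a noisy channel between the players; a minimal sketch (the 1 percent rate is from the passage, the function and its parameters are my framing):

```python
import random

def perceived(actual_move, noise=0.01, rng=random):
    """With probability `noise`, a player misreads the other's
    previous move, flipping 'C' to 'D' or vice versa."""
    if rng.random() < noise:
        return 'D' if actual_move == 'C' else 'C'
    return actual_move
```

Each strategy would then be fed `perceived(opponents_last_move)` rather than the true move when the noisy variant of the tournament is rerun.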


Later modifications of TfT, known as Contrite Tit-for-Tat, enable
better performance in Iterated PD games played under noise.

e.g. Boyd, Robert (1989). "Mistakes Allow Evolutionary Stability in the
Repeated Prisoner's Dilemma Game". Journal of Theoretical Biology 136 (1):
47–56.

In high noise situations, contrite TfT seems to do very well.
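The "standings" bookkeeping behind Contrite TfT can be sketched as follows (a reconstruction from descriptions in the literature, not code from Boyd's paper; the update rule encodes the convention that a defection keeps you in good standing only when it punishes a player already in bad standing):

```python
class ContriteTFT:
    """Contrite Tit For Tat: defect only when in good standing against
    an opponent in bad standing; if one's own (possibly erroneous)
    defection caused the bad standing, cooperate contritely and
    accept one punishment."""
    def __init__(self):
        self.my_standing = 'good'
        self.opp_standing = 'good'

    def choose(self):
        if self.my_standing == 'good' and self.opp_standing == 'bad':
            return 'D'   # justified punishment
        return 'C'       # cooperate (contritely, if in bad standing)

    def update(self, my_move, opp_move):
        # defection preserves good standing only against a bad-standing player
        new_my = 'good' if my_move == 'C' or self.opp_standing == 'bad' else 'bad'
        new_opp = 'good' if opp_move == 'C' or self.my_standing == 'bad' else 'bad'
        self.my_standing, self.opp_standing = new_my, new_opp
```

Against itself, one mis-implemented defection costs the erring player a single punishment round, after which mutual cooperation resumes, instead of the endless echo that plain TFT falls into.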


Modifications of Tit-for-Tat : World War I


A fascinating case of the development of cooperation based on continuing
interaction occurred in the trench warfare of World War I. In the midst of
this very brutal war there developed between the men facing each other what
came to be called the "live-and-let-live system." The troops would attack
each other when ordered to do so, but between large battles each side would
deliberately avoid doing much harm to the other side—provided that the other
side reciprocated.

The strategy was not necessarily TIT FOR TAT. Sometimes it was two for
one. As a British officer wrote in his memoirs of the takeover of a new
sector from the French:

	It was the French practice to "let sleeping dogs lie" when in a quiet
	sector . . . and of making this clear by retorting vigorously only
	when challenged. In one sector which we took over from them they
	explained to me that they had practically a code which the enemy well
	understood: they fired two shots for every one that came over, but
	never fired first. (Kelly 1930, p. 18)

Such practices of tacit cooperation were quite illegal— but they were also
endemic. For several years this system developed and elaborated itself
despite the passions of the war and the best efforts of the generals to
pursue a policy of constant attrition. The story is so rich in illuminating
detail that all of the next chapter will be devoted to it.

[...]

Similarities in basic needs and activities let the soldiers appreciate that
the other side would probably not be following a strategy of unconditional
defection.  Thus, in the summer of 1915, a soldier saw that the enemy would
be likely to reciprocate cooperation based on the desire for fresh rations.

	It would be child's play to shell the road behind the enemy's
	trenches, crowded as it must be with ration wagons and water carts,
	into a bloodstained wilderness ... but on the whole there is
	silence. After all, if you prevent your enemy from drawing his
	rations, his remedy is simple: he will prevent you from drawing
	yours. (Hay 1916, pp. 224-25)


[This last example seems more like a model of Contrite-TfT...]



Modern Variants of Contrite TfT

cTfT has also spawned a number of variants, as seen in this paper...


@article{boerlijst-nowak-97_logic-of-contrition,
  title={The logic of contrition},
  author={Boerlijst, Maarten C and Nowak, Martin A and Sigmund, Karl},
  journal={Journal of Theoretical Biology},
  volume={185},
  number={3},
  pages={281--293},
  year={1997},
  publisher={Elsevier},
  annote = {

Abstract

A highly successful strategy for the Repeated Prisoner's Dilemma is Contrite
Tit For Tat, which bases its decisions on the “standings” of the two
players. This strategy is as good as Tit For Tat at invading populations of
defectors, and much better at overcoming errors in implementation against
players who are also using it. However, it is vulnerable to errors in
perception. In this paper, we discuss the merits of Contrite Tit For Tat and
compare it with other strategies, like Pavlov and the newly-introduced
Remorse. We embed these strategies into an eight-dimensional space of
stochastic strategies which we investigate by analytical means and numerical
simulations. Finally, we show that if one replaces the conventions concerning
the “standing” by other, even simpler conventions, one obtains an
evolutionarily stable strategy (called Prudent Pavlov) which is immune
against both mis-perception and mis-implementation.

}}
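Pavlov, mentioned in the abstract, is itself easy to state as "win-stay, lose-shift"; a minimal sketch (the first-move choice and the encoding are my assumptions, and this is not the paper's formulation):

```python
def pavlov(my_last, opp_last):
    """Pavlov (win-stay, lose-shift): repeat your last move if it
    earned a good payoff (T=5 or R=3), otherwise switch.
    Equivalently: cooperate iff both players moved alike last round."""
    if my_last is None:   # first move: cooperate (an assumption)
        return 'C'
    return 'C' if my_last == opp_last else 'D'
```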


World War II alliance formation

[also see fascinating later work, Axelrod and Bennett, on modeling the
process of coalition formation as minima in a coalition energy landscape.
* Axelrod, Robert and Bennett, D Scott;
  A landscape theory of aggregation,
  British Journal of Political Science, v.23(2), 1993.

	
   	Two “energy” minima emerge in the landscape of alliance formation
	before WW 2.
	The deepest minimum is very similar to the actual split into Allied
	and Axis powers - only Portugal and Poland are placed in the “wrong”
	camp. The other basin predicts a very different history, with Europe
	united against the Soviet Union.
	[merges a hypothetical figure in axelrod & bennett 93]

fig. from Critical Mass: How One Thing
Leads to Another by Philip Ball (2006)


bookexcerptise is maintained by a small group of editors. get in touch with us! bookexcerptise [at] gmail [dot] com.

This review by Amit Mukerjee was last updated on : 2015 Sep 06