October 15, 2004

The Coordination Problem

Canny financial cryptographers know that connection-oriented protcools like TCP are not up to scratch if you really care about your packets. They are reliable, but that doesn't mean you can rely on them! TCP is only a reliable protocol until it stops, and then it becomes unreliable - there is no way for you to tell whether a dropped connection delivered the data or not. Or, indeed, how many times.

This problem is underwritten by what amounts to a law of computer science called the Coordination Problem. In trying to recall what this was called, I asked Twan van der Schoot, and predictably, he gave me the fullest answer. Here it is, somewhat edited:

The "Law" you are refering to [writes Twan] is the result of the attempt to solve the "Coordination Problem" or "Coordinated Attack Problem", failing in the attempt, and then proving that it cannot be solved. People generally say it is a "folk" problem in the distributed systems community. Paul Syverson attributes the original problem statement to Gray (1978).

Here's the proof. It is very simple and of a rare beauty. The only thing wrong with it is that it needs an indirect method using a reductio ad absurdum argument.

The problem setup:

We have two perfect processes (i.e. which do not fail), say Alice and Bob. Alice and Bob communicate bidirectionally over a channel with transient errors (i.e. an imperfect channel). Can you devise a protocol that guarantees that Alice and Bob both choose the same action a or b?

The proof:

1. Assume there is such a protocol. And, without loss of generality, assume that it is the shortest one (i.e. the one with the least communication exchanges in either direction).

2a. Assume that Alice just has sent its last message m in the last step of the protocol. At this point in the protocol, Alice's choice of action a or b must be independent of the message m. Alice will not receive any message thereafter. In other words, prior to sending m, Alice already committed to either action a or b independent of message m.

2b. The choice by Bob for either action a or b must be the same as the choice by Alice after receiving message m from Alice, whether Bob received message m or not (message loss). So Bob's commitment to action a or b is also independent of the received message m.

2c. But then sending message m in the last step of the protocol is redundant, and can be dropped. But then we have a protocol which is one step shorter;

4. But then we have a contradiction, because the protocol was the shortest (1). Hence there is no such protocol.

Conclusion: So there is no protocol which solves the coordination problem.

There are more formal proofs, but they require a lot of formal theory. This one I grabbed from "Distributed Systems, 2nd Edition. Sape Mullender ed. Addison-Wesley 1993".

Note, however, that the "coordination attack" fails in an "absolute" sense. If we allow for some probablity thinking, we can at least approximate a solution. And that is why, say, the Internet works :)

Paul Syverson wrote a little monopgraph in 2003 "Logic, Convention, and Common Knowledge; A Conventionalist Account of Logic". Syverson claims that the Coordination Attack can be solved using a combination of logic and game theoretical ideas. I've started reading the book recently and it is laden with philosophy, (epistemic) logic and game theory. So it will take a little time before I'll grasp the basic tennet of the underlying concepts.



Posted by iang at October 15, 2004 02:01 PM | TrackBack

Fascinating. I've never before heard this classic problem referred to as the "coordination problem". I first read about this problem in this paper by Akkoyunlu, Ekanadham, and Huber, "Some Constraints and Tradeoffs in the design of Network Communications", e.g.:


where it isn't given a name but it describes (at the end of the paper) the problem of two groups of gangsters trying to agree to put a plan into action. They present essentially the same proof above.

There has been a fair amount of discussion of the "The Byzantine Generals Problem" paper by Lamport, Shostak, and Pease:


but after looking again at that paper I think it focused primarily on a different set of problems (n/3 ...) and doesn't deal with the above "coordination problem" directly.

I hope we have some others on the list who are steeped in computer science history and may be able to shed some light on the development of the understanding of this problem and its naming.

--Jed http://www.webstart.com/jed/
cap-talk mailing list

Posted by: Jed at January 27, 2006 08:46 AM

[Another] discussion led to what I considered an interesting discussion of the "coordination problem" AKA the "Two Generals Problem":


where I was able to add some clarity to the historical thread as noted on the above Wikipedia page (not self serving in this case ;-).

--Jed http://www.webstart.com/jed/

Posted by: Jed (Two Generals' Problem) at August 30, 2006 05:32 AM

Hello from Madrid, Spain. I am researching about the Nature of Internet. And translating some problems into Spanish. I have been surfing trying to find the original version of Gray, no sucess. Would you be so kind as to tell me if avalaible?

Posted by: Nuño Valenzuela at March 8, 2009 03:58 PM

I know it can be found in:

James Gray "Notes on Data Base Operating Systems"
In Lecture Notes In Computer Science; Vol. 60 "Operating Systems, An Advanced Course" pp 393 - 481
1978 Springer-Verlag London, UK

Apparently, Gray called it the "Generals Paradox".

But I don't have a copy of it on my bookshelfs.

In any case I read about the "Two Generals" problem in:
"Distributed Systems - Second Edition"
Edited by Sape Mullender
Addison-Wesley 1993
Section 2.2 Good Models/A Coordination Problem.

Only later I learned the "Coordination problem" originated with Jim Gray.

Posted by: Twan at March 9, 2009 03:01 PM

Please, find first translation into Spanish of the Generals Paradox here

Also avalaible The problems of Ganster´s and the Go problem.

If you want to send me or write drafs of ideas at this wiki, I will translated them into Spanish.

I am inquiring about using or not a reductio ad absurdum argument for re-thinking these kind problems. Any fuzzy or discrete ideas?

This "old archeology of deep thought stuff" is fascinating.

Thank you very much from Spain!

Posted by: Nuño Valenzuela at March 10, 2009 01:03 AM

Hi Nuño Valenzuela,

Great you were able to digg up the original text! Since university I don't have easy access to such archeological material.

Although Gray stated the "Generals Paradox" problem in 1978, the formal proof using the reductio ad absurdam is given by Yemini and Cohen in 1979 [1]

Halpern and Moses provided a proof in 1990 [2] using the notion of "common knowledge". If I'm not mistaken this proof is discussed in Fagin et al from 1995 [3] section 6.1. I don't remember the details of that proof, but it's main line of reasoning doesn't rely on such a reductio. May be a small lemma burrowed deeply in the proof does require the use of reductio, you've to find that out yourself ;-)

In any case, the proof in [3] is much much longer than the one given in [1].

FYI, Paul Syverson claims to have found a solution to the paradox. He needs a booklet [4] to explain it, invoking game theory and epistemic logic.

best regards,


[1] Y. Yemini and D. Cohen "Some Issues in distributed processes communication" in proc. of the 1st International Conf. on Distributed Computing Systems, pp 199-203.
[2] J.Y. Halpern and Y.Moses "Knowledge and common knowledge in a distributed environment", JACM 37(2), pp549-587.
[3] R. Fagin, J. Y. Halpern, Y. Moses and M. Y. Vardi "Reasoning About Knowledge" MIT-press 1995 (ISBN 0-262-06162-7).
[4] Paul Syverson, "Logic, Convention, and Common Knowledge - A Conventionalist Account of Logic", CLSI publications 2003 (ISBN 1-57586-392-8).

Posted by: Twan at March 10, 2009 06:54 AM
Post a comment

Remember personal info?

Hit preview to see your comment as it would be displayed.