The Economist has a great article on how psychologists are looking at the way computer scientists use Bayesian prediction engines for things like help wizards and spam filters. The psychologists asked an unusual question: maybe people use Bayesian logic too?
Of course! Er, well, maybe. Science needs to test the hypothesis, and that's what they set out to do:
Dr Griffiths and Dr Tenenbaum conducted their experiment by giving individual nuggets of information to each of the participants in their study (of which they had, in an ironically frequentist way of doing things, a total of 350), and asking them to draw a general conclusion. For example, many of the participants were told the amount of money that a film had supposedly earned since its release, and asked to estimate what its total “gross” would be, even though they were not told for how long it had been on release so far.
Besides the returns on films, the participants were asked about things as diverse as the number of lines in a poem (given how far into the poem a single line is), the time it takes to bake a cake (given how long it has already been in the oven), and the total length of the term that would be served by an American congressman (given how long he has already been in the House of Representatives). All of these things have well-established probability distributions, and all of them, together with three other items on the list—an individual's lifespan given his current age, the run-time of a film, and the amount of time spent on hold in a telephone queuing system—were predicted accurately by the participants from lone pieces of data.
There were only two exceptions, and both proved the general rule, though in different ways. Some 52% of people predicted that a marriage would last forever when told how long it had already lasted. As the authors report, “this accurately reflects the proportion of marriages that end in divorce”, so the participants had clearly got the right idea. But they had got the detail wrong. Even the best marriages do not last forever. Somebody dies. And “forever” is not a mathematically tractable quantity, so Dr Griffiths and Dr Tenenbaum abandoned their analysis of this set of data.
The other exception was a topic unlikely to be familiar to 21st-century Americans—the length of the reign of an Egyptian Pharaoh in the fourth millennium BC. People consistently overestimated this, but in an interesting way. The analysis showed that the prior they were applying was an Erlang distribution, which was the correct type. They just got the parameters wrong, presumably through ignorance of political and medical conditions in fourth-millennium BC Egypt. On congressmen's term-lengths, which also follow an Erlang distribution, they were spot on.
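To make the kind of prediction described above concrete, here is a minimal sketch in Python. It rests on assumptions that are not spelled out in the article: that the observed value t is equally likely to fall anywhere within the unknown total T (so the likelihood is 1/T), that the "best guess" is the posterior median, and that the Erlang prior has shape 2. The function names and the numbers (11 years into a reign, a prior scale of 12 years) are hypothetical, chosen only for illustration.

```python
import math


def erlang_pdf(x, k, scale):
    """Density of an Erlang distribution with integer shape k and the given scale."""
    if x <= 0:
        return 0.0
    return x ** (k - 1) * math.exp(-x / scale) / (scale ** k * math.factorial(k - 1))


def predict_total(t, prior_pdf, upper, steps=200_000):
    """Posterior median of the total T after observing the current value t.

    Assumes t was sampled uniformly from (0, T), so the likelihood is 1/T
    for T >= t and the posterior is proportional to prior_pdf(T) / T.
    """
    width = (upper - t) / steps
    # Midpoint rule on [t, upper]; the prior's tail beyond `upper` is ignored.
    grid = [t + (i + 0.5) * width for i in range(steps)]
    weights = [prior_pdf(T) / T for T in grid]
    half = sum(weights) / 2.0
    running = 0.0
    for T, w in zip(grid, weights):
        running += w
        if running >= half:
            return T
    return upper


# Hypothetical inputs: 11 years observed so far, prior scale of 12 years.
scale = 12.0
estimate = predict_total(11.0, lambda T: erlang_pdf(T, 2, scale), upper=400.0)
print(round(estimate, 2))
```

Under these assumptions the posterior is just a shifted exponential, so the numerical answer should agree with the closed form t + scale * ln 2. Getting the scale wrong shifts every guess by a constant, which is one way to read the "right type, wrong parameters" result for the Pharaohs.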
Which leaves me wondering what an Erlang distribution is... Wikipedia doesn't explain it in human terms, but its graph looks like that of a Poisson distribution:
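The resemblance is not a coincidence. As far as I can tell, an Erlang distribution with shape k describes how long you wait for the k-th event when events arrive as a Poisson process, so "the k-th event has happened by time t" is the same as "at least k events have been counted by time t". The sketch below checks that numerically; the function names and the sample values (k = 3, scale = 2, t = 4) are my own illustrative choices.

```python
import math


def erlang_pdf(t, k, scale):
    """Erlang density with integer shape k and the given scale (1 / event rate)."""
    return t ** (k - 1) * math.exp(-t / scale) / (scale ** k * math.factorial(k - 1))


def erlang_cdf(t, k, scale):
    """P(the k-th event has happened by time t), via the Poisson count."""
    mean = t / scale  # expected number of events by time t
    # Fewer than k events by time t means the k-th event is still to come.
    p_fewer_than_k = sum(
        math.exp(-mean) * mean ** n / math.factorial(n) for n in range(k)
    )
    return 1.0 - p_fewer_than_k


def integrate_pdf(t, k, scale, steps=100_000):
    """Midpoint-rule integral of the Erlang density from 0 to t."""
    width = t / steps
    return sum(erlang_pdf((i + 0.5) * width, k, scale) for i in range(steps)) * width


area = integrate_pdf(4.0, 3, 2.0)
from_poisson = erlang_cdf(4.0, 3, 2.0)
print(area, from_poisson)
```

The two printed numbers should match closely: the first comes from the area under the Erlang curve, the second from adding up Poisson probabilities.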
Curious footnote - look at who they credited as the source of their graph of distributions.
Posted by iang at January 7, 2006 10:19 AM