The daftest statistics of 2007
Following on from yesterday's post about misleading averages, there was an excellent article in The Times this week by Andrew Dilnot and Michael Blastland about the most ridiculous statistical errors of 2007.
You can read it online at
One example quoted concerns the AIDS/HIV statistics published by the United Nations. Despite the continuing spread of the disease, they had to revise their estimates of the number of people infected downwards, because the previous figures had been too high.
The reason: the previous estimates of the number of HIV positive people had been based on samples at maternity clinics. But this is not a reliable way to make such an estimate. It eventually dawned on someone that in terms of exposure to AIDS, pregnant women are not representative of the overall population because, of course, they have all had unprotected sex. DOH!
Another example of a misleading statistic concerns prostate cancer survival rates in the USA and the UK. When Rudy Giuliani, aspiring US President, was diagnosed with prostate cancer, he said in August that his chance of surviving in the US was 82 per cent, but that in the UK it would have been only about half as good.
The proportion of men in Britain and America who actually die of prostate cancer appears to be quite similar, although there is a degree of uncertainty about this because many men who appear to have died of completely unrelated conditions are found to have also had slow-developing cases of prostate cancer. (It is sometimes alleged of prostate cancer that "most men die with it but few men die of it.")
However, in the USA, many more cases of prostate cancer are diagnosed than in the UK. With a similar proportion of deaths, that gives you a massively higher survival rate for those who are actually diagnosed.
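The arithmetic behind this effect is worth seeing explicitly. The sketch below uses invented round numbers (chosen purely for illustration, not real epidemiological data) to show how the same number of deaths can produce very different "survival rates" depending on how many cases are diagnosed:

```python
# Illustrative sketch with made-up round numbers, NOT real epidemiology.
# Assume the same underlying mortality per 100,000 men in both countries,
# but the US diagnoses many extra slow-developing cases that were never
# going to be fatal anyway.

def survival_rate(diagnosed, deaths):
    """Survival rate as usually quoted: survivors among those diagnosed."""
    return (diagnosed - deaths) / diagnosed

deaths_per_100k = 25   # assumed similar actual mortality in both countries
us_diagnosed = 140     # assumed: many extra diagnoses of slow-growing cases
uk_diagnosed = 50      # assumed: fewer diagnoses, mostly symptomatic cases

print(f"US survival: {survival_rate(us_diagnosed, deaths_per_100k):.0%}")  # 82%
print(f"UK survival: {survival_rate(uk_diagnosed, deaths_per_100k):.0%}")  # 50%
```

With identical deaths in both populations, the country that diagnoses more harmless cases automatically reports the higher survival rate; the statistic measures diagnostic practice as much as treatment quality.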
I don't go all the way with Dilnot and Blastland on this: the lower rate of diagnosis in the UK may be responsible for some unnecessary deaths. It is also possible, though, that the higher rate of diagnosis in the States results in men undergoing highly unpleasant treatments (including castration) which give no real benefit in terms of quality of life or life expectancy. And we cannot be certain that prostate cancer is not a factor in some deaths in the UK which are ascribed to other causes.
However, Dilnot and Blastland are undoubtedly right to criticise the idea that the differential between quoted survival rates in the two countries is of any use whatsoever as a measure of whether we could save lives if the NHS adopted USA-style treatments.
A third example in the Times article concerns the impact on railway safety of the privatisation of the railways. It is almost universally believed that rail safety deteriorated after privatisation. But the statistics simply do not bear this out.
Railway inspectorate data shows clearly that railway accidents have not just continued to fall after privatisation, but fell faster after privatisation than before. More than 100 people survived who might otherwise have been expected to die had British Rail's rate of progress continued.
The problem the railways have is that although they are much safer than roads in terms of deaths per passenger mile, when people die on the railways it is usually as part of a major accident which gets lots of publicity. People die on the roads every week, but it just doesn't get the attention.
In Cumbria the current death rate on the roads is about 50 a year, or roughly one a week. I think I am right in saying that more people die on the roads of Cumbria alone every year than the combined death toll for every major accident on the whole of the UK railway network all the way back to privatisation.
The fact that media coverage tends to give the impression that the risks on the railways are much greater than is actually the case, while road deaths get less attention, can be thoroughly pernicious as we saw with the reaction to the Hatfield crash.
Which do you think killed more people - the Hatfield crash, or the way the authorities and the media reacted to it?
I was a commuter into London at the time, and the ridiculous over-reaction of the railway authorities after the Hatfield crash made getting into London a nightmare. The Economist magazine also convincingly argued that this over-reaction killed more people in extra deaths on the roads than died in the crash itself.
The Economist obtained figures for the huge blip of extra road traffic during the months after Hatfield, which appears to have been caused by a combination of fears about the safety of rail travel and restrictions on rail travel after the crash. Then they multiplied the extra number of person miles on the roads by the differential between deaths per passenger miles on the railways and roads. The answer came out at six extra road deaths - slightly more than the number of fatalities in the actual Hatfield crash.
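The shape of that back-of-envelope calculation can be sketched as follows. The figures below are placeholders I have made up for illustration (The Economist's actual inputs are not quoted in this post); only the method — extra person-miles multiplied by the difference in fatality rates — comes from the article:

```python
# Sketch of The Economist's method with MADE-UP placeholder figures,
# chosen only so the answer lands near the six extra deaths quoted;
# the magazine's real inputs are not reproduced here.

extra_road_miles = 3.0e9        # assumed extra person-miles driven after Hatfield
road_deaths_per_mile = 2.2e-9   # assumed road fatality rate per person-mile
rail_deaths_per_mile = 0.2e-9   # assumed rail fatality rate per person-mile

# Extra deaths = displaced travel x (road risk - rail risk) per person-mile
extra_deaths = extra_road_miles * (road_deaths_per_mile - rail_deaths_per_mile)
print(f"Estimated extra road deaths: {extra_deaths:.1f}")  # 6.0
```

The point of the method is the differential: travel displaced from rail to road incurs the road risk but saves the rail risk, so only the gap between the two rates counts.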
If anyone reading this would like a recommendation for a really good book about how to use, and how not to use, statistics, I can make three.
The first is "How to Lie with Statistics" by Darrell Huff. First written in 1954, before I was born, this book is absolutely timeless and, given that it is a book about maths, incredibly easy to understand. (It is also a delight to read, which is even more unusual for a book about maths.)
The other two are "Damned Lies and Statistics" and "More Damned Lies and Statistics", both by Joel Best. These books are set at a slightly more challenging level than Huff's, but both are still more accessible, and easier to understand, than most maths books. They contain a wealth of recent examples of the problems people can have with misleading statistics.