Why everyone should read a book such as "How to lie with statistics" by Darrell Huff
One of the themes I wrote about quite frequently when I first started this blog was the desperate need in modern society for a better understanding of statistics.
It's not a new problem - as far back as the nineteenth century Disraeli is said to have referred to "Lies, damned lies, and statistics" - but about the only skill in relation to statistics which is common, particularly among salesmen and advertisers, politicians, pressure groups and journalists, is the ability to pick the statistic which best supports the story you want to tell.
Unfortunately that particular statistic will only rarely and incidentally be the one which is also the most helpful and accurate in understanding the whole truth.
I was prompted to revisit the need to improve statistical knowledge when an individual who may or may not have meant to be ironic posted a comment on this blog in response to a mention of misleading averages. He or she asked whether the concept that averages could be misleading was my idea.
No, it's not my idea. People who write books about how to understand and use statistics have been including chapters about issues such as how averages can be misleading for well over half a century. And anyone for whom the idea is a surprise or novelty really does need to read one of those books.
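For anyone who would like to see the point about averages in numbers, here is a minimal sketch in Python. The salary figures are invented purely for illustration and are not taken from any book or real firm. It compares the mean, which is what most people mean by "the average", with the median for a small group in which one value is far larger than the rest.

```python
from statistics import mean, median

# Hypothetical annual salaries at a small firm: nine staff and one owner.
# These figures are made up for illustration only.
salaries = [22_000] * 9 + [500_000]

average = mean(salaries)    # pulled upwards by the single large salary
typical = median(salaries)  # the middle value, which is what most staff earn

print(f"mean:   {average:,.0f}")
print(f"median: {typical:,.0f}")
```

Both numbers are legitimately "an average" of the same data, yet one is more than three times the other. Which of them gets quoted tends to depend on the story the person quoting it wants to tell.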
It is apparent that many people, including some who are very eminent in areas of knowledge other than mathematics, lack the most basic understanding of statistics. What is even more unfortunate is that some of those people wrongly imagine that they do understand statistics, sometimes to an extent amounting to Dunning-Kruger delusions of competence. They can do terrible damage as a result.
A classic example from earlier in this century, about which I blogged here at the time and revisited here, occurred when one of the most distinguished paediatricians in Britain - a man who really did know a vast amount about children's illness and mortality - was struck off when it became clear that he had given mistaken evidence as an expert witness in murder trials, evidence which resulted in at least two women who were almost certainly innocent being sent to jail.
That disciplinary action was quashed by a court, during a legal battle which eventually produced a compromise ruling; the decision to strike him off stayed quashed, but fortunately the Appeal Court overturned an unwise finding by a lower court that expert witnesses were immune from disciplinary action for giving inaccurate evidence. So expert witnesses who cause a miscarriage of justice by giving evidence in court which is dangerous nonsense can be held to account for it.
The basic problem was that neither the expert witness himself nor the people who should have challenged him realised that his vast expertise in one area - child health - did not translate into understanding of statistics. He gave evidence as an expert witness in the trials of a number of women whose children had died and who were accused of murdering them.
Unfortunately, because of his enormous knowledge of paediatrics, at least two juries accepted at face value the statements he made about the probability of a family losing two or more children to cot death. Those figures were completely wrong, and gross underestimates, because he apparently did not understand the concept of conditional probability. As a consequence of this misunderstanding of statistics, at least two women who were almost certainly innocent were wrongly convicted of murder.
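To show why ignoring conditional probability produces a gross underestimate, here is a rough sketch in Python. Both probabilities are hypothetical round numbers chosen only to make the arithmetic easy to follow; they are not the figures given in evidence at any trial. The sketch assumes that a second rare event in the same family is more likely once a first has happened, for instance because of shared genetic or environmental factors.

```python
from fractions import Fraction

# Hypothetical figures, chosen only to illustrate the arithmetic.
p_first = Fraction(1, 1000)             # chance of a first rare event in a family
p_second_given_first = Fraction(1, 50)  # assumed chance of a second, given a first

# Naive approach: square the single-event probability, which wrongly
# treats the two events as independent of each other.
naive = p_first * p_first

# Conditional approach: P(A and B) = P(A) * P(B given A).
correct = p_first * p_second_given_first

print(f"naive:   1 in {1 / naive}")
print(f"correct: 1 in {1 / correct}")
print(f"understated by a factor of {correct / naive}")
```

With these invented inputs the naive calculation gives one in a million where the conditional calculation gives one in fifty thousand, so the naive figure makes the double event look twenty times rarer than it is. The size of the error depends entirely on how far the two events are linked.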
The expert witness was the main culprit, but he was not the only one. The defence lawyers should have challenged his statistics. The jurors should have realised that his expertise in medicine did not guarantee expertise in maths. But above all, our society is too ready both to tolerate bad statistics unless we have good reason to want to disbelieve them, and to reject good statistical data which does not fit our preconceptions. This particular case, where innocent women were sent to jail because of bad statistics, is an extreme one but it is far from being the only case.
Anyone who serves as a judge, barrister or on the jury in a court, any citizen of a democracy who wants to be able to cast their vote having made an intelligent assessment of the statistics put out by competing candidates or campaigns, and anyone who doesn't want to be easily fooled by clever but misleading adverts, would be very well advised to make sure he or she has read at least one good book on how to understand statistics and avoid being fooled by bad ones.
One of the oldest, but still one of the best and easiest to understand, is "How to lie with statistics" by Darrell Huff. Apart from, perhaps, what sixty years of inflation has done to the relevance of some of the prices quoted, this excellent book has aged astonishingly well and almost everything in it is still very relevant indeed.
Despite being about numbers, it manages to be both extremely easy to read and very entertaining.
And although it is so accessible that a ten-year-old of average intelligence should be able to understand everything in this book, the points it makes are so universal in application that even someone with much greater mathematical knowledge - and I write this as a graduate with two degrees in a discipline which requires statistical understanding - can find it full of useful reminders and even the odd valuable idea they might not have thought of or heard of.
The book is about how numbers can be manipulated, by accident or design, to trick people into drawing false conclusions, and how to spot when you are being fed misleading numbers.
Anyone with a serious interest in the subject who wants an update on some of the more recent examples of how statistics are misused might start by reading "How to Lie with Statistics" and then follow up with the equally good "Damned Lies and Statistics" by Joel Best, which is more current and nearly as accessible. The two books complement each other very well. Best has written a sequel, "More damned lies and statistics."
If every voter read books like these, fewer bad politicians would be elected on the basis of dishonest campaign statistics. If every consumer read them, fewer bad products would be sold on the basis of dishonest advertising statistics, and if every journalist read them there might be less harm done by scare stories based on bad statistics.