Saturday, December 22, 2007

The tyranny of misleading averages

A West Country MP, Gerry Neale, used to tell the story that he was once making a speech to Cornish farmers and said that "on average, I do not think you are doing too badly."

"Look here, mister" replied one of the farmers, "Stand me with my left foot in a block of ice and my right foot in a bucket of boiling water and tell me on average I am all right and I'll tell you I'm not!"

I was reminded of this during a recent seminar on improving the economy of West Cumbria when one of the officers of Copeland Council referred to the area as having a high wage and high skill economy.

I pointed out to him that we have one industry which employs a lot of people, many of whom are highly skilled and many of whom are fairly well paid, either because of those skills or because their work is hazardous or at unsocial hours, but that the statement was not true of the remainder of the local workforce.

It is not at all unusual for a group of people - the residents of a ward or constituency, the people who work in a broad field - to be divided into two or more sub-segments and for average statistics which describe the whole group to bear no relation to the circumstances of any given individual.

An example of an area where this can cause problems is with average statistics for measures of poverty. In both Cumbria and Hertfordshire I have seen policies to target disadvantaged areas based on average statistics for council wards. Unfortunately those averages may be very misleading where a council ward is large and diverse. For instance, both the areas I have had the privilege of being elected to represent, my current ward of Bransty in Copeland and my previous ward of Sandridge in St Albans, were disadvantaged by this analysis. Sandridge ward contained the relatively new Jersey Farm estate, many of whose residents commute into the City of London to work, and which substantially reduced the ward average figures for most measures of deprivation.

However, the ward also contains the village from which it gets its name, and in that village there is much more social and economic deprivation.

Bransty ward is similar in that the electoral division contains some very disparate areas, from Bransty Hill itself, through the Sunny Hill and Bay Vista areas, to two new estates at The Highlands in Whitehaven and in the village of Moresby Parks. Overall the degree of poverty and need in the ward is much greater than you would imagine from ward average statistics, and this sometimes has an impact on the distribution of resources.
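To illustrate how a ward average can hide a pocket of need, here is a minimal sketch in Python. Every figure in it is invented for illustration: the deprivation scores, the two area names and the qualifying threshold are all assumptions, not real data for Bransty, Sandridge or anywhere else.

```python
from statistics import mean

# Hypothetical deprivation scores (higher = more deprived) for households
# in two parts of one imaginary ward. All numbers are invented.
ward = {
    "new_estate": [5, 6, 4, 5, 6, 4, 5, 5],
    "old_village": [30, 28, 35, 27],
}

THRESHOLD = 20  # assumed cut-off: an average above this attracts extra help

# The single ward-wide average, as a funding formula might calculate it.
all_scores = [score for scores in ward.values() for score in scores]
ward_average = mean(all_scores)

# The same data, averaged separately for each part of the ward.
area_averages = {name: mean(scores) for name, scores in ward.items()}

ward_qualifies = ward_average > THRESHOLD
areas_qualifying = [name for name, avg in area_averages.items() if avg > THRESHOLD]

print(f"Ward average: {ward_average:.1f}, qualifies: {ward_qualifies}")
print(f"Area averages: {area_averages}")
print(f"Areas that would qualify on their own: {areas_qualifying}")
```

In this made-up ward the overall average falls below the threshold, so the ward as a whole misses out, even though the older village would comfortably qualify if it were assessed on its own.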

The lesson from this is that authorities planning their economic strategies should be aware that some average statistics may be very misleading. Apologies for a bit of basic statistical jargon, but this is still true whether the average you use is the arithmetic mean (add all the figures and divide by the number of people), the median (put the numbers in order from the lowest to the highest and take the number halfway down the list) or the mode (the most common result).
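For anyone who would like to see the three averages side by side, the short Python sketch below works them out for a workforce split into two groups. The wage figures are invented purely for illustration and are not taken from any real local data.

```python
from statistics import mean, median, mode

# Invented weekly wages (in pounds) for a workforce split into two groups.
low_paid = [200, 200, 210, 220, 230]
high_paid = [800, 820, 850, 900]
wages = low_paid + high_paid

avg_mean = mean(wages)      # add all the figures, divide by the number of people
avg_median = median(wages)  # the middle figure once the wages are sorted
avg_mode = mode(wages)      # the most common figure

# How far is the closest real wage from the arithmetic mean?
nearest_gap = min(abs(w - avg_mean) for w in wages)

print(f"Mean:   {avg_mean:.0f}")
print(f"Median: {avg_median}")
print(f"Mode:   {avg_mode}")
print(f"Nobody earns within {nearest_gap:.0f} pounds of the mean")
```

With these made-up numbers the mean lands in the empty gap between the two groups and describes nobody, while the median and the mode both sit inside the lower-paid group and tell you nothing about the higher-paid one. Whichever average you pick, a single number cannot describe both groups.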

And when distributing resources it is necessary to bear in mind that an area which on average is affluent may contain pockets of considerable poverty.

Graeme Archer said...

Hi Chris. Really good article.

I'm a statistician by training. One of the tests always set for undergraduate statisticians is to ask them: "If you were a government minister, would you use the median, or the arithmetic mean, to denote the country's average income? Would you use the same measure if you were the Opposition spokesperson?".

The reason is to get students to think about what can be hidden by "average". The government spokesmen will use the arithmetic mean - what most people think of as the average - because the relatively small number of very high wage-earners will shift the average to the right, i.e. make it higher than is relevant for most people.

The Opposition (if they are sensible) will quote the median figure, the figure beneath which 50% of the salaries in the country lie. In the UK, the median will always be lower than the mean.
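A rough sketch of the effect in Python, using ten invented incomes (in thousands of pounds) rather than any real UK figures:

```python
from statistics import mean, median

# Invented annual incomes in thousands of pounds, with one very high earner.
incomes = [12, 15, 18, 20, 22, 25, 28, 30, 35, 500]

mean_income = mean(incomes)
median_income = median(incomes)

# How many of the ten people earn less than the "average" (mean) income?
below_mean = sum(1 for x in incomes if x < mean_income)

print(f"Mean income:   {mean_income}")
print(f"Median income: {median_income}")
print(f"People below the mean: {below_mean} of {len(incomes)}")
```

In this toy example a single high earner drags the mean up to about three times the median, and nine of the ten people earn less than the mean.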

Similar messing with numbers lies behind the government's 'success' in lowering the proportion of people living in poverty. Here, they have done so by setting 'poverty' as a point on the income distribution to which a large proportion of people are to the left hand side (have lower incomes). They have then targeted people who are just under this threshold, and spent a lot of money to push them just *over* that threshold. The result is a "fall" in the proportion of people "in poverty" ... despite nothing being done about the very poorest people in our society: something which Iain Duncan Smith's report on Breakdown Britain made very clear.
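The mechanism can be sketched in a few lines of Python. The poverty line, the incomes and the size of the top-up below are all invented assumptions, chosen only to show how a headcount measure responds; they are not the real definitions or figures.

```python
POVERTY_LINE = 100  # assumed weekly income threshold (invented)
NEAR_MARGIN = 10    # assumed: only people within this much of the line get help

# Invented weekly incomes for ten people before any intervention.
before = [40, 45, 50, 95, 96, 97, 98, 150, 200, 300]


def headcount_rate(incomes, line=POVERTY_LINE):
    """Proportion of people whose income is below the poverty line."""
    return sum(1 for x in incomes if x < line) / len(incomes)


def lift_near_threshold(incomes, line=POVERTY_LINE, margin=NEAR_MARGIN):
    """Push only those just under the line to just over it."""
    return [line + 1 if line - margin <= x < line else x for x in incomes]


after = lift_near_threshold(before)

print(f"In poverty before: {headcount_rate(before):.0%}")
print(f"In poverty after:  {headcount_rate(after):.0%}")
print(f"Three poorest before: {sorted(before)[:3]}")
print(f"Three poorest after:  {sorted(after)[:3]}")
```

On these made-up numbers the headline proportion in poverty falls from 70% to 30%, while the three poorest people are exactly where they started.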

Chris Whiteside said...

Thanks Graeme, you are absolutely right, and the same problem applies to a large number of other statistics under this government and others, such as the proportion of children reaching particular levels in their Key Stage 1 or other SATs tests. It's too easy to improve the figures by lifting a few children at the category boundaries while not paying enough attention to other children - who may need the help more.

I believe there is no realistic option but to collect and publish performance stats for the public services because otherwise you are flying completely blind, but we also have to be incredibly careful to watch out for how all statistics can be manipulated.

That's one reason I am convinced we need an independent statistical watchdog which is not answerable to ministers.