I am often amused and sometimes alarmed by the way data and statistics are handled by firms, public officials and, particularly, the media. The erroneous use of data is at best a result of ignorance on the part of these people and at worst a deliberate misuse to manipulate public opinion.

One area that exemplifies this misuse is the subject of the road toll. Let me concoct an example to demonstrate what I mean.

In the mythical state of Dystopia the annual deaths from road accidents over the last ten years have been as follows:

Year     99/00  00/01  01/02  02/03  03/04  04/05  05/06  06/07  07/08  08/09
Deaths     321    346    307    339    341    329    348    328    336    345

Over this period the average road toll has been 334, with a standard deviation of 12.9. (If you are not familiar with statistics, the standard deviation measures how widely the values are spread around the average. But don’t worry about it. I will give you an example later of the impact of this measure.)
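For anyone who wants to check those two figures, here is a minimal sketch in Python using only the standard library. The variable names are purely illustrative. Note that the 12.9 quoted above corresponds to the sample standard deviation (dividing by n - 1); the population version (dividing by n) comes out a little smaller.

```python
import statistics

# Annual road deaths in Dystopia, 99/00 to 08/09 (the figures given above).
road_toll = [321, 346, 307, 339, 341, 329, 348, 328, 336, 345]

mean_toll = statistics.mean(road_toll)          # arithmetic average
sample_sd = statistics.stdev(road_toll)         # sample SD, divides by n - 1
population_sd = statistics.pstdev(road_toll)    # population SD, divides by n

print(f"mean          = {mean_toll}")
print(f"sample sd     = {sample_sd:.1f}")
print(f"population sd = {population_sd:.1f}")
```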

In 2003/04, in the face of a rising road toll for three years, the Minister for Police, in a fit of high moral dudgeon and ministerial paternalism, accused the motorists of Dystopia of being careless drivers, prone to speeding and drink driving. As a result he determined to mount a serious campaign to stop this aberrant behaviour and, hopefully, save the lives of a few citizens.

At the end of the next year he trumpeted the success of his intervention. Apparently, his acute understanding of the problem had to be recognised because there were 12 fewer deaths. He had obviously found the solution and more of the same would create even better outcomes. It then became very difficult to explain the increase in deaths the following year.

At the end of 2005/06 the Government was thrown out of office. The new police minister was more concerned with the rising number of “break and enters” in the cities of Dystopia and therefore took the focus off the road toll and put more resources into policing suburbia. Yet strangely the road toll fell dramatically. What was going on here?

Well, as you would expect intuitively, even though the average road toll is 334, chance (reflected in the spread of the data, as measured by the standard deviation) makes it very unlikely that any particular year’s outcome will be exactly 334. All we know is that, over an extended period, the outcomes will tend towards that average, with some years higher and some lower. If the Minister had taken no action then, all things being equal, he had a far better than even chance (in fact slightly more than a 70% chance) that the road toll would have fallen the following year. That is because the figure for 2003/04, at 341, was above the mean.
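The "slightly more than 70%" figure can be reproduced with a rough sketch like the one below. It rests on a strong simplifying assumption: that each year's toll is an independent draw from a single normal distribution with the mean and standard deviation of the ten years shown. The names are illustrative only.

```python
from statistics import NormalDist, mean, stdev

# Annual road deaths in Dystopia, 99/00 to 08/09.
road_toll = [321, 346, 307, 339, 341, 329, 348, 328, 336, 345]
toll_2003_04 = 341  # the year the Minister launched his campaign

# Assumption: yearly tolls are independent and normally distributed.
model = NormalDist(mu=mean(road_toll), sigma=stdev(road_toll))

# Chance that a fresh draw from the model lands below the 2003/04 figure,
# i.e. that the toll "falls" with no intervention at all.
p_lower_next_year = model.cdf(toll_2003_04)

print(f"chance of a lower toll next year: {p_lower_next_year:.1%}")
```

In other words, under these assumptions the Minister would have been able to claim a success roughly seven times out of ten without lifting a finger.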

The standard deviation tells us that there is a 68% likelihood that the road toll in any year will fall between 321 and 347 (that is, within one standard deviation either side of the mean). It is very difficult to determine the impact of the Minister’s intervention in the face of this probabilistic distribution. So the road toll goes up a few percent and the Minister berates us for being poor drivers; it goes down a few percent and he tells us how marvellously effective his interventions have been. More than likely the natural variability in the data has had more impact than anything he might have done.
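The 321 to 347 range is simply the mean plus or minus one standard deviation. A short sketch, again assuming a normal distribution and using illustrative names, shows where both the range and the 68% come from:

```python
from statistics import NormalDist, mean, stdev

# Annual road deaths in Dystopia, 99/00 to 08/09.
road_toll = [321, 346, 307, 339, 341, 329, 348, 328, 336, 345]

mu = mean(road_toll)
sigma = stdev(road_toll)  # sample standard deviation

# One standard deviation either side of the mean.
low, high = mu - sigma, mu + sigma

# Assumption: tolls follow a normal distribution with this mean and SD.
model = NormalDist(mu, sigma)
p_within = model.cdf(high) - model.cdf(low)

print(f"range: {low:.0f} to {high:.0f}")
print(f"probability of falling inside it: {p_within:.0%}")
```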

(Although for this simple example I have assumed a normal distribution of road deaths, this is not quite the case. Because of population growth the Minister should expect some increase and if, for example, the economy is improving and the ownership of cars per capita is increasing, that would contribute to an increase as well.)

When you get a result well below the mean, the next result is likely to be higher. When you get a result way above the mean, the next is probably going to be lower. (Statisticians call this regression to the mean.) I read somewhere about someone training pilots to land aircraft. He had come to the conclusion that the best response was to castigate those who landed badly, because when he did this their next attempts were invariably better. He had tried praising those who landed well, but very often their next attempt was worse. Therefore he had come to the conclusion that punishing bad performance was more effective than rewarding good performance. But, once again, his conclusion was overtaken by probabilities: the causation he implied was, at best, very dubious!
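A toy simulation makes the instructor's mistake easy to see. In the sketch below every landing score is pure luck and feedback has no effect at all; that is a deliberately extreme assumption of mine, not a claim about real pilots, and the names and thresholds are made up for illustration. Even so, landings after a "bad" one usually look better and landings after a "good" one usually look worse.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable

def landing_score():
    # Assumption: a landing's quality is pure chance (standard normal).
    # Praise and criticism play no part in this model.
    return random.gauss(0, 1)

# Each pair is one pilot's landing followed by the same pilot's next landing.
pairs = [(landing_score(), landing_score()) for _ in range(100_000)]

# "Bad" landings are more than one SD below average, "good" ones more than
# one SD above. After each, did the next attempt move the other way?
after_bad = [second > first for first, second in pairs if first < -1]
after_good = [second < first for first, second in pairs if first > 1]

improved_after_bad = sum(after_bad) / len(after_bad)
worse_after_good = sum(after_good) / len(after_good)

print(f"improved after a bad landing: {improved_after_bad:.0%}")
print(f"worse after a good landing:   {worse_after_good:.0%}")
```

An instructor who shouted after every bad landing and praised every good one would see exactly the pattern described above, even though nothing he said made any difference.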

Another trick often resorted to by the media and those with great vested interests in increasing fear amongst citizens (for example, drug companies) is to quote relative statistics without revealing the absolute base. You might read, for example, that the effect of smoking even one joint of marijuana is to increase your risk of succumbing to schizophrenia by 40%. Although not wanting to promote such behaviour, I should still tell you what they don’t tell you. The actual risk of being schizophrenic is something in the order of one percent. A 40% increase lifts that to about 1.4 percent, so the elevated risk is still very minor.
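The arithmetic behind relative versus absolute risk is worth spelling out. This sketch uses the two rough figures quoted above (a base risk of about one percent and a 40% relative increase); the variable names are illustrative.

```python
baseline_risk = 0.01        # roughly one percent, as quoted above
relative_increase = 0.40    # the headline "40% higher risk"

# A relative increase multiplies the baseline; it does not add 40 points.
elevated_risk = baseline_risk * (1 + relative_increase)
absolute_increase = elevated_risk - baseline_risk

print(f"baseline risk: {baseline_risk:.1%}")
print(f"elevated risk: {elevated_risk:.1%}")
print(f"extra cases per 1,000 people: {absolute_increase * 1000:.0f}")
```

The headline number and the absolute change describe the same thing, but "40% higher" sounds far more alarming than "about four extra cases in a thousand".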

As someone said, “Fear sells.” Therefore there are always players wanting to exaggerate risk (for commercial benefits).

Finally, one of the most common mistakes with statistics is to imply causation where there is only correlation. As per the above example, if marijuana usage seems to correlate with an increased risk of schizophrenia, our natural inclination is to assume a causal connection. And of course this is not always the case.

I can remember, as a student, a lecturer telling us that a strong correlation existed in the UK between the consumption of bananas and the birth rate. Did this imply a causal relation? Well no! It transpires that in the UK bananas are a luxury item. Consequently, when the economy is doing well people consume more bananas. As well, when people are more optimistic about economic outcomes they tend to have more children. Thus both outcomes (the consumption of bananas and the propensity to have more children) are related to the economic conditions of the time. Those expecting that their fertility might be increased by the consumption of bananas would be greatly disappointed!
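The banana story can be imitated with made-up data. In the sketch below a hidden "economy" variable drives both banana consumption and births, and neither affects the other; all the numbers are invented for illustration and are not real UK figures. The raw correlation between bananas and births is clearly positive, yet it all but vanishes once the economy is accounted for using the standard partial correlation formula.

```python
import math
import random

random.seed(2)  # fixed seed so the run is repeatable

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

# Hypothetical model: the economy influences both series independently.
n_periods = 5000
economy = [random.gauss(0, 1) for _ in range(n_periods)]
bananas = [e + random.gauss(0, 1) for e in economy]
births = [e + random.gauss(0, 1) for e in economy]

r_raw = pearson(bananas, births)
r_banana_econ = pearson(bananas, economy)
r_birth_econ = pearson(births, economy)

# Partial correlation of bananas and births, holding the economy fixed.
r_partial = (r_raw - r_banana_econ * r_birth_econ) / math.sqrt(
    (1 - r_banana_econ ** 2) * (1 - r_birth_econ ** 2)
)

print(f"raw correlation:                 {r_raw:.2f}")
print(f"after allowing for the economy:  {r_partial:.2f}")
```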

This is, then, just a cautionary tale to warn you that you must take great care both with your own use of statistics and with your interpretation of statistics used by others. Look to the source. The media will inevitably try to sensationalise things, while drug companies (and lawyers, insurance companies, security firms etc.) will want to make you afraid and thus inveigle you into using their products. Politicians are mainly ignorant of statistical processes and will be prone to use dubious figures to have you re-elect them! They are, however, pretty well across the probability of that outcome!

7 Replies to “Lies, Damned Lies and Statistics”

Agree with what you’ve said Ted – I can’t help but think of the climate change debate… your thoughts?

I once heard it said that during the Vietnam war more than 99.9% of soldiers died with their pants on. Clearly if you wanted to survive in Vietnam you should remove your duds 🙂

A few years back I read a book by Bjorn Lomborg called ‘The Skeptical Environmentalist’ which sought to debunk claims that environmentalists were making about the state of the world. Lomborg, as a statistician, produced pages and pages of data and graphs to present a very different picture of the condition of the planet from the doomsday predictions by certain environmentalists that have been picked up by the media without much questioning. Not being someone with a strong statistical bent myself, I was a bit skeptical about Lomborg and his use of statistics to prove his point, but it certainly drew sharply into focus the very issues that you’ve raised this week Ted.
And your points about standard deviation and correlation/causation probably explain why I struggle to know what to think about the whole climate change debate and who I should listen to about this issue.

I agree strongly with this entry and have always felt this way about statistics. At times, however, it can be difficult to turn a blind eye to some of the so-called ‘facts’ presented in modern media. I think that everybody has a certain degree of understanding that the majority of statistical claims made by politicians, media etc. are heavily biased and are more often than not used to deter or promote a particular outcome (don’t bugger up a good story with the facts!). This, however, does not seem to stop people from doing what they want to do, in my experience (limited though it may be).

Ultimately people seem to have a desired outcome of their own and will use presented statistics and figures to fuel their argument in accordance with this outcome. A classic example from the top of my head is smokers. A friend of mine in high school was smoking at the age of 14 and when I asked him what he thought it was doing to his body the simple reply was, “My grandad smoked for nearly 60 years before he died.”

The back of the packet had a picture of a gangrenous foot, indubitably detailing the very real and serious implications of a life of smoking. This individual however chose to consider a single statistic, his grandfather, over the countless cases of lung cancer, emphysema and goodness knows how many other conditions that are diagnosed and attributed to cigarette smoking on a daily basis.

Ultimately, I feel that people use statistics to prove that they are right, or to prove that somebody else is wrong. They’re very much a method of data presentation fabricated for the persuasive genre. For that reason I find it very difficult to take statistics seriously and try, wherever possible, to use other means to form an opinion.

Great entry Ted, I thoroughly enjoyed it.

Adrian

Thank you all for some great comments.

Suzie, climate change like other current issues (road rage, child molestation, shark attacks etc) has been sensationalised by the media where the most dire predictions by the least qualified seem to be given greatest exposure.

I am reasonably sure that we are experiencing climate change. I haven’t come to that conclusion from reading the media but have been fortunate enough because of various roles I’ve had to have been exposed to a good amount of genuine scientific papers and rigorous research supporting the phenomenon.

I am far less convinced that climate change is entirely due to humankind’s activities on the earth. (Read again my concerns about drawing causal relationships from correlation!) Whilst it is likely that mankind has contributed, I am open to the possibility that there might be other contributory (and perhaps even more significant) causes.

As I said in my main text, fear sells. Therefore we can’t always blame the media when the media is in fact pandering to our innate (albeit perverse) desires.

We seem to like dramatic and gruesome events. Shark attacks therefore make a very good subject. I read recently that more Australians are electrocuted by their Christmas tree lights than are killed by sharks! Similarly, a particularly violent car crash attracts excessive media attention, while those hundreds of people who die quietly of diabetes never get a mention. The famous also get undue attention. An accident to a pop star or film star will undoubtedly make the front page while thousands starving to death in a third world country won’t rate a story.

I could go on and on but I won’t. Thanks again for your comments.

Hi Ted,
Thanks for verbalising my own thoughts on this matter. I remain sceptical of the media-hyped notion that humankind is the primary cause of the recent change in climate, however measured. Whilst there is probably little doubt that there have been some relatively minor changes in climate in recent years, the emphasis must be on “recent”. We do not hear the media carrying stories of the malaria epidemic in southern England in the 10th century, or the fields of wheat grown by the Vikings in Greenland in earlier times, or too much mention of the times when the Sahara was a forest. We must remember that our climate statistics (there’s that word again!) are of very, very recent vintage.

Let us all, especially scientists and media, look a great deal deeper, and facilitate a full and proper examination of what changes humankind has really caused, and more relevant, what humankind can actually do about this perceived threat to our earthly existence.

I also endorse your comments about the media and its proclivity to promote sales by feeding our ghoulish and fearful appetites.

The media and our scientific community need to be especially aware of “noblesse oblige” when using their positions to carry unwarranted sentiment along the path of sensationalism and fear.

Keep up the good work Ted, all thought needs to be challenged!

Brian

Why aren’t property values decreasing in Sydney Harbour frontages?
