Harvard Business Review has a blog post out called “Economists Are Overconfident. So Are You.” Normally, I would just tweet an article like this, but the content is a bit heavy on statistics, so it deserves a plain-English explanation. Here’s the gist of the article:
“What Soyer and Hogarth did was get 257 economists to read about a regression analysis that related independent variable X to dependent variable Y, then answer questions about the probabilities of various outcomes (example: if X is 1, what’s the probability of Y being greater than 0.936?). When the results were presented in the way empirical results usually are presented in economics journals — as the average outcomes of the regression followed by a few error terms — the economists did a really bad job of answering the questions. They paid too much attention to the averages, and too little to the uncertainties inherent in them, thereby displaying too much confidence.”
Clear as mud? What they’re talking about is a statistical tool called regression analysis. A regression lets anyone analyze a set of data and use it to make a “reliable” prediction of the future. Regressions are used everywhere from climate science to economics. At its most basic level, a regression is nothing but a fancy correlation.
Let’s say, for example, that I want to predict the stock market’s movement for a given week. I pick 3 variables that I think will tell me where the stock market is going. For simplicity, let’s say they are:
- What the stock market return was the week before
- The change in government interest rates the week before
- The previous week’s unemployment benefit claims
I can run a regression and it will give me a number I can use to predict the stock market’s movement for the next week. But that isn’t the only number the regression gives. It will also tell me how much of the stock market’s movement can be explained by my 3 variables and how much is just statistical noise. For example, my variables might only explain 30% of the stock market’s movement. I might need more variables, or more data, to find a set that explains 50% or more. It will also give me a range of possible outcomes, and that range might be wide enough to make the whole regression useless.
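
For the curious, here is a rough sketch of what “running a regression” on three variables could look like in plain Python. Everything in it is an assumption made for illustration: the weekly data is randomly generated rather than real market data, the coefficients used to generate it are invented, and the helper names (solve_linear_system, fit_ols) are hypothetical. It fits an ordinary least-squares line and reports the share of the movement the variables explain (usually called R-squared).

```python
import random

def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so each row operation updates both sides at once.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_ols(rows, y):
    """Least-squares fit of y on the columns of rows, plus an intercept.

    Returns (coefficients, r_squared); coefficients[0] is the intercept.
    """
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    coefs = solve_linear_system(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coefs, d)) for d in design]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return coefs, 1.0 - ss_res / ss_tot

# ASSUMPTION: made-up weekly data, not real market figures.
random.seed(0)
rows, y = [], []
for _ in range(200):
    prev_return = random.gauss(0.0, 2.0)     # last week's market return, %
    rate_change = random.gauss(0.0, 0.25)    # last week's rate change, points
    claims_change = random.gauss(0.0, 3.0)   # last week's change in claims, %
    noise = random.gauss(0.0, 1.0)           # everything the variables miss
    rows.append((prev_return, rate_change, claims_change))
    y.append(0.2 + 0.3 * prev_return - 1.0 * rate_change
             - 0.1 * claims_change + noise)

coefs, r_squared = fit_ols(rows, y)
print("intercept and slopes:", [round(c, 3) for c in coefs])
print(f"share of movement explained: {r_squared:.0%}")
```

The first printed line is the part people tend to quote. The second is the part that says how much to trust it.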
So my regression output might tell me the stock market will go up next week by 2%, but that 80% of the time the actual move will be anywhere between -5% and 5%. That’s not very helpful at all. I can’t put money on the table for a bet like that. Regressions can be fun, because you can compare all kinds of data. You can regress stock market returns on how many bills Congress passed this week, or on how many times Obama said “economy”. Any data can be regressed; it’s just a matter of how reliable the results are.
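
To put rough numbers on a forecast and its range, here is a small sketch using Python’s built-in statistics module. It assumes the forecast errors follow a normal (bell curve) distribution and that the typical error is 4 percentage points; both are assumptions picked for illustration, not figures from the HBR article. It also ignores the extra uncertainty in the regression’s own estimates, so a real range would be somewhat wider. The function name forecast_summary is hypothetical.

```python
from statistics import NormalDist

def forecast_summary(point_forecast, residual_sd, level=0.80, threshold=0.0):
    """Return (low, high, prob_below) for a normally distributed forecast.

    low/high bound the central `level` share of outcomes; prob_below is
    the chance the outcome lands under `threshold`.
    """
    dist = NormalDist(mu=point_forecast, sigma=residual_sd)
    tail = (1.0 - level) / 2.0
    low = dist.inv_cdf(tail)
    high = dist.inv_cdf(1.0 - tail)
    prob_below = dist.cdf(threshold)
    return low, high, prob_below

# ASSUMPTION: a +2% forecast with a typical error of 4 percentage points.
low, high, prob_loss = forecast_summary(2.0, 4.0)
print(f"80% of the time: between {low:.1f}% and {high:.1f}%")  # -3.1% and 7.1%
print(f"chance of a down week: {prob_loss:.0%}")               # 31%
```

Under these assumptions, a “2% up” headline comes with roughly a 3-in-10 chance of a losing week, which is exactly the kind of detail that disappears when only the average is reported.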
This HBR article is saying that economists focus only on the 2% return and ignore everything else. And even when they do share the whole statistical outcome of their analysis with someone who may act on the information (be it a CEO or a head of state), that person is likely to ignore it too. This could help explain the arrogance of some economic pundits. But it’s not just economics; this applies to anyone who uses regressions.
Interestingly, when presented with charts showing the range of possibilities instead of just the regression output (just numbers), economists were much better at answering the questions. Visualizing the outcomes in charts and graphs has a great impact on our ability to understand data. But the nerds crunching the numbers probably think they don’t need it.
To help better understand the benefit of visualizing data, look at the chart below.
This is the forecast for inflation in England. The media, or the Prime Minister, or even the head of the Bank of England might tell the public inflation is predicted to be 2.5% in 2012. But the data behind that prediction actually says it could be anywhere from 1% to 5%. That’s a HUGE difference in inflation. But you’ll rarely see charts like this make it to the public.
So when we say economists got their prediction wrong, they don’t care. They know all they were doing was reporting the average, and they understood in the back of their heads what that number meant. What they do not understand is the damage this can cause.