Thursday, December 20, 2012

Observations on economic forecasting

I decided to build a macroeconomic forecasting model. This is a strictly statistical model--it doesn't impose a lot of economic theory on the data. For readers who know a bit of statistics, it's a Bayesian VAR model with a Normal/inverse Wishart prior, basically patterned after this one.* (Many thanks to Saeed Zaman for a lot of helpful guidance on this--he explained some of his own work in this area, including a working paper on which my model is based, and he pointed me to a bunch of good BVAR resources. This is very similar to the exercise I described here, which was also done by Zaman.)

I'm going to talk about the model a bit, then offer some observations about the problem of economic forecasting generally. Each section of this blog post is independent, so skip the ones that look boring to you (yes, I know that will probably mean you skip them all). I think the section of most general interest is the last one.

The model

For those unfamiliar with this kind of modeling, the setup is basically this: choose a handful of important economic variables (as dictated by forecasting needs and theory), collect a bunch of data on them (in my case, quarterly data from 1964 to the present), then use a statistical method to figure out the (linear) relationships that exist (a) between different variables and (b) within each variable over time. That basically describes VAR (vector autoregression) models.
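For readers who like to see the mechanics, here is a minimal sketch of a plain (non-Bayesian) VAR fit by ordinary least squares. It is not the model from this post: the function names `fit_var` and `forecast_var` are made up for illustration, and the two toy series are simulated rather than real data.

```python
import numpy as np

def fit_var(data, lags):
    """OLS fit of a VAR. Rows of `data` are periods, columns are variables."""
    T, n = data.shape
    # Regressors for period t: a constant, then lag 1, lag 2, ... of every variable.
    X = np.hstack([np.ones((T - lags, 1))] +
                  [data[lags - k:T - k] for k in range(1, lags + 1)])
    Y = data[lags:]
    coefs, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return coefs  # shape (1 + n * lags, n); column i holds equation i

def forecast_var(data, coefs, lags, steps):
    """Iterate the fitted equations forward, feeding forecasts back in as lags."""
    history = list(data[-lags:])
    out = []
    for _ in range(steps):
        x = np.concatenate([[1.0]] + [history[-k] for k in range(1, lags + 1)])
        nxt = x @ coefs
        out.append(nxt)
        history.append(nxt)
    return np.array(out)

# Toy example: two made-up series generated from a known VAR(1).
rng = np.random.default_rng(0)
true_A = np.array([[0.5, 0.1],
                   [0.0, 0.8]])
data = np.zeros((5000, 2))
for t in range(1, len(data)):
    data[t] = true_A @ data[t - 1] + rng.standard_normal(2)

coefs = fit_var(data, lags=1)                       # estimated relationships
forecasts = forecast_var(data, coefs, lags=1, steps=8)  # 8 periods ahead
```

Note how quickly the parameter count grows: each equation gets one coefficient per variable per lag, plus a constant, which is where the trouble described next comes from.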

It turns out that standard VAR models are dismally poor forecasters because they are too complicated (too many parameters); they can provide a really good fit of existing data, but they catch too much random movement in the variables and so predict poorly outside the data sample (they have an "overfitting" problem). Modelers have been able to improve them by using Bayesian methods. Specifically, Bayesian methods allow the modeler to, essentially, bias the model towards parsimony and simplicity (using the word "bias" loosely). Then the data have to be really informative to move the model away from that simplicity.

In my model, I start with the prior view that each variable is a "random walk" (with drift)--that is, each variable just moves randomly, maybe with some trend, and none of them influences the others. I also impose priors that set the bar progressively higher on how much past values of variables can influence present values. (For those who know, the prior mean is a random walk for each variable, and the prior variance for each coefficient shrinks as the lag increases--this is a shrinkage estimator, which is common in the BVAR literature.)
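To make the prior concrete, here is a rough sketch of a Minnesota-style setup: the prior mean puts a coefficient of 1 on each variable's own first lag and 0 everywhere else, and the prior standard deviation shrinks as the lag gets longer. The hyperparameter names (`tightness`, `decay`), their default values, and the scaling by `sigma` are assumptions for illustration; the actual prior in my model differs in its details.

```python
import numpy as np

def minnesota_style_prior(sigma, lags, tightness=0.2, decay=1.0):
    """Prior mean and standard deviation for the lag coefficients.

    Index order is [lag - 1, equation i, variable j].
    `sigma` holds a rough scale for each variable (e.g. residual std devs).
    """
    sigma = np.asarray(sigma, dtype=float)
    n = len(sigma)
    # Random-walk prior mean: own first lag gets 1, everything else gets 0.
    mean = np.zeros((lags, n, n))
    mean[0] = np.eye(n)
    # Prior std dev shrinks with the lag, so longer lags need more evidence.
    lag_numbers = np.arange(1, lags + 1).reshape(-1, 1, 1)
    # Rescale so the coefficient of variable j in equation i is unit-free.
    scale = sigma.reshape(1, -1, 1) / sigma.reshape(1, 1, -1)
    sd = tightness / lag_numbers ** decay * scale
    return mean, sd

# Made-up scales for three variables, two lags.
mean, sd = minnesota_style_prior(sigma=[1.0, 2.0, 0.5], lags=2)
```

With `decay=1.0`, the prior standard deviation on every second-lag coefficient is half that on the matching first-lag coefficient.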

To put it simply, my model considers movement and relationships among variables to be random unless convincingly shown otherwise. This kind of parsimony has been found to significantly improve forecasts.

Some model results

How good is my model? Imagine Woody Harrelson's voice asking, "compared to what?"

This may sound sad to some non-economists, but a common benchmark for evaluating economic forecast models is whether they can forecast better than a random walk--a model that literally specifies all variables as moving randomly. The standard way of doing this is to estimate your model on just part of your data, then use it to forecast the other parts of your data, and see how well the forecasts do by computing the average squared forecast error. Then compare that error to the error of a random walk with drift. I did this and looked at results for four variables: GDP, unemployment rate, inflation (core PCE), and the Fed's policy interest rate. I made some tables, but the one person still reading this will leave if I post them, so I'll just say this: my model beats a random walk model for GDP forecasts up to 6 quarters ahead, unemployment forecasts at least 8 quarters ahead, and inflation at least 8 quarters ahead. It fails miserably at forecasting the Fed's policy rate; this is pretty standard in models like this (for understandable reasons I won't go into).
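The comparison just described can be sketched roughly as follows. This is an illustration, not my actual evaluation code: the expanding-window design, the function names, and the made-up trending series are all assumptions. A ratio below 1 means the candidate model beats the random walk with drift at that horizon.

```python
import numpy as np

def rw_drift_forecast(history, horizon):
    """Random walk with drift: last value plus average past change per period."""
    drift = (history[-1] - history[0]) / (len(history) - 1)
    return history[-1] + drift * horizon

def relative_mse(series, model_forecast, horizon, first_origin):
    """Mean squared error of a model divided by that of the benchmark.

    At each forecast origin, both forecasters see only data up to the origin.
    """
    model_err, rw_err = [], []
    for origin in range(first_origin, len(series) - horizon):
        history = series[:origin + 1]
        actual = series[origin + horizon]
        model_err.append((model_forecast(history, horizon) - actual) ** 2)
        rw_err.append((rw_drift_forecast(history, horizon) - actual) ** 2)
    return np.mean(model_err) / np.mean(rw_err)

# Made-up trending series standing in for a logged variable.
rng = np.random.default_rng(0)
series = np.cumsum(0.02 + 0.01 * rng.standard_normal(200))

def no_change_forecast(history, horizon):
    """A deliberately naive candidate: tomorrow looks exactly like today."""
    return history[-1]

ratio = relative_mse(series, no_change_forecast, horizon=4, first_origin=80)
```

Here the naive candidate ignores the trend, so it loses to the benchmark and the ratio comes out above 1.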

By the way, the random walk with drift approach isn't as silly as it sounds. If I had been living in a cave for a few years and someone demanded that I forecast U.S. real GDP for next year, I'd probably say "2.5 to 3 percent higher than it was this year." (The variables are logged, so the drift term is the average growth rate).

I can also compare my model to a standard VAR--that is, a very similar model that does not use any kind of Bayesian approach. My model handily beats the standard VAR for all variables at all forecast horizons. Imposing some parsimony on such models has huge payoffs. Force the data to really tell you something.

I think it's ok to say that on average, across lots of time, the model is a decent forecaster.** But...

Could it predict our huge recession?


No models I know of really predicted the Great Recession with any reasonable amount of warning (and yes, a few people seem to have seen it coming, but a lot of them have since been shown to be stopped-clock predictors). I have some thoughts on this problem later in this post.

Here are a few charts of the model trying to predict the Great Recession. Here's what the model forecasts when estimated on data through the 3rd quarter of 2008:

So here's what's going on in the chart above. The solid black line plots the actual data for GDP. The black dashed line with + (plus) signs is what a random walk model (the benchmark!) predicted for the path of GDP. The rest of the lines show the path for different simulations of my model; I simulated the model 100,000 times, and the red lines surround 80 percent of the simulated paths.*** The blue lines surround 40 percent of the simulated paths. The black dashed line in the middle is the mean forecast. See how poorly it does? Let's see if the next quarter's forecast is any better.
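For the curious, bands like these can be computed from a pile of simulated paths with percentiles. The sketch below assumes the bands are pointwise (computed separately at each forecast horizon) and uses made-up random-walk paths in place of real model output; `forecast_bands` is a hypothetical helper name.

```python
import numpy as np

def forecast_bands(paths, coverage):
    """Pointwise band containing `coverage` of the simulated values per horizon.

    `paths` has shape (number of simulations, number of horizons).
    """
    tail = (1.0 - coverage) / 2.0 * 100.0
    lower = np.percentile(paths, tail, axis=0)
    upper = np.percentile(paths, 100.0 - tail, axis=0)
    return lower, upper

# Made-up simulated paths: 10,000 random walks, 8 quarters ahead.
rng = np.random.default_rng(0)
paths = np.cumsum(rng.standard_normal((10_000, 8)), axis=1)

lo80, hi80 = forecast_bands(paths, 0.80)  # the outer (red) band
lo40, hi40 = forecast_bands(paths, 0.40)  # the inner (blue) band
mean_path = paths.mean(axis=0)            # the mean forecast
```

The bands fan out with the horizon because uncertainty accumulates the further ahead you look.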

Amazing! Forecasting from the 4th quarter of 2008 goes really well. But...

The model overshoots (actually undershoots), failing to catch this upward turning point just as it failed to catch the downward turning point (I don't show the unemployment forecasts, which are slightly better). This bad forecasting doesn't surprise me too much; I don't think many, if any, models are great for forecasting turning points. But there are other excuses people could make, probably along the lines of things not included in the model (like ad hoc policies).

For those who didn't read the previous section, this doesn't mean the model is an epic failure. It actually forecasts ok on average. But just when good forecasts are most needed, it doesn't perform (though it still beats some other models, and likely beats a lot of more informal forecasting methods).

The problem of forecasting

I've written before that I don't think the purpose of the study of economics is forecasting. The value of economics as a field comes primarily from its ability to provide understanding of economic phenomena and do good quantitative policy analysis. The majority of economists spend little or no time trying to forecast things. We don't have a crystal ball or tea leaves. I did this project to satisfy a time series class and because I thought it'd be fun to think a little harder about forecasting, but this is not my main research agenda.

But it would be nice if we could predict recessions and other economic crises, right? Of course. But there are a lot of things that make that kind of predicting hard to do.

One difficulty is the most obvious: the economy is complex. It has a lot of moving parts. It is subject to all sorts of shocks and vulnerabilities (and they don't follow nice 2-parameter distributions, by the way). It's just really hard to predict the future of complex systems that involve a lot of randomness. Luckily there seems to be some inertia to the macroeconomy, allowing for ok prediction during "normal times," but when we really need good predictions they're not so good. It is somewhat easier to look at the past or think with models to come up with an idea of how things work or how policies might affect economic outcomes or counterfactuals (my BVAR model can do some really cool ones), and that's why we spend most of our time doing that instead of trying to tell fortunes.

Another difficulty is the Efficient Market Hypothesis (actually this isn't only a difficulty; it can help too). Some versions of this are really controversial, but a milder version just says that prices already reflect the information that is available to the public, so you can't predict their movements unless you know something the markets don't. This idea has huge implications for investing and financial punditry--entire careers exist that wouldn't if people understood the EMH. It also has big implications for forecasters, namely, that what you see in markets is sometimes all there is. There isn't much other information you can grab to say a lot about where markets will be tomorrow. The upside of this is that markets can do some of our forecasting for us, but the downside is that forecasters who think they know more than the markets are likely to be proven wrong.

Finally, I think the biggest reason we can't very well predict recessions in particular is that we don't experience them often enough. This is a good thing, of course; but it makes recessions difficult to study. Basically, we've had about a dozen recessions since we started collecting halfway decent data. We've had about half that many since we started collecting good data. We've had five recessions since the Fed started really doing its job in the early 1980s. And during the time of good data, how many recessions were as deep as this last one?

It's really hard to study something that rarely happens. It's really hard to find reliable predictors of something that rarely happens. It's a small sample of data. And it's extra difficult since by now most economists think that not all recessions are caused by the same thing. If there are several things that can cause recessions, and we've only had several recessions during the period of good data, it should not surprise anyone that our predictions are not very reliable. Some people say that economics is a young science, and that's part of it. But I think the bigger problem is that good data collection is a relatively new practice, and we need a much bigger sample. The good news is that the sample is constantly being expanded, both over time and in terms of richer "drill-down data" that allow us to study macroeconomic questions with micro data (see here).

The economy grows most of the time, so it's hard to make a model reliably predict economic contraction. The good news is that we're reasonably good at knowing how the economy responds to recessionary forces. But still, we're not great at quantitative prediction at any long horizon.

Throwing our hands up and saying that forecasting is a waste of time is not the solution. Forecasting is absolutely necessary for economic decisionmaking; we can do it systematically and transparently, or we can rely heavily on our cognitive biases or idiot pundits or people trying to sell us gold. Over time, my money is on the systematic approach. The important thing is understanding the limits of our knowledge and fostering realistic expectations in the consumers of forecasts.

So I think this is a mixed bag. In "normal times", a model like my BVAR is likely to do a better forecasting job than most or all other approaches. This is just a first pass at it, so its performance can be improved further with a bit of work (which I may or may not do). But I don't think it's going to predict the next recession, and it won't predict the one after that either.

*This model started out as a replication of this working paper; I even received some very helpful guidance from Saeed Zaman (an author on that working paper). This is a 17-variable BVAR (almost the same 16 variables as the working paper model, plus I add real residential investment); the variables are motivated by the Fed's forecasting needs and medium-scale New Keynesian models (which, after all, are typically solved by linearizing into a VAR format). I employ a number of order-selection criteria and settle on a 2-lag model. The prior is similar to the famous Minnesota prior with a few modifications.

**To be transparent, though, a better test of useful forecasting ability requires the use of real-time data--that is, data as they were when first released, rather than the more accurate data that have been revised. Real forecasters have to rely on real-time data; they don't see the revisions until after they have made their forecasts.

***The marginal posterior distribution of the coefficient vector is multivariate t (that's the beauty of the Normal/inverse Wishart prior--it is a natural conjugate). There are many ways to get the predictive density, such as Gibbs sampling or numerical integration (since all needed distributions have closed forms); I go with numerical simulation instead. The simulation exercise requires getting 100,000 draws of the coefficient vector from the MVT and 100,000 draws of the model's disturbance term (actually 100,000 for each forecast horizon), which I specify as multivariate normal with covariance matrix chosen empirically. Because the distribution of the disturbances is an input chosen by the modeler, a careful modeler can improve forecasts if they have an idea of what sort of shocks may be coming.
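Here is a stripped-down sketch of that kind of simulation, assuming a one-lag model to keep it short. It is not my actual code: the function names are hypothetical, and the posterior mean, scale matrix, degrees of freedom, and shock covariance below are made-up numbers rather than estimates. The multivariate t draws use the standard construction of a normal draw divided by the square root of a scaled chi-square draw.

```python
import numpy as np

def draw_multivariate_t(rng, mean, scale, df, size):
    """Multivariate t draws: normal draws rescaled by a chi-square mixing term."""
    z = rng.multivariate_normal(np.zeros(len(mean)), scale, size=size)
    w = rng.chisquare(df, size=size) / df
    return mean + z / np.sqrt(w)[:, None]

def simulate_paths(rng, last_obs, coef_mean, coef_scale, df,
                   shock_cov, horizons, n_sims):
    """Simulate forecast paths from a one-lag VAR with uncertain coefficients."""
    n = len(last_obs)
    # One coefficient draw per simulated path; row 0 holds the intercepts,
    # rows 1..n hold the lag coefficients (column i is equation i).
    coefs = draw_multivariate_t(rng, coef_mean, coef_scale, df, n_sims)
    coefs = coefs.reshape(n_sims, n + 1, n)
    # A separate disturbance draw for every path and every horizon.
    shocks = rng.multivariate_normal(np.zeros(n), shock_cov,
                                     size=(n_sims, horizons))
    paths = np.empty((n_sims, horizons, n))
    state = np.tile(last_obs, (n_sims, 1))
    for h in range(horizons):
        state = (coefs[:, 0, :]
                 + np.einsum("sj,sji->si", state, coefs[:, 1:, :])
                 + shocks[:, h, :])
        paths[:, h, :] = state
    return paths

# Made-up inputs: two variables, posterior centered on a driftless random walk.
rng = np.random.default_rng(0)
last_obs = np.array([1.0, 2.0])
coef_mean = np.array([0.0, 0.0,   # intercepts
                      1.0, 0.0,   # effect of lagged variable 1 on each equation
                      0.0, 1.0])  # effect of lagged variable 2 on each equation
paths = simulate_paths(rng, last_obs, coef_mean,
                       coef_scale=1e-6 * np.eye(6), df=10,
                       shock_cov=0.01 * np.eye(2),
                       horizons=4, n_sims=2000)
```

Swapping in a different `shock_cov`, or a nonzero mean for the shocks, is where a modeler's views about upcoming shocks would enter.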