Finance and Economics Discussion Series
Divisions of Research & Statistics and Monetary Affairs
Federal Reserve Board, Washington, D.C.
Has Output Become More Predictable?
Changes in Greenbook Forecast Accuracy
NOTE: Staff working papers in the Finance and Economics Discussion Series (FEDS)
are preliminary materials circulated to stimulate discussion and critical comment. The
analysis and conclusions set forth are those of the authors and do not indicate concurrence by other members of the research staff or the Board of Governors.
References in publications to the Finance and Economics Discussion Series (other than acknowledgement) should be cleared with the author(s) to protect the tentative character of these papers.
Has Output Become More Predictable?
Changes in Greenbook Forecast Accuracy

Peter Tulip*

Abstract

Several researchers have recently documented a large reduction in output volatility. In contrast, this paper examines whether output has become more predictable. Using forecasts from the Federal Reserve Greenbooks, I find the evidence is somewhat mixed. Output seems to have become more predictable at short horizons, but not necessarily at longer horizons. The reduction in unpredictability is much less than the reduction in volatility. Associated with this, recent forecasts had little predictive power.

JEL classification: E37
Keywords: Predictability, Variability, Forecast Errors, Greenbook

* Division of Research and Statistics, Federal Reserve Board, Washington, D.C. 20551.
email@example.com. I would like to acknowledge helpful comments from Dave Reifschneider, David Wilcox, Neil Ericsson, Charles Goodhart, Michael Kiley, seminar participants at the Federal Reserve Board, and, especially, John Roberts. The views presented in this paper are solely those of the author and do not necessarily represent those of the Federal Reserve Board or its staff.
I. Introduction

The volatility of the US economy has declined dramatically. The standard deviation of annualized changes in quarterly real seasonally adjusted GDP declined from 1.2 percentage points in the period 1947-1983 to 0.5 percentage points in 1984-2004.
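A calculation of this kind is straightforward to reproduce. The sketch below (Python; the function names and the synthetic test data are my own, not part of the paper) computes annualized quarterly growth rates from a GDP level series and compares their standard deviation across two subsamples split at a break date:

```python
import numpy as np

def annualized_growth(gdp_levels):
    """Quarterly log-difference of real GDP, annualized (x400), in percent."""
    gdp = np.asarray(gdp_levels, dtype=float)
    return 400.0 * np.diff(np.log(gdp))

def volatility_split(growth, dates, break_date):
    """Sample standard deviation of growth before and from `break_date` on."""
    dates = np.asarray(dates)
    pre = growth[dates < break_date]
    post = growth[dates >= break_date]
    return pre.std(ddof=1), post.std(ddof=1)
```

Applied to quarterly real GDP with a break at 1984q1, the second value returned by `volatility_split` would be roughly half the first, per the figures quoted above.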
This “Great Moderation” has been described as one of the most striking changes in the business cycle in recent decades (Ben Bernanke, 2004a; James Stock and Mark Watson, 2003). It is the subject of a large and growing literature, of which Margaret McConnell and Gabriel Perez-Quiros (2000), Chang-Jin Kim and Charles Nelson (1999), and Olivier Blanchard and John Simon (2001) are prominent examples.
However, what matters to most people is not volatility but uncertainty. Because resources can generally be transferred from known periods of high income to those of low income, predictable variations are not a serious concern. Presumably, it is unpredictable changes that cause large welfare losses. When people cannot accurately predict the future, they make decisions that, with hindsight, turn out to be mistakes.
Firms build factories when they shouldn’t. Central banks raise interest rates when they should have lowered them. Resources are wasted taking precautions against events that do not occur. And so on.
The clearest evidence of the importance of uncertainty relative to volatility is the lack of interest in seasonal economic variations. Seasonal variations are huge, accounting for about 85 percent of the variability of output (J. Joseph Beaulieu and Jeffrey A. Miron, 1992, table 1). But because they are predictable, almost no one pays attention to them (at a macroeconomic level). Even the studies of so-called “volatility” use seasonally adjusted data. They do not measure the total variation in the data, only the variation not accounted for by one specific influence. But there is no obvious reason for singling out seasonality. Just as predictable seasonal variations are appropriately removed from the data, so should other predictable influences be.
Accordingly, this paper measures unpredictability using the errors of the forecasts prepared by the staff of the Federal Reserve Board of Governors, as published in a document called the Greenbook. Differences between these forecasts and actual outcomes are the Greenbook errors.
The Greenbook errors provide a good measure of uncertainty for several reasons.
Previous researchers have found that the Greenbook forecasts are more accurate than other forecasts (Christina Romer and David Romer, 2000; Christopher Sims, 2002). So they can be taken as representing the state-of-the-art or the envelope of predictability.
Furthermore, the data on Greenbook forecasts is richer than for many private sector forecasts. The forecast horizon is longer and the data extend further back in time.1 Trends in the Greenbook errors are also interesting because of their relevance to monetary policy. As Chairman Alan Greenspan (2004, p. 8) has noted, “the success of monetary policy depends importantly on the quality of forecasting”. So, from a historical perspective, changes in the quality of the forecasts might help explain changes in policy performance, to the extent that policy was guided by the staff forecasts. From a normative perspective, the accuracy of forecasts and its stability help determine the extent to which monetary policy should be “forward-looking”. Lastly, if the forecast errors are stable over time then the monetary policy environment can be described as one of “risk” rather than “Knightian uncertainty”. That is, we can quantify what we do not know. In particular, the distribution of outcomes about previous forecasts would provide a reliable guide to the distribution of possible outcomes about the current forecast. This is relevant both to the FOMC’s assessment of risks, and (more so in other countries than in the US) the public presentation of policy.
Although the paper is indirectly motivated by these monetary policy issues, its primary focus is whether uncertainty has declined. I find that there has been a clear and large reduction in uncertainty at short horizons, but not necessarily at longer horizons. I also find that the reduction in uncertainty is much less than the reduction in volatility.
Closely associated with this, recent forecasts have had remarkably little predictive power.
1 For example, whereas the Greenbook forecasts for real GDP began in 1965, the Survey of Professional Forecasters began in 1968, DRI forecasts began in 1970, Blue Chip forecasts began in 1977, and The Wall Street Journal survey began in 1986.
Whereas the Fed predicted a large share of the fluctuations in output in the 1970s and 1980s, more recent fluctuations have been surprises.
II. Related literature The view that unpredictability is of greater interest than volatility is not new. As noted above, almost all of the studies of volatility remove predictable seasonal influences from the data. Many others remove the predictions of a vector autoregression. Several papers in this literature – for example, Stock and Watson (2003) – explicitly discuss unpredictability.
However, insofar as measures of uncertainty are presented, it is typically in the form of the errors of an econometric model. After-the-event regression residuals are easier to compile than real-time forecast errors, and they facilitate decomposition and analysis. But otherwise, they provide an unsatisfactory measure of the uncertainty that faced decision makers in real time. On the one hand, they understate real-time uncertainty because regressions are estimated after the event and so benefit from hindsight. For example, they “know” the sample mean (unless estimated recursively) and data revisions (unless real time data is used). Unavoidably, their specifications reflect information that was unavailable to forecasters. On the other hand, they tend to overstate uncertainty because they are simple. Even the most complicated econometric models incorporate much less information than the Greenbook forecast, which reflects the pooling of many variables, models, and statistical methods by a large team of economists.
Previous comparisons suggest that the second of these biases has usually been more important. The Greenbook and private sector forecasts have been much more accurate (over a limited range of measures) than autoregressions, and slightly more accurate than large econometric models, such as MPS.2 That is, autoregressions have tended to overstate uncertainty.
2 Examples of forecast comparisons include Romer and Romer (2000), Sims (2002), Campbell (2004), and unpublished studies conducted by the Federal Reserve staff.
Several recent papers have analyzed real-time forecast errors, including Scott Schuh (2001), Charles Goodhart (2004), and Sean Campbell (2004). Schuh and Goodhart find some results similar to mine, using different data sets, which I note below. However, neither of these papers is directly focused on changes in the errors over time.
Campbell’s work, circulated while this paper was in preparation, overlaps to a greater extent. We both find that short-horizon forecast errors have narrowed by less than the decline in output volatility. However, Campbell’s focus is on differences between private sector forecasts and autoregressions, rather than assessing whether uncertainty has changed. Also, his data comes from the Survey of Professional Forecasters (SPF), whereas I use the Federal Reserve Greenbooks. Accordingly, Campbell’s analysis is more relevant to private sector decision-making while mine is more relevant to monetary policy.
Furthermore, the horizon of the Greenbooks is longer than that of the SPF. Partly because of this, my conclusions are slightly different. Whereas Campbell (p. 2) finds that “macroeconomic uncertainty … (has) exhibited a substantial decline since 1984,” I find that the evidence of a reduction is mixed. At longer horizons, point estimates of uncertainty have not substantially declined.
III. Data

Before scheduled meetings of the Federal Reserve’s Federal Open Market Committee (FOMC), the staff of the Board of Governors prepares a detailed forecast. This is published in a document universally, though unofficially, called the Greenbook. The purpose of the Greenbook is to facilitate the deliberations of the FOMC. The forecasts reflect the views of the staff, not the Committee members, who may hold quite different views about the evolution of the economy.3 The Greenbook forecasts are available at the website of the Federal Reserve Bank of Philadelphia, except for those from the last five years, which are confidential. The first current-quarter forecast for real GNP was published in November 1965. The forecast horizon has been extended since then, reaching four quarters (including the current quarter) in 1968, eight quarters in 1979, and ten quarters in 1990. The horizon typically rolls forward to cover a new calendar year every twelve months (currently in September). Because of this, the data are discontinuous, particularly at longer horizons.

3 The Committee members report their own forecasts for GDP growth, unemployment, and inflation to Congress twice a year.
I use forecasts through November 1999, which have a horizon extending to 2001q4.
I use one Greenbook per quarter, although the actual frequency of publication is higher. I assume that the potential loss of information is outweighed by the convenience of measuring forecasts and outcomes at the same frequency. I choose the Greenbook closest to the middle of the quarter, for comparability with the spreadsheets maintained by the Philadelphia Federal Reserve. I focus on the forecast for real output, defined as GNP prior to 1991, then GDP. This series uses prices from fixed base years until 1996, then is chain-weighted.
To calculate forecast errors, I compare these predictions with real-time data.
Specifically, I use the GDP/GNP estimate as of the middle of the quarter two quarters after the relevant event, also available from the Federal Reserve Bank of Philadelphia’s real-time data set. Hence, “truth” for, say, the change in output in the four quarters to 2000q1 is the estimate as of mid-August 2000. Typically, these estimates represent the “first final” estimate (also called the “second revision”) of the BEA. These data reflect a more comprehensive analysis of source data than earlier estimates, while usually adhering to the same data definitions as at the time of the forecast.
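As a concrete illustration of this real-time convention, the following Python sketch looks up the “first final” estimate and computes a forecast error against it. The data layout (a dictionary keyed by event quarter and vintage quarter, with quarters indexed as integers) is hypothetical and chosen for simplicity; it is not the Philadelphia Fed’s actual file format:

```python
# Hypothetical layout: actuals[(quarter, vintage)] holds the output-growth
# estimate for `quarter` as published in data vintage `vintage`.
# Quarters are indexed as plain integers for simplicity.

def first_final(actuals, quarter):
    """Real-time 'truth': the estimate available two quarters after the event."""
    return actuals[(quarter, quarter + 2)]

def forecast_error(forecast, actuals, quarter):
    """Forecast error: real-time outcome minus the Greenbook prediction."""
    return first_final(actuals, quarter) - forecast
```

The point of the two-quarter lag is that later vintages (benchmark revisions, definitional changes) never enter the error, so forecasts are judged against data defined roughly as they were at the time of the forecast.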
The use of real-time data differs from the approach of Sims (2002, p7), Campbell (2004), and many others, who use latest available estimates. Using recent estimates is easier but involves treating changes in data definitions as forecast errors. There are several problems with this approach, of which two are important for my purposes. First, use of recent data would bias results toward showing that predictability has increased over time, because recent forecasts would use data definitions that were closer to the “truth” than earlier forecasts. Second, using later data definitions would make forecast errors correlated, lowering the information content of individual errors. Other reasons for
preferring real-time to current data are noted in Romer and Romer (2000), Robertson and Tallman (1998), and several references cited by Schuh (2001, n.14).4

Charts 1 and 2 show some illustrative data. The lines show real-time measures of changes in output. The dots show corresponding forecasts, dated by the time of the event, not the time of the forecast. That is, a dot that is close to the line represents an accurate forecast. (The filled-in dots represent a set of non-overlapping forecasts I use in the Appendix.) Chart 1 shows four-quarter changes, with forecasts for the current quarter and three following quarters. Chart 2 shows eight-quarter changes (the current quarter and seven following quarters). Elsewhere, these forecasts are sometimes called “three-quarter ahead” and “seven-quarter ahead” forecasts.
4 Other approaches would also be possible. For example, one could use even earlier estimates, such as the advance or preliminary NIPA. These are based on incomplete data augmented by BEA assumptions. Hence “forecast errors” measured on this basis reflect the extent to which the forecaster shares the BEA’s assumptions, rather than consistency with actual economic conditions. Another possibility would be to use only data defined exactly the same way as the forecast, excluding observations at the time of benchmark revisions. However, this would substantially reduce the number of long-horizon errors in my sample. In practice, one-off changes arising from redefinitions of GDP are small relative to overall forecast errors. For example, the root mean squared difference between the current measures of four-quarter changes in GNP and GDP between 1991q3 and 1993q4, the period affected by this change in definition, is 0.13 percent, tiny relative to the 4-quarter RMSE of 1.6 percent. Of course, were this discrepancy applied to all previous forecasts (as in the use of latest available data) its effect would cease to be trivial.
As the charts show, there were large swings in activity in the 1970s and 1980s.
Interestingly, the Fed staff anticipated a substantial share of these. But more recently, the staff missed the boom of the 1990s and subsequent downturn. Schuh (2001, Figure 1) shows a similar deterioration in the performance of the Survey of Professional Forecasters.
IV. Changes in Predictability

The main results in this paper are presented in charts 4, 5, and 6. Each chart shows two series. The dashed black lines show the variance of output growth. The solid grey lines show unpredictability, measured as the Mean Squared Error (MSE) of forecasts of output growth. Both series are measured using 5-year rolling windows, following the approach of Blanchard and Simon (2001). Note that the sample of forecast errors is incomplete; the MSEs are calculated using whatever observations are available within the window.
The variances and MSEs shown in the charts are algebraically related. Let y_t represent actual output growth in quarter t and f_t its forecast. The forecast error is then e_t = y_t - f_t. I use the same real-time measure of y_t in both the MSE and the variance.
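The rolling statistics just described can be sketched as follows. This is Python; the 20-quarter (5-year) window length follows the text, while the NaN convention for quarters without a forecast is my assumption about how “whatever observations are available within the window” might be implemented:

```python
import numpy as np

def rolling_stats(y, f, window=20):
    """Rolling variance of output growth y and MSE of errors e = y - f.

    `f` may contain NaN where no forecast exists; each window's MSE uses
    whatever error observations are available, as described in the text.
    """
    y = np.asarray(y, dtype=float)
    e = y - np.asarray(f, dtype=float)
    var_y, mse = [], []
    for t in range(window - 1, len(y)):
        yw = y[t - window + 1 : t + 1]          # actual growth in the window
        ew = e[t - window + 1 : t + 1]
        ew = ew[~np.isnan(ew)]                  # drop quarters with no forecast
        var_y.append(yw.var())                  # volatility measure
        mse.append(np.mean(ew ** 2) if ew.size else np.nan)  # unpredictability
    return np.array(var_y), np.array(mse)
```

Note that if the forecast were simply the rolling mean of y, the MSE would approximately equal the variance; the gap between the two lines in the charts therefore measures how much of the variation in output the forecasts captured.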