In September 2011, the Economic Cycle Research Institute (ECRI) proclaimed that a new U.S. recession would begin sometime in the coming year. ECRI based its prediction on a host of its own internal long-leading indexes, together with its widely followed weekly leading index (WLI).
I do not wish to debate the merits of ECRI’s recession call here (I wrote on this topic last week), but since the ECRI WLI is so widely followed – presumably because it is free to the public – I want to focus on the proper use of the WLI and examine its accuracy in recession dating, in order to put this current recession call into context.
The WLI is unique in that it is published weekly, giving more of a real-time view than other indicators, such as the Conference Board leading economic indicator (LEI), the e-forecasting.com eLEI and the Organisation for Economic Cooperation and Development (OECD) LEI.
WLI data are available as far back as January 6, 1967, consisting of the WLI itself and a proprietary “growth” column that is used for estimation of recession risk and timing. It is a mystery how ECRI calculates the growth column, and attempts to replicate it with rate-of-change computations on the WLI have proved fruitless. For those curious enough, the closest correlation my firm has achieved in replicating it is 0.966 with a 37-week percentage rate-of-change of the WLI:
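For readers who want to try the comparison themselves, here is a minimal sketch of a 37-week percentage rate-of-change and its correlation with a growth column. The series below are made up for illustration (they are not ECRI data), and the helper names pct_rate_of_change and pearson are our own.

```python
from math import sqrt

def pct_rate_of_change(series, period=37):
    """Percentage change versus the value `period` observations earlier.
    The first `period` entries are None because no earlier value exists."""
    out = [None] * len(series)
    for i in range(period, len(series)):
        out[i] = 100.0 * (series[i] / series[i - period] - 1.0)
    return out

def pearson(xs, ys):
    """Pearson correlation over the positions where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / sqrt(var_x * var_y)

# Made-up weekly index levels, for illustration only (not ECRI data).
wli = [100.0 + 0.5 * i + (3.0 if i % 7 == 0 else 0.0) for i in range(80)]
roc_37 = pct_rate_of_change(wli, period=37)

# Hypothetical stand-in for the published growth column: a linear
# transform of the rate-of-change, so the correlation is 1 by construction.
growth = [None if r is None else 0.8 * r + 1.0 for r in roc_37]
correlation = pearson(roc_37, growth)
```

With the real WLI and growth columns in place of the toy lists, repeating the calculation for a range of periods would show which look-back correlates most closely.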
The proprietary nature of the WLI always leaves one with an uneasy feeling. A case in point is the period spanning 1990 to 1997, when the WLI suddenly became a lot more volatile, as can be seen in the shaded area of the above chart. At this scale it may be hard to see, but zooming in on that period and looking at the 37-week percentage change is more revealing:
The ultimate purpose of any LEI is to enable an investor to forecast business cycle turning points – more specifically, the onset and duration of recession – in a timely fashion. For the purposes of this exercise, we will use the National Bureau of Economic Research (NBER) recession-dating methodology, as they are the official arbiters of recession dating for the U.S.
To time recessions with an LEI, one needs to de-trend the time series (by deriving a rate-of-change oscillator from it) and then make recession calls when this oscillator pivots around a threshold. In effect, the WLI growth rate published by ECRI is a de-trended oscillator, as is the 37-week rate-of-change we used to replicate it as closely as possible in the previous chart. The natural inclination is to make recession calls when these oscillators pivot about zero (that is, when a rate of change oscillator dips below zero, that implies the LEI itself has entered a down-trend) but this is a common misconception. In fact, if one uses the WLI growth figures published by ECRI, pivots about zero are not actually optimal for recession forecasting and dating.
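The pivot idea can be sketched in a few lines. This is only an illustration under a simplifying assumption of ours: the model flags recession in every week the oscillator sits below the trigger. The oscillator values and the lower trigger of -2.5 are arbitrary, chosen to show how a brief dip below zero stops counting as a call once the trigger is lowered.

```python
def recession_signal(oscillator, trigger=0.0):
    """True for each week the oscillator sits below the trigger."""
    return [value is not None and value < trigger for value in oscillator]

def count_calls(signal):
    """Number of distinct recession calls, i.e. False -> True transitions."""
    calls = 0
    previous = False
    for flag in signal:
        if flag and not previous:
            calls += 1
        previous = flag
    return calls

# Arbitrary oscillator readings, for illustration only.
oscillator = [4.1, 1.2, -0.8, 0.5, -1.9, -3.4, -6.0, -2.1, 1.5, 3.0]

zero_pivot = recession_signal(oscillator, trigger=0.0)
lower_pivot = recession_signal(oscillator, trigger=-2.5)

calls_at_zero = count_calls(zero_pivot)    # the dip to -0.8 counts as a call
calls_at_lower = count_calls(lower_pivot)  # only the deep decline counts
```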
The chart below shows the WLI growth oscillator together with recession calls (“MODEL”) made on pivots about zero (“TRIGGER”) and how these match up to actual NBER recession dates:
This method does a fine job of warning about recession and capturing all the relevant recession periods – it correctly warned of and registered all seven NBER recessions. But it also generated 12 distinct false positives (wrong calls), rendering the methodology all but useless. Using a trigger of -2.638, however, brings about the true magic of the WLI:
There are still a worrying four false positives. How are you to trust a system if it has a history of making wrong calls? It is not as if you are making small trades; generally, acting on a recession call involves large-scale changes to asset allocation or to expansion and hiring strategy. So the stakes are high with recession calls.
Let us turn our attention to the accuracy of the WLI growth figure pivoted about -2.638. Why -2.638? This number comes from backtesting that mathematically optimizes the accuracy of the recession-calling model on three measures: AUC (area under the curve) accuracy, NBER capture rate, and false positives. There are others one might use, such as lead and lag accuracy, but all three of these metrics are crucial measurements of a recession forecasting/dating system, and they often pull against one another. For example, a high AUC does not guarantee a high NBER capture, and a high NBER capture does not guarantee a low false positive rate. So your optimization technique has to find the trigger that gives the best blend of these three metrics. For the above example, if we try to raise the NBER capture rate, the false positives increase. And if we try to change the threshold to a number that reduces the false positives to fewer than four, then the NBER capture rate drops dramatically.
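The trade-off can be seen by sweeping a few candidate triggers over a toy series. This is a rough sketch, not our optimization procedure: the data are invented, the function score_trigger is hypothetical, and for brevity it counts false-positive weeks rather than distinct false calls. It reports the three measures side by side and leaves the choice of blend to the reader.

```python
def score_trigger(oscillator, actual, trigger):
    """Return (share of weeks classified correctly,
               share of recession weeks captured,
               number of weeks flagged outside a recession)."""
    model = [value < trigger for value in oscillator]
    correct = sum(m == a for m, a in zip(model, actual)) / len(actual)
    in_recession = [m for m, a in zip(model, actual) if a]
    capture = sum(in_recession) / len(in_recession)
    false_weeks = sum(m and not a for m, a in zip(model, actual))
    return correct, capture, false_weeks

# Invented oscillator readings and recession flags, for illustration only.
oscillator = [3.0, 1.0, -1.0, 2.0, -2.0, -5.0, -7.0, -4.0, -1.5, 2.5]
actual = [False, False, False, False, False, True, True, True, True, False]

# Candidate triggers are arbitrary; a real search would scan a fine grid.
results = {t: score_trigger(oscillator, actual, t) for t in (0.0, -1.75, -3.0)}
for trigger, (correct, capture, false_weeks) in results.items():
    print(f"trigger {trigger:6.2f}: correct {correct:.2f}, "
          f"capture {capture:.2f}, false weeks {false_weeks}")
```

In this toy case, lowering the trigger removes the false weeks but also gives up part of the recession capture, which is the tension described above.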
Let’s review these metrics before moving on:
1. “AUC accuracy” is how many of the sample points (2,292 weeks for this particular exercise) the MODEL correctly categorized as being either a recession week or a non-recession week. The AUC for the WLI growth metric pivoted about the trigger of -2.638 is 0.904, meaning the model correctly categorized 90.4% of all weeks since 1967. Conversely, this means it had an error rate of 9.6% – some 220 weeks were incorrectly categorized by the model (either by calling a recession when there was none or saying there was no recession when there was indeed one). In statistical terms, this is the percentage correct, or classification accuracy, of the model; it is labelled “AUC” here, although that term more usually denotes the area under a ROC curve.
2. “NBER capture” refers to how many of the 360 NBER-dated recessionary weeks since 1967 the MODEL correctly categorized. The figure is 86.1%, meaning 13.9%, or some 50 weeks, were completely missed by the model. In statistical terms this is known as the sensitivity of the model.
3. “False positives” are how often the model flagged a recession when there was none. When determining false positives for the model we ignored cases in which the model flagged a recession up to 20 weeks before the actual recession occurred, since this is a good thing – the model is giving us several months’ advance warning of a recession that did indeed materialize. We also ignored the current recession reading to the right of the chart, as we have no way of knowing if we are indeed in a recession right now (we are pretty convinced we are not!).
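The three metrics can be expressed compactly in code. What follows is a sketch under assumptions of ours, not the exact procedure behind the charts: a “call” is taken to be a run of consecutive flagged weeks, and a call is excused if any NBER recession week falls between its first week and 20 weeks after its last week. It does not handle the special case of ignoring the most recent reading. The weekly flags are invented.

```python
def accuracy(model, actual):
    """Share of weeks where the model agrees with the NBER dating."""
    return sum(m == a for m, a in zip(model, actual)) / len(actual)

def capture_rate(model, actual):
    """Share of NBER recession weeks the model also flags (sensitivity)."""
    flagged = [m for m, a in zip(model, actual) if a]
    return sum(flagged) / len(flagged)

def runs_of_true(flags):
    """(start, end) index pairs for each run of consecutive True values."""
    runs, start = [], None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(flags) - 1))
    return runs

def false_positive_calls(model, actual, lead_weeks=20):
    """Count calls with no recession week from the call's start
    through `lead_weeks` after its end (our assumed grace window)."""
    count = 0
    for start, end in runs_of_true(model):
        if not any(actual[start:end + lead_weeks + 1]):
            count += 1
    return count

# Invented flags: one recession in weeks 5-8; the model calls it two weeks
# early, drops out one week early, then makes a stray call in weeks 15-16.
actual = [False] * 5 + [True] * 4 + [False] * 11
model = [False] * 3 + [True] * 5 + [False] * 7 + [True] * 2 + [False] * 3

toy_accuracy = accuracy(model, actual)
toy_capture = capture_rate(model, actual)
toy_false_calls = false_positive_calls(model, actual)

# Cross-check of the week counts quoted in the list above.
misclassified_weeks = round(2292 * (1 - 0.904))
missed_recession_weeks = round(360 * (1 - 0.861))
```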
Is there a better model than the WLI growth metric?
So is the WLI growth metric provided by ECRI the best oscillator to use? To find out, my firm ran an optimization procedure to determine which WLI percentage rate-of-change period yields a model that best blends the three performance metrics discussed previously. The result is below:
We found that a three-period simple moving average of the 52-week WLI percentage rate-of-change produced the best results. The chart shows that this new model still flags every NBER recession, but eliminates three false positives. AUC accuracy is 2.8% better (64 additional accurate weeks), although NBER capture drops 4.16% (15 recession weeks). The overriding benefit, however, is that this indicator would have yielded only one false positive in the last 40+ years. We cannot over-emphasize how important this improvement is – it leaves you with a system you are more likely to trust and act upon (although in our prior article we warn against the use of a single indicator for recession dating).
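The smoothed oscillator itself is simple to compute. Here is a minimal sketch, assuming “three-period” means three consecutive weekly readings; the index levels are made up and the helper names are our own.

```python
def pct_rate_of_change(series, period=52):
    """Percentage change versus the value `period` observations earlier."""
    out = [None] * len(series)
    for i in range(period, len(series)):
        out[i] = 100.0 * (series[i] / series[i - period] - 1.0)
    return out

def simple_moving_average(series, window=3):
    """Mean of the latest `window` values; None until a full window exists."""
    out = [None] * len(series)
    for i in range(window - 1, len(series)):
        chunk = series[i - window + 1:i + 1]
        if all(value is not None for value in chunk):
            out[i] = sum(chunk) / window
    return out

# Made-up weekly index levels, for illustration only (not ECRI data).
wli = [100.0 + i for i in range(60)]

roc_52 = pct_rate_of_change(wli, period=52)
smoothed = simple_moving_average(roc_52, window=3)

# The first usable smoothed reading needs 52 weeks of history for the
# rate-of-change plus two further weeks to fill the three-week average.
first_valid_week = next(i for i, v in enumerate(smoothed) if v is not None)
```

Recession calls would then be made on this smoothed series pivoting about its own trigger, in the same way as for the published growth figure.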
Eliminating the false positives is not without cost – the previous method, using the WLI growth metric, gives you more warning of the second, third, fourth and seventh recessions. You will want to use both methods when evaluating the risk of recession. If the first method, using the WLI growth computed by ECRI, is flagging recession, then you can treat it as a valid warning but one subject to false positives. You may elect to act on this warning or revert to the second method for further confirmation (sacrificing a few weeks in the process if a recession is indeed imminent).