The Myth of the Credit Spread Puzzle
Feldhütter, Peter; Schaefer, Stephen M
2018-03-24 00:00:00
Abstract
Are standard structural models able to explain credit spreads on corporate bonds? In contrast to much of the literature, we find that the Black-Cox model matches the level of investment-grade spreads well. Model spreads for speculative-grade debt are too low, and we find that bond illiquidity contributes to this underpricing. Our analysis makes use of a new approach for calibrating the model to historical default rates that leads to more precise estimates of investment-grade default probabilities.
Received October 25, 2016; editorial decision January 12, 2018 by Editor Andrew Karolyi. Authors have furnished an Internet Appendix, which is available on the Oxford University Press Web site next to the link to the final published paper online.
The structural approach to credit risk, pioneered by Merton (1974) and others, represents the leading theoretical framework for studying corporate default risk and pricing corporate debt. While the models are intuitive and simple, many studies find that, once calibrated to match historical default and recovery rates and the equity premium, they fail to explain the level of actual investment-grade credit spreads, a result referred to as the “credit spread puzzle.” Papers that find a credit spread puzzle typically use Moody’s historical default rates, measured over a period of around 30 years and starting from 1970, as an estimate of the expected default rate.1
Our starting point is to show that the appearance of a credit spread puzzle strongly depends on the period over which historical default rates are measured. For example, Chen, Collin-Dufresne, and Goldstein (2009) use default rates from 1970 to 2001 and find BBB-AAA model spreads of 57–79 basis points (bps) (depending on maturity), values that are substantially lower than historical spreads of 94–102 bps. If, instead, we use Moody’s default rates for 1920–2001, model spreads are 91–112 bps, a range that is in line with historical spreads.
Using simulations, we demonstrate two key points about historical default rates. The first is, over sample periods of around 30 years that are typically used in the literature, there is a large sampling error in the observed average rate. For example, if the true 10-year BBB cumulative default probability were 5.09$$\%$$,2 a 95$$\%$$ confidence band for the realized default rate measured over 31 years would be $$[1.15\%, 12.78\%]$$. Intuitively, the large sample error arises because defaults are correlated and 31 years of data only give rise to three nonoverlapping 10-year intervals. As a result of the large sampling error, when historical default rates are used as estimates of ex ante default probabilities, the difference between actual spreads and model spreads needs to be large—much larger, for example, than that found for the BBB-AAA spread mentioned above—to be interpreted as statistically significant evidence against the model. Second, and equally crucial, distributions of average historical investment-grade default rates are highly positively skewed. Most of the time we see few defaults, but, occasionally, we see many defaults, meaning that there is a high probability of observing a rate that is below the actual mean. Positive skewness is likely to lead to the conclusion that a structural model underpredicts investment-grade spreads even if the model is correct. The reason for the presence of skewness is that defaults are correlated across firms as a result of the common dependence of individual firm values on systematic (“market”) shocks. To see why correlation leads to skewness, we can think of a large number of firms with a default probability (over some period) of 5$$\%$$ and where their defaults are perfectly correlated. In this case we will observe a zero default rate 95$$\%$$ of the time (and a 100$$\%$$ default rate 5$$\%$$ of the time), and so the realized default rate will underestimate the default probability 95$$\%$$ of the time. 
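The perfectly correlated example can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name `prob_underestimate` is hypothetical, and it assumes that defaults within a period are perfectly correlated (so the realized rate in a period is either 0 or 1) and that periods are independent. Under those assumptions the number of periods with a default event is binomial, and the average realized rate is that count divided by the number of periods.

```python
from math import comb


def prob_underestimate(p, n_periods):
    """Probability that the average realized default rate is below p.

    Assumes perfectly correlated defaults within each period (realized rate
    is 0 or 1) and independence across periods. Hypothetical helper.
    """
    return sum(
        comb(n_periods, k) * p**k * (1 - p) ** (n_periods - k)
        for k in range(n_periods + 1)
        if k / n_periods < p  # average realized rate k/n falls short of p
    )


for n in (1, 3):
    print(n, round(prob_underestimate(0.05, n), 6))
```

With a 5$$\%$$ default probability, the average rate falls short of the true probability only when no period contains a default event, as long as the number of periods is below 20.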
If the average default rate is calculated over three independent periods, the realized default rate will still underestimate the default probability $$0.95^3=85.74\%$$ of the time. We propose a new approach to estimate default probabilities. Instead of using the historical default rate at a single maturity and rating as an estimate of the default probability for this same maturity and rating, we use a wide cross-section of default rates at different maturities and ratings. We use the Black and Cox (1976) model and what ties default probabilities for firms with different ratings together in the model is that we assume that they will, nonetheless, have the same default boundary. (The default boundary is the value of the firm, measured as a fraction of the face value of debt, below which the firm defaults.) This is reasonable since, if the firm were to default, there is no obvious reason the default boundary would depend on the rating the firm had held previously. We show in simulations that our approach results in much more precise and less skewed estimates of investment-grade default probabilities. For the estimated 10-year BBB default probability, for example, the standard deviation and skewness using the new approach are only 16$$\%$$ and 4$$\%$$, respectively, of those using the existing approach. The improved precision is partly the result of the fact that we combine information across 20 maturities and 7 ratings and default probability estimates from different rating/maturity pairs are imperfectly correlated. But, to a significant extent, it is the result of combining default information on investment-grade and high-yield defaults. Because defaults occur much more frequently in high-yield debt, these firms provide more information on the location of the default boundary. 
Since the boundary is common to investment-grade and high-yield debt, when we combine investment-grade and high-yield default data, we “import” the information on the location of the default boundary from high-yield to investment-grade debt. The reduction in skewness is also the result of including default rates that are significantly higher than those for BBB debt. While a low default rate for investment-grade debt produces a positive skew in the distribution of defaults, a default rate of 50$$\%$$ produces a symmetric distribution and, for even higher default rates, the skew is actually negative. We use our estimation approach and the Black-Cox model to investigate spreads over the period 1987–2012. Our data set consists of 256,698 corporate bond yield spreads to the swap rate of noncallable bonds issued by industrial firms and is more extensive than those previously used in the literature. Our implementation of the Black-Cox model is new to the literature in that it allows for cross-sectional and time-series variation in firm leverage and payout rate while matching historical default rates. Applying our proposed estimation approach, we estimate the default boundary such that average model-implied default probabilities match average historical default rates from 1920 to 2012. In calibrating the default boundary we use a constant Sharpe ratio and match the equity premium, but, once we have implied out the single firm-wide default boundary parameter, we compute firm- and time-specific spreads using standard “risk-neutral” pricing formulae. We first explore the difference between average spreads in the Black-Cox model and actual spreads. The average model spread across all investment-grade bonds with a maturity between 3 and 20 years is 111 bps, whereas the average actual spread is 92 bps. 
A confidence band for the model spread that takes into account uncertainty in default probabilities is [88 bps; 128 bps]; thus there is no statistically significant difference between actual and model investment-grade spreads. For speculative-grade bonds, the average model spread is 382 bps, whereas the actual spread is 544 bps, and here the difference is statistically highly significant. We also sort bonds by the actual spread and find that actual and model-implied spreads are similar, except for bonds with a spread of more than 1,000 bps. For example, for bonds with an actual spread between 100 and 150 bps the average actual spread is 136 bps, whereas the average model-implied spread is 121 bps. Importantly, the results are similar if we calibrate the model using default rates from 1970 to 2012 rather than from 1920 to 2012. This resolves the problem described above, namely that results in the earlier literature depend significantly on the historical period chosen to benchmark the model. To study the time series, we calculate average spreads on a monthly basis and find that for investment-grade bonds there is a high correlation of 93$$\%$$ between average actual spreads and model spreads. Note that the model-implied spreads are “out-of-sample” predictions in the sense that actual spreads are not used in the calibration. Furthermore, for a given firm only changes in leverage and the payout rate (calculated using accounting data and equity values) lead to changes in the firm’s credit spread. For speculative-grade bonds the correlation is only 40$$\%$$, showing that the model has a much harder time matching spreads for low-quality firms. Although average investment-grade spreads are captured well on a monthly basis, the model does less well at the individual bond level.
Regressing individual investment-grade spreads on those implied by the Black-Cox model gives an $$R^2$$ of only 44$$\%$$, so at the individual bond level less than half the variation in investment-grade spreads is explained by the model. For speculative-grade spreads the corresponding $$R^2$$ is only 13$$\%$$. We also investigate the potential contribution of bond illiquidity to credit spreads. We use bond age as the liquidity measure and double sort bonds on liquidity and credit quality. For investment-grade bonds we find no relation between bond liquidity and spreads, consistent with the ability of the model to match actual spreads and the finding in Dick-Nielsen, Feldhütter, and Lando (2012) that outside the 2007–2008 financial crisis illiquidity premiums in investment-grade bonds were negligible. For speculative-grade bonds we find a strong relation between bond liquidity and yield spreads, suggesting that bond liquidity may explain much of the underpricing of speculative-grade bonds. In this paper we use the Black and Cox (1976) model as a lens through which to study the credit spread puzzle. The results in Huang and Huang (2012) show that many structural models which appear very different in fact generate similar spreads once the models are calibrated to the same default probabilities, recovery rates, and the equity premium. The models tested in Huang and Huang (2012) include features such as stochastic interest rates, endogenous default, stationary leverage ratios, strategic default, time-varying asset risk premiums, and jumps in the firm value process, yet all generate a similar level of credit spread. To the extent that different structural models produce similar investment-grade default probabilities under our estimation approach, our finding that the Black-Cox model matches average investment-grade spreads is likely to hold for a wide range of structural models. An extensive literature tests structural models. 
Leland (2006), Cremers, Driessen, and Maenhout (2008), Chen, Collin-Dufresne, and Goldstein (2009), Chen (2010), Huang and Huang (2012), Chen, Cui, He, and Milbradt (2017), Bai (2016), Bhamra, Kuehn, and Strebulaev (2010), and Zhang, Zhou, and Zhu (2009) use the historical default rate at a given rating and maturity to estimate the default probability at that maturity and rating. We show that this test is statistically weak. Eom, Helwege, and Huang (2004), Ericsson, Reneby, and Wang (2015), and Bao (2009) allow for heterogeneity in firms and variation in leverage ratios, but do not calibrate to historical default rates. Bhamra, Kuehn, and Strebulaev (2010) observe that default rates are noisy estimators of default probabilities, but do not propose a solution to this problem as we do.
1. A Motivating Example
There is a tradition in the credit risk literature of using Moody’s average realized default rate for a given rating and maturity as a proxy for the corresponding ex ante default probability. This section provides an example showing that the apparent existence or nonexistence of a credit spread puzzle depends on the particular period over which the historical default rate is measured. Later in the paper we describe an alternative approach for extracting default probability estimates from historical default rates that not only provides much greater precision but is also less sensitive to the sample period chosen.
To understand how Moody’s calculates default frequencies, consider the 10-year BBB cumulative default frequency of 5.09$$\%$$ used in Chen, Collin-Dufresne, and Goldstein (2009).3 This number is published in Moody’s (2002) and is based on default data for the period 1970–2001. For the year 1970, Moody’s identifies a cohort of BBB-rated firms and then records how many of these default over the next 10 years. The 10-year BBB default frequency for 1970 is the number of defaulted firms divided by the number in the 1970 cohort.
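The cohort calculation can be sketched in a few lines. This is a minimal illustration that takes the description in the text at face value (each cohort's frequency is defaults divided by cohort size, and the reported rate is an equally weighted average across cohorts). The function names and the cohort counts are hypothetical and are not Moody's data.

```python
def cohort_default_frequency(n_defaulted, n_in_cohort):
    """Share of a cohort's firms that default within the horizon."""
    return n_defaulted / n_in_cohort


def average_default_rate(cohorts):
    """Equally weighted average of cohort default frequencies.

    cohorts: list of (n_defaulted_within_horizon, n_in_cohort) pairs.
    """
    rates = [cohort_default_frequency(d, n) for d, n in cohorts]
    return sum(rates) / len(rates)


# Made-up cohort counts, for illustration only.
hypothetical_cohorts = [(20, 400), (10, 500), (45, 450)]
print(average_default_rate(hypothetical_cohorts))
```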
The average default rate of 5.09$$\%$$ is calculated as the average of the twenty-two 10-year default rates for the cohorts formed at yearly intervals over the period 1970–1991. A large part of the literature has focused on the BBB-AAA spread at 4- and 10-year maturities. In our main empirical analysis (Section 3), we study a much wider range of ratings and maturities but for now, to keep our example simple, we also focus on the BBB-AAA spread. For a given sample period we use the BBB and AAA average default rates for the 4- and 10-year horizons reported by Moody’s. Following the literature (e.g., Chen et al. 2009; Huang and Huang 2012; and others) we first benchmark a model to match these default rates, one at a time. Using the benchmarked parameters we then compute risk-neutral default probabilities and, from these, credit spreads. Following Eom, Helwege, and Huang (2004), Bao (2009), Huang and Huang (2012), and others, we assume that if default occurs, investors receive (at maturity) a fraction of the originally promised face value, but now with certainty. The credit spread, $$s$$, is then calculated as
$$s=y-r=-\frac{1}{T}\log\left[1-(1-R)\pi^Q(T)\right], \tag{1}$$
where $$R$$ is the recovery rate, $$T$$ is the bond maturity, and $$\pi^Q(T)$$ is the risk-neutral default probability. Throughout our analysis we employ the Black-Cox model (Black and Cox 1976). Appendix A provides the model details. We use our average parameter values for the period 1987–2012 estimated in Section 3 and Chen, Collin-Dufresne, and Goldstein’s (2009) estimates of the Sharpe ratio and recovery rate. We estimate the default boundary by matching an observed default frequency. The default boundary is the value of the firm, measured as a fraction of the face value of debt, below which the firm defaults.
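Equation (1) is simple to evaluate once a risk-neutral default probability is available. The sketch below is only an illustration: the function name `credit_spread` and the input values are hypothetical and are not taken from the calibration in this paper.

```python
import math


def credit_spread(risk_neutral_pd, recovery, maturity):
    """Equation (1): s = -(1/T) * log(1 - (1 - R) * pi_Q(T))."""
    return -math.log(1.0 - (1.0 - recovery) * risk_neutral_pd) / maturity


# Hypothetical inputs chosen only for illustration.
s = credit_spread(risk_neutral_pd=0.10, recovery=0.5, maturity=10.0)
print(f"{1e4 * s:.1f} bps")
```

For small default probabilities the spread is close to $$(1-R)\pi^Q(T)/T$$, the annualized risk-neutral expected loss.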
Following Chen, Collin-Dufresne, and Goldstein (2009) and others, we carry this out separately for each maturity and rating such that, conditional on the other parameters, the model default probability matches the reported Moody’s default frequency. For each maturity and rating we then use the benchmarked default boundary and calculate the credit spread using Equation (1). The solid bars in Figure 1 show estimates of the actual BBB-AAA corporate bond credit spread from a number of papers. For both the 4- and 10-year maturities, the estimated BBB-AAA spread is in the range of 98–109 bps with the notable exception of Huang and Huang’s (2012) estimate of the 10-year BBB-AAA spread of 131 bps. (Huang and Huang use both callable and noncallable bonds in their estimate of the spread and this may explain why it is higher.)
Figure 1. Actual and model-implied BBB-AAA corporate bond yield spreads when using existing approach in the literature. This figure shows actual and model-implied BBB-AAA spreads based on different estimates of actual and model-implied spreads. The actual BBB-AAA yield spreads are estimates from Duffee (1998) (Duf), Huang and Huang (2012) (HH), Chen, Collin-Dufresne, and Goldstein (2009) (CDG), and Cremers, Driessen, and Maenhout (2008) (CDM). The solid lines show spreads in the Black-Cox model based on Moody’s default rates from the period 1920–2002 and 1970–2001, respectively. The dashed lines show spreads in the Merton model based on Moody’s default rates from the period 1920–2002 and 1970–2001, respectively.
Using Moody’s average default rates from the period 1970–2001, the 4- and 10-year BBB-AAA spreads in the Black-Cox model are 52 and 72 bps, respectively. These model estimates are substantially below actual spreads, a finding that has been coined the “credit spread puzzle.” Figure 1 also shows the model-implied spreads using Moody’s average historical default rates from 1920 to 2002 (default rates from 1920 to 2001 are not available). Using default rates from this longer period, the model-implied spreads are substantially higher: the 4- and 10-year BBB-AAA spreads are 87 and 104 bps, respectively. Thus, when we use default rates from a longer time period the puzzle largely disappears. To emphasize that this conclusion is not specific to the Black-Cox model, Figure 1 also shows the four spreads computed under the Merton model (and using the parameters and method given in Chen et al. 2009). These spreads are very similar to, and just a little higher than, the Black-Cox spreads. What remains unchanged is the finding that the appearance of a credit spread puzzle depends on the sample period. In the example we compare corporate bond yields relative to AAA yields to be consistent with CDG and others. In our later analysis we use bond yields relative to swap rates. The average difference between swap rates and AAA yields is small: over our sample period 1987–2012, the average 5- and 10-year AAA-swap spreads are 4 and 6 bps, respectively.
We use swap rates in our later analysis because the term structure of swap rates is readily available on a daily basis. There are very few AAA-rated bonds in the later part of our sample period, so we would not be able to calculate a AAA yield at different maturities. In summary, realized average default rates vary substantially over time, and, as a result, when these are taken as ex ante default probabilities the historical period over which they are measured has a strong influence on whether or not there will appear to be a credit spread puzzle. In the next section we first explore the statistical uncertainty of historical default rates in more detail and then propose a different approach to estimating default probabilities that exploits the information contained in historical default rates more efficiently than has been the case in the literature so far.
2. Estimating Ex Ante Default Probabilities
The existing literature on the credit spread puzzle and, more broadly, the literature on credit risk typically uses the average ex post historical default rate for a single maturity and rating as an estimate of the ex ante default probability for this same maturity and rating.4 We find that the statistical uncertainty associated with these estimates is large and propose a new approach that uses historical default rates for all maturities and ratings simultaneously to extract the ex ante default probability for any given maturity and rating. Simulations show that our approach greatly reduces statistical uncertainty.
2.1 Existing approach: Extracting the ex ante default probability from a single ex post default frequency
An ex post realized default frequency may be an unreliable estimate of the ex ante default probability for two significant reasons.
The first is that the low level of default frequency, particularly for investment-grade firms, leads to a sample size problem with default histories as short as those typically used in the literature when testing standard models. The second is that, even though the problem of sample size is potentially mitigated by the presence of a large number of firms in the cross-section, defaults are correlated across firms and so the benefit of a large cross-section in improving precision is greatly reduced. How severe are these statistical issues? We address this question in a simulation study and base our simulation parameters on the average 10-year BBB default rate of 5.09$$\%$$ over 1970–2001 used in Chen, Collin-Dufresne, and Goldstein (2009). In an economy in which the ex ante 10-year default probability is 5.09$$\%$$ for all firms, we simulate the ex post realized 10-year default frequency over 31 years. We assume that in year 1 we have 445 identical firms, equal to the average number of firms in Moody’s BBB cohorts over the period 1970–2001. In the Black-Cox (and Merton) model firm $$i$$’s asset value under the natural measure follows a geometric Brownian motion (GBM):
$$\frac{dV_{it}}{V_{it}}=(\mu-\delta)\,dt+\sigma\,dW^P_{it}, \tag{2}$$
where $$\mu$$ is the drift of firm value before payout of the dividend yield $$\delta$$ and $$\sigma$$ is the volatility of firm value. As in Section 1, we use our average parameter values for the period 1987–2012 estimated in Section 3: $$\mu=10.05\%$$, $$\delta=4.72\%$$, and $$\sigma=24.6\%$$. We introduce systematic risk by assuming that
$$W^P_{it}=\sqrt{\rho}\,W_{st}+\sqrt{1-\rho}\,W_{it}, \tag{3}$$
where $$W_i$$ is a Wiener process specific to firm $$i$$, $$W_s$$ is a Wiener process common to all firms, and $$\rho$$ is the pairwise correlation between percentage changes in firm value. All the Wiener processes are independent.
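As a quick check that $$\rho$$ in Equation (3) is indeed the pairwise correlation, the moments of the increments for two distinct firms $$i\neq j$$ can be worked out using only the independence of the Wiener processes:

```latex
\begin{eqnarray}
\mathrm{Var}\left(dW^P_{it}\right) &=& \rho\,dt+(1-\rho)\,dt \;=\; dt, \\
\mathrm{Cov}\left(dW^P_{it},dW^P_{jt}\right) &=& \rho\,\mathrm{Var}\left(dW_{st}\right) \;=\; \rho\,dt, \\
\mathrm{Corr}\left(dW^P_{it},dW^P_{jt}\right) &=& \frac{\rho\,dt}{\sqrt{dt}\,\sqrt{dt}} \;=\; \rho .
\end{eqnarray}
```

Each $$W^P_i$$ is therefore itself a standard Wiener process, and because all firms share the same $$\sigma$$ in Equation (2), the correlation between percentage changes in firm value for any two firms is also $$\rho$$.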
The firm defaults the first time asset value hits a boundary equal to a fraction $$d$$ of the face value of debt $$F$$, that is, the first time $$V_{\tau}\leq dF$$. The realized 10-year default frequency in the year 1 cohort is found by simulating one systematic and 445 idiosyncratic processes in Equation (3). In year 2 we form a cohort of 445 new firms. The firms in year 2 have characteristics that are identical to those of the year 1 cohort at the time of formation. We calculate the realized 10-year default frequency of the year 2 cohort as we did for the year 1 cohort. Crucially, the common shock for years 1–9 for the year 2 cohort is the same as the common shock for years 2–10 for firms in the year 1 cohort. We repeat the same process for 22 years and calculate the overall average realized cumulative 10-year default frequency in the economy by taking an average of the default frequencies across the 22 cohorts. Finally, we repeat this entire simulation 25,000 times. To estimate the correlation parameter $$\rho$$, we calculate pairwise equity correlations for rated industrial firms in the period 1987–2012. Specifically, for each year we calculate the average pairwise correlation of daily equity returns for all industrial firms for which Standard & Poor’s provides a rating and then calculate the average of the 26 yearly estimates over 1987–2012. We estimate $$\rho$$ to be 20.02$$\%$$. To set the default boundary, we proceed as follows. First, without loss of generality, we assume that the initial asset value of each firm is equal to one.
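The simulation design in this section can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' code. Equation (A2) is not reproduced here, so the sketch substitutes the standard first-passage probability of a geometric Brownian motion and picks the barrier so that the 10-year default probability equals 5.09$$\%$$. The barrier is monitored on a monthly grid (which slightly understates continuous-time default frequencies), far fewer repetitions than 25,000 are used, and all function names are hypothetical.

```python
import math
import numpy as np

# Parameter values quoted in the text; V_0 is normalized to one.
MU, DELTA, SIGMA, RHO = 0.1005, 0.0472, 0.246, 0.2002
HORIZON, N_FIRMS, N_COHORTS = 10, 445, 22


def norm_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def first_passage_prob(barrier, horizon=HORIZON):
    """P(min_{t<=T} V_t <= barrier) for a GBM with V_0 = 1 (standard result)."""
    k = -math.log(barrier)                 # log-distance to the barrier, > 0
    m = MU - DELTA - 0.5 * SIGMA ** 2      # drift of log asset value
    s = SIGMA * math.sqrt(horizon)
    return (norm_cdf((-k - m * horizon) / s)
            + math.exp(-2.0 * m * k / SIGMA ** 2) * norm_cdf((-k + m * horizon) / s))


def calibrate_barrier(target, lo=1e-6, hi=1.0 - 1e-9, tol=1e-12):
    """Bisection; the default probability is increasing in the barrier."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if first_passage_prob(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def simulate_average_default_rate(barrier, n_sims=50, steps_per_year=12, seed=0):
    """Average realized 10-year default rate across 22 overlapping cohorts."""
    rng = np.random.default_rng(seed)
    dt = 1.0 / steps_per_year
    n_steps = HORIZON * steps_per_year
    total_steps = (N_COHORTS - 1 + HORIZON) * steps_per_year  # 31 years
    drift = (MU - DELTA - 0.5 * SIGMA ** 2) * dt
    log_barrier = math.log(barrier)
    rates = np.empty(n_sims)
    for n in range(n_sims):
        # One systematic shock path shared by every cohort.
        dws = rng.standard_normal(total_steps) * math.sqrt(dt)
        cohort_rates = np.empty(N_COHORTS)
        for c in range(N_COHORTS):
            start = c * steps_per_year
            dwi = rng.standard_normal((N_FIRMS, n_steps)) * math.sqrt(dt)
            dw = math.sqrt(RHO) * dws[start:start + n_steps] + math.sqrt(1.0 - RHO) * dwi
            log_v = np.cumsum(drift + SIGMA * dw, axis=1)  # log V_t, V_0 = 1
            cohort_rates[c] = np.mean(log_v.min(axis=1) <= log_barrier)
        rates[n] = cohort_rates.mean()
    return rates


if __name__ == "__main__":
    barrier = calibrate_barrier(0.0509)
    rates = simulate_average_default_rate(barrier)
    print(barrier, rates.mean(), np.percentile(rates, [2.5, 97.5]))
```

Because each cohort reuses the overlapping segment of the common shock path, the 22 cohort rates are far from independent, which is the source of the wide and skewed distribution discussed in the text.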
With the initial asset value normalized to one, the firm’s leverage is $$L\equiv\frac{F}{V}=F$$, and we set the default boundary $$dF$$ ($$=dL$$) such that the model-implied default probability given in Equation (A2) in the appendix matches the 10-year default rate of 5.09$$\%$$.5
Panel A of Figure 2 shows the distribution of the realized average 10-year default rate in the simulation study and the black vertical line shows the ex ante default probability of 5.09$$\%$$. The 95$$\%$$ confidence interval for the realized average default rate is wide at [1.15$$\%$$; 12.78$$\%$$]. We also see that the default frequency is significantly skewed to the right; that is, the modal value of around 3$$\%$$ is significantly below the mean of 5.09$$\%$$. This means that the default frequency most often observed (for example, the estimate from the rating agencies) is below the mean. Specifically, although the true 10-year default probability is 5.09$$\%$$, the probability that the observed average 10-year default rate over 31 years is half that level or less is 19.9$$\%$$. This skewness means that the number reported by Moody’s (5.09$$\%$$) is more likely to be below the true mean than above it and, in this case, if spreads reflect the true expected default probability, they will appear too high relative to the observed historical loss rate.
Figure 2. Distribution of estimated 10-year BBB default probability when using default rates measured over 31 years. The existing approach in the literature is to use an average historical default rate for a specific rating and maturity as an estimate for the default probability when testing spread predictions of structural models. One example is Chen, Collin-Dufresne, and Goldstein (2009), who use the 10-year BBB default rate of 5.09$$\%$$ realized over the period 1970-2001 a