Robust Models of CEO Turnover: New Evidence on Relative Performance Evaluation

Abstract

We examine the robustness of empirical models and findings concerning CEO turnover. We show that the sensitivity of turnover to abnormal firm performance is an extremely robust result. In contrast, evidence indicating a relation between turnover and industry performance is both weak and fragile. We show that small changes in turnover modeling choices can affect inferences in a large way. Our evidence casts substantial doubt on the hypothesis that there is a large industry performance component to turnover decisions. We use our findings to offer some general prescriptions for checking robustness results in CEO turnover research.

Received June 6, 2017; editorial decision July 10, 2017 by Editor Uday Rajan.

Questions concerning how CEOs are removed from office have attracted much attention from financial economists. This is an important issue if CEO skills are substantive inputs into firm performance, as any inefficiencies in the process by which CEOs are removed from office could have first-order implications for a firm’s success. In addition, CEO action choices should be influenced by the manager’s anticipation of the mechanism governing their job security. The empirical literature on CEO turnover exhibits a wide variety of different modeling choices including those related to the definition of turnover, the timing windows over which turnover or performance are measured, the construction of performance variables, and the econometric model selected. This heterogeneity is not surprising given the coarse guidance provided by the underlying theories and wide variation in data availability across settings. The usual hope is that reasonable choices, robustness checking, and multiple studies across different samples will lead to robust inferences that reflect widespread economic behavior.
Although these modeling issues are common in many settings, they are particularly acute in turnover studies, as data are often hand-collected and key choices are fixed early in the research process. In this paper we will identify some of the key choices that arise in studies of CEO turnover and explore whether these choices have a substantive effect on important economic questions. With regard to the well-researched question of the relation between CEO turnover and abnormal firm performance, we detect an extremely robust negative relation. Although the relation is highly significant in all cases, certain modeling choices do affect the economic magnitudes of the estimates in a substantive way. With regard to the issue of the widely discussed relation between turnover and industry performance (e.g., Jenter and Kanaan 2015), our findings indicate that empirical support for the general presence of such a relation is, at best, fragile and nonrobust. In particular, using two large samples, we find that when we use standard annual timing conventions, there is little convincing evidence of any relation between CEO turnover and industry performance. This evidence and treatment of the data is consistent with the findings of Gibbons and Murphy (1990) and others, and it suggests full relative performance evaluation in CEO turnover decisions. When we use an alternative nonstandard monthly timing convention for modeling turnover along the lines of Jenter and Kanaan (2015) (JK hereafter), we uncover some limited evidence suggesting a small role for industry performance in CEO turnover, although this evidence appears both weak and fragile. Given these findings, we will argue that the notion that CEOs are widely blamed for bad luck or credited for good luck in the removal decision is simply not strongly supported by the data. 
Thus, in our view, much of the recent discussion on the efficiency implications of industry performance factors in CEO turnover is misguided and attempts to address, at best, fairly rare behavior. Our evidence suggests that a simple efficient learning perspective with full industry relative performance evaluation (RPE) is a reasonable description of the CEO turnover process, at least in most traditional turnover model settings. Our turnover findings, coupled with widespread prior evidence that CEO compensation depends on luck (e.g., Bertrand and Mullainathan 2001), suggest that turnover and compensation policies play quite distinct roles in incentivizing managers. Our analysis offers useful insights into modeling choices that play an important role in inferences regarding the CEO turnover process. One of these choices is the turnover event categorization procedure, with some of the more popular choices almost surely incorporating systematic biases. A second key choice is the mechanical construction of performance metrics, as seemingly innocuous assumptions regarding compounding, rebalancing, and monotonic transforms of the performance data can have a large effect on model inferences. Finally, and quite surprisingly, the inclusion of explanatory variables that are orthogonal to the variable of interest can have a significant effect on model inferences owing to the nonlinearity of the underlying models. For all three of these modeling issues, we will recommend a set of empirical checks that should assist researchers in assessing whether their inferences are reasonably robust. Since the use of monthly timing conventions does lead to some faint evidence for the presence of an industry effect in CEO turnover, we will look particularly closely at timing issues. In the case of firm performance, the picture is clear. Abnormal firm performance is on average poor and steadily declining in the two years preceding turnover event dates.
Thus, abnormal firm performance measured over any standard window as a predictor of turnover in the subsequent year or month always displays a strong negative relation that is easy to detect. In the case of industry performance, the picture is nuanced. Industry performance is essentially flat over the 2-year window before the exact turnover date, but it trends slightly positive in the -24-month to -13-month period and negative in the -12-month to -1-month window. Thus, when using the traditional timing convention of fiscal-year performance windows as a predictor of turnover in the next fiscal year, the performance window captures part of the industry up-down pattern, and there is no evident relation between industry performance and CEO turnover. If one follows the nonstandard approach of using annual performance to predict turnover in the next month, the downward industry trend is emphasized more, and, consequently, some models reveal a marginally significant negative relation. These nuanced findings on industry performance do not seem to support a simple performance attribution error scenario, as in this case we would expect industry performance to exhibit the same long-term decline before turnover that is evident with abnormal firm performance. The results also suggest little in terms of an industry performance incentive effect induced by the threat of turnover, as the relation between industry performance and turnover measured over long windows is essentially zero, independent of timing subtleties.

1. Prior Literature and Empirical Strategy

1.1 Theoretical considerations

Boards of directors are responsible for the decision to remove a CEO from office. Several nonperformance factors have been hypothesized to affect this decision. Holding these constant, most models assume that boards of directors update their assessment of a CEO’s suitability for remaining in office based on performance signals that arrive over time.
Gibbons and Murphy (1990) (GM hereafter) assume that boards efficiently incorporate this performance information into their assessment of a CEO’s ability and follow an optimal dismissal decision rule based on this assessment. We will refer to this behavior as the efficient learning perspective. Simple tests of the efficient learning perspective examine whether there is any sensitivity of turnover to abnormal firm performance, followed by introspection into whether observed magnitudes appear consistent with optimality. More subtle tests often rely on comparative statics investigations suggested by the theory. GM examine a particularly interesting prediction of the efficient learning perspective. Motivated by Holmström’s (1982) theory of RPE, they argue that optimal turnover decisions should depend only on performance measured relative to a benchmark that filters out components of firm performance unrelated to managerial effort or ability, most notably industry performance.1 We highlight two theoretical issues that have important empirical design implications. The first concerns the party instigating the turnover decision. If a CEO has unattractive outside opportunities, turnover outcomes should largely reflect board-instigated decisions. However, when a CEO has attractive external opportunities, for example, a prestigious new job or highly valued leisure opportunities in retirement, the CEO may instigate the job separation. Researchers often attempt to label these outcomes as “forced” and “voluntary” and model them separately. However, most authors recognize that these assignment procedures are necessarily imperfect. 
In particular, many “voluntary” turnover events may in fact be far from voluntary.2 Consequently, a useful robustness check is to consider whether results change substantially when using alternative definitions of forced turnover.3 A second issue concerns the technology describing the distribution of managerial ability and the mapping between ability and performance. A common approach is to assume additive normal distributions, as this yields simple expressions for the efficient updating process. Although this approach is theoretically pragmatic, difficulties appear when moving to the data. Most performance metrics in turnover studies are distinctly nonnormal. For example, annual stock returns are skewed and extreme realizations of returns are more common than the normal distribution predicts. Consequently, revisions to assessments of managerial ability after observing extreme realizations may be smaller than the theory would suggest. There is no obvious solution to this issue, given our limited knowledge of the underlying technologies. Consequently, to assure result robustness, it would appear prudent to check that a key finding holds for reasonable transformations of the performance data or changes in timing conventions.

1.2 Empirical models of CEO turnover and firm performance

1.2.1 Identifying and categorizing CEO turnover

Most prior empirical models of CEO turnover are logit models that include a dependent variable that assumes a value of 1 when a turnover event occurs in a given period and an independent variable that captures some measure of abnormal firm performance in the immediately prior period. The most general definition of turnover includes all cases where the identity of a firm’s CEO changes between time t and t + 1. Additional information is often used to assess the voluntary versus forced nature of these events, with researchers using algorithms relying on news articles, subsequent employment outcomes, and age information.
A common categorization procedure is to use some variant of the Parrino (1997) algorithm. This categorization assumes that news article revelations of a forced departure automatically qualify a turnover event as forced. In the absence of such a revelation, the default choice is to treat young (old) executives who depart from the firm as forced (voluntary), using age 60 as the dividing line.4 Although this approach has intuitive appeal, it may contain some systematic biases. In particular, the press may be more likely to label any given departure as forced when absolute firm performance is poor, leading to spurious inferences. In addition, since news revelations of overtly forced departures are rather rare, this algorithm relies heavily on a CEO’s age, treating a 59- (60)-year-old who departs as likely forced (not forced). Since no perfect categorization scheme exists, it seems reasonable to expect that robust conclusions regarding CEO turnover behavior should hold when the dependent variable is varied at least somewhat over a liberal to conservative spectrum. Moreover, careful consideration of age effects would appear prudent.

1.2.2 Measuring performance

The most common abnormal firm performance metrics in the CEO turnover literature are based on stock returns, but some studies also consider accounting measures (e.g., Weisbach 1988; Engel, Hayes, and Wang 2003). An attractive feature of stock returns is that they are largely unpredictable, while accounting measures exhibit predictable dynamics. Since prior evidence suggests a larger role for stock returns in predicting CEO turnover, we focus on market-based metrics.5 The actual construction of performance variables varies across studies. Many studies use simple annual stock returns to construct the key abnormal firm performance metric. Since extreme returns are not uncommon, these approaches may be sensitive to outliers.
An alternative that may lessen the influence of outliers and transform the data into a form that more closely resembles the normal distribution is to use a log transformation of returns. An even more aggressive transformation borrows from Aggarwal and Samwick (1999a, 2003a, 2003b) and converts annual returns into percentile ranks, creating a metric that follows a uniform distribution.6 Since these three approaches cannot be ranked on a priori grounds, robustness concerns suggest checking all three. Any substantive differences in findings across these transformations may help identify the performance observations that are most influential in predicting turnover.

1.2.3 Benchmarking performance

Following Holmström (1982), the theory of RPE posits that managers should be evaluated based on metrics that filter out factors that are unrelated to effort or ability. If we identify an abnormal firm performance measure (AbFirm) that is purged of these factors, and we are confident that these factors play no independent role in predicting turnover, we could estimate a model in which the probability of turnover for firm i in year t is modeled as

\[ \Pr(\text{CEO turnover}_{it}) = \Phi(\beta_1 \times \text{AbFirm}_{it} + \gamma \times Z_{it} + \varepsilon_{it}), \tag{1} \]

where $Z_{it}$ is a vector of non-performance-related controls, $\varepsilon_{it}$ is a noise term, and $\Phi$ is a known distribution, for example, the logistic cumulative distribution function. The coefficient $\beta_1$ from this model would then provide information on the sensitivity of turnover to abnormal firm performance. If we add the exogenous performance factors that are used to construct the abnormal firm performance measure (ExogPerf) to the estimated model, the theory of RPE predicts a coefficient of 0 on this variable. Thus, in a correctly specified model in which

\[ \Pr(\text{CEO turnover}_{it}) = \Phi(\beta_1 \times \text{AbFirm}_{it} + \beta_2 \times \text{ExogPerf}_{it} + \gamma \times Z_{it} + \varepsilon_{it}), \tag{2} \]

we expect $\beta_1$ to be negative and significant and $\beta_2$ to be zero under full RPE.
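To make the prediction from model (2) concrete, the following minimal Python sketch evaluates a logistic turnover probability for made-up coefficient values. Everything in it is an assumption for illustration: the function name, the coefficient values, the single number `controls_index` standing in for the controls term, and the omission of the noise term. None of these are estimates from the paper. The sketch only shows that with a zero coefficient on industry performance the predicted probability does not respond to industry performance, while a negative coefficient makes a weak industry raise the probability.

```python
import math


def turnover_probability(ab_firm, exog_perf, beta1, beta2, controls_index=0.0):
    """Logistic probability of turnover for one firm-year.

    controls_index is a hypothetical stand-in for the gamma x Z controls term;
    the noise term of the model is omitted in this illustration.
    """
    index = beta1 * ab_firm + beta2 * exog_perf + controls_index
    return 1.0 / (1.0 + math.exp(-index))


# Hypothetical values chosen only for illustration.
BETA1 = -1.0            # turnover becomes less likely as abnormal firm performance rises
CONTROLS_INDEX = -2.3   # implies a baseline turnover probability of roughly 9%

# Full RPE (beta2 = 0): industry performance does not move the probability.
p_rpe_weak_industry = turnover_probability(-0.2, -0.3, BETA1, 0.0, CONTROLS_INDEX)
p_rpe_strong_industry = turnover_probability(-0.2, 0.3, BETA1, 0.0, CONTROLS_INDEX)

# Incomplete RPE (beta2 < 0): a weak industry raises the turnover probability.
p_weak_industry = turnover_probability(-0.2, -0.3, BETA1, -0.5, CONTROLS_INDEX)
p_strong_industry = turnover_probability(-0.2, 0.3, BETA1, -0.5, CONTROLS_INDEX)

print(p_rpe_weak_industry == p_rpe_strong_industry)
print(p_weak_industry > p_strong_industry)
```

Because the logistic function is nonlinear, the change in probability produced by a given change in the index depends on where the index starts, which is one reason the choice of other included variables can matter for inferences in these models.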
Most authors emphasize industry/peer performance as a key exogenous factor in determining firm performance and use an industry measure to construct ExogPerf. Constructing an appropriate measure of abnormal firm performance can be challenging, since it will depend on unknown technological conditions. For example, whether CEOs should be compared to an equally weighted versus a value-weighted comparison portfolio of peer firms depends on which more closely captures exogenous factors related to the CEO’s firm. Similar comments apply to the identification of the comparison group. Given this lack of guidance, it is not surprising that prior researchers have used a variety of different adjustment procedures to create abnormal performance metrics. Robustness concerns dictate that any findings on the role of RPE in turnover should be relatively insensitive to the procedure used, unless a clear case can be made for choosing one based on a priori grounds. Several early papers assume the presence of full RPE and measure abnormal firm performance as the difference between firm and industry performance (e.g., Weisbach 1988; Parrino 1997; Huson, Parrino, and Starks 2001). Rather than assuming full RPE, GM directly examine this issue by including both relative-to-industry firm performance and industry performance in their turnover prediction models. The evidence they report is consistent with the lack of any independent role for industry performance in predicting CEO turnover, suggesting the presence of full RPE. Similar results for other samples of public firms are reported by Barro and Barro (1990), Garvey and Milbourn (2006), and Hazarika, Karpoff, and Nahata (2012), and Cornelli, Kominek, and Ljungqvist (2013) uncover parallel evidence for private-equity backed firms.7 Jenter and Kanaan (2015) make the important point that firm performance may not move one-for-one with industry performance.
If this is the case, to construct an appropriate measure of abnormal firm performance, one should use the residual from a predictive regression of firm performance on industry performance. In a model predicting CEO turnover, the RPE theory would then predict a significant negative coefficient on this idiosyncratic abnormal firm performance variable (i.e., the regression residual which measures AbFirm), and a coefficient of zero on the industry performance variable (i.e., the regression predicted value which measures ExogPerf). Interestingly, JK present evidence indicating that CEO dismissals are in fact related to industry performance, with poor performance resulting in an enhanced likelihood of turnover. Similar findings have been reported by Bushman, Dai, and Wang (2010), Gopalan, Milbourn, and Song (2010), Kaplan and Minton (2012), and Eisfeldt and Kuhnen (2013). These findings are interesting, as they cast doubt on a simple efficient learning view of turnover. Given the potential importance of this result, and the contrasting results reported originally by GM and more recently Hazarika, Karpoff, and Nahata (2012) and Huang, Maharjan, and Thakor (2016), below we apply our modeling insights to more fully explore and understand the relation between industry performance and CEO turnover.8

1.2.4 Timing issues

Given that some information on firms is available only on an annual basis, prior authors almost always predict turnover over a 1-year period (usually a fiscal year) as a function of prior year performance and start-of-year firm characteristics. We refer to this as annual timing. Since stock returns are available on a more frequent basis, it is possible to model turnover over smaller time windows using updated annual performance data to predict turnover in the selected window. Weisbach (1988) and Hadlock and Lumer (1997) use quarterly windows, whereas Jenter and Kanaan (2015) appear to effectively use monthly windows.
We refer to this latter approach as monthly timing. The choice between annual and monthly timing is unclear on a priori grounds. That almost all studies find a strong relation between abnormal firm performance and turnover using annual timing certainly suggests that the annual timing does capture a substantial part of the performance evaluation component incorporated into turnover decisions. Monthly timing may be more informative if boards make decisions quickly in response to new performance data, and if the researcher can obtain unlagged data on when a turnover decision was actually made. If there is a lag on either of these dimensions, some of the performance information with monthly timing may post-date the turnover decision. With regard to timing issues, annual timing is often the only practical choice. If finer timing on turnover decisions is available, monthly timing could also be informative. To the extent that there are differences in findings using the two conventions, there may be information content in the differences regarding how turnover decisions are made. Certainly, if a result holds using both timing conventions, a researcher’s confidence in the robustness of the findings will be increased. Along similar lines, longer performance windows (rather than shorter turnover windows) also may be informative. Huson, Parrino, and Starks (2001) exploit this approach.

1.2.5 Econometric modeling

Most studies estimate simple logit models predicting a 0/1 variable indicating whether turnover occurs in a given time window. Multinomial logit models are occasionally estimated, generally confirming inferences from sets of pairwise logits (e.g., Parrino 1997; Huson, Parrino, and Starks 2001; Hadlock, Lee, and Parrino 2002). Less frequently, linear probability models or probit models are estimated for robustness. A few studies estimate Cox hazard models (e.g., Hadlock, Lee, and Parrino 2002; Jenter and Kanaan 2015).
As has been discussed in the econometrics literature, the (discretized) Cox model is numerically identical to the conditional logit model, and the conditional logit model is a close cousin to the traditional logit model with a full set of tenure dummy variable controls.9 We have confirmed that this is true for turnover models using our data. Thus, little is gained by deviating from the standard logit treatment, and we focus our attention on these models.

1.3 Empirical strategy

As is clear from the preceding discussion, there are a variety of reasonable choices for empirically modeling CEO turnover. A strong relation between turnover and abnormal firm performance has been detected by many authors, regardless of their specific modeling choices, indicating a robust result. The real concern is whether other results are robust. Of particular note, the conflicting results that appear in the literature on the relation between industry performance and turnover suggest the possibility that differences in modeling may have a substantive effect on economic inferences. To explore this possibility, we undertake an empirical investigation. We have two goals in this analysis. First, we hope to understand why different researchers have reached different conclusions on the important issue of the turnover-industry performance relation. Second, we hope to offer insights on the merits of different CEO turnover modeling choices and practical guidance on checking the robustness of findings of interest. A large number of modeling permutations are available to choose from, so we make initial baseline choices that closely match the choices of JK and/or GM. These are two of the most prominent papers that study CEO turnover and industry performance, and these authors reach quite different conclusions. We emphasize the JK choices more heavily, as the samples we construct have substantial overlap with their sample.
After presenting the baseline models, we vary our modeling choices to provide both a more complete picture of the underlying economic behavior and the robustness properties of these types of models. As a preliminary step, we first attempt to characterize the “standard” treatment in the literature. To do this, we identify all JSTOR articles with the phrase “CEO Turnover” in the abstract, along with the first 30 articles sorted by relevance with this phrase published in the Journal of Financial Economics.10 If a flagged paper presents a model predicting turnover, we characterize the paper’s baseline modeling treatment. We first consider whether an author uses a fairly coarse definition of turnover which includes most CEO changes (generic turnover), or a strict definition that relies at least partially on press characterizations (overt turnover). We find that 27 of the 38 papers use a generic categorization, with the remaining 11 using a stricter/overt categorization. Next, we categorize each paper by whether the authors use annual timing versus monthly timing. Here, we find 34 papers use annual timing, with only 1 study adopting monthly timing (three have indeterminate timing). Of the studies that use stock returns, we find that 26 papers use untransformed return information, whereas three transform or compress these data in some way (e.g., percentiles). Sampling choices vary widely across studies, as some studies focus on a specific type of firm (e.g., foreign firms or a specific industry). Only 4 of the 38 studies specifically exclude financial firms or utilities, so this sampling restriction appears to be relatively uncommon in turnover studies. Using the literature as our guide, the “standard” approach to modeling turnover appears to be to (a) use generic turnover, (b) use annual timing, (c) use untransformed returns, and (d) include financials and utilities. Interestingly, both GM and JK differ from this standard treatment. 
In particular, GM use a transformed version of returns by using continuously compounded returns, while JK use forced turnover and a version of monthly timing. The standard treatment is of course not necessarily the optimal treatment, and many standard choices are likely dictated by data availability. Nevertheless, it is useful to have this literature characterization in mind as we consider various modeling permutations.

2. Sample Construction

2.1 Sample selection and categorization of turnover

We collect data on two samples, which we refer to as samples 1 and 2, respectively. Following JK, sample 1 includes all Execucomp firms from 1993 to 2009, thus including most of the largest public firms. Sample 2 borrows from Fee, Hadlock, and Pierce (2013) and is drawn from the universe of all Compustat firms from 1990 to 2006, but excludes utilities, financial firms, foreign firms, and firms with under $10 million in assets.11 Sample 2 is substantially larger than sample 1, as the sampling choices that it borrows from Fee, Hadlock, and Pierce (2013) impose a much smaller minimum size threshold which is only partially offset by their exclusion of utilities and financials. For both samples, we code a fairly liberal turnover categorization, referred to as generic turnover, and a conservative categorization, referred to as forced turnover. For both samples, we require that a firm is listed in the originating database at the start and end of the fiscal year under consideration. This requirement should eliminate turnovers that arise from external events such as acquisitions, bankruptcies, and going private transactions.12 To identify turnover in sample 1, we conduct news searches for every case in which the identity of the firm’s CEO changes between two consecutive annual listings in the Execucomp database.
If we can confirm a CEO change occurred and that the change was not related to a health/death event, an acquisition, or an immediate jump to a new employer, we code the generic turnover variable as 1. The overtly forced turnover variable for this sample is then coded as 1 for the subset of these events that also satisfy the Parrino (1997) forced event categorization. For this sample, we use Execucomp’s listed departure date as the turnover date.13 Since sample 2 is much larger, and information on the precise timing of CEO changes scarcer, in this sample we restrict ourselves to annual timing and detect the fiscal year in which every CEO change in the sample occurs, conditional on a firm being listed at the start and end of the fiscal year. Similar to sample 1, we define a generic turnover event to be every CEO change during the fiscal year, except for those related to health/death/acquisitions/jumps. For sample 2, our definition of overtly forced turnover is stricter than for sample 1, as we require a press characterization of the departure as forced as revealed by a Factiva search. Given the more limited information available for broader sets of public firms, further refinements using inputs into the Parrino (1997) algorithm are not practical for sample 2.

Summary statistics with basic information on samples 1 and 2 are reported in panels A and B of Table 1, respectively. As expected, the average firm in sample 1 is larger than the average firm in sample 2. The generic annual turnover rate is slightly lower in sample 1 (9.38% versus 11.16%), but both rates are consistent with typical figures reported in the literature. The forced annual turnover rate in sample 1 of 2.74% is similar to the rate reported by JK, who use similar sampling and categorization procedures. The forced turnover rate in sample 2 is lower at 0.87%, which is not surprising given the stricter coding of the forced variable in this sample and the sparser press coverage of smaller firms.
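The categorization rules described above can be sketched in a few lines of Python. This is a simplified illustration, not the coding procedure actually used: the record layout, field names, and reason labels are hypothetical, and the age-based rule reflects only the summary of the Parrino (1997) algorithm given in the text (press report of a forced departure, otherwise age 60 as the dividing line), without any further refinements the full algorithm may apply.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical labels for CEO changes that are excluded from generic turnover.
EXCLUDED_REASONS = {"health", "death", "acquisition", "jump"}
AGE_CUTOFF = 60  # dividing line mentioned in the text


@dataclass
class CeoChange:
    reason: Optional[str]     # hypothetical hand-coded label, None if no excluded reason applies
    press_says_forced: bool   # True if a news article characterizes the departure as forced
    age_at_departure: int


def is_generic_turnover(change: CeoChange) -> bool:
    """Liberal definition: any CEO change not tied to an excluded reason."""
    return change.reason not in EXCLUDED_REASONS


def is_forced_simplified_parrino(change: CeoChange) -> bool:
    """Simplified age-based rule: press report, else departures before the age cutoff."""
    if not is_generic_turnover(change):
        return False
    if change.press_says_forced:
        return True
    return change.age_at_departure < AGE_CUTOFF


def is_forced_press_only(change: CeoChange) -> bool:
    """Stricter rule: require an explicit press characterization as forced."""
    return is_generic_turnover(change) and change.press_says_forced


young_quiet_exit = CeoChange(reason=None, press_says_forced=False, age_at_departure=59)
print(is_forced_simplified_parrino(young_quiet_exit), is_forced_press_only(young_quiet_exit))
```

The example record illustrates how the two forced definitions can disagree: a 59-year-old who leaves without press comment counts as forced under the age-based rule but not under the press-only rule, which is consistent with the lower forced turnover rate under the stricter coding.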
Table 1
Sample characteristics

                                                    Statistic      Obs.
A. Sample 1 (Execucomp, 1993-2009)
  Number of firm years                                 28,296    28,293
  Mean book assets                                  14,641.77    28,293
  Median book assets                                 1,909.28    28,293
  Median age at turnover                                   59     2,643
  Generic annual turnover rate                         0.0938    28,293
  Overtly forced annual turnover rate                  0.0274    28,293
B. Sample 2 (Compustat, 1990-2006)
  Number of firm years                                 60,203    60,203
  Mean book assets                                   1,534.89    60,203
  Median book assets                                   144.27    60,203
  Median age at turnover                                   56     6,721
  Generic annual turnover rate                         0.1116    60,203
  Overtly forced annual turnover rate                  0.0087    60,203
C. Firm abnormal perf.: Generic versus nonturnover
  Sample 1, annual timing, untransformed           −0.0809***     2,152
  Sample 1, annual timing, cont. compound.         −0.0905***     2,152
  Sample 1, monthly timing, untransformed          −0.0962***     2,048
  Sample 1, monthly timing, cont. compound.        −0.1090***     2,048
  Sample 2, annual timing, untransformed           −0.1435***     4,413
  Sample 2, annual timing, cont. compound.         −0.1492***     4,413
D. Industry abnormal perf.: Generic versus nonturnover
  Sample 1, annual timing, untransformed               0.0014     2,152
  Sample 1, annual timing, cont. compound.             0.0007     2,152
  Sample 1, monthly timing, untransformed           −0.0113**     2,048
  Sample 1, monthly timing, cont. compound.         −0.0091**     2,048
  Sample 2, annual timing, untransformed              0.0068*     4,413
  Sample 2, annual timing, cont. compound.             0.0035     4,413

Sample 1 consists of all Execucomp firm-years from 1993 to 2009, and sample 2 is all Compustat firm-years from 1990 to 2006, but excluding financials, utilities, foreign firms, and firms below $10 million in assets in 1990 dollars. Both samples require the firm to have a listing at the start and end of the fiscal year, and the sample of turnover events includes all CEO changes over the course of the fiscal year. Book assets are measured in millions of dollars inflation-adjusted to December 2013 using the Consumer Price Index, treating all fiscal year accounting information as if it were reported on December 31. Overtly forced CEO turnover cases are events classified as forced using the Parrino (1997) algorithm for sample 1 and solely press characterizations for sample 2.
For both samples, generic turnover represents all CEO departures, except for events related to control changes, health, death, or jumping immediately to take a position elsewhere. Figures in panels C and D are differences in abnormal firm and industry returns for turnover versus nonturnover events, with p-values calculated with a t-test across groups. Performance statistics denoted as using annual timing are calculated over the fiscal year immediately preceding the year in which a turnover event takes place, whereas monthly timing is calculated over the 12-month period immediately preceding the month in which the turnover event occurs. A firm’s untransformed stock return is calculated as the firm’s buy-and-hold return winsorized at the 1% and 99% tails, and the abnormal return is this return less the corresponding industry return. Continuously compounded returns are calculated as log(1+untransformed return). Industry returns are the corresponding returns on the equally weighted portfolio of all firms in the same Fama-French industry, rebalanced monthly. Industry abnormal returns are calculated by subtracting the CRSP equally weighted index return over the same period. The figures in panels C and D of this table are calculated over the subset of observations in which the CEO has tenure for at least 24 months at the start of the turnover observation window. * significant at the 10% level, ** significant at the 5% level, and *** significant at the 1% level. It is worth commenting briefly on the large differences between generic and forced turnover rates in both samples. The generic departure group surely incorrectly includes voluntary departures, while the forced departure group surely excludes many involuntary departures. As discussed above, roughly two-thirds (one-third) of the literature focuses on generic (forced) departures, including GM (JK). 
Clearly there is no perfect categorization, although age controls or age restrictions will hopefully adequately adjust for predictable retirements at certain ages. To investigate further, for the subset of observations that are in both sample 1 and sample 2, we systematically look for evidence of severance payments to departed executives. Fee and Hadlock (2004) suggest this as an alternative approach to identify forced turnover events. The forced turnover rate using severance information is 4.81%, more than triple the rate of 1.54% when using only news articles, and substantially higher than the rate using the Parrino (1997) algorithm. This indicates that relying on press characterizations to identify forced turnover will likely miss a large number of relevant events.14 In untabulated results, we have also examined the labor market opportunities of sample CEOs after turnover. Regardless of whether the event is characterized as forced, few CEOs resurface in elite positions, and those that do tend to obtain jobs that appear inferior to their prior positions.15 This again suggests that many “voluntary” departures are far from voluntary. Given these observations, it would appear prudent to check results for both relatively liberal and conservative definitions of turnover, to ensure that findings are not driven by the peculiarities of a given categorization scheme.

2.2 Measuring firm and industry performance

As discussed above, the most common performance metric used in the prior literature is some measure of abnormal stock returns in the year prior to the window in which turnover is predicted. Most authors use a simple version of the firm’s buy-and-hold annual return, less a measure of benchmark performance, for example, industry returns. We follow JK’s suggested improvement on this approach and measure a firm’s abnormal performance as the residual from a sample-wide regression of firm returns against an industry benchmark.
This procedure essentially amounts to using the firm’s return less some sample-wide industry beta estimate times the contemporaneous corresponding industry benchmark return.16 We refer to this measure as the firm’s untransformed abnormal return. As discussed earlier, the theory is ambiguous on whether returns should be transformed to follow an alternative distribution. A minority of authors, including GM, adopt a transformed returns approach. Thus, as an alternative to the default choice of untransformed returns, we consider a continuously compounded alternative (i.e., log of 1+ return) suggested by the work of GM, and a percentile version of returns suggested by the work of Aggarwal and Samwick (1999a, 1999b, 2003). In using these alternatives, we first transform the data and then define the abnormal return as the residual from a sample-wide regression of the transformed firm returns against the transformed industry benchmarks. In constructing a measure of industry benchmark performance, there are a host of choices related to the composition of the industry, weighting schemes, rebalancing rules, and the population over which any regression adjustment is performed. We follow JK in choosing equally weighted benchmarks as our initial choice using the Fama-French 48 industry categories (the industry “other” is excluded). Since industry composition can change during the year, we rebalance all portfolios on a monthly basis and compound the derived monthly return over the fiscal year. We also consider value-weighted returns for the same industry portfolio in some of our robustness checks. We follow JK and define industry broadly to include all firms listed on Compustat/CRSP, but we have experimented with restricting the industry to within-sample firms, and the main results we report below are substantively unchanged. 
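As a rough illustration of the regression adjustment described above, the following Python sketch splits firm returns into an abnormal component (the residual from a single pooled regression on the industry benchmark) and an industry component (the fitted value), with an optional log(1 + return) transform applied before the regression. The function names and the toy return values are hypothetical and are not drawn from either sample. Winsorization, the percentile transformation, and the exclusion of a firm's own return from its benchmark are omitted for brevity.

```python
import math

def ols_fit(y, x):
    """Pooled OLS of y on a constant and x; returns (alpha, beta)."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    return alpha, beta

def split_performance(firm_ret, ind_ret, log_transform=False):
    """Return (abnormal, industry_component) for each observation.

    abnormal           = residual of firm return on the industry benchmark
    industry_component = fitted value from the same regression
    If log_transform is True, both series are converted to continuously
    compounded returns, log(1 + r), before the regression is run.
    """
    if log_transform:
        firm_ret = [math.log1p(r) for r in firm_ret]
        ind_ret = [math.log1p(r) for r in ind_ret]
    alpha, beta = ols_fit(firm_ret, ind_ret)
    fitted = [alpha + beta * x for x in ind_ret]
    residual = [y - f for y, f in zip(firm_ret, fitted)]
    return residual, fitted

# Hypothetical one-year returns for six firm-years (illustrative only).
firm = [0.25, -0.10, 0.05, -0.30, 0.40, 0.12]
industry = [0.10, -0.05, 0.08, -0.12, 0.20, 0.03]

abnormal, industry_component = split_performance(firm, industry)
abnormal_cc, industry_component_cc = split_performance(
    firm, industry, log_transform=True
)
```

By construction the residual is uncorrelated in sample with the industry benchmark, so the two regressors that enter the turnover models are orthogonal under this decomposition.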
In the case of untransformed returns, the associated industry benchmark is the one-year return on the industry portfolio as described and calculated above (with the firm’s own return excluded in the calculation). The continuously compounded benchmark is simply the log of (1 + untransformed industry return). The percentile version of the industry benchmark is based on the equally weighted industry portfolio of the firm relative to the contemporaneous annual cohort. These benchmarks are used in the regression adjustment procedure described above to measure abnormal firm performance.

2.3 A first look at performance

To provide some initial evidence on the relation between turnover and performance, we calculate statistics for abnormal firm and industry performance in a one-year period prior to turnover. These figures are referred to as annual timing statistics when we use the fiscal year prior to the turnover event, and monthly timing statistics when we use the 12-month period ending the month immediately prior to the turnover event. The figures in panel C of Table 1 clearly indicate poor relative firm performance on average before all turnover events, independent of the timing, sample, or performance transformation. In all cases, abnormal firm performance is negative at high levels of significance, with values from -8.09% to -14.92%. The industry performance figures in panel D of Table 1 are less clear. Using annual timing, there is no evidence of poor industry performance prior to turnover. In fact, some of the industry performance figures are positive. With monthly timing, there is some evidence of poor industry performance, with figures suggesting abnormal industry performance on the order of -1% in the year immediately prior to a turnover event. These initial figures hint at a negative industry performance relation to turnover, but one that is both small in magnitude and quite sensitive to timing issues.17

3. The Relation between CEO Turnover and Performance

3.1 Univariate evidence

To investigate the turnover-performance relation more systematically, we calculate turnover rates by performance quintiles. With annual timing, we consider performance over a given fiscal year predicting turnover in the subsequent fiscal year. With monthly timing, we consider performance over each 12-month period ending on the last day of a month predicting turnover in the next month. In the univariate statistics, we multiply monthly turnover rates by 12 to create annualized rates. As expected, the figures in panel A of Table 2 indicate significantly higher turnover rates in the worst abnormal firm performance quintile compared to the best, regardless of timing (annual or monthly), sample (sample 1 or 2), or turnover definition (forced versus generic). As firm performance worsens, turnover rates rise proportionally more when using the overtly forced definition, but more in absolute magnitude when using the generic turnover definition. This suggests that the forced category is more successful at picking up forced events, but also that some involuntary events are missed by the forced definition and are captured in generic turnover.

Table 2 Univariate turnover rates, top and bottom performance quintiles

                                        Sample 1                   Sample 2
                                        Q5      Q1      p-val.     Q5      Q1      p-val.
A. Abnormal firm perf., both timings
  Overt, annual timing                  0.015   0.064   .000       0.004   0.022   .000
  Generic, annual timing                0.089   0.139   .000       0.084   0.161   .000
  Overt, monthly timing                 0.014   0.072   .000
  Generic, monthly timing               0.082   0.159   .000
B. Industry perf., annual timing
  Overt                                 0.030   0.035   .227       0.011   0.011   .819
  Generic                               0.109   0.108   .832       0.119   0.110   .073
  Overt, no financials/utilities        0.034   0.033   .860
  Generic, no financials/utilities      0.116   0.107   .240
C. Industry perf., monthly timing
  Overt                                 0.025   0.033   .061
  Generic                               0.102   0.112   .181
  Overt, no financials/utilities        0.027   0.033   .263
  Generic, no financials/utilities      0.105   0.114   .280

Each cell with a column heading of Q5 (Q1) reports the annualized turnover rate for firms in the highest (lowest) performance quintile. Abnormal firm performance quintiles are based on the firms’ annual buy-and-hold stock return relative to the same industry-period cohort. Industry performance quintiles are based on the equally weighted industry portfolio of the firm relative to the same period cohort. Overt turnover events in sample 1 are CEO departures that are considered forced using the Parrino (1997) algorithm and in sample 2 are departures characterized by the press as forced. Generic turnover events are all departures, except for those related to control changes, health, death, and job jumps.
Annual timing denotes that performance is calculated over the fiscal year immediately preceding the year in which a turnover event could take place, whereas monthly timing is calculated over the 12-month period immediately preceding the month over which a potential turnover event may occur. Monthly turnover rates are annualized by multiplying by 12. Sample 1 consists of all Execucomp firm-years from 1993 to 2009, and sample 2 is all Compustat firm-years from 1990 to 2006 except financials, utilities, foreign firms, and firms below $10 million in assets. All p-values are for a simple t-test of a difference in turnover rates between the two quintiles. The figures in this table are calculated over the subset of these observations in which the CEO has tenure for at least 24 months at the start of the turnover observation window. Sample 1 with annual (monthly) timing includes 20,828 (249,849) observations, whereas sample 2 with annual timing includes 42,937 observations. Turning to industry performance, the figures in panel B of Table 2 suggest no strong relation between industry performance and turnover when using annual timing, regardless of sample, definition of turnover, or whether we exclude financials and utilities. Turning to monthly timing in panel C, the figures here are more suggestive of an industry performance relation. In all cases, the worst performance quintile has a higher rate than the best performance quintile, and the differences are larger than the corresponding differences for annual timing. However, the evidence is still weak. Only the overtly forced departures display a significant difference, with a rate of 3.29% (2.53%) for the worst (best) industry performance group, a difference that is only marginally significant (p=.061). It is worth noting that JK report figures of 4.10% and 2.36% when making a similar comparison in a similar sample. 
Thus, while our sample hints at a similar relation, the underlying pattern in our data appears much weaker.18

3.2 Initial multivariate evidence

We now consider standard logit models predicting turnover. In addition to abnormal firm performance and industry performance, all models include controls for firm size, CEO tenure, age, an age older than 60 dummy, an ages 63 to 65 dummy, and year dummies.19 Standard errors are clustered by industry, and all observations for CEOs with a tenure of under two years are excluded. We estimate these models for both samples, both definitions of turnover, with monthly and annual timing for sample 1 and annual timing only for sample 2. For ease of comparison with JK, we use untransformed measures of performance in this initial table. These choices result in six models, with primary interest in the firm and industry performance coefficients. For reference, we note that the sample 1/overt turnover/monthly timing model permutation should be a very close match to JK on all dimensions and we label this in Table 3 as the JK match. The sample 1/generic turnover/annual timing model will be the closest match to the GM model and sample choice and is also labeled as such, although the sample period we study clearly differs from theirs.

Table 3 Initial logit model estimates of CEO turnover

                                   A. Annual timing                                        B. Monthly timing
                                   (1)          (2)          (3)          (4)          (5)          (6)
                                   Sample 1,    Sample 2,    Sample 1,    Sample 2,    Sample 1,    Sample 1,
                                   overt        overt        generic      generic      overt        generic
                                                             (GM match)                (JK match)
Abnormal firm performance          −1.566***    −1.618***    −0.476***    −0.423***    −1.881***    −0.699***
                                   [0.142]      [0.321]      [0.069]      [0.040]      [0.303]      [0.067]
Industry performance               −0.597***    −0.755**     0.052        0.093        −1.076***    −0.311**
                                   [0.202]      [0.336]      [0.185]      [0.088]      [0.372]      [0.143]
Age                                0.025***     0.004        0.039***     0.018***     0.023***     0.039***
                                   [0.009]      [0.008]      [0.005]      [0.004]      [0.008]      [0.005]
Ages 60+ dummy                     −0.935***    −0.198       0.408***     0.345***     −0.950***    0.402***
                                   [0.187]      [0.187]      [0.077]      [0.079]      [0.196]      [0.076]
Ages 63–66 dummy                   −0.548       −0.111       0.584***     0.403***     −0.464*      0.504***
                                   [0.360]      [0.289]      [0.098]      [0.071]      [0.278]      [0.095]
Log Assets                         0.051*       0.343***     0.025*       0.035***     0.058*       0.037***
                                   [0.027]      [0.034]      [0.014]      [0.011]      [0.033]      [0.013]
Tenure                             −0.003***    −0.022**     −0.002***    −0.025***    −0.003***    −0.002***
                                   [0.001]      [0.011]      [0.000]      [0.002]      [0.001]      [0.000]
Obs.                               18,215       35,239       19,786       39,292       214,301      222,732
Hosmer-Lemeshow                    30.98***     34.74***     11.08        29.16***     49.53***     3.14
                                   [0.000]      [0.000]      [0.197]      [0.000]      [0.000]      [0.925]
Prob. turn. firm perf. 10th/90th   .046/.011    .016/.002    .118/.077    .131/.080    .047/.008    .125/.066
Prob. turn. ind. perf. 10th/90th   .027/.019    .007/.005    .094/.097    .102/.107    .027/.015    .099/.083

Each column reports traditional coefficient estimates from a single logit model in which turnover of the indicated type in the indicated sample is predicted. Asymptotic standard errors are reported in brackets under each coefficient estimate. The dependent variable assumes a value of 1 for turnover events of the indicated type, missing for other turnover events, and 0 for no turnover.
Panel A models use performance over one fiscal year to predict turnover in the subsequent year, whereas panel B models use performance over each 12-month period to predict turnover in the next calendar month. The two samples and two types of turnover modeled are described in the text and in prior tables. Each model is estimated only over the set of observations in which the CEO has tenure for at least 24 months at the start of the turnover observation window. Abnormal firm performance is the residual from a sample-wide regression predicting a firm’s stock returns against an industry return measure. Industry performance is the predicted value from this regression. Firm performance is winsorized at the 1% tails. All age variables are based on the CEO’s age as of the fiscal year end prior to the turnover observation window. Log of assets is based on the firm’s most recent inflation-adjusted total book assets figure prior to the turnover observation window. CEO tenure is measured in months as of the start of the turnover window. The figure under each Hosmer-Lemeshow statistic is the p-value from a chi-square test based on this statistic for the null of a correctly specified model. The penultimate and final rows in the table report the implied annual probability of turnover in decimal form derived from the column's logit model estimates as abnormal firm performance and industry performance respectively are varied from the 10th to 90th percentile levels, holding all other variables constant at the sample mean. In the case of the monthly timing models, we derive annual implied probabilities by multiplying monthly probabilities by 12. All models include year effects. Standard errors are clustered at the industry level. * significant at the 10% level, **significant at the 5% level, and ***significant at the 1% level. As the resultant estimates reveal in row 1 of Table 3, the coefficient on abnormal firm performance is negative and highly significant in all models. 
Thus, we see again that the negative relation between firm abnormal performance and turnover is evident no matter how turnover is defined, how the sample is selected, or what timing convention is selected. This is comforting and echoes both our earlier univariate evidence and the long prior literature examining this relation. In the second-to-last row of Table 3 we report the implied annual probabilities of turnover derived from the model estimates when firm performance is varied from the 10th to 90th percentile. These figures reveal an effect of reasonably large size: in the overt turnover models, the turnover rate for the poorest performers is in all cases at least four times the rate for the best performers, and in the generic turnover models it is in all cases at least 4 percentage points higher.

The initial picture for industry performance emerging from row 2 of Table 3 is somewhat more supportive of an underlying relation than our preceding univariate evidence would suggest. We estimate a significant negative coefficient on industry performance for overtly forced turnover regardless of timing or sample. In addition, the industry performance coefficient is negative and significant in one of three generic turnover models (sample 1, monthly timing). However, the industry performance coefficient is positive and insignificant in the two generic turnover models with annual timing. These initial coefficients are consistent with both JK and GM, as the two models that are the closest matches to each of these authors (columns 5 and 3, respectively) reveal coefficients in line with what they report with respect to the role of industry performance in CEO turnover. In the cases in which the estimated coefficient on industry performance is negative, the implied probabilities reported in the final row of Table 3 suggest an industry effect that is much smaller in economic magnitude than the firm performance effect.
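The mechanics behind the implied probabilities in the last two rows of Table 3 can be sketched as follows: evaluate the logistic function at the 10th and 90th percentile of the performance variable while holding the rest of the linear index fixed, and multiply by 12 for monthly-timing models. In this sketch only the slope of −1.566 is taken from column 1 of Table 3. The base index (intercept, controls at their means, and year effects) and the percentile values are hypothetical placeholders, since the text does not report them, so the printed numbers should not be expected to match the table.

```python
import math

def logistic(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))

def implied_annual_prob(base_index, beta, perf, periods_per_year=1):
    """Implied annual turnover probability at a given performance level.

    base_index       : linear index from all other regressors held fixed
    beta             : logit coefficient on the performance variable
    perf             : performance level at which to evaluate the model
    periods_per_year : 1 for annual timing; 12 for monthly timing, where the
                       monthly probability is annualized by multiplying by 12
    """
    return periods_per_year * logistic(base_index + beta * perf)

BETA_FIRM = -1.566        # abnormal firm performance, Table 3, column 1
BASE_INDEX = -3.8         # hypothetical: not reported in the text
P10, P90 = -0.45, 0.50    # hypothetical 10th/90th percentile abnormal returns

prob_poor = implied_annual_prob(BASE_INDEX, BETA_FIRM, P10)
prob_good = implied_annual_prob(BASE_INDEX, BETA_FIRM, P90)
print(f"10th pct: {prob_poor:.3f}, 90th pct: {prob_good:.3f}, "
      f"ratio: {prob_poor / prob_good:.1f}")
```

Because the logistic function is nonlinear, the same coefficient implies different probability gaps at different base rates, which is one reason to compare models on implied probabilities as well as on raw coefficients.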
3.3 Robustness concerns

While these initial multivariate models provide some support for the presence of an industry performance role in CEO turnover (a significant negative coefficient in 4 of 6 models in Table 3), there are reasons to be concerned. First, the earlier univariate evidence was substantially weaker, with no evident relation when using annual timing and a very weak relation with monthly timing. Second, two of the six logit models in Table 3, notably the annual timing/generic turnover models that most closely match GM, do not even hint at an industry performance relation; in fact, the point estimates are positive (but insignificant). These observations suggest that substantive robustness issues and/or specification errors are likely associated with these initial models. To provide some information on model fit, we report in Table 3 the Hosmer-Lemeshow test statistic for goodness of fit of a logit model specification. This test aggregates the squared differences between the predicted and actual number of events across deciles of the sample. A larger test statistic is associated with a poorer model fit, and an asymptotic p-value for the test statistic is available based on a chi-square distribution under the null of a correctly specified model. As the figures in Table 3 reveal, in all three of the models of overtly forced turnover the test statistic is large and the null is easily rejected. These are the models that offered the strongest evidence of an industry performance role in turnover. The generic turnover models fare better, with 2 of the 3 models not rejected at conventional levels. Clearly these findings do not inspire confidence in the particular logit model specifications of forced turnover reported in Table 3, or the corresponding discretized Cox duration models.
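For readers unfamiliar with the test, the sketch below illustrates one common form of the Hosmer-Lemeshow statistic. Observations are sorted by fitted probability and split into ten groups, and the squared gap between observed and expected event counts in each group is scaled by the group's binomial variance and summed. The variance scaling and the use of 10 − 2 = 8 degrees of freedom are textbook conventions assumed here rather than details stated in the text, although 8 degrees of freedom does reproduce the p-values shown under the statistics in columns 3 and 6 of Table 3 (0.197 for 11.08 and 0.925 for 3.14). The function names are hypothetical, and the closed-form tail probability is valid only for an even number of degrees of freedom.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i          # (x/2)^i / i!
        total += term
    return math.exp(-half) * total

def hosmer_lemeshow(y, p_hat, groups=10):
    """Return (statistic, p_value) for 0/1 outcomes y and fitted probabilities.

    Groups are formed from observations ordered by fitted probability.
    The p-value uses groups - 2 degrees of freedom, which must be even here.
    """
    pairs = sorted(zip(p_hat, y), key=lambda t: t[0])
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)
        observed = sum(outcome for _, outcome in chunk)
        mean_p = expected / size
        variance = size * mean_p * (1.0 - mean_p)
        if variance > 0:
            stat += (observed - expected) ** 2 / variance
    return stat, chi2_sf_even_df(stat, groups - 2)

# p-values implied by two of the statistics reported in Table 3
p_col3 = chi2_sf_even_df(11.08, 8)
p_col6 = chi2_sf_even_df(3.14, 8)
```

A small p-value rejects the null of a correctly specified model, which is how the three overtly forced turnover models in Table 3 are flagged.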
3.4 Exploring robustness

In light of our concerns about robustness and poor fit of many of these initial models, we are left with an unclear picture on the degree to which industry performance is related to CEO turnover. The default choices in our initial models are made to closely match JK, and to a lesser extent GM, to ease comparisons to the prior literature. However, as we discussed earlier, there are a large number of reasonable model alterations suggested by prior work. In order to get a clearer picture of the underlying economic relation between industry performance and CEO turnover, we consider here several of these alterations. In this analysis, which we report in Table 4, we tabulate only the industry-performance coefficients, as the abnormal firm performance coefficients are, as expected, negative and highly significant in all models.

Table 4 Industry performance coefficients for an alternative model of CEO turnover

                                  Cont. comp.   Percentile   Linear     Drop firm   Value
                                  perf.         perf.        regress    perf.       weight
A. Fiscal year timing
  Sample 1, overt                 −0.269        2.161        −0.004     −0.166      −0.547**
                                  [0.237]       [1.949]      [0.006]    [0.206]     [0.251]
  Sample 2, overt                 −0.317        2.844        0.000      −0.053      −0.524
                                  [0.260]       [2.284]      [0.002]    [0.239]     [0.399]
  Sample 1, generic (GM match)    0.011         −0.620       0.009      0.090       0.040
                                  [0.234]       [1.020]      [0.016]    [0.152]     [0.180]
  Sample 2, generic               0.112         −1.361       0.016*     0.165**     0.149
                                  [0.120]       [0.875]      [0.009]    [0.080]     [0.113]
B. Monthly timing
  Sample 1, overt (JK match)      −0.391        2.189        −0.001*    −0.544*     −0.823**
                                  [0.300]       [2.122]      [0.001]    [0.285]     [0.358]
  Sample 1, generic               −0.224        0.978        −0.002     −0.187      −0.240*
                                  [0.148]       [0.858]      [0.001]    [0.128]     [0.136]

Each cell in this table, except for the linear regression column, reports an estimated industry coefficient from a logit model predicting turnover. Asymptotic standard errors are reported in brackets under each coefficient estimate. Each model corresponds to a model in Table 3 with all model choices the same except the modification indicated in the column heading. The continuously compounded performance column replaces the firm and industry performance measures in Table 3 with continuously compounded versions of these measures by using a log transform. The percentile performance column replaces these measures with percentile measures based on the annual cohort. The linear regression column estimates linear regressions with a 0/1 dependent variable and all control variables identical to the corresponding logit models in Table 3. The dropped firm performance column estimates the same models used in Table 3 but with the firm performance variable dropped from the estimation. The value-weighted column replaces the equally weighted industry performance benchmark with a corresponding value-weighted measure. All other model choices and labeling conventions in the models that lead to these coefficients are identical to the corresponding models and specifications in Table 3.
* significant at the 10% level, ** significant at the 5% level, and *** significant at the 1% level.

As we discussed earlier, a case can be made for using continuously compounded returns or return percentiles as measures of performance. These alternatives place different weights on performance variations compared to the simple untransformed return measure. When we make this alteration, the industry performance coefficients, reported in the first two columns of Table 4, become insignificant in all models. An alternative to nonlinear logit models, with some desirable properties, along with notable shortcomings, is the simple linear probability model (see Angrist and Pischke 2009). As we report in column 3 of Table 4, when we use this alteration, the industry performance results again become generally insignificant across models, with one negative and marginally (10%) significant coefficient in the model that most closely matches JK (sample 1, overt turnover, monthly timing). It is worth noting that with each of these robustness checks, the corresponding model coefficient on abnormal firm performance is always negative and highly significant (untabulated). The goodness of fit of the models that use the transformed measures of performance is generally at least as good as that of the models using the untransformed measures, as indicated by similar or lower values for the Hosmer-Lemeshow statistic (figures untabulated). Our evidence that these seemingly innocuous model checks do not confirm the Table 3 findings, along with the earlier univariate evidence, raises serious doubt about the robustness of any estimated turnover-industry performance relation. As an additional check, we estimate the default Table 3 models but drop abnormal firm performance as an explanatory variable.
As we report in the fourth column of Table 4, this step of dropping an orthogonal variable from the model reveals only one significant negative coefficient on industry performance (out of 6), and this is only significant at the 10% level in the model that most closely matches JK.20 This change must reflect the nonlinearity of the logit model, as the dropping of an orthogonal variable in a traditional ordinary least-squares (OLS) linear regression cannot have an effect on other coefficient estimates. Given the varying treatment in the literature, we also experiment with using a value-weighted performance benchmark in place of an equally weighted industry benchmark. JK report that their evidence of an industry performance relation with turnover also holds when using a value weighted measure. As we report in the final column of Table 4, in our sample this modification slightly weakens the evidence regarding a turnover-industry performance relation in some of the Table 3 models. We have experimented with a variety of other model alterations, including (a) restricting attention only to S&P 1500 firms, (b) dropping financials and utilities, (c) measuring industry performance using only the sample rather than the entire CRSP/Compustat universe, and (d) restricting attention to executives under the age of 60. While there are some minor changes in findings with some of these alterations, the basic conclusions are substantively unchanged (results untabulated). In particular, with these changes, some of the initial Table 3 models indicate a negative industry performance coefficient, but this evidence almost always disappears, or weakens drastically, when moving to the Table 4 robustness checks. The JK study presents some evidence suggesting that industry performance may play a different role in turnover for strong versus weak performers. In particular, CEOs with weak relative performance may be able to hide behind good industry performance if they are credited with good luck. 
On the other hand, CEOs with strong relative performance may be inappropriately blamed for bad luck (i.e., bad industry performance). The JK evidence suggests that the former behavior may be more common than the latter. To investigate, in Table 5 we separate the sample into below-median (panel A) and above-median (panel B) relative firm performance subsamples, and then we compare differences in turnover rates within each subsample dependent on whether industry performance was low (below median) or high (above median). As the table reports, all of the differences across the industry performance groups are insignificant, regardless of sample, timing, or turnover definition.

Table 5 Credit for good luck versus blame for bad luck

                                                                  High industry   Low industry   p-value
                                                                  perf.           perf.
A. Poor performers and hiding behind good luck
  Sample 1, annual: Poor firm performance, overt turnover         0.042           0.048          .182
  Sample 2, annual: Poor firm performance, overt turnover         0.014           0.016          .399
  Sample 1, monthly: Poor firm performance, overt turnover        0.041           0.049          .103
  Sample 1, annual: Poor firm performance, generic turnover       0.129           0.120          .172
  Sample 2, annual: Poor firm performance, generic turnover       0.136           0.132          .461
  Sample 1, monthly: Poor firm performance, generic turnover      0.127           0.133          .412
B. Strong performers and blame for bad luck
  Sample 1, annual: Strong firm performance, overt turnover       0.019           0.018          .715
  Sample 2, annual: Strong firm performance, overt turnover       0.005           0.007          .149
  Sample 1, monthly: Strong firm performance, overt turnover      0.014           0.017          .324
  Sample 1, annual: Strong firm performance, generic turnover     0.094           0.092          .799
  Sample 2, annual: Strong firm performance, generic turnover     0.092           0.091          .921
  Sample 1, monthly: Strong firm performance, generic turnover    0.089           0.093          .518

Each row in panel A (B) identifies observations with below (above) median abnormal firm performance relative to the same time period cohort for the indicated sample. The figures in the high industry (low industry) performance column indicate the turnover rate of the indicated type for firms with above (below) median industry performance over the same period. Firm performance cuts are based on the firms’ annual buy-and-hold stock return relative to the same industry/period cohort. Industry performance groupings are based on the equally weighted industry portfolio of the firm relative to the same period cohort. Annual timing denotes that performance is calculated over the fiscal year immediately preceding the year in which a turnover event could take place, whereas monthly timing is calculated over the 12-month period immediately preceding the month over which a potential turnover event may occur.
Monthly turnover rates are annualized by multiplying by 12. All p-values are for a simple t-test of a difference in turnover rates between the two industry performance groups in a given row. The figures in this table are calculated over the subset of observations in which the CEO has a tenure of at least 24 months at the start of the turnover observation window. All other definitions and variables are as defined in the earlier tables.

4. Making sense of the evidence

4.1 Synthesizing the evidence

Summarizing the preceding findings, the negative relation between CEO turnover and abnormal firm performance is strong and evident no matter what modeling choices we make. The relation between turnover and industry performance is much weaker. The univariate evidence is nonexistent with annual timing and quite weak with monthly timing. The multivariate evidence is moderate under some initial modeling assumptions, but collapses or weakens drastically when (a) the performance metric is transformed, (b) linear regression is used in place of a logit model, or (c) abnormal firm performance is dropped from the model. In general, the evidence for the presence of an industry performance relation with turnover is stronger for sample 1 (large Execucomp firms), monthly timing rather than annual timing (JK timing rather than GM timing), and untransformed measures of returns (JK measurement versus GM measurement). However, even with the most supportive choices, the case for a robust relation between CEO turnover and industry performance in our data and samples appears weak at best.

Given the hints of an industry performance relation when we use monthly timing, we more closely examine industry performance in an extended window preceding turnover events in sample 1. We examine the 24-month period ending the month before the turnover departure date (-24 to -1), along with the two annual subintervals denoted by -24 to -13 and -12 to -1.
As we report in the first two panels of Table 6 (see Column 2), average abnormal industry performance is insignificantly negative over the full 24-month period for both definitions of turnover. For the 12-month subintervals, average industry abnormal performance is small, positive, and insignificant in the -24- to -13-month period, and small, negative, but statistically significant in the -12- to -1-month interval. This pattern aggregates to an insignificant average over the entire 2-year period, regardless of whether we use overtly forced events (panel A) or generic turnover events (panel B). Even the significant negative industry performance patterns over the -12- to -1-month window are small in magnitude, and their statistical significance implicitly depends on certain parametric assumptions.

Table 6
Turnover and performance over 24 months

A. Performance prior to turnover, overt
                   Abnormal firm performance  Abnormal industry performance
Months -24 to -1   −0.387***                  −0.012
Months -24 to -13  −0.140***                  0.006
Months -12 to -1   −0.216***                  −0.027***

B. Performance prior to turnover, generic
                   Abnormal firm performance  Abnormal industry performance
Months -24 to -1   −0.182***                  −0.003
Months -24 to -13  −0.077***                  0.006
Months -12 to -1   −0.096***                  −0.011**

C. Turnover rates by 2-year performance quintile
                                     Firm performance  Industry performance
Overt turnover rate, Q5 performance  0.011             0.029
Overt turnover rate, Q1 performance  0.073             0.032
p-val., Q1 versus Q5                 .000              .502
Generic turnover, Q5 performance     0.073             0.109
Generic turnover, Q1 performance     0.168             0.113
p-val., Q1 versus Q5                 .000              .564

Figures in panels A and B are differences in abnormal firm and industry performance for turnover vs. nonturnover events, with p-values calculated using a t-test across groups. In panels A and B, we calculate the mean level of abnormal firm and industry performance over the 24-month period ending at the end of the calendar month immediately preceding the turnover event, and also averages over the two 12-month subintervals that compose this 24-month period. Panel A is for overtly forced events, and panel B is for generic turnover events. A firm’s untransformed stock return is calculated as the firm’s buy-and-hold return winsorized at the 1% and 99% tails, and the abnormal return is this return less the corresponding industry return. Industry returns are the returns on the equally weighted portfolio of all firms in the same Fama-French industry, rebalanced monthly. Industry abnormal returns are calculated by subtracting the CRSP equally weighted index return for the same period. Panel C reports turnover rates for the highest (Q5) and lowest (Q1) quintile performers over the entire 24-month window based on firm performance (first column of figures) or industry performance (second column of figures). Firm performance quintiles are based on the firms’ annual buy-and-hold stock return relative to the same industry/period cohort.
Industry performance quintiles are based on the equally weighted industry portfolio of the firm relative to the same period cohort. Turnover rates are converted from a monthly figure to an annualized figure by multiplying by 12. The figures in this table are calculated over the subset of observations in which the CEO has a tenure of at least 24 months at the start of the turnover observation window. The p-values in panel C indicate whether the Q5 and Q1 turnover rates immediately above the indicated p-value differ significantly using a simple t-test. *significant at the 10% level, **significant at the 5% level, and ***significant at the 1% level. This evidence may help explain our earlier findings and related findings in the literature. When we use annual timing, a turnover event in a 1-year window is predicted using performance in the prior year. This implies an average 6-month lag between the event and the end of the performance measurement window, corresponding to months -18 to -7. With monthly timing, the -12 to -1 period is used to predict turnover in the next month. Given the hints of a slight positive industry performance trend in the more distant months, and a slight negative trend in the later months, the strongest (weakest) evidence for an industry performance relation with CEO turnover will occur with timing that is measured immediately (with a delay) preceding a possible event. To provide context for this evidence, we note that abnormal firm performance is negative with large magnitudes over all of these windows, as reported in the first column of Table 6. Thus, it appears that the entire 24-month window is relevant in the performance evaluation process that governs turnover decisions. As we report in panel C of Table 6, if we consider this longer window, there is no evidence of significant differences in CEO turnover rates by industry performance quintiles. 
Moreover, if we estimate logit models with monthly timing using these 24-month performance windows, the industry performance coefficient is insignificant using any of the modeling permutations from the first four columns of Table 4 (results untabulated).

4.2 Additional insights on modeling turnover

While the preceding evidence offers little convincing support for the presence of an economically important industry performance relation to turnover, it is concerning that some models do indicate a relation that quickly disappears with small modeling changes. This suggests that many common turnover models may be fairly sensitive to specification choices. In addition, the Hosmer-Lemeshow test statistics indicate that many common turnover model specifications may fit the data quite poorly. Given these observations, it would appear prudent to thoroughly explore the robustness of any given turnover finding to a variety of model alterations. While this general recommendation applies in almost all empirical finance settings, our evidence and the mixed results in the past literature suggest that this is a particularly acute issue in studies of CEO turnover.

To provide some additional insights on modeling choices, and also to more fully investigate the sensitivity of turnover findings to small modeling changes, we conduct a simulation analysis. In particular, we first create CEO turnover data that are mechanically constructed to depend only on a firm’s relative-to-industry continuously compounded performance. We accomplish this by using the coefficients from the models in the first column of Table 4 for all variables, except for industry performance, to derive an implied turnover probability for each observation. The industry performance coefficient is set to 0 in this calculation. We then use a random number generator to create a 0/1 turnover outcome for each observation governed by this predicted probability.
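One way to write this construction compactly is given below. The notation is introduced here purely for illustration and is an assumption rather than something taken from the tables: $\Lambda$ denotes the logistic function, $r^{cc}_i$ the continuously compounded relative-to-industry performance of observation $i$, $r^{ind}_i$ its industry performance, $x_i$ the remaining control variables, and hats the first-column Table 4 estimates.

```latex
\hat{p}_i = \Lambda\!\left(\hat{\alpha} + \hat{\beta}_{firm}\, r^{cc}_i + 0 \cdot r^{ind}_i + \hat{\gamma}' x_i\right),
\qquad
\Lambda(z) = \frac{1}{1 + e^{-z}},
\qquad
y^{sim}_i \sim \mathrm{Bernoulli}(\hat{p}_i)
```

Written this way, the zero coefficient on $r^{ind}_i$ makes explicit that any industry effect later detected in the simulated outcomes $y^{sim}_i$ cannot come from the data-generating process itself.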
The resultant data set is a simulated sample that displays, by construction, a particular version of full RPE (i.e., with continuous compounding), and otherwise agrees with the original data set. We create 1,000 such simulated data sets for each of the six models in the first column of Table 4. We take these simulated data and estimate the corresponding logit model specifications presented in Table 3. That is, we estimate models with (inappropriate) untransformed measures of abnormal firm performance and industry performance. Since the samples are mechanically constructed to display no industry performance relation when performance is measured in a certain way (i.e., with continuously compounded returns), we can assess model performance by observing whether estimating the wrong model (i.e., one that assumes untransformed returns) leads to an inflated or deflated rate of false positives on the estimated industry performance coefficients. We report in the first column of Table 7 the fraction of the 1,000 simulated data sets for each model in which the industry performance coefficient is significant at the 5% level using a two-tailed test. A rate substantially above 5% suggests that a small error in the model can incorrectly indicate an underlying relation, even when such a relation is, by construction, not present. As the figures in the table reveal, in all cases the rate of false positives far exceeds 5%.21 The highest error rate appears to be in the monthly timing, overt turnover, sample 1 model (i.e., the JK match specification), as in this case over 90% of the simulated data sets indicate a significant industry coefficient even when no industry relation is present. We report next to these estimates the fraction of the time that the Hosmer-Lemeshow test statistic indicates that the (incorrect) model specification should be rejected at the 5% level.
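The logic of this simulation can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code behind Table 7: the return distributions, the coefficient values `B0` and `B_FIRM`, the sample sizes, and the function names `fit_logit` and `false_positive_rate` are all hypothetical, controls other than performance are omitted, and the logit is fit with a hand-written Newton-Raphson routine so that only NumPy is needed.

```python
import numpy as np

# Hypothetical "true" intercept and firm-performance slope (assumptions).
B0, B_FIRM = -3.0, -1.5


def fit_logit(X, y, n_iter=50, tol=1e-10):
    """Fit a logit by Newton-Raphson; return coefficients and Wald z-statistics."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        info = X.T @ (X * (p * (1.0 - p))[:, None])  # information matrix
        step = np.linalg.solve(info, X.T @ (y - p))  # Newton step
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    cov = np.linalg.inv(X.T @ (X * (p * (1.0 - p))[:, None]))
    return beta, beta / np.sqrt(np.diag(cov))


def false_positive_rate(n_obs=4000, n_sims=200, seed=0):
    """Share of simulated samples with a 'significant' industry coefficient."""
    rng = np.random.default_rng(seed)
    # Hypothetical return data: simple industry returns and an independent
    # continuously compounded firm-specific shock.
    ind_ret = np.exp(rng.normal(0.08, 0.20, n_obs)) - 1.0
    cc_abn = rng.normal(0.0, 0.40, n_obs)  # equals log(1+firm) - log(1+industry)
    firm_ret = (1.0 + ind_ret) * np.exp(cc_abn) - 1.0
    # True model: turnover depends only on continuously compounded abnormal
    # performance; the industry coefficient is zero by construction.
    p_true = 1.0 / (1.0 + np.exp(-(B0 + B_FIRM * cc_abn)))
    # Misspecified model: untransformed abnormal return plus industry return.
    X_wrong = np.column_stack([np.ones(n_obs), firm_ret - ind_ret, ind_ret])
    hits = 0
    for _ in range(n_sims):
        y = (rng.random(n_obs) < p_true).astype(float)  # simulated 0/1 turnover
        _, z = fit_logit(X_wrong, y)
        hits += int(abs(z[2]) > 1.96)  # two-tailed test at the 5% level
    return hits / n_sims


if __name__ == "__main__":
    print(false_positive_rate())
```

Comparing the returned rate with the nominal 5% level mirrors the first column of Table 7. The magnitude in this toy setting depends entirely on the assumed parameters and need not resemble the reported figures.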
The highest rate of model rejection is also for this JK match specification (56.30%), lending some confidence to this test as a way to detect model misspecification.

Table 7
Simulation-based evidence on industry performance coefficients and turnover modeling

Columns: (1) Full logit model, industry coefs. signif. at 5%; (2) Full logit model, HL test signif. at 5%; (3) Logit model with no firm perf., industry coefs. signif. at 5%; (4) OLS/linear probability model, industry coefs. signif. at 5%.

                                      (1)    (2)    (3)    (4)
Sample 1: Overt, annual               .4470  .3050  .0670  .0710
Sample 1: Generic, annual (GM match)  .0920  .0490  .0640  .0740
Sample 2: Overt, annual               .5430  .2480  .1390  .1130
Sample 2: Generic, annual             .1800  .3280  .1300  .0970
Sample 1: Overt, monthly (JK match)   .9120  .5630  .0520  .0700
Sample 1: Generic, monthly            .1830  .0930  .0940  .0820

In each row, we randomly create turnover data using a random number generator where the probability of turnover for each observation obeys the implied probabilities of turnover from the coefficients in the corresponding model of the first column of Table 4 (continuously compounded performance with sample, timing, and turnover definition as indicated) but with the industry performance coefficient set equal to zero. For each model, we create 1,000 simulated samples.
We then estimate the corresponding Table 3 model on each simulated data set with turnover assumed to be a function of untransformed abnormal firm performance and industry performance. The first column of figures in the table reports the fraction of simulated data sets in which the estimated industry performance coefficient is (spuriously) significant at the 5% level. The second column indicates the fraction of cases in which the corresponding Hosmer-Lemeshow test would reject the null hypothesis of an appropriately specified model at the 5% significance level. The third column indicates the fraction of cases in which the industry performance coefficient is significant at the 5% level when we drop firm performance from the logit estimation on the simulated data. The final column indicates the fraction of cases in which an estimated linear probability (OLS) model corresponding to the Table 3 models (including both firm and industry performance) estimated over the simulated data indicates a significant industry performance coefficient at the 5% level.

In the final two columns of Table 7, we report the frequency of false positives on the industry performance coefficient when we (a) drop the abnormal firm performance variable and (b) use linear regression rather than logit. While these figures are slightly elevated compared to the 5% we would expect by chance, they are far lower than when the full incorrect logit model is estimated. This suggests that these checks are a reasonable way to examine the robustness of a given finding in light of the apparent extreme sensitivity of nonlinear logit models to seemingly small model misspecifications.

5. Conclusion

The empirical literature on CEO turnover exhibits a wide variety of modeling choices. Some reported findings, most notably the sensitivity of CEO turnover to abnormal firm performance, appear insensitive to specific choices. The evidence is less clear for other findings.
Of particular note, several prominent papers report little sensitivity of CEO turnover to industry performance, whereas others have detected a significant relation. This variation across the literature, coupled with a particularly wide variety of modeling and data choices, raises the general issue of result robustness in the CEO turnover context, as well as the specific issue of whether widespread violations of full relative performance evaluation are present in CEO turnover decisions. In this paper, we investigated these issues by studying two large samples and estimating CEO turnover models that span most of the popular choices in the prior literature. The finding of a significant sensitivity of CEO turnover to abnormal firm performance is never in doubt. It emerges no matter what modeling, sampling, or data choices we considered. The evidence of a significant role for industry performance in CEO turnover appears weak and sensitive to modeling choices. At a univariate level, we have detected some limited evidence of poor industry performance before turnover, but the magnitude is economically small and only apparent over some time windows for some turnover definitions. When we predicted turnover using common multivariate models, we uncovered somewhat stronger evidence of a significant industry performance coefficient, particularly when we (a) used a fairly strict definition of turnover, (b) used a timing convention that emphasizes very recent performance innovations, (c) studied a sample restricted to the largest Compustat firms, and (d) used a performance measure that does not transform the raw performance data. Alterations to (a) or (b) tend to weaken the evidence considerably, whereas alterations to (c) or (d) cause evidence in support of an industry performance relation to effectively disappear in our data. 
In light of our evidence, the case of an important role for industry performance in CEO turnover decisions appears quite limited, at least in the data that we studied. Some hints of this behavior are present, but the hints are faint. It is possible that an industry performance role is stronger and more robust in other samples, but certainly our evidence, which is based on two large samples and a fairly exhaustive set of model permutations, raises much doubt that industry performance plays an important role in most CEO removal decisions. Prior authors have come down on both sides of this question, but these studies tend to limit attention to a smaller set of modeling choices, and samples vary widely across studies. Thus, it is difficult to determine whether it is modeling choices, data differences, or some combination thereof that drive the differing conclusions across studies. Whatever the case, our evidence certainly suggests that there is little in terms of a large industry performance component to typical CEO turnover decisions. This conclusion contrasts with much prior evidence that industry performance plays a strong role in CEO compensation awards, raising interesting questions about the distinct roles played by turnover and compensation policies in managerial incentive contracting. In addition to adding to the evidence on the economic behavior governing turnover decisions, we have established several methodological points regarding modeling turnover decisions. Most importantly, we showed that coefficient estimates in turnover models can be quite sensitive to moderate changes in data definitions and/or timing conventions. When we perturb various modeling choices in small ways, coefficient estimates often sharply change in magnitude, sign, and statistical significance. 
In addition, simulations reveal that small model misspecifications can indicate a significant relation with high statistical confidence, for example, industry performance predicting turnover, even when an underlying relation is, by construction, not present. It is entirely possible that the reverse result also holds (i.e., no evidence of a relation in a misspecified model even when one is present). Since any test of an industry performance role in turnover is necessarily a joint test of this hypothesis coupled with the hypothesis that the model is correctly specified, our evidence indicates that much more attention should be paid to specification issues in future research. Ideally, either the underlying theory or an auxiliary empirical analysis can provide guidance on the appropriate specification choice. Although this recommendation generally applies to empirical research, given our evidence, it would appear to be particularly important in the CEO turnover context. The lack of model robustness we have identified, which apparently has not been widely recognized, is concerning and important, as appropriate modeling choices in the CEO turnover context are unclear based on a priori grounds. In light of the fragility we have detected, it would appear prudent for researchers in this area to consider a variety of reasonable modeling and data choices before concluding that an underlying relation is present. Our analysis offers some guidance on choices that may span the reasonable set including: (a) estimating a linear model that corresponds to the usual nonlinear models, (b) experimenting with dropping orthogonal variables from the estimation, (c) considering simple monotonic transformations of the performance measures, and (d) considering a range of alternative timing conventions. Understanding CEO turnover is certainly an important question for financial economists. 
Although we highlighted the challenges in studying this issue, we are hopeful that some of our insights are useful in arriving at robust and definitive conclusions regarding this important dimension of corporate governance.

We thank Ashwini Agrawal, Rajesh Aggarwal, Radha Gopalan, Camelia Kuhnen, Dirk Jenter, Uday Rajan (the editor), and Amit Seru; several anonymous referees; and seminar participants at Alabama, Clemson, Georgia State, Georgetown, Kentucky, Minnesota, Nebraska, NYU, Purdue, Texas A&M, Texas Tech, Tulane, UNC Charlotte, and the 2013 WFA meetings for helpful comments and discussions. Earlier drafts of this paper circulated under different titles. All errors remain our own.

Footnotes

1 A parallel literature on RPE in CEO compensation includes, for example, Aggarwal and Samwick (1999b), Bertrand and Mullainathan (2001), Garvey and Milbourn (2003, 2006), and Albuquerque (2009).

2 This point also has been made by, among others, Kaplan and Minton (2012) and Jenter and Lewellen (2014). See also Eisfeldt and Kuhnen (2013) for an equilibrium model of executive job separations. We present evidence below suggesting that many forced turnover events may be mistakenly labeled as voluntary.

3 As illustrated in many labor economics models (e.g., Mclaughlin 1991), the distinction of who instigates a change is often irrelevant for testing efficient learning theories of turnover, as performance signals should symmetrically affect all parties’ assessment of match quality.

4 The Parrino (1997) categorization contains some other elements, but the ones discussed here have the largest empirical effect on how events are categorized.

5 Firms may not directly rely on any of these metrics in their decisions. Instead, firms may rely on other (perhaps unobservable to the econometrician) metrics correlated with or partially captured by these traditional metrics.
6 A closely related alternative, suggested by the work of Bushman, Dai, and Wang (2010), is to scale the performance metric by some measure of return volatility so that returns are measured in units of standard deviations.

7 The Hazarika, Karpoff, and Nahata (2012) results are particularly relevant, as they use a sample and methodology similar to JK, yet their coefficients and figures reveal no significant poor abnormal industry performance before forced turnover.

8 Many of the studies we cite here do not explicitly discuss their coefficient estimates on industry performance, but an inspection of the tabulated coefficients in these studies leads to our characterization of whether the reported findings are consistent or inconsistent with JK.

9 These points are explicitly established by Sueyoshi (1995) and Jenkins (1995). Practitioner-oriented discussions can be found in Beck, Katz, and Tucker (1998) and Singer and Willett (2003). The nohr display option in Stata converts Cox coefficients to values comparable to logit coefficients. We have checked our main results for a model that includes tenure dummies. These coefficients are almost identical to what we report in the main tables using a simple linear trend variable for tenure.

10 Many articles in JSTOR do not include abstracts, and other articles use alternative phrases, such as “Management Turnover.” Our goal is not to survey all articles, but a reasonably representative set. The Journal of Financial Economics is a popular outlet for turnover studies, and it is not included in the JSTOR database.

11 Our timing convention is that a year reference means the end of the fiscal year has been referenced. With annual timing, turnover is measured starting at this point in time, and performance is measured until this point in time.
12 We have hand-checked a set of 50 turnover events that are excluded by these criteria and find that the vast majority are directly related to external events, such as control changes, acquisitions, bankruptcies, or legal events. In the handful of cases not related to external events, there is no evidence of below average industry performance.

13 Turnover is automatically considered overtly forced whenever a press characterization suggests a forced departure. In all other cases, the variable is set equal to 0 if the CEO is over the age of 60. If the CEO is under 60, the variable is set equal to 1, unless the event was announced at least 6 months in advance or the CEO stays on at the firm (typically as Board Chair) after losing the CEO role. In the subset of cases with an available announcement date, the median time difference with the departure date is 0, with over 80% of these two dates falling within 31 days of each other.

14 In untabulated regressions, we have estimated models predicting turnover events accompanied by severance payments as an alternative to models predicting overtly forced departures. Our findings with regard to firm and industry performance using this categorization are quite similar to what we report in the text/tables for generic turnover.

15 These results were reported in an earlier draft of the paper and are available from the authors on request.

16 We winsorize firm returns at the 1% tails before estimating these regressions.

17 JK report summary statistics that indicate substantially poorer average industry performance in the forced turnover group compared to the no turnover group using monthly timing. Using a sample and data treatment similar to that of JK, Hazarika, Karpoff, and Nahata (2012) report almost no difference in industry performance between these groups.

18 In our calculations, we exclude executives with tenures of under 2 years to maintain comparability to the later logit models.
We also measure industry performance relative to the annual cohort. The choices that JK make when reporting corresponding figures are unclear.

19 We do not discuss the coefficients on the control variables in depth, but it is interesting to note that the age greater than 60 dummy is only negative and significant in the overt turnover models, as these are the models that use a variant of the Parrino (1997) algorithm. This suggests a distinct drop in predicted turnover at the age 60 threshold that is artificially generated by the algorithm.

20 If we run analogous specifications with industry performance measured in percentiles and abnormal firm performance excluded, the coefficients on industry performance are insignificant in all cases.

21 We also have experimented with simulations in which turnover data are constructed to depend on untransformed firm abnormal performance and a model is estimated that incorrectly assumes continuously compounded performance. The rate of false positives is also high with this specification error, although slightly lower than what we report in the first column of Table 7.

References

Aggarwal R. K., Samwick A. A. 1999a. The other side of the trade-off: The impact of risk on executive compensation. Journal of Political Economy 107: 65–105.
Aggarwal R. K., Samwick A. A. 1999b. Executive compensation, strategic competition, and relative performance evaluation: Theory and evidence. Journal of Finance 54: 1999–2043.
Aggarwal R. K., Samwick A. A. 2003a. Why do managers diversify their firms? Agency reconsidered. Journal of Finance 58: 71–118.
Aggarwal R. K., Samwick A. A. 2003b. Performance incentives within firms: The effect of managerial responsibility. Journal of Finance 58: 1613–50.
Albuquerque A. 2009. Peer firms in relative performance evaluation. Journal of Accounting and Economics 48: 69–89.
Angrist J. D., Pischke J. 2009. Mostly harmless econometrics: An empiricist’s companion. Princeton, NJ: Princeton University Press.
Barro J. R., Barro R. J. 1990. Pay, performance, and turnover of bank CEOs. Journal of Labor Economics 8: 448–81.
Beck N., Katz J., Tucker R. 1998. Taking time seriously: Time-series-cross-section analysis with a binary dependent variable. American Journal of Political Science 42: 1260–88.
Bertrand M., Mullainathan S. 2001. Are executives paid for luck? The ones without principals are. Quarterly Journal of Economics 116: 901–32.
Bushman R., Dai Z., Wang X. 2010. Risk and CEO turnover. Journal of Financial Economics 96: 381–98.
Cornelli F., Kominek Z., Ljungqvist A. 2013. Monitoring managers: Does it matter? Journal of Finance 68: 431–81.
Eisfeldt A. L., Kuhnen C. 2013. CEO turnover in a competitive assignment framework. Journal of Financial Economics 103: 351–72.
Engel E., Hayes R. M., Wang X. 2003. CEO turnover and properties of accounting information. Journal of Accounting and Economics 36: 197–226.
Fee C. E., Hadlock C. J. 2004. Management turnover across the corporate hierarchy. Journal of Accounting and Economics 37: 3–38.
Fee C. E., Hadlock C. J., Pierce J. R. 2013. Managers with and without style: Evidence using exogenous variation. Review of Financial Studies 26: 567–601.
Garvey G., Milbourn T. 2003. Incentive compensation when executives can hedge the market: Evidence of relative performance evaluation in the cross-section. Journal of Finance 58: 1557–81.
Garvey G., Milbourn T. 2006. Asymmetric benchmarking in compensation: Executives are paid for good luck but not punished for bad. Journal of Financial Economics 82: 197–225.
Gibbons R., Murphy K. J. 1990. Relative performance evaluation for chief executive officers. Industrial and Labor Relations Review 43: 30–52.
Gopalan R., Milbourn T., Song F. 2010. Strategic flexibility and the optimality of pay for sector performance. Review of Financial Studies 23: 2060–98.
Hadlock C. J., Lumer G. B. 1997. Compensation, turnover, and top management incentives: Historical evidence. Journal of Business 70: 153–87.
Hadlock C. J., Lee S., Parrino R. 2002. Chief executive officer careers in regulated environments: Evidence from electric and gas utilities. Journal of Law and Economics 45: 535–63.
Hazarika S., Karpoff J., Nahata R. 2012. Internal corporate governance, CEO turnover, and earnings management. Journal of Financial Economics 104: 44–69.
Holmström B. 1982. Moral hazard in teams. Bell Journal of Economics 13: 392–415.
Huang S., Maharjan J., Thakor A. 2016. Disagreement-induced CEO turnover. Working Paper, Washington University in St. Louis.
Huson M., Parrino R., Starks L. 2001. Internal monitoring mechanisms and CEO turnover: A long-term perspective. Journal of Finance 56: 2265–99.
Jenkins S. P. 1995. Easy estimation methods for discrete-time duration models. Oxford Bulletin of Economics and Statistics 57: 129–36.
Jenter D., Kanaan F. 2015. CEO turnover and relative performance evaluation. Journal of Finance 70: 2155–83.
Jenter D., Lewellen K.
2014. Performance-induced CEO turnover . Working Paper, Stanford University. Kaplan S., Minton B.. 2012. How has CEO turnover changed? International Review of Finance  12: 57– 87. Google Scholar CrossRef Search ADS   Mclaughlin K. J. 1991. A theory of quits and layoffs with efficient turnover. Journal of Political Economy  99: 1– 29. Google Scholar CrossRef Search ADS   Parrino R. 1997. CEO Turnover and outside succession: A cross-sectional analysis. Journal of Financial Economics  46: 165– 97. Google Scholar CrossRef Search ADS   Shumway T. 2001. Forecasting bankruptcy more accurately: A simple hazard model. Journal of Business  74: 101– 24. Google Scholar CrossRef Search ADS   Singer J. D., Willett J. B.. 2003. Applied longitudinal data analysis: Modeling change and event occurrence . New York: Oxford University Press. Google Scholar CrossRef Search ADS   Sueyoshi G. T. 1995. A class of binary response models for grouped data. Journal of Applied Econometrics  110: 411– 31. Google Scholar CrossRef Search ADS   Weisbach M. S. 1988. Outside directors and CEO turnover. Journal of Financial Economics  20: 431– 60. Google Scholar CrossRef Search ADS   © The Author 2017. Published by Oxford University Press on behalf of The Society for Financial Studies. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com. http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png Review of Corporate Finance Studies Oxford University Press

Robust Models of CEO Turnover: New Evidence on Relative Performance Evaluation

Publisher
Oxford University Press
ISSN
2046-9128
eISSN
2046-9136
D.O.I.
10.1093/rcfs/cfx018

Abstract

We examine the robustness of empirical models and findings concerning CEO turnover. We show that the sensitivity of turnover to abnormal firm performance is an extremely robust result. In contrast, evidence indicating a relation between turnover and industry performance is both weak and fragile. We show that small changes in turnover modeling choices can affect inferences in a large way. Our evidence casts substantial doubt on the hypothesis that there is a large industry performance component to turnover decisions. We use our findings to offer some general prescriptions for checking robustness results in CEO turnover research.

Received June 6, 2017; editorial decision July 10, 2017 by Editor Uday Rajan.

Questions concerning how CEOs are removed from office have attracted much attention from financial economists. This is an important issue if CEO skills are substantive inputs into firm performance, as any inefficiencies in the process by which CEOs are removed from office could have first-order implications for a firm’s success. In addition, CEO action choices should be influenced by the manager’s anticipation of the mechanism governing their job security.

The empirical literature on CEO turnover exhibits a wide variety of different modeling choices including those related to the definition of turnover, the timing windows over which turnover or performance are measured, the construction of performance variables, and the econometric model selected. This heterogeneity is not surprising given the coarse guidance provided by the underlying theories and wide variation in data availability across settings. The usual hope is that reasonable choices, robustness checking, and multiple studies across different samples will lead to robust inferences that reflect widespread economic behavior.
Although these modeling issues are common in many settings, they are particularly acute in turnover studies, as data are often hand-collected and key choices are fixed early in the research process. In this paper we will identify some of the key choices that arise in studies of CEO turnover and explore whether these choices have a substantive effect on important economic questions. With regard to the well-researched question of the relation between CEO turnover and abnormal firm performance, we detect an extremely robust negative relation. Although the relation is highly significant in all cases, certain modeling choices do affect the economic magnitudes of the estimates in a substantive way. With regard to the issue of the widely discussed relation between turnover and industry performance (e.g., Jenter and Kanaan 2015), our findings indicate that empirical support for the general presence of such a relation is, at best, fragile and nonrobust. In particular, using two large samples, we find that when we use standard annual timing conventions, there is little convincing evidence of any relation between CEO turnover and industry performance. This evidence and treatment of the data is consistent with the findings of Gibbons and Murphy (1990) and others, and it suggests full relative performance evaluation in CEO turnover decisions. When we use an alternative nonstandard monthly timing convention for modeling turnover along the lines of Jenter and Kanaan (2015) (JK hereafter), we uncover some limited evidence suggesting a small role for industry performance in CEO turnover, although this evidence appears both weak and fragile. Given these findings, we will argue that the notion that CEOs are widely blamed for bad luck or credited for good luck in the removal decision is simply not strongly supported by the data. 
Thus, in our view, much of the recent discussion on the efficiency implications of industry performance factors in CEO turnover is misguided and attempts to address, at best, fairly rare behavior. Our evidence suggests that a simple efficient learning perspective with full industry relative performance evaluation (RPE) is a reasonable description of the CEO turnover process, at least in most traditional turnover model settings. Our turnover findings, coupled with widespread prior evidence that CEO compensation depends on luck (e.g., Bertrand and Mullainathan 2001), suggest that turnover and compensation policies play quite distinct roles in incentivizing managers. Our analysis offers useful insights into the role of modeling choices that play an important role in inferences regarding the CEO turnover process. One of these choices is the turnover event categorization procedure, with some of the more popular choices almost surely incorporating systematic biases. A second key choice is the mechanical construction of performance metrics, as seemingly innocuous assumptions regarding compounding, rebalancing, and monotonic transforms of the performance data can have a large effect on model inferences. Finally, and quite surprisingly, the inclusion of explanatory variables that are orthogonal to the variable of interest can have a significant effect on model inferences owing to the nonlinearity of the underlying models. For all three of these modeling issues, we will recommend a set of empirical checks that should assist researchers in assessing whether their inferences are reasonably robust. Since the use of monthly timing conventions does lead to some faint evidence for the presence of an industry effect in CEO turnover, we will look particularly closely at timing issues. In the case of firm performance, the picture is clear. Abnormal firm performance is on average poor and steadily declining in the two years preceding turnover event dates. 
Thus, abnormal firm performance measured over any standard window as a predictor of turnover in the subsequent year or month always displays a strong negative relation that is easy to detect. In the case of industry performance, the picture is nuanced. Industry performance is essentially flat over the 2-year window before the exact turnover date, but it trends slightly positive in the -24-month to -13-month period and negative in the -12-month to -1-month window. Thus, when using the traditional timing convention of fiscal-year performance windows as a predictor of turnover in the next fiscal year, the performance window captures part of the industry up-down pattern, and there is no evident relation between industry performance and CEO turnover. If one follows the nonstandard approach of using annual performance to predict turnover in the next month, the downward industry trend is emphasized more, and, consequently, some models reveal a marginally significant negative relation.

These nuanced findings on industry performance do not seem to support a simple performance attribution error scenario, as in this case we would expect industry performance to exhibit the same long-term decline before turnover that is evident with abnormal firm performance. The results also suggest little in terms of an industry performance incentive effect induced by the threat of turnover, as the relation between industry performance and turnover measured over long windows is essentially zero, independent of timing subtleties.

1. Prior Literature and Empirical Strategy

1.1 Theoretical considerations

Boards of directors are responsible for the decision to remove a CEO from office. Several nonperformance factors have been hypothesized to affect this decision. Holding these constant, most models assume that boards of directors update their assessment of a CEO’s suitability for remaining in office based on performance signals that arrive over time.
Gibbons and Murphy (1990) (GM hereafter) assume that boards efficiently incorporate this performance information into their assessment of a CEO’s ability and follow an optimal dismissal decision rule based on this assessment. We will refer to this behavior as the efficient learning perspective. Simple tests of the efficient learning perspective examine whether there is any sensitivity of turnover to abnormal firm performance, followed by introspection into whether observed magnitudes appear consistent with optimality. More subtle tests often rely on comparative statics investigations suggested by the theory. GM examine a particularly interesting prediction of the efficient learning perspective. Motivated by Holmström’s (1982) theory of RPE, they argue that optimal turnover decisions should depend only on performance measured relative to a benchmark that filters out components of firm performance unrelated to managerial effort or ability, most notably industry performance.1 We highlight two theoretical issues that have important empirical design implications. The first concerns the party instigating the turnover decision. If a CEO has unattractive outside opportunities, turnover outcomes should largely reflect board-instigated decisions. However, when a CEO has attractive external opportunities, for example, a prestigious new job or highly valued leisure opportunities in retirement, the CEO may instigate the job separation. Researchers often attempt to label these outcomes as “forced” and “voluntary” and model them separately. However, most authors recognize that these assignment procedures are necessarily imperfect. 
In particular, many “voluntary” turnover events may in fact be far from voluntary.2 Consequently, a useful robustness check is to consider whether results change substantially when using alternative definitions of forced turnover.3

A second issue concerns the technology describing the distribution of managerial ability and the mapping between ability and performance. A common approach is to assume additive normal distributions, as this yields simple expressions for the efficient updating process. Although this approach is theoretically pragmatic, difficulties appear when moving to the data. Most performance metrics in turnover studies are distinctly nonnormal. For example, annual stock returns are skewed and extreme realizations of returns are more common than the normal distribution predicts. Consequently, revisions to assessments of managerial ability after observing extreme realizations may be smaller than the theory would suggest. There is no obvious solution to this issue, given our limited knowledge of the underlying technologies. Consequently, to assure result robustness, it would appear prudent to check that a key finding holds for reasonable transformations of the performance data or changes in timing conventions.

1.2 Empirical models of CEO turnover and firm performance

1.2.1 Identifying and categorizing CEO turnover

Most prior empirical models of CEO turnover are logit models that include a dependent variable that assumes a value of 1 when a turnover event occurs in a given period and an independent variable that captures some measure of abnormal firm performance in the immediately prior period. The most general definition of turnover includes all cases where the identity of a firm’s CEO changes between time t and t + 1. Additional information is often used to assess the voluntary versus forced nature of these events, with researchers using algorithms relying on news articles, subsequent employment outcomes, and age information.
A common categorization procedure is to use some variant of the Parrino (1997) algorithm. This categorization assumes that news article revelations of a forced departure automatically qualify a turnover event as forced. In the absence of such a revelation, the default choice is to treat young (old) executives who depart from the firm as forced (voluntary), using age 60 as the dividing line.4 Although this approach has intuitive appeal, it may contain some systematic biases. In particular, the press may be more likely to label any given departure as forced when absolute firm performance is poor, leading to spurious inferences. In addition, since news revelations of overtly forced departures are rather rare, this algorithm relies heavily on a CEO’s age, treating a 59- (60)-year-old who departs as likely forced (not forced). Since no perfect categorization scheme exists, it seems reasonable to expect that robust conclusions regarding CEO turnover behavior should hold when the dependent variable is varied at least somewhat over a liberal to conservative spectrum. Moreover, careful consideration of age effects would appear prudent.

1.2.2 Measuring performance

The most common abnormal firm performance metrics in the CEO turnover literature are based on stock returns, but some studies also consider accounting measures (e.g., Weisbach 1988; Engel, Hayes, and Wang 2003). An attractive feature of stock returns is that they are largely unpredictable, while accounting measures exhibit predictable dynamics. Since prior evidence suggests a larger role for stock returns in predicting CEO turnover, we focus on market-based metrics.5 The actual construction of performance variables varies across studies. Many studies use simple annual stock returns to construct the key abnormal firm performance metric. Since extreme returns are not uncommon, these approaches may be sensitive to outliers.
An alternative that may lessen the influence of outliers and transform the data into a form that more closely resembles the normal distribution is to use a log transformation of returns. An even more aggressive transformation borrows from Aggarwal and Samwick (1999a, 2003a, 2003b) and converts annual returns into percentile ranks, creating a metric that follows a uniform distribution.6 Since these three approaches cannot be ranked on a priori grounds, robustness concerns suggest checking all three. Any substantive differences in findings across these transformations may help identify the performance observations that are most influential in predicting turnover.

1.2.3 Benchmarking performance

Following Holmström (1982), the theory of RPE posits that managers should be evaluated based on metrics that filter out factors that are unrelated to effort or ability. If we identify an abnormal firm performance measure (AbFirm) that is purged of these factors, and we are confident that these factors play no independent role in predicting turnover, we could estimate a model in which the probability of turnover for firm i in year t is modeled as

$$\Pr(\text{CEO turnover}_{it}) = \Phi\left(\beta_1 \times \text{AbFirm}_{it} + \gamma \times Z_{it} + \varepsilon_{it}\right), \tag{1}$$

where $Z_{it}$ is a vector of non-performance-related controls, $\varepsilon_{it}$ is a noise term, and $\Phi$ is a known distribution, for example, the logistic cumulative distribution function. The coefficient $\beta_1$ from this model would then provide information on the sensitivity of turnover to abnormal firm performance. If we add the exogenous performance factors that are used to construct the abnormal firm performance measure (ExogPerf) to the estimated model, the theory of RPE predicts a coefficient of 0 on this variable. Thus, in a correctly specified model in which

$$\Pr(\text{CEO turnover}_{it}) = \Phi\left(\beta_1 \times \text{AbFirm}_{it} + \beta_2 \times \text{ExogPerf}_{it} + \gamma \times Z_{it} + \varepsilon_{it}\right), \tag{2}$$

we expect $\beta_1$ to be negative and significant and $\beta_2$ to be zero under full RPE.
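As a minimal sketch of how a model of the form in equation (2) might be estimated, the following Python example simulates turnover that depends only on abnormal firm performance, so that the true coefficient on exogenous (industry) performance is zero as full RPE predicts, and then fits a logit by Newton-Raphson. The sample size, the coefficient values, and the names `fit_logit`, `ab_firm`, and `exog_perf` are illustrative assumptions rather than features of our data or code, and the control vector Z is reduced to a constant.

```python
import math
import random


def sigmoid(v):
    # Numerically stable logistic CDF.
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def solve(a, b):
    # Solve the small linear system a @ x = b by Gauss-Jordan elimination
    # with partial pivoting.
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_logit(X, y, iters=25):
    # Maximum likelihood logit via Newton-Raphson: beta += H^{-1} g, where
    # g is the score and H is the negative Hessian of the log-likelihood.
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for xi, yi in zip(X, y):
            p = sigmoid(sum(b * x for b, x in zip(beta, xi)))
            w = p * (1.0 - p)
            for j in range(k):
                grad[j] += (yi - p) * xi[j]
                for l in range(k):
                    hess[j][l] += w * xi[j] * xi[l]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < 1e-10:
            break
    return beta


# Simulated firm-years (all parameter values are assumptions for illustration).
random.seed(0)
X, y = [], []
for _ in range(20000):
    ab_firm = random.gauss(0.0, 0.4)    # abnormal firm performance
    exog_perf = random.gauss(0.1, 0.2)  # industry component of performance
    p_turnover = sigmoid(-2.0 - 1.5 * ab_firm)  # true coefficient on exog_perf is 0
    X.append([1.0, ab_firm, exog_perf])
    y.append(1 if random.random() < p_turnover else 0)

b0, b1, b2 = fit_logit(X, y)
print(f"constant={b0:.3f}  AbFirm={b1:.3f}  ExogPerf={b2:.3f}")
```

In this simulated setting the estimated coefficient on `ab_firm` should be clearly negative, while the coefficient on `exog_perf` should be close to zero. A standard statistical package would additionally report standard errors, which this sketch omits.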
Most authors emphasize industry/peer performance as a key exogenous factor in determining firm performance and use an industry measure to construct ExogPerf. Constructing an appropriate measure of abnormal firm performance can be challenging, since it will depend on unknown technological conditions. For example, whether CEOs should be compared to an equally weighted versus a value-weighted comparison portfolio of peer firms depends on which more closely captures exogenous factors related to the CEO’s firm. Similar comments apply to the identification of the comparison group. Given this lack of guidance, it is not surprising that prior researchers have used a variety of different adjustment procedures to create abnormal performance metrics. Robustness concerns dictate that any findings on the role of RPE in turnover should be relatively insensitive to the procedure used, unless a clear case can be made for choosing one based on a priori grounds.

Several early papers assume the presence of full RPE and measure abnormal firm performance as the difference between firm and industry performance (e.g., Weisbach 1988; Parrino 1997; Huson, Parrino, and Starks 2001). Rather than assuming full RPE, GM directly examine this issue by including both relative-to-industry firm performance and industry performance in their turnover prediction models. The evidence they report is consistent with the lack of any independent role for industry performance in predicting CEO turnover, suggesting the presence of full RPE. Similar results for other samples of public firms are reported by Barro and Barro (1990), Garvey and Milbourn (2006), and Hazarika, Karpoff, and Nahata (2012); Cornelli, Kominek, and Ljungqvist (2013) uncover parallel evidence for private-equity-backed firms.7

Jenter and Kanaan (2015) make the important point that firm performance may not move one-for-one with industry performance.
If this is the case, to construct an appropriate measure of abnormal firm performance, one should use the residual from a predictive regression of firm performance on industry performance. In a model predicting CEO turnover, the RPE theory would then predict a significant negative coefficient on this idiosyncratic abnormal firm performance variable (i.e., the regression residual, which measures AbFirm), and a coefficient of zero on the industry performance variable (i.e., the regression predicted value, which measures ExogPerf). Interestingly, JK present evidence indicating that CEO dismissals are in fact related to industry performance, with poor performance resulting in an enhanced likelihood of turnover. Similar findings have been reported by Bushman, Dai, and Wang (2010), Gopalan, Milbourn, and Song (2010), Kaplan and Minton (2012), and Eisfeldt and Kuhnen (2013). These findings are interesting, as they cast doubt on a simple efficient learning view of turnover. Given the potential importance of this result, and the contrasting results reported originally by GM and more recently Hazarika, Karpoff, and Nahata (2012) and Huang, Maharjan, and Thakor (2016), below we apply our modeling insights to more fully explore and understand the relation between industry performance and CEO turnover.8

1.2.4 Timing issues

Given that some information on firms is available only on an annual basis, prior authors almost always predict turnover over a 1-year period (usually a fiscal year) as a function of prior year performance and start-of-year firm characteristics. We refer to this as annual timing. Since stock returns are available on a more frequent basis, it is possible to model turnover over smaller time windows using updated annual performance data to predict turnover in the selected window. Weisbach (1988) and Hadlock and Lumer (1997) use quarterly windows, whereas Jenter and Kanaan (2015) appear to effectively use monthly windows.
We refer to this latter approach as monthly timing. The choice between annual and monthly timing is unclear on a priori grounds. That almost all studies find a strong relation between abnormal firm performance and turnover using annual timing certainly suggests that the annual timing does capture a substantial part of the performance evaluation component incorporated into turnover decisions. Monthly timing may be more informative if boards make decisions quickly in response to new performance data, and if the researcher can obtain unlagged data on when a turnover decision was actually made. If there is a lag on either of these dimensions, some of the performance information with monthly timing may post-date the turnover decision.

With regard to timing issues, annual timing is often the only practical choice. If finer timing on turnover decisions is available, monthly timing could also be informative. To the extent that there are differences in findings using the two conventions, there may be information content in the differences regarding how turnover decisions are made. Certainly, if a result holds using both timing conventions, a researcher’s confidence in the robustness of the findings will be increased. Along similar lines, longer performance windows (rather than shorter turnover windows) also may be informative. Huson, Parrino, and Starks (2001) exploit this approach.

1.2.5 Econometric modeling

Most studies estimate simple logit models predicting a 0/1 variable indicating whether turnover occurs in a given time window. Multinomial logit models are occasionally estimated, generally confirming inferences from sets of pairwise logits (e.g., Parrino 1997; Huson, Parrino, and Starks 2001; Hadlock, Lee, and Parrino 2002). Less frequently, linear probability models or probit models are estimated for robustness. A few studies estimate Cox hazard models (e.g., Hadlock, Lee, and Parrino 2002; Jenter and Kanaan 2015).
As has been discussed in the econometrics literature, the (discretized) Cox model is numerically identical to the conditional logit model, and the conditional logit model is a close cousin to the traditional logit model with a full set of tenure dummy variable controls.9 We have confirmed that this is true for turnover models using our data. Thus, little is gained by deviating from the standard logit treatment, and we focus our attention on these models.

1.3 Empirical strategy

As is clear from the preceding discussion, there are a variety of reasonable choices for empirically modeling CEO turnover. A strong relation between turnover and abnormal firm performance has been detected by many authors, regardless of their specific modeling choices, indicating a robust result. The real concern is whether other results are robust. Of particular note, the conflicting results that appear in the literature on the relation between industry performance and turnover suggest the possibility that differences in modeling may have a substantive effect on economic inferences. To investigate, we undertake an empirical analysis with two goals. First, we hope to understand why different researchers have reached different conclusions on the important issue of the turnover-industry performance relation. Second, we hope to offer insights on the merits of different CEO turnover modeling choices and practical guidance on checking the robustness of findings of interest.

A large number of modeling permutations are available to choose from, so we make initial baseline choices that closely match with the choices of JK and/or GM. These are two of the most prominent papers that study CEO turnover and industry performance, and these authors reach quite different conclusions. We emphasize the JK choices more heavily, as the samples we construct have substantial overlap with their sample.
After presenting the baseline models, we vary our modeling choices to provide both a more complete picture of the underlying economic behavior and the robustness properties of these types of models. As a preliminary step, we first attempt to characterize the “standard” treatment in the literature. To do this, we identify all JSTOR articles with the phrase “CEO Turnover” in the abstract, along with the first 30 articles sorted by relevance with this phrase published in the Journal of Financial Economics.10 If a flagged paper presents a model predicting turnover, we characterize the paper’s baseline modeling treatment. We first consider whether an author uses a fairly coarse definition of turnover which includes most CEO changes (generic turnover), or a strict definition that relies at least partially on press characterizations (overt turnover). We find that 27 of the 38 papers use a generic categorization, with the remaining 11 using a stricter/overt categorization. Next, we categorize each paper by whether the authors use annual timing versus monthly timing. Here, we find 34 papers use annual timing, with only 1 study adopting monthly timing (three have indeterminate timing). Of the studies that use stock returns, we find that 26 papers use untransformed return information, whereas three transform or compress these data in some way (e.g., percentiles). Sampling choices vary widely across studies, as some studies focus on a specific type of firm (e.g., foreign firms or a specific industry). Only 4 of the 38 studies specifically exclude financial firms or utilities, so this sampling restriction appears to be relatively uncommon in turnover studies. Using the literature as our guide, the “standard” approach to modeling turnover appears to be to (a) use generic turnover, (b) use annual timing, (c) use untransformed returns, and (d) include financials and utilities. Interestingly, both GM and JK differ from this standard treatment. 
In particular, GM use a transformed version of returns by exploiting continuously compounded returns, while JK use forced turnover and a version of monthly timing. The standard treatment is of course not necessarily the optimal treatment, and many standard choices are likely dictated by data availability. Nevertheless, it is useful to have this literature characterization in mind as we consider various modeling permutations.

2. Sample Construction

2.1 Sample selection and categorization of turnover

We collect data on two samples, which we refer to as samples 1 and 2, respectively. Following JK, sample 1 includes all Execucomp firms from 1993 to 2009, thus including most of the largest public firms. Sample 2 borrows from Fee, Hadlock, and Pierce (2013) and is drawn from the universe of all Compustat firms from 1990 to 2006, but excludes utilities, financial firms, foreign firms, and firms with under $10 million in assets.11 Sample 2 is substantially larger than sample 1, as the sampling choices that it borrows from Fee, Hadlock, and Pierce (2013) impose a much smaller minimum size threshold, which is only partially offset by their exclusion of utilities and financials. For both samples, we code a fairly liberal turnover categorization, referred to as generic turnover, and a conservative categorization, referred to as forced turnover. For both samples, we require that a firm is listed in the originating database at the start and end of the fiscal year under consideration. This requirement should eliminate turnovers that arise from external events such as acquisitions, bankruptcies, and going private transactions.12

To identify turnover in sample 1, we conduct news searches for every case in which the identity of the firm’s CEO changes for two consecutive annual listings in the Execucomp database.
If we can confirm a CEO change occurred and that the change was not related to a health/death event, an acquisition, or an immediate jump to a new employer, we code the generic turnover variable as 1. The overtly forced turnover variable for this sample is then coded as 1 for the subset of these events that also satisfy the Parrino (1997) forced event categorization. For this sample, we use Execucomp’s listed departure date as the turnover date.13

Since sample 2 is much larger, and information on the precise timing of CEO changes scarcer, in this sample we restrict ourselves to annual timing and detect the fiscal year in which every CEO change in the sample occurs, conditional on a firm being listed at the start and end of the fiscal year. Similar to sample 1, we define a generic turnover event to be every CEO change during the fiscal year, except for those related to health/death/acquisitions/jumps. For sample 2, our definition of overtly forced turnover is stricter than for sample 1, as we require a press characterization of the departure as forced as revealed by a Factiva search. Given the more limited information available for broader sets of public firms, further refinements using inputs into the Parrino (1997) algorithm are not practical for sample 2.

Summary statistics with basic information on samples 1 and 2 are reported in panels A and B of Table 1, respectively. As expected, the average firm in sample 1 is larger than the average firm in sample 2. The generic annual turnover rate is slightly lower in sample 1 (9.38% versus 11.16%), but both rates are consistent with typical figures reported in the literature. The forced annual turnover rate in sample 1 of 2.74% is similar to the rate reported by JK, who use similar sampling and categorization procedures. The forced turnover rate in sample 2 is lower at 0.87%, which is not surprising given the stricter coding of the forced variable in this sample and the sparser press coverage of smaller firms.
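The coding rules described above can be illustrated with a short Python sketch. It is a deliberately simplified stand-in rather than our actual coding procedure: the `CeoChange` record, its field names, and the three functions are hypothetical, and the age-based rule keeps only the press-revelation and age-60 components of the Parrino (1997) algorithm, omitting its further refinements.

```python
from dataclasses import dataclass
from typing import Optional

# Reasons that remove a CEO change from the generic turnover sample.
EXCLUDED_REASONS = {"health", "death", "acquisition", "jump"}


@dataclass
class CeoChange:
    age_at_departure: int
    press_reports_forced: bool    # hypothetical flag from a news search
    reason: Optional[str] = None  # one of EXCLUDED_REASONS, or None


def generic_turnover(ev: CeoChange) -> int:
    # Liberal definition: any confirmed CEO change not tied to an excluded reason.
    return 0 if ev.reason in EXCLUDED_REASONS else 1


def forced_age_based(ev: CeoChange, age_cutoff: int = 60) -> int:
    # Simplified Parrino-style rule (sample 1): a press revelation of a forced
    # departure qualifies automatically; otherwise departures below the age
    # cutoff default to forced and those at or above it to voluntary.
    if not generic_turnover(ev):
        return 0
    return 1 if (ev.press_reports_forced or ev.age_at_departure < age_cutoff) else 0


def forced_press_only(ev: CeoChange) -> int:
    # Stricter rule (sample 2): forced only if the press characterizes it as such.
    if not generic_turnover(ev):
        return 0
    return 1 if ev.press_reports_forced else 0


events = [
    CeoChange(59, False),            # no press revelation, just under 60
    CeoChange(60, False),            # no press revelation, at the cutoff
    CeoChange(63, True),             # press reports a forced departure
    CeoChange(55, False, "health"),  # excluded from the turnover sample
]
for ev in events:
    print(ev, generic_turnover(ev), forced_age_based(ev), forced_press_only(ev))
```

The first two events differ only by one year of age yet receive opposite labels under the age-based rule, while the press-only rule treats them identically. This is the discontinuity at age 60 that motivates varying the turnover definition as a robustness check.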
Table 1 Sample characteristics

| | Statistic | Obs. |
|---|---|---|
| A. Sample 1 (Execucomp, 1993-2009) | | |
| Number of firm years | 28,296 | 28,293 |
| Mean book assets | 14,641.77 | 28,293 |
| Median book assets | 1,909.28 | 28,293 |
| Median age at turnover | 59 | 2,643 |
| Generic annual turnover rate | .0938 | 28,293 |
| Overtly forced annual turnover rate | .0274 | 28,293 |
| B. Sample 2 (Compustat, 1990-2006) | | |
| Number of firm years | 60,203 | 60,203 |
| Mean book assets | 1,534.89 | 60,203 |
| Median book assets | 144.27 | 60,203 |
| Median age at turnover | 56 | 6,721 |
| Generic annual turnover rate | .1116 | 60,203 |
| Overtly forced annual turnover rate | .0087 | 60,203 |
| C. Firm abnormal perf.: Generic versus nonturnover | | |
| Sample 1, annual timing, untransformed | −.0809*** | 2,152 |
| Sample 1, annual timing, cont. compound. | −.0905*** | 2,152 |
| Sample 1, monthly timing, untransformed | −.0962*** | 2,048 |
| Sample 1, monthly timing, cont. compound. | −.1090*** | 2,048 |
| Sample 2, annual timing, untransformed | −.1435*** | 4,413 |
| Sample 2, annual timing, cont. compound. | −.1492*** | 4,413 |
| D. Industry abnormal perf.: Generic versus nonturnover | | |
| Sample 1, annual timing, untransformed | 0.0014 | 2,152 |
| Sample 1, annual timing, cont. compound. | 0.0007 | 2,152 |
| Sample 1, monthly timing, untransformed | −.0113** | 2,048 |
| Sample 1, monthly timing, cont. compound. | −0.0091** | 2,048 |
| Sample 2, annual timing, untransformed | 0.0068* | 4,413 |
| Sample 2, annual timing, cont. compound. | 0.0035 | 4,413 |

Sample 1 consists of all Execucomp firm-years from 1993 to 2009, and sample 2 is all Compustat firm-years from 1990 to 2006, but excluding financials, utilities, foreign firms, and firms below $10 million in assets in 1990 dollars. Both samples require the firm to have a listing at the start and end of the fiscal year, and the sample of turnover events includes all CEO changes over the course of the fiscal year. Book assets are measured in millions of dollars inflation-adjusted to December 2013 using the Consumer Price Index, treating all fiscal year accounting information as if it were reported on December 31. Overtly forced CEO turnover cases are events classified as forced using the Parrino (1997) algorithm for sample 1 and solely press characterizations for sample 2.
For both samples, generic turnover represents all CEO departures, except for events related to control changes, health, death, or jumping immediately to take a position elsewhere. Figures in panels C and D are differences in abnormal firm and industry returns for turnover versus nonturnover events, with p-values calculated with a t-test across groups. Performance statistics denoted as using annual timing are calculated over the fiscal year immediately preceding the year in which a turnover event takes place, whereas monthly timing is calculated over the 12-month period immediately preceding the month in which the turnover event occurs. A firm’s untransformed stock return is calculated as the firm’s buy-and-hold return winsorized at the 1% and 99% tails, and the abnormal return is this return less the corresponding industry return. Continuously compounded returns are calculated as log(1+untransformed return). Industry returns are the corresponding returns on the equally weighted portfolio of all firms in the same Fama-French industry, rebalanced monthly. Industry abnormal returns are calculated by subtracting the CRSP equally weighted index return over the same period. The figures in panels C and D of this table are calculated over the subset of observations in which the CEO has tenure for at least 24 months at the start of the turnover observation window. * significant at the 10% level, ** significant at the 5% level, and *** significant at the 1% level.

It is worth commenting briefly on the large differences between generic and forced turnover rates in both samples. The generic departure group surely incorrectly includes voluntary departures, while the forced departure group surely excludes many involuntary departures. As discussed above, roughly two-thirds (one-third) of the literature focuses on generic (forced) departures, including GM (JK).
Clearly there is no perfect categorization, although age controls or age restrictions will hopefully adequately adjust for predictable retirements at certain ages. To investigate further, for the subset of observations that are in both sample 1 and sample 2, we systematically look for evidence of severance payments to departed executives. Fee and Hadlock (2004) suggest this as an alternative approach to identify forced turnover events. The forced turnover rate using severance information is 4.81%, nearly triple the rate of 1.54% when using only news articles, and substantially higher than the rate using the Parrino (1997) algorithm. This indicates that relying on press characterizations to identify forced turnover will likely miss a large number of relevant events.14 In untabulated results, we have also examined post-turnover labor market opportunities of sample CEOs after turnover. Regardless of whether the event is characterized as forced, few CEOs resurface in elite positions, and those that do tend to obtain jobs that appear inferior to their prior positions.15 This again suggests that many “voluntary” departures are far from voluntary. Given these observations, it would appear prudent to check results for both relatively liberal and conservative definitions of turnover, to ensure that findings are not driven by the peculiarities of a given categorization scheme.

2.2 Measuring firm and industry performance

As discussed above, the most common performance metric used in the prior literature is some measure of abnormal stock returns in the year prior to the window in which turnover is predicted. Most authors use a simple version of the firm’s buy-and-hold annual return, less a measure of benchmark performance, for example, industry returns. We follow JK’s suggested improvement on this approach and measure a firm’s abnormal performance as the residual from a sample-wide regression of firm returns against an industry benchmark.
This procedure essentially amounts to using the firm’s return less some sample-wide industry beta estimate times the contemporaneous corresponding industry benchmark return.16 We refer to this measure as the firm’s untransformed abnormal return. As discussed earlier, the theory is ambiguous on whether returns should be transformed to follow an alternative distribution. A minority of authors, including GM, adopt a transformed returns approach. Thus, as an alternative to the default choice of untransformed returns, we consider a continuously compounded alternative (i.e., log of 1+ return) suggested by the work of GM, and a percentile version of returns suggested by the work of Aggarwal and Samwick (1999a, 1999b, 2003). In using these alternatives, we first transform the data and then define the abnormal return as the residual from a sample-wide regression of the transformed firm returns against the transformed industry benchmarks. In constructing a measure of industry benchmark performance, there are a host of choices related to the composition of the industry, weighting schemes, rebalancing rules, and the population over which any regression adjustment is performed. We follow JK in choosing equally weighted benchmarks as our initial choice using the Fama-French 48 industry categories (the industry “other” is excluded). Since industry composition can change during the year, we rebalance all portfolios on a monthly basis and compound the derived monthly return over the fiscal year. We also consider value-weighted returns for the same industry portfolio in some of our robustness checks. We follow JK and define industry broadly to include all firms listed on Compustat/CRSP, but we have experimented with restricting the industry to within-sample firms, and the main results we report below are substantively unchanged. 
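To make the regression adjustment concrete, the following is a minimal sketch of how an abnormal return could be computed as the residual from a sample-wide regression of firm returns on the industry benchmark, with an optional log(1 + return) transform applied first. The function names, the inclusion of an intercept, and the toy return values are our illustrative assumptions and are not taken from the paper's actual code or data; winsorization and the percentile transform are omitted.

```python
import math


def ols_fit(x, y):
    """Return (intercept, slope) from a one-regressor OLS fit of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def abnormal_returns(firm_ret, industry_ret, transform=None):
    """Split firm returns into an industry component and a residual.

    transform=None uses untransformed returns; transform="log" first
    converts both series to continuously compounded returns, log(1 + r).
    Returns (residuals, fitted): the residual is the abnormal firm
    performance and the fitted value is the industry component.
    """
    if transform == "log":
        firm_ret = [math.log1p(r) for r in firm_ret]
        industry_ret = [math.log1p(r) for r in industry_ret]
    intercept, beta = ols_fit(industry_ret, firm_ret)  # assumption: intercept included
    fitted = [intercept + beta * r for r in industry_ret]
    residuals = [f - p for f, p in zip(firm_ret, fitted)]
    return residuals, fitted


# Hypothetical annual returns for four firm-years (illustration only).
firm = [0.10, -0.20, 0.35, 0.05]
industry = [0.08, -0.05, 0.20, 0.02]
abnormal, industry_component = abnormal_returns(firm, industry)
abnormal_log, _ = abnormal_returns(firm, industry, transform="log")
```

By construction, the residuals from such a regression are uncorrelated in sample with the industry benchmark, which is what allows firm-specific and industry-driven performance to be entered as separate regressors.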
In the case of untransformed returns, the associated industry benchmark is the one-year return on the industry portfolio as described and calculated above (with the firm’s own return excluded in the calculation). The continuously compounded benchmark is simply the log of (1 + untransformed industry return). The percentile version of the industry benchmark is based on the equally weighted industry portfolio of the firm relative to the contemporaneous annual cohort. These benchmarks are used in the regression adjustment procedure described above to measure abnormal firm performance.

2.3 A first look at performance

To provide some initial evidence on the relation between turnover and performance, we calculate statistics for abnormal firm and industry performance in a one-year period prior to turnover. These figures are referred to as annual timing statistics when we use the fiscal year prior to the turnover event, and monthly timing statistics when we use the 12-month period ending the month immediately prior to the turnover event. The figures in panel C of Table 1 clearly indicate poor relative firm performance on average before all turnover events, independent of the timing, sample, or performance transformation. In all cases, abnormal firm performance is negative at high levels of significance, with values from -8.09% to -14.92%. The industry performance figures in panel D of Table 1 are less clear. Using annual timing, there is no evidence of poor industry performance prior to turnover. In fact, some of the industry performance figures are positive. With monthly timing, there is some evidence of poor industry performance, with figures suggesting abnormal industry performance on the order of -1% in the year immediately prior to a turnover event. These initial figures hint at a negative industry performance relation to turnover, but one that is both small in magnitude and quite sensitive to timing issues.17

3. The Relation between CEO Turnover and Performance

3.1 Univariate evidence

To investigate the turnover-performance relation more systematically, we calculate turnover rates by performance quintiles. With annual timing, we consider performance over a given fiscal year predicting turnover in the subsequent fiscal year. With monthly timing, we consider performance over each 12-month period ending on the last day of a month predicting turnover in the next month. In the univariate statistics, we multiply monthly turnover rates by 12 to create annualized rates. As expected, the figures in panel A of Table 2 indicate significantly higher turnover rates in the worst abnormal firm performance quintile compared to the best, regardless of timing (annual or monthly), sample (sample 1 or 2), or turnover definition (forced versus generic). Turnover rates rise proportionally more as firm performance worsens when using the overtly forced definition, but more in absolute magnitude when using the generic turnover definition. This suggests that the forced category is more successful at picking up forced events, but also that some involuntary events are missed by the forced definition and are captured in generic turnover.

Table 2 Univariate turnover rates, top and bottom performance quintiles

| | Sample 1: Q5 | Sample 1: Q1 | Sample 1: p-val. | Sample 2: Q5 | Sample 2: Q1 | Sample 2: p-val. |
|---|---|---|---|---|---|---|
| A. Abnormal firm perf., both timings | | | | | | |
| Overt, annual timing | 0.015 | 0.064 | .000 | 0.004 | 0.022 | .000 |
| Generic, annual timing | 0.089 | 0.139 | .000 | 0.084 | 0.161 | .000 |
| Overt, monthly timing | 0.014 | 0.072 | .000 | | | |
| Generic, monthly timing | 0.082 | 0.159 | .000 | | | |
| B. Industry perf., annual timing | | | | | | |
| Overt | 0.030 | 0.035 | .227 | 0.011 | 0.011 | .819 |
| Generic | 0.109 | 0.108 | .832 | 0.119 | 0.110 | .073 |
| Overt, no financials/utilities | 0.034 | 0.033 | .860 | | | |
| Generic, no financials/utilities | 0.116 | 0.107 | .240 | | | |
| C. Industry perf., monthly timing | | | | | | |
| Overt | 0.025 | 0.033 | .061 | | | |
| Generic | 0.102 | 0.112 | .181 | | | |
| Overt, no financials/utilities | 0.027 | 0.033 | .263 | | | |
| Generic, no financials/utilities | 0.105 | 0.114 | .280 | | | |

Each cell with a column heading of Q5 (Q1) reports the annualized turnover rate for firms in the highest (lowest) performance quintile. Abnormal firm performance quintiles are based on the firms’ annual buy-and-hold stock return relative to the same industry-period cohort. Industry performance quintiles are based on the equally weighted industry portfolio of the firm relative to the same period cohort. Overt turnover events in sample 1 are CEO departures that are considered forced using the Parrino (1997) algorithm and in sample 2 are departures characterized by the press as forced. Generic turnover events are all departures, except for those related to control changes, health, death, and job jumps.
Annual timing denotes that performance is calculated over the fiscal year immediately preceding the year in which a turnover event could take place, whereas monthly timing is calculated over the 12-month period immediately preceding the month over which a potential turnover event may occur. Monthly turnover rates are annualized by multiplying by 12. Sample 1 consists of all Execucomp firm-years from 1993 to 2009, and sample 2 is all Compustat firm-years from 1990 to 2006 except financials, utilities, foreign firms, and firms below $10 million in assets. All p-values are for a simple t-test of a difference in turnover rates between the two quintiles. The figures in this table are calculated over the subset of these observations in which the CEO has tenure for at least 24 months at the start of the turnover observation window. Sample 1 with annual (monthly) timing includes 20,828 (249,849) observations, whereas sample 2 with annual timing includes 42,937 observations.

Turning to industry performance, the figures in panel B of Table 2 suggest no strong relation between industry performance and turnover when using annual timing, regardless of sample, definition of turnover, or whether we exclude financials and utilities. Turning to monthly timing in panel C, the figures here are more suggestive of an industry performance relation. In all cases, the worst performance quintile has a higher rate than the best performance quintile, and the differences are larger than the corresponding differences for annual timing. However, the evidence is still weak. Only the overtly forced departures display a significant difference, with a rate of 3.29% (2.53%) for the worst (best) industry performance group, a difference that is only marginally significant (p=.061). It is worth noting that JK report figures of 4.10% and 2.36% when making a similar comparison in a similar sample.
Thus, while our sample hints at a similar relation, the underlying pattern in our data appears much weaker.18

3.2 Initial multivariate evidence

We now consider standard logit models predicting turnover. In addition to abnormal firm performance and industry performance, all models include controls for firm size, CEO tenure, age, an age older than 60 dummy, an ages 63 to 65 dummy, and year dummies.19 Standard errors are clustered by industry, and all observations for CEOs with a tenure of under two years are excluded. We estimate these models for both samples, both definitions of turnover, with monthly and annual timing for sample 1 and annual timing only for sample 2. For ease of comparison with JK, we use untransformed measures of performance in this initial table. These choices result in six models, with primary interest on the firm and industry performance coefficients. For reference, we note that the sample 1/overt turnover/monthly timing model permutation should be a very close match to JK on all dimensions and we label this in Table 3 as the JK match. The sample 1/generic turnover/annual timing model will be the closest match to the GM model and sample choice and is also labeled as such, although the sample period we study clearly differs from theirs.

Table 3 Initial logit model estimates of CEO turnover

| | (1) | (2) | (3) | (4) | (5) | (6) |
|---|---|---|---|---|---|---|
| Panel | A. Annual timing | A. Annual timing | A. Annual timing | A. Annual timing | B. Monthly timing | B. Monthly timing |
| Model | Sample 1, overt | Sample 2, overt | (GM Match) Sample 1, generic | Sample 2, generic | (JK Match) Sample 1, overt | Sample 1, generic |
| Abnormal firm performance | −1.566*** | −1.618*** | −0.476*** | −0.423*** | −1.881*** | −0.699*** |
| | [0.142] | [0.321] | [0.069] | [0.040] | [0.303] | [0.067] |
| Industry performance | −0.597*** | −0.755** | 0.052 | 0.093 | −1.076*** | −0.311** |
| | [0.202] | [0.336] | [0.185] | [0.088] | [0.372] | [0.143] |
| Age | 0.025*** | 0.004 | 0.039*** | 0.018*** | 0.023*** | 0.039*** |
| | [0.009] | [0.008] | [0.005] | [0.004] | [0.008] | [0.005] |
| Ages 60+ dummy | −0.935*** | −0.198 | 0.408*** | 0.345*** | −0.950*** | 0.402*** |
| | [0.187] | [0.187] | [0.077] | [0.079] | [0.196] | [0.076] |
| Ages 63–66 dummy | −0.548 | −0.111 | 0.584*** | 0.403*** | −0.464* | 0.504*** |
| | [0.360] | [0.289] | [0.098] | [0.071] | [0.278] | [0.095] |
| Log Assets | 0.051* | 0.343*** | 0.025* | 0.035*** | 0.058* | 0.037*** |
| | [0.027] | [0.034] | [0.014] | [0.011] | [0.033] | [0.013] |
| Tenure | −.003*** | −0.022** | −0.002*** | −0.025*** | −0.003*** | −0.002*** |
| | [0.001] | [0.011] | [0.000] | [0.002] | [0.001] | [0.000] |
| Obs. | 18,215 | 35,239 | 19,786 | 39,292 | 214,301 | 222,732 |
| Hosmer-Lemeshow | 30.98*** | 34.74*** | 11.08 | 29.16*** | 49.53*** | 3.14 |
| | [0.000] | [0.000] | [0.197] | [0.000] | [0.000] | [0.925] |
| Prob. turn. firm perf. 10th/90th | .046/.011 | .016/.002 | .118/.077 | .131/.080 | .047/.008 | .125/.066 |
| Prob. turn. ind. perf. 10th/90th | .027/.019 | .007/.005 | .094/.097 | .102/.107 | .027/.015 | .099/.083 |

Each column reports traditional coefficient estimates from a single logit model in which turnover of the indicated type in the indicated sample is predicted. Asymptotic standard errors are reported in brackets under each coefficient estimate. The dependent variable assumes a value of 1 for turnover events of the indicated type, missing for other turnover events, and 0 for no turnover.
Panel A models use performance over one fiscal year to predict turnover in the subsequent year, whereas panel B models use performance over each 12-month period to predict turnover in the next calendar month. The two samples and two types of turnover modeled are described in the text and in prior tables. Each model is estimated only over the set of observations in which the CEO has tenure for at least 24 months at the start of the turnover observation window. Abnormal firm performance is the residual from a sample-wide regression predicting a firm’s stock returns against an industry return measure. Industry performance is the predicted value from this regression. Firm performance is winsorized at the 1% tails. All age variables are based on the CEO’s age as of the fiscal year end prior to the turnover observation window. Log of assets is based on the firm’s most recent inflation-adjusted total book assets figure prior to the turnover observation window. CEO tenure is measured in months as of the start of the turnover window. The figure under each Hosmer-Lemeshow statistic is the p-value from a chi-square test based on this statistic for the null of a correctly specified model. The penultimate and final rows in the table report the implied annual probability of turnover in decimal form derived from the column's logit model estimates as abnormal firm performance and industry performance respectively are varied from the 10th to 90th percentile levels, holding all other variables constant at the sample mean. In the case of the monthly timing models, we derive annual implied probabilities by multiplying monthly probabilities by 12. All models include year effects. Standard errors are clustered at the industry level. * significant at the 10% level, **significant at the 5% level, and ***significant at the 1% level.

As the resultant estimates reveal in row 1 of Table 3, the coefficient on abnormal firm performance is negative and highly significant in all models.
Thus, we see again that the negative relation between firm abnormal performance and turnover is evident no matter how turnover is defined, how the sample is selected, or which timing convention is selected. This is comforting and echoes both our earlier univariate evidence and the long prior literature examining this relation. In the second to last row of Table 3 we report the implied annual probabilities of turnover derived from the model estimates when firm performance is varied from the 10th to 90th percentile. These figures reveal an effect of reasonably large size, with turnover rates for the poorest performers in the overt (generic) turnover models in all cases at least four times the rate of (4 percentage points greater than) the best performers.

The initial picture for industry performance emerging from row 2 of Table 3 is somewhat more supportive of an underlying relation than our preceding univariate evidence would suggest. We estimate a significant negative coefficient on industry performance for overtly forced turnover regardless of timing or sample. In addition, the industry performance coefficient is negative and significant in one of three generic turnover models (sample 1, monthly timing). However, the industry performance coefficient is positive and insignificant in the two generic turnover models with annual timing. These initial coefficients are consistent with both JK and GM, as the two models that are the closest matches to each of these authors (columns 5 and 3, respectively) reveal coefficients that are consistent with what they report with respect to the role of industry performance in CEO turnover. In the cases in which the estimated coefficient on industry performance is negative, the implied probabilities reported in the final row of Table 3 suggest an industry effect that is much smaller in economic magnitude than the firm performance effect.
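The implied probabilities in the last two rows of Table 3 can be illustrated with a minimal sketch. It evaluates the logistic function at the 10th and 90th percentile of one performance variable while holding the contribution of all other covariates fixed, and it annualizes monthly probabilities by multiplying by 12, as described in the table notes. The coefficients below come from Table 3, but the base index (intercept plus other covariates at their means) and the percentile values are hypothetical placeholders, since they are not reported in the text; the outputs therefore do not reproduce the tabulated probabilities.

```python
import math


def logit_prob(linear_index):
    """Logistic function: maps a log-odds index to a probability."""
    return 1.0 / (1.0 + math.exp(-linear_index))


def implied_turnover_probs(base_index, beta, x_p10, x_p90, periods_per_year=1):
    """Implied turnover probabilities at the 10th and 90th percentile of x.

    base_index: log-odds contribution of the intercept and all other
        covariates held at their sample means (hypothetical here).
    beta: logit coefficient on the performance variable being varied.
    periods_per_year: 1 for annual timing, 12 for monthly timing; the
        per-period probability is simply multiplied by this number.
    """
    p10 = logit_prob(base_index + beta * x_p10) * periods_per_year
    p90 = logit_prob(base_index + beta * x_p90) * periods_per_year
    return p10, p90


# Column (1) coefficient on abnormal firm performance; other inputs hypothetical.
annual_p10, annual_p90 = implied_turnover_probs(-3.5, -1.566, -0.40, 0.45)

# Column (5) coefficient with monthly timing; other inputs hypothetical.
monthly_p10, monthly_p90 = implied_turnover_probs(
    -6.0, -1.881, -0.40, 0.45, periods_per_year=12
)
```

Because the logistic function is nonlinear, the size of the implied probability gap depends on the base index as well as on the coefficient, which is one reason to report implied probabilities alongside raw logit coefficients.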
3.3 Robustness concerns

While these initial multivariate models provide some support for the presence of an industry performance role in CEO turnover (a significant negative coefficient in 4 of 6 models in Table 3), there are reasons to be concerned. First, the earlier univariate evidence was substantially weaker, with no evident relation when using annual timing and a very weak relation with monthly timing. Second, two of the six logit models in Table 3, notably the annual timing/generic turnover models that most closely match GM, do not even hint at an industry performance relation; in fact, the point estimates are positive (but insignificant). These observations suggest that substantive robustness issues and/or specification errors are likely associated with these initial models.

To provide some information on model fit, we report in Table 3 the Hosmer-Lemeshow test statistic for goodness of fit of a logit model specification. This test aggregates the sum of the squares of the difference between the predicted and actual number of events based on sample deciles. A larger test statistic is associated with a poorer model fit, and an asymptotic p-value for the test statistic is available based on a chi-squared distribution under the null of a correctly specified model. As the figures in Table 3 reveal, in all three of the models of overtly forced turnover the test statistic is large and the null is easily rejected. These are the models that offered the strongest evidence of an industry performance role in turnover. The generic turnover models fare better, with 2 of the 3 models not rejected at conventional levels. Clearly these findings do not inspire confidence in the particular logit model specifications of forced turnover reported in Table 3, or the corresponding discretized Cox duration models.
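As an illustration of the goodness-of-fit check described above, the following is a minimal sketch of a Hosmer-Lemeshow calculation. It assumes the standard textbook form of the statistic: observations are sorted by fitted probability into ten groups, each squared difference between observed and expected events is scaled by the group's binomial variance, and the sum is compared with a chi-squared distribution with (groups − 2) degrees of freedom. The scaling, the grouping rule, and the function names are our assumptions; the text does not spell out these details.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail chi-squared probability; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i  # builds (x/2)^i / i!
        total += term
    return math.exp(-half) * total


def hosmer_lemeshow(y, p, groups=10):
    """Return (statistic, p_value) for 0/1 outcomes y and fitted probabilities p."""
    order = sorted(range(len(y)), key=lambda i: p[i])
    n = len(order)
    stat = 0.0
    for g in range(groups):
        idx = order[g * n // groups:(g + 1) * n // groups]
        n_g = len(idx)
        observed = sum(y[i] for i in idx)   # actual number of events
        expected = sum(p[i] for i in idx)   # predicted number of events
        mean_p = expected / n_g
        stat += (observed - expected) ** 2 / (n_g * mean_p * (1.0 - mean_p))
    return stat, chi2_sf_even_df(stat, groups - 2)


# p-values implied by two of the statistics reported in Table 3, assuming 8 df.
p_col3 = chi2_sf_even_df(11.08, 8)
p_col6 = chi2_sf_even_df(3.14, 8)
```

Under the assumption of ten groups and 8 degrees of freedom, the closed form gives roughly 0.197 for a statistic of 11.08 and roughly 0.925 for a statistic of 3.14, in line with the bracketed p-values in columns (3) and (6) of Table 3.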
3.4 Exploring robustness

In light of our concerns about robustness and poor fit of many of these initial models, we are left with an unclear picture on the degree to which industry performance is related to CEO turnover. The default choices in our initial models are made to closely match JK, and to a lesser extent GM, to ease comparisons to the prior literature. However, as we discuss earlier, there are a large number of reasonable model alterations suggested by prior work. In order to get a clearer picture of the underlying economic relation between industry performance and CEO turnover, we consider here several of these alterations. In this analysis, which we report in Table 4, we tabulate only the industry-performance coefficients, as the abnormal firm performance coefficients are, as expected, negative and highly significant in all models.

Table 4 Industry performance coefficients for an alternative model of CEO turnover

| | Cont. comp. perf. | Percentile perf. | Linear regress | Drop firm perf. | Value weight |
|---|---|---|---|---|---|
| A. Fiscal year timing | | | | | |
| Sample 1, overt | −0.269 | 2.161 | −0.004 | −0.166 | −0.547** |
| | [0.237] | [1.949] | [0.006] | [0.206] | [0.251] |
| Sample 2, overt | −0.317 | 2.844 | 0.000 | −0.053 | −0.524 |
| | [0.260] | [2.284] | [0.002] | [0.239] | [0.399] |
| Sample 1, generic (GM match) | 0.011 | −0.620 | 0.009 | 0.090 | 0.040 |
| | [0.234] | [1.020] | [0.016] | [0.152] | [0.180] |
| Sample 2, generic | 0.112 | −1.361 | 0.016* | 0.165** | 0.149 |
| | [0.120] | [0.875] | [0.009] | [0.080] | [0.113] |
| B. Monthly timing | | | | | |
| Sample 1, overt (JK match) | −0.391 | 2.189 | −0.001* | −0.544* | −0.823** |
| | [0.300] | [2.122] | [0.001] | [0.285] | [0.358] |
| Sample 1, generic | −0.224 | 0.978 | −0.002 | −0.187 | −0.240* |
| | [0.148] | [0.858] | [0.001] | [0.128] | [0.136] |

Each cell in this table, except for the linear regression column, reports an estimated industry coefficient from a logit model predicting turnover. Asymptotic standard errors are reported in brackets under each coefficient estimate. Each model corresponds to a model in Table 3 with all model choices the same except the modification indicated in the column heading. The continuously compounded performance column replaces the firm and industry performance measures in Table 3 with continuously compounded versions of these measures by using a log transform. The percentile performance column replaces these measures with percentile measures based on the annual cohort. The linear regression column estimates linear regressions with a 0/1 dependent variable and all control variables identical to the corresponding logit models in Table 3. The dropped firm performance column estimates the same models used in Table 3 but with the firm performance variable dropped from the estimation. The value-weighted column replaces the equally weighted industry performance benchmark with a corresponding value-weighted measure. All other model choices and labeling conventions in the models that lead to these coefficients are identical to the corresponding models and specifications in Table 3.
*significant at the 10% level, **significant at the 5% level, and ***significant at the 1% level.

As we discuss earlier, a case can be made for using continuously compounded returns or return percentiles as measures of performance. These alternatives place different weights on performance variations compared to the simple untransformed return measure. When we make these alterations, the industry performance coefficients, reported in the first two columns of Table 4, become insignificant in all models. An alternative to nonlinear logit models, with some desirable properties, along with notable shortcomings, is the simple linear probability model (see Angrist and Pischke 2009). As we report in column 3 of Table 4, when we use this alteration, the industry performance results again become generally insignificant across models, with one negative and marginally (10%) significant coefficient in the model that most closely matches JK (sample 1, overt turnover, monthly timing). It is worth noting that with each of these robustness checks, the corresponding model coefficient on abnormal firm performance is always negative and highly significant (untabulated). The goodness of fit of the models that use the transformed measures of performance is generally at least as good as that of the models using the untransformed measures, as indicated by similar or lower values for the Hosmer-Lemeshow statistic (figures untabulated). Our evidence that these seemingly innocuous model checks do not confirm the Table 3 findings, along with the earlier univariate evidence, raises serious doubt about the robustness of any estimated turnover-industry performance relation.

As an additional check, we estimate the default Table 3 models but drop abnormal firm performance as an explanatory variable.
As we report in the fourth column of Table 4, this step of dropping an orthogonal variable from the model reveals only one significant negative coefficient on industry performance (out of 6), and this is only significant at the 10% level in the model that most closely matches JK.20 This change must reflect the nonlinearity of the logit model, as the dropping of an orthogonal variable in a traditional ordinary least-squares (OLS) linear regression cannot have an effect on other coefficient estimates. Given the varying treatment in the literature, we also experiment with using a value-weighted performance benchmark in place of an equally weighted industry benchmark. JK report that their evidence of an industry performance relation with turnover also holds when using a value weighted measure. As we report in the final column of Table 4, in our sample this modification slightly weakens the evidence regarding a turnover-industry performance relation in some of the Table 3 models. We have experimented with a variety of other model alterations, including (a) restricting attention only to S&P 1500 firms, (b) dropping financials and utilities, (c) measuring industry performance using only the sample rather than the entire CRSP/Compustat universe, and (d) restricting attention to executives under the age of 60. While there are some minor changes in findings with some of these alterations, the basic conclusions are substantively unchanged (results untabulated). In particular, with these changes, some of the initial Table 3 models indicate a negative industry performance coefficient, but this evidence almost always disappears, or weakens drastically, when moving to the Table 4 robustness checks. The JK study presents some evidence suggesting that industry performance may play a different role in turnover for strong versus weak performers. In particular, CEOs with weak relative performance may be able to hide behind good industry performance if they are credited with good luck. 
On the other hand, CEOs with strong relative performance may be inappropriately blamed for bad luck (i.e., bad industry performance). The JK evidence suggests that the former behavior may be more common than the latter. To investigate, in Table 5 we separate the sample into below-average (panel A) and above-average (panel B) relative firm performance subsamples, and then we compare differences in turnover rates within each subsample dependent on whether industry performance was low (below median) or high (above median). As the table reports, all of the differences across the industry performance groups are insignificant, regardless of sample, timing, or turnover definition.

Table 5 Credit for good luck versus blame for bad luck
  High industry perf.  Low industry perf.  p-value
A. Poor performers and hiding behind good luck
Sample 1, annual: Poor firm performance, overt turnover  0.042  0.048  .182
Sample 2, annual: Poor firm performance, overt turnover  0.014  0.016  .399
Sample 1, monthly: Poor firm performance, overt turnover  0.041  0.049  .103
Sample 1, annual: Poor firm performance, generic turnover  0.129  0.120  .172
Sample 2, annual: Poor firm performance, generic turnover  0.136  0.132  .461
Sample 1, monthly: Poor firm performance, generic turnover  0.127  0.133  .412
B. Strong performers and blame for bad luck
Sample 1, annual: strong firm performance, overt turnover  0.019  0.018  .715
Sample 2, annual: strong firm performance, overt turnover  0.005  0.007  .149
Sample 1, monthly: strong firm performance, overt turnover  0.014  0.017  .324
Sample 1, annual: strong firm performance, generic turnover  0.094  0.092  .799
Sample 2, annual: strong firm performance, generic turnover  0.092  0.091  .921
Sample 1, monthly: strong firm performance, generic turnover  0.089  0.093  .518

Each row in panel A (B) identifies observations with below (above) median abnormal firm performance relative to the same time period cohort for the indicated sample. The figures in the high industry (low industry) performance column indicate the turnover rate of the indicated type for firms with above (below) median industry performance over the same period. Firm performance cuts are based on the firms’ annual buy-and-hold stock return relative to the same industry/period cohort. Industry performance groupings are based on the equally weighted industry portfolio of the firm relative to the same period cohort. Annual timing denotes that performance is calculated over the fiscal year immediately preceding the year in which a turnover event could take place, whereas monthly timing is calculated over the 12-month period immediately preceding the month over which a potential turnover event may occur.
Monthly turnover rates are annualized by multiplying by 12. All p-values are for a simple t-test of a difference in turnover rates between the two industry performance groups in a given row. The figures in this table are calculated over the subset of observations in which a CEO has a tenure of at least 24 months at the start of the turnover observation window. All other definitions and variables are as in the earlier tables.

4. Making sense of the evidence

4.1 Synthesizing the evidence

Summarizing the preceding findings, the negative relation between CEO turnover and abnormal firm performance is strong and evident no matter what modeling choices we make. The relation between turnover and industry performance is much weaker. The univariate evidence is nonexistent with annual timing and quite weak with monthly timing. The multivariate evidence is moderate under some initial modeling assumptions, but collapses or weakens drastically when (a) the performance metric is transformed, (b) linear regression is used in place of a logit model, or (c) abnormal firm performance is dropped from the model. In general, the evidence for the presence of an industry performance relation with turnover is stronger for sample 1 (large Execucomp firms), monthly timing rather than annual timing (JK timing rather than GM timing), and untransformed measures of returns (JK measurement versus GM measurement). However, even with the most supportive choices, the case for a robust relation between CEO turnover and industry performance in our data and samples appears weak at best. Given the hints of an industry performance relation when we use monthly timing, we more closely examine industry performance in an extended window preceding turnover events in sample 1. We examine the 24-month period ending the month before the turnover departure date (-24 to -1), along with the two annual subintervals denoted by -24 to -13 and -12 to -1.
As we report in the first two panels of Table 6 (see Column 2), average abnormal industry performance is insignificantly negative over the full 24-month period for both definitions of turnover. For the 12-month subintervals, average industry abnormal performance is small, positive, and insignificant in the -24- to -13-month period, and small, negative, but statistically significant in the -12- to -1-month interval. This pattern aggregates to an insignificant average over the entire 2-year period, regardless of whether we use overtly forced events (panel A) or generic turnover events (panel B). Even the significant negative industry performance patterns over the -12- to -1-month window are small in magnitude, and their statistical significance implicitly depends on certain parametric assumptions.

Table 6 Turnover and performance over 24 months
A. Performance prior to turnover, overt
  Abnormal firm performance  Abnormal industry performance
Months -24 to -1  −0.387***  −0.012
Months -24 to -13  −0.140***  0.006
Months -12 to -1  −0.216***  −0.027***
B. Performance prior to turnover, generic
  Abnormal firm performance  Abnormal industry performance
Months -24 to -1  −0.182***  −0.003
Months -24 to -13  −0.077***  0.006
Months -12 to -1  −0.096***  −0.011**
C. Turnover rates by 2-year performance quintile
  Firm performance  Industry performance
Overt turnover rate, Q5 performance  0.011  0.029
Overt turnover rate, Q1 performance  0.073  0.032
p-val., Q1 versus Q5  .000  .502
Generic turnover, Q5 performance  0.073  0.109
Generic turnover, Q1 performance  0.168  0.113
p-val., Q1 versus Q5  .000  .564

Figures in panels A and B are differences in abnormal firm and industry performance for turnover vs. nonturnover events, with p-values calculated using a t-test across groups. In panels A and B, we calculate the mean level of abnormal firm and industry performance over the 24-month period ending at the end of the calendar month immediately preceding the turnover event, and also averages over the two 12-month subintervals that compose this 24-month period. Panel A is for overtly forced events, and panel B is for generic turnover events. A firm’s untransformed stock return is calculated as the firm’s buy-and-hold return winsorized at the 1% and 99% tails, and the abnormal return is this return less the corresponding industry return. Industry returns are the returns on the equally weighted portfolio of all firms in the same Fama-French industry, rebalanced monthly. Industry abnormal returns are calculated by subtracting the CRSP equally weighted index return for the same period. Panel C reports turnover rates for the highest (Q5) and lowest (Q1) quintile performers over the entire 24-month window based on firm performance (first column of figures) or industry performance (second column of figures). Firm performance quintiles are based on the firms’ annual buy-and-hold stock return relative to the same industry/period cohort.
Industry performance quintiles are based on the equally weighted industry portfolio of the firm relative to the same period cohort. Turnover rates are converted from a monthly figure to an annualized figure by multiplying by 12. The figures in this table are calculated over the subset of observations in which the CEO has a tenure of at least 24 months at the start of the turnover observation window. The p-values in panel C indicate whether the Q5 and Q1 turnover rates immediately above the indicated p-value differ significantly using a simple t-test. *significant at the 10% level, **significant at the 5% level, and ***significant at the 1% level. This evidence may help explain our earlier findings and related findings in the literature. When we use annual timing, a turnover event in a 1-year window is predicted using performance in the prior year. This implies an average 6-month lag between the event and the end of the performance measurement window, corresponding to months -18 to -7. With monthly timing, the -12 to -1 period is used to predict turnover in the next month. Given the hints of a slight positive industry performance trend in the more distant months, and a slight negative trend in the later months, the strongest (weakest) evidence for an industry performance relation with CEO turnover will occur with timing that is measured immediately (with a delay) preceding a possible event. To provide context for this evidence, we note that abnormal firm performance is negative with large magnitudes over all of these windows, as reported in the first column of Table 6. Thus, it appears that the entire 24-month window is relevant in the performance evaluation process that governs turnover decisions. As we report in panel C of Table 6, if we consider this longer window, there is no evidence of significant differences in CEO turnover rates by industry performance quintiles. 
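The timing arithmetic behind the two conventions can be written out explicitly. The following is only a sketch under an assumption made here for illustration, namely that turnover events are spread uniformly over the 12-month event window; the symbol τ (months elapsed between the fiscal year-end and the event) is introduced for this purpose and does not appear elsewhere in the paper. Month -k is taken to denote the interval [-k, -k+1] in event time.

```latex
\begin{align*}
\tau &\sim U(0, 12), \qquad E[\tau] = 6 \\
\text{performance window in event time} &= [-12 - \tau,\; -\tau] \\
\text{annual timing, evaluated at } E[\tau] &: \; [-18,\; -6]
  \;\Rightarrow\; \text{months } -18 \text{ to } -7 \\
\text{monthly timing } (\tau = 0) &: \; [-12,\; 0]
  \;\Rightarrow\; \text{months } -12 \text{ to } -1
\end{align*}
```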
Moreover, if we estimate logit models with monthly timing using these 24-month performance windows, the industry performance coefficient is insignificant using any of the modeling permutations from the first four columns of Table 4 (results untabulated).

4.2 Additional insights on modeling turnover

While the preceding evidence offers little convincing support for the presence of an economically important industry performance relation to turnover, it is concerning that some models do indicate a relation that quickly disappears with small modeling changes. This suggests that many common turnover models may be fairly sensitive to specification choices. In addition, the Hosmer-Lemeshow test statistics indicate that many common turnover model specifications may fit the data quite poorly. Given these observations, it would appear prudent to thoroughly explore the robustness of any given turnover finding to a variety of model alterations. While this general recommendation applies in almost all empirical finance settings, our evidence and the mixed results in the past literature suggest that this is a particularly acute issue in studies of CEO turnover. To provide some additional insights on modeling choices, and also to more fully investigate the sensitivity of turnover findings to small modeling changes, we conduct a simulation analysis. In particular, we first create CEO turnover data that are mechanically constructed to depend only on a firm’s relative-to-industry continuously compounded performance. We accomplish this by using the coefficients from the models in the first column of Table 4 for all variables, except for industry performance, to derive an implied turnover probability for each observation. The industry performance coefficient is set to 0 in this calculation. We then use a random number generator to create a 0/1 turnover outcome for each observation governed by this predicted probability.
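The data-generating step just described can be sketched in a few lines of Python. This is a simplified illustration, not the procedure actually used: the coefficient values, the variable names, and the randomly generated observations are all hypothetical, and the control variables are omitted so that only the intercept and abnormal firm performance enter the index.

```python
import math
import random

def turnover_probability(obs, coefs):
    """Implied logit probability with the industry coefficient forced to 0."""
    index = (
        coefs["intercept"]
        + coefs["firm_perf"] * obs["abn_log_return"]
        + 0.0 * obs["industry_log_return"]  # industry performance plays no role by construction
    )
    return 1.0 / (1.0 + math.exp(-index))

def simulate_turnover(observations, coefs, rng):
    """Draw one 0/1 turnover outcome per observation from its implied probability."""
    return [1 if rng.random() < turnover_probability(o, coefs) else 0 for o in observations]

# Hypothetical coefficients and observations (assumptions for this sketch only).
coefs = {"intercept": -3.0, "firm_perf": -1.5}
rng = random.Random(2017)
observations = [
    {"abn_log_return": rng.gauss(0.0, 0.4), "industry_log_return": rng.gauss(0.0, 0.2)}
    for _ in range(500)
]

# One simulated outcome vector per replication; 1,000 replications as in the text.
simulated_samples = [simulate_turnover(observations, coefs, rng) for _ in range(1000)]
```

Each element of `simulated_samples` would then be passed to whatever estimator is under study; the estimation step itself is left out of this sketch.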
The resultant data set is a simulated sample that mechanically displays by construction a particular version of full RPE (i.e., with continuous compounding), and otherwise agrees with the original data set. We create 1,000 such simulated data sets for each of the six models in the first column of Table 4. We take these simulated data and estimate the corresponding logit model specifications presented in Table 3. That is, we estimate models with (inappropriate) untransformed measures of abnormal firm performance and industry performance. Since the samples are mechanically constructed to display no industry performance relation when performance is measured in a certain way (i.e., with continuously compounded returns), we can assess model performance by observing whether estimating the wrong model (i.e., one that assumes untransformed returns) leads to an inflated or deflated rate of false positives on the estimated industry performance coefficients. We report in the first column of Table 7 the fraction of the 1,000 simulated data sets for each model in which the industry performance coefficient is significant at the 5% level using a two-tailed test. A rate substantially above 5% suggests that a small error in the model can incorrectly indicate an underlying relation, even when such a relation is, by construction, not present. As the figures in the table reveal, in all cases the rate of false positives far exceeds 5%.21 The highest error rate appears to be in the monthly timing, overt turnover, sample 1 model (i.e., the JK match specification), as in this case over 90% of the simulated data sets indicate a significant industry coefficient even when no industry relation is present. We report next to these estimates the fraction of the time that the Hosmer-Lemeshow test statistic indicates that the (incorrect) model specification should be rejected at the 5% level.
The highest rate of model rejection is for this same model (56.30%), lending some confidence to this test as a way to detect model misspecification.

Table 7 Simulation-based evidence on industry performance coefficients and turnover modeling
  Full logit model industry coefs. signif. at 5%  Full logit model HL test signif. at 5%  Logit model no firm perf. industry coefs. signif. at 5%  OLS/linear probability model industry coefs. signif. at 5%
Sample 1: Overt, annual  .4470  .3050  .0670  .0710
Sample 1: Generic, annual (GM Match)  .0920  .0490  .0640  .0740
Sample 2: Overt, annual  .5430  .2480  .1390  .1130
Sample 2: Generic, annual  .1800  .3280  .1300  .0970
Sample 1: Overt, monthly (JK Match)  .9120  .5630  .0520  .0700
Sample 1: Generic, monthly  .1830  .0930  .0940  .0820

In each row, we randomly create turnover data using a random number generator where the probability of turnover for each observation obeys the implied probabilities of turnover from the coefficients in the corresponding model of the first column of Table 4 (continuously compounded performance with sample, timing, and turnover definition as indicated) but with the industry performance coefficient set equal to zero. For each model, we create 1,000 simulated samples.
We then estimate the corresponding Table 3 model on each simulated data set with turnover assumed to be a function of untransformed abnormal firm performance and industry performance. The first column of figures in the table reports the fraction of simulated data sets in which the estimated industry performance coefficient is (spuriously) significant at the 5% level. The second column indicates the fraction of cases in which the corresponding Hosmer-Lemeshow test rejects the hypothesis of an appropriately specified model at the 5% significance level. The third column indicates the fraction of cases in which the industry performance coefficient is significant at the 5% level when we drop firm performance from the logit estimation on the simulated data. The final column indicates the fraction of cases in which an estimated linear probability (OLS) model corresponding to the Table 3 models (including both firm and industry performance) estimated over the simulated data indicates a significant industry performance coefficient at the 5% level.

In the final two columns of Table 7, we report the frequency of false positives on the industry performance coefficient when we (a) drop the abnormal firm performance variable and (b) use linear regression rather than logit. While these figures are slightly elevated compared to the 5% we would expect by chance, they are far lower than when the full incorrect logit model is estimated. This suggests that these checks are a reasonable way to examine the robustness of a given finding in light of the apparent extreme sensitivity of nonlinear logit models to seemingly small model misspecifications.

5. Conclusion

The empirical literature on CEO turnover exhibits a wide variety of modeling choices. Some reported findings, most notably the sensitivity of CEO turnover to abnormal firm performance, appear insensitive to specific choices. The evidence is less clear for other findings.
Of particular note, several prominent papers report little sensitivity of CEO turnover to industry performance, whereas others have detected a significant relation. This variation across the literature, coupled with a particularly wide variety of modeling and data choices, raises the general issue of result robustness in the CEO turnover context, as well as the specific issue of whether widespread violations of full relative performance evaluation are present in CEO turnover decisions. In this paper, we investigated these issues by studying two large samples and estimating CEO turnover models that span most of the popular choices in the prior literature. The finding of a significant sensitivity of CEO turnover to abnormal firm performance is never in doubt. It emerges no matter what modeling, sampling, or data choices we considered. The evidence of a significant role for industry performance in CEO turnover appears weak and sensitive to modeling choices. At a univariate level, we have detected some limited evidence of poor industry performance before turnover, but the magnitude is economically small and only apparent over some time windows for some turnover definitions. When we predicted turnover using common multivariate models, we uncovered somewhat stronger evidence of a significant industry performance coefficient, particularly when we (a) used a fairly strict definition of turnover, (b) used a timing convention that emphasizes very recent performance innovations, (c) studied a sample restricted to the largest Compustat firms, and (d) used a performance measure that does not transform the raw performance data. Alterations to (a) or (b) tend to weaken the evidence considerably, whereas alterations to (c) or (d) cause evidence in support of an industry performance relation to effectively disappear in our data. 
In light of our evidence, the case for an important role for industry performance in CEO turnover decisions appears quite limited, at least in the data that we studied. Some hints of this behavior are present, but the hints are faint. It is possible that an industry performance role is stronger and more robust in other samples, but certainly our evidence, which is based on two large samples and a fairly exhaustive set of model permutations, raises much doubt that industry performance plays an important role in most CEO removal decisions. Prior authors have come down on both sides of this question, but these studies tend to limit attention to a smaller set of modeling choices, and samples vary widely across studies. Thus, it is difficult to determine whether it is modeling choices, data differences, or some combination thereof that drive the differing conclusions across studies. Whatever the case, our evidence certainly suggests that there is little in terms of a large industry performance component to typical CEO turnover decisions. This conclusion contrasts with much prior evidence that industry performance plays a strong role in CEO compensation awards, raising interesting questions about the distinct roles played by turnover and compensation policies in managerial incentive contracting. In addition to adding to the evidence on the economic behavior governing turnover decisions, we have established several methodological points regarding modeling turnover decisions. Most importantly, we showed that coefficient estimates in turnover models can be quite sensitive to moderate changes in data definitions and/or timing conventions. When we perturb various modeling choices in small ways, coefficient estimates often sharply change in magnitude, sign, and statistical significance.
In addition, simulations reveal that small model misspecifications can indicate a significant relation with high statistical confidence, for example, industry performance predicting turnover, even when an underlying relation is, by construction, not present. It is entirely possible that the reverse result also holds (i.e., no evidence of a relation in a misspecified model even when one is present). Since any test of an industry performance role in turnover is necessarily a joint test of this hypothesis coupled with the hypothesis that the model is correctly specified, our evidence indicates that much more attention should be paid to specification issues in future research. Ideally, either the underlying theory or an auxiliary empirical analysis can provide guidance on the appropriate specification choice. Although this recommendation generally applies to empirical research, given our evidence, it would appear to be particularly important in the CEO turnover context. The lack of model robustness we have identified, which apparently has not been widely recognized, is concerning and important, as appropriate modeling choices in the CEO turnover context are unclear based on a priori grounds. In light of the fragility we have detected, it would appear prudent for researchers in this area to consider a variety of reasonable modeling and data choices before concluding that an underlying relation is present. Our analysis offers some guidance on choices that may span the reasonable set including: (a) estimating a linear model that corresponds to the usual nonlinear models, (b) experimenting with dropping orthogonal variables from the estimation, (c) considering simple monotonic transformations of the performance measures, and (d) considering a range of alternative timing conventions. Understanding CEO turnover is certainly an important question for financial economists. 
Although we highlighted the challenges in studying this issue, we are hopeful that some of our insights are useful in arriving at robust and definitive conclusions regarding this important dimension of corporate governance.

We thank Ashwini Agrawal, Rajesh Aggarwal, Radha Gopalan, Camelia Kuhnen, Dirk Jenter, Uday Rajan (the editor), and Amit Seru; several anonymous referees; and seminar participants at Alabama, Clemson, Georgia State, Georgetown, Kentucky, Minnesota, Nebraska, NYU, Purdue, Texas A&M, Texas Tech, Tulane, UNC Charlotte, and the 2013 WFA meetings for helpful comments and discussions. Earlier drafts of this paper circulated under different titles. All errors remain our own.

Footnotes
1 A parallel literature on RPE in CEO compensation includes, for example, Aggarwal and Samwick (1999b), Bertrand and Mullainathan (2001), Garvey and Milbourn (2003, 2006), and Albuquerque (2009).
2 This point also has been made by, among others, Kaplan and Minton (2012) and Jenter and Lewellen (2014). See also Eisfeldt and Kuhnen (2013) for an equilibrium model of executive job separations. We present evidence below suggesting that many forced turnover events may be mistakenly labeled as voluntary.
3 As illustrated in many labor economics models (e.g., McLaughlin 1991), the distinction of who instigates a change is often irrelevant for testing efficient learning theories of turnover, as performance signals should symmetrically affect all parties’ assessment of match quality.
4 The Parrino (1997) categorization contains some other elements, but the ones discussed here have the largest empirical effect on how events are categorized.
5 Firms may not directly rely on any of these metrics in their decisions. Instead, firms may rely on other (perhaps unobservable to the econometrician) metrics correlated with or partially captured by these traditional metrics.
6 A closely related alternative, suggested by the work of Bushman, Dai, and Wang (2010), is to scale the performance metric by some measure of return volatility so that returns are measured in units of standard deviations.
7 The Hazarika, Karpoff, and Nahata (2012) results are particularly relevant, as they use a sample and methodology similar to JK, yet their coefficients and figures reveal no significant poor abnormal industry performance before forced turnover.
8 Many of the studies we cite here do not explicitly discuss their coefficient estimates on industry performance, but an inspection of the tabulated coefficients in these studies leads to our characterization of whether the reported findings are consistent or inconsistent with JK.
9 These points are explicitly established by Sueyoshi (1995) and Jenkins (1995). Practitioner-oriented discussions can be found in Beck, Katz, and Tucker (1998) and Singer and Willett (2003). The nohr display option in Stata converts Cox coefficients to values comparable to logit coefficients. We have checked our main results for a model that includes tenure dummies. These coefficients are almost identical to what we report in the main tables using a simple linear trend variable for tenure.
10 Many articles in JSTOR do not include abstracts, and other articles use alternative phrases, such as “Management Turnover.” Our goal is not to survey all articles, but a reasonably representative set. The Journal of Financial Economics is a popular outlet for turnover studies, and it is not included in the JSTOR database.
11 Our timing convention is that a year reference means the end of the fiscal year has been referenced. With annual timing, turnover is measured starting at this point in time, and performance is measured until this point in time.
12 We have hand-checked a set of 50 turnover events that are excluded by these criteria and find that the vast majority are directly related to external events, such as control changes, acquisitions, bankruptcies, or legal events. In the handful of cases not related to external events, there is no evidence of below average industry performance.
13 Turnover is automatically considered overtly forced whenever a press characterization suggests a forced departure. In all other cases, the variable is set equal to 0 if the CEO is over the age of 60. If the CEO is under 60, the variable is set equal to 1, unless the event was announced at least 6 months in advance or the CEO stays on at the firm (typically as Board Chair) after losing the CEO role. In the subset of cases with an available announcement date, the median time difference with the departure date is 0, with over 80% of these two dates falling within 31 days of each other.
14 In untabulated regressions, we have estimated models predicting turnover events accompanied by severance payments as an alternative to models predicting overtly forced departures. Our findings with regard to firm and industry performance using this categorization are quite similar to what we report in the text/tables for generic turnover.
15 These results were reported in an earlier draft of the paper and are available from the authors on request.
16 We winsorize firm returns at the 1% tails before estimating these regressions.
17 JK report summary statistics that indicate substantially poorer average industry performance in the forced turnover group compared to the no turnover group using monthly timing. Using a sample and data treatment similar to that of JK, Hazarika, Karpoff, and Nahata (2012) report almost no difference in industry performance between these groups.
18 In our calculations, we exclude executives with tenures of under 2 years to maintain comparability to the later logit models.
We also measure industry performance relative to the annual cohort. The choices that JK make when reporting corresponding figures are unclear. 19 We do not discuss the coefficients on the control variables in depth, but it is interesting to note that the age greater than 60 dummy is only negative and significant in the overt turnover models, as these are the models that use a variant of the Parrino (1997) algorithm. This suggests a distinct drop in predicted turnover at the age 60 threshold that is artificially generated by the algorithm. 20 If we run analogous specifications with industry performance measured in percentiles and abnormal firm performance excluded, the coefficients on industry performance are insignificant in all cases. 21 We also have experimented with simulations in which turnover data are constructed to depend on untransformed firm abnormal performance and a model is estimated that incorrectly assumes continuously compounded performance. The rate of false positives is also high with this specification error, although slightly lower than what we report in the first column of Table 7. References Aggarwal R. K., Samwick A. A.. 1999a. The other side of the trade-off: the impact of risk on executive compensation. Journal of Political Economy  107: 65– 105. Google Scholar CrossRef Search ADS   Aggarwal R. K., Samwick A. A.. 1999b. Executive compensation, strategic competition, and relative performance evaluation: Theory and evidence. Journal of Finance  54: 1999– 2043. Google Scholar CrossRef Search ADS   Aggarwal R. K., Samwick A. A.. 2003a. Why do managers diversify their firms? Agency reconsidered. Journal of Finance  58: 71– 118. Google Scholar CrossRef Search ADS   Aggarwal R. K., Samwick A. A.. 2003b. Performance incentives within firms: The effect of managerial responsibility. Journal of Finance  58: 1613– 50. Google Scholar CrossRef Search ADS   Albuquerque A. 2009. Peer firms in relative performance evaluation. 
Journal of Accounting and Economics  48: 69– 89. Google Scholar CrossRef Search ADS   Angrist J. D., Pishke J.. 2009. Mostly harmless economics: An empiricist’s companion , Princeton, NJ: Princeton University Press. Barro J. R., Barro R. J.. 1990. Pay, performance, and turnover of bank CEOs. Journal of Labor Economics  8: 448– 81. Google Scholar CrossRef Search ADS   Beck N., Katz J., Tucker R.. 1998. Taking time seriously: Time-series-cross-section analysis with a binary dependent variable. American Journal of Political Science  42: 1260– 88. Google Scholar CrossRef Search ADS   Bertrand M., Mullainathan S.. 2001. Are executives paid for luck? The ones without principals are. Quarterly Journal of Economics  116: 901– 32. Google Scholar CrossRef Search ADS   Bushman R., Dai Z., Wang X.. 2010. Risk and CEO turnover. Journal of Financial Economics  96: 381– 98. Google Scholar CrossRef Search ADS   Cornelli F., Kominek Z., Ljungqvist A.. 2013. Monitoring managers: Does it matter? Journal of Finance  68: 431– 81. Google Scholar CrossRef Search ADS   Eisfeldt A. L., Kuhnen C.. 2013. CEO turnover in a competitive assignment framework. Journal of Financial Economics  103: 351– 72. Google Scholar CrossRef Search ADS   Engel E., Hayes R. M., Wang X.. 2003. CEO turnover and properties of accounting information. Journal of Accounting and Economics  36: 197– 226. Google Scholar CrossRef Search ADS   Fee C. E., Hadlock C. J.. 2004. Management turnover across the corporate hierarchy. Journal of Accounting and Economics  37: 3– 38. Google Scholar CrossRef Search ADS   Fee C. E., Hadlock C. J., Pierce J. R.. 2013. Managers with and without style: Evidence using exogenous variation. Review of Financial Studies  26: 567– 601. Google Scholar CrossRef Search ADS   Garvey G., Milbourn T.. 2003. Incentive compensation when executives can hedge the market: Evidence of relative performance evaluation in the cross-section. Journal of Finance  58: 1557– 81. 
Google Scholar CrossRef Search ADS   Garvey G., Milbourn T.. 2006. Asymmetric benchmarking in compensation: Executives are paid for good luck but not punished for bad. Journal of Financial Economics  82: 197– 225. Google Scholar CrossRef Search ADS   Gibbons R., Murphy K. J.. 1990. Relative performance evaluation for chief executive officers. Industrial and Labor Relations Review  43: 30– 52. Google Scholar CrossRef Search ADS   Gopalan R., Milbourn T., Song F.. 2010. Strategic flexibility and the optimality of pay for sector performance. Review of Financial Studies  23: 2060– 98. Google Scholar CrossRef Search ADS   Hadlock C. J., Lumer G. B.. 1997. Compensation, turnover, and top management incentives: Historical evidence. Journal of Business  70: 153– 87. Google Scholar CrossRef Search ADS   Hadlock C. J., Lee S., Parrino R.. 2002. Chief executive officer careers in regulated environments: Evidence from electric and gas utilities. Journal of Law and Economics  45: 535– 63. Google Scholar CrossRef Search ADS   Hazarika S., Nahata R., Karpoff J.. 2012. Internal corporate governance, CEO turnover, and earnings management. Journal of Financial Economics  104: 44– 69. Google Scholar CrossRef Search ADS   Holmström B. 1982. Moral hazard in teams. Bell Journal of Economics  13: 392– 415. Google Scholar CrossRef Search ADS   Huang S., Maharjan J., Thakor A.. 2016. Disagreement-induced CEO turnover . Working Paper, Washington University in St. Louis. Huson M., Parrino R., Starks L.. 2001. Internal monitoring mechanisms and CEO turnover: A long-term perspective. Journal of Finance  56: 2265– 99. Google Scholar CrossRef Search ADS   Jenkins S. P. 1995. Easy estimation methods for discrete-time duration models. Oxford Bulletin of Economics and Statistics  57: 129– 36. Google Scholar CrossRef Search ADS   Jenter D., Kanaan F.. 2015. CEO turnover and relative performance evaluation. Journal of Finance  70: 2155– 83. Google Scholar CrossRef Search ADS   Jenter D., Lewellen K.. 
2014. Performance-induced CEO turnover . Working Paper, Stanford University. Kaplan S., Minton B.. 2012. How has CEO turnover changed? International Review of Finance  12: 57– 87. Google Scholar CrossRef Search ADS   Mclaughlin K. J. 1991. A theory of quits and layoffs with efficient turnover. Journal of Political Economy  99: 1– 29. Google Scholar CrossRef Search ADS   Parrino R. 1997. CEO Turnover and outside succession: A cross-sectional analysis. Journal of Financial Economics  46: 165– 97. Google Scholar CrossRef Search ADS   Shumway T. 2001. Forecasting bankruptcy more accurately: A simple hazard model. Journal of Business  74: 101– 24. Google Scholar CrossRef Search ADS   Singer J. D., Willett J. B.. 2003. Applied longitudinal data analysis: Modeling change and event occurrence . New York: Oxford University Press. Google Scholar CrossRef Search ADS   Sueyoshi G. T. 1995. A class of binary response models for grouped data. Journal of Applied Econometrics  110: 411– 31. Google Scholar CrossRef Search ADS   Weisbach M. S. 1988. Outside directors and CEO turnover. Journal of Financial Economics  20: 431– 60. Google Scholar CrossRef Search ADS   © The Author 2017. Published by Oxford University Press on behalf of The Society for Financial Studies. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

Journal

Review of Corporate Finance Studies, Oxford University Press

Published: Mar 1, 2018
