Getting the Incentives Right: Backfilling and Biases in Executive Compensation Data

Abstract

We document that backfilling in the ExecuComp database introduces a data-conditioning bias that can affect inferences and make replicating previous work difficult. Although backfilling can be advantageous due to greater data coverage, if not addressed, the oversampling of firms with strong managerial incentives and higher subsequent returns leads to a significant upward bias in abnormal compensation, pay-for-performance sensitivity, and the magnitudes of several previously established relations. The bias can also lead one to misinterpret the appropriate functional form of a relation and whether the data support one compensation theory over another. We offer methods to address this issue.

Received May 12, 2014; editorial decision May 10, 2016 by Editor David Hirshleifer.

Executive compensation, especially CEO compensation, consistently fosters debate among a wide variety of interested parties, including shareholders, government officials, the media, the public, and academic researchers. For example, academic scrutiny of compensation stems from the central role of contracting in agency theory and an interest in understanding how compensation mitigates or exacerbates agency issues.1 Accordingly, a large number of studies in finance, economics, accounting, management, and other fields employ data from the Standard & Poor’s (S&P’s) Compustat ExecuComp database (henceforth, ExecuComp), which was introduced in 1994. In fact, since 1998, more than 1,000 published articles have used this database, and over half of these articles can be found in nine of the leading finance and accounting journals.2 Given the widespread use of the ExecuComp database, it is imperative to understand its construction, any inadvertent biases that arise as a result, and the implications of (and, ideally, solutions to) those biases for research.
Despite its importance, we are aware of no studies that examine biases within the ExecuComp database.3 We provide the first such study by examining issues that arise from the practice of “backfilling” or adding historical data when ExecuComp initiates coverage of firms or managers. While documenting the effects of backfilling on the results presented in the previous literature is beyond the scope of this (or any one) study, we examine some typical analyses to present examples of the potential data-conditioning biases backfilling can generate. Our results indicate that the backfilled observations are not random, but instead lead to oversampling from certain types of firm-years (e.g., strong-performing growth firms that use more incentive compensation). Because of this, we find that backfilling can significantly affect estimates, both economically and statistically, and the resulting interpretations and conclusions. Backfilling in the ExecuComp database arises due to S&P’s practice of collecting all available compensation data from the proxy statements of covered firms. Because proxy statements almost always contain historical compensation data prior to the particular year, this policy implies that the initiation of coverage of managers results in backfilling. Three circumstances prompt the addition of managers to the ExecuComp database: (1) an individual employed at the firm becomes one of the five highest-paid executives in the firm (e.g., gets promoted), (2) a firm is added to the S&P 1500 index, or (3) a firm that is not and has never been in the S&P 1500 index is added to the database for some reason. 
Per conversations with S&P, generally this third circumstance occurs when an S&P client requests a firm’s inclusion in the database, although it has also been suggested that firms might be added due to their addition to S&P industry indices.4 According to S&P, backfilling was discontinued in 2006 due to new regulatory reporting requirements that limited the amount of historical compensation data disclosed in firm proxy statements. However, reporting requirements were again changed after that date and backfilling has apparently resumed. Backfilling a database, such as ExecuComp, can be beneficial due to the increased data availability and accessibility and the increased power of empirical tests. However, backfilling can also be problematic. If the data-conditioning bias is unaddressed it can significantly alter inferences from empirical analyses. Moreover, backfilling makes it extremely difficult to replicate the results of a previous study because later research will include backfilled observations that were not available to the previous researchers, even if one uses identical date ranges and filters. We find a large amount of backfilling in the ExecuComp database. For example, in the October 2009 version (or “vintage”) of the database, we estimate that 4,037 (17.8%) of the 22,720 firm-year observations (and all of their reported executives) were backfilled. At the manager level, we estimate that at least 31,901 (23%) of the 136,684 salary observations for fiscal years 1994–2005 have been backfilled.5,6 Moreover, the three events that lead S&P to backfill do not occur randomly. For example, consider an index addition (a firm being added to the S&P 1500), which is likely to follow a period of strong firm performance and high stock returns. 
Assuming the added firm was not already included in ExecuComp (e.g., due to a previous client request), the index addition triggers ExecuComp coverage and typically results in two years of backfilled data because the first proxy collected by S&P includes three years of historical data. Consistent with the implications of this example, our tests show that backfilling leads to oversampling of high-growth companies that experienced high returns with low risk (i.e., low time-series variation in returns relative to nonbackfilled firms). In addition, we find that managers whose data are backfilled tend to have lower salaries, lower total compensation, and higher stock ownership than other managers in the database. Most importantly, we find that failure to control for backfilling not only generates a strong and significant upward bias in the magnitudes of several previously established relations, it can also lead to misinterpretations of the appropriate functional form of a relation, or of the degree to which the data support one theory over another. Moreover, in structural modeling with data from ExecuComp (even assuming that the model provides the correct functional form), given the feedback from one estimate to another, biased compensation data could affect other estimates as well as one’s conclusion that the structural model is appropriate. For example, consider estimates of pay-for-performance sensitivities, stemming from the original work of Jensen and Murphy (1990), which play a major role in the empirical corporate literature. Our results show that pay-for-performance sensitivities derived from ExecuComp data without adjusting for backfilling are significantly overestimated. After excluding data that we estimate to have been backfilled, we find that a CEO at a median-risk firm receives an additional $0.55 in total direct compensation per $1,000 change in shareholder wealth.
However, using all available ExecuComp data (including observations we believe to have been backfilled), we find a sensitivity of $0.69 per $1,000, an increase of 25%. The upward bias is striking when one examines only the backfilled observations: the sensitivity estimated when only using backfilled data is $2.95 per $1,000, over five times higher than the estimated sensitivity for nonbackfilled data. The effects of backfilled data are more dramatic if we incorporate stock and option ownership in our estimates of pay-for-performance sensitivities. Failure to adjust for backfilled data generates a pay-for-performance sensitivity estimate that is 64% higher than the pay-for-performance sensitivity we find when using backfill-free data ($2.87 per $1,000 using all of ExecuComp versus $1.75 using backfill-free data). Our results also have implications for the use of the ExecuComp data to address the widespread debate over executive compensation as rent extraction or optimal contracting (e.g., see Bebchuk and Fried 2004; Murphy 2002). We show that the presence of backfilled observations is more likely to tilt some tests toward the optimal-contracting view because the ExecuComp database tends to oversample CEOs who have lower salaries, lower total compensation, and stronger incentives. Moreover, the backfilled CEOs tend to be drawn from a set of strong-performing firms rather than at random from the fuller set of firms. However, the presence of backfilled observations does not favor the optimal contracting view for all empirical tests that have been employed. For example, we find that removing the backfilled data (observations that tend to have low abnormal compensation and high future returns) changes the estimation of a significant negative relation between abnormal compensation and future returns to no significant relation.
Further, researchers who estimate abnormal compensation need to consider that the addition of backfilled data leads to upward-biased estimates of abnormal compensation for the nonbackfilled observations, on average. The presence of backfilled observations significantly affects the estimation of several other previously identified relations. For example, failure to control for backfilling leads to an over-estimation of the relation between pay-performance sensitivities and firm risk or firm size, as well as the relation between managerial ownership and firm value. Further, the inclusion of backfilled data in tests of the association between Tobin’s q and ownership can lead to significantly misestimating the magnitude of the relation (statistically and economically), and also to misinterpreting the correct functional form of the relation. Taken together, our results highlight the importance of controlling for backfilled data in future research. Consequently, we develop methods that allow researchers to identify and control for backfilling in the data. These methods not only allow the researcher to prevent biases in their empirical studies, but also allow more appropriate replications of previous work.

1. Data and Potential Biases

1.1 ExecuComp construction and the backfilling process

S&P reports detailed executive compensation data from firms’ annual proxy statements (form DEF 14A) based on the reported current and two-year historical compensation data for the five most highly compensated executives (often referred to as the “Top five”).7 S&P also constructs a measure of aggregate total direct compensation (TDC1), computed as the sum of salary, bonus, other annual compensation, total value of restricted stock granted, total value of stock options granted (using a Black-Scholes approach), long-term incentive payouts, and all other pay.
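As a minimal sketch of the TDC1 definition just given, the total is simply the sum of the seven listed components. The field names and the dollar amounts below are hypothetical and chosen only for illustration; they are not ExecuComp variable names or values from the paper.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class PayComponents:
    """One executive-year of pay, in thousands of dollars (hypothetical field names)."""
    salary: float
    bonus: float
    other_annual: float
    restricted_stock_grants: float
    option_grants_black_scholes: float
    ltip_payouts: float
    all_other: float


def total_direct_compensation(pay: PayComponents) -> float:
    """Sum the seven components listed in the text to get a TDC1-style total."""
    return sum(astuple(pay))


# Made-up numbers, chosen only so the arithmetic is easy to follow.
example = PayComponents(300.0, 150.0, 10.0, 100.0, 400.0, 25.0, 15.0)
tdc1 = total_direct_compensation(example)
print(tdc1)
```

An observation missing any component (for example, option grant values) cannot produce such a total, which is one reason a TDC1-based sample can be smaller than a salary-based one.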
However, in 2006, the SEC adjusted compensation disclosure requirements to better match the FAS 123R accounting changes, which included reporting options at “fair value.” The SEC allowed a three-year phase-in period during which companies would only be required to report the current year of compensation data for the first year after the rule change, then two years of data, with a return to three years of data thereafter.8

ExecuComp covers all firms in the S&P 1500 index, and maintains coverage if a firm drops out of the index. In addition, ExecuComp includes some firms that have never been in the S&P 1500. For example, of the 22,720 firm-year observations with fiscal years ranging from 1994 to 2005 in the 2009 vintage, 17,422 (76.7%) are members of the S&P 1500 index at the time of database release, 1,261 (5.6%) are firms that were previously in the index, 705 firms (3.1% of the total) were included up to two years before they entered the S&P, and the remaining 3,332 firms (14.7% of the total) apparently were added to the database for other reasons.9

The executives covered in the ExecuComp database also change over time. Beyond the executives included due to the firm additions discussed above, new individuals are added when they become one of the highest-paid executives in a firm as reported to the SEC. When any of these events occurred over the 1994–2005 time frame, available historical compensation data were added. The appendix provides more details of the process.

1.2 The nature of potential biases

Using stock performance as an example, Figure 1 illustrates the potential for backfilling to affect analyses based on firm characteristics. For each firm-year in ExecuComp for the 1994–2005 period, we collect stock returns over the years $t$, $t+1$, and $t+2$.
We do this separately for the full sample and for the five following subsamples: (1) firm-years for which we believe no managers have backfilled compensation data, (2) firm-years where at least one manager has backfilled data for any reason, (3) firm-years for which at least one manager has backfilled data due to manager addition (e.g., due to promotion), (4) firm-years where data for the firm were backfilled because it was added to the S&P index, and (5) firm-years for which the data for the firm were backfilled for some other reason (e.g., client request).10

Figure 1. Cumulative stock returns. This figure plots the average cumulative stock returns for firm-year observations in ExecuComp over 1994–2005. The return from 0 to $t$ represents the average stock return measured contemporaneously with that firm’s year $t$ compensation data. We also report cumulative stock returns that incorporate years $t+1$ and $t+2$. We report returns separately for the full sample, observations that we estimate were not backfilled, backfilled observations, and three subsets of backfilled observations: index, other, and manager. Specifically, a firm-level observation is termed backfilled if any manager in that year is backfilled for any reason. An observation is not termed backfilled if no manager has been backfilled. The manager backfilled sample consists of any firm with backfilling due to manager additions. The index backfilled sample consists of any firm that was backfilled because it was added to the S&P 1500 Index. The Other backfilled sample consists of firms that were added to ExecuComp for other reasons.

As Figure 1 shows, there are substantial differences in mean cumulative returns across the different samples. Backfilled observations have higher average returns than the nonbackfilled sample. The distinction between manager additions, index additions, and other additions shows that the performance differences are driven by the backfilling due to firm additions, thus resulting in oversampling of successful, high-growth firms. In contrast, observations that were backfilled due to manager additions have almost identical average returns to those of nonbackfilled observations because such additions do not result in new firms (and their returns) being added to the database.

Figure 2 shows that the backfilling can affect inferences about managerial compensation from both firm and manager backfilling. The five bar charts illustrate the level of executive pay and its composition divided among salary, bonus, options, and other compensation for each year from 1994 through 2005. The charts show, in order, compensation for the full ExecuComp sample, the nonbackfilled observations, all backfilled observations, firm-backfilled observations, and manager-backfilled observations.
Comparing the second and third charts, it is clear that backfilled observations have lower levels of pay relative to nonbackfilled observations. The fourth and fifth charts show that these differences are driven not only by firm backfilling but also by manager backfilling.

Figure 2. Level and composition of executive pay. This figure summarizes the level and composition of executive pay by year. The bar height represents the median level of executive pay in that year, in thousands of dollars. We separate compensation into salary, bonus, Black-Scholes option value, and all other compensation. All values are as reported by ExecuComp. Composition percentages are computed by first measuring the percentages for each executive and then averaging across all executives.

In a similar manner, Figure 3 shows median pay-for-performance sensitivities for each year from 1994 through 2005 for the full sample and the four subsamples. Following Murphy (1999), we measure pay-for-performance sensitivity separately for options, stock, cash compensation and long-term incentive payouts.11 The bar height represents the median total change in executive wealth per $1,000 change in shareholder wealth.
The contribution of individual sources of pay to the total pay-for-performance sensitivity is computed by first measuring the percentage of pay-for-performance sensitivity that comes from each source for each executive, then averaging across all executives in that cross section. Given that all of the charts are depicted on the same scale, it is clear that backfilled firms have very different average pay-for-performance sensitivities relative to nonbackfilled firms. Further, the charts indicate that while the pay-for-performance sensitivity for the median executive has varied substantially across time, the variation is much greater for the backfilled observations, particularly the manager backfilled observations. Further, relative to the nonbackfilled sample, a larger portion of backfilled executives’ estimated pay-for-performance sensitivity stems from changes in option, restricted stock, and stock values. Again, the differences arise from both firm- and manager-backfilled observations. As we show in later regressions, these differences are both economically and statistically significant.

Figure 3. Pay-for-performance sensitivity. This figure summarizes the sensitivity of executive pay to performance. The $y$-axis is median change in executive wealth per $1,000 change in shareholder wealth. Composition percentages are computed by first measuring the percentages for each executive and then averaging across executives. Computation of the pay-for-performance sensitivity estimates for the individual components follows Murphy (1999).

These initial findings suggest that backfilling systematically oversamples certain types of firms and managers, generating the potential to draw incorrect inferences or misleading estimates. Moreover, failing to control for backfilling can lead to a lack of comparability of results across studies because for a given year, follow-up studies would include firms not contained in the original ExecuComp vintage. The extent of the problem would depend on the exact methodology and the specific vintage of ExecuComp data used.

1.3 Backfilled observations

Table 1 summarizes the number of backfilled observations and the years in which the backfilling occurs. Panels A and B report the occurrence of Salary and TDC1 backfilling, respectively. Moving from left to right, the columns report the total number of observations for that year in the 2009 vintage, the total number of backfilled observations, and the number of backfilled observations in each vintage of the database. Each row represents a given fiscal year of data. For example, there are 12,641 manager observations with Salary data for fiscal year 1998. We estimate that 3,923 (31%) of these were backfilled at some point. Thus, a paper using the 2009 ExecuComp database to examine 1998 compensation will have a very different sample relative to that of an earlier paper that used (for example) the 1999 vintage. The bottom row of panel A summarizes the incidence of backfilling across all vintages and shows that of the 154,522 salary observations for fiscal years 1992–2005, we estimate that 32,046 (21%) are backfilled. Panel B reports the occurrence of TDC1 backfilling.
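To make the idea of vintage-by-vintage backfilling concrete, the sketch below flags a manager-year as backfilled when it first appears in a vintage later than the one that would normally be the first to carry that fiscal year. This is an illustrative assumption, not the identification procedure in the appendix: the one-year reporting lag is inferred from the layout of Table 1 (for instance, fiscal-year 1998 observations are first counted as backfilled in the 2000 vintage), and the sketch ignores complications such as missing early vintages. All names are hypothetical.

```python
from typing import Dict, Set, Tuple

Observation = Tuple[str, int]  # (manager_id, fiscal_year), hypothetical keys


def first_vintage_seen(vintages: Dict[int, Set[Observation]]) -> Dict[Observation, int]:
    """Map each observation to the earliest vintage year in which it appears."""
    first_seen: Dict[Observation, int] = {}
    for vintage_year in sorted(vintages):
        for obs in vintages[vintage_year]:
            first_seen.setdefault(obs, vintage_year)
    return first_seen


def flag_backfilled(vintages: Dict[int, Set[Observation]],
                    reporting_lag: int = 1) -> Dict[Observation, bool]:
    """Flag observations that first appear later than fiscal_year + reporting_lag.

    The reporting_lag of one year is an assumption, not a documented rule.
    """
    first_seen = first_vintage_seen(vintages)
    return {obs: first > obs[1] + reporting_lag for obs, first in first_seen.items()}


# Toy data: manager "m2" enters coverage in the 2000 vintage, which also
# brings in that manager's fiscal-year 1998 pay.
toy_vintages = {
    1999: {("m1", 1998)},
    2000: {("m1", 1998), ("m1", 1999), ("m2", 1998), ("m2", 1999)},
}
flags = flag_backfilled(toy_vintages)
```

In the toy data only the fiscal-year 1998 record for "m2" is flagged, because it is the only record that first appears more than one year after its fiscal year.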
The bottom row of panel B shows that, by our estimate, 17,909 (14%) of the 130,424 TDC1 observations have been backfilled.12

Table 1. Backfilling by vintage and fiscal year of observation

Fiscal  Obs. in 2009  Backfilled  Number backfilled, by ExecuComp vintage
year         vintage        obs.   1997   1998   1999   2000   2001   2002   2003   2004   2005   2006   2009

A. Number of observations in which Salary is backfilled
1992           8,028          42      9     11      0      0     10      0      4      0      0      0      8
1993           9,810         103     57     11      0      9     12      0      5      0      0      0      9
1994          10,662       1,214  1,147     18      0     15     16      0      8      0      0      0     10
1995          11,138       2,948  1,564  1,172    175     17     18      0      2      0      0      0      0
1996          11,687       3,218      0  1,585  1,393    217     24      0      3      0      0      0      0
1997          12,044       3,668      0      0  1,859  1,543    269      1      0      0      0      0      0
1998          12,641       3,923      0      0      0  2,099  1,644    175      5      0      0      0      0
1999          12,214       3,439      0      0      0      0  2,395    546    482     16      0      0      0
2000          11,542       2,472      0      0      0      0      0    790  1,613     59      7      1      2
2001          11,381       1,954      0      0      0      0      0      0    526  1,125     39     53    211
2002          11,549       3,083      0      0      0      0      0      0      0  1,637  1,066    103    277
2003          11,817       3,100      0      0      0      0      0      0      0      0  1,546  1,097    457
2004          10,900       2,187      0      0      0      0      0      0      0      0      0  1,568    619
2005           9,109         695      0      0      0      0      0      0      0      0      0      0    695
Total        154,522      32,046  2,777  2,797  3,427  3,900  4,388  1,512  2,648  2,837  2,658  2,822  2,288

B. Number of observations in which TDC1 is backfilled
1992           5,567       2,377  2,217     15     84     10     18      0     16      2      1      6      8
1993           8,332         368     27      0    314      6      9      0      4      0      0      0      8
1994           9,029         902    339      4    524      8     11      0      6      0      0      0     10
1995           9,306       1,207    419    338    423     15     12      0      0      0      0      0      0
1996           9,838       1,471      0    676    682     92     17      0      5      0      3      0      0
1997          10,080       1,797      0      0    895    784    117      1      0      0      0      0      0
1998          10,416       1,827      0      0      0    974    731    120      2      0      0      0      0
1999          10,227       1,563      0      0      0      0  1,195    245    116      7      0      0      0
2000           9,799         780      0      0      0      0      0    323    423     27      4      1      2
2001           9,389         902      0      0      0      0      0      0    319    489     20     21     53
2002           9,591       1,309      0      0      0      0      0      0      0    721    458     46     84
2003          10,171       1,600      0      0      0      0      0      0      0      0    702    543    355
2004           9,817       1,243      0      0      0      0      0      0      0      0      0    793    450
2005           8,862         563      0      0      0      0      0      0      0      0      0      0    563
Total        130,424      17,909  3,002  1,033  2,922  1,889  2,110    689    891  1,246  1,188  1,410  1,533

This table reports the number of backfilled observations in ExecuComp by vintage and fiscal year of the observation. The rows represent different fiscal years, as indicated, and the columns represent different vintages. We also report for each year the total manager-year observations in the 2009 vintage of ExecuComp and the total number of these observations that we identify as backfilled. The remaining columns indicate our estimates of the vintages in which the backfilling occurs. Panel A uses Salary to identify backfilled observations, and panel B uses TDC1.

For the remainder of the paper, we maintain two approaches in our tests. First, to illustrate the problems with backfilling, we focus on the 1994–2005 sample period, the period over which most of the early backfilling occurred.13 Second, for each test, we select the appropriate backfilling identifier based on the data required since not every observation has the more detailed compensation data, for example, option grants.
If a test uses only summary compensation data, we use Back_Salary as the identifier. If a test uses detailed data on the components of compensation, such as option data needed to calculate pay-for-performance sensitivities, then we use Back_TDC1 to identify backfilled observations.

1.4 Brief overview of identifying the reasons for backfilling

The appendix reports details of the strategy we employ to estimate the types of backfilling. In brief, we use two pieces of information: the timing of the backfilling and the historical S&P 1500 constituents list. If a firm-level backfilled observation occurs one or two years prior to index addition, then it is defined as Index backfilling. If a firm is backfilled more than two years prior to the firm entering the S&P 1500, then it is Other. If an observation is backfilled and the firm is currently in the S&P 1500, or if there are other observations for the same firm-year that are not backfilled, then it must be Manager backfilling. We use indicator variables Back_Salary_Index, Back_Salary_Other, and Back_Salary_Manager to indicate index, other, and manager backfilling for observations in which the Salary variable is backfilled, respectively. Similarly, we construct indicator variables Back_TDC1_Index, Back_TDC1_Other, and Back_TDC1_Manager for the TDC1 backfilled data. Lastly, in several analyses we report results that distinguish firm-level backfilling from manager-level backfilling. In these cases, firm-level backfilling is the union of Index and Other backfilling. The results of this identification process are shown in Table 2. The majority of backfilling occurs because new managers move into the top-five group.
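The classification rules described in Section 1.4 can be sketched as a small function. This is a simplified reading of the text rather than the appendix procedure: the argument names are hypothetical, "currently in the S&P 1500" is interpreted as membership in the observation's fiscal year, firms that never enter the index are treated as Other, and each observation receives a single label.

```python
from typing import Optional


def classify_backfill(fiscal_year: int,
                      index_entry_year: Optional[int],
                      in_index_at_observation: bool,
                      firm_year_has_nonbackfilled: bool) -> str:
    """Assign a backfilled observation to Manager, Index, or Other backfilling.

    index_entry_year is None for firms that never enter the S&P 1500
    (treated as Other here, which is an assumption).
    """
    # Manager backfilling: the firm was already covered in that year.
    if in_index_at_observation or firm_year_has_nonbackfilled:
        return "Manager"
    # Index backfilling: one or two years before the firm joins the index.
    if index_entry_year is not None:
        years_before_entry = index_entry_year - fiscal_year
        if 1 <= years_before_entry <= 2:
            return "Index"
    # Everything else, including more than two years before index entry.
    return "Other"


label_index = classify_backfill(1997, 1999, False, False)
label_other = classify_backfill(1995, 1999, False, False)
label_manager = classify_backfill(1998, 1996, True, False)
```

Checking the Manager condition first reflects the idea that a firm already covered in a given year cannot have been added at the firm level for that year; a different ordering would be needed if the labels were allowed to overlap.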
For example, out of the 3,923 backfilled manager observations in fiscal year 1998, we estimate that 2,169 (55.3%) were backfilled because a new manager moved into the top-five group, 581 (14.8%) were added because the firm moved into the S&P 1500, and 1,446 (36.9%) for other reasons.14

Table 2. Backfilling by fiscal year and type of backfilling

                         (1)             (2)                (3)
Fiscal year   Index addition   Other request   Manager addition

A. Back_Salary
1994                     107             416                694
1995                     313             994              1,711
1996                     424           1,201              1,763
1997                     588           1,428              1,875
1998                     581           1,446              2,169
1999                     462             820              2,486
2000                     287             359              2,013
2001                     238             239              1,631
2002                     178             527              2,608
2003                     222             571              2,412
2004                     145             386              1,735
2005                      26              38                646
Total                  3,571           8,425             21,743

B. Back_Total
1994                     143             291                469
1995                     162             621                445
1996                     257             831                426
1997                     398           1,028                432
1998                     442           1,001                452
1999                     370             517                777
2000                     223             219                395
2001                     181             166                651
2002                     119             374                963
2003                     183             460                997
2004                     124             303                844
2005                      25              26                515
Total                  2,627           5,837              7,366

This table presents the number of backfilled manager-year observations by fiscal year and type of backfilling.
Column 1 reports the total number of observations that we estimate to have been backfilled because the firm entered the S&P 1500 index. Column 2 reports the total number of observations that we estimate to have been backfilled due to firms being added for other reasons. Column 3 reports the total number of observations that we estimate to have been backfilled because the manager entered the top five paid managers in the firm. Panel A reports results using Salary as the identifying compensation item, and panel B uses TDC1. 2. Differences in Backfilled Data 2.1 Univariate analysis We now describe some of the differences between backfilled and nonbackfilled observations (by type), which we expect to occur because of the systematic ways in which observations are added to the database. In Table 3, panel A, we report means and medians of manager compensation and ownership statistics using variables from ExecuComp for the full sample and each of the subsamples. We also present $$t$$-tests for differences in means relative to the nonbackfilled sample (Column 2), where standard errors are clustered by firm, and we report Wilcoxon’s z-scores for the differences in medians. Table 3 Summary statistics for manager compensation and ownership A. 
Manager compensation and ownership characteristics across samples

| | (1) Full sample | (2) Not backfilled | (3) Backfilled | (4) All firm backfilled | (5) Index addition | (6) Other addition | (7) Manager addition |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Salary | 345** | 375 | 244** | 244** | 237** | 246** | 241** |
| | (279**) | (304) | (208**) | (200**) | (200**) | (200**) | (210**) |
| Bonus | 328** | 369 | 194** | 193** | 200** | 190** | 189** |
| | (132**) | (151) | (88**) | (85**) | (87**) | (83**) | (86**) |
| Other annual compensation | 27 | 28 | 23 | 15** | 11** | 17** | 27 |
| | (0) | (0) | (0) | (0) | (0) | (0) | (0) |
| Restricted stock grant | 193** | 222 | 98** | 75** | 61** | 82** | 106** |
| | (0) | (0) | (0) | (0) | (0) | (0) | (0) |
| TDC1 | 2,122** | 2,257 | 1,183** | 1,309** | 1,291** | 1,317** | 1,034** |
| | (915**) | (988) | (518**) | (579**) | (592**) | (573**) | (435**) |
| Allpay | 3,072 | 3,114 | 2,018** | 2,282** | 3,454 | 1,689** | 1,573** |
| | (1,100**) | (1,140) | (669**) | (774**) | (849**) | (734**) | (538**) |
| % shares owned | 0.01** | 0.009 | 0.018** | 0.022** | 0.021** | 0.022** | 0.009 |
| | (0.001**) | (0.001) | (0.001**) | (0.001**) | (0.002**) | (0.001**) | (0.001) |
| Black-Scholes option value | 991** | 1064 | 479** | 637** | 681** | 617** | 305** |
| | (194) | (227) | (0**) | (70**) | (94**) | (59**) | (0**) |
| Option PFP per $${\$}$$1,000 | 0.95** | 0.85 | 2.36** | 2.5** | 2.6** | 2.45** | 1.98** |
| | (0.003) | (0.01) | (0.87**) | (0.99**) | (1.04**) | (0.94**) | (0.61**) |
| Option PFP per 1% | 12,504 | 12,144 | 17,347** | 18,732** | 23,745** | 16,409* | 13,712 |
| | (174**) | (70) | (3,869**) | (4,303**) | (5,169**) | (3,637**) | (3,274**) |
| # Salary observations | 136,684 | 104,783 | 31,901 | 11,996 | 3,571 | 8,425 | 21,729 |
| # TDC1 observations | 116,525 | 101,451 | 15,074 | 8,448 | 2,624 | 5,824 | 7,275 |

B. Manager compensation and ownership characteristics of backfilled versus index subsamples

| | Backfilled | Not backfilled (in SP1500) | Not backfilled (in SP400 or SP600) | Not backfilled (dropped from SP indices) |
| --- | --- | --- | --- | --- |
| Salary | 244 | 384** | 311** | 318** |
| | (208) | (313**) | (262**) | (265**) |
| Bonus | 194 | 387** | 229** | 183* |
| | (88) | (162**) | (110**) | (68**) |
| Other annual compensation | 23 | 28 | 17 | 26 |
| | (0) | (0) | (0) | (0) |
| Restricted stock grant | 98 | 236** | 101 | 120* |
| | (0) | (0) | (0) | (0) |
| TDC1 | 1,183 | 2,321** | 1,316** | 1,252 |
| | (518) | (1032**) | (733**) | (593**) |
| Allpay | 2,018 | 3,123** | 2,035 | 1,764 |
| | (669) | (1,167**) | (825**) | (683) |
| % shares owned | 0.018 | 0.009** | 0.011** | 0.008** |
| | (0.001) | (0.001**) | (0.001**) | (0.001**) |
| Black-Scholes option value | 479 | 1,073** | 561** | 473 |
| | (0) | (247**) | (153**) | (62**) |
| Option PFP per $${\$}$$1,000 | 2.36 | 0.72** | 1.03** | 2.07* |
| | (0.87) | (0.002**) | (0.159**) | (0.706**) |
| Option PFP per 1% | 17,347 | 10,923** | 9,344** | 6,727** |
| | (3,869) | (16**) | (831**) | (1,368**) |
| # Salary observations | 31,901 | 90,760 | 58,792 | 5,681 |
| # TDC1 observations | 15,074 | 88,364 | 57,310 | 5,529 |

This table reports means and medians of manager compensation and ownership characteristics for fiscal years 1994 through 2005. Panels A and B show the manager compensation and ownership characteristics from ExecuComp (Salary, Bonus, other annual compensation, restricted stock grants, total direct compensation (TDC1), shares owned, and Black-Scholes option values). The panel also shows variables computed following the previous compensation research: Allpay is constructed following Jensen and Murphy (1990) and equals total cash compensation plus the change in the present value of future cash compensation plus the change in option value. Option PFP is the pay-for-performance sensitivity from option grants (dollar change in executive’s option value per $${\$}$$1,000 change in shareholder wealth) as defined in Yermack (1995). 
The first column summarizes the means (with medians below in parentheses) for all manager-year salary observations in the October 2009 vintage. The second column includes all manager-year observations that are not backfilled (using Salary as the identifier), and Column 3 reports statistics on backfilled observations. The last four columns report statistics on subsets of backfilled data. Column 4 uses firm-level backfilled data, which are observations backfilled either due to index additions or other reasons. Columns 5 through 7 decompose backfilling into its three types: index, other, and manager. Panel B reports the manager compensation and ownership characteristics for the backfilled sample versus S&P index firms. ** and * indicate significant differences at the 1% and 5% levels, respectively, relative to Column 2 (nonbackfilled data) in panel A and relative to Column 1 (backfilled data) in panel B, where differences in means are tested using $$t$$-tests with standard errors clustered at the firm level, and differences in medians are tested using Wilcoxon rank-sum tests.

The first three columns of Table 3, panel A, show that on almost every dimension considered in the table, the backfilled observations differ significantly from the nonbackfilled observations, both economically and statistically. These results are consistent with those illustrated in Figures 1–3, which graphically present differences in returns, levels of compensation, and pay-for-performance sensitivities. Backfilled observations have much lower values for compensation components (e.g., salary, bonus, restricted stock grants, and option grant value) but higher fractional ownership than their nonbackfilled counterparts. The mean (median) salary among backfilled observations is $${\$}$$244,000 ($${\$}$$208,000), compared to $${\$}$$375,000 ($${\$}$$304,000) for the nonbackfilled observations. The mean (median) total compensation, or TDC1, is $${\$}$$1,183,000 ($${\$}$$518,000) for backfilled observations, which is roughly half the total compensation among the nonbackfilled observations, $${\$}$$2,257,000 ($${\$}$$988,000). These differences are statistically significant at the 1% level and suggest that using later vintages of ExecuComp without adjusting for backfilling leads to compensation estimates that are biased downward relative to estimates based on the original data. Following Yermack (1995), we also compute option-grant pay-for-performance sensitivities, labeled Option PFP, and find a mean sensitivity of $${\$}$$2.36 per $${\$}$$1,000 change in shareholder wealth for the backfilled observations, compared to $${\$}$$0.85 for the nonbackfilled sample. This indicates that the backfilled observations have substantially greater option-grant pay-for-performance sensitivity than do the original observations.

In panel B of Table 3 we report differences between backfilled observations and subsamples of nonbackfilled S&P index firms. For each of Columns 2 through 4, we present $$t$$-tests for differences in means relative to the backfilled sample (Column 1), where standard errors are clustered by firm, and we report Wilcoxon’s z-scores for the differences in medians. In general, we see that the backfilled observations differ most from the nonbackfilled S&P 1500 firms, as well as the S&P 400 and S&P 600 firms, with smaller (but still significant) differences between the backfilled firms and the firms that have been dropped from the S&P indices.

Table 4 reports the means and medians of firm characteristics from CRSP and Compustat. In panel A, we report these statistics for the full sample and the subsamples.15 The table also presents $$t$$-tests for differences in means relative to the nonbackfilled sample (Column 2), where standard errors are clustered by firm, and we report Wilcoxon’s z-scores for differences in medians. 
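The firm-clustered comparison of means described above can be illustrated with a small sketch. It treats the difference in means as the slope from a regression of the outcome on a backfill dummy and sums each observation's influence within firm before squaring. The function name `clustered_mean_diff` and the toy salary figures are hypothetical, and the sketch applies no finite-sample correction, so it is only an approximation of what a statistical package would report and not necessarily the exact estimator behind the tables.

```python
from collections import defaultdict
from math import sqrt


def clustered_mean_diff(y, group, cluster):
    """Difference in means (group 1 minus group 0) with a cluster-robust SE.

    Equivalent to the slope from OLS of y on a constant and the 0/1 group
    dummy, using a sandwich variance that sums influence terms within each
    cluster. Assumption of this sketch: no small-sample adjustment.
    """
    n1 = sum(group)
    n0 = len(y) - n1
    mean1 = sum(v for v, g in zip(y, group) if g == 1) / n1
    mean0 = sum(v for v, g in zip(y, group) if g == 0) / n0

    # Influence of each observation on the difference, accumulated by cluster.
    influence = defaultdict(float)
    for v, g, c in zip(y, group, cluster):
        if g == 1:
            influence[c] += (v - mean1) / n1
        else:
            influence[c] -= (v - mean0) / n0

    se = sqrt(sum(s * s for s in influence.values()))
    return mean1 - mean0, se


# Hypothetical salaries (in $ thousands); group = 1 marks backfilled rows.
salary = [375.0, 410.0, 300.0, 244.0, 260.0, 230.0]
backfilled = [0, 0, 0, 1, 1, 1]
firm = ["F1", "F1", "F2", "F2", "F3", "F3"]

diff, se = clustered_mean_diff(salary, backfilled, firm)
print(f"difference = {diff:.1f}, clustered SE = {se:.1f}")
```

Passing a unique cluster label for every row reduces the calculation to a heteroskedasticity-robust standard error, which makes it easy to see how much the firm-level clustering changes the inference.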
Table 4 Summary statistics for firm characteristics

A. Firm characteristics across subsamples

| | (1) Full sample | (2) Not backfilled | (3) Backfilled | (4) All firm backfilled | (5) Index addition | (6) Other addition | (7) Manager addition |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Market value of equity ($${\$}$$MM) | 5181 | 4824 | 5446* | 787** | 579** | 886** | 6088** |
| | (985) | (946) | (1022*) | (353**) | (301**) | (378**) | (1199**) |
| Leverage | 0.189 | 0.185 | 0.192* | 0.161** | 0.147** | 0.167* | 0.197** |
| | (0.157*) | (0.15) | (0.161**) | (0.067**) | (0.052**) | (0.072**) | (0.174**) |
| Dividend yield (%) | 2.26 | 2.34 | 2.19 | 0.54** | 0.4** | 0.62** | 2.43 |
| | (0.14) | (0.16) | (0.14) | (0**) | (0**) | (0**) | (0.22**) |
| Tobin’s q | 2.12** | 1.99 | 2.21** | 2.79** | 3.03** | 2.68** | 2.12** |
| | (1.49) | (1.48) | (1.5*) | (1.75**) | (1.79**) | (1.73**) | (1.47) |
| CDF($$\sigma^{2}_{\it ret}$$) | 0.5 | 0.49 | 0.50 | 0.26** | 0.23** | 0.27** | 0.54** |
| | (0.5) | (0.49) | (0.51*) | (0.19**) | (0.16**) | (0.2**) | (0.55**) |
| Return | 0.22** | 0.19 | 0.24** | 0.53** | 0.65** | 0.47** | 0.20 |
| | (0.12*) | (0.11) | (0.13**) | (0.29**) | (0.35**) | (0.27**) | (0.12) |
| Return$$_{t+1}$$ | 0.23** | 0.19 | 0.25** | 0.64** | 0.60** | 0.67** | 0.20 |
| | (0.13) | (0.12) | (0.14) | (0.34**) | (0.31**) | (0.34**) | (0.12**) |
| Return$$_{t+2}$$ | 0.19** | 0.16 | 0.21** | 0.49** | 0.39** | 0.53** | 0.15 |
| | (0.11) | (0.10) | (0.11) | (0.22**) | (0.20**) | (0.24**) | (0.09**) |
| Return$$_{t+3}$$ | 0.12** | 0.08 | 0.15** | 0.20** | 0.14* | 0.23** | 0.14** |
| | (0.06**) | (0.03) | (0.08**) | (0.07**) | (0.06) | (0.08**) | (0.08**) |
| Instl ownership Herfindahl | 0.075 | 0.073 | 0.077** | 0.124** | 0.126** | 0.123** | 0.071 |
| | (0.05) | (0.05) | (0.05*) | (0.09**) | (0.09**) | (0.09**) | (0.05**) |
| # firm-year observations | 22,720 | 9,439 | 13,281 | 1,545 | 2,250 | 705 | 11,342 |

B. Firm characteristics of backfilled versus index subsamples

| | (1) Backfilled | (2) Not backfilled (in SP1500) | (3) Not backfilled (in SP400 or SP600) | (4) Not backfilled (dropped from SP indices) |
| --- | --- | --- | --- | --- |
| Market value of equity ($${\$}$$MM) | 5446 | 5308 | 1039*** | 562*** |
| | (1022) | (1085***) | (653***) | (161***) |
| Leverage | 0.192 | 0.18*** | 0.18*** | 0.25*** |
| | (0.161) | (0.154**) | (0.143***) | (0.144) |
| Dividend yield (%) | 2.19 | 2.63** | 0.49*** | 0.42*** |
| | (0.14) | (0.22***) | (0.06***) | (0***) |
| Tobin’s q | 2.21 | 1.91*** | 1.88*** | 1.73*** |
| | (1.51) | (1.47***) | (1.44***) | (1.29***) |
| CDF($$\sigma^{2}_{\it ret}$$) | 0.50 | 0.51 | 0.39*** | 0.39*** |
| | (0.51) | (0.51) | (0.38***) | (0.37***) |
| Return | 0.24 | 0.16*** | 0.17*** | 0.25 |
| | (0.13) | (0.10***) | (0.10***) | (0.03***) |
| Return$$_{t+1}$$ | 0.25 | 0.16*** | 0.16*** | 0.39*** |
| | (0.14) | (0.12**) | (0.11***) | (0.14) |
| Return$$_{t+2}$$ | 0.21 | 0.16*** | 0.16*** | 0.30*** |
| | (0.11) | (0.11) | (0.10) | (0.11) |
| Return$$_{t+3}$$ | 0.15 | 0.07*** | 0.08*** | 0.08* |
| | (0.08) | (0.03***) | (0.03***) | (–0.05***) |
| Instl ownership Herfindahl | 0.077 | 0.064*** | 0.071*** | 0.182*** |
| | (0.054) | (0.050***) | (0.056***) | (0.115***) |
| # firm-year observations | 13,281 | 7,913 | 5,595 | 676 |

This table reports means and medians of firm characteristics for fiscal years 1994 through 2005. Panels A and B use firm-level observations and compare firm characteristics obtained from COMPUSTAT and CRSP. Leverage is long-term debt divided by total assets. Div yield is the sum of dividends over the year divided by market equity. $$q$$ is Tobin’s q and CDF($$\sigma^{2}_{\it ret}$$) is the cumulative distribution function of the variance of returns for firms in our sample, following Aggarwal and Samwick (1999). Instl ownership Herfindahl is the Herfindahl index of institutional ownership using holdings from the Thomson database. A firm-level observation is considered to be backfilled (Column 3 in panel A and Column 1 in panel B) if any manager is backfilled. 
Similarly, Column 7 in panel A includes all firms with any manager that was backfilled due to manager addition. In panel A the first column summarizes the means (with medians below in parentheses) for all manager-year salary observations in the October 2009 vintage. The second column includes all manager-year observations that are not backfilled (using Salary as the identifier), and Column 3 reports statistics on backfilled observations. The last four columns report statistics on subsets of backfilled data. Column 4 uses firm-level backfilled data, which are observations backfilled either due to index additions or other reasons. Columns 5 through 7 decompose backfilling into its three types: index, other, and manager. Panel B reports the firm characteristics for the backfilled sample versus S&P index firms. ** and * indicate differences at the 1% and 5% level, respectively, relative to Column 2 in panel A (nonbackfilled data), and relative to Column 1 in panel B (backfilled data), where differences in means are tested using $$t$$-tests with standard errors clustered at the firm level, and differences in medians are tested using Wilcoxon rank-sum tests.

Focusing on Column 4, which reports summary statistics for firm-level backfilling (Index and Other additions), we find that the backfilled firms tend to be smaller, with lower dividend yields, lower leverage, and higher growth (as proxied by Tobin’s q). The stock returns contain notable differences as well. Backfilled firms tend to have substantially higher subsequent stock performance (i.e., after the date of the observation), but lower variance in returns relative to firms that were not backfilled.

Panel B of Table 4 reports differences in firm characteristics for backfilled observations compared to three subsamples of nonbackfilled S&P index firms, along with $$t$$-tests and Wilcoxon’s z-scores. Comparing backfilled to nonbackfilled firms in the S&P 1500, we find that backfilled firms have significantly higher Tobin’s q and returns, but no significant differences in return volatility. Nonbackfilled observations and observations of firms dropped from the S&P 1500 are markedly different, with the latter group exhibiting low market values of equity, low dividend yields, and low q.

Overall, Tables 3 and 4 show that the differences between backfilled and nonbackfilled observations are both statistically and economically significant. These findings alone raise concerns that including backfilled observations in analyses of relations between pay and various firm characteristics is problematic.

2.2 Multivariate analysis

To better understand the differences in characteristics between backfilled and nonbackfilled observations, we use logit specifications to model the likelihoods that (1) a firm-year is backfilled (Table 5) and (2) a firm-manager-year observation is backfilled (Table 6). In all specifications the dependent variable takes the value of one if the observation is backfilled and zero otherwise. 
Our empirical specifications include variables that have been identified in prior work as being important in explaining the variation in compensation across firms.16 To compare the magnitudes of the models’ coefficients, we standardize the independent variables to have zero mean and unit variance and report odds ratios from the logit models.17

Table 5 Descriptors of backfilled firm-level data

| | (1) TDC1: All firm-level | (2) TDC1: Index | (3) TDC1: Other | (4) TDC1: Any exec backfilled | (5) Salary: Any exec backfilled |
| --- | --- | --- | --- | --- | --- |
| ln(Firm size) | 0.80* | 0.86 | 0.73* | 0.80** | 1.08 |
| | (–2.05) | (–0.87) | (–2.55) | (–3.19) | (1.34) |
| Leverage | 0.94 | 0.91 | 0.97 | 1.10** | 1.07* |
| | (–1.05) | (–1.11) | (–0.50) | (2.85) | (2.54) |
| Div yield | 0.93 | 0.73 | 0.92 | 1.05 | 0.97 |
| | (–0.49) | (–0.51) | (–0.48) | (1.15) | (–0.94) |
| $$q$$ | 1.19** | 1.16** | 1.18** | 0.93 | 1.01 |
| | (3.18) | (2.81) | (3.28) | (–1.22) | (0.30) |
| CDF($$\sigma^{2}_{\it ret}$$) | 0.48** | 0.37** | 0.61** | 0.97 | 1.05 |
| | (–6.77) | (–5.51) | (–3.90) | (–0.46) | (1.00) |
| Return | 1.24** | 1.16** | 1.12** | 0.92* | 1.04 |
| | (6.58) | (3.63) | (3.96) | (–2.19) | (1.76) |
| Return$$_{t+1}$$ | 1.45** | 1.20** | 1.31** | 0.87** | 1.06** |
| | (11.6) | (4.83) | (7.35) | (–4.67) | (3.18) |
| Return$$_{t+2}$$ | 1.36** | 1.06 | 1.34** | 0.91** | 1.07** |
| | (10.2) | (1.71) | (8.69) | (–2.88) | (3.32) |
| Return$$_{t+3}$$ | 1.08** | 0.98 | 1.10** | 0.97 | 1.02 |
| | (2.93) | (–0.38) | (3.33) | (–1.21) | (1.05) |
| Observations | 15,463 | 15,463 | 15,463 | 15,463 | 15,626 |

This table presents odds ratios from logit specifications using firm-level observations from ExecuComp for 1994 through 2005. In Column 1, the dependent variable equals one if the executives in a given firm have TDC1 backfilled due to either index additions or Other, and zero otherwise. The second column uses a dependent variable equal to one if TDC1 for any executive at the firm has been backfilled due to index addition, and Column 3 uses a dependent variable equal to one if TDC1 has been backfilled for other reasons. Column 4 uses a dependent variable equal to one if an executive in the firm has TDC1 that has been backfilled for any reason. The last column identifies a firm as backfilled if any manager has Salary backfilled. Ln(Firm size) is the log of market equity at the beginning of the fiscal year. Leverage is long-term debt divided by total assets. Div yield is the sum of dividends over the year divided by market equity. $$q$$ is Tobin’s q and CDF($$\sigma^{2}_{\it ret}$$) is the cumulative distribution function of the variance of returns for firms in our sample, following Aggarwal and Samwick (1999). All independent variables are standardized to have mean zero and unit variance. All specifications include year effects. Standard errors are clustered at the firm level. ** and * indicate statistical significance at the 1% and 5% level, respectively. 
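Because the regressors are standardized, each odds ratio in Table 5 describes how the odds of being backfilled scale with a one-standard-deviation increase in that variable. The sketch below converts two of the reported odds ratios (1.45 for Return$$_{t+1}$$ and 0.48 for CDF($$\sigma^{2}_{\it ret}$$), both from Column 1) into implied probabilities. The baseline probability of 10% and the helper names are assumptions made purely for illustration; the paper does not report a baseline at this point.

```python
from math import exp, log


def odds_ratio(beta):
    """Odds ratio implied by a logit coefficient on a standardized regressor."""
    return exp(beta)


def prob_after_shift(p_base, ratio):
    """Probability obtained by multiplying the baseline odds by `ratio`."""
    odds = ratio * p_base / (1.0 - p_base)
    return odds / (1.0 + odds)


# Odds ratios taken from Table 5, Column 1.
or_return_next = 1.45  # Return_{t+1}
or_variance = 0.48     # CDF(sigma^2_ret)

# Hypothetical baseline backfill probability (an assumption, not from the paper).
p_base = 0.10

p_high_return = prob_after_shift(p_base, or_return_next)
p_high_variance = prob_after_shift(p_base, or_variance)

print(f"baseline: {p_base:.3f}")
print(f"+1 SD Return(t+1): {p_high_return:.3f}")
print(f"+1 SD return variance: {p_high_variance:.3f}")
```

An odds ratio scales odds rather than probabilities, so the implied change in probability depends on the baseline and is smaller than the odds ratio itself whenever the baseline is not close to zero.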
Table 6 Descriptors of backfilled manager-level data

| | (1) TDC1 all | (2) TDC1 firm | (3) TDC1 index | (4) TDC1 other | (5) TDC1 manager | (6) Salary all |
| --- | --- | --- | --- | --- | --- | --- |
| Salary | 0.87** | 1.13 | 1.03 | 1.20** | 0.61** | 0.81** |
| | (–2.62) | (1.95) | (0.30) | (2.77) | (–5.19) | (–4.25) |
| Bonus | 1.03 | 1.03 | 1.08 | 1.02 | 1.00 | 0.98 |
| | (0.81) | (0.63) | (1.36) | (0.50) | (0.011) | (–0.49) |
| Other ann comp | 1.03 | 1.17 | 0.99 | 1.20 | 0.81 | 1.00 |
| | (0.26) | (1.60) | (–0.030) | (1.78) | (–1.40) | (0.069) |
| Stock grants | 1.01* | 1.01* | 0.97 | 1.01** | 1.01** | |
| | (2.44) | (2.22) | (–0.21) | (2.77) | (2.81) | |
| Black-Scholes opt val | 0.93 | 1.00 | 1.02 | 1.00 | 0.62 | |
| | (–1.30) | (0.066) | (0.75) | (0.022) | (–1.95) | |
| Option PFP | 1.17** | 1.15** | 1.10** | 1.16** | 1.16** | |
| | (6.06) | (4.83) | (3.82) | (4.51) | (4.86) | |
| % shares owned | 1.19** | 1.21** | 1.18** | 1.23** | 1.10** | |
| | (10.4) | (9.23) | (6.19) | (8.75) | (3.64) | |
| CEO | 0.62** | 0.50** | 0.61** | 0.42** | 0.87** | 0.63** |
| | (–17.5) | (–18.6) | (–11.1) | (–15.5) | (–3.93) | (–16.1) |
| ln(Firm size) | 0.84* | 0.77* | 0.78 | 0.74* | 0.88 | 0.92 |
| | (–2.40) | (–2.53) | (–1.48) | (–2.48) | (–1.27) | (–1.39) |
| Leverage | 0.98 | 0.95 | 0.88 | 0.96 | 1.02 | 0.99 |
| | (–0.51) | (–1.01) | (–1.41) | (–0.54) | (0.52) | (–0.32) |
| Div yield | 1.08 | 0.92 | 0.94 | 0.90 | 1.09 | 1.09** |
| | (1.75) | (–0.64) | (–0.18) | (–0.72) | (1.68) | (2.65) |
| $$q$$ | 1.16** | 1.21** | 1.21** | 1.19** | 1.11* | 1.13** |
| | (3.56) | (3.61) | (2.93) | (3.23) | (1.97) | (2.95) |
| CDF($$\sigma^{2}_{\it ret}$$) | 0.58** | 0.45** | 0.34** | 0.51** | 0.91 | 0.65** |
| | (–8.04) | (–7.60) | (–6.29) | (–5.32) | (–1.00) | (–6.97) |
| Return | 1.16** | 1.20** | 1.20** | 1.13** | 1.02 | 1.18** |
| | (5.13) | (4.94) | (3.61) | (3.11) | (0.39) | (5.60) |
| Return$$_{t+1}$$ | 1.32** | 1.43** | 1.36** | 1.42** | 1.01 | 1.31** |
| | (11.5) | (12.1) | (8.12) | (11.4) | (0.26) | (11.1) |
| Return$$_{t+2}$$ | 1.25** | 1.31** | 1.19** | 1.36** | 1.09* | 1.22** |
| | (9.69) | (10.0) | (4.10) | (10.5) | (2.08) | (8.88) |
| Return$$_{t+3}$$ | 1.07** | 1.09** | 1.02 | 1.11** | 1.04 | 1.08** |
| | (2.75) | (3.16) | (0.37) | (3.70) | (0.92) | (3.52) |
| Observations | 84,017 | 79,617 | 75,865 | 77,729 | 78,823 | 98,776 |

This table reports odds ratios from logit specifications using manager-level observations from ExecuComp for 1994 through 2005. Columns 1 through 5 use TDC1 to determine backfilling, and the last column uses Salary. Column 1 reports odds ratios from a specification in which the dependent variable equals one for any backfilled observation, and zero otherwise. Column 2 sets the dependent variable equal to one if the observation is backfilled either due to index addition or for other reasons. Columns 3 through 5 use backfilling indicator variables for index, other, and manager backfilling, respectively. Manager compensation and ownership characteristics are obtained directly from ExecuComp (Salary, Bonus, other annual compensation, stock grants, the percentage of shares owned, and Black-Scholes option values). Option PFP is the pay-for-performance sensitivity of option grants as defined in Yermack (1995). CEO takes the value of one if the CEOANN variable in ExecuComp equals “CEO”. Ln(Firm size) is the log of market equity at the beginning of the fiscal year. Leverage is long-term debt divided by total assets. Div yield is the sum of dividends over the year divided by market equity. $$q$$ is Tobin’s q, and CDF($$\sigma^{2}_{\it ret}$$) is the cumulative distribution function of the variance of returns for firms in our sample, following Aggarwal and Samwick (1999). If the % shares owned is missing, it is set to zero, and we include a dummy variable equal to one if % shares owned is missing. We do the same for Option PFP. Coefficients on these dummy variables are not reported for brevity. All independent variables are standardized. All specifications include year effects. Standard errors are clustered at the firm level. ** and * indicate statistical significance at the 1% and 5% level, respectively. 
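The treatment of missing ownership and Option PFP values described in the notes to Table 6 (set the missing value to zero and add an indicator for missingness) can be sketched as follows. The function name `fill_with_indicator` and the sample values are hypothetical, and the sketch takes no position on whether the fill happens before or after the variables are standardized, since the notes do not say.

```python
def fill_with_indicator(values):
    """Replace missing entries (None) with 0.0 and flag them.

    Returns (filled, missing), where missing[i] is 1 if values[i] was None
    and 0 otherwise. Both lists can then enter a regression together.
    """
    filled = [0.0 if v is None else float(v) for v in values]
    missing = [1 if v is None else 0 for v in values]
    return filled, missing


# Hypothetical fractional ownership values with two missing entries.
pct_shares_owned = [0.012, None, 0.003, None]
owned_filled, owned_missing = fill_with_indicator(pct_shares_owned)

print(owned_filled)
print(owned_missing)
```

Keeping the indicator in the specification lets observations with missing values stay in the sample while the dummy absorbs the average difference for those rows.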
The odds ratios reported in Table 5 show that backfilled firms generally have higher Tobin’s q, higher stock returns, and lower return variances. From Column 1, a firm with Tobin’s q that is one standard deviation above the mean has odds of being backfilled that are 1.19 times those of a firm with the average $$q$$. A firm with a one-standard-deviation higher variance (given by CDF($$\sigma^{2}_{\it ret}$$)) has roughly half the odds of being backfilled relative to the firm with average variance. Consistent with our univariate results, backfilled firms also tend to have higher subsequent returns. For example, a firm with a stock return over year $$t+1$$ that is one standard deviation above the mean has odds of being backfilled that are 1.45 times as high (where $$t$$ is the year of the compensation observation).

Table 6 focuses on executive-year observations, and the specifications include as explanatory variables details of the executive’s compensation, including bonus, stock grants, the Black-Scholes value of option grants, and pay-for-performance sensitivity of option grants [Yermack (1995)]. We also include executive ownership, an indicator variable for executives serving as CEO (CEOANN in ExecuComp), and firm characteristics. Consistent with the univariate analysis, Salary is significantly lower for the backfilled observations, as shown in Columns 1 and 6. However, Columns 2 through 5 show that this difference is driven by manager-level backfilling. Backfilled observations tend to have larger stock grants, although the economic magnitude of the effect is low. We also see that backfilling is associated with higher pay-for-performance sensitivity, that backfilled executives tend to hold a higher percentage of the firm’s shares, and that CEOs are less likely to be backfilled. Examination of firm-level predictors indicates that firms with high stock returns and low return variance (CDF($$\sigma^{2}_{\it ret}$$)) are much more likely to be backfilled. Comparison of coefficients within a particular column indicates that the level and volatility of firm performance are among the strongest predictors of backfilling. These results are consistent with our earlier discussion regarding potential biases induced by systematically backfilling data for managers of firms that have exhibited strong firm performance.

3. The Impact of Backfilled Data on the Estimation of Compensation Metrics

A large number of studies examine the relations between executive compensation and other variables of interest. Just a few examples show the wide spectrum of academic interest in the area (some of these studies employ the ExecuComp database and others do not): broad issues about incentives and contracting (e.g., Jensen and Murphy 1990; Yermack 1995; Gillan, Hartzell, and Parrino 2009), compensation incentives and the reporting of financial information (e.g., Burns and Kedia 2006; Bergstresser and Philippon 2006), and governmental regulation and compensation contracts (e.g., Perry and Zenner 2001). To highlight the potential effects of backfilling, we replicate several common tests used in the corporate finance literature.

3.1 The level of compensation

Much research has been dedicated to explaining the level of executive pay. For example, several papers provide theoretical and empirical evidence that compensation is related to firm size (e.g., Murphy 1985; Baker and Hall 2004; Murphy and Zábojník 2004; Gabaix and Landier 2008). Given the differences in firm size and executive compensation between backfilled and nonbackfilled observations shown in the previous section, it is likely that backfilling affects the estimated relations between these variables. Similarly, other studies combine compensation data with size and additional firm characteristics in order to construct estimates of abnormal pay (e.g., Smith and Watts 1992; Core, Holthausen, and Larcker 1999; Murphy 1999; Core, Guay, and Larcker 2008; Gillan, Hartzell, and Parrino 2009). 
To demonstrate one such approach using total compensation (TDC1) as the relevant level of compensation, abnormal compensation can be computed as actual compensation minus its expected value from the following prediction equation:

\begin{align*}
\ln(\textit{TDC1}_{t}) &= {\rm b}_{1} \ln(\textit{Tenure}_{t}) + {\rm b}_{2} \ln(\textit{Sales}_{t-1}) + {\rm b}_{3} \textit{S\&P500}_{t} + {\rm b}_{4} \ln(\textit{BTM}_{t-1}) \\
&\quad + {\rm b}_{5} \textit{ROA}_{t} + {\rm b}_{6} \textit{ROA}_{t-1} + {\rm b}_{7} \textit{RET}_{t} + {\rm b}_{8} \textit{RET}_{t-1} + {\rm b}_{9} \textit{CEO}_{t},
\end{align*}

where Tenure$$_{t}$$ is the number of years at time $$t$$ that the executive has been with the company, Sales$$_{t-1}$$ is the company’s lagged annual sales, S&P500$$_{t}$$ is an indicator variable set to one if the firm is in the S&P 500 index and zero otherwise, BTM$$_{t-1}$$ is the lagged value of book equity over market equity, ROA is earnings before interest and taxes (EBIT) divided by the firm’s assets, RET is the firm’s stock return, and CEO is an indicator set to one if the executive is the CEO (as identified by the CEOANN variable in ExecuComp).18

Table 7 shows the results of computing abnormal compensation for our full sample (Column 1) and the nonbackfilled sample (Column 2). Below the regression output, we summarize both the fitted values and error terms from each respective regression, that is, normal and abnormal compensation. We see significant differences in both normal and abnormal compensation across the backfilled and nonbackfilled observations. On average, expected or normal TDC1 is $${\$}$$1,759,000 for the full sample, but only $${\$}$$916,000 for backfilled observations. The average error term is zero by construction for the full sample, but when splitting the sample by Back_Total, we see that backfilled observations have negative prediction errors on average, and nonbackfilled data have positive errors.
The difference between the two is highly statistically significant ($$t$$-statistic of 5.14).

Table 7 Abnormal compensation

Dependent variable: ln(TDC1)$$_{t}$$

| Sample: | All data | Back_Total$$=$$0 |
| --- | --- | --- |
| | (1) | (2) |
| ln(Tenure)$$_{t}$$ | –0.008 | –0.010 |
| | (–0.76) | (–0.92) |
| ln(Sales)$$_{t-1}$$ | 0.31** | 0.32** |
| | (25.3) | (25.0) |
| S&P 500$$_{t}$$ | 0.37** | 0.37** |
| | (9.14) | (8.96) |
| ln(BTM)$$_{t}$$ | –0.22** | –0.23** |
| | (–10.9) | (–10.9) |
| ROA$$_{t}$$ | –0.021 | –0.17* |
| | (–0.22) | (–1.95) |
| ROA$$_{t-1}$$ | –0.090 | 0.003 |
| | (–1.02) | (0.03) |
| Ret$$_{t}$$ | 0.066** | 0.067** |
| | (4.44) | (4.38) |
| Ret$$_{t-1}$$ | 0.11** | –0.13** |
| | (8.00) | (–8.58) |
| CEO$$_{t}$$ | 0.84** | 0.83** |
| | (49.1) | (48.4) |
| Mean predicted TDC1, in thousands | | |
| Full sample | 1,759 | |
| Backfilled obs. | 916 | |
| Nonbackfilled obs. | 1,826 | 1,846 |
| Mean abnormal TDC1 (actual minus predicted TDC1), in thousands | | |
| Full sample | 915 | |
| Backfilled obs. | 279 | |
| Nonbackfilled obs. | 966 | 945 |
| Paired $$t$$-test for difference among nonbackfilled obs. across samples | | (21.35) |
| Mean abnormal log compensation: actual ln(TDC1) minus predicted ln(TDC1), in thousands | | |
| Full sample | 0.0000 | |
| Backfilled obs. | –0.1389 | |
| Nonbackfilled obs. | 0.0110 | 0.0000 |

This table reports coefficients from OLS specifications of executive observations from ExecuComp for fiscal years 1994 through 2005. The dependent variable is the natural log of TDC1. Column 1 uses all data, and Column 2 restricts the sample to nonbackfilled observations (Back_Total$$=$$0). Tenure is the time in years since the executive started at the firm and is obtained from ExecuComp. Sales is annual sales from Compustat. S&P 500 equals one if the firm is in the S&P 500 index, and zero otherwise. BTM is book equity over market equity, ROA is EBIT over assets, RET is firm stock return, and CEO is an indicator set to one if the CEOANN variable in ExecuComp equals “CEO.” All regressions include year and two-digit SIC effects. Below the regression coefficients we report mean values of the fitted values (“normal” compensation) and residuals (“abnormal” compensation). Abnormal compensation is computed as actual TDC1 minus the predicted value of TDC1 from the regression. Standard errors are clustered at the firm level. ** and * indicate statistical significance at the 1% and 5% level, respectively.

The results in Table 7 are important for at least three reasons. First, the results suggest that the addition of backfilled data leads to upwardly biased estimates of abnormal compensation for the nonbackfilled observations on average.19 Thus, not correcting for the backfilled data can lead one to conclude that a large portion of the ExecuComp sample appears to be “overpaid” relative to the group of managers that were backfilled, which exhibited stronger performance and lower compensation levels.
This result is very strong statistically; as shown in the table, a paired $$t$$-test of the residuals for the nonbackfilled observations across the full and clean samples resoundingly rejects the null hypothesis of no difference (with a $$t$$-statistic of more than 21). Second, while including backfilled data has a modest effect on abnormal compensation economically (roughly $${\$}$$21,000, on average), the small difference in means masks some very large differences for specific observations. For example, John Menzer, an executive at Wal-Mart, received compensation in 2003 that is $${\$}$$402,000 higher than that predicted when using all of ExecuComp, suggesting he may have been overpaid. However, after excluding backfilled data, his compensation is actually lower than the predicted value by $${\$}$$49,000, suggesting he was underpaid. Thus, backfilled data can affect whether a particular executive appears to be over- or underpaid, which can be quite important for compensation consultants or shareholder activists, and potentially important for researchers. Third, in addition to introducing an upward bias in the typical estimated abnormal compensation, the inclusion of backfilled firms will also change the distribution of abnormal compensation for different subsamples. While an investigation of changes in relations with all possible covariates of abnormal compensation is beyond the scope of this study, we consider Tobin’s $$q$$ as a plausible representative variable of interest (e.g., as a way to examine the relation between firm performance or valuation and excess compensation). We find that in a regression of Tobin’s $$q$$ on abnormal compensation and a set of control variables, there is a negative and significant coefficient on the portion of abnormal compensation estimated erroneously due to backfilling.
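The mechanics behind these three points can be illustrated with a minimal sketch on synthetic data (not ExecuComp). The one-regressor pay equation, the sample size, the 15% backfilling share, and the 0.4 log-point pay discount for backfilled observations are all illustrative assumptions, and the helper names are hypothetical. The sketch only shows how re-estimating the pay equation on the clean sample shifts the residuals of the same nonbackfilled observations.

```python
import random
from statistics import mean


def fit_ols(x, y):
    """One-regressor OLS of y on x with an intercept; returns (intercept, slope)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def residual(coef, x, y):
    """Abnormal (log) pay: actual value minus fitted value."""
    intercept, slope = coef
    return y - (intercept + slope * x)


random.seed(0)
sample = []  # tuples of (ln_sales, ln_pay, backfilled); all values are synthetic
for _ in range(2000):
    backfilled = random.random() < 0.15
    ln_sales = random.gauss(7.0, 1.5)
    # Assumption: backfilled executives earn 0.4 log points less for a given firm size.
    ln_pay = 4.0 + 0.3 * ln_sales + random.gauss(0.0, 0.5) - (0.4 if backfilled else 0.0)
    sample.append((ln_sales, ln_pay, backfilled))

clean = [obs for obs in sample if not obs[2]]
full_fit = fit_ols([o[0] for o in sample], [o[1] for o in sample])
clean_fit = fit_ols([o[0] for o in clean], [o[1] for o in clean])

# Residuals of the same nonbackfilled observations under each fitted model.
full_resid = [residual(full_fit, o[0], o[1]) for o in clean]
clean_resid = [residual(clean_fit, o[0], o[1]) for o in clean]
back_resid = [residual(full_fit, o[0], o[1]) for o in sample if o[2]]

# Observations that look "overpaid" in the full sample but "underpaid" in the clean one.
sign_flips = sum(1 for f, c in zip(full_resid, clean_resid) if f > 0 > c)

print(f"nonbackfilled mean residual, full-sample fit:  {mean(full_resid):+.4f}")
print(f"nonbackfilled mean residual, clean-sample fit: {mean(clean_resid):+.4f}")
print(f"backfilled mean residual, full-sample fit:     {mean(back_resid):+.4f}")
print(f"sign flips among nonbackfilled observations:   {sign_flips}")
```

Under these assumptions the nonbackfilled residuals average above zero when the model is fit on all data and average zero when it is fit on the clean sample, and some individual observations switch sign, which mirrors the pattern reported in Table 7.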
Next, we consider a related question, the relation between abnormal incentive compensation and future firm performance, as another way to demonstrate the potential impact of backfilled data on economic and statistical inferences.

3.2 Incentive compensation and future returns

One of the classic questions in corporate governance is the degree to which an executive’s compensation affects future firm performance. The literature has made two basic arguments about the direction of this relationship. Some authors have argued (with supporting evidence) that compensation structures can provide management with incentives that lead to higher future firm performance (e.g., Abowd 1990; Hayes and Schaefer 2000; Minnick, Unal and Yang 2011). In contrast, other authors have argued and presented evidence that due to agency issues, greater executive incentive compensation leads to lower subsequent firm performance (e.g., Core, Holthausen, and Larcker 1999; Bebchuk, Cremers, and Peyer 2011; Cooper, Gulen, and Rau 2014). For example, Cooper, Gulen, and Rau (2014) find that the firms with the highest abnormal incentive compensation for their executives have significantly lower future returns. These results not only have had potential academic impact but they also have garnered significant attention in the popular press.20

The relation between abnormal incentive compensation and future firm performance is one that would be particularly susceptible to the backfilling bias that we have identified. Thus, we examine this link using two samples (the total sample and the sample without the backfilled observations) and then compare the differences between them. Specifically, we place the sample firms into deciles each year based on their excess incentive compensation, measured as the excess compensation over control firms matched on industry (two-digit SIC) and size (measured by sales).
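The annual decile sort just described can be sketched as follows. The record layout (year, excess incentive compensation, next-year abnormal return) and the helper names are hypothetical, and the matching to control firms and benchmark portfolios is assumed to have been done upstream. The sketch also assumes enough firms per year for the top and bottom deciles to be nonempty.

```python
from collections import defaultdict
from statistics import mean


def assign_deciles(values):
    """Label each value with its decile rank: 1 = lowest, 10 = highest."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    deciles = [0] * len(values)
    for rank, i in enumerate(order):
        deciles[i] = rank * 10 // len(values) + 1
    return deciles


def high_minus_low(records):
    """records: iterable of (year, excess_comp, next_year_abnormal_return).

    Sorts firms into deciles within each year, pools the next-year abnormal
    returns of the bottom and top deciles, and returns (low, high, high - low).
    """
    by_year = defaultdict(list)
    for year, excess_comp, abnormal_ret in records:
        by_year[year].append((excess_comp, abnormal_ret))

    low, high = [], []
    for firms in by_year.values():
        deciles = assign_deciles([comp for comp, _ in firms])
        for (_, ret), decile in zip(firms, deciles):
            if decile == 1:
                low.append(ret)
            elif decile == 10:
                high.append(ret)
    return mean(low), mean(high), mean(high) - mean(low)


# Toy input: ten hypothetical firms in each of two years.
toy = [(year, float(i), 0.10 - 0.01 * i) for year in (2000, 2001) for i in range(10)]
low_ret, high_ret, spread = high_minus_low(toy)
print(low_ret, high_ret, spread)
```

Running the same function once on all records and once on records with backfilled observations removed gives the two high-minus-low spreads that are being compared.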
We then measure each firm’s cumulative abnormal return in the following year as its return in excess of the average return of an industry and lagged-return matched portfolio. Table 8 reports the results for the lowest and highest deciles of firms sorted by their excess CEO incentive compensation. Although we have a different sample period from that of Cooper, Gulen, and Rau (2014), our results for the full sample of firms are similar: We find the firms with the highest excess incentive compensation have lower future abnormal returns, a result that is significant at the 10% level. In contrast, if we eliminate the backfilled firms from the sample, the high-minus-low difference is roughly half as large and statistically insignificant; that is, we find no significant relation between abnormal incentive compensation and future firm performance.

Table 8 CEO incentive compensation and future firm performance

Abnormal stock return in year $$t+1$$

| | Full sample | Nonbackfilled sample |
| --- | --- | --- |
| | (1) | (2) |
| Lowest decile abnormal incentive comp in year $$t$$ | 0.046*** | 0.047*** |
| | (5.35) | (5.33) |
| Highest decile abnormal incentive comp in year $$t$$ | 0.021** | 0.034*** |
| | (2.09) | (3.26) |
| High - low | –0.025* | –0.013 |
| | (–1.89) | (–0.94) |

This table reports the results of tests between the abnormal performance of the lowest and highest decile of firms sorted by their excess CEO incentive compensation. The table reports the cumulative abnormal returns in the year after the compensation ranking. Excess incentive compensation is measured as the excess compensation over control firms matched on industry and size. The cumulative abnormal returns are calculated in excess of the average return of an industry and lagged-return matched portfolio. Standard errors are clustered at the firm level.
***, **, and * indicate statistical significance at the 1%, 5%, and 10% level, respectively.

This example shows that taking a data sample and adding a large mass of firms with low levels of unexplained incentive compensation and high future performance, in which the firms were selected because of their strong performance, can distort the interpretation of the relation between compensation and performance.

3.3 Pay-for-performance sensitivity

Since Jensen and Murphy (1990), researchers have often focused on the level of pay-for-performance sensitivity and whether it is sufficiently high. For example, Hall and Liebman (1998) and Murphy (1999) analyze the level and drivers of pay-for-performance sensitivity. Yermack (1995) examines the pay-for-performance sensitivity derived from option compensation. Studies of the level of international pay-for-performance sensitivity often compare their results to findings in the U.S. market based on ExecuComp data (e.g., Fernandes et al. 2013). Researchers have also examined changes in the levels of pay-for-performance sensitivity of executives after major macroeconomic events such as the Sarbanes-Oxley Act or the financial crisis (e.g., Chen, Jeter, and Yang 2015; Fahlenbrach and Stulz 2011). Additional work has tested whether the magnitude of a CEO’s pay-for-performance sensitivity affects corporate decisions (e.g., Minnick, Unal and Yang 2011).

We employ two measures of executive compensation for the pay-for-performance sensitivity estimations. The first dependent variable we use is total direct compensation, TDC1. The second measure includes total direct compensation plus stock and option ownership and is defined as TDC1$$+\Delta$$(value of shares and options owned). The change in the value of shares equals the share price times the number of shares owned multiplied by the stock return.
Similarly, the change in value of options owned is estimated as the sum across all options in the manager’s portfolio of the Black-Scholes value of the option times an estimate of the option delta multiplied by the stock return.21

3.3.1 Pay-for-performance sensitivity and firm size

Given that the backfilled firms in ExecuComp tend to have higher performance than the firms that are not backfilled, and that prior research has shown the sensitivity of pay to performance and the structure of compensation to be a function of stock returns, we expect backfilling in ExecuComp to generate a variety of poorly estimated and perhaps even spurious relations.22 For example, prior work has concluded that pay-for-performance sensitivity is significantly decreasing in firm size (Schaefer 1998; Baker and Hall 2004). However, these studies used ExecuComp data, and this relation could be largely driven by the backfilling bias.

Consider a simple example in which there are four types of firms: small firms with high returns (Small/High); small firms with low returns (Small/Low); large firms with high returns (Large/High); and large firms with low returns (Large/Low). If one further assumes that pay-for-performance sensitivity is an increasing function only of returns and not size, for example, because firms with higher expected returns grant more options, the Small/High and Large/High groups will have high pay-for-performance sensitivity, while Small/Low and Large/Low will have low pay-for-performance sensitivity. In this example, pay-for-performance sensitivity is solely a function of returns and there is no true relation between pay-for-performance sensitivity and size. As long as all four sets of firms are included in the sample for a regression analysis, the econometrician would find no relation with size.
However, researchers employing ExecuComp would tend to observe the large firms due to their presence in the index (Large/Low and Large/High), but also a nonrandom sample of small firms, as the backfilled firms would tend to be from the Small/High group. Thus, including the backfilled sample, i.e., using Large/Low, Large/High, and Small/High in the analysis, would overestimate the level of pay-for-performance sensitivity, and would lead the econometrician to find a negative relation between pay-for-performance sensitivity and firm size when in fact none exists. If the backfilled observations were removed, then the estimated average pay-for-performance sensitivity would equal the population average, and one would find no relation between pay-for-performance sensitivity and size.

We test for the presence of such a bias by examining the relation between pay-for-performance sensitivity and size. Following Schaefer (1998), in Table 9, we report pay-for-performance sensitivity estimates from fixed effect regressions, in which we split the sample into small and large stocks based on lagged market value of equity. Panel A provides the results for CEOs, and panel B provides the results for all executives. The first two columns in each panel employ all observations available, while the latter two exclude backfilled observations. We find that the relation between pay-for-performance sensitivity and size differs depending on whether all of the observations are included or the sample is free of backfilled firms.

Table 9 Pay-for-performance sensitivity and firm size

Dependent variable: TDC1 $$+ \Delta$$ (value of stock and option ownership)

| | Full sample: Small | Full sample: Large | Excluding backfilled obs: Small | Excluding backfilled obs: Large |
| --- | --- | --- | --- | --- |
| | (1) | (2) | (3) | (4) |
| A. CEOs | | | | |
| $$\Delta_{t}$$shrwealth | 11.85 | 0.027 | 2.35** | 0.026 |
| | (1.79) | (0.50) | (6.29) | (0.48) |
| $$\Delta_{t-1}$$shrwealth | 3.25 | 0.10 | 0.53** | 0.11 |
| | (1.56) | (1.35) | (2.78) | (1.36) |
| Observations | 4,743 | 5,607 | 4,435 | 5,145 |
| R-squared | 0.362 | 0.041 | 0.145 | 0.041 |
| Diff-in-diff | 9.5* | | | |
| | (1.87) | | | |
| Bootstrapped diff-in-diff (average) | 0.47 | | | |
| Bootstrapped $$t$$-stat (average) | (–0.10) | | | |
| Fraction of bootstrapped diff-in-diff above 9.5 | 3.3% | | | |
| Fraction of bootstrapped $$t$$-stats above 1.87 | 0.2% | | | |
| B. All executives | | | | |
| $$\Delta_{t}$$shrwealth | 2.99 | –0.001 | 0.87** | –0.001 |
| | (1.51) | (–0.04) | (8.78) | (–0.07) |
| $$\Delta_{t-1}$$shrwealth | 0.82 | 0.075** | 0.32** | 0.076** |
| | (1.52) | (3.48) | (5.72) | (3.46) |
| Observations | 24,642 | 29,099 | 23,278 | 27,068 |
| R-squared | 0.104 | 0.045 | 0.088 | 0.044 |
| Diff-in-diff | 2.13 | | | |
| | (1.07) | | | |
| Bootstrapped diff-in-diff (average) | 0.11 | | | |
| Bootstrapped $$t$$-stat (average) | (–0.06) | | | |
| Fraction of bootstrapped diff-in-diff above 2.13 | 5.6% | | | |
| Fraction of bootstrapped $$t$$-stats above 1.07 | 2.4% | | | |

This table reports the results of OLS regressions of pay-for-performance sensitivities based on firm size, using ExecuComp compensation data from 1994 through 2005. Panel A uses data on CEOs where CEOs are identified using the CEOANN variable in ExecuComp if available, otherwise the executive with highest total compensation in that firm-year is identified as the CEO. Panel B uses data on all executives. The dependent variable is TDC1 plus the estimated change in value of shares and options owned by the executive. $$\Delta_{t}$$shrwealth is beginning of fiscal year market equity times the firm’s real return (stock return minus CPI return) in millions. Columns one and three (two and four) use data from firms with market equity below (above) the median. Columns one and two use the full sample of firms and columns three and four exclude backfilled observations. All regressions include a control for return volatility (CDF($$\sigma^{2}_{\it ret}$$)) and executive and year fixed effects. At the bottom of each panel we compare the difference in the coefficient on $$\Delta_{t}$$shrwealth between Columns 1 and 2 with the difference between Columns 3 and 4, and compare this estimated difference-in-differences to bootstrapped estimates and $$t$$-statistics. Standard errors are clustered at the firm level. ** and * indicate statistical significance at the 1% and 5% level, respectively.
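The four-group thought experiment from the start of this subsection can be checked numerically. The sketch below is illustrative only: the sensitivity values (3.0 for high-return firms, 1.0 for low-return firms) and the equal weight given to each group are assumptions, and the names are hypothetical.

```python
from statistics import mean

# Hypothetical sensitivities: PPS depends on returns only, never on size.
PPS_BY_RETURNS = {"high": 3.0, "low": 1.0}

population = [
    {"size": size, "returns": returns, "pps": PPS_BY_RETURNS[returns]}
    for size in ("small", "large")
    for returns in ("high", "low")
]


def size_gap(firms):
    """Mean PPS of large firms minus mean PPS of small firms."""
    large = mean(f["pps"] for f in firms if f["size"] == "large")
    small = mean(f["pps"] for f in firms if f["size"] == "small")
    return large - small


# Sample that omits Small/Low, as when only strong small firms are backfilled.
selected = [
    f for f in population
    if not (f["size"] == "small" and f["returns"] == "low")
]

print("all four groups:  gap =", size_gap(population),
      " mean PPS =", mean(f["pps"] for f in population))
print("Small/Low absent: gap =", size_gap(selected),
      " mean PPS =", mean(f["pps"] for f in selected))
```

With all four groups the size gap is zero. Once Small/Low is missing, the gap turns negative and the average sensitivity rises, even though size plays no role in how the sensitivities were assigned.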
At the bottom of each panel, we report results on the statistical significance of the difference-in-differences (diff-in-diff), that is, the difference in the coefficient on $$\Delta_{t}$$shrwealth between small and large firms within the full sample versus the same difference within the sample that excludes backfilling. We bootstrap the difference-in-differences, and report the fraction of the bootstrapped sample that lies above our estimated difference-in-differences.23 These results demonstrate that the estimated pay-for-performance sensitivity-size relation is about five times larger in magnitude when one uses all observations versus the estimates obtained using only nonbackfilled firms. The difference is statistically and economically significant, and consistent with bias resulting from backfilling and forward-looking compensation.

3.3.2 Pay-for-performance sensitivity and firm risk

Another cross-sectional determinant of pay for performance is firm risk; Aggarwal and Samwick (1999) find that executive pay is less sensitive to firm performance when that performance is highly volatile. We follow their approach and test whether pay-for-performance sensitivity is decreasing in the cumulative distribution function of variance of firm returns, CDF($$\sigma^{2}_{\it ret}$$), which equals the firm’s percentile of return variation among all firms in the sample. Values of zero and one correspond to firms with the minimum and maximum dollar return variation, respectively.

To estimate pay-for-performance sensitivities, we regress one of our two measures of the change in CEO wealth (measured in thousands) on contemporaneous and lagged changes in shareholder wealth (measured in millions). We test the conditional relation with firm risk using an interaction term. We further condition all of these relations on backfilling by interacting each variable with an indicator variable that takes the value of one if the observation is backfilled.
Specifically, we estimate the following regression using an ordinary least-squares (OLS) framework with manager and time fixed effects:

\begin{align*}
\Delta(\textit{Executive wealth})_{t} &= {\rm a} + {\rm b}_{1}(\Delta\textit{Shareholder wealth})_{t} + {\rm b}_{2}(\Delta\textit{Shareholder wealth})_{t-1} \\
&\quad + {\rm b}_{3}(\Delta\textit{Shareholder wealth})_{t} * \textit{Back} + {\rm b}_{4}(\Delta\textit{Shareholder wealth})_{t-1} * \textit{Back} \\
&\quad + {\rm b}_{5}\,\textit{CDF}(\sigma^{2}_{\it ret}) + {\rm b}_{6}\,\textit{CDF}(\sigma^{2}_{\it ret}) * (\Delta\textit{Shareholder wealth})_{t} \\
&\quad + {\rm b}_{7}\,\textit{CDF}(\sigma^{2}_{\it ret}) * (\Delta\textit{Shareholder wealth})_{t} * \textit{Back},
\end{align*}

where Back is one of our indicator variables for backfilling.24 The backfilling indicator variables used in these regressions are Back_Total, Back_Total_Index, Back_Total_Other, and Back_Total_Manager, indicating all backfilling, index, other, and manager backfilling (respectively) for observations in which the TDC1 variable is backfilled. The primary independent variable in these regressions, the change in shareholder wealth, is computed as market equity measured at the beginning of the firm’s fiscal year multiplied by the firm’s real return over the year (the stock return minus the percentage change in the consumer price index, or CPI). Table 10 shows the results.
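As a minimal sketch of how the regressors in this specification could be assembled for one observation, the helpers below follow the seven terms of the displayed equation in order. The function and argument names are hypothetical, and the fixed effects are omitted, as are the backfilling main effect and its interaction with the risk percentile that Table 10 also reports.

```python
def shareholder_wealth_change(market_equity_mm, stock_return, cpi_change):
    """Beginning-of-year market equity (in $ millions) times the real return."""
    return market_equity_mm * (stock_return - cpi_change)


def interaction_regressors(d_wealth_t, d_wealth_lag, risk_cdf, back):
    """Right-hand-side terms for one observation, ordered as b1 through b7.

    d_wealth_t, d_wealth_lag: current and lagged changes in shareholder wealth
    risk_cdf: the firm's return-variance percentile, between 0 and 1
    back: 1 if the observation is backfilled, 0 otherwise
    """
    return [
        d_wealth_t,                    # b1
        d_wealth_lag,                  # b2
        d_wealth_t * back,             # b3
        d_wealth_lag * back,           # b4
        risk_cdf,                      # b5
        risk_cdf * d_wealth_t,         # b6
        risk_cdf * d_wealth_t * back,  # b7
    ]


# Hypothetical firm: $1,000 million market equity, 10% return, 3% inflation.
d_wealth = shareholder_wealth_change(1000.0, 0.10, 0.03)
row = interaction_regressors(d_wealth, 50.0, 0.5, back=1)
print(d_wealth, row)
```

Setting `back=0` zeroes out the three backfilling interactions, so the remaining coefficients describe the nonbackfilled observations.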
Table 10 Pay-for-performance sensitivity and firm risk

Dependent variable: TDC1 in Columns 1 through 3; TDC1$$+\Delta$$(value of stock & option ownership) in Columns 4 through 6

| | (1) | (2) | (3) | (4) | (5) | (6) |
| --- | --- | --- | --- | --- | --- | --- |
| $$\Delta_{t}$$shrwealth | 1.40** | 1.12** | 1.12** | 5.83** | 3.55** | 3.43** |
| | (5.97) | (4.63) | (4.64) | (15.8) | (9.66) | (9.48) |
| $$\Delta_{t-1}$$shrwealth | 0.10** | 0.098** | 0.098** | 0.10** | 0.11** | 0.11** |
| | (8.76) | (8.56) | (8.56) | (5.52) | (6.02) | (6.10) |
| CDF($$\sigma^{2}_{\it ret}$$) | 4,357** | 5,528** | 5,239** | 4,541** | 7,770** | 7,264** |
| | (4.25) | (5.21) | (4.93) | (3.17) | (5.47) | (5.19) |
| $$\Delta_{t}$$shrwealth * CDF($$\sigma^{2}_{\it ret}$$) | –1.42** | –1.14** | –1.14** | –5.93** | –3.61** | –3.48** |
| | (–5.95) | (–4.61) | (–4.63) | (–15.7) | (–9.60) | (–9.42) |
| Back_Total | | 1,181* | | | –750 | |
| | | (2.10) | | | (–0.94) | |
| Back_Total * $$\Delta_{t}$$shrwealth | | 4.09** | | | 32.9** | |
| | | (4.29) | | | (23.8) | |
| Back_Total * $$\Delta_{t-1}$$shrwealth | | 0.29* | | | –0.35** | |
| | | (2.33) | | | (–2.60) | |
| Back_Total * CDF($$\sigma^{2}_{\it ret}$$) | | –3,351* | | | –1,402 | |
| | | (–2.38) | | | (–0.69) | |
| Back_Total * $$\Delta_{t}$$shrwealth * CDF($$\sigma^{2}_{\it ret}$$) | | –3.38** | | | –33.4** | |
| | | (–3.11) | | | (–21.5) | |
| Back_Total_Mgr | | | 527 | | | 995 |
| | | | (0.41) | | | (0.60) |
| Back_Total_Index | | | 1,645 | | | –5,220** |
| | | | (1.50) | | | (–3.29) |
| Back_Total_Other | | | 1,444 | | | 2,530* |
| | | | (1.95) | | | (2.31) |
| Back_Total_Mgr * $$\Delta_{t}$$shrwealth | | | 3.42 | | | 12.1** |
| | | | (1.12) | | | (3.15) |
| Back_Total_Index * $$\Delta_{t}$$shrwealth | | | 1.49 | | | 67.4** |
| | | | (0.49) | | | (17.4) |
| Back_Total_Other * $$\Delta_{t}$$shrwealth | | | –4.40* | | | –4.34 |
| | | | (–2.18) | | | (–1.28) |
| Back_Total_Mgr * $$\Delta_{t-1}$$shrwealth | | | –0.19 | | | –0.30* |
| | | | (–1.35) | | | (–2.14) |
| Back_Total_Index * $$\Delta_{t-1}$$shrwealth | | | –0.57 | | | –6.57* |
| | | | (–0.28) | | | (–2.23) |
| Back_Total_Other * $$\Delta_{t-1}$$shrwealth | | | 1.94** | | | 0.14 |
| | | | (6.09) | | | (0.15) |
| Back_Total_Mgr * CDF($$\sigma^{2}_{\it ret}$$) | | | –1,523 | | | –764 |
| | | | (–0.60) | | | (–0.23) |
| Back_Total_Index * CDF($$\sigma^{2}_{\it ret}$$) | | | –5,941 | | | 2,129 |
| | | | (–1.69) | | | (0.38) |
| Back_Total_Other * CDF($$\sigma^{2}_{\it ret}$$) | | | –1,739 | | | –5,814 |
| | | | (–0.86) | | | (–1.94) |
| Back_Total_Mgr * $$\Delta_{t}$$shrwealth * CDF($$\sigma^{2}_{\it ret}$$) | | | –3.53 | | | –12.1** |
| | | | (–1.12) | | | (–3.05) |
| Back_Total_Index * $$\Delta_{t}$$shrwealth * CDF($$\sigma^{2}_{\it ret}$$) | | | 3.64 | | | –88.2** |
| | | | (0.65) | | | (–12.0) |
| Back_Total_Other * $$\Delta_{t}$$shrwealth * CDF($$\sigma^{2}_{\it ret}$$) | | | 9.85** | | | 6.28 |
| | | | (3.47) | | | (1.32) |
| $$\chi^{2}$$ | | 11.67 | 10.68 | | 155.10 | 79.62 |
| p($$\chi^{2}$$) | | 0.00 | 0.00 | | 0.00 | 0.00 |
| Observations | 18,609 | 18,609 | 18,609 | 10,563 | 10,563 | 10,563 |
| R-squared | 0.025 | 0.028 | 0.034 | 0.060 | 0.132 | 0.168 |
| Number of executives | 4,412 | 4,412 | 4,412 | 2,959 | 2,959 | 2,959 |

This table reports the results of OLS regressions of pay-for-performance sensitivities, and sensitivities conditional on firm risk, using ExecuComp CEO compensation data from 1994 through 2005. CEOs are identified using the CEOANN variable in ExecuComp if available, otherwise the executive with the highest total compensation in that firm-year is identified as the CEO. The dependent variable in columns one through three is total direct compensation (TDC1) in thousands of dollars paid over fiscal year $$t$$. The dependent variable in columns four through six is TDC1 plus the estimated change in value of shares and options owned by the executive. The change in the value of shares owned is the share price times the number of shares owned multiplied by the stock return over $$t$$. The change in value of options is the sum across all options owned of the Black-Scholes value of options times the estimated option delta multiplied by the stock return over $$t$$. TDC1 and changes in ownership values are in thousands. $$\Delta_{t}$$shrwealth is beginning of fiscal year market equity times the firm’s real return (stock return minus CPI return) in millions. CDF is the cumulative distribution function of the variance of returns for firms in our sample, following Aggarwal and Samwick (1999). Columns two and five include interactions with Back_Total. This indicator variable equals one if TDC1 is backfilled for that manager-year observation, and zero otherwise. Columns three and six include interactions with backfilling indicators that distinguish the type of backfilling: Back_Total_Mgr, Back_Total_Index, and Back_Total_Other. Each regression includes executive and year effects. ** and * indicate statistical significance at the 1% and 5% level, respectively.

The first three columns use TDC1 to measure the change in CEO wealth, and columns 4 through 6 add wealth changes due to stock and option ownership to TDC1.
Columns 2, 3, 5, and 6 include backfilling indicator variables to test for marginal effects of backfilling on the estimation. In Column 1, we do not make the distinction between backfilled and nonbackfilled data. We find that a CEO at the median-risk firm receives $$\$0.69 = 1.40 - 0.5 \times 1.42$$ in total direct compensation for every $$\$1{,}000$$ generated for shareholders (ignoring the lagged effects of performance on pay). This is the estimate that would be obtained by a researcher unaware of backfilling. In Column 2, we distinguish between backfilled and nonbackfilled data using interactions with a backfill indicator variable. We find that using a sample of ExecuComp that excludes backfilled data results in a pay sensitivity estimate of $$\$0.55 = 1.12 - 0.5 \times 1.14$$. Therefore, including backfilled data generates a pay-for-performance sensitivity estimate that is 25% higher than what would be found using a backfill-free sample. As this implies, we find that among backfilled observations, the pay-for-performance sensitivity estimate is statistically significantly higher, at $$\$2.95 = 0.55 + 4.09 - 0.5 \times 3.38$$. In Column 3, we find that most of the backfilling effect is driven by Other backfilling. We find an even larger impact of backfilling on estimates that include wealth effects due to stock and option ownership. In the full sample, PPS for the median firm is estimated to be $$\$2.87 = 5.83 - 0.5 \times 5.93$$ per $$\$1{,}000$$ change in shareholder wealth (Column 4). From Column 5, we see that for nonbackfilled data the median-firm pay-for-performance sensitivity is estimated as $$\$1.75 = 3.55 - 0.5 \times 3.61$$. Therefore, the inclusion of backfilled data generates a pay-for-performance sensitivity estimate that is 64% higher than what one would estimate among a nonbackfilled sample of ExecuComp data, a difference that is both economically and statistically significant. 
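The median-firm arithmetic above can be reproduced with a short sketch. It uses only the coefficients quoted in the text; the function and variable names are illustrative assumptions, not part of the original analysis. Because compensation is measured in thousands of dollars and shareholder wealth in millions, each coefficient reads directly as dollars per $1,000 of shareholder wealth.

```python
def pps_per_1000(b_wealth, b_wealth_x_cdf, cdf=0.5):
    """Pay-for-performance sensitivity (dollars per $1,000 of shareholder wealth)
    evaluated at a given percentile of the return-variance distribution."""
    return b_wealth + cdf * b_wealth_x_cdf


def pct_higher(full, clean):
    """Percentage by which the full-sample estimate exceeds the backfill-free one."""
    return 100.0 * (full / clean - 1.0)


# Coefficients as quoted in the text (hypothetical variable names).
tdc1_full = pps_per_1000(1.40, -1.42)        # Column 1: all data
tdc1_nonback = pps_per_1000(1.12, -1.14)     # Column 2: nonbackfilled baseline
# Backfilled observations: baseline plus the backfill interaction terms.
tdc1_back = tdc1_nonback + pps_per_1000(4.09, -3.38)
wealth_full = pps_per_1000(5.83, -5.93)      # Column 4: TDC1 plus ownership changes
wealth_nonback = pps_per_1000(3.55, -3.61)   # Column 5: nonbackfilled baseline

print(f"TDC1 PPS, full sample:      {tdc1_full:.2f}")
print(f"TDC1 PPS, nonbackfilled:    {tdc1_nonback:.2f}")
print(f"TDC1 PPS, backfilled:       {tdc1_back:.2f}")
print(f"Upward bias, TDC1:          {pct_higher(tdc1_full, tdc1_nonback):.0f}%")
print(f"Upward bias, TDC1 + wealth: {pct_higher(wealth_full, wealth_nonback):.0f}%")
```

Setting `cdf=1.0` instead of the default would evaluate the same expression at the maximum-variance firm.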
These findings suggest that, whether one measures wealth using TDC1 alone or TDC1 plus changes in the value of stock and option ownership, failure to account for backfilling leads to estimates of pay-for-performance sensitivities that are significantly biased upwards. We also test for backfilling-induced biases when estimating the relation between pay-for-performance sensitivity and firm risk. The coefficient in Column 1 on $$\Delta_{t}$$Shrwealth * CDF($$\sigma^{2}_{\it ret}$$) indicates that the pay sensitivity at the maximum-variance firm is lower by $$\$1.42$$ per $$\$1{,}000$$ relative to the sensitivity at the minimum-variance firm. This is the estimate that results from using all data in ExecuComp. After excluding backfilled data we estimate this spread in pay sensitivity to be $$\$1.14$$. This indicates that the inclusion of backfilled data generates an estimate of the conditional effect of firm variance that is 25% higher than what one would obtain using backfill-free data. These differences are again more dramatic once the value of stock and option ownership is included in the CEO wealth calculation ($$-5.93$$ versus $$-3.61$$). In this case, failure to account for backfilling leads to overstating the conditional effect of firm risk on PPS estimates by approximately 64%. At the bottom of the table, we present Wald test statistics for the joint significance of the backfilling interaction terms. Under the null hypothesis (that the interaction terms are jointly zero), backfilling would not introduce a statistically significant bias into the regression. As the results show, we can resoundingly reject the null (at the 0.01 significance level), providing strong statistical evidence of the economically significant bias caused by the backfilled observations across the regression’s coefficients. 
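For reference, a joint test of this kind is commonly computed in the textbook Wald form shown below. This is a sketch under that assumption, since the exact implementation is not described here, and the notation is illustrative: $$\hat{\beta}$$ is the vector of estimated coefficients, $$\hat{V}$$ its estimated covariance matrix, and $$R$$ a $$q \times k$$ selection matrix that picks out the $$q$$ backfilling terms being tested.

```latex
W \;=\; (R\hat{\beta})^{\prime}\,\bigl(R\,\hat{V}\,R^{\prime}\bigr)^{-1}\,(R\hat{\beta})
\;\overset{a}{\sim}\; \chi^{2}_{q}
\qquad \text{under } H_{0}\colon R\beta = 0 .
```

A large value of $$W$$ relative to the $$\chi^{2}_{q}$$ distribution leads to rejecting the null that the selected terms are jointly zero.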
Thus, beyond affecting estimates of abnormal compensation and pay-for-performance sensitivity, we have used firm size and risk as examples to demonstrate that backfilling has the potential to generate economically and statistically significant biases when one examines cross-sectional variation in executive compensation. Next, we focus on the relation between firm value and managerial stock ownership.

3.4 Firm value and managerial ownership

As shown in Tables 4, 5, and 6, backfilling is strongly associated with both managerial ownership (as a percentage of the firm) and Tobin’s q. In light of this, we estimate the impact of using backfilled data to estimate relations between firm value and managerial ownership, as in Himmelberg, Hubbard, and Palia (1999) (hereafter, HHP). To do so, we regress Tobin’s q on total managerial ownership as a percentage of shares outstanding and other firm-level variables. Because this test uses firm-level observations, we redefine the backfilling dummy as one if any manager observation within that firm-year is backfilled, and zero otherwise. We interact this backfilling indicator variable with various measures of firm ownership in a manner consistent with HHP. Specifically, we test two different functional forms of the relation between ownership and firm value. Table 11 reports the results. We include squared ownership as an additional independent variable in columns one through four, nine, and ten, and we employ a spline specification in columns five through eight, eleven, and twelve. All regressions include year fixed effects, and the last four columns include firm fixed effects. The first column shows a positive and significant association between Tobin’s q and managerial ownership, $$m$$. A 10-percentage-point increase in managerial ownership is associated with a 0.21 increase in $$q$$ (the linear coefficient of 2.06 times 0.10). In the second column, we include indicator variables for backfilled data. 
The coefficient on ownership is no longer statistically significant when conditioning on backfilled data. The relation between ownership and $$q$$ among backfilled data is positive and significant (and significantly concave).

Table 11 Firm value and managerial ownership

| | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) | (11) | (12) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| $$m$$ | 2.06** | 0.51 | 0.24 | –1.09 | | | | | 2.13** | 1.28* | | |
| | (3.70) | (0.98) | (0.44) | (–1.79) | | | | | (4.74) | (2.25) | | |
| $$m^{2}$$ | –0.56 | 1.11 | 0.79 | 2.66* | | | | | –0.71 | 0.70 | | |
| | (–0.77) | (0.93) | (1.07) | (2.16) | | | | | (–1.45) | (0.82) | | |
| Back_Salary * $$m$$ | | 2.79** | | 2.27** | | | | | | 1.20* | | |
| | | (3.55) | | (2.89) | | | | | | (2.37) | | |
| Back_Salary * $$m^{2}$$ | | –2.82* | | –2.92* | | | | | | –1.92* | | |
| | | (–2.04) | | (–2.12) | | | | | | (–2.11) | | |
| m1 | | | | | 4.44* | 3.01 | –3.87 | –3.73 | | | 1.15 | 1.59 |
| | | | | | (2.14) | (1.38) | (–1.84) | (–1.65) | | | (0.68) | (0.80) |
| m2 | | | | | 1.37 | 0.22 | 1.64 | 0.31 | | | 2.38** | 1.36 |
| | | | | | (1.38) | (0.27) | (1.71) | (0.40) | | | (3.72) | (1.67) |
| m3 | | | | | 1.43 | 1.51 | 0.80 | 1.27 | | | 1.17** | 1.90** |
| | | | | | (1.84) | (1.47) | (1.04) | (1.36) | | | (2.94) | (2.83) |
| Back_Salary * m1 | | | | | | 2.53 | | –0.22 | | | | –0.67 |
| | | | | | | (0.89) | | (–0.077) | | | | (–0.40) |
| Back_Salary * m2 | | | | | | 2.40 | | 2.61 | | | | 1.74* |
| | | | | | | (1.48) | | (1.64) | | | | (1.99) |
| Back_Salary * m3 | | | | | | –0.35 | | –0.91 | | | | –1.13 |
| | | | | | | (–0.25) | | (–0.69) | | | | (–1.49) |
| Controls | | | Y | Y | | | Y | Y | Y | Y | Y | Y |
| Observations | 18,866 | 18,866 | 16,921 | 16,921 | 18,866 | 18,866 | 16,921 | 16,921 | 16,921 | 16,921 | 16,921 | 16,921 |
| R-squared | 0.021 | 0.024 | 0.206 | 0.207 | 0.022 | 0.024 | 0.206 | 0.207 | 0.165 | 0.166 | 0.166 | 0.166 |
| Number of firms | | | | | | | | | 2,256 | 2,256 | 2,256 | 2,256 |

This table reports estimates of the relation between firm value measured by Tobin’s q and managerial ownership, using firm-level observations from 1994 through 2005. All regressions have year effects, and the last four columns also include firm effects. 
Columns 1–4 and 9–10 use the sum of managerial ownership as a percentage of shares outstanding, $$m$$, and squared ownership, $$m^{2}$$. Columns 5–8 and 11–12 use a spline specification with breakpoints of 5% and 25%, indicated by independent variables m1, m2, and m3. The backfill indicator variable equals one if Back_Salary$$=$$1 for any manager in that firm-year. We report results both with and without a set of control variables. Controls are ln(Sales), ln(Sales) squared, the ratio of property, plant, and equipment (PPE) to Sales, the squared ratio of PPE to Sales, the ratio of operating income to Sales, the standard deviation of idiosyncratic stock-price risk and a dummy equal to one if data are available to calculate this standard deviation, the ratio of R&D to PPE and a dummy equal to one if data are available to calculate this ratio, the ratio of advertising expenditures to PPE and a dummy equal to one if data are available to calculate this ratio, and the ratio of capital expenditures to PPE. These controls are included in columns three, four, and seven through twelve. Standard errors are clustered at the firm level. ** and * indicate statistical significance at the 1% and 5% level, respectively.

Examination of the remaining columns suggests that the effect of backfilling is not quite as straightforward as is suggested by the first two columns. The impact of backfilling depends on the presence of controls for firm characteristics and the specification of the functional form of the relation between ownership and $$q$$. For example, Column 9 includes firm fixed effects as well as additional control variables. The coefficient on ownership is 2.13, and while this again drops after controlling for backfilling in Column 10 (to 1.28), it remains statistically significant. As in the other columns, concavity in the ownership-$$q$$ relation in these specifications is driven by the backfilled data. 
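To make the two constructed regressors concrete, here is a minimal sketch of a piecewise-linear ownership spline with breakpoints at 5% and 25%, together with the firm-level backfill indicator. The exact spline construction is not spelled out in this section, so the form below (segments that sum back to $$m$$) is an assumption, and the function names are hypothetical.

```python
def spline_terms(m, k1=0.05, k2=0.25):
    """Split ownership fraction m into three piecewise-linear segments
    (m1, m2, m3) with breakpoints k1 and k2, so that m1 + m2 + m3 == m."""
    m1 = min(m, k1)                        # ownership up to 5%
    m2 = min(max(m - k1, 0.0), k2 - k1)    # ownership between 5% and 25%
    m3 = max(m - k2, 0.0)                  # ownership above 25%
    return m1, m2, m3


def firm_backfill_flag(manager_flags):
    """Firm-year indicator: 1 if any manager observation in the firm-year
    is flagged as backfilled, 0 otherwise."""
    return int(any(manager_flags))


for m in (0.03, 0.10, 0.30):
    print(m, spline_terms(m))
print(firm_backfill_flag([0, 0, 1, 0]))
```

With this construction, the coefficient on each segment is the slope of $$q$$ in ownership within that range, which is why changes in the segment coefficients can be read as changes in functional form.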
The spline specification offers similar results. Conditioning on backfilling either renders the relation insignificant (Column 5 vs. 6), or alters the magnitude of the effect and its functional form (Columns 11 and 12). These results present another, perhaps even more alarming example of the impact of backfilling. With pay-for-performance, we documented statistically and economically significant differences in estimated coefficients. Here, using the example of the relation between Tobin’s q and ownership, we now find that the inclusion of backfilled data can generate statistically significant coefficients of interest that lose their significance when one controls for backfilled data. Furthermore, these changes in significance can even lead to confusion over the proper functional form of the association.

4. Identifying Backfilling with Readily Available Data

In the previous sections, we have shown that including backfilled data in the sample can produce meaningful differences in estimated coefficients and, consequently, in the interpretation of results. One solution is to use the data we make available for research purposes. These include the identification of backfilled observations, when we estimate the observation was backfilled (useful for purposes of replication), and why we think the observation was backfilled. A drawback of this data set is that it covers only 1994–2005. While the magnitude of backfilling dropped immediately following the SEC’s 2006 compensation disclosure rule change, it appears that the practice has not stopped completely and may have rebounded. In this section, we suggest screens that could reasonably be applied to any time period. A simple way to reduce the bias induced by including backfilled data in estimation is to impose the following two rules: (1) discard manager-year observations for which Salary is available but TDC1 is missing and (2) discard firms that are not and have never been members of the S&P 1500. 
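The two rules can be sketched as a simple row filter. The record layout below is hypothetical (the keys `salary`, `tdc1`, `fyear`, and `sp1500_first_year` are illustrative names, not ExecuComp or Compustat field names), and rule (2) is read as "the firm first entered the S&P 1500 in or before the fiscal year of the observation".

```python
def passes_screen(obs):
    """Return True if a manager-year record survives both screens.

    obs is a dict with hypothetical keys:
      salary, tdc1        -> numbers, or None when missing
      fyear               -> fiscal year of the observation
      sp1500_first_year   -> first year the firm was in the S&P 1500, or None
    """
    # Rule 1: drop observations with Salary available but TDC1 missing.
    if obs["salary"] is not None and obs["tdc1"] is None:
        return False
    # Rule 2: drop firms not in the S&P 1500 in or before this fiscal year.
    first = obs["sp1500_first_year"]
    if first is None or first > obs["fyear"]:
        return False
    return True


sample = [
    {"salary": 400, "tdc1": 2100, "fyear": 1998, "sp1500_first_year": 1995},
    {"salary": 250, "tdc1": None, "fyear": 1998, "sp1500_first_year": 1995},
    {"salary": 300, "tdc1": 1500, "fyear": 1998, "sp1500_first_year": 2001},
    {"salary": 280, "tdc1": 1200, "fyear": 1998, "sp1500_first_year": None},
]
screened = [obs for obs in sample if passes_screen(obs)]
print(len(screened), "of", len(sample), "observations retained")
```

The screen trades coverage for cleanliness: it also drops some observations that are not backfilled, so it is best treated as a robustness check when vintage-based identification is unavailable.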
Out of the 136,684 observations for fiscal years 1994–2005, there are 20,159 observations in which Salary is available but TDC1 is missing. According to our estimation, 86% of these observations are backfilled. Therefore, even if a research question requires only summary compensation data, requiring TDC1 to be available will mitigate the backfilling bias. If the sample is further restricted to executives at firms that were either in the S&P 1500 during the fiscal year to which the compensation data apply or were in the S&P 1500 prior to that fiscal year, then 100% of additions due to backfilling from Other or Index are removed. In addition to removing backfilled data, this screen also removes executives at firms whose inclusion in ExecuComp was initiated for other reasons (but that are not necessarily backfilled). Specifically, this screen removes 16,809 observations, 53% of which are backfilled, leaving a final sample of 99,716 (136,684 − 20,159 − 16,809). Of the remaining 99,716 observations, only 6% of Salary observations and 6% of TDC1 observations are backfilled as the result of manager additions. This can be compared to the unconditional probability that a Salary (TDC1) observation is backfilled, 23% (13%). To better understand the validity of this screening procedure, we examine the mean manager and firm characteristics for the different subsamples. In Columns 1 and 2 of Table 12, we report sample means for nonbackfilled and backfilled samples, respectively, as identified by the screening process discussed in this section. Column 1 reports mean characteristics of manager-year observations in which both Salary and TDC1 are available, and the firm is currently in the S&P 1500 or was included in the index previously. Column 2 reports means for observations in which either TDC1 is missing or the firm is not and has never been in the S&P 1500. 
This distinction between estimating which data are nonbackfilled and backfilled can be compared to results we obtain when making the distinction using the 11 overlapping vintages of ExecuComp (columns three and four). Column 5 reports the difference between data estimated to be nonbackfilled using screens and data determined to be nonbackfilled using the 11 vintages, with its statistical significance.

Table 12 A comparison of alternate methods for identifying backfilling

Columns (1) and (2) are estimated using readily available data; columns (3) and (4) are estimated using overlapping vintages of ExecuComp.

| | Nonbackfilled (1) | Backfilled (2) | Nonbackfilled (3) | Backfilled (4) | Difference (1)–(3) (5) | $$t$$-stat on difference (6) |
|---|---|---|---|---|---|---|
| Salary | 377 | 256 | 379 | 244 | –2 | (–1.70) |
| TDC1 | 2,185 | 1,748 | 2,261 | 1,187 | –76** | (–2.77) |
| Shares owned | 711 | 734 | 786 | 565 | –75 | (–1.15) |
| BS Option Value | 923 | 980 | 1,066 | 486 | –143** | (–6.82) |
| Option PFP | 0.58 | 1.38 | 0.65 | 0.93 | –0.07** | (–5.40) |
| $$q$$ | 1.91 | 2.59 | 2.03 | 2.29 | –0.12** | (–3.17) |
| Mkt value of equity | 6,446 | 4,855 | 6,383 | 4,890 | 63 | (1.70) |
| Return$$_{t}$$ | 0.158 | 0.345 | 0.183 | 0.287 | –0.025** | (–4.75) |
| Return$$_{t+1}$$ | 0.167 | 0.315 | 0.177 | 0.310 | –0.01** | (–2.73) |
| Return$$_{t+2}$$ | 0.152 | 0.230 | 0.148 | 0.244 | 0.004 | (1.54) |
| Return$$_{t+3}$$ | 0.109 | 0.147 | 0.109 | 0.150 | 0.000 | (0.25) |

This table reports mean manager and firm characteristics for subsamples. The first two columns report means for samples in which the distinction between backfilled and nonbackfilled data is made using readily available information from ExecuComp and Compustat. In columns three and four, the distinction is made using 11 overlapping vintages of ExecuComp. All variables are as defined in Table 4. The last column reports $$t$$-statistics (with standard errors clustered at the firm level) for the difference between the two nonbackfilled samples (Column 1 minus Column 3). ** and * indicate statistical significance at the 1% and 5% level, respectively.

Results indicate that the screening procedure does a reasonable job of obtaining a sample that is representative of the nonbackfilled observations. The screening sufficiently controls for differences in salary and ownership, firm size, and to some extent, future firm returns. It is not perfect, as indicated by the differences that remain. Encouragingly, though, results suggest that these screens using readily available data do a reasonably good job of capturing many of the biases in firm and manager characteristics that result from backfilling. Perhaps most importantly, even the remaining statistically significant differences are much smaller in economic magnitude than the uncorrected differences.

5. Conclusions

Standard and Poor’s has often backfilled compensation data when compiling its ExecuComp database. Three events can lead to backfilling: a firm enters the S&P 1500, historical information for a non-S&P 1500 firm is added to the data set, or a manager enters the group of the top five compensated managers within the firm, a set that requires reporting the manager’s compensation in SEC filings. 
Because this backfilling process is nonrandom, it is perhaps not surprising that these data differ significantly from the nonbackfilled data along several dimensions. For example, backfilled observations tend to come from executives at firms with high stock returns and low return volatility. Further, these executives tend to have lower salary and higher option compensation relative to nonbackfilled data. Thus, while the additional data can be helpful for some purposes, we demonstrate that using the full data set can be problematic for other purposes. After examining several examples of compensation-based relations that have been of interest among researchers in financial economics, we find that using the backfilled data biases estimates. For example, we find a pay-for-performance sensitivity estimate among backfilled data that is several times higher than the sensitivity among nonbackfilled data. These differences not only make it difficult to replicate earlier work but can also lead to inappropriate inferences. As one example, we find that a previously documented relation between abnormal incentive compensation and future stock returns is not robust to dropping the backfilled data. We also have documented statistically and economically significant effects due to backfilling for two examples of cross-sectional determinants of managerial incentives, firm size and risk. Perhaps even more worrisome are the results from the example of firm value and managerial ownership, where including backfilled data can make otherwise insignificant coefficients appear statistically significant and can even change the apparent functional form of the relation. While examining the entire literature on executive compensation is obviously beyond the scope of this study, the evidence contained in these examples points to the potential for backfilling to introduce economically and statistically significant biases in empirical work in this area. 
Other types of research could be affected by the backfilling bias as well. For example, structural models based on ExecuComp data that rely on the estimation of executive compensation or managerial ownership could have misleading interpretations. Studies that divide firms into treatment and control groups based on the data and that conduct differences-in-differences tests may yield biased differences. Further, it is important to recognize the backfilling issue for studies that focus on variables other than compensation, but that use ExecuComp data as controls in the analyses. Such studies also will be subject to the effects of the ex post conditioning bias, including potential misrepresentation of relations. To assist researchers in avoiding these issues going forward, we have identified a set of screens that rely only on readily available data and are designed to identify observations that are likely backfilled.

We are grateful to seminar participants and colleagues at Erasmus University, the University of Georgia, University of Pittsburgh, University of Texas at Austin, Texas Tech University, and the European Finance Association; John Bizjak, George Cashman, Alan Crane, Jonathan Cohn, Nicholas Hirschey, David Hirshleifer (the editor), Shane Johnson, Rose Liao, Bradley Paye, Elvira Solji, Chester Spatt, Johan Sulaeman, and Chisen Wei; and the anonymous referees for their helpful comments and feedback. We are also grateful to the staff at S&P for helping us understand their data construction process.

Appendix. Identifying Types of Backfilling

S&P releases several versions of the ExecuComp database each year as it updates data based on firms’ annual proxy statements filed with the Securities and Exchange Commission. Proxy statements must be filed within 120 days of the firm’s fiscal year-end, and most firms have a fiscal year-end of December 31. 
However, because not all firms have fiscal year-end dates in December, S&P releases ExecuComp in April of each year, and then provides updates throughout the year, usually in May, June, and October. Naturally, the October vintage of a particular year’s ExecuComp database is the most complete in terms of coverage for the most recent fiscal year.

The approach we take to identify backfilled data is to examine a number of vintages of ExecuComp and then use overlapping periods of coverage to back out the vintage in which each observation first appears. We use October releases of the ExecuComp database for each year from 1996 through 2006, with the exception of 2002, for which we have a June release. Our strategy for identifying backfilled observations considers the delay in data being entered into the system. That is, we allow for the maximum 120-day period after fiscal year-end that firms have to file their proxies and an additional two-month processing time for proxy data to be added to the database.25 To ensure conservative estimation of backfilling, and to allow for uncertainty in intra-month timing, we add an additional month and allow a full seven months from the firm’s fiscal year-end before we expect the data to be in the database.

The specifics of S&P’s process for additions to the database are as follows. If a new firm is added to the S&P 1500 in year $$t$$, then S&P would collect all option and summary compensation variables for year $$t$$, and backfill data from previous years depending on how long each manager had been in the top-five group: (1) if the manager was in the top five in $$t$$–1 and $$t$$–2, then all option and summary compensation data are backfilled for these years; (2) if the manager was not in the top five in $$t$$–1 and $$t$$–2, then S&P would backfill only summary compensation items for these years. 
If a non-S&P-1500 firm is added to the database, the backfilling would depend in part on the rationale for addition (e.g., a client requests historical data) and in part on data availability. If a new individual enters the set of top-five executives for which compensation is disclosed in year $$t$$, S&P would collect all option and summary compensation data for year $$t$$, and backfill only summary compensation variables for years $$t$$–1 and $$t$$–2.

Given this process, we use the following algorithm to identify the types of backfilling. If the executive-year observation is backfilled, then:

Other:
If the firm as of 2009 has never been in the S&P 1500 index, it is defined as Other backfilled.
If the year of the earliest vintage in which any of that firm’s observations were backfilled is less than or equal to the year in which the firm was added to the index AND the year of the observation is before the year in which the firm was added to the index, then it is defined as Other backfilled.

Index-level:
If the year of the earliest vintage in which any of that firm’s observations were backfilled is one year after the firm was added to the index, it is defined as Firm backfilled.
If the year of the earliest vintage in which any of that firm’s observations were backfilled is two years after the firm was added to the index AND all observations in that firm-year were backfilled, then it is defined as Firm backfilled.

Manager-level:
If the year of the earliest vintage in which any of that firm’s observations were backfilled is at least three years after the firm was added to the index (or exactly two years after and not all of the observations in that firm-year were backfilled), it is defined as Manager backfilled.
If the year the observation was backfilled is after the year of the earliest vintage in which any of that firm’s observations were backfilled, it is defined as Manager backfilled.
If the year of the observation is equal to or greater than the year the firm was added to the index, it is defined as Manager backfilled.

Table A1 summarizes our eleven vintages and the October 2009 release. The first two columns of counts report the number of observations with Salary and TDC1 for each vintage. The next two columns report the number of observations meeting the requirement that the firm’s fiscal year ends at least seven months prior to the vintage month and year.

Table A1 Summary of ExecuComp vintage years

| Vintage year | Vintage month | Full sample: Salary obs. | Full sample: TDC1 obs. | Excluding last seven months of vintage: Salary obs. | Excluding last seven months of vintage: TDC1 obs. |
|---|---|---|---|---|---|
| 1996 | Oct | 35,273 | 29,777 | 34,966 | 29,470 |
| 1997 | Oct | 46,507 | 38,758 | 46,212 | 38,466 |
| 1998 | Oct | 57,705 | 48,102 | 57,408 | 47,807 |
| 1999 | Oct | 69,812 | 59,572 | 69,524 | 59,294 |
| 2000 | Oct | 82,492 | 70,126 | 82,192 | 69,830 |
| 2001 | Oct | 96,020 | 81,289 | 95,726 | 80,996 |
| 2002 | Jun | 100,782 | 85,213 | 98,903 | 83,339 |
| 2003 | Oct | 118,026 | 99,630 | 117,768 | 99,372 |
| 2004 | Oct | 129,584 | 109,451 | 129,332 | 109,199 |
| 2005 | Oct | 140,932 | 119,195 | 140,684 | 118,950 |
| 2006 | Oct | 152,490 | 129,154 | 152,234 | 128,904 |
| 2009 | Oct | 179,761 | 154,624 | 179,761 | 154,624 |

This table shows the number of manager-year observations in each of the twelve vintages of ExecuComp, from 1996 to 2006, plus 2009. The first two columns indicate the year and month of each vintage. 
The third column reports the number of observations in which Salary is available, and the fourth column reports the number of observations in which TDC1 is available. The last two columns summarize the number of observations with fiscal year-end at least seven months prior to the date the vintage was released.

Using overlapping coverage periods from these 11 vintages, plus the October 2009 version of ExecuComp, for each observation we identify which variables are backfilled and the year in which S&P backfilled the data. We do this separately based on two different compensation variables: Salary and TDC1. We define Back_Salary as an indicator variable equal to one if Salary is backfilled and zero otherwise. Similarly, Back_Total is an indicator variable equal to one if TDC1 is backfilled and zero otherwise. Following S&P’s backfilling process, the observations identified as backfilled by Back_Total should be a subset of those identified by Back_Salary.

The specific procedure by which we identify backfilled observations and the vintage in which the observation was backfilled is fairly straightforward. If a firm’s fiscal year-end is at least seven months prior to the release date of a given vintage of ExecuComp, the data should appear in that vintage of the database. If the data instead first appear in a later vintage, then that observation is identified as backfilled. For example, the compensation data of managers at a firm with a fiscal year-end of December 1995 should appear in an October 1996 vintage of the database. If the 1995 compensation data are not in the October 1996 vintage of ExecuComp, but appear in October 1997 or a later vintage, then that observation is identified as backfilled. We repeat this process for each subsequent annual edition of ExecuComp, always excluding the seven months prior to that vintage’s release date. 
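The vintage comparison described above can be sketched as follows. This is an illustrative reading of the procedure rather than the code used for the paper: dates are reduced to hypothetical (year, month) pairs, the vintage list follows the releases named in the Appendix, and the function names are assumptions.

```python
# Vintages named in the Appendix: October releases for 1996-2006, except a
# June release for 2002. Dates are (year, month) pairs.
VINTAGES = sorted([(y, 10) for y in range(1996, 2007) if y != 2002] + [(2002, 6)])


def month_index(year, month):
    """Map a (year, month) pair to a running month count."""
    return year * 12 + (month - 1)


def expected_vintage(fye, vintages=VINTAGES, lag_months=7):
    """Earliest vintage released at least lag_months after fiscal year-end fye."""
    for vintage in vintages:
        if month_index(*vintage) - month_index(*fye) >= lag_months:
            return vintage
    return None  # fiscal year-end too recent for any vintage on hand


def is_backfilled(fye, first_seen, vintages=VINTAGES, lag_months=7):
    """Flag an observation that first appears in a vintage later than the one
    in which it was expected. Returns False when no expected vintage exists,
    since such observations cannot be classified with these vintages."""
    expected = expected_vintage(fye, vintages, lag_months)
    if expected is None:
        return False
    return month_index(*first_seen) > month_index(*expected)


# Example from the text: a December 1995 fiscal year-end is expected in the
# October 1996 vintage; first appearing in October 1997 marks it as backfilled.
print(expected_vintage((1995, 12)))
print(is_backfilled((1995, 12), (1996, 10)))
print(is_backfilled((1995, 12), (1997, 10)))
```

Running the check once with the first vintage in which Salary appears and once with the first vintage in which TDC1 appears would give the two indicators, Back_Salary and Back_Total.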
To identify backfilled data, we do this for each manager-year observation twice, once using the Salary variable and again using the TDC1 variable. In doing so, we identify whether an observation has been backfilled, the type of compensation data that was backfilled, and the year in which the backfilling took place.

References

Aboody, D., and R. Kasznik. 2000. CEO stock option awards and the timing of corporate voluntary disclosures. Journal of Accounting and Economics 29: 73–100.
Abowd, D. 1990. Does performance based managerial compensation affect corporate performance? ILR Review 43 (Special issue): 52S–73S.
Aggarwal, R., and A. Samwick. 1999. The other side of the tradeoff: The impact of risk on executive compensation. Journal of Political Economy 107: 65–105.
Baker, G. P., and B. J. Hall. 2004. CEO incentives and firm size. Journal of Labor Economics 22: 767–98.
Baker, G. P., M. Jensen, and K. J. Murphy. 1988. Compensation and incentive: Practice vs. theory. Journal of Finance 43: 593–616.
Bebchuk, L., M. Cremers, and U. Peyer. 2011. The CEO pay slice. Journal of Financial Economics 102: 199–221.
Bebchuk, L., and J. Fried. 2004. Pay Without Performance. Harvard University Press.
Bergstresser, D., and T. Philippon. 2006. CEO incentives and earnings management. Journal of Financial Economics 80: 511–29.
Burns, N., and S. Kedia. 2006. The impact of performance-based compensation on misreporting. Journal of Financial Economics 79: 35–67.
Cadman, B., S. Klasa, and S. Matsunaga. 2010. Determinants of CEO pay: A comparison of ExecuComp and non-ExecuComp firms. Accounting Review 85: 1511–43.
Chauvin, K., and C. Shenoy. 2001. Stock price decreases prior to executive stock-option grants. Journal of Corporate Finance 7: 53–76.
Chen, H., D. Jeter, and Y. Yang. 2015. Pay-performance sensitivity before and after SOX. Journal of Accounting and Public Policy 34: 52–73.
Conyon, M. J., J. E. Core, and W. R. Guay. 2011. Are US CEOs paid more than UK CEOs? Inferences from risk-adjusted pay. Review of Financial Studies 24: 402–38.
Conyon, M. J., and K. J. Murphy. 2000. The prince and the pauper? CEO pay in the United States and United Kingdom. Economic Journal 110: 640–71.
Cooper, M. J., I. Gulen, and P. R. Rau. 2014. Performance for pay? The relation between CEO incentive compensation and future stock price performance. Working Paper, University of Utah.
Core, J., W. Guay, and D. Larcker. 2008. Power of the pen, and executive compensation. Journal of Financial Economics 88: 1–25.
Core, J., R. Holthausen, and D. Larcker. 1999. Corporate governance, chief executive officer compensation, and firm performance. Journal of Financial Economics 51: 371–406.
Eley, J. 2013. Beware the Torres effect in executive pay. Financial Times, May 4.
Fahlenbrach, R., and R. M. Stulz. 2011. Bank CEO incentives and the credit crisis. Journal of Financial Economics 99: 11–26.
Fernandes, N., M. Ferreira, P. Matos, and K. J. Murphy. 2013. Are U.S. CEOs paid more? New international evidence. Review of Financial Studies 26: 323–67.
Gabaix, X., and A. Landier. 2008. Why has CEO pay increased so much? Quarterly Journal of Economics 123: 49–100.
Gillan, S., J. Hartzell, and R. Parrino. 2009. Explicit vs. implicit contracts: Evidence from CEO employment agreements. Journal of Finance 64: 1629–55.
Grinstein, Y., and P. Hribar. 2004. CEO compensation and incentives: Evidence from M&A bonuses. Journal of Financial Economics 73: 119–43.
Hall, B., and J. Liebman. 1998. Are CEOs really paid like bureaucrats? Quarterly Journal of Economics 112: 653–91.
Hartzell, J., and L. Starks. 2003. Institutional investors and executive compensation. Journal of Finance 58: 2351–74.
Hayes, R., and S. Schaefer. 2000. Implicit contracts and the explanatory power of top executive compensation for future performance. Rand Journal of Economics 31: 273–93.
Himmelberg, C., R. G. Hubbard, and D. Palia. 1999. Understanding the determinants of managerial ownership and the link between ownership and performance. Journal of Financial Economics 53: 353–84.
Jensen, M., and K. J. Murphy. 1990. Performance pay and top-management incentives. Journal of Political Economy 98: 225–64.
Joskow, P., N. Rose, and A. Shepard. 1993. Regulatory constraints on CEO compensation. Brookings Papers on Economic Activity: Microeconomics 1: 1–72.
Kale, J., E. Reis, and A. Venkateswaran. 2009. Rank-order tournaments and incentive alignment: The effect on firm performance. Journal of Finance 64: 1479–512.
Landier, A., J. Sauvagnat, D. Sraer, and D. Thesmar. 2013. Bottom-up corporate governance. Review of Finance 17: 161–201.
Lie, E. 2005. On the timing of CEO stock option awards. Management Science 51: 802–12.
Main, B. G. M., C. A. O’Reilly, and J. Wade. 1993. Top executive pay: Tournament or teamwork? Journal of Labor Economics 11: 606–28.
Minnick, K., H. Unal, and L. Yang. 2011. Pay for performance: CEO compensation and acquirer returns in BHCs. Review of Financial Studies 24: 439–72.
Murphy, K. J. 1985. Corporate performance and managerial remuneration: An empirical investigation. Journal of Accounting and Economics 7: 11–42.
Murphy, K. J. 1999. Executive compensation. In Handbook of Labor Economics, eds. O. Ashenfelter and D. Card. Amsterdam: North Holland.
Murphy, K. J. 2002. Executive compensations: Managerial power versus the perceived cost of options. University of Chicago Law Review 69: 847–69.
Murphy, K. J. 2003. Stock-based pay in new-economy firms. Journal of Accounting and Economics 34: 129–47.
Murphy, K. J. 2013. Executive compensation: Where we are and how we got there. In Handbook of the Economics of Finance, eds. G. Constantinides, M. Harris, and R. Stulz, 211–356. Amsterdam: North Holland.
Murphy, K. J., and J. Zábojník. 2004. CEO pay and appointments: A market-based explanation for recent trends. American Economic Review Papers and Proceedings 94: 192–6.
Nagel, G. L. 2010. The effect of labor market demand on US CEO pay since 1980. Financial Review 45: 931–50.
Perry, T., and M. Zenner. 2001. Pay for performance? Government regulation and the structure of compensation contracts. Journal of Financial Economics 62: 453–88.
Rose, N., and C. Wolfram. 2002. Regulating executive pay: Using the tax code to influence Chief Executive Officer compensation. Journal of Labor Economics 20: 138–75.
Saft, J. 2012. Saft on wealth: High pay and the folly of the Dow 13000. Reuters News, February 23.
Schaefer, S. 1998. The dependence of pay-performance sensitivity on firm size. Review of Economics and Statistics 80: 436–43.
Smith, C., and R. Watts. 1992. The investment opportunity set and corporate financing, dividend, and compensation policies. Journal of Financial Economics 32: 263–92.
Yermack, D. 1995. Do corporations award CEO stock options effectively? Journal of Financial Economics 39: 237–69.
Yermack, D. 1997. Good timing: CEO stock option awards and company news announcements. Journal of Finance 52: 449–76.

1. See Murphy (1999, 2013) for details on the history of public and academic interest in executive compensation.
2. In addition to academic researchers, other users of the database include companies and consultants interested in benchmarking executive compensation, media outlets interested in reporting on executive compensation issues (e.g., BusinessWeek), financial analysts and portfolio managers seeking to inform investment decisions, and governance specialists and think tanks looking to inform policy debates.
3. Some studies compare ExecuComp firms to non-ExecuComp firms. For example, see Cadman, Klasa, and Matsunaga (2010).
4. The S&P 1500 is an investable index that combines the S&P 500, the S&P MidCap 400, and the S&P SmallCap 600. The base date of the index is 1994; see http://www.standardandpoors.com/indices/sp-composite-1500/en/us/?indexId=spusa-15–usduf–p-us—-. While the additions we study are to any of the three indices, for ease of exposition we refer to these as S&P 1500 additions.
5. We examine separate vintages of ExecuComp for each year over the period 1996–2006. These provide overlapping coverage of compensation data for fiscal years 1994–2004.
We also include a later vintage (2009) to ensure that we have a sufficiently populated data set with which to identify backfilling of fiscal year 2005 data. Throughout the remainder of the paper we focus primarily on compensation data for fiscal years 1994–2005. The earliest vintage of ExecuComp was released in 1994, thus limiting our ability to identify backfilled data for fiscal years prior to 1994, and per Standard & Poor’s, the practice of backfilling was discontinued in 2006 as only one year of compensation data was required for the transition year. However, as noted above, there appears to be more backfilling after that date coincident with the SEC phasing in disclosure of prior years’ data.
6. Similarly, of the 116,525 manager-year observations with data on option compensation, we estimate that 15,164 (13%) have been backfilled.
7. Firms may choose to report data on more executives than the top five in compensation. If so, S&P collects data on up to nine executives (while always including the five executives who would have been in the top five had they still been in place at the end of the year). In addition, the pre-2006 rules allowed firms to not disclose compensation for executives other than the CEO if their total salary and bonus were less than $200,000. However, such exceptions are rare in the data.
8. http://www.sec.gov/rules/final/2006/33-8732a.pdf, page 197.
9. Although the backfilling of ExecuComp officially ended in 2006 (fiscal year 2005), to ensure that we have a full view of backfilling for the last few years of the relevant sample, we use the database from three years later, the October 2009 vintage. The numbers cited here summarize ExecuComp as of October 2009 and cover firms over fiscal years 1994 through 2005 (specifically, through fiscal year-end month of March 2006). Because backfilling continues after 2006, but at a lower rate, a version of ExecuComp downloaded after October 2009 is likely to contain more than the 22,720 firm-year observations we identify for this same 1994–2005 sample period.
10. Samples 3, 4, and 5 are subsets of sample 2. Samples 4 and 5 differ from the others because in these cases, a new firm is added to ExecuComp, implying that both firm characteristics and manager characteristics are backfilled. Such firm additions typically result in backfilling of up to three years of stock returns.
11. Specifically, option sensitivity is measured as the number of options owned scaled by shares outstanding multiplied by the option delta. Stock sensitivity is measured similarly, but with delta set to one. The sensitivity of cash compensation is measured by first regressing the change in log cash compensation on the change in log shareholder value in each cross section, where cash compensation is salary, bonus, and other compensation. This elasticity is then multiplied by the CEO’s cash compensation scaled by the market equity value of the firm to convert to a firm-level cash compensation sensitivity. LTIP sensitivity is LTIP divided by the change in shareholder wealth over the prior three years. To avoid negative sensitivities, we set them equal to zero whenever shareholder returns are less than 5% annually over the period. All sensitivities are converted to reflect dollar change in CEO wealth for a $1,000 change in shareholder wealth. See Murphy (1999) for a detailed discussion of the construction of these sensitivities, and the associated assumptions.
12. Not surprisingly, given the description of the backfilling process in the appendix, there are fewer TDC1 backfilled observations relative to Salary backfilled observations.
13. That said, we see some evidence of backfilling post-2006 in both panels of Table 2. Further, a comparison of panels A and B shows several instances in which TDC1, but not Salary, was backfilled.
14. These percentages add to more than 100% because some observations can be classified into more than one type of backfilling. For example, an observation might be backfilled because a manager entered the top five executives. If that firm was also added to the S&P 1500, then the observation is also classified as index backfilled.
15. We define a firm-level observation to be backfilled if any executive in that firm-year was backfilled for any reason. Similarly, we define a firm-year as having a manager addition if there are backfilled data for any manager in that firm-year due to a manager being added to the database.
16. For example, see Hartzell and Starks (2003).
17. The specifications include year indicators. The standard errors are clustered at the firm level.
18. We also include year and industry fixed effects, where industry is defined using two-digit SIC codes.
19. One could also show that adding backfilled data decreases the estimated intercepts, but given the large number of year and industry fixed effects, and the literature’s use of abnormal compensation in a wide variety of tests, we focus on the change in the residuals rather than changes in the intercepts.
20. See, for example, Eley (2013) and Saft (2012).
21. We estimate an option’s delta as the partial derivative of the Black-Scholes value. We follow the assumptions in Yermack (1995) when calculating the Black-Scholes parameters.
22. For example, see Yermack (1997), Aboody and Kasznik (2000), Chauvin and Shenoy (2001), and Lie (2005).
23. For example, to bootstrap panel A, we randomly select the entire time series of observations for 2,743 CEOs with replacement to replicate the full sample. We then exclude 1,698 randomly selected observations to generate the bootstrapped “clean” sample. We sort these samples by size and estimate the difference-in-differences in pay-for-performance sensitivities. We do this 1,000 times to generate a distribution of difference-in-differences estimates and t-statistics.
24. Alternatively, if we use a median regression framework as used in Aggarwal and Samwick (1999), we find similar results.
25. Standard & Poor’s informed us that it typically takes one to two months for new proxy data to be added to ExecuComp, and that this processing time declines as the data set becomes more complete. Because we generally use October vintages, the data processing time should be much less than two months as most data have already been included and only late filers are being added.

© The Author 2017. Published by Oxford University Press on behalf of The Society for Financial Studies. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
The Review of Financial Studies, Oxford University Press

Publisher
Oxford University Press
ISSN
0893-9454
eISSN
1465-7368
D.O.I.
10.1093/rfs/hhx061

Abstract

We document that backfilling in the ExecuComp database introduces a data-conditioning bias that can affect inferences and make replicating previous work difficult. Although backfilling can be advantageous due to greater data coverage, if not addressed, the oversampling of firms with strong managerial incentives and higher subsequent returns leads to a significant upward bias in abnormal compensation, pay-for-performance sensitivity, and the magnitudes of several previously established relations. The bias also can lead one to misinterpret the appropriate functional form of a relation and whether the data support one compensation theory over another. We offer methods to address this issue.

Received May 12, 2014; editorial decision May 10, 2016 by Editor David Hirshleifer.

Executive compensation, especially CEO compensation, consistently fosters debate among a wide variety of interested parties, including shareholders, government officials, the media, the public, and academic researchers. For example, academic scrutiny of compensation stems from the central role of contracting in agency theory and an interest in understanding how compensation mitigates or exacerbates agency issues.1 Accordingly, a large number of studies in finance, economics, accounting, management, and other fields employ data from the Standard & Poor’s (S&P’s) Compustat ExecuComp database (henceforth, ExecuComp), which was introduced in 1994. In fact, since 1998, more than 1,000 published articles have used this database, and over half of these articles can be found in nine of the leading finance and accounting journals.2 Given the widespread use of the ExecuComp database, it is imperative to understand its construction, any inadvertent biases that arise as a result, and the implications of those biases for research, along with, ideally, solutions to them.
Despite its importance, we are aware of no studies that examine biases within the ExecuComp database.3 We provide the first such study by examining issues that arise from the practice of “backfilling,” or adding historical data, when ExecuComp initiates coverage of firms or managers. While documenting the effects of backfilling on the results presented in the previous literature is beyond the scope of this (or any one) study, we examine some typical analyses to present examples of the potential data-conditioning biases backfilling can generate. Our results indicate that the backfilled observations are not random, but instead lead to oversampling from certain types of firm-years (e.g., strong-performing growth firms that use more incentive compensation). Because of this, we find that backfilling can significantly affect estimates, both economically and statistically, and the resulting interpretations and conclusions.

Backfilling in the ExecuComp database arises due to S&P’s practice of collecting all available compensation data from the proxy statements of covered firms. Because proxy statements almost always contain historical compensation data prior to the particular year, this policy implies that the initiation of coverage of managers results in backfilling. Three circumstances prompt the addition of managers to the ExecuComp database: (1) an individual employed at the firm becomes one of the five highest-paid executives in the firm (e.g., gets promoted), (2) a firm is added to the S&P 1500 index, or (3) a firm that is not and has never been in the S&P 1500 index is added to the database for some reason. Per conversations with S&P, generally this third circumstance occurs when an S&P client requests a firm’s inclusion in the database, although it has also been suggested that firms might be added due to their addition to S&P industry indices.4 According to S&P, backfilling was discontinued in 2006 due to new regulatory reporting requirements that limited the amount of historical compensation data disclosed in firm proxy statements. However, reporting requirements were again changed after that date and backfilling has apparently resumed.

Backfilling a database such as ExecuComp can be beneficial due to the increased data availability and accessibility and the increased power of empirical tests. However, backfilling can also be problematic. If the data-conditioning bias is unaddressed, it can significantly alter inferences from empirical analyses. Moreover, backfilling makes it extremely difficult to replicate the results of a previous study because later research will include backfilled observations that were not available to the previous researchers, even if one uses identical date ranges and filters.

We find a large amount of backfilling in the ExecuComp database. For example, in the October 2009 version (or “vintage”) of the database, we estimate that 4,037 (17.8%) of the 22,720 firm-year observations (and all of their reported executives) were backfilled. At the manager level, we estimate that at least 31,901 (23%) of the 136,684 salary observations for fiscal years 1994–2005 have been backfilled.5,6 Moreover, the three events that lead S&P to backfill do not occur randomly. For example, consider an index addition (a firm being added to the S&P 1500), which is likely to follow a period of strong firm performance and high stock returns.
Assuming the added firm was not already included in ExecuComp (e.g., due to a previous client request), the index addition triggers ExecuComp coverage and typically results in two years of backfilled data because the first proxy collected by S&P includes three years of data (the current year and two prior years). Consistent with the implications of this example, our tests show that backfilling leads to oversampling of high-growth companies that experienced high returns with low risk (i.e., low time-series variation in returns relative to nonbackfilled firms). In addition, we find that managers whose data are backfilled tend to have lower salaries, lower total compensation, and higher stock ownership than other managers in the database.

Most importantly, we find that failure to control for backfilling not only generates a strong and significant upward bias in the magnitudes of several previously established relations, it can also lead to misinterpretations of the appropriate functional form of a relation, or of the degree to which the data support one theory over another. Moreover, in structural modeling with data from ExecuComp, even assuming that the model provides the correct functional form, the feedback from one estimate to another means that biased compensation data could affect other estimates as well as one’s conclusion that the structural model is appropriate.

For example, consider estimates of pay-for-performance sensitivities, stemming from the original work of Jensen and Murphy (1990), which play a major role in the empirical corporate literature. Our results show that pay-for-performance sensitivities derived from ExecuComp data without adjusting for backfilling are significantly overestimated. After excluding data that we estimate to have been backfilled, we find that a CEO at a median-risk firm receives an additional $0.55 in total direct compensation per $1,000 change in shareholder wealth.
However, using all available ExecuComp data (including observations we believe to have been backfilled), we find a sensitivity of $0.69 per $1,000, an increase of 25%. The upward bias is striking when one examines only the backfilled observations: the sensitivity estimated when only using backfilled data is $2.95 per $1,000, over five times higher than the estimated sensitivity for nonbackfilled data. The effects of backfilled data are more dramatic if we incorporate stock and option ownership in our estimates of pay-for-performance sensitivities. Failure to adjust for backfilled data generates a pay-for-performance sensitivity estimate that is 64% higher than the pay-for-performance sensitivity we find when using backfill-free data ($2.87 per $1,000 using all of ExecuComp versus $1.75 using backfill-free data).

Our results also have implications for the use of the ExecuComp data to address the widespread debate over executive compensation as rent extraction or optimal contracting (e.g., see Bebchuk and Fried 2004; Murphy 2002). We show that the presence of backfilled observations is more likely to tilt some tests toward the optimal-contracting view because the ExecuComp database tends to oversample CEOs who have lower salaries, lower total compensation, and stronger incentives. Moreover, the backfilled CEOs tend to be drawn from a set of strong-performing firms rather than at random from the fuller set of firms. However, the presence of backfilled observations does not favor the optimal contracting view for all empirical tests that have been employed. For example, we find that removing the backfilled data (observations that tend to have low abnormal compensation and high future returns) changes the estimation of a significant negative relation between abnormal compensation and future returns to no significant relation.
Further, researchers who estimate abnormal compensation need to consider that the addition of backfilled data leads to upward-biased estimates of abnormal compensation for the nonbackfilled observations, on average.

The presence of backfilled observations significantly affects the estimation of several other previously identified relations. For example, failure to control for backfilling leads to an overestimation of the relation between pay-performance sensitivities and firm risk or firm size, as well as the relation between managerial ownership and firm value. Further, the inclusion of backfilled data in tests of the association between Tobin’s q and ownership can lead to significantly misestimating the magnitude of the relation (statistically and economically), and also to misinterpreting the correct functional form of the relation.

Taken together, our results highlight the importance of controlling for backfilled data in future research. Consequently, we develop methods that allow researchers to identify and control for backfilling in the data. These methods not only allow researchers to prevent biases in their empirical studies, but also allow more appropriate replications of previous work.

1. Data and Potential Biases

1.1 ExecuComp construction and the backfilling process

S&P reports detailed executive compensation data from firms’ annual proxy statements (form DEF 14A) based on the reported current and two-year historical compensation data for the five most highly compensated executives (often referred to as the “Top five”).7 S&P also constructs a measure of aggregate total direct compensation (TDC1), computed as the sum of salary, bonus, other annual compensation, total value of restricted stock granted, total value of stock options granted (using a Black-Scholes approach), long-term incentive payouts, and all other pay.
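As a minimal sketch of the TDC1 definition just given, the total can be written as a plain sum of the seven listed components. The class and field names below are hypothetical labels chosen for the example (they are not ExecuComp variable names), and the dollar amounts are made up.

```python
from dataclasses import dataclass


@dataclass
class PayComponents:
    """One executive-year of pay, in thousands of dollars (hypothetical field names)."""
    salary: float
    bonus: float
    other_annual: float
    restricted_stock_granted: float
    options_granted_bs_value: float  # Black-Scholes value of options granted
    ltip_payouts: float
    all_other: float


def total_direct_compensation(pay: PayComponents) -> float:
    """Sum the seven components listed in the text to obtain a TDC1-style total."""
    return (
        pay.salary
        + pay.bonus
        + pay.other_annual
        + pay.restricted_stock_granted
        + pay.options_granted_bs_value
        + pay.ltip_payouts
        + pay.all_other
    )


# Made-up figures for illustration only.
example = PayComponents(
    salary=500.0,
    bonus=250.0,
    other_annual=10.0,
    restricted_stock_granted=100.0,
    options_granted_bs_value=900.0,
    ltip_payouts=0.0,
    all_other=40.0,
)
print(total_direct_compensation(example))  # 1800.0
```

Because the total is additive, an observation with any missing component has no well-defined TDC1 even when Salary is available, which is one way a Salary observation can exist without a matching TDC1 observation.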
However, in 2006, the SEC adjusted compensation disclosure requirements to better match the FAS 123R accounting changes, which included reporting options at “fair value.” The SEC allowed a three-year phase-in period during which companies would only be required to report the current year of compensation data for the first year after the rule change, then two years of data, with a return to three years of data thereafter.8

ExecuComp covers all firms in the S&P 1500 index, and maintains coverage if a firm drops out of the index. In addition, ExecuComp includes some firms that have never been in the S&P 1500. For example, of the 22,720 firm-year observations with fiscal years ranging from 1994 to 2005 in the 2009 vintage, 17,422 (76.7%) belong to firms that are members of the S&P 1500 index at the time of database release, 1,261 (5.6%) belong to firms that were previously in the index, 705 (3.1%) belong to firms included up to two years before they entered the index, and the remaining 3,332 (14.7%) belong to firms that apparently were added to the database for other reasons.9

The executives covered in the ExecuComp database also change over time. Beyond the executives included due to the firm additions discussed above, new individuals are added when they become one of the highest-paid executives in a firm as reported to the SEC. When any of these events occurred over the 1994–2005 time frame, the available historical compensation data were added. The appendix provides more details of the process.

1.2 The nature of potential biases

Using stock performance as an example, Figure 1 illustrates the potential for backfilling to affect analyses based on firm characteristics. For each firm-year in ExecuComp for the 1994–2005 period, we collect stock returns over the years t, t+1, and t+2.
We do this separately for the full sample and for the five following subsamples: (1) firm-years for which we believe no managers have backfilled compensation data, (2) firm-years where at least one manager has backfilled data for any reason, (3) firm-years for which at least one manager has backfilled data due to manager addition (e.g., due to promotion), (4) firm-years where data for the firm was backfilled because it was added to the S&P index, and (5) firm-years for which the data for the firm was backfilled for some other reason (e.g., client request).10

Figure 1. Cumulative stock returns. This figure plots the average cumulative stock returns for firm-year observations in ExecuComp over 1994–2005. The return from 0 to t represents the average stock return measured contemporaneously with that firm’s year t compensation data. We also report cumulative stock returns that incorporate years t+1 and t+2. We report returns separately for the full sample, observations that we estimate were not backfilled, backfilled observations, and three subsets of backfilled observations: index, other, and manager. Specifically, a firm-level observation is termed backfilled if any manager in that year is backfilled for any reason. An observation is not termed backfilled if no manager has been backfilled. The manager backfilled sample consists of any firm with backfilling due to manager additions. The index backfilled sample consists of any firm that was backfilled because it was added to the S&P 1500 Index. The Other backfilled sample consists of firms that were added to ExecuComp for other reasons.

As Figure 1 shows, there are substantial differences in mean cumulative returns across the different samples. Backfilled observations have higher average returns than the nonbackfilled sample. The distinction between manager additions, index additions, and other additions shows that the performance differences are driven by the backfilling due to firm additions, thus resulting in oversampling of successful, high-growth firms. In contrast, observations that were backfilled due to manager additions have almost identical average returns to those of nonbackfilled observations because such additions do not result in new firms (and their returns) being added to the database.

Figure 2 shows that the backfilling can affect inferences about managerial compensation from both firm and manager backfilling. The five bar charts illustrate the level of executive pay and its composition divided among salary, bonus, options, and other compensation for each year from 1994 through 2005. The charts show, in order, compensation for the full ExecuComp sample, the nonbackfilled observations, all backfilled observations, firm-backfilled observations, and manager-backfilled observations.
Comparing the second and third charts, it is clear that backfilled observations have lower levels of pay relative to nonbackfilled observations. The fourth and fifth charts show that these differences are driven not only by firm backfilling but also by manager backfilling.

Figure 2. Level and composition of executive pay. This figure summarizes the level and composition of executive pay by year. The bar height represents the median level of executive pay in that year, in thousands of dollars. We separate compensation into salary, bonus, Black-Scholes option value, and all other compensation. All values are as reported by ExecuComp. Composition percentages are computed by first measuring the percentages for each executive and then averaging across all executives.

In a similar manner, Figure 3 shows median pay-for-performance sensitivities for each year from 1994 through 2005 for the full sample and the four subsamples. Following Murphy (1999), we measure pay-for-performance sensitivity separately for options, stock, cash compensation, and long-term incentive payouts.11 The bar height represents the median total change in executive wealth per $1,000 change in shareholder wealth.
The contribution of individual sources of pay to the total pay-for-performance sensitivity is computed by first measuring the percentage of pay-for-performance sensitivity that comes from each source for each executive, then averaging across all executives in that cross section. Given that all of the charts are depicted on the same scale, it is clear that backfilled firms have very different average pay-for-performance sensitivities relative to nonbackfilled firms. Further, the charts indicate that while the pay-for-performance sensitivity for the median executive has varied substantially across time, the variation is much greater for the backfilled observations, particularly the manager-backfilled observations. In addition, relative to the nonbackfilled sample, a larger portion of backfilled executives’ estimated pay-for-performance sensitivity stems from changes in option, restricted stock, and stock values. Again, the differences arise from both firm- and manager-backfilled observations. As we show in later regressions, these differences are both economically and statistically significant.

Figure 3. Pay-for-performance sensitivity. This figure summarizes the sensitivity of executive pay to performance. The $$y$$-axis is the median change in executive wealth per $${\$}$$1,000 change in shareholder wealth. Composition percentages are computed by first measuring the percentages for each executive and then averaging across executives. Computation of the pay-for-performance sensitivity estimates for the individual components follows Murphy (1999).

These initial findings suggest that backfilling systematically oversamples certain types of firms and managers, creating the potential for incorrect inferences and misleading estimates. Moreover, failing to control for backfilling can lead to a lack of comparability of results across studies because, for a given year, follow-up studies would include firms not contained in the original ExecuComp vintage. The extent of the problem would depend on the exact methodology and the specific vintage of ExecuComp data used.

1.3 Backfilled observations

Table 1 summarizes the number of backfilled observations and the years in which the backfilling occurs. Panels A and B report the occurrence of Salary and TDC1 backfilling, respectively. Moving from left to right, the columns report the total number of observations for that year in the 2009 vintage, the total number of backfilled observations, and the number of backfilled observations in each vintage of the database. Each row represents a given fiscal year of data. For example, there are 12,641 manager observations with Salary data for fiscal year 1998. We estimate that 3,923 (31%) of these were backfilled at some point. Thus, a paper using the 2009 ExecuComp database to examine 1998 compensation will have a very different sample relative to that of an earlier paper that used (for example) the 1999 vintage. The bottom row of panel A summarizes the incidence of backfilling across all vintages and shows that of the 154,522 salary observations for fiscal years 1992–2005, we estimate that 32,046 (21%) are backfilled. Panel B reports the occurrence of TDC1 backfilling.
The bottom row of panel B shows that, of the 130,424 TDC1 observations, we estimate that 17,909 (14%) have been backfilled.12

Table 1. Backfilling by vintage and fiscal year of observation

Columns headed 1997 through 2009 report the # backfilled by ExecuComp vintage.

A. Number of observations in which Salary is backfilled

| Fiscal year of obs. | Total obs. in 2009 vintage | Total # of backfilled obs. | 1997 | 1998 | 1999 | 2000 | 2001 | 2002 | 2003 | 2004 | 2005 | 2006 | 2009 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1992 | 8,028 | 42 | 9 | 11 | 0 | 0 | 10 | 0 | 4 | 0 | 0 | 0 | 8 |
| 1993 | 9,810 | 103 | 57 | 11 | 0 | 9 | 12 | 0 | 5 | 0 | 0 | 0 | 9 |
| 1994 | 10,662 | 1,214 | 1,147 | 18 | 0 | 15 | 16 | 0 | 8 | 0 | 0 | 0 | 10 |
| 1995 | 11,138 | 2,948 | 1,564 | 1,172 | 175 | 17 | 18 | 0 | 2 | 0 | 0 | 0 | 0 |
| 1996 | 11,687 | 3,218 | 0 | 1,585 | 1,393 | 217 | 24 | 0 | 3 | 0 | 0 | 0 | 0 |
| 1997 | 12,044 | 3,668 | 0 | 0 | 1,859 | 1,543 | 269 | 1 | 0 | 0 | 0 | 0 | 0 |
| 1998 | 12,641 | 3,923 | 0 | 0 | 0 | 2,099 | 1,644 | 175 | 5 | 0 | 0 | 0 | 0 |
| 1999 | 12,214 | 3,439 | 0 | 0 | 0 | 0 | 2,395 | 546 | 482 | 16 | 0 | 0 | 0 |
| 2000 | 11,542 | 2,472 | 0 | 0 | 0 | 0 | 0 | 790 | 1,613 | 59 | 7 | 1 | 2 |
| 2001 | 11,381 | 1,954 | 0 | 0 | 0 | 0 | 0 | 0 | 526 | 1,125 | 39 | 53 | 211 |
| 2002 | 11,549 | 3,083 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1,637 | 1,066 | 103 | 277 |
| 2003 | 11,817 | 3,100 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1,546 | 1,097 | 457 |
| 2004 | 10,900 | 2,187 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1,568 | 619 |
| 2005 | 9,109 | 695 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 695 |
| Total | 154,522 | 32,046 | 2,777 | 2,797 | 3,427 | 3,900 | 4,388 | 1,512 | 2,648 | 2,837 | 2,658 | 2,822 | 2,288 |

B. Number of observations in which TDC1 is backfilled

| Fiscal year of obs. | Total obs. in 2009 vintage | Total # of backfilled obs. | 1997 | 1998 | 1999 | 2000 | 2001 | 2002 | 2003 | 2004 | 2005 | 2006 | 2009 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1992 | 5,567 | 2,377 | 2,217 | 15 | 84 | 10 | 18 | 0 | 16 | 2 | 1 | 6 | 8 |
| 1993 | 8,332 | 368 | 27 | 0 | 314 | 6 | 9 | 0 | 4 | 0 | 0 | 0 | 8 |
| 1994 | 9,029 | 902 | 339 | 4 | 524 | 8 | 11 | 0 | 6 | 0 | 0 | 0 | 10 |
| 1995 | 9,306 | 1,207 | 419 | 338 | 423 | 15 | 12 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1996 | 9,838 | 1,471 | 0 | 676 | 682 | 92 | 17 | 0 | 5 | 0 | 3 | 0 | 0 |
| 1997 | 10,080 | 1,797 | 0 | 0 | 895 | 784 | 117 | 1 | 0 | 0 | 0 | 0 | 0 |
| 1998 | 10,416 | 1,827 | 0 | 0 | 0 | 974 | 731 | 120 | 2 | 0 | 0 | 0 | 0 |
| 1999 | 10,227 | 1,563 | 0 | 0 | 0 | 0 | 1,195 | 245 | 116 | 7 | 0 | 0 | 0 |
| 2000 | 9,799 | 780 | 0 | 0 | 0 | 0 | 0 | 323 | 423 | 27 | 4 | 1 | 2 |
| 2001 | 9,389 | 902 | 0 | 0 | 0 | 0 | 0 | 0 | 319 | 489 | 20 | 21 | 53 |
| 2002 | 9,591 | 1,309 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 721 | 458 | 46 | 84 |
| 2003 | 10,171 | 1,600 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 702 | 543 | 355 |
| 2004 | 9,817 | 1,243 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 793 | 450 |
| 2005 | 8,862 | 563 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 563 |
| Total | 130,424 | 17,909 | 3,002 | 1,033 | 2,922 | 1,889 | 2,110 | 689 | 891 | 1,246 | 1,188 | 1,410 | 1,533 |

This table reports the number of backfilled observations in ExecuComp by vintage and fiscal year of the observation. The rows represent different fiscal years, as indicated, and the columns represent different vintages. We also report for each year the total manager-year observations in the 2009 vintage of ExecuComp and the total number of these observations that we identify as backfilled. The remaining columns indicate our estimates of the vintages in which the backfilling occurs. Panel A uses Salary to identify backfilled observations, and panel B uses TDC1.

For the remainder of the paper, we maintain two approaches in our tests. First, to illustrate the problems with backfilling, we focus on the 1994–2005 sample period, the period over which most of the early backfilling occurred.13 Second, for each test, we select the appropriate backfilling identifier based on the data required, because not every observation has the more detailed compensation data (for example, option grants).
If a test uses only summary compensation data, we use Back_Salary as the identifier. If a test uses detailed data on the components of compensation, such as option data needed to calculate pay-for-performance sensitivities, then we use Back_TDC1 to identify backfilled observations.

1.4 Brief overview of identifying the reasons for backfilling

The appendix reports details of the strategy we employ to estimate the types of backfilling. In brief, we use two pieces of information: the timing of the backfilling and the historical S&P 1500 constituents list. If a firm-level backfilled observation occurs one or two years prior to index addition, then it is defined as Index backfilling. If a firm is backfilled more than two years prior to the firm entering the S&P 1500, then it is Other. If an observation is backfilled and the firm is currently in the S&P 1500, or if there are other observations for the same firm-year that are not backfilled, then it must be Manager backfilling. We use indicator variables Back_Salary_Index, Back_Salary_Other, and Back_Salary_Manager to indicate index, other, and manager backfilling, respectively, for observations in which the Salary variable is backfilled. Similarly, we construct indicator variables Back_TDC1_Index, Back_TDC1_Other, and Back_TDC1_Manager for the TDC1 backfilled data. Lastly, in several analyses we report results that distinguish firm-level backfilling from manager-level backfilling. In these cases, firm-level backfilling is the union of Index and Other backfilling. The results of this identification process are shown in Table 2. The majority of backfilling occurs because new managers move into the top-five group.
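The timing rules in Section 1.4 can be sketched in code. This is a minimal illustration and not the procedure documented in the appendix: the function names, the arguments, the order in which the rules are checked, and the treatment of firms that never enter the S&P 1500 (labeled Other here) are all assumptions made for the example.

```python
from typing import Optional


def classify_backfill(fiscal_year: int,
                      index_entry_year: Optional[int],
                      has_nonbackfilled_peer: bool) -> str:
    """Label one backfilled manager-year as 'Manager', 'Index', or 'Other'.

    index_entry_year is the year the firm entered the S&P 1500 (None if never).
    has_nonbackfilled_peer is True when another manager in the same firm-year
    is not backfilled.
    """
    in_index = index_entry_year is not None and index_entry_year <= fiscal_year
    # Firm already covered: only the manager can be new to the database.
    if in_index or has_nonbackfilled_peer:
        return "Manager"
    # Observation dated one or two years before the firm joined the index.
    if index_entry_year is not None and 1 <= index_entry_year - fiscal_year <= 2:
        return "Index"
    # More than two years before index entry, or (assumption) never in the index.
    return "Other"


def is_firm_level(label: str) -> bool:
    """Firm-level backfilling is the union of Index and Other backfilling."""
    return label in ("Index", "Other")
```

Under these assumptions, a fiscal-year 1998 observation for a firm that entered the index in 2000 would be labeled Index, while the same observation would be labeled Manager if another executive of that firm already had a nonbackfilled 1998 record.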
In fiscal year 1998, for example, of the 3,923 backfilled manager observations, we estimate that 2,169 (55.3%) were backfilled because a new manager moved into the top-five group, 581 (14.8%) were added because the firm moved into the S&P 1500, and 1,446 (36.9%) for other reasons.14

Table 2. Backfilling by fiscal year and type of backfilling

A. Back_Salary

| Fiscal year | (1) Index addition | (2) Other request | (3) Manager addition |
|---|---|---|---|
| 1994 | 107 | 416 | 694 |
| 1995 | 313 | 994 | 1,711 |
| 1996 | 424 | 1,201 | 1,763 |
| 1997 | 588 | 1,428 | 1,875 |
| 1998 | 581 | 1,446 | 2,169 |
| 1999 | 462 | 820 | 2,486 |
| 2000 | 287 | 359 | 2,013 |
| 2001 | 238 | 239 | 1,631 |
| 2002 | 178 | 527 | 2,608 |
| 2003 | 222 | 571 | 2,412 |
| 2004 | 145 | 386 | 1,735 |
| 2005 | 26 | 38 | 646 |
| Total | 3,571 | 8,425 | 21,743 |

B. Back_Total

| Fiscal year | (1) Index addition | (2) Other request | (3) Manager addition |
|---|---|---|---|
| 1994 | 143 | 291 | 469 |
| 1995 | 162 | 621 | 445 |
| 1996 | 257 | 831 | 426 |
| 1997 | 398 | 1,028 | 432 |
| 1998 | 442 | 1,001 | 452 |
| 1999 | 370 | 517 | 777 |
| 2000 | 223 | 219 | 395 |
| 2001 | 181 | 166 | 651 |
| 2002 | 119 | 374 | 963 |
| 2003 | 183 | 460 | 997 |
| 2004 | 124 | 303 | 844 |
| 2005 | 25 | 26 | 515 |
| Total | 2,627 | 5,837 | 7,366 |

This table presents the number of backfilled manager-year observations by fiscal year and type of backfilling.
Column 1 reports the total number of observations that we estimate to have been backfilled because the firm entered the S&P 1500 index. Column 2 reports the total number of observations that we estimate to have been backfilled due to firms being added for other reasons. Column 3 reports the total number of observations that we estimate to have been backfilled because the manager entered the top five paid managers in the firm. Panel A reports results using Salary as the identifying compensation item, and panel B uses TDC1.

2. Differences in Backfilled Data

2.1 Univariate analysis

We now describe some of the differences between backfilled and nonbackfilled observations (by type), which we expect to occur because of the systematic ways in which observations are added to the database. In Table 3, panel A, we report means and medians of manager compensation and ownership statistics using variables from ExecuComp for the full sample and each of the subsamples. We also present $$t$$-tests for differences in means relative to the nonbackfilled sample (Column 2), where standard errors are clustered by firm, and we report Wilcoxon’s z-scores for the differences in medians.

Table 3. Summary statistics for manager compensation and ownership

A. Manager compensation and ownership characteristics across samples

| | (1) Full sample | (2) Not backfilled | (3) Backfilled | (4) All firm backfilled | (5) Index addition | (6) Other addition | (7) Manager addition |
|---|---|---|---|---|---|---|---|
| Salary | 345** | 375 | 244** | 244** | 237** | 246** | 241** |
| | (279**) | (304) | (208**) | (200**) | (200**) | (200**) | (210**) |
| Bonus | 328** | 369 | 194** | 193** | 200** | 190** | 189** |
| | (132**) | (151) | (88**) | (85**) | (87**) | (83**) | (86**) |
| Other annual compensation | 27 | 28 | 23 | 15** | 11** | 17** | 27 |
| | (0) | (0) | (0) | (0) | (0) | (0) | (0) |
| Restricted stock grant | 193** | 222 | 98** | 75** | 61** | 82** | 106** |
| | (0) | (0) | (0) | (0) | (0) | (0) | (0) |
| TDC1 | 2,122** | 2,257 | 1,183** | 1,309** | 1,291** | 1,317** | 1,034** |
| | (915**) | (988) | (518**) | (579**) | (592**) | (573**) | (435**) |
| Allpay | 3,072 | 3,114 | 2,018** | 2,282** | 3,454 | 1,689** | 1,573** |
| | (1,100**) | (1,140) | (669**) | (774**) | (849**) | (734**) | (538**) |
| % shares owned | 0.01** | 0.009 | 0.018** | 0.022** | 0.021** | 0.022** | 0.009 |
| | (0.001**) | (0.001) | (0.001**) | (0.001**) | (0.002**) | (0.001**) | (0.001) |
| Black-Scholes option value | 991** | 1,064 | 479** | 637** | 681** | 617** | 305** |
| | (194) | (227) | (0**) | (70**) | (94**) | (59**) | (0**) |
| Option PFP per $${\$}$$1000 | 0.95** | 0.85 | 2.36** | 2.5** | 2.6** | 2.45** | 1.98** |
| | (0.003) | (0.01) | (0.87**) | (0.99**) | (1.04**) | (0.94**) | (0.61**) |
| Option PFP per 1% | 12,504 | 12,144 | 17,347** | 18,732** | 23,745** | 16,409* | 13,712 |
| | (174**) | (70) | (3,869**) | (4,303**) | (5,169**) | (3,637**) | (3,274**) |
| # Salary observations | 136,684 | 104,783 | 31,901 | 11,996 | 3,571 | 8,425 | 21,729 |
| # TDC1 observations | 116,525 | 101,451 | 15,074 | 8,448 | 2,624 | 5,824 | 7,275 |

B. Manager compensation and ownership characteristics of backfilled versus index subsamples

| | Backfilled | Not backfilled (in SP1500) | Not backfilled (in SP400 or SP600) | Not backfilled (dropped from SP indices) |
|---|---|---|---|---|
| Salary | 244 | 384** | 311** | 318** |
| | (208) | (313**) | (262**) | (265**) |
| Bonus | 194 | 387** | 229** | 183* |
| | (88) | (162**) | (110**) | (68**) |
| Other annual compensation | 23 | 28 | 17 | 26 |
| | (0) | (0) | (0) | (0) |
| Restricted stock grant | 98 | 236** | 101 | 120* |
| | (0) | (0) | (0) | (0) |
| TDC1 | 1,183 | 2,321** | 1,316** | 1,252 |
| | (518) | (1,032**) | (733**) | (593**) |
| Allpay | 2,018 | 3,123** | 2,035 | 1,764 |
| | (669) | (1,167**) | (825**) | (683) |
| % shares owned | 0.018 | 0.009** | 0.011** | 0.008** |
| | (0.001) | (0.001**) | (0.001**) | (0.001**) |
| Black-Scholes option value | 479 | 1,073** | 561** | 473 |
| | (0) | (247**) | (153**) | (62**) |
| Option PFP per $${\$}$$1000 | 2.36 | 0.72** | 1.03** | 2.07* |
| | (0.87) | (0.002**) | (0.159**) | (0.706**) |
| Option PFP per 1% | 17,347 | 10,923** | 9,344** | 6,727** |
| | (3,869) | (16**) | (831**) | (1,368**) |
| # Salary observations | 31,901 | 90,760 | 58,792 | 5,681 |
| # TDC1 observations | 15,074 | 88,364 | 57,310 | 5,529 |

This table reports means and medians of manager compensation and ownership characteristics for fiscal years 1994 through 2005. Panels A and B show the manager compensation and ownership characteristics from ExecuComp (Salary, Bonus, other annual compensation, restricted stock grants, total direct compensation (TDC1), shares owned, and Black-Scholes option values). The panels also show variables computed following previous compensation research: Allpay is constructed following Jensen and Murphy (1990) and equals total cash compensation plus the change in the present value of future cash compensation plus the change in option value. Option PFP is the pay-for-performance sensitivity from option grants (dollar change in executive’s option value per $${\$}$$1,000 change in shareholder wealth) as defined in Yermack (1995).
The first column summarizes the means (with medians below in parentheses) for all manager-year salary observations in the October 2009 vintage. The second column includes all manager-year observations that are not backfilled (using Salary as the identifier), and Column 3 reports statistics on backfilled observations. The last four columns report statistics on subsets of backfilled data. Column 4 uses firm-level backfilled data, which are observations backfilled either due to index additions or other reasons. Columns 5 through 7 decompose backfilling into its three types: index, other, and manager. Panel B reports the manager compensation and ownership characteristics for the backfilled sample versus S&P index firms. ** and * indicate significant differences at the 1% and 5% level, respectively, between the respective column and Column 2 (nonbackfilled data), where differences in means are tested using $$t$$-tests with standard errors clustered at the firm level, and differences in medians are tested using Wilcoxon rank-sum tests.

The first three columns in Table 3 show that on almost every dimension considered in the table, the backfilled observations differ significantly from the nonbackfilled observations, both economically and statistically. These results are consistent with those illustrated in Figures 1–3, which graphically present differences in returns, levels of compensation, and pay-for-performance sensitivities. Backfilled observations have much lower values for compensation components (e.g., salary, bonus, restricted stock grants, and option grant value) but higher fractional ownership than their nonbackfilled counterparts. The mean (median) salary among backfilled observations is $${\$}$$244,000 ($${\$}$$208,000), compared to $${\$}$$375,000 ($${\$}$$304,000) for the nonbackfilled observations.
The mean (median) total compensation, or TDC1, is $${\$}$$1,183,000 ($${\$}$$518,000) for backfilled observations, which is roughly half the total compensation among the nonbackfilled observations, $${\$}$$2,257,000 ($${\$}$$988,000). These differences are statistically significant at the 1% level and suggest that using later vintages of ExecuComp without adjusting for backfilling leads to compensation estimates that are biased downward relative to estimates based on the original data. Following Yermack (1995), we also compute option-grant pay-for-performance sensitivities, labeled Option PFP, and find a mean sensitivity of $${\$}$$2.36 per $${\$}$$1,000 change in shareholder wealth for the backfilled observations, compared to $${\$}$$0.85 for the nonbackfilled sample. This indicates that the backfilled observations have substantially greater option-grant pay-for-performance sensitivity than do the original observations. In panel B of Table 3 we report differences in backfilled observations from subsamples of nonbackfilled S&P index firms. For each of the columns two through four, we present $$t$$-tests for differences in means relative to the backfilled sample (Column 1), where standard errors are clustered by firm, and we report Wilcoxon’s z-scores for the differences in medians. In general, we see that the backfilled observations differ most from the nonbackfilled S&P 1500, as well as the 400 and 600 firms, with smaller (but still significant) differences between the backfilled firms and the firms that have been dropped from the S&P indices. Table 4 reports the means and medians of firm characteristics from CRSP and Compustat. In panel A, we report these statistics for the full sample and the subsamples.15 The table also presents $$t$$-tests for differences in means relative to the nonbackfilled sample (Column 2), where standard errors are clustered by firm, and we report Wilcoxon’s z-scores for differences in medians. 
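The two tests reported alongside Tables 3 and 4 can be sketched as follows. This is an illustrative reconstruction, not the authors' code. The function names are hypothetical, the clustered standard error omits any finite-sample correction (statistical packages often apply one), and the rank-sum z-score uses the normal approximation with a tie correction and no continuity correction.

```python
import math
from collections import defaultdict


def clustered_mean_diff(values, groups, clusters):
    """Return (mean of group 1 minus mean of group 0, cluster-robust SE).

    Matches OLS of the value on a constant and a 0/1 group dummy with
    standard errors clustered on `clusters` (no small-sample adjustment).
    """
    n1 = sum(groups)
    n0 = len(groups) - n1
    mean1 = sum(v for v, g in zip(values, groups) if g == 1) / n1
    mean0 = sum(v for v, g in zip(values, groups) if g == 0) / n0
    # Sum residuals within each cluster, separately for group 0 and group 1.
    resid = defaultdict(lambda: [0.0, 0.0])
    for v, g, c in zip(values, groups, clusters):
        resid[c][g] += v - (mean1 if g == 1 else mean0)
    variance = sum((e1 / n1 - e0 / n0) ** 2 for e0, e1 in resid.values())
    return mean1 - mean0, math.sqrt(variance)


def rank_sum_z(sample_a, sample_b):
    """Wilcoxon rank-sum z-score of sample_a relative to sample_b."""
    pooled = sorted([(v, 0) for v in sample_a] + [(v, 1) for v in sample_b])
    n_a, n_b = len(sample_a), len(sample_b)
    n = n_a + n_b
    rank_sum_a, tie_term, i = 0.0, 0.0, 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # tied values share the average rank
        ties = j - i
        tie_term += ties ** 3 - ties
        rank_sum_a += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    expected = n_a * (n + 1) / 2.0
    variance = n_a * n_b / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    return (rank_sum_a - expected) / math.sqrt(variance)


# Toy data: two firms ("a", "b"), each with one backfilled (1) and one
# nonbackfilled (0) manager-year. Values are made up for illustration.
diff, se = clustered_mean_diff([1.0, 3.0, 2.0, 6.0], [0, 1, 0, 1], ["a", "a", "b", "b"])
z = rank_sum_z([3.0, 6.0], [1.0, 2.0])
```

Dividing `diff` by `se` gives the $$t$$-statistic for the difference in means. Clustering by firm matters here because the same firm contributes many manager-years, so treating observations as independent would tend to understate the standard error.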
Table 4. Summary statistics for firm characteristics

A. Firm characteristics across subsamples

| | (1) Full sample | (2) Not backfilled | (3) Backfilled | (4) All firm backfilled | (5) Index addition | (6) Other addition | (7) Manager addition |
|---|---|---|---|---|---|---|---|
| Market value of equity ($${\$}$$MM) | 5181 | 4824 | 5446* | 787** | 579** | 886** | 6088** |
| | (985) | (946) | (1022*) | (353**) | (301**) | (378**) | (1199**) |
| Leverage | 0.189 | 0.185 | 0.192* | 0.161** | 0.147** | 0.167* | 0.197** |
| | (0.157*) | (0.15) | (0.161**) | (0.067**) | (0.052**) | (0.072**) | (0.174**) |
| Dividend yield (%) | 2.26 | 2.34 | 2.19 | 0.54** | 0.4** | 0.62** | 2.43 |
| | (0.14) | (0.16) | (0.14) | (0**) | (0**) | (0**) | (0.22**) |
| Tobin’s q | 2.12** | 1.99 | 2.21** | 2.79** | 3.03** | 2.68** | 2.12** |
| | (1.49) | (1.48) | (1.5*) | (1.75**) | (1.79**) | (1.73**) | (1.47) |
| CDF($$\sigma^{2}_{\mathit{ret}}$$) | 0.5 | 0.49 | 0.50 | 0.26** | 0.23** | 0.27** | 0.54** |
| | (0.5) | (0.49) | (0.51*) | (0.19**) | (0.16**) | (0.2**) | (0.55**) |
| Return | 0.22** | 0.19 | 0.24** | 0.53** | 0.65** | 0.47** | 0.20 |
| | (0.12*) | (0.11) | (0.13**) | (0.29**) | (0.35**) | (0.27**) | (0.12) |
| Return$$_{t+1}$$ | 0.23** | 0.19 | 0.25** | 0.64** | 0.60** | 0.67** | 0.20 |
| | (0.13) | (0.12) | (0.14) | (0.34**) | (0.31**) | (0.34**) | (0.12**) |
| Return$$_{t+2}$$ | 0.19** | 0.16 | 0.21** | 0.49** | 0.39** | 0.53** | 0.15 |
| | (0.11) | (0.10) | (0.11) | (0.22**) | (0.20**) | (0.24**) | (0.09**) |
| Return$$_{t+3}$$ | 0.12** | 0.08 | 0.15** | 0.20** | 0.14* | 0.23** | 0.14** |
| | (0.06**) | (0.03) | (0.08**) | (0.07**) | (0.06) | (0.08**) | (0.08**) |
| Instl ownership Herfindahl | 0.075 | 0.073 | 0.077** | 0.124** | 0.126** | 0.123** | 0.071 |
| | (0.05) | (0.05) | (0.05*) | (0.09**) | (0.09**) | (0.09**) | (0.05**) |
| # firm-year observations | 22,720 | 9,439 | 13,281 | 1,545 | 2,250 | 705 | 11,342 |

B. Firm characteristics of backfilled versus index subsamples

| | (1) Backfilled | (2) Not backfilled (in SP1500) | (3) Not backfilled (in SP400 or SP600) | (4) Not backfilled (dropped from SP indices) |
|---|---|---|---|---|
| Market value of equity ($${\$}$$MM) | 5446 | 5308 | 1039*** | 562*** |
| | (1022) | (1085***) | (653***) | (161***) |
| Leverage | 0.192 | 0.18*** | 0.18*** | 0.25*** |
| | (0.161) | (0.154**) | (0.143***) | (0.144) |
| Dividend yield (%) | 2.19 | 2.63** | 0.49*** | 0.42*** |
| | (0.14) | (0.22***) | (0.06***) | (0***) |
| Tobin’s q | 2.21 | 1.91*** | 1.88*** | 1.73*** |
| | (1.51) | (1.47***) | (1.44***) | (1.29***) |
| CDF($$\sigma^{2}_{\mathit{ret}}$$) | 0.50 | 0.51 | 0.39*** | 0.39*** |
| | (0.51) | (0.51) | (0.38***) | (0.37***) |
| Return | 0.24 | 0.16*** | 0.17*** | 0.25 |
| | (0.13) | (0.10***) | (0.10***) | (0.03***) |
| Return$$_{t+1}$$ | 0.25 | 0.16*** | 0.16*** | 0.39*** |
| | (0.14) | (0.12**) | (0.11***) | (0.14) |
| Return$$_{t+2}$$ | 0.21 | 0.16*** | 0.16*** | 0.30*** |
| | (0.11) | (0.11) | (0.10) | (0.11) |
| Return$$_{t+3}$$ | 0.15 | 0.07*** | 0.08*** | 0.08* |
| | (0.08) | (0.03***) | (0.03***) | (–0.05***) |
| Instl ownership Herfindahl | 0.077 | 0.064*** | 0.071*** | 0.182*** |
| | (0.054) | (0.050***) | (0.056***) | (0.115***) |
| # firm-year observations | 13,281 | 7,913 | 5,595 | 676 |

This table reports means and medians of firm characteristics for fiscal years 1994 through 2005. Panels A and B use firm-level observations and compare firm characteristics obtained from COMPUSTAT and CRSP. Leverage is long-term debt divided by total assets. Div yield is the sum of dividends over the year divided by market equity. $$q$$ is Tobin’s q and CDF($$\sigma^{2}_{\mathit{ret}}$$) is the cumulative distribution function of the variance of returns for firms in our sample, following Aggarwal and Samwick (1999). Instl ownership Herfindahl is the Herfindahl index of institutional ownership using holdings from the Thomson database. A firm-level observation is considered to be backfilled (Column 3 in panel A and Column 1 in panel B) if any manager is backfilled.
Similarly, Column 7 in panel A is all firms with any manager that was backfilled due to manager addition. In panel A the first column summarizes the means (with medians below in parentheses) for all manager-year salary observations in the October 2009 vintage. The second column includes all manager-year observations that are not backfilled (using Salary as the identifier), and Column 3 reports statistics on backfilled observations. The last four columns report statistics on subsets of backfilled data. Column 4 uses firm-level backfilled data, which are observations backfilled either due to index additions or other reasons. Columns 5 through 7 decompose backfilling into its three types: index, other and manager. Panel B reports the firm characteristics for the backfilled sample versus S&P index firms. ** and * indicate differences at the 1% and 5% level, respectively, relative to Column 2 in panel A (nonbackfilled data), and relative to Column 1 in panel B (backfilled data), where differences in means are tested using $$t$$-tests with standard errors clustered at the firm level, and differences in medians are tested using Wilcoxon rank-sum tests. Focusing on Column 4, which reports summary statistics for firm-level backfilling (Index and Other additions), we find that the backfilled firms tend to be smaller, with lower dividend yields, lower leverage, and higher growth. The stock returns contain notable differences as well. Backfilled firms tend to have substantially higher subsequent stock performance (i.e., after the date of the observation), but lower variance in returns relative to firms that were not backfilled. Panel B of Table 4 reports differences in firm characteristics for backfilled observations compared to three subsamples of nonbackfilled S&P index firms along with $$t$$-tests and Wilcoxon’s z-scores. 
Comparing backfilled to nonbackfilled firms in the S&P 1500, we find that backfilled firms have significantly higher Tobin’s q and returns, but no significant differences in return volatility. Nonbackfilled observations and observations of firms dropped from the S&P 1500 are markedly different, with the latter group exhibiting low market values of equity, dividend yields, and q. Overall, Tables 3 and 4 show that the differences between backfilled and nonbackfilled observations are both statistically and economically significant. These findings alone raise concerns that including backfilled observations in analyses of relations between pay and various firm characteristics is problematic.

2.2 Multivariate analysis

To better understand the differences in characteristics between backfilled and nonbackfilled observations, we use logit specifications to model the likelihoods that (1) a firm-year is backfilled (Table 5) and (2) a firm-manager-year observation is backfilled (Table 6). In all specifications the dependent variable takes the value of one if the observation is backfilled and zero otherwise.
Our empirical specifications include variables that have been identified in prior work as being important in explaining the variation in compensation across firms.16 To compare the magnitudes of the models’ coefficients, we standardize the independent variables to have zero mean and unit variance and report odds ratios from the logit models.17

Table 5 Descriptors of backfilled firm-level data

| | (1) TDC1: All firm-level | (2) TDC1: Index | (3) TDC1: Other | (4) TDC1: Any exec backfilled | (5) Salary: Any exec backfilled |
|---|---|---|---|---|---|
| ln(Firm size) | 0.80* | 0.86 | 0.73* | 0.80** | 1.08 |
| | (–2.05) | (–0.87) | (–2.55) | (–3.19) | (1.34) |
| Leverage | 0.94 | 0.91 | 0.97 | 1.10** | 1.07* |
| | (–1.05) | (–1.11) | (–0.50) | (2.85) | (2.54) |
| Div yield | 0.93 | 0.73 | 0.92 | 1.05 | 0.97 |
| | (–0.49) | (–0.51) | (–0.48) | (1.15) | (–0.94) |
| $$q$$ | 1.19** | 1.16** | 1.18** | 0.93 | 1.01 |
| | (3.18) | (2.81) | (3.28) | (–1.22) | (0.30) |
| CDF($$\sigma^{2}_{\it ret}$$) | 0.48** | 0.37** | 0.61** | 0.97 | 1.05 |
| | (–6.77) | (–5.51) | (–3.90) | (–0.46) | (1.00) |
| Return | 1.24** | 1.16** | 1.12** | 0.92* | 1.04 |
| | (6.58) | (3.63) | (3.96) | (–2.19) | (1.76) |
| Return$$_{t+1}$$ | 1.45** | 1.20** | 1.31** | 0.87** | 1.06** |
| | (11.6) | (4.83) | (7.35) | (–4.67) | (3.18) |
| Return$$_{t+2}$$ | 1.36** | 1.06 | 1.34** | 0.91** | 1.07** |
| | (10.2) | (1.71) | (8.69) | (–2.88) | (3.32) |
| Return$$_{t+3}$$ | 1.08** | 0.98 | 1.10** | 0.97 | 1.02 |
| | (2.93) | (–0.38) | (3.33) | (–1.21) | (1.05) |
| Observations | 15,463 | 15,463 | 15,463 | 15,463 | 15,626 |

This table presents odds ratios from logit specifications using firm-level observations from ExecuComp for 1994 through 2005. In Column 1, the dependent variable equals one if the executives in a given firm have TDC1 backfilled due to either index additions or Other, and zero otherwise. The second column uses a dependent variable equal to one if TDC1 for any executive at the firm has been backfilled due to index addition, and Column 3 uses a dependent variable equal to one if TDC1 has been backfilled for other reasons. Column 4 uses a dependent variable equal to one if an executive in the firm has TDC1 that has been backfilled for any reason. The last column identifies a firm as backfilled if any manager has Salary backfilled. Ln(Firm size) is the log of market equity at the beginning of the fiscal year. Leverage is long-term debt divided by total assets. Div yield is the sum of dividends over the year divided by market equity. $$q$$ is Tobin’s q and CDF($$\sigma^{2}_{\it ret}$$) is the cumulative distribution function of the variance of returns for firms in our sample, following Aggarwal and Samwick (1999). All independent variables are standardized to have mean zero and unit variance. All specifications include year effects. Standard errors are clustered at the firm level. ** and * indicate statistical significance at the 1% and 5% level, respectively.
Table 6 Descriptors of backfilled manager-level data

| | (1) TDC1 all | (2) TDC1 firm | (3) TDC1 index | (4) TDC1 other | (5) TDC1 manager | (6) Salary all |
|---|---|---|---|---|---|---|
| Salary | 0.87** | 1.13 | 1.03 | 1.20** | 0.61** | 0.81** |
| | (–2.62) | (1.95) | (0.30) | (2.77) | (–5.19) | (–4.25) |
| Bonus | 1.03 | 1.03 | 1.08 | 1.02 | 1.00 | 0.98 |
| | (0.81) | (0.63) | (1.36) | (0.50) | (0.011) | (–0.49) |
| Other ann comp | 1.03 | 1.17 | 0.99 | 1.20 | 0.81 | 1.00 |
| | (0.26) | (1.60) | (–0.030) | (1.78) | (–1.40) | (0.069) |
| Stock grants | 1.01* | 1.01* | 0.97 | 1.01** | 1.01** | |
| | (2.44) | (2.22) | (–0.21) | (2.77) | (2.81) | |
| Black-Scholes opt val | 0.93 | 1.00 | 1.02 | 1.00 | 0.62 | |
| | (–1.30) | (0.066) | (0.75) | (0.022) | (–1.95) | |
| Option PFP | 1.17** | 1.15** | 1.10** | 1.16** | 1.16** | |
| | (6.06) | (4.83) | (3.82) | (4.51) | (4.86) | |
| % shares owned | 1.19** | 1.21** | 1.18** | 1.23** | 1.10** | |
| | (10.4) | (9.23) | (6.19) | (8.75) | (3.64) | |
| CEO | 0.62** | 0.50** | 0.61** | 0.42** | 0.87** | 0.63** |
| | (–17.5) | (–18.6) | (–11.1) | (–15.5) | (–3.93) | (–16.1) |
| ln(Firm size) | 0.84* | 0.77* | 0.78 | 0.74* | 0.88 | 0.92 |
| | (–2.40) | (–2.53) | (–1.48) | (–2.48) | (–1.27) | (–1.39) |
| Leverage | 0.98 | 0.95 | 0.88 | 0.96 | 1.02 | 0.99 |
| | (–0.51) | (–1.01) | (–1.41) | (–0.54) | (0.52) | (–0.32) |
| Div yield | 1.08 | 0.92 | 0.94 | 0.90 | 1.09 | 1.09** |
| | (1.75) | (–0.64) | (–0.18) | (–0.72) | (1.68) | (2.65) |
| $$q$$ | 1.16** | 1.21** | 1.21** | 1.19** | 1.11* | 1.13** |
| | (3.56) | (3.61) | (2.93) | (3.23) | (1.97) | (2.95) |
| CDF($$\sigma^{2}_{\it ret}$$) | 0.58** | 0.45** | 0.34** | 0.51** | 0.91 | 0.65** |
| | (–8.04) | (–7.60) | (–6.29) | (–5.32) | (–1.00) | (–6.97) |
| Return | 1.16** | 1.20** | 1.20** | 1.13** | 1.02 | 1.18** |
| | (5.13) | (4.94) | (3.61) | (3.11) | (0.39) | (5.60) |
| Return$$_{t+1}$$ | 1.32** | 1.43** | 1.36** | 1.42** | 1.01 | 1.31** |
| | (11.5) | (12.1) | (8.12) | (11.4) | (0.26) | (11.1) |
| Return$$_{t+2}$$ | 1.25** | 1.31** | 1.19** | 1.36** | 1.09* | 1.22** |
| | (9.69) | (10.0) | (4.10) | (10.5) | (2.08) | (8.88) |
| Return$$_{t+3}$$ | 1.07** | 1.09** | 1.02 | 1.11** | 1.04 | 1.08** |
| | (2.75) | (3.16) | (0.37) | (3.70) | (0.92) | (3.52) |
| Observations | 84,017 | 79,617 | 75,865 | 77,729 | 78,823 | 98,776 |

This table reports odds ratios from logit specifications using manager-level observations from ExecuComp for 1994 through 2005. Columns one through five use TDC1 to determine backfilling, and the last column uses Salary. Column 1 reports odds ratios from a specification in which the dependent variable equals one for any backfilled observation, and zero otherwise. Column 2 sets the dependent variable equal to one if the observation is backfilled due to either index additions or other reasons. Columns three through five use backfilling indicator variables for index, other, and manager backfilling, respectively. Manager compensation and ownership characteristics are obtained directly from ExecuComp (Salary, Bonus, other annual compensation, stock grants, the percentage of shares owned, and Black-Scholes option values). Option PFP is the pay-for-performance sensitivity of options grants as defined in Yermack (1995). CEO takes the value of one if the CEOANN variable in ExecuComp equals “CEO”. Ln(Firm size) is the log of market equity at the beginning of the fiscal year. Leverage is long-term debt divided by total assets. Div yield is the sum of dividends over the year divided by market equity. $$q$$ is Tobin’s q, and CDF($$\sigma^{2}_{\it ret}$$) is the cumulative distribution function of the variance of returns for firms in our sample, following Aggarwal and Samwick (1999). If the % shares owned is missing, it is set to zero, and we include a dummy variable equal to one if % shares owned is missing. We do the same for Option PFP. Coefficients on these dummy variables are not reported for brevity. All independent variables are standardized. All specifications include year effects. Standard errors are clustered at the firm level. ** and * indicate statistical significance at the 1% and 5% level, respectively.
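As a reading aid for the odds ratios in Tables 5 and 6, the sketch below converts an odds ratio into an implied change in probability. Because the regressors are standardized, each odds ratio is the multiplicative change in the odds of being backfilled for a one-standard-deviation increase in that variable; the implied change in probability depends on the starting probability. The 30% baseline probability and the function name are hypothetical choices made for illustration, not values or code from the study.

```python
def shifted_probability(baseline_prob: float, odds_ratio: float) -> float:
    """Return the probability implied by scaling the baseline odds by odds_ratio."""
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# Hypothetical baseline probability of being backfilled (an assumption).
BASELINE = 0.30

# Odds ratios taken from Column 1 of Table 5.
p_high_q = shifted_probability(BASELINE, 1.19)    # q one s.d. above the mean
p_high_var = shifted_probability(BASELINE, 0.48)  # return variance one s.d. above the mean

print(f"q +1 s.d.:        {BASELINE:.3f} -> {p_high_q:.3f}")
print(f"variance +1 s.d.: {BASELINE:.3f} -> {p_high_var:.3f}")
```

Under this assumed baseline, an odds ratio of 1.19 moves the probability from 0.30 to roughly 0.34, and an odds ratio of 0.48 moves it to roughly 0.17. The odds ratio itself is the same at every baseline; only the probability change varies.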
The odds ratios reported in Table 5 show that backfilled firms generally have higher Tobin’s q, higher stock returns and lower return variances. From Column 1, a firm with Tobin’s q that is one standard deviation above the mean is 1.19 times more likely to be backfilled than a firm with the average $$q$$. A firm with a one-standard-deviation higher variance (given by CDF($$\sigma^{2}_{\it ret}))$$ is roughly half as likely to be backfilled relative to the firm with average variance. Consistent with our univariate results, backfilled firms also tend to have higher subsequent returns. For example, a firm with a stock return over the year $$t+$$1 that is one standard deviation above the mean is 1.45 times more likely to be backfilled (where $$t$$ is the year of the compensation observation). Table 6 focuses on executive-year observations and the specifications include as explanatory variables details of the executive’s compensation, including bonus, stock grants, the Black-Scholes value of option grants, and pay-for-performance sensitivity of option grants [Yermack (1995)]. We also include executive ownership, an indicator variable for executives serving as CEO (CEOANN in ExecuComp) and firm characteristics. Consistent with the univariate analysis, Salary is significantly lower for the backfilled observations, as shown in columns one and six. However, columns two through five show that this magnitude difference is driven by manager-level backfilling. Backfilled observations tend to have larger stock grants, although the economic magnitude of the effect is low. We also see that backfilling appears to be associated with higher pay-for-performance sensitivity, backfilled executives tend to hold a higher percentage of the firm’s shares, and that CEOs are less likely to be backfilled. Examination of firm-level predictors indicates that firms with high stock returns and low return variance (CDF($$\sigma^{2}_{\it ret}))$$ are much more likely to be backfilled. 
Comparison of coefficients within a particular column indicates that the level and volatility of firm performance are among the strongest predictors of backfilling. These results are consistent with our earlier discussion regarding potential biases induced by systematically backfilling data for managers of firms that have exhibited strong firm performance.

3. The Impact of Backfilled Data on the Estimation of Compensation Metrics

A large number of studies examine the relations between executive compensation and other variables of interest. Just a few examples show the wide spectrum of academic interest in the area (some of these studies employ the ExecuComp database and others do not): broad issues about incentives and contracting (e.g., Jensen and Murphy 1990; Yermack 1995; Gillan, Hartzell, and Parrino 2009), compensation incentives and the reporting of financial information (e.g., Burns and Kedia 2006; Bergstresser and Philippon 2006), and governmental regulation and compensation contracts (e.g., Perry and Zenner 2001). To highlight the potential effects of backfilling, we replicate several common tests used in the corporate finance literature.

3.1 The level of compensation

Much research has been dedicated to explaining the level of executive pay. For example, several papers provide theoretical and empirical evidence that compensation is related to firm size (e.g., Murphy 1985; Baker and Hall 2004; Murphy and Zábojník 2004; Gabaix and Landier 2008). Given the differences in firm size and executive compensation between backfilled and nonbackfilled observations shown in the previous section, it is likely that backfilling affects the estimated relations between these variables. Similarly, other studies combine compensation data with size and additional firm characteristics in order to construct estimates of abnormal pay (e.g., Smith and Watts 1992; Core, Holthausen, and Larcker 1999; Murphy 1999; Core, Guay and Larcker 2008; Gillan, Hartzell, and Parrino 2009).
To demonstrate one such approach using total compensation (TDC1) as the relevant level of compensation, abnormal compensation can be computed as actual compensation minus its expected value from the following prediction equation:

\begin{align*}
\ln(\textit{TDC1}_{t}) &= {\rm b}_{1} \ln(\textit{Tenure}_{t}) + {\rm b}_{2} \ln(\textit{Sales}_{t-1}) + {\rm b}_{3} \textit{S\&P500}_{t} + {\rm b}_{4} \ln(\textit{BTM}_{t-1}) \\
&\quad + {\rm b}_{5} \textit{ROA}_{t} + {\rm b}_{6} \textit{ROA}_{t-1} + {\rm b}_{7} \textit{RET}_{t} + {\rm b}_{8} \textit{RET}_{t-1} + {\rm b}_{9} \textit{CEO}_{t},
\end{align*}

where Tenure$$_{t}$$ is the number of years at time $$t$$ that the executive has been with the company, Sales$$_{t-1}$$ is the company’s lagged annual sales, S&P500$$_{t}$$ is an indicator variable set to one if the firm is in the S&P 500 index and zero otherwise, BTM$$_{t-1}$$ is the lagged value of book equity over market equity, ROA is earnings before interest and taxes (EBIT) divided by the firm’s assets, RET is the firm’s stock return and CEO is an indicator set to one if the executive is the CEO (as identified by the CEOANN variable in ExecuComp).18

Table 7 shows the results of computing abnormal compensation for our full sample (Column 1) and the nonbackfilled sample (Column 2). Below the regression output, we summarize both the fitted values and error terms from each respective regression, that is, normal and abnormal compensation. We see significant differences in both normal and abnormal compensation across the backfilled and nonbackfilled observations. On average, expected or normal TDC1 is $1,759,000 for the full sample, but only $916,000 for backfilled observations. The average error term is zero by construction for the full sample, but when splitting the sample by Back_Total, we see that backfilled observations have negative prediction errors on average, and nonbackfilled data have positive errors.
The difference between the two is highly statistically significant ($$t$$-statistic of 5.14).

Table 7 Abnormal compensation

Dependent variable: ln(TDC1)$$_{t}$$

| Sample: | (1) All data | (2) Back_Total$$=$$0 |
|---|---|---|
| ln(Tenure)$$_{t}$$ | –0.008 | –0.010 |
| | (–0.76) | (–0.92) |
| ln(Sales)$$_{t-1}$$ | 0.31** | 0.32** |
| | (25.3) | (25.0) |
| S&P 500$$_{t}$$ | 0.37** | 0.37** |
| | (9.14) | (8.96) |
| ln(BTM)$$_{t}$$ | –0.22** | –0.23** |
| | (–10.9) | (–10.9) |
| ROA$$_{t}$$ | –0.021 | –0.17* |
| | (–0.22) | (–1.95) |
| ROA$$_{t-1}$$ | –0.090 | 0.003 |
| | (–1.02) | (0.03) |
| Ret$$_{t}$$ | 0.066** | 0.067** |
| | (4.44) | (4.38) |
| Ret$$_{t-1}$$ | 0.11** | –0.13** |
| | (8.00) | (–8.58) |
| CEO$$_{t}$$ | 0.84** | 0.83** |
| | (49.1) | (48.4) |
| Mean predicted TDC1, in thousands | | |
| Full sample | 1,759 | |
| Backfilled obs. | 916 | |
| Nonbackfilled obs. | 1,826 | 1,846 |
| Mean abnormal TDC1 (actual minus predicted TDC1), in thousands | | |
| Full sample | 915 | |
| Backfilled obs. | 279 | |
| Nonbackfilled obs. | 966 | 945 |
| Paired $$t$$-test for difference among nonbackfilled obs. across samples | | (21.35) |
| Mean abnormal log compensation: actual ln(TDC1) minus predicted ln(TDC1), in thousands | | |
| Full sample | 0.0000 | |
| Backfilled obs. | –0.1389 | |
| Nonbackfilled obs. | 0.0110 | 0.0000 |

This table reports coefficients from OLS specifications of executive observations from ExecuComp for fiscal years 1994 through 2005. The dependent variable is the natural log of TDC1. Column 1 uses all data, and Column 2 restricts the sample to nonbackfilled observations (Back_Total$$=$$0). Tenure is the time in years since the executive started at the firm and is obtained from ExecuComp. Sales is annual sales from Compustat. S&P 500 equals one if the firm is in the S&P500 index, and zero otherwise. BTM is book equity over market equity, ROA is EBIT over assets, RET is firm stock return and CEO is an indicator set to one if the CEOANN variable in ExecuComp equals “CEO.” All regressions include year and two-digit SIC effects. Below the regression coefficients we report mean values of the fitted values (“normal” compensation) and residuals (“abnormal” compensation). Abnormal compensation is computed as actual TDC1 minus the predicted value of TDC1 from the regression. Standard errors are clustered at the firm level. ** and * indicate statistical significance at the 1% and 5% level, respectively.

The results in Table 7 are important for at least three reasons. First, the results suggest that the addition of backfilled data leads to upwardly biased estimates of abnormal compensation for the nonbackfilled observations on average.19 Thus, not correcting for the backfilled data can lead one to conclude that a large portion of the ExecuComp sample appears to be “overpaid” relative to the group of managers that were backfilled, which exhibited stronger performance and lower compensation levels.
This result is very strong statistically; as shown in the table, a paired $$t$$-test of the residuals for the nonbackfilled observations across the full and clean samples resoundingly rejects the null hypothesis of no difference (with a $$t$$-statistic of more than 21). Second, while including backfilled data has a modest effect on abnormal compensation economically (roughly $21,000, on average), the small difference in means masks some very large differences for specific observations. For example, John Menzer, an executive at Wal-Mart, received compensation in 2003 that is $402,000 higher than that predicted when using all of ExecuComp, suggesting he may have been overpaid. However, after excluding backfilled data, his compensation is actually lower than the predicted value by $49,000, suggesting he was underpaid. Thus, backfilled data can affect whether a particular executive appears to be over- or underpaid, which can be quite important for compensation consultants or shareholder activists, and potentially important for researchers. Third, in addition to introducing an upward bias in the typical estimated abnormal compensation, the inclusion of backfilled firms will also change the distribution of abnormal compensation for different subsamples. While an investigation of changes in relations with all possible covariates of abnormal compensation is beyond the scope of this study, we did consider Tobin’s q as a plausible representative variable of interest (e.g., as a way to examine the relation between firm performance or valuation and excess compensation). We find that in a regression of Tobin’s q on abnormal compensation and a set of control variables, there is a negative and significant coefficient on the portion of abnormal compensation estimated erroneously due to backfilling.
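The mechanics behind the residual pattern in Table 7 can be illustrated with simulated data. The sketch below is not the specification used in the table: it uses a single size regressor, made-up coefficients and sample sizes, and a hypothetical 0.15 log-point pay discount for the backfilled group, all of which are assumptions chosen only to reproduce the sign pattern. It fits the pay regression once on the pooled sample and once on the nonbackfilled sample, then compares mean residuals ("abnormal" log pay) by group.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sample: 800 nonbackfilled and 200 backfilled observations.
n_clean, n_back = 800, 200
n = n_clean + n_back
backfilled = np.arange(n) >= n_clean

# Simulated log sales and log pay; backfilled executives are assumed to earn
# 0.15 log points less than otherwise identical nonbackfilled executives.
log_sales = rng.normal(7.0, 1.5, size=n)
log_pay = 4.0 + 0.3 * log_sales + rng.normal(0.0, 0.3, size=n)
log_pay[backfilled] -= 0.15


def ols_residuals(y, x, fit_mask):
    """Fit y = a + b*x on the rows in fit_mask; return residuals for all rows."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design[fit_mask], y[fit_mask], rcond=None)
    return y - design @ coef


resid_full = ols_residuals(log_pay, log_sales, np.ones(n, dtype=bool))
resid_clean = ols_residuals(log_pay, log_sales, ~backfilled)

mean_full_all = resid_full.mean()                    # zero by construction
mean_full_back = resid_full[backfilled].mean()       # negative
mean_full_clean = resid_full[~backfilled].mean()     # positive
mean_clean_clean = resid_clean[~backfilled].mean()   # zero by construction

print(mean_full_all, mean_full_back, mean_full_clean, mean_clean_clean)
```

Because pooled OLS residuals average to zero, a group with negative average residuals forces the remaining observations to have positive average residuals. Re-estimating on the nonbackfilled sample alone removes that offset, which is the comparison made in Column 2 of Table 7.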
Next, we consider a related question, the relation between abnormal incentive compensation and future firm performance, as another way to demonstrate the potential impact of backfilled data on economic and statistical inferences.

3.2 Incentive compensation and future returns

One of the classic questions in corporate governance is the degree to which an executive’s compensation affects future firm performance. The literature has made two basic arguments about the direction of this relationship. Some authors have argued (with supporting evidence) that compensation structures can provide management with incentives that lead to higher future firm performance (e.g., Abowd 1990; Hayes and Schaefer 2000; Minnick, Unal and Yang 2011). In contrast, other authors have argued and presented evidence that due to agency issues, greater executive incentive compensation leads to lower subsequent firm performance (e.g., Core, Holthausen, and Larcker 1999; Bebchuk, Cremers, and Peyer 2011; Cooper, Gulen, and Rau 2014). For example, Cooper, Gulen, and Rau (2014) find that the firms with the highest abnormal incentive compensation for their executives have significantly lower future returns. These results not only have had potential academic impact but they also have garnered significant attention in the popular press.20 The relation between abnormal incentive compensation and future firm performance is one that would be particularly susceptible to the backfilling bias that we have identified. Thus, we examine this link using two samples (the total sample and the sample without the backfilled observations) and then compare the differences between them. Specifically, we place the sample firms into deciles each year based on their excess incentive compensation, measured as the excess compensation over control firms matched on industry (two-digit SIC) and size (measured by sales).
We then measure cumulative abnormal returns in the following year as each firm’s return in excess of the average return of an industry and lagged-return matched portfolio. Table 8 reports the results for the lowest and highest deciles of firms sorted by their excess CEO incentive compensation. Although we have a different sample period from that of Cooper, Gulen, and Rau (2014), our results for the full sample of firms are similar: we find that the firms with the highest excess incentive compensation have lower future abnormal returns, a result that is significant at the 10% level. In contrast, if we eliminate the backfilled firms from the sample, there is no statistically significant relation between abnormal incentive compensation and future firm performance.

Table 8 CEO incentive compensation and future firm performance

Abnormal stock return in year t$$+$$1

| | (1) Full sample | (2) Nonbackfilled sample |
|---|---|---|
| Lowest decile abnormal incentive comp in year t | 0.046*** | 0.047*** |
| | (5.35) | (5.33) |
| Highest decile abnormal incentive comp in year t | 0.021** | 0.034*** |
| | (2.09) | (3.26) |
| High - low | –0.025* | –0.013 |
| | (–1.89) | (–0.94) |

This table reports the results of tests between the abnormal performance of the lowest and highest decile of firms sorted by their excess CEO incentive compensation. The table reports the cumulative abnormal returns in the year after the compensation ranking. Excess incentive compensation is measured as the excess compensation over control firms matched on industry and size. The cumulative abnormal returns are calculated in excess of the average return of an industry and lagged-return matched portfolio. Standard errors are clustered at the firm level. ***, **, and * indicate statistical significance at the 1%, 5%, and 10% level, respectively.

This example shows that taking a data sample and adding a large mass of firms with low levels of unexplained incentive compensation and high future performance, where those firms were selected because of their strong performance, can distort the interpretation of the relation between compensation and performance.

3.3 Pay-for-performance sensitivity

Since Jensen and Murphy (1990), researchers have often focused on the level of pay-for-performance sensitivity and whether it is sufficiently high. For example, Hall and Liebman (1998) and Murphy (1999) analyze the level and drivers of pay-for-performance sensitivity. Yermack (1995) examines the pay-for-performance sensitivity derived from option compensation. Studies of the level of international pay-for-performance sensitivity often compare their results to findings in the U.S. market based on ExecuComp data (e.g., Fernandes et al. 2013). Researchers have also examined changes in the levels of pay-for-performance sensitivity of executives after major macroeconomic events such as the Sarbanes-Oxley Act or the financial crisis (e.g., Chen, Jeter, and Yang 2015; Fahlenbrach and Stulz 2011). Additional work has tested whether the magnitude of a CEO’s pay-for-performance sensitivity affects corporate decisions (e.g., Minnick, Unal and Yang 2011).

We employ two measures of executive compensation for the pay-for-performance sensitivity estimations. The first dependent variable we use is total direct compensation, TDC1. The second measure includes total direct compensation plus stock and option ownership and is defined as TDC1$$+\Delta$$(value of shares and options owned). The change in the value of shares equals the share price times the number of shares owned multiplied by the stock return.
Similarly, the change in value of options owned is estimated as the sum across all options in the manager’s portfolio of the Black-Scholes value of the option times an estimate of the option delta multiplied by the stock return.21

3.3.1 Pay-for-performance sensitivity and firm size

Given that the backfilled firms in ExecuComp tend to have higher performance than the firms that are not backfilled, and that prior research has shown the sensitivity of pay to performance and the structure of compensation to be a function of stock returns, we expect backfilling in ExecuComp to generate a variety of poorly estimated and perhaps even spurious relations.22 For example, prior work has concluded that pay-for-performance sensitivity is significantly decreasing in firm size (Schaefer 1998; Baker and Hall 2004). However, these studies used ExecuComp data, and this relation could be largely driven by the backfilling bias.

Consider a simple example in which there are four types of firms: small firms with high returns (Small/High); small firms with low returns (Small/Low); large firms with high returns (Large/High); and large firms with low returns (Large/Low). If one further assumes that pay-for-performance sensitivity is an increasing function only of returns and not size, for example, because firms with higher expected returns grant more options, the Small/High and Large/High groups will have high pay-for-performance sensitivity, while Small/Low and Large/Low will have low pay-for-performance sensitivity. In this example, pay-for-performance sensitivity is solely a function of returns and there is no true relation between pay-for-performance sensitivity and size. As long as all four sets of firms are included in the sample for a regression analysis, the econometrician would find no relation with size.
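The four-group thought experiment can be written out as a small calculation. The sketch below assumes equal numbers of firms in each group and assigns hypothetical sensitivities of 2 to high-return firms and 1 to low-return firms; these values are illustrative only. It compares the full population with a sample that omits the Small/Low group, which is the kind of sample that results when small firms enter mainly after strong performance.

```python
from statistics import mean

# Hypothetical pay-for-performance sensitivity, driven by returns only.
PPS_BY_RETURN = {"High": 2.0, "Low": 1.0}
FIRMS_PER_GROUP = 100  # assumed equal group sizes


def build_sample(groups):
    """Return a list of (size, pps) pairs for the given (size, return) groups."""
    return [
        (size, PPS_BY_RETURN[ret])
        for size, ret in groups
        for _ in range(FIRMS_PER_GROUP)
    ]


def size_gap(sample):
    """Mean PPS of large firms minus mean PPS of small firms.

    This equals the OLS slope from regressing PPS on a large-firm dummy.
    """
    large = [pps for size, pps in sample if size == "Large"]
    small = [pps for size, pps in sample if size == "Small"]
    return mean(large) - mean(small)


population = build_sample(
    [("Small", "High"), ("Small", "Low"), ("Large", "High"), ("Large", "Low")]
)
# Selected sample: small firms appear only if they had high returns.
selected = build_sample([("Small", "High"), ("Large", "High"), ("Large", "Low")])

gap_population = size_gap(population)
gap_selected = size_gap(selected)
avg_population = mean([pps for _, pps in population])
avg_selected = mean([pps for _, pps in selected])

print(gap_population, gap_selected, avg_population, avg_selected)
```

With these assumed values the size gap is zero in the population but negative in the selected sample, and the selected sample's average sensitivity exceeds the population average, even though size plays no role in how the sensitivities were assigned.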
However, researchers employing ExecuComp would tend to observe the large firms due to their presence in the index (Large/Low and Large/High), but also a nonrandom sample of small firms, as the backfilled firms would tend to be from the Small/High group. Thus, including the backfilled sample, i.e., using Large/Low, Large/High, and Small/High in the analysis, would overestimate the level of pay-for-performance sensitivity, and would lead the econometrician to find a negative relation between pay-for-performance sensitivity and firm size when in fact none exists. If the backfilled observations were removed, then the estimated average pay-for-performance sensitivity would equal the population average, and one would find no relation between pay-for-performance sensitivity and size.

We test for the presence of such a bias by examining the relation between pay-for-performance sensitivity and size. Following Schaefer (1998), in Table 9, we report pay-for-performance sensitivity estimates from fixed effect regressions, in which we split the sample into small and large stocks based on lagged market value of equity. Panel A provides the results for CEOs, and panel B provides the results for all executives. The first two columns in each panel employ all observations available, while the latter two exclude backfilled observations. We find that the relation between pay-for-performance sensitivity and size differs depending on whether all of the observations are included or the sample is free of backfilled firms.

Table 9 Pay-for-performance sensitivity and firm size    TDC1 $$+ \Delta$$ (value of stock and option ownership)     (1)  (2)  (3)  (4)     Full sample  Excluding backfilled obs     Small  Large  Small  Large  A. 
CEO’s  $$\Delta_{t}$$shrwealth  11.85  0.027  2.35**  0.026     (1.79)  (0.50)  (6.29)  (0.48)  $$\Delta_{t-1}$$shrwealth  3.25  0.10  0.53**  0.11     (1.56)  (1.35)  (2.78)  (1.36)  Observations  4,743  5,607  4,435  5,145  R-squared  0.362  0.041  0.145  0.041  Diff-in-diff  9.5*              (1.87)           Bootstrapped diff-in-diff (average)  0.47           Bootstrapped $$t$$-stat (average)  (–0.10)           Fraction of bootstrapped diff-in-diff above 9.5  3.3%