Abstract

What determines military effectiveness? Though political scientists have studied the sources of military effectiveness, they have generally ignored the role of military leadership, a factor that historians have emphasized as crucial for effectiveness. This article presents the first rigorous examination of the proposition that militaries improve effectiveness by replacing low-performing leaders. The article tests three theories describing how militaries promote and demote leaders: (1) military leaders are promoted and demoted on the basis of combat performance; (2) political leaders fearful of coups do not demote low-performing military leaders, as a coup-proofing tactic; and (3) military leaders who belong to powerful interpersonal networks are less likely to be demoted and more likely to be promoted. Hypotheses are tested using new data on all American and German generals holding combat commands in the North African, Italian, and West European theaters in World War II and new data on the monthly combat performance of American and German divisions in these theaters. Analysis reveals that both armies replaced low-performing generals (coup-proofing motives did not prevent Hitler from demoting low performers) and that interpersonal networks in the US army did not block demotion of low performers. Also, the replacement of low-performing generals improved combat effectiveness in both armies.

Historians have long emphasized the importance of leadership in determining military effectiveness. In striking contrast, though political scientists have for decades studied many possible sources of military effectiveness including technology, regime type, strategy, military-industrial power, and others, they have almost completely neglected military leadership as a possible determinant of military effectiveness. This article presents the first rigorous social scientific treatment of the causes and consequences of quality military leadership. 
It asks three related questions: Do militaries replace poorly performing leaders? Are there conditions that might impede a military from replacing poorly performing leaders? When poorly performing leaders are replaced, does military effectiveness improve? The article proposes that militaries replace poorly performing leaders and that replacing poorly performing leaders improves military effectiveness. It also considers whether interpersonal networks among generals or coup-proofing incentives that prioritize political loyalty over competence might prevent poorly performing leaders from being replaced. The focus on individual military leaders here complements the focus of other international relations (IR) scholarship on political leaders (e.g., Goemans 2000; Rosen 2005; Bueno de Mesquita et al. 2003; Gelpi and Feaver 2005; Wolford 2007; Chiozza and Goemans 2011; Horowitz, Stam, and Ellis 2015).

We test our ideas on military leadership on the American and German armies in World War II. We use two new data sets. The first is a data set of the command tenures of all 320 American and German division-leading generals who led troops (infantry, airborne, or armored) in battle in North Africa, Italy, and Western Europe, from 1941 to 1945. The second data set records the monthly combat performance of all American and German divisions in these three theaters.1 We test whether generals leading low-performing divisions were more likely to be replaced on the German and/or the American side, and then also whether a division’s performance improved after its general was replaced. We also test two theoretical challenges to the proposition that low-performing generals get replaced: first, that interpersonal networks prevent low-performing generals from being replaced; and second, that in civilian dictatorships like Nazi Germany, generals are promoted for their political loyalty to the dictator and are therefore unlikely to be removed even after performing poorly in combat. 
Our focus on a single war follows the lead taken by other conflict scholars—especially scholars of intrastate conflict—of analyzing more fine-grained microdata from a single conflict. The microdata approach improves data quality and internal validity (Verwimp, Justino, and Brück 2009; for an example of a quantitative study of interstate war using microdata, see Allen and Vincent 2011). Our results are illuminating. In both the German and American armies, low-performing generals were more likely to be replaced, and replacement of low-performing generals significantly boosted military effectiveness. The tendency of the German army, serving under the civilian dictator Adolf Hitler, to replace low-performing generals provides evidence against the coup-proofing hypothesis that dictators always prioritize political loyalty over competence. The finding that American generals who were members of powerful interpersonal networks were not less likely to be replaced provides evidence against the hypothesis that such networks can distort the relationship between performance and leadership removal. The article makes three important contributions to IR. First, it improves our understanding of the sources of military effectiveness. The article provides some of the first rigorous evidence demonstrating that militaries improve effectiveness by replacing poorly performing leaders, and more generally that quality leadership boosts effectiveness. Second, the article improves our understanding of how military organizations behave. The results here support a more Weberian view of militaries as functionalist organizations, pushing back against perspectives that militaries are always hidebound organizations slow to adapt, impeded by factors such as interpersonal networks and coup-proofing dynamics. 
Third, the article contributes to an ongoing empirical debate about the sources of American and German military effectiveness in World War II, providing support for the argument that one reason American forces fought well in World War II was the willingness of the American high command to dismiss low-performing generals.

The remainder of this article contains five sections. The first section develops the theory that military organizations change leadership as a means of improving performance. The second section presents hypotheses. The third section presents the research design. The fourth section presents results. The final section concludes.

Military Leadership and Military Effectiveness

Especially since the end of the Cold War, IR scholars have become increasingly interested in understanding the sources of military effectiveness and, relatedly, the determinants of war outcomes. Scholars have examined a number of different possible determinants of military effectiveness, including regime type, technology, terrain, ethnicity, civil-military relations, military strategy, alliances, economic might, nationalism, and others (Stam 1996; Rosen 1996; Brooks 1998; Quinlivan 1999; Reiter and Stam 2002; Biddle 2004; Biddle and Long 2004; Brooks and Stanley 2007; Desch 2008; Beckley 2010; Grauer and Horowitz 2012; Castillo 2014; Talmadge 2015; Reiter 2017). However, there is almost no IR work on the impact of military leadership on military effectiveness, or on the determinants of the promotion and demotion of military leaders, especially in wartime.2 This lacuna is especially striking given the great emphasis that military historians and other observers have placed on military leadership as a critical determinant of military effectiveness for millennia. The ancient Chinese observer Sun Tzu (1963, 115) warned that poor-performing generals will cause “the ruin of the army.” Machiavelli (1999) placed great emphasis on quality military leadership. 
Napoleon himself once declared, “[t]here are no bad regiments; there are only bad colonels” (quoted in Farwell 2001, 206). Historians have long stressed that outstanding military leaders, from Lord Nelson to Robert E. Lee to Erwin Rommel to David Petraeus, have made key contributions to victory in conflict types ranging from counterinsurgency to naval warfare to high-intensity land war. A standard interpretation of the American Civil War is that the Union won because President Lincoln replaced the timid General George McClellan with more ruthless, effective leaders such as Ulysses Grant, William Sherman, and Phillip Sheridan (McPherson 1988). Military leadership is important for all forms of combat. We propose that effective military leaders exhibit one or more of the following three qualities. First, effective military leaders inspire their soldiers to fight and die for their country and their unit. Though nationalism and other factors have been found to play roles in motivating soldiers to fight (Castillo 2014), leaders themselves can directly encourage soldiers to fight and sacrifice themselves. Different leaders employ different strategies for motivating their soldiers, including making competent decisions, applying brutal discipline, and/or conveying concern for the well-being of their troops. During the American Civil War, for example, General Thomas “Stonewall” Jackson used strict discipline and consistently successful decision-making to motivate his troops (McPherson 1988, esp. 456–457). Second, quality leaders are masters of the craft of combat. They can accomplish a wide variety of tasks, including securing logistical support, appropriately deploying troops, achieving concealment and protection using terrain and human-made structures, and employing weapons technology to maximum effect. These skills improve combat effectiveness and affect battle outcomes. 
For example, during the 1950 Chinese attack on American forces at Chosin during the Korean War, American Marine units outperformed American Army units because of superior leadership of the former in areas such as communications, reconnaissance, and logistics (Ricks 2012, chapter 11). Masters of combat also provide more strategy and doctrinal options. Mearsheimer (1983) and Stam (1996) described the operational superiority of blitzkrieg or maneuver strategies in comparison to attrition strategies. However, such strategies require quality military leaders who are able to identify and effectively exploit emerging battlefield opportunities. Biddle (2004) made a similar point regarding what he described as the modern system of force employment. Certainly, German adoption of the tremendously successful blitzkrieg strategy presumed a high level of confidence in the quality of combat commanders (Van Creveld 1982, 36). Third, leaders choose subordinates, and quality leaders are more likely to choose quality subordinates, who in turn bolster performance. Indeed, the single most important contribution General George C. Marshall made to the American war effort in World War II may have been his identification and promotion of Dwight D. Eisenhower (Ricks 2012, chapter 2). Generals’ selection of their command staffs is especially important for modern armies (Van Creveld 1985).3 If leader quality affects military effectiveness, what factors determine the promotion and demotion of different kinds of military leaders? A straightforward perspective is that the state and the high military command retain competent leaders and dismiss ineffective leaders. 
This is a centuries-old idea, stressed by none other than Machiavelli (1999, chapter 12) in his manual on successful statecraft, The Prince: “the republic has to send its citizens, and when one is sent who does not turn out satisfactorily, it ought to recall him, and when one is worthy, to hold him by the laws so that he does not leave his command.” It also echoes Weber’s (1978, esp. 217–218) vision of a rational bureaucracy in which individuals advance on the basis of merit. More recent work in political science and, more broadly, the social sciences has explored how organizational leaders are selected and whether leadership quality affects organizational performance. Sociologists and scholars of business have explored whether and how leadership affects corporate performance and whether corporate leaders are selected on the basis of competence (Lieberson and O’Connor 1972; Tarakci, Greer, and Groenen 2016). There is a long-standing literature in American politics on legislator effectiveness (Volden and Wiseman 2014) and whether presidents choose leaders of federal agencies on the basis of merit or political considerations (Hollibaugh, Horton, and Lewis 2014). The proposition that militaries retain, promote, and demote leaders on the basis of combat performance is not well-developed within modern scholarship. There is only limited empirical work in political science as to whether militaries promote officers on the basis of aptitude and competence, though scholars have occasionally discussed specific episodes of commander (non)replacement.4 Some work is critical of the idea that military organizations retain quality leaders and remove poor-performing leaders. One body of scholarship posits that militaries are hidebound institutions resistant to change, likely to suffer from organizational biases as they process information and make decisions (Posen 1984; Snyder 1984; Van Evera 1999). 
As discussed in the next section, two dynamics, interpersonal networks and coup-proofing, might prevent militaries from replacing low-performing leaders.

A different question is, in the face of poor performance, why would a military react by replacing its leadership, as opposed to attempting a different kind of change? Indeed, a number of works have explored other types of reactions militaries might have in an attempt to improve performance, such as changing military strategy (Gartner 1997; Biddle 2004) or adopting new technologies (Horowitz 2010). In the pursuit of better performance, there are a few reasons why a military organization might choose to replace leadership rather than change strategy or technology. Changing leaders offers the possibility of a rapid improvement in performance, whereas implementing a new strategy or new technology can take more time. Financially, changing leadership is relatively cheap, as it does not require buying new equipment or revamping training programs. Relatedly, a state may wish to change its technology, but might not have access to the technology it desires, cost notwithstanding. Changing leaders often does not require militaries to overturn long-held beliefs about military strategy, beliefs that may shade into organizational culture and entrenched standardized operating procedures. Last, changing leaders is a more selective approach to improving performance. It permits an organization to attempt to improve the performance of low-performing units, such as individual divisions, and leave alone high-performing units. In contrast, it is more difficult to make such discriminate changes by changing strategy or technology.

Hypotheses

We propose that militaries see effective military leadership as a key determinant of success, and accordingly militaries make demotion and promotion decisions in wartime on the basis of combat outcomes. 

Hypothesis 1a: In wartime, military leaders are more likely to lose their command if their units experience poor combat outcomes.

Hypothesis 1b: In wartime, military leaders are less likely to be promoted if their units experience poor combat outcomes.

We also present two alternatives to Hypotheses 1a and 1b. The first focuses on regime politics. Scholars have proposed that political leaders who fear being deposed in a coup d’état, such as civilian dictators, sometimes take preventive measures to reduce the coup threat, a technique known as coup-proofing (scholars disagree as to whether or not [certain forms of] coup-proofing actually prevents coups; see Powell 2012, Rwengabo 2013, Albrecht 2015, and Harkness 2016). A central coup-proofing measure is the promotion and retention of military officers on the basis of political reliability, as manifested by factors such as familial ties, ethnic kinship, or ideological affinity with the political leader, rather than by professional competence. In one of the classic works on civil-military relations, Janowitz (1960, 353), speaking of the American context, commented that controlling the system of promotions is “a crucial lever of civilian control.” The selection of military officers on the basis of political reliability rather than competence has been hypothesized to reduce military effectiveness because it lowers leader quality (Biddle and Zirkle 1996; Quinlivan 1999; Talmadge 2015). Scholars have proposed that nondemocracies, Arab states, and states under lower levels of external threat are especially likely to engage in coup-proofing (Quinlivan 1999; Reiter and Stam 2002; Pollack 2002; Pilster and Böhmelt 2012; Talmadge 2015). We compare the leadership and combat experiences of a democratic military, World War II America, and a dictatorial military, World War II Germany. 
A coup-proofing hypothesis would forecast that a civilian dictator like Adolf Hitler should be strongly motivated to coup-proof his regime, given Germany’s recent history of coups and coup attempts, signs of dissatisfaction with the Nazi regime within the German military, and Hitler’s awareness of latent threats in the German military to his rule (Fest 1994). Coup-proofing theory would predict a divergence between the American and German experiences. The United States ought to retain generals on the basis of combat performance, but, according to coup-proofing theory, coup-proofing considerations should attenuate the connection between combat performance and career outcomes of military leaders in a dictatorship like Nazi Germany, as officers appointed for political reliability should retain their commands even when their units perform poorly.5

Hypothesis 2: In wartime, dictators should be less likely to relieve low-performing generals as compared with democratically elected civilian leaders.

A second alternative perspective is that interpersonal networks affect organizational leadership decisions. Some propose that organizations include interpersonal networks and that members of powerful networks are likely to enjoy faster professional advancement (Burt 1992). Moore and Trout (1978) presented a complex “visibility” theory of military promotion, emphasizing that, though talent helps put individuals into the pools of individuals eligible for promotion, personal networks are much more important in the promotion process than talent. They also proposed that personal networks become increasingly important in higher ranks of seniority. Other scholarship has provided empirical evidence that personal networks affect the leadership selection and promotion process in American and other militaries (Segal 1967; Peck 1994; Kim and Crabb 2014). 
That said, some have presented evidence that in the US military the value of such networks does not affect promotion patterns (Schwind and Laurence 2006) or has varied across time (Evans 1992). This interpersonal networks perspective suggests:

Hypothesis 3: In wartime, military leaders who are members of powerful interpersonal networks are more likely to be retained and promoted than leaders who are not members of such networks.

Thus far we have discussed the determinants of leadership turnover. The remaining hypothesis examines how leadership turnover affects combat performance. We note that Hypotheses 1a and 1b assume that organizations believe that replacing low-performing leaders will improve military organizational performance. We test this assumption. This is a bit more difficult to get at, as the best measure of leader effectiveness is of course combat effectiveness, and for falsifiability reasons we wish to avoid measuring the independent variable with the dependent variable. We take a more indirect approach. If the assumption in Hypotheses 1a and 1b is correct, then the performance of a military organization should improve if the organization’s leader has been replaced. One should note the assumption that the new, replacement leader will be of higher quality than the removed leader, an idea that was also explored in a formal model of civilian leadership replacement (Meirowitz and Tucker 2013).

Hypothesis 4: In wartime, when a military unit’s leader is replaced following poor combat performance of the unit, the unit’s combat performance should improve.

Research Design

We test our hypotheses by collecting and analyzing new data on the command tenure experiences of generals leading individual ground divisions in the American and German armies fighting in the North African, Italian, and West European theaters in World War II in the years 1941–1945. 
We focus on lower-ranking generals (hereafter referred to simply as generals) who each led a single division (very roughly ten thousand to thirty thousand soldiers). Divisions are larger than regiments, battalions, brigades, companies, platoons, and squads and are smaller than corps and armies. For World War II, divisions are preferable to using smaller units, because there are fewer missing data for division-level performance than there are for the performance of smaller units.6 There is a sample size advantage in looking at divisions instead of larger units, as there were hundreds of divisions in World War II, whereas the numbers of corps numbered in the low dozens and the number of armies even fewer. We focus on generals commanding ground troops rather than naval admirals or air force generals. We focus on World War II for several reasons. First, its scope and extensive historiography provide data that are both comprehensive and high quality, relative to other wars. Second, the length of the war facilitates testing our hypothesis that past poor performance increases the likelihood of a general being replaced. In shorter wars, such as the Six Day War or the 1991 Gulf War, combat does not last long enough for a military to react to poor combat performance by replacing leaders of poor-performing units. Third, World War II has been central to the study of military effectiveness (e.g., Mearsheimer 1983; Biddle 2004; Castillo 2014). Relatedly, we contribute to the long-standing debate comparing American and German military performance during World War II (see Van Creveld 1982, Brown 1986, and Overy 1995), and specifically the debate about American military leadership in World War II. Analyzing a small number of World War II generals, Ricks (2012) speculated that the United States fought well in World War II because it replaced low-performing generals with better-performing generals. 
It is important to test the Ricks thesis rigorously, as he did not do so and as it does not reflect a consensus among historians. For example, Blumenson disagreed with the view that the worst American generals were removed, observing instead that most American decisions to relieve generals in World War II were “unwarranted if not altogether unjustified” (quoted in Ricks 2012, 110). Fourth, because Germany was a civilian dictatorship and the United States was not, this comparison provides the independent variable variance needed to test the coup-proofing hypothesis (H2).

We do not include analysis of the German-Soviet World War II campaign. Doing so would introduce substantial heterogeneity in both data missingness and data quality across the Soviet and other campaigns (as described below, we code combat outcomes for the North African, Italian, and European campaigns from a single source, and that source does not cover the Soviet campaign). Excluding the Eastern Front also allows us to maintain data homogeneity, as virtually all combat in our data set is between German and American forces.

Focusing on leadership and combat dynamics in a single war provides advantages over conducting empirical analysis on a data set of several wars. The sample is large enough to permit quantitative analysis (hundreds of generals and thousands of division-months), but limited enough to permit us to account for context and to collect more fine-grained, higher-quality data. Context is especially critical when assessing combat performance, as the meaning of success in combat can vary substantially across wars, from killing the enemy to capturing territory to securing control of the population (Reiter 2009, chapter 4). 
In these World War II campaigns, combat success is generally about capturing territory, as Allied armies were attempting to secure territory en route to intermediate objectives (such as the capture of Rome and Paris), ultimately culminating with the capture of Berlin (see Blanken and Lepore 2014). German forces were also attempting to control territory to prevent Allied advances. To test Hypotheses 1, 2, and 3, we use Cox event history models of each general’s command of a particular division. The “failure” event is the end of the general’s command of the division. A command demotion occurs any time a division commander is relieved of command and given either a lower (smaller) combat command, no command, or command of a military school. It is important here to note two events that we do not count as command demotions. First, if a general transitions to the general staff of a corps or army, this is not a demotion. Transitioning to the staff of a larger unit signifies recognition of promise by a higher-ranking officer and, potentially, signals being groomed for higher command. Second, we do not code a general as being demoted if he was a temporary acting commander returning to his previous, lower-level command once a permanent division commander is designated since such appointments carry the assumption that the appointee will return to a lower command. We employ competing risks models to model separately factors that make demotion more likely and promotion more likely (Box-Steffensmeier and Jones 2004). These models are appropriate in that they allow modeling different ways in which command tenures end, and through “right-censoring” they provide a means of addressing our being unable to observe the end of command tenure whether through transfer to the Eastern Front or the end of the war. 
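The coding of how a command ends can be illustrated with a minimal sketch. It assumes hypothetical destination labels (`lower_command`, `higher_staff`, and so on) that are not the authors' variable names, treats a returning acting commander as censored (the article says only that such cases are not demotions), and builds one event indicator per risk in the cause-specific style, where the competing outcome counts as a non-event. Whether the article's competing risks models use exactly this setup is an assumption.

```python
from enum import Enum


class Outcome(Enum):
    DEMOTION = "demotion"
    PROMOTION = "promotion"
    CENSORED = "censored"


# Hypothetical labels for where a general goes when a division command ends.
DEMOTION_DESTINATIONS = {"lower_command", "no_command", "military_school"}
# The article codes a higher combat command or a higher-level staff post as promotion.
PROMOTION_DESTINATIONS = {"higher_command", "higher_staff"}


def classify_exit(destination: str, was_acting: bool = False) -> Outcome:
    """Classify the end of one command spell."""
    # Assumption: an acting commander returning to a lower command is censored,
    # since the article states only that this is not a demotion.
    if was_acting and destination == "lower_command":
        return Outcome.CENSORED
    if destination in DEMOTION_DESTINATIONS:
        return Outcome.DEMOTION
    if destination in PROMOTION_DESTINATIONS:
        return Outcome.PROMOTION
    # Illness, injury, death, Eastern Front transfer, dissolution, end of war.
    return Outcome.CENSORED


def event_indicator(outcome: Outcome, risk: Outcome) -> int:
    """Return 1 if the spell ended in the risk being modeled, else 0."""
    return 1 if outcome is risk else 0


if __name__ == "__main__":
    exit_type = classify_exit("military_school")
    print(exit_type, event_indicator(exit_type, Outcome.DEMOTION))
```

In a model of demotion, a spell that ended in promotion would enter with an indicator of 0, and vice versa.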
Figure 1 displays the number of generals demoted for each year the United States and Germany participated in the war (for more detailed information on all generals in the data set, see our Supplementary Codebook).

Figure 1. Demotions per year

A general is coded as being promoted if he receives a higher combat command (such as a corps) or is attached to the general staff of a higher command. Other means of a general’s command ending include loss of command due to poor health, injury, death, transfer to the Eastern Front, administrative dissolution of a division, or end of the war. We treat those outcomes as right-censored, meaning that we allow for the possibility that such generals would remain at risk of “failure” in the absence of such events. Figure 2 displays the number of generals promoted each year the United States and Germany participated in the war.

Figure 2. Promotions per year

One of our independent variables is combat performance of the division commanded by the general. Coding combat performance is difficult. There are perhaps three different approaches to coding combat performance: coding territory lost or gained, coding (balance of) casualties (Biddle 2004; Beckley 2010; Cochran and Long forthcoming; Weisiger 2016), and a subjective coding of success. Coding combat performance simply on the basis of territory lost or gained can be potentially misleading. If a division on the offensive gains no territory it should be viewed as unsuccessful, whereas a division on the defensive that gains no territory but loses no territory should be viewed as successful. Casualty data pose severe missing data problems, at least for the division level of analysis. 
Our exploration of primary sources in the US National Archives indicates that for American division-level casualties on a monthly basis in World War II, about 40 percent of observations would be missing. Missing data rates for German divisions would likely be even higher. Missing data issues aside, there are conceptual problems with using casualty data, as in some circumstances military forces may be willing to suffer casualties, or may need to suffer casualties, in order to accomplish goals such as seizing or preventing the seizure of territory (Reiter 2009, chapter 4; Grauer 2016). For example, in the first phase of the highly successful 2007 change in American military strategy in the Iraq War, US combat casualties went up as American forces took the necessary steps of increasing their patrols among the population (Ricks 2009, 238). Rather than focus on territory or casualties, we collected new data allowing us to code each division’s combat performance on a monthly basis, accounting for that division’s combat mission during that month. We drew these data from several sources, primarily the Green Book US government accounts of combat in World War II.7 The advantage of using the Green Book series is that it provides deep descriptions of all American combat in the three theaters we examine. Further, because the Green Books are produced as a single series by the US government, we can be more confident in the consistency of the volumes’ treatment of combat across the course of the war. Some might be concerned about an official publication like the Green Books providing a pro-US tilt, but historians generally view the publications as being reasonably unbiased, as they “set standards of military history research and writing and have become basic documents of the American participation in World War II … Overall, the official U.S. 
military histories, whether written in house, like those of the army, or by outside contractors, have withstood the initial fastidious skepticism of academics, and remain basic sources for the history of the United States war effort” (Schliffer 2001, 233–34). If these publications did provide a pro-US tilt, any introduced bias would be relatively limited, as our hypotheses do not test whether or not American forces fought better than German forces, but rather whether variance in combat performance within a division affected leadership turnover and whether leadership turnover improves a division’s combat performance over time. We use a three-category variable, 1/0/–1, of monthly combat performance (the combat performance independent variable is the two-month moving average8 of each division’s monthly combat performance coding). A division receives a 1 in a particular month for its combat performance if it enjoyed success in achieving its combat goals. For divisions tasked with launching offensives, a division is considered successful if it conquered net additional territory during that month. For divisions tasked with defense, a division is coded 1 if the division successfully blocked the adversary from gaining consequential territory, even if the defending division itself did not capture any territory. For example, the German 29th Panzer Grenadier Division helped successfully contain Allied forces around Anzio in early 1944 and received a 1 for those months. A division gets coded as –1 in a month if it failed to achieve its combat goals. Units that fail to achieve planned offensive objectives, such as the German 11th Panzer Division failing at Remagen in March 1945 and the American 36th Infantry Division failing at the Gari (Rapido) River in January 1944, are given codings of –1. 
If a division is destroyed, such as the German 84th Division in Italy in March 1944, or dissolved due to poor performance, such as the German 92nd Infantry Division in June 1944, it is coded –1 as well.9 If a division suffered defeat and engaged in a disorderly retreat, in the sense that the division suffered heavy casualties as it retreated (such as the 1st SS Panzer Division retreating from the Falaise Pocket in August 1944), it received a –1.

A third possible coding is 0. A division gets a 0 if it does not fight in a particular month. A unit engaging in orderly retreat, that is, conceding territory without suffering significant casualties, gets a 0. If a division surrendered without destruction of its unit, a more common outcome at the very end of the war, it gets coded as 0. It also gets coded as 0 if it fights but has mixed success in a month, such as a successful defense and a failed offensive, or a successful defense followed later in the month by retreat.10 Our Supplementary Codebook describes how we code each individual observation.

The approach of focusing on the success or failure of each individual division in each individual month enjoys some advantages over coding battle outcomes (for quantitative studies using battle outcome data, see Reiter and Stam 2002, chapter 3; Rotte and Schmidt 2003; Biddle and Long 2004; Ramsay 2008; and Pilster and Böhmelt 2011).11 The battle outcomes approach means creating lists of battles that are likely to differ in size and/or duration (Reiter 2009, chapter 4). The only publicly available quantitative data set of battle outcomes for all battles going back in time has been the Historical Evaluation and Research Organization (HERO) battle data set. A number of observers have described flaws in the HERO data (e.g., Brooks 2003; Desch 2008).12 The division-month approach avoids these problems by applying consistent temporal and spatial limits. The division is a common organizational unit in military commands of very roughly consistent size. 
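The three-category monthly coding and its two-month moving average can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' procedure: the function names are hypothetical, the coding is reduced to two inputs, and the average is assumed to combine each month with the one before it, leaving the first month undefined (the article's own note on the moving average may specify a different window).

```python
from typing import List, Optional


def code_month(fought: bool, succeeded: Optional[bool]) -> int:
    """Return 1 for success, -1 for failure, 0 otherwise.

    `succeeded` is None for mixed results, orderly retreat, or surrender
    without destruction, all of which the article codes as 0.
    """
    if not fought or succeeded is None:
        return 0
    return 1 if succeeded else -1


def two_month_moving_average(codes: List[int]) -> List[Optional[float]]:
    """Average each month's code with the previous month's code.

    Assumption: the first month has no predecessor, so it is left as None.
    """
    averages: List[Optional[float]] = []
    for i, current in enumerate(codes):
        if i == 0:
            averages.append(None)
        else:
            averages.append((codes[i - 1] + current) / 2)
    return averages


if __name__ == "__main__":
    # Hypothetical division: two successful months, a failure, then no combat.
    monthly = [code_month(True, True), code_month(True, True),
               code_month(True, False), code_month(False, None)]
    print(two_month_moving_average(monthly))
```

Because each value averages two codes drawn from –1, 0, and 1, the resulting measure moves in steps of 0.5.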
Each division’s performance is evaluated on the basis of a single and consistent measure of time, the month. It also permits evaluating the performance of a division when it is fighting but not involved in a major battle. Testing Hypothesis 2 is empirically straightforward, as it simply forecasts different relationships between combat outcomes and command tenure in the American and German armies. Hypothesis 3 requires new data on the interpersonal relationships of military leaders. Perhaps one of the strongest interpersonal networks within militaries is the set of officers who attended military academies. We collected new data on the military academy attendance of all the American generals in our data set. We coded a dichotomous variable 1 if the general attended a military academy, and 0 otherwise. Almost all American generals in the data set attending military academies attended West Point (one attended the Virginia Military Institute). Proxying personal networks for the German sample with academy attendance is inappropriate as the Germans did not have a standardized system of academy education during World War II as the Americans did. We do include a variable coding whether a German general is a member of the Schutzstaffel (SS). However, we recognize that SS generals may be less likely to be demoted either because of network factors, coup-proofing factors (Hitler relied on the SS to maintain internal political control), or combat performance factors (SS divisions fought well).13 We coded a dichotomous variable 1 if the general commanded an SS division, and 0 otherwise. Results The data set contains 1,703 general-month observations for 320 generals—91 American and 229 German—commanding 195 infantry, armored, and airborne divisions—65 American and 130 German—in the North African, Italian, and Western European theaters in April 1941–May 1945.14 The first division-month with combat occurs in April 1941, and the last occurs in May 1945. 
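As a rough illustration of the division-month measure, the sketch below takes a sequence of monthly codes (–1, 0, or 1) for one division and computes the trailing two-month and three-month moving averages mentioned in the notes. The function name, the treatment of the first month (averaging over however many months are available), and the sample sequence are assumptions made for illustration only; the Supplementary Codebook remains the authority on how each observation is actually coded.

```python
def moving_average(codes, window=2):
    """Trailing moving average of monthly combat codes (-1, 0, or 1).

    Assumption: early months with fewer than `window` prior observations
    are averaged over the months that are available.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    averages = []
    for i in range(len(codes)):
        start = max(0, i - window + 1)
        recent = codes[start:i + 1]
        averages.append(sum(recent) / len(recent))
    return averages


# Hypothetical monthly codes for a single division (not from the data set).
monthly_codes = [1, -1, 1, 1, -1]

two_month = moving_average(monthly_codes, window=2)
three_month = moving_average(monthly_codes, window=3)

print(two_month)
print(three_month)
```

In this toy sequence, a successful month followed by an unsuccessful one averages to 0 in the two-month window, while two successful months followed by an unsuccessful one still average above 0 in the three-month window, which matches the intuition given in the notes.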
Event history analysis evaluates the factors that affect the duration of a certain spell, and for this study that spell is the duration of a general’s command. Generals can have multiple commands within the war (that is, multiple spells). We measure command duration in months, and in our data set command duration ranges from one to twenty-five months, with an average of 5.2 months. Table 1 provides summary statistics of the data.

Table 1. Summary statistics of combat performance and academy attendance

                        Full Sample   US        Germany
Performance   Mean      0.359         0.666     0.137
              SD        0.536         0.409     0.507
              Min/Max   −1/1          −0.5/1    −1/1
              N         1508          631       887
Academy       Mean      –             0.560     –
              SD        –             0.5       –
              Min/Max   –             0/1       –
              N         –             693       –
SS            Mean      –             –         0.060
              SD        –             –         0.237
              Min/Max   –             –         0/1
              N         –             –         1007

In the data set a total of twenty-one generals (eight American and thirteen German) lost their command through demotion. Eighteen German generals ended their combat commands through promotion, and thirteen American generals ended their combat commands through promotion. We analyze the full data set, and then the American and German generals in separate models. We use robust standard errors, clustering on the general.
All results are robust to clustering on division (see Supplementary Appendix). We use the Efron method for breaking ties. We exclude general-months in which the general’s division did not fight or had not fought in the previous two months. Analysis of Schoenfeld residuals reveals that the German generals data set contains a nonproportional hazard for the combat outcomes and the military academy independent variables (Box-Steffensmeier, Reiter, and Zorn 2003). We accordingly include time interactions for the Germany subsample. The results for analysis of the full data set, the American generals subsample, and the German generals subsample are displayed in Table 2. Note that we analyze separate models for our two primary failure events, promotion and demotion. In the promotion models, we set command demotion as the competing risk and vice versa.

Table 2. Cox models of combat command duration, US and Germany, 1941–1945

                     Model 1        Model 2        Model 3    Model 4    Model 5       Model 6
Sample               Entire sample  Entire sample  US only    US only    Germany only  Germany only
“Failure” event      Demotion       Promotion      Demotion   Promotion  Demotion      Promotion
Combat performance   0.380**        0.886          0.143***   1.274      0.010***      1.159
                     (0.149)        (0.387)        (0.084)    (1.613)    (0.009)       (0.690)
Combat x time        –              –              –          –          2.08***       –
                                                                         (0.376)
Military academy     –              –              0.884      0.850      –             –
                                                   (0.645)    (0.453)
SS                   –              –              –          –          0.000***      1.965
                                                                         (0.000)       (1.928)
Observations         1218           1218           488        488        729           729
log pseu.-like.      –64.33         –87.84         –19.96     –22.15     –29.15        –52.07

Notes: (1) Robust standard errors reported, clustered on command. (2) Hazard rates reported. (3) Efron methods used for ties. (4) Statistical significance: *p<0.1, **p<0.05, ***p<0.01. (5) All significance tests one-tailed.

Table 2 demonstrates that better combat performance significantly reduces the chance that a general will be demoted, but has no significant effect on the chance that a general will be promoted, across the full, American, and German samples. This provides support for Hypothesis 1a, but not 1b: performance affects demotion but not promotion.
The null result for promotion may be because both militaries wish to remove ineffective generals from their commands, but when a general demonstrates competence, it may be more useful to the operational effort to allow him to keep his command rather than promote him; perhaps combat-proven generals are more greatly needed in battle rather than on a general staff. The results also indicate that attending a military academy has no significant effect on a general’s likelihood of promotion or demotion in the US subsample, providing evidence against Hypothesis 3. Using an alternative measure of interpersonal network, graduation from any college as a proxy for membership in the network of the upper socioeconomic class (note that only 10 percent of Americans held a college degree in 1920, around when World War II American generals would have attended college), was also insignificant.15 Recall that we do not estimate a model with military academy attendance for the Germany sample because of the different nature of German military academies. Instead we estimate models with a variable indicating whether the general served in the SS. We find very strong evidence, both substantively and statistically, that SS membership significantly reduces the likelihood of demotion. This lower likelihood of demotion could be because there was an interpersonal network protecting SS generals, Hitler used the SS to safeguard his regime and for coup-proofing reasons might have been hesitant to demote an SS general,16 or because SS divisions enjoyed higher levels of combat performance. Unfortunately, we do not currently have the data to conclude with confidence exactly why SS generals were less likely to be demoted, in part because of the low number of SS generals. We do not find a significant effect of SS membership on promotion. We ran a number of robustness checks. First, we treat as censored the nine generals that were wounded prior to being demoted, to account for recuperation. 
Second, we treat as censored three generals that historians indicate may have been removed due to troubled relationships with superiors rather than combat performance. Third, we drop those observations in which divisions are destroyed, surrender, or disband. The results hold across these modifications. One possible concern is that generals might be removed only after fighting has ceased and not while fighting is still ongoing. However, the data do not indicate such a relationship. There is clear evidence of this trend in only four (of the twenty-one) cases: Jay Mackelvie, who commanded the American 90th Infantry Division; Werner Goeritz, who commanded the German 92nd Infantry Division; Erwin Sander, who commanded the German 245th Infantry Division; and Eberhard von Schuckmann, who commanded the German 352nd Infantry (later Volksgrenadier) Division. In only these examples is it the case that divisions took a break from the front during the change of command. An additional concern is that our monadic division-month approach does not account for which divisions are fighting each other (that is, we use division-monads rather than division-dyads), and a division’s performance may be affected by the quality of its opponent as well as the quality of its leader. Though using division-dyads is conceptually appealing, collecting accurate data on division-dyads would be extraordinarily difficult. Combat often means a jumble of one army’s divisions fighting a jumble of another army’s divisions, making it difficult to discern discrete dyads of exactly who is fighting whom. Relatedly, a single division may be spread piecemeal across a front, with different elements of that one division fighting several enemy divisions. The Battle of the Bulge, for example, involved large numbers of American and German troops. Even attempting to code a small portion of this battle, such as the Battle of St.
Vith, would be problematic, as elements of three American divisions fought elements of four German divisions. The short temporal window addresses this problem at least in part by increasing the likelihood that both the removed general and his replacement will face the same adversary. We also take other steps. We control for the front and month of the war (Supplementary Table A6, see Supplementary Appendix). This will capture whether any month or front was particularly difficult, perhaps because a division is facing a particularly tough opponent (e.g., facing German divisions commanded by General Rommel in North Africa in 1942–1943). The results are robust to the inclusion of these controls. Unfortunately, our sample size does not allow including dummies for each month of the war, which would more flexibly capture monthly variation. Finally, some may be concerned about potential bias from excluding the Eastern Front in the analysis. It could be the case that underperforming German generals were punished with a transfer or that competent German generals were rushed to the Eastern Front as it crumbled. This is not the case, as there is no difference between the performance of German generals who were transferred and those who were not. The coefficients presented in Table 2 are difficult to interpret directly because they present the subhazard ratios estimated in each model. Numbers less than one (and bounded by zero) indicate that increases in the independent variable make the failure outcome less likely. In Model 1, for example, the outcome of interest is command demotion. The coefficient on combat outcome is less than one, which indicates that as combat performance improves, the commander is less likely to suffer demotion. To assess how different levels of combat performance influence the probability of command demotion over time, we plot the cumulative incidence functions17 over the number of analysis months and present the results in Figures 3–5. Figure 3.
Competing-risks regression (full sample). Figure 4. Competing-risks regression (USA). Figure 5. Cox PH regression (Germany). Figure 3 examines the effect of different combat outcomes over time on the likelihood of suffering a command demotion. Notably, poor performance (combat outcome = −1) is always associated with the greatest likelihood of demotion. As performance improves, the likelihood of demotion decreases. Good performance (combat outcome = 1) is associated with the smallest likelihood of demotion. Figures 4 and 5 describe the effects of combat performance on the American and German samples, respectively. Analysis indicates that there is no statistically significant difference between the effect of combat performance on the likelihood of demotion in the American versus German generals, providing evidence against Hypothesis 2, though American generals appear more likely to be demoted regardless of performance.18 It is possible that the relationship between combat performance and leadership tenure may have changed as the war endured. Specifically, it may be that in the later years of the war as the German war effort was collapsing, German army underperformance would be more likely to be forgiven, and the relationship between poor combat performance and leader demotion might have weakened.19 To examine this possibility, we reran three variants of Model 5 from Table 2: (1) just observations from 1944 and 1945, (2) just observations from 1944, and (3) just observations from 1945. We find that the results hold in these subsamples, casting doubt on the speculation that the combat-command tenure relationship changed in the German army as the war endured.
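To make the reading of the ratios in Table 2 concrete, the sketch below works through the arithmetic for the combat performance estimate in Model 1. It assumes the usual proportional-hazards interpretation, in which a one-unit increase in a covariate multiplies the (sub)hazard by the reported ratio. The helper name is hypothetical, and the calculation does not apply to Model 5, where the time interaction makes the ratio vary with tenure.

```python
def relative_hazard(hazard_ratio, change):
    """Multiplicative change in the hazard for a given change in the covariate.

    Assumes proportional hazards: a change of `change` units multiplies
    the hazard by hazard_ratio ** change.
    """
    return hazard_ratio ** change


# Combat performance estimate reported for Model 1 (demotion, entire sample).
model1_ratio = 0.380

one_unit = relative_hazard(model1_ratio, 1)    # e.g., performance 0 -> 1
full_range = relative_hazard(model1_ratio, 2)  # performance -1 -> 1
percent_reduction = (1 - one_unit) * 100

print(f"One-unit improvement multiplies the demotion hazard by {one_unit:.3f}")
print(f"That is a {percent_reduction:.0f} percent reduction")
print(f"Moving from -1 to 1 multiplies the hazard by {full_range:.4f}")
```

Under these assumptions, a one-unit improvement in the performance measure corresponds to roughly a 62 percent lower demotion hazard, and moving across the full scale from –1 to 1 corresponds to a hazard about 0.14 times as large.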
Some might argue that selection effects contaminate our ability to draw inferences about the effects of military academy attendance on likelihood of promotion or demotion. The assignment of the “treatment” of military academy attendance is not random, and several possible dynamics could be at play. For example, individuals with more intrinsic aptitude might be more likely to attend military academies. We do not have the data to model factors that affect decisions to attend military academies, nor do we have the data to permit a matching strategy to get at this threat to causal inference. One possible solution to this threat to inference is to evaluate the interactive effect between combat performance and military academy attendance. That is, conditional on combat performance, does military academy attendance affect the likelihoods of promotion and demotion? Put differently, does military academy attendance insulate a poorly performing general from demotion, and does military academy attendance mean that a high-performing general enjoys an especially high likelihood of promotion? We analyzed models including an interaction of combat outcome by military academy in the American subsample. The results fail to find that military academy attendance influences the effect of combat outcome on demotion and promotion. Thus far, we have examined the determinants of command demotion and promotion, demonstrating that poor command performance increases the likelihood of being removed from command. We next test Hypothesis 4, asking, if a low-performing commanding general is replaced, does the performance of the division, under the command of the new general, improve? We explore this possibility in two empirical tests, using regression. In the first, we compare two sets of divisions: divisions that are led by generals who replaced a general who was removed for low performance, and divisions that are led by generals who replaced a general who was removed for some other reason. 
The dependent variable is the change in the division’s performance from just before the general’s removal (whether for low performance or some other reason) to four months after the transition (allowing time for the new general to settle into command). The dependent variable ranges from –1 to 2 (see note 20). If Hypothesis 4 is correct, then divisions that experienced a transition in command because of low performance should see a bigger increase in combat performance from before to after the transition, as compared with divisions that experienced a transition for some other reason. Table 3 presents this analysis, with the independent variable being a dichotomous variable coded 1 if a general was replaced following low performance.

Table 3. Regression analysis of the effects of general replacement on combat performance: comparing generals of similar command tenure

                                         Model 7
Replacement Following Low Performance    1.01***
                                         (0.11)
Constant                                 0.99***
                                         (0.11)
Observations                             69
R squared                                0.02

Notes: (1) Significance tests are one-tailed. (2) Both coefficients significant at the 0.001 level. (3) Robust standard errors clustered on general reported in parentheses.

That variable is positively signed and statistically significant, providing support for Hypothesis 4. Replacement moves combat performance a full point on the three-point scale of –1 to 2.
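The logic of the Table 3 comparison can be sketched in a few lines. With a single 0/1 regressor, the ordinary least squares constant equals the mean outcome in the comparison group and the coefficient equals the difference in group means. The numbers below are invented purely for illustration and are not drawn from the article's data; the function name is likewise hypothetical, and the sketch omits the clustered standard errors reported in the table.

```python
from statistics import mean


def ols_with_dummy(changes, replaced_for_low_performance):
    """OLS of y on a constant and one 0/1 regressor (closed-form solution)."""
    x = replaced_for_low_performance
    y = changes
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical changes in division performance (after minus before).
changes = [0.0, 0.5, 0.0, -0.5, 1.0, 2.0, 1.5, 1.0]
# 1 = predecessor replaced following low performance, 0 = other reason.
dummy = [0, 0, 0, 0, 1, 1, 1, 1]

intercept, slope = ols_with_dummy(changes, dummy)

# The same quantities computed directly as group means.
mean_other = mean(c for c, d in zip(changes, dummy) if d == 0)
mean_low = mean(c for c, d in zip(changes, dummy) if d == 1)

print(intercept, slope)
print(mean_other, mean_low - mean_other)
```

Read this way, the coefficient in Table 3 is the additional change in performance seen in divisions whose previous commander was replaced following low performance, relative to divisions whose commander left for some other reason.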
One possible concern about the test described in Table 3 is that the observed improvement in performance when a low-performing general gets replaced could be regression to the mean and that the observed correlation between the replacement of a low-performing general and an improvement in performance is spurious rather than causal (Kahneman 2011). We assess the possibility that the observed improvement in performance following replacement of a low-performing general is spurious by comparing two groups. The first group includes divisions four months after a general was replaced following low performance. The second group includes divisions some four months after a low level of performance was observed (that is, performance at the same low level as those divisions that experienced commander replacement), but only those divisions that did not experience replacement. If the observed increase in performance reported in Table 3 is spurious, then we should see no difference between the two groups, as a division’s performance four months after poor performance will be the same whether or not its general was replaced. But, if replacing a poor-performing general does cause an improvement in performance, then we should observe a difference between the two groups. Table 4 describes this analysis. The positive and significant coefficient for the replacement variable suggests that we can be more confident that the observed relationship in Table 3 is not spurious. Instead, the replacement of generals did lead to higher performance, as forecasted in Hypothesis 4.

Table 4. Regression analysis of the effects of general replacement on combat performance: comparing divisions after poor performance is observed

                Model 8
Replacement     0.48***
                (0.181)
Constant        0.27***
                (0.086)
Observations    71
R squared       0.05

Notes: (1) Significance tests are one-tailed. (2) Statistical significance: *p<0.1, **p<0.05, ***p<0.01. (3) Robust standard errors clustered on general reported in parentheses.

The results are robust to including a number of control variables (Supplementary Tables A7 and A8, see Supplementary Appendix). We rerun the regressions in Tables 3 and 4 with a host of biographical information, including whether or not the commander attended a military academy, commanded a battalion, or commanded a regiment. We also control for the month of the war and the front in which the division is fighting. The results are robust to these specifications. Finally, one may be concerned that the improvement in combat performance is instead the result of the division receiving replacement troops, equipment, or rest. We examined what divisions tended to be doing during the month their commanders lost command. In only four cases do the divisions appear to be not fighting when their commander is relieved (Jay Mackelvie, Werner Goeritz, Erwin Sander, and Eberhard von Schuckmann). Two of these are not included in the analysis because the division dissolved after the commander’s relief (Goeritz) or the relief occurred just before the war’s end (Sander). While it is certainly possible for a division to receive replacements while fighting, we, unfortunately, do not have sufficient data to examine these possibilities in greater detail.
Conclusions This article has demonstrated rigorous empirical evidence for the importance of a previously underappreciated determinant of military effectiveness, military leadership. In our empirical sample, militaries improved combat performance by replacing poorly performing generals. The findings also indicate that interpersonal networks did not prevent the American military from replacing underperforming generals, and political loyalty considerations did not prevent the German military from replacing underperforming generals. The finding provides a more comprehensive portrait of how militaries try to improve their performance, that they replace leaders as well as change strategy and adopt new technology. The article also provides some support for a more Weberian view of military organizations, at least for those militaries analyzed here, in contrast to the perspectives provided by some theories of military organizations portraying them as dysfunctional, hidebound entities. Our article is only a first step in understanding the causes and effects of military leadership. We present three suggestions for future research. First, as noted, militaries can improve effectiveness by replacing leaders, changing strategy, and/or adopting new technology. This article focused on changing leadership, and future work can develop and test theory that considers how organizations choose among the options of changing leadership, strategy, and/or technology as means of improving effectiveness. Second, future work should explore cases beyond the United States and Germany in World War II. This is not just a matter of exploring the external validity of this article’s findings. Examining other cases would improve our understanding of the scope conditions of our theory, in particular exploring conditions in which militaries might not replace low-performing leaders. 
Dictators facing lower levels of external threat in relation to internal threat (Talmadge 2015) or lacking possession of other coup-proofing tools like bribery or indoctrination (Reiter 2016) might be less motivated to replace low-performing military leaders. Careerism and other trends even within democratic militaries like the United States can make it more difficult to replace even low-performing leaders (Ricks 2012). Future work needs to develop further theoretical expectations as to why and when militaries might be less likely to replace low-performing leaders and then test these expectations on appropriate empirical domains. Third, future work can consider more carefully the interplay between different levels of leadership within military organizations. The performance of low-ranking generals is affected by leadership at higher and lower ranks, as higher-ranking generals make decisions about the deployment of individual divisions, and lower-ranking officers must implement the generals’ orders. Future work can explore the interplay between these leadership levels. One promising avenue is the exploration of information flows between levels of the organizational hierarchy and how parochial incentives can distort these flows and, in turn, the accuracy of inferences the high command tries to make about whether or not to replace lower-ranking officers (see Wagstaff 2016). Supplemental Information Supplemental information is available at the Foreign Policy Analysis data archive. Dr. Dan Reiter is the Samuel Candler Dobbs Professor of Political Science at Emory University. William A. Wagstaff is a Ph.D. candidate in Political Science at Emory University and an Assistant Professor with the Blue Horizons Program at the US Air Force’s Center for Strategy and Technology. His dissertation explores the connections between military leadership and combat effectiveness.
The views expressed in this article are those of the authors and do not necessarily reflect the official policy or position of the air force, the Department of Defense, or the US government. Opinions, conclusions, and recommendations expressed or implied within are solely those of the author and do not necessarily represent the views of the Air University, the Department of Defense, or any other US government agency. 1 As we discuss in greater detail below, our novel measure of combat performance contains division-specific two-month moving averages of division performance. We use territorial objectives (e.g., capturing or holding territory versus losing territory) to measure division performance (see Gartner and Myers 1995, 377). 2 Reiter and Stam (2002, 79) present evidence using Historical Evaluation and Research Organization (HERO) data connecting leadership quality with combat success. Grauer’s (2016) theory touches on effective military leadership, but focuses more on organizational structure and information flows. Rosen (1991, esp. 20–21) proposed that higher-ranking officers seeking to innovate may make promotion and demotion decisions on the basis of whether or not the lower-ranking officers support the proposed innovation, but focused his argument on peacetime. Moore and Trout (1978) present a theory of promotion in the US military. Avant (2007, 85) developed a theory connecting political institutions and military effectiveness, listing in a table “officer selection” as one of the intermediate variables between civilian institutions and effectiveness. The coup-proofing literature also links military leader quality to combat outcomes, proposing that dictators (at least sometimes) coup-proof their armies, have lower-quality military leaders, and in turn experience lower effectiveness (Quinlivan 1999; Brooks 1998; Pilster and Böhmelt 2011; Talmadge 2015).
This article tests the coup-proofing thesis that dictators prioritize political loyalty over competence in their officers, as described below. 3 One potential caveat to the importance of military leaders for military effectiveness is that military leaders have different degrees of flexibility, that is, authorized decision-making autonomy, across time and space (Grauer 2016). Postindependence Arab armies, such as the Egyptian army during the 1973 Yom Kippur War, generally afford their generals and lower-ranking officers relatively little wartime decision-making autonomy (Brooks 1998; Pollack 2002). That is, higher leader quality may have less impact on performance in militaries in which leaders have less autonomy. 4 For example, Posen (1993, 97) noted that Prussia’s 1806 defeats at Jena and Auerstadt pushed its high military command to dismiss officers responsible for the defeat and make officer recruitment more systematic. 5 Very occasionally, IR scholars have observed in passing that Hitler appears not to have engaged in coup-proofing (e.g., Castillo 2014, 56–57; Talmadge 2015, 31). This article presents the first rigorous empirical test of the assertion that Hitler did not engage in at least one form of coup-proofing, making military leadership demotion and promotion decisions on the basis of political reliability rather than performance. 6 We drew this conclusion after inspecting combat records available at the National Archives II in College Park, Maryland. 7 Sources for coding data on combat outcomes and the command tenures and outcomes of generals include the following: Armed Forces Information School (1950); Blumenson (1993a, 1993b); Clarke and Smith (1993); Cole (1993a, 1993b); Fisher (1993); Garland and Smythe (1993); Harrison (1993); Howe (1993); and MacDonald (1993a, 1993b). For German combat outcomes, we rely upon Mitchum (2007). 8 The results are robust to using a three-month moving average (see Table A1, Supplementary Appendix). 
Using three-month moving averages reduces the number of observations available to analyze. Using moving averages also has the benefit of accounting for the commander’s overall performance. For example, a commander that has a successful month and then an unsuccessful month is coded as having average performance in the two-month moving average. Outlying performance is discounted further in the three-month moving averages where, for example, two successful months followed by an unsuccessful month yield a coding that indicates that the commander is relatively successful. 9 Note that we do not code commanders of divisions that are destroyed, surrender, or disband as receiving a demotion despite not receiving a subsequent command. This provides a more conservative test of the hypotheses. The results are robust to dropping the division-months in which the division was destroyed, surrendered, or disbanded. 10 While admittedly blunt, the –1/0/1 coding scheme allows us to sidestep the issue of ordering monthly outcomes that represent mixed performance. It would be difficult, for example, to distinguish a priori between a month with a successful defense and failed offensive versus a month with a successful defense with a later disastrous retreat. We acknowledge also that any ordinal scheme of this nature requires arbitrary cutoffs. Having three distinct values reduces the number of these difficult decisions we must make as compared to when there are four or more values. The current coding scheme allows for this ambiguity while also identifying clearly good (1) and poor (–1) performance. While casualty data are generally problematic, future work could utilize primary source material to understand better how commanders viewed the tradeoff between territorial advancement and the perceived costs to that advancement. 
11 The month is the most appropriate time unit, as after-action reports are generally passed to higher-level commands each month rather than at shorter or longer intervals.
12 Biddle and Long (2004) and Cochran and Long (forthcoming) use a revised version of the HERO data set, but their revisions aim to correct errors in coding outcomes and other issues; they do not address some of the definitional and conceptual issues described here.
13 A further issue is that SS membership is likely correlated with both personal networks (being a ranking member of the Nazi party) and political loyalty.
14 Note that for 1941–1942, we include only German combat operations in North Africa; we exclude combat operations in Greece, Yugoslavia, and the Soviet Union.
15 See Table A5 (Supplementary Appendix).
16 However, though the SS protected Hitler’s regime, especially in the 1930s, the SS military leadership grew increasingly disenchanted with Hitler as the war unfolded. Some SS generals eventually planned to ignore some of Hitler’s military orders and even cooperated with efforts to overthrow Hitler (Ripley 2004, 335–36).
17 Cumulative incidence functions display the probability that the individual “fails” (here, is demoted) in time period t conditional on surviving to period t. We exclude the German sample. Stata 14 required us to manipulate the data and run a Cox proportional hazard model in order to plot the cumulative hazard functions. We manipulated the data in such a way that the cumulative hazard functions for Germany are functionally equivalent to the cumulative incidence functions for the full and American samples.
18 Recent work (Reiter 2016) points out that there are a number of coup-proofing strategies available to leaders and that not all of them entail diminishing military effectiveness.
This null finding lends further credence to such claims and points to the importance of exploring more carefully the relationship between coup-proofing and subsequent combat performance in other settings.
19 Note that this is different from the possibility of a nonproportional hazard, in which the relationship between combat and leadership demotion changes as the number of months of the general’s tenure increases.
20 A value of 2 means that the division transitioned from poor performance (combat outcome = –1) to good performance (combat outcome = 1). While –2 is also theoretically possible, there is no instance where a division transitions from good performance to poor performance.
References
Albrecht Holger. 2015. “The Myth of Coup-Proofing: Risk and Instances of Military Coups d’état in the Middle East and North Africa, 1950-2013.” Armed Forces & Society 41 (4): 659–87.
Allen Susan Hannah, Vincent Tiffiny. 2011. “Bombing to Bargain? The Air War for Kosovo.” Foreign Policy Analysis 7 (1): 1–26.
Armed Forces Information School. 1950. The Army Almanac: A Book of Facts Concerning the Army of the United States. Washington: US Government Printing Office.
Avant Deborah. 2007. “Political Institutions and Military Effectiveness: Contemporary United States and United Kingdom.” In Creating Military Power: The Sources of Military Effectiveness, edited by Brooks Risa A., Stanley Elizabeth A., 80–104. Stanford, CA: Stanford University Press.
Beckley Michael C. 2010. “Economic Development and Military Effectiveness.” Journal of Strategic Studies 33 (1): 43–79.
Biddle Stephen. 2004. Military Power: Explaining Victory and Defeat in Modern Battle. Princeton, NJ: Princeton University Press.
Biddle Stephen, Zirkle Robert. 1996.
“Technology, Civil-Military Relations, and Warfare in the Developing World.” Journal of Strategic Studies 19 (2): 171–212.
Biddle Stephen, Long Stephen. 2004. “Democracy and Military Effectiveness: A Deeper Look.” Journal of Conflict Resolution 48 (4): 525–46.
Blanken Leo J., Lepore Jason J. 2014. “Performance Measurement in Military Operations: Information Versus Incentives.” Defence and Peace Economics 26 (5): 516–35.
Blumenson Martin. 1993a. Breakout and Pursuit. Washington: US Army.
Blumenson Martin. 1993b. Salerno to Cassino. Washington: US Army.
Box-Steffensmeier Janet M., Jones Bradford S. 2004. Event History Modeling: A Guide for Social Scientists. Cambridge, MA: Cambridge University Press.
Box-Steffensmeier Janet M., Reiter Dan, Zorn Christopher. 2003. “Nonproportional Hazards and Event History Analysis in International Relations.” Journal of Conflict Resolution 47 (1): 33–53.
Brooks Risa A. 1998. Political-Military Relations and the Stability of Arab Regimes. New York: Oxford University Press.
Brooks Risa A. 2003. “Making Military Might: Why Do States Fail and Succeed? A Review Essay.” International Security 28 (2): 149–91.
Brooks Risa A., Stanley Elizabeth, eds. 2007. Creating Military Power: The Sources of Military Effectiveness. Stanford, CA: Stanford University Press.
Brown John Sloan. 1986. “Colonel Trevor N. Dupuy and the Mythos of Wehrmacht Superiority: A Reconsideration.” Military Affairs 50 (1): 16–20.
Bueno de Mesquita Bruce, Smith Alastair, Siverson Randolph M., Morrow James D. 2003. The Logic of Political Survival. Cambridge, MA: MIT Press.
Burt Ronald S. 1992. Structural Holes: The Social Structure of Competition. Cambridge, MA: Harvard University Press.
Castillo Jasen. 2014.
Endurance and War: The National Sources of Military Cohesion. Stanford, CA: Stanford University Press.
Chiozza Giacomo, Goemans H. E. 2011. Leaders and International Conflict. Cambridge, MA: Cambridge University Press.
Clarke Jeffrey, Smith Robert Ross. 1993. Riviera to the Rhine. Washington: US Army.
Cochran Katherine McNabb, Long Stephen B. Forthcoming. “Measuring Military Effectiveness: Loss Exchange Ratios for Multilateral Interstate Wars, 1816–1990.” International Interactions.
Cole Hugh. 1993a. The Ardennes: Battle of the Bulge. Washington: US Army.
Cole Hugh. 1993b. The Lorraine Campaign. Washington: US Army.
Desch Michael C. 2008. Power and Military Effectiveness: The Fallacy of Democratic Triumphalism. Baltimore, MD: Johns Hopkins University Press.
Evans David. 1992. “A New Way to Train Military Officers.” Baltimore Sun, February 18.
Farwell Byron. 2001. The Encyclopedia of Nineteenth-Century Land Warfare: An Illustrated World View. New York: Norton.
Fest Joachim. 1994. Plotting Hitler’s Death: The Story of German Resistance. Translated by Little Bruce. New York: Metropolitan Books.
Fisher Ernest Jr. 1993. Cassino to the Alps. Washington: US Army.
Garland Albert N., Smythe Howard. 1993. The Last Offensive. Washington: US Army.
Gartner Scott Sigmund. 1997. Strategic Assessment in War. New Haven, CT: Yale University Press.
Gartner Scott Sigmund, Myers Marissa Edson. 1995. “Body Counts and ‘Success’ in the Vietnam and Korean Wars.” Journal of Interdisciplinary History 25 (3): 377–95.
Gelpi Christopher, Feaver Peter D. 2005. Choosing Your Battles: American Civil-Military Relations and the Use of Force. Princeton, NJ: Princeton University Press.
Goemans H. E. 2000. War and Punishment: The Causes of War Termination and the First World War. Princeton, NJ: Princeton University Press.
Grauer Ryan. 2016. Commanding Military Power.
Cambridge, MA: Cambridge University Press.
Grauer Ryan, Horowitz Michael C. 2012. “What Determines Military Victory? Testing the Modern System.” Security Studies 21 (1): 83–112.
Harkness Kristin A. 2016. “The Ethnic Army and the State: Explaining Coup Traps and the Difficulty of Democratization in Africa.” Journal of Conflict Resolution 60 (4): 587–616.
Harrison Gordon. 1993. Cross-Channel Attack. Washington: US Army.
Hollibaugh Gary E. Jr., Horton Gabriel, Lewis David E. 2014. “Presidents and Patronage.” American Journal of Political Science 58 (4): 1024–42.
Horowitz Michael C. 2010. The Diffusion of Military Power: Causes and Consequences for International Politics. Princeton, NJ: Princeton University Press.
Horowitz Michael C., Stam Allan C., Ellis Cali M. 2015. Why Leaders Fight. Cambridge, MA: Cambridge University Press.
Howe George F. 1993. Northwest Africa: Seizing the Initiative in the West. Washington: US Army.
Janowitz Morris. 1960. The Professional Soldier: A Social and Political Portrait. Glencoe, IL: Free Press.
Kahneman Daniel. 2011. Thinking, Fast and Slow. New York: Farrar, Straus and Giroux.
Kim Insoo, Crabb Tyler. 2014. “Collective Identity and Promotion Prospects in the South Korean Army.” Armed Forces & Society 40 (2): 295–309.
Lieberson Stanley, O’Connor James F. 1972. “Leadership and Organizational Performance: A Study of Large Corporations.” American Sociological Review 37 (2): 117–30.
MacDonald Charles. 1993a. The Last Offensive. Washington: US Army.
MacDonald Charles. 1993b. The Siegfried Line Campaign. Washington: US Army.
Machiavelli Niccolò. 1999. The Prince. New York: Signet.
McPherson James M. 1988. Battle Cry of Freedom: The Civil War Era.
New York: Ballantine Books.
Mearsheimer John J. 1983. Conventional Deterrence. Ithaca, NY: Cornell University Press.
Meirowitz Adam, Tucker Joshua A. 2013. “People Power or a One-Shot Deal? A Dynamic Model of Protest.” American Journal of Political Science 57 (2): 478–90.
Mitchum Samuel W. Jr. 2007. German Order of Battle, 3 vols. Mechanicsburg, PA: Stackpole Books.
Moore David W., Thomas Trout B. 1978. “Military Advancement: The Visibility Theory of Promotion.” American Political Science Review 72 (2): 452–68.
Overy Richard. 1995. Why the Allies Won. New York: Norton.
Peck B. Mitchell. 1994. “Assessing the Career Mobility of U.S. Army Officers: 1950-1974.” Armed Forces and Society 20 (2): 217–37.
Pilster Ulrich, Böhmelt Tobias. 2011. “Coup-Proofing and Military Effectiveness in Interstate Wars.” Conflict Management and Peace Science 28 (4): 331–50.
Pilster Ulrich, Böhmelt Tobias. 2012. “Do Democracies Engage in Less Coup-Proofing? On the Relationship between Regime Type and Civil-Military Relations.” Foreign Policy Analysis 8 (4): 355–72.
Pollack Kenneth M. 2002. Arabs at War: Military Effectiveness, 1948–1991. Lincoln, NE: University of Nebraska.
Posen Barry R. 1984. The Sources of Military Doctrine: France, Germany, and Britain Between the World Wars. Ithaca, NY: Cornell University Press.
Posen Barry R. 1993. “Nationalism, the Mass Army, and Military Power.” International Security 18 (2): 80–124.
Powell Jonathan. 2012. “Determinants of the Attempting and Outcome of Coups d’état.” Journal of Conflict Resolution 56 (6): 1017–40.
Quinlivan James T. 1999. “Coup-Proofing: Its Practice and Consequences in the Middle East.” International Security 24 (2): 131–65.
Ramsay Kristopher W. 2008. “Settling it on the Field: Battlefield Events and War Termination.” Journal of Conflict Resolution 52 (6): 850–79.
Reiter Dan. 2009. How Wars End. Princeton, NJ: Princeton University Press.
Reiter Dan. 2016. “Choosing the Tools of Coup-Proofing: The Puzzle of Nazi Germany.” Unpublished manuscript, Emory University.
Reiter Dan, ed. 2017. The Sword’s Other Edge: Tradeoffs in the Pursuit of Military Effectiveness. Cambridge: Cambridge University Press.
Reiter Dan, Stam Allan C. 2002. Democracies at War. Princeton, NJ: Princeton University Press.
Ricks Thomas E. 2009. The Gamble: General Petraeus and the American Military Adventure in Iraq. New York: Penguin.
Ricks Thomas E. 2012. The Generals: American Military Command from World War II to Today. New York: Penguin.
Ripley Tim. 2004. Hitler’s Praetorians: The History of the Waffen-SS, 1925-1945. Staplehurst, UK: Spellmount.
Rosen Stephen Peter. 1991. Winning the Next War: Innovation and the Modern Military. Ithaca, NY: Cornell University Press.
Rosen Stephen Peter. 1996. Societies and Military Power: India and Its Armies. Ithaca, NY: Cornell University Press.
Rosen Stephen Peter. 2005. War and Human Nature. Princeton, NJ: Princeton University Press.
Rotte Ralph, Schmidt Christoph. 2003. “On the Production of Victory: Empirical Determinants of Battlefield Success in Modern War.” Defence and Peace Economics 14 (3): 175–92.
Rwengabo Sabastiano. 2013. “Regime Stability in Post-1986 Uganda: Counting the Benefits of Coup-Proofing.” Armed Forces and Society 39 (3): 531–59.
Schliffer John. 2001. “History Program, U.S. Army Military.” In World War II in the Pacific: An Encyclopedia, edited by Sandler Stanley, 232–34. New York: Garland.
Schwind David A., Laurence Janice H. 2006.
“Raising the Flag: Promotion to Admiral in the United States Navy.” Military Psychology 18 (Suppl): S83–S101.
Segal David R. 1967. “Selective Promotion in Officer Cohorts.” Sociological Quarterly 8 (2): 199–205.
Snyder Jack. 1984. The Ideology of the Offensive: Military Decision-Making and the Disasters of 1914. Ithaca, NY: Cornell University Press.
Stam Allan C. III. 1996. Win, Lose, or Draw: Domestic Politics and the Crucible of War. Ann Arbor: University of Michigan Press.
Sun Tzu. 1963. The Art of War. Translated by Griffith Samuel B. London: Oxford University Press.
Talmadge Caitlin. 2015. The Dictator’s Army: Battlefield Effectiveness in Authoritarian Regimes. Ithaca, NY: Cornell University Press.
Tarakci Murat, Greer Lindred L., Groenen Patrick J. F. 2016. “When Does Power Disparity Help or Hurt Group Performance?” Journal of Applied Psychology 101 (3): 415–29.
Van Creveld Martin. 1982. Fighting Power: German and U.S. Army Performance, 1939–1945. Westport, CT: Greenwood Press.
Van Creveld Martin. 1985. Command in War. Cambridge, MA: Harvard University Press.
Van Evera Stephen. 1999. Causes of War: Power and the Roots of Conflict. Ithaca, NY: Cornell University Press.
Verwimp Philip, Justino Patricia, Brück Tilman. 2009. “The Analysis of Conflict: A Micro-Level Perspective.” Journal of Peace Research 46 (3): 307–14.
Volden Craig, Wiseman Alan E. 2014. Legislative Effectiveness in the United States Congress. Cambridge, MA: Cambridge University Press.
Wagstaff William A. 2016. “Organizing Evaluation: Assessing Combat Leadership Quality.” Presented at the Annual Meeting of the Peace Science Society (International), Oxford, MS.
Weber Max. 1978. Economy and Society. Edited by Roth Guenther, Wittich Claus.
Berkeley: University of California Press.
Weisiger Alex. 2016. “Learning from the Battlefield: Information, Domestic Politics, and Interstate War Duration.” International Organization 70 (2): 347–75.
Wolford Scott. 2007. “The Turnover Trap: New Leaders, Reputation, and International Conflict.” American Journal of Political Science 51 (4): 772–88.
© The Author (2017). Published by Oxford University Press on behalf of the International Studies Association. All rights reserved. For permissions, please e-mail: firstname.lastname@example.org
Foreign Policy Analysis – Oxford University Press
Published: Aug 3, 2017