# Self-Selection and Comparative Advantage in Social Interactions

## Abstract

We propose a theory of social interactions based on self-selection and comparative advantage. In our model, students choose peer groups based on their comparative advantage within a social environment. The effect of moving a student into a different environment with higher-achieving peers depends on where in the ability distribution she falls and the shadow prices that clear the social market. We show that the model’s key prediction, that an individual’s ordinal rank predicts her behavior and test scores, is borne out in one randomized controlled trial in Kenya as well as administrative data from the United States. To test whether our selection mechanism can explain the effect of rank on outcomes, we conduct an experiment with nearly 600 public school students in Houston. The experimental results suggest that social interactions are mediated by self-selection based on comparative advantage. (JEL: I21, J24)

The editor in charge of this paper was Imran Rasul.

Acknowledgments: Previous versions of this paper circulated under the title “A Roy Model of Social Interactions.” We are grateful to Gary Becker, Edward Glaeser, Bryan Graham, Richard Holden, Lawrence Katz, Steven Levitt, Franziska Michor, Bruce Sacerdote, Chris Shannon, Jesse Shapiro, Andrei Shleifer, Michela Tincani, Glen Weyl, as well as seminar participants at Harvard and Chicago for many helpful comments and suggestions. Brad Allan, Vilsa Curto, Tanaya Devi, Matt Davis, Ryan Fagan, Adriano Fernandes, Natalya Naumenko, and Wonhee Park provided excellent research assistance. Financial support from the Weatherhead Center for International Affairs and Institute for Humane Studies [Cicala], the Education Innovation Lab at Harvard University [Fryer], and the German National Academic Foundation [Spenkuch] is gratefully acknowledged.

Correspondence can be addressed to the authors at Harris School of Public Policy, University of Chicago, 1155 E 60th Street, Chicago IL 60637 [Cicala]; Department of Economics, Harvard University, 1805 Cambridge Street, Cambridge MA 02138 [Fryer]; MEDS Department, Kellogg School of Management, 2211 Campus Drive, Evanston IL 60208 [Spenkuch]; or by e-mail. The usual caveat applies.

## 1. Introduction

Social scientists have long recognized the importance of peer effects.1 Gans (1962), for instance, describes an insidious form of social interactions in which Italian immigrant communities in Boston’s West End impose costs on individuals who “act mobile.”2 Wilson (1987) argues that the development of an “underclass” of black city dwellers on Chicago’s South Side was due to the emigration of working families and the resulting decrease in role models and neighborhood quality, and Borjas (1995) demonstrates that the mean skill level within one’s ethnic group in the previous generation is correlated with own educational achievement.

To better understand these and related phenomena, economists have developed models of social interactions by putting environmental variables (such as the mean behavior in one’s social group or the mean educational attainment in one’s neighborhood) into agents’ utility functions. In this class of models, peers are a source of positive externalities; unruly peers cause more trouble and smarter peers encourage higher achievement (Akerlof 1997; Becker 1974, 1996; Benabou 1993; Bernheim 1994).

However, empirical evidence in support of theories that predict that favorable social interactions lead to better outcomes has been mixed.
Although many authors confirm the hypothesis relying on plausibly exogenous variation in data sets ranging from primary students in Texas to freshmen at Dartmouth College, others find no or even negative peer effects.3 Recently, Carrell, Sacerdote, and West (2013) present perplexing evidence from a randomized field experiment designed to boost the achievement of low-ability freshmen at the US Air Force Academy. Based on earlier, experimental estimates indicating nonlinear, positive effects of peers’ mean test scores, the authors “optimally” constructed peer groups by pairing low-ability students with a greater share of high-ability ones. Contrary to the authors’ expectations, however, the intervention had a negative and statistically significant effect on the very students it was designed to help. That is, if anything, “better” classmates led to slightly worse outcomes. Carrell et al. (2013) note that within their “optimally” designed peer groups, low-ability students avoided the very peers with whom they were intended to interact and instead formed more homogeneous subgroups. At an abstract level, social interactions involve a decision about what to do with whom. We note that similar decision problems arise in many other economic settings, such as the choice of an occupation or industry (Heckman and Scheinkman 1987; Miller 1984; Roy 1951), immigration (Borjas 1987), educational investment decisions (Willis and Rosen 1979), or the division of labor within households (Becker 1991). Building on this insight, this paper explores the role of self-selection in social interactions—both theoretically and empirically. Our analysis begins by developing a theory of social interactions that builds on Roy’s classic account of self-selection in the labor market (Roy 1951). 
To model the endogeneity of social contacts, we posit the existence of an implicit price mechanism in the “market for peers.” In our simple theoretical framework, children care only about social status from membership in peer groups. In equilibrium, heterogeneity in ability leads students to self-select into groups based on their comparative advantage within a particular setting. When self-selection is the guiding principle of peer group formation, a child’s behavior is an equilibrium outcome. It depends on where in the ability distribution she falls, and on the shadow prices that clear the social market. Put differently, selection into peer groups is based on comparative advantage, and peer effects arise due to the endogenous sorting of students into peer groups within a social environment. It is important to emphasize at the outset that there are no intrinsic externalities built into the model. That is, in contrast to standard theories of peer effects, students do not explicitly benefit from the ability of the other members of their group. In our Roy model, children endogenously determine which peer group to join within a particular social environment. Peer effects arise because the presence of others affects students’ choice of peer group and, therefore, their behavior. The idea of self-selection in social interactions is closely related to a large literature on endogenous network formation (see, e.g., Bala and Goyal 2000; Jackson and Rogers 2007; Jackson and Wolinsky 1996; and the survey by Jackson, Rogers, and Zenou 2017). In fact, our theory may be viewed as a simple model of network formation using comparative advantage as an allocation mechanism. Put differently, even within classrooms networks are likely incomplete—not everyone is friends with everyone else—and our model predicts who socializes with whom, which in turn affects children’s behavior. 
An important and novel implication of our model is that a student’s academic achievement and problem behaviors depend on her ordinal rank among her peers.4 We document the impact of rank in two different data sets.5 The first one comes from a randomized controlled trial in Kenya, and was collected by Duflo et al. (2011). In 2005, 121 Kenyan primary schools with a single first-grade class received additional funds to hire another teacher and create a second section. In 61 of these schools, students were randomized into classrooms. In the remaining 60 schools, students were assigned to sections based on initial achievement. The intervention lasted 18 months. Relying on the experimentally induced variation in the within-classroom rank of students with equal baseline test scores, we show that increasing a student’s rank by 50 percentiles boosts test scores at endline by about .2 standard deviations.

To provide additional evidence on the relationship between ordinal rank and student outcomes, we use administrative data from New York City Public Schools (NYCPS). Our research design for these data exploits transitions from elementary to middle school, that is, from fifth to sixth grade. We estimate that a 50 percentile decrease in rank among schoolmates is associated with as much as a 2.5 percentage point increase in the probability of a serious behavioral incident, about 32% of its mean. To account for sorting into schools as well as potential issues of reverse causality, we use students’ hypothetical change in rank if they had attended the school for which they were zoned as an instrument for the actual change in rank based on the school they chose to attend.
Although our IV estimates are less precise, they are qualitatively very similar to their OLS counterparts.6 Consistent with the idea of self-selection in social interactions, the evidence suggests that students’ ordinal rank exerts a significant effect on their achievement and behavior.7 Yet, these data cannot rule out other mechanisms. For instance, if teachers always target the top of the class, then higher ranked students would benefit from more appropriate instruction and thus experience an increase in test scores. Although such an explanation would not invalidate our empirical result that ordinal rank matters for student outcomes, it illustrates the multiplicity of plausible channels through which the effect of rank might operate. Finding direct evidence in support of self-selection based on comparative advantage is challenging—in large part because neither the shadow prices that clear the social market nor students’ comparative advantage are directly observable in standard data sets. Similarly, given the potential number of alternative theories that predict a relationship between ordinal rank and economic outcomes, and the data requirements associated with testing each, it is unclear how one could rule out all, or even most, of them.8 Thus, rather than trying to pin down the share of the relationship between rank and outcomes that is attributable to a particular mechanism, we pursue the more modest goal of providing additional evidence that some portion of the effect of rank on behavior is due to self-selection—as this is the precise mechanism explored in our model. To this end, we conducted a “framed” field experiment.9 Relying on the experimentally induced variation, we can test whether self-selection based on comparative advantage can explain the observed relationship between rank and outcomes. 
Between February 2015 and May 2015, we recruited nearly 600 children from two public middle schools in Houston, TX and incentivized them to “solve” mazes in a custom-made computer game. According to our conversations with principals and teachers, children at these schools have extensive experience playing comparable games on their phones or even on the schools’ computers. By embedding our experiment in a familiar context, we hope to replicate as many of the myriad situational factors that may affect students’ behavior as we possibly can, while maintaining enough experimental control to identify causal effects. During the initial stage of the game, students were asked to solve a common set of mazes in order to establish a baseline measure of ability. The software then publicly revealed the ordinal rank as well as the cardinal performance of all participants in the same experimental session—similar to the scoreboard in many popular video games. In the next stage, all students were afforded the opportunity to practice solving mazes at a fixed, randomly determined cost per maze. In addition, children in the treatment group could pay to publicly “slime” the screen of any practicing peer. Sliming another student’s screen carried no monetary benefit, but it blocked a portion of the maze on which the respective participant was working, thereby negating the benefits of practicing for her. In the third and last stage of the game, students were asked to solve more difficult mazes and were rewarded with a piece rate for each one they successfully completed. At no point during the experiment did monetary payoffs depend on ordinal rank. Among children in the control group we observe that, conditional on actual performance in the first stage, children who were paired with better peers and, therefore, find themselves closer to the bottom of the distribution are, if anything, more likely to invest in becoming better at solving mazes. This is not the case in the treatment group. 
Being able to publicly “slime” their peers, lower-ranked students substitute away from practicing and pay to disrupt others instead.

Given that children were randomly assigned to either treatment or control, our experimental design permits us to test the hypothesis that the opportunity to self-select into a second, disruptive activity exerts a disproportionate effect on lower-ranked children. That is, by giving students the choice between a constructive and destructive activity we allow for self-selection based on comparative advantage to affect social interactions in the treatment group, but not the control. Perversely, the very children who would ordinarily practice more to overcome their relative disadvantage chose to “act out” instead. This suggests that students’ behavior is mediated, in part, by self-selection based on comparative advantage within narrowly defined social settings.

Although it is difficult to generalize from the findings of any given experiment (and perhaps even more so in our case), we note that our experimental design holds many, if not all, potential confounds fixed. For instance, the experimental results cannot be due to a change in students’ cognitive or noncognitive skills, teacher conduct, or environmental influences. This does not necessarily mean that these factors are irrelevant for explaining the patterns that we document in the real-world data. It does, however, suggest that self-selection may be an important mechanism for explaining peer effects.

The remainder of the paper is organized as follows. Section 2 develops a formal model of self-selection in social interactions, and Section 3 presents empirical evidence consistent with the theory’s key prediction. Section 4 describes an experiment designed to test the mechanism highlighted in our model. The final section concludes.10

## 2. A Comparative Advantage Theory of Social Interactions

The model we propose in this paper is a simplified version of the well-known multi-sector choice problem, building upon impressive literatures designed to understand the evolution of earnings, the hedonic pricing of skills, and the assignment of workers to firms (e.g., Heckman and Sedlacek 1985; Murphy 1986; Rosen 1974; Roy 1951; Sattinger 1979). The novelty of our approach lies in the application of these classic methods to develop a theory of social interactions where contacts within a social market are endogenous and peer effects arise due to the sorting of agents within narrowly defined social settings.

### 2.1. Basic Building Blocks

Let there be a continuum of students with unit mass. Every child is endowed with one unit of nontransferable time. There are two activities in which students can engage with their peers: studying or mischief. These activities are exclusive and undertaken by separate social groups: “nerds” and “troublemakers.”

Children acquire social status from membership in peer groups. How much status membership in group j ∈ {N, T} conveys depends on the effective group size, Lj, and other exogenously given factors, which we label capital, Kj. We allow capital to broadly represent any nonhuman input into groups’ activities, such as the availability of textbooks or sharp scissors, the quantity of policing, or school and neighborhood quality more generally.

Children are heterogeneous along two dimensions. Their varying size and strength yield differences in the ability to cause trouble, whereas heterogeneity in cognitive ability implies differences in their ability to be a true nerd. Let σN(r) denote the effective units of “nerdiness” that student r is capable of contributing to the group (e.g., expertise in differential geometry). Analogously, r’s troublemaking ability is given by σT(r).
Without loss of generality, we rank students by their relative skill σ(r) ≡ σN(r)/σT(r), such that σ(r) ≥ σ(r′) whenever r > r′. For simplicity, we assume that children are solely interested in maximizing their social status

$$U(r) = \max\left\{ s_{N}\sigma_{N}(r),\; s_{T}\sigma_{T}(r) \right\}, \tag{1}$$

where the shadow prices sN and sT denote the (endogenously determined) status per effective unit of nerd and troublemaking ability, respectively. Thus, total utility from membership in group j is given by sjσj(r). Note that there are no explicit externalities built into students’ utility. Conditional on “prices,” the behavior of others has no influence on own decisions, as in analyses of traditional markets.

The key assumption in equation (1) is that, all else equal, “nerdier” individuals, that is, those with higher σN(r), will derive more utility from joining the nerd sector than children with less nerd ability. In Cicala et al. (2011), we allow for more general utility functions (e.g., children care about more than just social status, or they can allocate parts of a fixed time endowment). Here, however, we present a very simple and parsimonious model in order to demonstrate how self-selection in social interactions can produce “peer effects.” It is important to note that the main results continue to hold as long as the benefits from joining a particular group are increasing in the respective dimension of ability, so that sorting into peer groups is at least partially determined by comparative advantage.11

Children maximize their social status by choosing either the nerd or troublemaking group according to a simple cut-off rule. The student indifferent between the two sectors, r*, has a skill ratio of

$$\sigma(r^{\ast}) = \frac{s_{T}}{s_{N}}. \tag{2}$$

By individual optimization, all students with index r ≥ r* join forces with the nerds, and individuals with r < r* become troublemakers.
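The cut-off rule in equations (1) and (2) can be illustrated with a minimal Python sketch. The linear skill schedules `sigma_N` and `sigma_T` below are assumptions chosen only so that relative skill rises with r; they are not taken from the paper, and the shadow prices are treated as given rather than solved for.

```python
# Hypothetical skill schedules on r in [0, 1] (illustrative assumptions).
def sigma_N(r: float) -> float:
    return 0.5 + r          # nerd ability rises with the index r


def sigma_T(r: float) -> float:
    return 1.5 - r          # troublemaking ability falls with r


def relative_skill(r: float) -> float:
    """sigma(r) = sigma_N(r) / sigma_T(r), increasing in r by construction."""
    return sigma_N(r) / sigma_T(r)


def choose_group(r: float, s_N: float, s_T: float) -> str:
    """Equation (1): join the group that yields the higher status payoff."""
    return "nerd" if s_N * sigma_N(r) >= s_T * sigma_T(r) else "troublemaker"


def cutoff(s_N: float, s_T: float, tol: float = 1e-10) -> float:
    """Equation (2): bisection for r* with sigma(r*) = s_T / s_N.

    Returns 0.0 or 1.0 when every student prefers the same group.
    """
    target = s_T / s_N
    if relative_skill(0.0) >= target:
        return 0.0
    if relative_skill(1.0) < target:
        return 1.0
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if relative_skill(mid) >= target:
            hi = mid
        else:
            lo = mid
    return hi
```

With equal shadow prices the indifferent student in this sketch sits at r* = 0.5; raising `s_T` relative to `s_N` moves the cut-off up, so more students become troublemakers.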
In our Roy model of social interactions, comparative (rather than absolute) advantage determines a child’s choice of peer group, and therefore whether she chooses to engage in studying or mischief. As a result, the supply of skills to both groups is given by

$$L_{N}^{\ast} = \int_{r^{\ast}}^{1} \sigma_{N}(q)\, dq, \tag{3}$$

$$L_{T}^{\ast} = \int_{0}^{r^{\ast}} \sigma_{T}(q)\, dq. \tag{4}$$

Equilibrium, however, also depends on the endogenously determined shadow prices, and, therefore, on the relationship between social status, sj, and effective group size, Lj. There are many plausible micromechanisms for why the payoffs to joining either group may be a function of groups’ sizes. For instance, it could be that group membership directly confers social status, or that students simply derive utility from spending time with others. Beyond the narrow context of our model and in the spirit of Spence (1973) and Austen-Smith and Fryer (2005), students may also care about signaling their “type” to adults. In equilibrium, the credibility of their signals will depend on the number of others who adopt the same behavior, which introduces interdependence in payoffs.

Since our primary goal is to study how self-selection into activities depends on the composition of peers, we remain agnostic about the deep determinants of sj. Instead of taking a firm stand on the ultimate source of utility from group membership, we simply explore different possibilities for the functional relationship between sj and Lj.

### 2.2. Case I: Social Status Decreases with Effective Group Size

In a traditional Roy model it is usually assumed that labor exhibits diminishing returns to scale. If, for instance, increasing the number of troublemakers does more to increase the probability of getting caught than of winning a fight, then social status may be decreasing in effective group size, LT.
Similarly, intelligence may confer greater status when this skill is scarce and not readily available within a social environment. Hence, we first assume ∂sj(Lj, Kj)/∂Lj < 0. In equilibrium, market clearing and equation (2) yield the following condition:

$$\delta(r^{\ast}) \equiv \frac{s_{T}(L_{T}(r^{\ast}),K_{T})}{s_{N}(L_{N}(r^{\ast}),K_{N})} = \sigma(r^{\ast}), \tag{5}$$

where δ(r) denotes the ratio of social status in both sectors when the marginal student is r. Since δ′(r) < 0 for all r, the relative “price” schedule in the market for peers is strictly downward sloping. To see this, note that equations (3) and (4) respectively imply $$dL_{N}^{\ast }/dr<0$$ and $$dL_{T}^{\ast }/ dr>0$$, which causes the absolute as well as the relative status of troublemakers to decrease as agents shift from the nerd into the troublemaking group.

We can now describe equilibrium graphically. Figure 1 depicts the situation when social status is decreasing in effective group size. As described above, it features upward sloping “supply” and a downward sloping “price” schedule. There is a unique equilibrium at r* with market clearing relative status, (sT/sN)*. All students with r < r* select into the troublemaker group and children with r ≥ r* choose to become nerds.

Figure 1. Equilibrium in the market for peers when status decreases with effective group size.

Social status may also depend on the specifics of the exogenously given environment. To allow for this possibility we let sj depend not only on group size, but also on any number of environmental inputs Kj. Suppose ∂sj(Lj, Kj)/∂Kj > 0 and imagine a shift in the “capital” available to the troublemaking sector: less police surveillance, or an increase in the availability of drugs, weapons, or alcohol.
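To make condition (5) and the role of capital concrete, the following Python sketch solves for the Case I equilibrium by bisection. The linear skill schedules and the status function `status(L, K) = K / (1 + L)` are hypothetical choices that merely satisfy the sign assumptions in the text (status falls with effective group size and rises with capital); nothing here is estimated or taken from the paper.

```python
def sigma(r: float) -> float:
    """Relative skill for assumed schedules sigma_N = 0.5 + r, sigma_T = 1.5 - r."""
    return (0.5 + r) / (1.5 - r)


def L_N(r: float) -> float:
    """Equation (3): integral of sigma_N(q) = 0.5 + q over [r, 1]."""
    return 1.0 - 0.5 * r - 0.5 * r * r


def L_T(r: float) -> float:
    """Equation (4): integral of sigma_T(q) = 1.5 - q over [0, r]."""
    return 1.5 * r - 0.5 * r * r


def status(L: float, K: float) -> float:
    """Assumed status function: decreasing in L, increasing in K."""
    return K / (1.0 + L)


def delta(r: float, K_T: float, K_N: float) -> float:
    """Relative status s_T / s_N when the marginal student is r."""
    return status(L_T(r), K_T) / status(L_N(r), K_N)


def equilibrium_cutoff(K_T: float = 1.0, K_N: float = 1.0, tol: float = 1e-12) -> float:
    """Solve sigma(r*) = delta(r*) by bisection; sigma - delta is increasing in r."""
    def excess(r: float) -> float:
        return sigma(r) - delta(r, K_T, K_N)

    lo, hi = 0.0, 1.0
    if excess(lo) >= 0.0 or excess(hi) <= 0.0:
        raise ValueError("no interior equilibrium for these parameters")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if excess(mid) >= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)
```

Under these assumed forms, calling `equilibrium_cutoff(K_T=2.0)` returns a larger cut-off than `equilibrium_cutoff(K_T=1.0)`: more troublemaker capital raises the relative status of troublemaking and shrinks the nerd group.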
Holding everything else constant, an increase in troublemakers’ productive capital, KT, is represented by an outward shift of the δ-schedule, which results in higher status for troublemakers and, therefore, fewer nerds. A decrease in the “capital” available to troublemakers has the opposite effects. Thus, with respect to features of the physical environment our Roy model of social interactions features conventional predictions.

Comparative statics with respect to the skill distribution, however, can be quite counterintuitive. Consider, for instance, an increase in nerd skill among the population holding troublemaking ability fixed. First, an increase in children’s nerdiness raises σN relative to σT and thus shifts the “supply” curve inward. Second, the equilibrium price schedule shifts outward due to the fact that with more academically able peers there will be more effective units of nerd skill supplied at any r, which lowers sN in equation (5). Although both shifts lead to an unambiguous rise in the relative status of troublemakers, the effect on quantities is indeterminate.

Figure 2 illustrates this point. In the left panel, social status is largely unresponsive to group size, so the price schedule moves little despite the large shift in the distribution of relative skill. On net the shift in σ(r) outweighs that in δ(r), which results in an expansion of the nerd group. In the panel on the right, however, nerds’ social status drops rapidly with group size, leading to a much larger outward shift of the price schedule, δ1(r).12 In the new equilibrium, fewer children choose to become nerds, despite the fact that everyone has higher nerd ability than before.

Figure 2. Effect of higher-quality peers when social status is decreasing in effective group size.

### 2.3. Case II: Social Status Increases with Effective Group Size

Much of the literature on social interactions assumes that the marginal utility of a social activity is increasing in overall participation. In models with a social multiplier, peer effects arise because an individual’s utility from taking a particular action increases in the number of agents in her reference group who behave in the same way (e.g., Becker and Murphy 2000; Glaeser et al. 2003). By assuming that social status rises with effective group size, that is, ∂sj(Lj, Kj)/∂Lj > 0, our comparative advantage theory can replicate these models. As in the case of decreasing status, the equilibrium price schedule continues to be given by

$$\delta(r^{\ast}) = \frac{s_{T}(L_{T}(r^{\ast}),K_{T})}{s_{N}(L_{N}(r^{\ast}),K_{N})}.$$

The key difference when social status is increasing in group size is that δ′(r) > 0 and that there may exist multiple equilibria.13 Figure 3 depicts such a scenario. Here, increasing status yields equilibria at the origin, at r*, and at r**. But only equilibria in which the price schedule, δ, intersects the “supply” curve from above are locally stable.14

Given the existence of multiple equilibria, the Roy model approach can rationalize starkly different behaviors of children in observationally similar environments. Moreover, small changes in the environment may lead to large behavioral responses. Consider, for instance, a decrease in the inputs available to the troublemaking group and assume that ∂sj(Lj, Kj)/∂Kj > 0. As shown in Figure 4, the initial decrease in KT lowers the relative status of troublemakers, and, thus, causes troublemakers close to the initial equilibrium to switch peer groups. The decrease in the size of the troublemaking group further lowers sT, which leads to an even larger outflux, and so on.
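The multiplicity and tipping logic of Case II can be sketched numerically. The Python example below uses purely hypothetical functional forms (constant nerd skill, linearly declining troublemaking skill, and status `K * (1 + L)` that rises with effective group size); the parameter `k` stands for the capital ratio KT/KN. It scans for interior crossings of δ and σ and labels a crossing stable when δ cuts σ from above.

```python
def sigma(r: float) -> float:
    """Relative skill with assumed sigma_N = 1 and sigma_T = 1.8 - 1.6 r."""
    return 1.0 / (1.8 - 1.6 * r)


def delta(r: float, k: float = 1.0) -> float:
    """Relative status s_T / s_N with assumed s_j = K_j * (1 + L_j), k = K_T / K_N."""
    L_T = 1.8 * r - 0.8 * r * r   # integral of sigma_T over [0, r]
    L_N = 1.0 - r                 # integral of sigma_N over [r, 1]
    return k * (1.0 + L_T) / (1.0 + L_N)


def gap(r: float, k: float) -> float:
    return delta(r, k) - sigma(r)


def interior_equilibria(k: float = 1.0, n: int = 1000, tol: float = 1e-10):
    """Return [(r, 'stable' | 'unstable'), ...] for interior crossings on (0, 1)."""
    found = []
    for i in range(n):
        a, b = i / n, (i + 1) / n
        g_a, g_b = gap(a, k), gap(b, k)
        if g_a * g_b < 0.0:
            # delta above sigma to the left of the crossing: cut from above.
            label = "stable" if g_a > 0.0 else "unstable"
            lo, hi = a, b
            while hi - lo > tol:
                mid = 0.5 * (lo + hi)
                if gap(mid, k) * g_a > 0.0:
                    lo = mid
                else:
                    hi = mid
            found.append((round(0.5 * (lo + hi), 6), label))
    return found
```

For these assumed forms, `interior_equilibria(1.0)` finds a low unstable crossing and a higher stable one, alongside the corner at the origin where nobody chooses troublemaking. Lowering `k` to 0.9 pulls the stable cut-off down noticeably, and at `k = 0.85` both interior crossings disappear, which mirrors the unraveling described in the text.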
Analogous to traditional models of a social multiplier, students’ behavior may become very elastic when their choices are complements. In contrast to conventional analyses, our theory allows for heterogeneity in agents’ skill endowments to drive different behavior in a common environment. This form of heterogeneity is necessary to explore the idea of self-selection in social interactions.

Figure 3. Equilibria in the market for peers when status increases with effective group size.

Figure 4. Comparative statics when status increases with effective group size.

### 2.4. Case III: Rank-Dependent Utility

Our theory can also mimic models in which the utility from engaging in a particular activity depends directly on an individual’s rank. The simplest way to incorporate the idea of rank-dependence into our Roy model is to assume that the shadow prices vary across students and are given by sr,j(r, Kj) with ∂sr,N(r, KN)/∂r > 0 and ∂sr,T(r, KT)/∂r < 0. These assumptions are sufficient (but not necessary) to ensure that there will, again, be a threshold individual, r*, such that students with r < r* select into the troublemaker group whereas those with r ≥ r* choose to become nerds.

On theoretical grounds, the key deviation from typical theories of rank-dependent utility is that r is defined in terms of relative rather than absolute skills. This modeling choice sidesteps thorny questions about which peer group a student would choose when she would (in absolute terms) be both the most skilled nerd and the most skilled troublemaker within a given environment, and it produces self-selection based on comparative advantage.
But even without assuming that shadow prices vary directly with children’s rank, our model of self-selection in social interactions predicts rank-dependence in behavior. To see this, consider, again, Figures 1 and 3. A student at the bottom of the skill distribution will be more likely to join forces with the troublemakers than an agent in the right tail of the distribution, irrespective of whether social status is increasing or decreasing in effective group size, that is, even in cases I and II above. Formally, the net utility from selecting into the nerd rather than the troublemaker group is $$\sigma _{N}\left( r\right) s_{N}^{\ast }-\sigma _{T}\left( r\right) s_{T}^{\ast }$$. Expressed relative to the troublemaker payoff $$\sigma _{T}\left( r\right) s_{T}^{\ast }$$, this gain equals $$\left( \sigma \left( r\right) / \delta \left( r^{\ast }\right) \right) -1$$, which is increasing in r. Thus, within any social market, children in the upper tail of the skill distribution (i.e., those with high r) have more to gain from joining the nerd group than those in the lower tail (i.e., those with low r). Holding the social environment fixed, one would, therefore, expect to find a positive correlation between students’ ordinal rank and their choices of whether to become a troublemaker or a nerd. In this sense, our theory can also be interpreted as providing a microfoundation for rank-dependence in behavior, which does not rely on the (tautological) assumption that the utility from engaging in a particular activity depends directly on rank.15

### 2.5. Empirical Implications

On its face, our model of social interactions is about how individuals select into peer groups. To connect the theory to commonly available datasets on student outcomes, consider some educational production function $$y_{i}=\boldsymbol {X}_{i}{^\prime }\beta +h(e_{i})+u_{i}$$, in which yi denotes student i’s test scores, Xi is a set of environmental variables, and h(ei) is a monotonically increasing function that converts effort, ei, into test scores.
As suggested by our choice of name for each group, we would expect that the same student exerts more effort on schoolwork when she chooses to become a “nerd” rather than a “troublemaker.” In symbols, $$e_{i}^{N^{\ast }}\ge e_{i}^{T^{\ast }}$$ for all i. Conversely, we would expect students to engage in more unproductive—perhaps even anti-social—behavior when they opt to join forces with the troublemakers. Under these ancillary assumptions, our theory predicts that ordinal rank will be positively associated with academic achievement in settings in which students have the possibility to divert their time and effort to outside activities. At the same time, it is important to emphasize that it is not rank per se that generates this relationship, but self-selection. So far, we have been discussing outcomes as a function of rank based on the relative skill index r. In practice, we observe prior test scores as a proxy of σN, whereas σT is unobserved. Test score-based rank will be a valid proxy for rank in terms of relative nerd and troublemaking skill whenever σN(r) ≥ σN(r΄) implies that $$\mathbb {E}\left[ \sigma \left( r\right) \right] \ge \mathbb {E}\left[ \sigma \left( r^{\prime }\right) \right]$$. This condition holds trivially if it does not take any specific skill to cause trouble, that is, σT(r) = c for all r, or if σN and σT are independently distributed. It is also satisfied if nerd and troublemaking ability are negatively or not “too positively” correlated. If correct, then students’ scholastic achievement and their proclivity to “act out” should be related to their ordinal rank in the ability distribution. Anecdotally, the phenomenon that the behavior of children varies with their relative standing has been observed among some programs for gifted minority youth held at MIT each summer. These programs attract a subset of black children who are among the best and brightest in their schools. 
At MIT, however, they interact with more academically able peers, leading them to engage in a wide range of problem behaviors (Suskind 1998). Similarly, in his memoir, Canada (1995) speculates that even the most violent youth in Boston would only be mediocre fighters in the South Bronx and, therefore, be forced to change their ways. The prediction that students’ ordinal rank matters for academic outcomes also resonates with a small literature in social psychology on “big-fish-little-pond” effects (BFLPE; see Marsh et al. 2008 for a comprehensive review). Marsh et al. (1984), for instance, conclude that “children compare their own academic ability […] with the abilities of other students within their school or their reference group, and children use this relativistic impression as one basis of forming their academic self-concept” (p. 217). Marsh (1987) even argues that BFLPE accounts for about a quarter of the impact of academic self-concept on academic performance. Broadly summarizing, the comparative advantage approach to social interactions delivers a novel, testable prediction: students who fall near the top of their reference group should display higher academic achievement than their equally able counterparts who find themselves closer to the bottom in another environment. Conversely, the latter should be more prone to behavioral problems than the former. In the following section, we demonstrate that this implication of our theory is borne out in data from a randomized controlled trial involving more than 100 primary schools in Kenya, as well as administrative data from New York City Public Schools (NYCPS) from the 2003–2004 through 2008–2009 school years. Although the (quasi-) experimental variation in these data allows us to conclude that rank affects student outcomes, with standard data alone we cannot establish the precise, causal mechanism through which the impact of rank operates. 
In order to be able to speak to rank-based self-selection into activities, we conducted an experiment that compares the behavior of students who could and could not choose between different activities. These results isolate the self-selection channel and are presented in Section 4.

## 3. Ordinal Rank and Peer Effects

The ideal data to test our theory would span multiple social markets (say, schools or classrooms) and contain information on shadow prices, individuals’ choices of peer groups as well as all of their skills. With such data in hand we could directly test whether self-selection based on comparative advantage determines behavior by comparing social status across groups and relating it to agents’ choices. However, in the absence of any information on social status, we confine ourselves to providing reduced form evidence that shows that individual behavior does depend on ordinal rank. That is, in the spirit of Friedman (1953) we test a stark prediction of our approach, one that does not follow from standard theories of social interactions.

### 3.1. Evidence from Primary Schools in Kenya

Our first piece of evidence comes from the Extra Teacher Provision (ETP) intervention by Duflo et al. (2011).16 Starting in May 2005, ETP provided 121 Kenyan primary schools that had a single first-grade class with additional funds to hire an extra teacher and create a second section. In 61 randomly selected “tracking schools,” students were assigned to sections based on scores on exams administered by the schools prior to the intervention. Students above the median were grouped in one section, and those below the median in another one. In the remaining 60 “nontracking schools,” students were randomized into sections. After assignment of students to sections, each of a school’s two sections was also randomly assigned to either a civil service teacher or to one hired on a contractual basis.
This intervention spanned 18 months.17

Table 1 displays summary statistics for the 121 schools in the sample of Duflo et al. (2011). Due to random assignment, tracking and nontracking schools look very similar on pretreatment observable characteristics. The same is true for students assigned to either the contract or government teacher section within nontracking schools. Within tracking schools, students assigned to the top section have on average about 1.6 standard deviations higher test scores (.81 versus −.81) and are almost .4 years older than their low-ability counterparts.

Table 1. Baseline school and class characteristics in the experiment of Duflo et al. (2011), by treatment group.

**All schools**

| | Nontracking schools: Mean | Nontracking schools: SD | Tracking schools: Mean | Tracking schools: SD | p-value (Tracking = Nontracking) |
|---|---|---|---|---|---|
| *School characteristics at baseline* | | | | | |
| Total enrollment | 589 | (232) | 549 | (198) | .32 |
| Number of government teachers | 11.6 | (3.3) | 11.9 | (2.8) | .62 |
| Student/teacher ratio | 37.1 | (12.2) | 35.9 | (10.1) | .56 |
| Performance on national exam (out of 400) | 255.6 | (23.6) | 258.1 | (23.4) | .57 |
| *Class size at baseline* | | | | | |
| Average class size | 91 | (37) | 89 | (33) | .76 |
| Proportion of female students | .49 | (.06) | .49 | (.05) | .54 |

**Within nontracking schools**

| | Section A (assigned to civil service teacher): Mean | Section A: SD | Section B (assigned to contract teacher): Mean | Section B: SD | p-value (Section A = Section B) |
|---|---|---|---|---|---|
| Proportion female | .49 | (.06) | .49 | (.06) | .89 |
| Average age at endline | 9.07 | (.53) | 9.00 | (.45) | .45 |
| Average standardized test score at baseline | .003 | (.10) | .002 | (.11) | .94 |
| Average SD (within section) of test scores at baseline | 1.005 | (.08) | .993 | (.08) | .43 |

**Within tracking schools**

| | Bottom section: Mean | Bottom section: SD | Top section: Mean | Top section: SD | p-value (Top = Bottom) |
|---|---|---|---|---|---|
| Proportion female | .49 | (.09) | .50 | (.08) | .38 |
| Average age at endline | 9.04 | (.59) | 9.41 | (.60) | .00 |
| Assigned to contract teacher | .53 | (.49) | .46 | (.47) | .44 |
| Respected assignment | .99 | (.02) | .99 | (.02) | .67 |
| Average standardized test score at baseline | −.81 | (.04) | .81 | (.04) | .00 |
| Average SD (within section) of test scores at baseline | .49 | (.13) | .65 | (.13) | .00 |

Notes: Table shows averages and standard deviations of selected characteristics of the 121 schools in the ETP experiment of Duflo et al. (2011). Of the 121 schools in the experiment 60 were randomly assigned to the “tracking” treatment, whereas the remaining 61 schools are classified as “nontracking”. The rightmost column displays p-values for tests of equality across groups. Source: Duflo et al. (2011).

Duflo et al. (2011) demonstrate that tracking increased the subsequent test scores of all students, regardless of their initial place in the distribution. The authors rationalize this finding with high-ability students benefitting primarily from positive spillover effects due to more able peers, whereas for students in the low-ability section the direct effect of worse peers is more than outweighed by better targeted instruction.

Given random assignment of students to classrooms, the nontracking schools in the experiment of Duflo et al. (2011) provide an ideal testing ground for the impact of ordinal rank on student outcomes. In what follows we exploit the experimentally generated variation in the within-section rank of children with equal ability to estimate the causal effect of rank on academic achievement. As Duflo et al.
(2011), we base our results on the initial random assignment of all students who attended first grade in May 2005.18 Specifically, we implement the empirical setup of Duflo et al. (2011), but add a student’s ordinal rank to the following linear model:

$$y_{i}=\varphi r_{i}+\boldsymbol {X}_{i}^{\prime }\beta +\alpha \bar{y}_{-i} +\boldsymbol {T}_{i}^{\prime }\theta +\epsilon _{i}\text{,} \tag{6}$$

where $$y_{i}$$ denotes individual $$i$$’s standardized total test score at endline, $$r_{i}$$ is her section-specific rank (i.e., her percentile in the distribution of pretreatment test scores), and $$\boldsymbol {X}_{i}$$ is a vector of individual controls including the baseline score and its square, gender, age, and so forth. $$\bar{y}_{-i}$$ represents the mean standardized baseline score of $$i$$’s peers, and $$\boldsymbol {T}_{i}$$ marks a vector of treatment indicators. In alternative specifications, we also include school or section fixed effects, which help to account for unobserved heterogeneity at the school or section level.

A conceptual issue with estimating the causal effect of rank is that a student’s rank depends, by definition, on the distribution of ability among her peers. It is, therefore, unclear how to disentangle the two without imposing parametric assumptions. For comparability with previous work, our preferred specification controls for average peer skill. Another reason to control for the mean skill level of peers is to contrast rank effects with linear-in-means models of social interactions, which assume that peer effects operate solely through average ability.19

Table 2 presents results from estimating equation (6) using ordinary least squares. To estimate the impact of rank as cleanly as possible, we restrict attention to nontracking schools, that is, to students for whom, conditional on test scores at baseline, variation in ordinal rank is purely random. For completeness, in Online Appendix Table A.2 we present results for all students with nonmissing baseline scores, including students in tracking schools.
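To make the regressors in equation (6) concrete, the following Python sketch builds a within-section percentile rank and a leave-one-out peer mean on simulated data and fits the linear model by ordinary least squares. It is an illustration only, not the authors' code: the data are synthetic, the function names (`percentile_rank`, `leave_one_out_mean`, `fit_ols`) are hypothetical, the rank convention (share of the other section members with a strictly lower baseline score) is an assumption, and the treatment indicators, additional controls, and school-clustered standard errors described in the text are omitted.

```python
import numpy as np


def percentile_rank(scores, groups):
    """Within-group percentile in [0, 1].

    Assumed convention: share of the *other* group members whose
    score is strictly lower than the student's own score.
    """
    scores = np.asarray(scores, dtype=float)
    groups = np.asarray(groups)
    rank = np.empty_like(scores)
    for g in np.unique(groups):
        idx = np.flatnonzero(groups == g)
        s = scores[idx]
        n = len(s)
        # below[i] counts peers j with s[j] < s[i]
        below = (s[:, None] > s[None, :]).sum(axis=1)
        rank[idx] = below / (n - 1) if n > 1 else 0.5
    return rank


def leave_one_out_mean(scores, groups):
    """Mean score of a student's peers, excluding the student herself.

    Every group is assumed to contain at least two students.
    """
    scores = np.asarray(scores, dtype=float)
    groups = np.asarray(groups)
    out = np.empty_like(scores)
    for g in np.unique(groups):
        idx = np.flatnonzero(groups == g)
        s = scores[idx]
        out[idx] = (s.sum() - s) / (len(s) - 1)
    return out


def fit_ols(y, regressors):
    """OLS coefficients with an intercept in the first position."""
    X = np.column_stack([np.ones(len(y))] + list(regressors))
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


# Synthetic data: students randomly assigned to sections (hypothetical sizes).
rng = np.random.default_rng(0)
n_sections, section_size = 400, 30
section = np.repeat(np.arange(n_sections), section_size)
baseline = rng.normal(size=section.size)

r = percentile_rank(baseline, section)             # r_i in equation (6)
peer_mean = leave_one_out_mean(baseline, section)  # \bar{y}_{-i}

phi_true = 0.4  # assumed rank effect used to simulate outcomes
y = (phi_true * r + 0.35 * baseline + 0.5 * peer_mean
     + rng.normal(scale=0.25, size=section.size))

coef = fit_ols(y, [r, baseline, baseline ** 2, peer_mean])
names = ["const", "rank", "baseline", "baseline_sq", "peer_mean"]
print(dict(zip(names, coef.round(3))))
```

Because rank is close to a smooth function of the baseline score, the rank coefficient is identified only from the variation in rank that remains after conditioning on the baseline score, its square, and the peer mean. With random assignment to sections, that remaining variation comes from chance differences in section composition, which is the source of identification the text relies on.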
Table 2. Estimates of the impact of rank on test scores in the control group of Duflo et al. (2011). Dependent variable: endline test score.

| | (1) | (2) | (3) | (4) | (5) | (6) | (7) |
|---|---|---|---|---|---|---|---|
| Percentile (÷100) | .418 | .641** | .649** | .472** | .462** | .639** | .449** |
| | (.269) | (.273) | (.265) | (.225) | (.230) | (.265) | (.225) |
| Test score at baseline | .374*** | .314*** | .320*** | .371*** | .365*** | .324*** | .379*** |
| | (.084) | (.086) | (.083) | (.073) | (.075) | (.083) | (.072) |
| Squared test score at baseline | .015 | .020 | .020 | .022 | .025 | .020 | .022 |
| | (.016) | (.016) | (.016) | (.015) | (.015) | (.016) | (.015) |
| Contract teacher | .142*** | .139*** | .144*** | .134*** | | .144*** | .134*** |
| | (.051) | (.048) | (.049) | (.045) | | (.048) | (.045) |
| Peers’ mean test score | | .550** | .546** | .438** | | | |
| | | (.206) | (.215) | (.205) | | | |
| Peers’ mean test score × bottom quarter | | | | | | .518 | .457 |
| | | | | | | (.322) | (.285) |
| Peers’ mean test score × second quarter | | | | | | .599 | .426 |
| | | | | | | (.364) | (.359) |
| Peers’ mean test score × third quarter | | | | | | .263 | .022 |
| | | | | | | (.353) | (.340) |
| Peers’ mean test score × top quarter | | | | | | .793* | .820** |
| | | | | | | (.408) | (.382) |
| Constant | −.309* | −.424** | −.230 | | | −.226 | |
| | (.168) | (.164) | (.250) | | | (.249) | |
| Additional controls | No | No | Yes | Yes | Yes | Yes | Yes |
| School fixed effects | No | No | No | Yes | No | No | Yes |
| Section fixed effects | No | No | No | No | Yes | No | No |
| R-squared | .240 | .243 | .254 | .390 | .413 | .255 | .391 |
| Number of observations | 2,190 | 2,190 | 2,188 | 2,188 | 2,188 | 2,188 | 2,188 |

Notes: Entries are coefficients and standard errors from estimating equation (6) using ordinary least squares. Heteroskedasticity robust standard errors are clustered at the school level and presented in parentheses. The sample consists of all students who attend nontracking schools and have nonmissing baseline test scores. Going from column (2) to (3) the number of observations decreases because some students are missing information on age and gender. “Additional controls” include age, gender, whether the school is located in the Bungoma district, and whether it was sampled for school based management. “Bottom quarter”, “second quarter”, and so forth are indicator variables for students’ own position in the test score distribution at baseline. *Significant at 10%; **Significant at 5%; ***Significant at 1%.

As reported in Duflo et al. (2011), contract teachers have a positive impact on test scores. More importantly for our purposes, there is a positive relationship between students’ ordinal rank and their academic achievement. With one exception the point estimates in the upper panel are statistically significant at conventional levels, and they are always economically large. Critically, compared to its baseline value in the first column, the estimated impact of rank on test scores actually increases with the inclusion of additional controls, such as peers’ mean test score, peers’ mean test score interacted with a student’s own position in the ability distribution, age, gender, and so forth. Moreover, the point estimate is also robust to using only within-school or within-section variation as sources of identification.
Taking the lowest estimate in the upper panel of Table 2 at face value, a 50 percentile increase in rank increases test scores at endline by about .2 standard deviations.20

The estimates in Table 2 also suggest that peers’ mean ability exerts a positive effect on student achievement. Since our theory deliberately abstracts away from peers as a source of direct externalities, it necessarily falls short of explaining this finding. Such externalities, however, are necessary to rationalize the full set of results in Duflo et al. (2011). Although rank effects can explain why students just below the median of the achievement distribution benefitted from tracking without relying on nonconvexities in teachers’ payoffs, for students just above the initial median to do better in tracked sections, it must be the case that the negative impact of decreasing rank is outweighed by a countervailing force that is outside of our theoretical model. Nonetheless, in Online Appendix D.1, we show that we obtain qualitatively similar estimates of rank effects when we rely on the full sample of Duflo et al. (2011), that is, when we also include students in tracking schools.

Broadly summarizing, the evidence from this randomized controlled trial suggests that, even conditional on standard measures of peer quality, ordinal rank has an economically meaningful impact on children’s academic achievement.

Although the intervention of Duflo et al. (2011) provides us with exogenous variation to test for an impact of rank on achievement, it does not come without drawbacks. As with any experiment, one may wonder about external validity. Moreover, the data do not contain measures of student behavior, which prevents us from probing the prediction that ordinal rank also affects problem behaviors.

### 3.2. Evidence from New York City Public Schools

To ameliorate these shortcomings, we now turn to administrative data for all students in New York City Public Schools (NYCPS), the largest school district in the United States. The NYCPS data contain student-level information on approximately 1.1 million students per year across the five boroughs of New York City. Our data span the 2003–2004 to 2008–2009 school years and include student race, gender, free and reduced-price lunch eligibility, behavior, attendance, and matriculation with course grades for all students, as well as state math and English/Language Arts (ELA) test scores for students in grades three through eight. Summary statistics for the variables we use in our core specifications are displayed in Table 3.

Table 3. Summary statistics for NYCPS data.

| | Mean | SD |
|---|---|---|
| *Behavioral indicators* | | |
| Behavioral incident in 5th grade | .083 | (.531) |
| Behavioral incident in 6th grade | .089 | (.285) |
| Behavioral incident in 8th grade | .132 | (.339) |
| *Test scores* | | |
| 5th grade test score (English/Language arts) | 661 | (36.9) |
| 5th grade test score (Math) | 668 | (41.6) |
| *Demographics* | | |
| White | .149 | (.356) |
| Black | .314 | (.464) |
| Hispanic | .394 | (.489) |
| Asian | .139 | (.346) |
| Other race | .004 | (.063) |
| Male | .507 | (.500) |
| Female | .493 | (.500) |
| Free lunch | .830 | (.376) |
| English language learner | .093 | (.290) |
| Special education | .087 | (.282) |
| *School year* | | |
| 2004/05 | .206 | (.405) |
| 2005/06 | .194 | (.395) |
| 2006/07 | .194 | (.396) |
| 2007/08 | .201 | (.401) |
| 2008/09 | .205 | (.403) |

Notes: Entries are means and standard deviations for each variable we use in the NYCPS data. For further details about the NYCPS data see the description in the Data Appendix.

To account for the possibility that low ability students may be inherently more likely to act out, our research design for these data relies on transitions from elementary to middle school, that is, from fifth to sixth grade. During this transition students typically move from small, local elementary schools to larger middle schools, which disrupts ordinal rank when the feeder schools are heterogeneous. Specifically, to estimate the impact of ordinal rank, holding students’ inherent tendency to misbehave fixed, we relate changes in the behavior of equally able children from the same elementary school to changes in rank induced by a switch to different middle schools.

We first exploit the sheer size of the NYCPS data and estimate semiparametric models, which allow us to explore potential nonlinearities in the relationship between rank and behavior. Finding little evidence of important nonlinearities, we then address the potential endogeneity of changes in rank via an instrumental variables strategy based on school zoning regulations.

In order to examine the functional relationship between rank and behavior we estimate semiparametric specifications of the following form:

$$\Delta y_{i}=f(\Delta r_{i})+\boldsymbol {X}_{i}^{\prime }\beta +\textit {School}_{i}+\textit {Year}_{i}+\epsilon _{i}\text{,} \tag{7}$$

while restricting attention to the set of students who change schools in the transition from fifth to sixth grade. Our behavioral measure, $$y_{i}$$, in each year is an indicator equal to one if a student has at least one reported behavioral incident from that year and zero otherwise. Hence, $$\Delta y_{i}\in \{-1,0,1\}$$.
The three most common behavioral incidents in our data are “engaging in an altercation or physically aggressive behavior with other student(s)”, “behaving in a manner that disrupts the educational process (horseplay),” and “engaging in verbally rude or disrespectful behavior/insubordination.”21 A student’s rank in fifth grade is the student’s percentile ranking based on her achievement on the New York State exam relative to other students who are in the same school in fifth grade.22 We also compute each student’s position relative to her peers in the sixth-grade school based on her fifth grade score, and denote the difference between these two rankings $$\Delta r_{i}$$. Results are reported using both math and ELA scores to compute the change in percentile.

We include a standard set of controls $$\boldsymbol {X}_{i}$$, consisting of the test score in the same subject from the previous year, an exhaustive set of race dummies, sex, free lunch eligibility, English Language Learner (ELL) status, and special education designation. Using these covariates we attempt to control for factors that plausibly influence changes in behavior and might be correlated with rank. $$\boldsymbol {X}_{i}$$ in equation (7) further includes the variance of peers’ test scores, and a third order polynomial in peers’ mean score. Finally, we add year fixed effects and school fixed effects (for both a student’s elementary and middle school). By controlling for school fixed effects we account for the fact that schools might have heterogeneous propensities to classify the same demeanor as a behavioral incident.

Our semiparametric estimates of the link between changes in rank and changes in behavior are displayed in Figure 5. Independent of whether we calculate rank based on ELA or math scores, the behavior of students whose rank decreases in going from elementary to middle school worsens significantly compared to students whose relative standing improves.
Taking the estimates based on ELA scores at face value, a student experiencing a 50 percentile decline in rank is approximately 2 percentage points more likely to have a behavioral incident on record than a student whose rank improves by 50 percentiles. Given sample means of 8.7% for sixth grade and 4.9% for fifth grade, our estimates are nontrivial in size.23

Figure 5. Evidence from New York City public schools. Note: Panels show semiparametric estimates and the associated 95%-confidence intervals of the effect of a change in a student’s class percentile rank (in going from elementary to middle school) on the change in an indicator variable for whether she was involved in a behavioral incident, cf. equation (7). The top panel constructs percentiles based on English/Language Arts (ELA) test scores, whereas the lower one uses math test scores. Estimates are obtained using cubic b-splines with four nodes that divided the sample equally. Section 3.2 and the Online Data Appendix provide additional information on the exact econometric specification as well as the sample.
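The construction of the change-in-rank and change-in-behavior variables in equation (7) can be sketched in a few lines of Python. The example is a toy illustration under stated assumptions, not the authors' code: the six students, their scores, and the school labels are made up, the helper names (`pct_rank_within`, `change_in_rank`) are hypothetical, percentiles are scaled to [0, 1] and defined as the share of schoolmates with a strictly lower fifth-grade score, and the change is signed as middle-school rank minus elementary-school rank.

```python
import numpy as np


def pct_rank_within(scores, schools):
    """Percentile in [0, 1] of each score among students in the same school.

    Assumed convention: share of schoolmates with a strictly lower score.
    """
    scores = np.asarray(scores, dtype=float)
    schools = np.asarray(schools)
    rank = np.empty_like(scores)
    for s in np.unique(schools):
        idx = np.flatnonzero(schools == s)
        x = scores[idx]
        below = (x[:, None] > x[None, :]).sum(axis=1)
        rank[idx] = below / (len(x) - 1) if len(x) > 1 else 0.5
    return rank


def change_in_rank(score5, elem_school, middle_school):
    """Rank among sixth-grade schoolmates minus rank among fifth-grade
    schoolmates, both computed from the same fifth-grade test score."""
    r_elem = pct_rank_within(score5, elem_school)
    r_mid = pct_rank_within(score5, middle_school)
    return r_mid - r_elem


# Toy data (hypothetical): fifth-grade scores and school assignments.
score5 = np.array([600, 650, 700, 620, 680, 710])
elem = np.array(["A", "A", "A", "B", "B", "B"])
middle = np.array(["M1", "M2", "M1", "M1", "M2", "M2"])

# Indicators for at least one behavioral incident in each grade.
incident5 = np.array([0, 0, 0, 1, 0, 0])
incident6 = np.array([0, 1, 0, 0, 0, 0])

delta_r = change_in_rank(score5, elem, middle)
delta_y = incident6 - incident5  # takes values in {-1, 0, 1}

print(delta_r)
print(delta_y)
```

In this toy example the second student falls from the middle of elementary school A to the bottom of middle school M2 and acquires an incident, while the fourth student moves up and loses one. Only the fifth-grade score enters both ranks, so the change in rank reflects the change in peer composition rather than any change in the student's own achievement.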
Although the NYCPS data allow us to control for students’ natural proclivities to cause trouble by relating changes in behavior to changes in rank induced by the transition from elementary to middle school, there exists the possibility that estimates of equation (7) are driven by reverse causality. That is, behavioral problems during sixth grade might have caused changes in test scores and, therefore, class rank. Another concern is systematic choice of school. Students who chose an academically less challenging middle school might have experienced less of an increase in behavioral problems, even if their rank had not improved.

To address these issues, we also estimate two-stage least squares (2SLS) specifications in which we instrument for a student’s change in rank with the predicted change based on the schools they were zoned to attend (given their residential address). Specifically, we estimate the following linear model:

$$\Delta y_{i}=\varphi \Delta r_{i}+\boldsymbol {X}_{i}^{\prime }\beta +\textit {School}_{i}+\textit {Year}_{i}+\epsilon _{i}\text{,} \tag{8}$$

where the first stage is given by

\begin{equation*} \Delta r_{i}=\delta \widehat{\Delta r_{i}}+\boldsymbol {X}_{i}^{\prime }\gamma +\textit {School}_{i}+\textit {Year}_{i}+\nu _{i}\text{,} \end{equation*}

and $$\widehat{\Delta r_{i}}$$ denotes student $$i$$’s counterfactual change in rank at the beginning of sixth grade (using fifth grade test scores) had all students attended the schools for which they were zoned. In symbols, let $$a_{i,t-1}$$ denote student $$i$$’s test score in fifth grade and let $${rank}_{I}\left( a_{i\text{,}t-1}\right)$$ be the percentile ranking of a student with score $$a_{i,t-1}$$ among the set of students $$I$$, given their respective test scores at $$t-1$$.
Then,

\begin{equation*} \widehat{\Delta r_{i}}\equiv {rank}_{S_{i,t}}\left( a_{i\text{,}t-1}\right) -{rank}_{S_{i,t-1}}\left( a_{i\text{,}t-1}\right) \text{,} \end{equation*}

where $$S_{i,t-1}$$ and $$S_{i,t}$$ are the sets of students who are zoned for the same elementary and middle school as $$i$$, respectively. Intuitively, our IV approach compares observationally identical students from the same elementary school who experience a differential change in rank because school zoning regulations led them to attend different middle schools.

Table 4 presents the resulting 2SLS estimates of the effect of school rank on behavior, as well as the corresponding OLS ones. In the upper panel we use ELA scores to construct rank, whereas math scores are used in the lower one. To facilitate comparisons with standard linear-in-means models, we refrain from controlling for higher order moments of the distribution of scores and simply report the coefficient on peers’ mean test score.24 Based on the OLS point estimates, one would expect a student experiencing a 50 percentile decline in rank to be 3 to 5 percentage points more likely to have a behavioral incident on record than a student whose rank improves by 50 percentiles, consistent with our previous semiparametric results.

Table 4. Estimates of the short-run impact of rank on behavior in the NYCPS data. A.
Percentile based on ELA test scores Δ Behavioral incident (Grade 5 → Grade 6) Independent variable OLS 2SLS OLS 2SLS OLS 2SLS Δ Percentile (÷100) −.030*** −.069** −.031*** −.050 −.032*** −.047 (.005) (.034) (.004) (.051) (.005) (.059) Peers’ mean test score (÷100) −.027* −.040** −.079* −.085* (.015) (.020) (.042) (.046) Individual controls Yes Yes Yes Yes Yes Yes Year fixed effects Yes Yes Yes Yes No No School fixed effects No No Yes Yes No No School–cohort fixed effects No No No No Yes Yes First stage F-statistic – 884.6 – 348.1 – 248.9 Shea’s partial R-squared – .033 – .006 – .004 R-squared .009 – .067 – .103 – Number of observations 122,792 118,699 122,792 118,699 122,792 118,699 B. Percentile Based on Math Test Scores Δ Percentile (÷100) −.047*** −.072** −.051*** −.087* −.053*** −.123** (.006) (.030) (.005) (.047) (.006) (.054) Peers’ mean test score (÷100) −.033*** −.040*** −.042 −.055 (.012) (.016) (.031) (.034) Individual controls Yes Yes Yes Yes Yes Yes Year fixed effects Yes Yes Yes Yes No No School fixed effects No No Yes Yes No No School–cohort fixed effects No No No No Yes Yes First Stage F-Statistic – 723.9 – 405.2 – 306.2 Shea’s partial R-squared – .038 – .008 – .005 R-squared .009 – .065 – .100 – Number of observations 131,294 126,924 131,294 126,924 131,294 126,924 A. Percentile based on ELA test scores Δ Behavioral incident (Grade 5 → Grade 6) Independent variable OLS 2SLS OLS 2SLS OLS 2SLS Δ Percentile (÷100) −.030*** −.069** −.031*** −.050 −.032*** −.047 (.005) (.034) (.004) (.051) (.005) (.059) Peers’ mean test score (÷100) −.027* −.040** −.079* −.085* (.015) (.020) (.042) (.046) Individual controls Yes Yes Yes Yes Yes Yes Year fixed effects Yes Yes Yes Yes No No School fixed effects No No Yes Yes No No School–cohort fixed effects No No No No Yes Yes First stage F-statistic – 884.6 – 348.1 – 248.9 Shea’s partial R-squared – .033 – .006 – .004 R-squared .009 – .067 – .103 – Number of observations 122,792 118,699 122,792 118,699 122,792 118,699 B. 
Percentile Based on Math Test Scores Δ Percentile (÷100) −.047*** −.072** −.051*** −.087* −.053*** −.123** (.006) (.030) (.005) (.047) (.006) (.054) Peers’ mean test score (÷100) −.033*** −.040*** −.042 −.055 (.012) (.016) (.031) (.034) Individual controls Yes Yes Yes Yes Yes Yes Year fixed effects Yes Yes Yes Yes No No School fixed effects No No Yes Yes No No School–cohort fixed effects No No No No Yes Yes First Stage F-Statistic – 723.9 – 405.2 – 306.2 Shea’s partial R-squared – .038 – .008 – .005 R-squared .009 – .065 – .100 – Number of observations 131,294 126,924 131,294 126,924 131,294 126,924 Notes: Entries are coefficients and standard errors from estimating the linear model in equation (8) by ordinary least squares as well as two-stage least squares. The dependent variable is listed at the top of each column. The instrument for Δ Percentile is the predicted change in percentile based on school zoning regulations, as explained in the text. The IV specifications contain fewer observations because we do not observe addresses of all students in our data. In the upper panel a student’s percentile in his school is calculated based on ELA test scores, whereas the lower panel uses math test scores. Heteroskedasticity robust standard errors are clustered on the school level and reported in parentheses. In addition to the variables shown in the table, we control for test score in the same subject from the previous year, an exhaustive set of race dummies, sex, free lunch eligibility, ELL status, and special education designation. To facilitate comparisons between Tables 4 and 5 the set of students included in the analysis has been restricted to those observed from fifth through eight grade. *Significant at 10%; **Significant at 5%; ***Significant at 1%. View Large Table 4. Estimates of the short-run impact of rank on behavior in the NYCPS data. A. 
Percentile based on ELA test scores Δ Behavioral incident (Grade 5 → Grade 6) Independent variable OLS 2SLS OLS 2SLS OLS 2SLS Δ Percentile (÷100) −.030*** −.069** −.031*** −.050 −.032*** −.047 (.005) (.034) (.004) (.051) (.005) (.059) Peers’ mean test score (÷100) −.027* −.040** −.079* −.085* (.015) (.020) (.042) (.046) Individual controls Yes Yes Yes Yes Yes Yes Year fixed effects Yes Yes Yes Yes No No School fixed effects No No Yes Yes No No School–cohort fixed effects No No No No Yes Yes First stage F-statistic – 884.6 – 348.1 – 248.9 Shea’s partial R-squared – .033 – .006 – .004 R-squared .009 – .067 – .103 – Number of observations 122,792 118,699 122,792 118,699 122,792 118,699 B. Percentile Based on Math Test Scores Δ Percentile (÷100) −.047*** −.072** −.051*** −.087* −.053*** −.123** (.006) (.030) (.005) (.047) (.006) (.054) Peers’ mean test score (÷100) −.033*** −.040*** −.042 −.055 (.012) (.016) (.031) (.034) Individual controls Yes Yes Yes Yes Yes Yes Year fixed effects Yes Yes Yes Yes No No School fixed effects No No Yes Yes No No School–cohort fixed effects No No No No Yes Yes First Stage F-Statistic – 723.9 – 405.2 – 306.2 Shea’s partial R-squared – .038 – .008 – .005 R-squared .009 – .065 – .100 – Number of observations 131,294 126,924 131,294 126,924 131,294 126,924 A. Percentile based on ELA test scores Δ Behavioral incident (Grade 5 → Grade 6) Independent variable OLS 2SLS OLS 2SLS OLS 2SLS Δ Percentile (÷100) −.030*** −.069** −.031*** −.050 −.032*** −.047 (.005) (.034) (.004) (.051) (.005) (.059) Peers’ mean test score (÷100) −.027* −.040** −.079* −.085* (.015) (.020) (.042) (.046) Individual controls Yes Yes Yes Yes Yes Yes Year fixed effects Yes Yes Yes Yes No No School fixed effects No No Yes Yes No No School–cohort fixed effects No No No No Yes Yes First stage F-statistic – 884.6 – 348.1 – 248.9 Shea’s partial R-squared – .033 – .006 – .004 R-squared .009 – .067 – .103 – Number of observations 122,792 118,699 122,792 118,699 122,792 118,699 B. 
Percentile Based on Math Test Scores Δ Percentile (÷100) −.047*** −.072** −.051*** −.087* −.053*** −.123** (.006) (.030) (.005) (.047) (.006) (.054) Peers’ mean test score (÷100) −.033*** −.040*** −.042 −.055 (.012) (.016) (.031) (.034) Individual controls Yes Yes Yes Yes Yes Yes Year fixed effects Yes Yes Yes Yes No No School fixed effects No No Yes Yes No No School–cohort fixed effects No No No No Yes Yes First Stage F-Statistic – 723.9 – 405.2 – 306.2 Shea’s partial R-squared – .038 – .008 – .005 R-squared .009 – .065 – .100 – Number of observations 131,294 126,924 131,294 126,924 131,294 126,924 Notes: Entries are coefficients and standard errors from estimating the linear model in equation (8) by ordinary least squares as well as two-stage least squares. The dependent variable is listed at the top of each column. The instrument for Δ Percentile is the predicted change in percentile based on school zoning regulations, as explained in the text. The IV specifications contain fewer observations because we do not observe addresses of all students in our data. In the upper panel a student’s percentile in his school is calculated based on ELA test scores, whereas the lower panel uses math test scores. Heteroskedasticity robust standard errors are clustered on the school level and reported in parentheses. In addition to the variables shown in the table, we control for test score in the same subject from the previous year, an exhaustive set of race dummies, sex, free lunch eligibility, ELL status, and special education designation. To facilitate comparisons between Tables 4 and 5 the set of students included in the analysis has been restricted to those observed from fifth through eight grade. *Significant at 10%; **Significant at 5%; ***Significant at 1%. View Large Due to the large number of observations, our OLS estimates are very precise. Unfortunately, this is not the case when we estimate equation (8) by 2SLS. 
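As a rough illustration of the instrument defined before Table 4, the sketch below computes each student's predicted change in rank from zoned-school assignments and then forms a 2SLS point estimate by running the two stages by hand. Everything in it is hypothetical: the function names (`percentile_rank`, `predicted_rank_change`, `two_stage_least_squares`), the toy scores and zone labels, the tie-handling rule, and the simulated parameters are assumptions made for illustration and are not taken from the NYCPS data or from the actual estimation code. Fixed effects and clustered standard errors are omitted.

```python
import numpy as np


def percentile_rank(score, peer_scores):
    """Share of peers scoring at or below `score` (the tie rule is an assumption)."""
    return float(np.mean(np.asarray(peer_scores, dtype=float) <= score))


def predicted_rank_change(scores, zoned_elem, zoned_middle):
    """Rank among middle-school zone peers minus rank among elementary zone peers,
    both evaluated at the same (fifth grade) score."""
    scores = np.asarray(scores, dtype=float)
    zoned_elem = np.asarray(zoned_elem)
    zoned_middle = np.asarray(zoned_middle)
    change = np.empty(len(scores))
    for i, a in enumerate(scores):
        in_middle_zone = zoned_middle == zoned_middle[i]
        in_elem_zone = zoned_elem == zoned_elem[i]
        change[i] = (percentile_rank(a, scores[in_middle_zone])
                     - percentile_rank(a, scores[in_elem_zone]))
    return change


def two_stage_least_squares(y, x, z, controls):
    """2SLS point estimate on x with a single instrument z (no standard errors)."""
    n = len(y)
    exog = np.column_stack([np.ones(n), controls])       # constant + controls
    first_stage = np.column_stack([z, exog])             # x on instrument + controls
    x_hat = first_stage @ np.linalg.lstsq(first_stage, x, rcond=None)[0]
    second_stage = np.column_stack([x_hat, exog])        # y on fitted x + controls
    return float(np.linalg.lstsq(second_stage, y, rcond=None)[0][0])


# Toy data (invented): six students, two elementary zones, two middle-school zones.
scores = [50, 60, 70, 55, 65, 80]
zoned_elem = ["E1", "E1", "E1", "E2", "E2", "E2"]
zoned_middle = ["M1", "M1", "M2", "M1", "M1", "M2"]
delta_hat = predicted_rank_change(scores, zoned_elem, zoned_middle)

# Simulated 2SLS check (all parameters invented): u confounds x and y, z does not.
rng = np.random.default_rng(0)
n = 20_000
z = rng.normal(size=n)          # instrument
u = rng.normal(size=n)          # unobserved confounder
c = rng.normal(size=n)          # observed control
x = 0.8 * z + u + 0.3 * c
y = -0.05 * x + 0.5 * u + 0.2 * c + rng.normal(size=n)
beta_2sls = two_stage_least_squares(y, x, z, c)
```

For the toy data, `delta_hat` works out to (−1/12, 1/12, −1/2, 1/6, 1/3, 0): the third student, for example, is at the top of her elementary zone but only at the median of her two-student middle-school zone. In the simulation, `beta_2sls` should land near the assumed coefficient of −0.05, whereas a plain OLS regression of `y` on `x` would be pulled away from it by the confounder `u`.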
Although the first stage F-statistic is well above conventional critical values (Stock and Yogo 2005), our instrument explains little residual variation in the excluded variable, as evidenced by small values of Shea’s partial $R^{2}$ (Shea 1997). One potential explanation for this is that only 45.3% (53.9%) of students attend the middle (elementary) school for which they are zoned. Nevertheless, not including school fixed effects, the 2SLS estimates are at least as large as their OLS counterparts and statistically significant. If we include school or section fixed effects the standard errors increase by as much as an order of magnitude. The coefficients, however, continue to be negative and economically large.

To help judge the magnitude of the implied effect sizes, consider a student whose rank increases by 25 percentiles as she goes from fifth to sixth grade (which corresponds to about one standard deviation in our data). Based on the estimates in Table 3, her behavior should improve by 0.7 percentage points or more. Taking the median coefficient on peers’ mean test scores at face value, to achieve an equivalent improvement her peers’ mean score would need to increase by slightly more than one standard deviation. Although such a comparison is speculative due to the variability in our estimates, it suggests that rank-based peer effects are economically important.

In order to investigate whether these effects persist beyond sixth grade, we have replicated the analysis in Table 4, focusing on the change in behavior from fifth to eighth grade instead. Table 5 presents the results. Interestingly, all point estimates are negative and economically meaningful. Six out of the eight estimates are even larger than those in the previous table. Although the 2SLS results are, again, fairly imprecise, the sum of the evidence suggests that behavioral effects from changes in ordinal rank do not dissipate over time.

Table 5. Estimates of the medium-run impact of rank on behavior in the NYCPS data.

Dependent variable: Δ Behavioral incident (Grade 5 → Grade 8)

A. Percentile based on ELA test scores

| Independent variable | OLS | 2SLS | OLS | 2SLS | OLS | 2SLS |
|---|---|---|---|---|---|---|
| Δ Percentile (÷100) | −.058*** | −.098** | −.057*** | −.148** | −.060*** | −.244*** |
| | (.006) | (.038) | (.006) | (.068) | (.006) | (.078) |
| Peers’ mean test score (÷100) | −.050*** | −.067*** | −.066 | −.102* | | |
| | (.022) | (.025) | (.048) | (.055) | | |
| Individual controls | Yes | Yes | Yes | Yes | Yes | Yes |
| Year fixed effects | Yes | Yes | Yes | Yes | No | No |
| School fixed effects | No | No | Yes | Yes | No | No |
| School–cohort fixed effects | No | No | No | No | Yes | Yes |
| First stage F-statistic | – | 884.6 | – | 348.1 | – | 248.9 |
| Shea’s partial R-squared | – | .033 | – | .006 | – | .004 |
| R-squared | .021 | – | .070 | – | .101 | – |
| Number of observations | 122,792 | 118,699 | 122,792 | 118,699 | 122,792 | 118,699 |

B. Percentile based on math test scores

| Independent variable | OLS | 2SLS | OLS | 2SLS | OLS | 2SLS |
|---|---|---|---|---|---|---|
| Δ Percentile (÷100) | −.078*** | −.055 | −.086*** | −.019 | −.090*** | −.077 |
| | (.007) | (.034) | (.006) | (.059) | (.006) | (.074) |
| Peers’ mean test score (÷100) | −.037** | −.029 | −.040 | −.022 | | |
| | (.018) | (.022) | (.039) | (.041) | | |
| Individual controls | Yes | Yes | Yes | Yes | Yes | Yes |
| Year fixed effects | Yes | Yes | Yes | Yes | No | No |
| School fixed effects | No | No | Yes | Yes | No | No |
| School–cohort fixed effects | No | No | No | No | Yes | Yes |
| First stage F-statistic | – | 723.9 | – | 405.2 | – | 306.2 |
| Shea’s partial R-squared | – | .038 | – | .008 | – | .005 |
| R-squared | .021 | – | .069 | – | .101 | – |
| Number of observations | 131,294 | 126,924 | 131,294 | 126,924 | 131,294 | 126,924 |

Notes: Entries are coefficients and standard errors from estimating the linear model in equation (8) by ordinary least squares as well as two-stage least squares. The dependent variable is listed at the top of each column. The instrument for Δ Percentile is the predicted change in percentile based on school zoning regulations, as explained in the text. The IV specifications contain fewer observations because we do not observe addresses of all students in our data. In the upper panel a student’s percentile in his school is calculated based on ELA test scores, whereas the lower panel uses math test scores. Heteroskedasticity robust standard errors are clustered on the school level and reported in parentheses. In addition to the variables shown in the table, we control for test score in the same subject from the previous year, an exhaustive set of race dummies, sex, free lunch eligibility, ELL status, and special education designation. To facilitate comparisons between Tables 4 and 5, the set of students included in the analysis has been restricted to those observed from fifth through eighth grade. *Significant at 10%; **Significant at 5%; ***Significant at 1%.

In Online Appendix Tables A.3 and A.4, we examine how children who do and do not comply with school zoning regulations differ from each other on predetermined variables. Compliers have a lower likelihood of behavioral problems in fifth grade, are more likely to be white or Asian, are less likely to be enrolled in special education classes, and have slightly higher socioeconomic status (as proxied by our free lunch indicator). Importantly, a student’s “predicted change in rank” (i.e., our instrument) is statistically indistinguishable across the two groups, as are fifth grade test scores. Furthermore, behavioral incidents in fifth grade are, conditional on the set of covariates in equation (8), uncorrelated with “predicted change in rank.” This is true for the set of all children as well as within the set of compliers. For our IV results to be driven by violations of the exclusion restriction, it would have to be the case that students who would have experienced worsening behavior regardless of their change in rank are systematically zoned for schools in which they experience larger declines in relative standing than children whose behavior improved. In light of the absence of selection on initial behavior and test scores, there is little evidence to suggest that this may be the case, although we hasten to point out that the validity of an instrument can never be fully established.
The evidence in Online Appendix Tables A.3 and A.4 does, however, suggest that identification of the 2SLS local average treatment effect comes from higher socioeconomic status children who are initially better behaved than average. If one believes that children who are initially well-behaved and from wealthier backgrounds are less affected by their peers, then there is reason to think that the point estimates above understate the average treatment effect.

3.3. Discussion

Taken together, the findings in this section suggest that students’ test scores decrease and problem behaviors worsen as their relative standing declines. These results are noteworthy not only because they point to a hitherto underexplored source of peer effects, but also because they come from very different, independent settings. The fact that we find an effect of ordinal rank for primary school children in Kenya as well as for middle school students in the United States suggests a more general phenomenon.

Although the data are consistent with the idea that comparative advantage shapes social interactions, they cannot rule out other mechanisms. For instance, another explanation for our findings is that teacher behavior depends on the entire distribution of student ability. Suppose that a student’s perceived ability matters for how much teachers invest in her. If teachers invest more in students who are thought to be smarter, and if ordinal rank serves as a (noisy) signal about ability, then a teacher-focused explanation is consistent with the finding that test scores increase with adolescents’ rank. If lack of teacher attention causes students to act out, then such an explanation can also rationalize why problem behaviors worsen as students’ rank declines. The data requirements to test this alternative hypothesis are very demanding.
To implement a convincing test, we would need objective information on how much effort and attention teachers place on students at every part of the ability distribution. We are unaware of such data.25 Moreover, there may be additional theories that predict a relationship between rank and student outcomes, and we do not have a principled way to narrow down the set of plausible mechanisms.

4. An Experiment to Test the Self-Selection Mechanism

Rather than trying to estimate the precise share of the relationship between rank and outcomes that is attributable to a particular mechanism, we content ourselves with a “framed” field experiment that achieves two related goals: (i) rule out that the teacher channel is the sole driver of rank effects, and (ii) explicitly test the idea of self-selection based on comparative advantage.

To this end, we recruited nearly six hundred students from two open-enrollment public middle schools in the Houston Independent School District and incentivized them to solve mazes in a custom-made computer game. According to our conversations with principals and teachers, children at these schools have extensive experience playing simple video games on their phones or even on the schools’ computers. Our decision to embed the experiment in the context of a game reflects the desire to replicate as many as possible of the myriad ways in which situational variables may affect students’ behavior, all the while maintaining enough control to shed light on the mechanism through which rank affects outcomes. In light of our second goal, we settled on an experimental design that randomly offered students the opportunity to engage in both constructive and destructive behavior, or only the former.
Relative to the situation in which the “choice mechanism” is shut down, our theory of self-selection in social interactions predicts that the opportunity to sabotage others crowds out constructive behavior to a greater extent among low-ranked individuals than among higher-ranked ones.

4.1. Design Specifics

Each student participated in one experimental session, which was held in her school’s computer lab (see Online Appendix F for details regarding recruitment, parental consent, implementation logistics, and so forth, and for a copy of the experimental instructions). Sessions lasted about sixty minutes and included, on average, twenty children from the same school.

In the first stage of the game, students were asked to solve either five or twenty mazes, depending on the experimental session.26 “Solving” a computerized maze entailed using the arrow keys to steer a cursor from the entrance of the maze to its exit (see Figure 6 for screenshots). Children earned $0.25 per maze that they successfully completed in this stage. All students worked on the same set of mazes and were ranked (among participants in the same session) according to the time it took them to complete the task. The ordinal ranking as well as children’s cardinal performance was then displayed on everyone’s screens in order to make both common knowledge, similar to the scoreboard feature in many of the most popular video games.

In the second stage, children in the control group were given the opportunity to practice on up to 20 additional mazes at a fixed, randomly determined cost per maze. Students were instructed that ten of these mazes would reappear in the third stage of the game. Before the software determined the cost of practicing, students were asked for their maximal willingness to pay to see and work on a maze.
Children whose willingness to pay exceeded the cost per maze were allowed to practice on as many mazes as they wished, paying for each one as they went along.27 Participants whose willingness to pay did not exceed the session specific cost were not allowed to practice at all. Our procedure for eliciting students’ willingness to pay thus resembles the well-known BDM mechanism (Becker, DeGroot, and Marschak 1964), with the important difference that quantity was not fixed at one. Instead, we allowed students to choose quantity knowing the realized price per maze.28

Figure 6. Sample screenshots from the maze-solving experiment.

Children in the treatment group were also allowed to practice, and the software elicited their willingness to pay for practicing in the same way. In addition, they were asked how much they would be willing to spend in order to “slime” the screen of a peer of their choosing. Sliming another child’s screen carried no monetary benefit, but it prevented the other student from practicing by blocking a portion of the maze on which she was working (cf. Figure 6). Both activities were nonrival in the sense that children could practice and slime other participants at the same time. However, students were only allowed to engage in a particular activity if their stated willingness to pay exceeded the respective, randomly determined price. If allowed to slime, students could do so as often as they were willing to incur the cost. A ticker publicly displayed who slimed whom in real time.

An important advantage of this design over an alternative one with fixed prices that are set in advance is that it requires no knowledge of students’ approximate willingness to pay.
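The elicitation rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the names `ActivityOutcome` and `resolve_activity` are hypothetical, amounts are expressed in integer cents for convenience, and "exceeded" is read as a strict inequality. The experimental software itself is not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ActivityOutcome:
    allowed: bool      # whether the student may engage in the activity at all
    units: int         # mazes practiced, or number of times a peer was slimed
    spent_cents: int   # total amount paid at the realized price


def resolve_activity(stated_wtp_cents: int,
                     price_cents: int,
                     desired_units: int,
                     max_units: Optional[int] = None) -> ActivityOutcome:
    """Access is granted only if the stated willingness to pay strictly exceeds
    the randomly drawn price; the student then chooses quantity at that price."""
    if stated_wtp_cents <= price_cents:
        return ActivityOutcome(False, 0, 0)
    units = desired_units if max_units is None else min(desired_units, max_units)
    return ActivityOutcome(True, units, units * price_cents)


# Practicing is capped at 20 mazes; sliming has no cap in this sketch.
practice = resolve_activity(stated_wtp_cents=30, price_cents=20,
                            desired_units=25, max_units=20)
slime = resolve_activity(stated_wtp_cents=10, price_cents=15, desired_units=3)
```

In this example the student may practice (30 exceeds 20) and pays for the 20 mazes allowed, but may not slime (10 does not exceed 15). Because the price is drawn independently of the stated amount, the stated amount only determines whether access is granted, not the price paid per unit.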
By explicitly eliciting students’ otherwise unobserved willingness to pay, we can directly assess how the desirability of practicing changes once students can engage in an alternative, disruptive activity, and how this effect varies with rank.

In the final stage of the game, all children were asked to complete ten mazes in twenty minutes, for a payoff of $3.00 per maze they successfully solved. On average, students earned a total of $29.30, including a $2 show-up fee. Neither students’ earnings nor their performance during the last stage of the game were made common knowledge. It is also worth noting that at no point during the experiment did monetary payoffs depend on ordinal rank, and that students were made aware of the payoff structure.

Administrative constraints imposed by the schools prevented us from randomizing students into experimental sessions. Within an experimental session, however, all students were randomly assigned to either the treatment or the control group. Thus, whether the children were faced with the opportunity to disrupt their peers was purely random. By comparing the relationship between ordinal rank and students’ willingness to practice across treatment and control, our experimental design permits us to assess how the opportunity to self-select into a second, disruptive activity causes behavior to change among different sets of students, that is, lower- versus higher-ranked ones.

It is important to emphasize that our theory of self-selection based on comparative advantage makes no prediction regarding the relationship between rank and willingness to practice in the control group. Students in the control group can engage in only one activity, which leaves no room for self-selection to affect behavior. Any correlation between these students’ choices and their ordinal rank must, therefore, be due to other, unmodeled factors (say, intrinsic preferences over rank, or varying marginal returns to practicing).
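The treatment-control comparison described above amounts to a difference in slopes, which can be sketched as an OLS regression with a rank-by-treatment interaction and session dummies. The example below runs on simulated data only: the sample size, the slopes `phi_true` and `gamma_true`, the noise levels, and the inclusion of a treatment main effect are all assumptions of this sketch, and it does not compute clustered standard errors.

```python
import numpy as np

rng = np.random.default_rng(1)
n_sessions, per_session = 30, 200
n = n_sessions * per_session

session = np.repeat(np.arange(n_sessions), per_session)
rank = rng.uniform(size=n)              # percentile rank in [0, 1]
treated = rng.integers(0, 2, size=n)    # random assignment within session

phi_true, gamma_true = -0.10, 0.08      # invented slopes, not estimates
session_effect = rng.normal(scale=0.05, size=n_sessions)[session]
wtp = (0.25 + phi_true * rank + gamma_true * treated * rank
       + session_effect + rng.normal(scale=0.10, size=n))

# Full set of session dummies (no separate constant) absorbs session-level prices.
session_dummies = (session[:, None] == np.arange(n_sessions)).astype(float)
X = np.column_stack([rank, treated * rank, treated, session_dummies])
coef = np.linalg.lstsq(X, wtp, rcond=None)[0]

phi_hat = coef[0]     # slope of willingness to pay on rank in the control group
gamma_hat = coef[1]   # change in that slope when the disruptive option exists
```

Here `phi_hat` is the control-group slope, which the theory leaves unrestricted, and `gamma_hat` is the change in that slope once sliming is available, which is the quantity the treatment-control comparison isolates.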
The control group is nonetheless useful because it allows us to establish a baseline correlation between rank and willingness to practice. If the self-selection mechanism mediates social interactions, then, when given a chance to be disruptive, lower-ranked students, who have a relative disadvantage at solving mazes, should be disproportionately likely to substitute away from practicing.

4.2. Experimental Results

Table 6 presents descriptive statistics for the children in our experiment, by treatment and control status. With the exception of grade level, students in the treatment and control group are statistically indistinguishable. Although the difference in grade level is economically small, it is statistically significant at the 10% level. A Kolmogorov–Smirnov test is unable to reject the null hypothesis that the p-values in the rightmost column are uniformly distributed on the unit interval (p = 0.305), as one would expect under truly random assignment. We nevertheless address the issue of imbalance by presenting results that do and do not condition on covariates. If anything, our findings become stronger when we control for observables. In addition, we show in Online Appendix Table A.5 that the results are qualitatively robust to conducting our analysis within each grade level. Although estimates disaggregated by grade are far less precise, out of the thirty coefficients in Online Appendix Table A.5 only two change sign compared to our main analysis. Therefore, it seems unlikely that imperfect randomization is driving our results.

Table 6. Observable characteristics of students in our framed field experiment, by treatment status.

| | All: Mean | All: SD | Treatment: Mean | Treatment: SD | Control: Mean | Control: SD | p-value (Treatment = Control) |
|---|---|---|---|---|---|---|---|
| Male | .538 | (.499) | .537 | (.499) | .538 | (.499) | .971 |
| Minority | .873 | (.333) | .866 | (.341) | .882 | (.324) | .843 |
| Grade | 6.89 | (.762) | 6.97 | (.715) | 6.80 | (.803) | .077 |
| Special education | .134 | (.341) | .134 | (.341) | .134 | (.341) | .988 |
| Limited English proficiency | .334 | (.472) | .339 | (.474) | .328 | (.470) | .864 |
| Missing demographic information | .023 | (.149) | .033 | (.180) | .013 | (.115) | .103 |
| Self-assessed ability (pre-period; scale 1–10) | 6.08 | (1.83) | 6.19 | (1.79) | 5.96 | (1.87) | .246 |
| Baseline performance (seconds per completed maze) | 27.1 | (15.8) | 26.2 | (18.5) | 28.0 | (12.1) | .342 |
| Number of students | 573 | | 302 | | 271 | | |

Notes: Table shows basic descriptive statistics for students who participated in our framed field experiment, by treatment status. The rightmost column displays p-values for tests of equality in means across the treatment and control groups. A Kolmogorov–Smirnov test is unable to reject the null hypothesis that the p-values in the rightmost column are uniformly distributed on the unit interval (p = 0.305). For additional information on the experiment see the main text. The Data Appendix provides precise definitions of all variables.

Pooling over all 573 students who completed the experiment, Table 7 displays our findings.
The numbers in Table 7 correspond to the coefficients on rank ($r_{i,s}$) and rank interacted with a treatment indicator ($T_s$) in the following econometric model:

$$y_{i,s}=\varphi r_{i,s}+\gamma T_{s}\times r_{i,s}+\alpha b_{i}+\boldsymbol{X}_{i}^{\prime}\beta+\mu_{s}+\varepsilon_{i,s},$$ (9)

where $y_{i,s}$ is the outcome of interest for student $i$ in experimental session $s$, $b_i$ is her baseline performance at solving mazes, and $\boldsymbol{X}_i$ denotes a vector of controls consisting of all covariates listed in Table 6. Since the costs of practicing and sliming vary at the level of the experimental session, we also include $\mu_s$, a session fixed effect. To allow for arbitrary forms of correlation in the residuals of children within the same experimental session, all standard errors are clustered at the session level. Given the small number of clusters, we follow the bootstrapping procedure recommended by Cameron, Gelbach, and Miller (2008) whenever we report p-values for hypothesis tests.

Table 7. Experimental results.
| | Willingness to pay for practicing (1) | Willingness to pay for practicing (2) | Total money spent on practicing (3) | Total money spent on practicing (4) | Total money spent on sliming (5) | Total money spent on sliming (6) |
|---|---|---|---|---|---|---|
| Percentile (÷100) | −.089** | −.103*** | −.061 | −.081 | | |
| | (.035) | (.037) | (.074) | (.073) | | |
| Percentile (÷100) × treatment | .075* | .089** | .143 | .165* | −.371** | −.390*** |
| | (.041) | (.042) | (.092) | (.092) | (.234) | (.165) |
| H0: coefficient on percentile = 0 (p-value) | .022 | .009 | .412 | .289 | | |
| H0: coefficient on percentile × treatment = 0 (p-value) | .078 | .045 | .142 | .090 | .035 | .004 |
| Controls | No | Yes | No | Yes | No | Yes |
| Experimental session fixed effects | Yes | Yes | Yes | Yes | Yes | Yes |
| Sample | Treatment and control | Treatment and control | Treatment and control | Treatment and control | Treatment | Treatment |
| Mean of dependent variable | .201 | .201 | .204 | .204 | .137 | .137 |
| R-squared | .138 | .155 | .160 | .172 | .110 | .142 |
| Number of observations | 573 | 573 | 573 | 573 | 302 | 302 |

Notes: Entries are coefficients and standard errors from estimating the linear model in equation (9) by ordinary least squares. The dependent variables are listed at the top of each column. All specifications control for baseline performance and experimental session fixed effects. Additional controls include gender, grade, a minority indicator, special education status, limited English proficiency, self-assessed ability, and indicator variables for missing demographic information. Heteroskedasticity robust standard errors are clustered by experimental session and reported in parentheses. To account for the small number of clusters, reported p-values are based on the wild bootstrap procedure suggested by Cameron et al. (2008) with 10,000 iterations. *Significant at 10%; **Significant at 5%; ***Significant at 1%.

The results in columns (1) and (2) show that, among children in the control group, willingness to pay for practicing is negatively correlated with ordinal rank.29 That is, conditional on baseline performance, a student at the very bottom of her reference group is willing to spend 10 cents more on practicing a maze than an observationally similar one who ranks at the top of the distribution because she happened to be paired with less able peers. Given a sample mean of 20 cents, this disparity is economically large and statistically significant (p = 0.009). Interestingly, the correlation between ordinal rank and willingness to practice disappears among students in the treatment group. The difference in the slope estimates between the two groups is not only statistically significant (p = 0.045), but the coefficient on the interaction term is of opposite sign and almost as large as that on rank itself.
Hence, when children are given the choice between practicing and disrupting their peers, it is no longer the case that lower-ranked students invest more than higher-ranked ones.

The next two columns show that this conclusion is qualitatively robust to examining actual money spent on practicing rather than self-declared willingness to pay. Although the point estimates lose much of their precision (in large part because most students practice on only one or two mazes), the sign pattern of the coefficients is identical to that in columns (1) and (2), and the coefficient on the interaction term continues to be economically large and marginally significant (p = 0.090).

The remaining two columns demonstrate that ordinal rank is negatively correlated with how much money children in the treatment group spent on disrupting others. Taking the point estimate in column (6) at face value, a student who is paired with more able peers and, therefore, ranks at the bottom of her peer group spends 39 cents more on sliming than an identical child who happens to be at the very top of her reference distribution (p = 0.004). Compared to a sample mean of 14 cents, the effect of rank on sliming is very large.

Given that children were randomly assigned to either treatment or control, we conclude that the opportunity to engage in a second, disruptive activity caused lower-ranked students to substitute away from investing. Instead, they paid to engage in socially wasteful behavior. Note that in the control group, where students can only choose between practicing and waiting, there is no room for self-selection based on comparative advantage to affect behavior. But in the treatment group, where children are faced with the choice between two very different activities, students with a relative disadvantage at solving mazes opted out of the very activity in which they did poorly compared to their peers.
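The inference behind the reported p-values (session fixed effects, session-clustered errors, and a wild bootstrap for the small number of clusters) can be sketched as follows. This is a simplified illustration, not the authors' code: it keeps only rank as a regressor, omits baseline performance and the other controls, applies no small-sample correction, uses Rademacher weights with the null imposed (one common variant of the wild cluster bootstrap), and runs on synthetic data. All function and variable names are hypothetical.

```python
import random
from collections import defaultdict


def demean_within(values, sessions):
    """Subtract each session's mean; this absorbs the session fixed effect."""
    total, count = defaultdict(float), defaultdict(int)
    for v, s in zip(values, sessions):
        total[s] += v
        count[s] += 1
    return [v - total[s] / count[s] for v, s in zip(values, sessions)]


def slope_and_t(x, y, sessions):
    """OLS slope of y on x (both demeaned) and its session-clustered t-statistic."""
    sxx = sum(a * a for a in x)
    beta = sum(a * b for a, b in zip(x, y)) / sxx
    score = defaultdict(float)  # sum of x * residual within each session
    for a, b, s in zip(x, y, sessions):
        score[s] += a * (b - beta * a)
    se = sum(v * v for v in score.values()) ** 0.5 / sxx
    return beta, beta / se


def wild_cluster_pvalue(rank, outcome, sessions, reps=999, seed=0):
    """Wild cluster bootstrap-t p-value for H0: slope on rank = 0 (null imposed)."""
    rng = random.Random(seed)
    x = demean_within(rank, sessions)
    u = demean_within(outcome, sessions)  # residuals of the restricted model
    beta, t_obs = slope_and_t(x, u, sessions)
    ids = sorted(set(sessions))
    hits = 0
    for _ in range(reps):
        sign = {s: rng.choice((-1.0, 1.0)) for s in ids}  # Rademacher weights
        y_star = [sign[s] * e for e, s in zip(u, sessions)]
        _, t_star = slope_and_t(x, y_star, sessions)
        hits += abs(t_star) >= abs(t_obs)
    return beta, hits / reps


# Synthetic data (not the experimental data): 12 sessions of 25 students each.
rng = random.Random(1)
sessions, rank, spent = [], [], []
for s in range(12):
    session_effect = rng.gauss(0.0, 0.1)
    for _ in range(25):
        r = rng.random()
        sessions.append(s)
        rank.append(r)
        spent.append(0.3 - 0.4 * r + session_effect + rng.gauss(0.0, 0.2))

beta_hat, p_value = wild_cluster_pvalue(rank, spent, sessions)
print(f"slope on rank = {beta_hat:.3f}, wild-bootstrap p = {p_value:.3f}")
```

Flipping the sign of all restricted residuals within a session at once preserves whatever correlation exists among students in that session, which is why the resampling is done at the session level rather than student by student.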
Our experimental results, therefore, suggest that students’ behavior is mediated by self-selection according to their relative standing.30

5. Concluding Remarks

Drawing on traditional models of selection in the labor market, we propose a theory of social interactions based on self-selection and comparative advantage. When self-selection is the guiding principle of peer group formation, an individual’s behavior is an equilibrium outcome. It depends on where in the ability distribution she falls, and on the shadow prices that clear the social market. That is, in our model, selection into peer groups is based on comparative advantage, and peer effects arise due to the endogenous sorting of agents into peer groups within a social setting.

An important implication that distinguishes our theory from traditional models of peer effects is that student outcomes should depend on ordinal rank within a social environment. Our empirical findings show that this key prediction is borne out in one randomized controlled trial in Kenya as well as administrative data from the United States. To further probe the channel through which ordinal rank affects behavior, we implemented a “framed” field experiment with nearly 600 public school students in Houston. By isolating the choice mechanism, our experimental evidence speaks directly to the idea of self-selection in social interactions.

Since our Roy model does not cast peers as a source of direct (positive or negative) externalities, it has the potential to rationalize many of the disparate findings in the empirical literature within a single, tractable framework. In particular, since individuals’ ordinal rank may deteriorate as a result of many well-intentioned interventions, our theory provides a simple explanation for why ostensibly better peers do not always lead to more favorable outcomes (see, e.g., Carrell et al. 2013; Kling, Ludwig, and Katz 2005; Kling, Liebman, and Katz 2007; Sanbonmatsu et al. 2006).
For examples of how a Roy model of social interactions can reconcile much of the existing empirical evidence, we refer interested readers to the Online Appendix. In the Appendix, we also discuss the implications of self-selection based on comparative advantage for identifying “traditional” peer effects, and for predicting the efficacy of social interventions ex ante.

1 See Epple and Romano (2011) and Ioannides (2011) for recent surveys.

2 Many ethnographers describe similar phenomena around the globe: the Buraku Outcastes of Japan (De Vos and Wagatsuma 1966), Blacks in America (Fordham and Ogbu 1986), the Maori of New Zealand (Chapple, Jefferies, and Walker 1997), Blacks on Chicago’s South Side circa 1930 (Drake and Cayton 1945), and the working class in Britain (Willis 1977), among others.

3 For confirmatory evidence, see Booij, Leuven, and Oosterbeek (2017), Bursztyn and Jensen (2015), Carrell, Fullerton, and West (2009), Cooley-Fruehwirth (2013), Duflo, Dupas, and Kremer (2011), Feld and Zolitz (2017), Goux and Maurin (2007), Hanushek et al. (2003), Hoxby (2003), Hoxby and Weingarth (2005), Imberman, Kugler, and Sacerdote (2012), Mas and Moretti (2009), or Sacerdote (2001). For null or negative findings, see Angrist and Lang (2004), Cullen, Jacob, and Levitt (2006), Kang (2007), Guryan, Kroft, and Notowidigdo (2009), or Sanbonmatsu et al. (2006).

4 Throughout the paper, ordinal rank will be used to refer to a student’s percentile in a group.

5 Following our working paper in 2011 (Cicala, Fryer, and Spenkuch 2011), others have also documented a relationship between ordinal rank and student outcomes. Murphy and Weinhardt (2014) use administrative data on students in the United Kingdom to show that rank in primary school correlates with their secondary school achievement. Their empirical approach mirrors the one we take with the NYCPS data, and their results replicate ours, at least qualitatively. Elsner and Isphording (2017) use data from the National Longitudinal Study of Adolescent Health (AddHealth) and exploit within-school differences in the ability distribution of cohorts. The results show that ordinal rank is positively associated with high school completion and negatively with problem behaviors. Tincani (2014, 2015) explores the effect of rank when students intrinsically care about their ordinal ranking, that is, when rank directly enters the utility function.

6 In Online Appendix D, we replicate our finding that rank affects student behavior with data from the National Educational Longitudinal Study (NELS). NELS allows us to relate the same student’s behavior in different classrooms to a proxy for her course-specific rank. We show that a 50 percentile decline in rank across classes is associated with a nearly 10 percentage point increase in the probability that the teacher reports behavioral problems in the course for which she has the lower rank (relative to a base of 40%).

7 Our findings are also related to an emerging literature on the economic effects of relative incomes. Luttmer (2005) and Card et al. (2012), for instance, demonstrate that own well-being and satisfaction depend negatively on the earnings of neighbors and coworkers. Bertrand, Pan, and Kamenica (2015) show that spouses’ relative incomes affect marriage formation and the division of household production. Charles, Hurst, and Roussanov (2009) argue that conspicuous consumption serves as a costly signal of economic position. Lastly, Kuziemko et al. (2014) provide experimental evidence to suggest that people are “last-place averse” and that low-income individuals oppose redistribution because it disproportionately benefits those ranking just below them.

8 For instance, to implement a convincing test of the teacher channel, we would need objective information on how much effort and attention teachers place on students at every part of the ability distribution. We are unaware of such data. In Online Appendix D, we report results from a partial test of the hypothesis that teacher behavior varies with students’ rank. Specifically, we test whether teachers’ perception of their students’ ability depends on rank. Conditional on actual test scores, we find no evidence that this is the case.

9 Harrison and List (2004) define a “framed” field experiment as a laboratory experiment with a nonstandard subject pool and field context in either the commodity, task, or information set that the subjects can use.

10 There are six appendices. Online Appendix A considers the implications of comparative advantage for predicting the efficacy of social interventions ex ante. Online Appendix B illustrates how a Roy model of social interactions can explain many of the disparate findings in the empirical literature on peer effects. Online Appendix C discusses identification of traditional peer effects in the presence of self-selection based on comparative advantage in social interactions. Online Appendix D presents additional evidence omitted from the main text. In Appendices E and F we describe the data used in our analysis and provide further details regarding the implementation of our experiment. All appendices are provided on the authors’ websites.

11 In Cicala et al. (2011), we also extend the basic model to allow for many groups and n-dimensional skill (Heckman and Scheinkman 1987) as well as hierarchies (Rosen 1982), and show that the basic results of our model hold when the sectoral choice problem is cast in a general social multiplier framework (Becker and Murphy 2000; Glaeser, Sacerdote, and Scheinkman 2003).

12 Note that the effective size of the nerd group is integrated on $[r^*, 1]$, so that, for any $r^*$, the shift in $\delta$ will be larger the more concentrated a given increase in the distribution of $\sigma_N$ is in the upper end of the distribution.
13 When social status is independent of group size, that is, $\partial s_j(L_j, K_j)/\partial L_j = 0$, the horizontal price schedule intersects the “supply” curve exactly once, leading to a single equilibrium with traditional comparative statics.

14 To see this, consider the adjustment process following a small shock to “prices”. From the initial equilibrium at $r^{**}$, a small decrease in relative prices (along the $\delta$-schedule) will lead to agents flowing out of the troublemaking and into the nerd sector, which will cause relative status to decline further and lead to even more agents switching sectors. The process continues until the market reaches a new equilibrium at the origin. Conversely, a small increase in relative prices (along the $\delta$-schedule) will lead to agents flowing into the troublemaking sector. This causes relative status to increase even more, thereby inducing more nerds to become troublemakers until the market reaches equilibrium at $r^*$. Similar reasoning shows that the equilibrium at $r^*$ is stable.

15 In a generalized Roy model group membership itself may be costly. If the costs of membership do not systematically covary with students’ rank (i.e., if it is equally costly for all children to join a particular group), then the prediction above trivially carries over to the more general setting. If the difference in cost between joining the troublemaking and nerd groups increases with students’ rank (i.e., if it becomes relatively more costly for higher rank students to become troublemakers), then the key prediction of our simplified model holds as well. Only if membership is differentially costly and if it becomes relatively cheaper for students to join the troublemakers when their rank increases would we expect to see a different pattern in the data.

16 Since a full description of the experiment is available in the aforementioned paper, we restate only the intervention’s most salient features here and refer the interested reader to Duflo et al. (2011) for additional details.

17 Across five unannounced visits to each school, both sections were found to be combined 14.4% of the time in nontracking schools and 9.7% of the time in tracking schools. When sections were not combined, 92% of students in nontracking schools and 96% of students in tracking schools respected their initial assignment.

18 About 21% of students in tracking schools and 23% of those in nontracking schools repeated first grade and participated in the program for only the first year.

19 Online Appendix Table A.1 demonstrates that our results are robust to controlling for higher order polynomials of peers’ mean test score, as well as different moments of the skill distribution, such as the variance of students’ ability.

20 The data of Duflo et al. (2011) also contain component test scores for math and literacy. Our results are qualitatively very similar when using these instead of total test scores, but we note that the impact of rank appears to be stronger for math scores.

21 We have also investigated the relationship between changes in test scores and changes in rank in the NYCPS data, finding qualitatively similar results as in the previous section.

22 The state math and ELA tests are high-stakes exams conducted in the winters of third through eighth grade. For additional information on these tests, see the Online Data Appendix.

23 Interestingly, Figure 5 shows that the relationship between rank and behavior is almost linear, except at the extremes, where there are fewer data to deliver precise estimates. This suggests that simple linear models may provide decent approximations to the true functional relationship.

24 Reassuringly, our results are robust to controlling for various moments of the distribution of test scores.

25 In Online Appendix D, we report results from a partial test of the teacher behavior hypothesis. Specifically, we test whether teacher perception of student ability depends on ordinal rank. Conditional on actual test scores, we find no evidence that this is the case.

26 When examining the data from sessions in which the first stage involved a different number of mazes, we found no differences in student behavior. We, therefore, pool these data in the analysis below.

27 Students were told that they could “go into debt” during this stage of the experiment, that is, that they could spend more money than they had earned during the previous stage. Any extra spending would be subtracted from their earnings in the third stage. At the end of the experiment, no child ended up with negative earnings.

28 We chose not to elicit an entire demand curve in order to reduce complexity and simplify the instructions.

29 As we argue above, for a large class of skill distributions, a ranking of students based on one skill will be a good proxy for one in terms of (unobserved) relative skills. If sliming does not involve specific skills, then a ranking solely based on the ability to solve mazes will coincide exactly with one in terms of relative ability. To see this, note that, if sliming does not take skill, then we can normalize $\sigma_T(r) = 1$ for all individuals, which in turn implies that $\sigma(r) \equiv \sigma_N(r)/\sigma_T(r) = \sigma_N(r)$.

30 An alternative, a priori plausible explanation of our findings might be that students are inequality averse. If inequality aversion were the reason that low-ranked children sabotaged others, then we would expect there to be a positive relationship between rank at baseline and the number of times a student was slimed herself. That is, in order to reduce inequality, low-ranked students should disproportionately target higher-ranked ones rather than their low-ranked peers. We do not observe such a pattern in the data. Estimating the regression models in columns (5) and (6) of Table 7 with an indicator for whether a student’s screen got slimed as the outcome produces point estimates of .041 and −.009 (with standard errors of .073 and .069, respectively), relative to a mean of .48. An explanation based on inequality aversion is, therefore, at odds with this particular moment of the data.

References

Akerlof G. A. (1997). “Social Distance and Social Decisions.” Econometrica, 65, 1005–1028.

Angrist J. D., Lang K. (2004). “Does School Integration Generate Peer Effects? Evidence from Boston’s Metco Program.” American Economic Review, 94(5), 1613–1634.

Austen-Smith D., Fryer R. G. (2005). “An Economic Analysis of Acting White.” Quarterly Journal of Economics, 120, 551–583.

Bala V., Goyal S. (2000). “A Noncooperative Model of Network Formation.” Econometrica, 68, 1181–1229.

Becker G. M., DeGroot M. H., Marschak J. (1964). “Measuring Utility by a Single-Response Sequential Method.” Behavioral Science, 9, 226–232.

Becker G. S. (1974). “A Theory of Social Interactions.” Journal of Political Economy, 82, 1063–1093.

Becker G. S. (1991). A Treatise on the Family. Harvard University Press, Cambridge, MA.

Becker G. S. (1996). Accounting for Tastes. Harvard University Press, Cambridge, MA.

Becker G. S., Murphy K. M. (2000). Social Economics: Market Behavior in a Social Environment. Belknap Press of Harvard University, Cambridge, MA.

Benabou R. (1993). “Workings of a City: Location, Education, and Production.” Quarterly Journal of Economics, 108, 619–652.

Bernheim B. D. (1994). “A Theory of Conformity.” Journal of Political Economy, 102, 841–877.
Bertrand M., Pan J., Kamenica E. (2015). “Gender Identity and Relative Income within Households.” Quarterly Journal of Economics, 130, 571–614.

Booij A., Leuven E., Oosterbeek H. (2017). “Ability Peer Effects in University: Evidence from a Randomized Experiment.” Review of Economic Studies, 84, 816–839.

Borjas G. J. (1987). “Self-Selection and the Earnings of Immigrants.” American Economic Review, 77(4), 531–553.

Borjas G. J. (1995). “The Economic Benefits from Immigration.” The Journal of Economic Perspectives, 9(2), 3–22.

Bursztyn L., Jensen R. (2015). “How Does Peer Pressure Affect Educational Investments?” Quarterly Journal of Economics, 130, 1329–1367.

Cameron A. C., Gelbach J. B., Miller D. L. (2008). “Bootstrap-Based Improvements for Inference with Clustered Errors.” Review of Economics and Statistics, 90, 414–427.

Canada G. (1995). Fist, Stick, Knife, Gun: A Personal History of Violence in America. Beacon Press, Boston, MA.

Card D., Mas A., Moretti E., Saez E. (2012). “Inequality at Work: The Effect of Peer Salaries on Job Satisfaction.” American Economic Review, 102(6), 2981–3002.

Carrell S. E., Fullerton R. L., West J. E. (2009). “Does Your Cohort Matter? Measuring Peer Effects in College Achievement.” Journal of Labor Economics, 27, 439–464.

Carrell S. E., Sacerdote B. I., West J. E. (2013). “From Natural Variation to Optimal Policy? The Importance of Endogenous Peer Group Formation.” Econometrica, 81, 855–882.

Chapple S., Jefferies R., Walker R. (1997). Maori Participation and Performance in Education: A Literature Review and Research Programme. N.Z. Institute of Economic Research.

Charles K. K., Hurst E., Roussanov N. (2009). “Conspicuous Consumption and Race.” Quarterly Journal of Economics, 124, 425–467.

Cicala S., Fryer R. G., Spenkuch J. L. (2011). “A Roy Model of Social Interactions.” NBER Working Paper No. 16880, Cambridge, MA.

Cooley-Fruehwirth J. C. (2013). “Identifying Peer Achievement Spillovers: Implications for Desegregation and the Achievement Gap.” Quantitative Economics, 4, 85–124.

Cullen J. B., Jacob B. A., Levitt S. D. (2006). “The Effect of School Choice on Participants: Evidence from Randomized Lotteries.” Econometrica, 74, 1191–1230.

De Vos G., Wagatsuma H. (1966). Japan’s Invisible Race: Caste in Culture and Personality. University of California Press, Berkeley.

Drake S. C., Cayton H. R. (1945). Black Metropolis: A Study of Negro Life in a Northern City. University of Chicago Press.

Duflo E., Dupas P., Kremer M. (2011). “Peer Effects, Teacher Incentives, and the Impact of Tracking: Evidence from a Randomized Evaluation in Kenya.” American Economic Review, 101(5), 1739–1774.

Elsner B., Isphording I. (2017). “A Big Fish in a Small Pond: Ability Rank and Human Capital Investment.” Journal of Labor Economics, 35(3), 787–828.

Elsner B., Isphording I. (2017). “Rank, Sex, Drugs, and Crime.” Journal of Human Resources, forthcoming.

Epple D., Romano R. E. (2011). “Peer Effects in Education: A Survey of the Theory and Evidence.” In Handbook of Social Economics, Vol. 1, edited by Benhabib J., Bisin A., Jackson M. O. Elsevier, Amsterdam, pp. 1053–1163.

Feld J., Zolitz U. (2017). “Understanding Peer Effects: On the Nature, Estimation, and Channels of Peer Effects.” Journal of Labor Economics, 35, 387–428.

Fordham S., Ogbu J. U. (1986). “Black Students’ School Success: Coping with the ‘Burden of “Acting White”’.” The Urban Review, 18, 176–206.

Friedman M. (1953). “The Methodology of Positive Economics.” In Essays in Positive Economics. University of Chicago Press, Chicago.

Gans H. J. (1962). The Urban Villagers: Group and Class in the Life of Italian-Americans. Free Press of Glencoe, New York.

Glaeser E. L., Sacerdote B. I., Scheinkman J. A. (2003). “The Social Multiplier.” Journal of the European Economic Association, 1, 345–353.

Goux D., Maurin E. (2007). “Close Neighbours Matter: Neighbourhood Effects on Early Performance at School.” Economic Journal, 117, 1193–1215.

Guryan J., Kroft K., Notowidigdo M. J. (2009). “Peer Effects in the Workplace: Evidence from Random Groupings in Professional Golf Tournaments.” American Economic Journal: Applied Economics, 1, 34–68.

Hanushek E. A., Kain J. F., Markman J. M., Rivkin S. G. (2003). “Does Peer Ability Affect Student Achievement?” Journal of Applied Econometrics, 18, 527–544.

Harrison G. W., List J. A. (2004). “Field Experiments.” Journal of Economic Literature, 42, 1009–1055.

Heckman J. J., Sedlacek G. (1985). “Heterogeneity, Aggregation, and Market Wage Functions: An Empirical Model of Self-selection in the Labor Market.” Journal of Political Economy, 93, 1077–1125.

Heckman J. J., Scheinkman J. A. (1987). “The Importance of Bundling in a Gorman-Lancaster Model of Earnings.” Review of Economic Studies, 54, 243–255.

Hoxby C. M. (2003). “The Power of Peers: How Does the Makeup of a Classroom Influence Achievement.” Education Next, 2(2), 57–63.
Hoxby C. M., Weingarth G. (2005). “Taking Race Out of the Equation: School Reassignment and the Structure of Peer Effects.” Working paper. Harvard University.

Imberman S., Kugler A. D., Sacerdote B. (2012). “Katrina’s Children: Evidence on the Structure of Peer Effects from Hurricane Evacuees.” American Economic Review, 102(5), 2048–2082.

Ioannides Y. M. (2011). “Neighborhood Effects and Housing.” In Handbook of Social Economics, Vol. 1, edited by Benhabib J., Bisin A., Jackson M. O. Elsevier, Amsterdam, pp. 1281–1340.

Jackson M. O., Wolinsky A. (1996). “A Strategic Model of Social and Economic Networks.” Journal of Economic Theory, 71, 44–74.

Jackson M. O., Rogers B. W. (2007). “Meeting Strangers and Friends of Friends: How Random Are Social Networks?” American Economic Review, 97(3), 890–915.

Jackson M. O., Rogers B., Zenou Y. (2017). “The Economic Consequences of Social Network Structure.” Journal of Economic Literature, 55, 49–95.

Kang C. (2007). “Classroom Peer Effects and Academic Achievement: Quasi-Randomization Evidence from South Korea.” Journal of Urban Economics, 61, 458–495.

Kling J. R., Ludwig J., Katz L. F. (2005). “Neighborhood Effects on Crime for Female and Male Youth: Evidence from a Randomized Housing Voucher Experiment.” Quarterly Journal of Economics, 120, 87–130.

Kling J. R., Liebman J. B., Katz L. F. (2007). “Experimental Analysis of Neighborhood Effects.” Econometrica, 75, 83–119.

Kuziemko I., Buell R. W., Reich T., Norton M. I. (2014). “Last-Place Aversion: Evidence and Redistributive Implications.” Quarterly Journal of Economics, 129, 105–149.

Luttmer E. F. P. (2005). “Group Loyalty and the Taste for Redistribution.” Journal of Political Economy, 109, 500–528.

Marsh H., Parker J. W. (1984). “Determinants of Student Self-Concept: Is It Better to Be a Relatively Large Fish in a Small Pond Even if You Don’t Learn to Swim as Well?” Journal of Personality and Social Psychology, 47, 213–231.

Marsh H. (1987). “The Big-Fish-Little-Pond Effect on Academic Self-Concept.” Journal of Educational Psychology, 79, 280–295.

Marsh H., Seaton M., Trautwein U., Ludtke O., Hau K. T., O’Mara A. J. (1987). “The Big-Fish-Little-Pond-Effect Stands Up to Critical Scrutiny: Implications for Theory, Methodology, and Future Research.” Educational Psychology Review, 20, 319–350.

Mas A., Moretti E. (2009). “Peers at Work.” American Economic Review, 99(1), 122–145.

Miller R. A. (1984). “Job Matching and Occupational Choice.” Journal of Political Economy, 92, 1086–1120.

Murphy K. M. (1986). “Specialization and Human Capital.” Doctoral Dissertation. University of Chicago.

Murphy R., Weinhardt F. (2014). “Top of the Class: The Importance of Rank Position.” Working paper. London School of Economics.

Rogers C. M., Smith M. D., Coleman J. M. (1978). “Social Comparison in the Classroom: the Relationship Between Academic Achievement and Self-Concept.” Journal of Educational Psychology, 70, 50–57.

Rosen S. (1974). “Hedonic Prices and Implicit Markets: Product Differentiation in Pure Competition.” Journal of Political Economy, 82, 34–55.

Rosen S. (1982). “Authority, Control, and the Distribution of Earnings.” Bell Journal of Economics, 13, 311–323.

Roy A. D. (1951). “Some Thoughts on the Distribution of Earnings.” Oxford Economic Papers, 3, 135–146.

Sacerdote B. I. (2001). “Peer Effects with Random Assignment: Results for Dartmouth Roommates.” Quarterly Journal of Economics, 116, 681–704.

Sanbonmatsu L., Kling J. R., Duncan G., Brooks-Gunn J. (2006). “Neighborhoods and Academic Achievement: Results from the Moving to Opportunity Experiment.” Journal of Human Resources, 41, 649–691.

Sattinger M. (1979). “Differential Rents and the Distribution of Earnings.” Oxford Economic Papers, 31, 60–71.

Shea J. (1997). “Instrument Relevance in Multivariate Linear Models: A Simple Measure.” Review of Economics and Statistics, 79, 348–352.

Spence M. (1973). “Job Market Signaling.” Quarterly Journal of Economics, 87, 355–374.

Stock J. H., Yogo M. (2005). “Testing for Weak Instruments in Linear IV Regression.” In Identification and Inference for Econometric Models: Essays in Honor of Thomas Rothenberg, edited by Andrews D. W. K., Stock J. H. Cambridge University Press, Cambridge, UK, pp. 80–108.

Suskind R. (1998). A Hope in the Unseen: An American Odyssey from the Inner City to the Ivy League. Broadway Books, New York.

Tincani M. (2014). “On the Nature of Social Interactions in Education: An Explanation for Recent Experimental Evidence.” Working paper. University College London.

Tincani M. (2015). “Heterogeneous Peer Effects and Rank Concerns: Theory and Evidence.” Working paper. University College London.

Willis P. (1977). Learning to Labor: How Working Class Kids Get Working Class Jobs. Columbia University Press.

Willis R. J., Rosen S. (1979). “Education and Self-Selection.” Journal of Political Economy, 87, S7–S36.

Wilson W. J. (1987). The Truly Disadvantaged: the Inner City, the Underclass, and Public Policy. The University of Chicago Press.

### Supplementary Data

Supplementary data are available at JEEA online.

© The Authors 2017. Published by Oxford University Press on behalf of European Economic Association. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com

Journal of the European Economic Association, Oxford University Press

# Self-Selection and Comparative Advantage in Social Interactions

Volume: Advance Article (Sep 29, 2017), 38 pages

Publisher: European Economic Association

ISSN: 1542-4766

eISSN: 1542-4774

DOI: 10.1093/jeea/jvx031

### Abstract

We propose a theory of social interactions based on self-selection and comparative advantage. In our model, students choose peer groups based on their comparative advantage within a social environment. The effect of moving a student into a different environment with higher-achieving peers depends on where in the ability distribution she falls and the shadow prices that clear the social market. We show that the model’s key prediction, that an individual’s ordinal rank predicts her behavior and test scores, is borne out in one randomized controlled trial in Kenya as well as administrative data from the United States. To test whether our selection mechanism can explain the effect of rank on outcomes, we conduct an experiment with nearly 600 public school students in Houston. The experimental results suggest that social interactions are mediated by self-selection based on comparative advantage. (JEL: I21, J24)

The editor in charge of this paper was Imran Rasul.

Acknowledgments: Previous versions of this paper circulated under the title “A Roy Model of Social Interactions.” We are grateful to Gary Becker, Edward Glaeser, Bryan Graham, Richard Holden, Lawrence Katz, Steven Levitt, Franziska Michor, Bruce Sacerdote, Chris Shannon, Jesse Shapiro, Andrei Shleifer, Michela Tincani, Glen Weyl, as well as seminar participants at Harvard and Chicago for many helpful comments and suggestions. Brad Allan, Vilsa Curto, Tanaya Devi, Matt Davis, Ryan Fagan, Adriano Fernandes, Natalya Naumenko, and Wonhee Park provided excellent research assistance. Financial support from the Weatherhead Center for International Affairs and Institute for Humane Studies [Cicala], the Education Innovation Lab at Harvard University [Fryer], and the German National Academic Foundation [Spenkuch] is gratefully acknowledged.
Correspondence can be addressed to the authors at Harris School of Public Policy, University of Chicago, 1155 E 60th Street, Chicago IL 60637 [Cicala]; Department of Economics, Harvard University, 1805 Cambridge Street, Cambridge MA 02138 [Fryer]; MEDS Department, Kellogg School of Management, 2211 Campus Drive, Evanston IL 60208 [Spenkuch]; or by e-mail. The usual caveat applies.

### 1. Introduction

Social scientists have long recognized the importance of peer effects.1 Gans (1962), for instance, describes an insidious form of social interactions in which Italian immigrant communities in Boston’s West End impose costs on individuals who “act mobile.”2 Wilson (1987) argues that the development of an “underclass” of black city dwellers on Chicago’s South Side was due to the emigration of working families and the resulting decrease in role models and neighborhood quality, and Borjas (1995) demonstrates that the mean skill level within one’s ethnic group in the previous generation is correlated with own educational achievement.

To better understand these and related phenomena, economists have developed models of social interactions by putting environmental variables (such as the mean behavior in one’s social group or the mean educational attainment in one’s neighborhood) into agents’ utility functions. In this class of models, peers are a source of positive externalities; unruly peers cause more trouble and smarter peers encourage higher achievement (Akerlof 1997; Becker 1974, 1996; Benabou 1993; Bernheim 1994). However, empirical evidence in support of theories that predict that favorable social interactions lead to better outcomes has been ambivalent.
Although many authors confirm the hypothesis relying on plausibly exogenous variation in data sets ranging from primary students in Texas to freshmen at Dartmouth College, others find no or even negative peer effects.3 Recently, Carrell, Sacerdote, and West (2013) present perplexing evidence from a randomized field experiment designed to boost the achievement of low-ability freshmen at the US Air Force Academy. Based on earlier, experimental estimates indicating nonlinear, positive effects of peers’ mean test scores, the authors “optimally” constructed peer groups by pairing low-ability students with a greater share of high-ability ones. Contrary to the authors’ expectations, however, the intervention had a negative and statistically significant effect on the very students it was designed to help. That is, if anything, “better” classmates led to slightly worse outcomes. Carrell et al. (2013) note that within their “optimally” designed peer groups, low-ability students avoided the very peers with whom they were intended to interact and instead formed more homogeneous subgroups. At an abstract level, social interactions involve a decision about what to do with whom. We note that similar decision problems arise in many other economic settings, such as the choice of an occupation or industry (Heckman and Scheinkman 1987; Miller 1984; Roy 1951), immigration (Borjas 1987), educational investment decisions (Willis and Rosen 1979), or the division of labor within households (Becker 1991). Building on this insight, this paper explores the role of self-selection in social interactions—both theoretically and empirically. Our analysis begins by developing a theory of social interactions that builds on Roy’s classic account of self-selection in the labor market (Roy 1951). 
To model the endogeneity of social contacts, we posit the existence of an implicit price mechanism in the “market for peers.” In our simple theoretical framework, children care only about social status from membership in peer groups. In equilibrium, heterogeneity in ability leads students to self-select into groups based on their comparative advantage within a particular setting. When self-selection is the guiding principle of peer group formation, a child’s behavior is an equilibrium outcome. It depends on where in the ability distribution she falls, and on the shadow prices that clear the social market. Put differently, selection into peer groups is based on comparative advantage, and peer effects arise due to the endogenous sorting of students into peer groups within a social environment. It is important to emphasize at the outset that there are no intrinsic externalities built into the model. That is, in contrast to standard theories of peer effects, students do not explicitly benefit from the ability of the other members of their group. In our Roy model, children endogenously determine which peer group to join within a particular social environment. Peer effects arise because the presence of others affects students’ choice of peer group and, therefore, their behavior. The idea of self-selection in social interactions is closely related to a large literature on endogenous network formation (see, e.g., Bala and Goyal 2000; Jackson and Rogers 2007; Jackson and Wolinsky 1996; and the survey by Jackson, Rogers, and Zenou 2017). In fact, our theory may be viewed as a simple model of network formation using comparative advantage as an allocation mechanism. Put differently, even within classrooms networks are likely incomplete—not everyone is friends with everyone else—and our model predicts who socializes with whom, which in turn affects children’s behavior. 
An important and novel implication of our model is that a student’s academic achievement and problem behaviors depend on her ordinal rank among her peers.4 We document the impact of rank in two different data sets.5 The first one comes from a randomized controlled trial in Kenya, and was collected by Duflo et al. (2011). In 2005, 121 Kenyan primary schools with a single first-grade class received additional funds to hire another teacher and create a second section. In 61 of these schools, students were randomized into classrooms. In the remaining 60 schools, students were assigned to sections based on initial achievement. The intervention lasted 18 months. Relying on the experimentally induced variation in the within-classroom rank of students with equal baseline test scores, we show that increasing a student’s rank by 50 percentiles boosts test scores at endline by about 0.2 standard deviations.

To provide additional evidence on the relationship between ordinal rank and student outcomes, we use administrative data from New York City Public Schools (NYCPS). Our research design for these data exploits transitions from elementary to middle school, that is, from fifth to sixth grade. We estimate that a 50 percentile decrease in rank among schoolmates is associated with as much as a 2.5 percentage point increase in the probability of a serious behavioral incident, about 32% of its mean. To account for sorting into schools as well as potential issues of reverse causality, we use students’ hypothetical change in rank if they had attended the school for which they were zoned as an instrument for the actual change in rank based on the school they chose to attend.
Although our IV estimates are less precise, they are qualitatively very similar to their OLS counterparts.6 Consistent with the idea of self-selection in social interactions, the evidence suggests that students’ ordinal rank exerts a significant effect on their achievement and behavior.7 Yet, these data cannot rule out other mechanisms. For instance, if teachers always target the top of the class, then higher ranked students would benefit from more appropriate instruction and thus experience an increase in test scores. Although such an explanation would not invalidate our empirical result that ordinal rank matters for student outcomes, it illustrates the multiplicity of plausible channels through which the effect of rank might operate. Finding direct evidence in support of self-selection based on comparative advantage is challenging—in large part because neither the shadow prices that clear the social market nor students’ comparative advantage are directly observable in standard data sets. Similarly, given the potential number of alternative theories that predict a relationship between ordinal rank and economic outcomes, and the data requirements associated with testing each, it is unclear how one could rule out all, or even most, of them.8 Thus, rather than trying to pin down the share of the relationship between rank and outcomes that is attributable to a particular mechanism, we pursue the more modest goal of providing additional evidence that some portion of the effect of rank on behavior is due to self-selection—as this is the precise mechanism explored in our model. To this end, we conducted a “framed” field experiment.9 Relying on the experimentally induced variation, we can test whether self-selection based on comparative advantage can explain the observed relationship between rank and outcomes. 
Between February 2015 and May 2015, we recruited nearly 600 children from two public middle schools in Houston, TX and incentivized them to “solve” mazes in a custom-made computer game. According to our conversations with principals and teachers, children at these schools have extensive experience playing comparable games on their phones or even on the schools’ computers. By embedding our experiment in a familiar context, we hope to replicate as many of the myriad situational factors that may affect students’ behavior as we possibly can, while maintaining enough experimental control to identify causal effects. During the initial stage of the game, students were asked to solve a common set of mazes in order to establish a baseline measure of ability. The software then publicly revealed the ordinal rank as well as the cardinal performance of all participants in the same experimental session—similar to the scoreboard in many popular video games. In the next stage, all students were afforded the opportunity to practice solving mazes at a fixed, randomly determined cost per maze. In addition, children in the treatment group could pay to publicly “slime” the screen of any practicing peer. Sliming another student’s screen carried no monetary benefit, but it blocked a portion of the maze on which the respective participant was working, thereby negating the benefits of practicing for her. In the third and last stage of the game, students were asked to solve more difficult mazes and were rewarded with a piece rate for each one they successfully completed. At no point during the experiment did monetary payoffs depend on ordinal rank. Among children in the control group we observe that, conditional on actual performance in the first stage, children who were paired with better peers and, therefore, find themselves closer to the bottom of the distribution are, if anything, more likely to invest in becoming better at solving mazes. This is not the case in the treatment group. 
Being able to publicly “slime” their peers, lower-ranked students substitute away from practicing and pay to disrupt others instead. Given that children were randomly assigned to either treatment or control, our experimental design permits us to test the hypothesis that the opportunity to self-select into a second, disruptive activity exerts a disproportionate effect on lower-ranked children. That is, by giving students the choice between a constructive and destructive activity we allow for self-selection based on comparative advantage to affect social interactions in the treatment group, but not the control. Perversely, the very children who would ordinarily practice more to overcome their relative disadvantage chose to “act out” instead. This suggests that students’ behavior is mediated, in part, by self-selection based on comparative advantage within narrowly defined social settings.

Although it is difficult to generalize from the findings of any given experiment (and perhaps even more so in our case), we note that our experimental design holds many, if not all, potential confounds fixed. For instance, the experimental results cannot be due to a change in students’ cognitive or noncognitive skills, teacher conduct, or environmental influences. This does not necessarily mean that these factors are irrelevant for explaining the patterns that we document in the real-world data. It does, however, suggest that self-selection may be an important mechanism for explaining peer effects.

The remainder of the paper is organized as follows. Section 2 develops a formal model of self-selection in social interactions, and Section 3 presents empirical evidence consistent with the theory’s key prediction. Section 4 describes an experiment designed to test the mechanism highlighted in our model. The final section concludes.10

### 2. A Comparative Advantage Theory of Social Interactions

The model we propose in this paper is a simplified version of the well-known multi-sector choice problem, building upon impressive literatures designed to understand the evolution of earnings, the hedonic pricing of skills, and the assignment of workers to firms (e.g., Heckman and Sedlacek 1985; Murphy 1986; Rosen 1974; Roy 1951; Sattinger 1979). The novelty of our approach lies in the application of these classic methods to develop a theory of social interactions where contacts within a social market are endogenous and peer effects arise due to the sorting of agents within narrowly defined social settings.

#### 2.1. Basic Building Blocks

Let there be a continuum of students with unit mass. Every child is endowed with one unit of nontransferable time. There are two activities in which students can engage with their peers: studying or mischief. These activities are exclusive and undertaken by separate social groups: “nerds” and “troublemakers.” Children acquire social status from membership in peer groups. How much status membership in group $j = N, T$ conveys depends on the effective group size, $L_j$, and other exogenously given factors, which we label capital, $K_j$. We allow capital to broadly represent any nonhuman input into groups’ activities, such as the availability of textbooks or sharp scissors, the quantity of policing, or school and neighborhood quality more generally.

Children are heterogeneous along two dimensions. Their varying size and strength yield differences in the ability to cause trouble, whereas heterogeneity in cognitive ability implies differences in their ability to be a true nerd. Let $\sigma_N(r)$ denote the effective units of “nerdiness” that student $r$ is capable of contributing to the group (e.g., expertise in differential geometry). Analogously, $r$’s troublemaking ability is given by $\sigma_T(r)$.
Without loss of generality, we rank students by their relative skill $\sigma(r) \equiv \sigma_N(r)/\sigma_T(r)$, such that $\sigma(r) \ge \sigma(r')$ whenever $r > r'$. For simplicity, we assume that children are solely interested in maximizing their social status

$$U(r) = \max\left\lbrace s_N \sigma_N(r),\; s_T \sigma_T(r) \right\rbrace, \tag{1}$$

where the shadow prices $s_N$ and $s_T$ denote the (endogenously determined) status per effective unit of nerd and troublemaking ability, respectively. Thus, total utility from membership in group $j$ is given by $s_j \sigma_j(r)$. Note, there are no explicit externalities built into students’ utility. Conditional on “prices,” the behavior of others has no influence on own decisions, as in analyses of traditional markets.

The key assumption in equation (1) is that, all else equal, “nerdier” individuals, that is, those with higher $\sigma_N(r)$, will derive more utility from joining the nerd sector than children with less nerd ability. In Cicala et al. (2011), we allow for more general utility functions (e.g., children care about more than just social status, or they can allocate parts of a fixed time endowment). Here, however, we present a very simple and parsimonious model in order to demonstrate how self-selection in social interactions can produce “peer effects.” It is important to note that the main results continue to hold as long as the benefits from joining a particular group are increasing in the respective dimension of ability, so that sorting into peer groups is at least partially determined by comparative advantage.11

Children maximize their social status by choosing either the nerd or troublemaking group according to a simple cut-off rule. The student indifferent between the two sectors, $r^{\ast}$, has a skill ratio of

$$\sigma(r^{\ast}) = \frac{s_T}{s_N}. \tag{2}$$

By individual optimization, all students with index $r \ge r^{\ast}$ join forces with the nerds, and individuals with $r < r^{\ast}$ become troublemakers.
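As a rough illustration of the cut-off rule in equations (1) and (2), the following Python sketch assigns students on a grid to groups while taking the shadow prices as given. The linear skill schedules, the price values, and the function name `choose_groups` are assumptions made for this example only; they are not part of the model's specification.

```python
import numpy as np

def choose_groups(sigma_n, sigma_t, s_n, s_t):
    """Return True where a student joins the nerds, following equation (1)."""
    # Each student picks the group with the larger status payoff.
    # Ties go to the nerds, matching the rule r >= r*.
    return s_n * sigma_n >= s_t * sigma_t

# Hypothetical primitives on a grid of students r in [0, 1].
r = np.linspace(0.0, 1.0, 1001)
sigma_n = 1.0 + 2.0 * r           # nerd skill, assumed to rise with r
sigma_t = np.full_like(r, 1.5)    # troublemaking skill, assumed flat
s_n, s_t = 1.0, 1.25              # shadow prices, taken as given here

is_nerd = choose_groups(sigma_n, sigma_t, s_n, s_t)

# Equation (2): the same choice written as a cut-off on relative skill.
relative_skill = sigma_n / sigma_t
same_choice = np.array_equal(is_nerd, relative_skill >= s_t / s_n)

# Marginal student on the grid and the effective skill each group receives
# (simple Riemann sums over the grid).
r_star = r[is_nerd].min()
dr = r[1] - r[0]
supply_nerd = sigma_n[is_nerd].sum() * dr
supply_trouble = sigma_t[~is_nerd].sum() * dr

print(same_choice, r_star, supply_nerd, supply_trouble)
```

Because only the ratio of the two payoffs matters for the choice, the payoff comparison and the relative-skill cut-off select the same students; the prices themselves are pinned down only once market clearing is imposed.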
In our Roy model of social interactions, comparative (rather than absolute) advantage determines a child’s choice of peer group, and therefore whether she chooses to engage in studying or mischief. As a result, the supply of skills to both groups is given by

$$L_N^{\ast} = \int_{r^{\ast}}^{1} \sigma_N(q)\,dq, \tag{3}$$

$$L_T^{\ast} = \int_{0}^{r^{\ast}} \sigma_T(q)\,dq. \tag{4}$$

Equilibrium, however, also depends on the endogenously determined shadow prices, and, therefore, on the relationship between social status, $s_j$, and effective group size, $L_j$. There are many plausible micromechanisms for why the payoffs to joining either group may be a function of groups’ sizes. For instance, it could be that group membership directly confers social status, or that students simply derive utility from spending time with others. Beyond the narrow context of our model and in the spirit of Spence (1973) and Austen-Smith and Fryer (2005), students may also care about signaling their “type” to adults. In equilibrium, the credibility of their signals will depend on the number of others who adopt the same behavior, which introduces interdependence in payoffs.

Since our primary goal is to study how self-selection into activities depends on the composition of peers, we remain agnostic about the deep determinants of $s_j$. Instead of taking a firm stand on the ultimate source of utility from group membership, we simply explore different possibilities for the functional relationship between $s_j$ and $L_j$.

#### 2.2. Case I: Social Status Decreases with Effective Group Size

In a traditional Roy model it is usually assumed that labor exhibits diminishing returns to scale. If, for instance, increasing the number of troublemakers does more to increase the probability of getting caught than of winning a fight, then social status may be decreasing in effective group size, $L_T$.
Similarly, intelligence may confer greater status when this skill is scarce and not readily available within a social environment. Hence, we first assume $\partial s_j(L_j, K_j)/\partial L_j < 0$. In equilibrium, market clearing and equation (2) yield the following condition:

$$\delta(r^{\ast}) \equiv \frac{s_T(L_T(r^{\ast}), K_T)}{s_N(L_N(r^{\ast}), K_N)} = \sigma(r^{\ast}), \tag{5}$$

where $\delta(r)$ denotes the ratio of social status in both sectors when the marginal student is $r$. Since $\delta'(r) < 0$ for all $r$, the relative “price” schedule in the market for peers is strictly downward sloping. To see this, note that equations (3) and (4) respectively imply $dL_N^{\ast}/dr < 0$ and $dL_T^{\ast}/dr > 0$, which causes the absolute as well as the relative status of troublemakers to decrease as agents shift from the nerd into the troublemaking group.

We can now describe equilibrium graphically. Figure 1 depicts the situation when social status is decreasing in effective group size. As described above, it features upward sloping “supply” and a downward sloping “price” schedule. There is a unique equilibrium at $r^{\ast}$ with market clearing relative status, $(s_T/s_N)^{\ast}$. All students with $r < r^{\ast}$ select into the troublemaker group and children with $r \ge r^{\ast}$ choose to become nerds.

Figure 1. Equilibrium in the market for peers when status decreases with effective group size.

Social status may also depend on the specifics of the exogenously given environment. To allow for this possibility we let $s_j$ depend not only on group size, but also on any number of environmental inputs $K_j$. Suppose $\partial s_j(L_j, K_j)/\partial K_j > 0$ and imagine a shift in the “capital” available to the troublemaking sector: less police surveillance, an increase in the availability of drugs, weapons, or alcohol.
Holding everything else constant, an increase in troublemakers’ productive capital, $K_T$, is represented by an outward shift of the $\delta$-schedule, which results in higher status for troublemakers and, therefore, fewer nerds. A decrease in the “capital” available to troublemakers has the opposite effects. Thus, with respect to features of the physical environment our Roy model of social interactions features conventional predictions.

Comparative statics with respect to the skill distribution, however, can be quite counterintuitive. Consider, for instance, an increase in nerd skill among the population holding troublemaking ability fixed. First, an increase in children’s nerdiness raises $\sigma_N$ relative to $\sigma_T$ and thus shifts the “supply” curve inward. Second, the equilibrium price schedule shifts outward because with more academically able peers there will be more effective units of nerd skill supplied at any $r$, which lowers $s_N$ in equation (5). Although both shifts lead to an unambiguous rise in the relative status of troublemakers, the effect on quantities is indeterminate.

Figure 2 illustrates this point. In the left panel, social status is largely unresponsive, despite the large shift in the distribution of relative skill. On net the shift in $\sigma(r)$ outweighs that in $\delta(r)$, which results in an expansion of the nerd group. In the panel on the right, however, nerds’ social status drops rapidly with group size, leading to a much larger outward shift of the price schedule, $\delta_1(r)$.12 In the new equilibrium, fewer children choose to become nerds, despite the fact that everyone has higher nerd ability than before.

Figure 2. Effect of higher-quality peers when social status is decreasing in effective group size.

#### 2.3. Case II: Social Status Increases with Effective Group Size

Much of the literature on social interactions assumes that the marginal utility of a social activity is increasing in overall participation. In models with a social multiplier, peer effects arise because an individual’s utility from taking a particular action increases in the number of agents in her reference group who behave in the same way (e.g., Becker and Murphy 2000; Glaeser et al. 2003). By assuming that social status rises with effective group size, that is, $\partial s_j(L_j, K_j)/\partial L_j > 0$, our comparative advantage theory can replicate these models. As in the case of decreasing status, the equilibrium price schedule continues to be given by

$$\delta(r^{\ast}) = \frac{s_T(L_T(r^{\ast}), K_T)}{s_N(L_N(r^{\ast}), K_N)}.$$

The key difference when social status is increasing in group size is that $\delta'(r) > 0$ and that there may exist multiple equilibria.13 Figure 3 depicts such a scenario. Here, increasing status yields equilibria at the origin, at $r^{\ast}$, and at $r^{\ast\ast}$. But only equilibria in which the price schedule, $\delta$, intersects the “supply” curve from above are locally stable.14

Given the existence of multiple equilibria, the Roy model approach can rationalize starkly different behaviors of children in observationally similar environments. Moreover, small changes in the environment may lead to large behavioral responses. Consider, for instance, a decrease in the inputs available to the troublemaking group and assume that $\partial s_j(L_j, K_j)/\partial K_j > 0$. As shown in Figure 4, the initial decrease in $K_T$ lowers the relative status of troublemakers, and, thus, causes troublemakers close to the initial equilibrium to switch peer groups. The decrease in the size of the troublemaking group further lowers $s_T$, which leads to an even larger outflow, and so on.
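To make the contrast between Cases I and II concrete, the sketch below solves the equilibrium condition $\delta(r^{\ast}) = \sigma(r^{\ast})$ numerically and applies the stability criterion stated above (the price schedule must cross the “supply” curve from above). Everything specific here is an assumption chosen for illustration: the skill schedules $\sigma_N(r) = 0.5 + r$ and $\sigma_T(r) = 1$, the status function $s_j = K_j L_j^{\gamma}$, and the function name `interior_equilibria`. Only interior crossings are reported; corner outcomes are not examined.

```python
import numpy as np

def interior_equilibria(gamma, k_t=1.0, k_n=1.0, n_grid=2001):
    """Find interior r* with delta(r*) = sigma(r*) and classify stability.

    Assumed (hypothetical) primitives, not taken from the paper:
      sigma_N(r) = 0.5 + r, sigma_T(r) = 1, s_j = K_j * L_j**gamma.
    gamma < 0 mimics Case I (status falls with effective group size),
    gamma > 0 mimics Case II (status rises with effective group size).
    """
    def gap(r):
        l_t = r                               # integral of sigma_T over [0, r]
        l_n = (1.0 - r) * (1.0 + 0.5 * r)     # integral of sigma_N over [r, 1]
        delta = (k_t * l_t**gamma) / (k_n * l_n**gamma)
        return delta - (0.5 + r)              # delta(r) - sigma(r)

    grid = np.linspace(0.01, 0.99, n_grid)
    values = gap(grid)
    found = []
    for i in range(n_grid - 1):
        if values[i] * values[i + 1] < 0:     # sign change brackets a crossing
            lo, hi = grid[i], grid[i + 1]
            for _ in range(60):               # plain bisection
                mid = 0.5 * (lo + hi)
                if gap(lo) * gap(mid) <= 0:
                    hi = mid
                else:
                    lo = mid
            # delta crossing sigma from above (gap turns from + to -) is stable
            stable = bool(values[i] > 0)
            found.append((0.5 * (lo + hi), stable))
    return found

case_one = interior_equilibria(gamma=-1.0)    # status decreasing in group size
case_two = interior_equilibria(gamma=1.0)     # status increasing in group size
print(case_one, case_two)
```

Under these assumed forms, the decreasing-status case has a single interior crossing that satisfies the stability criterion, whereas the increasing-status case has a single interior crossing that fails it and therefore acts as a tipping point. Other functional forms can generate several interior crossings, so the example should be read as one possible configuration rather than a general result.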
Analogous to traditional models of a social multiplier, students’ behavior may become very elastic when their choices are complements. In contrast to conventional analyses, our theory allows for heterogeneity in agents’ skill endowments to drive different behavior in a common environment. This form of heterogeneity is necessary to explore the idea of self-selection in social interactions.

Figure 3. Equilibria in the market for peers when status increases with effective group size.

Figure 4. Comparative statics when status increases with effective group size.

#### 2.4. Case III: Rank-Dependent Utility

Our theory can also mimic models in which the utility from engaging in a particular activity depends directly on an individual’s rank. The simplest way to incorporate the idea of rank-dependence into our Roy model is to assume that the shadow prices vary across students and are given by $s_{r,j}(r, K_j)$ with $\partial s_{r,N}(r, K_N)/\partial r > 0$ and $\partial s_{r,T}(r, K_T)/\partial r < 0$. These assumptions are sufficient (but not necessary) to ensure that there will, again, be a threshold individual, $r^{\ast}$, such that students with $r < r^{\ast}$ select into the troublemaker group whereas those with $r \ge r^{\ast}$ choose to become nerds.

On theoretical grounds, the key deviation from typical theories of rank-dependent utility is that $r$ is defined in terms of relative rather than absolute skills. This modeling choice sidesteps thorny questions about which peer group a student would choose when she would (in absolute terms) be both the most skilled nerd and the most skilled troublemaker within a given environment, and it produces self-selection based on comparative advantage.
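The threshold property under rank-dependent shadow prices can be checked in a small numerical sketch. The linear forms below for skills and for the rank-dependent prices are hypothetical and serve only to satisfy the sign conditions stated above; the variable names are likewise invented for the example.

```python
import numpy as np

r = np.linspace(0.0, 1.0, 501)

# Hypothetical primitives (assumptions for illustration only).
sigma_n = 1.0 + r              # nerd skill
sigma_t = np.ones_like(r)      # troublemaking skill, so sigma(r) = 1 + r
s_n = 0.8 + 0.6 * r            # nerd status per unit of skill, rising in r
s_t = 1.6 - 0.8 * r            # troublemaker status per unit of skill, falling in r

# Payoff from joining the nerds minus payoff from joining the troublemakers.
net_gain = s_n * sigma_n - s_t * sigma_t
joins_nerds = net_gain >= 0

# Single crossing: the chosen group switches exactly once as r increases,
# so a threshold student r* separates troublemakers from nerds.
switches = int(np.sum(joins_nerds[1:] != joins_nerds[:-1]))
r_star = r[joins_nerds].min()
print(switches, r_star)
```

With these forms the net gain is $0.6r^2 + 2.2r - 0.8$, which is strictly increasing on $[0, 1]$ and changes sign at $r = 1/3$, so every student above that point chooses the nerd group and every student below it chooses the troublemakers.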
But even without assuming that shadow prices vary directly with children’s rank, our model of self-selection in social interactions predicts rank-dependence in behavior. To see this, consider, again, Figures 1 and 3. A student at the bottom of the skill distribution will be more likely to join forces with the troublemakers than an agent in the right tail of the distribution, irrespective of whether social status is increasing or decreasing in effective group size, that is, even in cases I and II above. Formally, the net utility from selecting into the nerd rather than the troublemaker group is given by $$\sigma _{N}\left( r\right) s_{N}^{\ast }-\sigma _{T}\left( r\right) s_{T}^{\ast }= \left( \sigma \left( r\right) / \delta \left( r^{\ast }\right) \right) -1$$, which is increasing in r. Thus, within any social market, children in the upper tail of the skill distribution (i.e., those with high r) have more to gain from joining the nerd group than those in the lower tail (i.e., those with low r). Holding the social environment fixed, one would, therefore, expect to find a positive correlation between students’ ordinal rank and their choices of whether to become a troublemaker or a nerd. In this sense, our theory can also be interpreted as providing a microfoundation for rank-dependence in behavior, which does not rely on the (tautological) assumption that the utility from engaging in a particular activity depends directly on rank.15

2.5. Empirical Implications

On its face, our model of social interactions is about how individuals select into peer groups. To connect the theory to commonly available datasets on student outcomes, consider some educational production function $$y_{i}=\boldsymbol {X}_{i}{^\prime }\beta +h(e_{i})+u_{i}$$, in which yi denotes student i’s test scores, Xi is a set of environmental variables, and h(ei) is a monotonically increasing function that converts effort, ei, into test scores.
As suggested by our choice of name for each group, we would expect that the same student exerts more effort on schoolwork when she chooses to become a “nerd” rather than a “troublemaker.” In symbols, $$e_{i}^{N^{\ast }}\ge e_{i}^{T^{\ast }}$$ for all i. Conversely, we would expect students to engage in more unproductive—perhaps even anti-social—behavior when they opt to join forces with the troublemakers. Under these ancillary assumptions, our theory predicts that ordinal rank will be positively associated with academic achievement in settings in which students have the possibility to divert their time and effort to outside activities. At the same time, it is important to emphasize that it is not rank per se that generates this relationship, but self-selection. So far, we have been discussing outcomes as a function of rank based on the relative skill index r. In practice, we observe prior test scores as a proxy of σN, whereas σT is unobserved. Test score-based rank will be a valid proxy for rank in terms of relative nerd and troublemaking skill whenever σN(r) ≥ σN(r΄) implies that $$\mathbb {E}\left[ \sigma \left( r\right) \right] \ge \mathbb {E}\left[ \sigma \left( r^{\prime }\right) \right]$$. This condition holds trivially if it does not take any specific skill to cause trouble, that is, σT(r) = c for all r, or if σN and σT are independently distributed. It is also satisfied if nerd and troublemaking ability are negatively or not “too positively” correlated. If correct, then students’ scholastic achievement and their proclivity to “act out” should be related to their ordinal rank in the ability distribution. Anecdotally, the phenomenon that the behavior of children varies with their relative standing has been observed among some programs for gifted minority youth held at MIT each summer. These programs attract a subset of black children who are among the best and brightest in their schools. 
At MIT, however, they interact with more academically able peers, leading them to engage in a wide range of problem behaviors (Suskind 1998). Similarly, in his memoir, Canada (1995) speculates that even the most violent youth in Boston would only be mediocre fighters in the South Bronx and, therefore, be forced to change their ways. The prediction that students’ ordinal rank matters for academic outcomes also resonates with a small literature in social psychology on “big-fish-little-pond” effects (BFLPE; see Marsh et al. 2008 for a comprehensive review). Marsh et al. (1984), for instance, conclude that “children compare their own academic ability […] with the abilities of other students within their school or their reference group, and children use this relativistic impression as one basis of forming their academic self-concept” (p. 217). Marsh (1987) even argues that BFLPE accounts for about a quarter of the impact of academic self-concept on academic performance. Broadly summarizing, the comparative advantage approach to social interactions delivers a novel, testable prediction: students who fall near the top of their reference group should display higher academic achievement than their equally able counterparts who find themselves closer to the bottom in another environment. Conversely, the latter should be more prone to behavioral problems than the former. In the following section, we demonstrate that this implication of our theory is borne out in data from a randomized controlled trial involving more than 100 primary schools in Kenya, as well as administrative data from New York City Public Schools (NYCPS) from the 2003–2004 through 2008–2009 school years. Although the (quasi-) experimental variation in these data allows us to conclude that rank affects student outcomes, with standard data alone we cannot establish the precise, causal mechanism through which the impact of rank operates. 
In order to be able to speak to rank-based self-selection into activities, we conducted an experiment that compares the behavior of students who could and could not choose between different activities. These results isolate the self-selection channel and are presented in Section 4. 3. Ordinal Rank and Peer Effects The ideal data to test our theory would span multiple social markets—say, schools or classrooms—and contain information on shadow prices, individuals’ choices of peer groups as well as all of their skills. With such data in hand we could directly test whether self-selection based on comparative advantage determines behavior by comparing social status across groups and relating it to agents’ choices. However, in the absence of any information on social status, we confine ourselves to providing reduced form evidence that shows that individual behavior does depend on ordinal rank. That is, in the spirit of Friedman (1953) we test a stark prediction of our approach—one that does not follow from standard theories of social interactions. 3.1. Evidence from Primary Schools in Kenya Our first piece of evidence comes from the Extra Teacher Provision (ETP) intervention by Duflo et al. (2011).16 Starting in May 2005, ETP provided 121 Kenyan primary schools that had a single first-grade class with additional funds to hire an extra teacher and create a second section. In 61 randomly selected “tracking schools,” students were assigned to sections based on scores on exams administered by the schools prior to the intervention. Students above the median were grouped in one section, and those below the median in another one. In the remaining 60 “nontracking schools,” students were randomized into sections. After assignment of students to sections, each of a school’s two sections was also randomly assigned to either a civil service teacher or to one hired on a contractual basis. 
This intervention spanned 18 months.17

Table 1 displays summary statistics for the 121 schools in the sample of Duflo et al. (2011). Due to random assignment, tracking and nontracking schools look very similar on pretreatment observable characteristics. The same is true for students assigned to either the contract or government teacher section within nontracking schools. Within tracking schools, students assigned to the top section have an average standardized baseline test score of .81, compared with −.81 in the bottom section, and are almost .4 years older than their low-ability counterparts.

Table 1. Baseline school and class characteristics in the experiment of Duflo et al. (2011), by treatment group.

| All schools | Nontracking schools: Mean | Nontracking schools: SD | Tracking schools: Mean | Tracking schools: SD | p-value (Tracking = Nontracking) |
|---|---|---|---|---|---|
| School characteristics at baseline | | | | | |
| Total enrollment | 589 | (232) | 549 | (198) | .32 |
| Number of government teachers | 11.6 | (3.3) | 11.9 | (2.8) | .62 |
| Student/teacher ratio | 37.1 | (12.2) | 35.9 | (10.1) | .56 |
| Performance on national exam (out of 400) | 255.6 | (23.6) | 258.1 | (23.4) | .57 |
| Class size at baseline | | | | | |
| Average class size | 91 | (37) | 89 | (33) | .76 |
| Proportion of female students | .49 | (.06) | .49 | (.05) | .54 |

| Within nontracking schools | Section A (assigned to civil service teacher): Mean | Section A: SD | Section B (assigned to contract teacher): Mean | Section B: SD | p-value (Section A = Section B) |
|---|---|---|---|---|---|
| Proportion female | .49 | (.06) | .49 | (.06) | .89 |
| Average age at endline | 9.07 | (.53) | 9.00 | (.45) | .45 |
| Average standardized test score at baseline | .003 | (.10) | .002 | (.11) | .94 |
| Average SD (within section) of test scores at baseline | 1.005 | (.08) | .993 | (.08) | .43 |

| Within tracking schools | Bottom section: Mean | Bottom section: SD | Top section: Mean | Top section: SD | p-value (Top = Bottom) |
|---|---|---|---|---|---|
| Proportion female | .49 | (.09) | .50 | (.08) | .38 |
| Average age at endline | 9.04 | (.59) | 9.41 | (.60) | .00 |
| Assigned to contract teacher | .53 | (.49) | .46 | (.47) | .44 |
| Respected assignment | .99 | (.02) | .99 | (.02) | .67 |
| Average standardized test score at baseline | −.81 | (.04) | .81 | (.04) | .00 |
| Average SD (within section) of test scores at baseline | .49 | (.13) | .65 | (.13) | .00 |

Notes: Table shows averages and standard deviations of selected characteristics of the 121 schools in the ETP experiment of Duflo et al. (2011). Of the 121 schools in the experiment, 61 were randomly assigned to the “tracking” treatment, whereas the remaining 60 schools are classified as “nontracking.” The rightmost column displays p-values for tests of equality across groups. Source: Duflo et al. (2011).

Duflo et al. (2011) demonstrate that tracking increased the subsequent test scores of all students, regardless of their initial place in the distribution. The authors rationalize this finding with high-ability students benefitting primarily from positive spillover effects due to more able peers, whereas for students in the low-ability section the direct effect of worse peers is more than outweighed by better targeted instruction.

Given random assignment of students to classrooms, the nontracking schools in the experiment of Duflo et al. (2011) provide an ideal testing ground for the impact of ordinal rank on student outcomes. In what follows we exploit the experimentally generated variation in the within-section rank of children with equal ability to estimate the causal effect of rank on academic achievement. As in Duflo et al.
(2011), we base our results on the initial random assignment of all students who attended first grade in May 2005.18 Specifically, we implement the empirical setup of Duflo et al. (2011), but add a student’s ordinal rank to the following linear model: $$y_{i}=\varphi r_{i}+\boldsymbol {X}_{i}^{\prime }\beta +\alpha \bar{y}_{-i} +\boldsymbol {T}_{i}^{\prime }\theta +\epsilon _{i}\text{,}$$ (6) where yi denotes individual i’s standardized total test score at endline, ri is her section-specific rank (i.e., her percentile in the distribution of pretreatment test scores), and Xi is a vector of individual controls including the baseline score and its square, gender, age, and so forth. $$\bar{y}_{-i}$$ represents the mean standardized baseline score of i’s peers, and Ti marks a vector of treatment indicators. In alternative specifications, we also include school or section fixed effects, which help to account for unobserved heterogeneity at the school or section level. A conceptual issue with estimating the causal effect of rank is that a student’s rank depends, by definition, on the distribution of ability among her peers. It is, therefore, unclear how to disentangle the two without imposing parametric assumptions. For comparability with previous work, our preferred specification controls for average peer skill. Another reason to control for the mean skill level of peers is to contrast rank effects with linear-in-means models of social interactions, which assume that peer effects operate solely through average ability.19 Table 2 presents results from estimating equation (6) using ordinary least squares. To estimate the impact of rank as cleanly as possible, we restrict attention to nontracking schools, that is, to students for whom, conditional on test scores at baseline, variation in ordinal rank is purely random. For completeness, in Online Appendix Table A.2 we present results for all students with nonmissing baseline scores, including students in tracking schools. 
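The regressors in equation (6) can be illustrated with a minimal sketch. It is not the authors' code: the function names (`percentile_rank`, `leave_one_out_mean`, `design_matrix`, `ols_fit`) are hypothetical, the mid-rank treatment of ties is an assumption, and the additional controls, treatment indicators, fixed effects, and school-clustered standard errors are omitted for brevity.

```python
import numpy as np


def percentile_rank(scores):
    """Mid-rank percentile in [0, 1] of each score within one section.

    Assumption: ties share the average rank; the paper does not state
    its tie-breaking convention.
    """
    scores = np.asarray(scores, dtype=float)
    n = len(scores)
    if n < 2:
        return np.zeros(n)
    # below[i] counts peers with a strictly lower score than student i
    below = (scores[None, :] < scores[:, None]).sum(axis=1)
    # ties[i] counts peers with the same score, excluding student i herself
    ties = (scores[None, :] == scores[:, None]).sum(axis=1) - 1
    return (below + 0.5 * ties) / (n - 1)


def leave_one_out_mean(scores):
    """Mean baseline score of each student's section peers (excluding her)."""
    scores = np.asarray(scores, dtype=float)
    return (scores.sum() - scores) / (len(scores) - 1)


def design_matrix(baseline, section):
    """Columns: constant, rank r_i, baseline score, its square, peers' mean."""
    baseline = np.asarray(baseline, dtype=float)
    section = np.asarray(section)
    rank = np.empty_like(baseline)
    peers = np.empty_like(baseline)
    for s in np.unique(section):
        idx = section == s
        rank[idx] = percentile_rank(baseline[idx])
        peers[idx] = leave_one_out_mean(baseline[idx])
    return np.column_stack(
        [np.ones_like(baseline), rank, baseline, baseline**2, peers]
    )


def ols_fit(X, y):
    """Ordinary least squares coefficients via numpy's least-squares solver."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


if __name__ == "__main__":
    # Simulated data only: 8 sections of 25 students with made-up coefficients.
    rng = np.random.default_rng(0)
    baseline = rng.normal(size=200)
    section = np.repeat(np.arange(8), 25)
    X = design_matrix(baseline, section)
    y = X @ np.array([-0.3, 0.5, 0.35, 0.02, 0.5]) + rng.normal(scale=0.5, size=200)
    print(ols_fit(X, y))
```

Because rank is computed within each randomly formed section, two students with the same baseline score can receive different values of the rank regressor, which is the variation the estimates rely on.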
Table 2. Estimates of the impact of rank on test scores in the control group of Duflo et al. (2011).

| Endline test score | (1) | (2) | (3) | (4) | (5) | (6) | (7) |
|---|---|---|---|---|---|---|---|
| Percentile (÷100) | .418 (.269) | .641** (.273) | .649** (.265) | .472** (.225) | .462** (.230) | .639** (.265) | .449** (.225) |
| Test score at baseline | .374*** (.084) | .314*** (.086) | .320*** (.083) | .371*** (.073) | .365*** (.075) | .324*** (.083) | .379*** (.072) |
| Squared test score at baseline | .015 (.016) | .020 (.016) | .020 (.016) | .022 (.015) | .025 (.015) | .020 (.016) | .022 (.015) |
| Contract teacher | .142*** (.051) | .139*** (.048) | .144*** (.049) | .134*** (.045) | | .144*** (.048) | .134*** (.045) |
| Peers’ mean test score | | .550** (.206) | .546** (.215) | .438** (.205) | | | |
| Peers’ mean test score × bottom quarter | | | | | | .518 (.322) | .457 (.285) |
| Peers’ mean test score × second quarter | | | | | | .599 (.364) | .426 (.359) |
| Peers’ mean test score × third quarter | | | | | | .263 (.353) | .022 (.340) |
| Peers’ mean test score × top quarter | | | | | | .793* (.408) | .820** (.382) |
| Constant | −.309* (.168) | −.424** (.164) | −.230 (.250) | | | −.226 (.249) | |
| Additional controls | No | No | Yes | Yes | Yes | Yes | Yes |
| School fixed effects | No | No | No | Yes | No | No | Yes |
| Section fixed effects | No | No | No | No | Yes | No | No |
| R-squared | .240 | .243 | .254 | .390 | .413 | .255 | .391 |
| Number of observations | 2,190 | 2,190 | 2,188 | 2,188 | 2,188 | 2,188 | 2,188 |

Notes: Entries are coefficients and standard errors from estimating equation (6) using ordinary least squares. Heteroskedasticity robust standard errors are clustered at the school level and presented in parentheses. The sample consists of all students who attend nontracking schools and have nonmissing baseline test scores. Going from column (2) to (3) the number of observations decreases because some students are missing information on age and gender. “Additional controls” include age, gender, whether the school is located in the Bungoma district, and whether it was sampled for school based management. “Bottom quarter”, “second quarter”, and so forth are indicator variables for students’ own position in the test score distribution at baseline. *Significant at 10%; **Significant at 5%; ***Significant at 1%.

As reported in Duflo et al. (2011), contract teachers have a positive impact on test scores. More importantly for our purposes, there is a positive relationship between students’ ordinal rank and their academic achievement. With one exception the point estimates in the upper panel are statistically significant at conventional levels, and they are always economically large. Critically, compared to its baseline value in the first column, the estimated impact of rank on test scores actually increases with the inclusion of additional controls, such as peers’ mean test score, peers’ mean test score interacted with a student’s own position in the ability distribution, age, gender, and so forth. Moreover, the point estimate is also robust to using only within-school or within-section variation as sources of identification.
Taking the lowest estimate in the upper panel of Table 2 at face value, a 50 percentile increase in rank increases test scores at endline by about .2 standard deviations.20

The estimates in Table 2 also suggest that peers’ mean ability exerts a positive effect on student achievement. Since our theory deliberately abstracts away from peers as a source of direct externalities, it necessarily falls short of explaining this finding. Such externalities, however, are necessary to rationalize the full set of results in Duflo et al. (2011). Rank effects can explain why students just below the median of the achievement distribution benefitted from tracking without relying on nonconvexities in teachers’ payoffs. For students just above the initial median to do better in tracked sections, however, it must be the case that the negative impact of decreasing rank is outweighed by a countervailing force that is outside of our theoretical model. Nonetheless, in Online Appendix D.1, we show that we obtain qualitatively similar estimates of rank effects when we rely on the full sample of Duflo et al. (2011), that is, when we also include students in tracking schools.

Broadly summarizing, the evidence from this randomized controlled trial suggests that, even conditional on standard measures of peer quality, ordinal rank has an economically meaningful impact on children’s academic achievement. Although the intervention of Duflo et al. (2011) provides us with exogenous variation to test for an impact of rank on achievement, it does not come without drawbacks. As with any experiment, one may wonder about external validity. Moreover, the data do not contain measures of student behavior, which prevents us from probing the prediction that ordinal rank also affects problem behaviors.

3.2. Evidence from New York City Public Schools

To ameliorate these shortcomings, we now turn to administrative data for all students in New York City Public Schools (NYCPS), the largest school district in the United States. The NYCPS data contain student-level information on approximately 1.1 million students per year across the five boroughs of New York City. Our data span the 2003–2004 to 2008–2009 school years and include student race, gender, free and reduced-price lunch eligibility, behavior, attendance, and matriculation with course grades for all students, as well as state math and English/Language Arts (ELA) test scores for students in grades three through eight. Summary statistics for the variables we use in our core specifications are displayed in Table 3.

Table 3. Summary statistics for NYCPS data.

| Variable | Mean | SD |
|---|---|---|
| Behavioral indicators | | |
| Behavioral incident in 5th grade | .083 | (.531) |
| Behavioral incident in 6th grade | .089 | (.285) |
| Behavioral incident in 8th grade | .132 | (.339) |
| Test scores | | |
| 5th grade test score (English/Language arts) | 661 | (36.9) |
| 5th grade test score (Math) | 668 | (41.6) |
| Demographics | | |
| White | .149 | (.356) |
| Black | .314 | (.464) |
| Hispanic | .394 | (.489) |
| Asian | .139 | (.346) |
| Other race | .004 | (.063) |
| Male | .507 | (.500) |
| Female | .493 | (.500) |
| Free lunch | .830 | (.376) |
| English language learner | .093 | (.290) |
| Special education | .087 | (.282) |
| School year | | |
| 2004/05 | .206 | (.405) |
| 2005/06 | .194 | (.395) |
| 2006/07 | .194 | (.396) |
| 2007/08 | .201 | (.401) |
| 2008/09 | .205 | (.403) |

Notes: Entries are means and standard deviations for each variable we use in the NYCPS data. For further details about the NYCPS data see the description in the Data Appendix.

To account for the possibility that low ability students may be inherently more likely to act out, our research design for these data relies on transitions from elementary to middle school, that is, from fifth to sixth grade. During this transition students typically move from small, local elementary schools to larger middle schools, which disrupts ordinal rank when the feeder schools are heterogeneous. Specifically, to estimate the impact of ordinal rank, holding students’ inherent tendency to misbehave fixed, we relate changes in the behavior of equally able children from the same elementary school to changes in rank induced by a switch to different middle schools.

We first exploit the sheer size of the NYCPS data and estimate semiparametric models, which allow us to explore potential nonlinearities in the relationship between rank and behavior. Finding little evidence of important nonlinearities, we then address the potential endogeneity of changes in rank via an instrumental variables strategy based on school zoning regulations.

In order to examine the functional relationship between rank and behavior we estimate semiparametric specifications of the following form:

$$\Delta y_{i}=f(\Delta r_{i})+\boldsymbol {X}_{i}^{\prime }\beta +\textit {School}_{i}+\textit {Year}_{i}+\epsilon _{i}\text{,}$$ (7)

while restricting attention to the set of students who change schools in the transition from fifth to sixth grade. Our behavioral measure, yi, in each year is an indicator equal to one if a student has at least one reported behavioral incident from that year and zero otherwise. Hence, Δyi ∈ {−1, 0, 1}.
The three most common behavioral incidents in our data are “engaging in an altercation or physically aggressive behavior with other student(s)”, “behaving in a manner that disrupts the educational process (horseplay),” or “engaging in verbally rude or disrespectful behavior/insubordination.”21 A student’s rank in fifth grade is the student’s percentile ranking based on her achievement on the New York State exam relative to other students who are in the same school in fifth grade.22 We also compute each student’s position relative to her peers in the sixth-grade school based on her fifth grade score, and denote the difference between these two rankings Δri. Results are reported using both math and ELA scores to compute the change in percentile.

We include a standard set of controls $$\boldsymbol {X}_{i}$$, consisting of the test score in the same subject from the previous year, an exhaustive set of race dummies, sex, free lunch eligibility, English Language Learner (ELL) status, and special education designation. Using these covariates we attempt to control for factors that plausibly influence changes in behavior and might be correlated with rank. $$\boldsymbol {X}_{i}$$ in equation (7) further includes the variance of peers’ test scores and a third order polynomial in peers’ mean score. Finally, we add year fixed effects and school fixed effects (for both a student’s elementary and middle school). By controlling for school fixed effects we account for the fact that schools might have heterogeneous propensities to classify the same demeanor as a behavioral incident.

Our semiparametric estimates of the link between changes in rank and changes in behavior are displayed in Figure 5. Independent of whether we calculate rank based on ELA or math scores, the behavior of students whose rank decreases in going from elementary to middle school worsens significantly compared to students whose relative standing improves.
Taking the estimates based on ELA scores at face value, a student experiencing a 50 percentile decline in rank is approximately 2 percentage points more likely to have a behavioral incident on record than a student whose rank improves by 50 percentiles. Given sample means of 8.7% for sixth grade and 4.9% for fifth grade, our estimates are nontrivial in size.23

Figure 5. Evidence from New York City public schools. Note: Panels show semiparametric estimates and the associated 95% confidence intervals of the effect of a change in a student’s class percentile rank (in going from elementary to middle school) on the change in an indicator variable for whether she was involved in a behavioral incident, cf. equation (7). The top panel constructs percentiles based on English/Language Arts (ELA) test scores, whereas the lower one uses math test scores. Estimates are obtained using cubic B-splines with four nodes that divide the sample equally. Section 3.2 and the Online Data Appendix provide additional information on the exact econometric specification as well as the sample.
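As an illustration of how the change in percentile rank $$\Delta r_{i}$$ can be computed, the following is a minimal sketch rather than the procedure actually used for the NYCPS data. It assumes that a student’s own score is part of each peer list and that ties receive the mid-rank; neither convention is stated in the text, and the function names and scores are hypothetical.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(score, peer_scores):
    """Percentile (0-100) of `score` within `peer_scores`.

    Assumption: ties are handled with the mid-rank convention.
    """
    ordered = sorted(peer_scores)
    below = bisect_left(ordered, score)            # strictly lower scores
    ties = bisect_right(ordered, score) - below    # scores equal to `score`
    return 100.0 * (below + 0.5 * ties) / len(ordered)


def change_in_rank(score, elementary_scores, middle_scores):
    """Rank in the sixth-grade school minus rank in the fifth-grade school,
    both evaluated at the student's fifth-grade score."""
    return (percentile_rank(score, middle_scores)
            - percentile_rank(score, elementary_scores))


# Hypothetical fifth-grade scores (each list includes the student's own 670).
elementary = [610, 640, 655, 670, 700]
middle = [640, 670, 690, 705, 720, 735, 750, 760]

delta_r = change_in_rank(670, elementary, middle)
print(delta_r)  # -51.25: the student falls from the 70th to the 18.75th percentile
```

Holding the fifth-grade score fixed in both rankings means that the computed change reflects only the change in peer composition, not any change in the student’s own achievement.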
Although the NYCPS data allow us to control for students’ natural proclivities to cause trouble by relating changes in behavior to changes in rank induced by the transition from elementary to middle school, there exists the possibility that estimates of equation (7) are driven by reverse causality. That is, behavioral problems during sixth grade might have caused changes in test scores and, therefore, class rank. Another concern is systematic choice of school. Students who chose an academically less challenging middle school might have experienced less of an increase in behavioral problems, even if their rank had not improved.

To address these issues, we also estimate two-stage least squares (2SLS) specifications in which we instrument for a student’s change in rank with the predicted change based on the schools they were zoned to attend (given their residential address). Specifically, we estimate the following linear model:

$$\Delta y_{i}=\varphi \Delta r_{i}+\boldsymbol{X}_{i}^{\prime }\beta +\textit{School}_{i}+\textit{Year}_{i}+\epsilon _{i}\text{,}$$ (8)

where the first stage is given by

\begin{equation*} \Delta r_{i}=\delta \widehat{\Delta r_{i}}+\boldsymbol{X}_{i}^{\prime }\gamma +\textit{School}_{i}+\textit{Year}_{i}+\nu _{i}\text{,} \end{equation*}

and $$\widehat{\Delta r_{i}}$$ denotes student $$i$$’s counterfactual change in rank at the beginning of sixth grade (using fifth-grade test scores) had all students attended the schools for which they were zoned. In symbols, let $$a_{i,t-1}$$ denote student $$i$$’s test score in fifth grade and let $${rank}_{I}\left( a_{i,t-1}\right)$$ be the percentile ranking of a student with score $$a_{i,t-1}$$ among the set of students $$I$$, given their respective test scores at $$t-1$$.
Then,

\begin{equation*} \widehat{\Delta r_{i}}\equiv {rank}_{S_{i,t}}\left( a_{i,t-1}\right) -{rank}_{S_{i,t-1}}\left( a_{i,t-1}\right) \text{,} \end{equation*}

where $$S_{i,t-1}$$ and $$S_{i,t}$$ are the sets of students who are zoned for the same elementary and middle school as $$i$$, respectively. Intuitively, our IV approach compares observationally identical students from the same elementary school who experience a differential change in rank because school zoning regulations led them to attend different middle schools.

Table 4 presents the resulting 2SLS estimates of the effect of school rank on behavior, as well as the corresponding OLS ones. In the upper panel we use ELA scores to construct rank, whereas math scores are used in the lower one. To facilitate comparisons with standard linear-in-means models, we refrain from controlling for higher order moments of the distribution of scores and simply report the coefficient on peers’ mean test score.24 Based on the OLS point estimates, one would expect a student experiencing a 50 percentile decline in rank to be 3 to 5 percentage points more likely to have a behavioral incident on record than a student whose rank improves by 50 percentiles, which is consistent with our previous semiparametric results.

Table 4. Estimates of the short-run impact of rank on behavior in the NYCPS data.

Dependent variable in all columns: Δ Behavioral incident (Grade 5 → Grade 6).

A. Percentile based on ELA test scores

| Independent variable | OLS | 2SLS | OLS | 2SLS | OLS | 2SLS |
| --- | --- | --- | --- | --- | --- | --- |
| Δ Percentile (÷100) | −.030*** | −.069** | −.031*** | −.050 | −.032*** | −.047 |
| | (.005) | (.034) | (.004) | (.051) | (.005) | (.059) |
| Peers’ mean test score (÷100) | −.027* | −.040** | −.079* | −.085* | | |
| | (.015) | (.020) | (.042) | (.046) | | |
| Individual controls | Yes | Yes | Yes | Yes | Yes | Yes |
| Year fixed effects | Yes | Yes | Yes | Yes | No | No |
| School fixed effects | No | No | Yes | Yes | No | No |
| School–cohort fixed effects | No | No | No | No | Yes | Yes |
| First stage F-statistic | – | 884.6 | – | 348.1 | – | 248.9 |
| Shea’s partial R-squared | – | .033 | – | .006 | – | .004 |
| R-squared | .009 | – | .067 | – | .103 | – |
| Number of observations | 122,792 | 118,699 | 122,792 | 118,699 | 122,792 | 118,699 |

B. Percentile based on math test scores

| Independent variable | OLS | 2SLS | OLS | 2SLS | OLS | 2SLS |
| --- | --- | --- | --- | --- | --- | --- |
| Δ Percentile (÷100) | −.047*** | −.072** | −.051*** | −.087* | −.053*** | −.123** |
| | (.006) | (.030) | (.005) | (.047) | (.006) | (.054) |
| Peers’ mean test score (÷100) | −.033*** | −.040*** | −.042 | −.055 | | |
| | (.012) | (.016) | (.031) | (.034) | | |
| Individual controls | Yes | Yes | Yes | Yes | Yes | Yes |
| Year fixed effects | Yes | Yes | Yes | Yes | No | No |
| School fixed effects | No | No | Yes | Yes | No | No |
| School–cohort fixed effects | No | No | No | No | Yes | Yes |
| First stage F-statistic | – | 723.9 | – | 405.2 | – | 306.2 |
| Shea’s partial R-squared | – | .038 | – | .008 | – | .005 |
| R-squared | .009 | – | .065 | – | .100 | – |
| Number of observations | 131,294 | 126,924 | 131,294 | 126,924 | 131,294 | 126,924 |

Notes: Entries are coefficients and standard errors from estimating the linear model in equation (8) by ordinary least squares as well as two-stage least squares. The dependent variable is listed at the top of the table. The instrument for Δ Percentile is the predicted change in percentile based on school zoning regulations, as explained in the text. The IV specifications contain fewer observations because we do not observe addresses of all students in our data. In the upper panel a student’s percentile in his school is calculated based on ELA test scores, whereas the lower panel uses math test scores. Heteroskedasticity-robust standard errors are clustered at the school level and reported in parentheses. In addition to the variables shown in the table, we control for test score in the same subject from the previous year, an exhaustive set of race dummies, sex, free lunch eligibility, ELL status, and special education designation. To facilitate comparisons between Tables 4 and 5 the set of students included in the analysis has been restricted to those observed from fifth through eighth grade. *Significant at 10%; **Significant at 5%; ***Significant at 1%.

Due to the large number of observations, our OLS estimates are very precise. Unfortunately, this is not the case when we estimate equation (8) by 2SLS.
Although the first stage F-statistic is well above conventional critical values (Stock and Yogo 2005), our instrument explains little residual variation in the excluded variable, as evidenced by small values of Shea’s partial $$R^{2}$$ (Shea 1997). One potential explanation for this is that only 45.3% (53.9%) of students attend the middle (elementary) school for which they are zoned. Nevertheless, not including school fixed effects, the 2SLS estimates are at least as large as their OLS counterparts and statistically significant. If we include school or section fixed effects the standard errors increase by as much as an order of magnitude. The coefficients, however, continue to be negative and economically large.

To help judge the magnitude of the implied effect sizes, consider a student whose rank increases by 25 percentiles as she goes from fifth to sixth grade (which corresponds to about one standard deviation in our data). Based on the estimates in Table 4, her behavior should improve by 0.7 percentage points or more. Taking the median coefficient on peers’ mean test scores at face value, to achieve an equivalent improvement her peers’ mean score would need to increase by slightly more than one standard deviation. Although such a comparison is speculative due to the variability in our estimates, it suggests that rank-based peer effects are economically important.

In order to investigate whether these effects persist beyond sixth grade, we have replicated the analysis in Table 4, focusing on the change in behavior from fifth to eighth grade instead. Table 5 presents the results. Interestingly, all point estimates are negative and economically meaningful. Six out of the eight estimates are even larger than those in the previous table. Although the 2SLS results are, again, fairly imprecise, the sum of the evidence suggests that behavioral effects from changes in ordinal rank do not dissipate over time.

Table 5.
Estimates of the medium-run impact of rank on behavior in the NYCPS data.

Dependent variable in all columns: Δ Behavioral incident (Grade 5 → Grade 8).

A. Percentile based on ELA test scores

| Independent variable | OLS | 2SLS | OLS | 2SLS | OLS | 2SLS |
| --- | --- | --- | --- | --- | --- | --- |
| Δ Percentile (÷100) | −.058*** | −.098** | −.057*** | −.148** | −.060*** | −.244*** |
| | (.006) | (.038) | (.006) | (.068) | (.006) | (.078) |
| Peers’ mean test score (÷100) | −.050*** | −.067*** | −.066 | −.102* | | |
| | (.022) | (.025) | (.048) | (.055) | | |
| Individual controls | Yes | Yes | Yes | Yes | Yes | Yes |
| Year fixed effects | Yes | Yes | Yes | Yes | No | No |
| School fixed effects | No | No | Yes | Yes | No | No |
| School–cohort fixed effects | No | No | No | No | Yes | Yes |
| First stage F-statistic | – | 884.6 | – | 348.1 | – | 248.9 |
| Shea’s partial R-squared | – | .033 | – | .006 | – | .004 |
| R-squared | .021 | – | .070 | – | .101 | – |
| Number of observations | 122,792 | 118,699 | 122,792 | 118,699 | 122,792 | 118,699 |

B. Percentile based on math test scores

| Independent variable | OLS | 2SLS | OLS | 2SLS | OLS | 2SLS |
| --- | --- | --- | --- | --- | --- | --- |
| Δ Percentile (÷100) | −.078*** | −.055 | −.086*** | −.019 | −.090*** | −.077 |
| | (.007) | (.034) | (.006) | (.059) | (.006) | (.074) |
| Peers’ mean test score (÷100) | −.037** | −.029 | −.040 | −.022 | | |
| | (.018) | (.022) | (.039) | (.041) | | |
| Individual controls | Yes | Yes | Yes | Yes | Yes | Yes |
| Year fixed effects | Yes | Yes | Yes | Yes | No | No |
| School fixed effects | No | No | Yes | Yes | No | No |
| School–cohort fixed effects | No | No | No | No | Yes | Yes |
| First stage F-statistic | – | 723.9 | – | 405.2 | – | 306.2 |
| Shea’s partial R-squared | – | .038 | – | .008 | – | .005 |
| R-squared | .021 | – | .069 | – | .101 | – |
| Number of observations | 131,294 | 126,924 | 131,294 | 126,924 | 131,294 | 126,924 |

Notes: Entries are coefficients and standard errors from estimating the linear model in equation (8) by ordinary least squares as well as two-stage least squares. The dependent variable is listed at the top of the table. The instrument for Δ Percentile is the predicted change in percentile based on school zoning regulations, as explained in the text. The IV specifications contain fewer observations because we do not observe addresses of all students in our data. In the upper panel a student’s percentile in his school is calculated based on ELA test scores, whereas the lower panel uses math test scores. Heteroskedasticity-robust standard errors are clustered at the school level and reported in parentheses. In addition to the variables shown in the table, we control for test score in the same subject from the previous year, an exhaustive set of race dummies, sex, free lunch eligibility, ELL status, and special education designation. To facilitate comparisons between Tables 4 and 5 the set of students included in the analysis has been restricted to those observed from fifth through eighth grade. *Significant at 10%; **Significant at 5%; ***Significant at 1%.

In Online Appendix Tables A.3 and A.4, we examine how children who do and do not comply with school zoning regulations differ from each other on predetermined variables. Compliers have a lower likelihood of behavioral problems in fifth grade, are more likely to be white or Asian, are less likely to be enrolled in special education classes, and have slightly higher socioeconomic status (as proxied by our free lunch indicator). Importantly, a student’s “predicted change in rank” (i.e., our instrument) is statistically indistinguishable across the two groups, as are fifth-grade test scores. Furthermore, behavioral incidents in fifth grade are, conditional on the set of covariates in equation (8), uncorrelated with “predicted change in rank.” This is true for the set of all children as well as within the set of compliers.

For our IV results to be driven by violations of the exclusion restriction, it would have to be the case that students who would have experienced worsening behavior regardless of their change in rank are systematically zoned for schools in which they experience larger declines in relative standing than children whose behavior improved. In light of the absence of selection on initial behavior and test scores, there is little evidence to suggest that this may be the case, although we hasten to point out that the validity of an instrument can never be fully established.
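To make the logic of equation (8) and its first stage concrete, the following is a deliberately stripped-down sketch of two-stage least squares with a single instrument. It omits the controls, fixed effects, and clustered standard errors used in the paper, and the data are artificial numbers constructed so that the instrument is exactly uncorrelated with an unobserved confounder; all variable and function names are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def cov(a, b):
    """Sample covariance (population normalization; it cancels in the ratios)."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)


def ols(y, x):
    """Intercept and slope from regressing y on a constant and x."""
    slope = cov(x, y) / cov(x, x)
    return mean(y) - slope * mean(x), slope


def two_stage_least_squares(y, x, z):
    """First stage: regress x on z. Second stage: regress y on fitted x."""
    intercept, delta = ols(x, z)
    x_hat = [intercept + delta * zi for zi in z]
    return ols(y, x_hat)[1]


# Artificial data: z plays the role of the zoning-predicted change in rank,
# u is an unobserved confounder that is orthogonal to z by construction.
z = [-1.0, -1.0, 1.0, 1.0]
u = [-1.0, 1.0, -1.0, 1.0]
delta_r = [0.5 * zi + ui for zi, ui in zip(z, u)]            # actual change in rank
delta_y = [-0.06 * ri + 0.1 * ui for ri, ui in zip(delta_r, u)]

ols_slope = ols(delta_y, delta_r)[1]
iv_slope = two_stage_least_squares(delta_y, delta_r, z)
print(round(ols_slope, 4), round(iv_slope, 4))  # 0.02 -0.06
```

In this constructed example OLS is contaminated by the confounder, whereas 2SLS recovers the coefficient of −0.06 used to generate the data. The recovery hinges on the instrument being unrelated to the confounder, which is the assumption discussed in the paragraph above.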
The evidence in Online Appendix Tables A.3 and A.4 does, however, suggest that identification of the 2SLS local average treatment effect comes from higher socioeconomic status children who are initially better behaved than average. If one believes that children who are initially well-behaved and from wealthier backgrounds are less affected by their peers, then there is reason to think that the point estimates above understate the average treatment effect.

3.3. Discussion

Taken together, the findings in this section suggest that students’ test scores decrease and problem behaviors worsen as their relative standing declines. These results are noteworthy not only because they point to a hitherto underexplored source of peer effects, but also because they come from very different, independent settings. The fact that we find an effect of ordinal rank for primary school children in Kenya as well as for middle school students in the United States suggests a more general phenomenon.

Although the data are consistent with the idea that comparative advantage shapes social interactions, they cannot rule out other mechanisms. For instance, another explanation for our findings is that teacher behavior depends on the entire distribution of student ability. Suppose that a student’s perceived ability matters for how much teachers invest in her. If teachers invest more in students who are thought to be smarter, and if ordinal rank serves as a (noisy) signal about ability, then a teacher-focused explanation is consistent with the finding that test scores increase with adolescents’ rank. If lack of teacher attention causes students to act out, then such an explanation can also rationalize why problem behaviors worsen as students’ rank declines. The data requirements to test this alternative hypothesis are very demanding.
To implement a convincing test, we would need objective information on how much effort and attention teachers place on students at every part of the ability distribution. We are unaware of such data.25 Moreover, there may be additional theories that predict a relationship between rank and student outcomes, and we do not have a principled way to narrow down the set of plausible mechanisms.

4. An Experiment to Test the Self-Selection Mechanism

Rather than trying to estimate the precise share of the relationship between rank and outcomes that is attributable to a particular mechanism, we content ourselves with a “framed” field experiment that achieves two related goals: (i) rule out that the teacher channel is the sole driver of rank effects, and (ii) explicitly test the idea of self-selection based on comparative advantage. To this end, we recruited nearly six hundred students from two open-enrollment public middle schools in the Houston Independent School District and incentivized them to solve mazes in a custom-made computer game. According to our conversations with principals and teachers, children at these schools have extensive experience playing simple video games on their phones or even on the schools’ computers.

Our decision to embed the experiment in the context of a game reflects the desire to replicate as many as possible of the myriad ways in which situational variables may affect students’ behavior, all the while maintaining enough control to shed light on the mechanism through which rank affects outcomes. In light of our second goal, we settled on an experimental design that randomly offered students the opportunity to engage in both constructive and destructive behavior, or only the former.
Relative to the situation in which the “choice mechanism” is shut down, our theory of self-selection in social interactions predicts that the opportunity to sabotage others crowds out constructive behavior to a greater extent among low-ranked individuals than among higher-ranked ones.

4.1. Design Specifics

Each student participated in one experimental session, which was held in her school’s computer lab (see Online Appendix F for details regarding recruitment, parental consent, implementation logistics, and so forth, and for a copy of the experimental instructions). Sessions lasted about sixty minutes and included, on average, twenty children from the same school.

In the first stage of the game, students were asked to solve either five or twenty mazes, depending on the experimental session.26 “Solving” a computerized maze entailed using the arrow keys to steer a cursor from the entrance of the maze to its exit (see Figure 6 for screenshots). Children earned $0.25 per maze that they successfully completed in this stage. All students worked on the same set of mazes and were ranked (among participants in the same session) according to the time it took them to complete the task. The ordinal ranking as well as children’s cardinal performance was then displayed on everyone’s screens in order to make both common knowledge, similar to the scoreboard feature in many of the most popular video games.

In the second stage, children in the control group were given the opportunity to practice on up to 20 additional mazes at a fixed, randomly determined cost per maze. Students were instructed that ten of these mazes would reappear in the third stage of the game. Before the software determined the cost of practicing, students were asked for their maximal willingness to pay to see and work on a maze.
Children whose willingness to pay exceeded the cost per maze were allowed to practice on as many mazes as they wished, paying for each one as they went along.27 Participants whose willingness to pay did not exceed the session-specific cost were not allowed to practice at all. Our procedure for eliciting students’ willingness to pay thus resembles the well-known BDM mechanism (Becker, DeGroot, and Marschak 1964), with the important difference that quantity was not fixed at one. Instead, we allowed students to choose quantity knowing the realized price per maze.28

Figure 6. Sample screenshots from the maze-solving experiment.

Children in the treatment group were also allowed to practice, and the software elicited their willingness to pay for practicing in the same way. In addition, they were asked how much they would be willing to spend in order to “slime” the screen of a peer of their choosing. Sliming another child’s screen carried no monetary benefit, but it prevented the other student from practicing by blocking a portion of the maze on which she was working (cf. Figure 6). Both activities were nonrival in the sense that children could practice and slime other participants at the same time. However, students were only allowed to engage in a particular activity if their stated willingness to pay exceeded the respective, randomly determined price. If allowed to slime, students could do so as often as they were willing to incur the cost. A ticker publicly displayed who slimed whom in real time.

An important advantage of this design over an alternative one with fixed prices that are set in advance is that it requires no knowledge of students’ approximate willingness to pay.
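The elicitation rule described above can be summarized in a short sketch. It is only an illustration of the stated logic and not the software used in the experiment: the function name, the use of cents as the unit, and the example amounts are hypothetical, and the way in which prices were drawn is not specified in the text, so the realized price is simply passed in as an argument.

```python
def resolve_activity(stated_wtp_cents, price_cents, desired_units, max_units=None):
    """Apply the BDM-like rule for one activity (practicing or sliming).

    A student may engage in the activity only if her stated willingness to
    pay strictly exceeds the randomly determined price. She then chooses how
    many units to buy at that price (unlike standard BDM, where the quantity
    is fixed at one). Returns (units purchased, total cost in cents).
    """
    if stated_wtp_cents <= price_cents:
        return 0, 0  # not allowed to engage in the activity at all
    units = desired_units if max_units is None else min(desired_units, max_units)
    return units, units * price_cents


# Hypothetical student: willing to pay 30 cents per practice maze.
allowed = resolve_activity(stated_wtp_cents=30, price_cents=20,
                           desired_units=5, max_units=20)
excluded = resolve_activity(stated_wtp_cents=30, price_cents=35,
                            desired_units=5, max_units=20)
print(allowed, excluded)  # (5, 100) (0, 0)
```

Because the price is drawn independently of the stated amount, the stated willingness to pay only determines whether a student is allowed to engage in the activity, not the price she pays per unit.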
In fact, by explicitly eliciting this otherwise unobserved variable we can directly assess how the desirability of practicing changes once students can engage in an alternative, disruptive activity, and how this effect varies with rank.

In the final stage of the game, all children were asked to complete ten mazes in twenty minutes, for a payoff of $3.00 per maze they successfully solved. On average, students earned a total of $29.30, including a $2 show-up fee. Neither students’ earnings nor their performance during the last stage of the game were made common knowledge. It is also worth noting that at no point during the experiment did monetary payoffs depend on ordinal rank, and that students were made aware of the payoff structure.

Administrative constraints imposed by the schools prevented us from randomizing students into experimental sessions. Within an experimental session, however, all students were randomly assigned to either the treatment or the control group. Thus, whether the children were faced with the opportunity to disrupt their peers was purely random. By comparing the relationship between ordinal rank and students’ willingness to practice across treatment and control, our experimental design permits us to assess how the opportunity to self-select into a second, disruptive activity causes behavior to change among different sets of students, that is, lower- versus higher-ranked ones.

It is important to emphasize that our theory of self-selection based on comparative advantage makes no prediction regarding the relationship between rank and willingness to practice in the control group. Students in the control group can engage in only one activity, which leaves no room for self-selection to affect behavior. Any correlation between these students’ choices and their ordinal rank must, therefore, be due to other, unmodeled factors (say, intrinsic preferences over rank, or varying marginal returns to practicing).
The control group is nonetheless useful because it allows us to establish a baseline correlation between rank and willingness to practice. If the self-selection mechanism mediates social interactions, then, when given a chance to be disruptive, lower-ranked students, who have a relative disadvantage at solving mazes, should be disproportionately likely to substitute away from practicing.

4.2. Experimental Results

Table 6 presents descriptive statistics for the children in our experiment, by treatment and control status. With the exception of grade level, students in the treatment and control group are statistically indistinguishable. Although the difference in grade level is economically small, it is statistically significant at the 10% level. Notwithstanding the fact that a Kolmogorov–Smirnov test is unable to reject the null hypothesis that the p-values in the rightmost column are uniformly distributed on the unit interval (p = 0.305), as one would expect under truly random assignment, we address the issue of imbalance by presenting results that do and do not condition on covariates. If anything, our findings become stronger when we control for observables. In addition, we show in Online Appendix Table A.5 that the results are qualitatively robust to conducting our analysis within each grade level. Although estimates disaggregated by grade are far less precise, out of the thirty coefficients in Online Appendix Table A.5 only two change sign compared to our main analysis. Therefore, it seems unlikely that imperfect randomization is driving our results.

Table 6. Observable characteristics of students in our framed field experiment, by treatment status.

| | All: Mean | All: SD | Treatment: Mean | Treatment: SD | Control: Mean | Control: SD | p-value (Treatment = Control) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Male | .538 | (.499) | .537 | (.499) | .538 | (.499) | .971 |
| Minority | .873 | (.333) | .866 | (.341) | .882 | (.324) | .843 |
| Grade | 6.89 | (.762) | 6.97 | (.715) | 6.80 | (.803) | .077 |
| Special education | .134 | (.341) | .134 | (.341) | .134 | (.341) | .988 |
| Limited English proficiency | .334 | (.472) | .339 | (.474) | .328 | (.470) | .864 |
| Missing demographic information | .023 | (.149) | .033 | (.180) | .013 | (.115) | .103 |
| Self-assessed ability (pre-period; scale 1–10) | 6.08 | (1.83) | 6.19 | (1.79) | 5.96 | (1.87) | .246 |
| Baseline performance (seconds per compl. maze) | 27.1 | (15.8) | 26.2 | (18.5) | 28.0 | (12.1) | .342 |
| Number of students | 573 | | 302 | | 271 | | |

Notes: Table shows basic descriptive statistics for students who participated in our framed field experiment, by treatment status. The rightmost column displays p-values for tests of equality in means across the treatment and control groups. A Kolmogorov–Smirnov test is unable to reject the null hypothesis that the p-values in the rightmost column are uniformly distributed on the unit interval (p = 0.305). For additional information on the experiment see the main text. The Data Appendix provides precise definitions of all variables.
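As an illustration of the balance check mentioned in the notes to Table 6, the following sketch computes the one-sample Kolmogorov–Smirnov distance between the eight rounded p-values in the rightmost column and the uniform distribution on the unit interval. It is a sketch under stated assumptions: the function name is hypothetical, the inputs are the rounded values printed in the table rather than the authors' unrounded ones, and only the test statistic is computed, so it is not meant to reproduce the reported p = 0.305.

```python
def ks_statistic_uniform(values):
    """One-sample Kolmogorov-Smirnov distance between the empirical CDF of
    `values` and the Uniform(0, 1) CDF, F(x) = x."""
    xs = sorted(values)
    n = len(xs)
    # Largest gap where the empirical CDF lies above the uniform CDF.
    d_plus = max((i + 1) / n - x for i, x in enumerate(xs))
    # Largest gap where the empirical CDF lies below the uniform CDF.
    d_minus = max(x - i / n for i, x in enumerate(xs))
    return max(d_plus, d_minus)


# Rounded p-values from the rightmost column of Table 6.
TABLE6_PVALUES = [0.971, 0.843, 0.077, 0.988, 0.864, 0.103, 0.246, 0.342]

d_stat = ks_statistic_uniform(TABLE6_PVALUES)
print(f"KS distance from uniform: {d_stat:.3f}")
```

With only eight p-values, a distance of this size is well within what random assignment would produce, in line with the failure to reject uniformity reported in the notes to Table 6.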
Pooling over all 573 students who completed the experiment, Table 7 displays our findings.
The numbers therein correspond to the coefficients on rank ($r_{i,s}$) and rank interacted with a treatment indicator ($T_{s}$) in the following econometric model:

$$y_{i,s}=\varphi r_{i,s}+\gamma T_{s}\times r_{i,s}+\alpha b_{i}+\boldsymbol{X}_{i}^{\prime}\beta+\mu_{s}+\varepsilon_{i,s},$$ (9)

where $y_{i,s}$ is the outcome of interest for student $i$ in experimental session $s$, $b_{i}$ is her baseline performance at solving mazes, and $\boldsymbol{X}_{i}$ denotes a vector of controls, which consists of all covariates that are listed in Table 6. Since the costs of practicing and sliming vary at the level of the experimental session, we also include $\mu_{s}$, a session fixed effect. To allow for arbitrary forms of correlation in the residuals of children within the same experimental session, all standard errors are clustered at the session level. Given the small number of clusters, we follow the bootstrapping procedure recommended by Cameron, Gelbach, and Miller (2008) whenever we report p-values for hypothesis tests.

Table 7. Experimental results.

| | Willingness to pay for practicing | | Total money spent on practicing | | Total money spent on sliming | |
| --- | --- | --- | --- | --- | --- | --- |
| | (1) | (2) | (3) | (4) | (5) | (6) |
| Percentile (÷100) | −.089** | −.103*** | −.061 | −.081 | | |
| | (.035) | (.037) | (.074) | (.073) | | |
| Percentile (÷100) × treatment | .075* | .089** | .143 | .165* | −.371** | −.390*** |
| | (.041) | (.042) | (.092) | (.092) | (.234) | (.165) |
| H0: coefficient on percentile = 0 | .022 | .009 | .412 | .289 | | |
| H0: coefficient on percentile × treatment = 0 | .078 | .045 | .142 | .090 | .035 | .004 |
| Controls | No | Yes | No | Yes | No | Yes |
| Experimental session fixed effects | Yes | Yes | Yes | Yes | Yes | Yes |
| Sample | Treatment and control | Treatment and control | Treatment and control | Treatment and control | Treatment | Treatment |
| Mean of dependent variable | .201 | .201 | .204 | .204 | .137 | .137 |
| R-squared | .138 | .155 | .160 | .172 | .110 | .142 |
| Number of observations | 573 | 573 | 573 | 573 | 302 | 302 |

Notes: Entries are coefficients and standard errors from estimating the linear model in equation (9) by ordinary least squares. The dependent variables are listed at the top of each column. All specifications control for baseline performance and experimental session fixed effects. Additional controls include gender, grade, a minority indicator, special education status, limited English proficiency, self-assessed ability, and indicator variables for missing demographic information. Heteroskedasticity-robust standard errors are clustered by experimental session and reported in parentheses. To account for the small number of clusters, reported p-values are based on the wild bootstrap procedure suggested by Cameron et al. (2008) with 10,000 iterations. *Significant at 10%; **Significant at 5%; ***Significant at 1%.

The results in columns (1) and (2) show that, among children in the control group, willingness to pay for practicing is negatively correlated with ordinal rank.29 That is, conditional on baseline performance, a student at the very bottom of her reference group is willing to spend 10 cents more on practicing a maze than an observationally similar one who ranks at the top of the distribution because she happened to be paired with less able peers. Given a sample mean of 20 cents, this disparity is economically large and statistically significant (p = 0.009). Interestingly, the correlation between ordinal rank and willingness to practice disappears among students in the treatment group. The difference in the slope estimates between the two groups is not only statistically significant (p = 0.045), but the coefficient on the interaction term is of opposite sign and almost as large as that on rank itself.
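To make the comparison of slopes concrete, the short sketch below works through the arithmetic implied by the rank and rank × treatment coefficients in column (2) of Table 7. The slope for the control group is the coefficient on rank, and the slope for the treatment group is the sum of the two coefficients. The variable and function names are illustrative, and the calculation uses only the rounded point estimates printed in the table.

```python
# Point estimates from Table 7, column (2): dollars per unit of percentile/100.
PHI = -0.103    # coefficient on rank, i.e. the control-group slope
GAMMA = 0.089   # coefficient on rank x treatment


def bottom_minus_top_cents(slope):
    """Predicted gap, in cents, between a student at the bottom (rank 0)
    and one at the top (rank 1) of the reference group."""
    return round(-slope * 100, 1)


control_gap = bottom_minus_top_cents(PHI)            # control group
treatment_gap = bottom_minus_top_cents(PHI + GAMMA)  # treatment group
print(control_gap, treatment_gap)
```

Under these point estimates the bottom-versus-top gap in willingness to pay is roughly 10 cents in the control group but only about 1 cent in the treatment group, which is the sense in which the correlation with rank disappears.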
Hence, when children are given the choice between practicing and disrupting their peers, it is no longer the case that lower-ranked students invest more than higher-ranked ones. The next two columns show that this conclusion is qualitatively robust to examining actual money spent on practicing rather than self-declared willingness to pay. Although the point estimates lose much of their precision—in large part because most students practice on only one or two mazes—the sign pattern of the coefficients is identical to that in columns (1) and (2), and the coefficient on the interaction term continues to be economically large and marginally significant (p = 0.090). The remaining two columns demonstrate that ordinal rank is negatively correlated with how much money children in the treatment group spent on disrupting others. Taking the point estimate in column (6) at face value, a student who is paired with more able peers and, therefore, ranks at the bottom of her peer group spends 39 cents more on sliming than an identical child who happens to be at the very top of her reference distribution (p = 0.004). Compared to a sample mean of 14 cents, the effect of rank on sliming is very large. Given that children were randomly assigned to either treatment or control, we conclude that the opportunity to engage in a second, disruptive activity caused lower-ranked students to substitute away from investing. Instead, they paid to engage in socially wasteful behavior. Note, in the control group, where students can only choose between practicing and waiting, there is no room for self-selection based on comparative advantage to affect behavior. But in the treatment group, where children are faced with the choice between two very different activities, students with a relative disadvantage at solving mazes opted out of the very activity in which they did poorly compared to their peers. 
Our experimental results, therefore, suggest that students’ behavior is mediated by self-selection according to their relative standing.30

5. Concluding Remarks

Drawing on traditional models of selection in the labor market, we propose a theory of social interactions based on self-selection and comparative advantage. When self-selection is the guiding principle of peer group formation, an individual’s behavior is an equilibrium outcome. It depends on where in the ability distribution she falls, and on the shadow prices that clear the social market. That is, in our model, selection into peer groups is based on comparative advantage, and peer effects arise due to the endogenous sorting of agents into peer groups within a social setting.

An important implication that distinguishes our theory from traditional models of peer effects is that student outcomes should depend on ordinal rank within a social environment. Our empirical findings show that this key prediction is borne out in one randomized controlled trial in Kenya as well as administrative data from the United States. To further probe the channel through which ordinal rank affects behavior, we implemented a “framed” field experiment with nearly 600 public school students in Houston. By isolating the choice mechanism, our experimental evidence speaks directly to the idea of self-selection in social interactions.

Since our Roy model does not cast peers as a source of direct (positive or negative) externalities, it has the potential to rationalize many of the disparate findings in the empirical literature within a single, tractable framework. In particular, since individuals’ ordinal rank may deteriorate as a result of many well-intentioned interventions, our theory provides a simple explanation for why ostensibly better peers do not always lead to more favorable outcomes (see, e.g., Carrell et al. 2013; Kling, Ludwig, and Katz 2005; Kling, Liebman, and Katz 2007; Sanbonmatsu et al. 2006).
For examples of how a Roy model of social interactions can reconcile much of the existing empirical evidence we refer interested readers to the Online Appendix. In the Appendix, we also discuss the implications of self-selection based on comparative advantage for identifying “traditional” peer effects, and for predicting the efficacy of social interventions ex ante.

1 See Epple and Romano (2011) and Ioannides (2011) for recent surveys.

2 Many ethnographers describe similar phenomena around the globe: the Buraku Outcastes of Japan (De Vos and Wagatsuma 1966), Blacks in America (Fordham and Ogbu 1986), the Maori of New Zealand (Chapple, Jefferies, and Walker 1997), Blacks on Chicago’s South Side circa 1930 (Drake and Cayton 1945), and the working class in Britain (Willis 1977), among others.

3 For confirmatory evidence, see Booij, Leuven, and Oosterbeek (2017), Bursztyn and Jensen (2015), Carrell, Fullerton, and West (2009), Cooley-Fruehwirth (2013), Duflo, Dupas, and Kremer (2011), Feld and Zölitz (2017), Goux and Maurin (2007), Hanushek et al. (2003), Hoxby (2003), Hoxby and Weingarth (2005), Imberman, Kugler, and Sacerdote (2012), Mas and Moretti (2009), or Sacerdote (2001). For null or negative findings, see Angrist and Lang (2004), Cullen, Jacob, and Levitt (2006), Kang (2007), Guryan, Kroft, and Notowidigdo (2009), or Sanbonmatsu et al. (2006).

4 Throughout the paper, ordinal rank will be used to refer to a student’s percentile in a group.

5 Following our working paper in 2011 (Cicala, Fryer, and Spenkuch 2011), others have also documented a relationship between ordinal rank and student outcomes. Murphy and Weinhardt (2014) use administrative data on students in the United Kingdom to show that rank in primary school correlates with their secondary school achievement. Their empirical approach mirrors the one we take with the NYCPS data, and their results replicate ours, at least qualitatively.
Elsner and Isphording (2017) use data from the National Longitudinal Study of Adolescent Health (AddHealth) and exploit within-school differences in the ability distribution of cohorts. The results show that ordinal rank is positively associated with high school completion and negatively with problem behaviors. Tincani (2015, 2014) explores the effect of rank when students intrinsically care about their ordinal ranking, that is, when rank directly enters the utility function.

6 In Online Appendix D, we replicate our finding that rank affects student behavior with data from the National Educational Longitudinal Study (NELS). NELS allows us to relate the same student’s behavior in different classrooms to a proxy for her course-specific rank. We show that a 50-percentile decline in rank across classes is associated with a nearly 10-percentage-point increase in the probability that the teacher reports behavioral problems in the course for which she has the lower rank (relative to a base of 40%).

7 Our findings are also related to an emerging literature on the economic effects of relative incomes. Luttmer (2005) and Card et al. (2012), for instance, demonstrate that own well-being and satisfaction depend negatively on the earnings of neighbors and coworkers. Bertrand, Pan, and Kamenica (2015) show that spouses’ relative incomes affect marriage formation and the division of household production. Charles, Hurst, and Roussanov (2009) argue that conspicuous consumption serves as a costly signal of economic position. Lastly, Kuziemko et al. (2014) provide experimental evidence to suggest that people are “last-place averse” and that low-income individuals oppose redistribution because it disproportionately benefits those ranking just below them.

8 For instance, to implement a convincing test of the teacher channel, we would need objective information on how much effort and attention teachers place on students at every part of the ability distribution.
We are unaware of such data. In Online Appendix D, we report results from a partial test of the hypothesis that teacher behavior varies with students’ rank. Specifically, we test whether teachers’ perception of their students’ ability depends on rank. Conditional on actual test scores, we find no evidence that this is the case.

9 Harrison and List (2004) define a “framed” field experiment as a laboratory experiment with a nonstandard subject pool and field context in either the commodity, task, or information set that the subjects can use.

10 There are six appendices. Online Appendix A considers the implications of comparative advantage for predicting the efficacy of social interventions ex ante. Online Appendix B illustrates how a Roy model of social interactions can explain many of the disparate findings in the empirical literature on peer effects. Online Appendix C discusses identification of traditional peer effects in the presence of self-selection based on comparative advantage in social interactions. Online Appendix D presents additional evidence omitted from the main text. In Appendices E and F we describe the data used in our analysis and provide further details regarding the implementation of our experiment. All appendices are provided on the authors’ websites.

11 In Cicala et al. (2011), we also extend the basic model to allow for many groups and n-dimensional skill (Heckman and Scheinkman 1987), hierarchies (Rosen 1982), and show that the basic results of our model hold when the sectoral choice problem is cast in a general social multiplier framework (Becker and Murphy 2000; Glaeser, Sacerdote, and Scheinkman 2003).

12 Note that the effective size of the nerd group is integrated on $[r^{*}, 1]$, so that, for any $r^{*}$, the shift in $\delta$ will be larger the more concentrated a given increase in the distribution of $\sigma_{N}$ is in the upper end of the distribution.

13 When social status is independent of group size, that is, $\partial s_{j}(L_{j}, K_{j})/\partial L_{j} = 0$, then the horizontal price schedule intersects the “supply” curve exactly once, leading to a single equilibrium with traditional comparative statics.

14 To see this, consider the adjustment process following a small shock to “prices”. From the initial equilibrium at $r^{**}$, a small decrease in relative prices (along the $\delta$-schedule) will lead to agents flowing out of the troublemaking and into the nerd sector, which will cause relative status to decline further and lead to even more agents switching sectors. The process continues until the market reaches a new equilibrium at the origin. Conversely, a small increase in relative prices (along the $\delta$-schedule) will lead to agents flowing into the troublemaking sector. This causes relative status to increase even more, thereby inducing more nerds to become troublemakers until the market reaches equilibrium at $r^{*}$. Similar reasoning shows that the equilibrium at $r^{*}$ is stable.

15 In a generalized Roy model group membership itself may be costly. If the costs of membership do not systematically covary with students’ rank (i.e., if it is equally costly for all children to join a particular group), then the prediction above trivially carries over to the more general setting. If the difference in cost between joining the troublemaking and nerd groups increases with students’ rank (i.e., if it becomes relatively more costly for higher-ranked students to become troublemakers), then the key prediction of our simplified model holds as well. Only if membership is differentially costly and if it becomes relatively cheaper for students to join the troublemakers when their rank increases would we expect to see a different pattern in the data.

16 Since a full description of the experiment is available in the aforementioned paper, we restate only the intervention’s most salient features here and refer the interested reader to Duflo et al.
(2011) for additional details.

17 Across five unannounced visits to each school, the two sections were found to be combined 14.4% of the time in nontracking schools and 9.7% of the time in tracking schools. When sections were not combined, 92% of students in nontracking schools and 96% of students in tracking schools respected their initial assignment.

18 About 21% of students in tracking schools and 23% of those in nontracking schools repeated first grade and participated in the program for only the first year.

19 Online Appendix Table A.1 demonstrates that our results are robust to controlling for higher order polynomials of peers’ mean test score, as well as different moments of the skill distribution, such as the variance of students’ ability.

20 The data of Duflo et al. (2011) also contain component test scores for math and literacy. Our results are qualitatively very similar when using these instead of total test scores, but we note that the impact of rank appears to be stronger for math scores.

21 We have also investigated the relationship between changes in test scores and changes in rank in the NYCPS data, finding qualitatively similar results as in the previous section.

22 The state math and ELA tests are high-stakes exams conducted in the winters of third through eighth grade. For additional information on these tests, see the Online Data Appendix.

23 Interestingly, Figure 5 shows that the relationship between rank and behavior is almost linear, except at the extremes, where there are fewer data to deliver precise estimates. This suggests that simple linear models may provide decent approximations to the true functional relationship.

24 Reassuringly, our results are robust to controlling for various moments of the distribution of test scores.

25 In Online Appendix D, we report results from a partial test of the teacher behavior hypothesis. Specifically, we test whether teacher perception of student ability depends on ordinal rank.
Conditional on actual test scores, we find no evidence that this is the case.

26 When examining the data from sessions in which the first stage involved a different number of mazes, we found no differences in student behavior. We, therefore, pool these data in the analysis below.

27 Students were told that they could “go into debt” during this stage of the experiment, that is, that they could spend more money than they had earned during the previous stage. Any extra spending would be subtracted from their earnings in the third stage. At the end of the experiment, no child ended up with negative earnings.

28 We chose not to elicit an entire demand curve in order to reduce complexity and simplify the instructions.

29 As we argue above, for a large class of skill distributions, a ranking of students based on one skill will be a good proxy for one in terms of (unobserved) relative skills. If sliming does not involve specific skills, then a ranking solely based on the ability to solve mazes will coincide exactly with one in terms of relative ability. To see this, note that, if sliming does not take skill, then we can normalize $\sigma_{T}(r) = 1$ for all individuals, which in turn implies that $\sigma(r) \equiv \sigma_{N}(r)/\sigma_{T}(r) = \sigma_{N}(r)$.

30 An alternative, a priori plausible explanation of our findings might be that students are inequality averse. If inequality aversion were the reason that low-ranked children sabotaged others, then we would expect there to be a positive relationship between rank at baseline and the number of times a student was slimed herself. That is, in order to reduce inequality low-ranked students should disproportionately target higher-ranked ones rather than their low-ranked peers. We do not observe such a pattern in the data.
Estimating the regression models in columns (5) and (6) of Table 7 with an indicator for whether a student’s screen got slimed as the outcome produces point estimates of .041 and −.009 (with standard errors of .073 and .069, respectively), relative to a mean of .48. An explanation based on inequality aversion is, therefore, at odds with this particular moment of the data. References Akerlof G. A. ( 1997 ). “Social Distance and Social Decisions.” Econometrica , 65 , 1005 – 1028 . Google Scholar CrossRef Search ADS Angrist J. D. , Lang K. ( 2004 ). “Does School Integration Generate Peer Effects? Evidence from Boston’s Metco Program.” American Economic Review , 94 ( 5 ), 1613 – 1634 . Google Scholar CrossRef Search ADS Austen-Smith D. , Fryer R. G. ( 2005 ). “An Economic Analysis of Acting White.” Quarterly Journal of Economics , 120 , 551 – 583 . Bala Venkatesh , Goyal S. ( 2000 ). “A Noncooperative Model of Network Formation.” Econometrica , 68 , 1181 – 1229 . Google Scholar CrossRef Search ADS Becker G. M. , DeGroot M. H. , Marschak J. ( 1964 ). “Measuring Utility by a Single-Response Sequential Method.” Behavioral Science , 9 , 226 – 232 . Google Scholar CrossRef Search ADS PubMed Becker G. S. ( 1974 ). “A Theory of Social Interactions.” Journal of Political Economy , 82 , 1063 – 1093 . Google Scholar CrossRef Search ADS Becker G. S. ( 1991 ). A Treatise on the Family . Harvard University Press , Cambridge, MA . Becker G. S. ( 1996 ). Accounting for Tastes . Harvard University Press , Cambridge, MA . Becker G. S. , Murphy K. M. ( 2000 ). Social Economics: Market Behavior in a Social Environment . Belknap Press of Harvard University , Cambridge, MA . Google Scholar CrossRef Search ADS Benabou R. ( 1993 ). “Workings of a City: Location, Education, and Production.” Quarterly Journal of Economics , 108 , 619 – 652 . Google Scholar CrossRef Search ADS Bernheim B. D. ( 1994 ). “A Theory of Conformity.” Journal of Political Economy , 102 , 841 – 877 . 
Google Scholar CrossRef Search ADS Bertrand M. , Pan J. , Kamenica E. ( 2015 ). “Gender Identity and Relative Income within Households.” Quarterly Journal of Economics , 130 , 571 – 614 . Google Scholar CrossRef Search ADS Booij A. , Leuven E. , Oosterbelk H. ( 2017 ). “Ability Peer Effects in University: Evidence from a Randomized Experiment.” Review of Economic Studies , 84 , 816 – 839 . Borjas G. J. ( 1987 ). “Self-Selection and the Earnings of Immigrants.” American Economic Review , 77 ( 4 ), 531 – 553 . Borjas G. J. ( 1995 ). “The Economic Benefits from Immigration.” The Journal of Economic Perspectives , 9 ( 2 ), 3 – 22 . Google Scholar CrossRef Search ADS Bursztyn L. , Jensen R. ( 2015 ). “How Does Peer Pressure Pressure Affect Educational Investments?” Quarterly Journal of Economics , 130 , 1329 – 1367 . Cameron A. C. , Gelbach J. B. , Miller D. L. ( 2008 ). “Bootstrap-Based Improvements for Inference with Clustered Errors.” Review of Economics and Statistics , 90 , 414 – 427 . Google Scholar CrossRef Search ADS Canada G. ( 1995 ). Fist, Stick, Knife, Gun: A Personal History of Violence in America . Beacon Press , Boston, MA . Card D. , Mas A. , Moretti E. , Saez E. ( 2012 ) “Inequality at Work: The Effect of Peer Salaries on Job Satisfaction.” American Economic Review , 102 ( 6 ), 2981 – 3002 . Google Scholar CrossRef Search ADS Carrell S. E. , Fullerton R. L. , West J. E. ( 2009 ). “Does Your Cohort Matter? Measuring Peer Effects in College Achievement.” Journal of Labor Economics , 27 , 439 – 464 . Google Scholar CrossRef Search ADS Carrell S. E. , Sacerdote B. I. , West J. E. ( 2013 ). “From Natural Variation to Optimal Policy? The Importance of Endogenous Peer Group Formation.” Econometrica , 81 , 855 – 882 . Google Scholar CrossRef Search ADS Chapple S. , Jefferies R. , Walker R. ( 1997 ). Maori Participation and Performance in Education: a Literature Review and Research Programme . N.Z. Institute of Economic Research . Charles K. K. , Hurst E. 
, Roussanov N. ( 2009 ) “Conspicuous Consumption and Race.” Quarterly Journal of Economics , 124 , 425 – 467 . Google Scholar CrossRef Search ADS Cicala S. , Fryer R. G. , Spenkuch J. L. ( 2011 ). “A Roy Model of Social Interactions.” NBER Working Paper No. 16880 , Cambridge, MA . Cooley-Fruehwirth J. C. ( 2013 ). “Identifying Peer Achievement Spillovers: Implications for Desegregation and the Achievement Gap.” Quantitative Economics , 4 , 85 – 124 . Google Scholar CrossRef Search ADS Cullen J. B. , Jacob B. A. , Levitt S. D. ( 2006 ) “The Effect of School Choice on Participants: Evidence from Randomized Lotteries.” Econometrica , 74 , 1191 – 1230 . Google Scholar CrossRef Search ADS De Vos George , Wagatsuma H. ( 1966 ). Japan’s Invisible Race: Caste in Culture and Personality . University of California Press , Berkeley . Drake S.C. , Cayton H. R. ( 1945 ). Black Metropolis: A Study of Negro Life in a Northern City. University of Chicago Press . Duflo E. , Dupas P. , Kremer M. ( 2011 ). “Peer Effects, Teacher Incentives, and the Impact of Tracking: Evidence from a Randomized Evaluation in Kenya.” American Economic Review , 101 ( 5 ), 1739 – 1774 . Google Scholar CrossRef Search ADS Elsner B. , Isphording I. ( 2017 ). “A Big Fish in a Small Pond: Ability Rank and Human Capital Investment.” Journal of Labor Economics , 35 ( 3 ), 787 – 828 . Google Scholar CrossRef Search ADS Elsner B. , Isphording I. ( 2017 ). “Rank, Sex, Drugs, and Crime.” Journal of Human Resources , Forthcoming . Epple D. , Romano R. E. ( 2011 ). “Peer Effects in Education: A Survey of the Theory and Evidence.” In Handbook of Social Economics , Vol. 1 , edited by Benhabib J. , Bisin A. , Jackson M. O. . Elsevier , Amsterdam , pp. 1053 – 1163 . Google Scholar CrossRef Search ADS Feld J. , Zolitz U. ( 2017 ). “Understanding Peer Effects: On the Nature, Estimation, and Channels of Peer Effects.” Journal of Labor Economics , 35 , 387 – 428 . Google Scholar CrossRef Search ADS Fordham S. , Ogbu J. U. 
(1986). “Black Students’ School Success: Coping with the ‘Burden of “Acting White”’.” The Urban Review, 18, 176–206.

Friedman M. (1953). “The Methodology of Positive Economics.” In Essays in Positive Economics. University of Chicago Press, Chicago.

Gans H. J. (1962). The Urban Villagers: Group and Class in the Life of Italian-Americans. Free Press of Glencoe, New York.

Glaeser E. L., Sacerdote B. I., Scheinkman J. A. (2003). “The Social Multiplier.” Journal of the European Economic Association, 1, 345–353.

Goux D., Maurin E. (2007). “Close Neighbours Matter: Neighbourhood Effects on Early Performance at School.” Economic Journal, 117, 1193–1215.

Guryan J., Kroft K., Notowidigdo M. J. (2009). “Peer Effects in the Workplace: Evidence from Random Groupings in Professional Golf Tournaments.” American Economic Journal: Applied Economics, 1, 34–68.

Hanushek E. A., Kain J. F., Markman J. M., Rivkin S. G. (2003). “Does Peer Ability Affect Student Achievement?” Journal of Applied Econometrics, 18, 527–544.

Harrison G. W., List J. A. (2004). “Field Experiments.” Journal of Economic Literature, 42, 1009–1055.

Heckman J. J., Sedlacek G. (1985). “Heterogeneity, Aggregation, and Market Wage Functions: An Empirical Model of Self-selection in the Labor Market.” Journal of Political Economy, 93, 1077–1125.

Heckman J. J., Scheinkman J. A. (1987). “The Importance of Bundling in a Gorman-Lancaster Model of Earnings.” Review of Economic Studies, 54, 243–255.

Hoxby C. M. (2003). “The Power of Peers: How Does the Makeup of a Classroom Influence Achievement.” Education Next, 2(2), 57–63.
Hoxby C. M., Weingarth G. (2005). “Taking Race Out of the Equation: School Reassignment and the Structure of Peer Effects.” Working paper. Harvard University.

Imberman S., Kugler A. D., Sacerdote B. (2012). “Katrina’s Children: Evidence on the Structure of Peer Effects from Hurricane Evacuees.” American Economic Review, 102(5), 2048–2082.

Ioannides Y. M. (2011). “Neighborhood Effects and Housing.” In Handbook of Social Economics, Vol. 1, edited by Benhabib J., Bisin A., Jackson M. O. Elsevier, Amsterdam, pp. 1281–1340.

Jackson M. O., Wolinsky A. (1996). “A Strategic Model of Social and Economic Networks.” Journal of Economic Theory, 71, 44–74.

Jackson M. O., Rogers B. W. (2007). “Meeting Strangers and Friends of Friends: How Random Are Social Networks?” American Economic Review, 97(3), 890–915.

Jackson M. O., Rogers B., Zenou Y. (2017). “The Economic Consequences of Social Network Structure.” Journal of Economic Literature, 55, 49–95.

Kang C. (2007). “Classroom Peer Effects and Academic Achievement: Quasi-Randomization Evidence from South Korea.” Journal of Urban Economics, 61, 458–495.

Kling J. R., Ludwig J., Katz L. F. (2005). “Neighborhood Effects on Crime for Female and Male Youth: Evidence from a Randomized Housing Voucher Experiment.” Quarterly Journal of Economics, 120, 87–130.

Kling J. R., Liebman J. B., Katz L. F. (2007). “Experimental Analysis of Neighborhood Effects.” Econometrica, 75, 83–119.

Kuziemko I., Buell R. W., Reich T., Norton M. I. (2014). “Last-Place Aversion: Evidence and Redistributive Implications.” Quarterly Journal of Economics, 129, 105–149.

Luttmer E. F. P. (2001).
“Group Loyalty and the Taste for Redistribution.” Journal of Political Economy, 109, 500–528.

Marsh H., Parker J. W. (1984). “Determinants of Student Self-Concept: Is It Better to Be a Relatively Large Fish in a Small Pond Even if You Don’t Learn to Swim as Well?” Journal of Personality and Social Psychology, 47, 213–231.

Marsh H. (1987). “The Big-Fish-Little-Pond Effect on Academic Self-Concept.” Journal of Educational Psychology, 79, 280–295.

Marsh H., Seaton M., Trautwein U., Ludtke O., Hau K. T., O’Mara A. J. (2008). “The Big-Fish-Little-Pond-Effect Stands Up to Critical Scrutiny: Implications for Theory, Methodology, and Future Research.” Educational Psychology Review, 20, 319–350.

Mas A., Moretti E. (2009). “Peers at Work.” American Economic Review, 99(1), 112–145.

Miller R. A. (1984). “Job Matching and Occupational Choice.” Journal of Political Economy, 92, 1086–1120.

Murphy K. M. (1986). “Specialization and Human Capital.” Doctoral dissertation. University of Chicago.

Murphy R., Weinhardt F. (2014). “Top of the Class: The Importance of Rank Position.” Working paper. London School of Economics.

Rogers C. M., Smith M. D., Coleman J. M. (1978). “Social Comparison in the Classroom: The Relationship Between Academic Achievement and Self-Concept.” Journal of Educational Psychology, 70, 50–57.

Rosen S. (1974). “Hedonic Prices and Implicit Markets: Product Differentiation in Pure Competition.” Journal of Political Economy, 82, 34–55.

Rosen S. (1982). “Authority, Control, and the Distribution of Earnings.” Bell Journal of Economics, 13, 311–323.

Roy A. D. (1951).
“Some Thoughts on the Distribution of Earnings.” Oxford Economic Papers, 3, 135–146.

Sacerdote B. I. (2001). “Peer Effects with Random Assignment: Results for Dartmouth Roommates.” Quarterly Journal of Economics, 116, 681–704.

Sanbonmatsu L., Kling J. R., Duncan G., Brooks-Gunn J. (2006). “Neighborhoods and Academic Achievement: Results from the Moving to Opportunity Experiment.” Journal of Human Resources, 41, 649–691.

Sattinger M. (1979). “Differential Rents and the Distribution of Earnings.” Oxford Economic Papers, 31, 60–71.

Shea J. (1997). “Instrument Relevance in Multivariate Linear Models: A Simple Measure.” Review of Economics and Statistics, 79, 348–352.

Spence M. (1973). “Job Market Signaling.” Quarterly Journal of Economics, 87, 355–374.

Stock J. H., Yogo M. (2005). “Testing for Weak Instruments in Linear IV Regression.” In Identification and Inference for Econometric Models: Essays in Honor of Thomas Rothenberg, edited by Andrews D. W. K., Stock J. H. Cambridge University Press, Cambridge, UK, pp. 80–108.

Suskind R. (1998). A Hope in the Unseen: An American Odyssey from the Inner City to the Ivy League. Broadway Books, New York.

Tincani M. (2014). “On the Nature of Social Interactions in Education: An Explanation for Recent Experimental Evidence.” Working paper. University College London.

Tincani M. (2015). “Heterogeneous Peer Effects and Rank Concerns: Theory and Evidence.” Working paper. University College London.

Willis P. (1977). Learning to Labor: How Working Class Kids Get Working Class Jobs. Columbia University Press.

Willis R. J., Rosen S. (1979). “Education and Self-Selection.” Journal of Political Economy, 87, S7–S36.

Wilson W. J. (1987).
The Truly Disadvantaged: The Inner City, the Underclass, and Public Policy. The University of Chicago Press.

Supplementary Data

Supplementary data are available at JEEA online.

© The Authors 2017. Published by Oxford University Press on behalf of European Economic Association. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com

### Journal

Journal of the European Economic Association, Oxford University Press

Published: Sep 29, 2017
