TY - JOUR
AU1 - Schwebel, David, C
AU2 - Long, D, Leann
AU3 - McClure, Leslie, A
AB - Abstract
Objective Youth soccer injury can be prevented through various means, but few studies consider the role of referees. Following previous research suggesting children take fewer risks when supervised intensely, this randomized crossover trial evaluated whether risky play and injuries decrease under supervision from three referees instead of one referee.
Methods Youth soccer clubs serving a metropolitan U.S. area participated. Boys’ and girls’ clubs at under age 10 (U10) and under age 11 (U11) levels were randomly assigned such that when the same clubs played each other twice in the same season, they played once with one referee and once with three referees. A total of 98 games were videotaped and subsequently coded to obtain four outcomes: collisions between players, aggressive fouls (involving physical player-to-player contact) called by the referee(s) on the field, aggressive fouls judged by trained coders, and injuries requiring adult attention or play stoppage.
Results Poisson mixed model results suggest players in the 98 games committed fewer aggressive fouls, as identified independently by referees (rate ratio [RR] 0.58; 95% confidence interval [CI] 0.35–0.96) and by researchers (RR 0.67; 95% CI 0.50–0.90), when there were three referees versus one referee. Collisions (RR 0.98; 95% CI 0.86–1.12) and injury rates (RR 1.15; 95% CI 0.60–2.19) were similar across conditions.
Conclusion When the same youth soccer clubs played with three referees rather than one, they committed fewer aggressive fouls. More intense supervision created better rule adherence. Injury rates were unchanged with increased supervision. Results raise questions concerning whether financial investment in additional referees on youth soccer fields yields safety benefits.
Keywords: football, injury, safety, soccer, supervision
Introduction
Soccer is the most popular sport in the world.
In the United States, the popularity of youth soccer ballooned in a generation, peaking at over 3 million registered children in the mid-2010s (Drape, 2018; Rapids Youth Soccer, 2016), compared to about 811,000 children registered in 1980 and 1.6 million in 1990 (Rapids Youth Soccer, 2016). The gender breakdown among youth players in the United States is nearly balanced, with 52% of registrants boys and 48% girls. Despite the clear health benefits of engagement in youth soccer, there are also risks of sports-related injury. Epidemiological data list soccer as the third most common cause of youth sports injury requiring emergency department visits in the United States, following football and basketball (Schwebel & Brezausek, 2014). Soccer accounts for an estimated 130,245 emergency department visits by American youth annually (Schwebel & Brezausek, 2014). In other countries, such as Canada (Fridman et al., 2013) and the Netherlands (Belechri et al., 2001), soccer is the leading cause of youth sports injury.
Soccer injuries arise primarily from three situations. One set of injuries is the result of overuse or overextension. These injuries, most commonly sprains and strains in the lower extremity, are substantial but usually not treated urgently, and they are not the present focus. A second set of injuries, usually short-lived and comparatively minor, arises when a player is struck by the ball, often in the stomach or face. This type of injury is also not the present focus. The third set of injuries, the present focus, results from collisions, most often between two players. Such injuries comprise 50–70% of youth soccer injuries (Brito et al., 2012; Committee on Sports Medicine and Fitness, 2000; Junge & Dvorak, 2004; Khodaee et al., 2017; Laker, 2011), and are sometimes treated urgently.
A number of strategies have been proposed to reduce youth soccer injuries resulting from collisions between players.
One prominent strategy is to change the rules so that aggressive and physical play is minimized. Recent efforts to prohibit heading (use of the head to control or retrieve the ball) in youth soccer have been lauded by the injury prevention community (Yang & Baugh, 2016); such a rule change may reduce risk of head injury from striking the ball and may also reduce risk of head or extremity injury from two players colliding while attempting to head the ball.
Another prominent strategy works toward changes in coaching practices, attempting to reduce competitive and even illegal play that coaches may encourage or instill in young players (Wiersma & Sherman, 2005). Prototypical coaches in youth soccer leagues are volunteer parents without formal training in either soccer or coaching, and in some cases they promote risky and aggressive behavior among their players. This strategy addresses the complex matter of demarcating a line between training children to play aggressively but within the rules of the game versus breaking, bending, or “pushing” the rules to win. Most athletes recognize aggressive soccer play can lead to success on the field, but such instances of aggression, often illegal, also increase injury risk. Evaluations of coach training programs are few, and the programs remain unproven in efficacy (Wiersma & Sherman, 2005).
A third strategy to reduce collision-based youth soccer injuries, and one that has received comparatively less attention, focuses on the role of referees. A series of studies suggests that children behave in safer ways when they recognize or sense they are being supervised by authority figures. This pattern was illustrated initially in a laboratory setting (Schwebel & Bounds, 2003), and then replicated in real-world pedestrian (Barton & Schwebel, 2007), playground (Chelvakumar et al., 2010; Schwebel et al., 2006b, 2015), and swimming pool (Schwebel et al., 2007, 2011) settings. Youth soccer environments offer parallels to playground and swimming settings.
In all cases, there is reason and even encouragement for children to take risks and experiment with physical challenges. There are also rules or guidelines in each setting. Unlike playgrounds and swimming pools, however, youth soccer environments present highly dynamic situations and an environment where aggression and risk-taking are of benefit to players trying to win the ball and win the game. This study was developed, therefore, to test whether increased intensity of supervision through the presence of more referees on the field might reduce children’s risk-taking and injury risk situations in youth soccer matches. We conducted our research using a randomized crossover design among both boys’ and girls’ youth soccer games that historically were officiated by a single referee standing in the center of the field. Games were paired such that when the same two clubs played each other, they were randomly assigned to have the historic pattern of one referee versus a new format of three referees, with one in the center and two along the sidelines, as is used in most youth soccer matches with older children. All three referees were granted authority and encouraged to call all fouls they witnessed. We conducted our research with youth about ages 8–10, the oldest age groups in the local community whose games were traditionally officiated by a single referee. We hypothesized the additional referees on the field would lead to reduced risk-taking and injury, as conceptualized through four outcomes: collisions between players, aggressive fouls (i.e., those involving player-to-player contact) called by referees on the field, aggressive fouls detected by our research team while reviewing videotaped games, and injuries to players requiring adult attention and/or removal from the game. Methods We conducted a randomized crossover trial with boys’ and girls’ youth soccer games randomized to have one referee versus three referees, and pairs of teams serving as the unit of analysis. 
Player-to-player collisions, fouls, and player injuries were assessed through behavioral coding of recorded games to evaluate whether additional referees reduced the risk of child injury. The research protocol was approved by the Institutional Review Board at the University of Alabama at Birmingham. Referees provided informed consent to participate. Since we were videotaping public events in public locations, the consent of players, coaches, and others captured on the videotapes was exempted. All videotapes were stored as highly confidential research data. We monitored for study-related adverse events and noted none. Randomization of games took place in advance so that either one referee or three referees could be scheduled to work. Each game was pseudo-randomly assigned by a referee scheduling group to have one referee or three referees, and then matched pairs were identified to determine sequence approximating a 1:1 allocation ratio (half the pairs had one referee first and half had three referees first). Sequence was therefore concealed prior to assignment and the referee scheduling group, who were responsible for assigning referees to each game and worked independently from the coaches, players, and researchers, applied the randomized number of referees. The referee scheduling group also did not serve as referees for any of the matches. Masking of condition to any party was not possible given the nature of the research design and goals. Youth Soccer League and Game Settings Participating teams were part of a consortium of elite youth soccer clubs in the greater Birmingham, Alabama metropolitan area and extending to nearby cities like Huntsville and Tuscaloosa. The clubs served a racially, geographically, and socioeconomically diverse group of communities and families and were based in rural, suburban, and urban areas. 
Both boys and girls teams participated, and we focused our research on the U10 (under age 10) and U11 (under age 11) leagues, which were the oldest groups of teams previously playing matches officiated by a single referee. Those teams were populated with players almost entirely ages 9–10, with a few 7- and 8-year olds, generally highly skilled players, “playing up” with teams serving primarily older children. All teams serving those age groups were eligible for participation; no cultural, language, health, or other exclusion criteria applied. Games were played throughout the metropolitan area, with teams traveling as far as about 100 miles (160 km) for matches against clubs in neighboring cities. In many cases, teams played a few matches on the same day or weekend, as part of a round-robin tournament. The crossover design study was originally planned using PASS (NCSS Statistical Software, Kaysville, UT, United States) to detect differences in outcomes of at least 0.27 SD assuming 150 pairs of games (300 total games) with 90% power, and assuming a two-sided alpha level of 0.05. The methodology applied accounts for the paired nature of the study design, which addresses within-participant variability. Because changes to the league occurred between the planning stage of the study and the implementation, and due to logistical challenges in data collection, we achieved a smaller sample size than planned. In total, we included 49 pairs of games in the study (98 total games), which under the same assumptions offered power to detect an effect size of at least 0.47 SD. The recorded 98 games involved 13 clubs and included 38 games for U10 boys, 28 for U10 girls, 18 for U11 boys, and 14 for U11 girls. All games were paired, such that the same clubs played each other, once with one referee on the field and the other time with three referees on the field. 
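As a rough check on the power statement above, the detectable effect size can be approximated with a normal-theory formula for a paired comparison. This is a minimal sketch and not the PASS procedure used in the study; the function name `detectable_effect` and the normal approximation are assumptions made for illustration. It also shows that the reported 0.47 SD follows from rescaling 0.27 SD by the square root of the ratio of planned to achieved pairs.

```python
from math import sqrt
from statistics import NormalDist


def detectable_effect(n_pairs, power=0.90, alpha=0.05):
    """Approximate smallest standardized paired difference detectable.

    Normal approximation for a two-sided test (an assumption for
    illustration; the study itself used PASS).
    """
    z = NormalDist()
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) / sqrt(n_pairs)


planned = detectable_effect(150)   # 150 pairs planned
achieved = detectable_effect(49)   # 49 pairs achieved

# Rescale the reported planning value by sqrt(planned pairs / achieved pairs)
scaled = 0.27 * sqrt(150 / 49)

print(f"planned design:  {planned:.3f} SD")
print(f"achieved design: {achieved:.3f} SD")
print(f"0.27 SD rescaled to 49 pairs: {scaled:.3f} SD")
```

Under this approximation the values land slightly below the reported 0.27 and 0.47 SD, which is unsurprising for a normal approximation, while the rescaled figure rounds to the reported 0.47 SD.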
All games occurred during the same Fall 2017 season, with the first games occurring on September 9, 2017, and the last on November 11, 2017. Games were scheduled to last 50 min (M = 53.1, SD = 6.5), with a few games shortened by weather conditions or lengthened to account for injury-related stoppage. Climate during the games ranged widely, with hot and humid weather typical at the start of the season and crisp, cooler weather at the end (M temperature = 80.9° F, SD = 5.7°). Light rain was present at two games. We recorded all games from a location in the center of the field sideline using a digital video camera with a wide-angle lens placed on a tripod. Undergraduate students, graduate students, and paid research staff were trained to follow the action of the ball and record both sound and video. Elevated angles were used when logistically feasible. Appropriate gear was present to handle recording in the rain. All games scheduled to be recorded were recorded successfully, but technical issues (e.g., dead battery, full memory card) led to small portions of games not recorded on rare occasions (<10% of games). Referees Referees were hired by the league. All had appropriate training and certification to serve as youth soccer referees. The full panel of 72 referees used by the league included 61 men (85%) and 11 women (15%) with a mean age of 27.2 years (SD = 15.9; range = 14.1–71.2 years). They were 65% Caucasian, 18% African-American, 13% Hispanic, and 3% Asian American; one referee declined to provide their race/ethnicity. Most of these referees were present at one or more of the games included in this research, and all provided informed consent via a very short online questionnaire that included consenting processes and a brief demographic survey. Researchers were available to answer questions about the research and consenting process by email or telephone, or in person during recorded soccer matches. 
All referees serving the league were eligible for inclusion in the study; no exclusion criteria applied. Logistics of Soccer Refereeing Traditionally, professional soccer matches have three referees on the field, with the “center” referee responsible for calling all contact fouls while two assistant referees monitor sidelines to regulate out-of-bounds calls and offsides. In the mid-2010s, FIFA, the international soccer monitoring body, recognized that assistant referees on the sidelines may sometimes have a better angle to view fouls than the center referee and therefore at the 2014 World Cup, side referees were granted the right and responsibility to call contact penalties also. This strategy for refereeing elite soccer matches has begun to be adopted more broadly and is the strategy we implemented for this research when three referees were present. The three referees positioned themselves in “traditional” locations (center referee covering a diagonal pathway across the field and side referees monitoring opposite sidelines), but all three were granted the right and responsibility to call all fouls. They were trained and familiar with this strategy already, as it was used for older children in this and other local community and interscholastic soccer leagues. For games with just one referee, that referee was responsible for monitoring the full field and calling all fouls and sideline violations. Videotape Coding Following recording of all games, coding of the videotapes proceeded in three steps. First, as detailed below and following recommendations to code behavioral pediatric psychology data by Chorney et al. (2015), objective written criteria were developed, largely through refinement of existing criteria used by our laboratory in previous research (Schwebel et al., 2006a). Joint review of videotapes by the coding team, lab manager, and principal investigator was incorporated into the process of developing those criteria. 
Second, two coders independently reviewed a randomly selected 15% of games. Inter-rater reliability was established through that review on each variable, with coding agreement of 95% or higher for every categorical variable, κ > 0.70 on categorical variables, and intraclass correlation r ≥ 0.80 on continuous variables. Intraclass correlations between coders for the four primary outcome measures of collisions, aggressive fouls called by the referees, aggressive fouls detected by the coders, and injuries, were 0.80, 0.89, 0.95, and 0.90, respectively, all above the recommended correlation of 0.75 or above for acceptable agreement (Cicchetti, 1994; Hallgren, 2012). Following the establishment of inter-rater reliability, disagreements between coders were resolved through joint tape review and consensus agreement. At that point, the third step of coding proceeded, review of the remaining tapes by a single researcher. Coding was conducted by a mix of doctoral students and paid research assistants, all of whom held a bachelor’s degree in psychology, public health, or a related field and many of whom held master’s degrees.
Measures
The following four outcome measures were retrieved through videotape review:
Collision, defined as contact between two or more players that resulted in one or more players showing visible signs of pain, falling down as a result of the contact, or experiencing an injury that required adult attention. A collision also was recorded when two or more players had contact as a result of a foul, as called by the referee or judged by the coder. Finally, a collision occurred when there was forceful contact between players because one player was trying to gain a better position or get to the ball.
Aggressive foul (referee), defined as a foul whistled by a referee on the field, live, during the game, that was aggressive in that it involved player-to-player contact. Examples include pushing, tripping, and other aggressive acts. Handballs, offside calls, and illegal throw-ins were not included.
Aggressive foul (coder), defined as a foul identified by our researcher who was coding videotaped games. As in the aggressive referee fouls, only fouls that involved player-to-player contact were included. Thus, the two outcomes of aggressive fouls called by the referee and aggressive fouls coded by the research team were designed to count the same behaviors. The research team benefited, of course, from the opportunity to use replay, slow motion, and zooming on videotapes, so their counts were generally higher than the referee-called fouls.
Injury, defined as any sort of pain or tissue damage that was serious enough that an adult attended to the player on the field and/or the player left the game. We further coded injuries into three categories: (a) those occurring as a result of a collision or contact with another player, (b) those occurring as a result of contact with the ball, such as when the ball is kicked and hits a player hard in the stomach, and (c) those occurring as a result of no contact, such as a strained muscle or a cramp. Given the goals of this study, only injuries occurring in the first group, those resulting from a collision or contact with another player, were considered in data analyses. Injuries in this category comprised 80.4% of all injuries that were coded.
Statistical Analysis Plan
Frequencies of outcome measures were summarized overall, by number of referees, and by gender/age group. To test for differences in the rates for each outcome between games with three referees and games with one referee, separate Poisson mixed models were fitted, allowing for the incorporation of the crossover design. Specifically, correlations within game pairs were addressed with a pair-specific intercept. Offsets were utilized to account for individual recorded game length.
This methodology yields rate ratios (RRs) to compare the ratio of event rates between three-referee games and one-referee games. Given the commonality of competing in youth soccer matches among our sample, the carryover effect from one game to the next should be minimal. Carryover was assessed statistically with multiplicative interaction between group sequence and treatment (Brown, 1980; Willan & Pater, 1986). The level of significance for all analyses was set at 0.05. Statistical analyses were conducted in SAS 9.4 (SAS Institute, Cary, NC, United States).
Results
Table I presents descriptive and inferential results. There was no evidence of carryover effect for any of the outcome measures (all p > .05). As shown, collisions occurred at similar frequencies across games with one referee versus three referees overall, although there were trends for more collisions between players to occur among both older and younger girls with one referee, and among older boys with three referees present.

Table I. Crude Outcome Frequencies and Estimated RRs of Risk-Taking in Youth Soccer Games With Three Referees Versus One Referee

| Outcome | Overall (N = 49 pairs) | U10 girls (n = 14) | U10 boys (n = 19) | U11 girls (n = 7) | U11 boys (n = 9) |
|---|---|---|---|---|---|
| Collisions: M (SD; all games combined) | 9.6 (7.5) | 8.6 (8.7) | 10.6 (7.1) | 6.6 (5.0) | 11.4 (7.3) |
| 3 referees M (SD) | 9.5 (8.4) | 8.0 (7.5) | 10.7 (9.1) | 5.0 (5.1) | 12.9 (9.4) |
| 1 referee M (SD) | 9.6 (6.4) | 9.3 (9.4) | 10.4 (4.6) | 8.1 (4.6) | 9.9 (4.5) |
| RR (95% CI)a | 0.98 (0.86–1.12) | | | | |
| Aggressive fouls: M (SD; referee; all games) | 0.7 (1.0) | 0.6 (0.9) | 0.8 (1.2) | 0.7 (1.0) | 0.7 (1.0) |
| 3 referees M (SD) | 0.5 (1.0) | 0.4 (0.6) | 0.6 (1.2) | 0.1 (0.4) | 0.9 (1.1) |
| 1 referee M (SD) | 0.9 (1.0) | 0.9 (1.0) | 0.9 (1.1) | 1.3 (1.1) | 0.6 (0.9) |
| RR (95% CI)a | 0.58 (0.35–0.96)* | | | | |
| Aggressive fouls: M (SD; coders; all games) | 1.9 (1.7) | 1.6 (1.7) | 1.9 (1.6) | 2.1 (1.7) | 2.3 (1.9) |
| 3 referees M (SD) | 1.6 (1.6) | 1.2 (1.5) | 1.6 (1.5) | 1.7 (1.7) | 1.9 (2.0) |
| 1 referee M (SD) | 2.3 (1.7) | 2.0 (1.8) | 2.2 (1.7) | 2.6 (1.7) | 2.7 (1.9) |
| RR (95% CI)a | 0.67 (0.50–0.90)** | | | | |
| Injuries: M (SD; all games) | 0.4 (0.7) | 0.2 (0.6) | 0.4 (0.7) | 0.4 (0.6) | 0.6 (0.8) |
| 3 referees M (SD) | 0.4 (0.7) | 0.4 (0.6) | 0.4 (0.6) | 0.3 (0.5) | 0.8 (0.8) |
| 1 referee M (SD) | 0.4 (0.7) | 0.1 (0.3) | 0.5 (0.8) | 0.6 (0.8) | 0.3 (0.7) |
| RR (95% CI)a | 1.15 (0.60–2.19) | | | | |

Note. CI = confidence interval; M = mean; RR = rate ratio; SD = standard deviation; U10 = under age 10 years; U11 = under age 11 years.
a Poisson mixed models, accounting for pairings and game length.
* p < .05, ** p < .01.

Supporting our hypothesis, there were significant differences between the rate of aggressive fouls based on the number of referees on the field. Rates of fouls were lower for games with three referees compared to those with one referee. This was true of both fouls identified by the referees live on the field (RR 0.58 [0.35–0.96]) and of fouls identified by our research team through videotape review (RR 0.67 [0.50–0.90]). These trends were true among most age- and gender-based subgroups also, with the exception of fouls called by the referees during the older boys’ games. Finally, there were no significant differences in injury rates across the referee groupings.
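The reported rate ratios can be read as percent changes, and their confidence intervals can be used to back out approximate test statistics. The sketch below assumes the intervals are symmetric Wald intervals on the log scale, which is a common convention for Poisson models but is not stated in the text; the dictionary `REPORTED` and the function `summarize` are hypothetical names, and the resulting p-values are approximations rather than the values produced by the fitted models.

```python
from math import log
from statistics import NormalDist

# Rate ratios and 95% CIs (RR, lower, upper) as reported in Table I
REPORTED = {
    "collisions": (0.98, 0.86, 1.12),
    "aggressive fouls (referee)": (0.58, 0.35, 0.96),
    "aggressive fouls (coders)": (0.67, 0.50, 0.90),
    "injuries": (1.15, 0.60, 2.19),
}


def summarize(rr, lower, upper):
    """Return (percent change, z, two-sided p) from an RR and its 95% CI.

    Assumes a Wald interval on the log scale (an assumption, not a
    detail confirmed by the source).
    """
    normal = NormalDist()
    z975 = normal.inv_cdf(0.975)
    se = (log(upper) - log(lower)) / (2 * z975)  # SE of log(RR)
    z = log(rr) / se
    p = 2 * (1 - normal.cdf(abs(z)))
    pct_change = (rr - 1) * 100                  # negative means a reduction
    return pct_change, z, p


results = {name: summarize(*values) for name, values in REPORTED.items()}
for name, (pct, z, p) in results.items():
    print(f"{name}: {pct:+.0f}% change, z = {z:.2f}, approximate p = {p:.3f}")
```

Under this assumption, the approximate p-values fall below .05 for referee-called fouls and below .01 for coder-identified fouls, in line with the asterisks in Table I, while collisions and injuries stay well above .05.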
Minutes played/recorded did not vary significantly across one- versus three-referee games. A sensitivity analysis considering all injuries experienced, rather than only those resulting from collisions or contact with other players, yielded similar results to those presented. Discussion Our results partially support our hypotheses: when youth soccer games involving the same clubs and the same players were randomly assigned to be officiated by three referees rather than just one referee, there were fewer aggressive fouls, as judged both by the referees on the field and by our research team who reviewed the videotaped games. The reduction in fouls was substantial; across both age groups and genders, we witnessed a 33% drop in aggressive fouls as judged by our research team and a 42% drop as judged by the referees on the field. One might suppose that a drop in aggressive fouls would result in a drop in injuries, but our results do not support that supposition. We saw no difference in injury rates across the two sets of games (RR 1.15; 95% confidence interval [CI] 0.60–2.19). We also did not observe a statistically significant difference in collisions between players across the randomized groupings of games (RR 0.98; 95% CI 0.86–1.12). The fact that we saw comparable results across both measures of fouls—those called on the field by the referees and those observed by our research coders on videotape—is encouraging. On the surface, we might expect there would be more fouls with more referees on the field; with three sets of eyes on the field rather than just one, we might anticipate more fouls to be called. Instead, we saw the opposite: when children sensed they were being watched more carefully, there were fewer aggressive fouls. 
This was true for both sets of measures used to record aggressive violations of the game’s rules; suggests additional referees created a safer, less aggressive, and more rule-abiding competition; and replicates previous work in other settings (Barton & Schwebel, 2007; Chelvakumar et al., 2010; Schwebel et al., 2006b, 2007, 2011, 2015). Our findings were relatively comparable across the two age groups and two genders studied, although some intriguing trends emerged that pique curiosity to explore in future research using larger sub-sample sizes. Among the older boys in the sample, for example, there was some indication of greater risk-taking and injuries with three referees present than with one referee, contrary to the general trend among younger boys and among girls. The older girls studied, in contrast, showed a pattern of results most consistent with our hypothesis, with all risk outcomes elevated in the one referee condition. Results concerning gender differences are of particular interest for practice, as existing data suggest girls playing youth soccer have a higher rate of injury per hour played (Khodaee et al., 2017) than boys, including especially higher rates of head (Pfister et al., 2016) and knee injury (Mitchell et al., 2016). Given our results, perhaps the biggest issue for implementation—especially for cash-strapped youth soccer leagues—is an analysis of the cost-benefit ratio to add additional referees on the field. In the league we studied, the side referees were paid $25 per game. Adding two additional referees would therefore cost $50 per game, or about $2 per player involved in the game. Assuming the cost is passed on to parents, over the course of a 12-game season parents might incur an additional registration cost of roughly $25 for their children to play in a potentially safer and empirically less aggressive soccer game environment with three referees rather than one referee. 
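The cost figures above follow from simple arithmetic, sketched here for transparency. The fee, the per-player figure, and the season length come from the text; the number of players per game is not stated in the source and is only back-calculated here from the "about $2 per player" figure, so it should be treated as an implied value rather than a reported one.

```python
# Figures taken from the text
SIDE_REFEREE_FEE = 25          # dollars per side referee per game
EXTRA_REFEREES = 2             # two side referees added to the center referee
COST_PER_PLAYER_PER_GAME = 2   # dollars, "about $2" in the text
GAMES_PER_SEASON = 12

# Added cost of moving from one referee to three
extra_cost_per_game = SIDE_REFEREE_FEE * EXTRA_REFEREES

# Implied players per game (back-calculated, not reported in the source)
implied_players_per_game = extra_cost_per_game / COST_PER_PLAYER_PER_GAME

# Added cost per player over a season
season_cost_per_player = COST_PER_PLAYER_PER_GAME * GAMES_PER_SEASON

print(f"extra cost per game: ${extra_cost_per_game}")
print(f"implied players per game: {implied_players_per_game:.0f}")
print(f"extra cost per player per season: ${season_cost_per_player}")
```

The season total of $24 per player is consistent with the "roughly $25" quoted above.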
In some leagues, including many serving low-income families, the league sponsor or others might have to absorb the cost.
Is the extra cost worthwhile? The likelihood of injury to youth soccer players over the course of a single season is high. In one large study of youth players, the average player had 1.2 injuries per year requiring absence from the field for at least a week (Junge et al., 2002). Our study’s results suggest there will be fewer fouls in games with three referees instead of one referee, but we did not observe a significant reduction in injuries. Weighing cost against our initial results, which indicate reduced aggressive fouls but no influence on injury outcomes, the benefits of adding referees to the field might not outweigh the financial impact on parents, players, and leagues. Given our findings concerning fouls, however, and the fact that aggressive fouls sometimes lead to player injuries, we recommend additional research to help us better understand the effect of additional referees on youth players’ behavior, and especially on risk-taking that could and does lead to injury. Qualitative interviews or focus groups with players, coaches, parents, and referees might garner a different perspective on the influence of additional referees on play and injuries. Efforts to engage referees, study individual differences in referee behavior, and study effects in varying circumstances (e.g., player skill level, field size, older and younger children, length of season) are also recommended.
Like all research, our study had limitations. First, we limited our research to a narrow age range of players in a single geographic area. Generalizability to other geographic locations seems likely but is untested. Generalizability to other age groups also seems likely but would require careful empirical testing to verify. Second, we conducted our videotape coding based on a single video taken by trained but amateur research assistants.
The quality of video was adequate and inter-rater reliability was established, but our tapes were not professional quality and occasional codable events may have been missed due to camera angles and/or videographer error. Third, the coders were necessarily aware of the number of referees on the field and were not masked to the study hypotheses. This knowledge may introduce bias into the study results. Fourth, our outcome measures were captured through a rigorous behavioral coding process, but like any scientific assessment they are imperfect. Our assessment of collisions, for example, incorporated “forceful” collisions between two players, a label that some might critique as vague and subjective. We established inter-rater agreement in coding collisions and all other outcome measures, but inevitably some degree of coder subjectivity was incorporated into assessment of outcome constructs. Last, our randomization of games to have one referee first or three referees first was conducted by the referee scheduling group using pseudo-random assignments that were initially driven by a computerized random process but sometimes adjusted due to the complex logistics of assigning dozens of referees each weekend to hundreds of youth soccer matches across a large geographic area in the community. In summary, we found that placement of three referees during youth soccer matches, rather than just one, created a match with significantly fewer aggressive fouls by the players, both as measured through referees’ calls on the field and through our review of videotaped games. We did not witness significant reductions in player-to-player collisions or in actual injury events through the addition of two referees on the field. The results offer initial evidence replicating previous findings in other settings that more intense supervision by authority figures may create a safer environment for children. 
Continued research on the role and influence of referees in youth soccer games, and more generally in youth sport, may lead to a low-cost strategy to reduce child injury risk in athletic competitions but still encourage health-promoting engagement in active athletic pursuits. Acknowledgments We thank Jenni Rouse, Pedram Rastegar, Anna Johnston, and the UAB Youth Safety Lab for their support of this research. We thank Zac Crawford, John Clemons, Joe Gallagher, Emma Greenwood, and Pat Byington for their immense help and dedication in facilitating our work with the local soccer clubs and referees, and we thank the clubs, the parents and youth players, and the referees for their assistance. Communication regarding this article may be directed to schwebel@uab.edu. Funding Research reported in this publication was supported by the Eunice Kennedy Shriver National Institute of Child Health & Human Development of the National Institutes of Health under Award Number R21HD089887. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. This trial is registered at clinicaltrials.gov (NCT04266925). Conflicts of interest: None declared. References Barton B. K., Schwebel D. C. (2007). The roles of age, gender, inhibitory control, and parental supervision in children’s pedestrian safety. Journal of Pediatric Psychology, 32, 517–526. Belechri M., Petridou E., Kedikoglou S., Trichopoulos D., & Sports Injuries European Union Group. (2001). Sports injuries among children in six European Union countries. European Journal of Epidemiology, 17, 1005–1012. Brito J., Malina R. M., Seabra A., Massada J., Soares J., Krustrup P., Rebelo A. (2012). Injuries in Portuguese youth soccer players during training and match play. Journal of Athletic Training, 47, 191–197.
Brown B. W. (1980). The crossover experiment for clinical trials. Biometrics, 36, 69–79. Chelvakumar G., Sheehan K., Hill A. L., Lowe D., Mandich N., Schwebel D. C. (2010). An evaluation of the Stamp-in-Safety program, an intervention to promote safer playground behavior in children. Injury Prevention, 16, 352–354. Chorney J. M., McMurtry C. M., Chambers C. T., Bakeman R. (2015). Developing and modifying behavioral coding schemes in pediatric psychology: A practical guide. Journal of Pediatric Psychology, 40, 154–164. Cicchetti D. V. (1994). Guidelines, criteria, and rules of thumb for evaluating normed and standardized assessment instruments in psychology. Psychological Assessment, 6, 284–290. Committee on Sports Medicine and Fitness. (2000). Injuries in youth soccer: A subject review. Pediatrics, 105, 659–661. Drape J. (2018, July 14). Youth soccer participation has fallen significantly in America. New York Times. https://www.nytimes.com/2018/07/14/sports/world-cup/soccer-youth-decline.html Retrieved 3 July 2020. Fridman L., Fraser-Thomas J. L., McFaull S. R., Macpherson A. K. (2013). Epidemiology of sports-related injuries in children and youth presenting to Canadian emergency departments from 2007-2010. BMC Sports Science, Medicine, and Rehabilitation, 5, 30. Hallgren K. A. (2012). Computing inter-rater reliability for observational data: An overview and tutorial. Tutorials in Quantitative Methods for Psychology, 8, 23–34. Junge A., Dvorak J. (2004). Soccer injuries. Sports Medicine, 34, 929–938.
Junge A., Rösch D., Peterson L., Graf-Baumann T., Dvorak J. (2002). Prevention of soccer injuries: A prospective intervention study in youth amateur players. The American Journal of Sports Medicine, 30, 652–659. Khodaee M., Currie D. W., Asif I. M., Comstock R. D. (2017). Nine-year study of US high school soccer injuries: Data from a national sports injury surveillance programme. British Journal of Sports Medicine, 51, 185–193. Laker S. R. (2011). Epidemiology of concussion and mild traumatic brain injury. PM&R: The Journal of Injury, Function and Rehabilitation, 10, S354–S358. Mitchell J., Graham W., Best T. M., Collins C., Currie D. W., Comstock R. D., Flanigan D. C. (2016). Epidemiology of meniscal injuries in US high school athletes between 2007 and 2013. Knee Surgery, Sports Traumatology, Arthroscopy, 24, 715–722. Pfister T., Pfister K., Hagel B., Ghali W. A., Ronksley P. E. (2016). The incidence of concussion in youth sports: A systematic review and meta-analysis. British Journal of Sports Medicine, 50, 292–297. Rapids Youth Soccer. (2016). US youth soccer statistics. http://rapidsyouthsoccer.org/us-youth-soccer-player-statistics/ Retrieved 3 July 2020. Schwebel D. C., Bounds M. L. (2003). The role of parents and temperament on children’s estimation of physical ability: Links to unintentional injury prevention. Journal of Pediatric Psychology, 28, 505–516. Schwebel D. C., Brezausek C. M. (2014). Child development and pediatric sports/recreation injuries: Injury by year of age. Journal of Athletic Training, 49, 780–785.
Schwebel D. C., Jones H. N., Holder E., Marciani F. (2011). The influence of simulated drowning audits on lifeguard surveillance and swimmer risk-taking behaviors at public swimming pools. International Journal of Aquatic Research and Education, 5, 210–218. Schwebel D. C., Lindsay S., Simpson J. (2007). Brief report: A brief intervention to improve lifeguard surveillance at a public swimming pool. Journal of Pediatric Psychology, 32, 862–868. Schwebel D. C., McDaniel M., Banaszek M. M. (2006a). Ecology of player-to-player contact in boys’ youth soccer play. Journal of Safety Research, 37, 507–510. Schwebel D. C., Pennefather J., Marquez B., Marquez J. (2015). Internet-based training to improve preschool playground safety: Evaluation of the Stamp-in-Safety program. Health Education Journal, 74, 37–45. Schwebel D. C., Summerlin A. L., Bounds M. L., Morrongiello B. A. (2006b). The Stamp-in-Safety program: A behavioral intervention to reduce behaviors that can lead to unintentional playground injury in a preschool setting. Journal of Pediatric Psychology, 31, 152–162. Wiersma L. D., Sherman C. P. (2005). Volunteer youth sport coaches’ perspectives of coaching education/certification and parental codes of conduct. Research Quarterly for Exercise and Sport, 76, 324–338. Willan A. R., Pater J. L. (1986). Carryover and the two-period crossover clinical trial. Biometrics, 42, 593–599. Yang Y. T., Baugh C. M. (2016). US youth soccer concussion policy: Heading in the right direction.
JAMA Pediatrics, 170, 413–414. © The Author(s) 2020. Published by Oxford University Press on behalf of the Society of Pediatric Psychology. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model) TI - Injuries on the Youth Soccer (Football) Field: Do Additional Referees Reduce Risk? Randomized Crossover Trial JF - Journal of Pediatric Psychology DO - 10.1093/jpepsy/jsaa050 DA - 2019-09-11 UR - https://www.deepdyve.com/lp/oxford-university-press/injuries-on-the-youth-soccer-football-field-do-additional-referees-e6JxSa8VK9 DP - DeepDyve ER -