Citizen Satisfaction and the Kaleidoscope of Government Performance: How Multiple Stakeholders See Government Performance

Abstract

Performance assessment is a central issue for modern governments; however, little attention has been paid to the similarities and differences among various performance indicators. This study investigates how different performance assessments relate to each other by incorporating multiple stakeholders’ perspectives on performance at the individual level. Combining three different surveys and archival data on secondary education, we analyze how academic performance indicators are associated with service users’ (parents’ and students’) and service providers’ (teachers’) judgments of school quality. Our findings suggest that parents, students, and teachers provide similar assessments of school performance, and these assessments reflect the actual quality of the schools. Their evaluations are more closely aligned to archival performance indicators in high-performing schools than in low-performing schools. In addition to the convergent validity of the various performance measures, we also find indirect evidence that the perceptual measures have discriminant validity relative to archival measures. The consistency of performance indicators in a centralized regime (South Korea) also contributes to the generalizability of existing theory.

Introduction

Governments worldwide are adopting performance appraisal systems to evaluate how well programs meet citizens’ needs, but such systems assume that reliable and valid measures of performance exist. Despite the need for quality performance indicators, there are no universally accepted performance measures in the public sector (Andrews, Boyne, and Walker 2006; Rainey 2009). Consequently, various indicators, such as administrative performance records, citizen satisfaction, and employee assessments, have been employed to evaluate public service quality.
Even with the wide use of these measures in academic and practical settings, the relationship between different performance indicators has only been examined sporadically and in a few governance contexts. Previous studies testing whether and how different performance indicators relate to each other have shown mixed results (Kelly and Swindell 2002). Some research demonstrates positive correlations among performance indicators such as administrative records, citizen evaluations, and employee assessments (e.g., Charbonneau and Van Ryzin 2012; Favero and Meier 2013; Van Ryzin, Immerwahr, and Altman 2008) while other findings suggest a weak relationship (e.g., Brown and Coulter 1983; Kelly 2003; Meier et al. 2015). The mixed results may be partly due to the failure to consider multiple stakeholders or due to different levels of aggregation in performance perceptions. The former possibility is inherent in the nature of public organizations with multiple stakeholders who hold different perspectives on what good performance is (Boyne 2003b; Moynihan et al. 2011; Walker, Boyne, and Brewer 2010). Although performance assessment in the public sector highly depends on whose opinions are reflected in the evaluation process, the role of different stakeholders in performance assessment is seldom discussed (Andersen, Boesen, and Pedersen 2016). Regarding aggregation, previous literature has largely adopted aggregate levels of citizen satisfaction as measures of citizens’ perceived assessments of service quality even though aggregated measures can lead to aggregation bias (Ringquist 2005) and can disguise substantial dissatisfaction by some citizens. A third research gap in the performance literature is the lack of consideration of various institutional settings. 
Most empirical assessments take place in decentralized delivery systems such as the United States or the United Kingdom, despite the fact that the definition and measurement of performance can vary substantially across countries. The heavy reliance on decentralized systems in the literature raises concerns about generalizability and indicates the need for more variety in empirical contexts. This article systematically examines and theoretically discusses the relationship among different performance indicators in a centralized regime. We do this by analyzing how archival performance indicators are related to perceptual performance assessments in South Korea. This article contributes to the existing literature in three ways. First, given the importance of multiple stakeholders in public service, we incorporate opinions from both service users (parents and students) and service providers (teachers) and show how they correspond with archival performance (student test scores) to generate a more complete picture of how various performance assessments are related. Second, this article tests the theory both at the individual level and the aggregate level. Separating individual and aggregate level performance makes a theoretical contribution to the citizen satisfaction literature, refining the expectation-disconfirmation model (EDM) by allowing us to examine (1) which level (individual versus aggregate) of performance matters more for citizen satisfaction and (2) whether one level of performance interacts with the other level of performance. This distinction has implications for Tiebout’s (1956) “vote with one’s feet” models since they assume individual benefits matter but collective ones do not. Third, we examine performance indicators in a centralized system, using a longitudinal education dataset from Seoul City (the capital city of South Korea).
This centralized education system with limited school choice provides an ideal case to determine whether the conclusions drawn from decentralized settings are generalizable to other contexts. The strong regulation of student school assignments in Seoul City also offers a unique opportunity to study citizen satisfaction with public services when the exit option is constrained (Hirschman 1970). The analysis combines archival data on students’ academic achievement and three different surveys on school quality over several years (2010–2015) with data for more than 45,000 parents and students, and 12,000 teachers.

Conceptualizing Government Performance

Government performance is a key concern of public administration research (Boyne 2003a); “virtually all of management and organization theory is concerned with performance and effectiveness, at least implicitly” (Rainey 2009, 145). Despite this salience, scholars and practitioners face challenges in evaluating government performance because the concept of performance itself can be unclear or ambiguous (Andersen, Boesen, and Pedersen 2016). Government performance includes multiple dimensions as viewed by diverse stakeholders (Radin 2006); therefore, finding valid performance measures is difficult (Rainey 2009). The nature of performance evaluation generates challenges to existing research in the area. First, research often does not incorporate the multiple values in public organizations and generally omits process aspects such as accountability, responsiveness, and probity (Boyne 2002). Second, the role of stakeholders is also neglected despite their significant influence in defining good performance (Amirkhanyan, Kim, and Lambright 2014). Third, the lack of information about the relevant units of analysis reduces comparability between different performance studies (Andersen, Boesen, and Pedersen 2016).
At times scholars have improved the performance framework by incorporating different performance standards (e.g., the competing values model) (Quinn and Rohrbaugh 1983); however, the different perspectives of multiple stakeholders and the unit of analysis are still largely ignored in performance research.

Multiple Stakeholders

A central question for performance criteria is “who actually decides what good performance is” (Andersen, Boesen, and Pedersen 2016, 855)? Public organizations have multiple stakeholders holding diverse views on what constitutes good performance (Boyne 2003b; Walker et al. 2010), but the role of stakeholders in assessing performance is often given insufficient weight (Amirkhanyan et al. 2014; Andersen, Boesen, and Pedersen 2016). This lack of attention is problematic because neglecting the role of some stakeholders in the performance assessment process can generate a narrow view of how well governments do (Moynihan et al. 2011). Stakeholders refer to “any group or individual who can affect or is affected by the achievement of the organization’s objectives” (Freeman 1984, 46). In the example of public schools, students, parents, and teachers are key stakeholders since they affect and are affected by school performance. Each stakeholder group can have different perspectives on the appropriate school performance criteria or weigh individual criteria differently, similar to the different views one gets with a simple turn of a kaleidoscope. Parents, for example, may view standardized test scores as the most important educational performance indicators while teachers could think that a well-rounded education is the main goal of schools. The multiple constituency model of performance highlights that organizational performance is judged by both internal and external groups who hold different views on performance (Cameron 1986; Connolly, Conlon, and Deutsch 1980).
Internal performance assessments often diverge from external assessments because external stakeholders might have incentives to underestimate performance while internal stakeholders tend to overestimate performance (Andersen, Heinesen, and Pedersen 2016). Little research, however, examines whether stakeholders share common ground for performance assessment or hold different views on performance.

Unit of Analysis

In addition to the question of whose perspective is being assessed, “what level of analysis is being used” is also a critical question in evaluating performance (Cameron 1986, 542). Andersen, Boesen, and Pedersen (2016, 857) argue that studies that use different units of analysis (individuals, groups, organizations, or programs) may not be directly comparable. In the example of educational performance, they contend that district level performance (Meier and O’Toole 2001), school level performance (Andersen and Mortensen 2010), and individual level performance (Andersen, Heinesen, and Pedersen 2016) may reflect fundamentally different criteria. The conscious choice of the unit of analysis, therefore, is critical for performance research. For investigating individual level variables such as satisfaction or motivation, the individual level analysis is better because “much information is lost if individual scores are aggregated at the organizational level” (Andersen, Boesen, and Pedersen 2016, 858). Previous research, however, mostly relies on aggregate (organizational or community) level analysis due to the lack of data availability. Identifying the unit of analysis is important not only in understanding performance assessment itself but also in investigating the relationship between individual and aggregate level performance. An individual stakeholder can assess organizational performance based on how it affects the individual or on how it affects everyone on average (aggregated).
Such assessments might also interact such that individual assessments become based on how the individual values his or her own benefits relative to the benefits of all others.

Assessing Government Performance

Scholars frequently divide performance indicators into two types—archival measures and perceptual measures (Boyne et al. 2006; Brown and Coulter 1983; Kelly and Swindell 2002; Parks 1984).1 The distinction is based on “the degree to which a performance criterion concerns interior experiences and perceptions versus exterior, observable phenomena” (Andersen, Boesen, and Pedersen 2016, 856). Archival measures include administrative records of performance and official indicators of performance such as the US Program Assessment Rating Tool (PART). These measures are often highly valued due to being independent from perceptual judgments (Andrews et al. 2006). Archival performance measures, however, have their own limitations, and many scholars criticize them. First, archival measures do not exist in many policy areas (Wall et al. 2004) because program outputs are often highly complex and the state of measurement lags well behind. Second, archival performance measures may focus on easily quantified service aspects and ignore what really matters to citizens (Andrews et al. 2006; Brudney and England 1982). Third, these measures may not work well in the fragmented and pluralistic political environments of public organizations (Radin 2006) where individual stakeholders have vastly different expectations for public programs. Overall, these criticisms mean that archival performance indicators may not accurately reflect the quality of public services delivered to citizens (Amirkhanyan et al. 2014; Moynihan 2008; Wall et al. 2004). Other efforts to measure performance focus on perceptual indicators such as stakeholders’ judgments of performance (Andrews et al. 2006; Brewer 2006; Selden and Sowa 2004; Walker and Boyne 2006).
Citizen satisfaction with public services is a popular performance measure, and public employees’ perceptions about performance are also frequently employed. Perceptual performance measures have been widely used because finding consensus performance indicators is especially challenging for public organizations (Rainey 2009), and they can capture important elements of performance that archival measures neglect. Perceptual measures also allow for easier comparisons across policy or country contexts. Still, skepticism exists about the validity of perceptual performance indicators. Some scholars question the validity of citizens’ perceptions of government performance because citizens often have limited ability to evaluate the quality of services that do not directly affect them (Andrews et al. 2006; Stipak 1979, 1980). Perceptual measures can also be vulnerable to bias due to nonrandom measurement errors (Boyne et al. 2006). Self-reported performance can be especially problematic when internal organizational stakeholders assess their own performance because they tend to overestimate their performance (Meier and O’Toole 2013a). This positivity bias or social desirability bias (Meier and O’Toole 2013b) can lead to problems of common source bias when other variables are drawn from the same source as the evaluations (Amirkhanyan et al. 2014; Brewer 2006; Favero and Bullock 2015). Comparing archival and perceptual measures raises questions of convergent and discriminant measurement validity (Campbell and Fiske 1959). Convergent validity is established by demonstrating that two or more different measures of the same construct share common variation, while discriminant validity is obtained by showing that different measurements of the same construct show unique variation (Schwab 2005).
In measuring school performance, for example, if different performance indicators from different sources—such as student academic achievement and teachers’ evaluations of schools—show strong correlations, it indicates convergent validity. A lack of correlation may indicate either that one of the measures lacks validity or that the measures have discriminant validity, that is, they are picking up different aspects of performance. Discriminant validity is especially important if perceptual measures tap aspects of service quality that are not measured by the official performance indicators. Through the lenses of convergent and discriminant validity, this study examines whether archival performance indicators and perceptual performance assessments from multiple stakeholders have similarities (convergent validity) or differences (discriminant validity). Figure 1 depicts the expected relationship among academic achievement and evaluations from parents, students, and teachers. Academic achievement can influence opinions of all stakeholders, but the degree of the influence may vary across groups. Academic achievement can capture some characteristics that matter to all groups and other characteristics that matter only for certain groups. Each group also can consider factors that are not captured by academic achievement when they evaluate school quality. We expect that evaluations from different groups can have both convergent and discriminant validity.

Figure 1. The Relationship among Various Sources of Evaluations

Citizen Evaluations and Performance

How do citizens evaluate government performance? Do their assessments correlate with official indicators of performance? These questions relate directly to issues of democracy and governance.
To the extent that citizen assessments of performance match official performance criteria, citizens are more likely to support government policy, and overall policy is likely to represent the democratic goal of responsiveness to citizen preferences. Although high-quality services are assumed to promote citizen satisfaction, the existing evidence on the relationship between citizen assessments of public service and evidence of service quality improvements is mixed (Kelly 2005; Kelly and Swindell 2002). The EDM has been the predominant theory in explaining citizen satisfaction with public services (e.g., Jacobsen, Snyder, and Saultz 2015; James 2009; Morgeson 2013; Petrovsky, Mok, and León‐Cázares 2017; Poister and Thomas 2011; Van Ryzin 2004, 2006, 2013). The EDM suggests that citizen satisfaction is a reflection of the gap between prior expectations and the subsequently experienced service quality. When perceived service quality falls short of expectations, the negative gap (negative disconfirmation) leads to dissatisfaction, whereas when perceived performance exceeds expectations, the positive gap (positive disconfirmation) leads to satisfaction (Jacobsen et al. 2015; Morgeson 2013; Van Ryzin 2006, 2013). This model is useful both for understanding “citizens’ judgments of government performance more generally” and for explaining “the difference between perceived performance and more objectively measured performance” (Van Ryzin 2013, 598). EDM studies have found that the actual quality of public service has a positive influence on citizen satisfaction (e.g., Poister and Thomas 2011), supporting the convergent validity of these indicators. Other empirical work also finds that citizens can evaluate some services accurately, and their assessments of performance are positively correlated with official measures of performance (Licari, McLean, and Rice 2005; Van Ryzin et al. 2008).
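The core comparison in the EDM is a difference between perceived and expected quality. The following is a minimal illustrative sketch of that logic; the linear functional form, variable names, and numeric values are our own assumptions, not drawn from the EDM literature or this study's data.

```python
# Illustrative sketch of the expectation-disconfirmation model (EDM).
# The linear form and all values here are assumptions for exposition.

def disconfirmation(expected_quality: float, perceived_quality: float) -> float:
    """Positive values mean performance exceeded expectations."""
    return perceived_quality - expected_quality

def satisfaction(baseline: float, disconf: float, weight: float = 1.0) -> float:
    """Satisfaction rises with positive disconfirmation, falls with negative."""
    return baseline + weight * disconf

# A citizen who expected a 4 but experienced a 3 (negative disconfirmation)
low = satisfaction(baseline=3.0, disconf=disconfirmation(4.0, 3.0))

# A citizen who expected a 3 but experienced a 4 (positive disconfirmation)
high = satisfaction(baseline=3.0, disconf=disconfirmation(3.0, 4.0))
```

In survey applications, disconfirmation is often measured directly from respondents rather than computed as a difference; the sketch only captures the model's core comparison, not any particular estimation strategy.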
In education policy, Favero and Meier (2013) use New York City school data to examine how different performance indicators are related to each other and find that parents and teachers are in considerable agreement with official school performance data (see also Charbonneau and Van Ryzin 2012). A consistent correlation between citizen judgments and a street cleanliness scorecard in New York City also supports that citizen evaluations of government performance are related to the archival measure of performance (Van Ryzin et al. 2008). In these cases, the performance measures appear to have convergent validity. Based on both theoretical expectations and empirical evidence, we hypothesize that standardized test scores (an archival measure) will be positively and significantly related to both parents’ and students’ assessments of school performance (perceptual measures).

Hypothesis 1: Parents will give more favorable evaluations to schools that perform well on academic achievement tests.

Hypothesis 2: Students will give more favorable evaluations to schools that perform well on academic achievement tests.

Employee Evaluations and Performance

Internal stakeholders, such as employees and managers, have unique advantages in assessing performance because they gain substantial information about their programs during implementation (Andrews et al. 2006). Unlike external stakeholders who can only observe current service outcomes, internal stakeholders can access the service production process, organizational capacity, or the potential of improving the service quality. They may consider these things when assessing performance, not just the current impact of services. Internal stakeholders have the potential, therefore, to understand the service delivery process better and evaluate performance more accurately than outsiders. Regarding employee perceptions and performance assessment, recent studies provide mixed results.
In New York City schools, Favero and Meier (2013) demonstrate that teachers’ perceptions of school quality are positively related to student performance, progress report scores, and quality reviews, and are also negatively associated with a school violence index. By contrast, in Danish schools, Andersen, Heinesen, and Pedersen (2016) find that teachers’ perceptual variables such as job satisfaction and intrinsic motivation are not significantly related to an archival performance indicator (written exam results).2 In terms of managers’ performance assessments, recent public management literature criticizes the misuse of these perceptual measures by showing that managers’ self-assessments of performance do not correlate well with archival performance measures (Andrews et al. 2006; Meier et al. 2015). A weak tie between self-assessment of performance and archival performance indicators is found regardless of whether it is measured at the top level (Meier and O’Toole 2013a) or at middle management levels (Meier et al. 2015). Despite these limitations, many studies still rely on internal stakeholders’ evaluations to measure organizational performance and often find that perceptual measures are correlated with archival measures (Wall et al. 2004; Walker and Boyne 2006). Andersen, Heinesen, and Pedersen (2016) argue that both institutional theory and the sociology of professions can explain why performance measures tend to produce consistent results. Teachers, for example, share professional norms developed through a combination of education, peer socialization, and formal rules and procedures backed by regulative institutions. They assess performance based on indicators that are “institutionalized through professional norms,” and this process generates consistency in performance assessments (Andersen, Heinesen, and Pedersen 2016, 75–6).
Based on these theoretical expectations, we argue that teachers’ assessments of performance will be positively associated with archival performance indicators.

Hypothesis 3: Teachers will give more favorable evaluations to schools that perform well on academic achievement tests.

Individual Level and Aggregate Level Performance

The citizen satisfaction literature argues that we should take into account both individual and aggregate level (jurisdiction, community, or organization) factors simultaneously to understand citizen satisfaction because the effects of a certain variable can differ depending on the level of analysis (Lyons, Lowery, and DeHoog 1992). Most studies, however, simply ask about citizen satisfaction without determining whether the citizen responds to the overall quality of service in the community or the quality of services the individual receives. Citizens might respond to either. This concern is linked to the issue of the unit of analysis in performance research. For parents, the performance of their own children and overall school quality can be separate issues. The two different levels of performance may affect parent satisfaction independently, or they may interact with each other and produce additional effects on parent satisfaction. The separation of one level of performance from another provides a unique theoretical contribution to the citizen satisfaction literature by identifying the differential effects of individual and aggregate level service quality on citizens’ satisfaction. Barrows et al. (2016) suggest that citizens’ performance assessments can be influenced by how local service quality ranks relative to that of different reference groups. In the context of American schools, they find that comparing the academic performance of local schools to school performance at the state, national, or international level changes citizens’ average evaluations of local school quality.
These learning effects occur when citizens who initially overestimate or underestimate the schools in their community change their evaluations when they learn how other schools perform. A relative comparison between one’s own and others’ performance, therefore, can influence citizens’ evaluations of performance. Based on the previous discussion, we argue that aggregate level performance can influence the link between individual level performance and citizen satisfaction. Specifically, we expect that parents with children in high-performing schools and poor-performing schools will have different expectations for the quality of services; therefore, their evaluations of school quality will vary because they use different standards (Lyons, Lowery, and DeHoog 1992). In high-performing schools, for instance, parents may raise expectations for their children’s academic achievement because the other students in the same school are doing well. Parents in high-performing schools might also place a higher priority on education; therefore, they would be more satisfied with the school when their children do well on the exams. The positive effects of student performance on parent satisfaction, therefore, can be greater for students in high-performing schools because they excel against high standards. Similar logic applies to students. Students in high-performing schools are likely to evaluate their own education more positively when they achieve higher performance in a school where others also perform well. Through the socializing effect of parents or peers who place higher priority on academic achievement, students in high-performing schools might adjust their own performance expectations. High test scores, therefore, can be more closely aligned to the perceptual assessments of students in high-performing schools.3

Hypothesis 4: The positive effects of student performance on parents’ evaluations will be greater in high-performing schools.
Hypothesis 5: The positive effects of student performance on students’ evaluations will be greater in high-performing schools.

The Institutional Setting

The distinction between unified and fragmented governmental structures is important to understanding citizen satisfaction and its links to democracy (Lyons, Lowery, and DeHoog 1992). Theoretical work suggests that citizen preferences for public goods will be more adequately reflected in localized and fragmented systems because competing jurisdictions can provide different arrays of services to satisfy citizens’ preferences (Tiebout 1956). Archival performance measures in this context should more closely reflect the opinions of citizens; consequently, the positive correlations between the archival measures and citizen assessments should increase. In contrast, centralized governments often implement a one-size-fits-all policy that is less likely to be customized to the preferences and needs of individual citizens. In this case, official performance indicators set by the central government might not capture the performance dimensions that citizens value, generating a gap between official performance records and the citizens’ perceived service quality. To determine how generalizable the existing research on citizen satisfaction is, our study is set in a national context significantly different from the United States: South Korea. US school districts are autonomous local governments with their own chief executive officers (superintendents) and elected school boards that can affect important education policies. Parents and students are encouraged to participate in school decision-making and can influence educational policies by providing signals through their choice of schools (either among public schools or by opting to attend charter or private schools).
Under this decentralized system with local control, schools have incentives to consider the preferences of parents and students and focus on the performance indicators that constituents care about. Parental assessments of educational services in this context should more accurately correspond to administrative school performance measures. Empirical evidence on the convergent validity of school performance in a decentralized setting (e.g., New York City) supports this argument (Charbonneau and Van Ryzin 2012; Favero and Meier 2013). Whether these results can be generalized to a centralized system is an open question. Korean schools provide a unique opportunity to test the theory due to the country’s highly centralized education system (Mintrom and Walley 2013). Under a strict nationwide education policy, national governmental agencies (e.g., the Ministry of Education) have strong authority over school operations and regulate the educational curriculum, resource allocation, and teacher assignment. Students are randomly assigned to schools within their school districts based on a computer lottery system.4 Parents’ and students’ preferences are considered to a certain extent through the application processes, but there is no guarantee that students can attend the schools they want. The random assignment of students applies to both public and private schools, and students are not allowed to transfer to another school within the same school district. This system, to some extent, restricts the premises of the Tiebout (1956) model, where citizens can choose services to maximize their preferences from a differentiated set of local governments.5 If Tiebout choice processes are restrained, one would expect any correlations between citizen satisfaction and performance indicators to decline. Limited school choice and centralized policymaking may also dampen the motivation of schools to encourage parents and students to participate in school policy processes.
The gap between archival indicators and citizen perceptions, as a result, might be larger in a centralized system than in a decentralized one. At the same time, the Korean school system places a high priority on student performance on standardized tests (Mintrom and Walley 2013, 262; Seth 2002). Unlike the United States, Korea gives a national exam that all students take. Performance on these high-stakes exams determines the future education and occupational opportunities available to the students. This system is reinforced by the Korean national policy of heavy investment in human capital, where about 70% of high school graduates go on to higher education and there is a clear hierarchical ranking of Korean universities (Seth 2002). The evidence from this context, therefore, can contribute to building a more generalizable theory of performance assessment.

Data and Methods

The Seoul Education Longitudinal Study (SELS) of South Korea provides the data for this study. From 2010 to 2015, the Seoul Metropolitan Office of Education surveyed parents, students, teachers, and principals to establish better education policies. The SELS also includes archival data on various school characteristics such as school type, school ownership, school resources, and school environment as well as standardized test scores for individual students. Unique school identifiers and student identifiers allow the survey participants to be linked to one another. The data collection was designed to be representative of the Seoul City school population using a stratified two-stage sampling technique. In the first stage, the number of schools was determined based on the total number of schools in each school district. In the next stage, schools were randomly selected, and then students were randomly selected from each school. The surveys have high response rates for both middle and high school students (ranging from 72.7% to 95.7%).
Dropping cases with missing values yields a sample over the 6 years that includes 45,344 parents, 45,893 students, and 12,583 teachers. An advantage of the SELS data is that they include assessments by three different stakeholders as well as archival performance indicators. This allows us to incorporate opinions from multiple stakeholders and test for differences or similarities among them (in contrast to studies with only one or two stakeholders). Another benefit of these data is that all personal information is collected at the individual level. In addition to reducing aggregation bias, the individual level analysis permits us to determine whether students and parents rate the quality of schools based on their own (or their own child's) performance or the average performance of all students. The unit of analysis is the individual for all models. This study uses regression analysis with fixed effects for years and standard errors clustered at the school level to address potential serial correlation and heteroscedasticity.

Measures

Perceptual Performance Measures

To measure perceived school quality by citizens and employees, this article adopts three different dependent variables: parents' evaluations, students' evaluations, and teachers' evaluations. The parent and student survey questions ask about various aspects of the school, including overall satisfaction with the school, student learning, course variety, career consultation, educational facilities, school safety, etc. For each question, parents and students answer on a five-point scale (from strongly disagree to strongly agree). All survey items loaded strongly onto a single factor, producing an eigenvalue of 4.69 in the parent survey (Cronbach's alpha = 0.90) and an eigenvalue of 4.21 in the student survey (Cronbach's alpha = 0.89), indicating strong internal consistency.
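Internal-consistency statistics of this kind are straightforward to compute. The sketch below, using simulated Likert-style items rather than the actual SELS survey data, shows how Cronbach's alpha is obtained from an item-by-respondent matrix; the simulated item structure is purely illustrative.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, k_items) matrix.

    alpha = k/(k-1) * (1 - sum of item variances / variance of total score)
    """
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)       # per-item sample variances
    total_var = items.sum(axis=1).var(ddof=1)   # variance of the summed scale
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Simulated 5-point responses driven by one latent factor, loosely mimicking
# (not reproducing) an eight-item parent-survey scale.
rng = np.random.default_rng(0)
latent = rng.normal(size=(1000, 1))
items = np.clip(np.round(3 + latent + rng.normal(scale=0.8, size=(1000, 8))), 1, 5)
alpha = cronbach_alpha(items)
```

Because all eight simulated items share the same latent factor, the resulting alpha is high, in the same neighborhood as the 0.89–0.90 values reported for the SELS scales.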
The teacher evaluation indicator is constructed with a separate factor analysis of questions asking teachers to evaluate the quality of the school, using a five-point scale. For the teacher model, all items loaded on a single factor with an eigenvalue of 3.55 (Cronbach's alpha = 0.82) (for details, see Table A1 in Appendix).

Archival Performance Measures

The main independent variable of theoretical interest is objective school performance as measured by student test scores. All students in the sample take standardized tests in Korean, English, and math every year, and we create a performance index based on these scores. Specifically, we calculate the average scores for Korean, English, and math, and then standardize the average scores for each student. The school level performance index is the school mean score. Although the education literature widely recognizes that education is more than the results of standardized tests, standardized tests are the key performance indicator for Korean schools. The Korean education system relies heavily on standardized tests to evaluate students, and such tests determine future student educational opportunities, including admission to higher education (Mintrom and Walley 2013).

Control Variables

School input variables and school characteristics are included in the models. Previous literature suggests that socioeconomic and demographic factors (e.g., income, education, age, and race) play an important role in shaping citizen satisfaction and preferences (Brown and Coulter 1983; Brown and Reed Benedict 2002). For socioeconomic variables, we include household income (logged) and parent education (average of both parents) at the individual level. Highly educated and experienced teachers are resources that schools can use to achieve high performance.
For each school, we measure average teacher education as the percentage of teachers with either a master's or a doctorate, average teacher experience as the average years of teaching experience, the percentage of full-time teachers, and class size (the student–teacher ratio). Various school characteristics such as the poverty level, school size, and school type can also play a significant role in shaping performance. The citizen satisfaction literature finds that citizens in low-income or predominantly ethnic minority neighborhoods are generally less satisfied with service quality (Rossi 1972). To capture group-level poverty effects, we include the number of students eligible for free (or reduced-price) lunch for each school (logged). We also control for the percentage of low-achievement students in each school using the percentage of low performers in Korean, English, and math. To capture the effect of school size, we include the total number of students (logged). Legal ownership of schools (public or private) and the type of school (middle, general high, and vocational high) are also controlled for in all models. The teacher model has additional controls for employee characteristics. Given that demographic characteristics play significant roles in shaping teacher satisfaction (Grissom, Nicholson-Crotty, and Keiser 2012), we control for the teacher's gender, age, and educational attainment at the individual level. Descriptive statistics and coding schemes are shown in Table A2 in Appendix.

Findings

Table 1 shows the bivariate correlations between parents', students', and teachers' evaluations and standardized test scores. Both service users' (parents' and students') and service providers' (teachers') evaluations have positive and significant correlations with test scores.
We interpret these relationships as moderate to strong given that stakeholders may have different definitions of performance (which may or may not include test scores) and the potential for unique variance at the individual level (versus the cancellation of errors at the aggregate level).

Table 1. Correlations among Different Performance Indicators

                         Academic     Parents'     Students'    Teachers'
                         Achievement  Evaluations  Evaluations  Evaluations
Academic achievement     1
Parents' evaluations     0.220*       1
Students' evaluations    0.184*       0.537*       1
Teachers' evaluations    0.181*       0.259*       0.319*       1

Note: School-level analysis. *p < .05.

The first two equations in table 2 test how citizen assessments of school quality are related to student test scores. Columns 1 and 2 present the results for parents and students, respectively. Consistent with theoretical expectations, student performance (individual student test scores) is positively and significantly related to parents' evaluations even with extensive controls.
In addition to the test score of their own child, parents' evaluations are also positively associated with school performance (the mean school test score). Interestingly, the magnitude of the coefficient for school performance is much larger than that for individual student performance. This result suggests that parents are more satisfied with schools not only when their own child performs well, but also when the school achieves high academic performance overall.

Table 2. The Relationship between Archival Performance Indicators and Perceptual Evaluations

                                      Citizen Evaluations                   Employee Evaluations
                                      Parents'          Students'           Teachers'
                                      Evaluations       Evaluations         Evaluations
Student performance                   0.068*** (0.007)  0.125*** (0.007)
School performance                    0.169*** (0.027)  0.172*** (0.029)    0.135*** (0.030)
Household income                      −0.012 (0.009)    0.002 (0.008)       −0.034 (0.055)
Parents education                     0.000 (0.008)     0.007 (0.007)       0.022 (0.035)
Average teacher education             0.004*** (0.001)  0.004*** (0.001)    0.003 (0.002)
Average teacher experience            −0.007* (0.003)   −0.009** (0.003)    −0.003 (0.004)
Full-time teachers                    0.003 (0.002)     −0.001 (0.002)      −0.002 (0.002)
Class size                            −0.028*** (0.005) −0.028*** (0.005)   −0.009 (0.009)
Low-achievement students              −0.021*** (0.004) −0.010** (0.003)    −0.034*** (0.005)
Free-lunch students                   −0.000 (0.010)    0.011 (0.011)       0.020 (0.017)
Number of students                    −0.131** (0.042)  −0.150*** (0.044)   −0.209*** (0.056)
Public school                         −0.217*** (0.031) −0.106*** (0.031)   0.098* (0.041)
School type: general high school      −0.060 (0.038)    −0.397*** (0.035)   −0.090+ (0.053)
School type: vocational high school   0.196** (0.063)   −0.083 (0.069)      −0.206* (0.083)
Teacher gender (female)                                                     0.113*** (0.024)
Teacher age                                                                 0.028*** (0.007)
Teacher education                                                           0.058** (0.018)
Adjusted R2                           0.076             0.158               0.055
N                                     45,344            45,893              12,583

Note: All equations control for dummy variables for individual years (not shown). Reference for school type is middle school. Standard errors are clustered at the school level and shown in parentheses. +p < .10, *p < .05, **p < .01, ***p < .001 (two-tailed test).

The student evaluation model in table 2 presents results similar to the parent model. Students give more favorable evaluations to their schools when they perform well on standardized tests and when their school achieves higher academic performance. The coefficient for school performance is larger than that for student performance, although the difference between them is smaller than in the parent model. Control variables in the parent and student models show similar patterns. Parents' and students' evaluations are positively associated with teachers' education levels but negatively related to larger class sizes, more low-achievement students, and larger schools. Parents and students in public schools are less satisfied with school quality than those in private schools. Parents with children in vocational high schools are more satisfied than middle school parents, and general high school students are the least satisfied.
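The estimation strategy behind Table 2 (OLS with school-clustered standard errors, year fixed effects omitted here for brevity) can be sketched in a few lines. The example below first builds the standardized performance index described in the Measures section and then computes Liang–Zeger cluster-robust standard errors by hand; the data and all variable names are simulated for illustration, not drawn from the SELS.

```python
import numpy as np

def ols_cluster_se(X, y, cluster):
    """OLS coefficients with cluster-robust (Liang-Zeger) standard errors."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    k = X.shape[1]
    meat = np.zeros((k, k))            # sum over clusters of score outer products
    for g in np.unique(cluster):
        m = cluster == g
        s = X[m].T @ resid[m]
        meat += np.outer(s, s)
    cov = XtX_inv @ meat @ XtX_inv     # sandwich estimator
    return beta, np.sqrt(np.diag(cov))

rng = np.random.default_rng(0)
n, n_schools = 3000, 60
school = rng.integers(0, n_schools, size=n)

# Performance index: average the 3 subject scores, standardize across students,
# and use the school mean as the school-level index (as described in Measures).
subjects = rng.normal(70, 10, size=(n, 3))          # Korean, English, math
avg = subjects.mean(axis=1)
student_perf = (avg - avg.mean()) / avg.std()
school_perf = np.array(
    [student_perf[school == g].mean() for g in range(n_schools)]
)[school]

# Simulated evaluation: depends on both indices plus school-correlated noise,
# which is what motivates clustering the standard errors at the school level.
y = (0.07 * student_perf + 0.17 * school_perf
     + 0.5 * rng.normal(size=n_schools)[school] + rng.normal(size=n))
X = np.column_stack([np.ones(n), student_perf, school_perf])
beta, se = ols_cluster_se(X, y, school)
```

Because the error term contains a school-level component, the clustered standard errors are the appropriate ones here; ignoring the clustering would typically understate the uncertainty of the school-level coefficient.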
Turning the kaleidoscope, the results for the teacher model are presented in column 3 of table 2. Similar to parents' assessments, test scores show a positive and significant coefficient in the teacher model. School type and individual employee characteristics show significant relationships with teachers' evaluations. Unlike service users (parents and students) in public schools, who are less satisfied than those in private schools, service providers (teachers) in public schools tend to be more satisfied with their schools. Teachers in high schools are more likely to have negative perceptions of their schools compared to middle school employees. Female teachers tend to give more favorable evaluations to schools than their male counterparts. Highly educated teachers and older teachers are more likely to have positive perceptions of their school quality compared to teachers with less education and younger teachers, respectively. The next analyses include a term representing the interaction of student level performance and school level performance to test hypotheses 4 and 5. Table 3 shows the results for both parents' evaluations and students' evaluations. For both parent and student models, we find that both archival performance variables and their interaction have positive and significant coefficients. This result indicates that individual student performance matters more when schools also achieve better overall performance. Table 3.
The Interactive Effects of Individual and Aggregate Level Performance on Citizen Evaluations

                                           Parents'           Students'
                                           Evaluations        Evaluations
Student performance                        0.064*** (0.007)   0.121*** (0.007)
School performance                         0.139*** (0.026)   0.151*** (0.029)
Student performance × School performance   0.052*** (0.015)   0.036* (0.016)
Household income                           −0.012 (0.009)     0.001 (0.008)
Parents education                          −0.000 (0.008)     0.007 (0.007)
Average teacher education                  0.004*** (0.001)   0.004*** (0.001)
Average teacher experience                 −0.007* (0.003)    −0.009** (0.003)
Full-time teachers                         0.003 (0.002)      −0.001 (0.002)
Class size                                 −0.027*** (0.005)  −0.027*** (0.005)
Low-achievement students                   −0.023*** (0.004)  −0.011*** (0.003)
Free-lunch students                        0.003 (0.010)      0.013 (0.011)
Number of students                         −0.127** (0.041)   −0.148*** (0.043)
Public school                              −0.215*** (0.030)  −0.105*** (0.030)
School type: general high school           −0.060 (0.037)     −0.397*** (0.034)
School type: vocational high school        0.158* (0.066)     −0.110 (0.071)
Adjusted R2                                0.077              0.158
N                                          45,344             45,893

Note: All equations control for dummy variables for individual years (not shown). Reference for school type is middle school. Standard errors are clustered at the school level and shown in parentheses. +p < .10, *p < .05, **p < .01, ***p < .001 (two-tailed test).

A more intuitive way of illustrating the interactive relationship is plotting predicted values.6 Figure 2 shows predicted parents' evaluations in a school at varying levels of student and school level performance with 95% confidence intervals. The solid line shows the relationship when school performance is high (two standard deviations above the mean), and the dashed line illustrates the relationship for low levels of school performance (two standard deviations below the mean). The slope of the solid line is positive, suggesting that student performance has a positive effect on parents' evaluations at high levels of school performance. The slope of the dashed line remains flat, however, indicating that student performance has little additional effect when schools perform poorly. Figure 3 illustrates the predicted effects of student performance on students' evaluations. It also suggests that student performance has a positive relationship with students' evaluations for schools with a high level of performance (the solid line). As the level of school performance decreases, the slope of the relationship is still positive but becomes relatively flat (the dashed line).
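The plotted slopes follow directly from the interaction specification: the marginal effect of student performance is b1 + b3 × school performance. The short sketch below plugs in the parent-model point estimates from Table 3 (standard errors are ignored, so this reproduces point estimates only, not the confidence intervals shown in the figures).

```python
# Parent-model point estimates from Table 3
b_student, b_interact = 0.064, 0.052

def marginal_effect(school_perf_sd: float) -> float:
    """Slope of parents' evaluations in student performance, at a school
    performance level measured in standard deviations from the mean."""
    return b_student + b_interact * school_perf_sd

high = marginal_effect(+2.0)   # solid line: high-performing schools
low = marginal_effect(-2.0)    # dashed line: low-performing schools
```

Two standard deviations above the mean, the implied slope is 0.168; two below, it is −0.040, essentially zero, which is why the dashed line appears flat.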
Figure 2. Predicted Effects of Student Performance on Parents' Evaluations

Figure 3. Predicted Effects of Student Performance on Students' Evaluations

Indications of Discriminant Validity

Thus far the analysis has shown that parents', students', and teachers' evaluations of the schools are positively correlated with the official performance measure (student test scores), even controlling for numerous variables that also affect test scores. The results provide support for the convergent validity of both citizen and employee evaluations of services. The perceptual assessments, however, are not perfectly correlated with the test score results. Although this could result from mere perceptual error, it is also possible that the assessments of parents, students, and teachers contain some common element that is separate from what is measured by student test scores. The education literature criticizing the use of standardized tests argues that they are poor measures of critical thinking, creativity, student well-being, or other factors that contribute positively to the education of a student (Linn 2001; Worthen 1993). These other factors could generate some commonality among the perceptual indicators that is independent of the test scores. Such commonality might indicate some discriminant validity for the various perceptual assessments. The first indicator of commonality among the perceptual assessments is to take the residuals from table 2 for each of the models and correlate them. Essentially this exercise asks whether what test scores and the other factors cannot explain has some commonalities across the indicators.
When this is done at the school level (the teacher model is at the school level only), the parent and student residuals are strongly correlated, and the teacher residuals are positively associated with both parent and student residuals (Table A3 in Appendix). A second way to demonstrate the linkage between the various indicators is to add the other perceptual indicators directly to the equations in table 2 to determine if they are positively related. This is a more difficult test than the residuals test since it has to overcome the collinearity among the perceptual indicators. When this is done, both students' evaluations and teachers' evaluations are positively related to parents' evaluations; the effect size for students' evaluations is much larger than that of teachers' evaluations. In the student model, both parents' evaluations and teachers' evaluations are positively related to the student measure, with the influence of parents being about three times larger. Finally, in the teacher model, both parents' and students' perceptual assessments are positively associated with teachers' evaluations, with parents' evaluations having greater influence. In all models, the significant effect of archival performance indicators remains the same even when we include perceptual evaluations from other stakeholders (Table A4 in Appendix). The third indicator of potential discriminant validity is to take advantage of the panel nature of the data set and run Granger (1969) causality analysis. Essentially, Granger causality asks whether prior values of one variable can predict current levels of another variable while controlling for past levels of that variable. The test is a null hypothesis test; that is, if past values of the first variable add no statistical improvement in prediction of the second variable, then one can conclude that the first variable is not causally related to the second.
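The Granger logic just described can be sketched as a nested-model F-test: regress the current value on its own lag (restricted model), then add the lag of the other variable (full model) and test whether the fit improves. The example below uses two simulated time series, not the SELS panel, and a single lag for simplicity.

```python
import numpy as np

def granger_f(y, x):
    """F-statistic for H0: lagged x adds nothing to predicting y
    beyond lagged y (one lag, single series)."""
    yt, y1, x1 = y[1:], y[:-1], x[:-1]
    n = len(yt)

    def rss(X):
        beta, *_ = np.linalg.lstsq(X, yt, rcond=None)
        r = yt - X @ beta
        return r @ r

    restricted = rss(np.column_stack([np.ones(n), y1]))
    full_X = np.column_stack([np.ones(n), y1, x1])
    full = rss(full_X)
    q, k = 1, full_X.shape[1]          # 1 restriction; k params in full model
    return (restricted - full) / q / (full / (n - k))

# Simulate a pair of series where x Granger-causes y but not vice versa.
rng = np.random.default_rng(4)
T = 500
x, y = np.zeros(T), np.zeros(T)
for t in range(1, T):
    x[t] = 0.5 * x[t - 1] + rng.normal()
    y[t] = 0.3 * y[t - 1] + 0.4 * x[t - 1] + rng.normal()

f_xy = granger_f(y, x)   # large: lagged x helps predict y
f_yx = granger_f(x, y)   # small: lagged y does not help predict x
```

In this one-directional simulation f_xy is large and f_yx small; the paper's finding is the reciprocal case, where the analogous statistics are large in both directions for each pair of perceptual measures.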
The Granger tests suggest that the three perceptual measures of school performance (parents, students, and teachers) are reciprocally related to each other, and strongly so (Table A5 in Appendix). In sum, three preliminary tests indicate that the perceptual assessments of educational quality by parents, students, and teachers share some commonalities that are separate from test scores and that these assessments feed back into each other. It is not the case that just one of the groups, such as teachers, shapes the expectations of all other groups. These results are consistent with the notion that these perceptual measures have discriminant validity relative to test scores. At the same time, the analysis does not reveal what this commonality is or what it might be responding to in the education process. Investigating this commonality remains, therefore, an interesting future research question regarding education policy and the relationship among performance indicators.

Discussion and Conclusion

Assessing performance becomes more important for modern governments under increased pressure to promote efficiency and improve accountability through quality public services (Martin and Smith 2005). Finding valid performance measures, however, is not easy given the multidimensional nature of public performance. The ideal situation, therefore, is to have multiple performance measures that "give similar results both descriptively and in analyses of relationships between performance and other key concepts" (Andersen, Heinesen, and Pedersen 2016, 65). This article incorporates multiple stakeholders' perspectives on performance and investigates how they are related to archival performance indicators. In the Korean school context, we find that both service users (parents and students) and service providers (teachers) show similarities in their assessments relative to archival performance indicators.
The consistency of performance measures from different sources supports the convergent validity of the performance indicators. Based on the individual level analysis, the study also reveals that (1) both individual and aggregate level performance indicators have a significant relationship with citizen evaluations and (2) the effect of individual level performance on citizen evaluations changes depending on the level of aggregate performance. Specifically, both individual student test scores and school mean scores are significantly associated with parents' and students' assessments of school performance, and student achievement has a larger positive relationship with parents' and students' evaluations in high-performing schools than in low-performing schools. These significant results make theoretical contributions to the citizen satisfaction literature by suggesting that citizens base their performance assessments on both individual and aggregate considerations. The logic is similar to Cyert and March's (1963) notion of social aspirations, where organizations (or, in this case, individuals) compare themselves to peers. In the case of Korean schools, individual user evaluations are contingent on an inherent comparison to others. A significant relationship between archival performance and perceptual evaluations in a centralized context indicates that the convergent validity of school performance indicators may be generalizable. Parents in decentralized education systems are able to signal preferences about schools by exercising school choice options (Schneider, Teske, and Marschall 2000), and they have more opportunities to get involved in the decision-making process in education policy by participating at the school level. Official performance indicators, therefore, should be more likely to reflect parent opinions in decentralized systems. These local control mechanisms often do not work well in centralized systems.
Although Korean schools operate under a centralized education system in which parents and students have limited school choice, these results show that even under these unfavorable conditions archival performance indicators are positively linked to citizen evaluations. What factors, then, contribute to the association between academic achievement and citizens' evaluations despite limited school choice? The possible factors are (1) the salience of academic performance in achieving higher positions in society and (2) the uniformity of the Korean education system. First, education has been valued as a way to reach higher social status, and examinations have been used as a social selection device in Korea. Under Confucian philosophy, Korean society stresses the idea of merit as the valid criterion for assessing an individual and awarding social status (Seth 2002); achieving academic advancement (especially entering prestigious universities) has been the pathway to high-paying jobs and higher social positions.7 A positive correlation between educational attainment and socioeconomic status exists in other countries, but the relationship is especially strong in Korea because of its historical stress on human capital based economic growth. The system, structured to allow for social mobility through education, has shaped the high social demand for education and propelled educational expansion (Seth 2002). For example, Korea has the highest higher education completion rate among OECD countries, with 70.0% of the population aged 25–34 completing tertiary education in 2016. The number is significantly higher than the OECD average (43.1%) and those of Western countries such as the United States (47.5%) and the United Kingdom (52.0%) (OECD 2017).8 Second, uniformity in the education system facilitates the use of standardized exams as selection criteria and highlights the importance of test scores.
Korean society has pursued uniformity of education to ensure equal educational opportunity and fairness in educational access (Seth 2002).9 Korea has a national curriculum (that applies to both public and private schools) and allocates resources evenly to public and private schools. This system contrasts notably with the localized US system, with no national curriculum (each state sets its own curriculum requirements), an independent private school sector, and school funding that reflects local property taxes. In Korea, a governmental agency is in charge of national level examinations such as college entrance exams. Nationwide standardized tests make test scores comparable across all students. As a point of comparison, college entrance tests also exist in the United States (e.g., the SAT and ACT), but only some students in the country take these exams, and they are administered by private nonprofit organizations, not by the government. Test scores, therefore, are far more critical in Korea given that they are the common measure across all students. The use of test scores as the primary mechanism for deciding who is admitted to prestigious universities has made Korean parents and students take test scores very seriously (Mintrom and Walley 2013, 262). Another noticeable finding is that parents and students respond more to school performance than to individual student performance. This result implies that citizens may consider school performance more important than individual performance when evaluating education quality. Parents and students in Seoul city are informed of the school mean scores as well as their own (child's) score by their schools, and having knowledge about objectively measured school performance may help parents and students make more accurate judgments about school quality.
This result might also reflect the collectivist culture of Korea, which features more cohesive social ties between individuals and places group goals above individual needs (Hofstede, Hofstede, and Minkov 1991). Given this culture, parents and students in Korea might consider achieving a high level of group performance to be as important as individual performance. This finding also poses a challenge to Tiebout models of choice, which rely on individual goals rather than collective ones. Replicating these results in less collectivist cultures, therefore, could make a major contribution to theories of choice and decision making. Lastly, in addition to showing convergent validity of the school performance measures, we also find indirect evidence of discriminant validity. The similar results from parents, students, and teachers and the subsequent Granger causality models indicate that there is likely reciprocal causation among the perceptual assessments of school performance. The interrelationships among parents’, students’, and teachers’ evaluations hold even after controlling for standardized test scores and a wide range of other control variables; they are not the result of factors such as income, education, or other community-level variables. Each group sees something common in school performance that differs from the results on standardized tests. This indirect indicator of discriminant validity suggests the need for additional research into the composition of stakeholder assessments and greater theoretical attention to the measurement of government performance.

Supplementary Material

Supplementary material is available at the Journal of Public Administration Research and Theory online.

Appendix

Table A1.
Factor-Analytical Results of Survey Items

Survey Item  Factor Loading
Parents’ evaluations
  I am satisfied with my child’s school  0.69
  This school improves my child’s academic ability  0.82
  This school improves my child’s aptitude skills  0.77
  This school teaches my child based on the level of his or her academic ability  0.81
  This school provides consultation and activities to develop my child’s career  0.76
  This school has enough educational facilities and adequate environment for education  0.71
  This school cares about my child’s safety at school  0.74
  Teachers in this school are passionate about teaching  0.81
  Eigenvalue  4.69
  Cronbach’s alpha  0.90
Students’ evaluations
  I am satisfied with my school  0.68
  This school improves student academic ability  0.84
  This school improves student aptitude skills  0.82
  This school teaches students based on the level of his or her academic ability  0.79
  This school provides consultation and activities to develop student career  0.77
  This school has enough educational facilities and adequate environment for education  0.74
  This school cares about student safety at school  0.77
  Eigenvalue  4.21
  Cronbach’s alpha  0.89
Teachers’ evaluations
  I am satisfied with this school  0.59
  This school improves student academic ability  0.78
  This school improves student aptitude skills  0.78
  This school teaches students based on the level of his or her academic ability  0.76
  This school provides consultation and activities to develop student career  0.73
  This school has enough educational facilities and adequate environment for education  0.60
  This school cares about student safety at school  0.71
  Eigenvalue  3.55
  Cronbach’s alpha  0.82
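The Cronbach’s alpha values in Table A1 summarize the internal consistency of each multi-item scale. As a minimal sketch of the computation (the responses below are made-up 5-point scores, not the study’s survey data):

```python
# Cronbach's alpha for a multi-item scale:
# alpha = (k / (k - 1)) * (1 - sum(item variances) / variance(total scores))
# The example scores below are illustrative, not the study's data.

def cronbach_alpha(items):
    """items: list of item columns; each column lists one score per respondent."""
    k = len(items)
    n = len(items[0])

    def var(xs):  # sample variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    item_var_sum = sum(var(col) for col in items)
    totals = [sum(col[i] for col in items) for i in range(n)]
    return (k / (k - 1)) * (1 - item_var_sum / var(totals))

# Perfectly consistent items yield alpha = 1.0.
identical = [[1, 2, 3, 4, 5], [1, 2, 3, 4, 5], [1, 2, 3, 4, 5]]
print(round(cronbach_alpha(identical), 2))  # 1.0
```

Less consistent item sets yield lower alphas; values around 0.8–0.9, as reported above, are conventionally taken to indicate reliable scales.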
Table A2. Descriptive Statistics for All Variables

Variable  Mean  SD  Min  Max  Source
Dependent variables
  Parents’ evaluations  0.0  1.0  −4.1  2.7  Parent survey
  Students’ evaluations  0.0  1.0  −3.4  2.3  Student survey
  Teachers’ evaluations  0.0  1.0  −5.0  2.3  Teacher survey
Independent variables
  Student performance  0.0  1.0  −5.1  4.7  Archival data
  School performance  0.0  0.6  −2.5  2.9  Archival data
Parent characteristics
  Household income (logged)  5.9  0.7  0  9.2  Parent survey
  Parent education  4.1  1.1  1  7  Parent survey
Teacher characteristics
  Teacher gender (female = 1)  0.61  —  0  1  Teacher survey
  Teacher age  4.6  1.5  1  6  Teacher survey
  Teacher education  1.5  0.6  1  4  Teacher survey
School characteristics
  Average teacher education (%)  17.3  17.6  0  94.9  Archival data
  Average teacher experience (years)  19.4  4.0  4  34  Archival data
  Full-time teachers (%)  83.0  7.4  44.1  100  Archival data
  Class size (student–teacher ratio)  15.9  3.2  3.9  42  Archival data
  Low-achievement students (%)  4.1  4.3  0  37.4  Archival data
  Free lunch students (logged)  4.7  1.1  0  7.4  Archival data
  Number of students (logged)  6.0  0.5  3.4  7.5  Archival data
  Ownership: public school  0.54  —  0  1  Archival data
  Ownership: private school  0.46  —  0  1  Archival data
  School type: middle school  0.46  —  0  1  Archival data
  School type: general high school  0.45  —  0  1  Archival data
  School type: vocational high school  0.09  —  0  1  Archival data

Note: Based on a sample of 45,344 parents, 45,893 students, and 12,583 teachers used in the analysis. Parent educational attainment is coded as elementary school graduate = 1; middle school graduate = 2; high school graduate = 3; 2 (or 3)-year college graduate = 4; 4-year university graduate = 5; Master’s degree = 6; and Doctorate degree = 7. We code teacher gender as a dummy variable (female = 1; male = 0). Teacher age is coded as a categorical variable with six categories (younger than 25 years = 1; 26–30 years = 2; 31–35 years = 3; 36–40 years = 4; 41–45 years = 5; and older than 46 years = 6). Teacher education is coded as a four-category variable (bachelor’s degree = 1; Master’s degree = 2; doctorate course completion = 3; and doctorate degree = 4).

Table A3.
Correlations among the Residuals

  Parent Residuals  Student Residuals  Teacher Residuals
Residuals from the parent model  1
Residuals from the student model  0.463*  1
Residuals from the teacher model  0.187*  0.161*  1

Note: The residuals come from the regression models in Table 2. School-level analysis. *p < .05.
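Table A3 correlates what each stakeholder regression leaves unexplained. A minimal sketch of that computation, with a single hypothetical predictor standing in for the paper’s full control set and made-up school-level scores:

```python
# Residual-correlation sketch: regress each stakeholder evaluation on a
# control, then correlate the unexplained parts across models.
# One made-up predictor stands in for the paper's full set of controls.
from statistics import mean

def ols_residuals(y, x):
    """Residuals from a simple one-predictor OLS fit of y on x."""
    mx, my = mean(x), mean(y)
    beta = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    alpha = my - beta * mx
    return [b - (alpha + beta * a) for a, b in zip(x, y)]

def pearson(u, v):
    """Pearson correlation of two equal-length series."""
    mu, mv = mean(u), mean(v)
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = (sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v)) ** 0.5
    return num / den

# Hypothetical school-level data: one control and two evaluation scores.
control = [1.0, 2.0, 3.0, 4.0, 5.0]
parent_eval = [1.2, 2.1, 2.8, 4.3, 4.9]
student_eval = [0.9, 2.2, 3.1, 3.8, 5.2]
r = pearson(ols_residuals(parent_eval, control), ols_residuals(student_eval, control))
```

A positive residual correlation, as in Table A3, indicates that the two groups’ evaluations share variation beyond what the controls explain.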
Table A4. Discriminant Validity of Subjective Evaluations

  Parents’ Evaluations  Students’ Evaluations  Teachers’ Evaluations
Student performance  0.035*** (0.008)  0.084*** (0.008)
School performance  0.085*** (0.023)  0.071** (0.026)  0.103*** (0.029)
Parents’ evaluations    0.290*** (0.008)  0.187*** (0.034)
Students’ evaluations  0.309*** (0.010)    0.108*** (0.031)
Teachers’ evaluations  0.137*** (0.020)  0.099*** (0.021)
Adjusted R2  0.154  0.213  0.067
N  26,509  26,509  12,571

Note: All control variables in Table 2 are included in each model. Standard errors are clustered at the school level and shown in parentheses. +p < .10, *p < .05, **p < .01, ***p < .001 (two-tailed test).
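The Granger causality tests reported in Table A5 rest on an incremental F-test: do lagged values of one group’s evaluation reduce the unexplained variance of another group’s evaluation? A sketch of the underlying F statistic, with hypothetical residual sums of squares (not figures from the study):

```python
# F statistic behind a Granger-style test: do q added lagged regressors
# significantly reduce the residual sum of squares (RSS) when predicting Y?
# F = ((RSS_restricted - RSS_full) / q) / (RSS_full / (n - k))
# The RSS values below are hypothetical, not taken from the study.

def incremental_f(rss_restricted, rss_full, n_obs, q_added, k_full):
    """F-test for adding q_added regressors to a model with k_full parameters."""
    return ((rss_restricted - rss_full) / q_added) / (rss_full / (n_obs - k_full))

# Example: adding one lag of X to a lag-1 autoregression of Y
# (3 parameters total) fitted over 104 observations.
f_stat = incremental_f(120.0, 100.0, n_obs=104, q_added=1, k_full=3)
print(round(f_stat, 1))  # 20.2
```

A large F (compared against the F distribution with q and n − k degrees of freedom) rejects the null that the added lags have no predictive value, which is how the p-values in Table A5 are obtained.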
Table A5. Granger Causality Tests

Independent Variable    Dependent Variable  F  p-Value
Students’ evaluations  →  Parents’ evaluations  19.66  .000
Teachers’ evaluations  →  Parents’ evaluations  19.65  .000
Parents’ evaluations  →  Students’ evaluations  12.94  .000
Teachers’ evaluations  →  Students’ evaluations  33.28  .000
Parents’ evaluations  →  Teachers’ evaluations  27.42  .000
Students’ evaluations  →  Teachers’ evaluations  38.90  .000

Note: School-level analysis.

References

Amirkhanyan, Anna A., Hyun Joon Kim, and Kristina T. Lambright. 2014.
The performance puzzle: Understanding the factors influencing alternative dimensions and views of performance. Journal of Public Administration Research and Theory  24: 1– 34. Google Scholar CrossRef Search ADS   Andersen, Lotte Bøgh, Andreas Boesen, and Lene Holm Pedersen. 2016. Performance in public organizations: Clarifying the conceptual space. Public Administration Review  76: 852– 62. Google Scholar CrossRef Search ADS   Andersen, Lotte Bøgh, Eskil Heinesen, and Lene Holm Pedersen. 2016. Individual performance: From common source bias to institutionalized assessment. Journal of Public Administration Research and Theory  26: 63– 78. Andersen, Simon Calmar, and Peter B. Mortensen. 2010. Policy stability and organizational performance: Is there a relationship? Journal of Public Administration Research and Theory  20: 1– 22. Google Scholar CrossRef Search ADS   Andrews, Rhys, George A. Boyne, and Richard M. Walker. 2006. Subjective and objective measures of organizational performance: An empirical exploration. In Public service performance: Perspectives on measurement and management , ed. George A. Boyne, Kenneth J. Meier, Laurence J. O’Toole, and Richard M. Walker, 14– 34. Cambridge: Cambridge Univ. Press. Barrows, Samuel, Michael Henderson, Paul E. Peterson, and Martin R. West. 2016. Relative performance information and perceptions of public service quality: Evidence from American school districts. Journal of Public Administration Research and Theory  26: 571– 83. Google Scholar CrossRef Search ADS   Boyne, George A. 2002. Concepts and indicators of local authority performance: An evaluation of the statutory frameworks in England and Wales. Public Money and Management  22: 17– 24. Google Scholar CrossRef Search ADS   ———. 2003a. Sources of public service improvement: A critical review and research agenda. Journal of Public Administration Research and Theory  13: 367– 394. CrossRef Search ADS   ———. 2003b. What is public service improvement? 
Public Administration  81: 211– 227. Google Scholar CrossRef Search ADS   Boyne, George A., Kenneth J. Meier, Laurence J. O’Toole, and Richard M. Walker. 2006. Public service performance: Perspectives on measurement and management . Cambridge: Cambridge Univ. Press. Google Scholar CrossRef Search ADS   Brewer, Gene A. 2006. All measures of performance are subjective: More evidence on US federal agencies. In Public service performance: Perspectives on measurement and management , ed. George A. Boyne, Kenneth J. Meier, Laurence J. O’Toole, and Richard M. Walker, 35– 54. Cambridge: Cambridge Univ. Press. Brown, Ben, and Wm Reed Benedict. 2002. Perceptions of the police: Past findings, methodological issues, conceptual issues and policy implications. Policing: An International Journal of Police Strategies & Management  25: 543– 80. Google Scholar CrossRef Search ADS   Brown, Karin, and Philip B. Coulter. 1983. Subjective and objective measures of police service delivery. Public Administration Review  43: 50– 8. Google Scholar CrossRef Search ADS   Brudney, Jeffrey L., and Robert E. England. 1982. Urban policy making and subjective service evaluations: Are they compatible? Public Administration Review  42: 127– 35. Google Scholar CrossRef Search ADS   Cameron, Kim S. 1986. Effectiveness as paradox: Consensus and conflict in conceptions of organizational effectiveness. Management Science  32: 539– 53. Google Scholar CrossRef Search ADS   Campbell, Donald T., and Donald W. Fiske. 1959. Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin  56: 81– 105. Google Scholar CrossRef Search ADS   Charbonneau, Étienne, and Gregg G. Van Ryzin. 2012. Performance measures and parental satisfaction with New York City schools. The American Review of Public Administration  42: 54– 65. Google Scholar CrossRef Search ADS   Connolly, Terry, Edward J. Conlon, and Stuart Jay Deutsch. 1980. 
Organizational effectiveness: A multiple-constituency approach. Academy of Management Review  5: 211– 8. Cyert, Richard M. and James G. March. 1963. A behavioral theory of the firm . Englewood Cliffs, NJ: Prentice Hall. Favero, Nathan, and Justin B. Bullock. 2015. How (not) to solve the problem: An evaluation of scholarly responses to common source bias. Journal of Public Administration Research and Theory  25: 285– 308. Google Scholar CrossRef Search ADS   Favero, Nathan, and Kenneth J. Meier. 2013. Evaluating urban public schools: Parents, teachers, and state assessments. Public Administration Review  73: 401– 12. Google Scholar CrossRef Search ADS   Freeman, R. Edward. 1984. Strategic management: A stakeholder approach . Marshfield, MA: Pitman Publishing, Inc. Granger, Clive W. J. 1969. Investigating causal relations by econometric models and cross-spectral methods. Econometrica: Journal of the Econometric Society  37: 424– 38. Google Scholar CrossRef Search ADS   Grissom, Jason A., Jill Nicholson-Crotty, and Lael Keiser. 2012. Does my boss’s gender matter? Explaining job satisfaction and employee turnover in the public sector. Journal of Public Administration Research and Theory  22: 649– 73. Google Scholar CrossRef Search ADS   Hirschman, Albert O. 1970. Exit, voice, and loyalty: Responses to decline in firms, organizations, and states . Cambridge, MA: Harvard Univ. Press. Hofstede, Geert, Gert Jan Hofstede, and Michael Minkov. 1991. Cultures and organizations: Software of the mind . London: McGraw-Hill. Jacobsen, Rebecca, Jeffrey W. Snyder, and Andrew Saultz. 2015. Understanding satisfaction with schools: The role of expectations. Journal of Public Administration Research and Theory  25: 831– 48. Google Scholar CrossRef Search ADS   James, Oliver. 2009. Evaluating the expectations disconfirmation and expectations anchoring approaches to citizen satisfaction with local public services. Journal of Public Administration Research and Theory  19: 107– 23. 
Google Scholar CrossRef Search ADS   Kelly, Janet M. 2003. Citizen satisfaction and administrative performance measures: Is there really a link? Urban Affairs Review  38: 855– 66. Google Scholar CrossRef Search ADS   ———. 2005. The dilemma of the unsatisfied customer in a market model of public administration. Public Administration Review  65: 76– 84. CrossRef Search ADS   Kelly, Janet M., and David Swindell. 2002. A multiple–indicator approach to municipal service evaluation: Correlating performance measurement and citizen satisfaction across jurisdictions. Public Administration Review  62: 610– 21. Google Scholar CrossRef Search ADS   Licari, Michael J., William McLean, and Tom W. Rice. 2005. The condition of community streets and parks: A comparison of resident and nonresident evaluations. Public Administration Review  65: 360– 68. Google Scholar CrossRef Search ADS   Linn, Robert L. 2001. A century of standardized testing: Controversies and pendulum swings. Educational Assessment  7: 29– 38. Google Scholar CrossRef Search ADS   Lyons, William E., David Lowery, and Ruth Hoogland DeHoog. 1992. The politics of dissatisfaction: Citizens, services, and urban institutions . Armonk, NY: ME Sharpe. Marginson, Simon. 2011. The Confucian model of higher education in East Asia and Singapore. In Higher education in the Asia-Pacific: Strategic responses to globalization , ed. Simon Marginson, Sarjit Kaur, and Erlenawati Sawir, 53– 75. Springer. Google Scholar CrossRef Search ADS   Martin, Stephen, and Peter C. Smith. 2005. Multiple public service performance indicators: Toward an integrated statistical approach. Journal of Public Administration Research and Theory  15: 599– 613. Google Scholar CrossRef Search ADS   Meier, Kenneth J., and Laurence J. O’Toole. 2001. Managerial strategies and behavior in networks: A model with evidence from US public education. Journal of Public Administration Research and Theory  11: 271– 94. Google Scholar CrossRef Search ADS   ———. 2013a. 
I think (I am doing well), therefore I am: Assessing the validity of administrators’ self-assessments of performance. International Public Management Journal  16: 1– 27. ———. 2013b. Subjective organizational performance and measurement error: Common source bias and spurious relationships. Journal of Public Administration Research and Theory  23: 429– 56. CrossRef Search ADS   Meier, Kenneth J., Søren C. Winter, Laurence J. O’Toole, Nathan Favero, and Simon Calmar Andersen. 2015. The validity of subjective performance measures: School principals in Texas and Denmark. Public Administration  93: 1084– 101. Google Scholar CrossRef Search ADS   Mintrom, Michael, and Richard Walley. 2013. Education governance in comparative perspective. In Education governance for the twenty-first century: Overcoming the structural barriers to school reform , ed. Paul Manna and Patrick McGuinn, 252– 74. Washington, DC: The Brookings Institution. Morgeson, Forrest V. 2013. Expectations, disconfirmation, and citizen satisfaction with the US federal government: Testing and expanding the model. Journal of Public Administration Research and Theory  23: 289– 305. Google Scholar CrossRef Search ADS   Moynihan, Donald P. 2008. The dynamics of performance management: Constructing information and reform . Washington, DC: Georgetown Univ. Press. Moynihan, Donald P., Sergio Fernandez, Soonhee Kim, Kelly M. LeRoux, Suzanne J. Piotrowski, Bradley E. Wright, and Kaifeng Yang. 2011. Performance regimes amidst governance complexity. Journal of Public Administration Research and Theory  21 ( Suppl 1): i141– 55. Google Scholar CrossRef Search ADS   OECD. 2017. Population with tertiary education (indicator) . doi: 10.1787/0b8f90e9-en ( accessed October 29, 2017). Parks, Roger B. 1984. Linking objective and subjective measures of performance. Public Administration Review  44: 118– 27. Google Scholar CrossRef Search ADS   Petrovsky, Nicolai, Jue Young Mok, and Filadelfo León‐Cázares. 2017. 
Citizen expectations and satisfaction in a young democracy: A test of the expectancy‐disconfirmation model. Public Administration Review  77: 395– 407. Google Scholar CrossRef Search ADS   Poister, Theodore H., and John Clayton Thomas. 2011. The effect of expectations and expectancy confirmation/disconfirmation on motorists’ satisfaction with state highways. Journal of Public Administration Research and Theory  21: 601– 17. Google Scholar CrossRef Search ADS   Quinn, Robert E., and John Rohrbaugh. 1983. A spatial model of effectiveness criteria: Towards a competing values approach to organizational analysis. Management Science  29: 363– 77. Google Scholar CrossRef Search ADS   Radin, Beryl. 2006. Challenging the performance movement: Accountability, complexity, and democratic values . Washington, DC: Georgetown Univ. Press. Rainey, Hal G. 2009. Understanding and managing public organizations . San Francisco, CA: John Wiley & Sons. Ringquist, Evan J. 2005. Assessing evidence of environmental inequities: A meta‐analysis. Journal of Policy Analysis and Management  24: 223– 47. Google Scholar CrossRef Search ADS   Rossi, Peter H. 1972. Community social indicators. In The human meaning of social change , ed. Angus Campbell and Philip E. Converse, 87– 126. New York: Russell Sage Foundation. Schneider, Mark, Paul Teske, and Melissa Marschall. 2000. Choosing schools: Consumer choice and the quality of American schools . Princeton, NJ: Princeton Univ. Press. Schwab, Donald P. 2005. Research methods for organizational studies . Mahwah, NJ: Psychology Press. Selden, Sally Coleman, and Jessica E. Sowa. 2004. Testing a multi-dimensional model of organizational performance: Prospects and problems. Journal of Public Administration Research and Theory  14: 395– 416. Google Scholar CrossRef Search ADS   Seoul Metropolitan Office of Education. 2017. High school admission information. http://english.sen.go.kr/index.jsp ( accessed January 3, 2017). Statistics Korea. 2017. 
Population census. http://kostat.go.kr/portal/eng/index.action (accessed October 25, 2017).
Seth, Michael J. 2002. Education fever: Society, politics, and the pursuit of schooling in South Korea. Honolulu, HI: University of Hawaii Press.
Stipak, Brian. 1979. Citizen satisfaction with urban services: Potential misuse as a performance indicator. Public Administration Review 39: 46–52.
———. 1980. Local governments’ use of citizen surveys. Public Administration Review 40: 521–5.
Tiebout, Charles M. 1956. A pure theory of local expenditures. The Journal of Political Economy 64: 416–24.
Van Ryzin, Gregg G. 2004. Expectations, performance, and citizen satisfaction with urban services. Journal of Policy Analysis and Management 23: 433–48.
———. 2006. Testing the expectancy disconfirmation model of citizen satisfaction with local government. Journal of Public Administration Research and Theory 16: 599–611.
———. 2013. An experimental test of the expectancy‐disconfirmation theory of citizen satisfaction. Journal of Policy Analysis and Management 32: 597–614.
Van Ryzin, Gregg G., Stephen Immerwahr, and Stan Altman. 2008. Measuring street cleanliness: A comparison of New York City’s scorecard and results from a citizen survey. Public Administration Review 68: 295–303.
Walker, Richard M., and George A. Boyne. 2006. Public management reform and organizational performance: An empirical assessment of the UK Labour government’s public service improvement strategy. Journal of Policy Analysis and Management 25: 371–93.
Walker, Richard M., George A. Boyne, and Gene A. Brewer. 2010. Public management and performance: Research directions. Cambridge: Cambridge Univ. Press.
Wall, Toby D., Jonathan Michie, Malcolm Patterson, Stephen J. Wood, Maura Sheehan, Chris W. Clegg, and Michael West. 2004. On the validity of subjective measures of company performance. Personnel Psychology 57: 95–118.
Worthen, Blaine R. 1993. Critical issues that will determine the future of alternative assessment. Phi Delta Kappan 74: 444–54.
Footnotes
An earlier version of this article was presented at the 88th Southern Political Science Association Annual Conference, January 12–14, 2017, New Orleans, LA.
1 Another way of categorizing performance indicators is the distinction between objective and subjective performance measures. Objective and subjective measures are often used interchangeably with archival and perceptual measures, respectively, in the literature; however, the concepts are not identical. Tests that measure students’ stress levels, for instance, are considered objective measures because they are independent of perceptual judgments but have nothing to do with archives. Because our relatively objective measures are obtained from standardized test scores and our relatively subjective measures are derived from stakeholders’ perceptual assessments, we use the archival versus perceptual terms rather than the objective versus subjective distinction in this article.
2 Teacher job satisfaction and motivation, however, are different concepts from performance. These measures are significantly associated with teachers’ self-assessments of student performance, but this relationship is possibly due to common source bias (Andersen, Heinesen, and Pedersen 2016).
3 The logic also applies to the relationship between teachers’ evaluations and academic performance, but we cannot test the interactive relationship in the teacher model because the data do not provide a unique ID to match students and teachers.
The only ID available to merge student and teacher data is the school ID; therefore, only school-level performance is available for the teacher model.
4 Random assignment is conducted within school districts, taking students’ commute times to school into account. Specifically, when middle school students apply to high school, they apply at the first stage to two high schools among all high schools in Seoul City and at the second stage to two high schools within their school districts. Based on these applications, a computer randomly assigns students to schools using a lottery system. In the third stage, students who are not assigned to schools in the first and second stages are once again randomly assigned to schools based on school size, class size, commute time, etc. (Seoul Metropolitan Office of Education 2017). The commute time limit has only a modest impact on the random assignment because Seoul is an extremely dense city (about 9.8 million people in total and about 42,000 persons per square mile) and schools are close together (1.36 high schools and 1.64 middle schools per square mile) (Statistics Korea 2017). Mintrom and Walley (2013, 259) note the “virtual nonexistence of parent choice in the Korean system.”
5 This structure does not mean that choice cannot exist. In any system, wealthy individuals can use wealth to augment educational opportunities (by sending their children outside the country or seeking entrance into the small number of elite schools in Seoul). Our contention is only that Tiebout-style choices are far more constrained in Seoul than in the United States and similar countries.
6 The plots of the marginal effects are also available in the online appendix.
7 Other Asian countries influenced by Confucian educational traditions (such as China, Japan, Singapore, and Taiwan) also have similar systems in which a national examination serves as a social sorting mechanism and academic performance is highly valued (Marginson 2011).
8 Strong social demand for education and the emphasis on test scores contribute to South Korea’s high performance in global assessments of students such as the Programme for International Student Assessment (PISA); however, they also lead to extensive private tutoring and intense pressure on students to succeed academically.
9 Equalization policies in Korea, such as random student assignment and the teacher rotation system, are also rooted in this philosophy of education.
© The Author(s) 2018. Published by Oxford University Press on behalf of the Public Management Research Association. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Journal of Public Administration Research and Theory, Oxford University Press

Publisher: Oxford University Press
ISSN: 1053-1858; eISSN: 1477-9803
DOI: 10.1093/jopart/muy006
Even with the wide use of these measures in academic and practical settings, the relationship between different performance indicators has only been examined sporadically and in a few governance contexts. Previous studies testing whether and how different performance indicators relate to each other have shown mixed results (Kelly and Swindell 2002). Some research demonstrates positive correlations among performance indicators such as administrative records, citizen evaluations, and employee assessments (e.g., Charbonneau and Van Ryzin 2012; Favero and Meier 2013; Van Ryzin, Immerwahr, and Altman 2008) while other findings suggest a weak relationship (e.g., Brown and Coulter 1983; Kelly 2003; Meier et al. 2015). The mixed results may be partly due to the failure to consider multiple stakeholders or due to different levels of aggregation in performance perceptions. The former possibility is inherent in the nature of public organizations with multiple stakeholders who hold different perspectives on what good performance is (Boyne 2003b; Moynihan et al. 2011; Walker, Boyne, and Brewer 2010). Although performance assessment in the public sector highly depends on whose opinions are reflected in the evaluation process, the role of different stakeholders in performance assessment is seldom discussed (Andersen, Boesen, and Pedersen 2016). Regarding aggregation, previous literature has largely adopted aggregate levels of citizen satisfaction as measures of citizens’ perceived assessments of service quality even though aggregated measures can lead to aggregation bias (Ringquist 2005) and can disguise substantial dissatisfaction by some citizens. A third research gap in the performance literature is the lack of consideration of various institutional settings. 
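The aggregation concern noted above can be illustrated with a toy example (the numbers are invented for illustration, not drawn from any study): an aggregate mean satisfaction score can disguise a sizable dissatisfied minority.

```python
# Hypothetical 1-5 satisfaction scores for 100 citizens: a broadly
# satisfied majority and a strongly dissatisfied minority.
satisfied = [5] * 70 + [4] * 10   # 80 broadly satisfied citizens
dissatisfied = [1] * 20           # 20 strongly dissatisfied citizens
scores = satisfied + dissatisfied

mean_score = sum(scores) / len(scores)
share_dissatisfied = sum(1 for s in scores if s <= 2) / len(scores)

# The aggregate mean looks healthy (4.1/5) even though one in five
# citizens is strongly dissatisfied.
print(f"aggregate mean satisfaction: {mean_score:.2f} / 5")
print(f"share of dissatisfied citizens: {share_dissatisfied:.0%}")
```

A jurisdiction reporting only the mean would appear to perform well here, which is the information loss individual-level analysis is meant to avoid.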
Most empirical assessments take place in decentralized delivery systems such as the United States or the United Kingdom, despite the fact that the definition and measurement of performance can vary substantially across countries. The heavy reliance on decentralized systems in the literature raises concerns about generalizability and indicates the need for more variety in empirical contexts. This article systematically examines and theoretically discusses the relationship among different performance indicators in a centralized regime. We do this by analyzing how archival performance indicators are related to perceptual performance assessments in South Korea. This article contributes to the existing literature in three ways. First, given the importance of multiple stakeholders in public service, we incorporate opinions from both service users (parents and students) and service providers (teachers) and show how they correspond with archival performance (student test scores) to generate a more complete picture of how various performance assessments are related. Second, this article tests the theory both at the individual level and the aggregate level. Separating individual and aggregate level performance makes a theoretical contribution to the citizen satisfaction literature, refining the expectation-disconfirmation model (EDM) by allowing us to examine (1) which level (individual versus aggregate) of performance matters more for citizen satisfaction and (2) whether one level of performance interacts with the other level of performance. This distinction has implications for Tiebout’s (1956) “vote with one’s feet” models, since they assume individual benefits matter but collective ones do not. Third, we examine performance indicators in a centralized system, using a longitudinal education dataset from Seoul City (the capital city of South Korea). 
This centralized education system with limited school choice provides an ideal case to determine whether the conclusions drawn from decentralized settings are generalizable to other contexts. The strong regulation of student school assignments in Seoul City also offers a unique opportunity to study citizen satisfaction with public services when the exit option is constrained (Hirschman 1970). The analysis combines archival data on students’ academic achievement and three different surveys on school quality over several years (2010–2015) with data for more than 45,000 parents and students, and 12,000 teachers. Conceptualizing Government Performance Government performance is a key concern of public administration research (Boyne 2003a); “virtually all of management and organization theory is concerned with performance and effectiveness, at least implicitly” (Rainey 2009, 145). Despite this salience, scholars and practitioners face challenges in evaluating government performance because the concept of performance itself can be unclear or ambiguous (Andersen, Boesen, and Pedersen 2016). Government performance includes multiple dimensions as viewed by diverse stakeholders (Radin 2006); therefore, finding valid performance measures is difficult (Rainey 2009). The nature of performance evaluation generates challenges for existing research in the area. First, research often does not incorporate the multiple values in public organizations and generally omits process aspects such as accountability, responsiveness, and probity (Boyne 2002). Second, the role of stakeholders is also neglected despite their significant influence in defining good performance (Amirkhanyan, Kim, and Lambright 2014). Third, the lack of information about the relevant units of analysis reduces comparability between different performance studies (Andersen, Boesen, and Pedersen 2016). 
At times scholars have improved the performance framework by incorporating different performance standards (e.g., the competing values model) (Quinn and Rohrbaugh 1983); however, the different perspectives of multiple stakeholders and the unit of analysis are still largely ignored in performance research. Multiple Stakeholders A central question for performance criteria is “who actually decides what good performance is” (Andersen, Boesen, and Pedersen 2016, 855)? Public organizations have multiple stakeholders holding diverse views on what constitutes good performance (Boyne 2003b; Walker et al. 2010), but the role of stakeholders in assessing performance is often given insufficient weight (Amirkhanyan et al. 2014; Andersen, Boesen, and Pedersen 2016). This lack of attention is problematic because neglecting the role of some stakeholders in the performance assessment process can generate a narrow view of how well governments do (Moynihan et al. 2011). Stakeholders refer to “any group or individual who can affect or is affected by the achievement of the organization’s objectives” (Freeman 1984, 46). In the example of public schools, students, parents, and teachers are key stakeholders since they affect and are affected by school performance. Each stakeholder group can have different perspectives on the appropriate school performance criteria or weigh individual criteria differently, similar to the different views one gets with a simple turn of a kaleidoscope. Parents, for example, may view standardized test scores as the most important educational performance indicators while teachers could think that a well-rounded education is the main goal of schools. The multiple constituency model of performance highlights that organizational performance is judged by both internal and external groups who hold different views on performance (Cameron 1986; Connolly, Conlon, and Deutsch 1980). 
Internal performance assessments often diverge from external assessments because external stakeholders might have incentives to underestimate performance while internal stakeholders tend to overestimate performance (Andersen, Heinesen, and Pedersen 2016). Little research, however, examines whether stakeholders share common ground for performance assessment or hold different views on performance. Unit of Analysis In addition to the question of whose perspective is being assessed, “what level of analysis is being used” is also a critical question in evaluating performance (Cameron 1986, 542). Andersen, Boesen, and Pedersen (2016, 857) argue that studies that use different units of analysis (individuals, groups, organizations, or programs) may not be directly comparable. In the example of educational performance, they contend that district level performance (Meier and O’Toole 2001), school level performance (Andersen and Mortensen 2010), and individual level performance (Andersen, Heinesen, and Pedersen 2016) may reflect fundamentally different criteria. The conscious choice of the unit of analysis, therefore, is critical for performance research. For investigating individual level variables such as satisfaction or motivation, the individual level analysis is better because “much information is lost if individual scores are aggregated at the organizational level” (Andersen, Boesen, and Pedersen 2016, 858). Previous research, however, mostly relies on aggregate (organizational or community) level analysis due to the lack of data availability. Identifying the unit of analysis is important not only in understanding performance assessment itself but also in investigating the relationship between individual and aggregate level performance. An individual stakeholder can assess organizational performance based on how it affects the individual or on how it affects everyone on average (aggregated). 
Such assessments might also interact, such that individual assessments become based on how the individual values his or her own benefits relative to the benefits of all others. Assessing Government Performance Scholars frequently divide performance indicators into two types, archival measures and perceptual measures (Boyne et al. 2006; Brown and Coulter 1983; Kelly and Swindell 2002; Parks 1984).1 The distinction is based on “the degree to which a performance criterion concerns interior experiences and perceptions versus exterior, observable phenomena” (Andersen, Boesen, and Pedersen 2016, 856). Archival measures include administrative records of performance and official indicators of performance such as the US Program Assessment Rating Tool (PART). These measures are often highly valued because they are independent of perceptual judgments (Andrews et al. 2006). Archival performance measures, however, have their own limitations, and many scholars criticize them. First, archival measures do not exist in many policy areas (Wall et al. 2004) because program outputs are often highly complex and the state of measurement lags well behind. Second, archival performance measures may focus on easily quantified service aspects and ignore what really matters to citizens (Andrews et al. 2006; Brudney and England 1982). Third, these measures may not work well in the fragmented and pluralistic political environments of public organizations (Radin 2006) where individual stakeholders have vastly different expectations for public programs. Overall, these criticisms mean that archival performance indicators may not accurately reflect the quality of public services delivered to citizens (Amirkhanyan et al. 2014; Moynihan 2008; Wall et al. 2004). Other efforts to measure performance focus on perceptual indicators such as stakeholders’ judgments of performance (Andrews et al. 2006; Brewer 2006; Selden and Sowa 2004; Walker and Boyne 2006). 
Citizen satisfaction with public services is a popular performance measure, and public employees’ perceptions about performance are also frequently employed. Perceptual performance measures have been widely used because finding consensus performance indicators is especially challenging for public organizations (Rainey 2009), and they can capture important elements of performance that archival measures neglect. Perceptual measures also allow for easier comparisons across policy or country contexts. Still, skepticism exists about the validity of perceptual performance indicators. Some scholars question the validity of citizens’ perceptions of government performance because citizens often have limited ability to evaluate the quality of services that do not directly affect them (Andrews et al. 2006; Stipak 1979, 1980). Perceptual measures can also be vulnerable to bias due to nonrandom measurement errors (Boyne et al. 2006). Self-reported performance can be exceptionally problematic when internal organizational stakeholders assess their own performance because they tend to overestimate their performance (Meier and O’Toole 2013a). This positivity bias or social desirability bias (Meier and O’Toole 2013b) can lead to problems of common source bias when other variables are drawn from the same source as the evaluations (Amirkhanyan et al. 2014; Brewer 2006; Favero and Bullock 2015). Comparing archival and perceptual measures concerns questions of convergent and discriminant measurement validity (Campbell and Fiske 1959). Convergent validity is established by demonstrating that two or more different measures of the same construct have common variation, while discriminant validity is obtained by showing that different measurements of the same construct show unique variation (Schwab 2005). 
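As a toy numerical illustration of convergent validity (the school-level data below are hypothetical, and a generic Pearson correlation stands in for the article’s actual estimation strategy):

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical school-level data: archival test scores alongside mean
# perceptual ratings (1-5 scale) from parents and teachers.
test_scores    = [62, 70, 75, 81, 90]
parent_rating  = [3.1, 3.4, 3.9, 4.0, 4.6]
teacher_rating = [2.9, 3.6, 3.7, 4.2, 4.4]

r_parent = pearson(test_scores, parent_rating)
r_teacher = pearson(test_scores, teacher_rating)
print(f"archival vs parent:  r = {r_parent:.2f}")
print(f"archival vs teacher: r = {r_teacher:.2f}")
```

Strong positive correlations between the archival measure and each perceptual measure would indicate common variation (convergent validity); the variance left unexplained is where discriminant validity could reside.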
In measuring school performance, for example, if different performance indicators from different sources, such as student academic achievement and teachers’ evaluations of schools, show strong correlations, this indicates convergent validity. A lack of correlation may indicate either that one of the measures lacks validity or that the measures have discriminant validity, that is, they are picking up different aspects of performance. Discriminant validity is especially important if perceptual measures tap aspects of service quality that are not measured by the official performance indicators. Through the lenses of convergent and discriminant validity, this study examines whether archival performance indicators and perceptual performance assessments from multiple stakeholders have similarities (convergent validity) or differences (discriminant validity). Figure 1 depicts the expected relationship among academic achievement and evaluations from parents, students, and teachers. Academic achievement can influence opinions of all stakeholders, but the degree of the influence may vary across groups. Academic achievement can capture some characteristics that matter to all groups and other characteristics that matter only for certain groups. Each group can also consider factors that are not captured by academic achievement when evaluating school quality. We expect that evaluations from different groups can have both convergent and discriminant validity. Figure 1. The Relationship among Various Sources of Evaluations. Citizen Evaluations and Performance How do citizens evaluate government performance? Do their assessments correlate with official indicators of performance? These questions relate directly to issues of democracy and governance. 
To the extent that citizen assessments of performance match official performance criteria, citizens are more likely to support government policy, and overall policy is likely to represent the democratic goal of responsiveness to citizen preferences. Although high-quality services are assumed to promote citizen satisfaction, the existing evidence on the relationship between citizen assessments of public service and evidence of service quality improvements is mixed (Kelly 2005; Kelly and Swindell 2002). The EDM has been the predominant theory in explaining citizen satisfaction with public services (e.g., Jacobsen, Snyder, and Saultz 2015; James 2009; Morgeson 2013; Petrovsky, Mok, and León‐Cázares 2017; Poister and Thomas 2011; Van Ryzin 2004, 2006, 2013). The EDM suggests that citizen satisfaction reflects the gap between prior expectations and the experienced quality of the service. When perceived service quality falls short of expectations, the negative gap (negative disconfirmation) leads to dissatisfaction, whereas when perceived performance exceeds expectations, the positive gap (positive disconfirmation) leads to satisfaction (Jacobsen et al. 2015; Morgeson 2013; Van Ryzin 2006, 2013). This model is useful both for understanding “citizens’ judgments of government performance more generally” and for explaining “the difference between perceived performance and more objectively measured performance” (Van Ryzin 2013, 598). EDM studies have found that the actual quality of public service has a positive influence on citizen satisfaction (e.g., Poister and Thomas 2011), supporting the convergent validity of these indicators. Other empirical work also finds that citizens can evaluate some services accurately, and their assessments of performance are positively correlated with official measures of performance (Licari, McLean, and Rice 2005; Van Ryzin et al. 2008). 
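The EDM’s core mechanism, satisfaction driven by the expectation gap rather than by perceived quality alone, can be sketched in a few lines (the function and numbers are illustrative, not the article’s measurement):

```python
# A minimal sketch of the expectancy-disconfirmation model (EDM):
# satisfaction responds to the gap between perceived quality and prior
# expectations, not to perceived quality alone.
def disconfirmation(expected, perceived):
    """Positive gap -> positive disconfirmation -> satisfaction."""
    return perceived - expected

# Identical perceived quality (3.5/5) yields opposite judgments
# depending on prior expectations.
cases = [
    ("low expectations, average service", 2.0, 3.5),
    ("high expectations, average service", 4.5, 3.5),
]
for label, expected, perceived in cases:
    gap = disconfirmation(expected, perceived)
    verdict = "satisfied" if gap > 0 else "dissatisfied"
    print(f"{label}: gap = {gap:+.1f} -> {verdict}")
```

The two cases show why perceived performance and citizen satisfaction need not coincide: the same service can satisfy one citizen and disappoint another who expected more.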
In education policy, Favero and Meier (2013) use New York City school data to examine how different performance indicators are related to each other and find that parents and teachers are in considerable agreement with official school performance data (see also Charbonneau and Van Ryzin 2012). A consistent correlation between citizen judgments and a street cleanliness scorecard in New York City also indicates that citizen evaluations of government performance are related to the archival measure of performance (Van Ryzin et al. 2008). In these cases, the performance measures appear to have convergent validity. Based on both theoretical expectations and empirical evidence, we hypothesize that standardized test scores (an archival measure) will be positively and significantly related to both parents’ and students’ assessments of school performance (perceptual measures). Hypothesis 1: Parents will give more favorable evaluations to schools that perform well on academic achievement tests. Hypothesis 2: Students will give more favorable evaluations to schools that perform well on academic achievement tests. Employee Evaluations and Performance Internal stakeholders, such as employees and managers, have unique advantages in assessing performance because they gain substantial information about their programs during implementation (Andrews et al. 2006). Unlike external stakeholders, who can only observe current service outcomes, internal stakeholders can access the service production process, organizational capacity, or the potential for improving service quality. They may consider these factors when assessing performance, not just the current impact services have. Internal stakeholders have the potential, therefore, to understand the service delivery process better and evaluate performance more accurately than outsiders. Regarding employee perceptions and performance assessment, recent studies provide mixed results. 
In New York City schools, Favero and Meier (2013) demonstrate that teachers’ perceptions of school quality are positively related to student performance, progress report scores, and quality reviews, and are also negatively associated with a school violence index. By contrast, in Danish schools, Andersen, Heinesen, and Pedersen (2016) find that teachers’ perceptual variables such as job satisfaction and intrinsic motivation are not significantly related to an archival performance indicator (written exam results).2 In terms of managers’ performance assessments, recent public management literature criticizes the misuse of these perceptual measures by showing that managers’ self-assessments of performance do not correlate well with archival performance measures (Andrews et al. 2006; Meier et al. 2015). A weak tie between self-assessments of performance and archival performance indicators is found regardless of whether it is measured at the top level (Meier and O’Toole 2013a) or at middle management levels (Meier et al. 2015). Despite these limitations, many studies still rely on internal stakeholders’ evaluations to measure organizational performance and often find that perceptual measures are correlated with archival measures (Wall et al. 2004; Walker and Boyne 2006). Andersen, Heinesen, and Pedersen (2016) argue that both institutional theory and the sociology of professions can explain why performance measures tend to produce consistent results. Teachers, for example, share professional norms developed through a combination of education, peer socialization, and formal rules and procedures backed by regulative institutions. They assess performance based on indicators that are “institutionalized through professional norms,” and this process generates consistency in performance assessments (Andersen, Heinesen, and Pedersen 2016, 75–6). 
Based on these theoretical expectations, we argue that teachers’ assessments of performance will be positively associated with archival performance indicators. Hypothesis 3: Teachers will give more favorable evaluations to schools that perform well on academic achievement tests. Individual Level and Aggregate Level Performance The citizen satisfaction literature argues that we should take into account both individual and aggregate level (jurisdiction, community, or organization) factors simultaneously to understand citizen satisfaction because the effects of a certain variable can differ depending on the level of analysis (Lyons, Lowery, and DeHoog 1992). Most studies, however, simply ask about citizen satisfaction without determining whether the citizen responds to the overall quality of service in the community or the quality of services the individual receives. Citizens might respond to either. This concern is linked to the issue of the unit of analysis in performance research. For parents, the performance of their own children and overall school quality can be separate issues. The two different levels of performance may affect parent satisfaction independently, or they may interact with each other and produce additional effects on parent satisfaction. The separation of one level of performance from another provides a unique theoretical contribution to the citizen satisfaction literature by identifying the differential effects of individual and aggregate level service quality on citizens’ satisfaction. Barrows et al. (2016) suggest that citizens’ performance assessments can be influenced by how local service quality ranks relative to that of different reference groups. In the context of American schools, they find that comparing the academic performance of local schools to school performance at the state, national, or international level changes citizens’ average evaluations of local school quality. 
These learning effects occur when citizens who initially overestimate or underestimate the schools in their community change their evaluations when they learn how other schools perform. A relative comparison between their own and others’ performance, therefore, can influence citizens’ evaluations of performance. Based on the previous discussion, we argue that aggregate level performance can influence the link between individual level performance and citizen satisfaction. Specifically, we expect that parents with children in high-performing schools and poor-performing schools will have different expectations for the quality of services; therefore, their evaluations of school quality will vary because they use different standards (Lyons, Lowery, and DeHoog 1992). In high-performing schools, for instance, parents may raise expectations for their children’s academic achievement because the other students in the same school are doing well. Parents in high-performing schools might also place a higher priority on education; therefore, they would be more satisfied with the school when their children do well on the exams. The positive effects of student performance on parent satisfaction, therefore, can be greater for students in high-performing schools because they excel against high standards. Similar logic applies to students. Students in high-performing schools are likely to evaluate their own education more positively when they achieve higher performance in a school where others also perform well. Through the socializing effect of parents or peers who place higher priorities on academic achievement, students in high-performing schools might adjust their own performance expectations. High test scores, therefore, can be more closely aligned to the perceptual assessments of students in high-performing schools.3 Hypothesis 4: The positive effects of student performance on parents’ evaluations will be greater in high-performing schools. 
Hypothesis 5: The positive effects of student performance on students' evaluations will be greater in high-performing schools.

The Institutional Setting

The distinction between unified and fragmented governmental structures is important to understanding citizen satisfaction and its links to democracy (Lyons, Lowery, and DeHoog 1992). Theoretical work suggests that citizen preferences for public goods will be more adequately reflected in localized and fragmented systems because competing jurisdictions can provide different arrays of services to satisfy citizens' preferences (Tiebout 1956). Archival performance measures in this context should more closely reflect the opinions of citizens; consequently, the positive correlations between the archival measures and citizen assessments should increase. In contrast, centralized governments often implement one-size-fits-all policies that are less likely to be tailored to the preferences and needs of individual citizens. In this case, official performance indicators set by the central government might not capture the performance dimensions that citizens value, generating a gap between official performance records and citizens' perceived service quality. To determine how generalizable the existing research on citizen satisfaction is, our study is set in a national context significantly different from the United States: South Korea. US school districts are autonomous local governments with their own chief executive officers (superintendents) and elected school boards that can affect important education policies. Parents and students are encouraged to participate in school decision-making and can influence educational policies by providing signals through their choice of schools (either among public schools or by opting to attend charter or private schools).
Under this decentralized system with local control, schools have incentives to consider the preferences of parents and students and to focus on the performance indicators that constituents care about. Parental assessments of educational services in this context should more accurately correspond to administrative school performance measures. Empirical evidence on the convergent validity of school performance in a decentralized setting (e.g., New York City) supports this argument (Charbonneau and Van Ryzin 2012; Favero and Meier 2013). Whether these results can be generalized to a centralized system is an open question. Korean schools provide a unique opportunity to test the theory because of the country's highly centralized education system (Mintrom and Walley 2013). Under a strict nationwide education policy, national governmental agencies (e.g., the Ministry of Education) have strong authority over school operations and regulate the educational curriculum, resource allocation, and teacher assignment. Students are randomly assigned to schools within their school districts based on a computer lottery system.4 Parents' and students' preferences are considered to a certain extent through the application processes, but there is no guarantee that students can attend the schools they want. The random assignment of students applies to both public and private schools, and students are not allowed to transfer to another school within the same school district. This system, to some extent, restricts the premises of the Tiebout model (1956), in which citizens can choose services that maximize their preferences from a differentiated set of local governments.5 If Tiebout choice processes are restrained, one would expect any correlations between citizen satisfaction and performance indicators to decline. Limited school choice and centralized policymaking may also dampen the motivation of schools to encourage parents and students to participate in school policy processes.
The gap between archival indicators and citizen perceptions, as a result, might be larger in a centralized system than in a decentralized one. At the same time, the Korean school system places a high priority on student performance on standardized tests (Mintrom and Walley 2013, 262; Seth 2002). Unlike the United States, Korea gives a national exam that all students take. Performance on these high-stakes exams determines the future educational and occupational opportunities available to students. This system is reinforced by the Korean national policy of heavy investment in human capital, under which about 70% of high school graduates go on to higher education and there is a clear hierarchical ranking of Korean universities (Seth 2002). The evidence from this context, therefore, can contribute to building a more generalizable theory of performance assessment.

Data and Methods

The Seoul Education Longitudinal Study (SELS) of South Korea provides the data for this study. From 2010 to 2015, the Seoul Metropolitan Office of Education surveyed parents, students, teachers, and principals to establish better education policies. The SELS also includes archival data on various school characteristics such as school type, school ownership, school resources, and school environment, as well as standardized test scores for individual students. Unique school identifiers and student identifiers allow the survey participants to be linked to one another. The data collection was designed to be representative of the Seoul City school population using a stratified two-stage sampling technique. In the first stage, the number of schools was determined based on the total number of schools in each school district. In the next stage, schools were randomly selected, and then students were randomly selected from each school. The surveys have high response rates for both middle and high school students (ranging from 72.7% to 95.7%).
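The stratified two-stage design described above can be sketched in a few lines. This is an illustrative toy version, not the SELS sampling code: district counts, the total sample size, and the per-school student draw are all invented numbers.

```python
import random

random.seed(0)

# Hypothetical district sizes (total schools per district) and sample size.
districts = {"A": 40, "B": 25, "C": 35}
total_schools = sum(districts.values())
sample_size = 20  # total schools to sample (invented)

# Stage 1: allocate the school sample proportionally to district size.
allocation = {d: round(sample_size * n / total_schools) for d, n in districts.items()}

# Stage 2: randomly select schools within each district, then students
# within each selected school (here, 30 students from a roster of 500).
sample = {}
for d, n_schools in districts.items():
    schools = [f"{d}-{i}" for i in range(n_schools)]
    chosen = random.sample(schools, allocation[d])
    sample[d] = {s: random.sample(range(500), 30) for s in chosen}

print(allocation)
```

The proportional allocation in stage 1 is what makes the resulting sample representative of the district composition of the school population.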
Dropping cases with missing values yields a sample over the 6 years that includes 45,344 parents, 45,893 students, and 12,583 teachers. An advantage of the SELS data is that it includes assessments by three different stakeholders as well as archival performance indicators. This allows us to incorporate various opinions from multiple stakeholders and test for differences or similarities among them (in contrast to studies with only one or two stakeholders). Another benefit of these data is that all personal information is collected at the individual level. In addition to reducing aggregation bias, the individual level analysis permits us to determine whether students and parents rate the quality of schools based on their own (or their own child's) performance or the average performance of all students. The unit of analysis is the individual for all models. This study uses regression analysis with fixed effects for years and standard errors clustered at the school level to address potential serial correlation and heteroscedasticity.

Measures

Perceptual Performance Measures

To measure perceived school quality by citizens and employees, this article adopts three different dependent variables: parents' evaluations, students' evaluations, and teachers' evaluations. The parent and student survey questions ask about various aspects of the school including overall satisfaction with the school, student learning, course variety, career consultation, educational facilities, school safety, etc. For each question, parents and students answer on a five-point scale (from strongly disagree to strongly agree). All survey items loaded strongly onto a single factor, producing an eigenvalue of 4.69 in the parent survey (Cronbach's alpha = 0.90) and an eigenvalue of 4.21 in the student survey (Cronbach's alpha = 0.89), indicating strong internal consistency.
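The internal-consistency statistic reported above can be computed directly from the item responses. A minimal sketch with simulated five-point responses (all data here are invented; the actual SELS items and values differ):

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, k_items) matrix of item scores."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # variance of each survey item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the summed scale
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Simulated responses: 200 hypothetical respondents, 4 items sharing a common
# latent "school quality" signal, discretized to a 1-5 scale.
rng = np.random.default_rng(1)
latent = rng.normal(size=(200, 1))
responses = np.clip(np.rint(3 + latent + rng.normal(scale=0.7, size=(200, 4))), 1, 5)
print(round(cronbach_alpha(responses), 2))
```

Because the simulated items share a strong common signal, the resulting alpha is high, mirroring the 0.89-0.90 values reported for the parent and student scales.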
The teacher evaluation indicator is constructed with a separate factor analysis of questions asking teachers to evaluate the quality of the school, using a five-point scale. For the teacher model, all items loaded on a single factor with an eigenvalue of 3.55 (Cronbach's alpha = 0.82) (for details, see Table A1 in Appendix).

Archival Performance Measures

The main independent variable of theoretical interest is objective school performance as measured by student test scores. All students in the sample take standardized tests in Korean, English, and math every year, and we create a performance index based on these scores. Specifically, we calculate the average score across Korean, English, and math, and then standardize the average scores across students. The school level performance index is the school mean score. Although the education literature widely recognizes that education is more than the results of standardized tests, standardized tests are the key performance indicator for Korean schools. The Korean education system relies heavily on standardized tests to evaluate students, and such tests determine future student educational opportunities including admission to higher education (Mintrom and Walley 2013).

Control Variables

School input variables and school characteristics are included in the models. Previous literature suggests that socioeconomic and demographic factors (e.g., income, education, age, and race) play an important role in shaping citizen satisfaction and preferences (Brown and Coulter 1983; Brown and Reed Benedict 2002). For socioeconomic variables, we include household income (logged) and parent education (average of both parents) at the individual level. Highly educated and experienced teachers are resources that schools can use to achieve high performance.
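The performance-index construction described above (subject averages per student, standardized across the sample, then averaged by school) can be sketched as follows. The column names and toy scores are assumptions for illustration, not the SELS codebook:

```python
import pandas as pd

# Invented scores for four students in two schools.
df = pd.DataFrame({
    "school":  ["s1", "s1", "s2", "s2"],
    "korean":  [80, 60, 70, 90],
    "english": [85, 55, 65, 95],
    "math":    [75, 65, 75, 85],
})

# Average the three subject scores for each student...
df["avg"] = df[["korean", "english", "math"]].mean(axis=1)
# ...then standardize the student averages (z-scores across the sample).
df["student_perf"] = (df["avg"] - df["avg"].mean()) / df["avg"].std(ddof=0)
# School-level performance is the school mean of the standardized scores.
school_perf = df.groupby("school")["student_perf"].mean()
print(school_perf)
```

Standardizing before aggregation puts student performance and school performance on the same scale, which matters later when their coefficients and interaction are compared.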
For each school, we measure average teacher education as the percentage of teachers with either a master's or doctorate, average teacher experience as the average years of teachers' teaching experience, the percentage of full-time teachers, and class size (the student-teacher ratio). Various school characteristics such as the poverty level, school size, and school type can also play a significant role in shaping performance. The citizen satisfaction literature finds that citizens in low-income or predominantly ethnic minority neighborhoods are generally less satisfied with service quality (Rossi 1972). To capture group-level poverty effects, we include the number of students eligible for free lunch (or reduced-price lunch) for each school (logged). We also control for the percentage of low-achievement students in each school using the percentage of low performers in Korean, English, and math. To capture the effect of school size, we include the total number of students (logged). Legal ownership of schools (public or private) and the type of school (middle, general high, and vocational high) are also controlled in all models. The teacher model has additional controls for employee characteristics. Given that demographic characteristics play significant roles in shaping teacher satisfaction (Grissom, Nicholson-Crotty, and Keiser 2012), we control for the teacher's gender, age, and educational attainment at the individual level. Descriptive statistics and coding schemes are shown in Table A2 in Appendix.

Findings

Table 1 shows the bivariate correlations between parents', students', and teachers' evaluations and standardized test scores. Both service users' (parents' and students') and service providers' (teachers') evaluations have positive and significant correlations with test scores.
We interpret these relationships as moderate to strong given that stakeholders may have different definitions of performance (which may or may not include test scores) and the potential for unique variance at the individual level (versus the cancellation of errors at the aggregate level).

Table 1. Correlations among Different Performance Indicators

                        Academic     Parents'     Students'    Teachers'
                        Achievement  Evaluations  Evaluations  Evaluations
Academic achievement    1
Parents' evaluations    0.220*       1
Students' evaluations   0.184*       0.537*       1
Teachers' evaluations   0.181*       0.259*       0.319*       1

Note: School-level analysis. *p < .05.

The first two equations in table 2 test how citizen assessments of school quality are related to student test scores. Columns 1 and 2 present the results for parents and students, respectively. Consistent with theoretical expectations, student performance (individual student test scores) is positively and significantly related to parents' evaluations even with extensive controls.
In addition to the test score of their own child, the parents' evaluations are also positively associated with school performance (the mean school test score). Interestingly, the magnitude of the coefficient for school performance is much larger than that for individual student performance. This result suggests that parents are more satisfied with schools not only when their own child performs well, but also when the school achieves high academic performance overall.

Table 2. The Relationship between Archival Performance Indicators and Perceptual Evaluations

                                      Citizen Evaluations                   Employee Evaluations
                                      Parents'           Students'          Teachers'
                                      Evaluations        Evaluations        Evaluations
Student performance                   0.068*** (0.007)   0.125*** (0.007)
School performance                    0.169*** (0.027)   0.172*** (0.029)   0.135*** (0.030)
Household income                      −0.012 (0.009)     0.002 (0.008)      −0.034 (0.055)
Parents education                     0.000 (0.008)      0.007 (0.007)      0.022 (0.035)
Average teacher education             0.004*** (0.001)   0.004*** (0.001)   0.003 (0.002)
Average teacher experience            −0.007* (0.003)    −0.009** (0.003)   −0.003 (0.004)
Full-time teachers                    0.003 (0.002)      −0.001 (0.002)     −0.002 (0.002)
Class size                            −0.028*** (0.005)  −0.028*** (0.005)  −0.009 (0.009)
Low-achievement students              −0.021*** (0.004)  −0.010** (0.003)   −0.034*** (0.005)
Free-lunch students                   −0.000 (0.010)     0.011 (0.011)      0.020 (0.017)
Number of students                    −0.131** (0.042)   −0.150*** (0.044)  −0.209*** (0.056)
Public school                         −0.217*** (0.031)  −0.106*** (0.031)  0.098* (0.041)
School type: general high school      −0.060 (0.038)     −0.397*** (0.035)  −0.090+ (0.053)
School type: vocational high school   0.196** (0.063)    −0.083 (0.069)     −0.206* (0.083)
Teacher gender (female)                                                     0.113*** (0.024)
Teacher age                                                                 0.028*** (0.007)
Teacher education                                                           0.058** (0.018)
Adjusted R2                           0.076              0.158              0.055
N                                     45,344             45,893             12,583

Note: All equations control for dummy variables for individual years (not shown). Reference for school type is middle school. Standard errors are clustered at the school level and shown in parentheses. +p < .10, *p < .05, **p < .01, ***p < .001 (two-tailed test).

The student evaluation model in table 2 presents results similar to the parent model. Students give more favorable evaluations to their schools when they perform well on standardized tests and when their school achieves higher academic performance. The coefficient for school performance is larger than that for student performance, although the difference between them is smaller than in the parent model. Control variables in the parent and student models show similar patterns. Parents' and students' evaluations are positively associated with teachers' education levels but negatively related to bigger class sizes, more low-achievement students, and larger schools. Parents and students in public schools are less satisfied with school quality than those in private schools. Parents with children in vocational high schools are more satisfied than middle school parents, and general high school students are the least satisfied.
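The estimation strategy behind table 2 (OLS with year fixed effects and school-clustered standard errors) can be sketched with simulated data. Everything below is illustrative: the data, the coefficient values, and the variable names are assumptions, not the SELS estimates.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulate a toy panel: 600 students across 30 schools and 3 years.
rng = np.random.default_rng(42)
n = 600
df = pd.DataFrame({
    "school": rng.integers(0, 30, n),
    "year": rng.choice([2010, 2011, 2012], n),
    "student_perf": rng.normal(size=n),
})
# School performance as the school mean of student performance.
df["school_perf"] = df.groupby("school")["student_perf"].transform("mean")
# Evaluations built from both performance levels plus noise (coefficients invented).
df["evaluation"] = (0.07 * df["student_perf"] + 0.17 * df["school_perf"]
                    + rng.normal(scale=0.5, size=n))

# OLS with year dummies; standard errors clustered at the school level.
model = smf.ols("evaluation ~ student_perf + school_perf + C(year)", data=df)
result = model.fit(cov_type="cluster", cov_kwds={"groups": df["school"]})
print(result.params[["student_perf", "school_perf"]])
```

Clustering at the school level allows the errors of students within the same school to be correlated, which is the concern the article cites when justifying this design.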
Turning the kaleidoscope, the results for the teacher model are presented in column 3 of table 2. Similar to parents' assessments, test scores show a positive and significant coefficient in the teacher model. School type and individual employee characteristics show significant relationships with teachers' evaluations. Unlike service users (parents and students) in public schools, who are less satisfied than those in private schools, service providers (teachers) in public schools tend to be more satisfied with their schools. Teachers in high schools are more likely to have negative perceptions of their schools than middle school employees. Female teachers tend to give more favorable evaluations to schools than their male counterparts. Highly educated teachers and older teachers are more likely to have positive perceptions of their school quality than teachers with less education and younger teachers, respectively. The next analyses include a term representing the interaction of student level performance and school level performance to test hypotheses 4 and 5. Table 3 shows the results for both parents' evaluations and students' evaluations. For both parent and student models, we find that both archival performance variables and their interaction have positive and significant coefficients. This result indicates that individual student performance matters more when schools also achieve better overall performance. Table 3.
The Interactive Effects of Individual and Aggregate Level Performance on Citizen Evaluations

                                          Parents'           Students'
                                          Evaluations        Evaluations
Student performance                       0.064*** (0.007)   0.121*** (0.007)
School performance                        0.139*** (0.026)   0.151*** (0.029)
Student performance × School performance  0.052*** (0.015)   0.036* (0.016)
Household income                          −0.012 (0.009)     0.001 (0.008)
Parents education                         −0.000 (0.008)     0.007 (0.007)
Average teacher education                 0.004*** (0.001)   0.004*** (0.001)
Average teacher experience                −0.007* (0.003)    −0.009** (0.003)
Full-time teachers                        0.003 (0.002)      −0.001 (0.002)
Class size                                −0.027*** (0.005)  −0.027*** (0.005)
Low-achievement students                  −0.023*** (0.004)  −0.011*** (0.003)
Free-lunch students                       0.003 (0.010)      0.013 (0.011)
Number of students                        −0.127** (0.041)   −0.148*** (0.043)
Public school                             −0.215*** (0.030)  −0.105*** (0.030)
School type: general high school          −0.060 (0.037)     −0.397*** (0.034)
School type: vocational high school       0.158* (0.066)     −0.110 (0.071)
Adjusted R2                               0.077              0.158
N                                         45,344             45,893

Note: All equations control for dummy variables for individual years (not shown). Reference for school type is middle school. Standard errors are clustered at the school level and shown in parentheses. +p < .10, *p < .05, **p < .01, ***p < .001 (two-tailed test).

A more intuitive way of illustrating the interactive relationship is plotting predicted values.6 Figure 2 shows predicted parents' evaluations at varying levels of student and school level performance with 95% confidence intervals. The solid line shows the relationship when school performance is high (two standard deviations above the mean), and the dashed line illustrates the relationship at low levels of school performance (two standard deviations below the mean). The slope of the solid line is positive, suggesting that student performance has a positive effect on parents' evaluations at high levels of school performance. The slope of the dashed line remains flat, however, indicating that student performance has little additional effect when schools perform poorly. Figure 3 illustrates the predicted effects of student performance on students' evaluations. It also suggests that student performance has a positive relationship with students' evaluations in schools with a high level of performance (the solid line). As the level of school performance decreases, the slope of the relationship is still positive but becomes relatively flat (the dashed line). Figure 2.
Predicted Effects of Student Performance on Parents' Evaluations

Figure 3. Predicted Effects of Student Performance on Students' Evaluations

Indications of Discriminant Validity

Thus far the analysis has shown that parents', students', and teachers' evaluations of the schools are positively correlated with the official performance measure (student test scores), even controlling for numerous variables that also affect test scores. The results support the convergent validity of both citizen and employee evaluations of services. The perceptual assessments, however, are not perfectly correlated with the test score results. Although this could simply reflect perceptual error, it is also possible that the assessments of parents, students, and teachers contain some common element that is separate from what student test scores measure. The education literature criticizing the use of standardized tests argues that they are poor measures of critical thinking, creativity, student well-being, and other factors that contribute positively to a student's education (Linn 2001; Worthen 1993). These other factors could generate some commonality among the perceptual indicators that is independent of the test scores. Such commonality might indicate some discriminant validity for the various perceptual assessments. The first indicator of commonality among the perceptual assessments is to take the residuals from each of the models in table 2 and correlate them. Essentially, this exercise asks whether what test scores and the other factors cannot explain has some commonalities across the indicators.
When this is done at the school level (the teacher model is at the school level only), the parent and student residuals are strongly correlated, and the teacher residuals are positively associated with both the parent residuals and the student residuals (Table A3 in Appendix). A second way to demonstrate the linkage between the various indicators is to add the other perceptual indicators directly to the equations in table 2 to determine if they are positively related. This is a more demanding test than the residuals test because it has to overcome the collinearity among the perceptual indicators. When this is done, both students' evaluations and teachers' evaluations are positively related to parents' evaluations; the effect size for students' evaluations is much larger than that of teachers' evaluations. In the student model, both parents' evaluations and teachers' evaluations are positively related to the student measure, with the influence of parents being about three times larger. Finally, in the teacher model, both parents' and students' perceptual assessments are positively associated with teachers' evaluations, with parents' evaluations having the greater influence. In all models, the significant effect of the archival performance indicators remains the same even when we include perceptual evaluations from other stakeholders (Table A4 in Appendix). The third indicator of potential discriminant validity is to take advantage of the panel nature of the data set and run a Granger (1969) causality analysis. Essentially, Granger causality asks whether prior values of one variable can predict current levels of another variable while controlling for past levels of that variable. The test is a null hypothesis test; that is, if past values of the first variable add no statistical improvement in predicting the second variable, then one can conclude that the first variable is not causally related to the second.
The Granger tests suggest that the three perceptual measures of school performance (parents, students, and teachers) are reciprocally related to each other, and strongly so (Table A5 in Appendix). In sum, the three preliminary tests indicate that the perceptual assessments of educational quality by parents, students, and teachers share some commonalities that are separate from test scores and that these assessments feed back to each other. It is not the case that just one of the groups, such as teachers, shapes the expectations of all other groups. These results are consistent with the notion that these perceptual measures have discriminant validity relative to test scores. At the same time, the analysis does not reveal what this commonality is or what in the education process it might be responding to. Investigating this commonality, therefore, remains an interesting future research question regarding education policy and the relationship among performance indicators.

Discussion and Conclusion

Assessing performance has become more important for modern governments under increased pressure to promote efficiency and improve accountability through quality public services (Martin and Smith 2005). Finding valid performance measures, however, is not easy given the multidimensional nature of public performance. The ideal situation, therefore, is to have multiple performance measures that "give similar results both descriptively and in analyses of relationships between performance and other key concepts" (Andersen, Heinesen, and Pedersen 2016, 65). This article incorporates multiple stakeholders' perspectives on performance and investigates how they are related to archival performance indicators. In the Korean school context, we find that both service users (parents and students) and service providers (teachers) show similarities in their assessments relative to archival performance indicators.
The consistency of performance measures from different sources supports the convergent validity of the performance indicators. Based on the individual-level analysis, the study also reveals that (1) both individual- and aggregate-level performance indicators have a significant relationship with citizen evaluations and (2) the effect of individual-level performance on citizen evaluations changes depending on the level of aggregate performance. Specifically, both individual student test scores and school mean scores are significantly associated with parents’ and students’ assessments of school performance, and student achievement has a larger positive relationship with parents’ and students’ evaluations in high-performing schools than in low-performing schools. These results contribute to the citizen satisfaction literature by suggesting that citizens base their performance assessments on both individual and aggregate performance. The logic is similar to Cyert and March’s (1963) notion of social aspirations, in which organizations (or, in this case, individuals) compare themselves to peers. In the case of Korean schools, individual user evaluations are contingent on an inherent comparison to others. A significant relationship between archival performance and perceptual evaluations in a centralized context indicates that the convergent validity of school performance indicators may be generalizable. Parents in decentralized education systems can signal preferences about schools by exercising school choice options (Schneider, Teske, and Marschall 2000), and they have more opportunities to get involved in education policy decision making by participating at the school level. Official performance indicators, therefore, should be more likely to reflect parent opinions in decentralized systems. These local control mechanisms often do not work well in centralized systems.
Despite Korean schools operating under a centralized education system in which parents and students have limited school choice, these results show that even under these unfavorable conditions archival performance indicators are positively linked to citizen evaluations. What factors, then, contribute to the association between academic achievement and citizens’ evaluations despite this limited school choice? Two factors stand out: (1) the salience of academic performance in attaining higher positions in society and (2) the uniformity of the Korean education system. First, education has long been valued as a route to higher social status, and examinations have been used as a social selection device in Korea. Under Confucian philosophy, Korean society stresses merit as the valid criterion for assessing an individual and awarding social status (Seth 2002), and academic advancement (especially entering prestigious universities) has been the pathway to high-paying jobs and higher social positions.7 The positive correlation between educational attainment and socioeconomic status exists in other countries, but the relationship is especially strong in Korea because of its historical emphasis on human-capital-based economic growth. The system, structured to allow social mobility through education, has generated high social demand for education and propelled educational expansion (Seth 2002). For example, Korea has the highest tertiary education completion rate among OECD countries, with 70.0% of the population aged 25–34 completing tertiary education in 2016. This figure is substantially higher than the OECD average (43.1%) and than in other Western countries such as the United States (47.5%) and the United Kingdom (52.0%) (OECD 2017).8 Second, uniformity in the education system facilitates the use of standardized exams as selection criteria and highlights the importance of test scores.
Korean society has pursued uniformity in education to ensure equal educational opportunity and fairness in educational access (Seth 2002).9 Korea has a national curriculum (applying to both public and private schools) and allocates resources evenly to public and private schools alike. This system contrasts sharply with the localized US system, which has no national curriculum (each state sets its own curriculum requirements), an independent private school sector, and school funding that reflects local property taxes. In Korea, a governmental agency is in charge of national-level examinations such as the college entrance exam. Nationwide standardized tests make test scores comparable across all students. As a point of comparison, college entrance tests also exist in the United States (e.g., the SAT and the ACT), but only some students in the country take them, and the exams are administered by private nonprofit organizations, not by the government. Test scores, therefore, are far more critical in Korea because they are the common measure across all students. The use of test scores as the primary mechanism for deciding who is admitted to prestigious universities has made Korean parents and students take test scores very seriously (Mintrom and Walley 2013, 262). Another notable finding is that parents and students respond more to school performance than to individual student performance. This result implies that citizens may consider school performance more important than individual performance when evaluating education quality. Parents and students in Seoul are informed by their schools of the school mean scores as well as their own (child’s) score, and having knowledge of objectively measured school performance may help parents and students make more accurate judgments about school quality.
This result might also reflect the collectivist culture of Korea, which features more cohesive social ties between individuals and places group goals above individual needs (Hofstede, Hofstede, and Minkov 1991). Given this culture, parents and students in Korea might consider a high level of group performance to be as important as individual performance. This finding also poses a challenge to Tiebout models of choice, which rely on individual goals rather than collective ones. Replicating these results in less collectivist cultures, therefore, could make a major contribution to theories of choice and decision making. Lastly, in addition to showing the convergent validity of school performance measures, we also find indirect evidence of discriminant validity. The similar results from parents, students, and teachers and the subsequent Granger causality models indicate that there is likely reciprocal causation among the perceptual assessments of school performance. The inter-relationships among parents’, students’, and teachers’ evaluations hold even when controlling for standardized test scores and a wide range of other control variables; they are not the result of such factors as income, education, or other community-level variables. Each group sees something common in school performance that is different from the results on standardized tests. This indirect indicator of discriminant validity suggests the need for additional research into the composition of stakeholder assessments and for greater theoretical attention to the measurement of government performance.

Supplementary Material

Supplementary material is available at the Journal of Public Administration Research and Theory online.

Appendix

Table A1.
Factor-Analytical Results of Survey Items

Survey Item | Factor Loading

Parents’ evaluations
  I am satisfied with my child’s school | 0.69
  This school improves my child’s academic ability | 0.82
  This school improves my child’s aptitude skills | 0.77
  This school teaches my child based on the level of his or her academic ability | 0.81
  This school provides consultation and activities to develop my child’s career | 0.76
  This school has enough educational facilities and adequate environment for education | 0.71
  This school cares about my child’s safety at school | 0.74
  Teachers in this school are passionate about teaching | 0.81
  Eigenvalue | 4.69
  Cronbach’s alpha | 0.90

Students’ evaluations
  I am satisfied with my school | 0.68
  This school improves student academic ability | 0.84
  This school improves student aptitude skills | 0.82
  This school teaches students based on the level of his or her academic ability | 0.79
  This school provides consultation and activities to develop student career | 0.77
  This school has enough educational facilities and adequate environment for education | 0.74
  This school cares about student safety at school | 0.77
  Eigenvalue | 4.21
  Cronbach’s alpha | 0.89

Teachers’ evaluations
  I am satisfied with this school | 0.59
  This school improves student academic ability | 0.78
  This school improves student aptitude skills | 0.78
  This school teaches students based on the level of his or her academic ability | 0.76
  This school provides consultation and activities to develop student career | 0.73
  This school has enough educational facilities and adequate environment for education | 0.60
  This school cares about student safety at school | 0.71
  Eigenvalue | 3.55
  Cronbach’s alpha | 0.82

Table A2.
Descriptive Statistics for All Variables

Variable | Mean | SD | Min | Max | Source

Dependent variables
  Parents’ evaluations | 0.0 | 1.0 | −4.1 | 2.7 | Parent survey
  Students’ evaluations | 0.0 | 1.0 | −3.4 | 2.3 | Student survey
  Teachers’ evaluations | 0.0 | 1.0 | −5.0 | 2.3 | Teacher survey

Independent variables
  Student performance | 0.0 | 1.0 | −5.1 | 4.7 | Archival data
  School performance | 0.0 | 0.6 | −2.5 | 2.9 | Archival data

Parent characteristics
  Household income (logged) | 5.9 | 0.7 | 0 | 9.2 | Parent survey
  Parent education | 4.1 | 1.1 | 1 | 7 | Parent survey

Teacher characteristics
  Teacher gender (female = 1) | 0.61 | — | 0 | 1 | Teacher survey
  Teacher age | 4.6 | 1.5 | 1 | 6 | Teacher survey
  Teacher education | 1.5 | 0.6 | 1 | 4 | Teacher survey

School characteristics
  Average teacher education (%) | 17.3 | 17.6 | 0 | 94.9 | Archival data
  Average teacher experience (years) | 19.4 | 4.0 | 4 | 34 | Archival data
  Full-time teachers (%) | 83.0 | 7.4 | 44.1 | 100 | Archival data
  Class size (student–teacher ratio) | 15.9 | 3.2 | 3.9 | 42 | Archival data
  Low-achievement students (%) | 4.1 | 4.3 | 0 | 37.4 | Archival data
  Free lunch students (logged) | 4.7 | 1.1 | 0 | 7.4 | Archival data
  Number of students (logged) | 6.0 | 0.5 | 3.4 | 7.5 | Archival data
  Ownership: public school | 0.54 | — | 0 | 1 | Archival data
  Ownership: private school | 0.46 | — | 0 | 1 | Archival data
  School type: middle school | 0.46 | — | 0 | 1 | Archival data
  School type: general high school | 0.45 | — | 0 | 1 | Archival data
  School type: vocational high school | 0.09 | — | 0 | 1 | Archival data

Note: Based on a sample of 45,344 parents, 45,893 students, and 12,583 teachers used in the analysis. Parent educational attainment is coded as elementary school graduate = 1; middle school graduate = 2; high school graduate = 3; 2- (or 3-) year college graduate = 4; 4-year university graduate = 5; Master’s degree = 6; and Doctorate = 7. Teacher gender is coded as a dummy variable (female = 1; male = 0). Teacher age is coded as a categorical variable with six categories (younger than 25 years = 1; between 26 and 30 years = 2; between 31 and 35 years = 3; between 36 and 40 years = 4; between 41 and 45 years = 5; and older than 46 years = 6). Teacher education is coded as a four-category variable (bachelor’s degree = 1; Master’s degree = 2; doctoral course completion = 3; and Doctorate = 4).

Table A3.
Correlations among the Residuals

 | Parent Residuals | Student Residuals | Teacher Residuals
Residuals from the parent model | 1 | | 
Residuals from the student model | 0.463* | 1 | 
Residuals from the teacher model | 0.187* | 0.161* | 1

Note: The residuals come from the regression models in Table 2. School-level analysis. *p < .05.

Table A4.
Discriminant Validity of Subjective Evaluations

 | Parents’ Evaluations | Students’ Evaluations | Teachers’ Evaluations
Student performance | 0.035*** (0.008) | 0.084*** (0.008) | 
School performance | 0.085*** (0.023) | 0.071** (0.026) | 0.103*** (0.029)
Parents’ evaluations | | 0.290*** (0.008) | 0.187*** (0.034)
Students’ evaluations | 0.309*** (0.010) | | 0.108*** (0.031)
Teachers’ evaluations | 0.137*** (0.020) | 0.099*** (0.021) | 
Adjusted R2 | 0.154 | 0.213 | 0.067
N | 26,509 | 26,509 | 12,571

Note: All control variables in Table 2 are included in each model. Standard errors are clustered at the school level and shown in parentheses. +p < .10, *p < .05, **p < .01, ***p < .001 (two-tailed test).

Table A5.
Granger Causality Tests

Independent Variable → Dependent Variable | F | p-Value
Students’ evaluations → Parents’ evaluations | 19.66 | .000
Teachers’ evaluations → Parents’ evaluations | 19.65 | .000
Parents’ evaluations → Students’ evaluations | 12.94 | .000
Teachers’ evaluations → Students’ evaluations | 33.28 | .000
Parents’ evaluations → Teachers’ evaluations | 27.42 | .000
Students’ evaluations → Teachers’ evaluations | 38.90 | .000

Note: School-level analysis.

References

Amirkhanyan, Anna A., Hyun Joon Kim, and Kristina T. Lambright. 2014.
The performance puzzle: Understanding the factors influencing alternative dimensions and views of performance. Journal of Public Administration Research and Theory 24: 1–34.

Andersen, Lotte Bøgh, Andreas Boesen, and Lene Holm Pedersen. 2016. Performance in public organizations: Clarifying the conceptual space. Public Administration Review 76: 852–62.

Andersen, Lotte Bøgh, Eskil Heinesen, and Lene Holm Pedersen. 2016. Individual performance: From common source bias to institutionalized assessment. Journal of Public Administration Research and Theory 26: 63–78.

Andersen, Simon Calmar, and Peter B. Mortensen. 2010. Policy stability and organizational performance: Is there a relationship? Journal of Public Administration Research and Theory 20: 1–22.

Andrews, Rhys, George A. Boyne, and Richard M. Walker. 2006. Subjective and objective measures of organizational performance: An empirical exploration. In Public service performance: Perspectives on measurement and management, ed. George A. Boyne, Kenneth J. Meier, Laurence J. O’Toole, and Richard M. Walker, 14–34. Cambridge: Cambridge Univ. Press.

Barrows, Samuel, Michael Henderson, Paul E. Peterson, and Martin R. West. 2016. Relative performance information and perceptions of public service quality: Evidence from American school districts. Journal of Public Administration Research and Theory 26: 571–83.

Boyne, George A. 2002. Concepts and indicators of local authority performance: An evaluation of the statutory frameworks in England and Wales. Public Money and Management 22: 17–24.

———. 2003a. Sources of public service improvement: A critical review and research agenda. Journal of Public Administration Research and Theory 13: 367–94.

———. 2003b. What is public service improvement?
Public Administration 81: 211–27.

Boyne, George A., Kenneth J. Meier, Laurence J. O’Toole, and Richard M. Walker. 2006. Public service performance: Perspectives on measurement and management. Cambridge: Cambridge Univ. Press.

Brewer, Gene A. 2006. All measures of performance are subjective: More evidence on US federal agencies. In Public service performance: Perspectives on measurement and management, ed. George A. Boyne, Kenneth J. Meier, Laurence J. O’Toole, and Richard M. Walker, 35–54. Cambridge: Cambridge Univ. Press.

Brown, Ben, and Wm Reed Benedict. 2002. Perceptions of the police: Past findings, methodological issues, conceptual issues and policy implications. Policing: An International Journal of Police Strategies & Management 25: 543–80.

Brown, Karin, and Philip B. Coulter. 1983. Subjective and objective measures of police service delivery. Public Administration Review 43: 50–8.

Brudney, Jeffrey L., and Robert E. England. 1982. Urban policy making and subjective service evaluations: Are they compatible? Public Administration Review 42: 127–35.

Cameron, Kim S. 1986. Effectiveness as paradox: Consensus and conflict in conceptions of organizational effectiveness. Management Science 32: 539–53.

Campbell, Donald T., and Donald W. Fiske. 1959. Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin 56: 81–105.

Charbonneau, Étienne, and Gregg G. Van Ryzin. 2012. Performance measures and parental satisfaction with New York City schools. The American Review of Public Administration 42: 54–65.

Connolly, Terry, Edward J. Conlon, and Stuart Jay Deutsch. 1980.
Organizational effectiveness: A multiple-constituency approach. Academy of Management Review 5: 211–8.

Cyert, Richard M., and James G. March. 1963. A behavioral theory of the firm. Englewood Cliffs, NJ: Prentice Hall.

Favero, Nathan, and Justin B. Bullock. 2015. How (not) to solve the problem: An evaluation of scholarly responses to common source bias. Journal of Public Administration Research and Theory 25: 285–308.

Favero, Nathan, and Kenneth J. Meier. 2013. Evaluating urban public schools: Parents, teachers, and state assessments. Public Administration Review 73: 401–12.

Freeman, R. Edward. 1984. Strategic management: A stakeholder approach. Marshfield, MA: Pitman Publishing.

Granger, Clive W. J. 1969. Investigating causal relations by econometric models and cross-spectral methods. Econometrica: Journal of the Econometric Society 37: 424–38.

Grissom, Jason A., Jill Nicholson-Crotty, and Lael Keiser. 2012. Does my boss’s gender matter? Explaining job satisfaction and employee turnover in the public sector. Journal of Public Administration Research and Theory 22: 649–73.

Hirschman, Albert O. 1970. Exit, voice, and loyalty: Responses to decline in firms, organizations, and states. Cambridge, MA: Harvard Univ. Press.

Hofstede, Geert, Gert Jan Hofstede, and Michael Minkov. 1991. Cultures and organizations: Software of the mind. London: McGraw-Hill.

Jacobsen, Rebecca, Jeffrey W. Snyder, and Andrew Saultz. 2015. Understanding satisfaction with schools: The role of expectations. Journal of Public Administration Research and Theory 25: 831–48.

James, Oliver. 2009. Evaluating the expectations disconfirmation and expectations anchoring approaches to citizen satisfaction with local public services. Journal of Public Administration Research and Theory 19: 107–23.
Google Scholar CrossRef Search ADS   Kelly, Janet M. 2003. Citizen satisfaction and administrative performance measures: Is there really a link? Urban Affairs Review  38: 855– 66. Google Scholar CrossRef Search ADS   ———. 2005. The dilemma of the unsatisfied customer in a market model of public administration. Public Administration Review  65: 76– 84. CrossRef Search ADS   Kelly, Janet M., and David Swindell. 2002. A multiple–indicator approach to municipal service evaluation: Correlating performance measurement and citizen satisfaction across jurisdictions. Public Administration Review  62: 610– 21. Google Scholar CrossRef Search ADS   Licari, Michael J., William McLean, and Tom W. Rice. 2005. The condition of community streets and parks: A comparison of resident and nonresident evaluations. Public Administration Review  65: 360– 68. Google Scholar CrossRef Search ADS   Linn, Robert L. 2001. A century of standardized testing: Controversies and pendulum swings. Educational Assessment  7: 29– 38. Google Scholar CrossRef Search ADS   Lyons, William E., David Lowery, and Ruth Hoogland DeHoog. 1992. The politics of dissatisfaction: Citizens, services, and urban institutions . Armonk, NY: ME Sharpe. Marginson, Simon. 2011. The Confucian model of higher education in East Asia and Singapore. In Higher education in the Asia-Pacific: Strategic responses to globalization , ed. Simon Marginson, Sarjit Kaur, and Erlenawati Sawir, 53– 75. Springer. Google Scholar CrossRef Search ADS   Martin, Stephen, and Peter C. Smith. 2005. Multiple public service performance indicators: Toward an integrated statistical approach. Journal of Public Administration Research and Theory  15: 599– 613. Google Scholar CrossRef Search ADS   Meier, Kenneth J., and Laurence J. O’Toole. 2001. Managerial strategies and behavior in networks: A model with evidence from US public education. Journal of Public Administration Research and Theory  11: 271– 94. Google Scholar CrossRef Search ADS   ———. 2013a. 
I think (I am doing well), therefore I am: Assessing the validity of administrators’ self-assessments of performance. International Public Management Journal  16: 1– 27. ———. 2013b. Subjective organizational performance and measurement error: Common source bias and spurious relationships. Journal of Public Administration Research and Theory  23: 429– 56. CrossRef Search ADS   Meier, Kenneth J., Søren C. Winter, Laurence J. O’Toole, Nathan Favero, and Simon Calmar Andersen. 2015. The validity of subjective performance measures: School principals in Texas and Denmark. Public Administration  93: 1084– 101. Google Scholar CrossRef Search ADS   Mintrom, Michael, and Richard Walley. 2013. Education governance in comparative perspective. In Education governance for the twenty-first century: Overcoming the structural barriers to school reform , ed. Paul Manna and Patrick McGuinn, 252– 74. Washington, DC: The Brookings Institution. Morgeson, Forrest V. 2013. Expectations, disconfirmation, and citizen satisfaction with the US federal government: Testing and expanding the model. Journal of Public Administration Research and Theory  23: 289– 305. Google Scholar CrossRef Search ADS   Moynihan, Donald P. 2008. The dynamics of performance management: Constructing information and reform . Washington, DC: Georgetown Univ. Press. Moynihan, Donald P., Sergio Fernandez, Soonhee Kim, Kelly M. LeRoux, Suzanne J. Piotrowski, Bradley E. Wright, and Kaifeng Yang. 2011. Performance regimes amidst governance complexity. Journal of Public Administration Research and Theory  21 ( Suppl 1): i141– 55. Google Scholar CrossRef Search ADS   OECD. 2017. Population with tertiary education (indicator) . doi: 10.1787/0b8f90e9-en ( accessed October 29, 2017). Parks, Roger B. 1984. Linking objective and subjective measures of performance. Public Administration Review  44: 118– 27. Google Scholar CrossRef Search ADS   Petrovsky, Nicolai, Jue Young Mok, and Filadelfo León‐Cázares. 2017. 
Citizen expectations and satisfaction in a young democracy: A test of the expectancy‐disconfirmation model. Public Administration Review  77: 395– 407. Google Scholar CrossRef Search ADS   Poister, Theodore H., and John Clayton Thomas. 2011. The effect of expectations and expectancy confirmation/disconfirmation on motorists’ satisfaction with state highways. Journal of Public Administration Research and Theory  21: 601– 17. Google Scholar CrossRef Search ADS   Quinn, Robert E., and John Rohrbaugh. 1983. A spatial model of effectiveness criteria: Towards a competing values approach to organizational analysis. Management Science  29: 363– 77. Google Scholar CrossRef Search ADS   Radin, Beryl. 2006. Challenging the performance movement: Accountability, complexity, and democratic values . Washington, DC: Georgetown Univ. Press. Rainey, Hal G. 2009. Understanding and managing public organizations . San Francisco, CA: John Wiley & Sons. Ringquist, Evan J. 2005. Assessing evidence of environmental inequities: A meta‐analysis. Journal of Policy Analysis and Management  24: 223– 47. Google Scholar CrossRef Search ADS   Rossi, Peter H. 1972. Community social indicators. In The human meaning of social change , ed. Angus Campbell and Philip E. Converse, 87– 126. New York: Russell Sage Foundation. Schneider, Mark, Paul Teske, and Melissa Marschall. 2000. Choosing schools: Consumer choice and the quality of American schools . Princeton, NJ: Princeton Univ. Press. Schwab, Donald P. 2005. Research methods for organizational studies . Mahwah, NJ: Psychology Press. Selden, Sally Coleman, and Jessica E. Sowa. 2004. Testing a multi-dimensional model of organizational performance: Prospects and problems. Journal of Public Administration Research and Theory  14: 395– 416. Google Scholar CrossRef Search ADS   Seoul Metropolitan Office of Education. 2017. High school admission information. http://english.sen.go.kr/index.jsp ( accessed January 3, 2017). Statistics Korea. 2017. 
Population census. http://kostat.go.kr/portal/eng/index.action ( accessed October 25, 2017). Seth, Michael J. 2002. Education fever: Society, politics, and the pursuit of schooling in South Korea . Honolulu, HI: University of Hawaii Press. Stipak, Brian. 1979. Citizen satisfaction with urban services: Potential misuse as a performance indicator. Public Administration Review  39: 46– 52. Google Scholar CrossRef Search ADS   ———. 1980. Local governments’ use of citizen surveys. Public Administration Review  40: 521– 5. CrossRef Search ADS   Tiebout, Charles M. 1956. A pure theory of local expenditures. The Journal of Political Economy  64: 416– 24. Google Scholar CrossRef Search ADS   Van Ryzin, Gregg G. 2004. Expectations, performance, and citizen satisfaction with urban services. Journal of Policy Analysis and Management  23: 433– 48. Google Scholar CrossRef Search ADS   ———. 2006. Testing the expectancy disconfirmation model of citizen satisfaction with local government. Journal of Public Administration Research and Theory  16: 599– 611. CrossRef Search ADS   ———. 2013. An experimental test of the expectancy‐disconfirmation theory of citizen satisfaction. Journal of Policy Analysis and Management  32: 597– 614. CrossRef Search ADS   Van Ryzin, Gregg G., Stephen Immerwahr, and Stan Altman. 2008. Measuring street cleanliness: A comparison of New York City’s scorecard and results from a citizen survey. Public Administration Review  68: 295– 303. Google Scholar CrossRef Search ADS   Walker, Richard M., and George A. Boyne. 2006. Public management reform and organizational performance: An empirical assessment of the UK Labour government’s public service improvement strategy. Journal of Policy Analysis and Management  25: 371– 93. Google Scholar CrossRef Search ADS   Walker, Richard M., George A. Boyne, and Gene A. Brewer. 2010. Public management and performance: Research directions . Cambridge: Cambridge Univ. Press. 
Wall, Toby D., Jonathan Michie, Malcolm Patterson, Stephen J. Wood, Maura Sheehan, Chris W. Clegg, and Michael West. 2004. On the validity of subjective measures of company performance. Personnel Psychology 57: 95–118.
Worthen, Blaine R. 1993. Critical issues that will determine the future of alternative assessment. Phi Delta Kappan 74: 444–54.

Footnotes

An earlier version of this article was presented at the 88th Southern Political Science Association Annual Conference, January 12–14, 2017, New Orleans, LA.

1. Another way of categorizing performance indicators is the distinction between objective and subjective measures. Objective and subjective measures are often used interchangeably with archival and perceptual measures, respectively, in the literature; however, the concepts are not identical. Tests that measure students’ stress levels, for instance, are objective measures because they are independent of perceptual judgments, yet they have nothing to do with archives. Since our relatively objective measures come from standardized test scores and our relatively subjective measures come from stakeholders’ perceptual assessments, we use the archival versus perceptual terminology rather than the objective versus subjective distinction in this article.

2. Teacher job satisfaction and motivation, however, are concepts distinct from performance. These measures are significantly associated with teachers’ self-assessments of student performance, but this relationship is possibly due to common source bias (Andersen, Heinesen, and Pedersen 2016).

3. The logic also applies to the relationship between teachers’ evaluations and academic performance, but we cannot test the interactive relationship in the teacher model because the data do not provide a unique ID to match students and teachers.
The only ID available to merge student and teacher data is the school ID; therefore, only school-level performance is available for the teacher model.

4. Random assignment is conducted within school districts, taking students’ commute times to school into account. Specifically, when middle school students apply to high school, they first apply to two high schools among all high schools in Seoul, and in a second stage apply to two high schools within their own school district. Based on these applications, a computer randomly assigns students to schools using a lottery system. In the third stage, students who were not assigned to a school in the first two stages are once again randomly assigned to schools based on school size, class size, commute time, and similar factors (Seoul Metropolitan Office of Education 2017). The commute-time limit has only a modest impact on the random assignment because Seoul is an extremely dense city (about 9.8 million people in total, or about 42,000 persons per square mile) and schools are close together (1.36 high schools and 1.64 middle schools per square mile) (Statistics Korea 2017). Mintrom and Walley (2013, 259) note the “virtual nonexistence of parent choice in the Korean system.”

5. This structure does not mean that choice cannot exist. In any system, wealthy individuals can use their wealth to augment educational opportunities (by sending their children outside the country or seeking entrance into the small number of elite schools in Seoul). Our contention is only that Tiebout-style choices are far more constrained in Seoul than in the United States and similar countries.

6. The plots of the marginal effects are also available in the online appendix.

7. Other Asian countries influenced by Confucian educational traditions (such as China, Japan, Singapore, and Taiwan) have similar systems in which a national examination serves as a social sorting mechanism and academic performance is highly valued (Marginson 2011).

8. Strong social demand for education and the emphasis on test scores contribute to South Korea’s high performance in global assessments of students such as the Programme for International Student Assessment (PISA); however, they also lead to extensive private tutoring and intense pressure on students to succeed academically.

9. Equalization policies in Korea, such as random student assignment and the teacher rotation system, are also rooted in this philosophy of education.

© The Author(s) 2018. Published by Oxford University Press on behalf of the Public Management Research Association. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

Journal of Public Administration Research and Theory, Oxford University Press. Published: March 2, 2018.
