Rating Scale Design Among Ethiopian Entrepreneurs: A Split-Ballot Experiment

Attitudes are particularly sensitive to how they are measured. Question wording, order, and response options affect how respondents understand and answer questions, influencing survey statistics (Krosnick & Fabrigar, 1997; Schwarz, 2008). Many attitudinal questions ask respondents to indicate how much they agree or disagree with various statements, although some research has critiqued this approach (Saris, Revilla, Krosnick, & Shaeffer, 2010). A host of studies have investigated different ways of designing agree/disagree rating scales: the number of categories (Alwin & Krosnick, 1991; Menold & Kemper, 2015; Preston & Colman, 2000; Revilla, Saris, & Krosnick, 2014), whether to include a “neutral” or midpoint option (Kroh, 2007), labeling scale points with words versus numbers (Weijters, Cabooter, & Schillewaert, 2010), and whether to include a “no opinion” option. Other research has explored two-part “branching” questions, in which a first question assesses attitude direction and a second assesses attitude strength (Gilbert, 2015; Malhotra, Krosnick, & Thomas, 2009). There are cultural differences in how people interpret and answer rating scale questions (Hamamura, Heine, & Paulhus, 2007; Johnson, Kulesa, Cho, & Shavitt, 2005; Smith, Mohler, Harkness, & Onodera, 2005; Tobi & Kampen, 2011; Van Vaerenbergh & Thomas, 2012). Most of this cross-cultural research focuses on high- or middle-income countries in Europe, the United States, East Asia, and Latin America (Revilla & Ochoa, 2015). There is far less research on rating scales in low-income countries, even though some scholars have suggested that rating scales might pose particular problems for these populations (Bernal, Wooley, & Schensul, 1997; Flaskerud, 2012).
For example, Flaskerud (1988) reported that many people from Central America and Vietnam had difficulty understanding the task of answering a rating scale question, and would say “yes” or “no” instead. Agans, Deeb-Sossa, and Kalsbeek (2006) found that Mexican women had difficulty distinguishing between different categories on rating scales. Given the growth of large, ongoing attitudinal surveys in developing countries (e.g., Afrobarometer, the Pew Global Attitudes Project, the World Values Survey), there is a surprising lack of research on rating scale design, particularly using experimental methods. There are several reasons why individuals from developing countries may face challenges in comprehending rating scale questions and mapping their true attitude onto the agree/disagree response options. First, lower levels of education and literacy may limit individuals’ ability, or willingness, to make fine-grained distinctions between categories. But even among more highly educated people, agree/disagree scales may create challenges. Because survey research is less common in developing countries, people are less familiar with the agree/disagree scales that are so common in developed countries. Further, in some cultures it is more common to assess attitudes through storytelling rather than direct questions (Flaskerud, 2012). Finally, linguistic issues arise: it is difficult to translate phrases such as “somewhat agree” into some non-Western languages. In recognition of these challenges, Flaskerud (2012) advocates alternative strategies for measuring attitudes, such as dichotomous items or anchoring vignettes (Hopkins & King, 2010). However, both strategies have drawbacks: dichotomous items cannot convey variability in attitude strength, and anchoring vignettes require substantial effort to design and administer.
Many surveys continue to use agree/disagree rating scales in developing countries, despite a lack of evidence about the optimal design for these questions. Three main types of agree/disagree scales are typically used in developing country surveys, each with potential strengths and weaknesses. The first type is numeric scales with verbal labels at the end points only. On the one hand, numeric scales minimize issues with linguistic constructions and translations, and they make it easier for respondents to remember the response options (1–5) in a face-to-face survey. On the other hand, it may feel strange to associate an attitude with a number, and respondents may have difficulty distinguishing between the different options (Flaskerud, 2012). Second, verbal scales include descriptions for each response option. Verbal scales have the advantage of clarifying the meaning of each response option and of using words, a more natural approach. However, they may introduce ambiguity owing to the difficulty of translating concepts such as “somewhat.” Third, branched scales seek to simplify the cognitive process by splitting attitude questions into two parts: the first question assesses the direction of the attitude, and the second assesses attitude strength. Although branched scales reduce complexity, they may also take longer to administer and can feel repetitive (Gilbert, 2015). There is little evidence about which of these scales produces the most valid data. To address this gap in the literature, I designed a split-ballot questionnaire experiment in Ethiopia to shed light on best practices for designing agree/disagree questions in a low-income country. The experiment was embedded in a paper-and-pencil interview (PAPI) of 608 urban Ethiopian entrepreneurs that included attitude questions about their competitive orientation, commitment to running their business, and trust in suppliers.
Respondents were randomly assigned to receive one of three rating scales (numeric, verbal, or branched) throughout the interview. I evaluate the impact of the rating scale on three outcomes: (1) univariate distributions of attitude variables; (2) respondent and interviewer perceptions of the survey; and (3) two measures of validity. Because the survey is based on a sample of urban entrepreneurs, the results of the experiment are limited to this specific population. I triangulate the results of the survey experiment with qualitative data from 20 semi-structured, in-depth interviews conducted during the formative stages of this research.

Methods

Data

I embedded a split-ballot experiment in the Kal Addis Business Survey, an in-person survey of entrepreneurs in Addis Ababa, the Ethiopian capital. Conducted in June and July of 2012, the survey was designed to improve knowledge about how small and medium-sized businesses operate in Ethiopia. During the survey design process, local interviewers conducted 20 semi-structured, in-depth interviews with entrepreneurs from Addis Ababa. Interviewers presented participants with verbal, numeric, and branched scales (in random order) and used both scripted and spontaneous probes to understand participants’ reactions to the scales. The audio-recorded interviews were transcribed and translated verbatim into English. Following the qualitative study, trained interviewers administered the PAPI quantitative survey in Amharic to a sample of 608 owners or managers of small and medium businesses (between 3 and 99 employees) in Addis Ababa. The average interview time was 29 min (SD = 9 min). The study used respondent-driven sampling (RDS) to generate the sample because a high-quality sampling frame of entrepreneurs does not exist in Ethiopia.
RDS is a method of chain-referral sampling that combines a snowball sample with a mathematical model to adjust for biases in network sampling (Heckathorn, 1997; Heckathorn, 2002). RDS was designed for sampling “hidden” populations that are difficult to enumerate and lack a sampling frame (e.g., intravenous drug users, men who have sex with men). Evaluations of RDS suggest that it can approximate probability samples when applied to a highly networked population (McCreesh et al., 2012). The study initially recruited 24 entrepreneurs to participate in the survey. These respondents were then asked to recruit up to three members of their networks to participate, and each subsequent wave of recruits was likewise asked to recruit up to three others. Recruitment proceeded until the target sample size (600) was exceeded. All respondents received a leather wallet as a token of appreciation for completing the survey, and as an incentive to recruit others, individuals were offered mobile-phone airtime minutes for each person they recruited to the survey. More details about the sampling methodology and representativeness are published in Lau and Bobashev (2015). Because this article’s focus is the internal validity of the rating scale experiment, I do not apply RDS weights. The sample was 76% male and 24% female, with an average age of 31.2 (SD = 6.9) years. Fifteen percent did not complete secondary school, 34% completed secondary school, 31% had vocational or completed college training, and 20% had training beyond college. Although this represents a mix of educational levels, the sample is significantly more educated than the general population of Ethiopia (Central Statistical Agency [Ethiopia] & ICF International, 2012). Most respondents were owners of a small or medium business (82%) rather than managers.
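The wave-based recruitment protocol described above (seeds who each recruit up to three others, wave after wave, until the target is exceeded) can be sketched as a toy simulation. The recruitment success probability and random seed below are illustrative assumptions, not study parameters:

```python
import random

def simulate_recruitment(n_seeds=24, max_recruits=3, target=600,
                         p_recruit=0.8, seed=1):
    """Toy simulation of the chain-referral recruitment described above.

    Each respondent attempts up to `max_recruits` recruitments, each
    succeeding with probability `p_recruit` (an illustrative assumption);
    waves continue until the target sample size is exceeded or the
    chain dies out. Returns (total respondents, number of waves).
    """
    rng = random.Random(seed)
    total, wave_size, waves = n_seeds, n_seeds, 0
    while total <= target:
        recruits = sum(1 for _ in range(wave_size * max_recruits)
                       if rng.random() < p_recruit)
        if recruits == 0:  # chain died out before reaching the target
            break
        total += recruits
        wave_size = recruits
        waves += 1
    return total, waves
```

With these illustrative parameters the chain grows by roughly a factor of 2.4 per wave, so the target of 600 is typically exceeded within four or five waves.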
Rating Scales

The instrument included 38 agree/disagree questions that measured attitudes toward operating a business (see Appendix for English question wording). At the beginning of the interview, each respondent was randomly assigned to one of three questionnaire versions, each using a different agree/disagree rating scale. All scales used five response categories, and respondents kept the same rating scale for the entire interview. Random assignment was not systematically associated with respondent or business characteristics (tables available upon request). The numeric scale presented verbal labels at the end points only: the interviewer asked respondents to rate each item on a scale from 1 to 5, where 1 was “completely disagree” and 5 was “completely agree,” and the possible responses were 1, 2, 3, 4, and 5. The verbal scale had full verbal labels: “Completely disagree,” “Somewhat disagree,” “Neutral,” “Somewhat agree,” and “Completely agree.” The branched scale first assessed direction and then strength. Interviewers first asked (in Amharic), “Do you agree, disagree, or are you neutral?” For respondents who endorsed “agree” or “disagree,” interviewers then asked whether they “completely” or “somewhat” agreed/disagreed. I recoded the two questions from the branched scale into a single variable with five categories.

Analysis

The analysis proceeds in four stages. In the first stage, I explore how rating scale design affects univariate distributions of attitude questions. This stage is based on a stacked “person-item” data set in which each respondent’s answer to a scale question contributes one row of data: the data set has 22,997 observations (608 respondents answering 38 items, excluding 107 items with missing data).
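The construction of the stacked person-item data set can be sketched as follows. The record layout (field and item names) is illustrative, not taken from the study’s files:

```python
def stack_person_items(respondents, item_ids):
    """Build the stacked person-item data set described above: one row
    per respondent-item answer, dropping items with missing data.

    respondents: list of dicts like {"id": 1, "scale": "verbal",
    "answers": {item_id: response 1-5, or None if missing}}
    (an illustrative layout, not the study's actual file format).
    """
    rows = []
    for r in respondents:
        for item in item_ids:
            answer = r["answers"].get(item)
            if answer is None:  # exclude items with missing data
                continue
            rows.append({"respondent": r["id"], "scale": r["scale"],
                         "item": item, "response": answer})
    return rows
```

Applied to 608 respondents and 38 items with 107 missing answers, this yields the 22,997 rows analyzed in the first stage.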
Using this data set, I conduct a cross-tabulation of response distributions by scale type (verbal, branched, numeric). When conducting statistical tests, I adjust standard errors for the fact that respondents answered multiple items. The second stage seeks to determine which rating scale produces the most accurate data, using tests of criterion validity similar to Malhotra, Krosnick, and Thomas (2009). This approach estimates associations between a criterion variable and an attitude variable, where attitudes are measured using different types of rating scales. If the association between the criterion and attitude variables is the same regardless of scale design, there is no evidence that one scale is more valid than another. If, instead, the association is stronger when a particular scale is used, there is evidence that that scale produces more valid data. I conduct two tests of validity. In the first test, I estimate the association between the competitive orientation of the business owner (the target attitude) and a dichotomous measure of whether the business made a profit in the 12 months prior to the survey. The impact of an entrepreneur’s psychological orientation on business performance is the subject of a large literature (Lumpkin & Dess, 1996; Swierczek & Ha, 2003). Orientation toward competitors in particular emerged as a key theme in formative research in Addis Ababa. The expectation is that the greater the competitive orientation, the more likely the business is to make a profit. In this study, competitive orientation is an index created from three items (see Appendix). Exploratory factor analysis showed these items loaded on one factor (eigenvalue 1.65). The composite reliability (Raykov, 1997; Schweizer, 2011) of the index is .61 but varies by scale type (verbal = .77; numeric = .61; branched = .51).
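The composite reliability coefficients reported above can be computed from standardized factor loadings; a minimal sketch (the loadings below are hypothetical, not the study’s estimates):

```python
def composite_reliability(loadings):
    """Composite reliability (Raykov, 1997) for a single factor with
    standardized loadings: (sum of loadings)^2 divided by that same
    quantity plus the summed error variances, where each item's error
    variance is 1 - loading^2."""
    s = sum(loadings)
    error = sum(1 - l ** 2 for l in loadings)
    return s ** 2 / (s ** 2 + error)

# Hypothetical loadings for a three-item index (not the study's values):
rho = composite_reliability([0.7, 0.6, 0.5])
```

Unlike Cronbach’s alpha, this formula does not assume equal loadings across items, which is why it is often preferred when loadings differ.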
I estimate a logistic regression of profit on competitive orientation, separately by scale type. Although factors other than competitive orientation influence profit, my interest is in comparing how the association differs by scale type (Malhotra, Krosnick, & Thomas, 2009). Next, I estimate the association between the entrepreneur’s affective trust in his or her main supplier (the target attitude) and the length of time that the entrepreneur has used the supplier (less than 1 year, 1–3 years, more than 3 years; see Appendix for question wording). In the entrepreneurship literature, trust between businesses is widely used to explain deal-making processes and business outcomes (Welter & Smallbone, 2006). Trust also evolves over time (Scarbrough, Swan, Amaeshi, & Briggs, 2013; Smith & Lohrke, 2008), so the more the entrepreneur trusts the supplier, the longer the entrepreneur is likely to keep using the supplier; conversely, the longer the entrepreneur has used the supplier, the more trust will have developed. In this study, trust is an index created from three items that loaded on one factor during exploratory factor analysis (eigenvalue 2.46). Composite reliability was high (.89) and similar across scale types (verbal = .91; numeric = .89; branched = .88). I estimate ordinary least squares (OLS) regressions of trust on length of time and then compare the associations. In the third stage, I explore interviewer and respondent perceptions. The survey asked direct questions about the rating scales to both interviewers and respondents, as well as more general questions about the interview experience. Exact question wording is shown in the Appendix. This stage of the analysis presents cross-tabulations between rating scale design and the perceptions of interviewers and respondents. In the fourth stage, two researchers analyzed qualitative data from the semi-structured, in-depth interviews.
Researchers conducted an inductive content analysis to identify key themes.

Results

Response Distributions by Scale Type

Table 1 shows the distribution of responses to attitude items, by scale type. There is a statistically significant association between scale type and responses to scale questions (p < .01). The largest difference is in the midpoint category, endorsed by 12% in the numeric scale, 6% in the verbal scale, and 3% in the branched scale (p < .01). In contrast, “somewhat agree” and “completely agree” were selected more often in the branched (67%) and verbal (66%) scales than a “4” or “5” in the numeric scale (60%) (p < .01). Supplementary analyses showed that the effects of scale type on univariate distributions were similar regardless of question content or respondent characteristics.

Table 1
Distribution of Responses to Attitude Questions, by Scale Type (Percentages)

Response                   Numeric scale  Verbal scale  Branched scale  Entire sample
(1) Completely disagree         20             20             21             20
(2) Somewhat disagree            8              8              9              8
(3) Neutral                     12              6              3              7
(4) Somewhat agree              13             16             16             15
(5) Completely agree            47             50             51             49

Note: Pooled sample of 22,997 scale questions reported by 608 respondents. In the numeric condition, only the end points (1 and 5) carried verbal labels. All statistical tests are adjusted for the clustering of items within respondents. See text for description of scale types and Appendix for exact question wording.
There are several possible explanations for these patterns. On the one hand, the verbal and branched scales may encourage acquiescence (the tendency to agree regardless of question content; Krosnick, 1991) because of the repeated mention of the word “agree.” Respondents may also seek to avoid endorsing “neutral” in these scales. Indeed, during the qualitative interviews, several participants interpreted “neutral” as not having an opinion, a category they wished to avoid. On the other hand, respondents in the numeric condition may have been drawn to the midpoint “3” category, whether owing to a preference for an average response or because respondents had difficulty answering the question and chose “3” as a satisficing response. The results in Table 1 cannot show which scale type produces the most accurate data; I investigate this issue in the next section.
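The significance tests behind Table 1 follow the usual Pearson chi-square construction; a minimal sketch (note that the paper’s tests additionally adjust for the clustering of items within respondents, which this plain version omits):

```python
def pearson_chi2(table):
    """Pearson chi-square statistic for an r x c table of counts
    (a list of rows): the sum over cells of
    (observed - expected)^2 / expected, with expected counts
    computed from the row and column margins."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

# Toy counts (rows = scale types, columns = response categories);
# these are illustrative, not the study's cell counts:
chi2 = pearson_chi2([[20, 10], [10, 20]])
```

For real person-item data, the statistic would then be compared against a chi-square distribution with (r − 1)(c − 1) degrees of freedom, with standard errors adjusted for clustering as described in the Analysis section.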
Validity

This stage of the analysis investigates the validity of each rating scale design, using two tests of criterion validity. Table 2 shows associations between competitive orientation and profit, separately by scale type. Competitive orientation was associated with profit for the verbal and branched scales, but not for the numeric scale. In other words, the theoretically expected association was observed when attitudes were measured with the verbal and branched scales (p < .01), but not when attitudes were measured with the numeric scale. These results suggest that verbal and branched scales produce more valid data than numeric scales.

Table 2
Validity Test: Logistic Regression of Profit (Yes/No) on Competitive Orientation, Stratified by Scale Type

                             Numeric             Verbal              Branched
Variable                 Odds ratio    z     Odds ratio    z     Odds ratio    z
Competitive orientation    1.36      1.90      1.83**    3.90      1.89**    3.60
Intercept                  2.46*     5.54      2.72**    5.92      2.04**    4.54
N                          188                 196                 198

Note: *p < .05; **p < .01.

Numeric scales also performed poorly in the second test of validity. Table 3 shows associations between supplier trust and number of years with the supplier.
As theoretically expected, trust is greater among entrepreneurs who had worked with their suppliers for longer, but only in the verbal and branched conditions. These associations are statistically significant at p < .05, whereas there is no significant association in the numeric condition.

Table 3
Validity Test: OLS Regression of Trust in Supplier on Length of Time With Supplier, Stratified by Scale Type

                                         Numeric          Verbal           Branched
Variable                                β       t        β       t        β       t
Length of time with main supplier
(reference = less than 1 year)
  1–3 years                            .041    .21      .360    1.83     .446*   2.39
  3 or more years                      .029    .14      .541**  2.94     .295    1.57
Intercept                             −.071   −.44     −.326*  −2.16    −.226   −1.47
n                                      193              186              194

Note: *p < .05; **p < .01.

These results are based on models stratified by scale type. I also pooled the data and included interaction terms between scale type and the criterion variable.
These models showed results similar to the stratified models. For the competitive orientation model, the interaction terms were significant at p = .028; for the supplier trust model, the interaction term for the verbal scale had p = .057. These terms indicate that, in both tests, the association between predictor and criterion is stronger for verbal scales than for numeric scales.

Interviewer and Respondent Perceptions

If verbal and branched scales produce more valid data, do interviewers and respondents also have more favorable perceptions of these scales? The results suggest not. Respondents had similar attitudes about the survey experience, regardless of scale type. Interviewers rated respondents as equally cooperative (χ2(6) = 0.7; p = .99) and interested (χ2(6) = 2.2; p = .92) in the three experimental conditions. When interviewers were asked directly about respondents’ ease of answering scale questions, there were few differences by scale type. Full results appear in Table A1.

Qualitative Interviews

Results from the qualitative research corroborate conclusions from the quantitative analysis. Participants universally preferred verbal or branched scales to numeric scales. Participants reported considerable uncertainty about the meaning of the 2, 3, or 4 options in the numeric scale. One participant (ID 11), for example, mentioned that “the [numeric scale] lacks clearness … the [verbal scale] has words which I can easily pick, and those words are words that express what I feel.” When probed about their interpretation of the middle categories, participants ascribed widely different meanings to those categories, and several offered inconsistent meanings even within the same interview. The apparent low reliability expressed in these qualitative interviews is consistent with the quantitative data showing lower reliability and validity for numeric scales.
In addition, some participants expressed cognitive discomfort with using numbers to represent attitudes. One participant (ID 2) noted, “It takes time to understand what the numbers 2, 3, and 4 mean. It is somewhat difficult. You have to work it out like homework.” Another participant (ID 1) said, “expressing answers in terms of figures is confusing. Hence, it will be good if an explanation is given for each number.” These interviews suggest that respondents in the survey experiment may have selected the midpoint category as a satisficing strategy to cope with the cognitive burden and the ambiguity of the middle categories. When asked to compare verbal and branched scales, participants expressed varied reactions, with a slight preference for verbal scales. One participant (ID 5) said, “The [descriptive scale] gives you more options. As a result, it helps you think more about the response you give.” Another participant (ID 10) mentioned that the verbal scale “enabled me to freely communicate my feelings … the response options were not restricted.” These results suggest that, for some participants at least, descriptive scales may facilitate the response process.

Discussion

This experiment demonstrates that the design of agree/disagree rating scales affects attitude measurement in a sample of urban Ethiopian entrepreneurs. When presented with a numeric scale, respondents were especially likely to select the midpoint value (12%, compared with 6% for verbal and 3% for branched scales) and less likely to select the higher end of the scale. In two tests of validity, numeric scales demonstrated lower validity than verbal and branched scales. There were fewer differences between verbal and branched scales. Although interviewers and respondents had similar perceptions of all three scale types, numeric scales produced significantly worse data quality than verbal and branched scales.
Several limitations of this study suggest the need for additional research on rating scale design. The results cannot be generalized beyond this relatively educated sample of entrepreneurs in Addis Ababa, and they may differ for a general population sample. For example, interview participants expressed frustration with numeric scales because of their ambiguity and perceived difficulty; a less educated, general population sample may be less equipped for these cognitive burdens, potentially making numeric scales even more problematic. Future research is needed to test whether the conclusions of this research also apply to other populations in Ethiopia or in other less developed countries. Relatedly, these findings apply only to a particular topic, attitudes about running a business in Ethiopia. Future research is needed to determine the optimal rating scale design for other topics, particularly sensitive ones. In addition, the reliability of the competitive orientation index is lower than ideal. Although this may be owing to the experimental treatment (i.e., some scale designs produce worse data quality), additional research is needed using scales with high reliability. In sum, numeric scales performed poorly relative to verbal and branched scales in both the survey experiment and the qualitative interviews. Verbal and branched scales performed similarly, though slightly more qualitative interview participants preferred the verbal scale. These findings suggest that numeric scales should be avoided in favor of verbal and branched scales, at least for this population of entrepreneurs in Ethiopia.

Charles Q. Lau, PhD, MS, is a Survey Methodologist in RTI International’s Survey Research Division. Dr. Lau conducts research on survey methodology in low- and middle-income countries, focusing on modes of data collection, interviewer effects, and sampling.

APPENDIX

Question Wording: Profit and Length of Time Using Supplier

Profit.
Respondents were asked, “In the last fiscal year, did your business make money, lose money, or break even?” Those who said “make money” were classified as profitable.

Table A1
Association Between Scale Type and Perceptions of Interview (Percentages)

                                      Verbal  Numeric  Branched  Pearson χ2 test
Respondent perceptions
Did you think the questions were hard to understand?
  Yes                                    5       5        3       χ2(2) = 1.2; p = .56
  No                                    95      95       97
Did you have difficulty coming up with your answers?
  Yes                                    7       5        7       χ2(2) = 1.1; p = .58
  No                                    93      95       93
Did you think the survey was too long?
  Yes                                   13      10       12       χ2(2) = 0.67; p = .72
  No                                    87      90       88
Interviewer perceptions
Cooperativeness
  Very cooperative                      76      77       76       χ2(6) = 0.7; p = .99
  Moderately cooperative                21      21       21
  Somewhat cooperative                   2       1        1
  Not at all cooperative                <1      <1        1
Interest
  Very interested                       73      74       74       χ2(6) = 2.2; p = .92
  Moderately interested                 24      24       22
  Somewhat interested                    2       2        3
  Not at all interested                 <1      <1        1
Ease of answering scale questions
  Very easy                             68      68       67       χ2(6) = 3.0; p = .81
  Somewhat easy                         22      24       24
  Neutral                                8       8        8
  Somewhat or very difficult             1      <1        1

Note: Percentages may not sum to 100% owing to rounding.

Length of Time Using Supplier.

Respondents were asked, “How long have you used this supplier or broker?” Response options were “3 months or less,” “More than 3 months to 6 months,” “More than 6 months to 12 months,” “1 to 3 years,” and “More than 3 years.” These responses were recoded into three categories: less than 1 year; 1–3 years; and more than 3 years.

Question Wording: Attitude Questions

Below, I present the exact wording of the 38 attitude questions included in the analysis.
Competitive Orientation (starred items are used in index)
* “You think about your competitors when developing products or services”
* “You know a lot about your competitors’ products or services”
* “You know how your competitors set their prices”
“You are aware of your competitors’ plans for the future”

Affective Trust in Main Supplier
“If you shared your problems with your supplier or broker, he or she would help you”
“You can freely share your ideas, feelings, and hopes with your supplier or broker”
“You can freely talk to your supplier or broker and you know he or she would want to listen”

Cognitive Trust in Main Supplier
“Your supplier or broker approaches their job with dedication”
“You see no reason to doubt your supplier’s or broker’s competence”
“You can rely on your supplier or broker to make your job easy”

Supplier Market
“If your supplier or broker does not perform as you expect, you can find another one”
“Your suppliers or brokers have many competitors”
“Suppliers or brokers that don't treat their customers well eventually go out of business”
“The law limits the suppliers or brokers you can buy from”
“Using the same supplier or broker many times brings the price of supplies down”

Ease of Working With Main Supplier
“You always receive receipts when you buy supplies”
“If the supplies are defective or bad quality, it is easy to return them”
“It is easy to transport supplies back to your workspace”
“You are satisfied with the quality of supplies”
“You are satisfied with the price of supplies”

Skilled Labor
“It is easy to find workers with the right skills”
“Family or friends are good ways of finding skilled workers”
“You can afford to hire skilled workers”
“Your skilled workers have many job opportunities outside of your business”

Technology
“Updated equipment or technology is too expensive for your business”
“Your workers have the skills to operate updated equipment or technology”
“You know a lot about updated equipment or technology”
“It is easy to repair updated equipment or technology”
“Buying updated equipment or technology would make your business more money”

Entrepreneurial Orientation
“In the past 12 months, you have made few changes to your products or services”
“You often introduce products or services before your competitors”
“In general, you like to take big risks with chances of very high return”

Commitment
“You would be very happy (unhappy) to spend the rest of your career running this business”
“You are willing (not willing) to do whatever it takes to make your business a success”
“You have other choices than running your business (little choice but to run your business)”
“Right now, running your business is (not) a matter of necessity”
“You feel (do not feel) an obligation to remain with your business”
“You would feel (not feel) guilty if you sold your business or stopped running it”

Interviewer Perceptions

Exact question wording for the interviewer perception questions was the following:

How easy or difficult do you think it was for the respondent to answer the scale questions (agree/disagree)? (SELECT ONE.) Very easy / Somewhat easy / Neutral / Somewhat difficult / Very difficult

How interested was the respondent during the interview? (SELECT ONE.) Very interested / Moderately interested / Somewhat interested / Not at all interested

Respondent Perceptions

At the conclusion of the survey, respondents were asked several questions about their experience: Did you think the questions were hard to understand? Did you have difficulty coming up with your answers? Did you think the survey was too long? These questions used yes/no response options (rather than scales) to avoid confounding the results of the rating scale experiment with these perception data.

References

Agans, R. P., Deeb-Sossa, N., & Kalsbeek, W. D. (2006). Mexican immigrants and the use of cognitive assessment techniques in questionnaire development. Hispanic Journal of Behavioral Sciences, 28, 209–230. doi:10.1177/0739986305285826
Alwin, D. F., & Krosnick, J. A. (1991). The reliability of survey attitude measurement. Sociological Methods and Research, 20, 139–181. doi:10.1177/0049124191020001005
Bernal, H., Wooley, S., & Schensul, J. J. (1997). The challenge of using Likert-type scales with low-literate ethnic populations. Nursing Research, 46, 179–181.
Central Statistical Agency [Ethiopia] & ICF International. (2012). Ethiopia demographic and health survey 2011. Addis Ababa, Ethiopia, and Calverton, MD: Central Statistical Agency and ICF International.
Flaskerud, J. H. (1988). Is the Likert scale format culturally biased? Nursing Research, 37, 185–186.
Flaskerud, J. H. (2012). Cultural bias and Likert-type scales revisited. Issues in Mental Health Nursing, 33, 130–132. doi:10.3109/01612840.2011.600510
Gilbert, E. E. (2015). A comparison of branched versus unbranched rating scales for the measurement of attitudes in surveys. Public Opinion Quarterly, 79, 443–470. doi:10.1093/poq/nfu090
Hamamura, T., Heine, S. J., & Paulhus, D. L. (2007). Cultural differences in response styles: The role of dialectical thinking. Personality and Individual Differences, 44, 932–942. doi:10.1016/j.paid.2007.10.034
Heckathorn, D. (1997). Respondent-driven sampling: A new approach to the study of hidden populations. Social Problems, 44, 174–199. doi:10.2307/3096941
Heckathorn, D. (2002). Respondent-driven sampling II: Deriving valid population estimates from chain-referral samples of hidden populations. Social Problems, 49, 11–34. doi:10.1525/sp.2002.49.1.11
Hopkins, D. J., & King, G. (2010). Improving anchoring vignettes: Designing surveys to correct interpersonal incomparability. Public Opinion Quarterly, 74, 201–222. doi:10.1093/poq/nfq011
Johnson, T., Kulesa, P., Cho, Y. I., & Shavitt, S. (2005). The relation between culture and response styles: Evidence from 19 countries. Journal of Cross-Cultural Psychology, 36, 264–277. doi:10.1177/0022022104272905
Kroh, M. (2007). Measuring left-right political orientation: The choice of response format. Public Opinion Quarterly, 71, 204–220. doi:10.1093/poq/nfm009
Krosnick, J. A. (1991). Response strategies for coping with the cognitive demands of attitude measures in surveys. Applied Cognitive Psychology, 5, 213–236. doi:10.1002/acp.2350050305
Krosnick, J. A., & Fabrigar, L. R. (1997). Designing rating scales for effective measurement in surveys. In L. E. Lyberg, P. Biemer, M. Collins, E. de Leeuw, C. Dippo, N. Schwarz, & D. Trewin (Eds.), Survey measurement and process quality (pp. 141–164). Hoboken, NJ: John Wiley & Sons.
Lau, C. Q., & Bobashev, G. V. (2015). Respondent-driven sampling: A new method to sample businesses in Africa. Journal of African Economies, 24, 128–147. doi:10.1093/jae/eju023
Lumpkin, G. T., & Dess, G. G. (1996). Clarifying the entrepreneurial orientation construct and linking it to performance. Academy of Management Review, 21, 135–172. doi:10.5465/AMR.1996.9602161568
Malhotra, N., Krosnick, J. A., & Thomas, R. K. (2009). Optimal design of branching questions to measure bipolar constructs. Public Opinion Quarterly, 73, 304–324. doi:10.1093/poq/nfp023
McCreesh, N., Frost, S., Seeley, J., Katongole, J., Tarsh, M., Ndunguse, R., Jichi, F., Lunel, N., Maher, D., Johnston, L., Sonnenberg, P., Copas, A., Hayes, R., & White, R. (2012). Evaluation of respondent-driven sampling. Epidemiology, 23, 138–147. doi:10.1097/EDE.0b013e31823ac17c
Menold, N., & Kemper, C. J. (2015). The impact of frequency rating scale formats on the measurement of latent variables in web surveys: An experimental investigation using a measure of affectivity as an example. Psihologija, 48, 431–449. doi:10.2298/PSI1504431M
Preston, C. C., & Colman, A. M. (2000). Optimal number of response categories in rating scales: Reliability, validity, discriminating power, and respondent preferences. Acta Psychologica, 104, 1–15. doi:10.1016/S0001-6918(99)00050-5
Raykov, T. (1997). Estimation of composite reliability for congeneric measures. Applied Psychological Measurement, 21, 173–184. doi:10.1177/01466216970212006
Revilla, M., & Ochoa, C. (2015). Quality of different scales in an online survey in Mexico and Colombia. Journal of Politics in Latin America, 7, 157–177.
Revilla, M., Saris, W. E., & Krosnick, J. A. (2014). Choosing the number of categories in agree-disagree scales. Sociological Methods and Research, 43, 73–97. doi:10.1177/0049124113509605
Saris, W. E., Revilla, M., Krosnick, J. A., & Shaeffer, E. M. (2010). Comparing questions with agree/disagree response options to questions with item-specific response options. Survey Research Methods, 4, 61–79. doi:10.18148/srm/2010.v4i1.2682
Scarbrough, H., Swan, J., Amaeshi, K., & Briggs, T. (2013). Exploring the role of trust in the deal-making process for early-stage technology ventures. Entrepreneurship: Theory and Practice, 37, 1203–1228. doi:10.1111/etap.12031
Schwarz, N. (2008). Attitude measurement. In W. D. Crano & R. Prislin (Eds.), Attitudes and attitude change (pp. 41–60). New York, NY: Taylor & Francis.
Schweizer, K. (2011). On the changing role of Cronbach’s α in the evaluation of the quality of a measure. European Journal of Psychological Assessment, 27, 143–144. doi:10.1027/1015-5759/a000069
Smith, D. A., & Lohrke, F. T. (2008). Entrepreneurial network development: Trusting in the process. Journal of Business Research, 61, 315–322. doi:10.1016/j.jbusres.2007.06.018
Smith, T. W., Mohler, P. P., Harkness, J., & Onodera, N. (2005). Methods for assessing and calibrating response scales across countries and languages. Comparative Sociology, 4, 365–415. doi:10.1163/156913305775010106
Swierczek, F. W., & Ha, T. T. (2003). Entrepreneurial orientation, uncertainty avoidance and firm performance. International Journal of Entrepreneurship and Innovation, 4, 46–58. doi:10.5367/000000003101299393
Tobi, H., & Kampen, J. K. (2011). Survey error in an international context: An empirical assessment of cross-cultural differences regarding scale effects. Quality and Quantity, 47, 553–559. doi:10.1007/s11135-011-9476-3
Van Vaerenbergh, Y., & Thomas, T. D. (2012). Response styles in survey research: A literature review of antecedents, consequences, and remedies. International Journal of Public Opinion Research, 25, 195–217. doi:10.1093/ijpor/eds021
Weijters, B., Cabooter, E., & Schillewaert, N. (2010). The effect of rating scale format on response styles: The number of response categories and response category labels. International Journal of Research in Marketing, 27, 236–247. doi:10.1016/j.ijresmar.2010.02.004
Welter, F., & Smallbone, D. (2006). Exploring the role of trust in entrepreneurial activity. Entrepreneurship: Theory and Practice, 30, 465–475. doi:10.1111/j.1540-6520.2006.00130.x

© The Author 2016. Published by Oxford University Press on behalf of The World Association for Public Opinion Research. All rights reserved.
International Journal of Public Opinion Research, Oxford University Press

ISSN 0954-2892; eISSN 1471-6909; doi:10.1093/ijpor/edw031
Many surveys continue to use agree/disagree rating scales in developing countries, despite a lack of evidence about the optimal design for these questions. Three main types of agree/disagree scales are typically used in developing country surveys, each with potential strengths and weaknesses.
The first type is numeric scales, which have verbal labels at the end points only. On the one hand, numeric scales minimize issues with linguistic constructions and translations. Numeric scales also make it easier for respondents to remember the response options (1–5) in a face-to-face survey. On the other hand, it may seem strange to associate an attitude with a number, and respondents may have difficulty distinguishing between the options (Flaskerud, 2012). Second, verbal scales include verbal descriptions for each response option. Verbal scales have the advantage of clarifying the meaning of each response option, and they express the options in words, a more natural approach. However, verbal scales may introduce ambiguity owing to the difficulty of translating concepts such as “somewhat.” Third, branched scales seek to simplify the cognitive process by splitting attitude questions into two parts: the first question assesses the direction of the attitude, and the second assesses attitude strength. Although branched scales reduce complexity, they may take longer to administer and be repetitive (Gilbert, 2015). There is little evidence about which of these scales produces the most valid data. To address this gap in the literature, I designed a split-ballot questionnaire experiment in Ethiopia to shed light on best practices for designing agree/disagree questions in a low-income country. The experiment was embedded in a paper-and-pencil interview (PAPI) of 608 urban Ethiopian entrepreneurs that included attitude questions about their competitive orientation, commitment to running their business, and trust in suppliers. Respondents were randomly assigned to receive one of three rating scales (numeric, verbal, or branched) throughout the interview. I evaluate the impact of the rating scale on three outcomes: (1) univariate distributions of attitude variables; (2) respondent and interviewer perceptions about the survey; and (3) two measures of validity.
Because the survey is based on a sample of urban entrepreneurs, the results of this survey experiment are limited to this specific population. I triangulate the results of the survey experiment with qualitative data from 20 semi-structured, in-depth interviews conducted during the formative stages of this research.

Methods

Data

I embedded a split-ballot experiment in the Kal Addis Business Survey, an in-person survey of entrepreneurs in Addis Ababa, the Ethiopian capital. Conducted in June and July of 2012, the survey was designed to improve knowledge about how small and medium-sized businesses operate in Ethiopia. During the survey design process, local interviewers conducted 20 semi-structured, in-depth interviews with entrepreneurs from Addis Ababa. Interviewers presented participants with verbal, numeric, and branched scales (in random order) and used both scripted and spontaneous probes to understand participants’ reactions to the scales. The audio-recorded interviews were transcribed and translated verbatim into English. Following the qualitative study, trained interviewers administered the PAPI quantitative survey in the Amharic language to a sample of 608 owners or managers of small and medium businesses (between 3 and 99 employees) in Addis Ababa. The average interview time was 29 min (SD = 9 min). The study used respondent-driven sampling (RDS) to generate the sample because a high-quality sampling frame of entrepreneurs does not exist in Ethiopia. RDS is a method of chain-referral sampling that combines a snowball sample with a mathematical model to adjust for biases in network sampling (Heckathorn, 1997, 2002). RDS was designed for sampling “hidden” populations that are difficult to enumerate and lack a sampling frame (e.g., intravenous drug users, men who have sex with men).
Evaluations of RDS suggest that it can approximate probability samples when applied to a highly networked population (McCreesh et al., 2012). Initially, the study recruited a sample of 24 entrepreneurs to participate in the survey. These respondents were then asked to recruit up to three members of their networks to participate, and each subsequent wave of recruits was likewise asked to recruit up to three other individuals. Recruitment proceeded until the target sample size (600) was exceeded. All respondents were provided with a leather wallet as a token of appreciation for completing the survey, and, as an incentive to recruit others, individuals were offered mobile-phone airtime minutes for each individual they recruited to the survey. More details about the sampling methodology and representativeness are published in Lau and Bobashev (2015). Because this article’s focus is the internal validity of the rating scale experiment, I do not apply RDS weights. The sample was 76% male and 24% female. The average age was 31.2 years (SD = 6.9). Fifteen percent of respondents did not complete secondary school, 34% completed secondary school, 31% had vocational or completed college training, and 20% had higher than college training. Although this represents a mix of educational levels, the sample is significantly more educated than the general population of Ethiopia (Central Statistical Agency [Ethiopia] & ICF International, 2012). Most of the respondents were owners of a small or medium business (82%) rather than managers.

Rating Scales

The instrument included 38 agree/disagree questions that measured attitudes toward operating a business (see Appendix for English question wording). At the beginning of the interview, each respondent was randomly assigned to one of three questionnaire versions. Each version used a different agree/disagree rating scale.
All scales used five response categories, and respondents were assigned to the same rating scale for the entire interview. The random assignment was not consistently correlated with any characteristics of respondents and businesses (tables available upon request). The numeric scale presented verbal labels at the end points only: the interviewer asked respondents to rate each item on a scale from 1 to 5, where 1 was “completely disagree” and 5 was “completely agree,” and the possible responses were 1, 2, 3, 4, and 5. The verbal scale had full verbal labels, with response options “Completely disagree,” “Somewhat disagree,” “Neutral,” “Somewhat agree,” and “Completely agree.” The branched scale first assessed direction and then assessed strength. Interviewers first asked (in Amharic), “Do you agree, disagree, or are you neutral?” For respondents who endorsed “agree” or “disagree,” interviewers then asked whether they “completely” or “somewhat” agreed/disagreed. I recoded the two questions from the branched scale into a single variable with five categories.

Analysis

The analysis proceeds in four stages. In the first stage, I explore how rating scale design affects univariate distributions of attitude questions. This stage is based on a stacked “person-item” data set in which each respondent’s answer to a scale question contributes one row of data: the data set has 22,997 observations (608 respondents answering 38 items, excluding 107 items with missing data). Using this data set, I cross-tabulate response distributions by scale type (verbal, branched, numeric) to examine how scale type influences response distributions. When conducting statistical tests, I adjusted standard errors for the fact that respondents answered multiple items. The second stage seeks to determine which rating scale produces the most accurate data, using tests of criterion validity similar to those of Malhotra, Krosnick, and Thomas (2009).
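The branched-scale recoding and the stacked person-item data set described above can be sketched as follows. This is a minimal illustration with hypothetical data; the function and variable names are mine, not the study's.

```python
# Recode the two branched questions into a single five-category variable and
# stack answers into a person-item data set, as described in the text.
# A minimal sketch with hypothetical data; all names are illustrative.

def recode_branched(direction, strength):
    """Map (direction, strength) branched answers onto a 1-5 scale."""
    if direction == "neutral":
        return 3
    if direction == "disagree":
        return 1 if strength == "completely" else 2
    if direction == "agree":
        return 5 if strength == "completely" else 4
    return None  # treat anything else as missing

# Hypothetical raw answers: respondent id -> list of (direction, strength)
raw = {
    "r1": [("agree", "completely"), ("neutral", None)],
    "r2": [("disagree", "somewhat"), ("agree", "somewhat")],
}

# Stacked person-item data set: one row per answered item, so statistical
# tests can later adjust for the clustering of items within respondents
stacked = [
    {"respondent": rid, "item": i, "response": recode_branched(d, s)}
    for rid, answers in raw.items()
    for i, (d, s) in enumerate(answers)
    if recode_branched(d, s) is not None
]

print(stacked[0])  # {'respondent': 'r1', 'item': 0, 'response': 5}
```

Keeping the respondent identifier on every row is what makes the cluster adjustment for repeated items possible downstream.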
This approach involves estimating associations between a criterion variable and an attitude variable, where attitudes are measured using different types of rating scales. If the association between the criterion and attitude variables is the same regardless of scale design, there is no evidence that one scale is more valid than another. On the other hand, if the association is stronger when a particular scale is used, there is evidence that that scale produces more valid data. I conduct two tests of validity. In the first test, I estimate the association between the competitive orientation of the business owner (the target attitude) and a dichotomous measure of whether the business made a profit in the 12 months prior to the survey. The impact of an entrepreneur’s psychological orientation on business performance is the subject of a large literature (Lumpkin & Dess, 1996; Swierczek & Ha, 2003). Orientation toward competitors in particular emerged as a key theme in formative research in Addis Ababa. The greater the competitive orientation, the more likely a business is to make a profit. In this study, competitive orientation is an index created from three items (see Appendix). Exploratory factor analysis showed these items loaded on one factor (eigenvalue 1.65). The composite reliability (Raykov, 1997; Schweizer, 2011) of the index is .61 but varies by scale type (verbal = .77; numeric = .61; branched = .51). I estimate a logistic regression of profit on competitive orientation, separately by scale type. Although factors other than competitive orientation influence profit, my interest is in comparing how the association differs by scale type (Malhotra, Krosnick, & Thomas, 2009). Next, I estimate the association between the entrepreneur’s affective trust in his or her main supplier (the target attitude) and the length of time that the entrepreneur has used the supplier (less than 1 year, 1–3 years, more than 3 years; see Appendix for question wording).
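The logic of this stratified criterion-validity test (a stronger attitude-criterion association indicating a more valid scale) can be sketched with simulated data. The Newton-Raphson fitter below is a minimal stand-in for standard logistic-regression software; none of this is the study's code, and the simulated slopes are invented.

```python
import numpy as np

def fit_logit(X, y, iters=25):
    """Minimal Newton-Raphson logistic regression; X includes an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))        # predicted probabilities
        W = p * (1.0 - p)                          # IRLS weights
        H = X.T @ (X * W[:, None])                 # Hessian of the log-likelihood
        beta = beta + np.linalg.solve(H, X.T @ (y - p))
    return beta

rng = np.random.default_rng(0)

# Simulate two strata: the attitude index predicts profit strongly in one
# condition and weakly in the other (mimicking a verbal/numeric contrast).
odds_ratios = {}
for scale, slope in [("verbal", 0.6), ("numeric", 0.1)]:
    attitude = rng.normal(size=1000)               # competitive-orientation index
    p_profit = 1.0 / (1.0 + np.exp(-(0.5 + slope * attitude)))
    profit = rng.binomial(1, p_profit)             # 1 = made a profit
    X = np.column_stack([np.ones(1000), attitude])
    beta = fit_logit(X, profit)
    odds_ratios[scale] = float(np.exp(beta[1]))    # odds ratio for the attitude

print(odds_ratios)  # larger odds ratio in the "verbal" stratum
```

Under this logic, the scale whose stratum shows the stronger attitude-criterion association (a larger odds ratio and z statistic) is taken to yield the more valid measurements.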
In the entrepreneurship literature, trust between businesses is widely used to explain deal-making processes and business outcomes (Welter & Smallbone, 2006). Trust also evolves over time (Scarbrough, Swan, Amaeshi, & Briggs, 2013; Smith & Lohrke, 2008), so the more the entrepreneur trusts the supplier, the longer the entrepreneur is likely to use the supplier. Conversely, the longer the entrepreneur has used the supplier, the more trust will have developed. In this study, trust is an index created from three items that loaded on one factor during exploratory factor analysis (eigenvalue 2.46). Composite reliability was high (.89) and similar across scale types (verbal = .91; numeric = .89; branched = .88). I estimate ordinary least squares (OLS) regressions of trust on length of time and then compare the associations. In the third stage, I explore interviewer and respondent perceptions. The survey asked direct questions about the rating scales to both interviewers and respondents, as well as more general questions about the interview experience. Exact question wording is shown in the Appendix. This stage of the analysis presents cross-tabulations between rating scale design and the perceptions of interviewers and respondents. In the fourth stage of the analysis, two researchers analyzed qualitative data from the semi-structured, in-depth interviews, conducting an inductive content analysis to identify key themes.

Results

Response Distributions by Scale Type

Table 1 shows the distribution of responses to attitude items, by scale type. There is a statistically significant association between scale type and responses to scale questions (p < .01). The largest difference is in the midpoint category, endorsed by 12% in the numeric scale, 6% in the verbal scale, and 3% in the branched scale (p < .01).
In contrast, “somewhat agree” and “completely agree” were selected more often in the branched (67%) and verbal (66%) scales than a “4” or “5” in the numeric scale (60%) (p < .01). Supplementary analyses showed that the effects of scale type on univariate distributions were similar regardless of question content or respondent characteristics.

Table 1
Distribution of Responses to Attitude Questions, by Scale Type (Percentages)

Response                   Numeric scale  Verbal scale  Branched scale  Entire sample
(1) Completely disagree         20             20             21             20
(2) Somewhat disagree            8              8              9              8
(3) Neutral                     12              6              3              7
(4) Somewhat agree              13             16             16             15
(5) Completely agree            47             50             51             49

Note: Pooled sample of 22,997 scale questions reported by 608 respondents. In the numeric condition, only the end points (1 and 5) carried verbal labels. All statistical tests are adjusted for the clustering of items within respondents. See text for description of scale types and Appendix for exact question wording.

There are several possible explanations for these patterns. On the one hand, the verbal and branched scales may encourage acquiescence (the tendency to agree regardless of question content; Krosnick, 1991) because of the repeated mention of the word “agree.” Respondents may also seek to avoid endorsing “neutral” in these scales. Indeed, during the qualitative interviews, several participants interpreted “neutral” as not having an opinion, a category that they wished to avoid. On the other hand, respondents in the numeric scale condition may have been drawn to the midpoint “3” category. This could be owing to a preference for an average response, or because respondents had difficulty answering the question and chose “3” as a satisficing response. The results in Table 1 cannot show which scale type produces the most accurate data; I investigate this issue in the next section.
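As an illustration of the Pearson chi-square test used to compare response distributions across scale types, the sketch below computes the statistic for a scale-type-by-response cross-tabulation. The counts are invented for illustration and are not the study's data; note also that the study adjusts its tests for the clustering of items within respondents, which this simple computation does not.

```python
# Pearson chi-square for a response-by-scale-type cross-tabulation.
# Rows: five scale points; columns: numeric, verbal, branched conditions.
# Counts are illustrative only, not the study's data.
counts = [
    [160, 158, 165],  # completely disagree
    [64, 63, 71],     # somewhat disagree
    [96, 47, 24],     # neutral / midpoint
    [104, 127, 126],  # somewhat agree
    [376, 395, 401],  # completely agree
]

row_tot = [sum(row) for row in counts]
col_tot = [sum(col) for col in zip(*counts)]
n = sum(row_tot)

# chi2 = sum over cells of (observed - expected)^2 / expected,
# where expected = row total * column total / grand total
chi2 = sum(
    (counts[i][j] - row_tot[i] * col_tot[j] / n) ** 2
    / (row_tot[i] * col_tot[j] / n)
    for i in range(len(counts))
    for j in range(len(counts[0]))
)
dof = (len(counts) - 1) * (len(counts[0]) - 1)
print(f"chi2({dof}) = {chi2:.1f}")
```

In practice one would refer the statistic to a chi-square distribution with `dof` degrees of freedom (e.g., `scipy.stats.chi2_contingency` performs both steps) and, as in the study, use a cluster-adjusted variant when respondents contribute multiple items.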
Validity

This stage of the analysis investigates the validity of each rating scale design, using two tests of criterion validity. Table 2 shows associations between competitive orientation and profit, separately by scale type. Competitive orientation was associated with profit for the verbal and branched scales, but not for the numeric scale. In other words, the theoretically expected association was observed when attitudes were measured with the verbal and branched scales (p < .01), but not when they were measured with the numeric scale. These results suggest that verbal and branched scales produce more valid data than numeric scales.

Table 2
Validity Test: Logistic Regression of Profit (Yes/No) on Competitive Orientation, Stratified by Scale Type

                              Numeric             Verbal              Branched
Variable                 Odds ratio    z     Odds ratio    z     Odds ratio    z
Competitive orientation     1.36     1.90      1.83**    3.90      1.89**    3.60
Intercept                   2.46*    5.54      2.72**    5.92      2.04**    4.54
N                            188                196                 198

Note: *p < .05; **p < .01.

Numeric scales also performed poorly in the second test of validity. Table 3 shows associations between supplier trust and the number of years respondents had worked with their supplier.
As theoretically expected, trust was greater among entrepreneurs who had worked with their suppliers for longer periods of time, but only in the verbal and branched conditions. These associations are statistically significant at p < .05; there is no significant association in the numeric condition.

Table 3
Validity Test: OLS Regression of Trust in Supplier on Length of Time With Supplier, Stratified by Scale Type

                                        Numeric           Verbal           Branched
Variable                               β       t        β       t        β       t
Length of time with main supplier
(reference = less than 1 year)
  1–3 years                          .041     .21     .360    1.83     .446*   2.39
  3 or more years                    .029     .14     .541**  2.94     .295    1.57
Intercept                           −.071    −.44    −.326*  −2.16    −.226   −1.47
n                                     193              186              194

Note: *p < .05; **p < .01.

These results are based on models stratified by scale type. I also pooled the data and included interaction terms between scale type and the criterion variable.
These pooled models showed results similar to the stratified models. For the competitive orientation model, the interaction terms were significant (p = .028). For the supplier trust model, the interaction term for the verbal scale had p = .057. For both tests, then, the association between predictor and criterion is stronger for the verbal scale than for the numeric scale.

Interviewer and Respondent Perceptions

If verbal and branched scales produce more valid data, do interviewers and respondents also have more favorable perceptions of these scales? The results suggest not. Respondents reported similar attitudes about the survey experience regardless of scale type. Interviewers rated respondents as equally cooperative (χ2(6) = 0.7; p = .99) and interested (χ2(6) = 2.2; p = .92) across the three experimental conditions. When interviewers were asked directly about respondents’ ease of answering the scale questions, there were few differences by scale type. See full results in Table A1.

Qualitative Interviews

Results from the qualitative research corroborate the conclusions of the quantitative analysis. Participants universally preferred the verbal or branched scales to the numeric scale. Participants reported considerable uncertainty about the meaning of the 2, 3, and 4 options in the numeric scale. One participant (ID 11), for example, said, “the [numeric scale] lacks clearness … the [verbal scale] has words which I can easily pick, and those words are words that express what I feel.” When probed about their interpretation of the middle categories, participants ascribed widely different meanings to them, and several participants offered inconsistent meanings even within the same interview. The low reliability apparent in these qualitative interviews is consistent with the quantitative data showing lower reliability and validity for numeric scales.
In addition, some participants expressed cognitive discomfort with using numbers to represent attitudes. One participant (ID 2) noted, “It takes time to understand what the numbers 2, 3, and 4 mean. It is somewhat difficult. You have to work it out like homework.” Another participant (ID 1) said, “expressing answers in terms of figures is confusing. Hence, it will be good if an explanation is given for each number.” These interviews suggest that respondents in the survey experiment may have selected the midpoint category as a satisficing strategy to cope with the cognitive burden and the ambiguity of the middle categories. When asked to compare the verbal and branched scales, participants expressed varied reactions, with a slight preference for the verbal scale. One participant (ID 5) said, “The [descriptive scale] gives you more options. As a result, it helps you think more about the response you give.” Another participant (ID 10) mentioned that the verbal scale “enabled me to freely communicate my feelings … the response options were not restricted.” These results suggest that, for some participants at least, descriptive scales may facilitate the response process.

Discussion

This experiment demonstrates that the design of agree/disagree rating scales affects attitude measurement in a sample of Ethiopian urban entrepreneurs. When presented with a numeric scale, respondents were especially likely to select the midpoint value (12%, compared with 6% for verbal and 3% for branched scales) and less likely to select the higher end of the scale. In two tests of criterion validity, numeric scales demonstrated lower validity than verbal and branched scales. There were fewer differences between verbal and branched scales. Although interviewers and respondents perceived all three scale types similarly, numeric scales produced significantly worse data quality than verbal and branched scales.
Several limitations of this study suggest the need for additional research on rating scale design. The results cannot be generalized beyond this relatively educated sample of entrepreneurs in Addis Ababa, and they may differ for a general population sample. For example, interview participants expressed frustration with numeric scales because of their ambiguity and perceived difficulty; a less educated, general population sample may be less equipped for these cognitive burdens, potentially making numeric scales even more problematic. Future research is needed to test whether these conclusions also apply to other populations in Ethiopia or in other less developed countries. Relatedly, these findings apply only to a particular topic: attitudes about running a business in Ethiopia. Future research is needed to determine the optimal rating scale design for other topics, particularly sensitive ones. In addition, the reliability of the competitive orientation index is lower than ideal. Although this may be owing to the experimental treatment (i.e., some scale designs produce worse data quality), additional research is needed using scales with high reliability.

In sum, numeric scales performed poorly relative to verbal and branched scales in both the survey experiment and the qualitative interviews. Verbal and branched scales performed similarly, though slightly more qualitative interview participants preferred the verbal scale. These findings suggest that numeric scales should be avoided in favor of verbal and branched scales, at least for this population of entrepreneurs in Ethiopia.

Charles Q. Lau, PhD, MS, is a Survey Methodologist in RTI International’s Survey Research Division. Dr. Lau conducts research on survey methodology in low- and middle-income countries, focusing on modes of data collection, interviewer effects, and sampling.

APPENDIX

Question Wording: Profit and Length of Time Using Supplier

Profit.
Respondents were asked, “In the last fiscal year, did your business make money, lose money, or break even?” Those who said “make money” were classified as profitable.

Table A1
Association Between Scale Type and Perceptions of Interview (Percentages)

                                          Verbal   Numeric   Branched   Pearson χ2 test
                                          labels   labels
Respondent perceptions
Did you think the questions were hard to understand?
  Yes                                        5        5         3       χ2(2) = 1.2; p = .56
  No                                        95       95        97
Did you have difficulty coming up with your answers?
  Yes                                        7        5         7       χ2(2) = 1.1; p = .58
  No                                        93       95        93
Did you think the survey was too long?
  Yes                                       13       10        12       χ2(2) = 0.67; p = .72
  No                                        87       90        88
Interviewer perceptions
Cooperativeness
  Very cooperative                          76       77        76       χ2(6) = 0.7; p = .99
  Moderately cooperative                    21       21        21
  Somewhat cooperative                       2        1         1
  Not at all cooperative                    <1       <1         1
Interest
  Very interested                           73       74        74       χ2(6) = 2.2; p = .92
  Moderately interested                     24       24        22
  Somewhat or not at all interested          2        2         3
  Not at all interested                     <1       <1         1
Ease of answering scale questions
  Very easy                                 68       68        67       χ2(6) = 3.0; p = .81
  Somewhat easy                             22       24        24
  Neutral                                    8        8         8
  Somewhat or very difficult                 1       <1         1

Note: Percentages may not sum to 100% owing to rounding.

Length of Time Using Supplier. Respondents were asked, “How long have you used this supplier or broker?” Response options were “3 months or less,” “More than 3 months to 6 months,” “More than 6 months to 12 months,” “1 to 3 years,” and “More than 3 years.” These responses were recoded into the following categories: less than 1 year, 1–3 years, and more than 3 years.

Question Wording: Attitude Questions

Below, I present the exact wording of the 38 attitude questions included in the analysis.
Competitive Orientation (starred items are used in index)
* “You think about your competitors when developing products or services”
* “You know a lot about your competitors’ products or services”
* “You know how your competitors set their prices”
“You are aware of your competitors’ plans for the future”

Affective Trust in Main Supplier
“If you shared your problems with your supplier or broker, he or she would help you”
“You can freely share your ideas, feelings, and hopes with your supplier or broker”
“You can freely talk to your supplier or broker and you know he or she would want to listen”

Cognitive Trust in Main Supplier
“Your supplier or broker approaches their job with dedication”
“You see no reason to doubt your supplier’s or broker’s competence”
“You can rely on your supplier or broker to make your job easy”

Supplier Market
“If your supplier or broker does not perform as you expect, you can find another one”
“Your suppliers or brokers have many competitors”
“Suppliers or brokers that don’t treat their customers well eventually go out of business”
“The law limits the suppliers or brokers you can buy from”
“Using the same supplier or broker many times brings the price of supplies down”

Ease of Working With Main Supplier
“You always receive receipts when you buy supplies”
“If the supplies are defective or bad quality, it is easy to return them”
“It is easy to transport supplies back to your workspace”
“You are satisfied with the quality of supplies”
“You are satisfied with the price of supplies”

Skilled Labor
“It is easy to find workers with the right skills”
“Family or friends are good ways of finding skilled workers”
“You can afford to hire skilled workers”
“Your skilled workers have many job opportunities outside of your business”

Technology
“Updated equipment or technology is too expensive for your business”
“Your workers have the skills to operate updated equipment or technology”
“You know a lot about updated equipment or technology”
“It is easy to repair updated equipment or technology”
“Buying updated equipment or technology would make your business more money”

Entrepreneurial Orientation
“In the past 12 months, you have made few changes to your products or services”
“You often introduce products or services before your competitors”
“In general, you like to take big risks with chances of very high return”

Commitment
“You would be very happy (unhappy) to spend the rest of your career running this business”
“You are willing (not willing) to do whatever it takes to make your business a success”
“You have other choices than running your business (little choice but to run your business)”
“Right now, running your business is (not) a matter of necessity”
“You feel (do not feel) an obligation to remain with your business”
“You would feel (not feel) guilty if you sold your business or stopped running it”

Interviewer Perceptions

Exact question wording for the interviewer perception questions was as follows:

How easy or difficult do you think it was for the respondent to answer the scale questions (agree/disagree)? (SELECT ONE.)
Very easy / Somewhat easy / Neutral / Somewhat difficult / Very difficult

How interested was the respondent during the interview? (SELECT ONE.)
Very interested / Moderately interested / Somewhat interested / Not at all interested

Respondent Perceptions

At the conclusion of the survey, respondents were asked several questions about their experience: Did you think the questions were hard to understand? Did you have difficulty coming up with your answers? Did you think the survey was too long? These questions used yes/no response options (rather than scales) to avoid confounding the results of the rating scale experiment with these perception data.

References

Agans, R. P., Deeb-Sossa, N., & Kalsbeek, W. D. (2006). Mexican immigrants and the use of cognitive assessment techniques in questionnaire development. Hispanic Journal of Behavioral Sciences, 28, 209–230. doi:10.1177/0739986305285826

Alwin, D. F., & Krosnick, J. A. (1991). The reliability of survey attitude measurement. Sociological Methods and Research, 20, 139–181. doi:10.1177/0049124191020001005

Bernal, H., Wooley, S., & Schensul, J. J. (1997). The challenge of using Likert-type scales with low-literate ethnic populations. Nursing Research, 46, 179–181.

Central Statistical Agency [Ethiopia] & ICF International. (2012). Ethiopia demographic and health survey 2011. Addis Ababa, Ethiopia, and Calverton, MD: Central Statistical Agency and ICF International.

Flaskerud, J. H. (1988). Is the Likert scale format culturally biased? Nursing Research, 37, 185–186.

Flaskerud, J. H. (2012). Cultural bias and Likert-type scales revisited. Issues in Mental Health Nursing, 33, 130–132. doi:10.3109/01612840.2011.600510

Gilbert, E. E. (2015). A comparison of branched versus unbranched rating scales for the measurement of attitudes in surveys. Public Opinion Quarterly, 79, 443–470. doi:10.1093/poq/nfu090

Hamamura, T., Heine, S. J., & Paulhus, D. L. (2007). Cultural differences in response styles: The role of dialectical thinking. Personality and Individual Differences, 44, 932–942. doi:10.1016/j.paid.2007.10.034

Heckathorn, D. (1997). Respondent-driven sampling: A new approach to the study of hidden populations. Social Problems, 44, 174–199. doi:10.2307/3096941

Heckathorn, D. (2002). Respondent-driven sampling II: Deriving valid population estimates from chain-referral samples of hidden populations. Social Problems, 49, 11–34. doi:10.1525/sp.2002.49.1.11

Hopkins, D. J., & King, G. (2010). Improving anchoring vignettes: Designing surveys to correct interpersonal incomparability. Public Opinion Quarterly, 74, 201–222. doi:10.1093/poq/nfq011

Johnson, T., Kulsea, P., Cho, Y. I., & Shavitt, S. (2005). The relation between culture and response styles: Evidence from 19 countries. Journal of Cross-Cultural Psychology, 36, 264–277. doi:10.1177/0022022104272905

Kroh, M. (2007). Measuring left-right political orientation: The choice of response format. Public Opinion Quarterly, 71, 204–220. doi:10.1093/poq/nfm009

Krosnick, J. A. (1991). Response strategies for coping with the cognitive demands of attitude measures in surveys. Applied Cognitive Psychology, 5, 213–236. doi:10.1002/acp.2350050305

Krosnick, J. A., & Fabrigar, L. R. (1997). Designing rating scales for effective measurement in surveys. In L. E. Lyberg, P. Biemer, M. Collins, E. de Leeuw, C. Dippo, N. Schwarz, & D. Trewin (Eds.), Survey measurement and process quality (pp. 141–164). Hoboken, NJ: John Wiley & Sons.

Lau, C. Q., & Bobashev, G. V. (2015). Respondent-driven sampling: A new method to sample businesses in Africa. Journal of African Economies, 24, 128–147. doi:10.1093/jae/eju023

Lumpkin, G. T., & Dess, G. G. (1996). Clarifying the entrepreneurial orientation construct and linking it to performance. Academy of Management Review, 21, 135–172. doi:10.5465/AMR.1996.9602161568

Malhotra, N., Krosnick, J. A., & Thomas, R. K. (2009). Optimal design of branching questions to measure bipolar constructs. Public Opinion Quarterly, 73, 304–324. doi:10.1093/poq/nfp023

McCreesh, N., Frost, S., Seeley, J., Katongole, J., Tarsh, M., Ndunguse, R., Jichi, F., Lunel, N., Maher, D., Johnston, L., Sonnenberg, P., Copas, A., Hayes, R., & White, R. (2012). Evaluation of respondent-driven sampling. Epidemiology, 23, 138–147. doi:10.1097/EDE.0b013e31823ac17c

Menold, N., & Kemper, C. J. (2015). The impact of frequency rating scale formats on the measurement of latent variables in web surveys: An experimental investigation using a measure of affectivity as an example. Psihologija, 48, 431–449. doi:10.2298/PSI1504431M

Preston, C. C., & Colman, A. M. (2000). Optimal number of response categories in rating scales: Reliability, validity, discriminating power, and respondent preferences. Acta Psychologica, 104, 1–15. doi:10.1016/S0001-6918(99)00050-5

Raykov, T. (1997). Estimation of composite reliability for congeneric measures. Applied Psychological Measurement, 21, 173–184. doi:10.1177/01466216970212006

Revilla, M., & Ochoa, C. (2015). Quality of different scales in an online survey in Mexico and Colombia. Journal of Politics in Latin America, 7, 157–177.

Revilla, M., Saris, W. E., & Krosnick, J. A. (2014). Choosing the number of categories in agree-disagree scales. Sociological Methods and Research, 43, 73–97. doi:10.1177/0049124113509605

Saris, W. E., Revilla, M., Krosnick, J. A., & Shaeffer, E. M. (2010). Comparing questions with agree/disagree response options to questions with item-specific response options. Survey Research Methods, 4, 61–79. doi:10.18148/srm/2010.v4i1.2682

Scarbrough, H., Swan, J., Amaeshi, K., & Briggs, T. (2013). Exploring the role of trust in the deal-making process for early-stage technology ventures. Entrepreneurship: Theory and Practice, 37, 1203–1228. doi:10.1111/etap.12031

Schwarz, N. (2008). Attitude measurement. In W. D. Crano & R. Prislin (Eds.), Attitudes and attitude change (pp. 41–60). New York, NY: Taylor & Francis Group.

Schweizer, K. (2011). On the changing role of Cronbach’s α in the evaluation of the quality of a measure. European Journal of Psychological Assessment, 27, 143–144. doi:10.1027/1015-5759/a000069

Smith, D. A., & Lohrke, F. T. (2008). Entrepreneurial network development: Trusting in the process. Journal of Business Research, 61, 315–322. doi:10.1016/j.jbusres.2007.06.018

Smith, T. W., Mohler, P. P., Harkness, J., & Onodera, N. (2005). Methods for assessing and calibrating response scales across countries and languages. Comparative Sociology, 4, 365–415. doi:10.1163/156913305775010106

Swierczek, F. W., & Ha, T. T. (2003). Entrepreneurial orientation, uncertainty avoidance and firm performance. International Journal of Entrepreneurship and Innovation, 4, 46–58. doi:10.5367/000000003101299393

Tobi, H., & Kampen, J. K. (2011). Survey error in an international context: An empirical assessment of cross-cultural differences regarding scale effects. Quality and Quantity, 47, 553–559. doi:10.1007/s11135-011-9476-3

Van Vaerenbergh, Y., & Thomas, T. D. (2012). Response styles in survey research: A literature review of antecedents, consequences, and remedies. International Journal of Public Opinion Research, 25, 195–217. doi:10.1093/ijpor/eds021

Weijters, B., Cabooter, E., & Schillewaert, N. (2010). The effect of rating scale format on response styles: The number of response categories and response category labels. International Journal of Research in Marketing, 27, 236–247. doi:10.1016/j.ijresmar.2010.02.004

Welter, F., & Smallbone, D. (2006). Exploring the role of trust in entrepreneurial activity. Entrepreneurship: Theory and Practice, 30, 465–475. doi:10.1111/j.1540-6520.2006.00130.x

© The Author 2016. Published by Oxford University Press on behalf of The World Association for Public Opinion Research. All rights reserved.

Journal: International Journal of Public Opinion Research, Oxford University Press. Published: December 20, 2016.
