The Impact of Interviewer Effects on Regression Coefficients

Abstract

This article examines the influence of interviewers on the estimation of regression coefficients from survey data. First, we present theoretical considerations with a focus on measurement errors and nonresponse errors due to interviewers. Then, we show via simulation which of several nonresponse and measurement error scenarios has the biggest impact on the estimate of a slope parameter from a simple linear regression model. When response propensity depends on the dependent variable in a linear regression model, bias in the estimated slope parameter is introduced. We find no evidence that interviewer effects on the response propensity have a large impact on the estimated regression parameters. We do find, however, that interviewer effects on the predictor variable of interest explain a large portion of the bias in the estimated regression parameter. Simulation studies suggest that standard measurement error adjustments using the reliability ratio (i.e., the ratio of the measurement-error-free variance to the observed variance with measurement error) can correct most of the bias introduced by these interviewer effects in a variety of complex settings, suggesting that more routine adjustment for such effects should be considered in regression analysis using survey data.

1. INTRODUCTION

Regression analysis is an omnipresent statistical method for examining the relationships of two or more variables in a given population across many areas of quantitative research. Regression analysis conditions on predictor variables; if these variables are measured with error, the estimated regression coefficients will be biased. Similarly, bias can arise from nonresponse, depending on the nature of the missing-data mechanism. If a finite population is the target of inference, failure of the sampling frame to cover the population can induce bias as well. This study focuses on measurement and nonresponse error in interviewer-administered surveys. In such surveys, interviewers can contribute to both nonresponse and measurement error: each interviewer might introduce a specific amount of error on each measured variable, leading to data being more similar when they are collected by the same interviewer. In this study, we investigate the impact of interviewers and their interviewer effects on estimated coefficients in a simple linear regression model.

A well-known source of systematic error (bias) in estimated regression coefficients is the presence of measurement errors in the values of independent variables. Fitting regression models to survey data collected by human interviewers is a more complex situation, because both dependent and independent variables can contain two different types of errors. The first is nonresponse error, where the units of analysis that respond to particular questions have different characteristics on the variables of interest compared with the nonresponding units; the second is measurement error, where the measurements of the responding units contain errors and thus differ from their true values. The existing literature in this area has targeted the influence of these two interviewer-related errors on the estimation of descriptive parameters (means, proportions, etc.) for single survey variables. Scientific articles examining the effects of interviewers on the relationships of two or more variables are rare.
Davis and Scott (1995) concentrated on the joint effects of interviewers and sampling clusters in survey data on estimated domain comparisons. The authors found evidence that interviewer effects on survey variables can vary between domains, and that the variance of estimated comparisons of a specific variable between these domains (e.g., the difference in means for a given variable between women and men) can therefore be enlarged due to interviewers. In addition to these empirical results, they also presented theoretical work about joint cluster and interviewer effects on the precision of domain comparisons. A second paper (Beullens and Loosveldt 2014) is concerned with interviewer effects on covariances of survey variables and their impact on factor analysis. Essentially, these authors found that there are interviewer effects on covariances which slightly lower the magnitude of the factor loadings and possibly enlarge their variances. However, the authors used real data from the European Social Survey (i.e., only data from respondents) and considered neither possible nonresponse error introduced by interviewers nor the absence of an interpenetrated design. The first limitation might be problematic because the interviewer effects literature presents several examples of interviewers varying in terms of their response distributions and response rates (West and Blom 2017), motivating additional study of what is causing these phenomena. Recent work (West and Olson 2010; West, Kreuter, and Jaenichen 2013) has also suggested that interviewer variance may arise from different interviewers recruiting different types of respondents. The second limitation can lead to confounding of interviewer effects with sampling area effects. In a study with an interpenetrated design, Schnell and Kreuter (2005) found that the effects of interviewers on the variances of descriptive survey estimates are stronger than geographical cluster effects.

To take the absence of an interpenetrated study design into account (i.e., to separate interviewer effects from area effects), two papers (Wiggins, Longford, and O'Muircheartaigh 1992; Hox 1994) used hierarchical regression models to control for interviewer effects on both intercepts and slopes, assuming that interviewers are drawn from a population of potential interviewers, and compared several strategies for modeling these as random effects. Analyzing two case studies, Wiggins et al. (1992) found that inferences related to regression coefficients can change when interviewer effects are included in regression models. In Hox (1994), the modeling approach includes random intercepts and slopes for interviewers, in addition to interviewer-specific and subject-specific covariates, resulting in a multilevel mixed effects model. Both articles address the lack of an interpenetrated design by including possible area-specific confounding variables in their models; however, they ignore possible nonresponse error caused by interviewers, similar to the work of Davis and Scott (1995) and Beullens and Loosveldt (2014). Collectively, these studies show that interviewer effects can influence multivariate analysis, but the possible nonresponse error introduced by interviewers cannot be estimated from the data used and is therefore assumed to be negligible. Thus, there is a need to investigate the impact of both nonresponse errors and measurement errors caused by interviewers on the estimated relationship of two variables.
One more recent paper is also concerned with interviewer effects on the estimation of relationships between multiple variables: Beullens and Loosveldt (2016) compare interviewer effects on regression analysis between European countries based on data from the European Social Survey. However, this work focuses mainly on the measurement error side of interviewer effects and does not consider the effects of interviewers on response propensity. In this paper, we examine the impacts of both sources of errors due to interviewers by simulating many different real-world scenarios in which the dependent and independent variables in linear regression models contain interviewer-specific measurement errors and nonresponse errors.

In the following sections, we give a short introduction to linear regression models and present our theoretical expectations regarding interviewer effects on regression coefficients. The next section provides a comprehensive description of the simulated scenarios. The relevant results, focusing on the bias of the estimated slope parameter, are presented afterwards. We then apply these results to real data from a national face-to-face survey in Germany with validation data available on the sampling frame. Finally, we discuss our findings in the context of further research opportunities. We assume that our independent variables are continuous and approximately normally distributed; we discuss relaxing this assumption in the discussion section.

2. THEORETICAL RESULTS

2.1 Linear Model

In this paper, we focus on a simple linear regression model defining the relationship between the continuous variables y and x:

y_j = \beta_0 + \beta_1 x_j + \varepsilon_j, \quad j = 1, \ldots, n.   (1)

We make the standard linear regression assumptions: for the residuals \varepsilon_j we assume E(\varepsilon_j | x_j) = 0, and all \varepsilon_j are stochastically independent and normally distributed with a constant variance \sigma^2 for all j. The intercept parameter \beta_0 and the slope parameter \beta_1 remain constant for all given pairs of values (x_j, y_j), and x_j is observed without error. A model-unbiased estimator for \beta_1 is given by

\hat{\beta}_1 = \frac{\frac{1}{n}\sum_{j=1}^{n}(x_j - \bar{x})(y_j - \bar{y})}{\frac{1}{n}\sum_{j=1}^{n}(x_j - \bar{x})^2} = \frac{\hat{\sigma}_{x,y}}{\hat{\sigma}_x^2}.   (2)

This and further theory about linear regression models can be found in Fahrmeir, Kneib, Lang, and Marx (2013), chapters one through four.

2.2 Measurement Error Model

Despite the best efforts of survey organizations to standardize the training of interviewers, it has long been known that interviewers can differentially affect the responses to single survey variables (Fowler and Mangione 1990). This may be due to verbal or nonverbal signals sent by interviewers or to personal characteristics of the interviewer that suggest interviewer preferences or expectations (West and Blom 2017). In a statistical framework, these interviewer effects are often viewed as a form of measurement error, so that, for a variable with an underlying true value of x_ij for the jth respondent interviewed by the ith interviewer, the actual reported value is x_ij^*, where

x_{ij}^* = x_{ij} + u_{xi}, \quad x_{ij} \sim N(\mu_x, \sigma_x^2), \quad u_{xi} \sim N(0, \tau_x^2), \quad j = 1, \ldots, n_i, \quad i = 1, \ldots, n_{int}.   (3)

Here u_xi is a constant interviewer effect across all individuals interviewed by interviewer i or, equivalently and perhaps more realistically, an average effect across the respondents associated with this interviewer. (For ease of exposition, we typically assume a common number of interviews n_i across all i = 1, \ldots, n_{int} interviewers, although that is not required.)
It is considered to be a random variable in a formal superpopulation sense, in that an infinite number of possible interviewers could have been drawn to be associated with the population and eventually the observed sample. Or, somewhat more informally, we assume that repeated sampling of the population does not condition on the set of interviewers used in the observed sample, but rather treats them as drawn from a pool of potential interviewers that is very large relative to those in the sample. Under (3), the intra-interviewer correlation associated with x is given by

\rho_{int,x} = \frac{\tau_x^2}{\sigma_x^2 + \tau_x^2} = \frac{\tau_x^2}{\sigma_{x^*}^2}.   (4)

When x is used as a predictor in a linear regression model, the presence of interviewer effects can be described using a classical additive measurement error model (Carroll, Ruppert, Stefanski, and Crainiceanu 2006, p. 22). We assume our target parameter of interest is the slope β_1 relating y to the true value of x: y_ij = β_0 + β_1 x_ij + ε_ij, where ε_ij ~ N(0, σ_y^2). The model fitted using the observed data estimates y_ij = β_0 + β_1^* x_ij^* + ε_ij, where x_ij^* is given by (3). Assuming that u_xi and ε_ij are independent, the relationship between β_1 and β_1^* is given by

\beta_1^* = \lambda \beta_1 \quad \text{with} \quad \lambda = \frac{\sigma_x^2}{\sigma_{x^*}^2}.   (5)

In (5), λ is called the reliability ratio. In our case, λ can be decomposed if the components x_ij and u_xi from (3) are assumed to be independent (Carroll et al. 2006):

\lambda = \frac{\sigma_x^2}{\sigma_{x^*}^2} = \frac{\sigma_{x^*}^2 - \tau_x^2}{\sigma_{x^*}^2} = 1 - \rho_{int,x}.   (6)

We therefore expect that the interviewer effect on x will shrink the estimated regression slope towards zero by ρ_int,x · 100%. Further, we suppose that a correction of the shrinkage introduced by interviewers is possible with the inverse reliability ratio, λ_int^{-1} β_1^*, where

\lambda_{int}^{-1} = \frac{\sigma_{x^*}^2}{\sigma_{x^*}^2 - \tau_x^2}.

We are able to estimate the components of λ_int from the observed data: the empirical variance of x^* is an estimator for the overall variance of x^*. An estimator for the interviewer variance τ_x^2 in x^* can be calculated by fitting the following mixed-effects regression model:

x_{ij}^* = \mu + \gamma_i + \varepsilon_{x,ij},   (7)

where μ represents an overall fixed intercept term, γ_i the random intercept of interviewer i, and ε_x,ij a random error term. This model assumes independence of the interviewer effects and the errors. We also assume that any additional variability in case-specific measurement errors is negligible relative to the variance of the random interviewer effects (i.e., there is no case-specific error term in [3]). If one were able to estimate the variance of these case-specific errors, this term would also need to be subtracted from the denominator of the corrective factor introduced previously. Corrections based on interviewer effects alone may therefore not recover all of the bias in the parameter estimates if the variability in the case-specific errors is substantial. The variance of the γ_i terms, estimated by REML, is then an unbiased estimator for τ_x^2 (Fahrmeir et al. 2013, pp. 373–374). Further, the interviewer effect on x also causes an increase in the standard error of the estimated slope, because the residual variance in the linear model becomes

Var(y | x^*) = Var(y | x) + \beta_1^2 \tau_x^2 \frac{\sigma_x^2}{\sigma_{x^*}^2} + \beta_1^2 \sigma_{x\varepsilon}^2 \frac{\sigma_x^2}{\sigma_{x^*}^2} = Var(y | x) + \beta_1^2 \sigma_x^2 \rho_{int,x} + \beta_1^2 \sigma_{x\varepsilon}^2 \frac{\sigma_x^2}{\sigma_{x^*}^2}

(Carroll et al. 2006, p. 22), with σ_xε^2 the variance of ε_x,ij in (7). However, this increase also depends on the values of β_1 and σ_x^2.
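To make the attenuation in (5)–(6) and the correction via the inverse reliability ratio concrete, the following R lines simulate data under model (3), fit the naive regression, estimate τ_x^2 with a random-intercept model of the form (7), and rescale the slope. This is a minimal sketch of our own (not the paper's supplementary code); the use of the lme4 package and all object names are assumptions.

```r
# Minimal sketch of the reliability-ratio correction (lme4 assumed installed).
library(lme4)

set.seed(1)
n_int <- 50; n_obs <- 50                      # interviewers, interviews per interviewer
interviewer <- factor(rep(seq_len(n_int), each = n_obs))
x <- rnorm(n_int * n_obs, 0, 1)               # true predictor values
y <- 0.5 + 1 * x + rnorm(n_int * n_obs, 0, sqrt(2))   # true outcome, beta_1 = 1
u_x <- rnorm(n_int, 0, sqrt(0.1))             # interviewer effects on x, tau_x^2 = 0.1
x_star <- x + u_x[interviewer]                # observed, error-prone predictor, equation (3)

# Naive slope is attenuated by roughly 1 - rho_int,x = 1 - 0.1/1.1
beta_naive <- coef(lm(y ~ x_star))[["x_star"]]

# Estimate tau_x^2 from the observed data via the random-intercept model (7)
fit_x <- lmer(x_star ~ 1 + (1 | interviewer), REML = TRUE)
tau2_hat <- as.data.frame(VarCorr(fit_x))$vcov[1]

# Inverse reliability ratio and corrected slope
lambda_inv <- var(x_star) / (var(x_star) - tau2_hat)
beta_adj <- beta_naive * lambda_inv
c(naive = beta_naive, adjusted = beta_adj)
```

Under these settings, the naive estimate falls roughly nine percent below one, while the adjusted estimate recovers most of that shortfall, illustrating (6).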
We now consider the interviewer effect on the measurement of the dependent variable y. Similar to (3), the measurement error model for this variable is given by

y_{ij}^* = y_{ij} + u_{yi} + \varepsilon_{y,ij}, \quad u_{yi} \sim N(0, \tau_y^2), \quad \varepsilon_{y,ij} \sim N(0, \sigma_{y\varepsilon}^2),   (8)

with the components assumed to be independent. In contrast to the interviewer effect on the independent variable, adding interviewer effects to the dependent variable in a linear regression model does not cause bias in the slope:

E(y_{ij}^* | x_{ij}^*) = E(y_{ij} + u_{yi} + \varepsilon_{y,ij} | x_{ij}^*) = E(y_{ij} | x_{ij}^*) + E(u_{yi}) + E(\varepsilon_{y,ij}) = E(y_{ij} | x_{ij}^*).

But the variance of the residuals will increase by the variance of the interviewer effects τ_y^2 and the variance of the additional error term ε_y,ij:

Var(y_{ij}^* | x_{ij}^*) = Var(y_{ij} + u_{yi} + \varepsilon_{y,ij} | x_{ij}^*) = Var(y_{ij} | x_{ij}^*) + Var(u_{yi}) + Var(\varepsilon_{y,ij}) = Var(y_{ij} | x_{ij}^*) + \tau_y^2 + \sigma_{y\varepsilon}^2.

This leads to a larger standard error of the estimated slope and hence to reduced power of the usual single-parameter test in the linear model with H_0: β_1 = 0.

All expectations so far assume no correlation of the interviewer effects on x and y. Adding correlated error terms leads to a more complex situation. Following Chai (1971), the generalized reliability ratio with possibly correlated measurement errors can be approximated for large n_obs by

\lambda_{gen} = \frac{1 + \sigma_{u_{xi}, u_{yi}} / \sigma_{x,y}}{1 + \tau_x^2 / \sigma_x^2},   (9)

with σ_{u_xi,u_yi} and σ_{x,y} describing the covariance of the random interviewer effects on x and y and the covariance of the dependent and independent variables, respectively. With the given model specification, the latter can be expressed as σ_{x,y} = β_1 σ_x^2. Depending on the values of σ_{u_xi,u_yi} and σ_{x,y}, the introduced bias relative to zero can either be reduced or increased (Biemer and Trewin 1997). Thus, we expect a change in the magnitude of the bias of β_1 in the presence of correlated interviewer effects on both the dependent and independent variables.

2.3 Nonresponse Model

To model the probability of response p_ij for a subject j recruited by an interviewer i, we use the following logistic regression model:

p_{ij} = \frac{\exp(\delta_0 + \delta_1 y_{ij} + \delta_2 x_{ij} + u_{pi} + u_{piY} y_{ij} + u_{piX} x_{ij})}{1 + \exp(\delta_0 + \delta_1 y_{ij} + \delta_2 x_{ij} + u_{pi} + u_{piY} y_{ij} + u_{piX} x_{ij})},   (10)

with the random interviewer effects u_{pi} \sim N(0, \sigma_{RP}^2), u_{piY} \sim N(0, \sigma_{RPY}^2), and u_{piX} \sim N(0, \sigma_{RPX}^2). The parameter δ_0 is the intercept term defining the overall response rate, δ_1 and δ_2 represent the relationships of the true values of y and x with the response probability, and u_pi is the random interviewer effect associated with response propensity, assumed to be normally distributed with zero mean and variance σ_RP^2. The random effects u_piY and u_piX represent random interviewer slopes for the true values of y and x on the response propensity.

For δ_1 = δ_2 = 0, a "missing completely at random" mechanism holds for the sampled cases of each interviewer (Little and Rubin 2002, pp. 11–19). Because the response rate then shows no systematic tendency as a function of y or x, we expect no bias in the estimation of β_1 due to nonresponse, only an increase in the estimate's standard error as a result of fewer observations. When δ_1 = 0 and δ_2 ≠ 0, we have a "missing at random" mechanism; assuming the linear model is correctly specified, the conditional distribution of y | x is unchanged, and thus β_1 is estimated without bias. Whenever δ_1 ≠ 0, individuals with higher values of y are either more or less likely to respond, leading to a "not missing at random" mechanism. In this situation, the estimated variance of y from the respondents only, Var̂(y_RP), is less than Var̂(y), the estimated variance of y from the whole sample.
Goldberger (1981) and Groves (2004, p. 158) describe the impact of this systematic underestimation of the variance on the estimated slope parameter in a simple linear regression model:

\hat{\beta}_1^{RP} = \beta_1 \frac{\widehat{Var}(y_{RP}) / \widehat{Var}(y)}{1 - \rho^2 \left(1 - \widehat{Var}(y_{RP}) / \widehat{Var}(y)\right)}.   (11)

In (11), ρ^2 is the coefficient of determination, or the proportion of variance in y explained by x. Because the influence of y on p_ij in (10) is a main effect, the variance ratio Var̂(y_RP)/Var̂(y) ≤ 1 and therefore Var̂(y_RP)/Var̂(y) ≤ 1 − ρ^2(1 − Var̂(y_RP)/Var̂(y)). For that reason, we expect a shrinkage of the estimated parameter β̂_1^RP towards zero for δ_1 ≠ 0. This bias also depends on the magnitude of Var(y | x) = Var(ε), the residual variance; a smaller residual variance leads to less bias, and in the situation of a perfect fit there is no bias at all. Interviewer-specific random effects introduce additional variance in the response probabilities. If they are random and independent of the other parameters, we expect no influence on the bias of β̂_1. We consider the implications of a correlation between interviewer effects on the measurement of y and interviewer effects on response propensity in the next section.

2.4 Correlation of Nonresponse and Measurement Error

In the case of a relationship between the measurement error and the nonresponse introduced by interviewers (i.e., correlated u_pi and u_yi), we expect no additional bias in the slope but rather in the estimated intercept parameter. If Cor(u_yi, u_pi) > 0, then sample members whose interviewers have high u_yi also tend to have higher response propensities because of the correspondingly higher u_pi, and therefore the estimated intercept increases. However, the size of this effect depends on the sizes of the involved variances τ_y^2 and σ_RP^2 and on Cor(u_yi, u_pi), with larger variances and higher correlation leading to more bias. Nevertheless, the adjustment λ_int^{-1} from section 2.2 can be estimated using data from respondents only and used to account for the bias in the slope parameter introduced by measurement error. How much of the overall bias can be reduced by this approach is investigated in the following simulation study.

3. SIMULATION DESIGN

Table 1 shows the varying parameters for all of the following simulation scenarios. Data generated for the x and y variables come from a superpopulation with the relationship y_j = β_0 + β_1 x_j + ε_j, where j denotes the elements of the superpopulation. We assume x_j ~ N(0, 1), ε_j ~ N(0, 2), x_j independent of ε_j, β_0 = 0.5, and β_1 = 1. Thus, y_j | x_j ~ N(0.5 + x_j, 2). Additionally, given a sample of n_int = 50 interviewers from a population of interviewers, n_obs = 50 describes the number of interviews attempted per interviewer. Thus, the total number of interviews attempted is n = n_int · n_obs = 2,500.
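Before turning to the parameter grid, the following R sketch shows how the data for a single replication could be generated under the measurement error models (3) and (8) and the nonresponse model (10). It is our own minimal illustration of the setup described above (not the paper's supplementary code); all object names are assumptions, and the random interviewer slopes, the correlated-effects parameters, and the case-specific error term in (8) are omitted for brevity.

```r
# One simulation replication (sketch of the setup in sections 2.2-2.3 and Table 1).
set.seed(2)
n_int <- 50; n_obs <- 50
interviewer <- rep(seq_len(n_int), each = n_obs)

# Superpopulation model: y = 0.5 + 1 * x + eps, x ~ N(0, 1), residual variance 2
x <- rnorm(n_int * n_obs, 0, 1)
y <- 0.5 + x + rnorm(n_int * n_obs, 0, sqrt(2))

# Interviewer effects on the measurements, equations (3) and (8)
tau2_x <- 0.1; tau2_y <- 0.1
u_x <- rnorm(n_int, 0, sqrt(tau2_x))
u_y <- rnorm(n_int, 0, sqrt(tau2_y))
x_star <- x + u_x[interviewer]
y_star <- y + u_y[interviewer]

# Response propensities, equation (10): delta_0 = 0 gives a base rate near 50 percent
delta0 <- 0; delta1 <- 0.6; delta2 <- 0
sigma2_RP <- 0.25
u_p <- rnorm(n_int, 0, sqrt(sigma2_RP))
p <- plogis(delta0 + delta1 * y + delta2 * x + u_p[interviewer])

# Response indicator: compare the propensity with a uniform draw (see section 3)
RP <- as.integer(runif(n_int * n_obs) <= p)
```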
Table 1. Overview of the Values for All Varying Simulation Parameters

Parameters from the nonresponse error model described in section 2.3:
  δ_1: relationship of y values with response propensity; values 0, 0.6
  δ_2: relationship of x values with response propensity; values 0, 0.6
  σ_RP^2: variance of interviewer effects on response propensity; values 0.01, 0.25
  σ_RPX^2: variance of the random interviewer coefficient for x in predicting response propensity; values 0.01, 0.25
  σ_RPY^2: variance of the random interviewer coefficient for y in predicting response propensity; values 0.01, 0.25

Parameters from the measurement error model described in section 2.2:
  τ_x^2: variance of interviewer effects on x; values 0.01, 0.1
  τ_y^2: variance of interviewer effects on y; values 0.01, 0.1
  Cor(u_xi, u_yi): correlation of the interviewer effects on x and y; values −0.25, 0, 0.25

Parameter for the relationship between the nonresponse and measurement error models as described in section 2.4:
  Cor(u_yi, u_pi): correlation of the interviewer effects on y and on the response propensity; values −0.5, 0, 0.5

The number of interviews completed will shrink with the simulated unit nonresponse. Starting from a base response rate of 50 percent (δ_0 = 0 in [10]), the response propensity may vary by subject when missingness depends on y (not missing at random) and/or there is interviewer heterogeneity in nonresponse. We consider two values of δ_1 in (10), corresponding to missing completely at random (δ_1 = 0) and to a stronger association with the outcome (δ_1 = 0.6). For σ_RP^2 we consider two levels corresponding to trivial and substantial interviewer effects on response propensity (0.01 and 0.25), based on West and Olson (2010) and West et al. (2013). In addition, the observed values of x_j and y_j are subject to a measurement error specific to the corresponding interviewer, as introduced in (3) and (8), resulting in clustering effects according to interviewers. For both variables, x and y, we add either a trivial (τ_x^2 = 0.01, τ_y^2 = 0.01) or a nontrivial (τ_x^2 = 0.1, τ_y^2 = 0.1) amount of interviewer variance. The chosen parameter values lead to the following intraclass correlations (ICCs) for interviewer effects on x and y (West and Olson 2010; West et al. 2013):

\rho_{int,x,high} = \frac{\tau_x^2}{\sigma_{x_{ij}}^2 + \tau_x^2} = \frac{0.1}{1 + 0.1} \approx 0.091, \quad \rho_{int,x,low} = \frac{0.01}{1 + 0.01} \approx 0.01,

\rho_{int,y,high} = \frac{\tau_y^2}{\sigma_{y_{ij}}^2 + \tau_y^2} = \frac{0.1}{2 + 0.1} \approx 0.05, \quad \rho_{int,y,low} = \frac{0.01}{2 + 0.01} \approx 0.005.

Further, we consider three possible values, negative (−0.25), positive (0.25), and zero, for the correlation of the interviewer effects on x and y, Cor(u_xi, u_yi).
Finally, the influence of a possible correlation between the interviewer effects on y and on the response propensity, Cor(u_yi, u_pi), as described in section 2.4, is examined for three possible values (−0.5, 0, and 0.5). Because of a lack of empirical results for these correlations, we choose rather extreme values in order to examine their potential influence. This set of parameter values results in 1,152 possible scenarios.

For every scenario, we replicated the following procedure n_rep = 100 times. First, values for x and y with the described structure were drawn at random, and linear models were fitted to the generated data (assuming no nonresponse and no measurement error). Next, based on the models from section 2, x^* (3), y^* (8), and p_ij (10) were generated for all observations. The response indicator RP_ij was obtained by comparing p_ij with a random variable ψ_ij drawn from a uniform distribution over the interval [0, 1], using the following decision rule:

RP_{ij} = \begin{cases} 1 & \text{for } \psi_{ij} \le p_{ij} \\ 0 & \text{for } \psi_{ij} > p_{ij} \end{cases}, \quad \psi_{ij} \sim U(0, 1), \; \forall i, j.

In the second step, we fitted linear models to the true values of the responding subjects and to the values containing errors. We also estimated the reliability ratio from (5), using the error-prone data from respondents only (i.e., the data situation of a real survey), and calculated an adjusted slope parameter as described in section 2.2. The simulation study was performed in the software package R (R Core Team 2017). The annotated R code for performing the simulation study is available in the supplementary materials. Selected results of the simulation are presented in the following section.

4. SIMULATION ANALYSIS

We focus on the results for the bias of β̂_1 in this section because it dominates the root mean squared error (RMSE) in all investigated scenarios. The estimator of the bias for one parameter combination was calculated as the mean of the differences from the true value β_1 over all replications:

\widehat{Bias}(\hat{\beta}_1) = \frac{1}{n_{rep}} \sum_{k=1}^{n_{rep}} (\hat{\beta}_{1k} - \beta_1).

To obtain an overview of the influence of the various simulation parameters on the estimation of β_1 given the large number of scenarios, we use a variable importance measure generated from an application of the random forest method, which is introduced in the next section.
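As a concrete illustration of the estimation step within each replication and of the bias computation above, the sketch below continues the hypothetical objects (x, y, x_star, y_star, RP, interviewer) from the section 3 sketch. It is our own simplified version of the procedure, not the paper's supplementary code.

```r
# Estimation step for one replication (sketch; objects from the earlier
# data-generation sketch are assumed to exist).
library(lme4)

d <- data.frame(x = x, y = y, x_star = x_star, y_star = y_star,
                interviewer = factor(interviewer), RP = RP)
dr <- d[d$RP == 1, ]                                   # respondents only

# (i) respondents, true values: nonresponse bias only
b_true <- coef(lm(y ~ x, data = dr))[["x"]]

# (ii) respondents, error-prone values: nonresponse plus measurement error
b_obs <- coef(lm(y_star ~ x_star, data = dr))[["x_star"]]

# (iii) respondents, error-prone values, adjusted by the estimated
#       inverse reliability ratio from section 2.2
fit_x <- lmer(x_star ~ 1 + (1 | interviewer), data = dr, REML = TRUE)
tau2 <- as.data.frame(VarCorr(fit_x))$vcov[1]
b_adj <- b_obs * var(dr$x_star) / (var(dr$x_star) - tau2)

# Differences from the true slope beta_1 = 1; averaging these over the
# n_rep = 100 replications gives the bias estimator defined above.
c(bias_true = b_true - 1, bias_obs = b_obs - 1, bias_adj = b_adj - 1)
```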
4.1 Introduction to the Random Forest Method

To understand the random forest method, we first need to take a look at tree-based methods. These methods divide the covariate space (also known as the feature space) into subspaces and fit a simple model in every resulting subspace, such as the mean of the dependent variable of the observations in the subspace (for a continuous dependent variable) or the assignment of one category (for a categorical dependent variable). For a regression problem (continuous dependent variable), the algorithm starts with one split value of one covariate z_m that divides the space into two subspaces R_1 and R_2. In our application, the dependent variable w contains all estimated slope parameters β̂_1 from all simulated scenarios and all one hundred replications, and the covariates consist of all varying parameters in the simulation. Thus, the resulting R_1 and R_2 are two sets of simulation parameter values that differ only by the values of the selected split variable (simulation parameter) z_m. The optimal split variable v and the optimal split point s minimize the following criterion:

\min_{v,s} \left[ \sum_{l=1}^{2} \min_{c_l} \left( \sum_{z_m \in R_l(v,s)} (w_m - c_l)^2 \right) \right],   (12)

with l denoting the index of the subspaces. The inner terms are minimal for

\hat{c}_l = \frac{1}{\#\{w_m \mid z_m \in R_l(v,s)\}} \sum_{w_m \mid z_m \in R_l(v,s)} w_m,

which is the mean of the w values in the produced subspace (here, the mean of the β̂_1 values within the subspace). The values of the inner sums can be computed for all v, s to find the optimum and thereby the border that separates the subspaces R_1 and R_2. This procedure is repeated for all resulting subspaces until a stopping rule is fulfilled. The only change for a classification problem is that a different split criterion is necessary. Despite their flexibility and interpretability, trees can be very unstable; that is, a small change in the given data can produce a totally different tree (Friedman, Hastie, and Tibshirani 2009, p. 312).

The random forest is an ensemble method in which many trees are fitted to bootstrap samples of the given data set and aggregated to produce a prediction rule. This aggregation, or the average of all tree results in the case of a continuous variable, produces a prediction function with less variance than that obtained from a single tree; that is, the method solves the instability problem of a single tree. The variance reduction is especially efficient if the aggregated trees differ from each other. Further de-correlation of the trees can be obtained by using only a random sample of the covariates for every split and by growing the trees as large as possible. This idea was developed by Breiman (2001). Unfortunately, a random forest is harder to interpret than a single tree. To overcome this lack of interpretability, a variable importance measure can be constructed from the observations that are not used to fit a given tree (i.e., the observations from the data which are not included in the bootstrap sample). These observations are called the out-of-bag (OOB) sample. To compute the measure, the prediction accuracy of one tree for its OOB observations is calculated. Then the values of a variable q in the OOB sample are randomly permuted and the prediction accuracy is measured again. The mean loss of accuracy due to random permutation over all trees is then a measure of the variable importance for variable q (Friedman et al. 2009, pp. 593–594). For a regression tree, the prediction accuracy implemented in the R package "randomForest" is the mean squared error (MSE) (Liaw and Wiener 2002). Theory about tree-based methods and random forests (including an alternative variable importance measure) can be found in Friedman et al. (2009), chapters nine, ten, and fifteen.

4.2 Simulation Results

To analyze the results of our simulations, we estimate a random forest (1,000 trees) with β̂_1 as a continuous dependent variable, all varying parameters (δ_1, δ_2, σ_RP^2, σ_RPY^2, σ_RPX^2, τ_x^2, τ_y^2, Cor(u_xi, u_yi), Cor(u_yi, u_pi)) as covariates, and all n_rep · #scenarios = 100 · 1,152 = 115,200 simulation runs as the underlying data set. A measure of parameter importance can be produced as described in section 4.1: a permutation of an important variable leads to a large increase in the MSE of the random forest. Figure 1 shows the calculated variable importance for all nine varying input parameters in decreasing order with β̂_1 as the dependent variable. The x-axis shows the mean percentage increase of the MSE if the values of the parameter on the y-axis are permuted.
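The variable importance analysis described in this subsection could be carried out along the following lines with the randomForest package cited above. The data frame sim_results and its column names are our own assumptions; the paper's supplementary materials contain the actual code.

```r
# Sketch of the variable importance analysis (sim_results assumed to hold one
# row per simulation run, with the estimated slope and the nine parameters).
library(randomForest)

rf <- randomForest(beta1_hat ~ delta1 + delta2 + sigma2_RP + sigma2_RPY +
                     sigma2_RPX + tau2_x + tau2_y + cor_uxy + cor_uyp,
                   data = sim_results, ntree = 1000, importance = TRUE)

# Permutation importance: mean percentage increase in MSE (type = 1),
# the quantity plotted in figure 1
importance(rf, type = 1)
varImpPlot(rf, type = 1)
```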
The parameter resulting in the highest MSE increase, and thereby the highest estimated importance according to this criterion, is δ_1, the relationship between y and the response propensity, followed by τ_x^2, the variance of the random interviewer effects on the variable x. The third and fourth most important parameters are δ_2, the relationship between x and the response propensity, and Cor(u_xi, u_yi), the correlation of the interviewer-specific random effects. The other parameters play minor roles. The high impact of τ_x^2 on the estimation of the slope parameter is not surprising and can be explained by the relationship expressed in (5) and (6). The same applies to the high impact of δ_1, consistent with our expectations in section 2.3, and to the effect of Cor(u_xi, u_yi), which is anticipated because of (9). The magnitude of the effect of τ_y^2 can also be explained by (9): the generalized reliability ratio shows that the MSE increase from permuting τ_y^2 is based on its relationship with the highly relevant parameters τ_x^2 and Cor(u_xi, u_yi); for Cor(u_xi, u_yi) = 0, the effect disappears. That σ_RP^2 seems to be less important is consistent with our expectation from section 2.3. The minor importance of Cor(u_yi, u_pi) results from the lower importance of τ_y^2 and σ_RP^2 and their relationship with Cor(u_yi, u_pi), as described in section 2.4. In summary, analyzing the simulated data with the random forest procedure makes it possible to determine which of the varying simulation parameters has the highest influence on the estimated slope parameter β̂_1 in a linear regression model. From this overview, it can be concluded that the biggest influence of interviewers on β̂_1 is introduced by measurement error in the independent variable x.

Figure 1. Parameter Importance for β̂_1 Estimated by a Random Forest.

To overcome the limitations of analyzing the variable importance only in this broad sense and to gain deeper insight into possible interactions of simulation parameters and their effects on the bias of the slope parameter, figure 2 shows a detailed matrix of the estimated biases for σ_RPY^2 = 0.25 and σ_RPX^2 = 0.25 with all combinations of the parameter values of δ_1, δ_2, σ_RP^2, τ_x^2, τ_y^2, Cor(u_xi, u_yi), and Cor(u_yi, u_pi). Plotted results for the other parameter combinations of σ_RPY^2 and σ_RPX^2 can be found in the Appendix. The parameters of the measurement error model vary in the columns, and the rows differ in terms of parameter values from the response model. The dashed line indicates a bias of zero. Within cells, the empirical bias of β̂_1 is plotted for all three values of Cor(u_yi, u_pi) in three different situations:

1. Data from all subjects with a response indicator RP_ij = 1 (respondents) and the true values of x and y (i.e., only a possible bias due to nonresponse is observed);
2. Data from all subjects with a response indicator RP_ij = 1 (respondents) and the values of x^* and y^* containing interviewer effects (measurement errors);

3. Data from all subjects with a response indicator RP_ij = 1 (respondents), the values of x^* and y^* containing interviewer effects (measurement errors), and an additional adjustment of the slope parameter by the reliability ratio λ̂, estimated using the error-prone data from respondents only.

Figure 2. Estimated Bias of Slope Parameter β̂_1 for Different Parameter Settings.

Varying δ_1 has a large impact on the bias of β̂_1 independent of the other parameter values. As expected, no bias occurs in the missing completely at random situation with δ_1 = 0. If δ_1 ≠ 0, a negative bias of approximately −0.12 for δ_1 = 0.6 can be observed for the respondents with true values. Whenever δ_2 ≠ 0, this bias becomes even stronger. Changes in σ_RP^2 have a negligible influence on the bias. These findings are consistent with our expectations in section 2.3. Considering the interviewer effects on the variables as an additional source of bias, a specific pattern is visible: whenever the variance of the random interviewer effects on x, τ_x^2, is small, the bias in all three situations is the same (i.e., there is no additional bias due to measurement errors in this case). For τ_x^2 = 0.1, the additional measurement error enlarges the bias by about ρ_int,x,high · 100%, which is 0.09 for δ_1 = 0, or nine percent in general. The magnitudes of τ_y^2 and Cor(u_xi, u_yi) introduce minor changes in this relationship. For τ_y^2 = 0.01, the additional bias from measurement errors is negligible. For τ_y^2 = 0.1, it is slightly larger (−0.02) when Cor(u_xi, u_yi) = −0.25 and slightly smaller (0.02) when Cor(u_xi, u_yi) = 0.25. The adjustment by the inverse reliability ratio is a useful method to overcome interviewer effects in x^*. However, the reliability ratio ignores the additional changes in bias introduced by τ_y^2 and Cor(u_xi, u_yi), so it corrects too little in some cases and too much in others. The generalized reliability ratio introduced in (9) may help to address these minor differences. If the variance of the random interviewer slopes for x or y (σ_RPX^2 or σ_RPY^2) is high, correcting with the estimated reliability ratio may lead to a slight positive bias. This effect is generally stronger in scenarios with high values of σ_RPX^2. In cases of high variances for both variables, their effects add up and the correction can introduce a positive bias. Varying the parameter Cor(u_yi, u_pi) produces slight changes in the bias in some cases, but even for large values of this correlation, no obvious pattern is apparent. These minor effects match our expectations in section 2.4. We also simulated the same scenarios with β_1 = −1 and negative values for δ_1 and δ_2; the pattern of results across the simulation parameters is basically the same, with a positive bias occurring for β_1 = −1 due to the shrinkage towards zero.

5. APPLICATION

We now assess the empirical results presented so far in this paper using data from a real face-to-face survey. An ideal data set for further understanding the simulation findings would have the following characteristics: 1) measurements on two continuous variables from the survey that have an approximately linear relationship; 2) an underlying interpenetrated sample design to avoid confounding of area effects and interviewer effects; and 3) a unique interviewer identification variable available for every interviewed person, so that all observations can be associated with a specific interviewer.
Furthermore, true values for the two variables of interest would need to be available for the entire sample for validation purposes. Given a data set meeting these criteria, we can estimate all of the parameters varied in the simulation studies and assess how these estimates impact our inferences related to the regression coefficient of interest.

We analyze data from a face-to-face survey that was conducted in fifteen large areas in Germany (West, Conrad, Kreuter, and Mittereder 2018). In each area, 480 currently employed adults with a history of at least one unemployment spell were randomly sampled from the Integrated Employment Biographies (IEB) database, which contains official government information on employment histories. The overall sample size was thus n = 7,200. Four professional interviewers were assigned to work in each of the fifteen areas (sixty interviewers total), and each was randomly assigned 120 cases in total (i.e., interpenetrated sample assignment, conditional on the area). While the larger goal of this study was to evaluate the effects of experimentally manipulated interviewing techniques on data quality, we do not consider that experimental manipulation here. The survey instrument included questions with responses that could be validated using administrative information on the IEB sampling frame. We specifically focus on log-transformed annual income as the dependent variable of interest and the longest uninterrupted period of employment in the last twenty years (in months) as the independent variable of interest, the underlying theory being that individuals who are able to hold positions for longer periods of time experience greater benefits in terms of annual income. Data collection ran from March to October 2014. Respondents were provided with a 20 Euro token of appreciation for participating. In total, n = 1,850 interviews were completed by the sixty interviewers (AAPOR RR1 = 25.7%), and after accounting for item-missing data on the two variables of interest, there were n = 1,469 respondents with data available on both variables. Additional details related to the design of the parent study can be found in West et al. (2018).

Table 2 below provides estimates of the parameters of interest in our simulation studies based on the data from this German study. When estimating the variance of the random interviewer effects on a particular variable, we fit multilevel models explicitly controlling for the fixed effects of the fifteen areas, removing any variability in the variable being modeled that is due to the different areas. Of note, the relationships of the dependent and independent variables with response propensity (δ_1 and δ_2) were not significantly different from zero, and the same holds for the correlation of the random interviewer effects on the reported values of the dependent and independent variables. We also did not find any evidence of interviewer variance in the relationships of the dependent and independent variables with response propensity, and these random coefficients were therefore dropped from the model. The variance of the random interviewer effects on x, found to be the second most important factor in the simulations, was significantly greater than zero (p = 0.003 according to a mixture-based likelihood ratio test), but only of moderate size relative to the simulation inputs (corresponding ρ_int,x = 0.027).
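The interviewer variance components in table 2 could be estimated along the following lines, controlling for fixed area effects as just described. This is our own sketch of such a model, assuming the lme4 package and hypothetical column names (log_income, longest_employment, area, interviewer) in a data frame named survey.

```r
# Sketch of the variance-component estimation behind table 2 (hypothetical
# variable names; the design is interpenetrated only within areas, so area
# fixed effects remove area-level variability).
library(lme4)

# Random interviewer effects on the independent variable x (longest employment spell)
fit_x <- lmer(longest_employment ~ factor(area) + (1 | interviewer),
              data = survey, REML = TRUE)
tau2_x_hat <- as.data.frame(VarCorr(fit_x))$vcov[1]   # estimate of tau_x^2

# Analogous model for the dependent variable y (log annual income)
fit_y <- lmer(log_income ~ factor(area) + (1 | interviewer),
              data = survey, REML = TRUE)
tau2_y_hat <- as.data.frame(VarCorr(fit_y))$vcov[1]   # estimate of tau_y^2
```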
Table 2. Estimates of Parameters Varied in the Simulation Studies Based on Real Data Collected in a German Face-to-Face Survey

  δ_1: estimate = 0.003, p = 0.887, n = 7,154
  δ_2: estimate < 0.001, p = 0.558, n = 7,154
  σ_RP^2: estimate = 0.121, p < 0.001, ICC = 0.035
  σ_RPX^2: estimate < 0.001, not significant (removed from model)
  σ_RPY^2: estimate < 0.001, not significant (removed from model)
  τ_x^2: estimate = 73.501, p = 0.003, ICC = 0.027, n = 1,469
  τ_y^2: estimate = 0.053, p = 0.198, ICC = 0.008, n = 1,469
  Cor(u_xi, u_yi): estimate = 0.113, p = 0.390
  Cor(u_yi, u_pi): estimate = −0.199, p = 0.128

Given this combination of estimates, we would predict, based on the theory and simulation studies presented here, that the attenuating effect of the interviewer variance on the estimate of the regression coefficient of interest would be relatively modest. Table 3 presents the OLS estimates of the regression coefficient of interest, after adjusting for the fixed area effects, for 1) the full sample, when using the true values of each variable; 2) the respondents, when using the true values of each variable; and 3) the respondents, when using the reported values of each variable.

Table 3. OLS Estimates of the Regression Coefficient of Interest at Three Different Stages of the German Face-to-Face Survey (OLS Estimates and Standard Errors Are Multiplied by 100)

  Full sample, true values: sample size 7,154, OLS estimate 0.493, standard error 0.03, p < 0.001
  Respondents, true values: sample size 1,469, OLS estimate 0.527, standard error 0.07, p < 0.001
  Respondents, reported values: sample size 1,469, OLS estimate 0.467, standard error 0.13, p < 0.001

We see in table 3 that the estimated coefficient based on respondent reports is attenuated by only a modest amount (a 5.3 percent reduction in magnitude) relative to the "true" coefficient that would have been computed if every individual had responded and provided the exact correct value on each variable. These results therefore match our expectations based on the simulation studies, and overall inferences would not be affected. The proposed correction factor, when applied to the OLS estimate of 0.00467 based on respondent reports, would be computed as 2,725.67 (the total variance of the independent variable) divided by 2,652.17 (the variance of the true values of the independent variable in the absence of the interviewer effects), or 1.03. This correction would bring the slightly biased OLS estimate closer to its "true" value: 0.00467 · 1.03 = 0.00481. We note that the denominator of this correction may be too large given the presence of case-specific measurement errors; knowledge of the variance of these errors would allow us to correct more fully for the bias in this estimate. We were not able to estimate this variance in the present application because not all respondents consented to having their reports linked to administrative data (which would be required for computing the case-specific errors).
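As a small worked example, the following R lines reproduce the correction arithmetic reported above using the estimates from tables 2 and 3; this is only the back-of-the-envelope calculation, not a reanalysis of the survey data.

```r
# Inverse reliability ratio for the application, using the reported estimates:
var_x_star <- 2725.67                    # total variance of the observed predictor
tau2_x <- var_x_star - 2652.17           # ~73.5, matching the tau_x^2 estimate in table 2
lambda_inv <- var_x_star / (var_x_star - tau2_x)   # ~1.03

beta_obs <- 0.00467                      # respondents, reported values (table 3)
beta_adj <- beta_obs * lambda_inv        # ~0.00481, closer to the full-sample 0.00493
```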
6. DISCUSSION

We have shown how interviewer effects resulting in measurement errors and nonresponse errors can influence the estimation of the slope parameter in a simple linear regression model in the presence of different missing data mechanisms. In our simulated scenarios, the measurement error introduced in the independent variable and the missing data mechanism have the strongest influences. Both sources introduce a bias towards zero and therefore weaken the estimated relationship between the independent and the dependent variable. The standard treatment for correcting slope parameters estimated from error-prone variables, the reliability ratio, can help to repair the bias introduced by measurement error, but not the bias from nonresponse error. In our simulated scenarios, the reliability ratio estimated from respondent data only is adequate to correct the largest part of the bias introduced by interviewer effects, because the interviewer effect on the response propensity introduces only a small bias in the regression slope.

There is much additional work to be done in this area. An important next step would be to evaluate the change in the standard errors of the slope in order to determine the interviewer effects on significance tests of β̂_1. Furthermore, we only considered the case of simple linear regression with a single predictor in this paper; future work should extend the results in a multivariate direction. While we considered an extensive simulation setting, there are a number of factors that remain to be explored. The correlation between the interviewer-specific random effects on x and y needs more investigation because it has a non-negligible influence on the bias.
To correct jointly for both interviewer effects and nonresponse error, one could use Heckman's selection model (Heckman 1979) and treat the interviewer ID as a kind of instrumental variable, similar to the approach of McGovern, Bärnighausen, Salomon, and Canning (2015), controlling for the interviewers in the regression model. Alternatively, one could combine this approach with pattern-mixture models, as in the work of Yuan and Little (2009), to adjust for nonresponse error. The magnitudes of interviewer variance for various survey variables depend heavily on the characteristics of the survey questions (Schaeffer, Dykema, and Maynard 2010). To reduce this source of variance, interviewer training (Billiet and Loosveldt 1988; Fowler and Mangione 1990, chapter 7) and standardized rather than conversational interviewing (West, Conrad, Kreuter, and Mittereder 2017, 2018) have been shown to be effective.

Whenever researchers analyze survey data collected by interviewers, the article by Elliott and West (2015) can be consulted for practical guidance on how to deal with interviewer effects in the case of descriptive estimation. As seen in our simulation study, interviewer variance in the independent variable of a regression model can introduce bias in the estimated slope parameter. Analysts should therefore assess the impact of interviewer effects on the regression parameters of interest and, if necessary, make appropriate adjustments with the inverse of the reliability ratio. As noted previously, this work has focused on the simple linear regression model, and appropriate adjustments for regression models with multiple predictors subject to interviewer effects should be an important focus of future work in this area. While cluster effects and differential measurement error are not unique to interviewer effects (for example, misreporting of parental education in the Programme for International Student Assessment [PISA] may confound relationships between parental education and student performance; Kreuter et al. 2010), an uncommon feature of this setting is the ability to estimate the degree of measurement error using random effects models applied to the regression predictors of interest. Our simulation study suggests that use of the reliability ratio estimated from such models can correct for much of the resulting bias, even in settings where the measurement error structure is more complex than implied by a simple reliability ratio adjustment. This suggests that such adjustments should be considered more routinely than is currently the case, although further research is needed to better understand how failures of assumptions such as interpenetration may impact such adjustments.

Appendix

Figure A.1. Estimated Bias of Slope Parameter β̂_1 for Different Parameter Settings.

Figure A.2. Estimated Bias of Slope Parameter β̂_1 for Different Parameter Settings.

Figure A.3. Estimated Bias of Slope Parameter β̂_1 for Different Parameter Settings.

REFERENCES
Beullens, K., and Loosveldt, G. (2014), "Interviewer Effects on Latent Constructs in Survey Research," Journal of Survey Statistics and Methodology, 2, 433–458.

Beullens, K., and Loosveldt, G. (2016), "Interviewer Effects in the European Social Survey," Survey Research Methods, 10, 103–118.

Biemer, P. P., and Trewin, D. (1997), "A Review of Measurement Error Effects on the Analysis of Survey Data," in Survey Measurement and Process Quality, pp. 601–632, New York: Wiley.

Billiet, J., and Loosveldt, G. (1988), "Improvement of the Quality of Responses to Factual Survey Questions by Interviewer Training," Public Opinion Quarterly, 52, 190–211.

Breiman, L. (2001), "Random Forests," Machine Learning, 45, 5–32.

Carroll, R. J., Ruppert, D., Stefanski, L. A., and Crainiceanu, C. M. (2006), Measurement Error in Nonlinear Models: A Modern Perspective, Boca Raton, FL: CRC Press.

Chai, J. J. (1971), "Correlated Measurement Errors and the Least Squares Estimator of the Regression Coefficient," Journal of the American Statistical Association, 66, 478–483.

Davis, P., and Scott, A. (1995), "The Effect of Interviewer Variance on Domain Comparisons," Survey Methodology, 21, 99–106.

Elliott, M. R., and West, B. T. (2015), "'Clustering by Interviewer': A Source of Variance That Is Unaccounted for in Single-Stage Health Surveys," American Journal of Epidemiology, 182, 118–126.

Fahrmeir, L., Kneib, T., Lang, S., and Marx, B. (2013), Regression: Models, Methods and Applications, New York: Springer Science & Business Media.

Fowler, F. J., Jr., and Mangione, T. W. (1990), Standardized Survey Interviewing: Minimizing Interviewer-Related Error, Vol. 18, Newbury Park, CA: Sage.

Friedman, J., Hastie, T., and Tibshirani, R. (2009), The Elements of Statistical Learning: Data Mining, Inference, and Prediction, New York: Springer.

Goldberger, A. S. (1981), "Linear Regression after Selection," Journal of Econometrics, 15, 357–366.

Groves, R. M. (2004), Survey Errors and Survey Costs, Hoboken, NJ: John Wiley & Sons.

Heckman, J. J. (1979), "Sample Selection Bias as a Specification Error," Econometrica, 47, 153–162.

Hox, J. J. (1994), "Hierarchical Regression Models for Interviewer and Respondent Effects," Sociological Methods & Research, 22, 300–318.

Kreuter, F., Eckman, S., Maaz, K., and Watermann, R. (2010), "Children's Reports of Parents' Education Level: Does It Matter Whom You Ask and What You Ask About?" Survey Research Methods, 4, 127–138.

Liaw, A., and Wiener, M. (2002), "Classification and Regression by randomForest," R News, 2, 18–22.

Little, R. J., and Rubin, D. B. (2002), Statistical Analysis with Missing Data, New York: John Wiley & Sons.

McGovern, M. E., Bärnighausen, T., Salomon, J. A., and Canning, D. (2015), "Using Interviewer Random Effects to Remove Selection Bias from HIV Prevalence Estimates," BMC Medical Research Methodology, 15, 8.
R Core Team (2017), R: A Language and Environment for Statistical Computing, Vienna, Austria: R Foundation for Statistical Computing.

Schaeffer, N. C., Dykema, J., and Maynard, D. W. (2010), "Interviewers and Interviewing," in Handbook of Survey Research (2nd ed.), pp. 437–470.

Schnell, R., and Kreuter, F. (2005), "Separating Interviewer and Sampling-Point Effects," Journal of Official Statistics, 21, 389.

West, B. T., and Blom, A. G. (2017), "Explaining Interviewer Effects: A Research Synthesis," Journal of Survey Statistics and Methodology, 5, 175–211.

West, B. T., Conrad, F. G., Kreuter, F., and Mittereder, F. (2017), "Nonresponse and Measurement Error Variance among Interviewers in Standardized and Conversational Interviewing," Journal of Survey Statistics and Methodology, in press.

West, B. T., Conrad, F. G., Kreuter, F., and Mittereder, F. (2018), "Can Conversational Interviewing Improve Survey Response Quality without Increasing Interviewer Effects?" Journal of the Royal Statistical Society: Series A (Statistics in Society), 181, 181–203.

West, B. T., Kreuter, F., and Jaenichen, U. (2013), "Interviewer Effects in Face-to-Face Surveys: A Function of Sampling, Measurement Error, or Nonresponse?" Journal of Official Statistics, 29, 277–297.

West, B. T., and Olson, K. (2010), "How Much of Interviewer Variance Is Really Nonresponse Error Variance?" Public Opinion Quarterly, 74, 1004–1026.

Wiggins, R. D., Longford, N., and O'Muircheartaigh, C. A. (1992), "A Variance Components Approach to Interviewer Effects," in Survey and Statistical Computing, eds. Westlake, A., Banks, R., C. P., and Orchard, T., pp. 243–254, Amsterdam: North-Holland.

Yuan, Y., and Little, R. J. (2009), "Mixed-Effect Hybrid Models for Longitudinal Data with Nonignorable Dropout," Biometrics, 65, 478–486.

© The Author(s) 2018. Published by Oxford University Press on behalf of the American Association for Public Opinion Research. All rights reserved.

The Impact of Interviewer Effects on Regression Coefficients

Loading next page...
 
/lp/ou_press/the-impact-of-interviewer-effects-on-regression-coefficients-0iBCTX4bm4
Publisher
Oxford University Press
Copyright
© The Author(s) 2018. Published by Oxford University Press on behalf of the American Association for Public Opinion Research. All rights reserved. For permissions, please email: journals.permissions@oup.com
ISSN
2325-0984
eISSN
2325-0992
D.O.I.
10.1093/jssam/smy007
Publisher site
See Article on Publisher Site

Abstract

Abstract This article examines the influence of interviewers on the estimation of regression coefficients from survey data. First, we present theoretical considerations with a focus on measurement errors and nonresponse errors due to interviewers. Then, we show via simulation which of several nonresponse and measurement error scenarios has the biggest impact on the estimate of a slope parameter from a simple linear regression model. When response propensity depends on the dependent variable in a linear regression model, bias in the estimated slope parameter is introduced. We find no evidence that interviewer effects on the response propensity have a large impact on the estimated regression parameters. We do find, however, that interviewer effects on the predictor variable of interest explain a large portion of the bias in the estimated regression parameter. Simulation studies suggest that standard measurement error adjustments using the reliability ratio (i.e., the ratio of the measurement-error-free variance to the observed variance with measurement error) can correct most of the bias introduced by these interviewer effects in a variety of complex settings, suggesting that more routine adjustment for such effects should be considered in regression analysis using survey data. 1. INTRODUCTION Regression analysis is an omnipresent statistical method for examining the relationships of two or more variables in a given population for many areas of quantitative research. Regression analysis conditions on predictor variables; if these variables are measured with error, the estimated regression coefficients will be biased. Similarly, bias can arise from nonresponse, depending on the nature of the mechanism. If a finite population is the target of inference, failure of the sampling frame to cover the population can induce bias, as well. This study focuses on measurement and nonresponse error in interviewer-administered surveys. In interviewer-administered surveys, interviewers can contribute to both nonresponse and measurement error: each interviewer might introduce a specific amount of error on each measured variable, leading to data being more similar when it is collected by the same interviewer. In this study, we investigate the impact of interviewers and their interviewer effects on estimated coefficients in a simple linear regression model. A well-known source of systematic error (bias) in estimated regression coefficients is the presence of measurement errors in the values of independent variables. The case of fitting regression models to survey data collected by human interviewers is a more complex situation, because both dependent and independent variables can contain two different types of errors. The first is nonresponse error, where the units of analysis that respond to particular questions have different characteristics on the variables of interest compared with the nonresponding units, and the second is measurement error, where the measurements of the responding units contain errors and differ thus from their true values. The existing literature in this area has targeted the influence of these two errors due to interviewers on the estimation of descriptive parameters (means, proportions, etc.) for single survey variables. Scientific articles examining the effects of interviewers on the relationships of two or more variables are rare. Davis and Scott (1995) concentrated on the joint effects of interviewers and sampling clusters in survey data on estimated domain comparisons. 
The authors found evidence that interviewer effects on survey variables can vary between domains and therefore the variance of the estimated comparisons of a specific variable between these domains (e.g., the differences in means for a given variable between women and men) can be enlarged due to interviewers. In addition to these empirical results, they also presented theoretical work about joint cluster effects and interviewer effects on the precision of domain comparisons. A second paper (Beullens and Loosveldt 2014) is concerned with interviewer effects on covariances of survey variables and their impact on factor analysis. Essentially, these authors found that there are interviewer effects on covariances which slightly lower the magnitude of the factor loadings and possibly enlarge their variances. However, the authors used real data from the European Social Survey, (i.e., only data from respondents), and did not consider possible nonresponse error introduced by interviewers nor the absence of an interpenetrated design. The first limitation might be problematic because the interviewer effects literature presents several examples of interviewers varying in terms of their response distributions and response rates (West and Blom 2017), motivating additional study of what is causing these phenomena. Recent work (West and Olson 2010; West, Kreuter, and Jaenichen 2013) has also suggested that interviewer variance may be arising from different interviewers recruiting different types of respondents. The second limitation can lead to confounding of interviewer effects with sampling area effects. In a study with an interpenetrated design, Schnell and Kreuter (2005) found that the effects of interviewers on the variances of descriptive survey estimates are stronger than geographical cluster effects. To take the absence of an interpenetrated study design into account, (i.e., to separate interviewer effects from area effects), two papers (Wiggins, Longford, O’Muircheartaigh 1992; Hox 1994) used hierarchical regression models to control for interviewer effects on both intercepts and slopes by assuming that interviewers are drawn from a population of potential interviewers and compared several strategies modeling these as random effects. Wiggins et al. (1992) found that inferences related to regression coefficients can change when interviewer effects are included in regression models by analyzing two case studies. In Hox (1994), the modelling approach includes random intercepts and slopes for interviewers, in addition to interviewer-specific and subject-specific covariates resulting in a multlevel mixed effects model. Both articles consider the lack of an interpenetrated design by including possible area-specific confounding variables in their models; however, they ignore possible nonresponse error caused by interviewers, similar to the work of Davis and Scott (1995) and Beullens and Loosveldt (2014). Collectively, these studies show that interviewer effects can influence multivariate analysis, but the possible nonresponse error introduced by interviewers can not be estimated from the data used and is therefore assumed as negligible. Thus, there is a need to investigate the impact of both nonresponse errors and measurement errors caused by interviewers on the estimated relationship of two variables. 
One more recent paper is also concerned with interviewer effects on the estimation of relationships between multiple variables: Beullens and Loosveldt (2016) compare interviewer effects on regression analysis between European countries based on data from the European Social Survey. However, this work focuses mainly on the measurement error side of interviewer effects and does not consider the effects of interviewers on response propensity. In this paper, we examine the impacts of both sources of error due to interviewers by simulating many different real-world scenarios in which the dependent and independent variables in linear regression models contain interviewer-specific measurement errors and nonresponse errors. In the following sections, we give a short introduction to linear regression models and present our theoretical expectations regarding interviewer effects on regression coefficients. The next section then provides a comprehensive description of the simulated scenarios. The relevant results, focusing on the bias of the estimated slope parameter, are presented afterwards. We then apply these results to real data from a national face-to-face survey in Germany, with validation data available on the sampling frame. Finally, we discuss our findings in the context of further research opportunities. We assume that our independent variables are continuous and approximately normally distributed; we discuss relaxing this assumption in the discussion section.

2. THEORETICAL RESULTS

2.1 Linear Model

In this paper, we focus on a simple linear regression model defining the relationship between the continuous variables y and x:

$y_j = \beta_0 + \beta_1 x_j + \varepsilon_j, \qquad j = 1, \ldots, n.$  (1)

We make the standard linear regression assumptions: for the residuals εj we assume E(εj | xj) = 0, and all εj are stochastically independent and normally distributed with a constant variance σ² for all j. The intercept parameter β0 and the slope parameter β1 remain constant for all given pairs of values (xj, yj), and xj is observed without error. A model-unbiased estimator for β1 is given by

$\hat{\beta}_1 = \frac{\frac{1}{n}\sum_{j=1}^{n}(x_j - \bar{x})(y_j - \bar{y})}{\frac{1}{n}\sum_{j=1}^{n}(x_j - \bar{x})^2} = \frac{\hat{\sigma}_{x,y}}{\hat{\sigma}_x^2}.$  (2)

This and further theory about linear regression models can be found in Fahrmeir, Kneib, Lang, and Marx (2013), chapters one through four.

2.2 Measurement Error Model

Despite the best efforts of survey organizations to standardize the training of interviewers, it has long been known that interviewers can differentially affect the responses to single survey variables (Fowler and Mangione 1990). This may be due to verbal or nonverbal signals sent by interviewers or to personal characteristics of the interviewer that suggest interviewer preferences or expectations (West and Blom 2017). In a statistical framework, these interviewer effects are often viewed as a form of measurement error, so that, for a variable with an underlying true value of xij for the jth respondent interviewed by the ith interviewer, the actual reported value is x*ij, where

$x_{ij}^{*} = x_{ij} + u_{xi}, \quad \text{with } x_{ij} \sim N(\mu_x, \sigma_x^2),\; u_{xi} \sim N(0, \tau_x^2),\; j = 1, \ldots, n_i,\; i = 1, \ldots, n_{int}.$  (3)

Here uxi is a constant interviewer effect across all individuals interviewed by interviewer i or, equivalently and perhaps more realistically, an average effect across the respondents associated with this interviewer. (For ease of exposition, we typically assume that nj ⊥ ni across all i = 1, …, nint interviewers, although that is not required.)
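To make the clustering implied by (3) concrete, here is a minimal R sketch (R being the software used for the simulations reported later) that generates error-prone values x* sharing a common interviewer effect; the sample sizes and variance are illustrative choices rather than prescriptions from this section.

```r
# Minimal sketch of the measurement error model in (3):
# true values x_ij plus a constant interviewer effect u_xi.
set.seed(1)

n_int  <- 50      # number of interviewers (illustrative)
n_obs  <- 50      # respondents per interviewer (illustrative)
tau2_x <- 0.1     # variance of interviewer effects on x (illustrative)

interviewer <- rep(seq_len(n_int), each = n_obs)
x_true <- rnorm(n_int * n_obs, mean = 0, sd = 1)        # x_ij ~ N(0, 1)
u_x    <- rnorm(n_int, mean = 0, sd = sqrt(tau2_x))     # u_xi ~ N(0, tau_x^2)
x_star <- x_true + u_x[interviewer]                     # observed values x*_ij

# Observations collected by the same interviewer are now positively correlated;
# the induced intra-interviewer correlation is tau_x^2 / (sigma_x^2 + tau_x^2).
```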
The interviewer effect uxi is considered to be a random variable in a formal superpopulation sense, in that an infinite number of possible interviewers could have been drawn to be associated with the population and eventually the observed sample. Or, somewhat more informally, we assume that repeated sampling of the population does not condition on the set of interviewers that were used in the observed sample, but rather assumes they were drawn from a very large pool of potential interviewers relative to those in the sample. Under (3), the intra-interviewer correlation ρint associated with x is given by

$\rho_{int,x} = \frac{\tau_x^2}{\sigma_x^2 + \tau_x^2} = \frac{\tau_x^2}{\sigma_{x^*}^2}.$  (4)

When x is used as a predictor in a linear regression model, the presence of interviewer effects can be described using a classical additive regression measurement error model (Carroll, Ruppert, Stefanski, and Crainiceanu 2006, p. 22). We assume our target parameter of interest is the slope β1 relating y to the true value of x: yij = β0 + β1 xij + εij, where εij ∼ N(0, σy²). The model fitted using the observed data estimates yij = β0 + β1* x*ij + εij, where x*ij is given by (3). Assuming that uxi and εij are independent, the relationship between β1 and β1* is given by

$\beta_1^{*} = \lambda \beta_1, \quad \text{with } \lambda = \frac{\sigma_x^2}{\sigma_{x^*}^2}.$  (5)

In (5), λ is called the reliability ratio. In our case, λ can be decomposed if the components xij and uxi in (3) are assumed to be independent (Carroll et al. 2006):

$\lambda = \frac{\sigma_x^2}{\sigma_{x^*}^2} = \frac{\sigma_{x^*}^2 - \tau_x^2}{\sigma_{x^*}^2} = 1 - \rho_{int,x}.$  (6)

We therefore expect that the interviewer effect on x will shrink the estimated regression slope towards zero by ρint,x · 100%. Further, we suppose that a correction of the shrinkage introduced by interviewers is possible with the inverse reliability ratio, that is, $\lambda_{int}^{-1}\beta_1^{*}$ with $\lambda_{int}^{-1} = \sigma_{x^*}^2 / (\sigma_{x^*}^2 - \tau_x^2)$. We are able to estimate the components of λint from the observed data: the empirical variance of x* is an estimator for the overall variance of x*. An estimator for the interviewer variance τx² in x* can be calculated by fitting the following mixed-effects regression model:

$x_{ij}^{*} = \mu + \gamma_i + \varepsilon_{x,ij},$  (7)

where μ represents an overall fixed intercept term, γi the random intercept of the interviewers, and εx,ij a random error term. This model assumes independence of the interviewer effects and the errors. We also assume that any additional variability in case-specific measurement errors is negligible relative to the variance of the random interviewer effects (i.e., there is no case-specific error term in [3]). If one were able to estimate the variance of these case-specific errors, this term would also need to be subtracted from the denominator of the corrective factor introduced previously. Corrections based on interviewer effects alone may therefore not recover all of the bias in the parameter estimates if the variability in the case-specific errors is substantial. The variance of the γi terms, estimated by REML estimation, is then an unbiased estimator for τx² (Fahrmeir et al. 2013, pp. 373–374). Further, the interviewer effect on x also causes an increase in the standard error of the estimated slope, because the residual variance in the linear model is

$Var(y \mid x^*) = Var(y \mid x) + \beta_1^2 \tau_x^2 \frac{\sigma_x^2}{\sigma_{x^*}^2} + \beta_1^2 \sigma_{x\varepsilon}^2 \frac{\sigma_x^2}{\sigma_{x^*}^2} = Var(y \mid x) + \beta_1^2 \sigma_x^2 \rho_{int,x} + \beta_1^2 \sigma_{x\varepsilon}^2 \frac{\sigma_x^2}{\sigma_{x^*}^2}$

(Carroll et al. 2006, p. 22), with σxε² as the variance of εx,ij in (7). However, this increase also depends on the values of β1 and σx². We now consider the interviewer effect on the measurement of the dependent variable y. Similar to (3), the measurement error model for this variable is given by

$y_{ij}^{*} = y_{ij} + u_{yi} + \varepsilon_{y,ij}, \quad \text{with } u_{yi} \sim N(0, \tau_y^2),\; \varepsilon_{y,ij} \sim N(0, \sigma_{y\varepsilon}^2),$  (8)

with the assumption of independent components.
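In practice, the quantities in this correction can be estimated from respondent data alone. The following R sketch shows one way to do this, using the lme4 package for the REML fit of model (7); lme4 and the data frame layout are our assumptions, since the article does not prescribe a particular implementation.

```r
# Sketch: estimate the inverse reliability ratio from respondent data
# and adjust the naive slope estimate, following (5)-(7).
# Assumes a data frame `d` with columns y_star, x_star, interviewer.
library(lme4)

# Naive slope from the error-prone predictor
beta1_naive <- coef(lm(y_star ~ x_star, data = d))["x_star"]

# Random-intercept model (7) for x*: x*_ij = mu + gamma_i + e_ij, fitted by REML
fit_x      <- lmer(x_star ~ 1 + (1 | interviewer), data = d, REML = TRUE)
tau2_x_hat <- as.numeric(VarCorr(fit_x)$interviewer[1, 1])  # interviewer variance
var_x_star <- var(d$x_star)                                  # overall variance of x*

# Inverse reliability ratio and adjusted slope
lambda_inv <- var_x_star / (var_x_star - tau2_x_hat)
beta1_adj  <- beta1_naive * lambda_inv
```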
In contrast to the interviewer effect on the independent variable, adding interviewer effects to the dependent variable in a linear regression model does not cause bias in the slope: by (8),

$E(y_{ij}^{*} \mid x_{ij}^{*}) = E(y_{ij} + u_{yi} + \varepsilon_{y,ij} \mid x_{ij}^{*}) = E(y_{ij} \mid x_{ij}^{*}) + E(u_{yi}) + E(\varepsilon_{y,ij}) = E(y_{ij} \mid x_{ij}^{*}).$

But the variance of the residuals will increase by the variance of the interviewer effects τy² and the variance of the additional error term εy,ij:

$Var(y_{ij}^{*} \mid x_{ij}^{*}) = Var(y_{ij} + u_{yi} + \varepsilon_{y,ij} \mid x_{ij}^{*}) = Var(y_{ij} \mid x_{ij}^{*}) + Var(u_{yi}) + Var(\varepsilon_{y,ij}) = Var(y_{ij} \mid x_{ij}^{*}) + \tau_y^2 + \sigma_{y\varepsilon}^2.$

This leads to a larger standard error of the estimated slope and hence to reduced power of the usual single-parameter tests in the linear model with H0: β1 = 0. All expectations so far assume no correlation of the interviewer effects on x and y. Adding correlated error terms leads to a more complex situation. Following Chai (1971), the generalized reliability ratio with possibly correlated measurement errors can be approximated for large nobs by

$\lambda_{gen} = \frac{1 + \sigma_{u_{xi},u_{yi}} / \sigma_{x,y}}{1 + \tau_x^2 / \sigma_x^2},$  (9)

with σuxi,uyi and σx,y describing the covariance of the random interviewer effects on x and y and the covariance of the dependent and independent variables, respectively. With the given model specification, the covariance σx,y can be expressed as β1σx². Depending on the values of σuxi,uyi and σx,y, the introduced bias relative to zero can either be reduced or increased (Biemer and Trewin 1997). Thus, we expect a change in the magnitude of the bias of β1 in the presence of correlated interviewer effects on the dependent and independent variables.

2.3 Nonresponse Model

To model the probability of response pij for a subject j recruited by an interviewer i, we use the following logistic regression model:

$p_{ij} = \frac{\exp(\delta_0 + \delta_1 y_{ij} + \delta_2 x_{ij} + u_{pi} + u_{piY} y_{ij} + u_{piX} x_{ij})}{1 + \exp(\delta_0 + \delta_1 y_{ij} + \delta_2 x_{ij} + u_{pi} + u_{piY} y_{ij} + u_{piX} x_{ij})},$  (10)

with the random interviewer effects upi ∼ N(0, σRP²), upiY ∼ N(0, σRPY²), and upiX ∼ N(0, σRPX²). The parameter δ0 describes the intercept term, which defines the overall response rate; δ1 and δ2 represent the relationships of the true values of y and x with the response probability; and upi is the random interviewer effect associated with response propensity, which is assumed to be normally distributed with zero mean and variance σRP². The random effects upiY and upiX represent random interviewer slopes for the true values of y and x on the response propensity. For δ1 = δ2 = 0, a "missing completely at random" mechanism holds for the sampled cases of one interviewer (Little and Rubin 2002, pp. 11–19). Because there is no systematic tendency in the response rate as a function of y or x in this setting, we expect no bias in the estimation of β1 due to nonresponse and only an increase in the estimate's standard errors as a result of fewer observations. When δ1 = 0 and δ2 ≠ 0, we have a "missing at random" mechanism; assuming the linear model is correctly specified, the conditional distribution of y | x will be unchanged, and thus β1 will be estimated without bias. Whenever δ1 ≠ 0, individuals with higher values on y are either more likely or less likely to respond, leading to a "not missing at random" mechanism. In this situation, the estimated variance of y from the respondents only, Var̂(yRP), is less than Var̂(y), the estimated variance of y from the whole sample.
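A small R sketch of the response model in (10) illustrates this variance compression among respondents when δ1 ≠ 0; the coefficient values and random-effect standard deviations below are illustrative choices, not the article's simulation settings.

```r
# Sketch of the nonresponse model (10): response propensities depend on the
# true y (and x) and on interviewer-specific random intercepts and slopes.
set.seed(2)

n_int <- 50; n_obs <- 50
interviewer <- rep(seq_len(n_int), each = n_obs)

x <- rnorm(n_int * n_obs)
y <- 0.5 + 1 * x + rnorm(n_int * n_obs, sd = sqrt(2))

delta0 <- 0; delta1 <- 0.6; delta2 <- 0           # illustrative coefficients
u_p  <- rnorm(n_int, sd = 0.5)                    # random interviewer intercepts
u_pY <- rnorm(n_int, sd = 0.1)                    # random interviewer slopes for y

eta <- delta0 + delta1 * y + delta2 * x + u_p[interviewer] + u_pY[interviewer] * y
p   <- plogis(eta)                                # logistic response propensity
RP  <- rbinom(length(p), size = 1, prob = p)      # response indicator

# Selecting on y compresses its variance among the respondents:
var(y[RP == 1]) < var(y)                          # typically TRUE when delta1 != 0
```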
Goldberger (1981) and Groves (2004, p. 158) describe the impact of the systematic underestimation of this variance on the estimated slope parameter in a simple linear regression model:

$\hat{\beta}_1^{RP} = \beta_1 \, \frac{\widehat{Var}(y_{RP}) / \widehat{Var}(y)}{1 - \rho^2 \left(1 - \widehat{Var}(y_{RP}) / \widehat{Var}(y)\right)}.$  (11)

In (11), ρ² is the coefficient of determination, or the proportion of variance in y explained by x. Because the influence of y on pij in (10) is a main effect, the variance ratio Var̂(yRP)/Var̂(y) ≤ 1 and therefore Var̂(yRP)/Var̂(y) ≤ 1 − ρ²(1 − Var̂(yRP)/Var̂(y)). For that reason, we expect a shrinkage of the estimated parameter β̂1RP towards zero for δ1 ≠ 0. This bias also depends on the magnitude of Var(y | x) = Var(ε), the residual variance; a smaller residual variance leads to less bias, and in the situation of a perfect fit, there is no bias at all. Interviewer-specific random effects introduce additional variance in the response probabilities. If they are random and independent of the other parameters, we expect no influence on the bias of β̂1. We consider the implications of a correlation between interviewer effects on the measurement of y and interviewer effects on response propensity in the next section.

2.4 Correlation of Nonresponse and Measurement Error

In the case of a relationship between the measurement error and the nonresponse introduced by interviewers (i.e., correlated upi and uyi), we expect no additional bias in the slope but rather in the estimated intercept parameter. If Cor(uyi, upi) > 0, then cases whose interviewers have high values of uyi also tend to have a higher response propensity because of those interviewers' higher values of upi, and therefore the estimated intercept increases. The size of this effect depends on the sizes of the involved variances τy² and σRP² and on Cor(uyi, upi), with larger variances and higher correlation leading to more bias. Nevertheless, the adjustment λint⁻¹ from section 2.2 can be estimated using data from respondents only and used to account for the bias in the slope parameter introduced by measurement error. How much of the overall bias can be reduced by this approach is investigated in the following simulation study.

3. SIMULATION DESIGN

Table 1 shows the varying parameters for all of the following simulation scenarios. Data generated for the x and y variables come from a super-population with the relationship yj = β0 + β1 xj + εj, where j denotes the elements of the super-population. We assume xj ∼ N(0, 1), εj ∼ N(0, 2), xj independent of εj, β0 = 0.5, and β1 = 1. Thus, yj | xj ∼ N(0.5 + xj, 2). Additionally, given a sample of nint = 50 interviewers from a population of interviewers, nobs = 50 describes the number of interviews attempted per interviewer. Thus, the total number of interviews attempted is n = nint · nobs = 2,500.
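A compact R sketch of this data-generating step, using the stated values β0 = 0.5, β1 = 1, nint = 50, and nobs = 50, is given below; it mirrors the description above rather than reproducing the authors' supplementary code.

```r
# Sketch of the super-population / sample generation described above.
set.seed(3)

n_int <- 50                      # sampled interviewers
n_obs <- 50                      # attempted interviews per interviewer
n     <- n_int * n_obs           # 2,500 attempted interviews

beta0 <- 0.5
beta1 <- 1

interviewer <- rep(seq_len(n_int), each = n_obs)
x <- rnorm(n, mean = 0, sd = 1)              # x_j ~ N(0, 1)
e <- rnorm(n, mean = 0, sd = sqrt(2))        # eps_j ~ N(0, 2)
y <- beta0 + beta1 * x + e                   # y_j | x_j ~ N(0.5 + x_j, 2)

dat <- data.frame(interviewer, x, y)
```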
Table 1. Overview of the Values for All Varying Simulation Parameters

Parameter | Description | Values

Parameters from the nonresponse error model described in section 2.3:
δ1 | Relationship of y values with response propensity | 0, 0.6
δ2 | Relationship of x values with response propensity | 0, 0.6
σRP² | Variance of interviewer effects on response propensity | 0.01, 0.25
σRPX² | Variance of random interviewer coefficient for x in predicting response propensity | 0.01, 0.25
σRPY² | Variance of random interviewer coefficient for y in predicting response propensity | 0.01, 0.25

Parameters from the measurement error model described in section 2.2:
τx² | Variance of interviewer effects on x | 0.01, 0.1
τy² | Variance of interviewer effects on y | 0.01, 0.1
Cor(uxi, uyi) | Correlation of the interviewer effects on x and y | −0.25, 0, 0.25

Parameter for the relationship between the nonresponse and measurement error models as described in section 2.4:
Cor(uyi, upi) | Correlation of the interviewer effects on y and on the response propensity | −0.5, 0, 0.5
The number of interviews completed will shrink with the simulated unit nonresponse. Starting from a base response rate of 50 percent (δ0 = 0 in [10]), the response propensity may vary by subject when missingness depends on y (not missing at random) and/or there is interviewer heterogeneity in nonresponse. We consider two values of δ1 in (10), corresponding to missing completely at random (δ1 = 0) and to a stronger association with the outcome (δ1 = 0.6). For σRP², we consider two levels corresponding to trivial and substantial interviewer effects on response propensity (0.01 and 0.25), based on West and Olson (2010) and West et al. (2013). In addition, the observed values of xj and yj are subject to a specific measurement error for the corresponding interviewer, as introduced in (3) and (8), resulting in clustering effects according to interviewers. For both variables, x and y, we add either a trivial (τx² = 0.01, τy² = 0.01) or a nontrivial (τx² = 0.1, τy² = 0.1) amount of interviewer variance. The chosen parameter values lead to the following intraclass correlations (ICCs) for interviewer effects on x and y (West and Olson 2010; West et al. 2013):

$\rho_{int,x,high} = \frac{\tau_x^2}{\sigma_{x_{ij}}^2 + \tau_x^2} = \frac{0.1}{1 + 0.1} \approx 0.091, \qquad \rho_{int,x,low} = \frac{0.01}{1 + 0.01} \approx 0.01,$
$\rho_{int,y,high} = \frac{\tau_y^2}{\sigma_{y_{ij}}^2 + \tau_y^2} = \frac{0.1}{2 + 0.1} \approx 0.05, \qquad \rho_{int,y,low} = \frac{0.01}{2 + 0.01} \approx 0.005.$

Further, we consider three possible values, negative (−0.25), positive (0.25), and zero, for the correlation of the interviewer effects on x and y, Cor(uxi, uyi).
Finally, the influence of a possible correlation between the interviewer effects on y and on the response propensity, Cor(uyi, upi), as described in section 2.4, is examined for three possible values (−0.5, 0, and 0.5). Because of a lack of empirical results for these correlations, we choose rather extreme values in order to examine their potential influence. This set of parameter values results in 1,152 possible scenarios.

For every scenario, we replicated the following procedure nrep = 100 times. First, values for x and y with the described structure were drawn at random, and linear models were fitted to the generated data (assuming no nonresponse and no measurement error). Next, based on the models from section 2, x* (3), y* (8), and pij (10) were generated for all observations. The response indicator RPij was obtained by comparing pij with a random variable ψij drawn from a uniform distribution on [0, 1], using the following decision rule:

$RP_{ij} = \begin{cases} 1 & \text{for } \psi_{ij} \le p_{ij} \\ 0 & \text{for } \psi_{ij} > p_{ij} \end{cases} \quad \text{with } \psi_{ij} \sim U(0,1) \text{ for all } i, j.$

In the second step, we fitted linear models to the true values of the responding subjects and to the values containing errors. We also estimated the reliability ratio from (5), using data from respondents measured with errors (i.e., the data situation of a real survey), and calculated an adjusted slope parameter as described in section 2.2 (a condensed sketch of one such replicate is shown below). The simulation study was performed in the software package "R" (R Core Team 2017). The annotated R code for performing the simulation study is available in the supplementary materials.
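The following R sketch condenses one replicate of this procedure (data generation, interviewer-specific errors, nonresponse, and the reliability-ratio adjustment) for a single illustrative parameter combination. It is a simplified stand-in for the authors' annotated supplementary code and assumes the lme4 package for the variance-component step.

```r
# Condensed sketch of one simulation replicate (illustrative only).
library(lme4)
set.seed(4)

n_int <- 50; n_obs <- 50; n <- n_int * n_obs
beta0 <- 0.5; beta1 <- 1
delta0 <- 0; delta1 <- 0.6; delta2 <- 0
tau2_x <- 0.1; tau2_y <- 0.1; sigma2_RP <- 0.25

interviewer <- rep(seq_len(n_int), each = n_obs)

# (1) Super-population values
x <- rnorm(n)
y <- beta0 + beta1 * x + rnorm(n, sd = sqrt(2))

# (2) Interviewer-specific measurement errors, models (3) and (8)
u_x <- rnorm(n_int, sd = sqrt(tau2_x))
u_y <- rnorm(n_int, sd = sqrt(tau2_y))
x_star <- x + u_x[interviewer]
y_star <- y + u_y[interviewer]

# (3) Response propensities and response indicator, model (10)
u_p <- rnorm(n_int, sd = sqrt(sigma2_RP))
p   <- plogis(delta0 + delta1 * y + delta2 * x + u_p[interviewer])
RP  <- as.integer(runif(n) <= p)                 # psi_ij ~ U(0, 1)

resp <- data.frame(interviewer, x, y, x_star, y_star)[RP == 1, ]

# (4) Slope estimates: true respondent values vs. error-prone respondent values
b_true_resp <- coef(lm(y ~ x, data = resp))["x"]
b_err_resp  <- coef(lm(y_star ~ x_star, data = resp))["x_star"]

# (5) Reliability-ratio adjustment estimated from respondent data only
fit_x    <- lmer(x_star ~ 1 + (1 | interviewer), data = resp, REML = TRUE)
tau2_hat <- as.numeric(VarCorr(fit_x)$interviewer[1, 1])
b_adj    <- b_err_resp * var(resp$x_star) / (var(resp$x_star) - tau2_hat)
```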
Selected results of the simulation are presented in the following section.

4. SIMULATION ANALYSIS

We focus on the results for the bias of β̂1 in this section because it dominates the root mean squared error (RMSE) in all investigated scenarios. The estimator of the bias for one parameter combination was calculated as the mean of the differences from the true value β1 over all replications:

$\widehat{Bias}(\hat{\beta}_1) = \frac{1}{n_{rep}} \sum_{k=1}^{n_{rep}} (\hat{\beta}_{1k} - \beta_1).$

To obtain an overview of the influence of the various simulation parameters on the estimation of β1, given the large number of scenarios, we use a variable importance measure generated from an application of the random forest method. A short introduction to the random forest method and to the variable importance measure we use is given in the following section.

4.1 Introduction to the Random Forest Method

To understand the random forest method, we first need to take a look at tree-based methods. They divide the covariate space (also known as the feature space) into subspaces and fit a simple model, such as the mean of the dependent variable of the observations in the subspace (for a continuous dependent variable), or assign one category (for categorical dependent variables) to every produced subspace. For a regression problem (i.e., a continuous dependent variable), the algorithm starts with one split value of one covariate (zm) that divides the space into two subspaces R1 and R2. In our application, the dependent variable w contains all estimated slope parameters β̂1 from all simulated scenarios and all one hundred replicates, and the covariates consist of all varying parameters in the simulation. Thus, the resulting R1 and R2 are two sets of parameter values from the simulation that differ only by the values of the selected split variable (simulation parameter) zm. The optimal split variable v and the optimal split point s minimize the following criterion:

$\min_{v,s} \left[ \sum_{l=1}^{2} \min_{c_l} \left( \sum_{z_m \in R_l(v,s)} (w_m - c_l)^2 \right) \right],$  (12)

with l denoting the index of the subspaces. The inner terms are minimal for

$\hat{c}_l = \frac{1}{\#\{w_m \mid z_m \in R_l(v,s)\}} \sum_{w_m \mid z_m \in R_l(v,s)} w_m,$

which is the mean of the w values in the produced subspace (here, the mean of the β̂1 values within the subspace). The values of the inner sums can be computed for all v and s to find the optimum and thereby the border that separates the subspaces R1 and R2. This procedure is repeated for all resulting subspaces until a stopping rule is fulfilled. The only change for a classification problem is that a different split criterion is necessary. Despite their flexibility and interpretability, trees can be very unstable; that is, a small change in the given data can produce a totally different tree (Friedman, Hastie, and Tibshirani 2009, p. 312). The random forest is an ensemble method in which many trees are fitted to bootstrap samples of the given data set and aggregated to produce a prediction rule. This aggregation, or the average of all tree results in the case of a continuous variable, produces a prediction function with less variance than that of a single tree; that is, the method solves the instability problem of a single tree. The variance reduction is especially efficient if the aggregated trees differ from each other. This further de-correlation of the trees can be obtained by using only a random sample of the covariates for every split and by growing the trees as large as possible. This idea was developed by Breiman (2001). Unfortunately, a random forest is harder to interpret than a single tree. To recover some interpretability, a variable importance measure can be constructed from the observations that are not used to fit a given tree (i.e., the observations from the data that are not included in the bootstrap sample). These observations are called the out-of-bag (OOB) sample. To compute a measure of variable importance, the prediction accuracy of one tree for its OOB observations is calculated; then the values of a variable q in the OOB sample are randomly permuted and the prediction accuracy is measured again. The mean loss of accuracy due to random permutation over all trees is then a measure of the variable importance of variable q (Friedman et al. 2009, pp. 593–594). For a regression tree, the prediction accuracy implemented in the R package "randomForest" is the mean squared error (MSE) (Liaw and Wiener 2002). Theory about tree-based methods and random forests (including an alternative variable importance measure) can be found in Friedman et al. (2009), chapters nine, ten, and fifteen.

4.2 Simulation Results

To analyze the results of our simulations, we estimate a random forest (1,000 trees) with β̂1 as a continuous dependent variable, all varying parameters (δ1, δ2, σRP², σRPY², σRPX², τx², τy², Cor(uxi, uyi), Cor(uyi, upi)) as covariates, and all nrep · 1,152 = 115,200 simulation runs as the underlying data set. A measure of the parameter importance can be produced as described in section 4.1: permuting an important variable leads to a large increase in the MSE of the random forest. Figure 1 shows the calculated variable importance for all nine varying input parameters in decreasing order, with β̂1 as the dependent variable. The x-axis shows the mean percentage increase of the MSE when the values of the parameter on the y-axis are permuted.
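The permutation importance shown in figure 1 can be computed with the randomForest package cited above. The sketch below assumes a data frame sim_results with one row per simulation run, the estimated slope stored in beta1_hat, and the nine varying parameters as the remaining columns; these object and column names are our own illustrative choices, not names from the article or its supplementary code.

```r
# Sketch: permutation-based (OOB) variable importance for the estimated slope,
# as used for Figure 1. Object and column names are illustrative assumptions.
library(randomForest)

rf <- randomForest(
  beta1_hat ~ .,          # remaining columns: the nine varying parameters
  data       = sim_results,
  ntree      = 1000,      # 1,000 trees, as in the article
  importance = TRUE       # compute permutation importance on the OOB samples
)

# %IncMSE: mean percentage increase in OOB MSE when each parameter is permuted
importance(rf, type = 1)
varImpPlot(rf, type = 1)
```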
The parameter resulting in the highest MSE increase, and thereby having the highest estimated importance according to this criterion, is δ1, the relationship between y and the response propensity, followed by τx², the variance of the random interviewer effects on the variable x. The third and fourth most important parameters are δ2, the relationship between x and the response propensity, and Cor(uxi, uyi), the correlation of the interviewer-specific random effects. The other parameters play minor roles. The high impact of τx² on the estimation of the slope parameter is not surprising and can be explained by the relationship expressed in (5) and (6). The same applies to the high impact of δ1, consistent with our expectations in section 2.3, and to the effect of Cor(uxi, uyi), which is anticipated because of (9). The magnitude of the effect of τy² can also be explained by (9): the generalized reliability ratio shows that the MSE increase from permuting τy² is based on its relationship with the highly relevant parameters τx² and Cor(uxi, uyi); for Cor(uxi, uyi) = 0, the effect disappears. That σRP² seems to be less important is consistent with our expectation from section 2.3. The minor importance of Cor(uyi, upi) results from the lower importance of τy² and σRP² and their relationship with Cor(uyi, upi), as described in section 2.4. In summary, by analyzing the simulated data with the random forest procedure, it is possible to determine which of the varying simulation parameters has the highest influence on the estimated slope parameter β̂1 in a linear regression model. From this overview, it can be derived that the biggest influence of interviewers on β̂1 is introduced by measurement error in the independent variable x.

Figure 1. Parameter Importance for β̂1 Estimated by a Random Forest.

To overcome the limitations of analyzing the variable importance in this broad sense, and to gain deeper insight into possible interactions of simulation parameters and their effects on the bias of the slope parameter, figure 2 shows a detailed matrix of the estimated biases for σRPY² = 0.25 and σRPX² = 0.25 with all combinations of the parameter values of δ1, δ2, σRP², τx², τy², Cor(uxi, uyi), and Cor(uyi, upi). Plotted results for the other parameter combinations of σRPY² and σRPX² can be found in the Appendix. The parameters of the measurement error model vary in the columns, and the rows differ in terms of parameter values from the response model. The dashed line indicates a bias of zero.

Figure 2. Estimated Bias of Slope Parameter β̂1 for Different Parameter Settings.

Within cells, the empirical bias of β̂1 is plotted for three different situations for all three values of Cor(uyi, upi):

1. Data from all subjects with a response indicator RPij = 1 (respondents) and the true values of x and y (i.e., only a possible bias due to nonresponse is observed);
2. Data from all subjects with a response indicator RPij = 1 (respondents) and the values x* and y* containing interviewer effects (measurement errors);
3. Data from all subjects with a response indicator RPij = 1 (respondents), the values x* and y* containing interviewer effects (measurement errors), and an additional adjustment of the slope parameter by the reliability ratio λ̂, estimated using only the error-prone respondent data.

Varying δ1 has a large impact on the bias of β̂1, independent of the other parameter values. As expected, no bias occurs for the missing completely at random situation with δ1 = 0. If δ1 ≠ 0, a negative bias of approximately −0.12 for δ1 = 0.6 can be observed for the respondents and true values. Whenever δ2 ≠ 0, this bias becomes even stronger. Changes in σRP² have a negligible influence on the bias. These findings are consistent with our expectations in section 2.3. Considering the interviewer effects on the variables as an additional source of bias, a specific pattern is visible: whenever the variance of the random interviewer effects on x, τx², is small, the bias in all three situations is the same (i.e., there is no additional bias due to measurement errors in this case). For τx² = 0.1, the additional measurement error enlarges the bias by about ρint,x,high · 100%, which is 0.09 for δ1 = 0, or nine percent in general. The magnitudes of τy² and Cor(uxi, uyi) introduce minor changes in this relationship. For τy² = 0.01, the additional bias from measurement errors is negligible. For τy² = 0.1, it is slightly larger (−0.02) when Cor(uxi, uyi) = −0.25 and slightly smaller (0.02) when Cor(uxi, uyi) = 0.25. The adjustment by the inverse reliability ratio is a useful method to overcome interviewer effects in x*. However, the reliability ratio ignores the additional changes in bias introduced by τy² and Cor(uxi, uyi) and therefore undercorrects in some cases and overcorrects in others. The generalized reliability ratio introduced in (9) may help to address these minor differences. If the variance of the random interviewer slopes for x or y (σRPX² or σRPY²) is high, correcting with the estimated reliability ratio may lead to a slight positive bias. This effect is generally stronger in scenarios with high σRPX² values. In cases of high variances for both variables, their effects add up and the correction can introduce a positive bias. Varying the parameter Cor(uyi, upi) produces slight changes in the bias in some cases, but even for large values of this correlation, no obvious pattern is apparent. These minor effects match our expectations in section 2.4. We also simulated the same scenarios with β1 = −1 and negative values for δ1 and δ2; the pattern of results across the simulation parameters is basically the same, with a positive bias occurring for β1 = −1 due to the shrinkage towards zero.

5. APPLICATION

We now examine the results presented so far in this paper using data from a real face-to-face survey. An ideal data set for further understanding the simulation findings would have the following characteristics: 1) measurements on two continuous variables from the survey that have an approximately linear relationship; 2) an underlying interpenetrated sample design to avoid confounding of area effects and interviewer effects; and 3) a unique interviewer identification variable available for every interviewed person, so that all observations can be associated with a specific interviewer.
Furthermore, true values for the two variables of interest would need to be available for the entire sample for validation purposes. Given a data set meeting these criteria, we can estimate all of the parameters varied in the simulation studies and assess how these estimates impact our inferences related to the regression coefficient of interest.

We analyze data from a face-to-face survey that was conducted in fifteen large areas in Germany (West, Conrad, Kreuter, and Mittereder 2018). In each area, 480 currently employed adults with a history of at least one unemployment spell were randomly sampled from the Integrated Employment Biographies (IEB) database, which contains official government information on employment histories. The overall sample size was thus n = 7,200. Four professional interviewers were assigned to work in each of the fifteen areas (sixty interviewers total), and each was randomly assigned 120 cases in total (i.e., interpenetrated sample assignment, conditional on the area). While the larger goal of this study was to evaluate the effects of experimentally manipulated interviewing techniques on data quality, we do not consider that experimental manipulation here.

The survey instrument included questions with responses that could be validated using administrative information on the IEB sampling frame. We specifically focus on log-transformed annual income as the dependent variable of interest and the longest uninterrupted period of employment in the last twenty years (in months) as the independent variable of interest, with the underlying theory being that individuals who are able to hold positions for longer periods of time experience greater benefits in terms of annual income. Data collection continued from March to October 2014. Respondents were provided with a 20 Euro token of appreciation for participating. In total, n = 1,850 interviews were completed by the sixty interviewers (AAPOR RR1 = 25.7%), and after accounting for item-missing data on the two variables of interest, there were n = 1,469 respondents with data available on both variables. Additional details related to the design of the parent study can be found in West et al. (2018).

Table 2 below provides estimates of the parameters of interest in our simulation studies based on the data from this German study. When estimating the variance of the random interviewer effects on a particular variable, we fit multilevel models explicitly controlling for the fixed effects of the fifteen areas, removing any variability in the variable being modeled due to fixed effects of the different areas. Of note, the relationships of the dependent and independent variables with response propensity (δ1 and δ2) were not significantly different from zero, similar to the correlation of the random interviewer effects on the reported values of the dependent and independent variables. We also did not find any evidence of interviewer variance in the relationships of the dependent and independent variables with response propensity, and these random coefficients were therefore dropped from the model. The variance of the random interviewer effects on x, found to be the second most important factor in the simulations, was significantly greater than zero (p = 0.003 according to a mixture-based likelihood ratio test), but only of moderate size relative to the simulation inputs (corresponding ρint,x = 0.027).
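As an illustration of the type of model used for the variance components in table 2, the following sketch fits a random-interviewer-intercept model for the reported independent variable while controlling for the fixed effects of the fifteen areas. The data frame and variable names, and the use of lme4, are our assumptions rather than code from the study.

```r
# Sketch: interviewer variance component for the reported predictor,
# controlling for fixed area effects (object and variable names are assumed).
library(lme4)

fit_x <- lmer(
  longest_employment ~ factor(area) + (1 | interviewer),
  data = respondents,
  REML = TRUE
)

# Interviewer variance component and implied intra-interviewer correlation
vc       <- as.data.frame(VarCorr(fit_x))
tau2_x   <- vc$vcov[vc$grp == "interviewer"]
sigma2_e <- vc$vcov[vc$grp == "Residual"]
tau2_x / (tau2_x + sigma2_e)   # compare with the reported ICC of about 0.027
```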
Table 2. Estimates of Parameters Varied in the Simulation Studies Based on Real Data Collected in a German Face-to-Face Survey

Parameter | Estimate/other relevant information
δ1 | Estimate = 0.003, p = 0.887, n = 7,154
δ2 | Estimate < 0.001, p = 0.558, n = 7,154
σRP² | Estimate = 0.121, p < 0.001, ICC = 0.035
σRPX² | Estimate < 0.001, not significant (removed from model)
σRPY² | Estimate < 0.001, not significant (removed from model)
τx² | Estimate = 73.501, p = 0.003, ICC = 0.027, n = 1,469
τy² | Estimate = 0.053, p = 0.198, ICC = 0.008, n = 1,469
Cor(uxi, uyi) | Estimate = 0.113, p = 0.390
Cor(uyi, upi) | Estimate = −0.199, p = 0.128

Given this combination of estimates, we would predict that the attenuating effects of the interviewer variance on the estimate of the regression coefficient of interest would be relatively modest based on the theory and simulation studies presented here. Table 3 presents the OLS estimates of the regression coefficient of interest, after adjusting for the fixed area effects, for 1) the full sample, when using the true values of each variable; 2) the respondents, when using the true values of each variable; and 3) the respondents, when using the reported values of each variable.

Table 3. OLS Estimates of the Regression Coefficient of Interest at Three Different Stages of the German Face-to-Face Survey (OLS Estimates and Standard Errors Are Multiplied by 100)

Stage of study | Sample size | OLS estimate | Standard error | p value
Full Sample, True Values | 7,154 | 0.493 | 0.03 | p < 0.001
Respondents, True Values | 1,469 | 0.527 | 0.07 | p < 0.001
Respondents, Reported Values | 1,469 | 0.467 | 0.13 | p < 0.001
We see in table 3 that the estimated coefficient based on respondent reports is only attenuated by a modest amount (a 5.3 percent reduction in magnitude) relative to the "true" coefficient that would have been computed if every individual had responded and provided the exact correct value on each variable. Therefore, these results match our expectations based on the simulation studies, and overall inferences would not be affected. The proposed correction factor, when applied to the OLS estimate of 0.00467 based on respondent reports, would be computed as 2,725.67 (the total variance of the independent variable) divided by 2,652.17 (the variance of the true values of the independent variable in the absence of the interviewer effects), or 1.03. This correction would bring the slightly biased OLS estimate closer to its "true" value: 0.00467 · 1.03 = 0.00481. We note that the denominator of this correction may be too large, given the presence of case-specific measurement errors; knowledge of the variance of these errors would allow us to correct more fully for the bias in this estimate. We were not able to estimate this variance in the present application because not all respondents consented to having their reports linked to administrative data (which would be required for computing the case-specific errors).

6. DISCUSSION

We have seen how interviewer effects resulting in measurement errors and nonresponse errors can influence the estimation of the slope parameter in a simple linear regression model in the presence of different missing data mechanisms. In our simulated scenarios, the introduced measurement error in an independent variable and the missing data mechanism have the strongest influences. Both sources introduce a bias towards zero and therefore weaken the estimated relationship between the independent and the dependent variable. The standard treatment for correcting slope parameters estimated from variables containing errors, the reliability ratio, can help to repair the bias introduced from this source, but not the bias from nonresponse error. In our simulated scenarios, the reliability ratio estimated only from respondent data is adequate to correct the largest part of the bias introduced by interviewer effects, because the interviewer effect on the response propensity introduces only a small bias in the regression slope.

There is much additional work to be done in this area. An important next step would be to evaluate the change in the standard errors of the slope in order to determine the interviewer effects on significance tests of β̂1. Furthermore, we only considered the case of simple linear regression with a single predictor in this paper; future work should extend these results in a multivariate direction. While we considered an extensive simulation setting, there are a number of factors that remain to be explored. The correlation between the interviewer-specific random effects on x and y needs more investigation because it has a non-negligible influence on the bias.
To correct jointly for both interviewer effects and nonresponse error, one could use Heckman's selection model (Heckman 1979) and try to use the interviewer ID as a kind of instrumental variable, similar to the approach of McGovern, Bärnighausen, Salomon, and Canning (2015), and control for the interviewers in the regression model. Or, one could combine this approach with pattern-mixture models, as in the work of Yuan and Little (2009), to adjust for nonresponse error. The magnitudes of interviewer variance for various survey variables are highly dependent on the characteristics of survey questions (Schaeffer, Dykema, and Maynard 2010). To reduce this source of variance, interviewer training (Billiet and Loosveldt 1988; Fowler and Mangione 1990, chapter 7) and standardized interviewing instead of conversational interviewing (West, Conrad, Kreuter, and Mittereder 2017, 2018) have been shown to be effective.

Whenever researchers analyze survey data collected by interviewers, the article by Elliott and West (2015) can be consulted as practical guidance on how to deal with interviewer effects in the case of descriptive estimation. As seen in our simulation study, interviewer variance in the independent variable of a regression model can introduce bias in the estimated slope parameter. Therefore, analysts should estimate the impact of interviewer effects on regression parameters of interest using the proposed approach and, if necessary, make adjustments with the inverse of the reliability ratio. As we noted previously, this work has focused on the simple linear regression model, and appropriate adjustments for regression models with multiple predictors subject to interviewer effects should be an important focus of future work in this area. While cluster effects and differential measurement error are not unique to interviewer effects (for example, misreporting of parental education in the Programme for International Student Assessment [PISA] may confound relationships between parental education and student performance; Kreuter et al. 2010), an uncommon feature of this setting is the ability to estimate the degree of measurement error using random effects models applied to the regression predictors of interest. Our simulation study suggests that use of the reliability ratio estimated from such models can correct for much of the resulting bias, even in settings where the measurement error structure is more complex than implied by a simple reliability ratio adjustment. This suggests that such adjustments should be considered more routinely than is currently the case, although further research is needed to better understand how failures of assumptions such as interpenetration may impact such adjustments.

Appendix

Figure A.1. Estimated Bias of Slope Parameter β̂1 for Different Parameter Settings.
Figure A.2. Estimated Bias of Slope Parameter β̂1 for Different Parameter Settings.
Figure A.3. Estimated Bias of Slope Parameter β̂1 for Different Parameter Settings.

REFERENCES
Beullens, K., and Loosveldt, G. (2014), "Interviewer Effects on Latent Constructs in Survey Research," Journal of Survey Statistics and Methodology, 2, 433–458.
Beullens, K., and Loosveldt, G. (2016), "Interviewer Effects in the European Social Survey," Survey Research Methods, 10, 103–118.
Biemer, P. P., and Trewin, D. (1997), "A Review of Measurement Error Effects on the Analysis of Survey Data," in Survey Measurement and Process Quality, pp. 601–632, New York: Wiley.
Billiet, J., and Loosveldt, G. (1988), "Improvement of the Quality of Responses to Factual Survey Questions by Interviewer Training," Public Opinion Quarterly, 52, 190–211.
Breiman, L. (2001), "Random Forests," Machine Learning, 45, 5–32.
Carroll, R. J., Ruppert, D., Stefanski, L. A., and Crainiceanu, C. M. (2006), Measurement Error in Nonlinear Models: A Modern Perspective, Boca Raton, FL: CRC Press.
Chai, J. J. (1971), "Correlated Measurement Errors and the Least Squares Estimator of the Regression Coefficient," Journal of the American Statistical Association, 66, 478–483.
Davis, P., and Scott, A. (1995), "The Effect of Interviewer Variance on Domain Comparisons," Survey Methodology, 21, 99–106.
Elliott, M. R., and West, B. T. (2015), "'Clustering by Interviewer': A Source of Variance That Is Unaccounted for in Single-Stage Health Surveys," American Journal of Epidemiology, 182, 118–126.
Fahrmeir, L., Kneib, T., Lang, S., and Marx, B. (2013), Regression: Models, Methods and Applications, New York: Springer Science & Business Media.
Fowler, F. J., Jr., and Mangione, T. W. (1990), Standardized Survey Interviewing: Minimizing Interviewer-Related Error (Vol. 18), Newbury Park: Sage.
Friedman, J., Hastie, T., and Tibshirani, R. (2009), The Elements of Statistical Learning: Data Mining, Inference, and Prediction, New York: Springer-Verlag.
Goldberger, A. S. (1981), "Linear Regression after Selection," Journal of Econometrics, 15, 357–366.
Groves, R. M. (2004), Survey Errors and Survey Costs, Hoboken, NJ: John Wiley & Sons.
Heckman, J. J. (1979), "Sample Selection Bias as a Specification Error," Econometrica: Journal of the Econometric Society, 47, 153–162.
Hox, J. J. (1994), "Hierarchical Regression Models for Interviewer and Respondent Effects," Sociological Methods & Research, 22, 300–318.
Kreuter, F., Eckman, S., Maaz, K., and Watermann, R. (2010), "Children's Reports of Parents' Education Level: Does It Matter Whom You Ask and What You Ask About?" Survey Research Methods, 4, 127–138.
Liaw, A., and Wiener, M. (2002), "Classification and Regression by randomForest," R News, 2, 18–22.
Little, R. J., and Rubin, D. B. (2002), Statistical Analysis with Missing Data, New York: John Wiley & Sons.
McGovern, M. E., Bärnighausen, T., Salomon, J. A., and Canning, D. (2015), "Using Interviewer Random Effects to Remove Selection Bias from HIV Prevalence Estimates," BMC Medical Research Methodology, 15, 8.
R Core Team (2017), R: A Language and Environment for Statistical Computing, Vienna, Austria: R Foundation for Statistical Computing.
Schaeffer, N. C., Dykema, J., and Maynard, D. W. (2010), "Interviewers and Interviewing," in Handbook of Survey Research (2nd ed.), pp. 437–470.
Schnell, R., and Kreuter, F. (2005), "Separating Interviewer and Sampling-Point Effects," Journal of Official Statistics, 21, 389.
West, B. T., and Blom, A. G. (2017), "Explaining Interviewer Effects: A Research Synthesis," Journal of Survey Statistics and Methodology, 5, 175–211.
West, B. T., Conrad, F. G., Kreuter, F., and Mittereder, F. (2017), "Nonresponse and Measurement Error Variance among Interviewers in Standardized and Conversational Interviewing," Journal of Survey Statistics and Methodology, in press.
West, B. T., Conrad, F. G., Kreuter, F., and Mittereder, F. (2018), "Can Conversational Interviewing Improve Survey Response Quality without Increasing Interviewer Effects?" Journal of the Royal Statistical Society: Series A (Statistics in Society), 181, 181–203.
West, B. T., Kreuter, F., and Jaenichen, U. (2013), "Interviewer Effects in Face-to-Face Surveys: A Function of Sampling, Measurement Error, or Nonresponse?" Journal of Official Statistics, 29, 277–297.
West, B. T., and Olson, K. (2010), "How Much of Interviewer Variance Is Really Nonresponse Error Variance?" Public Opinion Quarterly, 74, 1004–1026.
Wiggins, R. D., Longford, N., and O'Muircheartaigh, C. A. (1992), "A Variance Components Approach to Interviewer Effects," in Survey and Statistical Computing, eds. Westlake, A., Banks, R., C. P., and Orchard, T., pp. 243–254, Amsterdam: North-Holland.
Yuan, Y., and Little, R. J. (2009), "Mixed-Effect Hybrid Models for Longitudinal Data with Nonignorable Dropout," Biometrics, 65, 478–486.

© The Author(s) 2018. Published by Oxford University Press on behalf of the American Association for Public Opinion Research. All rights reserved.
